NAME
eGuideDog::Dict::Mandarin - an informal Pinyin dictionary.
SYNOPSIS
use utf8;
use eGuideDog::Dict::Mandarin;
binmode(STDOUT, ':encoding(UTF-8)');
my $dict = eGuideDog::Dict::Mandarin->new();
my @symbols = $dict->get_multi_phon("长");
print "长(all pronunciations): @symbols\n"; # zhang3 chang2
my $symbol = $dict->get_pinyin("长");
print "长(default pronunciation): $symbol\n"; # zhang3
$symbol = $dict->get_pinyin("长江");
print "长江的长: $symbol\n"; # chang2
@symbols = $dict->get_pinyin("拼音");
print "拼音: @symbols\n"; # pin1 yin1
my @words = $dict->get_words("长");
print "Some words beginning with 长: @words\n";
DESCRIPTION
This module looks up the Pinyin of Mandarin characters or words. Its data comes from the Mandarin dictionary of eSpeak (http://espeak.sf.net).
The Mandarin pronunciation dictionary included with eSpeak is a compact summary of data from CEDICT and Unihan, with some corrections. Rather than include every word in the language, it includes only words that are pronounced differently from the default pronunciations of their component characters (which are also included).
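The compact layout described above can be sketched as a two-level lookup: a word table that holds only the exceptions, and a per-character table of default readings used as the fallback. This is a minimal illustration, not the module's actual implementation; the tables and the lookup_word helper are hypothetical, and the tiny data set is taken from the readings shown in the SYNOPSIS.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical miniature tables; the real dictionary data is far larger.
my %char_default = (
    '长' => 'zhang3',
    '江' => 'jiang1',
    '拼' => 'pin1',
    '音' => 'yin1',
);

# Only words whose reading differs from the per-character defaults are listed.
my %word_exception = (
    '长江' => [ 'chang2', 'jiang1' ],
);

# Hypothetical helper: use the exception entry if there is one,
# otherwise fall back to each character's default reading.
sub lookup_word {
    my ($word) = @_;
    return @{ $word_exception{$word} } if exists $word_exception{$word};
    return map { $char_default{$_} // '?' } split //, $word;
}

print join(' ', lookup_word('长江')), "\n";   # chang2 jiang1
print join(' ', lookup_word('拼音')), "\n";   # pin1 yin1
```

Storing only the exceptions keeps the word table small, at the cost of one extra hash lookup per word.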
EXPORT
None by default.
METHODS
new()
Initialize the dictionary.
get_pinyin($str)
In list context, return an array holding the Pinyin phonetic symbol of every character in $str.
In scalar context, return a string holding the Pinyin phonetic symbol of the first character only.
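The context-dependent return value follows a common Perl idiom built on wantarray. The sketch below shows that idiom in isolation; pinyin_of and its lookup table are hypothetical stand-ins, and nothing here is claimed to match how get_pinyin is written internally.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical stand-in table (not the module's data structure).
my %reading = ('拼' => 'pin1', '音' => 'yin1');

# Hypothetical helper mimicking the documented context rule.
sub pinyin_of {
    my ($str) = @_;
    my @symbols = map { $reading{$_} // '?' } split //, $str;
    # wantarray is true in list context and false in scalar context.
    return wantarray ? @symbols : $symbols[0];
}

my @all   = pinyin_of('拼音');   # list context: one symbol per character
my $first = pinyin_of('拼音');   # scalar context: first character only

print "@all\n";     # pin1 yin1
print "$first\n";   # pin1
```

Note that print evaluates its arguments in list context, so a call placed directly inside print yields every symbol; wrap it in scalar() to get only the first.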
get_words($char)
Return an array of words that begin with $char.
is_multi_phon($char)
Return a non-zero value if $char has more than one phonetic symbol. The returned value plus 1 is the number of phonetic symbols the character has.
Return 0 if $char has only one phonetic symbol.
get_multi_phon($char)
Return an array of phonetic symbols of $char.
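The "returned value plus 1" convention of is_multi_phon() is easy to misread, so here is a small self-contained sketch of it. The table and the extra_readings helper are hypothetical; the readings for 长 are the ones listed in the SYNOPSIS, and the default-first ordering is an assumption based on that output.

```perl
use strict;
use warnings;
use utf8;

binmode(STDOUT, ':encoding(UTF-8)');

# Hypothetical table: each character maps to its readings, default first.
my %readings = (
    '长' => [ 'zhang3', 'chang2' ],
    '江' => [ 'jiang1' ],
);

# Hypothetical helper mimicking the documented convention:
# 0 for a single reading, otherwise the number of readings minus 1.
sub extra_readings {
    my ($char) = @_;
    my $list = $readings{$char} or return 0;
    return @$list - 1;
}

for my $char ('长', '江') {
    my $extra = extra_readings($char);
    # Add 1 back to recover the total number of readings.
    printf "%s: %d reading(s)\n", $char, $extra + 1;
}
```

Because the single-reading case returns 0, the result can be used directly as a boolean test before asking for the full list of readings.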
print_phon_char_list()
Return a list of all Pinyin phonetic symbols with all corresponding characters.
SEE ALSO
eGuideDog::Dict::Cantonese, http://e-guidedog.sf.net
AUTHOR
Cameron Wong, <hgn823-perl at yahoo.com.cn>
ACKNOWLEDGMENT
Thanks to Silas S. Brown (http://people.pwf.cam.ac.uk/ssb22/) for maintaining the Mandarin dictionary file of eSpeak.
COPYRIGHT AND LICENSE
- of the module
Copyright 2008 by Cameron Wong
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
- of the dictionary data
Unihan and CC-CEDICT are used in the dictionary data.
About Unihan: Copyright (c) 1996-2006 Unicode, Inc. All Rights reserved.
Name: Unihan database Unicode version: 5.0.0 Table version: 1.1 Date: 7 July 2006
CC-CEDICT is a continuation of the CEDICT project started by Paul Denisowski in 1997 with the aim to provide a complete downloadable Chinese to English dictionary with pronunciation in pinyin for the Chinese characters. It is licensed under a Creative Commons Attribution-Share Alike 3.0 License. http://www.mdbg.net/chindict/chindict.php?page=cedict