NAME

eGuideDog::Dict::Mandarin - an informal Pinyin dictionary.

SYNOPSIS

use utf8;
use eGuideDog::Dict::Mandarin;

binmode(STDOUT, ':encoding(UTF-8)');
my $dict = eGuideDog::Dict::Mandarin->new();
my @symbols = $dict->get_multi_phon("长");
print "长(all pronunciations): @symbols\n"; # zhang3 chang2
my $symbol = $dict->get_pinyin("长");
print "长(default pronunciation): $symbol\n"; # zhang3
$symbol = $dict->get_pinyin("长江");
print "长江的长: $symbol\n"; # chang2
@symbols = $dict->get_pinyin("拼音");
print "拼音: @symbols\n"; # pin1 yin1
my @words = $dict->get_words("长");
print "Some words beginning with 长: @words\n";

DESCRIPTION

This module looks up the Pinyin of Mandarin characters and words. Its dictionary data comes from the Mandarin dictionary of eSpeak (http://espeak.sf.net).

The Mandarin pronunciation dictionary included with eSpeak is a compact summary of data from CEDICT and Unihan, with some corrections. Rather than include every word in the language, it includes only words that are pronounced differently from the default pronunciations of their component characters (which are also included).
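The compact layout described above can be pictured with a minimal sketch. The two tables and the lookup_sketch function are hypothetical stand-ins written for illustration; they are not the module's actual data structures or implementation. A word listed as an exception wins, and any other word falls back to the default reading of each component character.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical miniature tables, for illustration only.
# Default reading of each character:
my %char_default = (
    '长' => 'zhang3',
    '江' => 'jiang1',
    '大' => 'da4',
);
# Words whose reading differs from the per-character defaults:
my %word_exception = (
    '长江' => [ 'chang2', 'jiang1' ],
);

# An exception entry wins; otherwise use each character's default reading.
# Characters missing from the table are reported as '?'.
sub lookup_sketch {
    my ($word) = @_;
    return @{ $word_exception{$word} } if exists $word_exception{$word};
    return map { $char_default{$_} // '?' } split //, $word;
}

my @yangtze = lookup_sketch('长江');   # ('chang2', 'jiang1') from the exception table
my @river   = lookup_sketch('大江');   # ('da4', 'jiang1') from per-character defaults
```

Storing only the exceptions keeps the word table small, since most words are read exactly as their characters' defaults suggest.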

EXPORT

None by default.

METHODS

new()

Create and initialize a dictionary object.

get_pinyin($str)

In list context, return an array of the Pinyin phonetic symbols of all characters in $str.

In scalar context, return the Pinyin phonetic symbol of the first character as a string.
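Context-dependent return values like this are usually built on Perl's wantarray. The sketch below shows the general pattern with a hypothetical function, pinyin_sketch, and a hypothetical two-entry table; it illustrates the calling convention only and is not the module's own code.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical stand-in data, for illustration only.
my %reading = ( '拼' => 'pin1', '音' => 'yin1' );

# Return every symbol in list context, only the first one in scalar context.
sub pinyin_sketch {
    my ($str) = @_;
    my @symbols = map { $reading{$_} // '?' } split //, $str;
    return wantarray ? @symbols : $symbols[0];
}

my @all   = pinyin_sketch('拼音');   # list context:   ('pin1', 'yin1')
my $first = pinyin_sketch('拼音');   # scalar context: 'pin1'
```

Note that print and other list operators impose list context on their arguments, so wrap the call in scalar(...) when only the first symbol is wanted there.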

get_words($char)

Return an array of words that begin with $char.
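As a rough illustration of what "begin with $char" means, the following sketch filters a hypothetical word list by its first character. The list and the function name words_starting_with are assumptions made for this example; the module draws its words from its own dictionary data.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical word list, for illustration only.
my @word_list = ( '长江', '长城', '江山' );

# Keep the words whose first character is $char.
sub words_starting_with {
    my ($char) = @_;
    return grep { index( $_, $char ) == 0 } @word_list;
}

my @hits = words_starting_with('长');   # ('长江', '长城')
```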

is_multi_phon($char)

Return a non-zero value if $char has more than one phonetic symbol. The returned value plus 1 is the number of phonetic symbols the character has.

Return 0 if $char has only one phonetic symbol.
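The "plus 1" convention is easy to misread, so here is a small sketch of it. The readings table and the function multi_phon_flag are hypothetical and only mirror the return convention documented above; how this sketch treats characters missing from the table (it returns 0) is likewise an assumption.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical readings table, for illustration only.
my %readings = (
    '长' => [ 'zhang3', 'chang2' ],
    '江' => [ 'jiang1' ],
);

# Return (number of readings - 1) for a multi-reading character, else 0.
sub multi_phon_flag {
    my ($char) = @_;
    my $count = @{ $readings{$char} // [] };
    return $count > 1 ? $count - 1 : 0;
}

my $flag   = multi_phon_flag('长');   # 1, which is true in a boolean test
my $total  = $flag + 1;               # 2 readings in the hypothetical table
my $single = multi_phon_flag('江');   # 0, a single reading
```

Because the result is 0 for single-reading characters, it can be used directly as a boolean, and adding 1 to a non-zero result recovers the number of readings.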

get_multi_phon($char)

Return an array of all phonetic symbols of $char.

Return a list of all Pinyin phonetic symbols with all corresponding characters.

SEE ALSO

eGuideDog::Dict::Cantonese, http://e-guidedog.sf.net

AUTHOR

Cameron Wong, <hgn823-perl at yahoo.com.cn>

ACKNOWLEDGMENT

Thanks to Silas S. Brown (http://people.pwf.cam.ac.uk/ssb22/) for maintaining the Mandarin dictionary file of eSpeak.

COPYRIGHT AND LICENSE

of the module

Copyright 2008 by Cameron Wong

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

of the dictionary data

Unihan and CC-CEDICT are used in the dictionary data.

About Unihan: Copyright (c) 1996-2006 Unicode, Inc. All Rights reserved.

Name: Unihan database
Unicode version: 5.0.0
Table version: 1.1
Date: 7 July 2006

CC-CEDICT is a continuation of the CEDICT project started by Paul Denisowski in 1997 with the aim to provide a complete downloadable Chinese to English dictionary with pronunciation in pinyin for the Chinese characters. It is licensed under a Creative Commons Attribution-Share Alike 3.0 License. http://www.mdbg.net/chindict/chindict.php?page=cedict