NAME

Lingua::ZH::MMSEG - Mandarin Chinese segmentation

SYNOPSIS

#!/usr/bin/perl
use utf8;
use Lingua::ZH::MMSEG;

my $seg = Lingua::ZH::MMSEG->new();

my $zh_string="現代漢語的複合動詞可分三個結構語意關係來探討";

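# Segment with the MMSEG algorithm
my @phrases = $seg->mmseg($zh_string);

# Segment with the Forward Maximum Matching algorithm
# (a separate variable avoids redeclaring @phrases in the same scope)
my @fmm_phrases = $seg->fmm($zh_string);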

DESCRIPTION

A problem in computational analysis of Chinese text is that there are no word boundaries in conventionally printed text. Since the word is such a fundamental linguistic unit, it is necessary to identify words in Chinese text so that higher-level analyses can be performed.

Lingua::ZH::MMSEG implements the MMSEG algorithm originally developed by Chih-Hao Tsai. The module is a complete rewrite in pure Perl, and its phrase library comes from 新酷音, forked from OpenFoundry.

INSTALL

To install this module, run:

cpanm Lingua::ZH::MMSEG

If you don't have cpanm, install it first:

curl -LO http://bit.ly/cpanm
chmod +x cpanm
sudo cp cpanm /usr/local/bin

USAGE

This module has no dependencies, so you can simply create a new Perl script as shown in SYNOPSIS.

METHODS

`new`

my $seg = Lingua::ZH::MMSEG->new();

Initializes the phrase dictionary. Adding new phrases to the dictionary is not currently supported.

`mmseg`

my @phrases = $seg->mmseg($zh_string);

Segments the string with the MMSEG algorithm and returns the resulting list of Chinese phrases.
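
To illustrate the idea, here is a simplified, self-contained sketch of the chunk-based matching described in Tsai's MMSEG. It is not this module's implementation. At each position it enumerates chunks of up to three words and prefers, in order, the largest total length, the largest average word length, and the smallest variance of word lengths. The fourth MMSEG rule (morphemic freedom of single-character words) is omitted because it needs character frequency data. The dictionary `%dict` and the names `candidates`, `chunks`, `score`, and `mmseg_sketch` are hypothetical and exist only for this example.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical toy dictionary; the module ships its own phrase library.
my %dict    = map { $_ => 1 } qw(研究 研究生 生命 起源);
my $max_len = 3;    # longest entry in %dict, in characters

# Candidate word lengths at $pos: the single character as a fallback,
# plus every longer dictionary match.
sub candidates {
    my ($text, $pos) = @_;
    my $left = length($text) - $pos;
    my @lens = (1);
    for my $len (2 .. ($max_len < $left ? $max_len : $left)) {
        push @lens, $len if $dict{ substr($text, $pos, $len) };
    }
    return @lens;
}

# All chunks of up to $depth words starting at $pos, as lists of lengths.
sub chunks {
    my ($text, $pos, $depth) = @_;
    return ([]) if $depth == 0 || $pos >= length $text;
    my @out;
    for my $len (candidates($text, $pos)) {
        push @out, [ $len, @$_ ] for chunks($text, $pos + $len, $depth - 1);
    }
    return @out;
}

# Score a chunk: total length, average word length, negated variance.
sub score {
    my @lens  = @{ $_[0] };
    my $total = 0;
    $total += $_ for @lens;
    my $avg = $total / @lens;
    my $var = 0;
    $var += ($_ - $avg) ** 2 for @lens;
    return ($total, $avg, -$var / @lens);
}

# Pick the best chunk at each position and emit only its first word.
sub mmseg_sketch {
    my ($text) = @_;
    my @words;
    my $pos = 0;
    while ($pos < length $text) {
        my ($best) = sort {
            my @x = score($a);
            my @y = score($b);
            $y[0] <=> $x[0] || $y[1] <=> $x[1] || $y[2] <=> $x[2]
        } chunks($text, $pos, 3);
        push @words, substr($text, $pos, $best->[0]);
        $pos += $best->[0];
    }
    return @words;
}

binmode STDOUT, ':encoding(UTF-8)';
print join(' / ', mmseg_sketch('研究生命起源')), "\n";
# prints: 研究 / 生命 / 起源
```

With this toy dictionary, the chunks 研究生 / 命 / 起源 and 研究 / 生命 / 起源 tie on total and average length, and the variance rule selects the second one.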

`fmm`

my @phrases = $seg->fmm($zh_string);

Segments the string with the forward maximum matching algorithm and returns the resulting list of Chinese phrases. It has lower complexity than mmseg, but it cannot resolve phrase ambiguities.
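
As a rough illustration of why forward maximum matching is cheap but cannot handle ambiguity, here is a minimal, self-contained sketch. It is not this module's implementation. The dictionary `%dict` and the function name `fmm_sketch` are hypothetical and exist only for this example.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical toy dictionary; the module ships its own phrase library.
my %dict    = map { $_ => 1 } qw(研究 研究生 生命 起源);
my $max_len = 3;    # longest entry in %dict, in characters

# Forward maximum matching: at each position take the longest dictionary
# entry that starts there, falling back to a single character.
sub fmm_sketch {
    my ($text) = @_;
    my @words;
    my $pos = 0;
    while ($pos < length $text) {
        my $left = length($text) - $pos;
        my $len  = $max_len < $left ? $max_len : $left;
        $len-- while $len > 1 && !$dict{ substr($text, $pos, $len) };
        push @words, substr($text, $pos, $len);
        $pos += $len;
    }
    return @words;
}

binmode STDOUT, ':encoding(UTF-8)';
print join(' / ', fmm_sketch('研究生命起源')), "\n";
# prints: 研究生 / 命 / 起源
```

The greedy match commits to 研究生 and leaves 命 stranded, even though 研究 / 生命 / 起源 is the intended reading. The scan never revisits an earlier choice, which is the kind of ambiguity this approach cannot resolve.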

AUTHOR

Felix Ren-Chyan Chern (dryman) <idryman@gmail.com>

LICENSE AND COPYRIGHT

GNU Lesser General Public License 2.1
