NAME
Treex::Tool::Tagger::MeCab - Perl wrapper for MeCab, a Japanese morphological analyzer implemented in C++
VERSION
version 0.13095
SYNOPSIS
use Treex::Tool::Tagger::MeCab;
my $tagger = Treex::Tool::Tagger::MeCab->new();
my $sentence = 'わたしは日本語を話します';
my @tokens = $tagger->process_sentence($sentence);
DESCRIPTION
This is a Perl wrapper for the MeCab tagger and tokenizer, which is implemented in C++. For each token it generates, the wrapper produces a string of features, the first of which is the wordform. The tokens are returned as an array for further use.
INSTALLATION
Before installing MeCab, make sure you have properly installed the Treex-Core package (see Treex Installation), since it is a prerequisite for this module anyway. After installing Treex-Core, you can install MeCab using this Makefile (username "public", password "public"). Before running the Makefile, you must set the environment variable "$TMT_ROOT" to the location of your .treex directory.
You can also install MeCab manually, but then you must link the installation directory to ${TMT_ROOT}/share/installed_tools/tagger/MeCab/ (a location within the Treex share); otherwise the modules will not be able to use the program.
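As a quick sanity check of the setup described above, the following sketch tests whether TMT_ROOT is set and whether the expected MeCab directory exists under it. The helper name mecab_dir is hypothetical and not part of the module; only the path layout is taken from the text above.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not part of Treex): build the path under TMT_ROOT
# where the text above says MeCab must be installed or linked.
sub mecab_dir {
    my ($tmt_root) = @_;
    return File::Spec->catdir($tmt_root, qw(share installed_tools tagger MeCab));
}

my $tmt_root = $ENV{TMT_ROOT};
if (!defined $tmt_root) {
    print "TMT_ROOT is not set\n";
}
else {
    my $dir = mecab_dir($tmt_root);
    # -d is true for a directory or for a symlink that points to one.
    my $status = (-d $dir) ? "Found" : "Missing";
    print "$status: $dir\n";
}
```

This only checks that the directory is present; it does not verify that the MeCab installation inside it is usable.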
METHODS
- @tokens = $tagger->process_sentence($sentence);
Returns a list of tokens for the tokenized input. Each token is a string containing the wordform followed by its morphological categories, separated by \t.
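Since each token is a tab-separated string with the wordform first, it may be convenient to split it into fields. The following is a minimal sketch that does not call MeCab: the sample token strings and their feature values are made up for illustration, and split_token is a hypothetical helper, not a method of this module.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical sample in the shape described above: one string per token,
# wordform first, then features, all separated by tabs.
# The feature values are illustrative, not actual MeCab output.
my @tokens = (
    "わたし\t名詞\t代名詞",
    "は\t助詞\t係助詞",
);

# Split one token string into its wordform and a reference to the
# list of remaining features.
sub split_token {
    my ($token) = @_;
    my ($wordform, @features) = split /\t/, $token;
    return ($wordform, \@features);
}

my @parsed;
for my $token (@tokens) {
    my ($wordform, $features) = split_token($token);
    push @parsed, [ $wordform, $features ];
}

binmode STDOUT, ':encoding(UTF-8)';
for my $entry (@parsed) {
    my ($wordform, $features) = @$entry;
    print "$wordform => ", join(', ', @$features), "\n";
}
```

In real use, @tokens would come from $tagger->process_sentence($sentence); the number and meaning of the features depend on the MeCab dictionary in use.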
SEE ALSO
AUTHOR
Dušan Variš <dvaris@seznam.cz>
COPYRIGHT AND LICENSE
Copyright © 2014 by Institute of Formal and Applied Linguistics, Charles University in Prague
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.