NAME
Parse::Pyapp - PCFG Parser
SYNOPSIS
use Parse::Pyapp;
my $parser = Parse::Pyapp->new();
$parser->addrule($LHS, [ $RHS_1, $P_RHS_1 ], [ $RHS_2, $P_RHS_2 ]);
$parser->addlex($LHS, [ $RHS_1, $P_RHS_1 ], [ $RHS_2, $P_RHS_2 ]);
$parser->start($LHS);
$parser->parse(@words) or print "Parse error\n";
DESCRIPTION
This module is a parser for probabilistic context-free grammars (PCFG), also known as stochastic context-free grammars (SCFG). You can use it for stochastic parsing.
USAGE
Creating a parser
$parser = Parse::Pyapp->new();
Adding lexicons
$parser->addlex('N',
[ 'house', .5 ],
[ 'book', .5 ]
);
You can hook a semantic action to a lexicon. For instance,
$parser->addlex('N',
[ 'house', .5 ],
[ 'book', .5 ],
sub { print $_[1] }
);
Parse::Pyapp passes the parser itself as the first argument and the lexicon as the second. The left-hand-side symbol can be accessed with $_[0]->{lhs}.
Adding rules
$parser->addrule('VP',
[ 'V', 0.5 ],
[ 'V', 'NP', .5 ]
);
The first argument is the LHS symbol. The remaining arguments are the possible right-hand-side derivations, each ending with its probability.
Similarly, you can hook semantic actions to the end of a derivation. For instance,
$parser->addrule('VP',
[ 'V', 0.5, sub { print $_[1] } ],
[ 'V', 'NP', .5 ]
);
Parse::Pyapp passes the parser itself as the first argument and the corresponding tokens as the remaining arguments. The left-hand-side symbol can be accessed with $_[0]->{lhs}, and the right-hand POS tags with @{$_[0]->{pos}}.
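The calling convention described above can be tried outside the module with a small sketch. The helper name describe_match and the hash standing in for the parser object are hypothetical and exist only for illustration; a real parser object supplies {lhs} itself. What the module does with an action's return value is not stated here, so the sketch only builds a string and prints it.

```perl
use strict;
use warnings;

# Hypothetical helper written against the documented convention:
# the first argument is the parser, the rest are the matched tokens.
sub describe_match {
    my ($parser, @tokens) = @_;
    return sprintf '%s: %s', $parser->{lhs}, join(' ', @tokens);
}

# Hypothetical stand-in for the parser object, used only to exercise
# the helper without loading Parse::Pyapp.
my $mock_parser = { lhs => 'VP' };
print describe_match($mock_parser, 'reads', 'books'), "\n";

# Inside a grammar, the helper could be wrapped in an action such as:
#   [ 'V', 'NP', .5, sub { print describe_match(@_), "\n" } ]
```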
Currently, this module does not check if the sum of probabilities going out from a non-terminal is equal to 1.
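Since the module does not perform this check, you may want to verify the probabilities yourself before adding a rule. The following is a minimal sketch; the helper probabilities_sum_to_one is hypothetical and not part of Parse::Pyapp, and the tolerance of 1e-9 is an arbitrary choice to absorb floating-point rounding.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Parse::Pyapp): returns true when the
# probabilities of the given derivations sum to 1 within a small tolerance.
# Each derivation is an array ref shaped like the ones passed to addrule or
# addlex: symbols first, then the probability, then an optional code ref.
sub probabilities_sum_to_one {
    my (@derivations) = @_;
    my $total = 0;
    for my $d (@derivations) {
        # Skip a trailing semantic action passed as a separate argument.
        next unless ref($d) eq 'ARRAY';
        # Drop an action stored inside the derivation; the probability is then last.
        my @fields = grep { ref($_) ne 'CODE' } @$d;
        $total += $fields[-1];
    }
    return abs($total - 1) < 1e-9;
}

my @vp = ( [ 'V', 0.5 ], [ 'V', 'NP', .5 ] );
print probabilities_sum_to_one(@vp)
    ? "VP probabilities sum to 1\n"
    : "VP probabilities do not sum to 1\n";
```

The same list can then be passed on unchanged, for example $parser->addrule('VP', @vp).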
Setting the starting symbol
$parser->start('S');
Parsing a sentence
You need to tokenize the sentence yourself.
$parser->parse(@words);
It returns a defined value if there is no error.
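Because tokenization is left to the caller, a simple whitespace tokenizer may be enough for small experiments. The sketch below is one possible approach, not something the module provides: the helper tokenize is hypothetical, and lowercasing and stripping punctuation are assumptions that only make sense if your lexicon entries are lowercase words.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercases the sentence, strips punctuation
# (keeping apostrophes), and splits on whitespace.
sub tokenize {
    my ($sentence) = @_;
    my $text = lc $sentence;
    $text =~ s/[^\w\s']//g;
    # grep drops the empty leading field that split yields when the
    # text starts with whitespace.
    return grep { length } split /\s+/, $text;
}

my @words = tokenize('The man reads a book.');
print join(' ', @words), "\n";

# The resulting list can then be handed to the parser:
#   $parser->parse(@words) or print "Parse error\n";
```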
CAVEATS
This is still an alpha version, and everything is subject to change. Use it with caution. Because it is written entirely in Perl, it is slow.
TO DO
Grammar learning, lexical relations, structural modeling, yacc-like input, error handling, etc. There is a lot of room for improvement.
COPYRIGHT
xern <xern@cpan.org>
This module is free software; you can redistribute it or modify it under the same terms as Perl itself.