NAME
Lingua::Align::Corpus - Perl extension for reading a tokenized plain-text corpus with one sentence per line; it can also be used as a virtual module to open other types of corpora (treebanks etc.) via the "-type" attribute
SYNOPSIS
use Lingua::Align::Corpus;
my $corpus = Lingua::Align::Corpus->new(-file => $corpusfile);
my @words=();
while ($corpus->next_sentence(\@words)){
  print "\n",$corpus->current_id,"> ";
  # print the tokens of the current sentence
  print join(' ',@words);
}
my $treebank = Lingua::Align::Corpus->new(-file => $corpusfile,
                                          -type => 'TigerXML');
my %tree=();
while ($treebank->next_sentence(\%tree)){
  print $treebank->print_sentence(\%tree);
  print "\n";
}
DESCRIPTION
SEE ALSO
AUTHOR
Joerg Tiedemann, <jorg.tiedemann@lingfil.uu.se>
COPYRIGHT AND LICENSE
Copyright (C) 2009 by Joerg Tiedemann
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.