NAME
preProcessingRewriting - Perl script for rewriting the POS-tagged terms provided by TreeTagger.
SYNOPSIS
preProcessingRewriting [--help] [--man] [--configuration file] [-i input_file] [-o output_file]
OPTIONS
- --help, -h, -? brief help message
- --man, -m full documentation
- input_file, -i BioYaTeA input file in TreeTagger output format
- output_file, -o Rewritten output file (TreeTagger format)
DESCRIPTION
This script pre-processes the TreeTagger output to improve the extraction of two kinds of terms: terms containing prepositional phrases (with the prepositions TO and AT) and terms containing participles (past participles in -ED and gerunds in -ING). Context-based rules are applied to the POS tags, either to trigger the extraction of relevant structures or to prevent the extraction of irrelevant ones. The modified file then serves as the new input file for BioYaTeA.
If no input file is specified, the input data are read from stdin. If no output file is specified, the output data are printed to stdout.
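To give an idea of what a context-based rule on POS tags can look like, here is a minimal sketch written as a shell filter around awk. It is not taken from the script itself: the function name rewrite_participles and the rule (retag a past participle VVN as an adjective JJ when the next token is a noun) are assumptions chosen for illustration, and the sketch assumes every line has exactly three tab-separated fields (token, tag, lemma).

```shell
# Hypothetical rule, for illustration only: retag a past participle (VVN)
# as an adjective (JJ) when the next token is a noun (NN or NNS).
# Input and output: TreeTagger format, one "token<TAB>tag<TAB>lemma" per line.
rewrite_participles() {
    awk -F'\t' -v OFS='\t' '
        # Print the buffered line, retagged if the current tag is a noun.
        NR > 1 {
            if (prev_tag == "VVN" && $2 ~ /^NNS?$/) prev_tag = "JJ"
            print prev_tok, prev_tag, prev_lem
        }
        # Buffer the current line until the tag of the next token is known.
        { prev_tok = $1; prev_tag = $2; prev_lem = $3 }
        # Flush the last line, which has no right context.
        END { if (NR > 0) print prev_tok, prev_tag, prev_lem }
    '
}

# Reads stdin and writes stdout, like the script when no files are given.
printf 'infected\tVVN\tinfect\ncells\tNNS\tcell\nwere\tVBD\tbe\nobserved\tVVN\tobserve\n' \
    | rewrite_participles
```

In this sample, "infected" is followed by the noun "cells" and is retagged JJ, while "observed" has no noun to its right and keeps its VVN tag. The one-line buffer is what provides the right-hand context; the actual rules applied by preProcessingRewriting may use different or wider contexts.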
INPUT/OUTPUT FILE FORMATS
See the documentation of Lingua::YaTeA.
EXAMPLES
preProcessingRewriting -i examples/sampleEN.ttg -o examples/sampleEN-prepro
preProcessingRewriting < examples/sampleEN.ttg > examples/sampleEN-prepro
SEE ALSO
Documentation of Lingua::BioYaTeA::PostProcessing, Lingua::BioYaTeA and Lingua::YaTeA
AUTHORS
Wiktoria Golik <wiktoria.golik@jouy.inra.fr>, Zorana Ratkovic <Zorana.Ratkovic@jouy.inra.fr>, Robert Bossy <Robert.Bossy@jouy.inra.fr>, Claire Nédellec <claire.nedellec@jouy.inra.fr>, Thierry Hamon <thierry.hamon@univ-paris13.fr>
LICENSE
Copyright (C) 2012 Wiktoria Golik, Zorana Ratkovic, Robert Bossy, Claire Nédellec and Thierry Hamon
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.6 or, at your option, any later version of Perl 5 you may have available.