NAME
bioyatea - Perl script for extracting terms from a corpus of biomedical texts (based on the module Lingua::YaTeA).
SYNOPSIS
bioyatea [-help] [-man] [--rcfile=file] file
OPTIONS
- --help, -h, -? brief help message
- --man, -m full documentation
- --rcfile=file load the given configuration file
- --extraction perform the term extraction
- --post-processing=file, -C file set the filename for the output in case of post-processing
- --pre-processing=file, -f file set the filename for the output in case of pre-processing
- --post-processing-config=file set the configuration file for the post-processing
- file corpus of texts in TreeTagger output format. If only post-processing is requested, the file is a YaTeA XML output file.
DESCRIPTION
BioYaTeA is an adaptation of YaTeA (Lingua::YaTeA) for biomedical text. The tuning concerns the configuration files (in the directory share/BioYaTeA), the pre-processing of the input file and the post-processing of the XML output.
USE OF BIOYATEA
BioYaTeA takes as input the output of TreeTagger (<http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionTreeTagger.html>) or GeniaTagger (<http://www.nactem.ac.uk/GENIA/tagger/>), so the corpus must be tagged with one of these tools first.
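As a quick sanity check before running the extraction, the following sketch counts the lines of a tagged file that do not have exactly three tab-separated fields. It assumes the usual TreeTagger layout (word form, part-of-speech tag and lemma, one token per line). The function name count_malformed_lines is hypothetical and is not part of BioYaTeA. Any markup lines kept in the tagger output would also be counted.

```shell
# Hypothetical helper, not part of BioYaTeA.
# Assumption: each token line holds three tab-separated fields
# (word form, part-of-speech tag, lemma).
count_malformed_lines() {
    # Print the number of non-empty lines whose field count is not 3.
    awk -F '\t' 'NF > 0 && NF != 3 { bad++ } END { print bad + 0 }' "$1"
}
```

A call such as count_malformed_lines TreeTaggerOutputFile.ttg prints a single number; a non-zero value suggests inspecting the file before passing it to bioyatea.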
To run bioyatea, a configuration file is needed (usually bioyatea.rc in /etc/bioyatea). This file describes the behaviour of the term extractor. You have to indicate the language of the configuration file you use (see the section CONFIGURATION FILE FORMAT of Lingua::YaTeA for more details). The file also indicates the path of the configuration files for the linguistic analysis. You have to adapt this path if your installation is not standard.
An example of the configuration file is available in etc/bioyatea/bioyatea.rc from the archive directory.
- The most common command line to run BioYaTeA is:
  bioyatea -e TreeTaggerOutputFile.ttg
  It is assumed that the directory containing the program bioyatea is in your PATH variable and that the configuration file is /etc/bioyatea/bioyatea.rc.
- If you are not allowed to copy the configuration file bioyatea.rc into the directory /etc/bioyatea (or to create this directory), or if you want to use your own configuration file, you can specify the file with its path by using the option --rcfile:
  bioyatea -e --rcfile MyBioYaTeAConfig.rc TreeTaggerOutputFile.ttg
More examples of the use of the bioyatea script are given below.
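The two cases above can be combined in a small wrapper. The sketch below prints the command line to use for a tagged file, adding --rcfile only when the standard configuration file is not readable. The function name bioyatea_command and the variable BIOYATEA_DEFAULT_RC are hypothetical: they belong to this sketch only, and bioyatea itself does not read that variable.

```shell
# Hypothetical wrapper, not part of BioYaTeA.
# Prints the bioyatea command line for one tagged file, adding --rcfile
# only when the standard configuration file is not readable.
bioyatea_command() {
    input="$1"          # TreeTagger output file
    fallback_rc="$2"    # your own configuration file
    # BIOYATEA_DEFAULT_RC is an assumption of this sketch (handy for testing).
    default_rc="${BIOYATEA_DEFAULT_RC:-/etc/bioyatea/bioyatea.rc}"
    if [ -r "$default_rc" ]; then
        printf 'bioyatea -e %s\n' "$input"
    else
        printf 'bioyatea -e --rcfile %s %s\n' "$fallback_rc" "$input"
    fi
}
```

For instance, bioyatea_command TreeTaggerOutputFile.ttg MyBioYaTeAConfig.rc prints one of the two command lines shown above, depending on whether /etc/bioyatea/bioyatea.rc is readable.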
INPUT/OUTPUT FILE FORMATS
See the documentation of Lingua::YaTeA.
EXAMPLES
Processing of a file without post-processing, with the default configuration file (/etc/bioyatea/bioyatea.rc):
bioyatea -e sampleEN.ttg
Processing of a file without post-processing. The configuration file is given in the option --rcfile:
bioyatea -e --rcfile etc/bioyatea.rc sampleEN.ttg
Processing of a file with post-processing:
bioyatea -e --rcfile etc/bioyatea.rc --post-processing-config etc/post-processing-filtering.conf --post-processing sampleEN-PP.xml sampleEN.ttg
Only post-processing a file (XML YaTeA output format):
bioyatea --post-processing-config etc/post-processing-filtering.conf --post-processing sampleEN-PP.xml sampleEN-output.xml
Processing of a file with pre-processing:
bioyatea -e --rcfile etc/bioyatea.rc --pre-processing sampleEN-prepro.ttg sampleEN.ttg
Only pre-processing a file (TreeTagger output format):
bioyatea --pre-processing sampleEN-prepro.ttg sampleEN.ttg
Processing of a file with pre-processing and post-processing:
bioyatea -e --rcfile etc/bioyatea.rc --post-processing-config etc/post-processing-filtering.conf --post-processing sampleEN-PP.xml --pre-processing sampleEN-prepro.ttg sampleEN.ttg
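When several tagged files have to be processed in the same way, the last command can be generated for each of them. The sketch below only prints the commands, so that they can be reviewed before being run. It reuses the file names of the examples above (the -prepro.ttg and -PP.xml suffixes, etc/bioyatea.rc and etc/post-processing-filtering.conf); the function names pipeline_command and pipeline_commands are hypothetical and not part of BioYaTeA.

```shell
# Hypothetical helpers, not part of BioYaTeA.
# Print the full pipeline command (pre-processing, extraction,
# post-processing) for one TreeTagger output file.
pipeline_command() {
    input="$1"
    base="${input%.ttg}"    # sampleEN.ttg -> sampleEN
    printf 'bioyatea -e --rcfile %s --post-processing-config %s --post-processing %s --pre-processing %s %s\n' \
        "etc/bioyatea.rc" \
        "etc/post-processing-filtering.conf" \
        "${base}-PP.xml" \
        "${base}-prepro.ttg" \
        "$input"
}

# Print one pipeline command per file given as argument.
pipeline_commands() {
    for f in "$@"; do
        pipeline_command "$f"
    done
}
```

Calling pipeline_commands with a list of .ttg files prints one bioyatea command line per file, each with its own pre-processing and post-processing output names.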
SEE ALSO
Documentation of Lingua::YaTeA
AUTHORS
Wiktoria Golik <wiktoria.golik@jouy.inra.fr>, Zorana Ratkovic <Zorana.Ratkovic@jouy.inra.fr>, Robert Bossy <Robert.Bossy@jouy.inra.fr>, Claire Nédellec <claire.nedellec@jouy.inra.fr>, Thierry Hamon <thierry.hamon@univ-paris13.fr>
LICENSE
Copyright (C) 2012 Wiktoria Golik, Zorana Ratkovic, Robert Bossy, Claire Nédellec and Thierry Hamon
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.6 or, at your option, any later version of Perl 5 you may have available.