NAME
Alvis::NLPPlatform::Annotation - Perl extension for managing XML annotation of documents in the Alvis format
SYNOPSIS
use Alvis::NLPPlatform::Annotation;
Alvis::NLPPlatform::Annotation::load_xml($doc_xml);
Alvis::NLPPlatform::Annotation::render_xml($doc_xml, \*STDOUT);
DESCRIPTION
This module provides two main methods (load_xml and render_xml) for loading and dumping annotated XML documents conforming to the Alvis DTD (see http://www.alvis.info ).
Documents are read from the standard input and loaded into a hash table. Annotated documents are written to a file through the descriptor given as a parameter. Note that the input documents may be annotated, partially annotated, or not annotated at all.
METHODS
read_key_id()
read_key_id($element_id);
This method returns the number in the id ($element_id) of the token or word XML element (10 in the element id 'token10').
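As an illustration of the id convention described above, here is a minimal sketch of extracting the number from an element id. The function name read_key_id_sketch is hypothetical and this is not the module's actual code; it assumes the number is the trailing run of digits in the id.

```perl
use strict;
use warnings;

# Hypothetical stand-in for read_key_id(), for illustration only.
# Assumes the numeric part is the trailing run of digits in the id.
sub read_key_id_sketch {
    my ($element_id) = @_;
    # Capture the digits at the end of the id, e.g. "10" in "token10".
    return $element_id =~ /(\d+)$/ ? $1 + 0 : undef;
}

print read_key_id_sketch('token10'), "\n";
```

Ids without a trailing number yield undef in this sketch; how the real method handles that case is not documented here.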
sort_keys()
sort_keys($element_id1, $element_id2);
This method compares two XML element ids ($element_id1 and $element_id2) after removing the string referring to the type of the XML element ("token", "word", etc.).
sort()
sort($ref_hashtable);
This method sorts the elements of the hash table ($ref_hashtable) according to the number in the id ($element_id) of the XML elements (10 in the element id 'token10').
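The numeric ordering matters because a plain string sort would place 'token10' before 'token2'. A minimal sketch of ordering hash keys by the number in their id follows; the helper names numeric_part and sorted_ids_sketch and the sample data are hypothetical, not taken from the module.

```perl
use strict;
use warnings;

# Hypothetical helper: strip the leading type prefix ("token", "word", ...)
# and return the remaining number.
sub numeric_part {
    my ($element_id) = @_;
    (my $number = $element_id) =~ s/^\D+//;
    return $number;
}

# Hypothetical stand-in for sort(): returns the keys of the referenced
# hash, ordered by the number in each id.
sub sorted_ids_sketch {
    my ($ref_hashtable) = @_;
    return sort { numeric_part($a) <=> numeric_part($b) } keys %{$ref_hashtable};
}

# Illustrative data only.
my %tokens = (token10 => 'c', token2 => 'b', token1 => 'a');
print join(' ', sorted_ids_sketch(\%tokens)), "\n";   # prints "token1 token2 token10"
```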
render()
render($doc_hash, $descriptor);
Writes the XML document annotation to the specified descriptor ($descriptor). The document is passed as a hash table ($doc_hash) loaded by the method load_xml. This hash table can be modified by NLP wrappers (Alvis::NLPPlatform::NLPWrappers).
The method returns 0 on success.
render_xml()
render_xml($doc_hash, $descriptor, $printCollectionHeaderFooter);
Main method used for generating XML document annotations. $descriptor is the descriptor of the file where the document will be stored. $doc_hash is the hash table containing the annotated document. $printCollectionHeaderFooter indicates whether the documentCollection header and footer have to be printed. $hash_config is the reference to the hash table containing the variables defined in the configuration file.
The method returns 0 on success.
load_xml()
load_xml($doc_xml);
Reads an annotated XML input document ($doc_xml) from STDIN. The loaded annotations are stored in a hash table. This hash table can be modified by NLP wrappers (Alvis::NLPPlatform::NLPWrappers).
The method returns 0 on success.
print_Annotation()
print_Annotation($descriptor, $string);
This method prints annotations to the descriptor and ensures conformance (to UTF-8, for instance).
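How the method enforces conformance is not detailed here. As a rough sketch of the idea, the following writes a string to a descriptor as UTF-8 bytes using the core Encode module. The function name print_utf8_sketch and the in-memory file handle are assumptions made for illustration, not the module's implementation.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical stand-in for print_Annotation(), for illustration only.
# Encodes the character string as UTF-8 bytes before writing it.
sub print_utf8_sketch {
    my ($descriptor, $string) = @_;
    print {$descriptor} encode('UTF-8', $string);
    return 0;
}

# Write to an in-memory file handle so the example needs no real file.
open(my $out, '>', \my $buffer) or die "cannot open in-memory handle: $!";
print_utf8_sketch($out, "caf\x{e9}\n");
close($out);
```

After the call, $buffer holds the UTF-8 encoded bytes of the string; any open file handle such as \*STDOUT could be passed instead.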
SEE ALSO
Alvis::NLPPlatform
Alvis web site: http://www.alvis.info
AUTHORS
Thierry Hamon <thierry.hamon@lipn.univ-paris13.fr> and Julien Deriviere <julien.deriviere@lipn.univ-paris13.fr>
LICENSE
Copyright (C) 2005 by Thierry Hamon and Julien Deriviere
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.6 or, at your option, any later version of Perl 5 you may have available.