NAME
Treex::Core::Block - the basic data-processing unit in the Treex framework
VERSION
version 0.07191
SYNOPSIS
    package Treex::Block::My::Block;
    use Moose;
    use Treex::Core::Common;
    extends 'Treex::Core::Block';

    sub process_bundle {
        my ( $self, $bundle ) = @_;
        # bundle processing
    }
DESCRIPTION
Treex::Core::Block is a base class serving as a common ancestor of all Treex blocks. Treex::Core::Block cannot be used directly in any scenario. Use its descendants, which implement one of the methods process_document(), process_bundle(), process_zone(), process_[atnp]tree() or process_[atnp]node().
CONSTRUCTOR
- my $block = Treex::Block::My::Block->new();
  An instance of a block derived from Treex::Core::Block can be created by the constructor. Optionally, a reference to a hash of block parameters can be passed as the constructor's argument (see "BLOCK PARAMETRIZATION"). However, the constructor is unlikely to appear in your code, since blocks are usually initialized automatically when a scenario is initialized.
METHODS FOR BLOCK EXECUTION
You must override one of the following methods:
- $block->process_document($document);
  Applies the block instance to the given instance of Treex::Core::Document. The default implementation iterates over all bundles in the document and calls process_bundle() on each of them, so in most cases you don't need to override this method.
- $block->process_bundle($bundle);
  Applies the block instance to the given bundle (Treex::Core::Bundle).
- $block->process_zone($zone);
  Applies the block instance to the given bundle zone (Treex::Core::BundleZone). Unlike process_document and process_bundle, process_zone requires the block attribute language (and possibly also selector) to be specified.
- $block->process_end();
  This method is called after all documents are processed. The default implementation is empty, but derived classes can override it, e.g. to print final summaries or statistics. Overriding this method is preferable both to standard Perl END blocks (where you cannot access $self and instance attributes) and to DEMOLISH (which is not called in some cases, e.g. treex --watch).
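The delegation described above (process_document calling process_bundle for every bundle, and process_end running once at the very end) can be pictured with a toy model in plain Perl. This is only a sketch of the pattern, not Treex code: ToyDocument, ToyBlock and ToyCounter are hypothetical stand-ins invented for illustration, and the real classes carry much more behaviour.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a document: nothing more than a list of bundles.
package ToyDocument;
sub new {
    my ( $class, @bundles ) = @_;
    return bless { bundles => [@bundles] }, $class;
}
sub get_bundles { my ($self) = @_; return @{ $self->{bundles} }; }

# Hypothetical stand-in for the base block class.
package ToyBlock;
sub new { my ($class) = @_; return bless {}, $class; }

# Default behaviour: hand every bundle of the document to process_bundle.
sub process_document {
    my ( $self, $document ) = @_;
    $self->process_bundle($_) for $document->get_bundles();
    return;
}

# A derived block is expected to override this method.
sub process_bundle { die "process_bundle was not overridden\n"; }

# Empty by default; derived blocks may override it.
sub process_end { return; }

# A derived block: counts bundles and reports the total at the end.
package ToyCounter;
our @ISA = ('ToyBlock');

sub process_bundle {
    my ( $self, $bundle ) = @_;
    $self->{seen}++;
    return;
}

sub process_end {
    my ($self) = @_;
    # $self is still available here, unlike in a plain Perl END block.
    print 'bundles seen: ', ( $self->{seen} // 0 ), "\n";
    return;
}

package main;

my $block = ToyCounter->new();
$block->process_document( ToyDocument->new(qw(b1 b2 b3)) );
$block->process_document( ToyDocument->new(qw(b4)) );
$block->process_end();    # called once, after all documents
```

The derived block overrides only process_bundle and process_end; the per-document loop is inherited, which is the reason overriding process_document is rarely needed.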
BLOCK PARAMETRIZATION
- my $block = BlockGroup::My_Block->new({ $name1 => $value1, $name2 => $value2 ... });
  Block instances can be parametrized by a hash containing parameter name/value pairs.
- my $param_value = $block->get_parameter($param_name);
  Parameter values used in block construction can be read with the get_parameter method, but they cannot be changed.
MISCELLANEOUS
- my $langcode_selector = $block->zone_label();
- my $block_name = $block->get_block_name();
  Returns the name of the block module.
- $block->get_required_share_files();
  If a block requires some files to be present in the shared part of Treex, their list (with relative paths starting in Treex::Core::Config->share_dir) can be specified by overriding this method. By default, an empty list is returned. The presence of the files is checked automatically in the block constructor. If any required file is missing, the constructor tries to download it from http://ufallab.ms.mff.cuni.cz.
  This method should be used especially for downloading statistical models, but not for installed tools or libraries.

    sub get_required_share_files {
        my $self = shift;
        return (
            'data/models/mytool/' . $self->language . '/features.gz',
            'data/models/mytool/' . $self->language . '/weights.tsv',
        );
    }
-
This method checks whether the files given as parameters exist and tries to download any that are not present.
SEE ALSO
Treex::Core::Node, Treex::Core::Bundle, Treex::Core::Document, Treex::Core::Scenario
AUTHOR
Zdeněk Žabokrtský <zabokrtsky@ufal.mff.cuni.cz>
Martin Popel <popel@ufal.mff.cuni.cz>
COPYRIGHT AND LICENSE
Copyright © 2011 by Institute of Formal and Applied Linguistics, Charles University in Prague
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.