NAME

DTA::CAB::Chain - serial multi-analyzer pipeline

SYNOPSIS

use DTA::CAB::Chain;

##========================================================================
## Constructors etc.

$obj = CLASS_OR_OBJ->new(%args);
@keys = $anl->typeKeys(\%opts);

##========================================================================
## Methods: Chain selection

\@analyzers = $ach->chain();
\@analyzers = $ach->subAnalyzers();

##========================================================================
## Methods: I/O

$bool = $ach->ensureLoaded();

##========================================================================
## Methods: Analysis

$bool = $ach->canAnalyze();
$bool = $anl->enabled(\%opts);
undef = $anl->initInfo();

$doc = $ach->analyzeTypes($doc,$types,\%opts);
$doc = $ach->analyzeSentences($doc,\%opts);
$doc = $ach->analyzeLocal($doc,\%opts);
$doc = $ach->analyzeClean($doc,\%opts);

DESCRIPTION

DTA::CAB::Chain is an abstract DTA::CAB::Analyzer subclass for implementing serial document processing "pipelines" or "cascades" in terms of a flat list of DTA::CAB::Analyzer objects.

Constructors etc.

new
$obj = CLASS_OR_OBJ->new(%args);

%$obj, %args:

chain => [ $a1, $a2, ..., $aN ],        ##-- default analysis chain; see also chain() method (default: empty)

typeKeys
@keys = $anl->typeKeys(\%opts);

Returns the list of type-wise keys to be expanded for this analyzer by expandTypes(). The default implementation just concatenates typeKeys() for all sub-analyzers.
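
The concatenation described above can be sketched as follows. This is a minimal illustration rather than the actual DTA::CAB::Chain code: the packages My::Leaf and My::Chain and the key names (xlit, morph, lemma) are hypothetical stand-ins, so the example runs without DTA::CAB installed.

```perl
use strict;
use warnings;

## My::Leaf: hypothetical stand-in for a leaf analyzer (not part of DTA::CAB)
package My::Leaf;
sub new { my ($class,%args) = @_; return bless { keys=>[], %args }, $class; }
sub typeKeys { my ($anl,$opts) = @_; return @{ $anl->{keys} }; }

## My::Chain: hypothetical stand-in for a chain of analyzers
package My::Chain;
sub new { my ($class,%args) = @_; return bless { chain=>[], %args }, $class; }
sub chain { return $_[0]{chain}; }
sub typeKeys {
  my ($ach,$opts) = @_;
  ## concatenate the keys of all sub-analyzers, in chain order
  return map { $_->typeKeys($opts) } @{ $ach->chain($opts) };
}

package main;
my $ach = My::Chain->new(chain => [
  My::Leaf->new(keys => [qw(xlit)]),
  My::Leaf->new(keys => [qw(morph lemma)]),
]);
my @keys = $ach->typeKeys({});
print "@keys\n";   ## prints "xlit morph lemma"
```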

Methods: Chain selection

chain
\@analyzers = $ach->chain();
\@analyzers = $ach->chain(\%opts)

Gets the selected analyzer chain. The default method returns all globally enabled analyzers in $ach->{chain}.
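
One way to picture the default selection is a filter over {chain} that keeps only enabled analyzers. The sketch below assumes exactly that; My::Leaf, My::Chain and the labels a1..a3 are hypothetical, and the real method may differ in detail.

```perl
use strict;
use warnings;

## My::Leaf: hypothetical leaf analyzer with an {enabled} flag
package My::Leaf;
sub new { my ($class,%args) = @_; return bless { label=>'leaf', enabled=>1, %args }, $class; }
sub enabled { return $_[0]{enabled} ? 1 : 0; }

## My::Chain: hypothetical chain; chain() returns an array-ref
package My::Chain;
sub new { my ($class,%args) = @_; return bless { chain=>[], %args }, $class; }
sub chain {
  my ($ach,$opts) = @_;
  ## keep only defined, enabled analyzers, preserving chain order
  return [ grep { $_ && $_->enabled($opts) } @{ $ach->{chain} } ];
}

package main;
my $ach = My::Chain->new(chain => [
  My::Leaf->new(label=>'a1'),
  My::Leaf->new(label=>'a2', enabled=>0),
  My::Leaf->new(label=>'a3'),
]);
my @labels = map { $_->{label} } @{ $ach->chain({}) };
print "@labels\n";   ## prints "a1 a3"
```

A subclass that needs option-dependent selection could override chain() and inspect \%opts there; the other methods in this sketch would then pick up the new selection unchanged.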

subAnalyzers
\@analyzers = $ach->subAnalyzers();
\@analyzers = $ach->subAnalyzers(\%opts)

Returns an array-ref of all sub-analyzers. The override just calls chain().

Methods: I/O

ensureLoaded
$bool = $ach->ensureLoaded();
$bool = $ach->ensureLoaded(\%opts)

Ensures analysis data is loaded from default files. The override calls $a->ensureLoaded() for each $a in $ach->subAnalyzers(\%opts).
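
A minimal sketch of this delegation follows. The classes are hypothetical stand-ins, and the policy shown (try every sub-analyzer, return true only if all of them succeed) is an assumption; the text above does not say how individual return values are combined.

```perl
use strict;
use warnings;

## My::Leaf: hypothetical leaf analyzer; {ok} simulates load success or failure
package My::Leaf;
sub new { my ($class,%args) = @_; return bless { label=>'leaf', ok=>1, tried=>0, %args }, $class; }
sub ensureLoaded {
  my ($anl,$opts) = @_;
  $anl->{tried} = 1;       ## a real analyzer would load its data files here
  return $anl->{ok};
}

## My::Chain: hypothetical chain delegating to its sub-analyzers
package My::Chain;
sub new { my ($class,%args) = @_; return bless { chain=>[], %args }, $class; }
sub subAnalyzers { return $_[0]{chain}; }
sub ensureLoaded {
  my ($ach,$opts) = @_;
  my $rc = 1;
  foreach my $sub (@{ $ach->subAnalyzers($opts) }) {
    ## assumption: continue after a failure and report overall success
    $rc = 0 if !$sub->ensureLoaded($opts);
  }
  return $rc;
}

package main;
my @subs = (
  My::Leaf->new(label=>'a1'),
  My::Leaf->new(label=>'a2', ok=>0),
  My::Leaf->new(label=>'a3'),
);
my $ach    = My::Chain->new(chain => \@subs);
my $ok     = $ach->ensureLoaded({});
my $ntried = grep { $_->{tried} } @subs;
print "ok=", ($ok ? 1 : 0), " tried=$ntried\n";   ## prints "ok=0 tried=3"
```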

Methods: Analysis

canAnalyze
$bool = $ach->canAnalyze();
$bool = $ach->canAnalyze(\%opts)

Returns true if analyzer can perform its function (e.g. data is loaded & non-empty). Override returns true if all enabled analyzers in the chain can analyze.

enabled
$bool = $anl->enabled(\%opts);

Returns true if $anl->{enabled} is true and at least one sub-analyzer is enabled (i.e. $anl->{enabled} conjoined with the disjunction over all sub-analyzers).
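
The two predicates canAnalyze() and enabled() combine sub-analyzer states differently: the former is a conjunction over the enabled analyzers, the latter a disjunction guarded by the chain's own flag. The sketch below illustrates that logic with hypothetical classes and a hypothetical {loaded} flag; it is not the DTA::CAB source.

```perl
use strict;
use warnings;

## My::Leaf: hypothetical leaf analyzer
package My::Leaf;
sub new { my ($class,%args) = @_; return bless { enabled=>1, loaded=>1, %args }, $class; }
sub enabled    { return $_[0]{enabled} ? 1 : 0; }
sub canAnalyze { return $_[0]{loaded}  ? 1 : 0; }

## My::Chain: hypothetical chain
package My::Chain;
sub new { my ($class,%args) = @_; return bless { enabled=>1, chain=>[], %args }, $class; }
sub chain {
  my ($ach,$opts) = @_;
  return [ grep { $_->enabled($opts) } @{ $ach->{chain} } ];
}
sub enabled {
  my ($ach,$opts) = @_;
  return 0 if !$ach->{enabled};                       ## own flag must be set ...
  my $n = grep { $_->enabled($opts) } @{ $ach->{chain} };
  return $n ? 1 : 0;                                  ## ... and some sub-analyzer enabled
}
sub canAnalyze {
  my ($ach,$opts) = @_;
  foreach my $sub (@{ $ach->chain($opts) }) {
    return 0 if !$sub->canAnalyze($opts);             ## every enabled analyzer must be ready
  }
  return 1;
}

package main;
## a disabled, unloaded analyzer does not block the chain
my $mixed = My::Chain->new(chain => [ My::Leaf->new, My::Leaf->new(enabled=>0, loaded=>0) ]);
## an enabled but unloaded analyzer does
my $unready = My::Chain->new(chain => [ My::Leaf->new, My::Leaf->new(loaded=>0) ]);
## a chain whose sub-analyzers are all disabled is itself disabled
my $alloff = My::Chain->new(chain => [ My::Leaf->new(enabled=>0) ]);

my @report = (
  $mixed->enabled({}), $mixed->canAnalyze({}),
  $unready->enabled({}), $unready->canAnalyze({}),
  $alloff->enabled({}),
);
print "@report\n";   ## prints "1 1 1 0 0"
```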

initInfo
undef = $anl->initInfo();

Logs initialization info. Default method reports values of {label}, enabled().

Methods: Analysis: API

analyzeTypes
$doc = $ach->analyzeTypes($doc,$types,\%opts);

Perform type-wise analysis of all (text) types in $doc->{types}. Chain default calls $a->analyzeTypes for each analyzer $a in the chain.

analyzeSentences
$doc = $ach->analyzeSentences($doc,\%opts);

Perform sentence-wise analysis of all sentences $doc->{body}[$si]. Chain default calls $a->analyzeSentences for each analyzer $a in the chain.

analyzeLocal
$doc = $ach->analyzeLocal($doc,\%opts);

Perform local document-level analysis of $doc. Chain default calls $a->analyzeLocal for each analyzer $a in the chain.

analyzeClean
$doc = $ach->analyzeClean($doc,\%opts);

Clean up any temporary data associated with $doc. Chain default calls $a->analyzeClean for each analyzer $a in the chain, then calls the superclass method Analyzer->analyzeClean.
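
The delegation pattern shared by the analyze* methods can be sketched as below, using analyzeTypes() and analyzeClean() as representatives. My::Base and My::Chain are hypothetical stand-ins for DTA::CAB::Analyzer and DTA::CAB::Chain, and the {log} key on the document exists only to make the call order visible.

```perl
use strict;
use warnings;

## My::Base: hypothetical stand-in for DTA::CAB::Analyzer; records each call
package My::Base;
sub new { my ($class,%args) = @_; return bless { label=>'anl', %args }, $class; }
sub analyzeTypes {
  my ($anl,$doc,$types,$opts) = @_;
  push @{ $doc->{log} }, "$anl->{label}:types";
  return $doc;
}
sub analyzeClean {
  my ($anl,$doc,$opts) = @_;
  push @{ $doc->{log} }, "$anl->{label}:clean";
  return $doc;
}

## My::Chain: hypothetical chain; delegates to each analyzer in order
package My::Chain;
use parent -norequire, 'My::Base';
sub chain { return $_[0]{chain} // []; }
sub analyzeTypes {
  my ($ach,$doc,$types,$opts) = @_;
  $_->analyzeTypes($doc,$types,$opts) foreach @{ $ach->chain($opts) };
  return $doc;
}
sub analyzeClean {
  my ($ach,$doc,$opts) = @_;
  $_->analyzeClean($doc,$opts) foreach @{ $ach->chain($opts) };
  return $ach->SUPER::analyzeClean($doc,$opts);   ## superclass cleanup runs last
}

package main;
my $ach = My::Chain->new(
  label => 'chain',
  chain => [ My::Base->new(label=>'a1'), My::Base->new(label=>'a2') ],
);
my $doc = { body=>[], log=>[] };
$ach->analyzeTypes($doc, {}, {});
$ach->analyzeClean($doc, {});
my $trace = join(' ', @{ $doc->{log} });
print "$trace\n";   ## prints "a1:types a2:types a1:clean a2:clean chain:clean"
```

In this sketch analyzeSentences() and analyzeLocal() would follow the same shape as analyzeTypes(), minus the $types argument.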

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2010-2019 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.24.1 or, at your option, any later version of Perl 5 you may have available.

SEE ALSO

dta-cab-analyze.perl(1), DTA::CAB::Analyzer(3pm), DTA::CAB::Chain::Multi(3pm), DTA::CAB(3pm), perl(1), ...