NAME

DiaColloDB::methods::compile - compile-time methods for DiaColloDB

SYNOPSIS

##========================================================================
## PRELIMINARIES

use DiaColloDB;
$coldb = DiaColloDB->new(%args);

##========================================================================
## create: utils

$multimap = $coldb->create_multimap($base, \%ts2i, $packfmt, $label="multimap");
\@attrs = $coldb->attrs();
$atitle = $CLASS_OR_OBJECT->attrTitle($attr_or_alias);
$acbexpr = $CLASS_OR_OBJECT->attrCountBy($attr_or_alias,$matchid=0);
$aquery_or_filter_or_undef = $CLASS_OR_OBJECT->attrQuery($attr_or_alias,$cquery);
\@attrdata = $coldb->attrData();
$bool = $coldb->hasAttr($attr);

##========================================================================
## create: from corpus

$coldb = $coldb->create($corpus,%opts);

##========================================================================
## create: union (aka merge)

$coldb = $CLASS_OR_OBJECT->union(\@coldbs_or_dbdirs,%opts);

DESCRIPTION

The DiaColloDB::methods::compile module adds compile-time methods to the top-level DiaColloDB package; see DiaColloDB for more details.

Prior to v2.12.012, the methods defined by this module were defined directly in the top-level DiaColloDB package.

create: utils

Variables: (%ATTR_ALIAS,%ATTR_RALIAS,%ATTR_TITLE,%ATTR_CBEXPR);

Global attribute alias hacks.

%ATTR_ALIAS  = ($name_or_alias=>$name, ...)
%ATTR_RALIAS = ($name=>\@aliases, ...)
%ATTR_CBEXPR = ($name=>$ddcCountByExpr, ...)
%ATTR_TITLE  = ($name_or_alias=>$title, ...)

create_multimap
$multimap = $coldb->create_multimap($base, \%ts2i, $packfmt, $label="multimap");

Create an expansion multimap, used by create().

attrs
\@attrs = $coldb->attrs();
\@attrs = $coldb->attrs($attrs=$coldb->{attrs}, $default=[]);

Parses the attributes in $attrs and returns them as an array reference.

attrName
$aname = $CLASS_OR_OBJECT->attrName($attr)

Returns the canonical (short) attribute name for $attr. Supports aliases in %ATTR_ALIAS = ($alias=>$name, ...).
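The alias lookup can be pictured as a plain hash lookup. The sketch below is not the module's code: the alias table entries, the fallback for unknown names, and the helper name attr_name_sketch are hypothetical, chosen only to illustrate the ($alias=>$name) and ($name=>\@aliases) shapes described above.

```perl
use strict;
use warnings;

## Hypothetical alias table in the shape ($name_or_alias => $name);
## the real %ATTR_ALIAS entries may differ.
my %ALIAS = (
  l => 'l', lemma => 'l',
  w => 'w', word  => 'w', token => 'w',
  p => 'p', pos   => 'p',
);

## Reverse table in the shape ($name => \@aliases), derived from %ALIAS.
my %RALIAS;
push @{ $RALIAS{ $ALIAS{$_} } }, $_ for sort keys %ALIAS;

## Hypothetical helper: return the canonical name for an alias,
## falling back to the argument itself for unknown attributes
## (the fallback is an assumption of this sketch).
sub attr_name_sketch {
  my ($attr) = @_;
  return $ALIAS{$attr} // $attr;
}

print attr_name_sketch('lemma'), "\n";
print join(',', @{ $RALIAS{w} }), "\n";
```

Deriving the reverse table from the forward table keeps the two from drifting apart, which is one plausible reason to store both shapes.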

attrTitle
$atitle = $CLASS_OR_OBJECT->attrTitle($attr_or_alias);

Returns an attribute title for $attr_or_alias.

attrCountBy
$acbexpr = $CLASS_OR_OBJECT->attrCountBy($attr_or_alias,$matchid=0);

Returns a DDC::XS::CQCountKeyExpr object for $attr_or_alias with match-id $matchid.

attrQuery
$aquery_or_filter_or_undef = $CLASS_OR_OBJECT->attrQuery($attr_or_alias,$cquery);

Returns a DDC::XS::CQuery or DDC::XS::CQFilter object for condition $cquery on $attr_or_alias.

attrData
\@attrdata = $coldb->attrData();
\@attrdata = $coldb->attrData(\@attrs=$coldb->attrs);

Gets attribute data for \@attrs; returns @attrdata = ({a=>$a, i=>$i, enum=>$aenum, pack_x=>$pack_xa, a2x=>$a2x, ...}).

hasAttr
$bool = $coldb->hasAttr($attr);

Returns true iff $coldb natively supports the attribute (or alias) $attr.

create: from corpus

create
$coldb = $CLASS->create($corpus,%opts);
$coldb = $coldb->create($corpus,%opts);

Creates and returns a new DiaColloDB database object $coldb from a DiaColloDB::Corpus object $corpus. Options in %opts override the corresponding properties of %$coldb.

If $corpus is a pre-compiled and pre-filtered DiaColloDB::Corpus::Compiled object, only the corpus content filters pre-compiled into $corpus itself are used. Otherwise, a temporary DiaColloDB::Corpus::Compiled object will be created for $corpus, and the DiaColloDB::Corpus::Filters keys of $coldb itself will be used as content filters.

Honors the $coldb properties index_tdf, index_xf, and index_cof to determine which underlying DiaColloDB::Relation objects are included in the output database.

Applies the frequency cutoffs tfmin, cfmin, etc. after parsing the (filtered) corpus data.
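As a rough illustration of what a minimum-frequency cutoff does, the sketch below drops low-frequency entries from a count table once all counts have been collected. The data, the threshold value, and the helper name apply_fmin are hypothetical; the module's actual handling of tfmin, cfmin, etc. is not reproduced here.

```perl
use strict;
use warnings;

## Hypothetical helper: keep only entries whose count is >= $fmin.
sub apply_fmin {
  my ($counts, $fmin) = @_;
  my %kept;
  for my $key (keys %$counts) {
    $kept{$key} = $counts->{$key} if $counts->{$key} >= $fmin;
  }
  return \%kept;
}

my %tf   = (Haus => 12, Hauss => 1, Baum => 5);  ## toy term frequencies
my $kept = apply_fmin(\%tf, 2);                  ## toy threshold

print join(' ', map { $_ . '=' . $kept->{$_} } sort keys %$kept), "\n";
```

Applying the cutoff only after the whole input has been counted matters: an item that is rare in one document may still pass the threshold over the full corpus.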

Honors $DiaColloDB::NJOBS (see nJobs in DiaColloDB::Utils) for partial parallelization of selected sub-tasks.

Currently does not support appending new data to an existing DiaColloDB index.

create: union (aka merge)

union
$coldb = $CLASS->union(\@coldbs_or_dbdirs,%opts);
$coldb = $coldb->union(\@coldbs_or_dbdirs,%opts);

Populates $coldb as a union over @coldbs_or_dbdirs. Clobbers argument DB keys {_union_${a}i2u}, {_union_xi2u}, {_union_argi} for administrative purposes.

Often faster than using the create() method on the original source corpora, since corpus document file(s) do not need to be re-parsed for union() operations.
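The key names {_union_${a}i2u} suggest per-argument maps from each source database's local IDs to IDs in the union. The sketch below illustrates that general idea on toy data only; the enum contents, the variable names, and the helper union_enums are hypothetical and do not reflect the module's actual data structures.

```perl
use strict;
use warnings;

## Hypothetical helper: merge several (string => local_id) enums into one
## union enum. Returns the union enum plus one (local_id => union_id) map
## per argument, in argument order.
sub union_enums {
  my @enums = @_;
  my (%union, @i2u);
  my $next = 0;
  for my $argi (0 .. $#enums) {
    my $enum = $enums[$argi];
    ## visit symbols in local-id order so union ids are deterministic
    for my $sym (sort { $enum->{$a} <=> $enum->{$b} } keys %$enum) {
      $union{$sym} = $next++ unless exists $union{$sym};
      $i2u[$argi]{ $enum->{$sym} } = $union{$sym};
    }
  }
  return (\%union, \@i2u);
}

my ($union, $i2u) = union_enums(
  { Haus => 0, Baum => 1 },   ## toy enum for the first argument DB
  { Baum => 0, Wald => 1 },   ## toy enum for the second argument DB
);

print "Baum: local id 0 in the second DB maps to union id $i2u->[1]{0}\n";
```

With one such map per argument, already-indexed records can be re-keyed into the union's ID space without returning to the source documents, which is in line with the speed rationale given above.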

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2015-2020 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.

SEE ALSO

DiaColloDB(3pm), ...