NAME
DiaColloDB::methods::compile - compile-time methods for DiaColloDB
SYNOPSIS
##========================================================================
## PRELIMINARIES
use DiaColloDB;
$coldb = DiaColloDB->new(%args);
##========================================================================
## create: utils
$multimap = $coldb->create_multimap($base, \%ts2i, $packfmt, $label="multimap");
\@attrs = $coldb->attrs();
$atitle = $CLASS_OR_OBJECT->attrTitle($attr_or_alias);
$acbexpr = $CLASS_OR_OBJECT->attrCountBy($attr_or_alias,$matchid=0);
$aquery_or_filter_or_undef = $CLASS_OR_OBJECT->attrQuery($attr_or_alias,$cquery);
\@attrdata = $coldb->attrData();
$bool = $coldb->hasAttr($attr);
##========================================================================
## create: from corpus
$coldb = $coldb->create($corpus,%opts);
##========================================================================
## create: union (aka merge)
$coldb = $CLASS_OR_OBJECT->union(\@coldbs_or_dbdirs,%opts);
DESCRIPTION
The DiaColloDB::methods::compile module adds compile-time methods for the top-level DiaColloDB package, which see for more details. Prior to v2.12.012, the methods defined by this module were defined directly in the top-level DiaColloDB package.
create: utils
- Variables: (%ATTR_ALIAS,%ATTR_RALIAS,%ATTR_TITLE,%ATTR_CBEXPR);
-
Global attribute alias hacks.
%ATTR_ALIAS = ($name_or_alias=>$name, ...)
%ATTR_RALIAS = ($name=>\@aliases, ...)
%ATTR_CBEXPR = ($name=>$ddcCountByExpr, ...)
%ATTR_TITLE = ($name_or_alias=>$title, ...)
- create_multimap
-
$multimap = $coldb->create_multimap($base, \%ts2i, $packfmt, $label="multimap");
Create an expansion multimap, used by create().
- attrs
-
\@attrs = $coldb->attrs();
\@attrs = $coldb->attrs($attrs=$coldb->{attrs}, $default=[]);
Parses the attributes in $attrs as an array.
- attrName
-
$aname = $CLASS_OR_OBJECT->attrName($attr);
Returns canonical (short) attribute name for $attr. Supports aliases in %ATTR_ALIAS = ($alias=>$name, ...).
- attrTitle
-
$atitle = $CLASS_OR_OBJECT->attrTitle($attr_or_alias);
Returns an attribute title for $attr_or_alias.
- attrCountBy
-
$acbexpr = $CLASS_OR_OBJECT->attrCountBy($attr_or_alias,$matchid=0);
Returns a DDC::XS::CQCountKeyExpr object for $attr_or_alias with match-id $matchid.
- attrQuery
-
$aquery_or_filter_or_undef = $CLASS_OR_OBJECT->attrQuery($attr_or_alias,$cquery);
Returns a DDC::XS::CQuery or DDC::XS::CQFilter object for condition $cquery on $attr_or_alias.
- attrData
-
\@attrdata = $coldb->attrData();
\@attrdata = $coldb->attrData(\@attrs=$coldb->attrs)
Gets attribute data for \@attrs; returns @attrdata = ({a=>$a, i=>$i, enum=>$aenum, pack_x=>$pack_xa, a2x=>$a2x, ...})
- hasAttr
-
$bool = $coldb->hasAttr($attr);
Returns true iff $coldb natively supports the attribute (or alias) $attr.
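The alias tables and the name lookup described above can be illustrated with a minimal, self-contained sketch. It does not load DiaColloDB: the table contents, the %alias and %ralias variables, and the canonical_name() helper are hypothetical stand-ins that only mirror the documented shapes of %ATTR_ALIAS and %ATTR_RALIAS. Passing unknown names through unchanged is an assumption of the sketch, not documented behavior of attrName().

```perl
use strict;
use warnings;

## Hypothetical alias table in the shape of %ATTR_ALIAS: ($name_or_alias => $name, ...).
## The entries are illustrative only; they are not copied from DiaColloDB.
my %alias = (
  l => 'l', lemma => 'l', lem => 'l',
  p => 'p', pos   => 'p', tag => 'p',
);

## Derive a reverse table in the shape of %ATTR_RALIAS: ($name => \@aliases, ...).
my %ralias;
push @{ $ralias{ $alias{$_} } }, $_ for sort keys %alias;

## Resolve an attribute name or alias to its canonical short name,
## in the spirit of attrName(). Unknown names are returned unchanged
## (an assumption of this sketch).
sub canonical_name {
  my ($attr) = @_;
  return exists $alias{$attr} ? $alias{$attr} : $attr;
}

print canonical_name('lemma'), "\n";
print join(',', @{ $ralias{p} }), "\n";
```

Keeping both directions in separate tables makes either lookup a single hash access, at the cost of having to keep the two tables consistent.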
create: from corpus
- create
-
$coldb = $CLASS->create($corpus,%opts);
$coldb = $coldb->create($corpus,%opts);
Create and return a new DiaColloDB database object $coldb from a DiaColloDB::Corpus object $corpus. %opts overrides %$coldb properties.
If $corpus is a pre-compiled and pre-filtered DiaColloDB::Corpus::Compiled object, only the corpus content filters pre-compiled into $corpus itself are used. Otherwise, a temporary DiaColloDB::Corpus::Compiled object will be created for $corpus, and the DiaColloDB::Corpus::Filters keys of $coldb itself will be used as content filters.
Honors the $coldb properties index_tdf, index_xf, and index_cof to determine which underlying DiaColloDB::Relation objects are included in the output database.
Applies frequency cutoffs tfmin, cfmin, etc. after parsing (filtered) corpus data.
Honors $DiaColloDB::NJOBS (see DiaColloDB::Utils::nJobs) for partial parallelization of selected sub-tasks.
Currently does not support appending new data to an existing DiaColloDB index.
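The ordering of counting and cutoff application in create() can be pictured with a small stand-alone sketch. It is not DiaColloDB's implementation and does not load the module: the token list is invented, and it assumes that tfmin acts as a minimum term frequency, which the text above does not spell out.

```perl
use strict;
use warnings;

## Hypothetical token stream; in DiaColloDB the terms come from the parsed corpus.
my @tokens = qw(haus maus haus baum haus maus);

## Step 1: count every term over the whole (already filtered) input.
my %tf;
$tf{$_}++ for @tokens;

## Step 2: apply the cutoff only once all counts are known.
## $tfmin plays the role of the tfmin property (assumed here to be a
## minimum term frequency); 2 is an arbitrary illustrative value.
my $tfmin = 2;
my %kept = map { $_ => $tf{$_} } grep { $tf{$_} >= $tfmin } keys %tf;

print "$_\t$kept{$_}\n" for sort keys %kept;
```

Applying the threshold only after the full pass matters: a term that is rare in the first documents may still reach the threshold later, so pruning during parsing would discard it too early.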
create: union (aka merge)
- union
-
$coldb = $CLASS->union(\@coldbs_or_dbdirs,%opts);
$coldb = $coldb->union(\@coldbs_or_dbdirs,%opts);
Populates $coldb as a union over @coldbs_or_dbdirs. Clobbers argument DB keys {_union_${a}i2u}, {_union_xi2u}, {_union_argi} for administrative purposes.
Often faster than using the create() method on the original source corpora, since corpus document file(s) do not need to be re-parsed for union() operations.
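One way to picture a union over several databases is as a merge of their symbol enumerations, with a per-argument table translating each argument-local id to an id in the union. The sketch below is a guess at the general idea suggested by key names such as {_union_xi2u}; it is not taken from DiaColloDB's code, and the @args data and the %union and @i2u variables are hypothetical.

```perl
use strict;
use warnings;

## Two hypothetical argument enums, each mapping a symbol to a DB-local integer id.
my @args = (
  { haus => 0, maus => 1 },
  { maus => 0, baum => 1 },
);

## Build the union enum and, per argument, an id-translation table
## (argument-local id => union id).
my (%union, @i2u);
for my $argi (0 .. $#args) {
  my $enum = $args[$argi];
  ## Visit symbols in order of their local ids so that union ids are deterministic.
  for my $sym (sort { $enum->{$a} <=> $enum->{$b} } keys %$enum) {
    unless (exists $union{$sym}) {
      my $uid = scalar keys %union;   # next free union id
      $union{$sym} = $uid;
    }
    $i2u[$argi][ $enum->{$sym} ] = $union{$sym};
  }
}

print "arg $_: @{ $i2u[$_] }\n" for 0 .. $#i2u;
```

With tables of this kind, records stored under local ids can be re-keyed to union ids without going back to the source documents, which is consistent with the remark above that union() avoids re-parsing corpus files.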
AUTHOR
Bryan Jurish <moocow@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2015-2020 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.
SEE ALSO
DiaColloDB(3pm), ...