NAME
DiaColloDB::Relation - diachronic collocation db, relation API (abstract & utilities)
SYNOPSIS
##========================================================================
## PRELIMINARIES
use DiaColloDB::Relation;
##========================================================================
## Constructors etc.
$rel = $CLASS_OR_OBJECT->new(%args);
##========================================================================
## Relation API: creation
$rel = $CLASS_OR_OBJECT->create($coldb, $tokdat_file, %opts);
$rel = $CLASS_OR_OBJECT->union($coldb, \@pairs, %opts);
##========================================================================
## Relation API: profiling
$mprf = $rel->profile($coldb, %opts);
$mprf = $rel->extend($coldb, %opts);
$mpdiff = $rel->compare($coldb, %opts);
$mpdiff = $rel->diff($coldb, %opts);
##========================================================================
## Relation API: default
\%slice2prf = $rel->subprofile1(\@tids, \%opts);
\%slice2prf = $rel->subprofile2(\%slice2prf, %opts);
\%slice2prf = $rel->subextend(\%slice2prf, \%opts);
\%qinfo = $rel->qinfo($coldb, %opts);
(\@q1strs,\@q2strs,\@qxstrs,\@fstrs) = $rel->qinfoData($coldb,%opts);
DESCRIPTION
DiaColloDB::Relation is a base class for low-level indices capable of returning raw frequency data suitable for constructing DiaColloDB::Profile::Multi objects. In addition to the API specification, the DiaColloDB::Relation package also provides several common utility methods used by native DiaColloDB index types.
Globals & Constants
- Variable: @ISA
-
DiaColloDB::Relation inherits from DiaColloDB::Persistent.
Constructors etc.
- new
-
$rel = $CLASS_OR_OBJECT->new(%args);
%args, object structure: nothing here, see subclass documentation for details.
Relation API: creation
- create
-
$rel = $CLASS_OR_OBJECT->create($coldb, $tokdat_file, %opts);
populates relation database from $tokdat_file, a tt-style text file with lines of the form:
TID DATE   ##-- single token
"\n"       ##-- blank line ~ EOS (hard co-occurrence boundary)
%opts: clobber %$rel
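To illustrate the input format described above, the following sketch parses tt-style text into sentences of [TID, DATE] pairs, treating blank lines as sentence boundaries. It is a stand-alone toy rather than the library's own reader: the helper name parse_tokdat is hypothetical, and whitespace-separated fields are an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: parse tt-style text into a list of sentences,
# each sentence being a list of [TID, DATE] pairs.
# Assumption: fields are separated by whitespace (e.g. a tab).
sub parse_tokdat {
  my ($text) = @_;
  my (@sents, @cur);
  foreach my $line (split /\n/, $text, -1) {
    if ($line =~ /^\s*$/) {
      # blank line ~ EOS: close the current sentence, if any
      push @sents, [@cur] if @cur;
      @cur = ();
      next;
    }
    my ($tid, $date) = split ' ', $line;
    push @cur, [$tid, $date];
  }
  push @sents, [@cur] if @cur;   # final sentence without a trailing blank line
  return \@sents;
}

my $sents = parse_tokdat("0\t1900\n1\t1900\n\n2\t1901\n");
printf "%d sentences\n", scalar @$sents;
```

Because a blank line is a hard co-occurrence boundary, tokens in different sentences of the returned list would never be counted as collocates of one another.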
- union
-
$rel = $CLASS_OR_OBJECT->union($coldb, \@pairs, %opts);
merge multiple co-frequency indices into new object
@pairs: array of pairs ([$argrel,\@ti2u],...) of relation-objects $argrel and tuple-id maps \@ti2u for $argrel
%opts: clobber %$rel
should implicitly flush the new relation index
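The role of the tuple-id maps can be sketched with plain hashes standing in for co-frequency indices. This is only an illustration of the renumber-and-sum idea: the helper union_cofreqs and the "TID1 TID2" key layout are hypothetical, and real relation objects are persistent index structures rather than in-memory hashes.

```perl
use strict;
use warnings;

# Toy stand-in for a co-frequency index: { "TID1 TID2" => frequency }.
# Hypothetical helper: merge such indices after renumbering their tuple-ids.
sub union_cofreqs {
  my (@pairs) = @_;              # ([\%cof, \@ti2u], ...)
  my %union;
  foreach my $pair (@pairs) {
    my ($cof, $ti2u) = @$pair;
    foreach my $key (keys %$cof) {
      my ($i1, $i2) = split ' ', $key;
      # map argument-local tuple-ids to ids in the union
      my $ukey = join ' ', $ti2u->[$i1], $ti2u->[$i2];
      $union{$ukey} += $cof->{$key};
    }
  }
  return \%union;
}

my $u = union_cofreqs(
  [ { '0 1' => 2 },             [0, 1] ],  # first index: ids already agree
  [ { '1 0' => 3, '1 1' => 1 }, [1, 0] ],  # second index: local 0->1, 1->0
);
print map { "$_\t$u->{$_}\n" } sort keys %$u;
```

Here the pair stored as '1 0' in the second index maps onto the same union key as '0 1' in the first, so their frequencies are summed.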
Relation API: profiling
- profile
-
$mprf = $rel->profile($coldb, %opts);
Get a relation-specific profile for selected items as a DiaColloDB::Profile::Multi object; called by DiaColloDB::profile().
%opts:
##-- selection parameters
query => $query,       ##-- target request ATTR:REQ...
date => $date1,        ##-- string or array or range "MIN-MAX" (inclusive) : default=all
##
##-- aggregation parameters
slice => $slice,       ##-- date slice (default=1, 0 for global profile)
groupby => $groupby,   ##-- string or array "ATTR1[:HAVING1] ...": default=$coldb->attrs; see groupby() method
##
##-- scoring and trimming parameters
eps => $eps,           ##-- smoothing constant (default=0)
score => $func,        ##-- scoring function (f|fm|lf|lfm|mi|ld) : default="f"
kbest => $k,           ##-- return only $k best collocates per date (slice) : default=-1:all
cutoff => $cutoff,     ##-- minimum score
global => $bool,       ##-- trim profiles globally (vs. locally for each date-slice?) (default=0)
##
##-- profiling and debugging parameters
strings => $bool,      ##-- do/don't stringify (default=do)
fill => $bool,         ##-- if true, returned multi-profile will have null profiles inserted for missing slices
onepass => $bool,      ##-- if true, use fast but incorrect 1-pass method (default=0; Cofreqs subclass only)
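As a rough illustration of how the date and slice options interact, the sketch below filters dates by an inclusive "MIN-MAX" range and groups the survivors by slice label. Both helpers are hypothetical, and labelling each slice by its lower bound is an assumption made for this example, not something stated here.

```perl
use strict;
use warnings;

# Hypothetical helper: slice label for a date.
# Assumption: a slice of width $slice is labelled by its lower bound,
# and slice=0 collapses everything into one global slice labelled 0.
sub slice_label {
  my ($date, $slice) = @_;
  return $slice ? int($date / $slice) * $slice : 0;
}

# Hypothetical helper: does $date fall in an inclusive "MIN-MAX" range?
sub in_range {
  my ($date, $range) = @_;
  my ($min, $max) = split /-/, $range, 2;
  return $date >= $min && $date <= $max;
}

my %counts;
foreach my $date (1895, 1901, 1909, 1910, 1923) {
  next unless in_range($date, '1900-1919');
  $counts{ slice_label($date, 10) }++;
}
print map { "$_\t$counts{$_}\n" } sort { $a <=> $b } keys %counts;
```

With slice=10, the three dates inside the range fall into two slices; with slice=0 they would all share the single global label.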
The default implementation
parses the request and extracts target tuple-ids,
calls $rel->subprofile1() to compute slice-wise joint frequency profiles (f12),
calls $rel->subprofile2() to compute independent collocate frequencies (f2), and finally
collects the result in a DiaColloDB::Profile::Multi object.
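The two-pass flow listed above might be sketched with a toy relation whose data live in plain hashes. Only the method names subprofile1() and subprofile2() come from this API; the class My::ToyRelation, its f12/f2 storage layout, and the shape of the returned per-slice hashes are assumptions for illustration.

```perl
use strict;
use warnings;

package My::ToyRelation;   # hypothetical toy class, not part of DiaColloDB

sub new { my ($class, %args) = @_; return bless { %args }, $class; }

# first pass: joint frequencies (f12) per slice for the target tuple-ids
sub subprofile1 {
  my ($rel, $tids, $opts) = @_;
  my %slice2prf;
  foreach my $t (@$tids) {
    my $bySlice = $rel->{f12}{$t} or next;
    foreach my $slice (keys %$bySlice) {
      my $prf = ($slice2prf{$slice} //= { f12 => {}, f2 => {} });
      $prf->{f12}{$_} += $bySlice->{$slice}{$_} foreach keys %{ $bySlice->{$slice} };
    }
  }
  return \%slice2prf;
}

# second pass: independent frequencies (f2) for collocates found in pass 1
sub subprofile2 {
  my ($rel, $slice2prf, %opts) = @_;
  foreach my $slice (keys %$slice2prf) {
    my $prf = $slice2prf->{$slice};
    $prf->{f2}{$_} = $rel->{f2}{$slice}{$_} // 0 foreach keys %{ $prf->{f12} };
  }
  return $slice2prf;
}

package main;

my $rel = My::ToyRelation->new(
  f12 => { 7 => { 1900 => { 3 => 2, 4 => 1 } } },  # target 7 co-occurs with 3 and 4
  f2  => { 1900 => { 3 => 10, 4 => 5 } },          # independent collocate counts
);
my $slice2prf = $rel->subprofile2( $rel->subprofile1([7], {}) );
print join(' ', map { "$_:$slice2prf->{1900}{f2}{$_}" } sort keys %{ $slice2prf->{1900}{f2} }), "\n";
```

Splitting the work this way means the second pass only has to look up f2 for collocates that actually appeared in the first pass.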
Default values for %opts should be set by a higher-level call, e.g. DiaColloDB::profile().
- extend
-
$mprf = $rel->extend($coldb, %opts);
Get independent f2 frequencies for $opts{slice2keys} as a DiaColloDB::Profile::Multi object; called by DiaColloDB::extend().
%opts: as for profile(), also:
slice2keys => \%slice2keys, ##-- target f2-items by slice-label (REQUIRED)
Default implementation calls $rel->subextend().
- compare
-
$mpdiff = $rel->compare($coldb, %opts);
Get a relation-specific comparison profile for selected items as a DiaColloDB::Profile::MultiDiff object.
%opts:
##-- selection parameters
(a|b)?query => $query,   ##-- target query as for parseRequest()
(a|b)?date => $date1,    ##-- string or array or range "MIN-MAX" (inclusive) : default=all
##
##-- aggregation parameters
groupby => $groupby,     ##-- string or array "ATTR1[:HAVING1] ...": default=$coldb->attrs; see groupby() method
(a|b)?slice => $slice,   ##-- date slice (default=1, 0 for global profile)
##
##-- scoring and trimming parameters
eps => $eps,             ##-- smoothing constant (default=0)
score => $func,          ##-- scoring function (f|fm|lf|lfm|mi|ld) : default="f"
kbest => $k,             ##-- return only $k best collocates per date (slice) : default=-1:all
cutoff => $cutoff,       ##-- minimum score
global => $bool,         ##-- trim profiles globally (vs. locally for each date-slice?) (default=0)
diff => $diff,           ##-- low-level score-diff operation (diff|adiff|sum|min|max|avg|havg); default='adiff'
##
##-- profiling and debugging parameters
strings => $bool,        ##-- do/don't stringify (default=do)
onepass => $bool,        ##-- if true, use fast but incorrect 1-pass profiling method (default=0)
##
##-- subclass abstraction parameters
_gbparse => $bool,       ##-- if true (default), 'groupby' clause will be parsed only once, using $coldb->groupby() method
_abkeys => \@abkeys,     ##-- additional key-suffixes KEY s.t. (KEY=>VAL) gets passed to profile() calls if e.g. (aKEY=>VAL) is in %opts
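One way to read the (a|b)? prefixes and the _abkeys option above is sketched below: a compare()-style option hash is split into one option set per operand. The helper split_ab_opts is hypothetical, and the rule that an operand-specific aKEY or bKEY takes precedence over a shared KEY is an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: derive per-operand options for the two underlying
# profile() calls from a compare()-style option hash.
sub split_ab_opts {
  my (%opts) = @_;
  my @abkeys = (qw(query date slice), @{ $opts{_abkeys} // [] });
  my (%aopts, %bopts);
  foreach my $key (@abkeys) {
    # assumption: operand-specific value (aKEY / bKEY) overrides shared KEY
    my $aval = $opts{"a$key"} // $opts{$key};
    my $bval = $opts{"b$key"} // $opts{$key};
    $aopts{$key} = $aval if defined $aval;
    $bopts{$key} = $bval if defined $bval;
  }
  return (\%aopts, \%bopts);
}

my ($aopts, $bopts) = split_ab_opts(
  aquery => 'Haus',
  bquery => 'Heim',
  date   => '1900-1999',   # shared by both operands
  bslice => 0,             # only the b-operand gets a global profile
);
print "a: $aopts->{query} $aopts->{date}\n";
print "b: $bopts->{query} $bopts->{date} slice=$bopts->{slice}\n";
```

Passing, say, _abkeys => ['cutoff'] to the helper would let acutoff and bcutoff be split in the same way.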
The default implementation just wraps the profile() method; default values for %opts should be set by a higher-level call, e.g. DiaColloDB::compare().
- diff
-
$mpdiff = $rel->diff($coldb, %opts);
alias for compare()
Relation API: default
- subprofile1
-
\%slice2prf = $rel->subprofile1(\@tids,\%opts);
Native index API low-level first-pass profiling function for joint frequency acquisition (f12); default implementation just throws an error.
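The "default implementation throws an error" convention can be sketched as an abstract method that subclasses are expected to override. Both classes below are hypothetical stand-ins, and the error message and return structure are assumptions, not the library's actual ones.

```perl
use strict;
use warnings;

package My::BaseRelation;      # hypothetical stand-in for the abstract base class
sub new { return bless {}, shift }
sub subprofile1 {
  my $rel = shift;
  die ref($rel) . "::subprofile1(): abstract method called\n";
}

package My::ConcreteRelation;  # hypothetical subclass supplying the method
our @ISA = ('My::BaseRelation');
sub subprofile1 {
  my ($rel, $tids, $opts) = @_;
  my %f12 = map { ($_ => 1) } @$tids;   # dummy joint frequencies
  return { 0 => { f12 => \%f12 } };     # single slice labelled 0
}

package main;

# calling the base-class method directly fails; eval traps the error
my $ok  = eval { My::BaseRelation->new->subprofile1([1], {}); 1 };
my $err = $@;
my $prf = My::ConcreteRelation->new->subprofile1([1, 2], {});
print "base class failed as expected: $err" unless $ok;
```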
- subprofile2
-
\%slice2prf = $rel->subprofile2(\%slice2prf, %opts);
Native index API low-level second-pass profiling function for independent frequency acquisition (f2); default implementation just returns \%slice2prf, which is appropriate for relations which use a single-pass strategy to populate $prf->{f2} in their implementation of subprofile1().
- subextend
-
\%slice2prf = $rel->subextend(\%slice2prf,\%opts);
Native index API low-level profile-extension function for slice-wise independent frequency acquisition (f2). Default implementation throws an error.
- qinfo
-
\%qinfo = $rel->qinfo($coldb, %opts);
get query-info hash for profile administrivia (ddc kwic links).
%opts: as for profile(), additionally:
qreqs => \@areqs,    ##-- as returned by $coldb->parseRequest($opts{query})
gbreq => \%groupby,  ##-- as returned by $coldb->groupby($opts{groupby})
- qinfoData
-
(\@q1strs,\@q2strs,\@qxstrs,\@fstrs) = $rel->qinfoData($coldb,%opts);
parses @opts{qw(qreqs gbreq)} into conditions on w1, w2 and metadata filters (for ddc linkup). Call this from subclass qinfo() methods.
AUTHOR
Bryan Jurish <moocow@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2015-2020 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.
SEE ALSO
DiaColloDB::Persistent(3pm), DiaColloDB::Relation::Cofreqs(3pm), DiaColloDB::Relation::Unigrams(3pm), DiaColloDB::Relation::TDF(3pm), DiaColloDB::Relation::DDC(3pm), DiaColloDB(3pm), perl(1), ...