##======================================================================== ## NAME =pod

NAME

GermaNet::Flat - Simple flat interface to GermaNet (and other) thesaurus relations

SYNOPSIS

##========================================================================
## PRELIMINARIES

use GermaNet::Flat;

##========================================================================
## Basics

$gn  = GermaNet::Flat->new();
$ver = $gn->dbversion();
$gn  = $gn->clear();

##========================================================================
## Relations

##-- Generic relations
\@vals  = $gn->relation($rel, $arg);
\&CODE  = relationWrapper($relation);

##-- Specific relations
\@lexids = $gn->orth2lex($lemma);
\@lemmas = $gn->lex2orth($lexid);
\@synids = $gn->lex2syn($lexid);
\@lexids = $gn->syn2lex($synid);
\@subids = $gn->hypernyms($synid); # a.k.a. $gn->hyperonyms($synid)
\@supids = $gn->hyponyms($synid);

##-- Convenience wrappers
\@synsets = $gn->get_synsets($lemma);
\@terms   = $gn->synset_terms($synset);

##========================================================================
## I/O

##-- generic input (guess input format)
$gn = $CLASS_OR_OBJECT->load($filename_or_xmldirname);

##-- I/O: GermaNet XML directory (input only)
$gn = $gn->loadXmlDir($directoryx);
$gn = $gn->loadXml(@xml_filenames_or_handles);

##-- I/O: raw text
$gn   = $gn->loadText($filename_or_fh);
$bool = $gn->saveText($filename_or_fh);

##-- I/O: Berkeley DB
$gn   = $gn->loadDB($dbfile);
$bool = $gn->saveDB($dbfilename);

##-- I/O: CDB
$gn   = $gn->loadCDB($dbfile);
$bool = $gn->saveCDB($dbfilename);

##-- I/O: Storable
$gn   = $gn->loadBin($filename_or_fh);
$bool = $gn->saveBin($filename_or_fh);

##========================================================================
## Low-Level Utilities

\@array_uniq = GermaNet::Flat::auniq(\@array);
@uniq        = GermaNet::Flat::luniq(@list);
$gn          = $gn->sanitize();

DESCRIPTION

Basics

new

Create and return a new (empty) GermaNet::Flat object. The returned object $gn is a blessed HASH-ref containing at least a rel key to store the underlying relation data as a non-deterministic finite partial function:

$gn->{rel} = { "${relation}:${arg}"=>join(' ',@vals), ... };

clear

Clears all data from the object.

Relations

Generic Relations

relation

\@vals  = $gn->relation($rel, $arg);
\@vals  = $gn->relation($rel, \@args);

Returns the stored value(s) for relation $rel and argument(s) $arg rsp. @args as an ARRAY-ref. Returned value(s) are not necessarily unique.

relationWrapper

\&CODE  = relationWrapper($relation);

Returns a CODE-ref for accessing the unique stored value(s) for relation $relation; basically just a wrapper for "relation".

Specific relations

dbversion

$ver = $gn->dbversion();

Returns the current database version, which is internally represented as the first value of the pseudo-relation dbversion.

orth2lex

\@lexids = $gn->orth2lex($lemma);

Returns lexical ID(s) for the lemma (string) $lemma.

lex2orth

\@lemmas = $gn->lex2orth($lexid);

Returns orthographic form(s) for the lexical ID $lexid.

lex2syn

\@synids = $gn->lex2syn($lexid);

Returns synset ID(s) for the lexical ID $lexid.

syn2lex

\@lexids = $gn->syn2lex($synid);

Returns lexical ID(s) for the synset ID $synid.

hypernyms

\@subids = $gn->hypernyms($synid);
\@subids = $gn->hyperonyms($synid);

Returns hyperonym synset IDs (subclasses) for the synset $synid.

hyponyms

\@supids = $gn->hyponyms($synid);

Returns hyponym sysnset IDs (superclasses) for the synset $synid.

Convenience wrappers

get_synsets

\@synsets = $gn->get_synsets($lemma);

Returns all synset-IDs for the lemma $lemma; wraps "orth2lex" and "lex2syn". Uniqueness is not guaranteed.

synset_terms

\@terms = $gn->synset_terms($synset);

Returns all lemma(ta) for the synset ID $synset; wraps "syn2lex" and "lex2orth". Uniqueness is not guaranteed.

I/O

Generic input

load

$gn = $CLASS_OR_OBJECT->load($filename_or_xmldirname);

Load GermaNet relation data from $filename_or_xmldirname, which should be some supported GermaNet::Flat database format:

GermaNet XML directory

If $filename_or_xmldirname is a directory, it is assumed to contain GermaNet-format XML which will be loaded by the "loadXmlDir, loadXml" method.

Storable file

If $filename_or_xmldirname carries the extension .bin or .sto, it will be loaded as a perl Storable HASH-ref using the "loadBin, saveBin" method.

Berkeley DB

If $filename_or_xmldirname carries the extension .db or .bdb, it will be tie()d as a Berkeley DB file using the "loadDB, saveDB" method.

CDB

If $filename_or_xmldirname carries the extension .cdb, it will be tie()d as a CDB file using the "loadCDB, saveCDB" method.

Raw Text

Otherwise, $filename_or_xmldirname is expected to contain raw text relation data to be loaded using the "loadText, saveText" method.

GermaNet XML

loadXmlDir, loadXml

$gn = CLASS_OR_OBJECT->loadXmlDir($directoryx);
$gn = CLASS_OR_OBJECT->loadXml(@xml_filenames_or_handles);

Loads relation data from a directory (first form) or files (second form) assumed to be in GermaNet XML format.

loadBin, saveBin

$gn   = $gn->loadBin($filename_or_fh);
$bool = $gn->saveBin($filename_or_fh);

Loads/saves relation data from/to a serialized Storable HASH-ref file or filehandle.

loadDB, saveDB

$gn = $gn->loadDB($dbfile);
$gn = $gn->saveDB($dbfilename);

tie()s relation data to/from the Berkeley-DB file $dbfile.

loadCDB, saveCDB

$gn   = $gn->loadCDB($dbfile);
$bool = $gn->saveCDB($dbfilename);

tie()s relation data to/from the CDB file $dbfile. UTF-8 support is wonky with CDB files.

loadText, saveText

$gn   = $gn->loadText($filename_or_fh);
$bool = $gn->saveText($filename_or_fh);

Loads/saves relation data from/to a plain text file $filename_or_fh. Each line of $filename_or_fh corresponds to a single relation entry in %{$gn->{rel}} of the form $KEY\t$VALUES, where $KEY is the item key of the form ${RELATION}:${ARG1} and $VALUES is a space-separated list of value(s) associated with $ARG1 by $RELATION.

Low-Level Utilities

auniq

\@array_uniq = GermaNet::Flat::auniq(\@array);

Returns unique values from an ARRAY-ref.

uniq

@uniq = GermaNet::Flat::luniq(@list);

Returns unique values for an array or list.

sanitize

$gn = $gn->sanitize();

Low-level compilation utility for trimming duplicates and extraneous whitespace from relation data values.

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2013-2019 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.24.1 or, at your option, any later version of Perl 5 you may have available.

SEE ALSO

http://www.sfs.uni-tuebingen.de/GermaNet/, https://code.google.com/p/perlapi4germanet, perl(1), ...