NAME
Lingua::LinkParser - Perl module implementing the Link Grammar Parser by Sleator, Temperley and Lafferty at CMU.
SYNOPSIS
use Lingua::LinkParser;
my $parser = new Lingua::LinkParser;
my $sentence = $parser->create_sentence("This is the turning point.");
my @linkages = $sentence->linkages;
foreach my $linkage (@linkages) {
    print $parser->get_diagram($linkage);
}
DESCRIPTION
To quote the Link Grammar documentation, "the Link Grammar Parser is a syntactic parser of English, based on link grammar, an original theory of English syntax. Given a sentence, the system assigns to it a syntactic structure, which consists of a set of labeled links connecting pairs of words."
This module provides access to the parser API through Perl objects, making linkages easy to analyze. The module organizes the data returned from the parser API into an object hierarchy consisting of, in order, sentence, linkage, sublinkage, and link. If this is unclear, see the examples in the 'eg/' directory for a quick start with these objects.
The objects within this module should not be confused with the types familiar to users of the Link Parser API. The objects used in this module reorganize the API data in a way more usable and friendly to Perl users, and do not exactly represent the types used in the API.
This documentation should be read alongside the extensive texts included with the Link Parser and on the Link Parser web site.
- $parser = new Lingua::LinkParser(DICT_PATH,KNOWLEDGE_PATH)
-
This returns a new Lingua::LinkParser object, loads the specified dictionary files, and sets basic configuration. If no dictionary files are specified, the parser will attempt to open the default files specified in the header files.
- $parser->opts(OPTION_NAME,OPTION_VALUE)
-
This sets the parser option OPTION_NAME to the value specified by OPTION_VALUE. A full list of these options is found at the end of this document, as well as in the Link Parser distribution documentation.
- $sentence = $parser->create_sentence(TEXT)
-
Creates and returns a sentence object (Lingua::LinkParser::Sentence) from the supplied text. This object is used in the subsequent creation and analysis of linkages.
- $sentence->num_linkages
-
Returns the number of linkages found for $sentence.
- $linkage = $sentence->linkage(NUM)
-
Returns a linkage object (Lingua::LinkParser::Linkage) for linkage NUM of sentence $sentence.
- @linkages = $sentence->linkages
-
Returns a list of linkage objects for all linkages of $sentence.
- $linkage->num_sublinkages
-
Returns the number of sublinkages for linkage $linkage.
- $sublinkage = $linkage->sublinkage(NUM)
-
Returns a sublinkage object (Lingua::LinkParser::Linkage::Sublinkage) for sublinkage NUM of linkage $linkage.
- @sublinkages = $linkage->sublinkages
-
Returns a list of sublinkage objects for all sublinkages of $linkage.
- $sublinkage->num_links
-
Returns the number of links for sublinkage $sublinkage.
- $link = $sublinkage->link(NUM)
-
Returns a link object (Lingua::LinkParser::Link) for link NUM of sublinkage $sublinkage.
- @links = $sublinkage->links
-
Returns a list of link objects for all links of $sublinkage.
- $link->length
-
Returns the number of words spanned by $link.
- $link->label
-
Returns the "intersection" label for $link.
- $link->llabel
-
Returns the left label for $link.
- $link->rlabel
-
Returns the right label for $link.
- $link->lword
-
Returns the number of the left word for $link.
- $link->rword
-
Returns the number of the right word for $link.
- $parser->get_diagram($linkage)
-
Returns an ASCII pretty-printed diagram of the specified linkage or sublinkage.
- $parser->get_postscript($linkage)
-
Returns Postscript code for a diagram of the specified linkage or sublinkage.
- $parser->get_domains($linkage)
-
Returns formatted ASCII text showing the links and domains for the specified linkage or sublinkage.
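The methods above can be combined to walk the whole hierarchy from sentence down to link. The following is a minimal sketch that uses only the accessors documented in this section; the helper name link_report and the running counter $n are illustrative and not part of the module. The counter is simply the position in the list returned by linkages, which may not match the parser's own linkage numbering. The live parse is attempted only if the module and its dictionaries can be loaded.

```perl
use strict;
use warnings;

# Walk sentence -> linkage -> sublinkage -> link and return one line of
# text per link. link_report is a hypothetical helper, not a module method.
sub link_report {
    my ($sentence) = @_;
    my @lines;
    my $n = 0;    # position in the returned list, not the parser's numbering
    for my $linkage ($sentence->linkages) {
        $n++;
        for my $sublinkage ($linkage->sublinkages) {
            for my $link ($sublinkage->links) {
                push @lines, sprintf "linkage %d: %s (word %d -> word %d, length %d)",
                    $n, $link->label, $link->lword, $link->rword, $link->length;
            }
        }
    }
    return @lines;
}

# Try a live parse; fall back to a message if the module or its
# dictionaries are unavailable.
my $parser = eval { require Lingua::LinkParser; Lingua::LinkParser->new };
if ($parser) {
    my $sentence = $parser->create_sentence("This is the turning point.");
    print "$_\n" for link_report($sentence);
}
else {
    print "Lingua::LinkParser (or its dictionaries) unavailable; skipping live parse.\n";
}
```

Using the list accessors (linkages, sublinkages, links) rather than the indexed ones avoids having to know the index base expected by linkage(NUM) and its siblings.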
OTHER FUNCTIONS
A few high-level functions have also been provided.
- @bigstruct = $sentence->get_bigstruct
-
Returns a potentially large data structure merging all linkages, sublinkages, and links for $sentence. This structure is an array of hashes, with a single array entry for each word in the sentence. This function is only useful for high-level analysis of sentence grammar; most applications should be served by the functions described above.
This array has the following structure:
    @bigstruct = (
        {
            'word'  => 'WORD',
            'links' => {
                'LINKTYPE_LINKAGENUM' => 'TARGETWORDNUM',
                ...
            },
        },
        ...
    );
Where LINKAGENUM is the number of the linkage for $sentence, and LINKTYPE is the link type label. TARGETWORDNUM is the number of the word to which each link connects.
get_bigstruct() can be useful in finding, for example, all links for a given word in a given sentence:
    $sentence = $parser->create_sentence(
        "Architecture is present in nearly every civilized society.");
    @bigstruct = $sentence->get_bigstruct;
    while (($k,$v) = each %{$bigstruct[6]->{links}} ) {
        print "$k => ", $bigstruct[$v]->{word}, "\n";
    }
This would output:
    A => civilized.a
    Jp => in
    Dsu => every.d
This signifies that the word "society" has a link of type A (pre-noun adjective) with "civilized" (tagged 'a' for adjective), a link of type Jp (preposition to object) with "in", and a link of type Dsu (noun determiner, singular-mass) with "every", which is tagged 'd' for determiner.
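Because each key in a word's 'links' hash combines the link type and the linkage number, it can be handy to split the two apart. The sketch below does this on hand-written sample data shaped like the structure described above; the words, link types, and numbers in it are illustrative assumptions rather than real parser output, and links_for_word is a hypothetical helper, not a module function.

```perl
use strict;
use warnings;

# Hypothetical sample in the documented shape (not real parser output).
my @bigstruct = (
    { word => 'the',   links => { 'D_1' => 1 } },
    { word => 'cat',   links => { 'D_1' => 0, 'S_1' => 2, 'S_2' => 2 } },
    { word => 'slept', links => { 'S_1' => 1, 'S_2' => 1 } },
);

# Return one record per link of the word at $index: the link type, the
# linkage number, and the text of the target word.
sub links_for_word {
    my ($struct, $index) = @_;
    my $links = $struct->[$index]{links} || {};
    my @records;
    for my $key (sort keys %$links) {
        # Keys look like LINKTYPE_LINKAGENUM; split on the final underscore
        # and skip any key that does not fit that pattern.
        my ($type, $linkage_num) = $key =~ /^(.*)_(\d+)$/ or next;
        push @records, {
            type    => $type,
            linkage => $linkage_num,
            target  => $struct->[ $links->{$key} ]{word},
        };
    }
    return @records;
}

for my $r (links_for_word(\@bigstruct, 1)) {
    print "$r->{type} (linkage $r->{linkage}) => $r->{target}\n";
}
```

Sorting the keys only makes the output order stable; each() and keys() return hash entries in no particular order.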
LINK PARSER OPTIONS
The following options may be set or retrieved on a Lingua::LinkParser object with the function:
$parser->opts(OPTION, [VALUE])
Supplying no VALUE returns the current value for OPTION. Note that not all of these options are used by the parser API itself; some are intended for use by the calling program.
verbosity
The level of detail reported during processing; a value of 0 reports nothing.
linkage_limit
The maximum number of linkages to process for a sentence.
disjunct_cost
Determines the maximum disjunct cost used during parsing, where the cost of a disjunct is equal to the maximum cost of all of its connectors.
min_null_count
max_null_count
The range of null links to parse.
null_block
Sets the block count ratio for null linkages; a value of '4' causes a linkage of 1, 2, 3, or 4 null links to have a null cost of 1.
short_length
Limits the length of links (the number of words a link can span) to this value.
islands_ok
Allows 'islands' of links (links not connected to the 'wall') when set.
max_parse_time
Determines the approximate maximum time permitted for parsing.
max_memory
Determines the maximum memory allowed during parsing.
timer_expired
memory_exhausted
resources_exhausted
reset_resources
These options tell whether the timer or memory constraints have been exceeded during parsing.
cost_model_type
screen_width
Sets the screen width for pretty-print functions.
allow_null
Allow or disallow null links in linkages.
display_walls
Toggles the display of linkage "walls".
all_short_connectors
If true, then all connectors have length restrictions imposed on them.
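As a minimal sketch of the two calling forms of opts described above, the helper below sets each requested option and then reads it back. The name apply_options and the option values chosen are illustrative assumptions, not recommendations; in particular, passing 1 to display_walls assumes that the toggle accepts a true value. The live run is attempted only if the module and its dictionaries can be loaded.

```perl
use strict;
use warnings;

# Set each requested option, then read it back with the one-argument form.
# Returns a hash of option name => value reported by the parser object.
# apply_options is a hypothetical helper, not part of the module.
sub apply_options {
    my ($parser, %wanted) = @_;
    my %current;
    for my $name (sort keys %wanted) {
        $parser->opts($name, $wanted{$name});      # two arguments: set
        $current{$name} = $parser->opts($name);    # one argument: get
    }
    return %current;
}

# Try a live run; fall back to a message if the module or its
# dictionaries are unavailable.
my $parser = eval { require Lingua::LinkParser; Lingua::LinkParser->new };
if ($parser) {
    my %now = apply_options($parser, linkage_limit => 10, display_walls => 1);
    print "$_ = $now{$_}\n" for sort keys %now;
}
else {
    print "Lingua::LinkParser (or its dictionaries) unavailable; skipping live run.\n";
}
```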
BUGS/TODO
- I suspect the docs are lacking. This is a very-beta release. Please supply me with input as to the accuracy and any bugs or enhancements you may have.
- Add domain functions
AUTHOR
Daniel Brian, dbrian@clockwork.net
SEE ALSO
perl(1). http://www.link.cs.cmu.edu/link/.