NAME
Word2vec::Lesk - Word2vec-Interface Utility Module.
SYNOPSIS
use Word2vec::Lesk;
my $lesk = Word2vec::Lesk->new();
my $string_a = "This is a test string";
my $string_b = "This is another test string";
my $lesk_score = $lesk->CalculateLeskScore( $string_a, $string_b );
my $cosine_score = $lesk->CalculateCosineScore( $string_a, $string_b );
my $f_score = $lesk->CalculateFScore( $string_a, $string_b );
print( "Lesk Score: $lesk_score\n" );
print( "Cosine Score: $cosine_score\n" );
print( "F Score: $f_score\n" );
undef( $lesk );
or
my $lesk = Word2vec::Lesk->new();
my $string_a = "This is a test string";
my $string_b = "This is another test string";
my %results = %{ $lesk->CalculateAllScores( $string_a, $string_b ) };
for my $key ( sort keys %results )
{
print "$key: $results{ $key }\n";
}
undef( %results );
undef( $lesk );
DESCRIPTION
Word2vec::Lesk is a module of Lesk functions for the Word2vec::Interface package. Lesk, Raw Lesk, Cosine, F, Recall and Precision scores are calculated from the phrase/feature overlap between two strings and returned to the user.
Main Functions
new
Description:
Returns a new "Word2vec::Lesk" module object.
Note: Specifying no parameters implies default options.
Default Parameters:
debugLog = 0
writeLog = 0
Input:
$debugLog -> Instructs module to print debug statements to the console. (1 = True / 0 = False)
$writeLog -> Instructs module to print debug statements to a log file. (1 = True / 0 = False)
Output:
Word2vec::Lesk object.
Example:
use Word2vec::Lesk;
my $lesk = Word2vec::Lesk->new();
undef( $lesk );
DESTROY
Description:
Removes Word2vec::Lesk object from memory.
Input:
None
Output:
None
Example:
See above example for "new" function.
Note: The DESTROY function is also called automatically during global destruction when the program exits.
GetMatchingFeatures
Description:
Given two strings, this returns a hash of all overlapping (matching) features between both strings and their frequency counts.
Input:
$string_a -> First comparison string
$string_b -> Second comparison string
Output:
$hash_ref -> Hash table reference whose keys are the unique matching features between the two input strings and whose values are the frequency counts of each feature.
Example:
use Word2vec::Lesk;
my $lesk = Word2vec::Lesk->new();
my %matching_features = %{ $lesk->GetMatchingFeatures( "I like to eat cookies", "Sometimes I like to eat cookies" ) };
for my $feature ( sort keys %matching_features )
{
print "$feature : $matching_features{ $feature }\n";
}
undef( %matching_features );
undef( $lesk );
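The idea of a matching-feature hash can be illustrated without the module. The following is a minimal sketch, not the module's implementation: the helper name matching_features is hypothetical, features are assumed to be lowercased whitespace-delimited words, and the frequency of a shared word is assumed to be the smaller of its counts in the two strings.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Word2vec::Lesk: counts the words shared by two strings.
sub matching_features {
    my ( $string_a, $string_b ) = @_;
    my ( %count_a, %count_b, %matches );

    # Assumption: a feature is a lowercased, whitespace-delimited word.
    $count_a{ $_ }++ for split ' ', lc $string_a;
    $count_b{ $_ }++ for split ' ', lc $string_b;

    for my $word ( keys %count_a ) {
        next unless exists $count_b{ $word };
        # Assumption: a shared word counts as often as it occurs in both strings.
        my ( $in_a, $in_b ) = ( $count_a{ $word }, $count_b{ $word } );
        $matches{ $word } = $in_a < $in_b ? $in_a : $in_b;
    }
    return \%matches;
}

my $matches = matching_features( "I like to eat cookies", "Sometimes I like to eat cookies" );
print "$_ : $matches->{ $_ }\n" for sort keys %$matches;
```

The actual feature definition and counting rule used by GetMatchingFeatures may differ from these assumptions.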
GetPhraseOverlap
Description:
Given two strings, this returns a hash of all overlapping (matching) phrases between both strings and their frequency counts. Longer phrases take priority over shorter ones when matching.
Input:
$string_a -> First comparison string
$string_b -> Second comparison string
Output:
$hash_ref -> Hash table reference whose keys are the unique matching phrases between the two input strings and whose values are the frequency counts of each phrase.
Example:
use Word2vec::Lesk;
my $lesk = Word2vec::Lesk->new();
my %phrase_overlaps = %{ $lesk->GetPhraseOverlap( "I like to eat cookies", "Sometimes I like to eat cookies" ) };
for my $phrase ( sort keys %phrase_overlaps )
{
print "$phrase : $phrase_overlaps{ $phrase }\n";
}
undef( %phrase_overlaps );
undef( $lesk );
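Giving longer phrases priority can be sketched as a greedy search: find the longest word sequence common to both strings, record it, mark its words as used, and repeat until nothing matches. The sketch below assumes this greedy strategy and lowercased whitespace tokenization. The helper name phrase_overlaps is hypothetical and the module's own matching may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Word2vec::Lesk: greedy longest-phrase-first overlap.
sub phrase_overlaps {
    my ( $string_a, $string_b ) = @_;
    my @words_a = split ' ', lc $string_a;
    my @words_b = split ' ', lc $string_b;
    my %overlaps;

    while ( 1 ) {
        my ( $best_len, $best_i, $best_j ) = ( 0, 0, 0 );

        # Try every pair of start positions and measure the common run length.
        for my $i ( 0 .. $#words_a ) {
            for my $j ( 0 .. $#words_b ) {
                my $len = 0;
                while (   $i + $len <= $#words_a
                       && $j + $len <= $#words_b
                       && defined( $words_a[ $i + $len ] )
                       && defined( $words_b[ $j + $len ] )
                       && $words_a[ $i + $len ] eq $words_b[ $j + $len ] ) {
                    $len++;
                }
                ( $best_len, $best_i, $best_j ) = ( $len, $i, $j ) if $len > $best_len;
            }
        }
        last if $best_len == 0;

        my $phrase = join ' ', @words_a[ $best_i .. $best_i + $best_len - 1 ];
        $overlaps{ $phrase }++;

        # Mark the matched words as consumed so they cannot match again.
        $words_a[ $_ ] = undef for $best_i .. $best_i + $best_len - 1;
        $words_b[ $_ ] = undef for $best_j .. $best_j + $best_len - 1;
    }
    return \%overlaps;
}

my $overlaps = phrase_overlaps( "I like to eat cookies", "Sometimes I like to eat cookies" );
print "$_ : $overlaps->{ $_ }\n" for sort keys %$overlaps;
```

With these inputs the whole first string is found as a single five-word phrase, so no shorter phrases are reported separately.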
CalculateLeskScore
Description:
Given two strings, this returns a Lesk score based on overlapping (matching) features between both strings.
Input:
$string_a -> First comparison string
$string_b -> Second comparison string
Output:
$score -> Lesk Score (Float)
Example:
use Word2vec::Lesk;
my $lesk = Word2vec::Lesk->new();
my $lesk_score = $lesk->CalculateLeskScore( "I like to eat cookies", "Sometimes I like to eat cookies" );
print "Lesk Score: $lesk_score\n";
undef( $lesk );
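This documentation does not state the scoring formula. As an illustration only, the sketch below assumes the adapted Lesk convention in which each overlapping phrase contributes the square of its word count (the raw score), and the raw score is then divided by the product of the two string lengths. The helper name lesk_from_overlaps and the normalization are assumptions, not the module's confirmed behavior.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Word2vec::Lesk.
# $overlaps maps phrase => frequency; the lengths are word counts of each string.
sub lesk_from_overlaps {
    my ( $overlaps, $length_a, $length_b ) = @_;

    # Assumed raw score: sum of (phrase word count squared) times frequency.
    my $raw = 0;
    for my $phrase ( keys %$overlaps ) {
        my @words = split ' ', $phrase;
        $raw += scalar( @words ) ** 2 * $overlaps->{ $phrase };
    }

    # Assumed normalization: divide by the product of the string lengths.
    my $normalized = ( $length_a && $length_b ) ? $raw / ( $length_a * $length_b ) : 0;
    return ( $raw, $normalized );
}

# One five-word phrase shared by a 5-word string and a 6-word string.
my ( $raw_lesk, $lesk_score ) = lesk_from_overlaps( { "i like to eat cookies" => 1 }, 5, 6 );
printf "Raw Lesk: %d, Lesk: %.4f\n", $raw_lesk, $lesk_score;
```

Squaring the phrase length rewards one long shared phrase more than several short ones with the same total number of words.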
CalculateCosineScore
Description:
Given two strings, this returns a cosine score based on overlapping (matching) features between both strings.
Input:
$string_a -> First comparison string
$string_b -> Second comparison string
Output:
$score -> Cosine Score (Float)
Example:
use Word2vec::Lesk;
my $lesk = Word2vec::Lesk->new();
my $cosine_score = $lesk->CalculateCosineScore( "I like to eat cookies", "Sometimes I like to eat cookies" );
print "Cosine Score: $cosine_score\n";
undef( $lesk );
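For orientation, a cosine score over word features can be sketched as a standard bag-of-words cosine similarity. This is a generic technique shown under the assumption of lowercased whitespace tokenization. The helper name bag_of_words_cosine is hypothetical, and the exact formula used by CalculateCosineScore is not stated in this documentation.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Word2vec::Lesk: bag-of-words cosine similarity.
sub bag_of_words_cosine {
    my ( $string_a, $string_b ) = @_;
    my ( %count_a, %count_b );
    $count_a{ $_ }++ for split ' ', lc $string_a;
    $count_b{ $_ }++ for split ' ', lc $string_b;

    # Dot product over the words present in both strings.
    my $dot = 0;
    for my $word ( keys %count_a ) {
        $dot += $count_a{ $word } * $count_b{ $word } if exists $count_b{ $word };
    }

    # Squared Euclidean norms of the two word-count vectors.
    my ( $norm_a, $norm_b ) = ( 0, 0 );
    $norm_a += $_ ** 2 for values %count_a;
    $norm_b += $_ ** 2 for values %count_b;

    return 0 if $norm_a == 0 || $norm_b == 0;
    return $dot / sqrt( $norm_a * $norm_b );
}

my $cosine = bag_of_words_cosine( "I like to eat cookies", "Sometimes I like to eat cookies" );
printf "Cosine Score: %.4f\n", $cosine;
```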
CalculateFScore
Description:
Given two strings, this returns an F score based on overlapping (matching) features between both strings.
Input:
$string_a -> First comparison string
$string_b -> Second comparison string
Output:
$score -> F Score (Float)
Example:
use Word2vec::Lesk;
my $lesk = Word2vec::Lesk->new();
my $f_score = $lesk->CalculateFScore( "I like to eat cookies", "Sometimes I like to eat cookies" );
print "F Score: $f_score\n";
undef( $lesk );
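An F score is conventionally the harmonic mean of precision and recall. The sketch below assumes that precision is the matching feature count divided by the length of the first string and recall is the same count divided by the length of the second string. The helper name f_score and these definitions are assumptions. Because the harmonic mean is symmetric, swapping the two definitions does not change the F score.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Word2vec::Lesk.
sub f_score {
    my ( $match_count, $length_a, $length_b ) = @_;
    return 0 if $match_count == 0 || $length_a == 0 || $length_b == 0;

    # Assumed definitions of precision and recall (see the note above).
    my $precision = $match_count / $length_a;
    my $recall    = $match_count / $length_b;

    # Harmonic mean of precision and recall.
    return 2 * $precision * $recall / ( $precision + $recall );
}

# Five matching words between a 5-word string and a 6-word string.
my $f = f_score( 5, 5, 6 );
printf "F Score: %.4f\n", $f;
```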
CalculateAllScores
Description:
Given two strings, this returns a hash reference containing the scores (F, Cosine, Lesk, Raw Lesk, Precision, Recall) and the frequency counts (features, phrases, string lengths).
Input:
$string_a -> First comparison string
$string_b -> Second comparison string
Output:
$result_hash -> Hash reference containing: Lesk, Raw Lesk, F, Precision, Recall, Cosine, Matching Feature Frequency, Matching Phrase Frequency, String A Length and String B Length.
Example:
use Word2vec::Lesk;
my $lesk = Word2vec::Lesk->new();
my %scores = %{ $lesk->CalculateAllScores( "I like to eat cookies", "Sometimes I like to eat cookies" ) };
for my $score_name ( sort keys %scores )
{
print "$score_name : $scores{ $score_name }\n";
}
undef( $lesk );
Accessor Functions
GetDebugLog
Description:
Returns the _debugLog member variable, which is set when the Word2vec::Lesk object is created by the new function.
Input:
None
Output:
$value -> '0' = False, '1' = True
Example:
use Word2vec::Lesk;
my $lesk = Word2vec::Lesk->new();
my $debugLog = $lesk->GetDebugLog();
print( "Debug Logging Enabled\n" ) if $debugLog == 1;
print( "Debug Logging Disabled\n" ) if $debugLog == 0;
undef( $lesk );
GetWriteLog
Description:
Returns the _writeLog member variable, which is set when the Word2vec::Lesk object is created by the new function.
Input:
None
Output:
$value -> '0' = False, '1' = True
Example:
use Word2vec::Lesk;
my $lesk = Word2vec::Lesk->new();
my $writeLog = $lesk->GetWriteLog();
print( "Write Logging Enabled\n" ) if $writeLog == 1;
print( "Write Logging Disabled\n" ) if $writeLog == 0;
undef( $lesk );
Debug Functions
WriteLog
Description:
Prints passed string parameter to the console, log file or both depending on user options.
Note: The printNewLine parameter controls the trailing new line character. A new line character is printed
after the string when the parameter is undefined, and is omitted when the parameter is 0.
Input:
$string -> String to print to the console/log file.
$value -> 0 = Do not print a new line character after the string. Any other value, including 'undef', prints a new line character.
Output:
None
Example:
use Word2vec::Lesk;
my $lesk = Word2vec::Lesk->new();
$lesk->WriteLog( "Hello World" );
undef( $lesk );
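The new line rule described above can be sketched on its own. The helper below is hypothetical and only models the documented rule (0 suppresses the new line, anything else, including undef, appends one). It does not reproduce the module's console or log file handling.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Word2vec::Lesk: applies the documented new line rule.
sub format_log_line {
    my ( $string, $print_new_line ) = @_;

    # Only a defined value of 0 suppresses the trailing new line character.
    my $suppress = defined( $print_new_line ) && $print_new_line eq '0';
    return $suppress ? $string : $string . "\n";
}

print format_log_line( "Hello World" );      # new line appended (parameter undefined)
print format_log_line( "Hello ", 0 );        # no new line appended
print format_log_line( "World", 1 );         # new line appended
```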
AUTHOR
Clint Cuffy, Virginia Commonwealth University
COPYRIGHT
Copyright (c) 2016
Bridget T McInnes, Virginia Commonwealth University
btmcinnes at vcu dot edu
Clint Cuffy, Virginia Commonwealth University
cuffyca at vcu dot edu
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to:
The Free Software Foundation, Inc.,
59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA.