NAME
WebService::GoogleHack::Rate - Implements simple relatedness measures and semantic orientation functions.
SYNOPSIS
use WebService::GoogleHack::Rate;
#GIVE PATH TO INPUT FILE HERE
my $INPUTFILE="";
#GIVE PATH TO TRACE FILE HERE
my $TRACEFILE="";
#create an object of type Rate
my $rate = WebService::GoogleHack::Rate->new();
my $results=$rate->measureSemanticRelatedness1("dog", "cat");
#The PMI measure is stored in the variable $results, and it can also
#be accessed as $rate->{'PMI'};
$results=$rate->predictSemanticOrientation($INPUTFILE, "excellent", "bad",$TRACEFILE);
#The results can be accessed through the returned reference
print $results->{'prediction'}."\n";
print $results->{'PMI Measure'}."\n";
#or through the object itself
print $rate->{'prediction'}."\n";
print $rate->{'PMI Measure'}."\n";
DESCRIPTION
This package uses Google to do some basic natural language processing. For example, given two words, say "knife" and "cut", the module can retrieve a semantic relatedness measure, commonly known as the PMI (pointwise mutual information) measure. The larger the measure, the more related the words are. The package can also predict the semantic orientation of a given paragraph of English text. A positive measure means that the paragraph has a positive meaning, and a negative measure means the opposite.
PACKAGE METHODS
__PACKAGE__->new()
Purpose: This function creates an object of type Rate and returns a blessed reference.
__PACKAGE__->init(Params Given Below)
Purpose: This function can be used to initialize the member variables.
Valid arguments are :
key
string. Key to the Google API
File_location
string. The WSDL file name
__PACKAGE__->measureSemanticRelatedness1(searchString1,searchString2)
Purpose: This function is used to measure the relatedness between two words.
Formula used: log(hits(w1)) + log(hits(w2)) - log(hits(w1w2))
Valid arguments are :
searchString1
string. The search string which can be a phrase or word
searchString2
string. The search string which can be a phrase or word
Returns: The object containing the relatedness measure.
__PACKAGE__->measureSemanticRelatedness2(searchString1,searchString2)
Purpose: This function is used to measure the relatedness between two words.
Formula used: log(hits(w1w2) / (hits(w1) + hits(w2)))
Valid arguments are :
searchString1
string. The search string which can be a phrase or word
searchString2
string. The search string which can be a phrase or word
Returns: The object containing the relatedness measure.
__PACKAGE__->measureSemanticRelatedness3(searchString1,searchString2)
Purpose: This function is used to measure the relatedness between two words.
Formula used: log( hits(w1w2) / (hits(w1) * hits(w2)))
Valid arguments are :
searchString1
string. The search string which can be a phrase or word
searchString2
string. The search string which can be a phrase or word
Returns: The object containing the relatedness measure.
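To make the three formulas easier to compare, here is a minimal sketch that evaluates each one on the same hit counts. It does not call the module or Google. The helper names (relatedness1, relatedness2, relatedness3) and the counts are hypothetical, and the use of natural logarithms is an assumption, since the base of the logarithm is not stated here.

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of WebService::GoogleHack::Rate.
# Each takes raw hit counts for w1, w2, and the combined query "w1 w2".
# Natural logarithms are assumed; the documentation does not give the base.

# Formula 1: log(hits(w1)) + log(hits(w2)) - log(hits(w1w2))
sub relatedness1 {
    my ($h1, $h2, $h12) = @_;
    return log($h1) + log($h2) - log($h12);
}

# Formula 2: log(hits(w1w2) / (hits(w1) + hits(w2)))
sub relatedness2 {
    my ($h1, $h2, $h12) = @_;
    return log($h12 / ($h1 + $h2));
}

# Formula 3: log(hits(w1w2) / (hits(w1) * hits(w2)))
sub relatedness3 {
    my ($h1, $h2, $h12) = @_;
    return log($h12 / ($h1 * $h2));
}

# Made-up counts, for illustration only.
my ($dog, $cat, $both) = (1000, 2000, 500);
printf "measure 1: %.4f\n", relatedness1($dog, $cat, $both);
printf "measure 2: %.4f\n", relatedness2($dog, $cat, $both);
printf "measure 3: %.4f\n", relatedness3($dog, $cat, $both);
```

Taken exactly as written, formula 1 is the algebraic negation of formula 3, so the two rank word pairs in opposite order. It may be worth checking which sign convention the installed version follows. A hit count of zero would make log() die, so real counts need a guard that this sketch omits.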
__PACKAGE__->predictSemanticOrientation(infile,posInf, negInf,trace)
Purpose: This function tries to predict the semantic orientation of a paragraph of text.
Valid arguments are :
infile
string. The location of the review file
posInf
string. A positive inference word such as "excellent"
negInf
string. A negative inference word such as "poor"
trace
string. The location of the trace file. If a file name is given, the results are stored in this file
Returns: The PMI measure and the prediction, which is 0 or 1.
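One way a measure and a 0/1 prediction could be derived from per-phrase scores is sketched below. This is not the module's implementation. It assumes, following the approach in Turney's paper, that per-phrase orientation scores are averaged and that a positive average maps to a prediction of 1. The helper name predict_orientation and the input scores are hypothetical; the hash keys mirror those shown in the SYNOPSIS.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper, not part of WebService::GoogleHack::Rate.
# Averages per-phrase orientation scores and thresholds the average at zero.
# Treating 1 as "positive" and 0 as "negative" is an assumption.
sub predict_orientation {
    my (@scores) = @_;

    # With no scores there is nothing to average; report a neutral result.
    return { 'PMI Measure' => 0, 'prediction' => 0 } unless @scores;

    my $average = sum(@scores) / @scores;
    return {
        'PMI Measure' => $average,
        'prediction'  => $average > 0 ? 1 : 0,
    };
}

# Made-up scores, for illustration only.
my $results = predict_orientation(1.2, -0.3, 0.6);
print "PMI Measure: $results->{'PMI Measure'}\n";
print "prediction:  $results->{'prediction'}\n";
```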
__PACKAGE__->predictWordSentiment(infile,posInf,negInf,html,trace)
Purpose: Given a file containing text, this function tries to find the positive and negative words. The formula used to calculate the sentiment of a word is based on the PMI-IR formula given in Peter Turney's paper.
Formula used: log2((hits(word AND "excellent") * hits("poor")) / (hits(word AND "poor") * hits("excellent")))
For more information, refer to the paper "Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews" by Peter Turney.
Valid arguments are :
infile
string. The input file
posInf
string. A positive word such as "Excellent"
negInf
string. A negative word such as "Bad"
html
string. Set to "true" if you want the results to be HTML formatted
trace
string. Set to a file name if you want the results to be written to that file
Returns: An HTML or text version of the results.
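The log2 ratio above can be evaluated directly from four hit counts, as in this minimal sketch. It does not query Google. The helper name semantic_orientation and the counts are hypothetical, and the sketch omits any handling of zero counts, which would need smoothing or a guard in practice.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WebService::GoogleHack::Rate.
# Arguments, in order:
#   hits(word AND positive), hits(word AND negative),
#   hits(positive), hits(negative)
sub semantic_orientation {
    my ($with_pos, $with_neg, $pos, $neg) = @_;

    # Perl's log() is the natural logarithm, so divide by log(2) for base 2.
    return log(($with_pos * $neg) / ($with_neg * $pos)) / log(2);
}

# Made-up counts, for illustration only: the word co-occurs with the
# positive term eight times as often as with the negative term.
my $so = semantic_orientation(800, 100, 5000, 5000);
printf "semantic orientation: %.2f\n", $so;
```

A score above zero suggests the word leans toward the positive term, and a score below zero suggests it leans toward the negative term.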
__PACKAGE__->predictPhraseSentiment(infile,posInf,negInf,html,trace)
Purpose: Given a file containing text, this function tries to find the positive and negative phrases. The formula used to calculate the sentiment of a phrase is based on the PMI-IR formula given in Peter Turney's paper.
Formula used: log2((hits(phrase AND "excellent") * hits("poor")) / (hits(phrase AND "poor") * hits("excellent")))
For more information, refer to the paper "Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews" by Peter Turney.
Valid arguments are :
infile
string. The input file
posInf
string. A positive word such as "Excellent"
negInf
string. A negative word such as "Bad"
html
string. Set to "true" if you want the results to be HTML formatted
trace
string. Set to a file name if you want the results to be written to that file
Returns: An HTML or text version of the results.
AUTHOR
Pratheepan Raveendranathan, <rave0029@d.umn.edu>
Ted Pedersen, <tpederse@d.umn.edu>
BUGS
SEE ALSO
WebService::GoogleHack home page - http://google-hack.sourceforge.net
Pratheepan Raveendranathan - http://www.d.umn.edu/~rave0029/research
Ted Pedersen - www.d.umn.edu./~tpederse
Google-Hack Mailing List <google-hack-users@lists.sourceforge.net>
COPYRIGHT AND LICENSE
Copyright (c) 2005 by Pratheepan Raveendranathan, Ted Pedersen
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to
The Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.