NAME

sval2plain.pl - Convert a Senseval-2 data file into plain text format

SYNOPSIS

sval2plain.pl [OPTIONS] SVAL2

The sample Senseval-2 input file begin.v-test.xml contains 255 instances (contexts), as frequency.pl reports:

frequency.pl begin.v-test.xml

OUTPUT =>

<sense id="begin%2:30:00::" percent="64.31"/>
<sense id="begin%2:30:01::" percent="14.51"/>
<sense id="begin%2:42:04::" percent="21.18"/>
Total Instances = 255
Total Distinct Senses=3
Distribution={64.31,21.18,14.51}
% of Majority Sense = 64.31

After conversion, the plain text file contains 255 lines, one per context:

sval2plain.pl begin.v-test.xml > begin.v-test.txt

wc begin.v-test.txt

OUTPUT =>

255   15049   92598 begin.v-test.txt

You can find begin.v-test.xml in samples/Data.

Type sval2plain.pl --help for a quick summary of options.

DESCRIPTION

Converts a given file from Senseval-2 format into plain text format. Each line of the plain text file contains a single context. This is useful when you have Senseval-2 data that you would like to use as feature extraction (training) data, which must be in plain text format.
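
The conversion described above can be sketched in a few lines of Python. This is only an illustration, not the code of sval2plain.pl: the function name contexts_to_lines, the toy instance ids, and the choice to drop inline tags such as <head> are assumptions made for the example, and the actual script may treat markup and whitespace differently.

```python
import re

# Hypothetical helper for illustration; not part of sval2plain.pl.
CONTEXT_RE = re.compile(r"<context>(.*?)</context>", re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def contexts_to_lines(sval2_text):
    """Return one plain text line per <context> element, in file order."""
    lines = []
    for match in CONTEXT_RE.finditer(sval2_text):
        # Assumption: inline tags such as <head> are dropped, not kept.
        body = TAG_RE.sub(" ", match.group(1))
        # Collapse newlines and repeated spaces so each context fits on one line.
        lines.append(" ".join(body.split()))
    return lines


# Toy input invented for this example (not taken from begin.v-test.xml).
sample = """<corpus lang="english">
<lexelt item="begin.v">
<instance id="begin.v.1">
<answer instance="begin.v.1" senseid="begin%2:30:00::"/>
<context>
The meeting <head>began</head>
at noon .
</context>
</instance>
<instance id="begin.v.2">
<context>
They will <head>begin</head> tomorrow .
</context>
</instance>
</lexelt>
</corpus>
"""

for line in contexts_to_lines(sample):
    print(line)
```

Each context spans several lines in the input but comes out as a single line, which is the property that plain text feature extraction data needs.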

INPUT

Required Arguments:

SVAL2

Input file in Senseval-2 format that is to be converted into plain text format.

Optional Arguments:

--help

Displays the summary of command line options.

--version

Displays the version information.

OUTPUT

sval2plain.pl prints the given SVAL2 file to STDOUT in plain text format, with the contextual data of each instance on a separate line. Specifically, the i-th line of output is the context of the i-th instance in the given SVAL2 file.
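
Because output line i corresponds to instance i, the number of output lines should equal the number of instances in the input. A minimal Python sketch of that sanity check follows. The helper names count_instances and is_aligned and the toy data are hypothetical, and the check assumes that every instance opens with an <instance ...> tag.

```python
import re


def count_instances(sval2_text):
    """Count opening <instance ...> tags in Senseval-2 formatted text."""
    # \b stops a longer tag name such as <instances> from matching.
    return len(re.findall(r"<instance\b", sval2_text))


def is_aligned(sval2_text, plain_text):
    """True when the plain text has exactly one line per instance."""
    return count_instances(sval2_text) == len(plain_text.splitlines())


# Toy data invented for this example.
sval2 = (
    '<instance id="a.1">\n<context>\nfirst one\n</context>\n</instance>\n'
    '<instance id="a.2">\n<context>\nsecond one\n</context>\n</instance>\n'
)
plain = "first one\nsecond one\n"

print(is_aligned(sval2, plain))
```

For the sample file in the SYNOPSIS, the same comparison is between the 255 instances reported by frequency.pl and the 255 lines reported by wc.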

AUTHOR

Ted Pedersen, University of Minnesota, Duluth
tpederse at d.umn.edu

Amruta Purandare, University of Pittsburgh

COPYRIGHT

Copyright (c) 2002-2008, Ted Pedersen and Amruta Purandare

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to

The Free Software Foundation, Inc.,
59 Temple Place - Suite 330,
Boston, MA  02111-1307, USA.