NAME
alvis_wikipedia_add_cats.pl - adds relevance scores for categories
to an Alvis version of a Wikipedia dump
SYNOPSIS
alvis_wikipedia_add_cats.pl [options] [Alvis XML root directory]
Options:
  --out-dir             output directory
  --alvis-suffix        the suffix of Alvis XML source files
  --dump-file           category graph dump file
  --score-type          name of the score type added
  --method              category picking method
  --category-list-file  file containing (prepicked) categories
  --help                brief help message
  --man                 full documentation
  --[no]warnings        warnings output flag
OPTIONS
- --out-dir
Sets the output directory. Default: '.'.
- --alvis-suffix
The suffix of the source Alvis XML files. Default: 'alvis'.
- --dump-file
The category graph dump file, loadable in Storable format. Default: 'CatGraph.Storable'.
- --score-type
The name of the new score type to be added to the Alvis XML files. Default: 'wikipedia Fundamental two top levels'.
- --method
The method used to pick the categories whose relevance scores are added. Choices: '2toplevels' (the two top levels starting from the root).
- --help
Prints a brief help message and exits.
- --man
Prints the manual page and exits.
- --[no]warnings
Output (or suppress) warnings. Default: yes.
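As an illustration of how the options and defaults listed above fit together, here is a minimal sketch using the core Getopt::Long module. It is not taken from the script itself: the parse_options helper, the 'root-dir' key, and the use of '2toplevels' as the default for --method are assumptions made for this example.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parses an argument list into a hash of options.
sub parse_options {
    my @args = @_;

    # Defaults as documented in the OPTIONS section.
    my %opt = (
        'out-dir'      => '.',
        'alvis-suffix' => 'alvis',
        'dump-file'    => 'CatGraph.Storable',
        'score-type'   => 'wikipedia Fundamental two top levels',
        'method'       => '2toplevels',  # assumption: no default is documented
        'warnings'     => 1,
    );

    GetOptionsFromArray(
        \@args, \%opt,
        'out-dir=s', 'alvis-suffix=s', 'dump-file=s', 'score-type=s',
        'method=s', 'category-list-file=s', 'help|?', 'man',
        'warnings!',                     # accepts --warnings and --nowarnings
    ) or die "Bad options\n";

    # Whatever remains is the Alvis XML root directory argument.
    $opt{'root-dir'} = shift @args;
    return \%opt;
}

my $opt = parse_options('--out-dir', 'scored', '--nowarnings', 'alvis_xml');
print "$_ = $opt->{$_}\n" for sort grep { defined $opt->{$_} } keys %$opt;
```

Options that are not given on the command line keep their default values, so only --out-dir and the warnings flag differ from the defaults in this call.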
DESCRIPTION
Adds relevance scores for categories to the Alvis records produced from a Wikipedia XML dump.
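The --dump-file and --method options suggest a workflow of loading a category graph with Storable and picking the categories in the two top levels below the root. The sketch below shows one way this could look. The graph layout (category name mapped to a list of child names), the root name 'Fundamental', the top_levels helper, and the choice to exclude the root itself from the picked categories are all assumptions; the actual structure of CatGraph.Storable is not documented here.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Hypothetical graph layout: category name => array ref of child categories.
my %graph = (
    'Fundamental' => ['Science', 'Culture'],
    'Science'     => ['Physics', 'Biology'],
    'Culture'     => ['Music'],
    'Physics'     => ['Optics'],
);

# Round-trip through a Storable file as a stand-in for the --dump-file input.
my $file = tempdir(CLEANUP => 1) . '/CatGraph.Storable';
store(\%graph, $file);
my $loaded = retrieve($file);

# Collect the categories found within $depth levels below $root,
# breadth-first, visiting each category at most once.
sub top_levels {
    my ($g, $root, $depth) = @_;
    my %seen     = ($root => 1);
    my @frontier = ($root);
    my @picked;
    for (1 .. $depth) {
        my @next;
        for my $cat (@frontier) {
            for my $child (@{ $g->{$cat} || [] }) {
                next if $seen{$child}++;
                push @next, $child;
            }
        }
        push @picked, @next;
        @frontier = @next;
    }
    return @picked;
}

my @cats = top_levels($loaded, 'Fundamental', 2);
print join(', ', @cats), "\n";
# prints "Science, Culture, Physics, Biology, Music"
```

In this toy graph 'Optics' sits three levels below the root, so it is not picked.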