NAME
linkMpca -- default run of MPCA on data from linkBags(1)
SYNOPSIS
linkMpca [--fields M|--norank|--onerank|--reporting] K STEM
Options:
K number of components to use
STEM file stem for linkBags(1) bag
--fields M set the maximum token index to be M
--norank do not do general ranking with the MPCA run
--onerank build a single rank per document only
--reporting build reports only
-h, --help display help message and exit.
--man print man page and exit.
DESCRIPTION
Builds K components with default mpca and their ranking with mprank. Requires a basic STEM to exist, with .srcpar and bags already created, presumably by linkBags(1). The new model is built with the name STEM plus K. Requires the file STEM.tokens, which lists the tokens in numeric order, one per line (e.g., the 0-th token in the file has numeric code 0, etc.). Also requires a STEM.docmap file.
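The requirements above can be checked before a run. The following is a minimal sketch, not part of linkMpca: the function names check_mpca_inputs and token_for_code are hypothetical, and it assumes the .srcpar file is named STEM.srcpar. It does not check for the bag files, whose names are not given here.

```shell
# Hypothetical helper: report which of the input files named in the
# DESCRIPTION are missing. Assumes the .srcpar file is STEM.srcpar.
check_mpca_inputs() {
    stem=$1
    missing=0
    for ext in srcpar tokens docmap; do
        if [ ! -f "$stem.$ext" ]; then
            echo "missing: $stem.$ext" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Hypothetical helper: print the token with numeric code CODE, relying on
# line CODE+1 of STEM.tokens holding the token whose code is CODE.
token_for_code() {
    stem=$1
    code=$2
    sed -n "$((code + 1))p" "$stem.tokens"
}
```

For example, "check_mpca_inputs STEM && linkMpca K STEM" would start the run only when all three files are present.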
Reporting creates three simple text tables:
STEM.topicmap : documents with high component probability
COMP PROBABILITY DOC-URL
STEM.rankmap : documents with high component-specific pagerank
COMP PROBABILITY DOC-URL
STEM.wordmap : tokens (words, URLs, URL+title) that occur frequently in documents for the component
COMP PROBABILITY TOKEN
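The report tables can be inspected with standard text tools. The sketch below is an illustration only: the function name top_for_component is hypothetical, and it assumes the three columns are separated by whitespace and that PROBABILITY is written as a plain decimal, neither of which is confirmed here.

```shell
# Hypothetical helper: print the rows of a report table for one component,
# highest probability first, limited to LIMIT rows (default 10).
# Assumes whitespace-separated columns: COMP PROBABILITY ITEM.
top_for_component() {
    table=$1
    comp=$2
    limit=${3:-10}
    awk -v c="$comp" '$1 == c' "$table" |
        LC_ALL=C sort -k2,2nr |
        head -n "$limit"
}
```

Under those assumptions, "top_for_component STEM.wordmap 3 20" would list up to 20 tokens for component 3.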
The --onerank option uses "mprank -u -h " to generate the single rank, and the output goes to STEM.onerank. Print this file with the MPCA pvec utility:
/usr/local/share/mpca/scripts/pvec STEM.onerank
SEE ALSO
linkRedir(1), linkBags(1), linkTables(1).
AUTHOR
Wray Buntine
COPYRIGHT AND LICENSE
Copyright (C) 2005-2006 Wray Buntine
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.