NAME

linkMpca -- default run of MPCA on data from linkBags(1)

SYNOPSIS

linkMpca [--fields M|--norank|--onerank|--reporting] K STEM

Options:

  K                   number of components to use
  STEM                file stem for linkBags(1) bag
  --fields M          set the maximum token index to be M
  --norank            do not do general ranking with the MPCA run
  --onerank           build a single rank per document only
  --reporting         build reports only
  -h, --help          display help message and exit.
  --man               print man page and exit.

DESCRIPTION

Builds K components with a default mpca run, and their ranking with mprank. Requires a basic STEM to exist, with its .srcpar file and bags already created, presumably by linkBags(1). The new model is built with the name STEM plus K. Requires the file STEM.tokens, which lists the tokens in numeric order, one per line (e.g., the 0-th token in the file has numeric code 0, etc.). Also requires the file STEM.docmap.
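The STEM.tokens layout described above (one token per line, with the line position acting as the numeric code) can be read back with a few lines of Python. This is only a sketch: the function name load_tokens and the UTF-8 encoding are assumptions made for illustration, not part of linkMpca or MPCA.

```python
from pathlib import Path


def load_tokens(stem: str) -> list[str]:
    """Return the tokens in STEM.tokens, indexed by numeric code.

    The token on the first line gets code 0, the next code 1, and so on.
    UTF-8 is assumed here; the real file encoding may differ.
    """
    path = Path(f"{stem}.tokens")
    with path.open(encoding="utf-8") as fh:
        # Strip only the trailing newline so tokens containing spaces
        # (for example URL+title tokens) are kept intact.
        return [line.rstrip("\n") for line in fh]
```

With the list in hand, tokens[code] maps a numeric code back to its token text.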

Reporting creates three simple text tables:

  STEM.topicmap : documents with high component probability
      COMP PROBABILITY DOC-URL
  STEM.rankmap : documents with high component-specific pagerank
      COMP PROBABILITY DOC-URL
  STEM.wordmap : tokens (words, URLs, URL+title) that occur frequently in documents for the component
      COMP PROBABILITY TOKEN
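The three-column report tables can be loaded for further processing along the following lines. This is a sketch under stated assumptions: the columns are taken to be whitespace-separated, COMP is taken to be an integer component index, and the names ReportRow and parse_report are hypothetical. The actual files written by the reporting step may differ in detail.

```python
from typing import Iterable, Iterator, NamedTuple


class ReportRow(NamedTuple):
    comp: int           # component index (assumed to be an integer)
    probability: float  # probability column
    item: str           # DOC-URL or TOKEN, depending on the table


def parse_report(lines: Iterable[str]) -> Iterator[ReportRow]:
    """Yield one ReportRow per non-blank line of a report table."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first two runs of whitespace only, so a token
        # such as "URL title" keeps its internal spaces.
        comp, prob, item = line.split(None, 2)
        yield ReportRow(int(comp), float(prob), item)
```

For example, list(parse_report(open("STEM.wordmap"))) would give the rows of the word table, which can then be grouped by the comp field.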

The --onerank option uses "mprank -u -h " to generate the single rank, and the output goes to STEM.onerank. Print this file with the MPCA pvec utility:

  /usr/local/share/mpca/scripts/pvec STEM.onerank

SEE ALSO

linkRedir(1), linkBags(1), linkTables(1).

AUTHOR

Wray Buntine

COPYRIGHT AND LICENSE

Copyright (C) 2005-2006 Wray Buntine

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.