NAME
makedb - generate, update or remove wais databases
SYNOPSIS
makedb [[-clean] -tidy] [-update] [-config config_file] [-test] [-debug] [-verbose] [-copy tmpdir] ([-all] | database ...)
DESCRIPTION
makedb creates, updates or removes the databases specified in a makedb config file (./makedb.conf unless overridden by the -config option).
OPTIONS
Note that all options may be abbreviated to a uniquely identifying prefix.
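The abbreviation rule can be illustrated with a small Python sketch. It is not taken from makedb itself: the function name resolve_option is hypothetical, the option names are copied from the SYNOPSIS, and the way ambiguous or unknown prefixes are reported is an assumption.

```python
# Option names as listed in the SYNOPSIS.
OPTIONS = ["clean", "tidy", "update", "config", "test",
           "debug", "verbose", "copy", "all"]


def resolve_option(arg, options=OPTIONS):
    """Return the full option name for a possibly abbreviated argument.

    Hypothetical helper: raises ValueError when the prefix is unknown
    or matches more than one option.
    """
    name = arg.lstrip("-")
    if name in options:                      # an exact name always wins
        return name
    matches = [opt for opt in options if opt.startswith(name)]
    if len(matches) == 1:                    # the prefix is unique
        return matches[0]
    if not matches:
        raise ValueError(f"unknown option: {arg}")
    raise ValueError(f"ambiguous option {arg}: {', '.join(matches)}")
```

Under this model -u and -cl resolve to -update and -clean, while -c is rejected because it could mean -clean, -config or -copy.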
- -clean -tidy
-
Delete databases. This option can be used together with the -update option. Deletion is done before the update regardless of the order of options on the command line :-). Files with the extensions src, fmt, fde, syn, stop, and cat will not be removed unless -tidy is given too.
- -config config_file
-
Read an alternate config file. Default is ./makedb.conf.
- -update
-
Update the databases.
- -all
-
Clean/update all databases specified in the config file. If not given, clean/update only the databases named on the command line.
- -test
-
Do nothing. Just print actions.
- -copy tmpdir
-
Do the actual indexing in tmpdir: copy the database to tmpdir, run the index commands there, and copy the result back.
- -debug
-
Not implemented yet.
- -verbose
-
Additional messages to stderr.
Config File
The config file should be made up of lines assigning values to variables as in:
waisindex = /usr/local/ls6/wais/bin/waisindex
Each assignment must start in column 1. Shell comments are allowed. Some of the variables have a predefined meaning. There are global and local variables. Local variables are instantiated for each database. Each database = assignment introduces a new local block. Use the -verbose option if you are unsure about the scoping. Assignments may have the form variable += value, in which case the value is appended to variable.
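The assignment rules above can be modelled in a few lines of Python. This is only a sketch, not makedb's own parser: the names parse_line and apply_assignment are hypothetical, a comment is assumed to start at a # that begins the line or follows whitespace, and += is assumed to join values with a single space.

```python
import re

# name, "=" or "+=", value; anchored so that the name starts in column 1
ASSIGNMENT = re.compile(r"^([A-Za-z_]\w*)\s*(\+?=)\s*(.*)$")


def parse_line(line):
    """Return (name, operator, value), or None if the line assigns nothing."""
    # Drop a shell-style comment (assumption: "#" at line start or after a space).
    text = re.sub(r"(^|\s)#.*$", "", line.rstrip("\n")).rstrip()
    match = ASSIGNMENT.match(text)
    if match is None:                 # blank, comment-only, or indented line
        return None
    return match.group(1), match.group(2), match.group(3)


def apply_assignment(variables, line):
    """Update the dict `variables` in place from one config line."""
    parsed = parse_line(line)
    if parsed is None:
        return
    name, operator, value = parsed
    if operator == "+=" and name in variables:
        variables[name] = variables[name] + " " + value   # append
    else:
        variables[name] = value                           # plain assignment
```

Feeding it files = /usr/local/doc/*.html followed by files += /usr/local/doc/*.doc leaves both patterns in files, separated by a space. A line with leading whitespace is ignored because the assignment does not start in column 1.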
The following variables are global. The last occurrence in the file counts.
- waisindex
-
Path to the waisindex program. See example above.
- wais_opt
-
Options for all waisindex runs. For example:
wais_opt = -nocat
- fmtdir
-
Directory in which to look for database.fmt if it does not exist in dbdir. Also database.src, database.fde, database.syn, database.stop and database.cat are copied unless they exist in dbdir.
The following variables are local to a database block. The last occurrence up to the end of the block counts. For limit, dbdir and options there can be global defaults (given before the current block). When leaving a block these values are restored.
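One way to picture the block scoping is the following Python sketch. It is a model of the documented behaviour rather than makedb's implementation: split_blocks is a hypothetical name, the input is assumed to be already parsed into (name, value) pairs, and "restoring" is modelled by giving every block its own copy of the global defaults.

```python
LOCAL = {"files", "options", "dbdir", "limit"}   # block-local variables
DEFAULTABLE = {"options", "dbdir", "limit"}      # may have global defaults


def split_blocks(assignments):
    """Group (name, value) pairs into one settings dict per database."""
    global_vars = {"limit": "100"}               # documented default for limit
    blocks = []
    current = None
    for name, value in assignments:
        if name == "database":
            # A new block starts from a copy of the current global defaults,
            # so changes made inside a block never leak into the next one.
            current = {k: v for k, v in global_vars.items() if k in DEFAULTABLE}
            current["database"] = value
            blocks.append(current)
        elif current is not None and name in LOCAL:
            current[name] = value                # last occurrence in block counts
        else:
            global_vars[name] = value            # global or user-defined variable
    return blocks, global_vars
```

With a global limit = 10, a block that sets limit = 0 does not affect a later block that sets no limit of its own; that later block sees 10 again.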
- database
-
The name of the database.
- files
-
A list of shell fileglob expressions as in:
files = /usr/local/doc/*.html
files += /usr/local/doc/*.doc
You may also use backticks (`) but no double quotes ("):
files = `find $dbdir -name make\* -print`
- options
-
Additional waisindex options. For example:
options = -t fields
- dbdir
-
The directory in which the wais database lives.
- limit
-
The number of dead files which should be tolerated in the index. A dead file is a file which was in the index, changed and then re-indexed. Since the index does not support deletions, the file is removed from the filename table instead. All of its postings remain in the index, occupying space on the disk and slowing down searches. Also, the global occurrence counters for terms in the file end up too high, which distorts the final weights for hits. When more than limit files are killed this way, makedb regenerates the whole index. This takes more time than simply updating, but the index size is reduced and searches will be faster. Set limit to choose your tradeoff. limit defaults to 100.
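The threshold described for limit amounts to a single comparison, sketched here in Python. The function name choose_action and its return strings are hypothetical; only the "more than limit" rule and the default of 100 come from the description above.

```python
def choose_action(dead_files, limit=100):
    """Decide between an incremental update and a full rebuild.

    The index is regenerated only when MORE than `limit` files are dead,
    so limit = 0 tolerates no dead files at all.
    """
    return "regenerate" if dead_files > limit else "update"
```

For example, with limit = 0 a single dead file already triggers a full rebuild, while with the default limit the 100th dead file is still tolerated and the 101st is not.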
All other variables do not have any meaning to makedb unless you use them in the value part of an assignment as in:
docdir = /home/robots/wais/wais-docs
database = test
files = $docdir/TEST
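The use of variables in the value part can be sketched in Python as a simple textual substitution. This is an illustration under assumptions, not makedb's code: expand is a hypothetical name, variable names are assumed to consist of letters, digits and underscores, and an unknown variable is assumed to be left untouched.

```python
import re


def expand(value, variables):
    """Replace each $name in `value` with its entry in the dict `variables`."""
    def substitute(match):
        # Unknown names are kept as written (assumption).
        return variables.get(match.group(1), match.group(0))
    return re.sub(r"\$(\w+)", substitute, value)
```

With docdir set as in the lines above, expand("$docdir/TEST", {"docdir": "/home/robots/wais/wais-docs"}) yields /home/robots/wais/wais-docs/TEST.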
EXAMPLE
# makedb.conf -- makedb configuration file
# Global options
dbdir = /home/robots/wais/wais-sources
waisindex = /usr/local/ls6/wais/bin/waisindex
wais_opt = -nocat # don't create catalog files
limit = 10 # 10 dead files maximum
# User defined variables
docdir = /home/robots/wais/wais-docs
# the databases
database = bibdb-html
files = $docdir/bibdb.html # use of variables in the value
limit = 0 # no dead files
options = -T HTML -t fields
database = journals
files = $docdir/journals/*
limit = 3
options = -t fields
database = www-pages
wwwroot = /home/robots/www/pages # new global variable
files = `find $wwwroot -name \*.html -print`
options = -t URL $wwwroot http:
database = test
dbdir = /home/crew/pfeifer/tmp/wittenberg
files = $dbdir/ma*
files += $dbdir/te* # append
options = -t text
AUTHOR
Ulrich Pfeifer <pfeifer@ls6.informatik.uni-dortmund.de>