NAME

cwb-make - Automated indexing and compression for CWB corpora

SYNOPSIS

cwb-make [options] CORPUS [<attributes>]

Options:

-r <dir>   use registry directory <dir> [system default]
-M <n>     use <n> MBytes of RAM for indexing [default: 75]
-V         validate newly created files
-g <name>  put newly created files into group <name>
-p <nnn>   set access permissions of created files to <nnn>
-v         print some progress information
-D         activate debugging output
-h         show help page

Long forms of command-line options are listed below.

DESCRIPTION

The cwb-make utility automates index building and compression for a CWB corpus, calling cwb-makeall, cwb-huffcode and cwb-compress-rdx as needed. Its main advantages over the manual procedure are:

Old index files are updated automatically (unlike cwb-makeall, which does not check the age of index files), and it is safe to call cwb-make on an indexed and compressed corpus (again, unlike cwb-makeall).
Data files that are no longer needed after compression are immediately deleted.
The build process is optimised to reduce the amount of temporary disk space and memory needed. This is particularly important when indexing large corpora on 32-bit platforms, where cwb-makeall might easily run out of address space when called directly.

The basic usage pattern is

cwb-make [options] CORPUS [attribute ...]

where CORPUS is the CWB name (ID) of the corpus to be indexed (after encoding with cwb-encode) and should be written in upper case. If positional attributes are added at a later time, they can be indexed separately by specifying the attribute names after the corpus ID. Note that it is always safe simply to call cwb-make: existing indexed and compressed attributes will be ignored. Further command-line options are detailed below.
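
The usage pattern above can be sketched as a small shell script. This is only an illustration: the corpus ID EXAMPLE, the attribute names lemma and pos, and the cwb_make wrapper with its DRY_RUN switch are hypothetical and not part of cwb-make itself. By default the wrapper only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical corpus ID; substitute the ID of a corpus encoded with cwb-encode.
CORPUS="EXAMPLE"
# Assumption: DRY_RUN=1 (the default here) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the cwb-make command line, or execute it.
cwb_make() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "cwb-make $*"
    else
        cwb-make "$@"
    fi
}

# Index and compress all attributes of the corpus.
cwb_make "$CORPUS"

# Later, after adding positional attributes (names are placeholders),
# index only those attributes.
cwb_make "$CORPUS" lemma pos
```

Setting DRY_RUN=0 in the environment makes the script invoke cwb-make for real.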

cwb-make is a minimal front-end to the CWB::Indexer functionality provided by the CWB::Encoder module, which can also be used directly from a Perl script. See "CWB::Indexer METHODS" in the CWB::Encoder manpage for further information.

COMMAND-LINE OPTIONS

--registry=dir, -r dir: Use registry directory dir instead of the standard registry (CWB default or specified by the CORPUS_REGISTRY environment variable).
--memory=n, -M n: Use approx. n megabytes (MiB) of RAM for indexing. The default of 75 MiB is safe even for computers with a small amount of memory or many concurrent users. If more RAM is available, indexing can be sped up considerably by setting a higher memory limit. For instance, -M 500 or -M 1000 is a good choice on a machine with 2 GiB of RAM and a low work load.
--validate, -V: Validate newly created data files (index files and compressed corpus data). This is normally not required, as the CWB indexing and compression algorithms have been tested thoroughly by a large user community.
--group=name, -g name
--permissions=ddd, -p ddd: Set group membership (name) and access permissions (octal code ddd) of new data files. If these options are not specified, the system defaults for newly created files are used.
--verbose, -v: Print some progress information on STDOUT. Use this option to see feedback during a potentially long-running operation.
--debug, -D: Activate debugging output, which shows all shell commands executed to build the additional indexing files (on STDERR).
--help, -h: Display help page with short usage summary (similar to SYNOPSIS above).
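
As a rough illustration of how the options above combine, the following sketch issues the same call once with short options and once with the equivalent long options. All values (registry path, group name, permission code, memory limit, corpus ID) are hypothetical placeholders, and the cwb_make wrapper with its DRY_RUN switch is an assumption added for the example, so that the commands are only printed by default.

```shell
#!/bin/sh
# All values below are hypothetical placeholders; adjust them to your system.
CORPUS="EXAMPLE"
REGISTRY="/path/to/registry"
GROUP="corpora"
PERMS="640"        # octal access permissions for new data files
MEMORY_MB=500      # RAM limit for indexing, in MiB
# Assumption: DRY_RUN=1 (the default here) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the cwb-make command line, or execute it.
cwb_make() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "cwb-make $*"
    else
        cwb-make "$@"
    fi
}

# Short options.
cwb_make -r "$REGISTRY" -M "$MEMORY_MB" -g "$GROUP" -p "$PERMS" -V -v "$CORPUS"

# Equivalent long options.
cwb_make --registry="$REGISTRY" --memory="$MEMORY_MB" \
         --group="$GROUP" --permissions="$PERMS" \
         --validate --verbose "$CORPUS"
```

If the registry is already set through the CORPUS_REGISTRY environment variable, the -r option can be left out.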

COPYRIGHT

This software is provided AS IS and the author makes no warranty as to its use and performance. You may use the software, redistribute and modify it under the same terms as Perl itself.
