NAME
alvisXMLsplit -- splits a big file into pieces in a directory for easier processing.
SYNOPSIS
alvisXMLsplit [--bzip2] <Alvis XML file> <N per file> <out-dir>
Splits a large Alvis XML file into files of N documentRecords each, written to the directory <out-dir>.
Pass --bzip2 if both the input and the output are bzip2-compressed.
Output files are UTF-8 and Perl friendly: each line holds at most one
<documentRecord> or </documentRecord> tag, to facilitate line-by-line processing.
DESCRIPTION
Splits a big Alvis XML file into pieces in a directory for easier processing. The algorithm is simple but somewhat slow: each document is built up in memory before being written out, which is not efficient in Perl.
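The approach described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the script's actual Perl code. The function names (iter_records, split_records) are hypothetical, and the sketch assumes that records are not nested and that the whole input fits in memory. Writing the groups to <out-dir> and bzip2 handling are left out.

```python
import re
from typing import Iterator, List

# Assumption: each record runs from an opening <documentRecord ...> tag to the
# next </documentRecord> tag, and records are never nested.
RECORD_RE = re.compile(r"<documentRecord\b.*?</documentRecord>", re.DOTALL)


def iter_records(text: str) -> Iterator[str]:
    """Yield each complete record as a single in-memory string."""
    for match in RECORD_RE.finditer(text):
        yield match.group(0)


def split_records(text: str, n: int) -> List[List[str]]:
    """Group the records of `text` into lists of at most `n` records each."""
    if n < 1:
        raise ValueError("n must be at least 1")
    pieces: List[List[str]] = []
    current: List[str] = []
    for record in iter_records(text):
        current.append(record)
        if len(current) == n:
            # One output piece is full; start collecting the next one.
            pieces.append(current)
            current = []
    if current:
        # The final piece may hold fewer than n records.
        pieces.append(current)
    return pieces
```

Each inner list corresponds to one output file. As in the description, every record is assembled in full before it is emitted, which keeps the logic simple at the cost of speed and memory.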
AUTHOR
Wray Buntine