NAME

rem-boilerplate-text

VERSION

Version 0.2

SYNOPSIS

> rem-boilerplate-text [options] <list of files>

E.g.

> rem-boilerplate-text --min_dupl=6 intranet/txt/*.txt

DESCRIPTION

Removes repeated text from a set of files.

Note that the system only works when more than one file is specified, since boilerplate text is detected based on repetition across files.

New files are written, with a suffix appended to the original filenames.
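
The idea of detecting boilerplate by repetition across files can be illustrated with a minimal sketch. This is not the script's actual implementation (which is written in Perl); the function names `find_repeated_lines` and `strip_repeated_lines` are hypothetical, and counting each line at most once per file is an assumption.

```python
from collections import Counter


def find_repeated_lines(documents, min_dupl=3):
    """Return the lines that occur in at least `min_dupl` documents.

    `documents` maps a file name to its list of lines. Each line is
    counted at most once per document (an assumption of this sketch).
    """
    counts = Counter()
    for lines in documents.values():
        # A set ensures a line repeated inside one file counts only once.
        counts.update({line.strip() for line in lines if line.strip()})
    return {line for line, n in counts.items() if n >= min_dupl}


def strip_repeated_lines(documents, min_dupl=3):
    """Return a copy of `documents` with the repeated lines removed."""
    repeated = find_repeated_lines(documents, min_dupl)
    return {
        name: [line for line in lines if line.strip() not in repeated]
        for name, lines in documents.items()
    }


docs = {
    "a.txt": ["ACME Intranet", "Holiday schedule", "Contact: webmaster"],
    "b.txt": ["ACME Intranet", "Canteen menu", "Contact: webmaster"],
    "c.txt": ["ACME Intranet", "Parking rules", "Contact: webmaster"],
}
cleaned = strip_repeated_lines(docs, min_dupl=3)
print(cleaned["a.txt"])  # ['Holiday schedule']
```

With a single input file no line can reach the threshold through cross-file repetition, which is why more than one file has to be given.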

OPTIONS

-m, --min_dupl: The minimum number of times a line has to occur to be considered boilerplate (default: 3). Can be either an integer or a percentage ('50 %') of the number of files processed. Minimum value: 2.
-i, --ignore_digits: Lines that differ only in their digits are considered duplicates (default: yes).
-s, --suffix: Suffix appended to the names of the new files (default: 'content').
-o, --only_headers_and_footers: Only runs of consecutive duplicate lines at the start and end of a document are considered boilerplate (default: yes).
-d, --digest: Lines are replaced by an MD5 digest while duplicates are compiled, which saves memory (default: no).
-l, --log: Name of the log file, where deleted lines are recorded; if set to false, no log will be created (default: './text-identify-boilerplate.log').
-h, --help: Display usage information.
-v, --verbose: Be verbose.
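
How several of these options might behave can be sketched as follows. This is an illustration under stated assumptions, not the script's own code: the helper names `resolve_min_dupl`, `line_key` and `trim_headers_and_footers` are hypothetical, rounding a percentage up is an assumption, and so is normalizing lines by deleting their digits.

```python
import hashlib
import math
import re


def resolve_min_dupl(value, n_files):
    """Turn '6' or '50 %' into an absolute threshold, never below 2."""
    text = str(value).strip()
    if text.endswith("%"):
        percent = float(text[:-1].strip())
        # Rounding up is an assumption of this sketch.
        threshold = math.ceil(n_files * percent / 100)
    else:
        threshold = int(text)
    return max(2, threshold)


def line_key(line, ignore_digits=True, digest=False):
    """Build the key under which a line is compared with other lines."""
    key = line.strip()
    if ignore_digits:
        # Deleting digits makes 'Page 3 of 12' and 'Page 4 of 12' equal.
        key = re.sub(r"\d", "", key)
    if digest:
        # Store a fixed-size MD5 hex digest instead of the full line.
        key = hashlib.md5(key.encode("utf-8")).hexdigest()
    return key


def trim_headers_and_footers(lines, is_boilerplate):
    """Drop boilerplate only from the start and the end of a document."""
    start, end = 0, len(lines)
    while start < end and is_boilerplate(lines[start]):
        start += 1
    while end > start and is_boilerplate(lines[end - 1]):
        end -= 1
    return lines[start:end]


print(resolve_min_dupl("50 %", 9))  # 5
print(line_key("Page 3 of 12") == line_key("Page 4 of 12"))  # True

boilerplate = {"Logo", "Footer"}
page = ["Logo", "Text", "Logo", "More", "Footer"]
print(trim_headers_and_footers(page, lambda l: l in boilerplate))
# ['Text', 'Logo', 'More']
```

In the last call the second "Logo" survives because it sits in the middle of the document, which is the kind of line a headers-and-footers-only mode would leave alone.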

AUTHOR

Lars Nygaard, <lars.nygaard@inl.uio.no>

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
