NAME

fix_latin - filters a data stream that is predominantly UTF-8 and 'fixes' any Latin (i.e. non-ASCII 8-bit) characters

SYNOPSIS

fix_latin options <input_file >output_file

Options:

 --use-xs <value> 'auto' | 'always' | 'never'
 --version        list version number
 --help           detailed help message

DESCRIPTION

The script acts as a filter, taking source data which may contain a mix of ASCII, UTF8, ISO8859-1 and CP1252 characters, and producing output which will be all ASCII/UTF8.

Multi-byte UTF8 characters will be passed through unchanged (although over-long UTF8 byte sequences will be converted to the shortest normal form). Single byte characters will be converted as follows:

0x00 - 0x7F   ASCII - passed through unchanged
0x80 - 0x9F   Converted to UTF8 using CP1252 mappings
0xA0 - 0xFF   Converted to UTF8 using Latin-1 mappings
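
The byte ranges above can be illustrated with a short Python sketch. This is not the code used by Encoding::FixLatin; the function name fix_latin_sketch is hypothetical, over-long UTF8 sequences are treated as stray single bytes rather than normalised, and the handling of the five byte values that CP1252 leaves undefined (falling back to Latin-1) is an assumption made only for this illustration.

```python
def fix_latin_sketch(data: bytes) -> bytes:
    """Illustrative only: apply the byte-range rules from the table above."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # 0x00 - 0x7F: ASCII, passed through unchanged
            out.append(bytes([b]))
            i += 1
            continue
        # Work out how long a UTF-8 sequence starting with this byte would be.
        if 0xC2 <= b <= 0xDF:
            length = 2
        elif 0xE0 <= b <= 0xEF:
            length = 3
        elif 0xF0 <= b <= 0xF4:
            length = 4
        else:
            length = 0  # not a valid UTF-8 lead byte
        chunk = data[i:i + length]
        if length and len(chunk) == length:
            try:
                chunk.decode("utf-8")  # strict check: well-formed sequence?
                out.append(chunk)      # multi-byte UTF-8 passes through
                i += length
                continue
            except UnicodeDecodeError:
                pass
        # A lone high byte: choose the mapping by range.
        if b <= 0x9F:
            try:
                ch = bytes([b]).decode("cp1252")   # 0x80 - 0x9F
            except UnicodeDecodeError:
                # Assumption: bytes undefined in CP1252 fall back to Latin-1.
                ch = bytes([b]).decode("latin-1")
        else:
            ch = bytes([b]).decode("latin-1")      # 0xA0 - 0xFF
        out.append(ch.encode("utf-8"))
        i += 1
    return b"".join(out)


mixed = "café ".encode("utf-8") + b"\x93na\xefve\x94"
print(fix_latin_sketch(mixed).decode("utf-8"))
```

Note that a byte pair which happens to form a well-formed UTF8 sequence is always treated as UTF8, so the approach suits data that is predominantly UTF8 already.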

OPTIONS

--use-xs 'auto' | 'always' | 'never': Override default ('auto') behaviour of trying to use XS module and falling back to pure-Perl version if not available. Set to 'never' to always use the Perl version or 'always' to always use XS and die if not available.
--version (alias -v): Display version number of underlying Encoding::FixLatin and XS modules.
--help (alias -?): Display this documentation.

EXAMPLES

This script was originally written to assist in converting a Postgres database from SQL-ASCII encoding to UNICODE UTF8 encoding. The following examples illustrate its use in that context.

If you have a SQL format dump file that you would normally restore by piping into 'psql', you can simply filter the dump file through this script:

fix_latin < dump_file | psql -d database

If you have a compressed dump file that you would normally restore using 'pg_restore', you can omit the '-d' option on pg_restore and pipe the resulting SQL through this script and into psql:

pg_restore -O dump_file | fix_latin | psql -d database

To take a look at non-ASCII lines in the dump file:

perl -ne '/^COPY (\S+)/ and $t = $1; print "$t:$_" if /[^\x00-\x7F]/' dump_file
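
For readers less familiar with Perl one-liners, the same check can be sketched in Python. The function name non_ascii_lines and the sample data are hypothetical and exist only for illustration; the logic mirrors the one-liner above by remembering the table name from the most recent COPY line and prefixing it to every line that contains a byte above 0x7F.

```python
import re

# Matches the start of a COPY statement and captures the table name.
COPY_RE = re.compile(rb"^COPY (\S+)")


def non_ascii_lines(lines):
    """Yield b'table:line' for each line containing a non-ASCII byte."""
    table = b""
    for line in lines:
        match = COPY_RE.match(line)
        if match:
            table = match.group(1)
        if any(byte > 0x7F for byte in line):
            yield table + b":" + line


# Hypothetical sample standing in for the contents of a dump file.
sample = [
    b"COPY users (id, name) FROM stdin;\n",
    b"1\tAlice\n",
    b"2\tJos\xe9\n",
]
for hit in non_ascii_lines(sample):
    print(hit)
```

To scan a real dump, pass a file opened in binary mode (for example open("dump_file", "rb")) in place of the sample list, so that bytes are examined before any decoding takes place.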

COPYRIGHT & LICENSE

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
