Changes for version 1.02 - 2010-04-30

  • shrink over-long UTF-8 sequences to shortest form
  • add overlong_fatal option for trapping overlong sequences
  • add ascii_hex option for undefined bytes - enabled by default
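
As a rough illustration of the first two items, here is a minimal sketch in Python (not the module's own code) of shrinking an over-long UTF-8 sequence to its shortest form. The function names `decode_lenient` and `shrink` are hypothetical, and the `overlong_fatal` parameter only mimics the option named above; how the real option reports errors is an assumption.

```python
def decode_lenient(seq: bytes) -> int:
    """Decode one UTF-8 sequence to a code point, accepting over-long forms."""
    first = seq[0]
    if first < 0x80:
        need, cp = 0, first
    elif first & 0xE0 == 0xC0:
        need, cp = 1, first & 0x1F
    elif first & 0xF0 == 0xE0:
        need, cp = 2, first & 0x0F
    elif first & 0xF8 == 0xF0:
        need, cp = 3, first & 0x07
    else:
        raise ValueError("not a UTF-8 lead byte")
    if len(seq) != need + 1:
        raise ValueError("wrong number of continuation bytes")
    for b in seq[1:]:
        if b & 0xC0 != 0x80:
            raise ValueError("bad continuation byte")
        cp = (cp << 6) | (b & 0x3F)
    return cp


def shrink(seq: bytes, overlong_fatal: bool = False) -> bytes:
    """Re-encode one sequence in shortest form (hypothetical helper)."""
    shortest = chr(decode_lenient(seq)).encode("utf-8")
    if overlong_fatal and len(shortest) < len(seq):
        # Assumption: a fatal option would raise instead of silently fixing.
        raise ValueError("over-long UTF-8 sequence")
    return shortest


print(shrink(b"\xC0\xAF"))      # two-byte over-long encoding of "/"
print(shrink(b"\xE0\x80\xAF"))  # three-byte over-long encoding of "/"
```

Both calls yield the single byte `/`. Sequences that are already in shortest form come back unchanged. Surrogates and out-of-range code points are not handled in this sketch.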

Documentation

filters a data stream that is predominantly UTF-8 and 'fixes' any Latin (i.e. non-ASCII 8-bit) characters

Modules

takes mixed encoding input and produces UTF-8 output
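
The general idea can be sketched in Python as follows. This is an illustrative approximation, not the module's implementation: the function name `fix_latin` is hypothetical, treating stray 8-bit bytes as CP1252 is an assumption, and so is the `%XX` format used for undefined bytes when `ascii_hex` is on.

```python
def fix_latin(data: bytes, ascii_hex: bool = True) -> str:
    """Pass ASCII and valid UTF-8 through; reinterpret stray 8-bit bytes."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                      # plain ASCII
            out.append(chr(b))
            i += 1
            continue
        # Expected sequence length if this byte is a valid UTF-8 lead byte.
        if 0xC2 <= b <= 0xDF:
            length = 2
        elif 0xE0 <= b <= 0xEF:
            length = 3
        elif 0xF0 <= b <= 0xF4:
            length = 4
        else:
            length = 0
        if length:
            try:
                out.append(data[i:i + length].decode("utf-8"))
                i += length
                continue
            except UnicodeDecodeError:
                pass                      # not UTF-8 after all; fall through
        # Stray 8-bit byte: assume CP1252 (a superset of printable Latin-1).
        try:
            out.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            # Byte is undefined in CP1252 (e.g. 0x81).
            if ascii_hex:
                out.append("%%%02X" % b)  # assumed hex escape format
            else:
                out.append(chr(b))
        i += 1
    return "".join(out)


mixed = b"caf\xe9 and caf\xc3\xa9"        # Latin-1 e-acute, then UTF-8 e-acute
print(fix_latin(mixed).encode("utf-8"))
```

Because valid UTF-8 is tried first, a pair of Latin bytes that happens to form a valid UTF-8 sequence is left as is, which is why this approach suits input that is predominantly UTF-8. Over-long sequences are not shrunk in this sketch; strict decoding rejects them, so their bytes are handled individually.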