Revision history for MARC::Charset
0.96 Wed Mar 14 01:24:48 EDT 2007
- added ignore_errors() to skip MARC8 -> UTF8 snafus
- added assume_encoding() to treat data that fails transcoding as if
it were in a known, specific encoding. Useful if you have a set of
records that, for instance, report being MARC8 but are actually
encoded in Latin1 (which, btw, is completely invalid and also very
common). Only in effect when ignore_errors() is true.
- added assume_unicode() to treat invalid MARC8 as UTF8. This is a
convenience function based on assume_encoding().
0.92 Sat Feb 4 19:34:19 CST 2006
- marc8_to_utf8 and utf8_to_marc8 now pass spaces through
without translation
- added tests to t/escape2.t and t/utf8.t to test space behavior
0.91 Fri Feb 3 23:10:59 EST 2006
- fix in marc8_to_utf8 for error reporting when no mapping is found
0.9 Fri Feb 3 22:25:39 EST 2006
- the utf8->marc8 conversion now prefers the first mapping it runs
across in the LoC XML mapping table. v0.8 preferred the last mapping
found, which meant that utf8_to_marc8 would escape to non-ascii
character sets for some punctuation. Thanks to Mike Rylander for
helping isolate this problem.
- added a test that makes sure punctuation is working properly
to no_escape.t
- modified the test of multiple combining characters in utf8.t to
actually test for the correct result
0.8 Tue Dec 6 07:10:19 CST 2005
- complete overhaul to make MARC::Charset use LoC XML mapping table.
0.7 Wed Sep 7 21:34:18 2005
- pod fixes
0.6 Thu Feb 26 10:26:22 2004
- fixed MARC::Charset::EastAsian to not hexify results of character lookup
since we are now storing hex values in the BerkeleyDB.
- also fixed the method for looking up the location of the BerkeleyDB
so that the testing version takes precedence over one that is
installed. This is why the above error was not detected during
testing.
0.5 Fri Apr 11 06:47:00 2003
- all Charset classes inherit from MARC::Charset::Generic
- added MARC::Charset::UTF8
- added MARC::Charset::to_marc8() for conversion of UTF8 back to MARC8
- t/115.utf8.t basic tests of to_marc8()
- modified Makefile.PL to create a reverse mapping database for mapping
UTF8 characters back to their MARC8 equivalent.
0.3 Tue Dec 3 17:09:23 2002
- revamped to_utf8() to handle multibyte character sets. It is no longer
recursive, and didn't really need to be in the first place.
- created MARC::Charset::EastAsian!
0.2 Sat Oct 19 03:24:22 2002
- Added the 'Final character' to identify Extended Latin.
0.1 Fri Jul 26 09:10:36 2002
- Original version