NAME
Encode::ISO2022 - ISO/IEC 2022 character encoding scheme
SYNOPSIS
package FooEncoding;
use base qw(Encode::ISO2022);
__PACKAGE__->Define(
Name => 'foo-encoding',
CCS => [ {...CCS one...}, {...CCS two...}, ....]
);
DESCRIPTION
This module provides a character encoding scheme (CES) that switches among multiple coded character sets (CCSs).
The class method Define() takes the following arguments.
- Alias => REGEX
-
A regular expression matching aliases of this encoding, if any.
- Name => STRING
-
The name of this encoding as an Encode::Encoding object. Mandatory.
- CCS => [ FEATURE, FEATURE, ...]
-
List of features defining the CCSs used by this encoding. Mandatory. Each item is a hash reference containing the following items.
- bytes => NUMBER
-
Number of bytes to represent each character. Default is 1.
- cl => BOOLEAN
-
If set to a true value, this CCS includes mappings to and from the code points between 0/0 and 1/15. There should be one CCS with this flag so that a broken designation can be reset.
- dec_only => BOOLEAN
-
If set to a true value, this CCS will be used only for decoding.
- encoding => STRING | ENCODING
-
An Encode::Encoding object used as the CCS, or its name. Mandatory.
Encodings used as CCSs must provide "raw" conversion: they must be stateless, fixed-length conversions over 94^n or 96^n code tables. Encode::ISO2022::CCS lists the available CCSs.
- g => STRING
- g_init => STRING
-
Working set this CCS may be designated to: 'g0', 'g1', 'g2' or 'g3'.
If g_init is set, this CCS will be designated implicitly at the beginning of conversion, and explicitly at the end of conversion.
If g or g_init is set and neither ls nor ss is set, this CCS will be invoked when it is designated.
If none of g, g_init, ls or ss is set, this CCS is always invoked.
- g_seq => STRING
-
Escape sequence to designate this CCS, if it can be designated explicitly.
- gr => BOOLEAN
-
If set to a true value, this CCS will be invoked to GR using the 7-bit conversion table.
- ls => STRING
- ss => STRING
-
Escape sequence or control character to invoke this CCS, if it should be invoked explicitly.
If ls is set, this CCS will be invoked by locking shift. If ss is set, this CCS will be invoked by single shift.
- range => STRING
-
Possible range of encoded bytes. Typical values are '\x21-\x7E', '\x20-\x7F', '\xA1-\xFE' or '\xA0-\xFF'. This is required for multibyte CCSs to detect broken multibyte sequences.
- LineInit => BOOLEAN
-
If true, designation and invocation states will be initialized at the beginning of each line.
- SubChar => STRING
-
Unicode string to be used as the substitution character.
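The role of the range item can be illustrated with a short standalone sketch. It is not this module's implementation: the helper name is_well_formed is hypothetical, and the check is only an assumption about what "detecting broken multibyte sequences" amounts to, namely that the input splits into whole characters of the declared length whose bytes all fall inside the declared range.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Encode::ISO2022.
# Returns 1 if $octets consists of zero or more $bytes-byte characters
# whose every byte lies in $range, otherwise 0.
# $range is a character-class body such as '\x21-\x7E'.
sub is_well_formed {
    my ($octets, $bytes, $range) = @_;
    return $octets =~ /\A(?:[${range}]{${bytes}})*\z/ ? 1 : 0;
}

# A two-byte character from a 94^2 table passes the check.
print is_well_formed("\x30\x21", 2, '\x21-\x7E'), "\n";

# A truncated sequence, or one with a byte outside the range, fails.
print is_well_formed("\x30",     2, '\x21-\x7E'), "\n";
print is_well_formed("\x30\x80", 2, '\x21-\x7E'), "\n";
```

Without a declared range, a decoder for a multibyte CCS has no way to tell a stray byte from the second half of a character, which is presumably why the item is required in that case.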
For more about how to use this module, see the source of Encode::ISO2022JP2 as an example.
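As a minimal sketch of how the arguments to Define() fit together, the following hypothetical subclass describes a 7-bit encoding that switches between a single-byte and a double-byte CCS in G0. The package name, the encoding name 'x-foo-2022', the choice of 'ascii' and 'jis0208-raw' as CCS encodings, and the substitution character are all assumptions made for illustration; consult Encode::ISO2022::CCS for the CCSs that are actually usable.

```perl
package My::FooEncoding;    # hypothetical package name
use strict;
use warnings;
use base qw(Encode::ISO2022);
use Encode::JP;             # assumed to provide the 'jis0208-raw' CCS

__PACKAGE__->Define(
    Name => 'x-foo-2022',   # hypothetical encoding name
    CCS  => [
        {   # Single-byte CCS that also maps control codes 0/0 to 1/15.
            # Designated to G0 implicitly at the start of conversion.
            encoding => 'ascii',
            cl       => 1,
            g_init   => 'g0',
            g_seq    => "\e(B",     # ESC 2/8 4/2
        },
        {   # Double-byte CCS, designated to G0 by its escape sequence.
            # range lets broken two-byte sequences be detected.
            encoding => 'jis0208-raw',
            bytes    => 2,
            g        => 'g0',
            g_seq    => "\e\$B",    # ESC 2/4 4/2
            range    => '\x21-\x7E',
        },
    ],
    SubChar => "\x{3013}",  # assumed substitution character (GETA MARK)
);

1;
```

Neither CCS sets ls or ss, so each one is invoked as soon as it is designated. Assuming Define() registers the name with Encode, the new encoding would then be available through the usual Encode::encode() and Encode::decode() functions once the package is loaded.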
CAVEATS
This module implements only a small subset of the features defined by ISO/IEC 2022. Among other limitations, each encoding recognizes only a few predefined designation and invocation functions, only a limited number of coded character sets can be handled, and variable-length multibyte coded character sets are not supported.
SEE ALSO
ISO/IEC 2022 Information technology - Character code structure and extension techniques.
AUTHOR
Hatuka*nezumi - IKEDA Soji, <nezumi@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2013 by Hatuka*nezumi - IKEDA Soji
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.