NAME
Unicode::Precis::Preparation - RFC 8264 PRECIS Framework - Preparation
SYNOPSIS
use Unicode::Precis::Preparation qw(prepare IdentifierClass);
$result = prepare($string, IdentifierClass);
%result = prepare($string, IdentifierClass);
DESCRIPTION
Unicode::Precis::Preparation prepares a Unicode string or UTF-8 bytestring according to the PRECIS framework.
Note that the word "UTF-8" in this document is used in its proper meaning.
Function
- prepare ( $string, [ $stringclass ], [ UnicodeVersion => $version ] )
-
Checks whether a string conforms to the specified string class.
Parameters:
- $string
-
The string to be checked, either a Unicode string or a bytestring.
Note that a bytestring won't be upgraded to a Unicode string but will be treated as a UTF-8 sequence.
- $stringclass
-
One of the constants ValidUTF8 (default), IdentifierClass (see RFC 8264) or FreeFormClass (ditto).
- UnicodeVersion => $version
-
If a version of Unicode is given, the repertoire is restricted according to it. By default, the repertoire of the Unicode version supported by the Perl running this module is available.
Returns:
In scalar context: A true value if the string conforms to the specified string class, otherwise a false value.
In list context: A list of pairs describing the result in detail, with these keys:
result
-
One of property values described in "Constants".
offset
-
If the check fails, the offset from the beginning of the string. If it succeeds, the length of the string.
The offset or length is counted in bytes for a bytestring and in characters for a Unicode string.
length
-
When the check fails, the length of the disallowed character. The length is 1 to 4 for a bytestring, always 1 for a Unicode string, and undefined for an invalid sequence.
ord
-
Unicode scalar value of the character, when the length item is set.
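The two calling contexts described above can be sketched as follows. This is only an illustration: describe_check() is a hypothetical helper, not part of the module, and the sample strings are made up for the example.

```perl
use strict;
use warnings;
use Unicode::Precis::Preparation qw(prepare IdentifierClass);

# Hypothetical helper (not part of the module): summarize a check in one line.
sub describe_check {
    my ($string) = @_;

    # Scalar context: true if the string conforms to IdentifierClass.
    return 'conforms' if prepare($string, IdentifierClass);

    # List context: the returned pairs load directly into a hash.
    my %result = prepare($string, IdentifierClass);

    # "length" (and therefore "ord") is undefined for an invalid sequence.
    return sprintf('invalid sequence at offset %d', $result{offset})
        unless defined $result{length};

    return sprintf('U+%04X at offset %d is not allowed',
        $result{ord}, $result{offset});
}

print describe_check('username'), "\n";
print describe_check("user\x{2603}name"), "\n";    # contains U+2603 SNOWMAN
```

Calling prepare() a second time in list context keeps the common case (a conforming string) cheap to read, at the cost of checking a failing string twice.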
Constants
- FreeFormClass
- IdentifierClass
- ValidUTF8
-
String classes. ValidUTF8 is an extension by this module.
- UNASSIGNED
- PVALID
- CONTEXTJ
- CONTEXTO
- DISALLOWED
-
Property values representing results. PVALID means a successful result.
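As a minimal sketch of the ValidUTF8 string class, the following checks two bytestrings for well-formedness. The sample byte sequences are illustrative inputs chosen for this example.

```perl
use strict;
use warnings;
use Unicode::Precis::Preparation qw(prepare ValidUTF8);

my $good = "\xE3\x81\x82";    # UTF-8 encoding of U+3042
my $bad  = "\xE3\x81";        # the same sequence with its last byte cut off

for my $bytes ($good, $bad) {
    # Bytestrings are treated as UTF-8 sequences, not upgraded.
    my $ok = prepare($bytes, ValidUTF8);
    printf "%d byte(s): %s\n", length($bytes),
        $ok ? 'well-formed' : 'malformed';
}
```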
Exports
None are exported by default. prepare() and the constants may be exported with the :all tag.
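A short sketch of importing everything at once with the :all tag; the input string is an arbitrary example.

```perl
use strict;
use warnings;
use Unicode::Precis::Preparation qw(:all);

# With :all, prepare() and the constants are available unqualified.
my $accepted = prepare('hello', FreeFormClass);
print $accepted ? "accepted\n" : "rejected\n";
```

Importing only the names actually used, as in the SYNOPSIS, keeps the caller's namespace smaller; :all is mainly a convenience.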
RESTRICTIONS
prepare() cannot check Unicode strings on EBCDIC platforms.
Unicode versions
- String classes
-
Derived properties are based on Unicode 6.3.0 or later. Some characters have property values incompatible with Unicode versions prior to 6.0.0 (see also RFC 6452). Property values of characters added by Unicode versions after 6.3.0 may change in the future.
- Contextual rules
-
Character properties checked by contextual rules are based on the Unicode version that a recent version of Perl supports. Some characters have property values incompatible with Unicode 6.3.0.
SEE ALSO
RFC 8264 PRECIS Framework: Preparation, Enforcement, and Comparison of Internationalized Strings in Application Protocols. https://tools.ietf.org/html/rfc8264.
AUTHOR
Hatuka*nezumi - IKEDA Soji, <hatuka@nezumi.nu>
COPYRIGHT AND LICENSE
Copyright (C) 2015, 2018 by Hatuka*nezumi - IKEDA Soji
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. For more details, see the full text of the licenses at <http://dev.perl.org/licenses/>.
This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.