
NAME

Pcore::Util::Data

SYNOPSIS

DESCRIPTION

JSON SERIALIZE

ascii(1):
- qq[\xA3] -> \u00A3, upgraded and encoded to UTF-8 character;
- qq[£]    -> \u00A3, UTF-8 character;
- qq[ᾥ]    -> \u1FA5, UTF-8 character;

latin1(1):
- qq[\xA3] -> qq[\xA3], encoded as bytes;
- qq[£]    -> qq[\xA3], downgraded and encoded as bytes;
- qq[ᾥ]    -> \u1FA5, downgrade impossible, encoded as UTF-8 character;

utf8 - used only when ascii(0) and latin1(0);
utf8(0) - upgrade scalar, UTF8 on, DO NOT USE, SERIALIZED DATA SHOULD ALWAYS BE WITHOUT UTF8 FLAG;
- qq[\xA3] -> "£" (UTF8, multi-byte, len = 1, bytes::len = 2);
- qq[£]    -> "£" (UTF8, multi-byte, len = 1, bytes::len = 2);
- qq[ᾥ]    -> "ᾥ" (UTF8, multi-byte, len = 1, bytes::len = 3);

utf8(1) - upgrade, encode scalar, UTF8 off;
- qq[\xA3] -> "\xC2\xA3" (latin1, bytes::len = 2);
- qq[£]    -> "\xC2\xA3" (latin1, bytes::len = 2);
- qq[ᾥ]    -> "\xE1\xBE\xA5" (latin1, bytes::len = 3);

So,
- don't use latin1(1);
- don't use utf8(0);
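
The recommendation above can be checked with a short sketch. It assumes the core JSON::PP module, which exposes the same ascii / latin1 / utf8 flags; the backend actually used by this module may differ, and the variable names are illustrative only.

```perl
use strict;
use warnings;
use JSON::PP ();

# Character strings holding U+00A3 (POUND SIGN) and U+1FA5.
my $data = [ "\x{A3}", "\x{1FA5}" ];

# ascii(1): every non-ASCII character is written as a \uXXXX escape.
my $ascii = JSON::PP->new->ascii(1)->encode($data);

# utf8(1): the result is a UTF-8 encoded byte string.
my $bytes = JSON::PP->new->utf8(1)->encode($data);

printf "ascii(1): %s\n", $ascii;
printf "utf8(1):  %d bytes\n", length $bytes;
```

Both results contain only bytes, so they can be written to a file or socket without an additional encoding step.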

JSON DESERIALIZE

utf8(0):
- qq[\xA3]     -> "£", upgrade;
- qq[£]        -> "£", as is;
- qq[\xC2\xA3] -> "Â£" (2 characters), each byte upgraded separately, invalid;
- qq[ᾥ]        -> error;

utf8(1):
- qq[\xA3]     -> error, can't decode utf8;
- qq[£]        -> error, can't decode utf8;
- qq[\xC2\xA3] -> "£", decode utf8;
- qq[ᾥ]        -> error, can't decode utf8;

So,
- if data was encoded with utf8(0) - use utf8(0) to decode;
- if data was encoded with utf8(1) - use utf8(1) to decode;
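
A minimal round-trip sketch of this rule, again assuming the core JSON::PP module as a stand-in for whichever JSON backend is in use. It also shows what happens when utf8(1) output is decoded with utf8(0).

```perl
use strict;
use warnings;
use JSON::PP ();

my $json = JSON::PP->new->utf8(1);

my $text  = "\x{A3}";                    # one character, U+00A3
my $bytes = $json->encode( [$text] );    # UTF-8 bytes: ["\xC2\xA3"]

# Matching flags: the original character comes back.
my $back = $json->decode($bytes)->[0];

# Mismatched flags: each byte is treated as a separate character.
my $wrong = JSON::PP->new->utf8(0)->decode($bytes)->[0];

printf "utf8(1) -> utf8(1): length = %d\n", length $back;
printf "utf8(1) -> utf8(0): length = %d\n", length $wrong;
```

With matching flags the decoded string has one character; with mismatched flags it has two (U+00C2 and U+00A3).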