NAME
JSON::XS::ByteString - A thin wrapper around the fast JSON::XS that makes every JSON field a string and every Perl field bytes (UTF-8 octets)
SYNOPSIS
use JSON::XS::ByteString qw(encode_json encode_json_unsafe decode_json decode_json_safe encode_utf8 decode_utf8);
$json_string = encode_json($perl_data);
$json_string = encode_json_unsafe($perl_data);
$perl_data = decode_json($json_string);
$perl_data = decode_json_safe($json_string);
# Low-level tools. I don't use them directly,
# but if your situation is not exactly the same as mine,
# you can call them directly to fit your own needs.
encode_utf8($perl_data);
# No return value. Downgrades each string field into UTF-8 encoded octets.
decode_utf8($perl_data);
# No return value. Upgrades each string and numeric field into multibyte chars.
DESCRIPTION
This module is a wrapper around JSON::XS that makes life easier when dealing with UTF-8 byte strings.
The added overhead is very low; you can benchmark it yourself ^^
The module tries to achieve that with the 3 approaches below:
Convert all the non-ref, non-undef values into strings before building the JSON string from Perl data
By Perl's nature, it's hard to tell whether an output value will be a string or a number. This nondeterminism makes life harder if the receiver is written in another language that strictly cares whether a value is a string or a number.
Convert all the UTF-8 encoded octets into multibyte-char strings before encoding to a JSON string. Any malformed octets are converted into question marks (?). If you use the _unsafe version, they are left that way; otherwise the question marks are restored to the original malformed octets afterwards.
If your situation is like mine, where UTF-8 encoded octets are used everywhere, it's cumbersome and slow to recursively upgrade all the string values into multibyte chars before JSON::XS::encode_json.
Convert all the multibyte-char strings into UTF-8 encoded octets after decoding a JSON string to Perl data.
If your situation is like mine, where UTF-8 encoded octets are used everywhere, it's cumbersome and slow to recursively downgrade all the string values back to UTF-8 encoded octets after JSON::XS::decode_json.
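To make the three conversions concrete, here is a minimal pure-Perl sketch of the recursive walk that would otherwise be needed around JSON::XS. The helper name walk_leaves is hypothetical and is not part of this module; the module itself does this work in XS and also handles malformed octets, which this sketch does not.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of JSON::XS::ByteString): call $code on every
# defined, non-reference leaf of a nested structure, aliased in place.
sub walk_leaves {
    my $code = shift;
    if (ref($_[0]) eq 'ARRAY') {
        walk_leaves($code, $_) for @{ $_[0] };
    }
    elsif (ref($_[0]) eq 'HASH') {
        walk_leaves($code, $_) for values %{ $_[0] };
    }
    elsif (defined $_[0] && !ref $_[0]) {
        $code->($_[0]);
    }
}

# String leaves hold UTF-8 encoded octets; count is a plain number.
my $data = { name => "caf\xC3\xA9", tags => ["\xE2\x82\xAC"], count => 3 };

# Approaches 1 and 2: stringify each leaf, then decode octets into characters.
walk_leaves(sub { $_[0] = "$_[0]"; utf8::decode($_[0]) }, $data);
my $chars = length $data->{name};     # length counted in characters

# Approach 3 (the reverse direction): encode characters back into octets.
walk_leaves(sub { utf8::encode($_[0]) }, $data);
my $octets = length $data->{name};    # length counted in bytes
```

Doing this walk in Perl for every encode and decode call is the cost the module is meant to avoid.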
DESIGN CONSIDERATION
I didn't convert the numeric values from decode_json back to string values
In the pure Perl world, there's no significant difference between a number and a string, so I think we don't need to do it since the result will be used in Perl.
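A small illustration of that point, using the core JSON::PP module purely as a stand-in decoder (an assumption made so the snippet runs without this module installed):

```perl
use strict;
use warnings;
use JSON::PP ();   # core module, used here only as a stand-in decoder

my $d = JSON::PP::decode_json('{"n": 1, "s": "1"}');

# One value came from a JSON number and the other from a JSON string,
# yet Perl code comparing them sees no difference either way.
my $same_as_number = ($d->{n} == $d->{s}) ? 1 : 0;
my $same_as_string = ($d->{n} eq $d->{s}) ? 1 : 0;
```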
FUNCTIONS
$json_string = encode_json($perl_data)
Get a JSON string from a Perl data structure.
Before calling JSON::XS::encode_json, this function converts (modifying the input data directly)
each non-string, non-arrayref, non-hashref scalar into a multibyte-char string value
each byte string (UTF-8 octets) into a multibyte-char string value. Any malformed octets are converted to question marks (?).
After JSON::XS::encode_json returns, the function converts
each multibyte-char string back to bytes (UTF-8 octets)
each question mark that replaced a malformed octet back to the original octet
$json_string = encode_json_unsafe($perl_value)
Same as encode_json, except that it skips the last step after JSON::XS::encode_json. The argument will be upgraded to multibyte chars and never converted back.
This function is a little faster than encode_json. Use it if you're sure that you won't use the argument after the JSON call.
$perl_data = decode_json($json_string)
Get the Perl data structure back from a JSON string.
After the call to JSON::XS::decode_json, the function converts each multibyte-char string field into bytes (UTF-8 octets).
Note that only the string values are converted; the numeric ones are not.
$perl_data = decode_json_safe($json_string)
The same as decode_json, but wrapped in an eval block with the $SIG{__DIE__} handler suppressed. It returns undef when decoding fails.
This function is only for convenience.
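The pattern described above can be sketched as follows. This is not the module's own code: decode_or_undef is a hypothetical name, and the core JSON::PP module stands in for the real decoder so the snippet runs on its own.

```perl
use strict;
use warnings;
use JSON::PP ();   # core module, used here only as a stand-in decoder

# Hypothetical helper: decode, returning undef instead of dying.
sub decode_or_undef {
    my ($json_string) = @_;
    local $SIG{__DIE__} = 'DEFAULT';   # keep any global die handler from firing
    local $@;                          # do not clobber the caller's $@
    my $data = eval { JSON::PP::decode_json($json_string) };
    return $data;                      # undef when the eval block died
}

my $ok  = decode_or_undef('{"a":"1"}');   # hash reference
my $bad = decode_or_undef('{"a":');       # truncated JSON, so undef
```

Since undef is the failure signal, callers should test the result with defined.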
encode_utf8($perl_data)
Downgrade each string field of $perl_data to UTF-8 encoded octets.
decode_utf8($perl_data)
Upgrade each string or numeric field of $perl_data to multibyte chars.
Any malformed UTF-8 octets are converted to question marks (?).
CAVEATS
The input argument of encode_json / encode_json_unsafe will be changed
encode_json_unsafe upgrades all the string or numeric scalars into multibyte-char strings and never converts them back.
encode_json does try to convert them back to UTF-8 encoded octets, but it doesn't remember whether any of them was originally numeric or already multibyte chars. They'll all be converted back to UTF-8 encoded octets.
Malformed octets in hash keys are not handled
Malformed octets in hash keys are left as is, and JSON::XS::encode_json will then complain about them.
SEE ALSO
This module's GitHub repository: https://github.com/CindyLinz/Perl-JSON-XS-ByteString
AUTHOR
Cindy Wang (CindyLinz)
COPYRIGHT AND LICENSE
Copyright (C) 2014 by Cindy Wang (CindyLinz)
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8 or, at your option, any later version of Perl 5 you may have available.