NAME
utf8::all - turn on Unicode - all of it
VERSION
version 0.017
SYNOPSIS
use utf8::all; # Turn on UTF-8, all of it.
open my $in, '<', 'contains-utf8'; # UTF-8 already turned on here
print length 'føø bār'; # 7 UTF-8 characters
my $utf8_arg = shift @ARGV; # @ARGV is UTF-8 too (only for main)
DESCRIPTION
The use utf8
pragma tells the Perl parser to allow UTF-8 in the program text in the current lexical scope. This also means that you can now use literal Unicode characters as part of strings, variable names, and regular expressions.
utf8::all
goes further:
charnames
are imported so\N{...}
sequences can be used to compile Unicode characters based on names.On Perl
v5.11.0
or higher, theuse feature 'unicode_strings'
is enabled.use feature fc
anduse feature unicode_eval
are enabled on Perl5.16.0
and higher.Filehandles are opened with UTF-8 encoding turned on by default (including STDIN, STDOUT, STDERR). Meaning that they automatically convert UTF-8 octets to characters and vice versa. If you don't want UTF-8 for a particular filehandle, you'll have to set
binmode $filehandle
.@ARGV
gets converted from UTF-8 octets to Unicode characters (whenutf8::all
is used from the main package). This is similar to the behaviour of the-CA
perl command-line switch (see perlrun).readdir
,readlink
,readpipe
(including theqx//
and backtick operators), andglob
(including the<>
operator) now all work with and return Unicode characters instead of (UTF-8) octets.
Lexical scope
The pragma is lexically-scoped, so you can do the following if you had some reason to:
{
use utf8::all;
open my $out, '>', 'outfile';
my $utf8_str = 'føø bār';
print length $utf8_str, "\n"; # 7
print $out $utf8_str; # out as utf8
}
open my $in, '<', 'outfile'; # in as raw
my $text = do { local $/; <$in>};
print length $text, "\n"; # 10, not 7!
Instead of lexical scoping, you can also use no utf8::all
to turn off the effects.
Note that the effect on @ARGV
and the STDIN
, STDOUT
, and STDERR
file handles is always global!
SEE ALSO
File::Find::utf8 for fully utf-8 aware File::Find functions.
Cwd::utf8 for fully utf-8 aware Cwd functions.
INTERACTION WITH AUTODIE
If you use autodie, which is a great idea, you need to use at least version 2.12, released on June 26, 2012. Otherwise, autodie obliterates the IO layers set by the open pragma. See RT #54777 and GH #7.
AVAILABILITY
The project homepage is http://metacpan.org/release/utf8-all/.
The latest version of this module is available from the Comprehensive Perl Archive Network (CPAN). Visit http://www.perl.com/CPAN/ to find a CPAN site near you, or see https://metacpan.org/module/utf8::all/.
SOURCE
The development version is on github at http://github.com/doherty/utf8-all and may be cloned from git://github.com/doherty/utf8-all.git
BUGS AND LIMITATIONS
You can make new bug reports, and view existing ones, through the web interface at https://github.com/doherty/utf8-all/issues.
COMPATIBILITY
The filesystems of Dos, Windows, and OS/2 do not (fully) support UTF-8. The readlink
and readdir
functions and glob
operators will therefore not be replaced on these systems.
AUTHORS
Michael Schwern <mschwern@cpan.org>
Mike Doherty <doherty@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2009 by Michael Schwern <mschwern@cpan.org>.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.