NAME
Win32::Codepage - get Win32 codepage information
LICENSE
Copyright 2005 Clotho Advanced Media, Inc., <cpan@clotho.com>
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
SYNOPSIS
use Win32::Codepage;
print "Current language: " . Win32::Codepage::get_codepage() . "\n"; # e.g. "en-us"
print "Install language: " . Win32::Codepage::get_install_codepage() . "\n";
use Encode qw(encode);
my $w32encoding = Win32::Codepage::get_encoding(); # e.g. "cp1252"
my $encoding = $w32encoding ? Encode::resolve_alias($w32encoding) : '';
print $encoding ? encode($encoding, $string) : $string; # encode() takes the encoding name first
DESCRIPTION
This module is intended as a companion to Win32::Locale. That module offers information about the user's language and locale preferences. However, Windows has a separate setting for how file contents and filenames are encoded by default, which is specified by the "codepage" (a legacy term from the DOS days). A computer can have its language, date, currency, etc. set to English while its file contents and filesystem names default to Shift-JIS (Japanese) encoding.
This module offers information about that codepage, which allows your Perl code to know what encoding to expect for file names and file contents.
On Windows XP, you can change the current codepage from the default via Control Panel > Regional and Language Settings > Advanced tab. If you change it to, say, Japanese and then reboot, the default codepage will be cp932, which is Microsoft's version of Shift-JIS. This allows non-Unicode Windows applications (like ActiveState Perl) to read filenames that contain Japanese characters. If you have files named with Japanese characters but your codepage is set to cp1252 (Microsoft's version of ISO-8859-1, also known as Latin-1), then the Japanese characters in the filename appear as "?" to Perl.
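As a rough illustration of how the codepage encoding can be applied to filenames, the sketch below decodes directory entries into Perl character strings. The helper name decode_native is hypothetical and not part of Win32::Codepage; the sketch also assumes that readdir returns bytes in the current codepage, which is the situation described above for non-Unicode applications.

```perl
use strict;
use warnings;
use Encode qw(decode find_encoding);

# Hypothetical helper (not part of Win32::Codepage): convert a byte string
# from the filesystem into Perl characters. The bytes are returned unchanged
# when no usable encoding name is supplied.
sub decode_native {
    my ($bytes, $encoding_name) = @_;
    return $bytes unless $encoding_name && find_encoding($encoding_name);
    return decode($encoding_name, $bytes);
}

# Win32::Codepage exists only on Windows, so load it defensively.
my $encoding = eval { require Win32::Codepage; Win32::Codepage::get_encoding() } || '';

# Print decoded names as UTF-8 so wide characters do not trigger warnings.
binmode STDOUT, ':encoding(UTF-8)';

opendir(my $dh, '.') or die "Cannot open current directory: $!";
for my $name (sort readdir $dh) {
    print decode_native($name, $encoding), "\n";
}
closedir $dh;
```

When no encoding can be determined, the names pass through untouched, so the same script can run on other platforms.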
If there's a better way around this than messing with codepages, PLEASE LET ME KNOW! I hate that I ever had to write this module...
SEE ALSO
Win32::Locale
I tried to contact the author of that module to ask him to extend his distribution to include the codepage functionality, but I received no response for seven months. So, I created this module. See the RT ticket: http://rt.cpan.org/Ticket/Display.html?id=11739
FUNCTIONS
- get_codepage
Returns the language name for the current codepage language, for example en-us or ja. Returns false if the codepage language cannot be identified.
If this function is passed an argument (not recommended), then it returns the language name for the specified language ID instead of the system language ID.
- get_install_codepage
Returns the language name for the installed codepage language. This is the same as get_codepage(), but refers to the codepage that was the default when Windows was first installed.
- get_encoding
Returns an encoding name usable with Encode.pm based on the current codepage. For example, cp1252 (Microsoft's version of iso-8859-1, latin-1) or cp932 for Shift-JIS Japanese. Returns false if an encoding cannot be identified.
Note: this only returns encoding names that start with cp.
- get_ms_codepage
Returns the numeric language ID for the current codepage language. For example, 0x0409 for en-us or 0x0411 for ja. Returns false if the codepage cannot be identified.
- get_ms_install_codepage
Returns the numeric language ID for the installed codepage language. This is the same as get_ms_codepage(), but refers to the codepage that was the default when Windows was first installed.
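The functions above can be exercised together in a small report. This is only a sketch: the helper report_line and the labels are hypothetical, and each function is called without arguments, as recommended. Because every function returns false when a value cannot be identified, the helper substitutes a placeholder in that case.

```perl
use strict;
use warnings;

# Hypothetical helper: format one report line, substituting a placeholder
# when the lookup returned a false value.
sub report_line {
    my ($label, $value) = @_;
    return sprintf "%-22s %s", "$label:", $value ? $value : '(unknown)';
}

# Load the module defensively so the script also runs where it is absent.
my $have_module = eval { require Win32::Codepage; 1 };

my @lookups = (
    [ 'Current language'    => 'get_codepage' ],
    [ 'Install language'    => 'get_install_codepage' ],
    [ 'Encoding'            => 'get_encoding' ],
    [ 'Current language ID' => 'get_ms_codepage' ],
    [ 'Install language ID' => 'get_ms_install_codepage' ],
);

for my $lookup (@lookups) {
    my ($label, $function) = @$lookup;
    # Fetch a code reference so the function is called with no arguments.
    my $code  = $have_module ? Win32::Codepage->can($function) : undef;
    my $value = $code ? $code->() : undef;
    print report_line($label, $value), "\n";
}
```

On a system without Win32::Codepage, every line shows the placeholder instead of failing.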
AUTHOR
Clotho Advanced Media, Inc. cpan@clotho.com
Primary developer: Chris Dolan