NAME

Unicode::Casing - Perl extension to override system case changing functions

SYNOPSIS

use Unicode::Casing
          uc => \&my_uc, lc => \&my_lc,
          ucfirst => \&my_ucfirst, lcfirst => \&my_lcfirst;
no Unicode::Casing;

DESCRIPTION

This module allows overriding the system-defined character case changing functions. Any time something in its lexical scope would ordinarily call lc(), lcfirst(), uc(), or ucfirst(), the corresponding user-specified function will be called instead. This applies both to direct calls and to indirect calls via the \L, \l, \U, and \u escapes in double-quoted strings and regular expressions.

Each function is passed a string to change the case of, and should return the case-changed version of that string. Using, for example, \U inside the override function for uc() will lead to infinite recursion, but the standard casing functions are available via CORE::. For example,

sub my_uc {
   my $string = shift;
   print "Debugging information\n";
   return CORE::uc($string);
}
use Unicode::Casing uc => \&my_uc;
uc($foo);

gives the standard upper-casing behavior, but prints "Debugging information" first.

It is an error not to specify at least one override in the "use" statement. Any function without an override keeps its standard behavior. It is also an error to specify more than one override for the same function.

The pragma use re 'eval' is not needed for the inline case-changing sequences to work in regular expressions.

Here's an example of a real-life application, for Turkish, that shows context-sensitive case-changing.

sub turkish_lc($) {
   my $string = shift;

   # Unless an I is before a dot_above, it turns into a dotless i (the
   # dot above being attached to the I, without an intervening other
   # Above mark; an intervening non-mark (ccc=0) would mean that the
   # dot above would be attached to that character and not the I)
   $string =~ s/I (?! [^\p{ccc=0}\p{ccc=Above}]* \x{0307} )/\x{131}/gx;

   # But when the I is followed by a dot_above, remove the dot_above so
   # the end result will be i.
   $string =~ s/I ([^\p{ccc=0}\p{ccc=Above}]* ) \x{0307}/i$1/gx;

   $string =~ s/\x{130}/i/g;

   return CORE::lc($string);
}

A potential problem with context-dependent case changing is that the routine may be passed insufficient context, especially with the inline escapes like \L.
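The context problem can be illustrated without the module by calling the routine directly on fragments of a string. This is only a sketch: the split point is hypothetical and is not a claim about how any particular escape divides its input. The function body repeats turkish_lc() from above (minus the prototype) so that the snippet runs on its own.

```perl
use strict;
use warnings;

# Same logic as the turkish_lc() shown above, repeated here so the
# snippet is self-contained.
sub turkish_lc {
    my $string = shift;
    $string =~ s/I (?! [^\p{ccc=0}\p{ccc=Above}]* \x{0307} )/\x{131}/gx;
    $string =~ s/I ([^\p{ccc=0}\p{ccc=Above}]* ) \x{0307}/i$1/gx;
    $string =~ s/\x{130}/i/g;
    return CORE::lc($string);
}

# Whole string in one call: the routine sees the combining dot above
# and produces a plain "i".
my $whole = turkish_lc("I\x{307}");

# Hypothetical split: the routine sees only the "I", so it produces a
# dotless i, and the dot above is appended afterwards.
my $split = turkish_lc("I") . "\x{307}";

# Print the code points of each result in hexadecimal.
printf "whole: %vX\n", $whole;
printf "split: %vX\n", $split;
```

The two results differ even though the input characters are the same, which is why an override that depends on neighboring characters should be tested with the exact strings it will receive.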

turkish.t, which comes with the distribution, includes a full implementation of all the Turkish casing rules.
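For the opposite direction, a minimal sketch of an uppercase override might look like the following. The name turkish_uc is hypothetical, and this is not the implementation found in turkish.t. It handles only the mapping of i to the dotted capital U+0130; the dotless U+0131 already uppercases to I under the standard rules, so CORE::uc() takes care of it.

```perl
use strict;
use warnings;

# Hypothetical uppercase override for Turkish (a sketch, not the
# version shipped in turkish.t).
sub turkish_uc {
    my $string = shift;

    # A dotted lowercase i becomes the dotted capital I (U+0130).
    $string =~ s/i/\x{130}/g;

    # Everything else, including dotless i (U+0131) => I, follows the
    # standard rules. CORE::uc avoids recursing into the override.
    return CORE::uc($string);
}

# Print the code points of a sample result in hexadecimal.
printf "%vX\n", turkish_uc("i\x{131}");
```

Following the form in the SYNOPSIS, it would be installed with use Unicode::Casing uc => \&turkish_uc; in the scope where Turkish casing is wanted.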

AUTHOR

Karl Williamson, <khw@cpan.org>

COPYRIGHT AND LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.10.1 or, at your option, any later version of Perl 5 you may have available.
