NAME
Encode::Arabic - Perl extension for encodings of Arabic
REVISION
$Revision: 1.14 $ $Date: 2005/10/02 16:08:04 $
SYNOPSIS
use Encode::Arabic; # imports just like 'use Encode' even with options would
while ($line = <>) { # renders the ArabTeX notation for Arabic both in the ..
print encode 'utf8', decode 'arabtex', $line; # .. Arabic script proper and the
print encode 'utf8', decode 'arabtex-zdmg', $line; # .. Latin phonetic transcription
}
# 'use Encode::Arabic ":modes"' would export the functions controlling the conversion modes
Encode::Arabic::demode 'arabtex', 'default';
Encode::Arabic::enmode 'buckwalter', 'full', 'xml', 'strip off kashida';
# Arabic in lower ASCII transliterations <--> Arabic script in Perl's internal encoding
$string = decode 'ArabTeX', $octets;
$octets = encode 'Buckwalter', $string;
$string = decode 'Buckwalter', $octets;
$octets = encode 'ArabTeX', $string;
# Arabic in lower ASCII transliterations <--> Latin phonetic transcription, Perl's utf8
$string = decode 'Buckwalter', $octets;
$octets = encode 'ArabTeX', $string;
$string = decode 'ArabTeX-ZDMG', $octets;
$octets = encode 'utf8', $string;
DESCRIPTION
This module is a wrapper for various implementations of the encoding systems used for the Arabic language, and it also covers some non-Arabic extensions to the Arabic script. The included modules follow the philosophy of Encode::Encoding and can be used directly with the Encode module.
LIST OF ENCODINGS
- ArabTeX
ArabTeX multi-character notation for Arabic / Perl's internal format for the Arabic script
- ArabTeX-RE
Deprecated method using sequential regular-expression substitutions. It covers only part of the ArabTeX notation and processes data inefficiently, but it does not require the Encode::Mapper module.
- ArabTeX-Verbatim
ArabTeX multi-character verbatim notation for Arabic / Perl's internal format for the Arabic script
- ArabTeX-ZDMG
ArabTeX multi-character notation for Arabic / Perl's internal format for the Latin phonetic transcription in the ZDMG style
- ArabTeX-ZDMG-RE
Deprecated method using sequential regular-expression substitutions. It covers only part of the ArabTeX notation and processes data inefficiently, but it does not require the Encode::Mapper module.
- Buckwalter
Buckwalter one-to-one notation for Arabic / Perl's internal format for the Arabic script
There are generic aliases to these provided by Encode. Case does not matter, and all characters of the class [ _-] are interchangeable.
Note that the standard Encode module already deals with several other single-byte encoding schemes for Arabic that are popular on particular operating systems, be it *n*x, Windows, DOS or Macintosh. See Encode::Supported and Encode::Byte for their identification names and aliases.
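As a hedged illustration of the note above, the following sketch uses only the standard Encode module (not Encode::Arabic). It decodes the same two octets from two single-byte Arabic encodings and looks up one of them through an alias spelling with underscores. The sample octets and all variable names are chosen for illustration only.

```perl
use strict;
use warnings;
use Encode qw(decode encode find_encoding);

# Illustrative input: the octets 0xC7 0xC8 stand for ALEF and BEH
# in both Windows-1256 (cp1256) and ISO-8859-6.
my $octets = "\xC7\xC8";

# Decode into Perl's internal format from each single-byte encoding.
my $from_cp1256 = decode('cp1256',     $octets);
my $from_iso    = decode('iso-8859-6', $octets);

# Re-encode as UTF-8 octets, e.g. for output.
my $utf8 = encode('UTF-8', $from_cp1256);

# Encode's generic aliases: case and the choice of '_' or '-'
# are not expected to matter when looking up an encoding.
my $canonical = find_encoding('iso-8859-6');
my $aliased   = find_encoding('ISO_8859_6');

printf "code points: %s\n",
    join ' ', map { sprintf 'U+%04X', ord($_) } split //, $from_cp1256;
printf "same text from both encodings: %s\n",
    $from_iso eq $from_cp1256 ? 'yes' : 'no';
printf "alias resolves to: %s\n",
    $aliased ? $aliased->name : '(not found)';
```

The two encodings agree on these particular octets, but they differ elsewhere, so the source encoding still has to be known before decoding real data.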
EXPORTS & MODES
The module exports as if use Encode also appeared in the calling package. The import options are simply delegated to Encode and the imports are performed properly, with the exception of the :modes option when it comes first in the list. In such a case, the following functions will be introduced into the namespace of the importing package:
- enmode ($enc, @list)
Calls the enmode method associated with the given $enc encoding, and passes the @list to it. The idea is similar to the encode functions and methods of the Encode and Encode::Encoding modules, respectively. Used for control over the modes of conversion.
- demode ($enc, @list)
Analogous to enmode, but calling the appropriate demode method. See the individual implementations of the listed encodings.
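A minimal sketch of the :modes import described above. The mode arguments are copied from the SYNOPSIS; their exact effect is defined by the individual encoding implementations and is not asserted here. The run-time loading through eval is an assumption made only so that the sketch also runs where Encode::Arabic is not installed; ordinary code would simply write use Encode::Arabic ':modes'; at the top.

```perl
use strict;
use warnings;

# Load at run time so that a missing module is reported instead of
# aborting compilation. Usual form: use Encode::Arabic ':modes';
my $loaded = eval {
    require Encode::Arabic;
    Encode::Arabic->import(':modes');    # ':modes' comes first in the list
    1;
};

if ($loaded) {
    # enmode and demode are now available in this package,
    # so the Encode::Arabic:: prefix from the SYNOPSIS is not needed.
    enmode('buckwalter', 'full', 'xml', 'strip off kashida');
    demode('arabtex', 'default');
    print "conversion modes set for 'buckwalter' and 'arabtex'\n";
}
else {
    print "Encode::Arabic is not installed; nothing to demonstrate\n";
}
```

The parentheses on enmode and demode are needed here only because the functions are imported at run time; with a compile-time use statement they can be called in the list-operator style shown in the SYNOPSIS.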
SEE ALSO
Encode::Arabic Online Interface http://ckl.mff.cuni.cz/smrz/Encode/Arabic/
Klaus Lagally's ArabTeX ftp://ftp.informatik.uni-stuttgart.de/pub/arabtex/arabtex.htm
Tim Buckwalter's Qamus http://www.qamus.org/
Arabeyes Arabic Unix Project http://www.arabeyes.org/
Lecture Notes on Arabic NLP http://ckl.mff.cuni.cz/smrz/ANLP/anlp-lecture-notes.pdf
Encode, Encode::Encoding, Encode::Mapper, Encode::Supported, Encode::Byte
Locale::Recode, Locale::RecodeData
MARC::Charset, MARC::Charset::ArabicBasic, MARC::Charset::ArabicExtended
AUTHOR
Otakar Smrz, http://ckl.mff.cuni.cz/smrz/
eval { 'E<lt>' . 'smrz' . "\x40" . ( join '.', qw 'ckl mff cuni cz' ) . 'E<gt>' }
Perl is also designed to make the easy jobs not that easy ;)
COPYRIGHT AND LICENSE
Copyright 2003-2005 by Otakar Smrz
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.