NAME
Lingua::LO::NLP::Romanize - Romanize Lao syllables
FUNCTION
This is a factory class for Lingua::LO::NLP::Romanize::*. The following romanization modules are currently available:
- Lingua::LO::NLP::Romanize::PCGN for the standard set by the Permanent Committee on Geographical Names for British Official Use
- Lingua::LO::NLP::Romanize::IPA for the International Phonetic Alphabet
SYNOPSIS
    use Lingua::LO::NLP::Romanize;

    my $o = Lingua::LO::NLP::Romanize->new(
        variant => 'PCGN',
        hyphen  => 1,
    );
METHODS
new
The constructor takes any number of hash-style named arguments. The following ones are always recognized:
variant
Standard according to which to romanize; this determines which Lingua::LO::NLP::Romanize subclass is actually instantiated.
hyphen
Separate the syllables in a run of Lao text with hyphens. Set this to the character you would like to use as a hyphen. Usually this will be the ASCII "hyphen minus" (U+002D), but it can be the unambiguous Unicode hyphen ("‐", U+2010), a slash or anything you like. As a special case, you can pass a 1 to use the ASCII version. If this argument is missing or undef, blanks are used. Syllables duplicated using "ໆ" are always joined with a hyphen: either the one you specify or the ASCII one.
normalize
Run text through tone mark order normalization; see "normalize_tone_marks" in Lingua::LO::NLP::Data. If your text looks fine but syllables are not recognized, you may need this.
Subclasses may specify additional arguments, such as IPA's tone, which controls the rendering of IPA diacritics for tonal languages.
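The documented meaning of the hyphen argument can be illustrated with a small sketch. The helper resolve_separators is hypothetical and not part of Lingua::LO::NLP; it only mirrors the rules described above (undef means blanks, 1 means the ASCII hyphen, and repeated syllables always get a hyphen) and makes no claim about how the module implements them.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Lingua::LO::NLP. It returns two strings:
# the separator placed between ordinary syllables, and the separator used
# for syllables duplicated with the repetition mark.
sub resolve_separators {
    my ($hyphen) = @_;

    # Missing or undef: blanks between syllables, ASCII hyphen for repeats.
    return (' ', '-') unless defined $hyphen;

    # Special case: 1 selects the ASCII hyphen-minus (U+002D).
    $hyphen = '-' if $hyphen eq '1';

    return ($hyphen, $hyphen);
}

# Print code points rather than raw characters to avoid encoding concerns.
for my $arg (undef, 1, "\x{2010}", '/') {
    my ($between, $repeat) = resolve_separators($arg);
    printf "between=U+%04X repeat=U+%04X\n", ord($between), ord($repeat);
}
```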
romanize
romanize( $text )
Return the romanization of $text according to the standard passed to the constructor. Text is split up by "get_fragments" in Lingua::LO::NLP::Syllabify. Lao syllables are processed; everything else is passed through unchanged, except that combining characters may be converted to a canonically equivalent form by "NFC" in Unicode::Normalize.
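A minimal usage sketch for romanize follows. It assumes Lingua::LO::NLP is installed and falls back to a message if it is not. The wrapper romanize_lao and the sample text are illustrative assumptions, not part of the module; only new and romanize come from this documentation.

```perl
use strict;
use warnings;
use utf8;    # the source contains literal Lao characters

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical wrapper: returns the romanized text, or undef when
# Lingua::LO::NLP::Romanize cannot be loaded.
sub romanize_lao {
    my ($text, %args) = @_;
    return undef unless eval { require Lingua::LO::NLP::Romanize; 1 };
    my $o = Lingua::LO::NLP::Romanize->new(%args);
    return $o->romanize($text);
}

# Mixed input: the Lao part is romanized, the rest is passed through.
my $result = romanize_lao('ສະບາຍດີ Laos', variant => 'PCGN', hyphen => 1);
print defined($result)
    ? "$result\n"
    : "Lingua::LO::NLP::Romanize is not installed\n";
```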
romanize_syllable
romanize_syllable( $syllable )
Return the romanization of a single $syllable according to the standard passed to the constructor. This is a virtual method that must be implemented by subclasses.
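The combination of a factory constructor and a virtual method can be sketched in plain Perl. The packages My::Romanize and My::Romanize::Upper are invented for this illustration; they show the general pattern only and are not the module's actual code.

```perl
use strict;
use warnings;

# Hypothetical base class standing in for a romanization factory.
package My::Romanize;

sub new {
    my ($class, %args) = @_;
    my $variant = delete $args{variant}
        or die "'variant' argument missing\n";

    # The factory blesses the object into the subclass named by "variant".
    my $subclass = "${class}::${variant}";
    die "unknown variant '$variant'\n" unless $subclass->isa($class);
    return bless {%args}, $subclass;
}

# "Virtual" method: the base class only reports that it is missing.
sub romanize_syllable {
    my ($self) = @_;
    die ref($self) . " must implement romanize_syllable\n";
}

# Hypothetical subclass with a toy implementation.
package My::Romanize::Upper;
our @ISA = ('My::Romanize');

# Upper-cases its input instead of romanizing Lao.
sub romanize_syllable {
    my ($self, $syllable) = @_;
    return uc $syllable;
}

package main;

my $o = My::Romanize->new(variant => 'Upper');
print ref($o), "\n";                          # My::Romanize::Upper
print $o->romanize_syllable('lao'), "\n";     # LAO
```

A real subclass would map Lao syllables to Latin or IPA text in romanize_syllable.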
hyphen
my $hyphen = $o->hyphen;
$o->hyphen('-');
Accessor for the hyphen attribute; see "new".
normalize
my $normalization = $o->normalize;
$o->normalize( $bool );
Accessor for the normalize attribute; see "new".