NAME
Lingua::EN::ABC - American, British, and Canadian English
SYNOPSIS
use Lingua::EN::ABC ':all';
my $colour = a2b ('color');
print "$colour\n";
produces output
colour
(This example is included as synopsis.pl in the distribution.)
VERSION
This documents Lingua::EN::ABC version 0.11_01 corresponding to git commit 372ffc313d5e506702fa8944b4c77f016b45ae5c released on Sat Nov 13 22:27:15 2021 +0900.
DESCRIPTION
This module offers functions to convert between the spellings and vocabulary of American, British, and Canadian versions of English.
FUNCTIONS
The naming convention for the functions is "a" for American, "b" for British, "c" for Canadian, so "a2b" converts "American to British".
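As a rough illustration of the naming convention, the following sketch calls all six functions through a table keyed by name. It assumes that the ':all' export tag provides every function documented below; the %convert table and the sample text are made up for this example.

```perl
use strict;
use warnings;
use Lingua::EN::ABC ':all';

# Map each documented function name to its code reference.
# %convert is a name invented for this sketch, not part of the module.
my %convert = (
    a2b => \&a2b, b2a => \&b2a,
    a2c => \&a2c, c2a => \&c2a,
    b2c => \&b2c, c2b => \&c2b,
);

my $text = 'the color of the center';
for my $name (sort keys %convert) {
    # The letters on either side of the "2" name the source and
    # target varieties of English.
    my ($from, $to) = split /2/, $name;
    print "$name ($from to $to): ", $convert{$name}->($text), "\n";
}
```

A table like this may be handy when the direction of conversion is only known at run time, for example when it comes from a configuration value.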
a2b
my $british = a2b ('color');
# $british = 'colour'.
Convert American into British spellings. An option "oxford" controls whether to use Oxford spelling ("realize" rather than "realise"):
my $oxford_british = a2b ('realize', oxford => 1);
This does not convert words with different pronunciations or words which are completely different between American and British uses.
This cannot correctly convert ambiguous spellings like "program", which may be either "program" or "programme" in British English. See "BUGS". It tries to convert American formations like "gotten" into "got".
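To see what the "oxford" option changes, one possible check is to run the same input through "a2b" with and without it and compare the results. This is only a sketch; the sample text and variable names are chosen for illustration.

```perl
use strict;
use warnings;
use Lingua::EN::ABC ':all';

# American input containing an -ize word and an -or word.
my $american = 'realize color';

# Default British conversion.
my $default = a2b ($american);
# The same conversion with the documented "oxford" option switched on.
my $oxford = a2b ($american, oxford => 1);

print "default: $default\n";
print "oxford:  $oxford\n";
```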
An option "s", if true, results in a spelling-only conversion:
use utf8;
use Lingua::EN::ABC ':all';
print a2b ('aluminum airplane labor center pajamas'), "\n";
print a2b ('aluminum airplane labor center pajamas', s => 1), "\n";
produces output
aluminium aeroplane labour centre pyjamas
aluminum airplane labour centre pyjamas
(This example is included as alairlab.pl in the distribution.)
In this case, word pairs with differing pronunciations, like "burnt" and "burned", are not interchanged, and ambiguous word pairs, like "check" and "cheque", are also not interchanged.
b2a
my $american = b2a ('the colour of my pyjamas');
# $american = 'the color of my pajamas'
Convert British spellings into American spellings. This cannot convert British formations like "got" into "gotten" due to the grammatical ambiguity ("I've got a car" versus "I've gotten into an accident", or "I got into an accident").
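Because "a2b" tries to turn "gotten" into "got" but "b2a" does not attempt the reverse, a round trip through both functions is not guaranteed to reproduce the original text. The sketch below prints each stage so the difference, if any, can be inspected; the sentence is taken from the example above and the variable names are illustrative.

```perl
use strict;
use warnings;
use Lingua::EN::ABC ':all';

my $american = "I've gotten into an accident";

# American to British, then back again.
my $british = a2b ($american);
my $round_trip = b2a ($british);

print "original:   $american\n";
print "british:    $british\n";
print "round trip: $round_trip\n";
```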
An option "s", if true, results in a spelling-only conversion. See "a2b".
a2c
my $canadian = a2c ('the color');
# $canadian = 'the colour'
Convert American to Canadian spelling. An option "s", if true, results in a spelling-only conversion. See "a2b".
c2a
my $american = c2a ('the colour');
# $american = 'the color'
Convert Canadian to American spelling. An option "s", if true, results in a spelling-only conversion. See "a2b".
b2c
my $canadian = b2c ('the programme');
# $canadian = 'the program'
Convert British to Canadian spelling. An option "s", if true, results in a spelling-only conversion. See "a2b".
c2b
my $british = c2b ($canadian);
Convert Canadian to British spelling. An option "oxford" controls whether to use Oxford spelling ("realize" rather than "realise"):
my $oxford_british = c2b ($canadian, oxford => 1);
An option "s", if true, results in a spelling-only conversion. See "a2b".
DEPENDENCIES
- Carp
  Carp is used to print errors.
- JSON::Parse
  JSON::Parse is used to read in the file of spelling data.
- "make_regex" in Convert::Moji
  This is used to make a regular expression which converts the words from one form to another.
SEE ALSO
- Lingua::EN::ABC::Data
  This is the underlying data for this module, put into POD format so that it's easy to search and check.
- respell
  respell is a tool to convert English text from one spelling system to another. This used to be at http://membled.com/work/apps/respell, but that web site had disappeared as of Sat Nov 13 22:27:15 2021 +0900.
STANDALONE SCRIPT
There is a script called econv in the distribution which runs these functions on its command line. Please use "econv --help" for detailed usage instructions.
DATA FILE
The data file provided with the distribution isn't intended to be human-edited. The master file containing the spelling variations is abc.txt in the top directory of the distribution. The comment at the top of that file contains information about the format. To add to this module's list of words, edit abc.txt and send a pull request on GitHub.
BUGS
- No handling of ambiguous words like "program"
  "Program" is used in British English for computer programs, whereas a theatre programme uses the -mme spelling.
- It only converts lower case
  For example, "a2c" will not convert "The Color Purple" or "The World Trade Center" into "The Colour Purple" or "The World Trade Centre". This is a feature as well as a bug, since proper names like movie titles or place names should not be respelt.
- Word lists are not comprehensive
  Please feel free to contribute. See "DATA FILE" for an easy way to contribute new items.
- There are no tests involving the ambiguity data
  Up to version 0.05 of the module, the data about which words are ambiguous (vice/vise etc.) was not being put into the JSON data file, and yet the module was passing all its tests, so evidently no test covers this data.
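Regarding the lower-case limitation listed above, a caller who does want capitalized words converted (for instance at the start of a sentence) could wrap "a2c" along the following lines. The helper a2c_capitalized is hypothetical and not part of this module, and it will also respell proper names, which the module deliberately avoids.

```perl
use strict;
use warnings;
use Lingua::EN::ABC ':all';

# Hypothetical helper, not part of Lingua::EN::ABC.
sub a2c_capitalized
{
    my ($text) = @_;
    # Convert ordinary lower-case words first.
    $text = a2c ($text);
    # Then convert words with a single leading capital by lower-casing
    # them, converting, and restoring the capital. Words in all capitals
    # or mixed case do not match the pattern and are left alone.
    $text =~ s{\b([A-Z][a-z]+)\b}{ucfirst (a2c (lc $1))}ge;
    return $text;
}

print a2c_capitalized ('Color is a matter of taste'), "\n";
```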
HISTORY
- 0.09 2018-09-26
  Additional word pairs coloured, colouration, mouldy, vapourise, vapourisation. Ambiguous spellings (check/cheque, meter/metre) are no longer converted when using the "s" option. Some pairs incorrectly marked as spelling-only (towards, mum) were restored.
- 0.10
  Plurals ending in s were added.
ACKNOWLEDGEMENTS
A list of words by Wikipedia user Ohconfucius was used in the preparation of the data. Nigel Horne (NJH) and Ed Avis (EDAVIS) contributed some word additions and other suggestions.
AUTHOR
Ben Bullock, <bkb@cpan.org>
COPYRIGHT & LICENCE
This package and associated files are copyright (C) 2013-2021 Ben Bullock.
You can use, copy, modify and redistribute this package and associated files under the Perl Artistic Licence or the GNU General Public Licence.