NAME

Lingua::EN::ABC - American, British, and Canadian English

SYNOPSIS

use Lingua::EN::ABC ':all';
my $colour = a2b ('color');
print "$colour\n";

produces output

colour

(This example is included as synopsis.pl in the distribution.)

VERSION

This documents Lingua::EN::ABC version 0.12 corresponding to git commit 20d6292bafb5dd6d7e04bd1b24ffc119d01ab761 released on Tue Nov 16 17:19:21 2021 +0900.

DESCRIPTION

This module offers functions to convert between the spellings and vocabulary of American, British, and Canadian versions of English.

FUNCTIONS

The naming convention for the functions is "a" for American, "b" for British, "c" for Canadian, so "a2b" converts "American to British".

a2b

my $british = a2b ('color');
# $british = 'colour'.

Convert American into British spellings. An option oxford controls whether to use Oxford spelling (realize rather than realise):

my $oxford_british = a2b ('realize', oxford => 1);

This does not convert words with different pronunciations or words which are completely different between American and British uses.

This cannot correctly convert ambiguous spellings like "program", which may be either "program" or "programme" in British English. See "BUGS". It tries to convert American formations like "gotten" into "got".

An option s, if true, results in a spelling-only conversion:

use utf8;
use Lingua::EN::ABC ':all';
print a2b ('aluminum airplane labor center pajamas'), "\n";
print a2b ('aluminum airplane labor center pajamas', s => 1), "\n";

produces output

aluminium aeroplane labour centre pyjamas
aluminum airplane labour centre pyjamas

(This example is included as alairlab.pl in the distribution.)

In this case, word pairs with differing pronunciations, like "burnt" and "burned" are not interchanged, and word pairs which are ambiguous, like "check" and "cheque", are also not interchanged.

b2a

my $american = b2a ('the colour of my pyjamas');
# $american = 'the color of my pajamas'

Convert British spellings into American spellings. This cannot convert British formations like "got" into "gotten" due to the grammatical ambiguity ("I've got a car" versus "I've gotten into an accident", or "I got into an accident").

An option s, if true, results in a spelling-only conversion. See "a2b".

a2c

my $canadian = a2c ('the color');
# $canadian = 'the colour'

Convert American to Canadian spelling. An option s, if true, results in a spelling-only conversion. See "a2b".

c2a

my $american = c2a ('the color');
# $american = 'the colour'

Convert Canadian to American spelling. An option s, if true, results in a spelling-only conversion. See "a2b".

b2c

my $canadian = b2c ('the programme');
# $canadian = 'the program'

Convert British to Canadian spelling. An option s, if true, results in a spelling-only conversion. See "a2b".

c2b

my $british = c2b ($canadian);

Convert Canadian to British spelling. An option oxford controls whether to use Oxford spelling (realize rather than realise):

my $oxford_british = c2b ($canadian, oxford => 1);

An option s, if true, results in a spelling-only conversion. See "a2b".

DEPENDENCIES

Carp

Carp is used to print errors.

JSON::Parse

JSON::Parse is used to read in the file of spelling data.

"make_regex" in Convert::Moji

This is used to make a regular expression which converts the words from one form to another.

SEE ALSO

Lingua::EN::ABC::Data

This is the underlying data for this module, put into POD format so that it's easy to search and check.

respell

respell is a tool to convert English text from one spelling system to another. This used to be at http://membled.com/work/apps/respell, but that web site has now disappeared as of Tue Nov 16 17:19:21 2021 +0900.

STANDALONE SCRIPT

There is a script called econv in the distribution which runs these functions on its command line. Please use econv --help for detailed usage instructions.

DATA FILE

The data file provided with the distribution isn't intended to be human-edited. The master file containing the spelling variations is abc.txt in the top directory of the distribution. The comment at the top of the file contains information about the format. To add to this module's list of words, edit the file and send a pull request on github.

BUGS

No handling of ambiguous words like "program".

"Program" is used in British English for computer programs, whereas a theatre programme uses the -mme spelling.

It only converts lower case

For example, "a2c" will not convert "The Color Purple" or "The World Trade Center" into "The Colour Purple" or "The World Trade Centre". This is a feature as well as a bug, since proper names like movie titles or place names should not be respelt.

Word lists are not comprehensive

Please feel free to contribute. See "DATA FILE" for an easy way to contribute new items.

There are no tests involving the ambiguity data

Up to version 0.05 of the module, the ambiguity data about which words are ambiguous (vice/vise etc.) was not being put into the JSON data file, and yet it was passing all its tests, so there cannot be any tests of this.

HISTORY

0.09 2018-09-26

Additional word pairs coloured, colouration, mouldy, vapourise, vapourisation.

Ambiguous spellings (check/cheque, meter/metre) no longer converted when using the s option.

Some pairs incorrectly marked as spelling-only (towards, mum) restored.

0.10

Plurals ending in s were added.

ACKNOWLEDGEMENTS

A list of words by Wikipedia user Ohconfucius was used in the preparation of the data. Nigel Horne (NJH) and Ed Avis (EDAVIS) contributed some word additions and other suggestions.

AUTHOR

Ben Bullock, <bkb@cpan.org>

COPYRIGHT & LICENCE

This package and associated files are copyright (C) 2013-2021 Ben Bullock.

You can use, copy, modify and redistribute this package and associated files under the Perl Artistic Licence or the GNU General Public Licence.