NAME
WebService::DetectLanguage - interface to the language detection API at DetectLanguage.com
SYNOPSIS
    use WebService::DetectLanguage;
    my $api = WebService::DetectLanguage->new(key => '...');
    my @possibilities = $api->detect("there can be only one");
    foreach my $poss (@possibilities) {
        printf "language = %s confidence=%f\n",
            $poss->language->name,
            $poss->confidence;
    }
DESCRIPTION
This module is an interface to the DetectLanguage service, which provides an API for guessing what natural language is used in a sample of text.
This is very much a first cut at an interface, so (a) the interface may well change, and (b) contributions are welcome.
To use the API you must sign up to get an API key, at https://detectlanguage.com/plans. There is a free level which lets you make 1,000 requests per day, and you don't have to provide a card to sign up for the free level.
Example Usage
Let's say you've got a sample of text in a file. You might read it into $text using read_text() from File::Slurper.
To identify the language, you call the detect() method:
    my @results = $api->detect($text);
Each result is an instance of WebService::DetectLanguage::Result. If there's only one result, you should look at the is_reliable flag to see whether the service is confident of the identification. In general, the more text it is given, the more confident it is.
    if (@results == 1) {
        my $result = $results[0];
        if ($result->is_reliable) {
            printf "Language is %s!\n", $result->language->name;
        }
        else {
            # Hmm, maybe check with the user?
        }
    }
You might get more than one result though. This might happen if your sample contains words from more than one language, for example.
In that case, the is_reliable flag can be used to check if the first result is reliable enough to go with.
    if (@results > 1 && $results[0]->is_reliable) {
        # we'll go with that!
    }
At most one result will ever have is_reliable set to a true value. If you get multiple results, they're always in decreasing order of reliability.
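The checks above can be folded into one small helper. The following is a minimal sketch, not part of the module: best_guess is a hypothetical name, and Local::FakeResult and Local::FakeLanguage are stand-in classes with made-up sample data, used only so that the example runs without an API key or network access. The helper relies solely on the is_reliable method and on the ordering of results described above.

```perl
use strict;
use warnings;

# Stand-ins for the module's result and language objects (hypothetical,
# for illustration only); real results come from $api->detect($text).
package Local::FakeLanguage {
    sub new  { my ($class, %args) = @_; return bless { %args }, $class }
    sub name { $_[0]{name} }
}
package Local::FakeResult {
    sub new         { my ($class, %args) = @_; return bless { %args }, $class }
    sub language    { $_[0]{language} }
    sub is_reliable { $_[0]{is_reliable} }
}

# Return the first result if it is flagged as reliable, otherwise
# return nothing (undef in scalar context). Results arrive in
# decreasing order of reliability, so the first one is the one to check.
sub best_guess {
    my @results = @_;
    return unless @results && $results[0]->is_reliable;
    return $results[0];
}

# Made-up sample data standing in for a real response.
my @results = (
    Local::FakeResult->new(
        language    => Local::FakeLanguage->new(name => 'English'),
        is_reliable => 1,
    ),
    Local::FakeResult->new(
        language    => Local::FakeLanguage->new(name => 'French'),
        is_reliable => 0,
    ),
);

my $best = best_guess(@results);
if (defined $best) {
    printf "Going with %s\n", $best->language->name;
}
else {
    print "Not sure; maybe check with the user\n";
}
```

With real data, you would pass the list returned by detect() straight to the helper and fall back to asking the user when it returns undef.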
Each result also includes a confidence value, which looks a bit like a percentage, but their FAQ says that it can go higher than 100.
    foreach my $result (@results) {
        my $language = $result->language;
        printf "language = %s (%s) with confidence %f\n",
            $language->name,
            $language->code,
            $result->confidence;
    }
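Because the confidence value is not capped at 100, it is probably best treated as a relative score rather than a percentage. A minimal sketch of filtering on it follows. The name confident_results and the threshold of 5 are arbitrary, hypothetical choices for illustration, not values recommended by the service, and the stand-in classes and sample numbers are made up so that the example runs offline.

```perl
use strict;
use warnings;

# Stand-ins for the module's result and language objects (hypothetical).
package Local::FakeLanguage {
    sub new  { my ($class, %args) = @_; return bless { %args }, $class }
    sub code { $_[0]{code} }
}
package Local::FakeResult {
    sub new        { my ($class, %args) = @_; return bless { %args }, $class }
    sub language   { $_[0]{language} }
    sub confidence { $_[0]{confidence} }
}

# Keep only the results whose confidence is at least $min_confidence.
# The original order of the results is preserved.
sub confident_results {
    my ($min_confidence, @results) = @_;
    return grep { $_->confidence >= $min_confidence } @results;
}

# Made-up sample data standing in for a real response.
my @results = (
    Local::FakeResult->new(
        language   => Local::FakeLanguage->new(code => 'en'),
        confidence => 12.5,
    ),
    Local::FakeResult->new(
        language   => Local::FakeLanguage->new(code => 'fr'),
        confidence => 1.25,
    ),
);

# The threshold of 5 is an arbitrary example value.
foreach my $result (confident_results(5, @results)) {
    printf "%s: %.2f\n", $result->language->code, $result->confidence;
}
```

A sensible threshold depends on the length and kind of text you send, so it is worth experimenting with your own samples.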
METHODS
new
You must provide the key that you got from detectlanguage.com.
    my $api = WebService::DetectLanguage->new(
        key => '...',
    );
detect
This method takes a UTF-8 text string, and returns a list of one or more guesses at the language.
Each guess is a data object which has attributes language, confidence, and is_reliable.
    my $text = "It was a bright cold day in April, ...";
    my @results = $api->detect($text);
    foreach my $result (@results) {
        printf "language = %s (%s) confidence = %f reliable = %s\n",
            $result->language->name,
            $result->language->code,
            $result->confidence,
            $result->is_reliable ? 'Yes' : 'No';
    }
Look at the API documentation to see how to interpret each result.
multi_detect
This takes multiple strings and returns a list of arrayrefs; there is one arrayref for each string, returned in the same order as the strings. Each arrayref contains one or more language guesses, as for detect() above.
    my @strings = (
        "All happy families are alike; each unhappy family ... ",
        "This is my favourite book in all the world, though ... ",
        "It is a truth universally acknowledged, that Perl ... ",
    );
    my @results = $api->multi_detect(@strings);
    for (my $i = 0; $i < @strings; $i++) {
        print "Text: $strings[$i]\n";
        my @guesses = @{ $results[$i] };
        # ... handle @guesses as for detect() above
    }
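Since the arrayrefs come back in the same order as the strings, each string can be paired with its top guess by index. Here is a minimal sketch of that pairing. The name pair_with_top_guess is hypothetical, and the stand-in classes and sample data are made up so that the example runs without calling the service; with real data, the second argument would be a reference to the list returned by multi_detect().

```perl
use strict;
use warnings;

# Stand-ins for the module's result and language objects (hypothetical).
package Local::FakeLanguage {
    sub new  { my ($class, %args) = @_; return bless { %args }, $class }
    sub code { $_[0]{code} }
}
package Local::FakeResult {
    sub new      { my ($class, %args) = @_; return bless { %args }, $class }
    sub language { $_[0]{language} }
}

# Given a reference to the strings and a reference to the list of
# arrayrefs, return a list of [ $string, $top_guess ] pairs.
# $top_guess is the first guess for that string (undef if there is none).
sub pair_with_top_guess {
    my ($strings, $results) = @_;
    die "number of strings and results differ\n"
        unless @$strings == @$results;
    return map { [ $strings->[$_], $results->[$_][0] ] } 0 .. $#$strings;
}

my @strings = (
    "All happy families are alike",
    "Toutes les familles heureuses se ressemblent",
);

# Made-up sample data in the shape described above:
# one arrayref of guesses per string.
my @results = (
    [ Local::FakeResult->new(language => Local::FakeLanguage->new(code => 'en')) ],
    [ Local::FakeResult->new(language => Local::FakeLanguage->new(code => 'fr')) ],
);

foreach my $pair (pair_with_top_guess(\@strings, \@results)) {
    my ($text, $top) = @$pair;
    printf "%s => %s\n", $text, defined $top ? $top->language->code : '(no guess)';
}
```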
languages
This returns a list of the supported languages:
    my @languages = $api->languages;
    foreach my $language (@languages) {
        printf "%s: %s\n",
            $language->code,
            $language->name;
    }
account_status
This returns a bunch of information about your account:
    my $status = $api->account_status;
    printf "plan=%s status=%s requests=%d\n",
        $status->plan,
        $status->status,
        $status->requests;
For the full list of attributes, look at either the API documentation or WebService::DetectLanguage::AccountStatus.
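One possible use of the account status is to check how much of a daily quota is left before sending more requests. The sketch below makes several assumptions: that requests counts the requests made so far today, that you supply the daily limit yourself (1,000 here, matching the free level mentioned in the DESCRIPTION), and that requests_remaining is a hypothetical helper, not a method of this module. Local::FakeStatus is a stand-in with made-up values so that the example runs offline.

```perl
use strict;
use warnings;

# Stand-in for the account status object (hypothetical); a real one
# comes from $api->account_status.
package Local::FakeStatus {
    sub new      { my ($class, %args) = @_; return bless { %args }, $class }
    sub requests { $_[0]{requests} }
}

# Return how many requests are left out of $daily_limit, never less
# than zero. Assumes ->requests is the count of requests made today.
sub requests_remaining {
    my ($status, $daily_limit) = @_;
    my $left = $daily_limit - $status->requests;
    return $left > 0 ? $left : 0;
}

my $daily_limit = 1_000;    # assumption: the free level's daily limit
my $status      = Local::FakeStatus->new(requests => 250);   # made-up value

printf "%d of %d requests left today\n",
    requests_remaining($status, $daily_limit),
    $daily_limit;
```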
SEE ALSO
https://detectlanguage.com is the home page for the service; documentation for the API can be found at https://detectlanguage.com/documentation.
REPOSITORY
https://github.com/neilb/WebService-DetectLanguage
AUTHOR
Neil Bowers <neilb@cpan.org>
LICENSE AND COPYRIGHT
This software is copyright (c) 2019 by Neil Bowers <neilb@cpan.org>.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.