NAME

Intellexer::API - API client for Intellexer

Perl API client for Intellexer, a web service that "enables developers to embed Intellexer semantics products using XML or JSON."

SYNOPSIS

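use feature 'say';
use JSON::PP;
use Intellexer::API;

my $api_key     = q{...get this from intellexer.com};
my $sample_text = q{...the text to check};
my $api         = Intellexer::API->new($api_key);
my $response    = $api->checkTextSpelling(
    $sample_text,
    'language'             => 'ENGLISH',
    'errorTune'            => '2',
    'errorBound'           => '3',
    'minProbabilityTune'   => '2',
    'minProbabilityWeight' => '30',
    'separateLines'        => 'true'
);

# any JSON encoder works here; JSON::PP ships with Perl
my $json = JSON::PP->new->pretty->canonical;
say $json->encode($response);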

DESCRIPTION

Methods

Topic Modeling

Automatically extract topics from text.

getTopicsFromUrl($url)

Accepts a single argument that is a valid URL to a document or webpage.

my $response = $api->getTopicsFromUrl( 'https://perldoc.perl.org/perlsub' );

getTopicsFromFile($file_path)

Accepts a single argument that is a valid path to a local text file.

my $response = $api->getTopicsFromFile($file_path);

getTopicsFromText($text)

Accepts a single argument that is text to be analyzed.

my $response = $api->getTopicsFromText($text);

Linguistic Processor

Parse an input text stream and extract various linguistic information: detected sentences with their offsets in a source text; text tokens (words of sentences) with their part of speech tags, offsets and lemmas (normal forms); subject-verb-object semantic relations.

Parameters

  • loadSentences - load source sentences (TRUE by default)

  • loadTokens - load information about words of sentences (TRUE by default)

  • loadRelations - load information about extracted semantic relations in sentences (TRUE by default)

analyzeText($text, %params)

Accepts the text to be analyzed and optionally one or more of the shown parameters.

my $response = $api->analyzeText(
    $sample_text,
    'loadSentences' => 'True', # load source sentences (TRUE by default)
    'loadTokens'    => 'True', # load information about words of sentences (TRUE by default)
    'loadRelations' => 'True'  # load information about extracted semantic relations in sentences (TRUE by default)
);
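Parameters are passed as a plain key/value list, and how the module treats an unknown name is not documented here. A small client-side check can catch typos before a request is sent. This is only a sketch: unknown_params is a hypothetical helper, not part of Intellexer::API, and the list of valid names is taken from the parameters documented above.

```perl
use strict;
use warnings;

# Parameter names documented for the Linguistic Processor.
my %known = map { $_ => 1 } qw(loadSentences loadTokens loadRelations);

# Hypothetical helper: returns the names in the list that are not documented.
sub unknown_params {
    my (%params) = @_;
    my @unknown = sort grep { !$known{$_} } keys %params;
    return @unknown;
}

# 'loadTokenz' is a deliberate typo.
my %params = ( loadSentences => 'True', loadTokenz => 'True' );
my @bad    = unknown_params(%params);
print "Unknown parameter(s): @bad\n" if @bad;
```

If @bad is empty, %params can be passed to analyzeText as shown above.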

Sentiment Analyzer

Automatically extracts sentiments (positivity/negativity), opinion objects (e.g., product features with associated sentiment phrases) and emotions (liking, anger, disgust, etc.) in unstructured text data.

You will need the list of currently available ontologies, which you can retrieve using the sentimentAnalyzerOntologies method.

Parameters

  • ontology - specify which of the existing ontologies will be used to group the results

  • loadSentences - load source sentences (FALSE by default)

sentimentAnalyzerOntologies()

Accepts no arguments and returns the ontologies available from the API.

my $response = $api->sentimentAnalyzerOntologies();

analyzeSentiments(\@reviews, %params)

Accepts a reference to a list of reviews, followed by the shown parameters.

my @reviews = (
    {
        "id" => "snt1",
        "text" => "YourText"
    },
    {
        "id" => "snt2",
        "text" => "YourText"
    }
);

my $ontology = "Gadgets";
my $response = $api->analyzeSentiments(
    \@reviews,
    'ontology'      => $ontology, # required
    'loadSentences' => 'True',    # defaults to false
);
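When the reviews start out as plain strings, the list of id/text hashes can be generated rather than typed by hand. A minimal sketch: build_reviews is a hypothetical helper, and the sntN id scheme simply mirrors the example above; whether the service requires any particular id format is not stated here.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap plain strings in { id, text } hashes,
# numbering the ids snt1, snt2, ... in input order.
sub build_reviews {
    my @texts = @_;
    my $n     = 0;
    # The leading + tells Perl the inner braces are an anonymous hash.
    return map { +{ id => 'snt' . ++$n, text => $_ } } @texts;
}

my @reviews = build_reviews(
    'The room was clean and quiet.',
    'Check-in took far too long.',
);

print "$_->{id}: $_->{text}\n" for @reviews;
```

The resulting list is passed to analyzeSentiments by reference, as \@reviews.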

Named Entity Recognizer

Identifies elements in text and classifies them into predefined categories such as personal names, names of organizations, position/occupation, nationality, geographical location, date, age, duration and names of events. Additionally allows identifying the relations between named entities.

Parameters

  • url - URL to parse (when using the URL-based method)

  • fileName - name of the file to process (when using file processing only)

  • fileSize - size of the file to process in bytes

  • loadNamedEntities - load named entities (FALSE by default)

  • loadRelationsTree - load tree of relations (FALSE by default)

  • loadSentences - load source sentences (FALSE by default)

recognizeNe(%params)

Load Named Entities from a document from a given URL. Accepts the shown parameters.

my $response = $api->recognizeNe(
    'url'               => 'https://en.wikipedia.org/wiki/Boogie', # required
    'loadNamedEntities' => 'True',    # load named entities (FALSE by default)
    'loadRelationsTree' => 'True',    # load tree of relations (FALSE by default)
    'loadSentences'     => 'True',    # load source sentences (FALSE by default)
);

recognizeNeFileContent($file_path, %params)

Load Named Entities from a file. Accepts a file path and the shown parameters as arguments.

my $response = $api->recognizeNeFileContent(
    $file_path,
    'fileSize'          => $size,
    'loadNamedEntities' => 'True',
    'loadRelationsTree' => 'True',
    'loadSentences'     => 'True',
);
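The fileSize parameter is documented as the size of the file in bytes. One way to obtain it is Perl's -s file test. The sketch below writes a throwaway file with File::Temp so that it runs on its own; in real use $file_path would point at the document to process.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a temporary file so the example is self-contained.
my ( $fh, $file_path ) = tempfile( UNLINK => 1 );
print {$fh} "Intellexer sample text\n";
close $fh or die "close failed: $!";

# -s returns the size in bytes, or undef if the file does not exist.
my $size = -s $file_path;
die "Cannot stat $file_path" unless defined $size;

print "fileSize => $size\n";
```

$size can then be supplied as the 'fileSize' value in the call above.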

recognizeNeText($text, %params)

Load Named Entities from a text. Accepts a sample text and the shown parameters.

my $response = $api->recognizeNeText(
   $sample_text,
   'loadNamedEntities' => 'True',
   'loadRelationsTree' => 'True',
   'loadSentences'     => 'True',
);

Summarizer

Automatically generates a summary (short description) of a document with its main ideas. Intellexer Summarizer's unique feature is the possibility to create different kinds of summaries: theme-oriented (e.g., politics, economics, sports, etc.), structure-oriented (e.g., scientific article, patent, news article) and concept-oriented.

Parameters

  • loadConceptsTree - load a tree of concepts (FALSE by default)

  • loadNamedEntityTree - load a tree of Named Entities (FALSE by default)

  • summaryRestriction - determine size of a summary measured in sentences

  • usePercentRestriction - use percentage of the number of sentences in the original text instead of the exact number of sentences

  • conceptsRestriction - determine the length of a concept tree

  • structure - specify structure of the document (News Article, Research Paper, Patent or General)

  • returnedTopicsCount - determine max count of document topics to return

  • fullTextTrees - load full text trees

  • useCache - if TRUE, document content will be loaded from cache if there is any

  • wrapConcepts - mark concepts found in the summary with HTML bold tags (FALSE by default)
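To make the interaction between summaryRestriction and usePercentRestriction concrete, here is the arithmetic as a sketch. expected_summary_size is a hypothetical helper for illustration only. Rounding up and the minimum of one sentence are assumptions; the service may round differently.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hypothetical helper: estimate how many sentences a summary would contain.
sub expected_summary_size {
    my ( $total_sentences, $restriction, $use_percent ) = @_;

    # Without the percent flag, the restriction is an exact sentence count.
    return $restriction unless $use_percent;

    # Assumption: round up, and never go below one sentence.
    my $n = ceil( $total_sentences * $restriction / 100 );
    return $n < 1 ? 1 : $n;
}

# A 200-sentence document with summaryRestriction => 7.
printf "exact:   %d\n", expected_summary_size( 200, 7, 0 );    # 7 sentences
printf "percent: %d\n", expected_summary_size( 200, 7, 1 );    # 7% of 200 = 14
```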

summarize($url, %params)

Return summary data for a document from a given URL. Accepts a valid URL as the first argument then any parameters.

my $response = $api->summarize(
    $url,
   'summaryRestriction'    => '7',
   'returnedTopicsCount'   => '2',
   'loadConceptsTree'      => 'true',
   'loadNamedEntityTree'   => 'true',
   'usePercentRestriction' => 'true',
   'conceptsRestriction'   => '7',
   'structure'             => 'general',
   'fullTextTrees'         => 'true',
   'textStreamLength'      => '1000',
   'useCache'              => 'false',
   'wrapConcepts'          => 'true'
);

summarizeText($text, %params)

Return summary data for a text. Accepts text as first argument then any parameters.

my $response = $api->summarizeText( $sample_text, %params );

summarizeFileContent($file_path, %params)

Return summary data for a text file. Accepts a file path as first argument then any parameters.

my $response = $api->summarizeFileContent( $file_path, %params );
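The %params hash used in the two calls above can be built once and reused across summarize, summarizeText and summarizeFileContent. A minimal sketch of keeping shared defaults and overriding them per call; the particular values are illustrative only.

```perl
use strict;
use warnings;

# Defaults shared by every summarizer call (names from the parameter list).
my %defaults = (
    summaryRestriction  => '7',
    returnedTopicsCount => '2',
    structure           => 'general',
);

# In a list assignment later pairs win, so overrides go last.
my %params = (
    %defaults,
    summaryRestriction => '3',
    loadConceptsTree   => 'true',
);

print "$_ => $params{$_}\n" for sort keys %params;
```

Here summaryRestriction ends up as '3' while the other defaults are kept unchanged.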

Multi-Document Summarizer

Automatically generates a summary (short description) from multiple documents with their main ideas. It also detects the most important facts between the concepts of the selected documents (this feature is called Related Facts).

Parameters

  • loadConceptsTree - load a tree of concepts (FALSE by default)

  • loadNamedEntityTree - load a tree of Named Entities (FALSE by default)

  • summaryRestriction - determine size of a summary measured in sentences

  • usePercentRestriction - use percentage of the number of sentences in the original text instead of the exact number of sentences

  • conceptsRestriction - determine the length of a concept tree

  • structure - specify structure of the document (News Article, Research Paper, Patent or General)

  • returnedTopicsCount - set max number of document topics to return

  • relatedFactsRequest - add a query to extract facts and concepts related to it

  • maxRelatedFactsConcepts - set max number of related facts/concepts to return

  • maxRelatedFactsSentences - set max number of sample sentences for each related fact/concept

  • fullTextTrees - load full text trees

multiUrlSummary(\@url_list, %params)

Accepts a reference to a list of valid URLs and then any parameters.

my $response = $api->multiUrlSummary(
    \@url_list,
    'filename'              => 'sample.txt',  #required
    'summaryRestriction'    => '7',
    'returnedTopicsCount'   => '2',
    'loadConceptsTree'      => 'true',
    'loadNamedEntityTree'   => 'true',
    'usePercentRestriction' => 'true',
    'conceptsRestriction'   => '7',
    'structure'             => 'general',
    'fullTextTrees'         => 'true',
    'textStreamLength'      => '1000',
    'useCache'              => 'false',
    'wrapConcepts'          => 'true'
);

Comparator

Accurately compares documents of any format and determines the degree of similarity between them.

Parameters

  • useCache - if TRUE, document content will be loaded from cache if there is any

compareText( $text1, $text2 )

Compares the specified sources. Accepts two arguments that are text.

my $response = $api->compareText( $sample_text, $sample_text2 );

compareUrls( $url1, $url2, %params )

Compares the specified sources. Accepts two arguments that are valid URLs followed by any params.

my $response = $api->compareUrls( $url1, $url2 );

compareUrlwithFile( $url, $file, %params )

Compares a URL source to a local file. Accepts a valid URL followed by a valid file path followed by any params.

my $response = $api->compareUrlwithFile( $url, $file );

compareFiles( $file1, $file2 )

Compares the given sources. Accepts two arguments that are both valid file paths.

my $response = $api->compareFiles('sample.txt','sample2.txt');

Clusterizer

Hierarchically sorts an array of documents or terms from given texts.

Parameters

  • conceptsRestriction - determine the length of a concept tree

  • fullTextTrees - load full text trees

  • useCache - if TRUE, document content will be loaded from cache if there is any (when using URLs)

  • loadSentences - load all sentences

  • wrapConcepts - mark concepts found in the summary with HTML bold tags (FALSE by default)

clusterize($url, %params)

Return tree of concepts for a document from a given URL. Accepts a valid URL followed by any parameters.

my $response = $api->clusterize(
    $url_list[0],
    'conceptsRestriction' => '10',
    'fullTextTrees'       => 'true',
    'loadSentences'       => 'true',
    'wrapConcepts'        => 'true'
);

clusterizeText($text, %params)

Return tree of concepts for a text. Accepts text followed by any parameters.

my $response = $api->clusterizeText( $sample_text, %params );

clusterizeFileContent($file, %params)

Return tree of concepts for a text file. Accepts a valid file path followed by any parameters.

my $response = $api->clusterizeFileContent( $file, %params );

Natural Language Interface

Transforms Natural Language Queries into Boolean queries.

convertQueryToBool( $text )

Convert a user query in English to a set of terms and concepts joined by logical operators. Accepts a single argument, which is the text to be processed.

my $response = $api->convertQueryToBool('I just enter some text here and see what happens');

Preformator

Extracts plain text and information about the text layout from documents of different formats (doc, pdf, rtf, html, etc.).

Parameters

  • useCache - if TRUE, document content will be loaded from cache if there is any

  • getTopics - if TRUE, response will contain Topic ID of the document

supportedDocumentStructures()

Return available Preformator Document structures.

my $response = $api->supportedDocumentStructures();

supportedDocumentTopics()

Return available Preformator Document topics.

my $response = $api->supportedDocumentTopics();

parse( $url, %params )

Parse internet/intranet file content using Preformator. Accepts a valid URL followed by any parameters.

my $response = $api->parse( $url, 'getTopics' => 'true');

parseFileContent( $file )

Parse file content using Preformator. Accepts a single argument that is a valid path to a file.

my $response = $api->parseFileContent( $file_path );

Language Recognizer

Identifies the language and character encoding of incoming documents.

recognizeLanguage( $text )

Recognize language and encoding of an input text stream. Accepts one argument that is the text to be analyzed.

my $response = $api->recognizeLanguage( $text );

Spellchecker

Automatically corrects spelling errors using carefully chosen statistical and linguistic rules, including: rules for context-dependent misspellings; rules for evaluating the probability of possible corrections; rules for evaluating spelling mistakes caused by different ways of representing speech sounds with the letters of the alphabet; and dictionaries of correct spellings.

Parameters

  • separateLines - process each line independently

  • language - set input language

  • errorTune - adjust 'errorBound' to the length of words according to the expert bound values. There are 3 possible modes:

    1. Reduce - choose the smaller value between the expert value and the bound set by the user;

    2. Equal - choose the bound set by the user;

    3. Raise - choose the bigger value between the expert value and the bound set by the user.

  • errorBound - manually set maximum number of corrections for a single word regardless of its length

  • minProbabilityTune - adjust 'minProbabilityWeight' to the length of words according to the expert probability values. Modes are similar to 'errorTune'

  • minProbabilityWeight - set minimum probability for the words to be included to the list of candidates
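The three tuning modes are easier to follow with numbers. The sketch below gives the modes names and shows which bound each would select. It rests on two assumptions: that the list numbers 1 to 3 are the values passed as 'errorTune' (the examples in this document pass 2), and that the expert value is something the service computes internally, so the value 2 used here is made up. effective_bound is a hypothetical illustration, not part of the module.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Assumption: the list numbers double as the values sent to the API.
use constant {
    TUNE_REDUCE => 1,
    TUNE_EQUAL  => 2,
    TUNE_RAISE  => 3,
};

# Hypothetical illustration of how a mode picks the effective bound.
sub effective_bound {
    my ( $mode, $expert, $user ) = @_;
    return min( $expert, $user ) if $mode == TUNE_REDUCE;
    return $user                 if $mode == TUNE_EQUAL;
    return max( $expert, $user ) if $mode == TUNE_RAISE;
    die "Unknown tune mode: $mode";
}

# Made-up expert value of 2 against a user-supplied errorBound of 3.
print "reduce: ", effective_bound( TUNE_REDUCE, 2, 3 ), "\n";    # 2
print "equal:  ", effective_bound( TUNE_EQUAL,  2, 3 ), "\n";    # 3
print "raise:  ", effective_bound( TUNE_RAISE,  2, 3 ), "\n";    # 3
```

Per the description above, 'minProbabilityTune' applies modes of the same kind to 'minProbabilityWeight'.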

checkTextSpelling($text, %params)

Perform text spell check. Accepts the text as the first argument, followed by any parameters.

my $result = $api->checkTextSpelling(
    $text,
    'language'             => 'ENGLISH',
    'errorTune'            => 2,
    'errorBound'           => 3,
    'minProbabilityTune'   => 2,
    'minProbabilityWeight' => 30,
    'separateLines'        => 'true'
);

ENVIRONMENT

An API key is required to access the Intellexer API. You can get one free for 30 days.

https://www.intellexer.com/

Intellexer.com provides API documentation that this module attempts to follow.

https://esapi.intellexer.com/Home/Help

BUGS

Please report bugs to the GitHub repository for this project.

https://github.com/haxmeister/Perl-Intellexer-API

AUTHOR

HAXMEISTER (Joshua S. Day) <haxmeister@hotmail.com>

LICENSE & COPYRIGHT

This software is copyright (c) 2024 by Joshua S. Day.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.