NAME

Business::CompanyDesignator - module for matching and stripping/manipulating the company designators appended to company names

VERSION

Version: 0.05.

This module is considered an ALPHA release. Interfaces may change and/or break without notice until the module reaches version 1.0.

SYNOPSIS

Business::CompanyDesignator is a Perl module for matching and stripping/manipulating the typical company designators appended to company names. It supports both long forms (e.g. Corporation, Incorporated, Limited) and abbreviations (e.g. Corp., Inc., Ltd., GmbH).

use Business::CompanyDesignator;

# Constructor
$bcd = Business::CompanyDesignator->new;
# Optionally, you can provide your own company_designator.yml file, instead of the bundled one
$bcd = Business::CompanyDesignator->new(datafile => '/path/to/company_designator.yml');

# Get lists of designators, which may be long (e.g. Limited) or abbreviations (e.g. Ltd.)
@des = $bcd->designators;
@long = $bcd->long_designators;
@abbrev = $bcd->abbreviations;

# Look up individual designator records (returns B::CD::Record objects)
# Look up record by long designator (unique)
$record = $bcd->record($long_designator);
# Look up records by abbreviation or long designator (may not be unique)
@records = $bcd->records($designator);

# Get a regex for matching designators
$re = $bcd->regex;
$company_name =~ $re and say 'designator found!';
$company_name =~ /$re\s*$/ and say 'final designator found!';

# Split $company_name on designator, returning a ($before, $designator, $after) triplet,
# plus the normalised form of the designator matched.
($short_name, $des, $after, $normalised_des) = $bcd->split_designator($company_name);

DATASET

Business::CompanyDesignator uses the company designator dataset from here:

L<https://github.com/ProfoundNetworks/company_designator>

which is bundled with the module. You can use your own (updated or custom) version, if you prefer, by passing a 'datafile' parameter to the constructor.

The dataset defines multiple long form designators (like "Company", "Limited", or "Incorporée"), each of which has zero or more abbreviations (e.g. 'Co.', 'Ltd.', 'Inc.') and one or more language codes. The 'Company' entry, for instance, looks like this:

Company:
  abbr:
    - Co.
    - '& Co.'
    - and Co.
  lang: en

Long designators are unique across the dataset, but abbreviations are not: 'Inc.', for example, is used for both "Incorporated" and "Incorporée".
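The non-uniqueness of abbreviations can be illustrated with a minimal, standalone sketch that does not use the module at all. The %dataset hash below is a hypothetical excerpt shaped like the YAML above; the entries for "Incorporated" and "Incorporée" (including the 'fr' language code) are assumptions made for illustration, not copied from the bundled file.

```perl
use strict;
use warnings;
use utf8;
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical excerpt, shaped like the dataset after YAML parsing.
# Only the 'Company' entry is taken from the documentation; the other
# two entries are illustrative assumptions.
my %dataset = (
    'Company'      => { abbr => [ 'Co.', '& Co.', 'and Co.' ], lang => 'en' },
    'Incorporated' => { abbr => [ 'Inc.' ],                    lang => 'en' },
    'Incorporée'   => { abbr => [ 'Inc.' ],                    lang => 'fr' },
);

# Invert the mapping: abbreviation => list of long designators using it
my %long_for;
for my $long (sort keys %dataset) {
    push @{ $long_for{$_} }, $long for @{ $dataset{$long}{abbr} };
}

# Long designators are unique hash keys, but one abbreviation
# can point back to several of them
for my $abbr (sort keys %long_for) {
    printf "%-8s => %s\n", $abbr, join(', ', @{ $long_for{$abbr} });
}
```

This is why a lookup by long designator can return a single record, while a lookup by abbreviation has to return a list.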

METHODS

new()

Creates a Business::CompanyDesignator object.

$bcd = Business::CompanyDesignator->new;

By default this uses the bundled company_designator dataset. You may provide your own (updated or custom) version by passing a 'datafile' parameter to the constructor.

$bcd = Business::CompanyDesignator->new(datafile => '/path/to/company_designator.yml');

designators()

Returns the full list of company designator strings from the dataset (both long form and abbreviations).

@designators = $bcd->designators;

long_designators()

Returns the full list of long form designators from the dataset.

@long = $bcd->long_designators;

abbreviations()

Returns the full list of abbreviation designators from the dataset.

@abbrev = $bcd->abbreviations;

record($long_designator)

Returns the Business::CompanyDesignator::Record object for the given long designator (and dies if not found).

records($designator)

Returns a list of Business::CompanyDesignator::Record objects for the given abbreviation or long designator (for long designators there will only be a single record returned, but abbreviations may map to multiple records).

Use this method for abbreviations, or if you aren't sure of a designator's type.
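As a rough sketch of how record() and records() differ in use, the following assumes the bundled dataset contains both the 'Inc.' abbreviation and the "Incorporated" long designator, as the DATASET section suggests. It only counts the records returned and does not call any methods on the record objects.

```perl
use strict;
use warnings;
use Business::CompanyDesignator;

my $bcd = Business::CompanyDesignator->new;

# 'Inc.' is an abbreviation, so it may map to more than one record
my @records = $bcd->records('Inc.');
printf "'Inc.' maps to %d record(s)\n", scalar @records;

# record() dies if the long designator is not found, so guard it with eval
my $record = eval { $bcd->record('Incorporated') };
print defined $record
    ? "found a record for 'Incorporated'\n"
    : "no record: $@";
```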

regex()

Returns a regex that matches all designators from the dataset (case-insensitive, non-anchored).

split_designator($company_name)

Attempts to split $company_name on (the first) company designator found. If found, it returns a list of four items - a triplet of strings from $company_name: ( $before, $designator, $after ), plus a normalised version of the designator as a fourth element.

($short_name, $des, $after_text, $normalised_des) = $bcd->split_designator($company_name);

The second element, $des, is the designator as matched in the text, while the fourth, $normalised_des, is the normalised version as found in the dataset. For instance, "ABC Pty Ltd" would return "Pty Ltd" as $des, but "Pty. Ltd." as $normalised_des; the latter is what you would find in designators() or would look up with records(). Similarly, "Accessoires XYZ Ltee" (misspelt without the acute accent) would still be matched, returning "Ltee" (as found) for $des, but "Ltée" as the normalised form.
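A minimal usage sketch for the "ABC Pty Ltd" case described above. What the method returns when no designator is found is not documented here, so the sketch assumes only that the designator element is undefined or empty in that case, and checks for that before using the results.

```perl
use strict;
use warnings;
use Business::CompanyDesignator;

my $bcd  = Business::CompanyDesignator->new;
my $name = 'ABC Pty Ltd';

my ($before, $des, $after, $normalised) = $bcd->split_designator($name);

# Assumption: $des is undefined (or empty) when nothing matched
if (defined $des && length $des) {
    print "before:     $before\n";
    print "matched:    $des\n";
    print "normalised: $normalised\n";
}
else {
    print "no designator found in '$name'\n";
}
```

The normalised form, rather than the text as matched, is the one to pass on to records().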

SEE ALSO

Finance::CompanyNames

AUTHOR

Gavin Carr <gavin@profound.net>

COPYRIGHT AND LICENCE

Copyright (C) 2013 Gavin Carr and Profound Networks.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
