NAME

Data::Babel - Translator for biological identifiers

VERSION

Version 1.11

SYNOPSIS

use Data::Babel;
use Data::Babel::Config;
use Class::AutoDB;
use DBI;

# open database containing Babel metadata
my $autodb=new Class::AutoDB(database=>'test');

# try to get existing Babel from database
my $babel=old Data::Babel(name=>'test',autodb=>$autodb);
unless ($babel) {              
  # Babel does not yet exist, so we'll create it
  # idtypes, masters, maptables are names of configuration files that define 
  #   the Babel's component objects
  $babel=new Data::Babel
    (name=>'test',idtypes=>'examples/idtype.ini',masters=>'examples/master.ini',
     maptables=>'examples/maptable.ini');
}
# open database containing real data
my $dbh=DBI->connect("dbi:mysql:database=test",undef,undef);

# CAUTION: rest of SYNOPSIS assumes you've loaded the real database somehow
# translate several Entrez Gene ids to other types
my $table=$babel->translate
  (input_idtype=>'gene_entrez',
   input_ids=>[1,2,3],
   output_idtypes=>[qw(gene_symbol gene_ensembl chip_affy probe_affy)]);
# print a few columns from each row of result
for my $row (@$table) {
  print "Entrez gene=$row->[0]\tsymbol=$row->[1]\tEnsembl gene=$row->[2]\n";
}
# same translation but limit results to Affy hgu133a
$table=$babel->translate
  (input_idtype=>'gene_entrez',
   input_ids=>[1,2,3],
   filters=>{chip_affy=>'hgu133a'},
   output_idtypes=>[qw(gene_symbol gene_ensembl chip_affy probe_affy)]);
# generate a table mapping all Entrez Gene ids to UniProt ids
$table=$babel->translate
  (input_idtype=>'gene_entrez',
   output_idtypes=>[qw(protein_uniprot)]);
# convert to HASH for easy programmatic lookups
my %gene2uniprot=map {$_->[0]=>$_->[1]} @$table;

# count number of Entrez Gene ids represented on Affy hgu133a
my $count=$babel->count
  (input_idtype=>'gene_entrez',filters=>{chip_affy=>'hgu133a'});

# tell which input ids are valid
$table=$babel->validate
  (input_idtype=>'gene_entrez',
   input_ids=>[1,2,3]);
# print validity status of each
for my $row (@$table) {
  my($input_id,$valid,$current_id)=@$row;
  print "Entrez gene $input_id is ",
        ($valid? "valid with current value $current_id": 'invalid'),"\n";
}

DESCRIPTION

Data::Babel translates biological identifiers based on information contained in a database. Each Data::Babel object provides a unique mapping over a set of identifier types. The system as a whole can contain multiple Data::Babel objects; these may share some or all identifier types, and may provide the same or different mappings over the shared types.

The principal method is 'translate' which converts identifiers of one type into identifiers of one or more output types. In typical usage, you call 'translate' with a list of input ids to convert. You can also call it without any input ids (or with the special option 'input_ids_all' set) to generate a complete mapping of the input type to the output types. This is convenient if you want to hang onto the mapping for repeated use. You can also filter the output based on values of other identifier types.
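
A complete mapping is often one-to-many, so a plain HASH keeps only one output per input. The sketch below assumes a result in the ARRAY of ARRAYs shape that 'translate' returns and uses made-up sample rows in place of a real call; it groups the output ids by input id so the mapping can be reused.

```perl
use strict;
use warnings;

# Hypothetical rows standing in for the result of
#   $babel->translate(input_idtype=>'gene_entrez',
#                     output_idtypes=>[qw(protein_uniprot)])
# The ids are made up for illustration.
my $table=[[1,'P00001'],[2,'P00002'],[2,'P00003']];

# group output ids by input id; one input may map to several outputs
my %gene2uniprot;
for my $row (@$table) {
  my($gene,$uniprot)=@$row;
  push(@{$gene2uniprot{$gene}},$uniprot);
}

for my $gene (sort {$a<=>$b} keys %gene2uniprot) {
  print "Entrez gene=$gene\tUniProt=",join(',',@{$gene2uniprot{$gene}}),"\n";
}
```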

Comparisons are done in a case insensitive manner. This includes input ids, filters, and internal comparisons used to join database tables. For example, when translating the gene symbol 'HTT' (the human Huntington Disease gene), you will also get information on gene symbol 'Htt' (the mouse and rat ortholog of the human gene) assuming, of course, this information is in the database.

CAVEAT: Some features of Data::Babel are overly specific to the procedure we use to construct the underlying Babel database. We note such cases when they arise in the documentation below.

The main components of a Data::Babel object are

1. a list of Data::Babel::IdType objects, each representing a type of identifier
2. a list of Data::Babel::Master objects, one per IdType, providing
  • a master list of valid values for the type, and

  • optionally, a history mapping old values to current ones

3. a list of Data::Babel::MapTable objects which implement the mapping

One typically defines these components using configuration files whose basic format is defined in Config::IniFiles. See examples in "Configuration files" and the examples directory of the distribution.

Each MapTable represents a relational table stored in the database and provides a mapping over a subset of the Babel's IdTypes; the ensemble of MapTables must, of course, cover all the IdTypes. The ensemble of MapTables must also be non-redundant as explained in "Technical details".

MapTables must always contain current identifiers, even for IdTypes that have histories (more precisely, for IdTypes whose Masters have histories). The query or program that loads the database is responsible for mapping old identifiers to current ones (presumably via the history).

'translate' checks the input IdType to see if its Master has history information. If so, 'translate' automatically applies the history to all input ids. It does the same for filters.

You need not explicitly define Masters for all IdTypes; Babel will create 'implicit' Masters for any IdTypes lacking explicit ones. An implicit Master has a list of valid identifiers but no history and could be implemented as a view over all MapTables containing the IdType. In the current implementation, we use views for IdTypes contained in single MapTables but construct actual tables for IdTypes contained in multiple MapTables.
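
To picture what "a view over all MapTables containing the IdType" might look like, here is a minimal sketch that builds the SQL text for such a view body. The function name and table names are hypothetical, and the exact SQL that Babel generates is not shown in this document; the sketch assumes that NULL values are excluded.

```perl
use strict;
use warnings;

# implicit_master_sql (hypothetical): build SQL that selects the non-NULL
# values of an IdType from every MapTable containing it. UNION (without ALL)
# removes duplicates, so each id appears once.
sub implicit_master_sql {
  my($idtype,@maptables)=@_;
  my @selects;
  for my $maptable (@maptables) {
    push(@selects,"SELECT $idtype FROM $maptable WHERE $idtype IS NOT NULL");
  }
  return join("\nUNION\n",@selects);
}

my $sql=implicit_master_sql('gene_symbol','maptable_001','maptable_002');
print "$sql\n";
```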

Configuration files

Our configuration files use 'ini' format as described in Config::IniFiles: 'ini' format files consist of a number of sections, each preceded with the section name in square brackets, followed by parameter names and their values.

There are separate config files for IdTypes, Masters, and MapTables. There are complete example files in the distribution. Here are some excerpts:

IdType

[chip_affy]
display_name=Affymetrix array
referent=array
defdb=affy
meta=name
format=/^[a-z]+\d+/
sql_type=VARCHAR(32)

The section name is the IdType name. The parameters are

  • display_name. human readable name for this type

  • referent. the type of things to which this type of identifier refers

  • defdb. the database, if any, responsible for assigning this type of identifier

  • meta. some identifiers are purely synthetic (eg, Entrez gene IDs) while others have some mnemonic content; legal values are

    • eid (meaning synthetic)

    • symbol

    • name

    • description

  • format. Perl format of valid identifiers

  • sql_type. SQL data type
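
To see what the 'format' parameter implies, here is a minimal sketch that applies the pattern from the chip_affy example to a few candidate identifiers. It assumes the value is used as an ordinary Perl regular expression; how Data::Babel itself applies the pattern is not shown here, and the candidate ids are illustrative.

```perl
use strict;
use warnings;

# pattern copied from the chip_affy example above, without the slashes
my $format=qr/^[a-z]+\d+/;

# illustrative candidates, not taken from a real database
my @candidates=qw(hgu133a mgu74a 133a HGU133A);

# record which candidates match; note the pattern as written is case sensitive
my %ok;
$ok{$_}=($_=~$format? 1: 0) for @candidates;

print "$_ ",($ok{$_}? 'matches': 'does not match'),"\n" for @candidates;
```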

As of version 1.11, it is also possible to specify 'history' for an IdType. Previously, you could only specify 'history' for the IdType's Master.

Master

[gene_entrez_master]
inputs=<<INPUTS
MainData/GeneInformation
INPUTS
query=<<QUERY
SELECT locus_link_eid AS gene_entrez FROM gene_information 
QUERY

The section name is the Master name; the name of the IdType is the same but without the '_master'. The 'inputs' and 'query' parameters are used by our database construction procedure and may not be useful in other settings.

The next example illustrates a Master that includes history information.

[gene_entrez_master]
inputs=<<INPUTS
MainData/GeneInformation MainData/GeneHistory
INPUTS
query=<<QUERY
SELECT old_locus_link_eid AS _ANY_gene_entrez, locus_link_eid AS gene_entrez
FROM gene_information LEFT OUTER JOIN gene_history 
  ON locus_link_eid=new_locus_link_eid
QUERY
history=1

A Master without history is implemented as a one column table whose column has the same name as the IdType.

A Master with history is implemented as a two column table: one column has the same name as the IdType and the other has the prefix '_X_' prepended to the IdType. The '_X_' column contains ids that were valid in the past or are valid now. Each row maps the '_X_' id to its current value, if any, or NULL. For ids that are valid now, the table contains a row in which the '_X_' and current versions are the same.
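
The layout of a Master with history can be pictured as a lookup from the '_X_' column to the current column. The following sketch models such a table as an in-memory list of rows with made-up ids. It only illustrates the row semantics described above and is not how Data::Babel queries the database.

```perl
use strict;
use warnings;

# Hypothetical rows of a two column Master: [_X_gene_entrez, gene_entrez]
# 1 and 11 are current, 10 was replaced by 11, 20 has no current value
my @master=([1,1],[10,11],[11,11],[20,undef]);

# index current values by '_X_' id
my %current;
$current{$_->[0]}=$_->[1] for @master;

# lookup (hypothetical helper): return (valid,current) for an id
# valid means the id appears in the '_X_' column
sub lookup {
  my($id)=@_;
  return exists($current{$id})? (1,$current{$id}): (0,undef);
}

for my $id (1,10,20,99) {
  my($valid,$now)=lookup($id);
  print "$id: valid=$valid current=",(defined $now? $now: 'NULL'),"\n";
}
```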

MapTable

[gene_entrez_information]
inputs=MainData/GeneInformation 
idtypes=gene_entrez gene_symbol gene_description organism_name_common
query=<<QUERY
SELECT 
       GENE.locus_link_eid AS gene_entrez, 
       GENE.symbol AS gene_symbol, 
       GENE.description AS gene_description,
       ORG.common_name AS organism_name_common
FROM 
       gene_information AS GENE
       LEFT OUTER JOIN
       organism AS ORG ON GENE.organism_id=ORG.organism_id
QUERY

[% maptable %]
inputs=MainData/GeneUnigene
idtypes=gene_entrez gene_unigene
query=<<QUERY
SELECT UG.locus_link_eid AS gene_entrez, UG.unigene_eid AS gene_unigene
FROM   gene_unigene AS UG
QUERY

This excerpt has two MapTable definitions which illustrate two ways that MapTables can be named. The first uses a normal section name; the second invokes a Template Toolkit macro which generates unique names of the form 'maptable_001'. This is very convenient because Babel databases typically contain a large number of MapTables, and it's hard to come up with good names for most of them. In any case, the names don't matter much, because software generates the queries that operate on these tables.

The 'inputs' and 'query' parameters are used by our database construction procedure and may not be useful in other settings.

Input ids that do not connect to any outputs

By default, the 'translate' method does not return any output for input identifiers that do not connect to any identifiers of the desired output types; such rows would have NULL in every output column. You can instruct 'translate' to include these rows in the result by setting the 'validate' option.

An input identifier can fail to connect for two reasons:

1. The identifier is not valid, in other words, it does not exist in the Master table for the input IdType.
2. The identifier is valid but doesn't connect to any ids of the desired output types. This is normal.

If you set the 'validate' option, the output will contain at least one row for each input identifier, and an additional column that indicates whether each input identifier is valid.

If no output IdTypes are specified, 'translate' returns a row containing one element, namely, the input identifier, for each input id that exists in the corresponding Master table. If the 'validate' option is set, the output will contain one row for each input identifier; this is essentially a (possibly re-ordered) copy of the input list with duplicates removed.
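
With 'validate' set, each row carries the input id, a validity flag, and then the outputs, so the two failure cases above can be told apart after the fact. A minimal sketch, using made-up rows and placeholder output values in place of a real 'translate' call:

```perl
use strict;
use warnings;

# Hypothetical result of translate with validate=>1 and two output idtypes.
# Row layout: input id, validity flag, then one column per output idtype.
my $table=[[1,1,'sym1','ens1'],
           [2,1,undef,undef],      # valid but connects to no outputs
           [3,0,undef,undef]];     # not in the Master table

my(@connected,@unconnected,@invalid);
for my $row (@$table) {
  my($input_id,$valid,@outputs)=@$row;
  if (!$valid) {
    push(@invalid,$input_id);
  } elsif (grep {defined $_} @outputs) {
    push(@connected,$input_id);
  } else {
    push(@unconnected,$input_id);
  }
}
print "connected: @connected\n";
print "valid but unconnected: @unconnected\n";
print "invalid: @invalid\n";
```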

Technical details

A basic Babel property is that translations are stable. You can add output types to a query without changing the answer for the types you had before, you can remove output types from the query without changing the answer for the ones that remain, and if you "reverse direction" and swap the input type with one of the outputs, you get everything that was in the original answer.

We accomplish this by requiring that the database of MapTables satisfy the universal relation property (a well-known concept in relational database theory), and that 'translate' retrieve a sub-table of the universal relation. Concretely, the universal relation is the natural full outer join of all the MapTables. 'translate' performs natural left outer joins starting with the Master table for the input IdType, and then including enough tables to connect the input and output IdTypes. Left outer joins suffice, because 'translate' starts with the Master.

We further require that the database of MapTables be non-redundant. The basic idea is that a given IdType may not be present in multiple MapTables, unless it is being used as a join column. More technically, we require that the MapTables form a tree schema (another well-known concept in relational database theory), and that any pair of MapTables have at most one IdType in common. As a consequence, there is essentially a single path between any pair of IdTypes.

To represent the connections between IdTypes and MapTables we use an undirected graph whose nodes represent IdTypes and MapTables, and whose edges go between each MapTable and the IdTypes it contains. In this representation, a non-redundant schema is a tree.

'translate' uses this graph to find the MapTables it must join to connect the input and output IdTypes. The algorithm is simple: start at the leaves and recursively prune back branches that do not contain the input or output IdTypes.
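
The pruning step can be sketched in a few lines. The example below assumes the schema graph is given as a HASH of adjacency lists over hypothetical node ids (the MapTable names and their contents are made up), and repeatedly removes leaf nodes that are not among the requested IdTypes. It illustrates the idea rather than reproducing the module's implementation.

```perl
use strict;
use warnings;

# prune(\%graph,@keep): repeatedly delete leaves that are not in @keep.
# %graph maps each node to an ARRAY of its neighbors (undirected tree).
# Returns the sorted list of nodes that remain.
sub prune {
  my($graph,@keep)=@_;
  my %keep=map {($_,1)} @keep;
  # copy adjacency lists into hashes so the caller's graph is not modified
  my %adj;
  for my $node (keys %$graph) {
    $adj{$node}={map {($_,1)} @{$graph->{$node}}};
  }
  my @leaves=grep {!$keep{$_} && scalar(keys %{$adj{$_}})<=1} keys %adj;
  while (defined(my $leaf=shift @leaves)) {
    next unless exists $adj{$leaf};          # already pruned
    for my $nbr (keys %{$adj{$leaf}}) {
      delete $adj{$nbr}{$leaf};
      # the neighbor may have become a prunable leaf
      push(@leaves,$nbr) if !$keep{$nbr} && scalar(keys %{$adj{$nbr}})<=1;
    }
    delete $adj{$leaf};
  }
  return sort keys %adj;
}

# hypothetical schema: which IdTypes each MapTable contains
my %contains=(mt1=>[qw(gene_entrez gene_symbol)],
              mt2=>[qw(gene_entrez gene_unigene)],
              mt3=>[qw(gene_entrez probe_affy)],
              mt4=>[qw(probe_affy chip_affy)]);
# build the undirected graph; edges go between MapTables and their IdTypes
my %graph;
for my $maptable (keys %contains) {
  for my $idtype (@{$contains{$maptable}}) {
    push(@{$graph{"maptable:$maptable"}},"idtype:$idtype");
    push(@{$graph{"idtype:$idtype"}},"maptable:$maptable");
  }
}
# connect input gene_symbol to output probe_affy
my @needed=prune(\%graph,'idtype:gene_symbol','idtype:probe_affy');
my @join=grep {/^maptable:/} @needed;
print "MapTables to join: @join\n";
```

The nodes that survive pruning form the single path between the requested IdTypes, and the MapTables on that path are the ones to join.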

METHODS AND FUNCTIONS

new

Title   : new 
Usage   : $babel=new Data::Babel
                     name=>$name,
                     idtypes=>$idtypes,masters=>$masters,maptables=>$maptables 
Function: Create new Data::Babel object or fetch existing object from database
          and update its components.  Store the new or updated object.
Returns : Data::Babel object
Args    : name        eg, 'test'
          idtypes, masters, maptables
                      define component objects; see below
          old         existing Data::Babel object in case program already
                      fetched it (typically via 'old')
          autodb      Class::AutoDB object for database containing Babel.
                      class method often set before running 'new'
Notes   : 'name' is required. All other args are optional

The component object parameters can be any of the following:

1. filenames referring to configuration files that define the component objects
2. any other file descriptors that can be handled by the new method of Config::IniFiles, eg, filehandles and IO::File objects
3. objects of the appropriate type for each component, namely, Data::Babel::IdType, Data::Babel::Master, Data::Babel::MapTable, respectively
4. ARRAYs of the above

old

Title   : old 
Usage   : $babel=old Data::Babel($name)
          -- OR --
          $babel=old Data::Babel(name=>$name)
Function: Fetch existing Data::Babel object from database          
Returns : Data::Babel object or undef
Args    : name of Data::Babel object, eg, 'test'
          if keyword form used, can also specify autodb to set the
          corresponding class attribute

attributes

The available object attributes are

name       eg, 'test' 
id         name prefixed with 'babel', eg, 'babel:test'. not really used.  
           exists for compatibility with component objects
idtypes    ARRAY of this Babel's Data::Babel::IdType objects
masters    ARRAY of this Babel's Data::Babel::Master objects
maptables  ARRAY of this Babel's Data::Babel::MapTable objects

The available class attributes are

autodb     Class::AutoDB object for database containing Babel

translate

Title   : translate 
Usage   : $table=$babel->translate
                    (input_idtype=>'gene_entrez',
                     input_ids=>[1,2,3],
                     filters=>{chip_affy=>'hgu133a'},
                     output_idtypes=>[qw(transcript_refseq transcript_ensembl)],
                     limit=>100)
Function: Translate the input ids to ids of the output types
Returns : table represented as an ARRAY of ARRAYS. Each inner ARRAY is one row
          of the result. The first element of each row is an input id. If the
          validate option is set, the second element of each row indicates
          whether the input id is valid. The rest are outputs in the same order
          as output_idtypes
Args    : input_idtype   name of Data::Babel::IdType object or object
          input_ids      id or ARRAY of ids to be translated. If absent or
                         undef, all ids of the input type are translated. If an
                         empty array, ie, [], no ids are translated and the 
                         result will be empty.
          input_ids_all  boolean. If true, all ids of the input type are
                         translated. Same as omitting input_ids or setting it
                         to undef but more explicit.
          filters        HASH or ARRAY of conditions limiting the output; see 
                         below.
          output_idtypes ARRAY of names of Data::Babel::IdType objects or
                         objects
          validate       boolean. If true, the output will contain at least one
                         row for each input id and an additional column 
                         indicating whether the input id is valid.
          limit          maximum number of rows to retrieve
          count          boolean. If true, return number of output rows rather 
                         than the rows themselves. Equivalent to 'count'
                         method.

Notes on translate

  • 'translate' retains duplicate output columns.

  • The order of output rows is arbitrary.

  • If input_ids is absent or undef, it translates all ids of the input type.

  • Duplicate input_ids are ignored.

  • If input_ids is an empty ARRAY, ie, [], the result will be empty.

  • It is an error to set both input_ids and input_ids_all.

  • It is legal but odd to specify a filter on the input idtype. This effectively computes the intersection of the input and filter ids.

  • Input and filter ids can be old (valid in the past) or current (valid now). Output ids are always current.

  • By default, 'translate' does not return rows in which the output columns are all NULL. Setting 'validate' changes this and ensures that every input id will appear in the output.

  • If 'count' and 'limit' are both set, the result is the number of output rows after the limit is applied and will always be <= the limit.

  • If 'validate' and 'limit' are both set, the result may not contain all input ids if doing so would produce more rows than the limit. This defeats one of the purposes of 'validate', namely to ensure that all input ids appear in the output.

  • If 'count' and 'validate' are both set, the result is the number of output rows including ones added by 'validate', ie, rows in which all output columns are NULL.

  • If 'validate' and 'filters' are both set, the result may contain input ids excluded by the filter. These rows will have NULLs in all output columns.

  • If no output idtypes are specified, the output will contain one row for each valid input id (by default) or one row for each id whether valid or not (if 'validate' is set).

  • Comparisons are case insensitive. This includes input ids, filters, and internal comparisons used to join database tables. For example, when translating the gene symbol 'HTT' (the human Huntington Disease gene), you will also get information on gene symbol 'Htt' (the mouse and rat ortholog of the human gene) assuming, of course, this information is in the database.

Filters

The 'filters' argument is a HASH or ARRAY of types and values. The types can be names of Data::Babel::IdType objects or objects themselves. The values can be single ids, ARRAYs of ids, or undef; the ARRAYs may also contain undef. For example

filters=>{chip_affy=>'hgu133a'}
filters=>{chip_affy=>['hgu133a','hgu133plus2']}
filters=>{chip_affy=>['hgu133a','hgu133plus2'],pathway_kegg_id=>4610}
filters=>{chip_affy=>['hgu133a','hgu133plus2'],pathway_kegg_id=>undef}
filters=>{chip_affy=>'hgu133a',pathway_kegg_id=>[undef,4610]}

If the argument is an ARRAY, it is possible for the same type to appear multiple times in which case the values are combined. For example,

filters=>[chip_affy=>'hgu133a',chip_affy=>'hgu133plus2']

is equivalent to

filters=>{chip_affy=>['hgu133a','hgu133plus2']}
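
The equivalence can be made concrete with a small sketch that folds the ARRAY form into the HASH form by collecting values per type. The helper name is hypothetical and the code only mirrors the behavior described here; it is not taken from the module, and it does not address undef values, which have special meaning as explained below.

```perl
use strict;
use warnings;

# combine_filters (hypothetical helper): fold a flat list of type=>value
# pairs into a HASH of ARRAYs, merging the values of repeated types
sub combine_filters {
  my @pairs=@{$_[0]};
  my %combined;
  while (@pairs) {
    my($type,$value)=splice(@pairs,0,2);
    # a value may be a single id or an ARRAY of ids
    push(@{$combined{$type}},ref($value) eq 'ARRAY'? @$value: $value);
  }
  return \%combined;
}

my $filters=combine_filters([chip_affy=>'hgu133a',chip_affy=>'hgu133plus2']);
print "chip_affy: @{$filters->{chip_affy}}\n";
```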

If a filter value is an empty ARRAY, ie, [], the result will be empty.

As noted in "Notes on translate", comparisons are case insensitive.

If a filter value is undef, all ids of the given type are acceptable. This limits the output to rows for which the filter type is not NULL. For example,

$babel->translate(input_idtype=>'gene_entrez',
                  filters=>{pathway_kegg_id=>undef},
                  output_idtypes=>[qw(gene_symbol)])

generates a table of all Entrez Gene ids and gene symbols which appear in any KEGG pathway.

Including undef in an ARRAY lets the output contain rows for which the filter type is NULL. For example,

$babel->translate(input_idtype=>'gene_entrez',
                  filters=>{pathway_kegg_id=>[undef,4610]},
                  output_idtypes=>[qw(gene_symbol)])

generates a table of all Entrez Gene ids and gene symbols which either appear in KEGG pathway 4610 or appear in no KEGG pathway.

CAUTION: undef has opposite semantics depending on whether it's the only value for a filter type or whether it's one of several.
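
The two meanings of undef can be spelled out as a predicate over a single column value. This is a sketch of the semantics as described in this section, with a hypothetical function name; the module works against database tables rather than evaluating filters in Perl like this.

```perl
use strict;
use warnings;

# passes_filter (hypothetical): does a column value satisfy a filter value?
#   filter undef      => any non-NULL value passes
#   filter single id  => value must equal the id
#   filter ARRAY      => value must equal one of the ids; an undef element
#                        additionally lets NULL values pass
# Comparisons ignore case, as described in "Notes on translate".
sub passes_filter {
  my($filter,$value)=@_;
  return defined($value)? 1: 0 if !defined($filter);
  my @ids=ref($filter) eq 'ARRAY'? @$filter: ($filter);
  if (!defined($value)) {
    # NULL passes only if the list explicitly contains undef
    return scalar(grep {!defined($_)} @ids)? 1: 0;
  }
  return scalar(grep {defined($_) && lc($_) eq lc($value)} @ids)? 1: 0;
}

# undef alone: NULL is rejected
print "undef alone, NULL value: ",passes_filter(undef,undef),"\n";
# undef inside an ARRAY: NULL is accepted
print "undef in ARRAY, NULL value: ",passes_filter([undef,4610],undef),"\n";
```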

Histories

'translate' automatically applies histories, when they exist, to input and filter ids. In other words, input and filter ids can be ones that were valid in the past but are not valid now. Output ids, however, are always current.

CAUTION: If the input type is also used as an output, the result can contain rows in which the output id does not equal the input id. This will occur if the input id is old and is mapped to a different current value. Likewise, if a filter type is used as an output, the result can contain rows in which the output id does not match the filter.

count

Title   : count 
Usage   : $number=$babel->count
                    (input_idtype=>'gene_entrez',
                     input_ids=>[1,2,3],
                     filters=>{chip_affy=>'hgu133a'},
                     output_idtypes=>[qw(transcript_refseq transcript_ensembl)])
Function: Count number of output rows that would be generated by 'translate'
Returns : number
Args    : same as 'translate'

'count' is a wrapper for translate that sets the 'count' argument to a true value.

validate

Title   : validate 
Usage   : $table=$babel->validate
                    (input_idtype=>'gene_entrez',
                     input_ids=>[1,2,3])
Function: Tell which input ids are valid now or in the past, and the mapping 
          from old to current values
Returns : table represented as an ARRAY of ARRAYS. Each inner ARRAY is one row
          of the result. If output_idtypes is omitted (the usual case), the 
          elements of each row are
            0) input id as given
            1) validity status. 1 for valid; 0 for invalid
            2) current value of the id or undef if it has no current value; may
               be the same as the original id
          If output_idtypes is set, the result is the same as 'translate' with
          the 'validate' option set
Args    : input_idtype   name of Data::Babel::IdType object or object
          input_ids      id or ARRAY of ids to be translated. If absent or
                         undef, all ids of the input type are translated. If an
                         empty array, ie, [], no ids are translated and the 
                         result will be empty.
          input_ids_all  boolean. If true, all ids of the input type are
                         translated. Same as omitting input_ids or setting it
                         to undef but more explicit.
          output_idtypes optional and usually omitted. ARRAY of names of 
                         Data::Babel::IdType objects or objects. If set,
                         equivalent to calling 'translate' with the 'validate'
                         option set
          limit          maximum number of rows to retrieve

'validate' looks up the given input ids in the Master tables for the given input type and returns a table indicating which ids are valid. For types with history information, the method also indicates the current value of the id. For types that have no history, the current value will always equal the given id if the id is valid.

'validate' can also retrieve a complete table of valid ids (along with history information) for the type.

'validate' is a wrapper for translate that sets the 'validate' argument to a true value and the output_idtypes argument to the input_idtype. All other 'translate' arguments (filters, count) are legal here and work but are of dubious value.

Notes on validate

  • For rows whose validity status is 1 (valid), the given id and current value indicate the history: if the elements are equal, the given id is current; else if the current value is defined, the given id has been replaced by the new one; else the given id was valid in the past but has no current value.

  • For types that have no history, all valid ids are current. If the given id is valid, the given id and current value will be equal; else the current value will be undef.

  • For rows whose status is 0 (invalid), the current value will always be undef.

  • The 'translate' arguments 'filters' and 'count' are legal here and work but are of dubious value.

  • As noted in "Notes on translate", comparisons are case insensitive.
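
The rules above amount to a small decision procedure over each row. A minimal sketch, using made-up rows in place of a real 'validate' call and a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical rows from validate: [input id, validity status, current value]
my $table=[[1,1,1],[10,1,11],[20,1,undef],[99,0,undef]];

# describe (hypothetical helper): classify one row of the result
sub describe {
  my($input_id,$valid,$current_id)=@{$_[0]};
  return 'invalid' unless $valid;
  return 'valid in the past with no current value' unless defined $current_id;
  # compare ignoring case, in keeping with Babel's comparisons
  return lc($input_id) eq lc($current_id)? 'current': "replaced by $current_id";
}

for my $row (@$table) {
  print "Entrez gene $row->[0] is ",describe($row),"\n";
}
```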

show

Title   : show
Usage   : $babel->show
Function: Print object in readable form
Returns : nothing useful
Args    : none

show_schema_graph

Title   : show_schema_graph
Usage   : $babel->show_schema_graph('schema.sif','sif')
Function: Emit schema graph in text or sif format
Returns : nothing useful
Args    : file           output filename. default: standard out
          format         'sif' or 'txt'. default: 'sif'

check_schema

Title   : check_schema
Usage   : @errstrs=$babel->check_schema
          -- OR --
          $ok=$babel->check_schema
Function: Validate schema. Presently checks that schema graph is tree and all
          IdTypes contained in some MapTable
Returns : in array context, list of errors
          in scalar context, true if schema is good, false if schema is bad
Args    : none

check_contents - NOT YET IMPLEMENTED

Title   : check_contents
Usage   : $babel->check_contents
Function: Validate contents of Babel database. Checks consistency of explicit
          Masters and MapTables
Returns : boolean
Args    : none

load_implicit_masters

Title   : load_implicit_masters
Usage   : $babel->load_implicit_masters
Function: Creates database structures for implicit Masters. 
Returns : nothing useful
Args    : none

Babel creates 'implicit' Masters for any IdTypes lacking explicit ones. An implicit Master has a list of valid identifiers and could be implemented as a view over all MapTables containing the IdType. In the current implementation, we use views for IdTypes contained in single MapTables but construct actual tables for IdTypes contained in multiple MapTables.

This method must be called after the real database is loaded.

Objects have names and ids: names are strings like 'gene_entrez' and are unique for a given class of object; ids have a short form of the type prepended to the name, eg, 'idtype:gene_entrez', and are unique across all classes. We use ids as nodes in schema and query graphs. In most cases, applications should use names.

The methods in this section map names or ids to component objects, or (as a trivial convenience), convert ids to names.
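
The relationship between ids and names can be illustrated with a short sketch. The helper below is hypothetical and assumes the single ':' separator used in the 'idtype:gene_entrez' example; in real code, prefer the 'id2name' and 'id2object' methods described below.

```perl
use strict;
use warnings;

# split_id (hypothetical helper): separate an id such as 'idtype:gene_entrez'
# into its type prefix and name, splitting on the first ':'
sub split_id {
  my($id)=@_;
  my($prefix,$name)=split(/:/,$id,2);
  return ($prefix,$name);
}

my($prefix,$name)=split_id('idtype:gene_entrez');
print "prefix=$prefix name=$name\n";
```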

name2idtype

Title   : name2idtype
Usage   : $idtype=$babel->name2idtype('gene_entrez')
Function: Get the IdType object given its name
Returns : Data::Babel::IdType object or undef
Args    : name of object
Notes   : only looks at this Babel's component objects

name2master

Title   : name2master
Usage   : $master=$babel->name2master('gene_entrez_master')
Function: Get the Master object given its name
Returns : Data::Babel::Master object or undef
Args    : name of object
Notes   : only looks at this Babel's component objects

name2maptable

Title   : name2maptable
Usage   : $maptable=$babel->name2maptable('maptable_012')
Function: Get the MapTable object given its name
Returns : Data::Babel::MapTable object or undef
Args    : name of object
Notes   : only looks at this Babel's component objects

id2object

Title   : id2object
Usage   : $object=$babel->id2object('idtype:gene_entrez')
Function: Get object given its id
Returns : Data::Babel::IdType, Data::Babel::Master, Data::Babel::MapTable
          object or undef
Args    : id of object
Notes   : only looks at this Babel's component objects

id2name

Title   : id2name
Usage   : $name=$babel->id2name('idtype:gene_entrez')
          -- OR --
          $name=Data::Babel->id2name('idtype:gene_entrez')
Function: Convert object id to name
Returns : string
Args    : id of object
Notes   : trivial convenience method

METHODS AND ATTRIBUTES OF COMPONENT CLASS Data::Babel::IdType

new

Title   : new 
Usage   : $idtype=new Data::Babel::IdType name=>$name,...
Function: Create new Data::Babel::IdType object or fetch existing object from 
          database and update its components. Store the new or updated object.
Returns : Data::Babel::IdType object
Args    : any attributes listed in the attributes section below, except 'id'
          (because it is computed from name)
          old         existing Data::Babel::IdType object in case program
                      already fetched it (typically via 'old')
          autodb      Class::AutoDB object for database containing Babel.
                      class method often set before running 'new'
Notes   : 'name' is required. All other args are optional

old

Title   : old 
Usage   : $idtype=old Data::Babel::IdType($name)
          -- OR --
          $idtype=old Data::Babel::IdType(name=>$name)
Function: Fetch existing Data::Babel::IdType object from database          
Returns : Data::Babel::IdType object or undef
Args    : name of Data::Babel::IdType object, eg, 'gene_entrez'
          if keyword form used, can also specify autodb to set the
          corresponding class attribute

attributes

The available object attributes are

name          eg, 'gene_entrez' 
id            name prefixed with 'idtype', eg, 'idtype:gene_entrez'
master        Data::Babel::Master object for this IdType
maptables     ARRAY of Data::Babel::MapTable objects containing this IdType
external      boolean indicating whether this is a regular external ID or one
              intended for internal use
internal      opposite of external
history       boolean indicating whether this IdType's Master contains history
              information
tablename     name of this IdType's Master's table
display_name  human readable name, eg, 'Entrez Gene ID'; for internal 
              identifiers, a warning is appended to the end
referent      the type of things to which this type of identifier refers
defdb         the database, if any, which assigns identifiers
meta          meta-type: eid (meaning synthetic), symbol, name, description
format        Perl format of valid identifiers, eg, /^\d+$/
perl_format   synonym for format
sql_type      SQL data type, eg, INT(11)

The available class attributes are

autodb     Class::AutoDB object for database containing Babel

degree

Title   : degree 
Usage   : $number=$idtype->degree
Function: Tell how many Data::Babel::MapTables contain this IdType          
Returns : number
Args    : none

METHODS AND ATTRIBUTES OF COMPONENT CLASS Data::Babel::Master

new

Title   : new 
Usage   : $master=new Data::Babel::Master name=>$name,idtype=>$idtype,...
Function: Create new Data::Babel::Master object or fetch existing object from 
          database and update its components. Store the new or updated object.
Returns : Data::Babel::Master object
Args    : any attributes listed in the attributes section below, except 'id'
          (because it is computed from name)
          old         existing Data::Babel::Master object in case program
                      already fetched it (typically via 'old')
          autodb      Class::AutoDB object for database containing Babel.
                      class method often set before running 'new'
Notes   : 'name' is required. All other args are optional

old

Title   : old 
Usage   : $master=old Data::Babel::Master($name)
          -- OR --
          $master=old Data::Babel::Master(name=>$name)
Function: Fetch existing Data::Babel::Master object from database          
Returns : Data::Babel::Master object or undef
Args    : name of Data::Babel::Master object, eg, 'gene_entrez_master'
          if keyword form used, can also specify autodb to set the
          corresponding class attribute

attributes

The available object attributes are

name          eg, 'gene_entrez_master' 
id            name prefixed with 'master', eg, 'master:gene_entrez_master'
idtype        Data::Babel::IdType object for which this is the Master
implicit      boolean indicating whether Master is implicit
explicit      opposite of implicit
view          boolean indicating whether Master is implemented as a view
history       boolean indicating whether Master contains history information.
tablename     synonym for name
inputs, namespace, query
              used by our database construction procedure

The available class attributes are

autodb     Class::AutoDB object for database containing Babel

degree

Title   : degree 
Usage   : $number=$master->degree
Function: Tell how many Data::Babel::MapTables contain this Master's IdType          
Returns : number
Args    : none

METHODS AND ATTRIBUTES OF COMPONENT CLASS Data::Babel::MapTable

new

Title   : new 
Usage   : $maptable=new Data::Babel::MapTable name=>$name,idtypes=>$idtypes,...
Function: Create new Data::Babel::MapTable object or fetch existing object from 
          database and update its components. Store the new or updated object.
Returns : Data::Babel::MapTable object
Args    : any attributes listed in the attributes section below, except 'id'
          (because it is computed from name)
          old         existing Data::Babel::MapTable object in case program
                      already fetched it (typically via 'old')
          autodb      Class::AutoDB object for database containing Babel.
                      class method often set before running 'new'
Notes   : 'name' is required. All other args are optional

old

Title   : old 
Usage   : $maptable=old Data::Babel::MapTable($name)
          -- OR --
          $maptable=old Data::Babel::MapTable(name=>$name)
Function: Fetch existing Data::Babel::MapTable object from database          
Returns : Data::Babel::MapTable object or undef
Args    : name of Data::Babel::MapTable object, eg, 'gene_entrez_information'
          if keyword form used, can also specify autodb to set the
          corresponding class attribute

attributes

The available object attributes are

name          eg, 'gene_entrez_information' 
id            name prefixed with 'maptable', eg, 'maptable:gene_entrez_information'
idtypes       ARRAY of Data::Babel::IdType objects contained by this MapTable
inputs, namespace, query
              used by our database construction procedure

The available class attributes are

autodb     Class::AutoDB object for database containing Babel

SEE ALSO

I'm not aware of anything.

AUTHOR

Nat Goodman, <natg at shore.net>

BUGS AND CAVEATS

Please report any bugs or feature requests to bug-data-babel at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Data-Babel. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

Known Bugs and Caveats

1. The attributes of Master and MapTable objects are overly specific to the procedure we use to construct databases and may not be useful in other settings.
2. This class uses Class::AutoDB to store its metadata and inherits all the Known Bugs and Caveats of that module.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc Data::Babel

ACKNOWLEDGEMENTS

This module extends a version developed by Victor Cassen.

LICENSE AND COPYRIGHT

Copyright 2012 Institute for Systems Biology

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.