NAME

Bio::LITE::Taxonomy::NCBI - Lightweight and efficient NCBI taxonomic manager

SYNOPSIS

use Bio::LITE::Taxonomy::NCBI;

my $taxDB = Bio::LITE::Taxonomy::NCBI->new (
                                            db=>"NCBI",
                                            names=> "/path/to/names.dmp",
                                            nodes=>"/path/to/nodes.dmp"
                                           );

my $tax = $taxDB->get_taxonomy(1442); # 1442 is a Taxid
my $taxid = $taxDB->get_taxid_from_name("Bacteroidetes");
my $term = $taxDB->get_term_at_level(1442,"family");

my $taxDB2 = Bio::LITE::Taxonomy::NCBI-> new (
                                              db=>"NCBI",
                                              names=> "/path/to/names.dmp",
                                              nodes=>"/path/to/nodes.dmp",
                                              dict=>"/path/to/dictionary/file",
                                             );
my $tax2 = $taxDB2->get_taxonomy_from_gi(12553);

# Methods from Bio::LITE::Taxonomy::NCBI::Gi2taxid
# can also be called directly:

my $taxid2 = $taxDB2->get_taxid(12553);

DESCRIPTION

This module provides easy and efficient access to the NCBI taxonomy with minimal dependencies and without intermediary databases.

This module is not part of the BioPerl bundle. For BioPerl alternatives, see the "SEE ALSO" section of this document.

CONSTRUCTOR

new (%ARGS)

Creates a Bio::LITE::Taxonomy::NCBI object.

The following parameters are accepted:

names

The location of the names.dmp file. Filehandles are also allowed. Mandatory.

nodes

The location of the nodes.dmp file. Filehandles are also allowed. Mandatory.

synonyms

An array reference listing the categories of synonymous names made available to the methods get_taxid_from_name and get_taxonomy_from_name. This parameter is optional and defaults to ['synonym'].

As of May 2015, meaningful values are: acronym, anamorph, authority, blast name, common name, equivalent name, genbank acronym, genbank anamorph, genbank common name, genbank synonym, in-part, includes, misnomer, misspelling, synonym, teleomorph, type material.

my $taxDB = Bio::LITE::Taxonomy::NCBI->new (
                                            db=>"NCBI",
                                            names=> "/path/to/names.dmp",
                                            nodes=>"/path/to/nodes.dmp",
                                            synonyms=>['anamorph','teleomorph','synonym']
                                           );

dict

You can query the tree using GIs directly instead of Taxids. To do this, provide NCBI's GI-to-Taxid mapping in binary format, as explained in Bio::LITE::Taxonomy::NCBI::Gi2taxid. Optional.

save_mem

Use this option to avoid loading the binary dictionary (GI to Taxid) into memory. This saves almost 1GB of system memory, but Taxid lookups will be ~20% slower. This parameter is optional, only makes sense if you are using the GI-to-Taxid dictionary, and is off by default.
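
The optional parameters above can be combined in a single constructor call. The following is a minimal sketch, not a definitive recipe: the paths are the placeholders used in the SYNOPSIS, and passing 1 to save_mem is an assumption (the text above only says the option is off by default).

```perl
use strict;
use warnings;
use Bio::LITE::Taxonomy::NCBI;

# Placeholder paths, as in the SYNOPSIS; point them at your own files.
open my $names_fh, '<', '/path/to/names.dmp'
    or die "Cannot open names.dmp: $!";
open my $nodes_fh, '<', '/path/to/nodes.dmp'
    or die "Cannot open nodes.dmp: $!";

my $taxDB = Bio::LITE::Taxonomy::NCBI->new(
    db       => 'NCBI',
    names    => $names_fh,                   # filehandle instead of a path
    nodes    => $nodes_fh,                   # filehandle instead of a path
    dict     => '/path/to/dictionary/file',  # binary GI-to-Taxid dictionary
    save_mem => 1,                           # assumption: a true value enables it
);

# GI-based lookups are available once a dictionary is supplied.
my $taxid = $taxDB->get_taxid(12553);
print "Taxid: $taxid\n" if defined $taxid;
```

Whether the memory saving is worth the slower lookups depends on how many GIs you resolve per run; for a handful of queries the slowdown is unlikely to matter.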

METHODS

This module inherits from Bio::LITE::Taxonomy, so all the methods explained there are accessible. These methods are also available:

get_taxonomy_from_gi

Accepts a GI as input and returns an array with its ascendants ordered from top to bottom.

my @tax = $taxDB->get_taxonomy_from_gi($gi);
print "$_\n" for (@tax);

If called in scalar context, returns an array reference instead of the array. See Bio::LITE::Taxonomy::get_taxonomy.

get_taxonomy_with_levels_from_gi

The same as get_taxonomy_from_gi, but instead of returning only the ascendants it returns an array of array references. Each array reference holds the ascendant and its taxonomic level (at positions 0 and 1 respectively). This is simpler than it sounds; see Bio::LITE::Taxonomy::get_taxonomy_with_levels for more information.

If called in scalar context, returns an array reference instead of the array.
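
To make the shape of the returned value concrete, here is a small sketch that walks a list of [name, level] pairs. The helper format_levels is hypothetical (it is not part of this module), and the sample data is hand-written to match the structure described above; it is not real module output.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: turn a list of
# [name, level] pairs into one "level: name" string per ascendant.
sub format_levels {
    my @pairs = @_;
    my @lines;
    for my $pair (@pairs) {
        my ($name, $level) = @$pair;   # position 0 is the name, 1 the level
        push @lines, "$level: $name";
    }
    return @lines;
}

# Hand-written sample in the documented shape (not real module output).
my @sample = (
    [ 'Bacteria',       'superkingdom' ],
    [ 'Proteobacteria', 'phylum'       ],
    [ 'Escherichia',    'genus'        ],
);

print "$_\n" for format_levels(@sample);
```

With a dictionary-enabled object such as $taxDB2 from the SYNOPSIS, the same helper could be fed directly in list context: format_levels($taxDB2->get_taxonomy_with_levels_from_gi(12553)).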

get_term_at_level_from_gi

Given a GI and a taxonomic level as input, returns the taxon. For example:

my $taxon = $taxDB->get_term_at_level_from_gi($gi,"family");

See Bio::LITE::Taxonomy::get_term_at_level.
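
A common use of a per-GI lookup like this is tallying how many sequences fall into each family. The sketch below is one possible way to do that. The helper count_terms is hypothetical, the GIs and the lookup table are made up so the code runs without the module, and treating undef or an empty string as a failed lookup is an assumption about the return value.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: count how many GIs map
# to each term. $lookup is any code reference mapping a GI to a term.
sub count_terms {
    my ($lookup, @gis) = @_;
    my %count;
    for my $gi (@gis) {
        my $term = $lookup->($gi);
        # Assumption: a failed lookup yields undef or an empty string.
        $term = 'unknown' unless defined $term && length $term;
        $count{$term}++;
    }
    return \%count;
}

# Stand-in lookup with made-up GIs so the sketch runs without the module.
my %fake = (1 => 'Bacillaceae', 2 => 'Bacillaceae', 3 => 'Enterobacteriaceae');
my $counts = count_terms(sub { $fake{ $_[0] } }, 1, 2, 3, 4);

printf "%-20s %d\n", $_, $counts->{$_} for sort keys %$counts;
```

To use the real method, replace the stand-in with a closure such as sub { $taxDB2->get_term_at_level_from_gi($_[0], 'family') }, where $taxDB2 is a dictionary-enabled object as in the SYNOPSIS.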

SEE ALSO

Bio::LITE::Taxonomy::NCBI::Gi2taxid - Module to convert NCBI GIs to Taxids

Bio::LITE::Taxonomy

Bio::LITE::Taxonomy::RDP

Bio::DB::Taxonomy::* - BioPerl alternative for NCBI taxonomies.

AUTHOR

Miguel Pignatelli

Any comments or suggestions should be addressed to emepyc@gmail.com

CONTRIBUTORS

Denis Baurain (denis.baurain -AT- ulg.ac.be)

LICENSE

Copyright 2015 Miguel Pignatelli, all rights reserved.

This library is free software; you may redistribute it and/or modify it under the same terms as Perl itself.