NAME
GO::AnnotationProvider
DESCRIPTION
GO::AnnotationProvider is an interface that defines an API that should be implemented by specific subclasses, which may read GO annotation from databases, flatfiles, XML files etc.
GO (Gene Ontology) is a project of the Gene Ontology Consortium (http://www.geneontology.org). The GO project has three 'aspects':
Biological Process
Molecular Function
Cellular Component
When a method requires the client to refer to an aspect, the aspect is given by a one-letter shorthand: P, F and C, respectively.
In GO associations, annotated entities may be identified by many different names. Firstly, they should have a database identifier, which should be unique for an entity. Secondly, they should have a standard name. Standard names should be unique among standard names, but it is possible that a standard name of one entity may be used as an alias of another. An entity may have many aliases, and an alias may be used for many entities. Hence, a name (drawn from databaseIds, standard names, and aliases) may be ambiguous in the entity to which it refers. This is an important concept for clients of concrete subclasses to take into consideration, so that unexpected results are avoided.
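The ambiguity described above can be illustrated with a small sketch. The entity records, identifiers and names below are invented for illustration, and the lookup hash is only one possible way a concrete subclass might track names; it is not taken from any existing implementation.

```perl
use strict;
use warnings;

# Hypothetical annotation records: database id, standard name, aliases.
my @entities = (
    { databaseId => 'DB:0001', standardName => 'ABC1', aliases => ['XYZ9'] },
    { databaseId => 'DB:0002', standardName => 'DEF2', aliases => ['ABC1', 'XYZ9'] },
);

# Map every name (database id, standard name or alias) to the set of
# database ids it can refer to.
my %idsForName;
for my $entity (@entities) {
    my @names = ($entity->{databaseId}, $entity->{standardName}, @{ $entity->{aliases} });
    $idsForName{$_}{ $entity->{databaseId} } = 1 for @names;
}

# A name is ambiguous when it resolves to more than one database id.
my @ambiguous = sort grep { scalar(keys %{ $idsForName{$_} }) > 1 } keys %idsForName;

print "ambiguous names: @ambiguous\n";
```

Here ABC1 is the standard name of one entity and an alias of another, and XYZ9 is an alias of both, so both names are ambiguous.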
TODO
Currently this interface dictates that clients can retrieve GOIDs that have been used to annotate genes. In future, this interface is likely to change so that GO::Annotation objects are returned instead of GOIDs; these will carry richer information about a given annotation. Such objects would contain a GO::AnnotatedGene object, one or more GO::Reference objects, and an evidence code. The retrieval of annotations for a given database id could then be extended to allow filtering by evidence codes, to either include or exclude certain codes.
This interface also currently only allows retrieval of GOIDs for genes. In future, it will be extended so that genes can be retrieved by GOID.
Constructor
Because this is an abstract class, there is no constructor. A constructor must be implemented by concrete subclasses.
Public instance methods
All of these public instance methods must be implemented by concrete subclasses.
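As a rough illustration of what a concrete subclass might look like, the sketch below implements a constructor and three of the required methods over an in-memory hash. The package name My::InMemoryProvider, the constructor arguments and the internal data layout are all assumptions made for this example. A real subclass would inherit from GO::AnnotationProvider and implement every public method; the inheritance line is left as a comment so that the sketch runs without the module installed.

```perl
use strict;
use warnings;

package My::InMemoryProvider;    # hypothetical subclass name
# In real use this would declare: use base 'GO::AnnotationProvider';

sub new {
    my ($class, %args) = @_;
    # Assumed layout: { databaseId => { P => [...], F => [...], C => [...] } }
    my $self = {
        annotations  => $args{annotations} || {},
        databaseName => $args{databaseName},
    };
    return bless $self, $class;
}

sub databaseName { return $_[0]->{databaseName} }

sub allDatabaseIds { return sort keys %{ $_[0]->{annotations} } }

sub goIdsByDatabaseId {
    my ($self, %args) = @_;
    my $entry = $self->{annotations}{ $args{databaseId} };
    return unless defined $entry;            # unrecognized id: undef
    return $entry->{ $args{aspect} } || [];  # no annotations: empty array ref
}

package main;

my $provider = My::InMemoryProvider->new(
    databaseName => 'ExampleDB',
    annotations  => { 'DB:0001' => { P => ['GO:0008150'] } },
);

print join(', ', $provider->allDatabaseIds()), "\n";
```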
Some methods dealing with ambiguous names
Because an annotated entity may be referred to by many names, and those names are not necessarily unique, this interface defines a set of methods for determining whether a name is ambiguous, and to which database identifiers an ambiguous name may refer.
nameIsAmbiguous
This public method returns a boolean to indicate whether a name is ambiguous, i.e. whether the name might map to more than one entity (and therefore more than one databaseId).
Usage:
if ($annotationProvider->nameIsAmbiguous($name)){
# do something useful, or not
}
databaseIdsForAmbiguousName
This public method returns an array of database identifiers for an ambiguous name. If the name is not ambiguous, an empty list will be returned.
Usage:
my @databaseIds = $annotationProvider->databaseIdsForAmbiguousName($name);
ambiguousNames
This method returns an array of the names that the annotation source has deemed to be ambiguous.
Usage:
my @ambiguousNames = $annotationProvider->ambiguousNames();
Methods for retrieving GO annotations for entities
goIdsByDatabaseId
This public method returns a reference to an array of GOIDs that are associated with the supplied databaseId for a specific aspect. If no annotations are associated with that databaseId in that aspect, then a reference to an empty array will be returned. If the databaseId is not recognized, then undef will be returned.
Usage:
my $goidsRef = $annotationProvider->goIdsByDatabaseId(databaseId=>$databaseId,
aspect=><P|F|C>);
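Callers need to distinguish three outcomes: undef (unrecognized id), a reference to an empty array (recognized but unannotated in that aspect), and a populated array reference. The following is a minimal sketch of that check. StubProvider and describeAnnotations are hypothetical names, and the stub exists only so the example runs by itself.

```perl
use strict;
use warnings;

# Stand-in provider; hypothetical, not part of GO::AnnotationProvider.
package StubProvider;
my %annotations = (
    'DB:0001' => { P => ['GO:0008150'], F => [], C => [] },
);
sub new { return bless {}, shift }
sub goIdsByDatabaseId {
    my ($self, %args) = @_;
    my $entry = $annotations{ $args{databaseId} } or return;
    return $entry->{ $args{aspect} } || [];
}

package main;

# Summarize the three possible outcomes as a string.
sub describeAnnotations {
    my ($provider, $databaseId, $aspect) = @_;
    my $goidsRef = $provider->goIdsByDatabaseId(databaseId => $databaseId,
                                                aspect     => $aspect);
    return "unknown id"     unless defined $goidsRef;   # undef
    return "no annotations" unless @{$goidsRef};        # empty array ref
    return join ',', @{$goidsRef};                      # one or more GOIDs
}

my $provider = StubProvider->new;
print describeAnnotations($provider, $_, 'P'), "\n" for ('DB:0001', 'DB:9999');
```

Testing with defined() before dereferencing matters here, because dereferencing undef under "use strict" is a fatal error.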
goIdsByStandardName
This public method returns a reference to an array of GOIDs that are associated with the supplied standardName for a specific aspect. If no annotations are associated with the entity with that standard name in that aspect, then a reference to an empty list will be returned. If the supplied name is not used as a standard name, then undef will be returned.
Usage:
my $goidsRef = $annotationProvider->goIdsByStandardName(standardName=>$standardName,
aspect=><P|F|C>);
goIdsByName
This public method returns a reference to an array of GO IDs that are associated with the supplied name for a specific aspect. If there are no GO associations for the entity corresponding to the supplied name in the provided aspect, then a reference to an empty list will be returned. If the supplied name does not correspond to any entity, then undef will be returned. Because the name can be any of the databaseId, the standard name, or any of the aliases, it is possible that the name might be ambiguous. Clients of this object should first test whether the name they are using is ambiguous, using the nameIsAmbiguous() method, and handle it accordingly. If an ambiguous name is supplied, then the method will die.
Usage:
my $goidsRef = $annotationProvider->goIdsByName(name=>$name,
aspect=><P|F|C>);
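One way a client might follow the advice above is to test for ambiguity first and, for an ambiguous name, fall back to a lookup per database id. The sketch below is only an illustration. StubProvider, its data and the helper goIdsForAnyName are hypothetical, and a real provider may resolve names differently.

```perl
use strict;
use warnings;

package StubProvider;    # hypothetical stand-in so the sketch runs alone
my %idsForName = (
    'ABC1' => ['DB:0001', 'DB:0002'],    # ambiguous name
    'DEF2' => ['DB:0002'],
);
my %goIds = (
    'DB:0001' => { P => ['GO:0008150'] },
    'DB:0002' => { P => [] },
);
sub new { return bless {}, shift }
sub nameIsAmbiguous {
    my ($self, $name) = @_;
    return @{ $idsForName{$name} || [] } > 1;
}
sub databaseIdsForAmbiguousName {
    my ($self, $name) = @_;
    return $self->nameIsAmbiguous($name) ? @{ $idsForName{$name} } : ();
}
sub goIdsByDatabaseId {
    my ($self, %args) = @_;
    my $entry = $goIds{ $args{databaseId} } or return;
    return $entry->{ $args{aspect} } || [];
}
sub goIdsByName {
    my ($self, %args) = @_;
    die "ambiguous name: $args{name}\n" if $self->nameIsAmbiguous($args{name});
    my $ids = $idsForName{ $args{name} } or return;
    return $self->goIdsByDatabaseId(databaseId => $ids->[0], aspect => $args{aspect});
}

package main;

# Returns a hash ref keyed by database id (or by the name itself when it is
# unambiguous); each value is an array ref of GOIDs, or undef.
sub goIdsForAnyName {
    my ($provider, $name, $aspect) = @_;
    my %result;
    if ($provider->nameIsAmbiguous($name)) {
        for my $databaseId ($provider->databaseIdsForAmbiguousName($name)) {
            $result{$databaseId} =
                $provider->goIdsByDatabaseId(databaseId => $databaseId, aspect => $aspect);
        }
    }
    else {
        $result{$name} = $provider->goIdsByName(name => $name, aspect => $aspect);
    }
    return \%result;
}

my $provider = StubProvider->new;
my $byId     = goIdsForAnyName($provider, 'ABC1', 'P');
print "$_: @{ $byId->{$_} }\n" for sort keys %{$byId};
```

Checking nameIsAmbiguous() up front avoids having to wrap every goIdsByName() call in an eval block.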
Methods for mapping different types of name to each other
standardNameByDatabaseId
This method returns the standard name for a database id.
Usage:
my $standardName = $annotationProvider->standardNameByDatabaseId($databaseId);
databaseIdByStandardName
This method returns the database id for a standard name.
Usage:
my $databaseId = $annotationProvider->databaseIdByStandardName($standardName);
databaseIdByName
This method returns the database id for any identifier for a gene (eg by databaseId itself, by standard name, or by alias). If the used name is ambiguous, then the program will die. Thus clients should call the nameIsAmbiguous() method, prior to using this method. If the name does not map to any databaseId, then undef will be returned.
Usage:
my $databaseId = $annotationProvider->databaseIdByName($name);
standardNameByName
This public method returns the standard name for the gene specified by the given name. Because a name may be ambiguous, the nameIsAmbiguous() method should be called first. If an ambiguous name is supplied, then the method will die with an appropriate error message. If the name does not map to a standard name, then undef will be returned.
Usage:
my $standardName = $annotationProvider->standardNameByName($name);
Other methods relating to names
nameIsStandardName
This method returns a boolean to indicate whether the supplied name is used as a standard name.
Usage:
if ($annotationProvider->nameIsStandardName($name)){
# do something
}
nameIsDatabaseId
This method returns a boolean to indicate whether the supplied name is used as a database id.
Usage:
if ($annotationProvider->nameIsDatabaseId($name)){
# do something
}
Other public methods
databaseName
This method returns the name of the annotating authority of the annotations.
Usage:
my $databaseName = $annotationProvider->databaseName();
numAnnotatedGenes
This method returns the number of entities in the annotation file that have annotations in the supplied aspect. If no aspect is provided, then it will return the number of genes with an annotation in at least one aspect of GO.
Usage:
my $numAnnotatedGenes = $annotationProvider->numAnnotatedGenes();
my $numAnnotatedGenes = $annotationProvider->numAnnotatedGenes($aspect);
allDatabaseIds
This public method returns an array of all the database identifiers.
Usage:
my @databaseIds = $annotationProvider->allDatabaseIds();
allStandardNames
This public method returns an array of all standard names.
Usage:
my @standardNames = $annotationProvider->allStandardNames();
Protected Methods
_handleMissingArgument
This protected method gives concrete subclasses a simple way to deal with arguments missing from method calls. It will die with an appropriate error message.
Usage:
$self->_handleMissingArgument(argument=>'blah');
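The sketch below shows how a subclass method might use this helper to validate its named arguments. The package My::Provider is hypothetical, and the _handleMissingArgument defined here is a stand-in written for the example. The wording of the real error message is not documented above and may differ.

```perl
use strict;
use warnings;

package My::Provider;    # hypothetical subclass

# Stand-in for the inherited protected method; the real message may differ.
sub _handleMissingArgument {
    my ($self, %args) = @_;
    my $caller = (caller(1))[3] || 'unknown method';
    die "The method $caller requires a '$args{argument}' argument.\n";
}

sub new { return bless {}, shift }

sub goIdsByDatabaseId {
    my ($self, %args) = @_;
    # Die early if either required named argument is absent.
    my $databaseId = $args{databaseId}
        // $self->_handleMissingArgument(argument => 'databaseId');
    my $aspect = $args{aspect}
        // $self->_handleMissingArgument(argument => 'aspect');
    return [];    # actual lookup elided in this sketch
}

package main;

# Calling without an aspect triggers the helper; trap the die to inspect it.
my $error;
eval { My::Provider->new->goIdsByDatabaseId(databaseId => 'DB:0001'); 1 }
    or $error = $@;
print $error;
```

The defined-or operator (//) requires Perl 5.10 or later; on older versions an explicit defined() test would be needed.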
AUTHOR
Gavin Sherlock, sherlock@genome.stanford.edu