NAME
Bio::Chado::Schema::Result::Sequence::Feature
DESCRIPTION
A feature is a biological sequence or a section of a biological sequence, or a collection of such sections. Examples include genes, exons, transcripts, regulatory regions, polypeptides, protein domains, chromosome sequences, sequence variations, cross-genome match regions such as hits and HSPs and so on; see the Sequence Ontology for more. The combination of organism_id, uniquename and type_id should be unique.
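The uniqueness rule above can be checked on candidate rows before they are loaded. The following is a minimal sketch in plain Perl, assuming features are held as hashrefs carrying the three key columns; the helper name find_duplicate_keys and the sample values are hypothetical and not part of Bio::Chado::Schema.

```perl
use strict;
use warnings;

# Hypothetical helper: return the tab-joined organism_id/type_id/uniquename
# keys that occur more than once in a list of feature hashrefs.
sub find_duplicate_keys {
    my @features = @_;
    my %seen;
    $seen{ join "\t", @{$_}{qw(organism_id type_id uniquename)} }++ for @features;
    return sort grep { $seen{$_} > 1 } keys %seen;
}

my @features = (
    { organism_id => 1, type_id => 10, uniquename => 'gene0001' },
    { organism_id => 1, type_id => 11, uniquename => 'gene0001' },  # same name, different type: allowed
    { organism_id => 1, type_id => 10, uniquename => 'gene0001' },  # clashes with the first row
);

print "duplicate key: $_\n" for find_duplicate_keys(@features);
```

In a live database the same rule would normally be enforced by a unique constraint rather than by application code.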
ACCESSORS
feature_id
data_type: 'integer'
is_auto_increment: 1
is_nullable: 0
sequence: 'feature_feature_id_seq'
dbxref_id
data_type: 'integer'
is_foreign_key: 1
is_nullable: 1
An optional primary public stable identifier for this feature. Secondary identifiers and external dbxrefs go in the table feature_dbxref.
organism_id
data_type: 'integer'
is_foreign_key: 1
is_nullable: 0
The organism to which this feature belongs. This column is mandatory.
name
data_type: 'varchar'
is_nullable: 1
size: 255
The optional human-readable common name for a feature, for display purposes.
uniquename
data_type: 'text'
is_nullable: 0
The unique name for a feature; may not necessarily be particularly human-readable, although this is preferred. This name must be unique for this type of feature within this organism.
residues
data_type: 'text'
is_nullable: 1
A sequence of alphabetic characters representing biological residues (nucleic acids, amino acids). This column does not need to be manifested for all features; it is optional for features such as exons where the residues can be derived from the featureloc. It is recommended that the value for this column be manifested for features which may have non-contiguous sublocations (e.g. transcripts), since derivation at query time is non-trivial. For expressed sequence, the DNA sequence should be used rather than the RNA sequence. The default storage method for the residues column is EXTERNAL, which will store it uncompressed to make substring operations faster.
seqlen
data_type: 'integer'
is_nullable: 1
The length of the residue feature. See column:residues. This column is partially redundant with the residues column, and also with featureloc. This column is required because the location may be unknown and the residue sequence may not be manifested, yet it may be desirable to store and query the length of the feature. The seqlen should always be manifested where the length of the sequence is known.
md5checksum
data_type: 'char'
is_nullable: 1
size: 32
The 32-character checksum of the sequence, calculated using the MD5 algorithm. This is practically guaranteed to be unique for any feature. This column thus acts as a unique identifier on the mathematical sequence.
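As an illustration, a value of this shape can be produced with the core Digest::MD5 module. This is a sketch only: it assumes the checksum is taken over the residues string exactly as stored, which the text above does not specify, and the helper name residues_checksum and the sample sequence are hypothetical.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Assumption: the checksum is computed over the residues string exactly as
# stored, with no case folding or whitespace stripping.
sub residues_checksum {
    my ($residues) = @_;
    return md5_hex($residues);    # 32 lowercase hexadecimal characters
}

my $checksum = residues_checksum('ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG');
printf "%s (%d characters)\n", $checksum, length $checksum;
```

Two features with identical residues yield identical checksums, which is what makes the column usable as an identifier for the sequence itself rather than for the feature row.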
type_id
data_type: 'integer'
is_foreign_key: 1
is_nullable: 0
A required reference to a table:cvterm giving the feature type. This will typically be a Sequence Ontology identifier. This column is thus used to subclass the feature table.
is_analysis
data_type: 'boolean'
default_value: false
is_nullable: 0
Boolean indicating whether this feature is annotated or the result of an automated analysis. Analysis results also use the companalysis module. Note that the dividing line between analysis and annotation may be fuzzy; this should be determined on a per-project basis in a consistent manner. One requirement is that there should only be one non-analysis version of each wild-type gene feature in a genome, whereas the same gene feature can be predicted multiple times in different analyses.
is_obsolete
data_type: 'boolean'
default_value: false
is_nullable: 0
Boolean indicating whether this feature has been obsoleted. Some Chado instances may choose to simply remove the feature altogether; others may choose to keep an obsolete row in the table.
timeaccessioned
data_type: 'timestamp'
default_value: current_timestamp
is_nullable: 0
original: {default_value => \"now()"}
For handling object accession or modification timestamps (as opposed to database auditing data, handled elsewhere). The expectation is that these fields would be available to software interacting with chado.
timelastmodified
data_type: 'timestamp'
default_value: current_timestamp
is_nullable: 0
original: {default_value => \"now()"}
For handling object accession or modification timestamps (as opposed to database auditing data, handled elsewhere). The expectation is that these fields would be available to software interacting with chado.
RELATIONS
analysisfeatures
Type: has_many
Related object: Bio::Chado::Schema::Result::Companalysis::Analysisfeature
cell_line_features
Type: has_many
Related object: Bio::Chado::Schema::Result::CellLine::CellLineFeature
elements
Type: has_many
Related object: Bio::Chado::Schema::Result::Mage::Element
type
Type: belongs_to
Related object: Bio::Chado::Schema::Result::Cv::Cvterm
dbxref
Type: belongs_to
Related object: Bio::Chado::Schema::Result::General::Dbxref
organism
Type: belongs_to
Related object: Bio::Chado::Schema::Result::Organism::Organism
feature_cvterms
Type: has_many
Related object: Bio::Chado::Schema::Result::Sequence::FeatureCvterm
feature_dbxrefs
Type: has_many
Related object: Bio::Chado::Schema::Result::Sequence::FeatureDbxref
feature_expressions
Type: has_many
Related object: Bio::Chado::Schema::Result::Expression::FeatureExpression
feature_genotype_features
Type: has_many
Related object: Bio::Chado::Schema::Result::Genetic::FeatureGenotype
feature_genotype_chromosomes
Type: has_many
Related object: Bio::Chado::Schema::Result::Genetic::FeatureGenotype
featureloc_features
Type: has_many
Related object: Bio::Chado::Schema::Result::Sequence::Featureloc
featureloc_srcfeatures
Type: has_many
Related object: Bio::Chado::Schema::Result::Sequence::Featureloc
feature_phenotypes
Type: has_many
Related object: Bio::Chado::Schema::Result::Phenotype::FeaturePhenotype
featurepos_feature
Type: has_many
Related object: Bio::Chado::Schema::Result::Map::Featurepos
featurepos_map_features
Type: has_many
Related object: Bio::Chado::Schema::Result::Map::Featurepos
featureprops
Type: has_many
Related object: Bio::Chado::Schema::Result::Sequence::Featureprop
feature_pubs
Type: has_many
Related object: Bio::Chado::Schema::Result::Sequence::FeaturePub
featurerange_leftendfs
Type: has_many
Related object: Bio::Chado::Schema::Result::Map::Featurerange
featurerange_rightstartfs
Type: has_many
Related object: Bio::Chado::Schema::Result::Map::Featurerange
featurerange_rightendfs
Type: has_many
Related object: Bio::Chado::Schema::Result::Map::Featurerange
featurerange_leftstartfs
Type: has_many
Related object: Bio::Chado::Schema::Result::Map::Featurerange
featurerange_features
Type: has_many
Related object: Bio::Chado::Schema::Result::Map::Featurerange
feature_relationship_subjects
Type: has_many
Related object: Bio::Chado::Schema::Result::Sequence::FeatureRelationship
feature_relationship_objects
Type: has_many
Related object: Bio::Chado::Schema::Result::Sequence::FeatureRelationship
feature_synonyms
Type: has_many
Related object: Bio::Chado::Schema::Result::Sequence::FeatureSynonym
library_features
Type: has_many
Related object: Bio::Chado::Schema::Result::Library::LibraryFeature
phylonodes
Type: has_many
Related object: Bio::Chado::Schema::Result::Phylogeny::Phylonode
studyprop_features
Type: has_many
Related object: Bio::Chado::Schema::Result::Mage::StudypropFeature
ADDITIONAL RELATIONSHIPS
parent_relationships
Type: has_many
Returns a list of parent relationships.
Related object: Bio::Chado::Schema::Result::Sequence::FeatureRelationship
child_relationships
Type: has_many
Returns a list of child relationships.
Related object: Bio::Chado::Schema::Result::Sequence::FeatureRelationship
primary_dbxref
Alias for dbxref
MANY-TO-MANY RELATIONSHIPS
parent_features
Type: many_to_many
Returns a list of parent features (i.e. features that are the object of feature_relationship rows in which this feature is the subject).
Related object: Bio::Chado::Schema::Result::Sequence::Feature
child_features
Type: many_to_many
Returns a list of child features (i.e. features that are the subject of feature_relationship rows in which this feature is the object).
Related object: Bio::Chado::Schema::Result::Sequence::Feature
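The subject/object direction described for parent_features and child_features is easy to get backwards. The sketch below models it in plain Perl over a hypothetical list of feature_relationship rows; the ids and the helper names parent_ids and child_ids are illustrative only and are not part of the module.

```perl
use strict;
use warnings;

# Hypothetical feature_relationship rows: subject_id is the child and
# object_id is the parent.
my @feature_relationships = (
    { subject_id => 2, object_id => 1 },    # feature 2 has parent 1
    { subject_id => 3, object_id => 2 },    # feature 3 has parent 2
    { subject_id => 4, object_id => 2 },    # feature 4 has parent 2
);

# Parents: the objects of the rows in which the feature is the subject.
sub parent_ids {
    my ($feature_id) = @_;
    return map { $_->{object_id} }
        grep { $_->{subject_id} == $feature_id } @feature_relationships;
}

# Children: the subjects of the rows in which the feature is the object.
sub child_ids {
    my ($feature_id) = @_;
    return map { $_->{subject_id} }
        grep { $_->{object_id} == $feature_id } @feature_relationships;
}

print "parents of 2: @{[ parent_ids(2) ]}\n";
print "children of 2: @{[ child_ids(2) ]}\n";
```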
synonyms
Type: many_to_many
Related object: Bio::Chado::Schema::Result::Sequence::Synonym
dbxrefs_mm
Type: many_to_many
Related object: Bio::Chado::Schema::Result::General::Dbxref (i.e. dbxref table), via Bio::Chado::Schema::Result::Sequence::FeatureDbxref (feature_dbxref table)
secondary_dbxrefs
Alias for dbxrefs_mm
ADDITIONAL METHODS
create_featureprops
Usage: $set->create_featureprops({ baz => 2, foo => 'bar' });
Desc : convenience method to create feature properties using cvterms
       from the ontology with the given name
Args : hashref of { propname => value, ...},
       options hashref as:
        {
          autocreate => 0,
             (optional) boolean, if passed, automatically create cv,
             cvterm, and dbxref rows if one cannot be found for the
             given featureprop name.  Default false.
          cv_name => cv.name to use for the given featureprops.
                     Defaults to 'feature_property',
          db_name => db.name to use for autocreated dbxrefs,
                     default 'null',
          dbxref_accession_prefix => optional, default
                                     'autocreated:',
          definitions => optional hashref of:
              { cvterm_name => definition,
              }
             to load into the cvterm table when autocreating cvterms
          allow_duplicate_values => default false.
             If true, allow duplicate instances of the same cvterm
             and value in the properties of the feature.  Duplicate
             values will have different ranks.
        }
Ret  : hashref of { propname => new featureprop object }
search_featureprops
Status  : public
Usage   : $feat->search_featureprops( 'description' )
          # OR
          $feat->search_featureprops({ name => 'description' })
Returns : DBIx::Class::ResultSet like other search() methods
Args    : single string to match cvterm name,
          or hashref of search criteria.  This is passed
          to $chado->resultset('Cv::Cvterm')
                   ->search({ your criteria })
Convenience method to search the featureprops of a feature whose cvterms match the given criteria.
Bio::PrimarySeqI METHODS
The methods below are intended to provide some compatibility with BioPerl's Bio::PrimarySeqI interface, so that a feature may be used as a sequence. Note that Bio::PrimarySeqI only provides identifier, accession, and sequence information, no subfeatures, ranges, or the like.
Support for BioPerl's more complete Bio::SeqI interface, which includes those things, still needs to be implemented. If you are interested in helping with this, please contact GMOD!
id, primary_id, display_id
These are aliases for name(), which just returns the contents of the feature.name field.
seq
Alias for $feature->residues()
subseq( $start, $end )
Same as the Bio::PrimarySeq subseq method, with one important exception. If the residues column is not set (null) for this feature, it checks for a featureprop of type large_residues (irrespective of the type's CV membership), and uses its value as the sequence if it is present.
So, you can store large (i.e. megabase or greater) sequences in a large_residues featureprop, and use this subseq() method to fetch pieces of them, with the sequences never being entirely stored in memory or transferred in total from the database server to the app server. This is implemented behind the scenes by using SQL substring operations on the featureprop's value.
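For orientation, BioPerl's subseq() takes 1-based, inclusive start and end coordinates. The sketch below shows that coordinate arithmetic on an in-memory string; the function name subseq_sketch is hypothetical, and it does not reproduce the module's database-side implementation, whose exact SQL is not shown here.

```perl
use strict;
use warnings;

# Sketch of 1-based, inclusive subseq() coordinates on an in-memory string.
sub subseq_sketch {
    my ( $residues, $start, $end ) = @_;
    die "start must be >= 1 and <= end\n" unless $start >= 1 && $start <= $end;
    die "end is past the end of the sequence\n" if $end > length $residues;

    # substr() is 0-based and takes a length, so shift the start down by one
    # and convert the inclusive end into a character count.
    return substr( $residues, $start - 1, $end - $start + 1 );
}

print subseq_sketch( 'ATGGCCATTG', 4, 6 ), "\n";
```

A SQL substring over the stored value needs the same start and count, which is why only the requested piece has to leave the database server.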
trunc
Same as subseq above, but return a sequence object rather than a bare string.
accession, accession_number
Usage: say $feature->accession_number
Desc : get an "<accession>.<version>"-style string, taken from
       either the primary dbxref or the first secondary dbxref
       found
Args : none
Ret  : string of the form "accession.version" formed from the
       accession and version fields of either the primary or
       secondary dbxrefs
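The fallback from primary to secondary dbxref can be sketched in plain Perl as follows. The hashrefs stand in for dbxref rows, the function name accession_number_sketch and the sample accessions are hypothetical, and the handling of a missing version (returning the bare accession) is an assumption rather than documented behavior.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for dbxref rows: plain hashrefs with accession and
# version fields. The first argument plays the role of the primary dbxref.
sub accession_number_sketch {
    my ( $primary_dbxref, @secondary_dbxrefs ) = @_;

    # Prefer the primary dbxref; otherwise use the first secondary one.
    my $dbxref = $primary_dbxref || $secondary_dbxrefs[0]
        or return;

    # Assumption: an empty or missing version yields the bare accession.
    my $version = $dbxref->{version};
    return defined $version && length $version
        ? "$dbxref->{accession}.$version"
        : $dbxref->{accession};
}

print accession_number_sketch( { accession => 'X12345', version => '2' } ), "\n";
print accession_number_sketch( undef, { accession => 'Y67890', version => '1' } ), "\n";
```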
length
Takes no arguments. Returns seqlen(), or length( $feature->residues ) if seqlen is not defined.
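The fallback order can be sketched in plain Perl, with a hashref standing in for a feature row. The function name feature_length_sketch is hypothetical.

```perl
use strict;
use warnings;

# Sketch: prefer the stored seqlen; otherwise measure the residues string.
sub feature_length_sketch {
    my ($feature) = @_;
    return $feature->{seqlen} if defined $feature->{seqlen};
    return length $feature->{residues};    # undef when residues is also unset (Perl 5.12+)
}

print feature_length_sketch( { seqlen => 1200, residues => undef } ), "\n";
print feature_length_sketch( { seqlen => undef, residues => 'ATGGCC' } ), "\n";
```

Because seqlen is consulted first, a feature whose residues are not manifested can still report its length.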
desc, description
Takes no arguments. Returns the value of the first 'description' featureprop found for this feature.
alphabet
Returns "protein" if the feature's type name is "polypeptide". Otherwise, returns "dna". This is not very correct, but works in most of the use cases we've seen so far.
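The rule described above amounts to a single comparison on the type name. A sketch in plain Perl, using a hypothetical function name alphabet_for_type, makes the limitation visible: any type other than "polypeptide", including RNA types, is reported as "dna".

```perl
use strict;
use warnings;

# Sketch of the documented rule: only the type name "polypeptide" maps to
# "protein"; every other type name maps to "dna".
sub alphabet_for_type {
    my ($type_name) = @_;
    return $type_name eq 'polypeptide' ? 'protein' : 'dna';
}

printf "%-12s => %s\n", $_, alphabet_for_type($_) for qw(polypeptide gene mRNA);
```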