NAME
Bio::LiveSeq::IO::BioPerl - Loader for LiveSeq from EMBL entries with BioPerl
SYNOPSIS
my $db="EMBL";
my $file="../data/M20132";
my $id="HSANDREC";
my $loader=Bio::LiveSeq::IO::BioPerl->load(-db=>$db, -file=>$file);
# or, to fetch the entry by ID instead of reading a file:
$loader=Bio::LiveSeq::IO::BioPerl->load(-db=>$db, -id=>$id);
my @translationobjects=$loader->entry2liveseq();
my $genename="AR";
my $gene=$loader->gene2liveseq(-gene_name => "$genename",
-getswissprotinfo => 0);
#NOTE1: The only -db currently supported is EMBL, so -db defaults to EMBL.
#NOTE2: -file requires a filename (with its path if necessary) containing
# an EMBL entry.
# -id uses Bio::DB::EMBL.pm to fetch the sequence from the web
# (a BioPerl wrapper around [w]getz from SRS).
#NOTE3: To retrieve the SwissProt entry attached to the EMBL entry, if any
# (to get protein domains at the DNA level), only Bio::DB::EMBL.pm
# is supported under BioPerl. Refer to Bio::LiveSeq::IO::SRS
# otherwise.
#NOTE4: NOTE3 is not yet implemented for BioPerl; work is in progress.
DESCRIPTION
This package uses BioPerl (SeqIO) to fetch a sequence database entry, analyse it and create LiveSeq objects out of it.
A filename (or an ID, in which case the entry is fetched through the web) has to be passed to this package, which returns references to all translation objects created from the EMBL entry. References to Transcription, DNA and Exon objects can all be retrieved starting from these.
Alternatively, a specific gene name can be given together with the EMBL accession ID. This creates a LiveSeq::Gene object with all relevant gene features attached/created.
ATTENTION: if web fetching is requested, the package HTTP::Request needs to be installed.
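As a minimal sketch of how a caller might check for HTTP::Request before requesting web fetching, the snippet below tests whether a module can be loaded. The helper name module_available is hypothetical and is not part of this package.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded, 0 otherwise.
# (Hypothetical helper, not provided by Bio::LiveSeq::IO::BioPerl.)
sub module_available {
    my ($module) = @_;
    (my $path = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return eval { require $path; 1 } ? 1 : 0;
}

# Decide which load() argument is usable on this system.
my $can_fetch = module_available('HTTP::Request');
print $can_fetch
    ? "HTTP::Request found: loading with -id should be possible\n"
    : "HTTP::Request missing: pass -file with a local EMBL entry instead\n";
```

If the check fails, loading from a local file with -file remains available, since only the -id route depends on web access.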
AUTHOR - Joseph A.L. Insana
Email: Insana@ebi.ac.uk, jinsana@gmx.net
APPENDIX
The rest of the documentation details each of the object methods. Internal methods are usually prefixed with an underscore (_).
load
Title : load
Usage : my $filename="../data/M20132";
$loader=Bio::LiveSeq::IO::BioPerl->load(-db=>"EMBL", -file=>"$filename");
or
$loader=Bio::LiveSeq::IO::BioPerl->load(-db=>"EMBL", -id=>"HSANDREC");
Function: loads an entry with BioPerl from a database into a hash
Returns : reference to a new object of class IO::BioPerl holding an entry
Errorcode 0
Args : a filename containing an EMBL entry OR an ID or ACCESSION code
embl2hash
Title : embl2hash
Function: retrieves with BioPerl an EMBL entry, parses it and creates
a hash that contains all the information.
Returns : a reference to a hash
Errorcode: 0
Args : a BioPerl Sequence Object (from file or web fetching)
two array references to skip features and qualifiers (for
performance)
Example: @valid_features=qw(CDS exon prim_transcript mRNA);
@valid_qualifiers=qw(gene codon_start db_xref product rpt_family);
$hashref=&embl2hash($seqobj,\@valid_features,\@valid_qualifiers);
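The effect of the two array references can be illustrated with a small sketch that keeps only listed features and qualifiers. It assumes the lists act as whitelists, as the names @valid_features and @valid_qualifiers suggest. The data layout and the function filter_features are hypothetical; the real hash built by embl2hash is not described here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for parsed EMBL features (not the embl2hash layout).
my @features = (
    { name => 'CDS',    qualifiers => { gene => 'AR', codon_start => 1, note => 'x' } },
    { name => 'source', qualifiers => { organism => 'Homo sapiens' } },
    { name => 'exon',   qualifiers => { gene => 'AR', number => 1 } },
);

# Keep only whitelisted features, and within them only whitelisted qualifiers.
sub filter_features {
    my ($features, $valid_features, $valid_qualifiers) = @_;
    my %keep_feature   = map { $_ => 1 } @$valid_features;
    my %keep_qualifier = map { $_ => 1 } @$valid_qualifiers;
    my @kept;
    for my $feature (@$features) {
        next unless $keep_feature{ $feature->{name} };
        my %qualifiers = map  { $_ => $feature->{qualifiers}{$_} }
                         grep { $keep_qualifier{$_} }
                         keys %{ $feature->{qualifiers} };
        push @kept, { name => $feature->{name}, qualifiers => \%qualifiers };
    }
    return \@kept;
}

my @valid_features   = qw(CDS exon prim_transcript mRNA);
my @valid_qualifiers = qw(gene codon_start db_xref product rpt_family);
my $kept = filter_features(\@features, \@valid_features, \@valid_qualifiers);
printf "%s: %s\n", $_->{name}, join(',', sort keys %{ $_->{qualifiers} }) for @$kept;
```

Building the lookup hashes once keeps each membership test constant-time, which fits the stated performance motivation.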
novelaasequence2gene
Title : novelaasequence2gene
Usage : $gene=Bio::LiveSeq::IO::BioPerl->novelaasequence2gene(-aasequence => "MGLAAPTRS*");
: $gene=Bio::LiveSeq::IO::BioPerl->novelaasequence2gene(-aasequence => "MGLAAPTRS*",
-cusg_data => "58 44 7 29 3 3 480 267 105 143 122 39 144 162 14 59 53 25 233 292 19 113 88 246 28 68 161 231 27 102 128 151 67 60 138 131 48 61 153 19 233 73 150 31 129 38 147 71 138 43 181 81 44 15 255 118 312 392 236 82 20 10 14 141");
: $gene=Bio::LiveSeq::IO::BioPerl->novelaasequence2gene(-aasequence => "MGLAAPTRS*",
-cusg_data => "58 44 7 29 3 3 480 267 105 143 122 39 144 162 14 59 53 25 233 292 19 113 88 246 28 68 161 231 27 102 128 151 67 60 138 131 48 61 153 19 233 73 150 31 129 38 147 71 138 43 181 81 44 15 255 118 312 392 236 82 20 10 14 141",
-translation_table => "2",
-gene_name => "tyr-kinase");
Function: creates LiveSeq objects from a novel amino acid sequence,
using codon usage information (loaded from a file) to choose
codons according to relative frequencies.
If codon usage information is not specified,
the default is to use Homo sapiens data (taxonomy ID 9606).
If a translation_table ID is not specified, it will default to 1
(standard code).
Returns : reference to a Gene object containing references to LiveSeq objects
Errorcode 0
Args : string containing an amino acid sequence (-aasequence)
string (optional) with codon usage data, 64 integer numbers (-cusg_data)
string (optional) specifying a gene_name (-gene_name)
integer (optional) specifying a translation_table ID (-translation_table)
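Choosing codons "according to relative frequencies" can be sketched as weighted random sampling. The example below is an illustration under stated assumptions, not the module's implementation: the table %codon_counts and the functions pick_codon and back_translate are hypothetical, the counts are made up, and the codon order of the 64 integers in -cusg_data is not specified in this document, so the sketch uses an explicit per-codon table instead.

```perl
use strict;
use warnings;

# Hypothetical codon counts for a few residues; illustrative values only,
# not the Homo sapiens defaults used by the module.
my %codon_counts = (
    M   => { ATG => 100 },
    G   => { GGA => 25, GGC => 35, GGG => 25, GGT => 15 },
    L   => { CTG => 40, CTC => 20, TTG => 13, CTT => 13, CTA => 7, TTA => 7 },
    '*' => { TGA => 50, TAA => 28, TAG => 22 },
);

# Pick one codon for an amino acid with probability proportional to its count.
sub pick_codon {
    my ($aa) = @_;
    my $counts = $codon_counts{$aa}
        or die "no codon usage data for '$aa'\n";
    my $total = 0;
    $total += $_ for values %$counts;
    my $roll = rand($total);
    for my $codon (sort keys %$counts) {
        $roll -= $counts->{$codon};
        return $codon if $roll < 0;
    }
    return (sort keys %$counts)[-1];    # guard against rounding at the boundary
}

# Back-translate a protein string one residue at a time.
sub back_translate {
    my ($aaseq) = @_;
    return join '', map { pick_codon($_) } split //, $aaseq;
}

my $dna = back_translate('MGL*');
print "$dna\n";
```

Because the choice is random, repeated calls can return different DNA sequences for the same protein; only residues with a single codon, such as M here, are deterministic.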