NAME
Bio::SeqIO::seqxml - SeqXML sequence input/output stream
SYNOPSIS
# Do not use this module directly. Use it via the Bio::SeqIO class.
use Bio::SeqIO;
# read a SeqXML file
my $seqio = Bio::SeqIO->new(-format => 'seqxml',
-file => 'my_seqs.xml');
while (my $seq_object = $seqio->next_seq) {
print join("\t",
$seq_object->display_id,
$seq_object->description,
$seq_object->seq,
), "\n";
}
# write a SeqXML file
#
# Note that you can (optionally) specify the source
# (usually a database) and source version.
my $seqwriter = Bio::SeqIO->new(-format => 'seqxml',
-file => ">outfile.xml",
-source => 'Ensembl',
-sourceVersion => '56');
$seqwriter->write_seq($seq_object);
# once you've written all of your seqs, you may want to do
# an explicit close to get the closing </seqXML> tag
$seqwriter->close;
DESCRIPTION
This object can transform Bio::Seq objects to and from SeqXML format. For more information on the SeqXML standard, visit http://www.seqxml.org.
In short, SeqXML is a lightweight sequence format that takes advantage of the validation capabilities of XML while not overburdening you with a strict and complicated schema.
This module is based in part (particularly the XML-parsing part) on Bio::TreeIO::phyloxml by Mira Han.
FEEDBACK
Mailing Lists
User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to one of the Bioperl mailing lists. Your participation is much appreciated.
bioperl-l@bioperl.org - General discussion
http://bioperl.org/wiki/Mailing_lists - About the mailing lists
Support
Please direct usage questions or support issues to the mailing list:
bioperl-l@bioperl.org
rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible.
Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web:
https://github.com/bioperl/bioperl-live/issues
AUTHORS - Dave Messina
Email: dmessina@cpan.org
CONTRIBUTORS
APPENDIX
The rest of the documentation details each of the object methods. Internal methods are usually preceded with an underscore (_).
_initialize
Title : _initialize
Usage : $self->_initialize(@args)
Function: constructor (for internal use only).
Besides the usual SeqIO arguments (-file, -fh, etc.),
Bio::SeqIO::seqxml accepts three arguments which are used
when writing out a seqxml file. They are all optional.
Returns : none
Args : -source => source string (usually a database name)
-sourceVersion => version number of the source
-seqXMLversion => the version of seqXML that will be used
Throws : Exception if XML::LibXML::Reader or XML::Writer
is not initialized
next_seq
Title : next_seq
Usage : $seq = $stream->next_seq()
Function: returns the next sequence in the stream
Returns : L<Bio::Seq> object, or nothing if no more available
Args : none
write_seq
Title : write_seq
Usage : $stream->write_seq(@seq)
Function: Writes each sequence object in @seq to the stream
Returns : 1 for success and 0 for error
Args : Array of 1 or more L<Bio::PrimarySeqI> objects
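Because write_seq accepts a list, several sequences can be written in one call. The following is a minimal sketch, assuming BioPerl and its XML dependencies are installed; the sequence IDs, residues, and temporary file are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Bio::Seq;
use Bio::SeqIO;

# Temporary output path (illustrative; any writable path works).
my ($tmp_fh, $path) = tempfile(SUFFIX => '.xml', UNLINK => 1);
close $tmp_fh;

# Two small example sequences; IDs and residues are invented.
my @seqs = (
    Bio::Seq->new(-display_id => 'seq1', -seq => 'ACGTACGT',  -alphabet => 'dna'),
    Bio::Seq->new(-display_id => 'seq2', -seq => 'MKVLAAGIV', -alphabet => 'protein'),
);

my $out = Bio::SeqIO->new(-format => 'seqxml', -file => ">$path");
$out->write_seq(@seqs);   # one call, several sequences
$out->close;              # writes the closing </seqXML> tag

# Read the file back as plain text to inspect what was written.
open my $in, '<', $path or die "cannot open $path: $!";
my $xml = do { local $/; <$in> };
close $in;
print $xml;
```

The explicit close matters here because the file is read again within the same program.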
_initialize_seqxml_node_methods
Title : _initialize_seqxml_node_methods
Usage : $self->_initialize_seqxml_node_methods
Function: sets up code ref mapping of each seqXML node type
to a method for processing that node type
Returns : none
Args : none
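The idea of mapping node types to code refs can be illustrated with a plain Perl dispatch table. This is only a sketch of the general pattern, not the module's actual code: the node-type labels, the %handler_for hash, and dispatch_node are all hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical handler table: keys are illustrative node-type labels,
# values are code refs. None of these names come from the module.
my %handler_for = (
    element     => sub { my ($name) = @_; return "start of <$name>" },
    end_element => sub { my ($name) = @_; return "end of </$name>" },
    text        => sub { my ($text) = @_; return "text: $text" },
);

# Look up the handler for a node type and call it; die on unknown types.
sub dispatch_node {
    my ($type, $payload) = @_;
    my $handler = $handler_for{$type}
        or die "unexpected XML node type: $type\n";
    return $handler->($payload);
}

print dispatch_node('element',     'entry'), "\n";
print dispatch_node('text',        'ACGT'),  "\n";
print dispatch_node('end_element', 'entry'), "\n";
```

A table like this keeps the parsing loop short: supporting a new node type means adding one entry instead of another branch in a conditional.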
schemaLocation
Title : schemaLocation
Usage : $self->schemaLocation
Function: gets/sets the schema location in the <seqXML> header
Returns : the schema location string
Args : To set the schemaLocation, call with a schemaLocation as the argument.
source
Title : source
Usage : $self->source
Function: gets/sets the data source in the <seqXML> header
Returns : the data source string
Args : To set the source, call with a source string as the argument.
sourceVersion
Title : sourceVersion
Usage : $self->sourceVersion
Function: gets/sets the data source version in the <seqXML> header
Returns : the data source version string
Args : To set the source version, call with a source version string
as the argument.
seqXMLversion
Title : seqXMLversion
Usage : $self->seqXMLversion
Function: gets/sets the seqXML version in the <seqXML> header
Returns : the seqXML version string.
Args : To set the seqXML version, call with a seqXML version string
as the argument.
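The header accessors above can be exercised with a small round trip. This is a sketch under the assumption that the header is parsed when the reader is constructed, so the accessors are populated before the first next_seq call; the temporary file and sequence are invented for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Bio::Seq;
use Bio::SeqIO;

# Temporary file for the round trip (illustrative).
my ($tmp_fh, $path) = tempfile(SUFFIX => '.xml', UNLINK => 1);
close $tmp_fh;

# Write one sequence with explicit header metadata.
my $writer = Bio::SeqIO->new(
    -format        => 'seqxml',
    -file          => ">$path",
    -source        => 'Ensembl',
    -sourceVersion => '56',
);
$writer->write_seq(
    Bio::Seq->new(-display_id => 'seq1', -seq => 'ACGTACGT', -alphabet => 'dna')
);
$writer->close;    # needed so the file is complete before it is read back

# Open the same file for reading and query the header values.
my $reader = Bio::SeqIO->new(-format => 'seqxml', -file => $path);
my %header = (
    source        => scalar $reader->source,
    sourceVersion => scalar $reader->sourceVersion,
    seqXMLversion => scalar $reader->seqXMLversion,
);
printf "%-15s %s\n", $_, $header{$_} // '(undef)' for sort keys %header;
```

The seqXMLversion printed here is whatever default the writer applied, since the sketch does not pass -seqXMLversion.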
Methods for parsing the XML document
processXMLNode
Title : processXMLNode
Usage : $seqio->processXMLNode
Function: reads the XML node and processes according to the node type
Returns : none
Args : none
Throws : Exception on unexpected XML node type, warnings on unexpected
XML element names.
processAttribute
Title : processAttribute
Usage : $seqio->processAttribute(\%hash_for_attribute);
Function: reads the attributes of the current element into a hash
Returns : none
Args : hash reference where the attributes will be stored.
parseHeader
Title : parseHeader
Usage : $self->parseHeader();
Function: reads the opening <seqXML> block and grabs the metadata from it,
namely the source, sourceVersion, and seqXMLversion.
Returns : none
Args : none
Throws : Exception if it hits an <entry> tag, because that means it's
missed the <seqXML> tag and read too far into the file.
element_seqXML
Title : element_seqXML
Usage : $self->element_seqXML
Function: processes the opening <seqXML> node
Returns : none
Args : none
element_entry
Title : element_entry
Usage : $self->element_entry
Function: processes a sequence <entry> node
Returns : none
Args : none
Throws : Exception if sequence ID is not present in <entry> element
element_species
Title : element_species
Usage : $self->element_species
Function: processes a <species> node, creating a Bio::Species object
Returns : none
Args : none
Throws : Exception if <species> tag exists but is empty,
or if the attributes 'name' or 'ncbiTaxID' are undefined
element_description
Title : element_description
Usage : $self->element_description
Function: processes a sequence <description> node;
a no-op -- description text is read by
processXMLNode
Returns : none
Args : none
element_RNAseq
Title : element_RNAseq
Usage : $self->element_RNAseq
Function: processes a sequence <RNAseq> node
Returns : none
Args : none
element_DNAseq
Title : element_DNAseq
Usage : $self->element_DNAseq
Function: processes a sequence <DNAseq> node
Returns : none
Args : none
element_AAseq
Title : element_AAseq
Usage : $self->element_AAseq
Function: processes a sequence <AAseq> node
Returns : none
Args : none
element_DBRef
Title : element_DBRef
Usage : $self->element_DBRef
Function: processes a sequence <DBRef> node,
creating a Bio::Annotation::DBLink object
Returns : none
Args : none
element_property
Title : element_property
Usage : $self->element_property
Function: processes a sequence <property> node, creating a
Bio::Annotation::SimpleValue object
Returns : none
Args : none
end_element_RNAseq
Title : end_element_RNAseq
Usage : $self->end_element_RNAseq
Function: processes the closing </RNAseq> node
Returns : none
Args : none
end_element_DNAseq
Title : end_element_DNAseq
Usage : $self->end_element_DNAseq
Function: processes the closing </DNAseq> node
Returns : none
Args : none
end_element_AAseq
Title : end_element_AAseq
Usage : $self->end_element_AAseq
Function: processes the closing </AAseq> node
Returns : none
Args : none
end_element_entry
Title : end_element_entry
Usage : $self->end_element_entry
Function: processes the closing </entry> node, creating the Seq object
Returns : a Bio::Seq object
Args : none
Throws : Exception if sequence, sequence ID, or alphabet are missing
end_element_default
Title : end_element_default
Usage : $self->end_element_default
Function: processes all other closing tags;
a no-op.
Returns : none
Args : none
DESTROY
Title : DESTROY
Usage : called automatically by Perl just before object
goes out of scope
Function: performs a write flush
Returns : none
Args : none
close
Title : close
Usage : $seqio_obj->close()
Function: writes closing </seqXML> tag.
close() will be called automatically by Perl when your
program exits, but if you want to use the seqXML file
you've written before then, you'll need to do an explicit
close first to get the final </seqXML> tag.
Returns : none
Args : none