NAME

Bio::ViennaNGS::AnnoC - Object-oriented interface for storing and converting biological sequence annotation formats

SYNOPSIS

use Bio::ViennaNGS::AnnoC;

my $obj = Bio::ViennaNGS::AnnoC->new();

# parse GFF3 file to internal data structure
$obj->parse_gff($gff3_file);

# compute summary of parsed annotation
$obj->featstat;

# dump feature summary to file
$obj->feature_summary($dest);

# dump all tRNAs contained in data structure as BED12
$obj->features2bed("tRNA",$dest,$bn,$log);

DESCRIPTION

This module provides an object-oriented interface for storing and converting biological sequence annotation data. Built on the Moose object system, it maintains a central data structure that is currently designed to represent simple, non-spliced (i.e. single-exon) annotation data. Future versions of the module will account for more generic scenarios, including spliced isoforms.

METHODS

parse_gff
Title   : parse_gff
Usage   : $obj->parse_gff($gff3_file);
Function: Parses GFF3 annotation files of non-spliced genomes into
          C<$self->features>
Args    : The full path to a GFF3 file
Returns :
Notes   : The GFF3 specification is available at
          L<http://www.sequenceontology.org/resources/gff3.html>.
          This routine has been tested with NCBI bacteria GFF3
          annotation.
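As an illustration of the input that parse_gff consumes, the following is a minimal sketch of splitting one GFF3 feature line into its nine columns using core Perl only. It is not the module's implementation (which depends on Bio::Tools::GFF). The helper name parse_gff3_line, the hash layout, and the sample line are assumptions made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Bio::ViennaNGS::AnnoC: split one GFF3
# feature line into its nine tab-separated columns and decode column 9.
sub parse_gff3_line {
    my ($line) = @_;
    chomp $line;
    return if $line =~ /^#/ or $line =~ /^\s*$/;    # skip comments and blank lines
    my @col = split /\t/, $line;
    return unless @col == 9;                         # GFF3 lines have 9 columns

    # Column 9 holds semicolon-separated tag=value pairs.
    # Percent-encoded characters are left undecoded in this sketch.
    my %attr;
    for my $pair (split /;/, $col[8]) {
        my ($tag, $value) = split /=/, $pair, 2;
        next unless defined $tag and defined $value;
        $attr{$tag} = $value;
    }

    return {
        seqid      => $col[0],
        source     => $col[1],
        type       => $col[2],
        start      => $col[3],    # 1-based, inclusive
        end        => $col[4],    # 1-based, inclusive
        score      => $col[5],
        strand     => $col[6],
        phase      => $col[7],
        attributes => \%attr,
    };
}

# Made-up sample line for illustration only
my $feature = parse_gff3_line(
    join "\t", 'seq1', 'example', 'tRNA', 100, 175, '.', '+', '.',
               'ID=rna0;gbkey=tRNA;product=tRNA-Ile'
);
printf "%s %s %d-%d (%s)\n",
    @{$feature}{qw(seqid type start end)}, $feature->{attributes}{gbkey};
```

GFF3 coordinates are 1-based and inclusive, which matters when the parsed features are later written to BED.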
feature_summary
Title   : feature_summary
Usage   : $obj->feature_summary($dest);
Function: Generate a summary file for all features present in
          C<$self->features> 
Args    : Full output path for summary.txt file
Returns :
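The kind of per-key summary described here can be sketched in a few lines of core Perl. This is only an illustration under stated assumptions: the record layout, the gbkey field name, and the tab-separated output are made up for the example and may differ from what feature_summary actually writes.

```perl
use strict;
use warnings;

# Made-up feature records; the real layout of $self->features may differ.
my @features = (
    { name => 'rna0',  gbkey => 'tRNA' },
    { name => 'gene0', gbkey => 'Gene' },
    { name => 'rna1',  gbkey => 'tRNA' },
);

# Count how many features carry each key
my %count;
$count{ $_->{gbkey} }++ for @features;

# One "key<TAB>count" line per key, sorted for stable output
my $summary = join '', map { "$_\t$count{$_}\n" } sort keys %count;
print $summary;
```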
features2bed
Title   : features2bed
Usage   : $obj->features2bed($gbkey,$dest,$bn,$log);
Function: Dumps genomic features from C<$self->features> hash to a
          BED12 file.
Args    : C<$gbkey> can be either a string corresponding to a
          genbank key in C<$self->featstat> or C<undef>. If defined,
          only features of the specified key will be dumped to a single
          BED12 file. If C<$gbkey> is C<undef>, BED12 files will be
          generated for each type present in C<$self->featstat>.
          C<$dest> is the output directory and C<$bn> the basename for
          all output files. C<$log> is either the full path to a
          logfile or C<undef>.
Returns :
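For a non-spliced feature, a BED12 record reduces to a single block spanning the whole interval. The sketch below shows one way to format such a line, assuming GFF3-style 1-based inclusive input coordinates. The helper feature_to_bed12 and its named arguments are hypothetical and do not reflect how features2bed is implemented internally.

```perl
use strict;
use warnings;

# Hypothetical helper: format one single-exon feature as a BED12 line.
# Input coordinates are assumed to be 1-based and inclusive (GFF3 style).
sub feature_to_bed12 {
    my (%f) = @_;
    my $chrom_start = $f{start} - 1;    # BED is 0-based, half-open
    my $chrom_end   = $f{end};
    my $length      = $chrom_end - $chrom_start;
    return join "\t",
        $f{seqid}, $chrom_start, $chrom_end,
        $f{name},
        0,                              # score
        $f{strand},
        $chrom_start, $chrom_end,       # thickStart, thickEnd
        0,                              # itemRgb (0 = no colour set)
        1,                              # blockCount: a single exon
        $length,                        # blockSizes
        0;                              # blockStarts, relative to chromStart
}

# Made-up feature for illustration only
my $bed_line = feature_to_bed12(
    seqid => 'seq1', start => 100, end => 175, name => 'rna0', strand => '+',
);
print "$bed_line\n";
```

The only coordinate change is subtracting 1 from the start. The end stays the same because BED end positions are exclusive.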

DEPENDENCIES

Bio::Tools::GFF
IPC::Cmd
Path::Class
Carp

AUTHORS

Michael T. Wolfinger <michael@wolfinger.eu>

COPYRIGHT AND LICENSE

Copyright (C) 2014 Michael T. Wolfinger <michael@wolfinger.eu>

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.12.4 or, at your option, any later version of Perl 5 you may have available.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.