NAME

Bio::Assembly::ContigAnalysis - Perform analysis on sequence assembly contigs.

SYNOPSIS

# Module loading
use Bio::Assembly::ContigAnalysis;

# Object creation
my $ca = Bio::Assembly::ContigAnalysis->new( -contig=>$contigOBJ );

my @lcq = $ca->low_consensus_quality;
my @hqd = $ca->high_quality_discrepancies;
my @ss  = $ca->single_strand;

DESCRIPTION

A contig is a set of sequences, locally aligned to each other wherever the sequences in a pair can be aligned. It may also include a consensus sequence. Bio::Assembly::ContigAnalysis is a module holding a collection of methods to analyze contig objects. It was developed around the Bio::Assembly::Contig implementation of contigs and cannot work with another contig interface.

FEEDBACK

Mailing Lists

User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing lists. Your participation is much appreciated.

bioperl-l@bioperl.org                  - General discussion
http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web:

http://bugzilla.open-bio.org/

AUTHOR - Robson Francisco de Souza

Email: rfsouza@citri.iq.usp.br

APPENDIX

The rest of the documentation details each of the object methods. Internal method names are usually preceded by an underscore (_).

Object creator

new

Title     : new
Usage     : my $ContigAnal = Bio::Assembly::ContigAnalysis->new(-contig=>$contigOBJ);
Function  : Creates a new contig analysis object
Returns   : Bio::Assembly::ContigAnalysis
Args      :
            -contig : a Bio::Assembly::Contig object

Analysis methods

high_quality_discrepancies

Title     : high_quality_discrepancies
Usage     : my $sfc = $ContigAnal->high_quality_discrepancies();
Function  : 

            Locates all high quality discrepancies among aligned
            sequences and the consensus sequence.

            Note: see Bio::Assembly::Contig POD documentation,
            section "Coordinate System", for a definition of
            available types. Default coordinate system type is
            "gapped consensus", i.e. consensus sequence (with gaps)
            coordinates. If limits are not specified, the entire
            alignment is analyzed.

Returns   : Bio::SeqFeature::Collection
Args      : optional arguments are
            -threshold : cutoff value for low quality (minimum high quality)
                         Default: 40
            -ignore    : number of bases that will not be analyzed at
                         both ends of contig aligned elements
                         Default: 5
            -start     : start of interval that will be analyzed
            -end       : end of interval that will be analyzed
            -type      : coordinate system type for interval

low_consensus_quality

Title     : low_consensus_quality
Usage     : my $sfc = $ContigAnal->low_consensus_quality();
Function  : Locates all low quality regions in the consensus
Returns   : an array of Bio::SeqFeature::Generic objects
Args      : optional arguments are
            -threshold : cutoff value for low quality (minimum high quality)
                         Default: 25
            -start     : start of interval that will be analyzed
            -end       : end of interval that will be analyzed
            -type      : coordinate system type for interval
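
The thresholding idea can be illustrated without BioPerl. The following is a minimal sketch, not the module's implementation: low_quality_runs is a hypothetical helper, the quality values are made up, and it assumes that a position counts as low quality when its value is strictly below the threshold. Positions are 1-based and inclusive.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Assembly::ContigAnalysis.
# Returns [start, end] pairs (1-based, inclusive) for every run of
# consecutive positions whose quality is strictly below $threshold.
sub low_quality_runs {
    my ($threshold, @qual) = @_;
    my (@runs, $run_start);
    for my $i (0 .. $#qual) {
        if ($qual[$i] < $threshold) {
            # Open a new run unless one is already open
            $run_start = $i + 1 unless defined $run_start;
        }
        elsif (defined $run_start) {
            # Current position is high quality: close the run at the
            # previous position (index $i - 1, i.e. position $i)
            push @runs, [$run_start, $i];
            undef $run_start;
        }
    }
    # Close a run that extends to the last position
    push @runs, [$run_start, scalar @qual] if defined $run_start;
    return @runs;
}

# Made-up consensus qualities, threshold 25 (the documented default)
my @runs = low_quality_runs(25, 40, 30, 10, 12, 35, 20);
print join(', ', map { "$_->[0]-$_->[1]" } @runs), "\n";   # prints "3-4, 6-6"
```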

not_confirmed_on_both_strands

Title     : not_confirmed_on_both_strands
Usage     : my $sfc = $ContigAnal->not_confirmed_on_both_strands();
Function  : 

            Locates all regions whose consensus bases were not
            confirmed by bases from sequences aligned in both
            orientations, i.e., in such regions, no bases in aligned
            sequences of either +1 or -1 strand agree with the
            consensus bases.

Returns   : an array of Bio::SeqFeature::Generic objects
Args      : optional arguments are
            -start : start of interval that will be analyzed
            -end   : end of interval that will be analyzed
            -type  : coordinate system type for interval

single_strand

Title     : single_strand
Usage     : my $sfc = $ContigAnal->single_strand();
Function  : 

            Locates all regions covered by aligned sequences only in
            one of the two strands, i.e., regions in which the
            strand() method returns the same value (either +1 or -1)
            for all aligned sequences.

Returns   : an array of Bio::SeqFeature::Generic objects
Args      : optional arguments are
            -start : start of interval that will be analyzed
            -end   : end of interval that will be analyzed
            -type  : coordinate system type for interval
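
To make the notion of a single-strand region concrete, here is a minimal sketch that does not use BioPerl and is not the module's implementation. single_strand_runs is a hypothetical helper; reads are plain [start, end, strand] triples with 1-based inclusive coordinates that are assumed to lie within the contig length, and positions covered by no read at all are not reported.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Assembly::ContigAnalysis.
# $length : contig length
# @reads  : [start, end, strand] triples, strand is +1 or -1
# Returns [start, end] pairs for regions covered by exactly one strand.
sub single_strand_runs {
    my ($length, @reads) = @_;
    my @fwd = (0) x ($length + 1);   # per-position coverage, +1 strand
    my @rev = (0) x ($length + 1);   # per-position coverage, -1 strand
    for my $read (@reads) {
        my ($s, $e, $strand) = @$read;
        my $cov = $strand > 0 ? \@fwd : \@rev;
        $cov->[$_]++ for $s .. $e;
    }
    my (@runs, $run_start);
    for my $pos (1 .. $length) {
        my $strands_seen = ($fwd[$pos] ? 1 : 0) + ($rev[$pos] ? 1 : 0);
        if ($strands_seen == 1) {
            $run_start = $pos unless defined $run_start;
        }
        elsif (defined $run_start) {
            push @runs, [$run_start, $pos - 1];
            undef $run_start;
        }
    }
    push @runs, [$run_start, $length] if defined $run_start;
    return @runs;
}

# One forward read on 1-6 and one reverse read on 4-10
my @ss = single_strand_runs(10, [1, 6, 1], [4, 10, -1]);
print join(', ', map { "$_->[0]-$_->[1]" } @ss), "\n";   # prints "1-3, 7-10"
```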

Internal Methods

_merge_overlapping_features

Title     : _merge_overlapping_features
Usage     : my @feat = $ContigAnal->_merge_overlapping_features(@features);
Function  : Merge all overlapping features into features
            that hold original features as sub-features
Returns   : array of Bio::SeqFeature::Generic objects
Args      : array of Bio::SeqFeature::Generic objects
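
The merging step can be sketched on plain intervals. This is an illustration under stated assumptions rather than the module's code: merge_overlapping is a hypothetical helper, each input is a [start, end] pair instead of a Bio::SeqFeature::Generic object, and the "parts" list stands in for the sub-features held by each merged feature.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Assembly::ContigAnalysis.
# Takes [start, end] pairs and returns hashes with keys start, end
# and parts (the original intervals folded into each merged one).
sub merge_overlapping {
    my @sorted = sort { $a->[0] <=> $b->[0] } @_;
    my @merged;
    for my $iv (@sorted) {
        my ($s, $e) = @$iv;
        if (@merged && $s <= $merged[-1]{end}) {
            # Overlaps the last merged interval: extend it if needed
            $merged[-1]{end} = $e if $e > $merged[-1]{end};
            push @{ $merged[-1]{parts} }, $iv;
        }
        else {
            # No overlap: start a new merged interval
            push @merged, { start => $s, end => $e, parts => [$iv] };
        }
    }
    return @merged;
}

my @merged = merge_overlapping([1, 5], [10, 12], [4, 8], [11, 15]);
print join(', ', map { "$_->{start}-$_->{end}" } @merged), "\n";   # prints "1-8, 10-15"
```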

_complementary_features_list

Title     : _complementary_features_list
Usage     : @feat = $ContigAnal->_complementary_features_list($start,$end,@features);
Function  : Build a list of features for regions
            not covered by features in @features array
Returns   : array of Bio::SeqFeature::Generic objects
Args      : 
            $start    : [integer] start of first output feature
            $end      : [integer] end of last output feature
            @features : array of Bio::SeqFeature::Generic objects
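
Building the complement of a feature list can likewise be sketched on plain intervals. This is a hedged illustration, not the module's implementation: complement_intervals is a hypothetical helper, inputs are [start, end] pairs rather than feature objects, and all intervals are assumed to lie within $start..$end.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Assembly::ContigAnalysis.
# Returns [start, end] pairs for the parts of $start..$end that are
# not covered by any interval in @ivs.
sub complement_intervals {
    my ($start, $end, @ivs) = @_;
    my @gaps;
    my $next = $start;   # first position not yet known to be covered
    for my $iv (sort { $a->[0] <=> $b->[0] } @ivs) {
        my ($s, $e) = @$iv;
        # Anything between $next and the start of this interval is a gap
        push @gaps, [$next, $s - 1] if $s > $next;
        $next = $e + 1 if $e + 1 > $next;
    }
    # Trailing gap after the last interval
    push @gaps, [$next, $end] if $next <= $end;
    return @gaps;
}

my @gaps = complement_intervals(1, 20, [3, 5], [4, 9], [15, 20]);
print join(', ', map { "$_->[0]-$_->[1]" } @gaps), "\n";   # prints "1-2, 10-14"
```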