NAME
Seeder::Finder - Finder object
VERSION
Version 0.01
DESCRIPTION
This module provides the find_motifs method.
SYNOPSIS
use Seeder::Finder;
my $finder = Seeder::Finder->new(
seed_width => "6",
n_motif => "10",
hd_index_file => "6.index",
seq_file => "seq.fasta",
bkgd_file => "seq.bkgd",
out_file => "motif.out",
strand => "forward",
);
$finder->find_motifs;
EXPORT
None by default
FUNCTIONS
new
Title : new
Usage : my $finder = Seeder::Finder->new(%args);
Function: constructor for the Seeder::Finder object
Returns : a new Seeder::Finder object
Args :
seed_width # Seed width
motif_width # Motif width
n_motif # Number of motifs
hd_index_file # Index file
seq_file # Sequence file
bkgd_file # Background file
out_file # Output file
strand # Strand to analyze: "forward" or "revcom". With "revcom",
both the forward strand and its reverse complement are
included in the analysis
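The "revcom" option relies on the reverse complement of each sequence. As a minimal sketch of that operation (revcom() here is a hypothetical helper written for illustration, not a function documented by this module), assuming plain A/C/G/T input:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the Seeder::Finder API.
sub revcom {
    my ($seq) = @_;
    my $rc = reverse $seq;           # scalar context: reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;    # complement each base
    return $rc;
}

my $forward = "ACGTTG";
# With strand => "revcom", both strands would be considered.
my @strands = ( $forward, revcom($forward) );
print "$_\n" for @strands;
```

Characters other than A, C, G and T are left unchanged by this sketch.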
find_motifs
Title : find_motifs
Usage : $finder->find_motifs;
Function: coordination of the motif finding process
Args : none
_read_seq
Title : _read_seq
Usage : $self->_read_seq;
Function: read the sequence file and count the number of sequences
Returns : reference to sequence tables
( $self->{n_seq} )
Args : none
_read_bkgd
Title : _read_bkgd
Usage : $self->_read_bkgd;
Function: read the background Hamming distance file
Returns : reference to a 2D array of background Hamming distances and
reference to an array of nucleotide frequencies
Args : none
_oligo_count
Title : _oligo_count
Usage : $self->_oligo_count;
Function: count oligos in sequences
Returns : reference to a 2D array of oligo counts
Args : none
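Oligo counting can be pictured as a sliding window over each sequence. The following is only a sketch under assumptions: count_oligos() is a hypothetical helper, and the hash-of-arrays layout is chosen for readability and may differ from the 2D array the module actually builds.

```perl
use strict;
use warnings;

# Hypothetical helper: count every overlapping oligo of width $w
# in each sequence. $count{$oligo}[$i] is the number of occurrences
# of $oligo in sequence $i.
sub count_oligos {
    my ( $w, @seqs ) = @_;
    my %count;
    for my $i ( 0 .. $#seqs ) {
        my $seq = uc $seqs[$i];
        for my $pos ( 0 .. length($seq) - $w ) {
            my $oligo = substr $seq, $pos, $w;
            next if $oligo =~ /[^ACGT]/;    # skip windows with ambiguous bases
            $count{$oligo}[$i]++;
        }
    }
    return \%count;
}

my $count = count_oligos( 2, "ACAC", "CA" );
print "AC in sequence 0: $count->{AC}[0]\n";
```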
_extent
Title : _extent
Usage : $self->_extent;
Function: verify that motif extension width is even
Returns : motif extension width
Args : none
_build_hd_matrix
Title : _build_hd_matrix
Usage : $self->_build_hd_matrix;
Function: calculate Hamming distance between oligos and sequences
Returns : reference to a 2D array of Hamming distances
Args : none
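A sketch of the distance calculation, assuming that the Hamming distance between an oligo and a sequence is taken as the minimum distance over all windows of the same width (an assumption for illustration; hamming() and min_hamming() are hypothetical helpers, not module functions):

```perl
use strict;
use warnings;

# Number of mismatching positions between two equal-length strings.
sub hamming {
    my ( $x, $y ) = @_;
    my $d = 0;
    for my $i ( 0 .. length($x) - 1 ) {
        $d++ if substr( $x, $i, 1 ) ne substr( $y, $i, 1 );
    }
    return $d;
}

# Smallest Hamming distance between $oligo and any window of $seq.
sub min_hamming {
    my ( $oligo, $seq ) = @_;
    my $w   = length $oligo;
    my $min = $w;                     # worst case: every position differs
    for my $pos ( 0 .. length($seq) - $w ) {
        my $d = hamming( $oligo, substr( $seq, $pos, $w ) );
        $min = $d if $d < $min;
    }
    return $min;
}

print min_hamming( "ACG", "TTACCTT" ), "\n";
```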
_pr_sum
Title : _pr_sum
Usage : my $distribution = _pr_sum( $n_seq, \@freq );
Function: generate the probability distribution of a sum of i.i.d. random
variables
Returns : reference to an array of real numbers in the range from 0 to 1
Args : number of sequences, reference to oligo probability distribution
_convolution
Title : _convolution
Usage : my $p = _convolution($p, $f, $m);
Function: convolution of two distributions
Returns : reference to an array of real numbers in the range from 0 to 1
Args : reference to the distributions to be convolved
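The distribution of a sum of n i.i.d. discrete random variables can be obtained by convolving the single-variable distribution with itself n times. The sketch below illustrates that idea only; convolve() and sum_distribution() are hypothetical names, and the third argument taken by _convolution is not covered here.

```perl
use strict;
use warnings;

# Convolution of two discrete distributions given as array references,
# where index k holds the probability of the value k.
sub convolve {
    my ( $p, $q ) = @_;
    my @r = (0) x ( @$p + @$q - 1 );
    for my $i ( 0 .. $#$p ) {
        for my $j ( 0 .. $#$q ) {
            $r[ $i + $j ] += $p->[$i] * $q->[$j];
        }
    }
    return \@r;
}

# Distribution of the sum of $n independent draws from @$f.
sub sum_distribution {
    my ( $n, $f ) = @_;
    my $p = [1];    # empty sum: value 0 with probability 1
    $p = convolve( $p, $f ) for 1 .. $n;
    return $p;
}

my $dist = sum_distribution( 2, [ 0.5, 0.5 ] );
print join( " ", @$dist ), "\n";
```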
_iupac
Title : _iupac
Usage : $self->_iupac;
Function: set IUPAC degenerate symbol correspondence
Returns : reference to a hash of IUPAC degenerate symbols
Args : none
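For reference, the standard IUPAC nucleotide codes can be held in a hash such as the one sketched below. The key layout (sorted set of bases mapped to a symbol) and degenerate_symbol() are assumptions made for this example; the hash returned by _iupac may be organized differently.

```perl
use strict;
use warnings;

# Standard IUPAC nucleotide codes, keyed by the sorted set of bases
# each symbol stands for.
my %iupac = (
    A   => 'A', C   => 'C', G   => 'G', T   => 'T',
    AG  => 'R', CT  => 'Y', CG  => 'S', AT  => 'W', GT => 'K', AC => 'M',
    CGT => 'B', AGT => 'D', ACT => 'H', ACG => 'V', ACGT => 'N',
);

# Hypothetical helper: degenerate symbol for a list of observed bases.
sub degenerate_symbol {
    my %seen;
    $seen{ uc $_ } = 1 for @_;
    return $iupac{ join '', sort keys %seen };
}

print degenerate_symbol(qw(G A)), "\n";
```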
_get_seed
Title : _get_seed
Usage : $self->_get_seed;
Function: coordinate seed site selection
Args : none
_mtc
Title : _mtc
Usage : @q_value = _mtc(@p_value);
Function: generate a list of q-values from a list of p-values
Returns : array of q-values
Args : array of p-values
Note : This is an adaptation of the algorithm described in Storey, J.D.
and Tibshirani, R. (2003) Statistical significance for genomewide
studies, Proc Natl Acad Sci U S A, 100, 9440-9445
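As a rough illustration of how q-values relate to p-values, the sketch below computes step-up adjusted values while fixing the proportion of true null hypotheses at 1. This is a simplification chosen for brevity: Storey and Tibshirani estimate that proportion from the data, and the adaptation used by _mtc is not reproduced here. q_values() is a hypothetical name.

```perl
use strict;
use warnings;

# Simplified q-values: p * m / rank, made monotone from the largest
# p-value downwards. Assumes the proportion of true nulls is 1.
sub q_values {
    my @p     = @_;
    my $m     = @p;
    my @order = sort { $p[$a] <=> $p[$b] } 0 .. $#p;    # indices by ascending p
    my @q;
    my $min = 1;
    for my $rank ( reverse 1 .. $m ) {
        my $idx = $order[ $rank - 1 ];
        my $val = $p[$idx] * $m / $rank;
        $min = $val if $val < $min;
        $q[$idx] = $min;
    }
    return @q;
}

my @q_value = q_values( 0.01, 0.04, 0.03, 0.5 );
printf "%.4f\n", $_ for @q_value;
```

The q-values are returned in the same order as the input p-values.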
_frequency_matrix
Title : _frequency_matrix
Usage : $self->_frequency_matrix;
Function: convert a set of instances into a frequency matrix; frequencies for
sequences holding multiple instances are weighted proportionally
Returns : reference to a 2D array of nucleotide frequencies
Args : none
_probability_matrix
Title : _probability_matrix
Usage : $self->_probability_matrix;
Function: convert a frequency matrix into a probability matrix
Returns : reference to a 2D array of probabilities
Args : none
_weight_matrix
Title : _weight_matrix
Usage : $self->_weight_matrix;
Function: convert a probability matrix into a weight matrix
Returns : reference to a 2D array of position weights
Args : none
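The chain from frequency matrix to probability matrix to weight matrix can be sketched as follows. The pseudocount scheme and the log2-odds weights against background frequencies are common choices assumed here for illustration; they are not confirmed by this documentation. probability_matrix() and weight_matrix() are hypothetical helpers, with one row per motif position and columns in A, C, G, T order.

```perl
use strict;
use warnings;

# Counts -> probabilities. $pseudo is an assumed total pseudocount,
# distributed according to the background frequencies in @$bg.
sub probability_matrix {
    my ( $freq, $bg, $pseudo ) = @_;
    my @prob;
    for my $row (@$freq) {
        my $total = 0;
        $total += $_ for @$row;
        my @p;
        for my $k ( 0 .. $#$row ) {
            $p[$k] = ( $row->[$k] + $pseudo * $bg->[$k] ) / ( $total + $pseudo );
        }
        push @prob, \@p;
    }
    return \@prob;
}

# Probabilities -> log2-odds weights against the background.
sub weight_matrix {
    my ( $prob, $bg ) = @_;
    my @weight;
    for my $row (@$prob) {
        my @w;
        for my $k ( 0 .. $#$row ) {
            $w[$k] = log( $row->[$k] / $bg->[$k] ) / log(2);
        }
        push @weight, \@w;
    }
    return \@weight;
}

my @bg   = (0.25) x 4;
my $prob = probability_matrix( [ [ 4, 0, 0, 0 ], [ 1, 1, 1, 1 ] ], \@bg, 1 );
my $pwm  = weight_matrix( $prob, \@bg );
printf "%.3f\n", $pwm->[0][0];
```

The pseudocount keeps every probability above zero, so the logarithm is always defined.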
_select_site
Title : _select_site
Usage : $self->_select_site;
Function: select the best site among instances for each sequence given the
position weight matrix
Returns : reference to a 2D array of sites
Args : none
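Site selection with a position weight matrix amounts to scoring each window of a sequence and keeping the highest-scoring one. A minimal sketch, assuming uppercase A/C/G/T sequences and a matrix with one row per position and columns in A, C, G, T order (best_site() is a hypothetical helper, and it scans all windows rather than a precomputed set of instances):

```perl
use strict;
use warnings;

my %col = ( A => 0, C => 1, G => 2, T => 3 );

# Return the start position and score of the best-scoring window.
sub best_site {
    my ( $seq, $pwm ) = @_;
    my $w = @$pwm;
    my ( $best_pos, $best_score );
    for my $pos ( 0 .. length($seq) - $w ) {
        my $score = 0;
        for my $i ( 0 .. $w - 1 ) {
            $score += $pwm->[$i][ $col{ substr( $seq, $pos + $i, 1 ) } ];
        }
        ( $best_pos, $best_score ) = ( $pos, $score )
            if !defined $best_score || $score > $best_score;
    }
    return ( $best_pos, $best_score );
}

# Toy matrix that favours the dinucleotide "AC".
my $toy_pwm = [ [ 1, -1, -1, -1 ], [ -1, 1, -1, -1 ] ];
my ( $pos, $score ) = best_site( "TTACG", $toy_pwm );
print "best site at $pos with score $score\n";
```

Ties are resolved in favour of the leftmost window in this sketch.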
_extend_motif
Title : _extend_motif
Usage : $self->_extend_motif;
Function: extend seeds to motif width
Returns : reference to a 2D array of sites
Args : none
_information_content
Title : _information_content
Usage : $self->_information_content;
Function: calculate total information content
Returns : total information content
Args : none
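One common definition of total information content is the relative entropy of each matrix column against the background, summed over positions and expressed in bits. The sketch below uses that definition as an assumption; the exact formula used by _information_content is not stated in this documentation, and information_content() is a hypothetical helper.

```perl
use strict;
use warnings;

# Sum over positions and bases of p * log2(p / background).
sub information_content {
    my ( $prob, $bg ) = @_;
    my $ic = 0;
    for my $row (@$prob) {
        for my $k ( 0 .. $#$row ) {
            my $p = $row->[$k];
            next if $p == 0;    # 0 * log(0) is taken as 0
            $ic += $p * log( $p / $bg->[$k] ) / log(2);
        }
    }
    return $ic;
}

my @uniform = (0.25) x 4;
my $matrix  = [ [ 1, 0, 0, 0 ], [ 0.25, 0.25, 0.25, 0.25 ] ];
printf "%.2f bits\n", information_content( $matrix, \@uniform );
```

With a uniform background, a fully conserved position contributes 2 bits and a uniform position contributes 0.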
_mask_site
Title : _mask_site
Usage : $self->_mask_site;
Function: mask the occurrence in the sequence and in the count matrix
Args : none
_output_data
Title : _output_data
Usage : $self->_output_data;
Function: write the predicted motif to the output file
Args : none
AUTHOR
François Fauteux, <ffauteux at cpan.org>
BUGS
Please report any bugs or feature requests to bug-motif at rt.cpan.org, or at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Seeder. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Seeder
You can also look for information at:
RT: CPAN's request tracker
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
ACKNOWLEDGEMENTS
This algorithm was developed by François Fauteux, Mathieu Blanchette and Martina Strömvik. We thank the Perl Monks <http://www.perlmonks.org/> for their support.
COPYRIGHT & LICENSE
Copyright 2008 François Fauteux, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.