NAME

Seeder::Finder - Finder object

VERSION

Version 0.01

DESCRIPTION

This module provides the find_motifs method.

SYNOPSIS

use Seeder::Finder;
my $finder = Seeder::Finder->new(
    seed_width    => "6",
    n_motif       => "10",
    hd_index_file => "6.index",
    seq_file      => "seq.fasta",
    bkgd_file     => "seq.bkgd",
    out_file      => "motif.out",
    strand        => "forward",
);
$finder->find_motifs;

EXPORT

None by default

FUNCTIONS

new

Title   : new
Usage   : my $finder = Seeder::Finder->new(%args);
Function: constructor for the Seeder::Finder object
Returns : a new Seeder::Finder object
Args    :
    seed_width       # Seed width
    motif_width      # Motif width
    n_motif          # Number of motifs
    hd_index_file    # Index file
    seq_file         # Sequence file
    bkgd_file        # Background file
    out_file         # Output file
    strand           # Strand (forward or revcom). If the "revcom" option is
                       selected, both the forward strand and its reverse
                       complement are included in the analysis
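
To illustrate what the "revcom" option implies, the reverse complement of a DNA string can be computed as in the sketch below. The helper name revcom is hypothetical and is not part of the documented API; the module's internal implementation may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Seeder::Finder): return the reverse
# complement of a DNA string, preserving case.
sub revcom {
    my ($seq) = @_;
    my $rc = reverse $seq;           # scalar context reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;    # complement each base
    return $rc;
}

print revcom("AACGT"), "\n";    # prints ACGTT
```

With strand set to "revcom", both the original string and a string like the one returned here would be searched.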

find_motifs

Title   : find_motifs
Usage   : $finder->find_motifs;
Function: coordination of the motif finding process
Args    : none

_read_seq

Title   : _read_seq
Usage   : $self->_read_seq;
Function: read the sequence file, count number of sequences
Returns : reference to sequence tables
          ( $self->{n_seq} )
Args    : none

_read_bkgd

Title   : _read_bkgd
Usage   : $self->_read_bkgd;
Function: read the background Hamming distance file
Returns : reference to a 2D array of background Hamming distances and 
          reference to an array of nucleotide frequencies
Args    : none

_oligo_count

Title   : _oligo_count
Usage   : $self->_oligo_count;
Function: count oligos in sequences
Returns : reference to a 2D array of oligo counts
Args    : none
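
One possible layout for a 2D array of oligo counts is sketched below. It assumes, purely for illustration, that oligos are indexed by a base-4 encoding and that the table is addressed as $count->[oligo][sequence]; the names oligo_index and oligo_count are hypothetical, and the module's actual layout is not documented here.

```perl
use strict;
use warnings;

my %code = ( A => 0, C => 1, G => 2, T => 3 );

# Encode an oligo as a base-4 integer; return nothing (undef in scalar
# context) if it contains a symbol other than A, C, G or T.
sub oligo_index {
    my ($oligo) = @_;
    my $idx = 0;
    for my $base ( split //, uc $oligo ) {
        return unless exists $code{$base};
        $idx = $idx * 4 + $code{$base};
    }
    return $idx;
}

# Count every oligo of width $w in each sequence.
# Assumed layout: $count->[oligo index][sequence index].
sub oligo_count {
    my ( $w, @seqs ) = @_;
    my @count = map { [ (0) x @seqs ] } 1 .. 4**$w;
    for my $s ( 0 .. $#seqs ) {
        for my $i ( 0 .. length( $seqs[$s] ) - $w ) {
            my $idx = oligo_index( substr( $seqs[$s], $i, $w ) );
            $count[$idx][$s]++ if defined $idx;
        }
    }
    return \@count;
}

my $count = oligo_count( 2, "ACAC", "TTTT" );
print $count->[ oligo_index("AC") ][0], "\n";    # prints 2
print $count->[ oligo_index("TT") ][1], "\n";    # prints 3
```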

_extent

Title   : _extent
Usage   : $self->_extent;
Function: verify that motif extension width is even
Returns : motif extension width
Args    : none

_build_hd_matrix

Title   : _build_hd_matrix
Usage   : $self->_build_hd_matrix;
Function: calculate Hamming distance between oligos and sequences
Returns : reference to a 2D array of Hamming distances
Args    : none
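
As a rough illustration of the underlying computation, the sketch below takes the distance between an oligo and a sequence to be the smallest Hamming distance over all windows of the sequence. That interpretation, and the helper names hamming and min_hamming, are assumptions made for this example.

```perl
use strict;
use warnings;

# Hamming distance between two equal-length strings.
sub hamming {
    my ( $x, $y ) = @_;
    die "length mismatch" unless length($x) == length($y);
    # String XOR yields a NUL byte wherever the characters agree;
    # counting the non-NUL bytes counts the mismatches.
    return ( $x ^ $y ) =~ tr/\0//c;
}

# Smallest Hamming distance between an oligo and any window of a sequence.
sub min_hamming {
    my ( $oligo, $seq ) = @_;
    my $w   = length $oligo;
    my $min = $w;    # upper bound: every position mismatches
    for my $i ( 0 .. length($seq) - $w ) {
        my $d = hamming( $oligo, substr( $seq, $i, $w ) );
        $min = $d if $d < $min;
    }
    return $min;
}

print min_hamming( "ACGT", "TTACGATT" ), "\n";    # prints 1
```

Filling a 2D array then amounts to calling such a function for every oligo and every sequence.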

_pr_sum

Title   : _pr_sum
Usage   : my $distribution = _pr_sum( $n_seq, \@freq );
Function: generate the probability distribution of a sum of i.i.d. random
          variables
Returns : reference to an array of real numbers in the range from 0 to 1
Args    : number of sequences, reference to oligo probability distribution

_convolution

Title   : _convolution
Usage   : my $p = _convolution($p, $f, $m);
Function: convolution of two distributions
Returns : reference to an array of real numbers in the range from 0 to 1
Args    : references to the distributions to be convolved
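
The relationship between _pr_sum and _convolution can be sketched as follows: the distribution of a sum of n i.i.d. discrete variables is obtained by convolving the single-variable distribution with itself n - 1 times. The function names convolve and sum_distribution are hypothetical, and the third argument ($m) shown in the _convolution usage line is not described in this documentation, so it is omitted here.

```perl
use strict;
use warnings;

# Discrete convolution of two probability mass functions given as array
# references, where index k holds P(X = k).
sub convolve {
    my ( $p, $q ) = @_;
    my @r = (0) x ( @$p + @$q - 1 );
    for my $i ( 0 .. $#$p ) {
        for my $j ( 0 .. $#$q ) {
            $r[ $i + $j ] += $p->[$i] * $q->[$j];
        }
    }
    return \@r;
}

# Distribution of the sum of $n i.i.d. variables with mass function $f,
# obtained by convolving $f with itself $n - 1 times.
sub sum_distribution {
    my ( $n, $f ) = @_;
    my $dist = [@$f];
    $dist = convolve( $dist, $f ) for 2 .. $n;
    return $dist;
}

my $d = sum_distribution( 2, [ 0.5, 0.5 ] );
printf "%.2f %.2f %.2f\n", @$d;    # prints 0.25 0.50 0.25
```

Each entry of the result is a probability, which is consistent with the documented return range of 0 to 1.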

_iupac

Title   : _iupac
Usage   : $self->_iupac;
Function: set IUPAC degenerate symbol correspondence
Returns : reference to a hash of IUPAC degenerate symbols
Args    : none
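
For reference, a hash of the standard IUPAC nucleotide codes might look like the sketch below. Keying the hash by the sorted set of bases is a choice made for this example, and degenerate_symbol is a hypothetical helper; the structure actually returned by _iupac may be organised differently.

```perl
use strict;
use warnings;

# Standard IUPAC nucleotide codes, keyed by the sorted set of bases
# each symbol stands for.
my %iupac = (
    A    => 'A', C   => 'C', G   => 'G', T   => 'T',
    AG   => 'R', CT  => 'Y', CG  => 'S', AT  => 'W',
    GT   => 'K', AC  => 'M',
    CGT  => 'B', AGT => 'D', ACT => 'H', ACG => 'V',
    ACGT => 'N',
);

# Hypothetical helper: map a list of bases to its degenerate symbol.
sub degenerate_symbol {
    my @bases = @_;
    my %seen  = map { ( uc($_) => 1 ) } @bases;    # drop duplicates
    my $key   = join '', sort keys %seen;
    return $iupac{$key};
}

print degenerate_symbol(qw(G A)),   "\n";    # prints R
print degenerate_symbol(qw(T C G)), "\n";    # prints B
```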

_get_seed

Title   : _get_seed
Usage   : $self->_get_seed;
Function: coordinate seed site selection
Args    : none

_mtc

Title   : _mtc
Usage   : @q_value = _mtc(@p_value);
Function: generate a list of q-values from a list of p-values
Returns : array of q-values
Args    : array of p-values
Note    : This is an adaptation of the algorithm described in Storey, J.D.
          and Tibshirani, R. (2003) Statistical significance for genomewide
          studies, Proc Natl Acad Sci U S A, 100, 9440-9445
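
The general shape of a q-value computation is sketched below. This is a simplification, not the module's code: the proportion of true null hypotheses (pi0) is fixed at 1, whereas Storey and Tibshirani estimate it from the p-value distribution. With pi0 = 1 the result coincides with Benjamini-Hochberg adjusted p-values. The name q_values is hypothetical.

```perl
use strict;
use warnings;

# Simplified q-values with pi0 fixed at 1 (an assumption of this sketch).
# Returns q-values in the same order as the input p-values.
sub q_values {
    my @p = @_;
    my $m = @p;
    return () unless $m;
    # Indices of the p-values in ascending order of p.
    my @order = sort { $p[$a] <=> $p[$b] } 0 .. $#p;
    my @q;
    my $min = 1;
    # Walk from the largest p-value down, keeping a running minimum so
    # that q-values never decrease as p-values increase.
    for my $rank ( reverse 1 .. $m ) {
        my $i = $order[ $rank - 1 ];
        my $v = $p[$i] * $m / $rank;
        $min = $v if $v < $min;
        $q[$i] = $min;
    }
    return @q;
}

my @q = q_values( 0.01, 0.04, 0.03, 0.5 );
printf "%.3f %.3f %.3f %.3f\n", @q;    # prints 0.040 0.053 0.053 0.500
```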

_frequency_matrix

Title   : _frequency_matrix
Usage   : $self->_frequency_matrix;
Function: convert a set of instances into a frequency matrix; frequencies for
          sequences holding multiple instances are weighted proportionally
Returns : reference to a 2D array of nucleotide frequencies
Args    : none

_probability_matrix

Title   : _probability_matrix
Usage   : $self->_probability_matrix;
Function: convert a frequency matrix into a probability matrix
Returns : reference to a 2D array of probabilities
Args    : none

_weight_matrix

Title   : _weight_matrix
Usage   : $self->_weight_matrix;
Function: convert a probability matrix into a weight matrix
Returns : reference to a 2D array of position weights
Args    : none
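
The two conversions above (_probability_matrix and _weight_matrix) can be illustrated with the sketch below. The pseudocount, the use of base-2 logarithms and the explicit background argument are assumptions of this example, as are the function names; the module's own formulas are not given in this documentation.

```perl
use strict;
use warnings;

# Convert a frequency matrix into a probability matrix.
# Rows are motif positions; columns are A, C, G, T. The pseudocount is an
# assumption of this sketch, used to avoid zero probabilities.
sub probability_matrix {
    my ( $freq, $pseudo ) = @_;
    my @prob;
    for my $row (@$freq) {
        my $total = 0;
        $total += $_ + $pseudo for @$row;
        push @prob, [ map { ( $_ + $pseudo ) / $total } @$row ];
    }
    return \@prob;
}

# Convert a probability matrix into a log2-odds weight matrix relative
# to background nucleotide frequencies.
sub weight_matrix {
    my ( $prob, $bkgd ) = @_;
    my @weight;
    for my $row (@$prob) {
        push @weight,
          [ map { log( $row->[$_] / $bkgd->[$_] ) / log(2) } 0 .. $#$row ];
    }
    return \@weight;
}

my $prob = probability_matrix( [ [ 7, 1, 1, 1 ], [ 0, 0, 10, 0 ] ], 0.25 );
my $pwm  = weight_matrix( $prob, [ 0.25, 0.25, 0.25, 0.25 ] );
printf "%.3f\n", $prob->[0][0];    # prints 0.659
```

Bases that are over-represented relative to the background receive positive weights, and under-represented bases receive negative weights.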

_select_site

Title   : _select_site
Usage   : $self->_select_site;
Function: select the best site among the instances in each sequence, given
          a position weight matrix
Returns : reference to a 2D array of sites
Args    : none
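
A minimal sketch of site selection is shown below, assuming that candidate instances are given as start positions within a sequence and that the best site is the one with the highest summed weight. The names score_site and best_site, and the argument layout, are hypothetical.

```perl
use strict;
use warnings;

my %col = ( A => 0, C => 1, G => 2, T => 3 );

# Score one site against a weight matrix ($pwm->[position][base]).
sub score_site {
    my ( $pwm, $site ) = @_;
    my $score = 0;
    for my $j ( 0 .. $#$pwm ) {
        $score += $pwm->[$j][ $col{ substr( $site, $j, 1 ) } ];
    }
    return $score;
}

# Return (start, score) of the highest-scoring candidate start position.
sub best_site {
    my ( $pwm, $seq, @starts ) = @_;
    my ( $best, $best_score );
    for my $start (@starts) {
        my $score = score_site( $pwm, substr( $seq, $start, scalar @$pwm ) );
        ( $best, $best_score ) = ( $start, $score )
          if !defined $best_score || $score > $best_score;
    }
    return ( $best, $best_score );
}

my $pwm = [ [ 1, -1, -1, -1 ], [ -1, 1, -1, -1 ] ];    # favours "AC"
my ( $pos, $score ) = best_site( $pwm, "GGACTT", 0, 2, 4 );
print "$pos $score\n";    # prints 2 2
```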

_extend_motif

Title   : _extend_motif
Usage   : $self->_extend_motif;
Function: extend seeds to motif width
Returns : reference to a 2D array of sites
Args    : none

_information_content

Title   : _information_content
Usage   : $self->_information_content;
Function: calculate total information content
Returns : total information content
Args    : none
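
One common definition of total information content is the relative entropy of each motif position against background frequencies, summed over positions. The sketch below uses that definition as an assumption; the formula used by the module is not stated here, and information_content is a hypothetical stand-alone function.

```perl
use strict;
use warnings;

# Total information content (in bits) of a probability matrix, computed
# as the relative entropy of each position against the background.
sub information_content {
    my ( $prob, $bkgd ) = @_;
    my $total = 0;
    for my $row (@$prob) {
        for my $j ( 0 .. $#$row ) {
            my $p = $row->[$j];
            next if $p == 0;    # 0 * log(0) is taken as 0
            $total += $p * log( $p / $bkgd->[$j] ) / log(2);
        }
    }
    return $total;
}

my @uniform = (0.25) x 4;
my $ic = information_content(
    [ [ 1, 0, 0, 0 ], [ 0.25, 0.25, 0.25, 0.25 ] ], \@uniform );
printf "%.2f\n", $ic;    # prints 2.00
```

In this example the fully conserved position contributes 2 bits and the uniform position contributes none.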

_mask_site

Title   : _mask_site
Usage   : $self->_mask_site;
Function: mask the occurrence in the sequence and in the count matrix
Args    : none

_output_data

Title   : _output_data
Usage   : $self->_output_data;
Function: write the predicted motif to the output file
Args    : none

AUTHOR

François Fauteux, <ffauteux at cpan.org>

BUGS

Please report any bugs or feature requests to bug-motif at rt.cpan.org, or at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Seeder. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc Seeder

You can also look for information at:

ACKNOWLEDGEMENTS

This algorithm was developed by François Fauteux, Mathieu Blanchette and Martina Strömvik. We thank the Perl Monks <http://www.perlmonks.org/> for their support.

COPYRIGHT & LICENSE

Copyright 2008 François Fauteux, all rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
