NAME

fastQ_brew - a module for preprocessing of fastQ formatted files

SYNOPSIS

use fastQ_brew;
use List::Util qw(min max sum);
use fastQ_brew_Utilities;
use Cwd;

my $lib       = "sanger";
my $file_path = cwd();
my $in_file   = "sample_sanger.fastq";

my $tmp = fastQ_brew->new();

$tmp->load_fastQ_brew(
                  library_type  => $lib || "illumina",
                  file_path     => $file_path,
                  in_file       => $in_file,
                  de_duplex     => "Y",
                  qual_filter   => 1200,
                  length_filter => 80,
                  adapter_left  => "GTACGTGTGGTGGGGAT",
                  mismatches_l  => 1,
                  adapter_right => "TAGCGCGCGATGATT",
                  mismatches_r  => 1,
                  left_trim     => 5,
                  right_trim    => 8,
                  fasta_convert => "Y",
                  dna_rna       => "Y",
                  rev_comp      => "Y",
                  remove_n      => "Y",
                  cleanup       => "Y"
);

$tmp->run_fastQ_brew();

DESCRIPTION

Returns summary statistics for all reads from fastQ formatted files and provides methods for filtering and trimming reads by length and quality.
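As a rough illustration of the kind of per-read summary mentioned above, the following sketch computes read-length statistics with the List::Util functions already imported in the SYNOPSIS. The helper name length_stats and the statistics chosen are assumptions made for this example; they are not taken from the module source.

```perl
use strict;
use warnings;
use List::Util qw(min max sum);

# Hypothetical helper (not part of fastQ_brew): summarize read lengths.
sub length_stats {
    my @seqs = @_;
    return unless @seqs;                 # nothing to summarize
    my @lengths = map { length } @seqs;  # length of each sequence
    return {
        reads => scalar @lengths,
        min   => min(@lengths),
        max   => max(@lengths),
        mean  => sum(@lengths) / @lengths,
    };
}

my $stats = length_stats(qw(ACGT ACGTAC AC));
printf "reads=%d min=%d max=%d mean=%.2f\n",
    @{$stats}{qw(reads min max mean)};
```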

FEEDBACK

damienoh@gwu.edu

Mailing Lists

User feedback is an integral part of the evolution of this module. Send your comments and suggestions preferably to one of the mailing lists. Your participation is much appreciated.

Support

Please direct usage questions or support issues to <damienoh@gwu.edu>. Please include a thorough description of the problem, with code and data examples if at all possible.

Reporting Bugs

Report bugs to the GitHub bug tracking system to help keep track of the bugs and their resolution. Bug reports can be submitted via the GitHub page:

https://github.com/dohalloran/fastQ_brew/issues

AUTHORS - Damien O'Halloran

Email: damienoh@gwu.edu

APPENDIX

The rest of the documentation details each of the object methods.

new()

Title   : new()
Usage   : my $tmp = fastQ_brew->new();
Function: constructor routine
Returns : a blessed object
Args    : none

load_fastQ_brew()

Title   : load_fastQ_brew()
Usage   : $tmp->load_fastQ_brew(
                   library_type  => $lib || "illumina",
                   file_path     => $file_path,
                   in_file       => $in_file,
                   de_duplex     => "Y",
                   qual_filter   => 1200,
                   length_filter => 80,
                   adapter_left  => "GTACGTGTGGTGGGGAT",
                   mismatches_l  => 1,
                   adapter_right => "TAGCGCGCGATGATT",
                   mismatches_r  => 1,
                   left_trim     => 5,
                   right_trim    => 8,
                   fasta_convert => "Y",
                   dna_rna       => "Y",
                   rev_comp      => "Y",
                   remove_n      => "Y",
                   cleanup       => "Y"
             );
Function: Populates the user data into $self hash
Returns : nothing returned
Args    :
-library_type, either sanger or illumina
-file_path, path to sequences
-in_file, the name of the file containing the fastQ reads
-de_duplex, remove duplicate entries
-qual_filter, filter reads by Q score: N=no, 200=remove reads with Quality (Q) scores below 200
-adapter_left, remove adapter from left side
-mismatches_l, number of mismatches tolerated when matching the left adapter
-adapter_right, remove adapter from right side
-mismatches_r, number of mismatches tolerated when matching the right adapter
-left_trim, remove x number of bases from left end
-right_trim, remove x number of bases from right end
-length_filter, filter reads by length: N=no, 40=remove reads shorter than 40 bases
-fasta_convert, option to convert to fastA file: Y=yes, N=no
-dna_rna, transcribe reads in fastQ file: N=no, Y=yes
-rev_comp, reverse complement reads in fastQ file: N=no, Y=yes
-remove_n, remove reads with non-designated bases (i.e. N's) in fastQ file: N=no, Y=yes
-cleanup, option to delete tmp file: Y=yes, N=no

_io_file()

Title   : _io_file()
Usage   : $self->_io_file(%arg)
Function: processes the input file
Returns : tmp file with only phred score and sequence for each read
Args    : fastQ file
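
The reduction described for _io_file() (keeping only the sequence and quality line of each read) can be sketched as follows. This is a minimal illustration, assuming strict 4-line fastQ records; the helper name read_seq_qual is hypothetical and the module's own temporary-file format is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of fastQ_brew): read 4-line fastQ records
# from a filehandle and keep only the sequence and quality of each read.
sub read_seq_qual {
    my ($fh) = @_;
    my @reads;
    while ( defined( my $header = <$fh> ) ) {
        my $seq  = <$fh>;
        my $plus = <$fh>;
        my $qual = <$fh>;
        last unless defined $qual;    # stop at a truncated record
        chomp( $seq, $qual );
        # Skip records whose marker lines do not look like fastQ.
        next unless $header =~ /^\@/ && $plus =~ /^\+/;
        push @reads, [ $seq, $qual ];
    }
    return @reads;
}

# Demonstrate on an in-memory file instead of a file on disk.
my $fastq = "\@read1\nACGT\n+\nIIII\n\@read2\nGGCC\n+\n!!!!\n";
open my $fh, '<', \$fastq or die "Cannot open in-memory file: $!";
print join( "\t", @$_ ), "\n" for read_seq_qual($fh);
close $fh;
```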

run_fastQ_brew()

Title   : run_fastQ_brew()
Usage   : $tmp->run_fastQ_brew();
Function: starts the analysis of the fastQ file
Returns : the stats
Args    : $self, %arg

_convert_fasta()

Title   : _convert_fasta()
Usage   : _convert_fasta();
Function: option to convert fastQ file to fastA
Returns : fastA file
Args    : Y=yes, N=no
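
A fastQ to fastA conversion amounts to rewriting the header marker and dropping the quality line. The sketch below shows that idea for a single record; fastq_to_fasta is a hypothetical name, and the module's actual output file naming and line wrapping are not documented here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of fastQ_brew): turn one fastQ record into
# a two-line fastA record. The quality line is simply dropped.
sub fastq_to_fasta {
    my ( $header, $seq ) = @_;
    ( my $id = $header ) =~ s/^\@/>/;    # fastQ '@' marker becomes fastA '>'
    return "$id\n$seq\n";
}

print fastq_to_fasta( '@read1 sample', 'ACGTTGCA' );
```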

_de_duplex()

Title   : _de_duplex()
Usage   : _de_duplex();
Function: remove duplicate reads
Returns : fastQ file with only singletons
Args    : Y=yes, N=no
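
One common way to remove duplicate reads is to track sequences already seen in a hash. The sketch below assumes that two reads are duplicates when their sequences are identical and that the first occurrence is kept; whether fastQ_brew uses exactly this rule is not stated in this documentation. The name dedupe_reads is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of fastQ_brew): keep the first read for
# each distinct sequence. Each read is [ header, sequence, quality ].
sub dedupe_reads {
    my @reads = @_;
    my %seen;
    return grep { !$seen{ $_->[1] }++ } @reads;
}

my @reads = (
    [ '@r1', 'ACGT', 'IIII' ],
    [ '@r2', 'GGCC', 'IIII' ],
    [ '@r3', 'ACGT', '!!!!' ],    # same sequence as r1, so it is dropped
);
print "$_->[0]\n" for dedupe_reads(@reads);
```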

_remove_adapter_left()

Title   : _remove_adapter_left()
Usage   : _remove_adapter_left();
Function: option to remove specific adapters from left side 
Returns : fastQ file
Args    : string="GTCGAGT" and mismatches=integer

_remove_adapter_right()

Title   : _remove_adapter_right()
Usage   : _remove_adapter_right();
Function: option to remove specific adapters from right side 
Returns : fastQ file
Args    : string="GTCGAGT" and mismatches=integer
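
The two adapter-removal methods above take an adapter string and a mismatch count. A minimal sketch of that idea follows, assuming the adapter is anchored at the very start (left) or very end (right) of the read and that mismatches are counted position by position. The helper names mismatches and trim_adapter are hypothetical, and the module may locate adapters differently.

```perl
use strict;
use warnings;

# Hypothetical helper: count positions at which two equal-length strings differ.
sub mismatches {
    my ( $x, $y ) = @_;
    my $count = 0;
    for my $i ( 0 .. length($x) - 1 ) {
        $count++ if substr( $x, $i, 1 ) ne substr( $y, $i, 1 );
    }
    return $count;
}

# Hypothetical helper: strip an adapter from the 'left' or 'right' end of a
# read when it matches within $max_mm mismatches. The quality string is
# trimmed by the same amount so it stays aligned with the sequence.
sub trim_adapter {
    my ( $seq, $qual, $adapter, $max_mm, $side ) = @_;
    my $len = length $adapter;
    return ( $seq, $qual ) if $len == 0 || length($seq) < $len;

    my $window = $side eq 'left' ? substr( $seq, 0, $len ) : substr( $seq, -$len );
    return ( $seq, $qual ) if mismatches( $window, $adapter ) > $max_mm;

    return $side eq 'left'
        ? ( substr( $seq, $len ),     substr( $qual, $len ) )
        : ( substr( $seq, 0, -$len ), substr( $qual, 0, -$len ) );
}

# One mismatch (A vs T at the third base) is tolerated with $max_mm = 1.
my ( $seq, $qual ) = trim_adapter( 'GTTCGAAAA', 'IIIIIIIII', 'GTACG', 1, 'left' );
print "$seq\t$qual\n";
```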

_prune_fastq()

Title   : _prune_fastq()
Usage   : _prune_fastq();
Function: option to remove reads below phred score
Returns : pruned fastQ file
Args    : integer=yes, N=no
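
Quality filtering depends on decoding the quality line into Phred scores, which is where the library_type setting matters: Sanger encoding uses an ASCII offset of 33, while Illumina 1.3 to 1.7 encoding uses 64. The sketch below decodes the scores and filters on their mean. The statistic that fastQ_brew compares against qual_filter is not specified in this documentation, so the mean is an assumption, as are the helper names phred_scores and passes_quality.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# ASCII offsets: Sanger is Phred+33, Illumina 1.3 to 1.7 is Phred+64.
my %OFFSET = ( sanger => 33, illumina => 64 );

# Hypothetical helper: decode a quality string into a list of Phred scores.
sub phred_scores {
    my ( $qual, $lib ) = @_;
    my $key = lc $lib;
    die "Unknown library type: $lib\n" unless exists $OFFSET{$key};
    return map { ord($_) - $OFFSET{$key} } split //, $qual;
}

# Hypothetical helper: true when the mean Phred score reaches $min_mean.
# Using the mean is an assumption made for this sketch.
sub passes_quality {
    my ( $qual, $lib, $min_mean ) = @_;
    my @scores = phred_scores( $qual, $lib );
    return 0 unless @scores;
    return ( sum(@scores) / @scores ) >= $min_mean ? 1 : 0;
}

print passes_quality( 'IIII', 'sanger', 20 ) ? "keep\n" : "drop\n";
print passes_quality( '!!!!', 'sanger', 20 ) ? "keep\n" : "drop\n";
```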

_reverse_comp()

Title   : _reverse_comp()
Usage   : $self->_reverse_comp(%arg)
Function: option to rev comp fastQ reads
Returns : reverse complemented fastQ file
Args    : Y=yes, N=no
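
Reverse complementing a fastQ read involves two steps: reverse and complement the bases, then reverse the quality string so each score still lines up with its base. The following sketch shows this for one read; reverse_complement_read is a hypothetical name, and leaving N and other ambiguity codes unchanged is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of fastQ_brew): reverse complement a read
# and reverse its quality string to match.
sub reverse_complement_read {
    my ( $seq, $qual ) = @_;
    my $rc = reverse $seq;             # scalar context: reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;      # complement; N is left untouched
    return ( $rc, scalar reverse $qual );
}

my ( $rc, $rq ) = reverse_complement_read( 'AACGN', 'ABCDE' );
print "$rc\t$rq\n";
```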

_dna_rna()

Title   : _dna_rna()
Usage   : $self->_dna_rna(%arg)
Function: option to convert dna to rna for fastQ reads
Returns : RNA fastQ file
Args    : Y=yes, N=no
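
In its simplest form, converting DNA reads to RNA is a base substitution of T with U, leaving the quality line unchanged. The sketch below assumes that interpretation; dna_to_rna is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of fastQ_brew): replace T with U,
# preserving case. The quality string needs no change.
sub dna_to_rna {
    my ($seq) = @_;
    ( my $rna = $seq ) =~ tr/Tt/Uu/;
    return $rna;
}

print dna_to_rna('ACGTTtga'), "\n";
```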

_right_trim()

Title   : _right_trim()
Usage   : $self->_right_trim(%arg)
Function: option to remove right side bases from reads
Returns : right trimmed fastQ file
Args    : integer=yes, N=no

_left_trim()

Title   : _left_trim()
Usage   : $self->_left_trim(%arg)
Function: option to remove left side bases from reads
Returns : left trimmed fastQ file
Args    : integer=yes, N=no
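
Left and right trimming, as described for the two methods above, can be expressed with a single substr call, provided the sequence and quality strings are trimmed identically. This is a minimal sketch; trim_read is a hypothetical name, and returning empty strings when the trims consume the whole read is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of fastQ_brew): remove $left bases from the
# start and $right bases from the end of both sequence and quality.
sub trim_read {
    my ( $seq, $qual, $left, $right ) = @_;
    my $keep = length($seq) - $left - $right;
    return ( '', '' ) if $keep <= 0;    # nothing left after trimming
    return ( substr( $seq, $left, $keep ), substr( $qual, $left, $keep ) );
}

my ( $seq, $qual ) = trim_read( 'ACGTACGTAC', '0123456789', 2, 3 );
print "$seq\t$qual\n";
```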

_trim_length()

Title   : _trim_length()
Usage   : $self->_trim_length(%arg)
Function: option to remove reads below specified length
Returns : trimmed fastQ file
Args    : integer=yes, N=no

remove_n()

Title   : remove_n()
Usage   : $self->remove_n(%arg)
Function: option to remove reads with N's
Returns : fastQ file
Args    : Y=yes, N=no
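
The length filter and the N filter above are both per-read keep or drop decisions. The sketch below combines them into one predicate for illustration; keep_read is a hypothetical name, and treating lowercase n as an undetermined base is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of fastQ_brew): decide whether to keep a read.
#   $min_len : reads shorter than this are dropped
#   $drop_n  : when true, reads containing N (or n) are dropped
sub keep_read {
    my ( $seq, $min_len, $drop_n ) = @_;
    return 0 if length($seq) < $min_len;
    return 0 if $drop_n && $seq =~ /N/i;
    return 1;
}

for my $seq (qw(ACGTACGT ACGNACGT ACG)) {
    printf "%-8s %s\n", $seq, keep_read( $seq, 5, 1 ) ? 'keep' : 'drop';
}
```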

_cleanup()

Title   : _cleanup()
Usage   : _cleanup();
Function: option to delete tmp files
Returns : nothing
Args    : Y=yes, N=no

DESTROY()

Title   : DESTROY()
Usage   : DESTROY();
Function: garbage collection
Returns : nothing
Args    : automatically called

get_lib_type()

Title   : get_lib_type()
Usage   : my $get_lib_type= $tmp->get_lib_type();
Function: Retrieves the library type used
Returns : A string of the type e.g. Sanger
Args    : none

set_lib_type()

Title   : set_lib_type()
Usage   : my $set_lib_type = $tmp->set_lib_type("sanger");
Function: Populates the $self->{lib_type} property
Returns : $self->{lib_type}
Args    : the lib as a string

get_in_file()

Title   : get_in_file()
Usage   : my $get_in_file = $tmp->get_in_file();
Function: Retrieves the input filename
Returns : A string containing filename
Args    : none

set_in_file()

Title   : set_in_file()
Usage   : my $set_in_file= $tmp->set_in_file("myOutPutFile.txt");
Function: Populates the $self->{in_file} property
Returns : $self->{in_file}
Args    : name of the user provided input file

get_de_duplex()

Title   : get_de_duplex()
Usage   : my $get_de_duplex= $tmp->get_de_duplex();
Function: Retrieves the de_duplex choice 
Returns : Y or N
Args    : none

set_de_duplex()

Title   : set_de_duplex()
Usage   : my $set_de_duplex= $tmp->set_de_duplex("Y");
Function: Sets the de_duplex choice 
Returns : Populates the $self->{de_duplex} property
Args    : Y or N

get_qual_filter()

Title   : get_qual_filter()
Usage   : my $get_qual_filter= $tmp->get_qual_filter();
Function: Retrieves the qual filter used
Returns : integer
Args    : none

set_qual_filter()

Title   : set_qual_filter()
Usage   : my $set_qual_filter= $tmp->set_qual_filter(1200);
Function: Sets the qual filter used
Returns : Populates the $self->{qual_filter} property
Args    : integer

get_len_filter()

Title   : get_len_filter()
Usage   : my $get_len_filter= $tmp->get_len_filter();
Function: Retrieves the length filter
Returns : integer
Args    : none

set_len_filter()

Title   : set_len_filter()
Usage   : my $set_len_filter= $tmp->set_len_filter(80);
Function: Sets the len filter used
Returns : Populates the $self->{length_filter} property
Args    : integer

get_adapter_l()

Title   : get_adapter_l()
Usage   : my $get_adapter_l= $tmp->get_adapter_l();
Function: Retrieves the left adapter specified 
Returns : A string of the left adapter
Args    : none

set_adapter_l()

Title   : set_adapter_l()
Usage   : my $set_adapter_l= $tmp->set_adapter_l("GTACGTGTGGTGGGGAT");
Function: Sets the $self->{adapter_left} property
Returns : Populates the $self->{adapter_left} property
Args    : string

get_adapter_r()

Title   : get_adapter_r()
Usage   : my $get_adapter_r= $tmp->get_adapter_r();
Function: Retrieves the right adapter specified 
Returns : A string of the right adapter
Args    : none

set_adapter_r()

Title   : set_adapter_r()
Usage   : my $set_adapter_r= $tmp->set_adapter_r("TAGCGCGCGATGATT");
Function: Sets the $self->{adapter_right} property
Returns : Populates the $self->{adapter_right} property
Args    : string

get_left_trim()

Title   : get_left_trim()
Usage   : my $get_left_trim= $tmp->get_left_trim();
Function: Retrieves the left trim number
Returns : integer
Args    : none

set_left_trim()

Title   : set_left_trim()
Usage   : my $set_left_trim = $tmp->set_left_trim(5);
Function: Populates the $self->{left_trim} property
Returns : $self->{left_trim}
Args    : integer

get_right_trim()

Title   : get_right_trim()
Usage   : my $get_right_trim= $tmp->get_right_trim();
Function: gets the right trim number
Returns : integer
Args    : none

set_right_trim()

Title   : set_right_trim()
Usage   : my $set_right_trim = $tmp->set_right_trim(8);
Function: Populates the $self->{right_trim} property
Returns : $self->{right_trim}
Args    : integer

get_fasta()

Title   : get_fasta()
Usage   : my $get_fasta= $tmp->get_fasta();
Function: Retrieves the fasta_convert option
Returns : Y or N
Args    : none

set_fasta()

Title   : set_fasta()
Usage   : my $set_fasta = $tmp->set_fasta("Y");
Function: Populates the $self->{fasta_convert} property
Returns : $self->{fasta_convert}
Args    : a command to execute fastA convert or not: Y=yes, N=no

get_rev_com()

Title   : get_rev_com()
Usage   : my $get_rev_com= $tmp->get_rev_com();
Function: Retrieves the rev_comp option
Returns : Y or N
Args    : none

set_rev_com()

Title   : set_rev_com()
Usage   : my $set_rev_com = $tmp->set_rev_com("Y");
Function: Populates the $self->{rev_comp} property
Returns : $self->{rev_comp}
Args    : a command to execute rev_comp or not: Y=yes, N=no

get_remove_n()

Title   : get_remove_n()
Usage   : my $get_remove_n= $tmp->get_remove_n();
Function: Retrieves the remove_n option
Returns : Y or N
Args    : none

set_remove_n()

Title   : set_remove_n()
Usage   : my $set_remove_n = $tmp->set_remove_n("Y");
Function: Populates the $self->{remove_n} property
Returns : $self->{remove_n}
Args    : a command to remove reads with N or not: Y=yes, N=no

get_cleanup()

Title   : get_cleanup()
Usage   : my $get_cleanup = $tmp->get_cleanup();
Function: returns the value option for cleanup
Returns : Y or N
Args    : none

set_cleanup()

Title   : set_cleanup()
Usage   : my $set_cleanup = $tmp->set_cleanup("Y");
Function: Populates the $self->{cleanup} property
Returns : $self->{cleanup}
Args    : a command to execute cleanup or not: Y=yes, N=no
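
The get_ and set_ pairs documented above all describe the same shape: the setter stores a value in the object hash and returns it, and the getter reads it back. The sketch below shows that pattern on a stand-in class. Demo::Brew is hypothetical and exists only for this example; it is not the fastQ_brew implementation.

```perl
use strict;
use warnings;

package Demo::Brew;    # hypothetical stand-in class, not fastQ_brew itself

# Constructor: returns an empty blessed hash.
sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Setter: stores the value under lib_type and returns the stored value.
sub set_lib_type {
    my ( $self, $lib ) = @_;
    $self->{lib_type} = $lib;
    return $self->{lib_type};
}

# Getter: returns the stored value (undef if never set).
sub get_lib_type {
    my ($self) = @_;
    return $self->{lib_type};
}

package main;

my $demo = Demo::Brew->new();
$demo->set_lib_type("sanger");
print $demo->get_lib_type(), "\n";
```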

LICENSE AND COPYRIGHT

Copyright (C) 2017 Damien M. O'Halloran
GNU GENERAL PUBLIC LICENSE