NAME

rnaseq_peakfinder.pl - Identify peaks/enriched regions in RNA-seq data

SYNOPSIS

rnaseq_peakfinder.pl [--bgpos FILE] [--bgneg FILE] [options]

DESCRIPTION

This program identifies peaks in RNA-seq data. Starting from coverage information in bedGraph format, it applies a two-step sliding window approach to characterize enriched regions with predefined properties, including maximum length, minimum coverage, and maximum coverage at both ends of the genomic interval.
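
The window scan described above can be sketched as follows. This is a minimal illustration in Python, not the tool's actual Perl implementation. The function name enriched_windows, the use of the mean coverage per window as the enrichment criterion, and the merging of touching windows are all assumptions made for this example; the parameters mirror the --winsize, --interval and --mincov options.

```python
def enriched_windows(coverage, winsize, interval, mincov):
    """Return merged half-open (start, end) regions whose windows reach mincov.

    coverage is a list of per-nucleotide coverage values for one strand of
    one chromosome. ASSUMPTION: a window counts as enriched when its mean
    coverage is at least mincov; the real tool may use a different criterion.
    """
    regions = []
    last_start = max(len(coverage) - winsize, 0)
    for start in range(0, last_start + 1, interval):
        window = coverage[start:start + winsize]
        if not window or sum(window) / len(window) < mincov:
            continue
        end = start + len(window)
        if regions and start <= regions[-1][1]:
            # Window overlaps or touches the previous region: extend it.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


if __name__ == "__main__":
    cov = [0] * 10 + [20] * 10 + [0] * 10
    print(enriched_windows(cov, winsize=5, interval=5, mincov=10))
```

With a step size smaller than the window size, neighbouring windows overlap and a contiguous enriched stretch is reported as a single region.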

Please note: It is highly recommended to use normalized input data.

OPTIONS

--bgpos

BedGraph input file containing coverage of the [+] strand.

--bgneg

BedGraph input file containing coverage of the [-] strand.

--winsize -w

Size of the sliding window in nt.

--interval -i

Distance by which the sliding window is shifted at each step ('step size').

--mincov -m

Minimum coverage required for an enriched region to be considered.

--length -l

Maximum length of a peak in nt.

--threshold -t

Fraction of the maximum coverage value allowed at both ends of a peak (default 0.1, i.e. 10%). This value is used to identify peak boundaries.

--out -o

Output directory.

--help -h

Prints a short help message

--man

Prints the manual page and exits

NOTES

The memory footprint of this tool is rather high (several GB for eukaryotic systems) because the input BedGraph files are first parsed into an array of Bio::ViennaNGS::BedGraphEntry objects by Bio::ViennaNGS::FeatureIO. In a second step, this array is converted into a hash of arrays data structure within Bio::ViennaNGS::Peaks to allow for efficient window sliding. This may be refactored in a future release.
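
The following Python sketch shows why a hash of arrays is convenient for window sliding and why it is memory-hungry. It is a hypothetical analogue (a dict of lists keyed by chromosome), not the code used in Bio::ViennaNGS::Peaks. The function name bedgraph_to_coverage is an assumption; the input follows the standard bedGraph layout of chromosome, 0-based start, exclusive end, and value.

```python
from collections import defaultdict


def bedgraph_to_coverage(lines):
    """Expand bedGraph intervals into per-nucleotide coverage lists.

    Returns a dict mapping chromosome name to a list in which index i
    holds the coverage at 0-based position i. Positions not covered by
    any interval are stored as 0.0, which is what makes this dense
    layout expensive for large genomes.
    """
    coverage = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and track/browser/comment header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        start, end, value = int(start), int(end), float(value)
        per_base = coverage[chrom]
        if len(per_base) < end:
            per_base.extend([0.0] * (end - len(per_base)))
        for pos in range(start, end):
            per_base[pos] = value
    return dict(coverage)


if __name__ == "__main__":
    demo = ["track type=bedGraph", "chr1\t2\t5\t3.0", "chr1\t7\t8\t1.5"]
    print(bedgraph_to_coverage(demo))
```

Once the data is in this form, a window is a simple slice of one list, at the cost of storing one value per nucleotide.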

AUTHOR

Michael T. Wolfinger <michael@wolfinger.eu>