NAME
Compress::DSRC - Perl bindings to the DSRC compression library
SYNOPSIS
Single-shot (de)compression
use Compress::DSRC;
my $engine = Compress::DSRC::Module->new;
my $settings = Compress::DSRC::Settings->new;
my $threads = 8;
$settings->set_dna_level(2);
$settings->set_lossy(1);
$engine->compress(
    'foo.fq' => 'foo.fq.dsrc',
    $settings,
    $threads,
) or die $engine->error;
$engine->decompress(
    'foo.fq.dsrc' => 'bar.fq',
    $threads,
) or die $engine->error;
Per-record (de)compression
use Compress::DSRC;
my $threads = 8;
my $reader = Compress::DSRC::Reader->new;
$reader->start( 'bar.fq.dsrc', $threads )
    or die $reader->error;
my $record = Compress::DSRC::Record->new;
while ( $reader->read_record($record) ) {
    print $record->get_tag, "\n";
    print $record->get_sequence, "\n";
    print $record->get_plus, "\n";
    print $record->get_quality, "\n";
    # or, more likely, do something else with $record
}
$reader->finish;
DESCRIPTION
This module provides bindings to the DSRC compression library. It provides basic access to the DsrcModule (one-shot (de)compression) and DsrcArchive (record-by-record (de)compression) APIs.
CLASSES
Compress::DSRC provides the following classes used in compression and decompression:
- Compress::DSRC::Module
Objects of this class are used for one-shot compression and decompression (taking an input filename and an output filename, along with some other optional parameters).
- Compress::DSRC::Reader
Objects of this class are used to read record-by-record from a compressed archive.
- Compress::DSRC::Writer
Objects of this class are used to write record-by-record to a compressed archive.
- Compress::DSRC::Settings
Objects of this class contain compression settings and are provided as arguments to several methods that write compressed data.
- Compress::DSRC::Record
Objects of this class contain a single FASTQ record with accessors to each of the four data slots.
METHODS
Compress::DSRC::Module
- new
my $engine = Compress::DSRC::Module->new;
Creates a new one-shot (de)compression engine.
- compress
$engine->compress( 'foo.fq', 'foo.fq.dsrc', $settings, $threads ) or die $engine->error;
Compress a FASTQ file in one shot. Required arguments are (in order) the input filename, the output filename, and a Compress::DSRC::Settings object. The number of threads to use for compression is an optional fourth argument (default: 1).
- decompress
$engine->decompress( 'foo.fq.dsrc', 'foo.fq', $threads ) or die $engine->error;
As with compress(), but in the other direction. Required arguments are (in order) the input filename and the output filename. The number of threads to use for decompression is an optional third argument (default: 1).
- error
If an error occurs, a description can be retrieved using this method.
Compress::DSRC::Reader
- new
my $reader = Compress::DSRC::Reader->new;
Create a new Reader object.
- start
$reader->start( 'foo.fq.dsrc', $threads );
Initialize a decompression session. Arguments are the input filename (required) and the number of threads to use (optional; default: 1).
- read_record
while ( $reader->read_record($record) ) {
    # do something with $record
}
Read the next record in the file. A single argument is expected: a Compress::DSRC::Record object whose data slots will be populated from the record read.
- next_record
while ( my $record = $reader->next_record() ) {
    # do something with $record
}
This provides a slightly more Perl-ish alternative to read_record() for those who prefer it, at the cost of roughly 1.5x longer run times (a new Compress::DSRC::Record object is generated for each call).
- finish
$reader->finish;
Finalize the session.
- error
If an error occurs, a description can be retrieved using this method.
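
A minimal sketch of a complete read session, using only the Reader and Record methods named in this document. It reuses one Compress::DSRC::Record across calls to read_record() and tallies records and bases. The archive name and thread count are placeholders, and the sketch assumes that start() returns false on failure, as the SYNOPSIS suggests.

```perl
use strict;
use warnings;
use Compress::DSRC;

my $archive = 'reads.fq.dsrc';    # placeholder filename
my $threads = 2;                  # placeholder thread count

my $reader = Compress::DSRC::Reader->new;
$reader->start( $archive, $threads )
    or die $reader->error;

# Reuse a single Record; read_record() repopulates its slots on each call
my $record = Compress::DSRC::Record->new;
my ( $n_records, $n_bases ) = ( 0, 0 );
while ( $reader->read_record($record) ) {
    ++$n_records;
    $n_bases += length $record->get_sequence;
}
$reader->finish;

print "$n_records records, $n_bases bases\n";
```

Reusing the Record object avoids the per-call allocation that makes next_record() slower.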
Compress::DSRC::Writer
- new
my $writer = Compress::DSRC::Writer->new;
Create a new Writer object.
- start
$writer->start( 'foo.fq.dsrc', $settings, $threads );
Initialize a compression session. Arguments are the output filename and a Compress::DSRC::Settings object (both required) and the number of threads to use (optional; default: 1).
- write_record
$writer->write_record( $record );
Write a record to file. A single argument is expected: a Compress::DSRC::Record object.
- finish
$writer->finish;
Finalize the session.
- error
If an error occurs, a description can be retrieved using this method.
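
A minimal sketch of a complete write session built from the Writer, Settings, and Record methods named in this document. The output filename, thread count, and the two reads are illustrative placeholders. The sketch assumes that start() returns false on failure (mirroring the Reader usage in the SYNOPSIS), and it creates a fresh Record per read because this document does not say whether a Record can safely be reused after write_record().

```perl
use strict;
use warnings;
use Compress::DSRC;

my $settings = Compress::DSRC::Settings->new;
$settings->set_dna_level(2);

my $writer = Compress::DSRC::Writer->new;
$writer->start( 'out.fq.dsrc', $settings, 2 )    # placeholder name and threads
    or die $writer->error;

# Illustrative data only; two records sidestep the single-record caveat
my @reads = (
    [ '@read1', 'ATGGCCTA', '998398A8' ],
    [ '@read2', 'TTGACCGA', 'A8998398' ],
);

for my $read (@reads) {
    my ( $tag, $sequence, $quality ) = @$read;
    my $record = Compress::DSRC::Record->new;
    $record->set_tag($tag);
    $record->set_sequence($sequence);
    $record->set_quality($quality);
    $writer->write_record($record);
}
$writer->finish;
```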
Compress::DSRC::Record
The underlying class is a C++ struct, so all methods are accessors to class member variables: get_tag/set_tag, get_sequence/set_sequence, get_plus/set_plus, and get_quality/set_quality. See FASTQ documentation for more information. get_plus and set_plus will rarely be used (this slot in the FASTQ specification is generally redundant and usually empty) but are included for completeness.
my $record = Compress::DSRC::Record->new;
$record->set_tag( '@read1 other info' );
$record->set_sequence( 'ATGGCCTA' );
$record->set_quality( '998398A8' );
# do something with $record;
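
As a small illustration of the getters, the following sketch checks that a record's sequence and quality strings have equal length (a FASTQ requirement) before the record is used further. The record contents are the same illustrative values as above; the check itself is not part of Compress::DSRC.

```perl
use strict;
use warnings;
use Compress::DSRC;

my $record = Compress::DSRC::Record->new;
$record->set_tag('@read1 other info');
$record->set_sequence('ATGGCCTA');
$record->set_quality('998398A8');

# FASTQ requires one quality character per base
my $seq_len  = length $record->get_sequence;
my $qual_len = length $record->get_quality;
die sprintf( "%s: sequence length %d != quality length %d\n",
    $record->get_tag, $seq_len, $qual_len )
    if $seq_len != $qual_len;
```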
Compress::DSRC::Settings
The underlying class is a C++ struct, so all methods are accessors to class member variables. For more information on the meaning of settings, see DSRC documentation.
- get_dna_level / set_dna_level
Get/set the DNA compression level.
- get_qual_level / set_qual_level
Get/set the quality compression level.
- get_lossy / set_lossy
Get/set whether to use lossy (binning) quality compression.
- get_calc_crc32 / set_calc_crc32
Get/set whether to do CRC32 checking during compression.
- get_buffer_size / set_buffer_size
See DSRC documentation.
- get_tag_mask / set_tag_mask
See DSRC documentation.
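
A short sketch of configuring a Settings object and reading the values back. The DNA level and lossy values are those used in the SYNOPSIS. Passing 1 to set_calc_crc32 assumes the setter accepts a true value in the same way set_lossy does, and the printed form of each getter's return value is not specified here.

```perl
use strict;
use warnings;
use Compress::DSRC;

my $settings = Compress::DSRC::Settings->new;
$settings->set_dna_level(2);
$settings->set_lossy(1);
$settings->set_calc_crc32(1);    # assumption: any true value enables CRC32

# Read the current values back, including the untouched quality level
printf "dna_level=%s qual_level=%s lossy=%s calc_crc32=%s\n",
    $settings->get_dna_level,
    $settings->get_qual_level,
    $settings->get_lossy,
    $settings->get_calc_crc32;
```

The same object can then be passed to Compress::DSRC::Module's compress() or Compress::DSRC::Writer's start().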
DEPENDENCIES
Requires a C++ compiler and the Boost system/thread libraries. There are no other external dependencies.
CAVEATS AND BUGS
Currently the underlying C++ library (and thus this module) does not handle the edge case of a FASTQ file containing a single record. A bug report has been filed upstream.
Please report bugs to the author.
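
Until the single-record issue is resolved upstream, callers may want to detect such files before compressing. The sketch below is one possible guard in plain Perl. count_fastq_records is a hypothetical helper, not part of Compress::DSRC, and it assumes the common layout of exactly four lines per record (no wrapped sequences).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Compress::DSRC): count FASTQ records,
# assuming exactly four lines per record
sub count_fastq_records {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $lines = 0;
    ++$lines while <$fh>;
    close $fh;
    return int( $lines / 4 );
}

# Warn about any input file named on the command line that holds one record
for my $input (@ARGV) {
    warn "$input holds a single record; DSRC may not handle it\n"
        if count_fastq_records($input) == 1;
}
```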
AUTHOR
Jeremy Volkening <jdv@base2bio.com>
COPYRIGHT AND LICENSE
Copyright 2015-2016 Jeremy Volkening
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.