NAME

Compress::BGZF::Reader

VERSION

version 0.001

SYNOPSIS

use Compress::BGZF::Reader;

# Use as filehandle
my $fh_bgz = Compress::BGZF::Reader->new_filehandle( $bgz_filename );

# you can do this, but it's probably faster just to pipe gunzip
while (my $line = <$fh_bgz>) {
    print $line;
}

# here's the random-access goodness
# fetch 32 bytes from uncompressed offset 1001
seek $fh_bgz, 1001, 0;
read $fh_bgz, my $data, 32;
print $data;

# Use as object
my $reader = Compress::BGZF::Reader->new( $bgz_filename );

# Move to a virtual offset (somehow pre-calculated) and read 32 bytes
$reader->move_to_vo( $virt_offset );
$data = $reader->read_data(32);
print $data;

$reader->write_index( $fn_idx );

DESCRIPTION

Compress::BGZF::Reader is a module implementing random access to the BGZIP file format. While it can do sequential/streaming reads, there is little point in using it for this purpose over standard GZIP tools/libraries, since BGZIP is GZIP-compatible.

There are two main modes of construction: as an object (using new()) and as a filehandle glob (using new_filehandle()). The filehandle mode is straightforward for general use: it emulates seek/read/tell functionality and can be passed to other classes/methods that expect a filehandle. The object mode has additional features such as seeking to virtual offsets and dumping the offset index to file.

METHODS

Filehandle Functions

new_filehandle
my $fh_bgzf = Compress::BGZF::Reader->new_filehandle( $input_fn );

Create a new Compress::BGZF::Reader engine and tie it to an IO::File handle, which is returned. Takes a single mandatory argument: the name of the file to be read from.

<>
readline
seek
read
tell
eof
my $line = <$fh_bgzf>;
my $line = readline $fh_bgzf;
seek $fh_bgzf, 256, 0;
read $fh_bgzf, my $buffer, 32;
my $loc = tell $fh_bgzf;
print "End of file\n" if eof($fh_bgzf);

These functions emulate the standard Perl functions of the same name.

Object-oriented Methods

new
my $reader = Compress::BGZF::Reader->new( $fn_in );

Create a new Compress::BGZF::Reader engine. Requires a single argument - the name of the BGZIP file to be read from.

move_to
$reader->move_to( 493, 0 );

Seeks to the given uncompressed offset. Takes two arguments: the requested offset and the position that the offset is relative to (0: file start, 1: current position, 2: file end).
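
The three position codes match the SEEK_SET, SEEK_CUR and SEEK_END constants exported by the core Fcntl module. The following is a minimal sketch of how such a request resolves to an absolute uncompressed offset, assuming move_to follows the usual seek semantics. The resolve_offset helper is hypothetical and is not part of Compress::BGZF::Reader.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);    # exports SEEK_SET (0), SEEK_CUR (1), SEEK_END (2)

# Hypothetical helper (not part of Compress::BGZF::Reader): turn an
# (offset, whence) pair into an absolute uncompressed offset, given the
# current position and the total uncompressed size.
sub resolve_offset {
    my ($offset, $whence, $current, $size) = @_;
    return $offset            if $whence == SEEK_SET;    # from file start
    return $current + $offset if $whence == SEEK_CUR;    # from current position
    return $size + $offset    if $whence == SEEK_END;    # from file end
    die "Invalid whence value: $whence\n";
}

# Example values (illustrative only): currently at byte 100 of a
# 1000-byte uncompressed stream.
print resolve_offset( 493, SEEK_SET, 100, 1000 ), "\n";
print resolve_offset(  20, SEEK_CUR, 100, 1000 ), "\n";
print resolve_offset( -32, SEEK_END, 100, 1000 ), "\n";
```

With the real module, the same constants could presumably be passed directly, e.g. $reader->move_to( 493, SEEK_SET ), although whether negative offsets are accepted is not stated here.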

move_to_vo
$reader->move_to_vo( $virt_offset );

Like move_to, but takes a virtual offset as its single argument. Virtual offsets are described in more detail in the top-level documentation for Compress::BGZF.
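
For orientation, the BGZF specification defines a virtual offset as a 64-bit value whose upper 48 bits hold the byte offset of a block's start in the compressed file and whose lower 16 bits hold a position within that block's uncompressed data. The sketch below packs and unpacks such values. It assumes a 64-bit Perl and that this module follows the specification's layout. Both helper names are hypothetical and are not part of Compress::BGZF.

```perl
use strict;
use warnings;

# Hypothetical helpers (not part of Compress::BGZF). Assumes a Perl
# built with 64-bit integers.

# Combine a compressed block start and a within-block uncompressed
# offset into a single virtual offset.
sub make_virtual_offset {
    my ($block_start, $within_block) = @_;
    die "within-block offset must be in 0..65535\n"
        if $within_block < 0 || $within_block > 0xFFFF;
    return ( $block_start << 16 ) | $within_block;
}

# Split a virtual offset back into its two components.
sub split_virtual_offset {
    my ($vo) = @_;
    return ( $vo >> 16, $vo & 0xFFFF );
}

# Illustrative values only: a block starting at compressed byte 18432,
# and byte 1001 within that block's uncompressed data.
my $vo = make_virtual_offset( 18_432, 1_001 );
my ( $block_start, $within_block ) = split_virtual_offset($vo);
printf "%d -> block start %d, within-block offset %d\n",
    $vo, $block_start, $within_block;
```

A value built this way is the kind of pre-calculated offset that would be handed to move_to_vo.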

read_data
my $data = $reader->read_data( 32 );

Read uncompressed data from the current location. Takes a single argument - the number of bytes to be read - and returns the data read or undef if at EOF.

getline
my $line = $reader->getline();

Reads one line of uncompressed data from the current location, shifting the current file offset accordingly. Returns the line read or undef if currently at EOF.

usize
my $size = $reader->usize();

Returns the uncompressed size of the file, as calculated during indexing.

write_index
$reader->write_index( $fn_index );

Writes the offset index to file. The index format (as defined by htslib) consists of little-endian int64-coded values. The first value is the number of offsets in the index. The remaining values are pairs of block offsets, the first relative to the compressed data and the second relative to the uncompressed data. The first offset (always 0,0) is not included. The index files written by Compress::BGZF should be compatible with those of the htslib bgzip software, and vice versa.
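
The layout described above can be decoded with a few lines of core Perl. The sketch below works on an in-memory byte string rather than a real index file, treats the values as unsigned 64-bit integers (an assumption; the 'Q<' template needs a 64-bit Perl), and restores the implicit 0,0 entry. The parse_index helper is hypothetical and is not part of Compress::BGZF.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Compress::BGZF): decode index bytes
# into a list of [compressed_offset, uncompressed_offset] pairs.
sub parse_index {
    my ($bytes) = @_;
    die "index too short\n" if length($bytes) < 8;

    # First value is the entry count; the rest are offset pairs.
    my ( $count, @values ) = unpack 'Q<*', $bytes;
    die "expected " . ( 2 * $count ) . " offsets, found " . scalar(@values) . "\n"
        unless @values == 2 * $count;

    my @entries = ( [ 0, 0 ] );    # first block is implicit in the file
    while ( my ( $c_off, $u_off ) = splice @values, 0, 2 ) {
        push @entries, [ $c_off, $u_off ];
    }
    return \@entries;
}

# Build a small synthetic index (illustrative values only): two stored
# entries, so three blocks in total once the implicit one is restored.
my $bytes   = pack 'Q<*', 2, 100, 65_280, 220, 130_560;
my $entries = parse_index($bytes);

printf "block %d: compressed %d, uncompressed %d\n", $_, @{ $entries->[$_] }
    for 0 .. $#$entries;
```

To read a real index, the bytes would be slurped from a file opened in binary mode (for example with binmode) before being passed to the helper.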

CAVEATS AND BUGS

This code is in the alpha testing stage and the API is not guaranteed to be stable.

Please report bugs to the author.

AUTHOR

Jeremy Volkening <jdv *at* base2bio.com>

COPYRIGHT AND LICENSE

Copyright 2015-2016 Jeremy Volkening

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.