NAME
Compress::BGZF::Reader
VERSION
version 0.001
SYNOPSIS
use Compress::BGZF::Reader;
# Use as filehandle
my $fh_bgz = Compress::BGZF::Reader->new_filehandle( $bgz_filename );
# you can do this, but it's probably faster just to pipe gunzip
while (my $line = <$fh_bgz>) {
print $line;
}
# here's the random-access goodness
# fetch 32 bytes from uncompressed offset 1001
seek $fh_bgz, 1001, 0;
read $fh_bgz, my $data, 32;
print $data;
# Use as object
my $reader = Compress::BGZF::Reader->new( $bgz_filename );
# Move to a virtual offset (somehow pre-calculated) and read 32 bytes
$reader->move_to_vo( $virt_offset );
$data = $reader->read_data(32);
print $data;
$reader->write_index( $fn_idx );
DESCRIPTION
Compress::BGZF::Reader is a module implementing random access to the BGZIP file format. While it can do sequential/streaming reads, there is really no point in using it for this purpose over standard GZIP tools/libraries, since BGZIP is GZIP-compatible.
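As an illustration of the GZIP compatibility mentioned above, the following is a minimal sketch of streaming a multi-member gzip file with the core IO::Uncompress::Gunzip module. It does not use Compress::BGZF at all. The in-memory data is a stand-in built from two ordinary gzip members; real BGZIP blocks also carry an extra header field, which standard gzip readers are expected to skip.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Build a stand-in for a BGZIP file: independent gzip members, concatenated.
my @chunks = ( "first block\n", "second block\n" );
my $stream = '';
for my $chunk (@chunks) {
    my $member = '';
    gzip( \$chunk => \$member ) or die "gzip failed: $GzipError";
    $stream .= $member;
}

# MultiStream => 1 makes the reader continue past the end of the first member.
my $z = IO::Uncompress::Gunzip->new( \$stream, MultiStream => 1 )
    or die "gunzip failed: $GunzipError";

my @lines;
while ( defined( my $line = $z->getline ) ) {
    push @lines, $line;
}
$z->close;

print @lines;
```

To read from disk instead, a filename can be passed to the constructor in place of the scalar reference.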
There are two main modes of construction: as an object (using new()) and as a filehandle glob (using new_filehandle()). The filehandle mode is straightforward for general use (emulating seek/read/tell functionality and passing to other classes/methods that expect a filehandle). The object mode has additional features such as seeking to virtual offsets and dumping the offset index to file.
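For readers unfamiliar with virtual offsets, here is a minimal sketch of how they are usually composed in the BGZF format as described in the SAM/BAM specification: the compressed offset of a block is stored in the upper 48 bits and the offset within that block's uncompressed data in the lower 16 bits. The helper names below are hypothetical and are not part of this module, and the sketch assumes a Perl built with 64-bit integers.

```perl
use strict;
use warnings;

# Hypothetical helper: combine a compressed block offset and a within-block
# offset into a single virtual offset.
sub make_virtual_offset {
    my ( $block_offset, $within_block ) = @_;
    die "within-block offset must be in the range 0..65535\n"
        if $within_block < 0 || $within_block > 0xFFFF;
    return ( $block_offset << 16 ) | $within_block;
}

# Hypothetical helper: split a virtual offset back into its two parts.
sub split_virtual_offset {
    my ($vo) = @_;
    return ( $vo >> 16, $vo & 0xFFFF );
}

my $vo = make_virtual_offset( 4096, 37 );
my ( $block, $within ) = split_virtual_offset($vo);
printf "%d -> block at %d, offset %d within block\n", $vo, $block, $within;
```

A value produced this way is the kind of pre-calculated offset that could be handed to move_to_vo, assuming the module follows the same convention.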
METHODS
Filehandle Functions
- new_filehandle
-
my $fh_bgzf = Compress::BGZF::Reader->new_filehandle( $input_fn );
Create a new Compress::BGZF::Reader engine and tie it to an IO::File handle, which is returned. Takes a single mandatory argument: the name of the file to be read from.
- <>
- readline
- seek
- read
- tell
- eof
-
my $line = <$fh_bgzf>;
my $line = readline $fh_bgzf;
seek $fh_bgzf, 256, 0;
read $fh_bgzf, my $buffer, 32;
my $loc = tell $fh_bgzf;
print "End of file\n" if eof($fh_bgzf);
These functions emulate the standard Perl functions of the same name.
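Since the tied handle is meant to behave like an ordinary Perl filehandle, the semantics can be previewed without a BGZIP file. The sketch below uses a plain in-memory filehandle as a stand-in for the uncompressed content; it does not load Compress::BGZF, and the sample text is made up.

```perl
use strict;
use warnings;

# Stand-in data; with the module this would be the uncompressed content.
my $text = "alpha\nbravo\ncharlie\n";
open my $fh, '<', \$text or die "cannot open in-memory handle: $!";

seek $fh, 6, 0;                 # whence 0: offset counted from the file start
read $fh, my $buffer, 5;        # $buffer now holds 'bravo'
my $pos = tell $fh;             # 6 + 5 = 11

seek $fh, 1, 1;                 # whence 1: relative to the current position
my $line = <$fh>;               # "charlie\n"
my $at_end = eof($fh) ? 1 : 0;  # 1, nothing is left to read

print "$buffer at $pos, then $line";
```

The same calls should work on a handle returned by new_filehandle, with offsets referring to positions in the uncompressed data.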
Object-oriented Methods
- new
-
my $reader = Compress::BGZF::Reader->new( $fn_in );
Create a new Compress::BGZF::Reader engine. Requires a single argument - the name of the BGZIP file to be read from.
- move_to
-
$reader->move_to( 493, 0 );
Seeks to the given uncompressed offset. Takes two arguments - the requested offset and the position it is relative to (0: file start, 1: current position, 2: file end).
- move_to_vo
-
$reader->move_to_vo( $virt_offset );
Like move_to, but takes as a single argument a virtual offset. Virtual offsets are described in more detail in the top-level documentation for Compress::BGZF.
- read_data
-
my $data = $reader->read_data( 32 );
Read uncompressed data from the current location. Takes a single argument - the number of bytes to be read - and returns the data read or undef if at EOF.
- getline
-
my $line = $reader->getline();
Reads one line of uncompressed data from the current location, shifting the current file offset accordingly. Returns the line read or undef if currently at EOF.
- usize
-
my $size = $reader->usize();
Returns the uncompressed size of the file, as calculated during indexing.
- write_index
-
$reader->write_index( $fn_index );
Writes the compressed index to file. The index format (as defined by htslib) consists of little-endian int64-coded values. The first value is the number of offsets in the index. The rest of the values consist of pairs of block offsets relative to the compressed and uncompressed data. The first offset (always 0,0) is not included. The index files written by Compress::BGZF should be compatible with those of the htslib bgzip software, and vice versa.
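The index layout described under write_index can be decoded with Perl's built-in unpack. The following is a minimal sketch, not the module's own implementation: parse_bgzf_index is a hypothetical helper, the offsets are made up, and the pair order (compressed offset first, then uncompressed) is assumed from the description above.

```perl
use strict;
use warnings;

# Hypothetical helper: decode raw index bytes into a list of
# [compressed_offset, uncompressed_offset] pairs.
# The 'q' format needs a Perl built with 64-bit integer support.
sub parse_bgzf_index {
    my ($bytes) = @_;
    my ( $count, @values ) = unpack 'q<*', $bytes;
    die "empty index\n" unless defined $count;
    die "index is truncated or has trailing data\n"
        unless @values == 2 * $count;

    my @pairs;
    while ( my ( $c_off, $u_off ) = splice @values, 0, 2 ) {
        push @pairs, [ $c_off, $u_off ];
    }
    return \@pairs;
}

# Made-up index for a file with three blocks; the first block (0,0) is implicit.
my $bytes = pack 'q<*', 2, 18_000, 65_280, 35_500, 130_560;
my $pairs = parse_bgzf_index($bytes);
printf "block at compressed %d starts at uncompressed %d\n", @$_ for @$pairs;
```

For an index written to disk, the bytes would be read from a handle opened in raw (binary) mode before being passed to the helper.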
CAVEATS AND BUGS
This code is in the alpha testing stage and the API is not guaranteed to be stable.
Please report bugs to the author.
AUTHOR
Jeremy Volkening <jdv *at* base2bio.com>
COPYRIGHT AND LICENSE
Copyright 2015-2016 Jeremy Volkening
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.