NAME
CracTools::GenomeMask - A bit vector mask over the whole genome
VERSION
version 1.25
SYNOPSIS
my $genome_mask = CracTools::GenomeMask->new( genome => { "chr1" => 100000, "chr2" => 20000 } );
$genome_mask->setRegion("chr1",200,250);
$genome_mask->getNbBitsSetInRegion("chr1",190,220);
DESCRIPTION
This module defines a bit-vector mask over a whole genome and provides methods to query this mask. It can read sequence names and lengths from several sources (SAM headers, a CRAC index, or user input).
SEE ALSO
See CracTools::BitVector, the underlying data structure of CracTools::GenomeMask.
TODO
GenomeMask should be able to handle double-stranded DNA (as an option).
METHODS
new
There are multiple ways to create a genome mask:
One can specify an argument called genome,
a hashref whose keys are chromosome names and whose values are chromosome lengths.
my $genome_mask = CracTools::GenomeMask->new( genome => { seq_name => length,
seq_name => length,
...} );
One can specify an argument called crac_index_conf that is the configuration file of a CRAC index
my $genome_mask = CracTools::GenomeMask->new(crac_index_conf => "file.conf");
One can specify a CracTools::SAMReader
object in order to read chromosome names and lengths from the header
my $genome_mask = CracTools::GenomeMask->new(sam_reader => CracTools::SAMReader->new("file.sam"));
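For illustration, a genome hashref can also be built by hand from SAM header text. The sketch below uses only core Perl and relies on the SAM convention that each @SQ header line carries the sequence name in an SN: tag and its length in an LN: tag. The helper name genome_from_sam_header is hypothetical and is not part of CracTools.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CracTools): build a { name => length }
# hashref from the text of a SAM header.
sub genome_from_sam_header {
    my ($header_text) = @_;
    my %genome;
    for my $line (split /\n/, $header_text) {
        next unless $line =~ /^\@SQ\t/;            # keep reference sequence lines only
        my ($name) = $line =~ /\tSN:([^\t]+)/;     # sequence name tag
        my ($len)  = $line =~ /\tLN:(\d+)/;        # sequence length tag
        next unless defined $name && defined $len;
        $genome{$name} = $len;
    }
    return \%genome;
}

my $header = join "\n",
    "\@HD\tVN:1.0\tSO:unsorted",
    "\@SQ\tSN:chr1\tLN:100000",
    "\@SQ\tSN:chr2\tLN:20000";

my $genome = genome_from_sam_header($header);
print "$_ => $genome->{$_}\n" for sort keys %$genome;
# chr1 => 100000
# chr2 => 20000
```

The resulting hashref has the shape expected by the genome argument described above.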
getBitvector
Arg [1] : String - Chromosome
Description : Return the CracTools::BitVector associated with the reference name given as argument.
If no bit vector exists for this reference, a warning is reported.
ReturnType : CracTools::BitVector
getChrLength
Arg [1] : String - Chromosome
Description : Return the length of the chromosome
ReturnType : Integer
setPos
Arg [1] : String - Chromosome
Arg [2] : Integer - Position
Description : Set the bit at this genomic location
setRegion
Arg [1] : String - Chromosome
Arg [2] : Integer - Position start
Arg [3] : Integer - Position end
Example : $genome_mask->setRegion($chr,$start,$end);
Description : Set all bits to 1 for this region
getPos
Arg [1] : String - Chromosome
Arg [2] : Integer - Position
Description : Return true if the bit is set at this genomic location
ReturnType : Boolean
getPosSetInRegion
Arg [1] : String - Chromosome
Arg [2] : Integer - Position start
Arg [3] : Integer - Position end
Example : my @pos_set = @{$genome_mask->getPosSetInRegion($chr,$start,$end)};
Description : Return the positions of all bits set in this genomic region
ReturnType : Array(Integer)
getNbBitsSetInRegion
Arg [1] : String - Chromosome
Arg [2] : Integer - Position start
Arg [3] : Integer - Position end
Description : Return the number of bits set in this genomic region
ReturnType : Integer
rank
Arg [1] : String - Chromosome
Arg [2] : Integer - Position
Description : Return the number of bits set up to this genomic
position, treating the genome as a single linear sequence.
ReturnType : Integer
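One way to picture the "linear genome" idea is to lay the chromosomes end to end in a fixed order and count set bits from the start of the first one. The sketch below is a self-contained illustration in core Perl, not the module's implementation: the chromosome order (sorted by name), the 0-based inclusive positions, and the helper names set_pos and linear_rank are all assumptions made for the example.

```perl
use strict;
use warnings;

# Toy genome: one bit string per chromosome, handled with Perl's vec().
my %length = ( chr1 => 1000, chr2 => 500 );
my %bits   = map { $_ => '' } keys %length;

# Assumed convention for this sketch: positions are 0-based.
sub set_pos {
    my ($chr, $pos) = @_;
    vec($bits{$chr}, $pos, 1) = 1;
}

# Count set bits from the start of the name-sorted genome up to and
# including ($chr, $pos).
sub linear_rank {
    my ($chr, $pos) = @_;
    my $count = 0;
    for my $name (sort keys %length) {
        # Scan the whole chromosome unless it is the target one.
        my $end = $name eq $chr ? $pos : $length{$name} - 1;
        $count += vec($bits{$name}, $_, 1) for 0 .. $end;
        last if $name eq $chr;
    }
    return $count;
}

set_pos('chr1', 10);
set_pos('chr1', 999);
set_pos('chr2', 5);

print linear_rank('chr1', 10),  "\n";   # 1
print linear_rank('chr1', 998), "\n";   # 1
print linear_rank('chr2', 4),   "\n";   # 2
print linear_rank('chr2', 5),   "\n";   # 3
```

A bit-by-bit scan like this only shows the semantics; it says nothing about how the real bit vector computes the count.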
select
Arg [1] : Integer - Nth bit set
Example : my ($chr,$pos) = $genome_mask->select(12);
Description : Return an array with the (chr,pos) of the Nth bit set
ReturnType : Array(String,Integer)
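Select answers the opposite question to rank: given N, where does the Nth set bit lie? A minimal self-contained sketch in core Perl follows. It assumes chromosomes are visited in name-sorted order, positions are 0-based, and N is counted from 1; linear_select is a hypothetical helper, and the actual module may order chromosomes and number bits differently.

```perl
use strict;
use warnings;

# Toy genome: one bit string per chromosome, handled with Perl's vec().
my %length = ( chr1 => 1000, chr2 => 500 );
my %bits   = map { $_ => '' } keys %length;
vec($bits{chr1}, 10, 1)  = 1;
vec($bits{chr1}, 999, 1) = 1;
vec($bits{chr2}, 5, 1)   = 1;

# Return (chr, pos) of the Nth set bit (N starts at 1 in this sketch),
# or an empty list when fewer than N bits are set.
sub linear_select {
    my ($n) = @_;
    my $seen = 0;
    for my $name (sort keys %length) {
        for my $pos (0 .. $length{$name} - 1) {
            next unless vec($bits{$name}, $pos, 1);
            return ($name, $pos) if ++$seen == $n;
        }
    }
    return;
}

my ($chr, $pos) = linear_select(3);
print "$chr:$pos\n";   # chr2:5
```

In this toy setup the third set bit falls on the second chromosome because the two bits set on chr1 are counted first.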
AUTHORS
Nicolas PHILIPPE <nphilippe.research@gmail.com>
Jérôme AUDOUX <jaudoux@cpan.org>
Sacha BEAUMEUNIER <sacha.beaumeunier@gmail.com>
COPYRIGHT AND LICENSE
This software is Copyright (c) 2017 by IRMB/INSERM (Institute for Regenerative Medicine and Biotherapy / Institut National de la Santé et de la Recherche Médicale) and AxLR/SATT (Languedoc-Roussillon / Societe d'Acceleration de Transfert de Technologie).
This is free software, licensed under:
The GNU Affero General Public License, Version 3, November 2007