NAME
CracTools - A set of tools designed to extract data from CRAC's SAM files and to provide annotations.
VERSION
version 1.25
DESCRIPTION
CracTools-core is the cornerstone of CracTools. It is a toolbox that aims to ease the building of bioinformatics pipelines. It was originally built to produce pipelines on top of the CRAC software, but you can use the CracTools-core tools in other contexts. It has many built-in features to parse files, intersect biological events, integrate annotations, and share configuration.
CracTools-core is also shipped with some binaries that are directly based on the CracTools-core API:
cractools extract
: this tool extracts biological events (splices, SNPs, indels, chimeras) from BAM files produced by CRAC's analysis.
cractools gtf2togff3
: this tool converts GTF annotation files to the GFF3 format, which is the standard in CracTools.
- More tools are about to come (soon)
SPECIFICITIES
In CracTools, strands are encoded as 1 and -1 for forward and reverse respectively. CracTools works on closed intervals [a,b] in a 0-based coordinate system.
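The conventions above differ from those of common annotation formats, so coordinates usually need converting on input. The following is a minimal Python sketch of such conversions. The function names are hypothetical and are not part of the CracTools API. It assumes GFF/GTF-style input is 1-based and closed, and BED-style input is 0-based and half-open.

```python
# Illustrative helpers only; these names are NOT part of the CracTools API.

def encode_strand(symbol):
    """Map a '+'/'-' strand symbol to the 1/-1 encoding described above."""
    mapping = {"+": 1, "-": -1}
    if symbol not in mapping:
        raise ValueError(f"unknown strand symbol: {symbol!r}")
    return mapping[symbol]


def from_one_based_closed(start, end):
    """Convert a 1-based closed interval (GFF/GTF style) to 0-based closed."""
    return start - 1, end - 1


def from_zero_based_half_open(start, end):
    """Convert a 0-based half-open interval [start, end) (BED style)
    to a 0-based closed interval [start, end - 1]."""
    return start, end - 1


def closed_length(start, end):
    """Number of positions covered by a closed interval [start, end]."""
    return end - start + 1
```

With these helpers, the first ten bases of a sequence are written [0,9] whether they come from a GFF record (1..10) or from a BED record (0, 10).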
MODULES
File parsing
CracTools::Utils
Is a module that provides useful functions for opening files (I/O) with iterators, simple parsing of standard file formats (VCF, BED, GTF, GFF), and performing transformations such as reverse-complementing.
CracTools::SAMReader and CracTools::SAMReader::SAMline
Are modules that provide iterators and objects to easily read SAM/BAM files generated by CRAC, with dedicated methods to extract the additional fields that CRAC adds to each record.
CracTools::GFF::Annotation
Is a module to parse and access GFF3 files.
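As an illustration of the reverse-complement transformation mentioned for CracTools::Utils, here is a minimal Python sketch. It is not the CracTools implementation: the function name is hypothetical, and the sketch assumes plain DNA sequences over A, C, G, T and N, leaving any other character unchanged.

```python
# Illustrative sketch only; not the CracTools::Utils implementation.
# Translation table for DNA bases, preserving case; 'N' maps to itself.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    # Complement every base, then reverse the resulting string.
    return seq.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is a convenient sanity check when reads from the reverse strand are compared to the reference.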
Genomic-based data structures
CracTools::Interval::Query
Is a module to store and query genomic intervals associated with variables. It is based on the interval tree data structure provided by Set::IntervalTree.
CracTools::Interval::Query::File
Acts like CracTools::Interval::Query but reads intervals from files and returns the lines of the file that match the query. It has built-in methods to parse SAM, GTF/GFF, BED and VCF files, but you can provide your own method for other file formats.
CracTools::GenomeMask
Is a module that defines a BitVector mask over a whole genome and provides methods to query this mask. It can read genome sequence names and lengths from various sources (SAM headers, CRAC index, user input).
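To make the idea behind CracTools::Interval::Query concrete, the following Python sketch stores values under genomic intervals and retrieves those overlapping a query. It is deliberately naive: it scans a list linearly rather than using an interval tree, and the class and method names are hypothetical, not the CracTools API. It follows the conventions stated earlier (closed intervals, strands encoded as 1 and -1).

```python
from collections import defaultdict


class NaiveIntervalStore:
    """Illustrative stand-in for an interval query structure.

    Uses a linear scan, NOT an interval tree; names are hypothetical.
    """

    def __init__(self):
        # (chromosome, strand) -> list of (start, end, value)
        self._intervals = defaultdict(list)

    def add(self, chrom, start, end, strand, value):
        """Store a value under the closed interval [start, end]."""
        if start > end:
            raise ValueError("start must not exceed end")
        self._intervals[(chrom, strand)].append((start, end, value))

    def fetch_by_region(self, chrom, start, end, strand):
        """Return values whose interval overlaps the closed query [start, end]."""
        hits = self._intervals.get((chrom, strand), [])
        # Two closed intervals overlap when each starts before the other ends.
        return [v for s, e, v in hits if s <= end and start <= e]

    def fetch_by_location(self, chrom, pos, strand):
        """Return values whose interval contains the single position pos."""
        return self.fetch_by_region(chrom, pos, pos, strand)
```

Because intervals are closed, two features stored as [10,20] and [20,30] both contain position 20. An interval tree answers the same queries without visiting every stored interval, which matters for genome-scale annotation sets.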
Annotation
CracTools::Annotator
Is a module based on CracTools::Interval::Query::File that provides powerful methods to query annotation files and prioritize hits to fit specific application needs.
Utilities
CracTools::Config
Is a module that aims to provide a common configuration file shared among all the cractools pipelines. It automatically loads the configuration file by looking in several locations, then provides methods to retrieve the variables declared in it.
CracTools::Output
Is a module that provides methods to write customized column-based output files with pre-defined headers.
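The configuration lookup described for CracTools::Config can be pictured as a search over an ordered list of candidate paths. The Python sketch below shows that pattern only. The file name, the environment variable and the order of locations are all assumptions made for illustration; the locations actually searched by CracTools::Config are not listed here.

```python
import os


def find_config(candidates):
    """Return the first candidate path that is an existing file, else None."""
    for path in candidates:
        # Skip unset entries (for example a missing environment variable).
        if path and os.path.isfile(path):
            return path
    return None


def default_candidates(filename="cractools.cfg", env_var="CRACTOOLS_CONFIG"):
    """Build a search list. Both default names are HYPOTHETICAL."""
    return [
        os.environ.get(env_var),                                # explicit override
        os.path.join(os.getcwd(), filename),                    # current directory
        os.path.join(os.path.expanduser("~"), "." + filename),  # home directory
    ]
```

Putting the most specific location first lets a pipeline override a shared configuration without editing it.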
AUTHORS
Nicolas PHILIPPE <nphilippe.research@gmail.com>
Jérôme AUDOUX <jaudoux@cpan.org>
Sacha BEAUMEUNIER <sacha.beaumeunier@gmail.com>
COPYRIGHT AND LICENSE
This software is Copyright (c) 2017 by IRMB/INSERM (Institute for Regenerative Medicine and Biotherapy / Institut National de la Santé et de la Recherche Médicale) and AxLR/SATT (Languedoc Roussillon / Societe d'Acceleration de Transfert de Technologie).
This is free software, licensed under:
The GNU Affero General Public License, Version 3, November 2007