NAME
wgs2ncbi - prepares whole genome sequencing projects for submission to NCBI
SYNOPSIS
Usage: wgs2ncbi [action] -conf [config file]
Typically, you will run the following sequence of commands:
$ wgs2ncbi prepare -conf config.ini
$ wgs2ncbi process -conf config.ini
$ wgs2ncbi convert -conf config.ini
$ wgs2ncbi prune -conf config.ini
$ wgs2ncbi trim -conf config.ini
$ wgs2ncbi compress -conf config.ini
The prepare
and compress
steps will be one time operations, but process
, convert
, trim
and prune
may be iterative, depending on the feedback you will get from NCBI (e.g. about invalid product names, unmasked adaptor sequences, and other problematic regions).
DESCRIPTION
wgs2ncbi
is a script that helps users prepare submissions of annotated, whole genomes to NCBI. It does this by performing a number of actions that need to be taken in sequence. Each of these actions need to be invoked as a subcommand (i.e. wgs2ncbi [action]
), which will run for a certain amount of time. The actions are documented more fully in the module of functions that this script is based on. Links to the respective, expanded documentation sections are given below. Here follows a brief description of the actions:
prepare
Prepares the rest of the procedure by expanding the single genome annotation file into separate files, one for each contig. See "prepare" in Bio::WGS2NCBI.
process
Processes the genome by writing out feature tables and masking contig segments as needed.
convert
Converts the masked contigs and feature tables into ASN.1 using tbl2asn.
prune
Based on a validation file from NCBI, makes pruned versions of feature tables that omit features within regions identified by NCBI.
trim
Trims leading and trailing NNNs from sequence files and feature tables.
compress
Packs the ASN.1 files into a .tar.gz archive for upload to NCBI.