NAME

BS_PCRTagger.pl

VERSION

Version 3.00

DESCRIPTION

This utility creates unique tags for open reading frames to aid the analysis
  of synthetic content in a nascent synthetic genome. Each tag in a gene has
  a wild type and a synthetic version that correspond to the same offset in the
  gene; tags can be paired to form gene-specific amplicons that are also
  specific to either wild type or synthetic sequence, depending on which tags
  are used.

To pick tags for a chromosome, each open reading frame over I<MINORFLEN> base
 pairs long will be slightly recoded to contain a set of PCR tags. The
 locations and sequences of these tags are carefully chosen to maximize the
 selectivity of the tags for either wild type or synthetic sequence. Each wild
 type or synthetic tag and its reverse complement are unique in the entire
 wild type genome; this is accomplished by creating a BLAST database for the
 entire wild type genome and BLASTing each potential tag against it (this
 requires that a complete wild type genome is available in the BioStudio
 repository). Pairs of tags are selected in such a way that they will not
 amplify any other genomic sequence under 1000 bases long. Each synthetic
 counterpart to a wild type tag is recoded with GeneDesign's "most different"
 algorithm to guarantee maximum nucleotide sequence difference while
 maintaining identical protein sequence and, hopefully, minimizing any effect
 on gene expression. The synthetic tags are all at least I<MINPERDIFF> percent
 recoded from the wild type tags. Each tag is positioned in such a way that
 the first and last nucleotides correspond to the wobble of a codon that can
 be edited to change its wobble without changing its amino acid. This
 excludes methionine and tryptophan, which have a single codon each in the
 standard genetic code, and it can exclude other amino acids when a
 I<MINRSCUVAL> filter is in place. The wobble restriction ensures that
 the synthetic and wild type counterparts have different 5' and 3'
 nucleotides, minimizing the chances that they (and their complements) will
 cross-prime. This means that tags will be between I<MINTAGLEN> and
 I<MAXTAGLEN> base pairs long, and every tag length is a multiple of 3 plus 1.
 All tags have a melting temperature between I<MINTAGMELT> and I<MAXTAGMELT>
 so they can be used in a single set of PCR conditions.
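
The length and recoding constraints described above can be illustrated with a
 short Python sketch. It is not taken from BS_PCRTagger.pl: the function names
 are hypothetical, and the percent difference is assumed to be the share of
 mismatched positions between two tags of equal length.

```python
def valid_tag_lengths(min_len=19, max_len=28):
    """Return every length of the form 3n + 1 within [min_len, max_len]."""
    return [n for n in range(min_len, max_len + 1) if n % 3 == 1]


def percent_difference(wild, synth):
    """Percentage of positions at which two equal-length tags differ.

    Assumed definition; the script's own calculation is not documented here.
    """
    if len(wild) != len(synth):
        raise ValueError("tags must be the same length")
    mismatches = sum(1 for a, b in zip(wild, synth) if a != b)
    return 100.0 * mismatches / len(wild)


def ends_differ(wild, synth):
    """True when both the 5' and the 3' nucleotides differ between the tags."""
    return wild[0] != synth[0] and wild[-1] != synth[-1]
```

With the default bounds, valid_tag_lengths() yields 19, 22, 25 and 28, which
 are the only lengths that start and end on a wobble position.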

Tag pairs are chosen to form amplicons specific for each ORF, with at least
 one amplicon chosen per kilobase of ORF. Each amplicon is between
 I<MINAMPLEN> and I<MAXAMPLEN> base pairs long, ensuring that they will all
 fall within an easily identifiable range on an agarose gel. No amplicon will
 be chosen within the first I<FIVEPRIMESTART> base pairs of an ORF to avoid
 disrupting unknown regulatory features. Amplicons are forbidden from
 overlapping each other by more than I<MAXAMPOLAP> percent.
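
The amplicon rules above can be sketched in Python as follows. This is an
 illustration under stated assumptions, not the script's implementation:
 amplicons are taken to be (start, end) pairs of 1-based inclusive offsets
 within an ORF, overlap is assumed to be measured as a percentage of the
 shorter amplicon, and the function names are hypothetical.

```python
def overlap_percent(a, b):
    """Overlap of two (start, end) amplicons as a percentage of the shorter one.

    Assumed definition; the document does not say how overlap is measured.
    """
    a_start, a_end = a
    b_start, b_end = b
    shared = min(a_end, b_end) - max(a_start, b_start) + 1
    if shared <= 0:
        return 0.0
    shorter = min(a_end - a_start + 1, b_end - b_start + 1)
    return 100.0 * shared / shorter


def amplicon_ok(amp, chosen, min_len=200, max_len=500,
                five_prime_start=101, max_olap=25):
    """Check one candidate amplicon against the length, 5' and overlap rules."""
    start, end = amp
    length = end - start + 1
    if not (min_len <= length <= max_len):
        return False
    if start < five_prime_start:
        return False
    # Reject the candidate if it overlaps any already chosen amplicon too much.
    return all(overlap_percent(amp, other) <= max_olap for other in chosen)
```

Under these assumptions, a candidate spanning offsets 351 to 650 is accepted
 alongside an amplicon at 101 to 400, while one spanning 301 to 600 is
 rejected for overlapping it by more than 25 percent.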

ARGUMENTS

Required arguments:

-C, --CHROMOSOME : The chromosome to be modified
-E, --EDITOR : The person responsible for the edits
-ME, --MEMO : Justification for the edits

Optional arguments:

--ITERATE : [genome, chromosome (def)] Which version number to increment?
-STA, --STARTPOS : The first base for analysis
-STO, --STOPPOS  : The last base for analysis
--MINTAGMELT : (default 58) Minimum melting temperature for tags
--MAXTAGMELT : (default 60) Maximum melting temperature for tags
--MINPERDIFF : (default 33) Minimum percent difference in base pairs between
               synthetic and wild type versions of a tag
--MINTAGLEN  : (default 19) Minimum length for tags. Must be a multiple of 3,
               plus 1
--MAXTAGLEN  : (default 28) Maximum length for tags. Must be a multiple of 3,
               plus 1
--MINAMPLEN  : (default 200) Minimum span for a pair of tags
--MAXAMPLEN  : (default 500) Maximum span for a pair of tags
--MAXAMPOLAP : (default 25) Maximum percentage of overlap allowed between
               different tag pairs
--MINORFLEN  : (default 501) Minimum size of gene for tagging eligibility
--FIVEPRIMESTART : (default 101) The first base in a gene eligible for a tag
--MINRSCUVAL : (default 0.06) The minimum RSCU value for any replacement codon
               in a tag
--OUTPUT    : [html, txt (def)] Format of reporting and output.
-h, --help : Display this message
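
As an illustration of how the numeric options constrain one another, the
 following Python sketch checks a set of option values before a run. The
 defaults are copied from the list above; the function name and the
 minimum/maximum and 0 to 100 checks are assumptions, since the checks
 performed by the script itself are not stated here.

```python
DEFAULTS = {
    "MINTAGMELT": 58, "MAXTAGMELT": 60, "MINPERDIFF": 33,
    "MINTAGLEN": 19, "MAXTAGLEN": 28, "MINAMPLEN": 200, "MAXAMPLEN": 500,
    "MAXAMPOLAP": 25, "MINORFLEN": 501, "FIVEPRIMESTART": 101,
    "MINRSCUVAL": 0.06,
}


def check_options(overrides=None):
    """Return a list of problems found in the combined option values."""
    opts = dict(DEFAULTS)
    opts.update(overrides or {})
    problems = []
    # Documented rule: tag lengths must be a multiple of 3, plus 1.
    for name in ("MINTAGLEN", "MAXTAGLEN"):
        if opts[name] % 3 != 1:
            problems.append(f"{name} must be a multiple of 3, plus 1")
    # Assumed rule: each minimum must not exceed its matching maximum.
    for low, high in (("MINTAGLEN", "MAXTAGLEN"),
                      ("MINTAGMELT", "MAXTAGMELT"),
                      ("MINAMPLEN", "MAXAMPLEN")):
        if opts[low] > opts[high]:
            problems.append(f"{low} must not exceed {high}")
    # Assumed rule: percentages lie between 0 and 100.
    for name in ("MINPERDIFF", "MAXAMPOLAP"):
        if not 0 <= opts[name] <= 100:
            problems.append(f"{name} must be between 0 and 100")
    return problems
```

Calling check_options() with no overrides returns an empty list, since the
 documented defaults satisfy all of these checks.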