NAME

ali2phylip.pl - Convert (and filter) ALI files to PHYLIP files for tree building

VERSION

version 0.242020

USAGE

ali2phylip.pl <infiles> [optional arguments]

REQUIRED ARGUMENTS

<infiles>

Path to input ALI files [repeatable argument].
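
A typical invocation might look like the sketch below. The file names gene1.ali and gene2.ali are hypothetical, and the run helper is not part of ali2phylip.pl: it only prints the command when the program cannot be found through $PATH, so the snippet can be pasted as is.

```shell
# Hypothetical helper: run a command if it is installed, otherwise print it.
run() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# gene1.ali and gene2.ali are placeholder names for your own ALI files.
# With no options, only shared gaps are dropped (the default).
run ali2phylip.pl gene1.ali gene2.ali

# Same conversion, with sequence ids mapped to short names.
run ali2phylip.pl gene1.ali gene2.ali --map-ids
```

Once ali2phylip.pl is installed, the helper can be dropped and the two commands typed directly.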

OPTIONAL ARGUMENTS

--test-out=<file>

Path to main outfile collecting statistics for all infiles [default: none]. When specified, the script does not produce any outfile but instead reports statistics in a tabular output suitable for further analysis in R. This option is useful for evaluating the effects of various parameter settings. It overrides all options pertaining to outfiles.
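
Since --test-out is meant for comparing parameter settings, one way to use it is a small loop over candidate values, sketched below. The input name gene1.ali, the --max values and the stats-max-*.txt output names are all hypothetical. The loop only prints the commands (a dry run); removing the echo would execute them.

```shell
# Print one ali2phylip.pl command per candidate --max value (dry run).
sweep_max() {
    for max in 0 0.1 0.2; do
        # Remove "echo" to actually run the command.
        echo ali2phylip.pl gene1.ali --max="$max" --test-out="stats-max-$max.txt"
    done
}

sweep_max
```

Writing each setting to its own statistics file keeps the results separate, so they can be loaded and compared afterwards.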

--from-scafos

Consider the input ALI file as generated by SCaFoS [default: no]. Currently, specifying this option converts all ambiguous and missing character states to gaps.

--max[-res-drop-site]=<n>

Maximum number of non-gap character states that a site may contain and still be dropped [default: 0]. When specified as a fraction between 0 and 1, it is interpreted relative to the number of sequences in the ALI. By default, only shared gaps are dropped. To completely disable deletion of shared gaps, use -1.
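
As a worked illustration of the fractional form, the sketch below converts a --max value into a number of non-gap states for a hypothetical ALI of 40 sequences. The helper name max_to_count is made up for this example. How ali2phylip.pl itself rounds non-integer products, or treats the exact values 0 and 1, is not stated here, so the numbers are chosen to avoid those cases.

```shell
# Hypothetical helper: interpret a --max value for an ALI of n sequences.
# Values strictly between 0 and 1 are taken as a fraction of n (assumption:
# the boundary values 0 and 1 are treated as plain counts).
max_to_count() {
    awk -v m="$1" -v n="$2" 'BEGIN { print ((m > 0 && m < 1) ? m * n : m) }'
}

max_to_count 0.25 40   # fraction: 0.25 * 40 sequences
max_to_count 3 40      # absolute count, used as is
```

With these inputs the two calls print 10 and 3: a site with up to 10 non-gap states would be dropped in the first case, and up to 3 in the second.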

--gb-mask=<mode>

Stringency of the Gblocks mask to be applied [default: none]. The following modes are available: strict, medium and loose. This option requires a Gblocks executable in the $PATH.

--bmge-mask=<mode>

Stringency of the BMGE mask to be applied [default: none]. The following modes are available: strict, medium and loose. This option requires a bmge.sh script in the $PATH.
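
Because both masking options depend on external programs being reachable through $PATH, a quick check before launching a long run can save time. The sketch below assumes the Gblocks executable is installed under the name Gblocks; the function name check_tools is made up for this example.

```shell
# Report whether each named program can be found through $PATH.
# Returns 0 when all are found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# "Gblocks" is an assumed executable name; bmge.sh is the wrapper script
# required by --bmge-mask.
check_tools Gblocks bmge.sh || echo "install the missing tools before masking"
```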

The bmge.sh script should be as follows:

    #!/bin/sh
    java -jar path-to-bmge/BMGE.jar -i "$1" -t "$2" -h "$3" -g "$4" -oh "$5"

--pars-mask

Apply a simple parsimony mask where only parsimony-informative sites are retained [default: no].
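
For readers unfamiliar with the term, a site is usually called parsimony-informative when at least two different character states each occur in at least two sequences. The sketch below applies that standard definition to a toy alignment; it is not the script's own code. The function name informative_sites is made up, and treating '-' as the gap character is an assumption about the input.

```shell
# Read one aligned sequence per line; print the 1-based positions of
# parsimony-informative sites, one per line.
informative_sites() {
    awk '
    {
        if (length($0) > len) len = length($0)
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (c == "-") continue      # assumption: "-" marks a gap
            count[i, c]++               # occurrences of state c at site i
            states[c] = 1
        }
    }
    END {
        for (i = 1; i <= len; i++) {
            shared = 0                  # states seen in two or more sequences
            for (c in states)
                if ((i, c) in count && count[i, c] >= 2) shared++
            if (shared >= 2) print i
        }
    }'
}

printf 'AAGT\nAAGT\nATCT\nATCA\n' | informative_sites
```

For this toy alignment, sites 2 and 3 qualify: site 1 is constant, and site 4 has only one state present in two or more sequences.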

--min[-res-seq]=<n>

Minimum number of known character states that a sequence must contain to be exported [default: 0]. When specified as a fraction between 0 and 1, it is interpreted relative to the length of the longest ungapped sequence of the ALI. Note that this optional filtering step takes place after the optional Gblocks-based masking step.

--del-const

Delete constant sites, as the -dc option of PhyloBayes does [default: no].

--map-ids

Sequence id mapping switch [default: no]. When specified, sequence ids are renamed to 'seqN' and IDM files are created. Enabling this option is highly recommended when exporting to PHYLIP.

--p80

Output file in P80 format instead of PHYLIP [default: no]. This option is useful for keeping full-length ids without mapping.

--ali

Output file in ALI format instead of PHYLIP [default: no]. This option is useful for generating filtered ALI files usable by SCaFoS.

--version
--usage
--help
--man

Print the usual program information.

AUTHOR

Denis BAURAIN <denis.baurain@uliege.be>

CONTRIBUTORS

  • Arnaud DI FRANCO <arnaud.difranco@gmail.com>

  • Raphael LEONARD <rleonard@doct.uliege.be>

COPYRIGHT AND LICENSE

This software is copyright (c) 2013 by University of Liege / Unit of Eukaryotic Phylogenomics / Denis BAURAIN.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.