Changes for version 0.243430 - 2024-12-08

  • Additions
    • binaries: new script tseq2ali.pl (for TinySeq files)
    • binaries: idealize.pl and ali2phylip.pl are now codon-compatible
    • Ali: new method load_tinyseq to import NCBI TinySeq XML files
    • Seq: now silently converts '!' (used as a frameshift symbol) to 'x'
    • SeqMask: new method codon_mask to allow all masks to preserve codons
  • Changes
    • Taxonomy: now silently handles GCF numbers as GCA numbers to save space
  • Fixes
    • Ali::Stash: fixed missing method filename
    • Taxonomy: restored support for GTDB setup (following remote changes)

Documentation

Convert ALI files to FASTA files
Convert (and filter) ALI files to PHYLIP files for tree building
Append seq lengths to ids in ALI files (as in SCaFoS)
Abbreviate or restore the org component of full seq ids in ALI files
Classify ALI files based on taxonomic filters
Download formatted trees from iTOL
Extract sequences from a FASTA database file based on id lists
Convert FASTA files to ALI files
Fetch (and format) information from the NCBI Taxonomy database
Format (and annotate) trees for printing
Discard (nearly) gap-only sites from ALI files
Upload trees and associated metadata files to iTOL
Abbreviate seq ids in FASTA files (optimized)
Discard low-quality nt seqs in FASTA files (optimized)
Split sequences of FASTA files into shorter sequences (optimized)
Apply a taxonomic filter to a (UniProt) FASTA database (optimized)
Jackknife a directory of ALI files
Build final id mapper from id list using the NCBI Taxonomy database
Mask an ALI file according to BLOCKS file(s)
Convert PHYLIP files to ALI files
Prune sequences from ALI files based on id lists
Prune tips from TREE files based on id lists
Change (restore) full seq ids in ALI files
Set up a local mirror of the NCBI Taxonomy (or GTDB) database
Extract individual gene ALIs from a SCaFoS supermatrix
Split ALI files into subsets of sites based on site-wise statistics
Convert STOCKHOLM files to ALI files
Subsample forest (multiple trees) files (and restore ids)
Build an id mapper from a tabular file giving annotation strings
Apply a taxonomic filter to ALI files
Mask ALI files based on taxonomic filters
Generate id lists from tree tips
Convert trees to TPL files
Convert NCBI TinySeq XML files to ALI files

Modules

Core classes and utilities for Bio::MUST
Multiple sequence alignment
Thin wrapper for an indexed Ali read from disk
Thin wrapper for a temporary mapped Ali written on disk
Distribution-wide constants for Bio::MUST::Core
Genetic code for conceptual translation
Genetic code factory based on NCBI gc.prt file
Id list for selecting specific sequences
Id mapper for translating sequence ids
Posterior predictive tests for sequences
Posterior predictive test for compositional bias
Aliable Moose role (pure interface) for Ali-like objects
Commentable Moose role for storable objects
Filterable Moose role for objects that behave as filters
Listable Moose role for objects with implied id lists
Taxable Moose role for objects that query a taxonomy
Nucleotide or protein sequence
Modern and legacy MUST-compliant sequence id
Helper class for filtering seqs according to SeqId components
Sequence mask for selecting specific sites
Arbitrary frequencies for sequence sites
Posterior mean site frequencies (PMSF) for sequence sites
Evolutionary profiles for sequence sites
Evolutionary rates for sequence sites
NCBI Taxonomy one-stop shop
Helper class for multiple-criterion classifier based on taxonomy
Helper class for multiple-criterion classifier based on taxonomy
Helper class providing color scheme for taxonomic annotations
Helper class for multiple-criterion classifier based on taxonomy
Helper class for filtering seqs according to taxonomy
Helper class for simple labeler based on taxonomy
Wrapper class for serializing Bio::LITE::Taxonomy::NCBI objects
Thin wrapper around Bio::Phylo trees
Collection of (bootstrap) trees
Tree splits (bipartitions)
Distribution-wide Moose types for Bio::MUST::Core
Utility functions for enabling multiple file processing