Changes for version 0.242020 - 2024-07-20

  • Additions
    • Filterable: new role to generalize filtering of SeqId objects beyond taxonomy
    • Listable: added method desc_seq_len_list
    • SeqId: added method instance for memoized object construction (to speed up Splits analysis)
    • SeqId::Filter: new class to implement method SeqId::family_filter
    • Tree: added method get_node_that_maximizes (e.g., to look for long branch lengths)
    • Tree: added method newick_str as a robust wrapper around Bio::Phylo to_newick method
    • Tree: finally added fully working method root_tree!
    • Tree::Splits: new class to handle splits and preserve node metadata after rooting
    • binaries: added automated rooting on taxon or family to format-tree.pl
  • Changes
    • Listable: refactored complete_seq_list method to optionally report seq lengths
    • Listable: refactored (and renamed) long_leaf_list method (to long_branch_list)
    • SeqId: now automatically (and silently) removes quotes around ids (especially in trees)
    • Taxonomy: reduced redundancy of Note messages about merged taxa
  • Fixes
    • IdList: fixed SeqId-related bug preventing seqs from being batch-extracted in some cases
    • binaries: improved robustness of classify-ali.pl and inst-abbr-ids.pl

Documentation

Convert ALI files to FASTA files
Convert (and filter) ALI files to PHYLIP files for tree building
Append seq lengths to ids in ALI files (as SCaFoS does)
Abbreviate or restore the org component of full seq ids in ALI files
Classify ALI files based on taxonomic filters
Download formatted trees from iTOL
Extract sequences from a FASTA database file based on id lists
Convert FASTA files to ALI files
Fetch (and format) information from the NCBI Taxonomy database
Format (and annotate) trees for printing
Discard (nearly) gap-only sites from ALI files
Upload trees and associated metadata files to iTOL
Abbreviate seq ids in FASTA files (optimized)
Discard low-quality nt seqs in FASTA files (optimized)
Split sequences of FASTA files into shorter sequences (optimized)
Apply a taxonomic filter to a (UniProt) FASTA database (optimized)
Jackknife a directory of ALI files
Build final id mapper from id list using the NCBI Taxonomy database
Mask an ALI file according to BLOCKS file(s)
Convert PHYLIP files to ALI files
Prune sequences from ALI files based on id lists
Prune tips from TREE files based on id lists
Change (restore) full seq ids in ALI files
Set up a local mirror of the NCBI Taxonomy (or GTDB) database
Extract individual gene ALIs from a SCaFoS supermatrix
Split ALI files into subsets of sites based on site-wise statistics
Convert STOCKHOLM files to ALI files
Subsample forest (multiple trees) files (and restore ids)
Build an id mapper from a tabular file giving annotation strings
Apply a taxonomic filter to ALI files
Mask ALI files based on taxonomic filters
Generate id lists from tree tips
Convert trees to TPL files

Modules

Core classes and utilities for Bio::MUST
Multiple sequence alignment
Thin wrapper for an indexed Ali read from disk
Thin wrapper for a temporary mapped Ali written on disk
Distribution-wide constants for Bio::MUST::Core
Genetic code for conceptual translation
Genetic code factory based on NCBI gc.prt file
Id list for selecting specific sequences
Id mapper for translating sequence ids
Posterior predictive tests for sequences
Posterior predictive test for compositional bias
Aliable Moose role (pure interface) for Ali-like objects
Commentable Moose role for storable objects
Filterable Moose role for objects that behave as filters
Listable Moose role for objects with implied id lists
Taxable Moose role for objects that query a taxonomy
Nucleotide or protein sequence
Modern and legacy MUST-compliant sequence id
Helper class for filtering seqs according to SeqId components
Sequence mask for selecting specific sites
Arbitrary frequencies for sequence sites
Posterior mean site frequencies (PMSF) for sequence sites
Evolutionary profiles for sequence sites
Evolutionary rates for sequence sites
NCBI Taxonomy one-stop shop
Helper class for multiple-criterion classifier based on taxonomy
Helper class for multiple-criterion classifier based on taxonomy
Helper class providing color scheme for taxonomic annotations
Helper class for multiple-criterion classifier based on taxonomy
Helper class for filtering seqs according to taxonomy
Helper class for simple labeler based on taxonomy
Wrapper class for serializing Bio::LITE::Taxonomy::NCBI object
Thin wrapper around Bio::Phylo trees
Collection of (bootstrap) trees
Tree splits (bipartitions)
Distribution-wide Moose types for Bio::MUST::Core
Utility functions for enabling multiple file processing