
LICENSE

Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

CONTACT

  Please email comments or questions to the public Ensembl
  developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.

  Questions may also be sent to the Ensembl help desk at
  <http://www.ensembl.org/Help/Contact>.

NAME

Bio::EnsEMBL::IdMapping::SyntenyFramework - framework representing syntenic regions across the genome

SYNOPSIS

  # build the SyntenyFramework from unambiguous gene mappings
  my $sf = Bio::EnsEMBL::IdMapping::SyntenyFramework->new(
    -DUMP_PATH  => $dump_path,
    -CACHE_FILE => 'synteny_framework.ser',
    -LOGGER     => $self->logger,
    -CONF       => $self->conf,
    -CACHE      => $self->cache,
  );
  $sf->build_synteny($gene_mappings);

  # use it to rescore the genes
  $gene_scores = $sf->rescore_gene_matrix_lsf($gene_scores);

DESCRIPTION

The SyntenyFramework is a set of SyntenyRegions. These are pairs of locations, closely analogous to the information in the assembly table (although the two locations don't have to be the same length). They are built from genes that map uniquely between source and target.

Once built, the SyntenyFramework is used to score source and target gene pairs to determine whether they are similar. This process is slow (it involves testing all gene pairs against all SyntenyRegions), so this module has built-in support for running it in parallel via LSF.
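
To illustrate why every gene pair has to be tested against every region, here is a minimal sketch in plain Perl. It does not use the Ensembl classes: the hash layout, the helper names overlaps() and matching_regions(), and the matching criterion itself are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a SyntenyRegion: a pair of locations that need
# not have the same length.
my @regions = (
    { source => { seq => 'chr1', start => 100, end => 500 },
      target => { seq => 'chr1', start => 150, end => 700 } },
    { source => { seq => 'chr2', start => 1,   end => 300 },
      target => { seq => 'chr5', start => 900, end => 1100 } },
);

# True if a feature lies on the same sequence as a location and overlaps it.
sub overlaps {
    my ($feature, $loc) = @_;
    return 0 unless $feature->{seq} eq $loc->{seq};
    return ($feature->{start} <= $loc->{end}
         && $feature->{end}   >= $loc->{start}) ? 1 : 0;
}

# Assumed criterion: a gene pair is consistent with a region when the source
# gene overlaps the source location and the target gene overlaps the target
# location. Every region has to be visited for every gene pair.
sub matching_regions {
    my ($source_gene, $target_gene, $regions) = @_;
    my @hits;
    for my $region (@$regions) {
        push @hits, $region
            if overlaps($source_gene, $region->{source})
            && overlaps($target_gene, $region->{target});
    }
    return @hits;
}

my @hits = matching_regions(
    { seq => 'chr1', start => 200, end => 250 },
    { seq => 'chr1', start => 600, end => 650 },
    \@regions,
);
printf "%d matching region(s)\n", scalar @hits;
```

With G gene pairs and R regions this loop runs G x R times, which is the cost that motivates the parallel LSF variant.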

METHODS

  new
  build_synteny
  _by_overlap
  add_SyntenyRegion
  get_all_SyntenyRegions
  rescore_gene_matrix_lsf
  rescore_gene_matrix
  logger
  conf
  cache

new

  Arg [LOGGER]     : Bio::EnsEMBL::Utils::Logger $logger - a logger object
  Arg [CONF]       : Bio::EnsEMBL::Utils::ConfParser $conf - a configuration object
  Arg [CACHE]      : Bio::EnsEMBL::IdMapping::Cache $cache - a cache object
  Arg [DUMP_PATH]  : String - path for object serialisation
  Arg [CACHE_FILE] : String - filename of serialised object
  Example     : my $sf = Bio::EnsEMBL::IdMapping::SyntenyFramework->new(
                  -DUMP_PATH    => $dump_path,
                  -CACHE_FILE   => 'synteny_framework.ser',
                  -LOGGER       => $self->logger,
                  -CONF         => $self->conf,
                  -CACHE        => $self->cache,
                );
  Description : Constructor.
  Return type : Bio::EnsEMBL::IdMapping::SyntenyFramework
  Exceptions  : thrown on wrong or missing arguments
  Caller      : InternalIdMapper plugins
  Status      : At Risk
              : under development

build_synteny

  Arg[1]      : Bio::EnsEMBL::IdMapping::MappingList $mappings - gene mappings
                to build the SyntenyFramework from
  Example     : $synteny_framework->build_synteny($gene_mappings);
  Description : Builds the SyntenyFramework from unambiguous gene mappings.
                SyntenyRegions are allowed to overlap. At most two overlapping
                SyntenyRegions are merged (otherwise we'd get too large
                SyntenyRegions with little information content).
  Return type : none
  Exceptions  : thrown on wrong or missing argument
  Caller      : InternalIdMapper plugins
  Status      : At Risk
              : under development
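
The "at most two" merging rule can be sketched as follows. This is one possible reading of the description, not the module's implementation: regions are reduced to [start, end] intervals on a single sequence, and the helper name merge_pairs() is hypothetical.

```perl
use strict;
use warnings;

# Simplified stand-in for a SyntenyRegion: [start, end] on one sequence.
# Hypothetical helper: merge each region with at most one overlapping
# neighbour, so that no merged region is built from more than two inputs.
sub merge_pairs {
    my @sorted = sort { $a->[0] <=> $b->[0] } @_;
    my @result;
    while (my $cur = shift @sorted) {
        my $next = $sorted[0];
        if ($next && $next->[0] <= $cur->[1]) {
            # Overlap: merge the pair and consume the neighbour.
            shift @sorted;
            my $end = $cur->[1] > $next->[1] ? $cur->[1] : $next->[1];
            push @result, [ $cur->[0], $end ];
        }
        else {
            # No overlap with the next region: keep it unchanged.
            push @result, $cur;
        }
    }
    return @result;
}

my @merged = merge_pairs([1, 10], [5, 20], [15, 30], [40, 50]);
print join(", ", map { join('-', @$_) } @merged), "\n";
```

In this sketch a merged region is never merged again, so the output can still contain overlapping regions, which is consistent with the description above.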

add_SyntenyRegion

  Arg[1]      : Bio::EnsEMBL::IdMapping::SyntenyRegion - SyntenyRegion to add
  Example     : $synteny_framework->add_SyntenyRegion($synteny_region);
  Description : Adds a SyntenyRegion to the framework. For speed reasons (and
                since this is an internal method), no argument check is done.
  Return type : none
  Exceptions  : none
  Caller      : internal
  Status      : At Risk
              : under development

get_all_SyntenyRegions

  Example     : foreach my $sr (@{ $sf->get_all_SyntenyRegions }) {
                  # do something with the SyntenyRegion
                }
  Description : Get a list of all SyntenyRegions in the framework.
  Return type : Arrayref of Bio::EnsEMBL::IdMapping::SyntenyRegion
  Exceptions  : none
  Caller      : general
  Status      : At Risk
              : under development

rescore_gene_matrix_lsf

  Arg[1]      : Bio::EnsEMBL::IdMapping::ScoredMappingMatrix $matrix - gene
                scores to rescore
  Example     : my $new_scores = $sf->rescore_gene_matrix_lsf($gene_scores);
  Description : This method runs rescore_gene_matrix() (via the
                synteny_rescore.pl script) in parallel with LSF, then combines
                the results to return a single rescored scoring matrix.
                Parallelisation is done by chunking the scoring matrix into
                several pieces (determined by the --synteny_rescore_jobs
                configuration option).
  Return type : Bio::EnsEMBL::IdMapping::ScoredMappingMatrix
  Exceptions  : thrown on wrong or missing argument
                thrown on filesystem I/O error
                thrown on failure of one or more LSF jobs
  Caller      : InternalIdMapper plugins
  Status      : At Risk
              : under development
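
The chunking step can be pictured with the following minimal sketch. It assumes the matrix entries are available as a plain list and that the job count comes from the synteny_rescore_jobs option; the helper name chunk_entries() is hypothetical and the real module may split the matrix differently.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hypothetical helper: split a list of matrix entries into at most
# $num_jobs chunks of roughly equal size.
sub chunk_entries {
    my ($entries, $num_jobs) = @_;
    return [] unless @$entries && $num_jobs > 0;

    my $chunk_size = ceil(scalar(@$entries) / $num_jobs);
    my @queue = @$entries;   # work on a copy, leave the input untouched
    my @chunks;
    while (@queue) {
        push @chunks, [ splice(@queue, 0, $chunk_size) ];
    }
    return \@chunks;
}

# Each chunk would be handed to one job; the results are combined afterwards.
my $chunks = chunk_entries([ 1 .. 10 ], 3);
printf "%d chunks\n", scalar @$chunks;
```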

rescore_gene_matrix

  Arg[1]      : Bio::EnsEMBL::IdMapping::ScoredMappingMatrix $matrix - gene
                scores to rescore
  Example     : my $new_scores = $sf->rescore_gene_matrix($gene_scores);
  Description : Rescores a gene matrix. Each new score retains 70% of the old
                score and takes the remaining 30% from the synteny match.
  Return type : Bio::EnsEMBL::IdMapping::ScoredMappingMatrix
  Exceptions  : thrown on wrong or missing argument
  Caller      : InternalIdMapper plugins
  Status      : At Risk
              : under development
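
The 70/30 weighting can be written out as a small worked example. This is an illustration only: the helper name blend_score() is hypothetical, and it assumes that both the old score and the synteny match are values between 0 and 1.

```perl
use strict;
use warnings;

# Hypothetical helper: keep 70% of the old score and take the remaining
# 30% from the synteny score. Both inputs are assumed to lie in 0..1.
sub blend_score {
    my ($old_score, $synteny_score) = @_;
    return 0.7 * $old_score + 0.3 * $synteny_score;
}

# 0.7 * 0.9 + 0.3 * 0.5 = 0.63 + 0.15 = 0.78
printf "%.2f\n", blend_score(0.9, 0.5);
```

Under this weighting a gene pair with no synteny support keeps at most 70% of its original score, while a well-supported pair is pulled towards its synteny score.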

logger

  Arg[1]      : (optional) Bio::EnsEMBL::Utils::Logger - the logger to set
  Example     : $object->logger->info("Starting ID mapping.\n");
  Description : Getter/setter for logger object
  Return type : Bio::EnsEMBL::Utils::Logger
  Exceptions  : none
  Caller      : constructor
  Status      : At Risk
              : under development

conf

  Arg[1]      : (optional) Bio::EnsEMBL::Utils::ConfParser - the configuration
                to set
  Example     : my $basedir = $object->conf->param('basedir');
  Description : Getter/setter for configuration object
  Return type : Bio::EnsEMBL::Utils::ConfParser
  Exceptions  : none
  Caller      : constructor
  Status      : At Risk
              : under development

cache

  Arg[1]      : (optional) Bio::EnsEMBL::IdMapping::Cache - the cache to set
  Example     : $object->cache->read_from_file('source');
  Description : Getter/setter for cache object
  Return type : Bio::EnsEMBL::IdMapping::Cache
  Exceptions  : none
  Caller      : constructor
  Status      : At Risk
              : under development