NAME
Bio::Das::Map - Resolve map coordinates
SYNOPSIS
use Bio::Das::Map 'print_location';
my $m = Bio::Das::Map->new('my_map');
$m->add_segment(['chr1',100,1000],['c1.1',1,901]);
$m->add_segment(['chr1',1001,2000],['c1.2',501,1500]);
$m->add_segment(['chr1',2001,4000],['c1.1',3000,4999]);
$m->add_segment(['c1.1',4000,4999],['c1.1.1',1,1000]);
my @abs_locations = $m->resolve('c1.1.1',500=>600);
print_location(@abs_locations);
for my $location (@abs_locations) {
    my @rel_locations = $m->project($location,'c1.1.1');
    print_location(@rel_locations);
    my @all_rel_locations = $m->sub_segments($location);
    print_location(@all_rel_locations);
}
DESCRIPTION
This module provides the infrastructure for handling relative coordinates in sequence annotations. You use it by creating a "map" that relates a set of sequence segment pairs. The segments are related in a parent/child relationship so that you can move "up" or "down" in the hierarchy. However, the exact meaning of the paired relationships is up to you; it can be chromosome->contig->clone, scaffold->supercontig->contig->read, or whatever you wish.
Once the map is created you can perform the following operations:
- resolution to absolute coordinates
Given a sequence segment somewhere in the map, the resolve() call will move up to the topmost level, translating the coordinates into absolute coordinates.
- directed projection to relative coordinates
Given a sequence segment somewhere in the map, the project() call will attempt to project the coordinate downward into the specified coordinate system, changing it into the corresponding relative coordinates.
- undirected projection
Given a sequence segment somewhere in the map, the sub_segments() call will return all possible coordinates relative to the children of this segment. The super_segments() call performs the opposite operation, moving upwards in the hierarchy.
Here is an example using ASCII art:
          100            1000           2000             4000
chr1      |---------------|--------------|----------------|
          .               ..             .                .
          1             901.            3000    4000    4999
c1.1      |---------------|.             .|-------|-------|
                           .             .        .       .
                          501          1500       .       .
c1.2                       |-------------|        .       .
                                                  .       .
                                                  1     1000
c1.1.1                                            |-------|
These relationships can be described with the following code fragment:
my $m = Bio::Das::Map->new('my_map');
$m->add_segment(['chr1',100,1000] => ['c1.1',1,901]);
$m->add_segment(['chr1',1001,2000] => ['c1.2',501,1500]);
$m->add_segment(['chr1',2001,4000] => ['c1.1',3000,4999]);
$m->add_segment(['c1.1',4000,4999] => ['c1.1.1',1,1000]);
A call to resolve() can now be used to transform a segment relative to "c1.1.1" into "chr1" coordinates:
my @chr1_coordinates = $m->resolve('c1.1.1',500=>600);
This will return the segment chr1:3500..3600.
Conversely a call to project() can be used to transform a segment relative to "chr1" into "c1.1.1" coordinates:
my @c1_1_1_coordinates = $m->project('chr1',3500=>3600,'c1.1.1');
As expected, this returns the segment c1.1.1:500..600.
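The arithmetic behind this round trip can be sketched in plain Perl. This is only an illustration of the offset calculation for ungapped, plus-strand alignments, not the module's actual implementation; the helpers child_to_parent() and resolve_range() and the @path structure are hypothetical names introduced here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Das::Map): translate one position
# from a child segment into its parent, assuming both segments are on the
# plus strand and the alignment is ungapped.
sub child_to_parent {
    my ($pos, $child, $parent) = @_;    # segments are [$seqid,$start,$end]
    return $parent->[1] + ($pos - $child->[1]);
}

# The two alignments that lead from c1.1.1 up to chr1 in the example map.
my @path = (
    { child => ['c1.1.1', 1,    1000], parent => ['c1.1', 4000, 4999] },
    { child => ['c1.1',   3000, 4999], parent => ['chr1', 2001, 4000] },
);

# Walk a start/end pair up through every alignment in @path.
sub resolve_range {
    my ($start, $end) = @_;
    for my $step (@path) {
        ($start, $end) = map { child_to_parent($_, $step->{child}, $step->{parent}) }
                         ($start, $end);
    }
    return ($start, $end);
}

my ($start, $end) = resolve_range(500, 600);
print "chr1:$start..$end\n";    # prints chr1:3500..3600
```

The intermediate step is c1.1:4499..4599; projecting downward applies the same offsets in reverse.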
METHODS
- $map = Bio::Das::Map->new('map_name')
Create a new Bio::Das::Map, optionally giving it the name "map_name".
- $name = $map->name(['new_name'])
Get or set the map name.
- $clip_flag = $map->clip([$new_clip_flag])
Get or set the "clip" flag. If the clip flag is set to a true value, then requests for operations on coordinate ranges that are outside the list of segments contained within the map will be clipped to that portion of the coordinate range within known segments. If the flag is false, then the coordinate mapping routines will perform linear extrapolation on those portions of the segments that are outside the map.
The default is false.
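The difference between the two settings can be illustrated with a small standalone sketch. It assumes an ungapped, plus-strand alignment, and the map_range() helper is a hypothetical name used for illustration only; it is not how the module itself is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: map a range from a child segment to its parent.
# With $clip true, the range is first trimmed to the child segment;
# with $clip false, the same offset is applied to out-of-range positions
# (linear extrapolation).
sub map_range {
    my ($start, $end, $child, $parent, $clip) = @_;
    if ($clip) {
        $start = $child->[1] if $start < $child->[1];
        $end   = $child->[2] if $end   > $child->[2];
        return if $start > $end;    # nothing left after clipping
    }
    my $offset = $parent->[1] - $child->[1];
    return ($start + $offset, $end + $offset);
}

my $child  = ['c1.1.1', 1,    1000];
my $parent = ['c1.1',   4000, 4999];

my @clipped      = map_range(900, 1100, $child, $parent, 1);
my @extrapolated = map_range(900, 1100, $child, $parent, 0);
print "clip on:  c1.1:$clipped[0]..$clipped[1]\n";            # c1.1:4899..4999
print "clip off: c1.1:$extrapolated[0]..$extrapolated[1]\n";  # c1.1:4899..5099
```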
- $map->add_segment($segment1 => $segment2)
Establish a parent/child relationship between $segment1 and $segment2. The two segments can be array references or Bio::LocationI objects. In the former case, the format of the array reference is:
[$coordinate_system_name,$start,$end [,$strand]]
$coordinate_system_name is any sequence ID. $start and $end are the usual BioPerl 1-based coordinates with $start <= $end. The $strand is one of +1, 0 or -1. If not provided, the strand is assumed to be +1. A strand of zero is equivalent to a strand of +1 for the coordinate calculations.
Bio::LocationI objects can be used for either or both of the segments. For example:
$map->add_segment(
    Bio::Location::Simple->new(-seq_id=>'chr1',-start=>4001,-end=>5000),
    Bio::Location::Simple->new(-seq_id=>'c1.3',-start=>10,-end=>1009)
);
You can think of this operation as adding an alignment between two sequences.
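The array-reference rules described above (start no greater than end, strand defaulting to +1) can be expressed as a short standalone check. This is a sketch of the documented conventions only; normalize_segment() is a hypothetical helper and not part of the module's interface.

```perl
use strict;
use warnings;

# Hypothetical helper: validate a segment array reference of the form
# [$seqid,$start,$end [,$strand]] and fill in the default strand.
sub normalize_segment {
    my ($segment) = @_;
    my ($seqid, $start, $end, $strand) = @$segment;
    die "start must be <= end\n" if $start > $end;
    $strand = +1 unless defined $strand;    # documented default
    die "strand must be +1, 0 or -1\n"
        unless grep { $strand == $_ } (1, 0, -1);
    return [$seqid, $start, $end, $strand];
}

my $seg = normalize_segment(['chr1', 100, 1000]);
print join(',', @$seg), "\n";    # prints chr1,100,1000,1
```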
- @abs_segments = $map->resolve($location)
Given a location, the resolve() method returns a list of corresponding absolute coordinates by recursively following all parent segments until it reaches a segment that has no parent. There may be several segments that satisfy this criterion, or none at all.
The argument is either a Bio::LocationI, or an array reference of the form [$seqid,$start,$end,$strand]. The returned list consists of a set of Bio::Location::Simple objects.
- @rel_segments = $map->project($location,$seqid)
Given a location and a sequence ID, the project() method attempts to project the location into the coordinates specified by $seqid. $location can be a Bio::LocationI object, or an array reference in the format described earlier. The method returns a list of zero or more Bio::Location::Simple objects.
- @subsegments = $map->sub_segments($location)
This method returns all Bio::LocationI segments that can be reached by following $location's children downward. $location is either a Bio::LocationI or an array reference. @subsegments are Bio::Location::Simple objects.
- @supersegments = $map->super_segments($location)
This method returns all Bio::LocationI segments that can be reached by following $location's parents upwards. $location is either a Bio::LocationI or an array reference. The return value is a list of zero or more Bio::Location::Simple objects.
- @allsegments = $map->expand_segments($location)
Returns all Bio::LocationI segments that are equivalent to the given location, including the original location itself. $location is either a Bio::LocationI or an array reference. The return value is a list of one or more Bio::Location::Simple objects.
- @segments = $map->lookup_segments($location)
Return a list of all segments that directly overlap the specified location (without traversing alignments). The location can be given as an array reference or a Bio::LocationI. As a special case, if you provide a single argument containing a sequence ID, the method will return all segments that use this sequence ID for their coordinate system.
The return value is a list of zero or more Bio::Location::Simple objects.
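What "directly overlap" means here can be sketched with a plain interval test over array-reference segments. The overlaps() helper and the @segments list are hypothetical and only illustrate the idea; the module's own lookup may be organized differently.

```perl
use strict;
use warnings;

# Hypothetical helper: true if two [$seqid,$start,$end] segments are on
# the same sequence and share at least one position.
sub overlaps {
    my ($x, $y) = @_;
    return $x->[0] eq $y->[0]
        && $x->[1] <= $y->[2]
        && $y->[1] <= $x->[2];
}

# Parent-side segments from the example map.
my @segments = (
    ['chr1', 100,  1000],
    ['chr1', 1001, 2000],
    ['chr1', 2001, 4000],
    ['c1.1', 4000, 4999],
);

# Segments overlapping chr1:900..1100
my @hits = grep { overlaps($_, ['chr1', 900, 1100]) } @segments;
print "$_->[0]:$_->[1]..$_->[2]\n" for @hits;    # chr1:100..1000, chr1:1001..2000

# Special case: every segment that uses a given sequence ID
my @on_chr1 = grep { $_->[0] eq 'chr1' } @segments;
print scalar(@on_chr1), " segments on chr1\n";   # 3 segments on chr1
```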
- print_location($location)
This is a utility function (not an object method) that, given a location, prints it to STDOUT in the format:
seqid:start..end (strand)
This function is not imported by default, but you can request that it be imported into the caller's namespace by calling:
use Bio::Das::Map 'print_location';
LIMITATIONS
Everything is done in memory with unsorted data structures, which means that large maps will have memory and/or performance problems.
AUTHOR
Lincoln Stein <lstein@cshl.org>.
Copyright (c) 2004 Cold Spring Harbor Laboratory
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See DISCLAIMER.txt for disclaimers of warranty.
SEE ALSO
Bio::Das::Request, Bio::Das::HTTP::Fetch, Bio::Das::Segment, Bio::Das::Type, Bio::Das::Stylesheet, Bio::Das::Source, Bio::RangeI