NAME

Bio::Draw::FeatureStack - BioPerl module to generate GD images of stacked gene models

SYNOPSIS

  use Bio::DB::SeqFeature::Store;
  use Bio::Draw::FeatureStack;
 
  # load GFF3-compliant features from GFF file 
  # features could be obtained from/with any other source/methods as well...
  #---
  my @features;
  my $store = Bio::DB::SeqFeature::Store->new
  (
    -adaptor => 'memory',
    -dsn => 'my_gff_file.gff3' 
  );
  push(@features, $store->features(-name => 'gene1', -aliases => 1));
  push(@features, $store->features(-name => 'gene2', -aliases => 1));

  # create FeatureStack, passing features as array-ref
  #---
  my $feature_stack = Bio::Draw::FeatureStack->new
  (
    -features => \@features,    # array-ref of features to be rendered
    -glyph => 'gene',           # features will be rendered using this BioPerl glyph
    -flip_minus => 1,           # flip features on reverse strand (default is on)
    -ignore_utr => 1,           # do not show UTRs (default is off)
    -panel_params => {          # Bio::Graphics::Panel parameters
      -width => 1024,          
      -pad_left => 80,
      -pad_right => 20,
      -grid => 1
    },
    -glyph_params => {          # glyph-specific parameters (Bio::Graphics::Glyph::gene in this case)
      -utr_color   => 'white',
      -label_position => 'left',
      -label_transcripts => 1,
      -description => 1
    }
  );

  # output SVG, including HTML image map
  #---
  my ($svg, $map) = $feature_stack->svg(-image_map => 1);
	
  # output PNG
  #---
  my $png = $feature_stack->png;
	

DESCRIPTION

FeatureStack creates GD images of vertically stacked gene models to facilitate visual comparison of gene structures. The compared genes can be clusters of orthologous genes, gene family members, or any other genes of interest. FeatureStack takes an array of BioPerl feature objects as input, projects them onto a common coordinate space, flips features on the negative strand (optional), left-aligns them by start coordinate (optional), sets a fixed intron size (optional), removes unwanted transcripts (optional), and then draws the transformed features with a user-specified glyph. Internally, this transformation is achieved by cloning all input features into Bio::Graphics::Feature objects before they are rendered by the specified glyph. Output images can be generated in SVG (scalable vector graphics) or PNG (raster image) format.
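
The projection onto a common coordinate space can be pictured with a small, self-contained sketch. It is not FeatureStack's actual implementation: the helper to_common_space and the plain-hash feature layout are hypothetical and only illustrate how flipping a reverse-strand feature and shifting it to position 1 might work.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of FeatureStack): map the exons of one
# feature into a shared coordinate space that starts at position 1.
# $feature is a plain hash: { strand => 1 or -1, exons => [ [start, end], ... ] }
sub to_common_space {
    my ($feature, $flip_minus) = @_;
    my @exons = sort { $a->[0] <=> $b->[0] } @{ $feature->{exons} };
    my $min   = $exons[0][0];
    my ($max) = sort { $b <=> $a } map { $_->[1] } @exons;

    my @out;
    for my $exon (@exons) {
        my ($start, $end) = @$exon;
        if ($flip_minus && $feature->{strand} < 0) {
            # Mirror around the feature so that the 5' end lands on the left.
            push @out, [ $max - $end + 1, $max - $start + 1 ];
        }
        else {
            # Shift so that the leftmost base becomes position 1.
            push @out, [ $start - $min + 1, $end - $min + 1 ];
        }
    }
    return [ sort { $a->[0] <=> $b->[0] } @out ];
}

# Two made-up features, one per strand.
my $plus  = { strand =>  1, exons => [ [1000, 1099], [1300, 1500] ] };
my $minus = { strand => -1, exons => [ [5000, 5200], [5400, 5499] ] };

for my $feature ($plus, $minus) {
    my $exons = to_common_space($feature, 1);
    print join(' ', map { "$_->[0]-$_->[1]" } @$exons), "\n";
}
```

After the transformation both features start at position 1, and the first (5') exon of the reverse-strand feature appears on the left, which is the effect described for the -flip_minus option.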

FeatureStack was designed to give the user maximum control over the rendering process. The user can not only control how FeatureStack behaves through the FeatureStack parameters described below, but can also provide panel- and glyph-specific parameters to fine-tune all aspects of the rendered image.

Although FeatureStack can be used with any glyph, it is particularly useful in combination with the Bio::Graphics::Glyph::decorated_gene glyph. This glyph is currently not distributed with BioPerl, but should install together with FeatureStack. Bio::Graphics::Glyph::decorated_gene can also be obtained and used independently of FeatureStack via CPAN. The decorated_gene glyph allows highlighting protein motifs such as signal peptides, transmembrane domains, or protein domains on top of gene models, which greatly facilitates the comparison of gene structures. Please refer to the documentation of Bio::Graphics::Glyph::decorated_gene for more details. If protein decorations are associated with gene features in the input data, FeatureStack can also automatically align gene models by a user-defined decoration type, so that, for example, gene models are aligned by a particularly well-conserved protein motif.

FeatureStack requires GFF3-compliant features. That is, features provided to FeatureStack need to have either a two-tier 'mRNA'->'CDS' or a three-tier 'gene'->'mRNA'->'CDS' structure. Here is an example gene structure in GFF3 format that is compatible with FeatureStack:

MAL10  test  gene  1596486  1597604  .  +  .  ID=PF10_0392;Name=PF10_0392
MAL10  test  mRNA  1596486  1597604  .  +  .  ID=rna_PF10_0392-1;Name=PF10_0392-1;Parent=PF10_0392
MAL10  test  CDS   1596486  1596554  .  +  .  ID=cds_PF10_0392-1;Parent=rna_PF10_0392-1
MAL10  test  CDS   1596747  1597604  .  +  .  ID=cds_PF10_0392-2;Parent=rna_PF10_0392-1

FeatureStack can display multiple transcripts (isoforms) per gene if the specified glyph supports this as well (for example the 'gene' or the 'decorated_gene' glyph).

In addition to drawing a set of gene models on top of each other, FeatureStack can intermingle gene models with alternative tracks that display additional features associated with these genes. This can be used, for example, to display regulatory elements or sequence variants (SNPs, indels) alongside gene models. There is currently no restriction on how these alternative features are displayed, and any BioPerl glyph can be used for this purpose. In the input data, alternative features must be specified one level below the gene or transcript feature that is passed to FeatureStack. Here is an example GFF that shows how a regulatory motif (associated with the gene) and a SNP (associated with a transcript) can be specified:

CHR_I  test  gene      5100769  5101677  .  +  .  ID=Gene:Y110A7A.20;Name=ift-20
CHR_I  test  promoter  5100709  5100722  .  +  .  ID=Promoter:Y110A7A.20;Note=GTCTCTATAGCAAC;Parent=Gene:Y110A7A.20
CHR_I  test  mRNA      5100769  5101677  .  +  .  ID=Transcript:Y110A7A.20;Parent=Gene:Y110A7A.20
CHR_I  test  SNP       5100888  5100888  .  +  .  ID=SNP123456;Parent=Transcript:Y110A7A.20;Note=C>T
CHR_I  test  CDS       5100769  5101423  .  +  .  ID=CDS:Y110A7A.20:1;Parent=Transcript:Y110A7A.20
CHR_I  test  CDS       5101468  5101677  .  +  .  ID=CDS:Y110A7A.20:2;Parent=Transcript:Y110A7A.20

OPTIONS

 Option          Description                                              Default
 ------          -----------                                              -------

 -features                                                                none
 
                 Array reference (mandatory). BioPerl features to be 
                 displayed. Currently, features can be either of type 
                 'mRNA' or 'gene'. 
                 
 -glyph                                                                   'generic'

                 String (optional). Name of glyph to be used to render 
                 features. The glyph specified here should be suitable 
                 for rendering the provided features (e.g., use 
                 'processed_transcript' glyph for features of type 'mRNA' 
                 and 'gene' glyph for features of type 'gene'). The 
                 'decorated_gene' or 'decorated_transcript' glyph 
                 can also be used for highlighting protein features on 
                 top of gene models (see description above). 
                 
                 If no glyph is specified, the 'generic' glyph will 
                 be used.
                 
 -glyph_params                                                            none

                 Hash reference (optional). Glyph-specific parameters. 
                 Will be passed unmodified to the glyph. Parameters 
                 can include callback functions for fine-grained control 
                 of the rendering process. Please refer to the
                 documentation of the glyph for a description of which
                 glyph parameters are available. 

 -panel_params                                                            none

                 Hash reference (optional). Panel parameters. Will be 
                 passed unmodified to the L<Bio::Graphics::Panel> instance 
                 that is internally created by FeatureStack.  

                 Typical parameters here include -width, -pad_left, 
                 -pad_right, or -grid (see L<Bio::Graphics::Panel> for
                 more information).

 -ignore_utr                                                              false
 
                 Boolean (optional). If true, gene models will be drawn
                 without untranslated regions (UTRs).
                 
 -flip_minus                                                              true
 
                 Boolean (optional). By default, features on the negative
                 (reverse) strand are drawn flipped, such that the 
                 5' end of features is always on the left side. This 
                 behaviour can be turned off by setting this parameter to
                 0 (false).

 -intron_size                                                             undef
 
                 Integer (optional). Intron size in base-pairs. If specified, 
                 introns of gene models will be transformed to have 
                 this specified size. This is useful when comparing gene 
                 models of vastly different sizes due to very large
                 introns (for example, when comparing protist genes with human 
                 genes). By default, gene models are drawn to scale with
                 original intron sizes. This parameter does not affect
                 the length of exons, which are always drawn to scale.
                 
 -feature_offsets                                                         undef
 
                 Hash reference or string (optional). This parameter allows 
                 you to control the horizontal alignment of features. By
                 default, all features are left-aligned by their start
                 coordinate.  
                 
                 If a hash reference is specified here, it is assumed that
                 keys correspond to feature IDs and values to offsets in bp. 
                 This way the alignment of individual features can be 
                 manually fine-controlled. 
                 
                 If 'start_codon' is specified, features will be aligned
                 by their smallest CDS coordinate, assuming that this
                 will be the translation start site.
                 
                 Any other value here will be interpreted as the name of
                 a protein decoration. In this case, FeatureStack will
                 attempt to use L<Bio::Graphics::Glyph::decorated_transcript>
                 to map this protein decoration to nucleotide space and 
                 will then left-align the feature by this mapped 
                 coordinate. This way, features can for example be 
                 automatically aligned by their most conserved protein 
                 domain. If no protein decoration with this name is found
                 for a feature, then this feature will not be aligned.
                 Please refer to the documentation of the 
                 decorated_transcript glyph to see how protein decorations
                 can be specified for transcripts.

 -transcripts_to_skip                                                     none

                 Array reference (optional). Contains transcript IDs not to
                 be included in the output image. This parameter can be used
                 if a gene feature passed to FeatureStack has multiple 
                 isoforms but only a subset of these isoforms should appear
                 in the output.
 
 -alt_feature_type                                                        none

                 String (optional). Type and source of alternative features 
                 (e.g., 'SNP:mpileup') to be drawn alongside gene models. 
                 FeatureStack looks for features of this type/source one
                 level below the specified gene/transcript feature. If found, 
                 alternative features are drawn in a separate track above 
                 the gene track. The appearance of alternative features 
                 can be controlled using the -alt_glyph and -alt_glyph_params 
                 parameters.
                 
                 FeatureStack will automatically compute the distance of
                 alternative features (in bp) to the associated main feature's 
                 start coordinate and add this distance as a feature tag
                 (tag name 'start_dist'). This tag can later be read 
                 by the glyph that displays alternative features. 
                 This can e.g. be useful for labeling regulatory features 
                 with their distance from the transcription start site 
                 (UTRs visible) or from the translation start site 
                 (UTRs ignored).
                 
 -alt_glyph                                                               none
  
                 String (optional). Name of glyph to be used to draw 
                 alternative features specified with -alt_feature_type.

 -alt_glyph_params                                                        none

                 Hash reference (optional). Glyph-specific parameters for 
                 glyph specified with -alt_glyph. Parameters will be passed 
                 unmodified to the glyph. Parameters can include callback 
                 functions for fine-grained control of the rendering process. 

 -ruler                                                                   true

                 Boolean (optional). If true, a ruler indicating distances
                 in base-pairs will be drawn on top of the image. The ruler
                 will automatically adjust to feature offsets; that is,
                 the origin of the ruler will be placed at the
                 point where features are aligned, showing negative 
                 coordinates left of this point and positive coordinates 
                 right of this point. 

 -span                                                                    [auto]

                 Integer (optional). Span of the output image in bp. By 
                 default, the span is the length of the longest feature. 
                 If one wants to generate an image that shows only the 
                 5' portion of features (for example to visualize only 
                 the first exon of genes and their associated promoters), 
                 one can set a smaller, fixed value here, effectively 
                 clipping the right part of the image at this coordinate.

 -separator                                                               false

                 Boolean (optional). If true, a horizontal line is drawn
                 between gene models. This can be useful when alternative
                 tracks are shown, because it makes clear which alternative
                 track belongs to which gene model track. 
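
The effect of the -intron_size option can be pictured with a short, self-contained sketch. It is an assumption-laden illustration rather than FeatureStack's actual code: the helper fix_intron_size is hypothetical, and it simply places exons one after another with a fixed gap, leaving exon lengths unchanged. The coordinates are the two CDS segments of the PF10_0392 example in the DESCRIPTION.

```perl
use strict;
use warnings;

# Hypothetical helper (not FeatureStack's actual code): rewrite exon
# coordinates so that every intron is exactly $intron_size bp long.
# Exon lengths are preserved; output coordinates are relative and start at 1.
sub fix_intron_size {
    my ($exons, $intron_size) = @_;
    my @sorted = sort { $a->[0] <=> $b->[0] } @$exons;
    my @out;
    my $cursor = 1;    # next free position in the transformed space
    for my $exon (@sorted) {
        my $length = $exon->[1] - $exon->[0] + 1;
        push @out, [ $cursor, $cursor + $length - 1 ];
        $cursor += $length + $intron_size;
    }
    return \@out;
}

# CDS segments of PF10_0392; the original intron between them is 192 bp.
my $cds   = [ [1596486, 1596554], [1596747, 1597604] ];
my $fixed = fix_intron_size($cds, 50);
print join(' ', map { "$_->[0]-$_->[1]" } @$fixed), "\n";
```

With an intron size of 50, the 69 bp and 858 bp segments keep their lengths while the gap between them shrinks from 192 bp to 50 bp.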
                 

EXPORT

None by default.

BUGS

Please report all errors.

SEE ALSO

Bio::Graphics::Panel, Bio::Graphics::Glyph, Bio::Graphics::Glyph::gene, Bio::Graphics::Glyph::processed_transcript, Bio::Graphics::Glyph::decorated_gene, Bio::Graphics::Glyph::decorated_transcript, Bio::DB::SeqFeature::Store

It is recommended to study the test cases shipped with this module for additional information on how to use it.

AUTHOR

Christian Frech <frech.christian@gmail.com>

COPYRIGHT AND LICENSE

Copyright (C) 2012 by Christian Frech

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.