NAME
B::Graph - Perl compiler backend to produce graphs of OP trees
SYNOPSIS
perl -MO=Graph,-text prog.pl >graph.txt
perl -MO=Graph,-vcg prog.pl >graph.vcg
xvcg graph.vcg
perl -MO=Graph,-dot prog.pl | dot -Tps >graph.ps
DESCRIPTION
This module is a backend to the perl compiler (B::*) which, instead of outputting bytecode or C based on perl's compiled version of a program, writes descriptions in graph-description languages specifying graphs that show the program's structure. It currently generates descriptions for the VCG tool (http://www.cs.uni-sb.de/RW/users/sander/html/gsvcg1.html) and Dot (part of the graph visualization toolkit from AT&T: http://www.research.att.com/sw/tools/graphviz/). It can also produce plain text output (which is more useful for debugging the module itself than anything else, though you might be able to cut the nodes out and make a mobile or something similar).
OPTIONS
Like any other compiler backend, this module needs to be invoked using the O module to run correctly:
perl -MO=Graph,-opt,-opt,-opt program.pl
OR
perl -MO=Graph,-opt,obj -e 'BEGIN {$obj = ["hi"]}; print $obj'
OR EVEN
perl -e 'use O qw(Graph -opt obj obj); print "hi!\n";'
Obj is the name of a perl variable whose contents will be examined. It can't be a my() variable, and it shouldn't have a prefix symbol ('$@^*'), though you can specify a package -- the name will be used to look up a GV, whose various fields will lead to the scalar, array, and other values that correspond to the named variable. If no object is specified, the whole main program, including the CV that points to its pad, will be displayed.
Each of the opts can come from one of the following sets (the options within each set are mutually exclusive; case and underscores are insignificant):
-text, -vcg, -dot
Produce output of the appropriate type. The default is '-text', which isn't useful for much of anything (it does draw some nice ASCII boxes, though).
-addrs, -no_addrs
Each of the nodes on the graph produced corresponds to a C structure that has an address and includes pointers to other structures. The module uses these addresses to decide how to draw edges, but it makes the graph more compact if they aren't printed. The default is '-no_addrs'.
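To illustrate how addresses can drive edge drawing, here is a minimal sketch that uses only the core B module. It is not B::Graph's actual implementation. The helper name next_chain_dot and its show_addrs argument are hypothetical. Each B object dereferences to the address of its underlying C structure, so that address can name a Dot node whether or not it is printed in the label.

```perl
use strict;
use warnings;
use B qw(svref_2object);

# Hypothetical helper (not part of B::Graph): emit Dot text for the
# op_next chain of a code reference. Nodes are named by the address of
# the underlying OP structure; the address appears in the label only
# when $show_addrs is true.
sub next_chain_dot {
    my ($code, $show_addrs) = @_;
    my @lines = ('digraph ops {');
    for (my $op = svref_2object($code)->START; $$op; $op = $op->next) {
        my $label = $op->name;
        $label .= sprintf(' 0x%x', $$op) if $show_addrs;
        push @lines, sprintf('  n%x [label="%s"];', $$op, $label);
        my $next = $op->next;
        # $$next is 0 for the null object that ends the chain.
        push @lines, sprintf('  n%x -> n%x;', $$op, $$next) if $$next;
    }
    push @lines, '}';
    return join "\n", @lines;
}

print next_chain_dot(sub { 1 + $_[0] }, 1), "\n";
```

Passing a false second argument yields the same edges with shorter labels, which is roughly the trade-off that '-no_addrs' makes.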
-compile_order, -run_order
The collection of OPs that perl compiles a script into has two different layers of structure. It has a tree structure which corresponds roughly to the syntactic nesting of constructs in the source text, and a roughly linked-list representation, essentially a postorder traversal of this tree, which is used at runtime to decide what to do next. The graph can be drawn to emphasize one structure or the other. The former, 'compile_order', is the default, as it tends to lead to graphs with aspect ratios close to those of standard paper.
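The two structures can be seen side by side with the core B module. The following is a minimal sketch, not code from B::Graph. The names tree_order and run_order and the sample sub are assumptions made for illustration. The tree walk follows op_first and op_sibling. The run-order walk follows op_next and ignores branch targets.

```perl
use strict;
use warnings;
use B qw(svref_2object OPf_KIDS);

# Tree view ("compile order"): descend through op_first, then walk the
# op_sibling links. Returns a reference to an indented list of op names.
sub tree_order {
    my ($op, $depth, $out) = @_;
    $depth ||= 0;
    $out   ||= [];
    push @$out, ('  ' x $depth) . $op->name;
    if ($op->flags & OPf_KIDS) {
        for (my $kid = $op->first; $$kid; $kid = $kid->sibling) {
            tree_order($kid, $depth + 1, $out);
        }
    }
    return $out;
}

# Execution view ("run order"): follow op_next from the start op.
# Branch targets (op_other) are ignored to keep the sketch short.
sub run_order {
    my ($op) = @_;
    my @out;
    for (; $$op; $op = $op->next) {
        push @out, $op->name;
    }
    return \@out;
}

# Hypothetical sample code, used only for illustration.
my $cv = svref_2object(sub { my $x = shift; $x + 1 });

print join("\n", 'tree:', @{ tree_order($cv->ROOT) }), "\n";
print 'run:  ', join(' ', @{ run_order($cv->START) }), "\n";
```

The exact op names vary between perl versions, but the root of the tree (leavesub here) is the last op reached in run order, as expected of a postorder traversal.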
-SVs, -no_SVs
If OPs represent a program's compiled code, SVs represent its data. This includes literal numbers and strings (IVs, NVs, PVs, PVIVs, and PVNVs), regular arrays, hashes, and references (AVs, HVs, and RVs), but also the structures that correspond to individual variables (special HVs for symbol tables and GVs to represent values within them, and special AVs that hold my() variables (as well as compiler temporaries)), structures that keep track of code (CVs), and a variety of others. The default is to display all these too, to give a complete picture, but if you aren't in a holistic mood, you can make them disappear.
-ellipses, -rhombs
The module tries to give the nodes representing SVs a different shape from those of OPs. OPs are usually rectangular, so two obvious shapes for SVs are ellipses and rhombuses (stretched diamonds). This option currently only makes a difference for VCG (ellipse is the default).
-stashes, -no_stashes
The hashes that perl uses to represent symbol tables are called 'stashes'. Since every GV has a pointer back to its stash, it's virtually inevitable for the links in a graph to lead to the main stash. Unfortunately stashes, especially the main one, can be quite big, and lead to forests of other structures -- there's one GV and another SV for each magic variable, plus all of @INC and %ENV, and so on. To prevent information overload, then, the display of stashes is disabled by default.
-fileGVs, -no_fileGVs
Another kind of graph element that can be annoying is the pointer from every GV and COP (a kind of OP that occurs for every statement) to the GV that represents the file from which that code came (used for error messages). By default, these links aren't shown, to keep them from cluttering the graph. Also, perl's internal interfaces changed in a recent version, so in perl 5.005_63 or later you can't see the fileGVs at all.
-SEQs, -no_SEQs
As it is visited in the peephole optimization phase, each OP gets a sequence number, which isn't currently used by anything (except the peephole optimizer itself, to avoid visiting OPs twice). If you want to see these, ask for them. (COPs have their own sequence numbers too, but they're more interesting to look at -- for instance, they're used to bound the lifetimes of lexicals.)
-types, -no_types
B::Graph always gives the type of each OP symbolically ('entersub'), but it can also print the numeric value of the type field, if you want. The default is no_types.
-float, -no_float
Almost every OP has an op_next and an op_sibling pointer, and B::Graph colors them distinctively (pink and light blue, respectively). Because of this, it isn't strictly necessary to 'anchor' the arrow on a line in the OP's box saying 'op_next'. The float option lets the graph layout engine start these arrows wherever it wants, which can sometimes lead to a more pleasing layout, at the expense of being less obvious. The default is not to float.
-targlinks, -no_targlinks
Lexical (my()) variables and temporary values used by individual OPs are stored in 'pads', per-code arrays linked to the CV. OPs store indexes into these arrays in the 'op_targ' field, but B::Graph can often also draw links directly from the OP to the SV that stores the name of the variable. These links don't correspond to any real pointers, however, and they can make the graph more complicated, so they are disabled by default.
WHAT DOES THIS ALL MEAN?
SvFLAGS abbreviations
Pb  SVs_PADBUSY   reserved for tmp or my already
Pt  SVs_PADTMP    in use as tmp
Pm  SVs_PADMY     in use as a "my" variable
T   SVs_TEMP      string is stealable?
O   SVs_OBJECT    is "blessed"
Mg  SVs_GMG       has magical get method
Ms  SVs_SMG       has magical set method
Mr  SVs_RMG       has random magical methods
I   SVf_IOK       has valid public integer value
N   SVf_NOK       has valid public numeric (float) value
P   SVf_POK       has valid public pointer (string) value
R   SVf_ROK       has a valid reference pointer
F   SVf_FAKE      glob or lexical is just a copy
L   SVf_OOK       has valid offset value (mnemonic: lvalue)
B   SVf_BREAK     refcnt is artificially low
Ro  SVf_READONLY  may not be modified
i   SVp_IOK       has valid non-public integer value
n   SVp_NOK       has valid non-public numeric value
p   SVp_POK       has valid non-public pointer value
S   SVp_SCREAM    has been studied?
V   SVf_AMAGIC    has magical overloaded methods
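As a rough illustration of how these abbreviations relate to the flag bits, the sketch below tests a few SvFLAGS bits through the core B module. It is not B::Graph's own code. The helper name sv_flag_abbrevs is hypothetical, and only four flags (I, N, P, R) are covered, because several of the other constants are not available in every perl version.

```perl
use strict;
use warnings;
use B qw(svref_2object SVf_IOK SVf_NOK SVf_POK SVf_ROK);

# Abbreviation => flag bit, for a small subset of the table above.
my @FLAG_ABBREV = (
    [ I => SVf_IOK ],
    [ N => SVf_NOK ],
    [ P => SVf_POK ],
    [ R => SVf_ROK ],
);

# Hypothetical helper: given a reference to a scalar, return the
# abbreviations of the (covered) flags set on that scalar.
sub sv_flag_abbrevs {
    my ($ref) = @_;
    my $flags = svref_2object($ref)->FLAGS;
    return join ',', map { $_->[0] } grep { $flags & $_->[1] } @FLAG_ABBREV;
}

my $int = 42;
my $str = "hi";
my $rv  = [];

print "int: ", sv_flag_abbrevs(\$int), "\n";
print "str: ", sv_flag_abbrevs(\$str), "\n";
print "rv:  ", sv_flag_abbrevs(\$rv),  "\n";
```

A scalar can carry several of these flags at once, for example after a string has been used in numeric context, which is why the helper returns a list rather than a single letter.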
op_flags abbreviations
V  OPf_WANT_VOID    Want nothing (void context)
S  OPf_WANT_SCALAR  Want single value (scalar context)
L  OPf_WANT_LIST    Want list of any length (list context)
K  OPf_KIDS         There is a firstborn child.
P  OPf_PARENS       This operator was parenthesized.
                    (Or block needs explicit scope entry.)
R  OPf_REF          Certified reference.
                    (Return container, not containee).
M  OPf_MOD          Will modify (lvalue).
T  OPf_STACKED      Some arg is arriving on the stack.
*  OPf_SPECIAL      Do something weird for this op (see op.h)
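The op_flags abbreviations can be decoded in much the same way as the SvFLAGS ones, with one wrinkle: the three "want" values share a two-bit field, so they have to be compared after masking with OPf_WANT rather than tested bit by bit. The sketch below is an illustration under that assumption, not B::Graph's implementation. The helper name op_flag_abbrevs is hypothetical, and P and M are left out for brevity.

```perl
use strict;
use warnings;
use B qw(svref_2object
         OPf_WANT OPf_WANT_VOID OPf_WANT_SCALAR OPf_WANT_LIST
         OPf_KIDS OPf_REF OPf_STACKED OPf_SPECIAL);

# Hypothetical helper: turn an op_flags integer into the abbreviations
# listed above (P and M omitted).
sub op_flag_abbrevs {
    my ($flags) = @_;
    my @out;

    # The context is a two-bit field, so mask first and compare values.
    my $want = $flags & OPf_WANT;
    push @out, 'V' if $want == OPf_WANT_VOID;
    push @out, 'S' if $want == OPf_WANT_SCALAR;
    push @out, 'L' if $want == OPf_WANT_LIST;

    # The remaining flags are independent single bits.
    push @out, 'K' if $flags & OPf_KIDS;
    push @out, 'R' if $flags & OPf_REF;
    push @out, 'T' if $flags & OPf_STACKED;
    push @out, '*' if $flags & OPf_SPECIAL;

    return join '', @out;
}

# Flags of the root op of a (hypothetical) sample sub.
my $root = svref_2object(sub { 1 })->ROOT;
print $root->name, ': ', op_flag_abbrevs($root->flags), "\n";
```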
BUGS
VCG has a problem with boxes that have more than about 55 arrows coming out of them, so with large arrays and hashes B::Graph will stop outputting edges and some boxes may be disconnected.
AUTHOR
Stephen McCamant <smcc@CSUA.Berkeley.EDU>
SEE ALSO
dot(1), xvcg(1), perl(1), perlguts(1).
If you like B::Graph, you might also be interested in Gisle Aas's PerlGuts Illustrated, at http://gisle.aas.no/perl/illguts/.