NAME
Apocalypse 20 - Debugging [DRAFT]
VERSION
Maintainer: pugs committers
Date: 10 Apr 2005
Last modified: 10 Apr 2005
This document proposes how debugging and AST introspection hooks might look in Perl 6, in the context of MMD and macros.
PURPOSE
An attempt at drafting a callback API that should allow useful and simple implementation of:
The Perl 6 debugger.
Infectious traits, enabling the implementation of taint mode and more complex variations on the theme.
EXAMPLE
With taint mode generalized into a debugging system, evaluating this Perl statement triggers a series of callbacks:
$x = ($a ?? $b !! $c); # assuming $a is true
----------Xs----------
----------Xe---------
     --------E-------
      A-    B-    C-
Assuming:
Xs is the whole statement
Xe is the single expression in Xs, it is apply(=, $x, ...);
E is the ( ?? !! ) expression, it is apply(??!!, A, B, C);
A, B, C are expressions for the vars $a, $b, $c ($x expr omitted for brevity)
The statement yields the following callbacks:
eval(Xs);
eval(Xe);
eval(E);
apply(??!!, A, B, C);
eval(A);
# evaluated
participated(A, $a); # $a is a value, perhaps is copy
participated(E, $a);
reduced(A, $a is copy); # also a value, even more so
eval(B); # because A reduced to a true value
participated(E, $a, $b);
participated(B, $b);
participated(E, $b);
reduced(B, $b);
reduced(E, $b);
participated(Xe, $b); # by the reduction, we now know that $b interacts in Xe
participated(E, $x, $a, $b); # the order is from largest permutation to smallest
participated(E, $x, $a);
participated(E, $x, $b);
participated(Xe, $x, $b);
participated(E, $x);
participated(Xe, $x);
apply(=, $x, $b); # the same $b that E was reduced to
participated(Xe, $x, $b);
reduced(Xe, $x); # but really $b by now
participated(Xs, $a, $b, $x); # a sort of catchall summary of the permutations called therein
Participation is noted for every combination of values in an expression, as soon as it can be determined.
participated(A, $a) is called once, as soon as possible. By that time we know that
participated(E, $a) should also be called, so we do that.
participated(B, $b) is called thereafter. We now know that
participated(E, $a, $b) should also happen, so we do that too.
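The shape of this trace can be modelled outside Perl 6. The Python sketch below is a toy under stated assumptions: the Var and Ternary classes, the evaluate function, and the log format are all hypothetical, and it reports only single-value participation, not every combination of values.

```python
from dataclasses import dataclass


@dataclass
class Var:
    """Leaf expression: a variable lookup such as $a (hypothetical model)."""
    label: str    # expression label used in the trace, e.g. "A"
    name: str     # variable name, e.g. "$a"
    value: object


@dataclass
class Ternary:
    """The ( ?? !! ) expression: apply(??!!, cond, then, other)."""
    label: str
    cond: Var
    then: Var
    other: Var


def evaluate(node, log, parents=()):
    """Evaluate node, appending eval/participated/reduced events to log.

    Returns the name of the variable the node reduced to, and its value.
    """
    log.append(f"eval({node.label})")
    if isinstance(node, Var):
        name, value = node.name, node.value
        # A value participates in its own expression first, then in
        # every enclosing expression, innermost first.
        for label in (node.label, *parents):
            log.append(f"participated({label}, {name})")
    else:
        inner = (node.label, *parents)
        _, truth = evaluate(node.cond, log, inner)
        # Only the selected branch is evaluated.
        branch = node.then if truth else node.other
        name, value = evaluate(branch, log, inner)
    log.append(f"reduced({node.label}, {name})")
    return name, value


expr = Ternary("E", Var("A", "$a", True), Var("B", "$b", 1), Var("C", "$c", 2))
trace = []
evaluate(expr, trace)
print("\n".join(trace))
```

Because $a is true, the sketch never emits eval(C), which mirrors the way the trace above skips the C expression.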
Note that the reduced values are also participating:
substr($x, 3, 5);
...
reduced(X, $substr); # $substr is a new value containing the return value
participated(X, $substr);
participated(X, $substr, $x);
...
Literals participate too: participated is also called on 3 and 5 in that example.
OVERHEAD VS FEASIBILITY
You look at the interaction between values that want to hear about their interactions and variable container values, apply that to expressions at the AST level, and from there work out the subset of suspected variables/values that might care about being told they participated in the expression.
This is just like pre-dispatching MMD at the compiler level.
Basically, you use the system itself to determine where it will be necessary. Isn't that cool?
If it is unused, it should have zero overhead.
(Note: If I understand the above correctly, this hook system is exponential in the size of the expression. Yikes. -luqui)
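To put a number on that concern, here is a small Python sketch. It assumes, as the note reads the proposal, that every non-empty combination of the values in an expression gets its own participated call, ordered from largest to smallest. The function name participation_sets is made up for illustration.

```python
from itertools import combinations


def participation_sets(values):
    """Return every non-empty combination of values, largest first."""
    result = []
    for size in range(len(values), 0, -1):
        result.extend(combinations(values, size))
    return result


for combo in participation_sets(["$x", "$a", "$b"]):
    print("participated(E, " + ", ".join(combo) + ")")

# n values give 2**n - 1 combinations, so the count roughly doubles
# with every extra value in the expression.
for n in (3, 10, 20):
    print(n, 2**n - 1)
```

Under that assumption, three values produce 7 calls and twenty values produce over a million, which is why pruning the set of interested values at compile time matters.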
THEORETICAL USAGE
DEBUGGING TRAPS
multi sub trap_eval (Statement $x where { $x == $trapped }) {
...
}
(I don't know yet what trap_eval will be named.)
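The idea of a trap that fires only for one statement can be approximated in Python with a list of predicates and handlers. This is a sketch only: TrapRegistry, trap, and eval_statement are hypothetical names, statements are plain strings, and unlike real MMD it runs every matching handler and does not pick the most specific candidate.

```python
class TrapRegistry:
    """Hypothetical stand-in for multi subs with `where` constraints."""

    def __init__(self):
        self._traps = []  # (predicate, handler) pairs, in registration order

    def trap(self, predicate):
        """Decorator: run the handler for statements matching predicate."""
        def register(handler):
            self._traps.append((predicate, handler))
            return handler
        return register

    def eval_statement(self, statement):
        """Called by the (imaginary) evaluator before each statement."""
        for predicate, handler in self._traps:
            if predicate(statement):
                handler(statement)


registry = TrapRegistry()
trapped = "$x = 1;"
hits = []


@registry.trap(lambda stmt: stmt == trapped)
def on_trapped(stmt):
    hits.append(stmt)  # a debugger would stop here instead


for stmt in ["$y = 2;", "$x = 1;", "say $x;"]:
    registry.eval_statement(stmt)
```

After the loop, hits holds only the trapped statement; the other two statements pass through without calling the handler.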
THE TAINTED TRAIT
A tainted trait should work by trapping
participated(_, $x is tainted, $y isn't tainted);
and then setting tainted on $y. This requires that the MMD dispatch doesn't care if
participated(X, $a, $b is tainted);
becomes
participated(X, $b, $a);
Similarly, file handles and system calls are supplemented with more specific MMD dispatches, whose prototype contains 'does tainted', to disallow tainted content, or wrapped to mark taintedness.
Untainting is either explicit, by use of functions which just remove the trait, or implicit:
a trap on the apply of the match operator (or perhaps 'reduced' on an expression containing the application of a rule match... We'll see how the macro AST for Perl expressions will look) will untaint the captures inside the rule match object, after they were infected by being derived.
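The taint life cycle described above (infection through participation, a sink that refuses tainted input, and untainting through a match capture) might be modelled like this in Python. All names here (Value, participated, concat, system, untaint_match) are hypothetical, and the sketch simplifies untainting by returning the capture as a fresh value; it does not infect the capture and then clean it.

```python
import re


class Value:
    """A value that can carry traits, such as 'tainted' (hypothetical)."""

    def __init__(self, data, tainted=False):
        self.data = data
        self.tainted = tainted


def participated(expr, *values):
    """Taint hook: the order of the values does not matter."""
    if any(v.tainted for v in values):
        for v in values:
            v.tainted = True


def concat(a, b):
    """An operation whose result is derived from both operands."""
    result = Value(a.data + b.data)
    participated("concat", a, b, result)  # reduced values participate too
    return result


def system(command):
    """A sink that refuses tainted content."""
    if command.tainted:
        raise PermissionError("tainted value passed to system")
    return "ran: " + command.data


def untaint_match(value, pattern):
    """Return the first capture of a match as a fresh, untainted value."""
    m = re.search(pattern, value.data)
    return Value(m.group(1)) if m else None


user_input = Value("notes.txt; rm -rf /", tainted=True)
command = concat(Value("cat "), user_input)
try:
    system(command)
    blocked = False
except PermissionError:
    blocked = True

safe = untaint_match(user_input, r"^([\w.]+)")
result = system(concat(Value("cat "), safe))
```

The first call to system is rejected because the command was derived from tainted input; the second succeeds because only the matched capture is used.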
You could also define expressions which taint using macros:
taints {
...
}
The macro will take the AST for the block it encompasses, and taint everything that goes on inside the block.
DATA SEGREGATION
my $user1_dbh does segregated :parent($user1);
my $user2_dbh does segregated :parent($user2);
Data coming from $user1_dbh will be segregated too, belonging to the object $user1.
When data from $user1 and $user2 meet, an exception is thrown.
If I'm implementing an IRC channel, for example, all input from a user's socket will be segregated.
Non-command input will be unsegregated.
In this case a fatal error is thrown if, due to a bug in the server, a user's password entered in a command is sent to a chatroom, which would try to make that data interact with other users' sockets.
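The IRC scenario can be sketched in Python as an ownership check inside a participation hook. The names below (Segregated, SegregationError, read_from, send) are invented for illustration, owners are plain strings, and the explicit owner=None for chat text stands in for whatever mechanism would unsegregate non-command input.

```python
class SegregationError(Exception):
    """Raised when data belonging to different owners interacts."""


class Segregated:
    """A value tagged with the object it belongs to (owner may be None)."""

    def __init__(self, data, owner=None):
        self.data = data
        self.owner = owner


def participated(expr, *values):
    """Segregation hook: at most one distinct owner per expression."""
    owners = {v.owner for v in values if v.owner is not None}
    if len(owners) > 1:
        raise SegregationError(f"{expr}: data from {sorted(owners)} met")


def read_from(handle, data):
    """Data coming from a segregated handle inherits its owner."""
    return Segregated(data, owner=handle.owner)


def send(socket, message):
    """Writing to a socket makes the message interact with it."""
    participated("send", socket, message)
    return f"{socket.data} <- {message.data}"


sock1 = Segregated("socket1", owner="user1")
sock2 = Segregated("socket2", owner="user2")

password = read_from(sock1, "PASS hunter2")  # command input stays segregated
chat = Segregated("hello")                   # non-command input, unsegregated

delivered = send(sock2, chat)
try:
    send(sock2, password)
    leaked = True
except SegregationError:
    leaked = False
```

Ordinary chat text reaches the other user's socket, while the password read from the first user's socket raises before it can be delivered.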
CODE COVERAGE METRICS
Just make a trap on statements, and make the callback record.
This model allows rather deep introspection, useful for detecting dead code.
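A statement trap that records what ran is enough for a basic coverage report. The Python sketch below assumes a toy evaluator in which each statement is an (id, guard) pair; trap_eval, run, and the guards are hypothetical stand-ins for the real hooks.

```python
executed = set()


def trap_eval(statement_id):
    """Statement trap: record that the statement was evaluated."""
    executed.add(statement_id)


def run(statements, env):
    """Evaluate (id, guard) pairs; a statement runs when its guard holds."""
    for statement_id, guard in statements:
        if guard(env):
            trap_eval(statement_id)


program = [
    ("s1", lambda env: True),
    ("s2", lambda env: env["debug"]),  # only reached in debug mode
    ("s3", lambda env: True),
]

run(program, {"debug": False})
dead = [sid for sid, _ in program if sid not in executed]
print(dead)
```

Statements that never show up in executed across all test runs are candidates for dead code.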
PROFILING (OR NOT)
Just as you could do coverage analysis for code, you could profile it, provided you can factor out the overhead of the hook MMDs. Perhaps counters on the AST that also track time are in order. This might require work at the Parrot level, but the data could definitely be wrangled by these callbacks later.
We'll see.