
NAME

Carp::Datum - Debugging And Tracing Ultimate Module

SYNOPSIS

# In modules
use Carp::Datum;

# Programming by contract
sub routine {
    DFEATURE my $f_, "optional message";    # $f_ is a lexical lvalue here
    my ($a, $b) = @_;
    DREQUIRE $a > $b, "a > b";
    $a += 1; $b += 1;
    DASSERT $a > $b, "ordering a > b preserved";
    my $result = $b - $a;
    DENSURE $result < 0;
    return DVAL $result;
}

# Tracing
DTRACE "this is a debug message";
DTRACE TRC_NOTICE, "note: a = ", $a, " is positive";
DTRACE {-level => TRC_NOTICE, -marker => "!!"}, "note with marker";

# Returning
return DVAL $scalar;     # single value
return DARY @list;       # list of values

# In application's main
use Carp::Datum qw(:all on);      # turns Datum "on" or "off"

DLOAD_CONFIG(-file => "debug.cf", -config => "config string");

DESCRIPTION

The Carp::Datum module brings powerful debugging and tracing features to your development code: automatic flow tracing, returned value tracing, assertions, and debugging traces. Its various functions may be customized dynamically (i.e. at run time) via a configuration language allowing selective activation on a routine, file, or object type basis. See Carp::Datum::Cfg for configuration details.

Carp::Datum traces are implemented on top of Log::Agent and go to its debugging channel. This gives the application full control over the final destination of the debugging information (logfile, syslog, etc...).

Carp::Datum can be globally turned on or off by the application. It is off by default, which means no control flow tracing (routine entry and exit), and no returned value tracing. However, assertions are still fully monitored, and the DTRACE calls are redirected to Log::Agent.

The C version of Carp::Datum is implemented with macros, which may be redefined to nothing to remove all assertions in the released code. The Perl version cannot be handled that way, but comes with a Carp::Datum::Strip module that will lexically remove all the assertions, leaving only DTRACE calls. Modules using Carp::Datum can make use of Carp::Datum::MakeMaker in their Makefile.PL to request stripping at build time. See Carp::Datum::MakeMaker for instructions.

Here is a small example showing what traces look like, and what happens by default on assertion failure. Since we're not customizing Log::Agent, the debugging channel is STDERR. In real life, one would probably customize Log::Agent with a file driver, and redirect the debug channel to a file separate from both STDOUT and STDERR.

First, the script, with line numbers:

 1 #!/usr/bin/perl
 2 
 3 use Carp::Datum qw(:all on);
 4 
 5 show_inv(2, 0.5, 0);
 6 
 7 sub show_inv {
 8     DFEATURE my $f_;
 9     foreach (@_) {
10         print "Inverse of $_ is ", inv($_), "\n";
11     }
12     return DVOID;
13 }
14 
15 sub inv {
16     DFEATURE my $f_;
17     my ($x) = @_;
18     DREQUIRE $x != 0, "x=$x not null";
19     return DVAL 1 / $x;
20 }
21 

What goes to STDOUT:

Inverse of 2 is 0.5
Inverse of 0.5 is 2

The debugging output on STDERR:

   +-> main::show_inv(2, 0.5, 0) from global at demo:5 [demo:8]
   |  +-> main::inv(2) from main::show_inv() at demo:10 [demo:16]
   |  |  Returning: (0.5) [demo:19]
   |  +-< main::inv(2) from main::show_inv() at demo:10
   |  +-> main::inv(0.5) from main::show_inv() at demo:10 [demo:16]
   |  |  Returning: (2) [demo:19]
   |  +-< main::inv(0.5) from main::show_inv() at demo:10
   |  +-> main::inv(0) from main::show_inv() at demo:10 [demo:16]
!! |  |  pre-condition FAILED: x=0 not null ($x != 0) [demo:18]
!! |  |  main::inv(0) called at demo line 10
!! |  |  main::show_inv(2, 0.5, 0) called at demo line 5
** |  |  FATAL: PANIC: pre-condition FAILED: x=0 not null ($x != 0) [demo:18]
   |  +-< main::inv(0) from main::show_inv() at demo:10
   +-< main::show_inv(2, 0.5, 0) from global at demo:5
   PANIC: pre-condition FAILED: x=0 not null ($x != 0) [demo:18]

The last three lines were manually re-ordered for this manpage: because of the pre-condition failure, Perl enters its global object destruction routine, and the destruction order of the lexicals is not right. The $f_ in show_inv() is destroyed before the one in inv(), resulting in the inversion, which we fixed to better please the eye. Likewise, the PANIC is really emitted as soon as the pre-condition failure is detected, but showing it there would have messed up the trace example.

Note that the stack dump is prefixed with the "!!" token, and the fatal error is tagged with "**". This is a visual aid only, to quickly locate troubles in logfiles by catching the eye.

Routine entry and exit are tagged, returned values and parameters are shown, and the immediate caller of each routine is also traced. The final [demo:8] tags refer to the file name (here the script I used was called "demo") and the line number where the call to the Carp::Datum routine is made: here the DFEATURE at line 8.

The special name "global" (without trailing () marker) is used to indicate that the caller is the main script, i.e. there is no calling routine.

Returned values in inv() are traced as "(0.5)" and "(2)", and not as "0.5" and "2" as one would expect, because the routine was called in non-scalar context (within a print statement).
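The context remark can be checked with plain Perl, independently of Carp::Datum. The sketch below uses a made-up routine named context() that reports how it was called; it shows that arguments of print are evaluated in list context.

```perl
use strict;
use warnings;

# Hypothetical helper: reports the context in which it was called.
sub context { return wantarray ? "list" : "scalar" }

my $in_scalar = context();      # scalar assignment
my ($in_list) = context();      # list assignment

print "assignment to a scalar: $in_scalar\n";
print "assignment to a list:   $in_list\n";
print "argument of print:      ", context(), "\n";
```

The last line reports "list", which is why a routine called from within a print statement has its returned value traced in parenthesized form.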

PROGRAMMING BY CONTRACT

Introduction

The Programming by Contract paradigm was introduced by Bertrand Meyer in his Object Oriented Software Construction book, and later implemented natively in the Eiffel language. It is very simple, yet extremely powerful.

Each feature (routine) of a program is viewed externally as a supplier for some service. For instance, the sqrt() routine computes the square root of any positive number for us. We might do the computation ourselves, but sqrt() probably provides an efficient algorithm for that, and it has already been written and validated for us.

However, sqrt() is only defined for positive numbers. Giving a negative number to it is not correct. In the old days, before Programming by Contract was formalized, people implemented that restriction by testing the argument x of sqrt(), and doing so in the routine itself to factorize code. Then, on error, sqrt() would return -1 for instance (which cannot be a valid square root for a real number), and the desired quantity otherwise. The caller then had to check the returned value to determine whether an error had occurred. Here it is easy, but in languages where no out-of-band value such as Perl's undef is available, it can be quite difficult to both report an error and return a result.

With Programming by Contract, the logic is reversed, and the code is really simplified:

  • It is up to the caller to always supply a positive value to sqrt(), i.e. to check the value first.

  • In return, sqrt() promises to always return the square root of its argument.

What are the benefits of such a gentlemen's agreement? The code of the sqrt() routine is much simpler (which means it has fewer bugs) because it does not have to bother with handling the case of negative arguments, since the caller promised to never call with such invalid values. And the code of the caller is at worst as complex as before (one test to check that the argument is positive, against a check for an error code) and at best less complex: if we know that the value is positive, we don't even have to check, for instance if it is the result of an abs() call.

But there is no explicit test in sqrt() to trap negative arguments, so what happens if we give sqrt() a negative value despite our promise never to do so? Well, it's a bug, and it's a bug in the caller, not in the sqrt() routine.

To find those bugs, one usually monitors the assertions (pre- and post-conditions, plus any other assertion in the code, which is both a post-condition for the code above and a pre-condition for the code below, at the same time) during testing. When the product is released, assertions are no longer checked.
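The two styles discussed in this introduction can be contrasted in a short plain-Perl sketch. It does not use Carp::Datum: the names checked_sqrt, contract_sqrt and $CHECK_CONTRACTS are hypothetical, and a plain die stands in for the assertion machinery.

```perl
use strict;
use warnings;

# Hypothetical switch: on during testing, off in the released product.
our $CHECK_CONTRACTS = 1;

# Old style: the routine validates its argument and reports failure
# through a sentinel value (-1) that every caller must remember to test.
sub checked_sqrt {
    my ($x) = @_;
    return -1 if $x < 0;
    return sqrt($x);
}

# Contract style: the caller guarantees $x >= 0. The check below only
# monitors that promise while contracts are enabled.
sub contract_sqrt {
    my ($x) = @_;
    die "pre-condition FAILED: x=$x not negative\n"
        if $CHECK_CONTRACTS && $x < 0;
    return sqrt($x);
}

# A caller that knows its value cannot be negative needs no test at all.
my $root = contract_sqrt(abs(-16));
print "root = $root\n";

# A caller breaking the contract is caught during testing.
my $ok = eval { contract_sqrt(-4); 1 };
print "caught: $@" unless $ok;
```

In the contract version the failure message points at a bug in the caller, whereas the sentinel version silently hands -1 to any caller that forgets to test for it.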

Formalism

Each routine is equipped with a set of pre-conditions and post-conditions. A routine r is therefore defined as:

r(x)
  pre-condition
  body
  post-condition

The pre- and post-conditions are expressions involving the parameters of r(), here only x, and, for the post-condition, the returned value of r() as well. Conditions satisfying this property are made visible to the clients, and become the routine's contract, which can be written as:

  • You, the caller, promise to always call me with my pre-condition satisfied. Failure to do so will be a bug in your code.

  • I promise you, the caller, that my implementation will then perform correctly and that my post-condition will be satisfied. Failure to do so will be a bug in my code.

In object-oriented programming, pre- and post-conditions can also use internal attributes of the object, but they then become debugging checks that everything happens correctly (in the proper state, the proper order, etc...) and cannot be part of the contract (for external users of the class): clients cannot check that the pre-condition is true, because they do not have access to the internal attributes.

Furthermore, in object-oriented programming, a redefined feature may only weaken the pre-condition of its parent feature and may only strengthen its post-condition. It can also keep them as-is. To fully understand why, it's best to read Meyer. Intuitively, it's easy to understand why the pre-condition cannot be strengthened and why the post-condition cannot be weakened: because of dynamic binding, a caller of r() only knows the static type of the object, not its dynamic type. Therefore, it cannot know in advance which of the routines in the inheritance tree will be called.
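The redefinition rule can be illustrated with a small plain-Perl sketch. The Base and Derived classes and their root() feature are invented for this example, and die stands in for the assertion routines. Derived accepts more inputs than Base (weaker pre-condition) and guarantees more about its result (stronger post-condition), so a caller that only knows the contract of Base is safe with either object.

```perl
use strict;
use warnings;

package Base;
sub new { my ($class) = @_; return bless {}, $class }

# Contract: requires $x > 0; ensures result >= 0.
sub root {
    my ($self, $x) = @_;
    die "pre-condition FAILED: x > 0\n" unless $x > 0;
    my $r = sqrt($x);
    die "post-condition FAILED: result >= 0\n" unless $r >= 0;
    return $r;
}

package Derived;
our @ISA = ('Base');

# Redefinition: requires $x >= 0 (weaker, also accepts 0);
# ensures result >= 0 and result squared is close to $x (stronger).
sub root {
    my ($self, $x) = @_;
    die "pre-condition FAILED: x >= 0\n" unless $x >= 0;
    my $r = sqrt($x);
    die "post-condition FAILED: result >= 0 and result**2 == x\n"
        unless $r >= 0 && abs($r * $r - $x) <= 1e-9 * ($x || 1);
    return $r;
}

package main;

# The caller relies only on Base's contract, whatever the dynamic type.
for my $obj (Base->new, Derived->new) {
    printf "%s: %g\n", ref($obj), $obj->root(9);
}
```

Had Derived required, say, $x > 10 instead, the same loop would fail on the Derived object even though the caller honoured everything Base asked for.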

Common Pitfalls

  • Do not write both a pre-condition and a test with the same expression.

  • Never write a pre-condition when you wish to validate user input!

  • Never write a test on an argument when failure means an error; use a pre-condition instead.

    If your pre-condition is so important that you would like to always monitor it, even within the released product, then Carp::Datum provides you with VERIFY, a pre-condition that will always be checked (i.e. never stripped by Carp::Datum::Strip). Use it to protect the external interface of your module against abuse.

Implementation

With Carp::Datum, pre-conditions can be given using DREQUIRE or VERIFY. Assertions are written with DASSERT and post-conditions given by DENSURE.

Although you could technically do with only DASSERT to express all your assertions, stating whether it's a pre-condition with DREQUIRE also has a commentary value for the reader. Moreover, one day, there might be an automatic tool to extract the pre- and post-conditions of all the routines for documentation purposes, and if all your assertions are called DASSERT, the tool will have a hard time figuring out which is which.

Moreover, remember that a pre-condition failure always means a bug in the caller, whilst other assertion failures mean a bug near the place of failure. If only for that, it's worth making the distinction.

INTERFACE

Control Flow

DFEATURE my $f_, optional comment

This statement marks the very top of any routine. Do not omit the my, which is very important to ensure that what is going to be stored in the lexically scoped $f_ variable will be destroyed when the routine ends. You can use any name for that lexical, but we recommend that name as being both unlikely to conflict with any real variable and short.

The optional comment part will be printed in the logs at routine entry time, and can be used to flag object constructors, for instance, for easier grep'ing in the logs afterwards.
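The reason the my matters is the usual Perl scope-guard idiom: an object held in a lexical is destroyed as soon as the enclosing routine returns. The sketch below shows that idiom in isolation. ScopeTracer is a hypothetical package written for this example and is not Carp::Datum's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the object stored in $f_: it logs on
# creation and again when Perl destroys it at scope exit.
package ScopeTracer;

our @LOG;    # trace lines, kept in memory for the example

sub new {
    my ($class, $name) = @_;
    push @LOG, "+-> $name";
    return bless { name => $name }, $class;
}

sub DESTROY {
    my ($self) = @_;
    push @LOG, "+-< $self->{name}";
}

package main;

sub routine {
    my $f_ = ScopeTracer->new("main::routine");   # lexical: freed on return
    push @ScopeTracer::LOG, "    body runs";
    return 42;
}

routine();
print "$_\n" for @ScopeTracer::LOG;
```

If the object were stored in a package variable instead of a lexical, it would outlive the call and the exit line would only appear when the variable is overwritten or the program ends.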

return DVOID

Use this when you would otherwise return from the routine by saying return. It allows tracing of the return statement.

return DVAL scalar

Use this form when returning something in scalar context. Do not put any parentheses around your scalar, or it will be incorrectly stripped by Carp::Datum::Strip. Examples:

return DVAL 5;                      # OK
return DVAL ($a == 1) ? 2 : 4;      # WRONG (has parentheses)
return DVAL (1, 2, 4);              # WRONG (and will return 4)

my $x = ($a == 1) ? 2 : 4;
return DVAL $x;                     # OK

return DVAL &foo();                 # Will be traced as array context

Using DVAL allows tracing of the returned value.
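The "will return 4" remark in the examples above follows from ordinary Perl semantics rather than from anything specific to Carp::Datum: in scalar context, the comma operator yields its last operand. A plain-Perl sketch, using a made-up routine named three():

```perl
use strict;
use warnings;

# The parenthesized list is evaluated in the caller's context.
sub three { return (1, 2, 4) }

my @all  = three();    # list context: all three values
my $last = three();    # scalar context: only the last operand survives

print "list context:   @all\n";
print "scalar context: $last\n";
```

This is why a list of values belongs with DARY, and why a value handed to DVAL should be a single scalar expression.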

return DARY (list)

Use this form when returning something in list context. Using DARY allows tracing of the returned values.

return DARY @x;

When you have a routine returning something different depending on its calling context, then you have to write:

return DARY @x if wantarray;
return DVAL $x;

Be very careful with that, otherwise your program will behave differently when the DARY and DVAL tokens are stripped by Carp::Datum::Strip, thereby raising subtle bugs.
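To see why the two-statement form is needed, it helps to look at what plain Perl does without any tracing tokens. In the sketch below, naive() and careful() are invented names: the first returns an array unconditionally, so a scalar-context caller receives the element count, while the second chooses its value according to wantarray.

```perl
use strict;
use warnings;

# Returns the array whatever the context: a scalar caller gets the count.
sub naive {
    my @x = qw(a b c);
    return @x;
}

# Explicit form: the list in list context, a chosen scalar otherwise.
sub careful {
    my @x = qw(a b c);
    my $x = $x[0];
    return @x if wantarray;
    return $x;
}

my $n = naive();
my $c = careful();
my @l = careful();

print "naive in scalar context:   $n\n";
print "careful in scalar context: $c\n";
print "careful in list context:   @l\n";
```

Whatever the instrumented routine returns in each context should match what the explicit plain-Perl form returns, so that stripping the tokens does not change behaviour.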

Programming by Contract

DREQUIRE expr, tag

Specify a pre-condition expr, along with a tag that will be printed whenever the pre-condition fails, i.e. when expr evaluates to false. You may use the tag string to actually dump the faulty value, for instance:

DREQUIRE $x > 0, "x = $x positive";

The tag is optional and may be left off.

VERIFY expr, tag

This is really the same as DREQUIRE, except that it will not be stripped by Carp::Datum::Strip, and that it will always be monitored and cause a fatal error on failure, whatever dynamic configuration you set up.

DENSURE expr, tag

Specify a post-condition expr, along with an optional tag that will be printed whenever the post-condition fails, i.e. when expr evaluates to false.

DASSERT expr, tag

Specify an assertion expr, and an optional tag printed when expr evaluates to false.

Tracing

Tracing is ensured by the DTRACE routine, which is never stripped. When Carp::Datum is off, traces are redirected to Log::Agent (the channel then depends on the level of the trace).

The following forms can be used, from the simpler to the more complex:

DTRACE "the variable x+1 is ", $x + 1, " and y is $y";
DTRACE TRC_WARNING, "a warning message";
DTRACE { -level => TRC_CRITICAL, -marker => "##" }, "very critical";

The first call emits a trace at the TRC_DEBUG level, by default. The second call emits a warning at the TRC_WARNING level, and the last call emits a TRC_CRITICAL message prefixed with a marker.

Markers are 2-char strings emitted in the very first columns of the debugging output, and can be used to put emphasis on some particular important messages. Internally, Carp::Datum and Log::Agent use the following markers:

!!    assertion failure and stack trace
**    critical errors, fatal if not trapped by eval {}
>>    a message emitted via a Log::Agent routine, not DTRACE

The table below lists the available TRC_ levels defined by Carp::Datum, and how they remap to Log::Agent routines when Carp::Datum is off:

 Carp::Datum     Log::Agent
-------------   -------------
TRC_EMERGENCY   logdie
TRC_ALERT       logerr
TRC_CRITICAL    logerr
TRC_ERROR       logerr
TRC_WARNING     logwarn
TRC_NOTICE      logsay
TRC_INFO        logtrc "info"
TRC_DEBUG       logtrc "debug"

If your application does not configure Log::Agent specially, all the calls map nicely to perl's native routines (die, warn and print).

Convenience Routines

equiv expr1, expr2

Returns true when both expr1 and expr2 have the same truth value, whether they are both true or both false.

implies expr1, expr2

Returns the truth value of expr1 implies expr2, which is the same as:

!expr1 || expr2

It is always true except when expr1 is true and expr2 is false.

Warning: this is a function, not a macro. That is to say, both arguments are evaluated, and there is no short-circuit when expr1 is false.
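The difference between the function form and the operator form can be observed directly. The sketch below defines its own implies and equiv following the semantics documented here; they are stand-ins written for this example, not the module's source, and expensive() is a made-up routine that counts its calls.

```perl
use strict;
use warnings;

# Stand-in definitions following the documented semantics.
sub implies { my ($p, $q) = @_; return (!$p || $q)  ? 1 : 0 }
sub equiv   { my ($p, $q) = @_; return (!$p == !$q) ? 1 : 0 }

my $calls = 0;
sub expensive { $calls++; return 1 }

# Function form: expensive() runs although the first argument is false.
my $as_function = implies(0, expensive());
my $after_function = $calls;

# Operator form: || short-circuits, so expensive() is skipped.
my $as_operator = (!0 || expensive()) ? 1 : 0;
my $after_operator = $calls;

print "function form: result=$as_function, calls so far=$after_function\n";
print "operator form: result=$as_operator, calls so far=$after_operator\n";
print "equiv(1, 'yes') = ", equiv(1, 'yes'), "\n";
print "equiv(0, 'yes') = ", equiv(0, 'yes'), "\n";
```

Both forms give the same truth value; only the function form pays for evaluating the second argument when the first is false, which matters if that argument is costly or has side effects.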

LIMITATIONS

It's not possible to insert tracing hooks like DFEATURE or DVAL in stringification overloading routines. For DFEATURE, that is because the argument list might be dumped, and printing $self will re-invoke the stringification routine recursively. For DVAL, this is implied by the fact that there cannot be any DFEATURE in the routine, hence DVAL cannot be used.
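The recursion described above can be reproduced without Carp::Datum. In this sketch, Point, trace_args and the $depth guard are all hypothetical: trace_args plays the role of an entry hook that dumps its argument list, and the guard exists only so that the demonstration terminates.

```perl
use strict;
use warnings;

package Point;
use overload '""' => \&to_string;

our $depth = 0;

sub new { my ($class, %attr) = @_; return bless {%attr}, $class }

# Hypothetical tracer: like an entry hook, it dumps the argument list.
# join() stringifies each argument, including an overloaded $self.
sub trace_args {
    return join(", ", @_);
}

sub to_string {
    my ($self) = @_;
    local $depth = $depth + 1;
    # Guard added for the demo; without it the recursion never ends.
    return "<recursion stopped at depth $depth>" if $depth > 3;
    my $dump = trace_args($self);    # re-enters to_string
    return "Point($self->{x}, $self->{y}) [args: $dump]";
}

package main;

my $p = Point->new(x => 1, y => 2);
print "$p\n";
```

Each stringification triggers the tracer, which triggers another stringification, and so on until the guard cuts it short.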

BUGS

Please report them to the authors.

AUTHORS

Christophe Dehaudt <christophe@dehaudt.org> and Raphael Manfredi <Raphael_Manfredi@pobox.com>.

SEE ALSO

Carp::Datum::Cfg(3), Carp::Datum::MakeMaker(3), Carp::Datum::Strip(3), Log::Agent(3).
