NAME

OptreeCheck - check optrees as rendered by B::Concise

SYNOPSIS

OptreeCheck supports 'golden-sample' regression testing of perl's parser, optimizer, and bytecode generator, via a single function: checkOptree(%in).

It invokes B::Concise upon the sample code, checks that the rendering 'agrees' with the golden sample, and reports mismatches.

Additionally, the module processes @ARGV (which is typically unused in the Core test harness), and thus provides a means to run the tests in various modes.

EXAMPLE

 # your test file
 use OptreeCheck;
 plan tests => 1;

 checkOptree (
   name   => 'test-name',	# optional, made from others if not given

   # code-under-test: must provide 1 of them
   code   => sub {my $a},	# coderef, or source (wrapped and evald)
   prog   => 'sort @a',	# run in subprocess, aka -MO=Concise
   bcopts => '-exec',		# $opt or \@opts, passed to B::Concise::compile

   errs   => 'Useless variable "@main::a" .*',	# str, regex, [str+] [regex+]

   # various test options
   # errs   => '.*',		# match against any emitted errs, -w warnings
   # skip => 1,		# skips test
   # todo => 'excuse',		# anticipated failures
   # fail => 1			# force fail (by redirecting result)
   # retry => 1		# retry on test failure
   # debug => 1,		# use re 'debug' for retried failures !!

   # the 'golden-sample's, (must provide both)

   expect => <<'EOT_EOT', expect_nt => <<'EONT_EONT' );  # start HERE-DOCS
# 1  <;> nextstate(main 45 optree.t:23) v
# 2  <0> padsv[$a:45,46] M/LVINTRO
# 3  <1> leavesub[1 ref] K/REFC,1
EOT_EOT
# 1  <;> nextstate(main 45 optree.t:23) v
# 2  <0> padsv[$a:45,46] M/LVINTRO
# 3  <1> leavesub[1 ref] K/REFC,1
EONT_EONT

__END__

Failure Reports

Here's a sample failure, as induced by the following command.
Note the option=value argument after the test file; more on that later.

$> PERL_CORE=1 ./perl ext/B/t/optree_check.t  testmode=cross
...
ok 19 - canonical example w -basic
not ok 20 - -exec code: $a=$b+42
# Failed at test.pl line 249
#      got '1  <;> nextstate(main 600 optree_check.t:208) v
# 2  <#> gvsv[*b] s
# 3  <$> const[IV 42] s
# 4  <2> add[t3] sK/2
# 5  <#> gvsv[*a] s
# 6  <2> sassign sKS/2
# 7  <1> leavesub[1 ref] K/REFC,1
# '
# expected /(?ms-xi:^1  <;> (?:next|db)state(.*?) v
# 2  <\$> gvsv\(\*b\) s
# 3  <\$> const\(IV 42\) s
# 4  <2> add\[t\d+\] sK/2
# 5  <\$> gvsv\(\*a\) s
# 6  <2> sassign sKS/2
# 7  <1> leavesub\[\d+ refs?\] K/REFC,1
# $)/
# got:          '2  <#> gvsv[*b] s'
# want:  (?-xism:2  <\$> gvsv\(\*b\) s)
# got:          '3  <$> const[IV 42] s'
# want:  (?-xism:3  <\$> const\(IV 42\) s)
# got:          '5  <#> gvsv[*a] s'
# want:  (?-xism:5  <\$> gvsv\(\*a\) s)
# remainder:
# 2  <#> gvsv[*b] s
# 3  <$> const[IV 42] s
# 5  <#> gvsv[*a] s
# these lines not matched:
# 2  <#> gvsv[*b] s
# 3  <$> const[IV 42] s
# 5  <#> gvsv[*a] s

Errors are reported 3 different ways:

The 1st form is directly from test.pl's like() and unlike(). Note that this form is used as input, so you can easily cut-paste results into test-files you are developing. Just make sure you recognize insane results, to avoid canonizing them as golden samples.

The 2nd and 3rd forms show only the unexpected results and opcodes. This is done because it's blindingly tedious to find a single opcode causing the failure. 2 different ways are done in case one is unhelpful.
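The 'these lines not matched' form can be pictured with a minimal sketch like the one below. It is not the module's actual code: the function name unmatched_lines is hypothetical, and it assumes the caller already holds one regex per expected line.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines of $got that no per-line regex
# accounts for.  Each regex may consume at most one line of the rendering.
sub unmatched_lines {
    my ($got, @line_rex) = @_;
    my @remainder = split /\n/, $got;
    for my $rex (@line_rex) {
        for my $i (0 .. $#remainder) {
            next unless $remainder[$i] =~ /^$rex$/;
            splice @remainder, $i, 1;   # this line is accounted for
            last;                       # move on to the next regex
        }
    }
    return @remainder;
}

my $got_sample = "1  <;> nextstate(main 600 optree_check.t:208) v\n"
               . "2  <#> gvsv[*b] s\n";
my @leftover = unmatched_lines(
    $got_sample,
    qr/1  <;> nextstate\(.*?\) v/,
    qr/2  <\$> gvsv\(\*b\) s/,
);
print "# these lines not matched:\n", map { "# $_\n" } @leftover;
```

Only the lines that caused the mismatch survive, which is the point of the shorter report forms.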

TestCase Overview

checkOptree(%tc) constructs a testcase object from %tc, and then calls methods which eventually call test.pl's like() to produce test results.

getRendering

getRendering() runs code or prog through B::Concise, and captures its rendering. Errors emitted during rendering are checked against expected errors, and are reported as diagnostics by default, or as failures if 'report=fail' cmdline-option is given.

prog is run in a sub-shell, with $bcopts passed through. This is the way to run code intended for main. The code arg in contrast, is always a CODEREF, either because it starts that way as an arg, or because it's wrapped and eval'd as $sub = sub {$code};

mkCheckRex

mkCheckRex() selects the golden-sample for the threaded-ness of the platform, and produces a regex which matches the expected rendering; the test fails when the actual rendering doesn't match it.

The regex includes 'workarounds' which accommodate expected rendering variations. These include:

string constants		# avoid injection
line numbers, etc		# args of nextstate()
hexadecimal-numbers

pad-slot-assignments		# for 5.8 compat, and testmode=cross
(map|grep)(start|while)	# for 5.8 compat
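As a rough illustration of these workarounds, the sketch below turns a golden sample into a regex by quoting every line literally and then relaxing three of the variable parts (nextstate() arguments, pad-slot numbers, hexadecimal numbers). The name sketch_check_rex and the exact substitutions are assumptions made for this example; the real mkCheckRex() does more and may do it differently.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for mkCheckRex().
sub sketch_check_rex {
    my ($sample, %opt) = @_;
    my @rex;
    for my $line (split /\n/, $sample) {
        $line =~ s/^#\s?//;            # drop the '# ' prefix of the sample
        my $rex = quotemeta $line;     # start out fully literal
        # args of nextstate(): package, sequence and line numbers vary
        $rex =~ s/nextstate\\\(.*?\\\)/'nextstate\(.*?\)'/e;
        # pad-slot assignments such as [t3]
        $rex =~ s/\\\[t\d+\\\]/'\[t\d+\]'/ge;
        # hexadecimal numbers
        $rex =~ s/0x[0-9A-Fa-f]+/'0x[0-9A-Fa-f]+'/ge;
        push @rex, $rex;
    }
    my $body = join '\n', @rex;        # '\n' is a regex newline here
    return $opt{noanchors} ? qr/$body/ : qr/\A$body\n?\z/;
}

my $sample_text = "# 1  <;> nextstate(main 45 optree.t:23) v\n"
                . "# 2  <2> add[t3] sK/2\n";
my $check_rex = sketch_check_rex($sample_text);
print "regex: $check_rex\n";
```

Everything not explicitly relaxed stays literal, so a changed opcode flag still causes a mismatch.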

mylike

mylike() calls either unlike() or like(), depending on expectations. Mismatch reports are massaged, because the actual difference can easily be lost in the forest of opcodes.

checkOptree API and Operation

Since the arg is a hash, the api is wide-open, and this really is about what elements must be or are in the hash, and what they do. %tc is passed to newTestCase(), the ctor, which adds in %proto, a global prototype object.

name => STRING

If name property is not provided, it is synthesized from these params: bcopts, note, prog, code. This is more convenient than trying to do it manually.

code or prog

Either code or prog must be present.

prog => $perl_source_string

prog => $src provides a snippet of code, which is run in a sub-process, via test.pl:runperl, and through B::Concise like so:

'./perl -w -MO=Concise,$bcopts_massaged -e $src'
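The subprocess route can be approximated outside the core harness as sketched below. This is an assumption-laden stand-in: it uses a plain pipe open on the running perl ($^X) instead of test.pl's runperl, the name render_prog is hypothetical, and joining bcopts with commas is only a guess at what 'massaged' means. Warnings go to STDERR and are not captured here.

```perl
use strict;
use warnings;

# Hypothetical helper: render $src via 'perl -w -MO=Concise,<opts> -e $src'.
sub render_prog {
    my ($src, $bcopts) = @_;
    # bcopts may be undef, a single string, or an array of strings
    my @bcopts = !defined $bcopts       ? ()
               : ref $bcopts eq 'ARRAY' ? @$bcopts
               :                          ($bcopts);
    my $backend = join ',', 'Concise', @bcopts;
    my @cmd = ($^X, '-w', "-MO=$backend", '-e', $src);
    open my $pipe, '-|', @cmd or die "cannot run @cmd: $!";
    local $/;                           # slurp the whole rendering
    my $render = <$pipe>;
    close $pipe;
    return $render;
}

my $prog_render = render_prog('sort @a', '-exec');
print $prog_render;
```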

code => $perl_source_string || CODEREF

The $code arg is passed to B::Concise::compile(), and run in-process. If $code is a string, it's first wrapped and eval'd into a $coderef. In either case, $coderef is then passed to B::Concise::compile():

$subref = eval "sub{$code}";
$render = B::Concise::compile($subref)->();
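To capture the rendering as a string rather than printing it, B::Concise's walk_output() can be pointed at a scalar first. The following is a minimal sketch of that idea; render_optree is a hypothetical name, not part of OptreeCheck.

```perl
use strict;
use warnings;
use B::Concise ();

# Hypothetical helper: render the optree of a coderef (or of a source
# string wrapped into one) and return the rendering as a string.
sub render_optree {
    my ($code, @bcopts) = @_;

    # A source string is wrapped and eval'd into a coderef first.
    my $subref = ref $code eq 'CODE' ? $code : eval "sub { $code }";
    die "could not compile code: $@" unless $subref;

    my $buf = '';
    B::Concise::walk_output(\$buf);              # send rendering to $buf
    B::Concise::compile(@bcopts, $subref)->();   # build the walker, run it
    return $buf;
}

my $render = render_optree('my $a', '-exec');
print $render;
```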

expect and expect_nt

expect and expect_nt args are the golden-sample renderings, and are sampled from known-ok threaded and un-threaded bleadperl (5.9.1) builds. They're both required, and the correct one is selected for the platform being tested, and saved into the synthesized property wanted.
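The selection step might look roughly like this. It is a sketch under assumptions: pick_wanted is a hypothetical name, and threaded-ness is read from $Config{useithreads}, which may not be exactly how the module decides.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: choose the golden sample for this build and
# store it in the 'wanted' property of the testcase hash.
sub pick_wanted {
    my ($tc, $testmode) = @_;
    $testmode ||= 'native';

    for my $key (qw(expect expect_nt)) {
        die "testcase lacks required '$key' sample\n"
            unless defined $tc->{$key};
    }

    my $threaded = $Config{useithreads} ? 1 : 0;
    # cross mode purposely uses the sample for the opposite build
    $threaded = !$threaded if $testmode eq 'cross';

    $tc->{wanted} = $threaded ? $tc->{expect} : $tc->{expect_nt};
    return $tc->{wanted};
}

my %tc_demo = (expect => "threaded sample\n", expect_nt => "non-threaded sample\n");
print pick_wanted(\%tc_demo);
```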

bcopts => $bcopts || [ @bcopts ]

When getRendering() runs, it passes bcopts into B::Concise::compile(). The bcopts arg can be a single string, or an array of strings.

errs => $err_str_regex || [ @err_str_regexs ]

getRendering() processes the code or prog arg under warnings, and both parsing and optree-traversal errors are collected. These are validated against the one or more errors you specify.
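One way to picture the validation is a helper that returns every emitted error not covered by the errs argument, as in the sketch below. The name unexpected_errs is hypothetical, and treating plain strings as regexes is an assumption based on the '.*' in the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the emitted errors that match none of the
# expected patterns.  $expected is undef, a string/regex, or an arrayref.
sub unexpected_errs {
    my ($expected, @emitted) = @_;
    my @want = !defined $expected       ? ()
             : ref $expected eq 'ARRAY' ? @$expected
             :                            ($expected);
    my @surprises;
    for my $err (@emitted) {
        # strings and qr// objects are both usable as patterns here
        my $ok = grep { $err =~ /$_/ } @want;
        push @surprises, $err unless $ok;
    }
    return @surprises;
}

my @found = ('Useless variable "@main::a" somewhere', 'something else');
my @extra = unexpected_errs('Useless variable "@main::a" .*', @found);
print "unexpected: $_\n" for @extra;
```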

testcase modifier properties

These properties are set as %tc parameters to change test behavior.

skip => 'reason'

invokes skip('reason'), causing test to skip.

todo => 'reason'

invokes todo('reason')

fail => 1

For code arguments, this option causes getRendering to redirect the rendering operation to STDERR, which causes the regex match to fail.

retry => 1

If retry is set, and a test fails, it is run a second time, possibly with regex debug.

debug => 1

If a failure is retried, this turns on eval "use re 'debug'", thus turning on regex debug. It's quite verbose, and not hugely helpful.

noanchors => 1

If set, this relaxes the regex check, which is normally pretty strict. It's used primarily to validate checkOptree via tests in optree_check.

Synthesized object properties

These properties are added into the test object during execution.

wanted

This stores the chosen expect or expect_nt string. The OptreeCheck object may in the future delete the raw strings once wanted is set, thus saving space.

cross => 1

This tag is added if testmode=cross is passed in as argument. It causes test-harness to purposely use the wrong string.

checkErrs

checkErrs() is a getRendering helper that verifies the expected errs against those found when rendering the code on the platform. It is run after rendering, and before mkCheckRex.

Errors can be reported 3 different ways: diag, fail, print.

diag  - uses test.pl _diag()
fail  - causes double-testing
print - no # in front of the output (may mess up test harnesses)

The 3 ways are selectable at runtime via cmdline-arg: report={diag,fail,print}.

mkCheckRex ($tc)

It selects the correct golden-sample from the test-case object, and converts it into a Regexp which should match against the original golden-sample (used in selftest, see below), and on the renderings obtained by applying the code on the perl being tested.

The selection is driven by platform mostly, but also by test-mode, which rather complicates the code. This is worsened by the potential need to make platform specific conversions on the reftext.

The regex accommodates these conversions, but is otherwise as strict as possible. For example, it should *not* match when opcode flags change, or when optimizations convert an op to an ex-op.

match criteria

The selected golden-sample is massaged to eliminate various match irrelevancies. This is done so that the tests don't fail just because you added a line to the top of the test file. (Recall that the renderings contain the program's line numbers). Similar cleanups are done on "strings", hex-constants, etc.

The need to massage is reflected in the 2 golden-sample approach of the test-cases; we want the match to be as rigorous as possible, and that's easier to achieve when matching against 1 input than 2.

Opcode arguments (text within parentheses or brackets) are disregarded for matching purposes. This loses some info in 'add[t5]', but greatly simplifies matching 'nextstate(main 22 (eval 10):1)'. Besides, we are testing for regressions, not for complete accuracy.

The regex is anchored by default, but anchoring can be suppressed with 'noanchors', allowing 1-liner tests to succeed if the opcode is found.

Global modes

Unusually, this module also processes @ARGV for command-line arguments which set global modes. These 'options' change the way the tests run, essentially reusing the tests for different purposes.

Additionally, there's an experimental control-arg interface (i.e. subject to change) which allows the user to set global modes.

Testing Method

At first, OptreeCheck used one reference-text, but the differences between Threaded and Non-threaded renderings meant that a single reference (sampled from say, threaded) would be tricky and iterative to convert for testing on a non-threaded build. Worse, this conflicts with making tests both strict and precise.

We now use 2 reference texts; the right one is chosen based upon the build's threaded-ness. This has several benefits:

1. native reference data allows closer/easier matching by regex.
2. samples can be eyeballed to grok T-nT differences.
3. data can help to validate mkCheckRex() operation.
4. can develop regexes which accommodate T-nT differences.
5. can test with both native and cross-converted regexes.

Cross-testing (expect_nt on threaded, expect on non-threaded) exposes differences in B::Concise output, so mkCheckRex has code to do some cross-test manipulations. This area needs more work.

Test Modes

One consequence of a single-function API is difficulty controlling test-mode. I've chosen for now to use a package hash, %gOpts, to store test-state. These properties alter the checkOptree() function, either short-circuiting to selftest, or running a loop that runs the testcase 2^N times, varying conditions each time. (current N is 2 only).

So Test-mode is controlled with cmdline args, also called options below. Run with 'help' to see the test-state, and how to change it.
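A minimal sketch of how such option=value arguments could be folded into a mode hash follows. The function name parse_mode_args, the set of keys, and their defaults are assumptions for illustration; the real %gOpts may hold other keys and be filled differently.

```perl
use strict;
use warnings;

# Assumed defaults, loosely based on the options named in this document.
my %mode_defaults = (testmode => 'native', report => 'diag', selftest => 0);

# Hypothetical helper: merge 'option=value' and bare 'option' arguments
# over the defaults and return the resulting hash.
sub parse_mode_args {
    my @args = @_;
    my %opts = %mode_defaults;
    for my $arg (@args) {
        my ($key, $val) = split /=/, $arg, 2;
        $val = 1 unless defined $val;        # a bare word acts as a flag
        if (exists $opts{$key}) {
            $opts{$key} = $val;
        }
        else {
            warn "unknown option '$key' ignored\n";
        }
    }
    return %opts;
}

my %gOpts = parse_mode_args(@ARGV);
print "$_ => $gOpts{$_}\n" for sort keys %gOpts;
```

With this shape, running a test file with 'testmode=cross' after the file name would flip only that one key and leave the other defaults alone.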

selftest

This argument invokes runSelftest(), which tests each regex against the reference rendering it was made from. Failure of a regex to match its 'mold' is a strong indicator that mkCheckRex is buggy.

That said, selftest mode currently runs a cross-test too; the two are not completely orthogonal yet. See below.

testmode=cross

Cross-testing purposely creates a T-NT mismatch and looks at the fallout, which helps to understand the T-NT differences.

The tweaking appears contrary to the 2-refs philosophy, but the tweaks will be made in conversion-specific code, which will handle T->NT and NT->T separately. The tweaking is incomplete.

A reasonable 1st step is to add tags to indicate when TonNT or NTonT is known to fail. This needs an option to force failure, so the test.pl reporting mechanics show results to aid the user.

testmode=native

This is the normal mode. The valid values for testmode are: native, cross, both.

checkOptree Notes

Accepts test code, renders its optree using B::Concise, and matches that rendering against a regex built from one of the 2 reference renderings in the %tc data.

The regex is built by mkCheckRex(\%tc), which scrubs %tc data to remove match-irrelevancies, such as (args) and [args]. For example, it strips leading '# ', making it easy to cut-paste new tests into your test-file, run it, and cut-paste actual results into place. You then retest and reedit until all 'errors' are gone. (now make sure you haven't 'enshrined' a bug).

name: The test name. May be augmented by a label, which is built from important params, and which helps keep names in sync with what's being tested.

TEST DEVELOPMENT SUPPORT

This optree regression testing framework needs tests in order to find bugs. To that end, OptreeCheck has support for developing new tests, according to the following model:

1. write a set of sample code into a single file, one per
   paragraph.  Add <=for gentest> blocks if you care to, or just look at
   f_map and f_sort in ext/B/t/ for examples.

2. run OptreeCheck as a program on the file

  ./perl -Ilib ext/B/t/OptreeCheck.pm -w ext/B/t/f_map
  ./perl -Ilib ext/B/t/OptreeCheck.pm -w ext/B/t/f_sort

  gentest reads the sample code, runs each to generate a reference
  rendering, folds this rendering into a checkOptree() statement,
  and prints it to stdout.

3. run the output file as above, redirect to files, then rerun on
   same build (for sanity check), and on thread-opposite build.  With
   editor in 1 window, and cmd in other, it's fairly easy to cut-paste
   the gots into the expects, easier than running step 2 on both
   builds then trying to sdiff them together.

CAVEATS

This code is purely for testing core. While checkOptree feels flexible enough to be stable, the whole selftest framework is subject to change w/o notice.