NAME

Template::Benchmark - Pluggable benchmarker to cross-compare template systems.

SYNOPSIS

use Template::Benchmark;

my $bench = Template::Benchmark->new(
    duration            => 5,
    template_repeats    => 1,
    array_loop_value    => 1,
    shared_memory_cache => 0,
    );

my $result = $bench->benchmark();

if( $result->{ result } eq 'SUCCESS' )
{
    ...
}

DESCRIPTION

Template::Benchmark provides a pluggable framework for cross-comparing performance of various template engines across a range of supported features for each, grouped by caching methodology.

If that's a bit of a mouthful... have you ever wanted to find out the relative performance of template modules that support expression parsing when running with a shared memory cache? Do you even know which ones allow you to do that? This module lets you find that sort of thing out.

If you're just after results, you should probably start with the benchmark_template_engines script: it provides a command-line UI onto Template::Benchmark and gives you human-readable reports as a reply rather than a raw hashref. It also supports JSON output if you want to dump the report somewhere in a machine-readable format.

IMPORTANT CONCEPTS AND TERMINOLOGY

Template Engines

Template::Benchmark is built around a plugin structure using Module::Pluggable; it will look under Template::Benchmark::Engines::* for template engine plugins.

Each of these plugins provides an interface to a different template engine such as Template::Toolkit, HTML::Template, Template::Sandbox and so on.

Benchmark Types

Benchmark types refer to the environment a benchmark runs in. There are currently five benchmark types: uncached_string, disk_cache, shared_memory_cache, memory_cache and instance_reuse.

For a full list, and for an explanation of what they represent, consult the Template::Benchmark::Engine documentation.

Template Features

Template features are a list of features supported by the various template engines. Not all features are implemented by all engines, although there is a core set supported by every engine.

Features can be things like literal_text, records_loop, scalar_variable, variable_expression and so forth.

For a full list, and for an explanation of what they represent, consult the Template::Benchmark::Engine documentation.

Benchmark Functions

Each template engine plugin provides the means to produce a benchmark function for each benchmark type.

The benchmark function is an anonymous sub that is passed the template and two hashrefs of template variables, and is expected to return the output of the processed template.

These are the functions that will be benchmarked, and generally consist (depending on the template engine) of a call to the template constructor and template processing functions.
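
As an illustration, a benchmark function might look something like this (a minimal sketch; My::Template::Engine and its methods are placeholders for whatever API the real engine provides):

my $benchmark_function = sub
{
    my ( $template, $vars_one, $vars_two ) = @_;

    #  Typically a constructor call followed by a processing call,
    #  merging the two hashrefs of template variables.
    my $engine = My::Template::Engine->new();
    return $engine->process( $template, { %{$vars_one}, %{$vars_two} } );
};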

Each plugin can return several benchmark functions for a given benchmark type, so each is given a tag to use as a name and a description for display. This allows plugins like Template::Benchmark::Engines::TemplateToolkit to contain benchmarks for Template::Toolkit, Template::Toolkit running with Template::Stash::XS, and various other options.

Each of these will run as an independent benchmark even though they're provided by the same plugin.

Supported or Unsupported?

Throughout this document are references to whether a template feature or benchmark type is supported or unsupported in the template engine.

But what constitutes "unsupported"?

It doesn't necessarily mean that it's impossible to perform that task with the given template engine, but generally, if it requires some significant chunk of DIY code, boilerplate or subclassing by the developer using the template engine, it should be considered unsupported by the template engine itself.

This is of course a subjective judgement, but a general rule of thumb is: if you can tell the template engine to do it, it's supported; if the template engine merely allows you to do it yourself, it's unsupported, even though it's possible.

HOW Template::Benchmark WORKS

Construction

When a new Template::Benchmark object is constructed, it attempts to load all template engine plugins it finds.

It then asks each plugin for a snippet of template implementing each requested template feature. If a plugin provides no snippet, that feature is assumed to be unsupported by that engine.

Each engine's snippets are then combined into a benchmark template for that specific template engine and written to a temporary directory; at the same time a cache directory is set up for that engine. These temporary directories are cleaned up in the DESTROY() of the benchmark instance, usually when you let it go out of scope.

Finally, each engine is asked to provide a list of benchmark functions for each benchmark type along with a name and description explaining what the benchmark function is doing.

At this point the Template::Benchmark constructor exits, and you're ready to run the benchmarks.

Running the benchmarks

When the calling program is ready to run the benchmarks it calls $bench->benchmark() and then twiddles its thumbs, probably for a long time.

While this twiddling is going on, Template::Benchmark is busy running each of the benchmark functions a single time.

The outputs of this initial run are compared, and if there are any mismatches then $bench->benchmark() exits early with a result structure detailing the errors, as compared against a reference copy produced by the reference plugin engine.

An important side-effect of this initial run is that the cache for each benchmark function becomes populated, so that the cached benchmark types truly reflect only cached performance and not the cost of an initial cache miss.

If all the outputs match then the benchmark functions for each benchmark type are handed off to the Benchmark module for benchmarking.

The results of the benchmarks are bundled together and placed into the results structure that is returned from $bench->benchmark().

OPTIONS

New Template::Benchmark objects are created with the constructor Template::Benchmark->new( %options ), using any (or none) of the options below.

uncached_string => 0 | 1 (default 1)
disk_cache => 0 | 1 (default 1)
shared_memory_cache => 0 | 1 (default 1)
memory_cache => 0 | 1 (default 1)
instance_reuse => 0 | 1 (default 1)

Each of these options enables (if set to a true value) or disables (if set to a false value) the corresponding benchmark type. At least one of them must be set to a true value for any benchmarks to be run.

literal_text => 0 | 1 (default 1)
scalar_variable => 0 | 1 (default 1)
hash_variable_value => 0 | 1 (default 0)
array_variable_value => 0 | 1 (default 0)
deep_data_structure_value => 0 | 1 (default 0)
array_loop_value => 0 | 1 (default 0)
hash_loop_value => 0 | 1 (default 0)
records_loop_value => 0 | 1 (default 1)
array_loop_template => 0 | 1 (default 0)
hash_loop_template => 0 | 1 (default 0)
records_loop_template => 0 | 1 (default 1)
constant_if_literal => 0 | 1 (default 0)
variable_if_literal => 0 | 1 (default 1)
constant_if_else_literal => 0 | 1 (default 0)
variable_if_else_literal => 0 | 1 (default 1)
constant_if_template => 0 | 1 (default 0)
variable_if_template => 0 | 1 (default 1)
constant_if_else_template => 0 | 1 (default 0)
variable_if_else_template => 0 | 1 (default 1)
constant_expression => 0 | 1 (default 0)
variable_expression => 0 | 1 (default 0)
complex_variable_expression => 0 | 1 (default 0)
constant_function => 0 | 1 (default 0)
variable_function => 0 | 1 (default 0)

Each of these options sets the corresponding template feature on or off. At least one of these must be true for any benchmarks to run.
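
For example, to compare expression-parsing performance using only the memory cache, you might build the benchmark like this (the option values are illustrative):

my $bench = Template::Benchmark->new(
    #  Run only the memory_cache benchmark type.
    uncached_string             => 0,
    disk_cache                  => 0,
    shared_memory_cache         => 0,
    memory_cache                => 1,
    instance_reuse              => 0,
    #  Enable the expression features on top of the defaults.
    variable_expression         => 1,
    complex_variable_expression => 1,
    );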

template_repeats => $number (default 30)

After the template is constructed from the various feature snippets, it gets repeated a number of times to make it longer; this option controls how many times the basic template gets repeated to form the final template.
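
Conceptually the construction looks something like this (a simplified sketch, not the actual internals; @feature_snippets and $template_repeats are illustrative variables):

my $basic_template = join( '', @feature_snippets );
my $final_template = $basic_template x $template_repeats;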

The default of 30 is chosen to provide some form of approximation of the workload in a "normal" web page. Given that "how long is a web page?" has much the same answer as "how long is a piece of string?" you will probably want to tweak the number of repeats to suit your own needs.

duration => $seconds (default 10)

This option determines how many CPU seconds should be spent running each benchmark function. It is passed along to Benchmark as a negative duration, so read the Benchmark documentation if you want the gory details.
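
Under the hood this amounts to a Benchmark call along these lines (a sketch; the tags and subs are stand-ins for real benchmark functions):

use Benchmark qw/timethese/;

#  A negative count tells Benchmark to run each sub for at least
#  that many CPU seconds rather than a fixed number of iterations.
my $timings = timethese( -10,
    {
        TS => sub { 1 },    #  stand-in benchmark function
        TT => sub { 1 },    #  stand-in benchmark function
    } );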

The larger the number, the less statistical variance you'll get and the less likely you are to have temporary blips of the test machine's I/O or CPU skewing the results; the downside is that your benchmarks will take correspondingly longer to run.

The default of 10 seconds seems to give pretty consistent results for me within +/-1% on a very lightly loaded linux machine.

style => $string (default 'none')

This option is passed straight through as the style argument to Benchmark. By default it is 'none', so that no output is printed by Benchmark; this also means that you can't see any results until all the benchmarks are done. If you set it to 'auto' then you'll see the benchmark results as they happen, but Template::Benchmark will have no control over the generated output.

Might be handy for debugging or if you're impatient and don't want pretty reports.

See the Benchmark documentation for valid values for this setting.

keep_tmp_dirs => 0 | 1 (default 0)

If set to a true value then the temporary directories created for template files and caches will not be deleted when the Template::Benchmark instance is destroyed. Instead, at the point when they would have been deleted, their location will be printed.

This allows you to inspect the directory contents to see the generated templates and caches and so forth.

Because the location is printed, and at an unpredictable time, it may mess up your program output, so this option is probably only useful while debugging.

PUBLIC METHODS

$benchmark = Template::Benchmark->new( %options )

This is the constructor for Template::Benchmark; it will return a newly constructed benchmark object, or throw an exception explaining why it couldn't.

The options you can pass in are covered in the "OPTIONS" section above.

$result = $benchmark->benchmark()

Run the benchmarks as set up by the constructor. You can run $benchmark->benchmark() multiple times if you wish to reuse the same benchmark options.

The structure of the $result hashref is covered in "BENCHMARK RESULTS" below.

%defaults = Template::Benchmark->default_options()

Returns a hash of the valid options to the constructor and their default values. This can be used to keep external programs up-to-date with what options are available, in case new ones are added or the defaults are changed. In fact, this is what benchmark_template_engines does.

@benchmark_types = Template::Benchmark->valid_benchmark_types()

Returns a list of the valid benchmark types. This can be used to keep external programs up-to-date with what benchmark types are available, in case new ones are added. In fact, this is what benchmark_template_engines does.

@features = Template::Benchmark->valid_features()

Returns a list of the valid template features. This can be used to keep external programs up-to-date with what template features are available, in case new ones are added. In fact, this is what benchmark_template_engines does.
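
For example, an external program could stay in sync with the module like so (a minimal sketch):

use Template::Benchmark;

my %defaults = Template::Benchmark->default_options();
print "Option $_ defaults to '$defaults{ $_ }'\n"
    for sort keys %defaults;

print "Benchmark type: $_\n"
    for Template::Benchmark->valid_benchmark_types();
print "Template feature: $_\n"
    for Template::Benchmark->valid_features();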

$errors = $benchmark->engine_errors()

Returns a hashref mapping each engine plugin to an arrayref of error messages encountered while trying to enable the given plugin for a benchmark.

This may be errors in loading the module or a list of template features the engine didn't support.
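
For example, to report why engines were skipped (a minimal sketch):

my $errors = $benchmark->engine_errors();

foreach my $engine ( sort keys %{$errors} )
{
    print "$engine: $_\n" for @{$errors->{ $engine }};
}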

$number = $benchmark->number_of_benchmarks()

Returns a count of how many benchmark functions will be run.

$seconds = $benchmark->estimate_benchmark_duration()

Returns an estimate, in seconds, of how long it will take to run all the benchmarks.

This estimate currently isn't a very good one: it's basically the duration multiplied by the number of benchmark functions, and it doesn't account for factors like the overhead of running the benchmarks, the fact that the duration is a minimum duration, or the initial run of the benchmark functions to build the cache and compare outputs.

It still gives a good lower bound for how long the benchmark will run, and maybe I'll improve it in future releases.
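
Together with number_of_benchmarks(), this lets you warn the user before a long run (a minimal sketch):

printf "Running %d benchmarks, estimated to take at least %d seconds.\n",
    $benchmark->number_of_benchmarks(),
    $benchmark->estimate_benchmark_duration();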

@engines = $benchmark->engines()

Returns a list of all template engine plugins that were successfully loaded.

Note that this does not mean that all those template engines support all requested template features, it merely means there wasn't a problem loading their module.

@features = $benchmark->features()

Returns a list of all template features that were enabled during construction of the Template::Benchmark object.
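
A quick way to see what was loaded and what will be exercised (a minimal sketch):

print "Engine plugin: $_\n"   for $benchmark->engines();
print "Enabled feature: $_\n" for $benchmark->features();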

BENCHMARK RESULTS

The $benchmark->benchmark() method returns a results hashref; this section documents the structure of that hashref.

Firstly, all results returned have a result key indicating the type of result; this defines the format of the rest of the hashref, and whether the benchmark run was a success or why it failed.

SUCCESS

This indicates that the benchmark run completed successfully; the following additional information will be present:

{
    result       => 'SUCCESS',
    start_time   => 1265738228,
    title        => 'Template Benchmark @Tue Feb  9 17:57:08 2010',
    descriptions =>
        {
           'HT'    =>
              'HTML::Template (2.9)',
           'TS_CF' =>
              'Template::Sandbox (1.02) with Cache::CacheFactory (1.09) caching',
        },
    reference    =>
        {
            type   => 'uncached_string',
            tag    => 'TS',
            output => template output,
        },
    benchmarks   =>
        [
            {
               type       => 'uncached_string',
               timings    => Benchmark::timethese() results,
               comparison => Benchmark::cmpthese() results,
            },
            {
               type       => 'memory_cache',
               timings    => Benchmark::timethese() results,
               comparison => Benchmark::cmpthese() results,
            },
            ...
        ],
}

NO BENCHMARKS TO RUN

This is returned when the options you've set leave no benchmark functions to run; no further information is included:

{
    result       => 'NO BENCHMARKS TO RUN',
}

MISMATCHED TEMPLATE OUTPUT

This is returned when the initial run of the benchmark functions produced output that didn't match the reference copy; the mismatching outputs are listed as failures:

{
    result    => 'MISMATCHED TEMPLATE OUTPUT',
    reference =>
        {
            type   => 'uncached_string',
            tag    => 'TS',
            output => template output,
        },
    failures =>
        [
            {
                type   => 'disk_cache',
                tag    => 'TT',
                output => template output,
            },
            ...
        ],
}
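
A typical way to consume this structure is to branch on the result key and, on success, walk the benchmarks arrayref (a minimal sketch; see the Benchmark documentation for the timings and comparison structures):

my $result = $bench->benchmark();

if( $result->{ result } eq 'SUCCESS' )
{
    foreach my $benchmark ( @{$result->{ benchmarks }} )
    {
        print "Benchmark type: $benchmark->{ type }\n";
        #  $benchmark->{ timings } and $benchmark->{ comparison } hold
        #  the raw Benchmark::timethese() and Benchmark::cmpthese()
        #  structures for this benchmark type.
    }
}
else
{
    print "Benchmark run failed: $result->{ result }\n";
}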

WRITING YOUR OWN TEMPLATE ENGINE PLUGINS

All template engine plugins reside in the Template::Benchmark::Engines namespace and inherit from the Template::Benchmark::Engine class.

See the Template::Benchmark::Engine documentation for details on writing your own plugins.

UNDERSTANDING THE RESULTS

This section aims to give you a few pointers for analyzing the results of a benchmark run. Some points are obvious, some less so, and most need to be applied with some degree of intelligence to know when they're applicable.

Hopefully they'll prove useful.

If you're wondering what all the numbers mean, the documentation for Benchmark will probably be more helpful.

memory_cache vs instance_reuse

Comparing the memory_cache and instance_reuse times for an engine should generally give you some idea of the overhead of the caching system used by the engine: if the times are close then it's using a good caching system; if the times are wildly divergent then you might want to implement your own cache instead.

uncached_string vs instance_reuse or memory_cache

Comparing the uncached_string time against the instance_reuse or memory_cache time (instance_reuse is the better comparison if available) should give you an indication of how costly the parse and compile phase of a template engine is.

uncached_string represents a cache miss

The uncached_string benchmark represents a cache miss, so comparing it to the caching system you intend to use will give you an idea of how much performance suffers whenever a cache miss occurs.

If you know how likely a cache miss is to happen, you can combine the results of the two benchmarks proportionally to get a better estimate of performance, and maybe compare that between different engines.
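
For example, a crude weighted estimate might look like this (a sketch with made-up numbers: 2 runs/sec on a miss, 200 runs/sec on a hit, and a 1% miss rate):

my ( $miss_rate, $hit_rate, $p_miss ) = ( 2, 200, 0.01 );

#  The average time per run is the weighted sum of the per-run times,
#  so the effective rate is its reciprocal: roughly 100 runs/sec here.
my $effective_rate =
    1 / ( ( $p_miss / $miss_rate ) + ( ( 1 - $p_miss ) / $hit_rate ) );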

Estimating cache misses is a tricky art though: misses can be mitigated by a number of measures, or complicated by miss stampedes and so forth, so don't put too much weight on this either.

Increasing repeats emphasises template performance

Increasing the length of the template by increasing the template_repeats option usually places emphasis on the ability of the template engine to process the template vs the overhead of reading the template, fetching it from the cache, placing the variables into the template namespace and so forth.

For the most part those overheads are fixed cost regardless of length of the template (fetching from disk or cache will have a, usually small, linear component), whereas actually executing the template will have a linear cost based on the repeats.

This means that for small values of repeats you're spending proportionally more time on overheads, and for large values of repeats you're spending more time on running the template.

If a template engine has higher-than-average overheads, it will be favoured in the results (ie, it will rank higher than otherwise) if you run with a high template_repeats value, and will be hurt in the results if you run with a low template_repeats value.

Inverting that conclusion: if an engine moves up in the results when you run with long repeats, or moves down in the results when you run with short repeats, it follows that the engine probably has high overheads in I/O, instantiation, variable import or somewhere similar.

deep_data_structure_value and complex_variable_expression are stress tests

Both the deep_data_structure_value and complex_variable_expression template features are designed to be stress test versions of a more basic feature.

By comparing deep_data_structure_value vs hash_variable_value you should be able to glean an indication of how well the template engine performs at navigating its way through its variable stash (to borrow Template::Toolkit terminology).

If an engine gains ranks moving from hash_variable_value to deep_data_structure_value then you know it has a more-efficient-than-average implementation of its stash; if it loses ranks then you know it has a less-efficient-than-average implementation.

Similarly, by comparing complex_variable_expression and variable_expression you can draw conclusions about the template engine's expression execution speed.

constant vs variable features

Several template features have constant and variable versions; these indicate a version that is designed to be easily optimizable (the constant one) and a version that cannot be optimized (the variable one).

By comparing timings for the two versions, you can get a feel for whether (and how much) constant-folding optimization is done by a template engine.

Whether this is of interest to you depends entirely on how you construct and design your templates, but generally speaking, the larger and more modular your template structure is, the more likely you are to have bits of constant values "inherited" from parent templates (or config files) that could be optimized in this manner.

This is one of those cases where only you can judge whether it is applicable to your situation or not; Template::Benchmark merely provides the information so you can make that judgement.

duration only affects accuracy

The benchmarks are carefully designed so that any one-off costs from setting up the benchmark are not included in the benchmark results themselves.

This means that there should be no change in the results from increasing or decreasing the benchmark duration, except to reduce the size of the error resulting from background load on the machine.

If a template engine gets consistently better (or worse) results as duration is changed, while other template engines are unchanged (give or take statistical error), it indicates that something is wrong with either the template engine, the plugin or something else - either way the results of the benchmark should be regarded as suspect until the cause has been isolated.

KNOWN ISSUES AND BUGS

Test suite is non-existent

The current test suite is laughable and basically only tests documentation coverage.

Once I figure out what to test and how to do it, this should change, but at the moment I'm drawing a blank.

'Benchmark Types' is a lousy term

Benchmark types is a confusingly named concept at best, but I've not yet been able to think of something that's a better fit.

Maybe 'runtime environment' would be more descriptive.

This needs to be sorted out for v1.00.

Results structure too terse

The results structure could probably do with more information, such as which options were set and which versions of Template::Benchmark and its plugins were used.

This would be helpful for anything wishing to archive benchmark results, since it may (will!) influence how comparable results are.

AUTHOR

Sam Graham, <libtemplate-benchmark-perl at illusori.co.uk>

BUGS

Please report any bugs or feature requests to bug-template-benchmark at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Template-Benchmark. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc Template::Benchmark

ACKNOWLEDGEMENTS

Thanks to Paul Seamons for creating the bench_various_templaters.pl script distributed with Template::Alloy, which was the ultimate inspiration for this module.

COPYRIGHT & LICENSE

Copyright 2010 Sam Graham.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.