NAME

Devel::StatProfiler - low-overhead sampling code profiler

VERSION

version 0.53

SYNOPSIS

# profile (needs multiple runs, with representative data/distribution!)
perl -MDevel::StatProfiler foo.pl input1.txt
perl -MDevel::StatProfiler foo.pl input2.txt
perl -MDevel::StatProfiler foo.pl input3.txt
perl -MDevel::StatProfiler foo.pl input1.txt

# prepare a report from profile data
statprofilehtml

DESCRIPTION

Devel::StatProfiler is a sampling (or statistical) code profiler.

Rather than measuring the exact time spent in a statement (or subroutine), the profiler interrupts the program at fixed intervals (10 milliseconds by default) and takes a stack trace. Given a sufficient number of samples, this provides a good indication of where the program is spending time, at a relatively low overhead (around 3-5% increased runtime).
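As a rough illustration of why sample counts approximate time spent, the following sketch samples a made-up timeline every 10 milliseconds. The timeline, the sub names (parse, compute, report) and the durations are all hypothetical; this is not how the profiler is implemented, only the statistical idea behind it.

```perl
use strict;
use warnings;

# Hypothetical timeline: which sub is "running" during each millisecond.
my @timeline = (('parse') x 300, ('compute') x 600, ('report') x 100);

# Take one sample every 10 ms, matching the default interval described above.
my $interval_ms = 10;
my %samples;
for (my $t = 0; $t < @timeline; $t += $interval_ms) {
    $samples{ $timeline[$t] }++;
}

# Convert sample counts into an estimated share of the total runtime.
my $total = 0;
$total += $_ for values %samples;
for my $name (sort keys %samples) {
    printf "%-8s %3d samples (%.0f%% of runtime)\n",
        $name, $samples{$name}, 100 * $samples{$name} / $total;
}
```

With enough samples, the share of samples landing in each sub tracks the share of time spent there, which is why a single short run tends to give a noisier picture than several representative ones.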

Options

Options can be passed either on the command line:

perl -MDevel::StatProfiler=-interval,1000,-template,/tmp/profile/statprof.out

or by loading the profiler directly from the profiled program:

use Devel::StatProfiler -interval => 1000, -template => '/tmp/profile/statprof.out';

-template <path> (default statprof.out)

Sets the base name used for the output file. The full filename is obtained by appending a dot followed by a random string to the template path. This ensures that subsequent profiler runs don't overwrite the same output file.

-nostart

Don't start profiling when the module is loaded. To start profiling, call enable_profile().

-interval <microsecs> (default 10000)

Sets the sampling interval, in microseconds (accuracy varies depending on OS/hardware).

-maxsize <size> (default 10MB)

Once the trace file grows beyond this size, the profiler starts a new one with a higher ordinal.

-source <strategy> (default 'none')

Sets which source code is saved in the profile:

none

No source code is saved in the profile file.

traced_evals

Only the source code for eval()s that have at least one sample during evaluation is saved. This does NOT include eval()s that define subroutines that are sampled after the eval() ends.

all_evals

The source code for all eval()s is saved in the profile file.

all_evals_always

The source code for all eval()s is saved in the profile file, even when profiling is disabled.

-depth <stack depth> (default 20)

Sets the maximum number of stack frames saved for each sample.

-metadata HASHREF

Emit custom metadata in the header section of each profile file; this metadata will be available right after calling Devel::StatProfiler::Reader->new.

-file <path>

Sets the exact file path used for the profile output file; if the file already exists, it is overwritten.

In general, -template above is the preferred option, since -file will not work when using fork() or threads.

CAVEATS

goto &subroutine

With a sampling profiler there is no reliable way to track the goto &foo construct, hence the profile data for this code:

sub foo {
    # 100 milliseconds of computation
}

sub bar {
    # 100 milliseconds of computation, then
    goto &foo;
}

bar() for 1..100000; # foo.pl, line 10

will report that the code at foo.pl line 10 has spent approximately the same time in calling foo and bar, and will report foo as being called from the main program rather than from bar.
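The effect can be observed without the profiler, because goto &foo replaces the frame of bar on the Perl call stack. The sketch below uses the core caller() builtin to list the frames visible from inside foo; baz is a hypothetical sub added here only for comparison with an ordinary call.

```perl
use strict;
use warnings;

# Return the names of all subs on the Perl call stack, innermost first.
sub foo {
    my @names;
    my $depth = 0;
    while (my @frame = caller($depth++)) {
        push @names, $frame[3];    # element 3 of caller() is the sub name
    }
    return @names;
}

sub bar { goto &foo }       # tail call: bar's frame is replaced by foo's
sub baz { return foo() }    # ordinary call: baz stays on the stack

my @via_goto = bar();
my @via_call = baz();

print "via goto: @via_goto\n";
print "via call: @via_call\n";
```

Run as a standalone script, the first list contains no entry for bar, while the second contains both foo and baz. A stack trace taken while foo runs therefore carries no evidence that bar was ever involved.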

XSUBs with callbacks

Since XSUBs don't have a Perl-level stack frame, Perl code called from XSUBs is reported as if called from the source line calling the XSUB.

Additionally, the exclusive time for the XSUB incorrectly includes the time spent in callbacks.

XSUBs and overload

If an object has an overloaded &{} operator (code dereference) returning an XSUB as the code reference, the overload might be called twice in some situations.

changing profiler state

Calling enable_profile, disable_profile and stop_profile from an inner runloop (including, but not limited to, use, require, sort blocks and callbacks invoked from XS code) can have confusing results: runloops started afterwards will honor the new state, outer runloops will not.

Unfortunately, there is currently no way to detect this situation.

source code and #line directives

The parsing of #line directives, used to map logical lines to physical lines, relies on heuristics, which can fail.

Files that contain #line directives will not be found if no samples are taken in the part of the file outside the regions mapped by those directives.
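For reference, this is the kind of mapping a #line directive creates. The sketch uses a string eval so that the directive only affects the evaluated snippet; the file name virtual.pl and the line number 100 are arbitrary values chosen for the illustration.

```perl
use strict;
use warnings;

# Build a two-line snippet; physically, the expression sits on line 2.
my $code = join "\n",
    '# line 100 "virtual.pl"',    # directive: the next line becomes virtual.pl line 100
    '[__FILE__, __LINE__]';       # report the logical file and line

my $logical = eval $code or die $@;
printf "logical location: %s line %d\n", @$logical;
```

Perl reports the logical location (virtual.pl, line 100) rather than the physical one, so any tool that wants to show the matching source text has to reverse this mapping.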

first line of subs

The first line of subs is found by searching for the sub definition in the code. Needless to say, this is fragile.

sampling accuracy

Since the profiler uses nanosleep/Sleep between samples, accuracy is at the mercy of the OS scheduler. In particular, under Windows the default system timer has a resolution of about 15.6 milliseconds.
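To get a feel for the timer granularity on a given machine, one can measure how long a requested 10 millisecond sleep actually takes. The sketch below uses the core Time::HiRes module rather than the profiler's own timing code, so it only approximates what the profiler experiences; the loop count of 20 is arbitrary.

```perl
use strict;
use warnings;
use Time::HiRes qw(usleep gettimeofday tv_interval);

my $requested_us = 10_000;    # same as the default -interval
my @actual_us;

# Sleep repeatedly and record how long each sleep really took.
for (1 .. 20) {
    my $t0 = [gettimeofday];
    usleep($requested_us);
    push @actual_us, tv_interval($t0) * 1_000_000;
}

my ($min, $max) = (sort { $a <=> $b } @actual_us)[0, -1];
my $mean = 0;
$mean += $_ for @actual_us;
$mean /= @actual_us;

printf "requested %d us; observed min %.0f, mean %.0f, max %.0f us\n",
    $requested_us, $min, $mean, $max;
```

If the observed values are consistently well above the requested interval, the effective sampling rate is likely lower than the -interval setting suggests.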

AUTHORS

  • Mattia Barbon <mattia@barbon.org>

  • Steffen Mueller <smueller@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2015 by Mattia Barbon, Steffen Mueller.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.