YAPC::EU 2010

The Perl Compiler

rurban - Reini Urban, Graz, Austria

See the screencast of this talk at http://vimeo.com/14058377

What's new?

Fixed most bugs (work in progress): bytecode: 12=>0, c: 6=>1, cc: 9=>5, 5.14 CVs
.plc platform compatible, almost version compatible (.plc header change)
added test suite
more and better optimisations (work in progress)
B::C::Flags (customised extra_cflags + extra_libs)
removed B::Stash bloat from perlcc, -stash [optional]

Who am I

rurban has maintained cygwin perl since 5.8.8, plus 3-4 modules, guts, B::* => 5.10

Mostly doing LISP, Perl, C, bash and PHP, and support for custom HW, windows + linux + real-time systems in real life. Coding in winter, surfing in summer.

First on CPAN in 1995 with the perl5.hlp file and converter for Windows, and the windows dll versioning.

The compiler was started in 1995 by Malcolm Beattie, abandoned in 2007 by p5p, and revived in 2008 by me.

Very dynamic language: magic; tie; eval "require $foo;" -> which packages to import?

Overview
Status
Plans

Why use B::C / perlcc?

Improved startup time, esp. significant with larger code.

-fcog: less destruction time, -fno-destruct: no destruction time.

Reduced memory usage: 9% less memory with 25000 lines of code

Distribute binary only versions

No need to ship an entire perl install
Self contained application
But you could also use a "packager" such as perl2exe, perlapp or PAR: these are not compilers, and startup is slower.

And with B::CC - Improve run-time

Overview

The Perl Compiler suite B::C contains three separate compilers:

B::Bytecode / ByteLoader (freeze/thaw to .plc + .pmc)
B::C (freeze/thaw to .c)
B::CC (optimising to .c)

perl toke.c/op.c - B::C - perl op walker run.c

Eliminate the whole parsing and dynamic allocation time.

The Walker (Basics)

After compilation walk the "op tree" - run.c

Observation

1. The op tree is not a "tree", it is reduced to a simple linked list of ops. Every "op" (a pp_<opname> function) returns the next op.

2. PERL_ASYNC_CHECK was called after every single op.
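
The two observations above can be sketched as a toy model in C. This is not the code from run.c: the names toy_op, toy_runops, toy_async_check and the two pp_toy_* functions are hypothetical stand-ins, and the single accumulator replaces the real perl stack. It only illustrates the shape of the loop: each op function returns the next op, and the async check runs after every op that has a successor.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names mimic, but are not, the real perl internals. */
typedef struct toy_op toy_op;
struct toy_op {
    toy_op *(*ppaddr)(toy_op *self, long *acc); /* the "pp_" function of this op */
    toy_op *next;                               /* next op in execution order */
    long    arg;                                /* hypothetical operand */
};

static int toy_async_checks = 0;  /* counts simulated PERL_ASYNC_CHECK calls */

static void toy_async_check(void) { toy_async_checks++; }

/* Two hypothetical "pp" functions: do the work, then return the next op. */
static toy_op *pp_toy_const(toy_op *self, long *acc) { *acc = self->arg;  return self->next; }
static toy_op *pp_toy_add(toy_op *self, long *acc)   { *acc += self->arg; return self->next; }

/* The walker: follow the linked list until an op returns NULL. */
static long toy_runops(toy_op *start)
{
    long acc = 0;
    toy_op *op = start;
    assert(start != NULL);
    while ((op = op->ppaddr(op, &acc)) != NULL) {
        toy_async_check();  /* runs after every op that returned a successor */
    }
    return acc;
}

/* Builds the op list "const 40; add 1; add 1" and walks it. */
static long toy_demo(void)
{
    toy_op add2 = { pp_toy_add,   NULL,  1 };
    toy_op add1 = { pp_toy_add,   &add2, 1 };
    toy_op load = { pp_toy_const, &add1, 40 };
    return toy_runops(&load);
}
```

Calling toy_demo() walks three ops through function pointers and returns 42. The indirect call per op and the check per op are the costs the optimising compiler tries to avoid.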

Perl Phases - the "Perl Compiler"

=> Parse + Compile to op tree (in three phases, see perlguts and perloptree)
BEGIN (use ...)
CHECK (O modules)
INIT (main phase)
END (cleanup, perl destructors)

Normal Perl functions start at INIT, after BEGIN and CHECK. The O modules start at CHECK, and skip INIT.

Perl Phases - the "B Compilers"

Parse + Compile to op tree (in three phases)
BEGIN (use ...)
=> CHECK (O) => freeze
compiled INIT (main phase)
compiled END (cleanup, perl destructors)

The B::C compiler, invoked via O, freezes the state in CHECK, and then invokes the walker.

$ perl -MO=C,-omyprog.c -e'print $a;'
$ cc_harness -o myprog myprog.c
$ ./myprog

B::C - Unoptimised / the walker

B::CC - The optimiser / unrolled

no CALL_FPTR - call by ref
static direct function call
prefetched into CPU cache!
no unneeded stack handling
PERL_ASYNC_CHECK only at certain ops
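
The points in this list can be illustrated with a toy sketch in C. It is not real B::CC output and does not use the perl API: sketch_const, sketch_add, sketch_unrolled and sketch_async_check are hypothetical names, and placing the check at the end of the loop body is an assumption standing in for "certain ops". The op sequence is emitted as straight-line code with direct static calls, so there is no function pointer dispatch per op.

```c
#include <assert.h>

/* Toy model only: not real B::CC output and not the real perl API. */
static int sketch_async_checks = 0;  /* counts simulated async checks */

static void sketch_async_check(void) { sketch_async_checks++; }

/* Hypothetical op bodies, called directly instead of through a pointer. */
static long sketch_const(long value)       { return value; }
static long sketch_add(long acc, long rhs) { return acc + rhs; }

/* Unrolled form of the op list "const 40; add 1; add 1", run `rounds`
 * times. The only async check sits at the end of each loop iteration. */
static long sketch_unrolled(int rounds)
{
    long acc = 0;
    int i;
    assert(rounds >= 0);
    for (i = 0; i < rounds; i++) {
        acc = sketch_const(40);    /* direct static call */
        acc = sketch_add(acc, 1);  /* no per-op dispatch, no per-op check */
        acc = sketch_add(acc, 1);
        sketch_async_check();      /* assumed "certain op": the loop end */
    }
    return acc;
}
```

With rounds set to 3 the function executes nine op bodies but only three checks. A C compiler is also free to inline the static calls, which a function pointer loop prevents.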

Status

5.6.2 and 5.8.9 non-threaded B::C are quite usable and have the fewest known bugs, but 5.10 and 5.12 have also become fairly stable. 5.14 still has some CV problems.

In order of stability, best first: 5.6.2, 5.8.9, 5.10, 5.12 non-threaded.

Status Targets

Bugfixes for B::C (magic, xsub detection)
Test top100 CPAN modules (3-5 fail, all with magic)
Isolate bugs into simple tests (45 cases)
Test the Perl core test suite (~20 failures). Estimated 3-4 more open bugs.

Status Summary

5.6.2, 5.8.9, 5.10, 5.12 non-threaded are almost bug free, with B::Bytecode and B::C
B::C >=5.10 threaded (magic, pads) is in progress: 2-3 minor bugs with certain modules
With debugging perls there seem to be fewer bugs than with release builds. Normally it is the other way round.
B::CC has some limitations and some more known bugs: See testsuite and STATUS

Projects