NAME

Log::Log4perl::AutoCategorize - extended Log::Log4perl logging

ABSTRACT

Log::Log4perl::AutoCategorize extends Log::Log4perl's easy mode, providing 2 main features on top;

1. extended, automatic, transparent categorization capabilities
2. runtime information useful for:
  a. test-coverage information
  b. managing your logging config.

There are several more mature alternatives which you should check out for comparison;

#1. search for Stealth Loggers in base POD
use Log::Log4perl qw(:easy);

#2. new functionality, developed at approx same time as AutoCategorize
use Log::Log4perl::Filter;

SYNOPSIS

  use Log::Log4perl::AutoCategorize
    (
     alias => 'Logger', # shorthand class-name alias
     # easy init methods (std ones available too)
     # maybe later (when base has #includes), 2nd will override 1st
     initfile => $filename,
     initstr => q{
	 log4perl.rootLogger=DEBUG, A1
	 # log4perl.appender.A1=Log::Dispatch::Screen
	 log4perl.appender.A1 = Log::Dispatch::File
	 log4perl.appender.A1.filename = ./mylog
	 log4perl.appender.A1.mode = write
	 log4perl.appender.A1.layout = PatternLayout
	 log4perl.appender.A1.layout.ConversionPattern=%d %c %m%n
	 # create TEST-COVERAGE log
	 log4perl.appender.COVERAGE = Log::Dispatch::File
         log4perl.appender.COVERAGE.mode = write
	 log4perl.appender.COVERAGE.layout = org.apache.log4j.PatternLayout
	 log4perl.appender.COVERAGE.layout.ConversionPattern = (%d{HH:mm:ss.SSS}) %c: %m%n
	 # save timestamped versions, 1 per process
	 log4perl.appender.COVERAGE.filename = sub {"./test-coverage.txt.". scalar localtime}

	 # now, add the value: send the stuff written at END to it
	 log4perl.logger.Log.Log4perl.Autocategorize.END = INFO, COVERAGE

	 },
     );

  foreach (1..500) {
    Logger->warn($_);
    foo();
    A->bar();
    A::bar();
  }

  sub foo {
    foreach (1..20) {
	Logger->warn($_);
    }
  }

  package A;

  sub bar {
    my @d;
    foreach (1..20) {
	push @d, $_;
	Logger->warn($_,\@d);
    }
  }

DESCRIPTION

Before diving in, a few notes to the documentation:

I use 'Logger' as the 'official' shorthand for the tedious and verbose Log::Log4perl::AutoCategorize. Note that theres support for this, its used in the example above; 'alias => Logger'. Also, I often refer to Log::Log4perl as 'base'.

I try to use 'call' for the act of calling a method at runtime, and 'invocation' to refer to the source-code (ie: package,sub,line-number) that did it (ie was called). I hope that the distinction will help to bend the brick into a more hat-like shape.

AutoCategorization

The primary feature of this module is to automatically and transparently create logging categories for each invocation point in your application code.

use Log::Log4perl ':easy' offers a comparable, but less capable feature; it automatically infers that $logCat = "$caller_package", and uses that category to properly handle the message at runtime.

package Foo;
use Log::Log4perl(':easy');

package Foo::Bar;
use Log::Log4perl(':easy');

This module extends the $logCat with info thats obtained at runtime, leveraging Log::Log4perl's categorization of log-events by giving more detail upon which to filter.

$logcat = "$package.$subname.$loglevel.$linenum";

The base bubble-up properties are still in effect; if your log-config never specifies anything beyond $package, the $logCat of each call bubbles up thru the log-config, until it matches with your package-level config item.

In other words: since the $logcat contains more information than that provided by :easy, your logging configuation can exersize more control over what gets logged.

Note that you could have always used whatever categories you wanted, see "Configuration files" in Log::Log4perl, you just didnt get the help that made it easy to do so.

Test Coverage Results of your Application

When your code calls this module, the calls are categorized and counted; this is reported when the program terminates. The info is returned in 2 chunks.

Seen: How many times each invocation was called

This is variously called a usage report, a test-coverage report, a seen-report, etc.

(15:24:24.772) Logger.END.info.106: $Seen Log Events: = [ { '#log4perl.category.A.bar.debug.55' => '-400', '#log4perl.category.A.bar.warn.54' => 400, '#log4perl.category.Log.Log4perl.AutoCategorize.END.info.201' => 1, '#log4perl.category.main.foo.warn.44' => 200, '#log4perl.category.main.info.32' => 10, '#log4perl.category.main.warn.31' => 10 }

Note that A.bar.debug.55 above has a negative count. This indicates that the logging activity was suppressed by the configuration, but the code was still reached 400 times.

UnSeen: What invocations were never called

(15:24:24.772) Logger.END.info.106: $un-Seen Log Events: = [ { 'debug_00011' => 'main,probe.pl,52', 'info_00010' => 'main,probe.pl,51', } ];

This report identifies the Logging invocations, examined and munged at compile time, minus those invocations that were called. As such, it represents those invocations that were never reached.

Assessing Test Coverage

One way to use this facility is by running your test suite and collecting the individual coverage reports. Since theyre sorted, theyre easily compared, so you can tell what parts of the system are tested by each part of the test suite.

The base log-config facilities support timestamped log-files (see SYNOPSIS usage), simplifying the gathering of test-coverage reports. Then you can `cat *.coverage-log|sort -u`, and quickly assess the total coverage provided by the entire test suite.

Keeping your Logging Config current

Base documentation speaks of the hazards/limitations of categories; this is more critical here, where categories are so leveraged. See "Pitfalls with Categories" in Log::Log4perl for more info.

This package is undisputably more vulnerable; it exposes line numbers, which change regularly during development. If you use numbers, expect that adding comments to your code will change the category reported, rendering those log-config items ineffective for controlling output. Moreover, recognize that as the numbers become more stable, theyre also less valuable; by then, youve reassesed and refined your choices wrt log-levels. (you have, yes?)

However, you can also be more disiplined (TMTOWTDI), using config-lines with a mix of specificity: package, package.method, and package.method.level. Here the coverage report helps, since it lists all executed logger-invocations, it serves as a pretty good roadmap, letting you see the forest through the trees.

Moreover, the coverage report is formatted to be easily edited into a informative and useful logging configuration, ie:

1. comments/documents the 'existence' of a log-config entry
2. doesnt create the hierarchy in the logger-config, efficient
3. identifies all the code instrumentation available
4. serves as cut-paste fodder for generalized log-configs

Bottom line: I recommend that you start your log-config file by copying and editing the coverage report. Just remove the quotes put in by Data::Dumper, change '=>' to '=', remove '#' to activate lines (see example above), and change the count to DEBUG, WARN, etc.

What %Seen cant tell you

Because the %Seen coverage only includes the categories it sees at (a single) runtime, it cant 'remember' the aggregate coverage from previous runs.

The sort technique given above to assess aggregate test-coverage is not as useful here; it sorts commented lines away from their uncommented counterparts, but its a start.

I may add an option to support a separate 'remembering' aggregate via a tied variable thats initialized in a BEGIN block. This is a distant feature; I need to find and copy prior art from somewhere in CPAN before I design and implement this.

Filtering with Log::Log4perl::AutoCategorize

Restating in different words (at risk of redundancy), filtering with this package uses the base behavior of bubbling up a logging event through the config, and using generic config-items which are not overridden by more specific items.

Using part or all of the hierarchy of categories in your config-file, you can enable or disable logging on a class, a method in that class, particular logging levels with that method, or line by line.

log4perl.category.Frob = DEBUG			# 1
log4perl.category.Frob.nicate.debug = INFO		# 2
log4perl.category.Frob.nicate.debug.118 = DEBUG	# 3
#log4perl.category.Frob.Vernier.twiddle.debug = DEBUG	# 4

(1) enables debug logging accross the Frob class, but then (2) overrides that class-wide setting within the method nicate(), suppressing debug logging (by setting the threshold to INFO). (3) further overrides that setting by enabling the debug statement on line 118.

Tweaking your log config

If you look carefully at the example above, (2) may be confusing; it has 'debug = INFO', which looks like nonsense. The 'debug' is the log-level that the programmer wrote, it reflects his judgement as to the nature of the message.

The '= INFO' part tells you that the config user decided to suppress debug messages from nicate(), which were enabled class-wide by (1). (3) restores normal behavior of debug logging, but only for 1 invocation.

Note also that (4), even if it was uncommented, would have no effect on the logging, because its just saying that all debug() calls done by Frob::Vernier::twiddle() are issued at their default level. Keeping such lines commented is more efficient; it saves Log::Log4perl both cpu and memory by not loading config-items that arent meaninful.

Other Features

Automatic categorization, logging coverage, and fine-grained control of logging are the primary value-added features of this wrapper package. Other secondary features are;

Configuration Loading

If use Log::Log4perl::AutoCategorize is given an import list which includes initstr => q{} or initfile => $filename, the string or file is used to initialize Log::Log4perl. You can still use the old way, ie $base->init().

Classname Aliasing

Because it would be inconvenient to type the full classname for every statement, ex: Log::Log4perl::AutoCategorize->debug(), the package has the ability to alias itself into a classname of your choice, ex:

use Log::Log4perl::AutoCategorize ( alias => myLogger );
myLogger->debug("ok doing it");

Because aliasing is used in optimization phase, it must be imported via use, it cannot be deferred as in require, import.

Easy Initialization

You can initialize the base logger by any of the following;

# as use parameters, least typing
a. use Log::Log4perl::Autocategorize ( initstr  => $string );
b. use Log::Log4perl::Autocategorize ( initfile => $filename );

# assuming youve already used the package
c. Log::Log4perl::AutoCategorize->init( $filename );
d. Log::Log4perl::AutoCategorize->init( \$string );

# you can cheat, and use the base class name
e. Log::Log4perl->init( $filename );
f. Log::Log4perl->init( \$string );

# assuming youve aliased
g. Logger->init( $filename );
h. Logger->init( \$string );

i. # no explicit initialization at all (theres a default config)

The default config writes to stdout, and includes a test-coverage setup. This config is used at import() time, unless a or b is provided by you. If you explicitly call init(), the base is re-initialized, and relys on base behavior to work correctly.

I hope at some point to provide init_modify(), which will overlay new configuration on existing, rather than a full reinitialization. This will be implemented using the base\'s notional configuration include mechanism.

SubClassing

As sayeth "Planning for the Future: Better Constructors" in perltoot, you make you package suitable for deriving new ones. test 04_subclass.t verifies this, though it wasnt the 1st thing I got working.

Module Architecture

This package uses a 2 phase strategy; method munge, method vivify. The optimization phase munges the correct static-method calls, giving them unique ids, and AUTOLOAD implements them.

Original AUTOLOAD functionality

In v0.001, AUTOLOAD did the entire job itself; for each call to the logger it did the following:

1. used caller() to construct a category string, $logcat, dynamically.
2. fetched the $logger singleton {$logger = get_logger($logcat)}
3. tested if $logcat was a loggable event {$logger->is_$logcat()}
4. logged the message, or not, as appropriate.

Doing this repeatedly using caller() for every call is computationally expensive for a logging subsystem; the code should be efficient and usable application wide.

It could not commit the collected knowledge into a subroutine though; the method is supposed to be callable from anywhere, and should act the same for each client.

because as there was no way to associate an invocation point uniquely with the info gleaned via caller().

Attempts to make methods here was no good; the method created was similar to Logger::debug(), it was created knowing the log-config based filtering of just one invocation, and then served all other debug calls with the wrong inferred category.

Our use of optimizer

I use Simon Cozen''s optimizer.pm to find and replace (ie munge) all occurrences

of    Logger->info()
with  Logger->info_$UniqueInvocationID()

The function-munging produces a unique function name for every invocation of the Logger. lexical occurrence (glossary term: )

This allows AUTOLOAD (which handles all these calls) to distinguish each invocation, make a custom handler routine for it. Once the custom handler is added into the symbol table, it is called later without any overhead.

How optimizer munges

Your code, 'use optimizer => sub yourfunc;' operates at compile time, letting you search the opcode chain for specific op-chains. Once youve got the right opchain, you can do the opcode surgery.

For example, we must recognize this opcode chain;

`perl -MO=Concise,-exec basic.pl`; t76 <0> pushmark s t77 <$> const(PV "Logger") sM/BARE t78 <$> const(PV "2") sM t79 <$> const(PV "args") sM t7a <$> method_named(PVIV "info_00010") t7b <1> entersub[t23] vKS/TARG

Our criteria is (pushmark, const with PV="Logger", ...., method_named with PVIV=~$namePatt). We also recognize the opcodes that build arguments on the stack, so we can find the balanced method_named opcode. Finally, we insure that the method name being invoked is one of the allowed ones.

Caveat: you must use a specific coding style

To match the above criteria, your code must use *ONLY the 1st* of these 3 different coding styles;

IE: Logger->debug("foo");
NOT: $logger->debug("foo");
NOR: Logger::debug("foo");

This form provides the package name as a bareword (t77), which allows a rather strict (safe?) test; we dont want to stomp in others\' namespaces (except for the explicit takeover of the $Alias package)

The 2nd form is not safely detectable at compile time, since $logger type cannot be absolutely known, and we try not to do risky surgery. For example, we\'d never want to inadvertently do this; $logger->warn if ref $logger eq 'FlareGun'.

The 3rd form is doable, but clutters the code which does $Alias recognition, for little syntactic gain. We wont even warn you if you use this construct.

lightweight customized subroutines

Given that all the customized anonymous subroutines are instantiated as either sub{} or sub{ logit("$cat") }; they all invoke a common logit() routine that does the real work.

The asubs are trivial wrappers, with the category value stringified and held in code, and passed into logit() when called. This doesnt shrink the number of symbol table entries, but it minimizes the size of the CVs hanging there.

To save more space, a common do_nothing() function could be used instead of the current invocation counting do-nothing, at the cost of losing test-coverage info.

AUTOLOAD revisited

Once method munging was in place, $AUTOLOAD would contain, for example; not "debug", but "debug_00011". Now that each invocation is unique, the distinctions can be computed once and remembered; ie they can be reduced to a dynamically created sub, and added to the symbol table.

IOW, these changes were made to AUTOLOAD:

4. computes the category using caller()
5. determines whether the base logger would send the message.
6. instantiates an anonymous subroutine reflecting 5.
7. adds the sub to the symbol table, with the munged method name.

At runtime, invocations like Logger->warn_$UIID() are called, and dispatched to AUTOLOAD. Once the munged function is in the symbol table, AUTOLOAD is never called again for this invocation.

Benchmarking

Ive done one Benchmark.pm based test (tshort.pl) which doesnt seem quite right. The more straightforward test shows more convincing results.

$jimc@harpo logger]$ time t1.pl -n > /dev/null

real	1m30.860s
user	1m28.246s
sys	0m2.002s

$jimc@harpo logger]$ time t2.pl

... debug output snipped ...

real	0m52.229s
user	0m49.363s
sys	0m1.625s

This last benchmark implements above test in pure Log::Log4perl code. As you can see, this package is competetive with Log::Log4perl itself, at least when analogous features are used (that use is a bit forced, you wouldnt NEED to do so all the time).

Nonetheless, its conceivable to get better than native performance if internal short-circuits can be taken.

Please run perftests/timeit.sh and send me email, particularly if you get wildly different results.

CAVEAT

This package expects you to use Logger->warn() style of coding, $logger->warn() will not work, unless . This is because the type of $logger cannot be known at compile time, and its too risky to munge everything that looks like $flaregun->warn().

Even if it were possible, it would probably be bad to mix lexical code customization with objects and their attributes.

BUGS

The category extension can be abused; you can independently suppress warn() while elevating debug(), leading to suprises in what output goes where. But this can be used sanely, so I dont consider this a problem.

The %Seen report can never contain the categories of unexecuted functions, since te catalog entry is done by AUTOLOAD, which is never run. Ive thought of a few 'solutions', but theyre all insane..

1. remember functions during munging, call them once in CHECK{}.
   Wont work; would get useless caller() info.

2. Tie %Total_Invocations to a DBM file, update in the END.  This
   is seductive; but it will have trouble forgetting invocations
   that have been removed from the code, or in the case of
   line-numbered log-specs, been moved by a mere comment. Note
   though that reset control could be added to API.

TODO

fix raft of subroutine redefined warnings

Warnings are issued (under -w) during compile-phase, I cant control them. I suspect that solution lies in warnings::register, or perhaps in more primitive solutions like { package B::Generate; no warnings 'redefine' }. If you can patch these away, Id be eternally grateful.

devise good way to control logging details

AutoCategorize uses Data::Dumper to render the structure of complex data arguments. Ive found it would be nice to selectively suppress the Dump, and print just the strings.

But to do this via the log-config, the config-item syntax must be extended; either by adding new attributes, or by allowing new values. Either approach compromises interoperability with log4java config files. These possible config-schemes should therefore be considered speculative;

# using extra param
log4perl.category.TestUtils.testcase_applies = DEBUG, 0

The 2nd parameter would control whether Data::Dumper was invoked, in this case they would not be. This scheme has advantage of having a wide API; hashrefs etc can (in principle - and subject to reality check by Gurus) be given as parameters.

Other schemes are also possible;

# an additional '.detail' attribute
log4perl.category.A.foo.debug.detail = 0

# new .*.detail attribute, with (needed) wild-card
log4perl.category.A.foo.debug.*.detail = 0
# limit the depth of dumper (one way, others possible)
log4perl.category.A.foo.debug.*.detail = -20
# alternative 
log4perl.category.A.foo.debug.*.detail = YAML
log4perl.category.A.foo.debug.*.detail = sub { YAML->dump }

allow use of alternates to Data::Dumper

YAML comes to mind, as does Data::Dumper::Sortkeys = sub {...}; I took a quick swipe at using YAML::Dump (-dy debug option), but its not yielding expected results.

Its also possible to get various object and context sensitive serializations, esp if combined with extra params above.

devise extended method-name space

This package is hardcoded to only munge invocations to one of (debug, info, notice, warn, error), non matching are left unaltered, because some calls should delegate thru to Log::Log4perl.

Note for example that this module explicitly adds levels: notice, notify as synonyms for warn, and these are not exposed above. (I like it myself, but I dont want to corrupt my users too).

But the point is this: A whole panoply of method suffixes are possible; (ex: /(info|debug)_.*/), a careful use of this method namespace could be used to specify many subtle (or not so subtle) variations in behavior (ex: debug_dump, warn_nodump, error_iftrue, info_iffalse).

short-circuit internal checks

Its likely that several checks internal to Log::Log4perl can be eliminated once its determined (at 1st invocation) that a log event should actually do something.

init and watch

Log4perl has ability to watch the logfile for changes, and change its operation accordingly. This package must hook into that notification, and use it to undefine the appropriate symbol-table entry, thus allowing the AUTOLOAD mechanism to re-vivify it according to the new configuration. Obviously it would be nice to do this only to the parts of the config that are different.

This feature will use the %cat2data mapping built at 1st invocation to allow config changes to be mapped back to the munged method-name that resolved to that category in the 1st place.

context sensitive logging

NDC and MDC are tantalizing features, they appear to hold some promise for conditional logging; ie once an application error is detected, a full dump of the NDC or MDC is logged. Alternatively, once an all-ok is determined, a stack of NDC logs can be discarded, and a summary message issued.

Remembering the AutoCat-alog

See also "BUGS/Seen.2". While it borders on the baroque, its possible to tie %TotalInvocations to a DBM file, whose name is controlled by your log-config, and can thus be disciplined to your VERSION; either $package::VERSION, a complete CVS Revision, or to your application Version. Interestingly, this could give some insight into longer-term code evolution.

More TODO

find and kill bugs..
write test cases..
optimize more
integrate with Log::Log4perl
bundle for CPAN..

help/feedback is welcome.

SEE ALSO

L<Log::Log4perl>
L<optimizer>
L<B::Generate>

AUTHOR

Jim Cromie <jcromie@cpan.org>

COPYRIGHT AND LICENSE

Copyright 2003 Jim Cromie

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

(I always wanted to write that ;-)