NAME
Log::Log4perl::AutoCategorize - extended Log::Log4perl logging
ABSTRACT
Log::Log4perl::AutoCategorize extends Log::Log4perl's easy mode, providing 2 main features on top;
1. extended, automatic, transparent categorization capabilities
2. runtime information useful for:
a. test-coverage information
b. managing your logging config.
There are several more mature alternatives which you should check out for comparison;
#1. search for Stealth Loggers in base POD
use Log::Log4perl qw(:easy);
#2. new functionality, developed at approx same time as AutoCategorize
use Log::Log4perl::Filter;
SYNOPSIS
use Log::Log4perl::AutoCategorize
(
alias => 'Logger', # shorthand class-name alias
# easy init methods (std ones available too)
# maybe later (when base has #includes), 2nd will override 1st
initfile => $filename,
initstr => q{
log4perl.rootLogger=DEBUG, A1
# log4perl.appender.A1=Log::Dispatch::Screen
log4perl.appender.A1 = Log::Dispatch::File
log4perl.appender.A1.filename = ./mylog
log4perl.appender.A1.mode = write
log4perl.appender.A1.layout = PatternLayout
log4perl.appender.A1.layout.ConversionPattern=%d %c %m%n
# create TEST-COVERAGE log
log4perl.appender.COVERAGE = Log::Dispatch::File
log4perl.appender.COVERAGE.mode = write
log4perl.appender.COVERAGE.layout = org.apache.log4j.PatternLayout
log4perl.appender.COVERAGE.layout.ConversionPattern = (%d{HH:mm:ss.SSS}) %c: %m%n
# save timestamped versions, 1 per process
log4perl.appender.COVERAGE.filename = sub {"./test-coverage.txt.". scalar localtime}
# now, add the value: send the stuff written at END to it
log4perl.logger.Log.Log4perl.Autocategorize.END = INFO, COVERAGE
},
);
foreach (1..500) {
Logger->warn($_);
foo();
A->bar();
A::bar();
}
sub foo {
foreach (1..20) {
Logger->warn($_);
}
}
package A;
sub bar {
my @d;
foreach (1..20) {
push @d, $_;
Logger->warn($_,\@d);
}
}
DESCRIPTION
Before diving in, a few notes to the documentation:
I use 'Logger' as the 'official' shorthand for the tedious and verbose Log::Log4perl::AutoCategorize. Note that theres support for this, its used in the example above; 'alias => Logger'. Also, I often refer to Log::Log4perl as 'base'.
I try to use 'call' for the act of calling a method at runtime, and 'invocation' to refer to the source-code (ie: package,sub,line-number) that did it (ie was called). I hope that the distinction will help to bend the brick into a more hat-like shape.
AutoCategorization
The primary feature of this module is to automatically and transparently create logging categories for each invocation point in your application code.
use Log::Log4perl ':easy' offers a comparable, but less capable feature; it automatically infers that $logCat = "$caller_package", and uses that category to properly handle the message at runtime.
package Foo;
use Log::Log4perl(':easy');
package Foo::Bar;
use Log::Log4perl(':easy');
This module extends the $logCat with info thats obtained at runtime, leveraging Log::Log4perl's categorization of log-events by giving more detail upon which to filter.
$logcat = "$package.$subname.$loglevel.$linenum";
The base bubble-up properties are still in effect; if your log-config never specifies anything beyond $package, the $logCat of each call bubbles up thru the log-config, until it matches with your package-level config item.
In other words: since the $logcat contains more information than that provided by :easy, your logging configuation can exersize more control over what gets logged.
Note that you could have always used whatever categories you wanted, see "Configuration files" in Log::Log4perl, you just didnt get the help that made it easy to do so.
Test Coverage Results of your Application
When your code calls this module, the calls are categorized and counted; this is reported when the program terminates. The info is returned in 2 chunks.
Seen: How many times each invocation was called
This is variously called a usage report, a test-coverage report, a seen-report, etc.
(15:24:24.772) Logger.END.info.106: $Seen Log Events: = [ { '#log4perl.category.A.bar.debug.55' => '-400', '#log4perl.category.A.bar.warn.54' => 400, '#log4perl.category.Log.Log4perl.AutoCategorize.END.info.201' => 1, '#log4perl.category.main.foo.warn.44' => 200, '#log4perl.category.main.info.32' => 10, '#log4perl.category.main.warn.31' => 10 }
Note that A.bar.debug.55 above has a negative count. This indicates that the logging activity was suppressed by the configuration, but the code was still reached 400 times.
UnSeen: What invocations were never called
(15:24:24.772) Logger.END.info.106: $un-Seen Log Events: = [ { 'debug_00011' => 'main,probe.pl,52', 'info_00010' => 'main,probe.pl,51', } ];
This report identifies the Logging invocations, examined and munged at compile time, minus those invocations that were called. As such, it represents those invocations that were never reached.
Assessing Test Coverage
One way to use this facility is by running your test suite and collecting the individual coverage reports. Since theyre sorted, theyre easily compared, so you can tell what parts of the system are tested by each part of the test suite.
The base log-config facilities support timestamped log-files (see SYNOPSIS usage), simplifying the gathering of test-coverage reports. Then you can `cat *.coverage-log|sort -u`, and quickly assess the total coverage provided by the entire test suite.
Keeping your Logging Config current
Base documentation speaks of the hazards/limitations of categories; this is more critical here, where categories are so leveraged. See "Pitfalls with Categories" in Log::Log4perl for more info.
This package is undisputably more vulnerable; it exposes line numbers, which change regularly during development. If you use numbers, expect that adding comments to your code will change the category reported, rendering those log-config items ineffective for controlling output. Moreover, recognize that as the numbers become more stable, theyre also less valuable; by then, youve reassesed and refined your choices wrt log-levels. (you have, yes?)
However, you can also be more disiplined (TMTOWTDI), using config-lines with a mix of specificity: package, package.method, and package.method.level. Here the coverage report helps, since it lists all executed logger-invocations, it serves as a pretty good roadmap, letting you see the forest through the trees.
Moreover, the coverage report is formatted to be easily edited into a informative and useful logging configuration, ie:
1. comments/documents the 'existence' of a log-config entry
2. doesnt create the hierarchy in the logger-config, efficient
3. identifies all the code instrumentation available
4. serves as cut-paste fodder for generalized log-configs
Bottom line: I recommend that you start your log-config file by copying and editing the coverage report. Just remove the quotes put in by Data::Dumper, change '=>' to '=', remove '#' to activate lines (see example above), and change the count to DEBUG, WARN, etc.
What %Seen cant tell you
Because the %Seen coverage only includes the categories it sees at (a single) runtime, it cant 'remember' the aggregate coverage from previous runs.
The sort technique given above to assess aggregate test-coverage is not as useful here; it sorts commented lines away from their uncommented counterparts, but its a start.
I may add an option to support a separate 'remembering' aggregate via a tied variable thats initialized in a BEGIN block. This is a distant feature; I need to find and copy prior art from somewhere in CPAN before I design and implement this.
Filtering with Log::Log4perl::AutoCategorize
Restating in different words (at risk of redundancy), filtering with this package uses the base behavior of bubbling up a logging event through the config, and using generic config-items which are not overridden by more specific items.
Using part or all of the hierarchy of categories in your config-file, you can enable or disable logging on a class, a method in that class, particular logging levels with that method, or line by line.
log4perl.category.Frob = DEBUG # 1
log4perl.category.Frob.nicate.debug = INFO # 2
log4perl.category.Frob.nicate.debug.118 = DEBUG # 3
#log4perl.category.Frob.Vernier.twiddle.debug = DEBUG # 4
(1) enables debug logging accross the Frob class, but then (2) overrides that class-wide setting within the method nicate(), suppressing debug logging (by setting the threshold to INFO). (3) further overrides that setting by enabling the debug statement on line 118.
Tweaking your log config
If you look carefully at the example above, (2) may be confusing; it has 'debug = INFO', which looks like nonsense. The 'debug' is the log-level that the programmer wrote, it reflects his judgement as to the nature of the message.
The '= INFO' part tells you that the config user decided to suppress debug messages from nicate(), which were enabled class-wide by (1). (3) restores normal behavior of debug logging, but only for 1 invocation.
Note also that (4), even if it was uncommented, would have no effect on the logging, because its just saying that all debug() calls done by Frob::Vernier::twiddle() are issued at their default level. Keeping such lines commented is more efficient; it saves Log::Log4perl both cpu and memory by not loading config-items that arent meaninful.
Other Features
Automatic categorization, logging coverage, and fine-grained control of logging are the primary value-added features of this wrapper package. Other secondary features are;
Configuration Loading
If use Log::Log4perl::AutoCategorize is given an import list which includes initstr => q{} or initfile => $filename, the string or file is used to initialize Log::Log4perl. You can still use the old way, ie $base->init().
Classname Aliasing
Because it would be inconvenient to type the full classname for every statement, ex: Log::Log4perl::AutoCategorize->debug(), the package has the ability to alias itself into a classname of your choice, ex:
use Log::Log4perl::AutoCategorize ( alias => myLogger );
myLogger->debug("ok doing it");
Because aliasing is used in optimization phase, it must be imported via use, it cannot be deferred as in require, import.
Easy Initialization
You can initialize the base logger by any of the following;
# as use parameters, least typing
a. use Log::Log4perl::Autocategorize ( initstr => $string );
b. use Log::Log4perl::Autocategorize ( initfile => $filename );
# assuming youve already used the package
c. Log::Log4perl::AutoCategorize->init( $filename );
d. Log::Log4perl::AutoCategorize->init( \$string );
# you can cheat, and use the base class name
e. Log::Log4perl->init( $filename );
f. Log::Log4perl->init( \$string );
# assuming youve aliased
g. Logger->init( $filename );
h. Logger->init( \$string );
i. # no explicit initialization at all (theres a default config)
The default config writes to stdout, and includes a test-coverage setup. This config is used at import() time, unless a or b is provided by you. If you explicitly call init(), the base is re-initialized, and relys on base behavior to work correctly.
I hope at some point to provide init_modify(), which will overlay new configuration on existing, rather than a full reinitialization. This will be implemented using the base\'s notional configuration include mechanism.
SubClassing
As sayeth "Planning for the Future: Better Constructors" in perltoot, you make you package suitable for deriving new ones. test 04_subclass.t verifies this, though it wasnt the 1st thing I got working.
Module Architecture
This package uses a 2 phase strategy; method munge, method vivify. The optimization phase munges the correct static-method calls, giving them unique ids, and AUTOLOAD implements them.
Original AUTOLOAD functionality
In v0.001, AUTOLOAD did the entire job itself; for each call to the logger it did the following:
1. used caller() to construct a category string, $logcat, dynamically.
2. fetched the $logger singleton {$logger = get_logger($logcat)}
3. tested if $logcat was a loggable event {$logger->is_$logcat()}
4. logged the message, or not, as appropriate.
Doing this repeatedly using caller() for every call is computationally expensive for a logging subsystem; the code should be efficient and usable application wide.
It could not commit the collected knowledge into a subroutine though; the method is supposed to be callable from anywhere, and should act the same for each client.
because as there was no way to associate an invocation point uniquely with the info gleaned via caller().
Attempts to make methods here was no good; the method created was similar to Logger::debug(), it was created knowing the log-config based filtering of just one invocation, and then served all other debug calls with the wrong inferred category.
Our use of optimizer
I use Simon Cozen''s optimizer.pm to find and replace (ie munge) all occurrences
of Logger->info()
with Logger->info_$UniqueInvocationID()
The function-munging produces a unique function name for every invocation of the Logger. lexical occurrence (glossary term: )
This allows AUTOLOAD (which handles all these calls) to distinguish each invocation, make a custom handler routine for it. Once the custom handler is added into the symbol table, it is called later without any overhead.
How optimizer munges
Your code, 'use optimizer => sub yourfunc;' operates at compile time, letting you search the opcode chain for specific op-chains. Once youve got the right opchain, you can do the opcode surgery.
For example, we must recognize this opcode chain;
`perl -MO=Concise,-exec basic.pl`; t76 <0> pushmark s t77 <$> const(PV "Logger") sM/BARE t78 <$> const(PV "2") sM t79 <$> const(PV "args") sM t7a <$> method_named(PVIV "info_00010") t7b <1> entersub[t23] vKS/TARG
Our criteria is (pushmark, const with PV="Logger", ...., method_named with PVIV=~$namePatt). We also recognize the opcodes that build arguments on the stack, so we can find the balanced method_named opcode. Finally, we insure that the method name being invoked is one of the allowed ones.
Caveat: you must use a specific coding style
To match the above criteria, your code must use *ONLY the 1st* of these 3 different coding styles;
IE: Logger->debug("foo");
NOT: $logger->debug("foo");
NOR: Logger::debug("foo");
This form provides the package name as a bareword (t77), which allows a rather strict (safe?) test; we dont want to stomp in others\' namespaces (except for the explicit takeover of the $Alias package)
The 2nd form is not safely detectable at compile time, since $logger type cannot be absolutely known, and we try not to do risky surgery. For example, we\'d never want to inadvertently do this; $logger->warn if ref $logger eq 'FlareGun'.
The 3rd form is doable, but clutters the code which does $Alias recognition, for little syntactic gain. We wont even warn you if you use this construct.
lightweight customized subroutines
Given that all the customized anonymous subroutines are instantiated as either sub{} or sub{ logit("$cat") }; they all invoke a common logit() routine that does the real work.
The asubs are trivial wrappers, with the category value stringified and held in code, and passed into logit() when called. This doesnt shrink the number of symbol table entries, but it minimizes the size of the CVs hanging there.
To save more space, a common do_nothing() function could be used instead of the current invocation counting do-nothing, at the cost of losing test-coverage info.
AUTOLOAD revisited
Once method munging was in place, $AUTOLOAD would contain, for example; not "debug", but "debug_00011". Now that each invocation is unique, the distinctions can be computed once and remembered; ie they can be reduced to a dynamically created sub, and added to the symbol table.
IOW, these changes were made to AUTOLOAD:
4. computes the category using caller()
5. determines whether the base logger would send the message.
6. instantiates an anonymous subroutine reflecting 5.
7. adds the sub to the symbol table, with the munged method name.
At runtime, invocations like Logger->warn_$UIID() are called, and dispatched to AUTOLOAD. Once the munged function is in the symbol table, AUTOLOAD is never called again for this invocation.
Benchmarking
Ive done one Benchmark.pm based test (tshort.pl) which doesnt seem quite right. The more straightforward test shows more convincing results.
$jimc@harpo logger]$ time t1.pl -n > /dev/null
real 1m30.860s
user 1m28.246s
sys 0m2.002s
$jimc@harpo logger]$ time t2.pl
... debug output snipped ...
real 0m52.229s
user 0m49.363s
sys 0m1.625s
This last benchmark implements above test in pure Log::Log4perl code. As you can see, this package is competetive with Log::Log4perl itself, at least when analogous features are used (that use is a bit forced, you wouldnt NEED to do so all the time).
Nonetheless, its conceivable to get better than native performance if internal short-circuits can be taken.
Please run perftests/timeit.sh and send me email, particularly if you get wildly different results.
CAVEAT
This package expects you to use Logger->warn() style of coding, $logger->warn() will not work, unless . This is because the type of $logger cannot be known at compile time, and its too risky to munge everything that looks like $flaregun->warn().
Even if it were possible, it would probably be bad to mix lexical code customization with objects and their attributes.
BUGS
The category extension can be abused; you can independently suppress warn() while elevating debug(), leading to suprises in what output goes where. But this can be used sanely, so I dont consider this a problem.
The %Seen report can never contain the categories of unexecuted functions, since te catalog entry is done by AUTOLOAD, which is never run. Ive thought of a few 'solutions', but theyre all insane..
1. remember functions during munging, call them once in CHECK{}.
Wont work; would get useless caller() info.
2. Tie %Total_Invocations to a DBM file, update in the END. This
is seductive; but it will have trouble forgetting invocations
that have been removed from the code, or in the case of
line-numbered log-specs, been moved by a mere comment. Note
though that reset control could be added to API.
TODO
fix raft of subroutine redefined warnings
Warnings are issued (under -w) during compile-phase, I cant control them. I suspect that solution lies in warnings::register, or perhaps in more primitive solutions like { package B::Generate; no warnings 'redefine' }. If you can patch these away, Id be eternally grateful.
devise good way to control logging details
AutoCategorize uses Data::Dumper to render the structure of complex data arguments. Ive found it would be nice to selectively suppress the Dump, and print just the strings.
But to do this via the log-config, the config-item syntax must be extended; either by adding new attributes, or by allowing new values. Either approach compromises interoperability with log4java config files. These possible config-schemes should therefore be considered speculative;
# using extra param
log4perl.category.TestUtils.testcase_applies = DEBUG, 0
The 2nd parameter would control whether Data::Dumper was invoked, in this case they would not be. This scheme has advantage of having a wide API; hashrefs etc can (in principle - and subject to reality check by Gurus) be given as parameters.
Other schemes are also possible;
# an additional '.detail' attribute
log4perl.category.A.foo.debug.detail = 0
# new .*.detail attribute, with (needed) wild-card
log4perl.category.A.foo.debug.*.detail = 0
# limit the depth of dumper (one way, others possible)
log4perl.category.A.foo.debug.*.detail = -20
# alternative
log4perl.category.A.foo.debug.*.detail = YAML
log4perl.category.A.foo.debug.*.detail = sub { YAML->dump }
allow use of alternates to Data::Dumper
YAML comes to mind, as does Data::Dumper::Sortkeys = sub {...}; I took a quick swipe at using YAML::Dump (-dy debug option), but its not yielding expected results.
Its also possible to get various object and context sensitive serializations, esp if combined with extra params above.
devise extended method-name space
This package is hardcoded to only munge invocations to one of (debug, info, notice, warn, error), non matching are left unaltered, because some calls should delegate thru to Log::Log4perl.
Note for example that this module explicitly adds levels: notice, notify as synonyms for warn, and these are not exposed above. (I like it myself, but I dont want to corrupt my users too).
But the point is this: A whole panoply of method suffixes are possible; (ex: /(info|debug)_.*/), a careful use of this method namespace could be used to specify many subtle (or not so subtle) variations in behavior (ex: debug_dump, warn_nodump, error_iftrue, info_iffalse).
short-circuit internal checks
Its likely that several checks internal to Log::Log4perl can be eliminated once its determined (at 1st invocation) that a log event should actually do something.
init and watch
Log4perl has ability to watch the logfile for changes, and change its operation accordingly. This package must hook into that notification, and use it to undefine the appropriate symbol-table entry, thus allowing the AUTOLOAD mechanism to re-vivify it according to the new configuration. Obviously it would be nice to do this only to the parts of the config that are different.
This feature will use the %cat2data mapping built at 1st invocation to allow config changes to be mapped back to the munged method-name that resolved to that category in the 1st place.
context sensitive logging
NDC and MDC are tantalizing features, they appear to hold some promise for conditional logging; ie once an application error is detected, a full dump of the NDC or MDC is logged. Alternatively, once an all-ok is determined, a stack of NDC logs can be discarded, and a summary message issued.
Remembering the AutoCat-alog
See also "BUGS/Seen.2". While it borders on the baroque, its possible to tie %TotalInvocations to a DBM file, whose name is controlled by your log-config, and can thus be disciplined to your VERSION; either $package::VERSION, a complete CVS Revision, or to your application Version. Interestingly, this could give some insight into longer-term code evolution.
More TODO
find and kill bugs..
write test cases..
optimize more
integrate with Log::Log4perl
bundle for CPAN..
help/feedback is welcome.
SEE ALSO
L<Log::Log4perl>
L<optimizer>
L<B::Generate>
AUTHOR
Jim Cromie <jcromie@cpan.org>
COPYRIGHT AND LICENSE
Copyright 2003 Jim Cromie
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
(I always wanted to write that ;-)