NAME
loggrep - quickly find relevant lines in a log searching by date
VERSION
version 0.002
SYNOPSIS
loggrep --start <date> --end <date> [ --include <pattern> ]+ [ --exclude <pattern> ]+ <file>
DESCRIPTION
loggrep searches a file for lines that match particular patterns. In this it is like grep, ack, and many other utilities. What it adds is the ability to narrow the search window to those lines that fall within temporal limits. It finds these limits quickly by a variant of binary search, which makes searching very large log files efficient. This requires, of course, that the lines in the file (usually) carry timestamps which are (usually) in sequence and parsable in a common way.
loggrep searches for an initial temporal limit by estimating the line offset of the line sought, based on the timestamps at the margins of the search region and the assumption that lines are added at a roughly constant rate. This candidate line is found; then the nearest line bearing a timestamp is sought. Its time is compared to the target time, and the process is repeated within the new search region until either the target time is found or the search region cannot be narrowed further.
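The search described above can be sketched as follows. This is an illustrative Python sketch, not loggrep's actual Perl implementation: it assumes the log is already represented as an in-memory list of timestamps (None for lines without one), and the names nearest_stamped and first_at_or_after are hypothetical.

```python
from typing import Optional, Sequence


def nearest_stamped(stamps: Sequence[Optional[float]],
                    guess: int, lo: int, hi: int) -> Optional[int]:
    """Index of the timestamped line closest to guess within [lo, hi], or None."""
    for d in range(hi - lo + 1):
        for j in (guess - d, guess + d):
            if lo <= j <= hi and stamps[j] is not None:
                return j
    return None


def first_at_or_after(stamps: Sequence[Optional[float]], target: float) -> int:
    """Index of the first timestamped line with time >= target.

    Returns len(stamps) if no such line exists. Assumes timestamps are
    (mostly) non-decreasing; lines without a timestamp are None.
    """
    n = len(stamps)
    first = nearest_stamped(stamps, 0, 0, n - 1)
    last = nearest_stamped(stamps, n - 1, 0, n - 1)
    if first is None or stamps[last] < target:
        return n
    if stamps[first] >= target:
        return first

    # Invariant: stamps[lo] < target <= stamps[hi], both lines timestamped.
    lo, hi = first, last
    while hi - lo > 1:
        # Estimate the offset assuming lines arrive at a constant rate.
        span = stamps[hi] - stamps[lo]
        guess = lo + int((target - stamps[lo]) / span * (hi - lo))
        guess = min(max(guess, lo + 1), hi - 1)
        # The candidate may lack a timestamp; take the nearest line that has one.
        j = nearest_stamped(stamps, guess, lo + 1, hi - 1)
        if j is None:
            break  # the region cannot be narrowed further
        if stamps[j] < target:
            lo = j
        else:
            hi = j
    return hi
```

Each pass moves one edge of the region strictly inward to a timestamped line, so the loop always terminates. When the constant-rate assumption holds, the estimate lands close to the target and few passes are needed.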
OPTIONS
Run loggrep with the --help option to see the option default values, if any.
Log
- -l file, --log=file
-
The log file to search may be provided either as the final argument or as the value of a --log option.
Temporal Limits
All temporal limits are parsed by Date::Parse. Date::Parse will turn a temporal expression into a Unix timestamp. You can test whether it understands your temporal expressions like so:
$ perl -MDate::Parse -E 'say str2time shift' "21/dec/93 17:05"
756511500
Date::Parse does not actually need to get the timestamps right, so long as it puts them in the right sequence.
If Date::Parse fails you, see the --time option.
- -s time, --start=time
-
The initial temporal limit.
- -e time, --end=time
-
The final temporal limit.
- -m time, --moment=time
-
--moment sets the initial and final temporal limits to the same time. This is useful for extracting a single log line with a known timestamp, perhaps with its context (see --context).
- -d pattern, --date=pattern
-
The pattern used to identify timestamps in log lines.
Search Patterns
All patterns are Perl regular expressions of the idiom understood by the Perl executing loggrep. If no patterns are provided, all lines within the temporal limits are printed. If both including and excluding patterns match a line, the latter take precedence and the line is not printed. Multiple search patterns, or none, may be provided.
- -n pattern, --include=pattern
-
Print the lines matching the given pattern.
- -N string, --include-quoted=string
-
Print lines containing the given substring.
- -v pattern, --exclude=pattern
-
Exclude lines matching the given pattern.
- -V string, --exclude-quoted=string
-
Exclude lines containing the given substring.
- -i, --case-insensitive
-
Pattern and substring matching is case-insensitive. Note that one may turn on case-insensitivity for a single pattern like so:
-n "(?i:match me)"
Likewise, when --case-insensitive is in effect, one may turn it off for a single pattern:
-i -n "(?-i:match me)"
This technique will not work for substring matching.
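The precedence rules for search patterns can be sketched as below. This is an illustrative Python sketch rather than loggrep's own code, and make_filter is a hypothetical name. It assumes that a line need match only one including pattern to be printed, which the text above does not state, and it uses Python regular expressions, which differ from Perl's in some details.

```python
import re
from typing import Callable, Iterable


def make_filter(include: Iterable[str] = (),
                exclude: Iterable[str] = (),
                include_quoted: Iterable[str] = (),
                exclude_quoted: Iterable[str] = (),
                ignore_case: bool = False) -> Callable[[str], bool]:
    """Build a predicate that says whether a line should be printed."""
    flags = re.IGNORECASE if ignore_case else 0
    # Quoted strings are literal substrings, so escape them before compiling.
    inc = [re.compile(p, flags) for p in include]
    inc += [re.compile(re.escape(s), flags) for s in include_quoted]
    exc = [re.compile(p, flags) for p in exclude]
    exc += [re.compile(re.escape(s), flags) for s in exclude_quoted]

    def keep(line: str) -> bool:
        # Excluding patterns take precedence over including ones.
        if any(r.search(line) for r in exc):
            return False
        # With no including patterns, every remaining line is kept.
        return not inc or any(r.search(line) for r in inc)

    return keep
```

For example, make_filter(include=["error", "(?i:warn)"], exclude_quoted=["[debug]"]) keeps "WARN disk full" but drops "error in [debug] mode", because the excluding substring wins.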
Debugging
- -w, --warn
-
Warn upon finding a log line with no timestamp.
- --die
-
Throw an error upon finding a log line with no timestamp.
Context
These options facilitate understanding matches by grouping them or providing log context.
- -b, --blank
-
Print a blank line between non-sequential matches. This is shorthand for --sep=''.
- --sep=string, --separator=string
-
Print the given separator between non-sequential matches.
- -C num, --context=num
-
Print up to the given number of non-matching lines before and after a match. This is equivalent to --before=num --after=num.
- -B num, --before=num
-
Print up to the given number of non-matching lines before a match.
- -A num, --after=num
-
Print up to the given number of non-matching lines after a match.
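How context lines and separators combine can be sketched as below. This is an illustrative Python sketch, not loggrep's implementation; with_context is a hypothetical name. It assumes "non-sequential" means that two printed groups of lines are not adjacent in the file.

```python
from typing import Callable, List, Optional, Sequence


def with_context(lines: Sequence[str],
                 is_match: Callable[[str], bool],
                 before: int = 0,
                 after: int = 0,
                 sep: Optional[str] = None) -> List[str]:
    """Return matching lines plus context, with sep between separate groups."""
    n = len(lines)
    keep = set()
    for i, line in enumerate(lines):
        if is_match(line):
            # Up to `before` lines ahead of the match and `after` lines behind it.
            keep.update(range(max(0, i - before), min(n, i + after + 1)))

    out: List[str] = []
    prev: Optional[int] = None
    for i in sorted(keep):
        # A gap in line numbers marks the start of a new group.
        if sep is not None and prev is not None and i != prev + 1:
            out.append(sep)
        out.append(lines[i])
        prev = i
    return out
```

Passing sep="" mirrors the effect described for --blank: an empty string printed on its own line appears as a blank line between groups.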
Overrides
These options allow one to provide alternative functionality when printing lines or parsing times. The code defined by these options is evaluated in its own package to prevent your accidentally changing the behavior of basic loggrep functionality. It provides no protection against deliberate perversity, of course, but if you can already run Perl code from the command line, why go to the trouble of doing perverse things inside loggrep?
All code is executed by default with strict mode and warnings off, and a "use vX" line is injected, where X represents the major and minor version numbers of the Perl running loggrep itself. This facilitates using modern Perl features.
- -t code, --time=code
-
Code to be used to convert a timestamp expression to a Unix timestamp. This code will see the pattern matched as the sole value in @_, and whatever it returns will be interpreted as the timestamp.
- -E code, --exec=code
-
Code to be used to convert a matched line into something to print. The values in @_ for this code will be the raw line, the line number, and whether it was a match. The last parameter allows one to distinguish contextual lines from match lines. Whatever this code returns will be printed.
- -M module, --module=module
-
Additional modules to be imported into the package in which user-provided code is evaluated. This option may be repeated.
Miscellaneous
ACKNOWLEDGEMENTS
Thanks go to Green River for letting me spend some time on this when I needed to create a utility to search a large log file quickly.
AUTHOR
David F. Houghton <dfhoughton@gmail.com>
COPYRIGHT AND LICENSE
This software is copyright (c) 2014 by David F. Houghton.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.