NAME

Log::Parallel::ApacheCLF - parse Apache Common Log Format

SYNOPSIS

use Log::Parallel::ApacheCLF;

my $parser = Log::Parallel::ApacheCLF->return_parser($fh, %info);

LOG PROCESSING CONFIG

    sources:
      - name:           raw apache server logs
        hosts:          host1.domain
        path:           /var/apache_archive/%YYYY%.%MM%.%DD%{,.bz2}
        format:         ApacheCLF
        valid_from:     2009-01-01
        valid_to:       yesterday
    jobs:
      - name:           server logs
        destination:    server logs
        source:         raw apache server logs
        path:           '%DATADIR%/%YYYY%/%MM%/%DD%/%JOBNAME%.%DURATION%.%BUCKET%.%SOURCE_BKT%'
        valid_from:     2008-01-01
        valid_to:       yesterday
        frequency:      daily
        output_format:  TSV
        use:            Log::Parallel::TSV Log::Parallel::ApacheCLF
        buckets:        20
        hosts:          host10,host11,host12,host13
        bucketizer:     $log->{server_time}

DESCRIPTION

Parses Apache web server logs in Common Log Format. The fields extracted from each log line are named as follows:

ip

The IP address header field. Sometimes -.

auth_user

The HTTP authenticated user.

server_time

The time, in Unix epoch seconds, at which the server wrote the log line.

request

The HTTP request line, e.g. GET / HTTP/1.0.

status

The HTTP status code. 200, 301, etc.

bytes_sent

The number of bytes transferred.

user_agent

The HTTP User-Agent header.

refferer

The HTTP Referer header field.
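To make the field list above concrete, here is a minimal, self-contained sketch that splits one log line into a hash using the same field names. It is not this module's implementation: the regular expression and the helpers parse_line and clf_time_to_epoch are hypothetical, written only to illustrate what each field holds. It assumes the line carries the two trailing quoted fields (referer and user agent) and treats them as optional.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MONTH = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
             Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);

# Hypothetical helper: convert an Apache timestamp such as
# "10/Oct/2000:13:55:36 -0700" to Unix epoch seconds.
sub clf_time_to_epoch {
    my ($stamp) = @_;
    my ($d, $mon, $y, $h, $m, $s, $sign, $oh, $om) =
        $stamp =~ m{^(\d{2})/(\w{3})/(\d{4}):(\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2})$}
        or return;
    return unless exists $MONTH{$mon};
    my $offset = ($oh * 3600 + $om * 60) * ($sign eq '-' ? -1 : 1);
    # timegm reads the fields as if they were UTC; subtracting the
    # zone offset converts the local wall-clock time to real UTC.
    return timegm($s, $m, $h, $d, $MONTH{$mon}, $y) - $offset;
}

# Hypothetical helper: split one log line into the documented fields.
# Returns a hash reference, or nothing if the line does not match.
sub parse_line {
    my ($line) = @_;
    my ($ip, $auth_user, $stamp, $request, $status, $bytes, $referer, $agent) =
        $line =~ m{^(\S+) \S+ (\S+) \[([^\]]+)\] "((?:[^"\\]|\\.)*)" (\d{3}) (\d+|-)(?: "((?:[^"\\]|\\.)*)" "((?:[^"\\]|\\.)*)")?\s*$}
        or return;
    return {
        ip          => $ip,
        auth_user   => $auth_user,
        server_time => scalar clf_time_to_epoch($stamp),
        request     => $request,
        status      => $status,
        bytes_sent  => $bytes,
        refferer    => $referer,   # key spelled as in the field list above
        user_agent  => $agent,
    };
}

my $line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
         . '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
         . '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"';

my $log = parse_line($line);
print "$_ = ", $log->{$_} // '(undef)', "\n" for sort keys %$log;
```

A line with no referer and user agent still matches under this sketch; those two keys are then undefined.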

This module can also be used to parse extended Apache log formats. Create a new module and have it call this one to do most of the parsing. Three extra construction arguments are available for this purpose:

pre_rx

A regular expression that matches whatever comes before the regular Apache log format on each line. If it contains saved matches (capture groups), the captured values are returned as an array: pre_match.

pre_rx_saved_match_count

If you supply a pre_rx that contains saved matches, you must set this argument to the number of saved matches; Log::Parallel::ApacheCLF will not work correctly without it.

post_rx

A regular expression that matches whatever comes after the regular Apache log format on each line. If it contains saved matches (capture groups), the captured values are returned as an array: post_match.
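The following self-contained sketch illustrates why a count of saved matches is needed when a prefix pattern is joined onto a core pattern. It does not call Log::Parallel::ApacheCLF and is not its implementation: make_matcher, $core_rx, and the core key are hypothetical, and the core pattern is a shortened stand-in. Only the argument names pre_rx, pre_rx_saved_match_count, and post_rx and the result keys pre_match and post_match come from the description above.

```perl
use strict;
use warnings;

# Shortened stand-in for the core log pattern: ip, auth_user, timestamp.
my $core_rx    = qr/(\S+) \S+ (\S+) \[([^\]]+)\]/;
my $core_count = 3;    # number of capture groups in $core_rx

# Hypothetical constructor: returns a closure that matches one line.
sub make_matcher {
    my (%args) = @_;
    my $pre_rx    = $args{pre_rx}                   // qr//;
    my $pre_count = $args{pre_rx_saved_match_count} // 0;
    my $post_rx   = $args{post_rx}                  // qr//;

    # The three patterns are joined into a single regular expression.
    my $full = qr/\A${pre_rx}${core_rx}${post_rx}\s*\z/;

    return sub {
        my ($line) = @_;
        my @m = $line =~ $full or return;
        # Captures come back as one flat list, so the prefix count is
        # the only way to tell where the core fields begin.
        my @pre  = splice(@m, 0, $pre_count);
        my @core = splice(@m, 0, $core_count);
        return { pre_match => \@pre, core => \@core, post_match => \@m };
    };
}

# Example: a virtual host name and a request duration precede the
# usual fields, and a byte count follows them.
my $match = make_matcher(
    pre_rx                   => qr/(\S+) (\d+\.\d+) /,
    pre_rx_saved_match_count => 2,
    post_rx                  => qr/ (\d+)/,
);

my $rec = $match->('vhost1.example.com 0.042 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] 512');
print "pre_match:  @{ $rec->{pre_match} }\n";
print "post_match: @{ $rec->{post_match} }\n";
```

Perl numbers capture groups by position across the whole combined pattern, so anything captured by the prefix shifts the position of every later group. In this sketch, whatever remains after the prefix and core captures are removed is the post_match list, which is why no separate count is needed for the suffix.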

LICENSE

This package may be used and redistributed under the terms of either the Artistic 2.0 or LGPL 2.1 license.