NAME

Catalyst::Plugin::Cache::HTTP - HTTP/1.1 cache validators for Catalyst

VERSION

Version 0.001000

SYNOPSIS

Load Plugin Into Application

  package MyApp;

  use Catalyst qw(Cache::HTTP);

Create a Last-Modified Header

  package MyApp::Controller::Foo;

  sub bar : Local {
    my ($self, $c) = @_;
    my $data = $c->model('MyApp::Model')->fetch_data;
    my $mtime = $data->mod_time;

    ...
    $c->response->headers->last_modified($mtime);
    ...
  }

Automatic Creation of ETag

  package MyApp::View::TT;

  use base 'Catalyst::View::TT';
  use MRO::Compat;
  use Digest::MD5 'md5_hex';

  sub process {
    my $self = shift;
    my $c = $_[0];

    # let Catalyst::View::TT render the page first
    $self->next::method(@_)
      or return 0;

    # only GET and HEAD responses get an ETag
    my $method = $c->request->method;
    return 1
      if ($method ne 'GET' and $method ne 'HEAD')
        or $c->stash->{nocache};    # disable caching explicitly

    my $body = $c->response->body;
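    # skip empty bodies and filehandle bodies, which cannot be digested here
    if (defined $body and length $body and not ref $body) {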
      utf8::encode($body)
        if utf8::is_utf8($body);
      $c->response->headers->etag(md5_hex($body));
    }

    return 1;
  }

DESCRIPTION

Ever since people began developing web sites, they have had to deal with the problems that arise when a site becomes popular. This is especially true for dynamic content. Optimizing the web application itself is usually followed by tweaking the system setup, buying better hardware, improving connectivity, clustering and load balancing. That is fine if the site yields enough profit to fund all this (and the people required to run it).

There are also numerous modules on the CPAN and helpful tips all over the World Wide Web about how to crack the whip on Catalyst applications.

What is often overlooked is that, more than a decade ago, the "fathers" of the WWW built concepts into HTTP/1.1 to reduce traffic between web server and web client (and proxy, where applicable). All common web browsers have supported these concepts for many years.

These concepts can accelerate a web application and save resources at the same time.

How is this possible? You can look up the concept in RFC 2616 section 13.3, plus the implementation in sections 14.19 (ETag), 14.24 (If-Match), 14.25 (If-Modified-Since), 14.26 (If-None-Match), 14.28 (If-Unmodified-Since) and 14.29 (Last-Modified). To cut a long story short: this plugin does not manage any cache on the server; it avoids transmitting data where possible.

To use this concept in your Catalyst-based application, a few rather small additions have to be made to the code:

1. Use the plugin

This is easy: in the application class (often referred to as MyApp.pm), just add Cache::HTTP to the list of plugins after use Catalyst.

2. Add appropriate response headers

Those headers are Last-Modified and ETag. The headers method of Catalyst::Response, which returns an instance of HTTP::Headers, provides two handy accessors for these header lines: last_modified and etag.

2.1 $c->response->headers->last_modified($unix_timestamp)

If this header exists in the response for a requested resource, then on the next request for the same resource a modern web browser adds an If-Modified-Since line to the request headers, asking whether the resource data has changed since the Last-Modified date given with the previous response. If the server answers with status code 304 and an empty body, the browser takes the data for this resource from its local cache.
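The date comparison behind this exchange can be sketched in plain Perl. This is only an illustration of the protocol logic, not the plugin's code: the helper names http_date, parse_http_date and status_for are made up for this example, the timestamp is arbitrary, and only the RFC 1123 date format is handled.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my @DAYS   = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my %MONTH_INDEX = map { $MONTHS[$_] => $_ } 0 .. $#MONTHS;

# Format a Unix timestamp as an HTTP date (always GMT, one-second resolution).
sub http_date {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($epoch);
    return sprintf '%s, %02d %s %04d %02d:%02d:%02d GMT',
        $DAYS[$wday], $mday, $MONTHS[$mon], $year + 1900, $hour, $min, $sec;
}

# Parse an RFC 1123 HTTP date back into a Unix timestamp; undef if malformed.
sub parse_http_date {
    my ($str) = @_;
    my ($mday, $mon, $year, $hour, $min, $sec) =
        $str =~ /^\w{3}, (\d{2}) (\w{3}) (\d{4}) (\d{2}):(\d{2}):(\d{2}) GMT$/
        or return;
    return unless exists $MONTH_INDEX{$mon};
    # timegm croaks on out-of-range values, so trap that and return undef
    return eval { timegm($sec, $min, $hour, $mday, $MONTH_INDEX{$mon}, $year) };
}

# 304 if the resource has not changed since the date the client sent in
# If-Modified-Since, otherwise 200 (send the full body).
sub status_for {
    my ($mtime, $if_modified_since) = @_;
    return 200 unless defined $if_modified_since;
    my $since = parse_http_date($if_modified_since);
    return 200 unless defined $since;    # unparsable date: send the body
    return $mtime <= $since ? 304 : 200;
}

my $mtime  = 784111777;                  # hypothetical modification time
my $header = http_date($mtime);
print "Last-Modified: $header\n";        # Last-Modified: Sun, 06 Nov 1994 08:49:37 GMT

my $unchanged = status_for($mtime,      $header);
my $changed   = status_for($mtime + 60, $header);
print "$unchanged $changed\n";           # 304 200
```

The first response carries the Last-Modified line; the browser echoes the same date back in If-Modified-Since, and the comparison yields 304 as long as the resource's modification time has not moved past it.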

2.2 $c->response->headers->etag($entity_tag)

The entity tag is a unique representation of the data of a resource. Usually a digest of the response body serves well for this purpose, so in that case you may read "checksum" wherever it says "ETag". If an ETag exists in the response for a requested resource, then on the next request for the same resource the browser adds an If-None-Match line carrying that ETag to the request headers, which tells the server to transmit the body only if the ETag for the resource has changed. If it has not, the server responds with status code 304 and an empty body, and the browser takes the data for this resource from its local cache.
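A minimal sketch of this comparison, using only core modules. The helper names etag_for and if_none_match_hits are hypothetical and not part of the plugin. The example adds the double quotes that HTTP requires around an entity tag itself; whether HTTP::Headers or the plugin would do that for you is not assumed here.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Build a quoted entity tag from the response body (a checksum of its bytes).
sub etag_for {
    my ($body) = @_;
    utf8::encode($body) if utf8::is_utf8($body);    # md5_hex needs bytes
    return '"' . md5_hex($body) . '"';
}

# True if the If-None-Match request header lists the current entity tag.
# Splitting on commas is safe here because hex digests contain none.
sub if_none_match_hits {
    my ($etag, $header) = @_;
    return 0 unless defined $header;
    for my $candidate (split /,/, $header) {
        $candidate =~ s/^\s+|\s+$//g;
        return 1 if $candidate eq '*';    # "*" matches any existing entity
        $candidate =~ s{^W/}{};           # weak comparison: ignore the W/ prefix
        return 1 if $candidate eq $etag;
    }
    return 0;
}

my $etag = etag_for('hello');
print "ETag: $etag\n";    # ETag: "5d41402abc4b2a76b9719d911017c592"

# The browser echoes the tag back; an unchanged body yields 304.
my $status = if_none_match_hits($etag, $etag) ? 304 : 200;
print "$status\n";        # 304
```

If the body changes, its digest changes with it, the comparison fails, and the full response is sent with status 200.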

CAVEATS

Using this concept involves the risk of breaking something!

The Last-Modified header in particular has some flaws:

First of all, its accuracy cannot be better than the HTTP time resolution: one second.

But what is really hazardous is trying to calculate a last_modified timestamp for dynamic pages.

As a rough rule of thumb, never use last_modified when

  • the results are joined from multiple sources,

  • the output depends on input parameters.

Hence Last-Modified is ideal for serving data unchanged (e.g. images) or for an RSS feed, where Last-Modified is the time of the latest entry.
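For the feed case, the timestamp could be derived along these lines. This is a sketch under assumptions: the entry structure with a published field holding a Unix timestamp, and the helper name feed_last_modified, are invented for illustration.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Return the newest publication time among the feed entries,
# or undef for an empty feed (in which case no header should be set).
sub feed_last_modified {
    my (@entries) = @_;
    return max map { $_->{published} } @entries;
}

my @entries = (
    { title => 'First',  published => 1_230_768_000 },
    { title => 'Second', published => 1_230_854_400 },
);

my $mtime = feed_last_modified(@entries);
print "$mtime\n";    # 1230854400
```

In a controller action the result would then be passed on with $c->response->headers->last_modified($mtime), guarded by a check that $mtime is defined.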

An ETag header that is calculated as a checksum of the actual response body is much more robust in general. The only real drawback is that calculating this checksum costs a few CPU cycles. The "SYNOPSIS" at the top shows an example of how to create this ETag header automatically.
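The difference in robustness shows up as soon as a resource changes twice within one second. The following sketch uses made-up data: both versions share the same modification time, so Last-Modified cannot tell them apart, while a body checksum can.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Two versions of a resource, both saved within the same second
# (timestamp and bodies are invented for this example).
my @versions = (
    { mtime => 1_230_768_000, body => "stock: 10\n" },
    { mtime => 1_230_768_000, body => "stock: 9\n"  },
);

my $same_last_modified = $versions[0]{mtime} == $versions[1]{mtime};
my $same_etag = md5_hex($versions[0]{body}) eq md5_hex($versions[1]{body});

print "Last-Modified identical: " . ($same_last_modified ? "yes" : "no") . "\n";    # yes
print "ETag identical: "          . ($same_etag          ? "yes" : "no") . "\n";    # no
```

A client revalidating with If-Modified-Since alone would receive 304 for the second version and keep showing stale data; revalidating with If-None-Match would fetch the new body.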

INTERNAL METHODS

finalize_headers

This hooks into the chain of finalize_headers methods and checks the request headers If-Match, If-Unmodified-Since, If-None-Match and If-Modified-Since as well as the response headers ETag and Last-Modified. It sets the response status code to 304 Not Modified if those fields indicate that the data for the resource has not changed since the last request from the same client, so that the client will use a locally cached copy of the resource data.
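The kind of decision such a hook has to make can be sketched as a pure function. This is not the plugin's source: conditional_status and its hash keys are hypothetical, the validators are assumed to be parsed already (entity tags as strings, times as Unix timestamps), and If-Match and If-Unmodified-Since are left out for brevity. Following RFC 2616, the sketch answers 304 only when every condition it can evaluate agrees that nothing has changed.

```perl
use strict;
use warnings;

# Decide the status for a GET or HEAD request.
# $req holds the client's conditions, $res the validators of the response.
sub conditional_status {
    my ($req, $res) = @_;
    my @unchanged;    # one verdict per condition the server can evaluate

    if (defined $req->{if_none_match} and defined $res->{etag}) {
        my $hits = grep { $_ eq '*' or $_ eq $res->{etag} }
                   @{ $req->{if_none_match} };
        push @unchanged, $hits;
    }
    if (defined $req->{if_modified_since} and defined $res->{last_modified}) {
        push @unchanged, $res->{last_modified} <= $req->{if_modified_since};
    }

    # Without any evaluated condition this is an ordinary request.
    return 200 unless @unchanged;

    # 304 only if no condition reports a change.
    my $changed = grep { !$_ } @unchanged;
    return $changed ? 200 : 304;
}

my $status = conditional_status(
    { if_none_match => ['"abc"'], if_modified_since => 1_230_768_000 },
    { etag => '"abc"', last_modified => 1_230_768_000 },
);
print "$status\n";    # 304
```

Requiring agreement between both validators errs on the side of sending the body, which costs bandwidth but never leaves the client with stale data.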

CONFIGURATION

None.

SEE ALSO

Catalyst, http://www.ietf.org/rfc/rfc2616.txt

AUTHOR

Bernhard Graf <graf(a)cpan.org>

BUGS

Please report any bugs or feature requests to bug-catalyst-plugin-cache-http at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Catalyst-Plugin-Cache-HTTP. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

COPYRIGHT & LICENSE

Copyright 2009 Bernhard Graf.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.