NAME

Catalyst::Plugin::PageCache - Cache the output of entire pages

SYNOPSIS

use Catalyst;
MyApp->setup( qw/Cache::FileCache PageCache/ );

__PACKAGE__->config(
    'Plugin::PageCache' => {
        expires => 300,
        set_http_headers => 1,
        auto_cache => [
            '/view/.*',
            '/list',
        ],
        debug => 1,

        # Optionally, a cache hook method to be called prior to dispatch to
        # determine if the page should be cached.  This is called both
        # before dispatch, and before finalize.
        cache_hook => 'some_method',

        # You may alternatively set different methods to be used as hooks
        # for dispatch and finalize. The dispatch method will determine
        # whether the currently cached page will be displayed to the user,
        # and the finalize hook will determine whether to save the newly
        # created page.
        cache_dispatch_hook => 'some_method_for_dispatch',
        cache_finalize_hook => 'some_method_for_finalize',
    }
);

sub some_method {
    my $c = shift;
    if ( $c->user_exists and $c->user->some_field ) {
        return 0; # Don't cache
    }
    return 1; # Cache
}

# in a controller method
$c->cache_page( '3600' );

$c->clear_cached_page( '/list' );

# Expire at a specific time
$c->cache_page( $datetime_object );


# Fine control
$c->cache_page(
    last_modified   => $last_modified,
    cache_seconds   => 24 * 60 * 60,    # once a day
    expires         => 300,             # allow client caching
);

DESCRIPTION

Many dynamic websites perform heavy processing on most pages, yet this information may rarely change from request to request. Using the PageCache plugin, you can cache the full output of different pages so they are served to your visitors as fast as possible. This method of caching is very useful for withstanding a Slashdotting, for example.

This plugin requires that you also load a Cache plugin. Please see the Known Issues when choosing a cache backend.

WARNINGS

PageCache should be placed at the end of your plugin list.

You should only use the page cache on pages which have NO user-specific or customized content. Also, be careful if caching a page which may forward to another controller. For example, if you cache a page behind a login screen, the logged-in version may be cached and served to unauthenticated users.

Note that pages that result from POST requests will never be cached.

PERFORMANCE

On my Athlon XP 1800+ Linux server, a cached page is served in 0.008 seconds when using the HTTP::Daemon server and any of the Cache plugins.

CONFIGURATION

Configuration is optional. You may define the following configuration values:

expires => $seconds

This will set the default expiration time for all page caches. If you do not specify this, expiration defaults to 300 seconds (5 minutes).

cache_headers => 1

Enable this value if you need your cached responses to include custom HTTP headers set by your application. This may be necessary if you operate behind an edge cache such as Akamai. This option is disabled by default.

set_http_headers => 1

Enabling this value will cause Catalyst to set the correct HTTP headers to allow browsers and proxy servers to cache your page. This will further reduce the load on your server. The headers are set in such a way that the browser/proxy cache will expire at the same time as your cache. The Last-Modified header will be preserved if you have already specified it. This option is disabled by default.

auto_cache => [
    $uri,
]

To automatically cache certain pages, or all pages, you can specify auto-cache URIs as an array reference. Any controller within your application that matches one of the auto_cache URIs will be cached using the default expiration time. URIs may be specified as absolute paths, such as '/list', or as regular expressions, such as '/view/.*'.

disable_index => 1

To support the "clear_cached_page" method, PageCache attempts to keep an index of all cached pages. This adds overhead by performing extra cache reads and writes to maintain the (possibly very large) page index. It's also not reliable; see "KNOWN ISSUES".

If you don't intend to use clear_cached_page, you should enable this config option to avoid the overhead of creating and updating the cache index. This option is currently disabled (i.e. the page index is enabled) by default but that may change in a future release.

index_page_key => '...'

The key string used for the index. Defaults to a string that includes the name of the Catalyst app class.

busy_lock => 10

On a high traffic site where page re-generation may take many seconds, a common problem encountered is the "dog-pile" effect, where many concurrent connections all hit a page where the cache has expired and all perform the same expensive operation to rebuild the cache. To prevent this situation, you can set the busy_lock option to the maximum number of seconds any of your pages can be expected to take to rebuild the cache. Then, when the cache expires, the first request will rebuild the cache while also extending the expiration time by the number of seconds specified, allowing other requests that arrive before the cache has been rebuilt to use the previously cached page. This option is disabled by default.
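The busy_lock behavior described above can be sketched in plain Perl. This is a simplified simulation, not the plugin's own code: the %cache hash stands in for the cache backend, and lookup_page, its arguments, and its return values are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;

# In-memory stand-in for the cache backend (hypothetical, for illustration).
my %cache;

# Returns ( $status, $body ), where $status is 'miss', 'hit' or 'rebuild'.
sub lookup_page {
    my ( $key, $now, $busy_lock ) = @_;

    my $entry = $cache{$key};
    return ( 'miss', undef ) unless $entry;

    if ( $entry->{expire_time} <= $now ) {
        if ( $busy_lock > 0 ) {
            # Push the expiry forward so concurrent requests keep
            # receiving the stale copy while this caller rebuilds.
            $entry->{expire_time} = $now + $busy_lock;
        }
        else {
            # No busy lock: drop the entry, so every caller rebuilds.
            delete $cache{$key};
        }
        return ( 'rebuild', undef );
    }

    return ( 'hit', $entry->{body} );
}

# A page that expired at time 100, with busy_lock set to 10 seconds.
$cache{'/list'} = { body => 'old page', expire_time => 100 };
my ($first)  = lookup_page( '/list', 101, 10 );    # this caller rebuilds
my ($second) = lookup_page( '/list', 102, 10 );    # this one gets the stale page
```

Only the first request after expiry does the expensive work; requests that arrive within the next busy_lock seconds are answered from the previously cached copy.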

debug => 1

This will print additional debugging information to the Catalyst log. You will need to have -Debug enabled to see these messages.

auto_check_user => 1

If this option is enabled, automatic caching is disabled for logged-in users, i.e., if the app class has a user_exists() method and it returns true.

cache_hook => 'cache_hook_method'
cache_finalize_hook => 'cache_finalize_hook_method'
cache_dispatch_hook => 'cache_dispatch_hook_method'

Calls a method on the application that is expected to return a true or false value. This method is called both before dispatch and before finalize, so you can short-circuit the PageCache behavior. As an example, if you want to disable PageCache while running under debug mode:

package MyApp;

...

sub cache_hook_method { return !shift->debug; }    # false (no caching) under debug

Or, if you want to not cache for certain roles, say "admin":

sub cache_hook_method {
    my ( $c ) = @_;
    return !$c->check_user_roles('admin');
}

Note that this is called BEFORE auto_check_user, so you have more flexibility in deciding what to do for users who are not logged in.

To override the generation of page keys:

__PACKAGE__->config(
    'Plugin::PageCache' => {
        key_maker => sub {
            my $c = shift;
            return $c->req->base . '/' . $c->req->path;
        }
    }
);

key_maker can also be the name of a method, which will be invoked as $c->$key_maker.
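As a sketch of the method-name form, the following defines a key maker on the application class. The method name my_page_key and its key format are hypothetical; the only assumption about Catalyst is that $c->req->path returns the request path.

```perl
package MyApp;
use strict;
use warnings;

# Hypothetical key maker: builds the key from the lower-cased request
# path only, so /List and /list share one cache entry.
sub my_page_key {
    my $c = shift;
    return '/' . lc( $c->req->path );
}

# Matching configuration fragment, referring to the method by name:
#
# __PACKAGE__->config(
#     'Plugin::PageCache' => {
#         key_maker => 'my_page_key',
#     }
# );
```

Note that a key built this way ignores the query string, so requests that differ only in their parameters would share a single cached page.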

In most cases you would use a single cache_hook method for consistency.

It is possible to achieve background refreshing of content by disabling caching in cache_dispatch_hook and enabling caching in cache_finalize_hook for a specific IP address (say 127.0.0.1).

A cron job that runs wget "http://localhost/foo.html" would cause the content to be generated fresh and cached for future viewers. This is useful for content which takes a very long time to build, or for pages which should be refreshed at a specific time, such as content that always rolls over at midnight.
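The background-refresh setup described above might look roughly like this. The hook names and the $REFRESHER_IP variable are hypothetical, and the sketch assumes the client address is available as $c->req->address. Letting the finalize hook store pages for every client is one possible reading; it could equally be restricted to the refreshing address.

```perl
package MyApp;
use strict;
use warnings;

# Hypothetical address of the machine running the cron job.
my $REFRESHER_IP = '127.0.0.1';

# Dispatch hook: a false value means the cached copy is not served,
# so requests from the refresher always regenerate the page.
sub refresh_dispatch_hook {
    my $c = shift;
    return $c->req->address eq $REFRESHER_IP ? 0 : 1;
}

# Finalize hook: a true value means the generated page is stored.
sub refresh_finalize_hook {
    my $c = shift;
    return 1;
}

# Matching configuration fragment:
#
# __PACKAGE__->config(
#     'Plugin::PageCache' => {
#         cache_dispatch_hook => 'refresh_dispatch_hook',
#         cache_finalize_hook => 'refresh_finalize_hook',
#     }
# );
```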

METHODS

cache_page

Call cache_page in any controller method you wish to be cached.

$c->cache_page( $expire );

The page will be cached for $expire seconds. Every user who visits the URI(s) referenced by that controller will receive the page directly from cache. Your controller will not be processed again until the cache expires. You can set this value to a low value, such as 60 seconds, if you have heavy traffic, to greatly improve site performance.

Pass in a DateTime object to make the cache expire at a given point in time.

my $two_hours = DateTime->now->add( hours => 2 );
$c->cache_page( $two_hours );

The page will be stored in the page cache until this time.

If set_http_headers is set, the Expires and Cache-Control headers will also be set to expire at the given date.

Pass in a list or hash reference for finer control.

$c->cache_page(
    last_modified   => $last_modified,
    cache_seconds   => 24 * 60 * 60,
    expires         => 30,
);

This allows separate control of the page cache and the header cache values sent to the client.

Possible options are:

cache_seconds

This is the number of seconds to keep the page in the page cache, which may be different (normally longer) than the time that client caches may store the page. This is the value set when only a single parameter is passed.

expires

This is the length of time in seconds that a client may cache the page before revalidating (by asking the server if the document has changed).

Unlike cache_seconds, this is a fixed setting that each client will see. Regardless of how much longer the page will remain in the page cache, the client still sees the same expires time.

Setting zero (0) for expires will result in the page being cached, but headers will be sent telling the client not to cache the page. This allows you to cache content that is expensive to generate while ensuring that any changes are seen right away.

last_modified

Last modified time in epoch seconds. If not set, the current Last-Modified header is used; if that header is not set either, the current time is used.
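As an illustration of the options above, this sketch caches an expensive page on the server for an hour while telling clients not to cache it. The action name expensive_report is hypothetical, and the sub is written without Catalyst action attributes so that it stands alone; in a real controller it would be an ordinary action.

```perl
use strict;
use warnings;

# Hypothetical controller action body; $c is the Catalyst context.
sub expensive_report {
    my ( $self, $c ) = @_;

    $c->cache_page(
        cache_seconds => 60 * 60,    # keep in the page cache for one hour
        expires       => 0,          # tell clients not to cache the page
    );

    return;
}
```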

clear_cached_page

To clear the cached pages for a URI, you may call clear_cached_page.

$c->clear_cached_page( '/view/userlist' );
$c->clear_cached_page( '/view/.*' );
$c->clear_cached_page( '/view/.*\?.*\bparam=value\b' );

This method takes an absolute path or regular expression. Returns the number of matching entries found in the index. A warning will be generated if the page index is disabled (see "CONFIGURATION").

The argument is matched against all the keys in the page index. The page index keys include the request path and query string, if any.

Typically you'd call this from a different controller than the cached controller. You may for example wish to build an admin page that lets you clear page caches.
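To make the matching described above concrete, here is a small standalone illustration. The index keys are hypothetical, and anchoring the pattern at both ends is an assumption of this sketch; the plugin's actual matching may normalize or anchor keys differently.

```perl
use strict;
use warnings;

# Hypothetical index keys: request path plus query string, if any.
my @index_keys = (
    '/view/userlist',
    '/view/item?id=7',
    '/view/item?id=7&param=value',
    '/list',
);

# Assumption: the argument is applied as a regex anchored at both ends.
sub matching_keys {
    my ( $pattern, @keys ) = @_;
    return grep { $_ =~ /^$pattern$/ } @keys;
}

# The three forms shown in the examples above.
my @exact   = matching_keys( '/view/userlist',               @index_keys );
my @by_path = matching_keys( '/view/.*',                     @index_keys );
my @by_arg  = matching_keys( '/view/.*\?.*\bparam=value\b', @index_keys );
```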

INTERNAL EXTENDED METHODS

dispatch

dispatch decides whether or not to serve a particular request from the cache.

finalize

finalize caches the result of the current request if needed.

setup

setup initializes all default values.

I18N SUPPORT

If your application uses Catalyst::Plugin::I18N for localization, a separate cache key will be used for each language a page is displayed in.

KNOWN ISSUES

The page index, used to support the "clear_cached_page" method, is unreliable because it uses a read-modify-write approach, which will lose data if more than one process attempts to update the page index at the same time.

It is not currently possible to cache pages served from the Static plugin. If you're concerned enough about performance to use this plugin, you should be serving static files directly from your web server anyway.

Cache::FastMmap does not have the ability to specify different expiration times for cached data. Therefore, if your MyApp->config->{cache}->{expires} value is set to anything other than 0, you may experience problems with the clear_cached_page method, because the cache index may be removed. For best results, you may wish to use Cache::FileCache or Cache::Memcached as your cache backend.

SEE ALSO

Catalyst, Catalyst::Plugin::Cache::FastMmap, Catalyst::Plugin::Cache::FileCache, Catalyst::Plugin::Cache::Memcached

AUTHOR

Andy Grundman, <andy@hybridized.org>

THANKS

Bill Moseley, <mods@hank.org>, for many patches and tests.

Roberto Henríquez, <roberto@freekeylabs.com>, for i18n support.

COPYRIGHT

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
