NAME

CGI::Application::Plugin::PageLookup - Database driven model framework for CGI::Application

VERSION

Version 1.7

DESCRIPTION

A model component for CGI::Application built around a table that has one row for each page and that provides support for multiple languages and the 'dot' notation in templates.

SYNOPSIS

    package MyCGIApp;
    use base qw(CGI::Application);
    use CGI::Application::Plugin::PageLookup qw(:all);

    # Anything but the simplest usage depends on "dot" notation.
    use HTML::Template::Pluggable; 
    use HTML::Template::Plugin::Dot;

    sub cgiapp_init {
        my $self = shift;

        # pagelookup depends on CGI::Application::Plugin::DBH
        $self->dbh_config(......); # whatever arguments are appropriate
	
        $self->html_tmpl_class('HTML::Template::Pluggable');

        $self->pagelookup_config(

		# prefix defaults to 'cgiapp_'.
		prefix => 'mycgiapp_',

		# load smart dot-notation objects
		objects => 
		{
			# Support for TMPL_LOOP
			loop => 'CGI::Application::Plugin::PageLookup::Loop',

			# Decoupling external and internal representations of URLs
			href => 'CGI::Application::Plugin::PageLookup::Href',

			# Page specific and site wide parameters
			value => 'CGI::Application::Plugin::PageLookup::Value',

			# We have defined a MyCGIApp::create_custom_object method
			method => 'create_custom_object',

			# We can also handle CODE refs
			callback => sub {
				my $self = shift;
				my $page_id = shift;
				my $template = shift;
				........  
			}

		},

		# remove certain fields before sending the parameters to the template.
		remove =>
		[
			'custom_col1',
			'priority'
		],

		xml_sitemap_base_url => 'http://www.mytestsite.org'

	);

    }

    sub create_custom_object {
	my $self = shift;
	my $page_id = shift;
	my $template = shift;
	my $name = shift;
	return ........... # smart object that can be used for dot notation
    }

    sub setup {
        my $self = shift;

        $self->run_modes({
		'pagelookup'  => 'pagelookup_rm',
		'xml_sitemap' => 'xml_sitemap_rm',
		'extra_stuff' => 'extra_stuff'
	});
	............
    }

    sub extra_stuff {
	my $self = shift;

	# do page lookup
        my $template_obj = $self->pagelookup($page_id,
					handle_notfound=>0, # force function to return undef if page not found
					objects=> ....); #  but override config for this run mode alone.

	return $self->notfound($page_id) unless $template_obj;

	# More custom stuff
	$template_obj->param( .....);

        return $template_obj->output;
 
    }

DATABASE

Something like the following schema is assumed. In general, each column in these tables corresponds to a template parameter that needs to be on every page of the website, and each row in the join corresponds to a page on the website. The exact types are not required and can be changed, but these are the recommended values.

The lang and internalId columns combined should be as unique as the pageId column. They are used to link the different language versions of the same page, and also to link a page with nearby pages in the same language. The lang column is used to join the cgiapp_pages and cgiapp_lang tables. The lang and collation fields expect to find some template structure like this:

    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="<TMPL_VAR NAME="lang">-<TMPL_VAR NAME="collation">"> ... </html>

The priority, lastmod and changefreq columns are used in XML sitemaps as defined by http://www.sitemaps.org/protocol.php. The changefreq field is also used in setting the expiry header. Since these fields are not expected to be in general usage, by default they are deleted just before being sent to the template. The lineage and rank columns are used by menu/sitemap functionality and together should be unique.

Table: cgiapp_structure

Field        Type                                                                Null Key  Default Extra
------------ ------------------------------------------------------------------- ---- ---- ------- -----
internalId   unsigned numeric(10,0)                                              NO   PRI  NULL
template     varchar(20)                                                         NO        NULL
lastmod      date                                                                NO        NULL
changefreq   enum('always','hourly','daily','weekly','monthly','yearly','never') NO        NULL
priority     decimal(3,3)                                                        YES       NULL
lineage      varchar(255)                                                        NO   UNI  NULL
rank         unsigned numeric(10,0)                                              NO   UNI  NULL

Table: cgiapp_pages

Field        Type                                                                Null Key  Default Extra
------------ ------------------------------------------------------------------- ---- ---- ------- -----
pageId       varchar(255)                                                        NO   UNI  NULL
lang         varchar(2)                                                          NO   PRI  NULL
internalId   unsigned numeric(10,0)                                              NO   PRI  NULL

+ any custom columns that the web application might require.

Table: cgiapp_lang

Field        Type                                                                Null Key  Default Extra
------------ ------------------------------------------------------------------- ---- ---- ------- -----
lang         varchar(2)                                                          NO   PRI  NULL
collation    varchar(2)                                                          NO        NULL

+ any custom columns that the web application might require.
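
To make the relationship between the three tables concrete, here is a minimal sketch of a join that would yield one row per page. It is an illustration only and is not the SQL the plugin itself generates; the helper name sketch_page_sql and the table aliases are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: NOT the plugin's own SQL, just one way the three
# tables in the schema above could be joined to fetch a single page.
sub sketch_page_sql {
    my ($prefix) = @_;
    $prefix = 'cgiapp_' unless defined $prefix;    # documented default prefix
    return join ' ',
        "SELECT s.*, p.*, l.*",
        "FROM ${prefix}pages p",
        "JOIN ${prefix}structure s ON s.internalId = p.internalId",
        "JOIN ${prefix}lang l ON l.lang = p.lang",
        "WHERE p.pageId = ?";
}

my $sql = sketch_page_sql('mycgiapp_');
print "$sql\n";
```

The pages table sits in the middle: internalId links it to the structure table and lang links it to the language table, which is why each row of the join corresponds to one page in one language.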

EXPORT

These functions can be optionally imported into the CGI::Application or related namespace.

pagelookup_config
pagelookup_get_config
pagelookup_set_charset
pagelookup_prefix
pagelookup_sql
pagelookup
pagelookup_notfound
pagelookup_set_expiry
pagelookup_default_lang
pagelookup_404
pagelookup_msg_param
pagelookup_rm
xml_sitemap_rm
xml_sitemap_sql
xml_sitemap_base_url

Use the tag :all to export all of them.

FUNCTIONS

pagelookup_config

This function defines the default behaviour of the plugin, though this can be overridden for specific runmodes. The possible arguments are as follows:

prefix

This sets the prefix used in the database schema. It defaults to 'cgiapp_'.

handle_notfound

If set (which it is by default), the pagelookup function will return the result of calling pagelookup_notfound when a page lookup fails. If not set, the run mode must handle page lookup failures itself; it can detect them because the pagelookup function will return undef.

expiry

If set (which it is by default), the pagelookup function will set the appropriate expiry header based upon the changefreq column.

remove

This points to an array ref of fields that are not expected to be required by the template. It defaults to template, pageId, internalId and changefreq.
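
The effect of the remove option can be pictured as deleting the listed keys from the hash of page data before it reaches the template. The following is a minimal sketch of that idea, not the plugin's own code; the row contents (including the custom title column) are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical row as it might come back from the page lookup.
my %row = (
    pageId     => 'en/home',
    internalId => 1,
    template   => 'home.tmpl',
    changefreq => 'weekly',
    lang       => 'en',
    title      => 'Home',    # assumed custom column
);

# The documented default for the remove option.
my @remove = qw(template pageId internalId changefreq);

# A hash-slice delete drops every listed field in one step.
delete @row{@remove};

print join(',', sort keys %row), "\n";    # prints "lang,title"
```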

objects

This points to a hash ref. Each key is a parameter name (up to the dot). The value is something that defines a smart object as described in HTML::Template::Plugin::Dot. The point about a smart object is that it usually defines an AUTOLOAD function, so if the template has <TMPL_VAR NAME="object.getcarter"> and the pagelookup_config has mapped object to some object $MySmartObject, then the method $MySmartObject->getcarter() will be called. Alternatively there may be no AUTOLOAD function, but the smart object may have methods that take additional arguments. This way the template can be much more decoupled from the structure of the database.

There are three ways a smart object can be defined. Firstly, if the value is a CODE ref, then the ref is passed 1.) the reference to the CGI::Application object; 2.) the page id; 3.) the template; 4.) the parameter name; 5.) any argument overrides. Otherwise, if the CGI::Application object has a method with the value as its name, then that method is called with the same arguments. Failing that, the value is assumed to be the name of a module, and the new constructor of that module is called with the same arguments. A typical smart object might be coded as follows:

    package MySmartObject;

    # AUTOLOAD stores the name of the called method in this package variable.
    our $AUTOLOAD;

    sub new {
        my $class = shift;
        my $self = .....
        ......
        return bless $self, $class;
    }

    # If you do not have this, then HTML::Template::Plugin::Dot will not know that you can!
    # [Note really can is supposed to return a subroutine ref, but this works in this context.]
    sub can { return 1; }

    # This is the function that actually produces the value to be inserted into the template.
    sub AUTOLOAD {
        my $self = shift;
        my $method = $AUTOLOAD;
        if ($method =~ /^MySmartObject::(.+)$/) {
            $method = $1;	# Now we have what is in the template.
        }
        else {
            ....
        }
        .....
        return $value;
    }

Note that the smart object does not have access to the HASH ref of page data. That data is still changing at the point where the object would use it, so what the object would see is non-deterministic.
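
The following self-contained sketch pulls the pieces above together: a complete smart object, plus a resolver that mirrors the three documented ways of defining one. It is an illustration under stated assumptions and not the plugin's implementation: Demo::SmartObject, Demo::App and resolve_object are hypothetical names, the argument overrides are omitted, and the object simply echoes back what the template asked for.

```perl
use strict;
use warnings;

package Demo::SmartObject;    # hypothetical module name

our $AUTOLOAD;

# The constructor receives the arguments described above; only the
# parameter name is kept in this sketch.
sub new {
    my ($class, $app, $page_id, $template, $name) = @_;
    return bless { name => $name }, $class;
}

sub can { return 1 }    # claim to support any method, as in the example above
sub DESTROY { }         # keep object destruction out of AUTOLOAD

# <TMPL_VAR NAME="value.getcarter"> would end up here as getcarter.
sub AUTOLOAD {
    my $self = shift;
    (my $method = $AUTOLOAD) =~ s/\A.*:://;    # strip the package name
    return "$self->{name}.$method";
}

package Demo::App;      # hypothetical stand-in for the CGI::Application subclass

sub new { return bless {}, shift }

sub create_custom_object {
    my ($self, @args) = @_;
    return Demo::SmartObject->new($self, @args);
}

package main;

# Hypothetical resolver: CODE ref first, then application method, then module.
sub resolve_object {
    my ($app, $value, @args) = @_;
    return $value->($app, @args) if ref $value eq 'CODE';
    return $app->$value(@args)   if $app->can($value);
    return $value->new($app, @args);
}

my $app  = Demo::App->new;
my @args = ('en/home', undef, 'value');    # page id, template, parameter name

my $from_code   = resolve_object($app, sub { Demo::SmartObject->new(@_) }, @args);
my $from_method = resolve_object($app, 'create_custom_object', @args);
my $from_module = resolve_object($app, 'Demo::SmartObject', @args);

print $from_module->getcarter, "\n";    # prints "value.getcarter"
```

Defining an empty DESTROY is a common precaution with AUTOLOAD, since otherwise object destruction would also be routed through it.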

charset

This is a string defining the character encoding. This defaults to 'utf-8'.

template_params

This is a hash ref containing additional parameters that are to be passed to the load_tmpl function.

default_lang

This is a two-letter language code and defaults to 'en'. It is used when creating a notfound page when a language cannot otherwise be guessed.

status_404

This is the internal id corresponding to the not found page.

msg_param

This is the parameter used to store error messages.

xml_sitemap_base_url

This is the URL for the whole site. It is mandatory to set this if you want XML sitemaps (which you should).

pagelookup_get_config

Returns the config, including any overrides passed in as arguments.

pagelookup_set_charset

This function sets the character set based upon the config.

pagelookup_prefix

This function returns the prefix that is used on the database for all the tables. The prefix can of course be overridden.

pagelookup_sql

This function returns the SQL that is used to look up a specific page. It takes a single argument, which is usually expected to be a pageId. The argument may also be a HASH ref having two fields: internalId and lang.
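
The two argument forms can be thought of as selecting different lookup conditions. The sketch below shows that idea only; sketch_where is a hypothetical helper and the clauses are assumptions, not the SQL that pagelookup_sql actually returns.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the plugin: choose a WHERE clause and
# bind values according to the form of the page identifier.
sub sketch_where {
    my ($page_id) = @_;
    if (ref $page_id eq 'HASH') {
        return ('internalId = ? AND lang = ?',
                $page_id->{internalId}, $page_id->{lang});
    }
    return ('pageId = ?', $page_id);
}

my ($by_id_clause,  @by_id_binds)  = sketch_where('en/home');
my ($by_key_clause, @by_key_binds) = sketch_where({ internalId => 3, lang => 'de' });

print "$by_id_clause [@by_id_binds]\n";
print "$by_key_clause [@by_key_binds]\n";
```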

pagelookup

This is the function that does the heavy lifting. It takes a page id and optionally some arguments overriding the default config. Then the sequence of events is as follows:

1.) Look up the various parameters from the database.
2.) If this fails, then exit, either handling the failure or just returning undef, according to instructions.
3.) Load the template object.
4.) Set the expiry header unless instructed not to.
5.) Load the smart objects that are mentioned in the template.
6.) Remove unwanted parameters.
7.) Put the parameters into the template object.
8.) Return the now partially or completely filled template object.

The page id may also be taken in the form of a HASH ref having two fields: internalId and lang.

pagelookup_rm

This function is a generic run mode. It takes a page id and tries to do everything else. Of course most of the work is done by pagelookup.

xml_sitemap_rm

This method is intended to be installed as a run mode that serves the XML sitemap. Since the format is fixed, it is self-contained and does not load templates from files. Note that if a page has a null priority then it is not put in the sitemap. For this function to work it is necessary to set the xml_sitemap_base_url parameter.
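
As a rough picture of the output, the sketch below builds a sitemap in the sitemaps.org format from a list of rows and leaves out any page with a null priority. It is not the plugin's code: sketch_sitemap and the sample rows are hypothetical, the rule that a page URL is the base URL plus '/' plus the pageId is an assumption, and no XML escaping is done.

```perl
use strict;
use warnings;

# Hypothetical rows; the column names follow the schema in DATABASE.
my @rows = (
    { pageId => 'en/home',  lastmod => '2009-01-15', changefreq => 'weekly',  priority => '0.8' },
    { pageId => 'en/draft', lastmod => '2009-02-01', changefreq => 'monthly', priority => undef },
);

# Assumption: the page URL is the base URL plus '/' plus the pageId.
sub sketch_sitemap {
    my ($base_url, @pages) = @_;
    my $xml = qq{<?xml version="1.0" encoding="UTF-8"?>\n}
            . qq{<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n};
    for my $page (@pages) {
        next unless defined $page->{priority};    # null priority: leave the page out
        $xml .= "  <url>\n"
              . "    <loc>$base_url/$page->{pageId}</loc>\n"
              . "    <lastmod>$page->{lastmod}</lastmod>\n"
              . "    <changefreq>$page->{changefreq}</changefreq>\n"
              . "    <priority>$page->{priority}</priority>\n"
              . "  </url>\n";
    }
    return $xml . "</urlset>\n";
}

my $sitemap = sketch_sitemap('http://www.mytestsite.org', @rows);
print $sitemap;
```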

pagelookup_notfound

This function takes a page id which has failed a page lookup and tries to find the best fitting 404 page. First of all it attempts to find the correct language by checking whether the first three characters of the page id consist of two characters followed by a '/'. If this matches, then the first two characters are taken to be the language. If that fails, then the language is taken to be $self->pagelookup_default_lang. Then the relevant 404 page is looked up by language and internal id. The internalId is taken to be $self->pagelookup_404. Of course it is assumed that this page lookup cannot fail. The 404 status is added to the header and the original page id is inserted into the $self->pagelookup_msg_param parameter. If this logic does not match your URL structure, you can omit exporting this function, or turn notfound handling off and implement your own logic.
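
The language-guessing rule described above can be sketched in a few lines. This is an illustration rather than the plugin's code; sketch_guess_lang is a hypothetical helper and the exact pattern is an assumption based on the description.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the language from a failed page id.
sub sketch_guess_lang {
    my ($page_id, $default_lang) = @_;
    # Two characters followed by '/' at the very start of the page id.
    return $1 if $page_id =~ m{\A(..)/};
    return $default_lang;
}

print sketch_guess_lang('de/missing', 'en'), "\n";    # prints "de"
print sketch_guess_lang('missing',    'en'), "\n";    # prints "en"
```

A page id such as 'images/logo' would fall through to the default language here, because its third character is not a '/'.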

pagelookup_set_expiry

This function sets the expiry header based upon the page data in the hash ref, in particular the changefreq field.
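
One way to picture the link between changefreq and the expiry header is a lookup table from each changefreq value to a relative expiry time. The mapping below is purely an assumption for illustration (the plugin's actual values are not documented here), and sketch_expiry is a hypothetical helper. The strings use the relative time syntax that CGI.pm accepts for its -expires header argument.

```perl
use strict;
use warnings;

# Assumed mapping from changefreq to a CGI.pm-style relative expiry.
# These values are illustrative, not taken from the plugin.
my %EXPIRES_FOR = (
    always  => 'now',
    hourly  => '+1h',
    daily   => '+1d',
    weekly  => '+7d',
    monthly => '+1M',
    yearly  => '+1y',
    never   => '+10y',
);

# Hypothetical helper: pick the expiry for a row of page data.
sub sketch_expiry {
    my ($row) = @_;
    return $EXPIRES_FOR{ $row->{changefreq} };
}

print sketch_expiry({ changefreq => 'weekly' }), "\n";    # prints "+7d"
```

In a CGI::Application run mode, a value like this could then be handed to header_add as the -expires argument.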

pagelookup_default_lang

This returns the default language code.

pagelookup_404

This returns the internal id used by 404 pages.

pagelookup_msg_param

This returns the parameter that pagelookup uses for inserting error messages.

xml_sitemap_sql

This returns the SQL used to get the XML sitemap data.

xml_sitemap_base_url

This returns the base URL used in XML sitemaps.

AUTHOR

Nicholas Bamber, <nicholas at periapt.co.uk>

BUGS

Currently errors are not trapped early enough and hence error messages are less informative than they might be.

Also we are working on validating the code against more DBI drivers. Currently mysql and SQLite are known to work. It is known to be incompatible with postgres, which should be fixed in the next release. This may entail schema changes. It is also known to be incompatible with DBD::DBM, apparently on account of a join across three tables. The SQL is not ANSI standard, and making it so is one possible change. Another approach may be to make the schema configurable.

Please report any bugs or feature requests to bug-cgi-application-plugin-pagelookup at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=CGI-Application-Plugin-PageLookup. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc CGI::Application::Plugin::PageLookup

ACKNOWLEDGEMENTS

Thanks to JavaFan for suggesting the use of Test::Database. Thanks to Philippe Bruhat for help with getting Test::Database to work more smoothly.

COPYRIGHT & LICENSE

Copyright 2009 Nicholas Bamber.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.