NAME
Dezi::Indexer::Headers - create document headers for Swish-e -S prog
SYNOPSIS
use Dezi::Indexer::Headers;
use SWISH::3;
my $f = 'some/file.html';
my $buf = SWISH::3->slurp( $f );
my $headers = Dezi::Indexer::Headers->new;
print $headers->head( $buf, { url=>$f } ), $buf;
DESCRIPTION
Dezi::Indexer::Headers generates the correct headers for feeding documents to the indexer.
VARIABLES
$AutoURL
The $AutoURL package variable is used when no URL is supplied in the head() method. It is incremented each time it is used in head(). You can set it to whatever numerical value you choose. It defaults to $^T.
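To make the fallback concrete, here is a minimal sketch of the behaviour described above, in plain Perl with no dependency on Dezi. The names $auto_url and next_auto_url() are hypothetical stand-ins for $AutoURL and the logic inside head(); the module's actual implementation may differ.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the $AutoURL package variable.
# $^T is the time (epoch seconds) at which this script started.
my $auto_url = $^T;

# Return the caller's URL if one was given; otherwise hand out the
# current counter value and increment it for the next caller.
sub next_auto_url {
    my ($url) = @_;
    return defined $url ? $url : $auto_url++;
}

my $first  = next_auto_url();              # same value as $^T
my $second = next_auto_url();              # $^T + 1
my $named  = next_auto_url('doc.html');    # counter is left untouched

print "$first $second $named\n";
```

With the real module, the equivalent knob would presumably be the package variable itself, set before the first call to head(), for example to start numbering at a value of your own choosing.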
$Debug
Set to TRUE to carp verbiage about content length, etc.
METHODS
new
Returns a new object.
init
Called by new().
version
Get/set the API version. Default is 2.
head( buf [, \%opts ] )
Returns the proper headers for a document as a scalar string.
The only required parameter is buf, which should be the content of the document as a scalar string.
The following keys are supported in %opts. If not supplied, they will be guessed at based on the contents of buf.
- version
-
Which version of the headers to use. The possible values are 2 for Swish-e version 2.x or 3 for Swish3.
- url
-
The URL or file path of the document. If not supplied, a unique numeric value is used, taken from $AutoURL, which starts at the start time of the calling script and is incremented on each use.
- modtime
-
The last modified time of the document in epoch seconds (time() format). If not supplied, the current time() value is used.
- parser
-
The parser type to be used for the document. If not supplied, it will not be included in the header and Swish-e will determine the parser type. See the Swish-e configuration documentation on determining parser type. See also the Dezi parser() method.
- type
-
The MIME type of the document. If not supplied, it will be guessed at based on the file extension of the URL (if supplied) or $DefMime. NOTE: MIME type is only used in SWISH::3 headers.
- action
-
Whether the document should be added to, updated in, or deleted from the index. The url value is used as the unique identifier of the document in the index. The possible values are:
- add (default)
-
If a document with the same url value already exists, a fatal error is thrown.
- update
-
If a document with the same url does not already exist in the index, a fatal error is thrown.
- add_or_update
-
Check first if url exists in the index, and then add or update as appropriate. Since this requires additional processing overhead for every document, it is not the default. It is, however, the safest action to take.
- delete
-
Remove the document from the index. If url does not exist, a fatal error is thrown.
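The four action values can be summarized with a small toy model. The sketch below applies the rules listed above to an in-memory hash keyed by url. The %index hash and apply_action() are hypothetical names used only for illustration; this is not how Dezi or Swish-e store or update documents.

```perl
use strict;
use warnings;

# Toy in-memory "index" keyed by url. It only models the rules
# listed above for add, update, add_or_update and delete.
my %index;

sub apply_action {
    my ( $url, $action, $doc ) = @_;
    $action = 'add' unless defined $action;    # add is the default

    if ( $action eq 'add' ) {
        die "add: $url already exists\n" if exists $index{$url};
        $index{$url} = $doc;
    }
    elsif ( $action eq 'update' ) {
        die "update: $url does not exist\n" unless exists $index{$url};
        $index{$url} = $doc;
    }
    elsif ( $action eq 'add_or_update' ) {
        $index{$url} = $doc;    # succeeds whether or not url exists
    }
    elsif ( $action eq 'delete' ) {
        die "delete: $url does not exist\n" unless exists $index{$url};
        delete $index{$url};
    }
    else {
        die "unknown action: $action\n";
    }
    return 1;
}

apply_action( 'a.html', 'add', 'v1' );

# A second add for the same url is a fatal error, trapped here with eval.
my $dup_ok  = eval { apply_action( 'a.html', 'add', 'v2' ) };
my $dup_err = $@;

apply_action( 'a.html', 'add_or_update', 'v2' );    # replaces v1
apply_action( 'a.html', 'delete' );                 # index is empty again
```

The model shows why add_or_update is described as the safest choice: it is the only action that cannot fail on the presence or absence of url, at the cost of an existence check per document.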
NOTE: The special environment variable SWISH3 is checked in order to determine the correct header labels. If you are using SWISH::3, the environment variable is set for you. Otherwise, set the version with the version method or param.
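As a rough illustration of the two ways to choose the header version, the sketch below sets it once on the object with version() and once per call through the version key in %opts. It uses only the calls documented on this page. The demo_headers() wrapper, the sample buffer and the url value are made up for the example, and the wrapper returns nothing if the module is not installed. How the two settings interact with each other, and with a SWISH3 environment variable that is already set, is an assumption to verify against your installed version.

```perl
use strict;
use warnings;

# Build headers for the same buffer twice, selecting the version in
# two different ways. Returns an empty list if the module is missing.
sub demo_headers {
    my ($buf) = @_;
    return unless eval { require Dezi::Indexer::Headers; 1 };

    my $headers = Dezi::Indexer::Headers->new;

    # Option 1: set the version on the object with the version method.
    $headers->version(3);
    my $via_method = $headers->head( $buf, { url => 'doc1.html' } );

    # Option 2: pass the version as a key in %opts for a single call.
    my $via_param
        = $headers->head( $buf, { url => 'doc1.html', version => 2 } );

    return ( $via_method, $via_param );
}

my @out = demo_headers('<html><body>hello</body></html>');
print @out
    ? join( "\n", @out )
    : "Dezi::Indexer::Headers is not installed\n";
```

Printing both results side by side is a quick way to see which header labels each version produces on your system.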
Headers API
See the Swish-e documentation at http://swish-e.org/.
For SWISH::3 Headers API (which is slightly different) see http://dev.swish-e.org/wiki/swish3/.
AUTHOR
Peter Karman, <perl@peknet.com>
BUGS
Please report any bugs or feature requests to bug-swish-prog at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Dezi-App. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Dezi
You can also look for information at:
Mailing list
RT: CPAN's request tracker
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
COPYRIGHT AND LICENSE
Copyright 2008-2009 by Peter Karman
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.