NAME

HTML::Index::Document - Perl object used by HTML::Index::Create to create an index of HTML documents for searching

SYNOPSIS

$doc = HTML::Index::Document->new( path => $path );

$doc = HTML::Index::Document->new( 
    name        => $name,
    contents    => $contents,
    modtime     => $modtime,
);

DESCRIPTION

This module allows you to create objects that represent HTML documents to be indexed for searching with the HTML::Index modules. These might be HTML files in a web server document root or HTML pages stored in a database.

HTML::Index::Document is a subclass of Class::Struct, with four attributes:

path

The path to the document. This attribute is optional, but if it is set it should correspond to an existing, readable file.

name

The name of the document. This attribute is what is returned as a result of a search, and is the primary identifier for the document. It should be unique. If the path attribute is set, then the name attribute defaults to path. Otherwise, it must be provided to the constructor.

modtime

The modification time of the document. This attribute is used to decide whether a document that has already been indexed needs to be re-indexed: it is re-indexed if the modtime has changed and is greater than the stored value. It can also be used to order search results. If the path attribute is set, the modtime attribute is the file modification time that corresponds to path (determined by stat). Otherwise, it must be provided to the constructor.

contents

The (HTML) contents of the document. This attribute provides the text which is indexed by HTML::Search::Index. If the path attribute is set, the contents attribute defaults to the contents of path. Otherwise, it must be provided to the constructor.
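
The defaulting rules described above can be sketched in plain Perl. This is only an illustration, not the module's actual implementation: attributes_from_path and needs_reindex are hypothetical helpers that do not exist in HTML::Index::Document, and treating a never-indexed document as needing indexing is an assumption. The sketch uses only core modules.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of HTML::Index::Document): derive the
# values that name, modtime and contents are documented to take when
# only path is given.
sub attributes_from_path {
    my ($path) = @_;

    my @stat = stat($path)
        or die "Cannot stat $path: $!";

    open my $fh, '<', $path
        or die "Cannot read $path: $!";
    my $contents = do { local $/; <$fh> };    # slurp the whole file
    close $fh;

    return (
        path     => $path,
        name     => $path,        # name defaults to path
        modtime  => $stat[9],     # mtime field returned by stat
        contents => $contents,    # contents default to the file's contents
    );
}

# Hypothetical helper: decide whether a document should be re-indexed.
# Assumption: a document with no stored modtime has never been indexed
# and is therefore always indexed.
sub needs_reindex {
    my ($stored_modtime, $current_modtime) = @_;
    return 1 unless defined $stored_modtime;
    return $current_modtime > $stored_modtime ? 1 : 0;
}

# Demonstration with a temporary file standing in for a real document.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "<html><body>Hello</body></html>\n";
close $tmp_fh;

my %attr = attributes_from_path($tmp_path);
printf "name=%s modtime=%d bytes=%d\n",
    $attr{name}, $attr{modtime}, length $attr{contents};
```

A document that does not live on disk, such as a page stored in a database, has no path to derive these values from, which is why name, modtime and contents must then be passed to the constructor explicitly.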

SEE ALSO

HTML::Index

AUTHOR

Ave Wrigley <Ave.Wrigley@itn.co.uk>

COPYRIGHT

Copyright (c) 2001 Ave Wrigley. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.