NAME

Text::Index - Create indices of a set of pages using a set of keywords

SYNOPSIS

use Text::Index;
my $index = Text::Index->new;

$index->add_page($content);
$index->add_pages(@strings);
my @pages = $index->pages;

# Add keyword with equivalent derivates
$index->add_keyword('Hamilton function', 'Hamiltonian');
$index->add_keywords([$keyword, @derivates], ...);
my @keywords = $index->keywords;
# ->keywords returns an array reference for each keyword
# (see ->add_keywords syntax)

my $result = $index->generate_index;

# Or for a single keyword:
my @page_list  = $index->find_keyword($keyword);
my @page_list2 = $index->find_keyword($keyword, @derivates);

DESCRIPTION

This simple module searches a set of pages for keywords and creates an index that lists, for each keyword, the pages on which it occurs.

EXPORT

None.

METHODS

This is a list of public methods.

new

Returns a new Text::Index object. When called on an existing object, new returns a deep clone of that object.

add_page

Adds a page to the index object. The page is expected to be a string of text passed in as the first argument.

Returns the Text::Index object for convenience of method chaining.

add_pages

Adds a number of pages to the index object.

All arguments are treated as pages. See add_page.

pages

Returns all registered pages as a list.

add_keyword

Adds a new keyword to the index. The first argument must be the keyword to add. It may be followed by any number of alternative names or strings (derivates) that should be treated as equivalent to the keyword.

Returns the Text::Index object for convenience.

add_keywords

Works like add_keyword, except that each argument must be an array reference. Each referenced array contains a keyword followed by its associated derivates.

Returns the Text::Index object for convenience.

keywords

Returns all registered keywords as a list of array references. Each reference points to an array containing the keyword followed by its derivates, if any.
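
As a minimal sketch of this structure, the following plain-Perl snippet unpacks a list of array references of the shape described above. The data is written by hand and stands in for the return value of keywords; the second entry is hypothetical and only illustrates a keyword without derivates. The snippet does not call the module itself.

```perl
use strict;
use warnings;

# Hypothetical data in the documented shape: one array reference per
# keyword, holding the keyword first and its derivates after it.
my @keywords = (
    [ 'Hamilton function', 'Hamiltonian' ],
    [ 'Lagrangian' ],                       # a keyword without derivates
);

my @lines;
foreach my $entry (@keywords) {
    # The first element is the keyword, the rest are its derivates.
    my ($keyword, @derivates) = @$entry;

    my $line = $keyword;
    $line .= ' (also: ' . join(', ', @derivates) . ')' if @derivates;
    push @lines, $line;
}

print "$_\n" for @lines;
```

The same list-of-array-references shape is what add_keywords expects as its arguments, so a list unpacked this way could be passed on unchanged.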

generate_index

Generates an index from the registered keywords and pages. It returns an index of the form:

{
  'keyword' => [ @pages_containing_keyword ],
  ...
}

The keyword search is case-insensitive and whitespace-insensitive.
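
To illustrate what such a search could look like, here is a self-contained sketch in plain Perl. It is not the module's actual implementation. It assumes that ignoring whitespace means any run of whitespace in a keyword matches any run of whitespace in a page, and that pages are numbered starting at 1. The helpers term_regex and find_in_pages are hypothetical names introduced for this example only.

```perl
use strict;
use warnings;

# Build a regex that matches a term regardless of case and of the exact
# whitespace between its words (an assumed reading of the documentation).
sub term_regex {
    my ($term) = @_;
    my @words   = grep { length } split /\s+/, $term;
    my $pattern = join '\s+', map { quotemeta($_) } @words;
    return qr/$pattern/i;
}

# Hypothetical helper: returns the page numbers (1-based, an assumption)
# on which the keyword or any of its derivates occurs.
sub find_in_pages {
    my ($pages, @terms) = @_;
    my @regexes = map { term_regex($_) } @terms;
    my @found;
    for my $i (0 .. $#$pages) {
        push @found, $i + 1 if grep { $pages->[$i] =~ $_ } @regexes;
    }
    return @found;
}

my @pages = (
    "The Hamilton\nfunction of the system",   # line break inside the keyword
    "Nothing relevant here",
    "A HAMILTONIAN approach",                 # derivate, different case
);
my @keywords = ( [ 'Hamilton function', 'Hamiltonian' ], [ 'Lagrangian' ] );

# Same shape as the documented result: keyword => [ page numbers ].
my %index;
$index{ $_->[0] } = [ find_in_pages(\@pages, @$_) ] for @keywords;

for my $keyword (sort keys %index) {
    print "$keyword: ", join(', ', @{ $index{$keyword} }), "\n";
}
```

In this sketch, 'Hamilton function' is found on pages 1 and 3 (once across a line break, once through its derivate in upper case), while 'Lagrangian' maps to an empty list.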

find_keyword

This method works like generate_index, except that it searches for just one keyword, which is passed as arguments in the style of add_keyword. It ignores any registered keywords and searches only for the one given.

Returns a list of the page numbers on which the keyword was found. The list is empty if the keyword was not found at all.

SEE ALSO

AUTHOR

Steffen Müller, <modules at steffen-mueller dot net>

COPYRIGHT AND LICENSE

Copyright (C) 2006 by Steffen Müller

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.6 or, at your option, any later version of Perl 5 you may have available.
