NAME

Search::FreeText::LexicalAnalysis::Stem - lexicon interface to Lingua::Stem

DESCRIPTION

A filter that uses Lingua::Stem to apply the Porter stemming algorithm. It can be included in a search system as part of both indexing and query processing.

The filter wraps Lingua::Stem rather than calling it directly, because Lingua::Stem turns nonwords into absolutely nothing at all. To overcome this, we stem only the words, and merge the nonwords back in, unchanged, once the words have been stemmed.

SYNOPSIS

use Search::FreeText::LexicalAnalysis::Stem;
my $stemmer = Search::FreeText::LexicalAnalysis::Stem->new();
my $words = $stemmer->process($oldwords);

METHODS

$self->initialize();

Called when the lexicon system is initialised. This method creates and stores the stemmer, and can be overridden in a subclass if needed.

$self->process($oldwords);

Takes a reference to an array of words and returns a reference to an array of stemmed words for further processing. Words that cannot be stemmed are left in place. Wrapping Lingua::Stem to do this costs a little performance, but these are real words for indexing, so we mustn't just lose them!
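
The merge-back step described above can be sketched as follows. This is an illustration only, not the module's actual code: the names toy_stem and stem_preserving_nonwords are hypothetical, the test for what counts as a "word" (purely alphabetic) is an assumption, and toy_stem is a stand-in that crudely strips "ing" or "s" so the example runs without Lingua::Stem installed. Like the behaviour described for Lingua::Stem, the stand-in returns an empty string for anything that is not a word.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real stemmer. It lowercases alphabetic
# tokens and strips a trailing "ing" or "s"; anything else becomes ''.
sub toy_stem {
    my @out;
    for my $w (@_) {
        if ($w =~ /^[A-Za-z]+$/) {
            (my $s = lc $w) =~ s/(?:ing|s)$//;
            push @out, $s;
        }
        else {
            push @out, '';
        }
    }
    return \@out;
}

# Stem only the tokens that look like words, then put the results back
# at their original positions so that nonwords survive untouched.
sub stem_preserving_nonwords {
    my ($stem, $tokens) = @_;

    # Positions of the tokens we are willing to hand to the stemmer.
    my @idx = grep { $tokens->[$_] =~ /^[A-Za-z]+$/ } 0 .. $#$tokens;

    my $stemmed = $stem->(@{$tokens}[@idx]);

    my @result = @$tokens;        # start from a copy of the input
    @result[@idx] = @$stemmed;    # overwrite only the word positions
    return \@result;
}

my $out = stem_preserving_nonwords(\&toy_stem, [qw(Indexing 2003 queries x86)]);
print join(' ', @$out), "\n";
```

Passing the stemmer in as a code reference keeps the sketch independent of any particular stemming library. The extra pass to pick out word positions is the small performance cost mentioned above.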

AUTHOR

Stuart Watt <S.N.K.Watt@rgu.ac.uk>

Copyright (c) 2003 The Robert Gordon University. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.