NAME

Search::FreeText::LexicalAnalysis::Tokenize - lexicon tokenizer

DESCRIPTION

A pseudo-filter that should always be called as the first element in the lexical processing system. Like the other filters, it can be overridden. It is called with a reference to an array containing a single string (the entire text) and returns a new array containing the words of that string.

SYNOPSIS

my $tokenizer = Search::FreeText::LexicalAnalysis::Tokenize->new();
my $words = $tokenizer->process($oldwords);

METHODS

$self->initialize();

Called when the lexicon system is initialised. This method currently does very little, although it could compile and cache data if that proved useful.

$self->process($oldwords);

Called with a reference to an array of strings (in practice, a single string), which it tokenizes into words for further lexical processing.
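
The two methods above might fit together roughly as in the following sketch. It is not the module's actual source: the package name My::SketchTokenizer, the word_re key, the \w+ definition of a word, and the choice to return an array reference are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; NOT the real Search::FreeText code.
package My::SketchTokenizer;

sub new {
    my ($class) = @_;
    my $self = bless {}, $class;
    $self->initialize();
    return $self;
}

# Compile the word pattern once and cache it on the object.
sub initialize {
    my ($self) = @_;
    # Assumption: a "word" is any run of \w characters.
    $self->{word_re} = qr/\w+/;
    return;
}

# Takes a reference to an array of strings and returns a reference
# to a new array holding every word found in those strings.
sub process {
    my ($self, $oldwords) = @_;
    my $re = $self->{word_re};
    my @words = map { /($re)/g } @$oldwords;
    return \@words;
}

package main;

my $tokenizer = My::SketchTokenizer->new();
my $words = $tokenizer->process(["The quick, brown fox."]);
print join("|", @$words), "\n";    # prints "The|quick|brown|fox"
```

Caching the compiled pattern in initialize() keeps process() cheap when it is called repeatedly, which is the kind of work the description of initialize() leaves room for.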

AUTHOR

Stuart Watt <S.N.K.Watt@rgu.ac.uk>

Copyright (c) 2003 The Robert Gordon University. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.