NAME

Lingua::StopWords - Stop words for several languages

SYNOPSIS

use Lingua::StopWords;

my @words = ...;

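# Get the English stopword list as a hash reference
my $stopwords = Lingua::StopWords::getStopWords('en');

# Alternative: call the language-specific module directly
# my $stopwords = Lingua::StopWords::EN::getStopWords();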

# Print non-stopwords in @words
print join ' ', grep { !$stopwords->{$_} } @words;

DESCRIPTION

Stopword lists are encoded in UTF-8.

The currently supported languages are:

  • English

  • French

  • Spanish

  • Portuguese

  • Italian

  • German

  • Dutch

  • Swedish

  • Norwegian

  • Danish

  • Russian

  • Finnish
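
The SYNOPSIS treats the returned value as a hash reference keyed by stopword. The following is a minimal sketch of filtering a word list against such a hash. The helper name remove_stopwords and the tiny stand-in list are hypothetical, used only so the sketch runs without the module installed. Lowercasing before lookup assumes the lists contain lowercase entries only.

```perl
use strict;
use warnings;

# Return the words that are not keys of the given stopword hash reference.
# Words are lowercased before lookup (assumption: the list is all lowercase).
sub remove_stopwords {
    my ($stopwords, @words) = @_;
    return grep { !$stopwords->{ lc $_ } } @words;
}

# Hypothetical stand-in list, not the module's data. In real use:
#   my $stopwords = Lingua::StopWords::getStopWords('en');
my $stopwords = { map { $_ => 1 } qw(the of and a) };

my @words = qw(The history of the Perl language);

# Prints the words that survive filtering, original case preserved
print join(' ', remove_stopwords($stopwords, @words)), "\n";
```

Keeping the filter in a small subroutine that accepts any hash reference means the same code can be reused with the list for any of the languages above.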

EXPORT

None by default.

SEE ALSO

The stopword lists were taken from the http://snowball.tartarus.org/ website.

This POD documentation was inspired by the Lingua::EN::StopWords module.

AUTHOR

Fabien POTENCIER, <fabpot@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2004 by Fabien POTENCIER

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.3 or, at your option, any later version of Perl 5 you may have available.