NAME
Search::Tools::Token - a token object returned from a TokenList
SYNOPSIS
use Search::Tools::Tokenizer;
my $tokenizer = Search::Tools::Tokenizer->new();
my $tokens = $tokenizer->tokenize('quick brown red dog');
while ( my $token = $tokens->next ) {
    # token isa Search::Tools::Token
    print "token = $token\n";
    printf("str: %s, len = %d, u8len = %d, pos = %d, is_match = %d, is_hot = %d\n",
        $token->str,
        $token->len,
        $token->u8len,
        $token->pos,
        $token->is_match,
        $token->is_hot
    );
}
DESCRIPTION
A Token represents one or more characters culled from a string by a Tokenizer.
METHODS
Most of Search::Tools::Token is written in C/XS, so the Perl source for this class contains very little code. If you are interested in the internals, look at the source for Tools.xs and search-tools.c, or at Search::Tools::TokenPP.
str
The characters in the token. Stringifies to the str() value with overloading.
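The stringification mentioned above relies on Perl's overload pragma. The real class is implemented in XS, so the following is only a minimal sketch of the mechanism; My::Token is a hypothetical stand-in and not part of Search::Tools.

```perl
# Hypothetical stand-in class, for illustration only.
package My::Token;
use strict;
use warnings;

# Map string context ("$obj") to the str() accessor.
use overload
    '""'     => sub { $_[0]->str },
    fallback => 1;

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

# Accessor for the token's characters.
sub str { return $_[0]->{str} }

package main;

my $token = My::Token->new( str => 'quick' );

# Interpolation calls the overloaded '""' handler.
print "token = $token\n";
```

With fallback enabled, string comparisons such as $token eq 'quick' also go through the same handler.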
len
The byte length of str().
u8len
The character length of str(). For ASCII, len() == u8len(). For non-ASCII UTF-8, u8len() < len().
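The difference between byte length and character length can be shown with core Perl alone. This is a sketch of the distinction only, not of how the XS code computes len() and u8len(); byte_len and char_len are hypothetical helper names.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Number of bytes in the UTF-8 encoding of a character string.
sub byte_len {
    my ($str) = @_;
    return length( encode( 'UTF-8', $str ) );
}

# Number of characters (code points) in a character string.
sub char_len {
    my ($str) = @_;
    return length($str);
}

my $ascii = 'dog';
my $mixed = "caf\x{e9}";    # "cafe" with e-acute (U+00E9)

printf( "len = %d, u8len = %d\n", byte_len($ascii), char_len($ascii) );
printf( "len = %d, u8len = %d\n", byte_len($mixed), char_len($mixed) );
```

The first line reports equal values for the ASCII string. For the second string the byte count is larger, because U+00E9 takes two bytes in UTF-8.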
pos
The zero-based position in the original string.
is_match
Returns true if the token matched the re() in the Tokenizer.
is_hot
Returns true if the token matched the heat_seeker in the Tokenizer.
is_sentence_start
Returns a true value if the Token starts with an uppercase UTF-8 character or other common sentence-starting character.
is_sentence_end
Returns a true value if the Token matches common sentence-ending punctuation.
is_abbreviation
Returns a true value if the Token looks like a common English abbreviation.
dump
Prints the internal XS attributes to stderr.
set_hot
Set the is_hot() value.
set_match
Set the is_match() value.
AUTHOR
Peter Karman <karman@cpan.org>
BUGS
Please report any bugs or feature requests to bug-search-tools at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Search-Tools. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Search::Tools
You can also look for information at:
RT: CPAN's request tracker
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
COPYRIGHT
Copyright 2009 by Peter Karman.
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself.