NAME
Lucy::Simple - Basic search engine.
SYNOPSIS
First, build an index of your documents.
my $index = Lucy::Simple->new(
    path     => '/path/to/index/',
    language => 'en',
);
while ( my ( $title, $content ) = each %source_docs ) {
    $index->add_doc({
        title   => $title,
        content => $content,
    });
}
Later, search the index.
my $total_hits = $index->search(
    query      => $query_string,
    offset     => 0,
    num_wanted => 10,
);
print "Total hits: $total_hits\n";
while ( my $hit = $index->next ) {
    print "$hit->{title}\n";
}
DESCRIPTION
Lucy::Simple is a stripped-down interface for the Apache Lucy search engine library.
CONSTRUCTORS
new
my $lucy = Lucy::Simple->new(
    path     => '/path/to/index/',
    language => 'en',
);
Create a Lucy::Simple object, which can be used for both indexing and searching. Both parameters, path and language, are required.
path - Where the index directory should be located. If no index is found at the specified location, one will be created.
language - The language of the documents in your collection, indicated by a two-letter ISO code. 12 languages are supported:
|-----------------------|
| Language   | ISO code |
|-----------------------|
| Danish     | da       |
| Dutch      | nl       |
| English    | en       |
| Finnish    | fi       |
| French     | fr       |
| German     | de       |
| Italian    | it       |
| Norwegian  | no       |
| Portuguese | pt       |
| Spanish    | es       |
| Swedish    | sv       |
| Russian    | ru       |
|-----------------------|
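Because language is required and limited to the codes listed, it can be convenient to check a user-supplied value before constructing the object. The following is a minimal sketch, not part of Lucy::Simple: the helper name checked_language and its lowercasing behavior are assumptions made for illustration.

```perl
use strict;
use warnings;

# The twelve ISO codes from the table of supported languages.
my %SUPPORTED_LANGUAGE = map { $_ => 1 }
    qw( da nl en fi fr de it no pt es sv ru );

# Hypothetical helper (not provided by Lucy): lowercases the code and dies
# early if it is missing or not one of the supported languages.
sub checked_language {
    my ($code) = @_;
    die "language is required\n" unless defined $code;
    my $normalized = lc $code;
    die "unsupported language '$code'\n"
        unless $SUPPORTED_LANGUAGE{$normalized};
    return $normalized;
}

print checked_language('EN'), "\n";    # prints "en"
```

The returned value could then be passed as the language argument to new.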
METHODS
add_doc
$lucy->add_doc({
    location => $url,
    title    => $title,
    content  => $content,
});
Add a document to the index. The document must be supplied as a hashref, with field names as keys and content as values.
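As an illustration of the hashref shape, the sketch below builds a document from a location and a block of raw text. The helper doc_from_text and the convention that the first line is the title are assumptions for this example only; Lucy::Simple does not prescribe how the fields are derived.

```perl
use strict;
use warnings;

# Hypothetical helper: treat the first line of $text as the title and the
# remainder as the content, and return a hashref with the same three field
# names used in the add_doc example above.
sub doc_from_text {
    my ( $location, $text ) = @_;
    my ( $title, $content ) = split /\n/, $text, 2;
    return {
        location => $location,
        title    => defined $title   ? $title   : '',
        content  => defined $content ? $content : '',
    };
}

my $doc = doc_from_text( 'http://example.com/a', "Hello\nWorld\nAgain" );
print "$doc->{title}\n";    # prints "Hello"
```

The resulting hashref could be passed directly to add_doc.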
search
my $total_hits = $lucy->search(
    query      => $query,         # required
    offset     => $offset,        # default: 0
    num_wanted => $num_wanted,    # default: 10
    sort_spec  => $sort_spec,     # default: undef
);
Search the index. Returns the total number of documents which match the query. (This number is unlikely to match num_wanted.)
query - A search query string.
offset - The number of most-relevant hits to discard, typically used when “paging” through hits N at a time. Setting offset to 20 and num_wanted to 10 retrieves hits 21-30, assuming that 30 hits can be found.
num_wanted - The number of hits you would like to see after offset is taken into account.
sort_spec - A SortSpec, which will affect how results are ranked and returned.
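The relationship between offset and num_wanted when paging can be sketched as simple arithmetic. The helper paging_args below is hypothetical (it is not part of Lucy::Simple) and assumes 1-based page numbers.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a 1-based page number and a page size into
# the offset and num_wanted arguments described above.
sub paging_args {
    my ( $page, $per_page ) = @_;
    die "page must be >= 1\n"     unless $page >= 1;
    die "per_page must be >= 1\n" unless $per_page >= 1;
    return (
        offset     => ( $page - 1 ) * $per_page,
        num_wanted => $per_page,
    );
}

my %args = paging_args( 3, 10 );
print "offset=$args{offset} num_wanted=$args{num_wanted}\n";
# prints "offset=20 num_wanted=10", which corresponds to hits 21-30
```

Since search takes labeled parameters, the returned list could be passed alongside query, for example search( query => $query, paging_args( 3, 10 ) ).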
next
my $hit_doc = $lucy->next();
Return the next hit, or undef when the iterator is exhausted.
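A common pattern is to drain the iterator into a list after calling search. The sketch below uses a stand-in package, StubHits, so that it runs without an index; both StubHits and the helper collect_titles are illustrative assumptions and not part of Lucy. With a real index, the Lucy::Simple object would be passed instead of the stub.

```perl
use strict;
use warnings;

# Stand-in for a Lucy::Simple object on which search has already been
# called. It imitates only next(), returning undef once hits run out.
package StubHits;
sub new  { my ( $class, @hits ) = @_; return bless { hits => [@hits] }, $class }
sub next { my ($self) = @_; return shift @{ $self->{hits} } }

package main;

# Hypothetical helper: call next() until it returns undef and collect the
# title field of every hit.
sub collect_titles {
    my ($searcher) = @_;
    my @titles;
    while ( defined( my $hit = $searcher->next ) ) {
        push @titles, $hit->{title};
    }
    return @titles;
}

my $stub = StubHits->new( { title => 'First' }, { title => 'Second' } );
print join( ', ', collect_titles($stub) ), "\n";    # prints "First, Second"
```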
INHERITANCE
Lucy::Simple isa Clownfish::Obj.