Sphinx::Search - Sphinx search engine API Perl client
This version is 0.03.
Use version 0.03 for Sphinx 0.9.8-cvs-20070907 and later
Use version 0.02 for Sphinx 0.9.8-cvs-20070818
use Sphinx::Search;
$sphinx = Sphinx::Search->new();
$results = $sphinx->SetMatchMode(SPH_MATCH_ALL)
->Query("search terms");
This is the Perl API client for the Sphinx open-source SQL full-text indexing search engine.
$sph = Sphinx::Search->new;
$sph = Sphinx::Search->new(\%options);
Create a new Sphinx::Search instance.
- log
Specify an optional logger instance. This can be any class that provides error, warn, info, and debug methods (e.g. see Log::Log4perl). Logging is disabled if no logger instance is provided.
- debug
Debug flag. If set (and a logger instance is specified), debugging messages will be generated.
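As a rough sketch of the log option: any object that provides error, warn, info and debug methods should be usable. The class name My::MemoryLogger below is hypothetical (it is not part of this module) and simply collects messages in memory; Log::Log4perl would be the more usual choice in practice.

```perl
use strict;
use warnings;

# Hypothetical minimal logger: provides the four methods the "log"
# option expects (error, warn, info, debug) and keeps messages in memory.
package My::MemoryLogger;

sub new { my ($class) = @_; return bless { messages => [] }, $class; }

# Generate the four logging methods; each records "[level] text".
for my $level (qw(error warn info debug)) {
    no strict 'refs';
    *{"My::MemoryLogger::$level"} = sub {
        my ($self, @parts) = @_;
        push @{ $self->{messages} }, "[$level] " . join('', @parts);
        return;
    };
}

# Return all recorded messages in the order they were logged.
sub messages { return @{ $_[0]{messages} } }

package main;

my $logger = My::MemoryLogger->new;

# Options hash as described above: a logger instance plus the debug flag.
my %options = (log => $logger, debug => 1);

# With the module installed, the client would then be created like this:
#   my $sph = Sphinx::Search->new(\%options);
```

Keeping the messages in an array makes it easy to inspect what the client logged during a test run; a production logger would write to a file or to STDERR instead.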
$error = $sph->GetLastError;
Get last error message (string)
$warning = $sph->GetLastWarning;
Get last warning message (string)
$sph->SetServer($host, $port);
Set the host/port details for the searchd server. Returns $sph.
$sph->SetLimits($offset, $limit);
$sph->SetLimits($offset, $limit, $max);
Set match offset/limits, and optionally the max number of matches to return.
Returns $sph.
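A common use of SetLimits is paging through results. The helper below is a hypothetical sketch (page_to_limits is not part of this module) that converts a 1-based page number and a page size into the offset/limit pair.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a 1-based page number and a page size into
# the ($offset, $limit) pair that SetLimits takes.
sub page_to_limits {
    my ($page, $per_page) = @_;
    die "page must be >= 1\n"     if $page < 1;
    die "per_page must be >= 1\n" if $per_page < 1;
    return (($page - 1) * $per_page, $per_page);
}

my ($offset, $limit) = page_to_limits(3, 20);   # third page of 20 results

# With a client instance in $sph:
#   $sph->SetLimits($offset, $limit);
```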
$sph->SetMatchMode($mode);
Set match mode, which may be one of:
- SPH_MATCH_ALL
Match all words
- SPH_MATCH_ANY
Match any words
- SPH_MATCH_PHRASE
Exact phrase match
- SPH_MATCH_BOOLEAN
Boolean match, using AND (&), OR (|), NOT (!,-) and parenthetic grouping.
- SPH_MATCH_EXTENDED
Extended match, which includes the Boolean syntax plus field, phrase and proximity operators.
Returns $sph.
$sph->SetSortMode($mode, $sortby);
Set sort mode, which may be any of:
- SPH_SORT_RELEVANCE - sort by relevance
- SPH_SORT_ATTR_DESC, SPH_SORT_ATTR_ASC - sort by attribute descending/ascending. $sortby specifies the sorting attribute.
- SPH_SORT_TIME_SEGMENTS - sort by time segments (last hour/day/week/month) in descending order, and then by relevance in descending order. $sortby specifies the time attribute.
- SPH_SORT_EXTENDED - sort by SQL-like syntax. $sortby is the sorting specification.
Returns $sph.
$sph->SetWeights([ 1, 2, 3, 4]);
Set per-field (integer) weights. The ordering of the weights corresponds to the ordering of fields as indexed.
Returns $sph.
$sph->SetIDRange($min, $max);
Set the document ID range. Only those records where the document ID is between $min and $max (including $min and $max) will be matched.
Returns $sph.
$sph->SetFilter($attr, \@values);
Sets the results to be filtered on the given attribute. Only results which have attributes matching the given values will be returned.
This may be called multiple times with different attributes to select on multiple attributes.
Returns $sph.
$sph->SetFilterRange($attr, $min, $max);
Sets the results to be filtered on a range of values for the given attribute. Only those records where $attr column value is between $min and $max (including $min and $max) will be returned.
Returns $sph.
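Since SetFilter and SetFilterRange can each be called several times, a small dispatcher can keep filter setup in one place. The following is only a sketch: apply_filters is a hypothetical helper, and the attribute names and values in the usage comment are illustrative.

```perl
use strict;
use warnings;

# Hypothetical helper: apply a set of filters to a Sphinx::Search
# instance. An array reference becomes a SetFilter call; a hash
# reference with min/max keys becomes a SetFilterRange call.
sub apply_filters {
    my ($sph, %filters) = @_;
    for my $attr (sort keys %filters) {
        my $spec = $filters{$attr};
        if (ref $spec eq 'ARRAY') {
            $sph->SetFilter($attr, $spec);
        }
        elsif (ref $spec eq 'HASH') {
            $sph->SetFilterRange($attr, $spec->{min}, $spec->{max});
        }
        else {
            die "Filter for '$attr' must be an array or hash reference\n";
        }
    }
    return $sph;
}

# With a client instance in $sph (values are illustrative only):
#   apply_filters($sph,
#       group_id  => [ 1, 5, 19 ],
#       published => { min => 20070101, max => 20071231 },
#   );
```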
$sph->SetGroupBy($attr, $func);
$sph->SetGroupBy($attr, $func, $groupsort);
Sets attribute and function of results grouping.
In grouping mode, all matches are assigned to different groups based on grouping function value. Each group keeps track of the total match count, and the best match (in this group) according to current sorting function. The final result set contains one best match per group, with grouping function value and matches count attached.
$attr is any valid attribute. To disable grouping, set $attr to "".
$func is one of:
- SPH_GROUPBY_DAY
Group by day (assumes timestamp type attribute of form YYYYMMDD)
- SPH_GROUPBY_WEEK
Group by week (assumes timestamp type attribute of form YYYYNNN)
- SPH_GROUPBY_MONTH
Group by month (assumes timestamp type attribute of form YYYYMM)
- SPH_GROUPBY_YEAR
Group by year (assumes timestamp type attribute of form YYYY)
- SPH_GROUPBY_ATTR
Group by attribute value
- SPH_GROUPBY_ATTRPAIR
Group by two attributes, being the given attribute and the attribute that immediately follows it in the sequence of indexed attributes. The specified attribute may therefore not be the last of the indexed attributes.
Groups in the set of results can be sorted by any SQL-like sorting clause, including both document attributes and the following special internal Sphinx attributes:
- @id - document ID;
- @weight, @rank, @relevance - match weight;
- @group - group by function value;
- @count - number of matches in group.
The default mode is to sort by groupby value in descending order, i.e. by "@group desc".
In the result set, "total_found" contains the total number of matching groups over the whole index.
WARNING: grouping is done in fixed memory and thus its results are only approximate; so there might be more groups reported in total_found than actually present. @count might also be underestimated.
For example, if sorting by relevance and grouping by a "published" attribute with the SPH_GROUPBY_DAY function, the result set will contain only the most relevant match for each day on which any matches were published, with the day number and per-day match count attached, sorted by day number in descending order (i.e. most recent days first).
$sph->SetGroupDistinct($attr);
Set count-distinct attribute for group-by queries.
$sph->SetRetries($count, $delay);
Set distributed retries count and delay
$results = $sph->Query($query, $index);
Connect to searchd server and run given search query.
- query is query string
- index is index name to query, default is "*" which means to query all indexes. Use a space or comma separated list to search multiple indexes.
Returns undef on failure
Returns hash which has the following keys on success:
- matches
Array containing hashes with found documents ( "doc", "weight", "group", "stamp" )
- total
Total number of matches retrieved (up to SPH_MAX_MATCHES, see sphinx.h)
- total_found
Total number of matching documents in index
- time
Search time
- words
Hash which maps query terms (stemmed!) to ( "docs", "hits" ) hash
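The sketch below shows one way to handle a failed query and to walk the result hash described above. It assumes that "matches" is an array reference of hash references and that "words" is a hash reference keyed by query term; run_search and summarize_results are hypothetical helper names.

```perl
use strict;
use warnings;

# Run a query and die with the client's last error message if it fails.
# $sph is expected to be a Sphinx::Search instance.
sub run_search {
    my ($sph, $query, $index) = @_;
    my $results = defined $index ? $sph->Query($query, $index)
                                 : $sph->Query($query);
    die "Search failed: " . $sph->GetLastError . "\n" unless defined $results;
    return $results;
}

# Build a few human-readable lines from a result hash.
# Assumes "matches" is an array reference of hash references and
# "words" is a hash reference keyed by (stemmed) query term.
sub summarize_results {
    my ($results) = @_;
    my @lines = sprintf("%d of %d matches in %s sec",
        $results->{total}, $results->{total_found}, $results->{time});
    for my $match (@{ $results->{matches} }) {
        push @lines, sprintf("doc=%s weight=%s", $match->{doc}, $match->{weight});
    }
    for my $word (sort keys %{ $results->{words} }) {
        my $stats = $results->{words}{$word};
        push @lines, sprintf("%s: %d docs, %d hits",
            $word, $stats->{docs}, $stats->{hits});
    }
    return @lines;
}

# With a client instance in $sph:
#   print "$_\n" for summarize_results(run_search($sph, "search terms"));
```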
$sph->AddQuery($query, $index);
Add a query to a batch request.
Batch queries enable searchd to perform internal optimizations, if possible; and reduce network connection overheads in all cases.
For instance, running exactly the same query with different groupby settings will enable searchd to perform the expensive full-text search and ranking operation only once, but compute multiple groupby results from its output.
Parameters are exactly the same as in Query() call.
Returns corresponding index to the results array returned by RunQueries() call.
$results = $sph->RunQueries;
Run batch of queries, as added by AddQuery.
Returns undef on network IO failure.
Returns an array of result sets on success.
Each result set in the returned array is a hash which contains the same keys as the hash returned by Query, plus:
- error
Errors, if any, for this query.
- warning
Any warnings associated with the query.
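A batch round trip might look like the sketch below. It rests on two assumptions that should be checked against the installed version: that RunQueries returns an array reference (undef on failure), and that per-query errors are stored under a key named "error". run_batch is a hypothetical helper, and it expects each query string to be unique.

```perl
use strict;
use warnings;

# Queue several queries, run them as one batch, and pair each query
# string with its result set via the index returned by AddQuery.
# Assumes RunQueries returns an array reference (undef on failure) and
# that per-query errors appear under the "error" key.
sub run_batch {
    my ($sph, $index, @queries) = @_;
    my %slot_for;
    $slot_for{$_} = $sph->AddQuery($_, $index) for @queries;

    my $result_sets = $sph->RunQueries;
    die "Batch failed: " . $sph->GetLastError . "\n" unless defined $result_sets;

    # Split the result sets into successful and failed queries.
    my (%ok, %failed);
    for my $query (@queries) {
        my $set = $result_sets->[ $slot_for{$query} ];
        if ($set->{error}) { $failed{$query} = $set->{error} }
        else               { $ok{$query}     = $set }
    }
    return (\%ok, \%failed);
}

# With a client instance in $sph:
#   my ($ok, $failed) = run_batch($sph, "*", "first query", "second query");
```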
$excerpts = $sph->BuildExcerpts($docs, $index, $words, $opts);
Generate document excerpts for the specified documents.
- docs
An array reference of strings which represent the document contents
- index
A string specifying the index whose settings will be used for stemming, lexing and case folding
- words
A string which contains the words to highlight
- opts
A hash which contains additional optional highlighting parameters:
- before_match - a string to insert before a set of matching words, default is "<b>"
- after_match - a string to insert after a set of matching words, default is "</b>"
- chunk_separator - a string to insert between excerpts chunks, default is " ... "
- limit - max excerpt size in symbols (codepoints), default is 256
- around - how many words to highlight around each match, default is 5
Returns undef on failure.
Returns an array of string excerpts on success.
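To illustrate the opts hash, the sketch below overlays caller-supplied values on the defaults listed above. excerpt_opts is a hypothetical helper, and the "</b>" default for after_match is an assumption chosen to pair with the "<b>" opening tag.

```perl
use strict;
use warnings;

# Defaults for the highlighting options as listed above; the closing
# "</b>" for after_match is assumed to pair with the "<b>" opening tag.
my %EXCERPT_DEFAULTS = (
    before_match    => '<b>',
    after_match     => '</b>',
    chunk_separator => ' ... ',
    limit           => 256,
    around          => 5,
);

# Hypothetical helper: overlay caller overrides on the defaults and
# reject option names that are not in the documented list.
sub excerpt_opts {
    my (%overrides) = @_;
    for my $name (keys %overrides) {
        die "Unknown excerpt option: $name\n"
            unless exists $EXCERPT_DEFAULTS{$name};
    }
    return { %EXCERPT_DEFAULTS, %overrides };
}

my $opts = excerpt_opts(
    before_match => '<em>',
    after_match  => '</em>',
    limit        => 120,
);

# With a client instance in $sph and document texts in @docs:
#   my $excerpts = $sph->BuildExcerpts(\@docs, "test1", "search terms", $opts);
#   die "Excerpts failed: " . $sph->GetLastError unless defined $excerpts;
```

Rejecting unknown option names up front catches typos such as "before_macth" that would otherwise be silently ignored or passed through.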
$sph->UpdateAttributes($index, \@attrs, \%values);
Update specified attributes on specified documents
- index
Name of the index to be updated
- attrs
Array of attribute name strings
- values
A hash with key as document id, value as an array of new attribute values
Returns number of actually updated documents (0 or more) on success
Returns undef on failure
Usage example:
$sph->UpdateAttributes("test1", [ qw/group_id/ ], { 1 => [ 456 ] });
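When the new values come from a list of records, the values hash can be built programmatically. This is a sketch only: build_update_values is a hypothetical helper, the second record is made-up data, and it assumes each value array must follow the same order as the attrs array.

```perl
use strict;
use warnings;

# Hypothetical helper: build the \%values argument from a list of
# records, emitting each document's new values in the same order as
# the attribute names in @$attrs.
sub build_update_values {
    my ($attrs, @records) = @_;
    my %values;
    for my $record (@records) {
        $values{ $record->{id} } = [ map { $record->{$_} } @$attrs ];
    }
    return \%values;
}

my @attrs  = qw(group_id);
my $values = build_update_values(\@attrs,
    { id => 1, group_id => 456 },
    { id => 2, group_id => 789 },
);

# With a client instance in $sph:
#   my $updated = $sph->UpdateAttributes("test1", \@attrs, $values);
#   die "Update failed: " . $sph->GetLastError unless defined $updated;
```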
Jon Schutz
This module is based on an earlier Perl module (not deployed to CPAN) for Sphinx version 0.9.7-rc1, by Len Kranendonk, which was in turn based on the Sphinx PHP API.
Copyright 2007 Jon Schutz, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License.