Changes for version 0.17

  • Reduced memory usage by TermDocsCache
  • Implemented the Okapi BM25 weighting function (S.E. Robertson et al.; search the web for a description) for scoring documents. Pass scoring_method => 'legacy_tfidf' to new() or search() to use the older method. Thanks to the lucy project (http://www.seg.rmit.edu.au/lucy/) for inspiration.
  • Added docweights table to support Okapi BM25 scoring. Indexes need to be dropped and recreated.
  • Pass "update_commit_interval" to new() to control memory usage during index updates. As the index is built, postings lists accumulate in memory; this parameter sets the number of documents held in memory before they are flushed to the database. Default is 20000. To disable, pass 0, and all documents passed to add_doc() will be indexed in memory before being written to the database.
  • Fixed a problem with inaccurate search results when multiple searches were performed on a single instance while another instance was updating the index.
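
For readers unfamiliar with Okapi BM25, the following is a minimal standalone sketch of the general formula, not the code used by this distribution. The function name bm25_scores, the parameter defaults k1=1.2 and b=0.75, and the particular IDF variant are assumptions chosen for illustration; the module's actual constants and SQL-backed implementation may differ. The sketch does show why per-document length information is needed for scoring, since each term frequency is normalized by document length relative to the average.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Return one BM25 score per document.

    `docs` is a list of token lists. `k1` and `b` are common textbook
    defaults, assumed here for illustration only.
    """
    n_docs = len(docs)
    if n_docs == 0:
        return []
    # Average document length; fall back to 1.0 if every document is empty.
    avg_len = sum(len(d) for d in docs) / n_docs or 1.0

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant that never goes negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturated by k1 and normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    ["full", "text", "search"],
    ["sql", "database"],
    ["text", "text", "index", "sql"],
]
scores = bm25_scores(["text"], docs)
```

In this toy collection the second document does not contain the query term and scores zero, while the third document ranks above the first because its higher term frequency outweighs its slightly greater length.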

Modules

Perl extension for full-text searching in SQL databases

Provides

in lib/DBIx/TextIndex/Exception.pm
in lib/DBIx/TextIndex/TermDocsCache.pm
in lib/DBIx/TextIndex/stop-cz.pm
in lib/DBIx/TextIndex/stop-en.pm