NAME

SQL::KeywordSearch - generate SQL for simple keyword searches

SYNOPSIS

use SQL::KeywordSearch;

my ($search_sql,@bind) =
    sql_keyword_search(
      keywords   => 'cat,brown,whiskers',
      columns    => ['pets','colors','names']
    );

my $sql = "SELECT title from articles
              WHERE user_id = 5 AND ".$search_sql;

About keyword searching

The solution provided here is simple, suitable for relatively small numbers of rows and columns. It is also simple-minded in that it can't sort the results based on their relevance.

For large data sets and more features, a full-text indexing and searching solution is recommended to be used instead. Tsearch2 for PostgreSQL, http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ is one such solution.

Database Support

This module was developed for use with PostgreSQL. It can work with other databases by specifying the regular expression operator to use. The 'REGEXP' operator should work for MySQL.

Since a regular expression for word boundary checking is about the only fancy database feature we used, other databases should work as well.

Functions

sql_keyword_search()

($sql,@bind) = sql_keyword_search(...);
(@interp)    = sql_keyword_search(interp => 1, ...);

sql_keyword_search builds a sql statement based on a keyword field containing a list of comma, space, semicolon or colon separated keywords. This prepares a case-insensitive regular expression search.

($sql, @bind) =
     sql_keyword_search(
        keywords          => 'cat,brown',
        columns           => ['pets','colors'],
        every_colum       => 1,
        every_word        => 1,
        whole_word        => 1,
        operator          => 'REGEXP'
    );

Now the result would look like:

$sql = qq{(
  (lower(pets) ~ lower(?)
   OR lower(colors) ~ lower(?)
  )
   OR
  (lower(pets) ~ lower(?)
   OR lower(colors) ~ lower(?)
  ))};

@bind = ('cat','cat','brown','brown');

You can control the use of AND, OR and other aspects of the SQL generation through the options below.

keywords

A string of comma,space,semicolon or color separated keywords. Required.

columns

An anonymous array of columns to perform the keyword search on. Required.

every_column (default: false)

If you would like all words to match in all columns, you set this to 1.

By default, words can match in one or more columns.

every_word (default: false)

If you would like all words to match in particular column for it to be considered a match, set this value to 1

By default, one or more words can match in a particular column.

whole_word (default: false)

Set this to true to do only match against whole words. A substring search is the default.

operator (default: ~)

Set to 'REGEXP' if you are using MySQL. The default works for PostgreSQL.

interp (default: off)
# integrate with DBIx::Interp
my $articles = $dbx->selectall_arrayref_i("
    SELECT article_id, title, summary
        FROM articles
        WHERE ",
          sql_keyword_search(
              keywords   => $q->param('q'),
              columns    => [qw/title summary/]
              interp     => 1,
        )
        ,attr(Slice=>{}));

Turn this on to return an array of SQL like SQL::Interp or DBIx::Interp expect as input.

AUTHOR

mark@summersault.com, <mark at summersault.com>

BUGS

Please report any bugs or feature requests to bug-sql-keywordsearch at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=SQL-KeywordSearch. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SEE ALSO

COPYRIGHT & LICENSE

Copyright 2006 - 2009 Mark Stosberg, <mark@summersault.com>, all rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.