NAME

SWISH::WebService - provide HTTP access to a Swish-e index

SYNOPSIS

#!/usr/bin/perl

use strict;
use warnings;
use Carp;   # provides croak(), used below
use CGI qw/ -newstyle_urls /;
use SWISH::WebService;

# print multi-byte chars correctly
binmode STDOUT, ":utf8";

my $cgi = CGI->new;

unless ($cgi->param)
{
   print $cgi->header;
   print "no params passed!";
   exit;
}

my $search = SWISH::WebService->new(
                   q => $cgi->param('q'),  # 'my query'
                   o => $cgi->param('o'),  # 'results order'
                   n => $cgi->param('n'),  # 10
                   p => $cgi->param('p'),  # 10
                   s => $cgi->param('s'),  # 1
                   f => $cgi->param('f')   # 'xml'
                   );

# or more simply, pass the CGI object itself instead of the call above:
# my $search = SWISH::WebService->new($cgi);

$search->index('MOVIES');
$search->uri( $cgi->url() );

my $response = $search->search
 or croak $search->error;

print $cgi->header($search->format);

if ( $search->f eq 'html' )
{
   print <<STYLE;
   
   <style>
    <!--
     div.search        { font-family: verdana, helvetica, arial }
     div.item          { padding: 6px; }
     span.snip, 
      span.url, 
      span.title, 
      span.times       { display: block }
     span.url          { color: green; font-size: 90% }
     span.query_words  { font-weight: bold }
     div.stats         { padding: 10px }
     a                 { color: blue }
     a:visited         { color: red }
     a:hover           { color: black }
     span.hilite       { font-weight: bold }
    -->
   </style>
   
STYLE
}

print $response;
                   

DESCRIPTION

SWISH::WebService implements a front-end search API for a Swish-e index. It can use either SWISH::API to search a local index or SWISH::API::Remote to access a SWISHED server.

Multiple output formats are supported, including RSS, XML and HTML. The general idea is that you can run one or more web service applications that share a single SWISHED server and provide a common API. Common features such as results paging, sorting, highlighting and contextual snippets are supported.

API

The supported params are:

q

Query. Query may be of the form:

foo bar        # AND assumed
foo AND bar
foo OR bar
foo NOT bar
field:foo      # limit search to field

See FIELDS below for more on available fields.

Queries are not case sensitive. foo will match FOO and Foo.

o

Sort order. Default is descending by rank. Other properties available by default include:

swishtitle
swishdocpath

Order must be specified as a string like:

swishtitle desc
swishdocpath asc swishtitle desc

Here desc and asc give the sort direction. asc is the default if no direction is specified.

Any property in an index may be sorted on; consult the Swish-e documentation.
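As a rough illustration of the order string format, the sketch below splits a value such as swishdocpath swishtitle desc into property/direction pairs, filling in asc where no direction is given. The parse_order helper is hypothetical and not part of SWISH::WebService; the module's own parsing may well differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SWISH::WebService): split an order
# string into [property, direction] pairs, defaulting to 'asc'.
sub parse_order {
    my ($order) = @_;
    my @tokens = split ' ', $order;
    my @pairs;
    while (@tokens) {
        my $prop = shift @tokens;
        my $dir  = 'asc';    # documented default direction
        if ( @tokens and $tokens[0] =~ /^(?:asc|desc)$/i ) {
            $dir = lc shift @tokens;
        }
        push @pairs, [ $prop, $dir ];
    }
    return @pairs;
}

for my $pair ( parse_order('swishdocpath swishtitle desc') ) {
    print "$pair->[0] => $pair->[1]\n";
}
```

Note that the sketch treats the words asc and desc as directions only, so a property literally named asc or desc would not be handled.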

n

Number of pages for page links. Default is 10.

p

Page size. Default is 10 hits per page. The maximum allowed is 100.

s

Start at result item. Default is 1.
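The p and s params together select a page of results. A minimal sketch of the arithmetic, assuming s is a 1-based item offset as the default of 1 suggests; start_for_page is a hypothetical helper, not a method of this module:

```perl
use strict;
use warnings;

# Hypothetical helper: compute the 's' value for a 1-based page number.
sub start_for_page {
    my ( $page, $page_size ) = @_;
    $page_size = 10  if !defined $page_size or $page_size < 1;  # documented default
    $page_size = 100 if $page_size > 100;                       # documented maximum
    $page      = 1   if !defined $page or $page < 1;
    return ( $page - 1 ) * $page_size + 1;
}

printf "page %d starts at item %d\n", $_, start_for_page( $_, 10 ) for 1 .. 3;
```

With a page size of 10, pages 1, 2 and 3 start at items 1, 11 and 21 respectively.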

f

Format. The following formats are available. Case is ignored.

xml
html
rss
simple

See RESPONSE for more details.
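A client typically passes these params in the query string of a request. The sketch below builds such a URL using only core modules. The endpoint URL and the build_search_url and escape_param helpers are assumptions made for illustration; nothing here is provided by SWISH::WebService itself.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Percent-encode a value as UTF-8 bytes (unreserved characters are kept).
sub escape_param {
    my ($value) = @_;
    my $bytes = encode_utf8($value);
    $bytes =~ s/([^A-Za-z0-9\-_.~])/sprintf '%%%02X', ord $1/ge;
    return $bytes;
}

# Hypothetical helper: append the supported params to a base URL,
# skipping any that were not supplied.
sub build_search_url {
    my ( $base, %params ) = @_;
    my @pairs;
    for my $key (qw( q o n p s f )) {
        next unless defined $params{$key};
        push @pairs, $key . '=' . escape_param( $params{$key} );
    }
    return $base . '?' . join( '&', @pairs );
}

print build_search_url(
    'http://localhost/cgi-bin/search.cgi',    # hypothetical endpoint
    q => 'foo AND bar',
    p => 20,
    f => 'xml',
), "\n";
```

Params are emitted in a fixed order (q, o, n, p, s, f) so that the resulting URL is predictable regardless of hash ordering.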

RESPONSE

Your HTTP response will be in one of the following formats. The default is html. See the f request param above.

xml

<?xml version="1.0" encoding="UTF-8"?>
<search>
 <results>
  <item id="NNN" rank="XXXX"> <!-- id = sort number, rank = score -->
   <title>result title here</title>
   <url>result url here</url>
   <snip>some contextual snippet here showing query words in context</snip>
  </item>
  .
  .
  .
 </results>
 <stats start="nnn" end="xxx" max="sss" total="yyy" runtime="ttt" searchtime="fff"/>
 <links>
  <prev>http://url_for_prev_sss_results</prev>
  <first>http://url_for_first_page_of_results</first>
  <page id="N">http://url_for_page_N_results</page>
  .
  .
  .
  <last>http://url_for_last_page_of_results</last>
  <next>http://url_for_next_sss_results</next>
 </links>
</search>

html

<div class="search">
 <div class="results">
  <div id="item_NNN" class="item">
   <span class="rank">rank here</span>
   <span class="title">result title here</span>
   <span class="url">result url here</span>
   <span class="snip">some contextual snippet here showing query words in context</span>
  </div>
  .
  .
  .
 </div>
 <div class="stats"> 
   <span class="stats">
    Results N - M of T
   </span>
   <span class="query">for <span class="query_words">your query</span></span>
   <span class="stopwords">The following words were automatically removed: 
   <span class="stopwords_words">a the an but</span>
   </span>
   <span class="times">
    Run time: 0.100 sec - Search time: 0.020 sec
   </span>
 </div>

 <div class="links">
  <span class="prev">http://url_for_prev_sss_results</span>
  <span id="pageN">http://url_for_page_N_results</span>
  .
  .
  .
  <span class="next">http://url_for_next_sss_results</span>
 </div>
</div>

rss

The default RSS template uses the RSS 2.0 specification.

simple

t: title
u: url
r: rank
n: number
-
t: ...
.
.
.
--

NOTE: A single - on a line by itself delimits each result. A double - (--) marks the end of the results.
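The simple format is easy to consume without an XML or HTML parser. The following is a minimal client-side sketch, assuming the layout shown above; parse_simple is a hypothetical helper and the sample data is made up. It accepts the final result being closed either by - followed by -- or by -- alone, since the example above does not settle which is used.

```perl
use strict;
use warnings;

# Hypothetical client-side parser for the 'simple' format.
sub parse_simple {
    my ($text) = @_;
    my %name_for = ( t => 'title', u => 'url', r => 'rank', n => 'number' );
    my ( @results, %item );
    for my $line ( split /\r?\n/, $text ) {
        if ( $line eq '-' or $line eq '--' ) {
            # End of one result; store it if any fields were collected.
            push @results, +{%item} if %item;
            %item = ();
            last if $line eq '--';    # end of all results
        }
        elsif ( $line =~ /^([turn]):\s*(.*)$/ ) {
            $item{ $name_for{$1} } = $2;
        }
    }
    return \@results;
}

# Made-up sample response body.
my $body = join "\n",
    't: First hit',  'u: http://example.com/1', 'r: 1000', 'n: 1', '-',
    't: Second hit', 'u: http://example.com/2', 'r: 750',  'n: 2', '--';

printf "%s (%s)\n", $_->{title}, $_->{url} for @{ parse_simple($body) };
```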

METHODS

new

Instantiate a new Search object. Any of the accessor methods described below can also be used as a key/value pair param with new().

error

template

searchtime

query

stopwords

wordchars

beginchars

endchars

swish

templates

debug

server

index

uri

results

title

rss

hiliter

snipper

xml

AUTHOR

Peter Karman <perl@peknet.com>.

Thanks to Atomic Learning for supporting the development of this module.

COPYRIGHT

This code is licensed under the same terms as Perl itself.