NAME

WWW::Search::Scraper::CraigsList - class for scraping CraigsList

SYNOPSIS

require WWW::Search::Scraper;
my $search = WWW::Search::Scraper->new('CraigsList');

DESCRIPTION

This class is a CraigsList specialization of WWW::Search. It handles making and interpreting CraigsList searches at http://www.CraigsList.com.

This class exports no public interface; all interaction should be done through WWW::Search objects.

OPTIONS

None at this time (2001.04.25)

search_url=URL

Specifies the URL to query for CraigsList searches. The default is http://www.CraigsList.com/cgi-bin/job-search.

search_debug, search_parse_debug, search_ref

Specified at WWW::Search.

Internet/Web Engineering Category options:

    <null> - ALL JOBS
    art - web design jobs
    bus - business jobs
    mar - marketing jobs
    eng - internet engineering jobs
    etc - etcetera jobs
    wri - writing jobs
    sof - software jobs
    acc - finance jobs
    ofc - office jobs
    med - media jobs
    hea - health science jobs
    ret - retail jobs
    npo - nonprofit jobs
    lgl - legal jobs
    egr - engineering jobs
    sls - sales jobs
    sad - sys admin jobs
    tel - network jobs
    tfr - tv video radio jobs
    hum - human resource jobs
    tch - tech support jobs
    edu - education jobs
    trd - skilled trades jobs

Checkboxes - additive to search(?)

    addOne value=telecommuting - telecommute
    addTwo value=contract - contract
    addThree value=internship - internships
    addFour value=part-time - part-time
    addFive value=non-profit - non-profit
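As a rough illustration of how a category code and the checkbox fields listed above might be combined into a query URL, here is a minimal sketch. It assumes the search form is submitted as a GET request with ordinary form encoding. The helper names build_search_url and escape_value, and the form-field name "category", are hypothetical and are not taken from the module; only the addOne..addFive names and their values come from the list above.

```perl
use strict;
use warnings;

# Hypothetical helper code: not part of WWW::Search::Scraper::CraigsList.
# 'category' is an ASSUMED form-field name. The addOne..addFive names and
# their values are copied from the checkbox list in this document.
my %CHECKBOX_FIELD = (
    'telecommuting' => 'addOne',
    'contract'      => 'addTwo',
    'internship'    => 'addThree',
    'part-time'     => 'addFour',
    'non-profit'    => 'addFive',
);

# Percent-encode every character outside the URI "unreserved" set.
sub escape_value {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

# Build a query URL from a base URL, a category code, and checkbox values.
sub build_search_url {
    my ($base_url, $category, @filters) = @_;
    my @pairs;

    # An empty or undefined category stands for <null>, i.e. all jobs.
    push @pairs, 'category=' . escape_value($category)
        if defined $category && length $category;

    for my $filter (@filters) {
        my $field = $CHECKBOX_FIELD{$filter}
            or die "Unknown checkbox value: $filter\n";
        push @pairs, $field . '=' . escape_value($filter);
    }

    return @pairs ? $base_url . '?' . join('&', @pairs) : $base_url;
}

my $url = build_search_url(
    'http://www.CraigsList.com/cgi-bin/job-search',
    'sof', 'telecommuting', 'contract',
);
print "$url\n";
```

Whether the checkboxes really narrow or widen the result set is not settled by this document, so the sketch only shows how the fields could be encoded, not how the server interprets them.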

SEE ALSO

To make new back-ends, see WWW::Search, or the specialized CraigsList searches described in options.

HOW DOES IT WORK?

native_setup_search is called before we do anything. It initializes our private variables (which all begin with underscores) and sets up a URL to the first results page in {_next_url}.

native_retrieve_some is called (from WWW::Search::retrieve_some) whenever more hits are needed. It calls the LWP library to fetch the page specified by {_next_url}. It parses this page, appending any search hits it finds to {cache}. If it finds a "next" button in the text, it sets {_next_url} to point to the page for the next set of results; otherwise it sets {_next_url} to undef to indicate we're done.
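The control flow described above can be sketched without any network access. The following is a simplified, self-contained model, not the module's actual code: setup_search and retrieve_some are hypothetical stand-ins for native_setup_search and native_retrieve_some, and %FAKE_PAGES is an invented table that replaces the LWP fetch and the HTML parsing.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the web: page name => { hits, next }.
# A real back-end would fetch and parse each page with LWP instead.
my %FAKE_PAGES = (
    'page1' => { hits => ['job A', 'job B'], next => 'page2' },
    'page2' => { hits => ['job C'],          next => undef   },
);

# Models native_setup_search: initialise the private state and point
# {_next_url} at the first results page.
sub setup_search {
    my ($first_url) = @_;
    return { _next_url => $first_url, cache => [] };
}

# Models native_retrieve_some: take one page, append its hits to
# {cache}, then advance {_next_url} (undef means there are no more pages).
sub retrieve_some {
    my ($self) = @_;
    return 0 unless defined $self->{_next_url};

    my $page = $FAKE_PAGES{ $self->{_next_url} }
        or die "No such page: $self->{_next_url}\n";

    push @{ $self->{cache} }, @{ $page->{hits} };
    $self->{_next_url} = $page->{next};
    return scalar @{ $page->{hits} };
}

my $search = setup_search('page1');

# Keep asking for more hits until there is no next page.
retrieve_some($search) while defined $search->{_next_url};

print scalar(@{ $search->{cache} }), " hits: @{ $search->{cache} }\n";
```

Looping on {_next_url} rather than on the number of hits returned means a results page with no hits but a "next" link would not end the search early.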

AUTHOR and CURRENT VERSION

WWW::Search::CraigsList is written and maintained by Glenn Wood, <glenwood@alumni.caltech.edu>.

The best place to obtain WWW::Search::CraigsList is from Martin Thurn's WWW::Search releases on CPAN. Because CraigsList sometimes changes its format between his releases, more up-to-date versions can sometimes be found at http://alumni.caltech.edu/~glenwood/SOFTWARE/index.html.

COPYRIGHT

Copyright (c) 2001 Glenn Wood. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

------------------------------------------------

Search.pm and Search::AltaVista.pm (of which CraigsList.pm is a derivative) are Copyright (c) 1996-1998 University of Southern California. All rights reserved.

Redistribution and use in source and binary forms are permitted provided that the above copyright notice and this paragraph are duplicated in all such forms and that any documentation, advertising materials, and other materials related to such distribution and use acknowledge that the software was developed by the University of Southern California, Information Sciences Institute. The name of the University may not be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.