SYNOPSIS
Log::Log4perl->init_once('log4perl.conf');
$crawler = CongRec::Crawler->new();
$crawler->goForth();
ATTRIBUTES
- issuesRoot
The root page for Daily Digest issues.
Breadcrumb path: Library of Congress > THOMAS Home > Congressional Record > Browse Daily Issues
- days
A hash of available issues, keyed as $issues{$year}{$month}{$day}{$section}
- mech
A stateful WWW::Mechanize object used to fetch pages from THOMAS.
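The nested hash held by the days attribute can be sketched as follows. This is only an illustration of the year/month/day/section keying described above; the leaf value (here a section URL) and the sample keys are assumptions, since the source does not say what is stored at the leaves:

```perl
use strict;
use warnings;

# Hypothetical contents of the issues hash; keys follow the
# $issues{$year}{$month}{$day}{$section} shape described above.
my %issues;
$issues{2010}{7}{21}{Senate} = 'http://thomas.loc.gov/...';
$issues{2010}{7}{21}{House}  = 'http://thomas.loc.gov/...';

# Walk the nesting: year -> month -> day -> section
for my $year (sort keys %issues) {
    for my $month (sort keys %{ $issues{$year} }) {
        for my $day (sort keys %{ $issues{$year}{$month} }) {
            for my $section (sort keys %{ $issues{$year}{$month}{$day} }) {
                print "$year-$month-$day $section\n";
            }
        }
    }
}
```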
METHODS
goForth()
Start crawling from the Daily Digest issues page, i.e. http://thomas.loc.gov/home/Browse.php?&n=Issues
Also, for a specific congress, where NUM is congress number: http://thomas.loc.gov/home/Browse.php?&n=Issues&c=NUM
Returns the total number of pages grabbed.
parseRoot(Str $content)
Parse the root page of an issue and fill our hash of available issues.