NAME
WWW::Sucksub::Attila - automated access to the Attila French subtitles database
VERSION
Version 0.05
SYNOPSIS
WWW::Sucksub::Attila is a web robot based on the WWW::Mechanize module. It parses a remote web database specialised in French subtitles and builds a dbm file to store the results (film title and HTTP link to the subtitle file). The dbm file acts as a dictionary that you can update and query for quick searches.
use WWW::Sucksub::Attila;
my $test = WWW::Sucksub::Attila->new(
    motif  => $mot,
    debug  => 1,
    logout => '/where/debug/file/is/written.txt',
    dbfile => '/where/dbm/file/is.db',
    html   => '/where/html/report/will/be/written.html',
);
$test->update(); # parse the whole site and collect the subtitle HTTP links
$test->search(); # search the local dbm file and produce an HTML report
CONSTRUCTOR AND STARTUP
Attila Constructor
The new() constructor comes with default values, which you can override as shown in the synopsis example. The default values are:
my $foo = WWW::Sucksub::Attila->new(
    dbfile    => "$ENV{HOME}"."/attila.db",
    html      => "$ENV{HOME}"."/attila_repport.html",
    motif     => undef,
    tempfile  => "$ENV{HOME}"."/.attila_tmp.html",
    debug     => 0,
    logout    => \*STDOUT,
    useragent => "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007",
);
The environment variable $ENV{HOME} must exist unless you override every constructor value that depends on it.
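When $ENV{HOME} may be missing (for example under cron or on some Windows setups), the path-based arguments can be built explicitly before calling new(). The following is only a sketch: the helper name base_dir and the choice of the system temp directory as a fallback are assumptions made for illustration, not part of the module.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: choose a base directory for the default files.
# Falls back to the system temp directory when HOME is not set.
sub base_dir {
    my $home = $ENV{HOME};
    return ( defined $home && length $home ) ? $home : File::Spec->tmpdir;
}

my $base = base_dir();

# Constructor arguments that no longer depend on $ENV{HOME} being defined.
my %args = (
    dbfile   => File::Spec->catfile( $base, 'attila.db' ),
    tempfile => File::Spec->catfile( $base, '.attila_tmp.html' ),
);

print "$_ => $args{$_}\n" for sort keys %args;
```

The resulting %args hash can then be passed to the constructor together with the other attributes, for instance WWW::Sucksub::Attila->new( motif => 'xxx', %args ).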
new() constructor attributes and associated methods
All listed attributes can be read or modified through the method of the same name: calling the method with an argument sets the attribute value, and calling it without arguments returns the current value.
my $foo = WWW::Sucksub::Attila->new();
$foo->useragent() # get the useragent attribute value
$foo->useragent('tructruc') # set the useragent attribute value to 'tructruc'
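For readers unfamiliar with this combined getter/setter convention, here is a minimal sketch of how such an accessor is commonly written in Perl. The package My::Robot is a hypothetical stand-in used only for illustration; it is not the module's actual source.

```perl
use strict;
use warnings;

# Hypothetical stand-in class, NOT the real WWW::Sucksub::Attila code.
package My::Robot;

sub new {
    my ( $class, %args ) = @_;
    # Defaults first, caller-supplied values override them.
    my $self = { useragent => 'default-agent', %args };
    return bless $self, $class;
}

# Combined getter/setter: stores the argument when one is given,
# and returns the current value in both cases.
sub useragent {
    my $self = shift;
    $self->{useragent} = shift if @_;
    return $self->{useragent};
}

package main;

my $robot = My::Robot->new();
print $robot->useragent(), "\n";    # prints "default-agent"
$robot->useragent('tructruc');
print $robot->useragent(), "\n";    # prints "tructruc"
```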
motif()
You should give this attribute a real value: if $foo->motif is undef, execution of the package is aborted.
$foo->motif('xxx')
Specifies that you are searching for a title that contains 'xxx'.
$foo->motif()
Returns the current search string.
debug()
WWW::Sucksub::Attila can produce a lot of useful debugging information. The default value is "0", which means that no debug information is written to the output (see also the logout() method).
$foo->debug(0) # stop producing debugging information
$foo->debug(1) # debug information is written to the log file (see the logout() method)
logout()
A log file can be defined to keep a trace of the website parsing. You have to set $foo->debug(1) to get more detailed information.
$foo->logout(); #get the current logout() value
$foo->logout('/home/xxx/log.txt') #set logout() value.
Note that the default value is STDOUT. The logout() value can only be set in the new() constructor.
dbfile()
Defines the dbm file used to store and retrieve the extracted information. You must provide a full path to the db file to store results.
dbfile('/where/your/db/is.db')
The file must be readable and writable.
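To illustrate what storing (title, link) pairs in a dbm file looks like, the sketch below ties a hash to a file using the core SDBM_File module. Which dbm flavour the module really uses is not stated here, so SDBM_File is an assumption, and the film title and URL are made-up sample data.

```perl
use strict;
use warnings;
use Fcntl;                      # exports O_RDWR and O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);
use File::Spec;

# Work in a throwaway directory so the demo leaves nothing behind.
my $dir    = tempdir( CLEANUP => 1 );
my $dbfile = File::Spec->catfile( $dir, 'attila_demo' );

# Open (and create if needed) the dbm file; it must be readable and writable.
tie my %db, 'SDBM_File', $dbfile, O_RDWR | O_CREAT, 0666
    or die "Cannot open $dbfile: $!";

# Hypothetical record: film title => link to the subtitle file.
$db{'Some Film Title'} = 'http://example.org/subtitles/some_film.zip';
untie %db;

# Reopen the same file and read the record back.
tie my %again, 'SDBM_File', $dbfile, O_RDWR, 0666
    or die "Cannot reopen $dbfile: $!";
my $link = $again{'Some Film Title'};
untie %again;

print "$link\n";
```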
html()
Defines the simple HTML output file where the search report is written. You must provide a full path to the HTML file if you want HTML output.
html('/where/the html/repport/is/written.html')
If $foo->html() is defined, you can get the value of this attribute like this:
my $html_page = $foo->html
The default value is set automatically by the new() call.
html => "$ENV{HOME}"."/attila_report.html";
The HTML file is used for reporting by the search() method.
useragent()
The argument should be a valid user agent string. There is normally no reason to change the default value.
$foo->useragent()
Returns the value of the current user agent.
$foo->useragent('xxxxxxxx')
Sets the useragent() value to 'xxxxxxxx'.
FUNCTIONS
These functions use the attribute values described above.
search()
This function takes no arguments. It launches a search of the local dbm file.
$foo->search()
The dbm file is read to return every (title, link) pair that matches the motif() pattern you defined beforehand.
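The matching step can be pictured as a filter over the stored titles. The sketch below is only an illustration: the function name find_titles, the sample data, and the case-insensitive substring match are assumptions, and a plain hash stands in for the dbm file.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the contents of the dbm file.
my %db = (
    'Le Fabuleux Destin' => 'http://example.org/subs/1.zip',
    'La Haine'           => 'http://example.org/subs/2.zip',
    'Le Grand Bleu'      => 'http://example.org/subs/3.zip',
);

# Return the titles containing $motif, sorted alphabetically.
# Dies when no motif is given, mirroring the rule that motif() must be set.
sub find_titles {
    my ( $motif, $db ) = @_;
    die "motif must be defined\n" unless defined $motif;
    # \Q...\E quotes the motif so it is matched literally, not as a regex.
    my @hits = grep { /\Q$motif\E/i } sort keys %$db;
    return @hits;
}

my @hits = find_titles( 'grand', \%db );
print "$_ => $db{$_}\n" for @hits;
```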
update()
This function takes no arguments. It starts the remote search on the web site http://davidbillemont5.free.fr/ (the Attila website) and writes the local dbm file automatically. Results are accumulated in the dbm file you defined in the new() call. Note that the update can take a while.
AUTHOR
Timothée Foucart, <timothee.foucart@apinc.org>
BUGS
Please report any bugs or feature requests to bug-www-sucksub-attila@rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=WWW-Sucksub-Attila. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SEE ALSO
COPYRIGHT & LICENSE
Copyright 2005 Timothée Foucart, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.