NAME
WWW::Crawler::Mojo::Job - Single crawler job
SYNOPSIS
my $job1 = WWW::Crawler::Mojo::Job->new;
$job1->url('http://example.com/');
my $job2 = $job1->child;
DESCRIPTION
This class represents a single crawler job.
ATTRIBUTES
context
Either a Mojo::DOM or a Mojo::URL instance that the job was referred by.
$job->context($dom);
say $job->context;
closed
A flag that indicates whether the job is closed.
$job->closed(1);
say $job->closed;
depth
The depth of the job in the referrer series. A job with no parent has depth 0.
my $job1 = WWW::Crawler::Mojo::Job->new;
my $job2 = $job1->child;
my $job3 = $job2->child;
say $job1->depth; # 0
say $job2->depth; # 1
say $job3->depth; # 2
literal_uri
A Mojo::URL instance of the literal URL as it appeared in the referrer document.
$job1->literal_uri('./index.html');
say $job1->literal_uri; # './index.html'
referrer
The job instance that referred to the URL.
$job1->referrer($job);
my $job2 = $job1->referrer;
redirect_history
An array reference that contains the URLs of the redirect history.
$job1->redirect_history([$url1, $url2, $url3]);
my $history = $job1->redirect_history;
url
A Mojo::URL instance of the resolved URL.
$job1->url('http://example.com/');
say $job1->url; # 'http://example.com/'
method
HTTP request method such as GET or POST.
$job1->method('GET');
say $job1->method; # GET
tx_params
A hash reference that contains parameters for Mojo::Transaction.
$job1->tx_params({foo => 'bar'});
my $params = $job1->tx_params;
METHODS
clone
Clones the job.
my $job2 = $job1->clone;
close
Closes the job and cuts the referrer series.
$job->close;
child
Instantiates a child job from the parent job. The parent job is set as the child's referrer.
my $job1 = WWW::Crawler::Mojo::Job->new(url => 'http://example.com/1');
my $job2 = $job1->child(url => 'http://example.com/2');
say $job2->referrer->url; # 'http://example.com/1'
digest
Generates a digest string from the url, method, and tx_params attributes.
say $job->digest;
redirect
Replaces the resolved URL and the redirect history at once.
my $job = WWW::Crawler::Mojo::Job->new;
$job->url($url1);
$job->redirect($url2, $url3);
say $job->url; # $url2
say $job->redirect_history; # [$url1, $url3]
original_url
Returns the original URL of a redirected job. If the job has been redirected, returns the last element of the redirect_history attribute; otherwise returns the url attribute.
$job1->redirect_history([$url1, $url2, $url3]);
my $url4 = $job1->original_url; # $url4 is $url3
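The selection rule described above can be sketched in plain Perl without loading the module. The helper name pick_original_url is hypothetical and only mirrors the documented behavior; it is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the documented rule: return the last element
# of the redirect history if there is one, otherwise the current URL.
sub pick_original_url {
    my ($url, $history) = @_;
    return @$history ? $history->[-1] : $url;
}

# No redirects recorded: the current URL is returned.
print pick_original_url('http://example.com/b', []), "\n";

# One redirect recorded: the last history element is returned.
print pick_original_url('http://example.com/b', ['http://example.com/a']), "\n";
```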
upgrade
Instantiates a job from a string or a Mojo::URL instance.
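A minimal usage sketch, assuming upgrade is called as a class method and that the resulting job exposes the given value through its url attribute. Neither detail is stated above, so treat both as assumptions.

```perl
use strict;
use warnings;
use Mojo::URL;
use WWW::Crawler::Mojo::Job;

# Assumption: upgrade is a class method that accepts either a plain string
# or a Mojo::URL instance.
my $job1 = WWW::Crawler::Mojo::Job->upgrade('http://example.com/');
my $job2 = WWW::Crawler::Mojo::Job->upgrade(Mojo::URL->new('http://example.com/'));

# Assumption: the value passed in is available through the url attribute.
print $job1->url, "\n";
print $job2->url, "\n";
```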
AUTHOR
Keita Sugama, <sugama@jamadam.com>
COPYRIGHT AND LICENSE
Copyright (C) Keita Sugama.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.