NAME
WWW::Curl::UserAgent - UserAgent based on libcurl
VERSION
version 0.9.8
SYNOPSIS
    use HTTP::Request;
    use WWW::Curl::UserAgent;

    my $ua = WWW::Curl::UserAgent->new(
        timeout         => 10000,
        connect_timeout => 1000,
    );

    $ua->add_request(
        request    => HTTP::Request->new( GET => 'http://search.cpan.org/' ),
        on_success => sub {
            my ( $request, $response ) = @_;
            if ( $response->is_success ) {
                print $response->content;
            }
            else {
                die $response->status_line;
            }
        },
        on_failure => sub {
            my ( $request, $error_msg, $error_desc ) = @_;
            die "$error_msg: $error_desc";
        },
    );

    $ua->perform;
DESCRIPTION
WWW::Curl::UserAgent is a web user agent based on libcurl. It works with HTTP::Request and HTTP::Response objects and handler callbacks. For simpler cases there is also a method that maps a single request to a response.
WWW::Curl provides the power of libcurl, which handles, among other things, connection keep-alive, parallel requests and asynchronous callbacks. This package was written because WWW::Curl::Simple does not handle keep-alive correctly and does not support PUT, HEAD and other request methods such as DELETE.
The simpler interface is the method request(), which returns an HTTP::Response for a given HTTP::Request. The usual way to use this library is to add as many requests with callbacks as your code allows and to run perform afterwards. The callbacks are then executed sequentially as the responses arrive, beginning with the first response received. The request() method does not support this, because no callbacks are defined for it.
This library is in production use on https://www.xing.com.
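The queue-then-perform pattern described above can be sketched as follows. The URLs are placeholders (an assumption; any endpoints you can reach will do), and the @finished array only records the order in which the callbacks ran.

```perl
use strict;
use warnings;
use HTTP::Request;
use WWW::Curl::UserAgent;

my $ua = WWW::Curl::UserAgent->new( timeout => 5000, connect_timeout => 1000 );

# Placeholder URLs (assumption): replace with endpoints you can reach.
my @urls = map { "http://localhost:8080/item/$_" } 1 .. 3;

my @finished;    # URLs in the order their callbacks ran

for my $url (@urls) {
    $ua->add_request(
        request    => HTTP::Request->new( GET => $url ),
        on_success => sub {
            my ( $request, $response ) = @_;
            push @finished, $request->uri->as_string;
            print $response->status_line, ' ', $request->uri, "\n";
        },
        on_failure => sub {
            my ( $request, $error_msg, $error_desc ) = @_;
            push @finished, $request->uri->as_string;
            warn "$error_msg: $error_desc\n";
        },
    );
}

# The requests are only queued above; perform runs them and returns once
# every queued request has been handled by one of its callbacks.
$ua->perform;

printf "%d of %d requests handled\n", scalar @finished, scalar @urls;
```

Whether a request ends in on_success or on_failure depends on what is listening at the placeholder address; in both cases exactly one callback runs per request.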
CONSTRUCTOR METHODS
The following constructor methods are available:
- $ua = WWW::Curl::UserAgent->new( %options )
-
This method constructs a new WWW::Curl::UserAgent object and returns it. Key/value pair arguments may be provided to set up the initial state. The default values are intended to follow the default values of libcurl. The following options correspond to attribute methods described below:
    KEY                 DEFAULT
    -----------------   ---------------------------
    user_agent_string   www.curl.useragent/$VERSION
    connect_timeout     300
    timeout             0
    parallel_requests   5
    keep_alive          1
    followlocation      0
    max_redirects       -1
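A minimal sketch of a constructor call that overrides some of these defaults and reads the resulting state back through the attribute methods. The option values and the agent name are illustrative assumptions, not recommendations.

```perl
use strict;
use warnings;
use WWW::Curl::UserAgent;

# Override some defaults from the table; the values are illustrative.
my $ua = WWW::Curl::UserAgent->new(
    user_agent_string => 'my-crawler/1.0',    # hypothetical agent name
    connect_timeout   => 1000,                # milliseconds
    timeout           => 10000,               # milliseconds
    parallel_requests => 10,
    followlocation    => 1,
    max_redirects     => 3,
);

# Options that were not passed keep their defaults and can be read
# through the accessors of the same name.
printf "keep_alive=%s parallel_requests=%s timeout=%s\n",
    $ua->keep_alive, $ua->parallel_requests, $ua->timeout;
```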
ATTRIBUTES
- $ua->connect_timeout / $ua->connect_timeout($connect_timeout)
-
Get/set the timeout in milliseconds for establishing the connection. If the connection is not established within the timeout, the on_failure handler is called.
- $ua->timeout / $ua->timeout($timeout)
-
Get/set the timeout in milliseconds waiting for the response to be received. If the response is not received within the timeout the on_failure handler is called.
- $ua->parallel_requests / $ua->parallel_requests($parallel_requests)
-
Get/set the maximum number of requests performed in parallel. libcurl itself may perform fewer requests in parallel than this number, but not more.
- $ua->keep_alive / $ua->keep_alive($boolean)
-
Get/set whether TCP connections should be reused with keep-alive. If keep-alive is disabled, the TCP connection is closed after the response has been received and the corresponding header "Connection: close" is set. If keep-alive is enabled (default), libcurl handles the connections.
- $ua->followlocation / $ua->followlocation($boolean)
-
Get/set whether curl should follow redirects. The headers of the redirect responses are discarded while redirecting, so that only the final response is passed to the corresponding handler.
- $ua->max_redirects / $ua->max_redirects($max_redirects)
-
Get/set the maximum number of redirects. -1 (default) means unlimited redirects; 0 means no redirects at all. If the maximum number of redirects is reached, the on_failure handler is called.
- $ua->user_agent_string / $ua->user_agent_string($user_agent)
-
Get/set the user agent submitted in each request.
- $ua->request_queue_size
-
Get the number of queued requests that have not been performed yet.
- $ua->request( $request, %args )
-
Immediately performs a single HTTP::Request and returns the HTTP::Response. Optional parameters override the user agent's settings for this single request. Possible options are:
    connect_timeout
    timeout
    keep_alive
    followlocation
    max_redirects
Some examples for a request:
    my $request = HTTP::Request->new( GET => 'http://search.cpan.org/' );
    my $response = $ua->request($request);
    $response = $ua->request(
        $request,
        timeout    => 3000,
        keep_alive => 0,
    );
If an error occurs, e.g. a timeout, the corresponding HTTP::Response object has the status code 500, the short error description as message and a longer description as content. The method runs perform() internally, so queued requests are performed, too.
- $ua->add_request(%args)
-
Adds a request with callback handlers to the queue. The on_success callback is called for every response that was read successfully, even those containing HTTP error codes. The on_failure handler is called when libcurl reports an error, e.g. a timeout or bad curl settings. The parameters request, on_success and on_failure are mandatory. Optional are timeout, connect_timeout, keep_alive, followlocation and max_redirects.
    $ua->add_request(
        request    => HTTP::Request->new( GET => 'http://search.cpan.org/' ),
        on_success => sub {
            my ( $request, $response, $easy ) = @_;
            print $request->as_string;
            print $response->as_string;
        },
        on_failure => sub {
            my ( $request, $err_msg, $err_desc, $easy ) = @_;
            # error handling
        },
    );
The callbacks receive as their last parameter the WWW::Curl::Easy object that was used to perform the request. It can be used to obtain further information, such as statistical data about the request.
Chaining of add_request calls is a feature of this module. If you add a request within an on_success handler, it is executed immediately when the callback runs. This can be useful to react to a response right away:
    $ua->add_request(
        request    => HTTP::Request->new( POST => 'http://search.cpan.org/', [], $form ),
        on_failure => sub { die },
        on_success => sub {
            my ( $request, $response ) = @_;
            my $target_url = get_target_from($response);
            $ua->add_request(
                request    => HTTP::Request->new( GET => $target_url ),
                on_failure => sub { die },
                on_success => sub {
                    my ( $request, $response ) = @_;
                    # actually do something
                },
            );
        },
    );
    $ua->perform;    # executes both requests
- $ua->add_handler($handler)
-
To have more control over the handler you can add a WWW::Curl::UserAgent::Handler yourself. The WWW::Curl::UserAgent::Request inside the handler requires all parameters that are passed to libcurl to be given explicitly, so that default values are not defined in two places. Through the WWW::Curl::UserAgent::Request you can modify the WWW::Curl::Easy object before the request is performed.
    my $handler = WWW::Curl::UserAgent::Handler->new(
        on_success => sub {
            my ( $request, $response, $easy ) = @_;
            print $request->as_string;
            print $response->as_string;
        },
        on_failure => sub {
            my ( $request, $err_msg, $err_desc, $easy ) = @_;
            # error handling
        },
        request => WWW::Curl::UserAgent::Request->new(
            http_request    => HTTP::Request->new( GET => 'http://search.cpan.org/' ),
            connect_timeout => $ua->connect_timeout,
            timeout         => $ua->timeout,
            keep_alive      => $ua->keep_alive,
            followlocation  => $ua->followlocation,
            max_redirects   => $ua->max_redirects,
        ),
    );
    $handler->request->curl_easy->setopt( ... );
    $ua->add_handler($handler);
- $ua->perform
-
Perform all queued requests. This method returns after all responses have been received and all handlers have been processed.
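As a rough sketch of how the methods above fit together, the following queues two requests with per-request overrides, inspects request_queue_size before and after perform, and reads the total transfer time from the WWW::Curl::Easy object handed to the callbacks. The host is a placeholder, and the use of CURLINFO_TOTAL_TIME assumes that WWW::Curl::Easy exports the libcurl info constants in the same way as the CURLINFO_HTTP_CODE constant used in its own documentation.

```perl
use strict;
use warnings;
use HTTP::Request;
use WWW::Curl::Easy;         # assumed to export the CURLINFO_* constants
use WWW::Curl::UserAgent;

my $ua = WWW::Curl::UserAgent->new;

my $handled = 0;
for my $path (qw( /a /b )) {
    $ua->add_request(
        # Placeholder host (assumption): point this at a server you control.
        request         => HTTP::Request->new( GET => "http://localhost:8080$path" ),
        timeout         => 3000,    # overrides the agent setting for this request
        connect_timeout => 500,
        on_success      => sub {
            my ( $request, $response, $easy ) = @_;
            $handled++;
            printf "%s %s in %.3fs\n", $response->code, $request->uri,
                $easy->getinfo(CURLINFO_TOTAL_TIME);
        },
        on_failure => sub {
            my ( $request, $err_msg, $err_desc, $easy ) = @_;
            $handled++;
            warn "$err_msg: $err_desc\n";
        },
    );
}

my $queued_before = $ua->request_queue_size;    # requests waiting for perform
$ua->perform;
my $queued_after = $ua->request_queue_size;

print "queued before perform: $queued_before, after: $queued_after, handled: $handled\n";
```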
BENCHMARK
A test with the tools/benchmark.pl script against a load-balanced web server, performing GET requests to a simple echo API on an Intel i5 M 520 with Fedora 19, gave the following results:
500 requests (sequentially, 500 iterations):
+-------------------------------+-----------+------+------+------------+------------+
| User Agent                    | Wallclock | CPU  | CPU  | Requests   | Iterations |
|                               | seconds   | usr  | sys  | per second | per second |
+-------------------------------+-----------+------+------+------------+------------+
| LWP::UserAgent 6.05           |        21 | 1.10 | 0.20 |       23.8 |      384.6 |
+-------------------------------+-----------+------+------+------------+------------+
| LWP::Parallel::UserAgent 2.61 |        20 | 1.13 | 0.22 |       25.0 |      370.4 |
+-------------------------------+-----------+------+------+------------+------------+
| WWW::Curl::Simple 0.100191    |        95 | 0.66 | 0.27 |        5.3 |      537.6 |
+-------------------------------+-----------+------+------+------------+------------+
| Mojo::UserAgent 4.83          |        10 | 1.19 | 0.08 |       50.0 |      393.7 |
+-------------------------------+-----------+------+------+------------+------------+
| WWW::Curl::UserAgent 0.9.6    |        10 | 0.55 | 0.06 |       50.0 |      819.7 |
+-------------------------------+-----------+------+------+------------+------------+
500 requests (5 in parallel, 100 iterations):
+-------------------------------+-----------+--------+--------+------------+------------+
| User Agent                    | Wallclock | CPU    | CPU    | Requests   | Iterations |
|                               | seconds   | usr    | sys    | per second | per second |
+-------------------------------+-----------+--------+--------+------------+------------+
| LWP::Parallel::UserAgent 2.61 |        10 |   1.26 |   0.26 |       50.0 |       65.8 |
+-------------------------------+-----------+--------+--------+------------+------------+
| WWW::Curl::Simple 0.100191    |       815 | 270.16 | 191.76 |        0.6 |        0.2 |
+-------------------------------+-----------+--------+--------+------------+------------+
| Mojo::UserAgent 4.83          |         3 |   1.03 |   0.04 |      166.7 |       93.5 |
+-------------------------------+-----------+--------+--------+------------+------------+
| WWW::Curl::UserAgent 0.9.6    |         3 |   0.42 |   0.06 |      166.7 |      208.3 |
+-------------------------------+-----------+--------+--------+------------+------------+
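The derived columns appear to follow directly from the measured ones: requests per second matches 500 requests divided by the wallclock seconds, and iterations per second matches the iteration count divided by the total CPU time (usr + sys). This reading is inferred from the numbers in the tables, not taken from tools/benchmark.pl. A small check using the two rows for WWW::Curl::UserAgent:

```perl
use strict;
use warnings;

# Rows for WWW::Curl::UserAgent 0.9.6, copied from the tables above.
my @rows = (
    { label => 'sequential', requests => 500, iterations => 500, wall => 10, usr => 0.55, sys => 0.06 },
    { label => '5 parallel', requests => 500, iterations => 100, wall => 3,  usr => 0.42, sys => 0.06 },
);

my @derived;
for my $row (@rows) {
    # Assumed formulas: requests / wallclock and iterations / CPU time.
    my $req_per_sec  = sprintf '%.1f', $row->{requests} / $row->{wall};
    my $iter_per_sec = sprintf '%.1f', $row->{iterations} / ( $row->{usr} + $row->{sys} );
    push @derived, [ $req_per_sec, $iter_per_sec ];
    print "$row->{label}: $req_per_sec requests/s, $iter_per_sec iterations/s\n";
}
```

Both rows reproduce the table values (50.0 and 819.7 for the sequential run, 166.7 and 208.3 for the parallel run), so the wallclock column, which is reported in whole seconds, limits the precision of the requests-per-second figures.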
SEE ALSO
See HTTP::Request and HTTP::Response for a description of the message objects dispatched and received. See HTTP::Request::Common and HTML::Form for other ways to build request objects.
See WWW::Curl for a description of the settings and options possible on libcurl.
AUTHORS
Julian Knocke
Othello Maurer
COPYRIGHT AND LICENSE
This software is copyright (c) 2018 by XING AG.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.