The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.

NAME

HTML::Any - a common interface for HTTP clients (LWP, AnyEvent::HTTP, Curl)

SYNOPSIS

use HTTP::Any::...
use ...

sub do_http {
	...
	HTTP::Any::...
}

my $opt = { ... };

my $cb = sub {
	my ($is_success, $body, $headers, $redirects) = @_;
	...
}

do_http($url, $opt, $cb);

MOTIVATION

LWP, AnyEvent::HTTP, Curl - each of them has its advantages, disadvantages and peculiarities. The HTML::Any modules were created during the process of investigation of the strong and weak sides of those above-mentioned HTML clients. They allow quick switching between them to use the best one for each definite case.

DESCRIPTION

IMPORT

I recommend placing using HTTP::Any in a separate module which should be used from any point of your project.

Why would not make a simple one-line connection? Because of better flexibility and an option to replace the modules used. For example, using LWP::RobotUA instead for LWP::UserAgent.

LWP

use LWP;
use HTTP::Any::LWP;
sub do_http {
	my $ua = LWP::UserAgent->new;
	HTTP::Any::LWP::do_http($ua, @_);
}

AnyEvent

use EV;
use AnyEvent::HTTP;
use HTTP::Any::AnyEvent;
sub do_http {
	HTTP::Any::AnyEvent::do_http(\&http_request, @_);
}

Curl

use Net::Curl::Easy;
use HTTP::Any::Curl;
sub do_http {
	my ($url, $opt, $cb) = @_;
	my $easy = Net::Curl::Easy->new();
	HTTP::Any::Curl::do_http(undef, $easy, $url, $opt, $cb);
}

Curl with Multi

use Net::Curl::Easy;
use Net::Curl::Multi;
use Net::Curl::Multi::EV;
use HTTP::Any::Curl;
my $multi = Net::Curl::Multi->new();
my $curl_ev = Net::Curl::Multi::EV::curl_ev($multi);
sub do_http {
	my ($url, $opt, $cb) = @_;
	my $easy = Net::Curl::Easy->new();
	HTTP::Any::Curl::do_http($curl_ev, $easy, $url, $opt, $cb);
}

CALL

my $opt = { ... };

my $cb = sub {
	my ($is_success, $body, $headers, $redirects) = @_;
	...
}

do_http($url, $opt, $cb);

where:

url

URL as string

opt

options and headers

cb

callback function to get result

options

referer

Referer url

agent

User agent name

timeout

Timeout, seconds

gzip

This option adds 'Accept-Encoding' header with gzip value to the HTTP query and tells that the response must be decoded. If you don't want to decode the response, please add 'Accept-Encoding' header into the 'headers' parameter.

headers

Ref on HASH of HTTP headers:

{
  'Accept' => '*/*',
   ...
}

It enables cookies support. The "" values enables the session cookies support without saving them. Any other value is transferred as is: ref to a hash (LWP, AnyEvent::HTTP), the file's name (Curl).

persistent

1 or 0. Try to create/reuse a persistent connection. When not specified, see the default behavior of Curl (reverse of CURLOPT_FORBID_REUSE) and AnyEvent::HTTP (persistent)

proxy

http and https proxy

proxy => "$host:$port"
max_size

The size limit for response content, bytes.

Note: when you use the accept_encoding and max_size options will be triggered, the current mode is the following: HTTP::Any::Curl - will return the result partially, HTTP::Any::LWP - will return "", HTTP::Any::AnyEvent - will return "".

However, this state can be changed in future.

When max_size options will be triggered, 'client-aborted' header will added with 'max_size' value.

body

Data for POST method.

method

When method parameter = "POST", the POST request is used with body parameter on data and 'Content-Type' header is added with 'application/x-www-form-urlencoded' value.

finish callback function

my $cb = sub {
	my ($is_success, $body, $headers, $redirects) = @_;
	...
};

where:

is_success

It is true, when HTTP code is 2XX.

body

HTML body

headers

Ref on HASH of HTTP headers (lowercase) and others info: Status, Reason, URL

redirects

Previous headers from last to first

on_header callback function

When specified, this callback will be called after getting all headers.

$opt{on_header} = sub {
	my ($is_success, $headers, $redirects) = @_;
	...
};

on_body callback function

When specified, this callback will be called on each chunk.

$opt{on_body} = sub {
	my ($body) = @_; # body chunk
	...
};

NOTES

Turn off the persistent options to download pages of many sites.

Use libcurl with "Asynchronous DNS resolution via c-ares".

AUTHOR

Nick Kostyria <kni@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2013 by Nick Kostyria

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.

SEE ALSO

Net::Curl AnyEvent::HTTP LWP

Net::Curl::Multi::EV