NAME

proxyhunter - free proxy searcher and checker

DESCRIPTION

proxyhunter can be used to find free proxies. It searches using the available adapters for search engines (such as Google). All found proxies may be checked for availability, proxy type and speed. proxyhunter stores found proxies in a database, so you can easily run any SELECT statement to find what you need.

QUICK START

First, install one of the available database schemas. The easiest to deploy is SQLite (though it is not very fast). See CPAN for other alternatives.

$ cpan App::ProxyHunter::Model::Schema::SQLite

Then generate a configuration file and edit it if needed

$ proxyhunter --create-config proxyhunter.jconf

Now create the database schema

$ proxyhunter --config proxyhunter.jconf --create-schema

And finally you can start the process

$ proxyhunter --config proxyhunter.jconf

AVAILABLE OPTIONS

--create-config /path/to/config

Generates a default configuration file at the given path

--create-schema

Creates the table structures inside the database. This should be used together with the --config option

--config /path/to/config

Which config file to use, either for creating the schema or for a normal run

--daemon /path/to/pidfile

Run proxyhunter as a daemon. /path/to/pidfile may be omitted; if it is given, the PID of the daemon will be written to this file. Proc::Daemon must be installed separately for this option to work.

CONFIGURATION

Use the --create-config option to create a default configuration file. The configuration file format is JCONF (see Parse::JCONF).

db

Database settings. Available options are listed below

driver

DBI driver to use. You should have a schema installed for this driver, which is implemented as an App::ProxyHunter::Model::Schema subclass. See CPAN for available implementations. You can implement a schema for your preferred database yourself, see App::ProxyHunter::Model::Schema.

driver_cfg

DBD driver specific options, like mysql_auto_reconnect and so on

host

Host of the database, if needed

name

Database name

login

Login for database, if needed

password

Password for database, if needed

search

Proxy searcher settings. The searcher uses the specified search engines to find proxies. You can implement a new search engine, see App::ProxyHunter::SearchEngine. See CPAN for available implementations. Available options are listed below

enabled

true to enable proxy searching, false to disable

querylist

A list of queries which will be passed to search engine (like "free proxy", "proxy list", ...)

engines

Which search engines to use. A search engine implementation should be in the App::ProxyHunter::SearchEngine namespace.

check

Checker settings. The checker checks each proxy for availability and type. Checking is performed right after a proxy is inserted into the database. If a proxy does not pass the check (e.g. a bad proxy), it is deleted. Available options are listed below

enabled

true to enable proxy checking, false to disable

strict

true to enable strict checking. The default is false, except for HTTP proxy checking, which is always strict regardless of this option. For a description of strict checking see "STRICT CHECKING" in Net::Proxy::Type.

types

If present, proxies are checked only for the specified types. For available proxy types see "PACKAGE CONSTANTS AND VARIABLES" in Net::Proxy::Type. Example:

types = ["SOCKS5_PROXY", "CONNECT_PROXY"]

http_url

Use specified HTTP url to check http proxies. Optional.

http_keyword

Which keyword we should find in response for http_url to mark proxy as good. Optional.

https_url

Use specified HTTPS url to check https proxies. Optional.

https_keyword

Which keyword we should find in response for https_url to mark proxy as good. Optional.

workers

How many workers should perform checking in parallel. A larger value gives a higher check speed at the cost of higher resource usage

speed_check

Whether the checker should perform a speed check right after a successful availability check. If true, the speed check is performed even if speed checking is disabled in the speed_check section

recheck

Rechecker settings. The rechecker checks availability and type only for proxies that have passed at least one check by the checker and have not yet been deleted from the database (the fails_before_delete limit is not exceeded). Available options are listed below

enabled

Enable or disable rechecking

strict

true to enable strict checking. The default is false, except for HTTP proxy checking, which is always strict regardless of this option. For a description of strict checking see "STRICT CHECKING" in Net::Proxy::Type.

workers

How many workers to use for rechecking

interval

How often a proxy may be rechecked, in seconds. For example, 180 means no more often than once every 180 seconds
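As a rough illustration of how such an interval can be interpreted, the following Python sketch decides whether a proxy is due for another check. It is not proxyhunter's actual code; the function name is_due_for_recheck and the use of plain epoch seconds are assumptions made for the example.

```python
def is_due_for_recheck(last_check: float, now: float, interval: float) -> bool:
    """Return True when at least `interval` seconds have passed since the last check.

    Hypothetical helper: timestamps are plain seconds since the epoch.
    """
    return now - last_check >= interval


# With interval = 180, a proxy checked at t=1000 is eligible again at t=1180.
print(is_due_for_recheck(1000, 1100, 180))  # False
print(is_due_for_recheck(1000, 1180, 180))  # True
```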

speed_check

Perform speed check right after recheck

fails_before_delete

How many checks in a row may fail before the proxy is deleted from the database. For example, 3 means that the proxy will be deleted on the fourth consecutive failed check
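The counting rule can be sketched in Python as follows. This is only a model of the behaviour described here, not the real implementation; apply_check_result is a hypothetical name, and the reset on success follows the description of the fails column in DATABASE STRUCTURE.

```python
def apply_check_result(fails: int, passed: bool, fails_before_delete: int):
    """Return (new_fails, delete) after one check of a proxy.

    A successful check resets the consecutive-failure counter. The proxy is
    marked for deletion once the counter exceeds fails_before_delete.
    """
    if passed:
        return 0, False
    fails += 1
    return fails, fails > fails_before_delete


fails = 0
history = []
# Two failures, one success (counter resets), then four failures in a row.
for passed in [False, False, True, False, False, False, False]:
    fails, delete = apply_check_result(fails, passed, fails_before_delete=3)
    history.append((fails, delete))

print(history[-1])  # (4, True): deleted on the fourth consecutive failure
```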

speed_check

Speed checker settings. Each proxy that passed the initial check by the checker may also be checked for speed. Available options are listed below

enabled

Enable or disable speed checking

workers

How many workers to use for speed checking

interval

How often to recheck proxy speed

http_url

HTTP URL of a file which will be downloaded to check proxy speed. For measurement accuracy the file should be at least 1 MB. The whole file will not be downloaded, only the part needed for an accurate measurement, so the file may be much bigger. The HTTP URL is used for proxies that can be used with the HTTP protocol

https_url

HTTPS URL of a file which will be downloaded to check proxy speed. For measurement accuracy the file should be at least 1 MB. The whole file will not be downloaded, only the part needed for an accurate measurement, so the file may be much bigger. The HTTPS URL is used for proxies that can be used with the HTTPS protocol
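Since the speed is stored in bytes per second, a partial download is enough to compute it. The Python sketch below shows only the arithmetic; how proxyhunter actually times the download is not described here, and bytes_per_second is a hypothetical helper.

```python
def bytes_per_second(bytes_received: int, elapsed_seconds: float) -> float:
    """Average transfer rate for the part of the file that was downloaded."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return bytes_received / elapsed_seconds


# 512 KiB of a larger file received in 2 seconds.
print(bytes_per_second(512 * 1024, 2.0))  # 262144.0
```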

DATABASE STRUCTURE

The database has one table called proxy. This table has the columns listed below

id

Primary id

host

Proxy host as an IPv4 address

port

Proxy port

checked

Whether the proxy has been checked at least once since it was inserted into the database

success_total

How many times the proxy check passed successfully

fails_total

How many times proxy check failed

insertdate

Date and time when this proxy was inserted in the database

checkdate

Date and time when last check was performed

speed_checkdate

Date and time when last speed check was performed

fails

How many checks in a row have failed. Each successful check resets this counter

type

Type of the proxy. One of HTTPS_PROXY, HTTP_PROXY, CONNECT_PROXY, SOCKS4_PROXY, SOCKS5_PROXY, DEAD_PROXY. A proxy that has not been checked yet has the DEAD_PROXY type. See Net::Proxy::Type for a description of the proxy types.

in_progress

Whether the proxy is currently in the queue for a check

conn_time

Connection time to this proxy host

speed

Proxy speed in bytes per second
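Because everything lives in a single table, useful selections are short. The following Python sketch builds a simplified in-memory stand-in for the proxy table and picks working SOCKS5 proxies ordered by speed. The column types, the sample rows, and storing type as text are assumptions for the example; the real schema is defined by the installed App::ProxyHunter::Model::Schema subclass and may differ.

```python
import sqlite3

# Assumption: a reduced copy of the proxy table with guessed column types.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE proxy (
        id INTEGER PRIMARY KEY,
        host TEXT, port INTEGER, checked INTEGER,
        fails INTEGER, type TEXT, speed INTEGER
    )
    """
)
# Made-up sample rows using documentation-only IP addresses.
conn.executemany(
    "INSERT INTO proxy (host, port, checked, fails, type, speed) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("192.0.2.10", 1080, 1, 0, "SOCKS5_PROXY", 250000),
        ("192.0.2.11", 8080, 1, 0, "HTTP_PROXY", 900000),
        ("192.0.2.12", 1080, 1, 2, "SOCKS5_PROXY", 400000),
        ("192.0.2.13", 3128, 0, 0, "DEAD_PROXY", 0),
    ],
)

# Checked SOCKS5 proxies with no current failures, fastest first.
rows = conn.execute(
    """
    SELECT host, port, speed FROM proxy
    WHERE checked = 1 AND fails = 0 AND type = 'SOCKS5_PROXY'
    ORDER BY speed DESC
    """
).fetchall()
print(rows)  # [('192.0.2.10', 1080, 250000)]
```

The same SELECT can be run with any client for the configured database, adjusting the type comparison to however the installed schema stores it.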

AUTHOR

Oleg G, <oleg@cpan.org>

COPYRIGHT AND LICENSE

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself