NAME
Net::Google::SafeBrowsing3 - Perl extension for the Google Safe Browsing v3 API. (Google Safe Browsing v2 has been deprecated by Google.)
SYNOPSIS
use Net::Google::SafeBrowsing3;
use Net::Google::SafeBrowsing3::Sqlite;
my $storage = Net::Google::SafeBrowsing3::Sqlite->new(file => 'google-v3.db');
my $gsb = Net::Google::SafeBrowsing3->new(
key => "my key",
storage => $storage,
);
$gsb->update();
my $match = $gsb->lookup(url => 'http://www.gumblar.cn/');
if ($match eq MALWARE) {
print "http://www.gumblar.cn/ is flagged as a dangerous site\n";
}
$storage->close();
DESCRIPTION
Net::Google::SafeBrowsing3 implements the Google Safe Browsing v3 API.
The library passes most of the unit tests listed in the API documentation. See the documentation (https://developers.google.com/safe-browsing/developers_guide_v3) for more details about the failed tests.
The Google Safe Browsing database must be stored and managed locally. Net::Google::SafeBrowsing3::Sqlite uses SQLite as the storage back-end; Net::Google::SafeBrowsing3::MySQL uses MySQL. Other storage mechanisms (databases, memory, etc.) can be added and used transparently with this module.
The source code is available on GitHub at https://github.com/juliensobrier/Net-Google-SafeBrowsing3.
If you do not need to inspect more than 10,000 URLs a day, you can use Net::Google::SafeBrowsing2::Lookup with the Google Safe Browsing v2 Lookup API, which does not require storing and maintaining a local database.
IMPORTANT: If you start with an empty database, you will need to perform several updates to retrieve all the Google Safe Browsing information. This may require up to 24 hours. This is a limitation of the Google API, not of this module.
IMPORTANT: Google Safe Browsing v3 requires a different key than v2.
CONSTANTS
Several constants are exported by this module:
- DATABASE_RESET
-
Google requested that the local database be reset (emptied).
- INTERNAL_ERROR
-
An internal error occurred.
- SERVER_ERROR
-
The server sent an error back to the client.
- NO_UPDATE
-
No update was performed, probably because it is too early to make a new request to Google Safe Browsing.
- NO_DATA
-
No data was sent back by Google to the client, probably because the database is up to date.
- SUCCESSFUL
-
The operation was successful.
- MALWARE
-
Name of the Malware list in Google Safe Browsing (shortcut to 'goog-malware-shavar')
- PHISHING
-
Name of the Phishing list in Google Safe Browsing (shortcut to 'googpub-phish-shavar')
- LANDING
-
Landing site.
- DISTRIBUTION
-
Distribution site.
CONSTRUCTOR
new()
Create a Net::Google::SafeBrowsing3 object
my $gsb = Net::Google::SafeBrowsing3->new(
key => "my key",
storage => Net::Google::SafeBrowsing3::Sqlite->new(file => 'google-v3.db'),
debug => 0,
list => MALWARE,
);
Arguments
- server
-
Safe Browsing Server. https://safebrowsing.google.com/safebrowsing/ by default
- key
-
Required. Your Google Safe Browsing API key.
- storage
-
Required. Object which handles the storage for the Google Safe Browsing database. See Net::Google::SafeBrowsing3::Storage for more details.
- list
-
Optional. The Google Safe Browsing list to handle. By default, handles both MALWARE and PHISHING.
- debug
-
Optional. Set to 1 to enable debugging. 0 (disabled) by default.
The debug output may be quite large and can significantly slow down the update and lookup functions.
- errors
-
Optional. Set to 1 to print errors to STDOUT. 0 (disabled) by default.
- perf
-
Optional. Set to 1 to show performance information.
- version
-
Optional. Google Safe Browsing version. 3.0 by default
PUBLIC FUNCTIONS
update()
Perform a database update.
$gsb->update();
Return the status of the update (see the list of constants above): INTERNAL_ERROR, SERVER_ERROR, NO_UPDATE, NO_DATA or SUCCESSFUL
This function can handle two lists at the same time. If one of the lists should not be updated, it is skipped automatically and the other one is updated. It is faster to update two lists at once than to update them one by one.
Arguments
- list
-
Optional. Update a specific list. Use the list(s) from new() by default.
- force
-
Optional. Force the update (1). Disabled by default (0).
Be careful when setting this option to 1: overly frequent updates might result in the blacklisting of your API key.
lookup()
Look up a URL against the Google Safe Browsing database.
my $match = $gsb->lookup(url => 'http://www.gumblar.cn');
my ($match, $type) = $gsb->lookup(url => 'http://www.gumblar.cn');
In scalar context, returns the name of the list if there is a match and an empty string otherwise. In list context, returns the name of the list (empty if there is no match) and the type of malware site (0 if no type is specified).
Arguments
- list
-
Optional. Lookup against a specific list. Use the list(s) from new() by default.
- url
-
Required. URL to look up.
PRIVATE FUNCTIONS
These functions are not intended to be used externally.
lookup_suffix()
Look up a host prefix.
local_lookup_suffix()
Look up a host prefix in the local database only.
update_error()
Handle server errors during a database update.
ua()
Create an LWP::UserAgent object to make HTTP requests to Google.
parse_data()
Parse data from a redirection (add and sub chunk information).
parse_s()
Parse s chunks information for a database update.
parse_a()
Parse a chunks information for a database update.
hex_to_ascii()
Transform hexadecimal strings to printable ASCII strings. Used mainly for debugging.
print $gsb->hex_to_ascii('hex value');
ascii_to_hex()
Transform ASCII strings to hexadecimal strings.
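The two conversions above can be illustrated with Perl's built-in pack and unpack. This is only a sketch, under the assumption that "hexadecimal strings" here means raw binary data (such as hash prefixes) and that the printable form is a string of hex digits. The helper names bytes_to_hex and hex_to_bytes are hypothetical and are not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helpers, not the module's own implementation.

# Raw bytes -> printable string of hex digits (two digits per byte).
sub bytes_to_hex {
    my ($bytes) = @_;
    return unpack 'H*', $bytes;
}

# Printable string of hex digits -> raw bytes.
sub hex_to_bytes {
    my ($hex) = @_;
    return pack 'H*', $hex;
}

my $raw = "\x00\xff\x10";
print bytes_to_hex($raw), "\n";
```

A printable form like this is mostly useful in debug output, where raw hash bytes would otherwise garble the terminal.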
debug()
Print debug output.
error()
Print error message.
perf()
Print performance message.
canonical_domain()
Find all canonical domains for a domain.
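As an illustration of what such a list of domains can look like, the sketch below follows the host-suffix rule described in Google's developer guide: the exact hostname, plus up to four hostnames formed from the last five components by successively removing the leading component. It is not the module's implementation; host_suffixes is a hypothetical helper, and the IPv4 check is a simplification.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Net::Google::SafeBrowsing3.
sub host_suffixes {
    my ($host) = @_;

    # Assumption: a dotted-quad IPv4 address is checked as-is only.
    return ($host) if $host =~ /^\d{1,3}(?:\.\d{1,3}){3}$/;

    my @parts      = split /\./, $host;
    my @candidates = ($host);

    # Keep at most the last five components.
    splice(@parts, 0, @parts - 5) if @parts > 5;

    # Drop the leading component until only the top-level domain is left.
    while (@parts >= 2) {
        my $candidate = join '.', @parts;
        push @candidates, $candidate unless $candidate eq $host;
        shift @parts;
    }
    return @candidates;
}

print join(', ', host_suffixes('a.b.c.d.e.f.g')), "\n";
```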
canonical_path()
Find all canonical paths for a URL.
canonical()
Find all canonical URLs for a URL.
canonical_uri()
Create a canonical URI.
NOTE: URI cannot handle all the test cases provided by Google. This method is a hack to pass most of the tests. A few tests are still failing. The proper way to handle URL canonicalization according to Google would be to create a new module to handle URLs. However, I believe most real-life cases are handled correctly by this function.
full_hashes()
Return all possible full hashes for a URL.
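In the Safe Browsing protocol, a full hash is the SHA-256 digest of a canonical host/path expression, and the local database stores only a short prefix of it (typically the first 4 bytes). The sketch below shows that relationship for a single expression using the core Digest::SHA module. The helper name hash_and_prefix and the fixed 4-byte prefix length are assumptions for illustration, not the module's own code.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256);

# Hypothetical helper, not part of Net::Google::SafeBrowsing3.
sub hash_and_prefix {
    my ($expression) = @_;
    my $full   = sha256($expression);   # 32 raw bytes
    my $prefix = substr($full, 0, 4);   # assumed 4-byte prefix
    return ($full, $prefix);
}

my ($full, $prefix) = hash_and_prefix('www.gumblar.cn/');
printf "full hash: %s\n", unpack('H*', $full);
printf "prefix:    %s\n", unpack('H*', $prefix);
```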
request_full_hash()
Request full hashes for specific prefixes from Google.
parse_full_hashes()
Process the request for full hashes from Google.
create_range()
Create a list of ranges (1-3, 5, 7-11) from a list of numbers.
expand_range()
Explode list of ranges (1-3, 5, 7-11) into a list of numbers (1,2,3,5,7,8,9,10,11).
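The range notation handled by these two functions can be sketched in plain Perl as follows. This is an illustration only: make_ranges and explode_ranges are hypothetical names, and the module's own functions may differ in argument format and in how they treat duplicates or malformed input.

```perl
use strict;
use warnings;

# Hypothetical helpers, not the module's own implementation.

# Collapse a list of integers into a string such as "1-3,5,7-11".
sub make_ranges {
    my @numbers = sort { $a <=> $b } @_;
    my @ranges;
    while (@numbers) {
        my $start = shift @numbers;
        my $end   = $start;
        # Extend the range while the next number is a duplicate or consecutive.
        while (@numbers && $numbers[0] <= $end + 1) {
            $end = shift @numbers;
        }
        push @ranges, $start == $end ? $start : "$start-$end";
    }
    return join ',', @ranges;
}

# Expand a string such as "1-3,5,7-11" back into a list of integers.
sub explode_ranges {
    my ($ranges) = @_;
    my @numbers;
    for my $item (split /\s*,\s*/, $ranges) {
        if ($item =~ /^(\d+)-(\d+)$/) {
            push @numbers, $1 .. $2;
        }
        elsif ($item =~ /^\d+$/) {
            push @numbers, $item;
        }
    }
    return @numbers;
}

print make_ranges(1, 2, 3, 5, 7, 8, 9, 10, 11), "\n";
print join(',', explode_ranges('1-3,5,7-11')), "\n";
```

Compact ranges keep update requests short, since chunk numbers tend to be mostly consecutive.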