NAME

Net::Google::SafeBrowsing2 - Perl extension for Google Safe Browsing v2

SYNOPSIS

  use Net::Google::SafeBrowsing2;
  use Net::Google::SafeBrowsing2::Sqlite;
  
  my $storage = Net::Google::SafeBrowsing2::Sqlite->new(file => 'google-v2.db');
  my $gsb = Net::Google::SafeBrowsing2->new(
    key     => "my key",
    storage => $storage,
  );
  
  $gsb->update();
  my $match = $gsb->lookup(url => 'http://www.gumblar.cn/');
  
  if ($match eq MALWARE) {
    print "http://www.gumblar.cn/ is flagged as a dangerous site\n";
  }

DESCRIPTION

Net::Google::SafeBrowsing2 implements (most of) the Google Safe Browsing v2 API.

This version does not handle Message Authentication Code (MAC).

The library passes most of the unit tests listed in the API documentation. See the documentation (http://code.google.com/apis/safebrowsing/developers_guide_v2.html) for more details about the failed tests.

The Google Safe Browsing database must be stored and managed locally. I wrote Net::Google::SafeBrowsing2::Sqlite to use SQLite as the storage back-end. Other storage mechanisms (databases, memory, etc.) can be added and used transparently with this module.

CONSTANTS

Several constants are exported by this module:

INTERNAL_ERROR

An internal error occurred.

SERVER_ERROR

The server sent an error back to the client.

NO_UPDATE

No update was performed, probably because it is too early to make a new request to Google Safe Browsing.

NO_DATA

No data was sent back by Google to the client, probably because the database is up to date.

SUCCESSFUL

The operation was successful.

MALWARE

Name of the Malware list in Google Safe Browsing (shortcut to 'goog-malware-shavar')

PHISHING

Name of the Phishing list in Google Safe Browsing (shortcut to 'googpub-phish-shavar')

CONSTRUCTOR

new()

Create a Net::Google::SafeBrowsing2 object

  my $gsb = Net::Google::SafeBrowsing2->new(
    key     => "my key",
    storage => Net::Google::SafeBrowsing2::Sqlite->new(file => 'google-v2.db'),
    debug   => 0,
    list    => MALWARE,
  );

Arguments

key

Required. Your Google Safe Browsing API key.

storage

Required. Object which handles the storage for the Google Safe Browsing database. See Net::Google::SafeBrowsing2::Storage for more details.

list

Optional. The Google Safe Browsing list to handle. By default, handles both MALWARE and PHISHING.

debug

Optional. Set to 1 to enable debugging. Defaults to 0 (disabled).

The debug output may be quite large and can significantly slow down the update and lookup functions.

version

Optional. Google Safe Browsing version. Defaults to 2.2.

PUBLIC FUNCTIONS

update()

Perform a database update.

  $gsb->update();

Returns the status of the update (see the list of constants above): INTERNAL_ERROR, SERVER_ERROR, NO_UPDATE, NO_DATA or SUCCESSFUL.

This function can handle two lists at the same time. If one of the lists should not be updated, it is skipped automatically and the other one is updated. It is faster to update two lists at once than to update them one by one.

Arguments

list

Optional. Update a specific list. Use the list(s) from new() by default.

force

Optional. Force the update (1). Disabled by default (0).

Be careful when setting this option to 1: overly frequent updates might result in the blacklisting of your API key.

lookup()

Look up a URL against the Google Safe Browsing database.

  my $match = $gsb->lookup(url => 'http://www.gumblar.cn');

Returns the name of the list if there is a match, or an empty string otherwise.

Arguments

list

Optional. Look up against a specific list. Use the list(s) from new() by default.

url

Required. URL to look up.

get_lists()

Returns the names of all the Google Safe Browsing lists.

  my @lists = $gsb->get_lists();

NOTE: this function is useless in practice because Google includes some lists which cannot be used by the Google Safe Browsing API, like lists used by the Google toolbar.

PRIVATE FUNCTIONS

These functions are not intended to be used externally.

lookup_suffix()

Look up a host prefix.

update_error()

Handle server errors during a database update.

lookup_whitelist()

Look up a host prefix and suffix in the whitelist (s chunks).

ua()

Create the LWP::UserAgent object used to make HTTP requests to Google.

parse_s()

Parse s chunks information for a database update.

parse_a()

Parse a chunks information for a database update.

hex_to_ascii()

Transform hexadecimal strings to printable ASCII strings. Used mainly for debugging.

  print $gsb->hex_to_ascii('hex value');

ascii_to_hex()

Transform ASCII strings to hexadecimal strings.
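To illustrate this kind of conversion, here is a minimal standalone sketch built on Perl's pack and unpack. It assumes that "hexadecimal strings" means raw bytes and that the printable form is their hex-digit rendering. The names bytes_to_hex and hex_to_bytes are hypothetical and are not the module's own functions, whose output format may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (assumption, not the module's code):
# render raw bytes as lowercase hex digits.
sub bytes_to_hex {
    my ($bytes) = @_;
    return unpack('H*', $bytes);
}

# Hypothetical helper: turn a string of hex digits back into raw bytes.
sub hex_to_bytes {
    my ($hex) = @_;
    return pack('H*', $hex);
}

my $printable = bytes_to_hex("\x0a\xff\x10");
print "$printable\n";    # prints 0aff10
```

A printable rendering like this is mostly useful in debug output, where raw hash bytes would otherwise corrupt the terminal.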

debug()

Print debug output.

canonical_domain_suffixes()

Find all suffixes for a domain.

canonical_domain()

Find all canonical domains for a domain.
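As a rough illustration, the sketch below follows the host rule from the Google Safe Browsing developer guide: check the exact hostname, plus up to four hostnames formed from the last five components by successively removing the leading one, and check IP addresses only as-is. The function name host_candidates is hypothetical, and the module's own implementation may differ in its details.

```perl
use strict;
use warnings;

# Hypothetical helper (assumption, not the module's code):
# list the host strings to check for a hostname.
sub host_candidates {
    my ($host) = @_;

    # Loose IPv4 check: IP addresses are only checked as-is.
    return ($host) if $host =~ /^\d+\.\d+\.\d+\.\d+$/;

    my @parts = split /\./, $host;
    my @hosts = ($host);

    # Start from the last five components and drop the leading one each
    # time, stopping before the bare top-level domain.
    my $start = @parts > 5 ? @parts - 5 : 1;
    for my $i ($start .. $#parts - 1) {
        push @hosts, join('.', @parts[$i .. $#parts]);
    }
    return @hosts;
}

print join("\n", host_candidates('a.b.c.d.e.f.g')), "\n";
```

For a.b.c.d.e.f.g this yields the exact host followed by c.d.e.f.g, d.e.f.g, e.f.g and f.g.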

canonical_path()

Find all canonical paths for a URL.
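The sketch below illustrates the path rule from the Google Safe Browsing developer guide: the exact path with its query string, the exact path without it, and up to four directory prefixes built from the root, each with a trailing slash. The function name path_candidates and its two-argument interface are hypothetical, and this is not the module's own code.

```perl
use strict;
use warnings;

# Hypothetical helper (assumption, not the module's code):
# list the path strings to check for a path and optional query string.
sub path_candidates {
    my ($path, $query) = @_;    # e.g. ('/1/2.html', 'param=1')

    my @paths;
    push @paths, "$path?$query" if defined $query && length $query;
    push @paths, $path;

    # Keep directory components only: drop a trailing file name.
    my @components = grep { length } split m{/}, $path;
    pop @components unless $path =~ m{/$};

    # Build at most four prefixes, starting at the root.
    my $prefix   = '/';
    my @prefixes = ($prefix);
    for my $component (@components) {
        last if @prefixes >= 4;
        $prefix .= "$component/";
        push @prefixes, $prefix;
    }

    # Skip prefixes that duplicate the exact path.
    my %seen = map { $_ => 1 } @paths;
    push @paths, grep { !$seen{$_}++ } @prefixes;
    return @paths;
}

print join("\n", path_candidates('/1/2.html', 'param=1')), "\n";
```

For /1/2.html with the query param=1, the candidates are /1/2.html?param=1, /1/2.html, / and /1/.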

canonical()

Find all canonical URLs for a URL.

canonical_uri()

Create a canonical URI.

NOTE: URI cannot handle all the test cases provided by Google. This method is a hack to pass most of the tests. A few tests are still failing. The proper way to handle URL canonicalization according to Google would be to create a new module to handle URLs. However, I believe most real-life cases are handled correctly by this function.

full_hashes()

Return all possible full hashes for a URL.

prefix()

Return a hash prefix. The size of the prefix is set to 4 bytes.
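As an illustration, a 4-byte prefix can be taken from the start of a SHA-256 digest, the hash function Google Safe Browsing uses. The sketch below relies on the core Digest::SHA module. The function name hash_prefix is hypothetical and is not the module's own method.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256);

# Hypothetical helper (assumption, not the module's code):
# first 4 bytes of the binary SHA-256 digest of an expression.
sub hash_prefix {
    my ($expression) = @_;
    return substr(sha256($expression), 0, 4);
}

my $prefix = hash_prefix('example.com/');

# Print the prefix as hex digits, since the raw bytes are not printable.
printf "%s (%d bytes)\n", unpack('H*', $prefix), length $prefix;
```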

request_full_hash()

Request full hashes for specific prefixes from Google.

parse_full_hashes()

Process the request for full hashes from Google.

get_a_range()

Get the list of a chunks ranges for a list update.

get_s_range()

Get the list of s chunks ranges for a list update.

create_range()

Create a list of ranges (1-3, 5, 7-11) from a list of numbers.
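The idea can be sketched as follows: sort the numbers, then merge consecutive runs into start-end pairs. The function name numbers_to_ranges and the comma-separated output format are assumptions made for illustration, not the module's own code.

```perl
use strict;
use warnings;

# Hypothetical helper (assumption, not the module's code):
# collapse a list of numbers into a compact range string.
sub numbers_to_ranges {
    my @numbers = sort { $a <=> $b } @_;
    my @ranges;
    while (@numbers) {
        my $start = shift @numbers;
        my $end   = $start;

        # Extend the current range while the next number is consecutive
        # (or a duplicate of the current end).
        while (@numbers && $numbers[0] <= $end + 1) {
            $end = shift @numbers;
        }
        push @ranges, $start == $end ? "$start" : "$start-$end";
    }
    return join(',', @ranges);
}

print numbers_to_ranges(1, 2, 3, 5, 7, 8, 9, 10, 11), "\n";    # prints 1-3,5,7-11
```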

expand_range()

Expand a list of ranges (1-3, 5, 7-11) into a list of numbers (1,2,3,5,7,8,9,10,11).
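A minimal sketch of the reverse operation, assuming the ranges arrive as a comma-separated string. The function name ranges_to_numbers is hypothetical and is not the module's own code.

```perl
use strict;
use warnings;

# Hypothetical helper (assumption, not the module's code):
# expand a range string into the individual numbers it covers.
sub ranges_to_numbers {
    my ($ranges) = @_;
    my @numbers;
    for my $part (split /\s*,\s*/, $ranges) {
        if ($part =~ /^(\d+)-(\d+)$/) {
            push @numbers, $1 .. $2;    # a start-end range
        }
        elsif ($part =~ /^\d+$/) {
            push @numbers, $part;       # a single number
        }
    }
    return @numbers;
}

print join(',', ranges_to_numbers('1-3, 5, 7-11')), "\n";    # prints 1,2,3,5,7,8,9,10,11
```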

SEE ALSO

See Net::Google::SafeBrowsing for the implementation of Google Safe Browsing v1.

See Net::Google::SafeBrowsing2::Storage and Net::Google::SafeBrowsing2::Sqlite for information on storing and managing the Google Safe Browsing database.

Google Safe Browsing v2 API: http://code.google.com/apis/safebrowsing/developers_guide_v2.html

AUTHOR

Julien Sobrier, <jsobrier@zscaler.com> or <julien@sobrier.net>

COPYRIGHT AND LICENSE

Copyright (C) 2010 by Julien Sobrier

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.
