NAME

WWW::Patent::Page - get patent documents from WWW sources (e.g. complete US applications and grants from the USPTO) and place them into a WWW::Patent::Page::Response object. Not currently available: JP->Eng translations in HTML from the JPO, and ESPACE_EP (not provided due to captcha use).

VERSION

This document describes WWW::Patent::Page version 0.100.0 of February, 2007.

SYNOPSIS

Please see the test suite in t/ for working examples. The following is not guaranteed to work or to be up to date.

THE ONLY OFFICE CURRENTLY WORKING IS THE USPTO.

$ perl -I. -MWWW::Patent::Page -e 'print $WWW::Patent::Page::VERSION,"\n"'
0.02

$ perl get_patent.pl US6123456 > US6123456.pdf &

$ perl -wT get_JPO_patent_translation_to_english.pl "JPH09-123456A" > JPH09-123456A.zip & 

(see examples/JPH09-123456A.zip for an HTML-formatted, machine-translated Japanese patent document.)

(command line interfaces are included in examples/ )

http://www.yourdomain.com/www_get_patent_pdf.pl
http://www.yourdomain.com/www_get_JPO_patent_translation_to_english.pl

(web fetchers are included in examples/ )

Typical usage in perl code:

  use WWW::Patent::Page;

  print $WWW::Patent::Page::VERSION,"\n";

  my $patent_browser = WWW::Patent::Page->new(); # new object

  my $document1 = $patent_browser->get_page('6,123,456');
      # defaults:
      #     country => 'US',
      #     format  => 'pdf',
      #     page    => undef,
      # and the usual defaults of LWP::UserAgent (subclassed)

  my $document2 = $patent_browser->get_page('US6123456',
      format => 'pdf',
      page   => 2,    # get only the second page
  );

  my $pages_known = $document2->get_parameter('pages');  #how many total pages known?

DESCRIPTION

Intent: use public sources to retrieve patent documents, such as
TIFF images of patent pages, HTML of patents, PDF, etc.
The module can be extended to your office of interest by writing new submodules.
This is an alpha release by a newcomer, made to find out whether there is any interest.

USAGE

See also SYNOPSIS above

   Standard process for building & installing modules:

        perl Build.PL
        ./Build
        ./Build test verbose=1
        ./Build install

        or

        perl Makefile.PL
        make
        make test TEST_VERBOSE=1
        make install

        or on ActiveState or otherwise using nmake
        
        perl Makefile.PL
        nmake
        nmake test TEST_VERBOSE=1
        nmake install

Examples of use:

  my $patent_browser = WWW::Patent::Page->new(
      doc_id => 'US6,654,321',
      format => 'pdf',
      page   => undef,    # returns all pages in one pdf
      agent  => 'Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030516 Mozilla Firebird/0.6',
  );

  my $patent_response = $patent_browser->get_patent('US6,654,321(B2)issued_2_Okada');

INTERFACE

Object oriented, and modelled on LWP.

SUBROUTINES/METHODS

new

Creates a new instance of the Page class, which subclasses LWP::UserAgent.

login

Logs in to a server to use its services, obtaining a token, session id, or the like.

country_known

country_known maps the known two-letter codes to patenting entities, usually countries. It returns undef if the two-letter code is not recognized.
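
As a minimal sketch of that lookup contract (not the module's own table), the snippet below uses a hypothetical four-entry hash and a hypothetical function named sketch_country_known. The entity names are illustrative assumptions only.

```perl
use strict;
use warnings;

# Hypothetical, abbreviated table; the real module loads a much larger hash.
my %sketch_entities = (
    US => 'United States of America',
    EP => 'European Patent Office',
    WO => 'World Intellectual Property Organization',
    JP => 'Japan',
);

# Return the entity for a two-letter code, or undef if it is not recognized.
sub sketch_country_known {
    my ($code) = @_;
    return undef unless defined $code;
    return $sketch_entities{$code};
}

for my $code (qw(US EP XX)) {
    my $entity = sketch_country_known($code);
    print "$code: ", (defined $entity ? $entity : 'not recognized'), "\n";
}
```

Returning undef rather than dying lets the caller decide whether an unknown code is an error or a reason to fall back to a default.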

parse_doc_id

Takes a human readable patent/publication identifier and parses it into country/entity, kind, number, doc_type, ...

     CC[TY]##,###,###(K#)_Comments

     US_6,123,456_A1_-comments

     CC : Two-letter country/entity code, e.g. US, EP, WO
     TY : Type of document; one or two letters only, from the choices below.
          In the US, Utility is the default type and no type letters are used, e.g. US6123456
          D : Design, e.g. USD339,456
          PP: Plant, e.g. USPP8,901
          RE: Reissue, e.g. USRE35,312
          T : Defensive Publication, e.g. UST109,201
          H : Statutory Invention Registration (SIR), e.g. USH1,523
     ##,###,### : Document number (e.g. patent number or application number; only digits and optional separators, no letters)
     K# : The kind or version code, e.g. A1, B2, etc., placed in parentheses; at least one letter and at most one digit. Not always used in document fetching.
     Comments : Retained but not used; a single string of word characters, \w = A-Za-z0-9_ (no spaces, "-", commas, etc.)

      Separators (comma, space, dash, underscore) may occur between entries, and at least one MUST occur before a comment (due to difficulty of parsing the kind code which might be one letter).
      Separators (the comma is handy) may occur within the number
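
The format above can be approximated with a single regular expression. The following is a rough sketch under stated assumptions, not the module's actual parser: the function name sketch_parse_doc_id, the returned hash keys, and the choice to accept the kind code with or without parentheses are all hypothetical, and only upper-case input is handled.

```perl
use strict;
use warnings;

# Separator characters allowed between entries and within the number.
my $SEP = qr/[,\s_-]/;

# Hypothetical helper: a simplified reading of the format described above.
sub sketch_parse_doc_id {
    my ($id) = @_;
    return unless defined $id && $id =~ m{
        \A
        ([A-Z]{2})                  # CC, the country or entity code
        $SEP*
        (PP|RE|D|T|H)?              # TY, optional type letters
        $SEP*
        (\d+ (?: $SEP \d+ )*)       # number, single separators allowed inside
        $SEP*
        (?: \( ([A-Z]\d?) \) | ([A-Z]\d?) )?   # kind, parentheses optional here
        (?: $SEP+ (\w+) )?          # comment, only after a separator
        \z
    }x;

    my ($cc, $type, $number, $comment) = ($1, $2, $3, $6);
    my $kind = defined $4 ? $4 : $5;
    (my $digits = $number) =~ tr/0-9//cd;    # keep digits only

    return {
        country => $cc,
        type    => $type    // '',
        number  => $digits,
        kind    => $kind    // '',
        comment => $comment // '',
    };
}

for my $id ('US6,123,456', 'USRE35,312', 'US_6,123,456_A1_-comments') {
    my $p = sketch_parse_doc_id($id) or next;
    print join('|', @{$p}{qw(country type number kind comment)}), "\n";
}
```

The sketch returns nothing for input it cannot match, so a caller can test the result before using it.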

As of version 0.1, the parsed result used at the office of choice is placed in $self->patent->doc_id_standardized

A convenience value of $self->patent->doc_id_commified is provided.

In recognizing the values such as CC country, the priority is:

$self->patent->doc_id as supplied; if absent:
$self->patent->country; if absent:
$WWW::Patent::Page::default_country
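
That priority order amounts to a chain of fallbacks. A minimal sketch, assuming "absent" means undef and using a hypothetical function sketch_resolve_country with a local stand-in for the package default:

```perl
use strict;
use warnings;

# Hypothetical stand-in for $WWW::Patent::Page::default_country.
our $sketch_default_country = 'US';

# $from_doc_id : code parsed out of the doc_id (undef if it carried none)
# $from_field  : value of the patent's country field (undef if unset)
sub sketch_resolve_country {
    my ($from_doc_id, $from_field) = @_;
    # The defined-or operator takes the first value that is not undef.
    return $from_doc_id // $from_field // $sketch_default_country;
}

print sketch_resolve_country('EP',  'JP'),  "\n";   # doc_id wins
print sketch_resolve_country(undef, 'JP'),  "\n";   # country field is used
print sketch_resolve_country(undef, undef), "\n";   # package default is used
```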

get_page

Method that uses the modules specific to offices such as the USPTO, which provide methods for each document/page format, together with LWP::UserAgent to fetch the appropriate URLs and, if necessary, build the response content or produce error values.

request

Method to override the LWP::UserAgent::request that gets a URL. This calls LWP::UserAgent::request itself, but around it adds things like a retry (and possibly debugging, like throwing pages to a browser for display).
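
The retry idea can be sketched independently of LWP. In the example below, a code reference stands in for the wrapped request call and a plain hash with an is_success key stands in for the response; the function name sketch_with_retry and its tries and delay options are hypothetical, not part of the module. In a real LWP::UserAgent subclass, the attempt would typically be a call such as $self->SUPER::request(@args).

```perl
use strict;
use warnings;

# Hypothetical generic retry helper; option names and defaults are assumptions.
sub sketch_with_retry {
    my ($attempt, %opt) = @_;
    my $tries = $opt{tries} // 3;
    my $delay = $opt{delay} // 0;    # seconds to wait between attempts

    my $result;
    for my $n (1 .. $tries) {
        $result = $attempt->($n);
        return $result if $result->{is_success};
        sleep $delay if $delay && $n < $tries;
    }
    return $result;                  # the last, failed result
}

# Simulated fetch that fails twice and then succeeds.
my $calls = 0;
my $response = sketch_with_retry(
    sub { $calls++; return +{ is_success => ($calls >= 3 ? 1 : 0) } },
    tries => 5,
);
print "succeeded after $calls attempts\n" if $response->{is_success};
```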

terms

method to provide a summary or pointers to the terms and conditions of use of the publicly available databases

_load_modules

internal private method to access helper modules in WWW::Patent::Page

_agent

private method to assign default agent

_load_country_known

private method to load a big hash and allow it to be folded during code development.

DIAGNOSTICS

The accepted tactic is to set $self->{'is_success'} or $self->{'patent'}->{'is_success'} to false and add a message to $self->{'message'} or $self->{'patent'}->{'message'}
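
A small sketch of that convention on a plain hash-based object follows. The helper name sketch_record_failure and the example messages are hypothetical, and joining messages with "; " is an assumption about how a message might be added.

```perl
use strict;
use warnings;

# Hypothetical helper: mark the object as failed and append a message.
sub sketch_record_failure {
    my ($obj, $text) = @_;
    $obj->{is_success} = 0;
    my $previous = $obj->{message};
    $obj->{message} = (defined $previous && length $previous)
        ? "$previous; $text"
        : $text;
    return $obj;
}

my $patent = { is_success => 1, message => '' };
sketch_record_failure($patent, 'country not recognized');
sketch_record_failure($patent, 'no pages retrieved');

# Callers check the flag first and read the message only on failure.
print "failed: $patent->{message}\n" unless $patent->{is_success};
```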

CONFIGURATION AND ENVIRONMENT

WWW::Patent::Page requires no configuration files or environment variables of its own.

It does, however, make use of LWP environment variables such as HTTP_PROXY.

DEPENDENCIES

LWP::UserAgent, HTTP::Response

INCOMPATIBILITIES

None reported.

BUGS AND LIMITATIONS

Code contributions, suggestions, and critiques are welcome.

Error handling is undeveloped.

By definition, a non-trivial program contains bugs.

For United States patents (US) retrieved via the USPTO, the 'kind' is ignored in the method provide_doc.

AUTHOR

Wanda B. Anon
Wanda.B.Anon@gmail.com

LICENSE AND COPYRIGHT

Copyright (c) 2008, Wanda B. Anon wanda.b.anon@GMAIL.com . All rights reserved.

This program is free software; you can redistribute it and/or modify it under the Artistic License version 2.0 or above ( http://www.perlfoundation.org/artistic_license_2_0 ) .

ACKNOWLEDGEMENTS

Hermann Schier, Lokkju, Andy Lester, the authors of Finance::Quote, Erik Oliver for patentmailer, Howard P. Katseff of AT&T Laboratories for wsp.pl, version 2, a proxy that speaks LWP and understands proxies, and of course Larry and Randal and the gang.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.