Revision history for Perl module WWW::Patent::Page

0.01 Tue Mar  1 21:54:47 2005
	- original version; created by ExtUtils::ModuleMaker 0.32

Got feedback from Usenet:

comp.lang.perl.modules

perl.module-authors

Patent::Retrieve Request for Comments

wanda_b_a...@yahoo.com

I have written a new module and propose to submit it to CPAN. Your
comments would be appreciated.

Patent::Retrieve is alpha software- my first module, and my intent is 
to see if the perl community has any interest in the idea. 

The module provides a consistent way to obtain patent documents from
various patent offices that make them available on the web. Typically,
doing this is relatively easy by hand, but involves screen-scraping if
you want to do it effectively for many pages or documents. The offices
typically make it hard to get the whole document, presumably because
that is one source of revenue.

The module uses submodules, specific to patent offices, and comes with 
working examples for the USPTO and EPO, which between them supply 
granted patents in html and tiff (USPTO) and pdf (US, EP, and much of 
the world...). 

For casual users, this module should simplify life. Abusive users will
likely find their IP address banned by the patent office being
spidered.

I propose a new name space, "Patent", because I see no related modules
in another name space; I am happy to take suggestions. I think it is
reasonable to have a "Patent" namespace, since patents involve a lot of
text-wrangling that is single purpose. For example, searches of the
prior art, patent family relationships, patent applications via XML,
etc. With a namespace, related modules may be grouped easily.

Here is the documentation as it now stands: 

Patent::Retrieve 

NAME
    Patent::Retrieve - retrieve a patent page (from United States Patent and
    Trademark Office (uspto) website or the European Patent Office
    (espace_ep). )

SYNOPSIS
    Please see the test suite for working examples. The following is not
    guaranteed to be working or up-to-date.

      use Patent::Retrieve;

      my $patent_document = Patent::Retrieve->new(); # new object

      my $document1 = $patent_document->provide_doc('6,123,456');
            # defaults:     office  => 'uspto',
            #               country => 'US',
            #               format  => 'htm',
            #               page    => '1',      # typically htm IS "1" page
            #               modules => qw/ us ep / ,

      my $document2 = $patent_document->provide_doc('US_6_123_456',
                            office  => 'espace_ep' ,
                            format  => 'tif',
                            page    => 2 ,
                            );

      my $pages_known = $patent_document->pages_available(  # e.g. TIFF
                            document=> '6 123 456',
                            );

DESCRIPTION
      Intent: Use public sources to retrieve patent documents such as
      TIFF images of patent pages, html of patents, pdf, etc.
      Expandable for your office of interest by writing new submodules.
      Alpha release by newbie to find if there is any interest

USAGE
      See also SYNOPSIS above

      To install the module...

    perl Makefile.PL

    make

    make test

    make install

    If you are on a windows box you could try to use 'nmake' rather than
    'make'.

    Examples of use:

      $patent_document = Patent::Retrieve->new(
                            doc_id  => 'US6,654,321(B2)issued_2_Okada',
                            office  => 'espace_ep' ,
                            format  => 'tif',
                            page    => 2 ,
                            agent   => 'Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030516 Mozilla Firebird/0.6',
                            );

    # 'Windows IE 6' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
    # 'Windows Mozilla' => 'Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030516 Mozilla Firebird/0.6',
    # 'Mac Safari' => 'Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/85 (KHTML, like Gecko) Safari/85',
    # 'Mac Mozilla' => 'Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4a) Gecko/20030401',
    # 'Linux Mozilla' => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624',
    # 'Linux Konqueror' => 'Mozilla/5.0 (compatible; Konqueror/3; Linux)',

      my %attributes = $patent_document->get_patent('all');  # hash of all

      my $document_id = $patent_document->get_patent('doc_id');
            # US6,654,321(B2)issued_2_Okada

      my $office_used = $patent_document->get_patent('office'); # ep

      my $country_used = $patent_document->get_patent('country'); #US

      my $doc_id_used = $patent_document->get_patent('doc_id');  # 6654321

      my $page_used = $patent_document->get_patent('page');  # 2

      my $kind_used = $patent_document->get_patent('kind');  # B2

      my $comment_used = $patent_document->get_patent('comment');  # issued_2_Okada

      my $format_used = $patent_document->get_patent('format'); #tif

      my $pages_total = $patent_document->get_patent('pages_available');   # 101

      my $terms_and_conditions = $patent_document->terms('us'); # and conditions

      my $document = $patent_document->get_patent('document'); # the loot
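
The comments in the examples above suggest that a free-form identifier such
as 'US6,654,321(B2)issued_2_Okada' is split into a country, a number, a kind
code, and a comment. The following is only a sketch of how such a split could
be done with one regular expression. The helper name parse_doc_id and the
pattern are hypothetical and are not taken from the module; the 'US' fallback
mirrors the documented default country.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): split an identifier such as
# 'US6,654,321(B2)issued_2_Okada' into country, number, kind and comment.
sub parse_doc_id {
    my ($id) = @_;
    my ($country, $number, $kind, $comment) = $id =~ m{
        \A \s*
        ([A-Za-z]{2})?        # optional two-letter country code
        [\s_]*
        ([\d,\s_]*\d)         # digits, possibly separated by comma, space or underscore
        \s*
        (?: \( (\w+) \) )?    # optional kind code in parentheses
        (.*?)                 # trailing free-form comment
        \s* \z
    }x or return;
    $number =~ tr/0-9//cd;    # keep the digits only
    return {
        country => defined $country ? uc $country : 'US',  # assumed default
        doc_id  => $number,
        kind    => $kind,
        comment => $comment,
    };
}

my $parsed = parse_doc_id('US6,654,321(B2)issued_2_Okada');
printf "%s %s %s %s\n", @{$parsed}{qw(country doc_id kind comment)};
```

The same pattern also accepts the other spellings used in the SYNOPSIS, such
as '6,123,456', 'US_6_123_456' and '6 123 456', because separators inside the
number are simply discarded.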

BUGS
    Pre-alpha release, to gauge whether the perl community has any interest.

    Code contributions, suggestions, and critiques are welcome.

    Error handling is undeveloped.

    By definition, a non-trivial program contains bugs.

    For United States Patents (US) via the USPTO (us), the 'kind' is ignored
    in method provide_doc

SUPPORT
    Yes, please. Checks are best. Or email me at Wanda_B_A...@yahoo.com to
    arrange fund transfers.

AUTHOR
            Wanda B. Anon
            Wanda_B_A...@yahoo.com

COPYRIGHT
    This program is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

    The full text of the license can be found in the LICENSE file included
    with this module.

ACKNOWLEDGEMENTS
    Andy Lester for WWW::Mechanize, that got me thinking, even if cygwin was
    trouble,

    The authors of Finance::Quote, which served as an example of providing
    submodules,

    Erik Oliver for patentmailer, serving as an example of getting patent
    documents,

    Howard P. Katseff of AT&T Laboratories for wsp.pl, version 2, a proxy
    that speaks LWP and understands proxies,

    and of course Larry and Randal and the gang.

SEE ALSO
    perl(1).

  _countries_known()
     Usage     : internal method only
     Purpose   : list all entities that could give a patent
     Returns   : ref to a hash with keys of abbreviations and values of
                 entities (usually a country) ...

-----------------------------------------------------------------------------

John Bokma
  Feb 12, 8:15 pm

wrote:
> I propose a new name space, "Patent", because I see no related modules 
> in another name space; 

The webscraping modules? 

-- 
John                   Small Perl scripts: http://johnbokma.com/perl/
        Perl programmer available:     http://castleamber.com/
      Happy Customers: http://castleamber.com/testimonials.html

-----------------------------------------------------------------------------
anon-wb
  Feb 13, 7:19 am

> wrote:
>> I propose a new name space, "Patent", because I see no related modules
>> in another name space;

> The webscraping modules? 

Which namespace do you propose? 

WWW::Patent::Retrieve might be reasonable. But the information source
need not be the web, it could also be a file server of cached pages or
documents or your own drive. Often the question is whether the patent
document is already on my drive, or if I have to go out on the web.

Also, WWW seems to take the opposite hierarchy: e.g. WWW::Search::Ebay
implies we should name it WWW::Retrieve::Patent. That would
necessitate reorganizing so that uspto.pm, espace_ep.pm, etc. are in
folder Patent rather than Retrieve, which seems backward in logic.
Also, it would put a patent searching module into WWW::Search::Patent,
which is a long way from WWW::Retrieve::Patent.

But it could work. The new namespace would not be top level.

Any other suggestions? 

-----------------------------------------------------------------------------

John Bokma
  Feb 13, 2:13 pm

anon-wb wrote:
>> wrote:
>>> I propose a new name space, "Patent", because I see no related modules
>>> in another name space;

>> The webscraping modules?

> Which namespace do you propose?

> WWW::Patent::Retrieve might be reasonable. But the information source
> need not be the web, it could also be a file server of cached pages or
> documents or your own drive.

I can imagine that there is no problem at all for the namespace if the 
scraping module does smart caching. 

> Often the question is whether the patent 
> document is already on my drive, or if I have to go out on the web. 

That's just a cache. I can even imagine that other WWW:: modules use a 
caching mechanism, or otherwise can profit from one. 

I see there is a WWW::Mechanize::Cached or maybe Cache::Cached is better 
for your module. 

> Also, WWW seems to take the opposite hierarchy: e.g. WWW::Search::Ebay
> implies we should name it WWW::Retrieve::Patent.

WWW::Search::Patent?

> That would
> necessitate reorganizing so that uspto.pm, espace_ep.pm, etc. are in
> folder Patent rather than Retrieve, which seems backward in logic.
> Also, it would put a patent searching module into WWW::Search::Patent,
> which is a long way from WWW::Retrieve::Patent.

The search modules also retrieve results. There is no point in searching
and not getting results :-D.

> But it could work. The new namespace would not be top level.

To me more logical. 

-----------------------------------------------------------------------------

anon-wb
  Feb 15, 10:16 am

Regarding the naming of a new module to retrieve pages of patent
documents:

The module does not cache. The point about having documents on one's own
drive (not WWW) was that the WWW is not the only source of the
documents; you might scan them yourself, so a file:// URL could
be a source. But that is more the exception than the rule, so I see no
obvious reason to rule out the WWW hierarchy.

I am leaning toward 

WWW::Patent::Page 

since this module will retrieve "pages", given a document identifier, 
(without parsing the page) such as html, tiff, pdf, and leads to 

WWW::Patent::Page::uspto.pm
WWW::Patent::Page::espace_ep.pm
etc. for the page sources.

This hierarchy makes sense in light of future possible WWW
interactions:

WWW::Patent::Information (given a patent document ID, retrieve
associated information such as inventors, assignees, earliest filing
date, etc., family, possibly by screen-scraping some html or going to a
database)

WWW::Patent::Search (input topics, receive document identifiers or
related information)

WWW::Patent::Submit (input a patent application, receive
acknowledgement of filing office)

WWW::Patent::Submit::XML (use an XML interface, e.g. at the USPTO)

We are somewhat distracted by focussing on screen scraping. The
scraping only happens here to find out where the document resides, then
the document is retrieved. The scraping results are mostly internal
and not returned to the user, except gems like how many pages are
available for the complete document.

"Patents" work has three main information needs- "searching" for patent
documents of interest, relating those documents to similar documents,
e.g. in different countries or by the same inventor (a family, an
inventor), "retrieving" documents of interest or associated information
(cited documents, inventors, assignees), and "getting" (submitting an
application and being granted) a new patent.

Comments welcome- any objections to WWW::Patent::Page ? 

-----------------------------------------------------------------------------

John Bokma
  Feb 16, 10:45 am
anon-wb wrote:
> I am leaning toward

> WWW::Patent::Page

> since this module will retrieve "pages", given a document identifier,
> (without parsing the page) such as html, tiff, pdf, and leads to

How about WWW::Patent::Document ?

> WWW::Patent::Page::uspto.pm
> WWW::Patent::Page::espace_ep.pm
> etc. for the page sources.

IIRC lower case module names are reserved for pragmas. 

-----------------------------------------------------------------------------

anon-wb
  Feb 16, 1:14 pm

> How about WWW::Patent::Document ?

The module is lower level or more primitive than "Document"; mainly it
retrieves a page at a time, as the offices typically allow. I leave it
to the user to decide how to, if desired, stitch the pages together
into a document. So, someone might take WWW::Patent::Page and use it
for making WWW::Patent::Document. Thus, Page seems more accurate than
Document as the finest level of naming.

> IIRC lower case module names are reserved for pragmas. 

WWW::Patent::Page::Uspto.pm ? 
WWW::Patent::Page::USPTO.pm ? 

Is there a preferred way of naming modules that are worthless without 
their parent? 

John Bokma
  Feb 16, 2:20 pm

anon-wb wrote:
>> How about WWW::Patent::Document ?

> The module is lower level or more primitive than "Document"; mainly it 
> retrieves a page at a time, as the offices typically allow. 

Ah, ok, didn't know that. Yeah, in that case Page is more appropriate. 

> WWW::Patent::Page::Uspto.pm ? 
> WWW::Patent::Page::USPTO.pm ? 

The latter 

> Is there a preferred way of naming modules that are worthless without 
> their parent? 

I would use first upper case, and if it's an acronym I would use all 
uppercase especially if that's common: 

http://www.answers.com/uspto 

-- 

Peter Scott
  Feb 16, 1:25 pm

In article <Xns95FF81D68D10Bcastleam...@130.133.1.4>,
 John Bokma <postmas...@castleamber.com> writes:

>anon-wb wrote:
>> WWW::Patent::Page::uspto.pm
>> WWW::Patent::Page::espace_ep.pm
>> etc. for the page sources.

>IIRC lower case module names are reserved for pragmas.

Only when they're single words. They're okay on the end of modules that
begin with capital letters. See, for example, LWP::Protocol::{http,ftp,...}.

-- 
Peter Scott 
http://www.perlmedic.com/ 
http://www.perldebugged.com/ 

John Bokma
  Feb 16, 2:18 pm

Peter Scott wrote:
> In article <Xns95FF81D68D10Bcastleam...@130.133.1.4>,
>  John Bokma <postmas...@castleamber.com> writes:
>>anon-wb wrote:
>>> WWW::Patent::Page::uspto.pm
>>> WWW::Patent::Page::espace_ep.pm
>>> etc. for the page sources.

>>IIRC lower case module names are reserved for pragmas.

> Only when they're single words. They're okay on the end of modules
> that begin with capital letters. See, for example,
> LWP::Protocol::{http,ftp,...}.

Yeah, actually I saw those two days ago :-D. But personally I would stick
to HTTP, FTP, etc.

-- 

Naming Proposal: WWW::Patent::Page (continued from earlier at comp.lang.perl.modules)

Wanda Anon
  Feb 21, 2:30 pm

I have written a new module, WWW::Patent::Page, and
propose to submit it to CPAN. Your comments would be
appreciated.

Does the name seem reasonable? I am happy to take
suggestions. I think it is reasonable to have a
"Patent" namespace in WWW, since much patent
information is available on the WWW. For example,
searches of the prior art, patent family
relationships, patent applications via XML, etc. With
a namespace, related modules may be grouped easily.
One can imagine future modules like
"WWW::Patent::Apply", "WWW::Patent::Family", or
"WWW::Patent::Search" for interacting with various web
services.

WWW::Patent::Page is alpha software- my first module,
and my intent is to see if the perl community has any
interest in the idea. It is rough around the edges,
but passes what tests it has.

The module provides a consistent way to obtain pages
of patent documents from various patent offices that
make them available on the WWW. Typically, doing this
is relatively easy by hand, page by page, but takes a
bit of work if you want to automate it effectively
for many pages or documents. The offices typically
make it hard to get the whole document, presumably
because supplying that is one source of revenue.

From this primitive module, users can stitch together 
tiff or PDF into multipage documents by whatever 
method they prefer. 

The module uses submodules, specific to separate 
patent offices, and comes with working examples for 
the USPTO and EPO, which between them supply granted 
patents in html and tiff (USPTO) and pdf (US, EP, and 
much of the world...). Hopefully, other interested 
users will create new or improved submodules and feed 
them back into the distribution. 
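
One way to picture the per-office submodule arrangement is a dispatch table
keyed by the 'office' parameter. The following is only a sketch under stated
assumptions: the package names follow the naming discussed in this thread,
but the page_url method, its arguments, the page_url_for helper, and the
"example" locations are hypothetical placeholders, not the distribution's
actual interface.

```perl
use strict;
use warnings;

# Hypothetical per-office submodules. A real one would locate and fetch the
# page; these only build a placeholder string so the dispatch can be seen.
package WWW::Patent::Page::USPTO;
sub page_url {
    my ($class, %arg) = @_;
    return "uspto.example/$arg{doc_id}/$arg{format}/$arg{page}";   # placeholder
}

package WWW::Patent::Page::ESPACE_EP;
sub page_url {
    my ($class, %arg) = @_;
    return "espace.example/$arg{doc_id}/$arg{format}/$arg{page}";  # placeholder
}

package main;

# Map the 'office' parameter to the package that handles it.
my %office_module = (
    USPTO     => 'WWW::Patent::Page::USPTO',
    ESPACE_EP => 'WWW::Patent::Page::ESPACE_EP',
);

# Hypothetical front end: pick the submodule for the requested office
# (USPTO when none is given) and delegate to it.
sub page_url_for {
    my (%arg) = @_;
    my $office = $arg{office} || 'USPTO';
    my $module = $office_module{$office}
        or die "No submodule for office '$office'\n";
    return $module->page_url(%arg);
}

print page_url_for(doc_id => '6123456', format => 'tif', page => 2,
                   office => 'ESPACE_EP'), "\n";
```

With this shape, supporting another office would mean adding one package and
one entry in the table, which is the kind of contribution hoped for above.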

For casual users, this module should simplify life. 
Abusive users will likely find their IP address banned 
by the patent office being spidered. 

Here is the documentation as it now stands: 

NAME
    WWW::Patent::Page - retrieve a patent page (e.g. from United States
    Patent and Trademark Office (USPTO) website or the European Patent
    Office (ESPACE_EP). )

SYNOPSIS
    Please see the test suite for working examples. The following is not
    guaranteed to be working or up-to-date.

      use WWW::Patent::Page;

      my $patent_document = WWW::Patent::Page->new(); # new object

      my $document1 = $patent_document->provide_doc('6,123,456');
            # defaults:     office  => 'USPTO',
            #               country => 'US',
            #               format  => 'htm',
            #               page    => '1',      # typically htm IS "1" page
            #               modules => qw/ us ep / ,

      my $document2 = $patent_document->provide_doc('US_6_123_456',
                            office  => 'ESPACE_EP' ,
                            format  => 'tif',
                            page    => 2 ,
                            );

      my $pages_known = $patent_document->pages_available(  # e.g. TIFF
                            document=> '6 123 456',
                            );

DESCRIPTION
      Intent: Use public sources to retrieve patent documents such as
      TIFF images of patent pages, html of patents, pdf, etc.
      Expandable for your office of interest by writing new submodules.
      Alpha release by newbie to find if there is any interest

USAGE
      See also SYNOPSIS above

      Standard process for building & installing modules:

              perl Build.PL
              ./Build
              ./Build test
              ./Build install

    Examples of use:

      $patent_document = WWW::Patent::Page->new(
                            doc_id  => 'US6,654,321(B2)issued_2_Okada',
                            office  => 'ESPACE_EP' ,
                            format  => 'tif',
                            page    => 2 ,
                            agent   => 'Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030516 Mozilla Firebird/0.6',
                            );

    # 'Windows IE 6' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',

    # 'Windows Mozilla' => 'Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030516 Mozilla Firebird/0.6',

    # 'Mac Safari' => 'Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/85 (KHTML, like Gecko) Safari/85',

    # 'Mac Mozilla' => 'Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4a) Gecko/20030401',

    # 'Linux Mozilla' => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624',

    # 'Linux Konqueror' => 'Mozilla/5.0 (compatible; Konqueror/3; Linux)',

      my %attributes = $patent_document->get_patent('all');  # hash of all

      my $document_id = $patent_document->get_patent('doc_id');
            # US6,654,321(B2)issued_2_Okada

      my $office_used = $patent_document->get_patent('office'); # ep

      my $country_used = $patent_document->get_patent('country'); #US

      my $doc_id_used = $patent_document->get_patent('doc_id');  # 6654321

      my $page_used = $patent_document->get_patent('page');  # 2

      my $kind_used = $patent_document->get_patent('kind');  # B2

      my $comment_used = $patent_document->get_patent('comment');  # issued_2_Okada

      my $format_used = $patent_document->get_patent('format'); #tif

      my $pages_total = $patent_document->get_patent('pages_available');   # 101

      my $terms_and_conditions = $patent_document->terms('us'); # and conditions

      my $document = $patent_document->get_patent('document'); # the loot

BUGS
    Pre-alpha release, to gauge whether the perl community has any interest.

    Code contributions, suggestions, and critiques are welcome.

    Error handling is undeveloped.

    By definition, a non-trivial program contains bugs.

    For United States Patents (US) via the USPTO (us), the 'kind' is ignored
    in method provide_doc

SUPPORT
    Yes, please. Checks are best. Or email me at Wanda_B_A...@yahoo.com to
    arrange fund transfers.

AUTHOR
            Wanda B. Anon
            Wanda_B_A...@yahoo.com

COPYRIGHT
    This program is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

    The full text of the license can be found in the LICENSE file included
    with this module.

ACKNOWLEDGEMENTS
    Andy Lester for WWW::Mechanize, that got me thinking,

    The authors of Finance::Quote, which served as an example of providing
    submodules,

    Erik Oliver for patentmailer, serving as an example of getting patent
    documents,

    Howard P. Katseff of AT&T Laboratories for wsp.pl, version 2, a proxy
    that speaks LWP and understands proxies,

    and of course Larry and Randal and the gang.

SEE ALSO
    perl(1).

  Subroutine _countries_known()
     Usage     : internal method only
     Purpose   : list all entities that could give a patent
     Returns   : ref to a hash with keys of abbreviations and values of
                 entities (usually a country) ...


Wanda Anon
  Feb 21, 3:43 pm

Comments interspersed. 

--- Andy Lester <a...@petdance.com> wrote: 

> On Mon, Feb 21, 2005 at 02:30:01PM -0800, Wanda Anon
> (wanda_b_a...@yahoo.com) wrote:
> > I have written a new module, WWW::Patent::Page, and
> > propose to submit it to CPAN. Your comments would be
> > appreciated.

> First, please start it out with Module::Starter to get the basic
> framework in place.

It is already written, using modulemaker. TIMTOWTDI:
what benefits will I reap with Module::Starter, after
I started? I am willing to try with a reason. But I
would rather release early and often.

> >     # 'Windows IE 6' => 'Mozilla/4.0 (compatible; MSIE
> > 6.0; Windows NT

> Why are you cutting & pasting from Mech? Why not use Mech directly? Or
> is this just examples of which agents are available?

It may be cut and pasted, I cannot recall. I
intended it as examples of agents, though the default
agent is the module name and version number. I likely
stole it from Mech, hope this does not offend. I can
take it out or do it differently or acknowledge better.

Rather than use Mech directly, I decided to use a
lower level than Mech, so that installation of Mech is
not required. Mech is great, but last I checked, did
not pass all tests on cygwin because of the proxy
strategy in testing. I figured being lower level
would encourage more use, by not having cygwin users
need to think about installing Mech without passing
all tests.

> > ACKNOWLEDGEMENTS
> >     Andy Lester for WWW::Mechanize, that got me
> > thinking,

> "Got me thinking." I can't ask much more than that.

> If you're using Mech in this module, then I'll add it to the list of
> modules that use Mech.

As noted above, this does not use Mech. 

> xoa 

> -- 
> Andy Lester => a...@petdance.com => www.petdance.com 
> => AIM:petdance 

Thanks for your comments. I take it the name seems
ok.


Andrew Savige
  Feb 21, 5:31 pm

--- Wanda Anon wrote: 
> Mech is great, but last I checked, did not pass all tests on 
> cygwin because of the proxy strategy in testing. 

I'm not familiar with the cygwin test failures, but this test 
failure: 

http://www.nntp.perl.org/group/perl.cpan.testers/185037 

appears to be the same hang I experienced in t/local/back on 
Windows XP. A simple patch to t/local/back.t that fixes this 
hang is given at: 

http://rt.cpan.org/NoAuth/Bug.html?id=9026 

but has not yet been applied to WWW-Mechanize. 

/-\ 


Andy Lester
  Feb 21, 8:06 pm

> appears to be the same hang I experienced in t/local/back on 
> Windows XP. A simple patch to t/local/back.t that fixes this 
> hang is given at: 

> http://rt.cpan.org/NoAuth/Bug.html?id=9026 

> but has not yet been applied to WWW-Mechanize. 

Version 1.11_02 just got uploaded to CPAN. Should handle the "can't
delete the tmp file".

xoa 

-- 
Andy Lester => a...@petdance.com => www.petdance.com => AIM:petdance 


Andy Lester
  Feb 21, 3:10 pm

On Mon, Feb 21, 2005 at 02:30:01PM -0800, Wanda Anon (wanda_b_a...@yahoo.com) wrote:
> I have written a new module, WWW::Patent::Page, and
> propose to submit it to CPAN. Your comments would be
> appreciated.

First, please start it out with Module::Starter to get the basic
framework in place.

>     # 'Windows IE 6' => 'Mozilla/4.0 (compatible; MSIE
> 6.0; Windows NT

Why are you cutting & pasting from Mech? Why not use Mech directly? Or
is this just examples of which agents are available?