Changes for version 2.08 - 2006-05-03

  • Implemented new rasterizer for grid mapping. Thanks to Roland Schar for a tortuous example of span issues.
  • Regular extraction and TREE mode now use the same rasterizer.
  • Fixed HTML stripping to resolve a header-matching bug on single-word text in keep_html mode (thanks to Michael S. Muegel for pointing out the bug).

Modules

Perl module for extracting the content contained in tables within an HTML document, either as text or encoded element trees.

Provides

in lib/HTML/TableExtract.pm