NAME
Catmandu::Importer::PDFPages - Catmandu importer that extracts text data per page from a single PDF file
SYNOPSIS
# From the command line
# Export PDF pages with their text and coordinates
$ catmandu convert PDFPages --file input.pdf to YAML
# In a script
use Catmandu::Sane;
use Catmandu::Importer::PDFPages;
my $importer = Catmandu::Importer::PDFPages->new( file => "/tmp/input.pdf" );
$importer->each(sub {
    my $page = $_[0];
    #..
});
EXAMPLE OUTPUT IN YAML
INSTALL
To install this package, the following system packages must be installed:
- CentOS
* perl-devel
* make
* gcc
* gcc-c++
* libyaml-devel
* libyaml
* poppler-glib ( >= 0.16 )
* poppler-glib-devel ( >= 0.16 )
CentOS 6 only ships poppler-glib 0.12, so you need at least CentOS 7. Alternatively, you can compile a newer version of the package yourself.
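The version requirement above can be checked before building. The following is a minimal sketch, assuming pkg-config and GNU sort (for its -V option) are available. The helper name version_ge is hypothetical and not part of this package, and the commented yum line simply combines the package names listed above.

```shell
#!/bin/sh
# Install the packages listed above first (requires root), for example:
#   sudo yum install -y perl-devel make gcc gcc-c++ libyaml-devel libyaml \
#       poppler-glib poppler-glib-devel

# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="0.16"

# pkg-config only finds poppler-glib when the development files are installed.
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists poppler-glib; then
    installed="$(pkg-config --modversion poppler-glib)"
    if version_ge "$installed" "$required"; then
        echo "poppler-glib $installed is recent enough"
    else
        echo "poppler-glib $installed is older than $required"
    fi
else
    echo "poppler-glib development files not found"
fi
```

On a system where the check reports an older version, upgrading the distribution or compiling a newer poppler-glib are the two options mentioned above.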
- Ubuntu
* libpoppler-glib8
* libpoppler-glib-dev
* gobject-introspection
* libgirepository1.0-dev
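The Ubuntu packages listed above can be installed in one step. This is a minimal sketch, assuming apt-get and root privileges via sudo. The DRY_RUN variable is a hypothetical convenience introduced here, not something this package provides; by default the script only prints the command.

```shell
#!/bin/sh
# Packages taken from the Ubuntu list above.
packages="libpoppler-glib8 libpoppler-glib-dev gobject-introspection libgirepository1.0-dev"

# Hypothetical switch: DRY_RUN=1 (the default here) only prints the command.
DRY_RUN="${DRY_RUN:-1}"

install_cmd="sudo apt-get install -y $packages"

if [ "$DRY_RUN" = "1" ]; then
    echo "$install_cmd"
else
    # Word splitting of $install_cmd is intentional here.
    $install_cmd
fi
```

Run it with DRY_RUN=0 to perform the installation instead of printing the command.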
AUTHORS
Nicolas Franck <nicolas.franck at ugent.be>