NAME
EBook::Tools::EReader - Palm::PDB handler for manipulating the Fictionwise/PeanutPress eReader format.
SYNOPSIS
use EBook::Tools::EReader;
my $pdb = EBook::Tools::EReader->new();
$pdb->Load('myfile-er.pdb');
print "Loaded '",$pdb->{title},"' by ",$pdb->{author},"\n";
my $html = $pdb->html;
my $pml = $pdb->pml;
$pdb->write_unknown_records;
DEPENDENCIES
Compress::Zlib
Image::Size
P5-Palm
CONSTRUCTOR
new()
Instantiates a new EBook::Tools::EReader object.
ACCESSOR METHODS
filebase
In scalar context, this is the basename of the object attribute filename. In list context, it actually returns the basename, directory, and extension as per fileparse from File::Basename.
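The context-dependent return value described above can be sketched as follows. This is only an illustration, not the module's own code: filebase_sketch is a hypothetical stand-in for the accessor, and the suffix pattern passed to fileparse is an assumption.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical stand-in for the filebase accessor (not the module's code).
sub filebase_sketch {
    my ($filename) = @_;
    # Assumption: everything from the final dot onward is the extension.
    my ($base, $dir, $ext) = fileparse($filename, qr/\.[^.]*/);
    # List context gets all three parts; scalar context gets only the basename.
    return wantarray ? ($base, $dir, $ext) : $base;
}

my $base  = filebase_sketch('books/myfile-er.pdb');   # scalar context
my @parts = filebase_sketch('books/myfile-er.pdb');   # list context

# String concatenation imposes scalar context, so this builds a default
# output name in the way the write_* methods are described as doing.
my $default_pml = filebase_sketch('books/myfile-er.pdb') . '.pml';

print "$base\n";
print join('|', @parts), "\n";
print "$default_pml\n";
```

Because the concatenation operator forces scalar context, appending an extension to the accessor's return value yields a plain filename rather than a three-element list.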
footnotes()
Returns a hash containing all of the footnotes found in the file, where the keys are the footnote ids and the values contain the footnote text.
footnotes_pml()
Returns a string containing all of the footnotes in a form suitable to append to the end of PML text output. This is called as part of "pml()".
footnotes_html()
Returns a string containing all of the footnotes in a form suitable to append to the end of HTML text output. This is called as part of "html()".
pml()
Returns a string containing the entire original document text in its original encoding, including all sidebars and footnotes.
html()
Returns a string containing the entire document text (including all sidebars and footnotes) converted to HTML.
Note that the PML text is stored in the object (and thus retrieving it is very fast), but generating the HTML output requires that the text be converted every time this method is used, consuming extra processing time.
sidebars()
Returns a hash containing all of the sidebars found in the file, where the keys are the sidebar ids and the values contain the sidebar text.
sidebars_pml()
Returns a string containing all of the sidebars in a form suitable to append to the end of PML text output. This is called as part of "pml()".
sidebars_html()
Returns a string containing all of the sidebars in a form suitable to append to the end of HTML text output. This is called as part of "html()".
write_html($filename)
Writes the book text to disk in HTML form (including all sidebars and footnotes) with the given filename.
If $filename is not specified, writes to $self->filebase with a ".html" extension.
Returns the filename used on success, or undef if there was no text to write.
write_images()
Writes each image record to the disk.
Returns a list containing the filenames of all images written, or undef if none were found.
write_pml($filename)
Writes the raw book text to disk in PML form (including all sidebars and footnotes) with the given filename.
If $filename is not specified, writes to $self->filebase with a ".pml" extension.
Returns the filename used on success, or undef if there was no text to write.
write_unknown_records()
Writes each unidentified record to disk with a filename in the format of 'raw-record-####', where #### is the record number (not the record ID).
Returns the number of records written.
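As a rough illustration of the filename pattern, the sketch below builds such a name from a record number. It assumes that '####' denotes a zero-padded four-digit number, which the text does not state explicitly; raw_record_name is a hypothetical helper and not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of EBook::Tools::EReader).
# Assumption: '####' means the record number zero-padded to four digits.
sub raw_record_name {
    my ($record_number) = @_;
    return sprintf('raw-record-%04d', $record_number);
}

print raw_record_name(7), "\n";   # prints "raw-record-0007"
```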
MODIFIER METHODS
Load($filename)
Sets $self->{filename} and then loads and parses the file specified by $filename, calling "ParseRecord(%record)" on every record found.
ParseRecord(%record)
Parses PDB records, updating the object attributes. This method is called automatically on every database record during Load().
ParseRecord0($data)
Parses the header record and places the parsed values into the hashref $self->{header}.
Returns the hash (not the hashref).
PROCEDURES
cp1252_to_pml()
An unfinished and completely nonfunctional procedure to convert Windows-1252 characters to PML \a codes.
DO NOT USE.
pml_to_html($text,$filebase)
Takes as input a text string in Windows-1252 encoding containing PML markup codes and returns a string with those codes converted to UTF-8 HTML.
Requires a second argument $filebase to specify the basename of the file (or specifically, the basename of the file to which output text will be written) so that image links can be generated correctly.
BUGS AND LIMITATIONS
HTML conversion doesn't handle the \T command used to indent.
HTML conversion may be suboptimal in many ways.
Most notably, all linebreaks are handled as <br />, and without any heed to whether those linebreaks occur inside of some other element.
AUTHOR
Zed Pobre <zed@debian.org>
LICENSE AND COPYRIGHT
Copyright 2008 Zed Pobre
Licensed to the public under the terms of the GNU GPL, version 2