NAME
RTF::HTMLConverter - Converter from RTF format to HTML.
SYNOPSIS
    # Convert a file to a file using the default DOM implementation
    use XML::GDOME;
    use RTF::HTMLConverter;

    my $parser = RTF::HTMLConverter->new(
        in  => 'test.rtf',
        out => 'test.html',
    );
    $parser->parse();

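    # Read from an open file handle and build the document with XML::DOM
    use XML::DOM;
    use RTF::HTMLConverter;

    open my $in, '<', 'test.rtf' or die "Cannot open test.rtf: $!";
    my $parser = RTF::HTMLConverter->new(
        in                => $in,
        out               => 'test.html',
        DOMImplementation => 'XML::DOM',
        image_uri         => "http://somewhere.net/images",
        codepage          => 'iso-8859-1',  # documented as recognized only with XML::GDOME
    );
    $parser->parse();
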
    # Collect the HTML in a scalar and skip image processing
    use XML::GDOME;
    use RTF::HTMLConverter;

    my $html = '';
    my $parser = RTF::HTMLConverter->new(
        in             => 'test.rtf',
        out            => \$html,
        discard_images => 1,
    );
    $parser->parse();
DESCRIPTION
RTF::HTMLConverter is a high-level RTF-to-HTML converter. It is based on the low-level RTF parser module RTF::Lexer. It also requires a W3C DOM implementation and is known to work with both XML::DOM and XML::GDOME.
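Since either DOM implementation can be used, a script may want to check which one is installed before constructing the converter. The following is a minimal sketch using only core Perl; pick_dom_implementation is a hypothetical helper and not part of RTF::HTMLConverter.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RTF::HTMLConverter): returns the first
# module from the candidate list that can be loaded, or nothing if none can.
sub pick_dom_implementation {
    my @candidates = @_ ? @_ : ('XML::GDOME', 'XML::DOM');
    for my $module (@candidates) {
        # Turn Some::Module into Some/Module.pm so require can load it by path
        (my $file = "$module.pm") =~ s{::}{/}g;
        return $module if eval { require $file; 1 };
    }
    return;
}

my $impl = pick_dom_implementation();
print defined $impl
    ? "Using $impl\n"
    : "No supported DOM implementation found\n";
```

The returned name could then be passed as the DOMImplementation parameter of the constructor.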
METHODS
- new
The constructor. The following parameters are recognized:
- in
Input file handle or file name. The default value is \*STDIN. See RTF::Lexer for more information.
- out
Output file handle, file name, or scalar reference. If this parameter is a string, it is treated as a file name and the constructor tries to open that file. If the file already exists, it is truncated. If the file cannot be opened, an exception is thrown. If this parameter is a scalar reference, the resulting HTML is stored in that scalar.
- DOMImplementation
The DOM implementation module name. Supported values are XML::DOM and XML::GDOME. The default value is XML::GDOME.
- codepage
The character set of the resulting HTML document. The default is utf8. This parameter is recognized only if DOMImplementation is XML::GDOME.
- formatting
The formatting of the resulting HTML document. This parameter is recognized only if DOMImplementation is XML::GDOME. Possible values are GDOME_SAVE_STANDARD and GDOME_SAVE_LIBXML_INDENT. See XML::GDOME::Document for more information. The default value is GDOME_SAVE_LIBXML_INDENT.
- doctype
A reference to an array: ($name, $publicId, $systemId) if DOMImplementation is XML::GDOME, or ($name, $systemId, $publicId) if DOMImplementation is XML::DOM. Default values are:
- discard_images
When set, this parameter disables all image processing. It is unset by default.
- image_uri
A string that, when concatenated with the image name, gives the image's URL. The default value is an empty string.
- image_dir
The directory in which the images are generated. The default value is an empty string, which means the current directory.
- image_names
The pattern for generating image names from their number. The default value is img%d.
- image_convert
A path to ImageMagick's convert utility. The default value is simply convert, assuming it is in one of the $ENV{PATH} directories.
- image_mogrify
A path to ImageMagick's mogrify utility. If the value is undef or the specified file does not exist, the images extracted from the RTF will not be scaled. The default value is mogrify.
- image_wmf2eps
A path to libwmf's wmf2eps utility. If the value is undef or the specified file does not exist, WMF images will not be extracted from the RTF. The default value is wmf2eps.
- screen_resolution
The display resolution in dpi. The default value is 100.
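The image_names, image_dir and image_uri parameters together determine where an extracted image is written and how it is referenced. The sketch below illustrates one plausible reading of those descriptions in plain Perl. The helper image_location, the .png extension, the order of concatenation and the use of "/" to join the directory and the name are all assumptions made for illustration, not the module's documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RTF::HTMLConverter): builds the name,
# on-disk path and URL for image number $n from the three options.
sub image_location {
    my ($n, %opt) = @_;
    my $pattern = defined $opt{image_names} ? $opt{image_names} : 'img%d';
    my $dir     = defined $opt{image_dir}   ? $opt{image_dir}   : '';
    my $uri     = defined $opt{image_uri}   ? $opt{image_uri}   : '';
    my $ext     = '.png';    # assumption: the real extension is not documented

    my $name = sprintf($pattern, $n) . $ext;
    my $path = length $dir ? "$dir/$name" : $name;   # empty dir means current directory
    my $url  = $uri . $name;                         # plain concatenation, URI first (assumed)
    return ($name, $path, $url);
}

my ($name, $path, $url) = image_location(
    3,
    image_dir => 'out/images',
    image_uri => 'http://somewhere.net/images/',
);
print "$name\n$path\n$url\n";
```

Because the URL is formed by plain concatenation in this sketch, image_uri is given a trailing slash here. Whether the module adds a separator itself is not stated in this document.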
- parse
Parses the input RTF stream until the end of the file.
SEE ALSO
RTF::Lexer, Rich Text Format (RTF) Specification (version 1.7), The_RTF_Cookbook, RTF::Parser, RTF::Tokenizer.
KNOWN BUGS
- Symbols that are absent from the Unicode character set will be displayed incorrectly.
- Images that are stored in the RTF file in WMF format may be scaled incorrectly.
- Text in WMF images that uses a non-ASCII charset may be displayed incorrectly.
And there are probably lots of unknown bugs ;)
AUTHOR
Vadim O. Ustiansky <ustiansky@cpan.org>