NAME
Pod::POM::View::HTML::Filter - Use filters on sections of your pod documents
SYNOPSIS
In your POD:
Some coloured Perl code:
=begin filter perl
# now in full colour!
$A++;
=end filter
=for filter=perl $A++; # this works too
This should read C<bar bar bar>:
=begin filter foo
bar foo bar
=end filter
In your code:
my $view = Pod::POM::View::HTML::Filter->new;
$view->add(
    foo => {
        code => sub { my $s = shift; $s =~ s/foo/bar/gm; $s },
        # other options are available
    }
);
my $pom = Pod::POM->parse_file( '/my/pod/file' );
$pom->present($view);
The resulting HTML will look like this (modulo the stylesheet):
# now in full colour!
$A++;
$A++; # this works too
This should read bar bar bar:
bar bar bar
DESCRIPTION
This module is a subclass of Pod::POM::View::HTML that supports the filter extension. This can be used in =begin / =end and =for pod blocks.
Please note that since the view maintains an internal state, only an instance of the view can be used to present the POM object. Either use:
my $view = Pod::POM::View::HTML::Filter->new;
$pom->present( $view );
or
$Pod::POM::DEFAULT_VIEW = Pod::POM::View::HTML::Filter->new;
$pom->present;
Even though the module was specifically designed for use with Perl::Tidy, you can write your own filters quite easily (see "Writing your own filters").
FILTERING POD?
The whole idea of this module is to take advantage of all the syntax colouring modules that exist (actually, Perl::Tidy was my first target) to produce colourful code examples in a POD document (after conversion to HTML).
Filters can be used in two different POD constructs:
=begin filter filter
The data in the =begin filter ... =end filter region is passed to the filter and the result is output in place in the document.
The general form of a =begin filter block is as follows:
=begin filter lang optionstring
# some text to process with filter "lang"
=end filter
The optionstring is trimmed of surrounding whitespace and passed as a single string to the filter routine, which must perform its own parsing.
=for filter=filter
=for filters work just like =begin / =end filters, except that a single paragraph is the target.
The general form of a =for filter block is as follows:
=for filter=lang:option1:option2
# some code in language lang
The option string sent to the filter lang would be "option1 option2" (colons are replaced with spaces).
Options
Some filters may accept options that alter their behaviour. Options are separated by whitespace, and appear after the name of the filter. For example, the following code will be rendered in colour and with line numbers:
=begin filter perl -nnn
$a = 123;
$b = 3;
print $a * $b; # prints 369
print $a x $b; # prints 123123123
=end filter
=for filters can also accept options, but the syntax is less clear. (This is because =for expects the formatname to match \S+.)
The syntax is the following:
=for filter=html:nnn=1
<center><img src="camel.png" />
A camel</center>
In summary, options are separated by spaces for =begin blocks and by colons for =for paragraphs.
The options and their parameters depend on the filter, but they cannot contain the pipe (|) or colon (:) characters, since those are already used to separate filters and options.
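The colon-to-space rule for =for paragraphs can be illustrated with a minimal sketch. The helper name parse_for_format is hypothetical and is not part of the module's API; it only mimics the behaviour described above, using core Perl.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's own code): split the format name
# of a "=for" paragraph into a filter name and an option string.
sub parse_for_format {
    my ($format) = @_;                       # e.g. "filter=html:nnn=1"
    my ($spec) = $format =~ /^filter=(\S+)$/
        or return;                           # not a filter paragraph
    my ( $lang, @options ) = split /:/, $spec;
    return ( $lang, join ' ', @options );    # colons become spaces
}

my ( $lang, $opts ) = parse_for_format('filter=lang:option1:option2');
print "$lang => '$opts'\n";    # lang => 'option1 option2'
```

Because the colon is the separator, an option value such as nnn=1 survives intact, while a value containing a colon could not be expressed.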
Pipes
Having a filter to modify a block of text is useful, but what's more useful (and fun) than a filter? Answer: a stack of filters piped together!
Take the imaginary filters foo (which does a simple s/foo/bar/g) and bang (which does an even simpler tr/r/!/). The following block
=begin filter foo|bang
foo bar baz
=end
will become ba! ba! baz.
And naturally,
=for filter=bang|foo
foo bar baz
will return bar ba! baz.
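The effect of filter order can be checked with a small standalone sketch. The %filter table and the run_pipe helper are hypothetical stand-ins for the view's internal machinery; only the two toy substitutions come from the text above.

```perl
use strict;
use warnings;

# Two toy filters matching the imaginary "foo" and "bang" filters.
my %filter = (
    foo  => sub { my $s = shift; $s =~ s/foo/bar/g; $s },
    bang => sub { my $s = shift; $s =~ tr/r/!/;     $s },
);

# Hypothetical helper: run $text through each filter named in $pipe,
# left to right. Unknown names are skipped in this sketch.
sub run_pipe {
    my ( $pipe, $text ) = @_;
    for my $name ( split /\s*\|\s*/, $pipe ) {
        my $code = $filter{$name} or next;
        $text = $code->($text);
    }
    return $text;
}

print run_pipe( 'foo|bang', 'foo bar baz' ), "\n";    # ba! ba! baz
print run_pipe( 'bang|foo', 'foo bar baz' ), "\n";    # bar ba! baz
```

Each filter receives the output of the previous one, which is why swapping the two names changes the result.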
A note on verbatim and text blocks
Note: The fact that I mention verbatim and paragraph in this section is due to an old bug in Pod::POM, which parses the content of begin / end sections as the usual POD paragraph and verbatim blocks. Pod::POM::View::HTML::Filter tries to work around this bug.
As of version 0.06, Pod::POM::View::HTML::Filter gets to the original text contained in the =begin / =end block (it was easier than I thought, actually) and puts that string through all the filters.
If any filter in the stack is defined as verbatim, or if Pod::POM detects any block in the =begin / =end block as verbatim, then the output will be produced between <pre> and </pre> tags. Otherwise, no special tags will be added (this is left to the formatter).
Examples
The power of pipes can be seen in the following example. Take a bit of Perl code to colour:
=begin filter perl
"hot cross buns" =~ /cross/;
print "Matched: <$`> $& <$'>\n"; # Matched: <hot > cross < buns>
print "Left: <$`>\n"; # Left: <hot >
print "Match: <$&>\n"; # Match: <cross>
print "Right: <$'>\n"; # Right: < buns>
=end
This will produce the following HTML code:
<pre> <span class="q">"hot cross buns"</span> =~ <span class="q">/cross/</span><span class="sc">;</span>
<span class="k">print</span> <span class="q">"Matched: <$`> $& <$'>\n"</span><span class="sc">;</span> <span class="c"># Matched: <hot > cross < buns></span>
<span class="k">print</span> <span class="q">"Left: <$`>\n"</span><span class="sc">;</span> <span class="c"># Left: <hot ></span>
<span class="k">print</span> <span class="q">"Match: <$&>\n"</span><span class="sc">;</span> <span class="c"># Match: <cross></span>
<span class="k">print</span> <span class="q">"Right: <$'>\n"</span><span class="sc">;</span> <span class="c"># Right: < buns></span></pre>
Which your browser will render as:
"hot cross buns" =~ /cross/;
print "Matched: <$`> $& <$'>\n"; # Matched: <hot > cross < buns>
print "Left: <$`>\n"; # Left: <hot >
print "Match: <$&>\n"; # Match: <cross>
print "Right: <$'>\n"; # Right: < buns>
Now if you want to colour and number the HTML code produced, it's as simple as tacking the html filter on top of the perl filter:
=begin filter perl | html nnn=1
"hot cross buns" =~ /cross/;
print "Matched: <$`> $& <$'>\n"; # Matched: <hot > cross < buns>
print "Left: <$`>\n"; # Left: <hot >
print "Match: <$&>\n"; # Match: <cross>
print "Right: <$'>\n"; # Right: < buns>
=end
Which produces the rather unreadable piece of HTML:
<pre><span class="h-lno"> 1</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>hot cross buns<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> =~ <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span>/cross/<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span>
<span class="h-lno"> 2</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"k</span>"<span class="h-ab">></span>print<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>Matched: <span class="h-ent">&lt;</span>$`<span class="h-ent">&gt;</span> $<span class="h-ent">&amp;</span> <span class="h-ent">&lt;</span>$'<span class="h-ent">&gt;</span>\n<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"c</span>"<span class="h-ab">></span># Matched: <span class="h-ent">&lt;</span>hot <span class="h-ent">&gt;</span> cross <span class="h-ent">&lt;</span> buns<span class="h-ent">&gt;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span>
<span class="h-lno"> 3</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"k</span>"<span class="h-ab">></span>print<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>Left: <span class="h-ent">&lt;</span>$`<span class="h-ent">&gt;</span>\n<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"c</span>"<span class="h-ab">></span># Left: <span class="h-ent">&lt;</span>hot <span class="h-ent">&gt;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span>
<span class="h-lno"> 4</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"k</span>"<span class="h-ab">></span>print<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>Match: <span class="h-ent">&lt;</span>$<span class="h-ent">&amp;</span><span class="h-ent">&gt;</span>\n<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"c</span>"<span class="h-ab">></span># Match: <span class="h-ent">&lt;</span>cross<span class="h-ent">&gt;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span>
<span class="h-lno"> 5</span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"k</span>"<span class="h-ab">></span>print<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"q</span>"<span class="h-ab">></span><span class="h-ent">&quot;</span>Right: <span class="h-ent">&lt;</span>$'<span class="h-ent">&gt;</span>\n<span class="h-ent">&quot;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span><span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"sc</span>"<span class="h-ab">></span>;<span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span> <span class="h-ab"><</span><span class="h-tag">span</span> <span class="h-attr">class</span>=<span class="h-attv">"c</span>"<span class="h-ab">></span># Right: <span class="h-ent">&lt;</span> buns<span class="h-ent">&gt;</span><span class="h-ab"></</span><span class="h-tag">span</span><span class="h-ab">></span></pre>
But your browser will render it as:
1 <span class="q">"hot cross buns"</span> =~ <span class="q">/cross/</span><span class="sc">;</span>
2 <span class="k">print</span> <span class="q">"Matched: <$`> $& <$'>\n"</span><span class="sc">;</span> <span class="c"># Matched: <hot > cross < buns></span>
3 <span class="k">print</span> <span class="q">"Left: <$`>\n"</span><span class="sc">;</span> <span class="c"># Left: <hot ></span>
4 <span class="k">print</span> <span class="q">"Match: <$&>\n"</span><span class="sc">;</span> <span class="c"># Match: <cross></span>
5 <span class="k">print</span> <span class="q">"Right: <$'>\n"</span><span class="sc">;</span> <span class="c"># Right: < buns></span>
Caveats
There were a few things to keep in mind when mixing verbatim and text paragraphs in a =begin block. These problems no longer exist as of version 0.06.
- Text paragraphs are not processed for POD escapes any more
Because the =begin / =end block is now processed as a single string of text, the following block:
=begin filter html
B<foo>
=end
will not be transformed into <b>foo</b> before being passed to the filters, but will produce the expected:
<pre>B<span class="h-ab"><</span><span class="h-tag">foo</span><span class="h-ab">></span></pre>
This will be rendered by your web browser as:
B<foo>
And the same text in a verbatim block
=begin filter html
    B<foo>
=end
will produce the same results.
<pre> B<span class="h-ab"><</span><span class="h-tag">foo</span><span class="h-ab">></span></pre>
Which a web browser will render as:
B<foo>
Which looks quite the same, doesn't it?
- Separate paragraphs aren't filtered separately any more
As seen in "A note on verbatim and text blocks", the filter now processes the begin block as a single string of text. So, if you have a filter that replaces each * character with an auto-incremented number in square brackets, like this:
$view->add(
    notes => {
        code => sub {
            my ( $text, $opt ) = @_;
            my $n = $opt =~ /(\d+)/ ? $1 : 1;
            $text =~ s/\*/'[' . $n++ . ']'/ge;
            $text;
        }
    }
);
And you try to process the following block:
=begin filter notes 2
TIMTOWDI*, but your library should DWIM* when possible.

You can't always claims that PICNIC*, can you?
=end filter
You'll get the expected result (contrary to previous versions):
<p>TIMTOWDI[2], but your library should DWIM[3] when possible. You can't always claims that PICNIC[4], can you?</p>
The filter was really called only once, starting at 2, just like requested.
Future versions of Pod::POM::View::HTML::Filter may support init, begin and end callbacks to run filter initialisation and clean up code.
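The difference between filtering the whole block once and filtering each paragraph separately can be reproduced without a view object. This is a sketch only: notes_filter repeats the logic of the notes filter above as a named sub, and the paragraph-by-paragraph call is an assumed approximation of the older behaviour, not the module's actual code.

```perl
use strict;
use warnings;

# Same logic as the "notes" filter above, callable on its own.
sub notes_filter {
    my ( $text, $opt ) = @_;
    my $n = defined($opt) && $opt =~ /(\d+)/ ? $1 : 1;    # starting number
    $text =~ s/\*/'[' . $n++ . ']'/ge;
    return $text;
}

my $block = "TIMTOWDI*, but your library should DWIM* when possible.\n\n"
    . "You can't always claims that PICNIC*, can you?\n";

# Whole block filtered once: the numbering runs 2, 3, 4.
my $once = notes_filter( $block, '2' );

# Each paragraph filtered on its own: the numbering restarts at 2.
my $split = join "\n\n", map { notes_filter( $_, '2' ) } split /\n\n/, $block;

print $once, "----\n", $split;
```

In the second result PICNIC is followed by [2] rather than [4], because the counter lives inside a single call to the filter.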
METHODS
Public methods
The following methods are available:
add( lang => { options }, ... )
Add support for one or more languages. Options are passed in a hash reference.
The required code option is a reference to the filter routine. The filter takes the text to process and an option string (see "Writing your own filters") and must return the formatted HTML string (coloured according to the language grammar, hopefully).
Available options are:
Name      Type      Content
----      ----      -------
code      CODEREF   filter implementation
verbatim  BOOLEAN   if true, force the full content of the
                    =begin/=end block to be passed verbatim
                    to the filter
requires  ARRAYREF  list of required modules for this filter
Note that add() is both a class and an instance method.
When used as a class method, the new language is immediately available for all future and existing instances.
When used as an instance method, the new language is only available for the instance itself.
delete( $lang )
Remove the given language from the list of class or instance filters. The deleted filter is returned by this method.
delete() is both a class and an instance method, just like add().
filters()
Return the list of languages supported.
know( $lang )
Return true if the view knows how to handle language $lang.
Overloaded methods
The following Pod::POM::View::HTML methods are overridden in Pod::POM::View::HTML::Filter:
new()
The overridden constructor initialises some internal structures. This means that you'll have to use an instance of the class as a view for your Pod::POM object. Therefore you must use new.
$Pod::POM::DEFAULT_VIEW = 'Pod::POM::View::HTML::Filter'; # WRONG
$pom->present( 'Pod::POM::View::HTML::Filter' ); # WRONG
# this is CORRECT
$Pod::POM::DEFAULT_VIEW = Pod::POM::View::HTML::Filter->new;
# this is also CORRECT
my $view = Pod::POM::View::HTML::Filter->new;
$pom->present( $view );
The only option at this time is auto_unindent, which is enabled by default. This option removes leading indentation from all verbatim blocks within the begin blocks, and puts it back after highlighting.
view_begin()
view_for()
These are the methods that support the filter format.
FILTERS
Built-in filters
Pod::POM::View::HTML::Filter is shipped with a few built-in filters.
The name for the filter is obtained by removing _filter from the names listed below (except for default):
- default
This filter is called when the required filter is not known by Pod::POM::View::HTML::Filter. It does nothing more than normal POD processing (POD escapes for text paragraphs and <pre> for verbatim paragraphs).
You can use the delete() method to remove a filter and therefore make it behave like default.
- perl_tidy_filter
This filter does Perl syntax highlighting with a lot of help from Perl::Tidy.
It accepts options to Perl::Tidy, such as -nnn to number lines of code. Check Perl::Tidy's documentation for more information about those options.
- perl_ppi_filter
This filter does Perl syntax highlighting using PPI::HTML, which is itself based on the incredible PPI.
It accepts the same options as PPI::HTML, which at this time consist solely of line_numbers to, as one may guess, add line numbers to the output.
- html_filter
This filter does HTML syntax highlighting with the help of Syntax::Highlight::HTML.
The filter supports Syntax::Highlight::HTML options:
=begin filter html nnn=1
<p>The lines of the HTML code will be numbered.</p>
<p>This is line 2.</p>
=end filter
See Syntax::Highlight::HTML for the list of supported options.
- shell_filter
This filter does shell script syntax highlighting with the help of Syntax::Highlight::Shell.
The filter supports Syntax::Highlight::Shell options:
=begin filter shell nnn=1
#!/bin/sh
echo "This is a foo test" | sed -e 's/foo/shell/'
=end filter
See Syntax::Highlight::Shell for the list of supported options.
- kate_filter
This filter supports syntax highlighting for numerous languages with the help of Syntax::Highlight::Engine::Kate.
The filter supports Syntax::Highlight::Engine::Kate languages as options:
=begin filter kate Diff
Index: lib/Pod/POM/View/HTML/Filter.pm
===================================================================
--- lib/Pod/POM/View/HTML/Filter.pm (revision 99)
+++ lib/Pod/POM/View/HTML/Filter.pm (working copy)
@@ -27,6 +27,11 @@
         requires => [qw( Syntax::Highlight::Shell )],
         verbatim => 1,
     },
+    kate => {
+        code => \&kate_filter,
+        requires => [qw( Syntax::Highlight::Engine::Kate )],
+        verbatim => 1,
+    },
 );
 
 my $HTML_PROTECT = 0;
=end filter
Check the Syntax::Highlight::Engine::Kate documentation for the full list of supported languages. Please note that some of them aren't well supported yet (by Syntax::Highlight::Engine::Kate), so the output may not be what you expect.
Here is a list of languages we have successfully tested with Syntax::Highlight::Engine::Kate version 0.02: C, Diff, Fortran, JavaScript, LDIF, SQL.
- wiki_filter
This filter converts the wiki format parsed by Text::WikiFormat to HTML.
The supported options are: prefix, extended, implicit_links, absolute_links. The option and value are separated by a = character, as in the example below:
=begin filter wiki extended=1
[link|title]
=end
- wikimedia_filter
This filter converts the wiki format parsed by Text::MediawikiFormat to HTML.
The supported options are: prefix, extended, implicit_links, absolute_links and process_html. The option and value are separated by a = character.
Writing your own filters
Writing a filter is quite easy: a filter is a subroutine that takes two arguments (text to parse and option string) and returns the filtered string.
The filter is added to Pod::POM::View::HTML::Filter's internal filter list with the add() method:
$view->add(
    foo => {
        code     => \&foo_filter,
        requires => [],
    }
);
When presenting the following piece of pod,
=begin filter foo bar baz
Some text to filter.
=end filter
the foo_filter() routine will be called with two arguments, like this:
foo_filter( "Some text to filter.", "bar baz" );
If you have a complex set of options, your routine will have to parse the option string by itself.
Please note that in a =for
construct, whitespace in the option string must be replaced with colons:
=for filter=foo:bar:baz Some text to filter.
The foo_filter() routine will be called with the same two arguments as before.
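As an illustration of a filter that parses its own option string, here is a minimal sketch. What foo_filter does (HTML escaping plus optional line numbering) and the nnn option name are assumptions chosen for the example; only the two-argument calling convention comes from the text above.

```perl
use strict;
use warnings;

# Hypothetical filter: escapes HTML special characters and, if the
# option string contains "nnn=1", prefixes each line with its number.
sub foo_filter {
    my ( $text, $optstring ) = @_;

    # Parse whitespace-separated "key=value" words; a bare word is true.
    my %opt;
    for my $word ( split ' ', defined($optstring) ? $optstring : '' ) {
        my ( $key, $value ) = split /=/, $word, 2;
        $opt{$key} = defined $value ? $value : 1;
    }

    # Escape the characters that are special in HTML.
    my %esc = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' );
    $text =~ s/([&<>])/$esc{$1}/g;

    # Number the lines on request (split /^/ keeps the newlines).
    if ( $opt{nnn} ) {
        my $n = 0;
        $text = join '', map { sprintf( '%3d %s', ++$n, $_ ) } split /^/, $text;
    }
    return $text;
}

print foo_filter( "if (\$a < \$b) {\n    swap();\n}\n", 'nnn=1' );

# Registration would then follow the add() call shown earlier:
#   $view->add( foo => { code => \&foo_filter, requires => [] } );
```

Keeping the option parsing inside the filter matches the module's contract: the view only hands over the raw option string.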
BUILT-IN FILTERS CSS STYLES
Each filter uses its own CSS classes, so that one can define their favourite colours in a custom CSS file.
perl filter
Perl::Tidy's HTML code looks like:
<span class="i">$A</span>++<span class="sc">;</span>
Here are the classes used by Perl::Tidy:
n numeric
p paren
q quote
s structure
c comment
v v-string
cm comma
w bareword
co colon
pu punctuation
i identifier
j label
h here-doc-target
hh here-doc-text
k keyword
sc semicolon
m subroutine
pd pod-text
ppi filter
PPI::HTML uses the following CSS classes:
comment
double
heredoc_content
interpolate
keyword for language keywords (my, use)
line_number
number
operator for language operators
pragma for pragmas (strict, warnings)
single
structure for syntactic symbols
substitute
symbol
word for module, function and method names
words
match
html filter
Syntax::Highlight::HTML makes use of the following classes:
h-decl declaration # declaration <!DOCTYPE ...>
h-pi process # process instruction <?xml ...?>
h-com comment # comment <!-- ... -->
h-ab angle_bracket # the characters '<' and '>' as tag delimiters
h-tag tag_name # the tag name of an element
h-attr attr_name # the attribute name
h-attv attr_value # the attribute value
h-ent entity # any entities: &eacute; &laquo;
shell filter
Syntax::Highlight::Shell makes use of the following classes:
s-key # shell keywords (like if, for, while, do...)
s-blt # the builtins commands
s-cmd # the external commands
s-arg # the command arguments
s-mta # shell metacharacters (|, >, \, &)
s-quo # the single (') and double (") quotes
s-var # expanded variables: $VARIABLE
s-avr # assigned variables: VARIABLE=value
s-val # shell values (inside quotes)
s-cmt # shell comments
kate filter
Output formatted with Syntax::Highlight::Engine::Kate makes use of the following classes:
k-alert # Alert
k-basen # BaseN
k-bstring # BString
k-char # Char
k-comment # Comment
k-datatype # DataType
k-decval # DecVal
k-error # Error
k-float # Float
k-function # Function
k-istring # IString
k-keyword # Keyword
k-normal # Normal
k-operator # Operator
k-others # Others
k-regionmarker # RegionMarker
k-reserved # Reserved
k-string # String
k-variable # Variable
k-warning # Warning
HISTORY
The goal behind this module was to produce nice looking HTML pages from the articles the French Perl Mongers are writing for the French magazine GNU/Linux Magazine France (http://www.linuxmag-france.org/).
The resulting web pages can be seen at http://articles.mongueurs.net/magazines/.
AUTHOR
Philippe "BooK" Bruhat, <book@cpan.org>
THANKS
Many thanks to Sébastien Aperghis-Tramoni (Maddingue), who helped debug the module and wrote Syntax::Highlight::HTML and Syntax::Highlight::Shell so that I could ship PPVHF with more than one filter. He also pointed me to Syntax::Highlight::Engine::Kate, which led me to clean up PPVHF before adding support for SHEK.
Perl code examples were borrowed from Amelia, aka Programming Perl, 3rd edition.
TODO
There are a few other syntax highlighting modules on CPAN, which I should try to add support for in Pod::POM::View::HTML::Filter:
Syntax::Highlight::Universal
Syntax::Highlight::Mason
Syntax::Highlight::Perl (seems old)
Syntax::Highlight::Perl::Improved
BUGS
Please report any bugs or feature requests to bug-pod-pom-view-html-filter@rt.cpan.org, or through the web interface at http://rt.cpan.org. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
COPYRIGHT & LICENSE
Copyright 2004 Philippe "BooK" Bruhat, All Rights Reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.