NAME
CSS::Adaptor::Whitelist -- filter out potentially dangerous CSS
SYNOPSIS
use CSS;
use CSS::Adaptor::Whitelist;
my $css = CSS->new({ adaptor => 'CSS::Adaptor::Whitelist' });
$css->parse_string( <<EOCSS );
body {
margin: 0;
background-image: url(javascript:alert("I am an evil hacker"));
}
#main {
background-color: yellow;
content-after: '<img src="http://example.com/xxx-rated-picture.jpg">';
}
EOCSS
print $css->output;
# prints:
# body {
# margin: 0;
# }
# #main {
# background-color: yellow;
# }
# allow the foo property, but only with value "bar" or "baz"
# 1) regex way
$CSS::Adaptor::Whitelist::whitelist{foo} = qr/^ba[rz]$/;
# 2) hash way
$CSS::Adaptor::Whitelist::whitelist{foo} = {bar => 1, baz => 1};
# 3) sub way
$CSS::Adaptor::Whitelist::whitelist{foo} = sub {
return ($_[0] eq 'bar' or $_[0] eq 'baz')
};
DESCRIPTION
This is a subclass of CSS::Adaptor that paranoidly prunes anything from the input CSS code that it doesn't recognize as standard.
It is intended for validating user-supplied CSS before allowing it on your site. The allowed CSS properties and their corresponding values were mostly taken from w3schools.com/css.
Whitelist
The allowed constructs are given in the %CSS::Adaptor::Whitelist::whitelist hash. The keys are the allowed properties and the values can be 1) regular expressions, 2) code refs or 3) hash refs.
Each CSS property is looked up in the whitelist. If it is not found, it is discarded.
Each CSS value found is checked. If it passes the test, it is output with standard indentation; otherwise a message is passed to the log method.
In the case of a regexp, the value is matched against it. If it matches, the value passes.
In the case of a subroutine, the value is passed to it as the only argument. If the sub returns a true value, the CSS value passes.
In the case of a hash, the CSS value passes if it is a key in the hash associated with a true value.
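The three kinds of rule can be pictured as a dispatch on the reference type. The following is only a sketch of that idea, not the module's actual code; value_passes is a hypothetical helper name introduced for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of CSS::Adaptor::Whitelist:
# decides whether $value passes the $rule stored for a property.
sub value_passes {
    my ($rule, $value) = @_;
    my $type = ref $rule;
    return $value =~ $rule ? 1 : 0 if $type eq 'Regexp';   # 1) regex way
    return $rule->{$value} ? 1 : 0 if $type eq 'HASH';     # 2) hash way
    return $rule->($value) ? 1 : 0 if $type eq 'CODE';     # 3) sub way
    return 0;   # unknown rule type: reject, in the spirit of whitelisting
}

print value_passes(qr/^ba[rz]$/, 'bar'), "\n";
print value_passes({ bar => 1, baz => 1 }, 'qux'), "\n";
```

Rejecting by default when the rule type is unrecognized is a design choice of this sketch; it keeps the "discard what is not known" behaviour described above.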
Overriding defaults
You are invited to modify the rules, particularly the ones that allow URLs. See set_url_re for a convenient way.
Also, the font-family (and thus also font) properties are quite generous. Feel free to allow just a list of expected font families:
$CSS::Adaptor::Whitelist::whitelist{'font-family'} = qr/^(?:arial|verdana|...)$/;
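When building such a regexp from a list, keep in mind that in Perl the ^ and $ anchors bind to individual alternatives unless the alternation is grouped. A minimal sketch, assuming a hypothetical list of families (@families is illustrative, not something the module provides):

```perl
use strict;
use warnings;

# Hypothetical list of allowed families; substitute your own.
my @families = qw(arial verdana helvetica);

# Escape each name and join them into one alternation.
my $alternation = join '|', map { quotemeta($_) } @families;

# The (?: ... ) group makes ^ and $ apply to every alternative.
my $font_family_re = qr/^(?:$alternation)$/i;

for my $value ('Arial', 'verdana', 'arial, something-else') {
    printf "%-25s %s\n", $value, $value =~ $font_family_re ? 'passes' : 'rejected';
}
```

The resulting regexp could then be assigned to $CSS::Adaptor::Whitelist::whitelist{'font-family'} in the same way as the example above.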
Functions
- list2hash
Simplifies giving values in the hash way. Returns a hashref.
list2hash('foo', 'bar', 'baz') # returns {foo => 1, bar => 1, baz => 1}
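A helper with this behaviour could look roughly like the following. This is an illustrative sketch under the name my_list2hash (hypothetical), not the module's source.

```perl
use strict;
use warnings;

# Hypothetical equivalent of list2hash: turns a list of allowed
# values into a hashref whose keys all map to a true value.
sub my_list2hash {
    my @values = @_;
    my %allowed = map { $_ => 1 } @values;
    return \%allowed;
}

my $rule = my_list2hash('foo', 'bar', 'baz');
print join(', ', sort keys %$rule), "\n";
```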
- space_sep_res
space_sep_res($string, $regex, $regex, ...) # returns 1 or 0
SPACE-SEParated Regular ExpressionS. Given a string like
1px solid #CCFF55
and regular expressions for CSS dimension, border type and CSS color, checks whether the string matches these regexps piece by piece. It will fail if one of the regexps matches too small a chunk, for example:
space_sep_res('solid #CCFF55', qr/solid|dotted/, qr/#[A-F\d]{3}|#[A-F\d]{6}/)
will return 0 because the latter regexp stops after matching #CCF.
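To see why the order of alternatives matters, here is a sketch of piece-by-piece matching. It assumes each regexp is consumed from the front of the string with no backtracking across pieces; sketch_space_sep_res is a hypothetical stand-in and the real function may be implemented differently.

```perl
use strict;
use warnings;

# Hypothetical re-implementation for illustration only.
sub sketch_space_sep_res {
    my ($string, @regexes) = @_;
    for my $i (0 .. $#regexes) {
        # Each regexp must match at the start of what is left.
        $string =~ s/\A(?:$regexes[$i])// or return 0;
        # Pieces are separated by whitespace, except after the last one.
        if ($i < $#regexes) {
            $string =~ s/\A\s+// or return 0;
        }
    }
    # Leftover text means some regexp matched too small a chunk.
    return length($string) ? 0 : 1;
}

my $type = qr/solid|dotted/;

# Short alternative first: stops after "#CCF", leaving "F55" behind.
print sketch_space_sep_res('solid #CCFF55', $type, qr/#[A-F\d]{3}|#[A-F\d]{6}/), "\n";

# Longer alternative first: the whole color is consumed.
print sketch_space_sep_res('solid #CCFF55', $type, qr/#[A-F\d]{6}|#[A-F\d]{3}/), "\n";
```

Under these assumptions, putting the longer alternative first is enough to make the example pass.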
Also beware that the regular expressions provided MUST NOT contain capturing parentheses, otherwise the function will not work. Use (?: ... ) for non-capturing grouping.
- set_url_re
Sets the regular expression that URLs are checked against, including the url( ) wrapper. You are encouraged to use it to provide a regexp that only allows URLs on domains you control:
CSS::Adaptor::Whitelist::set_url_re(qr{url\(https?://example\.com/[\w/]+\)});
Notice that the regexp should not be anchored (no ^ and $ at the edges). It is used in these properties: cursor, background, background-image, list-style, list-style-image.
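One plausible reason for leaving the regexp unanchored is that it gets embedded in larger, already anchored property rules. The sketch below illustrates that idea only; the $list_style_image rule and the allowed helper are assumptions made for the example, not the module's real rules. The url( ) wrapper is written here with its literal parentheses escaped.

```perl
use strict;
use warnings;

# URL regexp restricted to one domain, url( ) wrapper included.
my $url_re = qr{url\(https?://example\.com/[\w/]+\)};

# ASSUMPTION: a property rule embeds the URL regexp next to other
# allowed values and supplies the anchors itself.
my $list_style_image = qr/^(?:none|inherit|$url_re)$/;

# Hypothetical helper: returns 1 if the value is allowed, 0 otherwise.
sub allowed { return $_[0] =~ $list_style_image ? 1 : 0 }

for my $value ('none',
               'url(http://example.com/img/bullet)',
               'url(javascript:alert(1))') {
    printf "%-40s %s\n", $value, allowed($value) ? 'passes' : 'rejected';
}
```

Note that the character class [\w/] does not admit dots, so a path such as img/bullet.png would be rejected by this particular regexp.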
- log
This is a method that stores messages about filtered-out content in the @CSS::Adaptor::Whitelist::message_log array. You are encouraged to override or redefine this method to handle the log messages in line with your logging habits.
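Overriding log in a subclass might look like the sketch below. To keep it runnable without the module, it uses a stand-in base class (Sketch::Base) instead of CSS::Adaptor::Whitelist; both package names are hypothetical, and the calling convention $self->log($message) is an assumption rather than documented behaviour.

```perl
use strict;
use warnings;

# Stand-in for the real adaptor so the sketch runs on its own.
# ASSUMPTION: log is called as $self->log($message).
package Sketch::Base {
    our @message_log;
    sub new { return bless {}, shift }
    sub log {
        my ($self, $message) = @_;
        push @message_log, $message;
        return;
    }
}

# Hypothetical subclass that sends messages to warn() instead of
# collecting them in the array.
package Sketch::Warning {
    our @ISA = ('Sketch::Base');
    sub log {
        my ($self, $message) = @_;
        warn "CSS filtered: $message\n";
        return;
    }
}

package main;

my $adaptor = Sketch::Warning->new;
```

With the real module, the same pattern would presumably mean inheriting from CSS::Adaptor::Whitelist and passing the subclass name as the adaptor.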
AUTHOR
Oldrich Kruza <sixtease@cpan.org>
http://www.sixtease.net/
COPYRIGHT
Copyright (c) 2009 Oldrich Kruza. All rights reserved.
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.