NAME

CSS::Adaptor::Whitelist -- filter out potentially dangerous CSS

SYNOPSIS

use CSS;
use CSS::Adaptor::Whitelist;

my $css = CSS->new({ adaptor => 'CSS::Adaptor::Whitelist' });
$css->parse_string( <<EOCSS );
   body {
       margin: 0;
       background-image: url(javascript:alert("I am an evil hacker"));
   }
   #main {
       background-color: yellow;
       content-after: '<img src="http://example.com/xxx-rated-picture.jpg">';
   }
EOCSS

print $css->output;
# prints:
# body {
#     margin: 0;
# }
# #main {
#     background-color: yellow;
# }

# allow the foo property, but only with value "bar" or "baz"
# 1) regex way
$CSS::Adaptor::Whitelist::whitelist{foo} = qr/^ba[rz]$/;
# 2) hash way
$CSS::Adaptor::Whitelist::whitelist{foo} = {bar => 1, baz => 1};
# 3) sub way
$CSS::Adaptor::Whitelist::whitelist{foo} = sub {
   return ($_[0] eq 'bar' or $_[0] eq 'baz')
};

DESCRIPTION

This is a subclass of CSS::Adaptor that paranoidly prunes anything from the input CSS code that it doesn't recognize as standard.

It is intended for validating user-supplied CSS before you let it onto your site. The allowed CSS properties and their corresponding values were mostly taken from w3schools.com/css .

Whitelist

The allowed constructs are given in the %CSS::Adaptor::Whitelist::whitelist hash. The keys are the allowed properties and the values can be 1) regular expressions, 2) code refs or 3) hash refs.

Each CSS property is looked up in the whitelist. If it is not found, it is discarded.

The value of each whitelisted property is then checked against the corresponding rule. If it passes the test, it is output with standard indentation; otherwise it is dropped and a message is passed to the log method.

If the rule is a regexp, the value is matched against it. If it matches, the value passes.

If the rule is a subroutine, the value is passed to it as the only argument. If the sub returns a true value, the CSS value passes.

If the rule is a hash, the CSS value passes if it is a key in the hash that is associated with a true value.
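The three kinds of rules can be pictured as a small dispatch on the reference type. The following is only a sketch of the behaviour described above, not the module's actual code; value_passes is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of CSS::Adaptor::Whitelist: decide whether
# $value passes $rule, where $rule is a regexp, a code ref or a hash ref.
sub value_passes {
    my ($rule, $value) = @_;
    my $type = ref $rule;
    return $value =~ $rule ? 1 : 0 if $type eq 'Regexp';
    return $rule->($value) ? 1 : 0 if $type eq 'CODE';
    return $rule->{$value} ? 1 : 0 if $type eq 'HASH';
    return 0;    # unknown rule type: reject, in the spirit of whitelisting
}

my %rules = (
    regex => qr/^ba[rz]$/,
    hash  => { bar => 1, baz => 1 },
    code  => sub { $_[0] eq 'bar' or $_[0] eq 'baz' },
);

for my $kind (sort keys %rules) {
    for my $value (qw(bar baz qux)) {
        printf "%-5s %-3s %s\n", $kind, $value,
            value_passes($rules{$kind}, $value) ? 'passes' : 'rejected';
    }
}
```

With each of the three rules, bar and baz pass and qux is rejected.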

Overriding defaults

You are invited to modify the rules, particularly the ones that allow URLs. See set_url_re for a convenient way to do so.

The font-family property (and thus also font) is quite generous by default. Feel free to allow just a list of expected font families:

$CSS::Adaptor::Whitelist::whitelist{'font-family'} = qr/^(?:arial|verdana|...)$/;
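When writing such a pattern, keep in mind that | binds more loosely than the ^ and $ anchors, so the alternatives need to be grouped for the anchors to apply to all of them. A minimal sketch of the difference, with two font names and a test value chosen only for illustration:

```perl
use strict;
use warnings;

my $loose  = qr/^arial|verdana$/;        # means: starts with arial, OR ends with verdana
my $strict = qr/^(?:arial|verdana)$/;    # the whole value must be one of the names

for my $value ('arial', 'verdana', 'arial; behavior: url(x)') {
    printf "%-26s loose=%d strict=%d\n", "'$value'",
        ($value =~ $loose  ? 1 : 0),
        ($value =~ $strict ? 1 : 0);
}
```

The ungrouped pattern accepts the third value because it merely starts with arial; the grouped pattern rejects it.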

Functions

list2hash

Simplifies specifying values the hash way. Returns a hashref.

list2hash('foo', 'bar', 'baz') # returns {foo => 1, bar => 1, baz => 1}

space_sep_res

space_sep_res($string, $regex, $regex, ...) # returns 1 or 0

SPACE-SEParated Regular ExpressionS. Given a string like 1px solid #CCFF55 and regular expressions for a CSS dimension, a border type and a CSS color, checks whether the string matches these regexps piece by piece.

The check fails if one of the regexps matches too small a chunk, for example:

space_sep_res('solid #CCFF55', qr/solid|dotted/, qr/#[A-F\d]{3}|#[A-F\d]{6}/)

will return 0 because the latter regexp stops after matching #CCF.

Also beware that the regular expressions provided MUST NOT contain capturing parentheses, otherwise the function will not work. Use (?: ... ) for non-capturing parenthesising.
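The "too small a chunk" pitfall can be reproduced with a small stand-alone matcher. The sketch below is a hypothetical re-implementation written only to mirror the behaviour described above (space_sep_match is not the module's function and may differ from it in detail). It also suggests a way around the problem: list the longer alternative first.

```perl
use strict;
use warnings;

# Hypothetical stand-in for space_sep_res: each regexp must match at the
# current position, pieces are separated by whitespace, and the whole
# string must be consumed.
sub space_sep_match {
    my ($string, @regexes) = @_;
    pos($string) = 0;
    for my $i (0 .. $#regexes) {
        my $re = $regexes[$i];
        $string =~ /\G(?:$re)/gc or return 0;
        if ($i < $#regexes) {
            $string =~ /\G\s+/gc or return 0;
        }
    }
    return pos($string) == length($string) ? 1 : 0;
}

my $border      = qr/solid|dotted/;
my $short_first = qr/#[A-F\d]{3}|#[A-F\d]{6}/;    # stops after #CCF
my $long_first  = qr/#[A-F\d]{6}|#[A-F\d]{3}/;    # tries six digits first

print space_sep_match('solid #CCFF55', $border, $short_first), "\n";    # 0
print space_sep_match('solid #CCFF55', $border, $long_first),  "\n";    # 1
```

In this sketch the first call fails because F55 is left unmatched, and the second succeeds because the six-digit alternative is tried before the three-digit one.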

set_url_re

Sets the regular expression that URLs are checked against, including the url( ) wrapper. You are encouraged to use this function to provide a regexp that only allows URLs to domains you control:

CSS::Adaptor::Whitelist::set_url_re(qr{url\(https?://example\.com/[\w/]+\)});
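It can be worth trying a candidate pattern on a few values before handing it to set_url_re. The sketch below does so without loading the module; url_value_ok is a hypothetical helper, and the sample values are made up. The pattern is anchored only inside the helper, to imitate a whole-value check, while the pattern itself stays unanchored.

```perl
use strict;
use warnings;

# The parentheses of url( ) are escaped so that they match literally
# instead of forming a capturing group.
my $url_re = qr{url\(https?://example\.com/[\w/]+\)};

# Hypothetical helper: does the whole value consist of an allowed url(...)?
sub url_value_ok {
    my ($value) = @_;
    return $value =~ /\A(?:$url_re)\z/ ? 1 : 0;
}

for my $value (
    'url(http://example.com/img/bg)',
    'url(http://evil.example.org/x)',
    'url(javascript:alert(1))',
) {
    printf "%-32s %s\n", $value, url_value_ok($value) ? 'allowed' : 'rejected';
}
```

Only the first value is allowed. Note that [\w/]+ does not match a dot, so with this pattern a path such as img/bg.png would be rejected as well.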

Notice that the regexp should not be anchored (no ^ and $ at the edges). It is used in the rules for these properties:

cursor
background
background-image
list-style
list-style-image

log

This is a method that stores messages of things being filtered out in the @CSS::Adaptor::Whitelist::message_log array.

You are encouraged to override or redefine this method to treat the log messages in accordance with your logging habits.
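One possible way to redefine the method is to assign a new sub to its glob. This is a sketch under assumptions: that log is invoked as a method, so the invocant arrives first and the message text follows, and that a lexical @filtered array (a hypothetical name) is where the application wants the messages to go.

```perl
use strict;
use warnings;

# In real code, load the module first:  use CSS::Adaptor::Whitelist;

my @filtered;    # hypothetical application-side store for the messages

{
    no warnings qw(redefine once);
    # Assumption: called as a method, so the first argument is the invocant.
    *CSS::Adaptor::Whitelist::log = sub {
        my ($self, @message) = @_;
        push @filtered, join ' ', @message;
        return;
    };
}

# Direct call, only to show what the replacement does.
CSS::Adaptor::Whitelist->log('dropped a declaration');
print "$_\n" for @filtered;
```

The same body could just as well forward the message to warn or to a logging framework of your choice.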

AUTHOR

Oldrich Kruza <sixtease@cpan.org>

http://www.sixtease.net/

COPYRIGHT

Copyright (c) 2009 Oldrich Kruza. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.