NAME
CSS::Parser - Base class for parsing CSS stylesheets
SYNOPSIS
C<package YourModule;>
C<use CSS::Parser;>
C<@ISA = qw(CSS::Parser);>
C<sub block {>
C<my $self = shift;>
C<my %properties = %{$_[0]};>
C<}>
C<sub comment {>
C<my $self = shift;>
C<my $comment = shift;>
C<}>
C<sub rule {>
C<my $self = shift;>
C<my @rule_elem = @_;>
C<# where @rule_elem is a list of array references:>
C<# ( [$type, $elem_name, $elem_value], ... )>
C<# or, when $type is 'hierarchy':>
C<# ( ['hierarchy', $list], ... )>
C<}>
Then in a script:
C<use YourModule;>
C<my $css = new YourModule;>
C<$css->css_parse($chunk1);>
C<$css->css_parse($chunk2);>
C<$css->css_eof;>
or
C<$css->css_file('path/to/file.css');  # or pass \*FHANDLE>
NOTE: the interface to rule() will change in the next version to become more usable.
DESCRIPTION
CSS::Parser consumes CSS data and passes the parsed chunks to callbacks. These callbacks have to be overridden in a subclass in order to get anything interesting out of the parser. The simplest subclass is one that simply prints the logical pieces of CSS that the parser has found, along with the information it has received about them. You should find an example of this, called CSSPrinter, in the example dir of this distribution.
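As a rough illustration of what such a printing subclass might look like, here is a minimal sketch. It is not the bundled CSSPrinter: the package name MyCSSPrinter, its stand-in new() and the sample arguments are assumptions made for illustration. So that the sketch runs without the module installed, the inheritance from CSS::Parser is left out and the callbacks are invoked by hand with the kind of arguments described under METHODS.

```perl
use strict;
use warnings;

package MyCSSPrinter;    # hypothetical name, not the bundled CSSPrinter

# In real use this package would inherit from CSS::Parser, e.g.
#   use CSS::Parser; our @ISA = ('CSS::Parser');
# and would rely on the inherited new(). The constructor below is only
# a stand-in so that the sketch runs on its own.
sub new { return bless { out => [] }, shift }

# Called with the text of a comment, without the /* and */.
sub comment {
    my ($self, $comment) = @_;
    push @{ $self->{out} }, "comment: $comment";
}

# Called with a reference to a hash of property name/value pairs.
sub block {
    my ($self, $props) = @_;
    for my $name (sort keys %$props) {
        push @{ $self->{out} }, "property: $name = $props->{$name}";
    }
}

package main;

my $printer = MyCSSPrinter->new;

# Simulate by hand what the parser might report for a small stylesheet
# (assumed sample values, not real parser output).
$printer->comment('headings');
$printer->block({ 'color' => 'red', 'font-size' => '12pt' });

print "$_\n" for @{ $printer->{out} };
```

Collecting the lines in the object rather than printing directly from the callbacks keeps the output easy to inspect or redirect.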
As of now, this parser isn't 100% CSS2 compliant, but it is very close to the CSS1 specification. It should therefore successfully parse about 99.9% of the stylesheets that you are likely to find on the web, as no browser is yet fully CSS2 compliant either.
The next release (already seriously in the works as of this writing) will come much closer to CSS2. Other modules will also be provided together with this one, implementing the most useful subclasses. I am currently working on CSS::Expand, which, given a stylesheet and an HTML page, will return a page in which all tags have their style attribute set (a mechanism for a default stylesheet will be present), and on CSS::Valid, which will reduce a stylesheet to its valid part as specified by the CSS2 specification.
These modules may be useful, for example, to robot writers who want to skip parts of pages that have a display: none or a visibility: hide/hidden style attribute set, so as to circumvent cheaters. An example of this will be included in the next release.
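In the meantime, the check a robot would perform can be sketched roughly as follows. The function name is_hidden() is hypothetical and not part of this distribution. It assumes the properties arrive as a hash reference of name/value strings, as the block() callback receives them, and it accepts both hide and hidden as mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the given property hash hides its
# element from display, 0 otherwise.
sub is_hidden {
    my ($props) = @_;

    # Normalise names and values: lower-case, surrounding whitespace removed.
    my %p;
    for my $name (keys %$props) {
        my $value = $props->{$name};
        $value =~ s/^\s+|\s+$//g;
        $p{ lc $name } = lc $value;
    }

    my $display    = exists $p{display}    ? $p{display}    : '';
    my $visibility = exists $p{visibility} ? $p{visibility} : '';

    return 1 if $display eq 'none';
    return 1 if $visibility eq 'hide' || $visibility eq 'hidden';
    return 0;
}

# Sample property hashes (made up for illustration).
for my $props ({ display => 'none' }, { color => 'red' }) {
    print join(', ', map { "$_: $props->{$_}" } sort keys %$props),
          ' => ', (is_hidden($props) ? 'hidden' : 'visible'), "\n";
}
```

A block() callback in a subclass could call such a helper and record which selectors to skip.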
Also, as XML parsing will be done more and more in Perl, and as CSS can be included in XML, it is likely that subclasses will be written to cooperate with modules in the XML:: hierarchy or with scripts using them.
METHODS
Public
new() The constructor. Takes no parameters and returns the parser object.
css_parse() The main parsing method. Takes a string as its argument ($css->css_parse($string)).
css_file() Parses a stylesheet from a file. Takes either a filename or a ref to a handle glob ($css->css_file("file.css") or $css->css_file(\*CSS)).
css_eof() Signals end of file to end parsing, no argument.
case_sensitive() Gets/sets the case sensitivity of returned rules. This may be useful because in CSS, case sensitivity depends on that of the document to which the stylesheet is applied: in HTML it is case-insensitive, whereas in XML it is not. ($css->case_sensitive(1) or $case_s = $css->case_sensitive())
NOTE: this doesn't do anything yet, it will be implemented at the same time as the new rule interface.
comment() Callback on comments. Receives a scalar containing the text of the comment without the /* and */.
rule() Callback on rules (both selectors and @rules). Receives a list of references to lists, one for each rule met before a block. Each inner list starts with $type (class, id, at_rule, sl_at_rule, element, hierarchy, pseudo-clas). If the type isn't hierarchy, two other elements follow: $name (the name of the rule/selector, e.g. A for A:link) and $value (the value of the rule/selector, e.g. link for A:link). If it is a hierarchy, only one element follows the type: a reference to yet another list of lists as described above.
VERY IMPORTANT NOTE: This is altogether too complicated, inappropriate and wrong. I have found a much better way to express the complexity and variety of what rules/selectors can be, have written it, and am currently debugging it and finishing the last details. It should be out between mid and end of August with the next release of this module. Do not waste time building code based on this callback; the interface to come is anything but backwards compatible.
block() Callback on blocks. Receives a ref to a hash containing all the name/value pairs of the block's properties as keys/values.
NOTE: This will remain very much as is, except that in the case of nested blocks the key will be the name before the nested block and the value a ref to a hash containing the property pairs.
NOTE: These last three (or some of them) will probably gain a final parameter containing the original text.
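To make the nested list structure passed to rule() easier to picture, here is a small sketch that walks it recursively. Given the note above that this interface is about to change, it is meant only as an aid to reading the description, not as something to build on. The function name describe_rule() and the sample names and values are assumptions; only the shape (a type followed by a name and a value, or 'hierarchy' followed by one nested list reference) is taken from the text.

```perl
use strict;
use warnings;

# Hypothetical helper: turns the argument list of rule() into
# human-readable strings, descending into hierarchies.
sub describe_rule {
    my @elems = @_;
    my @out;
    for my $elem (@elems) {
        my ($type, @rest) = @$elem;
        if ($type eq 'hierarchy') {
            # Exactly one element follows the type: a reference to
            # another list of lists with the same layout.
            my @inner = describe_rule(@{ $rest[0] });
            push @out, 'hierarchy(' . join(' ', @inner) . ')';
        }
        else {
            # Two elements follow the type: the name and the value.
            my ($name, $value) = @rest;
            push @out, "$type: $name/$value";
        }
    }
    return @out;
}

# Hand-built sample data; what the parser really passes may differ.
my @sample = (
    [ 'class', 'P', 'note' ],
    [ 'hierarchy', [ [ 'id', 'DIV', 'main' ], [ 'class', 'SPAN', 'x' ] ] ],
);

print "$_\n" for describe_rule(@sample);
```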
Supposedly private
These are not supposed to be used outside the module, but you may find them useful (if not for use with this module, maybe for copying elsewhere, feel free). Note that their return values are inverted relative to each other, because that fits their use within this module.
_blk() Returns 0 if the {} are uneven (escapes with \ and quoting are taken into account) and 1 if they are even.
_quote() Returns 1 if quoting is uneven (escapes with \ and quotes nested inside other quotes, e.g. "'" or '"', are taken into account) and 0 if even.
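The following sketch shows one way such checks could be written, mainly to illustrate the inverted return values. It is not the module's actual code: the function names braces_even() and quotes_uneven() are hypothetical, and the handling of edge cases (for instance, treating a closing brace that comes before any opening one as uneven) is an assumption.

```perl
use strict;
use warnings;

# Like _blk(): returns 1 if the braces are even, 0 if they are uneven.
# Backslash escapes and braces inside quotes are ignored.
sub braces_even {
    my ($text) = @_;
    my ($depth, $quote) = (0, '');
    while ($text =~ /(\\.|["'{}])/gs) {
        my $tok = $1;
        next if length($tok) == 2;            # escaped character, skip it
        if ($quote) {
            $quote = '' if $tok eq $quote;    # matching quote closes the string
        }
        elsif ($tok eq '"' || $tok eq "'") {
            $quote = $tok;                    # a quoted string opens
        }
        elsif ($tok eq '{') {
            $depth++;
        }
        else {                                # the token is '}'
            return 0 if --$depth < 0;
        }
    }
    return $depth == 0 ? 1 : 0;
}

# Like _quote(): returns 1 if a quote is left open, 0 if quoting is even.
# A quote character inside the other kind of quotes does not count.
sub quotes_uneven {
    my ($text) = @_;
    my $quote = '';
    while ($text =~ /(\\.|["'])/gs) {
        my $tok = $1;
        next if length($tok) == 2;            # escaped character, skip it
        if ($quote) {
            $quote = '' if $tok eq $quote;
        }
        else {
            $quote = $tok;
        }
    }
    return $quote ? 1 : 0;
}

print braces_even('H1 { color: red }'), "\n";
print quotes_uneven('font-family: "Times'), "\n";
```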
BUGS
Not too many are known, though as the module is undergoing change many are likely to appear. I have tested it successfully on over 100 .css files as of now, but they were fairly simple ones, as are most on the web at present.
CREDITS
The parsing strategy has been taken from Gisle Aas's HTML::Parser, modified as much as needed to do the job. The eof() and css_file() methods are very close to being verbatim copies of their HTML::Parser equivalents.
AUTHOR
Robin Berjon, robin@idl-net.com
SEE ALSO
HTML::Parser, the CSS2 specification (http://www.w3.org)
COPYRIGHT
Copyright (c) 1998 Robin Berjon. All rights reserved.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
DISCLAIMER
This module is alpha code; the interface of some functions will change soon. It is distributed only so that users may have a look at what is in progress and make suggestions or offer bug fixes while it is in the early stages of development. This module is NOT usable for production yet; use it at your own risk.