XML::Directory - returns the content of a directory as XML
use XML::Directory::String;
$dir = XML::Directory::String->new('/home/petr',3,10);
$rc = $dir->parse_dir;
@res = $dir->get_array;
use XML::Directory::SAX;
use MyHandler;
$h = MyHandler->new();
$e = MyErrorHandler->new();
$dir = XML::Directory::SAX->new( Handler => $h, ErrorHandler => $e,
details => 3, depth => 10);
$rc = $dir->parse_dir('/home/petr');
This extension provides two classes: XML::Directory::String and XML::Directory::SAX. Their methods let you set parameters such as the level of detail or the maximum number of nested sub-directories, and generate either a string containing the resulting XML or a stream of SAX events.
The SAX generator is based on XML::SAX::Base, so it should cooperate safely with other XML::SAX-compliant modules.
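As an illustration of the handler side, here is a minimal sketch of what a content handler such as MyHandler might look like. The class body, its counters, and the hand-written event stream are assumptions made for this example; it also assumes that element events carry a LocalName key and character events a Data key, as in the Perl SAX 2 conventions.

```perl
use strict;
use warnings;

# Hypothetical content handler: counts <file> elements and sums <size> values.
{
    package MyHandler;

    sub new { return bless { files => 0, bytes => 0, in_size => 0, buf => '' }, shift }

    sub start_document { }

    # The return value of end_document is what the parse call hands back.
    sub end_document { my $self = shift; return $self->{files} }

    sub start_element {
        my ($self, $el) = @_;
        $self->{files}++ if $el->{LocalName} eq 'file';
        if ($el->{LocalName} eq 'size') {
            $self->{in_size} = 1;
            $self->{buf}     = '';
        }
    }

    # Text may arrive in several chunks, so buffer it until the element ends.
    sub characters {
        my ($self, $chars) = @_;
        $self->{buf} .= $chars->{Data} if $self->{in_size};
    }

    sub end_element {
        my ($self, $el) = @_;
        return unless $el->{LocalName} eq 'size';
        $self->{bytes} += $self->{buf} if $self->{buf} =~ /^\d+$/;
        $self->{in_size} = 0;
    }
}

# Simulated event stream standing in for what a SAX generator would send.
my $h = MyHandler->new;
$h->start_document({});
$h->start_element({ Name => 'file', LocalName => 'file', Attributes => {} });
$h->start_element({ Name => 'size', LocalName => 'size', Attributes => {} });
$h->characters({ Data => '22' });
$h->characters({ Data => '5' });
$h->end_element({ Name => 'size', LocalName => 'size' });
$h->end_element({ Name => 'file', LocalName => 'file' });
my $count = $h->end_document({});
print "files: $count, bytes: $h->{bytes}\n";
```

In real use the handler object would be passed as the Handler option of the SAX constructor instead of being driven by hand.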
The original function (get_dir) is still supported via the XML::Directory base class.
use XML::Directory qw(get_dir);
my @xml = get_dir('/home/petr');
Methods common to both classes are defined in the XML::Directory base class.
- set_path
Resets the path. An initial path can be set using the constructor.
- set_details
Sets or resets the level of detail to be returned. Can also be set using the constructor. Valid values are 1, 2 or 3.

1 = brief

Example:

<?xml version="1.0" encoding="utf-8"?>
<dirtree>
  <directory name="test">
    <file name=""/>
  </directory>
</dirtree>

2 = normal

Example:

<?xml version="1.0" encoding="utf-8"?>
<dirtree>
  <head version="0.70">
    <path>/home/petr/test</path>
    <details>2</details>
    <depth>5</depth>
  </head>
  <directory name="test" depth="0">
    <path></path>
    <modify-time epoch="998300843">Mon Aug 20 11:47:23 2001</modify-time>
    <file name="">
      <mode code="33261">rwx</mode>
      <size unit="bytes">225</size>
      <modify-time epoch="998300843">Mon Aug 20 11:47:23 2001</modify-time>
    </file>
  </directory>
</dirtree>

3 = verbose

Example:

<?xml version="1.0" encoding="utf-8"?>
<dirtree>
  <head version="0.70">
    <path>/home/petr/test</path>
    <details>3</details>
    <depth>5</depth>
  </head>
  <directory name="test" depth="0" uid="500" gid="100">
    <path></path>
    <access-time epoch="999857485">Fri Sep 7 12:11:25 2001</access-time>
    <modify-time epoch="998300843">Mon Aug 20 11:47:23 2001</modify-time>
    <file name="" uid="500" gid="100">
      <mode code="33261">rwx</mode>
      <size unit="bytes">225</size>
      <access-time epoch="998300843">Mon Aug 20 11:47:23 2001</access-time>
      <modify-time epoch="998300843">Mon Aug 20 11:47:23 2001</modify-time>
    </file>
  </directory>
</dirtree>

See the chapter on the resulting XML documents for more details.
- set_maxdepth
Sets or resets the number of nested sub-directories to be parsed. Can also be set using the constructor. 0 means that only the directory specified by path is parsed (no sub-directories).
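To make the depth semantics concrete, the following sketch walks a temporary directory tree with a depth limit. It is not the module's implementation; the helper visited_dirs and the a/b test tree are invented for this illustration, and it assumes that depth counts levels below the starting path.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use File::Spec;

# Hypothetical helper: returns the directories visited when descending
# at most $maxdepth levels below $path.
sub visited_dirs {
    my ($path, $maxdepth, $depth) = @_;
    $depth //= 0;
    my @seen = ($path);
    return @seen if $depth >= $maxdepth;    # depth limit reached, do not descend

    opendir(my $dh, $path) or die "Cannot open $path: $!";
    my @children = map  { File::Spec->catdir($path, $_) }
                   grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    my @subdirs = grep { -d $_ } @children;
    for my $sub (sort @subdirs) {
        push @seen, visited_dirs($sub, $maxdepth, $depth + 1);
    }
    return @seen;
}

# Build root/a/b in a temporary location and compare three depth limits.
my $root = tempdir(CLEANUP => 1);
make_path(File::Spec->catdir($root, 'a', 'b'));

my @d0 = visited_dirs($root, 0);    # only the root itself
my @d1 = visited_dirs($root, 1);    # root and a
my @d2 = visited_dirs($root, 2);    # root, a and a/b
printf "maxdepth 0: %d, 1: %d, 2: %d directories\n",
    scalar @d0, scalar @d1, scalar @d2;
```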
- parse_dir, parse
$rc = $dir->parse_dir;
$rc = $dir->parse;
Both methods are equivalent; the latter is supported for the sake of backward compatibility. They scan the directory tree specified by path. When called on an XML::Directory::String instance, the method stores the XML representation in memory ($dir->{xml}) and returns the number of lines. For XML::Directory::SAX, it generates SAX events and returns the result of the end_document function. The parse methods of the SAX generator also accept parameters: paths and options.
This method checks the validity of details and depth. If a parameter is out of its valid range, an error occurs. The same is true if the specified path can't be found. For the SAX generator, a missing content handler is also treated as an error.
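Because an invalid parameter or a missing path raises an error, callers may want to trap it. The sketch below wraps the documented calls in eval; the helper name dir_to_xml and the non-existent path are assumptions for the example, and loading the module inside the eval means a missing installation is reported the same way as a parse failure.

```perl
use strict;
use warnings;

# Hypothetical wrapper: returns ($xml, undef) on success
# or (undef, $message) on failure.
sub dir_to_xml {
    my ($path, $details, $depth) = @_;
    my $xml = eval {
        require XML::Directory::String;
        my $dir = XML::Directory::String->new($path, $details, $depth);
        $dir->parse_dir;     # raises an error on a bad parameter or missing path
        $dir->get_string;
    };
    return defined $xml ? ($xml, undef) : (undef, $@ || 'unknown error');
}

my ($xml, $err) = dir_to_xml('/no/such/directory', 2, 5);
print defined $xml ? $xml : "parse failed: $err\n";
```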
- get_path
$path = $dir->get_path;
Returns the current path.
- get_details
$details = $dir->get_details;
Returns the current level of detail.
- get_maxdepth
$maxdepth = $dir->get_maxdepth;
Returns the current maximum number of nested sub-directories.
- pkg->order_by
Sorts the contents of a directory. Valid options for $code are:
- df
directory, file (default)
- fd
file, directory
- a
- enable_ns;
Enables support for namespaces.
- disable_ns;
Disables support for namespaces.
- enable_doctype;
A DOCTYPE declaration will be generated.
Level of details: 1
<!DOCTYPE dirtree PUBLIC "-//GA//DTD XML-Directory 1.0 Level1//EN" "">
Level of details: 2
<!DOCTYPE dirtree PUBLIC "-//GA//DTD XML-Directory 1.0 Level2//EN" "">
Level of details: 3
<!DOCTYPE dirtree PUBLIC "-//GA//DTD XML-Directory 1.0 Level3//EN" "">
A local copy of the DTD files is included in the distribution.
- disable_doctype;
No DOCTYPE declaration will be generated. This is the default behavior.
- get_ns_data;
$ns = $dir->get_ns_data;
Returns a hash reference with the following keys:
- enabled
either 1 or 0
- uri
namespace URI, '' by default
- prefix
namespace prefix, 'xd' by default
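A hash of this shape can be used, for example, to decide how element names will appear in the output. In the sketch below the helper qname and the stand-in hash are assumptions for illustration; with a real object the hash would come from $dir->get_ns_data.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the element name as it would appear in the
# output, given a hash shaped like the one returned by get_ns_data.
sub qname {
    my ($ns, $local) = @_;
    return ($ns->{enabled} && length($ns->{prefix}))
        ? "$ns->{prefix}:$local"
        : $local;
}

# Stand-in data using the documented keys and default values.
my $ns_on  = { enabled => 1, uri => '', prefix => 'xd' };
my $ns_off = { enabled => 0, uri => '', prefix => 'xd' };

print qname($ns_on,  'file'), "\n";
print qname($ns_off, 'file'), "\n";
```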
- encoding
$encoding = $dir->encoding;
$dir->encoding('utf-8');
Gets or sets the encoding of the generated XML document. The encoding must be a string acceptable for an XML encoding declaration. The default value is 'utf-8'. The encoding doesn't apply to SAX so far.
- enable_rdf
Enables support for RDF/Notation3 metadata. The parser looks in each directory for a file with the same name as the argument of this method. If found, the file is passed to the RDF::Notation3 parser, and the properties of particular resources (files or directories) are merged into the resulting XML. The N3 file itself is not listed in the XML. See the RDF::Notation3 documentation for more details on RDF/Notation3.
In addition, one more element, doc:Position, is added (the doc prefix is bound to a namespace URI). This element contains the position of the first triple with the current document as subject within the triple array, so that the order of files and directories can be controlled using the RDF/N3 file. The doc:Position element is generated even when a document is not found in the N3 file or the N3 file is not found in a directory; in that case it serves as a unique identifier, handy e.g. for sorting. Use $dir->disable_rdf to disable this feature.
If there is a property called doc:Type with value of 'document' found for a directory, sub-directories and files are not processed. This is a way to emulate multiple-file documents efficiently.
If, for example, a directory contains a file named Apache.html:

<xd:file name="Apache.html">
  <xd:mode code="33188">rw-</xd:mode>
  <xd:size unit="bytes">41913</xd:size>
  <xd:modify-time epoch="999344286">Sat Sep 1 13:38:06 2001</xd:modify-time>
</xd:file>

then the presence of the following Notation3 file

@prefix dc: <>.
<Apache.html>
    dc:Title "Apache";
    dc:Description "mod_perl Apache module".

results in:

<xd:file name="Apache.html">
  <xd:mode code="33188">rw-</xd:mode>
  <xd:size unit="bytes">41913</xd:size>
  <xd:modify-time epoch="999344286">Sat Sep 1 13:38:06 2001</xd:modify-time>
  <doc:position>1</doc:position>
  <dc:Title>Apache</dc:Title>
  <dc:Description>mod_perl Apache module</dc:Description>
</xd:file>

Since RDF metadata requires namespaces, this method enables namespaces automatically. It also checks whether the RDF::Notation3 module is installed and issues an error if it is not.
- disable_rdf
Disables RDF/N3 support.
- error_treatment
$et = $dir->error_treatment;
$dir->error_treatment('warn');
Gets or sets the way errors are treated. There are two possible values:
- die
The parse_dir method dies (actually croaks) on an error. This is the default value.
- warn
The parse_dir method catches the error. The String generator returns an XML error message. The SAX driver throws a SAX exception and calls an error handler if one is defined (otherwise it dies as for "die"). The $dir->{error} property is set to the error number.
Example of an error message:

<?xml version="1.0" encoding="utf-8"?>
<dirtree xmlns="">
  <error number="3">Path /home/petr/work/done not found!</error>
</dirtree>
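When errors are returned as XML rather than raised, the caller has to recognise an error document. One possible approach is sketched below using a regular expression on the message format shown above; the helper name xml_error is an assumption for this example, and a full XML parser would be a more robust choice for anything beyond this simple shape.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (number, message) if $xml contains an
# <error> element (with or without a namespace prefix), else an empty list.
sub xml_error {
    my ($xml) = @_;
    return $xml =~ m{<(?:\w+:)?error\s+number="(\d+)"[^>]*>(.*?)</(?:\w+:)?error>}s
        ? ($1, $2)
        : ();
}

# Sample input copied from the error message example.
my $xml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<dirtree xmlns="">
  <error number="3">Path /home/petr/work/done not found!</error>
</dirtree>
XML

my ($num, $msg) = xml_error($xml);
print "error $num: $msg\n" if defined $num;
```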
- new
$dir = XML::Directory->new('/home/petr',2,5);
$dir = XML::Directory->new('/home/petr',2);
$dir = XML::Directory->new('/home/petr');
The constructor accepts up to 3 parameters: path, details (1-3, from brief to verbose XML) and depth (number of nested sub-directories). The last two parameters are optional and default to 2 and 1000, respectively.
- get_arrayref
$res = $dir->get_arrayref;
Returns the parsed XML directory image as a reference to an array (each element contains one line of the XML file).
- get_array
@res = $dir->get_array;
Returns the parsed XML directory image as an array (each element contains one line of the XML file).
- get_string
$res = $dir->get_string;
Returns a parsed XML directory image as a string.
- new
$dir = XML::Directory::SAX->new( Handler => $h, ErrorHandler => $e,
details => 3, depth => 10);
The constructor accepts an option hash as its only parameter. The hash keys may include all options recognized by XML::SAX::Base (e.g. Handler, ErrorHandler, Source) and three options specific to XML::Directory::SAX (details, depth, path).
The only mandatory option is Handler. The other options either have default values (details=2, depth=1000, path=<current working directory>) or aren't required.
- other methods
Other methods include those inherited from XML::Directory (see METHODS COMMON TO BOTH CLASSES) and those inherited from XML::SAX::Base.
Among them, the following can be useful for safely changing handlers during the parse:
- set_content_handler
$h = MyHandler->new;
$dir->set_content_handler($h);
Sets the SAX content handler.
- set_error_handler
$e = MyErrorHandler->new;
$dir->set_error_handler($e);
Sets the SAX error handler.
See XML::SAX::Base documentation for more details.
- get_dir();
@xml = get_dir('/home/petr',2,5);
This function takes a path as a mandatory parameter, and details and depth as optional ones. It returns an array containing an XML representation of the directory specified by the path. Each element of the array represents one line of the XML.
The optional arguments default to 2 and 1000.
This is a mod_perl module that serves as an Apache interface to XML::Directory::String. It allows you to send parameters in an HTTP request and receive the result (an XML representation of a directory tree) in the HTTP response. Errors are caught and sent as XML via HTTP to prevent an Apache error.
Parameters include:
- path
absolute path to a directory to be parsed, mandatory
The path is not sent in the query but as extra path information instead. This seems more appropriate for this kind of parameter.
- dets
level of details, optional
- depth
maximal number of nested sub-directories, optional
- ns
if set to 1, namespaces are used, optional
To use this module, add a section similar to the following to your Apache config file:
<Location /xdir>
SetHandler perl-script
PerlHandler XML::Directory::Apache
PerlSendHeader On
</Location>
and send a request to a URL under the /xdir location.
The path portion following 'xdir' is taken as the path; other parameters can be sent in the query string.
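The following sketch shows how such a request URL could be assembled from the parameters listed above. The helper xdir_url and the server address http://localhost are assumptions made for illustration; values are inserted without URL escaping, which is only adequate for simple paths and numeric options.

```perl
use strict;
use warnings;

# Hypothetical helper: builds a request URL for the /xdir location.
# $base is an assumed server address; %opt may hold dets, depth and ns.
sub xdir_url {
    my ($base, $path, %opt) = @_;
    my $query = join '&',
        map  { "$_=$opt{$_}" }
        grep { defined $opt{$_} } qw(dets depth ns);
    return "$base/xdir$path" . (length($query) ? "?$query" : '');
}

my $url = xdir_url('http://localhost', '/home/petr',
                   dets => 3, depth => 10, ns => 1);
print "$url\n";
```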
Resulting XML documents can be of three types. The type of document is specified in the constructor or using the set_details method.
Level of details: 1 (brief)
<!ELEMENT dirtree (directory)>
<!ELEMENT directory (directory, file)>
<!ATTLIST directory
<!ATTLIST file
Level of details: 2 (normal)
<!ELEMENT dirtree (head, directory)>
<!ELEMENT head (path, details, depth)>
<!ATTLIST head
<!ELEMENT details (#PCDATA)>
<!ELEMENT depth (#PCDATA)>
<!ELEMENT directory (directory, file, path, modify-time)>
<!ATTLIST directory
<!ELEMENT file (mode, size, modify-time)>
<!ATTLIST file
<!ELEMENT modify-time (#PCDATA)>
<!ATTLIST modify-time
<!ATTLIST mode
<!ATTLIST size
unit CDATA #FIXED "bytes">
Level of details: 3 (verbose)
<!ELEMENT dirtree (head, directory)>
<!ELEMENT head (path, details, depth)>
<!ATTLIST head
<!ELEMENT details (#PCDATA)>
<!ELEMENT depth (#PCDATA)>
<!ELEMENT directory (directory, file, path, modify-time, access-time)>
<!ATTLIST directory
<!ELEMENT file (mode, size, modify-time, access-time)>
<!ATTLIST file
<!ELEMENT modify-time (#PCDATA)>
<!ATTLIST modify-time
<!ELEMENT access-time (#PCDATA)>
<!ATTLIST access-time
<!ATTLIST mode
<!ATTLIST size
unit CDATA #FIXED "bytes">
There is also a modular DTD available; see the dtd directory. HTML documentation of this DTD has been generated with the DTDParse utility.
This DTD allows you to extend the list of allowable elements using parameter entities, so that extended XML files can still be validated.
An extended vocabulary can arise from RDF/N3 metadata (see enable_rdf) or, for instance, when a directory of .dbk files is munged for <articleinfo> data to be included in the output. The output could then be cached and munged again later using another SAX filter or XSLT.
This is how to extend the DTD:
<?xml version = "1.0" ?>
<!DOCTYPE dirtree SYSTEM [
<!ENTITY % file "(mode,size,modify-time,foo)">
Current version is 0.97.
Copyright (c) 2001-2002 Ginger Alliance. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
Petr Cimprich
Duncan Cameron
Chris Snyder
Aaron Straup Cope
perl(1), XML::SAX, RDF::Notation3.