NAME
Daizu::Util - various utility functions
FUNCTIONS
The following functions are available for export from this module. None of them are exported by default.
- trim($s)
-
Returns $s with leading and trailing whitespace stripped off, or undef if $s is undefined.
- trim_with_empty_null($s)
-
Returns $s with leading and trailing whitespace stripped off, or undef if $s is undefined, or if $s contains nothing but whitespace.
Useful for tidying values which are to be stored in the database, where sometimes it is preferable to store NULL in place of a value with no real content.
- like_escape($s)
-
Returns an escaped version of $s suitable for including in patterns given to the SQL LIKE operator. Does NOT escape quotes, so you still need to quote the result for the database before including it in any SQL.
Returns undef if the input is undefined.
Escapes backslashes, underscores, and percent signs.
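The escaping described for like_escape() can be sketched in a few lines of plain Perl. This is only an illustration written from the description above, not the module's actual source, and the name like_escape_sketch is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical re-implementation, written from the documented behaviour.
sub like_escape_sketch {
    my ($s) = @_;
    return undef unless defined $s;
    # Put a backslash in front of each backslash, underscore and percent sign.
    $s =~ s/([\\_%])/\\$1/g;
    return $s;
}

# The escaped text still has to be quoted (or bound as a placeholder)
# before it is used in SQL.
my $pattern = like_escape_sketch('50%_off') . '%';
print "$pattern\n";    # prints 50\%\_off%
```

The trailing % added after escaping is the only wildcard left in the pattern, so it matches values that start with the literal text.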
- pgregex_escape($s)
-
Returns an escaped version of $s suitable for including in patterns given to PostgreSQL's SQL ~ operator. Does NOT escape quotes, so you still need to quote the result for the database before including it in any SQL.
Returns undef if the input is undefined.
Escapes the following characters:
. ^ $ + * ? ( ) [ ] { \
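As a rough sketch of what escaping that character list looks like, the following stand-alone Perl escapes exactly the characters listed for pgregex_escape(). It is written from the documentation rather than taken from the module, and pgregex_escape_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical re-implementation, written from the documented behaviour.
sub pgregex_escape_sketch {
    my ($s) = @_;
    return undef unless defined $s;
    # Backslash-escape only the characters in the documented list.
    $s =~ s/([.^\$+*?()\[\]{\\])/\\$1/g;
    return $s;
}

print pgregex_escape_sketch('index.html?page=1'), "\n";   # prints index\.html\?page=1
```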
- url_encode($s)
-
Returns a URL encoded version of $s, with characters which would be unsuitable for use in a URL escaped as % followed by two uppercase hexadecimal digits. The opposite of url_decode().
- url_decode($s)
-
If $s is URL encoded, return a decoded version. The opposite of url_encode().
- validate_number($num)
-
If $num consists only of a sequence of digits, return it as an untainted number, otherwise return nothing.
- validate_uri($uri)
-
Return a URI object representing the absolute URI in $uri, or undef if it isn't defined, is invalid, or isn't absolute.
This is based on code from the Data::Validate::URI module, but it has been changed to only allow absolute URIs, and it doesn't try to reconstruct the URI from its individual parts (something which the URI module can do instead).
- validate_mime_type($mime_type)
-
Given something that might be a MIME type name, return either a valid MIME type, folded to lowercase, or undef.
Based on the definition from RFC 2045 (see http://www.faqs.org/rfcs/rfc2045.html).
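To make the RFC 2045 rule concrete, here is a minimal sketch that accepts a type/subtype pair where both halves are RFC 2045 "tokens" (printable US-ASCII other than space and the "tspecials"). It is an assumption about how such a check could be written, not the module's code: validate_mime_type_sketch is a made-up name, and details such as how surrounding whitespace is treated may differ in the real function.

```perl
use strict;
use warnings;

# Hypothetical check, written from the RFC 2045 token definition.
sub validate_mime_type_sketch {
    my ($type) = @_;
    return undef unless defined $type;
    # Printable US-ASCII, minus space and ( ) < > @ , ; : \ " / [ ] ? =
    my $token = qr/[!#\$%&'*+\-.0-9A-Z^_`a-z{|}~]+/;
    return undef unless $type =~ m{\A($token)/($token)\z};
    return lc "$1/$2";
}

print validate_mime_type_sketch('Text/HTML'), "\n";   # prints text/html
```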
- validate_date($date)
-
Given something that might be a valid date/time in Subversion format, return a DateTime object containing the same timestamp. Otherwise returns undef.
The date format recognized is one possible format for W3CDTF (http://www.w3.org/TR/NOTE-datetime) dates. Only the exact format used by Subversion is supported, except that: the 'T' and 'Z' letters are case-insensitive, whitespace at the start or end of the string is ignored, and the fractional seconds part is optional.
Note: it would have been nice to use DateTime::Format::W3CDTF for this, but as of version 0.04 it has a bug which prevents parsing of Subversion dates (CPAN bug #14179, http://rt.cpan.org/Public/Bug/Display.html?id=14179).
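The accepted format can be illustrated with a regular expression. The sketch below only splits a Subversion-style timestamp into its parts and returns them in a hash; unlike validate_date() it does not build a DateTime object or check that the values are in range. The function name and the hash keys are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical parser for timestamps such as 2006-05-04T12:34:56.123456Z.
sub parse_svn_date_sketch {
    my ($date) = @_;
    return undef unless defined $date;
    return undef unless $date =~ m{
        \A \s*
        (\d{4}) - (\d\d) - (\d\d)      # date
        T
        (\d\d) : (\d\d) : (\d\d)       # time
        (?: \. (\d+) )?                # optional fractional seconds
        Z
        \s* \z
    }xi;                               # /i lets 't' and 'z' match as well
    return {
        year => $1, month  => $2, day    => $3,
        hour => $4, minute => $5, second => $6,
        fraction => $7,                # undef when there was no fraction
    };
}

my $d = parse_svn_date_sketch('2006-05-04T12:34:56.123456Z');
print "$d->{year}-$d->{month}-$d->{day}\n";   # prints 2006-05-04
```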
- w3c_datetime($datetime, $include_micro)
-
Return a string version of the DateTime object, formatted as a W3CDTF (http://www.w3.org/TR/NOTE-datetime) date and time. If $datetime is just a string, it is automatically validated and parsed by validate_date() first. If the value is invalid or undefined, then undef is returned.
$include_micro indicates whether microseconds should be included in the returned string. If true, a decimal point and six digits of fractional seconds are included, unless they would all be zero. Otherwise the value will be accurate only to within a second.
- db_datetime($datetime)
-
$datetime must either be a DateTime object or a string which can be parsed by validate_date(). If not, undef is returned.
If valid, the date and time are returned formatted for use in PostgreSQL, using DateTime::Format::Pg.
- rfc2822_datetime($datetime)
-
$datetime must either be a DateTime object or a string which can be parsed by validate_date(). If not, undef is returned.
If valid, the date and time are returned formatted according to RFC 2822 (http://www.faqs.org/rfcs/rfc2822.html), which is suitable for use in (for example) RSS 2.0 feeds.
- parse_db_datetime($datetime)
-
Given a string containing a date and time formatted in PostgreSQL's format, return a corresponding DateTime object. Returns undef if $datetime isn't defined.
- display_byte_size($bytes)
-
Given a number of bytes, format it for display to a user with a suffix indicating the units (either b, Kb, Mb, or Gb, depending how big the value is).
- db_row_exists($db, $table, ...)
-
Return true if a row exists in database table $table on database connection $db, otherwise false.
The extra arguments can be omitted (in which case the table merely has to be non-empty), can be a single value (which will be matched against the id column), or can be a hash of column-name to value mappings which must be met by a record.
For example, to find out whether there is a current path for a GUID ID, where last_revnum is NULL:
    my $guid_already_present = db_row_exists($db, file_path =>
        guid_id => $guid_id,
        branch_id => $branch_id,
        last_revnum => undef,
    );
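The three argument forms (nothing, a single ID, or a hash of conditions) all boil down to building a WHERE clause. The sketch below shows one way that could be done, with undef values turned into an IS NULL test. It is an assumption for illustration only: where_clause_sketch is a hypothetical helper, and the real module may quote values into the SQL rather than use placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the extra arguments into a WHERE clause
# plus a list of bind values.
sub where_clause_sketch {
    my @args = @_;
    return ('', []) unless @args;          # no conditions: any row will do
    my %where = @args == 1 ? (id => $args[0]) : @args;
    my (@cond, @bind);
    for my $col (sort keys %where) {       # sorted only to keep the output stable
        if (defined $where{$col}) {
            push @cond, "$col = ?";
            push @bind, $where{$col};
        }
        else {
            push @cond, "$col is null";    # NULL cannot be matched with '='
        }
    }
    return (' where ' . join(' and ', @cond), \@bind);
}

my ($sql, $bind) = where_clause_sketch(
    guid_id => 12, branch_id => 1, last_revnum => undef,
);
print "select 1 from file_path$sql limit 1\n";
# prints: select 1 from file_path where branch_id = ? and guid_id = ? and last_revnum is null limit 1
```

The same kind of clause would serve the other db_* functions that accept either an ID or a hash of column values.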
- db_row_id($db, $table, %where)
-
Return the ID number (the value from the id column) from $table on the database connection $db, where the values in %where match the values in a record. If there is more than one such record, an arbitrarily chosen one is returned. Nothing is returned if there are no matches.
    my $file_id = db_row_id($db, 'wc_file',
        wc_id => $wc_id,
        path => $path,
    );
- db_select($db, $table, $where, @columns)
-
Gets the named columns in @columns from a record in table $table using database connection $db and returns them as a list. Only one record is selected. If there are multiple matches then an arbitrary one is returned.
$where can be either an ID number (to match the id column) or a reference to a hash of column names and values to match. Values can be undef to match NULL. $where can also be a reference to an empty hash if you don't care which record is selected.
    my $branch_path = db_select($db, branch => $branch_id, 'path');
The column names are not quoted, so they can be SQL expressions:
    my $last_known_rev = db_select($db, revision => {}, 'max(revnum)');
- db_select_col($db, $table, $where, $column)
-
Return a list of values from the column named by $column in $table using database connection $db.
$where can be either an ID number (to match the id column) or a reference to a hash of column names and values to match. Values can be undef to match NULL. $where can also be a reference to an empty hash if you want to select all records.
    my @podcast_urls = db_select_col($db,
        url => { method => 'article', content_type => 'audio/mpeg' },
        'url',
    );
The column name is not quoted, so it can be an SQL expression.
- db_insert($db, $table, %value)
-
Insert a new record into $table on database connection $db. %value should be a hash of column names and values to use for them. The values are SQL quoted, but this should not be used for inserting arbitrary binary data into bytea columns. Values can be undef, in which case NULL will be inserted.
Returns the id number of the new record, but only attempts to do this (it might not work on tables without serial columns) if a return value is expected.
    my $branch_id = db_insert($db, 'branch', path => $path);
- db_update($db, $table, $where, %value)
-
Updates one or more records in $table using database connection $db.
Only records matching $where are updated. It can be either a single number (matched against the id column) or a reference to a hash of column names and values to match.
    db_update($db, wc_file => $file_id,
        modified_at => db_datetime($time),
    );
If $where is a reference to an empty hash then this function will die. If you really want to update every record unconditionally, use a normal $db->do method call.
Returns the number of rows updated, or undef on error, or -1 if the number of rows changed can't be determined.
- db_replace($db, $table, $where, %value)
-
Either inserts a new record, if there is none matching $where, or updates one or more existing records if there are. $where must be a reference to a hash of column names and values to match.
If there is already at least one record which matches $where, then this behaves the same as db_update(). Otherwise a new record is inserted using both the values in %value and the ones in %$where combined. If a column's value is given in both hashes, the one in %value is used.
If a new record is inserted and a return value is expected, then the id value of the new record will be returned. For updates undef is always returned.
- db_delete($db, $table, ...)
-
Delete records from $table using database connection $db. If a single additional value is specified then it is matched against the id column, otherwise a hash of column names and values is expected.
This function will die if you don't give it some conditions to check for. If you really want to delete every record unconditionally, use a normal $db->do method call.
- transactionally($db, $code, @args)
-
Executes $code (a reference to a sub) within a database transaction on $db. The optional @args will be passed to the function. Its return value will be returned from transactionally.
If the code being executed dies, then the transaction is rolled back and the exception passed on. Otherwise, the transaction is committed.
A database transaction is not started or finished when this function is called recursively. This means that if you use it consistently it effectively gives you nested transactions.
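The recursion rule is easier to see in code. The following is a sketch of the general technique (a depth counter guarding begin, commit and rollback), not the module's implementation. MockDb is a made-up stand-in that only records which methods were called; a real DBI handle provides begin_work, commit and rollback methods with the same names. transactionally_sketch and $Depth are likewise hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a database handle: it only records which
# transaction methods were called, so the sketch runs without a database.
package MockDb;
sub new        { return bless { log => [] }, shift }
sub begin_work { push @{ $_[0]{log} }, 'begin' }
sub commit     { push @{ $_[0]{log} }, 'commit' }
sub rollback   { push @{ $_[0]{log} }, 'rollback' }

package main;

our $Depth = 0;    # how many calls are currently in progress

sub transactionally_sketch {
    my ($db, $code, @args) = @_;
    my $want      = wantarray;       # remember the caller's context
    my $outermost = $Depth == 0;
    $db->begin_work if $outermost;

    my @ret;
    my $ok = eval {
        local $Depth = $Depth + 1;   # restored automatically on exit
        if ($want) { @ret = $code->(@args) }
        else       { $ret[0] = $code->(@args) }
        1;
    };
    if (!$ok) {
        my $err = $@;
        $db->rollback if $outermost;
        die $err;                    # pass the exception on
    }
    $db->commit if $outermost;
    return $want ? @ret : $ret[0];
}

my $db = MockDb->new;
my $total = transactionally_sketch($db, sub {
    my ($n) = @_;
    # The inner call sees $Depth > 0, so it neither begins nor commits.
    return $n + transactionally_sketch($db, sub { 1 });
}, 41);
print "$total: @{ $db->{log} }\n";   # prints "42: begin commit"
```

Only the outermost call touches the transaction, so an exception thrown at any depth propagates outwards and causes a single rollback.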
$code is called with the same context as this function was called in. When transactionally returns, it returns a single value if it was called in scalar context, or a list of values if called in list context.
- wc_file_data($db, $file_id)
-
Returns a reference to the data (content) of the wc_file record identified by $file_id. Fails if the file is actually a directory or doesn't exist.
This takes care of getting data from the live working copy if the file just has a reference to a file with the same content.
- guess_mime_type($data, $filename)
-
Return the likely MIME type of the data referenced by $data (a scalar reference), or nothing if it is of an unknown type.
$filename is optional, but can be used for some additional guesswork if supplied. Currently it is only used to recognize text/css files, which might otherwise get identified as text/plain.
- guid_first_last_times($db, $guid_id)
-
Returns a list of two timestamps, as DateTime values, which can be used for the publication time and the time of the last update, in the case that the user hasn't overridden them with Subversion properties (dcterms:issued and dcterms:modified respectively).
- get_subversion_properties($ra, $path, $revnum)
-
Returns a reference to a hash of properties for the file at $path (a full path within the Subversion repository, including branch path) in revision $revnum. $ra should be a SVN::Ra object.
Returns undef if the file doesn't exist.
- wc_set_file_data($cms, $wc_id, $file_id, $content_type, $data, $allow_data_ref)
-
Warning: this should currently only be used for proper updates from the repository, not making live uncommitted changes in a working copy. Doing so will currently break everything.
Updates the data stored for file $file_id (which must not be a directory) in working copy $wc_id. It takes care of things like calculating the digest and the pixel size of image files. $data should be a reference to a scalar containing the actual data.
If $allow_data_ref is true, and the working copy isn't the live working copy, then this function will try to find an existing copy of the same data in the live working copy and store a reference to that instead of an additional copy of the data.
- mint_guid($cms, $is_dir, $path)
-
Add a new entry to the file_guid table for a file which initially (in the first revision for which it exists) resides at $path.
A new 'tag' URI will be created for the GUID, using the appropriate entity as defined in the configuration file (see the documentation for the guid-entity element in the Daizu configuration file at http://www.daizucms.org/doc/config-file/).
A list of two values is returned: the ID number of the new record, and the tag URI created for it.
- load_class($class)
-
Load a Perl module called $class which contains a class. So this doesn't do any import calling, since that shouldn't be necessary. It keeps track of which classes have already been loaded, and won't do any extra work if you try to load the same class twice.
This method is used to load generator classes and plugins.
- instantiate_generator($cms, $class, $root_file)
-
Create a generator object from the Perl class $class, passing in the information generator classes expect for their constructors.
$root_file, which should be a Daizu::File object, is passed to the generator and is also used to find the configuration information, if any, for this generator instance. Typically $root_file will be the file on which the daizu:generator property was set to enable this generator class.
If $class is undef then the default generator is used (Daizu::Gen).
- update_all_file_urls($cms, $wc_id)
-
Updates the url table in the same way as the Daizu::File method update_urls_in_db(), except that it does so for all files in working copy $wc_id, and the return values are each true if any of the changes include new or updated redirects or 'gone' files.
Any active URLs for files which no longer exist in the working copy are marked as 'gone'. This function also takes care of handling temporary duplicate URLs which occur during the update, when one file adds a new URL which is already active for another file, but will be inactive by the end of the transaction.
All of this is done in a single database transaction.
TODO - update docs about new return value
- resolve_url_update_duplicates($db, $wc_id, $dup_urls)
-
TODO
- aggregate_map_changes($changes, $redirects_changed, $gone_changed)
-
TODO
- add_xml_elem($parent, $name, $content, %attr)
-
Create a new XML DOM element (an XML::LibXML::Element object) and add it to the parent element $parent. $name is the name of the new element.
If $content is defined, then it can either be a libxml object to add as a child of the element, or a piece of text to use as its content.
The keys and values in %attr are added to the new element as attributes.
- xml_attr($filename, $elem, $attr, $default)
-
Returns the value of the attribute $attr of the XML element $elem, which must be an XML::LibXML::Element object. If no such attribute exists, return $default if that is defined, otherwise die with an appropriate error message.
- xml_croak($filename, $node, $message)
-
Croaks with an error message which includes $message, but also gives the filename and the line number at which $node occurs. $node should be some kind of XML::LibXML::Node object.
- expand_xinclude($db, $doc, $wc_id, $path)
-
Expand XInclude elements in $doc (an XML::LibXML::Document object). This is used for the content of articles, after it has been returned from an article loader plugin but before it is passed to article filter plugins. The XML DOM is updated in place.
A list of the IDs of any included files is returned. When loading articles this list is stored in the wc_article_included_files table, so that whenever the content of one of those files is changed, the article can be reloaded to include the new version.
Any XInclude elements present must include from a daizu: URI. Other URIs, like file:, are not allowed, since that would be a security hole if the content was supplied by a user who wouldn't normally have access to the filesystem. The daizu: URI scheme is specific to this function, and causes data to be loaded from the database working copy $wc_id (which should be the same as the file from which the article content came).
$path should be the path of the file from which the content comes. This is used to resolve relative paths when including. Actually, you can use any base URI by including an xml:base attribute in the content, but this function adds one (based on $path) to the root element if it doesn't already exist. This not only allows you to use paths relative to $path, but also means you don't have to specify the daizu: URI prefix in your content.
- branch_id($db, $branch)
-
If $branch is a number then return it unchanged, and just assume that it is a valid branch ID.
Otherwise, try to find a branch with $branch as its path, and return the ID number of that. Dies if no such branch exists.
- daizu_data_dir($dir)
-
Return the absolute path (on the native filesystem) of the directory called $dir under the directory Daizu where the Perl modules are installed. This is used to locate data files which can be installed along with the Daizu Perl modules, such as some XML DTD files in the xml directory. Look for directories whose names are all lowercase in lib/Daizu/ in the source tarball for these.
The return value is actually a Path::Class::Dir object.
Note that it is assumed these directories will be alongside the location of the file for this module (Daizu::Util). This should ensure that the right data files are used depending on whether you're using an installed version of Daizu CMS or testing from the source directory.
This function will die if the directory doesn't exist where it is expected to be.
COPYRIGHT
This software is copyright 2006 Geoff Richards <geoff@laxan.com>. For licensing information see this page: