NAME
Data::DRef - Delimited-key access to complex data structures
SYNOPSIS
use Data::DRef qw( :dref_access );
my $hash = { 'items' => [ 'first' ] };
print get_value_for_dref($hash, 'items.0');
set_value_for_dref( $hash, 'items.1', 'second' );
set_value_for_root_dref( 'myhash', $hash );
print get_value_for_root_dref('myhash.items.0');
use Data::DRef qw( :select );
matching_keys($target, %filter_criteria) : $key or @keys
matching_values($target, %filter_criteria) : $item or @items
use Data::DRef qw( :index );
index_by_drefs($target, @drefs) : $index
unique_index_by_drefs($target, @drefs) : $index
ordered_index_by_drefs( $target, $index_dref ) : $entry_ary
use Data::DRef qw( :leaf );
leaf_drefs($target) : @drefs
leaf_values( $target ) : @values
leaf_drefs_and_values( $target ) : %dref_value_pairs
DESCRIPTION
Data::DRef provides a streamlined interface for accessing values within nested Perl data structures. These structures are generally networks of hashes and arrays, some of which may be blessed into various classes, containing a mix of simple scalar values and references to other items in the structure.
The Data::DRef functions allow you to use delimited key strings to set and retrieve values at desired nodes within these structures. These functions are slower than direct variable access, but provide additional flexibility for high-level scripting and other late-binding behaviour. For example, a web-based application could use DRefs to simplify customization, allowing the user to refer to arguments processed by CGI.pm in a fairly readable way, such as query.param.foo.
A suite of utility functions, previously maintained in a separate Data::Collection module, performs a variety of operations across nested data structures. Because the Data::DRef abstraction layer is used, these functions should work equally well with arrays, hashes, or objects that provide their own key-value interface.
REFERENCE
Value-For-Key Interface
The first set of functions defines our core key-value interface, and provides its implementation for references to Perl arrays and hashes. For example, direct access to array and hash keys usually looks like this:
print $employee->[3];
$person->{'name'} = 'Joe';
Using these functions, you could replace the above statements with:
print get_value_for_key( $employee, 3 );
set_value_for_key( $person, 'name', 'Joe' );
Each of these functions checks for object methods as described below.
- get_keys($target) : @keys
-
Returns a list of keys for which this item would be able to provide a value. For hash refs, returns the hash keys; for array refs, returns a list of numbers from 0 to the array's last index ($#$target); otherwise returns nothing.
- get_values($target) : @values
-
Returns a list of values for this item. For hash refs, returns the hash values; for array refs, returns the array contents; otherwise returns nothing.
- get_value_for_key($target, $key) : $value
-
Returns the value associated with this key. For hash refs, returns the value at this key, if present; for array refs, returns the value at this index, or complains if it's not numeric.
- set_value_for_key($target, $key, $value)
-
Sets the value associated with this key. For hash refs, adds or overwrites the entry for this key; for array refs, sets the value at this index, or complains if it's not numeric.
- get_or_create_value_for_key($target, $key) : $value
-
Gets the value associated with this key using get_value_for_key, or if that value is undefined, sets the value to refer to a new anonymous hash using set_value_for_key and returns that reference.
- get_reference_for_key($target, $key) : $value_reference
-
Returns a reference to the scalar which is used to hold the value associated with this key.
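The hash-or-array dispatch described above can be sketched in plain Perl. This is a minimal illustration, not the module's implementation: the sketch_* names are hypothetical, the "m_*" method check is left out, and treating a non-numeric array index as a fatal error is an assumption about what "complains" means.

```perl
use strict;
use warnings;
use Scalar::Util qw( reftype looks_like_number );

# Hypothetical helper: read one key from a hash or array reference.
sub sketch_get_value_for_key {
    my ($target, $key) = @_;
    my $type = reftype($target) || '';
    return $target->{$key} if $type eq 'HASH';
    if ($type eq 'ARRAY') {
        # Assumption: a non-numeric index is reported with die().
        die "Array index '$key' is not numeric\n" unless looks_like_number($key);
        return $target->[$key];
    }
    return;    # not a container this sketch knows how to read
}

# Hypothetical helper: write one key into a hash or array reference.
sub sketch_set_value_for_key {
    my ($target, $key, $value) = @_;
    my $type = reftype($target) || '';
    if ($type eq 'HASH') {
        $target->{$key} = $value;
    }
    elsif ($type eq 'ARRAY') {
        die "Array index '$key' is not numeric\n" unless looks_like_number($key);
        $target->[$key] = $value;
    }
    return $value;
}

my $employee = [ 'a', 'b', 'c', 'd' ];
my $person   = {};
sketch_set_value_for_key( $person, 'name', 'Joe' );
print sketch_get_value_for_key( $employee, 3 ), "\n";    # prints "d"
```

Because reftype() looks at the underlying data type, the same code reads blessed hashes and arrays as well as plain ones.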
Multiple-Key Chaining
Frequently we wish to access values at some remove within a structure by chaining through a list of references. Programmatic access to these values within Perl usually looks something like this:
print $report->{'employees'}[3]{'id'};
$report->{'employees'}[3]{'name'} = 'Joe';
Using these functions, you could replace the above statements with:
print get_value_for_keys( $report, 'employees', 3, 'id' );
set_value_for_keys( $report, 'Joe', 'employees', 3, 'name' );
These functions also support the "m_*" method delegation described below.
- get_value_for_keys($target, @keys) : $value
-
Starting at the target, look up each of the provided keys sequentially from the results of the previous one, returning the final value. Return value is undefined if at any time we find a key for which no value is present.
- set_value_for_keys($target, $value, @keys)
-
Starting at the target, look up each of the provided keys sequentially from the results of the previous one; when we reach the final key, use set_value_for_key to make the assignment. If an intermediate value is undefined, replaces it with an empty hash to hold the next key-value pair.
- get_or_create_value_for_keys($target, @keys) : $value
-
Like get_or_create_value_for_key, above, but chains through the provided keys as get_value_for_keys does.
- get_reference_for_keys($target, @keys) : $val_ref
-
Like get_reference_for_key, above, but chains through the provided keys as get_value_for_keys does.
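The chaining behaviour can be sketched as a loop over the key list. The sketch_* functions below are hypothetical stand-ins written for illustration; they handle only plain hashes and arrays and skip the "m_*" delegation.

```perl
use strict;
use warnings;

# Hypothetical helper: follow each key in turn, giving up with undef
# as soon as there is nothing left to look in.
sub sketch_get_value_for_keys {
    my ($target, @keys) = @_;
    for my $key (@keys) {
        my $type = ref $target;
        if    ($type eq 'HASH')  { $target = $target->{$key} }
        elsif ($type eq 'ARRAY') { $target = $target->[$key] }
        else                     { return undef }
    }
    return $target;
}

# Hypothetical helper: walk to the container of the final key, creating
# an empty hash wherever an intermediate value is undefined.
sub sketch_set_value_for_keys {
    my ($target, $value, @keys) = @_;
    my $last = pop @keys;
    for my $key (@keys) {
        my $slot = ref($target) eq 'ARRAY' ? \$target->[$key] : \$target->{$key};
        $$slot = {} unless defined $$slot;
        $target = $$slot;
    }
    if (ref($target) eq 'ARRAY') { $target->[$last] = $value }
    else                         { $target->{$last} = $value }
    return $value;
}

my $report = { employees => [] };
sketch_set_value_for_keys( $report, 'Joe', 'employees', 3, 'name' );
print sketch_get_value_for_keys( $report, 'employees', 3, 'name' ), "\n";    # prints "Joe"
```

Here the missing entry at index 3 is filled with a new hash before 'name' is assigned, which matches the "replaces it with an empty hash" rule described for set_value_for_keys.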
Object Overrides
Each of the value-for-key and multiple-key functions first checks for a method with a similar name preceded by "m_" and, if present, uses that implementation. For example, callers can consistently request get_value_for_key($foo, $key), but in cases where $foo supports a method named m_get_value_for_key, its results will be returned instead.
Classes that wish to provide alternate DRef-like behavior or generate values on demand should implement these methods in their packages. A Data::DRef::MethodBased class is provided for use by objects which use methods to get and set attributes. By making your package a subclass of MethodBased you'll inherit m_get_value_for_key and m_set_value_for_key methods which treat the key as a method name to invoke.
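As a rough sketch of this delegation, the code below checks for an m_get_value_for_key method with can() before falling back to direct access. Everything in the Sketch:: namespace, and the sketch_get_value_for_key function, is hypothetical; Sketch::MethodBased only imitates the idea behind Data::DRef::MethodBased and is not its code.

```perl
use strict;
use warnings;
use Scalar::Util qw( blessed );

# Hypothetical dispatcher: prefer the object's own m_get_value_for_key.
sub sketch_get_value_for_key {
    my ($target, $key) = @_;
    if ( blessed($target) and my $method = $target->can('m_get_value_for_key') ) {
        return $target->$method($key);
    }
    return ref($target) eq 'ARRAY' ? $target->[$key] : $target->{$key};
}

# Hypothetical base class: treat the key as the name of a method to call.
package Sketch::MethodBased;
sub m_get_value_for_key {
    my ($self, $key) = @_;
    return $self->$key();
}

package Sketch::Person;
our @ISA = ('Sketch::MethodBased');
sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub name    { return $_[0]{name} }
sub initial { return substr( $_[0]{name}, 0, 1 ) }

package main;
my $person = Sketch::Person->new( name => 'Joe' );
print sketch_get_value_for_key( $person, 'initial' ), "\n";    # prints "J"
```

The caller never needs to know whether 'initial' is stored or computed, which is the point of routing every lookup through one function.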
DRef Syntax
In order to simplify expression of the lists of keys used above, we define a string format in which they may be represented. A DRef string is composed of a series of simple scalar keys, each escaped with String::Escape's printable() function, joined with the $Separator character, '.'.
- $Separator
-
The multiple-key delimiter character, by default '.', the period character.
- get_key_drefs($target) : @drefs
-
Uses get_keys to determine the available keys for this target, and then returns an appropriately-escaped version of each of them.
- dref_from_keys( @keys ) : $dref
-
Escapes and joins the provided keys to create a dref string.
- keys_from_dref( $dref ) : @keys
-
Splits and unescapes a dref string to its constituent keys.
- join_drefs( @drefs ) : $dref
-
Joins already-escaped dref strings into a single dref.
- unshift_dref_key( $dref, $key )
-
Modify the provided dref string by escaping and prepending the provided key. Note that the original $dref variable is altered.
- shift_dref_key( $dref ) : $key
-
Modify the provided dref string by removing and unescaping the first key. Note that the original $dref variable is altered, and set to '' when the last key is removed.
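The join, split, and shift operations can be sketched as follows. The sketch_* names are hypothetical, and unlike the documented functions this simplified version performs no escaping, so it is only correct for keys that do not contain the separator.

```perl
use strict;
use warnings;

our $Separator = '.';    # mirrors the documented default

# Hypothetical helpers: join keys into a dref and split a dref into keys.
sub sketch_dref_from_keys { return join $Separator, @_ }
sub sketch_keys_from_dref { return split /\Q$Separator\E/, $_[0] }

# Hypothetical helper: remove and return the first key. Assigning to $_[0]
# changes the caller's variable, as the documentation describes.
sub sketch_shift_dref_key {
    my ($key, $rest) = split /\Q$Separator\E/, $_[0], 2;
    $_[0] = defined $rest ? $rest : '';
    return $key;
}

my $dref  = sketch_dref_from_keys( 'employees', 3, 'name' );    # 'employees.3.name'
my $first = sketch_shift_dref_key($dref);                       # 'employees'
print "$first / $dref\n";                                       # prints "employees / 3.name"
```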
DRef Pragmas
Several types of parenthesized expressions are supported as extension mechanisms for dref strings. Nested parentheses are supported, with the innermost parentheses resolved first.
Continuing the above example, one could write:
set_value_for_root_dref('empl_number', 3);
...
print get_value_for_dref($report, 'employees.(#empl_number).name');
- resolve_pragmas( $dref_with_embedded_parens ) : $dref
- resolve_pragmas( $dref_with_embedded_parens ) : ($dref, %options)
-
Calling resolve_pragmas() causes these expressions to be evaluated, and an expanded version of the dref is returned. In a list context, also returns a list of key-value pairs that may contain pragma information.
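Innermost-first resolution can be illustrated with a repeated substitution. This sketch is an assumption-laden simplification: it handles only the "(#name)" form shown in the example above, it looks names up in a flat hypothetical %root hash rather than in $Root, and it returns no options list.

```perl
use strict;
use warnings;

# Hypothetical stand-in for values stored under the root.
my %root = ( empl_number => 3, field => 'empl_number' );

# Hypothetical resolver for "(#name)" expressions only.
sub sketch_resolve_pragmas {
    my ($dref) = @_;
    # [^()]* cannot cross another parenthesis, so each pass can only
    # match an expression that has no parentheses nested inside it.
    1 while $dref =~ s/\(#([^()]*)\)/ defined $root{$1} ? $root{$1} : '' /e;
    return $dref;
}

print sketch_resolve_pragmas('employees.(#empl_number).name'), "\n";    # prints "employees.3.name"
print sketch_resolve_pragmas('employees.(#(#field)).name'), "\n";       # prints "employees.3.name"
```

In the second call the inner "(#field)" is replaced with "empl_number" first, and the resulting "(#empl_number)" is resolved on the next pass.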
DRef Access
These functions provide the main public interface for dref-based access to values in nested data structures. They invoke the equivalent ..._value_for_keys() function after expanding and splitting the provided drefs.
Using these functions, you could replace the statements from the Multiple-Key Chaining section with:
print get_value_for_dref( $report, 'employees.3.id' );
set_value_for_dref( $report, 'employees.3.name', 'Joe' );
- get_value_for_dref($target, $dref) : $value
-
Resolve pragmas and split the provided dref, then use get_value_for_keys to look those keys up starting with target.
- set_value_for_dref($target, $dref, $value)
-
Resolve pragmas and split the provided dref, then use set_value_for_keys.
Shared Data Graph Entry
Data::DRef also provides a common point-of-entry data structure, referred to as $Root. Objects or structures accessible through $Root can be referred to identically from any package using the get_value_for_root_dref and set_value_for_root_dref functions. Here's another example:
set_value_for_root_dref('report', $report);
print get_value_for_root_dref('report.employees.3.name');
- $Root
-
The data graph entry point, by default a reference to an anonymous hash.
- get_value_for_root_dref($dref) : $value
-
Returns the value for the provided dref, starting at the root.
- set_value_for_root_dref($dref, $value) : $value
-
Sets the value for the provided dref, starting at the root.
- get_value_for_optional_dref($literal_or_prefixed_dref) : $value
-
If the argument begins with $DRefPrefix, the "#" character by default, the remainder is passed through get_value_for_root_dref(); otherwise it is returned unchanged.
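The optional-dref rule can be sketched like this. The sketch_* functions are hypothetical, the local $Root and $DRefPrefix variables merely mirror the documented defaults, and the lookup splits on a literal period without any unescaping or pragma handling.

```perl
use strict;
use warnings;

our $Root       = {};     # shared entry point, as described above
our $DRefPrefix = '#';    # documented default prefix

# Hypothetical helper: follow a dotted dref starting at $Root.
sub sketch_get_value_for_root_dref {
    my ($dref) = @_;
    my $node = $Root;
    for my $key ( split /\./, $dref ) {
        my $type = ref $node;
        if    ($type eq 'HASH')  { $node = $node->{$key} }
        elsif ($type eq 'ARRAY') { $node = $node->[$key] }
        else                     { return undef }
    }
    return $node;
}

# Hypothetical helper: prefixed arguments are looked up, others are literals.
sub sketch_get_value_for_optional_dref {
    my ($arg) = @_;
    return $arg unless index( $arg, $DRefPrefix ) == 0;
    return sketch_get_value_for_root_dref( substr( $arg, length($DRefPrefix) ) );
}

$Root->{report} = { employees => [ { name => 'Ann' }, { name => 'Joe' } ] };
print sketch_get_value_for_optional_dref('#report.employees.1.name'), "\n";    # prints "Joe"
print sketch_get_value_for_optional_dref('plain text'), "\n";                  # prints "plain text"
```

This pattern lets a configuration value be either a literal or a pointer into shared data without the caller distinguishing the two cases.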
Select by DRefs
The selection functions extract and return elements of a collection by evaluating them against a provided hash of criteria. When called in a scalar context, they will return the first successful match; in a list context, they will return all successful matches.
The keys in the criteria hash are drefs to check for each candidate; a match is successful if for each of the provided drefs, the candidate returns the same value that is associated with that dref in the criteria hash. To check the value itself, rather than looking up a dref, use undef as the hash key.
- matching_keys($target, %dref_value_criteria_pairs) : $key or @keys
-
Returns keys of the target whose corresponding values match the provided criteria.
- matching_values($target, %dref_value_criteria_pairs) : $item or @items
-
Returns values of the target which match the provided criteria.
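A minimal sketch of the matching logic, assuming an array target and string comparison with eq (the comparison operator is an assumption, and the undef-key case is not covered). The sketch_* names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: look up a dotted dref in nested hashes and arrays.
sub sketch_lookup {
    my ($node, $dref) = @_;
    for my $key ( split /\./, $dref ) {
        my $type = ref $node;
        if    ($type eq 'HASH')  { $node = $node->{$key} }
        elsif ($type eq 'ARRAY') { $node = $node->[$key] }
        else                     { return undef }
    }
    return $node;
}

# Hypothetical helper: keep the items for which every criterion matches.
sub sketch_matching_values {
    my ($target, %criteria) = @_;
    my @matches;
    ITEM: for my $item (@$target) {
        for my $dref ( keys %criteria ) {
            my $value = sketch_lookup( $item, $dref );
            next ITEM unless defined $value and $value eq $criteria{$dref};
        }
        push @matches, $item;
    }
    # First match in scalar context, all matches in list context.
    return wantarray ? @matches : $matches[0];
}

my @produce = (
    { type => 'tubers', name => 'potatoes', color => 'red',    size => [ 2, 3, 5 ] },
    { type => 'fruit',  name => 'apples',   color => 'red',    size => [ 2, 2, 2 ] },
    { type => 'fruit',  name => 'oranges',  color => 'orange', size => [ 1, 1, 1 ] },
);
my @fruit = sketch_matching_values( \@produce, type => 'fruit' );
my $first = sketch_matching_values( \@produce, color => 'red' );
print scalar(@fruit), " fruit; first red item: $first->{name}\n";    # prints "2 fruit; first red item: potatoes"
```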
Index by DRefs
The indexing functions extract the values from some target structure, then return a new structure containing references to those same values.
- index_by_drefs($target, @drefs) : $index
-
Generates a hash, or series of nested hashes, of arrays containing values from the target. A single dref argument produces a single-level index, a hash which maps each value obtained to an array of the items that returned it; multiple dref arguments create nested hashes.
- unique_index_by_drefs($target, @drefs) : $index
-
Similar to index_by_drefs, except that only the most-recently visited single value is stored at each point in the index, rather than an array.
- ordered_index_by_drefs( $target, $index_dref ) : $entry_ary
-
Constructs a single-level index while preserving the order in which top-level index keys are discovered. An array of hashes is returned, each containing one of the index keys and the array of associated values.
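The single-level and ordered indexes can be sketched as below. The sketch_* functions are hypothetical and index by a plain hash key rather than a full dref; the 'value' and 'items' entry keys follow the example output shown in the EXAMPLES section.

```perl
use strict;
use warnings;

# Hypothetical helper: group items by the value found at one hash key.
sub sketch_index_by_key {
    my ($items, $key) = @_;
    my %index;
    push @{ $index{ $_->{$key} } }, $_ for @$items;
    return \%index;
}

# Hypothetical helper: the same grouping, but kept in an array so that
# index values stay in the order in which they were first seen.
sub sketch_ordered_index_by_key {
    my ($items, $key) = @_;
    my ( @entries, %seen );
    for my $item (@$items) {
        my $value = $item->{$key};
        my $entry = $seen{$value} ||= do {
            push @entries, { value => $value, items => [] };
            $entries[-1];
        };
        push @{ $entry->{items} }, $item;
    }
    return \@entries;
}

my @produce = (
    { type => 'tubers', name => 'potatoes' },
    { type => 'fruit',  name => 'apples' },
    { type => 'fruit',  name => 'oranges' },
);
my $index   = sketch_index_by_key( \@produce, 'type' );
my $ordered = sketch_ordered_index_by_key( \@produce, 'type' );
print join( ', ', map { $_->{value} } @$ordered ), "\n";    # prints "tubers, fruit"
```

The index stores references to the original items rather than copies, so a change made through the index is visible in the source structure.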
DRefs to Leaf nodes
These functions explore all of the references in the network of structures accessible from some starting point, and provide access to the outermost (non-reference) items. For a tree structure, this is equivalent to listing the leaf nodes, but these functions can also be used in structures with circular references.
- leaf_drefs($target) : @drefs
-
Returns a list of drefs to the outermost values.
- leaf_values( $target ) : @values
-
Returns a list of the outermost values.
- leaf_drefs_and_values( $target ) : %dref_value_pairs
-
Returns a flat hash of the outermost drefs and values.
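A leaf traversal that tolerates circular references can be sketched with a table of visited references. This is an illustrative assumption about one way to do it, not the module's algorithm; sketch_leaf_drefs_and_values is a hypothetical name, only hashes and arrays are followed, and hash keys are sorted purely to make the output order predictable.

```perl
use strict;
use warnings;

# Hypothetical traversal: returns (dref => value) pairs for every
# non-reference value reachable from $node.
sub sketch_leaf_drefs_and_values {
    my ($node, $prefix, $seen) = @_;
    $seen ||= {};
    return ( $prefix => $node ) unless ref $node;
    return if $seen->{$node}++;    # already visited, so cycles terminate
    my $is_array = ref($node) eq 'ARRAY';
    my @keys = $is_array ? ( 0 .. $#$node ) : sort keys %$node;
    return map {
        my $child = $is_array ? $node->[$_] : $node->{$_};
        my $dref  = defined $prefix ? "$prefix.$_" : $_;
        sketch_leaf_drefs_and_values( $child, $dref, $seen );
    } @keys;
}

my $spud = { type => 'tubers', name => 'potatoes', color => 'red', size => [ 2, 3, 5 ] };
$spud->{self} = $spud;    # a circular reference, to show the traversal still ends

my %leaves = sketch_leaf_drefs_and_values($spud);
print join( ', ', sort keys %leaves ), "\n";    # prints "color, name, size.0, size.1, size.2, type"
```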
Compatibility
To provide compatibility with earlier versions of this module, many of the functions above are also accessible through an alias with the old name.
EXAMPLES
Here is a sample data structure which will be used to illustrate various example function calls. Note that the individual hashes shown below are only referred to in the following example results, not completely copied.
$spud : {
'type'=>'tubers', 'name'=>'potatoes', 'color'=>'red', 'size'=>[2,3,5]
}
$apple : {
'type'=>'fruit', 'name'=>'apples', 'color'=>'red', 'size'=>[2,2,2]
}
$orange : {
'type'=>'fruit', 'name'=>'oranges', 'color'=>'orange', 'size'=>[1,1,1]
}
$produce_info : [ $spud, $apple, $orange, ];
Select by DRefs
matching_keys($produce_info, 'type'=>'tubers') : ( 0 )
matching_keys($produce_info, 'type'=>'fruit') : ( 1, 2 )
matching_keys($produce_info, 'type'=>'fruit', 'color'=>'red' ) : ( 1 )
matching_keys($produce_info, 'type'=>'tubers', 'color'=>'orange' ) : ( )
matching_values($produce_info, 'type'=>'fruit') : ( $apple, $orange )
matching_values($produce_info, 'type'=>'fruit', 'color'=>'red' ) : ( $apple )
Index by DRefs
index_by_drefs($produce_info, 'type') : {
'fruit' => [ $apple, $orange ],
'tubers' => [ $spud ],
}
index_by_drefs($produce_info, 'color', 'type') : {
'red' => {
'fruit' => [ $apple ],
'tubers' => [ $spud ],
},
'orange' => {
'fruit' => [ $orange ],
},
}
unique_index_by_drefs($produce_info, 'type') : {
'fruit' => $orange,
'tubers' => $spud,
}
ordered_index_by_drefs($produce_info, 'type') : [
{
'value' => 'tubers',
'items' => [ $spud ],
},
{
'value' => 'fruit',
'items' => [ $apple, $orange ],
},
]
DRefs to Leaf nodes
leaf_drefs($spud) : ( 'type', 'name', 'color', 'size.0', 'size.1', 'size.2' )
leaf_values($spud) : ( 'tubers', 'potatoes', 'red', '2', '3', '5' )
leaf_drefs_and_values($spud) : (
'type' => 'tubers', 'name' => 'potatoes', 'color' => 'red',
'size.0' => 2, 'size.1' => 3, 'size.2' => 5
)
Object Overrides
Here's an m_get_value_for_key method for an object which provides a calculated timestamp value:
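package Clock;
# Take the class name off the argument list before building the hash.
sub new { my $class = shift; return bless { @_ }, $class; }
sub m_get_value_for_key {
  my ($self, $key) = @_;
  # Calculate the timestamp on demand; other keys come from the object's hash.
  return time() if ( $key eq 'timestamp' );
  return $self->{ $key };
}
package main;
set_value_for_root_dref( 'clock', Clock->new( name => "Clock 1" ) );
...
print get_value_for_root_dref('clock.timestamp');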
STATUS AND SUPPORT
This release of Data::DRef is intended for public review and feedback. It is the most recent version of code that has been used for several years and thoroughly tested; however, the interface has recently been overhauled, and it should be considered "alpha" pending that feedback.
Name            DSLI  Description
--------------  ----  ---------------------------------------------
Data::
::DRef          adph  Nested data access using delimited strings
You will also need the String::Escape module from CPAN or www.evoscript.com.
Further information and support for this module is available at <www.evoscript.com>.
Please report bugs or other problems to <bugs@evoscript.com>.
There is one known bug in this version:
We don't always properly escape and unescape special characters within DRef strings or protect $Separators embedded within a subkey. This is expected to change soon.
There is one major change under consideration:
Perhaps a minimal method-based implementation similar to that used in Data::DRef::MethodBased should be exported to UNIVERSAL, rather than requiring all sorts of unrelated classes to establish a dependency on this module. Prototype checking might prove to be useful here.
AUTHORS AND COPYRIGHT
Copyright 1996, 1997, 1998, 1999 Evolution Online Systems, Inc. <www.evolution.com>
You may use this software for free under the terms of the Artistic License.
Contributors: M. Simon Cavalletto <simonm@evolution.com>, E. J. Evans <piglet@evolution.com>