NAME

Data::Pairs - Perl module to implement ordered mappings with possibly duplicate keys.

SYNOPSIS

use Data::Pairs;

# Simple OO style

my $pairs = Data::Pairs->new( [{a=>1},{b=>2},{c=>3},{b=>4}] );

$pairs->set( a => 0 );
$pairs->add( b2 => 2.5, 2 );  # insert at position 2 (between b and c)

my $value  = $pairs->get_values( 'c' );    # 3
my @keys   = $pairs->get_keys();           # (a, b, b2, c, b)
my @values = $pairs->get_values();         # (0, 2, 2.5, 3, 4)
my @subset = $pairs->get_values(qw(c b));  # (2, 3, 4) (values are data-ordered)

# Tied style

my %pairs;
# recommend saving an object reference, too.
my $pairs = tie %pairs, 'Data::Pairs', [{a=>1},{b=>2},{c=>3},{b=>4}];

$pairs{ a } = 0;
$pairs->add( b2 => 2.5, 2 );  # there's no tied hash equivalent

my $value  = $pairs{ c };

# keys %pairs;    # not supported, use $pairs->get_keys()
# values %pairs;  # not supported, use $pairs->get_values()
# each %pairs;    # not supported, use $pairs->get_keys()/get_values()
# @pairs{@array}; # slices not supported, use $pairs->get_values(@array)
                  # or for(@array){ ... $pairs->get_values($_) }

# There are more methods/options, see below.

DESCRIPTION

This module implements the Data::Pairs class. Objects in this class are ordered mappings, i.e., they are hashes in which the key/value pairs are in order. This is defined in shorthand as !!pairs in the YAML tag repository: http://yaml.org/type/pairs.html.

The keys in Data::Pairs objects are not necessarily unique, unlike regular hashes.

A closely related class, Data::Omap, implements the YAML !!omap data type, http://yaml.org/type/omap.html. Data::Omap objects are also ordered sequences of key:value pairs but they do not allow duplicate keys.

While ordered mappings are in order, they are not necessarily in a particular order, i.e., they are not necessarily sorted in any way. They simply have a predictable set order (unlike regular hashes whose key/value pairs are in no set order).

By default, Data::Pairs will add new key/value pairs at the end of the mapping, but you may request that they be merged in a particular order with the order() class method.

However, even though Data::Pairs will honor the requested order, it will not attempt to keep the mapping in that order. By passing position values to the set() and add() methods, you may insert new pairs anywhere in the mapping and Data::Pairs will not complain.

IMPLEMENTATION

Normally, the underlying structure of an OO object is encapsulated and not directly accessible (when you play nice). One key implementation detail of Data::Pairs is the desire that the underlying ordered mapping data structure (an array of single-key hashes) be publicly maintained as such and directly accessible, if desired.

To that end, no attributes but the data itself are stored in the objects. In the current version, that is why order() is a class method rather than an object method. In the future, inside-out techniques may be used to enable object-level ordering.

This data structure is inefficient in several ways as compared to regular hashes: rather than one hash, it contains a separate hash per key/value pair; because it's an array, key lookups (in the current version) have to loop through it.
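
To make the lookup cost concrete, here is a minimal sketch in plain Perl that works on the raw structure (an array of single-key hashes). The helper name values_for is hypothetical and this is not the module's actual code; it only illustrates why a lookup has to scan the whole array.

```perl
use strict;
use warnings;

# Raw structure: an array of single-key hashes.
my $data = [ { a => 1 }, { b => 2 }, { c => 3 }, { b => 4 } ];

# values_for() is a hypothetical helper, not part of Data::Pairs.
# It visits every element because duplicate keys may appear anywhere.
sub values_for {
    my ( $aref, $key ) = @_;
    my @found;
    for my $pair (@$aref) {
        push @found, $pair->{$key} if exists $pair->{$key};
    }
    return @found;
}

my @b_values = values_for( $data, 'b' );   # linear scan over all four pairs
print "@b_values\n";                       # 2 4
```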

The advantage of using this structure is simply that it "natively" matches the structure defined in YAML. So if the (unblessed) structure is dumped using YAML (or perhaps JSON), it may be read as is by another program, perhaps in another language. It is true that this could be accomplished by passing the object through a formatting routine, but I wanted to see first how this implementation might work.
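
As a rough illustration of that round trip, the sketch below serializes the raw structure with the core JSON::PP module and reads it back. Using JSON::PP here is an assumption made for the example; the module itself does not prescribe a serializer.

```perl
use strict;
use warnings;
use JSON::PP;   # core module since Perl 5.14

# The unblessed structure: an array of single-key hashes.
my $data = [ { a => 1 }, { b => 2 }, { c => 3 }, { b => 4 } ];

# Order and duplicate keys survive because the outer container is an array.
my $json = JSON::PP->new->canonical->encode($data);
print "$json\n";

# Reading it back yields the same array of single-key hashes.
my $back = JSON::PP->new->decode($json);
my @keys = map { keys %$_ } @$back;
print "@keys\n";   # a b c b
```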

VERSION

Data::Pairs version 0.03

CLASS METHODS

Data::Pairs->new();

Constructs a new Data::Pairs object.

Accepts an array ref containing single-key hash refs, e.g.,

my $pairs = Data::Pairs->new( [ { a => 1 }, { b => 2 }, { c => 3 }, { b => 4 } ] );

When provided, this data will be loaded into the object.

Returns a reference to the Data::Pairs object.

Data::Pairs->order();

When ordering is ON, new key/value pairs will be added in the specified order. When ordering is OFF (the default), new pairs will be added to the end of the mapping.

When called with no parameters, order() returns the current code reference (if ordering is ON) or a false value (if ordering is OFF); it does not change the ordering.

Data::Pairs->order();         # leaves ordering as is

When called with the null string, '', ordering is turned OFF.

Data::Pairs->order( '' );     # turn ordering OFF (the default)

Otherwise, accepts the predefined orderings: 'na', 'nd', 'sa', 'sd', 'sna', and 'snd', or a custom code reference, e.g.

Data::Pairs->order( 'na' );   # numeric ascending
Data::Pairs->order( 'nd' );   # numeric descending
Data::Pairs->order( 'sa' );   # string  ascending
Data::Pairs->order( 'sd' );   # string  descending
Data::Pairs->order( 'sna' );  # string/numeric ascending
Data::Pairs->order( 'snd' );  # string/numeric descending
Data::Pairs->order( sub{ int($_[0]/100) < int($_[1]/100) } );  # code

The predefined orderings, 'na' and 'nd', compare keys as numbers. The orderings, 'sa' and 'sd', compare keys as strings. The orderings, 'sna' and 'snd', compare keys as numbers when they are both numbers, as strings otherwise.

When defining a custom ordering, the convention is to use the operators < or lt between (functions of) $_[0] and $_[1] for ascending and between $_[1] and $_[0] for descending.
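
The convention can be sketched in plain Perl as follows. The helper insert_ordered is hypothetical, and the merge rule it uses (insert before the first existing key that the new key is "less than") is an assumption for illustration, not a statement about the module's internals.

```perl
use strict;
use warnings;

# insert_ordered() is a hypothetical helper, not the module's own routine.
# $less is a code ref following the convention above: it returns true
# when its first argument should come before its second.
sub insert_ordered {
    my ( $aref, $less, $key, $value ) = @_;
    for my $i ( 0 .. $#$aref ) {
        my ($existing) = keys %{ $aref->[$i] };
        if ( $less->( $key, $existing ) ) {
            splice @$aref, $i, 0, { $key => $value };   # insert before position $i
            return;
        }
    }
    push @$aref, { $key => $value };   # no later key found: append
    return;
}

my $ascending  = sub { $_[0] < $_[1] };   # < between $_[0] and $_[1]
my $descending = sub { $_[1] < $_[0] };   # < between $_[1] and $_[0]

my ( @up, @down );
for my $k ( 20, 5, 10 ) {
    insert_ordered( \@up,   $ascending,  $k => "v$k" );
    insert_ordered( \@down, $descending, $k => "v$k" );
}

my @up_keys   = map { keys %$_ } @up;     # 5 10 20
my @down_keys = map { keys %$_ } @down;   # 20 10 5
print "@up_keys\n@down_keys\n";
```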

Returns the code reference if ordering is ON, a false value if OFF.

Note, when object-level ordering is implemented, it is expected that the class-level option will still be available. In that case, any new objects will inherit the class-level ordering unless overridden at the object level.

OBJECT METHODS

$pairs->set( $key => $value[, $pos] );

Sets the value if $key exists; adds a new key/value pair if not.

Accepts $key, $value, and optionally, $pos.

If $pos is given, and there is a key/value pair at that position, it will be set to $key and $value, even if the key is different. For example:

my $pairs = Data::Pairs->new( [{a=>1},{b=>2}] );
$pairs->set( c => 3, 0 );  # pairs is now [{c=>3},{b=>2}]

(As implied by the example, positions start at 0.)

If $pos is given, and there isn't a pair there, a new pair is added there (perhaps overriding a defined ordering).

If $pos is not given, the key will be located and if found, the value set. If the key is not found, a new pair is added to the end or merged according to the defined order().

Returns $value (as a nod toward $hash{$key}=$value, which "returns" $value).

$pairs->get_values( [$key[, @keys]] );

Get a value or values.

Regardless of parameters, if the object is empty, undef is returned in scalar context, an empty list in list context.

If no parameters are given, gets all the values. In scalar context, gives the number of values in the object.

my $pairs = Data::Pairs->new( [{a=>1},{b=>2},{c=>3},{b=>4},{b=>5}] );
my @values  = $pairs->get_values();  # (1, 2, 3, 4, 5)
my $howmany = $pairs->get_values();  # 5

If multiple keys are given, their values are returned in the order found in the object, not the order of the given keys.

In scalar context, gives the number of values found, e.g.,

@values  = $pairs->get_values( 'c', 'b' );  # (2, 3, 4, 5)
$howmany = $pairs->get_values( 'c', 'b' );  # 4

If only one key is given, the first value found for that key is returned in scalar context, all the values in list context.

@values   = $pairs->get_values( 'b' );  # (2, 4, 5)
my $value = $pairs->get_values( 'b' );  # 2

Note, if you don't know if a key will have more than one value, calling get_values() in list context will ensure you get them all.

$pairs->add( $key => $value[, $pos] );

Adds a key/value pair to the object.

Accepts $key, $value, and optionally, $pos.

If $pos is given, the key/value pair will be added (inserted) there (possibly overriding a defined order), e.g.,

my $pairs = Data::Pairs->new( [{a=>1},{b=>2}] );
$pairs->add( c => 3, 1 );  # pairs is now [{a=>1},{c=>3},{b=>2}]

(Positions start at 0.)

If $pos is not given, a new pair is added to the end or merged according to the defined order().

Returns $value.
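
The difference between set() and add() with a position can be pictured on the raw structure. This is a plain-Perl sketch of the effect described in the two examples above, not the module's actual implementation; the variable names are made up for the illustration.

```perl
use strict;
use warnings;

# Two identical raw structures to compare the two operations.
my @set_demo = ( { a => 1 }, { b => 2 } );
my @add_demo = ( { a => 1 }, { b => 2 } );

$set_demo[0] = { c => 3 };             # like set( c => 3, 0 ): replaces the pair at 0
splice @add_demo, 1, 0, { c => 3 };    # like add( c => 3, 1 ): inserts, shifting the rest

my @set_keys = map { keys %$_ } @set_demo;   # c b
my @add_keys = map { keys %$_ } @add_demo;   # a c b
print "@set_keys\n@add_keys\n";
```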

$pairs->_add_ordered( $key => $value );

Private routine used by set() and add().

Accepts $key and $value.

Adds a new key/value pair to the end, or merges it according to the defined order().

This routine should not be called directly, because it does not check for duplicates.

Has no defined return value.

$pairs->get_pos( $key );

Gets position(s) where a key is found.

Accepts one key (any extras are silently ignored).

In list context, returns a list of positions where the key is found.

In scalar context, if the key only appears once, that position is returned. If the key appears more than once, an array ref is returned, which contains all the positions, e.g.,

my $pairs = Data::Pairs->new( [{a=>1},{b=>2},{c=>3},{b=>4}] );

my @pos   = $pairs->get_pos( 'c' );  # (2)
my $pos   = $pairs->get_pos( 'c' );  # 2

@pos   = $pairs->get_pos( 'b' );  # (1, 3)
$pos   = $pairs->get_pos( 'b' );  # [1, 3]

Returns ()/undef if no key given, no keys found, or object is empty.
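
Because the scalar-context result may be undef, a plain position, or an array ref, calling code has to check which one it received. A minimal sketch of such a check follows; the helper name as_list is hypothetical and not part of the module.

```perl
use strict;
use warnings;

# as_list() is a hypothetical helper: it turns the scalar-context result
# described above (undef, a single position, or an array ref) into a list.
sub as_list {
    my ($pos) = @_;
    return ()    unless defined $pos;
    return @$pos if ref $pos eq 'ARRAY';
    return ($pos);
}

my @none = as_list(undef);        # key not found
my @one  = as_list(2);            # key found once, e.g. 'c' above
my @many = as_list( [ 1, 3 ] );   # key found twice, e.g. 'b' above

print scalar(@none), " ", scalar(@one), " ", scalar(@many), "\n";   # 0 1 2
```

Calling get_pos() in list context avoids the need for this check.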

$pairs->get_pos_hash( @keys );

Gets positions where keys are found.

Accepts zero or more keys.

In list context, returns a hash of keys/positions found. In scalar context, returns a hash ref to this hash. If no keys given, all the positions are mapped in the hash. Since keys may appear more than once, the positions are stored as arrays.

my $pairs    = Data::Pairs->new( [{a=>1},{b=>2},{c=>3},{b=>4}] );
my %pos      = $pairs->get_pos_hash( 'c', 'b' );  # %pos      is (b=>[1,3],c=>[2])
my $pos_href = $pairs->get_pos_hash( 'c', 'b' );  # $pos_href is {b=>[1,3],c=>[2]}

If a given key is not found, it will not appear in the returned hash.

Returns undef/() if the object is empty.

$pairs->get_keys( @keys );

Gets keys.

Accepts zero or more keys. If no keys are given, returns all the keys in the object (list context) or the number of keys (scalar context), e.g.,

my $pairs   = Data::Pairs->new( [{a=>1},{b=>2},{c=>3},{b=>4},{b=>5}] );
my @keys    = $pairs->get_keys();  # @keys is (a, b, c, b, b)
my $howmany = $pairs->get_keys();  # $howmany is 5

If one or more keys are given, returns all the keys that are found (list) or the number found (scalar). Keys returned are listed in the order found in the object, e.g.,

@keys    = $pairs->get_keys( 'c', 'b', 'A' );  # @keys is (b, c, b, b)
$howmany = $pairs->get_keys( 'c', 'b', 'A' );  # $howmany is 4

$pairs->get_array( @keys );

Gets an array of key/value pairs.

Accepts zero or more keys. If no keys are given, returns a list of all the key/value pairs in the object (list context) or an array reference to that list (scalar context), e.g.,

my $pairs = Data::Pairs->new( [{a=>1},{b=>2},{c=>3}] );
my @array = $pairs->get_array();  # @array is ({a=>1}, {b=>2}, {c=>3})
my $aref  = $pairs->get_array();  # $aref  is [{a=>1}, {b=>2}, {c=>3}]

If one or more keys are given, returns a list of key/value pairs for all the keys that are found (list) or an aref to that list (scalar). Pairs returned are in the order found in the object, e.g.,

@array = $pairs->get_array( 'c', 'b', 'A' );  # @array is ({b=>2}, {c=>3})
$aref  = $pairs->get_array( 'c', 'b', 'A' );  # $aref  is [{b=>2}, {c=>3}]

Note, conceivably this method might be used to make a copy (unblessed) of the object, but it would not be a deep copy (if values are references, the references would be copied, not the referents).
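
The distinction can be sketched on the raw structure. The one-level copy below is only a stand-in for a non-deep copy (how get_array() builds its result is not specified here), and using the core Storable module's dclone() for the deep copy is an assumption made for the example.

```perl
use strict;
use warnings;
use Storable qw(dclone);   # core module

# Raw structure whose first value is itself a reference.
my $orig = [ { a => [ 1, 2 ] }, { b => 2 } ];

# One-level copy: new outer array and new single-key hashes,
# but the inner array ref is still shared with $orig.
my $shallow = [ map { +{ %$_ } } @$orig ];

# Deep copy: the referents are duplicated as well.
my $deep = dclone($orig);

push @{ $orig->[0]{a} }, 3;

print scalar( @{ $shallow->[0]{a} } ), "\n";   # 3 (shared with the original)
print scalar( @{ $deep->[0]{a} } ), "\n";      # 2 (independent)
```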

$pairs->firstkey();

This routine would support the tied hash FIRSTKEY method. However, since there isn't a way for nextkey() to reliably get the next key (because of duplicates), the tied implementation does not support operations that rely on FIRSTKEY/NEXTKEY.

$pairs->nextkey( $lastkey );

This routine would support the tied hash NEXTKEY method. However, because of duplicates, there isn't a way to reliably get the next key based solely on the value of $lastkey. Therefore, the tied implementation does not support operations that rely on FIRSTKEY/NEXTKEY.

$pairs->exists( $key );

Accepts one key.

Returns true if key is found in object, false if not.

This routine supports the tied hash EXISTS method, but may reasonably be called directly, too.

$pairs->delete( $key );

Accepts one key. If key is found, removes the first matching key/value pair from the object. Must be repeated in a loop to delete all occurrences of the key from the object.

Returns the value from the deleted pair.

This routine supports the tied hash DELETE method, but may be called directly, too.
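
The loop mentioned above can be sketched on the raw structure as follows. The helper delete_first is a hypothetical stand-in that mirrors the documented behavior (remove the first match, return its value); it is not the module's code.

```perl
use strict;
use warnings;

my @pairs = ( { a => 1 }, { b => 2 }, { c => 3 }, { b => 4 } );

# delete_first() is a hypothetical stand-in for the documented behavior:
# remove only the first pair whose key matches, and return its value.
sub delete_first {
    my ( $aref, $key ) = @_;
    for my $i ( 0 .. $#$aref ) {
        next unless exists $aref->[$i]{$key};
        my $removed = splice @$aref, $i, 1;
        return $removed->{$key};
    }
    return;
}

# Repeat until no pair with that key remains.
my @deleted;
while ( grep { exists $_->{b} } @pairs ) {
    push @deleted, delete_first( \@pairs, 'b' );
}

my @left = map { keys %$_ } @pairs;
print "@deleted\n";   # 2 4
print "@left\n";      # a c
```

With the module itself, the same idea would presumably be written with the methods documented here, e.g. $pairs->delete('b') while $pairs->exists('b');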

$pairs->clear();

Expects no parameters. Removes all key/value pairs from the object.

Returns an empty list.

This routine supports the tied hash CLEAR method, but may be called directly, too.

SEE ALSO

Data::Omap

    The code in Data::Omap is the basis for that in the Data::Pairs module. Data::Omap also operates on an ordered hash, but does not allow duplicate keys.

Tie::IxHash

    Use Tie::IxHash if what you need is an ordered hash in general. The Data::Pairs module does repeat many of Tie::IxHash's features. What differs is that it operates directly on a specific type of data structure, and allows duplicate keys.

AUTHOR

Brad Baxter, <bmb@galib.uga.edu>

COPYRIGHT AND LICENSE

Copyright (C) 2008 by Brad Baxter

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.