NAME
Array::To::Moose - Build Moose objects from a data array
VERSION
This document describes Array::To::Moose version 0.0.9
SYNOPSIS
use Array::To::Moose;
# or
use Array::To::Moose qw(array_to_moose set_class_ind set_key_ind
throw_nonunique_keys throw_multiple_rows );
Array::To::Moose exports the function array_to_moose() by default, and the convenience functions set_class_ind(), set_key_ind(), throw_nonunique_keys() and throw_multiple_rows() if requested.
array_to_moose
array_to_moose() builds Moose objects from suitably sorted two-dimensional arrays of data of the type returned by, e.g., DBI's selectall_arrayref(), i.e. a reference to an array containing one array reference for each row of data fetched.
Example 1a
package Car;
use Moose;
has 'make' => (is => 'ro', isa => 'Str');
has 'model' => (is => 'ro', isa => 'Str');
has 'year' => (is => 'ro', isa => 'Int');
package CarOwner;
use Moose;
has 'last' => (is => 'ro', isa => 'Str');
has 'first' => (is => 'ro', isa => 'Str');
has 'Cars' => (is => 'ro', isa => 'ArrayRef[Car]');
...
# in package main:
use Array::To::Moose;
# In this dataset Alex owns two cars, Jim one, and Alice three
my $data = [
[ qw( Green Alex Ford Focus 2011 ) ],
[ qw( Green Alex VW Jetta 2009 ) ],
[ qw( Green Jim Honda Civic 2007 ) ],
[ qw( Smith Alice Buick Regal 2012 ) ],
[ qw( Smith Alice Toyota Camry 2008 ) ],
[ qw( Smith Alice BMW X5 2010 ) ],
];
my $CarOwners = array_to_moose(
data => $data,
desc => {
class => 'CarOwner',
last => 0,
first => 1,
Cars => {
class => 'Car',
make => 2,
model => 3,
year => 4,
} # Cars
} # Car Owners
);
print $CarOwners->[2]->Cars->[1]->model; # prints "Camry"
Example 1b - Hash(ref) Sub-objects
In the above example, array_to_moose() returns a reference to an array of CarOwner objects, $CarOwners.
If a hash of CarOwner objects is required, a "key => ..." entry must be added to the descriptor hash. For example, to construct a hash of CarOwner objects whose key is the owner's first name (unique for every person in the example data), the call becomes:
my $CarOwnersH = array_to_moose(
data => $data,
desc => {
class => 'CarOwner',
key => 1, # note key
last => 0,
first => 1,
Cars => {
class => 'Car',
make => 2,
model => 3,
year => 4,
} # Cars
} # Car Owners
);
print $CarOwnersH->{Alex}->Cars->[0]->make; # prints "Ford"
Similarly, to construct the Cars sub-objects as hash sub-objects (and not an array as above), define CarOwner as:
package CarOwner;
use Moose;
has 'last' => (is => 'ro', isa => 'Str' );
has 'first' => (is => 'ro', isa => 'Str' );
has 'Cars' => (is => 'ro', isa => 'HashRef[Car]'); # Was 'ArrayRef[Car]'
and, noting that the car make is unique for each person in the $data dataset, we construct the reference to an array of objects with the call:
$CarOwners = array_to_moose(
data => $data,
desc => {
class => 'CarOwner',
last => 0,
first => 1,
Cars => {
class => 'Car',
key => 2, # note key
model => 3,
year => 4,
} # Cars
} # Car Owners
);
print $CarOwners->[2]->Cars->{BMW}->model; # prints 'X5'
Example 1c - "Simple" Reference Attributes
If, instead of the car owner object containing an ArrayRef or HashRef of Car sub-objects, it contains, say, an ArrayRef of strings representing the names of the car makers:
package SimpleCarOwner;
use Moose;
has 'last' => (is => 'ro', isa => 'Str' );
has 'first' => (is => 'ro', isa => 'Str' );
has 'CarMakers' => (is => 'ro', isa => 'ArrayRef[Str]');
Using the same dataset from Example 1a, we construct an arrayref of SimpleCarOwner objects as:
my $SimpleCarOwners = array_to_moose(
data => $data,
desc => {
class => 'SimpleCarOwner',
last => 0,
first => 1,
CarMakers => [2], # Note the '[...]' brackets
}
);
print $SimpleCarOwners->[2]->CarMakers->[1]; # prints 'Toyota'
I.e., when the object attribute is an ArrayRef of one of the Moose "simple" types, e.g. 'Str', 'Num', 'Bool', etc. (see Moose::Manual::Types), then the column number should appear in square brackets ('CarMakers => [2]' above) to differentiate it from the bare types (last => 0 and first => 1 above).
Note that Array::To::Moose doesn't (yet) handle the case of hashrefs of "simple" types, e.g., ( isa => "HashRef[Str]" ).
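As a rough illustration of what the bracketed form asks for, the following plain-Perl sketch collects column 2 from consecutive rows that share the same owner columns. It does not use Array::To::Moose, and the grouping loop and the @owners structure are assumptions made for illustration, not the module's implementation:

```perl
use strict;
use warnings;

# Same rows as Example 1a: last, first, make, model, year
my $data = [
    [ qw( Green Alex  Ford   Focus 2011 ) ],
    [ qw( Green Alex  VW     Jetta 2009 ) ],
    [ qw( Green Jim   Honda  Civic 2007 ) ],
    [ qw( Smith Alice Buick  Regal 2012 ) ],
    [ qw( Smith Alice Toyota Camry 2008 ) ],
    [ qw( Smith Alice BMW    X5    2010 ) ],
];

# Walk the (already sorted) rows once. Start a new owner whenever
# columns 0 and 1 change, and push column 2 onto that owner's list.
my @owners;
for my $row (@$data) {
    my $current = $owners[-1];
    if (   !$current
        || $current->{last}  ne $row->[0]
        || $current->{first} ne $row->[1] )
    {
        $current = { last => $row->[0], first => $row->[1], CarMakers => [] };
        push @owners, $current;
    }
    push @{ $current->{CarMakers} }, $row->[2];
}

print $owners[2]{CarMakers}[1], "\n";    # Toyota
```

Each owner ends up with a flat list of strings rather than a list of sub-objects, which is the distinction the square brackets express in the descriptor.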
Example 2 - Use with DBI
The main rationale for writing Array::To::Moose is to make it easy to build Moose objects from data extracted from relational databases, especially when the database query involves multiple tables with one-to-many relationships to each other.
As an example, consider a database which models patients making visits to a clinic on multiple occasions, and on each visit, having a doctor run some tests and diagnose the patient's complaint. In this model, the database Patient table would have a one-to-many relationship with the Visit table, which in turn would have a one-to-many relationship with the Test table.
The corresponding Moose model has nested Moose objects which reflect those one-to-many relationships, i.e., multiple Visit objects per Patient object and multiple Test objects per Visit object, declared as:
package Test;
use Moose;
has 'name' => (is => 'rw', isa => 'Str');
has 'result' => (is => 'rw', isa => 'Str');
package Visit;
use Moose;
has 'date' => (is => 'rw', isa => 'Str' );
has 'md' => (is => 'rw', isa => 'Str' );
has 'diagnosis' => (is => 'rw', isa => 'Str' );
has 'Tests' => (is => 'rw', isa => 'HashRef[Test]' );
package Patient;
use Moose;
has 'last' => (is => 'rw', isa => 'Str' );
has 'first' => (is => 'rw', isa => 'Str' );
has 'Visits' => (is => 'rw', isa => 'ArrayRef[Visit]' );
In the main program:
use DBI;
use Array::To::Moose;
...
my $sql = q{
SELECT
P.Last, P.First
,V.Date, V.Doctor, V.Diagnosis
,T.Name, T.Result
FROM
Patient P
,Visit V
,Test T
WHERE
-- join clauses
P.Patient_key = V.Patient_key
AND V.Visit_key = T.Visit_key
...
ORDER BY
P.Last, P.First, V.Date
};
my $dbh = DBI->connect(...);
my $data = $dbh->selectall_arrayref($sql);
# rows of @$data contain:
# Last, First, Date, Doctor, Diagnosis, Name, Result
# at positions: [0] [1] [2] [3] [4] [5] [6]
my $patients = array_to_moose(
data => $data,
desc => {
class => 'Patient',
last => 0,
first => 1,
Visits => {
class => 'Visit',
date => 2,
md => 3,
diagnosis => 4,
Tests => {
class => 'Test',
key => 5,
name => 5,
result => 6,
} # tests
} # visits
} # patients
);
print $patients->[2]->Visits->[0]->Tests->{BP}->result; # prints '120/80'
Note: We used the Test name as the key for the Visit 'Tests' attribute, as the tests have unique names within any one Visit. (See t/5.t)
DESCRIPTION
As shown in the above examples, the general usage is:
package MyClass;
use Moose;
(define Moose object(s))
...
use Array::To::Moose;
...
my $data_ref = $dbh->selectall_arrayref($sql); # for example
my $object_ref = array_to_moose(
data => $data_ref,
desc => {
class => 'MyClass',
key => K, # only for HashRefs
attrib_1 => N1,
attrib_2 => N2,
...
attrib_m => [ M ],
...
SubObject => {
class => 'MySubClass',
...
}
}
);
Where:
array_to_moose() returns an array or hash reference of MyClass Moose objects. All Moose classes (MyClass, MySubClass, etc.) must already have been defined by the user.
$data_ref is a reference to an array containing references to arrays of scalars, of the kind returned by, e.g., DBI's selectall_arrayref().
desc (descriptor) is a reference to a hash which contains several types of data:
class => 'MyObj' is required and defines the Moose class or package which will contain the data. The user should have defined this class already.
key => N is required if the Moose object being constructed is to be a hashref, either at the top-level Moose object returned from array_to_moose() or as an "isa => 'HashRef[...]'" sub-object.
attrib => N where attrib is the name of a Moose attribute ("has 'attrib' => ...").
attrib => [ N ] where attrib is the name of a Moose "simple" sub-attribute ("has 'attrib' => ( isa => 'ArrayRef[Type]' ...)"), where Type is a "simple" Moose type, e.g., 'Str', 'Int', etc.
In the above cases, N is a non-negative integer giving the zero-indexed column number in the data array where that attribute's data is to be found.
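To make the role of the column numbers concrete, here is a minimal plain-Perl sketch of how a flat descriptor can be turned into constructor arguments for one row. The %desc and %args names are hypothetical, and this is an illustration of the idea rather than the module's own code:

```perl
use strict;
use warnings;

# Hypothetical flat descriptor: attribute name => zero-indexed column.
my %desc = ( last => 0, first => 1 );

# One row of data, as it might come back from a database query.
my $row = [ qw( Smith Alice Buick Regal 2012 ) ];

# Look up each attribute's value by its column number.
my %args = map { $_ => $row->[ $desc{$_} ] } keys %desc;

print "$args{first} $args{last}\n";    # Alice Smith
```

The resulting key/value pairs are of the kind that could be handed to a constructor, e.g. CarOwner->new(%args).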
Sub-Objects
array_to_moose() can handle three types of Moose sub-objects, i.e.:
an array of sub-objects:
has 'Sub_Obj' => ( isa => 'ArrayRef[MyObj]' );
a hash of sub-objects:
has 'Sub_Obj' => ( isa => 'HashRef[MyObj]' );
or a single sub-object:
has 'Sub_Obj' => ( isa => 'MyObj' );
The descriptor entry for Sub_Obj in each of these cases is (almost) the same:
desc => {
class => ...
...
Sub_Obj => {
class => 'MyObj',
key => <keycol>, # HashRef[...] only
attrib_a => <N>,
...
} # end SubObj
...
} # end desc
(A HashRef[...] sub-object will also require a key => N entry in the descriptor.)
In addition, array_to_moose() can also handle ArrayRefs of "simple" types:
has 'Sub_Obj' => ( isa => 'ArrayRef[Type]' );
where Type is a "simple" Moose type, e.g., 'Str', 'Int', 'Bool', etc.
Ordering the data
array_to_moose() does not sort the input data array, and does all processing in a single pass through the data. This means that the data in the array must be sorted properly for the algorithm to work.
For example, in the previous Patient/Visit/Test example, in which there are many Tests per Visit and many Visits per Patient, the data in the Test column(s) must change the fastest, the Visit data slower, and the Patient data the slowest:
Patient  Visit  Test
-------  -----  ----
P1       V1     T1
P1       V1     T2
P1       V1     T3
P1       V2     T4
P1       V2     T5
P2       V3     T6
P2       V3     T7
P2       V4     T8
In SQL this would be accomplished by an ORDER BY clause, e.g.:
ORDER BY Patient.Key, Visit.Key, Test.Key
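When the rows do not come from SQL, or cannot be ordered there, the same ordering can be imposed in Perl before calling array_to_moose(). A minimal sketch, assuming hypothetical rows whose columns 0, 1 and 2 hold the patient, visit and test keys and that string comparison is appropriate for all three:

```perl
use strict;
use warnings;

# Hypothetical unsorted rows: patient key, visit key, test key.
my @rows = (
    [ 'P2', 'V3', 'T7' ],
    [ 'P1', 'V2', 'T4' ],
    [ 'P1', 'V1', 'T2' ],
    [ 'P2', 'V3', 'T6' ],
    [ 'P1', 'V1', 'T1' ],
);

# Compare by patient first, then visit, then test, so that the test
# column changes fastest and the patient column slowest.
my @sorted = sort {
       $a->[0] cmp $b->[0]
    || $a->[1] cmp $b->[1]
    || $a->[2] cmp $b->[2]
} @rows;

print join( ' ', @$_ ), "\n" for @sorted;
```

A reference to the sorted array (\@sorted) would then be what is passed as data => ... in the call.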
throw_nonunique_keys ()
By default, array_to_moose() does not check the uniqueness of hash key values within the data. If the key values in the data are not unique, existing hash entries will get overwritten, and the sub-object will contain the value from the last data row which contained that key value. For example:
package Employer;
use Moose;
has 'year' => (is => 'rw', isa => 'Str');
has 'name' => (is => 'rw', isa => 'Str');
package Person;
use Moose;
has 'name' => (is => 'rw', isa => 'Str' );
has 'Employers' => (is => 'rw', isa => 'HashRef[Employer]');
...
my $data = [
[ 'Anne Miller', '2005', 'Acme Corp' ],
[ 'Anne Miller', '2006', 'Acme Corp' ],
[ 'Anne Miller', '2007', 'Widgets, Inc' ],
...
];
The call:
my $obj = array_to_moose(
data => $data,
desc => {
class => 'Person',
name => 0,
Employers => {
class => 'Employer',
key => 2, # using employer name as key
year => 1,
} # Employer
} # Person
);
Because the employer was 'Acme Corp' in years 2005 & 2006, array_to_moose() will silently overwrite the 2005 Employer object with the data for the 2006 Employer object:
print $obj->[0]->Employers->{'Acme Corp'}->year, "\n"; # prints '2006'
Calling throw_nonunique_keys() (either with no argument, or with a non-zero argument) enables reporting of non-unique keys. In the above example, array_to_moose() would exit with the warning:
Non-unique key 'Acme Corp' in 'Employer' class ...
Calling throw_nonunique_keys(0), i.e. with an argument of zero, will disable subsequent reporting of non-unique keys. (See t/8c.t)
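Independently of this switch, the data can be inspected for repeated keys before it is handed to array_to_moose(). The following plain-Perl sketch is one hedged way to do that. It assumes, as a simplification, that column 0 alone identifies the parent object and column 2 is the intended hash key, and the %seen and @duplicates names are hypothetical:

```perl
use strict;
use warnings;

# Rows from the example above: person name, year, employer name.
my $data = [
    [ 'Anne Miller', '2005', 'Acme Corp' ],
    [ 'Anne Miller', '2006', 'Acme Corp' ],
    [ 'Anne Miller', '2007', 'Widgets, Inc' ],
];

# Count how often each (person, employer) pair occurs. Any count above
# one means a later row would overwrite an earlier hash entry.
my %seen;
$seen{ $_->[0] }{ $_->[2] }++ for @$data;

my @duplicates;
for my $person ( sort keys %seen ) {
    for my $employer ( sort keys %{ $seen{$person} } ) {
        push @duplicates, "$person / $employer"
            if $seen{$person}{$employer} > 1;
    }
}

print "Non-unique: $_\n" for @duplicates;
```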
throw_multiple_rows ()
For single-occurrence sub-objects (i.e. ( isa => 'MyObj' )), if the data contains more than one row of data for the sub-object, only the first row will be used to construct the single sub-object, and array_to_moose() will not report the fact. E.g.:
package Salary;
use Moose;
has 'year' => (is => 'rw', isa => 'Str');
has 'amount' => (is => 'rw', isa => 'Int');
package Person;
use Moose;
has 'name' => (is => 'rw', isa => 'Str' );
has 'Salary' => (is => 'rw', isa => 'Salary'); # a single object
...
my $data = [
[ 'John Smith', '2005', 23_350 ],
[ 'John Smith', '2006', 24_000 ],
[ 'John Smith', '2007', 26_830 ],
...
];
The call:
my $obj = array_to_moose(
data => $data,
desc => {
class => 'Person',
name => 0,
Salary => {
class => 'Salary',
year => 1,
amount => 2
} # Salary
} # Person
);
would silently assign to Salary the first of the three Salary data rows, i.e. for year 2005:
print $obj->[0]->Salary->year, "\n"; # prints '2005'
Calling throw_multiple_rows() (either with no argument, or with a non-zero argument) enables reporting of this situation. In the above example, array_to_moose() will exit with the error:
Expected a single 'Salary' object, but got 3 of them ...
Calling throw_multiple_rows(0), i.e. with an argument of zero, will disable subsequent reporting of this error. (See t/8d.t)
set_class_ind (), set_key_ind ()
Problems arise if the Moose objects being constructed contain attributes called class or key, causing ambiguities in the descriptor. (Does key => 5 mean that the attribute key, or the hash key, is in column 5?)
In these cases, set_class_ind() and set_key_ind() can be used to change the keywords for the class => ... and key => ... descriptor entries.
For example:
package Letter;
use Moose;
has 'address' => ( is => 'ro', isa => 'Str' );
has 'class' => ( is => 'ro', isa => 'PostalClass' );
...
set_class_ind('package'); # use "package =>" in place of "class =>"
my $letters = array_to_moose(
data => $data,
desc => {
package => 'Letter', # the Moose class
address => 0,
class => 1, # the attribute 'class'
...
}
);
Read-only Attributes
One of the recommendations of Moose::Manual::BestPractices is to make attributes read-only (is => 'ro') wherever possible. Array::To::Moose supports this by evaluating all the attributes for a given object given in the descriptor, then including them all in the call to new(...) when constructing the object.
For Moose objects with attributes which are sub-objects, i.e. references to a Moose object, or references to an array or hash of Moose objects, it means that the sub-objects must be evaluated before the new() call. The effect of this for multi-leveled Moose objects is that object evaluations are carried out depth-first.
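The ordering constraint can be seen in a small stand-alone sketch. The Car and Owner classes below are plain blessed hashes standing in for Moose classes with read-only attributes (an assumption made so the example runs without Moose), and the two-step construction is an illustration of the depth-first idea, not the module's code:

```perl
use strict;
use warnings;

# Minimal stand-in classes: attributes can only be set through new(),
# mimicking read-only Moose attributes.
{
    package Car;
    sub new  { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub make { return $_[0]{make} }
}
{
    package Owner;
    sub new   { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub first { return $_[0]{first} }
    sub Cars  { return $_[0]{Cars} }
}

# Hypothetical rows: owner first name, car make.
my @rows = ( [ qw( Alex Ford ) ], [ qw( Alex VW ) ] );

# Step 1: build the innermost objects first.
my @cars = map { Car->new( make => $_->[1] ) } @rows;

# Step 2: only then construct the parent, passing the finished
# children to new() together with the parent's own attributes.
my $owner = Owner->new( first => $rows[0][0], Cars => \@cars );

print $owner->Cars->[1]->make, "\n";    # VW
```

Because nothing is assigned after construction, the parent cannot be created until every row belonging to it has been consumed.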
Treatment of NULLs
array_to_moose() uses Array::GroupBy::igroup_by() to compare the rows in the data given in data => ..., using the function Array::GroupBy::str_row_equal(), which compares the data as strings.
If the data contains undef values, typically returned from database SQL queries in which DBI maps NULL values to undef, then when str_row_equal() encounters undef elements in corresponding column positions, it will consider the elements equal. When corresponding column elements are defined and undef respectively, the elements are considered unequal.
This truth table demonstrates the various combinations:
-------+------------+--------------+--------------+--------------
row 1 | ('a', 'b') | ('a', undef) | ('a', undef) | ('a', 'b' )
row 2 | ('a', 'b') | ('a', undef) | ('a', 'b' ) | ('a', undef)
-------+------------+--------------+--------------+--------------
equal? | yes | yes | no | no
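The rule in the table can be written out as a short comparison function. The rows_equal() function below is a hypothetical re-implementation of the described behaviour for illustration only; it is not the code of Array::GroupBy::str_row_equal():

```perl
use strict;
use warnings;

# Compare two rows column by column, treating undef as equal only to
# another undef in the same position.
sub rows_equal {
    my ( $row1, $row2 ) = @_;
    return 0 unless @$row1 == @$row2;
    for my $i ( 0 .. $#$row1 ) {
        my ( $x, $y ) = ( $row1->[$i], $row2->[$i] );
        next if !defined $x && !defined $y;        # undef vs undef: equal
        return 0 if !defined $x || !defined $y;    # undef vs defined: unequal
        return 0 if $x ne $y;                      # both defined: compare as strings
    }
    return 1;
}

# The four columns of the truth table, left to right.
print rows_equal( [ 'a', 'b' ],   [ 'a', 'b' ] )   ? "yes" : "no", "\n";
print rows_equal( [ 'a', undef ], [ 'a', undef ] ) ? "yes" : "no", "\n";
print rows_equal( [ 'a', undef ], [ 'a', 'b' ] )   ? "yes" : "no", "\n";
print rows_equal( [ 'a', 'b' ],   [ 'a', undef ] ) ? "yes" : "no", "\n";
```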
EXPORT
array_to_moose by default; throw_nonunique_keys, throw_multiple_rows, set_class_ind and set_key_ind if requested.
DIAGNOSTICS
Errors in the call of array_to_moose() will be caught by Params::Validate::Array, q.v.
array_to_moose() does a lot of error checking, and is probably annoyingly chatty. Most of the errors generated are, of course, self-explanatory :-)
DEPENDENCIES
Carp
Params::Validate::Array
Array::GroupBy
SEE ALSO
BUGS
The handling of Moose type constraints is primitive.
AUTHOR
Sam Brain <samb@stanford.edu>
COPYRIGHT AND LICENSE
Copyright (c) Stanford University. June 6th, 2010. All rights reserved. Author: Sam Brain <samb@stanford.edu>
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.