NAME
CohortExplorer::Datasource - CohortExplorer datasource superclass
SYNOPSIS
# The code below shows methods your datasource class overrides;
package CohortExplorer::Application::REDCap::Datasource;
use base qw( CohortExplorer::Datasource );
sub authenticate {
my ($self, $opts) = @_;
# authentication code...
return $response
}
sub default_parameters {
my ($self, $opts, $response) = @_;
# get database handle (i.e. $self->dbh()) and run some SQL queries to get additional parameters
# or, simply add some parameters without querying the database
return $default;
}
sub entity_structure {
my ($self) = @_;
my %struct = (
-columns => {
entity_id => "rd.record",
variable => "rd.field_name",
value => "rd.value",
table => "rm.form_name"
},
-from => [ -join => qw/redcap_data|rd <=>{project_id=project_id} redcap_metadata|rm/ ],
-where => {
"rd.project_id" => $self->project_id()
}
);
$struct{-columns}{visit} = 'rd.event_id-' . $self->init_event_id()
if ( $self->type() eq 'longitudinal');
return \%struct;
}
sub table_structure {
my ($self) = @_;
return {
-columns => {
table => "GROUP_CONCAT( DISTINCT form_name )",
variable_count => "COUNT( field_name )",
label => "element_label"
},
-from => "redcap_metadata",
-where => {
"project_id" => $self->project_id()
},
-order_by => "field_order",
-group_by => "form_name"
};
}
sub variable_structure {
my ($self) = @_;
return {
-columns => {
variable => "field_name",
table => "form_name",
label => "element_label",
type => "IF( element_validation_type IS NULL, 'text', element_validation_type)",
category => "IF( element_enum like '%, %', REPLACE( element_enum, '\\\\n', '\n'), '')"
},
-from => "redcap_metadata",
-where => {
"project_id" => $self->project_id()
},
-order_by => "field_order"
};
}
sub datatype_map {
return {
'int' => 'signed',
'float' => 'decimal',
'date_dmy' => 'date',
'date_mdy' => 'date',
'date_ymd' => 'date',
'datetime_dmy' => 'datetime'
};
}
OBJECT CONSTRUCTION
initialise( $opts, $config_file )
CohortExplorer::Datasource is an abstract factory; initialise()
is the factory method that constructs and returns an object of the datasource supplied as an application option. This class reads the datasource configuration from the config file (i.e. /etc/CohortExplorer/datasource-config.properties
) to instantiate the datasource object. The config file takes the format below,
<datasource Clinical>
namespace=Opal
type=longitudinal
static_tables=Demographics,FamilyHistory
url=myhost
dsn=DBI:mysql:database=opal;host=myhost;port=3306
username=yourusername
password=yourpassword
</datasource>
<datasource Clinical1>
namespace=Opal
type=longitudinal
id_visit_separator=_
name=Clinical
url=myhost
dsn=DBI:mysql:database=opal;host=myhost;port=3306
username=yourusername
password=yourpassword
</datasource>
<datasource Drugs>
namespace=REDCap
dsn=DBI:mysql:database=opal;host=myhost;port=3306
username=yourusername
password=yourpassword
</datasource>
Each blocks holds a unique datasource configuration. Apart from some reserved parameters, namespace
, dsn
, username
and password
it is up to the user to decide what parameters they want to include in the configuration file. The user can specify the actual name of the datasource using the name
parameter provided the block name is an alias. If the name
parameter is not found then the block name is assumed to be the actual name of the datasource. In the example above, both Clinical and Clinical1 connect to the same datasource (i.e. Clinical) but with different configurations. Once this class has instantiated the datasource object, the user can access the parameters by simply calling the methods which have the same name as the parameters. For example, the database handle can be retrieved by $self->dbh()
and id_visit_separator by $self->id_visit_separator()
. The namespace is the name of the repository housing the datasource.
new()
$object = $datasource_pkg->new();
Basic constructor.
PROCESSING
After instantiating the datasource object, the class first calls authenticate to perform the user authentication. If the authentication is successful (i.e. $response is defined), it sets the default parameters, if any ( via default_parameters). The subsequent steps include calling the methods, entity_structure, table_structure, variable_structure, datatype_map and validating the return from each method. Upon successful validation the class attempts to set entity, table and variable specific parameters by invoking the methods below:
set_entity_parameters( $struct )
This method attempts to retrieve the entity parameters, entity_count
and visit_max
(for longitudinal datasources) from the repository. The method accepts the input from entity_structure.
set_table_parameters( $struct )
This method attempts to set the information on tables and their attributes as a hash where, table names are keys and attribute name-value pairs are hash values. The table attributes are read from the -columns
field specified under the hash ref from table_structure.
set_variable_parameters( $struct )
This method attempts to set the information on variables and their attributes as a hash where, keys are table and variable names joined by a dot and, values are the attribute name-value pairs. Instead of using the variable names as keys the method uses the combination of the table and the variable name as keys because,
- a.
-
the resulting name also contains the name of the table, the variable was recorded under (e.g. CaseHistory.Onset_Age),
- b.
-
distinguishes one variable from the other as sometimes variables from different tables may have the same name (e.g. Subject.Sex and Informant.Sex).
set_visit_variables()
This method is only called if the datasource is longitudinal. The method attempts to set the visit variables. The visit variables are only valid to dynamic tables and they represent the visit transformation of variables (e.g., V1.Var, V2.Var ... Vmax.Var, Vany.Var and Vlast.Var). The prefix V1
represents the first visit of the variable var
, V2
represents the second visit, Vany
implies any visit and Vlast
last visit. The compare command allows the use of visit variables when searching for entities of interest.
SUBCLASS HOOKS
The subclasses override the following hooks:
authenticate( $opts )
This method should return a response (a scalar) upon successful authentication otherwise return undef
. The method is called with one parameter, $opts
which is a hash with application options as keys and their user-provided values as hash values. Note the methods below are only called if the authentication is successful.
default_parameters( $opts, $response )
This method should return a hash ref containing parameter name-value pairs. The user can run some SQL queries in case the parameters to be added to the datasource object first need to be retrieved from the database. The parameters used in calling this method are:
$opts
is a hash with application options as keys and their user-provided values as hash values.
$response
is the response received upon successful authentication.
entity_structure()
The method should return a hash ref defining the entity structure in the database. The hash ref must have the following keys:
- -columns
-
entity_id
variable
value
table
visit
(only required for longitudinal datasources) - -from
-
table specifications (see SQL::Abstract::More)
- -where
-
where clauses (see SQL::Abstract)
table_structure()
The method should return a hash ref defining the table structure in the database. The table
in this context implies questionnaires or forms. For example,
{
-columns => {
table => "GROUP_CONCAT( DISTINCT form_name )",
variable_count => "COUNT( field_name )",
label => "element_label"
},
-from => "redcap_metadata",
-where => {
"project_id" => $self->project_id()
},
-order_by => "field_order",
-group_by => "form_name"
}
the user should make sure the returned hash ref is able to produce the SQL output like the one below,
+-------------------+-----------------+------------------+
| table | variable_count | label |
+-------------------+-----------------+------------------+
| demographics | 26 | Demographics |
| baseline_data | 19 | Baseline Data |
| month_1_data | 20 | Month 1 Data |
| month_2_data | 20 | Month 2 Data |
| month_3_data | 28 | Month 3 Data |
| completion_data | 6 | Completion Data |
+-------------------+-----------------+------------------+
Note that -columns
hash ref must have the key table
corresponding to the name of form/questionnaire and others columns are table attributes. It is up to the user to decide what table attributes they think are suitable for the description of tables.
variable_structure()
This method should return a hash ref defining the variable structure in the database. For example,
{
-columns => {
variable => "field_name",
table => "form_name",
label => "element_label"
type => "IF( element_validation_type IS NULL, 'text', element_validation_type)",
category => "IF( element_enum like '%, %', REPLACE( element_enum, '\\\\n', '\n'), '')",
},
-from => "redcap_metadata",
-where => {
"project_id" => $self->project_id()
},
-order_by => "field_order"
}
the user should make sure the returned hash ref is able to produce the SQL output like the one below,
+---------------------------+---------------+-------------------------+---------------+----------+
| variable | table |label | category | type |
+---------------------------+---------------+-------------------------+---------------------------
| kt_v_b | baseline_data | Kt/V | | float |
| plasma1_b | baseline_data | Collected Plasma 1? | 0, No | text |
| | | | 1, Yes | |
| date_visit_1 | month_1_data | Date of Month 1 visit | | date_ymd |
| alb_1 | month_1_data | Serum Albumin (g/dL) | | float |
| prealb_1 | month_1_data | Serum Prealbumin (mg/dL)| | float |
| creat_1 | month_1_data | Creatinine (mg/dL) | | float |
+---------------------------+---------------+-----------+-------------------------------+--------+
Note that -columns
hash ref must have the key variable
and table
. Again it is up to the user to decide what variable attributes (i.e. meta data) they think define the variables in the datasource. The categories in category
should be separated by newline.
datatype_map()
This method should return a hash ref with variable type as keys and equivalent SQL type (i.e. castable) as value.
DIAGNOSTICS
Config::General fails to parse the datasource configuration file.
Failed to instantiate datasource package '<datasource pkg>' via new().
The return from methods
default_parameters
,entity_structure
,table_structure
,variable_structure
anddatatype_map
is either not hash worthy or incomplete.The
select
method from SQL::Abstract::More fails to construct the SQL from the supplied hash ref.The method
execute
from DBI fails to execute the SQL query.
DEPENDENCIES
Carp
SEE ALSO
CohortExplorer::Application::Opal::Datasource
CohortExplorer::Application::REDCap::Datasource
CohortExplorer::Command::Describe
CohortExplorer::Command::History
CohortExplorer::Command::Query::Search
CohortExplorer::Command::Query::Compare
LICENSE AND COPYRIGHT
Copyright (c) 2013-2014 Abhishek Dixit (adixit@cpan.org). All rights reserved.
This program is free software: you can redistribute it and/or modify it under the terms of either:
the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version, or
the " Artistic Licence ".
AUTHOR
Abhishek Dixit