NAME

DBUtils - a module for useful Database utilities

SYNOPSIS

  use DBUtils;

  assert_table_defined($dbh,$table_name);

  assert_dbh($dbh);

  $id = fetch_last_id($dbh,$table_name);

  $sql_string = create_insert_sql($dbh,$table_name,@column_names);

  $sql_string = create_update_sql($dbh,$table_name,
                                  SET=>$array_ref,WHERE=>$string);

  $sql_string = create_select_sql($dbh,
                                  COLUMNS=>$array_ref,
				  FROM=>$array_ref,
				  WHERE=>$where_string,
				  DISTINCT=>$bool,
                                  LIMIT=>$bool,
                                 );

  $species = fetch_es_species($es_pk);

  @ids = fetch_es_ids();

  @ids = fetch_user_al_ids($us_fk);

  @ids = fetch_am_ids($experiment_set_pk, 
                      @array_measurement_types);

  $array_ref = fetch_fkey_list($dbh,
                               $table_name,
                               $column_name,
                               $pkey,
                               $rec_level);

  $exp_set = fetch_spot_data($es_pk,@am_types);

  output_spot_data($file_handle,
                   $experiment_set_pk,
                   @array_measurement_types);

  $id = lookup_id($dbh,$table,$attribute,$value,$approx);

  $id = lookup_species_id($dbh,$species_name,$approx);

  $id = lookup_usf_id($dbh,$usf_name,$approx);

  $id = lookup_contact_id($dbh,$contact_name,$approx);

  $id = lookup_software_id($dbh,$software_name,$approx);

DESCRIPTION

Helper functions for achieving common DB tasks with a GeneX DB.

FUNCTIONS

create_update_sql($dbh,TABLE=>$table,SET=>$hash_ref,WHERE=>$string)

Returns an SQL string valid for the current DBMS. It is up to the caller whether to use DBI placeholders or not, they will not be generated by the subroutine.

create_insert_sql($dbh,$table,\@column_names)
create_insert_sql($dbh,$table,\%col_val_pairs)

There are two methods of invoking create_insert_sql. If invoked with only an array reference of column names, create_insert_sql will return an SQL string with placeholders for use by the DBI prepare() and exectute() methods. In this case the \@columen_name reference contains the names and order of the columns we plan to insert data in $table.

If invoked with a hash reference of column name/value pairs, create_insert_sql will return a final SQL string that can be used by DBI do().

create_select_sql($dbh, COLUMNS=>$array_ref, FROM=>$array_ref, WHERE=>$string, DISTINCT=>$bool, LIMIT=>$integer)

Takes an array reference of column names to select from an array reference of table names used in the FROM clause, a string WHERE clause, and a boolean DISTINCT flag and returns a SQL string to fetch the requested data. A LIMIT clause will be added if the LIMIT parameter is specified.

Return value: an SQL string that is specific to the driver for $dbh (currently this is only implemented for PostgeSQL).

lookup_usf_id($dbh,$sequence_feature_name,$approx)

Given a sequence feature name, this function returns the primary key in the DB for that sequence feature.

If the optional $approx flag is specified, then lookup_usf_id() will use wildcarding around $sequenc_feature_name so the name does not have to be an exact match.

assert_dbh($dbh)

This method ensures that $dbh is a valid instance of class Bio::Genex::Connect, otherwise it triggers an exception.

assert_table_defined($dbh,$table_name)

This ensures that $table_name exists in $dbh, otherwise it triggers an exception. This method caches its results for all tables it finds for $dbh. This significantly speeds up all future calls for all tables.

lookup_contact_id($dbh,$contact_name,$approx)

Given a contact person name, this function returns the primary key in the DB for that contact.

If the optional $approx flag is specified, then lookup_contact_id() will use wildcarding around $contact_name so the name does not have to be an exact match.

lookup_software_id($dbh,$software_name,$approx)

Given a software name, this function returns the primary key in the DB for that software.

If the optional $approx flag is specified, then lookup_software_id() will use wildcarding around $software_name so the name does not have to be an exact match.

lookup_species_id($dbh,$species_name,$approx)

Given a primary scientific name, this function returns the primary key in the DB for that species.

If the optional $approx flag is specified, then lookup_species_id() will use wildcarding around $species_name so the name does not have to be an exact match.

lookup_experiment_id($dbh,$experiment_name)

Given an experiment set name, this function returns the primary key in the DB for that experiment set.

If the optional $approx flag is specified, then lookup_experiment_id() will use wildcarding around $experiment_name so the name does not have to be an exact match.

lookup_id($dbh,$table,$attribute,$pk_name,$value,$approx)

Retrieves the primary key ($pk_name) for a row in $table whose $attribute column is $value).

The optional $approx flag species that approximate mathcing using an SQL 'LIKE' clause should used instead of exact matching using '='.

fetch_last_id($dbh,$table)

Retrieves the primary key of the last row inserted into $table.

check_password($dbh,$user_name,$password)

Checks whether there exists an entry in the UserSec table whose 'login' == $user_name and 'password' == $password.

Return value: On Success: the primary key of the UserSec entry On Failure: undef

fetch_es_ids()

Retrieves the primary keys of all experiment sets in the DB.

fetch_am_ids($es_pk, @am_types)

Retrieves the primary keys of all array measurements belonging to experiment set $es_pk.

The optional @am_types list is used to filter the arrays to a given type. There is no default value for this parameter.

Return Value: list of ArrayMeasurement ids or undef if none.

fetch_spot_data($es_pk,@am_types)

Fetches all spot data from AM_Spots for all arrays of ExperimentSet $es_pk.

The optional @am_types list will restrict the retrieval to only arrays of the specified types, the default value of @am_types is ('derived ratio').

Return value: an instance of Bio::Genex::ExperimentSet

Example use:

use Bio::Genex::DBUtils qw(fetch_spot_data, fetch_es_ids)
my @es_ids = fetch_es_ids();
foreach my $es_pk (@es_ids) {
  my $es = fetch_spot_data($es_pk);
  my @array_list = $es->arraymeasurement_fk();
  foreach my $array (@array_list) {
    my $spot_list_ref = $array->am_spots_fk();

    # locate the column for the spot values
    my @header = shift @{$spot_list_ref};
    my $spot_col;
    my $usf_col;
    for (my $i=0;$i<scalar @header;$i++) {
      $spot_col = $i if $header[$i] =~ /spot_value/;
      $usf_col = $i if $header[$i] =~ /usf_fk/;
    }
    die "Couldn't find columns for " . $array->name() 
      unless defined $spot_col && defined $usf_col;
    
    foreach my $spot (@{$spot_list_ref}) {
      print "$spot->[$usf_col]\t$spot->[$spot_col]";
    }
  }
}
output_spot_data($file_handle, $es_pk, @am_types)

Writes all of the spot data belonging to experiement set $es_pk, to $file_handle.

The optional @am_types list is used to filter the spot data belonging to only arrays of the given types. The default value is ('derived ratio').

fetch_fkey_list(dbh=>$dbh,calling_class=>$table_name,class_to_fetch=>$table_name,column_to_fetch=>$column_name,value_to_fetch=>$pkey)

Parameters: dbh: a database handle; calling_class: the name of the class requesting the data; class_to_fetch the name of the table to lookup the data from; column_to_fetch: the name of foreign key in class_to_fetch that refers to calling_class; value_to_fetch: the value of column_to_fetch to use in the WHERE clause.

Control Variables: Setting $Bio::Genex::DBUtils::NO_FETCH_DISTINCT instructs fetch_fkey_list to not use the DISTINCT SQL keyword in the query (the default is to use DISTINCT. Setting $Bio::Genex::DBUtils::NO_FETCH_ALL instructs the method to not set the fetch_all parameter when calling new() on each object (the default is to use fetch_all).

fetch_user_al_ids($us_fk)

Find all ArrayLayouts for a given user.

Return Value: a list of ArrayLayout ids or undef if none.

fetch_es_species($es_pk)

Parameters: $es_pk, the primary key for the ExperimentSet whose species are being queried.

Return value: an instance of class Bio::Genex::Species.

AUTHOR

Jason Stewart (jes@ncgr.org), and Peter Hraber (pth@ncgr.org)

SEE ALSO

perl(1).