NAME
Bio::ToolBox::Data::core - Common functions for the Bio::ToolBox::Data family
DESCRIPTION
Common methods for metadata handling and table manipulation, shared by the Bio::ToolBox::Data data table and the Bio::ToolBox::Data::Stream file stream. This module should not be used directly. See the respective modules for more information.
METHODS
For quick reference only. Please see Bio::ToolBox::Data for implementation.
- new
-
Generate new object. Used as a common base for Bio::ToolBox::Data and Bio::ToolBox::Data::Stream.
- verify
-
Verify the integrity of the Data object. Checks multiple things, including metadata, table integrity (consistent number of rows and columns), and special file format structure.
- open_database
-
This is a wrapper method that tries to do the right thing and passes on to either the "open_meta_database" or the "open_new_database" method. Basically a legacy method for "open_meta_database".
- open_meta_database
-
Open the database that is listed in the metadata. Returns the database connection. Pass a true value to force a new database connection to be opened, rather than returning a cached connection object (useful when forking).
- open_new_database
-
Convenience method for opening a second or new database that is not specified in the metadata, useful for data collection. This is a shortcut to "open_db_connection" in Bio::ToolBox::db_helper. Pass the database name.
- verify_dataset
-
Verifies the existence of a dataset or data file before collecting data from it. Multiple datasets may be verified. This is a convenience method to "verify_or_request_feature_types" in Bio::ToolBox::db_helper. Pass the name of the dataset to verify.
- delete_column
-
Delete one or more columns in a data table. Pass a list of the indices to delete.
- reorder_column
-
Reorder the columns in a data table. Allows for skipping (deleting) and duplicating columns. Pass a list of the new index order.
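The skipping and duplicating behavior described for reorder_column can be pictured with a plain Perl array slice. This is only a sketch under stated assumptions: the @table and @order variables are hypothetical stand-ins, the indices are ordinary 0-based Perl array indices, and nothing here is taken from the module's own implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a data table: an array of rows, header row first.
my @table = (
    [ 'Chromosome', 'Start', 'Stop', 'Score' ],
    [ 'chr1',       100,     200,    0.5 ],
    [ 'chr2',       300,     400,    1.2 ],
);

# New index order (an assumption for illustration):
# swap Start and Stop, drop Score (index 3), and repeat Chromosome (index 0).
my @order = ( 0, 2, 1, 0 );

# An array slice over each row reorders, skips, and duplicates in one step.
my @reordered = map { [ @{$_}[@order] ] } @table;

print join( "\t", @{$_} ), "\n" for @reordered;
```

An index left out of the list is dropped from every row, and an index listed twice is copied, which is the same contract the method description gives.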
- feature
-
Returns or sets the string of the feature name listed in the metadata.
- feature_type
-
Returns "named", "coordinate", or "unknown" based on what kind of feature is present in the data table.
- program
-
Returns or sets the program string in the metadata.
- database
-
Returns or sets the name of the database in the metadata.
- bam_adapter
-
Returns or sets the short name of the BAM adapter being used: "sam" or "hts".
- big_adapter
-
Returns or sets the short name of the bigWig and bigBed adapter being used: "ucsc" or "big".
- format
-
Returns a text string describing the format of the file contents, such as gff3, gtf, bed, genePred, narrowPeak, etc.
- gff
-
Returns or sets the GFF version value in the metadata.
- bed
-
Returns or sets the number of BED columns in the metadata.
- ucsc
-
Returns or sets the number of columns in a UCSC-type file format, including genePred and refFlat.
- vcf
-
Returns or sets the VCF version value in the metadata.
- number_columns
-
Returns the number of columns in the data table.
- number_rows
-
Returns the number of rows in the data table.
- last_column
-
Returns the array index of the last column in the data table.
- last_row
-
Returns the array index of the last row in the data table.
- filename
-
Returns the complete filename listed in the metadata.
- basename
-
Returns the base name of the filename listed in the metadata.
- path
-
Returns the path portion of the filename listed in the metadata.
- extension
-
Returns the recognized extension of the filename listed in the metadata.
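The relationship between filename, basename, path, and extension can be sketched with the core File::Basename module. This is an illustration only: the file name and the list of recognized extensions are hypothetical, the sketch assumes the base name excludes the extension, and it is not claimed to be how the module itself splits file names.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical file name and hypothetical list of recognized extensions.
my $filename   = '/data/project/peaks.bed.gz';
my @extensions = ( '.bed.gz', '.bed', '.gff3.gz', '.gff3', '.txt' );

# fileparse treats suffixes as regular expressions, so escape the dots.
my @patterns = map { quotemeta } @extensions;

# Returns the name without the suffix, the directory, and the matched suffix.
my ( $basename, $path, $extension ) = fileparse( $filename, @patterns );

print "path:      $path\n";
print "basename:  $basename\n";
print "extension: $extension\n";
```

Listing compound extensions such as .bed.gz lets a compressed file keep its format in the extension rather than in the base name.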
- comments
-
Returns an array of comment lines present in the metadata.
- add_comment
-
Adds a string to the list of comments to be included in the metadata.
- delete_comment
-
Deletes the indicated array index from the metadata comments array.
- vcf_headers
-
Partially parses VCF metadata header lines into a hash structure.
- rewrite_vcf_headers
-
Rewrites the VCF headers back into the metadata comments array.
- list_columns
-
Returns an array of the column names.
- name
-
Returns or sets the name of the column. Pass the index, and optionally a new name.
- metadata($index, $key)
-
Returns or sets the metadata key/value pair for a specific column. Pass the index, key, and optionally a new value.
- delete_metadata
-
Deletes the metadata key for a column. Pass the index and key.
- copy_metadata
-
Copies the metadata values from one column to another column. Pass the source and target indices.
- find_column
-
Returns the column index for the column with the specified name. Name searches are case insensitive and can tolerate a # prefix character. The first match is returned. Pass the name to search.
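The matching rules stated for find_column (case insensitive, tolerant of a leading #, first match wins) can be sketched in plain Perl as follows. The helper find_column_sketch and the column list are hypothetical, the sketch compares whole names after normalization, and it uses ordinary 0-based array indices; the module's actual matching and index base may differ.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration; not the module's implementation.
sub find_column_sketch {
    my ( $names, $query ) = @_;

    # Normalize the query: lower case, with any leading '#' removed.
    ( my $want = lc $query ) =~ s/^#//;

    for my $i ( 0 .. $#{$names} ) {
        ( my $have = lc $names->[$i] ) =~ s/^#//;
        return $i if $have eq $want;    # the first match is returned
    }
    return;    # no matching column
}

# Hypothetical column names; note the '#' prefix and the duplicate 'start'.
my @columns = ( '#Chromosome', 'Start', 'Stop', 'start' );

my $start_index  = find_column_sketch( \@columns, 'START' );
my $chromo_index = find_column_sketch( \@columns, 'chromosome' );
my $missing      = find_column_sketch( \@columns, 'Score' );
```

Because the loop returns on the first hit, a table with two similarly named columns always yields the leftmost one.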
- chromo_column
-
Returns the index of the column that best represents the chromosome column.
- start_column
-
Returns the index of the column that best represents the start, position, or transcription start column.
- stop_column
- end_column
-
Returns the index of the column that best represents the stop or end column.
- strand_column
-
Returns the index of the column that best represents the strand.
- name_column
-
Returns the index of the column that best represents the name.
- type_column
-
Returns the index of the column that best represents the type.
- id_column
-
Returns the index of the column that represents the Primary_ID column used in databases.
- score_column
-
Returns the index of the column that represents the Score column in certain formats, such as GFF, BED, bedGraph, etc.
- zero_start
- interbase
-
Returns true (1) or false (0) depending on whether the coordinate system appears to be an interbase, half-open, or zero-based coordinate system. This is based on file type, e.g. .bed, or on whether the start coordinate column name is start0. The coordinate system can also be explicitly changed by passing an appropriate value; note that this will also change the start coordinate column name as appropriate.
- get_seqfeature
-
Returns the stored SeqFeature object for a given row.
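For the zero_start / interbase entry above, the difference between the two coordinate systems comes down to the start value: a zero-based, half-open interval (as in BED files) and a one-based, inclusive interval share the same end coordinate, while the start differs by one. The following sketch shows that arithmetic; the function names are hypothetical and are not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; not part of Bio::ToolBox.

# Zero-based, half-open (interbase) to one-based, inclusive.
sub interbase_to_base {
    my ( $start0, $end ) = @_;
    return ( $start0 + 1, $end );
}

# One-based, inclusive to zero-based, half-open (interbase).
sub base_to_interbase {
    my ( $start, $end ) = @_;
    return ( $start - 1, $end );
}

# A BED-style interval covering the first ten bases of a sequence.
my ( $start, $end ) = interbase_to_base( 0, 10 );

# The length is the same in either system.
my $length_base      = $end - $start + 1;
my $length_interbase = 10 - 0;

printf "one-based interval %d-%d, length %d\n", $start, $end, $length_base;
```

Only the start coordinate changes between the two conventions, which is consistent with the method renaming just the start column when the coordinate system is switched.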
SEE ALSO
AUTHOR
Timothy J. Parnell, PhD
Dept of Oncological Sciences
Huntsman Cancer Institute
University of Utah
Salt Lake City, UT, 84112
This package is free software; you can redistribute it and/or modify it under the terms of the Artistic License 2.0.