NAME
BioUtil::Util - Utilities for operating on data and files
Great modules such as BioPerl provide many robust solutions, but they are not easy to install on some platforms, and for simple scripts a lightweight module may be a better choice. So I reinvented some wheels and added some useful utilities to this module, hoping it will be helpful.
VERSION
Version 2015.0105
EXPORT
getopt
file_list_from_argv
get_file_list
delete_string_elements_by_indexes
delete_array_elements_by_indexes
extract_parameters_from_string
get_parameters_from_file
get_list_from_file
get_column_data
read_json_file
write_json_file
run
readable_second
check_positive_integer
filename_prefix
check_all_files_exist
check_in_out_dir
rm_and_mkdir
run_time
SYNOPSIS
use BioUtil::Util;
SUBROUTINES/METHODS
getopt
A simple getopt written for my own use.
Example: the arguments -a b -c t tt -d bb -dbtype asdfafd -test are parsed as:
-a: b
-c: ARRAY(0xee25e8)
-d: bb
-dbtype: asdfafd
-infmt: fasta
-test: 1
file_list_from_argv
Get the file list from @ARGV. You should call this after parsing options!
When no arguments are given, 'STDIN' is added to the list, so that it can be passed on to, e.g., FastaReader.
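The fallback to 'STDIN' can be sketched in plain Perl. This is only an illustration under assumptions, not the module's implementation: the helper name file_list_sketch is hypothetical, and returning an array ref is assumed.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioUtil::Util: returns the remaining
# command-line arguments, or ['STDIN'] when there are none.
sub file_list_sketch {
    my @args = @_;
    return @args ? [@args] : ['STDIN'];
}

# Call it only after option parsing has removed the options from @ARGV.
my $files = file_list_sketch(@ARGV);
print "$_\n" for @$files;
```

Calling it before option parsing would treat the options themselves as file names, which is why the order matters.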
get_file_list
Find files or directories using a custom filter. A maximum search depth can be specified.
Example (searching perl scripts)
my $dir   = "~";
my $depth = 2;
my $list  = get_file_list(
    $dir,
    sub {
        if ( -d or /^\./ ) {    # ignore directories and hidden (configuration) files
            return 0;
        }
        if (/\.p[lm]$/i) {      # keep only names ending in .pl or .pm
            return 1;
        }
        return 0;
    },
    $depth
);
print "$_\n" for @$list;
delete_string_elements_by_indexes
Delete string elements by indexes. It uses delete_array_elements_by_indexes internally.
delete_array_elements_by_indexes
Delete array elements by given indexes.
Example:
my @list = qw(a b c d e f);
my @idx = (1, 2, 4);
my $list2 = delete_array_elements_by_indexes(\@list, \@idx);
print "@$list2\n"; # prints: a d f
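One way such a deletion could work is to mark the unwanted positions in a hash and keep the rest. The sketch below is a stand-alone illustration; the name delete_by_indexes_sketch is hypothetical and the module's own code may differ.

```perl
use strict;
use warnings;

# Hypothetical re-implementation for illustration only.
sub delete_by_indexes_sketch {
    my ( $list, $indexes ) = @_;
    my %drop = map { $_ => 1 } @$indexes;    # positions to remove
    # Keep every position that is not marked for removal.
    return [ map { $list->[$_] } grep { !$drop{$_} } 0 .. $#$list ];
}

my @list = qw(a b c d e f);
my @idx  = ( 1, 2, 4 );
my $kept = delete_by_indexes_sketch( \@list, \@idx );
print "@$kept\n";    # prints: a d f
```

Building a new list avoids the index shifting that repeated splice calls would cause, and the original array is left unchanged.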
extract_parameters_from_string
Extract parameters from string.
The regular expression is
/([\w\d\_\-\.]+)\s*=\s*([^\=;]*)[\s;]*/
Example:
# bad format, but could also be parsed
# my $s = " s = b; a=test; b_c=12 3; a.b =; b
# = asdf
# sd; ads-f = 12313";
# recommended
my $s = "key1=abcde; key2=123; conf.a=file; conf.b=12; ";
my $pa = extract_parameters_from_string($s);
print "=$_:$$pa{$_}=\n" for sort keys %$pa;
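To see what the pattern captures, it can be applied directly with the /g modifier. The loop below is a minimal sketch that uses the documented regular expression on the recommended input; collecting the pairs into a hash named %param is an assumption of this example, not the module's code.

```perl
use strict;
use warnings;

my $s = "key1=abcde; key2=123; conf.a=file; conf.b=12; ";

# Apply the documented pattern repeatedly with /g and collect key/value pairs.
my %param;
while ( $s =~ /([\w\d\_\-\.]+)\s*=\s*([^\=;]*)[\s;]*/g ) {
    $param{$1} = $2;
}
print "$_ => $param{$_}\n" for sort keys %param;
```

Because the value part is [^\=;]*, a value runs up to the next semicolon, so pairs need to be separated by semicolons to be split correctly.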
get_parameters_from_file
Get parameters from a file. Comments starting with # are allowed in the file.
Example:
my $pa = get_parameters_from_file("d.txt");
print "$_: $$pa{$_}\n" for sort keys %$pa;
For a file with content:
# cell phone
apple = 1 # note
nokia = 2 #
output is:
apple: 1
nokia: 2
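The comment handling shown above can be sketched as follows. This is a hypothetical parser, not the module's code: it assumes one "key = value" pair per line and that everything from # to the end of the line is a comment. An in-memory filehandle stands in for d.txt.

```perl
use strict;
use warnings;

# Hypothetical parser for illustration; reads "key = value" lines and
# drops everything from '#' to the end of the line.
sub parse_parameters_sketch {
    my ($fh) = @_;
    my %param;
    while ( my $line = <$fh> ) {
        $line =~ s/#.*//;    # strip comments
        next unless $line =~ /^\s*(\S+)\s*=\s*(.*?)\s*$/;
        $param{$1} = $2;
    }
    return \%param;
}

# An in-memory filehandle stands in for d.txt here.
my $content = "# cell phone\napple = 1 # note\nnokia = 2 #\n";
open my $fh, '<', \$content or die "cannot open in-memory file: $!";
my $pa = parse_parameters_sketch($fh);
close $fh;
print "$_: $$pa{$_}\n" for sort keys %$pa;
```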
get_list_from_file
Get a list from a file. Comments starting with # are allowed in the file.
Example:
my $list = get_list_from_file("d.txt");
print "$_\n" for @$list;
For a file with content:
# cell phone
apple # note
nokia
output is:
apple
nokia
get_column_data
Get one column of a file.
Example:
my $list = get_column_data("d.txt", 2);
print "$_\n" for @$list;
read_json_file
Read a JSON file and decode it into a hash ref.
Example:
my $hashref = read_json_file($file);
write_json_file
Encode a hash ref as JSON and write it to a file.
Example:
my $hashref = { "a" => 1, "b" => 2 };
write_json_file($hashref, $file);
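A write followed by a read can be sketched with core modules only. The example below uses JSON::PP and File::Temp as stand-ins; which JSON backend the module itself uses is not stated here, so treat the choice as an assumption.

```perl
use strict;
use warnings;
use JSON::PP;
use File::Temp qw(tempfile);

my $hashref = { "a" => 1, "b" => 2 };
my ( $fh, $file ) = tempfile( UNLINK => 1 );

# Encode the hash ref and write it out.
print {$fh} JSON::PP->new->canonical->encode($hashref);
close $fh or die "cannot close $file: $!";

# Read the file back and decode it into a hash ref.
open my $in, '<', $file or die "cannot open $file: $!";
my $text = do { local $/; <$in> };
close $in;
my $decoded = JSON::PP->new->decode($text);
print "$_ => $decoded->{$_}\n" for sort keys %$decoded;
```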
run
Run a command. As the example shows, the return value is true if the command failed.
Example:
my $fail = run($cmd);
die "failed to run:$cmd\n" if $fail;
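A wrapper with this calling convention could be built on the built-in system function. The sketch below is an assumption about how such a wrapper might look, not the module's code; the name run_sketch is hypothetical, and $^X (the running perl binary) is used only to get a portable test command.

```perl
use strict;
use warnings;

# Hypothetical wrapper: returns 0 on success and 1 on any failure,
# including a command that could not be started.
sub run_sketch {
    my ($cmd) = @_;
    return system($cmd) == 0 ? 0 : 1;
}

my $cmd  = qq{"$^X" -e "exit 0"};
my $fail = run_sketch($cmd);
die "failed to run: $cmd\n" if $fail;
```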
readable_second
Convert a number of seconds into a human-readable string.
Example:
print readable_second(11312314),"\n"; # 130 day 22 hour 18 min 34 sec
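The arithmetic behind the example can be checked with a short sketch. The helper readable_second_sketch is hypothetical; how the module formats other inputs (for instance values below one day) is not documented here and may differ.

```perl
use strict;
use warnings;

# Hypothetical helper showing the arithmetic only.
sub readable_second_sketch {
    my ($sec) = @_;
    my $day  = int( $sec / 86400 ); $sec %= 86400;    # 86400 s per day
    my $hour = int( $sec / 3600 );  $sec %= 3600;
    my $min  = int( $sec / 60 );    $sec %= 60;
    return "$day day $hour hour $min min $sec sec";
}

print readable_second_sketch(11312314), "\n";    # 130 day 22 hour 18 min 34 sec
```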
check_positive_integer
Check whether the argument is a positive integer.
Example:
check_positive_integer(1);
filename_prefix
Get filename prefix
Example:
filename_prefix("test.fa"); # "test"
filename_prefix("tmp"); # "tmp"
check_all_files_exist
Check whether all the given files exist.
check_in_out_dir
Check the input and output directories.
Example:
check_in_out_dir("~/dir", "~/dir.out");
rm_and_mkdir
Make a directory, removing it first if it already exists.
Example:
rm_and_mkdir("out");
run_time
Run a subroutine with given arguments N times, and return the mean and stdev of time.
Example:
my $read_by_record = sub {
    my ($file) = @_;
    my $next_seq = FastaReader($file);
    while ( my $fa = $next_seq->() ) {
        my ( $header, $seq ) = @$fa;
        # print ">$header\n$seq\n";
    }
};
my ( $mean, $stdev ) = run_time( 8, $read_by_record, $file );
printf STDERR "\n## Compute time: %0.03f ± %0.03f s\n\n", $mean, $stdev;
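The timing statistics can be sketched with Time::HiRes. This is an illustration under assumptions, not the module's code: the name run_time_sketch is hypothetical, wall-clock time is measured, and the sample standard deviation (dividing by N - 1) is assumed.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical sketch: runs $code N times and returns the mean and the
# sample standard deviation of the elapsed wall-clock time in seconds.
sub run_time_sketch {
    my ( $n, $code, @args ) = @_;
    my @t;
    for ( 1 .. $n ) {
        my $t0 = Time::HiRes::time();
        $code->(@args);
        push @t, Time::HiRes::time() - $t0;
    }
    my $mean = 0;
    $mean += $_ / $n for @t;
    my $var = 0;
    $var += ( $_ - $mean )**2 for @t;
    my $stdev = $n > 1 ? sqrt( $var / ( $n - 1 ) ) : 0;
    return ( $mean, $stdev );
}

# Time a small summation loop five times.
my $work = sub { my ($max) = @_; my $x = 0; $x += $_ for 1 .. $max; return $x };
my ( $mean, $stdev ) = run_time_sketch( 5, $work, 10_000 );
printf STDERR "mean %.6f s, stdev %.6f s\n", $mean, $stdev;
```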