The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.

NAME

File::Misc -- handy file tools

Description

File::Misc provides a variety of utilities for working with files. These utilities provides tools for reading in, writing out to, and getting information about files.

SYNOPSIS

# slurp in the contents of a file
$var = slurp('myfile.txt');

# spit content into a file
spit 'myfile.txt', $var;

# get the lines in a file as an array
@arr = file_lines('myfile.txt');

# get a list of all the files in a directory
@arr = files('/my/dir');

# ensure a file is deleted - if it is already deleted return success
ensure_unlink('myfile.txt');

# ensure a file exists, update its date to now
touch('myfile.txt');

# many others

INSTALLATION

File::Misc can be installed with the usual routine:

perl Makefile.PL
make
make test
make install

FUNCTIONS

slurp

Returns the contents of the given file

$var = slurp('myfile.txt');

option: max

Sets the maximum amount in bytes to slurp in. By default the maximums is 100k.

# set maximum to 1k
$var = slurp('myfile.txt', max=>1024);

Set max to 0 to set no maximum.

option: firstline

If true, only slurp in the first line.

$line = slurp('myfile.txt', firstline=>1);

options: stdout, stderr

If the stdout option is true, then the contents of the file are sent to STDOUT and are not saved as a scalar at all. slurp returns true.

slurp('myfile.txt', stdout=>1);

The stderr option works the same way except that contents are sent to STDERR. Both options can be set to true, and contents will be sent to both STDOUT and STDERR.

spit

The opposite of slurp(), spit() outputs the given string(s) to the given file in a single command. In its simplest form, spit takes a file path, then one or more strings. Those strings are concatenated together and output the given path. So, the following code outputs "hello world" to /tmp/myfile.txt.

spit('/tmp/myfile.txt', 'hello world');

If you want to append to the file (if it exists) then the first param should be a hashref, with 'path' set to the path to the file and 'append' set to true, like as follows.

spit(
  {path=>'/tmp/myfile.txt', append=>1},
  'hello world'
);

file_lines

Cfile_lines> returns the contents of one or more files as an array. Newlines are stripped off the end of each line. So, for example, the following code would the lines from buffer.txt:

@lines = file_lines('buffer.txt');

If the first param is an arrayref, then every file in the array is read. So, the following code returns lines from buffer.txt and data.txt.

@lines = file_lines(['buffer.txt', 'data.txt']);

option: max

max sets the maximum number of lines to return. So, the following code indicates to send no more than 100 lines.

@lines = file_lines('buffer.txt', max=>100);

option: quiet

If the quiet option is true, then file_lines does not croak on error. For example:

@lines = file_lines('buffer.txt', quiet=>1);

option: skip_empty

If skip_empty is true, then empty lines are not returned. Note that a line with just spaces or tabs is considered empty.

@lines = file_lines('buffer.txt', skip_empty=>1);

size

Returns the size of the given file. If the file doesn't exist, returns undef.

mod_time

Returns the modification time (in epoch seconds) of the given file. If the file doesn't exist, returns undef.

print 'modification time: ', mod_time('myfile.txt'), "\n";

If you are familiar with the stat() function, then it may clarify to know that mod_time simply returns the ninth element of stat().

mod_date

mod_date does exactly the same thing as mod_time.

age

age() returns the number of seconds since the given file has been modifed.

print 'file age: ', age('myfile.txt'), "\n";

age() simply returns the current time minus the value of mod_time.

files

files returns an array of file paths starting at the given directory. In its simplest use, files is called with just a directory path.

@myfiles = files('./tmp');

That command will return all files within ./tmp, including recursing into nested directories. By default, all paths will be relative to the current directory, so the file list mught look something like this:

./tmp/buffer.txt
./tmp/build
./tmp/build/myfile.txt

You can get just the file names with the full_path option, described below.

Note that the

files has several options, explained below.

option: recurse

By default, files recurses directory structures.

option: dirs

option: files

option: full_path

option: extensions

option: follow_links

search_inc

search_inc() searches the @INC directories for a given file and returns the full path to that file. For example, this command:

search_inc('JSON/Tiny.pm')

might return somethng like this:

/usr/local/share/perl/5.18.2/JSON/Tiny.pm

The given path must be the full path within the @INC directory. So, for example, this command would not return the path to JSON/Tiny.pm:

search_inc('Tiny.pm')

That feature might be added later.

If you prefer, you can give the path in Perl module format:

search_inc('JSON::Tiny')

script_dir

Returns the directory of the script. The directory is relative the current directory when the script was called. Call this command before altering $0.

mode

mode() returns the file mode (i.e. type and permissions) of the given path.

tmp_path

tmp_path() is for the situation where you want to create a temporary file, then have that file automatically deleted at the end of your code or code block.

tmp_path() returns a File::Misc::Tmp::Path object. That object stringifies to a random path. When the object goes out of scope, the file, if it exists, is deleted. tmp_path() does not create the file, it just deletes the file if the file exists.

tmp_path() takes one required param: the directory in which the file will go. Here's a simple example:

# variables
my ($tmp, $fh);

# get temporary path: file is NOT created
$tmp = tmp_path('./');

# open a file handle, write stuff to the file, close the handle
$fh = FileHandle->new("> $tmp") or die $!;
print $fh "stuff\n";
undef $fh;

# do something that might cause a crash
# if there is a crash, $tmp goes out of scope and deletes the file
if ( it_could_happen() ) {
  die 'crash!';
}

# move the file somewhere else
rename($tmp, './permanent') or die $!;

# the file doesn't exist anymore, so when $tmp object
# goes out of scope, nothing happens

By default, the path consists of the given directory followed by a random string of four characters. So in the example above, the path would look something like this:

./fZ96

No effort is made to ensure that there isn't already a file with that name. It is simply assumed that four characters is enough to assure a microscopic (but non-zero) chance of a name conflict.

Note that File::Temp provides a similar functionality, but there is an important difference. File::Temp creates the temporay file and returns a file handle for that file. This is useful for situations where you want to cache data for use in the current scope. It gets a little trickier, however, if you want to close the file handle and move the temporary file to a permanent location. tmp_path simply gives you a path that will be deleted if the file exists, allowing you manipulate and move the file as you like. File::Temp also goes to some effort to ensure that there are no name conflicts. What you use is a matter of needs and taste.

option: rand_length

By default the random string is 4 characters long. rand_length gives a different length to the string. So, for example, the following code indicates a random string length of 8:

$tmp = tmp_path('./', rand_length=>8);

That produces a string like this:

./JQd4P6W7

option: auto_delete

If the auto_delete option is sent and is false, then the file is not actually deleted when the tmp object goes out of scope. For example:

$tmp = tmp_path('./', auto_delete=>0);

This option might seem to defeat the purpose of tmp_path, but it's useful for debugging your code. By setting the object so that it doesn't automatically delete the file you can look at the contents of the file later to see if it actually contains what you thought it should.

option: extension

extension allows you to give the path a file extension. For example, the following code creates a path that ends with '.txt'.

$tmp = tmp_path('./', extension=>'txt');

option: prefix

prefix indicates a string that should be put after the directory name but before the random string. So, for example, the following code puts the prefix "build-" in the file name:

$tmp = tmp_path('./', prefix=>'build-');

giving us something like

./build-J3v1

tmp_dir

tmp_dir() creates a temporary directory and returns a File::Misc::Tmp::Dir object. When the object goes out of scope, the directory is deleted.

TERMS AND CONDITIONS

Copyright (c) 2012-2016 by Miko O'Sullivan. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. This software comes with NO WARRANTY of any kind.

AUTHOR

Miko O'Sullivan miko@idocs.com

VERSION

Version 0.10.

HISTORY

Version 0.10, Sep 7, 2016

Initial release