NAME
Doit::File - commands for file creation
SYNOPSIS
use Doit;
my $doit = Doit->init;
$doit->add_component('file');
$doit->file_atomic_write('/path/to/file', sub {
    my $fh = shift;
    print $fh "Hello, world!\n";
});
$doit->file_atomic_write('/path/to/file', sub {
    my($fh, $filename) = @_;
    $doit->system("a_system_cmd > $filename");
}, tmpsuffix => '.my.tmp');
DESCRIPTION
Doit::File is a Doit component providing methods for file creation. It has to be added to a script using Doit's add_component:
$doit->add_component('file');
DOIT COMMANDS
The following commands are added to the Doit runner object:
file_atomic_write
$doit->file_atomic_write($filename, $code, optkey => optval ...)
Create a file $filename using a code reference $code in an atomic way. This command creates a temporary file first which is used for the writes. If no errors happen during the code execution (i.e. no perl-level exceptions when running the code reference, and no system errors like "no space left on device"), then the temporary file is renamed atomically to the final destination.
The code reference takes two parameters: a filehandle to write to and the filename of the temporary file. Don't close the filehandle; file_atomic_write takes care of closing it (and fails if something goes wrong).
The command returns true, as the file is normally always written; the only exception is the "check_change" option described below.
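The write-to-a-temporary-file-then-rename approach described above can be sketched with core modules only. This is a minimal illustration of the technique, not Doit's actual implementation: the helper name atomic_write_sketch is hypothetical, and permission handling, tmpdir and check_change are left out.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempdir);

# Hypothetical helper illustrating the technique; not part of Doit.
sub atomic_write_sketch {
    my($dest, $code) = @_;
    # Create the temporary file next to the destination, so that
    # rename() stays within one file system and is therefore atomic.
    my $tmp = File::Temp->new(
        DIR    => dirname($dest),
        SUFFIX => '.tmp',
        UNLINK => 1,    # clean up automatically if we bail out early
    );
    my $ok = eval {
        $code->($tmp, $tmp->filename);
        close $tmp or die "close failed: $!\n";
        1;
    };
    die "write failed, destination left untouched: $@" if !$ok;
    rename $tmp->filename, $dest or die "rename failed: $!\n";
    $tmp->unlink_on_destroy(0);    # the temporary name is gone now
    return 1;
}

my $dir  = tempdir(CLEANUP => 1);
my $dest = "$dir/greeting.txt";
atomic_write_sketch($dest, sub {
    my $fh = shift;
    print $fh "Hello, world!\n";
});

# A failing writer must not disturb the existing file.
my $failed = !eval {
    atomic_write_sketch($dest, sub { die "simulated failure\n" });
    1;
};
```

Readers of the destination path see either the complete old content or the complete new content, never a partially written file.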
Options are specified as named parameters. Possible options are:
tmpdir => directory
Change the directory where the temporary file is created. Default is the same directory as the final destination. The directory must already exist. Change this if a system cannot tolerate the existence of stray temporary files, and setting the tmpsuffix option (see below) does not help. Example:
$doit->file_atomic_write('/path/to/filename', \&writer, tmpdir => '/tmp');
NOTE: if the temporary file is created on a different file system than the final destination file, then the final rename is not atomic (i.e. File::Copy::move is used instead of rename). For example, on many systems /tmp is located on a different file system (root fs or a special tmpfs) than the rest of the system.
tmpsuffix => suffix
Change the suffix used for the temporary file. Default is .tmp. Change the suffix if a system tolerates stray temporary files only when they carry a special suffix. For example, in a directory controlled by Debian's run-parts(8) program it can help to use .dpkg-tmp as the temporary file suffix.
mode => mode
Set the permissions of the final destination file, using the "chmod" in perlfunc syntax. If not used, then the permissions are those of a newly created non-executable file, which usually takes "umask" in perlfunc into account.
check_change => bool
If set to a true value, then two things are done:
- The old file and the newly created temporary file are compared, and if there is no difference, the old file is left untouched.
- If the old file was not changed, the command returns a false value.
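The following sketch shows how the mode and check_change options might be combined so that a script can react only when a file really changed. The scratch directory, the file name app.conf and the file content are made up for this example; the return values in the comments are what the option descriptions above lead one to expect.

```perl
use strict;
use warnings;
use Doit;
use File::Temp qw(tempdir);

my $doit = Doit->init;
$doit->add_component('file');

# Scratch directory and file name are made up for this example.
my $dir    = tempdir(CLEANUP => 1);
my $config = "$dir/app.conf";

my $writer = sub {
    my $fh = shift;
    print $fh "key = value\n";
};

# First call: the file does not exist yet, so it should be created
# and a true value returned.
my $first = $doit->file_atomic_write($config, $writer,
    mode => 0600, check_change => 1);

# Second call with identical content: the existing file should be
# left untouched and a false value returned.
my $second = $doit->file_atomic_write($config, $writer,
    mode => 0600, check_change => 1);

print "first run changed the file\n"         if $first;
print "second run left the file untouched\n" if !$second;
```

A typical use of the return value is to restart or reload a service only if its configuration file was actually rewritten.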
NOTES
The temporary file creation is done using File::Temp. The temporary files are removed as early as possible, even in the case of exceptions. Leftovers may only happen if the script is killed from outside (e.g. SIGINT, SIGKILL...). A possible solution to prevent leftovers on CTRL-C is to define a signal handler like this:
$SIG{INT} = sub { exit };
Some of File::Temp's standard behaviors are changed here: mode is different (see above), and EXLOCK is not set.
If the tmpdir option is used, then the group of the created temporary file may differ from the group it would have if it were created in the final destination directory (e.g. in presence of setgid bits, or on BSD systems if the directory belongs to another group). Some precautions are taken to fix this, at least for some common operating systems, but this is far from perfect.
While this function is running, two versions of the file exist simultaneously on the disk: the old version (if it's an overwrite) and the new version. This may be problematic if the files are large and disk space is limited.
The tmpdir option may be set to /dev/full; in this case /dev/full itself is used as the temporary file. This is only useful in test scripts for testing the ENOSPC case (see errno(3) or errno(2)).
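The reason /dev/full is useful for such tests can be seen with plain perl, without Doit. The sketch below assumes a system that provides /dev/full (e.g. Linux) and is skipped elsewhere. Small writes usually only fill perl's output buffer, so the error typically shows up when the filehandle is closed.

```perl
use strict;
use warnings;

my $result = 'skipped: /dev/full not available';
if (-c '/dev/full' && open(my $fh, '>', '/dev/full')) {
    # The data usually just lands in perl's buffer at this point.
    print $fh "some data\n";
    # Closing flushes the buffer; the flush is expected to fail here.
    if (close $fh) {
        $result = 'close unexpectedly succeeded';
    } else {
        $result = $!{ENOSPC} ? 'close failed with ENOSPC'
                             : "close failed: $!";
    }
}
print "$result\n";
```

This is also why the result of close has to be checked before a temporary file is renamed to its final destination.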
Special handling is implemented for older perls (5.8.x), as a failing close is not correctly detected there.
file_digest_matches (informational command)
my $bool = $doit->file_digest_matches($file, $digest[, $digest_algorithm[, got_digest => \$got_digest]]);
Return true if the given $file exists and its current digest (using algorithm $digest_algorithm, which defaults to MD5) matches the specified $digest; otherwise return false. Fail if an unhandled $digest_algorithm is specified.
Other possible digest algorithms are listed in Digest. Note that some of the listed algorithms may require additional CPAN modules. Core perl usually has MD5, SHA-1, SHA-256, SHA-384 and SHA-512 available.
If the got_digest option is specified, then it should be a scalar reference, which will be filled with the digest of the file.
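What such a digest comparison amounts to can be sketched with the core Digest module. The helper digest_matches_sketch below is hypothetical and not Doit's code, and it assumes that digests are given as hexadecimal strings.

```perl
use strict;
use warnings;
use Digest;
use File::Temp qw(tempfile);

# Hypothetical helper, not part of Doit. Assumes hex digests.
sub digest_matches_sketch {
    my($file, $expected, $algorithm, $got_ref) = @_;
    $algorithm ||= 'MD5';
    return 0 if !-e $file;
    open my $fh, '<', $file or die "Can't open $file: $!";
    binmode $fh;
    # Digest->new dies if the algorithm is not available.
    my $got = Digest->new($algorithm)->addfile($fh)->hexdigest;
    close $fh;
    $$got_ref = $got if $got_ref;
    return lc($got) eq lc($expected) ? 1 : 0;
}

# Create a small sample file to check.
my($tmpfh, $tmpfile) = tempfile(UNLINK => 1);
print $tmpfh "Hello, world!\n";
close $tmpfh or die "close failed: $!";

# Expected digest computed from the known content.
my $expected = Digest->new('SHA-256')->add("Hello, world!\n")->hexdigest;

my $got;
my $ok = digest_matches_sketch($tmpfile, $expected, 'SHA-256', \$got);

# The MD5 digest of the empty string does not match the sample file.
my $bad = digest_matches_sketch($tmpfile,
    'd41d8cd98f00b204e9800998ecf8427e', 'MD5');
```

A check like this is typically used to skip a download or a regeneration step if the file on disk already has the expected content.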
TODO
- Catch signals for temporary file cleanup?
It would be nice if signals like SIGINT were caught and the created temporary files removed. But this would require some framework support from Doit.
- Automatic cleanup of leftover temporary files?
It would be nice to have an option to detect and remove temporary files from earlier runs. The rule could be time-based (.tmp files older than n days), and maybe a check whether the files are actively used (e.g. with lsof(8)) could be done.
- tmpdir option with relative directories?
rsync(1) creates temporary directories named .~tmp~ as subdirectories of the final destination paths. This is useful as atomic rename(2) may always be used in this case. Specification could look like this:
tmpdir => "./.~tmp~"
In this case file_atomic_write should take care of temporary directory creation and removal (but what if there are multiple simultaneous writers?).
- Use invisible temporary files
On recent Linux kernels it's possible to open(2) a file with the O_TMPFILE flag, which creates an invisible file, and to make it "visible" using linkat(2) (see https://stackoverflow.com/questions/4171713/relinking-an-anonymous-unlinked-but-open-file). It would be nice if file_atomic_write offered such a solution.
- file_atomic_directory?
What could an implementation of atomic directory writes look like?
AUTHOR
Slaven Rezic <srezic@cpan.org>
COPYRIGHT
Copyright (c) 2017,2018,2021,2023 Slaven Rezic. All rights reserved. This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.