NAME

File::Information - generic module for extracting information from filesystems

VERSION

version v0.05

SYNOPSIS

use File::Information;

my File::Information $instance = File::Information->new(%config);

my File::Information::Base $obj = $instance->for_link($path);
# or:
my File::Information::Base $obj = $instance->for_handle($handle);

my $title                       = $obj->get('title');
my $digest                      = $obj->digest('sha-3-512');

my $result                      = $obj->verify;
my $passed                      = $result->has_passed;

This module provides support to read/write properties of inodes (files) and links in a portable and compact way.

This module collects data from a number of sources such as the file system (ANSI, POSIX, and operating system specific interfaces), .comments/, tagpool, tag databases, and others.

The provided example program file-information-dump dumps all information this module can read for a given file. It also serves as an example of how to interact with the API.

In addition this module also provides a way to verify a file for corruption. See "verify" in File::Information::Base for that.

A notable difference between this module and other similar modules is the use of lifecycles. See "lifecycles" for more information on that.

Note: Future versions of this module will depend on Data::Identifier.

METHODS

new

my File::Information $instance  = File::Information->new(%config);

Creates a new instance that can be used to perform lookups later on.

The following options (all optional) are supported:

extractor

An instance of Data::URIID used to create related objects.

db

An instance of Data::TagDB used to interact with a database.

tagpool_rc

A filename (or list of filenames) of tagpool rc files. Pool locations will be read from those files. Default is to try standard locations. To disable this it is possible to set the option to [].

tagpool_path

A path (or a list of paths) of tagpool directories. This is where a pool is located. Default is to try standard locations. To disable this it is possible to set the option to []. However, to fully disable tagpool support, tagpool_rc also needs to be set to [].

Only valid pools are accepted. Invalid pools are rejected without warning.

device_path

The path (or list of paths) to look for device inodes. This is used as part of filesystem detection. Default is to try a list of standard locations. To disable this it is possible to set the option to [].

This module does not perform recursive searches. Therefore on systems that include paths like /dev/disk those also need to be included for this module to work correctly. It is therefore recommended not to alter this setting.

digest_sizelimit

The size limit (in bytes) of a data block (such as a file) up to which the module will perform hashing. This can be set to 0 to disable hashing. When set to 'infinite' the limit is disabled. The default is suitable for modern machines and will not be less than 16MiB.

digest_unsafe

A digest or a list of digests to be marked as unsafe. See "digest_info" for details. Dies if a digest in the list is unknown (this is for security reasons). This option only allows marking additional digests as unsafe. It does not allow marking digests that are already unsafe as safe again.

mountinfo_path

The path to the mountinfo file. This is a special file on Linux that contains information on mounted filesystems. Defaults to /proc/self/mountinfo. This option has no effect on systems other than Linux.
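As an illustration of the options above, the following is a minimal sketch of constructing an instance with tagpool support fully disabled and a custom hashing limit. The concrete values (64 MiB) are assumptions chosen for the example, not recommendations from this module. The constructor call is wrapped in eval so the sketch also runs where the module is not installed.

```perl
use strict;
use warnings;

# Illustrative configuration; all values are assumptions for this sketch.
my %config = (
    tagpool_rc       => [],                # do not read any tagpool rc files
    tagpool_path     => [],                # together with tagpool_rc => []: tagpool fully disabled
    digest_sizelimit => 64 * 1024 * 1024,  # hash data blocks of up to 64 MiB
);

# Guarded so the sketch degrades gracefully if the module is missing.
my $instance = eval {
    require File::Information;
    File::Information->new(%config);
};

if (defined $instance) {
    print "instance created\n";
} else {
    print "could not create instance: $@";
}
```

device_path is left at its default here, in line with the recommendation above not to alter it.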

for_link

my File::Information::Link $link = $instance->for_link($path);
# or:
my File::Information::Link $link = $instance->for_link(path => $path [, %opts ]);

Creates a new link instance.

The following options are supported:

path

Required if not using the one-argument form. Gives the path (filename) of the link.

Whether to follow symlinks (follow) or not (nofollow; default).
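A minimal sketch of the named-argument form of for_link, assuming a temporary file as the target. The temporary file and the variable names are only there to make the example self-contained. The call is wrapped in eval so the sketch also runs where the module is not installed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small file to point the link object at (illustration only).
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "example\n";
close($fh);

# Named-argument form as documented above; guarded in case the module is missing.
my $link = eval {
    require File::Information;
    File::Information->new->for_link(path => $path);
};

if (defined $link) {
    printf "got a %s for %s\n", ref($link), $path;
} else {
    print "could not create link object: $@";
}
```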

for_handle

my File::Information::Inode $inode = $instance->for_handle($handle);
# or:
my File::Information::Inode $inode = $instance->for_handle(handle => $handle [, %opts ]);

Creates a new inode instance.

The following options are supported:

handle

Required if not using the one-argument form. Gives an open handle to the inode.
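A minimal sketch of for_handle combined with the digest method shown in the SYNOPSIS. The temporary file is an assumption made to keep the example self-contained. Whether a given digest is available depends on the installation, so the whole lookup is wrapped in eval.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a file and keep its handle open (illustration only).
my ($handle, $filename) = tempfile(UNLINK => 1);
binmode($handle);
print {$handle} "example\n";
seek($handle, 0, 0);   # rewind so the data can be read from the start

# Guarded: the module or the digest algorithm may not be available.
my $digest = eval {
    require File::Information;
    my $inode = File::Information->new->for_handle(handle => $handle);
    $inode->digest('sha-3-512');
};

if (defined $digest) {
    print "sha-3-512: $digest\n";
} else {
    print "no digest available: ", ($@ || "no value returned\n");
}

close($handle);
```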

for_identifier

my File::Information::Base $obj = $instance->for_identifier($identifier);
# or:
my File::Information::Base $obj = $instance->for_identifier(uuid => $uuid);
# or:
my File::Information::Base $obj = $instance->for_identifier(type => 'uuid', id => $uuid [, %opts ]);

Note: This is an experimental method. It may be renamed, removed, or changed in any way with future releases.

This method returns an object based on its identifier. This method might return different kinds of objects such as links, inodes, filesystems, or tagpools.

The identifier can be passed as an instance of e.g. Data::Identifier or as a plain UUID. Other types may or may not be supported.

tagpool

my @tagpool = $instance->tagpool;

Returns the list of found tagpools, if any (see File::Information::Tagpool).

Note: There is no order to the returned values. The order may change between any two calls.

extractor

my Data::URIID $extractor = $instance->extractor;

Returns the extractor given via the configuration. Will die if no extractor is available.

db

my Data::TagDB $db = $instance->db;

Returns the database given via the configuration. Will die if no database is available.
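Because both extractor and db die when nothing was configured, callers that treat them as optional may want to guard the calls. The following is a minimal sketch. try_or_undef is a hypothetical helper written for this example and is not part of File::Information.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of File::Information): run a code
# reference and return undef instead of propagating an exception.
sub try_or_undef {
    my ($code) = @_;
    my $value = eval { $code->() };
    return $value;
}

# Guarded so the sketch also runs where the module is not installed.
my $instance = try_or_undef(sub {
    require File::Information;
    File::Information->new;
});

if (defined $instance) {
    # No extractor or db was passed to new, so both calls are expected to die.
    my $extractor = try_or_undef(sub { $instance->extractor });
    my $db        = try_or_undef(sub { $instance->db });
    print 'extractor: ', (defined $extractor ? 'available' : 'not configured'), "\n";
    print 'db: ',        (defined $db        ? 'available' : 'not configured'), "\n";
} else {
    print "File::Information could not be loaded\n";
}
```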

lifecycles

my @lifecycles = $instance->lifecycles;

Returns the list of known lifecycles. The order of the list is not defined. However, the method will return them in an order suitable for display to a user.

Currently defined are the following lifecycles:

initial

The initial state. This is the state the object is in when it becomes known. The exact meaning depends on the data source used.

last

The state the object was in when it was last interacted with in a non-read-only manner. The exact meaning depends on the data source used.

current

The current state of the object.

final

The state the object will be in when it is final. Most commonly this is used as the reference to compare against when checking whether an object is corrupted.
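A minimal sketch that lists the lifecycles reported by an instance and marks which of them are described above. It assumes that lifecycles returns plain strings. The call is wrapped in eval so the sketch also runs where the module is not installed.

```perl
use strict;
use warnings;

# The lifecycles described in this section.
my @documented = qw(initial last current final);

# Guarded: yields an empty list if the module cannot be loaded.
my @lifecycles = eval {
    require File::Information;
    File::Information->new->lifecycles;
};

if (@lifecycles) {
    my %known = map { $_ => 1 } @documented;
    # Keep the returned order, as it is meant to be suitable for display.
    for my $lifecycle (@lifecycles) {
        printf "%-8s %s\n", $lifecycle,
            $known{$lifecycle} ? '(documented above)' : '(not listed above)';
    }
} else {
    print "no lifecycles available\n";
}
```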

digest_info

my $info = $instance->digest_info('sha-3-512');
# or:
my @info = $instance->digest_info;
# or:
my @info = $instance->digest_info('sha-2-512', 'sha-3-512');

Returns information on one or more digests. If no digest is given, returns information for all known ones.

The digest can be given in the universal tag format (preferred), one of its aliases (discouraged), or a complete digest-and-value string in universal tag format (only version v0 or v0m if only one digest is given and the method is called in list context).

The return value is a hashref or an array of hashrefs which contain the following keys:

name

The name of the digest in universal tag format (the format used in this module).

bits

The number of bits the digest will return.

aliases

An arrayref to a list of aliases for this digest.

unsafe

A boolean indicating if the digest is considered unsafe by this module. Security: Note that a digest not marked unsafe by this module may still be unsafe to use. This can for example happen if the digest became unsafe after the version of this module in use was released.

rfc9530

The name of the algorithm as per RFC 9530 if any.
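A minimal sketch of consuming the hashrefs described above. format_digest_info is a hypothetical helper written for this example. It relies only on the documented keys and treats aliases and rfc9530 as optional.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of File::Information): render one
# digest_info hashref as a single line of text.
sub format_digest_info {
    my ($info) = @_;
    my $line = sprintf('%s (%d bits)', $info->{name}, $info->{bits});
    $line .= ' [unsafe]' if $info->{unsafe};
    my @aliases = @{ $info->{aliases} // [] };
    $line .= ' aliases: ' . join(', ', @aliases) if @aliases;
    $line .= ' rfc9530: ' . $info->{rfc9530} if defined $info->{rfc9530};
    return $line;
}

# Guarded: yields an empty list if the module cannot be loaded.
my @infos = eval {
    require File::Information;
    File::Information->new->digest_info;
};

print format_digest_info($_), "\n" for @infos;
print "digest_info not available here\n" unless @infos;
```

Given the security note above, a caller would typically skip or reject any entry whose unsafe flag is true, while keeping in mind that an unset flag is not a guarantee of safety.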

AUTHOR

Löwenfelsen UG (haftungsbeschränkt) <support@loewenfelsen.net>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2024-2025 by Löwenfelsen UG (haftungsbeschränkt) <support@loewenfelsen.net>.

This is free software, licensed under:

The Artistic License 2.0 (GPL Compatible)