NAME
Metadata::ByInode::Indexer - customizable file and directory indexer
DESCRIPTION
This module is part of Metadata::ByInode and is not meant to be used alone.
index()
The first argument is an absolute file path.
If it is a directory, the indexer recurses into it non-inclusively: the directory *itself* will NOT be indexed.
If it is a file, only that one file is indexed.
Returns the count of indexed files.
By default the indexer does not index hidden files. To index hidden files, set index_hidden_files in the constructor:
$m = new Metadata::ByInode::Indexer({
    abs_dbfile => '/tmp/mbi_test.db',
    index_hidden_files => 1
});
$m->index('/path/to/what'); # dir or file
USING THE INDEXER
By default we just record abs_loc, filename, and ondisk (the timestamp we recorded it on). You can use the method finder(), which returns a File::Find::Rule object, to do neat things:
my $i = new Metadata::ByInode({ abs_dbfile => '/tmp/dbfile.db' });
$i->finder->name( qr/\.mp3$|\.avi$/ );
$i->index('/home/myself');
This would only index mp3 and avi files in your home dir.
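To see which names the pattern above would let through, the regex can be tried on its own against a few filenames. This is a plain-Perl sketch that does not touch the indexer or File::Find::Rule; the sample names are made up for illustration.

```perl
use strict;
use warnings;

# Same pattern as passed to finder->name() above.
my $wanted = qr/\.mp3$|\.avi$/;

# Hypothetical sample names, for illustration only.
my @names = ( 'song.mp3', 'clip.avi', 'notes.txt', 'mp3.list', 'SONG.MP3' );

# Keep only the names the pattern matches.
my @matched = grep { $_ =~ $wanted } @names;

print "$_\n" for @matched;
```

The pattern is case-sensitive, so a name such as SONG.MP3 is not matched; adding the /i flag to the regex would cover such names.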
finder()
Returns a File::Find::Rule object. You can feed it rules before calling index().
CREATING YOUR OWN INDEXER
index_extra()
If you want to invent your own indexer, this is the method to override. It is run for every file found, and it just inserts data into the record for that file. By default, all files will have 'filename', 'abs_loc', and 'ondisk', which is a timestamp of when the file was seen (now).
For example, if you want the indexer to record mime types, you could override the index_extra method like this:
package Indexer::WithMime;
use File::MMagic;
use base 'Metadata::ByInode::Indexer';

sub index_extra {
    my $self = shift;

    # get hash with current record data
    my $record = $self->_record;

    # by default, record holds 'abs_loc', 'filename', and 'ondisk'
    # a file extension is what distinguishes files from dirs here
    if ( $record->{filename} =~ /\.\w{1,4}$/ ) {
        my $m    = File::MMagic->new;
        my $mime = $m->checktype_filename(
            $record->{abs_loc} . '/' . $record->{filename}
        );
        if ($mime) {
            # and now we append another key and value pair to the record
            $self->_set( 'mime_type', $mime );
        }
    }
    return 1;
}

1;    # a module file must end with a true value
Then, in your script:
use Indexer::WithMime;
my $i = new Indexer::WithMime({ abs_dbfile => '/home/myself/dbfile.db' });
$i->index('/home/myself');
# now you can search files by mime type residing somewhere in that dir
$i->search({ mime_type => 'mp3' });
#or
$i->search({
mime_type => 'mp3',
filename => 'u2',
});
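The override mechanism described in this section can be tried without a database by using a toy stand-in for the base class. The sketch below is not the module's code: Toy::Indexer, Toy::WithExt, index_one and the 'ext' key are all hypothetical, and only the method names _record, _set and index_extra mirror the ones documented above.

```perl
use strict;
use warnings;

# Toy stand-in for the real base class (hypothetical, for illustration).
package Toy::Indexer;

sub new {
    my $class = shift;
    return bless { _record => {} }, $class;
}

# Return the hash holding the current record data.
sub _record { return $_[0]{_record} }

# Add one key and value pair to the current record.
sub _set {
    my ( $self, $key, $value ) = @_;
    $self->{_record}{$key} = $value;
    return 1;
}

# Default hook: adds nothing.
sub index_extra { return 1 }

# Build the default record for one file, then let the hook extend it.
sub index_one {
    my ( $self, $abs_loc, $filename ) = @_;
    $self->{_record} = {
        abs_loc  => $abs_loc,
        filename => $filename,
        ondisk   => time,
    };
    $self->index_extra;
    my %copy = %{ $self->{_record} };    # shallow copy of the finished record
    return \%copy;
}

# Subclass that overrides only the hook.
package Toy::WithExt;
our @ISA = ('Toy::Indexer');

# Record the lowercased file extension, if the name has one.
sub index_extra {
    my $self = shift;
    if ( $self->_record->{filename} =~ /\.(\w{1,4})$/ ) {
        $self->_set( 'ext', lc $1 );
    }
    return 1;
}

package main;

my $rec = Toy::WithExt->new->index_one( '/home/myself', 'Song.MP3' );
print "$_ => $rec->{$_}\n" for sort keys %$rec;
```

The point of the design is that the subclass never touches the traversal or the default fields; it only adds keys to the record that the base class has already filled in.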
_teststop()
Returns how many files to index before stopping. The stop only happens if DEBUG is on. The default value is 1000; to change it, provide a new value as the argument before indexing.
$self->_teststop(10000); # now set to 10k
You may also pass this amount to the constructor:
my $i = new Metadata::ByInode( { _teststop => 500, abs_dbfile => '/tmp/index.db' });
_find_abs_paths()
The argument is the absolute path of the base directory to scan for indexing. Returns the absolute paths of everything within it; hidden files are not returned.
Returns an array ref with absolute paths:
$self->_find_abs_paths('/var/www');
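The behaviour described for _find_abs_paths() (absolute paths, base directory excluded, hidden entries skipped) can be sketched with the core File::Find module. This is only an illustration under stated assumptions: find_visible_paths is a hypothetical helper, "hidden" is assumed to mean a name starting with a dot, and the module itself uses File::Find::Rule, whose use inside the module is not shown in this document.

```perl
use strict;
use warnings;
use File::Find ();
use File::Spec ();

# Hypothetical helper, not the module's own code: collect absolute paths
# below $base, excluding $base itself and anything hidden.
sub find_visible_paths {
    my ($base) = @_;
    $base = File::Spec->rel2abs($base);
    my @found;
    File::Find::find(
        {
            no_chdir => 1,
            wanted   => sub {
                my $path = $File::Find::name;

                # non-inclusive: the base dir itself is not returned
                return if $path eq $base;

                my $name = ( File::Spec->splitdir($path) )[-1];
                if ( $name =~ /^\./ ) {
                    # hidden entry: skip it, and do not descend into it
                    $File::Find::prune = 1;
                    return;
                }
                push @found, $path;
            },
        },
        $base,
    );
    return \@found;    # array ref, in traversal order (not sorted)
}
```

Calling find_visible_paths() on a directory returns an array ref that can be dereferenced with @$paths; sort the result if a stable order is needed.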
_save_stat_data()
By default we do not save stat data. If you want to, pass save_stat_data to the constructor:
my $i = new Metadata::ByInode({ save_stat_data => 1 });
This will create the following for each entry indexed:
ctime, mtime, is_dir, is_file, is_text, is_binary, size
If you are indexing 1k files, this makes little difference. But if you are indexing 1 million, it makes a lot of difference in time.
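As a rough idea of what collecting these fields involves, the sketch below gathers them for a single path using Perl's built-in stat and file test operators. It is an assumption about how such values could be derived, not the module's actual code; stat_fields is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: gather the fields listed above for one path.
# Returns a hash ref, or nothing if the path cannot be stat'ed.
sub stat_fields {
    my ($path) = @_;
    my @st = stat($path) or return;
    return {
        size      => $st[7],
        mtime     => $st[9],
        ctime     => $st[10],
        is_dir    => ( -d _     ? 1 : 0 ),    # "_" reuses the stat result above
        is_file   => ( -f _     ? 1 : 0 ),
        is_text   => ( -T $path ? 1 : 0 ),    # heuristic, reads the start of the file
        is_binary => ( -B $path ? 1 : 0 ),
    };
}
```

In a sketch like this, every entry costs a stat call, and the -T and -B tests additionally read the beginning of each file, which is consistent with the extra time noted above for very large trees.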
CHANGES
The previous version used the system find to get a list of what to index; now we use File::Find::Rule.
SEE ALSO
Metadata::ByInode and Metadata::ByInode::Search
AUTHOR
Leo Charre leocharre at cpan dot org