NAME

Meta::Utils::File::Iterator - iterate files in directories.

COPYRIGHT

Copyright (C) 2001, 2002 Mark Veltzer; All rights reserved.

LICENSE

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA.

DETAILS

MANIFEST: Iterator.pm
PROJECT: meta
VERSION: 0.16

SYNOPSIS

package foo;
use Meta::Utils::File::Iterator qw();
my($iterator)=Meta::Utils::File::Iterator->new();
$iterator->add_directory("/home/mark");	# directory to recurse
$iterator->start();	# initialize the iteration
while(!$iterator->get_over()) {
	print $iterator->get_curr()."\n";	# current file
	$iterator->next();	# advance to the next entry
}
$iterator->fini();	# optional cleanup

DESCRIPTION

This is an iterator object which streamlines work that involves recursing into subdirectories. Give this object a directory to recurse and it will give you the next file to work on whenever you ask for it. This approach is more streamlined because you don't need to know anything about iterating file systems, and yet you never hold all the filenames that you will be working on in RAM at the same time.

Let's say that you're working on 100,000 files (which is more than the number of arguments that you can give to a utility program on a UNIX system by default). How will you work on them? If you want to get the filenames on the command line, you have to use something like xargs, which is an ugly hack since it may run your utility many times (once per batch of arguments, or once per file with options such as -n 1). If you don't want the xargs overhead, then what you want is to put the iterator in your source.

Again, two methods are available. The first is to scan the file system and produce a list of the files which you will be working on. This means that the RAM your program takes is proportional to the number of files, which is quite a load given that you may not need knowledge of all of them at the same time and may only need them one at a time, with no relation to the others. The other method is the one presented here: use this iterator.
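The idea of handing out one file at a time without first building the full list can be sketched in core Perl as follows. This is only an illustration of the technique, not the source of this module: the name make_file_iterator is hypothetical, and the choice to skip symbolic links to directories is an assumption made to avoid cycles.

```perl
use strict;
use warnings;

# Hypothetical stand-in, NOT taken from Meta::Utils::File::Iterator.
# Returns a closure; each call yields the next regular file, or undef
# once the iteration is over.
sub make_file_iterator {
    # Paths still to be visited; reversed so that pop() serves the
    # arguments in the order they were given.
    my @pending = reverse @_;
    return sub {
        while (@pending) {
            my $path = pop @pending;
            # Assumption: do not descend into symlinked directories.
            if (-d $path && !-l $path) {
                opendir(my $dh, $path) or die "cannot open $path: $!";
                my @entries = sort grep { $_ ne '.' && $_ ne '..' } readdir($dh);
                closedir($dh);
                # Push in reverse so the first sorted entry is popped first.
                push @pending, map { "$path/$_" } reverse @entries;
                next;
            }
            return $path if -f $path;
        }
        return;    # nothing left to iterate
    };
}
```

Calling the returned closure repeatedly until it returns undef visits every regular file under the given directories, while memory use stays bounded by the entries of the directories currently being descended rather than by the total number of files.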

This iterator can give you directory names or just the files. The default behaviour is to iterate just the files.

The interface to this object is quite object oriented. See the synopsis for an example.

The object may receive several directories to iterate and it will iterate them in sequence. The object may also receive files, which will take part in the iteration.

The object will throw exceptions if any errors occur. Please see Error.pm for details about catching or ignoring those.

FUNCTIONS

BEGIN()
init($)
add_directory($$)
add_directories($$$)
add_file($$)
add_files($$$)
start($)
next($)
fini($)
collect($$)
TEST($)

FUNCTION DOCUMENTATION

BEGIN()

This method creates the accessor methods for the following attributes:
0. "want_files" - do you want to iterate regular files.
1. "want_dirs" - do you want to iterate directories.
2. "curr" - current file/directory of the iterator.
3. "base" - basename (name without directory) of the current file/directory.
4. "dire" - directory of the current file/directory.
5. "over" - is the iterator over?

init($)

This is an internal post-constructor.

add_directory($$)

This method will add a directory for this iterator to scan. Right now, if you add the same directory under different names, it will get iterated twice. Fixing this is on the todo list.

add_directories($$$)

This method receives:
1. A file iterator object to work on.
2. A string containing a concatenated list of directories.
3. A separator string enabling split of directories.
The method will add each of the directories in the list to the current file iterator. It will simply call add_directory on each of these.
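The behaviour described above can be sketched as a small helper. The name add_directories_sketch is hypothetical and the code is not copied from the module; in particular, treating the separator as a literal string rather than as a regular expression is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the description above. $iterator may be
# any object that provides an add_directory method.
sub add_directories_sketch {
    my ($iterator, $list, $separator) = @_;
    # \Q...\E quotes the separator so that characters such as "|" or "."
    # are matched literally (an assumption about the intended behaviour).
    for my $dir (split /\Q$separator\E/, $list) {
        $iterator->add_directory($dir);
    }
    return;
}
```

With a separator of ":", a list such as "/home/mark:/tmp" would result in two add_directory calls, one per directory, in the order given.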

add_file($$)

This method will add a single file to be iterated by the iterator.

add_files($$$)

This method adds a list of files to be iterated by the iterator. The list is supplied as a single string, along with a separator used to split it up.

start($)

This will initialize the iterator. After that you can start calling get_over in a loop and in the loop use get_curr and next.

next($)

This method iterates to the next value. You need to check if there are more entries to iterate using the "get_over" method after using this one.

fini($)

This method wraps the iterator up (does various cleanup). You're not obliged to call this one, but for the sake of future versions you had better do so.

collect($$)

This is a static method which uses this iterator class to provide you with a set object containing all the files under a certain directory.

TEST($)

Test suite for this module. This test suite can be called to test the functionality of this module on a stand alone basis or as a part of a high level testing suite for an entire class library this class is provided with. Currently this test suite iterates over some directories and prints the results.

SUPER CLASSES

None.

BUGS

None.

AUTHOR

Name: Mark Veltzer
Email: mailto:veltzer@cpan.org
WWW: http://www.veltzer.org
CPAN id: VELTZER

HISTORY

0.00 MV more perl packaging
0.01 MV PDMT
0.02 MV md5 project
0.03 MV database
0.04 MV perl module versions in files
0.05 MV movies and small fixes
0.06 MV movie stuff
0.07 MV graph visualization
0.08 MV more thumbnail stuff
0.09 MV thumbnail user interface
0.10 MV more thumbnail issues
0.11 MV website construction
0.12 MV web site development
0.13 MV web site automation
0.14 MV SEE ALSO section fix
0.15 MV move tests to modules
0.16 MV md5 issues

SEE ALSO

Error(3), File::Basename(3), Meta::Class::MethodMaker(3), Meta::Ds::Oset(3), Meta::Ds::Stack(3), Meta::IO::Dir(3), Meta::Utils::Utils(3), strict(3)

TODO

-enable breadth vs depth first search.

-enable different options for filtering which files get delivered (suffixes, regexps, types etc...).