NAME
App::dupfind::Threaded::MapReduce::Weed - Map-reduce version of weed_dups, and the worker thread for it
VERSION
version 0.140230
DESCRIPTION
Overrides the weed_dups method from App::dupfind::Common and implements a worker thread routine that is invoked from it. In this threaded version of weed_dups, the set of same-size file groupings is mapped as a task and sent to the main map-reduce engine implemented in App::dupfind::Threaded::MapReduce. The outcome of that multithreaded map-reduce operation is a significantly smaller list of potential duplicates (or an empty list if none survive the weeding-out).
Please don't use this module by itself. It is for internal use only.
METHODS
- weed_dups
-
Calls the map-reduce logic on the $size_dups hashref, providing a wrapped coderef that calls out to _weed_worker for every weeding algorithm specified by the user. The map-reduce engine then invokes these coderef mappings on the same-size file groupings.
This overrides the weed_dups method in App::dupfind::Common.
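The wrapping described above can be illustrated with a minimal, single-threaded sketch. It is not the module's actual code: %weeders, weed_worker_sketch, map_reduce_sketch and weed_dups_sketch are hypothetical names, the map-reduce engine is replaced by a plain loop, and files are modelled as in-memory hashes so that no disk access is needed.

```perl
use strict;
use warnings;

# Hypothetical weeders: each takes a file record and returns a cheap key.
# Records are plain hashes here, so the sketch never touches the disk.
my %weeders = (
    first_byte => sub { substr( $_[0]{content}, 0, 1 ) },
    last_byte  => sub { substr( $_[0]{content}, -1 ) },
);

# Stand-in for _weed_worker: keep only records whose key is shared
# with at least one other record in the same group.
sub weed_worker_sketch {
    my ( $weeder, $group ) = @_;
    my %by_key;
    push @{ $by_key{ $weeder->($_) } }, $_ for @$group;
    return [ map { @$_ } grep { @$_ > 1 } values %by_key ];
}

# Sequential stand-in for the map-reduce engine: apply one mapping to
# every same-size group and drop groups with fewer than two survivors.
sub map_reduce_sketch {
    my ( $size_dups, $mapping ) = @_;
    my %out;
    for my $size ( keys %$size_dups ) {
        my $kept = $mapping->( $size_dups->{$size} );
        $out{$size} = $kept if @$kept > 1;
    }
    return \%out;
}

# One wrapped coderef per requested weeding algorithm, applied in order.
sub weed_dups_sketch {
    my ( $size_dups, @algorithms ) = @_;
    for my $name (@algorithms) {
        my $weeder  = $weeders{$name} or die "unknown weeder: $name";
        my $mapping = sub { weed_worker_sketch( $weeder, $_[0] ) };
        $size_dups  = map_reduce_sketch( $size_dups, $mapping );
    }
    return $size_dups;
}
```

Each closure captures one weeder, so the engine only needs to know how to call a coderef on a group; it does not need to know which algorithm is behind it.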
- _weed_worker
-
Runs weed-out passes over same-size file groupings using $weeder, a weed-out algorithm that discards non-duplicates by cheaper means than hashing. The idea is to read as little as possible from disk while searching for duplicates, and to use file hashing (digests) only as a last resort.
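As a rough illustration of the "cheap pass first, digest last" idea, the sketch below keys files by their first few bytes before falling back to a whole-file digest. The names first_bytes_key, digest_key and weed_pass are assumptions made for this example, and the choice of MD5 is arbitrary; the weeders that dupfind actually ships may work differently.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical cheap weeder: key a file by its first $len bytes only.
sub first_bytes_key {
    my ( $path, $len ) = @_;
    open my $fh, '<:raw', $path or die "cannot open $path: $!";
    my $got = read( $fh, my $buf, $len );
    defined $got or die "cannot read $path: $!";
    close $fh;
    return $buf;
}

# Last resort: key a file by the digest of its entire content.
sub digest_key {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "cannot open $path: $!";
    my $hex = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $hex;
}

# One weed-out pass: group files by the weeder's key and keep only
# those files that share their key with at least one other file.
sub weed_pass {
    my ( $weeder, @files ) = @_;
    my %by_key;
    push @{ $by_key{ $weeder->($_) } }, $_ for @files;
    return map { @$_ } grep { @$_ > 1 } values %by_key;
}
```

A caller would typically run weed_pass with a cheap weeder such as sub { first_bytes_key( $_[0], 4 ) } first, and pass only the survivors to weed_pass with \&digest_key, so that full reads happen for as few files as possible.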