NAME
dpan - create a DarkPAN from directories
SYNOPSIS
# from the command line
prompt> dpan [-l log4perl.config] [-f config] [directory [directory2]]
# get some help
prompt> dpan -h
prompt> dpan --help
# see the version
prompt> dpan -v
prompt> dpan --version
# see the configuration (exits after printing config)
prompt> dpan -c
prompt> dpan -c -f config
DESCRIPTION
The dpan script takes a list of directories, indexes any Perl distributions it finds, and creates the PAUSE index files from what it finds. Afterward, you should be able to point a CPAN tool at the directory and install the distributions normally.
If you don't specify any directories on the command line, it works with the current working directory.
At the end, dpan creates a modules directory in the first directory (or the current working directory) and creates the 02packages.details.txt.gz and 03modlist.data.gz files there.
Command-line processing
- -c
-
Dump the configuration and exit.
- -f config_file
-
Use the named configuration file.
- -l log4perl_config_file
-
The path to the log4perl configuration file. You can also set this with the DPAN_LOG4PERL_FILE environment variable or the log4perl_file configuration directive.
Configuration options
If you don't specify these values in a configuration file, dpan will use its defaults. The behavior described for these options applies only to the default components. If you subclass a component or use a different component, you might see something different.
Some of the configuration options come from MyCPAN::Indexer::App::BackPAN. The source of each directive is noted.
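As a rough sketch of how the directives in this section fit together, the snippet below writes a small configuration file and, if dpan is installed, asks it to print the resulting configuration. It assumes the file holds one "directive value" pair per line, which is how the system_id macbookpro entry in this section is written. The scratch directory, the file name, and the chosen values are only illustrations.

```shell
# Scratch location for the example; any writable directory works.
work="${TMPDIR:-/tmp}/dpan-config-example"
mkdir -p "$work"

# Assumed format: one "directive value" pair per line.
cat > "$work/dpan.conf" <<'EOF'
organize_dists 1
pause_id DPAN
retry_errors 1
ignore_packages main MY MM DB bytes DynaLoader
EOF

# If dpan is installed, -c prints the configuration and exits
# without indexing anything.
if command -v dpan >/dev/null 2>&1; then
    dpan -c -f "$work/dpan.conf"
fi
```

Checking the output of dpan -c before a real run is a cheap way to confirm that a directive name was spelled the way dpan expects.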
- alarm
-
The maximum amount of time allowed to index a distribution, in seconds.
Default: 15
Affected components: Worker
Config level: MyCPAN
- author_map
-
dpan uses the file named by this directive to add PAUSE IDs it finds in the repository to authors/00whois.xml and authors/01mailrc.txt.gz (see also pause_id and pause_full_name). Without this file, dpan will use the value of pause_full_name for new PAUSE IDs it finds.
The file has one line per author: the PAUSE ID, followed by the full name, followed by the email address in angle brackets:
BUSTER Buster Bean <buster@example.com>
This option helps your DPAN work with CPAN::Mini::Webserver, which tries to pull values out of authors/00whois.xml and authors/01mailrc.txt.gz to insert into its templates.
Default: undef
Affected components: Collator
Config level: DPAN
- collator_class
-
The Perl class to use as the Collator component. See MyCPAN::Indexer::Tutorial for details on components.
Default: MyCPAN::App::DPAN::Reporter::Minimal
- copy_bad_dists
-
If set to a true value, copy bad distributions to the named directory so you can inspect them later.
Default: 0
Affected components: Queue
Config level: MyCPAN
- dispatcher_class
-
The Perl class to use as the Dispatcher component. See MyCPAN::Indexer::Tutorial for details on components.
Default: MyCPAN::Indexer::Dispatch::Serial
Config level: MyCPAN
- dpan_dir
-
The directory where dpan will create the DPAN. It takes a single directory.
Default: the current working directory
Affected components: all of them
Config level: DPAN
- error_report_subdir
-
The subdirectory under report_dir to use for error reports.
Default: error
Affected components: Reporter, Collator
Config level: MyCPAN
- extra_reports_dir
-
You can specify another directory that contains pre-indexed reports. dpan will add these reports to its queue to create 02packages.details.txt.gz. So far, dpan doesn't check that these reports correspond to any files in the repository, so they could cause the checks against 02packages.details.txt.gz to fail.
You probably want to use this as a way to inject information about distributions that you skipped with skip_dists_regexes. This is also very handy for problematic distributions that resist indexing.
Default: undef
Affected components: Indexer
Config level: DPAN
- fresh_start
-
Delete the report directory before indexing. This cleans out all previous work in report_dir, so you need to save that on your own. You can also set this with the DPAN_FRESH_START environment variable.
Default: 0
Affected components: Queue, Reporter
Config level: DPAN
- i_ignore_errors_at_my_peril
-
dpan considers some problems to be too troublesome to continue, such as when it thinks it created incomplete index files. If you don't want dpan to stop, set i_ignore_errors_at_my_peril to a true value. However, you might get an invalid mirror.
Default: 0
Affected components: Collator
Config level: DPAN
- ignore_missing_dists
-
dpan aborts the process if it finds distributions in the repository that don't show up in 02packages.details.txt.gz, which it thinks signals an incomplete index. If you want to risk an incomplete index, set this to a true value.
Default: 0
Affected components: Collator
Config level: DPAN
- ignore_packages
-
You can tell DPAN to ignore some namespaces. The indexer may still record them, but they won't show up in 02packages.details.txt.gz. The value is a space-separated list of exact package names.
Default: main MY MM DB bytes DynaLoader
Affected components: Indexer
Config level: DPAN
- indexer_class
-
The Perl class to use as the Indexer component. See MyCPAN::Indexer::Tutorial for details on components.
Default: MyCPAN::App::DPAN::Indexer
- indexer_id
-
Give yourself a name so people know who ran dpan.
Default: Joe Example <joe@example.com>
Affected components: Reporter
Config level: MyCPAN
- interface_class
-
The Perl class to use as the Interface component. See MyCPAN::Indexer::Tutorial for details on components.
Default: MyCPAN::Indexer::Interface::Text
Config level: MyCPAN
- log_file_watch_time
-
The time, in seconds, to wait until Log4perl checks for an updated configuration file.
Default: 30
Affected components: all
Config level: MyCPAN
- log4perl_file
-
The path to the log4perl configuration file. You can also set this with the -l switch or the DPAN_LOG4PERL_FILE environment variable.
Default: undef
Affected components: all
Config level: MyCPAN
- merge_dirs
-
A space-separated list of directories containing distributions to merge into the DPAN. These will go into the directory for pause_id. The files are copied, not renamed, so this works across partitions and mount points.
Default: undef
Affected components: Queue
Config level: MyCPAN
- organize_dists
-
Take all of the distributions dpan finds and put them into a PAUSE-like structure under authors/id/D/DP/DPAN under dpan_dir. You can change the author ID with the pause_id directive.
Default: 0
Affected components: Queue
Config level: DPAN
- parallel_jobs
-
The number of parallel jobs to run. This only matters for dispatcher classes that can do more than one thing at a time. The default dispatcher does not use parallel jobs. You need to change the dispatcher_class to MyCPAN::Indexer::Dispatcher::Parallel (or another class that supports this).
Default: 0
Affected components: Dispatcher
Config level: MyCPAN
- pause_id
-
The author ID to use if organize_dists is set to a true value.
Default: DPAN
Affected components: Queue, Reporter, Collator
Config level: DPAN
- pause_full_name
-
The full name to use for the default PAUSE ID and any unclaimed ID dpan finds in the repository. This shows up in the 01mailrc.txt.gz and 00whois.xml files.
Default: "DPAN user <CENSORED>"
Affected components: Collator
Config level: DPAN
- postflight_class
-
If defined, after dpan has finished all of its work and is ready to exit, it will try to load this class and execute its run() method. It passes run() the application object, so you have full access to everything. See MyCPAN::App::DPAN::NullPostFlight for an example. To use the application object, you need to know a little about the guts of MyCPAN::Indexer.
Default: undef
Affected components: none
Config level: DPAN
- prefer_bin
-
If set to a true value, dpan will try to use an external binary before it resorts to a pure Perl option. This matters mostly for Archive::Extract. Your tar utility may have better luck than Archive::Tar.
This setting overrides the PREFER_BIN environment variable. You can also set that environment variable to change how Archive::Extract decides to extract an archive.
Default: 0
Affected components: Indexer
Config level: MyCPAN
- queue_class MyCPAN::Indexer::SomeOtherQueue
-
The Perl class to use as the Queue component. See MyCPAN::Indexer::Tutorial for details on components.
Default: MyCPAN::Indexer::SkipQueue
Config level: MyCPAN
- relative_paths_in_report
-
If true, supporting Reporter classes change the path to the distribution file to be relative to authors/id. Otherwise, the path in the report is the absolute path to the distribution.
Supported by MyCPAN::App::DPAN::Reporter::Minimal.
Default: true
Affected components: Reporter
Config level: DPAN
- reporter_class
-
The Perl class to use as the Reporter component. See MyCPAN::Indexer::Tutorial for details on components.
Default: MyCPAN::App::DPAN::Reporter::Minimal
Config level: MyCPAN
- report_dir
-
Where to store the distribution reports.
Default: a directory named indexer_reports in the current working directory
Affected components: Reporter
Config level: MyCPAN
- retry_errors
-
Try to index a distribution even if it was previously tried and had an error. This depends on previous reports being in report_dir, so if you don't set that configuration directive, it won't matter.
Default: 1
Affected components: Indexer
Config level: DPAN
- skip_dists_regexes
-
You can specify a list of whitespace-separated regexes for dpan to use to filter the queue of distributions to index. You probably want to use this to skip very large distributions. You can supply a pre-made index report for a skipped distribution by setting extra_reports_dir.
This is only supported by MyCPAN::Indexer::SkipQueue.
Default: null
Affected components: Queue
Config level: DPAN
- skip_perl
-
When set to a true value, skip_perl causes dpan to ignore distributions that match /^(strawberry-?)perl-/, since these can take a long time to index. You can supply a pre-made index report for a skipped distribution by setting extra_reports_dir.
This is only supported by MyCPAN::Indexer::SkipQueue.
Default: 0
Affected components: Queue
Config level: DPAN
- success_report_subdir
-
The subdirectory under report_dir to use for success reports.
Default: success
Affected components: Reporter, Collator
Config level: MyCPAN
- system_id macbookpro
-
Give the indexing system a name, just to identify the machine. This shows up in the run_info, although the default Reporter does not include it in the report.
Default: 'an unnamed machine'
Affected components: Reporter
Config level: MyCPAN
- temp_dir
-
Where to unpack the dists or create any temporary files.
Default: a temp directory in the current working directory
Affected components: Worker
Config level: MyCPAN
- use_real_whois
-
When use_real_whois is set to a true value, dpan tries to update the authors/00whois.xml and authors/01mailrc.txt.gz files from a real CPAN mirror. This will overwrite the files already in the repository, although it will update the CPAN versions with PAUSE IDs dpan finds in the repository (see author_map).
The default case for most dpan options is to avoid the real CPAN, so by default dpan will create stub files for these files instead of fetching the real ones.
If you are using CPAN::Mini, you can mirror 00whois.xml by adding a line to your .minicpanrc:
also_mirror authors/00whois.xml
Default: 0
Affected components: Collator
Config level: DPAN
- worker_class
-
The Perl class to use as the Worker component. See MyCPAN::Indexer::Tutorial for details on components.
Default: MyCPAN::Indexer::Worker
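The author_map entry in the list above describes a line-oriented file: PAUSE ID, then full name, then email address in angle brackets. The sketch below writes such a file and splits each line into those three fields, as a way to sanity-check a map before handing it to dpan. The file names and the second author are made up for illustration, and the splitting logic is an assumption about the format, not dpan's own parser, which may be stricter or more lenient.

```shell
# Scratch location for the example; any writable directory works.
work="${TMPDIR:-/tmp}/dpan-author-map-example"
mkdir -p "$work"

# First line is the example from this document; the second is hypothetical.
cat > "$work/author_map.txt" <<'EOF'
BUSTER Buster Bean <buster@example.com>
MIMI Mimi Bean <mimi@example.com>
EOF

# Split each line into ID, full name, and email address.
while IFS= read -r line; do
    id=${line%% *}          # text before the first space
    rest=${line#* }         # everything after the ID
    name=${rest% <*}        # text before the " <email>" part
    email=${rest##*<}       # text after the last "<"
    email=${email%>}        # drop the closing ">"
    printf '%s|%s|%s\n' "$id" "$name" "$email"
done < "$work/author_map.txt" > "$work/author_map.parsed"

cat "$work/author_map.parsed"
```

A line that does not split cleanly into three non-empty fields is worth fixing by hand before pointing the author_map directive at the file.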
Logging
dpan uses Log4perl if it is available. Without Log4perl, it uses a small internal logger that prints to the screen.
You can set the Log4perl levels on each of the components separately:
log4perl.rootLogger = FATAL, Null
log4perl.logger.backpan_indexer = DEBUG, File
log4perl.logger.Indexer = DEBUG, File
log4perl.logger.Worker = DEBUG, File
log4perl.logger.Interface = DEBUG, File
log4perl.logger.Dispatcher = DEBUG, File
log4perl.logger.Queue = DEBUG, File
log4perl.logger.Reporter = DEBUG, File
log4perl.logger.Collator = DEBUG, File
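The settings above name a File appender without showing its definition. As a sketch of one way to complete the picture, the snippet below writes a small Log4perl configuration that turns on DEBUG logging for a single component and defines the File appender with standard Log4perl appender and layout classes. The scratch directory, the log file name dpan.log, and the conversion pattern are assumptions chosen for illustration.

```shell
# Scratch location for the example; any writable directory works.
work="${TMPDIR:-/tmp}/dpan-log4perl-example"
mkdir -p "$work"

# Quoted heredoc delimiter so the shell leaves % and $ alone.
cat > "$work/log4perl.conf" <<'EOF'
log4perl.logger.Worker = DEBUG, File

log4perl.appender.File = Log::Log4perl::Appender::File
log4perl.appender.File.filename = dpan.log
log4perl.appender.File.mode = append
log4perl.appender.File.layout = Log::Log4perl::Layout::PatternLayout
log4perl.appender.File.layout.ConversionPattern = %d %p %c %m%n
EOF

# Point dpan at the file through the environment; the -l switch and
# the log4perl_file directive are the documented alternatives.
export DPAN_LOG4PERL_FILE="$work/log4perl.conf"
```

Starting with one component at DEBUG keeps the log small; add further log4perl.logger lines for other components as you need them.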
ENVIRONMENT VARIABLES
- DPAN_FRESH_START
-
Delete the report directory before indexing. This cleans out all previous work, so you need to save that on your own. You can also set this with the fresh_start configuration directive.
- DPAN_LOG4PERL_FILE
-
The path to the log4perl configuration file. You can also set this with the -l switch or the log4perl_file configuration directive.
- DPAN_LOGLEVEL
-
The Log4perl level for easyinit. This won't affect the logging level if you specify a log configuration file.
SEE ALSO
MyCPAN::Indexer, MyCPAN::Indexer::DPAN
SOURCE AVAILABILITY
This code is on GitHub:
git://github.com/briandfoy/mycpan-app-dpan.git
AUTHOR
brian d foy, <bdfoy@cpan.org>
COPYRIGHT AND LICENSE
Copyright © 2010-2018, brian d foy <bdfoy@cpan.org>. All rights reserved.
You may redistribute this under the terms of the Artistic License 2.0.