NAME

pgloader.pl - loads data to Postgres tables

SYNOPSIS

pgloader.pl  -siTV  person 
pgloader.pl  --help

DESCRIPTION

pgloader.pl loads tables to a Postgres database. It is similar to the pgloader(1) python program, written by other authors. Data are read from the file specified in the configuration file (defaults to pgloader.dat).

This version of pgloader exhibits the -i option which (when activated) drops all table indexes and recreates them again after COPY. In case of errors, everything rolls back to the initial state. This version also allows the libpg 'service' database connection method.

The configuration file and command options are almost identical to the python pythod pgloader(1) and is meant to be a drop-in replacement. Configuration entries are ignored for unimplemented features. The core functionality and many usefull features are already implemented; read further to find what is currently available.

OPTIONS

-q                         quiet  mode     (same as loglevel=1)
-v                         verbose mode    (same as loglevel=3)
-d                         debug  mode     (same as loglevel=4)
-l,  --loglevel            set loglevel 1 to 4  . Defaults to 2
-c,  --config              configuration file; defaults to "pgloader.conf"
-i,  --indexes             disable indexes during COPY
-n,  --dry_run             dry_run
-s,  --summary             show summary
-D,  --disable_triggers    disable triggers during loading
-T,  --truncate            truncate table before loading 
-V,  --vacuum              vacuum analyze table after loading
-C,  --count               number of lines to process 
     --version             show version and exit
-F,  --from                process from this line number

CONFIGURATION FILE

The configuration file (default is pgloader.conf), follows the ini configuration format, and is divided into these sections:

[pgslq]

This section is the only mandatory section, and defines how to access the database.

base           [required]  name of the database
host           [optional]  hostname to connect. Default is 'localhost'
port           [optional]  port number. Default is 5432
user           [optional]  name of login user. Default is epid of user
pass           [optional]  user password. Not needed if using libpq defaults.
pgsysconfdir   [optional]  dir for PGSYSCONFDIR
service                    mandatory only when pgsysconfdir ( or the 
                           enviromental variable PGSYSCONFDIR ) is defined;
                           otherwise it is ignored
[template1]

This section defines templates. In this case, the name was arbitrary chosen as template1. The purpose of templates is to hold default values for other table sections (defined bellow). You may define an unlimited number of template sections. The only mandatory entry for this section is 'template':

template   when defined, the template as enabled; leave it blank 
           to disable it.
[person]

This is the table section. The name person was arbitrary choosen, you can define an unlimited number of table sections. If the name of a table section appears on the command line (when invoking pgloader.pl) the corresponding table section defines how to load this table. Try to keep the name of the section the same as the name of the table. In a table section you can define the following parameters:

 filename             filename with data for the table         [mandatory]

 table                [Mandatory]  tablename or use schema.tablename. 
                      Defaults to section name                 [mandatory]

 use_template         which template to use for default values [optional]

 field_sep            Delimiter that separates fields. The default for
                      text formats is TAB, and for csv formats is ',' 

 format               must be either 'text' or 'csv' (without the quotes)
                      Default is text.

 copy_columns         names of columns to use for COPY.          [optional]
	              The char '*' means all columns, but since this is 
                      also the default you may as well leave it blank
		      Example:  copy_columns = first, last, age
	                        copy_columns = *

 only_cols            which column numbers to COPY ; counting    [optional]
	              starts from 1. The char '*' means all columns, but 
                      since this is also the default you may as well leave 
                      it blank. Example: only_cols = 1-2, 3, 5
			                 only_cols = 3

 quotechar            Usefull only for csv formats. Default is "

 null                 String that designates a NULL value ; 
                      usefull only for text mode. Default is string '\NA'

 skipinitialspace     when defined, we ignore leading and trailing whitespace

 udc_COLUMNAME        assign this value for all rows of column COLUMNAME
		      Examples: udc_title = Sir 
                                udc_age   = 99 
                                udc_race  = white

 reformat             reformat values of the age column by passing 
                      it to function upper(), in the John::Util module
                      reformat = age:John::Util::upper
                      [ CURRENTLY UNDER DEVELOPMENT ]


 copy_every           How many tuples to copy per transaction. 
                      More transactions are automatically created to 
                      insert the rest of the date, each inserting
                      upto that many tuples. Defaults is 10_000

NOTE: Because of how the ini format is defined as a value separator, if you need to include the , char, you must escape it with \ . For example: field_sep = \, sets field_sep to char ','

SEE ALSO

http://pgfoundry.org/projects/pgloader/ hosts the original python project.

AUTHOR

Ioannis Tambouras, <ioannis@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2008 by Ioannis Tambouras

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.10.0 or, at your option, any later version of Perl 5 you may have available.