NAME

EAI::Wrap - framework for easy creation of Enterprise Application Integration tasks

SYNOPSIS

# site.config
%config = (
	sensitive => {
		myftp => {user => 'someone', pwd => 'password', privKey => 'pathToPrivateKey', hostkey => 'hostkey to be presented'},
		mydb => {user => 'someone', pwd => 'password'}
	},
	checkLookup => {"task_script.pl" => {errmailaddress => "test\@test.com", errmailsubject => "testjob failed", timeToCheck => "0800", freqToCheck => "B", logFileToCheck => "test.log", logcheck => "started.*"}},
	folderEnvironmentMapping => {Test => "Test", Dev => "Dev", "" => "Prod"},
	errmailaddress => 'To@somewhere.com',
	errmailsubject => "errMailSubject",
	fromaddress => 'from@somewhere.com',
	smtpServer => "MailServer",
	smtpTimeout => 60,
	logRootPath => "C:/dev/EAI/Logs",
	historyFolder => "History",
	redoDir => "redo",
	task => {
		retrySecondsErr => 60*5,
		retrySecondsPlanned => 60*15,
	},
	DB => {
		server => {Prod => "ProdServer", Test => "TestServer"},
		cutoffYr2000 => 60,
		DSN => 'driver={SQL Server};Server=$DB->{server}{$execute{env}};database=$DB->{database};TrustedConnection=Yes;',
		schemaName => "dbo",
	},
	FTP => {
		maxConnectionTries => 5, 
		plinkInstallationPath => "C:/dev/EAI/putty/PLINK.EXE",
	},
	File => {
		format_thousandsep => ",",
		format_decimalsep => ".",
	}
);

# task_script.pl
use EAI::Wrap;
%common = (
	FTP => {
		remoteHost => {"Prod" => "ftp.com", "Test" => "ftp-test.com"},
		remoteDir => "/reports",
		port => 22,
		user => "myuser",
		privKey => 'C:/keystore/my_private_key.ppk',
		FTPdebugLevel => 0, # ~(1|2|4|8|16|1024|2048)
	},
	DB => {
		tablename => "ValueTable",
		deleteBeforeInsertSelector => "rptDate = ?",
		dontWarnOnNotExistingFields => 1,
		database => "DWH",
	},
	task => {
		plannedUntil => "2359",
	},
);
@loads = (
	{
		File => {
			filename => "Datafile1.XML",
			format_XML => 1,
			format_sep => ',',
			format_xpathRecordLevel => '//reportGrp/CM1/*',
			format_fieldXpath => {rptDate => '//rptHdr/rptDat', NotionalVal => 'NotionalVal', tradeRef => 'tradeRefId', UTI => 'UTI'}, 
			format_header => "rptDate,NotionalVal,tradeRef,UTI",
		},
	},
	{
		File => {
			filename => "Datafile2.txt",
			format_sep => "\t",
			format_skip => 1,
			format_header => "rptDate	NotionalVal	tradeRef	UTI",
		},
	}
);
setupEAIWrap();
openDBConn(\%common) or die;
openFTPConn(\%common) or die;
while (!$execute{processEnd}) {
	for my $load (@loads) {
		getFilesFromFTP($load);
		if (checkFiles($load)) {
			readFileData($load);
			dumpDataIntoDB($load);
			markProcessed($load);
		}
	}
	processingEnd();
}

DESCRIPTION

EAI::Wrap provides a framework for defining EAI jobs directly in Perl, sparing the creator low-level tasks such as FTP fetching, file parsing and storing data into a database. It can also be used to handle other workflows, like creating files from the database and uploading them to FTP servers, or using other externally provided tools.

The definition is done by first setting up configuration hashes and then providing a high-level scripting of the job itself using the provided API (although any perl code is welcome here!).

EAI::Wrap has a lot of infrastructure already included, like logging using Log4perl, database handling with DBI and DBD::ODBC, FTP services using Net::SFTP::Foreign, file parsing using Text::CSV (text files), Data::XLSX::Parser and Spreadsheet::ParseExcel (excel files), XML::LibXML (xml files), file writing with Spreadsheet::WriteExcel and Excel::Writer::XLSX (excel files), Text::CSV (text files).

Furthermore it provides very flexible commandline options, allowing almost all configurations to be set on the commandline. Commandline options (e.g. additional information passed on with the interactive option) of the task script are fetched at INIT allowing use of options within the configuration, e.g. $opt{process}{interactive_startdate} for a passed start date.

The logging configured in $ENV{EAI_WRAP_CONFIG_PATH}/log.config (logfile root path set in $ENV{EAI_WRAP_CONFIG_PATH}/site.config) also starts immediately at INIT of the task script. To use a logger, simply make a call to get_logger(). For the logging configuration, see EAI::Common, setupLogging.

API

%config

global config (set in $ENV{EAI_WRAP_CONFIG_PATH}/site.config, amended with $ENV{EAI_WRAP_CONFIG_PATH}/additional/*.config), contains special parameters (default error mail sending, logging paths, etc.) and site-wide pre-settings for the five categories in task scripts, described below under configuration categories.

%common

common configs for the task script, may contain one configuration hash for each configuration category.

@loads

list of hashes defining specific load processes within the task script. Each hash may contain one configuration hash for each configuration category.

configuration categories

The above mentioned hashes can contain five categories (sub-hashes): DB, File, FTP, process and task. These allow further parameters to be set for the respective parts of EAI::Wrap (EAI::DB, EAI::File and EAI::FTP), process parameters and task parameters. The parameters are described in detail in section CONFIGURATION REFERENCE.

The process category is on the one hand used to pass information within each process (data, additionalLookupData, filenames, hadErrors or custom commandline parameters starting with interactive), on the other hand for additional configurations not suitable for DB, File or FTP (e.g. uploadCMD). The task category contains parameters used on the task script level and is therefore only allowed in %config and %common. It contains parameters for skipping, retrying and redoing the whole task script.

The settings in DB, File, FTP and task are "merge" inherited in a cascading manner (i.e. missing parameters are merged, parameters already set below are not overwritten):

%config (defined in site.config and other associated configs loaded at INIT)
merged into ->
%common (common task parameters defined in script)
merged into each of ->
$loads[]
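
The cascade above can be sketched in plain Perl. This is only an illustration of the "fill in what is missing" rule: merge_missing is a hypothetical helper and not part of the EAI::Wrap API, and the parameter values are made up.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of EAI::Wrap): copy parameters from $parent
# into a copy of $child only where $child has not set them itself.
sub merge_missing {
    my ($parent, $child) = @_;
    my %merged = %$child;
    for my $key (keys %$parent) {
        $merged{$key} = $parent->{$key} unless exists $merged{$key};
    }
    return \%merged;
}

# illustrative DB settings on the three levels
my %config_DB = (schemaName => "dbo", cutoffYr2000 => 60);
my %common_DB = (tablename => "ValueTable", schemaName => "rpt");
my %load_DB   = (tablename => "OtherTable");

# %config -> %common -> load: each level only fills gaps of the level below
my $effective_common = merge_missing(\%config_DB, \%common_DB);
my $effective_load   = merge_missing($effective_common, \%load_DB);

printf "%s => %s\n", $_, $effective_load->{$_} for sort keys %$effective_load;
```

In this sketch the load keeps its own tablename, takes schemaName from the common level and cutoffYr2000 from the config level.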

special config parameters and DB, FTP, File, task parameters from command line options are merged at the respective level (config at the top, the rest at the bottom) and always override any set parameters. Only scalar parameters can be given on the command line, no lists and hashes are possible. Commandline options are given in the format:

--<category> <parameter>=<value>

for the common level and

--load<i><category> <parameter>=<value>

for the loads level.

Command line options are also available to the script via the hash %opt or the list of hashes @optloads, so in order to access the cmdline option --process interactive_date=20230101 you could either use $common{process}{interactive_date} or $opt{process}{interactive_date}.

In order to use --load1process interactive_date=20230101, you would use $loads[1]{process}{interactive_date} or $optloads[1]{process}{interactive_date}.
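
As a rough illustration of the option format, the following sketch maps such arguments into structures shaped like %opt and @optloads. parse_options is a hypothetical stand-in, written only for this example; the option handling inside EAI::Wrap may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical parser, only illustrating the documented option format.
sub parse_options {
    my @args = @_;
    my (%opt, @optloads);
    while (@args) {
        my $switch     = shift @args;
        my $assignment = shift @args;
        next unless defined $assignment;
        my ($param, $value) = split /=/, $assignment, 2;
        if ($switch =~ /^--load(\d+)(\w+)$/) {
            $optloads[$1]{$2}{$param} = $value;   # loads level: --load<i><category>
        } elsif ($switch =~ /^--(\w+)$/) {
            $opt{$1}{$param} = $value;            # common level: --<category>
        }
    }
    return (\%opt, \@optloads);
}

my ($opt, $optloads) = parse_options(
    qw(--process interactive_date=20230101 --load1File filename=Other.txt)
);
print $opt->{process}{interactive_date}, "\n";
print $optloads->[1]{File}{filename}, "\n";
```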

The merge inheritance for DB, FTP, File and task can be prevented by using an underscore after the hashkey, ie. DB_, FTP_, File_ and task_. In this case the parameters are not merged from common. However, they are always inherited from config.

%execute

hash of parameters for current task execution which is not set by the user but can be used to set other parameters and control the flow. Most important here are $execute{env} giving the currently used environment (Prod, Test, Dev, whatever), $execute{envraw} (Production is empty here), the several file lists (being processed, for deletion, moving, etc.), flags for ending/interrupting processing, directory locations such as home and history, etc.

Detailed information about the several parameters used can be found in section execute of the configuration parameter reference, there are parameters for files (filesProcessed, filesToArchive, filesToDelete, filesToMoveinHistory, filesToMoveinHistoryUpload, filesToRemove and retrievedFiles), directories (homedir, historyFolder, historyFolderUpload and redoDir), process controlling parameters (failcount, firstRunSuccess, retryBecauseOfError, retrySeconds and processEnd).

Retrying by querying $execute{processEnd} can happen for two reasons: First, because task => {plannedUntil => "HHMM"} is set to a time until which the task has to be retried; this is done at most until midnight. Second, because an error occurred; in this case $process->{hadErrors} is set on each load that failed. $execute{retryBecauseOfError} is also important in this context, as it prevents the following API procedures from being run again if the process didn't have an error:

getLocalFiles, getFilesFromFTP, getFiles, checkFiles, extractArchives, getAdditionalDBData, readFileData, dumpDataIntoDB, writeFileFromDB, putFileInLocalDir, uploadFileToFTP, uploadFileCMD, and uploadFile.

After the first successful run of the task, $execute{firstRunSuccess} is set to prevent any error messages resulting from files having been moved/removed while rerunning the task until the defined planned time (task => {plannedUntil => "HHMM"}) has been reached.

INIT

The INIT procedure is executed at the EAI::Wrap module initialization (when EAI::Wrap is used in the task script) and loads the site configuration, starts logging and reads commandline options. This means that everything passed to the script via command line may be used in the definitions, especially the process{interactive.*} parameters. For these, the name and the type of the parameter are not checked by the consistency checks (all other parameters that are not allowed or have the wrong type throw an error).

removeFilesinFolderOlderX

remove files on the FTP server that are older than a given period (given as day/mon/year in remove => {removeFolders => ["",""], day=>, mon=>, year=>1}), see EAI::FTP::removeFilesOlderX

openDBConn ($)

argument $arg (ref to current load or common)

open a DB connection with the information provided in $DB->{user}, $DB->{pwd} (these can be provided by the sensitive information looked up using $DB->{prefix}) and $DB->{DSN} which can be dynamically configured using information from $DB itself, using $execute{env} inside $DB->{server}{*}: 'driver={SQL Server};Server=$DB->{server}{$execute{env}};database=$DB->{database};TrustedConnection=Yes;', also see EAI::DB::newDBH
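
The dynamic DSN can be pictured with the following sketch. It assumes the template is expanded by evaluating it as a double-quoted string with $DB and %execute in scope; this is an assumption made for illustration, and the actual mechanism in EAI::DB::newDBH may differ. The server and database names are the ones from the SYNOPSIS.

```perl
use strict;
use warnings;

my %execute = (env => "Test");
my $DB = {
    server   => {Prod => "ProdServer", Test => "TestServer"},
    database => "DWH",
    # single quotes keep the template unexpanded until it is evaluated below
    DSN      => 'driver={SQL Server};Server=$DB->{server}{$execute{env}};database=$DB->{database};TrustedConnection=Yes;',
};

# Assumption: expand the template by evaluating it as a double-quoted string,
# so that $DB and %execute from the enclosing scope are interpolated.
my $dsn = eval '"' . $DB->{DSN} . '"';
die "could not expand DSN: $@" if $@;
print "$dsn\n";
```

Because the template is stored in single quotes, the environment is resolved only at connection time, so the same configuration serves Prod and Test.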

openFTPConn ($)

argument $arg (ref to current load or common)

open a FTP connection with the information provided in $FTP->{remoteHost}, $FTP->{user}, $FTP->{pwd}, $FTP->{hostkey}, $FTP->{privKey} (these four can be provided by the sensitive information looked up using $FTP->{prefix}) and $execute{env}, also see EAI::FTP::login

redoFiles ($)

argument $arg (ref to current load or common)

redo file from redo directory if specified ($common{task}{redoFile} is being set), this is also being called by getLocalFiles and getFilesFromFTP. Arguments are fetched from common or loads[i], using File parameter.

getLocalFiles ($)

argument $arg (ref to current load or common)

get local file(s) from source into homedir, uses $File->{filename}, $File->{extension} and $File->{avoidRenameForRedo}. Arguments are fetched from common or loads[i], using File parameter.

getFilesFromFTP ($)

argument $arg (ref to current load or common)

get file/s (can also be a glob for multiple files) from FTP into homedir and extract archives if needed. Arguments are fetched from common or loads[i], using File and FTP parameters.

getFiles ($)

argument $arg (ref to current load or common)

combines above two procedures in a general procedure to get files from FTP or locally. Arguments are fetched from common or loads[i], using File and FTP parameters.

All get<*Files*> functions also parse the file into the datastructure process{data}. Custom "hooks" can be defined with fieldCode and lineCode to modify and enhance the standard mapping defined in format_header. To access the final line data the hash %EAI::File::line can be used (specific fields with $EAI::File::line{<target header column>}). If a field is being replaced using a different name from targetheader, the data with the original header name is placed in %EAI::File::templine. You can also access data from the previous line with %EAI::File::previousline and the previous temp line with %EAI::File::previoustempline.
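
A load definition with a lineCode hook might look like the following sketch. The File parameters are taken from the SYNOPSIS; the body of the hook (uppercasing tradeRef and trimming UTI) is purely an assumed example of post-processing %EAI::File::line, not behaviour prescribed by the module.

```perl
use strict;
use warnings;

# Load definition fragment; the hook body is an assumption for illustration.
my %load = (
    File => {
        filename      => "Datafile2.txt",
        format_sep    => "\t",
        format_skip   => 1,
        format_header => "rptDate\tNotionalVal\ttradeRef\tUTI",
        # invoked after the whole line has been read
        lineCode => sub {
            # normalise the trade reference to upper case
            $EAI::File::line{tradeRef} = uc($EAI::File::line{tradeRef} // "");
            # trim surrounding whitespace from the UTI, if present
            $EAI::File::line{UTI} =~ s/^\s+|\s+$//g if defined $EAI::File::line{UTI};
        },
    },
);
```

An anonymous sub is used here rather than an evaluated string so that syntax errors in the hook surface when the task script is compiled.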

checkFiles ($)

argument $arg (ref to current load or common)

check files for continuation of processing and extract archives if needed. Arguments are fetched from common or loads[i], using File parameter. The processed files are put into process->{filenames}

extractArchives ($)

argument $arg (ref to current load or common)

extract files from archive. Arguments are fetched from common or loads[i], using only the process->{filenames} parameter that was filled by checkFiles.

getAdditionalDBData ($;$)

arguments $arg (ref to current load or common) and optional $refToDataHash

get additional data from DB. Arguments are fetched from common or loads[i], using DB and process parameters. You can also pass an optional ref to a data hash parameter to store the retrieved data there instead of $process->{additionalLookupData}.

readFileData ($)

argument $arg (ref to current load or common)

read data from a file. Arguments are fetched from common or loads[i], using File parameter.

dumpDataIntoDB ($)

argument $arg (ref to current load or common)

store data into Database. Arguments are fetched from common or loads[i], using DB and File (for emptyOK) parameters.

markProcessed ($)

argument $arg (ref to current load or common)

mark files as being processed depending on whether there were errors, also decide on removal/archiving of downloaded files. Arguments are fetched from common or loads[i], using File parameter.

writeFileFromDB ($)

argument $arg (ref to current load or common)

create Data-files from Database. Arguments are fetched from common or loads[i], using DB and File parameters.

putFileInLocalDir ($)

argument $arg (ref to current load or common)

put files into local folder if required. Arguments are fetched from common or loads[i], using File parameter.

markForHistoryDelete ($)

argument $arg (ref to current load or common)

mark to be removed or be moved to history after upload. Arguments are fetched from common or loads[i], using File parameter.

uploadFileToFTP ($)

argument $arg (ref to current load or common)

upload files to FTP. Arguments are fetched from common or loads[i], using FTP and File parameters.

uploadFileCMD ($)

argument $arg (ref to current load or common)

upload files using an upload command program. Arguments are fetched from common or loads[i], using File and process parameters.

uploadFile ($)

argument $arg (ref to current load or common)

combines above two procedures in a general procedure to upload files via FTP or CMD or to put into local dir. Arguments are fetched from common or loads[i], using File and process parameters

processingEnd

final processing steps for processEnd (cleanup, FTP removal/archiving) or retry after pausing. No context argument as this always depends on all loads and/or the common definition

processingPause ($)

generally available procedure for pausing processing, argument $pauseSeconds gives the delay

moveFilesToHistory (;$)

optional argument $archiveTimestamp

move transferred files marked for moving (filesToMoveinHistory/filesToMoveinHistoryUpload) into history and/or historyUpload folder. Optionally a custom timestamp can be passed.

deleteFiles ($)

argument $filenames, ref to array

delete transferred files given in $filenames

CONFIGURATION REFERENCE

config

parameter category for site global settings, defined in site.config and other associated configs loaded at INIT

checkLookup

ref to datastructure {"scriptname.pl" => {errmailaddress => "",errmailsubject => "",timeToCheck =>"", freqToCheck => "", logFileToCheck => "", logcheck => "",logRootPath =>""},...} used for logchecker, each entry of the hash lookup table defines a log to be checked, defining errmailaddress to receive error mails, errmailsubject, timeToCheck as earliest time to check for existence in log, freqToCheck as frequency of checks (daily/monthly/etc), logFileToCheck as the name of the logfile to check, logcheck as the regex to check in the logfile and logRootPath as the folder where the logfile is found. lookup key: $execute{scriptname} + $execute{addToScriptName}

errmailaddress

default mail address for central logcheck/errmail sending

errmailsubject

default mail subject for central logcheck/errmail sending

executeOnInit

code to be executed during INIT of EAI::Wrap to allow for assignment of config/execute parameters from commandline params BEFORE Logging!

folderEnvironmentMapping

ref to hash {Test => "Test", Dev => "Dev", "" => "Prod"}, mapping for $execute{envraw} to $execute{env}

fromaddress

from address for central logcheck/errmail sending, also used as default sender address for sendGeneralMail

historyFolder

ref to hash {"scriptname.pl" => "folder"}, folders where downloaded files are historized, lookup key as in checkLookup, default in "" => "defaultfolder"

historyFolderUpload

ref to hash {"scriptname.pl" => "folder"}, folders where uploaded files are historized, lookup key as in checkLookup, default in "" => "defaultfolder"

logCheckHoliday

calendar for business days in central logcheck/errmail sending. builtin calendars are AT (Austria), TG (Target), UK (United Kingdom) and WE (for only weekends). Calendars can be added with EAI::DateUtil::addCalendar

logs_to_be_ignored_in_nonprod

logs to be ignored in central logcheck/errmail sending

logRootPath

ref to hash {"scriptname.pl" => "folder"}, paths to log file root folders (environment is added to that if non production), lookup key as checkLookup, default in "" => "defaultfolder"

redoDir

ref to hash {"scriptname.pl" => "folder"}, folders where files for redo are contained, lookup key as checkLookup, default in "" => "defaultfolder"

sensitive

hash lookup table ({"prefix" => {user=>"",pwd =>"",hostkey=>"",privKey =>""},...}) for sensitive access information in DB and FTP (lookup keys are set with DB{prefix} or FTP{prefix}), may also be placed outside of site.config; all sensitive keys can also be environment lookups, e.g. hostkey=>{Test => "", Prod => ""} to allow for environment specific setting

smtpServer

smtp server for the (error) mail sending

smtpTimeout

timeout for smtp response

testerrmailaddress

error mail address in non prod environment

execute

hash of parameters for current task execution which is not set by the user but can be used to set other parameters and control the flow

alreadyMovedOrDeleted

hash for checking the already moved or deleted files, to avoid moving/deleting them again at cleanup

addToScriptName

this can be set to be added to the scriptname for config{checkLookup} keys, e.g. some passed parameter.

env

Prod, Test, Dev, whatever

envraw

Production has a special significance here as being the empty string (used for paths). Otherwise like env.

errmailaddress

for central logcheck/errmail sending in current process

errmailsubject

for central logcheck/errmail sending in current process

failcount

for counting failures in processing to switch to longer wait period or finish altogether

filesToArchive

list of files to be moved in archiveDir on FTP server, necessary for cleanup at the end of the process

filesToDelete

list of files to be deleted on FTP server, necessary for cleanup at the end of the process

filesToMoveinHistory

list of files to be moved in historyFolder locally, necessary for cleanup at the end of the process

filesToMoveinHistoryUpload

list of files to be moved in historyFolderUpload locally, necessary for cleanup at the end of the process

filesToRemove

list of files to be deleted locally, necessary for cleanup at the end of the process

firstRunSuccess

for planned retries (task=>plannedUntil filled) -> this is set after the first run to avoid error messages resulting from files having been moved/removed.

freqToCheck

for logchecker: frequency to check entries (B,D,M,M1) ...

homedir

the home folder of the script, mostly used to return from redo and other folders for globbing files.

historyFolder

actually set historyFolder

historyFolderUpload

actually set historyFolderUpload

logcheck

for logchecker: the Logcheck (regex)

logFileToCheck

for logchecker: Logfile to be searched

logRootPath

actually set logRootPath

processEnd

specifies that the process is ended, checked in EAI::Wrap::processingEnd

redoDir

actually set redoDir

retrievedFiles

files retrieved from FTP or redo directory

retryBecauseOfError

retryBecauseOfError shows if a rerun occurs due to errors (for successMail) and also prevents several API calls from being run again.

retrySeconds

how many seconds pass between retries. This is set on error with task=>retrySecondsErr and, if a planned retry is defined, with task=>retrySecondsPlanned

scriptname

name of the current process script, also used in log/history setup together with addToScriptName for config{checkLookup} keys

timeToCheck

for logchecker: scheduled time of job (don't look earlier for log entries)

DB

DB specific configs

addID

this hash can be used to additionally set a constant to given fields: Fieldname => Fieldvalue

additionalLookup

query used in getAdditionalDBData to retrieve lookup information from DB using readFromDBHash

additionalLookupKeys

used for getAdditionalDBData, list of field names to be used as the keys of the returned hash

cutoffYr2000

when storing date data with 2 year digits in dumpDataIntoDB/storeInDB, this is the cutoff where years are interpreted as 19XX (> cutoffYr2000) or 20XX (<= cutoffYr2000)
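
The cutoff rule can be written down as a small sketch. expand_year is a hypothetical helper that only mirrors the rule stated above; it is not a function exported by EAI::DB.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a two-digit year according to cutoffYr2000.
# Years greater than the cutoff become 19XX, all others 20XX.
sub expand_year {
    my ($yy, $cutoff) = @_;
    return $yy > $cutoff ? 1900 + $yy : 2000 + $yy;
}

# with cutoffYr2000 => 60 as in the SYNOPSIS
printf "%02d -> %d\n", $_, expand_year($_, 60) for (5, 60, 61, 99);
```

With a cutoff of 60 this prints 2005, 2060, 1961 and 1999 for the four sample years.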

columnnames

returned column names from readFromDB and readFromDBHash, this is used in writeFileFromDB to pass column information from database to writeText

database

database to be used for connecting

debugKeyIndicator

used in dumpDataIntoDB/storeInDB as an indicator for keys for debugging information if primkey not given (errors are shown with this key information). Format is the same as for primkey

deleteBeforeInsertSelector

used in dumpDataIntoDB/storeInDB to delete specific data defined by keydata before an insert (first occurrence in data is used for key values). Format is the same as for primkey ("key1 = ? ...")

dontWarnOnNotExistingFields

suppress warnings in dumpDataIntoDB/storeInDB for not existing fields

dontKeepContent

if table should be completely cleared before inserting data in dumpDataIntoDB/storeInDB

doUpdateBeforeInsert

invert insert/update sequence in dumpDataIntoDB/storeInDB, insert only done when upsert flag is set

DSN

DSN String for DB connection

incrementalStore

when storing data with dumpDataIntoDB/storeInDB, avoid setting empty columns to NULL

ignoreDuplicateErrs

ignore any duplicate errors in dumpDataIntoDB/storeInDB

keyfields

used for readFromDBHash, list of field names to be used as the keys of the returned hash

longreadlen

used for setting database handles LongReadLen parameter for DB connection, if not set defaults to 1024

noDBTransaction

don't use a DB transaction for dumpDataIntoDB

noDumpIntoDB

if files from this load should not be dumped to the database

postDumpExecs

array for execs done in dumpDataIntoDB after postDumpProcessing and before commit/rollback: [{execs => ['',''], condition => ''}]. doInDB all execs if condition (evaluated string or anonymous sub: condition => sub {...}) is fulfilled

postDumpProcessing

done in dumpDataIntoDB after storeInDB, execute perl code in postDumpProcessing (evaluated string or anonymous sub: postDumpProcessing => sub {...})

postReadProcessing

done in writeFileFromDB after readFromDB, execute perl code in postReadProcessing (evaluated string or anonymous sub: postReadProcessing => sub {...})

prefix

key for sensitive information (e.g. pwd and user) in config{sensitive}

primkey

primary key indicator to be used for update statements, format: "key1 = ? AND key2 = ? ..."

pwd

for password setting, either directly (insecure -> visible) or via sensitive lookup

query

query statement used for readFromDB and readFromDBHash

schemaName

schemaName used in dumpDataIntoDB/storeInDB, if tablename contains a dot, the schema extracted from tablename overrides this. Needed for datatype information!

server

DB Server in environment hash lookup: {Prod => "", Test => ""}

tablename

the table where data is stored in dumpDataIntoDB/storeInDB

upsert

in dumpDataIntoDB/storeInDB, should an update be done after the insert failed (because of duplicate keys) or an insert after the update failed (because the key does not exist)?

user

for user setting, either directly (insecure -> visible) or via sensitive lookup

File

File parsing specific configs

avoidRenameForRedo

when redoing, usually the cutoff (datetime/redo info) is removed following a pattern. set this flag to avoid this

columns

for writeText: Hash of data fields, that are to be written (in order of keys)

columnskip

for writeText: boolean hash of column names that should be skipped when writing the file ({column1ToSkip => 1, column2ToSkip => 1, ...})

dontKeepHistory

if up- or downloaded file should not be moved into historyFolder but be deleted

dontMoveIntoHistory

if up- or downloaded file should not be moved into historyFolder but be kept in homedir

emptyOK

flag to specify whether empty files should not invoke an error message. Also needed to mark an empty file as processed in EAI::Wrap::markProcessed

extract

flag to specify whether to extract files from archive package (zip)

extension

the extension of the file to be read (optional, used for redoFile)

fieldCode

additional field based processing code: fieldCode => {field1 => 'perl code', ..}, invoked if key equals either header (as in format_header) or targetheader (as in format_targetheader) or invoked for all fields if key is empty {"" => 'perl code'}. set $EAI::File::skipLineAssignment to true (1) if current line should be skipped from data. perl code can be an evaluated string or an anonymous sub: field1 => sub {...}

filename

the name of the file to be read

firstLineProc

processing done in reading the first line of text files

format_allowLinefeedInData

line feeds in values don't create artificial new lines/records, only works for csv quoted data

format_beforeHeader

additional String to be written before the header in write text

format_dateColumns

numeric array of columns that contain date values (special parsing) in excel files

format_decimalsep

decimal separator used in numbers of sourcefile (defaults to . if not given)

format_defaultsep

default separator when format_sep not given (usually in site.config), if not given, "\t" is used as default.

format_encoding

text encoding of the file in question (e.g. :encoding(utf8))

format_headerColumns

optional numeric array of columns that contain data in excel files (defaults to all columns starting with first column up to format_targetheader length)

format_header

format_sep separated string containing header fields (optional in excel files, only used to check against existing header row)

format_headerskip

skip until row-number for checking header row against format_header in excel files

format_eol

for quoted csv specify special eol character (allowing newlines in values)

format_fieldXpath

for XML reading, hash with field => xpath to content association entries

format_fix

for text writing, specify whether fixed length format should be used (requires format_padding)

format_namespaces

for XML reading, hash with alias => namespace association entries

format_padding

for text writing, hash with field number => padding to be applied for fixed length format

format_poslen

array of positions/length definitions: e.g. "poslen => [(0,3),(3,3)]" for fixed length format text file parsing

format_quotedcsv

special parsing/writing of quoted csv data using Text::CSV

format_sep

separator string for csv format, regex for split for other separated formats. Also needed for splitting up format_header and format_targetheader (Excel and XML-formats use tab as default separator here).

format_sepHead

special separator for header row in write text, overrides format_sep

format_skip

either numeric or string, skip until row-number if numeric or appearance of string otherwise in reading textfile

format_stopOnEmptyValueColumn

for excel reading, stop row parsing when a cell with this column number is empty (denotes end of data, to avoid very long parsing).

format_suppressHeader

for textfile writing, suppress output of header

format_targetheader

format_sep separated string containing target header fields (= the field names in target/database table). optional for XML and tabular textfiles, defaults to format_header if not given there.

format_thousandsep

thousand separator used in numbers of sourcefile (defaults to , if not given)

format_worksheetID

worksheet number for excel reading, this should always work

format_worksheet

alternatively the worksheet name can be passed, this only works for new excel format (xlsx)

format_xlformat

excel format for parsing, also specifies excel parsing

format_xpathRecordLevel

xpath for level where data nodes are located in xml

format_XML

specify xml parsing

lineCode

additional line based processing code, invoked after whole line has been read (evaluated string or anonymous sub: lineCode => sub {...})

localFilesystemPath

if files are taken from or put to the local file system with getLocalFiles/putFileInLocalDir then the path is given here. Setting this to "." avoids copying files.

optional

to avoid error message for missing optional files, set this to 1

FTP

FTP specific configs

archiveDir

folder for archived files on the FTP server

dontMoveTempImmediately

if 0 or missing: rename/move files immediately after writing to FTP to the final name, otherwise/1: a call to EAI::FTP::moveTempFiles is required for that

dontDoSetStat

for Net::SFTP::Foreign, no setting of time stamp of remote file to that of local file (avoid error messages of FTP Server if it doesn't support this)

dontDoUtime

don't set time stamp of local file to that of remote file

dontUseQuoteSystemForPwd

for windows, a special quoting is used for passing passwords to Net::SFTP::Foreign that contain [()"<>& . This flag can be used to disable this quoting.

dontUseTempFile

directly upload files, without temp files

fileToArchive

should file be archived on FTP server? requires archiveDir to be set

fileToRemove

should file be removed on FTP server?

FTPdebugLevel

debug ftp: 0 or ~(1|2|4|8|16|1024|2048), loglevel automatically set to debug for module EAI::FTP

hostkey

hostkey to present to the server for Net::SFTP::Foreign, either directly (insecure -> visible) or via sensitive lookup

localDir

optional: local folder for files to be placed, if not given files are downloaded into current folder

maxConnectionTries

maximum number of tries for connecting in login procedure

onlyArchive

only archive/remove on the FTP server, requires archiveDir to be set

path

additional relative FTP path (under remoteDir which is set at login), where the file(s) is/are located

port

ftp/sftp port (leave empty for default port 22)

prefix

key for sensitive information (e.g. pwd and user) in config{sensitive}

privKey

sftp key file location for Net::SFTP::Foreign, either directly (insecure -> visible) or via sensitive lookup

pwd

for password setting, either directly (insecure -> visible) or via sensitive lookup

queue_size

queue_size for Net::SFTP::Foreign, if > 1 this often causes connection issues

remove

ref to hash {removeFolders=>[], day=>, mon=>, year=>1} for removing (archived) files with removeFilesOlderX, all files in removeFolders that are older than day days, mon months and year years are deleted

remoteDir

remote root folder for up-/download, archive and remove: "out/Marktdaten/", path is added then for each filename (load)

remoteHost

ref to hash of IP-addresses/DNS of host(s).

SFTP

to explicitly use SFTP, if not given SFTP will be derived from existence of privKey or hostkey

simulate

for removal of files using removeFilesinFolderOlderX/removeFilesOlderX only simulate (1) or do actually (0)?

sshInstallationPath

path where the ssh/plink exe to be used by Net::SFTP::Foreign is located

type

(A)scii or (B)inary

user

for user setting, either directly (insecure -> visible) or via sensitive lookup

process

used to pass information within each process (data, additionalLookupData, filenames, hadErrors or commandline parameters starting with interactive) and for additional configurations not suitable for DB, File or FTP (e.g. uploadCMD* and onlyExecFor)

additionalLookupData

additional data retrieved from database with EAI::Wrap::getAdditionalDBData

archivefilenames

in case a zip archive package is retrieved, the filenames of these packages are kept here, necessary for cleanup at the end of the process

data

loaded data: array (rows) of hash refs (columns)

filenames

names of files that were retrieved and checked to be locally available for that load, can be more than the defined file in File->filename (due to glob spec or zip archive package)

filesProcessed

hash for checking the processed files, necessary for cleanup at the end of the whole task

hadErrors

set to 1 if there were any errors in the process

interactive_

interactive options (are not checked), can be used to pass arbitrary data via command line into the script (eg a selected date for the run with interactive_date).

onlyExecFor

mark loads to only be executed when $common{task}{execOnly} =~ $load->{process}{onlyExecFor} (loads where this does not match are removed, see task parameter execOnly)

uploadCMD

upload command for use with uploadFileCMD

uploadCMDPath

path of upload command

uploadCMDLogfile

logfile where command given in uploadCMD writes output (for error handling)

task

contains parameters used on the task script level

customHistoryTimestamp

optional custom timestamp to be added to filenames moved to History/HistoryUpload/FTP archive, if not given, get_curdatetime is used (YYYYMMDD_hhmmss)

execOnly

used to remove loads where $common{task}{execOnly} !~ $load->{process}{onlyExecFor}

ignoreNoTest

ignore the notest file in the process-script folder, usually preventing all runs that are not in production

plannedUntil

latest time that planned repetition should start, this can be given either as HHMM (HourMinute) or HHMMSS (HourMinuteSecond), in case of HHMM the "Second" part is attached as 59
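
The two accepted formats can be illustrated as follows. normalize_planned_until is a hypothetical helper that only restates the rule above (HHMM gets 59 appended as seconds); how EAI::Wrap handles invalid input is not documented here, so the die is an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: bring plannedUntil into HHMMSS form.
sub normalize_planned_until {
    my ($plannedUntil) = @_;
    return $plannedUntil . "59" if $plannedUntil =~ /^\d{4}$/;   # HHMM -> HHMM59
    return $plannedUntil        if $plannedUntil =~ /^\d{6}$/;   # already HHMMSS
    die "plannedUntil must be HHMM or HHMMSS, got '$plannedUntil'\n";
}

print normalize_planned_until("2359"), "\n";
print normalize_planned_until("180030"), "\n";
```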

redoFile

flag for specifying a redo

redoTimestampPatternPart

part of the regex for checking against filename in redo with additional timestamp/redoDir pattern (e.g. "redo", numbers and _), anything after files barename (and before ".$ext" if extension is defined) is regarded as a timestamp. Example: '[\d_]', the regex is built like ($ext ? qr/$barename($redoTimestampPatternPart|$redoDir)*\.$ext/ : qr/$barename($redoTimestampPatternPart|$redoDir)*.*/)
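
To see which names such a pattern accepts, the regex can be built as stated above and tried against a few filenames. The file names, the barename and the redo folder name "redo" are illustrative values, not taken from the module.

```perl
use strict;
use warnings;

# illustrative values
my ($barename, $ext)         = ("Datafile2", "txt");
my $redoTimestampPatternPart = '[\d_]';
my $redoDir                  = "redo";

# regex built as given in the description of redoTimestampPatternPart
my $pattern = $ext
    ? qr/$barename($redoTimestampPatternPart|$redoDir)*\.$ext/
    : qr/$barename($redoTimestampPatternPart|$redoDir)*.*/;

for my $file ("Datafile2.txt", "Datafile2_20230101_120000.txt", "Datafile2_final.txt") {
    printf "%-32s %s\n", $file, ($file =~ $pattern ? "matches" : "no match");
}
```

The first two names match, because everything between the barename and the extension consists of digits and underscores; "Datafile2_final.txt" does not, since "final" is not covered by the timestamp part.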

retrySecondsErr

retry period in case of error

retrySecondsErrAfterXfails

after fail count is reached this alternate retry period in case of error is applied. If 0/undefined then job finishes after fail count

retrySecondsXfails

fail count after which the retrySecondsErr are changed to retrySecondsErrAfterXfails
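
The interplay of retrySecondsErr, retrySecondsXfails and retrySecondsErrAfterXfails can be sketched like this. retry_seconds_on_error is a hypothetical helper, and treating "fail count reached" as failcount >= retrySecondsXfails is an assumption of this sketch rather than a documented detail.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the seconds to wait before the next retry,
# or undef when the task should finish instead of retrying.
sub retry_seconds_on_error {
    my ($task, $failcount) = @_;
    my $limit = $task->{retrySecondsXfails};
    # below the fail count (or no fail count configured): normal error retry period
    return $task->{retrySecondsErr} if !$limit || $failcount < $limit;
    # fail count reached: alternate period, or finish if it is 0/undefined
    return $task->{retrySecondsErrAfterXfails} || undef;
}

my %task = (retrySecondsErr => 60*5, retrySecondsXfails => 3, retrySecondsErrAfterXfails => 60*60);
for my $failcount (1 .. 4) {
    my $wait = retry_seconds_on_error(\%task, $failcount);
    printf "failcount %d: %s\n", $failcount, defined $wait ? "retry in $wait s" : "finish";
}
```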

retrySecondsPlanned

retry period in case of planned retry

skipHolidays

skip script execution on holidays

skipHolidaysDefault

holiday calendar to take into account for skipHolidays

skipWeekends

skip script execution on weekends

skipForFirstBusinessDate

used for "wait with execution for first business date", either this is a calendar or 1 (then calendar is skipHolidaysDefault), this cannot be used together with skipHolidays

COPYRIGHT

Copyright (c) 2023 Roland Kapl

All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The full text of the license can be found in the LICENSE file included with this module.