NAME
Data::AnyXfer::Elastic::Importer
SYNOPSIS
use Data::AnyXfer::Elastic::Importer;
my $importer = Data::AnyXfer::Elastic::Importer->new(
logger => Data::AnyXfer::Elastic::Logger->new,
);
my $datafile = DataFile->new( file => 'my/project.datafile' );
# Put this live on the relevant clusters
my $response = $importer->deploy( datafile => $datafile );
# Alternatively, you can deploy in steps;
# see the source of deploy()
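A stepwise deployment might look like the following sketch, built from the execute, finalise, errors and cleanup methods documented below:

```perl
# Deploy in steps: play the datafile first, finalise later.
my $docs = $importer->execute( datafile => $datafile );

if ( defined $docs ) {
    # ...do something else here, e.g. validate the new index...
    $importer->finalise;    # switch the aliases over, making it live
}
else {
    my $errors = $importer->errors;    # inspect what went wrong
    $importer->cleanup;                # remove the partially built indexes
}
```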
DESCRIPTION
The Elasticsearch Importer is designed to take a datafile and stream the index into Elasticsearch cluster(s). This process is known as playing the datafile. The process creates an index with the mappings/settings defined in the datafile, and will create indexes on multiple clusters depending on the silo given. Once the index has been created it can then be finalised, which makes the index live by switching over the alias.
ATTRIBUTES
- logger

  Logs events and errors to file. An instance of
  Data::AnyXfer::Elastic::Logger.

- bulk_max_count

  Perl number. Defaults to 500. The maximum number of items which will be
  sent by the bulk helper before a flush is performed.

- wait_count_timeout

  Perl number. Defaults to 10. The maximum number of seconds to wait after
  indexing for the number of visible documents in the index to reach the
  expected count before treating the import as a failure.

- delete_before_create

  Boolean. Defaults to 0. When true, the importer instance will attempt to
  delete the indexes before creating them during execute.

- document_id_field

  String. Optional. Allows you to specify a field on each document which
  will also be supplied to Elasticsearch as the document's _id.
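Putting the attributes above together, a fully configured importer might be constructed like this (the values shown are illustrative, not defaults):

```perl
use Data::AnyXfer::Elastic::Importer;
use Data::AnyXfer::Elastic::Logger;

my $importer = Data::AnyXfer::Elastic::Importer->new(
    logger               => Data::AnyXfer::Elastic::Logger->new,
    bulk_max_count       => 1000,  # flush the bulk helper every 1000 items
    wait_count_timeout   => 30,    # wait up to 30s for the doc count to settle
    delete_before_create => 1,     # drop any existing index before creating
    document_id_field    => 'id',  # use each document's 'id' field as the _id
);
```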
METHODS
deploy
This method "plays" the datafile. It streams the data from the datafile into Elasticsearch. It creates a unique index based on the datafile's time-stamp and will assign the mapping, settings etc. Datafile documents are then streamed into the index via the bulk helper. Finally, it swaps the aliases, making the index 'live'.
my $response = $importer->deploy(
datafile => $datafile, # required
silo => 'public_data', # optional
no_finalise => 1, # optional, does not call finalise
);
- datafile

  A required Data::AnyXfer::Elastic::Import::DataFile object that defines
  the content and configuration of an Elasticsearch index.

- silo

  An optional string that overrides the silo defined in the $datafile
  index info.

- no_finalise

  An optional bool; when true, finalise is not run at the end of the
  successful deployment of the datafile to all intended nodes and clusters
  (by default, finalise is run). Useful for situations where you need to
  delay or co-ordinate switching the data over with some other action.
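For example, a deferred switch-over using no_finalise might look like this sketch:

```perl
# Deploy the datafile but do not switch the aliases yet
my $response = $importer->deploy(
    datafile    => $datafile,
    no_finalise => 1,
);

# ...co-ordinate with other actions, e.g. deploy related datafiles...

# Now make the new index live
$importer->finalise;
```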
execute
Use deploy where you can. If you have to do it in several steps (import index, do something else, switch aliases), then see the source of the deploy method.
my $elastic = Data::AnyXfer::Elastic->new;
my @clients = $elastic->all_clients_for($silo);
foreach my $client ( @clients ) {
$importer->execute(
datafile => $datafile, # required
elasticsearch => $client, # required
);
}
This method takes a datafile and plays it into Elasticsearch. It differs from deploy in that it does not automatically finalise. It returns the number of documents played on successful execution, or undef on error. The elasticsearch argument must be a Search::Elasticsearch object generated from Data::AnyXfer::Elastic. If not provided, a client will be generated from the datafile silo configuration.
finalise
$importer->finalise;
This method finalises the deployment by switching aliases for each datafile executed. It will concurrently add the alias to the new index while removing any previous associations. See the deploy source before using this directly.
errors
my $errors = $importer->errors;
Returns the errors that have occurred.
cleanup
$importer->cleanup;
This method removes all indexes the importer has created and empties the cache. It cannot be called once finalise() has been called.
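A common use for cleanup is rolling back a failed stepwise import before any aliases have been switched; a sketch:

```perl
# Roll back on failure: execute() returns undef on error, and
# cleanup() removes the indexes this importer created (valid here
# because finalise() has not yet been called)
my $docs = $importer->execute( datafile => $datafile );

unless ( defined $docs ) {
    my $errors = $importer->errors;    # capture errors for logging
    $importer->cleanup;                # remove the partially built indexes
    die 'import failed; see errors';
}

$importer->finalise;                   # only reached on success
```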
COPYRIGHT
This software is copyright (c) 2019, Anthony Lucas.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.