NAME
Catmandu::Importer::Pure - Package that imports Pure data.
SYNOPSIS
    # From the command line
    $ catmandu convert Pure \
        --base https://host/ws/api/... \
        --endpoint research-outputs \
        --apiKey "..."

    # In Perl
    use Catmandu;

    my %attrs = (
        base     => 'https://host/path',
        endpoint => 'research-outputs',
        apiKey   => '...',
        options  => { 'fields' => 'title,type,authors.*' }
    );

    my $importer = Catmandu->importer('Pure', %attrs);

    my $n = $importer->each(sub {
        my $hashref = $_[0];
        # ...
    });

    # get number of validated and approved publications
    my $count = Catmandu->importer(
        'Pure',
        base         => 'https://host/path',
        endpoint     => 'research-outputs',
        apiKey       => '...',
        fullResponse => 1,
        post_xml     => '<?xml version="1.0" encoding="utf-8"?>'
            . '<researchOutputsQuery>'
            . '<size>0</size>'
            . '<workflowSteps>'
            . ' <workflowStep>approved</workflowStep>'
            . ' <workflowStep>validated</workflowStep>'
            . '</workflowSteps>'
            . '</researchOutputsQuery>'
    )->first->{result}[0]{count};
DESCRIPTION
Catmandu::Importer::Pure is a Catmandu package that imports data from Elsevier's Pure system using its REST service. To use the Pure Web Service you need an API key. A list of all available endpoints and further documentation can currently be found under /ws on a web server that is running Pure. Note that this version of the importer has been tested with Pure API version 5.18 and might not work with later versions.
CONFIGURATION
- base
-
Base URL for the REST service is required, for example 'http://purehost.com/ws/api/518'
- endpoint
-
Valid endpoint is required, like 'research-outputs'
- apiKey
-
Valid API key is required for access
- path
-
Path after the endpoint
- user
-
User name if basic authentication is used
- password
-
Password if basic authentication is used
- options
-
Options passed as parameters to the REST service, for example: { 'size' => 20, 'fields' => 'title,type,authors.*' }
- post_xml
-
XML containing a query that will be submitted with a POST request
- fullResponse
-
Optional flag. If true, the complete result is delivered as a single item (record) corresponding to the XML response received. Only one request to the REST service is made in this case. Default is false.
If the flag is false, the items are the child elements of the 'result' element of each response or, if the 'result' element does not exist, the child elements of the root element.
- handler( sub {} | $object | 'NAME' | '+NAME' )
-
Handler to transform each record from XML DOM (XML::LibXML::Element) into a Perl hash.
Handlers can be provided as a function reference, an instance of a Perl package that implements 'parse', or a package NAME. Package names should be prepended by + or prefixed with Catmandu::Importer::Pure::Parser. For example, foobar will create a Catmandu::Importer::Pure::Parser::foobar instance.
By default the handler Catmandu::Importer::Pure::Parser::simple is used. It provides simple XML parsing using XML::LibXML::Simple.
Other possible values are Catmandu::Importer::Pure::Parser::struct, for an XML::Struct based structure that preserves order, and Catmandu::Importer::Pure::Parser::raw, which returns the XML as it is.
- userAgent
-
HTTP user agent string, set to Mozilla/5.0 by default.
- furl
-
Instance of Furl or compatible class to fetch URLs with.
- timeout
-
Timeout for HTTP requests in seconds. Defaults to 50.
- trim_text
-
Optional flag. If true then all text nodes in the REST response are trimmed so that any leading and trailing whitespace is removed before parsing. This is useful if you don't want to risk getting leading and trailing whitespace in your data, since Pure doesn't currently clean leading/trailing white space from user input. Note that there is a small performance penalty when using this option. Default is false.
- filter( sub {} )
-
Optional reference to a function that processes the XML response before it is parsed. The argument to the function is a reference to the XML text, which the function can use to modify the text in place. This option is normally not needed but can be helpful if there is a problem parsing the response due to a bug in the REST service.
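To illustrate the handler and filter options described above, here is a minimal sketch of what such function references might look like. It is not taken from the module itself. The 'uuid' attribute and 'title' child element are assumed names chosen for illustration, and the control-character cleanup is just one plausible use of a filter. The handler only relies on standard XML::LibXML::Element methods (nodeName, getAttribute, findvalue).

```perl
use strict;
use warnings;

# Filter sketch: receives a reference to the raw XML text and edits it
# in place. Here it strips control characters that are not allowed in
# XML 1.0 (an assumed example of a service-side problem).
my $filter = sub {
    my ($xml_ref) = @_;
    $$xml_ref =~ s/[\x00-\x08\x0B\x0C\x0E-\x1F]//g;
    return;
};

# Handler sketch: receives one record as an XML::LibXML::Element and
# returns a Perl hash. 'uuid' and 'title' are hypothetical names; if the
# response uses XML namespaces the XPath would need adjusting.
my $handler = sub {
    my ($element) = @_;
    return {
        type  => $element->nodeName,
        uuid  => $element->getAttribute('uuid'),
        title => $element->findvalue('./title'),
    };
};

# Demonstrate the filter on a sample string (no network access needed).
my $xml = "<result><title>Bad\x01 char</title></result>";
$filter->(\$xml);
print "$xml\n";    # prints <result><title>Bad char</title></result>
```

Under these assumptions the two references would be passed to the importer as handler => $handler and filter => $filter, alongside base, endpoint and apiKey.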
METHODS
In addition to methods inherited from Catmandu::Iterable, this module provides the following public methods:
- url
-
Return the current Pure REST request URL (useful for debugging).
SEE ALSO
AUTHOR
Snorri Briem <briem@cpan.org>
COPYRIGHT
Copyright 2017- Lund University Library
LICENSE
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.