
NAME

FAIR::Accessor - all this does is assign the HTTP call to the correct routine

VERSION

version 0.4

SYNOPSIS

The following code is a complete implementation of a 'Hello, World!' FAIR Accessor:

 C<#!/usr/bin/perl -w

 package HelloWorld_Accessor;  # this should be the same as your filename!

 use strict;
 use warnings;
 use JSON;


 #-----------------------------------------------------------------
 # Configuration and Daemon
 #-----------------------------------------------------------------

 use base 'FAIR::Accessor';

 my $config = {
    title => 'Hello World Data Accessor',
    serviceTextualDescription => 'Server for some Helloworld Data',
    textualAccessibilityInfo => "The information from this server requires no authentication",  # this could also be a URI describing the accessibility
    mechanizedAccessibilityInfo => "",  # this must be a URI to an RDF document
    textualLicenseInfo => "CC-BY",  # this could also be a URI to the license info
    mechanizedLicenseInfo =>  "", # this must be a URI to an RDF document
    baseURI => "", # I don't know what this is used for yet, but I have a feeling I will need it!
    ETAG_Base => "HelloWorld_Accessor_For_Greetings", # this is a unique identification string for the service (required by the LDP specification)
    localNamespaces => {hw => 'http://hello.world.org/some/items/',
                        hw2 => 'http://another.hello.world.org/some/predicates/'},  # add a few new namespaces to the list of known namespaces....
    localMetadataElements => [qw(hw:Greeting hw2:grusse) ],  # things that we use in addition to common metadata elements
    # baseURI => 'some/regular/expression', # OPTIONAL regexp to match the RESTful PATH part of the URL, before the ID number

 };

 my $service = HelloWorld_Accessor->new(%$config);

 # start daemon
 $service->handle_requests;


 #-----------------------------------------------------------------
 # Accessor Implementation
 #-----------------------------------------------------------------



 =head2 MetaContainer

  Function: REQUIRED SUBROUTINE - returns the first-stage LD Platform list of contained URIs and the dataset metadata.
  Args    : $starting_at_record : this will be passed-in to tell you what record to start with (for paginated responses)
            $path : the webserver's PATH_INFO environment value (used to modify the behaviour of REST services)
  Returns : JSON encoded listref of 'meta URIs' representing individual records
  Note    :  meta URIs are generally URIs that point back to this same server; calling GET on a meta URI will
            return an RDF description of the set of DCAT distributions for that record.
            The format of the JSON response is as follows:
            
            {"metadata:element1" : "some value",
             "external:metadatatype":  "some other value",
             "void:entities" : "3",
             "ldp:contains" : ["http://myserver.org/ThisScript/record/479-467-29",
                               "http://myserver.org/ThisScript/record/479-467-32",
                               "http://myserver.org/ThisScript/record/479-467-434"
                               ]
            }
            
            Recommended metadata elements include dc:title, dcat:description, dcat:identifier,
            dcat:keyword, dcat:landingPage, dcat:publisher, dcat:theme
            
            note #1:  Using dcat:theme requires you to create a SKOS concept scheme of the various ontology
            terms that describe the data in your repository... this isn't hard, but it's not entirely trivial either...
            
            note #2:  if you return URLs in the ldp:contains, then you must also return the count of those URLs in void:entities


 =cut

 sub MetaContainer {

    my ($self, %ARGS) = @_;
    my $PATH = $ARGS{'PATH'} || "";  # if there was a specific path sent after the script URL, it will be here
    
    # this is how you would manage "RESTful" references to different subsets of your data repository
    if ($PATH =~ /DataSliceX/) {
        # some behavior for Data Slice X
    } elsif ($PATH =~ /DataSliceY/) {
        # some behavior for Data Slice Y
    }
    
    my %result =  (  # NOTE THAT ALL OF THESE ARE OPTIONAL!  (and there are more fields.... see DCAT...)
                    'dc:title' => "Hello World Accessor Server",
                    'dcat:description' => "the prototype Accessor server for Hello World",
                    'dcat:identifier' => "handle:HelloWorld1234567",
                    'dcat:keyword' => ["greetings", "friendly", "welcome", "Hi"],
                    'dcat:landingPage' => 'http://hello.world.net/homepage.html',
                    'dcat:language' => 'en',
                    'dcat:publisher' => 'http://hello.world.net',
                    'dcat:temporal' => 'http://reference.data.gov.uk/id/quarter/2006-Q1',  # look at this!!  It doesn't have to be this complex, but it can be!
                    'dcat:theme'  => 'http://example.org/ConceptSchemes/HelloWorld.rdf',  # this is the URI to a SKOS Concept Scheme
                    );
    my $BASE_URL = "http://" . $ENV{'SERVER_NAME'} . $ENV{'REQUEST_URI'} . $PATH;

    # you may choose to return no record IDs at all, if you only want to serve repository-level metadata
    my @known_records = ($BASE_URL . "/hello",
                         $BASE_URL . "/world",
                         # ...  you need to generate this list of record URIs here... somehow
                        );
    $result{'void:entities'} = scalar(@known_records);  #  THE TOTAL *NUMBER* OF RECORDS THAT CAN BE SERVED
    $result{'ldp:contains'} = \@known_records; # the listref of record ids
    
    return encode_json(\%result);

 }


 =head2 Distributions

  Function: REQUIRED IF the URIs returned by MetaContainer point back to this script.
           returns the second-stage LD Platform metadata describing the DCAT distributions, formats, and URLs
           for a particular record
  Args    : $ID : the desired ID number, as determined by the Accessor.pm module
           $PATH_INFO : the webserver's PATH_INFO environment value (in case the $ID best-guess is wrong... then you're on your own!)
  Returns : JSON encoded hashref of 'meta URIs' representing individual DCAT distributions and their mime-type (mime-type is key)
            The format for this response is (you are always allowed to use lists as values if you wish):
            
            {"metadata":
                {"rdf:type": ["edam:data_0006","sio:SIO_000088"],
                 "my:metadatathingy":  "some value",
                 "external:metadatatype":  "some other value"
                },
            "distributions":
                {"application/rdf+xml" : "http://myserver.org/ThisScript/record/479-467-29X.rdf",
                 "text/html" : "http://myserver.org/ThisScript/record/479-467-29X.html"
                }
            }

 =cut


 sub Distributions {
    my ($self, %ARGS) = @_;

    my $PATH = $ARGS{'PATH'} || "";  # default to an empty string so the matches below do not warn
    my $ID = $ARGS{'ID'};
    
    my %response;
    my %formats;
    my %metadata;

    # this is how you would manage "RESTful" references to different subsets of your data repository
    if ($PATH =~ /DataSliceX/) {
        # some behavior for Data Slice X
    } elsif ($PATH =~ /DataSliceY/) {
        # some behavior for Data Slice Y
    }
    
    $formats{'text/html'} = 'http://myserver.org/ThisScript/helloworld.html';
    $formats{'application/rdf+xml'} = 'http://myserver.org/ThisScript/helloworld.rdf';

    # set the ontological type for the record  (optional)
    $metadata{'rdf:type'} = ['edam:data_0006', 'sio:SIO_000088'];
    
    # and whatever other metadata you wish (also optional)
    # extractMetaDataFromSpreadsheet(\%metadata, $ID);    

    $response{distributions} = \%formats;
    $response{metadata} = \%metadata if (keys %metadata);  # only set it if you can provide something

    my $response  = encode_json(\%response);
    
    return $response;

 }

>

DESCRIPTION

FAIR Accessors are an implementation of the W3C's Linked Data Platform.

FAIR Accessors follow a two-stage interaction, where the first stage retrieves metadata about the repository, and (optionally) a series of URLs representing 'meta-records' for every record in that repository (or whatever slice of the repository is being served). This is accomplished by the MetaContainer subroutine. These URLs will generally point back at this same Accessor script (e.g. with the record number appended to the URL: http://this.host/thisscript/12345).
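
To illustrate the first-stage response, here is a minimal sketch that decodes a MetaContainer-style JSON document and checks that void:entities agrees with the number of URLs in ldp:contains. The helper name check_container and the sample URLs are assumptions made for this example and are not part of FAIR::Accessor; JSON::PP is used only because it ships with Perl.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json encode_json);

# Hypothetical helper (not part of FAIR::Accessor): returns the record
# URLs from a first-stage response, and dies if void:entities is
# missing or disagrees with the length of ldp:contains.
sub check_container {
    my ($json) = @_;
    my $data    = decode_json($json);
    my $records = $data->{'ldp:contains'} || [];
    if (@$records) {
        my $count = $data->{'void:entities'};
        die "void:entities is missing\n" unless defined $count;
        die "void:entities ($count) does not match ldp:contains\n"
            unless $count == @$records;
    }
    return @$records;
}

# Sample first-stage response, built here only for demonstration.
my $sample = encode_json({
    'dc:title'      => 'Hello World Accessor Server',
    'void:entities' => 2,
    'ldp:contains'  => [
        'http://this.host/thisscript/hello',
        'http://this.host/thisscript/world',
    ],
});

print "$_\n" for check_container($sample);
```

A response with no ldp:contains list passes the check and yields an empty list, which matches the metadata-only case.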

The second stage involves retrieving metadata about individual records. The metadata is up to you, but optimally it would include the available DCAT distributions and their file formats. The second stage can be accomplished by this same Accessor script, using the Distributions subroutine.
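
As a sketch of how a client might use a second-stage response, the following picks a distribution URL by MIME type from a document shaped like the one the Distributions subroutine returns. The helper name pick_distribution and the preference list are hypothetical and are not part of FAIR::Accessor.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: given a second-stage JSON response and a list of
# MIME types in order of preference, return the first matching URL, or
# undef if none of the preferred formats is offered.
sub pick_distribution {
    my ($json, @preferred) = @_;
    my $formats = decode_json($json)->{distributions} || {};
    for my $mime (@preferred) {
        return $formats->{$mime} if exists $formats->{$mime};
    }
    return;
}

# Sample second-stage response, following the format shown in the synopsis.
my $response = <<'JSON';
{"metadata": {"rdf:type": ["edam:data_0006", "sio:SIO_000088"]},
 "distributions": {
    "application/rdf+xml": "http://myserver.org/ThisScript/helloworld.rdf",
    "text/html": "http://myserver.org/ThisScript/helloworld.html"
 }
}
JSON

my $url = pick_distribution($response, 'text/turtle', 'application/rdf+xml');
print defined $url ? "$url\n" : "no suitable distribution\n";
```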

The two subroutine names, MetaContainer and Distributions, are fixed: the Accessor libraries call them by name.

You MUST create the MetaContainer subroutine, at a minimum, and it should return some metadata. It does not have to return a list of known records. Without that list it simply acts as a metadata descriptor of the repository in general, which is fine; there will be no second-stage interaction, and you do not need to provide a Distributions subroutine.
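
A metadata-only MetaContainer might look like the following minimal sketch. The package name MetadataOnly_Accessor is hypothetical, and the subroutine is called directly here so that the example runs without a web server; in a real Accessor the package would inherit from FAIR::Accessor, as in the synopsis, and the libraries would make the call.

```perl
use strict;
use warnings;

package MetadataOnly_Accessor;   # hypothetical package name for this sketch

use JSON::PP;   # core module; the synopsis uses JSON, which also exports encode_json

# Serves repository-level metadata only: there is no ldp:contains list,
# so void:entities is omitted and no Distributions subroutine is needed.
sub MetaContainer {
    my ($self, %ARGS) = @_;
    my %result = (
        'dc:title'         => 'Hello World Accessor Server',
        'dcat:description' => 'the prototype Accessor server for Hello World',
        'dcat:keyword'     => ['greetings', 'friendly'],
    );
    return encode_json(\%result);
}

package main;

# Direct call for demonstration purposes only.
print MetadataOnly_Accessor->MetaContainer(PATH => ''), "\n";
```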

NAME

    FAIR::Accessor - Module for creating Linked Data Platform Accessors for the FAIR Data project

Command-line testing

If you wish to test your Accessor server at the command line, you can run it with the following command-line arguments (in order):

 Method (always GET, at the moment)
 Domain
 Request URI (i.e. the path to this script, including the script name)
 PATH_INFO  (anything that should appear in the PATH_INFO variable of the webserver)

  perl  myAccessorScript  GET  example.net  /this/myAccessorScript /1234567
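
The four arguments correspond to values a web server would normally supply. How FAIR::Accessor consumes them internally is not shown here; the following is only an illustrative sketch of such a mapping onto the usual CGI variable names. The helper args_to_env and its default values are assumptions and are not the library's own code.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of FAIR::Accessor: map the four
# positional arguments onto the CGI-style variables they stand for.
# The fallback values are illustrative assumptions.
sub args_to_env {
    my ($method, $domain, $request_uri, $path_info) = @_;
    return (
        REQUEST_METHOD => $method      // 'GET',
        SERVER_NAME    => $domain      // 'localhost',
        REQUEST_URI    => $request_uri // '',
        PATH_INFO      => $path_info   // '',
    );
}

my %env = args_to_env(qw(GET example.net /this/myAccessorScript /1234567));
print "$_=$env{$_}\n" for sort keys %env;
```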

AUTHOR

Mark Denis Wilkinson (markw [at] illuminae [dot] com)

COPYRIGHT AND LICENSE

This software is Copyright (c) 2016 by Mark Denis Wilkinson.

This is free software, licensed under:

  The Apache License, Version 2.0, January 2004