NAME
RDF::Sesame::Repository - A repository on a Sesame server
DESCRIPTION
This class is the workhorse of RDF::Sesame. Adding triples, removing triples and querying the repository are all done through instances of this class. SELECT queries are evaluated with select() and CONSTRUCT queries with construct().
METHODS
construct ( %opts )
Evaluates a construct query and returns the RDF serialization of the resulting RDF graph. A minimal invocation looks something like:
    my $q = qq(
        CONSTRUCT {Parent} ex:hasChild {Child}
        FROM {Child} ex:hasParent {Parent}
        USING NAMESPACE
            ex = <http://example.org/things#>
    );
    my $rdf = $repo->construct(
        query  => $q,
        format => 'turtle',
    );
If an error occurs during the construction, an exception is thrown. This is different from some RDF::Sesame methods which return undef.
format
Required: Yes
Indicates the RDF serialization format that the Sesame server should return. Acceptable values are 'rdfxml', 'turtle' and 'ntriples'.
language
Default: SeRQL
Specifies the language in which the construct query is written. This option is only included for forward compatibility, since SeRQL is the only query language in which Sesame supports construct queries.
output
Default: undef
Indicates where the RDF serialization should be placed. The default value of undef means that the serialization should simply be returned as the value of the construct method.
If the value is a filehandle, the serialization is written to that filehandle. The filehandle must already be open for writing. Otherwise, the value is taken to be a filename which is opened for writing (clobbering existing contents) and the serialization is written to the file.
query
Required: Yes
The text of the construct query.
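Because construct throws an exception on error, callers will usually want to wrap it in eval. The following is a minimal sketch of that pattern, assuming $repo is an RDF::Sesame::Repository object; the helper name construct_to_file is hypothetical and not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RDF::Sesame): run a construct query and
# have the Turtle serialization written to $filename.
# Returns (1, undef) on success and (0, $error) if construct() threw.
sub construct_to_file {
    my ( $repo, $query, $filename ) = @_;

    my $ok = eval {
        $repo->construct(
            query  => $query,
            format => 'turtle',
            output => $filename,    # a filename; existing contents are clobbered
        );
        1;
    };
    return ( 1, undef ) if $ok;

    my $error = $@ || 'unknown error';
    return ( 0, $error );
}
```

A caller could then write `my ( $ok, $error ) = construct_to_file( $repo, $q, 'children.ttl' );` and report $error when $ok is false.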
extract ( %opts )
Extract an RDF representation of all the triples in the repository. The only required option is "format", which specifies the serialization format of the resulting RDF. The minimal method invocation looks like
    my $rdf = $repo->extract( format => 'turtle' );
where $rdf is a reference to a scalar containing the serialization of all the triples in the repository. The streaming results returned by Sesame are handled so that memory usage is minimized. If the output is sent to a file (see "output"), only one "chunk" is held in memory at a time (subject to caching by your OS). The serialization may also be compressed (or otherwise processed) as it's being streamed from the server (see "compress").
Error handling is done differently in this method than in other methods in RDF::Sesame. Namely, if an error occurs, an exception is thrown (rather than returning undef and setting errstr()). Eventually, I'd like all methods to behave this way.
compress
Default: 'none'
Indicates how the RDF serialization returned by the Sesame server should be compressed (or otherwise processed) before it's sent to the designated output destination (see "output"). The default value of 'none' indicates that no compression or processing should be performed. The value 'gz' indicates that Compress::Zlib should be used to compress the serialization into the gzip file format.
One may also specify a hash reference as the value of this option. The hash reference should contain the keys 'init', 'content', and 'finish'. The value for each key should be a subroutine reference which will be called during the extraction process.
The 'init' coderef is called before any data is received from Sesame. It receives an output filehandle as its sole argument and should return a "context" value which will be passed to the 'content' and 'finish' callbacks. The context may be any value, but objects and hashrefs seem to be the most useful.
The 'content' coderef is called once for each chunk of data returned from the Sesame server. It receives the context, the output filehandle and a serialization chunk as arguments. Its return value is ignored.
The 'finish' coderef is called after all data has been received from the server and after the last call to the 'content' coderef has completed. 'finish' receives the context and the output filehandle as arguments. Its return value is ignored.
Here is a short example of using callbacks to implement gzip compression (of course gzip compression is already implemented by specifying 'gz' as the compression value):
    my $rdf_gz = $repo->extract(
        format   => 'turtle',
        compress => {
            init => sub {
                my ($fh) = @_;
                require Compress::Zlib;
                binmode $fh;
                my $gz = Compress::Zlib::gzopen( $fh, 'wb' );
                return $gz;    # our context object
            },
            content => sub {
                my ( $context, $fh, $content ) = @_;
                $context->gzwrite($content);
            },
            finish => sub {
                my ( $context, $fh ) = @_;
                $context->gzclose();
            },
        },
    );
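The same three callbacks can carry out other kinds of processing. The sketch below passes each chunk through unchanged while counting chunks and bytes. It assumes, as the gzip example suggests, that the callbacks themselves are responsible for writing to the output filehandle, and that chunks arrive as byte strings. The helper name counting_callbacks is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RDF::Sesame): build a 'compress' hashref
# that copies chunks to the output filehandle unchanged and records how many
# chunks and bytes were seen in the caller's %$stats.
sub counting_callbacks {
    my ($stats) = @_;

    return {
        init => sub {
            my ($fh) = @_;
            binmode $fh;
            return { bytes => 0, chunks => 0 };    # our context
        },
        content => sub {
            my ( $context, $fh, $content ) = @_;
            $context->{bytes}  += length $content;
            $context->{chunks} += 1;
            print {$fh} $content;                  # pass the data through
        },
        finish => sub {
            my ( $context, $fh ) = @_;
            %$stats = %$context;                   # expose totals to the caller
        },
    };
}
```

The returned hashref would be supplied as the value of the 'compress' option, for example `compress => counting_callbacks( \my %stats )`, after which %stats holds the totals.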
format
Required: Yes
Indicates the RDF serialization format that the Sesame server should return. Acceptable values are 'rdfxml', 'turtle' and 'ntriples'.
options
Default: []
Specifies various boolean extraction options provided by Sesame for extracting RDF from the repository. Acceptable options are 'niceOutput', 'explicitOnly', 'data', 'schema'. The values of these options have the meanings indicated in the "User Guide for Sesame 1.2" section 8.1.6. See http://www.openrdf.org/doc/sesame/users/ch08.html#d0e3026.
output
Default: undef
Indicates where the RDF serialization (including processing done according to the 'compress' argument) should be placed. The default value of undef means that the serialization should simply be returned as the value of the extract method.
If the value is a filehandle, the serialization is written to that filehandle. The filehandle must already be open for writing. Otherwise, the value is taken to be a filename which is opened for writing (clobbering existing contents) and the serialization is written to the file.
query_language ( [ $language ] )
Sets or gets the default query language. Acceptable values for $language are "RQL", "RDQL" and "SeRQL" (case sensitive). If an unacceptable value is given, query_language() behaves as if no $language had been provided.
When an RDF::Sesame::Repository object is first created, the default query language is SeRQL. It is not necessary to change the default query language because the language can be specified on a per-query basis by using the 'language' option of the select() method (documented below).
Parameters :
$language The query language to use for queries in which the
language is not otherwise specified.
Returns :
If setting, the old value is returned. If getting, the current
value is returned.
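Since setting the language returns the old value, the default can be switched temporarily and then restored. This is only a sketch, assuming $repo is an RDF::Sesame::Repository object; the helper name with_query_language is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RDF::Sesame): run $code with a different
# default query language, then restore the previous default, even if $code
# dies. $code is called in list context with the repository as its argument.
sub with_query_language {
    my ( $repo, $language, $code ) = @_;

    # When setting, query_language() returns the old value.
    my $old = $repo->query_language($language);

    my @result = eval { $code->($repo) };
    my $error  = $@;

    $repo->query_language($old);
    die $error if $error;
    return @result;
}
```

If $language is not an acceptable value, the first call acts as a plain getter, so the restore step simply sets the language that was already in effect.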
select ( %opts )
Execute a query against this repository and return an RDF::Sesame::TableResult object. This object can be used to access the table of results in a number of useful ways.
Only SELECT queries are supported through this method. A list of the options which are currently understood is provided below. If a single scalar is provided instead of %opts, the scalar is used as the value of the 'query' option.
Returns an RDF::Sesame::TableResult on success or the empty string on failure.
If an error occurs, call errstr() for an explanation.
query
The text of the query to execute. The format of this text is dependent on the query language you're using.
Default: ''
language
The query language used by the query. The option accepts the same values as the query_language method.
If this option is not provided, the default language that was set through query_language() is used. If query_language() has not been called, then "SeRQL" is assumed.
strip
Determines whether N-Triples encoding will be stripped from the query results. Normally, a literal is surrounded with double quotes and a URIref is surrounded with angle brackets. Literals may also have language or datatype information. By using the strip option, this behavior can be changed.
The value of the strip option is a scalar describing how you want the query results to be stripped. Acceptable values are listed below. The default for all calls to select may be changed by specifying the strip option to RDF::Sesame::Connection::open.
* literals : strip N-Triples encoding from Literals
* urirefs : strip N-Triples encoding from URIrefs
* all : strip N-Triples encoding from Literals and URIrefs
* none : the default; leave N-Triples encoding intact
For example, to strip all N-Triples encoding, call select() like this:
    $repo->select(
        query => $serql,
        strip => 'all',
    );
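Because select() signals failure by returning the empty string rather than throwing, it can be convenient to convert that into an exception. The following is a sketch under the assumption that $repo is an RDF::Sesame::Repository object; the helper name select_or_die is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RDF::Sesame): call select() and die with
# the repository's error string if the query fails.
sub select_or_die {
    my ( $repo, %opts ) = @_;

    my $result = $repo->select(%opts);

    # On success select() returns an object; on failure, the empty string.
    return $result if ref $result;

    die 'select failed: ' . $repo->errstr() . "\n";
}
```

It accepts the same named options as select(), for example `select_or_die( $repo, query => $serql, strip => 'all' )`.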
upload_data ( %opts )
Upload triples to the repository. %opts is a hash of named options to use when uploading the data. Acceptable option names are documented below. If a single scalar is provided instead of %opts, the scalar will be used as the value of the 'data' option.
This method is mostly useful for uploading triples which your program has generated itself. If you want to upload the data from a URI or even a local file (using the "file:" URI scheme) then use the upload_uri method. It will take care of fetching the data and uploading it all in one step.
Returns the number of triples processed or 0 on error. If an error occurs during the upload, call errstr() to find out why.
data
The triples that should be uploaded. The 'format' option specifies the format of the triples.
Default: ''
format
The format of the 'data' option. Acceptable values are 'rdfxml', 'ntriples' and 'turtle'. If a value other than these is specified, 0 is returned and calling errstr will return an explanatory message.
Default: 'ntriples'
base
The base URI to use for resolving relative URIs. The default is not useful so be sure to specify this parameter if the data has relative URIs.
verify
Indicates whether data uploaded to Sesame should be verified before it is added to the repository.
Default: true
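As an illustration of uploading program-generated triples, the sketch below joins already N-Triples-encoded terms into a document and uploads it. Since 0 is returned on error, it consults errstr() to distinguish an error from an upload that processed no triples. It assumes $repo is an RDF::Sesame::Repository object; the helper name upload_ntriples is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RDF::Sesame): upload a list of triples,
# each given as an arrayref [ $subject, $predicate, $object ] whose elements
# are already encoded in N-Triples syntax. Dies if the upload reports an error.
sub upload_ntriples {
    my ( $repo, @triples ) = @_;

    # One statement per line: subject, predicate, object, then " ."
    my $data = '';
    for my $triple (@triples) {
        my ( $s, $p, $o ) = @$triple;
        $data .= "$s $p $o .\n";
    }

    my $count = $repo->upload_data(
        data   => $data,
        format => 'ntriples',
    );

    # 0 is returned on error; errstr() is empty after a successful call.
    if ( !$count && $repo->errstr() ne '' ) {
        die 'upload failed: ' . $repo->errstr() . "\n";
    }
    return $count;
}
```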
upload_uri ( %opts )
Uploads the triples from the resource located at a given URI. This method supports the "file:" URI scheme. If a file URI is specified, LWP::Simple is used to retrieve the contents of the URI. Those contents are then passed as the 'data' option to upload_data(). For any URI scheme besides "file:", the Sesame server will retrieve the data on its own.
The %opts parameter provides a list of named options to use when uploading the data. If a single scalar is provided instead of %opts, the scalar is used as the value of the 'uri' option. A list of acceptable options is provided below.
Returns the number of triples processed or 0 on error. If an error occurs during the upload, call errstr() to find out why.
uri
The URI of the resource to upload. The scheme of the URI may be 'file:' or anything supported by Sesame.
Default: ''
format
The format of the data located at the given URI. This can be one of 'rdfxml', 'ntriples' or 'turtle'.
Default: 'rdfxml'
base
The base URI of the data for resolving any relative URIs. The default base URI is the URI of the resource to upload.
verify
Indicates whether data uploaded to Sesame should be verified before it is added to the repository.
Default: true
clear
Removes all triples from the repository. When this method is finished, all the data in the repository will be gone, so be careful.
Return :
1 for success and the empty string for failure.
remove ($subject, $predicate, $object)
Removes from the repository triples which match the specified pattern. undef is a wildcard which matches any value at that position. For example:
    $repo->remove(undef, "<http://xmlns.com/foaf/0.1/gender>", '"male"')
will remove from the repository all the foaf:gender triples which have a value of "male". Notice also that the values should be encoded in N-Triples syntax:
* URI : <http://foo.com/bar>
* bNode : _:nodeID
* literal: "Hello", "Hello"@en and "Hello"^^<http://bar.com/foo>
Parameters :
$subject The NTriples-encoded subject of the triples to
remove. If this is undef, it will match all
subjects.
$predicate The NTriples-encoded predicate of the triples
to remove. If this is undef, it will match
all predicates.
$object The NTriples-encoded object of the triples to remove.
If this is undef, it will match all objects.
Return :
The number of statements removed (including 0 on error).
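Building the N-Triples form of a literal by hand is error-prone when the text contains quotes or newlines. The sketch below shows one way to do the encoding. It handles only the basic backslash escapes and leaves non-ASCII characters untouched, which a strict N-Triples parser may not accept. The helper name ntriples_literal and its 'lang' and 'datatype' options are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RDF::Sesame): encode a plain string as an
# N-Triples literal, optionally with a language tag or a datatype URI.
# If both are given, the datatype wins, since a literal cannot carry both.
sub ntriples_literal {
    my ( $text, %opts ) = @_;

    my %escape = (
        "\\" => '\\\\',
        '"'  => '\\"',
        "\n" => '\\n',
        "\r" => '\\r',
        "\t" => '\\t',
    );
    ( my $escaped = $text ) =~ s/([\\"\n\r\t])/$escape{$1}/g;

    my $literal = qq("$escaped");
    if ( defined $opts{datatype} ) {
        $literal .= "^^<$opts{datatype}>";
    }
    elsif ( defined $opts{lang} ) {
        $literal .= "\@$opts{lang}";
    }
    return $literal;
}
```

With this helper, the example above could be written as `$repo->remove( undef, '<http://xmlns.com/foaf/0.1/gender>', ntriples_literal('male') )`.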
errstr( )
Returns a string explaining the most recent error from this repository. Returns the empty string if no error has occurred yet or if the most recent method call succeeded.
INTERNAL METHODS
These methods are used internally by RDF::Sesame::Repository. They will probably not be helpful to general users of the class, but they are documented here just in case.
command ( $name [, $parameters ] )
Execute a command against a Sesame repository. This method is generally used internally, but is provided and documented in case others want to use it for their own reasons.
It is a thin wrapper around the RDF::Sesame::Connection::command method which adds the name of this repository to the list of parameters before executing the command.
Parameters :
$name The name of the command to execute. This name should be
the name used by Sesame. Example commands are "login"
or "listRepositories"
$parameters An optional hashref giving the names and values
of parameters for the command.
Return :
RDF::Sesame::Response