NAME

Couch::DB - CouchDB database client

INHERITANCE

 Couch::DB is extended by
   Couch::DB::Mojolicious

SYNOPSIS

   use Couch::DB::Mojolicious ();
   my $couch   = Couch::DB::Mojolicious->new(api => '3.3.3');
   my $db      = $couch->db('my-db'); # Couch::DB::Database object
   my $cluster = $couch->cluster;     # Couch::DB::Cluster object
   my $client  = $couch->createClient(...);  # Couch::DB::Client

DESCRIPTION

When this module was written, there were already a large number of CouchDB implementations on CPAN. Still, there was a need for one more. This implementation provides a thick interface: a far higher level of abstraction than the other modules. This should make your work much, much easier.

Also, open https://perl.overmeer.net/couch-db/reference.html in a browser window, as a useful cross-reference: parameters for CouchDB are not documented in this Perl documentation!

Please read the "DETAILS" section, further down, at least once before you start!

Early adopters

Be warned that this module is really new. The 127 different endpoints that the CouchDB 3.3.3 API defines are grouped and combined. The result is often not tested, and certainly not battle ready. Please report the result of calls which are currently flagged "UNTESTED".

Please help me fix issues by reporting them. Bugs will be solved within a day. Please, contribute ideas to make the use of the module lighter. Together, we can make the quality grow fast.

Integration with your framework

You need to instantiate an extension of this class. At the moment, you can pick from:

  • Couch::DB::Mojolicious

    Implements the client using the Mojolicious framework, using Mojo::URL, Mojo::UserAgent, Mojo::IOLoop, and many others.

Other extensions will hopefully be added in the future, preferably as part of this release so they get maintained together. The extensions are not too difficult to create and certainly quite small.

Where can I find what?

The CouchDB API lists all endpoints as URLs. This library, however, creates an Object Oriented interface around these calls: you do not see the internals in the resulting code. Knowing the CouchDB API, it is usually immediately clear where to find a certain endpoint: /{db} will be in Couch::DB::Database. A major exception is anything that has to do with replication and sharding: this is bundled in Couch::DB::Cluster.

Have a look at https://perl.overmeer.net/couch-db/reference.html. Keep that page open in your browser while developing.

METHODS

Constructors

Couch::DB->new(%options)

Create a relation with a CouchDB server (cluster). You should use totally separate Couch::DB-objects for totally separate database clusters. Note: you can only instantiate extensions of this class.

When you do not specify a server-url, but have an environment variable PERL_COUCH_DB_SERVER, then server url, username, and password are derived from it.

 -Option  --Default
  api       <required>
  auth      'BASIC'
  password  undef
  server    "http://127.0.0.1:5984"
  to_json   +{ }
  to_perl   +{ }
  to_query  +{ }
  username  undef

api => $version

You MUST specify the version of the server you expect to answer your queries. Couch::DB tries to hide differences between your expectations and the reality of the server version.

The $version can be a string or a version object (see "man version").
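
A minimal sketch of why a version object is safer than a plain string when comparing API levels. It uses only the core version module; the helper as_version() is hypothetical and not part of Couch::DB.

```perl
use strict;
use warnings;
use version;
use Scalar::Util qw(blessed);

# Hypothetical helper: accept either a string or a version object.
sub as_version {
    my $v = shift;
    return $v if blessed($v) && $v->isa('version');
    return version->declare($v);   # always read as dotted-decimal
}

my $api = as_version('3.3.3');

# Version objects compare per component; plain strings compare per character.
my $by_object = as_version('3.10.0') > as_version('3.9.0');   # true
my $by_string = '3.10.0' gt '3.9.0';                          # false
```

The string comparison goes wrong as soon as a component reaches two digits, which is one reason to let the module work with version objects internally.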

auth => 'BASIC'|'COOKIE'

Authentication method to be used by default for each client.

password => STRING

server => URL

The default server to connect to, by URL. See etc/local.ini[chttpd]. This server will be named _local.

You can add more servers using addClient(). In that case, you probably do not want this default client to be created as well. To achieve this, explicitly set server => undef.

to_json => HASH

A table mapping converter name to CODE, to override/add the default PERL to JSON object conversions for sending structures. See toJSON().

to_perl => HASH

A table mapping converter name to CODE, to override/add the default JSON to PERL object conversions for Couch::DB::Result::values(). See toPerl() and listToPerl().

to_query => HASH

A table mapping converter name to CODE, to override/add the default PERL to URL QUERY conversions. Defaults to the json converters. See toQuery().

username => STRING

When a username is given, it will be used together with auth and password to login to any created client.

Accessors

$obj->api()

Returns the interface version you expect the server to run, as a version object. Differences between reality and expectations are mostly automatically resolved.

Interface starting points

$obj->cluster()

Returns a Couch::DB::Cluster-object, which organizes calls to manipulate replication, sharding, and related jobs. This will always return the same object.

$obj->createClient(%options)

Create a client object which handles a server. All options are passed to Couch::DB::Client. The couch parameter is added for you. The client will also be added via addClient(), and is returned.

It may be useful to create two clients to the same server: one with admin rights, and one without. Or clients to different nodes, to create fail-over.

$obj->db($name, %options)

Declare a database. The database may not exist yet: calling this method does nothing with the CouchDB server.

  my $db = $couch->db('authors');
  $db->ping or $db->create(...);

$obj->node($name)

Returns a Couch::DB::Node-object with the $name. If the object does not exist yet, it gets created, otherwise reused.

Unrelated calls

$obj->freshUUIDs($count, %options)

Returns a $count number of UUIDs in a LIST. This uses requestUUIDs() to get a bunch at the same time, for efficiency. You may get fewer than you want, but only when the server is not sending them.

 -Option--Default
  bulk    50

bulk => INTEGER

When there are not enough UUIDs in stock, this sets how large the chunks are in which more are requested.
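
The stock-and-refill idea behind bulk can be sketched without a server. This is a toy simulation, not the module's implementation: fake_request_uuids() stands in for the real server call, and fresh_uuids() and @stock are hypothetical names.

```perl
use strict;
use warnings;

# Toy stand-in for the server call: returns $n fake identifiers.
my $requests = 0;
my $serial   = 0;
sub fake_request_uuids {
    my $n = shift;
    $requests++;
    return map { sprintf 'uuid-%04d', ++$serial } 1 .. $n;
}

my @stock;   # identifiers fetched but not yet handed out

sub fresh_uuids {
    my ($count, %options) = @_;
    my $bulk = $options{bulk} || 50;

    while (@stock < $count) {
        my @more = fake_request_uuids($bulk);
        last unless @more;      # nothing received: hand out fewer
        push @stock, @more;
    }
    return splice @stock, 0, $count;
}

my @ids = fresh_uuids(120);     # three bulk requests of 50, 30 stay in stock
```

A later small request is then served from the remaining stock without contacting the server again.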

$obj->requestUUIDs($count, %options)
 [CouchDB API "GET /_uuids", since 2.0]

Returns UUIDs (Universally unique identifiers), when the call was successful. Better use freshUUIDs(). It is faster to use Perl modules to generate UUIDs.

$obj->searchAnalyze(%options)
 [CouchDB API "POST /_search_analyze", since 3.0, UNTESTED]

Check what the built-in Lucene tokenizer(s) will do with your text.

 -Option  --Default
  analyzer  <required>
  text      <required>

analyzer => KIND

text => STRING

Processing

The methods in this section implement the CouchDB API. You should usually not need to use these yourself, as this library abstracts them.

$obj->addClient($client)

Add a Couch::DB::Client-object to be used to contact the CouchDB cluster. Returned is the couch object, so these calls are stackable.

$obj->call($method, $path, %options)

Call some CouchDB server, to get work done. This is the base for any interaction with the server.

Note: you should probably not use this method yourself: all endpoints of CouchDB are available via a nice, abstract wrapper.

 -Option   --Default
  client     undef
  clients    undef
  delay      false
  on_chain   undef
  on_values  undef
  paging     {}
  query      undef
  send       undef

client => Couch::DB::Client|$name

Select a specific client connection to be used, as object or by name.

clients => ARRAY|$role

Explicitly use only the specified clients (Couch::DB::Client-objects) for the query. When none are given, then all are used (in order of precedence). When a $role (string) is provided, it is used to select a subset of the defined clients.

delay => BOOLEAN
 [PARTIAL]

Do not execute the server call yet, but prepare it only in a way that it can be combined with other clients in parallel. See Couch::DB::Result chapter "DETAILS" about delayed requests.

on_chain => CODE

When the call ends successfully, then run the chain code. Events on_error and on_final are only called on the last results of the chain.

on_values => CODE

A function (sub) which transforms the data of the CouchDB answer into useful Perl values and objects. See Couch::DB::toPerl(). The function is called with the result and a partially or unprocessed response (answer). That data shall not be modified. Return a new data-structure which contains the processed information, which may reuse parts which are not modified.

paging => HASH

When the endpoint supports paging, then its needed configuration data has been collected in here. This enables the use of _succeed, _page, skip, and friends. See examples in section "Pagination".

query => HASH

Query parameters for the request.

send => HASH

The content to be sent with POST and PUT methods. In those cases, specify it even when there is nothing to pass on, simply to be explicit about that.

$obj->check($condition, $change, $version, $what)

If the $condition is true (usually the existence of some parameter), then check whether API limitations apply.

Parameter $change is either removed, introduced, or deprecated (as strings). The version is taken from the CouchDB API documentation. The $what describes the element, to be used in error or warning messages.
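
To make the parameters concrete, here is a rough sketch of how such a check could be decided. It is an assumption about the logic, not the module's code: api_conflict() is a hypothetical stand-alone function that takes the expected API version as an extra first argument and returns a message instead of warning.

```perl
use strict;
use warnings;
use version;

# Hypothetical helper: returns a message when the expected API version
# conflicts with the described change, otherwise nothing.
sub api_conflict {
    my ($api, $condition, $change, $version, $what) = @_;
    return unless $condition;

    my $expected = version->declare($api);
    my $changed  = version->declare($version);

    if ($change eq 'introduced') {
        return "$what was introduced in $version, but you expect $api"
            if $expected < $changed;
    }
    elsif ($change eq 'removed' || $change eq 'deprecated') {
        return "$what was $change in $version, but you expect $api"
            if $expected >= $changed;
    }
    else {
        die "unknown change type '$change'";
    }
    return;
}

# Assumed scenario: a parameter introduced in 3.0.0, used against API 2.0.0.
my $problem = api_conflict('2.0.0', 1, introduced => '3.0.0', 'search analyzer');
```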

$obj->client($name)

Returns the client with the specific $name (which defaults to the server's url).

$obj->clients(%options)

Returns a LIST with the defined clients; Couch::DB::Client-objects.

 -Option--Default
  role    undef

role => $role

When defined, only return clients which are able to fulfill the specific $role.

$obj->jsonText($json, %options)

Convert the (complex) $json structure into serialized JSON. By default, it is beautified.

 -Option --Default
  compact  false

compact => BOOLEAN

Produce compact (no white-space) JSON.
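
The difference between beautified and compact output can be seen with the core JSON::PP module. This only illustrates the two output styles; which JSON encoder Couch::DB uses internally is not stated here, and the data is made up.

```perl
use strict;
use warnings;
use JSON::PP ();

my $data = { name => 'authors', shards => [1, 2] };

# canonical() sorts the keys, which makes the output reproducible.
my $pretty  = JSON::PP->new->canonical->pretty->encode($data);
my $compact = JSON::PP->new->canonical->encode($data);

print $pretty;            # multi-line, indented
print $compact, "\n";     # {"name":"authors","shards":[1,2]}
```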

$obj->listToPerl($set, $type, @data|\@data)

Returns a LIST from all elements in the LIST @data or the ARRAY, each converted from JSON to pure Perl according to rule $type.

$obj->toJSON(\%data, $type, @keys)

Convert the named fields in the %data into a JSON compatible format. Fields which do not exist are left alone.

$obj->toPerl(\%data, $type, @keys)

Convert all fields with @keys in the %data HASH into objects of $type. Fields which do not exist are ignored.

The following default JSON to Perl translations are currently defined: abs_uri, epoch, isotime, mailtime, version, and node.
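
A table "mapping converter name to CODE" can be pictured as a plain dispatch hash. The sketch below is stand-alone and makes its own assumptions: the converter signature (one raw value in, one converted value out), the converter name epoch_to_iso, and the function convert_fields() are all hypothetical and do not describe the module's internals.

```perl
use strict;
use warnings;
use version;
use POSIX qw(strftime);

# Hypothetical converter table: name => CODE taking the raw JSON value.
my %to_perl = (
    version      => sub { version->parse($_[0]) },
    epoch_to_iso => sub { strftime('%Y-%m-%dT%H:%M:%SZ', gmtime($_[0])) },
);

sub convert_fields {
    my ($data, $type, @keys) = @_;
    my $conv = $to_perl{$type} or die "no converter for '$type'";
    for my $key (@keys) {
        next unless exists $data->{$key};   # missing fields are ignored
        $data->{$key} = $conv->($data->{$key});
    }
    return $data;
}

my %answer = (version => '3.3.3', started => 0);
convert_fields(\%answer, 'version',      qw(version vendor_version));
convert_fields(\%answer, 'epoch_to_iso', 'started');
```

After these calls, $answer{version} is a version object and $answer{started} is an ISO timestamp string, while the absent vendor_version key is not created.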

$obj->toQuery(\%data, $type, @keys)

Convert the named fields in the %data HASH into a Query compatible format. Fields which do not exist are left alone.

DETAILS

Thick interface

The CouchDB client interface is based on HTTP. It is really easy to construct a JSON, and then use a UserAgent to send it to the CouchDB server. All other CPAN modules which support CouchDB stick to this level of support; except Couch::DB.

When your library is very low-level, your program needs to put effort into creating an abstraction around the interface itself. When the library offers that abstraction already, you need to write much less code!

The Perl programming language works with functions, methods, and objects, so why would your library require you to play with URLs? So, Couch::DB has the following extra features:

  • Calls have a functional name, and are grouped into classes: the endpoint URL processing is totally abstracted away;

  • Define multiple clients at the same time, for automatic fail-over, read, write, and permission separation, or parallelism;

  • Resolving differences between CouchDB-server versions. You may even run different CouchDB versions on your nodes;

  • JSON-types do not match Perl's type concept: this module will convert boolean and integer parameters (and more) from Perl to JSON and back transparently;

  • Offer error handling and event processing on each call;

  • Event framework independent (currently only a Mojolicious connector).

Using the CouchDB API

All methods which are marked with [CouchDB API] are, as the name says: client calls to some CouchDB server. Often, this connects to a node on your local server, but you can also connect to other servers and even multiple servers.

All these API methods return a Couch::DB::Result object, which can tell you how the call worked, and the results. The resulting object is overloaded boolean to produce false in case of an error. So typically:

  my $couch  = Couch::DB::Mojolicious->new(api => '3.3.3');
  my $result = $couch->requestUUIDs(100);
  $result or die;

  my $uuids  = $result->values->{uuids};

This CouchDB library hides the fact that endpoint /_uuids has been called. It also hides the client (UserAgent) which was used to collect the data.
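
The "overloaded boolean" behaviour is ordinary Perl operator overloading. The toy class below shows the mechanism only; Toy::Result and its fields are hypothetical and much simpler than Couch::DB::Result.

```perl
use strict;
use warnings;

package Toy::Result {
    # In boolean context, the object is true only when the call succeeded.
    use overload
        'bool'   => sub { $_[0]{ok} ? 1 : 0 },
        fallback => 1;

    sub new     { my ($class, %args) = @_; return bless +{ %args }, $class }
    sub message { $_[0]{message} }

    # Refuses to hand out values of a failed call.
    sub values {
        my $self = shift;
        $self or die $self->message . "\n";
        return $self->{values};
    }
}

my $good = Toy::Result->new(ok => 1, values => { uuids => ['abc'] });
my $bad  = Toy::Result->new(ok => 0, message => 'server unreachable');

my $uuids = $good->values->{uuids};
my $error = eval { $bad->values; 1 } ? undef : $@;
```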

You could also write

  my $uuids  = $couch->requestUUIDs(100)->values->{uuids};

because "values()" will terminate when the database call did not result in a successful answer. Last alternative:

   my @uuids = $couch->freshUUIDs(100);

Besides calls, there are all kinds of facility methods, which add further abstraction from the server connection.

Type conversions

With the Couch::DB::Result::values() method, conversions between JSON syntax and pure Perl are done. This also hides database interface changes for you, based on your new(api) setting. Avoid Couch::DB::Result::answer(), which gives the uninterpreted, unabstracted results.

This library also converts parameters from Perl space into JSON space. POST and PUT parameters travel in JSON documents. In JSON, the booleans are true and false (without quotes). In Perl, these are 1 and undef (and many alternatives). For anything besides your own documents, Couch::DB will totally hide these differences for you!
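
The boolean mismatch is easy to reproduce with the core JSON::PP module. This sketch shows the problem that the library hides, not how the library solves it; the parameter name include_docs is only used as an illustration.

```perl
use strict;
use warnings;
use JSON::PP ();

my $json = JSON::PP->new->canonical;

# A plain Perl 1 is sent as a number, not as a JSON boolean.
my $naive = $json->encode({ include_docs => 1 });

# An explicit boolean object produces the JSON literal.
my $typed = $json->encode({ include_docs => JSON::PP::true });

# Decoded booleans are objects which behave as Perl truth values.
my $back = $json->decode('{"ok":false}');
my $ok   = $back->{ok} ? 'yes' : 'no';
```

Here $naive is {"include_docs":1} while $typed is {"include_docs":true}; a server which insists on a real boolean may treat the two differently.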

Generic parameters

Each method which is labeled [CouchDB API] also accepts a few options which control the calling process. These are available everywhere, hence nowhere documented explicitly. Those options start with an underscore (_) or with on_ (events).

At the moment, the following %options are supported everywhere:

  • _delay => BOOLEAN, default false

    Do not perform and wait for the actual call, but prepare it to be used in parallel querying. TO BE IMPLEMENTED/DOCUMENTED.

  • _client => $client-object or -name

    Use only the specified client (=server) to perform the call.

  • _clients => ARRAY-of-clients or a role

    Use any of the specified clients to perform the call. When not an ARRAY, the parameter is a role: select all clients which can perform that role (the logged-in user of that client is allowed to perform that task).

  • _headers => HASH

    Add headers to the request. When applicable (for instance, the Accept-header) this will overrule the internally calculated defaults.

Besides, at the moment we support the following events:

  • on_error => CODE or ARRAY-of-CODE

    A CODE (sub) which is called when the interaction with the server has been completed without success. The CODE gets the result object as only parameter.

  • on_final => CODE or ARRAY-of-CODE

    A CODE (sub) which is called when the interaction with the server has been completed. This may happen much later, when combined with _delay. The CODE gets the result object as only parameter, and returns a result object which might be different... as calls can be chained.

  • on_chain => CODE

    Run the CODE after the call has been processed. It works as if the chained logic is run after the call, with the difference that this next step is defined before the call has been made. This sometimes produces a nicer interface (like paging).

  • on_values => CODE

    Run the CODE on the returned JSON data, to translate the raw answer() into values(). Wherever it seemed useful, this is already hidden for you. However: there may be cases where you want to add changes.

Pagination

Searches tend to give a large number of results. CouchDB calls will refuse to return too many answers at a time (typically 25). When you need more results, you will need more calls.

To get more answers, there are two mechanisms: some calls provide a skip and limit only. Other calls implement the more sophisticated bookmark mechanism. Both mechanisms are abstracted away by the _succeed mechanism.

Be aware that you shall provide the same query parameters to each call of the search method. Succession may be broken when you change some parameters: it is not fully documented which ones are needed to continue, so simply pass all again. Probably, it is safe to change the limit between pages.

To manage paged results, selected calls support the following options:

  • _all => BOOLEAN (default false)

    Return all results at once. This may involve multiple calls, like when the number of results is larger than what the server wants to produce in one go.

    Do not use this when you expect many or large results, except maybe in combination with _map.

  • _page => INTEGER (default 1)

    Start-point of returned results, for calls which support paging. Pages are numbered starting from 1. When available, bookmarks will be used for next pages. Succeeding searches will automatically move through pages (see examples).

  • _page_size => INTEGER (default 25)

    The CouchDB server will often not give you more than 25 or 50 answers at a time, but you do not want to know.

  • _succeed => $result or $result->paging

    Make this query the successor of a previous query. Some requests support paging (via bookmarks). See examples in a section below.

  • _harvester => CODE

    How or what to extract per request. You may add other information, like collecting response objects. The CODE returns the extracted LIST of objects/elements. Collection for a page stops once that combined list reaches _page_size.

  • _bookmark => STRING

    If you happen to know the bookmark for the search. Usually, this is automatically picked-up via _succeed.

  • _map => CODE

    Call the CODE on each of the (defined) harvested page elements. The CODE is called with the result object, and one of the harvested elements. When a single page requires multiple requests to the CouchDB server, this map will happen at the moment each response has been received, which may help to create a better interactive experience.

    Your CODE may return the harvested object, but also something small (even undef) which will free-up the memory use of the object immediately. However: at least return a single scalar (it will be returned in the "page"), because an empty list signals "end of results".

  • skip => INTEGER

    Do not return this number of the first following elements. Be warned: use as %option, not as search parameter.

  • limit => INTEGER

    Do not request more than limit number of results per request. May be less than _page_size. Be warned: use as %option, not as search parameter.
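
The relation between limit (results per server request) and the page size (results until the page is complete) can be pictured with a toy simulation. No CouchDB is involved: fake_find() plays the server, and fetch_page() with its page, page_size, and limit arguments is a hypothetical stand-in which only mimics the skip/limit mechanism, not bookmarks.

```perl
use strict;
use warnings;

my @rows     = map { "doc-$_" } 1 .. 120;   # pretend server-side results
my $requests = 0;

# Stand-in for one server request with skip and limit.
sub fake_find {
    my ($skip, $limit) = @_;
    $requests++;
    return () if $skip > $#rows;
    my $last = $skip + $limit - 1;
    $last = $#rows if $last > $#rows;
    return @rows[$skip .. $last];
}

# Collect one page: repeat requests until the page is full or the
# server has nothing left.
sub fetch_page {
    my (%opt) = @_;
    my $size  = $opt{page_size} || 25;
    my $limit = $opt{limit}     || $size;
    my $skip  = (($opt{page} || 1) - 1) * $size;

    my @page;
    while (@page < $size) {
        my $want = $size - @page;
        $want = $limit if $want > $limit;
        my @got = fake_find($skip + @page, $want);
        last unless @got;       # empty answer: end of results
        push @page, @got;
    }
    return \@page;
}

# Page 2 of 50 elements, collected in requests of at most 10 elements.
my $page2 = fetch_page(page => 2, page_size => 50, limit => 10);
```

In this simulation, the second page needs five requests; the last page of the data set comes back shorter because the simulated server runs out of rows.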

. Example: paging through result

Get page by page, where you may use the limit parameter to request a number of elements. Do not use skip, except in the first call. The _succeed handling will play tricks with _page, _harvester, and _client, which you do not wish to know.

  my $page1 = $couch->find(\%search, limit => 12, skip => 300);
  my $docs1 = $page1->page;
  my $page2 = $couch->find(\%search, _succeed => $page1);
  my $docs2 = $page2->page;

. Example: paging via a session

When you cannot ask for pages within a single continuous process, because the page is shown to a user who has to take action to see another page, then save the pagingState.

The state cannot contain code references, so when you have a specific harvester or map, then you need to resupply those.

  my $page1 = $couch->find(\%search);
  my $docs1 = $page1->page;
  # serialize() and deserialize() are placeholders for your own choice
  $session->save(current => serialize($page1->pagingState));
  ...
  my $prev  = deserialize($session->load('current'));
  my $page2 = $couch->find(\%search, _succeed => $prev);
  my $docs2 = $page2->page;
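
The serialization step in the example above is left open. Because the state contains no code references, JSON is one possible choice. The sketch below uses the core JSON::PP module; the content of $state is made up for illustration, as the real layout of pagingState is not described here.

```perl
use strict;
use warnings;
use JSON::PP ();

my $json = JSON::PP->new->canonical;

# Hypothetical state: plain data only, no code references.
my $state = { bookmark => 'g1AAAA', page_size => 25, start => 50 };

sub serialize   { return $json->encode($_[0]) }
sub deserialize { return $json->decode($_[0]) }

my $stored   = serialize($state);      # a plain string, fit for a session store
my $restored = deserialize($stored);
```

With its default settings, JSON::PP refuses to encode a code reference, which matches the remark that a specific harvester or map must be supplied again with the next call.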

. Example: get all results in a loop

Handle the responses which are coming in one by one. This is useful when the documents (with attachments?) are large. Each $list is a new result object.

  my $list;
  while($list = $couch->find(\%search, _succeed => $list))
  {   my $docs = $list->page;
      @$docs or last;    # nothing left
      ...;    # use the docs
  }
  $list or die "Stopped somewhere with ". $list->message;

. Example: get one page of results

You can jump back and forward in the pages: bookmarks will remember the pages already seen.

  my $page4 = $couch->find(\%search,
    limit      => 10,  # results per server request
    _page_size => 50,  # results until complete
    _page      =>  4,  # start point, may use bookmark
    _harvester => sub { $_[0]->values->{docs} }, # default
  );
  my $docs4 = $page4->page;
  my $page5 = $couch->find(\%search, _succeed => $page4);
  my $docs5 = $page5->page;

. Example: get all results in one call

Do not attempt this unless you know that there is a limited number of results, maybe just a bit more than a page.

  my $all   = $couch->find(\%search, _all => 1) or die;
  my $docs6 = $all->page;

. Example: processing results when they arrive

When a page (may) require multiple calls to the server, this may enhance the user experience.

  sub do_something($$) { my ($result, $doc) = @_; ...; 42 }
  my $all = $couch->find(\%search, _all => 1, _map => \&do_something);
  # $all->page will now show elements containing '42'.

SEE ALSO

This module is part of Couch-DB distribution version 0.005, built on June 23, 2024. Website: http://perl.overmeer.net/CPAN/

LICENSE

Copyrights 2024 by [Mark Overmeer]. For other contributors see ChangeLog.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See http://dev.perl.org/licenses/