NAME
Couchbase::Bucket - Couchbase Cluster data access
SYNOPSIS
# Imports
use Couchbase::Bucket;
use Couchbase::Document;
# Create a new connection
my $cb = Couchbase::Bucket->new("couchbases://anynode/bucket", { password => "secret" });
# Create and store a document
my $doc = Couchbase::Document->new("idstring", { json => ["encodable", "string"] });
$cb->insert($doc);
if (!$doc->is_ok) {
warn("Couldn't store document: " . $doc->errstr);
}
# Retrieve a document:
$doc = Couchbase::Document->new("user:mnunberg");
$cb->get($doc);
printf("Full name is %s\n", $doc->value->{name});
# Query a view:
my $res = $cb->view_slurp(['design_name', 'view_name'], limit => 10);
# $res is actually a subclass of Couchbase::Document
if (! $res->is_ok) {
warn("There was an error in querying the view: ".$res->errstr);
}
foreach my $row (@{$res->rows}) {
printf("Key: %s. Document ID: %s. Value: %s\n", $row->key, $row->id, $row->value);
}
# Get multiple items at once
my $batch = $cb->batch;
$batch->get(Couchbase::Document->new("user:$_")) for (qw(foo bar baz));
while (($doc = $batch->wait_one)) {
if ($doc->is_ok) {
printf("Real name for userid '%s': %s\n", $doc->id, $doc->value->{name});
} else {
warn(sprintf("Couldn't get document '%s': %s\n", $doc->id, $doc->errstr));
}
}
DESCRIPTION
Couchbase::Bucket is the main module for Couchbase and represents a data connection to the cluster.
The usage model revolves around a Couchbase::Document which is updated for each operation. Normally you will create a Couchbase::Document, populate it with the relevant fields for the operation, and then perform the operation itself. When the operation has completed, the relevant fields are updated to reflect the latest results.
CONNECTING
Connection String
To connect to the cluster, specify a URI-like connection string. The connection string is in the format of SCHEME://HOST1,HOST2,HOST3/BUCKET?OPTION=VALUE&OPTION=VALUE
scheme
-
This will normally be couchbase://. For SSL connections, use couchbases:// (note the extra s at the end). See "Using SSL" for more details.
host
-
This can be a single host or a list of hosts. Specifying multiple hosts is not required but may increase availability if the first node is down. Multiple hosts should be separated by a comma.
If your administrator has configured the cluster to use non-default ports then you may specify those ports using the form host:port, where port is the memcached port that the given node is listening on. In the case of SSL this should be the SSL-enabled memcached port.
bucket
-
This is the data bucket you wish to connect to. If left unspecified, it will revert to the default bucket.
options
-
There are several options which can modify connection and general settings for the newly created bucket object. Some of these may be modifiable via Couchbase::Settings (returned via the settings() method) as well. This list only mentions those settings which are specific to the initial connection:
config_total_timeout
-
Specify the maximum amount of time (in seconds) to wait until the client has been connected.
config_node_timeout
-
Specify the maximum amount of time (in seconds) to wait for a given node to respond to the initial connection request. This number may also not be higher than the value for config_total_timeout.
certpath
-
If using SSL, this option must be specified and should contain the local path to the copy of the cluster's SSL certificate. The path should also be URI-encoded.
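The format described above can be assembled programmatically. The following is a minimal sketch only: build_connstr and uri_escape_value are hypothetical helpers (not part of this module), the host names and port are placeholders, and percent-encoding every character outside the RFC 3986 unreserved set is an assumption that is stricter than the literal path shown under "Using SSL".

```perl
use strict;
use warnings;

# Percent-encode everything except RFC 3986 "unreserved" characters.
# Assumes the value is a byte string (no wide characters).
sub uri_escape_value {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-._~])/sprintf("%%%02X", ord($1))/ge;
    return $value;
}

# Hypothetical helper: assemble SCHEME://HOST1,HOST2/BUCKET?OPTION=VALUE
sub build_connstr {
    my (%args) = @_;
    my $scheme = $args{ssl} ? 'couchbases' : 'couchbase';
    my $hosts  = join(',', @{ $args{hosts} });
    my $bucket = defined $args{bucket} ? $args{bucket} : 'default';
    my $opts   = $args{options} || {};

    # Sort the option names so the result is deterministic.
    my $query = join('&',
        map { $_ . '=' . uri_escape_value($opts->{$_}) } sort keys %$opts);

    my $connstr = "${scheme}://${hosts}/${bucket}";
    $connstr .= "?$query" if length $query;
    return $connstr;
}

# Placeholder hosts and port, for illustration only.
print build_connstr(
    ssl     => 1,
    hosts   => ['node1.example.com', 'node2.example.com:11207'],
    bucket  => 'securebkt',
    options => { certpath => '/var/cbcert.pem', config_total_timeout => 5 },
), "\n";
```

The resulting string can be passed as the first argument to new().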
Using SSL
To connect to an SSL-enabled cluster, specify couchbases:// for the scheme. Additionally, ensure that the certpath option contains the correct path, for example:
my $cb = Couchbase::Bucket->new("couchbases://securehost/securebkt?certpath=/var/cbcert.pem");
Specifying Bucket Credentials
Often, the bucket will be password protected. You can specify the password using the password option in the $options hashref in the constructor.
new($connstr, $options)
Create a new connection to a bucket. $connstr is a "Connection String" and $options is a hashref of options. The only recognized option key is password, which is the bucket password, if applicable.
This method will attempt to connect to the cluster, and die if a connection could not be made.
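Because new() dies on failure, callers that want to recover can wrap it in eval. This is a sketch under assumptions: try_connect is a hypothetical helper, and the connection string is a placeholder. The module is loaded inside the eval so that a missing installation is reported the same way as a failed connection.

```perl
use strict;
use warnings;

# Hypothetical helper.
# Returns ($bucket, undef) on success or (undef, $error_message) on failure.
sub try_connect {
    my ($connstr, $password) = @_;
    my $options = defined $password ? { password => $password } : {};

    my $cb = eval {
        require Couchbase::Bucket;                   # may die if not installed
        Couchbase::Bucket->new($connstr, $options);  # dies if connection fails
    };
    return ($cb, undef) if $cb;

    my $err = $@ || 'unknown connection error';
    return (undef, $err);
}

my ($cb, $err) = try_connect('couchbase://localhost/default');
warn "Could not connect: $err" unless $cb;
```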
DATA ACCESS
Data access methods operate on a Couchbase::Document object. When the operation has completed, its status is stored in the document's errnum field (you can also use the is_ok method to check that no errors occurred).
get($doc)
get_and_touch($doc)
Retrieve a document from the cluster. $doc is a Couchbase::Document. If the operation is successful, the value of the item will be accessible via its value field.
my $doc = Couchbase::Document->new("id_to_retrieve");
$cb->get($doc);
if ($doc->is_ok) {
printf("Got value: %s\n", $doc->value);
}
The get_and_touch variant will also update (or clear) the expiration time of the item. See "Document Expiration" for more details:
my $doc = Couchbase::Document->new("id", { expiry => 300 });
$cb->get_and_touch($doc); # Expires in 5 minutes
fetch($id)
This is a convenience method which will create a new document with the given id and perform a get on it. It will then return the resulting document.
my $doc = $cb->fetch("id_to_retrieve");
insert($doc)
replace($doc, $options)
upsert($doc, $options)
my $doc = Couchbase::Document->new(
"mutation_method_names",
[ "insert", "replace", "upsert"],
{ expiry => 3600 }
);
# Store a new item into the cluster, failing if it exists:
$cb->insert($doc);
# Unconditionally overwrite the value:
$cb->upsert($doc);
# Only replace an existing value
$cb->replace($doc);
# Ignore any kind of race conditions:
$cb->replace($doc, { ignore_cas => 1 });
# Store the document, wait until it has been persisted
# on at least 2 nodes
$cb->replace($doc, { persist_to => 2 });
These three methods will set the value of the document on the server. insert will only succeed if the item does not exist, replace will only succeed if the item already exists, and upsert will unconditionally write the new value regardless of whether it exists.
Storage Format
By default, the document is serialized and stored as JSON. This allows proper integration with other optional functionality of the cluster (such as views and N1QL queries). You may also store items in other formats which may then be transparently serialized and deserialized as needed.
To specify the storage format for a document, specify the `format` setting in the Couchbase::Document object, like so:
use Couchbase::Document;
my $doc = Couchbase::Document->new('foo', \1234, { format => COUCHBASE_FMT_STORABLE });
This version of the client uses so-called "Common Flags", allowing seamless integration with Couchbase clients written in other languages.
Encoding Formats
Bear in mind that Perl's default encoding is Latin-1 and not UTF-8. Consequently, any input is assumed to be Latin-1 unless indicated otherwise. There are various ways to change the "type" of a string, the details of which can be found within the utf8 and Encode modules.
From the perspective of this module, any input string which is marked as being JSON or UTF8 will be marked as being UTF-8. This may carry a small performance cost. If this is a concern, you can intercept the JSON decoding function and handle the raw string there.
CAS Operations
To avoid race conditions when two applications attempt to write to the same document, Couchbase uses a CAS value, which represents the last known state of the document. This CAS value is modified each time a change is made to the document, and is returned to the client for each operation. If the $doc item is a document previously used for a successful get or other operation, it will contain the CAS, and the client will send it back to the server. If the current CAS of the document on the server does not match the value embedded in the document, the operation will fail with the code COUCHBASE_KEY_EEXISTS.
To always modify the value (ignoring whether the value may have been previously modified by another application), set the ignore_cas option to a true value in the $options hashref.
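A common way to use the CAS without ignoring it is to re-read and retry when a conflict is reported. The sketch below assumes only the methods shown elsewhere in this document (fetch, replace, value, is_ok, is_cas_mismatch); update_with_cas itself and its default retry limit are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: apply $mutator to the current value and write it back,
# retrying when another writer changed the document in between.
sub update_with_cas {
    my ($cb, $id, $mutator, $max_attempts) = @_;
    $max_attempts ||= 5;

    for my $attempt (1 .. $max_attempts) {
        my $doc = $cb->fetch($id);            # fresh value and fresh CAS
        return $doc unless $doc->is_ok;       # e.g. the document is missing

        $doc->value($mutator->($doc->value));
        $cb->replace($doc);                   # the CAS held by $doc is sent along

        return $doc if $doc->is_ok;           # written without a conflict
        return $doc unless $doc->is_cas_mismatch;   # some other error
        # CAS mismatch: loop around, re-fetch and try again.
    }
    return;   # still conflicting after $max_attempts tries
}
```

A call such as update_with_cas($cb, "user:foo", sub { my $v = shift; $v->{visits}++; $v }) would then increment a field without overwriting concurrent changes (the document ID and field name here are placeholders).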
Durability Requirements
Mutation operations in Couchbase are considered successful once they are stored in the master node's cache for a given key. Sometimes extra redundancy and reliability is required, where an application should only proceed once the data has been replicated to a certain number of nodes and possibly persisted to their disks. Use the persist_to and replicate_to options to specify the durability requirements:
persist_to
-
Wait until the item has been persisted (written to non-volatile storage) on this many nodes. A value of 1 means the master node, whereas a value of 2 or higher means the master node and n-1 replica nodes.
replicate_to
-
Wait until the item has been replicated to the RAM of this many replica nodes. Your bucket must have at least this many replicas configured and online for this option to function.
You may specify a negative value for either persist_to or replicate_to to indicate that a "best-effort" behavior is desired, meaning that replication and persistence should take effect on as many nodes as are currently online, which may be less than the number of replicas the bucket was configured with.
You may request replication without persistence by simply setting persist_to to 0.
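As an illustration of passing these options, the following sketch wraps upsert so that only the requirements the caller asked for are sent. store_durably is a hypothetical helper; the option names are the ones documented above.

```perl
use strict;
use warnings;

# Hypothetical helper: upsert with optional durability requirements.
# Returns 1 if the operation (including durability) succeeded, 0 otherwise.
sub store_durably {
    my ($cb, $doc, %req) = @_;

    my %options;
    # Only pass the options the caller actually specified.
    for my $key (qw(persist_to replicate_to)) {
        $options{$key} = $req{$key} if defined $req{$key};
    }

    $cb->upsert($doc, \%options);
    return $doc->is_ok ? 1 : 0;
}
```

For example, store_durably($cb, $doc, persist_to => 1, replicate_to => 1) waits for the master's disk and one replica's RAM, while store_durably($cb, $doc, replicate_to => -1) requests best-effort replication.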
Document Expiration
In many use cases it may be desirable to have the document automatically deleted after a certain period of time has elapsed (think about session management). You can specify when the document should be deleted, either as an offset from now in seconds (up to 30 days), or as a Unix timestamp.
The expiration is considered a property of the document and is thus configurable via the Couchbase::Document's expiry method.
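Since offsets are limited to 30 days, a longer lifetime has to be expressed as an absolute timestamp. A small sketch of that choice follows; expiry_for_ttl is a hypothetical helper, and the optional second argument exists only to make the result reproducible.

```perl
use strict;
use warnings;

use constant MAX_RELATIVE_EXPIRY => 30 * 24 * 60 * 60;   # 30 days in seconds

# Hypothetical helper: turn "seconds from now" into a value usable as expiry.
sub expiry_for_ttl {
    my ($ttl_seconds, $now) = @_;
    $now = time() unless defined $now;

    # Short lifetimes can be given directly as an offset.
    return $ttl_seconds if $ttl_seconds <= MAX_RELATIVE_EXPIRY;

    # Longer lifetimes must be given as an absolute Unix timestamp.
    return $now + $ttl_seconds;
}
```

The returned value would then be supplied as the document's expiry, for example via the expiry key shown in the constructor examples above.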
remove($doc, $options)
Remove an item from the cluster. The operation will fail if the item does not exist, or if the item's CAS has been modified.
my $doc = Couchbase::Document->new("KILL ME PLEASE");
$cb->remove($doc);
if ($doc->is_ok) {
print "Deleted document OK!\n";
} elsif ($doc->is_not_found) {
print "Document already deleted!\n";
} elsif ($doc->is_cas_mismatch) {
print "Someone modified our document before we tried to delete it!\n";
}
touch($doc, $options)
Update the item's expiration time. This is more efficient than get_and_touch as it does not return the item's value across the network.
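A short sketch of how touch might be used to extend a session document's lifetime. refresh_expiry is a hypothetical helper, and it assumes that the document's expiry method accepts the new value as an argument.

```perl
use strict;
use warnings;

# Hypothetical helper: set a new expiration on an existing document
# without transferring its value. Returns true on success.
sub refresh_expiry {
    my ($cb, $doc, $seconds) = @_;
    $doc->expiry($seconds);   # assumption: expiry() acts as a setter
    $cb->touch($doc);
    return $doc->is_ok;
}
```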
Client Settings
settings()
Returns a hashref of settings (see Couchbase::Settings). Because this is a hashref, its values may be localized.
Set a high timeout for a specified operation:
{
local $cb->settings->{operation_timeout} = 20; # 20 seconds
$cb->get($doc);
}
ADVANCED DATA ACCESS
counter($doc, { delta => n1, initial => n2 })
sub example_hit_counter {
my $page_name = shift;
my $doc = Couchbase::Document->new("page:$page_name");
$cb->counter($doc, { initial => 1, delta => 1 });
}
This method treats the stored value as a number (i.e. a string which can be parsed as a number, such as "42") and atomically modifies its value based on the parameters passed.
The options are:
delta
-
The amount by which the current value should be modified. If the value for this option is negative, the counter will be decremented.
initial
-
The initial value to assign to the item on the server if it does not yet exist. If this option is not specified and the item on the server does not exist then the operation will fail.
append_bytes($doc, { fragment => "string" })
prepend_bytes($doc, { fragment => "string"} )
These two methods concatenate the fragment value and the existing value on the server. They are equivalent to doing the following:
# Append:
$doc->value($doc->value . '_suffix');
$doc->format('utf8');
$cb->replace($doc);
# Prepend:
$doc->value('prefix_' . $doc->value);
$doc->format('utf8');
$cb->replace($doc);
The fragment option must be specified, and the value is not updated in the original document.
Also note that these methods do a raw string-based concatenation, and will thus only produce desired results if the existing value is a plain string. This is in contrast to COUCHBASE_FMT_JSON where a string is stored enclosed in quotation marks.
Thus a JSON string may be stored as "foo", and appending to it will yield "foo"bar, which is typically not what you want.
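The effect can be reproduced locally without a cluster. This sketch uses the core JSON::PP module to stand in for the JSON encoding step and plain string concatenation to stand in for the server-side append; it does not call this module at all.

```perl
use strict;
use warnings;
use JSON::PP;

my $json = JSON::PP->new->allow_nonref;

# What a JSON-format document holding the string foo looks like on the wire:
# the text is enclosed in quotation marks.
my $stored = $json->encode("foo");

# A raw concatenation, which is what an append of the fragment "bar" amounts to.
my $appended = $stored . "bar";

# The result is no longer valid JSON, so decoding it fails.
my $decoded = eval { $json->decode($appended) };

printf("stored=%s appended=%s valid_json=%s\n",
    $stored, $appended, defined $decoded ? "yes" : "no");
```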
BATCH OPERATIONS
Batch operations allow more efficient utilization of the network by reducing latency and increasing the number of commands sent at a single time to the server.
Batch operations are executed by creating a Couchbase::OpContext, associating commands with the context, and waiting for the commands to complete.
To create a new context, use the batch method.
batch()
Returns a new Couchbase::OpContext which may be used to schedule operations.
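The following sketch shows one way to schedule several gets on a context and separate the successes from the failures. get_all is a hypothetical helper; it relies only on the batch, get, wait_one and is_ok methods used in the SYNOPSIS.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch all given documents in one batch.
# Returns two arrayrefs: documents that were retrieved, and those that failed.
sub get_all {
    my ($cb, @docs) = @_;

    my $batch = $cb->batch;
    $batch->get($_) for @docs;          # schedule, nothing is awaited yet

    my (@ok, @failed);
    while (my $doc = $batch->wait_one) { # yields documents as they complete
        if ($doc->is_ok) {
            push @ok, $doc;
        } else {
            push @failed, $doc;
        }
    }
    return (\@ok, \@failed);
}
```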
Batched Durability Requirements
In some scenarios it may be more efficient on the network to submit durability requirement requests as a large single command. The behavior for the persist_to and replicate_to parameters in the upsert() family of methods will cause a durability request to be sent out to the given nodes as soon as the success is received for the newly-modified item. This approach reduces latency at the cost of additional bandwidth.
Some bandwidth may be potentially saved if these requests are all batched together:
durability_batch($options)
Volatile - Subject to change
Creates a new durability batch. A durability batch is a special kind of batch where the contained commands can only be documents whose durability is to be checked.
my $batch;
$batch = $cb->batch;
$batch->upsert($_) for @docs;
$batch->wait_all;
$batch = $cb->durability_batch({ persist_to => 1, replicate_to => 2 });
$batch->endure($_) for @docs;
$batch->wait_all;
The options passed can be persist_to and replicate_to. See the "Durability Requirements" section for information.
N1QL QUERIES (EXPERIMENTAL)
N1QL queries are available as an experimental feature of the client library.
The N1QL API exposes two functions, both of which function similarly to their view counterparts.
At the time of writing, the server does not include N1QL as an integrated feature (because it is still experimental). This means it must be downloaded as a standalone package (see http://docs.couchbase.com/developer/n1ql-dp4/n1ql-intro.html). Once downloaded and configured, the _host option should be passed to the query function (as detailed below).
N1QL functions return a Couchbase::N1QL::Handle object, which functions similarly to Couchbase::View::Handle (internally, they share a lot of code).
query_slurp("query", $queryargs, $queryopts)
Issue an N1QL query. This will send the query to the server (encoding any parameters as needed).
my $rv = $cb->query_slurp(
# Query string
'SELECT *, META().id FROM travel WHERE travel.country = $country ',
# Placeholder values
{ country => "Ecuador", },
# Query options
{ _host => "localhost:8093" }
);
foreach my $row (@{$rv->rows}) {
# do something with decoded JSON
}
The queryargs parameter can either be a hashref of named placeholders (omitting, of course, the leading $, which is handled internally), or an arrayref of positional placeholders (if your query uses positional placeholders).
The queryopts parameter is a set of other modifiers for the query. Most of these are sent to the server. One special parameter is _host, which points to a standalone instance of the N1QL Developer Preview installation; a temporary necessity for pre-release versions. The _host parameter will be removed once Couchbase Server is available (in release or pre-release) with an integrated N1QL process.
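For comparison with the named-placeholder form, here is a sketch of the positional form. find_by_country is a hypothetical helper; the $1 placeholder is standard N1QL syntax, and the use of is_ok and errstr on the returned handle is an assumption based on its similarity to the view handle.

```perl
use strict;
use warnings;

# Hypothetical helper: run a query with one positional placeholder.
# Returns the list of decoded rows, or dies if the query failed.
sub find_by_country {
    my ($cb, $country, $host) = @_;

    my $rv = $cb->query_slurp(
        # $1 refers to the first element of the arrayref below
        'SELECT *, META().id FROM travel WHERE travel.country = $1',
        [ $country ],
        { _host => $host },
    );

    # Assumption: the handle exposes is_ok and errstr like a view handle.
    die "Query failed: " . $rv->errstr unless $rv->is_ok;
    return @{ $rv->rows };
}
```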
query_iterator("query", $queryargs, $queryopts)
This function is to query_slurp as view_iterator is to view_slurp. In short, this allows an iterator over the rows, only fetching data from the network as needed. This is more efficient (but a bit less simple to use) than query_slurp.
my $rv = $cb->query_iterator("select * from default");
while ((my $row = $rv->next)) {
# do something with row.
}
VIEW (MAPREDUCE) QUERIES
View methods come in two flavors. One is an iterator which incrementally fetches data from the network, while the other loads the entire data and then returns. For small queries (i.e. those which do not return many results), which API you use is a matter of personal preference. For larger resultsets, however, it often becomes a necessity to not load the entire dataset into RAM.
Both view_slurp and view_iterator return Couchbase::View::Handle objects. This has changed from previous versions, which returned a Couchbase::View::HandleInfo object (though the APIs remain the same).
view_slurp("design/view", %options)
Queries and returns the results of a view. The first argument may be provided either as a string of "$design/$view" or as a two-element array reference containing the design and view respectively.
The %options are options passed verbatim to the view engine. Some options however are intercepted by the client, and modify how the view is queried.
spatial
-
Indicate that the queried view is a geospatial view. This is required since the formatting of the internal URI is slightly different.
include_docs
-
Indicate that the relevant documents should be fetched for each view. The following forms are equivalent.
# fetching directly:
my $iter = $bkt->view_iterator(['design', 'view']);
while ((my $row = $iter->next)) {
    my $doc = Couchbase::Document->new($row->id);
    $bkt->get($doc);
}
# using include_docs
my $iter = $bkt->view_iterator(['design', 'view'], include_docs => 1);
while ((my $row = $iter->next)) {
    my $doc = $row->doc;
}
Using include_docs is significantly more efficient than fetching the rows manually as it allows the library to issue gets in bulk for each raw chunk of view results received, and also allows the library to "lazily" fetch documents while other rows are being received.
The returned object contains various status information about the query. The rows themselves may be found inside the rows accessor:
my $rv = $cb->view_slurp("beer/brewery_beers", limit => 5);
foreach my $row (@{ $rv->rows }) {
printf("Got row for key %s with document id %s\n", $row->key, $row->id);
}
This method returns an instance of Couchbase::View::Handle which may be used to inspect for error messages. The object is in fact a subclass of Couchbase::Document with an additional errinfo method to provide more details about the operation.
if (!$rv->is_ok) {
if ($rv->errnum) {
# handle error code
}
if ($rv->http_code !~ /^2/) {
# Failed HTTP status
}
}
As of version 2.0.3, this method is implemented as a wrapper atop view_iterator.
view_iterator("design/view", %options)
This works in much the same way as the view_slurp() method does, except that it returns responses incrementally, which is handy if you expect the query to return a large number of results:
my $iter = $cb->view_iterator("beer/brewery_beers");
while (my $row = $iter->next) {
printf("Got row for key %s with document id %s\n", $row->key, $row->id);
}
Note that the contents of the Handle object are only considered valid once the iterator has been through at least one iteration; thus:
Incorrect, because it requests the info object before iteration has started:
my $iter = $cb->view_iterator($dpath);
if (!$iter->info->is_ok) {
# ...
}
Correct
my $iter = $cb->view_iterator($dpath);
while (my $row = $iter->next) {
# ...
}
if (!$iter->info->is_ok) {
# ...
}
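The correct ordering can be packaged so that callers cannot check the status too early. This is a sketch only: each_view_row is a hypothetical helper, and calling errstr on the info object is an assumption based on the handle being a Couchbase::Document subclass.

```perl
use strict;
use warnings;

# Hypothetical helper: stream rows to $callback one at a time, then report
# the final status. Returns undef on success or an error string on failure.
sub each_view_row {
    my ($cb, $view, $callback, %options) = @_;

    my $iter = $cb->view_iterator($view, %options);
    while (my $row = $iter->next) {
        $callback->($row);      # rows are handled as they arrive
    }

    # Only inspect the status after iteration has run.
    my $info = $iter->info;
    return $info->is_ok ? undef : $info->errstr;
}
```

Because each row is handed to the callback and then discarded, memory use does not grow with the size of the result set.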
INFORMATIONAL METHODS
These methods return various sorts of information about the cluster or specific items.
stats()
stats("spec")
Retrieves cluster statistics from each server. The return value is a Couchbase::Document with its value field containing a hashref of hashrefs, like so:
# Dump all the stats, per server:
my $results = $cb->stats()->value;
while (my ($server,$stats) = each %$results) {
while (my ($statkey, $statval) = each %$stats) {
printf("Server %s: %s=%s\n", $server, $statkey, $statval);
}
}
keystats($id)
Returns metadata about a specific document ID. The metadata is returned in the same manner as in the stats() method. This will solicit each server which is either a master or replica for the item to respond with information such as the CAS, expiration time, and persistence state of the item.
This method should be used for informative purposes only, as its output and availability may change in the future.
observe($id, $options)
Returns persistence and replication status about a specific document ID. Unlike the keystats method, the information is received from the network as binary and is thus more efficient.
You may also pass a master_only option in the options hashref, in which case only the master node for the item will be contacted.