Name

Yabsm - yet another btrfs snapshot manager

What is Yabsm?

Yabsm is a btrfs snapshot and backup management system that provides the following features:

  • Takes read only snapshots and performs both remote and local incremental backups.

  • Separates snapshots and backups into 5minute, hourly, daily, weekly, and monthly timeframe categories.

  • Provides a simple query language for locating snapshots and backups.

Usage

Yabsm provides 3 commands: config, find, and daemon.

usage: yabsm [--help] [--version] [<COMMAND> <ARGS>]

commands:

  <config|c> [--help] [check ?file] [ssh-check <SSH_BACKUP>] [ssh-key]
             [yabsm-user-home] [yabsm_dir] [subvols] [snaps] [ssh_backups]
             [local_backups] [backups]

  <find|f>   [--help] [<SNAP|SSH_BACKUP|LOCAL_BACKUP> <QUERY>]

  <daemon|d> [--help] [start] [stop] [restart] [status] [init]

Snapshots vs Backups

Before we go on, let's clear up the difference between a snapshot and a backup.

A snapshot is a read-only nested subvolume created with the btrfs subvolume snapshot -r command.

A backup is a snapshot that has been transferred to another disk via the btrfs send/receive commands. Yabsm leverages btrfs's incremental backup feature (https://btrfs.wiki.kernel.org/index.php/Incremental_Backup) to do this efficiently. Yabsm can transfer backup snapshots over SSH and to local storage.

The Yabsm Daemon

usage: yabsm <daemon|d> [--help] [start] [stop] [restart] [status] [init]

Snapshots and backups are performed by the Yabsm daemon. The Yabsm daemon must be started as root so it can initialize its runtime environment, which includes creating a locked user named yabsm (and a group named yabsm) that the daemon will run as. You can initialize the daemon's runtime environment without actually starting the daemon by running yabsm daemon init.

When the daemon starts, it reads the /etc/yabsm.conf file that specifies its configuration to determine when to schedule the snapshots and backups and how to perform them. If the Yabsm daemon is already running and you make a configuration change, you must run yabsm daemon restart to apply the changes.

Initialize Daemon Runtime Environment

You can use the command yabsm daemon init to initialize the daemon's runtime environment without actually starting the daemon. Running this command creates the yabsm user and group, gives the yabsm user sudo access to btrfs-progs, creates the yabsm user's SSH keys, and creates the directories needed for performing all the snaps, ssh_backups, and local_backups defined in /etc/yabsm.conf.

Daemon Logging

The Yabsm daemon logs all of its errors to /var/log/yabsm. If, for example, you have an ssh_backup that is not being performed, the first thing you should do is check the log file.

Configuration

The Yabsm daemon is configured via the /etc/yabsm.conf file.

You can run the command yabsm config check to check the correctness of your config and output useful error messages if there are any problems.

Configuration Grammar

You must specify a yabsm_dir that Yabsm will use for storing snapshots and as a cache for holding data needed for performing snapshots and backups. Most commonly this directory is set to /.snapshots/yabsm. Yabsm will take this directory literally so you almost certainly want the path to end in /yabsm. If this directory does not exist, the Yabsm daemon will create it automatically when it starts.

yabsm_dir=/.snapshots/yabsm

The rest of the config will contain your configuration objects. There are 4 different configuration objects: subvols, snaps, ssh_backups, and local_backups. You can define as many configuration objects as you want. The general form of each configuration object is:

type name {
    key=val
    ...
}

All configuration objects share a namespace, so you must make sure they all have unique names.
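
The general form above can be sketched as a small parser. This is only an illustration of the grammar and the shared namespace rule, not Yabsm's own parser; the function name parse_config and its error messages are hypothetical, and comment handling assumes the trailing # style seen in this document's examples.

```python
import re

# Illustrative only: NOT Yabsm's parser. parse_config is a hypothetical name.
OBJECT_TYPES = {"subvol", "snap", "ssh_backup", "local_backup"}

def parse_config(text):
    """Return (yabsm_dir, objects), where objects maps name -> (type, settings)."""
    yabsm_dir = None
    objects = {}
    current = None  # name of the object currently being read, if any
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        header = re.fullmatch(r"(\w+)\s+(\w+)\s*\{", line)
        if header and current is None:
            obj_type, name = header.groups()
            if obj_type not in OBJECT_TYPES:
                raise ValueError(f"unknown object type: {obj_type}")
            if name in objects:  # all objects share one namespace
                raise ValueError(f"duplicate object name: {name}")
            objects[name] = (obj_type, {})
            current = name
        elif line == "}" and current is not None:
            current = None
        elif "=" in line:
            key, val = (part.strip() for part in line.split("=", 1))
            if current is None:
                if key != "yabsm_dir":
                    raise ValueError(f"unexpected top-level setting: {key}")
                yabsm_dir = val
            else:
                objects[current][1][key] = val
        else:
            raise ValueError(f"cannot parse line: {raw!r}")
    if current is not None:
        raise ValueError(f"unclosed object: {current}")
    return yabsm_dir, objects
```

In practice, yabsm config check is the authoritative way to validate a configuration file.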

Subvols

A subvol is the simplest configuration object and is used to give a name to one of your btrfs subvolumes. A subvol definition accepts one field named mountpoint which takes a value that is a path to a subvolume.

subvol home_subvol {
    mountpoint=/home
}

Timeframes

We need to understand timeframes before we can understand the other configuration objects (snaps, ssh_backups, and local_backups).

There are 5 timeframes: 5minute, hourly, daily, weekly, and monthly.

Snaps, ssh_backups, and local_backups are performed in one or more timeframes. For example, an ssh_backup may be configured to take backups in the hourly and weekly timeframes, which means that we want to back up every hour and once a week.

The following table provides a brief description of each timeframe:

5minute -> Every 5 minutes
hourly  -> At the beginning of every hour
daily   -> Every day at one or more times of the day
weekly  -> Once a week on a specific weekday at a specific time
monthly -> Once a month on a specific day at a specific time

To specify the timeframes you want, every snap, ssh_backup, and local_backup requires a timeframes setting. Its value is a comma-separated list of timeframe names. For example, this is how you specify that you want every timeframe:

timeframes=5minute,hourly,daily,weekly,monthly

Each timeframe you specify adds new required settings for the configuration object. Here is a table that shows the extra settings required for each timeframe:

5minute -> 5minute_keep
hourly  -> hourly_keep
daily   -> daily_keep,   daily_times
weekly  -> weekly_keep,  weekly_time,  weekly_day
monthly -> monthly_keep, monthly_time, monthly_day
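
The table above can be written as a lookup that reports which settings a configuration object is still missing. This is a minimal sketch; missing_settings is a hypothetical helper name, and the settings dictionary stands in for the key=val pairs of one configuration object.

```python
# Settings required by each timeframe, copied from the table above.
REQUIRED_SETTINGS = {
    "5minute": ["5minute_keep"],
    "hourly": ["hourly_keep"],
    "daily": ["daily_keep", "daily_times"],
    "weekly": ["weekly_keep", "weekly_time", "weekly_day"],
    "monthly": ["monthly_keep", "monthly_time", "monthly_day"],
}

def missing_settings(settings):
    """Hypothetical helper: list the required settings absent from `settings`."""
    missing = []
    for timeframe in settings.get("timeframes", "").split(","):
        timeframe = timeframe.strip()
        if timeframe not in REQUIRED_SETTINGS:
            raise ValueError(f"unknown timeframe: {timeframe!r}")
        missing += [k for k in REQUIRED_SETTINGS[timeframe] if k not in settings]
    return missing
```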

Any *_keep setting defines how many snapshots/backups you want to keep at one time for that timeframe category. For example, a common configuration is to keep 48 hourly snapshots so you can go back 2 days in one-hour increments.

The daily timeframe requires a daily_times setting, which takes a comma separated list of hh:mm times. Yabsm will perform the snapshot/backup every day at all the given times.

The weekly timeframe requires a weekly_day setting that takes a day of week string such as monday, thursday, or saturday and a weekly_time setting that takes a hh:mm time. The weekly snapshot/backup will be performed on the given day of the week at the given time.

The monthly timeframe requires a monthly_day setting that takes an integer between 1 and 31 and a monthly_time setting that takes a hh:mm time. The monthly snapshot/backup will be performed on the given day of the month at the given time.
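
To make the scheduling rules concrete, here is a sketch of a check that decides whether a timeframe is due at a given minute. It is not taken from the Yabsm daemon: is_due and parse_hhmm are hypothetical names, and treating "every 5 minutes" as "minutes divisible by 5" is an assumption. It also does not address what happens when monthly_day exceeds the length of a month.

```python
from datetime import datetime

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def parse_hhmm(text):
    """Turn an hh:mm string into an (hour, minute) tuple."""
    hour, minute = text.strip().split(":")
    return int(hour), int(minute)

def is_due(timeframe, settings, now):
    """Hypothetical check: should `timeframe` run at the minute `now`?"""
    clock = (now.hour, now.minute)
    if timeframe == "5minute":
        return now.minute % 5 == 0  # assumption: aligned to the clock
    if timeframe == "hourly":
        return now.minute == 0  # beginning of every hour
    if timeframe == "daily":
        times = [parse_hhmm(t) for t in settings["daily_times"].split(",")]
        return clock in times
    if timeframe == "weekly":
        return (WEEKDAYS[now.weekday()] == settings["weekly_day"].lower()
                and clock == parse_hhmm(settings["weekly_time"]))
    if timeframe == "monthly":
        return (now.day == int(settings["monthly_day"])
                and clock == parse_hhmm(settings["monthly_time"]))
    raise ValueError(f"unknown timeframe: {timeframe!r}")
```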

Snaps

A snap represents a snapshot configuration for some subvol. Here is an example of a snap that snapshots home_subvol twice a day. These snapshots will be stored in $yabsm_dir/$SNAP_NAME/$TIMEFRAME.

snap home_subvol_snap {
    subvol=home_subvol
    timeframes=daily
    daily_keep=62 # two months
    daily_times=13:40,23:59
}
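
With daily_keep=62, the snap above holds at most 62 daily snapshots at one time. The sketch below illustrates such a retention rule under the assumption that the oldest snapshots are the ones that expire; split_by_keep is a hypothetical helper, not a Yabsm function, and it works on timestamps rather than real snapshot paths.

```python
from datetime import datetime, timedelta

def split_by_keep(snapshot_times, keep):
    """Return (kept, expired): the `keep` newest timestamps and the rest.

    Assumption: when a timeframe holds more than `keep` snapshots,
    the oldest ones are the ones removed.
    """
    newest_first = sorted(snapshot_times, reverse=True)
    return newest_first[:keep], newest_first[keep:]
```

For example, 50 hourly timestamps with a keep value of 48 yield 48 kept entries and the 2 oldest as expired.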

SSH Backups

An ssh_backup represents a backup configuration that sends snapshots over a network via SSH. Here is an example of an ssh_backup that backs up home_subvol to larry@192.168.1.73:/backups/yabsm/laptop_home every night at 23:59:

ssh_backup home_subvol_larry_server {
    subvol=home_subvol
    ssh_dest=larry@192.168.1.73
    dir=/backups/yabsm/laptop_home
    timeframes=daily
    daily_keep=31
    daily_times=23:59
}

The difficult part of configuring an ssh_backup is making sure the SSH server is properly configured. You can test whether an ssh_backup can be performed by running yabsm config ssh-check <SSH_BACKUP>. For an ssh_backup to succeed, the following conditions must be satisfied:

  • The host's yabsm user can sign into the SSH destination (ssh_dest) using key-based authentication. To achieve this you must add the yabsm user's public SSH key (available via yabsm config ssh-key, run as root) to the server user's $HOME/.ssh/authorized_keys file.

  • The remote backup directory (dir) is an existing directory residing on a btrfs filesystem that the remote user has read and write permissions to.

  • The SSH user has root access to btrfs-progs via sudo. To do this you can add a file to /etc/sudoers.d/ containing a line like larry ALL=(root) NOPASSWD: /sbin/btrfs.

Local Backups

A local_backup represents a backup configuration that sends snapshots to a partition mounted on the host OS. This is useful for sending snapshots to an external hard drive plugged into your computer.

Here is an example local_backup that backs up home_subvol every hour and once a week:

local_backup home_subvol_easystore {
    subvol=home_subvol
    dir=/mnt/easystore/backups/yabsm/home_subvol
    timeframes=hourly,weekly
    hourly_keep=48
    weekly_keep=56
    weekly_day=sunday
    weekly_time=23:59
}

The backup directory (dir) must be an existing directory residing on a btrfs filesystem that the yabsm user has read permission on.

Configuration Querying

Yabsm comes with a config command that allows you to check and query your configuration.

usage: yabsm <config|c> [--help] [check ?file] [ssh-check <SSH_BACKUP>]
                        [ssh-key] [yabsm-user-home] [yabsm_dir] [subvols]
                        [snaps] [ssh_backups] [local_backups] [backups]

The check ?file subcommand checks that ?file is a valid Yabsm configuration file and if not prints useful error messages. If the ?file argument is omitted it defaults to /etc/yabsm.conf.

The ssh-check <SSH_BACKUP> subcommand checks that <SSH_BACKUP> can be performed and if not prints useful error messages. See the section SSH Backups for an explanation of the configuration required for performing an ssh_backup.

The ssh-key subcommand prints the yabsm user's public SSH key.

All of the other subcommands query for information derived from your /etc/yabsm.conf:

subvols         -> The names of all subvols.
snaps           -> The names of all snaps.
ssh_backups     -> The names of all ssh_backups.
local_backups   -> The names of all local_backups.
backups         -> The names of all ssh_backups and local_backups.
yabsm_dir       -> The directory used as the yabsm_dir.
yabsm-user-home -> The yabsm user's home directory.

Finding Snapshots

Now that we know how to configure Yabsm to take snapshots, we are going to want to locate those snapshots. Yabsm comes with a find command that allows you to locate snapshots and backups using a simple query language. Here is the usage string for the find command:

usage: yabsm <find|f> [--help] [<SNAP|SSH_BACKUP|LOCAL_BACKUP> <QUERY>]

Here are a few examples:

$ yabsm find home_snap back-2-mins
$ yabsm f root_ssh_backup 'after b-2-m'
$ yabsm f home_local_backup 10:45

The first argument is the name of any snap, ssh_backup, or local_backup. Because these configuration entities share the same namespace there is no risk of ambiguity.

The second argument is a snapshot location query. There are 7 types of queries:

all                 -> Every snapshot/backup, sorted newest to oldest.
newest              -> The most recent snapshot/backup.
oldest              -> The oldest snapshot/backup.
after   TIME        -> All the snapshots/backups that are newer than TIME.
before  TIME        -> All the snapshots/backups that are older than TIME.
between TIME1 TIME2 -> All the snapshots/backups that were taken between TIME1 and TIME2.
TIME                -> The snapshot/backup that was taken closest to TIME.
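
The semantics of the query types can be sketched over a plain list of snapshot timestamps. This is an illustration only, not how yabsm find is implemented: run_query and the "closest" label for a bare TIME query are hypothetical, and treating the between bounds as inclusive is an assumption.

```python
from datetime import datetime

def run_query(times, kind, *args):
    """Hypothetical evaluator: `times` and `args` are datetime objects."""
    newest_first = sorted(times, reverse=True)
    if kind == "all":
        return newest_first
    if kind == "newest":
        return newest_first[:1]
    if kind == "oldest":
        return newest_first[-1:]
    if kind == "after":
        return [t for t in newest_first if t > args[0]]
    if kind == "before":
        return [t for t in newest_first if t < args[0]]
    if kind == "between":
        low, high = sorted(args[:2])
        # Assumption: both bounds are inclusive.
        return [t for t in newest_first if low <= t <= high]
    if kind == "closest":  # stands for a bare TIME query
        if not newest_first:
            return []
        return [min(newest_first, key=lambda t: abs(t - args[0]))]
    raise ValueError(f"unknown query: {kind!r}")
```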

Time Abbreviations

In the list above, each TIME variable stands for a time abbreviation.

There are two different kinds of time abbreviations: relative times and immediate times.

Relative Times

A relative time comes in the form back-AMOUNT-UNIT, where back can be abbreviated to b, AMOUNT is a positive integer, and UNIT is either minutes, hours, or days. Each UNIT can be abbreviated:

minutes -> mins, m
hours   -> hrs, h
days    -> d

Here are some English descriptions of relative times.

back-5-h  -> 5 hours ago
b-10-m    -> 10 minutes ago
b-24-days -> 24 days ago
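
The back-AMOUNT-UNIT form is regular enough to parse with one pattern. Below is a minimal sketch of how a relative time could be resolved to a concrete timestamp; parse_relative_time is a hypothetical name and the code is not taken from Yabsm.

```python
import re
from datetime import datetime, timedelta

# Every accepted unit spelling, mapped to a timedelta keyword.
UNITS = {
    "minutes": "minutes", "mins": "minutes", "m": "minutes",
    "hours": "hours", "hrs": "hours", "h": "hours",
    "days": "days", "d": "days",
}

def parse_relative_time(text, now):
    """Resolve a string like 'back-5-h' or 'b-10-m' relative to `now`."""
    match = re.fullmatch(r"(?:back|b)-([0-9]+)-([a-z]+)", text)
    if not match or match.group(2) not in UNITS:
        raise ValueError(f"not a relative time: {text!r}")
    amount = int(match.group(1))
    if amount <= 0:
        raise ValueError("AMOUNT must be a positive integer")
    return now - timedelta(**{UNITS[match.group(2)]: amount})
```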

Immediate Times

An immediate_time is an abbreviation for a time/date denoted by yr_mon_day_hr:min.

The following table gives an example of each immediate_time form:

yr_mon_day_hr:min -> 2020_5_13_23:59
yr_mon_day        -> 2020_12_25
mon_day_hr:min    -> 12_25_8:30
mon_day_hr        -> 12_25_8
mon_day           -> 12_25
hr:min            -> 23:59

The immediate_time abbreviation rules are simple. If the yr, mon, or day is omitted then the current year, month, or day is assumed. If the hr or min is omitted then they are assumed to be 0. Therefore 2020_12_25 is always the same as 2020_12_25_00:00. If the current day is 2020/12/25, then 23:59 is the same as 2020_12_25_23:59.
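
The abbreviation rules can be sketched as one pattern per form from the table above, with omitted fields filled in as described. This is an illustration rather than Yabsm's own code: parse_immediate_time is a hypothetical name, and assuming a four-digit year is what keeps yr_mon_day distinct from mon_day_hr here.

```python
import re
from datetime import datetime

# One pattern per form in the table above (assumption: yr has four digits).
FORMS = [
    r"(?P<yr>\d{4})_(?P<mon>\d{1,2})_(?P<day>\d{1,2})_(?P<hr>\d{1,2}):(?P<min>\d{1,2})",
    r"(?P<yr>\d{4})_(?P<mon>\d{1,2})_(?P<day>\d{1,2})",
    r"(?P<mon>\d{1,2})_(?P<day>\d{1,2})_(?P<hr>\d{1,2}):(?P<min>\d{1,2})",
    r"(?P<mon>\d{1,2})_(?P<day>\d{1,2})_(?P<hr>\d{1,2})",
    r"(?P<mon>\d{1,2})_(?P<day>\d{1,2})",
    r"(?P<hr>\d{1,2}):(?P<min>\d{1,2})",
]

def parse_immediate_time(text, now):
    """Expand an immediate time to a full datetime, relative to `now`."""
    for form in FORMS:
        match = re.fullmatch(form, text)
        if match:
            parts = {k: int(v) for k, v in match.groupdict().items()}
            return datetime(
                parts.get("yr", now.year),    # omitted date parts: current
                parts.get("mon", now.month),
                parts.get("day", now.day),
                parts.get("hr", 0),           # omitted time parts: 0
                parts.get("min", 0),
            )
    raise ValueError(f"not an immediate time: {text!r}")
```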

Getting Support

Do not hesitate to open an issue at https://github.com/NicholasBHubbard/Yabsm/issues! To help get support, you may want to include the output of the following commands in your issue:

$ yabsm config check
$ yabsm config ssh-check <SSH_BACKUP>
$ cat /var/log/yabsm

Author

Nicholas Hubbard <nicholashubbard@posteo.net>

Copyright

Copyright (c) 2022-2023 by Nicholas Hubbard (nicholashubbard@posteo.net)

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with App-Yabsm. If not, see http://www.gnu.org/licenses/.