Git borg linker utility

Description

The git borg linker utility (abbreviated as gblk) is a tool that eases the use of borgbackup in a project that uses git as its version control system.

It helps you track the changes in your results folder every time you commit a change in your code. To version your results, gblk uses borgbackup, a backup tool that relies on data deduplication.

Prerequisites

Before installing gblk, make sure that git and borgbackup are installed on your system.

To install borg, you can go to borg's installation page.

As gblk is written in Rust, you also need a Rust toolchain. You can install it with rustup:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

You can optionally install delta. Delta shows the differences between two files and can be used with git. It can be customized by editing the ~/.gitconfig file. This tool is needed if you use the --diff option of the gblk mount command.

You can optionally install ImageMagick. ImageMagick performs various operations on images. This tool is needed if you plan to create image diffs with the --diff option of the gblk mount command.
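
Before installing, it can help to check which of these tools are already on your PATH. The snippet below is a minimal sketch, not part of gblk: missing_tools is a hypothetical helper name, and the binary names git, borg, cargo and delta are the usual ones for these tools. ImageMagick is left out because the name of its binary depends on the installed version.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with gblk): prints the name of every
# command given as argument that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Example usage:
#   missing_tools git borg cargo   # required tools
#   missing_tools delta            # optional tool
# An empty output means that every listed tool was found.
```

If the first call prints nothing, the required tools are in place and you can move on to the installation.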

Installation

To install gblk run the following command:

cargo install --git https://gitbio.ens-lyon.fr/LBMC/hub/git_borg_linker

Usage

gblk is meant to be used inside a project that uses git as its version control system. It is only helpful if your project folder contains a .git folder. A results folder must also be present in your project directory, because this is the folder gblk backs up.

To sum up, gblk must be used in a folder that has at least this structure:

project
├── .git
└── results
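
The layout above can be verified with a few lines of shell before running gblk. This is only a sketch: check_gblk_layout is a hypothetical helper, not a gblk subcommand, and it checks nothing more than the two entries shown in the tree.

```shell
#!/bin/sh
# Hypothetical helper (not part of gblk): checks that a directory contains
# the .git entry and the results folder described above.
check_gblk_layout() {
    dir="${1:-.}"
    # -e rather than -d: .git can be a plain file (git worktrees, submodules).
    if [ ! -e "$dir/.git" ]; then
        echo "missing .git in $dir" >&2
        return 1
    fi
    if [ ! -d "$dir/results" ]; then
        echo "missing results folder in $dir" >&2
        return 1
    fi
    echo "ok"
}

# Example usage, from the project directory:
#   check_gblk_layout .
```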

To display the help of gblk run the following command:

$ gblk help
A tool used to link borg and git together

This tool was created to link borg and git together and ease the management of developpment artifact
versionning using git

gbl
A tool used to link borg and git together

USAGE:
    gblk <SUBCOMMAND>

OPTIONS:
    -h, --help    Print help information

SUBCOMMANDS:
    checkout        Checkout results to the current git commit
    commit          Save the results folder of a git repository in an archive
    compact         This command frees repository space by compacting segments
    create-hooks    Create github hooks to use gbl automaticaly after commit, before and after
                        checkout
    delete          This command deletes an archive from the repository or the complete
                        repository
    delete-hooks    Remove the post-checkout and the post-commit hooks
    diff            Show differences between two commits of the `results` folder
    help            Print this message or the help of the given subcommand(s)
    init            Initialize a borg repository inside a git project
    list            List the content of the .borg archive
    mount           Mount an old file/directory from one or multiple archive named afert git
                        commits into the .mount folder inside de project directory
    pre-co          Check if a checkout can be performed without losing data
    prune           This command prunes the .borg repository. This can be used to keep only
                        archive created during a given time interval
    umount          Unmount everything in the folder .mount

You can type gblk help <SUBCOMMAND> or gblk <SUBCOMMAND> --help to display the help of any given subcommand.

Note that the create-hooks subcommand can be abbreviated to ch, and the checkout subcommand can be abbreviated to co. For example, gblk co --help works the same as gblk checkout --help.

Example usage without git hooks

Usage of gblk (init, commit, pre-co, checkout) without hooks

$ mkdir project
$ cd project
$ mkdir results src
$ git init
$ gblk init # creation of a .borg repository at the root of your project
$ exa -a --tree --level=1
.
├── .borg
├── .git
├── results
└── src
$
$ # Creation of a simple script that creates a result file
$ echo "echo 'result line' > results/result.txt" > src/script.sh
$ bash src/script.sh
$ ls results
result.txt
$ git add src/script.sh
$ git commit -m "src/script.sh: initial commit"
$ git rev-parse --verify HEAD # Show current commit
62efe302b6c2e7ab0dfd9c08ddfb0a87ea699c6d
$ gblk commit # creation of an archive in .borg repository
Repository: /home/nicolas/Documents/project/.borg
Archive name: 62efe302b6c2e7ab0dfd9c08ddfb0a87ea699c6d
Archive fingerprint: fbb7444b0d11da22959f7611b66d8d6378b666b379237d46a0448de352fbbb62
Time (start): Thu, 2022-05-12 14:35:43
Time (end):   Thu, 2022-05-12 14:35:43
Duration: 0.00 seconds
Number of files: 1
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:                  610 B                548 B                548 B
All archives:                   12 B                 15 B                735 B

                       Unique chunks         Total chunks
Chunk index:                       3                    3
------------------------------------------------------------------------------

$ gblk list # list archive in .borg repository
62efe302b6c2e7ab0dfd9c08ddfb0a87ea699c6d Thu, 2022-05-12 14:35:43 [fbb7444b0d11da22959f7611b66d8d6378b666b379237d46a0448de352fbbb62]
$
$ # New change
$ echo "echo 'newresult line' > results/newresult.txt" > src/script.sh
$ bash src/script.sh
$ ls results
newresult.txt   result.txt
$ git add src/script.sh
$ git commit -m "src/script.sh"
$ gblk commit
Repository: /home/nicolas/Documents/project/.borg
Archive name: 705b95f48fe52bf9aac4406e6d4d7eb16a75f543
Archive fingerprint: ed45c00ec2059f366f53c9c9288a72ff1c9428a16ce37614bf59a56b89fc4dee
Time (start): Thu, 2022-05-12 14:37:43
Time (end):   Thu, 2022-05-12 14:37:43
Duration: 0.00 seconds
Number of files: 2
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:                  625 B                566 B                551 B
All archives:                   39 B                 48 B              1.56 kB

                       Unique chunks         Total chunks
Chunk index:                       6                    7
------------------------------------------------------------------------------
$ gblk list
62efe302b6c2e7ab0dfd9c08ddfb0a87ea699c6d Thu, 2022-05-12 14:35:43 [fbb7444b0d11da22959f7611b66d8d6378b666b379237d46a0448de352fbbb62]
705b95f48fe52bf9aac4406e6d4d7eb16a75f543 Thu, 2022-05-12 14:37:43 [ed45c00ec2059f366f53c9c9288a72ff1c9428a16ce37614bf59a56b89fc4dee]

$ # checkout
$ # The next command is important: it checks that your results folder contains no new results compared to the archive named after the current git commit id. If it reports no error, you won't lose any data
$ gblk pre-co
$ git checkout 62efe302b6c2e7ab0dfd9c08ddfb0a87ea699c6d
$ gblk co --mode hard # hard mode deletes the files that were not present in the first commit. Otherwise, only the files that exist at the destination commit are updated.
$ ls results
result.txt

Note: if gblk pre-co reports that you might lose data compared to the saved version of your current commit, run gblk commit --update.
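
The three checkout steps of this example can be chained in a small wrapper so that the checkout is skipped when the check fails. This is a sketch under one assumption: that gblk pre-co exits with a non-zero status when the results folder contains unsaved changes. safe_checkout is a hypothetical name, and GBLK and GIT are variables introduced here only so the commands can be replaced during a dry run.

```shell
#!/bin/sh
# Commands are overridable, e.g. GBLK="echo gblk" GIT="echo git" for a dry run.
GBLK="${GBLK:-gblk}"
GIT="${GIT:-git}"

# Hypothetical wrapper (not part of gblk): pre-co, git checkout, gblk co.
safe_checkout() {
    target="$1"
    if [ -z "$target" ]; then
        echo "usage: safe_checkout <commit-or-branch>" >&2
        return 2
    fi
    # Assumption: a non-zero exit status means the results folder is not saved.
    # $GBLK and $GIT are left unquoted on purpose so they may contain arguments.
    if ! $GBLK pre-co; then
        echo "pre-co failed: consider running 'gblk commit --update' first" >&2
        return 1
    fi
    $GIT checkout "$target" || return 1
    $GBLK co --mode hard
}

# Example usage:
#   safe_checkout 62efe302b6c2e7ab0dfd9c08ddfb0a87ea699c6d
```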

Example usage with git hooks

Git hooks are commands that can be executed automatically before and after some git commands. They are defined in the .git/hooks directory of the repository.

gblk can create two hooks:

  • post-commit hook that executes gblk commit after every git commit
  • post-checkout hook that executes:
    1. git co to revert to the previous commit, as a pre-checkout hook doesn't exist.
    2. gblk pre-co to make sure that no data is lost before the actual checkout
    3. git checkout to do the actual checkout
    4. gblk co to restore the results folder corresponding to your target commit

As git has no pre-checkout hook, the post-checkout hook is used to cancel the first checkout and check for data loss.
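
To make the idea of a hook concrete, here is a sketch of how a post-commit hook that runs gblk commit could be written by hand. It is an illustration only: the files actually generated by gblk create-hooks may look different, and install_post_commit_hook is a hypothetical helper that refuses to overwrite an existing hook.

```shell
#!/bin/sh
# Hypothetical helper (not part of gblk): writes a minimal post-commit hook
# into <repo>/.git/hooks. The hook body is an assumption based on the
# description above, not a copy of what gblk generates.
install_post_commit_hook() {
    hooks_dir="${1:-.}/.git/hooks"
    hook="$hooks_dir/post-commit"
    if [ ! -d "$hooks_dir" ]; then
        echo "no hooks directory at $hooks_dir" >&2
        return 1
    fi
    if [ -e "$hook" ]; then
        echo "$hook already exists, leaving it untouched" >&2
        return 1
    fi
    printf '%s\n' \
        '#!/bin/sh' \
        '# Archive the results folder after every git commit.' \
        'gblk commit' > "$hook"
    # git only runs hooks that are executable.
    chmod +x "$hook"
}

# Example usage, from the project directory:
#   install_post_commit_hook .
```

In practice, gblk create-hooks or gblk init --hooks remains the simplest way to get both hooks and the git aliases at once.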

Note that when gblk creates hooks it also modifies the .git/config file to add 3 aliases:

  1. alias co: performs a quiet checkout. This alias is used in step 1 of the post-checkout hook, so it is recommended to use it whenever you perform a checkout. It gives a quiet initial checkout that is then quietly reverted so that gblk can check that no data is lost.
  2. alias conh: performs a checkout without the post-checkout hook. This can be useful when the current commit no longer has an archive in your .borg repository (for example because it was deleted). If you then try to check out another commit, gblk pre-co prevents it because it considers that the results folder was never saved with gblk commit. To perform the checkout anyway, you can use: git conh [TARGET-BRANCH] && gblk checkout --mode hard
  3. alias cnh: performs a commit without the post-commit hook.

Usage of gblk (init, commit, checkout) with hooks

$ mkdir project
$ cd project
$ mkdir results src
$ git init
$ gblk init --hooks # creation of a .borg repository at the root of your project and addition of hooks to your .git/hooks folder
$ # Note: If you forgot the --hooks option, you can always enable them later with `gblk create-hooks`
$ exa .git/hooks -a --tree --level=1 | grep -v sample
.git/hooks
├── post-checkout
├── post-commit
$
$ # Creation of a simple script that creates a result file
$ echo "echo 'result line' > results/result.txt" > src/script.sh
$ bash src/script.sh
$ ls results
result.txt
$ git add src/script.sh
$ git commit -m "src/script.sh: initial commit"
------------------------------------------------------------------------------
Repository: /home/nicolas/Documents/project/.borg
Archive name: 2da3c535543fb9a216b52f29ecf598b6310c1223
Archive fingerprint: e3fc804ec86e7d372f44cdc7e8c88bcd23cecedfdf7dc6ed3ac78c86a31f375b
Time (start): Thu, 2022-05-12 17:10:40
Time (end):   Thu, 2022-05-12 17:10:40
Duration: 0.00 seconds
Number of files: 1
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:                  610 B                548 B                548 B
All archives:                   12 B                 15 B                735 B

                       Unique chunks         Total chunks
Chunk index:                       3                    3
------------------------------------------------------------------------------
[master (root-commit) 2da3c53] src/script.sh: initial commit
 1 file changed, 1 insertion(+)
 create mode 100644 src/script.sh
$ git rev-parse --verify HEAD # Show current commit
2da3c535543fb9a216b52f29ecf598b6310c1223
$ gblk list # list archive in .borg repository
2da3c535543fb9a216b52f29ecf598b6310c1223 Thu, 2022-05-12 17:10:40 [e3fc804ec86e7d372f44cdc7e8c88bcd23cecedfdf7dc6ed3ac78c86a31f375b]
$
$ # New change
$ echo "echo 'newresult line' > results/newresult.txt" > src/script.sh
$ bash src/script.sh
$ git add src/script.sh
$ git commit -m "src/script.sh"
------------------------------------------------------------------------------
Repository: /home/nicolas/Documents/project/.borg
Archive name: f47abde74c32fe570bf69ca28168120e67703754
Archive fingerprint: e48648ee971fd47c39fcaedf8aee73138334db0f4255e42b2631303df308d2ca
Time (start): Thu, 2022-05-12 17:11:52
Time (end):   Thu, 2022-05-12 17:11:52
Duration: 0.00 seconds
Number of files: 2
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:                  625 B                566 B                551 B
All archives:                   39 B                 48 B              1.55 kB

                       Unique chunks         Total chunks
Chunk index:                       6                    7
------------------------------------------------------------------------------
[master f47abde] src/script.sh
 1 file changed, 1 insertion(+), 1 deletion(-)
$ # checkout
$ # Let's add a line to results/newresult.txt that was not saved by borg before
$ echo "newline" >> results/newresult.txt
$ # Let's try to make a checkout
$ git co 2da3c535543fb9a216b52f29ecf598b6310c1223
Note: switching to '2da3c535543fb9a216b52f29ecf598b6310c1223'.
...
Switched to branch 'master'
Your results folder contains unsaved changes!
Please update your current commit with:  gbl commit --update
$ # Nothing happens because we have unsaved changes. Let's update our changes
$ gblk commit --update # this rewrites the archive named after the current commit id
------------------------------------------------------------------------------
...
------------------------------------------------------------------------------
$ git co 2da3c535543fb9a216b52f29ecf598b6310c1223
Note: switching to '2da3c535543fb9a216b52f29ecf598b6310c1223'.
...
HEAD is now at 2da3c53 src/script.sh: initial commit
$ gblk co --mode hard # hard mode deletes the files that were not present in the first commit. Otherwise, only the files that exist at the destination commit are updated.
$ cat results/*
File: results/result.txt
result line
$ git co master
$ cat results/*
File: results/newresult.txt
newresult line
newline

File: results/result.txt
result line

mount command

gblk has two commands, mount and umount, that respectively mount your borg archives into the .mount folder of the project directory and unmount them.

⚠️ Important note: while a borg archive is mounted, the borg repository is locked! If you forget to unmount it and then call a borg command on the repository, you can get the following error message: Failed to create/acquire the lock ... . To resolve it, just run gblk umount. gblk silently unmounts the borg archive before the commit, checkout, pre-co and mount commands.

gblk mount usage:

gblk-mount
Mount an old file/directory from one or multiple archive named afert git commits into the .mount
folder inside de project directory

USAGE:
    gblk mount [OPTIONS]

OPTIONS:
    -c, --commit <COMMIT>
            Commit name, sh: Glob is supported. This is an optional parameter: if not set then all
            commit archives will be mounted into the .mount directory

    -d, --diff
            Displays the differences between two files mounted corresponding to the given path.

            Note that if only one file is recovered then, the other is taken from the current result
            folder

            This option is deactivated when used with --diff

    -h, --help
            Print help information

    -l, --last <LAST>
            Consider last N archive after  other filter were applied

    -p, --path <PATH>
            The file/directory to extract. This is an optional parameter. If not set then all files
            in the archive will be displayed

    -v, --versions
            If set, displays the .mount directory in 'version view'.

            - Normal view: The `.mount` directory contains a subfolder with the name of archives.

            - Version view: The `.mount` directory contains the results folder and every file within
            it becomes a directory storing every version of that file

Examples:

$ gblk mount # mount all archives in `.borg` into `.mount` folder
$ gblk mount -v # mount all archives in `.borg` into `.mount` folder in version view.
$ # It is not necessary to execute "gblk umount" before the above command because it is run silently before "gblk mount"
$ gblk mount -c '[ab]*' # mount all archives named after commits starting with 'a' or 'b'. Note that the quotes are necessary for the -c option
$ gblk mount -c 'ae8rt77*' -p 'results/fichier.txt' # mount all files matching 'results/fichier.txt' inside archives named after commits starting with 'ae8rt77'
$ gblk mount -p 'results/**/*.txt' # mount all txt files inside every archive
$ gblk mount --last 2 # mount the last two archives named after the last commits
$ gblk umount # unmount the archive mounted into `.mount` directory

If two archives contain a file named file.txt and we want to compare the two versions directly, in a similar way to git diff, we can enter:

gblk mount -p 'results/file.txt' --diff

The differences between the two files will be displayed with delta:

Δ .mount/cfdc/results/lol.txt ⟶   .mount/4ec196bb/results/lol.txt
────────────────────────────────────────────────────────────────────────

─────┐
• 1: │
─────┘
│  1 │newblooup                                       │  1 │newblooup
│  2 │dfbifbsi                                        │  2 │dfbifbsi
│    │                                                │  3 │obeigbvisdb

If only one file matches the path given with the --path argument of the mount command, gblk looks for a matching file in the current results folder and compares against it.

To learn how to customize the way delta displays differences, see delta's documentation.

Note that you can also display the differences between images. To do so, ImageMagick must be installed.

If you plan to make PDF diffs, you might need to edit the ImageMagick policy file /etc/ImageMagick-[VERSION]/policy.xml and replace the line:

<policy domain="coder" rights="none" pattern="PDF" />

with

<policy domain="coder" rights="read | write" pattern="PDF" />

Currently, the following formats are available for an image diff: PNG, JPEG, BMP, ICO, SVG, PDF

Let's say we have an image named im1.png inside the last commit. To compare it with the image im1.png in the results folder, you can type:

gblk mount --last 1 -p 'results/im1.png' --diff

The diff image will be created inside the .tmp directory of the project folder.

borgignore file

It's possible to tell borg to ignore files in the results folder. To do that, you can create a .borgignore file at the root of the project directory.

To ignore a given file named file.txt you can add to your .borgignore file the following content:

- results/file.txt

Note that:

  1. You have to put a results/ prefix in front of your files.
  2. To exclude a file, the line must begin with "- ".

To ignore every file named file.txt wherever it is, use the following syntax:

- /**/file.txt

You can also ignore files with a given extension inside a folder (named folder here) with:

- results/folder/*.txt

To ignore all files with a given extension, use:

- /**/*.txt

To rescue files from being ignored by another pattern, you can use a line beginning with '+ ' in the .borgignore file. For example, suppose the results folder contains a test folder with the files a.txt, b.txt, ..., z.txt, and we want to ignore everything except the c.txt file. This can be done with:

- results/test/*.txt
+ results/test/c.txt
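
Since every rule has to start with "- " or "+ ", a quick check can catch lines that are missing their prefix. The function below is a minimal sketch and not part of gblk: check_borgignore is a hypothetical name, and the check only knows the two prefixes described above, so any other kind of line (a comment, for instance) is reported too.

```shell
#!/bin/sh
# Hypothetical helper (not part of gblk): prints, with its line number, every
# non-empty line of a .borgignore file that starts with neither "- " nor "+ ".
# Returns 0 when all lines look valid, 1 otherwise, 2 if the file is missing.
check_borgignore() {
    file="${1:-.borgignore}"
    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 2
    fi
    # grep -v selects the lines matching none of the patterns; it succeeds
    # only when at least one such (offending) line exists.
    if grep -n -v -e '^- ' -e '^[+] ' -e '^$' "$file"; then
        return 1
    fi
    return 0
}

# Example usage, from the project directory:
#   check_borgignore .borgignore
```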

Delete command

This command is a wrapper of the borg delete command. If you need information about the borg delete command, you can check borg's documentation.

This command can be used to delete specific archives directly by their name, by a prefix or by a glob.

Note that this command doesn't actually free disk space. You have to use gblk compact afterwards to achieve this.

gblk delete usage

To display the help of gblk delete, run the following command:

gblk delete -h # -h for compact help, --help for a more exhaustive help

To see which archives you are about to remove, enter

gblk delete --list --dry-run [OTHER_OPTIONS]

  • The --dry-run option keeps the archives unchanged
  • The --list option displays what was deleted (without the --dry-run option) or what would be deleted (with the --dry-run option)

Prune command

This command is a wrapper of the borg prune command. Check borg's documentation for more details.

This command can be used to keep archives created during a given period of time and remove others.

Note that this command doesn't actually free disk space. You have to use gblk compact afterwards to achieve this.

gblk prune usage

To display the help of gblk prune, run the following command:

gblk prune -h # -h for compact help, --help for a more exhaustive help

To see what archives you are about to remove, enter

gblk prune --list --dry-run [OTHER_OPTIONS]

gblk compact

This command frees .borg repository space.

You can use this command after deleting one or more archives because it will really free repository space.

To display the help of gblk compact, run the following command:

gblk compact -h # -h for compact help, --help for a more exhaustive help

To compact your .borg folder, you can run

gblk compact --verbose

If many parts of the .borg folder need compaction, this command may take a while. Consider using the --progress option in this case.
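
Because neither prune nor delete frees disk space on its own, a typical session is: preview with --dry-run, prune for real, then compact. The sketch below chains these steps using only the commands shown above. prune_then_compact, GBLK and CONFIRM are hypothetical names introduced for this example; GBLK exists only so the command can be replaced during a dry run.

```shell
#!/bin/sh
# Command is overridable, e.g. GBLK="echo gblk" to print instead of running.
GBLK="${GBLK:-gblk}"

# Hypothetical wrapper (not part of gblk): previews a prune and, only when
# CONFIRM=yes is set, prunes for real and then frees the repository space.
prune_then_compact() {
    # Step 1: show what would be removed, without touching the repository.
    $GBLK prune --list --dry-run "$@" || return 1
    if [ "${CONFIRM:-no}" != "yes" ]; then
        echo "Dry run only; rerun with CONFIRM=yes to prune and compact."
        return 0
    fi
    # Step 2: remove the archives, then reclaim the disk space.
    $GBLK prune --list "$@" || return 1
    $GBLK compact
}

# Example usage:
#   prune_then_compact --keep-within 7d --keep-monthly 5
#   CONFIRM=yes prune_then_compact --keep-within 7d --keep-monthly 5
```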

gblk config

Sometimes, you want to keep only a small number of archives of your results folder to save space. If you always want to keep all backups from the last week and one backup per month for 5 months, it can be tedious to remember the prune command that does that:

gblk prune --keep-within '7d' --keep-monthly 5 --dry-run

You may want to put those settings in a local or in a global configuration file to always prune a given project or all your projects in the same way.

gblk provides a way to do that by using the borg configuration file .borg/config as a local configuration file and the ~/.gblkconfig file as a global configuration file.

Note: If both the local and global configuration files contain gblk settings used for pruning, only the local settings are used.

Add new settings in configuration file

To add or update settings in the local configuration file you can use the following command:

gblk config add <KEY> <VALUE> [--global]

Where KEY corresponds to a prune command argument. You can choose from:

  • keep_within, keep_last, keep_minutely, keep_hourly, keep_daily
  • keep_weekly, keep_monthly, keep_yearly, prefix, glob_archives, save_space

And VALUE corresponds to the value to associate with the key.

Check borg's documentation to know what those arguments do. You can also run the command gblk prune --help to see a description of those arguments.

Note: you can also write the KEY with '-' in place of '_' (for example keep-within instead of keep_within). gblk will behave the same way.

For example, to keep all backups from the last week and one backup per month for 5 months, run the following commands:

gblk config add keep-within '7d'
gblk config add keep-monthly 5

You can use the --global flag to write those settings to the global configuration file instead.

Display gblk settings used for pruning

To display the current settings of the gblk local or global configuration file, you can use the following command:

$ gblk config show # add --global to see the settings of the global configuration file
keep_within = '7d'
keep_monthly = 5

Note: The '-' in keep-within is replaced by an underscore inside the configuration file
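
The correspondence between the two spellings of a key can be pictured with one line of shell. This only illustrates the naming rule described above and is not how gblk implements it; normalize_key is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper (not part of gblk): converts a key written with '-'
# into the underscore form stored in the configuration file.
normalize_key() {
    printf '%s\n' "$1" | sed 's/-/_/g'
}

# Example usage:
#   normalize_key keep-within    # prints keep_within
#   normalize_key keep_monthly   # already normalized, printed unchanged
```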

Remove gblk settings used for pruning

To remove a setting previously defined in the local configuration file, you can enter the following command:

gblk config rm <KEY> [--global]

For example, let's remove the setting keep_within:

$ gblk config show
keep_within = '7d'
keep_monthly = 5
$ gblk config rm keep-within # 'gblk config rm keep_within' also works
$ gblk config show
keep_monthly = 5

To do the same thing with the global configuration file just add --global at the end of those 3 commands.

Pruning archives using gblk settings

Finally, to prune your results archives using the settings defined in the global or local configuration, you can use the following command:

gblk config prune [OPTIONS]

Note: If both the local and global configuration files contain gblk settings used for pruning, only the local settings are used.

You can see which options you can add to your command with

$ gblk config prune --help
gblk-config-prune
Prune using the project configuration

USAGE:
    gblk config prune [OPTIONS]

OPTIONS:
    -h, --help    Print help information

Filtering options:
    -n, --dry-run    Do not change the repository
        --list       Output verbose list of archive
    -s, --stats      Print statistics for the deleted archive
        --force      Force deletion of corrupted archives, use `--force --force` in case `--force`
                     does not work