git_borg_linker

    Last commit: 09d69a37 (Fontrodona Nicolas)

    Git borg linker utility

    Description

    The git borg linker utility (abbreviated as gblk) is a tool that aims to ease the use of borgbackup in a project that uses git as its version control system.

    It helps you track the changes in your results folder every time you commit a change in your code. To version your results, gblk uses borgbackup, a backup tool based on data deduplication.

    Prerequisites

    To install gblk, git and borgbackup must be installed on your system.

    To install borg, you can go to borg's installation page.

    As gblk is written in Rust, you need to install Rust with:

    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

    You can optionally install delta. Delta shows the differences between two files and can be used with git. It can be customized by editing the ~/.gitconfig file. This tool is needed if you use the --diff option of the gblk mount command.

    You can optionally install ImageMagick. ImageMagick performs various operations on images. This tool is needed if you plan to create image diffs with the --diff option of the gblk mount command.

    Installation

    To install gblk run the following command:

    cargo install --git https://gitbio.ens-lyon.fr/LBMC/hub/git_borg_linker

    Usage

    gblk is meant to be used inside a project that uses git as its version control system. It is only helpful if your project folder contains a .git folder. A results folder must also be present in your project directory, as this is the folder gblk backs up.

    To sum up, gblk must be used in a folder that has this minimal structure:

    project
    ├── .git
    └── results
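
As a quick sanity check before running gblk, the layout above can be verified from the shell. The sketch below is only an illustration: the function name is_gblk_project is hypothetical and is not part of gblk.

```shell
# Hypothetical helper (not part of gblk): succeeds only if the given
# directory contains both a .git folder and a results folder.
is_gblk_project() {
    dir="${1:-.}"
    [ -d "$dir/.git" ] && [ -d "$dir/results" ]
}

# Example: report whether the current directory has the expected layout.
if is_gblk_project .; then
    echo "layout ok: .git and results found"
else
    echo "missing .git or results folder"
fi
```

The check only looks at the two folders named in this section; it does not verify that a .borg repository has been initialized.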

    To display the help of gblk run the following command:

    $ gblk help
    A tool used to link borg and git together
    
    This tool was created to link borg and git together and ease the management of developpment artifact
    versionning using git
    
    gbl
    A tool used to link borg and git together
    
    USAGE:
        gblk <SUBCOMMAND>
    
    OPTIONS:
        -h, --help    Print help information
    
    SUBCOMMANDS:
        checkout        Checkout results to the current git commit
        commit          Save the results folder of a git repository in an archive
        compact         This command frees repository space by compacting segments
        create-hooks    Create github hooks to use gbl automaticaly after commit, before and after
                            checkout
        delete          This command deletes an archive from the repository or the complete
                            repository
        delete-hooks    Remove the post-checkout and the post-commit hooks
        diff            Show differences between two commits of the `results` folder
        help            Print this message or the help of the given subcommand(s)
        init            Initialize a borg repository inside a git project
        list            List the content of the .borg archive
        mount           Mount an old file/directory from one or multiple archive named afert git
                            commits into the .mount folder inside de project directory
        pre-co          Check if a checkout can be performed without losing data
        prune           This command prunes the .borg repository. This can be used to keep only
                            archive created during a given time interval
        umount          Unmount everything in the folder .mount

    You can type gblk help <SUBCOMMAND> or gblk <SUBCOMMAND> --help to display the help of any given subcommand.

    Note that the create-hooks subcommand can be abbreviated to ch, and the checkout subcommand can be abbreviated to co. For example, gblk co --help works the same as gblk checkout --help.

    Example usage without git hooks

    Usage of gblk (init, commit, pre-co, checkout) without hooks

    $ mkdir project
    $ cd project
    $ mkdir results src
    $ git init
    $ gblk init # creation of a .borg repository at the root of your project
    $ exa -a --tree --level=1
    .
    ├── .borg
    ├── .git
    ├── results
    └── src
    $
    $ # Creation of a simple script that creates a result file
    $ echo "echo 'result line' > results/result.txt" > src/script.sh
    $ bash src/script.sh
    $ ls results
    result.txt
    $ git add src/script.sh
    $ git commit -m "src/script.sh: initial commit"
    $ git rev-parse --verify HEAD # Show current commit
    62efe302b6c2e7ab0dfd9c08ddfb0a87ea699c6d
    $ gblk commit # creation of an archive in .borg repository
    Repository: /home/nicolas/Documents/project/.borg
    Archive name: 62efe302b6c2e7ab0dfd9c08ddfb0a87ea699c6d
    Archive fingerprint: fbb7444b0d11da22959f7611b66d8d6378b666b379237d46a0448de352fbbb62
    Time (start): Thu, 2022-05-12 14:35:43
    Time (end):   Thu, 2022-05-12 14:35:43
    Duration: 0.00 seconds
    Number of files: 1
    Utilization of max. archive size: 0%
    ------------------------------------------------------------------------------
                           Original size      Compressed size    Deduplicated size
    This archive:                  610 B                548 B                548 B
    All archives:                   12 B                 15 B                735 B
    
                           Unique chunks         Total chunks
    Chunk index:                       3                    3
    ------------------------------------------------------------------------------
    
    $ gblk list # list archive in .borg repository
    62efe302b6c2e7ab0dfd9c08ddfb0a87ea699c6d Thu, 2022-05-12 14:35:43 [fbb7444b0d11da22959f7611b66d8d6378b666b379237d46a0448de352fbbb62]
    $
    $ # New change
    $ echo "echo 'newresult line' > results/newresult.txt" > src/script.sh
    $ bash src/script.sh
    $ ls results
    newresult.txt   result.txt
    $ git add src/script.sh
    $ git commit -m "src/script.sh"
    $ gblk commit
    Repository: /home/nicolas/Documents/project/.borg
    Archive name: 705b95f48fe52bf9aac4406e6d4d7eb16a75f543
    Archive fingerprint: ed45c00ec2059f366f53c9c9288a72ff1c9428a16ce37614bf59a56b89fc4dee
    Time (start): Thu, 2022-05-12 14:37:43
    Time (end):   Thu, 2022-05-12 14:37:43
    Duration: 0.00 seconds
    Number of files: 2
    Utilization of max. archive size: 0%
    ------------------------------------------------------------------------------
                           Original size      Compressed size    Deduplicated size
    This archive:                  625 B                566 B                551 B
    All archives:                   39 B                 48 B              1.56 kB
    
                           Unique chunks         Total chunks
    Chunk index:                       6                    7
    ------------------------------------------------------------------------------
    $ gblk list
    62efe302b6c2e7ab0dfd9c08ddfb0a87ea699c6d Thu, 2022-05-12 14:35:43 [fbb7444b0d11da22959f7611b66d8d6378b666b379237d46a0448de352fbbb62]
    705b95f48fe52bf9aac4406e6d4d7eb16a75f543 Thu, 2022-05-12 14:37:43 [ed45c00ec2059f366f53c9c9288a72ff1c9428a16ce37614bf59a56b89fc4dee]
    
    $ # checkout
    $ # The next command is important: it checks that your results folder contains no changes that are missing from the archive named after the current git commit id. If it reports no error, the checkout won't lose any data
    $ gblk pre-co
    $ git checkout 62efe302b6c2e7ab0dfd9c08ddfb0a87ea699c6d
    $ gblk co --mode hard # hard mode deletes the files that were not present in the first commit. Otherwise, only the files that exist in the destination commit are updated.
    $ ls results
    result.txt

    Note: if gblk pre-co says that you might lose data compared to the saved version of your current commit, then use gblk commit --update.
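
The manual sequence above (gblk pre-co, then git checkout, then gblk co) can be wrapped in a small shell function. This is a sketch, not part of gblk: the name safe_checkout is hypothetical, and it assumes that gblk pre-co exits with a non-zero status when the results folder contains unsaved changes.

```shell
# Hypothetical wrapper (not part of gblk).
# Assumption: `gblk pre-co` returns a non-zero exit status when the
# results folder has changes that are not saved in the current archive.
safe_checkout() {
    target="${1:-}"
    if [ -z "$target" ]; then
        echo "usage: safe_checkout <commit-or-branch>" >&2
        return 2
    fi
    if ! gblk pre-co; then
        echo "unsaved results: run 'gblk commit --update' first" >&2
        return 1
    fi
    # Move the code first, then restore the matching results folder.
    git checkout "$target" && gblk co --mode hard
}
```

It would be called as safe_checkout 62efe302b6c2e7ab0dfd9c08ddfb0a87ea699c6d. If the exit-status assumption does not hold for your version of gblk, the guard will not stop the checkout.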

    Example usage with git hooks

    Git hooks are commands that can be automatically executed before and after some git commands. They are defined in the .git/hooks folder of the repository.

    gblk can create two hooks:

    • post-commit hook that executes gblk commit after every git commit
    • post-checkout hook that executes:
      1. git co to revert back to the last commit, as a pre-checkout hook doesn't exist.
      2. gblk pre-co to be sure not to lose any data before the actual checkout
      3. git checkout to do the actual checkout
      4. gblk co to restore the results folder corresponding to your target commit

    Since git has no pre-checkout hook, the post-checkout hook is used to cancel the first checkout and check for data loss.

    Note that when gblk creates hooks it also modifies the .git/config file to add 3 aliases:

    1. alias co: Performs a quiet checkout. This alias is used in step 1 of the post-checkout hook, so it is recommended to use it when you perform a checkout. It makes the initial checkout quiet; that checkout is then quietly reverted so gblk can check that no data is lost.
    2. alias conh: This alias performs a checkout without the post-checkout hook. This can be useful after you check out a commit whose archive was deleted from your .borg repository. If you then want to check out another commit, gblk pre-co will prevent it because it will think that the results folder was not committed with gblk commit. To perform the checkout anyway, you can use: git conh [TARGET-BRANCH] && gblk checkout --mode hard
    3. alias cnh: This alias performs a commit without using the post-commit hook.

    $ mkdir project
    $ cd project
    $ mkdir results src
    $ git init
    $ gblk init --hooks # creation of a .borg repository at the root of your project and addition of the hooks to your .git/hooks folder
    $ # Note: If you forgot the --hooks option, you can always enable them later with `gblk create-hooks`
    $ exa .git/hooks -a --tree --level=1 | grep -v sample
    .git/hooks
    ├── post-checkout
    ├── post-commit
    $
    $ # Creation of a simple script that creates a result file
    $ echo "echo 'result line' > results/result.txt" > src/script.sh
    $ bash src/script.sh
    $ ls results
    result.txt
    $ git add src/script.sh
    $ git commit -m "src/script.sh: initial commit"
    ------------------------------------------------------------------------------
    Repository: /home/nicolas/Documents/project/.borg
    Archive name: 2da3c535543fb9a216b52f29ecf598b6310c1223
    Archive fingerprint: e3fc804ec86e7d372f44cdc7e8c88bcd23cecedfdf7dc6ed3ac78c86a31f375b
    Time (start): Thu, 2022-05-12 17:10:40
    Time (end):   Thu, 2022-05-12 17:10:40
    Duration: 0.00 seconds
    Number of files: 1
    Utilization of max. archive size: 0%
    ------------------------------------------------------------------------------
                           Original size      Compressed size    Deduplicated size
    This archive:                  610 B                548 B                548 B
    All archives:                   12 B                 15 B                735 B
    
                           Unique chunks         Total chunks
    Chunk index:                       3                    3
    ------------------------------------------------------------------------------
    [master (root-commit) 2da3c53] src/script.sh: initial commit
     1 file changed, 1 insertion(+)
     create mode 100644 src/script.sh
    $ git rev-parse --verify HEAD # Show current commit
    2da3c535543fb9a216b52f29ecf598b6310c1223
    $ gblk list # list archive in .borg repository
    2da3c535543fb9a216b52f29ecf598b6310c1223 Thu, 2022-05-12 17:10:40 [e3fc804ec86e7d372f44cdc7e8c88bcd23cecedfdf7dc6ed3ac78c86a31f375b]
    $
    $ # New change
    $ echo "echo 'newresult line' > results/newresult.txt" > src/script.sh
    $ bash src/script.sh
    $ git add src/script.sh
    $ git commit -m "src/script.sh"
    ------------------------------------------------------------------------------
    Repository: /home/nicolas/Documents/project/.borg
    Archive name: f47abde74c32fe570bf69ca28168120e67703754
    Archive fingerprint: e48648ee971fd47c39fcaedf8aee73138334db0f4255e42b2631303df308d2ca
    Time (start): Thu, 2022-05-12 17:11:52
    Time (end):   Thu, 2022-05-12 17:11:52
    Duration: 0.00 seconds
    Number of files: 2
    Utilization of max. archive size: 0%
    ------------------------------------------------------------------------------
                           Original size      Compressed size    Deduplicated size
    This archive:                  625 B                566 B                551 B
    All archives:                   39 B                 48 B              1.55 kB
    
                           Unique chunks         Total chunks
    Chunk index:                       6                    7
    ------------------------------------------------------------------------------
    [master f47abde] src/script.sh
     1 file changed, 1 insertion(+), 1 deletion(-)
    $ # checkout
    $ # Let's add a line to results/newresult.txt that was not saved by borg before
    $ echo "newline" >> results/newresult.txt
    $ # Let's try to make a checkout
    $ git co 2da3c535543fb9a216b52f29ecf598b6310c1223
    Note: switching to '2da3c535543fb9a216b52f29ecf598b6310c1223'.
    ...
    Switched to branch 'master'
    Your results folder contains unsaved changes!
    Please update your current commit with:  gbl commit --update
    $ # Nothing happens because we have unsaved changes. Let's save them
    $ gblk commit --update # this rewrites the archive named after the current commit id
    ------------------------------------------------------------------------------
    ...
    ------------------------------------------------------------------------------
    $ git co 2da3c535543fb9a216b52f29ecf598b6310c1223
    Note: switching to '2da3c535543fb9a216b52f29ecf598b6310c1223'.
    ...
    HEAD is now at 2da3c53 src/script.sh: initial commit
    $ gblk co --mode hard # hard mode deletes the files that were not present in the first commit. Otherwise, only the files that exist in the destination commit are updated.
    $ cat results/*
    File: results/result.txt
    result line
    $ git co master
    $ cat results/*
    File: results/newresult.txt
    newresult line
    newline
    
    File: results/result.txt
    result line

    mount command

    gblk has two commands, mount and umount, that respectively mount your borg archives into the .mount folder of the project directory and unmount them.

    ⚠️ Important note: While a borg archive is mounted, the borg repository is locked! If you forget to unmount it, running a borg command on the repository can fail with the following error message: Failed to create/acquire the lock ... To resolve it, just use gblk umount. gblk silently unmounts the borg archive before the commit, checkout, pre-co and mount commands.

    gblk mount usage:

    gblk-mount
    Mount an old file/directory from one or multiple archive named afert git commits into the .mount
    folder inside de project directory
    
    USAGE:
        gblk mount [OPTIONS]
    
    OPTIONS:
        -c, --commit <COMMIT>
                Commit name, sh: Glob is supported. This is an optional parameter: if not set then all
                commit archives will be mounted into the .mount directory
    
        -d, --diff
                Displays the differences between two files mounted corresponding to the given path.
    
                Note that if only one file is recovered then, the other is taken from the current result
                folder
    
                This option is deactivated when used with --diff
    
        -h, --help
                Print help information
    
        -l, --last <LAST>
                Consider last N archive after  other filter were applied
    
        -p, --path <PATH>
                The file/directory to extract. This is an optional parameter. If not set then all files
                in the archive will be displayed
    
        -v, --versions
                If set, displays the .mount directory in 'version view'.
    
                - Normal view: The `.mount` directory contains a subfolder with the name of archives.
    
                - Version view: The `.mount` directory contains the results folder and every file within
                it becomes a directory storing every version of that file

    Examples:

    $ gblk mount # mount all archives in `.borg` into `.mount` folder
    $ gblk mount -v # mount all archives in `.borg` into `.mount` folder in version view.
    $ # It is not necessary to execute "gblk umount" before the above command because it is run silently before "gblk mount"
    $ gblk mount -c '[ab]*' # mount all archives named after commits starting with 'a' or 'b'. Note that the quotes are necessary for the -c option
    $ gblk mount -c 'ae8rt77*' -p 'results/fichier.txt' # mounts all files matching 'results/fichier.txt' inside archives named after commits starting with 'ae8rt77'
    $ gblk mount -p 'results/**/*.txt' # mounts all txt files inside every archive
    $ gblk mount --last 2 # mount the last two archives named after the last commits
    $ gblk umount # unmount the archive mounted into `.mount` directory
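
Because a mounted archive keeps the borg repository locked, it can help to make sure gblk umount always runs once you are done. A minimal sketch: with_mount is a hypothetical helper, not a gblk feature, and it only relies on gblk mount and gblk umount as described above.

```shell
# Hypothetical helper (not part of gblk): mount the archives, run a
# command, then always unmount so the borg repository is not left locked.
# Usage: with_mount <command> [args...]
with_mount() {
    gblk mount || return 1
    rc=0
    "$@" || rc=$?      # remember the command's exit status
    gblk umount        # runs even if the command failed
    return "$rc"
}
```

For example, with_mount ls .mount lists the mounted archives and unmounts them right after, whether or not ls succeeds.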

    If we have two archives containing a file named file.txt and we want to directly compare them in a similar way to git diff, then we can enter:

    gblk mount -p 'results/file.txt' --diff

    The differences between the two files will be displayed with delta:

    Δ .mount/cfdc/results/lol.txt ⟶   .mount/4ec196bb/results/lol.txt
    ────────────────────────────────────────────────────────────────────────
    
    ─────┐
    • 1: │
    ─────┘
    │  1 │newblooup                                       │  1 │newblooup
    │  2 │dfbifbsi                                        │  2 │dfbifbsi
    │    │                                                │  3 │obeigbvisdb

    If only one file matches the path given with the --path argument of the mount command, gblk looks for a matching file in the current results folder and compares against it.

    To learn how to customize the delta display, see delta's documentation.

    Note that you can also display the differences between images. To be able to do so, ImageMagick must be installed.

    If you plan to make PDF diffs, you might want to edit the ImageMagick policy file /etc/ImageMagick-[VERSION]/policy.xml and replace the line:

    <policy domain="coder" rights="none" pattern="PDF" />

    by

    <policy domain="coder" rights="read | write" pattern="PDF" />

    Currently, the following formats are available for an image diff: PNG, JPEG, BMP, ICO, SVG, PDF

    Let's say we have an image named im1.png inside the last commit. To compare it with the image im1.png in the results folder, you can type:

    gblk mount --last 1 -p 'results/im1.png' --diff

    The diff image will be created inside the .tmp directory of the project folder.

    borgignore file

    It's possible to tell borg to ignore files in the results folder. To do that, you can create a .borgignore file at the root of the project directory.

    To ignore a given file named file.txt you can add to your .borgignore file the following content:

    - results/file.txt

    Note that:

    1. You have to put a results/ prefix in front of your files.
    2. To exclude a file, the line must begin with "- " (a dash followed by a space).

    To ignore every file named file.txt wherever it is, use the following syntax:

    - /**/file.txt

    You can also ignore files with a given extension inside a folder (named folder here) with:

    - results/folder/*.txt

    To ignore all files with a given extension, use:

    - /**/*.txt

    To rescue files from being ignored by another pattern, you can use a line beginning with '+ ' in the .borgignore file. For example, suppose we have a test folder inside the results folder containing the files a.txt, b.txt, ..., z.txt, and we want to ignore everything except the c.txt file. This can be done with:

    - results/test/*.txt
    + results/test/c.txt
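
Since a malformed line is easy to miss, a small check of the .borgignore file can be useful. The sketch below is an assumption-laden illustration, not a gblk feature: check_borgignore is a hypothetical name, and it only accepts the forms shown in this section ("- " or "+ " followed by a path starting with results/ or /**/). gblk itself may accept other syntax.

```shell
# Hypothetical linter (not part of gblk): reports lines of a .borgignore
# file that do not follow the forms shown in this section, i.e. "- " or
# "+ " followed by a path starting with "results/" or "/**/".
check_borgignore() {
    file="${1:-.borgignore}"
    bad=0
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            "") ;;                                 # skip empty lines
            "- results/"*|"+ results/"*) ;;        # path inside results
            "- /**/"*|"+ /**/"*) ;;                # match at any depth
            *) echo "suspicious line: $line" >&2; bad=1 ;;
        esac
    done < "$file"
    return "$bad"
}
```

Running check_borgignore at the root of the project prints every line it does not recognize and returns a non-zero status if there is at least one.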

    Delete command

    This command is a wrapper of the borg delete command. If you need information about the borg delete command, you can check borg's documentation.

    This command can be used to delete specific archives directly by their name, by a prefix or by a glob.

    Note that this command doesn't actually free disk space. You have to use gblk compact afterwards to achieve this.

    gblk delete usage

    To display the help of gblk delete, run the following command:

    gblk delete -h # -h for compact help, --help for a more exhaustive help

    To see which archives you are about to remove, enter:

    gblk delete --list --dry-run [OTHER_OPTIONS]
    • The --dry-run option keeps the archives unchanged
    • The --list option displays what was deleted (without the --dry-run option) or what would be deleted (with the --dry-run option)

    Prune command

    This command is a wrapper of the borg prune command. Check borg's documentation for more details.

    This command can be used to keep archives created during a given period of time and remove others.

    Note that this command doesn't actually free disk space. You have to use gblk compact afterwards to achieve this.

    gblk prune usage

    To display the help of gblk prune, run the following command:

    gblk prune -h # -h for compact help, --help for a more exhaustive help

    To see which archives you are about to remove, enter:

    gblk prune --list --dry-run [OTHER_OPTIONS]

    gblk compact

    This command frees .borg repository space.

    Use this command after deleting or pruning one or more archives: this is the step that actually frees repository space.

    To display the help of gblk compact, run the following command:

    gblk compact -h # -h for compact help, --help for a more exhaustive help

    To compact your .borg folder, you can run:

    gblk compact --verbose

    If a large part of the .borg folder needs compaction, this command may take a while. Consider using the --progress option in this case.
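
Since neither gblk delete nor gblk prune frees disk space on its own, a typical cleanup chains a dry run, the real prune and a compaction. The sketch below only uses options that appear in this document (--list, --dry-run, --verbose); the function name prune_and_compact and the confirmation prompt are hypothetical additions.

```shell
# Hypothetical helper (not part of gblk): preview a prune, apply it,
# then compact the repository to actually reclaim disk space.
# All arguments are passed unchanged to `gblk prune`.
prune_and_compact() {
    # 1. Show what would be removed without touching the repository.
    gblk prune --list --dry-run "$@" || return 1
    printf 'Apply this prune? [y/N] '
    read -r answer || answer=""
    case "$answer" in
        y|Y) ;;
        *) echo "aborted"; return 0 ;;
    esac
    # 2. Remove the archives, then 3. free the space they used.
    gblk prune --list "$@" && gblk compact --verbose
}
```

A call such as prune_and_compact --keep-within '7d' --keep-monthly 5 previews the prune, asks for confirmation, and only then prunes and compacts.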

    gblk config

    Sometimes, you only want to keep a small number of archives of your results folder to save some space. If you always want to keep all backups from the last week and one backup per month for 5 months, it can be tedious to always remember the prune command that does that:

    gblk prune --keep-within '7d' --keep-monthly 5 --dry-run

    You may want to put those settings in a local or in a global configuration file to always prune a given project or all your projects in the same way.

    gblk provides a way to do that by using the borg configuration file .borg/config as a local configuration file and the ~/.gblkconfig file as a global configuration file.

    Note: If both the local and global configuration files contain gblk settings used for pruning, only the local settings are used.

    Add new settings in configuration file

    To add or update settings in the local configuration file you can use the following command:

    gblk config add <KEY> <VALUE> [--global]

    Where KEY corresponds to a prune command argument and VALUE corresponds to the value to associate with the key. You can choose KEY from:

    • keep_within, keep_last, keep_minutely, keep_hourly, keep_daily
    • keep_weekly, keep_monthly, keep_yearly, prefix, glob_archives, save_space

    Check borg's documentation to know what those arguments do. You can also run the command gblk prune --help to see a description of those arguments.

    Note: the '-' and '_' characters are interchangeable in KEY (for example, keep-within and keep_within). gblk behaves the same with both.

    For example, to keep all backups from the last week and one backup per month for 5 months, you must run the following commands:

    gblk config add keep-within '7d'
    gblk config add keep-monthly 5

    You can use the --global flag to set those settings in the global configuration file.

    Display gblk settings used for pruning

    To display the current settings of the gblk local or global configuration file, you can use the following command:

    $ gblk config show # add --global to see the settings of the global configuration file
    keep_within = '7d'
    keep_monthly = 5

    Note: The '-' in keep-within is replaced by an underscore inside the configuration file.

    Remove gblk settings used for pruning

    To remove a setting previously defined in the local configuration file, you can enter the following command:

    gblk config rm <KEY> [--global]

    For example, let's remove the setting keep_within:

    $ gblk config show
    keep_within = '7d'
    keep_monthly = 5
    $ gblk config rm keep-within # 'gblk config rm keep_within' also works
    $ gblk config show
    keep_monthly = 5

    To do the same thing with the global configuration file just add --global at the end of those 3 commands.

    Pruning archives using gblk settings

    Finally, to prune your results archives using the settings defined in the global or local configuration, you can use the following command:

    gblk config prune [OPTIONS]

    Note: If both the local and global configuration files contain gblk settings used for pruning, only the local settings are used.

    You can see which options you can add to your command with:

    $ gblk config prune --help
    gblk-config-prune
    Prune using the project configuration
    
    USAGE:
        gblk config prune [OPTIONS]
    
    OPTIONS:
        -h, --help    Print help information
    
    Filtering options:
        -n, --dry-run    Do not change the repository
            --list       Output verbose list of archive
        -s, --stats      Print statistics for the deleted archive
            --force      Force deletion of corrupted archives, use `--force --force` in case `--force`
                         does not work
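
The configuration commands of this section can be combined into one setup step. This is a sketch under stated assumptions: configure_retention is a hypothetical function name, and it only calls gblk config add, gblk config show and gblk config prune with the --dry-run and --list options listed in the help above.

```shell
# Hypothetical helper (not part of gblk): store a retention policy in the
# local configuration, display it, then preview the resulting prune.
configure_retention() {
    within="$1"    # e.g. 7d
    monthly="$2"   # e.g. 5
    gblk config add keep-within "$within" || return 1
    gblk config add keep-monthly "$monthly" || return 1
    gblk config show
    # Preview only: --dry-run leaves the repository unchanged.
    gblk config prune --dry-run --list
}
```

For example, configure_retention 7d 5 stores the policy used throughout this section. Once the preview looks right, gblk config prune followed by gblk compact applies it and frees the space.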