From 36db9d37a2efc6d90216a0c6b01a1ba84b9c8017 Mon Sep 17 00:00:00 2001
From: Ghislain Durif <gd.dev@libertymail.net>
Date: Fri, 8 Sep 2023 11:35:32 +0200
Subject: [PATCH] minor modifications and typo fixes in DVC tuto

---
 README.md | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/README.md b/README.md
index 9dcf2e0..968aac5 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,7 @@
 # testing Data Version Control (DVC)
 
+Doc: https://dvc.org/doc
+
 ## Install
 
 ```bash
@@ -45,7 +47,7 @@ To enable auto staging, run:
 dvc config core.autostage true
 ```
 
-5. Record reference to data in git (but not datafile):
+5. Record reference to data in git (but not the data file itself):
 ```bash
 git add data/datafile.dat.dvc data/.gitignore
 git commit -m "data generation v1"
@@ -108,7 +110,7 @@ git commit -m "new version of generated data"
 dvc remote add -d local ../dvc-testing-remote
 ```
 
-Note: `-d` option is to set the new remote as the default one.
+**Note**: `-d` option is to set the new remote as the default one.
 
 9. Push to local DVC remote:
 ```bash
@@ -130,7 +132,7 @@ outs:
   path: datafile.dat
 ```
 
-Note: we can verify that the hash stored in the `data/datafile.dat.dvc` file corresponds to the actual `data/datafile.dat` file:
+**Note**: we can verify that the hash stored in the `data/datafile.dat.dvc` file corresponds to the actual `data/datafile.dat` file.
 ```bash
 md5sum data/datafile.dat
 ```
@@ -172,7 +174,7 @@ outs:
   path: datafile.dat
 ```
 
-Note: at this point, the datafile version does not correspond (discrepancy between the hash stored in the `data/datafile.dat.dvc` file and the actual `data/datafile.dat` file:
+**Note**: at this point, the actual data file version does not correspond to the version for this specific git commit (discrepancy between the hash stored in the `data/datafile.dat.dvc` file and the actual `data/datafile.dat` file.
 ```bash
 md5sum data/datafile.dat
 ```
@@ -181,25 +183,28 @@ md5sum data/datafile.dat
 a0c027223a771d1bb1519e5e5aaaf82c data/datafile.dat
 ```
 
-
+3. Switch to corresponding version of data file with DVC:
 ```bash
-md5sum data/datafile.dat
+dvc checkout
 ```
 
 ```
-a0c027223a771d1bb1519e5e5aaaf82c data/datafile.dat
+M data/datafile.dat
 ```
 
-3. Switch to corresponding version of data file:
+4. Verify the version of the data file now corresponds to the git commit:
 ```bash
-dvc checkout
+cat data/datafile.dat.dvc
 ```
 
 ```
-M data/datafile.dat
+outs:
+- md5: 8c0a82ed58e6152f9b134ba8d272dd42
+  size: 102400000
+  hash: md5
+  path: datafile.dat
 ```
 
-4. Verify that version of data file:
 ```bash
 md5sum data/datafile.dat
 ```
@@ -208,6 +213,11 @@ md5sum data/datafile.dat
 8c0a82ed58e6152f9b134ba8d272dd42 data/datafile.dat
 ```
 
+5. Switch back to last git commit:
+```bash
+git switch main
+dvc checkout
+```
 
 ## SSH remote
 
@@ -268,6 +278,5 @@ Disable [analytics](https://dvc.org/doc/user-guide/analytics) reporting (locally
 dvc config core.analytics false
 ```
 
-Note: add the `--global` option for global configuration as with `git`.
-```
+**Note**: add the `--global` option for global configuration as with `git`.
 
-- 
GitLab