Commit 36db9d37 authored by Ghislain Durif

minor modifications and typo fixes in DVC tuto

parent 7f994cae
Branches main
# testing Data Version Control (DVC) # testing Data Version Control (DVC)
Doc: https://dvc.org/doc
## Install ## Install
```bash ```bash
...@@ -45,7 +47,7 @@ To enable auto staging, run: ...@@ -45,7 +47,7 @@ To enable auto staging, run:
dvc config core.autostage true dvc config core.autostage true
``` ```
5. Record reference to data in git (but not datafile): 5. Record reference to data in git (but not the data file itself):
```bash ```bash
git add data/datafile.dat.dvc data/.gitignore git add data/datafile.dat.dvc data/.gitignore
git commit -m "data generation v1" git commit -m "data generation v1"
...@@ -108,7 +110,7 @@ git commit -m "new version of generated data" ...@@ -108,7 +110,7 @@ git commit -m "new version of generated data"
dvc remote add -d local ../dvc-testing-remote dvc remote add -d local ../dvc-testing-remote
``` ```
Note: `-d` option is to set the new remote as the default one. **Note**: `-d` option is to set the new remote as the default one.
9. Push to local DVC remote: 9. Push to local DVC remote:
```bash ```bash
...@@ -130,7 +132,7 @@ outs: ...@@ -130,7 +132,7 @@ outs:
path: datafile.dat path: datafile.dat
``` ```
Note: we can verify that the hash stored in the `data/datafile.dat.dvc` file corresponds to the actual `data/datafile.dat` file: **Note**: we can verify that the hash stored in the `data/datafile.dat.dvc` file corresponds to the actual `data/datafile.dat` file.
```bash ```bash
md5sum data/datafile.dat md5sum data/datafile.dat
``` ```
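The hash comparison described in the note can also be scripted. The following is a minimal sketch, not part of the tutorial: the function name `check_dvc_hash` is hypothetical, and it assumes a single-output `.dvc` file whose hash appears on a line of the form `- md5: <hash>`, as in the listings shown in this tutorial.

```bash
# Hypothetical helper: compare the md5 recorded in a .dvc file
# with the md5 of the data file it tracks.
# Assumption: the .dvc file has one output, with a "- md5: <hash>" line.
check_dvc_hash() {
    dvc_file="$1"
    data_file="$2"
    # Extract the hash from the first "- md5: <hash>" line.
    recorded=$(awk '$1 == "-" && $2 == "md5:" { print $3; exit }' "$dvc_file")
    # Compute the hash of the file currently in the workspace.
    actual=$(md5sum "$data_file" | awk '{ print $1 }')
    if [ "$recorded" = "$actual" ]; then
        echo "match: $actual"
    else
        echo "mismatch: recorded=$recorded actual=$actual"
        return 1
    fi
}
```

For example, `check_dvc_hash data/datafile.dat.dvc data/datafile.dat` would report a match when the workspace file corresponds to the version recorded for the current git commit, and return a non-zero status otherwise.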
...@@ -172,7 +174,7 @@ outs: ...@@ -172,7 +174,7 @@ outs:
path: datafile.dat path: datafile.dat
``` ```
Note: at this point, the datafile version does not correspond (discrepancy between the hash stored in the `data/datafile.dat.dvc` file and the actual `data/datafile.dat` file: **Note**: at this point, the actual data file version does not correspond to the version for this specific git commit (discrepancy between the hash stored in the `data/datafile.dat.dvc` file and the actual `data/datafile.dat` file.
```bash ```bash
md5sum data/datafile.dat md5sum data/datafile.dat
``` ```
...@@ -181,25 +183,28 @@ md5sum data/datafile.dat ...@@ -181,25 +183,28 @@ md5sum data/datafile.dat
a0c027223a771d1bb1519e5e5aaaf82c data/datafile.dat a0c027223a771d1bb1519e5e5aaaf82c data/datafile.dat
``` ```
3. Switch to corresponding version of data file with DVC:
```bash ```bash
md5sum data/datafile.dat dvc checkout
``` ```
``` ```
a0c027223a771d1bb1519e5e5aaaf82c data/datafile.dat M data/datafile.dat
``` ```
3. Switch to corresponding version of data file: 4. Verify the version of the data file now corresponds to the git commit:
```bash ```bash
dvc checkout cat data/datafile.dat.dvc
``` ```
``` ```
M data/datafile.dat outs:
- md5: 8c0a82ed58e6152f9b134ba8d272dd42
size: 102400000
hash: md5
path: datafile.dat
``` ```
4. Verify that version of data file:
```bash ```bash
md5sum data/datafile.dat md5sum data/datafile.dat
``` ```
...@@ -208,6 +213,11 @@ md5sum data/datafile.dat ...@@ -208,6 +213,11 @@ md5sum data/datafile.dat
8c0a82ed58e6152f9b134ba8d272dd42 data/datafile.dat 8c0a82ed58e6152f9b134ba8d272dd42 data/datafile.dat
``` ```
5. Switch back to last git commit:
```bash
git switch main
dvc checkout
```
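Since every change of git branch in the steps above is followed by `dvc checkout`, the two commands can be wrapped together. This is only a sketch under assumptions: the function name `switch_data_version` is hypothetical and not part of DVC or of this tutorial, and it expects a branch name because `git switch` needs `--detach` to move to a bare commit.

```bash
# Hypothetical helper: switch the git branch, then sync DVC-tracked files.
switch_data_version() {
    branch="$1"
    if [ -z "$branch" ]; then
        echo "usage: switch_data_version <branch>" >&2
        return 2
    fi
    # Move the git working tree (including the .dvc pointer files)...
    git switch "$branch" || return 1
    # ...then let DVC restore the data files those pointers reference.
    dvc checkout
}
```

For example, `switch_data_version main` would replace the two commands of step 5. As an alternative, `dvc install` sets up git hooks, including a post-checkout hook, so that the data checkout happens automatically after a git checkout.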
## SSH remote ## SSH remote
...@@ -268,6 +278,5 @@ Disable [analytics](https://dvc.org/doc/user-guide/analytics) reporting (locally ...@@ -268,6 +278,5 @@ Disable [analytics](https://dvc.org/doc/user-guide/analytics) reporting (locally
dvc config core.analytics false dvc config core.analytics false
``` ```
Note: add the `--global` option for global configuration as with `git`. **Note**: add the `--global` option for global configuration as with `git`.
```