DVC example mirror for DATA 607

jmhsi 50c931bb7e Update README 4 weeks ago
.dvc 0d7b78ca16 initial commit 1 month ago
run 23f1acffbc 100 data coefs 4,4,4,7 1 month ago
.gitignore cf36d2c02d made run files for dvc run 1 month ago
README.md 50c931bb7e Update README 4 weeks ago
check_coeffs.py 91551f6c1a made check_coeffs 1 month ago
clean_data.py 0d7b78ca16 initial commit 1 month ago
coefs.json 23f1acffbc 100 data coefs 4,4,4,7 1 month ago
data.csv.dvc 23f1acffbc 100 data coefs 4,4,4,7 1 month ago
gen_data.py 91551f6c1a made check_coeffs 1 month ago
mse.json 23f1acffbc 100 data coefs 4,4,4,7 1 month ago
n_data.json 23f1acffbc 100 data coefs 4,4,4,7 1 month ago
train.py a964255f65 added n_data metric 1 month ago
writing_scripts.ipynb a964255f65 added n_data metric 1 month ago

README.md

DVC_example

DVC example for Data Science in Context DATA 607

Make some data

python gen_data.py -d <n_datapoints> -c <c1,c2,c3,c4> (c1-c4 are the coefficients that the linear regressor should find)
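As a rough illustration of what such a generator could look like, here is a minimal sketch using only the standard library. It is not the contents of gen_data.py: the function names, the column names (x1..x4, y), the uniform feature range, the noise level, and the default values are all assumptions made for this example.

```python
import argparse
import csv
import random


def parse_args(argv=None):
    """Parse the -d / -c options described above (defaults are assumptions)."""
    parser = argparse.ArgumentParser(description="Generate synthetic linear data")
    parser.add_argument("-d", dest="n_datapoints", type=int, default=100,
                        help="number of datapoints to generate")
    parser.add_argument("-c", dest="coefs", type=str, default="4,4,4,7",
                        help="comma-separated coefficients c1,c2,c3,c4")
    return parser.parse_args(argv)


def make_rows(n_datapoints, coefs, noise_sd=0.1, seed=0):
    """Return rows [x1, ..., xk, y] with y = sum(c_i * x_i) + Gaussian noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_datapoints):
        x = [rng.uniform(-1.0, 1.0) for _ in coefs]
        y = sum(c * xi for c, xi in zip(coefs, x)) + rng.gauss(0.0, noise_sd)
        rows.append(x + [y])
    return rows


def write_csv(path, rows, n_features):
    """Write the rows to a CSV file with a header line (hypothetical layout)."""
    header = [f"x{i + 1}" for i in range(n_features)] + ["y"]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def main(argv=None, out_path="data.csv"):
    """Parse options, generate the data, and write it to out_path."""
    args = parse_args(argv)
    coefs = [float(c) for c in args.coefs.split(",")]
    write_csv(out_path, make_rows(args.n_datapoints, coefs), len(coefs))
```

Calling main(["-d", "100", "-c", "4,4,4,7"]) would write 100 rows to data.csv. A regressor fitted on that file should then recover coefficients close to 4, 4, 4, 7.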

Example use cases

git checkout <branch> data.csv.dvc (check out the DVC pointer from a specific branch/point in time)

dvc checkout (DVC sees the pointer has changed and pulls in the right version of data.csv from the DVC cache)

dvc repro run/check_coeffs.dvc (DVC reproduces the pipeline up to the stage specified)
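The three commands above always run in the same order, so they can be wrapped in a small helper. The sketch below is one possible way to do that from Python. The function name restore_and_repro and its run parameter are hypothetical. It only shells out to the git and dvc commands already shown, so both tools need to be on the PATH.

```python
import subprocess


def restore_and_repro(branch, stage_file="run/check_coeffs.dvc", run=subprocess.run):
    """Restore data.csv as of `branch`, then reproduce the pipeline up to `stage_file`.

    `run` is injectable so the sequence can be inspected without executing anything.
    """
    commands = [
        # 1. Take the DVC pointer file from the chosen branch.
        ["git", "checkout", branch, "data.csv.dvc"],
        # 2. Let DVC sync data.csv in the workspace with that pointer.
        ["dvc", "checkout"],
        # 3. Re-run whatever stages are out of date, up to the given stage file.
        ["dvc", "repro", stage_file],
    ]
    for cmd in commands:
        # check=True stops the sequence as soon as one command fails.
        run(cmd, check=True)
    return commands
```

Because check=True is set, a failed git checkout (for example, a misspelled branch name) raises an exception before DVC touches the workspace.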

If you need to alter stages of the pipeline

The run/<stage-definition-file>.sh files are where you define the stages, their dependencies, and their outputs

See the DVC documentation for more info

make changes to run/<stage-definition-file>.sh files

bash run/<stage-definition-file>.sh
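To make the stage, dependency, and output idea concrete, the sketch below assembles the kind of dvc run command line that a stage-definition script might contain. It assumes a DVC version that still provides dvc run with the -f, -d, -o, and -M options and per-stage .dvc files. The helper dvc_run_command, the stage file run/train.dvc, and the choice of dependencies and metrics are illustrative guesses, not the actual contents of the run/ scripts.

```python
import shlex


def dvc_run_command(stage_file, deps, outs, metrics, command):
    """Build the argument list for a `dvc run` call that defines one stage."""
    args = ["dvc", "run", "-f", stage_file]   # -f: where to write the stage file
    for dep in deps:
        args += ["-d", dep]                   # -d: a file the stage depends on
    for out in outs:
        args += ["-o", out]                   # -o: an output cached by DVC
    for metric in metrics:
        args += ["-M", metric]                # -M: a metric file kept out of the cache
    args += shlex.split(command)              # the command the stage executes
    return args


# Hypothetical training stage; the real dependencies and outputs may differ.
train_stage = dvc_run_command(
    stage_file="run/train.dvc",
    deps=["train.py", "data.csv"],
    outs=[],
    metrics=["mse.json", "n_data.json"],
    command="python train.py",
)
print(" ".join(shlex.quote(a) for a in train_stage))
# prints: dvc run -f run/train.dvc -d train.py -d data.csv -M mse.json -M n_data.json python train.py
```

Changing a dependency or output then amounts to editing the corresponding line in the script and running it again with bash, as described above.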