
dvc.yaml 1.8 KB

stages:
  download_data:
    desc: Download data from Kaggle
    cmd: python3 src/data/download.py -tr train.csv -te test.csv -o ./data/raw
    deps:
    - src/data/download.py
    params:
    - competition
    outs:
    - data/raw/test.csv
    - data/raw/train.csv
  prepare_images:
    desc: Create 2D images from flattened images in numpy array
    cmd: python3 src/data/prepare_img.py -tr data/raw/train.csv -te data/raw/test.csv
      -o ./data/processed/
    deps:
    - data/raw/test.csv
    - data/raw/train.csv
    - src/data/prepare_img.py
    - src/img/transforms.py
    outs:
    - data/processed/test/
    - data/processed/test_mapfile.csv
    - data/processed/test_mean_image.png
    - data/processed/train/
    - data/processed/train_mapfile.csv
    - data/processed/train_mean_image.png
  split_train_dev:
    desc: Split training data into the train and dev sets using stratified K-fold
      cross validation.
    cmd: python3 src/data/split_train_dev.py -tr data/processed/train_mapfile.csv
      -o data/processed/
    deps:
    - data/processed/train_mapfile.csv
    - src/data/split_train_dev.py
    params:
    - random_seed
    - train_test_split
    outs:
    - data/processed/split_train_dev.csv
  train_model:
    desc: Train the specified classifier using the pre-allocated stratified K-fold
      cross validation splits and the current params.yaml settings.
    cmd: venv/bin/python3 src/models/train_model.py -mf data/processed/train_mapfile.csv
      -cv data/processed/split_train_dev.csv
    deps:
    - data/processed/split_train_dev.csv
    - data/processed/train_mapfile.csv
    - src/models/train_model.py
    params:
    - classifier
    - model_params
    - random_seed
    plots:
    - reports/figures/logs.csv
    metrics:
    - results/metrics.json