dvc.yaml 971 B

artifacts:
  stackoverflow-dataset:
    path: data/data.xml
    type: dataset
    desc: Initial XML StackOverflow dataset (raw data)
  text-classification:
    path: model.pkl
    desc: Detect whether the given stackoverflow question should have R language tag
    type: model
    labels:
    - nlp
    - classification
    - stackoverflow
stages:
  prepare:
    cmd: python src/prepare.py data/data.xml
    deps:
    - data/data.xml
    - src/prepare.py
    params:
    - prepare.seed
    - prepare.split
    outs:
    - data/prepared
  featurize:
    cmd: python src/featurization.py data/prepared data/features
    deps:
    - data/prepared
    - src/featurization.py
    params:
    - featurize.max_features
    - featurize.ngrams
    outs:
    - data/features
  train:
    cmd: python src/train.py data/features model.pkl
    deps:
    - data/features
    - src/train.py
    params:
    - train.min_split
    - train.n_est
    - train.seed
    outs:
    - model.pkl
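
The three stages above are chained through their deps and outs: each stage consumes a path that the previous stage produces. As a rough illustration of how that ordering can be read off the file, the sketch below copies the deps and outs lists by hand and sorts the stages topologically. The `STAGES` dictionary and the `stage_order` helper are assumptions made for this example, not part of DVC, and the exact path matching used here is a simplification of whatever DVC does internally.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# deps and outs copied by hand from the stages in dvc.yaml above
# (hypothetical structure, used only for this illustration)
STAGES = {
    "prepare": {
        "deps": ["data/data.xml", "src/prepare.py"],
        "outs": ["data/prepared"],
    },
    "featurize": {
        "deps": ["data/prepared", "src/featurization.py"],
        "outs": ["data/features"],
    },
    "train": {
        "deps": ["data/features", "src/train.py"],
        "outs": ["model.pkl"],
    },
}


def stage_order(stages):
    """Return stage names so that each stage comes after the stages producing its deps."""
    # Map every output path to the stage that writes it.
    producer = {out: name for name, spec in stages.items() for out in spec["outs"]}
    # A stage depends on another stage when one of its deps is that stage's output.
    # Deps with no producer (raw data, source files) add no edges.
    graph = {
        name: {producer[dep] for dep in spec["deps"] if dep in producer}
        for name, spec in stages.items()
    }
    return list(TopologicalSorter(graph).static_order())


print(stage_order(STAGES))  # ['prepare', 'featurize', 'train']
```

Because the stages form a single chain, the order is unambiguous: prepare writes data/prepared, featurize turns it into data/features, and train turns that into model.pkl.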