dvc-scan.jsonnet 2.1 KB

local bd = import '../lib.jsonnet';

// Scan stages for the Goodreads data files. Every stage depends on the
// goodreads CLI and library sources plus its own gzipped JSON input.
{
  'scan-book-info': {
    cmd: bd.cmd('goodreads scan books ../data/goodreads/goodreads_books.json.gz'),
    deps: [
      '../src/cli/goodreads',
      '../src/goodreads',
      '../data/goodreads/goodreads_books.json.gz',
    ],
    outs: [
      'gr-book-ids.parquet',
      'gr-book-info.parquet',
      'gr-book-authors.parquet',
      'gr-book-series.parquet',
    ],
  },
  'scan-work-info': {
    cmd: bd.cmd('goodreads scan works ../data/goodreads/goodreads_book_works.json.gz'),
    deps: [
      '../src/cli/goodreads',
      '../src/goodreads',
      '../data/goodreads/goodreads_book_works.json.gz',
    ],
    outs: [
      'gr-work-info.parquet',
    ],
  },
  'scan-book-genres': {
    cmd: bd.cmd('goodreads scan genres ../data/goodreads/goodreads_book_genres_initial.json.gz'),
    deps: [
      '../src/cli/goodreads',
      '../src/goodreads',
      '../data/goodreads/goodreads_book_genres_initial.json.gz',
    ],
    outs: [
      'gr-book-genres.parquet',
      'gr-genres.parquet',
    ],
  },
  'scan-author-info': {
    cmd: bd.cmd('goodreads scan authors ../data/goodreads/goodreads_book_authors.json.gz'),
    deps: [
      '../src/cli/goodreads',
      '../src/goodreads',
      '../data/goodreads/goodreads_book_authors.json.gz',
    ],
    outs: [
      'gr-author-info.parquet',
    ],
  },
  'scan-interactions': {
    cmd: bd.cmd('goodreads scan interactions ../data/goodreads/goodreads_interactions.json.gz'),
    deps: [
      '../src/cli/goodreads',
      '../src/goodreads',
      '../data/goodreads/goodreads_interactions.json.gz',
    ],
    outs: [
      'gr-interactions.parquet',
      'gr-users.parquet',
    ],
  },
// The reviews stage is added only when bd.config.goodreads.reviews is true.
} + if bd.config.goodreads.reviews then {
  'scan-reviews': {
    cmd: bd.cmd('goodreads scan reviews ../data/goodreads/goodreads_reviews_dedup.json.gz'),
    deps: [
      '../src/cli/goodreads',
      '../src/goodreads',
      '../data/goodreads/goodreads_reviews_dedup.json.gz',
      'gr-book-link.parquet',
      'gr-users.parquet',
    ],
    outs: [
      'gr-reviews.parquet',
    ],
  },
} else {}