Hlib caad3aaacf remove "test" remote 4 months ago
.dvc caad3aaacf remove "test" remote 4 months ago
data 6851ce644c add stage for zipping devanbu small corpus 4 months ago
params 71e95fd455 improvements to pre-processing stage: track also the resulting vocab; use a separate venv to run codeprep; extract codeprep version with yq 5 months ago
pipeline e5c675a4f6 add allamanis corpus extraction to pipeline 4 months ago
script-data 47542df864 add stage for extracting devanbu small corpus 5 months ago
scripts 34e81dc3d2 Merge branch 'master' of https://github.com/giganticode/datasets 5 months ago
LICENSE 8667f6c94a Initial commit 5 months ago
devanbu-small-corpus-metadata.dvc 0e666e309a add stage for computing the stats for devanbu small corpus 5 months ago
dvc.lock 6b7b7d36b3 zipping devanbu small corpus: include train, valid, test, demo folders directly to the root of the zip (without parent folders) 4 months ago
dvc.yaml 6b7b7d36b3 zipping devanbu small corpus: include train, valid, test, demo folders directly to the root of the zip (without parent folders) 4 months ago
requirements.txt 71e95fd455 improvements to pre-processing stage: track also the resulting vocab; use a separate venv to run codeprep; extract codeprep version with yq 5 months ago

Data Pipeline

Legend: DVC Managed File, Git Managed File, Metric, Stage File, External File