Hlib caad3aaacf remove "test" remote 10 months ago
.dvc caad3aaacf remove "test" remote 10 months ago
data 6851ce644c add stage for zipping devanbu small corpus 10 months ago
params 71e95fd455 improvements to pre-processing stage: track also the resulting vocab; use a separate venv to run codeprep; extract codeprep version with yq 11 months ago
pipeline e5c675a4f6 add allamanis corpus extraction to pipeline 10 months ago
script-data 47542df864 add stage for extracting devanbu small corpus 11 months ago
scripts 34e81dc3d2 Merge branch 'master' of https://github.com/giganticode/datasets 11 months ago
LICENSE 8667f6c94a Initial commit 11 months ago
devanbu-small-corpus-metadata.dvc 0e666e309a add stage for computing the stats for devanbu small corpus 11 months ago
dvc.lock 6b7b7d36b3 zipping devanbu small corpus: include train, valid, test, demo folders directly to the root of the zip (without parent folders) 10 months ago
dvc.yaml 6b7b7d36b3 zipping devanbu small corpus: include train, valid, test, demo folders directly to the root of the zip (without parent folders) 10 months ago
requirements.txt 71e95fd455 improvements to pre-processing stage: track also the resulting vocab; use a separate venv to run codeprep; extract codeprep version with yq 11 months ago

Data Pipeline

Legend: DVC Managed File, Git Managed File, Metric, Stage File, External File