hlib 34e81dc3d2 Merge branch 'master' of https://github.com/giganticode/datasets 1 week ago
.dvc 7703a7a308 add google drive remote 2 weeks ago
data 0e666e309a add stage for computing the stats for devanbu small corpus 1 week ago
params 71e95fd455 improvements to pre-processing stage: track also the resulting vocab; use a separate venv to run codeprep; extract codeprep version with yq 1 week ago
pipeline 71e95fd455 improvements to pre-processing stage: track also the resulting vocab; use a separate venv to run codeprep; extract codeprep version with yq 1 week ago
script-data 47542df864 add stage for extracting devanbu small corpus 1 week ago
scripts 34e81dc3d2 Merge branch 'master' of https://github.com/giganticode/datasets 1 week ago
LICENSE 8667f6c94a Initial commit 2 weeks ago
devanbu-small-corpus-metadata.dvc 0e666e309a add stage for computing the stats for devanbu small corpus 1 week ago
requirements.txt 71e95fd455 improvements to pre-processing stage: track also the resulting vocab; use a separate venv to run codeprep; extract codeprep version with yq 1 week ago

Data Pipeline

[Figure: data pipeline graph. Legend: DVC-managed file, Git-managed file, metric, stage file, external file.]