Commit History
Message   Author   Date
Some metrics are saved, and better prepared training code for mid-epoch checkpoints   Tolstoyevsky 5 years ago
Almost ready to output some training metrics   Tolstoyevsky 5 years ago
tmp - work on saving all losses   Tolstoyevsky 5 years ago
Added a script to actually use the hyperparams saved in the yaml file   Tolstoyevsky 5 years ago
Successfully overfit on a single batch with deep universal transformer   Tolstoyevsky 5 years ago
Checkpoint   Tolstoyevsky 5 years ago
Merge branch 'dvc-train-single-batch' into universal-decoder-single-batch   Tolstoyevsky 5 years ago
Universal decoder seems ready (still need to fix minor architecture mismatches in the layers, mainly dropout positions)   Tolstoyevsky 5 years ago
Partially successful attempt to overfit a small sample   Tolstoyevsky 5 years ago
Print the reason why training stops   Tolstoyevsky 5 years ago
Made training set the same as validation set   Tolstoyevsky 5 years ago
Merge branch 'dvc' into dvc-train-single-batch   Tolstoyevsky 5 years ago
Fixed checkpoint output path in train.sh   Tolstoyevsky 5 years ago
Commit wmt14_en_de_token.dvc   Tolstoyevsky 5 years ago
Merge branch 'dvc' into dvc-train-single-batch   Tolstoyevsky 5 years ago
Fixed reference to resume checkpoint   Tolstoyevsky 5 years ago
Skipped the unzipping stage completely   Tolstoyevsky 5 years ago
Created a training and validation set that should fit in a single batch, to try to overfit it to validate the model is working   Tolstoyevsky 5 years ago
Gave better names to the tokenization stage, and moved the prep command to a script   Tolstoyevsky 5 years ago
Fix path names in resume-checkpoint.dvc   Tolstoyevsky 5 years ago
Removed the unzipped files cache   Tolstoyevsky 5 years ago
Refactored the checkpoint moving stage   Tolstoyevsky 5 years ago
Training iteration   Guy 5 years ago
Added training step stub   Tolstoyevsky 5 years ago
Ran data prep dvc stage   Guy 5 years ago
Change logo size   Dean 5 years ago
Calculated BPE using DVC   Guy 5 years ago
Added missing commoncrawl dependency   Tolstoyevsky 5 years ago
Refactor data preparation a bit   Tolstoyevsky 5 years ago
Universal encoder seems ready   Tolstoyevsky 5 years ago