Commit History
Message Author SHA1 Date
Cleaned code, added comments, improved metrics   Wils McCreight 4 years ago
Now trains model on all types of data at once   Wils McCreight 4 years ago
Added a histogram   Wils McCreight 4 years ago
Compute basic stats and conf intervals   Wils McCreight 4 years ago
Now calculate entropy of individual files and included some example cells   Wils McCreight 4 years ago
Cleaned and refactored data   Wils McCreight 4 years ago
Rough code that calculates the entropy of the entire LIBest requirements corpus   Wils McCreight 4 years ago
Use sp model to get frequency of encoded tokens   Wils McCreight 4 years ago
Model creation with sentencepiece   Wils McCreight 4 years ago
Added preliminary dit usage   Wils McCreight 4 years ago
Created using Colaboratory   Wils McCreight 4 years ago
Created using Colaboratory   Wils McCreight 4 years ago
Created using Colaboratory   Wils McCreight 4 years ago
Update docs   Nathan Cooper 4 years ago
Fix bug where benchmark traceability would overwrite data repr docs   Nathan Cooper 4 years ago
Update requirements   Nathan Cooper 4 years ago
Update docs   Nathan Cooper 4 years ago
Update python module   Nathan Cooper 4 years ago
Add code for training and using hugging face tokenizers and for training hugging face robertaish language model for code vectorization   Nathan Cooper 4 years ago
Add some test data for training BERTish model   Nathan Cooper 4 years ago
Update library modules   Nathan Cooper 4 years ago
Update project lvl readme   Nathan Cooper 4 years ago
Update library modules   Nathan Cooper 4 years ago
Updated docs   Nathan Cooper 4 years ago
Fix bug where traceability notebook was not exporting to correct module   Nathan Cooper 4 years ago
Update settings to be more consistent with project and updated start.sh to use unused port   Nathan Cooper 4 years ago
Update requirements   Nathan Cooper 4 years ago
Reorganize structure of project to be built entirely from notebooks including blogs, templates, and benchmarks   Nathan Cooper 4 years ago
Remaining components: blogs, benchmarking, data organization, and exploration   David A Nader Palacio 4 years ago
Data representation components   David A Nader Palacio 4 years ago