Jinen Setpal 338ef81af4
added replication code for demonstrating transformer models implicitly reconstructing mess3's belief state geometry
3 months ago
5d2b4807bf
adding a lightweight configurator that may be a terrible mistake lol. also adding configs to evaluate the baseline GPT2 versions released by OpenAI on OWT. we have some ways to go to match those numbers atm
2 years ago
5d2b4807bf
adding a lightweight configurator that may be a terrible mistake lol. also adding configs to evaluate the baseline GPT2 versions released by OpenAI on OWT. we have some ways to go to match those numbers atm
2 years ago
5d2b4807bf
adding a lightweight configurator that may be a terrible mistake lol. also adding configs to evaluate the baseline GPT2 versions released by OpenAI on OWT. we have some ways to go to match those numbers atm
2 years ago
5d2b4807bf
adding a lightweight configurator that may be a terrible mistake lol. also adding configs to evaluate the baseline GPT2 versions released by OpenAI on OWT. we have some ways to go to match those numbers atm
2 years ago
fce706cbe6
tune the hyperparams a bit, in configs
2 years ago
978d4fe538
Fix for gradient_accumulation_steps training slow
2 years ago
338ef81af4
added replication code for demonstrating transformer models implicitly reconstructing mess3's belief state geometry
3 months ago
978d4fe538
Fix for gradient_accumulation_steps training slow
2 years ago
