universal_transformer_hyperparams.yaml 587 B

```yaml
# To see possible and default values, look at the various model architectures at the bottom of transformer.py
arch: transformer_universal
dropout: 0.1
attention-dropout: 0.0
encoder-embed-dim: 512
encoder-ffn-embed-dim: 2048
encoder-layers: 1
encoder-layer-recurrence: 6
encoder-attention-heads: 8
decoder-embed-dim: 512
decoder-ffn-embed-dim: 2048
decoder-layers: 1
decoder-layer-recurrence: 6
decoder-attention-heads: 8
relu-dropout: 0.0
adaptive-softmax-dropout: 0
optimizer: sgd
lr: 0.0001
lr-schedule: fixed
lr-shrink: 0.9
max-tokens: 500
min-lr: 0.0
update-freq: 4
max-update: 630
```
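Two derived quantities are worth noting when reading this file: with `encoder-layers: 1` and `encoder-layer-recurrence: 6`, the Universal Transformer applies one shared encoder layer six times (an effective depth of 6), and with `max-tokens: 500` and `update-freq: 4`, each optimizer step accumulates gradients over roughly 2000 tokens. The sketch below shows this with a minimal, stdlib-only reader for flat `key: value` files like this one (a real project would more likely use PyYAML's `yaml.safe_load`; `read_params` and the inlined `sample` are illustrative, not part of the original file).

```python
# Minimal stdlib-only reader for flat "key: value" files like this one
# (a sketch; a real project would typically use PyYAML's yaml.safe_load).
def read_params(text):
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        # Coerce numeric-looking values to int/float, leave the rest as strings.
        try:
            value = int(value)
        except ValueError:
            try:
                value = float(value)
            except ValueError:
                pass
        params[key.strip()] = value
    return params


# A few lines from universal_transformer_hyperparams.yaml, inlined for the sketch.
sample = """\
arch: transformer_universal
encoder-layers: 1
encoder-layer-recurrence: 6
max-tokens: 500
update-freq: 4
"""
params = read_params(sample)

# Effective encoder depth: one shared layer applied recurrently.
print(params["encoder-layers"] * params["encoder-layer-recurrence"])  # 6
# Tokens per optimizer step: max-tokens per batch x gradient accumulation.
print(params["max-tokens"] * params["update-freq"])  # 2000
```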