#211 SG-136: Apply ema only on student (KD)

Merged
Louis Dupont merged 1 commit into Deci-AI:master from deci-ai:feature/SG-136_use_ema_only_on_kd_student
# Efficientnet-B0 Imagenet training
# This example trains with effective batch size = 64 * 4 gpus = 256.
# Epoch time on 4 X 3090Ti distributed training is ~ 16:25 minutes
# Logs and tensorboards: s3://deci-pretrained-models/efficientnet_b0/
# Instructions:
#   Set the PYTHONPATH environment variable: (Replace "YOUR_LOCAL_PATH" with the path to the downloaded repo):
#   export PYTHONPATH="YOUR_LOCAL_PATH"/super_gradients/:"YOUR_LOCAL_PATH"/super_gradients/src/
#   Then:
#   python -m torch.distributed.launch --nproc_per_node=4 train_from_recipe.py --config-name=imagenet_efficientnet

defaults:
  - training_hyperparams: imagenet_efficientnet_train_params
  - dataset_params: imagenet_dataset_params
  - arch_params: efficientnet_b0_arch_params
  - checkpoint_params: default_checkpoint_params

arch_params:
  num_classes: 1000

dataset_params:
  batch_size: 64
  color_jitter: 0.4
  random_erase_prob: 0.2
  random_erase_value: random
  train_interpolation: random
  auto_augment_config_string: rand-m9-mstd0.5

dataset_interface:
  _target_: super_gradients.training.datasets.dataset_interfaces.dataset_interface.ImageNetDatasetInterface
  dataset_params: ${dataset_params}
  data_dir: /data/Imagenet

data_loader_num_workers: 8

load_checkpoint: False
checkpoint_params:
  load_checkpoint: ${load_checkpoint}

experiment_name: efficientnet_b0_imagenet

model_checkpoints_location: local
ckpt_root_dir:

multi_gpu:
  _target_: super_gradients.training.sg_model.MultiGPUMode
  value: 'DDP'

sg_model:
  _target_: super_gradients.SgModel
  experiment_name: ${experiment_name}
  model_checkpoints_location: ${model_checkpoints_location}
  ckpt_root_dir: ${ckpt_root_dir}
  multi_gpu: ${multi_gpu}

architecture: efficientnet_b0
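
Several values in this recipe, such as `${experiment_name}` and `${load_checkpoint}`, are references to top-level keys that are filled in when the config is loaded. The sketch below is a simplified, standard-library stand-in for that substitution, not the Hydra/OmegaConf implementation: it handles only whole-string references to top-level keys, and the `resolve` helper and the trimmed `recipe` dict are hypothetical names introduced for illustration.

```python
import re

# A trimmed copy of the recipe's keys as a plain dict (illustrative subset only).
recipe = {
    "experiment_name": "efficientnet_b0_imagenet",
    "model_checkpoints_location": "local",
    "load_checkpoint": False,
    "checkpoint_params": {"load_checkpoint": "${load_checkpoint}"},
    "sg_model": {
        "experiment_name": "${experiment_name}",
        "model_checkpoints_location": "${model_checkpoints_location}",
    },
}

# Matches a value that consists solely of one ${key} reference.
_REF = re.compile(r"^\$\{(\w+)\}$")


def resolve(node, root):
    """Replace whole-string ${key} references with root[key], recursively.

    Hypothetical helper: covers top-level keys only, unlike a real resolver.
    """
    if isinstance(node, dict):
        return {key: resolve(value, root) for key, value in node.items()}
    if isinstance(node, str):
        match = _REF.match(node)
        if match:
            return resolve(root[match.group(1)], root)
    return node


resolved = resolve(recipe, recipe)
print(resolved["sg_model"]["experiment_name"])           # efficientnet_b0_imagenet
print(resolved["checkpoint_params"]["load_checkpoint"])  # False
```

Defining a value such as `experiment_name` once at the top level and referencing it elsewhere keeps `sg_model` and `checkpoint_params` in sync when the value is overridden.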