# This example trains with batch_size = 192 * 8 GPUs, total 1536.
# Training time on 8 x NVIDIA RTX A5000 is 9 min / epoch.
# Reaches 81.91% Top-1 accuracy.
#
# Logs and TensorBoard files at s3://deci-pretrained-models/KD_ResNet50_Beit_Base_ImageNet/average_model.pth
# Instructions:
# 0. Make sure that the data is stored in dataset_params.dataset_dir, or add "dataset_params.data_dir=<PATH-TO-DATASET>" at the end of the command below (see the README for details)
# 1. Move to the project root (where you will find the README and the src folder)