sbatch_command.sh 642 B

#!/usr/bin/bash
#SBATCH --job-name=epa_ddp
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=24G
#SBATCH --account=csso-e
#SBATCH --gres=gpu:2
#SBATCH --time=24:00:00
#SBATCH --output=epa_ddp_%j.out
#SBATCH --error=epa_ddp_%j.err

# Load the CUDA/cuDNN toolchain and activate the project conda environment.
module load cuda/12.1.1 cudnn/cuda-12.1_8.9 anaconda
conda activate /scratch/gilbreth/rai53/fire

# MLflow credentials are taken from the submitting shell's environment.
export MLFLOW_TRACKING_USERNAME="$MLFLOW_USERNAME"
export MLFLOW_TRACKING_PASSWORD="$MLFLOW_TOKEN"

# OpenMP thread count, set to the same value as --cpus-per-task.
export OMP_NUM_THREADS=8
# Exported so the torchrun worker processes inherit verbose NCCL logging.
export NCCL_DEBUG=INFO

# Single-node DDP launch: one worker process per visible GPU.
torchrun --standalone --nnodes=1 --nproc_per_node=gpu \
    epa_seq2seq_train_ddp.py \
    config.json \
    dropout_config.json \
    data.zarr \
    output_directory/
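
The script reads `MLFLOW_USERNAME` and `MLFLOW_TOKEN` from the environment it is submitted from, so a job submitted without them would start with empty MLflow credentials. A minimal pre-flight sketch, assuming the job is submitted from a bash shell and that Slurm's default environment propagation is in effect; `require_env` is a hypothetical helper and is not part of the original script:

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper (an assumption, not from the original script).
# Returns 0 only if every named variable is set and non-empty;
# otherwise it reports each missing name on stderr and returns 1.
require_env() {
    local missing=0 name
    for name in "$@"; do
        # ${!name:-} is bash indirect expansion: the value of the variable
        # whose name is stored in $name, or empty if it is unset.
        if [ -z "${!name:-}" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage sketch: submit only when both credentials are present.
# require_env MLFLOW_USERNAME MLFLOW_TOKEN && sbatch sbatch_command.sh
```

The submission line is left commented out so the helper can be sourced and tried on a machine without Slurm.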