Multi-node example
To run this demo, which launches a single job that trains on 2 nodes (2 GPUs per node), do the following:
- Log into the jumphost node of your SLURM-managed cluster.
- Create a conda environment with Lightning and a GPU-enabled build of PyTorch.
- Submit the job script, passing the name of that conda environment:
sbatch job_submit.sh --env=YourEnv
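Once the job is running, it can help to confirm that SLURM actually started the expected layout (2 nodes with 2 processes each, so 4 processes in total). The following is a minimal sketch, not part of this demo: the helper name `slurm_layout` is hypothetical, and it assumes the job was launched with one task per GPU so that SLURM sets the standard `SLURM_NNODES`, `SLURM_NTASKS`, `SLURM_PROCID`, and `SLURM_LOCALID` variables for each process.

```python
import os


def slurm_layout(env=None):
    """Summarize the process layout from SLURM-provided environment variables.

    Hypothetical helper for illustration. Falls back to a single-process
    layout when the variables are missing (e.g. outside a SLURM job).
    """
    env = os.environ if env is None else env
    num_nodes = int(env.get("SLURM_NNODES", 1))
    world_size = int(env.get("SLURM_NTASKS", 1))
    return {
        "num_nodes": num_nodes,
        "world_size": world_size,
        # Assumes tasks are spread evenly across nodes (one task per GPU).
        "tasks_per_node": world_size // num_nodes,
        "global_rank": int(env.get("SLURM_PROCID", 0)),
        "local_rank": int(env.get("SLURM_LOCALID", 0)),
    }


if __name__ == "__main__":
    print(slurm_layout())
```

Calling this from the training script should report `num_nodes` of 2 and `world_size` of 4 for the configuration described above; any other values suggest the job script's resource request does not match what the demo expects.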