GROMACS on MeluXina
The MeluXina system environment provides the GROMACS scientific software package for molecular dynamics simulations.
EasyBuild module description
GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems
with hundreds to millions of particles. This is a CPU only build, containing both MPI and threadMPI builds for both single and double precision. It also contains the gmxapi extension for the single precision MPI build.
The list of available versions (CPU and GPU) can be obtained with the module avail GROMACS command, as in the example output below:
(compute)$ module avail GROMACS
Output
------------------------------------------ /apps/USE/easybuild/release/latest/modules/all -------------------------------------------
GROMACS/2019.6-foss-2021a-CUDA-11.3.1 GROMACS/2021.3-foss-2021a-CUDA-11.3.1
GROMACS/2019.6-foss-2021a GROMACS/2021.3-foss-2021a (D)
Where:
D: Default Module
Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".
The modules containing the CUDA tag provide GROMACS versions with GPU acceleration.
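Before loading a module, the standard Lmod commands can be used to inspect what a given build provides; a minimal example for the CUDA build listed above (any of the other module names works the same way):
(compute)$ module spider GROMACS/2021.3-foss-2021a-CUDA-11.3.1   # full description and required dependency modules
(compute)$ module show GROMACS/2021.3-foss-2021a-CUDA-11.3.1     # environment changes made by the module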
GROMACS usage
CPU version
GROMACS has a CPU-only version that can be run on the MeluXina CPU partition in Slurm batch jobs. The computational load can be distributed among MPI tasks and/or shared between OpenMP threads to reduce simulation time. The script below runs a simple hybrid MPI/OpenMP GROMACS case on one CPU node allocated for 30 minutes.
#!/bin/bash -l
#SBATCH -A COMPUTE_ACCOUNT
#SBATCH --job-name="Gromacs"
#SBATCH -p cpu
#SBATCH -q short
#SBATCH --time=30:00
#SBATCH -N 1
#SBATCH --ntasks-per-node=12
#SBATCH --cpus-per-task=5
#SBATCH --output=gromacs.out
#SBATCH --error=gromacs.err
#Load the GROMACS module
module load GROMACS/2021.3-foss-2021a
#Run the case
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun gmx_mpi mdrun -dlb yes -nsteps 500000 -ntomp $OMP_NUM_THREADS -pin on -v -nb cpu -s topol.tpr
The Slurm batch script above requests one CPU node with 12 tasks and 5 cores per task, and will thus run 12 MPI ranks per node with 5 OpenMP threads per MPI rank. The GROMACS case runs for 500000 steps with dynamic load balancing (-dlb yes) and thread affinity enabled (-pin on), with non-bonded interactions calculated on the CPU (-nb cpu). The -v option activates verbose output.
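As a usage sketch, assuming the batch script above has been saved as gromacs_cpu.sh (a hypothetical file name), it can be submitted and monitored with the usual Slurm commands:
(login)$ sbatch gromacs_cpu.sh      # submit the job to the cpu partition
(login)$ squeue -u $USER            # check the job state in the queue
(login)$ tail -f gromacs.err        # follow the run; mdrun's -v progress output goes to stderr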
Note
The command gmx_mpi mdrun -dlb yes -nsteps 500000 -ntomp 5 -pin on -v -nb cpu -s topol.tpr is an optimized scenario for a specific example case and may not suit your simulation. To use GROMACS efficiently, start from basic parameters (for example, srun gmx_mpi mdrun -s topol.tpr) and then tune the number of nodes, MPI ranks and threads for performance. Please see the GROMACS documentation for more details on getting good performance from mdrun.
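Building on the note above, one practical way to tune the decomposition is to submit the same case with a few different MPI-rank/OpenMP-thread splits and compare the resulting performance. The sketch below assumes the batch script is saved as gromacs_cpu.sh, and the chosen splits (each multiplying to the 60 cores requested above) are only examples; command-line sbatch options override the corresponding #SBATCH directives in the script, and OMP_NUM_THREADS adapts automatically since the script derives it from SLURM_CPUS_PER_TASK.
#!/bin/bash -l
#Hypothetical scaling test: submit the same case with different MPI x OpenMP splits
for conf in "12 5" "6 10" "30 2"; do
    set -- $conf
    sbatch --ntasks-per-node=$1 --cpus-per-task=$2 \
           --job-name="gmx-$1x$2" --output="gmx-$1x$2.out" --error="gmx-$1x$2.err" \
           gromacs_cpu.sh
done
The achieved performance (in ns/day) is reported at the end of each run's md.log file and can be used to pick the best split.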
GPU version
To use the GPU-enabled version of GROMACS, you will need to load one of the -CUDA- tagged modules.
#!/bin/bash -l
#SBATCH -A COMPUTE_ACCOUNT
#SBATCH --job-name="Gromacs-GPU"
#SBATCH -p gpu
#SBATCH -q short
#SBATCH --time=30:00
#SBATCH -N 1
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=16
#SBATCH --gpus-per-task=1
#SBATCH --output=gromacs.out
#SBATCH --error=gromacs.err
#Load the GROMACS module
module load GROMACS/2021.3-foss-2021a-CUDA-11.3.1
#Run the case
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun gmx_mpi mdrun -dlb yes -nsteps 500000 -ntomp $OMP_NUM_THREADS -pin on -v -nb gpu -s topol.tpr
The Slurm batch script above requests one GPU node with 4 tasks, 16 cores and one GPU per task, and offloads the calculation of non-bonded interactions to the GPUs (-nb gpu). You may also wish to consult the GROMACS documentation's specific guidance on running GROMACS efficiently on GPUs, to achieve better performance than with the CPU-only version.
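Beyond -nb gpu, GROMACS 2021 can offload further parts of the calculation (PME, bonded forces, coordinate update) to the GPU. The single-rank sketch below is an illustration only; whether each offload is supported depends on your simulation settings, and mdrun will report an error if a requested combination is not available for your inputs:
#Hypothetical single-GPU run offloading more work to the device
srun -n 1 gmx_mpi mdrun -ntomp 16 -nb gpu -pme gpu -bonded gpu -update gpu -s topol.tpr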