Vector Engine
Introduction
The NEC Vector Engine (VE) compute node in ACES has 8 VE Type 20B-P cards. Each VE card has 8 VE cores and 48 GB of HBM2 memory.
Accessing Vector Engines
This VE compute node is available via the nec partition.
Interactive
To start an interactive session on the VE compute node, for example to develop and test programs on the VE cards:
srun --partition nec --pty --time=4:00:00 --nodes=1 --gres=gpu:ve:8 /bin/bash
To set up your interactive job environment for the NEC compiler:
export PATH=/opt/nec/ve/bin/:$PATH
source /opt/nec/ve/mpi/3.4.0/bin64/necmpivars.sh
To turn on detailed reporting of VE card usage:
export VE_PROGINF=DETAIL
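After running the setup commands above, it can help to confirm that the environment took effect. The following is a minimal sketch, not an official tool: the function name check_ve_env is hypothetical, and it assumes that ncc is the NEC C compiler installed under /opt/nec/ve/bin. It only checks whether ncc resolves on PATH and whether VE_PROGINF is set.

```shell
# Hypothetical helper (not part of the NEC tools): report whether the
# interactive environment looks ready for the NEC compiler.
check_ve_env() {
    rc=0
    # Assumption: ncc is the NEC C compiler found under /opt/nec/ve/bin
    if command -v ncc >/dev/null 2>&1; then
        echo "ncc found: $(command -v ncc)"
    else
        echo "ncc not found on PATH"
        rc=1
    fi
    # VE_PROGINF is the variable exported in the setup step
    if [ -n "${VE_PROGINF:-}" ]; then
        echo "VE_PROGINF=${VE_PROGINF}"
    else
        echo "VE_PROGINF is not set"
        rc=1
    fi
    return "$rc"
}
```

Calling check_ve_env in the interactive session prints one line per check and returns a non-zero status if either check fails.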
Job Submission
To submit batch jobs to the NEC VE node, include the following Slurm directives in your job script.
#SBATCH --ntasks=48 # request all 48 cores
#SBATCH --partition=nec # nec partition required
#SBATCH --gres=gpu:ve:8 # request all 8 NEC VE cards
In addition, you must put all of the commands needed to run your code in a separate script, outside of the job file, and have the job file launch that script. If you run your code directly in the job file, it will not find the NEC VE cards.
Example job file:
#!/bin/bash
#SBATCH --time=24:00:00 # specify the wall time
#SBATCH --ntasks=48 # request all 48 cores
#SBATCH --partition=nec # nec partition required
#SBATCH --gres=gpu:ve:8 # always request 8 NEC VE cards
# ssh command required. script_to_run_workload must be executable and contain all of the commands needed to run your code
ssh dss "cd $SLURM_SUBMIT_DIR && ./script_to_run_workload"
exit
Example script to run VASP (script_to_run_workload):
#!/bin/bash
#
# set up the NEC VE environment
export PATH=/opt/nec/ve/bin/:$PATH
source /opt/nec/ve/mpi/3.4.0/bin64/necmpivars.sh
# set up the environment for VASP
export VASPHOME=/sw/restricted/vasp/sw/6.3.2/nec_5.0.1/
# turn on printing details about VE card usage
export VE_PROGINF=DETAIL
# run using all 8 NEC VE cards, with 8 VE cores per card
mpirun -ve 0-7 -vennp 8 $VASPHOME/bin/vasp_std
exit
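Because the job file launches script_to_run_workload over ssh, the script has to exist in the submit directory and be executable before the job starts. The following is a minimal sketch of a pre-submission check; the function name preflight is hypothetical and not part of Slurm or the NEC tools.

```shell
# Hypothetical pre-submission check for the wrapper script.
# Usage: preflight ./script_to_run_workload
preflight() {
    script="$1"
    # The script must exist as a regular file
    if [ ! -f "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    # The script must carry the executable bit
    if [ ! -x "$script" ]; then
        echo "not executable: $script (run: chmod +x $script)"
        return 1
    fi
    echo "ok: $script"
}
```

One way to use it is to run preflight ./script_to_run_workload in the submit directory and submit the job file with sbatch only if the check succeeds.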
Information
MPI Usage
Example MPI runs:
mpirun -ve 0 -vennp 8 executable # run on 1 VE card (VE card 0) with 8 VE cores
mpirun -ve 0-1 -vennp 8 executable # run on 2 VE cards (VE cards 0 and 1) with 8 VE cores per card
mpirun -ve 0-7 -vennp 8 executable # run on 8 VE cards (VE cards 0-7) with 8 VE cores per card
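The examples above imply a simple count: the number of VE cards in the -ve range multiplied by the -vennp value. The following is a minimal sketch of that arithmetic, assuming one MPI process per VE core and a -ve argument that is either a single card number or a contiguous first-last range as in the examples. The helper name ve_total_ranks is hypothetical and not part of NEC MPI.

```shell
# Hypothetical helper: total MPI processes for a given -ve range and -vennp value.
ve_total_ranks() {
    range="$1"      # e.g. "0" or "0-7"
    per_card="$2"   # value passed to -vennp
    first="${range%%-*}"   # text before the first hyphen (whole string if none)
    last="${range##*-}"    # text after the last hyphen (whole string if none)
    cards=$(( last - first + 1 ))
    echo $(( cards * per_card ))
}

ve_total_ranks 0 8     # prints 8
ve_total_ranks 0-1 8   # prints 16
ve_total_ranks 0-7 8   # prints 64
```

The last case matches the hardware described in the introduction: 8 VE cards with 8 VE cores each give 64 VE cores in total.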
References
Additional details and tutorials:
NEC Vector Engine: Vectorization for HPC applications
NEC Compiler user manuals: