Vector Engine

Introduction

The NEC Vector Engine (VE) compute node in ACES has 8 VE Type 20B-P cards. Each VE card has 8 VE cores and 48GB HBM2 memory.

Accessing Vector Engines

This VE compute node is available via the nec partition.

Interactive

To start an interactive session on the VE compute node and develop your programs against the VE cards:

srun --partition nec --pty --time=4:00:00 --nodes=1 --gres=gpu:ve:8 /bin/bash

To set up your interactive job environment for the NEC compiler:

export PATH=/opt/nec/ve/bin/:$PATH
source /opt/nec/ve/mpi/3.4.0/bin64/necmpivars.sh

To turn on printing of detailed VE card usage statistics when your program exits:

export VE_PROGINF=DETAIL
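With the environment set up as above, you can compile and run code on the VE cards using the NEC compilers (ncc for C, nfort for Fortran). The source filename below is a placeholder; this is a minimal sketch assuming the compilers are on the PATH set above:

```shell
# compile a C source file with the NEC C compiler (nfort for Fortran)
ncc -O2 -o hello_ve hello.c

# run the resulting binary on a VE card; with VE_PROGINF=DETAIL set,
# performance details are printed when the program exits
./hello_ve
```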

Job Submission

To submit batch jobs to the NEC VE node, include the following Slurm directives in your job script.

#SBATCH --ntasks=48     # request all 48 cores
#SBATCH --partition=nec # nec partition required
#SBATCH --gres=gpu:ve:8 # request all 8 NEC cards

In addition, place the commands that run your code in a separate executable script rather than in the job file itself. Commands run directly in the job file will not find the NEC VE cards.

Example job file:

#!/bin/bash
#SBATCH --time=24:00:00  # specify the wall time
#SBATCH --ntasks=48      # request all 48 cores
#SBATCH --partition=nec  # nec partition required
#SBATCH --gres=gpu:ve:8  # always request 8 NEC VE cards

# The ssh command is required. script_to_run_workload must contain all of
# the commands needed to run the workload and must be executable.
ssh dss "cd $SLURM_SUBMIT_DIR;./script_to_run_workload"

exit

Example for a script to run VASP (script_to_run_workload):

#!/bin/bash
#

# set up the NEC VE environment
export PATH=/opt/nec/ve/bin/:$PATH
source /opt/nec/ve/mpi/3.4.0/bin64/necmpivars.sh

# set up the environment for VASP
export VASPHOME=/sw/restricted/vasp/sw/6.3.2/nec_5.0.1/

# turn on printing details about VE card usage
export VE_PROGINF=DETAIL

# run using all 8 NEC VE cards and VE cores
mpirun -ve 0-7 -vennp 8  $VASPHOME/bin/vasp_std

exit

Information

MPI Usage

Example MPI runs:

mpirun -ve 0 -vennp 8 executable    # run on 1 VE card (VE card 0) with 8 VE cores
mpirun -ve 0-1 -vennp 8 executable  # run on 2 VE cards (VE cards 0 and 1) with 8 VE cores per card
mpirun -ve 0-7 -vennp 8 executable  # run on 8 VE cards (VE cards 0-7) with 8 VE cores per card
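The -ve flag selects which VE cards to use and -vennp sets the number of MPI processes per card, so you are not required to use all 8 cores of each card. A hedged example of a smaller run, following the flag semantics shown above (the executable name is a placeholder):

```shell
# 2 VE cards (cards 0 and 1) with 4 MPI ranks per card, 8 ranks total
mpirun -ve 0-1 -vennp 4 ./executable
```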

References

Additional details and tutorials:

NEC Vector Engine: Vectorization for HPC applications

NEC Compiler user manuals:

C/C++ Compiler
Fortran Compiler