The Cooperative Computing Tools (cctools) is a software package for enabling large scale distributed computing on clusters, clouds, and grids. It is used primarily for attacking large scale problems in science and engineering. "parrot_run" can be used to access CVMFS.
Parrot is a tool for attaching existing programs to remote I/O systems through the filesystem interface. Parrot "speaks" a variety of remote I/O services include HTTP, FTP, GridFTP, iRODS, CVMFS, and Chirp on behalf of ordinary programs.
Parrot cache dir is set to /scratch/user/YourNetID/parrot_cache. User can change it's location by updating environment variable PARROT_CVMFS_ALIEN_CACHE prior running "parrot_run" command
Parrot is available on Terra cluster and it's working on both login nodes and compute nodes. HPRC compute nodes do not have access to internet, thus a proxy server is used when running on the compute nodes.
To access parrot, user can load cctools module
module load cctools/7.0.22-Python-2.7
Alternatively, user can load cctools with Python 3.6 toolchain via
module load cctools/7.0.22-Python-3.7
In the following examples, the commands are listed in bold face.
User can list CVMFS file system using "parrot_run"
[user@terra1 ~]$ parrot_run ls -lad /cvmfs/cms.cern.ch/slc6_amd64_gcc530 LibCvmfs version 2.4, revision 25 drwxr-xr-x 1 root root 44 Dec 19 05:10 /cvmfs/cms.cern.ch/slc6_amd64_gcc530
User can get a shell with "parrot_run". Please ignore the error of "bash: /usr/bin/mount: Permission denied"
[user@terra1 ~]$ parrot_run bash 2020/02/19 15:03:46.07 parrot_run <child:17149> notice: warning: system call 316 (renameat2) not supported for program /usr/bin/mv 2020/02/19 15:03:46.38 parrot_run <child:17178> notice: cannot execute the program /usr/bin/mount because it is setuid. bash: /usr/bin/mount: Permission denied [user@terra1 ~]$ ls -l /cvmfs/atlas.cern.ch LibCvmfs version 2.4, revision 25 total 1 drwxr-xr-x 1 root root 3 Dec 5 08:41 repo
User will get a warning message when running "parrot_run" inside "parrot_run bash"
[user@terra1 ~]$ parrot_run bash sorry, parrot_run cannot be run inside of itself. version already running is 7.0.22 FINAL.
User can check if "parrot_run" is running via "parrot_run --is-running" command. If "parrot_run" is not running, no output will be given, otherwise version number will be printed.
[user@terra1 ~]$ parrot_run --is-running 7.0.22 FINAL
User can change prompt for parrot_run to indicate that you are inside a parrot_run shell.
[user@terra1 ~]$ PS1="(Parrot) [\u@terra1 \W]\$ " parrot_run bash --login --noprofile (Parrot) [user@terra1 ~]$ ls -l /cvmfs/atlas.cern.ch LibCvmfs version 2.4, revision 25 total 1 drwxr-xr-x 1 root root 3 Dec 5 08:41 repo (Parrot) [user@terra1 ~]$ exit logout
Show which filesystems are supported by parrot_run
[user@terra1 ~]$ parrot_run --help | grep Enabled Enabled filesystems are: chirp ftp cvmfs multi http hdfs grow anonftp
Sample job script for Terra
The sample job script below shows two methods to run parrot_run with list of commands. The first method uses commands listed in an external script and the second methods uses pipe and heredoc (Here document).
#!/bin/bash ##NECESSARY JOB SPECIFICATIONS #SBATCH --job-name=parrot_test #Set the job name to "parrot_test" #SBATCH --time=00:10:00 #Set the wall clock limit to 10min #SBATCH --ntasks=1 #Request 1 task #SBATCH --mem=2560M #Request 2560MB (2.5GB) per node #SBATCH --output=parrot_test.%j #Send stdout/err to "parrot_test.[jobID]" echo "start testing..." # load module module load cctools # execute commands listed in "mycommands.sh" parrot_run $SHELL mycommands.sh # pipe commands using heredoc (Here document); "EOF" is the delimiting identifier parrot_run $SHELL << EOF echo "# test: 'ls /cvmfs/geant4.cern.ch' ..." ls /cvmfs/geant4.cern.ch EOF echo "end of testing..."