HPRC:File Transfers

From TAMU HPRC
Revision as of 19:52, 16 January 2019 by Tmarkhuang (initial version)

Transfer Files

File Transfer Software

There are several options to transfer files to and from HPRC clusters.

Globus Connect

Globus Connect is a reliable, high-performance file transfer platform that lets users move large amounts of data between systems, called endpoints. Users schedule a transfer through the web interface at globus.org and receive a notification when it completes. An endpoint can be a system with Globus installed (such as ada-ftn1) or a user's personal desktop.

What Globus Connect is good for

  • transfer large amounts of data
  • fast transfers (up to 4 data streams); as fast as the slowest link between your server/desktop/laptop and the HPRC fast transfer nodes
  • resume failed transfers
  • receive a notification after a scheduled transfer completes

What Globus Connect is not good for

  • systems without the Globus Connect software: your server, desktop, or laptop must have it installed and set up as an endpoint
  • files on a server behind a firewall (not reachable from the internet): Globus Connect will not work there

How do I use Globus Connect

  • visit Globus Connect for more information
  • use endpoints: "TAMU ada-ftn1" or "TAMU ada-ftn2" for Ada/Curie cluster and "TAMU terra-ftn" for Terra cluster
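
The page describes the web interface only. For users who prefer a terminal, the sketch below shows how the same kind of transfer might be submitted with the separate Globus CLI (the `globus` command), assuming it is installed and `globus login` has already been run. The endpoint UUID placeholders and the function name `submit_transfer` are hypothetical; the real UUIDs can be looked up with a search such as `globus endpoint search "TAMU terra-ftn"`.

```shell
# Hypothetical placeholders: replace with the UUIDs reported by
#   globus endpoint search "TAMU terra-ftn"
SRC_EP="SOURCE-ENDPOINT-UUID"
DST_EP="DESTINATION-ENDPOINT-UUID"

# submit_transfer SRC_PATH DST_PATH LABEL
# Submits a recursive (directory) transfer between the two endpoints.
# Globus runs the task in the background; the label identifies it later.
submit_transfer() {
    globus transfer --recursive --label "$3" "${SRC_EP}:$1" "${DST_EP}:$2"
}
```

A call might look like `submit_transfer /data/run1/ /path/on/cluster/run1/ "run1 upload"`, where both paths are illustrative.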


SCP/SFTP

SCP and SFTP protocols are a means of securely transferring computer files between a local host and a remote host.

What SCP/SFTP is good for

  • transfer files

What SCP/SFTP is not good for

  • large transfers: each file transfer uses only one data stream, so it is not very fast

How do I use SCP/SFTP

  • run the scp or sftp command from a terminal on Linux, Mac, or MobaXterm
  • use WinSCP on Windows, FileZilla on Windows or Mac, or the File Transfer panel in MobaXterm
  • if a transfer to the Ada or Terra login nodes (ada.tamu.edu or terra.tamu.edu) is terminated after one hour, use the fast transfer nodes instead: "ada-ftn1.tamu.edu" or "ada-ftn2.tamu.edu" for Ada/Curie and "terra-ftn.hprc.tamu.edu" for Terra. The login nodes have a one-hour CPU limit for all user processes.
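
As a minimal sketch of the command-line usage, the two helper functions below wrap typical scp calls against the Terra fast transfer node named above. The NetID `your_netid`, the function names, and all paths are hypothetical placeholders; sftp accepts the same `user@host` form for an interactive session.

```shell
# Hypothetical values: replace with your own NetID.
NETID="your_netid"
FTN="terra-ftn.hprc.tamu.edu"   # fast transfer node for Terra (from this page)

# upload_dir LOCAL_DIR REMOTE_PATH
# Copies a local directory recursively (-r) to the cluster.
upload_dir() {
    scp -r "$1" "${NETID}@${FTN}:$2"
}

# download_file REMOTE_FILE
# Copies a single file from the cluster into the current directory.
download_file() {
    scp "${NETID}@${FTN}:$1" .
}
```

For example, `upload_dir ./results /path/on/cluster/` would copy the local `results` directory, and `download_file /path/on/cluster/output.txt` would fetch one file; both paths are illustrative.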


rsync

rsync is a fast, versatile tool for copying files remotely and locally. It is recommended when relatively few differences exist between the source and target versions, because rsync copies only the differences in files that have actually changed. By default, rsync uses the SSH remote shell.

What rsync is good for

  • resume a partially transferred file
  • synchronize two directories (local-local, local-remote, remote-local)

What rsync is not good for

  • by default, files are transferred over SSH, which uses only one data stream and is not very fast

How do I use rsync

  • run the rsync command from a terminal on Linux, Mac, or MobaXterm
  • use DeltaCopy or Grsync on Windows
  • use cwRsync 5.4.1 for the command line on Windows
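
The following sketch illustrates a typical rsync invocation for the resume and synchronize cases listed above, again using the Terra fast transfer node. The NetID, function names, and paths are hypothetical placeholders.

```shell
# Hypothetical values: replace with your own NetID.
NETID="your_netid"
FTN="terra-ftn.hprc.tamu.edu"   # fast transfer node for Terra (from this page)

# sync_to_cluster LOCAL_DIR REMOTE_DIR
#   -a          archive mode: recurse and preserve permissions, times, symlinks
#   -v          list files as they are transferred
#   --partial   keep partially transferred files so a rerun can resume them
#   --progress  show per-file progress
sync_to_cluster() {
    rsync -av --partial --progress "$1" "${NETID}@${FTN}:$2"
}

# preview_sync LOCAL_DIR REMOTE_DIR
# Same selection of files, but -n (--dry-run) only reports what would be copied.
preview_sync() {
    rsync -avn "$1" "${NETID}@${FTN}:$2"
}
```

Note that a trailing slash on the source matters to rsync: `sync_to_cluster ./results/ /path/on/cluster/results/` copies the contents of `results`, while `./results` without the slash copies the directory itself into the destination.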


rclone

rclone is a tool for syncing files between HPRC systems and remote storage sites such as Google Drive, Dropbox, Amazon AWS, and many more.

What rclone is good for

  • copy data to or from cloud storage (Google Drive, Dropbox, AWS, etc.)

What rclone is not good for

  • transfers can be slow

How do I use rclone

  • rclone is available on Ada, Terra, and the HPRC Lab workstations; no module is required on any of them
  • use the Ada/Terra fast transfer nodes (ada-ftn1.tamu.edu, ada-ftn2.tamu.edu or terra-ftn.hprc.tamu.edu) for long data transfers (1+ hours)
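
A minimal sketch of an rclone session follows. It assumes a remote has already been set up interactively with `rclone config`; the remote name `gdrive`, the function names, and the paths are hypothetical.

```shell
# Hypothetical remote name, created beforehand with: rclone config
REMOTE="gdrive"

# List the top-level directories on the remote (a quick connectivity check).
list_remote_dirs() {
    rclone lsd "${REMOTE}:"
}

# copy_to_cloud LOCAL_DIR REMOTE_DIR
# Copies new or changed files to the remote; --progress shows live statistics.
copy_to_cloud() {
    rclone copy --progress "$1" "${REMOTE}:$2"
}

# copy_from_cloud REMOTE_DIR LOCAL_DIR
# Copies new or changed files from the remote to a local directory.
copy_from_cloud() {
    rclone copy --progress "${REMOTE}:$1" "$2"
}
```

For instance, `copy_to_cloud ./results backups/results` would upload a local directory; `rclone copy` skips files that are already identical on the destination.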


Other Considerations

  • Should I use FTN or login nodes?
  • Why does the transfer take so long?
  • I have 100+ TB to transfer.
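
As a rough aid for the first two questions, the sketch below estimates how long a transfer would take at a given sustained speed and applies the one-hour rule of thumb mentioned on this page. It is only an approximation: the login-node limit is on CPU time, not elapsed time, and real throughput varies. The function names and the decimal units (1 GB = 8000 megabits) are assumptions made for this example.

```shell
# estimate_hours SIZE_GB SPEED_MBPS
# Prints the estimated transfer time in hours.
#   SIZE_GB     data size in gigabytes (10^9 bytes)
#   SPEED_MBPS  sustained throughput in megabits per second
estimate_hours() {
    awk -v gb="$1" -v mbps="$2" \
        'BEGIN { printf "%.2f\n", (gb * 8000) / mbps / 3600 }'
}

# suggest_node SIZE_GB SPEED_MBPS
# Suggests the fast transfer node when the estimate reaches one hour.
suggest_node() {
    hours=$(estimate_hours "$1" "$2")
    awk -v h="$hours" 'BEGIN {
        if (h + 0 >= 1) print "fast transfer node"
        else            print "login node or fast transfer node"
    }'
}
```

For example, `estimate_hours 450 1000` prints `1.00`, so `suggest_node 450 1000` suggests the fast transfer node.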