Data Transfer with Globus

Globus is a robust online tool that simplifies the process of transferring large (GB or TB) datasets to and from RCC systems. This service enables you to transfer data from your personal workstations, share it with other researchers, or move data among different HPC providers.

Example use cases:

  • Reliably transfer a large amount of data from the RAS, Panasas, or Lustre to another machine or server.
  • Reliably transfer a large amount of data from another machine or server onto the RAS, Panasas, or Lustre.
  • Set up automated replication of data between the RAS, Panasas, or Lustre and another server or computer.

Setting up a Globus Account and Endpoint

Before you can transfer data to and from RCC systems, you must set up a Globus Account and configure the target computer (personal computer or server) as a Globus Endpoint.

  1. Visit http://globus.org and sign up for an account. Then sign in.
  2. Browse to the Manage Endpoints page.
  3. Click the Add Globus Connect Personal link.
  4. When prompted, create an Endpoint name.
  5. Click Generate Setup Key, and take note of the key when it appears.
  6. Click the appropriate client download button for your operating system.
  7. Install the client and follow the on-screen instructions. You'll need your setup key for this step.

Transferring Data

Once you have set up an account and endpoint, you can transfer data at any time.

  1. With the Globus Client running, browse to the Transfer Files page at http://globus.org.
  2. On the right-hand side of the screen, enter the name of your endpoint (or click the "..." button to find your endpoint in a list).
  3. Select the path in your file tree that you wish to transfer your files to or from.
  4. On the left-hand side, enter:
    1. fsurcc#panfs to transfer data to or from Panasas
    2. fsurcc#lustre to transfer data to or from Lustre
    3. fsurcc#archival to transfer data to or from the Research Archival Storage
  5. Click Go.
  6. In the authorization dialog box that appears, enter your RCC username and password.
  7. Browse to and select the files and folders you wish to transfer. Then, click the large arrow in the direction that you wish to transfer files.
  8. Your files will now be queued for transfer. You can monitor the progress of your transfer on the Activity page of the Globus website.

Setting up CLI Access (no client needed!)

You can use the Globus CLI to transfer data to and from a remote server over SSH, without installing the Globus client on that server. This is a good strategy when you need to reliably move data from a remote, firewall-protected server to RCC. You do not need admin privileges on the server for this method. You will need an SSH public/private key pair.

  1. Ensure you have signed up for an account and created an endpoint (see Setting up a Globus Account and Endpoint above).
  2. On the Globus website, browse to the Manage Identities page.
  3. Click the Add Linked Identity link, and select Add SSH Public Key.
  4. Upload your Public Key.
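
If you do not already have a key pair to upload, the following is a minimal sketch of how one might be generated with OpenSSH's ssh-keygen. The function name make_globus_keypair, the key path, and the choice of a 4096-bit RSA key are assumptions made for illustration; this guide does not state which key types Globus accepts.

```shell
#!/bin/sh
# Hypothetical helper (not part of Globus): create an SSH key pair for CLI access.
#   $1 = path of the private key file to create (the public key is "$1.pub")
#   $2 = passphrase protecting the private key (an empty string means none)
make_globus_keypair() {
    key_file="$1"
    passphrase="$2"
    # Make sure the target directory exists before writing the key.
    mkdir -p "$(dirname "$key_file")" || return 1
    # -q quiet, -t key type, -b key size in bits, -N passphrase, -f output file.
    # The RSA/4096 choice is an assumption, not a documented Globus requirement.
    ssh-keygen -q -t rsa -b 4096 -N "$passphrase" -f "$key_file" || return 1
    # Print the public half; this is the part you upload in step 4.
    cat "$key_file.pub"
}

# Example call (the path is only a suggestion):
#   make_globus_keypair "$HOME/.ssh/globus_cli_key" "my passphrase"
```

Only the contents of the .pub file should be uploaded; the private key stays on your machine.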

Using CLI Access

Activate your endpoint: ssh -t [GLOBUS-USERNAME]@cli.globusonline.org [ENDPOINT#NAME]

Supply your Globus password when prompted.

Transfer files using the following syntax:

ssh [GLOBUS-USERNAME]@cli.globusonline.org scp [ENDPOINT#NAME]:/path/to/files [FSURCC#FILESYSTEM]:~/destination/path

For example, to transfer a file named foobar.txt from the home directory on your endpoint to your Lustre space:

ssh john-doe@cli.globusonline.org scp john-doe#laptop:~/foobar.txt fsurcc#lustre:~
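
If you run transfers like this often, the syntax above can be wrapped in a small shell helper. The sketch below is illustrative only: the function names rcc_endpoint and globus_scp_cmd are hypothetical, and the helper prints the command instead of running it so that you can inspect it first. The endpoint names are the ones listed under Transferring Data.

```shell
#!/bin/sh
# Hypothetical helper: map a storage system to its RCC endpoint name.
rcc_endpoint() {
    case "$1" in
        panasas)  echo 'fsurcc#panfs' ;;
        lustre)   echo 'fsurcc#lustre' ;;
        archival) echo 'fsurcc#archival' ;;
        *) echo "unknown storage system: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print (do not run) the CLI transfer command.
#   $1 = Globus username
#   $2 = source in ENDPOINT#NAME:/path form
#   $3 = storage system: panasas, lustre, or archival
#   $4 = destination path on that storage system
globus_scp_cmd() {
    user="$1"
    src="$2"
    storage="$3"
    dest_path="$4"
    # Stop early if the storage system is not one of the three known names.
    endpoint="$(rcc_endpoint "$storage")" || return 1
    printf 'ssh %s@cli.globusonline.org scp %s %s:%s\n' \
        "$user" "$src" "$endpoint" "$dest_path"
}

# Reproduces the example above. The single quotes keep the shell from
# expanding "~" locally or treating "#" as the start of a comment.
globus_scp_cmd john-doe 'john-doe#laptop:~/foobar.txt' lustre '~'
```

Once the printed command looks right, you can paste it into your terminal to start the transfer.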

Further Reading

There is more you can do with Globus. Refer to the official documentation for more information.