
Transferring Files

How to transfer files using SCP, SFTP, Fuse and S3

Using Open OnDemand to move files graphically from a browser

Some users prefer to move files through the Open OnDemand web console: you are asked for Duo authentication only when you first log in, and you can then keep the window open to transfer files as needed. To learn more about this option, see the documentation for Open OnDemand.


Using SSH/SCP

Why use SSH/SCP/SFTP for file transfer?

SCP and SFTP both run over SSH, so transfers are encrypted. Implementations are available for all common operating systems, including Linux, Windows, and Mac OS X.

Windows

GUI:
  • WinSCP
    • Host: login.hpc.caltech.edu
    • Enter your username and password.
  • FileZilla

Command Line:

Linux

  • Command Line:
    • Start Terminal (Applications->Accessories->Terminal)
      • To copy files from your computer to the central cluster
        • Type scp local_filename username@login.hpc.caltech.edu:/home/username/
      • To copy files from the central cluster to your computer
        • Type scp username@login.hpc.caltech.edu:/home/username/remote_filename .
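
The two scp commands above can be wrapped in small shell functions, which is handy when copying whole directories. This is only a sketch: the function names hpc_upload and hpc_download and the HPC_USER variable are hypothetical, and the -r (recursive) and -p (preserve times and modes) options are an addition to the commands shown above.

```shell
#!/usr/bin/env bash
# Hypothetical wrappers around the scp commands above; not part of the cluster tooling.
# HPC_USER is a placeholder: set it to your own cluster username before use.
HPC_HOST="login.hpc.caltech.edu"
HPC_USER="${HPC_USER:-username}"

# Upload a local file or directory into /home/$HPC_USER on the cluster.
# -r recurses into directories; -p preserves modification times and modes.
hpc_upload() {
    scp -rp "$1" "${HPC_USER}@${HPC_HOST}:/home/${HPC_USER}/"
}

# Download a path under /home/$HPC_USER into the current local directory.
hpc_download() {
    scp -rp "${HPC_USER}@${HPC_HOST}:/home/${HPC_USER}/$1" .
}

# Usage:
#   hpc_upload results_dir
#   hpc_download remote_filename
```

Each call opens a new SSH connection, so expect to authenticate every time.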

Mac OS X

    Command Line:
    • Start Terminal (Applications->Utilities->Terminal)
      • To copy files from your computer to the central cluster
        • Type scp local_filename username@login.hpc.caltech.edu:/home/username/
      • To copy files from the central cluster to your computer
        • Type scp username@login.hpc.caltech.edu:/home/username/remote_filename .

    SSHFS on Mac OS X

    If you prefer filesystem-like access, you may use FUSE together with SSHFS. This works over the SSH protocol and is therefore encrypted, as with standard SSH/SCP/SFTP, with the added benefit of drag-and-drop transfers.

    • Download and install FUSE and SSHFS.
    • Make a local mount directory on your Mac: mkdir ~/Desktop/HPC-Mount
    • Run a command similar to the following, substituting your own username and directory name.
    • sshfs -o allow_other,defer_permissions,auto_cache remote-username@login.hpc.caltech.edu:/home/remote-username ~/Desktop/HPC-Mount
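
The mount steps above, plus the matching unmount, might be collected into two shell functions along these lines. The names hpc_mount and hpc_unmount are hypothetical, and the sketch assumes that the standard umount command is enough to detach the mount on your Mac.

```shell
#!/usr/bin/env bash
HPC_HOST="login.hpc.caltech.edu"

# Mount the remote home directory of cluster user $1 at local directory $2.
# The -o options are copied from the macOS command above.
hpc_mount() {
    local user="$1" mount_point="$2"
    # Create the local mount directory if it does not exist yet.
    mkdir -p "$mount_point" || return 1
    sshfs -o allow_other,defer_permissions,auto_cache \
        "${user}@${HPC_HOST}:/home/${user}" "$mount_point"
}

# Detach the mount when you are done with it.
hpc_unmount() {
    umount "$1"
}

# Usage:
#   hpc_mount remote-username ~/Desktop/HPC-Mount
#   hpc_unmount ~/Desktop/HPC-Mount
```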

    GUI:
    • Cyberduck
      • Download and install Cyberduck.
      • Cyberduck can be made to work with two-factor authentication:
        • Click "Open Connection".
        • Choose "SFTP".
        • Enter your username and password, then click "Connect".
        • In the "Provide additional login credentials" box, enter 1 in the password field and press Enter if you are using the smartphone app.
        • You should be prompted on your phone to allow the connection.
        • If you are using a YubiKey, touch it when prompted to complete the login.

    Globus Personal (GridFTP)

    Globus is a fast, reliable file transfer service that makes it easy for users to move data between two GridFTP servers or between a GridFTP server and a user's machine (Windows, Mac or Linux).

    Set up the Globus endpoint via the Globus website.
    1. Go to https://app.globus.org/file-manager/gcp to set up a new Globus endpoint.
    2. Select California Institute of Technology > Continue.
    3. Sign in with your access.caltech credentials.
    4. Set the Endpoint display name to "central-hpc" (or something similar).
    5. Click 'Generate setup key' and copy the key to a secure location.

    Set up the Globus Personal client under your account on the Central HPC.

    SSH to the Central HPC, then run the following.

    1. module load globusconnectpersonal/3.0.2
    2. globusconnectpersonal -setup <setup-key-from-website>
    3. globusconnectpersonal -start &

    The daemon should now be running in the background and connected to the external Globus service. You should be able to browse your home directory and transfer data to and from it. 

    If you need to allow Globus access to another directory (for instance /central/groups/xxx) perform the following. 

    Edit ~/.globusonline/lta/config-paths, adding the following line after the existing one. The last field controls write access: 1, as in the line below, makes the directory read/write in Globus, and 0 makes it read-only.

    /central/groups/<whatever-group-directory-you-want-access-to>/,0,1

    Restart the Globus daemon to pick up the changes.

    1. globusconnectpersonal -stop
    2. globusconnectpersonal -start &

    You should now be able to navigate to the additional directory via the Globus website. Keep in mind that even after you add a directory for Globus access, existing Unix permissions still determine which files and directories you can access on the cluster.
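
    If you add directories often, the config-paths edit can be scripted. The helper below is a sketch under stated assumptions: add_globus_path is a hypothetical name, the middle field is kept at 0 as in the line shown earlier, and the existing file is assumed to end with a newline. It appends the line only if it is not already present, so running it twice is harmless.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "<dir>/,0,<rw>" to a Globus config-paths file
# unless that exact line is already present.
#   $1  path to the config-paths file
#   $2  directory to expose
#   $3  1 for read/write (default), 0 for read-only
add_globus_path() {
    local config="$1" dir="${2%/}/" rw="${3:-1}"
    local line="${dir},0,${rw}"
    # -x matches whole lines only, -F treats the line as a literal string.
    grep -qxF -- "$line" "$config" 2>/dev/null || printf '%s\n' "$line" >> "$config"
}

# Usage (the group directory is a placeholder); restart the daemon afterwards:
#   add_globus_path ~/.globusonline/lta/config-paths /central/groups/xxx 1
```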

    Using Amazon S3

    If your data is in Amazon S3, you may use the awscli tools, which are already installed as a module on the cluster.

    • Log in to the cluster and run module load awscli/1.15.27
    • Type aws configure and enter your Amazon Web Services access key ID and secret access key. (You generate these on the IAM credentials page in the AWS console.)
    • Run a command similar to the following to copy data from S3 to your cluster home directory. 
    • aws s3 cp --recursive s3://my-bucket-name/subfolder/ ~/destination-directory/
    • Run a command similar to the following to copy data from the cluster to a pre-existing S3 bucket.
    • aws s3 cp --recursive ~/source-directory/ s3://my-bucket-name/subfolder/
    • More S3 examples are available here.
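
    Before a large recursive copy, it can help to preview what will be transferred. The sketch below wraps the copy commands above in two hypothetical functions, s3_preview and s3_copy. The preview relies on the --dryrun option of aws s3 cp, which lists the operations without moving any data. Bucket and directory names are placeholders.

```shell
#!/usr/bin/env bash
# Preview a recursive copy: --dryrun lists the operations without transferring data.
s3_preview() {
    aws s3 cp --recursive --dryrun "$1" "$2"
}

# Perform the same recursive copy for real.
s3_copy() {
    aws s3 cp --recursive "$1" "$2"
}

# Usage (bucket and directory names are placeholders):
#   s3_preview s3://my-bucket-name/subfolder/ ~/destination-directory/
#   s3_copy    s3://my-bucket-name/subfolder/ ~/destination-directory/
```

    The same functions work in the other direction by swapping the source and destination arguments.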