Backups

If you have valuable data that requires backup, you can use the installed Duplicity backup client or another client or process of your choosing. Duplicity supports encrypted backups to many cloud storage vendors, including Google, Amazon S3, Backblaze B2, and Azure.


As an example, to configure a simple backup of a specific directory to AWS S3, perform the following steps.


Create an IAM user and associated AWS API keys in the AWS Console to allow access to S3. (For simplicity's sake, this example gives full access to S3 in AWS. You may want to adjust this to conform to your specific AWS security requirements.)


Log into your AWS console. https://aws.amazon.com/console/

Click on 'Services' followed by 'IAM'. 

In IAM console click 'Users' and 'Add User'.

Username: duplicity-backups
Access: Programmatic access

Under Set Permissions, click 'Attach existing policies directly' and select 'AmazonS3FullAccess'.

Click Next, Next (Review) followed by 'Create User'

Note the Access Key ID and Secret Access Key. You will use these in the configuration of Duplicity.
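If AmazonS3FullAccess is broader than your security requirements allow, one option is a custom policy scoped to the backup bucket. The sketch below uses a hypothetical shell helper, bucket_policy, to print such a policy. The helper name, the example bucket name, and the choice to allow every S3 action on that one bucket are assumptions for illustration. The policy has not been tested against Duplicity here, so treat it as a starting point only.

```shell
# Hypothetical helper: print an IAM policy limited to a single bucket.
# Allowing "s3:*" on one bucket is an assumption, not a tested minimum.
bucket_policy() {
    bucket="$1"
    cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::${bucket}",
        "arn:aws:s3:::${bucket}/*"
      ]
    }
  ]
}
EOF
}

# Example with a placeholder bucket name:
bucket_policy my-backup-bucket
```

The printed JSON can be pasted into the policy editor in the IAM console and attached to the duplicity-backups user instead of AmazonS3FullAccess.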

Create a GNU Privacy Guard (GPG) key to be used to encrypt the backups.


Run 'gpg2 --gen-key', select the default key type (hit enter), the default key size (hit enter), and the default key expiration (hit enter), then enter your real name, email address, and passphrase.

You'll see the public key listed (the line that starts with 'pub'), similar to the line below. The key ID (AE776EB5 in this example) will be used in the configuration of Duplicity.

pub   2048R/AE776EB5 2019-04-08
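To avoid copying the key ID by hand, it can be cut out of the 'pub' line. This is a minimal sketch, assuming the older listing format shown above (key size and type, a slash, then the short key ID). Newer GnuPG releases print keys differently, and extract_key_id is a hypothetical helper name.

```shell
# Hypothetical helper: print the short key ID from a legacy-format 'pub' line.
# Reads lines on standard input, for example from: gpg2 --list-keys
extract_key_id() {
    # Field 2 looks like 2048R/AE776EB5; keep the part after the slash.
    awk '$1 == "pub" { split($2, parts, "/"); print parts[2] }'
}

# Example using the sample line from this page:
echo 'pub   2048R/AE776EB5 2019-04-08' | extract_key_id
```

With a real keyring, the same function could be fed from 'gpg2 --list-keys | extract_key_id', which prints one ID per public key.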

Pull down and edit helper scripts to ease configuration of Duplicity

cd ~/
git clone https://github.com/zertrin/duplicity-backup.sh.git
cd duplicity-backup.sh/
cp duplicity-backup.conf.example  duplicity-backup.conf

At minimum, edit the following entries in the duplicity-backup.conf file. Do not include the notes in parentheses.


ROOT='/home/<your-username>/<directory-to-backup>' (directory to backup)
DEST="s3+http://<aws-bucket-name>" (this bucket will automatically be created on first run)
AWS_ACCESS_KEY_ID="AKIAX4EICHL6ZH3XXXXX" (IAM access keys for bucket)
AWS_SECRET_ACCESS_KEY="UMSXSBP70gRIkExXM/6fo2OCHds6qAJMhcXXXXXX"
STORAGECLASS="--s3-use-ia" (optional: uncommenting this will use S3's infrequent access storage class to save on cost)
ENCRYPTION='yes'
PASSPHRASE='<your GPG key passphrase selected during GPG key creation>'
GPG_ENC_KEY="AE776EB5" (public GPG key ID)
GPG_SIGN_KEY="AE776EB5" (public GPG key ID)
GPG_OPTIONS="--list-options no-show-photos"
STATIC_OPTIONS="--full-if-older-than 14D --s3-use-new-style"
CLEAN_UP_TYPE="remove-all-but-n-full"
CLEAN_UP_VARIABLE="4"
LOGDIR="/home/<your-username>/backup-logs/"
LOG_FILE="duplicity-$(date +%Y-%m-%d_%H-%M).txt"
TMPDIR="/central/scratch/<your-username>"
EMAIL_TO="<your-email-address>"
EMAIL_FROM=backups@hpc.caltech.edu
EMAIL_FAILURE_ONLY="no" (send emails whenever job runs, set to "yes" after testing if you only want emails when jobs fail)
MAIL="mailx"

Save the conf file.
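Before the first run, it may be worth checking that no <placeholder> text from this page was left in the conf file. The sketch below is one way to do that with a hypothetical helper, check_conf. It assumes that any uncommented line still containing angle brackets is an unfilled placeholder, which may not hold if your configuration uses those characters for another reason.

```shell
# Hypothetical helper: list uncommented lines that still contain <placeholder>.
# Exits non-zero if any such line is found.
check_conf() {
    awk '
        /^[ \t]*#/ { next }    # skip comment lines
        /<[^>]+>/  { printf "%s:%d: %s\n", FILENAME, FNR, $0; bad = 1 }
        END        { exit (bad ? 1 : 0) }
    ' "$1"
}

# Example:
#   check_conf ~/duplicity-backup.sh/duplicity-backup.conf && echo "no placeholders left"
```

Because the file holds the AWS secret key and the GPG passphrase, restricting it with 'chmod 600 duplicity-backup.conf' is also worth considering.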

If the TMPDIR directory set above does not exist, you must create it before the first backup is run. See documentation links at the end of this page.

mkdir /central/scratch/<your-username>

Run a test backup.


Now run a full test backup to ensure everything works as intended.

/home/<your-username>/duplicity-backup.sh/duplicity-backup.sh -c /home/<your-username>/duplicity-backup.sh/duplicity-backup.conf --full
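Once the test backup finishes, it can help to confirm what actually landed in the bucket. The sketch below wraps the long command line in a hypothetical function, run_backup_tool, and assumes the --collection-status and --list-current-files options listed in the duplicity-backup.sh usage text. Check the usage text of your copy of the script before relying on them. BACKUP_DIR is an assumed variable pointing at the cloned repository.

```shell
# Assumed location of the cloned helper scripts; override if yours differs.
BACKUP_DIR="${BACKUP_DIR:-$HOME/duplicity-backup.sh}"

# Hypothetical wrapper: run duplicity-backup.sh with the conf file from this
# page and pass any further options straight through.
run_backup_tool() {
    "$BACKUP_DIR/duplicity-backup.sh" -c "$BACKUP_DIR/duplicity-backup.conf" "$@"
}

# Examples (not run here):
#   run_backup_tool --collection-status     # show the backup sets in the archive
#   run_backup_tool --list-current-files    # list the files in the latest backup
```

Keeping the path and conf file in one place also shortens later commands, since only the final option changes.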

Create a crontab entry to automate backup. 


crontab -e

Add the following line to run the backup once per day at 3:41 AM. You should receive an email message from cron whenever the backup script is run. Be sure to periodically verify that your backups are running correctly, and run test recoveries often to ensure you can properly recover and decrypt data from the backups.

41 3 * * * /home/<your-username>/duplicity-backup.sh/duplicity-backup.sh -c /home/<your-username>/duplicity-backup.sh/duplicity-backup.conf -b

Save the cron file. This will run your backup every day at 3:41 AM.
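For the periodic test recovery mentioned above, one approach is to restore into a scratch directory and compare the result with the source. The sketch below assumes the --restore option described in the duplicity-backup.sh usage text (confirm it in your copy), and compare_restore is a hypothetical helper built on 'diff -rq'. The restore-test directory name is a placeholder.

```shell
# Hypothetical helper: succeed only if the restored tree matches the source tree.
compare_restore() {
    src="$1"
    restored="$2"
    # diff -rq walks both trees and names any file that differs or is missing.
    if diff -rq "$src" "$restored"; then
        echo "Restore matches $src"
    else
        echo "Restore differs from $src" >&2
        return 1
    fi
}

# Example (not run here; paths are placeholders):
#   /home/<your-username>/duplicity-backup.sh/duplicity-backup.sh \
#       -c /home/<your-username>/duplicity-backup.sh/duplicity-backup.conf \
#       --restore /central/scratch/<your-username>/restore-test
#   compare_restore /home/<your-username>/<directory-to-backup> \
#       /central/scratch/<your-username>/restore-test
```

Files changed since the last backup will show up as differences, so a mismatch is a prompt to look closer rather than proof of a bad backup.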

Links: