Backing up to DreamObjects with Duplicity

The other day I was talking with my wife and she revealed a dirty little secret. She had not backed up the pictures on her phone or camera since our daughter was born in August. Meet Chloe!

Baby Chloe

It’s really more my fault than hers. I’m the one who works with technology professionally, and I certainly know a thing or two about backing things up, given that I’m one of the System Engineers who helped create DreamObjects, a cloud storage service here at DreamHost. I’m not going to go into too much detail about why you should do backups: devices get lost and devices die. In fact, having worked in data center operations previously, I’ve personally shipped boxes with dozens of failed hard drives back to the manufacturer for RMA.

drives

I decided that I wanted to use a product I helped build to archive my family’s growing collection of memories because I’m extremely confident about its durability. DreamObjects uses open source storage software named Ceph, which began as a PhD thesis by one of DreamHost’s co-founders, Sage Weil. After years of being an exciting research project and continued development here at DreamHost, Ceph is now running in production and trusted by an ever-growing number of customers. Ceph uses the CRUSH algorithm to replicate data N times (currently we set this to 3) across racks and racks of servers according to a placement map that assures availability despite the loss of various common failure domains, e.g. a server, a rack or, in our design, an entire row of servers.

crush

Many people use FTP to upload files to their hosting account at DreamHost; I mean, it’s not called File Transfer Protocol for nothing. DreamObjects is a bit different in that you can’t use FTP, SFTP or SSH to copy files to the service. Don’t worry, that doesn’t mean you have to know a programming language in order to use it! DreamObjects provides several RESTful APIs: one that is compatible with Amazon S3 and another that is compatible with OpenStack Storage (Swift). There are a lot of readily available tools that work with compatible services like DreamObjects. After a bit of research and talking with co-workers, I decided that I wanted to try using Duplicity, since it has a lot of things that appealed to me:

  • Open Source (free as in speech and free as in beer)
  • Encryption
  • Incremental backups

I’ve always been a huge fan of open source software and certainly don’t mind digging through code on occasion. The freedom to understand how an application works and modify it if the need arises is invaluable. Anyone who has been around me for a while knows that I’m a bit paranoid when it comes to security. I use whole disk encryption, put anti-virus software on my Macs, PGP sign every email I send and slather every Linux kernel within arm’s reach with a copious amount of grsecurity. Incremental backups are fantastic if you want to keep backing up your ever-growing collection of digital media without performing a complete backup each and every time.

Getting Started

I’m setting up backups for a Mac in this article, although the setup would be almost identical if you’re running Linux on your desktop or if you were backing up your site from one of our shared hosting, virtual private or dedicated servers. First you will need to set up a plan for DreamObjects; if you already have an account with us you can do so from the panel here, otherwise you can sign up and then head over to the panel. Once you have a plan you will want to create a user. After submitting a user name you’ll have to periodically refresh the page to see if the user is available yet. Once you have a DreamObjects user, click the “1 keys” button on the right side and copy down the user key, then reveal the secret key and copy that too.

panel

Since we need to build the duplicity application, we need to install a couple of development utilities. The first tool you will need is Xcode, which is available for download from the Mac App Store. The next tool you need to install is a package manager called Homebrew; paste their one-liner into Terminal:


ruby -e "$(curl -fsSkL raw.github.com/mxcl/homebrew/go)"

You’re getting closer! Now that you have Xcode and Homebrew you are ready to install the dependencies for duplicity. In the same Terminal paste the following:


brew install librsync python gpg ncftp

Once homebrew finishes doing what it does best you’ll finally be ready to download Duplicity. Afterwards extract it by using this command in Terminal:


tar xvzf duplicity-0.6.20.tar.gz

Now it’s time to install the Python libraries and build duplicity. Again, paste the following into Terminal:


pip install boto httplib2 oauth
cd duplicity-0.6.20
sudo python setup.py install
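
Before moving on, it can be worth confirming that everything actually landed on your PATH. This is only a minimal sketch: missing_tools is a helper name I made up for illustration, not something that ships with duplicity or Homebrew.

```shell
#!/bin/sh
# Hypothetical helper (not part of duplicity or Homebrew):
# prints the name of each command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage: prints nothing when every tool is installed
# missing_tools duplicity gpg python pip
```

If a name is printed, rerun the install step for that tool before continuing.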

Now we have to take our user key and secret key and put them into a file named ‘.boto’ in our user’s home directory. You can do this with the following commands if you substitute your own keys:


echo "[Credentials]" >> ~/.boto
echo "aws_access_key_id = 98F3n8qUtWEJ6ZdBYyQy" >> ~/.boto
echo "aws_secret_access_key = p5kptXKQrsQtTNJTYtG7emGYooXkN6Kaza1OV-_s" >> ~/.boto
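
One caveat with appending via echo is that running the commands twice leaves duplicate entries, and the file may end up readable by other users on the machine. Here is a minimal sketch of an alternative that writes the file in one go and restricts its permissions. The function name write_boto_config is my own invention for illustration, not part of boto or duplicity.

```shell
#!/bin/sh
# Hypothetical helper (not part of boto or duplicity): writes a fresh
# credentials file instead of appending to an existing one.
write_boto_config() {
    file=$1
    access_key=$2
    secret_key=$3
    # Create the file with a restrictive umask so the secret is never
    # world-readable, even briefly.
    (
        umask 077
        cat > "$file" <<EOF
[Credentials]
aws_access_key_id = $access_key
aws_secret_access_key = $secret_key
EOF
    )
    # Also tighten permissions in case the file already existed.
    chmod 600 "$file"
}

# Usage (substitute your own keys):
# write_boto_config ~/.boto "YOUR_ACCESS_KEY" "YOUR_SECRET_KEY"
```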

Now you can back up any directory you desire to a bucket name of your choice. In this example I’m backing up my Pictures directory to the blogdemo bucket:


duplicity --allow-source-mismatch ~/Pictures \
s3://objects.dreamhost.com/blogdemo

It prompts for an encryption passphrase, which you won’t want to forget, and then syncs your files. The output should look something like this:


GnuPG passphrase:
Retype passphrase to confirm:
No signatures found, switching to full backup.
--------------[ Backup Statistics ]--------------
StartTime 1356046824.18 (Thu Dec 20 15:40:24 2012)
EndTime 1356046824.20 (Thu Dec 20 15:40:24 2012)
ElapsedTime 0.02 (0.02 seconds)
SourceFiles 10
SourceFileSize 374 (374 bytes)
NewFiles 10
NewFileSize 374 (374 bytes)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 10
RawDeltaSize 0 (0 bytes)
TotalDestinationSizeChange 295 (295 bytes)
Errors 0
-------------------------------------------------
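
A backup is only as good as your ability to get the files back, so here is a rough sketch of how checking and restoring could look. The function names and the DUPLICITY and BUCKET_URL variables are my own, added for illustration; collection-status, restore and verify are duplicity’s own actions. Setting DUPLICITY=echo gives you a dry run that only prints the arguments.

```shell
#!/bin/sh
# Assumption: DUPLICITY and BUCKET_URL are illustrative variables,
# overridable from the environment (e.g. DUPLICITY=echo for a dry run).
DUPLICITY=${DUPLICITY:-duplicity}
BUCKET_URL=${BUCKET_URL:-s3://objects.dreamhost.com/blogdemo}

# List the backup chains stored in the bucket.
show_backups() {
    "$DUPLICITY" collection-status "$BUCKET_URL"
}

# Restore the most recent backup into the directory given as $1.
# Use a directory that does not exist yet to avoid overwriting files.
restore_backup() {
    "$DUPLICITY" restore "$BUCKET_URL" "$1"
}

# Verify the backup, using the local directory given as $1.
verify_backup() {
    "$DUPLICITY" verify "$BUCKET_URL" "$1"
}

# Usage:
# show_backups
# restore_backup ~/Pictures-restored
```

Restoring prompts for the same passphrase you chose during the backup.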

That’s it! You’ll want to rerun the duplicity backup command whenever you want to update your backups. I prefer to run it manually when I’m importing pictures, but it wouldn’t be difficult to configure cron to automatically back up a directory on your computer.
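
If you do want to go the cron route, something along these lines might work. Treat it as a sketch under a few assumptions: the script name, the backup_pictures function, the DUPLICITY variable and the passphrase file location are all hypothetical choices of mine. Since cron cannot answer a prompt, the sketch relies on duplicity reading the passphrase from the PASSPHRASE environment variable, and it uses duplicity’s --full-if-older-than option to start a fresh full backup once a month.

```shell
#!/bin/sh
# backup-pictures.sh: hypothetical wrapper for unattended backups.
# Assumption: the passphrase lives in a file only you can read (chmod 600).
DUPLICITY=${DUPLICITY:-duplicity}
PASSPHRASE_FILE=${PASSPHRASE_FILE:-$HOME/.duplicity-passphrase}

backup_pictures() {
    # duplicity reads PASSPHRASE from the environment instead of prompting.
    PASSPHRASE=$(cat "$PASSPHRASE_FILE")
    export PASSPHRASE
    # Incremental by default; a new full backup once the last one
    # is more than a month old.
    "$DUPLICITY" --allow-source-mismatch --full-if-older-than 1M \
        "$HOME/Pictures" s3://objects.dreamhost.com/blogdemo
    status=$?
    unset PASSPHRASE
    return $status
}

# To run the backup when this file is executed, add a final line:
#   backup_pictures
# Example crontab entry (edit with `crontab -e`), every night at 02:00:
#   0 2 * * * /bin/sh "$HOME/bin/backup-pictures.sh"
```

Keeping the passphrase in a file is a trade-off: it makes unattended runs possible, but anyone who can read that file can decrypt your backups.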