Encrypted offsite backup with EncFS, Amazon S3, and s3cmd
Stolen from here: http://shrp.me/docs/encrypted_offsite_backup.php
I’ve been using Jungle Disk to do offsite backups of my data. Jungle Disk uses Amazon’s cheap online storage service, Amazon S3, to host backups. (15 cents a gig!) I don’t like Jungle Disk because it’s not open source and because the app is a little clunky, at least on Linux. I recently found that s3cmd could do an rsync-like sync of a directory. That’s cool, but it doesn’t do encrypted backups like Jungle Disk did. In this post, I’ll demonstrate how to make an encrypted backup of locally unencrypted data using EncFS, Amazon S3, and s3cmd.
Enter EncFS. EncFS transparently encrypts files with AES, exposing an encrypted local directory through a FUSE mountpoint. That means I could have an encrypted directory, like /home/user/encrypted, and an EncFS mountpoint at /home/user/unencrypted. The unencrypted mountpoint would show all the plaintext (unencrypted) data, and the encrypted directory would contain a mirror of its directory structure as well as all of the individual files, except that the file names and contents are encrypted. (Note that this could be a disadvantage of EncFS depending on your needs: the file contents and filenames are scrambled, but an attacker who has obtained your still-encrypted data can see approximate file sizes, approximate file name lengths, and file attributes. Jungle Disk shares these disadvantages with its encryption.) More on EncFS here…
You might already see how EncFS makes it really easy to back up your encrypted data without any hassle, but what if you already have a ton of unencrypted files which you don’t care to encrypt on your local disk? Well, EncFS has a cool little “reverse” mode that lets you create an encrypted mountpoint from an unencrypted directory, suitable for rsyncing against or, in this case, for using s3cmd sync with.
How to do it
Before you get started, you need an Amazon S3 account. You can sign up here if you’re not signed up already. You should also have a modern Linux distro with FUSE, as well as encfs and the s3cmd utility. Now let’s go to a terminal and configure s3cmd:
    sharp@blue:~$ s3cmd --configure
    Enter new values or accept defaults in brackets with Enter.
    Refer to user manual for detailed description of all options.
    Access key and Secret key are your identifiers for Amazon S3
    Access Key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    Secret Key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    Encryption password is used to protect your files from reading
    by unauthorized persons while in transfer to S3
    Encryption password: (just hit enter, if you want)
    Path to GPG program [/usr/bin/gpg]: (hit enter)
    When using secure HTTPS protocol all communication with Amazon S3
    servers is protected from 3rd party eavesdropping. This method is
    slower than plain HTTP and can't be used if you're behind a proxy
    Use HTTPS protocol [No]: Yes
    New settings:
      Access Key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
      Secret Key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
      Encryption password:
      Path to GPG program: /usr/bin/gpg
      Use HTTPS protocol: True
      HTTP Proxy server name:
      HTTP Proxy server port: 0
    Test access with supplied credentials? [Y/n] y
    Please wait...
    Success. Your access key and secret key worked fine
    Now verifying that encryption works...
    Not configured. Never mind.
    Save settings? [y/N] y
    Configuration saved to '/home/sharp/.s3cfg'
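If you want to double-check the prerequisites before going further, something like the following might help. It is only a sketch: the function name missing_tools is made up for this example, and it checks nothing more than whether each command is on your PATH.

```shell
# Print the name of every listed tool that is not on PATH, one per line.
# (missing_tools is a hypothetical helper, not part of encfs or s3cmd.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The three commands this walkthrough relies on.
missing=$(missing_tools encfs s3cmd fusermount)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names land on one line.
    echo "Missing tools:" $missing
else
    echo "All tools found."
fi
```

Anything it reports as missing should be installable from your distro's package manager.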
In the s3cmd --configure session above, my access key and secret key are blocked out with Xs. These are unique to your account and can be found at this page. Now that s3cmd is configured and working, we can make a bucket to keep our backup. (You can keep multiple backups per bucket.) Keep in mind that bucket names are shared across all of S3, so nobody else may already be using the name you choose and you’ll have to pick one that’s unique. This is because lots of S3 users make whatever content is in their buckets public (although the default is to keep it private). So let’s create our bucket:
    sharp@blue:~$ s3cmd mb s3://sharpbackup
    Bucket 'sharpbackup' created
Now we need a temporary directory to mount the encrypted filesystem on.
    sharp@blue:~$ mkdir Music_enc
You might make this in /tmp, especially if you are scripting the process. In this example I’m trying to back up my music, which is in /home/sharp/Music, so I’ve given the mountpoint the name /home/sharp/Music_enc. Now finally we can create our key and reverse mount this unencrypted directory to an encrypted mountpoint. Be sure to use the full path of both the directory you are backing up and the mountpoint.
    sharp@blue:~$ encfs --reverse /home/sharp/Music /home/sharp/Music_enc
    Creating new encrypted volume.
    Please choose from one of the following options:
     enter "x" for expert configuration mode,
     enter "p" for pre-configured paranoia mode,
     anything else, or an empty line will select standard mode.
    ?> (press enter here)
    Standard configuration selected.
    --reverse specified, not using unique/chained IV
    Configuration finished. The filesystem to be created has
    the following properties:
    Filesystem cipher: "ssl/aes", version 2:1:1
    Filename encoding: "nameio/block", version 3:0:1
    Key Size: 192 bits
    Block Size: 1024 bytes
    Now you will need to enter a password for your filesystem.
    You will need to remember this password, as there is absolutely
    no recovery mechanism. However, the password can be changed
    later using encfsctl.
    New Encfs Password: (enter password here)
    Verify Encfs Password: (again...)
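Since the full paths matter here, a small helper can build the command for you from relative names. This is a rough sketch: abs_dir and reverse_mount_cmd are names invented for this example, and the function only prints the encfs command so you can look it over before running it yourself.

```shell
# Resolve a directory to its absolute path; fails if it does not exist.
abs_dir() {
    (cd "$1" 2>/dev/null && pwd)
}

# Print (rather than run) the encfs reverse-mount command with both
# arguments expanded to full paths. Paths containing spaces would need
# quoting before you paste the printed command into a shell.
reverse_mount_cmd() {
    src=$(abs_dir "$1") || { echo "no such directory: $1" >&2; return 1; }
    mnt=$(abs_dir "$2") || { echo "no such directory: $2" >&2; return 1; }
    printf 'encfs --reverse %s %s\n' "$src" "$mnt"
}
```

Run from /home/sharp, `reverse_mount_cmd Music Music_enc` should print the same command used in the session above, as long as both directories already exist.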
Now we’ll want to back up the EncFS config file. EncFS puts this file in the rootDir you specify, which in reverse mode is the unencrypted directory (here, /home/sharp/Music). It contains the key used to decrypt the file system. The key itself is encrypted with your EncFS password, so if Mallory gets this file, he’ll still need your password. If you’re paranoid you can keep this file out of the bucket and put it somewhere safe, but since I’m not that paranoid about keeping my music unreadable, and because I don’t want to lose it (we are making a backup, after all), I’ll put it in my bucket:
    sharp@blue:~$ s3cmd put Music/.encfs6.xml s3://sharpbackup/music.xml
    File 'Music/.encfs6.xml' stored as s3://sharpbackup/music.xml (911 bytes in 0.0 seconds, 3.28 MB/s) [1 of 1]
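If you are on the paranoid side and would rather keep the config file somewhere of your own instead of (or as well as) the bucket, a sketch like this could do it. The function name save_encfs_config and the idea of naming the copy after a label are assumptions made for this example; only the .encfs6.xml file name comes from EncFS itself.

```shell
# save_encfs_config SRC_DIR LABEL DEST_DIR
# Copy SRC_DIR/.encfs6.xml to DEST_DIR/LABEL.xml and make both the
# directory and the copy readable by the current user only.
save_encfs_config() {
    src_dir=$1; label=$2; dest_dir=$3
    [ -f "$src_dir/.encfs6.xml" ] || {
        echo "no .encfs6.xml in $src_dir" >&2
        return 1
    }
    mkdir -p "$dest_dir" && chmod 700 "$dest_dir" || return 1
    cp "$src_dir/.encfs6.xml" "$dest_dir/$label.xml" &&
        chmod 600 "$dest_dir/$label.xml"
}
```

For example, `save_encfs_config /home/sharp/Music music /media/usbkey/encfs-keys` would leave a copy at /media/usbkey/encfs-keys/music.xml (the USB path is just a placeholder). Without this file and your password the backup cannot be decrypted, so keep at least one copy you can reach after a disk crash.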
Now that the file is safe, we can use s3cmd sync to sync all the encrypted versions of the files to the bucket.
    sharp@blue:~$ s3cmd --delete-removed sync Music_enc/ s3://sharpbackup/music
    Compiling list of local files...
    Retrieving list of remote files...
    Found 11 local files, 0 remote files
    Verifying checksums...
    Summary: 11 local files to upload, 0 remote files to delete
    ...
…and we’re done. If you stop this command and then start it again, it will pick up where it left off. That’s actually true at any point in this process. You can even change, add, or delete files, and s3cmd will only transfer what it needs to bring the backup up to date. This is the beauty of using EncFS with an rsync-like system. One last thing: when you’re done, you should unmount the EncFS mountpoint:
    sharp@blue:~$ fusermount -u Music_enc/
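The backup steps above can be strung together in a small shell function. Treat this as a sketch under a few assumptions rather than a finished tool: the names run, backup_dir and DRY_RUN are invented here, the mountpoint is simply the source directory with "_enc" appended (as in the example), and encfs will still prompt for the password interactively.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# backup_dir SRC BUCKET PREFIX
# SRC must be a full path; the mountpoint is SRC with "_enc" appended.
backup_dir() {
    src=${1%/}; bucket=$2; prefix=$3
    mnt="${src}_enc"
    case "$src" in
        /*) ;;
        *) echo "use the full path of the directory: $src" >&2; return 1 ;;
    esac
    run mkdir -p "$mnt" || return 1
    # Prompts for the EncFS password (and creates the key on first use).
    run encfs --reverse "$src" "$mnt" || return 1
    # Upload the config file first, then the encrypted view. Remember
    # the result so the unmount below happens even if a transfer fails.
    rc=0
    run s3cmd put "$src/.encfs6.xml" "s3://$bucket/$prefix.xml" &&
        run s3cmd --delete-removed sync "$mnt/" "s3://$bucket/$prefix" ||
        rc=$?
    run fusermount -u "$mnt"
    return "$rc"
}
```

Running `DRY_RUN=1 backup_dir /home/sharp/Music sharpbackup music` prints the five commands without touching anything, which is a cheap way to check the paths before doing a real run without DRY_RUN.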
Restoring the backup
Now let’s pretend our hard disk crashes and we’ve lost all our data. We install Linux along with EncFS and s3cmd. At this point you could sync back all the data and use it like a regular EncFS folder. The problem is that we never intended for the data to be encrypted locally, and it would be a hassle to mount it as a regular EncFS folder and copy all the data out of there. Luckily we can reverse mount the same way we did before and sync all our music back. First, let’s create our folders:
    sharp@blue:~$ mkdir Music
    sharp@blue:~$ mkdir Music_enc
Now we have to pull our config file back into the directory we want all of our files to go into:
    sharp@blue:~$ s3cmd get s3://sharpbackup/music.xml Music/.encfs6.xml
    Object s3://sharpbackup/music.xml saved as 'Music/.encfs6.xml' (911 bytes in 0.0 seconds, 1569.16 kB/s)
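With the config file in place, reverse mount Music onto Music_enc using the same encfs --reverse command as before, entering your original password (this step is not shown in the session below). Then all we have to do is sync the encrypted files back into our Music_enc directory, and EncFS handles the rest: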
    sharp@blue:~$ s3cmd sync s3://sharpbackup/music Music_enc/
    Retrieving list of remote files...
    Compiling list of local files...
    Found 11 remote files, 1 local files
    Verifying checksums...
    Summary: 11 remote files to download, 1 local files to delete
    not-deleted 'UO5JPyI9Q3Q7hcnRW0kz8d6H'
    ...
    sharp@blue:~$ cd Music
    sharp@blue:~/Music$ ls
    Minor Threat
    sharp@blue:~$ fusermount -u Music_enc/
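The restore can be sketched as a shell function too. The names run, restore_dir and DRY_RUN are invented for this example, and the mountpoint is again the target directory with "_enc" appended. It assumes the reverse mount accepts writes as it did in the session above, which may not hold on every EncFS version.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# restore_dir DEST BUCKET PREFIX
# DEST must be a full path; the mountpoint is DEST with "_enc" appended.
restore_dir() {
    dest=${1%/}; bucket=$2; prefix=$3
    mnt="${dest}_enc"
    case "$dest" in
        /*) ;;
        *) echo "use the full path of the directory: $dest" >&2; return 1 ;;
    esac
    run mkdir -p "$dest" "$mnt" || return 1
    # The config file must be back in place before mounting, otherwise
    # encfs would offer to create a new volume with a different key.
    run s3cmd get "s3://$bucket/$prefix.xml" "$dest/.encfs6.xml" || return 1
    # Prompts for the original EncFS password.
    run encfs --reverse "$dest" "$mnt" || return 1
    rc=0
    run s3cmd sync "s3://$bucket/$prefix" "$mnt/" || rc=$?
    run fusermount -u "$mnt"
    return "$rc"
}
```

As with the backup, `DRY_RUN=1 restore_dir /home/sharp/Music sharpbackup music` only prints the commands, so you can compare them against the session above before running anything.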
Final thoughts
- EncFS makes a ton of stuff like this really easy. You could do pretty much the same process with rsync and rsync.net, or with rsync and another FUSE filesystem like sshfs or GmailFS, although I wouldn’t recommend the latter because Google frowns upon that sort of thing and is known to remove accounts that use tons of bandwidth. The upside to S3 is that it is cheap storage.
- This whole process can be easily scripted. I may (or may not) be releasing a script soon that does this whole thing if you give it a directory you want to back up, the name of a bucket, and a prefix.
- Metadata (file size, file name length, attributes, etc.) is still easy to see. The contents and file names may be encrypted, but it is not hard to figure out that a bunch of folders containing 10 or so files of about 2-4 megs each are folders containing music.
- Backing up folders already encrypted with EncFS is even easier. Just sync them.
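To get a feel for how much the metadata gives away, you can summarize a directory tree by file count and total size per folder. This is only an illustrative sketch: dir_profile is a made-up name, and it skips dotfiles because of how the shell expands `*`.

```shell
# Print "N files, M bytes: DIR" for every directory under the given
# root that directly contains at least one regular file.
dir_profile() {
    find "$1" -type d | while IFS= read -r dir; do
        count=0
        total=0
        for f in "$dir"/*; do
            [ -f "$f" ] || continue
            size=$(wc -c < "$f")
            count=$((count + 1))
            total=$((total + size))
        done
        if [ "$count" -gt 0 ]; then
            printf '%s files, %s bytes: %s\n' "$count" "$total" "$dir"
        fi
    done
}
```

Running it once on Music and once on the mounted Music_enc should show the same shape in both: the folder names differ, but the counts and sizes line up closely enough to make it obvious what kind of data is being stored.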