Summary: I finally managed to set up a proper backup strategy for my DroboFS using Amazon’s S3 (Simple Storage Service).
This is a long post, so bear with me. By the end of it you’ll be one step closer to a perfect 3-2-1 backup strategy (if you do not know what a 3-2-1 backup strategy is, then you really, really ought to read that link).
The one thing that always bugged me about Drobos is that they are not backup devices by themselves. Sure, you can configure your computer to sync with them, but that is just one copy. And (at least for me) the FS is fast enough to use as a primary storage device. In fact, although I had bits and pieces of my photo collection here and there, the FS has become my primary storage location. I’m somewhat ashamed to admit it, but due to laziness the FS has become the only copy I have of some files. Damn you, DRI, for making products so reliable!
It is pretty clear that I trust the Drobo’s ability to keep my files safe. However, it always bugged me that if something bad happened (mishandling the disk pack, having a second disk fail while replacing another, a lightning strike), all my precious data would be gone.
So, like many others in this forum, I set out to figure out a way to easily sync my files with a remote repository. My requirements were:
[list]
[*]Drobo compatible, i.e., the client software would run from inside the FS, if possible as a cron job.
[*]A trustworthy, reputable storage provider.
[*]As cheap as possible.
[/list]
Unfortunately, it seems that the Drobo family is not well supported by companies providing cloud storage. Dropbox (my favorite cloud storage service) has no support for Drobos, and neither does Carbonite. Even Amazon provides no direct support at all (they basically just sell you raw storage).
That was until I found s3cmd, a python script that gives you a simple command-line interface to Amazon’s storage services.
As you can probably imagine by now, I thought: Amazon S3 + s3cmd + cross-compiled python + (GNU Screen or cron job) + Drobo FS = win!
Indeed, as I type this my FS is happily syncing all my important files to an Amazon S3 bucket. Mind you, it is not exactly the fastest thing I’ve ever seen, but as long as it works I couldn’t care less. Most likely, the deltas from now on will be much faster anyway.
So here is the step-by-step guide:
[list=1]
[*]Don’t subscribe to Amazon S3 just yet.
[*]Cross-compile python 2.6.6 and bring it over to your Drobo FS as explained here. Make sure that it does work.
[*]Download s3cmd 1.0.0 and place it somewhere in your DroboApps folder, like this:
cd /mnt/DroboFS/Shares/DroboApps/
wget -O s3cmd-1.0.0.tar.gz http://sourceforge.net/projects/s3tools/files/s3cmd/1.0.0/s3cmd-1.0.0.tar.gz/download
tar zxf s3cmd-1.0.0.tar.gz
(The -O is needed because wget would otherwise save the SourceForge link under the name ‘download’.)
[*]Test it by running:
cd s3cmd-1.0.0
./s3cmd
At this point, if python is not on your PATH, you’ll get an error message saying that python could not be found (there is a PATH sketch right after this list if that happens to you). Otherwise you’ll get a message about a missing config file and a hint to use --configure.
[*]This is where you sign up for Amazon S3. Make sure to copy your Access Key and Secret Key; you’ll find them under Account > Security Credentials > Access Credentials.
[*]Start s3cmd again, but this time with the --configure parameter. It will ask you for the access key and the secret key. You’ll also be asked for encryption keys; I haven’t tried enabling encryption (I’ll leave that as an exercise for the reader), so just leave them empty. After that, s3cmd tests the connection, and hopefully it will succeed. If not, make sure that your Amazon S3 registration went through properly.
[*]From here on, you can basically follow the instructions here, but I’m going to go through them in a bit more detail.
[*]First, you have to create a ‘bucket’, which is Amazon’s name for a storage pool. Type something like this:
s3cmd mb s3://some.unique.name.here
Make sure that the name really is unique; bucket names are global across all of S3. If it works, you now have a new bucket called ‘s3://some.unique.name.here’.
[*]Second, to put a file in there, type:
s3cmd put file.ext s3://some.unique.name.here/some/path/file.ext
[*]Finally, to sync a whole folder, type:
s3cmd sync folder s3://some.unique.name.here/some/path/
You’ll probably want to use the --dry-run parameter first, which only prints the list of operations that would be performed. Also, once you are confident about what you are doing, you might want to use --delete-removed, which removes from S3 the files that have been removed locally. Needless to say, use --delete-removed very carefully (for instance, I wouldn’t use it in a cron job). There is a full example right after this list.
[/list]
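If ./s3cmd complains in step 4 that python cannot be found, the quickest fix is to put the cross-compiled python on the PATH before running the script. This is just a sketch: the python location below is an assumption based on my own DroboApps layout, so adjust it to wherever you actually installed yours.
# Hypothetical python location; change it to your actual install path
export PATH=/mnt/DroboFS/Shares/DroboApps/python2/bin:$PATH
cd /mnt/DroboFS/Shares/DroboApps/s3cmd-1.0.0
./s3cmd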
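And here is roughly what a manual sync session looks like, referring to the last step above. The share name (‘Photos’) and the bucket name are just placeholders; replace them with your own. Note how --dry-run lets you check the list of operations before any data actually moves:
cd /mnt/DroboFS/Shares/DroboApps/s3cmd-1.0.0
# Preview only: prints what would be uploaded/deleted, transfers nothing
./s3cmd sync --dry-run /mnt/DroboFS/Shares/Photos s3://some.unique.name.here/Photos/
# The real run; add --delete-removed only when you are sure about the result
./s3cmd sync /mnt/DroboFS/Shares/Photos s3://some.unique.name.here/Photos/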
And that is it! As I said, it might take a while if you have a lot of data, but so far it has been working perfectly fine.
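Since my original goal was to run this as a cron job, here is a sketch of what such an entry could look like. It assumes cron is available on your FS, that python and s3cmd live in the paths used above, and that the -c option points at wherever --configure wrote your config file (adjust all of these to your setup). In line with the warning above, I leave --delete-removed out of it:
# Hypothetical crontab entry: sync the Photos share to S3 every night at 03:00
0 3 * * * PATH=/mnt/DroboFS/Shares/DroboApps/python2/bin:$PATH /mnt/DroboFS/Shares/DroboApps/s3cmd-1.0.0/s3cmd -c /root/.s3cfg sync /mnt/DroboFS/Shares/Photos s3://some.unique.name.here/Photos/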
P.S.: Python maxes out the Drobo’s CPU, so you might want to lower its priority with something like this:
renice -n 5 -p <PID of the python process>
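To find that PID, something along these lines should work with the BusyBox tools on the FS (the exact ps output may vary between firmware versions):
ps | grep '[p]ython'
# The first column of the output is the PID; plug that number into renice, e.g.:
renice -n 5 -p 1234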