DroboFS backup on Amazon S3

Summary: I finally managed to get a proper backup strategy for my DroboFS using Amazon’s S3 (Simple Storage Service).

This is a long post, so bear with me. By the end of it you’ll be one step closer to a perfect 3-2-1 backup strategy (if you do not know what a 3-2-1 backup strategy is, then you really, really ought to read that link).

The one thing that always bugged me about Drobos is that they are not backup devices by themselves. Sure, you can configure your computer to sync with them, but that is just one copy. And (at least for me) the FS is fast enough to use as a primary storage device. In fact, although I had bits and pieces of my photo collection here and there, the FS has become my primary storage location. I’m somewhat ashamed to admit it, but due to laziness the FS has become the only copy I have of some files. Damn you DRI for making products so reliable! :slight_smile:

It is pretty clear that I trust the Drobo’s ability to keep my files safe; however, it always bugged me that if something bad happened (mishandling the disk pack, having a second disk fail while replacing another, lightning striking) all my precious data would be gone.

So, like many others in this forum, I set out to figure out a way to easily sync my files with a remote repository. My requirements were:

[list]
[*]Drobo compatible, i.e., that the client software would run from inside the FS, if possible as a cron job.
[*]Trustworthy, reputable storage provider.
[*]As cheap as possible.
[/list]

Unfortunately, it seems that the Drobo family is not well known by companies providing cloud storage. Dropbox (my favorite cloud storage service) has no support for Drobos. Carbonite doesn’t either. Even Amazon does not provide support at all (they basically just sell you storage).

That was until I found s3cmd. S3cmd is a python script that provides you with a simple command-line interface to Amazon’s storage services.

As you can probably imagine by now, I thought: Amazon S3 + s3cmd + cross-compiled python + (GNU Screen or cron job) + Drobo FS = win!

Indeed, as I type this my FS is happily syncing all my important files to an Amazon S3 bucket. Mind you, it is not exactly the fastest thing I’ve ever seen, but as long as it works I couldn’t care less. Most likely, the deltas from now on will be much faster anyway.

So here is the step-by-step guide:
[list=1]
[*]Don’t subscribe to Amazon S3 just yet.
[*]Cross-compile python 2.6.6 and bring it over to your Drobo FS as explained here. Make sure that it does work.
[*]Download s3cmd 1.0.0 and place it somewhere in your DroboApps folder, like this:
[code]
cd /mnt/DroboFS/Shares/DroboApps/
wget http://sourceforge.net/projects/s3tools/files/s3cmd/1.0.0/s3cmd-1.0.0.tar.gz/download
tar zxf s3cmd-1.0.0.tar.gz
[/code]
[*]Test it by running:
[code]
cd s3cmd-1.0.0
./s3cmd
[/code]
At this point, if python is not on the PATH variable, you’ll get an error message indicating that python could not be found. Otherwise you’ll get a message about a missing config file and a suggestion to use --configure.
[*]This is where you sign up for Amazon S3. Make sure to copy your Access Key and Secret Key. You find those in Account > Security Credentials > Access Credentials.
[*]Start s3cmd again, but this time using the --configure parameter. It’ll ask you for the access key and the secret key. You’ll also be asked for encryption keys. I haven’t tried to enable encryption (I’ll leave that as an exercise to the readers :slight_smile: ), so just leave it empty. After that, s3cmd tests the connection and hopefully it’ll be successful. If not, make sure that your Amazon S3 registration went through properly.
[*]From now on, you can basically follow the instructions here, but I’m going to proceed with a bit more detail.
[*]First, you have to create a ‘bucket’, which is Amazon’s name for a storage pool. Type something like this:
[code]
s3cmd mb s3://some.unique.name.here
[/code]
Make sure that the unique name part is really unique. If it works, you have a new bucket called ‘s3://some.unique.name.here’.
[*]Second, to put a file in there, type:
[code]
s3cmd put file.ext s3://some.unique.name.here/some/path/file.ext
[/code]
[*]Finally, to sync a folder, type:
[code]
s3cmd sync folder s3://some.unique.name.here/some/path/
[/code]
You’ll probably want to use the parameter --dry-run, which outputs the list of operations that would be performed. Also, when you are confident about what you are doing, you might want to use --delete-removed, which removes from S3 the files that have been removed locally. Needless to say, use --delete-removed very carefully (for instance, I wouldn’t use it in a cron job; see the sample wrapper script right after this list).
[/list]
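To show how the sync in step 10 can run unattended, here is a minimal sketch of a cron wrapper. Everything in it is an assumption you will need to adjust: the location of the cross-compiled python, the s3cmd directory, the share to back up, the log file and the bucket name. It deliberately leaves out --delete-removed, for the reason given above.

[code]
#!/bin/sh
# s3sync.sh -- minimal nightly sync sketch for the Drobo FS (all paths are examples).

# Make the cross-compiled python visible to s3cmd (adjust to wherever you installed it).
PATH=/mnt/DroboFS/Shares/DroboApps/python2/bin:$PATH
export PATH

S3CMD=/mnt/DroboFS/Shares/DroboApps/s3cmd-1.0.0/s3cmd
BUCKET=s3://some.unique.name.here
LOG=/mnt/DroboFS/Shares/DroboApps/s3sync.log

# Sync one share; no --delete-removed here on purpose.
"$S3CMD" sync /mnt/DroboFS/Shares/Photos/ "$BUCKET/Photos/" >> "$LOG" 2>&1
[/code]

A matching crontab line (assuming the FS’s cron is usable on your unit) could then be 0 3 * * * /mnt/DroboFS/Shares/DroboApps/s3sync.sh, i.e., one sync every night at 3 a.m.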

And that is it! As I said, it might take a while if you have a lot of data, but so far it has been working perfectly fine.

P.S.: Python maxes out the Drobo CPU, so you might want to re-prioritize it using something like this:
[code]
renice -n 5 -p <number of python's process here>
[/code]
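If you don’t feel like looking up the process number by hand, something along these lines should work, assuming the FS’s BusyBox provides pidof (if not, ps piped through grep will get you the same number):

[code]
# Lower the priority of every running python process in one go.
renice -n 5 -p $(pidof python)
[/code]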

It’s awesome to see some progress on setting up the DroboFS to back up to online services. I checked out Amazon’s S3 storage pricing and…well, it would probably be cheaper to set up a DroboFS at a friend’s house. 2TB/mo just for storage is $270/mo. If you managed to back up that entire 2TB during one month then your first bill would include another $200 in transfer fees. If you double that to 4TB storage and 4TB of incoming transfer then that’s $527/mo for storage and $410 in transfer fees ($935 total first month’s bill).

Might be reasonable for a company, but way out of reach for just a home user.

Good point. I’m just backing up about 200 GB of data (some documents, family pictures and movies), which are absolutely worth the cost, i.e., the 20-something dollars per month.

Of course, I’m not an Amazon fanboy. I have seen that Carbonite and Mozy seem to be much cheaper than Amazon, but there is one catch: Amazon allows me to make any file publicly available, while as far as I know both Carbonite and Mozy don’t.

Dropbox seems to offer the same kind of functionality as Amazon, but it is more expensive.

Does anyone know of other alternatives?

ricardo, it seems Carbonite has recently made sharing possible; they even have mobile apps!

Dropbox is at least an alternative for sharing! Think of it like this:
Sync a subset of your potentially shareable or to-be-shared files to Dropbox and you’re ready to go!
I, for example, keep all the files I work with frequently and daily (and that are not too big) in my Dropbox.
The rest resides on my Drobo, and synchronizes to the cloud. (I’m currently testing out one service, though it sucks performance-wise!)

But did you ever think of working with the Drobo partner Oxygen Cloud? (http://www.oxygencloud.com/)
I think they don’t offer TRUE cloud space yet (but soon?), but you will surely like the integration.
For my part, it is too complicated for me right now, as I have so little knowledge and time at the moment!

Haven’t checked that out, but my point was that from Amazon S3 I can generate publicly accessible URLs that no one needs to log in anywhere to see.
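For the record, this is roughly what I mean; the flag is s3cmd’s --acl-public (-P), and the file and bucket names below are just examples (double-check the resulting permissions with s3cmd info if you try this):

[code]
# Upload a file and mark it world-readable; s3cmd should print the public URL when done.
./s3cmd put --acl-public vacation.jpg s3://some.unique.name.here/public/vacation.jpg
# The URL has the form http://some.unique.name.here.s3.amazonaws.com/public/vacation.jpg
[/code]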

Yeah, but I need a computer to do it. I’m really looking for a solution that would sync everything from the Drobo, no additional hardware involved.

Yeah, I’ve tested their services but it is not exactly what I am looking for. They don’t just share your files. As far as I can tell, all files put in the Oxygen Cloud are encrypted and stored in a separate space, which means duplicated data inside the Drobo… on top of that, it is still quite unclear where Oxygen is going after beta. As I said before, I don’t mind paying, as long as it does what I want for a reasonable price.

Love this solution. I registered here just to get the SDK to hopefully get something working, and here’s the solution fully formed. Thanks ricardo, seriously.

You are very welcome. Since I started using this, I have found a small caveat: big files (i.e., larger than 200 MB) are not synced by s3cmd. It is a requested feature on their website, but for now I manually upload the big files (such as ISO files of homemade DVDs), as they tend not to change at all.
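In case someone wants to script that manual step, here is a rough sketch that finds the big files under a share and uploads them one by one. The share path and bucket are examples, and it assumes the find on the FS accepts a byte-count -size argument:

[code]
#!/bin/sh
# Upload files larger than ~200 MB individually, since sync skips them for me.
S3CMD=/mnt/DroboFS/Shares/DroboApps/s3cmd-1.0.0/s3cmd
SHARE=/mnt/DroboFS/Shares/Photos
BUCKET=s3://some.unique.name.here/Photos

find "$SHARE" -type f -size +209715200c | while read -r f; do
    rel=${f#$SHARE/}                  # path relative to the share
    "$S3CMD" put "$f" "$BUCKET/$rel"
done
[/code]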

I am currently investigating Crashplan. They have a Java client, and as far as I can tell there is a Java VM for the ARM platform floating around somewhere here in the forums.

So far my feeling is that it is possible to extract the Java program from the Crashplan install files and run it directly from the command-line, using a Drobo-compatible JavaVM. Crashplan’s pricing seems to be a bit more reasonable than Amazon’s for large storage pools, so it may be a better option for a lot of people.

I am currently using Crashplan to back up my Drobo via a PC, so being able to cut the PC out of the equation would be great. Pricing on Crashplan is reasonable and it also has the ability to back up to another Crashplan user’s account, which is even more reason to get Crashplan on the Drobo.

Hi Ricardo,

Did you have any luck getting crashplan up and running on the FS?
I’m looking to do the same and am curious to hear if you hit any roadblocks.

Thanks.

Crashplan is a no-go. Although the app is in Java, it depends on a couple of native libraries. One of them is open source, but the other is not, and there is no ARM version.
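If anyone wants to double-check this on their own copy, the file utility tells you what architecture a native library was built for; the path below is a placeholder for wherever the CrashPlan installer unpacks its libraries:

[code]
# Run on any Linux box where the CrashPlan files are unpacked.
file /path/to/crashplan/*.so
# ARM builds report "ARM"; the stock libraries report "Intel 80386" or "x86-64".
[/code]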

I use Crashplan to back up my Drobo’s critical data to two of my friends’ PCs. They also in turn back up to my Drobo.

Free is a good price too! :wink:

But of course, my PC has to handle the traffic to and from the Drobo. I don’t trust my Drobo after it lost all of my data for no apparent reason.

Interesting concept - so you each save/share a percentage of your Drobo space as a backup for the other friend.
Reminds me of the Lots Of Copies Keep Stuff Safe (LOCKSS) method :slight_smile:

Have you guys all switched to Crashplan, or are you still using S3? I am interested in writing a program that will sync folders on my DroboPro FS with Amazon Glacier. Pricing on Glacier is running around $10/TB per month right now, so it makes a very cheap archiving solution. Just wondering if it is worth my time, or if there are other solutions out there now that work as well and are not cost prohibitive.
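Not an answer to whether it is worth your time, but as a point of reference: with the newer AWS command-line tools (which need a full Python; whether they would run on the FS’s cross-compiled interpreter is something I have not tried), a bare-bones Glacier upload is a couple of calls. The vault and file names are examples, and this ignores the whole inventory/retrieval side:

[code]
# Create a vault once, then push archives into it. Glacier hands back an archive ID
# that you must record yourself if you ever want the data back out.
aws glacier create-vault --account-id - --vault-name drobo-archive
aws glacier upload-archive --account-id - --vault-name drobo-archive \
    --body family-photos-2012.tar.gz
[/code]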

I switched to CrashPlan. Mainly because CrashPlan supports multiple versions of files, which would be pretty damn annoying to implement myself. Cost was also a factor, but like you said Amazon Glacier has beat them on that aspect.