Drobo

Solved: Reliable 86MB/s RW Backups With Time Machine and Lion

Been waiting 3 months for Drobo to get their act together so I can backup reliably via Time Machine on OSX Lion. My backups need to be reliable, not randomly disappear into oblivion and I’d like to get a backup to complete at least once. It would also be helpful if they weren’t as slow as molasses.

So yesterday I decided to do something about it: sell my Drobos to Windows users and never purchase or recommend a Drobo again.

This solution is slightly cheaper than a Drobo FS and has more capabilities and drive bays.

Here’s how I did it. It was easier than I expected, and I finally have reliable, fast backups and a bunch of additional features for free.

Drobo FS versus FreeNas comparison for Time Machine backups on Lion.

This comparison is using the same quiescent Network, the same Gigabit switches and the same hard disks, so you can consider it a direct performance comparison between the two. It’s for 1.5TB on a 2.66TB volume (5 x 1TB) using dual disk redundancy.

Bottom line, I managed to figure out what hardware I needed, drive to the shop and buy it, run a few errands, build it, install FreeNas, sleep, zero 5 x 1TB disks, run comparison performance tests, backup 1.5TB of total data encrypted on 4 separate machines via Time Machine, all in less time than it would have taken to backup one machine to a Drobo (750MB of data). (Assuming that the Drobo would complete the backup, which it doesn’t).

The hardware I used: Antec Nine Hundred II V3 Gamer Case (9 x 5.25" bays), Zalman Modular 3 Bay HDD Rack x 2, Asus P8H67-M PRO/CSM Rev 3.0 motherboard, Intel Core i3 2100, 8GB DDR-1333 DIMM, 4GB USB Key, Antec EarthWatts 650W Green PSU. You could also use an old PC to run FreeNas.

Drobo FS

Cost: $693
Drive Bays: 5
Aesthetics: 5 RGB Leds and About 10 Blue Leds. Black Case. Small Case.
iScsi: NO (only available on B800i, B1200i and Drobo Pro)
Supports Time Machine/Lion: Data Robotics claim it does, but it doesn’t due to reliability issues
Supports 10GigE: NO
Remote Replication: NO (only available on B800fs)
Snapshots: NO
Raid: (Mirror, Stripe, Hot Spare, Dual Disk Redundancy)
Network Link Aggregation: NO
Hotswap: YES
Drive Assembly: Plugs straight in
Software Stability: Flakey
Configuration: Dashboard (separate package install and drivers)
Replacing Drives: Easy, however performance takes a hit from now on because of the way Drobo moves data when a drive dies (wipe data and rebuild drobo array from scratch to avoid).
Increasing Volume Capacity: Easy (but again, potential performance hit)

— Performance —
Time taken for the Drobo to be seen after client startup: 2 mins
Mount time: 13 mins
Time to produce list of files on volume: 7 mins (on 2nd try, 1st try timed out)
Copy of 11GB file via AFP: Read 50MB/s Write 14MB/s (the write froze midway for 2 mins (not included in the speed))
Time Machine Backup (750MB): 52Hours (estimate from Time Machine progress bar, as backup never completes)

FreeNas 8.0.1

Cost: $649
Drive Bays: 6 (expandable to 9 bays for $140)
Aesthetics: 16 Blue Leds. Black Case. Large Case.
iScsi: YES
Supports Time Machine/Lion: YES
Supports 10GigE: YES
Remote Replication: YES
Snapshots: YES
Raid (ZFS): (Mirror, Stripe, Hot Spare, Dual Disk Redundancy)
Network Link Aggregation: YES
Hotswap: YES
Drive Assembly: On a pushbutton screwless carrier
Software Stability: Mostly Stable
Configuration: Via a web browser
Replacing Drives: (if using ZFS, a few command lines)
Increasing Volume Capacity: A few command lines

— Performance —
Time taken for the FreeNas server to be seen after client startup: Instant
Mount time: 1 second
Time to produce list of files on volume: 2 seconds
Copy of 11GB file via AFP: Read 86MB/s Write 87MB/s
Time Machine Backup (750MB): 6.5 Hours

Note that in my testing, FreeNas reaches very close to the theoretical speed limit of GigE for both read and write, i.e. 100MB/s transfer. (1GBit / 10 bits per byte (8 + 2 protocol overhead)). The Drobo doesn’t even come close. A good quality GigE switch is a must.
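The rule of thumb above can be written out as a quick calculation. This is only a sketch of the same back-of-envelope arithmetic (10 bits on the wire per payload byte); the function name is made up for illustration, and real protocol overhead varies with frame size and protocol.

```python
def usable_throughput_mb_s(line_rate_bits_per_s, bits_per_payload_byte=10):
    """Approximate payload throughput in MB/s.

    bits_per_payload_byte=10 is the rule of thumb from the post:
    8 data bits plus roughly 2 bits of protocol overhead per byte.
    """
    return line_rate_bits_per_s / bits_per_payload_byte / 1_000_000


gige_limit = usable_throughput_mb_s(1_000_000_000)  # 1 Gbit/s link

# Measured AFP copy speeds from the comparison above, in MB/s.
for name, read, write in [("FreeNas", 86, 87), ("Drobo FS", 50, 14)]:
    print(f"{name}: read {read / gige_limit:.0%}, write {write / gige_limit:.0%} of the estimated limit")
```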

Information about FreeNas can be found here: http://www.freenas.org

Information on configuring Time Machine backups can be found here:
http://www.youtube.com/watch?v=9sT3F5at4cM&feature=youtu.be (ignore the part about terminal, it doesn’t apply).

I’m done with Drobo !

I didn’t know that this was such an issue with OSX TimeMachine and Drobo, until I read the other thread.
That’s real good of you to share your solution with the board, even if it means you can’t use your Drobo anymore.

What was your ZFS array configuration? With 5 1TB drives, if you built for both a hot spare and raidz2 then you might have been better served by a mirror configuration with a hot spare. The annoying bit is that ZFS raid types are upgradable but not mutable: you can turn a mirror into a raidz into a raidz2, but you can never step the other way.

I just went through a failure of a raidz2 system. Due to the size of the array and the complexity of the parity rebuild operations, the system could not recover to a steady state before consecutive drive failures killed the data.

ZFS has a significant cost to rebuild. A 2.2TB array with raidz2 had a 36 hour scrub time before you start taking errors into consideration. Most desktop boards and desktop drives do not handle transient errors well without a lot of tuning.

In short, be careful with what you just built. You need to consider a lot because you just made yourself a systems engineer.

So I’ve been doing some testing. I’m using RaidZ2. (can survive 2 disk faults, 5 x 1TB).

To test, I pulled a disk whilst the machine was live. Then I zero’d it on another machine to simulate replacing it with a new disk.

Technically didn’t need to scrub, but did anyway. The volume was degraded as expected but not offline. Scrub took just under 3 hours. Then replaced the disk and resilvered. The resilver took just under 3 hours too (which would be around the same as a Drobo or any Raid config for that matter, i.e. approaching the speed of the IO to the disk.).

I also tried pulling a drive, and inserting it back in without wiping, ZFS recovered fine by itself (Drobo’s don’t do that without having to copy between the drives).

So overall, no issues in regards to fault tolerance, it worked fine during my testing. Personally I always like dual disk redundancy , purely because of what you illustrated, during rebuilds you can always lose another disk.

Not sure why your scrub took so long, I’m wondering if you had 2 disks faulty for a while. Did you have SMART enabled to warn you of impending disk failure? Do you run scrub occasionally anyway to check consistency?

Regarding being a systems engineer. Yes and no. I am anyway, but the gui on freenas works great. Additionally you need to be a systems engineer to get a Drobo to behave itself, so really there’s not much difference between the two.

Bottom line, I’ll take a system that works reliably and faster over a Drobo that doesn’t.

After all, if I wanted to use Time Machine and Lion with a Drobo, I currently can’t. Meanwhile I now have backups using freenas. No brainer really.

Plus the speed outperforms the Drobo FS by far without timeouts and hangs.

Also over the last 2 years, my Drobo has nuked my data about 5 times, many times without notice.

Finally, I always maintain multiple backup sets for any solution (Drobo included) offsite, which insulates you somewhat from losing your backups.

I’ve also drunk the Drobo Kool-Aid until it upset my stomach :slight_smile:

I’ll admit my NAS was aging. You are throwing a Core i3 at the problem, not hard to get faster numbers with that. I had a 3 drive cluster failure hit me and had one drive immediately die with the remaining 2 start to throw transient errors. The 36 hour quote included the fact the array had to slow down during rebuild due to errors. You experienced a best case recovery with what you performed, I’d suggest a little more testing so you can understand your total risk in a more likely failure environment.

Have you tested TM recoveries? Do you have an RTO? Part of the problem with AFP is that it maintains a database alongside the filesystem to map where files are. Are you ensuring the Berkeley DB that drives netatalk is being served as well? Ideally, if you can get that thing entirely into RAM you should get the last of the performance bottlenecks out of your server. You don’t need to provide for the full lifecycle since you can always regenerate it from the filesystem, but you do need to take care of that thing a little. If it corrupts you get a lot of the issues you see here on the forums.

I think you’re missing the point. Sure it’s faster with an i3, however it’s also cheaper than a Drobo FS and has more capability. Drobo performance is poor and ultimately, they could put a faster processor in it. Considering I’ve only had this built for a couple of days, I have yet to do a TM recovery, but I will post results when I do one to a spare Mac. I would expect it to take around twice as long as a 1.5TB copy, which I have already tested and is bit perfect.

Expecting anything to recover from 3 drive failures within a close time of each other, is a little optimistic, even for a Drobo. If the array had errors that was causing scrub to have serious issues, then your volume was already history and it would have become a save what you can type scenario.

I’m aware of .AppleDB, actually I was the first to point out the fix to Drobo when it became corrupted as regularly happens with Drobo’s.

Basically, whatever your DR solution is, you should be keeping multiple backups and off-siting them to protect you from the kind of situation you encountered. It’s rare these situations occur, but they do, and I’ve seen enough of them with customers over the years to see a lot of pain when they realize they never had a comprehensive DR plan in place, and I’ve been the one hired to do the impossible and recover their data. Sometimes they learn from their mistakes and sometimes they don’t.

So, let me qualify what I said earlier: you should trust no backup solution 100% ever, freenas, Drobo or anything else. (Also, don’t trust Time Machine, it’s not 100% reliable either). Freenas I’m finding so far to be faster and more reliable than the Drobos I’ve had, or have recommended. Freenas is a more mature product now than it was when I looked at it a while back.

If you wish to trust your Drobo 100% with the security of your data, I wish you the best. If not, then I’m just pointing out that there are other options out there and I found one that works for me for my specific needs.

Hah, Wasn’t questioning your design decisions, just making sure you had everything in mind. I was doing what you are doing now, trying to engineer it all myself. After my 3rd array toasted itself for a brand new reason each time I purchased a drobo because I was tired of trying to pick hardware myself. You always end up with some stupid problem no matter what you do.

No prob, been there too. Here’s my reasoning behind this. When I looked at this a while back, I’d done many DR type setups and backup systems at the enterprise level (and enterprise cost). Then, when my SOHO level requirements for my own stuff got bigger than 1 disk and off-siting a few external drives, I looked at NAS and Drobo a few years ago. The build your own NAS options then were somewhat “green” and time consuming, and if you wanted any better, the costs got larger as you went into the semi-enterprise level. So at that time, Drobo was simpler and slightly cheaper.

Back then Drobo mainly worked as described, was somewhat more stable and didn’t wipe themselves that often. Although Drobo was never approaching enterprise quality, and I couldn’t expect it to either at that price.

Move forward to now (and this is just my opinion), and the build your own option is somewhat simpler, easier (now with GUIs), cheaper and more stable than it was before. It’s reached the level where the somewhat more experienced non sys-admin level user, who can read the docs provided to them, can build it themselves. The Drobos over the years have become more unstable due to firmware and software issues and they have been behind the current technology. They’ve also required more “tinkering” to get them working: things like installing ssh, logging in, using the command line to find and remove Berkeley DBs, patching netatalk, killing processes, and fault finding when it disappears from the network or won’t shut down, i.e. the advanced user / sys admin type level. It hits the sys admin level when things don’t work and Drobo Support fail to be helpful.

I.E. Drobo was ahead of the pack, and it had a coolness to boot. They were expensive, but not crazily so. But they rested on their coolness and didn’t bother to continue to innovate at the same rate. For example, we waited ages for firewire, network backups, Time Machine support and iScsi, even the ability to back up multiple machines to 1 Drobo.

For example when the first Drobo Pro came out with iScsi, I looked at it for a smaller workgroup situation (about 15 users) and the Drobo Pro was targeted at this level at $3500 and iScsi. However it turned out that it only supported iScsi client! At this time, you could get an 8 drive Raid for just over $1,500 that could handle it. Very overpriced and without basic features that you’d expect.

Add in the lackluster speeds over the years, and I somewhat resent firmware updates that progressively produce slower speed.

Plus with the software / firmware being so unstable these days, it’s one of the few products that even techies need support for, from time to time. (not that having that support gets things fixed).

I think these reviews tell it as it is:

http://reviews.cnet.com/external-hard-drives/drobo/4852-3190_7-32470303.html

I think this comment summarizes it nicely:

“Does not replace a backup strategy, but pretends to”

Expecting a product that I’ve laid down my hard earned cash for to work as described is a reasonable expectation, for example, here are some of Drobo’s lies:

http://www.drobo.com/resources/this_is_drobo.php
"Our customers think of us as a hard drive that is NEVER FULL AND NEVER FAILS.
Hard drives get full and wear out. A single drive failure can lose your data. Not so with Drobo. "

"Redundant Protection, No Headaches "

"Drobo provides the redundant protection of much more expensive storage, in a format you don’t have to configure or manage. We call this BeyondRAIDTM technology. You’ll call it peace of mind. "

http://support.drobo.com/app/answers/detail/a_id/602#Is_my_Drobo_compatible_with_Lion?

“Yes Drobo Dashboard 2.0.3 and later supports Apple’s new OS, Mac OS X Lion”

Just with those statements and the volume of complaints, I’m sure there would be enough for a Class Action Lawsuit. If any enterprising lawyer decides to do this, I’m in.

How have your experiences been with the stability of Drobo in general? Same, great, getting worse?

Honestly, the drobo I brought home has been quiet. I downloaded the latest dashboard, it upgraded the unit for me, I only shoved one drive in when I turned it on and it prompted me to stop being stupid and insert the other drive. Insert drive, get full 1.5TB array size.

I’ve experienced no problems with the dashboard software finding the drobo, most of the time it just sits in the menutray showing me aprox how full the thing is. Opens and interacts with the drobo just fine

Moved over to my dying solaris box and mounted the drobo via SMB and then ran rsync. Between the read errors and the general crappiness of my recovery environment I achieved an average of 17MB/s out of that dying horse, but I got 900GB moved over. After that I played around with a few speed tests, found the boundaries of the box pretty much the same as my zfs box and then set out to find the community. Also shoved in another 3 drives and rounded out my array, and promptly began putting the next TB into the thing.

I upgraded netatalk mostly as a result of reading these forums and getting damn curious but as I have no intention of using the Drobo as a time machine target I’ve encountered no real hiccups I didn’t cause during this setup process.

Accidentally set my MTU incorrectly when trying to bump my network up to jumbo frames, and then freaked out for a bit when things started breaking randomly but only on my Mac. I also managed to scramble permissions on my data once when I was toying around fixing the netatalk package, since the thing comes with the cfg file mislinked.

I’ll admit I’m kinna sad I’m not getting the 60-80MB/s these drives might do on a faster NAS, but i’ve reduced my relationship with my storage from having to deal with an “almost enterprise” server built from scrap to looking at lights and replacing the drives as they go boom. Obviously you’ve had a much longer history than I have, and I will keep a damn good eye on the things you have mentioned.

Yeah, I’ll probably check out a FreeNAS type solution next time. I didn’t even know a drop-in OS like that existed. There are some nice things about the Drobo though. The looks and size mostly. I’m running all Apple computers, and now I’ve become super sensitive to machine noise. I actually keep my Drobo in the living room so I don’t have to hear it! So I’d probably want some super small quiet case for the ideal roll your own NAS system.

Don’t forget about ongoing expenses. I’d be curious to find out about the monthly cost of electricity to run that FreeNAS server. I’ve measured my Drobo FS at $5/month with 5 drives, while my own DIY servers (built several years ago) using 3ware cards and 9 drives ran at $20/month. Obviously I use fewer drives now and likely have more energy-efficient drives, so I’d be curious to find out the true cost difference using today’s equipment.

This can be largely solved by Wake On Lan. It’s possible to have the FreeNas Server off (i.e. very low power, all fans, CPU, memory and hard disks) and to be woken up for backups by Time machine when Time Machine starts on a client. Also, you can just have the drives powered down when not in use (first option is more power efficient).

I would expect there to be not much difference in power usage: the Drobo FS is using older hardware which is less energy efficient than today’s newer hardware, even though today’s hardware is faster and so more power hungry than slower hardware of the same generation.

For an initial rough estimate of the “Off” current usage, Wake On Lan is using 60mA @ 5V @ 80% efficiency, which works out at ~0.36 Watts. Compare this to the Drobo FS’s 12 watts. This doesn’t include the power supply’s or motherboard’s quiescent current (but it is a Green Energy Efficient Power Supply too).

I’ll hook up a Watt meter later and see if I can get some hard values for On with busy 5 drives and Off. How much time the system remains on and off is slightly more difficult to estimate as it will depend on how many machines you have, how often they backup and how long they take to backup.

60mA works out at around 1.7cents per month @6.5cents per KWh

Not to mention that if you have Apple machines in your network you can basically forget about WoL. Apple machines are notoriously chatty and keep waking everything in the network.

Please qualify that statement further. My understanding is that OSX has 2 types, Wake On Lan and Wake For Network Access. Wake On Lan is only on older Mac hardware and Wake For Network Access supersedes it.

Wake On Lan is a specially crafted packet that can be sent out that can wake a machine from off / sleep.

Wake For Network Access is more advanced. What happens is the machine goes to sleep and the machine’s mac / ip address gets handed off to a server (typically a recent Airport Express or Time Capsule, or a Sleep Server running on one of your computers) which continues to advertise those services. When something else on the network requests a service, then one of those 3 will respond and wake up the sleeping Mac to provide the service. Also the machine uses the RTC to occasionally wake up (generally once per hour), just to say “I’m still here”.

That said, freenas isn’t OSX (even though OSX is based on freebsd), so it won’t handle Wake For Network Access, so freenas needs a WOL packet to wake up.

Well, if I remember correctly the BIOS on PCs will give you two options: either you wake up only on WoL packets (aka Magic Packet) or by network activity.

Since I haven’t yet seen a good, reliable way to prepend a WoL packet to an arbitrary client (e.g., have the OS X TimeMachine client generate the WoL packet automatically), I assume you have to rely on the wake-on-network-activity option.

Disclaimer: I’ve seen solutions such as this http://rob.by/2009/use-time-machine-with-wake-on-lan-to-enable-network-backups-to-sleeping-devices/ but well… you can see what I mean by unreliable.

Since Mac machines are constantly probing the network using Bonjour, any machine that wakes up on network activity will never be able to go to sleep.

Apple has a solution to this called Sleep Proxy, but it would need to be implemented in the FreeNAS

https://secure.wikimedia.org/wikipedia/en/wiki/Apple_Sleep_Proxy_Service_(Bonjour_Sleep_Proxy)

You’ll spend more electricity researching this than you’ll ever save. :wink:

the truly ideal solution would be if Thunderbolt enclosure prices come down (mostly they need to be sold unbundled from drives). A MacMini with a ZFS Thunderbolt RAID would saturate your GigE and then some no problem!

I have a WOL solution which is totally reliable. Machine powers off within 2 mins on no network activity and Time Machine sends a WOL packet when it starts. I don’t use rob.by’s technique, I would agree with you that the technique he uses is messy although it would work. I directly fire it off via backupd, using the following technique:

  1. Check host is alive
  2. If not, send broadcast WOL packet and wait for host to come alive
  3. Execv backupd
    A fairly simple C program. You could also write it in perl / python if so inclined.
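For anyone wanting to try the Python route, the three steps above could be sketched roughly like this. It is not the original C program. The magic packet layout (6 bytes of 0xFF followed by the target MAC repeated 16 times, sent as a UDP broadcast) is the standard Wake On Lan format; everything else is an assumption for illustration: the function names, probing the AFP port (548) to decide whether the host is alive, the retry timings, and the path to the real backupd binary, which you would have to supply yourself.

```python
import os
import socket
import time


def build_magic_packet(mac):
    """Standard WOL magic packet: 6 x 0xFF, then the MAC repeated 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16


def host_is_alive(host, port=548, timeout=2.0):
    """Step 1: treat the host as alive if it accepts a TCP connection (AFP assumed)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wake_and_wait(host, mac, retries=60, delay=5.0):
    """Step 2: broadcast the magic packet until the host answers or we give up."""
    packet = build_magic_packet(mac)
    for _ in range(retries):
        if host_is_alive(host):
            return True
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
            sock.sendto(packet, ("255.255.255.255", 9))
        time.sleep(delay)
    return host_is_alive(host)


def run_backup(host, mac, backupd_path):
    """Step 3: once the server is up, replace this process with the real backupd.

    backupd_path is hypothetical here: point it at wherever the real
    binary lives on your system.
    """
    if wake_and_wait(host, mac):
        os.execv(backupd_path, [backupd_path])
```

Using execv means the wrapper replaces itself with backupd rather than spawning a child, which matches step 3 in the list.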

On the freenas side, a simple shell script using netstat to look for ip connections on ports 22 (ssh) 80 (the GUI) and 548 (afp) fired from cron. I’m just using those, but if you’re using other services, additional ports can be added. If there are no active connections, then it shuts down freenas.
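The idle check described above is a shell script in the original setup; the same idea in Python might look roughly like the sketch below. It assumes FreeBSD-style `netstat -an` output, where the local address is written as `address.port`, and uses `shutdown -p now` to power off. The function names and the sample lines in the usage note are made up for illustration, so check them against what your own box prints before trusting it from cron.

```python
import subprocess

# Ports from the post: ssh, the web GUI and AFP. Add more if you use other services.
WATCHED_PORTS = {22, 80, 548}


def active_connections(netstat_output, ports=WATCHED_PORTS):
    """Count ESTABLISHED TCP connections whose local port is one we care about."""
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "ESTABLISHED":
            continue
        # FreeBSD prints the local address as "192.168.1.10.548"
        port_text = fields[3].rsplit(".", 1)[-1]
        if port_text.isdigit() and int(port_text) in ports:
            count += 1
    return count


def shutdown_if_idle():
    """Run from cron: power off when nothing is connected on the watched ports."""
    result = subprocess.run(
        ["netstat", "-an", "-p", "tcp"],
        capture_output=True, text=True, check=True,
    )
    if active_connections(result.stdout) == 0:
        subprocess.run(["shutdown", "-p", "now"], check=True)
```

The parsing is kept separate from the shutdown call so it can be tried against pasted netstat output first, for example `active_connections("tcp4 0 0 192.168.1.10.548 192.168.1.20.51234 ESTABLISHED")`.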

Been testing it, and it works fine. No problem with any chatter. To get a problem from “chatty” Apple devices with this setup, you need to have Wake On Lan packets being sent, which basically just doesn’t happen by itself. It’s more energy efficient than the Drobo, which takes ages to power down to sleep mode, and even then the Drobo uses more in sleep.

Basically, when a machine is due to Time Machine, freenas wakes up and the backup starts in about 100 seconds. Then 2 mins after the backup has finished, it powers down. All quite green.

One disadvantage is wear and tear on the machine, so I don’t necessarily think that the cost benefit outweighs that as the freenas server is pretty cheap to run. But the assumption that Drobo’s are cheaper to run than a Freenas server is flawed.

Freenas with current “low end” hardware is viable: it’s cheaper, a lot faster, more stable and uses less power. Also, it can currently complete Time Machine backups with Lion, which the Drobo FS can’t. It’s perfectly viable for small workgroup type situations.

Here are the power readings:

Freenas (with hardware as described previously), 5 drives:

  • Powered Off (listening for WOL) 4W
  • Powered On maxed out copying with 5 drives 90W
  • Time to power off after inactivity: 2 mins

Drobo FS (5 drives):

  • Sleeping (Orange Light) 12W
  • Powered On maxed out copying with 5 drives: 55W
  • Minimum Spin Down Time after inactivity: 15mins

Take the average Time Machine backup as being 5 mins every hour for 8 hours a day (if you sleep or power off your machines). We’ll ignore the fact that the Drobo FS will take longer to backup than this freenas setup, which makes these numbers even worse for the Drobo:

Freenas per hour: 0.116 hrs * 0.090 kW + 0.884 hrs * 0.004 kW = 0.013976 kWh; * 24 * 30 = 10.06 kWh / month @ $0.12 per kWh = $1.21 per month

Drobo per hour: 0.333 hrs * 0.055 kW + 0.667 hrs * 0.012 kW = 0.026319 kWh; * 24 * 30 = 18.95 kWh / month @ $0.12 per kWh = $2.27 per month

I.E. Drobo costs nearly double the electricity. However, neither really break the bank either to factor highly into the decision. I certainly can’t envisage getting to the $20 per month mentioned by another user with the technology available today.
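If you want to plug in your own numbers, the estimate above can be reproduced with a few lines. This is a sketch under the same assumptions as the post (every hour counted, 24 x 30 hours per month, $0.12 per kWh). Reading the 0.116 and 0.333 hour figures as "5 min backup plus the 2 min or 15 min power-down delay" is my interpretation, and the function name is made up. Because it uses 7/60 and 20/60 rather than the rounded fractions, the results can differ from the figures above by a cent or so.

```python
def monthly_energy_cost(active_min_per_hour, active_watts, idle_watts,
                        price_per_kwh=0.12, hours_per_month=24 * 30):
    """Return (kWh per month, cost per month) for a box that is active
    for part of every hour and idle or asleep for the rest."""
    active_fraction = active_min_per_hour / 60
    average_watts = (active_fraction * active_watts
                     + (1 - active_fraction) * idle_watts)
    kwh = average_watts / 1000 * hours_per_month
    return kwh, kwh * price_per_kwh


# Watt readings from the measurements above.
freenas_kwh, freenas_cost = monthly_energy_cost(5 + 2, active_watts=90, idle_watts=4)
drobo_kwh, drobo_cost = monthly_energy_cost(5 + 15, active_watts=55, idle_watts=12)

print(f"Freenas: {freenas_kwh:.2f} kWh, ${freenas_cost:.2f} per month")
print(f"Drobo:   {drobo_kwh:.2f} kWh, ${drobo_cost:.2f} per month")
```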

Disclaimer1: If you are running both Drobo and Freenas full on instead of putting them in standby, then Drobo uses less power at the expense of less performance.

Disclaimer2: Freenas can be configured easily to power down the disks during inactivity instead of shutting down the machine. I decided to go for the WOL scenario, but powering disks down is effective to reduce power too.

Freenas may not be for you, but it’s important to be aware of the options out there. The Drobo FS is unreliable and they falsely advertise Lion / Time Machine support.

It’s great to have working Time Machine backups that are fast ! I had an hourly 13GB backup take 6 mins on Time Machine start to finish today on freenas, including the WOL startup time. On the Drobo, it would have taken more time than this to just authenticate and then crap out after a few hundred meg.

Zawie, I thought it wasn’t worth it for me to drop Drobo, but last night I lost a bunch of data after spending a week or more trying very hard to get Time Machine running. I haven’t had a successful TM backup since April and now I lost quite a bit of my other data on the Drobo. This was after a week or more of pretty intense effort on my part.

So I have to say you are right, the Drobo (FS at least) is the exact opposite of what is advertised. It takes major babysitting, it is not fire and forget. And if you buy one thinking that it gives your precious data extra protection, you may end up finding it’s the exact opposite.

I think the number one cause of these issues is that the Drobo FS is diabolically slow. In my case, the CPU was so weak, when I tried to delete a 3.6 TB Time Machine sparsebundle the whole thing blew up. The fact that there are a huge number of files in a 3.6TB sparsebundle and all delete share does on the Drobo is execute “rm -Rf” doesn’t help either (too many arguments).

I’m going with the safety of native AFP at this point and I’m getting a 4-bay firewire enclosure to hook up to my Mac Mini. My long Drobish nightmare will soon be over.