Infinite Data Protection Rebuild Loop

Lost drive 1 on my Ethernet-connected FS (5 x 3TB WD Green drives) about a month ago. I replaced the drive, and for weeks now it has been cycling between rebuilding and rebooting, never completing. About 5.7 TB of 10.85 TB is used.

I am able to mount images (a little iffy: it may take a couple of tries, and responses are very slow, sometimes taking minutes), and have been trying to copy data off the FS in anticipation of worse failures to come. Copies proceed very slowly, with about a minute of data transfer at a decent rate (tens of MB/s) followed by 3-5 minutes of no data moving. I assume this is data movement interleaved with data rebuilding?

And then the FS disconnects and appears to reboot. It comes up marking drive 1 red and the rest green, then eventually restarts the rebuild. I’ve only managed to copy a couple of hundred GB of data off it so far.

Sometimes the Dashboard Data Protection indicator gives an estimate of time remaining, and it has gotten as low as 5 hours, but then the FS will drop, restart, and away we go again. Drops/restarts seem to happen every several hours, though the interval is highly variable.

At one point, with drive 1 pulled, I upgraded the firmware to 1.2.6 as it was nagging me to do. Based on advice elsewhere on this site, I have also shut down the FS, pulled all the drives, booted, shut down again, and added the drives back in. When it was in one of its post-reboot/pre-rebuild refractory periods (drive 1 red, 2-5 green), I pulled the new drive 1 and put the original failed drive back in its place. The FS failed the drive and refused to enter the rebuild. I pulled the bad drive and put the new one in again, and the FS recognized it as a good drive and started the rebuild.

I don’t have hard evidence, but I fancy the Dashboard gave numerical estimates of time remaining after that for a while and it seemed to mostly count down, but, for the past week or so, it’s back to just showing the progress bar animation with no time estimate.

Any ideas? If I pull out the bad drive, will I be able to copy at full speed? I’m starting to think I’m going to have to completely reset the FS and wipe my data in the process, so at a minimum I would like to stabilize it long enough to finish the copy.

This is my first Drobo drive failure (I purchased the FS in mid-2012). It’s pretty disappointing that it failed to recover seamlessly and automatically as hoped, though I consider it a partial success that I still seem to have (dodgy) access to the data.

-Steve O.

Update: I shut down the FS, pulled drive 1, and started it up. The remaining 4 drives are green. Weirdly, I am still seeing the same stop-and-go behavior when accessing data on the Drobo. It reads great guns for a minute or so (~25 MB/s), then just stops for 3 minutes, then continues, then pauses, etc.

What’s with that?!

-Steve O.

I would recommend collecting a set of logs and having support review them in more depth. They can advise further on the health of your drives and look into possible performance issues.

Drobo Support