Changing a failed drive while Drobo is performing Data Protection


I’ve searched the forums and looked back a ways for an answer to my specific situation, but couldn’t find one. Hopefully someone can answer my question.

I have a DroboFS, Firmware version 1.2.4, and Drobo Dashboard 2.1.2 (OS X) with four (4) 2TB drives in bays 1-4. On Saturday the drive in bay 2 began to blink red and Drobo Dashboard notified me that the drive had failed. My available space fell to 1% (~40GB), and the other drive bay lights remained solid green.

On Sunday I deleted some data from the Drobo to free up some space while I waited to get a replacement drive. I had less data I was willing to delete than I thought and only freed up about 40GB, bringing my free space to 2% (~80GB). This morning I awoke to find the once-green lights in bays 1, 3 and 4 blinking yellow/green, and Drobo Dashboard is telling me Data Protection is in progress and should be completed in approx. 30 hours.

I’ve gone out and bought a new 3TB drive to replace the failed drive with.

My question is this: should I replace the drive now or wait for the data protection to be completed before replacing the failed drive?

Thank you for any answers/help you can provide.

While you can remove a failed (blinking red) drive at any time, I recommend waiting until data protection is re-established.

Why? While removing and replacing the failed drive, you might knock out one of the other drives, and that will cause the loss of the disk pack. So I would simply avoid the risk entirely.

Thank you for the advice.

I’ve followed it and waited for the long data protection process to complete.

I replaced my old 2TB WD Green drive with a new 3TB WD Red (supposedly specially made for use in NAS boxes).

Luckily the old WD Green drive is under warranty still and I am in the process of RMAing it.

Thanks again :smiley:

You’re very welcome! As a geek, I know it’s tough to obey “hands off!” :wink:

(i also would let it do something before doing something else)

having said that though, it would make sense for the drobo to be able to “recover” from a possible loss of disk pack, in case another drive got knocked loose as docchris mentioned above. e.g. it should not simply allow the loss of a disk pack; it should be programmed to give a message like “too many drives removed” or something, so that you could simply push a loose drive back in and the drobo would recognise it again and continue.

Brandon, you are basically suggesting that in any case where a drive fails but Drobo starts compacting the array into the remaining drives, that it is better to let that process complete, and then replace failed drive.

I guess you are avoiding one potential problem (physically moving another drive such that the array comes tumbling down). But you then risk the stress of doing that compaction, with furious non-redundant activity for 10-20-60 hours or whatever.

I thought “we all” (including you?) had more or less come to a consensus that it was better to get the drive replaced ASAP to try to avoid that compaction process.

P.S. we recently had a discussion about what the Drobo does after a drive fails, in the two cases where there 1) is OR 2) is not enough space available to shrink the array. I think we have our answer here, for both cases :slight_smile:

P.P.S. I could make a counterargument to my argument, suggesting that either method may result in equally long and equally stressful and unprotected/non-redundant processing. That could in itself make for an interesting discussion.

(i think all of our threads usually make for “interesting discussions” :slight_smile:)

Paul, I’m waiting for someone to take the bait I hung out on that P.P.S. :). The more I think about it, the less sure I am of the answer.

[quote=“NeilR, post:6, topic:41880”]
Brandon, you are basically suggesting that in any case where a drive fails but Drobo starts compacting the array into the remaining drives, that it is better to let that process complete, and then replace failed drive.[/quote]
While I do know Drobo goes into the “too many drives removed” suspended state (solid red), personally I still feel much better avoiding that, and any possible mishap that might occur there.

Once the drive is marked failed Drobo doesn’t touch it anymore, so more than likely some compaction has already occurred, unless you caught the failed drive immediately.

Dunno, there are arguments for both sides. For me, the solid red is scary, so I’d rather avoid it. :slight_smile:

In the scenario of the already-compacted array, it’s a drive/capacity addition which is generally quick and Drobo does the optimization behind the scenes.

I’m not sure what the behavior is if the replacement drive fails during relayout, assuming it gets used as a true replacement to begin with, rather than finishing the compaction to the other drives anyway, then adding the new drive.

No, I’m not in a position to test that. :slight_smile:

My guess is that when the failed drive is replaced, the compaction is immediately halted and the remaining un-compacted data on the failed drive is replicated to the replacement. That is the logical thing to do because if the compaction continued anyway, the entire drive would have to be laid out again (but in protected background mode) after the compaction. That makes no sense (from my end user point of view).

Assuming that compaction stops with the replacement, a compaction is probably more stressful to the remaining and very critical drives because they are not just being read in order to reconstruct parity and data zones. They are being totally re-written. And a compaction probably takes more time, assuming reads are faster than writes.

And even if the read and write rates are identical, a compaction requires all data on all remaining drives to be read and written, instead of just read in order to compute the zones on the replacement drive.
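To put rough numbers on that argument, here is a toy model in Python. It is only a sketch of the reasoning in this thread, not a description of how BeyondRAID really lays out data: it assumes every drive holds the same amount of used data, that compaction reads and rewrites everything on the surviving drives, and that a straight rebuild only reads the survivors and writes to the replacement. The function name and parameters are made up for illustration.

```python
def surviving_drive_io(n_drives: int, used_per_drive_tb: float, strategy: str) -> dict:
    """Rough TB read from and written to the SURVIVING drives (toy model only).

    strategy "compact": wait for the Drobo to shrink the array onto the survivors.
    strategy "replace": swap the failed drive right away and rebuild onto it.
    """
    survivors = n_drives - 1
    # Either way, the survivors have to be read to reconstruct the missing data.
    read_tb = survivors * used_per_drive_tb
    if strategy == "compact":
        # Assumption: compaction rewrites the data on every surviving drive.
        written_tb = survivors * used_per_drive_tb
    elif strategy == "replace":
        # Assumption: the rebuild writes only to the new drive, not the survivors.
        written_tb = 0.0
    else:
        raise ValueError("strategy must be 'compact' or 'replace'")
    return {"read_tb": read_tb, "written_tb": written_tb}


# Example: four drives with roughly 2 TB used on each, as in the original post.
for strategy in ("compact", "replace"):
    print(strategy, surviving_drive_io(4, 2.0, strategy))
```

Under those assumptions the survivors see the same amount of reading in both cases, but only the compaction adds a full rewrite on top. If the Drobo actually keeps compacting after the swap, the two cases collapse into one.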

I don’t have any opinion either regarding what happens if the replacement drive fails. We would hope that that is recoverable, and that is probably the most likely disk to fail, assuming there is more early failure than more or less random failure on seasoned drives.

The reason for my P.P.S. is that, regardless of how it goes about it, and regardless of when you replace the drive, a full relayout will result and during that time, depending on what we just discussed, any drive failure could kill the array (assuming single disk redundancy).

However, if you leave the compaction running, intentionally waiting for it to complete before replacing the drive, then any further disk failure is guaranteed to fail the array, and it is guaranteed to take the maximum possible time to complete (the longest window of vulnerability).

If you replace the drive ASAP then it is possible that at least a failure of the replacement drive is recoverable, and there may be less stress on the surviving drives (if the replaced drive is indeed just rebuilt rather than the entire array rewritten in some complicated way).

I don’t see any possible advantage to leaving it to compact, except the slim chance that you might knock another drive loose AND the Drobo does not allow you to recover from that. I think that likelihood is less than all the other possible benefits of quick replacement that would likely reduce all the other risks.

The worst case scenario for ASAP replacement is that despite the replacement, the compaction goes on just as if the drive were not replaced, resulting in no possible benefit to ASAP replacement and leaving a slight additional risk from the physical hot swap. But I think that is a very unlikely outcome.

Not well defined probabilities but I find it difficult to concoct likely statistical outcomes in favor of letting it compact. Greater minds might well come up with one :slight_smile:

Hmm, now that you’ve made that argument, you are definitely allaying some of my “all red” heart attack fears, hehe.
Though to tell the truth, I’ve been lucky and haven’t had to replace a drive in a number of years. :smiley: knocks on lots of wood

BeyondRAID is sneaky, sometimes in good ways, other times in bad ways. :slight_smile:

Wherever possible, apply Occam’s Razor.

It wants a drive… feed it a drive :slight_smile:

I have the same issue and query, I know this is an old topic.

I recently noticed in the Dashboard it states “Data protection enabled, you may continue to access your data but do not remove any drives with the yellow and green blinking light”

Along with the “replace the faulty drive” notification.

Does this mean you CAN actually remove the red flashing one so long as you don’t remove the yellow and green blinking ones?

Waiting 40 hours for a data rebuild is a fair amount of time… I’ve got like 600GB free out of the 4x 3TB drives, with one of those used for redundancy. (The 5th bay in the FS is empty.)

EDIT: Well, I pulled it out anyway and put a new replacement in, and it’s gone from 40 hours to 10 for data protection. It didn’t pop up any warnings of any sort, and it’s continuing with Data Protection as it was before, but with less time to rebuild itself.

I know it’s a long time since the question was asked but it’s an important one and it might help someone else.

Absolutely, yes. That’s exactly what it means.

That’s pretty full. I try to stay below 80%. Add a drive to that empty bay.
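For a rough idea of how full that is, here is a back-of-the-envelope sketch in Python. It assumes single-drive redundancy costs about the capacity of the largest drive, which is only an approximation (real usable space will be somewhat lower because of overhead and TB vs. TiB reporting), and the helper name is made up for illustration.

```python
def usable_tb(drives_tb):
    """Approximate usable space with single redundancy: total minus the largest drive."""
    return sum(drives_tb) - max(drives_tb)


current = [3.0, 3.0, 3.0, 3.0]   # 4x 3TB, 5th bay empty
free_tb = 0.6                    # about 600GB free

used_tb = usable_tb(current) - free_tb
fill_now = used_tb / usable_tb(current)
fill_with_fifth = used_tb / usable_tb(current + [3.0])  # add another 3TB drive

print(f"now: {fill_now:.0%} full")
print(f"with a 5th 3TB drive: {fill_with_fifth:.0%} full")
```

With these numbers the pack comes out at about 93% full today and about 70% with a fifth 3TB drive, which would bring it back under the 80% mark.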
