Drobo

"strike 1 youre out" for a flashing RED drive light?

hi guys, i was wondering how many strikes does a hard drive get when drobo marks it as bad, before its best removed for good?

eg, if its red, should we:
remove it?
and let drobo rebuild?
and then add it back again?
and then see after rebuilding/using if it goes red again?

or is it that when drobo marks it as red, it’s 100% justified cause for not using it at all?

just wondering, as am sure im bound to get a single red light at some point - is just a matter of time :slight_smile:

Remove it. Except for rare cases where Drobo is wrong, the red light means the drive is on the way out.
Why take the risk?

Especially on a Drobo that only supports single-disk redundancy, you want to replace the bad drive ASAP so the rest of the drives will live through the relayout.

Also, once it is marked bad, it will remain marked bad on reinsertion unless you do a reset on the Drobo.

Imagine your car has uneven tire wear. Instead of replacing the near-bad tire, you wait until all your tires are super low. The likelihood of having multiple blowouts increases dramatically, and if you do have more than one blowout, you’re stuck because you only have one spare and multiple blown tires. In the Drobo scenario, you just lost all your data.

So my standpoint is, replace it. If you’re entirely convinced that it’s still good, repurpose it for something else, or put it in a different Drobo. But don’t keep putting off its replacement after it gets marked red. That’s just asking for trouble.

If your data is important enough that you spent the money to put it on a Drobo, don’t “cheap out” and get yourself screwed later. Buy the new drive, run a full check on it using the manufacturer’s diagnostic (to avoid any DOA or near-DOA issues), then replace the bad drive with it.

agree with bhiga - if drobo says its going bad - thats it

there are no strikes, if its marked as unusable, stop using it

See, I think it’s just a matter of opinion.

If you’re the sort of person who brings your car to a mechanic when the “check engine” light goes on, then replace the dying drive.

If you prefer to tape over car warning lights and turn up your stereo to drown out the crunching noises, then leave the drive in there. Why spend your hard-earned money just to protect a little data?

One thing to double check, a solid red light means that you are running 95% or more full and a flashing red light means a drive has failed.

If you think a drive is starting to have issues, you can open a support ticket and have support look at a diagnostic.

thanks guys for replies,
i think when the time comes i’ll play it safe rather than cheap it out.

ive got 8 led lights so around 80% capacity used, and 500MB free on the 1st volume, and about 400GB free on the 2nd volume, so should be ok for a bit more in terms of capacity, but thanks and have also updated the title to say flashing just to be clearer if others see the question.

ive got some WD15EADS pre full-surface-scanned (back from one of my earlier posts this year i think using the wd tool) where i got 7 of them for a good deal, and so far i think ive ticked 3 of them as all clear and the rest are pending scanning (took like 1-2 days elapsed to scan each of them lol) :slight_smile:

I think you are still confused. If you get a flashing red light that means the Drobo has marked the drive bad and will never use it again. It will immediately start to relay your data back to the other drives even if there isn’t room to complete the relay. As far as Drobo is concerned the drive is dead, not dying or on the way out.

I don’t think there are any unambiguous signals from the Drobo that a drive is on the edge of failing (something discussed here from time to time).

The above is my understanding of things. I’ve managed to get through 2 full years now without seeing a flashing red light. For all I know it’s just another Urban Legend :slight_smile:

neil - the discussion is because drobo may give a drive a flashing red light and remove it from the disk pack, but when you take it out and test it on a PC/MAC the drive appears to work perfectly - the question is “should we believe drobo that the drive shouldn’t be in a disk pack or should we re-add it back into the disk pack and take our chances?”

oh and, no it won’t - it will only rebuild if there is enough room so that when it has finished the rebuild then it doesn’t have a low space condition (i.e. you must have at least 15-20% free space AFTER the rebuild)

Dochris,

You may be concerned about how dead the drive really is just because Drobo tagged it as bad. But this is Paul’s thread, and as I interpret his posts here, I think it is impossible to “just rebuild the array and then put the drive back in”, as Paul suggested might be one option “to cheap out”.

It was stated above that the only way to reuse a drive marked bad is to reinitialize the array and reload all the data. If that is correct (and it is correct, AFAIK) then short of reinitializing the array I don’t think there are any options to “reuse a bad drive”, in the manner in which Paul asked.

I would personally not reuse a drive that my Drobo marked bad, especially if it were under warranty. But that’s almost religious and philosophical since DRI doesn’t disclose the nature of the problem (unless maybe if you submit logs and get a sympathetic rep that will detail the problem).

I would suggest to Paul the following:

  1. Modern drives are cheap

  2. You have a pile of them so you shouldn’t even think about “cheaping out” :slight_smile:

  3. As near as I can figure, modern consumer drives have about a 10% per annum failure rate, more or less. The real number is probably far greater than 1% but also far less than 50%.

  4. The weakness of a 4 bay Drobo and any other conventional or unconventional Raid 1 or Raid 5 array is a double disk failure, which must result in the loss of the array.

  5. If you put a suspect drive in your Drobo (or back in your Drobo) your entire array is more or less subject to that 10% failure rate since a suspect drive, if it really does have a problem, will be unlikely to survive a long relayout, or maybe a double relayout depending on the sequence of events.

  6. The name of the game is to buy lots of drives, make lots of independent backups, and don’t rely on suspect drives :slight_smile:
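To put rough numbers on points 3 to 5, here is a minimal sketch. It assumes drives fail independently at a constant rate, which real drives don't strictly do. The 10% annual figure comes from point 3; the 48-hour relayout window and the 50% rate for "suspect" drives are purely illustrative guesses, not measured values.

```python
import math

HOURS_PER_YEAR = 365 * 24

def prob_any_failure(n_drives, annual_failure_rate, window_hours):
    """Chance that at least one of n_drives fails within window_hours.

    Assumes independent drives with a constant failure rate (an
    illustrative simplification, not a Drobo or vendor model).
    """
    # Convert the annual failure probability to a constant hourly hazard.
    hazard_per_hour = -math.log(1.0 - annual_failure_rate) / HOURS_PER_YEAR
    # Survival of one drive over the window, then of all n drives together.
    survive_one = math.exp(-hazard_per_hour * window_hours)
    return 1.0 - survive_one ** n_drives

# Three remaining drives in a 4-bay unit, hypothetical 48-hour relayout.
healthy = prob_any_failure(3, 0.10, 48)
# Same window, pretending the remaining drives are suspect (hypothetical 50% per annum).
suspect = prob_any_failure(3, 0.50, 48)
print(f"healthy: {healthy:.4%}  suspect: {suspect:.4%}")
```

The chance for any single relayout window comes out small either way. What the sketch shows is how much a suspect drive multiplies the risk during the one period when the array has no redundancy left.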

Regarding the relay issue…

I very distinctly recall a thread on the old Drobospace forum where users reported that their Drobo went into a relay with a flashing red light on a drive, with no hope of completing the relay due to insufficient space on the remaining drives.

I also distinctly recall a DRI tech rep responding to that thread, trying to explain the logic behind that. As I recall his argument was that it was better to get as much of the relay done as soon as possible, presumably leaving only a partial relay remaining when the drive is replaced. I also recall his logic was somewhat unclear so I may be filling in a few blanks with my “presumably”.

Forum members then counter-argued that it was too risky, being subject to a 2nd drive failure with no hope of restoring redundancy. At the time I agreed with the forum members but in thinking about it now, it really doesn’t matter as long as the relay ultimately gets done in the same amount of time (which is impossible for us users to ascertain but I will assume so for argument’s sake).

That is my distinct recollection of the discussion that occurred about 3 years ago now. If the Drobo no longer does partial relays with insufficient space for completion then that may be due to a subsequent firmware change and I stand corrected. As I said, I’ve never had the unpleasant opportunity to test that logical branch in the relay decision tree :slight_smile:

[quote=“NeilR, post:9, topic:2494”]
It was stated above that the only way to reuse a drive marked bad is to reinitialize the array and reload all the data. If that is correct (and it is correct, AFAIK) then short of reinitializing the array I don’t think there are any options to “reuse a bad drive”, in the manner in which Paul asked.[/quote]

there are several different ways to trick drobo into re-using a drive, none of which affect your existing disk pack

Drobos have never done this. I’ve had a LOT of drive failures over the years, and i started out owning one of the very first drobo v1’s which were available.

A drobo won't begin a rebuild, even if there is enough space to actually do it, if it would leave drobo with a “low space” condition after it was done. this is both from dri employees and my personal experience with three different models of drobo.

from the knowledge base:

When you remove a drive, the Drobo device tries to distribute files and redundancy information on the remaining drives, provided these drives have enough storage to protect all files under management.

if you don't have enough storage - it doesn't do ANYTHING (it just sits there whining that you need to add more space)
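As a rough sketch of the rule described above (not Drobo's actual firmware logic, which isn't public): the rebuild starts only if the data fits on the remaining drives and still leaves some free-space margin afterwards. The 15% margin is taken from the 15-20% figure mentioned earlier in the thread. The function name and the usable-capacity input are made up for illustration.

```python
def would_start_rebuild(used_tb, usable_after_removal_tb, min_free_fraction=0.15):
    """Return True if a rebuild would start under the rule described above.

    used_tb: protected data currently stored.
    usable_after_removal_tb: usable (post-redundancy) capacity of the
        remaining drives; how Drobo computes this is not modelled here.
    min_free_fraction: assumed low-space margin (hypothetical value).
    """
    if usable_after_removal_tb <= 0:
        return False
    free_fraction = (usable_after_removal_tb - used_tb) / usable_after_removal_tb
    return free_fraction >= min_free_fraction

print(would_start_rebuild(1.5, 2.0))  # 25% would be free afterwards
print(would_start_rebuild(1.9, 2.0))  # data fits, but only 5% would be free
```

The second call is the case being argued about: the data would physically fit, but under this rule the unit would wait for more capacity instead of starting.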

For what it’s worth, each time a disk has failed in my Drobo v2, the Drobo has failed the disk, started to relayout, discovered the disk was actually recoverable, and brought the disk back online (usually cancelling the relayout).
There appears to be a process for Drobo to retry using a dodgy disk.

I’ve worked with other RAID products which permanently fail a disk at the first sign of real problems. This is great until a second drive returns an error, at which point the whole RAID set falls on the floor in a muddled heap. While I wholeheartedly recommend replacing a disk with errors, my Drobo’s persistence might save me the many hours of work with ‘dd_rescue’ I’ve spent recovering from stupid Unrecoverable Read Errors on conventional RAID sets.

Sorry for being a broken record here, but I’ll repeat: Why can’t WE look at diagnostics on OUR OWN EQUIPMENT without involving support? Time to decrypt at least part of the logs!

they did say that they were going to do that

they are worried since a significant part of the logs is to do with how beyondraid works - which obviously they don't want to make public, they just need to split the logs in two

Didn’t spot that, can you point me to it so I can congratulate them?

…which seems like it should be a day’s work for a software engineer, one would think. Hardly a good reason to wait a year or more (the relevant feature request threads were started at the end of 2009).

A week or two before Dashboard V2 was introduced I was told by a DRI rep that a new version of Dashboard would be introduced concurrent with the new 12 bay Drobos. I was disappointed to see that that was not the case. My phone call had to do with my Drobocare renewal and the rather strange method they use for dating those renewals (I was between the official expiration and the real expiration based on the anniversary of my purchase). While I had someone on the phone I gave him an earful about a number of things, including the lack of exposure of diagnostics. Maybe he just wanted to shut me up :-), or maybe he had bad info that he believed to be correct.

Quick edit: the rep specifically said that this new Dashboard would have “extensive diagnostics” although he could not or would not be specific as to any details.

I’m not sure if I actually reported this here although I think I mentioned something about a “pleasant surprise” coming soon, which was this little piece of vaporware.

It is my opinion that Drobo will never be truly ready for prime time until we get smart stats (especially temps!) and other vital data, including identification of disks that are marginal and known to be marginal to the Drobo (per frequent responses from tech support). This is a major part of one side of my love/hate relationship with my Drobo :slight_smile:

This is not just curiosity. It is my firm belief, having followed this forum for some time, that the Achilles heel of the Drobo is the very long relay times incurred with modern multi-terabyte drives. A 4 bay Drobo is subject to a 2 drive failure. Not knowing if a drive is going marginal could be the difference between a successful relay and a failed relay.

Based on research I’ve done these long “relay” times are not unique to Drobo although the Drobo is slower than most by some moderate amount. The difference is that we know NOTHING about the day to day state of our drives, where competing units disclose very detailed data. I know I’m preaching to the choir here but I think it’s important to keep our views out in front, in the event that anyone from DRI reads this and pays attention.

many thanks for all the extra replies guys -its all good advice

i actually got a “solid” red light yesterday lol - but it was also like that on the dashboard - though due to me actually going to the 90% mark approx (9 leds)

it was the 1st slot which was solidly red, and i think it is just simply due to the fact that all 4 drives are the same wd10eads model, so it probably picked the 1st (smallest) drive in the list, which is the 1st slot.

interestingly enough i was able to get it back to all green solid lights, by moving stuff from the most used (initial) volume, onto the 2nd (newest) volume.

am not quite sure why that worked though, but i do know that the 1st volume went to hardly any space left, and the 2nd volume has around 500gigs left. (maybe i deleted some files too but im pretty sure i just moved stuff more evenly across volumes)