Red light for 10 sec....

Hi

My Drobo v2 has been running OK for a few months, having been replaced under warranty for a faulty fan. I decided to put the front cover back on it, and immediately after doing so the topmost drive light went red. Dashboard gave me the red alert, and I started to panic!

About 10 seconds later it went back to green. Has been fine for 3 days.

Do I replace the drive? Leave it alone? Wait for it to fail again? Drobo is at 63% capacity.

What could have caused it?

I am guessing that the drive is poorly seated, and the knock from putting the front cover back on moved it enough to break the connection temporarily.

Power off the Drobo, remove the drive, and firmly but gently reseat it.

All should be well.

Hi Doc.

Hmm. This is a worry then.

What I didn’t mention is that I had a problem like this with an earlier replacement Drobo: the top drive was red after swapping the disk pack, possibly due to a poor connection. Ended up having to relayout the pack.

I made sure that this disk pack was well-seated, I can tell you! When it went red I was pushing the drive back in as tight as possible, but it didn’t seem to help. I stopped trying to reseat it and about 5 sec later it went green while I was freaking out.

So can they work loose over time? Seems to be a design flaw if that’s the case. Or maybe that one drive is not a good fit for whatever reason.

I’d open a support case and send in your logs.

They should be able to tell you whether it totally lost connection with the drive, or whether there was an issue with the drive which caused it to be rejected (although I think this is unlikely, as it wouldn’t then re-accept it).

Was it the same drive each time?

I’m 90% sure it was the same drive.

I’ll open a case.

BTW just checked the log and it seems it was a disk removal event rather than a problem with the disk itself. This is a worry because I had the same problem with a previous Drobo…

Seems to have been all of 3 sec. I was told by support last time that if it goes over about 10 sec then it will initiate a complete relayout even if the missing disk is reinserted. The red light was on for longer than 3 sec though.
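For anyone wanting to measure the gap rather than eyeball it, here is a rough Python sketch that pulls the timestamps out of a log like this one. The function name `first_stamp` and the choice of marker strings are my own assumptions, not anything from Drobo; the sample lines are copied from the log in this post. It assumes the relevant lines start with a `YYYY-MM-DD,HH:MM:SS:` stamp and skips the driver lines that have none.

```python
import re
from datetime import datetime

# Assumption: timestamped lines begin with "YYYY-MM-DD,HH:MM:SS:".
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2},\d{2}:\d{2}:\d{2}):")


def first_stamp(lines, marker):
    """Return the timestamp of the first stamped line containing marker."""
    for line in lines:
        match = STAMP.match(line)
        if match and marker in line:
            return datetime.strptime(match.group(1), "%Y-%m-%d,%H:%M:%S")
    return None  # marker never seen on a timestamped line


# A few lines copied from the log in this post.
sample = [
    "2010-05-14,06:35:37: -------------- CALLING SYSTEM_REMOVE_DISK_EVENT for physical disk 0",
    "VxWorks IAL (DEBUG) [0,0]: device connected event received",
    "2010-05-14,06:35:40: -------------- CALLING SYSTEM_ADD_DISK_EVENT for physical disk 0",
    "2010-05-14,06:35:48: >>>> J2Manager::hotPlug: DriveNumber = 0, op = ADD",
]

removed = first_stamp(sample, "CALLING SYSTEM_REMOVE_DISK_EVENT")
add_event = first_stamp(sample, "CALLING SYSTEM_ADD_DISK_EVENT")
hotplug_add = first_stamp(sample, "J2Manager::hotPlug: DriveNumber = 0, op = ADD")

# Seconds from removal to the add event, and to the hotPlug ADD entry.
reconnect_gap = (add_event - removed).total_seconds()
readd_gap = (hotplug_add - removed).total_seconds()
print(reconnect_gap, readd_gap)
```

On these lines the add event comes 3 sec after the removal, and the hotPlug ADD entry 11 sec after it. That might be why the light stayed red for longer than 3 sec, but that is only my reading of the timestamps.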

2010-05-14,06:35:37: -------------- CALLING SYSTEM_REMOVE_DISK_EVENT for physical disk 0
targetDrive: Device bitmap 0xe
2010-05-14,06:35:37: >>>> J2Manager::hotPlug: DriveNumber = 0, op = REMOVE
2010-05-14,06:35:37: J2Manager::hotPlug: REMOVAL of disk from slot 0 with journalette CLEAN
2010-05-14,06:35:37: >>>> J2Manager::cleanseJournalette: journalette = 0
2010-05-14,06:35:37: <<<< J2Manager::hotPlug
2010-05-14,06:35:37: ELM: Fri May 14 06:35:37 2010: DPM::hotPlug: SlotNumber = 0, op = REMOVE
2010-05-14,06:35:37: ELM: Fri May 14 06:35:37 2010: DPM::hotplug: LD #4 matches disk in slot 0 WWN=>S13PJDWS219174 SAMSUNGHD103UJ 1AA01113
2010-05-14,06:35:37: DPM::hotplug: DISK REMOVED FROM SLOT 0 LD #4 WAS NOT CRITICAL.
2010-05-14,06:35:37: DPM::removeDiskInSlot: found container for slot 0 with logical #4
2010-05-14,06:35:37: DPM::updateCriticalDiskMap: Critical disk map has changed, new = 0xe
2010-05-14,06:35:37: DPM::updateCriticalDiskMap: telling J2 that slots 0xe are critical
2010-05-14,06:35:37: ELM: Fri May 14 06:35:37 2010: DPM::updateDis: writing 0x2aa4750e2aa47986/f to LDs #1 2 3
VxWorks IAL (DEBUG) [0,0]: device connected event received
HotPlug: Msg Rcvd Drive 0 Event 0
2010-05-14,06:35:37: -------------- DEBOUNCING SYSTEM_ADD_DISK_EVENT for physical disk 0
2010-05-14,06:35:37: StartDebounce DriveNumber: 0OldState : 17NewState : 5
2010-05-14,06:35:37: >>>> RM::RemoveDisk: 4
2010-05-14,06:35:37: <<<< RM::RemoveDisk: 4 succeeded
2010-05-14,06:35:37: SPMM: RemoveDisk SlotStatus[0] = 0x0
2010-05-14,06:35:37: ELM: Fri May 14 06:35:37 2010: SM: remove physical disk 0 (logical disk 4)
2010-05-14,06:35:37: DUD::CalculateCapacity: ZM_PotentialZones changed: 2328 -> 1397
2010-05-14,06:35:37: DUD::CalculateCapacity: mUsedClusters = 375197482, mTotalClusters = 358263360, mTotalClustersProtected = 358263360, mFreeClusters = 0
2010-05-14,06:35:37: ELM: Fri May 14 06:35:37 2010: Capacity: Free=0B(0.00%), Used=1.39TiB(100.00%), Total=1.39TiB, Unprotected=0B.
2010-05-14,06:35:37: Requesting restart of relayout due to InitiateRemoveDisk()
2010-05-14,06:35:37: LogicalDisk 4 has been removed, rebuilding initiated
2010-05-14,06:35:37: ELM: Fri May 14 06:35:37 2010: LM: rebuild initiated as logical disk 4 removed
2010-05-14,06:35:37: Starting efficiency scan
2010-05-14,06:35:37: Performing pass number 1
2010-05-14,06:35:37: SPMM::Rebuild(): event flag = 0x8000, SlotStatus[0] = 0x0
2010-05-14,06:35:37: SPMM::Rebuild(): event flag = 0x8000, SlotStatus[1] = 0x5
2010-05-14,06:35:37: SPMM::Rebuild(): event flag = 0x8000, SlotStatus[2] = 0x5
2010-05-14,06:35:37: SPMM::Rebuild(): event flag = 0x8000, SlotStatus[3] = 0x5
Capacity meter changing from 55% to 100%
2010-05-14,06:35:37: DiskUtilizationDaemon.cpp:770 Disk getting critically full - Transitioning to synchronous one-at-a-time I/O mode
2010-05-14,06:35:37: DiskUtilizationDaemon.cpp:771 DUD: (freeClusters=625238, potentialZones=1397, allocatedZones=1474, physicallyFreeZones=425)
2010-05-14,06:35:37: SPMM::Rebuild(): event flag = 0x7fff, SlotStatus[0] = 0x0
2010-05-14,06:35:37: SPMM::Rebuild(): event flag = 0x7fff, SlotStatus[1] = 0x1
2010-05-14,06:35:37: SPMM::Rebuild(): event flag = 0x7fff, SlotStatus[2] = 0x1
2010-05-14,06:35:37: SPMM::Rebuild(): event flag = 0x7fff, SlotStatus[3] = 0x1

2010-05-14,06:35:40: -------------- CALLING SYSTEM_ADD_DISK_EVENT for physical disk 0
VxWorks IAL (ERROR) Device connected but not ready. channel=0, ready=0, bConnected=1
targetDrive: Device bitmap 0xf
ResetDrive 0
VxWorks IAL (ERROR) Device connected but not ready. channel=0, ready=0, bConnected=1
SATA 0 : MODEL [SAMSUNG HD103UJ ] SIZE [ 953869 MiByte] LBA_48
VxWorks IAL (DEBUG) sataRestartDisk: disk 0, ready.
2010-05-14,06:35:43: Drive 0 SAMSUNG HD103UJ Rev 1AA01113 Serial S13PJDWS219174
2010-05-14,06:35:47: DPM::discoverDis: slot 0 is 0x74706db0 sectors (931.51GiB)
2010-05-14,06:35:47: => Setting DIS LBAs to 0x0/0x3a37d900/0x74706d30
2010-05-14,06:35:47: => PackId = 0x2aa4750e2aa47986/0xe [LD #4] layout = 23
2010-05-14,06:35:48: Long HRM Mgmt detected, duration = 3Secs
2010-05-14,06:35:48: >>>> J2Manager::hotPlug: DriveNumber = 0, op = ADD
2010-05-14,06:35:48: J2Manager::hotPlug: added disk was CLEAN when removed, no replay needed
2010-05-14,06:35:48: J2Manager::hotPlug: making soft journalette 0 match: disk 0
2010-05-14,06:35:48: => [S13PJDWS219174 SAMSUNGHD103UJ 1AA01113]
2010-05-14,06:35:48: <<<< J2Manager::hotPlug
2010-05-14,06:35:48: ELM: Fri May 14 06:35:48 2010: DPM::hotPlug: SlotNumber = 0, op = ADD