Bricked FS?

Hi everyone,

My brother bought an FS and something pretty bad happened to it. He says the thing is bricked, but I’m not quite sure. I’ve seen enough amazing recovery stories around these forums to believe that you can’t completely brick an FS.

So here are the symptoms:

  1. The FS is not accessible via SSH, or any other DroboApp.
  2. The Drobo Dashboard is not able to contact it.
  3. The only thing it does is request an IP address from the DHCP server.
  4. As far as I remember, telnet to port 5000 does not work either.

So, from this I gather that:

  1. The hardware seems to be pretty much in order, since it asks for an IP address, i.e., the processor, RAM, Ethernet, and at least part of the flash memory are OK.
  2. The Linux kernel gets loaded, since DHCP means that the boot process went pretty far.
  3. The boot process stops at some point after DHCP, but before nasd gets started (thus no Dashboard connection).

Do you guys know if there is anything that can be done?

I was wondering if he can try to remove the diskpack and reset the FS. Would that cause data loss?

My parents also have a DroboFS. Would it be safe to take their diskpack off and just put his on temporarily, just to rescue his files?

What about replacing the FS? How have your experiences been with that?

[quote=“ricardo, post:1, topic:29007”]
I was wondering if he can try to remove the diskpack and reset the FS. Would that cause data loss?[/quote]

My understanding from talking this over with Drobo tech support is that a reset will not affect the disk pack, so you can reset the unit as long as it’s empty. I’m pretty sure that resetting the unit with drives inserted will wipe them, but that it has no lingering effects if the unit is empty.

Totally. I’ve swapped disk packs between FS’s when I had my first couple lemons, and it always worked fine. Before swapping, you’ll probably want to configure their FS to look like his (admin user account, etc.), just in case Samba/netatalk get confused by the change in users.

Drop the disk pack in a new unit and it’s right as rain. Works like a charm. :slight_smile:

PLEASE DON’T RESET THE DROBO FS WHILE EMPTY! When you re-insert the disk pack it will be erased. Please see the warning at the bottom of our KB at https://support.drobo.com/app/answers/detail/a_id/431/kw/pin%20reset.

If there is no response when you telnet to port 5000, the unit isn’t responding. Do the following:

  1. Reboot using the switch on the back to see if that gets it to respond again. If that works, get a diag and create a ticket.
  2. If rebooting with drives doesn’t help, shut it down, eject the drives and reboot the Drobo FS while empty. If it responds, the source of the problem is the drive pack; get a diag and create a ticket.

[quote=“Sky, post:3, topic:29007”]
PLEASE DON’T RESET THE DROBO FS WHILE EMPTY! When you re-insert the disk pack it will be erased. Please see the warning at the bottom of our KB at https://support.drobo.com/app/answers/detail/a_id/431/kw/pin%20reset.[/quote]

Okay, that’s damned odd, as I talked with tech support a couple weeks ago about resetting a Drobo FS I sold, and they told me there would be no problem putting my disk pack back in if needed (in the event they should return the unit). I’d really like to know how it “knows” to erase the disk pack.

Pretty sure ricardo knows that. :wink:

Yup. :smiley:

My brother, being somewhat of a nerd himself, went for the full monty. He broke out the Wireshark sniffer and made a log of all the traffic to and from the FS. As far as he told me, the FS talks to the DHCP server and it does manage to configure its IP stack. In fact, he is able to ping the FS and get a reply back. Telnet to port 5000, however, does not work.
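For anyone who wants to repeat the same reachability checks without a sniffer, here is a rough sketch. It assumes bash (for its `/dev/tcp` pseudo-device) and the `timeout` utility on the machine running the test, and it assumes SSH sits on the default port 22. Port 5000 is the one mentioned in this thread. The function name `probe_port` is just something made up for this example.

```shell
#!/usr/bin/env bash
# Report whether a TCP port accepts connections. Uses bash's /dev/tcp
# pseudo-device, so no extra tools (nc, telnet) are required.
probe_port() {
  host=$1
  port=$2
  # A refused connection or a 3-second timeout both count as "closed".
  if timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo open
  else
    echo closed
  fi
}

# Pass the address of the FS as the first argument to run the checks.
FS_IP=${1:-}
if [ -n "$FS_IP" ]; then
  if ping -c 1 "$FS_IP" >/dev/null 2>&1; then
    echo "ping: reply"
  else
    echo "ping: no reply"
  fi
  # 22 = SSH (assumed default), 5000 = the port mentioned in this thread.
  for port in 22 5000; do
    echo "tcp/$port: $(probe_port "$FS_IP" "$port")"
  done
fi
```

A ping reply combined with both ports reporting closed would match what my brother is seeing: the IP stack is up, but nothing is listening.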

Btw, he tells me that all the LEDs work just fine, i.e., the space usage ones at the bottom indicate the proper free space, and if he turns it on without any disks the first bay goes red, etc.

So I dug around the initialization scripts of the FS and noticed that all this initialization happens inside of /etc/init.d/rcS (abridged):

/bin/mount -a
mount -t sysfs  none /sys
mount -t tmpfs -o size=30M /dev/shm /dev/shm
mdev -s
/etc/init.d/enable_var
/etc/init.d/net_config
sleep 1
/etc/init.d/ntp_config
/sbin/netplugd -P
sleep 1
if [ ! -f /var/samba/netbios.conf ]; then
  /bin/set_droboshare_name.sh DroboShare
fi
if [ -f /var/log/nasd.log ]; then
  /bin/cp /var/log/nasd.log /var/log/nasd.backup.log
fi
/usr/bin/nasdLogRotate&
/usr/bin/nasd &> /var/log/nasd.log &

From what I can tell, as soon as the FS starts booting Linux it tries to mount the diskpack (mount -a) and other auxiliary mount points (/sys, /dev/shm), then it mounts /var (enable_var), and then it reaches out to the DHCP server (net_config).

After that, the FS tries to update the time (ntp_config), starts the daemon that reacts to network link changes (netplugd), sets a default netbios name if none exists (DroboShare? really?), makes a backup copy of nasd.log, and starts nasd.

So… his FS seems to be stuck somewhere between net_config and nasd. I can’t see what could possibly be preventing it from proceeding. Maybe it is because /var is full and the backup copy of nasd.log fails?
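One caveat on the /var-is-full theory: in a plain shell script without `set -e`, a failed `cp` only returns a non-zero status and the script moves on to the next line. The abridged excerpt does not show whether rcS sets `-e`, so the sketch below is only a toy stand-in with made-up paths, not the real behaviour of the FS.

```shell
#!/bin/sh
# Toy stand-in for the nasd.log backup step in rcS.
# Assumption: as in the excerpt above, no `set -e` is in effect.

backup_step() {
  # Deliberately failing copy: the destination directory does not exist,
  # which stands in for "cannot write the backup to /var".
  cp "$1" /nonexistent-dir/nasd.backup.log 2>/dev/null
  echo "cp exit status: $?"
  # Execution still reaches this line, so the next rcS step would run too.
  echo "continuing to the next step"
}

src=$(mktemp)
backup_step "$src"
rm -f "$src"
```

So if rcS really runs without `-e`, a failed backup copy on its own should not stop nasd from being launched, and the cause would have to be something else.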

There is another possibility, but only someone from DRI engineering could answer. I know that x86 servers (when configured to netboot/PXE boot) reach out to the DHCP server twice. The first time happens inside the BIOS, so that the server can try to netboot. The second time happens once the OS is loading, when Linux reconfigures the network cards from scratch.

In other words, this DHCP attempt that my brother is seeing could be coming not from the Linux OS, but from the bootloader of the FS. If that is the case, I wonder if it is trying to netboot… and if so, I wonder if I could give it an image of the boot and root sections of the flash memory (taken from a good FS) over TFTP so that it would boot far enough to do a firmware reflash…
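One way to tell the two cases apart from the Wireshark capture might be the vendor class identifier (DHCP option 60) in the request. The sketch below maps a few well-known values to a likely sender. The patterns are general assumptions, not Drobo specifics: PXE boot ROMs send a string starting with `PXEClient`, BusyBox's udhcpc defaults to `udhcp <version>`, and some U-Boot builds send a string starting with `U-Boot`. I don't know which bootloader or DHCP client the FS actually uses, and `classify_vci` is a name invented for this example.

```shell
#!/bin/sh
# Guess which boot stage sent a DHCP request, based on its vendor class
# identifier (DHCP option 60). All patterns are assumptions, see above.
classify_vci() {
  case "$1" in
    PXEClient*) echo "bootloader (PXE)" ;;
    U-Boot*)    echo "bootloader (U-Boot)" ;;
    udhcp*)     echo "Linux userspace (BusyBox udhcpc)" ;;
    "")         echo "unknown (no vendor class sent)" ;;
    *)          echo "unknown ($1)" ;;
  esac
}

# Optional: pass a capture file as the first argument to classify every
# DHCP Discover in it. Assumes tshark from Wireshark 3.x or later, where
# the display filter fields are named dhcp.* rather than bootp.*.
CAPTURE=${1:-}
if [ -n "$CAPTURE" ] && command -v tshark >/dev/null 2>&1; then
  tshark -r "$CAPTURE" -Y 'dhcp.option.dhcp == 1' \
    -T fields -e frame.time_relative -e dhcp.option.vendor_class_id |
  while IFS="$(printf '\t')" read -r t vci; do
    printf '%s\t%s\n' "$t" "$(classify_vci "$vci")"
  done
fi
```

If only a bootloader-style request ever shows up, that would support the netboot theory. A udhcpc-style request would suggest Linux itself got as far as net_config.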

@Sky: could you please check with your engineering guys to see if that is possible?

DiamondSW, I believe it has to do with the way the memory is reset. If you reset the Drobo FS with at least one drive in it, preferably a spare drive you don’t mind erasing, then it will not wipe an already existing drive pack on boot. In your case, the person you sold it to will have already put in their own drive pack and the Drobo FS will have wiped it instead. I will do some testing and let you know if you can now safely put your drive pack back into it if needed.

Ricardo, I’m afraid I don’t have direct access to the engineers and I don’t know if that’s the sort of information they could give out. I will send you further instructions from the ticket I opened with you.

Sounds like the pin-reset process queues some kind of “initialize” process for the next boot.

Booting with a drive in has it “initialize” that drive’s content, but booting without a drive leaves the “initialize” process queued and it’s “Nuke the next pack I see.”

(i dont actually have an fs, but can understand what bhiga just said)

eg, if you buy a new drobo from the factory (or if you factory reset it via pinhole), and then switch it on and insert disks while it's powered on, it would wipe/initialise them for use

(if you put the existing disk pack in while the drobo is switched off, and then switch it on, id expect it to try to search for, and read, the existing pack first, or blink some lights if it cant read it properly. but i dont have an fs to be sure.)

the only “feature” i was thinking of raising as a feature request (unless it exists already) is that a drobo should never auto wipe a drive which is already used as part of a pack (unless we actually confirm it via a dashboard button, or physical button on the unit) - that’s what i believe anyway :slight_smile:

Good news everyone. I heard from on high that as long as the Drobo FS is running firmware 1.2.4, you can safely pin reset it while empty and it will not erase the disk pack on next boot. To be thorough, I tested this on the desk here and had no data loss.

Thank you for investigating this, Sky! I know that it’s been a concern for some time, and the knowledgebase was sometimes contradictory on the matter (or I wouldn’t have called support).

Excellent news! You might just have convinced me to upgrade my firmware. I’m still rocking 1.1.1. :slight_smile: