Microsoft Data Protection Manager (DPM) 2007

Hello!
I have a weird issue using my DroboElite. I have 8 2TB drives divided into 4 4TB iSCSI volumes attached to a Server 2008R2 box running DPM 2007. I have dedicated these volumes to the DPM backups. Three weeks ago, the server started to just lose the drives. If I check the iSCSI settings they still show as connected. When I go into the Storage manager it wants me to reinitialize the drives as if it hasn’t ever seen them before. The only way to bring them back online is to turn the Drobo off, back on, then reboot the server. After a random/indeterminate amount of time, the connection is lost again.
I’ve trolled the MS TechNet boards and not found anything helpful. I’ve made the Server service dependent on the iSCSI service, and that didn’t seem to solve the issue.
I’m out of ideas at this point. Has anyone seen this before?
Thanks!

Derek

Never heard of DPM before, but I suspect one of three possibilities:

  1. DroboElite is somehow confused/broken
  2. DroboElite’s drives are spinning down and the subsequent iSCSI connection that uses it times out before the drives spin back up.
  3. There’s some network issue that’s cropped up - either a switch or cable gone bad.
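To help tell possibility 3 apart from the other two, one option is to leave a small reachability probe running on the server and note when the Drobo's iSCSI portal stops answering. This is only a rough sketch: it assumes the Drobo listens on the standard iSCSI port 3260, the address in the usage comment is a placeholder, and a successful TCP connect only shows the network path is alive, not that the volumes are healthy.

```python
import socket
import time
from datetime import datetime

ISCSI_PORT = 3260  # standard iSCSI target port; assumed to be what the Drobo uses


def portal_reachable(host, port=ISCSI_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def watch(host, port=ISCSI_PORT, interval=30.0, checks=None):
    """Print a timestamped line each time reachability changes.

    checks=None keeps polling until interrupted.
    """
    last_state = None
    done = 0
    while checks is None or done < checks:
        up = portal_reachable(host, port)
        if up != last_state:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {host}:{port} {'UP' if up else 'DOWN'}")
            last_state = up
        done += 1
        if checks is None or done < checks:
            time.sleep(interval)


# Example (placeholder address, not from this thread):
# watch("192.0.2.10")
```

If the log shows DOWN at the same moments Windows loses the disks, the switch or cabling is the first place to look. If it stays UP the whole time, spin-down or the Drobo itself becomes more likely.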

Nothing personal, and my opinion is purely subjective: DPM 2007 probably has little or nothing to do w/ ur DroboElite issue, but it’s nice to know there are folks out there actually using DPM w/ the Drobo/Pro/Elite. Not Symantec BEWS! We dropped DPM months ago & use AppAssure Replay4 to CDP all our Windows workloads onto the Pros & Elites. Does ur Elite have the latest firmware? What’s the patch level of ur DPM 2K7? How much RAM does that W2K8R2 box have?

Thanks for the replies! I’ll check, but I thought I had set up the Drobo to not spin down. I did put the latest firmware on it last week and updated the dashboard, too.

I’ve got DPM at the latest patch level. The DPM server has 16GB. The issue seems to be just that the 2 boxes lose network connection to one another.

I have noticed in our notes that the issue began when we added the 2nd set of 2 TB drives.

So…right now we are looking at 2 things:

  1. Upgrade DPM to 2010…it’s gonna happen anyway, but I don’t want to introduce another variable.
  2. Reconfigure the iSCSI LUNs to either 8 2TB drives or 16 1TB drives, since Windows seems to be having some issues with the big ol’ TB drives.

How many NICs do the two machines have? It’s possible that the iSCSI traffic is “clogging the pipe” long enough that the non-iSCSI communication between the two is timing out.

We used to have this type of problem back in the traditional SCSI days. If you had a write-intensive process running on the SCSI bus, you could “clog” the PCI bus bandwidth to the point that real-time-critical operations on other cards would fail. That’s because average data rate is not the same as minimum sustained data rate: the average looked fine, but between the infrequent bursts the bus didn’t deliver the minimum sustained rate those operations needed.

It was like drinking water from a hose that had huge bursts instead of a constant flow.
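To put numbers on the hose analogy, here is a minimal sketch comparing average rate with the worst rate seen over any short window. The per-second figures are made up for illustration, not measurements from any real bus or network.

```python
def average_rate(samples):
    """Mean throughput over the whole trace."""
    return sum(samples) / len(samples)


def min_sustained_rate(samples, window):
    """Lowest mean throughput over any `window` consecutive samples."""
    if window <= 0 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")
    total = sum(samples[:window])
    lowest = total
    # Slide the window one sample at a time, updating the running sum.
    for i in range(window, len(samples)):
        total += samples[i] - samples[i - window]
        lowest = min(lowest, total)
    return lowest / window


# Illustrative MB-per-second traces (invented values).
bursty = [100, 0, 0, 0, 100, 0, 0, 0]
steady = [25] * 8

print(average_rate(bursty), min_sustained_rate(bursty, 2))
print(average_rate(steady), min_sustained_rate(steady, 2))
```

Both traces average 25 MB/s, but over any two-second window the bursty one can drop to zero while the steady one never falls below 25. A device that needs a guaranteed trickle starves on the first and is fine on the second.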

Thanks for the input, folks. I’ve solved the issue, and the details are logged here for anyone else who might run into this.

  1. DPM 2010 - Microsoft Data Protection Manager 2010 (I was wrong in the original post…doh!) on Windows 2008R2, 250 GB C: partition, 250 GB D: partition.
  2. Server hardware: Intel SR1630GP, 16 GB RAM, 2x500GB WD 7200RPM SATA drives (mirrored), dual onboard Intel NICs (82574L & 82578DM), Intel Xeon X3450 @ 2.66 GHz (quad core w/hyperthreading, so 8 logical processors).
  3. Drobo Elite with 8 WD 2TB drives, both NICs assigned static addresses.
  4. One server NIC assigned to the VLAN for storage, the other on the server VLAN. Both DROBO NICs on the storage VLAN with static addresses all around.

I followed the instructions found here about setting up the volumes via USB connection and not formatting them, then shutting the DROBO down, bringing it back online and connecting via iSCSI. Once the connection is established via iSCSI, it is important to note the connection strings.

For instance, I set up 4 4TB volumes. The iSCSI names showed up as:
iqn.2005-06.com.datarobotics:elite.tdb0940000000.id1
iqn.2005-06.com.datarobotics:elite.tdb0940000000.id2
iqn.2005-06.com.datarobotics:elite.tdb0940000000.id3
iqn.2005-06.com.datarobotics:elite.tdb0940000000.id4

Plus one bonus as:
iqn.2005-06.com.datarobotics:elite.tdb0940000000.management

Do NOT try to format the connection to the .management LUN!!! I kept doing it and it will cause things to NOT WORK in a very inconsistent manner. Now, in my defense, I didn’t count the number of connections and it took fresh eyes to see there were 5 instead of the expected 4 (yeah, I’m being hazed about it…rightly so.) and I didn’t expand the name field to see the entire string. Once I did that it was pretty apparent what I had done wrong.
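A quick way to avoid the same mistake is to count and filter the target names before touching anything in Disk Manager. A minimal sketch, using the names from this post; the function name is made up, and how you collect the list (copying it out of the iSCSI Initiator, for example) is up to you.

```python
def data_targets(iqns):
    """Return only the targets that are not the Drobo management connection."""
    return [name for name in iqns if not name.lower().endswith(".management")]


# Target names as they appeared in the iSCSI Initiator (from the post above).
discovered = [
    "iqn.2005-06.com.datarobotics:elite.tdb0940000000.id1",
    "iqn.2005-06.com.datarobotics:elite.tdb0940000000.id2",
    "iqn.2005-06.com.datarobotics:elite.tdb0940000000.id3",
    "iqn.2005-06.com.datarobotics:elite.tdb0940000000.id4",
    "iqn.2005-06.com.datarobotics:elite.tdb0940000000.management",
]

for name in data_targets(discovered):
    print(name)
```

If the filtered list is one shorter than the number of connections you see, the extra one is the .management entry and should be left alone.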

Last tip: format the drives as GPT and make them dynamic in the Disk Manager but don’t put any partitions on them. DPM likes to do its own partitioning thing on the free space of a volume. I found that if I made the disk dynamic in Disk Manager it worked every time (I tried it about 7 or 8 times before I figured that little gem out.).
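For anyone wondering why GPT matters here: an MBR partition table stores sector addresses in 32 bits, so with 512-byte sectors it tops out at 2 TiB, which is smaller than a 4TB volume. A small sketch of the arithmetic, assuming 512-byte logical sectors (that sector size is my assumption, not something I checked on the Drobo):

```python
SECTOR_BYTES = 512          # assumed logical sector size
MBR_MAX_SECTORS = 2 ** 32   # MBR uses 32-bit sector addresses
TB = 10 ** 12               # decimal terabyte, as drive vendors count


def mbr_limit_bytes(sector_bytes=SECTOR_BYTES):
    """Largest disk size an MBR partition table can address."""
    return sector_bytes * MBR_MAX_SECTORS


def needs_gpt(volume_bytes, sector_bytes=SECTOR_BYTES):
    """True if the volume is too large for MBR to address fully."""
    return volume_bytes > mbr_limit_bytes(sector_bytes)


print(mbr_limit_bytes())   # bytes addressable under MBR
print(needs_gpt(4 * TB))   # the 4TB volumes from this setup
print(needs_gpt(2 * TB))   # a 2TB volume, for comparison
```

So the 4TB volumes have to be GPT no matter what, while 2TB or 1TB volumes would technically fit under MBR.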

Oh, one more little thing: I took bhiga’s advice about bottlenecks to heart. I found the best performance came from using both DROBO IPs as targets and enabling multipath on all 4 volumes. I plan on putting another dual NIC in and bonding it, but I wanted to get some good backups going before the weekend and the NIC hadn’t arrived (curse you red baron!) as of last night.

I think that is it. We’ve been online for a week now with no problems. Before, I would have some timeout issues, as bhiga pointed out.

I hope this helps someone, and as usual, your mileage may vary. Please note that this is the final configuration that I came up with as a compromise between simplicity and ease of management on one side and lots of little partitions on the other.

Go Army! Beat Navy!

Derek Kruger, MCSE, MBA - IT Management

Just a little FYI: iqn.2005-06.com.datarobotics:elite.tdb0940000000.management is the Drobo Dashboard connection, not a LUN.