Drobo

Preliminary Network Tests

So I work as a network researcher focused on performance and diagnostics. Having just purchased an FS, I decided to build one of the standard tools used to determine network throughput - iperf*. Essentially, on the server side it creates a packet discard server and on the client side it acts as a traffic generator. My main goal was to determine whether there were any network issues that might be impacting Drobo FS performance. All tests involve a Drobo FS running the latest firmware, a quad core system running Windows 7 64-bit, and 9KB packet sizes.

The initial tests used a direct connection between the Drobo and my home system (a Windows 7 box). The first round of tests used the default values for the Drobo receive buffer and the Windows send buffer. The average throughput over multiple runs hovered at around 310Mbps. Not entirely surprising, as the default send buffer size for Windows 7 is 8KB (this can be a problem in some applications). When the buffer size was increased to a reasonable value (around 128KB) the throughput topped out at around 770Mbps. Running 'top' on the Drobo revealed that iperf was consuming 100% of the available 'user' side core. So I'm going to go out on a limb here and say that the Drobo FS is never going to see full GigE speed, but ~800Mbps isn't that bad. The problem will come in as you factor in things like the overhead associated with the BeyondRAID parity calculations and the like.
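For anyone who wants to reproduce this, the runs were essentially plain iperf with the TCP window (buffer) size set by hand. A rough sketch (the address and exact sizes below are placeholders, not my actual setup):

    # On the Drobo (server side), a packet discard sink with an explicit TCP window:
    ./iperf -s -w 128K

    # On the Windows 7 box (client side); 192.168.1.50 stands in for the Drobo's IP:
    iperf -c 192.168.1.50 -w 128K -t 60 -i 10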

For the next test I hooked the Drobo up to my home network through the switch I'm using (which supports 9KB packets and has a 128KB buffer); I wanted to make sure that my switch wasn't imposing a performance hit. The iperf results were identical to the direct connection tests.

I then ran tests involving the Drobo FS filesystem (the Drobo FS's fs!) using FTP. I wanted to use FTP to avoid the possible overhead of SMB, AFP, etc. Tests with FTP were performed with a 3GB file in binary mode at various send buffer sizes. All tests produced nearly identical results: approximately 32MBps (256Mbps). Not bad, but you are definitely being impacted by the demands placed on the ARM9 processor by the parity calculations and other file I/O overhead.
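Roughly what an FTP run looked like with the stock command-line client; the address is a placeholder and the session is abbreviated ('binary' switches to image mode, which is the binary mode mentioned above):

    ftp 192.168.1.50
    ftp> binary
    ftp> put 3gbtestfile
    ftp> get 3gbtestfile
    ftp> bye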

Receive tests were much more variable. Without any buffer tuning (aside from the built-in Win7 auto-tuning TCP stack) I averaged around 34MBps. The transmissions were a lot burstier than I expected, with peaks of close to 50MBps and troughs down as far as 17.5MBps. It almost looks like I'm getting congestion, but that doesn't seem entirely likely. Setting the receive buffer by hand did reduce the burstiness (which isn't entirely unexpected - auto tuning is really an approximation of the best buffer size rather than the very best possible size. Coworkers of mine at PSC helped develop the underlying principle of auto tuning and we really aimed it at being 'good enough' rather than 'damn near perfect'). Anyway, the CPU was not overloaded, as the idle reported by 'top' never dipped below 15%. With a manually tuned connection, throughput was approximately the same with slightly reduced burstiness in the flow (but not by much, and I'm just eyeballing this). So it looks like Windows 7 is tuning the buffers pretty well for near maximum throughput (at least for the combination of my network and the Drobo).
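For anyone who wants to poke at the Windows side, the auto-tuning state can be inspected and changed with the built-in netsh tool (shown here only as a reference; I didn't need to touch it for these runs):

    rem Show the current TCP global settings, including the auto-tuning level:
    netsh interface tcp show global

    rem Turn auto-tuning off to experiment, then set it back to the default:
    netsh interface tcp set global autotuninglevel=disabled
    netsh interface tcp set global autotuninglevel=normal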

When I ran the receive tests using iperf (to eliminate filesystem issues on both sides) with the default settings (auto-tuning on the Windows side, 27.2KB on the Drobo side) I averaged around 620Mbps over multiple 60s tests with a very normal amount of burstiness (i.e., significantly less than seen with the file transfers). Modifying the send buffers made no difference in throughput (27.2KB is pretty much perfect for a GigE LAN configuration). In fact, increasing the send buffers on the Drobo reduced performance by around 40Mbps. Likewise, setting the buffers manually on the receiving end (the Windows box) actually reduced performance by a similar amount.

So my conclusions are:

  1. The Drobo FS is processor-limited to just under 800Mbps in near-ideal situations.
  2. The buffers are well tuned on the Drobo FS. The Windows send buffer may be undersized for some applications (such as FTP).
  3. However, file system overhead completely swamps any of the buffer considerations.

My random thinking is:
Faster disks may help. The standard 2TB green drives that a lot of us are putting in the Drobo are usually no more than 5900rpm. Increasing the disk speed may mean more data gets over the network. I can't test this as I have the 5900rpm drives previously mentioned. HOWEVER: when I ran 'cat 3gbtestfile >/dev/null' it only took 60s of wall clock time (60MB/s), so somewhere between the disk and the network we're losing 26MB/s. I'm wondering if the burstiness I saw in the network tests is an indicator of this. I'd love to talk to the Data Robotics people about this.
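For the curious, the local read check was just timing a full sequential read on the Drobo itself; dd is an alternative if the Drobo's busybox build includes it:

    # Time a full sequential read of the test file:
    time cat 3gbtestfile > /dev/null

    # Alternative that reports the records copied (and, on some builds, throughput):
    dd if=3gbtestfile of=/dev/null bs=1M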

My idea for future work:
Get a tcpdump running and then dump the results into tcptrace and see if there is anything that really stands out. I'm wondering if we're getting some sort of weird interaction between the filesystem I/O and the network I/O. I'd have to see if I can build tcpdump under the cross-compile environment. Even if I can do that I'm not sure it will work under the Drobo environment, but I'll give it a shot at some point. It may also be interesting to try different congestion control algorithms. The default used on the Drobo is cubic. It may be worth trying the only other algorithm available, reno, though I doubt it would be any better. BIC, on the other hand, might be worth trying, but again - I don't know about building them all.
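Assuming the Drobo's kernel exposes the usual proc entries, checking and switching the algorithm would look something like this (a sketch; I haven't actually tried it on the FS yet):

    # What's in use now, and what else is available:
    cat /proc/sys/net/ipv4/tcp_congestion_control
    cat /proc/sys/net/ipv4/tcp_available_congestion_control

    # Switch to reno for a test run, then back to cubic:
    echo reno > /proc/sys/net/ipv4/tcp_congestion_control
    echo cubic > /proc/sys/net/ipv4/tcp_congestion_control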

Anyway, hopefully this information will be of use to someone. This post is mostly stream of consciousness and not edited for clarity &c. If you made it this far, congrats!

*If anyone wants the iperf application I built for the Drobo FS you can grab it from http://staff.psc.edu/rapier/droboapps/drobo-iperf.gz (35024 bytes). You can get a Windows version of iperf in the jperf package (look in /bin). http://openmaniak.com/iperf.php has a good tutorial on the use of iperf. I think this will be helpful for all of the people complaining about performance. It may help you figure out if it's your network or the Drobo.[hr]
Quick update: I just ran some UDP tests. With standard 1500B packets I wasn't able to get more than around 360Mbps. When I bumped it up to 9K packets, throughput increased to 770Mbps. This really illustrates how important it is to use jumbo frames (9K packets) with your Drobo FS. The limiting factor really seems to be the processor. The ARM9 is a great little chip but it just can't seem to spit out packets as fast as we might like. Reducing the load on the chip really seems to help.
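The UDP runs were along these lines (the address is a placeholder; -l sets the datagram size and -b the offered rate):

    # Server on the Drobo:
    ./iperf -s -u

    # Client on the Windows box, standard-size datagrams:
    iperf -c 192.168.1.50 -u -b 1000M -l 1470 -t 60

    # Client again, jumbo-frame-sized datagrams:
    iperf -c 192.168.1.50 -u -b 1000M -l 8960 -t 60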

Fantastic work! I was dying for someone to get us a decent benchmark on this thing.

That being said, your great work raised a couple of questions in my mind:

Do you remember what the default values are?

[quote=“rapier1, post:1, topic:2238”]
The average throughput over multiple runs hovered at around 310Mbps. [/quote]

I assume that is using TCP.

Have you tried higher buffer sizes? Correct me if my math is wrong, but a 128 KB buffer is only about 87 packets of 1500 bytes. If I remember correctly, you need around 83000 packets per second to saturate a gigabit connection at that MTU.
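In case anyone wants to double-check that back-of-the-envelope math:

    # 1500-byte packets needed per second to fill a gigabit link:
    echo $(( 1000000000 / (1500 * 8) ))   # ~83333 packets/s

    # How many 1500-byte packets fit in a 128 KB buffer:
    echo $(( (128 * 1024) / 1500 ))       # ~87 packets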

Of course, I don’t think that a 128 MB buffer would work better, but something in the order of 1-8 MB might still show some improvement.

Could you please explain this a bit further? What do you mean by “user side core”?

That is a reasonable assumption. In fact, it would be nice to have some input from DRI engineers about which hardware is processing the BeyondRAID stuff (kernel module? dedicated core?).

See point (1) below for more.

Could you indicate the brand and model of the switch? It would be nice to start compiling a list of network hardware that we have tested and found to behave nicely with the FS.

I didn’t quite get the scenario here. Is the file on the Drobo and you are downloading it to the Win7 machine? How did you set the send buffer sizes?

Also see point (2) below.

I'm not quite sure again about the scenario. Is it the PC sending to the Drobo? What do you mean by the "receive buffer"?

Well, this means to me that the problem is somewhere else then. My bet is that the hard disks are doing synchronous writes. See point (3) below.

Yes, just increasing the send buffers can actually worsen throughput due to added queuing, and thus latency. But what about receive buffers? In other words, what are the default sizes for both send and receive on the FS?

Here are my final thoughts from what you reported:

(1) As far as I know, the Drobo ARM processor runs at 400 MHz. What is the maximum rated clock for that baby? I know that overclocking it would void all kinds of warranty, but I can't help but wonder how much it would affect performance. My experience is that once a CPU is saturated, performance takes a cliff dive. Maybe overclocking it just a bit would give it enough wiggle room to stop saturating.

(2) you haven’t mentioned how much data was already on the Drobo during the tests. I remember seeing on the web (can’t find a link right now) that some Drobo models would take a performance dive once a certain space usage threshold was crossed.

(3) I think we have to ask the DRI engineers whether or not the Drobo does synchronous writes. That fact alone would explain pretty much every performance difference between raw network speed, simple reads, and the fluctuations that you see while writing.

I am doing a PhD in Byzantine fault-tolerant storage, and when I need to run (or compare) benchmarks, the first thing I check is whether or not the hard disks were doing synchronous writes. In fact, the next step would probably be to compile IOzone and run it on a Drobo. I think the results would be most interesting.
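Something like this would probably be my first IOzone run on the Drobo (the share path is only a guess at where a share ends up mounted on the box itself):

    # Automatic mode, capping the maximum test file size at 4 GB:
    ./iozone -a -g 4g -f /mnt/DroboFS/Shares/Public/iozone.tmp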

No, no, congrats to you, sir, for the excellent work so far!

Ricardo,

I'm sorry if I left out some important details. This really was close to stream of consciousness rambling while I was testing the Drobo. Yes, the tests were all TCP-based. I did some UDP tests later on but I need to rerun them in a more consistent way. To address your main question about the buffer sizes: the default receive buffer on the Drobo is the Linux default of about 86KB. It does use our auto-tuning receive buffer stack, though, so it will automagically ramp up to 256KB if the network conditions require it. The default send buffer seems to be 27.2KB. In both cases these values are perfectly reasonable. The optimal buffer size is equal to the BDP (bandwidth delay product, i.e. RTT x bandwidth). In a LAN configuration you probably won't see RTTs above .5ms, with .2ms being average for my network. This gives a BDP of roughly 25KB to 62KB. Personally, I think the receive buffer might be a little too large, but I'm not sure I saw any real over-buffering problems. I have tried really punching up the send buffer on Win7, but there was never any improvement past ~800Mbps.
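For anyone following along, the back-of-the-envelope BDP arithmetic (bandwidth in bytes per second times RTT in seconds) works out like this:

    # 1Gbps link at 0.2ms and 0.5ms RTTs:
    echo $(( 1000000000 / 8 * 2 / 10000 ))   # 0.2ms RTT -> 25000 bytes (~25KB)
    echo $(( 1000000000 / 8 * 5 / 10000 ))   # 0.5ms RTT -> 62500 bytes (~62KB)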

The Windows side also uses auto-tuning for the receive buffer (as does every recent Linux kernel, and OS X since 10.4 or so). The send buffer is, by default, 8KB. This is absurdly small and there doesn't seem to be any way to change the default value. However, it can be set programmatically via a socket call (setsockopt with SO_SNDBUF). I have no idea if SMB is doing that, though - I'd hope it is.

The FTP tests were conducted with writes to and from the Drobo (binary mode was critically important; leaving it in ASCII really dropped the performance). The first tests I mentioned were puts to the Drobo; the second set were gets.

Oh, I am stumped by something - disabling SACK on the Drobo destroyed my throughput. Absolutely obliterated it. This seems very, very odd to me given that there shouldn't be any congestion on my LAN. Disabling SACK should actually increase performance.
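For reference, this is the knob I was flipping (assuming the stock Linux proc interface on the Drobo):

    # 1 = SACK enabled (the default), 0 = disabled:
    cat /proc/sys/net/ipv4/tcp_sack
    echo 0 > /proc/sys/net/ipv4/tcp_sack
    echo 1 > /proc/sys/net/ipv4/tcp_sack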

I really, really need to compile tcpdump and see exactly what is going on. Something is injecting weirdness ('weirdness' being the technical term) and I'd like to at least be able to describe what that weirdness is. That, unfortunately, is probably going to have to wait a few days until I can shave off a few spare cycles. I also need to compile HPN-SSH. It turns out that one of the problems with using SSH, the inability of non-root users to log in, has been resolved with the latest firmware release. However, /dev/null is only accessible by root, which makes SCP or SFTP out of the Drobo impossible unless you chmod /dev/null. I can probably fix that with a script in init.d, though.
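The fix would be something like this hypothetical init.d snippet (the exact path and naming conventions on the Drobo are guesses on my part):

    #!/bin/sh
    # Hypothetical /etc/init.d script: make /dev/null world readable/writable at
    # boot so non-root SCP/SFTP sessions out of the Drobo work.
    chmod 666 /dev/null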

Let me know if you compile IOzone. I'll work on tcpdump. I'd love to get a full workup done and then talk to the DRI engineers about the results.
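The rough plan for the capture (the interface name is a guess at the Drobo's GigE device; iperf listens on TCP port 5001 by default):

    # On the Drobo, full-packet capture of the iperf traffic:
    ./tcpdump -i eth0 -s 0 -w iperf-run.pcap port 5001

    # Then, on another machine, a long-form summary of each connection:
    tcptrace -l iperf-run.pcap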

I was able to get tcpdump built statically. I’ll run some of the tests again tonight and pump the resulting dumps through tcptrace and see if I can figure out what’s going on in the network.[hr]
I also got iozone built, but I don't really know how to interpret the results. I'll get copies of iozone and tcpdump up on a site shortly for people to play with.

I have managed to figure out exactly what hardware is inside the DroboFS. It is a Marvell MV78200, whose full spec sheet can be found here.

Turns out I was wrong about the CPU speed. The FS runs at 800 MHz, but only one of the two cores is actually used by the Linux OS.