So I work as a network researcher focused on performance and diagnostics. Having just purchased a Drobo FS, I decided to build one of the standard tools used to determine network throughput - iperf*. Essentially, on the server side it creates a packet discard server and on the client side it acts as a traffic generator. My main goal was to determine whether there were any network issues that might be impacting Drobo FS performance. All tests involve a Drobo FS running the latest firmware, a quad-core system running Windows 7 64-bit, and 9KB (jumbo) packet sizes.
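For anyone who hasn't used it, the basic invocation looks something like this (the address is just a placeholder for whatever IP your Drobo ends up with):

    # on the Drobo FS: run as the packet discard server
    iperf -s

    # on the Windows 7 box: run as the traffic generator for 60 seconds
    iperf -c <drobo-ip> -t 60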
The initial tests used a direct connection between the Drobo and my home system (a Windows 7 box). The first round of tests used the default values for the Drobo receive buffer and the Windows send buffer. The average throughput over multiple runs hovered at around 310Mbps. Not entirely surprising, as the default send buffer size for Windows 7 is 8KB (this can be a problem in some applications). When the buffer size was increased to a reasonable value (around 128KB) the throughput topped out at around 770Mbps. Running 'top' on the Drobo revealed that iperf was consuming 100% of the core's available 'user' time. So I'm going to go out on a limb here and say that the Drobo FS is never going to see full gigabit speed, but ~800Mbps isn't that bad. The problem comes in as you factor in things like the overhead associated with the BeyondRAID parity calculations and the like.
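For reference, the buffer sizes in these runs were set with iperf's -w option; 128K is just the value that worked well for me, not a magic number:

    # on the Drobo: discard server with a larger receive buffer
    iperf -s -w 128K

    # on the Windows box: matching send buffer, 60 second run
    iperf -c <drobo-ip> -t 60 -w 128K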
For the next test I hooked the Drobo up to my home network through the switch I'm using (which supports 9KB packets and has a 128KB buffer), to make sure the switch wasn't imposing a performance hit. The iperf results were identical to the direct-connection tests, so it wasn't.
I then ran tests involving the Drobo FS filesystem (the Drobo FS fs!) using FTP. I wanted to use FTP to avoid the possible overhead of SMB, AFP, etc. Tests with FTP were performed with a 3GB file in binary mode at various send buffer sizes. All tests produced nearly identical results: approximately 32MBps (256Mbps). Not bad, but you are definitely being impacted by the demands placed on the ARM9 processor by the parity calculations and other file I/O overhead.
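For the curious, the FTP runs were nothing fancy - a session roughly like this (file and host names are placeholders), timed by hand to get MB/s:

    ftp <drobo-ip>
    ftp> binary
    ftp> put 3gbtestfile     (send test, Windows -> Drobo)
    ftp> get 3gbtestfile     (receive test, Drobo -> Windows)
    ftp> bye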
Receive tests were much more variable. Without any buffer tuning (aside from the built-in Win7 TCP auto-tuning) I averaged around 34MBps. The transmissions were a lot burstier than I expected, with peaks of close to 50MBps and troughs down as far as 17.5MBps. It almost looks like I'm getting congestion, but that doesn't seem entirely likely. Setting the receive buffer by hand did reduce the burstiness (which isn't entirely unexpected - auto-tuning is really an approximation of the best buffer size rather than the very best possible size. Coworkers of mine at PSC helped develop the underlying principle of auto-tuning, and we really aimed at making it 'good enough' rather than 'damn near perfect'). Anyway, the CPU was not overloaded, as the idle reported by 'top' never dipped below 15%. With a manually tuned connection, throughput was approximately the same with slightly reduced burstiness in the flow (but not by much, and I'm just eyeballing this). So it looks like Windows 7 is tuning the buffers pretty well for near-maximum throughput (at least for the combination of my network and the Drobo).
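If you want to see what Win7 auto-tuning is doing on your own box, you can check (and, for testing, change) it from an elevated command prompt - this is just a sanity check, not something I had to touch:

    rem show the global TCP settings, including the auto-tuning level
    netsh interface tcp show global

    rem turn auto-tuning off or back on for a test run
    netsh interface tcp set global autotuninglevel=disabled
    netsh interface tcp set global autotuninglevel=normal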
When I ran the receive tests using iperf (to eliminate filesystem issues on both sides) with the default settings (auto-tuning on the Windows side, 27.2KB on the Drobo side) I averaged around 620Mbps over multiple 60s tests with a very normal amount of burstiness (i.e., significantly less than seen with the file transfers). Modifying the send buffers didn't improve throughput (27.2KB is pretty much perfect for a GigE LAN configuration). In fact, increasing the send buffers on the Drobo reduced performance by around 40Mbps. Likewise, setting the buffers manually on the receiving end (the Windows box) actually reduced performance by a similar amount.
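For the record, the receive-side runs were just the earlier tests turned around; when I did set the buffers by hand it looked like this (the 256K is only an example value, not the exact size I used):

    # on the Windows box: receiver; adding -w overrides auto-tuning and cost ~40Mbps
    iperf -s -w 256K

    # on the Drobo: sender; bumping -w above the 27.2KB default hurt by a similar amount
    iperf -c <windows-ip> -t 60 -w 256K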
So my conclusions are:
- The Drobo FS is processor-limited to just under 800Mbps in near-ideal situations.
- The buffers are well tuned on the Drobo FS. The Windows send buffer may be undersized for some applications (such as FTP).
- However, file system overhead completely swamps any of the buffer considerations.
My random thinking is:
Faster disks may help. The standard 2TB green drives that a lot of us are putting in the Drobo are usually no more than 5900rpm. Increasing the disk speed may mean more data gets over the network. I can't test this as I only have the 5900rpm drives previously mentioned. HOWEVER: when I ran 'cat 3gbtestfile > /dev/null' it only took 60s of wall clock time (about 50MB/s), so somewhere between the disk and the network we're losing roughly 16MB/s. I'm wondering if the burstiness I saw in the network tests is an indicator of this. I'd love to talk to the Data Robotics people about this.
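If anyone wants to repeat the disk-side check on their own Drobo, timing a straight sequential read is enough to get a rough number (use whatever large file you have handy):

    # read the whole file and throw it away; file size / elapsed time gives MB/s
    time cat 3gbtestfile > /dev/null

    # same idea with dd, reading in 1MB chunks
    time dd if=3gbtestfile of=/dev/null bs=1M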
My ideas for future work:
Get a tcpdump running and then dump the results into tcptrace and see if anything really stands out. I'm wondering if we're getting some sort of weird interaction between the filesystem I/O and the network I/O. I'd have to see if I can build tcpdump under the cross-compile environment. Even if I can do that I'm not sure it will work under the Drobo environment, but I'll give it a shot at some point. It may also be interesting to try different congestion control algorithms. The default used on the Drobo is cubic. It may be worth trying the only other algorithm available, reno, though I doubt it would be any better. BIC, on the other hand, might be worth trying - but again, I don't know about building them all.
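Roughly what I have in mind, assuming I can get tcpdump built and running on the Drobo at all (the interface, file names, and addresses are guesses/placeholders):

    # on the Drobo: capture full packets to/from the Windows box during a transfer
    tcpdump -i eth0 -s 0 -w drobo-xfer.pcap host <windows-ip>

    # on another machine: per-connection stats and xplot graphs from the capture
    tcptrace -l drobo-xfer.pcap
    tcptrace -G drobo-xfer.pcap

    # congestion control: see what's compiled in, then switch for a test
    cat /proc/sys/net/ipv4/tcp_available_congestion_control
    echo reno > /proc/sys/net/ipv4/tcp_congestion_control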
Anyway, hopefully this information will be of use to someone. This post is mostly stream of consciousness and not edited for clarity &c. If you made it this far, congrats!
*If anyone wants the iperf application I built for the Drobo FS, you can grab it from http://staff.psc.edu/rapier/droboapps/drobo-iperf.gz (35024 bytes). You can get a Windows version of iperf in the jperf package (look in /bin). http://openmaniak.com/iperf.php has a good tutorial on the use of iperf. I think this will be helpful for all of the people complaining about performance - it may help you figure out if it's your network or the Drobo.[hr]
Quick update: I just ran some UDP tests. With standard 1500B packets I wasn't able to get more than around 360Mbps. When I bumped it up to 9K packets, throughput increased to 770Mbps. This really illustrates how important it is to use jumbo frames (9K packets) with your Drobo FS. The limiting factor really seems to be the processor. The ARM9 is a great little chip, but it just can't seem to spit out packets as fast as we might like. Reducing the load on the chip really helps.
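For reference, a UDP run has roughly this shape (swap the client and server roles to test the other direction; the -l values set the datagram size and are placeholders for the exact sizes I used):

    # receiver: UDP discard server
    iperf -s -u

    # sender: try to push ~1Gbps with standard-frame sized datagrams
    iperf -c <receiver-ip> -u -b 1000M -l 1470 -t 60

    # sender: same thing with jumbo-frame sized datagrams
    iperf -c <receiver-ip> -u -b 1000M -l 8192 -t 60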