From Feargal O’Sullivan, High Performance Messaging at NYSE Technologies:
I’m here in Portland, Oregon, the city with the best micro-breweries, I’m told (a claim I’ll be sure to validate this evening), for Supercomputing ’09. We’re back in action with our partners Voltaire and Intel, this time showing an enhanced version of the NYSE Technologies Data Fabric 10GigE RDMA demo we built for the Intel Developer Forum in September. You can read more about it in our press release, or read on for a summary.
For the demonstration, we are using 12 identical Intel Xeon 5570-based servers with NetEffect 10GigE NICs, running NYSE Technologies Data Fabric and Voltaire VMS on a real-time Linux kernel. Note: Data Fabric can seamlessly switch among LDMA, RDMA and TCP to transport data.
On one server, a publisher application generates 1 million 100-byte messages per second, inserts a timestamp into each, and sends them to five subscribers on five other servers, all over 10GigE RDMA. One of those subscribers reflects each message back to the publisher box, which timestamps it again and calculates how long the round trip took.
Meanwhile, a different publisher application on a different server generates 50,000 100-byte messages per second, inserts a timestamp into each, and sends them to five subscribers on the remaining five servers. These servers use exactly the same type of 10GigE NICs, but this time Data Fabric is configured to publish using the standard Linux TCP stack rather than the RDMA iWARP hardware acceleration built into the NIC. Again, one of those subscribers reflects each message back to the publisher box, which timestamps it again and calculates how long the round trip took.
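The reflect-and-timestamp measurement used in both setups can be sketched roughly as follows. This is only an illustration in Python over a local socket pair, not the Data Fabric or MAMA API; the names `recv_exact`, `reflector` and `measure_round_trips` are made up for this example, and a local socket pair says nothing about RDMA or 10GigE performance.

```python
import socket
import struct
import threading
import time

MSG_SIZE = 100  # bytes, matching the message size used in the demo


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def reflector(sock, count):
    """Subscriber side: send every message straight back, unchanged."""
    for _ in range(count):
        sock.sendall(recv_exact(sock, MSG_SIZE))


def measure_round_trips(count):
    """Publisher side: stamp, send, wait for the echo, stamp again.

    Returns one round-trip time per message, in nanoseconds.
    """
    pub, sub = socket.socketpair()
    worker = threading.Thread(target=reflector, args=(sub, count))
    worker.start()
    padding = bytes(MSG_SIZE - 8)  # fill the message up to 100 bytes
    latencies_ns = []
    try:
        for _ in range(count):
            sent_ns = time.perf_counter_ns()
            # The first 8 bytes of the message carry the send timestamp.
            pub.sendall(struct.pack("!Q", sent_ns) + padding)
            echoed = recv_exact(pub, MSG_SIZE)
            received_ns = time.perf_counter_ns()
            (stamp_ns,) = struct.unpack("!Q", echoed[:8])
            latencies_ns.append(received_ns - stamp_ns)
    finally:
        pub.close()
        worker.join()
        sub.close()
    return latencies_ns
```

Because both timestamps are taken on the publisher box, the two machines' clocks never need to be synchronized; the price is that the figure is a round-trip time rather than a one-way latency.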
To display everything, we have a charting application showing the throughput and 1-second average latency of each transport. Drop by the Intel Booth (#1935) to see the comparison for yourself. Oh… okay… I’ll fill you in here in case you can’t make it. On average, Data Fabric RDMA has a latency profile seven times better than Data Fabric TCP (its lack of jitter is an even bigger improvement) and can handle 20 times the throughput (1 million versus 50,000 messages per second).
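For readers curious how a chart like this might be fed, here is a minimal Python sketch of per-second aggregation. It is an assumption about how such a chart could be built, not the actual charting application; `per_second_stats` is a made-up name and the sample values in the usage note are purely illustrative, not measurements from the demo.

```python
from collections import defaultdict


def per_second_stats(samples):
    """Group (arrival_time_s, latency_us) samples into 1-second windows.

    Returns {second: (messages_in_that_second, average_latency_us)}.
    The message count per window is the throughput for that second.
    """
    buckets = defaultdict(list)
    for arrival_s, latency_us in samples:
        buckets[int(arrival_s)].append(latency_us)
    return {
        second: (len(values), sum(values) / len(values))
        for second, values in sorted(buckets.items())
    }


# Throughput ratio between the two publishers described in this post.
RDMA_MSGS_PER_SEC = 1_000_000
TCP_MSGS_PER_SEC = 50_000
THROUGHPUT_RATIO = RDMA_MSGS_PER_SEC // TCP_MSGS_PER_SEC  # 20
```

Calling `per_second_stats([(0.1, 10.0), (0.9, 20.0), (1.2, 30.0)])` returns `{0: (2, 15.0), 1: (1, 30.0)}`: two messages averaging 15 microseconds in the first second, one message at 30 microseconds in the next.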
So, NYSE Technologies Data Fabric and the MAMA API give users the flexibility to deploy applications with whatever latency profile the business use case needs, all through a simple configuration change:
• Ultra-low latency: Local Direct Memory Access on a single server.
• Low latency: Remote Direct Memory Access over 10GigE or InfiniBand.
• Enterprise fan-out: TCP over 1GigE.
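To make the "simple configuration change" idea concrete, here is a hypothetical Python sketch of choosing a transport from a configuration value. None of these names (`PROFILES`, `TransportChoice`, `select_transport`, the `latency_profile` key) come from Data Fabric or the MAMA API; they are invented for illustration, and the real configuration format may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportChoice:
    transport: str  # LDMA, RDMA or TCP
    network: str    # where that transport runs, per the list above


# Hypothetical mapping from latency profile to transport.
PROFILES = {
    "ultra-low-latency": TransportChoice("LDMA", "single server"),
    "low-latency": TransportChoice("RDMA", "10GigE or InfiniBand"),
    "enterprise-fan-out": TransportChoice("TCP", "1GigE"),
}


def select_transport(config):
    """Pick a transport from a config dict; application code stays the same."""
    profile = config["latency_profile"]
    if profile not in PROFILES:
        raise ValueError(f"unknown latency profile: {profile!r}")
    return PROFILES[profile]
```

With this shape, moving an application from TCP to RDMA would mean changing `{"latency_profile": "enterprise-fan-out"}` to `{"latency_profile": "low-latency"}` and nothing else.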
