2. Slide 2 | PROPRIETARY AND CONFIDENTIAL
Introducing Solarflare
• High-performance software
and hardware for 10GbE
server networking
• Mission-critical applications
– Securities and trading
– HPC
– Storage
– Cloud/Web 2.0
• Leader in financial services
• Partnerships with Arista,
Cisco, Couchbase, IBM,
Juniper, Red Hat, Oracle,
VMware
“Solarflare’s product, EnterpriseOnload is a
robust, rigorously tested and fully supported
solution that addresses our demanding support
and service level requirements. In addition to
providing the highest-performance, lowest-
latency hardware, Solarflare’s unique and
innovative application acceleration software
can be used to deploy quickly without any need
to re-write our applications.”
Andrew Bach
Senior Vice President of Network Services for NYSE Euronext
3. Slide 3
Data Volume Growing 44x
2010: 1.2 Zettabytes
2020: 35.2 Zettabytes
Data is Growing Faster than Moore’s Law
Business Analytics Requires a New Approach
Source: IDC Digital Universe Study, sponsored by EMC, May 2010
The Network is the Bottleneck
IDC Digital Universe Study 2011
4. Slide 4
But Computing is More than Just Moore’s Law
• Sandy Bridge-based servers (aka Romley motherboards)
• Intel Integrated I/O
• Reduced bottlenecks
• Increases performance by up to 80%
Up to 4 channels DDR3 1600 MHz memory
Up to 8 cores
Up to 20 MB cache
Integrated PCI Express* 3.0
Up to 40 lanes per socket
[ Chart: transactions per second, Xeon 5600 Series vs. Xeon 2600 Family ]
Can more than Double I/O Performance
Data Direct I/O (DDIO)
5. Slide 5
SFN6122F & Xeon E5-2600 Deliver Winning Combination
• SFN6122F single-stream latency is superb over all message rates on Romley platforms, right up to the point of full CPU core utilization
• Ultra-low jitter (sub-microsecond at the 99th percentile)
• Benefits from Intel® Data Direct I/O (DDIO) and chipset I/O – memory bandwidth
• Message rate headroom: 20 Mpps with 4x sfnt-streams
sfnt-stream / openonload-201109-u2
“Westmere” = 2x Xeon 5687 (3.6GHz)
“Romley” = 2x E5-2687W (3.1GHz) – DDR 1333
6. Slide 6
Standard Server I/O Networking: RSS
• RSS spreads flows pseudo-randomly (by hash) over cores
• Packets within a flow go to the same core
• Works well when
– Connections are long lived
– One thread per connection
– And the thread happens to run on the right core
– Great for network benchmarks
NIC
VNIC VNIC VNIC VNIC
receive-side scaling
core0 core1 core2 core3
App App App App
ISR ISR ISR ISR
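The RSS behaviour on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the adapter's implementation: it assumes a Toeplitz hash over the IPv4/TCP 4-tuple (the scheme commonly used for RSS), an arbitrary example key, and a 128-entry indirection table spread round-robin over four queues. The names RSS_KEY, INDIRECTION_TABLE, and rss_queue are hypothetical.

```python
import socket
import struct

# Arbitrary 40-byte example key (an assumption; real drivers program their own).
RSS_KEY = bytes(range(1, 41))
NUM_QUEUES = 4  # one RX queue per core, as in the diagram

# Indirection table: low bits of the hash select an entry, the entry names a queue.
INDIRECTION_TABLE = [i % NUM_QUEUES for i in range(128)]


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """32-bit Toeplitz hash: XOR the 32-bit key window at every set input bit."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        if data[i // 8] & (0x80 >> (i % 8)):
            # Window of 32 key bits starting at bit position i (MSB first).
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def flow_bytes(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Pack the 4-tuple in network byte order: addresses first, then ports."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", src_port, dst_port))


def rss_queue(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> int:
    """Return the RX queue (and hence core) chosen for this flow."""
    h = toeplitz_hash(RSS_KEY, flow_bytes(src_ip, dst_ip, src_port, dst_port))
    return INDIRECTION_TABLE[h % len(INDIRECTION_TABLE)]
```

Every packet of a flow hashes to the same queue, but the queue is chosen without any knowledge of where the application thread runs, which is why the thread only "happens" to be on the right core.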
7. Slide 7
Smarter Server I/O Networking: Flow Affinity
• Hardware deterministically directs flows to the ideal core for handling the load
• Google developed Receive Packet Steering (RPS)
• Solarflare developed Accelerated Receive Flow Steering (RFS) for multiqueue hardware
• Supported today on Solarflare adapters
NIC
VNIC VNIC VNIC VNIC
receive-side scaling
core0 core1 core2 core3
App App App App
ISR ISR ISR ISR
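The idea of flow affinity can be sketched as a steering table that remembers where the consuming thread last ran. This is a simplified model under stated assumptions, not the kernel or adapter implementation: FlowSteeringTable and its methods are hypothetical names, and CRC32 stands in for the RSS hash used when no affinity has been recorded yet.

```python
import zlib


class FlowSteeringTable:
    """Toy model of flow steering: deliver each flow to its consumer's core."""

    def __init__(self, num_cores: int):
        self.num_cores = num_cores
        self._flow_to_core = {}  # flow tuple -> core where the app last read it

    def note_app_core(self, flow: tuple, core: int) -> None:
        """Record the core on which the application thread last read this flow."""
        if not 0 <= core < self.num_cores:
            raise ValueError(f"core {core} out of range")
        self._flow_to_core[flow] = core

    def steer(self, flow: tuple) -> int:
        """Pick the core for the next packet of this flow."""
        if flow in self._flow_to_core:
            # Deterministic: follow the application thread.
            return self._flow_to_core[flow]
        # No affinity recorded yet: fall back to hash-based spreading
        # (CRC32 is only a stand-in for the RSS hash).
        return zlib.crc32(repr(flow).encode()) % self.num_cores


table = FlowSteeringTable(num_cores=4)
flow = ("10.0.0.1", 40000, "10.0.0.2", 80)
table.note_app_core(flow, 2)   # the app thread reads this socket on core2
print(table.steer(flow))       # packets now follow the thread to core2
```

In the accelerated variant, a table like this is held as filters in the multiqueue adapter, so the packet arrives on the right queue without a software hop between cores.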
8. Slide 8
Cisco and Solarflare Achieve Dramatic Latency Reductions
for Interactive Web 2.0 Applications
11. Slide 11
Other Tuning Tips
• RSS Spreading
– Change the default RSS spreading so that the card uses one RX queue per CPU
core. This is done by setting the driver’s module parameter rss_cpus=cores. See
section “Receive Side Scaling (RSS)” on pages 193-194 for more details. If you
can, we also suggest you disable the irqbalance service before doing this, as
irqbalance tends to undo the good work of spreading the networking interrupts
over the available CPU cores (see page 196).
• Enable LRO
– Double-check that LRO is enabled using ethtool. As mentioned, the RHEL 6 libvirtd
daemon can cause it to be disabled. See “TCP Large Receive Offload (LRO)”
on pages 191-192 for details of LRO and how to change the RHEL 6 behaviour.
• Interrupt Moderation
– If (and only if) you think the benchmark is latency sensitive, we suggest
disabling interrupt moderation. The driver by default uses adaptive interrupt
moderation and tries to tune based on traffic patterns. However, if you know you
have a ping/pong (transactional) app, it helps to disable this completely (see
page 189: “ethtool -C <ethX> rx-usecs-irq 0 adaptive-rx off”). But if you are
streaming large blocks of storage data between the servers, don’t do this; the
default is best.
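The LRO check and the interrupt-moderation decision above can be scripted. The sketch below is a hedged illustration only: lro_enabled and coalesce_command are hypothetical helper names, the parser assumes the usual "large-receive-offload: on/off" line in the output of "ethtool -k", and the command it builds is the one quoted in the tip. Nothing is executed; the helpers only inspect text and return an argument list.

```python
from typing import List, Optional


def lro_enabled(ethtool_k_output: str) -> bool:
    """Parse the text printed by `ethtool -k <ethX>` and report the LRO state."""
    for line in ethtool_k_output.splitlines():
        name, _, value = line.strip().partition(":")
        if name == "large-receive-offload":
            # The value is "on" or "off", optionally followed by e.g. "[fixed]".
            return value.strip().startswith("on")
    raise ValueError("large-receive-offload not reported")


def coalesce_command(iface: str, latency_sensitive: bool) -> Optional[List[str]]:
    """Return the ethtool command that disables interrupt moderation, or None.

    Only a latency-sensitive (ping/pong) workload should disable moderation;
    for streaming workloads the adaptive default is kept, so None is returned.
    """
    if latency_sensitive:
        return ["ethtool", "-C", iface, "rx-usecs-irq", "0", "adaptive-rx", "off"]
    return None


# Illustrative output only; the real text comes from running `ethtool -k eth2`.
sample = "Features for eth2:\nrx-checksumming: on\nlarge-receive-offload: off\n"
if not lro_enabled(sample):
    print("LRO is disabled; check whether libvirtd turned it off")
print(coalesce_command("eth2", latency_sensitive=True))
```

The returned list can be passed to subprocess.run on a host with the adapter installed; eth2 is a placeholder interface name.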