Hpc Day Oct 09
2. The Technical Computing Architecture
Technical Computing Innovation and Discovery
Create, Visualize, Analyze, Simulate
CAE/CFD, Weather, DCC, Life Science, Energy, Finance
Optimizing the Time From Idea to Reality
With a New Generation of Intelligent Processors
* Other names and brands may be claimed as the property of others. Copyright © 2009, Intel Corporation. 2
3. Insatiable Demand for
Performance, Density, and Efficiency
[Chart: Your Demand For Performance. Axis from 100 MFlops to 1 ZFlops, 1993 to 2029. Source: Top500.org]
[Chart: Intel Power Reduction Over Time. Axis from 1.E+00 to 1.E-07, 1970 to 2010]
~ 1 Million Factor Reduction In Energy per Transistor Over 30+ Years
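To put the energy-per-transistor claim in yearly terms, the sketch below converts a total reduction factor into a constant per-year factor and a halving time. It assumes a steady exponential trend and uses 30 years as a round lower bound for "30+ Years"; the function names are illustrative and do not come from the deck.

```python
import math

def annual_reduction_factor(total_factor: float, years: float) -> float:
    """Constant per-year factor that compounds to total_factor over `years`."""
    return total_factor ** (1.0 / years)

def halving_time_years(total_factor: float, years: float) -> float:
    """Years needed to halve the quantity at that constant rate."""
    return years * math.log(2.0) / math.log(total_factor)

TOTAL_FACTOR = 1e6  # "~1 Million Factor Reduction" from the slide
YEARS = 30.0        # assumption: lower bound for "30+ Years"

per_year = annual_reduction_factor(TOTAL_FACTOR, YEARS)
halving = halving_time_years(TOTAL_FACTOR, YEARS)
print(f"per-year reduction factor: {per_year:.2f}")
print(f"halving time: {halving:.2f} years")
```

Under these assumptions, energy per transistor falls by a factor of roughly 1.6 each year, which corresponds to halving about every year and a half.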
4. Meeting Today’s HPC Challenges
Intelligent Performance
Software Versatility
Ease of Deployment
Genomics Research, Weather Prediction, Oil Exploration, Design Simulation, Financial Analysis, Medical Imaging
Scaling Performance Forward
5. Intelligent Performance
Up to 3x Performance Increase
Performance Optimized For Your Environment
Power Efficiency
Enabling You to Intelligently “Scale Your Performance Forward”
For notes and disclaimers, see legal information slide at end of this presentation.
6. Intel® Xeon® 5500 (codename Nehalem – EP):
Putting More Brain Power into Your Cluster
Performance by Design
• Intel® QuickPath Interconnect
• Integrated memory controller
Intelligent Performance
• Intel® Turbo Boost Technology
• Hyper-Threading technology
Power Efficiency
• More power states
• Faster transition between power states
• Lower idle power
[Diagram: four cores, shared L3 cache, QPI, integrated memory controller with 3 channels of DDR3]
Driving Performance Through Multi-core
Technology and Platform Enhancements
7. Intel® Xeon® 5500 Platform
Up to 3X the performance over previous generation (Intel® Xeon® 5400)
Optimize your performance for diverse workloads
Lower TCO by providing more energy efficient, higher performing solutions
[Diagram: Intel® 5520 Chipset, ICH 9/10, PCI Express* 2.0, Intel® X25-E SSDs, Intel® 82599 10GbE Controller]
Platform Ready for Future 32nm Products
8. Intel® Xeon 5500:
A New Generation of Intelligent Processors
[Chart: relative performance against the Xeon 5400 series baseline, higher is better, across Weather, FEA, CFD, Energy, and OpenMP workloads]
Source: Published/submitted/approved results March 30, 2009. See backup for additional details
Knows Where to Put the Speed, Knows How to Save Energy
Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system
hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering
purchasing. For more information on performance tests and on the performance of Intel products, visit http://www.intel.com/performance/resources/limits.htm
9. Advanced Processors for HPC
Selection Guidance
Memory Requirement    Max Bandwidth              Balanced Performance                    Max Capacity
Example Usages        HPC Technical Computing    General Purpose Enterprise workloads    Virtualized Environment
Memory Technology     DDR3 1333                  DDR3 1066                               DDR3 800
                      32 GB/s                    25.5 GB/s                               19.2 GB/s
                      48 GB                      96 GB                                   144 GB
Advanced SKUs (8M cache, 6.4 GT/s QPI, DDR3 1333, HT, Turbo +3), shown in all three columns:
  X5570 2.93 GHz
  X5560 2.80 GHz
  X5550 2.66 GHz
Standard SKUs (8M cache, 5.86 GT/s QPI, HT, Turbo +2), shown in two columns:
  E5540 2.53 GHz
  E5530 2.40 GHz
  E5520 2.26 GHz
Basic SKUs (4M cache, 4.8 GT/s QPI), shown in one column:
  E5506 2.13 GHz
  E5504 2.00 GHz
  E5502 1.86 GHz (2C)
Step-up annotations: Faster QPI, Faster memory; Fastest QPI, Fastest memory
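The memory bandwidth figures in the selection table can be cross-checked with the usual peak-bandwidth arithmetic: transfer rate times channels times bytes per transfer. The sketch below assumes three DDR3 channels per socket (as on the Xeon 5500 overview slide) and 8 bytes per transfer for a 64-bit channel; the function name is illustrative.

```python
def peak_memory_bandwidth_gbs(transfer_rate_mts: float,
                              channels: int = 3,
                              bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s for one socket.

    transfer_rate_mts: DDR3 speed grade in mega-transfers per second.
    channels: assumed 3 DDR3 channels per socket.
    bytes_per_transfer: assumed 64-bit (8-byte) channel width.
    """
    return transfer_rate_mts * channels * bytes_per_transfer / 1000.0

for speed in (1333, 1066, 800):
    print(f"DDR3 {speed}: {peak_memory_bandwidth_gbs(speed):.1f} GB/s")
```

Under these assumptions the three speed grades work out to about 32.0, 25.6, and 19.2 GB/s, which agrees with the table to within rounding (the slide lists 25.5 GB/s for DDR3 1066).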
10. Step Function in Performance
Advanced SKUs Offer Significant Performance Gains
[Chart: SPEC benchmark maximum performance, axis 0 to 250, for Basic, Standard, and Advanced SKUs on SPECint_rate_base2006 and SPECfp_rate_base2006. Turbo and HT “ON”]
11. Intel Technology is Changing HPC
TCO, Performance, Reliability
Extreme Performance, Increased Reliability, Power Efficient, Reduce System Cost
Solid State Disk: Optimize Performance for I/O Intensive Apps and Boot Drive Replacement
10GbE: Bridging the Gap Between 1GbE and Infiniband®
SSD Proof Points
Intel IT evaluation results.
12. Intel® Xeon® 5500
Putting More Brainpower into the Datacenter
[Chart: relative performance, axis 0 to 200. Dual-core Intel® Xeon® 5160 Processor (Woodcrest) versus new Intel® Xeon® 5500 Series with SSDs]
Up to 7.8X Performance
Same Power, Same Space
Source: Intel internal measurements. Test configurations in backup
13. Nehalem-EX
The next step in large scale HPC
High Performance
– Up to 8 cores per socket
– Up to 24MB shared last level cache
– 4 high bandwidth QPI links / processor
– High memory bandwidth and capacity
[Diagram: four Nehalem processors connected to two I/O hubs, each with PCI Express*]
Schedule: Target Q4’09 Production Availability
More Scalability, Cores, and Memory Capacity
14. Delivering Versatility
Performance Gains Today
Optimize Application Performance
Develop Highly Portable and Parallel Software
Enabling You to Easily “Scale Your Performance Forward”
15. Parallel Programming Challenge
Ease of Use and Flexibility
Irregular Patterns, Data Structures,
and Serial Algorithms
? Scale to Multi-Core Today → Hard
Scale to Many-Core Tomorrow → Harder
Increasing Cores (2→64+ Cores)
Vector Instructions (4→8+ Wide)
Cache and Interconnect Latency
16. Scaling Performance Forward
One Development Environment – Multi- to Many-core
Insight: Architectural Analysis
Performance: Optimize/Tune
Confidence: Correctness
Simplify Your Development
17. Ease of Deployment
Confidently Deploy and Manage a Cluster
Certified Cluster Configurations
Intel® Cluster Checker to Validate
Simplification
Application Interoperability
“Out of Box” Experience
Enabling You to Confidently “Scale Your Performance Forward”
18. ICR – Intel® Cluster Ready
What is ICR?
A specification to help OEMs & PIs manufacture HPC clusters based upon the Intel architecture
Simplify Purchasing with certified cluster configurations
Simplify Manufacturing with defined recipes and Intel® Cluster Checker to validate
Simplify Deployment with registered applications
Simplify Management with Intel® Cluster Checker
Registered ISV/Apps = 18/53
Certified OEM/Platforms = 21/89
Simplifying Your Cluster
19. The Future Looks Bright
Silicon and Software Tools: Unleash Performance
45nm: Intel® Xeon 5400, Intel® Xeon 5500, Nehalem EX
32nm: Westmere (more cores), Sandy Bridge (higher integration)
22nm: Future. A leap ahead in technology
Continue to deliver world class processor technology
Breakthrough Technology Year After Year
20. Solving Your HPC Challenges
Intelligent Performance
– Up to 3x performance gains to decrease your time to discovery
– Improved power technology, more efficient data for a lower TCO
Software Versatility
– Easily optimize application performance and eliminate the need to increase software resources
– Develop highly portable, parallel software
Ease of Deployment
– Certified cluster configurations to simplify cluster deployment
– Use Intel® Cluster Checker to validate configurations: ensure a highly reliable solution
Scaling Performance Forward
21. BACK UP
22. Dual Core Performance Refresh
Data Center perf. optimization with Intel® Xeon 5500 (Nehalem-EP)
2006: 1,000 servers, dual-core Intel Xeon® 5160 Processor (WDC)
2009: 1,000 servers, new Intel Xeon® 5500 series
Business benefits:
– Up to 4X the performance (4.16x)
– Up to 14% less power
[Chart: SPECfp_rate_base2006 performance benefit over WDC, axis 0 to 200, WDC versus NHM]
Source: Intel estimates and measurements as of Nov 2008. Performance comparison using SPECfp_rate _base2006. Use this slide in conjunction with backup slide.
Source: Intel internal measurements. Test configurations in backup.
Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.
23. Dual Core Performance Refresh
Data Center perf. optimization with Intel® Xeon® 5500 (Nehalem-EP)
2006: 1,000 servers, dual-core Intel® Xeon® 5160 Processor (WDC)
2009: 1,215 servers, new Intel® Xeon® 5500 series with SSDs
Up to 5X Performance* (5.06x)
Same Power Envelope
[Chart: SPECfp_rate_base2006 performance benefit over WDC, axis 0 to 200, WDC versus NHM]
(*without any benefit from SSDs)
Source: Intel estimates and measurements as of Nov 2008. Performance comparison using SPECfp_rate _base2006. Use this slide in conjunction with backup slide.
Source: Intel internal measurements. Test configurations in backup.
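The two refresh scenarios appear to be linked by simple scaling: the one-for-one refresh quotes 4.16x for 1,000 servers, and the same-power refresh quotes 5.06x for 1,215 servers. The sketch below assumes aggregate performance scales linearly with server count, which is an interpretation and is not stated on the slides; the function name is illustrative.

```python
PER_SERVER_SPEEDUP = 4.16  # one-for-one refresh: 1,000 servers replace 1,000
SERVERS_2006 = 1000
SERVERS_2009 = 1215        # same-power-envelope refresh

def aggregate_speedup(per_server: float, old_count: int, new_count: int) -> float:
    """Aggregate speedup, assuming throughput scales linearly with server count."""
    return per_server * new_count / old_count

total = aggregate_speedup(PER_SERVER_SPEEDUP, SERVERS_2006, SERVERS_2009)
print(f"estimated aggregate speedup: {total:.2f}x")
```

Under this assumption the estimate is about 5.05x, close to the 5.06x quoted for the 1,215-server case; the small gap would be consistent with rounding in the published per-server figure.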
24. Improving Performance Efficiency
Metric                       Xeon 5500    Prior series    Benefit
Faster Transitions (msec)    2            10              Up to 5X
Lower CPU Idle Power (W)     10           50              Up to 5X
More Operating States        15           3               Up to 5x
[Chart legend: 2009 Xeon 5500 Series, 2007-2008 Xeon 5400 Series, 2006 Xeon 5300 Series]
Intelligent Power Evolution
† Xeon® 5300 series data based on Xeon® X5365 SKU (B-3 stepping), Xeon® 5400 series based on Xeon® X5470 (E-0 stepping),
and Xeon® 5500 based on Xeon® W5580 (D-0 stepping). Number of operating states includes all frequency operating points,
including Turbo Boost and base frequency. Idle power based on C6 idle power for Xeon® 5500, and C1E for Xeon® 5300 and 5400
SKUs. C6 also requires OS support and may vary by SKU. Faster transitions based on Package C1E exit transition latency.
* Other names and brands may be claimed as the property of others. Copyright © 2008, Intel Corporation.
25. Extending Performance with SSDs
Usage Models: HPC Opportunities
• Hard Disk Drive Replacement; I/O intense apps
• Boot Drive Replacement
Benefits
• Lower Latency and Higher Throughput: Faster Access to Data
• New Levels of Reliability & Mgmt: Lower TCO
• Less Power and Smaller Footprint: Energy & Space Savings
Unparalleled IOPs with Solid State Disks
26. The Truth Of Laws & Observations
Amdahl’s Law
If Serial Component Remains Proportionately Equal, There Is No Inherent Speed Up Factor Available
[Diagram: 70% parallel, 30% serial before and after the problem grows]
If parallel component is 50x faster the max speed up is 3.25X
Gustafson’s Observation
If The Serial Component Shrinks In Size As Problem Expands, There Is Significant Speed Up Opportunity Available
[Diagram: 70% parallel, 30% serial becomes 95% parallel, 5% serial as the problem grows]
If parallel component is 50x faster the max speed up is 18.26X
Growing the problem size may mitigate the impact of Amdahl’s Law, but ONLY if the serial fraction doesn’t grow in proportion to the problem size.
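As a point of comparison, the textbook form of Amdahl's Law can be evaluated for the two serial fractions on this slide. This is a minimal sketch of the standard formula, speedup = 1 / (s + (1 - s) / k), where s is the serial fraction and k is the speedup of the parallel part; it is not taken from the deck, and the function name is illustrative.

```python
def amdahl_speedup(serial_fraction: float, parallel_speedup: float) -> float:
    """Overall speedup when only the parallel part runs `parallel_speedup` times faster."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / parallel_speedup)

for serial in (0.30, 0.05):
    speedup = amdahl_speedup(serial, 50.0)
    ceiling = 1.0 / serial  # limit as the parallel part becomes infinitely fast
    print(f"serial {serial:.0%}: {speedup:.2f}x with a 50x parallel part, "
          f"ceiling {ceiling:.2f}x")
```

With the standard formula, a 50x faster parallel part gives roughly 3.2x at 30% serial and 14.5x at 5% serial, with ceilings of about 3.3x and 20x. The slide quotes 3.25X and 18.26X, so its figures presumably rest on assumptions that are not stated here; the qualitative point, that shrinking the serial fraction raises the achievable speedup sharply, is the same.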
27. Intel® Xeon® 5500: HPC Leading Capability
– See Intel® Xeon® 5500: An Advance in HPC performance slides (previous 2 slides for turbo and HT details)
Feature                  Today                  Nehalem-EP              Benefit
Platform
Peak CPU-Chipset BW      21 GB/s (1333MHz)      46.1 GB/s (6.4 GT/s)    Up to 2.2x
Peak Mem BW              21 GB/s (FBD-667)      32 GB/s (DDR3-1333)     Up to 1.5x/CPU
Max Memory Capacity      64/128 GB (FBD)        144 GB (DDR3)           Up to 2x*
CPU
Turbo Boost              No                     Yes                     Performance on demand based on SW needs
Hyper-Threading          No                     Yes                     Up to 16 threads for a DP system
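The "Benefit" column can be reproduced from the two value columns. The sketch below divides the Nehalem-EP figure by the "Today" figure for each platform row and derives the thread count for a dual-processor (DP) system, assuming four cores per socket and two hardware threads per core with Hyper-Threading; variable names are illustrative.

```python
def ratio(new: float, old: float) -> float:
    """Improvement factor of `new` over `old`."""
    return new / old

chipset_bw = ratio(46.1, 21.0)    # peak CPU-chipset bandwidth, GB/s
memory_bw = ratio(32.0, 21.0)     # peak memory bandwidth per CPU, GB/s
capacity = ratio(144.0, 64.0)     # max memory capacity vs. the 64 GB figure

SOCKETS = 2            # DP (dual-processor) system
CORES_PER_SOCKET = 4   # assumption: quad-core Xeon 5500
THREADS_PER_CORE = 2   # Hyper-Threading enabled
threads = SOCKETS * CORES_PER_SOCKET * THREADS_PER_CORE

print(f"CPU-chipset BW: {chipset_bw:.2f}x")
print(f"Memory BW:      {memory_bw:.2f}x")
print(f"Capacity:       {capacity:.2f}x")
print(f"Threads:        {threads}")
```

These ratios land near the quoted "up to" figures: about 2.2x for chipset bandwidth, 1.5x for memory bandwidth, a little over 2x for capacity against the 64 GB baseline, and 16 threads.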
28. 2-Socket Quad-Core Intel® Xeon® Processor 5500 Series based platforms
Intel® X25-E Extreme SSD Performance Comparison using ANSYS® Mechanical™ 12.0 Preview 7
• ISV Application Description
ANSYS 12.0 software is a comprehensive multiphysics tool combining structural, thermal, fluids, acoustic and electromagnetic simulation capabilities in a single engineering software solution. Its comprehensive range of physical models can be applied to simulation-based product development in a broad range of industries and applications.
• Benchmark description
The benchmark uses a FEA model with 1.5 million degrees of freedom to extract 50 dynamic mode frequencies and mode shapes using block Lanczos solver. The workload is IO-intensive with limited scaling. The results are based on 4-process parallel execution; see backup slides for details.
[Chart: ANSYS® Mechanical™ 12.0 Preview 7, relative performance, higher is better]
Quad-Core Intel Xeon X5482/1600 FSB, HDD RAID0: 1.00
Quad-Core Intel Xeon X5570/6.4 MT, HDD RAID0: 1.25
Quad-Core Intel Xeon X5570/6.4 MT, SSD RAID0: 1.74 (up to 74%)
Data Source: Approved/published results as of March 30, 2009.
Quad-Core Intel® Xeon X5570 with Intel® X25-E Extreme SSD is 74% faster than the previous quad-core processor
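The three relative-performance values in this comparison (1.00, 1.25, 1.74) can be split into a processor step and a storage step. The sketch below assumes the two effects multiply, which is a simplification for illustration; the variable names are not from the deck.

```python
BASELINE = 1.00      # X5482, HDD RAID0
NEW_CPU_HDD = 1.25   # X5570, HDD RAID0
NEW_CPU_SSD = 1.74   # X5570, SSD RAID0

def percent_faster(value: float, reference: float) -> float:
    """How much faster `value` is than `reference`, in percent."""
    return (value / reference - 1.0) * 100.0

cpu_gain = percent_faster(NEW_CPU_HDD, BASELINE)     # processor change only
ssd_gain = percent_faster(NEW_CPU_SSD, NEW_CPU_HDD)  # SSD on top of the new CPU
total_gain = percent_faster(NEW_CPU_SSD, BASELINE)   # both changes together

print(f"CPU step:  {cpu_gain:.0f}%")
print(f"SSD step:  {ssd_gain:.0f}%")
print(f"Combined:  {total_gain:.0f}%")
```

Read this way, the processor change accounts for about 25% and the move from HDD to SSD for a further 39% or so on this IO-intensive workload, which together give the 74% headline figure.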
29. 2-Socket Quad-Core Intel® Xeon® Processor 5500 Series based platforms
Intel® X25-E Extreme SSD Performance Comparison using MD Nastran benchmarks
• ISV Application Description
MD Nastran R3 combines best-in-class solver technologies (Nastran, Marc, Dytran, Adams, and LS-Dyna) into one fully integrated, multidiscipline simulation solution for the manufacturing enterprise, allowing manufacturers to perform interoperable, multidisciplinary analyses on complex models.
• Benchmark description
MD Nastran benchmarks representing 5 solution sequences including static analysis, normal modes analysis with/without ACMS, direct frequency response, modal frequency response and non-linear analysis using serial, SMP, and DMP execution.
[Chart: MD Nastran R3, relative performance, higher is better]
Quad-Core Intel Xeon X5482/1600 FSB, HDD RAID0: 1.00
Quad-Core Intel Xeon E5570/6.4 MT, HDD RAID0: 1.33
Quad-Core Intel Xeon E5570/6.4 MT, SSD RAID0: 1.60 (up to 60%)
Data Source: Approved/published results as of March 30, 2009.
Quad-Core Intel® Xeon E5570 with Intel® X25-E Extreme SSD is
60% faster than previous quad core processor
30. LEGAL DISCLAIMERS
31. Nehalem-EP Performance
Comparison to Previous Generation 5400 Series on Server and HPC
Benchmarks – Config Details
Xeon 5400 Server platform common configuration details: Super Micro server platform X7DB3 with two Quad-Core Intel Xeon processor X5460(HTN 3.16GHz) or
X5470(HTN 3.33GHz) with 2x6M L2 Cache, 1333 MHz system bus, Blackford Chipset
Xeon 5400 HPC platform common configuration details: Super Micro server platform X7DWA-N with two Quad-Core Intel Xeon processor E5472(HTN 3.0GHz) or
X5482(HTN 3.20GHz) with 2x6M L2 Cache, 1600 MHz system bus, Seaburg Chipset
Nehalem-EP platform common configuration details: Intel server pre-production SuperMicro platform with two Quad-Core Nehalem-EP processor, 2.93GHz with 8M L3
Cache, 6.4QPI, Tylesburg-EP Chipset. (SPECcpu2006 measured on “Green City” platform)
Benchmark Specific Details (All data based on Intel internal measurements, February 2009)
Benchmark OS Memory Other Software & Hardware details
SPECint*_rate_base2006, Suse Linux 10-64bit Xeon 5400 Server: 16GB (8x2GB) FB DDR2-667MHz SPEC binaries built with Intel Compiler 11.0 for 32-bit/64-bit Linux.
SPECfp*_rate_base2006, Xeon 5400 HPC: 16GB (8x2GB) FB DDR2-800MHz HT ON for Nehalem-EP. Turbo mode disabled
Nehalem-EP: 24GB (6x4GB) DDR3-1333MHz
TPC*-C – Oracle* RedHat Linux OS Xeon 5400 Server: 64GB (16x4GB) FB DDR2-667 Oracle* 11g. HT ON for Nehalem-EP. Turbo mode disabled.
TPC*-C – SQLServer Microsoft Windows Nehalem-EP: 288GB memory simulated using 72GB Microsoft SQLServer*2005. HT ON for Nehalem-EP. Turbo mode
Server 2003 (18x4GB) DDR3-800 MHz. Result recalibrated for 144GB disabled.
TPC-*H Microsoft Windows Xeon 5400 Server: 64GB (16x4GB) FB DDR2-667MHz Microsoft SQLServer 2008 RTM; HT ON, Turbo mode disabled
Server 2008 Nehalem-EP: 72GB (18x4GB) DDR3-800MHz
SAP-SD* Suse Linux 10-64bit Xeon 5400 Server: 32GB (8x4GB) FB DDR2-667MHz SAP* 2-Tier SD benchmark. ECC 5.0 Version. Oracle database.
Nehalem-EP: 48GB (12x4GB) DDR3-1066MHz HT ON for Nehalem-EP. Turbo mode enabled
SPECjbb*2005 Various Xeon 5400 Server: 16GB (8x2GB) FB DDR2-667MHz 4 JVM instances for HTN and 2 JVM instances on Nehalem-EP.
Nehalem-EP: 24GB (6x4GB) DDR3-1333MHz HT ON for Nehalem-EP. Turbo mode enabled
SPECjvm*2008 Various Xeon 5400 Server: 16GB (8x2GB) FB DDR2-667MHz Baseline result. 1 JVM instance. HT ON for Nehalem-EP. Turbo
Nehalem-EP: 24GB (6x4GB) DDR3-1333MHz mode enabled
SPECweb*2005 Microsoft Windows Nehalem-EP: 18GB (18x1GB) DDR3-800MHz IIS7 with Zend PHP Isapi Dll 5.0; HT ON vs OFF study
Server 2008
vConsolidate vCon 2.0 - Profile 2 Xeon 5400 Server: 16GB (8x2GB) FB DDR2-667MHz Vmware ESX 3.5 for Xeon 5400; Vmware ESX 4.0 Beta 1 for
Nehalem-EP: 48GB (12x4GB) DDR3-1066MHz Nehalem-EP. HT ON, Turbo mode disabled.
All HPC applications Red Hat EL5-U3 Xeon 5400 HPC: 16GB (8x2GB) FB DDR2-800MHz All benchmarks run with 8 process. HT ON for Nehalem-EP. Turbo
Beta - 64-bit; Nehalem-EP: 24GB (12x2GB) DDR3-1066MHz mode disabled
Linpack Red Hat EL5-U2 Xeon 5400 HPC: 16GB (8x2GB) FB DDR2-800MHz Intel® SMP LINPACK 10.0.4 (Linux) for HTN. Intel® SMP
Beta - 64-bit; Nehalem-EP: 24GB (6x4GB) DDR3-1333MHz LINPACK 10.1 Beta 2 (Linux) for Nehalem-EP; HT OFF
Stream Red Hat EL5-U1 64- Xeon 5400 Server: 16GB (8x2GB) FB DDR2-667MHz 8 Copies. Stream Triad used for comparison. HT OFF for
bit; Nehalem-EP: All memory configurations were run Nehalem-EP
Benchmark comparisons for HT ON vs OFF, Turbo ON vs OFF shown were measured using the same platform configuration as above. Comparisons across different Nehalem-EP skus
were measured on the same platform using the above configuration.
Data source: Intel Internal measurements – November 2008
32. Intel® Xeon 5500: A New Generation
of Intelligent Processors
System Configuration Information
– All comparisons based on published/submitted/approved results as of March 30, 2009
– Fluent: Comparison based on published/submitted results to www.fluent.com/software/fluent/fl6bench/fl6bench_6.4.x/index.htm as of March 30, 2009. All comparisons were
using results run on 8 cores within a single machine on dual socket quad-core servers.
– Baseline Intel® Xeon® processor X5482 based platform details: Supermicro X7DB8+* server platform with two Intel® Xeon® processors X5482 3.20GHz, 12MB L2 cache,
1600MHz FSB, 16GB memory (8x2GB 800MHz DDR2 FB-DIMM), 64-bit RedHat Enterprise Linux 5.3*. Performance measured using Fluent Version 12.0 Beta. (Version
12.0.13)*. Six individual benchmarks are shown as a measure of single node performance. "Overall" performance is the geometric mean of the six individual benchmarks.
– Intel® Xeon® processor X5570 based platform details: SGI Altix ICE 8200EX* server platform with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, QPI 6.4
MT/sec, 24GB memory (12x2GB 1066MHz DDR3), 64-bit Suse Linux Enterprise Server* 10 SP2 with ProPack 6SP2*. Performance measured using Fluent Version 12.0
Beta. (Version 12.0.9) Six individual benchmarks are shown as a measure of single node performance. "Overall" performance is the geometric mean of the six individual
benchmarks.
– Quad-Core AMD Opteron* processor model 2384 platform based details:Server platform with two AMD Opteron 2384 processor 2.7GHz, 6MB L3 cache, Linux OS.
Performance measured using Fluent Version 12.0 Beta. (Version 12.0.7) Six individual benchmarks are shown as a measure of single node performance. "Overall"
performance is the geometric mean of the six individual benchmarks.
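The "Overall" figure in the Fluent comparison is described as the geometric mean of six individual benchmarks. A minimal sketch of that calculation follows; the six input values are hypothetical placeholders chosen only to show the mechanics and are not results from the deck.

```python
import math
from typing import Sequence

def geometric_mean(values: Sequence[float]) -> float:
    """Geometric mean of positive values, computed in log space."""
    if not values:
        raise ValueError("at least one value is required")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical per-benchmark relative performance values (not from the deck).
hypothetical_speedups = [1.8, 2.1, 2.4, 1.9, 2.2, 2.0]
overall = geometric_mean(hypothetical_speedups)
print(f"overall relative performance: {overall:.2f}")
```

A geometric mean is the usual choice for averaging ratios, since no single benchmark with an unusually large speedup dominates the summary the way it would in an arithmetic mean.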
– SPECompM2001
– Baseline Intel® Xeon® processor E5472 based platform details: Supermicro X7DB8+ server platform* with two Intel Xeon processors E5472 3.0GHz, 12MB L2 cache,
1600MHz FSB, 32GB memory (8x4GB 800MHz DDR2 FB-DIMM), SUSE LINUX 10.1* (X86-64) (Linux 2.6.16.13-4-smp). Binaries built with Intel Compiler 10.1. Referenced
as published at 17187. (SPECompMbase2001). For more information see http://www.spec.org/omp/results/res2007q4/omp2001-20071107-00274.html.
– Intel® Xeon® processor X5570 based platform details: Cisco B-200 M1 server platform* with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, 6.4GT/s QPI, 24 GB
memory (6x4 GB DDR3-1333MHz), Red Hat EL 5.3, Linux Kernel 2.6.18-128.el5 SMP x86_64, Binaries built with Intel® C/C++ Compiler 11.0 for Linux. Result submitted to
www.spec.org for review at 43593 (SPECompMbase2001) as of March 30, 2009.
– Quad-Core AMD Opteron processor 2384 based platform* details: Supermicro H8DMU Server platform* with two Quad-Core AMD Opteron processors 2386SE* 2.80GHz,
6MB L3 cache, 16GB memory (8x2GB, PC2-6400, Reg, dual-rank CL5), SUSE Linux Enterprise Server 10 64-bit, Binaries built with PathScale Compiler Suite*, Release 3.1.
Referenced as published at 22678 (SPECompMbase2001). For more information http://www.spec.org/omp/results/res2008q4/omp2001-20081021-00320.html.
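The three SPECompMbase2001 scores quoted above (17187, 43593, and 22678) can be turned into relative-performance ratios directly. The sketch below only divides the quoted scores; the dictionary keys are shorthand labels chosen here for readability.

```python
# SPECompMbase2001 scores as quoted in the configuration notes.
scores = {
    "Xeon E5472 (baseline)": 17187,
    "Xeon X5570": 43593,
    "Opteron (published result)": 22678,
}

def relative_to(name: str, reference: str) -> float:
    """Score of `name` divided by the score of `reference`."""
    return scores[name] / scores[reference]

vs_baseline = relative_to("Xeon X5570", "Xeon E5472 (baseline)")
vs_opteron = relative_to("Xeon X5570", "Opteron (published result)")
print(f"X5570 vs. E5472 baseline: {vs_baseline:.2f}x")
print(f"X5570 vs. Opteron result: {vs_opteron:.2f}x")
```

On these scores the X5570 platform comes out at roughly 2.5x the E5472 baseline and about 1.9x the Opteron result. The X5570 score was still under review at SPEC as of March 30, 2009, so the ratios carry the same caveat.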
– Multiphysics Finite Element Analysis using ANSYS* - Comparison based on published/submitted results to www.ansys.com/services/hardware-support-db.htm as of
March 30, 2009.
– Intel® Xeon® processor X5482 based platform details: Supermicro X7DB8+* server platform with two Intel® Xeon® processors X5482 3.20GHz, 12MB L2 cache, 1600MHz
FSB, 16GB memory (8x2GB 800MHz DDR2 FB-DIMM), 64-bit RedHat Enterprise Linux 5.1*. Performance measured using ANSYS* Mechanical* 12.0 Preview 7.
Benchmark for Ansys-Shared* consists of a suite of 8 workloads and Ansys-Distributed* consists of a suite of 7 workloads. Geo mean of each these workload groups used
for comparison.
– Intel® Xeon® processor X5570 based platform details: Supermicro X8DTN+ server platform with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, QPI 6.4 MT/sec,
24GB memory (12x2GB 1066MHz DDR3), 64-bit RedHat Enterprise Linux 5.3. Performance measured using ANSYS* Mechanical* 12.0 Preview 7. Benchmark for Ansys-
Shared consists of a suite of 8 workloads and Ansys-Distributed consists of a suite of 7 workloads. Geo mean of each these workload groups used for comparison.
– MM5 v4.7.4 - t3a and WRF v3.0.1 - 12km CONUS : Comparison based on measured results as of March 30, 2009. All comparisons were using results run on 8 cores
within a single machine on dual socket quad-core servers. Same platform used for both benchmarks
– Baseline Intel® Xeon® processor X5482 based platform details: SGI Altix ICE 8200EX* server platform with two Intel® Xeon® processors X5482 3.20GHz, 12MB L2 cache,
1600MHz FSB, 16GB memory (8x2GB 800MHz DDR2 FB-DIMM), 64-bit Suse Linux Enterprise Server* 10 SP2 with ProPack 6SP2*.
– Intel® Xeon® processor X5570 based platform details: SGI Altix ICE 8200EX* server platform with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, QPI 6.4
MT/sec, 24GB memory (12x2GB 1066MHz DDR3), 64-bit Suse Linux Enterprise Server* 10 SP2 with ProPack 6SP2*..
33. Intel® Xeon 5500: A New Generation
of Intelligent Processors
System Configuration Information
– All comparisons based on published/submitted/approved results as of March 30, 2009
– Reservoir Simulation using Schlumberger Eclipse*
– Intel® Xeon® processor X5482 based platform details: Supermicro X7DB8+ server platform* with two Intel Xeon processors X5482 3.20GHz, 12MB L2 cache, 1600MHz
FSB, 16GB memory (8x2GB 800MHz DDR2 FB-DIMM), 64-bit RedHat Enterprise Linux 5.3*. (64-bit) and Eclipse version 2008.1 software.
– Intel® Xeon® processor X5570 based platform details: Supermicro X8DTN+ server platform* with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, QPI 6.4*
MT/sec, 24GB (12x2GB 1066MHz DDR3) memory, 64-bit RedHat Enterprise Linux 5.3. Eclipse version 2008.1 software.
– Reservoir Simulation using Landmark Nexus*
– Intel® Xeon® processor X5482 based platform details: Supermicro X7DB8+ server platform* with two Intel Xeon processors X5482 3.20GHz, 12MB L2 cache, 1600MHz
FSB, 16GB memory (8x2GB 800MHz DDR2 FB-DIMM), 64-bit RedHat Enterprise Linux 5.2*. (64-bit) . Landmark Nexus R5000 software*.
– Intel® Xeon® processor X5560 based platform details: Supermicro X8DTN+ server platform* with two Intel Xeon processors X5560 2.80GHz, 8MB L3 cache, QPI 6.4
MT/sec, 12GB memory, 64-bit RedHat Enterprise Linux 5.3. Landmark Nexus R5000 software.
– Reservoir Simulation using CMG* IMEX*
– Intel® Xeon® processor X5482 based platform details: Dell Precision T7400 platform* with two Intel Xeon processors X5482 3.20GHz, 12MB L2 cache, 1600MHz FSB,
32GB RedHat Enterprise Linux 5.2*. (64-bit) CMG IMEX, Version 2008.11.
– Intel® Xeon® processor X5570 based platform details: Supermicro X8DTN+ server platform* with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, QPI 6.4*
MT/sec, 18GB memory. RedHat Enterprise Linux 5.3. (64-bit) CMG IMEX, Version 2008.11.
– Computational Fluid Dynamics analysis using Star-CD* (Single Node) - Comparison based on published/submitted results to http://www.cd-adapco.com/products/STAR-
CD/performance/406/index.html as of March 30, 2009. All comparisons were using results run on 8 cores within a single machine on dual socket quad-core servers.
– Intel® Xeon® processor X5482 based platform details: Supermicro X7DB8+* server platform with two Intel Xeon processors X5482 3.20GHz, 12MB L2 cache, 1600MHz
FSB, 16GB memory (8x2GB 800MHz DDR2 FB-DIMM), 64-bit RedHat Enterprise Linux 5.3*. Performance measured using STAR-CD v4.06. Same configuration used for
both benchmark results - A-Class and C-Class.
– Intel® Xeon® processor X5570 based platform details: Supermicro X8DTN+* server platform with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, QPI* 6.4
GT/sec, 24GB memory (12x2GB 1066MHz DDR3), 64-bit RedHat Enterprise Linux 5.3. Performance measured using STAR-CD v4.06. Same configuration used for
both benchmark results - A-Class and C-Class.
– Crash Simulation analysis using LS-DYNA* (Single Node): Comparison based on published/submitted results to http://www.topcrunch.org/ as of March 30, 2009. All
comparisons were using results run on 8 cores within a single machine on dual socket quad-core servers.
– Intel® Xeon® processor X5482 based platform details: Supermicro X7DB8+* server platform with two Intel® Xeon® processors X5482 3.20GHz, 12MB L2 cache, 1600MHz
FSB, 16GB memory (8x2GB 800MHz DDR2 FB-DIMM), 64-bit RedHat Enterprise Linux* 5.3. Performance measured using LS-DYNA mpp971.s.R321. Same configuration
used for all three benchmark results - neon_refined_revised, 3 vehicle collision, car2car.
– Intel® Xeon® processor X5570 based platform details: Supermicro X8DTN+ server platform with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, QPI 6.4 GT/sec,
24GB memory (12x2GB 1066MHz DDR3), 64-bit RedHat Enterprise Linux 5.3. Performance measured using LS-DYNA mpp971.s.R321. Same configuration used for all
three benchmark results - neon_refined_revised, 3 vehicle collision, car2car.
– SPECompL2001
– Baseline Intel® Xeon® processor E5472 based platform details: Supermicro X7DB8+ server platform* with two Intel Xeon processors X5482 3.20GHz, 12MB L2 cache,
1600MHz FSB, 32GB memory (8x4GB 800MHz DDR2 FB-DIMM), SUSE LINUX 10.1* (X86-64) Intel Compiler 11.0. Submitted to www.spec.org for review at 81332 as of
March 30, 2009.
– Intel® Xeon® processor X5570 based platform details: Cisco B-200 M1 server platform* with two Intel Xeon processors X5570 2.93GHz, 8MB L3 cache, 6.4GT/s QPI, 24 GB
memory (6x4 GB DDR3-1333MHz), Red Hat EL 5.3, Linux Kernel 2.6.18-128.el5 SMP x86_64, Binaries built with Intel® C/C++ Compiler 11.0 for Linux. Result submitted to
www.spec.org for review at 234,996 (SPECompMbase2001) as of March 30, 2009.
Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system
hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering
purchasing. For more information on performance tests and on the performance of Intel products, visit http://www.intel.com/performance/resources/limits.htm
* Other names and brands may be claimed as the property of others. Copyright © 2009, Intel Corporation. 33
34. Step Function in Performance
NHM-EP SPEC CPU2006 benchmark preliminary results (Read disclaimers below)
Nehalem SKU (core GHz / QPI GT/s / memory MHz) | SPECint_rate_base2006 | SPECfp_rate_base2006 | Memory details | Compiler
1.86/4.8/800 DC E5502 | No data | No data | 24GB (12x2GB) DDR3-800MHz | Intel Compiler 11.0
2.00/4.8/800 E5504 | 125 | 110 | 24GB (12x2GB) DDR3-800MHz | Intel Compiler 11.0
2.13/4.8/800 E5506 | 130 | 113 | 24GB (12x2GB) DDR3-800MHz | Intel Compiler 11.0
2.26/5.86/1066 E5520 | 185 | 154 | 24GB (12x2GB) DDR3-1066MHz | Intel Compiler 11.0
2.40/5.86/1066 E5530 | 192 | 158 | 24GB (12x2GB) DDR3-1066MHz | Intel Compiler 11.0
2.53/5.86/1066 E5540 | 199 | 161 | 24GB (12x2GB) DDR3-1066MHz | Intel Compiler 11.0
2.66/6.40/1333 X5550 | 225 | 185 | 24GB (6x4GB) DDR3-1333MHz | Intel Compiler 11.0
2.80/6.40/1333 X5560 | 230 | 188 | 24GB (6x4GB) DDR3-1333MHz | Intel Compiler 11.0
2.93/6.40/1333 X5570 | 235 | 190 | 24GB (6x4GB) DDR3-1333MHz | Intel Compiler 11.0
Disclaimers: All NHM-EP numbers are preliminary. Numbers in Red are estimates. Others are measured.
Numbers were measured using Intel Compiler 11.0 binaries from Oct 2008.
Final numbers (with minor variations from above) will be based on newer binaries.
“Peak” and “SPEED” results are WIP as well. All results are with HT ON and Turbo ON. Jan 26, 2009.
Data Source: Kuppuswamy Sivakumar, Intel Corporation, SPG Marketing
* Other names and brands may be claimed as the property of others. Copyright © 2009, Intel Corporation. 34
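As a rough illustration of how the preliminary scores above scale across SKUs, the sketch below compares each SKU's score ratio against its core-frequency ratio, using the E5504 as an arbitrary baseline. The numbers are copied from the table (some of which the slide marks as estimates); the helper name `scaling_vs` and the choice of baseline are assumptions made for this example only.

```python
# SKU -> (core GHz, SPECint_rate_base2006, SPECfp_rate_base2006),
# copied from the preliminary table above.
SKUS = {
    "E5504": (2.00, 125, 110),
    "E5506": (2.13, 130, 113),
    "E5520": (2.26, 185, 154),
    "E5530": (2.40, 192, 158),
    "E5540": (2.53, 199, 161),
    "X5550": (2.66, 225, 185),
    "X5560": (2.80, 230, 188),
    "X5570": (2.93, 235, 190),
}


def scaling_vs(baseline: str, sku: str) -> dict:
    """Return frequency and score ratios of `sku` relative to `baseline`."""
    base_ghz, base_int, base_fp = SKUS[baseline]
    ghz, int_rate, fp_rate = SKUS[sku]
    return {
        "frequency": ghz / base_ghz,
        "int_rate": int_rate / base_int,
        "fp_rate": fp_rate / base_fp,
    }


if __name__ == "__main__":
    for name in SKUS:
        r = scaling_vs("E5504", name)
        print(f"{name}: freq x{r['frequency']:.2f}, "
              f"int_rate x{r['int_rate']:.2f}, fp_rate x{r['fp_rate']:.2f}")
```

Where the score ratio exceeds the frequency ratio, the difference would have to come from something other than clock speed, such as the faster QPI and memory listed for the higher SKUs.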
35. 2-Socket Quad-Core Intel® Xeon® Processor 5500 Series based platforms
SSD Performance Comparison using ANSYS ® Mechanical™ 12 P7 benchmarks
Test System Configuration and Results
Configuration | Quad Core Intel® Xeon® 5482 (Harpertown) 3.2/1600 | Quad Core Intel® Xeon® 5570 (Nehalem) 2.93/6.4 | Quad Core Intel® Xeon® 5570 (Nehalem) 2.93/6.4
System Baseboard | Supermicro X7DB8+ | Supermicro X8DTN+ | Supermicro X8DTN+
Processors | Intel Xeon 5482 | Intel Xeon 5570 | Intel Xeon 5570
number/type sockets | 2 Quad-core | 2 Quad-core | 2 Quad-core
core frequency | 3.2 GHz | 2.93 GHz | 2.93 GHz
LL cache size | 2x 6144 KB | 8192 KB | 8192 KB
Chipset FSB/QPI | Seaburg 1600 MT/s | Tylersburg 6400 MT/s | Tylersburg 6400 MT/s
Memory | 16 GB | 24 GB | 24 GB
DIMMS | 8x2 GB FBD | 12x2GB DDR3 | 12x2GB DDR3
memory speed | 800 MHz | 1067 MHz | 1067 MHz
I/O Subsystem | 4x 15K RPM U320 SCSI RAID0 | 4x 15K RPM U320 SCSI RAID0 | 4x SLC SSD RAID0
Operating System | 64-bit Red Hat EL5U1 | 64-bit Red Hat EL5U3 | 64-bit Red Hat EL5U3
Elapsed time in seconds (lower is better) | 8124 | 6522 | 4680
Relative performance (higher is better) | 1 | 1.25 | 1.74
Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by
those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to
evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel
products, visit http://www.intel.com/performance/resources/limits.htm or call (U.S.) 1-800-628-8686 or 1-916-356-3104. Copyright © 2007, Intel Corporation.
* Other names and brands may be claimed as the property of others. Copyright © 2009, Intel Corporation. 35
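The "relative performance" row of the ANSYS table above is consistent with dividing the baseline elapsed time by each system's elapsed time. A minimal sketch of that arithmetic, assuming this is how the row was derived (the slide does not state the formula; the function and dictionary names are made up for this example):

```python
# Elapsed times in seconds, copied from the ANSYS Mechanical 12 table above.
ANSYS_ELAPSED_SECONDS = {
    "Xeon 5482, 15K SCSI RAID0": 8124,
    "Xeon 5570, 15K SCSI RAID0": 6522,
    "Xeon 5570, SLC SSD RAID0": 4680,
}


def relative_performance(baseline_seconds: float, candidate_seconds: float) -> float:
    """Speedup of the candidate over the baseline (higher is better)."""
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    baseline = ANSYS_ELAPSED_SECONDS["Xeon 5482, 15K SCSI RAID0"]
    for system, seconds in ANSYS_ELAPSED_SECONDS.items():
        print(f"{system}: {relative_performance(baseline, seconds):.2f}")
```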
36. 2-Socket Quad-Core Intel® Xeon® Processor 5500 Series based platforms
Intel® X25-E Extreme SSD Performance Comparison using MD Nastran benchmarks
Test System Configuration and Results
Configuration | Quad Core Intel® Xeon® 5482 (Harpertown) 3.2/1600 | Quad Core Intel® Xeon® 5570 (Nehalem) 2.93/6.4 | Quad Core Intel® Xeon® 5570 (Nehalem) 2.93/6.4
System Baseboard | Supermicro X7DB8+ | Supermicro X8DTN+ IN001 Rev 1.02 | Supermicro X8DTN+ IN001 Rev 1.02
Processors | Intel Xeon 5482 | Intel Xeon 5570 | Intel Xeon 5570
number/type sockets | 2 Quad-core | 2 Quad-core | 2 Quad-core
core frequency | 3.2 GHz | 2.93 GHz | 2.93 GHz
LL cache size | 2x 6144 KB | 8192 KB | 8192 KB
Chipset FSB/QPI | Seaburg 1600 MT/s | Tylersburg 6400 MT/s | Tylersburg 6400 MT/s
Memory | 32 GB | 24 GB | 24 GB
DIMMS | 8x4 GB FBD | 12x2GB DDR3 | 12x2GB DDR3
memory speed | 800 MHz | 1067 MHz | 1067 MHz
I/O Subsystem | 4 x 15K RPM U320 SCSI RAID0 | 4 x 15K RPM U320 SCSI RAID0 | 4x Intel® X25-E Extreme SATA Solid-State Drive RAID0
Operating System | 64-bit Red Hat EL5U1 | 64-bit Red Hat EL5U3 | 64-bit Red Hat EL5U3
Geomean for 12 workloads (lower is better) | 2838.52 | 2137.04 | 1772.29
Relative performance (higher is better) | 1.00 | 1.33 | 1.60
* Other names and brands may be claimed as the property of others. Copyright © 2009, Intel Corporation. 36
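Because the two Nehalem columns in the MD Nastran table above are listed with the same platform and differ only in the I/O subsystem, the overall gain can be split into a platform factor and a storage factor. The sketch below does that with the geomean values from the table; the function name `split_gain` and the interpretation of the two factors are assumptions made for illustration, not something the slide states.

```python
# Geomean over 12 MD Nastran workloads (lower is better), from the table above.
HARPERTOWN_HDD = 2838.52   # Xeon 5482, 15K SCSI RAID0
NEHALEM_HDD = 2137.04      # Xeon 5570, 15K SCSI RAID0
NEHALEM_SSD = 1772.29      # Xeon 5570, X25-E SSD RAID0


def split_gain(old_hdd: float, new_hdd: float, new_ssd: float) -> tuple:
    """Return (platform_factor, storage_factor, total_factor).

    platform_factor: new platform vs. old platform, both on HDD.
    storage_factor:  SSD vs. HDD on the new platform.
    total_factor:    new platform with SSD vs. old platform with HDD.
    """
    platform = old_hdd / new_hdd
    storage = new_hdd / new_ssd
    total = old_hdd / new_ssd
    return platform, storage, total


if __name__ == "__main__":
    platform, storage, total = split_gain(HARPERTOWN_HDD, NEHALEM_HDD, NEHALEM_SSD)
    print(f"platform x{platform:.2f}, storage x{storage:.2f}, total x{total:.2f}")
```

The platform and storage factors multiply to the total, so the storage factor isolates the share of the gain that comes from swapping the drives alone.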
37. Dual Core Performance Refresh Calculation Details
Intel Estimated (1,000 → 1,000)
Item | 2006 | 2009 | Delta / Notes
Product / Processor | Intel® Xeon® 5160 (3.00GHz) | Intel Xeon 5500 series (2.93GHz) |
Performance per Server | 45.1 SPECfp_rate_base2006 | 188 SPECfp_rate_base2006 | Up to 4.16x per server
Server Power (Watts) | 365W active / 240W idle | 329W active / 125W idle | Server active 20 hours and idle for 4 hours per day. Assumes cooling with 2.0 PUE.
# Servers needed | 1,000 | 1,000 |
# Racks needed | 48 racks | 48 racks | Same # of racks
Total Perf | 45,100 total SPECfp_rate_base2006 performance | 188,000 total SPECfp_rate_base2006 performance | Up to 4.16X performance boost
Annual kWh | 6,046,320 | 5,182,560 | Estimated 14% lower energy utilization
Annual Energy Costs | $604,632 | $518,256 | $86,376 less electricity costs per year. Assumes $0.10/kWh and 2x cooling factor
Cost of new HW | n/a | | Annual cost savings of $86,376
* Other names and brands may be claimed as the property of others. Copyright © 2009, Intel Corporation. 37
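The energy rows of the 1,000-to-1,000 refresh table above can be reproduced from the stated duty cycle (20 hours active, 4 hours idle per day), the 2.0 PUE, and the $0.10/kWh rate. One caveat: the slide's totals come out exactly only when a 366-day year is used. That day count is inferred from the numbers and is not stated anywhere in the source, so treat it as an assumption of this sketch, along with the function names.

```python
HOURS_ACTIVE_PER_DAY = 20
HOURS_IDLE_PER_DAY = 4
DAYS_PER_YEAR = 366        # assumption: the value that reproduces the slide's totals
PRICE_PER_KWH = 0.10       # USD, from the slide


def annual_kwh(active_watts: float, idle_watts: float, servers: int, pue: float,
               days: int = DAYS_PER_YEAR) -> float:
    """Annual facility energy in kWh for a fleet of identical servers."""
    daily_wh = active_watts * HOURS_ACTIVE_PER_DAY + idle_watts * HOURS_IDLE_PER_DAY
    return daily_wh * days * pue * servers / 1000.0


def annual_cost(kwh: float, price_per_kwh: float = PRICE_PER_KWH) -> float:
    """Annual electricity cost in USD, rounded to whole dollars."""
    return round(kwh * price_per_kwh)


if __name__ == "__main__":
    kwh_2006 = annual_kwh(365, 240, servers=1000, pue=2.0)
    kwh_2009 = annual_kwh(329, 125, servers=1000, pue=2.0)
    saved_kwh = kwh_2006 - kwh_2009
    print(f"2006: {kwh_2006:,.0f} kWh, ${annual_cost(kwh_2006):,}")
    print(f"2009: {kwh_2009:,.0f} kWh, ${annual_cost(kwh_2009):,}")
    print(f"saved: {saved_kwh:,.0f} kWh ({saved_kwh / kwh_2006:.1%}), "
          f"${annual_cost(saved_kwh):,}")
```

The same function can be pointed at other fleet sizes or PUE values, although with the rounded wattages printed on the later slides it may not match their totals to the last digit.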
38. Dual Core Performance Refresh Calculation Details
Intel Estimated (1,000 servers w/HDD → 1,215 w/SSD)
Item | 2006 | 2009 | Delta / Notes
Product / Processor | Intel® Xeon® 5160 (3.00GHz) | Intel Xeon 5500 series (2.93GHz) |
Performance per Server | 45.1 SPECfp_rate_base2006 | 188 SPECfp_rate_base2006 | Up to 5x per server
Server Power (Watts) | 365W active / 240W idle | 316W active / 117W idle | Server active 20 hours and idle for 4 hours per day. Assumes cooling with 2.0 PUE.
# Servers needed | 1,000 | 1,215 | Using SSDs and ½ size boards
# Racks needed | 48 racks | 48 racks | Same # of racks
Total Perf | 45,100 total SPECfp_rate_base2006 performance | 188,000 total SPECfp_rate_base2006 performance | Up to 5X performance boost
Annual kWh | 6,046,320 | 6,044,440 | Similar power requirements
Annual Energy Costs | $604,632 | $604,444 | Approximately the same power cost
* Other names and brands may be claimed as the property of others. Copyright © 2009, Intel Corporation. 38
39. Dual Core Performance Refresh Calculation Details
Intel Estimated (1,000 servers w/HDD → 1,869 w/SSD and optimized data center, PUE 2.0 → 1.3)
Item | 2006 | 2009 | Delta / Notes
Product / Processor | Intel® Xeon® 5160 (3.00GHz) | Intel Xeon 5500 series (2.93GHz) |
Performance per Server | 45.1 SPECfp_rate_base2006 | 188 SPECfp_rate_base2006 | Up to 7.8x per server
Server Power (Watts) | 365W active / 240W idle | 316W active / 117W idle | Server active 20 hours and idle for 4 hours per day. Assumes cooling with 2.0 PUE.
# Servers needed | 1,000 | 1,869 | Using SSDs and ½ size boards
# Racks needed | 48 racks | 48 racks | Same # of racks
Total Perf | 45,100 total SPECfp_rate_base2006 performance | 188,000 total SPECfp_rate_base2006 performance | Up to 7.8X performance boost
Annual kWh | 6,046,320 | 6,043,690 | Similar power requirements
Annual Energy Costs | $604,632 | $604,369 | Approximately the same power cost
* Other names and brands may be claimed as the property of others. Copyright © 2009, Intel Corporation. 39
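The "up to 4.16x", "up to 5X", and "up to 7.8X" headlines on the three refresh slides above are consistent with multiplying the server count by the per-server SPECfp_rate_base2006 score and dividing by the 2006 fleet total. The sketch below shows that arithmetic under the assumption that aggregate throughput scales linearly with server count; the scenario labels and function name are invented for this example.

```python
SCORE_2006 = 45.1    # SPECfp_rate_base2006 per server, Xeon 5160
SCORE_2009 = 188.0   # SPECfp_rate_base2006 per server, Xeon 5500 series
SERVERS_2006 = 1000

# Scenario label -> number of 2009 servers in the same 48 racks (from the slides).
SCENARIOS = {
    "HDD, full-size boards": 1000,
    "SSD, half-size boards, PUE 2.0": 1215,
    "SSD, half-size boards, PUE 1.3": 1869,
}


def fleet_gain(new_servers: int) -> float:
    """Ratio of aggregate 2009 fleet score to aggregate 2006 fleet score."""
    return (new_servers * SCORE_2009) / (SERVERS_2006 * SCORE_2006)


if __name__ == "__main__":
    for label, servers in SCENARIOS.items():
        total = servers * SCORE_2009
        print(f"{label}: {servers} servers, total {total:,.0f}, "
              f"gain x{fleet_gain(servers):.2f}")
```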
Editor's Notes
Communicate how HPC and workstations work together. Technical computing is a combination of workstations and high-performance computing clusters. The technical computing industry is driven to deliver results fast. Workstations are required to create, and HPC clusters are needed to simulate and analyze. After you analyze the data you can visualize the results to enable faster innovation and discovery.
This slide is the springboard into the rest of the presentation.
– Performance: maximizing performance per square meter and performance per watt to reduce TCO is what the user is seeking.
– Versatility: customers want to see immediate results when they port their software over to a new architecture. Intel needs to provide the tools to ensure versatility.
– Ease of Deployment: customers want to purchase and easily deploy a cluster. They want to maximize their ROI by seeing their assets go to work immediately. No one wants to see an asset sitting in a datacenter waiting to be installed, or to spend time debugging why it isn’t running as promised.
The Intel Xeon processor 5500 series delivers up to 3x the performance of previous-generation processors in HPC. New technology available in the processor allows users to optimize the processor for their environment. A more efficient processor provides an even lower TCO than was achievable with previous-generation processors.
The slide identifies the key features of the Intel® Xeon® processor 5500 series that enable it to be the ideal solution for the customer’s HPC environment.
By optimizing for your environment you can achieve lower TCO. Nehalem processor performance is about more than just frequency: QPI speed, memory speed, and Turbo and HT support need to be considered. To ensure they are maximizing performance, customers need to consider the advanced SKUs. They offer the highest frequency, the fastest QPI, support for the fastest memory, more Turbo (up to 400 MHz), and HT.
The graph identifies the positive performance results advanced SKUs have in HPC. Y axis: SPEC scores. X axis: Intel® Xeon® processor 5500 series SKUs.
SSDs
– Extreme Performance: >100x IOPS performance gains vs. 15K HDD
– Power Efficient: >5x lower power vs. 15K HDD
– Increased Reliability: 2.0M hrs MTBF vs. 1.20M hrs MTBF for 7.2K WD RE2
– Reduce system cost: replace HDD and memory with SSDs
10GbE
– Extreme Performance: iWARP provides low latency over 10GbE; low overhead and high bandwidth
– Increased Reliability: over 25 years delivering leading Ethernet products; broad OS support; designed for multi-core
– Power Efficient: low-power design, <3.5W
– Lower TCO: consolidated fabric through industry-standardized technology
Pulling together everything we just talked about enables a possible data center to increase its performance by up to 7.8x while staying within the same footprint. In 2006 the datacenter was using 5160 series processors with HDDs, standard 1U rack servers, and a PUE of 2.0. Today we can refresh the datacenter with Intel Xeon processor 5500 series, solid-state drives, and half-size 1U motherboards, and improve the PUE from 2.0 to 1.3. By doing all of this we are able to achieve a performance increase of 7.8X. In the example above, the only benefit we are gaining from the SSDs is lower power. Depending on your environment you may also achieve a performance benefit with SSDs. The PUE (Power Usage Effectiveness) improvement from 2.0 to 1.3 will require an investment in your datacenter; 1.3 is what current datacenters are being designed to.
If you wanted to do a 1,000-to-1,000 refresh (using HDDs and full-size boards) you can achieve a 4x performance gain and a power savings of 14%.
Using the same PUE with SSDs, you can increase performance by greater than 5X and add 215 nodes to the datacenter.
By keeping the same footprint as Woodcrest, 1,000 servers, you can get a performance increase of >4x and a power savings of 14%, or 863K kWh.
5x5x5 = 5x more power states, 5x lower idle power, 5x faster transitions between power states.
The Intel architecture is easy to use and flexible. IA enables software to scale from one generation to another while achieving increased performance. By optimizing code you can achieve even greater performance. Intel software tools enable an easy transition from one generation to another and help prepare you for the future.
We now must look at the applications supporting HPC and ensure they are taking advantage of the technology designed into Nehalem. Is the code parallelized? Is it optimized for NHM? For many years applications have been able to take advantage of increased frequency to improve performance. Now we are offering more cores to gain performance. ISVs are now taking their serial code and parallelizing it. This is a challenge Intel is trying to make as simple as possible. Debug and tune become equally important to carry forward to many-core. This is the heterogeneous tool set now, as many-core applications scale to terascale on clients, and these terascale nodes make clusters of petascale machines.
Better performance, multi-core advancements, and support for Intel® Core™ i7 processors. New versions of the SW tools were released in Nov. 08.
The first step in the cycle is to gain insight into your code by analyzing it with tools such as VTune Performance Analyzer and/or Thread Checker.
Next, you parallelize your code with Intel tools such as Intel® Threading Building Blocks, compilers, and performance libraries.
After you parallelize your code you review the results for correctness/confidence. If you do not achieve the results you expect, you can begin the cycle again with insight.
Once you have achieved the desired results, you then perform a final optimization to ensure peak performance with Intel® VTune Performance Analyzer and Thread Profiler.
Intel understands users want to quickly deploy their cluster to ensure they are maximizing their investment. To quickly deploy the cluster, Intel has developed a specification called Intel Cluster Ready. The specification enables OEMs to create recipes. Recipes can be combined with certified software to create a certified cluster configuration. The configurations can be validated with Intel Cluster Checker to quickly ensure the cluster has been properly configured. This allows for a simple way to install and launch a cluster. The end result is an awesome out-of-box experience.
Let’s talk a bit more about ICR. ICR enables users to simplify the purchasing process: identify the certified software and certified cluster to ensure compatibility.
– Simplify manufacturing: enables manufacturers to build a particular configuration over and over.
– Simplify deployment: deploy the cluster, run Intel Cluster Checker, and the system should run. If it does not run as it should, Intel Cluster Checker should identify the issue for quick resolution.
– Simplify management: it is easier to manage and ensure uptime with a certified cluster. Intel Cluster Checker can be run at any time to ensure the system has all key components operating properly.
We continue to track to the tick-tock strategy. Our future is bright as we transition to new process technology and new technologies.
Having a strong future is important, but how we will meet the challenges of today is what I would like to focus today’s presentation on. Let’s get started.
Intelligent performance helps deliver a lower TCO as well as ~3x the performance of previous-generation processors. Intel software tools enable users to easily optimize their software to maximize performance on current and future generations of IA hardware. Intel Cluster Ready makes deploying a cluster easy.
Hard to get to 95%. Hard to get to 200X. Nehalem delivers ~3X vs 18.26X (6X delta). Most apps will resemble Amdahl’s. Nehalem ~3X vs accelerator increase of 3.25X. Is the pain worth the glory?
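The 18.26X figure in the last note matches Amdahl's law if 95% of the run time is accelerated by 200X. That reading of "95%" and "200X" is an interpretation of the terse note, not something the source spells out, and the function names below are made up for this sketch.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when `accelerated_fraction` of the run time is sped up by `factor`.

    Amdahl's law: 1 / ((1 - p) + p / s).
    """
    serial_part = 1.0 - accelerated_fraction
    return 1.0 / (serial_part + accelerated_fraction / factor)


def amdahl_limit(accelerated_fraction: float) -> float:
    """Upper bound on speedup as the acceleration factor grows without limit."""
    return 1.0 / (1.0 - accelerated_fraction)


if __name__ == "__main__":
    s = amdahl_speedup(0.95, 200)
    print(f"95% of run time accelerated 200X -> overall x{s:.2f}")
    print(f"ratio to a ~3X gain: x{s / 3:.1f}")
    print(f"ceiling with 95% accelerated: x{amdahl_limit(0.95):.0f}")
```

Under that reading, even a 200X accelerator yields roughly 18X overall, about six times the ~3X gain quoted in the note for Nehalem, and the untouched 5% caps the achievable speedup at 20X.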