January 2011
MANAGEMENT BRIEF
Value Proposition for
IBM POWER7 Based Blade Servers
Analysis Based on User Experiences
International Technology Group
4546 El Camino Real, Suite 230
Los Altos, California 94022-1069
ITG Telephone: (650) 949-8410
Facsimile: (650) 949-8415
Email: info-itg@pacbell.net
TABLE OF CONTENTS
EXECUTIVE SUMMARY
Differentiation
Customers
Futures
TELCORDIA: GETTING IT RIGHT FIRST TIME
UPMC: SUCCESS IN PARTITIONING
DANCERACE: BUILDING A BUSINESS ON TECHNOLOGY
UNIX SERVER BLADES
Overview
Platforms and Products
Transitions
HP Integrity
IBM Power
Sun T-Series
Comparative Performance
Virtualization Capabilities
Partitioning
Workload Management
Availability Optimization
X86 SERVER BLADES
Differentiators
Comparative Performance
Virtualization Capabilities
Virtualization Enablers
Size and Scale
Complexity
Availability and Security
Comparing Availability
Security and Malware Resistance
LIST OF FIGURES
1. Latest-generation HP, IBM and Sun UNIX Blades
2. HP Integrity Itanium 2- and 9300-based Models
3. Comparative Performance: IBM POWER6 and POWER7 Based Blades
4. Comparative Performance per Socket: POWER6 and POWER7 Based Blades
5. Software-based Minimum Partition Sizes: HP, IBM and Sun Blades
6. Hewlett-Packard Workload Manager Services
7. POWER7 Based Systems Virtualization Capabilities
8. Key POWER7 Availability Optimization Technologies
9. Key AIX 7.1 Availability Optimization Features
10. System Environment Layers: Example
11. Major Components of VMware vSphere 4 Environment
EXECUTIVE SUMMARY
DIFFERENTIATION
For almost a decade, blades have been one of the fastest-growing segments of the server market.
In 2010, blades will probably account for more than 25 percent of x86 server sales. Demand for
UNIX blades has also shown more rapid growth over the last few years.
The appeal of both types of blade has been driven by common factors. Server consolidation has
enabled organizations to realize space, energy and other savings. Capacity upgrades and
provisioning have been facilitated. Network complexity has been reduced.
However, although hardware packaging may be similar, blades from different vendors are far
from the same. Variations in performance, virtualization, availability and other capabilities reflect
differences in system architectures, processors and operating systems.
In comparing UNIX blades from Hewlett-Packard (HP), IBM and Oracle’s Sun, these differences
are clearly apparent. The leadership position that IBM’s Power Systems have gained in the overall
UNIX server market extends to blades built around the company’s POWER7 Architecture.
POWER7 based blades are also significantly differentiated from their x86 counterparts. They
deliver higher performance, more granular partitioning and more effective workload management
than Windows and x86 Linux blades equipped with VMware and equivalents.
The value proposition for POWER7 based blades is materially reinforced by higher availability
and better security than competitive platforms. These affect not only the quality of service, but
also the cost-effectiveness experienced by users.
POWER7 based blades may not be appropriate for all applications. But it is important to
understand how they differ from competitive platforms, and how the distinctive strengths of
POWER7 Architecture may provide unique customer value.
CUSTOMERS
This report presents three case studies of such value.
Telcordia, a leading player in the highly competitive market for mobile communications
solutions, employs POWER7 based blades and the AIX operating system to deliver the real-time
performance and 24/7 availability required by its customers.
University of Pittsburgh Medical Center (UPMC) exploits POWER7 based blades, highly
granular PowerVM partitioning and AIX to consolidate application and Web serving across its
full range of business-critical systems.
Dancerace, a 20-person UK company that offers online invoice discounting, factoring and trade
financing, employs POWER7 based blades and the IBM i operating system to deliver services to
a worldwide customer base that can afford neither delays nor downtime.
All three organizations are recognized leaders in their respective industries. All selected
POWER7 based blades and IBM BladeCenter chassis to handle demanding workloads, support
growth and maintain continuous availability for applications that run their businesses.
FUTURES
For Telcordia, UPMC, Dancerace and many other organizations, the selection of POWER7 based
blades was influenced as much by future as present needs. Use of blades is evolving in ways that
play to the strengths of Power Architecture.
When blades first came into widespread use in the early 2000s, they were employed primarily for
“scale-out” applications requiring light-duty servers. Over time, however, blades have moved into
broader roles. Databases, transactional and – increasingly – mixed workloads must be handled
with the same efficiency and reliability as conventional platforms.
This trend has intersected with another: growing use of virtualization. Deployment of blades, as
well as the adoption of VMware, Xen and equivalents, has been driven by server consolidation.
Support for multiple partitioned instances, and execution of the often-diverse workloads that
these generate, have become key new requirements.
Power Systems are industry leaders in these areas of capability. Among UNIX blade vendors,
HP’s Integrity platform comes closest to Power strengths. Latest-generation Itanium 9300 Series-
based Integrity blades, however, are outclassed by POWER7 in performance terms, and lack key
reliability, availability and serviceability (RAS) features offered on larger Integrity models.
Sun’s T-Series blades implement an architecture that was designed to handle high-volume, low-
impact Internet workloads. It performs less well in other roles. Performance, virtualization,
workload management, availability optimization and other capabilities are significantly weaker
than those of HP Integrity and IBM Power Systems.
Compared to x86 blades, key POWER7 differentiators include higher performance as well as the
integration, stability and resilience of IBM AIX and i; the ability of PowerVM to support higher
concentrations of diverse guest workloads; and industry-leading automation features that enable
greater operating efficiency and significantly reduce administrative overhead.
There are also a number of differences in blade chassis design that tend to favor POWER7 based
BladeCenter systems over HP BladeSystem, Sun Blade 6000 and other competitive equivalents.
This does not mean that POWER7 based blades are appropriate in all roles. There are numerous
applications, particularly in the x86 space, where this may not be the case. HP, Sun and x86
blades can provide a great deal of value to organizations consolidating servers. But for the most
demanding and business-critical applications, POWER7 based systems are strong candidates.
Telcordia: Getting It Right First Time
This case study is based on an interview with Richard
Goldberg, Product Manager, Service Delivery Solutions for
Telcordia. Richard is responsible for the Telcordia
Converged Application Server, which is built upon IBM
BladeCenter systems with POWER blades.
Telcordia provides software, turnkey systems and services to
communications services suppliers worldwide. Based in
Piscataway, New Jersey, it was formed in 1984 and was
originally known as Bell Communications Research or Bellcore. Currently,
it is a privately held company employing more than 2,800 people and operating
in 15 countries. Approximately half of its business is outside the U.S.
Telcordia provides service delivery and charging, operational support system
(OSS) and interconnection solutions, along with research and consulting.
Customers include landline, mobile and converged operators, Internet service
providers (ISPs), government entities and companies offering multiple types of
service. Telcordia has expanded rapidly during the 2000s, particularly in fast-
growth, highly competitive communications markets in Asia and Europe.
The company has won numerous industry awards for innovation, product quality
and network design. It is widely regarded as an industry leader in several areas
including real-time operational, policy management and billing solutions for
converged services.
Since 2005 Telcordia has employed IBM POWER based BladeCenter servers to
host the core platform of its Service Delivery Suite, which includes the Telcordia
Converged Application Server. The Telcordia Converged Application Server
incorporates a base set of applications for real-time policy, charging, service
creation and converged and interactive services.
These may be configured with additional Telcordia modules, or employed to
develop customized service solutions for individual operators. Systems employ
clusters of multiple blades to form a highly available solution.
Telcordia’s next-generation version of the Converged Application Server, which
will be delivered in the first quarter of 2011, will be built around POWER7 based
blades. The company evaluated, but decided not to use, late-model POWER6+
based BladeCenter JS23 and JS43 blades. Price/performance for POWER7 based
models equipped with DDR3 RAM and a higher-speed memory bus was found to
be superior. A number of competitive platforms were similarly rejected.
Use of BladeCenter systems, according to the company, provides the flexibility
to meet a wide range of customer needs, and allows for rapid, non-disruptive
scaling. Many of the company’s customers experience high levels of business
growth over long periods. The Telcordia solution is designed to scale from
entry-level configurations to systems supporting more than 100 million
customers.
Other Telcordia requirements that POWER based BladeCenter servers have met
include high levels of performance to support real-time operations, including an
in-memory database. Availability is also critical. The company advertises
“better than five nines” uptime for the solution. The applications it supports
typically operate on a 24/7 basis, and outages can translate into serious
customer losses.
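To put the "five nines" figure in perspective, the downtime budget implied by an
availability percentage can be worked out directly. The sketch below is
illustrative arithmetic only: the function name and the 365-day year are
assumptions made for this example, not figures supplied by Telcordia or IBM.

```python
# Illustrative arithmetic only: annual downtime budget implied by an
# availability percentage. The 365-day year is an assumption.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def annual_downtime_minutes(availability_percent: float) -> float:
    """Return the maximum minutes of downtime per year at the given availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    levels = [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]
    for label, pct in levels:
        minutes = annual_downtime_minutes(pct)
        print(f"{label} ({pct}%): {minutes:.2f} minutes per year")
```

Under these assumptions, five nines allows a little over five minutes of downtime
per year, so "better than five nines" implies less than that.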
Telcordia also cites the stability and predictability of AIX as a key factor in its
commitment to POWER BladeCenter servers. Telcordia bundles not only
application offerings, but also complex custom middleware into its solutions.
Close integration and optimization, and exhaustive testing of overall system
packages are mandated. Operating system problems would not be welcome at
any time during the lifecycle of the next-generation solution.
For this reason, Telcordia closely reviewed the IBM AIX Roadmap. It was
determined that this roadmap would meet its requirements for the foreseeable
future.
In marketing to communications services providers, the company claims that:
As the global leader in the development of mobile, broadband and enterprise
software and services, Telcordia is known for getting it right the first time.
No argument there.
UPMC: Success in Partitioning
This case study is based on an interview with Iftekhar Kazi
(Senior Enterprise Architect) at the University of Pittsburgh
Medical Center (UPMC) based in Pittsburgh, Pennsylvania.
Additional information was supplied by Bill Hirsch
(Manager of Systems Support for AIX).
UPMC and Power
The University of Pittsburgh Medical Center (UPMC) is one of the largest and
most respected health care organizations in the United States.
UPMC is a not-for-profit organization affiliated with the University of Pittsburgh
Schools of the Health Sciences. It runs industry-leading programs in
transplantation, oncology, neurosurgery, psychiatry, orthopedics, sports
medicine and other areas, and has been nationally recognized by business and
industry groups for innovation and service excellence.
UPMC currently operates 20 hospitals with 4,000+ inpatient beds, along with
more than 400 clinics and outpatient locations in Western Pennsylvania, and
health care facilities in Europe. It employs more than 2,700 doctors. Its health
insurance arm covers more than 1.4 million members.
UPMC has grown during the 2000s through expansion of existing programs and
facilities, along with acquisitions and joint ventures that have expanded its local
and global presence. Between 2005 and 2009, operating revenues grew from $5
billion to almost $8 billion, while numbers of employees increased from 40,000
to 50,000.
UPMC standardized on IBM POWER based systems and AIX in 2005, and now
uses these for all systems. Patient care, scheduling, billing, Oracle’s PeopleSoft
and other applications run on a variety of IBM POWER based platforms, ranging
from blades to a high-end Power 595 server.
Growth posed challenges for the UPMC IT infrastructure. After researching a
number of options, an aggressive program to deploy Power and AIX partitioning
was decided upon. By using LPARs and micro-partitions to improve capacity
utilization, the organization was able to reclaim around 50 percent of its
processor capacity.
Workload growth could thus be supported while limiting new hardware and
software investments. A further benefit was that capacity could be brought online
more rapidly, and at lower cost. Time to provision a new server was reduced from
a day or more to a few minutes.
By the time this exercise had been completed, UPMC had become an industry
leader in the use of Power partitioning and workload management. A key lesson
learned is that workloads must be understood in detail, and monitored and
managed with high levels of granularity.
Blade Deployments
In 2008, the Center decided to deploy blade servers to consolidate application
and Web serving workloads that required low levels of disk and I/O throughput.
The strategy was to employ AIX Micro-Partitioning to consolidate large numbers
of instances onto fewer physical machines.
After initially deploying POWER6 based blades, UPMC moved to POWER7 technology
as soon as this became available.
[Photo: UPMC Data Center: Racks and Blades]
The Center has installed six 16-core PS702s with 256GB of memory, hosting up to
40 partitions each (compared to 24 for POWER6 based blades). The higher
performance and scalability of POWER7 based blades allows the Center to
achieve higher levels of concentration than with POWER6 based equivalents.
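The consolidation gain can be illustrated with simple arithmetic on the figures
UPMC reported. This is a rough sketch: the variable names are hypothetical, and
the "up to" partition limits are treated as if they were exact, which real
deployments would not necessarily reach.

```python
# Illustrative arithmetic using the partition counts quoted above.
# Variable names are hypothetical; "up to" limits are treated as exact.
import math

POWER7_BLADES = 6                 # PS702 blades installed
PARTITIONS_PER_POWER7_BLADE = 40  # maximum quoted for POWER7 based blades
PARTITIONS_PER_POWER6_BLADE = 24  # maximum quoted for POWER6 based blades
CORES_PER_PS702 = 16

# Total partitions the six PS702 blades can host at the quoted maximum.
total_partitions = POWER7_BLADES * PARTITIONS_PER_POWER7_BLADE

# POWER6 based blades that would be needed to host the same partition count.
power6_blades_needed = math.ceil(total_partitions / PARTITIONS_PER_POWER6_BLADE)

# Average partition density per core on a fully loaded PS702.
partitions_per_core = PARTITIONS_PER_POWER7_BLADE / CORES_PER_PS702

print(f"Total partitions on POWER7 blades: {total_partitions}")
print(f"POWER6 blades needed for the same count: {power6_blades_needed}")
print(f"Partitions per core on a PS702: {partitions_per_core}")
```

On these numbers, six POWER7 based blades host as many partitions as ten POWER6
based blades would at their quoted maximum.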
UPMC also employs pools of shared processors, along with Power system
mechanisms that allow these to use spare cycles in dedicated partitions. System
resources can be re-allocated in a rapid and flexible manner as needs change.
According to UPMC, consolidation has been facilitated by the IBM BladeCenter
Open Fabric Manager (BOFM). This enables high-speed switching across
multiple BladeCenter chassis, allowing blades to be located, and new capacity
activated at any point within the Center’s blade infrastructure. BOFM also
provides automated failover between blades in different chassis in the event that
a failure occurs.
Other benefits of employing POWER based blades include lower operating costs.
Footprints and energy consumption have been reduced not only for servers, but
also for network connections. A BladeCenter, according to UPMC, requires eight
to ten network cables rather than the 60 to 80 that would have been required if
the same workloads had been deployed on conventional servers.
[Photo: UPMC Data Center: Control Center]
UPMC plans to build upon its successes by implementing more granular service-
level management, extending automated provisioning across all hardware and AIX
components, and more closely integrating server and storage virtualization. The
Center has been an early adopter of IBM’s SAN Volume Controller (SVC), and sees
important synergies between this and its POWER and AIX based environment.
UPMC prides itself on being an innovator in health care. But that, clearly, is not
the only area in which it is on the industry’s leading edge.
Dancerace: Building a Business on Technology
This case study is based on an interview with Anthony
Avison, Chief Executive of Dancerace plc. The company is
based in the city of Bath in the United Kingdom. Avison
founded the company in 1992 and has personally led its
technology strategy since that time.
Dancerace is one of a new breed of companies: a global
business that processes billions of pounds with around 20 employees.
Dancerace specializes in software and services for invoice discounting, factoring
and trade financing; i.e., raising funds from receivables. In a tight credit market,
this approach has proved popular among businesses of all sizes.
Facing larger players in a near-saturated market, for almost 20 years Dancerace
has sought to differentiate itself through the way in which it uses IT. The company
was the first in its industry to offer products over the Internet, and has used a
combination of leading-edge proprietary applications and latest-generation IBM
technology to gain and retain customers. Continuing this tradition, Dancerace was
the first IBM customer in the UK to deploy POWER7 based blade servers.
The company’s business is entirely online. Customers are located in Europe,
Australia and Asia, as well as in developing countries in Africa, where Dancerace
is involved in local micro-financing initiatives.
Dancerace’s core system must meet exacting requirements. Workloads for
individual customers vary widely. The largest runs to millions of transactions, but
there are also customers for whom Dancerace services comparatively small
portfolios. The system must be capable of handling diverse, often fluctuating
workloads and of growing rapidly as the business expands.
In 2008, Dancerace replaced its earlier IBM System i server with a BladeCenter
chassis and three POWER6 based blades running IBM i. Two years later, the
company redeployed these as standby failover and recovery systems and
substituted a new BladeCenter H with two POWER7 based PS700 blades for
production.
The two new blades were able to handle the same workloads as the three
POWER6 based models. A third PS700 was later added to support growth.
Key benefits of deploying IBM PS700s, according to Dancerace, included not only
higher performance, but also partitioning capabilities, and support for the IBM i
operating system – which is valued by Dancerace for its reliability, stability and
ease of management. Lower electricity consumption is also seen as important.
Dancerace describes itself as a “green” business.
Fourteen micro-partitions of varying sizes run on production, and seven on
standby blades. Workloads for individual customers are hosted in micro-
partitions, which can be modified automatically if their needs change. It is also
easy to add new partitions and/or blades as business expands.
The ability to maintain continuous uptime also played a key role in Dancerace’s
choice of PS700s. Twenty-four/seven availability is critical for the company.
Because customers depend upon Dancerace for short-term financing needs, even
brief outages could have a significant impact on their bottom lines. Dancerace
could lose customers, and its reputation could be undermined.
This has not occurred. As the company’s Website notes:
We’re easily in the top two suppliers to this market in the world but only we can
boast that none of our clients have lost a business day due to a fault of ours
since the first Dancerace systems were launched in 1994.
Which has proved to be a useful marketing message.
Management also put a more sophisticated disaster recovery infrastructure in
place. The objective was to provide a further level of protection against the effects
of network and power outages, fires, extreme weather conditions and other
events that might disable the company’s main data center.
Dancerace Data Center
                     PRIMARY SITE       STANDBY SITE
Storage              DS4800             DS3200
Chassis              BladeCenter H      BladeCenter S
Servers              PS700              JS12
Virtual servers      14 per blade       7 per blade
Operating system     IBM i              IBM i
Fabric management    BOFM               BOFM
The sites are 10 miles (16 kilometers) apart and are linked by duplexed,
fault-tolerant fiber optics.
The POWER6 based failover and recovery configuration is located at a secure site
approximately 10 miles away. An in-house solution built on the remote
journaling feature of the IBM i operating system has been put in place, enabling
service to be resumed in, at most, a few minutes. Duplexed, fault-tolerant fiber
optic networks link the main and standby sites.
Customers have been impressed by these arrangements, which have also proved
useful in attracting new accounts.
Dancerace has also invested in advanced data center infrastructures, SAN-
connected IBM disk systems and other state-of-the-art capabilities. Quality of
technology, according to management, has played an important role in the
company’s success.
Not bad for a 20-person company. As the British say: brilliant.
IBM Business Partner Imtech ICT and IBM personnel provided valued assistance in
deploying and supporting Dancerace’s blade-based systems.
UNIX SERVER BLADES
OVERVIEW
Over the last decade, three players – HP, IBM and Sun – have dominated the UNIX server market.
These companies are also the major players in UNIX server blades.
All three companies market UNIX and x86 server blades for use in common BladeSystem (HP),
BladeCenter (IBM) and Blade 6000 (Sun) chassis.
IBM and Sun began to offer UNIX server blades in 2003, and HP in 2005. IBM and HP have
offered blade models of their Power and Integrity platforms since that time. Sun has offered a
broader mix of products built around Sun UltraSPARC (now withdrawn) and T-Series processors,
along with x86 models running the Solaris operating system.
Sun x86 blades have enjoyed steady demand among users seeking to replace the company’s older
SPARC-based servers. Sales volume for T-Series blades has been significantly lower, for reasons
detailed later in this section. Market share calculations are often confused by the fact that the x86
version of Solaris is also supported on HP, IBM and other vendors’ x86 blades.
This section deals with HP Integrity, IBM Power and Sun T-Series blades. Vendor platforms and
products are first outlined, then capabilities in three key areas – performance, virtualization and
availability optimization – are compared. The following section compares x86 and UNIX server
blades, focusing primarily on differences between x86 and POWER7 based systems.
PLATFORMS AND PRODUCTS
Transitions
During 2010, the UNIX server product lines of HP, IBM and Sun have undergone significant
changes. HP Integrity systems have moved from Intel Itanium 2 to next-generation Itanium 9300
Series processors, while IBM Power Systems have transitioned from POWER6 to the POWER7
generation of architecture. These shifts, which were announced in April, have been reflected in
blade product lines.
HP now markets Integrity blades based on quad-core Itanium 9300 Series processors with rated
frequencies of 1.33 to 1.73 GHz, while IBM Power blades employ quad-core 3.0 GHz POWER7
processors. IBM also offers POWER7 six- and eight-core processors with frequencies of up to 4.0
GHz in other server forms.
In September 2010, Sun announced a third generation of T-Series systems (the first generation was
introduced in 2005) built around 1.65 GHz T3 processors. These systems included a single-socket
blade model, the T3-1B, which may be configured with an 8- or 16-core T3 processor.
In addition to their principal chassis, the c7000 and BladeCenter H respectively, HP and IBM offer
the smaller-format c3000 and BladeCenter S. IBM offers a compact version of the BladeCenter H
chassis designated BladeCenter E.
HP, IBM and Sun also offer modified chassis that comply with U.S. Network Equipment –
Building System (NEBS) standards for communications carrier applications.
HP and IBM offer blade fabric management tools, Virtual Connect and BladeCenter Open Fabric
Manager (BOFM) respectively. These handle switching and failover processes across multiple
chassis. HP offers its own line of switches, while IBM supports Blade Network Technologies,
Brocade, Cisco Systems and other third-party offerings. There is no Sun equivalent.
These products are summarized in figure 1. All three vendors also continue to market older blades.
Figure 1
Latest-generation HP, IBM and Sun UNIX Blades

Chassis (number of blades, size)
  HP:   BladeSystem c7000 – 16x, 10U
        BladeSystem c3000 – 8x, 6U
  IBM:  BladeCenter H – 14x, 9U
        BladeCenter E – 14x, 7U
        BladeCenter S – 6x, 7U
  Sun:  Blade 6000 – 10x, 10U
        Blade 6048 – 48x, 42U

Model(s)
  HP:   BL860c i2: 1-socket 1.6 GHz, or 2-socket 1.33, 1.6 or 1.73 GHz;
        up to 192GB main memory
        BL870c i2: 2- or 4-socket 1.33, 1.6 or 1.73 GHz;
        up to 384GB main memory
        BL890c i2: 4- or 8-socket 1.33, 1.6 or 1.73 GHz;
        up to 768GB main memory
  IBM:  PS700 Express: 1-socket 3.0 GHz; up to 64GB main memory
        PS701 Express: 2-socket 3.0 GHz; up to 128GB main memory
        PS702 Express: 4-socket 3.0 GHz; up to 256GB main memory
  Sun:  T3-1B: 1-socket 8- or 16-core T3 1.65 GHz; up to 128GB main memory

Operating systems
  HP:   HP-UX, Windows*, RHEL*, SLES
  IBM:  AIX, IBM i, RHEL, SLES
  Sun:  Solaris

Threads per core
  HP:   2
  IBM:  2–4
  Sun:  16

Fabric management
  HP:   Virtual Connect: 1-4 chassis
        Virtual Connect Enterprise Manager: 5-100 chassis
  IBM:  BladeCenter Open Fabric Manager (BOFM): 1-100 chassis
  Sun:  No equivalent

*Current versions
HP Integrity blades support the HP-UX operating system as well as Windows Server, Red Hat
Enterprise Linux (RHEL) and SUSE Linux Enterprise Server (SLES). IBM POWER7 based
models support IBM AIX and i, along with RHEL and SLES. An optional facility, Lx86, also
supports older 32-bit Linux applications. Sun T-Series blades support only Solaris.
IBM i is the latest version of an operating system that has been employed by over 200,000
organizations worldwide on IBM AS/400, iSeries and System i platforms, in some cases for more
than 20 years. Since 2008, IBM i has run on the same POWER based hardware platforms that
support AIX.
The IBM i operating system has proved popular among small, midsize and some large corporate
users, and is generally recognized as one of the most closely integrated, automated and reliable
system environments in existence. It is employed at Dancerace, one of the case studies presented
in this report.
HP Integrity
HP’s UNIX server strategy has, during the 2000s, been dominated by its commitment to Intel
Itanium processors for its Integrity platform.
The original Itanium design, which was developed by HP and Intel in the late 1990s, was intended
to address the high-end RISC and volume microprocessor markets with a single chip. A series of
performance shortfalls and technical problems resulted, however, in weak early market
penetration. Intel shifted its volume focus to the Xeon family.
By the mid-2000s, Integrity systems were outperformed by IBM Power equivalents. The Integrity
market position was undermined by delays in upgrading Itanium performance. The most recent
“Tukwila” or Itanium 9300 Series generation of processors, in particular, was originally scheduled
for release in 2007, but appeared only in February 2010.
One result is that Itanium processors have not matched progress in POWER technology. Current
Itanium 9300 Series processors are four-core chips with rated frequencies of 1.33 GHz to 1.73
GHz. IBM POWER7 processors range from four-core 3.0 GHz (employed in Power blades) to
eight-core 4.0 GHz chips.
Integrity market momentum has also been undermined by relatively weak appeal for Windows and
Linux deployment. Although Windows, RHEL and SLES have been supported on Integrity
systems since they were first introduced, the majority of installations – more than 80 percent,
according to most industry estimates – run HP-UX. Windows deployments are believed to account
for 5 to 10 percent.
In moving to Itanium 9300 Series processors, HP has replaced earlier Integrity rack-mount and
tower models. Apart from an entry-level rx2800 designed for remote office and small business use,
the entire Integrity product line is now blade based. The company appears to have abandoned the
market space previously occupied by its midrange rack-mounted rx6600, rx7640 and rx8640.
Figure 2 illustrates this transition.
Figure 2
HP Integrity Itanium 2- and 9300-based Models

ITANIUM 2-BASED
Model         BL860c, BL870c  rx2660, rx3600  rx6600       rx7640, rx8640       Superdome
Cores         2-4             1-4             2-8          2-32                 2-128
Partitioning  HPVMs           HPVMs           HPVMs        nPars, vPars, HPVMs  nPars, vPars, HPVMs
Form          Blade           Rack, Tower     Rack, Tower  Rack                 Rack

ITANIUM 9300 SERIES-BASED
Model         rx2800 i2       BL860c i2       BL870c i2    BL890c i2            Superdome 2
Cores         2-8             2-4             4            4-32                 16-128
Partitioning  HPVMs           HPVMs           HPVMs        HPVMs                nPars, vPars, HPVMs
Form          Rack, Tower     Blade           Blade        Blade                Blade

HPVMs = Integrity Virtual Machines
At the high end of the line, the Superdome 2 consists of Itanium 9300-based blades grouped in
8-, 16- and 32-socket configurations. These implement HP nPar hard partitioning technology.
However, nPars are not supported on other Itanium 9300 Series-based models, including blades. In
the past, HP has characterized nPars as essential to maintaining “business-critical” availability.
According to HP, the company remains committed to the Integrity platform, and plans to use future
Intel Itanium processors. These are expected to include “Poulson” and “Kittson” processors said by
Intel to be scheduled for 2012 and 2014 respectively. No details of these have been released.
IBM Power
Compared to the experiences of HP and Sun, the evolution of IBM Power Systems resembles the
story of “the tortoise and the hare.” The Power platform maintained a steady pace of
price/performance gains and functional improvement through its POWER5 (2004), POWER6
(2007) and POWER7 (2010) generations. Over this period, Power Systems emerged as the overall
market share leader in UNIX servers.
In contrast to the Intel Itanium 9300 Series and Sun T3 introductions, the POWER7 generation
represents a significant architectural transition.
At the chip level, the POWER7 design offers not only higher core densities but also features that
deliver high levels of performance with a comparatively small chip size. POWER7 processors
embed 1.2 billion transistors, compared to 2 billion for the Intel Itanium 9300 and Sun T3, and 2.3
billion for the eight-core Intel 7500 Series (Nehalem EX).
The POWER7 outperforms both. Reduced transistor counts contribute to both faster speeds and
lower energy consumption. Other new POWER7 capabilities include:
Intelligent Threads. The maximum number of threads supported per core increases from two
for POWER6 processors to four for POWER7. The IBM implementation allows workloads to
be executed using one, two or four threads per core. The approach is highly automated.
Systems can automatically determine which to use for optimum performance, or system
administrators may select the number of threads employed. In automatic mode, the system
provides continuous optimization of performance, which materially facilitates execution of
heterogeneous workloads.
Intelligent Cache. The POWER7 cache structure provides 256KB of on-chip Level 2 (L2)
cache per core, and 32 MB of shared Level 3 (L3) on-chip cache per processor. As with the
number of threads, the amount of cache employed for specific workloads may be determined
automatically by the system, or set by system administrators.
Active Memory Expansion. This enables system-managed compression and decompression of
data in memory. POWER7 is the first major commercial processor to offer this capability.
Compression rates of up to 50 percent are supported; i.e., useable main memory may be up to
double physical memory.
Exploitation of Active Memory Expansion is again highly automated. Compression and
decompression may be turned on and off by the system to optimize performance for partition
based workloads. The amount of memory made available in this manner may be subject to
system-level priorities.
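The relationship between compression rate and usable memory stated above can be checked with a short calculation. This is a minimal sketch, not IBM's sizing method: the function name `effective_memory_gb` is hypothetical, and it assumes that all data in memory compresses uniformly at the stated rate, so the result is an upper bound.

```python
def effective_memory_gb(physical_gb: float, compression_rate: float) -> float:
    """Return usable memory if all data compresses at the given rate.

    compression_rate is the fraction of space saved (0.5 means data
    occupies half its original size). Assumption: uniform compression.
    """
    if not 0.0 <= compression_rate < 1.0:
        raise ValueError("compression_rate must be in [0, 1)")
    return physical_gb / (1.0 - compression_rate)


# 64 GB of physical memory at the 50 percent maximum cited in the text
print(effective_memory_gb(64, 0.5))  # 128.0
```

At a 50 percent compression rate the divisor is 0.5, which is where the "up to double physical memory" figure comes from.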
According to IBM, a next generation of “POWER8” architecture is under development, although
no details have been released. It is expected that this will appear in the 2013 to 2014 timeframe.
Sun T-Series
Although there are significant differences between them, both Integrity and Power systems draw
upon Reduced Instruction Set Computing (RISC) design concepts intended to yield high levels of
performance for multiple types of workload.
In contrast, T-Series systems employ a distinctive architecture originally developed by Afara
Websystems, which was acquired by Sun Microsystems in 2002. This employs a combination of
low-frequency processors and large numbers of threads – up to 16 per core on latest-generation T3
systems. It was designed to support high-volume, low-latency Internet workloads.
Since the first T-Series models were introduced in 2005, they have been deployed primarily by
Internet Services Providers (ISPs), communications services providers and others for this type of
workload. T-Series systems have proved less effective for database- and transaction-intensive
workloads, which tend to run more efficiently using single threads on higher-frequency cores.
This limitation is recognized by Oracle, which recommends that T-Series systems should not be
employed for database workloads sensitive to response time, or for batch or “heavyweight” single-
threaded applications. Most commercial applications operate in single-threaded mode.
Oracle has committed to maintain and enhance the T-Series until 2015, after which the company
plans to introduce a new SPARC processor platform. No further details have been released.
COMPARATIVE PERFORMANCE
In comparing performance of Integrity, POWER and T-Series blades, certain caveats are in order.
At the time of writing, there was limited user experience with Itanium 9300- and POWER7 based
blades, which were introduced in April 2010, and with Sun SPARC T3 systems, which were
introduced in September 2010.
Comparative performance of previous generations of systems, however, may be taken as a general
baseline. Prior to 2010 introductions, IBM POWER6 based systems outperformed competitive
platforms with the same number of cores by margins of two to three times.
In transitioning to the POWER7 generation of systems, Power Systems performance has increased
substantially. Within the IBM BladeCenter product line, performance per core for POWER7 based
models has increased by 49 to 53 percent compared to POWER6 based and 19 to 24 percent
compared to POWER6+ based models. Figure 3 shows comparisons.
Figure 3
Comparative Performance: IBM POWER6 and POWER7 Based Blades
Model   Sockets   Processor   Cores           rPerf    rPerf/Core
POWER6 based
JS12    1         POWER6      2 x 3.8 GHz     14.71    7.36
JS22    2         POWER6      4 x 3.8 GHz     30.26    7.57
JS23    2         POWER6+     4 x 4.2 GHz     36.28    9.07
JS43    4         POWER6+     8 x 4.2 GHz     68.20    8.53
POWER7 based
PS700   1         POWER7      4 x 3.0 GHz     45.13    11.28
PS701   2         POWER7      8 x 3.0 GHz     81.24    10.16
PS702   4         POWER7      16 x 3.0 GHz    154.36   9.65
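The rPerf/Core column in Figure 3 is simply the rPerf rating divided by the core count, and the percentage gains quoted above follow from comparing those per-core values. The sketch below recomputes them from the figure's data. The helper names are illustrative only. Comparing the PS700 with the JS12 and JS22, for instance, gives gains of roughly 53 and 49 percent.

```python
# (model, processor, cores, rPerf) as listed in Figure 3
BLADES = [
    ("JS12", "POWER6", 2, 14.71),
    ("JS22", "POWER6", 4, 30.26),
    ("JS23", "POWER6+", 4, 36.28),
    ("JS43", "POWER6+", 8, 68.20),
    ("PS700", "POWER7", 4, 45.13),
    ("PS701", "POWER7", 8, 81.24),
    ("PS702", "POWER7", 16, 154.36),
]


def rperf_per_core(cores: int, rperf: float) -> float:
    """Per-core rating: total rPerf divided by number of cores."""
    return rperf / cores


def gain_percent(new: float, old: float) -> float:
    """Percentage increase of new over old."""
    return (new / old - 1.0) * 100.0


per_core = {model: rperf_per_core(cores, rperf)
            for model, _proc, cores, rperf in BLADES}

for old_model in ("JS12", "JS22", "JS23", "JS43"):
    gain = gain_percent(per_core["PS700"], per_core[old_model])
    print(f"PS700 vs {old_model}: {gain:.0f}% per core")
```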
Across the Power product line as a whole, performance typically increased by 40 to 60
percent compared to POWER6 based systems introduced in 2007, and by 20 to 30 percent
compared to systems based on POWER6+ processors introduced in 2008 and 2009.
These comparisons are based on IBM rPerf metrics, which measure comparative performance of
POWER based systems for commercial workloads. Users have found them to be generally
reliable. There are no HP or Sun equivalents.
Performance increases appear to be greater than for latest-generation HP Integrity and Sun T-
Series systems. A TPC-H 1TB decision support benchmark published by HP indicates, for
example, an average per core performance increase of 14 percent for Superdome 2 systems based
on Itanium 9300 Series 1.73 GHz quad-core processors compared to previous-generation
Superdome systems based on Itanium 2 1.6 GHz dual-core processors.
The probability is that the Sun T-Series transition from 8-core 1.6 GHz to 16-core 1.65 GHz
processors also represents, at best, a minor improvement in per core performance. The SPARC T3
processor consists of two T2 processors embedded on a single chip. It does not incorporate
significant functional changes that might further accelerate performance.
It may thus be reasonably concluded that POWER7 based systems have retained an approximately
two to three times advantage against latest-generation competitive systems with the same
number of cores.
In practice, as the density of cores per processor is significantly higher for latest-generation
systems, performance per socket has increased more dramatically. Within the BladeCenter product
line, for example, the evolution has been as shown in figure 4. Comparisons are again based on
IBM rPerf ratings.
Figure 4
Comparative Performance per Socket: POWER6 and POWER7 Based Blades
SINGLE-SOCKET SERVERS
JS12 – 2 x POWER6 3.8 GHz cores: rPerf 14.71
PS700 – 4 x POWER7 3.0 GHz cores: rPerf 45.13
DUAL-SOCKET SERVERS
JS23 – 4 x POWER6+ 4.2 GHz cores: rPerf 36.28
PS701 – 8 x POWER7 3.0 GHz cores: rPerf 81.24
FOUR-SOCKET SERVERS
JS43 – 8 x POWER6+ 4.2 GHz cores: rPerf 68.20
PS702 – 16 x POWER7 3.0 GHz cores: rPerf 154.36
In these comparisons, single-socket performance has increased by more than three times, and both
dual-socket and four-socket performance more than doubled. Price/performance ratios also, as is
noted in the Telcordia case study, appear to have improved significantly.
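The "more than three times" and "more than doubled" statements can be verified directly from the rPerf ratings in Figure 4. The following is a minimal sketch using only those published values. The dictionary layout and function name are illustrative.

```python
# (older blade rPerf, POWER7 blade rPerf) for each socket count, from Figure 4
PAIRS = {
    "single-socket (JS12 -> PS700)": (14.71, 45.13),
    "dual-socket (JS23 -> PS701)": (36.28, 81.24),
    "four-socket (JS43 -> PS702)": (68.20, 154.36),
}


def speedup(old: float, new: float) -> float:
    """Ratio of the newer blade's rating to the older blade's rating."""
    return new / old


for label, (old, new) in PAIRS.items():
    print(f"{label}: {speedup(old, new):.2f}x")
```

Because each pair has the same number of sockets, the ratio of total ratings equals the ratio of per-socket ratings.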
VIRTUALIZATION CAPABILITIES
Partitioning
Although virtualization and partitioning are often equated, in practice they are different.
Virtualization is a broader concept that extends to management of system resources in partitioned
environments. These capabilities are discussed separately below.
In comparing the capabilities of HP Integrity, IBM Power and Sun T-Series blades, three types of
partitioning should be addressed:
1. Hard partitioning refers to hardware- or microcode-based methods that allow better isolation
of workloads than software-based equivalents. Advantages include improved manageability –
workloads are less likely to interfere with each other – and reduced security exposure.
HP, IBM and Sun offer hard partitioning on at least some of their UNIX server platforms.
Only IBM offers this capability on blades.
Hewlett-Packard’s strategic hard partitioning technology, nPars, is supported only on
Superdome 2 systems. It requires specialized, comparatively expensive “cell blades,” which
duplicate the functions of cell boards employed in earlier Integrity Superdome systems.
Sun offers a hard partitioning capability, Dynamic Domains, on its M-Series servers, which
are not available in blade form. There is, however, no T-Series equivalent.
IBM logical partitions (LPARs), which are implemented in PowerVM microcode, may be
configured in increments as small as 1/10th of a core. Up to 10 LPARs are supported per core,
for totals of between 40 and 160 per blade. LPARs may host AIX, IBM i, RHEL and SLES
instances, or combinations of these.
2. Software-based partitioning, in this context, refers to software-based techniques employed to
host multiple operating system instances. All three vendors offer such techniques. There are
variations in granularity that are shown in figure 5.
Figure 5
Software-based Minimum Partition Sizes: HP, IBM and Sun Blades
HP INTEGRITY: HPVMs – 1/20th core
IBM POWER7: Micro-partitions – 1/10th core initial increment; 1/100th core subsequent increment
SUN T-Series: Oracle VM Server for SPARC – 1/8th core
IBM micro-partitions must initially be configured in increments of 1/10th, but may be
expanded in later increments of 1/100th of a core. HPVMs and Oracle VM Server for SPARC
were formerly known as Integrity Virtual Machines (IVMs), and Sun Logical Domains
(LDOMs) respectively.
The capabilities of software-based techniques overlap, to some extent, those of hard
partitioning equivalents. Hard partitions continue, however, to be preferred by many
organizations for their most performance- and availability-sensitive applications, and for
database serving.
Among the case studies presented in this report, for example, UPMC employs micro-partitions
to consolidate application and Web serving. Consolidation of database servers, however, is
handled by LPARs.
One reason for this preference is that, in a multitier architecture, database servers represent the
main point of vulnerability. If an application or Web server fails, an organization can switch to
an alternate. Loss of a database server, however, will disable the entire system. Loss of
database contents may have even more serious repercussions.
3. Application partitioning, in this context, means techniques that allow system resources to be
allocated to specific applications sharing a common operating system instance in a partition.
Application partitioning is typically employed for development, test and other non-production
instances as well as for light-duty production applications.
The best-known approach, Oracle Containers and Zones (Containers provide system resource
controls, while Zones define partitions), is supported on Sun T-Series as well as other Solaris
platforms. IBM Workload Partitions (WPARs) and HP Secure Resource Partitions provide
functionally similar capabilities.
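The IBM granularity figures given earlier in this list (up to 10 LPARs per core, a 1/10th-core initial increment and 1/100th-core subsequent increments) can be expressed as two simple rules. The sketch below is one interpretation of those figures, not PowerVM's actual validation logic. The function names are hypothetical, and it assumes a valid entitlement is any multiple of 1/100th of a core that is at least 1/10th of a core.

```python
from fractions import Fraction

LPARS_PER_CORE = 10                  # "up to 10 LPARs are supported per core"
MIN_ENTITLEMENT = Fraction(1, 10)    # smallest initial partition size
STEP = Fraction(1, 100)              # granularity of later adjustments


def max_lpars(cores: int) -> int:
    """Upper bound on LPARs for a blade with the given core count."""
    return cores * LPARS_PER_CORE


def is_valid_entitlement(cores: str) -> bool:
    """Check a processor entitlement given as a decimal string, e.g. "0.25"."""
    value = Fraction(cores)
    return value >= MIN_ENTITLEMENT and value % STEP == 0


print(max_lpars(4), max_lpars(16))                                  # 40 160
print(is_valid_entitlement("0.25"), is_valid_entitlement("0.05"))   # True False
```

The 40 and 160 results correspond to the 4-core and 16-core POWER7 blades, matching the per-blade totals cited in the text. Exact fractions are used so that decimal inputs such as 0.10 are not distorted by floating-point rounding.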
A fourth type of partitioning, Virtual I/O Server (VIOS), is offered only by IBM. VIOS partitions,
which are supported on all Power based systems, including blades, allow operating system
instances in multiple LPARs to share a common pool of LAN adapters as well as Fibre Channel,
SCSI and RAID devices; i.e., it is not necessary to dedicate adapters to individual partitions.
Numbers of physical adapters required may be significantly reduced. It can be expected that this
capability will become increasingly significant as processors become more powerful, and numbers
of partitions on individual blades increase. Apart from cost savings, management complexity and
energy consumption may be significantly reduced. Redundant VIOS may be employed.
IBM is the only vendor that supports its full range of partitioning capabilities on blades.
Workload Management
Partitioning creates the potential for high levels of concentration and capacity utilization. The
extent to which these are realized in practice, however, depends upon the mechanisms that (1)
allocate and re-allocate system resources and (2) monitor and control execution processes across
and within partitions.
If these mechanisms are ineffective, risks are run. Low-priority processes may draw resources
away from business-critical applications. Contention for resources may degrade system-level
performance and cause outages. Utilization goals may not be realized because IT organizations
leave a great deal of spare capacity to allow for these effects.
Risks are compounded when workloads are subject to sustained growth, or fluctuate, and they are
compounded further when both occur.
In order to deal with these challenges, HP, IBM and Sun all offer such capabilities as workload
prioritization, dynamic partitioning, shared processor pools, allocation of processor and memory
resources based on performance, service level or other targets, and capping (i.e., setting limits on
the resources that may be consumed by a particular partition or workload).
HP and IBM approaches are, however, more effective at handling dynamic resources, and in the
extent of their integration with mechanisms that manage processor, memory, I/O and other system
resources. The HP Workload Manager (WLM), for example, offers the range of services
summarized in figure 6.
Figure 6
Hewlett-Packard Workload Manager Services
Adjust processor resources based on workload priorities
Migrate cores between virtual partitions
Adjust number of cores in processor sets
Manage resources inside virtual machines
Adjust resource allocations by time of day, system events or application metrics
Ensure critical workloads have sufficient resources to perform at required levels
Set minimum & maximum amounts of CPU & memory available to workloads
Grant workloads dedicated processor sets
Grant workload CPU resources according to a metric, such as number of processes
Optimize performance for multiple workloads on a single system
Monitor resource consumption by applications or users
Capabilities for POWER7 based systems are similar but broader, and are implemented using a
combination of workload management features built into IBM AIX and i, and into the PowerVM
hypervisor. These capabilities are illustrated in figure 7.
Figure 7
POWER7 Based Systems Virtualization Capabilities
[Diagram: POWER7 virtualization stack. Labeled elements: AIX 7.1; Intelligent Threads;
Intelligent Cache; Workload Manager; Active Memory Sharing; Shared Dedicated Capacity;
Shared Processor Pools; Multiple Shared Pools; Active Memory Expansion; PowerVM Hypervisor;
LPARs; Micro-partitions; WPARs; Virtual LAN; two Virtual I/O Servers.]
In POWER7 based systems, processors and main memory, along with cache and threads, may be
allocated and re-allocated. Resources may be dedicated to LPARs (Static LPARs), or shared
according to application priorities (Dynamic LPARs). Static LPARs are typically employed for
applications with high levels of business criticality.
Physical as well as logical (thread-based) processors may be grouped in shared pools. A key IBM
differentiator is that, in POWER7 based systems, physical and logical processors may be assigned
separately to specific pools, individual workloads or both. This provides a great deal more
flexibility than offered by HP or Sun.
Other mechanisms include Active Memory Sharing, which allows memory to be shared between
LPARs; Shared Dedicated Capacity, which allows shared processor pools to use idle CPU cycles
in dedicated LPARs; and Multiple Shared Pools, which allows resources used by groups of
LPARs to be capped. These are integrated with new POWER7 capabilities such as Intelligent
Threads and Intelligent Cache.
The result is that POWER7 based systems can manipulate a wider range of variables – including
threads, cache, main memory and I/O, multiple types of partition, multiple threads, and dedicated
or pooled processors – to optimize performance for heterogeneous applications and workloads.
The IBM design emphasis on automation should be highlighted. Although key parameters may be
set by system administrators, systems can automatically determine, for example, how much cache,
how many threads, or how many virtual or physical processors to allocate to a specific application
task based on workload characteristics and priorities. Systems evaluate resource utilization every
10 milliseconds, and may change resource allocations as rapidly.
Automation yields multiple benefits. One is that, by reducing the complexities to which system
administrators are exposed, full time equivalent (FTE) staffing may be reduced. Automation may
also improve overall capacity utilization, and improve quality of service by reducing the potential
for performance bottlenecks and outages caused by human error.
Integrity systems and HP-UX offer many of the same features. But they do not match the overall
set of capabilities offered by POWER7 based systems, or the manner in which these are integrated
and optimized in a mutually reinforcing manner.
Sun offers some comparable functions through the Resource Manager component of Solaris.
These functions, however, are more rudimentary and less granular than HP and IBM equivalents,
and dynamic resource allocation and automation capabilities are significantly weaker.
Manageability has not been a major Sun focus in the past.
AVAILABILITY OPTIMIZATION
For several decades, availability expectations have typically been higher for UNIX than for x86
servers, and this is also proving to be the case for blades.
The level of availability maintained by any system depends on a number of factors. These include
capabilities that minimize the frequency and effects of component or software failures, along with
monitoring, diagnostic, fault masking and resolution, and other tools.
In virtualized system environments, workload management effectiveness also plays an important
role. If workloads interfere with each other or exceed their capacity limits, outages may occur.
Organizations must, moreover, seek not only to deal with unplanned (i.e. accidental) outages, but
also to minimize the frequency and duration of planned downtime for such functions as hardware
and software upgrades and scheduled maintenance.
Maintaining high availability thus presents multiple, overlapping challenges. The extent to which
platforms are capable of meeting these challenges is determined not only by individual hardware
and software, but also by overall system-level design and optimization.
The extent to which Integrity, Power and T-Series systems meet these challenges varies. All three
employ basic techniques such as component redundancy and hot-swapping (i.e., allowing a device
to be replaced without taking systems offline), and provide facilities for monitoring, diagnostics,
predictive failure analysis and related functions.
HP and IBM have generally been leaders in availability optimization. Their approaches have been
built around multiple levels of capability across all hardware and software components that
represent potential sources of downtime. For example, IBM employs the hardware- and
microcode-based technologies summarized in figure 8 in POWER7 based blades.
Figure 8
Key POWER7 Availability Optimization Technologies
BASIC CAPABILITIES
Redundancy, hot-swap & internal failover
    Redundant/hot-swap disks, PCI adapters, GX buses, fans & blowers, power supplies, power
    regulators & other components. Redundant disk controllers & I/O paths. Concurrent system
    clock repair. Redundant oscillators/dynamic oscillator failover.
Concurrent firmware update
    Server firmware may be updated without taking systems offline.
Concurrent maintenance
    Allows processors, memory cards & adapters to be replaced, upgraded or serviced without
    taking systems offline.
MONITORING, DIAGNOSTICS & FAULT ISOLATION/RESOLUTION
Hardware-assisted memory scrubbing
    Automatic daily test of all system memory. Detects & reports developing memory errors
    before they cause problems.
Chipkill error checking
    Technology capable of detecting & correcting single-bit as well as 2-, 3- & 4-bit errors in
    memory devices, including cache & memory interfaces. Employs RAID-like striping of data
    across memory devices to provide redundancy & enable reinstatement of original data.
    Significantly more reliable than conventional error correction code (ECC) technology.
First Failure Data Capture (FFDC)
    Employs 1,000+ embedded sensors that identify errors in any system component. Root causes
    of errors are determined without the need to recreate problems or run tracing or
    diagnostics programs.
FAULT MASKING
Processor instruction retry, alternate processor recovery & processor-contained checkstop
    If an instruction fails to execute due to a hardware or software fault, the system
    automatically retries the operation. If the failure persists, the operation is repeated on
    a different processor &, if this does not succeed, the failed processor is taken out of
    service (checkstopped). Only LPARs supported by the failed processor are affected.
Dynamic processor sparing
    Allows idle Capacity Upgrade on Demand (CUoD) processors to be automatically activated as
    replacements for failed processors.
Partition availability priority
    In the event of a processor failure, maintains LPAR-based workloads based on assigned
    priorities; i.e., remaining processor capacity is assigned to the highest-priority
    workloads.
Memory sparing
    Enables redundant memory modules to be activated in the event of memory failures.
Enhanced memory subsystem
    Enables memory controller & cache sparing.
Enhanced cache recovery
    Detects & purges processor, L2 & L3 cache errors. Recovers & reinstates original data.
Dynamic I/O line bit repair (eRepair)
    Detects & bypasses failed memory pins.
PCI bus parity error retry
    Retries an I/O operation if an error occurs.
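The escalation sequence described in Figure 8 for processor faults (retry, then an alternate processor, then checkstop) can be modeled as a simple decision chain. This is a toy illustration of the logic as the figure describes it, not a representation of how POWER7 hardware or firmware implements it. The function name, the `attempt` callback and the stage labels are all hypothetical.

```python
from typing import Callable


def run_with_fault_masking(attempt: Callable[[str], bool],
                           primary: str, alternate: str) -> str:
    """Return the stage at which an instruction succeeded or was abandoned.

    attempt(cpu) is a stand-in for executing the instruction on a given
    processor; it returns True on success.
    """
    if attempt(primary):
        return "ok"
    if attempt(primary):                  # processor instruction retry
        return "recovered-by-retry"
    if attempt(alternate):                # alternate processor recovery
        return "recovered-on-alternate"
    return "checkstop"                    # processor-contained checkstop


# A persistent fault on cpu0 that does not affect cpu1
print(run_with_fault_masking(lambda cpu: cpu == "cpu1", "cpu0", "cpu1"))
```

In this model a transient fault is absorbed by the first retry, a persistent fault confined to one processor is absorbed by the alternate, and only a failure at both stages reaches the checkstop outcome.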
POWER7 based systems benefit from a number of technology transfers from IBM mainframe
systems, which enjoy the highest levels of availability of any major platform. According to IBM,
the availability optimization features of POWER7 based systems were developed jointly by the
company’s Power and System z (mainframe) design teams.
Key mainframe-derived features include First Failure Data Capture (FFDC), which employs
thousands of embedded sensors to identify and determine the cause of errors in any system
component; and Alternate Processor Recovery, which engages a series of diagnostic and remedial
actions if an instruction fails to execute.
A further level of capability is provided by AIX. The latest version 7.1 includes the features shown
in figure 9.
Figure 9
Key AIX 7.1 Availability Optimization Features
Second Failure Data Capture (SFDC)*
    Supports First Failure Data Capture technology with additional diagnostic & data capture
    features built into the operating system.
Multisystem First Failure Data Capture*
    Consolidates FFDC information, & provides single point to launch data collection, debug &
    monitoring actions across multiple systems.
Run-time error checking
    System-wide framework for FFDC & SFDC capabilities.
Concurrent Kernel Updates
    Enables some kernel fixes to be installed without rebooting. Allows patches to be applied
    without interruption of service. Can be employed for approximately 80 percent of required
    single module kernel updates.
Kernel exploitation of POWER Storage Keys*
    Exploits a POWER7 hardware feature that separates memory spaces for the kernel, file
    system & drivers to prevent software errors affecting one of these from spreading to the
    others.
Functional Recovery Routines*
    Suite of diagnostic & recovery routines that can enable recovery from errors that would
    otherwise cause the operating system to crash.
Kernel no-execute protection
    Establishes kernel data areas that should not be treated as executable code. Enables
    immediate detection if erroneous device driver or kernel code strays into these areas.
    Avoids potential system crashes.
Kernel stack overflow detection
    Detects stack overflows & enables recovery of some of these.
Tracing facilities
    System trace – main AIX trace facility.
    Lightweight memory trace – allows tracing of key kernel events only. Lightweight structure
    results in minimal performance impact.
    Component trace – enables tracing with per-component granularity.
Dynamic Tracing with probevue
    Allows developers or system administrators to dynamically place probes in existing
    application or kernel code, without requiring special source code or even recompilation.
    Simplifies debugging of complex system or application code.
POSIX trace
    Implements POSIX Trace Standard for application tracing.
Live Dump*
    Allows key subsystems to dump diagnostic information for service analysis, without
    requiring a full system dump.
Firmware-assisted dump*
    Allows firmware to incorporate FFDC information in system dumps.
MiniDump
    Small compressed dump of system data for diagnostic analysis. Enables quick snapshot of
    crash without full system dump.
Parallel Dump
    Compressed format enabling multiple processors to dump in parallel sub-areas. Greatly
    reduces time to produce dump.
Netmalloc debug
    Memory subsystem monitoring tool that enables isolation of memory leaks.
Live Partition Mobility
    Allows movement of active LPARs between Power Systems. Brief interruptions – no more than
    one or two seconds – may occur due to network latency.
Live Application Mobility
    Allows movement of WPARs between systems. Service interruptions are longer than for Live
    Partition Mobility – typically around 20 seconds.
Cluster Aware AIX
    Provides kernel-based heartbeat, messaging, file sharing, commands & APIs, data collection
    & event management services supporting clustered HA solutions.
*Mainframe-derived feature
Key AIX mainframe-derived features include Second Failure Data Capture (SFDC), Kernel
Exploitation of Storage Keys and Functional Recovery Routines, which are drawn from the z/OS
operating system.
HP and (to a lesser extent) Sun offer a number of comparable capabilities. Generally, however, the
quality of IBM microelectronics technology is superior – unlike HP and Sun, the company is a
major semiconductor designer and manufacturer – and neither competitor has been able to draw
upon mainframe hardware and software technology in the same manner as IBM.
It is unclear whether the Integrity transition to blade-based hardware structures will have
availability implications. HP nPar partitioning will be supported only on high-end Superdome 2
models. According to the company, this will also be the case for other Integrity RAS features.
Other Power and AIX capabilities include Live Partition Mobility, which allows movement of
active LPARs between Power Systems, and Live Application Mobility, which allows WPARs to be
moved in the same manner. Live Partition Mobility users may experience service interruptions of
one or two seconds due to network latency. For Live Application Mobility, interruptions are
typically around 20 seconds.
HP offers a similar capability for HPVMs. Oracle VM Server for SPARC allows for domain
migration, but this is a labor-intensive and protracted process. The company has committed to
improved automation.
Clustered failover solutions are available for all three platforms. HP Serviceguard and IBM
PowerHA SystemMirror, which is offered for IBM AIX and i, are among the industry’s most
stable and mature high availability clustering solutions. PowerHA SystemMirror for AIX was
formerly known as IBM High Availability Cluster Multi-Processing (HACMP).
X86 SERVER BLADES
DIFFERENTIATORS
There are some obvious resemblances between UNIX and x86 server blades. Hardware formats
are similar, and the leading vendors, HP and IBM, offer the same BladeSystem and BladeCenter
chassis as for their UNIX server blades. This is also the case for Sun, although the company is a
minor player outside the x86 Solaris space.
In comparing POWER7 based blades to x86 equivalents, however, differences are greater than
resemblances. POWER7 based blades yield higher levels of performance; system architectures are
significantly different; and the capabilities of operating systems (AIX compared to Windows and
Linux) and virtualization enablers (PowerVM compared to VMware and others) vary widely.
The extent to which these compete directly is less than is generally realized. At the risk of stating
the obvious, POWER7 based systems do not support Microsoft Windows. POWER7 based
systems are not candidates for deployment of popular Microsoft solutions such as Exchange,
SharePoint and SQL Server, for the company’s infrastructure products, or for third-party add-ons
to and extensions of these.
Equally, x86 Linux deployments tend to involve applications and workloads that differ from, and
are typically less challenging than, those for which POWER based systems are employed.
Comparative market share statistics may not reflect these demographics.
Power Systems have evolved over more than 20 years to handle applications that are more
demanding, and to deliver scalability, concentration and quality of service that are – by wide
margins – greater than those experienced in most Windows and x86 Linux server environments.
COMPARATIVE PERFORMANCE
In terms of “raw” performance, x86 servers equipped with latest-generation Intel Nehalem EX
processors and Advanced Micro Devices (AMD) Opteron equivalents have begun to approach
RISC levels. This is less the case for IBM Power than for HP Integrity and Sun servers, although
there is some overlap with lower-density POWER7 processors.
Raw performance is not, however, the only determinant of the actual performance levels that will
be experienced by users. Most industry benchmarks, for example, are run using standardized
workloads in stable operating conditions. Actual production environments tend to be very different,
particularly when they involve virtualized servers hosting diverse, fluctuating workloads.
Where this is the case, the effectiveness of Power partitioning and workload management
mechanisms, and the impact of such capabilities as intelligent threads and cache may significantly
increase the amount of work that may be performed by a POWER7 based system.
This may not be visible in capacity utilization statistics. Knowing that a server is operating at, say,
65 or 85 percent of capacity does not provide insight into how efficiently that capacity is used.
Rapid allocation and re-allocation of system resources, and fine-grained concurrent workload
execution may mean that a great deal more work is performed by a POWER7 based system even if
capacity utilization is the same.
Performance measurement in virtualized environments is not a simple process.
VIRTUALIZATION CAPABILITIES
Virtualization Enablers
In comparing the capabilities of POWER7 based systems, and Windows and x86 Linux servers in
virtualized environments, a key role is played by the relative strengths and weaknesses of
PowerVM and x86 virtualization enablers.
These include Microsoft Hyper-V, which is implemented as an extension of Windows Server
2008. Linux equivalents include Xen and Kernel-based Virtual Machine (KVM), which originated
as open source tools but have been upgraded and enhanced by Citrix Systems (XenServer) and
Red Hat respectively.
Oracle VM is also Xen-based, and is supported by the company for Oracle Enterprise Linux,
RHEL and Windows servers. The dominant player for Windows as well as x86 Linux
virtualization is, however, VMware. Its market share is generally estimated at over 75 percent.
VMware solutions are technically sophisticated and address a wide range of functions for servers
as well as storage and networks supporting these. However, user experiences have highlighted
drawbacks in several areas – including difficulties in supporting large-scale virtualized
environments, and in levels of complexity that are generated.
Although the following comments deal with VMware, they may also be considered to apply to the
other x86 enablers mentioned above.
Size and Scale
Although there are some large VMware based systems, the vast majority of installations involve
consolidation of comparatively small applications. The company itself estimates that more than 95
percent of its base consists of applications running on single- or dual-processor servers with less
than four GB peak memory utilization, and fewer than 100 I/Os per second (IOPS).
The largest areas of VMware consolidation have involved test and development instances – there
are routinely at least two, and often as many as six of these for each production instance – and
comparatively light-duty Web, infrastructure and end-user applications.
The majority of VMware servers run five or fewer instances, and most industry estimates are that
between 75 and 85 percent of all VMware servers support fewer than 15 VMs. This is the industry
norm even for latest-generation two-socket servers. Larger consolidated environments – typically
in the range of 15 to 30 instances per server – tend to run on newer four-socket servers with
multicore processors.
These statistics do not necessarily mean that VMware systems cannot support large applications or
workloads. But the reality is that, in terms of usage patterns, VMware is predominantly a small-
scale server virtualization solution.
There are also distinct nuances in applications that are typically consolidated using VMware.
Database servers and transaction processing systems, for example, have seen considerably less
activity than other types. This is particularly the case for production systems.
Even when VMware has been employed to virtualize “heavyweight” solutions, such as enterprise
resource planning (ERP) systems, typically only the more lightweight components of suites have
been deployed in virtual machines (VMs).
This pattern is striking in that VMware, in principle, can deliver extremely high levels of
performance and scalability. The Virtual Infrastructure 3 product set, introduced in 2006,
supported up to 32 physical cores, 256GB of main memory, and 128 powered-on VMs per host, and
up to four virtual CPUs and 64GB of main memory per VM.
The latest vSphere 4 version extended support to 64 physical cores, 1TB of main memory, and 320
powered-on VMs per host, and up to eight virtual CPUs and 256GB of main memory per VM.
Only a small fraction of the potential of either version, however, has been exploited in practice.
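
The gap between configured maximums and typical practice can be made concrete with simple arithmetic. The sketch below takes the per-host limits quoted above and compares them with a host running 15 VMs, the figure cited earlier as the ceiling for most VMware servers. The names `HostLimits` and `vm_headroom_used` are illustrative only and are not part of any VMware tooling; 1TB is treated as 1,024GB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostLimits:
    """Per-host maximums as quoted in the text above."""
    name: str
    physical_cores: int
    memory_gb: int
    powered_on_vms: int


VI3 = HostLimits("Virtual Infrastructure 3", 32, 256, 128)
VSPHERE4 = HostLimits("vSphere 4", 64, 1024, 320)  # 1TB taken as 1,024GB


def vm_headroom_used(limits, vms_in_use):
    """Fraction of the per-host VM maximum that a given VM count represents."""
    return vms_in_use / limits.powered_on_vms


# Assumption: 15 VMs per host, the upper bound the text cites for
# 75 to 85 percent of VMware servers.
TYPICAL_VMS = 15

for limits in (VI3, VSPHERE4):
    share = vm_headroom_used(limits, TYPICAL_VMS)
    print(f"{limits.name}: {TYPICAL_VMS} VMs is {share:.1%} "
          f"of the {limits.powered_on_vms}-VM maximum")
```

On these figures, a 15-VM host uses roughly 12 percent of the Virtual Infrastructure 3 maximum and under 5 percent of the vSphere 4 maximum.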
One reason for this is that VMware is a software-based partitioning technology, with
comparatively weak server-level workload management capabilities. This is particularly the case
where mixed workloads must be dealt with.
Although VMware (the company) has invested heavily in management solutions within its
vSphere environment, these are primarily designed to manage resources across server farms, and
supporting storage and networks, rather than within individual servers.
As a result, users are often reluctant to “push the envelope” in configuring numbers of VMs on
individual servers. Unless application installation instances generate very small, homogeneous
and/or stable workloads, risks of performance bottlenecks and disruption might escalate to
unacceptable levels.
Power Systems capabilities are significantly more effective. Multiple forms of partitioning are
tightly integrated with industry-leading mixed workload management capabilities.
Even using earlier generations of technology, Power Systems users have routinely been able to
support 30 to 50 LPARs and even larger numbers of micro-partitions on single systems. Use of
Power blades is moving in the same direction. Among the case studies presented in this report, for
example, UPMC anticipated that it could support up to 40 micro-partitions on new PS702 blades.
Complexity
After almost a decade of rapid growth, market researchers have reported a recent slowdown in x86
virtualization initiatives. The primary reason is that organizations face growing complexity
challenges. Implementation has often proved to be a longer and more difficult process than
anticipated, and skill requirements and staffing levels have tended to escalate.
Virtualization inevitably increases complexity by introducing a new layer of architecture into
system environments. Figure 10 illustrates this effect.
Figure 10
System Environment Layers: Example
APPLICATIONS
DATABASES/MIDDLEWARE
OPERATING SYSTEMS
VIRTUALIZATION
HARDWARE
The organizational impact of adding this layer depends upon a number of factors. These include
the number of hardware and software components, the degree of integration between these and the
extent to which processes are automated.
A VMware environment will typically include components from at least four vendors – Intel or
AMD, the server hardware manufacturer, the operating system supplier and VMware itself. The
number may be significantly larger if environments also extend to storage and networks, or if
third-party tools are added.
Complexity is also materially affected by concentration. If system processes must be coordinated,
and resources managed across large numbers of separate hypervisors, challenges are greater than if
the virtualization layer is built around more compact structures.
In most organizations, it would be necessary to implement the VMware solution set – which now
includes the major components shown in figure 11 – across x86 infrastructures that may be less
fragmented than would have been the case in the past, but which nevertheless fall well short of the
degree of concentration that may be realized with POWER7 based systems.
Figure 11
Major Components of VMware vSphere 4 Environment
vCenter: Converter, Host Profiles, Lab Manager, Lifecycle Manager, Orchestrator, Site Recovery Manager, Tools & Utilities
Availability & Recovery: HA, Data Recovery, Fault Tolerance, vMotion, Storage vMotion
Security: vShield Zones, VMsafe
System Base: ESX, ESXi, Distributed Resource Scheduler, Distributed Power Management, Memory Overcommit
Storage: VM File System, Thin Provisioning, Storage I/O Control
Network: Distributed Switch, Network I/O Control
There are other Power Systems advantages. Hardware, microcode and operating systems are
developed and supported by one vendor, and are integrated and optimized to a degree that far
exceeds what has been achieved in the x86 world. Automation is also more advanced, and more
pervasive than is the case for any x86 server platform, operating system or virtualization enabler.
AVAILABILITY AND SECURITY
Comparing Availability
Differences in the levels of availability that are typically realized with Windows and Linux
compared to UNIX servers have been widely documented.
Certain misperceptions nevertheless remain. In availability, for example, it is commonly argued
that latest-generation x86 processors can deliver “mainframe-class” uptime.
Prevention of outages, however, requires more than hardware reliability. The availability
optimization capabilities of POWER7 based systems described earlier, for example, include
multiple levels of redundancy and concurrency, as well as mechanisms for monitoring,
diagnostics, fault isolation and resolution, and fault masking (enabling systems to continue
functioning even if a fault occurs) that are more sophisticated than those of most x86 platforms.
Such mechanisms must also extend to operating systems. More unplanned outages are caused by
software than by hardware failures, and this is also the case for most planned outages. In POWER7
based systems, availability optimization features are also embedded in the microcode components
of the underlying system, and of PowerVM virtualization technology.
In an x86 environment, availability optimization needs to occur not only across processors and
server platforms, but also across Windows or Linux and virtualization enablers. Typically, this
again means dealing with the offerings of at least four vendors. Integration and optimization of
availability features is inevitably more problematic than is the case for Power Systems.
Further points should be made. Maintenance of, say, 99.5 percent availability is a comparatively
simple process. Difficulties increase substantially, however, if higher levels must be sustained. The
challenges of maintaining near-continuous availability may be orders of magnitude greater.
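
The difference between availability levels is easier to appreciate when expressed as permitted downtime. The following minimal sketch converts an availability percentage into hours of downtime per year. Only the 99.5 percent figure comes from the text; the higher levels are illustrative assumptions chosen to show how quickly the downtime budget shrinks, and the function name is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored


def annual_downtime_hours(availability_percent):
    """Hours of downtime per year permitted at a given availability level."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# 99.5 is the level mentioned in the text; the others are illustrative.
for level in (99.5, 99.9, 99.99, 99.999):
    hours = annual_downtime_hours(level)
    print(f"{level}% availability: {hours:.2f} hours "
          f"({hours * 60:.1f} minutes) of downtime per year")
```

At 99.5 percent, a system may be down for about 43.8 hours a year. At 99.999 percent, the allowance falls to a little over five minutes.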
Equally, maintenance of high levels of availability for low-impact workloads is a great deal
simpler than for high-volume database- and transaction-intensive systems. Challenges are
compounded if it is necessary to support virtualized environments characterized by diverse,
volatile workloads.
Use of high availability clusters does not necessarily change this picture. Windows and Linux
clusters typically experience more downtime than UNIX equivalents, and failover and recovery
processes tend to be both slower and less reliable. Clusters also tend to generate complexity, and
the effects are magnified if they are multiplied across farms of small servers.
Experience has shown that x86 clusters are often necessary to achieve availability levels that may
be realized by standalone Power Systems.
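
The trade-off between clustering and single-system availability can be sketched with a standard redundancy calculation. The example below is an idealized model, not a measurement from this report: it assumes node failures are independent and that failover is instantaneous and always succeeds, so it overstates what real clusters deliver. The availability figures and function names are hypothetical.

```python
def cluster_availability(node_availability, nodes):
    """Availability of a cluster that is up while at least one node is up.

    Assumes independent node failures and perfect, instantaneous failover,
    so the result is an optimistic upper bound.
    """
    return 1 - (1 - node_availability) ** nodes


def nodes_needed(node_availability, target):
    """Smallest node count whose idealized availability meets the target."""
    if not (0 < node_availability < 1 and 0 < target < 1):
        raise ValueError("availabilities must be strictly between 0 and 1")
    nodes = 1
    while cluster_availability(node_availability, nodes) < target:
        nodes += 1
    return nodes


# Hypothetical figures: a 99.5 percent node and a 99.99 percent target.
print(nodes_needed(0.995, 0.9999))
```

Under these assumptions, a single 99.5 percent node falls short of a 99.99 percent target and a second node is required. Slower or less reliable failover, as described above, would narrow the benefit further.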
Security and Malware Resistance
It is a truism that security and malicious code (malware) exposure for Windows and x86 Linux is
greater than for UNIX. Windows, in particular, is the world’s most targeted operating system, and
there are believed to be over a million Windows malware variants. An unprotected Windows
server connected to the Internet will typically become infected in a matter of minutes.
The comparatively loose design, diversity of components and open source origins of Linux
environments also pose security and malware challenges that are greater than for major versions of
UNIX. Since the mid-2000s, VMware and other x86 virtualization enablers have also emerged as
major hacker and malware targets.
The business impacts of security violations, data loss and malware damage may be significant.
Experience has shown, however, that there are also major cost implications. Even if high levels of
security can be realized for Windows and Linux systems, the process tends to be more labor-
intensive than for UNIX systems, and security administration costs tend to be higher.
At a time when security budgets are under pressure in many organizations, the central challenge is
not simply to maintain security. It is to do so in a cost-effective manner. POWER7 based systems
can materially assist in achieving that goal.