SlideShare uma empresa Scribd logo
1 de 15
Baixar para ler offline
© 2018 IBM Corporation
IBM POWER9 Introduction
Summit Training Workshop
Brian Thompto
POWER Systems, IBM Systems
© 2018 IBM Corporation IBM Systems© 2018 IBM Corporation
POWER AC922 Design – 6 GPU
Power 9 Processor (2x)
• 18, 22C water cooled
• 16, 20C air cooled
PCIe slot (4x)
• Gen4 PCIe
• 2, x16 HHHL Adapter
• 1, Shared slot
• 1 x8 HHHL Adapter
Memory DIMM’s (16x)
• 8 DDR4 IS DIMMs per socket
• 8, 16, 32, 64, 128 GB DIMMs
NVidia Volta GPU
• 3 per socket
• SXM2 form factor
• 300W
• NVLink 2.0
• Air/Water Cooled
Power Supplies (2x)
• 2200W
• 200VAC, 277VAC, 400VDC input
BMC Card
• IPMI
• 1 Gb Ethernet
• VGA
• 1 USB 3.0
2
© 2018 IBM Corporation IBM Systems© 2018 IBM Corporation
POWER9 – AC922 with 6 GPU’s – Block Diagram
3
Images / diagrams modified from:
"IBM POWER9 systems designed for commercial cognitive and cloud", IBM J. Res. & Dev., vol. 62, no. 4/5, 2018.
© 2018 IBM Corporation
POWER9 Processor – Common Features
4
New Core Microarchitecture
• 24 cores / die; 22 active for Summit
• Stronger thread performance
• Efficient agile pipeline
• POWER ISA v3.0
Enhanced Cache Hierarchy
• 120MB NUCA L3 architecture
• 12 x 20-way associative regions
• Advanced replacement policies
• Fed by 7 TB/s on-chip bandwidth
Cloud + Virtualization Innovation
• Quality of service assists
• New interrupt architecture
• Workload optimized frequency
• Hardware enforced trusted execution
Leadership
Hardware Acceleration Platform
• Enhanced on-chip acceleration
• Nvidia NVLink 2.0: High bandwidth,
advanced new features
• CAPI 2.0: Coherent accelerator and
storage attach (PCIe G4)
• OpenCAPI 3.0: Improved latency
and bandwidth, open interface
State of the Art I/O Subsystem
• PCIe Gen4 – 48 lanes
High Bandwidth
Signaling Technology
• 16 Gb/s interface
– Local SMP
• 25 Gb/s interface – 25G Link
– Accelerator, remote SMP
14nm finFET Semiconductor Process
• Improved device performance and
reduced energy
• 17 layer metal stack and eDRAM
• 8.0 billion transistorsSMPInterconnect&
Off-ChipAcceleratorEnablement
On-Chip Accel
Memory SignalingSMP/Accelerator Signaling
SMP/Accelerator Signaling Memory Signaling
PCIeSignaling
SMPSignaling
Core
L2
L3 Region
Core Core
L2
L3 Region
Core Core
L2
L3 Region
Core Core
L2
L3 Region
Core
PCIe
Core
L2
L3 Region
Core Core
L2
L3 Region
Core Core
L2
L3 Region
Core Core
L2
L3 Region
Core
Core
L2
L3 Region
Core Core
L2
L3 Region
Core Core
L2
L3 Region
Core Core
L2
L3 Region
Core
© 2018 IBM Corporation
POWER9 SMT4-Core Pipeline
5
128b
Super-slice
DW
LSU
64b
VSU
64b
Slice
DW
LSU
64b
VSU
DW
LSU
64b
VSU
POWER9 SMT4 Core
2 x 128b
Super-slice
ISU
IFU
Exec
Super
Slice
2 x 64b
LSU
Exec
Super
Slice
2 x 64b
LSU
Super
Slice
LSU
Super
Slice
64b compute
64b load/store
2 x 64b
1 x 128b
POWER9 SMT4 Core – Sliced Micro-arch
Images / diagrams modified from:
"POWER9: Processor for the cognitive era", Proc. Hot Chips 28 Symp., pp. 1-19, Aug. 2016.
"IBM POWER9 processor core", IBM Journal of Research and Development, vol. 62, no. 4/5, pp. 2:1-2:12, 2018.
© 2018 IBM Corporation
PM
QFX
Slice 3
AGEN
Slice 2
AGEN
CRYPT
QP/DFU
Branch Slice
BRU
DIV
ALU
XS
FP
MUL
XC
ST-D
FP
MUL
XC
FP
MUL
XC
FP
MUL
XC
ALU
XS
ST-D
PM
QFX
Slice 1
AGEN
Slice 0
AGEN
DIV
ALU
XS
ST-D
ALU
XS
ST-D
LRQ 2/3
L1D$ 3L1D$ 2
SRQ 2 SRQ 3
LRQ 0/1
L1D$ 0 L1D$ 1
SRQ 1SRQ 0
Dispatch: Allocate / Rename
Decode / CrackIBUFL1 Instruction $Predecode
Branch
Prediction
Instruction / Iop
Completion Table
128b
Super-slice
SMT4 Core
x6
x8
POWER9 – Core Compute
Fetch / Branch
• 32kB, 8-way Instruction Cache
• 8 fetch, 6 decode
• 1x branch execution
Slices issue VSU and AGEN
• 4x scalar-64b / 2x vector-128b
• 4x load/store AGEN
Vector Scalar Unit (VSU) Pipes
• 4x ALU + Simple (64b)
• 4x FP + FX-MUL + Complex (64b)
• 2x Permute (128b)
• 2x Quad Fixed (128b)
• 2x Fixed Divide (64b)
• 1x Quad FP & Decimal FP
• 1x Cryptography
Load Store Unit (LSU) Slices
• 32kB, 8-way Data Cache
• Up to 4 DW load or store
SMT4 Core Resources
6
SMT4 Core x 22 per Socket for Summit Systems
© 2018 IBM Corporation
POWER9: Cache Capacity
17 Layers of Metal
eDRAM
POWER9
10M
Processing Cores
10M 10M 10M 10M 10M 10M 10M 10M 10M
7 TB/s
10M 10M
256 GB/s x 12
POWER9Nvidia
GPUMemory
IBM &
Partner
Devices
IBM &
Partner
Devices
PCIe
Device
SMP
NVLink2
PCIe
NewCAPI
DDR
CAPI
7
Caches per pair of SMT4 cores (up to 1-8 threads)
• L2: 512k, 8-way
• L3: 10 MB, 20-way
• Enhanced L3 Cache Effectiveness with enhanced Replacement
• Aggregate 110 MB, 11 x 20 way associativity when 22 cores active (out of 24) on Summit
© 2018 IBM Corporation
POWER ISA v3.0
Broader data type support
• 128-bit IEEE 754 Quad-Precision Float – Full width quad-precision for financial and security applications
• Expanded BCD and 128b Decimal Integer – For database and native analytics
• Half-Precision Float Conversion – Optimized for accelerator bandwidth and data exchange
Support Emerging Algorithms
• Enhanced Arithmetic and SIMD
• Random Number Generation Instruction
Accelerate Emerging Workloads
• Memory Atomics – For high scale data-centric applications
Cloud Optimization
• Enhanced Translation Architecture – Optimized for Linux
• New Interrupt Architecture – Automated partition routing for extreme virtualization
• Enhanced Accelerator Virtualization
• Hardware Enforced Trusted Execution
Energy & Frequency Management
• POWER9 Workload Optimized Frequency – Manage energy between threads and cores with reduced wakeup latency
– Enables boost of frequency beyond the 3.1 Ghz base; Linux governors can also restrict / lower frequency to save power or boost other cores
New Instruction Set Architecture Implemented on POWER9 vs. POWER8
8
© 2018 IBM Corporation
POWER9: On-Processor Accelerators
Virtualized: User mode invocation (No Hypervisor Calls)
Shared accelerators, accessible from each Thread
Accelerator Types
– Industry Standard GZIP Compression / Decompression
– Up to 16GB/s of gzip / gunzip
– AES / SHA Cryptography Support
– AES 128b
– AES 256b
– SHA 256
– SHA 512
– Memory compression engine
– True Random Number Generation
– Data Mover
Accelerators
© 2018 IBM Corporation
Scale Out
Direct Attach Memory
8 Direct DDR4 Ports Per Socket
POWER9 – Memory Architecture
• 140 GB/s streaming, 170 GB/s of bandwidth peak
• Up to 4TB memory capacity
• Low latency access
• Commodity packaging form factor
• Adaptive 64B / 128B reads
Memory AC922 Summit Systems
•16 direct attach industry standard DDR4 DIMMs
•32 GB DIMM, 2666 MHZ
•512 GB Memory Capacity per System
© 2018 IBM Corporation
POWER9 Premier Acceleration Platform
CacheHierarchy&On-ChipInterconnect
CAPII/OMemoryNVLINKOpenCAPI
25Gb/sPCIeGen4
DDR4
/DMI
On-Chip
Accelerators
SMP
16Gb/s
Core
……
POWER9
OpenCAPI 3.0
(200 GB/s)
NVLINK 2.0
(300 GB/s)
CAPI 2.0
(128 GB/s)
NVIDIA GPU Attach (x3 GPU)
• 10x industry standard PCIe attach BW
• Reduced overhead + simpler programming w/
virtual address support + coherent memory
sharing
Coherent, Open Attach over
PCIeG4 Physical Connections
• Attach storage class memory (SCM), network
adapters (NIC), FPGA/GPU accelerators, or storage
controllers as coherent peer to processor core
• Coherent model enables simpler programming,
reduced overhead, and new applications
• Up to 4x attach bandwidth of CAPI 1.0 (POWER8)
Coherent, Open, Host-Agnostic Attach
over 25Gb/s Physical Connections
• Higher BW, lower latency physical connection
• Host agnostic protocol enables the same device to
attach to multiple CPU architectures
© 2016 IBM Corporation
POWER9 – AC922 NVLINK 2.0
• Extreme Processor / Accelerator Bandwidth and Reduced Latency
• 300 GB/s duplex between each POWER9 socket and 3 Volta GPU’s
• Coherent Memory and Virtual Addressing Capability
12
Images / diagrams modified from:
"Functionality and performance of NVLink with IBM POWER9 processors", IBM J. Res. & Dev., vol. 62, no. 4/5, pp. 9:1-9:10, 2018.
© 2018 IBM Corporation 13
Thank you
© 2018 IBM Corporation
Special notices
This document was developed for IBM offerings in the United States as of the date of publication. IBM may not make these offerings available in other
countries, and the information is subject to change without notice. Consult your local IBM business contact for information on the IBM offerings available in
your area.
Information in this document concerning non-IBM products was obtained from the suppliers of these products or other public sources. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any
license to these patents. Send license inquiries, in writing, to IBM Director of Licensing, IBM Corporation, New Castle Drive, Armonk, NY 10504-1785
USA.
All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
The information contained in this document has not been submitted to any formal IBM test and is provided "AS IS" with no warranties or guarantees either
expressed or implied.
All examples cited or described in this document are presented as illustrations of the manner in which some IBM products can be used and the results
that may be achieved. Actual environmental costs and performance characteristics will vary depending on individual client configurations and conditions.
IBM Global Financing offerings are provided through IBM Credit Corporation in the United States and other IBM subsidiaries and divisions worldwide to
qualified commercial and government clients. Rates are based on a client's credit rating, financing terms, offering type, equipment type and options, and
may vary by country. Other restrictions may apply. Rates and offerings are subject to change, extension or withdrawal without notice.
IBM is not responsible for printing errors in this document that result in pricing or information inaccuracies.
All prices shown are IBM's United States suggested list prices and are subject to change without notice; reseller prices may vary.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are dependent on
many factors including system hardware configuration and software design and configuration. Some measurements quoted in this document may have
been made on development-level systems. There is no guarantee these measurements will be the same on generally-available systems. Some
measurements quoted in this document may have been estimated through extrapolation. Users of this document should verify the applicable data for their
specific environment.
Revised September 26, 2006
14
© 2018 IBM Corporation
Special notices (continued)
15
IBM, the IBM logo, ibm.com AIX, AIX (logo), IBM Watson, DB2 Universal Database, POWER, PowerLinux, PowerVM, PowerVM (logo), PowerHA, Power Architecture, Power Family, POWER Hypervisor,
Power Systems, Power Systems (logo), POWER2, POWER3, POWER4, POWER4+, POWER5, POWER5+, POWER6, POWER6+, POWER7, POWER7+, and POWER8 are trademarks or registered
trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information
with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered
or common law trademarks in other countries.
A full list of U.S. trademarks owned by IBM may be found at: http://www.ibm.com/legal/copytrade.shtml.
NVIDIA, the NVIDIA logo, and NVLink are trademarks or registered trademarks of NVIDIA Corporation in the United States and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries or both.
PowerLinux™ uses the registered trademark Linux® pursuant to a sublicense from LMI, the exclusive licensee of Linus Torvalds, owner of the Linux® mark on a world-wide basis.
The Power Architecture and Power.org wordmarks and the Power and Power.org logos and related marks are trademarks and service marks licensed by Power.org.
The OpenPOWER word mark and the OpenPOWER Logo mark, and related marks, are trademarks and service marks licensed by OpenPOWER.
Other company, product and service names may be trademarks or service marks of others.

Mais conteúdo relacionado

Mais procurados

Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuning
inside-BigData.com
 

Mais procurados (20)

Deeplearningusingcloudpakfordata
DeeplearningusingcloudpakfordataDeeplearningusingcloudpakfordata
Deeplearningusingcloudpakfordata
 
OpenPOWER/POWER9 Webinar from MIT and IBM
OpenPOWER/POWER9 Webinar from MIT and IBM OpenPOWER/POWER9 Webinar from MIT and IBM
OpenPOWER/POWER9 Webinar from MIT and IBM
 
OpenPOWER Webinar on Machine Learning for Academic Research
OpenPOWER Webinar on Machine Learning for Academic Research OpenPOWER Webinar on Machine Learning for Academic Research
OpenPOWER Webinar on Machine Learning for Academic Research
 
Covid-19 Response Capability with Power Systems
Covid-19 Response Capability with Power SystemsCovid-19 Response Capability with Power Systems
Covid-19 Response Capability with Power Systems
 
OpenPOWER System Marconi100
OpenPOWER System Marconi100OpenPOWER System Marconi100
OpenPOWER System Marconi100
 
WML OpenPOWER presentation
WML OpenPOWER presentationWML OpenPOWER presentation
WML OpenPOWER presentation
 
SNAP MACHINE LEARNING
SNAP MACHINE LEARNINGSNAP MACHINE LEARNING
SNAP MACHINE LEARNING
 
TAU E4S ON OpenPOWER /POWER9 platform
TAU E4S ON OpenPOWER /POWER9 platformTAU E4S ON OpenPOWER /POWER9 platform
TAU E4S ON OpenPOWER /POWER9 platform
 
OpenPOWER/POWER9 AI webinar
OpenPOWER/POWER9 AI webinar OpenPOWER/POWER9 AI webinar
OpenPOWER/POWER9 AI webinar
 
"The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta...
"The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta..."The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta...
"The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta...
 
EXTENT-2017: Heterogeneous Computing Trends and Business Value Creation
EXTENT-2017: Heterogeneous Computing Trends and Business Value CreationEXTENT-2017: Heterogeneous Computing Trends and Business Value Creation
EXTENT-2017: Heterogeneous Computing Trends and Business Value Creation
 
2018 bsc power9 and power ai
2018   bsc power9 and power ai 2018   bsc power9 and power ai
2018 bsc power9 and power ai
 
CFD on Power
CFD on Power CFD on Power
CFD on Power
 
Heterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of SystemsHeterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of Systems
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuning
 
Ac922 watson 180208 v1
Ac922 watson 180208 v1Ac922 watson 180208 v1
Ac922 watson 180208 v1
 
Using a Field Programmable Gate Array to Accelerate Application Performance
Using a Field Programmable Gate Array to Accelerate Application PerformanceUsing a Field Programmable Gate Array to Accelerate Application Performance
Using a Field Programmable Gate Array to Accelerate Application Performance
 
BSC LMS DDL
BSC LMS DDL BSC LMS DDL
BSC LMS DDL
 
Hardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and MLHardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and ML
 
IBM BOA for POWER
IBM BOA for POWER IBM BOA for POWER
IBM BOA for POWER
 

Semelhante a Summit workshop thompto

32960 lar visit 022713v2
32960 lar visit 022713v232960 lar visit 022713v2
32960 lar visit 022713v2
gmazuel
 
Ibm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIbm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bk
IBM Switzerland
 

Semelhante a Summit workshop thompto (20)

Power overview 2018 08-13b
Power overview 2018 08-13bPower overview 2018 08-13b
Power overview 2018 08-13b
 
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
 
AMD EPYC™ Microprocessor Architecture
AMD EPYC™ Microprocessor ArchitectureAMD EPYC™ Microprocessor Architecture
AMD EPYC™ Microprocessor Architecture
 
POWER9 for AI & HPC
POWER9 for AI & HPCPOWER9 for AI & HPC
POWER9 for AI & HPC
 
IBM Power Systems at FIS InFocus 2019
IBM Power Systems at FIS InFocus 2019IBM Power Systems at FIS InFocus 2019
IBM Power Systems at FIS InFocus 2019
 
32960 lar visit 022713v2
32960 lar visit 022713v232960 lar visit 022713v2
32960 lar visit 022713v2
 
22by7 and DellEMC Tech Day July 20 2017 - Power Edge
22by7 and DellEMC Tech Day July 20 2017 - Power Edge22by7 and DellEMC Tech Day July 20 2017 - Power Edge
22by7 and DellEMC Tech Day July 20 2017 - Power Edge
 
@IBM Power roadmap 8
@IBM Power roadmap 8 @IBM Power roadmap 8
@IBM Power roadmap 8
 
High-Density Top-Loading Storage for Cloud Scale Applications
High-Density Top-Loading Storage for Cloud Scale Applications High-Density Top-Loading Storage for Cloud Scale Applications
High-Density Top-Loading Storage for Cloud Scale Applications
 
Drive Data Center Efficiency with SuperBlade, Powered by AMD EPYC™ and Instinct™
Drive Data Center Efficiency with SuperBlade, Powered by AMD EPYC™ and Instinct™Drive Data Center Efficiency with SuperBlade, Powered by AMD EPYC™ and Instinct™
Drive Data Center Efficiency with SuperBlade, Powered by AMD EPYC™ and Instinct™
 
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerModular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
 
Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8
 
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerModular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
 
Understanding the IBM Power Systems Advantage
Understanding the IBM Power Systems AdvantageUnderstanding the IBM Power Systems Advantage
Understanding the IBM Power Systems Advantage
 
Ibm power systems facts and features power 8
Ibm power systems facts and features  power 8 Ibm power systems facts and features  power 8
Ibm power systems facts and features power 8
 
G108277 ds8000-resiliency-lagos-v1905c
G108277 ds8000-resiliency-lagos-v1905cG108277 ds8000-resiliency-lagos-v1905c
G108277 ds8000-resiliency-lagos-v1905c
 
IBM POWER8 as an HPC platform
IBM POWER8 as an HPC platformIBM POWER8 as an HPC platform
IBM POWER8 as an HPC platform
 
Ibm power systems hpc cluster
Ibm power systems hpc cluster Ibm power systems hpc cluster
Ibm power systems hpc cluster
 
Ibm power 824
Ibm power 824Ibm power 824
Ibm power 824
 
Ibm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIbm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bk
 

Mais de Ganesan Narayanasamy

180 nm Tape out experience using Open POWER ISA
180 nm Tape out experience using Open POWER ISA180 nm Tape out experience using Open POWER ISA
180 nm Tape out experience using Open POWER ISA
Ganesan Narayanasamy
 

Mais de Ganesan Narayanasamy (20)

Chip Design Curriculum development Residency program
Chip Design Curriculum development Residency programChip Design Curriculum development Residency program
Chip Design Curriculum development Residency program
 
Basics of Digital Design and Verilog
Basics of Digital Design and VerilogBasics of Digital Design and Verilog
Basics of Digital Design and Verilog
 
180 nm Tape out experience using Open POWER ISA
180 nm Tape out experience using Open POWER ISA180 nm Tape out experience using Open POWER ISA
180 nm Tape out experience using Open POWER ISA
 
Workload Transformation and Innovations in POWER Architecture
Workload Transformation and Innovations in POWER Architecture Workload Transformation and Innovations in POWER Architecture
Workload Transformation and Innovations in POWER Architecture
 
OpenPOWER Workshop at IIT Roorkee
OpenPOWER Workshop at IIT RoorkeeOpenPOWER Workshop at IIT Roorkee
OpenPOWER Workshop at IIT Roorkee
 
Deep Learning Use Cases using OpenPOWER systems
Deep Learning Use Cases using OpenPOWER systemsDeep Learning Use Cases using OpenPOWER systems
Deep Learning Use Cases using OpenPOWER systems
 
OpenPOWER Latest Updates
OpenPOWER Latest UpdatesOpenPOWER Latest Updates
OpenPOWER Latest Updates
 
AI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systems
AI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systemsAI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systems
AI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systems
 
AI in healthcare - Use Cases
AI in healthcare - Use Cases AI in healthcare - Use Cases
AI in healthcare - Use Cases
 
AI in Health Care using IBM Systems/OpenPOWER systems
AI in Health Care using IBM Systems/OpenPOWER systemsAI in Health Care using IBM Systems/OpenPOWER systems
AI in Health Care using IBM Systems/OpenPOWER systems
 
AI in Healh Care using IBM POWER systems
AI in Healh Care using IBM POWER systems AI in Healh Care using IBM POWER systems
AI in Healh Care using IBM POWER systems
 
Poster from NUS
Poster from NUSPoster from NUS
Poster from NUS
 
SAP HANA on POWER9 systems
SAP HANA on POWER9 systemsSAP HANA on POWER9 systems
SAP HANA on POWER9 systems
 
Graphical Structure Learning accelerated with POWER9
Graphical Structure Learning accelerated with POWER9Graphical Structure Learning accelerated with POWER9
Graphical Structure Learning accelerated with POWER9
 
AI in the enterprise
AI in the enterprise AI in the enterprise
AI in the enterprise
 
Robustness in deep learning
Robustness in deep learningRobustness in deep learning
Robustness in deep learning
 
Perspectives of Frond end Design
Perspectives of Frond end DesignPerspectives of Frond end Design
Perspectives of Frond end Design
 
A2O Core implementation on FPGA
A2O Core implementation on FPGAA2O Core implementation on FPGA
A2O Core implementation on FPGA
 
OpenPOWER Foundation Introduction
OpenPOWER Foundation Introduction OpenPOWER Foundation Introduction
OpenPOWER Foundation Introduction
 
Open Hardware and Future Computing
Open Hardware and Future ComputingOpen Hardware and Future Computing
Open Hardware and Future Computing
 

Último

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 

Último (20)

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 

Summit workshop thompto

  • 1. © 2018 IBM Corporation IBM POWER9 Introduction Summit Training Workshop Brian Thompto POWER Systems, IBM Systems
  • 2. © 2018 IBM Corporation IBM Systems© 2018 IBM Corporation POWER AC922 Design – 6 GPU Power 9 Processor (2x) • 18, 22C water cooled • 16, 20C air cooled PCIe slot (4x) • Gen4 PCIe • 2, x16 HHHL Adapter • 1, Shared slot • 1 x8 HHHL Adapter Memory DIMM’s (16x) • 8 DDR4 IS DIMMs per socket • 8, 16, 32, 64, 128 GB DIMMs NVidia Volta GPU • 3 per socket • SXM2 form factor • 300W • NVLink 2.0 • Air/Water Cooled Power Supplies (2x) • 2200W • 200VAC, 277VAC, 400VDC input BMC Card • IPMI • 1 Gb Ethernet • VGA • 1 USB 3.0 2
  • 3. © 2018 IBM Corporation IBM Systems© 2018 IBM Corporation POWER9 – AC922 with 6 GPU’s – Block Diagram 3 Images / diagrams modified from: "IBM POWER9 systems designed for commercial cognitive and cloud", IBM J. Res. & Dev., vol. 62, no. 4/5, 2018.
  • 4. © 2018 IBM Corporation POWER9 Processor – Common Features 4 New Core Microarchitecture • 24 cores / die; 22 active for Summit • Stronger thread performance • Efficient agile pipeline • POWER ISA v3.0 Enhanced Cache Hierarchy • 120MB NUCA L3 architecture • 12 x 20-way associative regions • Advanced replacement policies • Fed by 7 TB/s on-chip bandwidth Cloud + Virtualization Innovation • Quality of service assists • New interrupt architecture • Workload optimized frequency • Hardware enforced trusted execution Leadership Hardware Acceleration Platform • Enhanced on-chip acceleration • Nvidia NVLink 2.0: High bandwidth, advanced new features • CAPI 2.0: Coherent accelerator and storage attach (PCIe G4) • OpenCAPI 3.0: Improved latency and bandwidth, open interface State of the Art I/O Subsystem • PCIe Gen4 – 48 lanes High Bandwidth Signaling Technology • 16 Gb/s interface – Local SMP • 25 Gb/s interface – 25G Link – Accelerator, remote SMP 14nm finFET Semiconductor Process • Improved device performance and reduced energy • 17 layer metal stack and eDRAM • 8.0 billion transistorsSMPInterconnect& Off-ChipAcceleratorEnablement On-Chip Accel Memory SignalingSMP/Accelerator Signaling SMP/Accelerator Signaling Memory Signaling PCIeSignaling SMPSignaling Core L2 L3 Region Core Core L2 L3 Region Core Core L2 L3 Region Core Core L2 L3 Region Core PCIe Core L2 L3 Region Core Core L2 L3 Region Core Core L2 L3 Region Core Core L2 L3 Region Core Core L2 L3 Region Core Core L2 L3 Region Core Core L2 L3 Region Core Core L2 L3 Region Core
  • 5. © 2018 IBM Corporation POWER9 SMT4-Core Pipeline 5 128b Super-slice DW LSU 64b VSU 64b Slice DW LSU 64b VSU DW LSU 64b VSU POWER9 SMT4 Core 2 x 128b Super-slice ISU IFU Exec Super Slice 2 x 64b LSU Exec Super Slice 2 x 64b LSU Super Slice LSU Super Slice 64b compute 64b load/store 2 x 64b 1 x 128b POWER9 SMT4 Core – Sliced Micro-arch Images / diagrams modified from: "POWER9: Processor for the cognitive era", Proc. Hot Chips 28 Symp., pp. 1-19, Aug. 2016. "IBM POWER9 processor core", IBM Journal of Research and Development, vol. 62, no. 4/5, pp. 2:1-2:12, 2018.
  • 6. © 2018 IBM Corporation PM QFX Slice 3 AGEN Slice 2 AGEN CRYPT QP/DFU Branch Slice BRU DIV ALU XS FP MUL XC ST-D FP MUL XC FP MUL XC FP MUL XC ALU XS ST-D PM QFX Slice 1 AGEN Slice 0 AGEN DIV ALU XS ST-D ALU XS ST-D LRQ 2/3 L1D$ 3L1D$ 2 SRQ 2 SRQ 3 LRQ 0/1 L1D$ 0 L1D$ 1 SRQ 1SRQ 0 Dispatch: Allocate / Rename Decode / CrackIBUFL1 Instruction $Predecode Branch Prediction Instruction / Iop Completion Table 128b Super-slice SMT4 Core x6 x8 POWER9 – Core Compute Fetch / Branch • 32kB, 8-way Instruction Cache • 8 fetch, 6 decode • 1x branch execution Slices issue VSU and AGEN • 4x scalar-64b / 2x vector-128b • 4x load/store AGEN Vector Scalar Unit (VSU) Pipes • 4x ALU + Simple (64b) • 4x FP + FX-MUL + Complex (64b) • 2x Permute (128b) • 2x Quad Fixed (128b) • 2x Fixed Divide (64b) • 1x Quad FP & Decimal FP • 1x Cryptography Load Store Unit (LSU) Slices • 32kB, 8-way Data Cache • Up to 4 DW load or store SMT4 Core Resources 6 SMT4 Core x 22 per Socket for Summit Systems
  • 7. © 2018 IBM Corporation POWER9: Cache Capacity 17 Layers of Metal eDRAM POWER9 10M Processing Cores 10M 10M 10M 10M 10M 10M 10M 10M 10M 7 TB/s 10M 10M 256 GB/s x 12 POWER9Nvidia GPUMemory IBM & Partner Devices IBM & Partner Devices PCIe Device SMP NVLink2 PCIe NewCAPI DDR CAPI 7 Caches per pair of SMT4 cores (up to 1-8 threads) • L2: 512k, 8-way • L3: 10 MB, 20-way • Enhanced L3 Cache Effectiveness with enhanced Replacement • Aggregate 110 MB, 11 x 20 way associativity when 22 cores active (out of 24) on Summit
  • 8. © 2018 IBM Corporation POWER ISA v3.0 Broader data type support • 128-bit IEEE 754 Quad-Precision Float – Full width quad-precision for financial and security applications • Expanded BCD and 128b Decimal Integer – For database and native analytics • Half-Precision Float Conversion – Optimized for accelerator bandwidth and data exchange Support Emerging Algorithms • Enhanced Arithmetic and SIMD • Random Number Generation Instruction Accelerate Emerging Workloads • Memory Atomics – For high scale data-centric applications Cloud Optimization • Enhanced Translation Architecture – Optimized for Linux • New Interrupt Architecture – Automated partition routing for extreme virtualization • Enhanced Accelerator Virtualization • Hardware Enforced Trusted Execution Energy & Frequency Management • POWER9 Workload Optimized Frequency – Manage energy between threads and cores with reduced wakeup latency – Enables boost of frequency beyond the 3.1 Ghz base; Linux governors can also restrict / lower frequency to save power or boost other cores New Instruction Set Architecture Implemented on POWER9 vs. POWER8 8
  • 9. © 2018 IBM Corporation POWER9: On-Processor Accelerators Virtualized: User mode invocation (No Hypervisor Calls) Shared accelerators, accessible from each Thread Accelerator Types – Industry Standard GZIP Compression / Decompression – Up to 16GB/s of gzip / gunzip – AES / SHA Cryptography Support – AES 128b – AES 256b – SHA 256 – SHA 512 – Memory compression engine – True Random Number Generation – Data Mover Accelerators
  • 10. © 2018 IBM Corporation Scale Out Direct Attach Memory 8 Direct DDR4 Ports Per Socket POWER9 – Memory Architecture • 140 GB/s streaming, 170 GB/s of bandwidth peak • Up to 4TB memory capacity • Low latency access • Commodity packaging form factor • Adaptive 64B / 128B reads Memory AC922 Summit Systems •16 direct attach industry standard DDR4 DIMMs •32 GB DIMM, 2666 MHZ •512 GB Memory Capacity per System
  • 11. © 2018 IBM Corporation POWER9 Premier Acceleration Platform CacheHierarchy&On-ChipInterconnect CAPII/OMemoryNVLINKOpenCAPI 25Gb/sPCIeGen4 DDR4 /DMI On-Chip Accelerators SMP 16Gb/s Core …… POWER9 OpenCAPI 3.0 (200 GB/s) NVLINK 2.0 (300 GB/s) CAPI 2.0 (128 GB/s) NVIDIA GPU Attach (x3 GPU) • 10x industry standard PCIe attach BW • Reduced overhead + simpler programming w/ virtual address support + coherent memory sharing Coherent, Open Attach over PCIeG4 Physical Connections • Attach storage class memory (SCM), network adapters (NIC), FPGA/GPU accelerators, or storage controllers as coherent peer to processor core • Coherent model enables simpler programming, reduced overhead, and new applications • Up to 4x attach bandwidth of CAPI 1.0 (POWER8) Coherent, Open, Host-Agnostic Attach over 25Gb/s Physical Connections • Higher BW, lower latency physical connection • Host agnostic protocol enables the same device to attach to multiple CPU architectures
  • 12. © 2016 IBM Corporation POWER9 – AC922 NVLINK 2.0 • Extreme Processor / Accelerator Bandwidth and Reduced Latency • 300 GB/s duplex between each POWER9 socket and 3 Volta GPU’s • Coherent Memory and Virtual Addressing Capability 12 Images / diagrams modified from: "Functionality and performance of NVLink with IBM POWER9 processors", IBM J. Res. & Dev., vol. 62, no. 4/5, pp. 9:1-9:10, 2018.
  • 13. © 2018 IBM Corporation 13 Thank you
  • 14. © 2018 IBM Corporation Special notices This document was developed for IBM offerings in the United States as of the date of publication. IBM may not make these offerings available in other countries, and the information is subject to change without notice. Consult your local IBM business contact for information on the IBM offerings available in your area. Information in this document concerning non-IBM products was obtained from the suppliers of these products or other public sources. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. Send license inquiries, in writing, to IBM Director of Licensing, IBM Corporation, New Castle Drive, Armonk, NY 10504-1785 USA. All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. The information contained in this document has not been submitted to any formal IBM test and is provided "AS IS" with no warranties or guarantees either expressed or implied. All examples cited or described in this document are presented as illustrations of the manner in which some IBM products can be used and the results that may be achieved. Actual environmental costs and performance characteristics will vary depending on individual client configurations and conditions. IBM Global Financing offerings are provided through IBM Credit Corporation in the United States and other IBM subsidiaries and divisions worldwide to qualified commercial and government clients. Rates are based on a client's credit rating, financing terms, offering type, equipment type and options, and may vary by country. Other restrictions may apply. Rates and offerings are subject to change, extension or withdrawal without notice. IBM is not responsible for printing errors in this document that result in pricing or information inaccuracies. All prices shown are IBM's United States suggested list prices and are subject to change without notice; reseller prices may vary. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are dependent on many factors including system hardware configuration and software design and configuration. Some measurements quoted in this document may have been made on development-level systems. There is no guarantee these measurements will be the same on generally-available systems. Some measurements quoted in this document may have been estimated through extrapolation. Users of this document should verify the applicable data for their specific environment. Revised September 26, 2006 14
  • 15. © 2018 IBM Corporation Special notices (continued) 15 IBM, the IBM logo, ibm.com AIX, AIX (logo), IBM Watson, DB2 Universal Database, POWER, PowerLinux, PowerVM, PowerVM (logo), PowerHA, Power Architecture, Power Family, POWER Hypervisor, Power Systems, Power Systems (logo), POWER2, POWER3, POWER4, POWER4+, POWER5, POWER5+, POWER6, POWER6+, POWER7, POWER7+, and POWER8 are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A full list of U.S. trademarks owned by IBM may be found at: http://www.ibm.com/legal/copytrade.shtml. NVIDIA, the NVIDIA logo, and NVLink are trademarks or registered trademarks of NVIDIA Corporation in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries or both. PowerLinux™ uses the registered trademark Linux® pursuant to a sublicense from LMI, the exclusive licensee of Linus Torvalds, owner of the Linux® mark on a world-wide basis. The Power Architecture and Power.org wordmarks and the Power and Power.org logos and related marks are trademarks and service marks licensed by Power.org. The OpenPOWER word mark and the OpenPOWER Logo mark, and related marks, are trademarks and service marks licensed by OpenPOWER. Other company, product and service names may be trademarks or service marks of others.