Speaker: Fernando Nogueira Alves Ferreira - IBM Brasil
With the evolution of processor technologies and the end of frequency-driven performance growth, cache architecture and cache usage have become increasingly important in determining processor capacity. Concepts such as workload characterization, CPI (Cycles per Instruction), RNI, cache misses, PU allocation by book/drawer, and HiperDispatch have become essential for understanding the performance behavior of zSystems servers. In this presentation we discuss these concepts and how to characterize workloads, understand the factors that can affect the use of the cache structure, and best configure and manage the workloads on your server.
How to configure your zSystem for unruly workloads
1. Copying or disclosure is prohibited without written permission from CMG Brasil.
Fernando Ferreira
IBM Executive I/T Specialist
IBM Academy of Technology / zChampion
Email/Linkedin fernafe@br.ibm.com
Twitter fernafeibm
2. WebSphere for zSeries skills transfer Group
Trademarks
The following are trademarks of the International Business Machines Corporation in the United States, other countries, or both.
The following are trademarks or registered trademarks of other companies.
* All other products may be trademarks or registered trademarks of their respective companies.
Notes:
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any
user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the
workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have
achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.
This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to
change without notice. Consult your local IBM business contact for information on the product or services available in your area.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the
performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.
Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of
Intel Corporation or its subsidiaries in the United States and other countries.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office.
IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency, which is now part of the Office of Government Commerce.
Not all common law marks used by IBM are listed on this page. Failure of a mark to appear does not mean that IBM does not use the mark nor does it mean that the product is not
actively marketed or is not significant within its relevant market.
Those trademarks followed by ® are registered trademarks of IBM in the United States; all others are trademarks or common law marks of IBM in the United States.
For a more complete list of IBM Trademarks, see www.ibm.com/legal/copytrade.shtml:
*BladeCenter®, CICS®, DataPower®, DB2®, e business(logo)®, ESCON, eServer, FICON®, IBM®, IBM (logo)®, IMS, MVS, OS/390®,
POWER6®, POWER6+, POWER7®, Power Architecture®, PowerVM®, PureFlex, PureSystems, S/390®, ServerProven®, Sysplex Timer®,
System p®, System p5, System x®, z Systems®, System z9®, System z10®, WebSphere®, X-Architecture®, z13™, z Systems™, z9®,
z10, z/Architecture®, z/OS®, z/VM®, z/VSE®, zEnterprise®, zSeries®
3. Topics
Introduction
Concepts
Capacity Issues
References
4. Introduction
[Figure: out-of-order core execution on z196 and zEC12. Each panel plots instructions 1-7 against time, marking execution, storage access, and dependency phases, with two L1 misses. zEC12 callouts: better instruction delivery, shorter L1 miss latency, faster millicode execution.]
5. Concepts
6. Caches
[Figure: cache diagram showing two PUs, labeled 1 and 2]
7. RNI
[Figure: where L1 misses are sourced from. L1MP answers "How often?" (how often a reference misses L1). RNI answers "How far?" (how far down the hierarchy the miss is resolved: L2LP, L2RP, L3P, L4LP, L4RP, MEMP).]
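As a rough sketch of the "how far?" idea: IBM's CPU MF material expresses RNI as a scaled, weighted sum of the percentages of L1 misses sourced from each deeper level, with coefficients that differ per machine generation. The function name, parameter names, and especially the weights and scale below are placeholders chosen for illustration; they are not IBM's published values for any machine.

```python
def relative_nest_intensity(l3p, l4lp, l4rp, memp, weights, scale):
    """Scaled, weighted sum of L1-miss sourcing percentages (each 0-100).

    weights = (w_l3, w_l4_local, w_l4_remote, w_memory); deeper levels
    should carry larger weights because they cost more cycles to reach.
    """
    w_l3, w_l4l, w_l4r, w_mem = weights
    weighted = w_l3 * l3p + w_l4l * l4lp + w_l4r * l4rp + w_mem * memp
    return scale * weighted / 100.0


# Placeholder coefficients, NOT the published values for any real machine.
EXAMPLE_WEIGHTS = (0.5, 1.0, 3.0, 8.0)
EXAMPLE_SCALE = 2.0

rni = relative_nest_intensity(
    l3p=20.0, l4lp=5.0, l4rp=1.0, memp=2.0,
    weights=EXAMPLE_WEIGHTS, scale=EXAMPLE_SCALE,
)
```

Taking the coefficients as arguments keeps the sketch independent of any particular machine; the real values should come from IBM's CPU MF documentation for the model being measured.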
8. CPI
9. CPUMF
[Figure: a drawer with two nodes. Each node has memory, an L4 cache on an SC SCM (SC SCM1, SC SCM2), and three PU SCMs (PU SCM1-3 in Node 1, PU SCM4-6 in Node 2), each with a shared L3 cache and PU1-PU8 with private L1 and L2 caches. Above the hardware, LPARs LP1-LP6 run z/OS, where HIS collects the CPUMF data.]
10. Capacity Issues
11. zSystem Caches
zEC12
CPU
5.5 GHz (1514 PCI)
Enhanced Out-Of-Order
Caches
L1 private 64k i, 96k d
L2 private 1 MB i + 1 MB d
L3 shared 48 MB / chip
L4 shared 384 MB / book
z13
CPU
5.0 GHz (1695 PCI)
Major pipeline enhancements
Caches
L1 private 96k i, 128k d
L2 private 2 MB i + 2 MB d
L3 shared 64 MB / chip
L4 shared 480 MB / node
- plus 224 MB NIC
[Figure: cache hierarchy diagrams. Single Book View (zEC12): memory, L4 cache, and L3 caches, each L3 shared by CPU1-CPU6 with private L1 and L2. Single Drawer View (z13): two nodes, each with memory, L4 cache, and L3 caches, each L3 shared by PU1-PU8 with private L1 and L2.]
13. HiperDispatch
14. Allocation and Dispatch
[Figure: Drawer 0 and Drawer 1, each with Node 0 and Node 1, interconnected by S = S BUS and X = X BUS links; a processor "Swap" between locations is shown.]
15. Workload Characterization
Low              Relative Nest Intensity          High
Batch            Application Type                 Transactional
Low              IO Rate                          High
Single           Application Mix                  Many
Intensive        CPU Usage                        Light
High locality    Data Reference Pattern           Diverse
Simple           LPAR Configuration               Complex
Extensive        Software Configuration Tuning    Limited
16. "Intermediate" Workloads
[Figure: workload categories LOW, AVG-LOW, AVERAGE, AVG-HIGH, HIGH]
17. In Ready and MPL vs. Caches
[Chart: L1MP and CPI]
18. The 90% Effect
19. zIIPs and GCPs
GCP
zIIP
20. SMT
PR/SM Hypervisor MT Aware
MT Ignorant
z/OS z/VM
21. Tools
22. WLM Topology Tool
Requirements: A z10 or newer System z environment with partitions running in Hiperdispatch mode
Collecting SMF 99 subtype 14 records
Excel Version 2013. The spreadsheet should also work on Excel 2007 and 2010
The tool is publicly available at
http://www-03.ibm.com/systems/z/os/zos/features/wlm/WLM_Further_Info_Tools.html#Topology
23. LPAR Design Tool
The tool is publicly available at
http://www-03.ibm.com/systems/z/os/zos/features/wlm/WLM_Further_Info_Tools.html#Topology
24. SMF 113 Reporting Tool
Requirements: Collecting SMF 113 subtype 2 records on a z10 or newer z system.
Excel Version 2013. The spreadsheet should also work on Excel 2007 and 2010
The tool is publicly available at
http://www-03.ibm.com/systems/z/os/zos/features/wlm/WLM_Further_Info_Tools.html#Topology
25. References II
Redbooks:
www.redbooks.ibm.com
z Systems Simultaneous Multithreading Revolution – redp5144
IBM z13 Technical Introduction - SG24-8250
IBM z13 Technical Guide - SG24-8251
STG Technical University (Brazil, Las Vegas and Budapest):
Kathy Walsh - Performance Hot Topics (STU)
Frank Kyne - Why is the CPU Time For a Job so Variable?
Bob Rogers' article on SMT in IBM Systems Magazine
http://www.ibmsystemsmag.com/mainframe/trends/IBMResearch/smt_mainframe
Editor's Notes
Topics
1 – Introduction
Performance analysis in a web environment is at times a complex topic because of the number of components normally involved in this architecture.
Each component in turn has its own peculiarities, processes and concerns. The various architectures and the combinations among them make it difficult to build a complete view of the subject. As a result, documents on the topic generally try to subdivide it into its various elements.
When we talk about the mainframe web environment, we add another degree of complexity because of the capabilities of this platform, which adds new alternatives in terms of architectural elements.
z/OS further increases the diversity of elements and functions, several of them with their own specifics in terms of performance analysis and capacity planning.
This introduction gives an overview of the web environment on the mainframe and of the groups of elements involved. The scenarios and basic elements of this environment are described in simplified form, along with a brief positioning on the use of WebSphere on Linux or on z/OS.
Various combinations of prior workload primitives are measured on which the new workload categories are based
Applications include CICS, DB2, IMS, OSAM, VSAM, WebSphere, COBOL, utilities
Low (relative nest intensity)
Workload curve representing light use of the memory hierarchy
Similar to past high N-way scaling workload primitives
Average (relative nest intensity)
Workload curve expected to represent the majority of customer workloads
Similar to the past LoIO-mix curve
High (relative nest intensity)
Workload curve representing heavy use of the memory hierarchy
Similar to the past DI-mix curve
zPCR and zCP3000 extend published categories to add granularity to LSPR workloads
Low-Avg
50% Low and 50% Average
Avg-High
50% Average and 50% High
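One possible reading of the "50% / 50%" intermediate categories is a simple equal-weight blend of the two neighboring curves. The sketch below assumes that reading; how zPCR and zCP3000 actually derive the intermediate curves is not stated here. The function name and the capacity figures are made up for illustration.

```python
def blend_curves(curve_a, curve_b, share_a=0.5):
    """Blend two capacity curves (N-way -> relative capacity) point by point."""
    return {
        n_way: share_a * curve_a[n_way] + (1.0 - share_a) * curve_b[n_way]
        for n_way in curve_a
    }


# Made-up relative capacities per N-way, for illustration only.
low_curve = {1: 1.00, 2: 1.96, 4: 3.80}
average_curve = {1: 1.00, 2: 1.90, 4: 3.60}

low_avg_curve = blend_curves(low_curve, average_curve)  # "50% Low and 50% Average"
```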
Dispatch an “on and ready” Dedicated Logical Processor
To its dedicated home physical processor core
Dispatch the “most deserving” on and ready Shared Logical Processor
To the best choice shared physical processor core of the same type
Favor: Home core, chip, node, and drawer to maximize cache reuse.
Goals: Allocate shared processor resource according to shares calculated from weights. Work with HiperDispatch (e.g. “Vertical High” or “Parked” logical processors). Minimize performance variability. Maximize capacity.
Dispatching work to SMT threads is done by z/OS for zIIPs or z/VM for IFLs if enabled by PTFs and activated with operating system parameters. Note: z/OS messages may refer to logical processors as “CPUs” or “Cores”
PR/SM cooperates but does NOT dispatch SMT threads
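The "shares calculated from weights" goal above can be illustrated with the usual proportional split: each partition's share of the shared physical cores is its weight divided by the sum of the weights of the active partitions. This is a minimal sketch of that arithmetic only; the function name, partition names, and numbers are hypothetical.

```python
def processor_entitlements(weights, shared_physical_cores):
    """Split the shared physical cores among partitions in proportion to weight."""
    total_weight = sum(weights.values())
    return {
        lpar: shared_physical_cores * weight / total_weight
        for lpar, weight in weights.items()
    }


# Hypothetical partitions sharing 10 physical cores of one processor type.
entitled = processor_entitlements({"LP1": 500, "LP2": 300, "LP3": 200}, 10)
```

Here LP1 is entitled to 5 cores' worth of capacity, LP2 to 3, and LP3 to 2.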
PR/SM dynamic relocation of running processor cores to different physical core locations
CP, zIIP, IFL and ICF supported
Swap an active core to a core in a different PU chip in a different drawer or node
Designed to optimize physical processor location for the current LPAR’s logical processor configuration:
Better L3 and L4 cache reuse
Move processor to partition memory
Triggers: Partition activation/deactivation, machine upgrades/downgrades, logical processors configured on/off
Designed to provide the most benefit for:
Multiple drawer machines
Dedicated partitions and wide partitions with HiperDispatch active
Default processor assignments by POR, MES adds, and On Demand activation:
Assign IFLs and ICFs to cores on chips in “high” drawers working down
Assign CPs and zIIPs to cores in low drawers working up.
Objective: Keep “Linux Only”, “IBM zAware” and “Coupling Facility” using IFLs and ICFs “away” from “ESA/390” partitions running z/OS on CPs and zIIPs and in different drawers if possible.
PR/SM makes optimum available memory and logical processor assignment at activation
Logical Processors specified in the Image Profile, are assigned a core if Dedicated or a “home” drawer, node and chip if Shared. Later, if it becomes a HiperDispatch “Vertical High”, a Shared Logical Processor is assigned a specific core.
Ideally assign all memory in one drawer with the processors if everything “fits”
With memory striped across drawers with processors if memory or processors must be split
PR/SM optimizes resource assignment when triggered
Triggers: Available resource changes: partition activation or deactivation, significant processor entitlement changes, dynamic memory increases, processor increases or decreases (e.g. by CBU), or an MES change.
Examines partitions in priority order by the size of their “processor entitlement” (dedicated processor count or shared processor pool allocation by weight) to determine priority for optimization
Changes logical processor “home” drawer/node/chip assignment
Moves processors to different chips, nodes, drawers (LPAR Dynamic PU Reassignment)
Relocates partition memory to active memory in a different drawer or drawers using the newly optimized Dynamic Memory Relocation (DMR), also exploited by Enhanced Drawer Availability (EDA).
If available but inactive memory hardware is present (e.g. hardware driven by Flexible or Plan Ahead) in a drawer where more active memory would help: activate it, reassign active partition memory to it, and deactivate the source memory hardware, again using DMR. (PR/SM can use all memory hardware but concurrently enables no more memory than the client has paid to use.)
Attempt assignment to an available physical CP (in wait)
CP last run on is available on home chip
Choose available CP on home chip
Choose available CP on home node
CP last run on is available on home drawer
Choose available CP on home drawer
Choose available CP on “sister” drawer(s)
“sisters” is the set of drawers where memory is allocated for the partition
Search for lowest priority dispatched work to displace in this order:
CPs on home chip
CPs on home node
CPs on home drawer
CPs on “sister” drawer(s)
The entire CPC (non-sisters) is not immediately searched. After some delay, the above searches are expanded to include everything. Attempting to trade off higher cache hit rate for transient expansion. Synergy here with z/OS HiperDispatch decisions.
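The search order above can be sketched as two passes over the physical CPs: first look for an idle CP at increasing distance from home, then look for work to displace in the same widening scopes. This is an illustrative model, not PR/SM's implementation: the `PhysicalCP` type, the integer priority scale, and the rule that a victim must have lower priority than the incoming work are assumptions, and the delayed expansion to the whole CPC is left out.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class PhysicalCP:
    cp_id: int
    drawer: int
    node: int
    chip: int
    waiting: bool      # True when the physical CP is idle (in wait)
    priority: int = 0  # priority of the work running on it (hypothetical scale)


def pick_physical_cp(
    cps: Iterable[PhysicalCP],
    home: Tuple[int, int, int],   # (drawer, node, chip) of the logical CP
    last_cp_id: int,
    sister_drawers: Set[int],
    incoming_priority: int,
) -> Tuple[Optional[PhysicalCP], str]:
    cps = list(cps)
    home_drawer, home_node, _ = home

    def on_chip(c): return (c.drawer, c.node, c.chip) == home
    def on_node(c): return (c.drawer, c.node) == (home_drawer, home_node)
    def on_drawer(c): return c.drawer == home_drawer
    def on_sister(c): return c.drawer in sister_drawers
    def last_on_chip(c): return c.cp_id == last_cp_id and on_chip(c)
    def last_on_drawer(c): return c.cp_id == last_cp_id and on_drawer(c)

    # Pass 1: an available (waiting) CP, closest scope first.
    for scope in (last_on_chip, on_chip, on_node, last_on_drawer, on_drawer, on_sister):
        for cp in cps:
            if cp.waiting and scope(cp):
                return cp, "idle"

    # Pass 2: displace the lowest-priority dispatched work, closest scope first.
    for scope in (on_chip, on_node, on_drawer, on_sister):
        busy = [cp for cp in cps if not cp.waiting and scope(cp)]
        if busy:
            victim = min(busy, key=lambda cp: cp.priority)
            if victim.priority < incoming_priority:  # assumed displacement rule
                return victim, "displace"
    return None, "none"


example_cps = [
    PhysicalCP(0, drawer=0, node=0, chip=0, waiting=False, priority=5),
    PhysicalCP(1, drawer=0, node=0, chip=1, waiting=True),
    PhysicalCP(2, drawer=1, node=0, chip=0, waiting=True),
]
chosen, how = pick_physical_cp(example_cps, home=(0, 0, 0), last_cp_id=0,
                               sister_drawers={1}, incoming_priority=3)
```

In this example the home chip has no idle CP, so the idle CP on the home node is chosen ahead of the idle CP in the sister drawer.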
LSPR is based on workloads running at high utilization
• Processor is a 2817-720 with 3 LPARs but is running only 50% busy
– zPCR Multi-Image Table places this at 17,171 MIPS, or 859 MIPS per CP
• Processor is actually running faster than this
• Impact to capacity planning comes in two flavors
– May have less headroom on processor than expected
– When moving a workload, it may not fit in the new container
• Example
– Assume a workload is running at 50% busy on a 2000 MIPS box. Without factoring in the utilization effect, it will be called a 1000 MIPS workload; in fact, it may be an 1100 MIPS workload when running at the efficiency of a 90% busy box
• Caution #1: There is NOT room to double this workload on the current box
• Caution #2: If moved to a new box or LPAR, it will likely need an 1100 MIPS container (not 1000 MIPS) to fit
• ROT:
– CPU per tran will vary 3-5% for every 10% change in utilization
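The rule of thumb above can be turned into a quick estimate. The sketch below applies it linearly (a fixed percentage per 10 points of utilization), which is an assumption; the function and parameter names are hypothetical.

```python
def mips_at_utilization(measured_mips, measured_busy_pct, target_busy_pct,
                        pct_per_10_points=3.0):
    """Scale a workload's apparent MIPS to the efficiency of a busier box.

    Linear application of the rule of thumb: CPU per transaction changes by
    pct_per_10_points percent for every 10-point change in utilization.
    """
    steps = (target_busy_pct - measured_busy_pct) / 10.0
    return measured_mips * (1.0 + steps * pct_per_10_points / 100.0)


# The example from the notes: 1000 MIPS measured at 50% busy, restated at 90% busy.
low_estimate = mips_at_utilization(1000, 50, 90, pct_per_10_points=3.0)
high_estimate = mips_at_utilization(1000, 50, 90, pct_per_10_points=5.0)
```

With 3% per 10 points this gives roughly 1120 MIPS, and with 5% roughly 1200 MIPS, so the 1100 MIPS figure quoted in the example sits at the conservative end of the rule of thumb.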
On the positive side, SMT delivers more throughput per core.
–More capacity for a given footprint size
–Less power and cooling required per unit of capacity
•But, there are negatives as well.
•The first is that an individual thread in multithread mode is slower than a single thread would be. The speed drops quite rapidly with the number of threads.
–If an SMT2 core provides 140% of the capacity of a single thread, then two threads will (on average) each run at 70% of the single-thread speed when both threads are active.
–For SMT4, if all four threads are active, they would run at only 40% of the single thread speed.
•The second negative is an increase in variability.
–Increased sharing of low-level resources by threads makes the amount of work that a thread can do dependent on what else the core is doing.
•A major cause of less than linear speed-up is the sharing of processor cache.
–On recent System z processors, there are two levels of cache that are private to the core (on zEC12, they are called L1 and L2). If a core has more than one thread, these caches will be shared across all the threads.
–Each thread is forced to get by with a smaller footprint in these caches and so takes more L1 and L2 misses than if the caches were not shared.
•Other resources must also be shared:
–The pipes,
–The translation lookaside buffer (TLB), and,
–Physical General Purpose Registers
–Store buffers and other resources
Why the Variability?
•Citing numbers like 140% throughput for SMT2 or 160% throughput for SMT4 is a gross simplification.
-Actual throughput for SMT2 can range from less than 100% to close to 200%, depending upon the usage of the shared resources.
- For example, if programs running on the same core stress the same resources, they will run slower than average.
- Alternatively, if the programs' resource use is complementary, they can run close to the ideal maximum speed.
•Running the same application multiple times shows less repeatable CPU usage because it may run in differing environments.
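The per-thread figures quoted in these notes (70% for SMT2, 40% for SMT4) follow from dividing the core's total capacity evenly among the active threads. A minimal sketch of that arithmetic, with a hypothetical function name and assuming all threads get an equal share:

```python
def per_thread_speed_pct(core_capacity_pct, active_threads):
    """Average speed of each thread, as a percentage of single-thread speed."""
    return core_capacity_pct / active_threads


smt2_speed = per_thread_speed_pct(140, 2)  # SMT2 core at 140% of one thread
smt4_speed = per_thread_speed_pct(160, 4)  # SMT4 core at 160% of one thread
```

These are averages only; as the notes point out, real throughput depends on how the paired threads use the shared resources.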
In traditional SMT implementations, the hypervisor can pair any two control program CPUs on a core. Serving more client control programs, the number of workloads (and the number of potential pairs of candidate CPUs on a single core) also increases.
This approach to SMT cannot determine runtime capacities, including capacity in use and capacity free. Thus accounting and charging for resource consumption becomes disconnected from actual capacity in use, which obscures job costs.
The hypervisor ends up grouping different work units together on the same core. Different pairings yield different capacities because the work has different characteristics, so separate measurements fall short of what is needed for measuring and charging.
Because the hypervisor can pair any two control program CPUs together on a core, one control program CPU will observe capacity variability that depends on the characteristics of the other CPU on the same core.
The z13 platform supports SMT2 (two threads per core) for a control program that is SMT aware. A single control program manages an entire core and so controls all of that core’s threads. This aspect of the design limits the effects of SMT variability to an individual workload within a control program.
The z13 also supports new instrumentation that enables a control program to deliver real-time SMT measurements that can be used for capacity planning and chargeback purposes.
- SMT Mode 2 core capacity when two threads are in use
- SMT Mode 2 core capacity when one thread is in use (which is identical to SMT Mode 1 core capacity)
- SMT Mode 2 core capacity free when one thread is in use
z/OS implements intelligent expansion and contraction algorithms to maximize core throughput. It uses the fewest number of cores necessary to meet its application goals, which maximizes available cores for other images.
However z/VM could mix different virtual CPs in the same core since the VMs are not SMT aware.
The topology report displays the logical processor topology for systems running in Hiperdispatch mode. The Excel report on your workstation uses an input file (comma separated value) which must be first created on a z/OS system from SMF 99 subtype 14 records. The tool supports all System z environments from z10 to z13 for partitions running in Hiperdispatch mode. It displays the association of logical processors to books, chips, drawers, and nodes, the polarization of the processors (high, medium, low), the processor type (regular CP, zIIP, or zAAP), and the association to WLM nodes. The tool can be used to understand the processor placement and how it changes when topology changes occur.
In order to run the tool it is required to install the exe file from this webpage and afterwards two z/OS datasets on your local z/OS system. The install file creates two entries: "TopoReport.lnk" and "Topo Report Help.lnk" in the Windows program folder "IBM RMF Performance Management". Please select the "Topo Report Help" link and follow the instructions in topic "Processing SMF 99 data" to install and execute the z/OS datasets and programs. The other topics in the help file describe the usage of the Excel spreadsheet to display the information on your workstation.
The LPAR Design tool assists you in planning the LPAR layout of your Central Processor Complexes. The tool allows you to specify all partitions, the number of logical processors and their weights. It then calculates the MIPS capacity (*) for the partition. If you run your system in Hiperdispatch mode, it also assists you by displaying the number of high, medium and low processors that result from your definition. This helps you identify definition errors easily. Offload processors such as zIIPs and zAAPs are also supported. (*) Please note: The MIPS capacity is a rough estimate and does not replace the necessity to use zPCR for a correct capacity projection of your environment. To install the tool, download the LPAR Design Tool (Version 7.3, 2.4 MB) and unzip it to your workstation. The tool consists of one Microsoft Excel spreadsheet. You'll find the associated user's guide here:
SMF 113 records provide insight into the usage of the hardware cache structures of your partitions. This reporting tool provides a set of REXX programs which assist you in printing SMF 113 subtype 2 records and also provide a basic summary of the cache activity in the form of a CSV report. For collecting SMF 113 data (CPU Measurement Facility or Hardware Instrumentation Counters) please refer to CPU MF Overview and WSC Experiences
In order to run the tool it is required to install the exe file from this webpage and afterwards three z/OS datasets on your local z/OS system. The install file creates two entries: "HISandCSVReport.lnk" and "HIS and CSV Reporting Help.lnk" in the Windows program folder "IBM RMF Performance Management". Please select the "HIS and CSV Reporting Help" link and follow the instructions in topic "Installing Host files" to install, and in topic "Process SMF 113 data" to execute, the z/OS datasets and programs. The other topics in the help file describe the usage of the Excel spreadsheet to display the information on your workstation.