HCQC - HPC compiler quality checker
Masaki Arai
masaki.arai@linaro.org
LEG HPC-SIG
arai.masaki@jp.fujitsu.com
FUJITSU LABORATORIES LTD.
Background and Purpose
• The quality of the kernel part is important in HPC
applications (number crunching on supercomputers).
• HCQC makes it easy to check the quality of compiler
optimizations and to acquire data for improving them.
HCQC: HPC compiler quality checker
Subject of Quality Check
• A configuration file defines the subject of the quality check.
• Main items:
– Compiler
– Compiler version
– Optimization flags
Example of configuration file:
{
  "DISTRIBUTION" : "OpenSUSE Tumbleweed",
  "ARCH" : "aarch64",
  "CPU" : "AMD Opteron A1100 Cortex A57",
  "LANGUAGE" : "C",
  "COMPILER" : "GCC",
  "COMMAND" : "/usr/bin/gcc",
  "VERSION" : "7.1.1",
  "OPT_FLAGS" : ["-O2"],
  "ASM_FLAGS" : ["-S", "-fverbose-asm"],
  "FLAG_DB" : [["?DEBUG_FLAG", "-g"],
               ["?C99_STANDARD", "-std=c99"]]
}
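As a rough illustration of how such a configuration could drive the compiler, the sketch below loads a trimmed copy of the example and builds two argument lists, one for the executable and one for the assembly file. The function name `build_commands`, the source file `kernel.c`, and the way the flags are combined are assumptions made for this example; they are not taken from HCQC itself, and the `FLAG_DB` entries are left out because the slides do not say how they are resolved.

```python
import json

# Trimmed copy of the configuration shown above (keys as in the slides).
CONFIG_TEXT = """
{
  "COMPILER": "GCC",
  "COMMAND": "/usr/bin/gcc",
  "OPT_FLAGS": ["-O2"],
  "ASM_FLAGS": ["-S", "-fverbose-asm"]
}
"""


def build_commands(config, source, output_stem):
    """Return (compile_cmd, asm_cmd) as argument lists; nothing is executed.

    Hypothetical helper: the flag ordering is an assumption of this sketch.
    """
    base = [config["COMMAND"], *config["OPT_FLAGS"]]
    compile_cmd = base + [source, "-o", output_stem]
    asm_cmd = base + config["ASM_FLAGS"] + [source, "-o", output_stem + ".s"]
    return compile_cmd, asm_cmd


config = json.loads(CONFIG_TEXT)
compile_cmd, asm_cmd = build_commands(config, "kernel.c", "kernel")
print(" ".join(compile_cmd))
print(" ".join(asm_cmd))
```

Keeping the commands as lists rather than shell strings means they could be handed to `subprocess.run` without quoting problems.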
Metrics for Quality Evaluation
• HCQC has the following metrics:
– op : count of mnemonics in the assembly code
– kind : kinds of mnemonics in the assembly code (memory, branch,
other)
– regalloc : quality of register allocation (number of spill-in/spill-out instructions)
– height : height of the instruction dependence graph
– ilp : instruction-level parallelism achieved by the instruction scheduler
– vectorize : vectorization/SIMDization status
– swpl : initiation interval achieved by software pipelining
These data are basically static data obtained at compile time.
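To make the `kind` metric concrete, here is a minimal sketch that sorts AArch64 mnemonics into the three classes named above and counts them for one basic block. The mnemonic sets are an assumption of this example and deliberately incomplete; the actual HCQC metric program may classify instructions differently. The input is the `.LBB0_2` block that appears later in these slides.

```python
from collections import Counter

# Assumed, deliberately incomplete mnemonic sets for AArch64.
MEMORY_PREFIXES = ("ldr", "ldp", "ldur", "str", "stp", "stur")
BRANCH_MNEMONICS = {"b", "bl", "br", "blr", "ret", "cbz", "cbnz", "tbz", "tbnz"}


def classify(mnemonic):
    """Return 'memory', 'branch' or 'other' for one mnemonic."""
    m = mnemonic.lower()
    if m in BRANCH_MNEMONICS or m.startswith("b."):
        return "branch"
    if m.startswith(MEMORY_PREFIXES):
        return "memory"
    return "other"


def kind_counts(asm_lines):
    """Count instructions per class, skipping labels, directives and comments."""
    counts = Counter(memory=0, branch=0, other=0)
    for line in asm_lines:
        text = line.split("//")[0].strip()
        if not text or text.endswith(":") or text.startswith("."):
            continue
        counts[classify(text.split()[0])] += 1
    return counts


BLOCK = """
.LBB0_2:
    ldr w1, [x5, x16, lsl #2]
    sdiv w2, w16, w13
    lsl x3, x16, #3
    add x16, x16, #1
    madd w1, w17, w2, w1
    sbfiz x1, x1, #3, #32
    ldr x2, [x18, x1]
    cmp x11, x16
    str x2, [x10, x3]
    ldr x1, [x0, x1]
    str x1, [x9, x3]
    b.ne .LBB0_2
"""

counts = kind_counts(BLOCK.splitlines())
print(dict(counts))
```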
Investigation Result
ARCH : aarch64
CPU : AMD Opteron A1100 Cortex A57
LANGUAGE : C
COMPILER : ClangLLVM
COMMAND : /usr/bin/clang
VERSION : 4.0.1
OPT_FLAGS : -O2
TEST_PROGRAM : sample
KERNEL_FUNCTION_NAME : kernel
DATE : 2017/11/07

                          ------- kind ------ ---- regalloc ----  ilp swpl ------- vectorize -------
CFG                 DEPTH memory branch other spill in spill out  IPC   II kind      mem arith other
BB0 cond LBB0_11        0      3      1     3        0         0  0.5                  0     0     0
BB1                     0      0      0     4        0         0  0.5                  0     0     0
LBB0_2                  1      3      0     5        0         1  0.7                  0     0     0
LBB0_3 cond LBB0_5      2      2      1     1        2         0  0.9      SLP         0     1     0
BB4 LBB0_7 LBB0_9       2      1      2     5        0         0  0.9      SLP         0     1     1
LBB0_5 cond LBB0_9      2      3      1     5        0         0  1.3      SLP         0     1     1
LBB0_7                  2      3      0     5        0         3  1.7      SLP         0     1     2
LBB0_8 cond LBB0_8      3      7      1     5        0         0  2.5    5 LOOP,SLP    2     2     2
LBB0_9 cond LBB0_3      2      2      1     3        0         2  1.3      SLP         0     1     1
BB10 cond LBB0_2        1      1      1     2        0         0  0.8                  0     0     0
LBB0_11                 0      0      0     1        0         0  0.2                  0     0     0
*SUMMARY*                     25      8    39        2         6                       2     7     7
Quality Evaluation by Comparison
• A single investigation result has little meaning on its own.
• Typical comparison examples:
– GCC vs. LLVM (on AArch64)
– LLVM 4.0.0 vs. LLVM 5.0.0 (on AArch64)
– LLVM with -O2 vs. LLVM with -O3 (on AArch64)
– LLVM on AArch64 vs. LLVM on x86_64
  → Missing optimizations on AArch64
– LLVM on AArch64 vs. ICC on x86_64
  → Optimization hints for SVE from AVX code
Example of Comparison(1)
• HimenoBMT-dynamic (regalloc)
– GCC is better than Clang/LLVM: 34 spill-in and 25 spill-out instructions versus 37 and 31.
GCC
CFG                       DEPTH  spill in  spill out
jacobi    cond .L18           0         0          7
                              0         0          5
.L3       cond .L16           1         1          0
                              1         3          2
.L8       cond .L28           2         1          0
                              2         4          6
.L11      cond .L7            3         0          1
                              3        10          0
.L4       cond .L4            4         0          0
.L7       cond .L11           3         2          0
.L5       cond .L8            2         5          1
                              1         5          0
.L9       cond .L13           2         0          0
                              2         0          0
.L17      cond .L15           3         0          0
                              3         0          0
.L12      cond .L12           4         0          0
.L15      cond .L17           3         0          0
.L13      cond .L9            2         0          0
.L16      cond .L3            1         2          1
          end                 0         0          0
.L28      goto .L5            2         1          2
.L18      end                 0         0          0
*SUMMARY*                     -        34         25

LLVM
CFG                       DEPTH  spill in  spill out
jacobi    cond .LBB0_32       0         0         12
                              0         0          6
.LBB0_2   cond .LBB0_31       1         1          0
                              1         0          2
.LBB0_4   cond .LBB0_11       2         0          1
                              2         0          1
.LBB0_6   cond .LBB0_10       4         0          0
                              3        13          7
.LBB0_8   cond .LBB0_8        4         2          0
          cond .LBB0_6        3         3          0
          goto .LBB0_11       2         0          0
.LBB0_10  cond .LBB0_6        3         0          0
.LBB0_11  cond .LBB0_4        2         3          2
          cond .LBB0_30       1         1          0
                              1         3          0
.LBB0_14  cond .LBB0_29       2         0          0
          goto .LBB0_19       2         0          0
.LBB0_16                      3         0          0
.LBB0_17  cond .LBB0_17       4         0          0
          cond .LBB0_28       3         0          0
          goto .LBB0_26       3         0          0
.LBB0_19  cond .LBB0_28       3         0          0
          cond .LBB0_22       3         1          0
          goto .LBB0_26       3         0          0
.LBB0_22  cond .LBB0_25       3         0          0
          cond .LBB0_16       3         0          0
          cond .LBB0_16       3         0          0
.LBB0_25                      3         0          0
.LBB0_26                      3         0          0
.LBB0_27  cond .LBB0_27       4         0          0
.LBB0_28  cond .LBB0_19       3         0          0
.LBB0_29  cond .LBB0_14       2         1          0
          goto .LBB0_31       1         0          0
.LBB0_30                      1         1          0
.LBB0_31  cond .LBB0_2        1         0          0
          goto .LBB0_33       0         0          0
.LBB0_32                      0         0          0
.LBB0_33  end                 0         8          0
*SUMMARY*                     -        37         31
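A comparison of this kind comes down to differencing the *SUMMARY* rows of two reports. The sketch below does that for the regalloc totals given in this example (GCC 34 spill in and 25 spill out, LLVM 37 and 31). The dictionary layout and the function name `compare_summaries` are assumptions of this example, not the format HCQC uses internally.

```python
# *SUMMARY* values from the HimenoBMT-dynamic (regalloc) example.
GCC_SUMMARY = {"spill in": 34, "spill out": 25}
LLVM_SUMMARY = {"spill in": 37, "spill out": 31}


def compare_summaries(baseline, candidate):
    """Return {metric: candidate - baseline} for metrics present in both reports."""
    return {
        metric: candidate[metric] - baseline[metric]
        for metric in baseline
        if metric in candidate
    }


delta = compare_summaries(GCC_SUMMARY, LLVM_SUMMARY)
for metric, diff in delta.items():
    print(f"{metric}: LLVM - GCC = {diff:+d}")
```

For spill counts a positive difference means the candidate compiler emitted more spill instructions than the baseline; whether that matters in practice depends on where in the loop nest the spills occur, which is why the per-block DEPTH column is kept in the report.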
Example of Comparison(2)
• mVMC-mini-calculateNewPfMTwo_child (op)
– GCC generates code with better addressing modes.
LLVM
CFG                                       DEPTH  lsl
calculateNewPfMTwo_child  cond .LBB0_3        0    0
                                              0    0
.LBB0_2                   cond .LBB0_2        1    1
.LBB0_3                   cond .LBB0_11       0    0
                                              0    0
.LBB0_5                   cond .LBB0_5        1    0
                          cond .LBB0_12       0    0
                                              0    1
.LBB0_8                                       1    0
.LBB0_9                   cond .LBB0_9        2    0
                          cond .LBB0_8        1    0
                          goto .LBB0_13       0    0
.LBB0_11                                      0    0
.LBB0_12                                      0    0
.LBB0_13                  end                 0    0
*SUMMARY*                                     -    2

GCC
CFG                                       DEPTH  lsl
calculateNewPfMTwo_child  cond .L2            0
                                              0
.L3                       cond .L3            1
.L2                       cond .L8            0
                                              0
.L5                       cond .L5            1
                                              0
.L7                                           1
.L6                       cond .L6            2
                          cond .L7            1
                                              0
.L4                       end                 0
.L8                       goto .L4            0
*SUMMARY*                                     -

LLVM code for .LBB0_2:
.LBB0_2:
    ldr w1, [x5, x16, lsl #2]
    sdiv w2, w16, w13
    lsl x3, x16, #3
    add x16, x16, #1
    madd w1, w17, w2, w1
    sbfiz x1, x1, #3, #32
    ldr x2, [x18, x1]
    cmp x11, x16
    str x2, [x10, x3]
    ldr x1, [x0, x1]
    str x1, [x9, x3]
    b.ne .LBB0_2 // x3 is dead
Example of Comparison(3)
• HimenoBMT-static (height)
– Clang/LLVM does not use registers well.
LLVM
CFG                       DEPTH    #  height
.LBB1_8   cond .LBB1_8        4   71      25
# / height = 71/25 = 2.84

GCC
CFG                       DEPTH    #  height
.L19      cond .L19           4   62      17
# / height = 62/17 = 3.647

LLVM code for .LBB1_8:
.LBB1_8:
    add x17, x4, x14
    mov v6.16b, v17.16b
    ldr s17, [x17, #8844]
    add x16, x3, x14
    mov v7.16b, v16.16b
    ldr s16, [x16, #8844]
    add x17, x25, x14
    ldr s21, [x17, x7]
    fmul s0, s17, s0
    ldr s17, [x17, x21]
    add x16, x5, x14
    mov v5.16b, v18.16b
    ldr s18, [x16, #8844]
    ……

The metric "height" is not yet committed on GitHub.
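Since the height metric is not published, the following is only a sketch of one common way to obtain such a number: take the height of a basic block to be the longest chain of read-after-write register dependences. The instruction encoding as `(destinations, sources)` tuples, the function name `block_height`, and the choice to ignore memory dependences, anti/output dependences and register aliasing (for example `s17` versus `v17`) are all assumptions of this example. The toy input is a hand-picked subset of the `.LBB1_8` instructions above, not the full block.

```python
def block_height(instructions):
    """Longest read-after-write dependence chain in a basic block.

    instructions: list of (dest_regs, src_regs) in program order.
    Simplified sketch: register true dependences only.
    """
    level_of_last_write = {}  # register -> level of the instruction that last wrote it
    height = 0
    for dests, srcs in instructions:
        level = 1 + max((level_of_last_write.get(reg, 0) for reg in srcs), default=0)
        for reg in dests:
            level_of_last_write[reg] = level
        height = max(height, level)
    return height


# Hand-picked subset of .LBB1_8, for illustration only.
BLOCK = [
    (["x17"], ["x4", "x14"]),  # add  x17, x4, x14
    (["s17"], ["x17"]),        # ldr  s17, [x17, #8844]
    (["x16"], ["x3", "x14"]),  # add  x16, x3, x14
    (["s16"], ["x16"]),        # ldr  s16, [x16, #8844]
    (["s0"], ["s17", "s0"]),   # fmul s0, s17, s0
]

height = block_height(BLOCK)
ratio = len(BLOCK) / height

# Ratios quoted on the slide (instruction count divided by height).
llvm_ratio = 71 / 25
gcc_ratio = 62 / 17
print(height, round(ratio, 3), round(llvm_ratio, 2), round(gcc_ratio, 3))
```

The ratio of instruction count to height can be read as the average number of instructions per dependence level; on the slide it is 2.84 for LLVM and about 3.647 for GCC.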
Workflow of HCQC
① Compile one test program
② Run the executable file and verify its result by
comparing output and answer data
③ Generate the assembly code file
④ Make the control flow graph of the kernel part from
the assembly code
⑤ Get result data using metric programs
⑥ Make the report file from data
% hcqc config test metric+
% hcqc-report config test metric+
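Step ④ can be illustrated with a small sketch that scans assembly text and records, for each label, which labels control can reach next. This is not the deck's `cfg.py`: the regular expressions, the function name `build_cfg`, and the toy assembly are assumptions of this example, and the sketch works at label granularity only (it does not split unlabeled blocks after a conditional branch, nor does it compute loop DEPTH).

```python
import re

LABEL_RE = re.compile(r"^([.\w$]+):$")
# Unconditional (b) or conditional (b.ne, cbz, ...) branch; the target is the last operand.
BRANCH_RE = re.compile(r"^(b|b\.\w+|cbz|cbnz|tbz|tbnz)\s+(?:.*,\s*)?([.\w$]+)$")


def build_cfg(asm_text):
    """Map each label to its successor labels (branch targets and fallthrough)."""
    cfg = {}
    current = None         # label of the block being scanned
    falls_through = False  # can control reach the next label from here?
    for raw in asm_text.splitlines():
        line = raw.split("//")[0].strip()
        if not line:
            continue
        label = LABEL_RE.match(line)
        if label:
            name = label.group(1)
            cfg.setdefault(name, [])
            if current is not None and falls_through and name not in cfg[current]:
                cfg[current].append(name)
            current, falls_through = name, True
            continue
        if current is None:
            continue  # directives before the first label
        branch = BRANCH_RE.match(line)
        if branch:
            kind, target = branch.groups()
            if target not in cfg[current]:
                cfg[current].append(target)
            falls_through = kind != "b"
        elif line.split()[0] == "ret":
            falls_through = False
    return cfg


# Toy AArch64 kernel, invented for this example.
ASM = """
kernel:
    cmp x0, #0
    b.eq .Lexit
.Lloop:
    ldr w1, [x2, x3, lsl #2]
    add x3, x3, #1
    cmp x3, x0
    b.ne .Lloop       // back edge
.Lexit:
    ret
"""

cfg = build_cfg(ASM)
for block, successors in cfg.items():
    print(block, "->", successors)
```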
Workflow of HCQC (diagram)
[Figure: data flow for steps ① to ⑥. Labels in the diagram: Configuration File (COMMAND:/usr/bin/clang, OPT_FLAGS:-O2); Test Program; Test Program Info File (JSON format file); Executable File; in.data, out.data and result.data with a diff check; Assembly Code File; Debug Information?; cfg.py; CFG+DEPTH; Metric Program (three instances); Result Data (three instances); hcqc-report; Report file (csv), illustrated with a sample report like the one under "Investigation Result".]
Test Programs for HCQC
• Generated from programs that were problematic in Fujitsu's
production compilers in the past
• Kernel parts are extracted and modified for use under HCQC:
– Extract hot spots
– If the program is written in Fortran, convert it to C
(for comparison between GCC and Clang/LLVM)
– Prepare the data needed to run and check those kernel parts
Test Programs for HCQC
• All original benchmarks are
publicly available.
• I/O data for HCQC is being prepared.
– Some data files are very large.
– Across different architectures or different
optimization levels, an error tolerance is required when checking results.
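The need for an error tolerance can be sketched as follows: instead of a byte-for-byte diff, the numeric values in the output are compared against the answer data within a relative and an absolute tolerance. The function name `outputs_match` and the default tolerance values are assumptions of this example; suitable tolerances would have to be chosen per kernel.

```python
import math


def outputs_match(output_values, answer_values, rel_tol=1e-6, abs_tol=1e-12):
    """True if both sequences have equal length and agree element-wise within tolerance."""
    if len(output_values) != len(answer_values):
        return False
    return all(
        math.isclose(out, ans, rel_tol=rel_tol, abs_tol=abs_tol)
        for out, ans in zip(output_values, answer_values)
    )


# Small rounding differences pass, real deviations do not.
print(outputs_match([1.0, 2.0000001], [1.0, 2.0]))
print(outputs_match([1.0, 2.1], [1.0, 2.0]))
```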
benchmark name      kernel name
HimenoBMT-dynamic   jacobi
HimenoBMT-static    jacobi
hpcg-3.0            ComputeSYMGS_ref
ccs-qcd             bicgstab_hmc
ccs-qcd             clover
ffb-mini            CALAX3
ffb-mini            FLD3X2
ffb-mini            GRAD3X
ffvc-mini           poi_residual
ffvc-mini           psor2sma_core
mVMC-mini           calculateNewPfMTwo_child
mVMC-mini           updateMAllTwo_child
mVMC-mini           updateMAll_child
ngsa-mini           bwt_match_exact_alt
ngsa-mini           bwt_match_gap
nicam-dc-mini       vi_path2
Future Work
• Add support for SVE (if available in GCC or LLVM)
• Implement metric programs:
– vectorization (vectorize)
– software pipelining (swpl)
– instruction-level parallelism (ilp)
• Add features for comparing with x86_64 (SVE vs. AVX)
• Add tools for automatic and intelligent comparison
URL https://github.com/Linaro/hcqc
Thank you very much!
Any comments or suggestions are welcome.

Mais conteúdo relacionado

Mais procurados

LCA14: LCA14-412: GPGPU on ARM SoC session
LCA14: LCA14-412: GPGPU on ARM SoC sessionLCA14: LCA14-412: GPGPU on ARM SoC session
LCA14: LCA14-412: GPGPU on ARM SoC sessionLinaro
 
BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)
BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)
BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)Linaro
 
Performance optimization 101 - Erlang Factory SF 2014
Performance optimization 101 - Erlang Factory SF 2014Performance optimization 101 - Erlang Factory SF 2014
Performance optimization 101 - Erlang Factory SF 2014lpgauth
 
Bpf performance tools chapter 4 bcc
Bpf performance tools chapter 4   bccBpf performance tools chapter 4   bcc
Bpf performance tools chapter 4 bccViller Hsiao
 
Debugging node in prod
Debugging node in prodDebugging node in prod
Debugging node in prodYunong Xiao
 
p4alu: Arithmetic Logic Unit in P4
p4alu: Arithmetic Logic Unit in P4p4alu: Arithmetic Logic Unit in P4
p4alu: Arithmetic Logic Unit in P4Kentaro Ebisawa
 
P4, EPBF, and Linux TC Offload
P4, EPBF, and Linux TC OffloadP4, EPBF, and Linux TC Offload
P4, EPBF, and Linux TC OffloadOpen-NFP
 
Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)Brendan Gregg
 
Performance Tuning EC2 Instances
Performance Tuning EC2 InstancesPerformance Tuning EC2 Instances
Performance Tuning EC2 InstancesBrendan Gregg
 
eBPF Perf Tools 2019
eBPF Perf Tools 2019eBPF Perf Tools 2019
eBPF Perf Tools 2019Brendan Gregg
 
Post-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC EcosystemPost-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC EcosystemLinaro
 
Kernel Proc Connector and Containers
Kernel Proc Connector and ContainersKernel Proc Connector and Containers
Kernel Proc Connector and ContainersKernel TLV
 
Make Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance ToolsMake Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance ToolsKernel TLV
 
Staying Afloat with Buoy: A High-Performance HTTP Client
Staying Afloat with Buoy: A High-Performance HTTP ClientStaying Afloat with Buoy: A High-Performance HTTP Client
Staying Afloat with Buoy: A High-Performance HTTP Clientlpgauth
 
Post-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC Ecosystem	Post-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC Ecosystem Linaro
 
FortranCon2020: Highly Parallel Fortran and OpenACC Directives
FortranCon2020: Highly Parallel Fortran and OpenACC DirectivesFortranCon2020: Highly Parallel Fortran and OpenACC Directives
FortranCon2020: Highly Parallel Fortran and OpenACC DirectivesJeff Larkin
 
RxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsRxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsBrendan Gregg
 
Xdp and ebpf_maps
Xdp and ebpf_mapsXdp and ebpf_maps
Xdp and ebpf_mapslcplcp1
 

Mais procurados (20)

LCA14: LCA14-412: GPGPU on ARM SoC session
LCA14: LCA14-412: GPGPU on ARM SoC sessionLCA14: LCA14-412: GPGPU on ARM SoC session
LCA14: LCA14-412: GPGPU on ARM SoC session
 
BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)
BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)
BKK16-308 The tool called Auto-Tuned Optimization System (ATOS)
 
Performance optimization 101 - Erlang Factory SF 2014
Performance optimization 101 - Erlang Factory SF 2014Performance optimization 101 - Erlang Factory SF 2014
Performance optimization 101 - Erlang Factory SF 2014
 
Bpf performance tools chapter 4 bcc
Bpf performance tools chapter 4   bccBpf performance tools chapter 4   bcc
Bpf performance tools chapter 4 bcc
 
Debugging node in prod
Debugging node in prodDebugging node in prod
Debugging node in prod
 
p4alu: Arithmetic Logic Unit in P4
p4alu: Arithmetic Logic Unit in P4p4alu: Arithmetic Logic Unit in P4
p4alu: Arithmetic Logic Unit in P4
 
P4, EPBF, and Linux TC Offload
P4, EPBF, and Linux TC OffloadP4, EPBF, and Linux TC Offload
P4, EPBF, and Linux TC Offload
 
Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)
 
Onnc intro
Onnc introOnnc intro
Onnc intro
 
Performance Tuning EC2 Instances
Performance Tuning EC2 InstancesPerformance Tuning EC2 Instances
Performance Tuning EC2 Instances
 
eBPF Perf Tools 2019
eBPF Perf Tools 2019eBPF Perf Tools 2019
eBPF Perf Tools 2019
 
Post-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC EcosystemPost-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC Ecosystem
 
Kernel Proc Connector and Containers
Kernel Proc Connector and ContainersKernel Proc Connector and Containers
Kernel Proc Connector and Containers
 
Make Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance ToolsMake Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance Tools
 
Staying Afloat with Buoy: A High-Performance HTTP Client
Staying Afloat with Buoy: A High-Performance HTTP ClientStaying Afloat with Buoy: A High-Performance HTTP Client
Staying Afloat with Buoy: A High-Performance HTTP Client
 
Lustre Best Practices
Lustre Best Practices Lustre Best Practices
Lustre Best Practices
 
Post-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC Ecosystem	Post-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC Ecosystem
 
FortranCon2020: Highly Parallel Fortran and OpenACC Directives
FortranCon2020: Highly Parallel Fortran and OpenACC DirectivesFortranCon2020: Highly Parallel Fortran and OpenACC Directives
FortranCon2020: Highly Parallel Fortran and OpenACC Directives
 
RxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsRxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance Results
 
Xdp and ebpf_maps
Xdp and ebpf_mapsXdp and ebpf_maps
Xdp and ebpf_maps
 

Semelhante a HCQC : HPC Compiler Quality Checker

Adding a BOLT pass
Adding a BOLT passAdding a BOLT pass
Adding a BOLT passAmir42407
 
Security Monitoring with eBPF
Security Monitoring with eBPFSecurity Monitoring with eBPF
Security Monitoring with eBPFAlex Maestretti
 
Update from android kk to android l
Update from android kk to android lUpdate from android kk to android l
Update from android kk to android lBin Yang
 
Embedded system (Chapter 2) part 2
Embedded system (Chapter 2) part 2Embedded system (Chapter 2) part 2
Embedded system (Chapter 2) part 2Ikhwan_Fakrudin
 
Open stackdaykorea2016 wedge
Open stackdaykorea2016 wedgeOpen stackdaykorea2016 wedge
Open stackdaykorea2016 wedgeJunho Suh
 
Efficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native EnvironmentsEfficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native EnvironmentsGergely Szabó
 
The true story_of_hello_world
The true story_of_hello_worldThe true story_of_hello_world
The true story_of_hello_worldfantasy zheng
 
Shifter singularity - june 7, 2018 - bw symposium
Shifter  singularity - june 7, 2018 - bw symposiumShifter  singularity - june 7, 2018 - bw symposium
Shifter singularity - june 7, 2018 - bw symposiuminside-BigData.com
 
Not breaking userspace: the evolving Linux ABI
Not breaking userspace: the evolving Linux ABINot breaking userspace: the evolving Linux ABI
Not breaking userspace: the evolving Linux ABIAlison Chaiken
 
Os Grossupdated
Os GrossupdatedOs Grossupdated
Os Grossupdatedoscon2007
 
BKK16-305B ILP32 Performance on AArch64
BKK16-305B ILP32 Performance on AArch64BKK16-305B ILP32 Performance on AArch64
BKK16-305B ILP32 Performance on AArch64Linaro
 
Cobol performance tuning paper lessons learned - s8833 tr
Cobol performance tuning paper   lessons learned - s8833 trCobol performance tuning paper   lessons learned - s8833 tr
Cobol performance tuning paper lessons learned - s8833 trPedro Barros
 
Ee325 cmos design lab 7 report - loren k schwappach
Ee325 cmos design   lab 7 report - loren k schwappachEe325 cmos design   lab 7 report - loren k schwappach
Ee325 cmos design lab 7 report - loren k schwappachLoren Schwappach
 
Testing Persistent Storage Performance in Kubernetes with Sherlock
Testing Persistent Storage Performance in Kubernetes with SherlockTesting Persistent Storage Performance in Kubernetes with Sherlock
Testing Persistent Storage Performance in Kubernetes with SherlockScyllaDB
 
[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...
[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...
[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...Insight Technology, Inc.
 
Summit 16: How to Compose a New OPNFV Solution Stack?
Summit 16: How to Compose a New OPNFV Solution Stack?Summit 16: How to Compose a New OPNFV Solution Stack?
Summit 16: How to Compose a New OPNFV Solution Stack?OPNFV
 

Semelhante a HCQC : HPC Compiler Quality Checker (20)

Adding a BOLT pass
Adding a BOLT passAdding a BOLT pass
Adding a BOLT pass
 
RISC-V Zce Extension
RISC-V Zce ExtensionRISC-V Zce Extension
RISC-V Zce Extension
 
Security Monitoring with eBPF
Security Monitoring with eBPFSecurity Monitoring with eBPF
Security Monitoring with eBPF
 
Update from android kk to android l
Update from android kk to android lUpdate from android kk to android l
Update from android kk to android l
 
Next Stop, Android
Next Stop, AndroidNext Stop, Android
Next Stop, Android
 
Callgraph analysis
Callgraph analysisCallgraph analysis
Callgraph analysis
 
Embedded system (Chapter 2) part 2
Embedded system (Chapter 2) part 2Embedded system (Chapter 2) part 2
Embedded system (Chapter 2) part 2
 
Open stackdaykorea2016 wedge
Open stackdaykorea2016 wedgeOpen stackdaykorea2016 wedge
Open stackdaykorea2016 wedge
 
Efficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native EnvironmentsEfficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native Environments
 
LabDocumentation
LabDocumentationLabDocumentation
LabDocumentation
 
The true story_of_hello_world
The true story_of_hello_worldThe true story_of_hello_world
The true story_of_hello_world
 
Shifter singularity - june 7, 2018 - bw symposium
Shifter  singularity - june 7, 2018 - bw symposiumShifter  singularity - june 7, 2018 - bw symposium
Shifter singularity - june 7, 2018 - bw symposium
 
Not breaking userspace: the evolving Linux ABI
Not breaking userspace: the evolving Linux ABINot breaking userspace: the evolving Linux ABI
Not breaking userspace: the evolving Linux ABI
 
Os Grossupdated
Os GrossupdatedOs Grossupdated
Os Grossupdated
 
BKK16-305B ILP32 Performance on AArch64
BKK16-305B ILP32 Performance on AArch64BKK16-305B ILP32 Performance on AArch64
BKK16-305B ILP32 Performance on AArch64
 
Cobol performance tuning paper lessons learned - s8833 tr
Cobol performance tuning paper   lessons learned - s8833 trCobol performance tuning paper   lessons learned - s8833 tr
Cobol performance tuning paper lessons learned - s8833 tr
 
Ee325 cmos design lab 7 report - loren k schwappach
Ee325 cmos design   lab 7 report - loren k schwappachEe325 cmos design   lab 7 report - loren k schwappach
Ee325 cmos design lab 7 report - loren k schwappach
 
Testing Persistent Storage Performance in Kubernetes with Sherlock
Testing Persistent Storage Performance in Kubernetes with SherlockTesting Persistent Storage Performance in Kubernetes with Sherlock
Testing Persistent Storage Performance in Kubernetes with Sherlock
 
[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...
[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...
[db tech showcase Tokyo 2017] A11: SQLite - The most used yet least appreciat...
 
Summit 16: How to Compose a New OPNFV Solution Stack?
Summit 16: How to Compose a New OPNFV Solution Stack?Summit 16: How to Compose a New OPNFV Solution Stack?
Summit 16: How to Compose a New OPNFV Solution Stack?
 

Mais de Linaro

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloLinaro
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaLinaro
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraLinaro
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaLinaro
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018Linaro
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018Linaro
 
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...Linaro
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Linaro
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Linaro
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Linaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
 
HKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteHKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteLinaro
 
HKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopHKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopLinaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allLinaro
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorLinaro
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMULinaro
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MLinaro
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation Linaro
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootLinaro
 

Mais de Linaro (20)

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qa
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018
 
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteHKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening Keynote
 
HKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopHKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP Workshop
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMU
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8M
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted boot
 

HCQC : HPC Compiler Quality Checker

  • 1. HCQC - HPC compiler quality checker
      Masaki Arai
      masaki.arai@linaro.org, LEG HPC-SIG
      arai.masaki@jp.fujitsu.com, FUJITSU LABORATORIES LTD.
  • 2. Background and Purpose
      • The quality of the kernel part is important in HPC applications (number crunching on supercomputers).
      • HCQC (HPC compiler quality checker) makes it easy to check the quality of compiler optimizations and to acquire data for improving them.
  • 3. Subject of Quality Check
      • A configuration file defines the subject of the quality check.
      • Main items:
          – Compiler
          – Compiler version
          – Optimization flags
      Example of configuration file:
      {
        "DISTRIBUTION" : "OpenSUSE Tumbleweed",
        "ARCH" : "aarch64",
        "CPU" : "AMD Opteron A1100 Cortex A57",
        "LANGUAGE" : "C",
        "COMPILER" : "GCC",
        "COMMAND" : "/usr/bin/gcc",
        "VERSION" : "7.1.1",
        "OPT_FLAGS" : ["-O2"],
        "ASM_FLAGS" : ["-S", "-fverbose-asm"],
        "FLAG_DB" : [["?DEBUG_FLAG", "-g"],
                     ["?C99_STANDARD", "-std=c99"]]
      }
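A configuration like the one on this slide can be read with any JSON parser once the typographic quotes are replaced by plain ASCII quotes. The sketch below is only an illustration: the helper `build_asm_command` and the source file name `kernel.c` are hypothetical, and the way HCQC actually combines `COMMAND`, `OPT_FLAGS` and `ASM_FLAGS` is an assumption, not something the slides state.

```python
import json

# Configuration from the slide, with ASCII quotes so that it parses as JSON.
CONFIG_TEXT = """
{
  "DISTRIBUTION": "OpenSUSE Tumbleweed",
  "ARCH": "aarch64",
  "CPU": "AMD Opteron A1100 Cortex A57",
  "LANGUAGE": "C",
  "COMPILER": "GCC",
  "COMMAND": "/usr/bin/gcc",
  "VERSION": "7.1.1",
  "OPT_FLAGS": ["-O2"],
  "ASM_FLAGS": ["-S", "-fverbose-asm"],
  "FLAG_DB": [["?DEBUG_FLAG", "-g"], ["?C99_STANDARD", "-std=c99"]]
}
"""


def build_asm_command(config, source_file):
    """Hypothetical helper: compiler command, then both flag lists, then the source."""
    return [config["COMMAND"], *config["OPT_FLAGS"], *config["ASM_FLAGS"], source_file]


config = json.loads(CONFIG_TEXT)
asm_command = build_asm_command(config, "kernel.c")  # "kernel.c" is a placeholder name
print(" ".join(asm_command))
```

Keeping the flags as lists rather than one string means the command can be passed to a process launcher without shell quoting issues.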
  • 4. Metrics for Quality Evaluation
      • HCQC has the following metrics:
          – op : number of mnemonics in the assembly code
          – kind : kinds of mnemonics in the assembly code (memory, branch, other)
          – regalloc : quality of register allocation (number of spill-in/spill-out instructions)
          – height : height of the instruction dependence graph
          – ilp : instruction-level parallelism achieved by the instruction scheduler
          – vectorize : vectorization/SIMDization status
          – swpl : initiation interval (II) achieved by software pipelining
      • These data are basically static data obtained at compile time.
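As a rough illustration of the `kind` metric, the sketch below counts AArch64 mnemonics per category. The classification rule (mnemonics starting with `ld` or `st` count as memory, a small fixed set plus `b.<cond>` counts as branch) and the sample loop are assumptions made for this example; the slides do not say how HCQC's own metric program classifies instructions.

```python
from collections import Counter

# Assumed rule of thumb: AArch64 loads and stores start with "ld" or "st".
MEMORY_PREFIXES = ("ld", "st")
# Assumed branch set; conditional branches of the form "b.<cond>" are handled separately.
BRANCH_MNEMONICS = {"b", "bl", "br", "blr", "ret", "cbz", "cbnz", "tbz", "tbnz"}


def classify(mnemonic):
    if mnemonic.startswith(MEMORY_PREFIXES):
        return "memory"
    if mnemonic in BRANCH_MNEMONICS or mnemonic.startswith("b."):
        return "branch"
    return "other"


def count_kinds(asm_text):
    counts = Counter({"memory": 0, "branch": 0, "other": 0})
    for line in asm_text.splitlines():
        code = line.split("//")[0].strip()       # drop trailing comments
        if not code or code.endswith(":") or code.startswith("."):
            continue                             # skip blanks, labels and directives
        counts[classify(code.split()[0])] += 1
    return counts


# Hypothetical loop, not taken from the slides.
SAMPLE = """
.L1:
    ldr  w1, [x0]
    add  w1, w1, #1
    str  w1, [x0], #4
    subs w2, w2, #1
    b.ne .L1
"""

kinds = count_kinds(SAMPLE)
print(dict(kinds))
```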
  • 5. Investigation Result
      ARCH : aarch64
      CPU : AMD Opteron A1100 Cortex A57
      LANGUAGE : C
      COMPILER : ClangLLVM
      COMMAND : /usr/bin/clang
      VERSION : 4.0.1
      OPT_FLAGS : -O2
      TEST_PROGRAM : sample
      KERNEL_FUNCTION_NAME : kernel
      DATE : 2017/11/07

      CFG                | DEPTH | kind:memory | kind:branch | kind:other | regalloc:spill in | regalloc:spill out | ilp:IPC | swpl:II | vectorize:kind | vectorize:mem | vectorize:arith | vectorize:other
      BB0 cond LBB0_11   | 0     | 3           | 1           | 3          | 0                 | 0                  | 0.5     |         |                | 0             | 0               | 0
      BB1                | 0     | 0           | 0           | 4          | 0                 | 0                  | 0.5     |         |                | 0             | 0               | 0
      LBB0_2             | 1     | 3           | 0           | 5          | 0                 | 1                  | 0.7     |         |                | 0             | 0               | 0
      LBB0_3 cond LBB0_5 | 2     | 2           | 1           | 1          | 2                 | 0                  | 0.9     |         | SLP            | 0             | 1               | 0
      BB4 LBB0_7 LBB0_9  | 2     | 1           | 2           | 5          | 0                 | 0                  | 0.9     |         | SLP            | 0             | 1               | 1
      LBB0_5 cond LBB0_9 | 2     | 3           | 1           | 5          | 0                 | 0                  | 1.3     |         | SLP            | 0             | 1               | 1
      LBB0_7             | 2     | 3           | 0           | 5          | 0                 | 3                  | 1.7     |         | SLP            | 0             | 1               | 2
      LBB0_8 cond LBB0_8 | 3     | 7           | 1           | 5          | 0                 | 0                  | 2.5     | 5       | LOOP,SLP       | 2             | 2               | 2
      LBB0_9 cond LBB0_3 | 2     | 2           | 1           | 3          | 0                 | 2                  | 1.3     |         | SLP            | 0             | 1               | 1
      BB10 cond LBB0_2   | 1     | 1           | 1           | 2          | 0                 | 0                  | 0.8     |         |                | 0             | 0               | 0
      LBB0_11            | 0     | 0           | 0           | 1          | 0                 | 0                  | 0.2     |         |                | 0             | 0               | 0
      *SUMMARY*          |       | 25          | 8           | 39         | 2                 | 6                  |         |         |                | 2             | 7               | 7
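The `*SUMMARY*` row of this report is consistent with a column-wise total over the basic blocks. The sketch below checks that for the three `kind` columns, using the per-block values from the slide. That HCQC computes the summary this way is an inference from the numbers, not something the slides state.

```python
# (block, memory, branch, other) per basic block, copied from the report on the slide.
KIND_ROWS = [
    ("BB0", 3, 1, 3),
    ("BB1", 0, 0, 4),
    ("LBB0_2", 3, 0, 5),
    ("LBB0_3", 2, 1, 1),
    ("BB4", 1, 2, 5),
    ("LBB0_5", 3, 1, 5),
    ("LBB0_7", 3, 0, 5),
    ("LBB0_8", 7, 1, 5),
    ("LBB0_9", 2, 1, 3),
    ("BB10", 1, 1, 2),
    ("LBB0_11", 0, 0, 1),
]

# Sum each numeric column over all blocks.
summary = tuple(sum(row[i] for row in KIND_ROWS) for i in (1, 2, 3))
print(summary)
```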
  • 6. Quality Evaluation by Comparison
      • A single investigation result has little meaning on its own.
      • Typical comparison examples:
          – GCC vs. LLVM (on AArch64)
          – LLVM 4.0.0 vs. LLVM 5.0.0 (on AArch64)
          – LLVM with -O2 vs. LLVM with -O3 (on AArch64)
          – LLVM on AArch64 vs. LLVM on x86_64: missing optimizations on AArch64
          – LLVM on AArch64 vs. ICC on x86_64: optimization hints for SVE from AVX code
  • 7. Example of Comparison (1)
      • HimenoBMT-dynamic (regalloc)
          – GCC is better than Clang/LLVM.

      GCC:
      CFG            | DEPTH | spill in | spill out
      jacobi cond .L18 | 0   | 0        | 7
                     | 0     | 0        | 5
      .L3 cond .L16  | 1     | 1        | 0
                     | 1     | 3        | 2
      .L8 cond .L28  | 2     | 1        | 0
                     | 2     | 4        | 6
      .L11 cond .L7  | 3     | 0        | 1
                     | 3     | 10       | 0
      .L4 cond .L4   | 4     | 0        | 0
      .L7 cond .L11  | 3     | 2        | 0
      .L5 cond .L8   | 2     | 5        | 1
                     | 1     | 5        | 0
      .L9 cond .L13  | 2     | 0        | 0
                     | 2     | 0        | 0
      .L17 cond .L15 | 3     | 0        | 0
                     | 3     | 0        | 0
      .L12 cond .L12 | 4     | 0        | 0
      .L15 cond .L17 | 3     | 0        | 0
      .L13 cond .L9  | 2     | 0        | 0
      .L16 cond .L3  | 1     | 2        | 1
      end            | 0     | 0        | 0
      .L28 goto .L5  | 2     | 1        | 2
      .L18 end       | 0     | 0        | 0
      *SUMMARY*      | -     | 34       | 25

      LLVM:
      CFG                    | DEPTH | spill in | spill out
      jacobi cond .LBB0_32   | 0     | 0        | 12
                             | 0     | 0        | 6
      .LBB0_2 cond .LBB0_31  | 1     | 1        | 0
                             | 1     | 0        | 2
      .LBB0_4 cond .LBB0_11  | 2     | 0        | 1
                             | 2     | 0        | 1
      .LBB0_6 cond .LBB0_10  | 4     | 0        | 0
                             | 3     | 13       | 7
      .LBB0_8 cond .LBB0_8   | 4     | 2        | 0
      cond .LBB0_6           | 3     | 3        | 0
      goto .LBB0_11          | 2     | 0        | 0
      .LBB0_10 cond .LBB0_6  | 3     | 0        | 0
      .LBB0_11 cond .LBB0_4  | 2     | 3        | 2
      cond .LBB0_30          | 1     | 1        | 0
                             | 1     | 3        | 0
      .LBB0_14 cond .LBB0_29 | 2     | 0        | 0
      goto .LBB0_19          | 2     | 0        | 0
      .LBB0_16               | 3     | 0        | 0
      .LBB0_17 cond .LBB0_17 | 4     | 0        | 0
      cond .LBB0_28          | 3     | 0        | 0
      goto .LBB0_26          | 3     | 0        | 0
      .LBB0_19 cond .LBB0_28 | 3     | 0        | 0
      cond .LBB0_22          | 3     | 1        | 0
      goto .LBB0_26          | 3     | 0        | 0
      .LBB0_22 cond .LBB0_25 | 3     | 0        | 0
      cond .LBB0_16          | 3     | 0        | 0
      cond .LBB0_16          | 3     | 0        | 0
      .LBB0_25               | 3     | 0        | 0
      .LBB0_26               | 3     | 0        | 0
      .LBB0_27 cond .LBB0_27 | 4     | 0        | 0
      .LBB0_28 cond .LBB0_19 | 3     | 0        | 0
      .LBB0_29 cond .LBB0_14 | 2     | 1        | 0
      goto .LBB0_31          | 1     | 0        | 0
      .LBB0_30               | 1     | 1        | 0
      .LBB0_31 cond .LBB0_2  | 1     | 0        | 0
      goto .LBB0_33          | 0     | 0        | 0
      .LBB0_32               | 0     | 0        | 0
      .LBB0_33 end           | 0     | 8        | 0
      *SUMMARY*              | -     | 37       | 31
  • 8. Example of Comparison (2)
      • mVMC-mini-calculateNewPfMTwo_child (op)
          – GCC generates code with better addressing modes.

      LLVM:
      CFG                                     | DEPTH | lsl
      calculateNewPfMTwo_child cond .LBB0_3   | 0     | 0
                                              | 0     | 0
      .LBB0_2 cond .LBB0_2                    | 1     | 1
      .LBB0_3 cond .LBB0_11                   | 0     | 0
                                              | 0     | 0
      .LBB0_5 cond .LBB0_5                    | 1     | 0
      cond .LBB0_12                           | 0     | 0
                                              | 0     | 1
      .LBB0_8                                 | 1     | 0
      .LBB0_9 cond .LBB0_9                    | 2     | 0
      cond .LBB0_8                            | 1     | 0
      goto .LBB0_13                           | 0     | 0
      .LBB0_11                                | 0     | 0
      .LBB0_12                                | 0     | 0
      .LBB0_13 end                            | 0     | 0
      *SUMMARY*                               | -     | 2

      GCC (columns CFG, DEPTH, lsl; one row per line):
      calculateNewPfMTwo_child cond .L2 0 0
      .L3 cond .L3 1
      .L2 cond .L8 0 0
      .L5 cond .L5 1 0
      .L7 1
      .L6 cond .L6 2
      cond .L7 1 0
      .L4 end 0
      .L8 goto .L4 0
      *SUMMARY* -

      LLVM code for .LBB0_2:
      .LBB0_2:
          ldr w1, [x5, x16, lsl #2]
          sdiv w2, w16, w13
          lsl x3, x16, #3
          add x16, x16, #1
          madd w1, w17, w2, w1
          sbfiz x1, x1, #3, #32
          ldr x2, [x18, x1]
          cmp x11, x16
          str x2, [x10, x3]
          ldr x1, [x0, x1]
          str x1, [x9, x3]
          b.ne .LBB0_2
          // x3 is dead
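The `lsl` column on this slide can be approximated by counting instructions whose mnemonic is `lsl`, as opposed to `lsl` used as a shift modifier inside an addressing mode. The sketch below does that for the LLVM loop body quoted on the slide. The helper `count_mnemonic` is hypothetical and is not HCQC's `op` metric program.

```python
# LLVM loop body from the slide (comment line omitted).
LLVM_BLOCK = """\
.LBB0_2:
    ldr w1, [x5, x16, lsl #2]
    sdiv w2, w16, w13
    lsl x3, x16, #3
    add x16, x16, #1
    madd w1, w17, w2, w1
    sbfiz x1, x1, #3, #32
    ldr x2, [x18, x1]
    cmp x11, x16
    str x2, [x10, x3]
    ldr x1, [x0, x1]
    str x1, [x9, x3]
    b.ne .LBB0_2
"""


def count_mnemonic(asm_text, mnemonic):
    """Count instructions whose first token is exactly the given mnemonic."""
    count = 0
    for line in asm_text.splitlines():
        code = line.split("//")[0].strip()   # drop trailing comments
        if not code or code.endswith(":"):
            continue                         # skip blank lines and labels
        if code.split()[0] == mnemonic:
            count += 1
    return count


lsl_instructions = count_mnemonic(LLVM_BLOCK, "lsl")
print(lsl_instructions)  # 1, matching the lsl value reported for .LBB0_2
```

Only `lsl x3, x16, #3` is counted. The shift inside `[x5, x16, lsl #2]` is part of the addressing mode, which is the form the slide credits GCC with using more effectively.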
  • 9. Example of Comparison (3)
      • HimenoBMT-static (height)
          – Clang/LLVM is not good for register usage.

      Compiler | CFG                  | DEPTH | # | height | # / height
      LLVM     | .LBB1_8 cond .LBB1_8 | 4     | 71 | 25    | 71/25 = 2.84
      GCC      | .L19 cond .L19       | 4     | 62 | 17    | 62/17 = 3.647

      LLVM code for .LBB1_8:
      .LBB1_8:
          add x17, x4, x14
          mov v6.16b, v17.16b
          ldr s17, [x17, #8844]
          add x16, x3, x14
          mov v7.16b, v16.16b
          ldr s16, [x16, #8844]
          add x17, x25, x14
          ldr s21, [x17, x7]
          fmul s0, s17, s0
          ldr s17, [x17, x21]
          add x16, x5, x14
          mov v5.16b, v18.16b
          ldr s18, [x16, #8844]
          ……

      The metric `height' is not committed on GitHub yet.
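The two ratios on this slide divide the instruction count of the loop body by the height of its dependence graph. The sketch below assumes that "height" means the number of instructions on the longest dependence chain; the slides do not define it, and the `height` metric program was not yet published. The five-instruction graph is a made-up example.

```python
from functools import lru_cache


def graph_height(deps):
    """deps maps each instruction index to the indices it depends on (a DAG)."""

    @lru_cache(maxsize=None)
    def depth(node):
        # A node with no predecessors sits at level 1.
        return 1 + max((depth(p) for p in deps[node]), default=0)

    return max((depth(n) for n in deps), default=0)


# Hypothetical block: 0 and 1 are independent loads, 2 uses both,
# 3 uses 2, and 4 is independent of everything else.
deps = {0: (), 1: (), 2: (0, 1), 3: (2,), 4: ()}
height = graph_height(deps)      # longest chain 0 -> 2 -> 3
ratio = len(deps) / height       # instructions per dependence level

# The ratios quoted on the slide, recomputed from its counts.
llvm_ratio = 71 / 25
gcc_ratio = 62 / 17
print(height, round(ratio, 3), round(llvm_ratio, 3), round(gcc_ratio, 3))
```

Under this reading, a lower height for a similar instruction count means the block has a shorter critical path and more instructions available per level.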
  • 10. Workflow of HCQC
      ① Compile one test program
      ② Run the executable file and verify its result by comparing the output with the answer data
      ③ Generate the assembly code file
      ④ Make the control flow graph of the kernel part from the assembly code
      ⑤ Get result data using metric programs
      ⑥ Make the report file from the data

      % hcqc config test metric+
      % hcqc-report config test metric+
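Step ④ starts by cutting the kernel's assembly into basic blocks. The sketch below shows one simple way to do that: a block begins at a label and ends after a branch, and unlabeled blocks are named `BB<n>` after their position, which resembles the naming in the sample report. This is an assumption-laden illustration rather than the logic of HCQC's `cfg.py`, and the input assembly is hypothetical.

```python
# Assumed set of block-ending instructions; calls (bl, blr) are not treated as block ends.
BRANCHES = {"b", "br", "ret", "cbz", "cbnz", "tbz", "tbnz"}


def is_branch(mnemonic):
    return mnemonic in BRANCHES or mnemonic.startswith("b.")


def split_basic_blocks(asm_text):
    """Return a list of (block_name, [instructions]) in program order."""
    blocks = []
    name, body = None, []

    def flush():
        nonlocal name, body
        if body:
            blocks.append((name or f"BB{len(blocks)}", body))
        name, body = None, []

    for line in asm_text.splitlines():
        code = line.split("//")[0].strip()
        if not code:
            continue
        if code.endswith(":"):          # a label starts a new block
            flush()
            name = code[:-1]
        elif code.startswith("."):      # assembler directive, not an instruction
            continue
        else:
            body.append(code)
            if is_branch(code.split()[0]):
                flush()                 # a branch ends the current block
    flush()
    return blocks


# Hypothetical kernel, not taken from the slides.
SAMPLE = """
kernel:
    cmp w1, #0
    b.le .LBB0_3
    mov w2, #0
.LBB0_2:
    add w2, w2, #1
    cmp w2, w1
    b.ne .LBB0_2
.LBB0_3:
    ret
"""

blocks = split_basic_blocks(SAMPLE)
block_names = [block_name for block_name, _ in blocks]
print(block_names)
```

A full control flow graph would also record each block's successors and loop depth; those parts are left out here to keep the example short.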
  • 11. Workflow of HCQC (diagram)
      Inputs: Configuration File (COMMAND:/usr/bin/clang, OPT_FLAGS:-O2), Test Program, Test Program Info File
      ① Test Program → Executable File
      ② Executable File with in.data → out.data; diff check against result.data
      ③ Assembly Code File
      ④ cfg.py → CFG+DEPTH
      ⑤ Metric Programs → Result Data
      ⑥ hcqc-report → Report file (csv)
      Other labels in the diagram: JSON format file, Debug Information?
      The report file shown in the diagram is the sample report from slide 5, with the additional header line DISTRIBUTION : OpenSUSE Tumbleweed.
  • 12. Test Programs for HCQC
      • Generated from programs that were problematic in Fujitsu's production compilers in the past
      • Kernel parts are extracted and modified for use under HCQC:
          – Extract hot spots
          – If the program is written in Fortran, convert it to C (for comparison between GCC and Clang/LLVM)
          – Prepare the data needed to run and check those kernel parts
  • 13. Test Programs for HCQC
      • All original benchmarks are publicly available.
      • I/O data for HCQC is being prepared.
          – Some data files are very large.
          – For different architectures or different optimization levels, error tolerance is required.

      benchmark name    | kernel name
      HimenoBMT-dynamic | jacobi
      HimenoBMT-static  | jacobi
      hpcg-3.0          | ComputeSYMGS_ref
      ccs-qcd           | bicgstab_hmc
      ccs-qcd           | clover
      ffb-mini          | CALAX3
      ffb-mini          | FLD3X2
      ffb-mini          | GRAD3X
      ffvc-mini         | poi_residual
      ffvc-mini         | psor2sma_core
      mVMC-mini         | calculateNewPfMTwo_child
      mVMC-mini         | updateMAllTwo_child
      mVMC-mini         | updateMAll_child
      ngsa-mini         | bwt_match_exact_alt
      ngsa-mini         | bwt_match_gap
      nicam-dc-mini     | vi_path2
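Because floating-point results can differ slightly between architectures and optimization levels, an exact `diff` of output against answer data may report false failures. The sketch below shows one possible tolerance-based check. The function name `outputs_match` and the default tolerances are assumptions for illustration; the slides do not say what tolerance HCQC uses.

```python
import math


def outputs_match(output_values, answer_values, rel_tol=1e-6, abs_tol=1e-12):
    """Return True if both sequences have equal length and agree within tolerance."""
    if len(output_values) != len(answer_values):
        return False
    return all(
        math.isclose(out, ans, rel_tol=rel_tol, abs_tol=abs_tol)
        for out, ans in zip(output_values, answer_values)
    )


# A relative difference of about 1e-7 passes; a 1% difference does not.
close_enough = outputs_match([1.0000001, 2.0], [1.0, 2.0])
too_far = outputs_match([1.01, 2.0], [1.0, 2.0])
print(close_enough, too_far)
```

The absolute tolerance matters only for values near zero, where a purely relative comparison would reject any nonzero difference.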
  • 14. Future Work
      • Add support for SVE (if available in GCC or LLVM)
      • Implement metric programs:
          – vectorization (vectorize)
          – software pipelining (swpl)
          – instruction level parallelism (ilp)
      • Add features for comparing with x86_64 (SVE vs. AVX)
      • Add tools for automatic and intelligent comparison
  • 15. URL: https://github.com/Linaro/hcqc
      Thank you very much! Any comments or suggestions are welcome.