SlideShare uma empresa Scribd logo
1 de 40
Baixar para ler offline
Panic Attack
Finding some order in the panic chaos
Guilherme G. Piccoli (Igalia)
2023-09-26 / Kernel Recipes
1
Context
Interest in having a panic log collecting tool on Arch / SteamOS
Analysis of kernel infra available - different use cases:
kdump == more data collected, heavier on resources
pstore == log collected on panic -> lightweight, but less data
By playing with kdump/pstore, crossed paths with panic
notifiers
Panic path is full of trade-offs / conflicting goals
Panic notifiers discussions, ideas and eventual refactor
Some other orthogonal problems on panic time
Interrupt storms / Graphics on panic
2
Disclaimer
Feel free to interrupt with questions
Multiple concepts / dense topic
Risks of "assumed knowledge"
kexec set of recent problems won't be addressed here
Memory preserving across kexec boots
SEV / TDX problems with kexec
Unikernels support, etc.
3
Outline
The genesis of this work: SteamOS
Panic notifiers: discussion and refactor
Chaos on kdump: a real case of interrupt storm
Challenges of GFX on panic: dream or reality?
4
Where all started: Steam Deck
Steam Deck, from Valve
CPU/APU AMD Zen 2 (custom), 4-cores/8-threads
16 GB of RAM / 7" display
3 models of NVMe storage (64G, 256G, 512G)
5
Deck's distro: SteamOS 3
Arch Linux based distro with gamescope (games) and KDE
Plasma (desktop)
Sophisticated stack for games: Steam, Proton (Wine), DXVK,
VKD3D, etc
Arch Linux has no kdump official tool
Steam Deck community would benefit of such tool for panic log
collection!
6
Requirements: what logs to collect?
Collect the most logs we can: dmesg (call trace), tasks' state,
memory info
Though being careful with size - should be easy to share
Information that could be used for kernel/HW debugging
Rely on in-kernel infrastructure for that - don't reinvent the
wheel
7
How to collect such logs? Kernel infra
kdump: kexec-loaded crash kernel
kexec to a new kernel to collect info from the broken kernel
Requires pre-reserved memory (>200MB usually)
Collects a vmcore (full memory image) of the crashed kernel
Lots of information, but heavy / hard for users to share it
pstore: persistent storage log saving
Save dmesg during panic time to some backend
Multiple backends (RAM, UEFI, ACPI, etc)
Also multiple frontends (oops, ftrace, console, etc)
Enough amount of information? (dmesg only)
Both tools benefits from userspace counter-part
Kdump tooling common (Debian/Fedora), but not Arch
8
Presenting kdumpst
is an Arch Linux kdump and pstore tool
Available on , supports GRUB and initcpio / dracut
Defaults to pstore; currently only ramoops backend (UEFI plans)
Used by default on Steam Deck, submits logs to Valve
But how to improve the amount of logs on dmesg?
panic_print FTW!
kdumpst
AUR
9
panic_print VS pstore ordering
panic_print parameter allows to show more data on dmesg
during panic
Tasks info, system memory state, timers
But such function runs after pstore! So can't collect the data.
Idea: re-order the code
Move the call earlier in the panic path
[discussion]
10
Panic (over-simplified) code path
local IRQ and
preempt disable
dump stack
kdump?
crash_kexec()
disable the
other CPUs
panic notifiers
and kmsg_dump()
arch code / reboot
arch code / kexec
YES NO
11
The code re-ordering
/* Simplified function names */
void panic()
[...]
<---------------------|
if (!panic_notifiers) |
crash_kexec(); /* kdump */ |
|
panic_notifiers(); |
|
kmsg_dump(KMSG_DUMP_PANIC); /* pstore! */ |
|
if (panic_notifiers) |
crash_kexec(); |
|
panic_print(); ---------------------------|
[...]
12
And then, the discussion starts...
Problems with such approach: panic_print before kdump is
risky
with Baoquan and others
Alternative: propose less invasive change, moving that before
pstore only
New problem then: what if users want panic_print before
kdump?
Makes sense if vmcore is too much
Only possible if we run the panic notifiers before kdump! So
the notifiers journey begins...
[discussion]
[discussion]
13
Outline
The genesis of this work: SteamOS
Panic notifiers: discussion and refactor
Chaos on kdump: a real case of interrupt storm
Challenges of GFX on panic: dream or reality?
14
Notifier call chains
List of callbacks to be executed (usually) in any order
There's a (frequently unused) "priority" tune for call ordering
Multiple types - atomic callbacks, blocking callbacks, etc
Panic notifiers == list of atomic callbacks executed on panic
/* Example from kernel/rcu/tree_stall.h */
/* Don't print RCU CPU stall warnings during a kernel panic. */
static int rcu_panic(...)
{
rcu_cpu_stall_suppress = 1;
return NOTIFY_DONE;
}
static struct notifier_block rcu_panic_block = {
.notifier_call = rcu_panic,
};
atomic_notifier_chain_register(&panic_notifier_list, &rcu_panic_block);
15
Deep dive into panic notifiers
Any driver (even OOT) can register a notifier, to do...anything!
Risky for kdump reliability / but sometimes, notifiers could be
necessary
"Solution": a new kernel parameter, crash_kexec_post_notifiers
Proper name for a bazooka shot: all-or-nothing option, runs
ALL notifiers before kdump
Middle-ground idea: panic notifiers filter! User selects which
notifiers to run
kdump maintainers kinda welcome the feature:
But really paper over a real issue: notifiers is a no man's land
Very good from Petr Mladek exposed the need of a
refactor
[discussion]
analysis
16
Mladek's refactor proposal
Split panic notifiers in more lists, according to their "goals"
Information ones: extra info dump, stop watchdogs
Hypervisor/FW poking notifiers
Others: actions taken when kdump isn't set (LED blink, halt)
Ordering regarding kdump
Hypervisor list before kdump
Info list also before, IF any kmsg_dump() is set
Final list runs only if kdump isn't set
V1 submitted ~1y ago
Special thanks to Petr Mladek for the idea and all reviews.
Thanks also Baoquan and Michael Kelly (Hyper-V) for the
great discussions!
17
1st step - fixing current panic notifiers
First thing: build a list with all existing in-tree panic notifiers
As of today (6.6-rc2): 47 notifiers (18 on arch/)
Fix / improve them, before splitting in lists. Some patterns:
Decouple multi-purpose notifiers
Change ordering through the notifier's priorities
Machine halt or firmware-reset - put'em to run last
Disabling watchdogs (RCU, hung tasks): run ASAP
Avoid regular locks
Panic path disables secondary CPUs, interrupts,
preemption
mutex_trylock() and spin_trylock() FTW
18
Real example: pvpanic
/* drivers/misc/pvpanic/pvpanic.c - simplified code */
static void pvpanic_send_event() {
- spin_lock(&pvpanic_lock);
+ if (!spin_trylock(&pvpanic_lock))
+ return;
static int panic_panic_notify(...) {
pvpanic_send_event(PVPANIC_PANICKED);
}
[...]
+ /* Call our notifier very early on panic */
static struct notifier_block pvpanic_panic_nb = {
.notifier_call = pvpanic_panic_notify,
- .priority = 1,
+ .priority = INT_MAX,
};
19
List splitting (yay, a 4th list!)
Original plan was splitting in 3 lists, but... ended-up with 4
list: hypervisor/FW notification, LED blinking
Hyper-V, PPC/fadump, pvpanic, LEDs stuff, etc
list: dump extra info, disable watchdogs
KASLR offsets, RCU/hung task watchdog off, ftrace_dump_on_oops
: includes the remaining ones (halt, risky funcs)
S390 and PPC/pseries FW halt, IPMI interfaces notification
: contains previously hardcoded (arch) final calls
SPARC "stop" button enabling (if reboot on panic not set)
List to be renamed on V2 (loop list)
Hypervisors
Informational
Pre-reboot
Post-reboot
20
The notifier "levels" model
One of the biggest questions regarding panic notifiers
Which ones should run before kdump?
Usual / possible answer: low risk / necessary ones
Introduce the concept of
Fine-grained tuning of which lists run before/after kdump
Defaults to:
Hypervisor always run before
Sometimes informational also (if kmsg_dump() is set)
Implementation maps levels into bits and order the lists
Was gently called "black magic" on review
panic notifier levels
21
Subsequent improvements
Proposal to convert panic_print into a panic notifier
Good acceptance, fits perfectly the informational list
Stop exporting crash_kexec_post_notifiers
Sadly some users of panic notifiers forcibly set this
parameter in code
Hyper-V is one of such users, and they have reasons for
that...
22
Hyper-V case / arm64 custom crash
handler
Hyper-V requires hypervisor action in case of kdump
Requires to unload its "vmbus connection" before crash
kernel takes over
x86 does it on crash through machine_ops() crash shutdown
arm64 though doesn't have similar architecture hook
with arm64 maintainers revealed little interest in
adding that
Unworthy complexity / not a good idea to mimic x86 case
Forcing panic notifiers seems a last resort for Hyper-V
Unless some alternative for arm64 is implemented
Discussion
23
Pros / Cons and follow-up discussion
Exhaustive exposed plenty conflicting views
First of all, not really clear what should run before kdump
The notifiers lists are incredibly flexible and "loose"
How to be sure anyone knowledgeable on panic will review?
Brainstorm: somehow force registering users to add the cb
name to a central place?
Less is more: too much flexibility is not a good fit for panic
Also, are notifier lists reliable on panic path?
What if memory corruption corrupts the list?
Alternatives? Hardcoded calls? (headers/exports hell)
discussion
24
Next steps / V2
Rework lists as suggested (move some callbacks here and
there)
Split submission - first the lists, then the refactor (kdump vs
notifiers order)
Consider ways of improving panic notifiers review
Improve documentation
Central place for registering !?
25
Outline
The genesis of this work: SteamOS
Panic notifiers: discussion and refactor
Chaos on kdump: a real case of interrupt storm
Challenges of GFX on panic: dream or reality?
26
Shifting gears: an interrupt storm tale
Another painful area to deal with is device state on kdump
A regular kexec would handle device's quiesce process
.shutdown() callback
Crash kexec (kdump) can't risk that -> way more limited
environment
Real case: device caused an interrupt storm, kdump couldn't
boot
27
The problem
Intel NIC running under PCI-PT (SR-IOV)
No in-tree driver - DPDK instead
Custom tool collecting NIC stats, triggered weird NIC FW bug
Symptom: lockups on host, non-responsive system
(Non-trivial) cause: NIC interrupt storm
Kdump attempt: unsuccessful -> crash kernel hung on boot
Guess what? Still the interrupt storm!
28
Look 'ma, no PCI reset
Despite kexec is a new boot, there are many differences from
FW boot
A fundamental limitation is the lack of PCI controller reset
x86 has no "protocol" / standard for root complexes resets
PPC64 has a FW-aided PCI reset (ppc_pci_reset_phbs)
Multiple debug attempts later...an idea: clear devices's MSIs
on boot
But how to achieve this? PCI layer is initialized much later
x86 early PCI infrastructure FTW! (Special thanks to Gavin
Shan)
29
pci=clearmsi proposal
Through the early PCI trick, we could clear the MSIs of all PCI
devs
Interrupt storm was shut-off and kdump boot succeeded
to linux-pci (~3y ago)
Some concerns from Bjorn (PCI maintainer)
First: limited approach -> pci_config_16()
This conf mode access is limited to first domain/segment
Other concern: solution only for x86
In principle, this affects more archs
Patches
30
Discussion
Also, was not really clear exactly what was the precise point of
failure
Thanks Thomas Gleixner for that
Interrupt flood happens right when interrupts are enabled on
start_kernel()
MSIs are DMA writes to a memory area (interrupt remapping
tables)
An IOMMU approach was suggested
Clearing these mappings and IOMMU error reporting early
in boot
Proper cleaning routines to run on panic kernel also suggested
clarifying
31
Potential next steps
Attempt implementing the IOMMU idea
Too limited? What if no IOMMU?
Investigate other archs to see how's the status
Reliably reproduce the problem!
Extend early PCI conf access mode?
Bjorn would be unhappy
32
Outline
The genesis of this work: SteamOS
Panic notifiers: discussion and refactor
Chaos on kdump: a real case of interrupt storm
Challenges of GFX on panic: dream or reality?
33
Final problem: GFX on panic
GPUs are complex beasts / interrupts are disabled on panic
Even regular kexec are challenging for them!
Currently, no reliable way to dump data on display during panic
Though it would be great for users to see something on crash
Reliable GFX on kdump? Wishful thinking
34
Framebuffer reuse
While working on kdumpst, experimenting with GFX on kdump
Managed to make it work only with framebuffers
Why not restore the FB on kdump then?
Interesting shows it's definitely not trivial
Once a GPU driver takes over, HW is reprogrammed
GOP driver (UEFI) programs FB/display
We'd need to reprogram the device either on panic (ugh) or
on kdump kernel
discussion
35
Current approaches
Noralf Trønnes (~4y ago)
Iterates on available framebuffers, find a suitable one
Jocelyn Falempe (last week)
Works with simpledrm currently, API to get a scanout buffer
Seems on early stages, with great potential / community
acceptance
Panic time approaches are risky / limited, must be simple
Not sure if that's possible one day for amdgpu / i915
proposal
proposal
36
Different approach: FW notification
What if we print nothing on panic, but defer for FW / next
kernel?
UEFI panic notification (~1y ago)
Simple UEFI variable set on kernel panic (through notifiers!)
Next kernel clears the var (and potentially prints something)
Simple and flexible - FW could plot a different logo
UEFI maintainer (Ard) not really convinced
Suggestions for using UEFI pstore for tracking that
Orthogonal goals / limited space on UEFI / dmesg "privacy"
Next steps: might try to implement that solution in a prototype
proposal
37
Conclusion
Quite a long path, from Linux gaming to panic notifiers refactor
Everything on panic is polemic / conflicting
"Slightly" long road ahead for the refactor
V2 of the refactor soon(tm), not so invasive
HW quiesce on crash kexec is still full of issues
Interesting area for some research / multi-arch work (IMHO)
GFX on panic: still in early stages, other OSes / game consoles
seems to have it
The UEFI approach, while kinda orthogonal, it's way simpler
38
THANKS
Feel free to reach me on IRC (gpiccoli - OFTC/Libera)
39
Panic Attack – a discussion about kdump, panic notifiers, graphics on crash event and all of that

Mais conteúdo relacionado

Semelhante a Panic Attack – a discussion about kdump, panic notifiers, graphics on crash event and all of that

When kdump is way too much
When kdump is way too muchWhen kdump is way too much
When kdump is way too muchIgalia
 
Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant Ricardo Amaro
 
Docker Container: isolation and security
Docker Container: isolation and securityDocker Container: isolation and security
Docker Container: isolation and security宇 傅
 
Docker Security Paradigm
Docker Security ParadigmDocker Security Paradigm
Docker Security ParadigmAnis LARGUEM
 
[Defcon] Hardware backdooring is practical
[Defcon] Hardware backdooring is practical[Defcon] Hardware backdooring is practical
[Defcon] Hardware backdooring is practicalMoabi.com
 
Drupalcamp es 2013 drupal with lxc docker and vagrant
Drupalcamp es 2013  drupal with lxc docker and vagrant Drupalcamp es 2013  drupal with lxc docker and vagrant
Drupalcamp es 2013 drupal with lxc docker and vagrant Ricardo Amaro
 
Linux Crash Dump Capture and Analysis
Linux Crash Dump Capture and AnalysisLinux Crash Dump Capture and Analysis
Linux Crash Dump Capture and AnalysisPaul V. Novarese
 
Linux Kernel Tour
Linux Kernel TourLinux Kernel Tour
Linux Kernel Toursamrat das
 
Linux kernel driver tutorial vorlesung
Linux kernel driver tutorial vorlesungLinux kernel driver tutorial vorlesung
Linux kernel driver tutorial vorlesungdns -
 
“Linux Kernel CPU Hotplug in the Multicore System”
“Linux Kernel CPU Hotplug in the Multicore System”“Linux Kernel CPU Hotplug in the Multicore System”
“Linux Kernel CPU Hotplug in the Multicore System”GlobalLogic Ukraine
 
Development platform virtualization using qemu
Development platform virtualization using qemuDevelopment platform virtualization using qemu
Development platform virtualization using qemuPremjith Achemveettil
 
Kubernetes laravel and kubernetes
Kubernetes   laravel and kubernetesKubernetes   laravel and kubernetes
Kubernetes laravel and kubernetesWilliam Stewart
 
Anatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxCon
Anatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxConAnatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxCon
Anatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxConJérôme Petazzoni
 
Containerization is more than the new Virtualization: enabling separation of ...
Containerization is more than the new Virtualization: enabling separation of ...Containerization is more than the new Virtualization: enabling separation of ...
Containerization is more than the new Virtualization: enabling separation of ...Jérôme Petazzoni
 
IRQs: the Hard, the Soft, the Threaded and the Preemptible
IRQs: the Hard, the Soft, the Threaded and the PreemptibleIRQs: the Hard, the Soft, the Threaded and the Preemptible
IRQs: the Hard, the Soft, the Threaded and the PreemptibleAlison Chaiken
 
DCEU 18: Tips and Tricks of the Docker Captains
DCEU 18: Tips and Tricks of the Docker CaptainsDCEU 18: Tips and Tricks of the Docker Captains
DCEU 18: Tips and Tricks of the Docker CaptainsDocker, Inc.
 
Linuxdd[1]
Linuxdd[1]Linuxdd[1]
Linuxdd[1]mcganesh
 
Linux Porting
Linux PortingLinux Porting
Linux PortingChamp Yen
 

Semelhante a Panic Attack – a discussion about kdump, panic notifiers, graphics on crash event and all of that (20)

When kdump is way too much
When kdump is way too muchWhen kdump is way too much
When kdump is way too much
 
Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant Automate drupal deployments with linux containers, docker and vagrant
Automate drupal deployments with linux containers, docker and vagrant
 
Docker Container: isolation and security
Docker Container: isolation and securityDocker Container: isolation and security
Docker Container: isolation and security
 
Logging kernel oops and panic
Logging kernel oops and panicLogging kernel oops and panic
Logging kernel oops and panic
 
Docker Security Paradigm
Docker Security ParadigmDocker Security Paradigm
Docker Security Paradigm
 
[Defcon] Hardware backdooring is practical
[Defcon] Hardware backdooring is practical[Defcon] Hardware backdooring is practical
[Defcon] Hardware backdooring is practical
 
Drupalcamp es 2013 drupal with lxc docker and vagrant
Drupalcamp es 2013  drupal with lxc docker and vagrant Drupalcamp es 2013  drupal with lxc docker and vagrant
Drupalcamp es 2013 drupal with lxc docker and vagrant
 
Linux Crash Dump Capture and Analysis
Linux Crash Dump Capture and AnalysisLinux Crash Dump Capture and Analysis
Linux Crash Dump Capture and Analysis
 
Linux Kernel Tour
Linux Kernel TourLinux Kernel Tour
Linux Kernel Tour
 
Linux kernel driver tutorial vorlesung
Linux kernel driver tutorial vorlesungLinux kernel driver tutorial vorlesung
Linux kernel driver tutorial vorlesung
 
“Linux Kernel CPU Hotplug in the Multicore System”
“Linux Kernel CPU Hotplug in the Multicore System”“Linux Kernel CPU Hotplug in the Multicore System”
“Linux Kernel CPU Hotplug in the Multicore System”
 
Development platform virtualization using qemu
Development platform virtualization using qemuDevelopment platform virtualization using qemu
Development platform virtualization using qemu
 
Kubernetes laravel and kubernetes
Kubernetes   laravel and kubernetesKubernetes   laravel and kubernetes
Kubernetes laravel and kubernetes
 
Anatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxCon
Anatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxConAnatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxCon
Anatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxCon
 
Containerization is more than the new Virtualization: enabling separation of ...
Containerization is more than the new Virtualization: enabling separation of ...Containerization is more than the new Virtualization: enabling separation of ...
Containerization is more than the new Virtualization: enabling separation of ...
 
IRQs: the Hard, the Soft, the Threaded and the Preemptible
IRQs: the Hard, the Soft, the Threaded and the PreemptibleIRQs: the Hard, the Soft, the Threaded and the Preemptible
IRQs: the Hard, the Soft, the Threaded and the Preemptible
 
DCEU 18: Tips and Tricks of the Docker Captains
DCEU 18: Tips and Tricks of the Docker CaptainsDCEU 18: Tips and Tricks of the Docker Captains
DCEU 18: Tips and Tricks of the Docker Captains
 
Linuxdd[1]
Linuxdd[1]Linuxdd[1]
Linuxdd[1]
 
netLec5.pdf
netLec5.pdfnetLec5.pdf
netLec5.pdf
 
Linux Porting
Linux PortingLinux Porting
Linux Porting
 

Mais de Igalia

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Building End-user Applications on Embedded Devices with WPE
Building End-user Applications on Embedded Devices with WPEBuilding End-user Applications on Embedded Devices with WPE
Building End-user Applications on Embedded Devices with WPEIgalia
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Automated Testing for Web-based Systems on Embedded Devices
Automated Testing for Web-based Systems on Embedded DevicesAutomated Testing for Web-based Systems on Embedded Devices
Automated Testing for Web-based Systems on Embedded DevicesIgalia
 
Embedding WPE WebKit - from Bring-up to Maintenance
Embedding WPE WebKit - from Bring-up to MaintenanceEmbedding WPE WebKit - from Bring-up to Maintenance
Embedding WPE WebKit - from Bring-up to MaintenanceIgalia
 
Optimizing Scheduler for Linux Gaming.pdf
Optimizing Scheduler for Linux Gaming.pdfOptimizing Scheduler for Linux Gaming.pdf
Optimizing Scheduler for Linux Gaming.pdfIgalia
 
Running JS via WASM faster with JIT
Running JS via WASM      faster with JITRunning JS via WASM      faster with JIT
Running JS via WASM faster with JITIgalia
 
Implementing a Vulkan Video Encoder From Mesa to GStreamer
Implementing a Vulkan Video Encoder From Mesa to GStreamerImplementing a Vulkan Video Encoder From Mesa to GStreamer
Implementing a Vulkan Video Encoder From Mesa to GStreamerIgalia
 
8 Years of Open Drivers, including the State of Vulkan in Mesa
8 Years of Open Drivers, including the State of Vulkan in Mesa8 Years of Open Drivers, including the State of Vulkan in Mesa
8 Years of Open Drivers, including the State of Vulkan in MesaIgalia
 
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por IgaliaIntroducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por IgaliaIgalia
 
2023 in Chimera Linux
2023 in Chimera                    Linux2023 in Chimera                    Linux
2023 in Chimera LinuxIgalia
 
Building a Linux distro with LLVM
Building a Linux distro        with LLVMBuilding a Linux distro        with LLVM
Building a Linux distro with LLVMIgalia
 
turnip: Update on Open Source Vulkan Driver for Adreno GPUs
turnip: Update on Open Source Vulkan Driver for Adreno GPUsturnip: Update on Open Source Vulkan Driver for Adreno GPUs
turnip: Update on Open Source Vulkan Driver for Adreno GPUsIgalia
 
Graphics stack updates for Raspberry Pi devices
Graphics stack updates for Raspberry Pi devicesGraphics stack updates for Raspberry Pi devices
Graphics stack updates for Raspberry Pi devicesIgalia
 
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOSDelegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOSIgalia
 
MessageFormat: The future of i18n on the web
MessageFormat: The future of i18n on the webMessageFormat: The future of i18n on the web
MessageFormat: The future of i18n on the webIgalia
 
Replacing the geometry pipeline with mesh shaders
Replacing the geometry pipeline with mesh shadersReplacing the geometry pipeline with mesh shaders
Replacing the geometry pipeline with mesh shadersIgalia
 
I'm not an AMD expert, but...
I'm not an AMD expert, but...I'm not an AMD expert, but...
I'm not an AMD expert, but...Igalia
 
Status of Vulkan on Raspberry
Status of Vulkan on RaspberryStatus of Vulkan on Raspberry
Status of Vulkan on RaspberryIgalia
 
Enable hardware acceleration for GL applications without glamor on Xorg modes...
Enable hardware acceleration for GL applications without glamor on Xorg modes...Enable hardware acceleration for GL applications without glamor on Xorg modes...
Enable hardware acceleration for GL applications without glamor on Xorg modes...Igalia
 

Mais de Igalia (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Building End-user Applications on Embedded Devices with WPE
Building End-user Applications on Embedded Devices with WPEBuilding End-user Applications on Embedded Devices with WPE
Building End-user Applications on Embedded Devices with WPE
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Automated Testing for Web-based Systems on Embedded Devices
Automated Testing for Web-based Systems on Embedded DevicesAutomated Testing for Web-based Systems on Embedded Devices
Automated Testing for Web-based Systems on Embedded Devices
 
Embedding WPE WebKit - from Bring-up to Maintenance
Embedding WPE WebKit - from Bring-up to MaintenanceEmbedding WPE WebKit - from Bring-up to Maintenance
Embedding WPE WebKit - from Bring-up to Maintenance
 
Optimizing Scheduler for Linux Gaming.pdf
Optimizing Scheduler for Linux Gaming.pdfOptimizing Scheduler for Linux Gaming.pdf
Optimizing Scheduler for Linux Gaming.pdf
 
Running JS via WASM faster with JIT
Running JS via WASM      faster with JITRunning JS via WASM      faster with JIT
Running JS via WASM faster with JIT
 
Implementing a Vulkan Video Encoder From Mesa to GStreamer
Implementing a Vulkan Video Encoder From Mesa to GStreamerImplementing a Vulkan Video Encoder From Mesa to GStreamer
Implementing a Vulkan Video Encoder From Mesa to GStreamer
 
8 Years of Open Drivers, including the State of Vulkan in Mesa
8 Years of Open Drivers, including the State of Vulkan in Mesa8 Years of Open Drivers, including the State of Vulkan in Mesa
8 Years of Open Drivers, including the State of Vulkan in Mesa
 
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por IgaliaIntroducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
 
2023 in Chimera Linux
2023 in Chimera                    Linux2023 in Chimera                    Linux
2023 in Chimera Linux
 
Building a Linux distro with LLVM
Building a Linux distro        with LLVMBuilding a Linux distro        with LLVM
Building a Linux distro with LLVM
 
turnip: Update on Open Source Vulkan Driver for Adreno GPUs
turnip: Update on Open Source Vulkan Driver for Adreno GPUsturnip: Update on Open Source Vulkan Driver for Adreno GPUs
turnip: Update on Open Source Vulkan Driver for Adreno GPUs
 
Graphics stack updates for Raspberry Pi devices
Graphics stack updates for Raspberry Pi devicesGraphics stack updates for Raspberry Pi devices
Graphics stack updates for Raspberry Pi devices
 
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOSDelegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
 
MessageFormat: The future of i18n on the web
MessageFormat: The future of i18n on the webMessageFormat: The future of i18n on the web
MessageFormat: The future of i18n on the web
 
Replacing the geometry pipeline with mesh shaders
Replacing the geometry pipeline with mesh shadersReplacing the geometry pipeline with mesh shaders
Replacing the geometry pipeline with mesh shaders
 
I'm not an AMD expert, but...
I'm not an AMD expert, but...I'm not an AMD expert, but...
I'm not an AMD expert, but...
 
Status of Vulkan on Raspberry
Status of Vulkan on RaspberryStatus of Vulkan on Raspberry
Status of Vulkan on Raspberry
 
Enable hardware acceleration for GL applications without glamor on Xorg modes...
Enable hardware acceleration for GL applications without glamor on Xorg modes...Enable hardware acceleration for GL applications without glamor on Xorg modes...
Enable hardware acceleration for GL applications without glamor on Xorg modes...
 

Último

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 

Último (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 

Panic Attack – a discussion about kdump, panic notifiers, graphics on crash event and all of that

  • 1. Panic Attack Finding some order in the panic chaos Guilherme G. Piccoli (Igalia) 2023-09-26 / Kernel Recipes 1
  • 2. Context Interest in having a panic log collecting tool on Arch / SteamOS Analysis of kernel infra available - different use cases: kdump == more data collected, heavier on resources pstore == log collected on panic -> lightweight, but less data By playing with kdump/pstore, crossed paths with panic notifiers Panic path is full of trade-offs / conflicting goals Panic notifiers discussions, ideas and eventual refactor Some other orthogonal problems on panic time Interrupt storms / Graphics on panic 2
  • 3. Disclaimer Feel free to interrupt with questions Multiple concepts / dense topic Risks of "assumed knowledge" kexec set of recent problems won't be addressed here Memory preserving across kexec boots SEV / TDX problems with kexec Unikernels support, etc. 3
  • 4. Outline The genesis of this work: SteamOS Panic notifiers: discussion and refactor Chaos on kdump: a real case of interrupt storm Challenges of GFX on panic: dream or reality? 4
  • 5. Where all started: Steam Deck Steam Deck, from Valve CPU/APU AMD Zen 2 (custom), 4-cores/8-threads 16 GB of RAM / 7" display 3 models of NVMe storage (64G, 256G, 512G) 5
  • 6. Deck's distro: SteamOS 3 Arch Linux based distro with gamescope (games) and KDE Plasma (desktop) Sophisticated stack for games: Steam, Proton (Wine), DXVK, VKD3D, etc Arch Linux has no kdump official tool Steam Deck community would benefit of such tool for panic log collection! 6
  • 7. Requirements: what logs to collect? Collect the most logs we can: dmesg (call trace), tasks' state, memory info Though being careful with size - should be easy to share Information that could be used for kernel/HW debugging Rely on in-kernel infrastructure for that - don't reinvent the wheel 7
  • 8. How to collect such logs? Kernel infra kdump: kexec-loaded crash kernel kexec to a new kernel to collect info from the broken kernel Requires pre-reserved memory (>200MB usually) Collects a vmcore (full memory image) of the crashed kernel Lots of information, but heavy / hard for users to share it pstore: persistent storage log saving Save dmesg during panic time to some backend Multiple backends (RAM, UEFI, ACPI, etc) Also multiple frontends (oops, ftrace, console, etc) Enough amount of information? (dmesg only) Both tools benefits from userspace counter-part Kdump tooling common (Debian/Fedora), but not Arch 8
  • 9. Presenting kdumpst is an Arch Linux kdump and pstore tool Available on , supports GRUB and initcpio / dracut Defaults to pstore; currently only ramoops backend (UEFI plans) Used by default on Steam Deck, submits logs to Valve But how to improve the amount of logs on dmesg? panic_print FTW! kdumpst AUR 9
  • 10. panic_print VS pstore ordering panic_print parameter allows to show more data on dmesg during panic Tasks info, system memory state, timers But such function runs after pstore! So can't collect the data. Idea: re-order the code Move the call earlier in the panic path [discussion] 10
  • 11. Panic (over-simplified) code path local IRQ and preempt disable dump stack kdump? crash_kexec() disable the other CPUs panic notifiers and kmsg_dump() arch code / reboot arch code / kexec YES NO 11
  • 12. The code re-ordering /* Simplified function names */ void panic() [...] <---------------------| if (!panic_notifiers) | crash_kexec(); /* kdump */ | | panic_notifiers(); | | kmsg_dump(KMSG_DUMP_PANIC); /* pstore! */ | | if (panic_notifiers) | crash_kexec(); | | panic_print(); ---------------------------| [...] 12
  • 13. And then, the discussion starts... Problems with such approach: panic_print before kdump is risky with Baoquan and others Alternative: propose less invasive change, moving that before pstore only New problem then: what if users want panic_print before kdump? Makes sense if vmcore is too much Only possible if we run the panic notifiers before kdump! So the notifiers journey begins... [discussion] [discussion] 13
  • 14. Outline The genesis of this work: SteamOS Panic notifiers: discussion and refactor Chaos on kdump: a real case of interrupt storm Challenges of GFX on panic: dream or reality? 14
  • 15. Notifier call chains List of callbacks to be executed (usually) in any order There's a (frequently unused) "priority" tune for call ordering Multiple types - atomic callbacks, blocking callbacks, etc Panic notifiers == list of atomic callbacks executed on panic /* Example from kernel/rcu/tree_stall.h */ /* Don't print RCU CPU stall warnings during a kernel panic. */ static int rcu_panic(...) { rcu_cpu_stall_suppress = 1; return NOTIFY_DONE; } static struct notifier_block rcu_panic_block = { .notifier_call = rcu_panic, }; atomic_notifier_chain_register(&panic_notifier_list, &rcu_panic_block); 15
  • 16. Deep dive into panic notifiers Any driver (even OOT) can register a notifier, to do...anything! Risky for kdump reliability / but sometimes, notifiers could be necessary "Solution": a new kernel parameter, crash_kexec_post_notifiers Proper name for a bazooka shot: all-or-nothing option, runs ALL notifiers before kdump Middle-ground idea: panic notifiers filter! User selects which notifiers to run kdump maintainers kinda welcome the feature: But really paper over a real issue: notifiers is a no man's land Very good from Petr Mladek exposed the need of a refactor [discussion] analysis 16
  • 17. Mladek's refactor proposal Split panic notifiers in more lists, according to their "goals" Information ones: extra info dump, stop watchdogs Hypervisor/FW poking notifiers Others: actions taken when kdump isn't set (LED blink, halt) Ordering regarding kdump Hypervisor list before kdump Info list also before, IF any kmsg_dump() is set Final list runs only if kdump isn't set V1 submitted ~1y ago Special thanks to Petr Mladek for the idea and all reviews. Thanks also Baoquan and Michael Kelly (Hyper-V) for the great discussions! 17
  • 18. 1st step - fixing current panic notifiers First thing: build a list with all existing in-tree panic notifiers As of today (6.6-rc2): 47 notifiers (18 on arch/) Fix / improve them, before splitting in lists. Some patterns: Decouple multi-purpose notifiers Change ordering through the notifier's priorities Machine halt or firmware-reset - put'em to run last Disabling watchdogs (RCU, hung tasks): run ASAP Avoid regular locks Panic path disables secondary CPUs, interrupts, preemption mutex_trylock() and spin_trylock() FTW 18
  • 19. Real example: pvpanic /* drivers/misc/pvpanic/pvpanic.c - simplified code */ static void pvpanic_send_event() { - spin_lock(&pvpanic_lock); + if (!spin_trylock(&pvpanic_lock)) + return; static int panic_panic_notify(...) { pvpanic_send_event(PVPANIC_PANICKED); } [...] + /* Call our notifier very early on panic */ static struct notifier_block pvpanic_panic_nb = { .notifier_call = pvpanic_panic_notify, - .priority = 1, + .priority = INT_MAX, }; 19
  • 20. List splitting (yay, a 4th list!) Original plan was splitting in 3 lists, but... ended-up with 4 list: hypervisor/FW notification, LED blinking Hyper-V, PPC/fadump, pvpanic, LEDs stuff, etc list: dump extra info, disable watchdogs KASLR offsets, RCU/hung task watchdog off, ftrace_dump_on_oops : includes the remaining ones (halt, risky funcs) S390 and PPC/pseries FW halt, IPMI interfaces notification : contains previously hardcoded (arch) final calls SPARC "stop" button enabling (if reboot on panic not set) List to be renamed on V2 (loop list) Hypervisors Informational Pre-reboot Post-reboot 20
  • 21. The notifier "levels" model One of the biggest questions regarding panic notifiers Which ones should run before kdump? Usual / possible answer: low risk / necessary ones Introduce the concept of Fine-grained tuning of which lists run before/after kdump Defaults to: Hypervisor always run before Sometimes informational also (if kmsg_dump() is set) Implementation maps levels into bits and order the lists Was gently called "black magic" on review panic notifier levels 21
  • 22. Subsequent improvements Proposal to convert panic_print into a panic notifier Good acceptance, fits perfectly the informational list Stop exporting crash_kexec_post_notifiers Sadly some users of panic notifiers forcibly set this parameter in code Hyper-V is one of such users, and they have reasons for that... 22
  • 23. Hyper-V case / arm64 custom crash handler Hyper-V requires hypervisor action in case of kdump Requires to unload its "vmbus connection" before crash kernel takes over x86 does it on crash through machine_ops() crash shutdown arm64 though doesn't have similar architecture hook with arm64 maintainers revealed little interest in adding that Unworthy complexity / not a good idea to mimic x86 case Forcing panic notifiers seems a last resort for Hyper-V Unless some alternative for arm64 is implemented Discussion 23
  • 24. Pros / Cons and follow-up discussion Exhaustive exposed plenty conflicting views First of all, not really clear what should run before kdump The notifiers lists are incredibly flexible and "loose" How to be sure anyone knowledgeable on panic will review? Brainstorm: somehow force registering users to add the cb name to a central place? Less is more: too much flexibility is not a good fit for panic Also, are notifier lists reliable on panic path? What if memory corruption corrupts the list? Alternatives? Hardcoded calls? (headers/exports hell) discussion 24
  • 25. Next steps / V2 Rework lists as suggested (move some callbacks here and there) Split submission - first the lists, then the refactor (kdump vs notifiers order) Consider ways of improving panic notifiers review Improve documentation Central place for registering !? 25
  • 26. Outline The genesis of this work: SteamOS Panic notifiers: discussion and refactor Chaos on kdump: a real case of interrupt storm Challenges of GFX on panic: dream or reality? 26
  • 27. Shifting gears: an interrupt storm tale Another painful area to deal with is device state on kdump A regular kexec would handle device's quiesce process .shutdown() callback Crash kexec (kdump) can't risk that -> way more limited environment Real case: device caused an interrupt storm, kdump couldn't boot 27
  • 28. The problem Intel NIC running under PCI-PT (SR-IOV) No in-tree driver - DPDK instead Custom tool collecting NIC stats, triggered weird NIC FW bug Symptom: lockups on host, non-responsive system (Non-trivial) cause: NIC interrupt storm Kdump attempt: unsuccessful -> crash kernel hung on boot Guess what? Still the interrupt storm! 28
  • 29. Look 'ma, no PCI reset Despite kexec is a new boot, there are many differences from FW boot A fundamental limitation is the lack of PCI controller reset x86 has no "protocol" / standard for root complexes resets PPC64 has a FW-aided PCI reset (ppc_pci_reset_phbs) Multiple debug attempts later...an idea: clear devices's MSIs on boot But how to achieve this? PCI layer is initialized much later x86 early PCI infrastructure FTW! (Special thanks to Gavin Shan) 29
  • 30. pci=clearmsi proposal Through the early PCI trick, we could clear the MSIs of all PCI devs Interrupt storm was shut-off and kdump boot succeeded to linux-pci (~3y ago) Some concerns from Bjorn (PCI maintainer) First: limited approach -> pci_config_16() This conf mode access is limited to first domain/segment Other concern: solution only for x86 In principle, this affects more archs Patches 30
  • 31. Discussion Also, was not really clear exactly what was the precise point of failure Thanks Thomas Gleixner for that Interrupt flood happens right when interrupts are enabled on start_kernel() MSIs are DMA writes to a memory area (interrupt remapping tables) An IOMMU approach was suggested Clearing these mappings and IOMMU error reporting early in boot Proper cleaning routines to run on panic kernel also suggested clarifying 31
  • 32. Potential next steps Attempt implementing the IOMMU idea Too limited? What if no IOMMU? Investigate other archs to see how's the status Reliably reproduce the problem! Extend early PCI conf access mode? Bjorn would be unhappy 32
  • 33. Outline The genesis of this work: SteamOS Panic notifiers: discussion and refactor Chaos on kdump: a real case of interrupt storm Challenges of GFX on panic: dream or reality? 33
  • 34. Final problem: GFX on panic GPUs are complex beasts / interrupts are disabled on panic Even regular kexec are challenging for them! Currently, no reliable way to dump data on display during panic Though it would be great for users to see something on crash Reliable GFX on kdump? Wishful thinking 34
  • 35. Framebuffer reuse While working on kdumpst, experimenting with GFX on kdump Managed to make it work only with framebuffers Why not restore the FB on kdump then? Interesting shows it's definitely not trivial Once a GPU driver takes over, HW is reprogrammed GOP driver (UEFI) programs FB/display We'd need to reprogram the device either on panic (ugh) or on kdump kernel discussion 35
  • 36. Current approaches Noralf Trønnes (~4y ago) Iterates on available framebuffers, find a suitable one Jocelyn Falempe (last week) Works with simpledrm currently, API to get a scanout buffer Seems on early stages, with great potential / community acceptance Panic time approaches are risky / limited, must be simple Not sure if that's possible one day for amdgpu / i915 proposal proposal 36
  • 37. Different approach: FW notification What if we print nothing on panic, but defer for FW / next kernel? UEFI panic notification (~1y ago) Simple UEFI variable set on kernel panic (through notifiers!) Next kernel clears the var (and potentially prints something) Simple and flexible - FW could plot a different logo UEFI maintainer (Ard) not really convinced Suggestions for using UEFI pstore for tracking that Orthogonal goals / limited space on UEFI / dmesg "privacy" Next steps: might try to implement that solution in a prototype proposal 37
  • 38. Conclusion Quite a long path, from Linux gaming to panic notifiers refactor Everything on panic is polemic / conflicting "Slightly" long road ahead for the refactor V2 of the refactor soon(tm), not so invasive HW quiesce on crash kexec is still full of issues Interesting area for some research / multi-arch work (IMHO) GFX on panic: still in early stages, other OSes / game consoles seems to have it The UEFI approach, while kinda orthogonal, it's way simpler 38
  • 39. THANKS Feel free to reach me on IRC (gpiccoli - OFTC/Libera) 39