Intel Processor Trace (Intel PT) is a hardware feature that records information about software execution with minimal impact on system execution. Existing hardware makes it hard to enable Intel PT in a guest because implementing a shadow ToPA is very complex. The Intel PT VMX improvements treat PT output addresses as Guest Physical Addresses (GPAs) and translate them using EPT, which simplifies Intel PT virtualization for use by guest software. We have submitted a patch set that enables Intel PT in Xen HVM guests, for uses such as collecting hardware behavior and reverse debugging in GDB. We also plan to implement a system mode for tracing the behavior of the Xen hypervisor and its guests if necessary.
3. 3
Intel Processor Trace Overview
• Hardware feature that records software execution information
‒ Uses dedicated hardware to capture information about software execution into a physical memory buffer.
• Decoder can determine exact flow of software execution
‒ Software can process the captured trace data and reconstruct the
exact program execution flow.
• Low overhead on system execution
‒ Depends on usage scenario.
4. 4
Intel Processor Trace Data Packets
• Control flow packets (TNT, TIP, FUP, MODE packet)
‒ Branch, interrupt, exception, mode change, VM-exit/entry…
• Paging information (PIP)
‒ MOV CR3, Task switch, INIT…
• Timing packets (TSC, MTC, CYC)
‒ Periodic, based on the core crystal clock and software configuration
• VMCS packet (VMCS)
‒ VMPTRLD, VM-exit, VM-entry
• Overflow packet (OVF)
‒ Sent when the Intel PT buffer overflows and packets were likely lost.
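To make the packet families above more concrete, here is a minimal sketch of how a decoder might recognize a packet from its header byte(s). The opcode values are the ones I understand the Intel SDM to define; the function name `classify_packet` and the returned labels are illustrative assumptions, and the sketch neither computes packet lengths nor covers every packet type.

```python
# Partial header classification for Intel PT packets (illustrative sketch).
# Opcode values are taken from the Intel SDM packet definitions as I
# understand them; verify against the SDM before relying on them.

# Second byte of two-byte packets whose first byte is 0x02.
EXT_OPCODES = {
    0x43: "PIP",
    0xC8: "VMCS",
    0xF3: "OVF",
    0x82: "PSB",
    0x23: "PSBEND",
    0x03: "CBR",
}

# Packets identified by a fixed first byte.
FIXED_OPCODES = {0x00: "PAD", 0x19: "TSC", 0x59: "MTC", 0x99: "MODE"}

# Packets identified by the low 5 bits (the top 3 bits hold IPBytes).
IP_OPCODES = {0x0D: "TIP", 0x11: "TIP.PGE", 0x01: "TIP.PGD", 0x1D: "FUP"}


def classify_packet(buf, pos=0):
    """Return a label for the packet whose header starts at buf[pos]."""
    b0 = buf[pos]
    if b0 == 0x02:  # extended opcode prefix
        if pos + 1 >= len(buf):
            return "truncated"
        return EXT_OPCODES.get(buf[pos + 1], "unknown-ext")
    if b0 in FIXED_OPCODES:
        return FIXED_OPCODES[b0]
    if (b0 & 0x1F) in IP_OPCODES:
        return IP_OPCODES[b0 & 0x1F]
    if (b0 & 0x03) == 0x03:  # CYC packets have both low bits set
        return "CYC"
    if (b0 & 0x01) == 0:  # short TNT: bit 0 clear (PAD and 0x02 handled above)
        return "TNT"
    return "unknown"
```

For example, `classify_packet(bytes([0x02, 0xF3]))` yields the OVF label and `classify_packet(bytes([0x02, 0xC8]))` the VMCS label. A real decoder (such as libipt) also has to synchronize on PSB and consume each packet's payload.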
5. 5
Usage Model of Intel Processor Trace
• Trigger save of Intel Processor Trace log for post-mortem analysis
‒ Save on crash, core dump, software event(s), …
• Debug short-lived, non-steady-state performance issues
‒ Hard to catch with sampling, but PT captures everything with precise
timing info
• Replace call-stack info with full, timed control flow trace
‒ Provides path history even when stack is corrupted
• Diagnostic code coverage
‒ Records the footprint of program execution
Records everything during program execution
6. 6
Hardware Enhancement for Virtualization (1/2)
• Intel PT can be enabled in VMX operation
• New guest state field for IA32_RTIT_CTL
‒ Speed up and simplify the process of disabling trace on VM exit and
restoring it on VM entry.
• Using EPT to redirect PT output
‒ The CPU treats PT output addresses as Guest Physical Addresses (GPAs) and translates them using EPT.
• New VM-exit qualification for Intel PT output.
MSR IA32_VMX_MISC (bits 31:0): bit 14 = Intel PT can be enabled in VMX operation
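A hypervisor would check this capability before exposing Intel PT to a guest. The sketch below only tests the bit in a value that has already been read; reading the MSR itself needs ring 0 and is outside the sketch. The MSR index 0x485 and the helper name `pt_usable_in_vmx` are stated from my reading of the SDM and for illustration respectively.

```python
# Check whether Intel PT may be used in VMX operation (illustrative sketch).
IA32_VMX_MISC = 0x485           # MSR index, per the Intel SDM
VMX_MISC_INTEL_PT = 1 << 14     # bit 14: Intel PT can be enabled in VMX


def pt_usable_in_vmx(vmx_misc_value):
    """Return True if bit 14 of an IA32_VMX_MISC value is set."""
    return bool(vmx_misc_value & VMX_MISC_INTEL_PT)
```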
7. 7
Hardware Enhancement for Virtualization (2/2)
• VMX control for Intel PT
‒ New bits are added to the VM-execution, VM-exit and VM-entry control fields
‒ Produce VMCS packet
‒ Mark non-root mode in packet (e.g. PIP)
‒ Control register auto save/restore on VM-exit/entry
‒ Output buffer addresses are treated as GPAs and translated with EPT
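The controls listed above can be sketched as bit masks applied to the three VMCS control fields. The constant names follow the Linux kernel's VMX header and the bit positions are as I understand them from the SDM; the helper `pt_guest_mode_controls` and its choice of bits for a guest-only trace are assumptions for illustration, not the submitted Xen patch.

```python
# Intel PT related VMX control bits (illustrative sketch; verify the bit
# positions against the Intel SDM before use).

# Secondary processor-based VM-execution controls
SECONDARY_EXEC_PT_CONCEAL_VMX = 1 << 19   # suppress VMCS packets / non-root marking
SECONDARY_EXEC_PT_USE_GPA = 1 << 24       # treat PT output addresses as GPAs

# VM-exit controls
VM_EXIT_PT_CONCEAL_PIP = 1 << 24          # no PIP generated on VM exit
VM_EXIT_CLEAR_IA32_RTIT_CTL = 1 << 25     # save and clear IA32_RTIT_CTL on exit

# VM-entry controls
VM_ENTRY_PT_CONCEAL_PIP = 1 << 17         # no PIP generated on VM entry
VM_ENTRY_LOAD_IA32_RTIT_CTL = 1 << 18     # load IA32_RTIT_CTL from guest state


def pt_guest_mode_controls(secondary_exec, vm_exit, vm_entry):
    """Return the three control words with the PT bits set for a guest-only trace.

    Existing bits in each word are preserved.
    """
    secondary_exec |= SECONDARY_EXEC_PT_USE_GPA | SECONDARY_EXEC_PT_CONCEAL_VMX
    vm_exit |= VM_EXIT_PT_CONCEAL_PIP | VM_EXIT_CLEAR_IA32_RTIT_CTL
    vm_entry |= VM_ENTRY_PT_CONCEAL_PIP | VM_ENTRY_LOAD_IA32_RTIT_CTL
    return secondary_exec, vm_exit, vm_entry
```

Starting from all-zero control words, the helper returns `(0x01080000, 0x03000000, 0x00060000)`.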
8. 8
Intel Processor Trace working mode (1/2)
• Guest mode
‒ Expose the Intel PT feature to the guest
‒ Only trace the guest and output to a guest buffer
‒ Dom0 can’t detect Intel PT
[Figure: guest mode. Components shown: host hardware with Intel PT; Dom0, VM1 and VM2; a PT driver with perf on top (shown twice).]
9. 9
Intel Processor Trace working mode (2/2)
• System mode
‒ Trace both the VMM and guests, output to a VMM buffer
‒ Do not expose the Intel PT feature to the guest
‒ Add a PV interface in the PT driver to set the PT buffer for the VMM
[Figure: system mode. Components shown: host hardware with Intel PT; Dom0 with perf and the PT driver; VM1 and VM2; PV logic.]
10. 10
Enabling for Guest Mode (1/4)
• Enumerate PT and enable sub-features via CPUID
‒ Leaf 07H – Intel Processor Trace enumeration
‒ Leaf 14H – Intel Processor Trace sub-feature
• VMCS configuration
‒ Address of PT output can be translated by EPT
‒ Auto save/restore IA32_RTIT_CTL value to/from guest state area
‒ No need to include VMCS packets or the non-root bit in trace packets
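The CPUID enumeration step can be sketched as plain bit decoding. The functions below take register values that were obtained elsewhere (for example from the CPUID instruction or from the policy a hypervisor presents to the guest) and report what they advertise. Bit positions follow the SDM as I understand it; the feature names and function names are illustrative assumptions.

```python
# Decode Intel PT enumeration from CPUID register values (illustrative sketch).

CPUID_07_EBX_INTEL_PT = 1 << 25   # CPUID.(EAX=07H,ECX=0):EBX[25]

# CPUID.(EAX=14H,ECX=0) sub-feature bits; names are hypothetical labels.
LEAF14_EBX_BITS = {
    0: "cr3_filtering",
    1: "psb_cyc",
    2: "ip_filtering",
    3: "mtc",
    4: "ptwrite",
    5: "power_event_trace",
}
LEAF14_ECX_BITS = {
    0: "topa_output",
    1: "topa_multiple_entries",
    2: "single_range_output",
    3: "trace_transport",
    31: "lip",
}


def pt_supported(leaf7_ebx):
    """Return True if leaf 07H reports Intel PT."""
    return bool(leaf7_ebx & CPUID_07_EBX_INTEL_PT)


def decode_leaf14(ebx, ecx):
    """Return the set of sub-feature names advertised by leaf 14H, sub-leaf 0."""
    features = {name for bit, name in LEAF14_EBX_BITS.items() if ebx & (1 << bit)}
    features |= {name for bit, name in LEAF14_ECX_BITS.items() if ecx & (1 << bit)}
    return features
```

A hypervisor could use the resulting set both to build the guest's CPUID view and to decide which IA32_RTIT_CTL bits the guest may set.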
11. 11
Enabling for Guest Mode (2/4)
• MSR read/write emulation
‒ Inject a #GP when the guest tries to read or write unsupported bits or MSRs.
• Context switch (manual)
‒ VM entry (Intel PT is supported in guest)
‒ Initialize the guest state of IA32_RTIT_CTL
‒ Restore other Intel PT MSRs
‒ VM exit (Intel PT is supported in guest)
‒ Save Intel PT MSRs except IA32_RTIT_CTL (auto saved in guest
state before VM exit)
• Disable Intel PT VMX in nested VMs
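The MSR emulation and manual context-switch steps above can be sketched as follows. This is a simplified model, not the Xen patch: the set of always-writable IA32_RTIT_CTL bits, the feature names, the `GuestGeneralProtection` exception standing in for #GP injection, and the `rdmsr` callable are all assumptions. Multi-bit fields (MTCFreq, CycThresh, PSBFreq) and the address-range MSRs are left out for brevity.

```python
# Simplified model of IA32_RTIT_CTL write emulation and the manual MSR save
# on VM exit (illustrative sketch under the assumptions stated above).

MSR_IA32_RTIT_OUTPUT_BASE = 0x560
MSR_IA32_RTIT_OUTPUT_MASK_PTRS = 0x561
MSR_IA32_RTIT_CTL = 0x570
MSR_IA32_RTIT_STATUS = 0x571
MSR_IA32_RTIT_CR3_MATCH = 0x572

# IA32_RTIT_CTL bit positions, per the Intel SDM.
TRACEEN, CYCEN, OS, USER = 1 << 0, 1 << 1, 1 << 2, 1 << 3
CR3FILTER, TOPA, MTCEN = 1 << 7, 1 << 8, 1 << 9
TSCEN, DISRETC, BRANCHEN = 1 << 10, 1 << 11, 1 << 13

# Assumption: these bits are writable whenever Intel PT is exposed at all.
ALWAYS_ALLOWED = TRACEEN | OS | USER | TSCEN | DISRETC | BRANCHEN

# Bits that depend on a CPUID leaf 14H sub-feature (hypothetical names).
OPTIONAL_BITS = {
    "cr3_filtering": CR3FILTER,
    "psb_cyc": CYCEN,
    "mtc": MTCEN,
    "topa_output": TOPA,
}

# IA32_RTIT_CTL is absent: hardware saves it into the guest-state area.
MANUALLY_SWITCHED_MSRS = (
    MSR_IA32_RTIT_OUTPUT_BASE,
    MSR_IA32_RTIT_OUTPUT_MASK_PTRS,
    MSR_IA32_RTIT_STATUS,
    MSR_IA32_RTIT_CR3_MATCH,
)


class GuestGeneralProtection(Exception):
    """Stands in for injecting #GP into the guest."""


def allowed_rtit_ctl_bits(features):
    """Return the mask of IA32_RTIT_CTL bits the guest may set."""
    mask = ALWAYS_ALLOWED
    for name, bit in OPTIONAL_BITS.items():
        if name in features:
            mask |= bit
    return mask


def emulate_rtit_ctl_write(value, features):
    """Accept a guest write to IA32_RTIT_CTL or raise the #GP stand-in."""
    if value & ~allowed_rtit_ctl_bits(features):
        raise GuestGeneralProtection("unsupported IA32_RTIT_CTL bits")
    return value


def save_guest_pt_msrs(rdmsr):
    """On VM exit, save every PT MSR that hardware does not save itself."""
    return {msr: rdmsr(msr) for msr in MANUALLY_SWITCHED_MSRS}
```

With an empty feature set, writing `TRACEEN | BRANCHEN` is accepted while writing `TOPA` raises the #GP stand-in.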
12. 12
Enabling for Guest Mode (3/4)
• Inject a PMI (performance monitoring interrupt) into the Xen guest
‒ The processor will signal a PMI when the corresponding trace output
region is filled
• Disable Intel PT VMX in nested VMs
‒ e.g. L1 guest in “SYSTEM” mode when EPT on EPT
‒ L0 hypervisor working in Guest mode (EPT table GPA->HPA)
‒ L1 hypervisor working in System mode (EPT table nGPA->HPA)
‒ There is no EPT mapping from nGPA to HPA for the PT output address in the L2 guest
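For the PMI bullet on this slide: with ToPA output, the processor signals a PMI when it fills an output region whose table entry has the INT bit set. The sketch below decodes one ToPA entry using the field layout as I understand it from the SDM; the function name, the returned dictionary keys, and the 52-bit physical address limit are assumptions for illustration.

```python
# Decode a single ToPA table entry (illustrative sketch).
TOPA_END = 1 << 0    # entry points to the next table rather than an output region
TOPA_INT = 1 << 2    # signal a PMI when this region is filled
TOPA_STOP = 1 << 4   # stop tracing when this region is filled

PHYS_ADDR_MASK = (1 << 52) - 1   # assumption: MAXPHYADDR of 52 bits


def decode_topa_entry(entry):
    """Split a 64-bit ToPA entry into its fields."""
    size_field = (entry >> 6) & 0xF
    return {
        "base": entry & PHYS_ADDR_MASK & ~0xFFF,  # 4 KB aligned address
        "size": 4096 << size_field,               # region size in bytes
        "end": bool(entry & TOPA_END),
        "int": bool(entry & TOPA_INT),
        "stop": bool(entry & TOPA_STOP),
    }
```

In guest mode the guest programs these entries itself, so the hypervisor only has to deliver the resulting PMI to the guest.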
13. 13
Enabling for Guest Mode (4/4)
• VM Exits Due to Intel PT Output
‒ EPT violations
‒ No EPT mapping for the PT buffer address (crash the guest when PoD is not in use)
‒ PT output address is MMIO (crash guest)
‒ PT output address is write protected
‒ EPT misconfiguration
‒ PML Log-Full VM exits
‒ PT output to a new page may cause PML log-full
‒ APIC access VM exits
‒ PT output address overlaps the 4 KB APIC MMIO page (crash the guest)
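The cases on this slide can be sketched as a small dispatch function. The exit reason numbers are the architectural basic exit reasons; treating bit 16 of the EPT-violation exit qualification as the "asynchronous to instruction execution" indicator for PT output is my reading of the SDM and should be verified. The `target` classification, the action strings, and the handling of cases for which the slide gives no action are assumptions.

```python
# Dispatch for VM exits caused by Intel PT output (illustrative sketch).
EXIT_REASON_APIC_ACCESS = 44
EXIT_REASON_EPT_VIOLATION = 48
EXIT_REASON_EPT_MISCONFIG = 49
EXIT_REASON_PML_FULL = 62

# Assumption: set when the access was asynchronous to instruction execution,
# as is the case for trace output.
EPT_QUAL_ASYNC = 1 << 16


def handle_pt_output_exit(reason, qualification=0, target=None):
    """Return a hypothetical action label for a PT-output related VM exit.

    target describes the faulting output address for EPT violations:
    "unmapped", "unmapped_pod", "mmio" or "write_protected".
    """
    if reason == EXIT_REASON_EPT_VIOLATION:
        if not qualification & EPT_QUAL_ASYNC:
            return "not-pt-output"            # ordinary guest access
        if target == "unmapped":
            return "crash-guest"              # no mapping and PoD not in use
        if target == "unmapped_pod":
            return "populate-and-resume"      # populate-on-demand fills the page
        if target == "mmio":
            return "crash-guest"
        return "generic-handler"              # e.g. write protected; not specified
    if reason == EXIT_REASON_PML_FULL:
        return "flush-pml-and-resume"         # output to a new page filled the log
    if reason == EXIT_REASON_APIC_ACCESS:
        return "crash-guest"                  # output overlaps the APIC page
    return "generic-handler"                  # includes EPT misconfiguration
```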
14. 14
Summary
• Benefit
‒ Simplifies enabling Intel PT virtualization
‒ Can be used to trace guest and hypervisor behavior
• Status
‒ Version 2 of the patch set has been sent to the community (guest mode)
• TODO
‒ SYSTEM mode
‒ PT output emulation for introspection scenarios