SlideShare uma empresa Scribd logo
1 de 30
Dave Probert, Ph.D. - Windows Kernel Architect
MicrosoftWindows Division
Copyright Microsoft Corporation
About Me
 Ph.D. in Computer Engineering (Operating Systems w/o
Kernels)
 Kernel Architect at Microsoft for over 13 years
 Managed platform-independent kernel development in Win2K/XP
 Working on multi-core & heterogeneous parallel computing support
 Architect for UMS in Windows 7 / Windows Server 2008 R2
 Co-instigator of the Windows Academic Program
 Providing kernel source and curriculum materials to universities
 http://microsoft.com/WindowsAcademic or compsci@microsoft.com
 Wrote the Windows material for leading OS textbooks
 Tanenbaum, Silberschatz, Stallings
 Consulted on others, including a successful OS textbook in China
UNIX vs NT Design Environments
Environment which influenced
fundamental design decisions
UNIX [1969] Windows (NT) [1989]
16-bit program address space
Kbytes of physical memory
Swapping system with memory mapping
Kbytes of disk, fixed disks
Uniprocessor
State-machine based I/O devices
Standalone interactive systems
Small number of friendly users
32-bit program address space
Mbytes of physical memory
Virtual memory
Mbytes of disk, removable disks
Multiprocessor (4-way)
Micro-controller based I/O devices
Client/Server distributed computing
Large, diverse user populations
Copyright Microsoft Corporation
Effect on OS Design
NT vs UNIX
Although both Windows and Linux have adapted to changes in the
environment, the original design environments (i.e. in 1989 and 1969) heavily
influenced the design choices:
Unit of concurrency:
Process creation:
I/O:
Namespace root:
Security:
Threads vs processes
CreateProcess() vs fork()
Async vs sync
Virtual vs Filesystem
ACLs vs uid/gid
Addr space, uniproc
Addr space, swapping
Swapping, I/O devices
Removable storage
User populations
Copyright Microsoft Corporation
Today’s Environment [2009]
64-bit addresses
GBytes of physical memory
TBytes of rotational disk
New Storage hierarchies (SSDs)
Hypervisors, virtual processors
Multi-core/Many-core
Heterogeneous CPU architectures, Fixed function hardware
High-speed internet/intranet, Web Services
Media-rich applications
Single user, but vulnerable to hackers worldwide
Convergence: Smartphone / Netbook / Laptop / Desktop / TV / Web / Cloud
Copyright Microsoft Corporation
Windows Architecture
hardware interfaces (buses, I/O devices, interrupts,
interval timers, DMA, memory cache control, etc., etc.)
System Service Dispatcher
Task Manager
Explorer
SvcHost.Exe
WinMgt.Exe
SpoolSv.Exe
Service
Control Mgr.
LSASS
Object
Mgr.
Windows
USER,
GDI
File
System
Cache
I/O Mgr
Environment
Subsystems
User
Application
Subsystem DLLs
System Processes Services Applications
System
Threads
User
Mode
Kernel
Mode
NTDLL.DLL
Device &
File Sys.
Drivers
WinLogon
Session Manager
Services.Exe POSIX
Windows DLLs
Plugand
PlayMgr.
Power
Mgr.
Security
Reference
Monitor
Virtual
Memory
Processes
&
Threads
Local
Procedure
Call
Graphics
Drivers
Kernel
Hardware Abstraction Layer (HAL)
(kernel mode callable interfaces)
Configura-
tionMgr
(registry)
OS/2
Windows
Copyright Microsoft Corporation
Kernel-mode Architecture of
Windows
Copyright Microsoft Corporation
NT API stubs (wrap sysenter) -- system library (ntdll.dll)
user
mode
kernel
mode
NTOS executive layer
Trap/Exception/Interrupt Dispatch
CPU mgmt: scheduling, synchr, ISRs/DPCs/APCs
Drivers
Devices, Filters,
Volumes,
Networking,
Graphics
Hardware Abstraction Layer (HAL): BIOS/chipset details
firmware/
hardware CPU, MMU, APIC, BIOS/ACPI, memory, devices
NTOS
kernel
layer
Caching Mgr
Security
Procs/Threads
Virtual Memory
IPC
glue
I/O
Object Mgr
Registry
Copyright Microsoft Corporation
Kernel/Executive layers
 Kernel layer – ntos/ke – ~ 5% of NTOS source)
 Abstracts the CPU
 Threads, Asynchronous Procedure Calls (APCs)
 Interrupt Service Routines (ISRs)
 Deferred Procedure Calls (DPCs – aka Software Interrupts)
 Providers low-level synchronization
 Executive layer
 OS Services running in a multithreaded environment
 Full virtual memory, heap, handles
 Extensions to NTOS: drivers, file systems, network, …
Copyright Microsoft Corporation
NT (Native) API examples
NtCreateProcess (&ProcHandle, Access, SectionHandle,
DebugPort, ExceptionPort, …)
NtCreateThread (&ThreadHandle, ProcHandle, Access,
ThreadContext, bCreateSuspended, …)
NtAllocateVirtualMemory (ProcHandle, Addr, Size,
Type, Protection, …)
NtMapViewOfSection (SectionHandle, ProcHandle,
Addr, Size, Protection, …)
NtReadVirtualMemory (ProcHandle, Addr, Size, …)
NtDuplicateObject (srcProcHandle, srcObjHandle,
dstProcHandle, dstHandle, Access, Attributes,
Options)
Copyright Microsoft Corporation
Windows Vista Kernel
Changes Kernel changes mostly minor improvements
 Algorithms, scalability, code maintainability
 CPU timing: Uses Time Stamp Counter (TSC)
 Interrupts not charged to threads
 Timing and quanta are more accurate
 Communication
 ALPC: Advanced Lightweight Procedure Calls
 Kernel-mode RPC
 New TCP/IP stack (integrated IPv4 and IPv6)
 I/O
 Remove a context switch from I/O Completion Ports
 I/O cancellation improvements
 Memory management
 Address space randomization (DLLs, stacks)
 Kernel address space dynamically configured
 Security: BitLocker, DRM, UAC, Integrity Levels
Copyright Microsoft Corporation
Windows 7 Kernel Changes
 Miscellaneous kernel changes
 MinWin
 Change how Windows is built
 Lots of DLL refactoring
 API Sets (virtual DLLs)
 Working-set management
 Runaway processes quickly start reusing own pages
 Break up kernel working-set into multiple working-sets
 System cache, paged pool, pageable system code
 Security
 Better UAC, new account types, less BitLocker blockers
 Energy efficiency
 Trigger-started background services
 Core Parking
 Timer-coalescing, tick skipping
 Major scalability improvements for large server apps
 Broke apart last two major kernel locks, >64p
 Kernel support for ConcRT
 User-Mode Scheduling (UMS)
Copyright Microsoft Corporation
MinWin
 MinWin is first step at creating architectural
partitions
 Can be built, booted and tested separately from the rest of the
system
 Higher layers can evolve independently
 An engineering process improvement, not a microkernel NT!
 MinWin was defined as set of components required to
boot and access network
 Kernel, file system driver, TCP/IP stack, device drivers, services
 No servicing, WMI, graphics, audio or shell, etc, etc, etc
 MinWin footprint:
 150 binaries, 25MB on disk, 40MB in-memory
MinWin Layering
Shell,
Graphics,
Multimedia,
Layered Services,
Applets,
Etc.
Kernel,
HAL,
TCP/IP,
File Systems,
Drivers,
Core System Services
MinWin
Timer Coalescing
 Secret of energy efficiency: Go idle and Stay idle
 Staying idle requires minimizing timer interrupts
 Before, periodic timers had independent cycles even when period
was the same
 New timer APIs permit timer coalescing
 Application or driver specifies tolerable delay
 Timer system shifts timer firing
MarkRuss
Broke apart the Dispatcher
Lock Scheduler Dispatcher lock hottest on server workloads
 Lock protects all thread state changes (wait, unwait)
 Very lock at >64x
 Dispatcher lock broken up inWindows 7 / Server 2008 R2
 Each object protected by its own lock
 Many operations are lock-free
Copyright Microsoft Corporation
Removed PFN Lock
 Windows tracks the state of pages in physical memory
 In use: in working sets:
 Not assigned: on paging lists: freemodified, standby, …
 Before, all page state changes protected by global PFN
(Physical Frame Number) lock
 As of Windows 7 the PFN lock is gone
 Pages are now locked individually
 Improves scalability for large memory applications
Copyright Microsoft Corporation
The Silicon Power Wall
The situation:
 Power2
∝ Clock frequency
 Voltage ∝ Power2
⇨Clock frequency and Voltage offset each other
 Clock frequency inversely proportional to logic path length
Bad News:
 Power is about as low as it can go
 Logic paths between clocked elements are pretty short
Good News:
 Moore’s Law continues (# transistors doubles ~22 months)
 All that parallel computational theory is going into practice
Transistors going into more cores, not faster cores!
Software subject to Amdahl’s Law, not Moore’s Law
(or Gustafson’s Law
– if my wife can find large enough datasets she cares about) 17
Approaches to HW
parallelismHomogeneous
More big superscalar cores
 Extend with private (or shared) SIMD engines (SSE on steroids)
 (Maybe) not very energy efficient
A few more big, cores and lots of smaller, slower, cooler cores
 Use SIMD for performance
 Shutoff idle small cores for energy efficiency (but leakage?)
Lots of little fully programmable cores, all the same
 Nobody has ever gotten this to work – more on this later
Heterogeneous
Programmable Accelerators (e.g. GPUs)
 Attach loosely-coupled, specialized (non-x86), energy-efficient cores
Fixed-function Accelerators
 Very energy-efficient, device-like computational units for very-specific tasks
18
User Mode Scheduling (UMS)
 Improve support for efficient cooperative multithreaded
scheduling of small tasks (over-decomposition)
⇒ Want to schedule tasks in user-mode
⇒ Use NT threads to simulate CPUs, multiplex tasks onto these
threads
 When a task calls into the kernel and blocks, the CPU may get
scheduled to a different app
⇒ If a single NT thread per CPU, when it blocks it blocks.
⇒ Could have extra threads, but then kernel and user-mode are
competing to schedule the CPU
 Tasks run arbitraryWin32 code (but only x64/IA64)
⇒ Assumes running on an NT thread (TEB, kernel thread)
 Used by ConcRT (Visual Studio 2010’s Concurrency Run-Time)
Copyright Microsoft Corporation
Windows 7 User-Mode Scheduling
 UMS breaks NT thread into two parts:
 UT: user-mode portion (TEB, ustack, registers)
 KT: kernel-mode portion (ETHREAD, kstack, registers)
 Three key properties:
 User-mode scheduler switches UTs w/o ring crossing
 KT switch is lazy: at kernel entry (e.g. syscall, pagefault)
 CPU returned to user-mode scheduler when KT blocks
 KT “returns” to user-mode by queuing completion
 User-mode scheduler schedules corresponding UT
 (similar to scheduler activations, etc)
Copyright Microsoft Corporation
Normal NT Threading
kernel
user
KT0 KT1 KT2
UT2UT1
UT0
Kernel-mode
Scheduler
NTOS executive
trap code
NT Thread is Kernel Thread (KT) and User Thread (UT)
UT/KT form a single logical thread representing NT thread in user or
kernel
KT: ETHREAD, KSTACK, link to EPROCESS
UT: TEB, USTACK
x86 core
Copyright Microsoft Corporation
User-Mode Scheduling (UMS)
kernel
user
Thread Parking
KT0 KT1 KT2
UT Completion list
Primary
Thread
UT0
UT1
UT0
User-mode
Scheduler
trap code
NTOS executive
KT0 blocks
Only primary thread runs in user-mode
Trap code switches to parked KT
KT blocks ⇒ primary returns to user-mode
KT unblocks & parks ⇒ queue UT completion
Copyright Microsoft Corporation
UMS
 Based on NT threads
⇒ Each NT thread has user & kernel parts (UT & KT)
⇒ When a thread becomes UMS, KT never returns to UT
⇒ (Well, sort of)
⇒ Instead, the primary thread calls the USched
 USched
⇒ Switches between UTs, all in user-mode
⇒ When a UT enters kernel and blocks, the primary thread will hand
CPU back to the USched declaring UT blocked
⇒ When UT unblocks, kernel queues notification
⇒ USched consumes notifications, marks UT runnable
 PrimaryThread
⇒ Self-identified by entering kernel with wrongTEB
⇒ So UTs can migrate between threads
⇒ Affinities of primaries and KTs are orthogonal issues
Copyright Microsoft Corporation
UMS Thread Roles
 Primary threads: represent CPUs, normal app threads enter the
USched world and become primaries, primaries also can be created
by UScheds to allow parallel execution
 Primaries represent concurrent execution
 UMS threads (UT/KTs): allow blocking in the kernel without losing
the CPU
 UMS thread represent concurrent blocking in kernel
Copyright Microsoft Corporation
Thread Scheduling vs UMS
Core 2
Thread
3
Non-running threads
Core 1
Thread
4
Thread
5
Thread
1
Thread
2
Thread
6
Core 2Core 1
User
Thread
2
Kernel
Thread
2
User
Thread
1
Kernel
Thread
1
User
Thread
3
Kernel
Thread
3
User
Thread
4
Kernel
Thread
4
User
Thread
5
Kernel
Thread
5
User
Thread
6
Kernel
Thread
6
MarkRuss
Win32 compat considerations
Why not Win32 fibers?
 TEB issues
⇒ ContainsTLS andWin32-specific fields (incl LastError)
⇒ Fibers run on multiple threads, soTEB state doesn’t track
 Kernel thread issues
⇒ Visibility toTEB
⇒ I/O is queued to thread
⇒ Mutexes record thread owner
⇒ Impersonation
⇒ Cross-thread operations expect to find threads and IDs
⇒ Win32 code has thread and affinity awareness
Copyright Microsoft Corporation
Futures: Master/Slave UMS?
remote kernel
Remote x86
Thread Parking
KT0 KT1 KT2
UT2
UT1
Remote
Scheduler
trap code
NTOS executiveKernel-mode
Scheduler
Syscall Completion QueueSyscall Request Queue
UT0
x86 core
UTs (can) run on accelerators or x86s
KTs run on x86s, syscalls remoted/batched
Pagefaults are just like syscalls
Accelerator never “loses the CPU” (implicit primary)
Copyright Microsoft Corporation
Operating Systems Futures
 Many-core challenge
 New driving force in software innovation:
Amdahl’s Law overtakes Moore’s Law as high-order bit
 Heterogeneous cores?
 OS Scalability
 Loosely –coupled OS: mem + cpu + services?
 Energy efficiency
 Shrink-wrap and Freeze-dry applications?
 Hypervisor/Kernel/Runtime relationships
 Move kernel scheduling (cpu/memory) into run-times?
 Move kernel resource management into Hypervisor?
Copyright Microsoft Corporation
Windows Academic Program
 Windows Kernel Internals
 Windows kernel in source (Windows Research Kernel –WRK)
 Windows kernel in PowerPoint (Curriculum Resource Kit – CRK)
 Based onWindows Server 2008 Service Pack 1
 Latest kernel at time of release
 First kernel release with AMD64 support
 Joint program betweenWindows Product Group and MS
Academic Groups
 Program directed by Arkady Retik (Need a DVD? Have questions?)
Information available at
 http://microsoft.com/WindowsAcademic OR
 compsci@microsoft.com
 Microsoft Academic Contacts in Buenos Aires
Miguel Saez (masaez@microsoft.com) or
Ezequiel Glinsky (eglinsky@microsoft.com)
Copyright Microsoft Corporation
30
muchas gracias

Mais conteúdo relacionado

Mais procurados

Mais procurados (19)

System components of windows xp
System components of windows xpSystem components of windows xp
System components of windows xp
 
In a monolithic kerne1
In a monolithic kerne1In a monolithic kerne1
In a monolithic kerne1
 
Priority Inversion on Mars
Priority Inversion on MarsPriority Inversion on Mars
Priority Inversion on Mars
 
Linux Memory Management
Linux Memory ManagementLinux Memory Management
Linux Memory Management
 
Studies
StudiesStudies
Studies
 
Making Linux do Hard Real-time
Making Linux do Hard Real-timeMaking Linux do Hard Real-time
Making Linux do Hard Real-time
 
Win8 architecture for developers
Win8 architecture for developersWin8 architecture for developers
Win8 architecture for developers
 
KERNAL ARCHITECTURE
KERNAL ARCHITECTUREKERNAL ARCHITECTURE
KERNAL ARCHITECTURE
 
High Performance Storage Devices in the Linux Kernel
High Performance Storage Devices in the Linux KernelHigh Performance Storage Devices in the Linux Kernel
High Performance Storage Devices in the Linux Kernel
 
Browsing Linux Kernel Source
Browsing Linux Kernel SourceBrowsing Linux Kernel Source
Browsing Linux Kernel Source
 
Architecture Of The Linux Kernel
Architecture Of The Linux KernelArchitecture Of The Linux Kernel
Architecture Of The Linux Kernel
 
Linux Kernel Programming
Linux Kernel ProgrammingLinux Kernel Programming
Linux Kernel Programming
 
Os Linux
Os LinuxOs Linux
Os Linux
 
Linux kernel architecture
Linux kernel architectureLinux kernel architecture
Linux kernel architecture
 
Linux kernel
Linux kernelLinux kernel
Linux kernel
 
2. microkernel new
2. microkernel new2. microkernel new
2. microkernel new
 
Ubuntu OS Presentation
Ubuntu OS PresentationUbuntu OS Presentation
Ubuntu OS Presentation
 
distcom-short-20140112-1600
distcom-short-20140112-1600distcom-short-20140112-1600
distcom-short-20140112-1600
 
Processes and Threads in Windows Vista
Processes and Threads in Windows VistaProcesses and Threads in Windows Vista
Processes and Threads in Windows Vista
 

Destaque

Protecting Windows Networks From Malware 31 Jan09
Protecting Windows Networks From Malware 31 Jan09Protecting Windows Networks From Malware 31 Jan09
Protecting Windows Networks From Malware 31 Jan09technext1
 
Get Ready For Ipv6
Get Ready For Ipv6Get Ready For Ipv6
Get Ready For Ipv6technext1
 
Microsoft Platform Security Briefing
Microsoft Platform Security BriefingMicrosoft Platform Security Briefing
Microsoft Platform Security Briefingtechnext1
 
Win 7 & Intel V Pro Tech
Win 7 & Intel V Pro TechWin 7 & Intel V Pro Tech
Win 7 & Intel V Pro Techtechnext1
 
It Governance OC CIO Nov,2013
It Governance OC CIO Nov,2013It Governance OC CIO Nov,2013
It Governance OC CIO Nov,2013Jim Sutter
 

Destaque (6)

Protecting Windows Networks From Malware 31 Jan09
Protecting Windows Networks From Malware 31 Jan09Protecting Windows Networks From Malware 31 Jan09
Protecting Windows Networks From Malware 31 Jan09
 
Get Ready For Ipv6
Get Ready For Ipv6Get Ready For Ipv6
Get Ready For Ipv6
 
Microsoft Platform Security Briefing
Microsoft Platform Security BriefingMicrosoft Platform Security Briefing
Microsoft Platform Security Briefing
 
Win 7 & Intel V Pro Tech
Win 7 & Intel V Pro TechWin 7 & Intel V Pro Tech
Win 7 & Intel V Pro Tech
 
It Governance OC CIO Nov,2013
It Governance OC CIO Nov,2013It Governance OC CIO Nov,2013
It Governance OC CIO Nov,2013
 
Lru Algorithm
Lru AlgorithmLru Algorithm
Lru Algorithm
 

Semelhante a 2337610

Evolution of the Windows Kernel Architecture, by Dave Probert
Evolution of the Windows Kernel Architecture, by Dave ProbertEvolution of the Windows Kernel Architecture, by Dave Probert
Evolution of the Windows Kernel Architecture, by Dave Probertyang
 
Introduction to OS LEVEL Virtualization & Containers
Introduction to OS LEVEL Virtualization & ContainersIntroduction to OS LEVEL Virtualization & Containers
Introduction to OS LEVEL Virtualization & ContainersVaibhav Sharma
 
Lecture 3,4 operating systems
Lecture 3,4   operating systemsLecture 3,4   operating systems
Lecture 3,4 operating systemsPradeep Kumar TS
 
Lecture 3,4 operating systems
Lecture 3,4   operating systemsLecture 3,4   operating systems
Lecture 3,4 operating systemsPradeep Kumar TS
 
Embedded System
Embedded SystemEmbedded System
Embedded Systemsurendar
 
Principles of operating system
Principles of operating systemPrinciples of operating system
Principles of operating systemAnil Dharmapuri
 
Embedded Intro India05
Embedded Intro India05Embedded Intro India05
Embedded Intro India05Rajesh Gupta
 
Operating System 4 1193308760782240 2
Operating System 4 1193308760782240 2Operating System 4 1193308760782240 2
Operating System 4 1193308760782240 2mona_hakmy
 
Operating System 4
Operating System 4Operating System 4
Operating System 4tech2click
 
Operating system Definition Structures
Operating  system Definition  StructuresOperating  system Definition  Structures
Operating system Definition Structuresanair23
 
CSE 370 - Introduction to Operating Systems
CSE 370 - Introduction to Operating SystemsCSE 370 - Introduction to Operating Systems
CSE 370 - Introduction to Operating SystemsDev Khare
 
I/O System and Case study
I/O System and Case studyI/O System and Case study
I/O System and Case studymalarselvi mms
 

Semelhante a 2337610 (20)

Evolution of the Windows Kernel Architecture, by Dave Probert
Evolution of the Windows Kernel Architecture, by Dave ProbertEvolution of the Windows Kernel Architecture, by Dave Probert
Evolution of the Windows Kernel Architecture, by Dave Probert
 
Introduction to OS LEVEL Virtualization & Containers
Introduction to OS LEVEL Virtualization & ContainersIntroduction to OS LEVEL Virtualization & Containers
Introduction to OS LEVEL Virtualization & Containers
 
Lecture 3,4 operating systems
Lecture 3,4   operating systemsLecture 3,4   operating systems
Lecture 3,4 operating systems
 
Lecture 3,4 operating systems
Lecture 3,4   operating systemsLecture 3,4   operating systems
Lecture 3,4 operating systems
 
Embedded System
Embedded SystemEmbedded System
Embedded System
 
Principles of operating system
Principles of operating systemPrinciples of operating system
Principles of operating system
 
Operating system ppt
Operating system pptOperating system ppt
Operating system ppt
 
Operating system ppt
Operating system pptOperating system ppt
Operating system ppt
 
Operating system ppt
Operating system pptOperating system ppt
Operating system ppt
 
Operating system ppt
Operating system pptOperating system ppt
Operating system ppt
 
Embedded Intro India05
Embedded Intro India05Embedded Intro India05
Embedded Intro India05
 
Operating System 4 1193308760782240 2
Operating System 4 1193308760782240 2Operating System 4 1193308760782240 2
Operating System 4 1193308760782240 2
 
Operating System 4
Operating System 4Operating System 4
Operating System 4
 
Mainframe
MainframeMainframe
Mainframe
 
Distributed Computing
Distributed ComputingDistributed Computing
Distributed Computing
 
intro.ppt
intro.pptintro.ppt
intro.ppt
 
Operating system Definition Structures
Operating  system Definition  StructuresOperating  system Definition  Structures
Operating system Definition Structures
 
CSE 370 - Introduction to Operating Systems
CSE 370 - Introduction to Operating SystemsCSE 370 - Introduction to Operating Systems
CSE 370 - Introduction to Operating Systems
 
I/O System and Case study
I/O System and Case studyI/O System and Case study
I/O System and Case study
 
intro.ppt
intro.pptintro.ppt
intro.ppt
 

Último

Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 

Último (20)

Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

2337610

  • 1. Dave Probert, Ph.D. - Windows Kernel Architect MicrosoftWindows Division Copyright Microsoft Corporation
  • 2. About Me  Ph.D. in Computer Engineering (Operating Systems w/o Kernels)  Kernel Architect at Microsoft for over 13 years  Managed platform-independent kernel development in Win2K/XP  Working on multi-core & heterogeneous parallel computing support  Architect for UMS in Windows 7 / Windows Server 2008 R2  Co-instigator of the Windows Academic Program  Providing kernel source and curriculum materials to universities  http://microsoft.com/WindowsAcademic or compsci@microsoft.com  Wrote the Windows material for leading OS textbooks  Tanenbaum, Silberschatz, Stallings  Consulted on others, including a successful OS textbook in China
  • 3. UNIX vs NT Design Environments Environment which influenced fundamental design decisions UNIX [1969] Windows (NT) [1989] 16-bit program address space Kbytes of physical memory Swapping system with memory mapping Kbytes of disk, fixed disks Uniprocessor State-machine based I/O devices Standalone interactive systems Small number of friendly users 32-bit program address space Mbytes of physical memory Virtual memory Mbytes of disk, removable disks Multiprocessor (4-way) Micro-controller based I/O devices Client/Server distributed computing Large, diverse user populations Copyright Microsoft Corporation
  • 4. Effect on OS Design NT vs UNIX Although both Windows and Linux have adapted to changes in the environment, the original design environments (i.e. in 1989 and 1969) heavily influenced the design choices: Unit of concurrency: Process creation: I/O: Namespace root: Security: Threads vs processes CreateProcess() vs fork() Async vs sync Virtual vs Filesystem ACLs vs uid/gid Addr space, uniproc Addr space, swapping Swapping, I/O devices Removable storage User populations Copyright Microsoft Corporation
  • 5. Today’s Environment [2009] 64-bit addresses GBytes of physical memory TBytes of rotational disk New Storage hierarchies (SSDs) Hypervisors, virtual processors Multi-core/Many-core Heterogeneous CPU architectures, Fixed function hardware High-speed internet/intranet, Web Services Media-rich applications Single user, but vulnerable to hackers worldwide Convergence: Smartphone / Netbook / Laptop / Desktop / TV / Web / Cloud Copyright Microsoft Corporation
  • 6. Windows Architecture hardware interfaces (buses, I/O devices, interrupts, interval timers, DMA, memory cache control, etc., etc.) System Service Dispatcher Task Manager Explorer SvcHost.Exe WinMgt.Exe SpoolSv.Exe Service Control Mgr. LSASS Object Mgr. Windows USER, GDI File System Cache I/O Mgr Environment Subsystems User Application Subsystem DLLs System Processes Services Applications System Threads User Mode Kernel Mode NTDLL.DLL Device & File Sys. Drivers WinLogon Session Manager Services.Exe POSIX Windows DLLs Plugand PlayMgr. Power Mgr. Security Reference Monitor Virtual Memory Processes & Threads Local Procedure Call Graphics Drivers Kernel Hardware Abstraction Layer (HAL) (kernel mode callable interfaces) Configura- tionMgr (registry) OS/2 Windows Copyright Microsoft Corporation
  • 7. Kernel-mode Architecture of Windows Copyright Microsoft Corporation NT API stubs (wrap sysenter) -- system library (ntdll.dll) user mode kernel mode NTOS executive layer Trap/Exception/Interrupt Dispatch CPU mgmt: scheduling, synchr, ISRs/DPCs/APCs Drivers Devices, Filters, Volumes, Networking, Graphics Hardware Abstraction Layer (HAL): BIOS/chipset details firmware/ hardware CPU, MMU, APIC, BIOS/ACPI, memory, devices NTOS kernel layer Caching Mgr Security Procs/Threads Virtual Memory IPC glue I/O Object Mgr Registry Copyright Microsoft Corporation
  • 8. Kernel/Executive layers  Kernel layer – ntos/ke – ~ 5% of NTOS source)  Abstracts the CPU  Threads, Asynchronous Procedure Calls (APCs)  Interrupt Service Routines (ISRs)  Deferred Procedure Calls (DPCs – aka Software Interrupts)  Providers low-level synchronization  Executive layer  OS Services running in a multithreaded environment  Full virtual memory, heap, handles  Extensions to NTOS: drivers, file systems, network, … Copyright Microsoft Corporation
  • 9. NT (Native) API examples NtCreateProcess (&ProcHandle, Access, SectionHandle, DebugPort, ExceptionPort, …) NtCreateThread (&ThreadHandle, ProcHandle, Access, ThreadContext, bCreateSuspended, …) NtAllocateVirtualMemory (ProcHandle, Addr, Size, Type, Protection, …) NtMapViewOfSection (SectionHandle, ProcHandle, Addr, Size, Protection, …) NtReadVirtualMemory (ProcHandle, Addr, Size, …) NtDuplicateObject (srcProcHandle, srcObjHandle, dstProcHandle, dstHandle, Access, Attributes, Options) Copyright Microsoft Corporation
  • 10. Windows Vista Kernel Changes Kernel changes mostly minor improvements  Algorithms, scalability, code maintainability  CPU timing: Uses Time Stamp Counter (TSC)  Interrupts not charged to threads  Timing and quanta are more accurate  Communication  ALPC: Advanced Lightweight Procedure Calls  Kernel-mode RPC  New TCP/IP stack (integrated IPv4 and IPv6)  I/O  Remove a context switch from I/O Completion Ports  I/O cancellation improvements  Memory management  Address space randomization (DLLs, stacks)  Kernel address space dynamically configured  Security: BitLocker, DRM, UAC, Integrity Levels Copyright Microsoft Corporation
  • 11. Windows 7 Kernel Changes  Miscellaneous kernel changes  MinWin  Change how Windows is built  Lots of DLL refactoring  API Sets (virtual DLLs)  Working-set management  Runaway processes quickly start reusing own pages  Break up kernel working-set into multiple working-sets  System cache, paged pool, pageable system code  Security  Better UAC, new account types, less BitLocker blockers  Energy efficiency  Trigger-started background services  Core Parking  Timer-coalescing, tick skipping  Major scalability improvements for large server apps  Broke apart last two major kernel locks, >64p  Kernel support for ConcRT  User-Mode Scheduling (UMS) Copyright Microsoft Corporation
  • 12. MinWin  MinWin is first step at creating architectural partitions  Can be built, booted and tested separately from the rest of the system  Higher layers can evolve independently  An engineering process improvement, not a microkernel NT!  MinWin was defined as set of components required to boot and access network  Kernel, file system driver, TCP/IP stack, device drivers, services  No servicing, WMI, graphics, audio or shell, etc, etc, etc  MinWin footprint:  150 binaries, 25MB on disk, 40MB in-memory
  • 14. Timer Coalescing  Secret of energy efficiency: Go idle and Stay idle  Staying idle requires minimizing timer interrupts  Before, periodic timers had independent cycles even when period was the same  New timer APIs permit timer coalescing  Application or driver specifies tolerable delay  Timer system shifts timer firing MarkRuss
  • 15. Broke apart the Dispatcher Lock Scheduler Dispatcher lock hottest on server workloads  Lock protects all thread state changes (wait, unwait)  Very lock at >64x  Dispatcher lock broken up inWindows 7 / Server 2008 R2  Each object protected by its own lock  Many operations are lock-free Copyright Microsoft Corporation
  • 16. Removed PFN Lock  Windows tracks the state of pages in physical memory  In use: in working sets:  Not assigned: on paging lists: freemodified, standby, …  Before, all page state changes protected by global PFN (Physical Frame Number) lock  As of Windows 7 the PFN lock is gone  Pages are now locked individually  Improves scalability for large memory applications Copyright Microsoft Corporation
  • 17. The Silicon Power Wall The situation:  Power2 ∝ Clock frequency  Voltage ∝ Power2 ⇨Clock frequency and Voltage offset each other  Clock frequency inversely proportional to logic path length Bad News:  Power is about as low as it can go  Logic paths between clocked elements are pretty short Good News:  Moore’s Law continues (# transistors doubles ~22 months)  All that parallel computational theory is going into practice Transistors going into more cores, not faster cores! Software subject to Amdahl’s Law, not Moore’s Law (or Gustafson’s Law – if my wife can find large enough datasets she cares about) 17
  • 18. Approaches to HW parallelismHomogeneous More big superscalar cores  Extend with private (or shared) SIMD engines (SSE on steroids)  (Maybe) not very energy efficient A few more big, cores and lots of smaller, slower, cooler cores  Use SIMD for performance  Shutoff idle small cores for energy efficiency (but leakage?) Lots of little fully programmable cores, all the same  Nobody has ever gotten this to work – more on this later Heterogeneous Programmable Accelerators (e.g. GPUs)  Attach loosely-coupled, specialized (non-x86), energy-efficient cores Fixed-function Accelerators  Very energy-efficient, device-like computational units for very-specific tasks 18
  • 19. User Mode Scheduling (UMS)  Improve support for efficient cooperative multithreaded scheduling of small tasks (over-decomposition) ⇒ Want to schedule tasks in user-mode ⇒ Use NT threads to simulate CPUs, multiplex tasks onto these threads  When a task calls into the kernel and blocks, the CPU may get scheduled to a different app ⇒ If a single NT thread per CPU, when it blocks it blocks. ⇒ Could have extra threads, but then kernel and user-mode are competing to schedule the CPU  Tasks run arbitraryWin32 code (but only x64/IA64) ⇒ Assumes running on an NT thread (TEB, kernel thread)  Used by ConcRT (Visual Studio 2010’s Concurrency Run-Time) Copyright Microsoft Corporation
  • 20. Windows 7 User-Mode Scheduling  UMS breaks NT thread into two parts:  UT: user-mode portion (TEB, ustack, registers)  KT: kernel-mode portion (ETHREAD, kstack, registers)  Three key properties:  User-mode scheduler switches UTs w/o ring crossing  KT switch is lazy: at kernel entry (e.g. syscall, pagefault)  CPU returned to user-mode scheduler when KT blocks  KT “returns” to user-mode by queuing completion  User-mode scheduler schedules corresponding UT  (similar to scheduler activations, etc) Copyright Microsoft Corporation
  • 21. Normal NT Threading kernel user KT0 KT1 KT2 UT2UT1 UT0 Kernel-mode Scheduler NTOS executive trap code NT Thread is Kernel Thread (KT) and User Thread (UT) UT/KT form a single logical thread representing NT thread in user or kernel KT: ETHREAD, KSTACK, link to EPROCESS UT: TEB, USTACK x86 core Copyright Microsoft Corporation
  • 22. User-Mode Scheduling (UMS) kernel user Thread Parking KT0 KT1 KT2 UT Completion list Primary Thread UT0 UT1 UT0 User-mode Scheduler trap code NTOS executive KT0 blocks Only primary thread runs in user-mode Trap code switches to parked KT KT blocks ⇒ primary returns to user-mode KT unblocks & parks ⇒ queue UT completion Copyright Microsoft Corporation
  • 23. UMS  Based on NT threads ⇒ Each NT thread has user & kernel parts (UT & KT) ⇒ When a thread becomes UMS, KT never returns to UT ⇒ (Well, sort of) ⇒ Instead, the primary thread calls the USched  USched ⇒ Switches between UTs, all in user-mode ⇒ When a UT enters kernel and blocks, the primary thread will hand CPU back to the USched declaring UT blocked ⇒ When UT unblocks, kernel queues notification ⇒ USched consumes notifications, marks UT runnable  PrimaryThread ⇒ Self-identified by entering kernel with wrongTEB ⇒ So UTs can migrate between threads ⇒ Affinities of primaries and KTs are orthogonal issues Copyright Microsoft Corporation
  • 24. UMS Thread Roles  Primary threads: represent CPUs, normal app threads enter the USched world and become primaries, primaries also can be created by UScheds to allow parallel execution  Primaries represent concurrent execution  UMS threads (UT/KTs): allow blocking in the kernel without losing the CPU  UMS thread represent concurrent blocking in kernel Copyright Microsoft Corporation
  • 25. Thread Scheduling vs UMS Core 2 Thread 3 Non-running threads Core 1 Thread 4 Thread 5 Thread 1 Thread 2 Thread 6 Core 2Core 1 User Thread 2 Kernel Thread 2 User Thread 1 Kernel Thread 1 User Thread 3 Kernel Thread 3 User Thread 4 Kernel Thread 4 User Thread 5 Kernel Thread 5 User Thread 6 Kernel Thread 6 MarkRuss
  • 26. Win32 compat considerations Why not Win32 fibers?  TEB issues ⇒ ContainsTLS andWin32-specific fields (incl LastError) ⇒ Fibers run on multiple threads, soTEB state doesn’t track  Kernel thread issues ⇒ Visibility toTEB ⇒ I/O is queued to thread ⇒ Mutexes record thread owner ⇒ Impersonation ⇒ Cross-thread operations expect to find threads and IDs ⇒ Win32 code has thread and affinity awareness Copyright Microsoft Corporation
  • 27. Futures: Master/Slave UMS? remote kernel Remote x86 Thread Parking KT0 KT1 KT2 UT2 UT1 Remote Scheduler trap code NTOS executiveKernel-mode Scheduler Syscall Completion QueueSyscall Request Queue UT0 x86 core UTs (can) run on accelerators or x86s KTs run on x86s, syscalls remoted/batched Pagefaults are just like syscalls Accelerator never “loses the CPU” (implicit primary) Copyright Microsoft Corporation
  • 28. Operating Systems Futures  Many-core challenge  New driving force in software innovation: Amdahl’s Law overtakes Moore’s Law as high-order bit  Heterogeneous cores?  OS Scalability  Loosely –coupled OS: mem + cpu + services?  Energy efficiency  Shrink-wrap and Freeze-dry applications?  Hypervisor/Kernel/Runtime relationships  Move kernel scheduling (cpu/memory) into run-times?  Move kernel resource management into Hypervisor? Copyright Microsoft Corporation
  • 29. Windows Academic Program  Windows Kernel Internals  Windows kernel in source (Windows Research Kernel –WRK)  Windows kernel in PowerPoint (Curriculum Resource Kit – CRK)  Based onWindows Server 2008 Service Pack 1  Latest kernel at time of release  First kernel release with AMD64 support  Joint program betweenWindows Product Group and MS Academic Groups  Program directed by Arkady Retik (Need a DVD? Have questions?) Information available at  http://microsoft.com/WindowsAcademic OR  compsci@microsoft.com  Microsoft Academic Contacts in Buenos Aires Miguel Saez (masaez@microsoft.com) or Ezequiel Glinsky (eglinsky@microsoft.com) Copyright Microsoft Corporation