SlideShare uma empresa Scribd logo
1 de 22
Baixar para ler offline
Practical ,Transparent Operating System Support For

 SUPERPAGES

Authors:
Juan Navarro, Sitaram Iyer, Peter Drusche, Alan Cox

Presented by:
Nadeeshani Hewage
Basics & Terminology
•   TLB?
     o Caches virtual-to-physical address mappings from the page tables
     o Faster access
•   TLB Coverage?
     o Amount of memory accessible through cached mappings, without TLB
       misses
•   Superpage
     o Memory page for larger size than a Base-page (ordinary memory page)
     o Multiple sizes
     o Single entry in TLB per superpage
•   Memory objects
     o Occupy some contiguous space in virtual memory
     o Eg: memory mapped files, code, data
•   TLB coverage decreasing with growing memory size over years
•   Motivation to increase TLB coverage without enlargening TLB size.
Constraints for Superpages
•   Page size should be supported by processor.

•   Contiguous

•   Starting address in physical & virtual address space must be a multiple of its
    size.

•   Single TLB entry for a superpage. Meaning single set of protection
    attributes and single dirty bit for a superpage.
Considerations
•   Allocation
     o Allocating physical frames for the memory object.
     o If there exists a plan to create a superpage by the OS, to satisfy the
        contiguity requirement,
           Relocation based allocation
              •  Create contiguity
              •  Physically copying allocated page frames when it is decided
                 that superpages are likely to be beneficial
                    o Software managed TLB + TLB miss handler
                    o Counter per superpage incremented at TLB miss
           Reservation based allocation
              •  Preserve contiguity
              •  Make allocation decisions at page fault time
              •  Decide preferred size of allocation on an algorithm
                    o Eg: Talluri & Hill proposed
•   Fragmentation Control
     o Release contiguous chunks of inactive memory from previous
       allocations
     o Preempt an existing partially used reservation

•   Promotion
     o Set of base pages from a potential super page satisfying constraints
       promoted to a Superpage.

•   Demotion
     o Reduce size of a superpage
           A smaller super page
           Base pages
•   Eviction
     o At a demand of memory preassure, inactive superpage is evicted from
        physical memory.
Design & Implementation
Preferred Superpage Size Policy
  o Decision made early at process execution
  o Looks only at attributes of memory object
  o Decided size
      Too large: decision overridden by preempting init reservation
      Too small: cannot be reverted
  o Policy: Pick the maximum superpage size that can be used for an object
Reservation based Allocation
•   Already spoken
Preempting Reservations
•   Free physical memory scarce/ excessively fragmented

•   Preempt reserved but unused frames

•   Choice between
     o Refusing allocation; reserving smaller extent than desired for new
       allocation
     o Preempting existing reservation which has enough unallocated frames

•   Policy: Whenever possible preempt & give space for alloaction
Incremental Promotions, Demotions
•   Promotions
     o Multiple superpage sizes exist
     o Superpage gets created when any size supported gets fully populated
       within a reservation
     o Increment smaller superpage to next larger size superpage

•   Demotions
     o A base page of a superpage targetted to be evicted by page daemon,
       causes a demotion

     o Demote to next smaller superpage size
Paging Out Dirty Superpages
•   OS keeps only single dirty bit for the whole superpage.

•   Problem: only one base page modified; cost of writing full superpage to disk
    (high cost on I/O)

•   Practice: Demote clean superpages whenever process attempts to write to
    them. Repromote later.
Reservation Lists
•   Keep track of reserved page extent that are not fully populated

•   One list per each page size supported.

•   Kept sorted by the time of their most recent page frame allocation

•   Policy: Preempt the extent whose most recent allocation, occurred least
    recently among all reservations in the list (head)
Population Map
•   Keep track of allocated base pages within each memory object
•   Queried on every page fault, should support efficient lookups.
•   Radix tree
     o Each level corresponds to a page size
     o Root: max superpage size supported
     o Non-leaf:
          # of supperpage sized virtual regions at next lower level with a
            population of at least one (somepop)
          # of supperpage sized virtual regions at next lower level fully
            populated (fullpop)
•   Used for various decisions
Wired Page clustering
•   Pages used for internal datastructures for the OS (implementation on
    FreeBSD) are wired
    (not evicted)

•   At system boot time, clustered together in physical memory.

•   Gradually as kernel allocates memory they get scattered, causes
    fragmentation

•   To avoid fragmentation, identify pages which are to be wired (internal use),
    cluster them in pools of contiguous physical memory
Evaluation
Platform & Workload
•   Platform
     o FreeBSD 4.3 kernel as a loadable module
     o Alpha 21264 processor at 500 MHz
     o four page sizes: 8KB base pages, 64KB, 512KB and 4MB superpages
     o 512MB RAM
     o 64KB data and 64KB instruction L1 caches, virtually indexed and 2-way
        associative
     o fully associative TLB with 128 entries for data and 128 for instructions
•   Workload
     o FFTW: The Fastest Fourier Transform in the West with a 200x200x200
        matrix as input
     o Matrix: A non-blocked matrix transposition of a1000x1000 matrix
     o Web: The thttpd web server servicing 50000 requests selected from an
        access log of the CS departmental web server at Rice University. The
        working set size of this trace is 238MB, while its data set is 3.6GB.
Best case benefits
•   When free memory is plentiful
    & non fragmented
     o Speedup
       (mean elapsed runtime of 3
       runs after an initial warmup)
     o TLB misses reduction
       percentage
Evaluation ctd.
•   Fragment system by running web server & feeding it with requests
•   FFTW run on fragmented system 4 times in sequence.
•   Goal : To see how quickly the system recovers just enough contiguous
    memory to build superpages
•   2 contiguity restoration techniques
     o Cache scheme
          Treat all cache pages as available, coalesces them into allocator
          0 superpages
     o Daemon scheme
          Contiguity aware page replacement + wired page clustering
          Get 20/60, 35/60, 38/60, 40/60 superpages
•   Even if the superpages implementation is there
    it requires a promising fragmentation control
    mechanism to acquire promising results.
•   With superpages and effective fragmentation control, evaluations have
    given promising results.
Thank You!

Mais conteúdo relacionado

Mais procurados

Introduction to docker
Introduction to dockerIntroduction to docker
Introduction to docker
Hiroki Endo
 
Lecture 6
Lecture  6Lecture  6
Lecture 6
Mr SMAK
 

Mais procurados (20)

Chapter 1 - introduction - parallel computing
Chapter  1 - introduction - parallel computingChapter  1 - introduction - parallel computing
Chapter 1 - introduction - parallel computing
 
11. operating-systems-part-2
11. operating-systems-part-211. operating-systems-part-2
11. operating-systems-part-2
 
Process scheduling (CPU Scheduling)
Process scheduling (CPU Scheduling)Process scheduling (CPU Scheduling)
Process scheduling (CPU Scheduling)
 
Multiprocessor Scheduling
Multiprocessor SchedulingMultiprocessor Scheduling
Multiprocessor Scheduling
 
Linux kernel
Linux kernelLinux kernel
Linux kernel
 
Introduction to docker
Introduction to dockerIntroduction to docker
Introduction to docker
 
Lecture 6
Lecture  6Lecture  6
Lecture 6
 
Multithreading
Multithreading Multithreading
Multithreading
 
Linux kernel
Linux kernelLinux kernel
Linux kernel
 
Operating Systems: Process Scheduling
Operating Systems: Process SchedulingOperating Systems: Process Scheduling
Operating Systems: Process Scheduling
 
CPU Scheduling algorithms
CPU Scheduling algorithmsCPU Scheduling algorithms
CPU Scheduling algorithms
 
Page Cache in Linux 2.6.pdf
Page Cache in Linux 2.6.pdfPage Cache in Linux 2.6.pdf
Page Cache in Linux 2.6.pdf
 
Grokking Techtalk #37: Data intensive problem
 Grokking Techtalk #37: Data intensive problem Grokking Techtalk #37: Data intensive problem
Grokking Techtalk #37: Data intensive problem
 
Shenandoah GC: Java Without The Garbage Collection Hiccups (Christine Flood)
Shenandoah GC: Java Without The Garbage Collection Hiccups (Christine Flood)Shenandoah GC: Java Without The Garbage Collection Hiccups (Christine Flood)
Shenandoah GC: Java Without The Garbage Collection Hiccups (Christine Flood)
 
Docker internals
Docker internalsDocker internals
Docker internals
 
Multiprocessor
MultiprocessorMultiprocessor
Multiprocessor
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
 
Osnug meetup-tungsten fabric - overview.pptx
Osnug meetup-tungsten fabric - overview.pptxOsnug meetup-tungsten fabric - overview.pptx
Osnug meetup-tungsten fabric - overview.pptx
 
Local DNS with pfSense 2.4 - pfSense Hangout April 2018
Local DNS with pfSense 2.4 - pfSense Hangout April 2018Local DNS with pfSense 2.4 - pfSense Hangout April 2018
Local DNS with pfSense 2.4 - pfSense Hangout April 2018
 
Processes and Processors in Distributed Systems
Processes and Processors in Distributed SystemsProcesses and Processors in Distributed Systems
Processes and Processors in Distributed Systems
 

Semelhante a Practical ,Transparent Operating System Support For Superpages

Lecture 8- Virtual Memory Final.pptx
Lecture 8- Virtual Memory Final.pptxLecture 8- Virtual Memory Final.pptx
Lecture 8- Virtual Memory Final.pptx
Amanuelmergia
 
08 virtual memory
08 virtual memory08 virtual memory
08 virtual memory
Kamal Singh
 

Semelhante a Practical ,Transparent Operating System Support For Superpages (20)

Java one2015 - Work With Hundreds of Hot Terabytes in JVMs
Java one2015 - Work With Hundreds of Hot Terabytes in JVMsJava one2015 - Work With Hundreds of Hot Terabytes in JVMs
Java one2015 - Work With Hundreds of Hot Terabytes in JVMs
 
virtual_memory (3).pptx
virtual_memory (3).pptxvirtual_memory (3).pptx
virtual_memory (3).pptx
 
virtual memory
virtual memoryvirtual memory
virtual memory
 
Chapter08
Chapter08Chapter08
Chapter08
 
HBase Sizing Guide
HBase Sizing GuideHBase Sizing Guide
HBase Sizing Guide
 
Virtual Memory.pdf
Virtual Memory.pdfVirtual Memory.pdf
Virtual Memory.pdf
 
Os unit 3
Os unit 3Os unit 3
Os unit 3
 
The life and times
The life and timesThe life and times
The life and times
 
Understanding memory management
Understanding memory managementUnderstanding memory management
Understanding memory management
 
Dissecting Scalable Database Architectures
Dissecting Scalable Database ArchitecturesDissecting Scalable Database Architectures
Dissecting Scalable Database Architectures
 
Segmentation and paging
Segmentation and paging Segmentation and paging
Segmentation and paging
 
Unit-4 swapping.pptx
Unit-4 swapping.pptxUnit-4 swapping.pptx
Unit-4 swapping.pptx
 
Memory Management.pdf
Memory Management.pdfMemory Management.pdf
Memory Management.pdf
 
Demand paging
Demand pagingDemand paging
Demand paging
 
In-memory Data Management Trends & Techniques
In-memory Data Management Trends & TechniquesIn-memory Data Management Trends & Techniques
In-memory Data Management Trends & Techniques
 
Driver development – memory management
Driver development – memory managementDriver development – memory management
Driver development – memory management
 
Training Webinar: Enterprise application performance with distributed caching
Training Webinar: Enterprise application performance with distributed cachingTraining Webinar: Enterprise application performance with distributed caching
Training Webinar: Enterprise application performance with distributed caching
 
UNIT IV.pptx
UNIT IV.pptxUNIT IV.pptx
UNIT IV.pptx
 
Lecture 8- Virtual Memory Final.pptx
Lecture 8- Virtual Memory Final.pptxLecture 8- Virtual Memory Final.pptx
Lecture 8- Virtual Memory Final.pptx
 
08 virtual memory
08 virtual memory08 virtual memory
08 virtual memory
 

Último

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Último (20)

FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 

Practical ,Transparent Operating System Support For Superpages

  • 1. Practical ,Transparent Operating System Support For SUPERPAGES Authors: Juan Navarro, Sitaram Iyer, Peter Drusche, Alan Cox Presented by: Nadeeshani Hewage
  • 2. Basics & Terminology • TLB? o Caches virtual-to-physical address mappings from the page tables o Faster access • TLB Coverage? o Amount of memory accessible through cached mappings, without TLB misses • Superpage o Memory page for larger size than a Base-page (ordinary memory page) o Multiple sizes o Single entry in TLB per superpage • Memory objects o Occupy some contiguous space in virtual memory o Eg: memory mapped files, code, data • TLB coverage decreasing with growing memory size over years • Motivation to increase TLB coverage without enlargening TLB size.
  • 3. Constraints for Superpages • Page size should be supported by processor. • Contiguous • Starting address in physical & virtual address space must be a multiple of its size. • Single TLB entry for a superpage. Meaning single set of protection attributes and single dirty bit for a superpage.
  • 4. Considerations • Allocation o Allocating physical frames for the memory object. o If there exists a plan to create a superpage by the OS, to satisfy the contiguity requirement,  Relocation based allocation • Create contiguity • Physically copying allocated page frames when it is decided that superpages are likely to be beneficial o Software managed TLB + TLB miss handler o Counter per superpage incremented at TLB miss  Reservation based allocation • Preserve contiguity • Make allocation decisions at page fault time • Decide preferred size of allocation on an algorithm o Eg: Talluri & Hill proposed
  • 5. Fragmentation Control o Release contiguous chunks of inactive memory from previous allocations o Preempt an existing partially used reservation • Promotion o Set of base pages from a potential super page satisfying constraints promoted to a Superpage. • Demotion o Reduce size of a superpage  A smaller super page  Base pages • Eviction o At a demand of memory preassure, inactive superpage is evicted from physical memory.
  • 7. Preferred Superpage Size Policy o Decision made early at process execution o Looks only at attributes of memory object o Decided size  Too large: decision overridden by preempting init reservation  Too small: cannot be reverted o Policy: Pick the maximum superpage size that can be used for an object
  • 9. Preempting Reservations • Free physical memory scarce/ excessively fragmented • Preempt reserved but unused frames • Choice between o Refusing allocation; reserving smaller extent than desired for new allocation o Preempting existing reservation which has enough unallocated frames • Policy: Whenever possible preempt & give space for alloaction
  • 10. Incremental Promotions, Demotions • Promotions o Multiple superpage sizes exist o Superpage gets created when any size supported gets fully populated within a reservation o Increment smaller superpage to next larger size superpage • Demotions o A base page of a superpage targetted to be evicted by page daemon, causes a demotion o Demote to next smaller superpage size
  • 11. Paging Out Dirty Superpages • OS keeps only single dirty bit for the whole superpage. • Problem: only one base page modified; cost of writing full superpage to disk (high cost on I/O) • Practice: Demote clean superpages whenever process attempts to write to them. Repromote later.
  • 12. Reservation Lists • Keep track of reserved page extent that are not fully populated • One list per each page size supported. • Kept sorted by the time of their most recent page frame allocation • Policy: Preempt the extent whose most recent allocation, occurred least recently among all reservations in the list (head)
  • 13. Population Map • Keep track of allocated base pages within each memory object • Queried on every page fault, should support efficient lookups. • Radix tree o Each level corresponds to a page size o Root: max superpage size supported o Non-leaf:  # of supperpage sized virtual regions at next lower level with a population of at least one (somepop)  # of supperpage sized virtual regions at next lower level fully populated (fullpop) • Used for various decisions
  • 14.
  • 15. Wired Page clustering • Pages used for internal datastructures for the OS (implementation on FreeBSD) are wired (not evicted) • At system boot time, clustered together in physical memory. • Gradually as kernel allocates memory they get scattered, causes fragmentation • To avoid fragmentation, identify pages which are to be wired (internal use), cluster them in pools of contiguous physical memory
  • 17. Platform & Workload • Platform o FreeBSD 4.3 kernel as a loadable module o Alpha 21264 processor at 500 MHz o four page sizes: 8KB base pages, 64KB, 512KB and 4MB superpages o 512MB RAM o 64KB data and 64KB instruction L1 caches, virtually indexed and 2-way associative o fully associative TLB with 128 entries for data and 128 for instructions • Workload o FFTW: The Fastest Fourier Transform in the West with a 200x200x200 matrix as input o Matrix: A non-blocked matrix transposition of a1000x1000 matrix o Web: The thttpd web server servicing 50000 requests selected from an access log of the CS departmental web server at Rice University. The working set size of this trace is 238MB, while its data set is 3.6GB.
  • 18. Best case benefits • When free memory is plentiful & non fragmented o Speedup (mean elapsed runtime of 3 runs after an initial warmup) o TLB misses reduction percentage
  • 19. Evaluation ctd. • Fragment system by running web server & feeding it with requests • FFTW run on fragmented system 4 times in sequence. • Goal : To see how quickly the system recovers just enough contiguous memory to build superpages • 2 contiguity restoration techniques o Cache scheme  Treat all cache pages as available, coalesces them into allocator  0 superpages o Daemon scheme  Contiguity aware page replacement + wired page clustering  Get 20/60, 35/60, 38/60, 40/60 superpages
  • 20. Even if the superpages implementation is there it requires a promising fragmentation control mechanism to acquire promising results.
  • 21. With superpages and effective fragmentation control, evaluations have given promising results.