Memory Management
Memory management
✦   Basic memory management
✦   Swapping
✦   Virtual memory
✦   Page replacement algorithms
✦   Modeling page replacement algorithms
✦   Design issues for paging systems
✦   Implementation issues
✦   Segmentation
In an ideal world…
✦   The ideal world has memory that is
    • Very large
     • Very fast
      • Non-volatile (doesn’t go away when power is turned
        off)
✦   The real world has memory that is:
    • Very large
     • Very fast
      • Affordable!
         Pick any two…
✦   Memory management goal: make the real world
    look as much like the ideal world as possible
Memory hierarchy
✦   What is the memory hierarchy?
    • Different levels of memory
     • Some are small & fast
      • Others are large & slow
✦   What levels are usually included?
    • Cache: small amount of fast, expensive memory
        - L1 (level 1) cache: usually on the CPU chip
        - L2: may be on or off chip
        - L3 cache: off-chip, made of SRAM
     • Main memory: medium-speed, medium price
        memory (DRAM)
      • Disk: many gigabytes of slow, cheap, non-volatile
        storage
✦   Memory manager handles the memory hierarchy
Basic memory management
✦   Components include
    • Operating system (perhaps with device drivers)
     • Single process
✦   Goal: lay these out in memory
    • Memory protection may not be an issue (only one program)
     • Flexibility may still be useful (allow OS changes, etc.)
✦   No swapping or paging
    [Diagram: three layouts, each spanning addresses 0 to 0xFFFF: (a) OS (RAM) at the bottom with the user program (RAM) above it; (b) OS (ROM) at the top with the user program (RAM) below; (c) device drivers (ROM) at the top, user program (RAM) in the middle, OS (RAM) at the bottom]
Fixed partitions: multiple
programs
✦   Fixed memory partitions
    • Divide memory into fixed spaces
     • Assign a process to a space when it’s free
✦   Mechanisms
    • Separate input queues for each partition
     • Single input queue: better ability to optimize CPU usage
    [Diagram: memory divided into OS (0–100K) and partitions 1–4 with boundaries at 100K, 500K, 600K, 700K, and 900K; left, a separate input queue per partition; right, a single input queue feeding all partitions]
How many processes are
enough?
✦   Several memory partitions (fixed or variable size)
✦   Lots of processes wanting to use the CPU
✦   Tradeoff
    • More processes utilize the CPU better
     • Fewer processes use less memory (cheaper!)
✦   How many processes do we need to keep the
    CPU fully utilized?
    • This will help determine how much memory we need
     • Is this still relevant with memory costing $15/GB?
Modeling multiprogramming
✦   More I/O wait means less
    processor utilization
    • At 20% I/O wait, 3–4 processes fully utilize the CPU
     • At 80% I/O wait, even 10
       processes aren’t enough
✦   This means that the OS
    should have more
    processes if they’re I/O
    bound
✦   More processes ⇒
    memory management &
    protection more
    important!
Multiprogrammed system
performance
✦   Arrival and work requirements of 4 jobs
✦   CPU utilization for 1– 4 jobs with 80% I/O wait
✦   Sequence of events as jobs arrive and finish
    • Numbers show amount of CPU time jobs get in each interval
     • More processes ⇒ better utilization, less time per process
    Job   Arrival time   CPU needed
    1     10:00          4
    2     10:10          3
    3     10:15          2
    4     10:20          2

    Jobs running    1      2      3      4
    CPU idle        0.80   0.64   0.51   0.41
    CPU busy        0.20   0.36   0.49   0.59
    CPU/process     0.20   0.18   0.16   0.15

    [Diagram: timeline from 0 to 31.7 showing when jobs 1–4 run, with interval boundaries at 10, 15, 20, 22, 27.6, 28.2, and 31.7]
Memory and multiprogramming
✦   Memory needs two things for multiprogramming
    • Relocation
     • Protection
✦   The OS cannot be certain where a program will
    be loaded in memory
    • Variables and procedures can’t use absolute
       locations in memory
     • Several ways to guarantee this
✦   The OS must keep processes’ memory separate
    • Protect a process from other processes reading or
       modifying its own memory
     • Protect a process from modifying its own memory in
       undesirable ways (such as writing to program code)
Base and limit registers
✦   Special CPU registers: base & limit
    • Access to the registers limited to system mode
    • Registers contain
      - Base: start of the process's memory partition
      - Limit: length of the process's memory partition
✦   Address generation
    • Physical address: location in actual memory
    • Logical address: location from the process's point of view
    • Physical address = base + logical address
    • Logical address larger than limit ⇒ error
    [Diagram: a process partition of length 0x2000 (limit) starting at base 0x9000, above the OS at address 0; logical address 0x1204 translates to physical address 0x1204 + 0x9000 = 0xa204]
Swapping
    [Diagram: seven snapshots over time as processes A, B, C, and D are swapped in and out of memory above the OS]
✦   Memory allocation changes as
    • Processes come into memory
     • Processes leave memory
       - Swapped to disk
       - Complete execution
✦   Gray regions are unused memory
Swapping: leaving room to grow
✦   Need to allow for programs to grow
    • Allocate more memory for data
    • Larger stack
✦   Handled by allocating more space than is necessary at the start
    • Inefficient: wastes memory that's not currently in use
    • What if the process requests too much memory?
    [Diagram: processes A and B each laid out bottom-up as code, data, stack, with room to grow between the data and the stack]
Tracking memory usage: bitmaps
✦   Keep track of free / allocated memory regions with a bitmap
    • One bit in map corresponds to a fixed-size region of memory
     • Bitmap is a constant size for a given amount of memory regardless of
       how much is allocated at a particular time
✦   Chunk size determines efficiency
    • At 1 bit per 4 KB chunk, we need just 256 bits (32 bytes) per MB of memory
     • For smaller chunks, we need more memory for the bitmap
      • Can be difficult to find large contiguous free areas in bitmap
    [Diagram: memory regions A–D laid out across 32 chunks (marks at 8, 16, 24, 32), with the corresponding bitmap 11111100 00111000 01111111 11111000]
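A small C sketch of this bookkeeping, assuming 4 KB chunks over a 1 MB region (256 bits of map, as above); the helper names are made up for illustration.

```c
#include <stdint.h>

/* 1 bit per 4 KB chunk; 256 chunks cover 1 MB with 32 bytes of map. */
#define CHUNKS 256
static uint8_t bitmap[CHUNKS / 8];

static int  get_bit(int i) { return (bitmap[i / 8] >> (i % 8)) & 1; }
static void set_bit(int i, int v)
{
    if (v) bitmap[i / 8] |=  (1 << (i % 8));
    else   bitmap[i / 8] &= ~(1 << (i % 8));
}

/* Find n contiguous free chunks and mark them allocated; returns the
 * first chunk index or -1. The linear scan for a long run of zero
 * bits is exactly why large allocations are slow with bitmaps. */
int bitmap_alloc(int n)
{
    int run = 0;
    for (int i = 0; i < CHUNKS; i++) {
        run = get_bit(i) ? 0 : run + 1;
        if (run == n) {
            int start = i - n + 1;
            for (int j = start; j <= i; j++) set_bit(j, 1);
            return start;
        }
    }
    return -1;    /* no large enough contiguous free area */
}
```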
Tracking memory usage: linked
lists
✦   Keep track of free / allocated memory regions with a linked list
    • Each entry in the list corresponds to a contiguous region of memory
     • Entry can indicate either allocated or free (and, optionally, owning
        process)
      • May have separate lists for free and allocated areas
✦   Efficient if chunks are large
    • Fixed-size representation for each region
     • More regions → more space needed for free lists

    [Diagram: the same regions as a linked list of (status, start, length) entries: A 0 6 → free 6 4 → B 10 3 → free 13 4 → C 17 9 → D 26 3 → free 29 3]
Allocating memory
✦   Search through region list to find a large enough space
✦   Suppose there are several choices: which one to use?
    • First fit: the first suitable hole on the list
     • Next fit: the first suitable after the previously allocated hole
      • Best fit: the smallest hole that is larger than the desired region
         (wastes least space?)
       • Worst fit: the largest available hole (leaves largest fragment)
✦   Option: maintain separate queues for different-size holes
    [Diagram: a free list of holes (start, length): 6 5, 19 14, 52 25, 102 30, 135 16, 202 10, 302 20, 350 30, 411 19, 510 3, used to illustrate where first fit (20 blocks), next fit (12), best fit (13), and worst fit (15) would allocate]
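As a concrete illustration, here is a minimal first-fit sketch over a singly linked hole list; the struct layout and names are assumptions, not from the slides. Best fit would instead scan the entire list for the smallest adequate hole, and next fit would resume scanning from where the last allocation stopped.

```c
#include <stddef.h>
#include <stdlib.h>

/* A hole: a run of free blocks [start, start + len). */
struct hole {
    size_t start, len;
    struct hole *next;
};

/* First fit: take the first hole big enough, shrinking it in place.
 * Returns the start block, or (size_t)-1 if nothing fits. */
size_t first_fit(struct hole **list, size_t want)
{
    for (struct hole **pp = list; *pp; pp = &(*pp)->next) {
        struct hole *h = *pp;
        if (h->len >= want) {
            size_t start = h->start;
            h->start += want;
            h->len   -= want;
            if (h->len == 0) {      /* exact fit: unlink the hole */
                *pp = h->next;
                free(h);
            }
            return start;
        }
    }
    return (size_t)-1;
}
```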
Freeing memory
✦   Allocation structures must be updated when memory is
    freed
✦   Easy with bitmaps: just set the appropriate bits in the
    bitmap
✦   Linked lists: modify adjacent elements as needed
    • Merge adjacent free regions into a single region
     • May involve merging two regions with the just-freed area

    [Diagram: four coalescing cases when freeing X: neighbors A and B both allocated (X simply becomes a hole); A allocated with a hole after X (merge X with the following hole); a hole before X with B allocated (merge X with the preceding hole); holes on both sides (merge all three into one hole)]
Knuth's observations
✦   Fifty-percent rule: if the number of processes in memory is n, the mean number of holes is n/2
    • In equilibrium, half of the operations above a given hole are allocations and half are deallocations: on average, one hole per two processes
✦   Unused-memory rule: (n/2) · k · s = m − n · s, where
    • m = total memory
    • s = average process size, k · s = average hole size
    • Fraction of memory wasted: f = k / (k + 2)
✦   Overhead of paging: (process size × page table entry size) / page size + page size / 2
✦   Some interesting results
    • If n is the number of allocated areas, then n/2 is the number of holes for “simple” allocation algorithms (not buddy!) in equilibrium
    • Dynamic storage allocation strategies that never relocate reserved blocks cannot guarantee memory efficiency
      - With blocks of sizes 1 and 2, memory can run out even when only 2/3 full
      - Example: 23 seats in a row, and groups of 1 and 2 arrive; do we ever need to split a pair to seat a group? Not if no more than 16 people are present. Solution: never give a single seat 2, 5, 8, …, 20
      - Not possible with 22 seats: the guarantee fails with as few as 14 present
Limitations of swapping
✦   Problems with swapping
    • Process must fit into physical memory (impossible to
        run larger processes)
     • Memory becomes fragmented
        - External fragmentation: lots of small free areas
        - Compaction needed to reassemble larger free
          areas
      • Processes are either in memory or on disk: half and
        half doesn’t do any good
✦   Overlays solved the first problem
    • Bring in pieces of the process over time (typically
       data)
     • Still doesn’t solve the problem of fragmentation or
       partially resident processes
Virtual memory
✦   Basic idea: allow the OS to hand out more
    memory than exists on the system
✦   Keep recently used stuff in physical memory
✦   Move less recently used stuff to disk
✦   Keep all of this hidden from processes
    • Processes still see an address space from 0 to max_address
     • Movement of information to and from disk handled by
       the OS without process help
✦   Virtual memory (VM) especially helpful in
    multiprogrammed systems
    • CPU schedules process B while process A waits for
      its memory to be retrieved from disk
Virtual and physical addresses
✦   Program uses virtual addresses
    • Addresses local to the process
    • Hardware translates virtual address to physical address
✦   Translation done by the Memory Management Unit (MMU)
    • Usually on the same chip as the CPU
    • Only physical addresses leave the CPU/MMU chip
✦   Physical memory indexed by physical addresses
    [Diagram: the CPU sends virtual addresses to the MMU on the CPU chip; physical addresses travel on the bus to memory and the disk controller]
Paging and page tables
✦   Virtual addresses mapped to physical addresses
    • Unit of mapping is called a page
    • All addresses in the same virtual page are in the same physical page
    • Page table entry (PTE) contains translation for a single page
✦   Table translates virtual page number to physical page number
    • Not all virtual memory has a physical page
    • Not every physical page need be used
✦   Example: 64 KB virtual memory, 16 KB physical memory
    [Diagram: sixteen 4 KB virtual pages (0–64K); pages 52–56K, 36–40K, 16–20K, and 12–16K map to physical frames 0, 3, 1, and 2; the rest are unmapped (X)]
What’s in a page table entry?
✦   Each entry in the page table contains
    • Valid bit: set if this logical page number has a corresponding
          physical frame in memory
          - If not valid, remainder of PTE is irrelevant
     • Page frame number: page in physical memory
      • Referenced bit: set if data on the page has been accessed
       • Dirty (modified) bit: set if data on the page has been modified
        • Protection information
    [PTE layout: protection bits | D (dirty) | R (referenced) | V (valid) | page frame number]
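One possible C rendering of such an entry as a 32-bit bitfield struct; the exact bit positions and field widths are illustrative assumptions, since real architectures lay these out differently.

```c
#include <stdint.h>

/* Sketch of a 32-bit PTE with the fields named on this slide.
 * Widths and ordering are assumptions for illustration only. */
typedef struct {
    uint32_t valid      : 1;   /* page present in a physical frame  */
    uint32_t referenced : 1;   /* set by hardware on any access     */
    uint32_t dirty      : 1;   /* set by hardware on a write        */
    uint32_t protection : 3;   /* e.g., read / write / execute      */
    uint32_t frame      : 20;  /* physical page frame number        */
    uint32_t unused     : 6;
} pte_t;
```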
Mapping logical addresses to
physical addresses
✦   Split address from CPU into two pieces
    • Page number (p): index into the page table, which contains the base address of the page in physical memory
    • Page offset (d): added to the base address to get the actual physical memory address
✦   Page size = 2^d bytes
✦   Example
    • 4 KB (= 4096-byte) pages, 32-bit logical addresses
    • 2^d = 4096 ⇒ d = 12
    • Logical address: 20-bit page number (32 − 12), 12-bit offset

      | p (20 bits) | d (12 bits) |
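The split is just shifts and masks. A minimal C sketch for the 4 KB-page example above (helper names are mine, not from the slides):

```c
#include <stdint.h>

/* 4 KB (2^12-byte) pages: 20-bit page number, 12-bit offset. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)        /* 4096 */

static inline uint32_t page_number(uint32_t va) { return va >> PAGE_SHIFT; }
static inline uint32_t page_offset(uint32_t va) { return va & (PAGE_SIZE - 1); }

/* Physical address = frame number (from the page table) with the
 * unchanged offset appended. */
static inline uint32_t phys_addr(uint32_t frame, uint32_t va)
{
    return (frame << PAGE_SHIFT) | page_offset(va);
}
```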
Address translation architecture
    [Diagram: the CPU issues (p, d); p indexes the page table to find frame number f; the physical address (f, d) then indexes physical memory]
Memory & paging structures
    [Diagram: process P0's pages 0–4 map to physical frames 6, 3, 4, 9, 2; process P1's pages 0–1 map to frames 8, 0; frames 1, 5, and 7 are free]
Two-level page tables
✦   Problem: page tables can be too large
    • 2^32 bytes in 4 KB pages ⇒ 1 million PTEs
✦   Solution: use multi-level page tables
    • “Page size” in the first page table is large (megabytes)
    • A PTE marked invalid in the first page table needs no 2nd level page table
✦   1st level page table has pointers to 2nd level page tables
✦   2nd level page table has actual physical page numbers in it
    [Diagram: entries in the level 1 page table point to level 2 page tables, whose entries point to pages in memory]
More on two-level page tables
✦   Tradeoffs between 1st and 2nd level page table
    sizes
    • Total number of bits indexing 1st and 2nd level table
       is constant for a given page size and logical address
       length
     • Tradeoff between number of bits indexing 1st and
       number indexing 2nd level tables
       - More bits in 1st level: fine granularity at 2nd level
       - Fewer bits in 1st level: maybe less wasted space?
✦   All addresses in table are physical addresses
✦   Protection bits kept in 2nd level table
Two-level paging: example
✦   System characteristics
    •   8 KB pages
    •   32-bit logical address divided into 13-bit page offset, 19-bit page number
✦   Page number divided into:
    •   10-bit 1st level index (p1)
    •   9-bit 2nd level index (p2)
✦   Logical address looks like this:
    •   p1 is an index into the 1st level page table
    •   p2 is an index into the 2nd level page table pointed to by p1

      | p1 = 10 bits | p2 = 9 bits | offset = 13 bits |
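A small C sketch of extracting p1, p2, and the offset for these field widths; the helper names are illustrative.

```c
#include <stdint.h>

/* Field widths from this example: 10-bit p1, 9-bit p2, 13-bit
 * offset (8 KB pages). The shifts and masks follow directly. */
#define OFFSET_BITS 13
#define P2_BITS      9

static inline uint32_t p1(uint32_t va)
{
    return va >> (OFFSET_BITS + P2_BITS);           /* top 10 bits  */
}
static inline uint32_t p2(uint32_t va)
{
    return (va >> OFFSET_BITS) & ((1u << P2_BITS) - 1);
}
static inline uint32_t off(uint32_t va)
{
    return va & ((1u << OFFSET_BITS) - 1);          /* low 13 bits  */
}
```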
2-level address translation
example
Implementing page tables in hardware
✦   Page table resides in main (physical) memory
✦   CPU uses special registers for paging
    • Page table base register (PTBR) points to the page table
     • Page table length register (PTLR) contains length of page
       table: restricts maximum legal logical address
✦   Translating an address requires two memory
    accesses
    • First access reads page table entry (PTE)
     • Second access reads the data / instruction from memory
✦   Reduce number of memory accesses
    • Can’t avoid second access (we need the value from
       memory)
     • Eliminate first access by keeping a hardware cache
       (called a translation lookaside buffer or TLB) of recently
       used page table entries
Translation Lookaside Buffer
       (TLB)
✦   Search the TLB for the desired logical page number
    • Search entries in parallel
    • Use standard cache techniques
✦   If the desired logical page number is found, get the frame number from the TLB
✦   If the desired logical page number isn't found
    • Get frame number from page table in memory
    • Replace an entry in the TLB with the logical & physical page numbers from this reference

    Example TLB (logical page # → physical frame #):
    8→3, (unused), 2→1, 3→0, 12→12, 29→6, 22→11, 7→4
Handling TLB misses
✦   If PTE isn't found in TLB, OS needs to do the lookup
    in the page table
✦   Lookup can be done in hardware or software
✦   Hardware TLB replacement
    • CPU hardware does page table lookup
     • Can be faster than software
      • Less flexible than software, and more complex hardware
✦   Software TLB replacement
    • OS gets TLB exception
     • Exception handler does page table lookup & places the
         result into the TLB
      • Program continues after return from exception
       • Larger TLB (lower miss rate) can make this feasible
How long do memory accesses
      take?
✦   Assume the following times:
    • TLB lookup time = a (often zero: overlapped in CPU)
     • Memory access time = m
✦     Hit ratio (h) is percentage of time that a logical page
    number is found in the TLB
    • Larger TLB usually means higher h
     • TLB structure can affect h as well
✦   Effective access time (an average) is calculated as:
    • EAT = (m + a)·h + (2m + a)·(1 − h)
    • EAT = a + (2 − h)·m
✦   Interpretation
    • Reference always requires TLB lookup, 1 memory access
     • TLB misses also require an additional memory reference
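To make the formula concrete, a tiny C sketch; the numbers in main are made-up illustrations, not from the slides.

```c
#include <stdio.h>

/* EAT = a + (2 - h) * m, as derived on the slide. */
double eat(double a, double m, double h)
{
    return a + (2.0 - h) * m;
}

int main(void)
{
    /* Assumed values: 100 ns memory access, free TLB lookup,
     * 98% hit ratio -> 102 ns effective access time. */
    printf("%.1f ns\n", eat(0.0, 100.0, 0.98));
    return 0;
}
```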
Inverted page table
✦   Reduce page table size further: keep one entry for
    each frame in memory
    • Alternative: merge tables for pages in memory and on disk
✦   PTE contains
    • Virtual address pointing to this frame
     • Information about the process that owns this page
✦   Search page table by
    • Hashing the virtual page number and process ID
     • Starting at the entry corresponding to the hash result
      • Search until either the entry is found or a limit is reached
✦   Page frame number is index of PTE
✦   Improve performance by using more advanced hashing
    algorithms
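A hedged C sketch of such a lookup, assuming a hash-plus-linear-probe organization; the table layout, hash function, and probe strategy are illustrative assumptions rather than any particular machine's design.

```c
#include <stdint.h>

/* One entry per physical frame, searched by <pid, virtual page>. */
#define NFRAMES 1024

struct ipte { uint32_t pid, vpage; int valid; };
static struct ipte table[NFRAMES];

static uint32_t hash(uint32_t pid, uint32_t vpage)
{
    return (pid * 2654435761u ^ vpage) % NFRAMES;   /* illustrative */
}

/* Returns the frame number (the matching entry's index), or -1 if
 * the page is not resident and must be fetched via the disk map. */
int ipt_lookup(uint32_t pid, uint32_t vpage)
{
    uint32_t h = hash(pid, vpage);
    for (int probe = 0; probe < NFRAMES; probe++) {
        uint32_t i = (h + probe) % NFRAMES;         /* linear probe */
        if (table[i].valid && table[i].pid == pid &&
            table[i].vpage == vpage)
            return (int)i;
    }
    return -1;
}
```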
Inverted page table architecture
    [Diagram: one-to-one correspondence between page table entries and pages in memory]
Memory Management
 Part 2: Paging Algorithms and
      Implementation Issues
Page replacement algorithms
✦   Page fault forces a choice
    • No room for new page (steady state)
     • Which page must be removed to make room for an
       incoming page?
✦   How is a page removed from physical memory?
    • If the page is unmodified, simply overwrite it: a copy
       already exists on disk
     • If the page has been modified, it must be written
       back to disk: prefer unmodified pages?
✦   Better not to choose an often used page
    • It’ll probably need to be brought back in soon
Optimal page replacement
algorithm
✦   What’s the best we can possibly do?
    • Assume perfect knowledge of the future
     • Not realizable in practice (usually)
      • Useful for comparison: if another algorithm is within
        5% of optimal, not much more can be done…
✦   Algorithm: replace the page that will be used
    furthest in the future
    • Only works if we know the whole sequence!
     • Can be approximated by running the program twice
       - Once to generate the reference trace
       - Once (or more) to apply the optimal algorithm
✦   Nice, but not achievable in real systems!
Not-recently-used (NRU)
     algorithm
✦   Each page has reference bit and dirty bit
    • Bits are set when page is referenced and/or modified
✦   Pages are classified into four classes
    • 0: not referenced, not dirty
     • 1: not referenced, dirty
      • 2: referenced, not dirty
       • 3: referenced, dirty
✦   Clear reference bit for all pages periodically
    • Can’t clear dirty bit: needed to indicate which pages need to be
       flushed to disk
     • Class 1 contains dirty pages where reference bit has been cleared
✦   Algorithm: remove a page from the lowest non-empty class
    • Select a page at random from that class
✦   Easy to understand and implement
✦   Performance adequate (though not optimal)
First-In, First-Out (FIFO) algorithm
✦   Maintain a linked list of all pages
    • Maintain the order in which they entered memory
✦   Page at front of list replaced
✦   Advantage: (really) easy to implement
✦   Disadvantage: page in memory the longest may
    be often used
    • This algorithm forces pages out regardless of usage
     • Usage may be helpful in determining which pages to
       keep
Second chance page replacement
✦   Modify FIFO to avoid throwing out heavily used
    pages
    • If reference bit is 0, throw the page out
     • If reference bit is 1
       - Reset the reference bit to 0
       - Move page to the tail of the list
       - Continue search for a free page
✦   Still easy to implement, and better than plain FIFO
    [Diagram: pages A–H in FIFO order with load times t = 0, 4, 8, 15, 21, 22, 29, 30; when A is referenced it is moved to the tail with t = 32 instead of being evicted]
Clock algorithm
✦   Same functionality as second chance
✦   Simpler implementation
    • “Clock” hand points to next page to replace
    • If R=0, replace page
    • If R=1, set R=0 and advance the clock hand
✦   Continue until page with R=0 is found
    • This may involve going all the way around the clock…
    [Diagram: pages A–H arranged in a circle with their load times; the clock hand advances around the circle looking for an unreferenced page]
Least Recently Used (LRU)
✦   Assume pages used recently will be used again soon
    • Throw out page that has been unused for longest time
✦   Must keep a linked list of pages
    • Most recently used at front, least at rear
     • Update this list every memory reference!
       - This can be somewhat slow: hardware has to update a
         linked list on every reference!
✦   Alternatively, keep counter in each page table entry
    • Global counter increments with each CPU cycle
     • Copy global counter to PTE counter on a reference to the
        page
      • For replacement, evict page with lowest counter value
Simulating LRU in software
✦   Few computers have the necessary hardware to
    implement full LRU
    • Linked-list method impractical in hardware
     • Counter-based method could be done, but it’s slow to
       find the desired page
✦   Approximate LRU with Not Frequently Used
    (NFU) algorithm
    • At each clock interrupt, scan through page table
     • If R=1 for a page, add one to its counter value
      • On replacement, pick the page with the lowest
        counter value
✦   Problem: no notion of age: pages with high counter
    values will tend to keep them!
Aging replacement algorithm
✦   Reduce counter values over time
    • Divide by two every clock cycle (use right shift)
     • More weight given to more recent references!
✦   Select page to be evicted by finding the lowest counter value
✦   Algorithm is:
    • Every clock tick, shift all counters right by 1 bit
     • On reference, set leftmost bit of a counter (can be done by copying
       the reference bit to the counter at the clock tick)
    Counter values (leftmost bit is set when the page was referenced that tick):

    Page     Tick 0     Tick 1     Tick 2     Tick 3     Tick 4
    Page 0   10000000   11000000   11100000   01110000   10111000
    Page 1   00000000   10000000   01000000   00100000   00010000
    Page 2   10000000   01000000   00100000   10010000   01001000
    Page 3   00000000   00000000   00000000   10000000   01000000
    Page 4   10000000   01000000   10100000   11010000   01101000
    Page 5   10000000   11000000   01100000   10110000   11011000
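A minimal C sketch of the per-tick update and victim selection, assuming 8-bit counters as in the table; function and array names are illustrative.

```c
#include <stdint.h>
#include <stdbool.h>

#define NPAGES 6

/* Called once per clock tick with the hardware reference bits:
 * shift every counter right, then OR the R bit into the top. */
void age_counters(uint8_t counter[NPAGES], const bool referenced[NPAGES])
{
    for (int i = 0; i < NPAGES; i++) {
        counter[i] >>= 1;                    /* divide by two        */
        if (referenced[i])
            counter[i] |= 0x80;              /* set the leftmost bit */
    }
}

/* Victim is the page with the lowest counter value. */
int aging_pick_victim(const uint8_t counter[NPAGES])
{
    int victim = 0;
    for (int i = 1; i < NPAGES; i++)
        if (counter[i] < counter[victim])
            victim = i;
    return victim;
}
```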
Working set
✦   Demand paging: bring a page into memory when
    it’s requested by the process
✦   How many pages are needed?
    • Could be all of them, but not likely
     • Instead, processes reference a small set of pages at any
        given time (locality of reference)
      • Set of pages can be different for different processes or
        even different times in the running of a single process
✦   Set of pages used by a process in a given interval
    of time is called the working set
    • If entire working set is in memory, no page faults!
     • If insufficient space for working set, thrashing may occur
      • Goal: keep most of working set in memory to minimize
        the number of page faults suffered by a process
How big is the working set?
    [Graph: w(k,t) vs. k; the working-set size rises steeply for small k, then levels off]
✦   Working set is the set of pages used by the k most
    recent memory references
✦   w(k,t) is the size of the working set at time t
✦   Working set may change over time
    • Size of working set can change over time as well…
Working set page replacement
algorithm
Page replacement algorithms:
 summary
        Algorithm                              Comment
OPT (Optimal)                Not implementable, but useful as a benchmark
NRU (Not Recently Used)      Crude
FIFO (First-In, First-Out)   Might throw out useful pages
Second chance                Big improvement over FIFO
Clock                        Better implementation of second chance
LRU (Least Recently Used)    Excellent, but hard to implement exactly
NFU (Not Frequently Used)    Poor approximation to LRU
Aging                        Good approximation to LRU, inefficient to implement
Working Set                  Somewhat expensive to implement
WSClock                      Implementable version of Working Set
Modeling page replacement
algorithms
✦   Goal: provide quantitative analysis (or simulation)
    showing which algorithms do better
    • Workload (page reference string) is important:
       different strings may favor different algorithms
     • Show tradeoffs between algorithms
✦   Compare algorithms to one another
✦   Model parameters within an algorithm
    • Number of available physical pages
     • Number of bits for aging
How is modeling done?
✦   Generate a list of references
    •   Artificial (made up)
    •   Trace a real workload (set of processes)
✦   Use an array (or other structure) to track the pages in physical
    memory at any given time
    •   May keep other information per page to help simulate the algorithm
        (modification time, time when paged in, etc.)
✦   Run through references, applying the replacement algorithm
✦   Example: FIFO replacement on reference string 0 1 2 3 0 1 4 0 1 2 3 4
    •   Page replacements highlighted in yellow

            Page               0 1 2 3 0 1 4 0 1 2 3 4
            referenced
            Youngest page      0 1 2 3 0 1 4 4 4 2 3 3
                                   0 1 2 3 0 1 1 1 4 2 2
            Oldest page               0 1 2 3 0 0 0 1 4 4
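The same simulation as a small, self-contained C program (the structure and names are mine, not from the slides); it reproduces the 9 faults counted in the table above.

```c
#include <stdio.h>

#define FRAMES 3

/* FIFO simulation: the circular 'next' pointer always indicates the
 * oldest resident page, so replacing mem[next] is FIFO order. */
int fifo_faults(const int *refs, int n)
{
    int mem[FRAMES], next = 0, used = 0, faults = 0;
    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (mem[j] == refs[i]) hit = 1;
        if (!hit) {
            mem[next] = refs[i];           /* evict the oldest page */
            next = (next + 1) % FRAMES;
            if (used < FRAMES) used++;
            faults++;
        }
    }
    return faults;
}

int main(void)
{
    int refs[] = {0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4};
    printf("%d faults\n", fifo_faults(refs, 12));   /* prints 9 */
    return 0;
}
```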
Belady’s anomaly
✦   Reduce the number of page faults by supplying more
    memory
    • Use previous reference string and FIFO algorithm
     • Add another page to physical memory (total 4 pages)
✦   More page faults (10 vs. 9), not fewer!
    • This is called Belady’s anomaly
     • Adding more pages shouldn’t result in worse performance!
✦   Motivated the study of paging algorithms
            Page referenced   0 1 2 3 0 1 4 0 1 2 3 4
             Youngest page    0 1 2 3 3 3 4 0 1 2 3 4
                                0 1 2 2 2 3 4 0 1 2 3
                                  0 1 1 1 2 3 4 0 1 2
              Oldest page           0 0 0 1 2 3 4 0 1
Modeling more replacement
algorithms
✦   Paging system characterized by:
    • Reference string of executing process
     • Page replacement algorithm
      • Number of page frames available in physical memory
        (m)
✦   Model this by keeping track of all n pages
    referenced in array M
    • Top part of M has m pages in memory
     • Bottom part of M has n-m pages stored on disk
✦   Page replacement occurs when page moves
    from top to bottom
    • Top and bottom parts may be rearranged without
      causing movement between memory and disk
Example: LRU
✦   Model LRU replacement with
    • 8 unique references in the reference string
     • 4 pages of physical memory
✦   Array state over time shown below
✦   LRU treats list of pages like a stack

    Page referenced 0 2 1 3 5 4 6 3 7 4 7 3 3 5 5 3 1 1 1 7 1 3 4 1
                    0 2 1 3 5 4 6 3 7 4 7 3 3 5 5 3 1 1 1 7 1 3 4 1
     Pages in RAM    0 2 1 3 5 4 6 3 7 4 7 7 3 3 5 3 3 3 1 7 1 3 4
                      0 2 1 3 5 4 6 3 3 4 4 7 7 7 5 5 5 3 3 7 1 3
                        0 2 1 3 5 4 6 6 6 6 4 4 4 7 7 7 5 5 5 7 7
                           0 2 1 1 5 5 5 5 5 6 6 6 4 4 4 4 4 4 5 5
     Pages on disk           0 2 2 1 1 1 1 1 1 1 1 6 6 6 6 6 6 6 6
                               0 0 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
                                   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Stack algorithms
✦   LRU is an example of a stack algorithm
✦   For stack algorithms
    • Any page in memory with m physical pages is also in
       memory with m+1 physical pages
     • Increasing memory size is guaranteed to reduce (or at
       least not increase) the number of page faults
✦   Stack algorithms do not suffer from Belady’s anomaly
✦   Distance of a reference == position of the page in the
    stack before the reference was made
    • Distance is ∞ if no reference had been made before
     • Distance depends on reference string and paging
       algorithm: might be different for LRU and optimal (both
       stack algorithms)
Predicting page fault rates using
distance
✦   Distance can be used to predict page fault rates
✦   Make a single pass over the reference string to
    generate the distance string on-the-fly
✦   Keep an array of counts
    • Entry j counts the number of times distance j occurs
      in the distance string
✦   The number of page faults for a memory of size
    m is the sum of the counts for j>m
    • This can be done in a single pass!
     • Makes for fast simulations of page replacement
       algorithms
✦   This is why virtual memory theorists like stack
    algorithms!
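A sketch of that final summation, assuming the counts array stores the infinite-distance total in counts[0] (a representation choice of this example, not from the slides).

```c
/* counts[j] = number of references at stack distance j, for
 * j = 1..max_dist; counts[0] holds the infinite-distance total.
 * Faults for a memory of m frames = all references whose distance
 * exceeds m, including the infinite-distance ones. */
long faults_for_size(const long *counts, int max_dist, int m)
{
    long faults = counts[0];
    for (int j = m + 1; j <= max_dist; j++)
        faults += counts[j];
    return faults;
}
```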
Local vs. global allocation policies
✦   What is the pool of pages eligible to be replaced?
    • Pages belonging to the process needing a new page
    • All pages in the system
✦   Local allocation: replace a page from this process
    • May be more “fair”: penalize processes that replace many pages
    • Can lead to poor performance: some processes need more pages than others
✦   Global allocation: replace a page from any process
    [Diagram: pages A0–A4, B0–B2, C0–C4 with last access times; when process A faults, local allocation evicts A's own least recently used page, while global allocation evicts the least recently used page in the whole system]
Page fault rate vs. allocated
     frames
✦   Local allocation may be more “fair”
    • Don't penalize other processes for high page fault rate
✦   Global allocation is better for overall system performance
    • Take page frames from processes that don't need them as much
    • Reduce the overall page fault rate (even though the rate for a single process may go up)
    [Graph: page faults/second vs. number of page frames assigned; the fault rate falls from a high-rate region to a low-rate region as more frames are assigned]
Control overall page fault rate
✦   Despite good designs, system may still thrash
✦   Most (or all) processes have high page fault rate
    • Some processes need more memory…
    • …but no processes need less memory (none could give any up)
✦   Problem: no way to reduce page fault rate
✦   Solution: reduce the number of processes competing for memory
    • Swap one or more to disk, divide up pages they held
     • Reconsider degree of multiprogramming
How big should a page be?
✦   Smaller pages have advantages
    • Less internal fragmentation
     • Better fit for various data structures, code sections
      • Less unused physical memory (some pages have 20
        useful bytes and the rest isn’t needed currently)
✦   Larger pages are better because
    • Less overhead to keep track of them
       - Smaller page tables
       - TLB can point to more memory (same number of
         pages, but more memory per page)
       - Faster paging algorithms (fewer table entries to look
         through)
     • More efficient to transfer larger pages to and from disk
Separate I & D address spaces
✦   One user address space for both data & code
    • Simpler
    • Code/data separation harder to enforce
    • More address space?
✦   One address space for data, another for code
    • Code & data separated
    • More complex in hardware
    • Less flexible
    • CPU must handle instructions & data differently
✦   MINIX does the latter
    [Diagram: left, a single address space from 0 to 2^32−1 holding both code and data; right, separate instruction and data spaces, each from 0 to 2^32−1]
Sharing pages
✦   Processes can share pages
    • Entries in page tables point to the same physical
       page frame
     • Easier to do with code: no problems with modification
✦   Virtual addresses in different processes can be…
    • The same: easier to exchange pointers, keep data
       structures consistent
     • Different: may be easier to actually implement
       - Not a problem if there are only a few shared regions
       - Can be very difficult if many processes share
         regions with each other
When are dirty pages written to
disk?
✦   On demand (when they’re replaced)
    • Fewest writes to disk
     • Slower: replacement takes twice as long (must wait
       for disk write and disk read)
✦   Periodically (in the background)
    • Background process scans through page tables,
      writes out dirty pages that are pretty old
✦   Background process also keeps a list of pages
    ready for replacement
    • Page faults handled faster: no need to find space on
       demand
     • Cleaner may use the same structures discussed
       earlier (clock, etc.)
Implementation issues
✦   Four times when OS involved with paging
✦   Process creation
    • Determine program size
     • Create page table
✦   During process execution
    • Reset the MMU for new process
     • Flush the TLB (or reload it from saved state)
✦   Page fault time
    • Determine virtual address causing fault
     • Swap target page out, needed page in
✦   Process termination time
    • Release page table
     • Return pages to the free pool
How is a page fault handled?
✦   Hardware causes a page fault
✦   General registers saved (as on every exception)
✦   OS determines which virtual page needed
    • Actual fault address in a special register
     • Address of faulting instruction in register
       - Page fault was in fetching instruction, or
       - Page fault was in fetching operands for instruction
       - OS must figure out which…
✦   OS checks validity of address
✦   Process killed if address was illegal
✦   OS finds a free page frame for the new page
✦   If frame selected for replacement is dirty, write it out to disk
✦   OS requests the new page from disk
✦   Page tables updated
✦   Faulting instruction backed up so it can be restarted
✦   Faulting process scheduled
✦   Registers restored
✦   Program continues
Backing up an instruction
✦   Problem: page fault happens in the middle of
    instruction execution
    • Some changes may have already happened
     • Others may be waiting for VM to be fixed
✦   Solution: undo all of the changes made by the
    instruction
    • Restart instruction from the beginning
     • This is easier on some architectures than others
✦   Example: LW R1, 12(R2)
    • Page fault in fetching instruction: nothing to undo
     • Page fault in getting value at 12(R2): restart instruction
✦   Example: ADD (Rd)+,(Rs1)+,(Rs2)+
    • Page fault in writing to (Rd): may have to undo an awful
      lot…
Locking pages in memory
✦   Virtual memory and I/O occasionally interact
✦   P1 issues call for read from device into buffer
    • While it’s waiting for I/O, P2 runs
     • P2 has a page fault
      • P1’s I/O buffer might be chosen to be paged out
        - This can create a problem because an I/O device is
          going to write to the buffer on P1’s behalf
✦   Solution: allow some pages to be locked into
    memory
    • Locked pages are immune from being replaced
     • Pages only stay locked for (relatively) short periods
Storing pages on disk
✦   Pages removed from memory are stored on disk
✦   Where are they placed?
    •   Static swap area: easier to code, less flexible
    •   Dynamically allocated space: more flexible, harder to locate a page
        - Dynamic placement often uses a special file (managed by the file system)
          to hold pages
✦   Need to keep track of which pages are where within the on-disk
    storage
    [Diagram: left, page table entries point directly into a fixed swap area on disk; right, pages live in a dynamically allocated swap file, so a separate map tracks where each page is stored]
Separating policy and mechanism
✦   Mechanism for page replacement has to be in the kernel
    • Modifying page tables
    • Reading and writing page table entries
✦   Policy for deciding which pages to replace could be in user space
    • More flexibility
    [Diagram: 1. page fault traps to the kernel fault handler; 2. fault handler tells the external pager in user space which page is needed; 3. external pager requests the page; 4. page arrives; 5. pager hands the page to the fault handler; 6. the MMU handler maps in the page]
Why use segmentation?
✦   Different “units” in a single virtual address space
    • Each unit can grow
    • How can they be kept apart?
    • Example: symbol table runs out of space
✦   Solution: segmentation
    • Give each unit its own address space
    [Diagram: one virtual address space holding the call stack, constants, source text, and symbol table, with allocated and in-use regions marked; the symbol table has nowhere left to grow]
Using segments
✦   Each region of the process has its own segment
✦   Each segment can start at 0
    • Addresses within the segment relative to the segment
      start
✦   Virtual addresses are <segment #, offset within
    segment>
    [Diagram: four segments, each with addresses starting at 0: segment 0 holds the symbol table, segment 1 the source text, segment 2 the constants, segment 3 the call stack]
Paging vs. segmentation
    What?                                        Paging                Segmentation
    Does the programmer need to know about it?   No                    Yes
    How many linear address spaces?              One                   Many
    More addresses than physical memory?         Yes                   Yes
    Separate protection for different objects?   Not really            Yes
    Variable-sized objects handled with ease?    No                    Yes
    Is sharing easy?                             No                    Yes
    Why use it?                                  More address space    Break programs into logical
                                                 without buying more   pieces that are handled
                                                 memory                separately
Implementing segmentation
    [Diagram: successive snapshots as segments of various sizes are swapped in and out, leaving scattered free holes between them (external fragmentation)]
    Need to do memory compaction!
A better way: segmentation with
paging
Translating an address in
MULTICS
Memory management in the
Pentium
✦   Memory composed of segments
    •   Segment pointed to by segment descriptor
    •   Segment selector used to identify descriptor
✦   Segment descriptor describes segment
    •   Base virtual address
    •   Size
    •   Protection
    •   Code / data
Converting segment to linear
address
✦   Selector identifies segment descriptor
    • Limited number of selectors available in the CPU
✦   Offset added to segment's base address
✦   Result is a virtual address that will be translated by paging
    [Diagram: the selector picks a descriptor containing base, limit, and other info; the offset is added to the base to form the 32-bit linear address]
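A schematic C sketch of the selector-plus-offset calculation; the descriptor table here is a plain array, ignoring the real x86 encoding of selectors (table choice, privilege level) and descriptors.

```c
#include <stdint.h>

/* Simplified descriptor: just the fields this slide uses. */
struct descriptor { uint32_t base, limit; };
static struct descriptor descriptors[16];   /* illustrative size */

/* Look up the descriptor, check the offset against the limit,
 * then add the base; the result would next be fed to paging. */
int64_t seg_to_linear(unsigned selector, uint32_t offset)
{
    if (selector >= 16)
        return -1;                        /* no such descriptor */
    const struct descriptor *d = &descriptors[selector];
    if (offset > d->limit)
        return -1;                        /* protection fault   */
    return (int64_t)d->base + offset;     /* 32-bit linear address */
}
```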
Translating virtual to physical
addresses
✦   Pentium uses two-level page tables
    • Top level is called a “page directory” (1024 entries)
    • Second level is called a “page table” (1024 entries each)
      • 4 KB pages

 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17How to Add New Custom Addons Path in Odoo 17
How to Add New Custom Addons Path in Odoo 17
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 

Chap4

How many processes are enough?
✦   Several memory partitions (fixed or variable size)
✦   Lots of processes wanting to use the CPU
✦   Tradeoff
    • More processes utilize the CPU better
    • Fewer processes use less memory (cheaper!)
✦   How many processes do we need to keep the CPU fully utilized?
    • This will help determine how much memory we need
    • Is this still relevant with memory costing $15/GB?
Modeling multiprogramming
✦   More I/O wait means less processor utilization (see the model below)
    • At 20% I/O wait, 3–4 processes fully utilize the CPU
    • At 80% I/O wait, even 10 processes aren't enough
✦   This means that the OS should admit more processes if they're I/O bound
✦   More processes ⇒ memory management & protection become more important!
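The utilization figures here and on the next slide follow the standard probabilistic model of multiprogramming; the formula itself is not on the slide, so treat this as a gloss. If each process waits on I/O a fraction p of the time, and the waits are independent, the CPU is idle only when all n processes are waiting at once:

    \[
      \text{CPU utilization} = 1 - p^{\,n}
    \]
    % Check against the 80% I/O-wait column on the next slide:
    % n = 1: 1 - 0.8^1 = 0.20 busy;   n = 4: 1 - 0.8^4 = 0.59 busy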
Multiprogrammed system performance
✦   Arrival and work requirements of 4 jobs
✦   CPU utilization for 1–4 jobs with 80% I/O wait
✦   Sequence of events as jobs arrive and finish
    • Numbers show the amount of CPU time jobs get in each interval
    • More processes ⇒ better utilization, less time per process

    Job   Arrival   CPU time needed
    1     10:00     4
    2     10:10     3
    3     10:15     2
    4     10:20     2

    Number of jobs    1      2      3      4
    CPU idle          0.80   0.64   0.51   0.41
    CPU busy          0.20   0.36   0.49   0.59
    CPU per process   0.20   0.18   0.16   0.15

    (timeline figure: job arrivals at t = 0, 10, 15, 20 minutes; finish events at t = 22, 27.6, 28.2, 31.7)
Memory and multiprogramming
✦   Memory needs two things for multiprogramming
    • Relocation
    • Protection
✦   The OS cannot be certain where a program will be loaded in memory
    • Variables and procedures can't use absolute locations in memory
    • Several ways to guarantee this
✦   The OS must keep processes' memory separate
    • Protect a process from other processes reading or modifying its memory
    • Protect a process from modifying its own memory in undesirable ways (such as writing to program code)
Base and limit registers
✦   Special CPU registers: base & limit
    • Access to the registers is limited to system mode
    • Registers delimit the process's memory partition
        - Base: start of the process's memory partition
        - Limit: length of the process's memory partition
✦   Address generation
    • Physical address: location in actual memory
    • Logical address: location from the process's point of view
    • Physical address = base + logical address
    • Logical address larger than limit ⇒ error
✦   Example (from the figure): base = 0x9000, limit = 0x2000; logical address 0x1204 maps to physical address 0x1204 + 0x9000 = 0xA204 (see the sketch below)
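A rough C sketch of what this hardware does on every reference; the struct and function names are invented for illustration, and a real MMU traps to the OS rather than calling exit:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical MMU state: one base/limit pair for the running process. */
    struct mmu { uint32_t base, limit; };

    /* Translate a logical address to a physical one. */
    uint32_t translate(const struct mmu *m, uint32_t logical) {
        if (logical >= m->limit) {              /* beyond the partition */
            fprintf(stderr, "protection fault: 0x%x\n", (unsigned)logical);
            exit(1);
        }
        return m->base + logical;               /* physical = base + logical */
    }

    int main(void) {
        struct mmu m = { .base = 0x9000, .limit = 0x2000 };
        printf("0x%x\n", (unsigned)translate(&m, 0x1204));  /* 0xa204, as on the slide */
        return 0;
    }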
Swapping
✦   Memory allocation changes as
    • Processes come into memory
    • Processes leave memory
        - Swapped to disk
        - Complete execution
✦   Gray regions in the figure are unused memory
    (figure: snapshots of memory over time as processes A, B, C, and D are loaded above the OS and swapped out again)
Swapping: leaving room to grow
✦   Need to allow for programs to grow
    • Allocate more memory for data
    • Larger stack
✦   Handled by allocating more space than is necessary at the start
    • Inefficient: wastes memory that's not currently in use
    • What if the process requests too much memory?
    (figure: processes A and B above the OS, each with code, data, and stack, and room to grow between the data and the stack)
Tracking memory usage: bitmaps
✦   Keep track of free / allocated memory regions with a bitmap
    • One bit in the map corresponds to a fixed-size region of memory
    • The bitmap is a constant size for a given amount of memory, regardless of how much is allocated at any particular time
✦   Chunk size determines efficiency
    • At 1 bit per 4 KB chunk, we need just 256 bits (32 bytes) per MB of memory
    • For smaller chunks, we need more memory for the bitmap
    • Can be difficult to find large contiguous free areas in the bitmap (see the sketch below)
    (figure: memory regions A–D and the corresponding bitmap 11111100 00111000 01111111 11111000)
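A minimal C sketch of bitmap allocation under the slide's assumptions (1 bit per 4 KB chunk, 1 MB of memory; all names invented). The linear scan for a run of zero bits is exactly the "hard to find large contiguous free areas" cost noted above:

    #include <stdint.h>

    #define NUM_CHUNKS 256                    /* 1 MB at 4 KB/chunk: 256 bits */

    static uint8_t bitmap[NUM_CHUNKS / 8];    /* 1 = allocated, 0 = free */

    static int  test_bit(int i) { return bitmap[i / 8] &  (1 << (i % 8)); }
    static void set_bit(int i)  {        bitmap[i / 8] |= (1 << (i % 8)); }

    /* Find n contiguous free chunks; returns the first chunk index or -1. */
    int bitmap_alloc(int n) {
        for (int start = 0; start + n <= NUM_CHUNKS; start++) {
            int run = 0;
            while (run < n && !test_bit(start + run))
                run++;
            if (run == n) {                   /* found a big enough hole */
                for (int i = 0; i < n; i++)
                    set_bit(start + i);
                return start;
            }
            start += run;                     /* skip past the bit that ended the run */
        }
        return -1;
    }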
Tracking memory usage: linked lists
✦   Keep track of free / allocated memory regions with a linked list
    • Each entry in the list corresponds to a contiguous region of memory
    • An entry can indicate either allocated or free (and, optionally, the owning process)
    • May have separate lists for free and allocated areas
✦   Efficient if chunks are large
    • Fixed-size representation for each region
    • More regions → more space needed for the lists
✦   Example list for memory regions A–D (owner, start, length; "-" marks a hole):
    (A,0,6) (-,6,4) (B,10,3) (-,13,4) (C,17,9) (D,26,3) (-,29,3)
Allocating memory
✦   Search through the region list to find a large enough space
✦   Suppose there are several choices: which one to use?
    • First fit: the first suitable hole on the list (see the sketch below)
    • Next fit: the first suitable hole after the previously allocated hole
    • Best fit: the smallest hole that is larger than the desired region (wastes least space?)
    • Worst fit: the largest available hole (leaves the largest fragment)
✦   Option: maintain separate queues for different-size holes
    (figure: a free list and the holes chosen for sample requests of 20, 13, 12, and 15 blocks under first, next, best, and worst fit)
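A first-fit sketch over such a free list, in C. The hole struct is hypothetical; a full allocator would also free the unlinked node and coalesce neighbors on release, as the next slide discusses:

    #include <stddef.h>

    /* One node per contiguous free region (start and length in blocks). */
    struct hole {
        size_t start, len;
        struct hole *next;
    };

    /* First fit: take the first hole big enough, shrinking it in place.
       Best/worst fit would scan the whole list, tracking the min/max candidate. */
    long alloc_first_fit(struct hole **list, size_t want) {
        for (struct hole **p = list; *p != NULL; p = &(*p)->next) {
            struct hole *h = *p;
            if (h->len >= want) {
                size_t addr = h->start;
                h->start += want;
                h->len   -= want;
                if (h->len == 0)
                    *p = h->next;    /* exact fit: unlink the empty hole */
                return (long)addr;
            }
        }
        return -1;                   /* no hole large enough */
    }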
Freeing memory
✦   Allocation structures must be updated when memory is freed
✦   Easy with bitmaps: just update the appropriate bits in the bitmap
✦   Linked lists: modify adjacent elements as needed
    • Merge adjacent free regions into a single region
    • May involve merging two free regions with the just-freed area
✦   Four cases when freeing region X (from the figure): both neighbors allocated (A X B), free region on the left only, free region on the right only, or free on both sides, in which case all three merge into one hole
Knuth's observations
✦   Fifty-percent rule: if the number of processes in memory is n, the mean number of holes is n/2
    • In equilibrium, allocation and deallocation operations balance; on average there is one hole for every two processes
✦   Unused memory rule: (n/2) · k · s = m − n · s, where
    • m: total memory
    • s: average size of a process
    • k · s: average size of a hole
✦   Fraction of memory wasted in holes: k / (k + 2) (derivation below)
✦   Overhead of paging: (process size × size of a page table entry) / page size + page size / 2
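The wasted fraction follows directly from the two rules above; a quick derivation, with the slide's notation:

    % In equilibrium there are n/2 holes of mean size ks and n processes of
    % mean size s, so total memory is m = ns + (n/2)ks. The fraction of m
    % lost to holes is therefore
    \[
      f \;=\; \frac{(n/2)\,k\,s}{n s + (n/2)\,k\,s} \;=\; \frac{k}{k+2}
    \]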
Some interesting results
✦   If n is the number of allocated areas, then n/2 is the number of holes for "simple" allocation algorithms (not buddy!) in equilibrium
✦   Dynamic storage allocation strategies that never relocate reserved blocks cannot guarantee memory efficiency
    • With blocks of sizes 1 and 2, it's possible to run out of memory even when only two-thirds full
✦   Seating puzzle: 23 seats in a row, and groups of 1 and 2 arrive; must we ever split a pair, given that no more than 16 people are present?
    • Solution: never give a single person seat 2, 5, 8, …, 20; then every arriving pair can be seated together
    • Not possible with 22 seats: a pair may have to be split even with no more than 14 people present
Limitations of swapping
✦   Problems with swapping
    • The process must fit into physical memory (impossible to run larger processes)
    • Memory becomes fragmented
        - External fragmentation: lots of small free areas
        - Compaction needed to reassemble larger free areas
    • Processes are either entirely in memory or entirely on disk: half and half doesn't do any good
✦   Overlays solved the first problem
    • Bring in pieces of the process over time (typically data)
    • Still doesn't solve the problems of fragmentation or partially resident processes
Virtual memory
✦   Basic idea: allow the OS to hand out more memory than exists on the system
✦   Keep recently used stuff in physical memory
✦   Move less recently used stuff to disk
✦   Keep all of this hidden from processes
    • Processes still see an address space from 0 to max_address
    • Movement of information to and from disk is handled by the OS without process help
✦   Virtual memory (VM) is especially helpful in multiprogrammed systems
    • The CPU schedules process B while process A waits for its memory to be retrieved from disk
Virtual and physical addresses
✦   Programs use virtual addresses
    • Addresses are local to the process
    • Hardware translates each virtual address to a physical address
✦   Translation is done by the Memory Management Unit (MMU)
    • Usually on the same chip as the CPU
    • Only physical addresses leave the CPU/MMU chip
✦   Physical memory is indexed by physical addresses
    (figure: virtual addresses flow from the CPU to the MMU; physical addresses go out on the bus to memory and the disk controller)
Paging and page tables
✦   Virtual addresses are mapped to physical addresses
    • The unit of mapping is called a page
    • All addresses in the same virtual page are in the same physical page
    • A page table entry (PTE) contains the translation for a single page
✦   The table translates a virtual page number to a physical page number
    • Not all virtual memory has a physical page
    • Not every physical page need be used
✦   Example: 64 KB virtual memory, 16 KB physical memory, 4 KB pages
    • Mapped pages (from the figure): virtual 52–56K → physical page 0, 36–40K → 3, 16–20K → 1, 12–16K → 2; all other virtual pages are unmapped (X)
What's in a page table entry?
✦   Each entry in the page table contains
    • Valid bit: set if this logical page number has a corresponding physical frame in memory
        - If not valid, the remainder of the PTE is irrelevant
    • Page frame number: the page's location in physical memory
    • Referenced bit: set if data on the page has been accessed
    • Dirty (modified) bit: set if data on the page has been modified
    • Protection information
✦   Layout (from the figure): page frame number | protection | D (dirty) | R (referenced) | V (valid)
Mapping logical addresses to physical addresses
✦   Split the address from the CPU into two pieces
    • Page number (p): an index into the page table, which contains the base address of the page in physical memory
    • Page offset (d): added to the base address to get the actual physical memory address
✦   Page size = 2^d bytes
✦   Example: 4 KB (4096-byte) pages and 32-bit logical addresses (see the sketch below)
    • 2^d = 4096 ⇒ d = 12 bits of offset
    • 32 − 12 = 20 bits of page number
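In C the split is just a shift and a mask; a sketch with the slide's 4 KB / 32-bit parameters (the frame number here is made up):

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SHIFT 12                        /* 4 KB pages: 2^12 = 4096 */
    #define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)  /* low 12 bits */

    int main(void) {
        uint32_t logical = 0x12345678;
        uint32_t p = logical >> PAGE_SHIFT;      /* 20-bit page number: 0x12345 */
        uint32_t d = logical &  PAGE_MASK;       /* 12-bit offset:      0x678   */
        uint32_t f = 42;                         /* frame from the page table (invented) */
        uint32_t phys = (f << PAGE_SHIFT) | d;   /* physical address */
        printf("p=0x%x d=0x%x phys=0x%x\n",
               (unsigned)p, (unsigned)d, (unsigned)phys);
        return 0;
    }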
Address translation architecture
    (figure: the logical address (p, d) comes from the CPU; p indexes the page table to find frame f; the physical address (f, d) then indexes physical memory)
Memory & paging structures
    (figure: logical memories and page tables for two processes; P0's pages 0–4 map to physical frames 6, 3, 4, 9, and 2, P1's pages 0–1 map to frames 8 and 0, and the remaining frames are free)
Two-level page tables
✦   Problem: page tables can be too large
    • 2^32 bytes in 4 KB pages ⇒ 1 million PTEs per process
✦   Solution: use multi-level page tables
    • The "page size" covered by a first-level entry is large (megabytes)
    • A PTE marked invalid in the first-level table needs no 2nd-level page table at all
✦   The 1st-level page table holds pointers to 2nd-level page tables
✦   The 2nd-level page tables hold the actual physical page numbers
More on two-level page tables
✦   Tradeoffs between 1st- and 2nd-level page table sizes
    • The total number of bits indexing the 1st and 2nd levels is constant for a given page size and logical address length
    • The tradeoff is between the number of bits indexing the 1st level and the number indexing the 2nd level
        - More bits in the 1st level: finer granularity at the 2nd level
        - Fewer bits in the 1st level: maybe less wasted space?
✦   All addresses in the tables are physical addresses
✦   Protection bits are kept in the 2nd-level tables
Two-level paging: example
✦   System characteristics
    • 8 KB pages
    • 32-bit logical address divided into a 13-bit page offset and a 19-bit page number
✦   The page number is divided into
    • p1: a 10-bit index into the 1st-level page table
    • p2: a 9-bit index into the 2nd-level page table pointed to by the p1 entry
✦   Logical address layout (see the sketch below): | p1 = 10 bits | p2 = 9 bits | offset = 13 bits |
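A C sketch of the lookup with this 10/9/13 split; the table layout and the use of 0 as "invalid" are simplifications for illustration, not how any particular hardware defines it:

    #include <stdint.h>

    #define P1(va)     (((va) >> 22) & 0x3FF)    /* top 10 bits */
    #define P2(va)     (((va) >> 13) & 0x1FF)    /* next 9 bits */
    #define OFFSET(va) ((va) & 0x1FFF)           /* low 13 bits */

    /* Hypothetical tables: level 1 holds pointers to 512-entry 2nd-level
       tables of frame numbers; NULL / frame 0 stand in for "invalid". */
    uint32_t *level1[1 << 10];

    long translate(uint32_t va) {
        uint32_t *l2 = level1[P1(va)];
        if (l2 == 0)  return -1;                 /* no 2nd-level table: fault */
        uint32_t frame = l2[P2(va)];
        if (frame == 0) return -1;               /* page not present: fault */
        return ((long)frame << 13) | OFFSET(va); /* physical address */
    }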
Implementing page tables in hardware
✦   The page table resides in main (physical) memory
✦   The CPU uses special registers for paging
    • Page table base register (PTBR): points to the page table
    • Page table length register (PTLR): contains the length of the page table, restricting the maximum legal logical address
✦   Translating an address requires two memory accesses
    • The first access reads the page table entry (PTE)
    • The second access reads the data / instruction from memory
✦   Reducing the number of memory accesses
    • Can't avoid the second access (we need the value from memory)
    • Eliminate the first access by keeping a hardware cache of recently used page table entries: the translation lookaside buffer (TLB)
Translation Lookaside Buffer (TLB)
✦   Search the TLB for the desired logical page number
    • Search entries in parallel
    • Use standard cache techniques
✦   If the desired logical page number is found, get the frame number from the TLB
✦   If the desired logical page number isn't found
    • Get the frame number from the page table in memory
    • Replace an entry in the TLB with the logical & physical page numbers from this reference
✦   Example TLB (logical page → physical frame): 8 → 3, 2 → 1, 3 → 0, 12 → 12, 29 → 6, 22 → 11, 7 → 4, with one entry unused
Handling TLB misses
✦   If the PTE isn't found in the TLB, the OS needs a lookup in the page table
✦   The lookup can be done in hardware or software
✦   Hardware TLB replacement
    • CPU hardware does the page table lookup
    • Can be faster than software
    • Less flexible than software, and more complex hardware
✦   Software TLB replacement
    • The OS gets a TLB exception
    • The exception handler does the page table lookup & places the result into the TLB
    • The program continues after returning from the exception
    • A larger TLB (lower miss rate) can make this feasible
How long do memory accesses take?
✦   Assume the following times:
    • TLB lookup time = a (often zero: overlapped with other work in the CPU)
    • Memory access time = m
✦   The hit ratio (h) is the fraction of references whose logical page number is found in the TLB
    • A larger TLB usually means a higher h
    • TLB structure can affect h as well
✦   Effective access time (an average) is calculated as (worked example below):
    • EAT = (m + a)h + (2m + a)(1 − h)
    • EAT = a + (2 − h)m
✦   Interpretation
    • Every reference requires a TLB lookup and at least one memory access
    • TLB misses also require an additional memory reference
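A worked example with made-up numbers, not from the slide: suppose m = 100 ns, a = 0 (the lookup is fully overlapped), and h = 0.98.

    \[
      \mathrm{EAT} \;=\; a + (2-h)\,m
               \;=\; 0 + (2 - 0.98)\times 100\,\mathrm{ns}
               \;=\; 102\,\mathrm{ns}
    \]
    % A 2% TLB miss rate costs only about 2% over a single memory access.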
Inverted page table
✦   Reduce page table size further: keep one entry for each frame of physical memory
    • Alternative: merge the tables for pages in memory and on disk
✦   Each PTE contains
    • The virtual page currently mapped to this frame
    • Information about the process that owns this page
✦   Search the page table by (see the sketch below)
    • Hashing the virtual page number and process ID
    • Starting at the entry corresponding to the hash result
    • Searching until either the entry is found or a limit is reached
✦   The page frame number is the index of the PTE that was found
✦   Improve performance by using better hashing algorithms
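A sketch of the lookup in C; the hash function, table size, and linear probing are placeholders for illustration (real implementations typically chain colliding entries):

    #include <stdint.h>

    #define NFRAMES 4096

    /* One entry per physical frame, as described above. */
    struct ipte {
        uint32_t vpn;     /* virtual page number stored in this frame */
        uint32_t pid;     /* owning process */
        int      valid;
    };
    struct ipte table[NFRAMES];

    /* Returns the frame number, or -1 for a fault. */
    long ipt_lookup(uint32_t pid, uint32_t vpn) {
        uint32_t h = (vpn * 2654435761u ^ pid) % NFRAMES;  /* placeholder hash */
        for (int probe = 0; probe < NFRAMES; probe++) {
            uint32_t i = (h + probe) % NFRAMES;
            if (table[i].valid && table[i].vpn == vpn && table[i].pid == pid)
                return i;        /* the index itself is the frame number */
            if (!table[i].valid)
                break;           /* an empty slot ends the probe chain */
        }
        return -1;
    }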
Inverted page table architecture
✦   One-to-one correspondence between page table entries and pages in memory
Memory Management, Part 2: Paging Algorithms and Implementation Issues
Page replacement algorithms
✦   A page fault forces a choice
    • No room for the new page (steady state)
    • Which page must be removed to make room for an incoming page?
✦   How is a page removed from physical memory?
    • If the page is unmodified, simply overwrite it: a copy already exists on disk
    • If the page has been modified, it must be written back to disk: prefer unmodified pages?
✦   Better not to choose an often-used page
    • It'll probably need to be brought back in soon
Optimal page replacement algorithm
✦   What's the best we can possibly do?
    • Assume perfect knowledge of the future
    • Not realizable in practice (usually)
    • Useful for comparison: if another algorithm is within 5% of optimal, not much more can be done…
✦   Algorithm: replace the page that will be used furthest in the future
    • Only works if we know the whole reference sequence!
    • Can be approximated by running the program twice
        - Once to generate the reference trace
        - Once (or more) to apply the optimal algorithm
✦   Nice, but not achievable in real systems!
Not-recently-used (NRU) algorithm
✦   Each page has a referenced bit and a dirty bit
    • The bits are set when the page is referenced and/or modified
✦   Pages are classified into four classes
    • 0: not referenced, not dirty
    • 1: not referenced, dirty
    • 2: referenced, not dirty
    • 3: referenced, dirty
✦   Clear the referenced bit for all pages periodically
    • Can't clear the dirty bit: it's needed to indicate which pages must be flushed to disk
    • Class 1 contains dirty pages whose referenced bit has been cleared
✦   Algorithm: remove a page from the lowest-numbered non-empty class
    • Select a page at random from that class
✦   Easy to understand and implement
✦   Performance adequate (though not optimal)
First-In, First-Out (FIFO) algorithm
✦   Maintain a linked list of all pages
    • Maintain the order in which they entered memory
✦   The page at the front of the list is replaced
✦   Advantage: (really) easy to implement
✦   Disadvantage: the page that has been in memory the longest may be in heavy use
    • This algorithm forces pages out regardless of usage
    • Usage may be helpful in determining which pages to keep
Second chance page replacement
✦   Modify FIFO to avoid throwing out heavily used pages
    • If the referenced bit is 0, throw the page out
    • If the referenced bit is 1
        - Reset the referenced bit to 0
        - Move the page to the tail of the list
        - Continue the search for a victim page
✦   Still easy to implement, and better than plain FIFO
    (figure: pages A–H in a FIFO list with their load times; a referenced page at the head is recycled to the tail instead of being evicted)
Clock algorithm
✦   Same functionality as second chance
✦   Simpler implementation (see the sketch below)
    • A "clock" hand points to the next page to replace
    • If R = 0, replace the page
    • If R = 1, set R = 0 and advance the clock hand
✦   Continue until a page with R = 0 is found
    • This may involve going all the way around the clock…
    (figure: pages A–J arranged in a circle with the clock hand pointing at the next candidate)
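A compact C sketch of the hand sweep (fixed-size frame array and field names invented for illustration):

    #define NFRAMES 64

    struct frame { int page; int referenced; };
    struct frame frames[NFRAMES];
    static int hand = 0;                          /* the clock hand */

    /* Pick a victim frame: clear and skip referenced pages until one with
       R = 0 is found; may sweep the whole circle if everything was referenced. */
    int clock_evict(void) {
        for (;;) {
            if (frames[hand].referenced) {
                frames[hand].referenced = 0;      /* give a second chance */
                hand = (hand + 1) % NFRAMES;
            } else {
                int victim = hand;
                hand = (hand + 1) % NFRAMES;
                return victim;
            }
        }
    }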
Least Recently Used (LRU)
✦   Assume pages used recently will be used again soon
    • Throw out the page that has been unused for the longest time
✦   Must keep a linked list of pages
    • Most recently used at the front, least recently used at the rear
    • Update this list on every memory reference!
        - This can be slow: the hardware has to update a linked list on every reference!
✦   Alternatively, keep a counter in each page table entry
    • A global counter increments with each CPU cycle
    • Copy the global counter into the PTE counter on each reference to the page
    • For replacement, evict the page with the lowest counter value
Simulating LRU in software
✦   Few computers have the necessary hardware to implement full LRU
    • The linked-list method is impractical in hardware
    • The counter-based method could be done, but it's slow to find the desired page
✦   Approximate LRU with the Not Frequently Used (NFU) algorithm
    • At each clock interrupt, scan through the page table
    • If R = 1 for a page, add one to its counter value
    • On replacement, pick the page with the lowest counter value
✦   Problem: no notion of age; pages with high counter values tend to keep them even after they stop being used!
Aging replacement algorithm
✦   Reduce counter values over time
    • Divide by two every clock tick (use a right shift)
    • More weight is given to more recent references!
✦   Select the page to be evicted by finding the lowest counter value
✦   Algorithm (see the sketch below):
    • Every clock tick, shift all counters right by 1 bit
    • On a reference, set the leftmost bit of the counter (done by copying the referenced bit into the counter at the clock tick)
✦   Example counter evolution (a leading 1 after a tick means the page was referenced during that tick):

             Tick 0     Tick 1     Tick 2     Tick 3     Tick 4
    Page 0   10000000   11000000   11100000   01110000   10111000
    Page 1   00000000   10000000   01000000   00100000   00010000
    Page 2   10000000   01000000   00100000   10010000   01001000
    Page 3   00000000   00000000   00000000   10000000   01000000
    Page 4   10000000   01000000   10100000   11010000   01101000
    Page 5   10000000   11000000   01100000   10110000   11011000
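The tick update and victim selection are only a few lines of C; a sketch, with sizes invented to match the table above:

    #include <stdint.h>

    #define NPAGES 6

    uint8_t counter[NPAGES];     /* 8-bit aging counters, as in the table */
    int     referenced[NPAGES];  /* R bits collected since the last tick */

    /* At every clock tick: shift right, then copy R into the leftmost bit. */
    void aging_tick(void) {
        for (int i = 0; i < NPAGES; i++) {
            counter[i] >>= 1;
            if (referenced[i])
                counter[i] |= 0x80;
            referenced[i] = 0;
        }
    }

    /* Victim = page with the smallest counter value. */
    int aging_victim(void) {
        int victim = 0;
        for (int i = 1; i < NPAGES; i++)
            if (counter[i] < counter[victim])
                victim = i;
        return victim;
    }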
Working set
✦   Demand paging: bring a page into memory when it's requested by the process
✦   How many pages are needed?
    • Could be all of them, but not likely
    • Instead, processes reference a small set of pages at any given time: locality of reference
    • The set of pages can differ between processes, and even over time within a single process
✦   The set of pages used by a process in a given interval of time is called the working set
    • If the entire working set is in memory, no page faults!
    • If there is insufficient space for the working set, thrashing may occur
    • Goal: keep most of the working set in memory to minimize the number of page faults a process suffers
How big is the working set?
✦   The working set is the set of pages used by the k most recent memory references
✦   w(k, t) is the size of the working set at time t
✦   The working set may change over time
    • The size of the working set can change over time as well…
    (figure: w(k, t) plotted against k, growing quickly for small k and then leveling off)
Working set page replacement algorithm
    (figure-only slide: flowchart of the working set replacement algorithm)
Page replacement algorithms: summary

    Algorithm                     Comment
    OPT (Optimal)                 Not implementable, but useful as a benchmark
    NRU (Not Recently Used)       Crude
    FIFO (First-In, First-Out)    Might throw out useful pages
    Second chance                 Big improvement over FIFO
    Clock                         Better implementation of second chance
    LRU (Least Recently Used)     Excellent, but hard to implement exactly
    NFU (Not Frequently Used)     Poor approximation to LRU
    Aging                         Good approximation to LRU, inefficient to implement
    Working set                   Somewhat expensive to implement
    WSClock                       Implementable version of working set
Modeling page replacement algorithms
✦   Goal: provide quantitative analysis (or simulation) showing which algorithms do better
    • The workload (page reference string) is important: different strings may favor different algorithms
    • Show tradeoffs between algorithms
✦   Compare algorithms to one another
✦   Model parameters within an algorithm
    • Number of available physical pages
    • Number of bits for aging
How is modeling done?
✦   Generate a list of references
    • Artificial (made up)
    • Traced from a real workload (set of processes)
✦   Use an array (or other structure) to track the pages in physical memory at any given time
    • May keep other information per page to help simulate the algorithm (modification time, time when paged in, etc.)
✦   Run through the references, applying the replacement algorithm
✦   Example: FIFO replacement on reference string 0 1 2 3 0 1 4 0 1 2 3 4 with 3 page frames; 9 page faults in all (the slide highlights the replacements in yellow)

    Page referenced   0  1  2  3  0  1  4  0  1  2  3  4
    Youngest page     0  1  2  3  0  1  4  4  4  2  3  3
                         0  1  2  3  0  1  1  1  4  2  2
    Oldest page             0  1  2  3  0  0  0  1  4  4
Belady's anomaly
✦   Reduce the number of page faults by supplying more memory?
    • Use the previous reference string and the FIFO algorithm
    • Add another page frame to physical memory (4 frames total)
✦   Result: more page faults (10 vs. 9), not fewer! (simulation below)
    • This is called Belady's anomaly
    • Adding more pages shouldn't result in worse performance!
✦   Motivated the study of paging algorithms

    Page referenced   0  1  2  3  0  1  4  0  1  2  3  4
    Youngest page     0  1  2  3  3  3  4  0  1  2  3  4
                         0  1  2  2  2  3  4  0  1  2  3
                            0  1  1  1  2  3  4  0  1  2
    Oldest page                0  0  0  1  2  3  4  0  1
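The anomaly is easy to reproduce. This small C program (a sketch, not from the slides) simulates FIFO on the reference string above and prints 9 faults for 3 frames and 10 for 4:

    #include <stdio.h>

    /* Count FIFO page faults for a reference string with nframes frames
       (nframes must be <= 16 in this sketch). */
    int fifo_faults(const int *refs, int n, int nframes) {
        int frames[16], head = 0, used = 0, faults = 0;
        for (int i = 0; i < n; i++) {
            int hit = 0;
            for (int j = 0; j < used; j++)
                if (frames[j] == refs[i]) { hit = 1; break; }
            if (hit) continue;
            faults++;
            if (used < nframes)
                frames[used++] = refs[i];                    /* free frame */
            else {
                frames[head] = refs[i];                      /* evict oldest */
                head = (head + 1) % nframes;
            }
        }
        return faults;
    }

    int main(void) {
        int refs[] = {0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4};
        int n = (int)(sizeof refs / sizeof refs[0]);
        printf("3 frames: %d faults\n", fifo_faults(refs, n, 3));  /* 9  */
        printf("4 frames: %d faults\n", fifo_faults(refs, n, 4));  /* 10 */
        return 0;
    }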
Modeling more replacement algorithms
✦   A paging system can be characterized by:
    • The reference string of the executing process
    • The page replacement algorithm
    • The number of page frames available in physical memory (m)
✦   Model this by keeping track of all n pages ever referenced in an array M
    • The top part of M holds the m pages in memory
    • The bottom part of M holds the n − m pages stored on disk
✦   Page replacement occurs when a page moves from the top part to the bottom
    • The top and bottom parts may be rearranged without causing movement between memory and disk
Example: LRU
✦   Model LRU replacement with
    • 8 unique pages in the reference string
    • 4 pages of physical memory
✦   LRU treats the list of pages like a stack: the top m rows of the array are the pages in RAM, the rest are on disk
    (table: array state after each reference for the string 0 2 1 3 5 4 6 3 7 4 7 3 3 5 5 3 1 1 1 7 1 3 4 1; each referenced page moves to the top of the stack)
Stack algorithms
✦   LRU is an example of a stack algorithm
✦   For stack algorithms
    • Any page in memory with m physical pages is also in memory with m + 1 physical pages
    • Increasing memory size is guaranteed to reduce (or at least not increase) the number of page faults
✦   Stack algorithms do not suffer from Belady's anomaly
✦   The distance of a reference is the position of the page in the stack before the reference was made
    • The distance is ∞ if no reference to the page had been made before
    • Distance depends on both the reference string and the paging algorithm: it might be different for LRU and optimal (both stack algorithms)
Predicting page fault rates using distance
✦   Distance can be used to predict page fault rates
✦   Make a single pass over the reference string to generate the distance string on the fly
✦   Keep an array of counts
    • Entry j counts the number of times distance j occurs in the distance string
✦   The number of page faults for a memory of size m is the sum of the counts for j > m (see the sketch below)
    • This can be done in a single pass!
    • Makes for fast simulation of page replacement algorithms
✦   This is why virtual memory theorists like stack algorithms!
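A C sketch of the final step: given the distance counts, the fault count for any memory size m falls out of one loop (MAXD and the array layout are invented for illustration; m is assumed to be at most MAXD):

    #define MAXD 64   /* distances beyond this lumped in with "infinite" */

    /* counts[j] = number of references at stack distance j (j >= 1);
       counts[0] holds first-touch (infinite-distance) references. */
    long faults_for_memory_size(const long counts[MAXD + 1], int m) {
        long faults = counts[0];                /* cold misses always fault */
        for (int j = m + 1; j <= MAXD; j++)     /* distances beyond m frames fault */
            faults += counts[j];
        return faults;
    }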
Local vs. global allocation policies
✦   What is the pool of pages eligible to be replaced?
    • Pages belonging to the process needing a new page
    • All pages in the system
✦   Local allocation: replace a page from the faulting process itself
    • May be more "fair": penalizes processes that replace many pages
    • Can lead to poor performance: some processes need more pages than others
✦   Global allocation: replace a page from any process
    (figure: pages of processes A, B, and C with last-access times; local allocation evicts the oldest page of the faulting process, global allocation evicts the oldest page in the whole system)
Page fault rate vs. allocated frames
✦   Local allocation may be more "fair"
    • Don't penalize other processes for one process's high page fault rate
✦   Global allocation is better for overall system performance
    • Take page frames from processes that don't need them as much
    • Reduce the overall page fault rate (even though the rate for a single process may go up)
    (figure: page faults per second vs. number of page frames assigned, falling from a high rate to a low rate as frames are added)
Control overall page fault rate
✦   Despite good designs, the system may still thrash
✦   Most (or all) processes have a high page fault rate
    • Some processes need more memory, …
    • but no process needs less memory (and could give some up)
✦   Problem: no way to reduce the page fault rate
✦   Solution: reduce the number of processes competing for memory
    • Swap one or more to disk, and divide up the pages they held
    • Reconsider the degree of multiprogramming
How big should a page be?
✦   Smaller pages have advantages
    • Less internal fragmentation
    • Better fit for various data structures and code sections
    • Less unused physical memory (some pages have 20 useful bytes and the rest isn't currently needed)
✦   Larger pages are better because
    • Less overhead to keep track of them
        - Smaller page tables
        - The TLB can cover more memory (same number of entries, but more memory per page)
        - Faster paging algorithms (fewer table entries to look through)
    • More efficient to transfer larger pages to and from disk
Separate I & D address spaces
✦   One user address space for both data & code
    • Simpler
    • Code/data separation harder to enforce
✦   One address space for data, another for code (each running from 0 to 2^32 − 1)
    • Code & data separated
    • More address space?
    • More complex in hardware
    • Less flexible
    • The CPU must handle instructions & data differently
✦   MINIX does the latter
Sharing pages
✦   Processes can share pages
    • Entries in both processes' page tables point to the same physical page frame
    • Easier to do with code: no problems with modification
✦   Virtual addresses in different processes can be…
    • The same: easier to exchange pointers, keep data structures consistent
    • Different: may be easier to actually implement
        - Not a problem if there are only a few shared regions
        - Can be very difficult if many processes share regions with each other
When are dirty pages written to disk?
✦   On demand (when they're replaced)
    • Fewest writes to disk
    • Slower: replacement takes twice as long (must wait for the disk write and then the disk read)
✦   Periodically (in the background)
    • A background process scans through the page tables, writing out dirty pages that are pretty old
✦   The background process can also keep a list of pages ready for replacement
    • Page faults are handled faster: no need to find space on demand
    • The cleaner may use the same structures discussed earlier (clock, etc.)
Implementation issues
✦   Four times when the OS is involved with paging
✦   Process creation
    • Determine program size
    • Create the page table
✦   During process execution
    • Reset the MMU for the new process
    • Flush the TLB (or reload it from saved state)
✦   Page fault time
    • Determine the virtual address causing the fault
    • Swap the target page out and the needed page in
✦   Process termination time
    • Release the page table
    • Return pages to the free pool
How is a page fault handled?
✦   Hardware causes a page fault
✦   General registers are saved (as on every exception)
✦   The OS determines which virtual page is needed
    • The actual fault address is in a special register
    • The address of the faulting instruction is in a register
        - The page fault was in fetching the instruction, or
        - The page fault was in fetching operands for the instruction
        - The OS must figure out which…
✦   The OS checks the validity of the address
    • The process is killed if the address was illegal
✦   The OS finds a frame for the new page
    • If the frame selected for replacement is dirty, it is written out to disk first
✦   The OS requests the new page from disk
✦   The page tables are updated
✦   The faulting instruction is backed up so it can be restarted
✦   The faulting process is scheduled, registers are restored, and the program continues
Backing up an instruction
✦   Problem: the page fault happens in the middle of instruction execution
    • Some changes may have already happened
    • Others may be waiting for the VM to be fixed
✦   Solution: undo all of the changes made by the instruction
    • Restart the instruction from the beginning
    • This is easier on some architectures than others
✦   Example: LW R1, 12(R2)
    • Page fault in fetching the instruction: nothing to undo
    • Page fault in getting the value at 12(R2): restart the instruction
✦   Example: ADD (Rd)+, (Rs1)+, (Rs2)+ (autoincrement addressing)
    • Page fault in writing to (Rd): may have to undo an awful lot…
Locking pages in memory
✦   Virtual memory and I/O occasionally interact
✦   P1 issues a call to read from a device into a buffer
    • While it's waiting for the I/O, P2 runs
    • P2 has a page fault
    • P1's I/O buffer might be chosen to be paged out
        - This is a problem because an I/O device is going to write to that buffer on P1's behalf
✦   Solution: allow some pages to be locked into memory
    • Locked pages are immune from being replaced
    • Pages only stay locked for (relatively) short periods
Storing pages on disk
✦   Pages removed from memory are stored on disk
✦   Where are they placed?
    • Static swap area: easier to code, less flexible
    • Dynamically allocated space: more flexible, harder to locate a page
        - Dynamic placement often uses a special file (managed by the file system) to hold pages
✦   Need to keep track of which pages are where within the on-disk storage
    (figure: page tables pointing into a static swap area vs. into a dynamically allocated swap file)
Separating policy and mechanism
✦   The mechanism for page replacement has to be in the kernel
    • Modifying page tables
    • Reading and writing page table entries
✦   The policy for deciding which pages to replace could be in user space
    • More flexibility
✦   Steps with an external (user-space) pager, from the figure:
    1. A page fault invokes the kernel's fault handler
    2. The kernel tells the external pager which page is needed
    3. The external pager requests the page from disk
    4. The page arrives
    5. The external pager tells the kernel "here is the page!"
    6. The kernel's MMU handler maps in the page
Why use segmentation?
✦   Different "units" in a single virtual address space
    • Call stack, constants, source text, symbol table, …
    • Each unit can grow
    • How can they be kept apart?
    • Example (from the figure): the symbol table runs out of space while free space remains elsewhere in the address space
✦   Solution: segmentation
    • Give each unit its own address space
Using segments
✦   Each region of the process gets its own segment
✦   Each segment can start at 0
    • Addresses within the segment are relative to the segment start
✦   Virtual addresses are <segment #, offset within segment>
    (figure: the symbol table, source text, call stack, and constants each placed in their own segment (0–3), all starting at address 0 and sized independently)
Paging vs. segmentation

    Consideration                                  Paging               Segmentation
    Does the programmer need to be aware of it?    No                   Yes
    How many linear address spaces?                One                  Many
    More addresses than physical memory?           Yes                  Yes
    Separate protection for different objects?     Not really           Yes
    Variable-sized objects handled with ease?      No                   Yes
    Is sharing easy?                               No                   Yes
    Why use it?                                    More address space   Break programs into
                                                   without buying       logical pieces that
                                                   more memory          are handled separately
Implementing segmentation
✦   Segments are variable-sized, so holes develop between them as segments come and go (external fragmentation, or "checkerboarding")
    (figure: a sequence of memory snapshots in which segments of 4–12 KB are created and removed, leaving 4–9 KB holes)
✦   Need to do memory compaction!
A better way: segmentation with paging
Memory management in the Pentium
✦   Memory is composed of segments
    • Each segment is pointed to by a segment descriptor
    • A segment selector is used to identify the descriptor
✦   The segment descriptor describes the segment
    • Base virtual address
    • Size
    • Protection
    • Code / data
Converting a segment to a linear address
✦   The selector identifies the segment descriptor
    • A limited number of selectors are available in the CPU
✦   The offset is added to the segment's base address (and checked against the limit)
✦   The result is a 32-bit linear address that will then be translated by paging
Translating virtual to physical addresses
✦   The Pentium uses two-level page tables
    • The top level is called a "page directory" (1024 entries)
    • The second level is called a "page table" (1024 entries each)
    • 4 KB pages (1024 × 1024 × 4 KB covers the full 4 GB linear address space)