Evaluation of Caching Strategies Based on Access Statistics on Past Requests


  1. Evaluation of Caching Strategies based on Access Statistics on Past Requests
     Gerhard Haßlinger, Konstantinos Ntougias (gerhard.hasslinger@telekom.de; kostas_ntougias@yahoo.gr)
     - Least Recently Used (LRU): simple standard method
       - Analysis, simulation: deficits of the LRU cache hit rate
     - Statistics-based caching strategies
       - Window: over the last K requests
       - Geometrical aging: geometrically decreasing weight per request
       - Criteria: hit rate and effort for alternative strategies
     - Summary on hit rates and effort of web caching strategies
     Commercial in Confidence. © 2013 The SmartenIT Consortium
  2. Cache Efficiency for YouTube Video Traces
     [Figure: cache hit rate (0% to 60%) versus cache size as the fraction of videos in the cache (0.0078% to 2%). Labels: optimal cache strategy (most popular data in cache); LRU (least recently used) cache strategy; Zipf law approximation 0.004*R**(-5/8).]
     Evaluation of 3.7 billion accesses on 1.65 million YouTube files.
     Sources:
     - M. Cha et al., "I tube, you tube, everybody tubes: Analyzing the world's largest user generated content video system", Internet Measurement Conference (IMC), San Diego, USA (2007)
     - G. Haßlinger, O. Hohlfeld, "Efficiency of caching for IP-based content delivery", ITC 2010
     - Results confirmed by N. Megiddo and D. S. Modha, "Outperforming LRU with an adaptive replacement cache algorithm", IEEE Computer (Apr. 2004) 4-11
  3. Cache Strategies incl. Statistics on Past Requests
     - Sliding Window: the cache holds the objects with the highest request frequency over a sliding window of the last K requests
     - Geometric Fading: the cache holds the objects with the highest sum of weights for past requests, where the k-th request in the past has a geometrically decreasing weight $\rho^k$ (0 < ρ < 1)
  4. Statistics over a window of the last K requests
     - Converges to caching of the most popular objects for large K
     - Reacts to dynamic changes in popularity, after a delay until requests to a new item become relevant in the statistics
     Implementation:
     - The request sequence in the window has to be stored; for each new request, one request falls out of the window and has to be removed from the statistics
     - 2 objects change their statistics score per new request: updates in the cache still have constant effort per request, although more than for LRU
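The window bookkeeping described on slide 4 (store the request sequence, drop the oldest request, adjust two counters) might be sketched as follows. This is a minimal illustration, not the authors' implementation; the class name `SlidingWindowStats` and its methods are hypothetical, and `top()` sorts all counters for clarity instead of maintaining the constant-effort ranking the slide refers to.

```python
from collections import Counter, deque


class SlidingWindowStats:
    """Request counts over the last K requests (hypothetical helper)."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()    # stored request sequence, oldest first
        self.counts = Counter()  # request frequency per object in the window

    def record(self, obj):
        # The new request enters the window: its object gains one count.
        self.window.append(obj)
        self.counts[obj] += 1
        # Once the window is full, the oldest request falls out:
        # its object loses one count.
        if len(self.window) > self.window_size:
            oldest = self.window.popleft()
            self.counts[oldest] -= 1
            if self.counts[oldest] == 0:
                del self.counts[oldest]

    def top(self, m):
        # Objects a cache of size m would hold under this strategy.
        # Assumption: a full sort is acceptable for a sketch.
        return [obj for obj, _ in self.counts.most_common(m)]
```

Each call to `record` touches at most two counters, which matches the "2 objects change their statistics score per new request" remark.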
  5. Statistics with geometrical aging
     - The k-th request in the past is weighted by $\rho^k$ (ρ < 1)
     - The weight of an object is the sum of the weights of its requests
     - Objects are ordered according to their weights
     Implementation:
     - In principle, all weights should be multiplied by ρ for each request; instead, the weight of each new request can be multiplied by 1/ρ (> 1), i.e. weights are $(1/\rho)^k$ for the k-th request
     - One object changes rank per request; effort to update the rank in a sorted list: O(ln(M))
     - Faster approximation: the requested object steps up only one rank, or ranks are updated only e.g. once per hour or per day
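The trick from slide 5, growing the weight of each new request by 1/ρ instead of shrinking all old weights by ρ, could look roughly like this. It is a sketch under assumptions: the class name `GeometricAgingStats` is hypothetical, the occasional rescaling step is added here only to keep floating-point values bounded (it is not mentioned on the slide), and `top()` sorts all scores instead of doing the O(ln(M)) rank update.

```python
class GeometricAgingStats:
    """Geometrically aged request scores (hypothetical helper)."""

    def __init__(self, rho):
        assert 0 < rho < 1
        self.growth = 1.0 / rho  # factor 1/rho > 1 applied per request
        self.next_weight = 1.0   # weight given to the next request
        self.scores = {}         # object -> sum of its request weights

    def record(self, obj):
        # Only the requested object changes its score.
        self.scores[obj] = self.scores.get(obj, 0.0) + self.next_weight
        self.next_weight *= self.growth
        # Assumption (not on the slide): rescale now and then so the
        # growing weights do not overflow; the ordering is unaffected
        # because every score is divided by the same factor.
        if self.next_weight > 1e100:
            scale = self.next_weight
            for key in self.scores:
                self.scores[key] /= scale
            self.next_weight = 1.0

    def top(self, m):
        # Objects a cache of size m would hold under this strategy.
        return sorted(self.scores, key=self.scores.get, reverse=True)[:m]
```

Dividing every score by the current `next_weight` recovers the aged weights of the original formulation, so the ranking is the same in both views.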
  6. Basic Assumptions on Cache Modeling & Evaluation
     We assume
     - a set of N objects and a cache for M (< N) objects of fixed size (objects of different size are handled as k unit-size chunks; bin-packing problems are almost irrelevant in large caches)
     - random independent requests with static popularity; $p_k$ is the request probability of object k in the order of popularity
       ⇒ The optimum strategy holds the most popular objects in the cache
     - Static popularity is favourable for the cache hit rate, since unforeseen changes in popularity detract from cache efficiency
     - Measurement traces of requests to YouTube show only slowly varying popularity; a few percent of new top-100 items appear per day/week
  7. Results on LRU Caching Strategy
     - An LRU cache is implemented as a stack of depth M; a new request puts the object on top
     - LRU is simple and frequently used (Squid, DropBox etc.)
     - Analysis of the hit rate for a static distribution is possible:
       $$h_{LRU}(M) = \sum_{k_1=1}^{N} p_{k_1} \sum_{\substack{k_2=1 \\ k_2 \ne k_1}}^{N} \frac{p_{k_2}}{1-p_{k_1}} \sum_{\substack{k_3=1 \\ k_3 \ne k_1,k_2}}^{N} \frac{p_{k_3}}{1-p_{k_1}-p_{k_2}} \cdots \sum_{\substack{k_M=1 \\ k_M \ne k_1,\dots,k_{M-1}}}^{N} \frac{p_{k_M}}{1-\sum_{j=1}^{M-1} p_{k_j}} \; \sum_{j=1}^{M} p_{k_j}$$
       but its evaluation is complex and feasible only for small cache sizes M < 15
     - Approximations by Towsley et al. (1999), Haßlinger & Hohlfeld (2010), and Fricker, Robert, Roberts (2011) seem to be good for arbitrary static request distributions, but have been verified only by simulation
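Since the closed formula on slide 7 is only tractable for small M, LRU hit rates for larger caches are typically estimated by simulation. The following is a minimal sketch of such a simulation for independent requests with a static Zipf-like popularity. All function names are hypothetical, the LRU stack is modelled with an `OrderedDict`, and the cold-start misses at the beginning of the run are counted, which slightly lowers the measured hit rate.

```python
import random
from collections import OrderedDict
from itertools import accumulate


def zipf_popularity(n, beta):
    """Normalised request probabilities proportional to R**(-beta), R = 1..n."""
    weights = [rank ** -beta for rank in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_lru_hit_rate(popularity, cache_size, num_requests, seed=0):
    """Estimate the LRU hit rate for independent requests with static popularity."""
    rng = random.Random(seed)
    cum_weights = list(accumulate(popularity))
    objects = range(len(popularity))
    cache = OrderedDict()  # last entry = top of the LRU stack
    hits = 0
    for obj in rng.choices(objects, cum_weights=cum_weights, k=num_requests):
        if obj in cache:
            hits += 1
            cache.move_to_end(obj)         # requested object moves to the top
        else:
            cache[obj] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used object
    return hits / num_requests


def optimum_hit_rate(popularity, cache_size):
    """Hit rate when the cache always holds the most popular objects."""
    return sum(sorted(popularity, reverse=True)[:cache_size])
```

For example, `simulate_lru_hit_rate(zipf_popularity(1000, 0.6), 50, 50000)` can be compared against `optimum_hit_rate(zipf_popularity(1000, 0.6), 50)` to see the gap between LRU and the static optimum for one parameter setting.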
  8. Worst Case Analysis of LRU Caching Strategy
     - Cache size M = 1 with only one popular object: $p_1 \gg \varepsilon > p_2, \dots$
       When the most popular item is always in the cache ⇒ optimum hit rate: $p_1$; the LRU hit rate is smaller: $p_1^2$.
     - Arbitrary cache size M with a set $I_{Pop}$ of M popular objects: $p_1 = p_2 = \dots = p_M = p/M \gg \varepsilon > p_{M+1}, p_{M+2}, \dots$
     - $p_{LRU}(j, k)$: probability that j popular items from the set $I_{Pop}$ are found in an LRU cache of size k. $p_{LRU}(j, k)$ can be analysed iteratively:
       $$p_{LRU}(j,k) = p_{LRU}(j,k-1)\,\frac{1 - Mp - (k-1-j)\varepsilon}{1 - jp - (k-1-j)\varepsilon} + p_{LRU}(j-1,k-1)\,\frac{(M-j+1)\,p}{1 - (j-1)p - (k-j)\varepsilon}$$
       ⇒ LRU hit rate: $$h_{LRU} = \sum_j p_{LRU}(j, M)\,[\,j\,p + (M-j)\,\varepsilon\,]$$
       In these two formulas p is multiplied by the number of popular objects, so it appears to stand for the request probability of a single popular object.
     - [Diagram: an LRU cache of size k consists of the cache of size k-1 plus $X_{Top}$, the last requested object that is not in the cache of size k-1. Starting from $p_{LRU}(j, k-1)$: if $X_{Top} \in I_{Pop}$, the state contributes to $p_{LRU}(j+1, k)$; if $X_{Top} \notin I_{Pop}$, it contributes to $p_{LRU}(j, k)$.]
     ⇒ Exact analysis of the LRU worst case hit rate is feasible
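The iterative analysis on slide 8 can be turned into a short dynamic program. The sketch below is one reading of the slide's recursion, not a verified transcription. It assumes that `p` is the request probability of each single popular object (so `m * p` is the total probability of the popular set), that every unpopular object has probability `eps`, and that there are enough unpopular objects for all probabilities to stay positive. The function name `lru_worst_case_hit_rate` is hypothetical.

```python
def lru_worst_case_hit_rate(m, p, eps=0.0):
    """LRU hit rate for m equally popular objects (probability p each)
    and a cache of size m; unpopular objects have probability eps each."""
    assert 0 <= m * p <= 1
    # dist[j] = probability that j popular objects occupy the top k
    # positions of the LRU stack; start with the empty stack (k = 0).
    dist = [1.0]
    for k in range(1, m + 1):
        new_dist = [0.0] * (k + 1)
        for j, prob in enumerate(dist):  # state (j, k-1)
            if prob == 0.0:
                continue
            # Probability mass of all objects outside the top k-1 positions.
            outside = 1.0 - j * p - (k - 1 - j) * eps
            # Share of that mass belonging to popular objects.
            popular_outside = (m - j) * p
            # The next stack position holds a popular object ...
            new_dist[j + 1] += prob * popular_outside / outside
            # ... or an unpopular one.
            new_dist[j] += prob * (outside - popular_outside) / outside
        dist = new_dist
    # A request hits if it addresses one of the m cached objects.
    return sum(prob * (j * p + (m - j) * eps) for j, prob in enumerate(dist))
```

For a cache of size 1 and `eps = 0` this reduces to `p ** 2`, in line with the first bullet of the slide, while the optimum strategy would reach `p`.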
  9. Worst Case Analysis of LRU Caching
     [Figure: cache hit rate (0% to 100%) versus the probability of a request to the set of popular objects in the worst case LRU scenario (0 to 1). Curves: most popular items in cache; LRU worst case for caches of size 1, 2, 10 and 50. Annotation: 28.9% max. absolute deficit → severe relative deficits for small cache hit rates.]
  10. Simulation Results for Caching Strategies
      Hit rate of the caching strategies (N = 1000 objects; K = 1000) for Zipf distributed requests $A(R) = \alpha R^{-\beta}$ (β = 0.6; α = 2.7%)
      [Figure: cache hit rate (0% to 40%) versus cache size M = 5, 10, 20, 50, 100. Curves: most popular objects in the cache; geometrical fading; sliding window; LRU approximation; LRU simulation.]
  11. Simulation Results for Caching Strategies
      Hit rate of the caching strategies (N = 1000; K = 1000) for Zipf distributed requests $A(R) = \alpha R^{-\beta}$ (β = 0.99; α = 13.9%)
      [Figure: cache hit rate (0% to 70%) versus cache size M = 5, 10, 20, 50, 100. Curves: optimum; geometrical fading; sliding window; LRU approximation; LRU simulation.]
  12. Simulation Results for Caching Strategies
      Hit rate of the caching strategies (N = 1000) for Zipf distributed requests $A(R) = \alpha R^{-\beta}$ (β = 0.99; α = 6.5%)
      [Figure: cache hit rate (35% to 60%) versus K = 1, 4, 16, 64, 128, 256, 512, 1024, 2048. Curves: optimum for i.i.d. requests; geometrical fading; sliding window; LRU.]
      Sliding window and geometrical fading: hit rate depending on the window size K and on ρ, with ρ = K/(K + 1)
  13. Conclusions on Cache Replacement Strategies
      - LRU seems to be the most often used strategy in web caches (Squid, DropBox)
      - For static popularity, LRU is below the maximal hit rate by
        - 28.9% in the worst case
        - 10-20% for large content sites (YouTube; Zipf-like requests)
      - LRU performance is poor especially for small caches
      - Statistics over a fixed-size window and geometric aging can converge to the optimum hit rate of the static popularity case
      - Implementation:
        - Statistics over a window need some storage and have constant update effort per request, but more than LRU
        - Geometric aging has effort O(ln(M))
      - Zipf law popularity makes (small) caches efficient

Editor's Notes

  • Flow: point-to-multipoint (especially with RSVP)
