O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.

An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google Talk, and MSN Messenger

1.953 visualizações

Publicada em

VoIP playout buffer dimensioning has long been a challeng- ing optimization problem, as the buffer size must maintain a balance between conversational interactivity and speech quality. The conversational quality may be affected by a number of factors, some of which may change over time. Although a great deal of research effort has been expended in trying to solve the problem, how the research results are applied in practice is unclear.

In this paper, we investigate the playout buffer dimension- ing algorithms applied in three popular VoIP applications, namely, Skype, Google Talk, and MSN Messenger. We conduct experiments to assess how the applications adjust their playout buffer sizes. Using an objective QoE (Quality of Experience) metric, we show that Google Talk and MSN Messenger do not adjust their respective buffer sizes appropriately, while Skype does not adjust its buffer at all. In other words, they could provide better QoE to users by improving their buffer dimensioning algorithms. Moreover, none of the applications adapts its buffer size to the network loss rate, which should also be considered to ensure optimal QoE provisioning.

Publicada em: Tecnologia, Negócios
  • Seja o primeiro a comentar

  • Seja a primeira pessoa a gostar disto

An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google Talk, and MSN Messenger

  1. 1. An Empirical Evaluation of  VoIP Playout Buffer Dimensioning  for Kuan‐Ta Chen, Academia Sinica Chen‐Chi Wu and Chin‐Laung Lei, National Taiwan University Chun‐Yin Huang, National Taiwan Ocean University NOSSDAV 2009
  2. 2. Playout Buffering Voice needs to be played smoothly, but network  may inject delay variance (jitters) (dejitter buffer)
  3. 3. The Tradeoff As buffering sacrifices delay in exchange for a  higher chance that voice packets arrive on time Larger buffer size  better speech quality lower conversational interactivity Smaller buffer size  worse speech quality higher conversational interactivity NOSSDAV 2009 / Kuan‐Ta Chen  3
  4. 4. Playout Buffer Dimensioning Deciding the optimal buffer size maintain a balance between  conversational interactivity and speech quality It’s challenging because of so many factors Network delay Network delay jitter changes over time Network loss Codec implementation Redundancy control Error recovery Transport protocol, etc NOSSDAV 2009 / Kuan‐Ta Chen  4
  5. 5. Proposals on Buffer Dimensioning Many algorithms have been proposed  The k largest delay among the previous m delays [Naylor’85] Inflate buffer size when packets arrive late, and shrink buffer  size over time [Stone’95] Weighted sum of EWMA of delay and delay jitter [Ramjee’94] Automatic adjustment of EWMA weights [Narbutt’03] Weight adjustment within talk bursts [Liang’03, Sreenan’03] … NOSSDAV 2009 / Kuan‐Ta Chen  5
  6. 6. We are curious about … Skype: 405 million registrars (15 million online) Whether a gap exists between  research community and software  practitioners?
  7. 7. In other words … Do commercial products really  adopt any of these proposals? What’s the performance of  commercial products (in terms  of buffer dimensioning)?
  8. 8. Our Contribution A systematic measurement methodology for  measuring VoIP playout buffer size Show that the real‐life VoIP applications do not  adjust their buffer sizes appropriately based on QoE measures A regression‐based algorithm to compute the  optimal buffer size given a network condition Light‐weight computation; thus can be applied in run time NOSSDAV 2009 / Kuan‐Ta Chen  8
  9. 9. Talk Progress Overview Measuring Applications’ Buffer Size Experiment Methodology Measurement Results Deriving Optimal Buffer Size Methodology Derived Buffer Sizes Evaluation of Applications’ Dimensioning Algorithms Conclusion & On‐going Work
  10. 10. Experiment Methodology Speaker Left channel  pac Audio input Audio output ket (speaker) s k ets Recorder pac Right channel  FreeBSD w/  (listener ) Audio output dummynet Listener Wave File Dummynet for controling network conditions (delay,  jitter, and packet loss) Use a recording card (ESI Maya44) to ensure time‐ synchronized audio recordings from two hosts NOSSDAV 2009 / Kuan‐Ta Chen  10
  11. 11. Buffer Size Estimation Finding best alignment  (based on cross‐correlation  end‐to‐end  coeffcients) delay End‐to‐end delay components Network delay (dummynet) Coder delay + packetization delay (assumes 50 ms) Playout buffering delay (unknown) Buffer size = e2e delay – network delay – 50 ms NOSSDAV 2009 / Kuan‐Ta Chen  11
  12. 12. Experiment Settings Application: Skype, Google Talk, MSN Messenger Network delay & jitter: 0 ms, 25 ms, 50ms, …,  200 ms Network loss rate: 0%, 1%, …, 10% 10 VoIP calls for each app/network setting Each call lasts 240 seconds NOSSDAV 2009 / Kuan‐Ta Chen  12
  13. 13. Effect of Delay and Jitter Skype maintains the same buffer size Google Talk slightly adjusts the buffer size according to  delay and jitter MSN Messenger’s buffer size grows linearly as the jitter  increases
  14. 14. Effect of Packet Loss All three applications do not adapt buffer size to  packet loss NOSSDAV 2009 / Kuan‐Ta Chen  14
  15. 15. Having seen the different behavior  of the applications, … Which one application’s playout  dimensioning is best? Is the best one optimal?
  16. 16. Talk Progress Overview Measuring Applications’ Buffer Size Experiment Methodology Measurement Results Deriving Optimal Buffer Size Methodology Derived Buffer Sizes Evaluation of Applications’ Dimensioning Algorithms Conclusion On‐going Work
  17. 17. Deriving Optimal Buffer Size ‐ QoE How to define the “optimal” buffer size? Buffer size that yields the best quality of experience (QoE) How to measure the QoE of a VoIP call? PESQ (ITU‐T P.862, Perceptual Evaluation of Speech  Quality) measures listening quality signal level, accurate E‐Model (ITU‐T G.107) measures overall quality (listening + interactivity) network level, lightweight but not accurate in listening  quality NOSSDAV 2009 / Kuan‐Ta Chen  17
  18. 18. QoE Assessment Model [Ding’03] original audio clip  degraded audio clip  PESQ algorithm extension MOS score conversion R score substract Id in E‐model final R score final MOS score
  19. 19. Deriving Optimal Buffer Size ‐ Simulation For each (buffer size, network setting) 1. Encode an audio clip into a sequence of VoIP frames 2. Impairment at the network network delay & jitter packet loss (Gilbert model) 3. Packet discarding at the receiver drop a packet if its arrival time is later than scheduled time (sent time +  playout buffer size)  4. Decode the result frames (a subset of original frames)  to a degraded audio clip 5. Compute average QoE scores NOSSDAV 2009 / Kuan‐Ta Chen  19
  20. 20. Derived Optimal Buffer Size (1)
  21. 21. Derived Optimal Buffer Size (2)
  22. 22. A typical relationship between QoS  and QoE Hard to tell “very bad” from “extremely bad” QoE Marginal benefit is small QoS,  e.g., speech quality or (e2e delay)‐1 NOSSDAV 2009 / Kuan‐Ta Chen  22
  23. 23. Comparing Real‐Life Applications with  Optimal Settings MSN Messenger’s buffer dimensioning algorithm is  better than those of Skype and Google Talk
  24. 24. Modeling Optimal Buffer Size Using a linear regression to model the optimal  buffer size given a network setting (const.) + coef delay ⋅ delay + coef delay⋅ jitter ⋅ delay ⋅ jitter + coef delay⋅ jitter ⋅ plr ⋅ delay ⋅ jitter ⋅ plr NOSSDAV 2009 / Kuan‐Ta Chen  24
  25. 25. Talk Progress Overview Measuring Applications’ Buffer Size Experiment Methodology Measurement Results Deriving Optimal Buffer Size Methodology Derived  Buffer Sizes Evaluation of Applications’ Dimensioning Algorithms Conclusion & On‐going Work
  26. 26. Conclusion Our results show that MSN Messenger performs the  best in terms of buffer dimensioning Proprietary codec are used in Skype Other factors may dominate the final quality perceived by users Results from the research community seem not be  applied in real‐life VoIP applications methods not generalizable enough? methods not inutitive enough? methods not practical enough? (e.g., hard to implement) other explanations? A regression modeling appraoch to compute the  optimal buffer size in run time
  27. 27. On‐going Work More factors in measuring applications’ buffer  dimensioning behavior frame size, redundancy control, loss burstiness, speech codec,  … More factors in deriving optimal buffer size transport protocol (TCP in addition to UDP), speech codec, … Real‐life network experiments to evaluate the   regression‐based buffer dimensioning algorithm NOSSDAV 2009 / Kuan‐Ta Chen  27
  28. 28. Thank You! Kuan‐Ta Chen http://www.iis.sinica.edu.tw/~ktchen NOSSDAV 2009