
Metrics 2.0 @ Monitorama PDX 2014

  2. by niteroi @ panoramio.com
  3. vimeo.com/43800150
  7. problems, Metrics 2.0 concepts, implementations & examples
  8. Mostly graphite
  9. terminology sync
  10. (1234567890, 82) (1234567900, 123) (1234567910, 109) (1234567920, 77) db15.mysql.queries_running host=db15 mysql.queries_running
  11. Problems
  12. Vimeo.com pagerequests/s? server X write perf?
  13. Finding metrics Browse hierarchies Dashboard search .. which keywords? Search in source code/documentation? Ask around ...
  14. stats.hits.vimeo_com stats_counts.hits.vimeo_com stats.*.requesthostport.vimeo_com_80
  15. Meaning, difference Unit? Where and how.. hard Prefixes Understanding metrics
  16. collectd.db.disk.sda1.disk_time.write
  17. Terminology? Which field is where? Total so far? From zero per datapoint? Aggregate? Which? Point at t=x describes which timeframe? Understanding metrics
  18. Change agent?
  19. Unclear, inconsistent terminology, format tightly coupled lack information
  20. O(S*P*A): S = # Sources, P = # People, A = # Aggregators
  23. times N
  24. graph definitions are redundant and a time sink.
  26. http://litlquest.com/forest-trees/see-forest-trees-2
  27. metrics 2.0 concepts
  28. Self-describing Standardized Orthogonal dimensions
  29. stats.timers.dfs5.proxy-server.object.GET.200.timing.upper_90
  30. { server: dfvimeodfsproxy5, http_method: GET, http_code: 200, unit: ms, metric_type: gauge, stat: upper_90, swift_type: object }
  31. allow more characters unit: Req/s, site: vimeo.com, ...
  32. Metadata meta: { src: proxy.py:458, from: diamond }
  33. Conceptual model vs wire protocol vs storage
  34. metrics20.org
  35. SI + IEC B Err Warn Conn Job File Req ... MB/s Err/d Req/h ...
  36. Immediate understanding of metrics Minimize time to graphs, alerting rules, debugging compatibility & flexibility in tooling
  37. Implementations & examples
  39. Carbon-tagger … stats.gauges.host.foo 125 1234567890 service=foo instance=host target_type=gauge unit=B 123 1234567890 …
  41. Statsdaemon unit=B unit=B ... unit=ms unit=ms ... unit=B/s unit=ms stat=mean unit=ms stat=upper_90 ...
  42. Keep metric tags in sync with data
  43. Graph Explorer
  45. Graph-Explorer queries 101 proxy-server swift server:regex unit=ms (AND)
  53. upper_90 (or stat=upper_90) from <datetime> to <datetime> avg over <timespec> (5M, 1h, 3d, ...)
  54. Compare object put/get stack … http_method:(PUT|GET) swift_type=object avg by http_code,server
  56. Comparing servers http_method:(PUT|GET) group by unit,target_type avg by http_code,swift_type,http_method
  58. transcode unit=Job/s avg over <time> from <datetime> to <datetime>
  59. Note: data is obfuscated
  60. Bucketing sum by zone:eu-west|us-east|ap-southeast|us-west|sa-east|vimeo-df|vimeo-lv group by state
  61. Note: data is obfuscated
  62. Compare job states per region (zones bucket) group by zone
  63. Note: data is obfuscated
  64. Unit conversion unit=Mb/s network server:regex sum by server
  67. Integration Metric unit=B/s Query unit=TB
  69. Deriving Metric unit=B Query unit=GB/d
  71. Bonus round
  80. Dashboard definition queries = [ 'cpu usage sum by core', 'mem unit=B !total group by type:swap', 'stack network unit=Mb/s', 'unit=B (free|used) group by =mountpoint' ]
  84. Future Work
  85. ● Storage aggregation rules ● graphite API functions such as cumulative, summarize and smartSummarize ● consolidateBy & Graph renderers
  87. Self-describing & standardized stat=upper/lower/mean/... target_type=counter..
  88. Select your view
  89. From: dygraphs.com
  90. Facet based suggestions
  91. unit=Err/s
  92. Conclusion structured self-describing standardized metrics = enabler
  93. Conclusion Manual composing should be last resort, not default
  94. Conclusion This sucks – Tell me why – What should we do instead? This is neat! – Help me make it better – Adopt native metrics 2.0, structured_metrics
  95. Seen in this presentation: metrics20.org vimeo.github.io/graph-explorer github.com/vimeo/timeserieswidget github.com/vimeo/carbon-tagger github.com/vimeo/statsdaemon github.com/Dieterbe/anthracite github.com/graphite-ng github.com/vimeo/graphite-influxdb github.com/vimeo/smoketcp github.com/vimeo/tailgate twitter.com/Dieter_be dieter.plaetinck.be
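
Slide 39 sets a legacy Graphite line next to its tagged equivalent. A minimal sketch of reading such a tagged line into a self-describing record could look like this. It assumes the space-separated `key=value ... value timestamp` layout exactly as it appears on the slide; the actual carbon-tagger wire format may differ, and `parse_tagged_line` is a hypothetical helper name, not part of any tool named in the talk.

```python
from typing import Dict, Tuple


def parse_tagged_line(line: str) -> Tuple[Dict[str, str], float, int]:
    """Split 'k=v k=v ... value timestamp' into (tags, value, timestamp)."""
    parts = line.split()
    if len(parts) < 3:
        raise ValueError(f"expected tags, value and timestamp: {line!r}")
    # The last two fields are the value and the unix timestamp;
    # everything before them is treated as a key=value tag.
    *tag_parts, raw_value, raw_ts = parts
    tags: Dict[str, str] = {}
    for part in tag_parts:
        key, sep, val = part.partition("=")
        if not sep or not key or not val:
            raise ValueError(f"not a key=value tag: {part!r}")
        tags[key] = val
    return tags, float(raw_value), int(raw_ts)


tags, value, ts = parse_tagged_line(
    "service=foo instance=host target_type=gauge unit=B 123 1234567890"
)
print(tags["unit"], value, ts)
```

Because the unit and target type travel with every datapoint, a consumer does not have to guess them from a naming prefix such as `stats.gauges`.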
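
Slides 45 and 53 outline Graph-Explorer query patterns: `key=value`, `key:regex` and bare words such as `upper_90`, all combined with AND; slide 80 also shows a `!` negation. The matcher below is a simplified illustration of that idea and not Graph-Explorer's implementation. The function name `matches` and the rule that a bare word matches as a substring of any tag value are assumptions made for this sketch; the tag set is the one from slide 30.

```python
import re
from typing import Dict, List


def matches(tags: Dict[str, str], patterns: List[str]) -> bool:
    """Return True when every pattern holds for the tag set (logical AND)."""
    for pattern in patterns:
        negate = pattern.startswith("!")
        body = pattern[1:] if negate else pattern
        if "=" in body:
            # key=value: exact match on one tag.
            key, val = body.split("=", 1)
            hit = tags.get(key) == val
        elif ":" in body:
            # key:regex: regular-expression search in one tag's value.
            key, regex = body.split(":", 1)
            hit = key in tags and re.search(regex, tags[key]) is not None
        else:
            # Bare word: assumed to match as a substring of any tag value.
            hit = any(body in v for v in tags.values())
        if hit == negate:
            return False
    return True


metric = {
    "server": "dfvimeodfsproxy5",
    "http_method": "GET",
    "http_code": "200",
    "unit": "ms",
    "metric_type": "gauge",
    "stat": "upper_90",
    "swift_type": "object",
}
print(matches(metric, ["http_method:(PUT|GET)", "swift_type=object", "upper_90"]))
print(matches(metric, ["unit=ms", "!upper_90"]))
```

Since the dimensions are orthogonal tags rather than positions in a dotted path, each pattern can be evaluated independently of the others.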
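
Slides 67 and 69 describe integration (metric in `unit=B/s`, query in `unit=TB`) and deriving (metric in `unit=B`, query in `unit=GB/d`). The arithmetic behind such conversions can be sketched as follows, assuming strictly increasing timestamps, a left Riemann sum for integration, and the SI and IEC prefixes mentioned on slide 35. All function names and sample numbers are hypothetical and do not come from the talk.

```python
from typing import List, Tuple

Series = List[Tuple[int, float]]  # (unix timestamp, value)

# Decimal (SI) and binary (IEC) prefixes, expressed as multipliers.
PREFIX = {
    "": 1.0, "k": 1e3, "M": 1e6, "G": 1e9, "T": 1e12,
    "Ki": 2.0**10, "Mi": 2.0**20, "Gi": 2.0**30, "Ti": 2.0**40,
}


def scale(value: float, prefix: str) -> float:
    """Express a base-unit value in a prefixed unit, e.g. B -> GB."""
    return value / PREFIX[prefix]


def integrate(rate: Series) -> float:
    """Accumulate a per-second rate over time (e.g. B/s -> B)."""
    total = 0.0
    for (t0, v0), (t1, _) in zip(rate, rate[1:]):
        total += v0 * (t1 - t0)  # each value holds until the next point
    return total


def derive_per_day(gauge: Series) -> Series:
    """Change per day between consecutive points (e.g. B -> B/d)."""
    out: Series = []
    for (t0, v0), (t1, v1) in zip(gauge, gauge[1:]):
        out.append((t1, (v1 - v0) * 86400 / (t1 - t0)))
    return out


rate_bps = [(0, 2e9), (10, 4e9), (20, 1e9)]  # hypothetical B/s samples
print(scale(integrate(rate_bps), "T"))       # total transferred, in TB

disk_b = [(0, 0.0), (43200, 1e9)]            # hypothetical B gauge, 12 h apart
print([(t, scale(v, "G")) for t, v in derive_per_day(disk_b)])  # in GB/d
```

A query layer can only pick these transformations automatically when each metric states its unit, which is the point of the `unit` tag.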
