Boomerang @ NY Web Perf meetup
1. Why measure?
boomerang
data data data
Measuring the web with boomerang
Philip Tellis / philip@bluesmoon.info
NY Performance Meetup / 2010-09-15
NY Performance Meetup / 2010-09-15 Measuring the web with boomerang
$ finger philip
Philip Tellis
philip@bluesmoon.info
@bluesmoon
yahoo
geek
http://bluesmoon.info/
Where does all the time go?
Who controls it?
Some of this we control and some of it we don’t
Back end
Measuring and improving back end performance can be done
during development
80-20
Turns out that less than 20% of the time is spent on the back
end
Front end
It’s what we can’t control that bites us
Too many variations
browsers
plugins
OSes
viruses
antiviruses
microwaves
baby monitors
naughty neighbours
file shares
governments
rodents
Try simulating all that in the lab!
We need to measure real end-user performance
We need to measure it from the real end-user’s box
Ask the user?
Bias
While this might work, it isn’t necessarily representative
A/B testing
You also want to be able to dynamically tune which users get
which tests
Phone home
It’s most useful if you can send these measurements back to
your server for analysis
Mostly ubiquitous
We know that javascript is available on almost every browser
Rich pages
We really want to measure the performance of rich pages
which depend on javascript already
Limited
But javascript can’t measure everything... we get as close as
we can
A piece of JavaScript that you add to your web page. It
measures the end user's perceived performance of your page
and beacons the results back to you
How?
<script src="boomerang.js" type="text/javascript"></script>
<script type="text/javascript">
BOOMR.init({
    user_ip: "<user's ip address>",
    beacon_url: "http://yoursite.com/beacon.php"
});
</script>
What does it do?
About once a week, measures user’s bandwidth and
latency to your server
On (almost) every request, measures the time it took to
load the current page
Beacons these results back to your server
Other stuff based on plugins
How does it do it?
Let’s take that one at a time
How do we measure latency?
Download a 32 byte gif 10 times in sequence
Measure the time to download each
Discard the first measurement because it’s overpriced
Calculate the arithmetic mean, standard deviation and
margin of error of the rest
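The latency steps above can be sketched roughly as follows. This is not boomerang's own code: the function name `summariseLatency`, the use of the sample standard deviation, and the 95% confidence factor of 1.96 are all assumptions made for illustration.

```javascript
// Hypothetical helper (not part of boomerang): summarise latency samples in ms.
function summariseLatency(samplesMs) {
  // Drop the first sample: it may include DNS lookup and TCP handshake time.
  const rest = samplesMs.slice(1);
  const n = rest.length;
  if (n < 2) {
    throw new Error("need at least three samples");
  }
  const mean = rest.reduce((sum, t) => sum + t, 0) / n;
  // Sample variance (divide by n - 1); an assumption, not confirmed by the talk.
  const variance = rest.reduce((sum, t) => sum + (t - mean) ** 2, 0) / (n - 1);
  const stdDev = Math.sqrt(variance);
  // 95% margin of error under a normal approximation (z = 1.96); also assumed.
  const marginOfError = (1.96 * stdDev) / Math.sqrt(n);
  return { n, mean, stdDev, marginOfError };
}
```

With ten timings where the first is much slower than the rest, the first is ignored and the summary describes the remaining nine.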
Wait, did you say overpriced?
The first image might require a DNS lookup and TCP
handshake
Slow start is not an issue since 32 bytes fits in 1 packet
How do we measure bandwidth?
After the latency test is done, we download progressively
larger images
Stop at the first image that times out
Redownload that image a few more times
Calculate the median, standard deviation and margin of
error of the largest images
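As a rough illustration of the last step only, here is one way to turn timed downloads of the largest images into a median bandwidth. The names `median` and `estimateBandwidthKbps` and the `{ bytes, ms }` record shape are hypothetical, and the sketch does not correct for latency, which a real implementation may well do.

```javascript
// Median of a list of numbers (does not modify the input).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical helper: each download is { bytes, ms } for one image fetch.
function estimateBandwidthKbps(downloads) {
  // bytes * 8 gives bits; bits per millisecond equals kilobits per second.
  const rates = downloads.map((d) => (d.bytes * 8) / d.ms);
  return median(rates);
}
```

The median is less sensitive than the mean to a single stalled download, which is presumably why the slide picks it for bandwidth.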
Measuring latency before bandwidth helps here
Those 10 latency images do a lot to widen the TCP
window size
The bandwidth images make much better use of bandwidth
The image we end with uses the most bandwidth
How do we measure page load time?
In the onbeforeunload event, measure the time and
store it in a cookie
In the onload event, check the cookie, and measure the
difference with the current time
We also make sure that the page that set the cookie is the
referrer of the current page
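The cookie round trip above could look something like this. It is a minimal sketch, not boomerang's implementation: the function names and the cookie fields `r` (referring page) and `s` (start time) are made up for illustration, and the actual reading and writing of `document.cookie` is left out so the logic stays testable.

```javascript
// Page A, in onbeforeunload: build the cookie value to store.
// Field names "r" and "s" are hypothetical.
function makeStartCookie(pageUrl, nowMs) {
  return "r=" + encodeURIComponent(pageUrl) + "&s=" + nowMs;
}

// Page B, in onload: return the load time in ms, or null if the cookie is
// missing, malformed, or was set by a page other than the referrer.
function loadTimeFromCookie(cookieValue, referrer, nowMs) {
  const params = new URLSearchParams(cookieValue || "");
  if (!params.has("s") || params.get("r") !== referrer) {
    return null;
  }
  const start = Number(params.get("s"));
  return Number.isFinite(start) ? nowMs - start : null;
}
```

In a browser, page A would pass `location.href` and the current time to `makeStartCookie`, and page B would pass `document.referrer` and the current time to `loadTimeFromCookie`.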
What? Two pages?
Yes, this needs two pages and cookies. If those aren’t
supported, we try to use the WebTiming API.
How accurate is it?
Latency measurements are very accurate (±1%)
Bandwidth is accurate to an order of magnitude; on bad
connections the error can be ±30%
Page load time sometimes has outliers, so you need
post-filtering
The margin of error tells you how good your data is
What do we do with the data?
Sanity checking to:
Remove fake data
Remove abusive data
Maybe just rate limiting
Statistical analysis to:
Remove outliers
Aggregate based on bandwidth blocks
Measure trends over time and correlate them with code
changes
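The slides do not say which outlier rule is used. As one common choice, and purely as an assumption, the sketch below drops page load times that fall more than 1.5 times the interquartile range outside the quartiles. The names `quantile` and `removeOutliers` are hypothetical.

```javascript
// Linear-interpolated quantile of an already sorted array, q in [0, 1].
function quantile(sorted, q) {
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Keep only load times within 1.5 * IQR of the quartiles (assumed rule).
function removeOutliers(loadTimesMs) {
  const sorted = [...loadTimesMs].sort((a, b) => a - b);
  const q1 = quantile(sorted, 0.25);
  const q3 = quantile(sorted, 0.75);
  const iqr = q3 - q1;
  const low = q1 - 1.5 * iqr;
  const high = q3 + 1.5 * iqr;
  return loadTimesMs.filter((t) => t >= low && t <= high);
}
```

A single 60-second page load among loads of one to one and a half seconds would be filtered out, while the ordinary loads are kept in their original order.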
Bandwidth blocks
0-100 kbps
100-300 kbps
300-2000 kbps
2-6 Mbps
6+ Mbps
Group page load times based on bandwidth block
Data points from some countries may require narrower bands
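Grouping by bandwidth block might be sketched like this, using the five blocks listed above. The helper names, the `{ bwKbps, loadMs }` record shape, and the choice to treat each upper bound as exclusive are assumptions, and the table would need narrower bands for some countries.

```javascript
// The blocks from the slide; maxKbps is the exclusive upper bound (assumed).
const BANDWIDTH_BLOCKS = [
  { label: "0-100 kbps", maxKbps: 100 },
  { label: "100-300 kbps", maxKbps: 300 },
  { label: "300-2000 kbps", maxKbps: 2000 },
  { label: "2-6 Mbps", maxKbps: 6000 },
  { label: "6+ Mbps", maxKbps: Infinity },
];

// Return the label of the block that a finite bandwidth in kbps falls into.
function blockFor(kbps) {
  return BANDWIDTH_BLOCKS.find((b) => kbps < b.maxKbps).label;
}

// Hypothetical helper: collect page load times per bandwidth block.
function groupByBlock(beacons) {
  const groups = {};
  for (const { bwKbps, loadMs } of beacons) {
    const label = blockFor(bwKbps);
    (groups[label] = groups[label] || []).push(loadMs);
  }
  return groups;
}
```

Each group can then be summarised on its own, so slow connections do not drown out changes that only show up for fast ones.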
Geographic data
Looking at latency from different geographic locations can tell
you where to put your next CDN
ISPs
Grouping data by ISP can tell you who’s behaving badly
Storing the data
We log all beacon requests to Apache's access log file
Low-traffic sites could write directly to a DB
Others have suggested using CouchDB as the beacon
server
Daily summaries can be sent across to ShowSlow
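If beacons land in an access log, a small script can pull the measurements back out. The sketch below assumes a common-log-style line with the request in double quotes, the `/beacon.php` path from the earlier snippet, and the query parameter names `t_done`, `bw`, `lat` and `u`. Treat those names as assumptions and check them against the beacon your boomerang version actually sends.

```javascript
// Hypothetical parser for one access log line containing a beacon request.
// Returns null for lines that are not beacon hits.
function parseBeaconLine(line) {
  const match = /"GET (\S+) HTTP\/[\d.]+"/.exec(line);
  if (!match) {
    return null;
  }
  // The base URL is a placeholder so that relative request paths can be parsed.
  const url = new URL(match[1], "http://placeholder.invalid");
  if (!url.pathname.endsWith("/beacon.php")) {
    return null;
  }
  const p = url.searchParams;
  const num = (key) => (p.has(key) ? Number(p.get(key)) : null);
  return {
    loadMs: num("t_done"), // page load time (assumed parameter name)
    bandwidth: num("bw"),  // measured bandwidth (assumed parameter name)
    latencyMs: num("lat"), // measured latency (assumed parameter name)
    page: p.get("u"),      // URL of the measured page (assumed parameter name)
  };
}
```

Running this over each line of the log and discarding the `null` results gives records that can feed the filtering and grouping steps.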
More data
Write plugins to get more performance data
We already have a DNS plugin
I’m thinking of an IPv6 vs. IPv4 plugin
What about a full WebTiming plugin?
Can we measure connection setup time?
You decide
Once you have the data, you can do anything with it
Thank you
http://github.com/yahoo/boomerang
http://yahoo.github.com/boomerang/doc/
Photo credits
flickr.com/photos/21233184@N02/4389412851
Contact me
Philip Tellis
yahoo
geek
@bluesmoon
http://bluesmoon.info/
slideshare.net/bluesmoon
philip@bluesmoon.info
References
github.com/yahoo/boomerang
More bandwidth doesn’t matter (much) – Mike Belshe
Analysing Bandwidth & Latency – YUI Blog
It’s the latency, stupid – Stuart Cheshire
The statistics of web performance