In this series of 15-minute technical flash talks you will learn directly from Amazon CloudFront engineers and their best practices on debugging caching issues, measuring performance using Real User Monitoring (RUM), and stopping malicious viewers using CloudFront and AWS WAF.
2. What to Expect from the Session
• How Amazon CloudFront delivers content
• Configuring your cache on CloudFront
• Measure application performance with real user
monitoring (RUM)
• Stop malicious viewers with CloudFront and AWS WAF
4. Definitions
• Viewer
• An end-user requesting content from CloudFront
• On a mobile device, desktop or other internet-connected
device
• CloudFront POP
• Point Of Presence, also referred to as an Edge Location
• Located in datacenters in major metropolitan areas, directly
connected to multiple ISPs
• Several racks of servers and network equipment, terminating
viewer connections
10. What happened?
• A divergent resolver
• Resolvers that serve a wide set of users across many
networks/geographies
• VPN users
• Distributed corporate networks
• What can be done?
• Use a local resolver
• Use a resolver that supports EDNS0 ECS
11. What is EDNS0 client-subnet (ECS)?
• IETF open internet-draft
• Informational RFC 7871
• DNS query includes information about the network that
originated the query:
• First three octets of an IPv4 address commonly used
(1.2.3.0/24)
• No client-side resolver modifications necessary
• Some common open resolvers (such as Google’s 8.8.8.8
anycast resolver) support it
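As a rough illustration of what "information about the network" means on the wire, the sketch below encodes a client-subnet option following the layout in RFC 7871 (option code 8, address family, source prefix length, scope prefix length, truncated address). The function name `ecs_option` is made up for this example, and only IPv4 is handled.

```python
import ipaddress
import struct

ECS_OPTION_CODE = 8  # EDNS0 option code for client-subnet (RFC 7871)
FAMILY_IPV4 = 1      # address family number for IPv4


def ecs_option(client_ip: str, prefix_len: int = 24) -> bytes:
    """Encode an EDNS0 client-subnet option for an IPv4 client (illustrative helper)."""
    # strict=False zeroes the host bits, e.g. 1.2.3.4/24 -> 1.2.3.0/24.
    network = ipaddress.IPv4Network(f"{client_ip}/{prefix_len}", strict=False)
    # Only the octets covered by the prefix are sent.
    n_octets = (prefix_len + 7) // 8
    address = network.network_address.packed[:n_octets]
    scope_prefix_len = 0  # set to 0 in queries; the responder fills it in
    body = struct.pack("!HBB", FAMILY_IPV4, prefix_len, scope_prefix_len) + address
    return struct.pack("!HH", ECS_OPTION_CODE, len(body)) + body


print(ecs_option("1.2.3.4").hex())
```

The resolver, not the viewer, attaches this option, which is why no client-side change is needed.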
13. Key takeaways
• Where you are routed depends on many factors
• Network
• Geographic Location
• Individual POP status
• DNS is an imperfect request routing mechanism
• But it is also ubiquitous
• If your customers use ECS-enabled resolvers, their
experience will improve
16. What to expect
• What do we do with a viewer request?
• How do we cache?
• Generating cache keys
• Managing your cache
• Setting Cache-Control headers
• Configuring your distribution and cache behaviors
• Additional Best Practices
• Versioning your assets
• Forwarding only required values
• Monitor your logs
18. What happens with each request?
• Viewer request arrives: is it in cache?
• No: forward request to origin (Miss), then cache it
• Yes: is it expired?
• No: serve from cache (Hit)
• Yes: revalidate with origin
• Origin responds with 304 (Not Modified): Refresh Hit
• Origin responds with 200 (OK) and latest version of object: cache it
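The decision flow on this slide can be sketched as a small in-memory cache. This is a simplified model, not CloudFront's implementation: the `EdgeCache` class, the `fetch` callback signature, and the choice to label a revalidation that returns 200 as a "Miss" are all assumptions made for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Entry:
    body: bytes
    etag: str
    expires_at: float


# Hypothetical origin interface: fetch(path, etag) -> (status, body, etag, ttl_seconds)
OriginFetch = Callable[[str, Optional[str]], Tuple[int, bytes, str, float]]


class EdgeCache:
    def __init__(self, fetch: OriginFetch, clock: Callable[[], float] = time.time):
        self._fetch = fetch
        self._clock = clock
        self._store: Dict[str, Entry] = {}

    def handle(self, path: str) -> Tuple[str, bytes]:
        now = self._clock()
        entry = self._store.get(path)
        if entry is None:
            # Not in cache: forward the request to the origin, then cache it.
            _status, body, etag, ttl = self._fetch(path, None)
            self._store[path] = Entry(body, etag, now + ttl)
            return "Miss", body
        if now < entry.expires_at:
            # In cache and not expired.
            return "Hit", entry.body
        # In cache but expired: revalidate with the origin.
        status, body, etag, ttl = self._fetch(path, entry.etag)
        if status == 304:
            # Not Modified: keep the cached body and extend its lifetime.
            entry.expires_at = now + ttl
            return "RefreshHit", entry.body
        # 200 OK with the latest version: replace the cached object.
        self._store[path] = Entry(body, etag, now + ttl)
        return "Miss", body
```

Passing the clock in as a parameter keeps the sketch easy to exercise without waiting for real expiry.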
19. How do we generate a cache key?
Use the host header to create an internal canonical URL.
E.g., d123.cloudfront.net, example.com
Then…
- Remove query strings
- Remove the protocol
- Add accept-encoding (i.e., gzip, identity)
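A minimal sketch of these steps, assuming nothing is forwarded to the origin. The `cache_key` function and the `host/path|encoding` output format are invented for illustration; CloudFront's internal key format is not described on the slide.

```python
from urllib.parse import urlsplit


def cache_key(url: str, host_header: str, accept_encoding: str = "") -> str:
    """Build an illustrative cache key from a request (hypothetical format)."""
    parts = urlsplit(url)
    # The Host header supplies the canonical host; the protocol and the
    # query string are dropped.
    canonical = host_header.lower() + (parts.path or "/")
    # Normalize Accept-Encoding down to gzip or identity.
    tokens = [t.split(";")[0].strip().lower() for t in accept_encoding.split(",")]
    encoding = "gzip" if "gzip" in tokens else "identity"
    return f"{canonical}|{encoding}"


print(cache_key("https://example.com/img/a.png?x=1", "example.com", "gzip, deflate"))
```

Two requests that differ only in protocol or query string map to the same key here, which is what makes the object reusable across viewers.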
21. Expires headers from origin
Expires reflects when the cache must go back to the origin
server to see if the object has changed.
It is a fixed point in time and accuracy relies on clock
synchronization.
< Expires: Fri, 01 Dec 2017 12:34:50 GMT
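Because Expires is an absolute timestamp, the remaining lifetime depends on the clock of whoever evaluates it. A small sketch using the Python standard library; the helper name `seconds_until_expiry` is made up for this example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def seconds_until_expiry(expires_header: str, now: datetime) -> float:
    """Remaining freshness in seconds, given the evaluator's idea of 'now'."""
    expires_at = parsedate_to_datetime(expires_header)
    return max(0.0, (expires_at - now).total_seconds())


now = datetime(2017, 12, 1, 12, 34, 20, tzinfo=timezone.utc)
print(seconds_until_expiry("Fri, 01 Dec 2017 12:34:50 GMT", now))
```

A cache whose clock runs a minute fast would treat this object as already expired, which is the clock-synchronization caveat on the slide.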
22. Cache-Control headers from origin
These directives give you fine-grained control over what is cached and
for how long (in seconds):
< Cache-Control: max-age=300
< Cache-Control: max-age=30, s-maxage=3600
max-age: Browser
s-maxage: Shared Edge Cache
Example: max-age=0, s-maxage=86400 for display ads
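To make the split between browser and shared-cache lifetimes concrete, here is a small parser sketch. The helper names `parse_cache_control` and `ttls` are invented for this example, and it ignores directives other than max-age and s-maxage.

```python
from typing import Dict, Optional, Tuple


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control header into {directive: argument-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def ttls(value: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (browser_ttl, shared_cache_ttl) in seconds; None when unspecified."""
    d = parse_cache_control(value)
    max_age = int(d["max-age"]) if d.get("max-age") else None
    s_maxage = int(d["s-maxage"]) if d.get("s-maxage") else None
    # s-maxage overrides max-age for shared caches only.
    return max_age, (s_maxage if s_maxage is not None else max_age)


print(ttls("max-age=30, s-maxage=3600"))
```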
24. Dynamic content? Cache it.
Use Cache-Control directives to minimize load on your origin:
- no-cache: cache & ask origin
- max-age=0: cache & ask origin
Other options:
- no-store: never cached at the edge or by the browser
- private: never cached at the edge, but might be cached
by the browser
26. Cache behaviors on CloudFront
Specify caching configurations based on URL path matching (i.e., for different content).
Whatever you forward affects your cache key. Use Trusted Advisor checks!
Be wary of:
• Forwarded headers
• Query string forwarding
• Cookie forwarding
27. Set Min, Max, and Default TTLs for CloudFront
• Browser: max-age / Expires
• Edge cache: max-age / s-maxage / Expires, bounded by Min TTL and Max TTL
[Figure: timelines comparing the origin-supplied value (max-age / s-maxage / Expires) with Min TTL and Max TTL]
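One way to read this slide is as a clamp: the origin's value is used at the edge unless it falls outside the Min TTL and Max TTL bounds, and Default TTL applies when the origin sends nothing. The function below is a simplified model under that assumption; `edge_ttl` is a made-up name and the real service has more cases than this covers.

```python
from typing import Optional


def edge_ttl(origin_ttl: Optional[int], min_ttl: int, default_ttl: int, max_ttl: int) -> int:
    """Simplified edge lifetime in seconds (illustrative model, not the service logic)."""
    # No caching headers from the origin: fall back to the default.
    ttl = default_ttl if origin_ttl is None else origin_ttl
    # Clamp into [min_ttl, max_ttl].
    return max(min_ttl, min(ttl, max_ttl))


print(edge_ttl(30, min_ttl=60, default_ttl=86400, max_ttl=31536000))
```

The browser is unaffected by these bounds; it still sees whatever max-age or Expires value the origin sent.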
29. Errors? Cache them too!
Cache and return a custom error
page and response code for each
HTTP error code.
Give your origin just the right
amount of time to recover.
30. Version your assets
Enable faster iteration of new styles without issuing invalidations.
Protect against browsers that don’t honor your Cache-Control headers.
<link href="//assets.example.com/assets/v1/css/jumbotron-narrow.css" rel="stylesheet">
<link href="//assets.example.com/assets/v2/css/jumbotron-narrow.css" rel="stylesheet">
<link href="//assets.example.com/assets/css/jumbotron-narrow.css?<md5sum>" rel="stylesheet">
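The `?<md5sum>` variant can be generated at build time by hashing the file contents, so the URL changes exactly when the asset does. A minimal sketch; `versioned_url` is a hypothetical helper, not part of any AWS tooling.

```python
import hashlib


def versioned_url(base_url: str, content: bytes) -> str:
    """Append the MD5 hex digest of the asset body as a version query string."""
    digest = hashlib.md5(content).hexdigest()
    return f"{base_url}?{digest}"


css = b"body { margin: 0; }"
print(versioned_url("//assets.example.com/assets/css/jumbotron-narrow.css", css))
```

Note that the query-string form only yields distinct cached objects at the edge if query strings are forwarded and therefore part of the cache key; the `/v1/`, `/v2/` path form does not depend on that setting.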
31. Minimize forwarded values
All forwarded headers become part of the cache key, so
every extra header you forward can dramatically reduce
your cacheability.
32. When in doubt, check the logs!
#Version: 1.0
#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) x-edge-result-type x-edge-request-id x-host-header cs-protocol cs-bytes time-taken x-forwarded-for ssl-protocol ssl-cipher x-edge-response-result-type cs-protocol-version
2014-05-23 01:13:11 FRA2 182 192.0.2.10 GET d111111abcdef8.cloudfront.net /view/my/file.html 200 www.displaymyfiles.com Mozilla/4.0%20(compatible;%20MSIE%205.0b1;%20Mac_PowerPC) - zip=98101 RefreshHit MRVMF7KydIvxMWfJIglgwHQwZsbG2IhRJ07sn9AkKUFSHS9EXAMPLE== d111111abcdef8.cloudfront.net http - 0.001 - - - RefreshHit HTTP/1.1
2014-05-23 01:13:12 LAX1 2390282 192.0.2.202 GET d111111abcdef8.cloudfront.net /soundtrack/happy.mp3 304 www.unknownsingers.com Mozilla/4.0%20(compatible;%20MSIE%207.0;%20Windows%20NT%205.1) a=b&c=d zip=50158 Hit xGN7KWpVEmB9Dp7ctcVFQC4E-nrcOcEKS3QyAez--06dV7TEXAMPLE== d111111abcdef8.cloudfront.net http - 0.002 - - - Hit HTTP/1.1
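The `#Fields` header makes these records easy to turn into dictionaries. The sketch below splits on whitespace, which works for the sample above because values such as the user agent are URL-encoded; the helper names `parse_fields` and `parse_record` are invented for this example.

```python
from typing import Dict, List


def parse_fields(header_line: str) -> List[str]:
    """Extract the field names from a '#Fields:' header line."""
    prefix = "#Fields:"
    if not header_line.startswith(prefix):
        raise ValueError("not a #Fields header")
    return header_line[len(prefix):].split()


def parse_record(fields: List[str], line: str) -> Dict[str, str]:
    """Map one log record onto the field names."""
    values = line.split()
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} values, got {len(values)}")
    return dict(zip(fields, values))


HEADER = (
    "#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) "
    "cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) "
    "x-edge-result-type x-edge-request-id x-host-header cs-protocol cs-bytes "
    "time-taken x-forwarded-for ssl-protocol ssl-cipher "
    "x-edge-response-result-type cs-protocol-version"
)
RECORD = (
    "2014-05-23 01:13:11 FRA2 182 192.0.2.10 GET d111111abcdef8.cloudfront.net "
    "/view/my/file.html 200 www.displaymyfiles.com "
    "Mozilla/4.0%20(compatible;%20MSIE%205.0b1;%20Mac_PowerPC) - zip=98101 "
    "RefreshHit MRVMF7KydIvxMWfJIglgwHQwZsbG2IhRJ07sn9AkKUFSHS9EXAMPLE== "
    "d111111abcdef8.cloudfront.net http - 0.001 - - - RefreshHit HTTP/1.1"
)

record = parse_record(parse_fields(HEADER), RECORD)
print(record["x-edge-location"], record["x-edge-result-type"], record["x-edge-request-id"])
```

Grouping parsed records by `x-edge-result-type` is a quick way to see the Hit, RefreshHit, and Miss ratio for a path.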
34. Key takeaways
• Set Cache-Control headers appropriately for your
content
• Cache dynamic content
• Create multiple cache behaviors and adapt
configurations for your content type, including errors
• Forward only required values
• Version your assets
• Log your request IDs!
36. Measure application performance with RUM
Synthetic monitoring vs. real user monitoring (RUM):
• Synthetic monitoring overview
• RUM overview
• When to use one over the other (baselining vs. gaining
situational insight)
37. What is synthetic monitoring?
Pros:
• Consistent signal of service health
• Easy to set up (kind of)
• Baseline performance
[Diagram: synthetic monitoring configuration, synthetic monitoring portal, simulated users, web application]
39. Where synthetic measurements go wrong
Cons:
• Network path to your application might not be representative
• Special cases and snowflakes
[Diagram: synthetic monitoring configuration, simulated users, real user, web application]
41. How do you feel about RUM?
[Diagram: real users, web application, script injected in web page HTTP response, RUM provider portal]
• Script injected in web page
• Script beacons data back from the user’s browser session to the
RUM provider
• RUM provider portal aggregates the data for analysis
42. What can RUM tell you?
• What should my next optimization be?
• What is the cause of a loss of availability?
*Reference: https://developers.google.com
43. Network optimizations: connections
Connection definitions:
• Queueing – Time spent waiting to begin processing
• Stalled/Blocking – Total time spent in queue or proxying
• DNS lookup – Time taken to receive DNS records (like A or
AAAA)
• Initial connection – Inclusive of TCP handshake and negotiating
SSL
44. Network optimizations: requests
Request definitions:
• Request sent – HTTP request sent time
• TTFB – Time to first byte
• Content download – Time to last byte
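These phases can be derived from the timestamps a RUM script beacons back. The sketch below assumes the beacon carries attributes named after the W3C Navigation Timing fields (`domainLookupStart`, `connectStart`, `requestStart`, and so on) in milliseconds; the function name and the sample numbers are invented for illustration.

```python
from typing import Dict


def request_phases(t: Dict[str, float]) -> Dict[str, float]:
    """Break one page request into phases (all values in milliseconds)."""
    return {
        "dns_lookup": t["domainLookupEnd"] - t["domainLookupStart"],
        # Includes the TCP handshake and any TLS negotiation.
        "initial_connection": t["connectEnd"] - t["connectStart"],
        # Time from sending the request until the first response byte arrives.
        "ttfb": t["responseStart"] - t["requestStart"],
        # Time from the first to the last response byte.
        "content_download": t["responseEnd"] - t["responseStart"],
    }


# Invented sample beacon.
sample = {
    "domainLookupStart": 10, "domainLookupEnd": 35,
    "connectStart": 35, "connectEnd": 95,
    "requestStart": 96, "responseStart": 246, "responseEnd": 306,
}
print(request_phases(sample))
```

Aggregating these per-phase numbers across real users shows which phase dominates and therefore which optimization is worth trying next.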
45. Network optimizations: head of line blocking
Serialized requests could be your bottleneck due to head-of-line blocking in
HTTP/1.1 if you’re serving from the same origin!
46. Network optimizations: Key takeaways
Insights from this example:
• Evaluate your user-base
• Know your data
• Look at the right data
Optimizations:
• Use CloudFront!
• Origin as close to your end-users as possible (multi-region)
• HTTP/2
47. Best practices for configuring RUM on CloudFront
• Availability: Test your critical resources
• Index pages
• Video manifests
• Critical resources required for page load
• Performance: Capture Total Load time
• First-Byte latency is not always important. Know your content
and optimize on the appropriate dimension!
49. Securing your CloudFront distribution
• Leverage AWS WAF with preconfigured protections
• Configure CloudFront to serve private content
• Automate security response by using services like AWS
Lambda
• Leverage AWS Certificate Manager for SSL
56. Private content – restrict origin access
Amazon S3
Origin Access Identity (OAI)
• Prevents direct access to your Amazon
S3 bucket
• Ensures performance benefits to all
customers
Custom origin
Block by IP address
• Whitelist only the Amazon CloudFront
IP range
• Protects origin from overload
• Ensures performance benefits to all
customers
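For the custom-origin case, the allowlist check itself is simple once the CloudFront ranges are known. The sketch below filters a document shaped like the AWS-published `ip-ranges.json` (a `prefixes` list with `ip_prefix` and `service` keys); the sample prefixes are reserved documentation addresses used as placeholders, not real CloudFront ranges, and the helper names are made up.

```python
import ipaddress
from typing import List


def cloudfront_networks(ip_ranges: dict) -> List[ipaddress.IPv4Network]:
    """Pick the IPv4 prefixes tagged with the CLOUDFRONT service."""
    return [
        ipaddress.IPv4Network(p["ip_prefix"])
        for p in ip_ranges.get("prefixes", [])
        if p.get("service") == "CLOUDFRONT"
    ]


def is_allowed(source_ip: str, networks: List[ipaddress.IPv4Network]) -> bool:
    """True when the connecting address falls inside an allowed network."""
    addr = ipaddress.IPv4Address(source_ip)
    return any(addr in net for net in networks)


# Placeholder data in the same shape as ip-ranges.json.
SAMPLE = {
    "prefixes": [
        {"ip_prefix": "203.0.113.0/24", "service": "CLOUDFRONT"},
        {"ip_prefix": "198.51.100.0/24", "service": "EC2"},
    ]
}
allowed = cloudfront_networks(SAMPLE)
print(is_allowed("203.0.113.7", allowed), is_allowed("198.51.100.7", allowed))
```

In practice this check usually lives in a security group or firewall rule rather than in application code.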
57. Private content – signed URLs and cookies
Signed URLs
• Add signature to the query string in the URL
• Your URL changes
• Use to restrict access to individual files
Signed Cookies
• Add signature to a cookie
• Your URL does not change
• Use to restrict access to multiple files
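To show why a signed URL "changes", here is a sketch of assembling one with a canned policy. The policy shape, the `Expires`, `Signature`, and `Key-Pair-Id` parameters, and the character substitutions reflect my reading of the CloudFront documentation and should be checked against it. The `sign` argument is a caller-supplied placeholder for the RSA signature made with your private key, and the key pair ID shown is hypothetical.

```python
import base64
import json
from typing import Callable


def cloudfront_b64(data: bytes) -> str:
    """Base64-encode, then swap characters that are not URL-safe."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


def canned_policy(url: str, expires_epoch: int) -> str:
    """Canned policy as compact JSON (no whitespace)."""
    policy = {
        "Statement": [
            {"Resource": url, "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}}}
        ]
    }
    return json.dumps(policy, separators=(",", ":"))


def signed_url(url: str, expires_epoch: int, key_pair_id: str,
               sign: Callable[[bytes], bytes]) -> str:
    """Append the signature parameters to the URL's query string."""
    signature = cloudfront_b64(sign(canned_policy(url, expires_epoch).encode("utf-8")))
    separator = "&" if "?" in url else "?"
    return (f"{url}{separator}Expires={expires_epoch}"
            f"&Signature={signature}&Key-Pair-Id={key_pair_id}")


# Stand-in signer for demonstration only; a real one signs with the private key.
fake_sign = lambda payload: b"\xfb\xff"
print(signed_url("https://d111111abcdef8.cloudfront.net/a.jpg", 1512131690,
                 "KEXAMPLEKEYPAIRID", fake_sign))
```

Signed cookies carry the same kind of signature in cookies instead, which is why the URL stays unchanged in that case.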
58. Automate security response
• Subscribe to Amazon SNS notifications for changes to
IP ranges
• Automatically update security groups
[Diagram: a change to the AWS IP ranges triggers an "Update IP range" SNS message through Amazon SNS; AWS Lambda receives it and updates the security group in front of the web app servers behind Amazon CloudFront]
https://github.com/awslabs/aws-cloudfront-samples
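The core of such an automation is working out which security group rules to add and which to revoke when the published ranges change. The sketch below covers only that planning step, with a made-up `plan_update` helper and placeholder CIDRs; the SNS handling and the EC2 API calls that would apply the plan are deliberately left out.

```python
from typing import Iterable, Set, Tuple


def plan_update(current: Iterable[str], published: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (cidrs_to_add, cidrs_to_remove) to bring the group in line."""
    current_set, published_set = set(current), set(published)
    return published_set - current_set, current_set - published_set


# Placeholder CIDRs from the documentation address ranges.
to_add, to_remove = plan_update(
    current=["203.0.113.0/24", "198.51.100.0/24"],
    published=["203.0.113.0/24", "192.0.2.0/24"],
)
print(sorted(to_add), sorted(to_remove))
```

Computing the difference, rather than replacing every rule, keeps existing connections through unchanged ranges undisturbed.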