This document discusses latency and strategies for improving latency when building APIs and services. It describes using percentiles instead of averages to measure latency, and tools like HdrHistogram and Finagle for tracking latencies. It also discusses experiences using Redis, Cassandra and Aerospike for low latency, and strategies for connecting to Cassandra and handling timeouts. Key takeaways are around not relying on averages, challenges of scaling in .NET, choosing the right NoSQL solution, and that blocking is not always bad if it reduces timeouts.
Donatas Mažionis, Building low latency web APIs
2. This talk is not a hardcore latency talk
I will not talk about:
•CPU caches
•System.nanoTime
•lockless concurrent queues
•magic low latency framework
3. This talk is not a hardcore latency talk
Scaling from 500 to 150K QPS, the hard way
11. Percentiles
The value below which a given percentage of observations in a group of observations fall
For example, p50 is the value at or below which 50% of the observations fall.
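To make the definition concrete, here is a minimal sketch that computes percentiles with the nearest-rank method and compares them with the average. The sample latencies are made up for illustration, and `percentile` is a hypothetical helper, not part of any library mentioned in this talk.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical data: 98 fast requests (10 ms) and 2 slow ones (2000 ms).
latencies_ms = [10] * 98 + [2000] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p98 = percentile(latencies_ms, 98)
p99 = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.1f} ms p50={p50} ms p98={p98} ms p99={p99} ms")
```

With this data the mean lands between the fast and the slow group and matches no request that actually happened, while p50, p98 and p99 show where the tail starts.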
14. Libraries for tracking latencies
HdrHistogram: http://hdrhistogram.github.io/HdrHistogram/
Records values using fixed memory and constant CPU cost. Available for C and Java; a C# version is a work in progress.
Finagle: https://twitter.github.io/finagle/
An RPC framework for Scala and Java by Twitter, with built-in stats and latency tracking.
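The idea of recording latencies in fixed memory with constant work per sample can be sketched as below. This is a deliberately simplified illustration with 1 ms linear buckets, not HdrHistogram's actual bucket layout or API; the class name `FixedBucketHistogram` and its methods are hypothetical.

```python
import math


class FixedBucketHistogram:
    """Latency histogram with a fixed number of 1 ms buckets.

    All memory is allocated up front and record() is O(1).
    Values above max_ms are clamped into the last bucket.
    """

    def __init__(self, max_ms):
        self.max_ms = max_ms
        self.counts = [0] * (max_ms + 1)
        self.total = 0

    def record(self, latency_ms):
        # Truncate to whole milliseconds and clamp into [0, max_ms].
        bucket = min(max(int(latency_ms), 0), self.max_ms)
        self.counts[bucket] += 1
        self.total += 1

    def value_at_percentile(self, p):
        if self.total == 0:
            raise ValueError("histogram is empty")
        target = max(1, math.ceil(p * self.total / 100))
        seen = 0
        for bucket, count in enumerate(self.counts):
            seen += count
            if seen >= target:
                return bucket
        return self.max_ms


hist = FixedBucketHistogram(max_ms=1000)
for latency in range(1, 101):  # hypothetical samples: 1 ms .. 100 ms
    hist.record(latency)

print(hist.value_at_percentile(50), hist.value_at_percentile(99))
```

Querying a percentile scans the buckets, so it costs time proportional to the bucket count rather than to the number of recorded samples.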
19. APIs in online advertising
98% of requests under 100 ms
HTTP
JSON
Protocol Buffers
20. Real-time bidding API
How much would you pay to show your 200x120 ad on youtube.com to a user from Belgium who is interested in Sports and Culture?
23. Real-time bidding request processing
1. Deserialize request
2. Process some rules
3. Get pre-calculated bid price from storage
4. Calculate some more
5. Serialize response
Time budget labels on the slide: 40 ms, 60 ms. The remaining 40 ms is left for network latency.
24. Architecture diagram (components): LVS + keepalived, Profiler API, User profiles, Bid price calculators, Bidder API, Ad serving
25. Redis in 50 words or less
Redis is an open source, BSD licensed, advanced key-value cache and store.
26. Redis as key-value store
•Append-only writes, flushed to disk every second
•Operations on multiple keys
•Works great, but watch out when writing/reading on the same node simultaneously
28. Cassandra in 50 words or less
Apache Cassandra is an open source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, tuneably consistent, column-oriented database
29. Why Cassandra is good
•Fast writes
•User profile is a natural key-value model
•Easy to scale (especially with virtual nodes)
•Seemed the most mature at that time (started using from v0.7)
•Runs on spare legacy hardware
•Runs on Windows :)
38. Fail fast plan
1. Set the TSocket timeout to 10 ms
2. If a node does not answer within 10 ms, try another node from the same range
3. Repeat this 3 times
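The plan above can be sketched as a small retry loop. This is an illustration under stated assumptions, not the code from the talk: `fetch` is a hypothetical caller-supplied function that talks to one replica and raises `socket.timeout` (or another `OSError`) when the node does not answer in time, and "repeat 3 times" is read here as "at most three attempts in total".

```python
import socket


def read_with_fail_fast(replicas, fetch, timeout_s=0.010, max_attempts=3):
    """Ask up to max_attempts replicas in turn, giving each timeout_s to answer.

    replicas: nodes owning the same key range (hypothetical identifiers).
    fetch(replica, timeout_s): hypothetical function that returns the value
    or raises socket.timeout / OSError on a slow or broken node.
    """
    last_error = None
    for replica in replicas[:max_attempts]:
        try:
            return fetch(replica, timeout_s)
        except OSError as exc:  # socket.timeout is a subclass of OSError
            last_error = exc
    raise TimeoutError(
        f"no replica answered within {max_attempts} attempts"
    ) from last_error


# Usage with a fake fetch: the first two nodes time out, the third answers.
def fake_fetch(replica, timeout_s):
    if replica != "node-c":
        raise socket.timeout("timed out")
    return "user-profile"


print(read_with_fail_fast(["node-a", "node-b", "node-c"], fake_fetch))
```

The worst case with these defaults is about 30 ms spent waiting, which is why the per-attempt timeout has to be small compared with the overall response budget.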
39. Timeouts in .NET are broken
•.NET Socket send/receive timeouts do not work for values below 500 ms
•The same applies to SocketAsyncEventArgs
•The async version is even worse (timer queues, etc.)
40. The thing that worked
Socket.Poll(int microseconds, SelectMode mode) blocks until data is available or the timeout expires
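The same "block until readable or timed out" pattern can be sketched in Python with `select.select`, used here as an analogue of polling a socket for readability. This is not the .NET code from the talk; `recv_with_deadline` is a hypothetical helper, and the socket pair only stands in for a real connection to a storage node.

```python
import select
import socket


def recv_with_deadline(sock, max_bytes, timeout_s):
    """Block until sock is readable or timeout_s elapses.

    Returns the received bytes, or None if nothing arrived in time.
    """
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        return None  # timed out: the caller can fail fast and try another node
    return sock.recv(max_bytes)


# Demo on a local socket pair (stand-in for a client/server connection).
client, server = socket.socketpair()
try:
    print(recv_with_deadline(client, 1024, 0.010))  # nothing sent yet
    server.sendall(b"pong")
    print(recv_with_deadline(client, 1024, 0.010))
finally:
    client.close()
    server.close()
```

The calling thread is blocked for at most the timeout, which trades some thread time for a deadline that is actually enforced.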
41. Blocking is not always bad
•Timeouts between 0 and 2%
•Scale by adding new servers
42. Or scale by adding fewer servers
•Cassandra is not very good at deterministic low latencies
•We switched to Aerospike: same QPS, half as many servers, p99 for reads <= 10 ms
•The whole story here: “Married to Cassandra” http://vimeo.com/101290545
43. Takeaways
•Don’t measure latency averages
•It’s expensive to scale in .NET:
•No decent Cassandra library, so you have to roll your own (while Java devs are having fun with Astyanax, the DataStax driver, etc.)
•Even though we rewrote our WCF-based bidder on top of HttpListener (saving 10% CPU), Netty throughput is still 15% better
•Finagle is a great framework