All about rate limiting
Alexander Tokarev
Xsolla
Who am I
1. Distributed systems expert
2. Head of development hub – Xsolla
• Leading the leads of team leads
• In charge of 24x7 production
• Balancing between finance and architecture
2
1. Payments and payouts for gamedev – 800+ payment providers
2. BaaS for in-game shops
3. LiveOps
4. Antifraud
5. From indie to enterprises
6. 800+ team members across the globe
3
Agenda
1. What for
2. Architecture
3. RL points
4. Documentation
5. Production
6. Malicious behavior
7. Pain sharing time
4
What’s not about
1. Promotion of any particular RL tool
2. Subjects of rate limiting in a particular API
3. What to do when you get a 429 status
4. 100500 lines of code
5
What’s about
1. Background
2. History
3. Variety
6
What’s about purpose
1. Rate limiting caps the total request count per time window or by queue length
2. Any request made after the limit is exceeded is declined
3. Limits are configurable: time window and queue length
7
What’s about purpose
Rate limiting caps the total request count per time window or by queue length.
Any request made after the limit is exceeded is automatically rejected or declined.
Limits are configurable per time window.
Throttling caps the total request count per time window or by queue length.
Any request made after the limit is exceeded is queued.
Limits are configurable per time window.
8
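The difference between the two definitions above can be sketched in a few lines. This is a minimal illustration, not any particular tool's behavior: the class and function names are hypothetical, a fixed window is assumed for counting, and the clock is passed in explicitly so the example stays deterministic.

```python
from collections import deque


class FixedWindowCounter:
    """Counts requests inside the current fixed window (window size in seconds)."""

    def __init__(self, limit: int, window_s: float):
        self.limit = limit
        self.window_s = window_s
        self.window_start = 0.0
        self.count = 0

    def try_acquire(self, now: float) -> bool:
        # Start a fresh window once the current one has elapsed.
        if now - self.window_start >= self.window_s:
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


def rate_limit(counter: FixedWindowCounter, request, now: float) -> str:
    """Rate limiting: a request over the limit is rejected."""
    return "accepted" if counter.try_acquire(now) else "rejected"


def throttle(counter: FixedWindowCounter, queue: deque, request, now: float) -> str:
    """Throttling: a request over the limit is queued for later processing."""
    if counter.try_acquire(now):
        return "accepted"
    queue.append(request)
    return "queued"
```

In the throttling case something still has to drain the queue later. That worker is left out here.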
What’s about types
A rate limiter limits the number of requests received by the API within any given time window. Fits short calls such as webshop cart calls.
Example: 150 requests to cart per second allowed
A concurrency limiter limits the number of requests that are active/queued at any given time. Fits long-lived calls such as a payments transaction registry call.
Example: 5 simultaneous transaction reports allowed
11
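A concurrency limiter like the one described above can be sketched with a semaphore. This is a minimal sketch: the class name and the fail-fast behavior are assumptions made for illustration, and a real limiter could queue the caller instead of refusing it.

```python
import threading
from contextlib import contextmanager


class ConcurrencyLimiter:
    """Allows at most max_in_flight requests to be active at the same time."""

    def __init__(self, max_in_flight: int):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    @contextmanager
    def slot(self):
        # Non-blocking acquire: fail fast instead of waiting for a free slot.
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("too many requests in flight")
        try:
            yield
        finally:
            # The slot is released when the request finishes, not when time passes.
            self._slots.release()


# 5 simultaneous transaction reports allowed, as in the example above.
report_limiter = ConcurrencyLimiter(max_in_flight=5)
```

Usage: wrap the long-lived call in `with report_limiter.slot(): ...` and map the `RuntimeError` to a 429 response.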
What’s about purpose
1. Eliminate unexpected traffic patterns - internal and external spikes
2. Get rid of unwanted traffic patterns - brute-force
12
What’s about history
14
Rate limiting is all about QoS!
The measurable end-to-end performance
properties of a network service, which can
be guaranteed in advance by a Service
Level Agreement between a user and a
service provider, so as to satisfy specific
customer application requirements.
Note: These properties may include
throughput (bandwidth), transit delay
(latency), error rates, priority, security,
packet loss, packet jitter, etc.
15
Rate limits in HTTP
16
What’s about resources to limit
Rate limit everything
- Alexander Tokarev
17
What’s about resources to limit
18
Wanna math?
1. Token bucket
2. Leaky bucket
3. Fixed window
4. Sliding window
5. Sliding log
6. Timing wheel – carousel
7. Generic cell rate algorithm (GCRA)
8. ……..
19
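As an illustration of the first algorithm in the list above, here is a minimal token bucket sketch. The class and parameter names are hypothetical, and the clock is passed in explicitly to keep the example deterministic. The other algorithms trade memory, precision and burst behavior differently.

```python
class TokenBucket:
    """Token bucket: tokens refill at `rate` per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full, so an initial burst is allowed
        self.updated = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill lazily, based on the time elapsed since the last call.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The capacity sets the allowed burst size and the rate sets the sustained throughput, so the two can be tuned separately.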
Wanna math?
20
State issue: global with local
21
Global rate limiting - rate limiting for all instances behind the LB
Local rate limiting - rate limiting per service instance

        Requested RPS   Service count   Actual RPS
Local   26              3               26 * 3 = 78
Global  26              3               26
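The arithmetic in the table can be written down as a small helper. This is a sketch with hypothetical function names. The second function is an added assumption, not from the slides: it approximates a global limit with local ones and only holds if the load balancer spreads traffic evenly.

```python
def effective_rps(configured_rps: int, instances: int, mode: str) -> int:
    """Total RPS the whole service accepts for a given limiter mode."""
    if mode == "local":
        # Every instance enforces the limit on its own counter.
        return configured_rps * instances
    if mode == "global":
        # All instances share one counter.
        return configured_rps
    raise ValueError(f"unknown mode: {mode}")


def local_limit_for(target_total_rps: int, instances: int) -> int:
    """Per-instance limit that keeps the total at or below the target.

    Assumption: traffic is balanced evenly across instances.
    """
    return target_total_rps // instances
```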
State issue: key sync required
22
Without sync
With sync
State issue: paid RL options
23
The same for HAProxy for instance…
State issue – hand-made sync
Extra moving parts + Lua performance!!!
24
What’s about users
1. Try to tell users that they are limited
2. Say when limitations will be removed
3. Save your support team efforts
25
What’s about users
X-RateLimit-UserLimit: 1231513
X-RateLimit-UserRemaining: 2342
X-custom-retry-after-ms: 10000
X-ratelimit-minute: 3
X-rate-limit-hour: 1
X-RateLimit-Retry-After: 11529485261
X-Rate-Limit-Reset: Wed, 21 Oct 2015 07:28:00 GMT
30
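The headers above are all custom and mutually incompatible. HTTP itself already defines the 429 (Too Many Requests) status and the `Retry-After` header, so a minimal sketch that tells a client when to come back can use only those. The function name and the epoch-seconds inputs are assumptions made for illustration.

```python
import math


def too_many_requests(reset_at: float, now: float) -> tuple[int, dict[str, str]]:
    """Build the status and headers for a rate-limited response.

    `reset_at` and `now` are epoch seconds. Retry-After is sent as
    delay-seconds, rounded up so the client never retries too early.
    """
    retry_after_s = max(0, math.ceil(reset_at - now))
    return 429, {"Retry-After": str(retry_after_s)}
```

`Retry-After` also accepts an HTTP date, but delay-seconds avoids clock skew between client and server.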
Time to solve
31
Solved!
33
Solved?
36
What’s about headers in
Forget about rate limits headers in NGINX!!!
39
Rate limiting info without reaching rate limit?
1. Return rate limit headers for 2xx statuses
2. Dedicated rate limit info service
40
What’s about where
1. Network interface
2. Code
3. Database
4. Database proxy
5. Load balancer
6. WAF
7. CNI
8. API gateway
9. Ingress controller
10. Service mesh
11. ……
41
My rate-limit experience
1. Databases - all about Oracle, Greenplum, BigQuery
2. API gateways - 80% of DataArt projects
3. Sberbank load balancers
4. Sberbank mesh
5. A lot of RL in Java code - mostly Bucket4j and Resilience4j
6. Rate limits on Envoy-based ingress - epic fail!
My name is Alex and I used rate limits
Tell us about rate limits,
Alex
42
Network level rate limiting
Open vSwitch
43
Types of RL in code
1. Via framework only
2. Via framework + external service for state
44
Code + external service
Which pattern is here?
Be silent who knows!
45
RL code
Rocket Chat!
46
RL code
47
RL code Go
48
Pros&cons Code
+
1. Pure code is very fast
2. No infra dependencies
-
1. Multiple implementations across the company: language, library, …
2. Infra dependencies %) Own Redis, Memcached…
49
Database MySQL
50
Database MySQL rate limiting
51
Database MySQL rate limiting
RL settings
52
Database ProxySQL
53
Database ProxySQL throttling
54
Database BigQuery quotas
Could be Usage Per User Per Day as well!!
55
Load balancer
WHY?????
Otherwise how to sell WAF?
56
WAF
57
WAF
58
AWS WAF limitations
1. IP keys only
2. Fixed 5-minute time window
3. Minimum rate granularity: 100
4. Blocks up to 10000 IPs. If there are more, the rest pass through.
59
Rate limits Yandex cloud
No rate limits - neither in NLB, nor ALB, nor AG
60
Load balancer
61
Load balancer
62
Pros&cons LB
-
1. Infra does all magic
2. Vague visibility for end users
3. Vague visibility for infra - tail logs…
4. Updating a limit is magic
5. Local only without NGINX Plus
+
1. Infra does all magic
2. Traffic doesn’t enter internal network
In case of NGINX
63
API gateway
64
API gateway Internal rate limiting
External rate limiting
Rate limit all!!!
65
API gateway paid features
Algorithm tuning
HA/DR
66
GraphQL rate limiting
1. Different from REST API limiting
2. Single GraphQL call - many calls inside
3. Solution - calculate a score, in the gateway or in code
Query depth may not be 100% relevant!
Only code-based RL is relevant!
67
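To make the score idea above concrete, here is a minimal sketch that works on an already parsed query. Everything in it is an assumption for illustration: the query is modeled as a nested dict of selected fields rather than real GraphQL syntax, and the field names and weights are hypothetical. It shows why depth alone may not be enough: two queries of equal depth can have very different costs.

```python
def depth(selection: dict) -> int:
    """Nesting depth of a selection set (an empty selection has depth 0)."""
    if not selection:
        return 0
    return 1 + max(depth(child) for child in selection.values())


def field_count(selection: dict) -> int:
    """Total number of fields selected at all levels."""
    return sum(1 + field_count(child) for child in selection.values())


def score(selection: dict, weights: dict, default_weight: int = 1) -> int:
    """Cost score: sum of per-field weights, for fields known to be expensive."""
    return sum(
        weights.get(name, default_weight) + score(child, weights, default_weight)
        for name, child in selection.items()
    )


# Hypothetical query: { cart { items { goods { name price } } total } }
query = {"cart": {"items": {"goods": {"name": {}, "price": {}}}, "total": {}}}
```

With a weight of 10 on `goods`, this query gets a depth of 4, a field count of 6 and a score of 15. The rate limiter would then charge the score instead of a flat 1 per call.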
Depth
Code-based score calculation approaches
1. Query depth, annotations - https://github.com/4Catalyzer/graphql-validation-complexity
2. Fields count, annotations - https://github.com/slicknode/graphql-query-complexity
3. Cost – hand-made via Apache Calcite or https://github.com/pa-bru/graphql-cost-analysis
4. All of them in RL library + external state server - https://github.com/ravangen/graphql-rate-limit
The best part: for JS only …
68
Schema RL mapping
69
Ingress controller
Burst + Nodelay doesn’t work!
Replicas without state mess!
70
Ingress pain
Dedicated ingress
per protected API!!!
71
Ingress controller
Per-route rate-limiting
72
DaemonSet
Pros&cons Ingress
+
More or less standard
-
1. Granular management is sophisticated
2. Granular management is non-standard + extra hardware
73
Service mesh
74
Service mesh
75
Service mesh
76
Service mesh
Nearly the same for local rate limiting…
77
Service mesh
20 ms
1. Blocking read - unary gRPC mode
2. Every request call
3. No stickiness - no profit from cache
78
Service mesh RL further architecture
1. Bi-directional stream
2. Local counters with CRDT sync
3. Stickiness
4. Cache
5-10 ms delays. Exists in the Lyft fork only
79
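The "local counters with CRDT sync" item above can be illustrated with a grow-only counter (G-Counter), the simplest counter CRDT. This is a sketch under assumptions: the slides do not say which CRDT is used, and the class and node names are hypothetical. Each instance counts locally without a network call, and a periodic merge converges all instances to the same total.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per node, merged by per-slot maximum."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self.counts: dict[str, int] = {}

    def increment(self, n: int = 1) -> None:
        # A node only ever writes to its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

    def merge(self, other: "GCounter") -> None:
        # Max per slot is commutative, associative and idempotent,
        # so merges can arrive in any order and be repeated safely.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())
```

The trade-off is accuracy: between merges each instance sees a stale total, so the limit can be overshot briefly.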
Pros&cons Service mesh
+
Very granular
-
1. Hard to maintain – at least 3-6 configs per RL
2. Extra hardware - at least 500Mi per service + rate limiter service
3. New complicated moving part
80
CNI
81
CNI
eBPF based CNI
+
Carousel algorithm
82
Up to 4x faster!!!
CNI
83
What’s about large rate limits
84
Quotas example
85
Quotas status code
86
Rate limiter operation features
1. Endpoints for information and management
2. Hot reload
3. Rate limits as a code
4. Counters + metadata DBs
- high availability
- insurance limiter
5. Many DBs for stateful layer
6. Shadow mode
7. Monitoring
8. Logs
9. Audit
87
Rate limiter metrics
88
Rate limiter endpoints
Just information 
Management should be implemented…
89
What’s about how
1. Discuss with product owners
2. Select resources to limit
3. Decide about environments
4. Calculate limit figures
5. Choose the identifier to limit on
6. Create list of exceptions
90
Product part for rate-limits
1. Just be silent - black hole limits
2. Let users know they are limited
3. Use CAPTCHA-magic
91
Environment
Distinguish production live, production sandbox and dev rate limits!!!
92
Calculate limit figures
Agile approach
1. Identify rate limits
2. Set rate limits
3. Get hate from customers via the support team
Smart agile approach
1. Identify rate limits
2. Set rate limits but in shadow - logs only!
3. Analyze logs and identify rate limits
4. Analyze logs and identify burst settings
5. Set rate limits and burst
6. Keep monitoring
93
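The "in shadow - logs only" step of the smart agile approach above can be sketched as a limiter with a shadow flag. This is a minimal illustration with hypothetical names, a fixed-window counter and an explicit clock. Old windows are never cleaned up here, which a real implementation would have to do.

```python
import logging

logger = logging.getLogger("rate_limit.shadow")


class ShadowableLimiter:
    """Fixed-window limiter that can log would-be rejections instead of enforcing."""

    def __init__(self, limit: int, window_s: float, shadow: bool = True):
        self.limit = limit
        self.window_s = window_s
        self.shadow = shadow
        self.counts: dict[tuple[str, int], int] = {}
        self.would_reject = 0  # metric to analyze before switching enforcement on

    def allow(self, key: str, now: float) -> bool:
        slot = (key, int(now // self.window_s))
        self.counts[slot] = self.counts.get(slot, 0) + 1
        if self.counts[slot] <= self.limit:
            return True
        if self.shadow:
            # Shadow mode: record the decision but let the request through.
            self.would_reject += 1
            logger.warning("shadow reject key=%s count=%d", key, self.counts[slot])
            return True
        return False
```

Flipping `shadow` to `False` after the logs have been analyzed turns the same limits into enforced ones.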
Envoy shadow mode
94
Local ShadowMode
for all mesh
for a route
Choose identifier
1. Per IP - what about NAT/proxy?
2. Per user - what about anonymous users?
3. Per session - what is a session?
4. Per header - what about spoofing?
5. Per subject domain id – what if different IDs need different limits?
95
Hybrid limits
AddGoodsToCart(GoodsId int)
96
Id    Tables to insert     RPS
1     5                    5000
10    12 + lock per id     3000
Depends on item type
Hybrid limits
AddGoodsToCart(GoodsId int)
97
RL approach: LB by method
  Lowest RPS: 3000, highest RPS: 3000
  Pros: 1. 1 point for RL 2. No customer + service code changes
  Cons: Always low RPS
RL approach: LB by method lower/upper + item type in http header
  Lowest RPS: 3000, highest RPS: 5000
  Pros: 1. No service code changes 2. Relevant RPS
  Cons: 1. Customers aware of implementation details 2. Customer code changes 3. Business logic in LB
RL approach: 1. LB by method upper 2. Service by GoodsId lower
  Lowest RPS: 3000, highest RPS: 5000
  Pros: 1. No customer code changes 2. Relevant RPS
  Cons: 1. Service code changes 2. All cons of code RL 3. 2 places for RL
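The third approach (upper limit at the LB by method, lower limit in the service by GoodsId) can be sketched as two checks in sequence. This is an illustration only: both stages run in one process here, the names are hypothetical, and the set of "heavy" goods ids stands in for the item-type lookup the real service would do.

```python
class PerSecondLimit:
    """Fixed one-second window counter."""

    def __init__(self, rps: int):
        self.rps = rps
        self.second = None
        self.count = 0

    def allow(self, now: float) -> bool:
        second = int(now)
        if second != self.second:
            self.second, self.count = second, 0
        if self.count < self.rps:
            self.count += 1
            return True
        return False


def make_add_goods_to_cart(upper_rps: int, lower_rps: int, heavy_goods_ids: set):
    lb_limit = PerSecondLimit(upper_rps)       # stage 1: LB, by method
    heavy_limit = PerSecondLimit(lower_rps)    # stage 2: service, by GoodsId

    def add_goods_to_cart(goods_id: int, now: float) -> str:
        if not lb_limit.allow(now):
            return "rejected by LB"
        # Only expensive item types are held to the lower limit.
        if goods_id in heavy_goods_ids and not heavy_limit.allow(now):
            return "rejected by service"
        return "accepted"

    return add_goods_to_cart


# Figures from the table above: 5000 RPS upper, 3000 RPS for heavy items (id 10).
add_goods_to_cart = make_add_goods_to_cart(5000, 3000, heavy_goods_ids={10})
```

Note that a request rejected by the service has already consumed LB quota, which is one of the costs of having two places for RL.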
What’s about bad guys
98
Attack options
1. Null chars in request headers + parameters - %00, %0d%0a, %0d, %0a, %09, %0C, %20
2. Extra parameters and values in patch
3. Space characters in payload
4. IP Rotate Burp extension
5. …
AWS API Gateway based
99
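Several of the attack options above work because the limiter and the backend disagree on whether two values are "the same" key. One possible defense, sketched below under assumptions, is to canonicalize a value before using it as a rate limit key. The function name is hypothetical, lowercasing assumes a case-insensitive identifier such as an email, and double-encoded input (for example %2500) is not handled.

```python
from urllib.parse import unquote

# Control characters and space: code points 0x00 through 0x20.
_IGNORED = {chr(code) for code in range(0x21)}


def canonical_key(raw_value: str) -> str:
    """Normalize a request value before using it as a rate limit key."""
    decoded = unquote(raw_value)  # "%00" -> "\x00", "%20" -> " ", ...
    cleaned = "".join(ch for ch in decoded if ch not in _IGNORED)
    return cleaned.lower()
```

With this, `user@example.com`, `user@example.com%00` and `User@Example.com%0d%0a` all count against the same key.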
Attack options
Result of the attack options listed above:
10% success!!!!!
100
Demo time
101
Reason for failure
102
90% use it!
Bug bounty common approach
104
What’s about bad guys
Never expose RL implementation details in headers!
105
What’s about bad guys
106
Hometask
1. How to hack rate limiting vulnerabilities with tools:
• https://t.ly/Cg0Q
• https://t.ly/h-XP
• https://t.ly/QMSA
2. Investigate IEEE doc
• https://t.ly/V9XF_
3. Assess the maturity of your team's rate limiting
107
Conclusion
1. Check rate limit attack vectors
2. Rate-limiter - perfect test task
3. At least 10 places for rate limiting
4. There is no ideal rate limiter - choose RL + algorithm based on the task
5. Rate limits are not only about requests
6. Rate limit everything, even internal services
7. Care about debug
8. Please do hometask
108
Pain sharing time
https://t.ly/_jo3 – my presentations
https://t.ly/6JZx - my LinkedIn profile
@Shtock
109
Vote for the presentation!!!

Mais conteúdo relacionado

Mais procurados

MariaDB Galera Cluster
MariaDB Galera ClusterMariaDB Galera Cluster
MariaDB Galera ClusterAbdul Manaf
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaJeff Holoman
 
Reactive Programming for a demanding world: building event-driven and respons...
Reactive Programming for a demanding world: building event-driven and respons...Reactive Programming for a demanding world: building event-driven and respons...
Reactive Programming for a demanding world: building event-driven and respons...Mario Fusco
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controllerconfluent
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?DataWorks Summit
 
Disaster Recovery and High Availability with Kafka, SRM and MM2
Disaster Recovery and High Availability with Kafka, SRM and MM2Disaster Recovery and High Availability with Kafka, SRM and MM2
Disaster Recovery and High Availability with Kafka, SRM and MM2Abdelkrim Hadjidj
 
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...StampedeCon
 
Building robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumBuilding robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumTathastu.ai
 
State transfer With Galera
State transfer With GaleraState transfer With Galera
State transfer With GaleraMydbops
 
MySQL InnoDB Cluster - Group Replication
MySQL InnoDB Cluster - Group ReplicationMySQL InnoDB Cluster - Group Replication
MySQL InnoDB Cluster - Group ReplicationFrederic Descamps
 
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Kai Wähner
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to CassandraGokhan Atil
 
My sql failover test using orchestrator
My sql failover test  using orchestratorMy sql failover test  using orchestrator
My sql failover test using orchestratorYoungHeon (Roy) Kim
 
Developing Kafka Streams Applications with Upgradability in Mind with Neil Bu...
Developing Kafka Streams Applications with Upgradability in Mind with Neil Bu...Developing Kafka Streams Applications with Upgradability in Mind with Neil Bu...
Developing Kafka Streams Applications with Upgradability in Mind with Neil Bu...HostedbyConfluent
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explainedconfluent
 
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKai Wähner
 
Cloud-Native Apache Spark Scheduling with YuniKorn Scheduler
Cloud-Native Apache Spark Scheduling with YuniKorn SchedulerCloud-Native Apache Spark Scheduling with YuniKorn Scheduler
Cloud-Native Apache Spark Scheduling with YuniKorn SchedulerDatabricks
 

Mais procurados (20)

MariaDB Galera Cluster
MariaDB Galera ClusterMariaDB Galera Cluster
MariaDB Galera Cluster
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Reactive Programming for a demanding world: building event-driven and respons...
Reactive Programming for a demanding world: building event-driven and respons...Reactive Programming for a demanding world: building event-driven and respons...
Reactive Programming for a demanding world: building event-driven and respons...
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
 
Disaster Recovery and High Availability with Kafka, SRM and MM2
Disaster Recovery and High Availability with Kafka, SRM and MM2Disaster Recovery and High Availability with Kafka, SRM and MM2
Disaster Recovery and High Availability with Kafka, SRM and MM2
 
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon...
 
Building robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumBuilding robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and Debezium
 
kafka
kafkakafka
kafka
 
State transfer With Galera
State transfer With GaleraState transfer With Galera
State transfer With Galera
 
MySQL InnoDB Cluster - Group Replication
MySQL InnoDB Cluster - Group ReplicationMySQL InnoDB Cluster - Group Replication
MySQL InnoDB Cluster - Group Replication
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 
My sql failover test using orchestrator
My sql failover test  using orchestratorMy sql failover test  using orchestrator
My sql failover test using orchestrator
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
Developing Kafka Streams Applications with Upgradability in Mind with Neil Bu...
Developing Kafka Streams Applications with Upgradability in Mind with Neil Bu...Developing Kafka Streams Applications with Upgradability in Mind with Neil Bu...
Developing Kafka Streams Applications with Upgradability in Mind with Neil Bu...
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology Comparison
 
Cloud-Native Apache Spark Scheduling with YuniKorn Scheduler
Cloud-Native Apache Spark Scheduling with YuniKorn SchedulerCloud-Native Apache Spark Scheduling with YuniKorn Scheduler
Cloud-Native Apache Spark Scheduling with YuniKorn Scheduler
 

Semelhante a Rate limits and all about

Practice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobilePractice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobileDataWorks Summit
 
Learnings from the Field. Lessons from Working with Dozens of Small & Large D...
Learnings from the Field. Lessons from Working with Dozens of Small & Large D...Learnings from the Field. Lessons from Working with Dozens of Small & Large D...
Learnings from the Field. Lessons from Working with Dozens of Small & Large D...HostedbyConfluent
 
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020Redis Labs
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About ShardingMongoDB
 
Donatas Mažionis, Building low latency web APIs
Donatas Mažionis, Building low latency web APIsDonatas Mažionis, Building low latency web APIs
Donatas Mažionis, Building low latency web APIsTanya Denisyuk
 
Three Perspectives on Measuring Latency
Three Perspectives on Measuring LatencyThree Perspectives on Measuring Latency
Three Perspectives on Measuring LatencyScyllaDB
 
VISUG - Approaches for application request throttling
VISUG - Approaches for application request throttlingVISUG - Approaches for application request throttling
VISUG - Approaches for application request throttlingMaarten Balliauw
 
X-Ray distributed tracing proof-of-concept
X-Ray distributed tracing proof-of-conceptX-Ray distributed tracing proof-of-concept
X-Ray distributed tracing proof-of-conceptAram Alipoor
 
Building scalable flexible messaging systems using qpid
Building scalable flexible messaging systems using qpidBuilding scalable flexible messaging systems using qpid
Building scalable flexible messaging systems using qpidJack Gibson
 
Model based transaction-aware cloud resources management case study and met...
Model based transaction-aware cloud resources management   case study and met...Model based transaction-aware cloud resources management   case study and met...
Model based transaction-aware cloud resources management case study and met...Leonid Grinshpan, Ph.D.
 
Building a Dynamic Rules Engine with Kafka Streams
Building a Dynamic Rules Engine with Kafka StreamsBuilding a Dynamic Rules Engine with Kafka Streams
Building a Dynamic Rules Engine with Kafka StreamsHostedbyConfluent
 
VMworld 2014: Extreme Performance Series
VMworld 2014: Extreme Performance Series VMworld 2014: Extreme Performance Series
VMworld 2014: Extreme Performance Series VMworld
 
Bringing Learnings from Googley Microservices with gRPC - Varun Talwar, Google
Bringing Learnings from Googley Microservices with gRPC - Varun Talwar, GoogleBringing Learnings from Googley Microservices with gRPC - Varun Talwar, Google
Bringing Learnings from Googley Microservices with gRPC - Varun Talwar, GoogleAmbassador Labs
 
Approaches to application request throttling
Approaches to application request throttlingApproaches to application request throttling
Approaches to application request throttlingMaarten Balliauw
 
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
PHP At 5000 Requests Per Second: Hootsuite’s Scaling StoryPHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Storyvanphp
 
Low latency microservices in java QCon New York 2016
Low latency microservices in java   QCon New York 2016Low latency microservices in java   QCon New York 2016
Low latency microservices in java QCon New York 2016Peter Lawrey
 
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022HostedbyConfluent
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase HBaseCon
 

Semelhante a Rate limits and all about (20)

Practice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobilePractice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China Mobile
 
Learnings from the Field. Lessons from Working with Dozens of Small & Large D...
Learnings from the Field. Lessons from Working with Dozens of Small & Large D...Learnings from the Field. Lessons from Working with Dozens of Small & Large D...
Learnings from the Field. Lessons from Working with Dozens of Small & Large D...
 
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About Sharding
 
Donatas Mažionis, Building low latency web APIs
Donatas Mažionis, Building low latency web APIsDonatas Mažionis, Building low latency web APIs
Donatas Mažionis, Building low latency web APIs
 
Three Perspectives on Measuring Latency
Three Perspectives on Measuring LatencyThree Perspectives on Measuring Latency
Three Perspectives on Measuring Latency
 
VISUG - Approaches for application request throttling
VISUG - Approaches for application request throttlingVISUG - Approaches for application request throttling
VISUG - Approaches for application request throttling
 
X-Ray distributed tracing proof-of-concept
X-Ray distributed tracing proof-of-conceptX-Ray distributed tracing proof-of-concept
X-Ray distributed tracing proof-of-concept
 
Building scalable flexible messaging systems using qpid
Building scalable flexible messaging systems using qpidBuilding scalable flexible messaging systems using qpid
Building scalable flexible messaging systems using qpid
 
Model based transaction-aware cloud resources management case study and met...
Model based transaction-aware cloud resources management   case study and met...Model based transaction-aware cloud resources management   case study and met...
Model based transaction-aware cloud resources management case study and met...
 
Building a Dynamic Rules Engine with Kafka Streams
Building a Dynamic Rules Engine with Kafka StreamsBuilding a Dynamic Rules Engine with Kafka Streams
Building a Dynamic Rules Engine with Kafka Streams
 
VMworld 2014: Extreme Performance Series
VMworld 2014: Extreme Performance Series VMworld 2014: Extreme Performance Series
VMworld 2014: Extreme Performance Series
 
Neutron scaling
Neutron scalingNeutron scaling
Neutron scaling
 
Bringing Learnings from Googley Microservices with gRPC - Varun Talwar, Google
Bringing Learnings from Googley Microservices with gRPC - Varun Talwar, GoogleBringing Learnings from Googley Microservices with gRPC - Varun Talwar, Google
Bringing Learnings from Googley Microservices with gRPC - Varun Talwar, Google
 
Approaches to application request throttling
Approaches to application request throttlingApproaches to application request throttling
Approaches to application request throttling
 
Scalable Web Apps
Scalable Web AppsScalable Web Apps
Scalable Web Apps
 
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
PHP At 5000 Requests Per Second: Hootsuite’s Scaling StoryPHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
 
Low latency microservices in java QCon New York 2016
Low latency microservices in java   QCon New York 2016Low latency microservices in java   QCon New York 2016
Low latency microservices in java QCon New York 2016
 
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase
 

Mais de Alexander Tokarev

Open Policy Agent for governance as a code
Open Policy Agent for governance as a code Open Policy Agent for governance as a code
Open Policy Agent for governance as a code Alexander Tokarev
 
Relational databases for BigData
Relational databases for BigDataRelational databases for BigData
Relational databases for BigDataAlexander Tokarev
 
P9 speed of-light faceted search via oracle in-memory option by alexander tok...
P9 speed of-light faceted search via oracle in-memory option by alexander tok...P9 speed of-light faceted search via oracle in-memory option by alexander tok...
P9 speed of-light faceted search via oracle in-memory option by alexander tok...Alexander Tokarev
 
Row Level Security in databases advanced edition
Row Level Security in databases advanced editionRow Level Security in databases advanced edition
Row Level Security in databases advanced editionAlexander Tokarev
 
Row level security in enterprise applications
Row level security in enterprise applicationsRow level security in enterprise applications
Row level security in enterprise applicationsAlexander Tokarev
 
Inmemory BI based on opensource stack
Inmemory BI based on opensource stackInmemory BI based on opensource stack
Inmemory BI based on opensource stackAlexander Tokarev
 
Oracle InMemory hardcore edition
Oracle InMemory hardcore editionOracle InMemory hardcore edition
Oracle InMemory hardcore editionAlexander Tokarev
 
Tagging search solution design Advanced edition
Tagging search solution design Advanced editionTagging search solution design Advanced edition
Tagging search solution design Advanced editionAlexander Tokarev
 
Faceted search with Oracle InMemory option
Faceted search with Oracle InMemory optionFaceted search with Oracle InMemory option
Faceted search with Oracle InMemory optionAlexander Tokarev
 
Oracle JSON treatment evolution - from 12.1 to 18 AOUG-2018
Oracle JSON treatment evolution - from 12.1 to 18 AOUG-2018Oracle JSON treatment evolution - from 12.1 to 18 AOUG-2018
Oracle JSON treatment evolution - from 12.1 to 18 AOUG-2018Alexander Tokarev
 
Tagging search solution design
Tagging search solution designTagging search solution design
Tagging search solution designAlexander Tokarev
 
Oracle JSON internals advanced edition
Oracle JSON internals advanced editionOracle JSON internals advanced edition
Oracle JSON internals advanced editionAlexander Tokarev
 
Oracle Result Cache deep dive
Oracle Result Cache deep diveOracle Result Cache deep dive
Oracle Result Cache deep diveAlexander Tokarev
 
Oracle result cache highload 2017
Oracle result cache highload 2017Oracle result cache highload 2017
Oracle result cache highload 2017Alexander Tokarev
 

Rate limits and all about

  • 1. All about rate limiting Alexander Tokarev Xsolla
  • 2. Who am I 1. Distributed systems expert 2. Head of development hub – Xsolla • Leading leads of team leads • In charge of 24x7 production • Balancing between finance and architecture
  • 3. 1. Payments and payouts for gamedev – 800+ payment providers 2. BaaS for in-game shops 3. LiveOps 4. Antifraud 5. From indie to enterprises 6. 800+ team members across the globe
  • 5. What’s not about 1. Promotion of particular RL tools 2. Subjects of rate limiting in a particular API 3. Approach to handling a 429 status 4. 100500 lines of code
  • 6. What’s about 1. Background 2. History 3. Variety
  • 7. What’s about purpose 1. Rate limiting caps the total request count by time window or by queue length 2. Any request made after the limit is exceeded will be declined 3. Various limits can be configured with a time window and a queue
  • 8. What’s about purpose Rate limiting caps the total request count by time window or by queue length. Any request made after the limit is exceeded is automatically rejected or declined. Various limits can be configured with a time window. Throttling caps the total request count by time window or by queue length. Any request made after the limit is exceeded is queued. Various limits can be configured with a time window.
  • 10. What’s about types A rate limiter limits the number of requests received by the API within any given time window, e.g. short webshop cart calls. A concurrency limiter limits the number of requests that are active/queued at any given time, e.g. long-lived payments transaction registry calls.
  • 11. What’s about types A rate limiter limits the number of requests received by the API within any given time window, e.g. short webshop cart calls: 150 requests to cart per second allowed. A concurrency limiter limits the number of requests that are active/queued at any given time, e.g. long-lived payments transaction registry calls: 5 simultaneous transaction reports allowed.
  • 12. What’s about purpose 1. Eliminate unexpected traffic patterns - internal and external spikes 2. Get rid of unwanted traffic patterns - brute-force
  • 15. Rate limiting is all about QoS! The measurable end-to-end performance properties of a network service, which can be guaranteed in advance by a Service Level Agreement between a user and a service provider, so as to satisfy specific customer application requirements. Note: These properties may include throughput (bandwidth), transit delay (latency), error rates, priority, security, packet loss, packet jitter, etc.
  • 16. Rate limits in HTTP
  • 17. What’s about resources to limit Rate limit everything - Alexander Tokarev
  • 19. Wanna math? 1. Token bucket 2. Leaky bucket 3. Fixed window 4. Sliding window 5. Sliding log 6. Timing wheel – carousel 7. Generic cell rate 8. …
  • 21. State issue: global with local. Global rate limiting - rate limiting for all instances behind LB. Local rate limiting - rate limiting per service instance. Local: requested RPS 26, service count 3, actual RPS 26 * 3 = 78. Global: requested RPS 26, service count 3, actual RPS 26.
  • 22. State issue: key sync required. Without sync / With sync
  • 23. State issue: paid RL options. The same for HAProxy for instance…
  • 24. State issue – hand-made sync. Extra moving parts + Lua performance!!!
  • 25. What’s about users 1. Tell users that they are limited 2. Say when the limits will be lifted 3. Save your support team’s effort
  • 30. What’s about users X-RateLimit-UserLimit: 1231513 X-RateLimit-UserRemaining: 2342 X-custom-retry-after-ms: 10000 X-ratelimit-minute: 3 X-rate-limit-hour: 1 X-RateLimit-Retry-After: 11529485261 X-Rate-Limit-Reset: Wed, 21 Oct 2015 07:28:00 GMT
  • 39. What’s about headers in Forget about rate limit headers in NGINX!!!
  • 40. Rate limiting info without reaching rate limit? 1. Return rate limit headers for 2xx statuses 2. Dedicated rate limit info service
  • 41. What’s about where 1. Network interface 2. Code 3. Database 4. Database proxy 5. Load balancer 6. WAF 7. CNI 8. API gateway 9. Ingress controller 10. Service mesh 11. …
  • 42. My rate-limit experience 1. Databases - all about Oracle, Greenplum, BigQuery 2. API gateways - 80% of DataArt projects 3. Sberbank load balancers 4. Sberbank mesh 5. A lot of RL in Java code - Bucket4j, Resilience4j mostly 6. Rate limits on Envoy-based ingress - epic fail! My name is Alex and I used rate limits. Tell us about rate limits, Alex
  • 43. Network level rate limiting Open vSwitch
  • 44. Types of RL in code 1. Via framework only 2. Via framework + external service for state
  • 45. Code + external service Which pattern is this? Those who know, keep quiet!
  • 49. Pros & cons: code. Pros: 1. Pure code is very fast 2. No infra dependencies. Cons: 1. Multiple implementations across the company: language, library, … 2. Infra dependencies %) Own Redis, Memcached…
  • 55. Database: BigQuery quotas. Could be Usage Per User Per Day as well!!
  • 59. AWS WAF limitations 1. IP keys only 2. Fixed time window of 5 minutes 3. Minimum rate granularity of 100 4. Blocks up to 10000 IPs; anything beyond that passes through.
  • 60. Rate limits in Yandex Cloud: no rate limits in NLB, ALB, or AG
  • 63. Pros & cons: LB. Cons: 1. Infra does all the magic 2. Vague visibility for end users 3. Vague visibility for infra - tail logs… 4. Updating a limit is magic 5. Local only without NGINX Plus. Pros: 1. Infra does all the magic 2. Traffic doesn’t enter the internal network. In case of NGINX
  • 65. API gateway Internal rate limiting External rate limiting Rate limit all!!!
  • 66. API gateway paid features Algorithm tuning HA/DR HA/DR Algorithm tuning
  • 67. GraphQL rate limiting 1. Different from REST API limiting 2. Single GraphQL call - many calls inside 3. Decision - calculate score: gateway or code. Query depth may not be 100% relevant! Only code-based RL is relevant!
  • 68. Code-based score calculation approaches 1. Query depth, annotations - https://github.com/4Catalyzer/graphql-validation-complexity 2. Fields count, annotations - https://github.com/slicknode/graphql-query-complexity 3. Cost – hand-made via Apache Calcite or https://github.com/pa-bru/graphql-cost-analysis 4. All of them in RL library + external state server - https://github.com/ravangen/graphql-rate-limit The best part: for JS only…
  • 70. Ingress controller Burst + nodelay doesn’t work! Replicas without state - mess!
  • 71. Ingress pain Dedicated ingress per protected API!!!
  • 73. Pros & cons: ingress. Pros: More or less standard. Cons: 1. Granular management is sophisticated 2. Granular management is non-standard + extra hardware
  • 77. Service mesh Nearly the same for local rate limiting…
  • 78. Service mesh: 20 ms 1. Blocking read - unary gRPC mode 2. Every request call 3. No stickiness - no profit from cache
  • 79. Service mesh RL further architecture 1. Bi-directional stream 2. Local counters with CRDT sync 3. Stickiness 4. Cache. 5-10 ms delays. Exists in Lyft fork only
  • 80. Pros & cons: service mesh. Pros: Very granular. Cons: 1. Hard to maintain – at least 3-6 configs per RL 2. Extra hardware - 500Mi per service at least + rate limiter service 3. New complicated moving part
  • 82. CNI: eBPF-based CNI + Carousel algorithm. Up to 4x faster!!!
  • 84. What’s about large rate limits
  • 87. Rate limiter operation features 1. Endpoints for information and management 2. Hot reload 3. Rate limits as a code 4. Counters + metadata DBs - high availability - insurance limiter 5. Many DBs for stateful layer 6. Shadow mode 7. Monitoring 8. Logs 9. Audit
  • 89. Rate limiter endpoints Just information. Management should be implemented…
  • 90. What’s about how 1. Discuss with product owners 2. Select resources to limit 3. Decide about environments 4. Calculate limit figures 5. Choose the identifier to limit on 6. Create list of exceptions
  • 91. Product part for rate limits 1. Just be silent - black hole limits 2. Let users know they are limited 3. Use CAPTCHA magic
  • 92. Environment Distinguish production live, production sandbox and dev rate limits!!!
  • 93. Calculate limit figures. Agile approach 1. Identify rate limits 2. Set rate limits 3. Get hate from customers via the support team. Smart agile approach 1. Identify rate limits 2. Set rate limits but in shadow - logs only! 3. Analyze logs and identify rate limits 4. Analyze logs and identify burst settings 5. Set rate limits and burst 6. Keep monitoring
  • 94. Envoy shadow mode Local ShadowMode for all mesh for a route
  • 95. Choose identifier 1. Per IP - what’s about NAT/proxy 2. Per user - what’s about anonymous 3. Per session - what is a session 4. Per header - what’s about spoofing 5. Per subject domain id – what if different IDs need different limits
  • 96. Hybrid limits: AddGoodsToCart(GoodsId int). Depends on item type. Id 1: 5 tables to insert, 5000 RPS. Id 10: 12 tables to insert + lock per id, 3000 RPS.
  • 97. Hybrid limits: AddGoodsToCart(GoodsId int). RL approach / lowest RPS / highest RPS / pros / cons. (a) LB by method: 3000 / 3000; pros: 1. 1 point for RL 2. No customer + service code changes; cons: always low RPS. (b) LB by method lower/upper + item type in HTTP header: 3000 / 5000; pros: 1. No service code changes 2. Relevant RPS; cons: 1. Customers aware of implementation details 2. Customer code changes 3. Business logic in LB. (c) LB by method upper + service by GoodsId lower: 3000 / 5000; pros: 1. No customer code changes 2. Relevant RPS; cons: 1. Service code changes 2. All cons of code RL 3. 2 places for RL.
  • 99. Attack options 1. Null chars in request headers + parameters - %00, %0d%0a, %0d, %0a, %09, %0C, %20 2. Extra parameters and values in patch 3. Space characters in payload 4. IP Rotate Burp extension (AWS API Gateway based) 5. …
  • 100. Attack options 1. Null chars in request headers + parameters - %00, %0d%0a, %0d, %0a, %09, %0C, %20 2. Extra parameters and values in patch 3. Space characters in payload 4. IP Rotate Burp extension 5. … 10% success!!!!!
  • 104. Bug bounty common approach
  • 105. What’s about bad guys Never expose RL implementation details in headers!
  • 106. What’s about bad guys
  • 107. Hometask 1. How to hack rate limiting vulnerabilities with tools: • https://t.ly/Cg0Q • https://t.ly/h-XP • https://t.ly/QMSA 2. Investigate the IEEE doc • https://t.ly/V9XF_ 3. Assess the maturity of your team’s rate limiting
  • 108. Conclusion 1. Check rate limit attack vectors 2. Rate-limiter - perfect test task 3. At least 10 places for rate limiting 4. No ideal rate limiter - choose RL + algorithm based on a task 5. Rate limits not only about requests 6. Rate limit everything even internal services 7. Care about debug 8. Please do hometask
  • 109. Pain sharing time https://t.ly/_jo3 – my presentations https://t.ly/6JZx - my LinkedIn profile @Shtock Vote for the presentation!!! 1. Check rate limit attack vectors!!! 2. Rate-limiter - perfect test task 3. 10 places for rate limiting 4. No ideal rate limiter - choose RL + algorithm based on a task 5. Rate limits not only about requests 6. Rate limit everything even internal services 7. Care about debug
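
Slide 19 lists the token bucket first among the algorithms, so here is a minimal sketch of one in Python. It is an illustration, not code from the talk: the class name `TokenBucket`, the `allow` method and the explicit `now` argument are assumptions made for this example. The bucket refills at `rate` tokens per second up to `capacity` (the burst size). A rejected call also returns how long the caller should wait, which is the value a service could report back to a limited user.

```python
class TokenBucket:
    """Token bucket limiter (illustrative sketch; names are hypothetical)."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum number of stored tokens (burst size)
        self.tokens = capacity    # start with a full bucket
        self.updated = now        # time of the last refill

    def allow(self, now, cost=1.0):
        """Return (allowed, retry_after_seconds) for a request arriving at `now`."""
        # Refill according to the time elapsed since the last call.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True, 0.0
        # Not enough tokens: report how long until `cost` tokens are available.
        return False, (cost - self.tokens) / self.rate
```

In a real service `now` would typically come from `time.monotonic()`. One bucket per service instance is the local case from slide 21: three instances with a limit of 26 RPS each admit up to 26 * 3 = 78 RPS in total, unless the state is shared.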

Editor's notes

  1. https://tech.groww.in/rate-limiter-and-its-algorithms-with-illustrations-564455162935
  2. https://leandromoreira.com/2019/01/25/how-to-build-a-distributed-throttling-system-with-nginx-lua-redis/
  3. In the simplest case, the documentation just states that rate limits exist.
  4. Or, in a mature case, it describes the different rate limits service by service and how to live with them.
  5. For example, one of our competitors gives the most complete information on handling and states that they have both rate limits on short operations and rate limits on long operations, and that the two are completely different. This is what is called a rate limiter and a concurrency limiter.
  6. Poll: how do you get the rate limits without reaching them? Note GraphQL: it seems like nothing unusual, but we will come back to it.
  7. How many places for rate limits do you know?
  8. What matters: headers in ANY case.
  9. Clarify the transition.
  10. Say that it is undocumented and was trimmed from the UI.
  11. Note how we take the URL. We will need it later for the cybersecurity topic.
  12. And this one is about Yandex's load balancer.
  13. This goes into the architecture part, about state.
  14. Ask why on the ingress and why on the service.
  15. Examples + read the manual on GraphQL in Tyk and how it calculates the score.
  16. Examples + read the manual on GraphQL in Tyk and how it calculates the score.
  17. Ask what the problem is + also bring in the nginx config + split into 2 slides for Contour, where it says something like "Issue solved", but test how it works: how many Envoys there really are, and tell that it did not work for us.
  18. Conflict-free replicated data types (CRDT)
  19. What status code do you think it is?
  20. Although I believe everything should be rate limited.
  21. Attack examples
  22. Attack examples
  23. Attack examples
  24. Run it 1000 times without waiting for the response, so that it is really fast.
  25. Hard to see; the idea is unclear.
  26. Hard to see; the idea is unclear.
  27. Ask what is wrong here.
  28. Ask what is wrong here.
  29. Replace with short links.
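
Note 18 and slide 79 mention local counters synchronized as conflict-free data types (CRDT). The slides do not say which CRDT is used, so the sketch below assumes the simplest one, a grow-only counter (G-counter), purely as an illustration. The class and method names are hypothetical. Each limiter instance increments only its own slot, and merging takes the per-node maximum, so instances can exchange state in any order and still converge on the same total.

```python
class GCounter:
    """Grow-only counter CRDT (illustrative sketch; names are hypothetical)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}  # node id -> requests counted by that node

    def increment(self, n=1):
        # A node only ever increases its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

    def merge(self, other):
        # Per-node maximum: commutative, associative and idempotent.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self):
        # Global request count as seen by this node.
        return sum(self.counts.values())
```

A limiter built on this would compare `value()` with the global limit and accept that the count lags by whatever has not been merged yet. A real limiter also needs the counts to reset per time window, which this sketch leaves out.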