Worldwide Local Latency with ScyllaDB
Carly Christensen, Director of Software Engineering
Agenda
■ Who are ZeroFlucs and What Do We Do?
■ The ZeroFlucs Process
■ Challenges
■ How ScyllaDB Helped
Carly Christensen
■ 20 years' experience in the IT industry
■ Software Development
■ Database Systems
■ Career highlights include:
■ Lead Developer and Head of Trading Solutions at Entain Australia
■ SQL Server Consultant at Wardy IT
What is ZeroFlucs?
ZeroFlucs primarily provides same-game pricing calculations to the wagering
industry.
■ This allows customers to price bets on correlated outcomes from within the
same match.
■ Our vision is to let sportsbook customers explore their theory of the game:
■ Far more engaging than just “Team to Win” bets.
Same-Game Example
■ Match Winner: Team A $1.50, Team B $2.60
■ First Touchdown: T.S. Wong $8.00, B. Bhooma $10.00, S. Gray $14.00
■ Total Score: Over 45.5 $1.90, Under 45.5 $1.90
Total: $14.50
But It’s Not That Simple…
To calculate the probability of this outcome as closely as possible, we need to simulate
the game, play by play.
■ These are correlated outcomes, so we use a simulation-based approach to
effectively model the relationships.
■ For example, if a team wins then it’s more likely they’ll score the first, last or any
individual goal.
■ To ensure we cover as much of the probability space as possible, a minimum of
50,000 simulations are run for every price change, for every event.
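The correlation argument above can be illustrated with a toy Monte Carlo sketch. This is not the ZeroFlucs model: the game rules, the probabilities (`p_play_scores`, `p_score_a`) and the function names are all assumptions made up for illustration. It only shows why a same-game price cannot be obtained by multiplying the individual probabilities together.

```python
import random


def simulate_game(rng, plays=20, p_play_scores=0.3, p_score_a=0.55):
    """Play one toy game; return (winner, first_scorer), each 'A', 'B' or None."""
    score = {"A": 0, "B": 0}
    first_scorer = None
    for _ in range(plays):
        if rng.random() < p_play_scores:  # this play produces a score
            team = "A" if rng.random() < p_score_a else "B"
            score[team] += 1
            if first_scorer is None:
                first_scorer = team
    if score["A"] > score["B"]:
        winner = "A"
    elif score["B"] > score["A"]:
        winner = "B"
    else:
        winner = None  # draw
    return winner, first_scorer


def estimate(n=50_000, seed=1):
    """Estimate P(A wins), P(A scores first) and P(both) from n simulations."""
    rng = random.Random(seed)
    a_wins = a_first = both = 0
    for _ in range(n):
        winner, first = simulate_game(rng)
        a_wins += (winner == "A")
        a_first += (first == "A")
        both += (winner == "A" and first == "A")
    return a_wins / n, a_first / n, both / n


p_win, p_first, p_both = estimate()
print(f"P(A wins)                = {p_win:.3f}")
print(f"P(A scores first)        = {p_first:.3f}")
print(f"P(both), simulated       = {p_both:.3f}")
print(f"P(both), if independent  = {p_win * p_first:.3f}")
```

In this toy setup the simulated joint probability comes out higher than the product of the two marginals, because the team that scores first starts with a lead. Treating the legs as independent would misprice the combined bet.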
How We Do It
Our Architecture
[Architecture diagram: content, 3rd party data, booking, transforms, model simulations, market generation, query engine, api]
Our Architecture
■ Platform has been designed to be cloud-native from the ground up.
■ Software stack runs on Kubernetes
■ We use Oracle Container Engine for Kubernetes (OKE)
■ Over 120 microservices, growing every week.
■ Much of the environment can be managed via CRDs and operators.
■ Languages and Tools
■ Golang and Python
■ gRPC for internal communications
■ Kafka (for at-least-once message delivery)
■ GraphQL for external-facing APIs
Our Challenges
Our Goal
■ Our ultimate goal is to be able to process and simulate events fast enough to
provide Same-Game prices for Live (in-play) events.
■ Will this corner result in a goal?
■ Will this play result in a touchdown?
■ Which player will score a touchdown from this play?
Our Challenges
■ Price changes must be as up to date as possible.
■ Events can each be updating dozens of times per minute and trigger thousands of re-simulations.
■ We’re processing up to 250,000 in-game events per second.
■ Simulation data can be hundreds of MB.
■ Customers can be anywhere in the world:
■ With each request passing through many microservices, even a small increase in latency between the
service and repository can have a large negative impact on the end-user experience.
■ A lot of optimisation was achieved through code changes and increased
parallelism; however, the database remained an area we could improve on.
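The point about latency between service and repository can be made concrete with some back-of-the-envelope arithmetic. All numbers below are hypothetical, not ZeroFlucs measurements. The sketch also assumes the hops run sequentially and that each hop makes the same number of database calls.

```python
def request_latency_ms(hops, service_ms, db_calls_per_hop, db_round_trip_ms):
    """End-to-end latency for a request that passes through `hops` services in turn."""
    return hops * (service_ms + db_calls_per_hop * db_round_trip_ms)


# Hypothetical: 8 services, 2 ms of work each, 2 database calls per service.
local = request_latency_ms(hops=8, service_ms=2.0, db_calls_per_hop=2, db_round_trip_ms=1.0)
remote = request_latency_ms(hops=8, service_ms=2.0, db_calls_per_hop=2, db_round_trip_ms=6.0)

print(local)           # 32.0
print(remote)          # 112.0
print(remote - local)  # 80.0
```

Under these assumptions, an extra 5 ms per database round trip becomes an extra 80 ms for the end user, which is why keeping replicas close to the services matters.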
Databases Explored
■ MongoDB
■ Familiar – several of the team had used this in previous projects.
■ Limits on concurrent connections caused simple queries to occasionally take several seconds.
■ Cassandra
■ Supported network-aware replication strategies.
■ Performance and resource usage were not where we needed them to be.
■ Cosmos DB
■ Addressed all of our performance and regional distribution requirements.
■ High cost, Azure-only – limiting our portability.
The Solution:
ScyllaDB
Why ScyllaDB?
■ We had trialled ScyllaDB for a previous project and although it didn’t suit that situation, it
was perfect for ZeroFlucs.
■ Distributed architecture – data replicas can be located local to services and customers ensuring low
latency for every request.
■ High throughput and concurrency – with ScyllaDB we have yet to see an issue we can’t scale through.
■ Ease of adoption – ScyllaDB Operator meant we needed little knowledge to start.
Architecture with ScyllaDB
■ ScyllaDB Open Source
■ Hosted on Flex 4 VMs
■ Currently performing well, but our throughput increases with every new customer.
■ Option to scale up and run on bare metal if needed in the future.
■ In development we used ScyllaDB Operator to manage ScyllaDB, but in production
we run it self-managed, as the operator is limited to a single Kubernetes cluster
(hard to span geographically).
■ We’re still reviewing our strategy around ScyllaDB Manager and monitoring.
Data Local to Customers
■ Global data
■ Slow-changing data, replicated to every region.
■ Regional data
■ Data used by multiple customers in a region e.g. a sport feed.
■ If a customer in another region needs that same data, it will be
replicated separately to their region.
■ Customer-specific data
■ Event configuration, model and simulation outputs, margins and pricing ladders.
■ Customers have a home region, where we store multiple replicas of their data.
■ BONUS: Additional replicas can be made to other regions for DR purposes.
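The three data tiers above map naturally onto per-keyspace replication settings. The sketch below is one way this could look; it is not the ZeroFlucs Topology service. The datacenter names, replication factors, function names and keyspace name are all assumptions chosen for illustration.

```python
# Hypothetical datacenter names; real names depend on the cluster's snitch configuration.
ALL_REGIONS = ["uk-south", "uk-west", "australia-east"]


def replication_for(tier, home_region=None, consumer_regions=(), dr_regions=(),
                    home_rf=3, dr_rf=1):
    """Build a NetworkTopologyStrategy replication map for one data tier."""
    replication = {"class": "NetworkTopologyStrategy"}
    if tier == "global":
        # Slow-changing data: replicate to every region.
        for region in ALL_REGIONS:
            replication[region] = home_rf
    elif tier == "regional":
        # Shared data such as a sport feed: only regions with customers who use it.
        for region in consumer_regions:
            replication[region] = home_rf
    elif tier == "customer":
        # Multiple replicas in the home region, optional extra replicas for DR.
        replication[home_region] = home_rf
        for region in dr_regions:
            replication[region] = dr_rf
    else:
        raise ValueError(f"unknown tier: {tier}")
    return replication


def create_keyspace_cql(keyspace, replication):
    """Render a CREATE KEYSPACE statement from a replication map."""
    parts = []
    for key, value in replication.items():
        rendered = f"'{value}'" if isinstance(value, str) else str(value)
        parts.append(f"'{key}': {rendered}")
    return ("CREATE KEYSPACE IF NOT EXISTS " + keyspace
            + " WITH replication = {" + ", ".join(parts) + "};")


customer = replication_for("customer", home_region="australia-east", dr_regions=["uk-south"])
print(create_keyspace_cql("records_customer_a", customer))
```

Because replication is set per keyspace, giving each customer and data type its own keyspace lets placement be decided independently for each one.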
Architecture with ScyllaDB
[Diagram: cells, each containing content, books, transforms, simulations and api services, deployed across UK South (London), UK West (Newport) and Australia East (Sydney)]
Charybdis
■ Drivers:
■ Keeping keyspace configurations up to date between regions would be a challenge.
■ Similar table structures were being repeated over and over through many different services.
■ So we created Charybdis – a Golang ScyllaDB helper library:
■ A table-manager that will automatically create keyspaces and add tables, columns and indexes as
required.
■ Simplified functions for CRUD-style operations.
■ Support for LWT and TTL.
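The table-manager idea can be sketched as a diff between the columns a service wants and the columns that already exist. This is not the Charybdis API (Charybdis is a Go library and its interface is not shown in the talk). The function name `plan_table_changes` and its parameters are hypothetical, and the sketch only handles adding columns, not type changes, drops or indexes.

```python
def plan_table_changes(keyspace, table, desired, primary_key, existing=None):
    """Return the CQL statements needed to bring a table up to the desired schema.

    desired:  dict of column name -> CQL type that the service expects
    existing: dict of column name -> CQL type currently in the database,
              or None if the table does not exist yet
    """
    qualified = f"{keyspace}.{table}"
    if existing is None:
        columns = ", ".join(f"{name} {cql_type}" for name, cql_type in desired.items())
        key = ", ".join(primary_key)
        return [f"CREATE TABLE IF NOT EXISTS {qualified} ({columns}, PRIMARY KEY ({key}));"]
    # Table exists: add only the columns that are missing.
    return [
        f"ALTER TABLE {qualified} ADD {name} {cql_type};"
        for name, cql_type in desired.items()
        if name not in existing
    ]


desired = {"user_id": "text", "first_name": "text", "visits": "int"}

# First deployment: the table is missing entirely.
for statement in plan_table_changes("record_customer_a", "records", desired, ["user_id"]):
    print(statement)

# Later deployment: the table exists but two columns were added to the model since.
for statement in plan_table_changes("record_customer_a", "records", desired, ["user_id"],
                                    existing={"user_id": "text"}):
    print(statement)
```

Running the same plan against an up-to-date table yields no statements, so services can apply it on startup without coordinating schema changes by hand.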
Maintenance
Our Topology service controls the location of data replicas on a per-service and/or
per-customer basis. Each service or customer has a dedicated keyspace per service, per data type.
[Diagram: service, topology, replication strategy]
CREATE KEYSPACE record_customer_a WITH replication={'class':'NetworkTopologyStrategy'...
CREATE TABLE record_customer_a.records (user_id text, PRIMARY KEY (user_id));
ALTER TABLE record_customer_a.records ADD (first_name text);
ALTER TABLE record_customer_a.records ADD (visits int);
More to Learn…
■ We’re still early in our journey.
■ For example, our initial attempt at dynamic schema validation caused some requests
between services to time out when it was the first access of customer-specific data for that
instance.
■ There are still many ScyllaDB configuration settings we’ve yet to test and we’re certain that
we can increase performance further.
Thank You
Stay in Touch
Carly Christensen
carly@zeroflucs.io
@carlyflucs
www.linkedin.com/in/carly-christensen-16441375/
