SlideShare uma empresa Scribd logo
1 de 38
Baixar para ler offline
Eyal Gutkind - VP of Solutions, ScyllaDB
Sizing your Scylla Cluster
A walk through the process
Presenter
2
Eyal Gutkind
Eyal Gutkind is VP of Solutions at ScyllaDB. Prior to joining
ScyllaDB, Eyal held product management roles at Mirantis
and DataStax, and spent 12 years with Mellanox
Technologies in various engineering management and
product marketing roles. Eyal holds a BSc. degree in
Electrical and Computer Engineering from Ben Gurion
University in Israel and an MBA from Fuqua School of
Business at Duke University.
3
+ The Real-Time Big Data Database
+ Drop-in replacement for Cassandra
+ 10X the performance & low tail latency
+ New: Scylla Cloud, DBaaS
+ Open source and enterprise editions
+ Founded by the creators of KVM hypervisor
+ HQs: Palo Alto, CA; Herzelia, Israel
About ScyllaDB
Agenda
+ Understand your workload
+ The machines
+ Let’s build a system
+ How it will look at different IaaS
+ Our sizing process
Understand
Your Workload
Thinking about database cluster sizing?!
6
Make sure you have all requirements set
7
Make sure you have all requirements set
8
+ Business
Make sure you have all requirements set
9
+ Business
+ Application
Make sure you have all requirements set
10
+ Business
+ Application
+ Infrastructure
Make sure you have all requirements set
11
+ Business
+ Application
+ Infrastructure
+ Resiliency
Make sure you have all requirements set
12
+ Business
+ Application
+ Infrastructure
+ Resiliency
+ Developer and Operator Friendliness
The obvious...
+ Data volume ingested per second/hour/day/year
+ Data attrition policy
+ Data format : Text, Binary blob
+ Required replication factor
+ What’s in your storage system?
13
14
The shape of your workload
The shape of your workload
https://www.scylladb.com/2019/05/23/workload-prioritization-running-oltp-and-olap-traffic-on-t
he-same-superhighway/
The Machines
Someone needs to do the job!
17
Memory and data volume
+ Keep disk space to memory at a reasonable ratio
18
Connectivity
+ Enable enough bandwidth for client-server and server-server communication
19
Connectivity
+ Enable enough bandwidth for
client-server and server-server
communication
+ Geo-replication benefits
20
Let’s build a system
Let’s build a system
Business requirements:
+ 1st year customers: 30M profiles
+ Number of records per profile: 12
+ Avg. size of each record: 3KB
+ Data types: text
22
Let’s build a system
Business requirements:
+ 1st year customers: 30M profiles
+ Number of records per profile: 12
+ Avg. size of each record: 3KB
+ Data types: text
Application requirements:
+ 99% write response time: 15ms
+ 99% read response time: 10ms
+ Peak throughput: 150,000 operations/sec
+ Read:Write ratio: 70:30
23
Let’s build a system
Business requirements:
+ 1st year customers: 30M profiles
+ Number of records per profile: 12
+ Avg. size of each record: 3KB
+ Data types: text
Application requirements:
+ 99% write response time: 15ms
+ 99% read response time: 10ms
+ Peak throughput: 150,000 operations/sec
+ Read:Write ratio: 70:30
Infrastructure:
+ AWS
+ Available instance types: all
+ Multi-DC: Oregon and N.Virgina
+ OS: CentOS 7.6
+ Replication Factor: 3
24
Let’s build a system
Business requirements:
+ 1st year customers: 30M profiles
+ Number of records per profile: 12
+ Avg. size of each record: 3KB
+ Data types: text
Application requirements:
+ 99% write response time: 15ms
+ 99% read response time: 10ms
+ Peak throughput: 150,000 operations/sec
+ Read:Write ratio: 70:30
Infrastructure:
+ AWS
+ Available instance types: all
+ Multi-DC: Oregon and N.Virgina
+ OS: CentOS 7.6
+ Replication Factor: 3
Auxiliary applications:
+ Spark
25
How it looks
at different IaaS
Let’s build a system
+ End of year Total raw data size: 2TB, Starting with ~1TB
+ Typical record size read/written by the application: 3KB
+ Data model: 20-30 tables, up to 20 columns per row, 10-50 rows per partition, mainly text
+ Latency requirements: 10ms Write, 15ms Read, for the 99%
+ Read:Write ratio: 70:30
+ Throughput: 150,000 database op/s
+ IaaS: AWS, multi-region, multi-availability-zone, N. Virginia and Oregon
+ Replication Factor: 3
+ Spark for analytics
27
Let’s build a system, Amazon Web Services
+ Per Data Center
+ Needed disk space: 12TB
+ Media type: NVMe drives to meet latency SLA
+ Requires 30 threads, 15 physical cores
+ Per Data center instance options
+ 3 x i3.4xlarge → Total disk: 11.5TB
and
+ 3 x i3.2xlarge for the Spark cluster
+ 1x i3.2xlarge for Scyla monitoring and Scylla manager
28
Let’s build a system, Azure
+ Per Data Center
+ Needed disk space: 12TB
+ Media type: NVMe drives to meet latency SLA
+ Requires 30 threads, 15 physical cores
+ Per Data center instance options
+ 3 x standard L16 v2 → Total disk: 11.5TB
and
+ 3 x standard L8 v2 for the Spark cluster
+ 1x standard L8 v2 for Scyla monitoring and Scylla manager
29
Let’s build a system, Google Cloud
+ Per Data Center
+ Needed disk space: 12TB
+ Media type: NVMe drives to meet latency SLA
+ Requires 30 threads, 15 physical cores
+ Per Data center instance options
+ 6 x n1-standard-16 + 5x NVMe based direct attached, 375GB drives
and
+ 3 x n1-standard-8 for the Spark cluster
+ 1 x n1-standard-8 for Scyla monitoring and Scylla manager
30
Let’s build a system, Scylla Cloud
+ Per Data Center
+ Needed disk space: 12TB
+ Media type: NVMe drives to meet latency SLA
+ Requires 30 threads, 15 physical cores
+ Per Data center instance options
+ 3 x i3.4xlarge → Total disk: 11.5TB
and
+ 3 x i3.2xlarge for the Spark cluster
31
Let’s build a system, on premise
+ Per Data Center
+ Needed disk space: 12TB
+ Media type: NVMe drives to meet latency SLA
+ Requires 30 threads, 15 physical cores
+ Per Data center instance options
+ 3 machines with 8 physical cores each and at least 4TB of SSD direct attached drives
+ 3 machines with 4 or more physical cores for Spark
+ 1 x machine for Scyla monitoring and Scylla manager
+ Scylla nodes and Spark nodes: 128GB RAM
+ Network: 10GbE
32
Our sizing process
Scylla’s sizing sheet
34
Summary
+ Do not think only storage!
+ Gather applications and business requirements
+ Throughput and SLAs
+ Growth expectations
+ Security and compliance needs
+ Select the right infrastructure
+ Think about resiliency and high availability
Ask us questions!
35
Best Practices for Data Modeling
August 7, 2019 | 10:00 AM PT - 1:00 PM ET
How to Shrink Your
Datacenter Footprint by 50%
August 14, 2019 | 10:00 AM PT - 1:00 PM ET
Q&A
Stay in touch
eyal@scylladb.com
@gutkinde
United States
1900 Embarcadero Road
Palo Alto, CA 94303
Israel
11 Galgalei Haplada
Herzelia, Israel
www.scylladb.com
@scylladb
Thank you

Mais conteúdo relacionado

Mais procurados

How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiFlink Forward
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergFlink Forward
 
Flink Forward Berlin 2017: Patrick Lucas - Flink in Containerland
Flink Forward Berlin 2017: Patrick Lucas - Flink in ContainerlandFlink Forward Berlin 2017: Patrick Lucas - Flink in Containerland
Flink Forward Berlin 2017: Patrick Lucas - Flink in ContainerlandFlink Forward
 
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...HostedbyConfluent
 
Common Strategies for Improving Performance on Your Delta Lakehouse
Common Strategies for Improving Performance on Your Delta LakehouseCommon Strategies for Improving Performance on Your Delta Lakehouse
Common Strategies for Improving Performance on Your Delta LakehouseDatabricks
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsAnton Kirillov
 
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and Beyond
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and BeyondScylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and Beyond
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and BeyondScyllaDB
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesYoshinori Matsunobu
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Ryan Blue
 
Building large scale transactional data lake using apache hudi
Building large scale transactional data lake using apache hudiBuilding large scale transactional data lake using apache hudi
Building large scale transactional data lake using apache hudiBill Liu
 
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiA Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiDatabricks
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversScyllaDB
 
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeSimplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeDatabricks
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Databricks
 
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark JobsFine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark JobsDatabricks
 
Modeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLModeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLScyllaDB
 
Using Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitUsing Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitFlink Forward
 

Mais procurados (20)

How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 
Flink Forward Berlin 2017: Patrick Lucas - Flink in Containerland
Flink Forward Berlin 2017: Patrick Lucas - Flink in ContainerlandFlink Forward Berlin 2017: Patrick Lucas - Flink in Containerland
Flink Forward Berlin 2017: Patrick Lucas - Flink in Containerland
 
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
 
ClickHouse Keeper
ClickHouse KeeperClickHouse Keeper
ClickHouse Keeper
 
Common Strategies for Improving Performance on Your Delta Lakehouse
Common Strategies for Improving Performance on Your Delta LakehouseCommon Strategies for Improving Performance on Your Delta Lakehouse
Common Strategies for Improving Performance on Your Delta Lakehouse
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & Internals
 
ZDLRA in Action
ZDLRA in ActionZDLRA in Action
ZDLRA in Action
 
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and Beyond
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and BeyondScylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and Beyond
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and Beyond
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability Practices
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
 
Building large scale transactional data lake using apache hudi
Building large scale transactional data lake using apache hudiBuilding large scale transactional data lake using apache hudi
Building large scale transactional data lake using apache hudi
 
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiA Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
 
Presto: SQL-on-anything
Presto: SQL-on-anythingPresto: SQL-on-anything
Presto: SQL-on-anything
 
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeSimplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
 
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark JobsFine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark Jobs
 
Modeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLModeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQL
 
Using Queryable State for Fun and Profit
Using Queryable State for Fun and ProfitUsing Queryable State for Fun and Profit
Using Queryable State for Fun and Profit
 

Semelhante a Sizing your Scylla Cluster in AWS, Azure and GCP

Scylla Virtual Workshop 2020
Scylla Virtual Workshop 2020Scylla Virtual Workshop 2020
Scylla Virtual Workshop 2020ScyllaDB
 
Scylla Virtual Workshop 2022
Scylla Virtual Workshop 2022Scylla Virtual Workshop 2022
Scylla Virtual Workshop 2022ScyllaDB
 
Transforming the Database: Critical Innovations for Performance at Scale
Transforming the Database: Critical Innovations for Performance at ScaleTransforming the Database: Critical Innovations for Performance at Scale
Transforming the Database: Critical Innovations for Performance at ScaleScyllaDB
 
AquaQ Analytics Kx Event - Data Direct Networks Presentation
AquaQ Analytics Kx Event - Data Direct Networks PresentationAquaQ Analytics Kx Event - Data Direct Networks Presentation
AquaQ Analytics Kx Event - Data Direct Networks PresentationAquaQ Analytics
 
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...DataStax
 
Data Platform in the Cloud
Data Platform in the CloudData Platform in the Cloud
Data Platform in the CloudAmihay Zer-Kavod
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3 Omid Vahdaty
 
AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned Omid Vahdaty
 
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...Fast data in times of crisis with GPU accelerated database QikkDB | Business ...
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...Matej Misik
 
Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8MongoDB
 
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...Andrew Liu
 
"EventStoreDb: To be, or not to be, that is the question", Illia Maier
"EventStoreDb: To be, or not to be, that is the question",  Illia Maier"EventStoreDb: To be, or not to be, that is the question",  Illia Maier
"EventStoreDb: To be, or not to be, that is the question", Illia MaierFwdays
 
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...confluent
 
Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist SoftServe
 
The True Cost of NoSQL DBaaS Options
The True Cost of NoSQL DBaaS OptionsThe True Cost of NoSQL DBaaS Options
The True Cost of NoSQL DBaaS OptionsScyllaDB
 
Measuring Database Performance on Bare Metal AWS Instances
Measuring Database Performance on Bare Metal AWS InstancesMeasuring Database Performance on Bare Metal AWS Instances
Measuring Database Performance on Bare Metal AWS InstancesScyllaDB
 
Introducing Scylla Cloud
Introducing Scylla CloudIntroducing Scylla Cloud
Introducing Scylla CloudScyllaDB
 
Keeping your application’s latency SLAs no matter what
Keeping your application’s latency SLAs no matter whatKeeping your application’s latency SLAs no matter what
Keeping your application’s latency SLAs no matter whatScyllaDB
 
Avoiding Common Pitfalls: Spark Structured Streaming with Kafka
Avoiding Common Pitfalls: Spark Structured Streaming with KafkaAvoiding Common Pitfalls: Spark Structured Streaming with Kafka
Avoiding Common Pitfalls: Spark Structured Streaming with KafkaHostedbyConfluent
 
DDN and Intel: Partnered for Exascale
DDN and Intel: Partnered for ExascaleDDN and Intel: Partnered for Exascale
DDN and Intel: Partnered for ExascaleIntel IT Center
 

Semelhante a Sizing your Scylla Cluster in AWS, Azure and GCP (20)

Scylla Virtual Workshop 2020
Scylla Virtual Workshop 2020Scylla Virtual Workshop 2020
Scylla Virtual Workshop 2020
 
Scylla Virtual Workshop 2022
Scylla Virtual Workshop 2022Scylla Virtual Workshop 2022
Scylla Virtual Workshop 2022
 
Transforming the Database: Critical Innovations for Performance at Scale
Transforming the Database: Critical Innovations for Performance at ScaleTransforming the Database: Critical Innovations for Performance at Scale
Transforming the Database: Critical Innovations for Performance at Scale
 
AquaQ Analytics Kx Event - Data Direct Networks Presentation
AquaQ Analytics Kx Event - Data Direct Networks PresentationAquaQ Analytics Kx Event - Data Direct Networks Presentation
AquaQ Analytics Kx Event - Data Direct Networks Presentation
 
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
 
Data Platform in the Cloud
Data Platform in the CloudData Platform in the Cloud
Data Platform in the Cloud
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3
 
AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned
 
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...Fast data in times of crisis with GPU accelerated database QikkDB | Business ...
Fast data in times of crisis with GPU accelerated database QikkDB | Business ...
 
Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8
 
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
 
"EventStoreDb: To be, or not to be, that is the question", Illia Maier
"EventStoreDb: To be, or not to be, that is the question",  Illia Maier"EventStoreDb: To be, or not to be, that is the question",  Illia Maier
"EventStoreDb: To be, or not to be, that is the question", Illia Maier
 
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
 
Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist
 
The True Cost of NoSQL DBaaS Options
The True Cost of NoSQL DBaaS OptionsThe True Cost of NoSQL DBaaS Options
The True Cost of NoSQL DBaaS Options
 
Measuring Database Performance on Bare Metal AWS Instances
Measuring Database Performance on Bare Metal AWS InstancesMeasuring Database Performance on Bare Metal AWS Instances
Measuring Database Performance on Bare Metal AWS Instances
 
Introducing Scylla Cloud
Introducing Scylla CloudIntroducing Scylla Cloud
Introducing Scylla Cloud
 
Keeping your application’s latency SLAs no matter what
Keeping your application’s latency SLAs no matter whatKeeping your application’s latency SLAs no matter what
Keeping your application’s latency SLAs no matter what
 
Avoiding Common Pitfalls: Spark Structured Streaming with Kafka
Avoiding Common Pitfalls: Spark Structured Streaming with KafkaAvoiding Common Pitfalls: Spark Structured Streaming with Kafka
Avoiding Common Pitfalls: Spark Structured Streaming with Kafka
 
DDN and Intel: Partnered for Exascale
DDN and Intel: Partnered for ExascaleDDN and Intel: Partnered for Exascale
DDN and Intel: Partnered for Exascale
 

Mais de ScyllaDB

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLScyllaDB
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...ScyllaDB
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...ScyllaDB
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaScyllaDB
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityScyllaDB
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptxScyllaDB
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDBScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationScyllaDB
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsScyllaDB
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesScyllaDB
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsScyllaDB
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101ScyllaDB
 

Mais de ScyllaDB (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual Workshop
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & Tradeoffs
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101
 

Último

SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfStefano Stabellini
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxAndreas Kunz
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 

Último (20)

SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdf
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 

Sizing your Scylla Cluster in AWS, Azure and GCP

  • 1. Eyal Gutkind - VP of Solutions, ScyllaDB Sizing your Scylla Cluster A walk through the process
  • 2. Presenter 2 Eyal Gutkind Eyal Gutkind is VP of Solutions at ScyllaDB. Prior to joining ScyllaDB, Eyal held product management roles at Mirantis and DataStax, and spent 12 years with Mellanox Technologies in various engineering management and product marketing roles. Eyal holds a BSc. degree in Electrical and Computer Engineering from Ben Gurion University in Israel and an MBA from Fuqua School of Business at Duke University.
  • 3. 3 + The Real-Time Big Data Database + Drop-in replacement for Cassandra + 10X the performance & low tail latency + New: Scylla Cloud, DBaaS + Open source and enterprise editions + Founded by the creators of KVM hypervisor + HQs: Palo Alto, CA; Herzelia, Israel About ScyllaDB
  • 4. Agenda + Understand your workload + The machines + Let’s build a system + How it will look at different IaaS + Our sizing process
  • 6. Thinking about database cluster sizing?! 6
  • 7. Make sure you have all requirements set 7
  • 8. Make sure you have all requirements set 8 + Business
  • 9. Make sure you have all requirements set 9 + Business + Application
  • 10. Make sure you have all requirements set 10 + Business + Application + Infrastructure
  • 11. Make sure you have all requirements set 11 + Business + Application + Infrastructure + Resiliency
  • 12. Make sure you have all requirements set 12 + Business + Application + Infrastructure + Resiliency + Developer and Operator Friendliness
  • 13. The obvious... + Data volume ingested per second/hour/day/year + Data attrition policy + Data format : Text, Binary blob + Required replication factor + What’s in your storage system? 13
  • 14. 14 The shape of your workload
  • 15. The shape of your workload https://www.scylladb.com/2019/05/23/workload-prioritization-running-oltp-and-olap-traffic-on-t he-same-superhighway/
  • 17. Someone needs to do the job! 17
  • 18. Memory and data volume + Keep disk space to memory at a reasonable ratio 18
  • 19. Connectivity + Enable enough bandwidth for client-server and server-server communication 19
  • 20. Connectivity + Enable enough bandwidth for client-server and server-server communication + Geo-replication benefits 20
  • 21. Let’s build a system
  • 22. Let’s build a system Business requirements: + 1st year customers: 30M profiles + Number of records per profile: 12 + Avg. size of each record: 3KB + Data types: text 22
  • 23. Let’s build a system Business requirements: + 1st year customers: 30M profiles + Number of records per profile: 12 + Avg. size of each record: 3KB + Data types: text Application requirements: + 99% write response time: 15ms + 99% read response time: 10ms + Peak throughput: 150,000 operations/sec + Read:Write ratio: 70:30 23
  • 24. Let’s build a system Business requirements: + 1st year customers: 30M profiles + Number of records per profile: 12 + Avg. size of each record: 3KB + Data types: text Application requirements: + 99% write response time: 15ms + 99% read response time: 10ms + Peak throughput: 150,000 operations/sec + Read:Write ratio: 70:30 Infrastructure: + AWS + Available instance types: all + Multi-DC: Oregon and N.Virgina + OS: CentOS 7.6 + Replication Factor: 3 24
  • 25. Let’s build a system Business requirements: + 1st year customers: 30M profiles + Number of records per profile: 12 + Avg. size of each record: 3KB + Data types: text Application requirements: + 99% write response time: 15ms + 99% read response time: 10ms + Peak throughput: 150,000 operations/sec + Read:Write ratio: 70:30 Infrastructure: + AWS + Available instance types: all + Multi-DC: Oregon and N.Virgina + OS: CentOS 7.6 + Replication Factor: 3 Auxiliary applications: + Spark 25
  • 26. How it looks at different IaaS
  • 27. Let’s build a system + End of year Total raw data size: 2TB, Starting with ~1TB + Typical record size read/written by the application: 3KB + Data model: 20-30 tables, up to 20 columns per row, 10-50 rows per partition, mainly text + Latency requirements: 10ms Write, 15ms Read, for the 99% + Read:Write ratio: 70:30 + Throughput: 150,000 database op/s + IaaS: AWS, multi-region, multi-availability-zone, N. Virginia and Oregon + Replication Factor: 3 + Spark for analytics 27
  • 28. Let’s build a system, Amazon Web Services + Per Data Center + Needed disk space: 12TB + Media type: NVMe drives to meet latency SLA + Requires 30 threads, 15 physical cores + Per Data center instance options + 3 x i3.4xlarge → Total disk: 11.5TB and + 3 x i3.2xlarge for the Spark cluster + 1x i3.2xlarge for Scyla monitoring and Scylla manager 28
  • 29. Let’s build a system, Azure + Per Data Center + Needed disk space: 12TB + Media type: NVMe drives to meet latency SLA + Requires 30 threads, 15 physical cores + Per Data center instance options + 3 x standard L16 v2 → Total disk: 11.5TB and + 3 x standard L8 v2 for the Spark cluster + 1x standard L8 v2 for Scyla monitoring and Scylla manager 29
  • 30. Let’s build a system, Google Cloud + Per Data Center + Needed disk space: 12TB + Media type: NVMe drives to meet latency SLA + Requires 30 threads, 15 physical cores + Per Data center instance options + 6 x n1-standard-16 + 5x NVMe based direct attached, 375GB drives and + 3 x n1-standard-8 for the Spark cluster + 1 x n1-standard-8 for Scyla monitoring and Scylla manager 30
  • 31. Let’s build a system, Scylla Cloud + Per Data Center + Needed disk space: 12TB + Media type: NVMe drives to meet latency SLA + Requires 30 threads, 15 physical cores + Per Data center instance options + 3 x i3.4xlarge → Total disk: 11.5TB and + 3 x i3.2xlarge for the Spark cluster 31
  • 32. Let’s build a system, on premise + Per Data Center + Needed disk space: 12TB + Media type: NVMe drives to meet latency SLA + Requires 30 threads, 15 physical cores + Per Data center instance options + 3 machines with 8 physical cores each and at least 4TB of SSD direct attached drives + 3 machines with 4 or more physical cores for Spark + 1 x machine for Scyla monitoring and Scylla manager + Scylla nodes and Spark nodes: 128GB RAM + Network: 10GbE 32
  • 35. Summary + Do not think only storage! + Gather applications and business requirements + Throughput and SLAs + Growth expectations + Security and compliance needs + Select the right infrastructure + Think about resiliency and high availability Ask us questions! 35
  • 36. Best Practices for Data Modeling August 7, 2019 | 10:00 AM PT - 1:00 PM ET How to Shrink Your Datacenter Footprint by 50% August 14, 2019 | 10:00 AM PT - 1:00 PM ET
  • 38. United States 1900 Embarcadero Road Palo Alto, CA 94303 Israel 11 Galgalei Haplada Herzelia, Israel www.scylladb.com @scylladb Thank you