Handling Increasing Load and Reducing Costs Using Aerospike NoSQL Database
Speed and Scale when you need it
Zohar Elkayam, Solutions Architect, Aerospike
April 2020
2 Proprietary & Confidential | All rights reserved. © 2020 Aerospike Inc.
• Some background on how Aerospike works
• What happens when we need more?
• Live Demo
• All Flash – Reducing Costs
• Aerospike Cloud
Agenda
Unbreakable Competitive Advantage
Flash Optimized Storage Layer
✓ Significantly higher performance & IOPS
Multi-threaded, Massively Parallel
✓ ‘Scale up’ and ‘Scale out’
Self-healing clusters
✓ Superior Uptime, Availability and Reliability
Storage indices in DRAM, data on optimized SSDs
✓ Predictable Performance regardless of scale
✓ Single-hop to data
Patented Aerospike Hybrid Memory Architecture™
[Figure: % of cluster data. Each node holds 25% of the cluster data, spread evenly inside the node across SSD 1 to SSD 5 at 5% each.]
Linear Scaling
✓ Scale UP – take full advantage of hardware
✓ Scale OUT – linear scaling with number of nodes
Automatic Distribution of Data using Smart Partitions™ Algorithm
✓ Even amount of data on every node and flash device
✓ All hardware used equally
✓ Load on all servers is balanced
✓ No “hot spots”
✓ No config changes as workload or use case changes
Smart Clients
✓ Single “hop” from client to server
✓ Cluster-spanning operations (scan, query, batch) sent to all nodes for parallel processing
Data Distribution and Scalability
[Figure: three clients connected to a four-node cluster: Node A, Node B, Node C, Node D]
Partition   Master   Replica 1   Replica 2   Replica 3
1           A        B           C           D
2           B        D           A           C
3           D        A           C           B
…
4096        C        A           B           D
Partition table created when cluster (re-)forms
✓ Deterministic
✓ Optimized for even data distribution
Client Pulls Partition Map
✓ Detects when cluster has changed and refreshes map
✓ Constant hashing algorithm (RIPEMD160) used to map
key to partition id
✓ Allows single network hop to owning node
Node Addition / Removal
✓ Cluster detects new / removed node through heartbeats
✓ Re-forms table by promoting / removing nodes
✓ E.g.: if node B is removed, C becomes replica on partition 1; D becomes master and A becomes replica on partition 2.
✓ Minimizes migration of data
✓ Distributes load of lost node to all other cluster nodes.
Cluster Formation
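The key-to-node lookup described on this slide can be sketched roughly as follows. This is a minimal illustration, not the Aerospike client's actual code: the function names are hypothetical, only the raw key bytes are hashed, and taking 12 bits of the digest as the partition id is an assumption based on 4096 = 2**12. SHA-1 is used only as a same-length stand-in when the local Python build does not ship RIPEMD-160. The sketch yields ids 0 to 4095, while the slide's table numbers its rows from 1.

```python
import hashlib

N_PARTITIONS = 4096  # the slide's table runs to partition 4096

# Succession lists copied from the rows shown on the slide
# (master first, then replicas). Rows hidden behind "…" are not filled in.
PARTITION_TABLE = {
    1: ["A", "B", "C", "D"],
    2: ["B", "D", "A", "C"],
    3: ["D", "A", "C", "B"],
    4096: ["C", "A", "B", "D"],
}


def key_digest(key: bytes) -> bytes:
    """Return a 20-byte digest of the key.

    RIPEMD-160 is used when the local OpenSSL build provides it;
    otherwise SHA-1 is used purely as a same-length stand-in.
    """
    try:
        return hashlib.new("ripemd160", key).digest()
    except ValueError:  # algorithm not available in this build
        return hashlib.sha1(key).digest()


def partition_id(key: bytes) -> int:
    """ASSUMPTION: the partition id is 12 bits of the digest (4096 == 2**12)."""
    digest = key_digest(key)
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


def master_for(pid: int) -> str:
    """Single hop: the client finds the owner in its local map, then calls it."""
    return PARTITION_TABLE[pid][0]


pid = partition_id(b"user:42")
print(f"key maps to partition {pid}")
print(f"partition 2 is served by node {master_for(2)}")
```

Because the hash and the table are both deterministic, every client computes the same owner for a key without asking a coordinator.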
• Cluster forms using Paxos algorithm and a Partition Table is generated.
• Each row in the Partition Table is the Succession List for that partition.
The Partition Table
• Every second, the Aerospike client library (ACL) tend thread queries each node for its Partition Version.
• A cluster change triggers Paxos re-clustering and bumps the Partition Version.
• When the ACL detects a change in Partition Version, it rebuilds the Partition Map by querying each node for its Master and Replica(s) ownership.
Partition Map
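The tend behaviour above can be sketched as a polling pass that rebuilds the map only when a version changes. Everything here is hypothetical scaffolding (`FakeNode`, `TendingClient`, the field names); a real client would issue info requests over the network and run the pass on a timer.

```python
from dataclasses import dataclass


@dataclass
class FakeNode:
    """Hypothetical stand-in for a server node answering client requests."""
    name: str
    partition_version: int
    masters: list   # partition ids this node owns as master
    replicas: list  # partition ids this node holds as replica


class TendingClient:
    def __init__(self, nodes):
        self.nodes = nodes
        self.seen_versions = {}  # node name -> last partition version seen
        self.partition_map = {}  # partition id -> {"master": ..., "replicas": [...]}
        self.rebuilds = 0

    def tend_once(self):
        """One tend pass; per the slide this would run about once per second."""
        versions = {n.name: n.partition_version for n in self.nodes}
        if versions != self.seen_versions:
            self.partition_map = self._rebuild_map()
            self.seen_versions = versions
            self.rebuilds += 1

    def _rebuild_map(self):
        """Ask every node what it owns and assemble the partition map."""
        pmap = {}
        for node in self.nodes:
            for pid in node.masters:
                entry = pmap.setdefault(pid, {"master": None, "replicas": []})
                entry["master"] = node.name
            for pid in node.replicas:
                entry = pmap.setdefault(pid, {"master": None, "replicas": []})
                entry["replicas"].append(node.name)
        return pmap


a = FakeNode("A", 1, masters=[1], replicas=[2])
b = FakeNode("B", 1, masters=[2], replicas=[1])
client = TendingClient([a, b])
client.tend_once()  # first pass builds the map
client.tend_once()  # nothing changed, so no rebuild
print(client.rebuilds, client.partition_map[1])
```

Comparing a small version number on every pass keeps the steady-state cost low; the full ownership query happens only after a cluster change.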
• New node F is added to the cluster.
• F may land anywhere in a partition’s succession list.
Example:
• Partition 0: Node F joins as Replica; B remains Master and fills data into F (Fill Migration).
• Partition 1: Node F joins as Master; E continues to act as Master until it finishes filling data into F. When this fill migration completes, F becomes the new Master (Master-Handoff).
Scaling Out - Adding a Node
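The two cases on this slide can be sketched as an insertion into a succession list. This is an illustration under assumptions: replication factor 2, the function name `add_node`, and the succession lists themselves are hypothetical (the slide only tells us that B is Master of partition 0 and E is Master of partition 1).

```python
def add_node(succession, new_node, position, rf=2):
    """Insert new_node into a succession list and describe the migration.

    Returns (new_succession, acting_master, action).
    """
    acting_master = succession[0]  # existing master keeps serving during the fill
    new_succession = succession[:position] + [new_node] + succession[position:]
    if position == 0:
        action = "master-handoff"  # new node becomes master once it is filled
    elif position < rf:
        action = "fill-migration"  # new node is filled as a replica
    else:
        action = "none"            # new node holds no copy of this partition
    return new_succession, acting_master, action


# Hypothetical succession lists; only the masters (B and E) come from the slide.
p0 = add_node(["B", "A", "C", "D", "E"], "F", position=1)
p1 = add_node(["E", "C", "A", "B", "D"], "F", position=0)
print(p0)
print(p1)
```

In both cases the old master keeps serving the partition while F is being filled, which is why adding a node does not interrupt traffic.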
• When a node is lost (e.g., node C), the succession list moves left.
Example:
• Partition 1: C was Replica, A becomes new Replica. Partition data migrates from Master E to A. Two copies of data are restored upon completion of migration.
• Partition 4094: C was Master, Replica B gets promoted to new Master. Typically, B will have full data. A becomes the new Replica. Partition data will be migrated from B to A.
Scaling Down: Removing/Losing a Node
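"Succession list moves left" can be sketched as dropping the lost node and reading roles off the front of the list. The helper names are hypothetical and replication factor 2 is assumed; the leading entries of each list follow the slide's example, while the tails are made up for illustration.

```python
def remove_node(succession, lost):
    """Drop the lost node; everything to its right moves one place left."""
    return [node for node in succession if node != lost]


def roles(succession, rf=2):
    """With replication factor rf, the first rf entries hold the data."""
    return {"master": succession[0], "replicas": succession[1:rf]}


# Leading entries follow the slide's example; the tails are hypothetical.
partition_1 = ["E", "C", "A", "B", "D"]
partition_4094 = ["C", "B", "A", "D", "E"]

after_1 = roles(remove_node(partition_1, "C"))
after_4094 = roles(remove_node(partition_4094, "C"))
print(after_1)     # E stays master, A becomes the replica
print(after_4094)  # B is promoted to master, A becomes the replica
```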
• We have a 5-node cluster: 4096/5 = ~819 master partitions per node.
• Adding a node, we have a 6-node cluster: 4096/6 = ~683 master partitions per node.
• For a given node capacity (RAM, disk), as cluster size grows, each node is responsible for fewer partitions, less data, less activity.
• When a node is taken out (e.g., rolling upgrade), the remaining nodes should still be able to store 2 copies of the data after the cluster re-balances automatically.
➢ Adjust cluster size with automatic data re-distribution and rebalancing.
Cluster Capacity When Scaling
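The capacity reasoning above is simple arithmetic and can be checked with a short sketch. The function names and the sizing figures in the last two calls (5 TB of unique data, 2 TB usable per node) are hypothetical; only the 4096 partitions and the two-copy requirement come from the slide.

```python
N_PARTITIONS = 4096


def master_partitions_per_node(nodes: int) -> float:
    """Average number of master partitions each node owns."""
    return N_PARTITIONS / nodes


def fits_after_losing_one_node(unique_data_tb, nodes, usable_tb_per_node, rf=2):
    """True if the surviving nodes can still hold rf copies of the data."""
    total_needed = unique_data_tb * rf
    return total_needed <= (nodes - 1) * usable_tb_per_node


print(round(master_partitions_per_node(5)))  # 5-node cluster
print(round(master_partitions_per_node(6)))  # 6-node cluster
# Hypothetical sizing: 5 TB of unique data, 2 TB usable per node.
print(fits_after_losing_one_node(5, nodes=5, usable_tb_per_node=2))
print(fits_after_losing_one_node(5, nodes=6, usable_tb_per_node=2))
```

With these made-up figures a 5-node cluster cannot absorb the loss of a node, while a 6-node cluster can, which is the headroom check to run before a rolling upgrade.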
Scale Up Demo Time!
Aerospike Server version 4.3.0.2 and later offer the ALL FLASH storage option.
• Allows the user to store the PRIMARY INDEX (PI) on device (NVMe SSD) instead of in memory.
Edge Systems
• For a large number of very small records with relaxed latency needs.
• RAM vs. SSD storage space ratio approaches 1:1, causing server sprawl.
• Significant cost savings by using ALL FLASH storage.
• No need to modify the data model with a reverse lookup implementation to improve the RAM:SSD ratio.
System of Record
• Cost savings with very small objects and very large data stores (> 100 TB).
ALL FLASH Configuration
About ALL FLASH Configuration
Scenario: 10 billion objects, 64 bytes per object
• When using Hybrid Memory, resources needed cluster-wide:
  • Memory: 10B * (Replication Factor=2) * (PI=64 bytes) = ~1.2 TB
  • Disk: 10B * (RF) * 64 bytes = ~1.2 TB
  • We need as much memory as disk: not a lot of data, but things become expensive!
  • Example hardware needed: 6 nodes of r5d.8xl (1.4 TB of RAM), at ~76K USD a year.
• When using All Flash:
  • Memory needed: 13 GB
  • Disk needed for index: 4 TB
  • Actual index utilization: 10B * (Replication Factor=2) * 64 bytes = ~1.2 TB
  • Disk utilization: 10B * (RF) * 64 bytes = ~1.2 TB
  • DRAM needs reduced; hardware needed: 3 nodes of i3en.3xl, at ~24K USD a year.
• Costs saved!
All Flash Cost Savings Examples
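The sizing arithmetic in this scenario can be reproduced with a few lines. The constant and function names are illustrative; the object count, replication factor, 64-byte sizes and yearly cost figures are taken from the slide as stated, not independently verified.

```python
OBJECTS = 10_000_000_000  # 10 billion objects
REPLICATION_FACTOR = 2
INDEX_ENTRY_BYTES = 64    # primary index entry per object copy
RECORD_BYTES = 64         # record size from the scenario


def cluster_bytes(objects, replication_factor, bytes_each):
    """Cluster-wide bytes needed for one kind of per-object storage."""
    return objects * replication_factor * bytes_each


index_bytes = cluster_bytes(OBJECTS, REPLICATION_FACTOR, INDEX_ENTRY_BYTES)
data_bytes = cluster_bytes(OBJECTS, REPLICATION_FACTOR, RECORD_BYTES)

hybrid_cost_usd = 76_000     # 6 x r5d.8xl per year, figure from the slide
all_flash_cost_usd = 24_000  # 3 x i3en.3xl per year, figure from the slide
saving = 1 - all_flash_cost_usd / hybrid_cost_usd

print(f"index: {index_bytes / 1e12:.2f} TB, data: {data_bytes / 1e12:.2f} TB")
print(f"yearly saving: {saving:.0%}")
```

Both products come to 1.28 TB in decimal units (about 1.16 TiB), which the slide rounds to ~1.2 TB, and the quoted prices work out to a saving of roughly 68% per year.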
• Aerospike Cloud: empowering customers to build, manage and automate their own Aerospike database-as-a-service (DBaaS).
• Aerospike Cloud for Accelerated Cloud Deployments: use standard tools across multiple cloud environments to accelerate the development, management and automation of your own Aerospike DBaaS.
• Standards-Based Approach: Aerospike Cloud Foundations is based on Cloud Native Computing Foundation (CNCF) standards.
New: Aerospike Cloud
• CNCF is a set of technologies that make loosely coupled cloud-based
deployments resilient, manageable, and observable.
• A basis for automating the management of cloud deployments
• A standard set of tools for alerting and monitoring systems
• Managed under the Linux Foundation
• Provides a governance model fit for enterprises and vendors of enterprise
software
• For Aerospike, CNCF provides a complete model
• Kubernetes, evolving support for Helm Charts, and Prometheus
What Is CNCF?
What We Are Delivering
Kubernetes operator
Custom Aerospike-specific extensions to the Kubernetes API that
encapsulate operations domain knowledge, such as scale-up,
scale-down, cluster configuration management, upgrades.
Helm Charts
The ability to deploy Aerospike clusters in a Kubernetes
environment using the Helm package manager, a CNCF incubating
project.
Prometheus
Integration with the CNCF graduated monitoring and alerting
solution by way of a custom exporter for Aerospike Enterprise
Edition and Alertmanager configs.
Grafana
Integration with CNCF member Grafana Labs' open source
visualization platform through custom dashboards for the
Aerospike EE Prometheus exporter.
• Announced in March 2020.
• Google Cloud First: Aerospike Cloud supports the Google Kubernetes Engine (GKE) on Google Cloud Platform (GCP). Full integration with other cloud platforms will follow soon.
• Individual parts are available for other cloud/on-prem platforms as well.
Aerospike Cloud Availability
Time for Q&A!
Thank You!
zelkayam@aerospike.com

