SlideShare uma empresa Scribd logo
1 de 34
Baixar para ler offline
Brought to you by
Scaling Apache
Pulsar to 10 PB/day
Karthik Ramasamy
Senior Director of Engineering at
Karthik Ramasamy
Senior Director of Engineering
@karthikz
streaming @splunk | ex-CEO of @streamlio | co-creator of @heronstreaming | ex @Twitter | Ph.D
During the course of this presentation, we may make forward‐looking statements
regarding future events or plans of the company. We caution you that such statements
reflect our current expectations and estimates based on factors currently known to us
and that actual events or results may differ materially. The forward-looking statements
made in the this presentation are being made as of the time and date of its live
presentation. If reviewed after its live presentation, it may not contain current or
accurate information. We do not assume any obligation to update any forward‐looking
statements made herein.
In addition, any information about our roadmap outlines our general product direction
and is subject to change at any time without notice. It is for informational purposes
only, and shall not be incorporated into any contract or other commitment. Splunk
undertakes no obligation either to develop the features or functionalities described or
to include any such feature or functionality in a future release.
Splunk, Splunk>, Data-to-Everything, D2E and Turn Data Into Doing are trademarks and registered trademarks of Splunk Inc. in the
United States and other countries. All other brand names, product names or trademarks belong to their respective owners. © 2020
Splunk Inc. All rights reserved
Forward-
Looking
Statements
1) Introduction to Splunk & DSP
2) Requirements, Use Cases & Deployment
3) Initial Cluster Size Estimation
4) Optimizations
5) Conclusion
Agenda
Splunk Data Stream Processor
Detect Data Patterns or Conditions
Mask Sensitive Data
Aggregate Format
Normalize Transform
Filter Enhance
Turn Raw Data Into
High-value Information
Protect Sensitive Data
Distribute Data To Splunk
Or Other Destinations
Data
Warehouse
Public
Cloud
Message
Bus
Splunk DSP
A real time stream processing solution that collects, processes and delivers data to Splunk and
other destinations in milliseconds
DSP - Bird’s Eye View
HEC
S2S
Batch
Apache Pulsar
Stream Processing
Engine
External
Systems
REST Client
Forwarders
Data Source
Splunk
Indexer
Apache Pulsar is at the core of DSP
Customer Requirements & Deployment
■ Marquee customer is in finance and payments
■ Microservices and applications emit logs
■ Logs contain rich information
■ Process these logs and extract monitoring & tracing information
■ Filter these logs depending on log volume and if there is high value - justifying
retention
■ Compute real time business metrics
Use Cases
■ Environment - Google Cloud Platform
■ Use of n1-standard-32 VMs
■ Raw data ingestion of 10 PB/day that translates ~120 GB/sec
■ Data retention of 3 hours
■ Need to handle the entire traffic load when a zone fails
Data Requirements
DSP Ingest Cluster
DSP Compute Cluster
DSP Compute Cluster
DSP Compute Cluster
Log
Publisher
Log
Publisher
Log
Publisher
Apache
Pulsar
Cluster
Pipeline 1
Pipeline 2
Pipeline 3
Splunk
Enterprise
Splunk
Observability
DSP Deployment
DSP Deployment
■ Separation of ingestion and computation
■ Pipeline isolation and no noisy neighbor issues
■ Troubleshooting single pipeline gets easier
■ Might not need over provisioning except for peak load + fudge
factor (as compared to deploying a single cluster)
■ 32 vCPUs
■ 120 GB of memory
■ Max number of PDs (EBS equivalent) - 128
■ Max total PD size - 257 TB
■ Max egress network bandwidth - 32 Gbps (4 GBps)
■ Max 24 L-SSDs for a total of 9 TB
VM Configuration - n1-standard-32
P-SSD
P-HDD L-SSD
Storage Options in GCP
Initial Estimation
■ Replica factor of 3
■ Need to handle 120 GBps of raw traffic
■ Need to handle 360 GBps of storage write bandwidth
■ With journal required write bandwidth 720 GBps
■ Total storage required for retention - 3.9 PB
■ Total ingress network bandwidth - 480 GBps
■ Total egress network bandwidth - 1200 GBps
Apache Pulsar Requirements
■ Size of a Pulsar Cluster for a given workload depends on three
parameters -
■ Storage Density - Aggregate storage capacity needed in the
cluster and proportional to retention of data
■ Storage Bandwidth - Aggregate write throughput and read
throughput needed for data ingestion and consumption. Heavily
depends on storage media
■ Network Bandwidth - Aggregate network bandwidth available in
the cluster for input traffic and output traffic.
Pulsar Cluster Size Estimation
Max of 200 MB/sec write
throughput per VM
Max of 9 TB
per instance
Max of 4 GBps egress
and ingress bandwidth
Dominated by
Storage Bandwidth
Estimating VMs using P-HDD
Max of 400 MB/sec write
throughput per VM
Max of 9 TB
per instance
Max of 4 GBps egress
and ingress bandwidth
Dominated by
Storage Bandwidth
Estimating VMs using P-SDD
Max of 850 MB/sec write
throughput per VM
Max of 9 TB
per instance
Max of 4 GBps egress
and ingress bandwidth
Dominated by
Storage Bandwidth
Estimating VMs using L-SDD
Estimation of VMs - Comparison
Optimizations
Optimization #1 - Eliminating Journal
Since all the data is machine logs, we implemented replicated durability
■ Different types of durability
■ Persistent Durability - No data loss in the presence of nodes failures or
entire cluster failure
■ Replicated Durability - No data loss in the presence of limited nodes
failures
■ Transient Durability - Data loss in the presence of failures
Estimating VMs
Dominated by
Storage Bandwidth
Dominated by
Storage Bandwidth
Dominated by
Storage Density
Optimization #2 - Direct I/O
■ Overhead of page cache in container environment is pretty high
■ Kernel needs to keep track of the usage quota per container for the
page cache
■ These translate into maintaining additional data structures and
lookups (older kernel had n^2 lookup time for getting pages in & out)
■ Bypassed page cache for BookKeeper entry log, using JNI:
■ We already have in memory caches (write and read-ahead)
■ We have better control on what to cache and when to evict
■ Avoid double buffering
Performance of Direct I/O
Estimating VMs
Dominated by
Storage Bandwidth
Dominated by
Storage Bandwidth Dominated by
Storage Density
Optimization #3 - Use of Compression
Bookie
Bookie
Bookie
Broker
Producer
Data
Data
Data
Compressed data
Consumer
Compressed data
C
U
Employing compression
■ Compression ratio of 4x
■ Need to handle 360 GBps —> 90 GBps of storage write bandwidth
■ Total storage required for retention - 3.9 PB —> 975 TB
■ Total ingress network bandwidth - 480 GBps —> 120 GBps
■ Total egress network bandwidth - 1200 GBps —> 240 GBps
Estimating VMs
Dominated by
Storage Bandwidth Dominated by
Storage Bandwidth Dominated by
Storage Density
Surviving Zone Failure
Segment 1
Segment 2
Segment n
.
.
.
Segment 2
Segment 3
Segment n
.
.
.
Segment 3
Segment 1
Segment n
.
.
.
Storage
Broker
Serving
Broker Broker
■ Zone/Rack Failures
■ Bookies provide rack awareness
■ Broker replicate data to different
racks/zones
■ In the presence of zone/rack failure,
data is available in other zones
■ One zone failure means two zones should
be capable of handling the entire traffic
■ Requires 50% additional VMs
Zone A Zone B Zone C
Estimating VMs
Optimization #4 - C++ Client CPU & Memory Usage
■ Better round robin across partitions - maximizing the batch
size per partition
■ Having bigger batches reduces the cpu usage for client, broker and
bookies
■ Increases the compression factor
■ Reduced client memory usages
■ Optimizations to minimize memory allocation overhead
■ Implemented memory limit in C++ producer
■ Simplifies the user configuration — One single setting instead of
multiple queue sizes and complex math
Finally …
Running 200 n1-standard-32 VMs for Pulsar cluster with 24 L-SSDs per VM
Brought to you by
@Karthikz
Karthik Ramasamy

Mais conteúdo relacionado

Mais procurados

How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)DataStax Academy
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Cloudera, Inc.
 
Scaling Apache Pulsar to 10 Petabytes/Day - Pulsar Summit NA 2021 Keynote
Scaling Apache Pulsar to 10 Petabytes/Day - Pulsar Summit NA 2021 KeynoteScaling Apache Pulsar to 10 Petabytes/Day - Pulsar Summit NA 2021 Keynote
Scaling Apache Pulsar to 10 Petabytes/Day - Pulsar Summit NA 2021 KeynoteStreamNative
 
MariaDB 10.5 binary install (바이너리 설치)
MariaDB 10.5 binary install (바이너리 설치)MariaDB 10.5 binary install (바이너리 설치)
MariaDB 10.5 binary install (바이너리 설치)NeoClova
 
ProxySQL for MySQL
ProxySQL for MySQLProxySQL for MySQL
ProxySQL for MySQLMydbops
 
Maxscale 소개 1.1.1
Maxscale 소개 1.1.1Maxscale 소개 1.1.1
Maxscale 소개 1.1.1NeoClova
 
Percona XtraDB Cluster ( Ensure high Availability )
Percona XtraDB Cluster ( Ensure high Availability )Percona XtraDB Cluster ( Ensure high Availability )
Percona XtraDB Cluster ( Ensure high Availability )Mydbops
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundMasahiko Sawada
 
MariaDB MaxScale
MariaDB MaxScaleMariaDB MaxScale
MariaDB MaxScaleMariaDB plc
 
[2018] MySQL 이중화 진화기
[2018] MySQL 이중화 진화기[2018] MySQL 이중화 진화기
[2018] MySQL 이중화 진화기NHN FORWARD
 
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity PlanningFrom Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planningconfluent
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkFlink Forward
 
Webinar: PostgreSQL continuous backup and PITR with Barman
Webinar: PostgreSQL continuous backup and PITR with BarmanWebinar: PostgreSQL continuous backup and PITR with Barman
Webinar: PostgreSQL continuous backup and PITR with BarmanGabriele Bartolini
 
MariaDB MaxScale monitor 매뉴얼
MariaDB MaxScale monitor 매뉴얼MariaDB MaxScale monitor 매뉴얼
MariaDB MaxScale monitor 매뉴얼NeoClova
 
Introducing Confluent labs Parallel Consumer client | Anthony Stubbes, Confluent
Introducing Confluent labs Parallel Consumer client | Anthony Stubbes, ConfluentIntroducing Confluent labs Parallel Consumer client | Anthony Stubbes, Confluent
Introducing Confluent labs Parallel Consumer client | Anthony Stubbes, ConfluentHostedbyConfluent
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controllerconfluent
 
Exactly-once Semantics in Apache Kafka
Exactly-once Semantics in Apache KafkaExactly-once Semantics in Apache Kafka
Exactly-once Semantics in Apache Kafkaconfluent
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Flink Forward
 
MaxScale이해와활용-2023.11
MaxScale이해와활용-2023.11MaxScale이해와활용-2023.11
MaxScale이해와활용-2023.11NeoClova
 
Intro ProxySQL
Intro ProxySQLIntro ProxySQL
Intro ProxySQLI Goo Lee
 

Mais procurados (20)

How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive


 
Scaling Apache Pulsar to 10 Petabytes/Day - Pulsar Summit NA 2021 Keynote
Scaling Apache Pulsar to 10 Petabytes/Day - Pulsar Summit NA 2021 KeynoteScaling Apache Pulsar to 10 Petabytes/Day - Pulsar Summit NA 2021 Keynote
Scaling Apache Pulsar to 10 Petabytes/Day - Pulsar Summit NA 2021 Keynote
 
MariaDB 10.5 binary install (바이너리 설치)
MariaDB 10.5 binary install (바이너리 설치)MariaDB 10.5 binary install (바이너리 설치)
MariaDB 10.5 binary install (바이너리 설치)
 
ProxySQL for MySQL
ProxySQL for MySQLProxySQL for MySQL
ProxySQL for MySQL
 
Maxscale 소개 1.1.1
Maxscale 소개 1.1.1Maxscale 소개 1.1.1
Maxscale 소개 1.1.1
 
Percona XtraDB Cluster ( Ensure high Availability )
Percona XtraDB Cluster ( Ensure high Availability )Percona XtraDB Cluster ( Ensure high Availability )
Percona XtraDB Cluster ( Ensure high Availability )
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparound
 
MariaDB MaxScale
MariaDB MaxScaleMariaDB MaxScale
MariaDB MaxScale
 
[2018] MySQL 이중화 진화기
[2018] MySQL 이중화 진화기[2018] MySQL 이중화 진화기
[2018] MySQL 이중화 진화기
 
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity PlanningFrom Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
 
Webinar: PostgreSQL continuous backup and PITR with Barman
Webinar: PostgreSQL continuous backup and PITR with BarmanWebinar: PostgreSQL continuous backup and PITR with Barman
Webinar: PostgreSQL continuous backup and PITR with Barman
 
MariaDB MaxScale monitor 매뉴얼
MariaDB MaxScale monitor 매뉴얼MariaDB MaxScale monitor 매뉴얼
MariaDB MaxScale monitor 매뉴얼
 
Introducing Confluent labs Parallel Consumer client | Anthony Stubbes, Confluent
Introducing Confluent labs Parallel Consumer client | Anthony Stubbes, ConfluentIntroducing Confluent labs Parallel Consumer client | Anthony Stubbes, Confluent
Introducing Confluent labs Parallel Consumer client | Anthony Stubbes, Confluent
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
 
Exactly-once Semantics in Apache Kafka
Exactly-once Semantics in Apache KafkaExactly-once Semantics in Apache Kafka
Exactly-once Semantics in Apache Kafka
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
 
MaxScale이해와활용-2023.11
MaxScale이해와활용-2023.11MaxScale이해와활용-2023.11
MaxScale이해와활용-2023.11
 
Intro ProxySQL
Intro ProxySQLIntro ProxySQL
Intro ProxySQL
 

Semelhante a Scaling Apache Pulsar to 10 Petabytes/Day

Scaling Apache Pulsar to 10 PB/day
Scaling Apache Pulsar to 10 PB/dayScaling Apache Pulsar to 10 PB/day
Scaling Apache Pulsar to 10 PB/dayKarthik Ramasamy
 
Headaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous ApplicationsHeadaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous ApplicationsDatabricks
 
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...Landon Robinson
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity PlanningMongoDB
 
Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...
Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...
Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...Databricks
 
Disaster Recovery for Multi-Region Apache Kafka Ecosystems at Uber
Disaster Recovery for Multi-Region Apache Kafka Ecosystems at UberDisaster Recovery for Multi-Region Apache Kafka Ecosystems at Uber
Disaster Recovery for Multi-Region Apache Kafka Ecosystems at Uberconfluent
 
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBEVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBScott Mansfield
 
SVC / Storwize: cache partition analysis (BVQ howto)
SVC / Storwize: cache partition analysis  (BVQ howto)   SVC / Storwize: cache partition analysis  (BVQ howto)
SVC / Storwize: cache partition analysis (BVQ howto) Michael Pirker
 
AWS Webcast - Cost and Performance Optimization in Amazon RDS
AWS Webcast - Cost and Performance Optimization in Amazon RDSAWS Webcast - Cost and Performance Optimization in Amazon RDS
AWS Webcast - Cost and Performance Optimization in Amazon RDSAmazon Web Services
 
PGConf APAC 2018 - PostgreSQL performance comparison in various clouds
PGConf APAC 2018 - PostgreSQL performance comparison in various cloudsPGConf APAC 2018 - PostgreSQL performance comparison in various clouds
PGConf APAC 2018 - PostgreSQL performance comparison in various cloudsPGConf APAC
 
Migrating and Running DBs on Amazon RDS for Oracle
Migrating and Running DBs on Amazon RDS for OracleMigrating and Running DBs on Amazon RDS for Oracle
Migrating and Running DBs on Amazon RDS for OracleMaris Elsins
 
Elastify Cloud-Native Spark Application with Persistent Memory
Elastify Cloud-Native Spark Application with Persistent MemoryElastify Cloud-Native Spark Application with Persistent Memory
Elastify Cloud-Native Spark Application with Persistent MemoryDatabricks
 
Key considerations in productionizing streaming applications
Key considerations in productionizing streaming applicationsKey considerations in productionizing streaming applications
Key considerations in productionizing streaming applicationsKafkaZone
 
Hoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkHoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkVinoth Chandar
 
Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...
Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...
Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...Databricks
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2aspyker
 
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Community
 
Big data should be simple
Big data should be simpleBig data should be simple
Big data should be simpleDori Waldman
 

Semelhante a Scaling Apache Pulsar to 10 Petabytes/Day (20)

Scaling Apache Pulsar to 10 PB/day
Scaling Apache Pulsar to 10 PB/dayScaling Apache Pulsar to 10 PB/day
Scaling Apache Pulsar to 10 PB/day
 
Headaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous ApplicationsHeadaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous Applications
 
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity Planning
 
HP 3PAR SSMC 2.1
HP 3PAR SSMC 2.1HP 3PAR SSMC 2.1
HP 3PAR SSMC 2.1
 
Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...
Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...
Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric...
 
Disaster Recovery for Multi-Region Apache Kafka Ecosystems at Uber
Disaster Recovery for Multi-Region Apache Kafka Ecosystems at UberDisaster Recovery for Multi-Region Apache Kafka Ecosystems at Uber
Disaster Recovery for Multi-Region Apache Kafka Ecosystems at Uber
 
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBEVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
 
SVC / Storwize: cache partition analysis (BVQ howto)
SVC / Storwize: cache partition analysis  (BVQ howto)   SVC / Storwize: cache partition analysis  (BVQ howto)
SVC / Storwize: cache partition analysis (BVQ howto)
 
AWS Webcast - Cost and Performance Optimization in Amazon RDS
AWS Webcast - Cost and Performance Optimization in Amazon RDSAWS Webcast - Cost and Performance Optimization in Amazon RDS
AWS Webcast - Cost and Performance Optimization in Amazon RDS
 
PGConf APAC 2018 - PostgreSQL performance comparison in various clouds
PGConf APAC 2018 - PostgreSQL performance comparison in various cloudsPGConf APAC 2018 - PostgreSQL performance comparison in various clouds
PGConf APAC 2018 - PostgreSQL performance comparison in various clouds
 
Migrating and Running DBs on Amazon RDS for Oracle
Migrating and Running DBs on Amazon RDS for OracleMigrating and Running DBs on Amazon RDS for Oracle
Migrating and Running DBs on Amazon RDS for Oracle
 
Elastify Cloud-Native Spark Application with Persistent Memory
Elastify Cloud-Native Spark Application with Persistent MemoryElastify Cloud-Native Spark Application with Persistent Memory
Elastify Cloud-Native Spark Application with Persistent Memory
 
Key considerations in productionizing streaming applications
Key considerations in productionizing streaming applicationsKey considerations in productionizing streaming applications
Key considerations in productionizing streaming applications
 
Hoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkHoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on Spark
 
Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...
Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...
Incremental Processing on Large Analytical Datasets with Prasanna Rajaperumal...
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2
 
@IBM Power roadmap 8
@IBM Power roadmap 8 @IBM Power roadmap 8
@IBM Power roadmap 8
 
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
 
Big data should be simple
Big data should be simpleBig data should be simple
Big data should be simple
 

Mais de ScyllaDB

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLScyllaDB
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...ScyllaDB
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...ScyllaDB
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaScyllaDB
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityScyllaDB
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptxScyllaDB
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDBScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationScyllaDB
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsScyllaDB
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesScyllaDB
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsScyllaDB
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101ScyllaDB
 

Mais de ScyllaDB (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual Workshop
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & Tradeoffs
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101
 

Último

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 

Último (20)

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

Scaling Apache Pulsar to 10 Petabytes/Day

  • 1. Brought to you by Scaling Apache Pulsar to 10 PB/day Karthik Ramasamy Senior Director of Engineering at
  • 2. Karthik Ramasamy Senior Director of Engineering @karthikz streaming @splunk | ex-CEO of @streamlio | co-creator of @heronstreaming | ex @Twitter | Ph.D
  • 3. During the course of this presentation, we may make forward‐looking statements regarding future events or plans of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results may differ materially. The forward-looking statements made in the this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, it may not contain current or accurate information. We do not assume any obligation to update any forward‐looking statements made herein. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only, and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionalities described or to include any such feature or functionality in a future release. Splunk, Splunk>, Data-to-Everything, D2E and Turn Data Into Doing are trademarks and registered trademarks of Splunk Inc. in the United States and other countries. All other brand names, product names or trademarks belong to their respective owners. © 2020 Splunk Inc. All rights reserved Forward- Looking Statements
  • 4. 1) Introduction to Splunk & DSP 2) Requirements, Use Cases & Deployment 3) Initial Cluster Size Estimation 4) Optimizations 5) Conclusion Agenda
  • 5. Splunk Data Stream Processor Detect Data Patterns or Conditions Mask Sensitive Data Aggregate Format Normalize Transform Filter Enhance Turn Raw Data Into High-value Information Protect Sensitive Data Distribute Data To Splunk Or Other Destinations Data Warehouse Public Cloud Message Bus Splunk DSP A real time stream processing solution that collects, processes and delivers data to Splunk and other destinations in milliseconds
  • 6. DSP - Bird’s Eye View HEC S2S Batch Apache Pulsar Stream Processing Engine External Systems REST Client Forwarders Data Source Splunk Indexer Apache Pulsar is at the core of DSP
  • 8. ■ Marquee customer is in finance and payments ■ Microservices and applications emit logs ■ Logs contain rich information ■ Process these logs and extract monitoring & tracing information ■ Filter these logs depending on log volume and if there is high value - justifying retention ■ Compute real time business metrics Use Cases
  • 9. ■ Environment - Google Cloud Platform ■ Use of n1-standard-32 VMs ■ Raw data ingestion of 10 PB/day that translates ~120 GB/sec ■ Data retention of 3 hours ■ Need to handle the entire traffic load when a zone fails Data Requirements
  • 10. DSP Ingest Cluster DSP Compute Cluster DSP Compute Cluster DSP Compute Cluster Log Publisher Log Publisher Log Publisher Apache Pulsar Cluster Pipeline 1 Pipeline 2 Pipeline 3 Splunk Enterprise Splunk Observability DSP Deployment
  • 11. DSP Deployment ■ Separation of ingestion and computation ■ Pipeline isolation and no noisy neighbor issues ■ Troubleshooting single pipeline gets easier ■ Might not need over provisioning except for peak load + fudge factor (as compared to deploying a single cluster)
  • 12. ■ 32 vCPUs ■ 120 GB of memory ■ Max number of PDs (EBS equivalent) - 128 ■ Max total PD size - 257 TB ■ Max egress network bandwidth - 32 Gbps (4 GBps) ■ Max 24 L-SSDs for a total of 9 TB VM Configuration - n1-standard-32
  • 15. ■ Replica factor of 3 ■ Need to handle 120 GBps of raw traffic ■ Need to handle 360 GBps of storage write bandwidth ■ With journal required write bandwidth 720 GBps ■ Total storage required for retention - 3.9 PB ■ Total ingress network bandwidth - 480 GBps ■ Total egress network bandwidth - 1200 GBps Apache Pulsar Requirements
  • 16. ■ Size of a Pulsar Cluster for a given workload depends on three parameters - ■ Storage Density - Aggregate storage capacity needed in the cluster and proportional to retention of data ■ Storage Bandwidth - Aggregate write throughput and read throughput needed for data ingestion and consumption. Heavily depends on storage media ■ Network Bandwidth - Aggregate network bandwidth available in the cluster for input traffic and output traffic. Pulsar Cluster Size Estimation
  • 17. Max of 200 MB/sec write throughput per VM Max of 9 TB per instance Max of 4 GBps egress and ingress bandwidth Dominated by Storage Bandwidth Estimating VMs using P-HDD
  • 18. Max of 400 MB/sec write throughput per VM Max of 9 TB per instance Max of 4 GBps egress and ingress bandwidth Dominated by Storage Bandwidth Estimating VMs using P-SDD
  • 19. Max of 850 MB/sec write throughput per VM Max of 9 TB per instance Max of 4 GBps egress and ingress bandwidth Dominated by Storage Bandwidth Estimating VMs using L-SDD
  • 20. Estimation of VMs - Comparison
  • 22. Optimization #1 - Eliminating Journal Since all the data is machine logs, we implemented replicated durability ■ Different types of durability ■ Persistent Durability - No data loss in the presence of nodes failures or entire cluster failure ■ Replicated Durability - No data loss in the presence of limited nodes failures ■ Transient Durability - Data loss in the presence of failures
  • 23. Estimating VMs Dominated by Storage Bandwidth Dominated by Storage Bandwidth Dominated by Storage Density
  • 24. Optimization #2 - Direct I/O ■ Overhead of page cache in container environment is pretty high ■ Kernel needs to keep track of the usage quota per container for the page cache ■ These translate into maintaining additional data structures and lookups (older kernel had n^2 lookup time for getting pages in & out) ■ Bypassed page cache for BookKeeper entry log, using JNI: ■ We already have in memory caches (write and read-ahead) ■ We have better control on what to cache and when to evict ■ Avoid double buffering
  • 26. Estimating VMs Dominated by Storage Bandwidth Dominated by Storage Bandwidth Dominated by Storage Density
  • 27. Optimization #3 - Use of Compression Bookie Bookie Bookie Broker Producer Data Data Data Compressed data Consumer Compressed data C U
  • 28. Employing compression ■ Compression ratio of 4x ■ Need to handle 360 GBps —> 90 GBps of storage write bandwidth ■ Total storage required for retention - 3.9 PB —> 975 TB ■ Total ingress network bandwidth - 480 GBps —> 120 GBps ■ Total egress network bandwidth - 1200 GBps —> 240 GBps
  • 29. Estimating VMs Dominated by Storage Bandwidth Dominated by Storage Bandwidth Dominated by Storage Density
  • 30. Surviving Zone Failure Segment 1 Segment 2 Segment n . . . Segment 2 Segment 3 Segment n . . . Segment 3 Segment 1 Segment n . . . Storage Broker Serving Broker Broker ■ Zone/Rack Failures ■ Bookies provide rack awareness ■ Broker replicate data to different racks/zones ■ In the presence of zone/rack failure, data is available in other zones ■ One zone failure means two zones should be capable of handling the entire traffic ■ Requires 50% additional VMs Zone A Zone B Zone C
  • 32. Optimization #4 - C++ Client CPU & Memory Usage ■ Better round robin across partitions - maximizing the batch size per partition ■ Having bigger batches reduces the cpu usage for client, broker and bookies ■ Increases the compression factor ■ Reduced client memory usages ■ Optimizations to minimize memory allocation overhead ■ Implemented memory limit in C++ producer ■ Simplifies the user configuration — One single setting instead of multiple queue sizes and complex math
  • 33. Finally … Running 200 n1-standard-32 VMs for Pulsar cluster with 24 L-SSDs per VM
  • 34. Brought to you by @Karthikz Karthik Ramasamy