Apache Pulsar: First Overview
Ricardo Paiva
First impressions of Apache Pulsar features from someone who has never used it. :)
Motivation
3 •
Motivation
Kafka is an amazing tool with incredible throughput and resilience, but it has some drawbacks and lacks a few features:
- Capacity of a partition is limited by the smallest node
- Ops: adding or removing a broker requires cluster rebalancing
- No long-term storage
- Only the pub/sub client pattern (no work queue)
- No namespace or tenancy management
- No multi-cluster replication
Key concepts
5 •
Tiered Storage
Uses Apache jclouds
6 •
Multi-tenant and Namespace
Pulsar Components
8 •
Brokers
9 •
Bookies
10 •
Producer
11 •
Consumer
12 •
ZooKeeper
13 •
Schema Registry
- Uses BookKeeper for storage, but another schema registry can be plugged in
- A schema can be uploaded when a typed producer is created, or via the REST API
- Versioned
- Defined at the topic level
- Format types:
  - String (used for UTF-8-encoded strings)
  - JSON
  - Protobuf
  - Avro
- Only works with Java
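To make "versioned" and "defined at topic level" concrete, here is a minimal sketch of a per-topic registry. It is a toy model, not Pulsar's implementation or API: the class `ToySchemaRegistry`, its methods, the dictionary schema format, and the zero-based version numbers are all assumptions made for illustration.

```python
class ToySchemaRegistry:
    """Toy per-topic, versioned schema store (illustrative only, not Pulsar's API)."""

    def __init__(self):
        # topic name -> list of schema definitions; the list index is the version
        self._versions = {}

    def register(self, topic, schema):
        """Store a schema for a topic and return its version number."""
        versions = self._versions.setdefault(topic, [])
        if versions and versions[-1] == schema:
            # Re-uploading the current schema does not create a new version.
            return len(versions) - 1
        versions.append(schema)
        return len(versions) - 1

    def latest(self, topic):
        """Return (version, schema) for the newest schema of a topic."""
        versions = self._versions[topic]
        return len(versions) - 1, versions[-1]


registry = ToySchemaRegistry()
v_first = registry.register("users", {"type": "STRING"})
v_same = registry.register("users", {"type": "STRING"})
v_next = registry.register("users", {"type": "JSON", "fields": ["name", "age"]})
latest_version, latest_schema = registry.latest("users")
```

Each topic keeps its own history, so a schema change on one topic does not affect the versions of any other topic.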
Subscription modes
15 •
Message Acknowledgment
16 •
Retention
- Message retention
  - Applies to messages that have been acknowledged and would otherwise be deleted
  - It is a time limit applied on a topic
- TTL
  - Applies to messages that have not been consumed
  - It is a time limit on consumption within a subscription
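The difference between the two limits is easier to see as a small decision function. This is a simplified sketch that assumes a single subscription and measures both limits from the message's age; the function name `message_fate` and its parameters are hypothetical and do not correspond to any Pulsar API.

```python
def message_fate(age_s, acknowledged, retention_s, ttl_s):
    """Classify a message under a toy retention/TTL model.

    age_s        -- seconds since the message was published
    acknowledged -- whether the (single) subscription has acknowledged it
    retention_s  -- how long acknowledged messages are kept
    ttl_s        -- how long unacknowledged messages may wait, or None for no TTL
    """
    if not acknowledged:
        if ttl_s is None or age_s <= ttl_s:
            # Still waiting to be consumed: it stays in the backlog.
            return "kept (backlog)"
        # TTL expired: the message is treated as if it had been acknowledged.
    # From here on the message counts as acknowledged, so retention decides.
    if age_s <= retention_s:
        return "kept (retained)"
    return "deletable"


fates = [
    message_fate(10, False, retention_s=60, ttl_s=30),    # young, not consumed
    message_fate(40, False, retention_s=60, ttl_s=30),    # TTL hit, retention not
    message_fate(90, False, retention_s=60, ttl_s=30),    # both limits exceeded
    message_fate(90, False, retention_s=60, ttl_s=None),  # no TTL: waits forever
    message_fate(10, True, retention_s=60, ttl_s=30),     # consumed, still retained
]
```

In this model TTL only decides when an unconsumed message stops waiting, while retention decides how long a message is kept once nothing is waiting for it.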
17 •
Exclusive
18 •
Failover
19 •
Shared (work queue)
- Message ordering is not guaranteed.
- You cannot use cumulative acknowledgment in shared mode.
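A toy dispatcher shows why both restrictions follow from spreading one subscription over several consumers. The round-robin policy, the function `dispatch_shared`, and the consumer names are assumptions for illustration; the real broker dispatch logic is more involved.

```python
from itertools import cycle


def dispatch_shared(messages, consumers):
    """Hand out messages round-robin across consumers (toy shared subscription)."""
    assignment = {consumer: [] for consumer in consumers}
    for message, consumer in zip(messages, cycle(consumers)):
        assignment[consumer].append(message)
    return assignment


messages = list(range(6))
assignment = dispatch_shared(messages, ["c1", "c2", "c3"])

# Individual acknowledgment: c1 acknowledges only what it actually received.
acked = set(assignment["c1"])
still_pending = [m for m in messages if m not in acked]

# A cumulative ack from c1 "up to message 3" would also cover messages
# that were delivered to other consumers and may not be processed yet.
wrongly_covered = [m for m in messages if m <= 3 and m not in assignment["c1"]]
```

Because each consumer sees only a slice of the stream, no single consumer can vouch for everything up to a given position, and the overall processing order depends on how fast each consumer works.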
Internals
21 •
Bookie Storage
22 •
Cold storage
23 •
SQL with Presto
Other features
25 •
Geo Replication (Sync)
- Requires a global ZooKeeper installation
- Region-aware placement policy
- Higher latency
26 •
Geo Replication (Async)
- Rack-aware placement policy
- Messages are first persisted to the local cluster and then replicated asynchronously to the remote clusters
- Enabled on a per-tenant basis
- Types:
  - master-slave replication
  - active-active bidirectional replication
  - full-mesh replication between multiple data centers
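The "persist locally first, replicate later" flow can be sketched with an in-memory model of a full mesh. Everything here is hypothetical (the `Cluster` class, the `full_mesh` helper, the cluster names); it only illustrates the ordering of steps, not how Pulsar's replicators are implemented.

```python
from collections import deque


class Cluster:
    """Toy cluster: a local log plus an outbox of messages awaiting replication."""

    def __init__(self, name):
        self.name = name
        self.log = []          # messages persisted in this cluster
        self.peers = []        # remote clusters this one replicates to
        self.outbox = deque()  # (peer, message) pairs not yet replicated

    def publish(self, message):
        # Step 1: persist locally; the producer does not wait for remote clusters.
        self.log.append(message)
        # Step 2: queue the message for every remote cluster.
        for peer in self.peers:
            self.outbox.append((peer, message))

    def replicate(self):
        # Runs later, asynchronously from the producer's point of view.
        while self.outbox:
            peer, message = self.outbox.popleft()
            # Written directly to the peer's log so it is not forwarded again.
            peer.log.append(message)


def full_mesh(*clusters):
    """Connect every cluster to every other cluster."""
    for cluster in clusters:
        cluster.peers = [peer for peer in clusters if peer is not cluster]


us, eu, asia = Cluster("us"), Cluster("eu"), Cluster("asia")
full_mesh(us, eu, asia)

us.publish("a")
eu_log_before_replication = list(eu.log)  # the remote copy does not exist yet
us.replicate()
eu.publish("b")
eu.replicate()
```

The window between `publish` and `replicate` is the price of the lower latency: a remote cluster can briefly lag behind the cluster where the message was produced.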
27 •
Deduplication
- Per producer/topic sequence numbers are used to detect duplicates
- The broker that owns a topic maintains an in-memory hash map of the latest sequence number per topic/producer.
- The broker periodically snapshots the latest sequence number to a cursor, which allows the map to be reconstructed by another broker after a fail-over.
Source: https://jack-vanlightly.com/blog/2018/10/25/testing-producer-deduplication-in-apache-kafka-and-apache-pulsar
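The sequence-number check and the snapshot hand-over described on this slide can be sketched in a few lines. This is a toy model under stated assumptions: the `DedupBroker` class, its method names, and the plain dictionary used as the snapshot are made up for illustration, and the sketch ignores messages accepted between the last snapshot and the fail-over.

```python
class DedupBroker:
    """Toy broker-side deduplication (illustrative only, not Pulsar's code)."""

    def __init__(self, snapshot=None):
        # (topic, producer) -> latest accepted sequence number.
        # A new owner broker rebuilds this map from the last snapshot.
        self.last_seq = dict(snapshot or {})
        self.log = []

    def publish(self, topic, producer, seq, payload):
        """Accept the message unless its sequence number was already seen."""
        key = (topic, producer)
        if seq <= self.last_seq.get(key, -1):
            return False  # duplicate, e.g. a producer retry after a timeout
        self.last_seq[key] = seq
        self.log.append((topic, payload))
        return True

    def snapshot(self):
        """Copy of the map, standing in for the periodic snapshot to a cursor."""
        return dict(self.last_seq)


broker = DedupBroker()
results = [
    broker.publish("orders", "producer-a", seq, f"msg-{seq}")
    for seq in (0, 1, 1, 2)  # sequence number 1 is sent twice
]

# Fail-over: another broker takes over the topic using the snapshot.
new_owner = DedupBroker(snapshot=broker.snapshot())
retry_accepted = new_owner.publish("orders", "producer-a", 2, "msg-2")
next_accepted = new_owner.publish("orders", "producer-a", 3, "msg-3")
```

The retry of sequence number 2 is rejected by the new owner because the snapshot carried the producer's latest sequence number across the fail-over.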
28 •
Pulsar Functions
- Lightweight compute framework for Pulsar
- Can run inside or outside the cluster
- State storage is handled by BookKeeper
- "Serverless" idea
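As an illustration of how small such a function can be, the sketch below follows the plain-function style of the Python runtime, where the module exposes a `process` function that receives one message and returns the result. How the function is deployed and which input and output topics it is wired to are not shown here and would have to be taken from the Pulsar documentation.

```python
def process(input):
    """Append an exclamation mark to each incoming message.

    The parameter is named `input` to mirror the usual examples, which means
    it shadows Python's built-in of the same name inside this function.
    """
    return "{}!".format(input)


# Local check without any Pulsar cluster: call the function directly.
sample_output = process("hello")
```

Because the logic is an ordinary function, it can be unit-tested locally before it is ever attached to a topic.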


Editor's Notes

  1. Do a quick presentation of each other, then a short agenda (first Kafka basics, then the design choices that made it a great tool for our scale)