This document discusses Apache Pulsar, a cloud-native messaging and event streaming platform. It provides an overview of key Pulsar concepts including messaging vs streaming, the Pulsar cluster architecture using brokers and bookies, and Pulsar Functions which allow processing data streams using multiple programming languages. Examples of using Pulsar Functions with Java, Python and deploying on Kubernetes are also presented. Benefits of using Pulsar for building microservices, asynchronous communication, real-time applications and tiered storage are highlighted.
2. Apache Pulsar is a Cloud-Native Messaging and Event-Streaming Platform.
3. Key Pulsar Concepts: Messaging vs Streaming
Message Queueing - Queueing systems are ideal for work queues that do not require tasks to be performed in a particular order.
Streaming - Streaming works best in situations where the order of messages is important.
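The difference between the two models can be illustrated with a small in-memory sketch. This is not Pulsar client code: the function names and worker names below are hypothetical, and the round-robin dispatch is only an approximation of how a work queue spreads tasks across consumers.

```python
from collections import defaultdict
from itertools import cycle


def dispatch_as_queue(messages, workers):
    """Work-queue style: hand each message to the next worker in turn.

    Any worker may take any task, so no single worker sees the full order.
    """
    assignments = defaultdict(list)
    for worker, message in zip(cycle(workers), messages):
        assignments[worker].append(message)
    return dict(assignments)


def dispatch_as_stream(messages, consumer):
    """Streaming style: one consumer receives every message in publish order."""
    return {consumer: list(messages)}


messages = ["m0", "m1", "m2", "m3", "m4"]
queue_result = dispatch_as_queue(messages, ["worker-a", "worker-b"])
stream_result = dispatch_as_stream(messages, "reader-1")
print(queue_result)
print(stream_result)
```

In Pulsar itself, these two behaviours roughly correspond to choosing a Shared subscription (work distributed across consumers) or an Exclusive subscription (a single consumer reading in order).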
5. Pulsar Functions
● Consume messages from one or more Pulsar topics.
● Apply user-supplied processing logic to each message.
● Publish the results of the computation to another topic.
● Support multiple programming languages (Java, Python, Go).
● Can leverage 3rd-party libraries.
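The consume, process, publish cycle listed on this slide can be sketched without a running cluster. The code below is only a simulation under stated assumptions: topics are modelled as in-memory deques, and `run_function` and `shout` are hypothetical names, not part of the Pulsar Functions API.

```python
from collections import deque


def run_function(process, input_topics, output_topic):
    """Drain each input topic, apply `process`, and publish the results."""
    for topic in input_topics:
        while topic:
            message = topic.popleft()          # consume from an input topic
            result = process(message)          # apply user-supplied logic
            if result is not None:             # in this sketch, None means "publish nothing"
                output_topic.append(result)    # publish to the output topic


def shout(message):
    """Example user logic: upper-case the message and add emphasis."""
    return message.upper() + "!"


orders = deque(["new order", "order shipped"])
alerts = deque(["low stock"])
results = deque()

run_function(shout, [orders, alerts], results)
print(list(results))
```

The point of the design is that the user writes only the `process` step; reading from input topics and writing to the output topic are handled by the runtime.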
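6. Why Pulsar Functions?
import java.util.function.Function;

public class MyFunction implements Function<String, String> {
    @Override
    public String apply(String input) {
        return doBusinessLogic(input);
    }

    // Hypothetical placeholder: the slide leaves the business logic undefined.
    private String doBusinessLogic(String input) {
        return input;
    }
}
Entire Function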
7. Pulsar Python NLP Function
from pulsar import Function
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
import json

class Chat(Function):
    def __init__(self):
        pass

    def process(self, input, context):
        # Each incoming message is a JSON document with a "comment" field.
        fields = json.loads(input)
        # Assumption: the slide never defines msg_id; here it is read from the function context.
        msg_id = context.get_message_id()
        sid = SentimentIntensityAnalyzer()
        ss = sid.polarity_scores(fields["comment"])
        row = {}
        row['id'] = str(msg_id)
        # VADER's compound score ranges from -1 (most negative) to +1 (most positive).
        if ss['compound'] < 0.00:
            row['sentiment'] = 'Negative'
        else:
            row['sentiment'] = 'Positive'
        row['comment'] = str(fields["comment"])
        json_string = json.dumps(row)
        return json_string
Entire Function
https://github.com/tspannhw/pulsar-pychat-function
8. Function Mesh
Pulsar Functions, along with Pulsar IO/Connectors, provide a powerful API for ingesting, transforming,
and outputting data.
Function Mesh, another StreamNative project, makes it easier for developers to create entire
applications built from sources, functions, and sinks all through a declarative API.
16. Timothy Spann
Developer Advocate
FLiP(N) Stack = Flink, Pulsar and NiFi Stack
Streaming Systems & Data Architecture Expert
Experience:
15+ years of experience with streaming technologies including
Pulsar, Flink, Spark, NiFi, Kafka, Big Data, Cloud, MXNet, IoT
and more.
Today, he helps to grow the Pulsar community sharing rich
technical knowledge and experience at both global
conferences and through individual conversations.
17. FLiP Stack Weekly
This week in Apache Flink, Apache Pulsar, Apache
NiFi, Apache Spark, Elasticsearch and open source
friends.
https://bit.ly/32dAJft
18. When are Messaging and Streaming used?
● To access historical as well as real-time data
● To send event streams from multiple producers to multiple consumers through a pub/sub model
● To process large amounts of data in a highly scalable way
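How a single topic can serve both historical and real-time readers, with several producers and consumers, can be sketched with a retained log and one cursor per subscription. This is a toy in-memory model, not the Pulsar client API: the `Topic` class, its methods, and the sensor messages are all hypothetical.

```python
class Topic:
    """Toy topic: an append-only log plus one read position per subscription."""

    def __init__(self):
        self.log = []        # retained messages, in publish order
        self.cursors = {}    # subscription name -> index of next unread message

    def publish(self, message):
        self.log.append(message)

    def subscribe(self, name, from_earliest=False):
        # Start at the beginning to replay history, or at the end for new data only.
        self.cursors[name] = 0 if from_earliest else len(self.log)

    def receive(self, name):
        """Return all unread messages for a subscription and advance its cursor."""
        position = self.cursors[name]
        pending = self.log[position:]
        self.cursors[name] = len(self.log)
        return pending


topic = Topic()
topic.publish("sensor-1: 20.5")                  # producer A
topic.publish("sensor-2: 19.0")                  # producer B
topic.subscribe("dashboard")                     # real-time consumer
topic.subscribe("audit", from_earliest=True)     # consumer that replays history
topic.publish("sensor-1: 21.0")

dashboard_messages = topic.receive("dashboard")
audit_messages = topic.receive("audit")
print(dashboard_messages)
print(audit_messages)
```

Because each subscription keeps its own cursor, the two consumers read the same log independently and at their own pace.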
19. Apache Pulsar - Built for Containers / Modern Cloud
Apache Pulsar adoption is being driven by organizations seeking cloud-native architectures and new use cases.
● Microservices
● Cloud Native
● Hybrid & Multi-Cloud
● Containers
20. Pulsar Cluster
● “Bookies” (store messages)
  ● Store messages and cursors
  ● Messages are grouped in segments/ledgers
  ● A group of bookies forms an “ensemble” to store a ledger
● “Brokers”
  ● Handle message routing and connections
  ● Stateless, but with caches
  ● Automatic load-balancing
  ● Topics are composed of multiple segments
● Metadata Store (ZK, RocksDB, etcd, …)
  ● Stores metadata for both Pulsar and BookKeeper
  ● Service discovery
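The idea of an ensemble of bookies storing a ledger can be sketched as round-robin striping: each entry is written to a few consecutive bookies in the ensemble. This is a simplified illustration, not BookKeeper's actual placement code; the function name, bookie names, and the write quorum of 2 are assumptions chosen for the example.

```python
def bookies_for_entry(entry_id, ensemble, write_quorum):
    """Pick `write_quorum` consecutive bookies, starting at entry_id mod ensemble size."""
    start = entry_id % len(ensemble)
    return [ensemble[(start + i) % len(ensemble)] for i in range(write_quorum)]


ensemble = ["bookie-0", "bookie-1", "bookie-2"]   # hypothetical ensemble for one ledger
placement = {
    entry_id: bookies_for_entry(entry_id, ensemble, write_quorum=2)
    for entry_id in range(4)
}
for entry_id, bookies in placement.items():
    print(entry_id, bookies)
```

With this layout every entry has two copies, and consecutive entries land on different bookies, so both writes and reads are spread across the ensemble.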
22. Offloader & Tiered Storage
[Diagram: Producer and Consumer; Apache Pulsar (Broker 0, Broker 1, Broker 2); Apache BookKeeper (Bookie 0 through Bookie 4); Offloader & Tiered Storage (offloading)]