Citizen Streaming Engineer - A How To
https://apachecon.com/acasia2022/sessions/integration-1010.html
Democratizing the ability to build streaming data pipelines will enable everyone who needs streaming data to build those pipelines themselves. Utilizing the open source streaming stack known as FLiPNS lets us achieve this.
FLiPNS is a stack of Apache Flink, Apache Pulsar, Apache NiFi and Apache Spark. Apache NiFi provides a web UI for citizen streaming engineers to build their own data pipelines. These citizen applications can ingest, route, transform, enrich, join and store data. Within these applications, Pulsar provides a streaming data hub for ML model access as well as for plugging in additional processing components.
Speakers:
Timothy Spann: StreamNative, Developer Advocate. Tim Spann is a Developer Advocate for StreamNative. He works with StreamNative Cloud, Apache Pulsar, Apache Flink, Flink SQL, Apache NiFi, MiniFi, Apache MXNet, TensorFlow, Apache Spark, Big Data, the IoT, machine learning, and deep learning. Tim has over a decade of experience with the IoT, big data, distributed computing, messaging, streaming technologies, and Java programming. Previously, he was a Principal DataFlow Field Engineer at Cloudera, a Senior Solutions Engineer at Hortonworks, a Senior Solutions Architect at AirisData, a Senior Field Engineer at Pivotal and a Team Leader at HPE. He blogs for DZone, where he is the Big Data Zone leader, and runs a popular meetup in Princeton on Big Data, Cloud, IoT, deep learning, streaming, NiFi, the blockchain, and Spark. Tim is a frequent speaker at conferences such as ApacheCon, DeveloperWeek, Pulsar Summit and many more. He holds a BS and MS in computer science.
2. Tim Spann
Developer Advocate
● FLiP(N) Stack = Flink, Pulsar and NiFi Stack
● Streaming Systems/ Data Architect
● Experience:
○ 15+ years of experience with batch and streaming technologies including Pulsar, Flink, Spark, NiFi, Spring, Java, Big Data, Cloud, MXNet, Hadoop, data lakes, IoT and more.
9. Easy to Build Streaming Data Pipelines
• Ingest Data
• Route, Transform, Enrich
• Join Data
• ML Model Access
• Store
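The consume → enrich → publish loop behind these steps can be sketched with the Pulsar Java client from Scala. This is a minimal sketch, not the talk's actual flow: the broker URL and topic names are illustrative assumptions, and the enrichment step is a placeholder where a real pipeline might call an ML model or a NiFi/Flink processor.

```scala
// Sketch only: assumes a Pulsar broker at pulsar://pulsar1:6650 and the
// org.apache.pulsar:pulsar-client dependency; topic names are illustrative.
import org.apache.pulsar.client.api.{PulsarClient, Schema}

object EnrichPipeline {
  def main(args: Array[String]): Unit = {
    val client = PulsarClient.builder()
      .serviceUrl("pulsar://pulsar1:6650")
      .build()

    // Consume raw ingested events (e.g., published by a NiFi flow).
    val consumer = client.newConsumer(Schema.STRING)
      .topic("persistent://public/default/airquality")
      .subscriptionName("enricher")
      .subscribe()

    // Publish enriched events to a downstream topic.
    val producer = client.newProducer(Schema.STRING)
      .topic("persistent://public/default/airquality-enriched")
      .create()

    while (true) {
      val msg = consumer.receive()
      // Placeholder transform/enrich step; a real flow might score the
      // event against an ML model here.
      val enriched = msg.getValue.toUpperCase
      producer.send(enriched)
      consumer.acknowledge(msg)
    }
  }
}
```

Because Pulsar supports independent subscriptions per topic, several such consumers (dashboards, ML scorers, archival jobs) can read the same stream without interfering with one another.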
10. Why Apache NiFi?
• Guaranteed delivery
• Data buffering
- Backpressure
- Pressure release
• Prioritized queuing
• Flow-specific QoS
- Latency vs. throughput
- Loss tolerance
• Data provenance
• Supports push and pull models
• Hundreds of processors
• Visual command and control
• Over 300 components
• Flow templates
• Pluggable/multi-role security
• Designed for extension
• Clustering
• Version control
11. Use Apache NiFi For Ingest
https://streamnative.io/apache-nifi-connector/
• Ingest Data
• Cleanse
21. Use Apache Spark To Store
val dfPulsar = spark.readStream.format("pulsar")
    .option("service.url", "pulsar://pulsar1:6650")
    .option("admin.url", "http://pulsar1:8080")
    .option("topic", "persistent://public/default/airquality")
    .load()

// The Parquet file sink requires an output path and a checkpoint
// location; the paths below are illustrative placeholders.
val pQuery = dfPulsar.selectExpr("*")
    .writeStream.format("parquet")
    .option("path", "/tmp/airquality-parquet")
    .option("checkpointLocation", "/tmp/airquality-checkpoint")
    .start()
https://pulsar.apache.org/docs/en/adaptors-spark/
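Once the streaming query is running, the stored records can be read back as an ordinary batch DataFrame. A brief sketch, assuming the streaming query was given an illustrative output path such as /tmp/airquality-parquet:

```scala
// Batch read of the Parquet files produced by the streaming query.
val stored = spark.read.parquet("/tmp/airquality-parquet")
stored.printSchema()
println(stored.count())
```

This is a convenient way to verify that events are landing, or to hand the stored data to downstream batch analytics.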