Anúncio
Anúncio

Mais conteúdo relacionado

Similar a Citizen Streaming Engineer - A How To(20)

Mais de Timothy Spann(20)

Anúncio

Citizen Streaming Engineer - A How To

  1. Citizen Streaming Engineer - A How To 2022.07.29 Tim Spann | Developer Advocate
  2. Tim Spann Developer Advocate ● FLiP(N) Stack = Flink, Pulsar and NiFi Stack ● Streaming Systems/ Data Architect ● Experience: ○ 15+ years of experience with batch and streaming technologies including Pulsar, Flink, Spark, NiFi, Spring, Java, Big Data, Cloud, MXNet, Hadoop, Datalakes, IoT and more.
  3. Demo
  4. Demo https://github.com/tspannhw/airquality
  5. Apache Pulsar is a Cloud-Native Messaging and Event-Streaming Platform.
  6. Why Apache Pulsar? Unified Messaging Platform Guaranteed Message Delivery Resiliency Infinite Scalability
  7. Unified Messaging Model Streaming Consumer Consumer Consumer Subscription Shared Failover Consumer Consumer Subscription In case of failure in Consumer B-0 Consumer Consumer Subscription Exclusive X Consumer Consumer Key-Shared Subscription Pulsar Topic/Partition Messaging
  8. Ecosystem
  9. • Ingest Data • Route, Transform, Enrich • Join Data • ML Model Access • Store Easy to Build Streaming Data Pipelines
  10. Why Apache NiFi? • Guaranteed delivery • Data buffering - Backpressure - Pressure release • Prioritized queuing • Flow specific QoS - Latency vs. throughput - Loss tolerance • Data provenance • Supports push and pull models • Hundreds of processors • Visual command and control • Over a 300 components • Flow templates • Pluggable/multi-role security • Designed for extension • Clustering • Version Control
  11. Use Apache NiFi For Ingest https://streamnative.io/apache-nifi-connector/ • Ingest Data • Cleanse
  12. Apache NiFi <-> Apache Pulsar
  13. Use Apache Pulsar For Ingest
  14. Use Pulsar to Route/Transform/Enrich • Libraries • Functions • Connectors • AMQP, Kafka, MQTT • Tiered Storage
  15. • Utilizing JSON Data with a JSON Schema • Consistency, Contracts, Clean Data • This enables easy SQL: • Pulsar SQL (Presto SQL) • Flink SQL • Spark Structured Streaming Use Schemas
  16. • Use Java, Python or Go • Simple way to add functionality • Route / Filter / Transform • Call Machine Learning Models Use Pulsar Functions
  17. Deploying AI With an Event-Driven Platform https://dzone.com/trendreports/enterprise-ai-1
  18. ML Models via Python / Java FN • Visual Question and Answer • Natural Language Processing • Sentiment Analysis • Text Classification • Named Entity Recognition • Content-based Recommendations • Predictive Maintenance • Fault Detection • Fraud Detection • Time-Series Predictions • Naive Bayes
  19. Functions for Enrichment
  20. Use Apache Flink to Join / Aggregate Continuous SQL
  21. Use Apache Spark To Store val dfPulsar = spark.readStream.format("pulsar") .option("service.url", "pulsar://pulsar1:6650") .option("admin.url", "http://pulsar1:8080") .option("topic", "persistent://public/default/airquality").load() val pQuery = dfPulsar.selectExpr("*") .writeStream.format("parquet") .option("truncate", false).start() https://pulsar.apache.org/docs/en/adaptors-spark/
  22. Use Pulsar to Stream to Lakehouses
  23. (Queuing + Streaming) Simple Data Pipeline
  24. Streaming FLiP-ML Apps StreamNative Hub StreamNative Cloud Unified Batch and Stream COMPUTING Batch (Batch + Stream) Unified Batch and Stream STORAGE Offload (Queuing + Streaming) Tiered Storage Pulsar --- KoP --- MoP --- Websocket Pulsar Sink Streaming Edge Gateway Protocols CDC Apps
  25. Continuous Air Quality Aggregate Monitoring
  26. ● Buffer ● Batch ● Route ● Filter ● Aggregate ● Enrich ● Replicate ● Dedupe ● Decouple ● Distribute
  27. Apache Pulsar Apache Flink Apache NiFi Apache Spark https://streamnative.io/blog/engineering/2022-04-14-what-the-flip-is-the-flip-stack/
  28. https://streamnative.io/blog/engineering/2021-11-17-building-edge-applications-with-apache-pulsar/
  29. FLiP Stack Weekly This week in Apache Flink, Apache Pulsar, Apache NiFi, Apache Spark, Java and Open Source friends. https://bit.ly/32dAJft
  30. Let’s Keep in Touch! Tim Spann Developer Advocate PaaSDev https://www.linkedin.com/in/timothyspann https://github.com/tspannhw
  31. Apache Pulsar Training • Instructor-led courses • Pulsar Fundamentals • Pulsar Developers • Pulsar Operations • On-demand learning with labs • 300+ engineers, admins and architects trained! StreamNative Academy Now Available On-Demand Pulsar Training Academy.StreamNative.io
  32. Python For Pulsar on Pi ● https://github.com/tspannhw/FLiP-Pi-BreakoutGarden ● https://github.com/tspannhw/FLiP-Pi-Thermal ● https://github.com/tspannhw/FLiP-Pi-Weather ● https://github.com/tspannhw/FLiP-RP400 ● https://github.com/tspannhw/FLiP-Py-Pi-GasThermal ● https://github.com/tspannhw/FLiP-PY-FakeDataPulsar ● https://github.com/tspannhw/FLiP-Py-Pi-EnviroPlus ● https://github.com/tspannhw/PythonPulsarExamples ● https://github.com/tspannhw/pulsar-pychat-function ● https://github.com/tspannhw/FLiP-PulsarDevPython101
  33. Thanks
Anúncio