SlideShare uma empresa Scribd logo
1 de 39
Baixar para ler offline
Pulsar Summit
San Francisco
Hotel Nikko
August 18 2022
Ecosystem
Simplify Pulsar
Functions
Development with SQL
Neng Lu
Platform Engineering Lead • StreamNative
Neng Lu is the platform engineering
lead of compute at StreamNative. He
drives the development of Pulsar
Functions, Serverless Computing and
ecosystem integration. He is also a
committer of Apache Pulsar.
Neng Lu
Platform Engineering Lead
StreamNative
Rui Fu is a senior software engineer at
StreamNative and a committer of
Apache Pulsar. He actively contributes
to Pulsar Functions, Function Mesh and
Serverless Computing
Rui Fu
Senior Software Engineer
StreamNative
Pulsar Functions
Pulsar Functions – Recap
Pulsar Functions – Recap
Pulsar Functions – Use Cases
● ETL(Extract-Transform-Load) Jobs
● Microservices
● Event Routing
● Real-time Aggregation
● Easy Operation
○ Fully Integrated with Pulsar
○ No Extra Setup Needed
● Easy Development
○ Intuitive APIs:
■ Java: public O process(I input, Context context)
■ Python: def process(self, input, context)
■ Golang: func HandleRequest(ctx context.Context, in []byte) error
Pulsar Functions – Benefits
Easier Operation?
Function Worker Recap
● Function Worker interleaves with Pulsar Broker
● Need to set up separate Function Worker cluster
● Function Worker relies on Pulsar Topics for scheduling
● Function Worker’s k8s runtime not truly cloud native
Function Mesh
Function Mesh – Recap
● Serverless framework to run Pulsar Functions in a cloud native way
● Consists of:
○ Set of CRDs for defining Pulsar Functions and Connectors
■ Function
■ Source
■ Sink
○ Operator that constantly reconciles the submitted CR
■ create sts, service, configmap, etc.
■ update according to user change
■ auto-scale if configured
Function Mesh – Architecture
Function Mesh – Summary
● Scheduling by Kubernetes not Function Worker
○ Simplicity
○ Reliability
○ Stability (both for function & brokers)
○ Extensibility (HPA, VPA, Scale-To-Zero etc)
● Compatible with Pulsar Admin Rest API
○ Seamless user experience
Easier Development?
Use Case 1 – Filtering/Routing
● Commonly used for different business purposes → duplicated
code development
● Go through the whole Pulsar Functions dev life cycle
○ (Learn)
○ Develop
○ Package
○ Debug
○ Deploy
Use Case 2 – Connector with Transformations
● Long pipeline:
○ Connector
○ Transformation Function (Often duplicated with minor diffs)
○ Intermediate topic
● Go through the Pulsar Functions life cycle TWICE:
○ Connector
■ Develop(optional)
■ …
○ Transformation Function
■ Develop
■ Package
■ …
Any Solution?
SELECT * FROM StreamNative
SQL Abstraction – Why?
● Easiest to learn and apply
● Wide audience
● Safe & Controlled Operations
● Easy job life-cycle management
● Stream Processing Trend
SQL Abstraction – What?
● IS
○ an simplified way to develop Pulsar Functions pipeline
● IS NOT
○ an interactive tool to run ad-hoc query
SQL Abstraction – Components
● Gateway
● Runner
● Cli
SQL Abstraction – Gateway
● Parser <-> Runner
● Rest API Server <-> Cli
SQL Gateway – Parser
● Antlr4 grammar
● AST processor
● JSON representation
SQL Statement
Abstract
Syntax Tree
JSON
Representation
Parser – Grammar
SQL Abstraction – Syntax
● Value Expression
○ Literal: Primitive value, like string, number, or boolean
○ Field: message payload field
○ KEY: message key
○ PROPERTIES[P_KEY]: message property
● WITH Item Definition
○ WITH MERGE KEYVALUE: Merge the fields of KeyValue
schema
○ WITH UNWRAP KEY|VALUE: Extract Key or Value fields from
KeyValue schema
Parser – Examples
Parser – AST
Parser – JSON Representation
● Intermediate Representation
○ Filter
○ Router
○ Projection
○ WITH Conditions
SQL Abstraction – Runner
● An implementation of Pulsar
Functions API
● Accept the JSON
representation
● Generate Filtering/Routing
processor during initialization
● Utilize `GenericObject` to
handle different schemas
● Directly push result into target
topic
SQL Abstraction – Runner
● Processor
○ An interface for classes that
implement data transformations
○ schema projections
○ data manipulations
○ data type conversions
● Chain Compiler
○ List<Processor>
○ Compiled from the SQL Context
SQL Gateway – REST APIs
Query Management /snsql/query POST
/snsql/query/pause/$NAME GET
/snsql/query/resume/$NAME GET
/snsql/query/delete/$NAME GET
/snsql/query/status/$NAME GET
/snsql/query/stats/$NAME GET
Gateway Information /snsql/info GET
/snsql/healthcheck GET
SQL Gateway – REST Server
● Quarkus Framework
○ easy to implement
○ cloud-native support
● Metadata Management
○ write into Pulsar topic
○ read with TableView API
SQL Abstraction – CLI
● Terminal based tool
● Interact with the
SQL gateway APIs
● Query management
SQL Abstraction – Summary
Demo
Future Work
● Syntax support for Source/Sink
● Builtin system function support
● Aggregation Operation
● Join Operation
Resources
● Pulsar Functions:
https://pulsar.apache.org/docs/functions-overview/
● Function Mesh: https://functionmesh.io/
● Slack & Mailing List:
○ Apache Pulsar Slack: https://apache-pulsar.slack.com/
○ StreamNative Community Slack:
https://streamnativecommunity.slack.com/
○ Apache Pulsar Mailing List:
■ users@pulsar.apache.org
■ dev@pulsar.apache.org
Neng Lu
Thank you!
nlu@streamnative.io
Pulsar Summit
San Francisco
Hotel Nikko
August 18 2022
rfu@streamnative.io
Rui Fu
@nlu90
@freeznet rfu
nlu

Mais conteúdo relacionado

Semelhante a Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022

Openshift service broker and catalog ocp-meetup july 2018
Openshift service broker and catalog  ocp-meetup july 2018Openshift service broker and catalog  ocp-meetup july 2018
Openshift service broker and catalog ocp-meetup july 2018Michael Calizo
 
Spark Workflow Management
Spark Workflow ManagementSpark Workflow Management
Spark Workflow ManagementRomi Kuntsman
 
Your journey into the serverless world
Your journey into the serverless worldYour journey into the serverless world
Your journey into the serverless worldRed Hat Developers
 
Interactive workflow management using Azkaban
Interactive workflow management using AzkabanInteractive workflow management using Azkaban
Interactive workflow management using Azkabandatamantra
 
Activity feeds (and more) at mate1
Activity feeds (and more) at mate1Activity feeds (and more) at mate1
Activity feeds (and more) at mate1Hisham Mardam-Bey
 
PyCon HK 2018 - Heterogeneous job processing with Apache Kafka
PyCon HK 2018 - Heterogeneous job processing with Apache Kafka PyCon HK 2018 - Heterogeneous job processing with Apache Kafka
PyCon HK 2018 - Heterogeneous job processing with Apache Kafka Hua Chu
 
Serverless Boston @ Oracle Meetup
Serverless Boston @ Oracle MeetupServerless Boston @ Oracle Meetup
Serverless Boston @ Oracle MeetupWayne Scarano
 
The Fn Project by Jesse Butler
 The Fn Project by Jesse Butler The Fn Project by Jesse Butler
The Fn Project by Jesse ButlerOracle Developers
 
ESB integration for node.js
ESB integration for node.js ESB integration for node.js
ESB integration for node.js SÎNICĂ Alboaie
 
Developing Microservices using Spring - Beginner's Guide
Developing Microservices using Spring - Beginner's GuideDeveloping Microservices using Spring - Beginner's Guide
Developing Microservices using Spring - Beginner's GuideMohanraj Thirumoorthy
 
Cassandra Lunch #88: Cadence
Cassandra Lunch #88: CadenceCassandra Lunch #88: Cadence
Cassandra Lunch #88: CadenceAnant Corporation
 
Azure functions: from a function to a whole application in 60 minutes
Azure functions: from a function to a whole application in 60 minutesAzure functions: from a function to a whole application in 60 minutes
Azure functions: from a function to a whole application in 60 minutesAlessandro Melchiori
 
SFO15-102:ODP Project Update
SFO15-102:ODP Project UpdateSFO15-102:ODP Project Update
SFO15-102:ODP Project UpdateLinaro
 
Free the Functions with Fn project!
Free the Functions with Fn project!Free the Functions with Fn project!
Free the Functions with Fn project!J On The Beach
 
Load testing in Zonky with Gatling
Load testing in Zonky with GatlingLoad testing in Zonky with Gatling
Load testing in Zonky with GatlingPetr Vlček
 
CON6423: Scalable JavaScript applications with Project Nashorn
CON6423: Scalable JavaScript applications with Project NashornCON6423: Scalable JavaScript applications with Project Nashorn
CON6423: Scalable JavaScript applications with Project NashornMichel Graciano
 
Structured Streaming in Spark
Structured Streaming in SparkStructured Streaming in Spark
Structured Streaming in SparkDigital Vidya
 
XStream: stream processing platform at facebook
XStream:  stream processing platform at facebookXStream:  stream processing platform at facebook
XStream: stream processing platform at facebookAniket Mokashi
 
Introduction to Postrges-XC
Introduction to Postrges-XCIntroduction to Postrges-XC
Introduction to Postrges-XCAshutosh Bapat
 

Semelhante a Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022 (20)

Openshift service broker and catalog ocp-meetup july 2018
Openshift service broker and catalog  ocp-meetup july 2018Openshift service broker and catalog  ocp-meetup july 2018
Openshift service broker and catalog ocp-meetup july 2018
 
Spark Workflow Management
Spark Workflow ManagementSpark Workflow Management
Spark Workflow Management
 
Your journey into the serverless world
Your journey into the serverless worldYour journey into the serverless world
Your journey into the serverless world
 
Plpgsql russia-pgconf
Plpgsql russia-pgconfPlpgsql russia-pgconf
Plpgsql russia-pgconf
 
Interactive workflow management using Azkaban
Interactive workflow management using AzkabanInteractive workflow management using Azkaban
Interactive workflow management using Azkaban
 
Activity feeds (and more) at mate1
Activity feeds (and more) at mate1Activity feeds (and more) at mate1
Activity feeds (and more) at mate1
 
PyCon HK 2018 - Heterogeneous job processing with Apache Kafka
PyCon HK 2018 - Heterogeneous job processing with Apache Kafka PyCon HK 2018 - Heterogeneous job processing with Apache Kafka
PyCon HK 2018 - Heterogeneous job processing with Apache Kafka
 
Serverless Boston @ Oracle Meetup
Serverless Boston @ Oracle MeetupServerless Boston @ Oracle Meetup
Serverless Boston @ Oracle Meetup
 
The Fn Project by Jesse Butler
 The Fn Project by Jesse Butler The Fn Project by Jesse Butler
The Fn Project by Jesse Butler
 
ESB integration for node.js
ESB integration for node.js ESB integration for node.js
ESB integration for node.js
 
Developing Microservices using Spring - Beginner's Guide
Developing Microservices using Spring - Beginner's GuideDeveloping Microservices using Spring - Beginner's Guide
Developing Microservices using Spring - Beginner's Guide
 
Cassandra Lunch #88: Cadence
Cassandra Lunch #88: CadenceCassandra Lunch #88: Cadence
Cassandra Lunch #88: Cadence
 
Azure functions: from a function to a whole application in 60 minutes
Azure functions: from a function to a whole application in 60 minutesAzure functions: from a function to a whole application in 60 minutes
Azure functions: from a function to a whole application in 60 minutes
 
SFO15-102:ODP Project Update
SFO15-102:ODP Project UpdateSFO15-102:ODP Project Update
SFO15-102:ODP Project Update
 
Free the Functions with Fn project!
Free the Functions with Fn project!Free the Functions with Fn project!
Free the Functions with Fn project!
 
Load testing in Zonky with Gatling
Load testing in Zonky with GatlingLoad testing in Zonky with Gatling
Load testing in Zonky with Gatling
 
CON6423: Scalable JavaScript applications with Project Nashorn
CON6423: Scalable JavaScript applications with Project NashornCON6423: Scalable JavaScript applications with Project Nashorn
CON6423: Scalable JavaScript applications with Project Nashorn
 
Structured Streaming in Spark
Structured Streaming in SparkStructured Streaming in Spark
Structured Streaming in Spark
 
XStream: stream processing platform at facebook
XStream:  stream processing platform at facebookXStream:  stream processing platform at facebook
XStream: stream processing platform at facebook
 
Introduction to Postrges-XC
Introduction to Postrges-XCIntroduction to Postrges-XC
Introduction to Postrges-XC
 

Mais de StreamNative

Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022StreamNative
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...StreamNative
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...StreamNative
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...StreamNative
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022StreamNative
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...StreamNative
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...StreamNative
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022StreamNative
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...StreamNative
 
Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022StreamNative
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...StreamNative
 
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022StreamNative
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022StreamNative
 
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022StreamNative
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022StreamNative
 
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022StreamNative
 
Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022StreamNative
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...StreamNative
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...StreamNative
 
Improvements Made in KoP 2.9.0 - Pulsar Summit Asia 2021
Improvements Made in KoP 2.9.0  - Pulsar Summit Asia 2021Improvements Made in KoP 2.9.0  - Pulsar Summit Asia 2021
Improvements Made in KoP 2.9.0 - Pulsar Summit Asia 2021StreamNative
 

Mais de StreamNative (20)

Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
 
Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
 
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022
 
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
 
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
 
Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
 
Improvements Made in KoP 2.9.0 - Pulsar Summit Asia 2021
Improvements Made in KoP 2.9.0  - Pulsar Summit Asia 2021Improvements Made in KoP 2.9.0  - Pulsar Summit Asia 2021
Improvements Made in KoP 2.9.0 - Pulsar Summit Asia 2021
 

Último

DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 

Último (20)

DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 

Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022

  • 1. Pulsar Summit San Francisco Hotel Nikko August 18 2022 Ecosystem Simplify Pulsar Functions Development with SQL Neng Lu Platform Engineering Lead • StreamNative
  • 2. Neng Lu is the platform engineering lead of compute at StreamNative. He drives the development of Pulsar Functions, Serverless Computing and ecosystem integration. He is also a committer of Apache Pulsar. Neng Lu Platform Engineering Lead StreamNative Rui Fu is a senior software engineer at StreamNative and a committer of Apache Pulsar. He actively contributes to Pulsar Functions, Function Mesh and Serverless Computing Rui Fu Senior Software Engineer StreamNative
  • 6. Pulsar Functions – Use Cases ● ETL(Extract-Transform-Load) Jobs ● Microservices ● Event Routing ● Real-time Aggregation
  • 7. ● Easy Operation ○ Fully Integrated with Pulsar ○ No Extra Setup Needed ● Easy Development ○ Intuitive APIs: ■ Java: public O process(I input, Context context) ■ Python: def process(self, input, context) ■ Golang: func HandleRequest(ctx context.Context, in []byte) error Pulsar Functions – Benefits
  • 9. Function Worker Recap ● Function Worker interleaves with Pulsar Broker ● Need to set up separate Function Worker cluster ● Function Worker relies on Pulsar Topics for scheduling ● Function Worker’s k8s runtime not truly cloud native
  • 11. Function Mesh – Recap ● Serverless framework to run Pulsar Functions in a cloud native way ● Consists of: ○ Set of CRDs for defining Pulsar Functions and Connectors ■ Function ■ Source ■ Sink ○ Operator that constantly reconciles the submitted CR ■ create sts, service, configmap, etc. ■ update according to user change ■ auto-scale if configured
  • 12. Function Mesh – Architecture
  • 13. Function Mesh – Summary ● Scheduling by Kubernetes not Function Worker ○ Simplicity ○ Reliability ○ Stability (both for function & brokers) ○ Extensibility (HPA, VPA, Scale-To-Zero etc) ● Compatible with Pulsar Admin Rest API ○ Seamless user experience
  • 15. Use Case 1 – Filtering/Routing ● Commonly used for different business purposes → duplicated code development ● Go through the whole Pulsar Functions dev life cycle ○ (Learn) ○ Develop ○ Package ○ Debug ○ Deploy
  • 16. Use Case 2 – Connector with Transformations ● Long pipeline: ○ Connector ○ Transformation Function (Often duplicated with minor diffs) ○ Intermediate topic ● Go through the Pulsar Functions life cycle TWICE: ○ Connector ■ Develop(optional) ■ … ○ Transformation Function ■ Develop ■ Package ■ …
  • 18. SELECT * FROM StreamNative
  • 19. SQL Abstraction – Why? ● Easiest to learn and apply ● Wide audience ● Safe & Controlled Operations ● Easy job life-cycle management ● Stream Processing Trend
  • 20. SQL Abstraction – What? ● IS ○ an simplified way to develop Pulsar Functions pipeline ● IS NOT ○ an interactive tool to run ad-hoc query
  • 21. SQL Abstraction – Components ● Gateway ● Runner ● Cli
  • 22. SQL Abstraction – Gateway ● Parser <-> Runner ● Rest API Server <-> Cli
  • 23. SQL Gateway – Parser ● Antlr4 grammar ● AST processor ● JSON representation SQL Statement Abstract Syntax Tree JSON Representation
  • 25. SQL Abstraction – Syntax ● Value Expression ○ Literal: Primitive value, like string, number, or boolean ○ Field: message payload field ○ KEY: message key ○ PROPERTIES[P_KEY]: message property ● WITH Item Definition ○ WITH MERGE KEYVALUE: Merge the fields of KeyValue schema ○ WITH UNWRAP KEY|VALUE: Extract Key or Value fields from KeyValue schema
  • 28. Parser – JSON Representation ● Intermediate Representation ○ Filter ○ Router ○ Projection ○ WITH Conditions
  • 29. SQL Abstraction – Runner ● An implementation of Pulsar Functions API ● Accept the JSON representation ● Generate Filtering/Routing processor during initialization ● Utilize `GenericObject` to handle different schemas ● Directly push result into target topic
  • 30. SQL Abstraction – Runner ● Processor ○ An interface for classes that implement data transformations ○ schema projections ○ data manipulations ○ data type conversions ● Chain Compiler ○ List<Processor> ○ Compiled from the SQL Context
  • 31. SQL Gateway – REST APIs Query Management /snsql/query POST /snsql/query/pause/$NAME GET /snsql/query/resume/$NAME GET /snsql/query/delete/$NAME GET /snsql/query/status/$NAME GET /snsql/query/stats/$NAME GET Gateway Information /snsql/info GET /snsql/healthcheck GET
  • 32. SQL Gateway – REST Server ● Quarkus Framework ○ easy to implement ○ cloud-native support ● Metadata Management ○ write into Pulsar topic ○ read with TableView API
  • 33. SQL Abstraction – CLI ● Terminal based tool ● Interact with the SQL gateway APIs ● Query management
  • 35. Demo
  • 36.
  • 37. Future Work ● Syntax support for Source/Sink ● Builtin system function support ● Aggregation Operation ● Join Operation
  • 38. Resources ● Pulsar Functions: https://pulsar.apache.org/docs/functions-overview/ ● Function Mesh: https://functionmesh.io/ ● Slack & Mailing List: ○ Apache Pulsar Slack: https://apache-pulsar.slack.com/ ○ StreamNative Community Slack: https://streamnativecommunity.slack.com/ ○ Apache Pulsar Mailing List: ■ users@pulsar.apache.org ■ dev@pulsar.apache.org
  • 39. Neng Lu Thank you! nlu@streamnative.io Pulsar Summit San Francisco Hotel Nikko August 18 2022 rfu@streamnative.io Rui Fu @nlu90 @freeznet rfu nlu