1
UDFs/UDAFs :
The Extensibility Framework
for KSQL
Hojjat Jafarpour
hojjat@confluent.io
@Hojjat
2
Agenda
● Intro to KSQL
● Functions in KSQL
● UDFs/UDAFs in KSQL
● Miscellaneous Considerations
6
Goal
● If you haven’t used KSQL yet...
○ Try it today!
● Extend KSQL with your custom computation through
○ UDF
○ UDAF
7
KSQL: the Streaming SQL Engine for Apache Kafka® from Confluent
● Enables stream processing with zero coding required
● The simplest way to process streams of data in real time
● Powered by Kafka: scalable, distributed, battle-tested
● All you need is Kafka: no complex deployments of bespoke systems for stream processing
8
What is it for?
Streaming ETL
● Kafka is popular for data pipelines.
● KSQL enables easy transformations of data within the pipe
CREATE STREAM vip_actions AS
SELECT userid, page, action
FROM clickstream c
LEFT JOIN users u ON c.userid = u.user_id
WHERE u.level = 'Platinum';
9
What is it for?
Anomaly Detection
● Identifying patterns or anomalies in real-time data, surfaced in milliseconds
CREATE TABLE possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY card_number
HAVING count(*) > 3;
10
What is it for?
Real Time Monitoring
● Log data monitoring, tracking and alerting
● Sensor / IoT data
CREATE TABLE error_counts AS
SELECT error_code, count(*)
FROM monitoring_stream
WINDOW TUMBLING (SIZE 1 MINUTE)
WHERE type = 'ERROR'
GROUP BY error_code;
13
How does it work?
● Streaming SQL to Kafka Streams Apps
● Continuously read from source topic(s), process, and write the results into sink topic
[Diagram: Source → Streaming SQL Statement → Sink]
14
Example
● An online store!
15
Example
● Stream of shipments events!
CREATE STREAM shipments (
ID VARCHAR,
ORDER_ID VARCHAR,
STREET VARCHAR,
CITY VARCHAR,
STATE VARCHAR,
ZIPCODE VARCHAR,
EMAIL VARCHAR,
PHONE VARCHAR
) WITH (KAFKA_TOPIC='ShipmentsTopic', VALUE_FORMAT='JSON');
17
Example
● Sample continuous queries:
○ All shipments to CA:
CREATE STREAM ca_shipments AS
SELECT *
FROM shipments
WHERE STATE = 'CA';
○ Daily shipments count for each zipcode:
CREATE TABLE zip_daily_shipment_count AS
SELECT ZIPCODE, COUNT(*)
FROM shipments
WINDOW TUMBLING (SIZE 1 DAY)
GROUP BY ZIPCODE;
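One way to picture the daily count above is as bucketing each record's timestamp into fixed, non-overlapping windows and keeping one counter per (zipcode, window) pair. The plain-Java sketch below is only an illustration under stated assumptions: it is not how KSQL implements windowing, the class and method names are hypothetical, timestamps are assumed to be non-negative epoch milliseconds, and windows are assumed to be aligned to the epoch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of tumbling-window counting; not KSQL code.
final class TumblingCountSketch {
    static final long DAY_MS = 24L * 60 * 60 * 1000;

    // Start of the tumbling window containing the timestamp
    // (assumes non-negative timestamps and epoch-aligned windows).
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    // Counts records per "zipcode@windowStart" key.
    static Map<String, Long> countPerZipAndDay(String[] zipcodes, long[] timestampsMs) {
        Map<String, Long> counts = new HashMap<>();
        for (int i = 0; i < zipcodes.length; i++) {
            String key = zipcodes[i] + "@" + windowStart(timestampsMs[i], DAY_MS);
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] zips = {"94301", "94301", "94301"};
        long[] ts = {1_000L, 2_000L, DAY_MS + 5L};
        // Two records land in the first day, one in the second.
        System.out.println(countPerZipAndDay(zips, ts));
    }
}
```

Because tumbling windows do not overlap, each record contributes to exactly one counter.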
18
Functions
● KSQL built-in functions to be used in expressions
19
Functions
● Scalar Functions (Stateless)
20
Functions
● Scalar Functions (Stateless)
[Diagram: the values foo, hi, bar pass through LEN(c) one at a time; foo yields 3 and hi yields 2]
25
Functions
● Scalar Functions (Stateless)
○ Substring
○ Trim
○ Concat
○ Abs
○ Floor
○ ...
● Aggregate Functions (Stateful)
26
Functions
● Aggregate Functions (Stateful)
[Diagram: the values foo, hi, bar pass through COUNT(c) one at a time; the state table entry for key K1 goes from 0 to 1 to 2 as foo and hi are consumed]
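The difference between the two kinds of function can be sketched in plain Java. This is a hypothetical illustration, not KSQL code: the class and method names are invented, and the running count is assumed to be kept in an in-memory map keyed by the grouping key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of stateless vs. stateful functions; not KSQL code.
final class StatelessVsStatefulSketch {
    // Stateless: the result depends only on the current value.
    static int len(String value) {
        return value.length();
    }

    // Stateful: one running count per key, kept between records.
    private final Map<String, Long> countsByKey = new HashMap<>();

    long count(String key) {
        return countsByKey.merge(key, 1L, Long::sum);
    }

    public static void main(String[] args) {
        StatelessVsStatefulSketch sketch = new StatelessVsStatefulSketch();
        for (String value : new String[] {"foo", "hi", "bar"}) {
            System.out.println(value + " -> LEN=" + len(value)
                + ", COUNT=" + sketch.count("K1"));
        }
    }
}
```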
31
Functions
● Scalar Functions (Stateless)
○ Substring
○ Trim
○ Concat
○ Abs
○ Floor
○ ...
● Aggregate Functions (Stateful)
○ Count
○ Sum
○ Min
○ Max
○ ...
What if I need a
function that is not
one of the KSQL
built-in functions?
32
Functions
● User Defined Functions
(UDFs)
○ Stateless
● User Defined Aggregate
Functions (UDAFs)
○ Stateful
35
UDFs/UDAFs
● How?
a. Write your UDF or UDAF class in Java.
b. Deploy the JAR file to the KSQL extensions directory.
c. Use your function like any other KSQL function in your
queries.
40
Write a UDF for KSQL
1. Create a project with dependency on ksql-udf module
Gradle:
compile 'io.confluent.ksql:ksql-udf:5.3.1'
Maven POM:
<repositories>
  <repository>
    <id>confluent</id>
    <url>http://packages.confluent.io/maven/</url>
  </repository>
</repositories>
<dependencies>
  <dependency>
    <groupId>io.confluent.ksql</groupId>
    <artifactId>ksql-udf</artifactId>
    <version>5.3.1</version>
  </dependency>
</dependencies>
42
Write a UDF for KSQL
1. Create a project with dependency on ksql-udf
module
2. Create a class that is annotated with
@UdfDescription.
UDF to validate email address
format
43
Write a UDF for KSQL
1. Create a project with dependency on ksql-udf
module
2. Create a class that is annotated with
@UdfDescription.
import io.confluent.ksql.function.udf.Udf;
import io.confluent.ksql.function.udf.UdfDescription;
@UdfDescription(
name = "validateEmail",
description = "Validates email address format")
public class MyUDFs {
}
47
Write a UDF for KSQL
1. Create a project with dependency on ksql-udf
module
2. Create a class that is annotated with
@UdfDescription.
3. Implement UDFs as public methods with @Udf
annotation.
a. Use @UdfParameter annotation to provide
more info on UDF parameters (optional)
48
Write a UDF for KSQL
● Email validator UDF.
import io.confluent.ksql.function.udf.Udf;
import io.confluent.ksql.function.udf.UdfDescription;
import java.util.regex.Pattern;
@UdfDescription(
    name = "validateEmail",
    description = "Validates emails")
public class MyUDFs {
    @Udf(description = "Validates email format.")
    public boolean validateEmail(String email) {
        final String EMAIL_REGEX = "^[\\w-\\+]+(\\.[\\w]+)*@[\\w-]+(\\.[\\w]+)*(\\.[a-z]{2,})$";
        return Pattern.compile(EMAIL_REGEX, Pattern.CASE_INSENSITIVE).matcher(email).matches();
    }
}
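The validation logic can be tried outside KSQL before it is packaged as a UDF. The sketch below is a plain-Java variant with no ksql-udf dependency; the class name is hypothetical, and two choices are assumptions of this sketch rather than part of the slides: the pattern is compiled once instead of on every call, and a null input is treated as invalid.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone version of the email check; not the deployed UDF.
final class EmailValidationSketch {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern EMAIL = Pattern.compile(
        "^[\\w-\\+]+(\\.[\\w]+)*@[\\w-]+(\\.[\\w]+)*(\\.[a-z]{2,})$",
        Pattern.CASE_INSENSITIVE);

    // Returns false for null instead of throwing (an assumption of this sketch).
    static boolean validateEmail(String email) {
        return email != null && EMAIL.matcher(email).matches();
    }

    public static void main(String[] args) {
        System.out.println(validateEmail("alice@example.com"));
        System.out.println(validateEmail("not-an-email"));
        System.out.println(validateEmail(null));
    }
}
```

Handling null explicitly matters in a stream, where a missing EMAIL field would otherwise raise a NullPointerException inside the function.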
50
UDFs/UDAFs
● How?
a. Write your UDF or UDAF class in Java.
b. Deploy the JAR file to the KSQL extensions directory.
c. Use your function like any other KSQL function in your
queries.
53
Deploy UDFs in KSQL
● Build an uber-jar with all the dependencies
● Copy the uber-jar into the extension directory in each
KSQL server
○ Default is $KSQL_HOME/ext
○ Can be configured by ksql.extension.dir
property for KSQL server
● Restart every KSQL server
54
UDFs/UDAFs
● How?
a. Write your UDF or UDAF class in Java.
b. Deploy the JAR file to the KSQL extensions directory.
c. Use your function like any other KSQL function in your
queries.
55
Use UDFs in KSQL Queries
● All shipments with invalid email address
CREATE STREAM invalid_shipments AS
SELECT *
FROM shipments
WHERE validateEmail(email) = false;
57
UDFs in KSQL
● You can have much more complex UDFs
Deep Learning UDF for KSQL
for Streaming Anomaly
Detection of MQTT IoT Sensor
Data
https://github.com/kaiwaehner/ksql-udf-deep-learning-mqtt-iot
60
Write a UDAF for KSQL
1. Create a project with dependency on ksql-udf module
2. Create a class that is annotated with @UdafDescription.
UDAF to collect all order ids in a set
for a shipment
61
Write a UDAF for KSQL
package testudaf;
import com.google.common.collect.Lists;
import io.confluent.ksql.function.udaf.Udaf;
import io.confluent.ksql.function.udaf.UdafDescription;
import io.confluent.ksql.function.udaf.UdafFactory;
import java.util.List;
@UdafDescription(
name = "collectOrderSet",
description = "Collect all the orders for a shipment.")
public final class CollectOrdersSet {
...
}
64
Write a UDAF for KSQL
1. Create a project with dependency on ksql-udf module
2. Create a class that is annotated with @UdafDescription.
3. Implement UDAF Factories as public and static methods with
@UdafFactory annotation.
a. The factory methods should return Udaf or TableUdaf
b. Implement the UDAF logic in the returned Udaf or
TableUdaf
65
Write a UDAF for KSQL
package testudaf;
import com.google.common.collect.Lists;
import io.confluent.ksql.function.udaf.Udaf;
import io.confluent.ksql.function.udaf.UdafDescription;
import io.confluent.ksql.function.udaf.UdafFactory;
import java.util.List;
// UDAF annotation
@UdafDescription(
    name = "collectOrderSet",
    description = "Collect all the orders for a shipment.")
public final class CollectOrdersSet {
    private static final int LIMIT = 1000;
    // UDAF factory
    @UdafFactory(description = "Collect all the orders for a shipment.")
    public static Udaf<String, List<String>> orderSetCollector() {
        return new Udaf<String, List<String>>() {
            // Implement Udaf methods
            …
        };
    }
}
66
Write a UDAF for KSQL
return new Udaf<String, List<String>>() {
@Override
public List<String> initialize() {...}
@Override
public List<String> aggregate(final String thisValue, final List<String> aggregate) { ... }
@Override
public List<String> merge(final List<String> aggOne, final List<String> aggTwo) {...}
};
67
Write a UDAF for KSQL
return new Udaf<String, List<String>>() {
@Override
public List<String> initialize() {...}
@Override
public List<String> aggregate(final String thisValue, final List<String> aggregate) { ... }
@Override
public List<String> merge(final List<String> aggOne, final List<String> aggTwo) {...}
};
// The initializer for the Aggregation
@Override
public List<String> initialize() {
return Lists.newArrayList();
}
68
Write a UDAF for KSQL
return new Udaf<String, List<String>>() {
@Override
public List<String> initialize() {...}
@Override
public List<String> aggregate(final String thisValue, final List<String> aggregate) { ... }
@Override
public List<String> merge(final List<String> aggOne, final List<String> aggTwo) {...}
};
// Aggregates the current value into the existing aggregate
@Override
public List<String> aggregate(final String thisValue, final List<String> aggregate) {
if (aggregate.size() < LIMIT && !aggregate.contains(thisValue)) {
aggregate.add(thisValue);
}
return aggregate;
}
69
Write a UDAF for KSQL
[Diagram: the values foo, hi, bar pass through collectOrderSet(c) one at a time; the state for key K1 grows from {foo} to {foo, hi} to {foo, hi, bar}]
Write a UDAF for KSQL
return new Udaf<String, List<String>>() {
@Override
public List<String> initialize() {...}
@Override
public List<String> aggregate(final String thisValue, final List<String> aggregate) { ... }
@Override
public List<String> merge(final List<String> aggOne, final List<String> aggTwo) {...}
};
// Merge two aggregates when merging session windows
@Override
public List<String> merge(final List<String> aggOne, final List<String> aggTwo) {
for (final String thisEntry : aggTwo) {
if (aggOne.size() == LIMIT) { break; }
if (!aggOne.contains(thisEntry)) {
aggOne.add(thisEntry);
}
}
return aggOne;
}
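The three methods can be exercised together without a KSQL server. The following plain-Java sketch mirrors the initialize, aggregate, and merge logic as static methods; the class name and the demo() helper are hypothetical, and the scenario (two sets merged, then one more value added) is an illustrative assumption rather than the exact sequence KSQL would execute.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone version of the UDAF logic; not the deployed UDAF.
final class CollectSetSketch {
    static final int LIMIT = 1000;

    // Empty aggregate to start from.
    static List<String> initialize() {
        return new ArrayList<>();
    }

    // Adds the value unless it is already present or the limit is reached.
    static List<String> aggregate(String value, List<String> aggregate) {
        if (aggregate.size() < LIMIT && !aggregate.contains(value)) {
            aggregate.add(value);
        }
        return aggregate;
    }

    // Folds aggTwo into aggOne, skipping duplicates and respecting the limit.
    static List<String> merge(List<String> aggOne, List<String> aggTwo) {
        for (String entry : aggTwo) {
            if (aggOne.size() == LIMIT) {
                break;
            }
            if (!aggOne.contains(entry)) {
                aggOne.add(entry);
            }
        }
        return aggOne;
    }

    // Builds {foo, hi, bar} and {bar, tab}, merges them, then adds hello.
    static List<String> demo() {
        List<String> first = initialize();
        for (String v : new String[] {"foo", "hi", "bar"}) {
            aggregate(v, first);
        }
        List<String> second = initialize();
        for (String v : new String[] {"bar", "tab"}) {
            aggregate(v, second);
        }
        return aggregate("hello", merge(first, second));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [foo, hi, bar, tab, hello]
    }
}
```

Note that contains() on a list is linear in the aggregate size, which is one reason a size cap such as LIMIT is useful here.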
73
Write a UDAF for KSQL
[Diagram: collectOrderSet(c) with session windows for key K1. Records foo, hi, bar fall into session window W1, giving K1(W1) = {foo, hi, bar}; after a session inactivity gap, records bar, tab fall into W2, giving K1(W2) = {bar, tab}. When a further record hello arrives, the two sessions are merged into a single window W3, giving K1(W3) = {foo, hi, bar, tab, hello}.]
78
Use UDAFs in KSQL Queries
● Set of orders for each shipment per day
CREATE TABLE shipment_orders AS
SELECT id, collectOrderSet(order_id)
FROM shipments
WINDOW tumbling (SIZE 24 HOURS)
GROUP BY id;
81
Miscellaneous Considerations
● Security
○ Blacklisting classes
■ Optionally blacklist classes and packages such that
they can't be used from a UD(A)F.
■ resource-blacklist.txt in the extension
directory
83
Miscellaneous Considerations
● Security
○ Blacklisting classes
○ SecurityManager
■ Blocks attempts by any UD(A)Fs to fork processes
from the KSQL server.
■ Prevents them from calling System.exit(..)
86
Miscellaneous Considerations
● Security
○ Blacklisting classes
○ SecurityManager
● Metric Collection
○ Set the config ksql.udf.collect.metrics to true.
○ Collected Metrics:
■ Average/Max time for an invocation
■ Total number of invocations
■ The average number of invocations per second
89
Miscellaneous Considerations
● Security
○ Blacklisting classes
○ SecurityManager
● Metric Collection
● Configurable UDF
○ UDF access to KSQL server configs
○ Implement org.apache.kafka.common.Configurable
90
Miscellaneous Considerations
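import io.confluent.ksql.function.udf.UdfDescription;
import java.util.Map;
import org.apache.kafka.common.Configurable;
@UdfDescription(name = "MyFirstUDF", description = "multiplies 2 numbers")
public class SomeConfigurableUdf implements Configurable {
    private String someSetting = "a.default.value";
    @Override
    public void configure(final Map<String, ?> map) {
        // Only keys under ksql.functions.myfirstudf. or ksql.functions._global_. are passed in
        this.someSetting = (String) map.get("ksql.functions.myfirstudf.some.setting");
    }
    ...
}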
91
Miscellaneous Considerations
● Security
○ Blacklisting classes
○ SecurityManager
● Metric Collection
● Configurable UDF
○ Only configs whose name is prefixed with
ksql.functions.<lowercase-udfname>. or
ksql.functions._global_. are accessible
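The prefix rule can be pictured as a filter applied to the server configuration before it is handed to the UDF. The sketch below is a plain-Java illustration only: it is not KSQL's implementation, the class and method names are hypothetical, and every config key in main other than ksql.functions.myfirstudf.some.setting is made up for the example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of the config-visibility rule; not KSQL code.
final class UdfConfigFilterSketch {
    static final String PREFIX = "ksql.functions.";
    static final String GLOBAL_PREFIX = PREFIX + "_global_.";

    // Keeps only entries under the UDF's own prefix or the global prefix.
    static Map<String, Object> visibleConfigs(String udfName, Map<String, ?> serverConfigs) {
        String ownPrefix = PREFIX + udfName.toLowerCase(Locale.ROOT) + ".";
        Map<String, Object> visible = new HashMap<>();
        for (Map.Entry<String, ?> entry : serverConfigs.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(ownPrefix) || key.startsWith(GLOBAL_PREFIX)) {
                visible.put(key, entry.getValue());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        Map<String, Object> server = new HashMap<>();
        server.put("ksql.functions.myfirstudf.some.setting", "x");
        server.put("ksql.functions._global_.example", "shared");  // made-up key
        server.put("ksql.functions.otherudf.example", "hidden");  // made-up key
        server.put("some.unrelated.setting", "hidden");           // made-up key
        // Only the first two keys are visible to MyFirstUDF.
        System.out.println(visibleConfigs("MyFirstUDF", server).keySet());
    }
}
```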
92
Shout out to Mitch Seymour
● Luna: a place for developers to publish their own UDFs / UDAFs that may not otherwise be a good fit for contributing to the KSQL codebase itself
https://magicalpipelines.com/luna/
93
Wrapping up
● Introduction to KSQL
● KSQL Built-in Functions
● Extending KSQL with Custom Functions
○ UDFs (Stateless)
○ UDAFs (Stateful)
● Resources
○ KSQL Docs: https://docs.confluent.io/current/ksql/docs/developer-guide/udf.html#
○ Confluent Examples: https://github.com/confluentinc/demo-scene/tree/master/ksql-udf-advanced-example
○ Luna: https://magicalpipelines.com/luna/
94
UDFs/UDAFs :
The Extensibility Framework
for KSQL
Hojjat Jafarpour
hojjat@confluent.io
@Hojjat

Singpore Oracle Sessions III - What is truly useful in Oracle Database 12c fo...Singpore Oracle Sessions III - What is truly useful in Oracle Database 12c fo...
Singpore Oracle Sessions III - What is truly useful in Oracle Database 12c fo...
 
Introduction to SQLStreamBuilder: Rich Streaming SQL Interface for Creating a...
Introduction to SQLStreamBuilder: Rich Streaming SQL Interface for Creating a...Introduction to SQLStreamBuilder: Rich Streaming SQL Interface for Creating a...
Introduction to SQLStreamBuilder: Rich Streaming SQL Interface for Creating a...
 
A Step to programming with Apache Spark
A Step to programming with Apache SparkA Step to programming with Apache Spark
A Step to programming with Apache Spark
 
Linq to sql
Linq to sqlLinq to sql
Linq to sql
 

Mais de confluent

Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flinkconfluent
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsconfluent
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flinkconfluent
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...confluent
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluentconfluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkconfluent
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloudconfluent
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Diveconfluent
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluentconfluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Meshconfluent
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservicesconfluent
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3confluent
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernizationconfluent
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataconfluent
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2confluent
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023confluent
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesisconfluent
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023confluent
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streamsconfluent
 

Mais de confluent (20)

Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flink
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insights
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flink
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalk
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Dive
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Mesh
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservices
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernization
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time data
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesis
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streams
 

Último

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafarpour, Confluent) Kafka Summit SF 2019

  • 1. 1 UDFs/UDAFs : The Extensibility Framework for KSQL Hojjat Jafarpour hojjat@confluent.io @Hojjat
  • 2. 2 Agenda ● Intro to KSQL ● Functions in KSQL ● UDFs/UDAFs in KSQL ● Miscellaneous Considerations
  • 3. 3 Goal ● If you haven’t used KSQL yet...
  • 5. 5 Goal ● If you haven’t used KSQL yet... ○ Try it today!
  • 6. 6 Goal ● If you haven’t used KSQL yet... ○ Try it today! ● Extend KSQL with your custom computation through ○ UDF ○ UDAF
  • 7. 7 KSQL: the Streaming SQL Engine for Apache Kafka® from Confluent ● Enables stream processing with zero coding required ● The simplest way to process streams of data in real time ● Powered by Kafka: scalable, distributed, battle-tested ● All you need is Kafka: no complex deployments of bespoke systems for stream processing
  • 8. 8 What is it for? Streaming ETL ● Kafka is popular for data pipelines. ● KSQL enables easy transformations of data within the pipe CREATE STREAM vip_actions AS SELECT userid, page, action FROM clickstream c LEFT JOIN users u ON c.userid = u.user_id WHERE u.level = 'Platinum';
  • 9. 9 What is it for? Anomaly Detection ● Identifying patterns or anomalies in real-time data, surfaced in milliseconds CREATE TABLE possible_fraud AS SELECT card_number, count(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 SECONDS) GROUP BY card_number HAVING count(*) > 3;
  • 10. 10 What is it for? Real Time Monitoring ● Log data monitoring, tracking and alerting ● Sensor / IoT data CREATE TABLE error_counts AS SELECT error_code, count(*) FROM monitoring_stream WINDOW TUMBLING (SIZE 1 MINUTE) WHERE type = 'ERROR' GROUP BY error_code;
  • 11. 11 How does it work? ● Streaming SQL to Kafka Streams Apps Streaming SQL Statement
  • 13. 13 How does it work? ● Streaming SQL to Kafka Streams Apps ● Continuously read from source topic(s), process, and write the results into sink topic Streaming SQL Statement Source Sink
  • 15. 15 Example ● Stream of shipment events: CREATE STREAM shipments ( ID VARCHAR, ORDER_ID VARCHAR, STREET VARCHAR, CITY VARCHAR, STATE VARCHAR, ZIPCODE VARCHAR, EMAIL VARCHAR, PHONE VARCHAR ) WITH (KAFKA_TOPIC='ShipmentsTopic', VALUE_FORMAT='JSON');
  • 16. 16 Example ● Sample continuous queries: ○ All shipments to CA: CREATE STREAM ca_shipments AS SELECT * FROM shipments WHERE STATE = 'CA';
  • 17. 17 Example ● Sample continuous queries: ○ All shipments to CA: CREATE STREAM ca_shipments AS SELECT * FROM shipments WHERE STATE = 'CA'; ○ Daily shipment count for each zipcode: CREATE TABLE zip_daily_shipment_count AS SELECT ZIPCODE, COUNT(*) FROM shipments WINDOW TUMBLING (SIZE 1 DAY) GROUP BY ZIPCODE;
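The daily count in slide 17 relies on tumbling windows: fixed-size, non-overlapping time buckets. The plain-Java sketch below is not how KSQL implements windowing; it only illustrates the idea by aligning each event timestamp to the start of its window and counting per (zipcode, window) pair. The class and method names are hypothetical, and it assumes non-negative epoch-millisecond timestamps.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; KSQL and Kafka Streams manage windows internally.
final class TumblingCounter {
    private final long windowSizeMs;
    // Key is "zipcode@windowStart"; value is the running count for that bucket.
    private final Map<String, Long> counts = new HashMap<>();

    TumblingCounter(long windowSizeMs) {
        this.windowSizeMs = windowSizeMs;
    }

    // Start of the tumbling window that contains the given timestamp.
    long windowStart(long timestampMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    // Records one shipment event and returns the updated count for its bucket.
    long add(String zipcode, long timestampMs) {
        String key = zipcode + "@" + windowStart(timestampMs);
        return counts.merge(key, 1L, Long::sum);
    }
}
```

With a window size of 86,400,000 ms (one day), two events for the same zipcode on the same day share a bucket, while an event on the next day starts a new count.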
  • 18. 18 Functions ● KSQL built-in functions to be used in expressions
  • 20. 20 Functions ● Scalar Functions (Stateless) LEN(c)
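  • 21. 21 Functions ● Scalar Functions (Stateless) [Diagram: input values foo, hi, bar waiting to enter LEN(c)]
  • 22. 22 Functions ● Scalar Functions (Stateless) [Diagram: LEN(c) has consumed foo and emitted 3; hi and bar remain]
  • 23. 23 Functions ● Scalar Functions (Stateless) [Diagram: LEN(c) has consumed foo and hi and emitted 3 and 2; bar remains]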
  • 24. 24 Functions ● Scalar Functions (Stateless) ○ Substring ○ Trim ○ Concat ○ Abs ○ Floor ○ ...
  • 25. 25 Functions ● Scalar Functions (Stateless) ○ Substring ○ Trim ○ Concat ○ Abs ○ Floor ○ ... ● Aggregate Functions (Stateful)
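  • 26. 26 Functions ● Aggregate Functions (Stateful) [Diagram: input values foo, hi, bar waiting to enter COUNT(c); state table: Key K1, Value 0]
  • 27. 27 Functions ● Aggregate Functions (Stateful) [Diagram: COUNT(c) has consumed foo; hi and bar remain; state table: Key K1, Value 1]
  • 28. 28 Functions ● Aggregate Functions (Stateful) [Diagram: COUNT(c) continues consuming the input; state table: Key K1, Value 2]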
  • 29. 29 Functions ● Aggregate Functions (Stateful) ○ Count ○ Sum ○ Min ○ Max ○ ...
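The difference between the two kinds of function can be sketched in plain Java. This is an illustration rather than KSQL code: a scalar function such as LEN depends only on its current input, while an aggregate such as COUNT must keep state per grouping key between events. The class name and the in-memory map are assumptions made for the example; KSQL keeps aggregate state in Kafka Streams state stores.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of stateless vs. stateful functions.
final class FunctionKinds {
    // Stateless: the result depends only on the current input value.
    static int len(String value) {
        return value.length();
    }

    // Stateful: one running count is kept per grouping key.
    private final Map<String, Long> countsByKey = new HashMap<>();

    // Counts one more event for the key and returns the updated total.
    long count(String key) {
        return countsByKey.merge(key, 1L, Long::sum);
    }
}
```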
  • 30. 30 Functions ● Scalar Functions (Stateless) ○ Substring ○ Trim ○ Concat ○ Abs ○ Floor ○ ... ● Aggregate Functions (Stateful) ○ Count ○ Sum ○ Min ○ Max ○ ...
  • 31. 31 Functions ● Scalar Functions (Stateless) ○ Substring ○ Trim ○ Concat ○ Abs ○ Floor ○ ... ● Aggregate Functions (Stateful) ○ Count ○ Sum ○ Min ○ Max ○ ... What if I need a function that is not one of the KSQL built-in functions?
  • 32. 32 Functions ● User Defined Functions (UDFs) ○ Stateless ● User Defined Aggregate Functions (UDAFs) ○ Stateful
  • 33. 33 UDFs/UDAFs ● How? a. Write your UDF or UDAF class in Java.
  • 34. 34 UDFs/UDAFs ● How? a. Write your UDF or UDAF class in Java. b. Deploy the JAR file to the KSQL extensions directory.
  • 35. 35 UDFs/UDAFs ● How? a. Write your UDF or UDAF class in Java. b. Deploy the JAR file to the KSQL extensions directory. c. Use your function like any other KSQL function in your queries.
  • 38. 38 Write a UDF for KSQL 1. Create a project with dependency on ksql-udf module
  • 39. 39 Write a UDF for KSQL 1. Create a project with dependency on ksql-udf module Gradle: compile 'io.confluent.ksql:ksql-udf:5.3.1'
  • 40. 40 Write a UDF for KSQL 1. Create a project with dependency on ksql-udf module Gradle: compile 'io.confluent.ksql:ksql-udf:5.3.1' Maven POM: <repositories> <repository> <id>confluent</id> <url>http://packages.confluent.io/maven/</url> </repository> </repositories> <dependencies> <dependency> <groupId>io.confluent.ksql</groupId> <artifactId>ksql-udf</artifactId> <version>5.3.1</version> </dependency> </dependencies>
  • 41. 41 Write a UDF for KSQL 1. Create a project with dependency on ksql-udf module 2. Create a class that is annotated with @UdfDescription.
  • 42. 42 Write a UDF for KSQL 1. Create a project with dependency on ksql-udf module 2. Create a class that is annotated with @UdfDescription. UDF to validate email address format
  • 43. 43 Write a UDF for KSQL 1. Create a project with dependency on ksql-udf module 2. Create a class that is annotated with @UdfDescription. import io.confluent.ksql.function.udf.Udf; import io.confluent.ksql.function.udf.UdfDescription; @UdfDescription( name = "validateEmail", description = "Validates email address format") public class MyUDFs { }
  • 44. 44 Write a UDF for KSQL 1. Create a project with dependency on ksql-udf module 2. Create a class that is annotated with @UdfDescription. import io.confluent.ksql.function.udf.Udf; import io.confluent.ksql.function.udf.UdfDescription; @UdfDescription( name = "validateEmail", description = "Validates email address format") public class MyUDFs { }
  • 45. 45 Write a UDF for KSQL 1. Create a project with dependency on ksql-udf module 2. Create a class that is annotated with @UdfDescription. import io.confluent.ksql.function.udf.Udf; import io.confluent.ksql.function.udf.UdfDescription; @UdfDescription( name = "validateEmail", description = "Validates email address format") public class MyUDFs { }
  • 46. 46 Write a UDF for KSQL 1. Create a project with dependency on ksql-udf module 2. Create a class that is annotated with @UdfDescription.
  • 47. 47 Write a UDF for KSQL 1. Create a project with dependency on ksql-udf module 2. Create a class that is annotated with @UdfDescription. 3. Implement UDFs as public methods with @Udf annotation. a. Use @UdfParameter annotation to provide more info on UDF parameters (optional)
  • 48. 48 Write a UDF for KSQL ● Email validator UDF. import io.confluent.ksql.function.udf.Udf; import io.confluent.ksql.function.udf.UdfDescription; import java.util.regex.Pattern; @UdfDescription( name = "validateEmail", description = "Validates emails") public class MyUDFs { @Udf(description = "Validates email format.") public boolean validateEmail(String email) { final String EMAIL_REGEX = "^[\\w+-]+(\\.[\\w]+)*@[\\w-]+(\\.[\\w]+)*(\\.[a-z]{2,})$"; return Pattern.compile(EMAIL_REGEX, Pattern.CASE_INSENSITIVE).matcher(email).matches(); } }
  • 49. 49 Write a UDF for KSQL ● Email validator UDF. import io.confluent.ksql.function.udf.Udf; import io.confluent.ksql.function.udf.UdfDescription; import java.util.regex.Pattern; @UdfDescription( name = "validateEmail", description = "Validates phone numbers and emails") public class MyUDFs { @Udf(description = "Validates email format.") public boolean validateEmail(String email) { final String EMAIL_REGEX = "^[\\w+-]+(\\.[\\w]+)*@[\\w-]+(\\.[\\w]+)*(\\.[a-z]{2,})$"; return Pattern.compile(EMAIL_REGEX, Pattern.CASE_INSENSITIVE).matcher(email).matches(); } }
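The validator on the slide compiles its pattern on every call and would throw a NullPointerException if the email value were null. As a minimal sketch, assuming the EMAIL column can be NULL for some records, the matching logic could be hardened as below. The class and method names are hypothetical, and the KSQL annotations are left out so the snippet compiles without the ksql-udf dependency; in a real UDF the method would sit inside the class annotated with @UdfDescription.

```java
import java.util.regex.Pattern;

// Hypothetical helper; holds only the matching logic, without KSQL annotations.
final class EmailCheck {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern EMAIL = Pattern.compile(
        "^[\\w+-]+(\\.\\w+)*@[\\w-]+(\\.\\w+)*(\\.[a-z]{2,})$",
        Pattern.CASE_INSENSITIVE);

    static boolean isValidEmail(String email) {
        // Treat a missing value as invalid instead of throwing NullPointerException.
        return email != null && EMAIL.matcher(email).matches();
    }
}
```

Returning false for null is a design choice made for this sketch; a UDF could equally return a nullable Boolean so that missing values stay distinguishable from malformed ones.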
  • 50. 50 UDFs/UDAFs ● How? a. Write your UDF or UDAF class in Java. b. Deploy the JAR file to the KSQL extensions directory. c. Use your function like any other KSQL function in your queries.
  • 51. 51 Deploy UDFs in KSQL ● Build an uber-jar with all the dependencies
  • 52. 52 Deploy UDFs in KSQL ● Build an uber-jar with all the dependencies ● Copy the uber-jar into the extension directory in each KSQL server ○ Default is $KSQL_HOME/ext ○ Can be configured by ksql.extension.dir property for KSQL server
  • 53. 53 Deploy UDFs in KSQL ● Build an uber-jar with all the dependencies ● Copy the uber-jar into the extension directory in each KSQL server ○ Default is $KSQL_HOME/ext ○ Can be configured by ksql.extension.dir property for KSQL server ● Restart every KSQL server
  • 54. 54 UDFs/UDAFs ● How? a. Write your UDF or UDAF class in Java. b. Deploy the JAR file to the KSQL extensions directory. c. Use your function like any other KSQL function in your queries.
  • 55. 55 Use UDFs in KSQL Queries ● All shipments with invalid email addresses CREATE STREAM invalid_shipments AS SELECT * FROM shipments WHERE validateEmail(email) = false;
  • 56. 56 UDFs in KSQL ● You can have much more complex UDFs
  • 57. 57 UDFs in KSQL ● You can have much more complex UDFs Deep Learning UDF for KSQL for Streaming Anomaly Detection of MQTT IoT Sensor Data https://github.com/kaiwaehner/ksql-udf-deep-learning-mqtt-iot
  • 58. 58 Write a UDAF for KSQL 1. Create a project with dependency on ksql-udf module
  • 59. 59 Write a UDAF for KSQL 1. Create a project with dependency on ksql-udf module 2. Create a class that is annotated with @UdafDescription.
  • 60. 60 Write a UDAF for KSQL 1. Create a project with dependency on ksql-udf module 2. Create a class that is annotated with @UdafDescription. UDAF to collect all order ids in a set for a shipment
  • 61. 61 Write a UDAF for KSQL package testudaf; import com.google.common.collect.Lists; import io.confluent.ksql.function.udaf.Udaf; import io.confluent.ksql.function.udaf.UdafDescription; import io.confluent.ksql.function.udaf.UdafFactory; import java.util.List; @UdafDescription( name = "collectOrderSet", description = "Collect all the orders for a shipment.") public final class CollectOrdersSet { ... }
  • 62. 62 Write a UDAF for KSQL package testudaf; import com.google.common.collect.Lists; import io.confluent.ksql.function.udaf.Udaf; import io.confluent.ksql.function.udaf.UdafDescription; import io.confluent.ksql.function.udaf.UdafFactory; import java.util.List; @UdafDescription( name = "collectOrderSet", description = "Collect all the orders for a shipment.") public final class CollectOrdersSet { ... }
  • 63. 63 Write a UDAF for KSQL package testudaf; import com.google.common.collect.Lists; import io.confluent.ksql.function.udaf.Udaf; import io.confluent.ksql.function.udaf.UdafDescription; import io.confluent.ksql.function.udaf.UdafFactory; import java.util.List; @UdafDescription( name = "collectOrderSet", description = "Collect all the orders for a shipment.") public final class CollectOrdersSet { ... }
  • 64. 64 Write a UDAF for KSQL 1. Create a project with dependency on ksql-udf module 2. Create a class that is annotated with @UdafDescription. 3. Implement UDAF factories as public static methods annotated with @UdafFactory. a. The factory methods should return Udaf or TableUdaf b. Implement the UDAF logic in the returned Udaf or TableUdaf
  • 65. 65 Write a UDAF for KSQL package testudaf; import com.google.common.collect.Lists; import io.confluent.ksql.function.udaf.Udaf; import io.confluent.ksql.function.udaf.UdafDescription; import io.confluent.ksql.function.udaf.UdafFactory; import java.util.List; @UdafDescription( name = "collectOrderSet", description = "Collect all the orders for a shipment..") public final class CollectOrdersSet { private static final int LIMIT = 1000; @UdafFactory(description = "Collect all the orders for a shipment..") public static Udaf<String, List<String>> orderSetCollector() { return new Udaf<String, List<String>>() { // Implement Udaf methods … }; }} imports Udaf annotation Udaf factory
  • 66. 66 Write a UDAF for KSQL return new Udaf<String, List<String>>() { @Override public List<String> initialize() {...} @Override public List<String> aggregate(final String thisValue, final List<String> aggregate) { ... } @Override public List<String> merge(final List<String> aggOne, final List<String> aggTwo) {...} };
  • 67. 67 Write a UDAF for KSQL return new Udaf<String, List<String>>() { @Override public List<String> initialize() {...} @Override public List<String> aggregate(final String thisValue, final List<String> aggregate) { ... } @Override public List<String> merge(final List<String> aggOne, final List<String> aggTwo) {...} }; // The initializer for the Aggregation @Override public List<String> initialize() { return Lists.newArrayList(); }
  • 68. 68 Write a UDAF for KSQL return new Udaf<String, List<String>>() { @Override public List<String> initialize() {...} @Override public List<String> aggregate(final String thisValue, final List<String> aggregate) { ... } @Override public List<String> merge(final List<String> aggOne, final List<String> aggTwo) {...} }; // Aggregates the current value into the existing aggregate @Override public List<String> aggregate(final String thisValue, final List<String> aggregate) { if (aggregate.size() < LIMIT && !aggregate.contains(thisValue)) { aggregate.add(thisValue); } return aggregate; }
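  • 69. 69 Write a UDAF for KSQL collectOrderSet(c) ● Incoming values: hi, bar ● Aggregate table (Key: Value): K1: {foo}
  • 70. 70 Write a UDAF for KSQL collectOrderSet(c) ● Incoming values: bar (hi has been aggregated) ● Aggregate table (Key: Value): K1: {foo, hi}
  • 71. 71 Write a UDAF for KSQL collectOrderSet(c) ● Incoming values: none (hi and bar have been aggregated) ● Aggregate table (Key: Value): K1: {foo, hi, bar}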
  • 72. 72 Write a UDAF for KSQL return new Udaf<String, List<String>>() { @Override public List<String> initialize() {...} @Override public List<String> aggregate(final String thisValue, final List<String> aggregate) { ... } @Override public List<String> merge(final List<String> aggOne, final List<String> aggTwo) {...} }; // Merge two aggregates when merging session windows @Override public List<String> merge(final List<String> aggOne, final List<String> aggTwo) { for (final String thisEntry : aggTwo) { if (aggOne.size() == LIMIT) { break; } if (!aggOne.contains(thisEntry)) { aggOne.add(thisEntry); } } return aggOne; }
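  • 73. 73 Write a UDAF for KSQL collectOrderSet(c) ● Session windows for key K1, separated by the session inactivity gap: K1(W1) receives foo, hi, bar; K1(W2) receives bar, tab ● Aggregate table (Key: Value): K1(W1): {foo, hi, bar}; K1(W2): {bar, tab}
  • 74. 74 Write a UDAF for KSQL collectOrderSet(c) ● Session windows for key K1, separated by the session inactivity gap: K1(W1) receives foo, hi, bar; K1(W2) receives bar, tab ● A new value, hello, arrives ● Aggregate table (Key: Value): K1(W1): {foo, hi, bar}; K1(W2): {bar, tab}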
  • 76. 76 Write a UDAF for KSQL collectOrderSet(c) ● The session windows merge into K1(W3), which covers foo, hi, bar, bar, tab, hello ● Aggregate table (Key: Value): K1(W1): {foo, hi, bar}; K1(W2): {bar, tab}; K1(W3): {foo, hi, bar, tab, hello}
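The aggregate and merge steps walked through on the slides above can be sketched in plain Java. This is only an illustration under stated assumptions: the class name CollectOrdersSketch is hypothetical, the methods are plain static methods rather than an implementation of the real Udaf interface (so the block runs without the ksql-udf dependency), and the session-window merge is simulated by hand in main rather than triggered by KSQL.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the slides' UDAF logic; no KSQL classes are used.
final class CollectOrdersSketch {
    // Same cap as the LIMIT constant shown on the slides.
    static final int LIMIT = 1000;

    // Start every aggregation with an empty, mutable list.
    static List<String> initialize() {
        return new ArrayList<>();
    }

    // Add the value unless the cap is reached or the value is already present.
    static List<String> aggregate(final String thisValue, final List<String> aggregate) {
        if (aggregate.size() < LIMIT && !aggregate.contains(thisValue)) {
            aggregate.add(thisValue);
        }
        return aggregate;
    }

    // Fold the second aggregate into the first, skipping duplicates.
    static List<String> merge(final List<String> aggOne, final List<String> aggTwo) {
        for (final String thisEntry : aggTwo) {
            if (aggOne.size() >= LIMIT) {
                break;
            }
            if (!aggOne.contains(thisEntry)) {
                aggOne.add(thisEntry);
            }
        }
        return aggOne;
    }

    public static void main(final String[] args) {
        // First session window: foo, hi, bar.
        final List<String> w1 = initialize();
        for (final String v : new String[] {"foo", "hi", "bar"}) {
            aggregate(v, w1);
        }
        // Second session window: bar, tab.
        final List<String> w2 = initialize();
        for (final String v : new String[] {"bar", "tab"}) {
            aggregate(v, w2);
        }
        // Simulated merge of both windows, then the new value is aggregated.
        final List<String> w3 = merge(w1, w2);
        aggregate("hello", w3);
        System.out.println(w3); // prints [foo, hi, bar, tab, hello]
    }
}
```

Because a List is used to model a set, every insert pays for a linear contains check; with the cap of 1000 entries that cost stays bounded, which is presumably why the slides include LIMIT.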
  • 78. 78 Use UDAFs in KSQL Queries ● Set of orders for each shipment per day CREATE TABLE shipment_orders AS SELECT id, collectOrderSet(order_id) FROM shipments WINDOW tumbling (SIZE 24 HOURS) GROUP BY id;
  • 81. 81 Miscellaneous Considerations ● Security ○ Blacklisting classes ■ Optionally blacklist classes and packages such that they can't be used from a UD(A)F. ■ resource-blacklist.txt in the extension directory
  • 82. 82 Miscellaneous Considerations ● Security ○ Blacklisting classes ○ SecurityManager
  • 83. 83 Miscellaneous Considerations ● Security ○ Blacklisting classes ○ SecurityManager ■ Blocks attempts by any UD(A)Fs to fork processes from the KSQL server. ■ Prevents them from calling System.exit(..)
  • 84. 84 Miscellaneous Considerations ● Security ○ Blacklisting classes ○ SecurityManager ● Metric Collection
  • 85. 85 Miscellaneous Considerations ● Security ○ Blacklisting classes ○ SecurityManager ● Metric Collection ○ Set the config ksql.udf.collect.metrics to true.
  • 86. 86 Miscellaneous Considerations ● Security ○ Blacklisting classes ○ SecurityManager ● Metric Collection ○ Set the config ksql.udf.collect.metrics to true. ○ Collected Metrics: ■ Average/Max time for an invocation ■ Total number of invocations ■ The average number of invocations per second
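To make the listed metrics concrete, here is a small sketch of how such values could be derived from recorded invocation times. It is not how KSQL collects them internally: the class UdfMetricsSketch, its method names, and the idea of passing durations in by hand are all assumptions made for illustration.

```java
// Hypothetical accumulator for per-invocation timings; not a KSQL class.
final class UdfMetricsSketch {
    private long count;      // total number of invocations
    private long totalNanos; // sum of all invocation times
    private long maxNanos;   // longest single invocation

    // Record one invocation that took the given number of nanoseconds.
    void record(final long nanos) {
        count++;
        totalNanos += nanos;
        maxNanos = Math.max(maxNanos, nanos);
    }

    long totalInvocations() {
        return count;
    }

    long maxNanos() {
        return maxNanos;
    }

    // Average time per invocation, or 0 if nothing was recorded.
    double averageNanos() {
        return count == 0 ? 0.0 : (double) totalNanos / count;
    }

    // Average invocations per second over a caller-supplied elapsed time.
    double invocationsPerSecond(final double elapsedSeconds) {
        return elapsedSeconds <= 0 ? 0.0 : count / elapsedSeconds;
    }
}
```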
  • 87. 87 Miscellaneous Considerations ● Security ○ Blacklisting classes ○ SecurityManager ● Metric Collection ● Configurable UDF
  • 88. 88 Miscellaneous Considerations ● Security ○ Blacklisting classes ○ SecurityManager ● Metric Collection ● Configurable UDF ○ UDF access to KSQL server configs
  • 89. 89 Miscellaneous Considerations ● Security ○ Blacklisting classes ○ SecurityManager ● Metric Collection ● Configurable UDF ○ UDF access to KSQL server configs ○ Implement org.apache.kafka.common.Configurable
  • 90. 90 Miscellaneous Considerations @UdfDescription(name = "MyFirstUDF", description = "multiplies 2 numbers") public class SomeConfigurableUdf implements Configurable { private String someSetting = "a.default.value"; @Override public void configure(final Map<String, ?> map) { this.someSetting = (String)map.get("ksql.functions.myfirstudf.some.setting"); } ... }
  • 91. 91 Miscellaneous Considerations ● Security ○ Blacklisting classes ○ SecurityManager ● Metric Collection ● Configurable UDF ○ Only configs whose names are prefixed with ksql.functions.<lowercase-udfname>. or ksql.functions._global_. are accessible
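The prefix rule above can be illustrated with a small filter over a config map. This is a sketch of the rule as stated on the slide, not KSQL's actual implementation: the class UdfConfigFilterSketch, the method visibleConfigs, and the sample keys in the usage note are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper that mimics which server configs a UDF would see.
final class UdfConfigFilterSketch {
    static final String PREFIX = "ksql.functions.";
    static final String GLOBAL_PREFIX = PREFIX + "_global_.";

    // Keep only keys under the UDF's own lowercase prefix or the global prefix.
    static Map<String, Object> visibleConfigs(final String udfName,
                                              final Map<String, ?> serverConfigs) {
        final String ownPrefix = PREFIX + udfName.toLowerCase(Locale.ROOT) + ".";
        final Map<String, Object> visible = new HashMap<>();
        for (final Map.Entry<String, ?> entry : serverConfigs.entrySet()) {
            final String key = entry.getKey();
            if (key.startsWith(ownPrefix) || key.startsWith(GLOBAL_PREFIX)) {
                visible.put(key, entry.getValue());
            }
        }
        return visible;
    }
}
```

For a UDF named MyFirstUDF, a map containing ksql.functions.myfirstudf.some.setting, ksql.functions._global_.region (a made-up key), and ksql.extension.dir would be reduced to the first two entries.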
  • 92. 92 Shout out to Mitch Seymour ● Luna: a place for developers to publish their own UDFs / UDAFs that may not otherwise be a good fit for contributing to the KSQL codebase itself https://magicalpipelines.com/luna/
  • 93. 93 Wrapping up ● Introduction to KSQL ● KSQL Built-in Functions ● Extending KSQL with Custom Functions ○ UDFs (Stateless) ○ UDAFs (Stateful) ● Resources ○ KSQL Docs: https://docs.confluent.io/current/ksql/docs/developer-guide/udf.html# ○ Confluent Examples: https://github.com/confluentinc/demo-scene/tree/master/ksql-udf-advanced-example ○ Luna: https://magicalpipelines.com/luna/
  • 94. 94 UDFs/UDAFs : The Extensibility Framework for KSQL Hojjat Jafarpour hojjat@confluent.io @Hojjat