Fast Fourier Transform (FFT)
of Time Series in Kafka Streams
Igor Khalitov
Senior Solutions Architect
Agenda
Digital Signal Processing powered by Kafka Streams can open IoT to a new era of possibilities by
bringing computational power closer to the location where the data originates.
The purpose of this presentation is to demonstrate the power of Kafka Streams as a backbone for
mathematical methods of signal processing. In this session, we’ll explore transforming signals from
the time domain to the frequency domain using FFT, maximizing the level of compression of input
signals while building a precise frequency alert system.
This presentation focuses on the following areas:
- Signal imitators of periodic waveforms (sine, triangle, square, sawtooth, etc.) in compliance with the
OpenMetrics standard.
- The processor that performs the FFT, converting input signals into individual spectral components and
thereby providing frequency information about the signal.
- Prometheus/Grafana visualization of the FFT of input signals in real time.
By the end of the session, you’ll understand the fundamentals of digital signal processing using
Kafka and have the tools you need to build and implement FFT in Kafka Streams.
Digital Signal Processing
[Architecture diagram: Signal Generator → MQTT Broker → Confluent Kafka processing pipeline with UDF → Visualization]
Wave Superposition
[Plot: superposition of three component waves V1, V2, V3 and their sum S over samples 0–400; amplitude axis from -8 to 8]
Signal Generator in Python
import math
import matplotlib.pyplot as plt

samples = 1000
delta = 3 * 2 * math.pi / samples   # phase step: three full periods over the sample range
r = 0
in_array = []
for i in range(samples):
    in_array.append(r)
    r += delta

out_array = []
for x in in_array:
    y = math.sin(x) - 0.5 * math.sin(2 * x) + 1/3 * math.sin(3 * x)
    out_array.append(y)

plt.plot(in_array, out_array, color='green', marker='.')
plt.title("curve")
plt.xlabel("X")
plt.ylabel("Y")
plt.show()
Jython
implementation group: 'org.python', name: 'jython-slim', version: '2.7.3'
@Bean
PythonInterpreter pythonInterpreter;
pythonInterpreter.exec("import math");
// define the Python function from the formula parameter:
String mathFunc = "def genFormula(): return " + paramFormula;
pythonInterpreter.exec(mathFunc);
// evaluate the function for a given x (in radians):
// Example: math.sin(x) - 0.5*math.sin(2*x) + 1/3*math.sin(3*x)
pythonInterpreter.set("x", new PyFloat(rad));
pythonInterpreter.exec("y = genFormula()");
y = pythonInterpreter.get("y");
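For intuition, here is the same formula-evaluation idea in plain Python rather than embedded Jython (a sketch: the function name gen_formula and the exec-based approach are illustrative, not the deck's actual Java code):

# Build a function from the formula string once, then evaluate it per sample.
import math

param_formula = "math.sin(x) - 0.5*math.sin(2*x) + 1/3*math.sin(3*x)"
namespace = {"math": math}
exec("def gen_formula(x): return " + param_formula, namespace)  # define the function

rad = 0.75                                # phase of the current sample, in radians
y = namespace["gen_formula"](rad)         # evaluate y = f(x) at x = rad
print(y)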
Signal Generator
[Diagram: a Pulsar (Runnable) drives a Sensor (Callable), sleeping dT between samples; dT is the sampling interval, T the signal period, F = 1/T; the formula is evaluated in Jython, e.g. curve = math.sin(x). Sample sensor message:]
{
"name": "sensor#1",
"type": "curve",
"timestamp": 1695081516716,
"dimensions": {
"formula": "math.sin(x)",
"msStampInterval": 100
},
"values": {
"value": -0.9772681235681934
}
}
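To make the generator-to-broker hop concrete, here is a minimal publish-loop sketch in Python, assuming the paho-mqtt client library and the mqtt/sine/wave topic from the source-connector config below; the real generator is the signalgen.jar Java service shown on the next slide:

# Publish one OpenMetrics-style sample every dT milliseconds (Sleep(dT) in the diagram).
# paho-mqtt 1.x style constructor; 2.x additionally takes a CallbackAPIVersion argument.
import json
import math
import time

import paho.mqtt.client as mqtt

TOPIC = "mqtt/sine/wave"          # matches "mqtt.topics" in the connector config
SAMPLING_INTERVAL_MS = 100        # dT

client = mqtt.Client()
client.connect("127.0.0.1", 1883)

x = 0.0
while True:
    payload = {
        "name": "sensor#1",
        "type": "curve",
        "timestamp": int(time.time() * 1000),
        "dimensions": {"formula": "math.sin(x)", "msStampInterval": SAMPLING_INTERVAL_MS},
        "values": {"value": math.sin(x)},
    }
    client.publish(TOPIC, json.dumps(payload), qos=0)
    x += 2 * math.pi * SAMPLING_INTERVAL_MS / 1000.0   # advance the phase
    time.sleep(SAMPLING_INTERVAL_MS / 1000.0)          # Sleep(dT)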
MQTT Broker
MqttSourceConnector
brew services start mosquitto
java -jar signalgen.jar --publisherId='sensor#5' --paramFormula='5*math.sin(2*x)'
"config": {
"connector.class": "io.confluent.connect.mqtt.MqttSourceConnector",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
"tasks.max": "1",
"mqtt.server.uri": "tcp://127.0.0.1:1883",
"confluent.topic.bootstrap.servers": "localhost:9092",
"confluent.topic.replication.factor": "1",
"kafka.topic": "wave",
"mqtt.topics": "mqtt/sine/wave",
"mqtt.qos": "0"
}
Connectors
MQTT Source Connector
{
"name": "source-mqtt-wave",
"config": {
"connector.class":
"io.confluent.connect.mqtt.MqttSourceConnector",
"key.converter":
"org.apache.kafka.connect.storage.StringConverter",
"value.converter":
"org.apache.kafka.connect.converters.ByteArrayConverter",
"tasks.max": "1",
"mqtt.server.uri": "tcp://127.0.0.1:1883",
"confluent.topic.bootstrap.servers": "localhost:9092",
"confluent.topic.replication.factor": "1",
"kafka.topic": "wave",
"mqtt.topics": "mqtt/sine/wave",
"mqtt.qos": "0"
}
}
InfluxDB Sink Connector
curl -i -X PUT -H "Content-Type:application/json" \
http://localhost:8083/connectors/influx_sink_wave/config \
-d '{
"name" : "influx_sink_wave",
"connector.class" : "io.confluent.influxdb.InfluxDBSinkConnector",
"tasks.max" : "1",
"topics" : "wave-influxdb",
"influxdb.url" : "http://localhost:8086",
"influxdb.db" : "waves-bucket",
"measurement.name.format" : "${topic}",
"influxdb.username" : "current2023",
"influxdb.password" : "**********",
"access.token" : "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==",
"org.id" : "9b8f7c51f7b6df8a",
"value.converter" : "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "false"
}'
Prometheus Metrics Sink Connector
{
"name": "prometheus-metrics-sink-wave",
"config": {
"topics": "wave",
"connector.class":
"io.confluent.connect.prometheus.PrometheusMetricsSinkConnector",
"tasks.max": "1",
"confluent.topic.bootstrap.servers": "localhost:9092",
"prometheus.scrape.url": "http://localhost:8889/metrics",
"prometheus.listener.url": "http://localhost:8889/metrics",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"key.converter.schemas.enable": "false",
"value.converter.schemas.enable": "false",
"reporter.bootstrap.servers": "localhost:9092",
"reporter.result.topic.replication.factor": "1",
"reporter.error.topic.replication.factor": "1",
"behavior.on.error": "log"
}
}
Sampling
Sampling in Signal Processing
https://en.wikipedia.org/wiki/Sampling_%28signal_processing%29
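As a quick illustration of the link between the generator's sampling interval dT and the sampling theory referenced above, here is a small standard-library-only sketch that samples a sine at Fs = 1/dT (the 10 ms interval matches the msStampInterval used later in the deck):

# Discretize a continuous sine at sampling interval dT; Fs = 1/dT must exceed
# twice the signal frequency (the Nyquist rate) for the samples to represent it.
import math

dT = 0.01                  # 10 ms sampling interval (msStampInterval = 10)
Fs = 1 / dT                # 100 Hz sampling frequency
f = 2.0                    # 2 Hz test signal, well below Fs/2
duration = 1.0             # seconds

samples = [math.sin(2 * math.pi * f * n * dT) for n in range(int(duration * Fs))]
print(len(samples), "samples; Nyquist frequency =", Fs / 2, "Hz")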
Kafka Streaming
Windowing
Tumbling Window
https://oceanrep.geomar.de/
COLLECT_LIST(col) => ARRAY
COLLECT_SET(col) => ARRAY
HISTOGRAM(col) => ARRAY
ksqlDB Lambda Functions
• TRANSFORM
• REDUCE
• FILTER
COLLECT_LIST(timestamp) AS T,
COLLECT_LIST(`values`->`value`) AS V,
…
TRANSFORM(T, x => CAST(x AS VARCHAR)) AS T_VARCHAR,
…
AS_MAP(T_VARCHAR, MAGNITUDE) AS POINTS,
…
FILTER(POINTS, (k,v) => v > 0.4 ) FILTERED
Hopping Window
https://oceanrep.geomar.de/
Fourier Analysis
FFT
https://www.mathworks.com/help/matlab/ref/fft.html
Fourier Analysis
https://www.mathworks.com/matlabcentral/fileexchange/106725-fourier-analysis
Fs = 1000; % Sampling frequency
T = 1/Fs; % Sampling period
L = 1500; % Length of signal
t = (0:L-1)*T; % Time vector
1). Form a signal containing a DC offset of amplitude 0.8, a 50 Hz sinusoid of amplitude 0.7, and a 120 Hz
sinusoid of amplitude 1.
S = 0.8 + 0.7*sin(2*pi*50*t) + sin(2*pi*120*t);
2). Corrupt the signal with zero-mean random noise with a variance of 4.
X = S + 2*randn(size(t));
Complex Magnitude of FFT spectrum
https://www.mathworks.com/help/matlab/ref/fft.html
1). Compute the Fourier transform of the signal.
Y = fft(X);
The signal has three frequency peaks, at 0 Hz, 50 Hz, and 120 Hz.
2). Convert the complex FFT output to its complex magnitude:
plot(Fs/L*(0:L-1),abs(Y),"LineWidth",3)
title("Complex Magnitude of fft Spectrum")
xlabel("f (Hz)")
ylabel("|fft(X)|")
Single-Sided Amplitude Spectrum of X(t)
https://www.mathworks.com/help/matlab/ref/fft.html
3). Find the amplitudes of the three frequency peaks:
P2 = abs(Y/L);
P1 = P2(1:L/2+1);
P1(2:end-1) = 2*P1(2:end-1);
4). Take the Fourier transform of the original, uncorrupted
signal and retrieve the exact amplitudes of 0.8, 0.7, and 1.0.
Y = fft(S);
P2 = abs(Y/L);
P1 = P2(1:L/2+1);
P1(2:end-1) = 2*P1(2:end-1);
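For readers without MATLAB, the same single-sided amplitude spectrum can be reproduced in Python with NumPy (a sketch of the MathWorks example above; numpy is assumed to be installed):

# Recover the 0.8 (DC), 0.7 (50 Hz) and 1.0 (120 Hz) amplitudes from the clean signal S.
import numpy as np

Fs = 1000                       # sampling frequency (Hz)
T = 1 / Fs                      # sampling period (s)
L = 1500                        # length of signal
t = np.arange(L) * T            # time vector

S = 0.8 + 0.7 * np.sin(2 * np.pi * 50 * t) + np.sin(2 * np.pi * 120 * t)

Y = np.fft.fft(S)               # complex spectrum
P2 = np.abs(Y / L)              # two-sided amplitude spectrum
P1 = P2[:L // 2 + 1]            # single-sided spectrum
P1[1:-1] = 2 * P1[1:-1]         # double all bins except DC and Nyquist

f = Fs / L * np.arange(L // 2 + 1)   # frequency axis (Hz)
for target in (0, 50, 120):
    print(target, "Hz ->", round(float(P1[np.argmin(np.abs(f - target))]), 3))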
ksqlDB Time Series Manipulation
ksqlDB Lambda Functions
• TRANSFORM
• REDUCE
• FILTER
COLLECT_LIST(timestamp) AS T,
COLLECT_LIST(`values`->`value`) AS V,
…
TRANSFORM(T, x => CAST(x AS VARCHAR)) AS T_VARCHAR,
…
AS_MAP(T_VARCHAR, MAGNITUDE) AS POINTS,
…
FILTER(POINTS, (k,v) => v > 0.4 ) FILTERED
[Diagram: each sample i carries a timestamp Xi and a value Yi; a window collects X1 … Xn and Y1 … Yn into arrays, emitted once per window with EMIT FINAL and handed to the UDF: SELECT FFT(Y) AS FOURIER …]
User Defined Function (UDF)
UDF
@Udf(description = "fft time series List<T>, returns frequency List<T>")
public <T> List<T> fft (
@UdfParameter( description = "time domain", value = "timeDomain") final List<T> timeDomain) {
double[] arr = timeDomain.stream().mapToDouble(d->(double)d).toArray();
// FFT
double[] adjustedArr = arrayHelper.adjustPowerTwo(arr);
double[] frequencyDomain = transform(adjustedArr);
final List<T> result = (List<T>) DoubleStream.of(frequencyDomain).boxed().collect(Collectors.toList());
return result;
}
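The deck does not show the bodies of arrayHelper.adjustPowerTwo and transform; the sketch below (Python with NumPy, chosen for brevity) shows one plausible reading, in which the window is zero-padded to the next power of two and the FFT magnitudes are returned:

# Hypothetical helpers mirroring adjustPowerTwo / transform from the Java UDF.
import numpy as np

def adjust_power_two(arr):
    # Zero-pad the time-domain window up to the next power-of-two length.
    n = 1
    while n < len(arr):
        n *= 2
    return np.pad(np.asarray(arr, dtype=float), (0, n - len(arr)))

def transform(arr):
    # Return the FFT magnitudes of a real-valued window.
    return np.abs(np.fft.fft(arr))

window = np.sin(2 * np.pi * 5 * np.arange(1588) * 0.01)       # 1588 samples @ 10 ms
frequency_domain = transform(adjust_power_two(window))        # padded to length 2048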
FFT.jar
How to deploy the UDF:
1. Copy the JAR file to the ksqlDB extension directory.
2. Restart your ksqlDB server so that it picks up the new JAR containing your custom ksqlDB function.
/Users/ikhalitov/CFLT/cp_zip/confluent-7.4.1/etc/ksqldb/ext/ksql-server.properties
...
ksql.extension.dir=/Users/ikhalitov/CFLT/cp_zip/confluent-7.4.1/etc/ksqldb/ext
SHOW FUNCTIONS;
DESCRIBE FUNCTION FFT;
ksqlDB streams
WAVES_JET
CREATE STREAM IF NOT EXISTS WAVES_JET (
NAME VARCHAR
, TYPE VARCHAR
, timestamp BIGINT
, dimensions STRUCT<
formula VARCHAR,
frequency DOUBLE,
msStampInterval BIGINT
>
, `values` STRUCT<
`value` DOUBLE
>
) WITH (
KAFKA_TOPIC = 'wave',
PARTITIONS = 1,
REPLICAS = 1,
VALUE_FORMAT = 'JSON'
);
{
"name": "sensor#1",
"type": "curve",
"timestamp": 1695756674961,
"dimensions": {
"formula": "math.sin(x)",
"msStampInterval": 10
},
"values": {
"value": -0.2426000471283583
}
}
CREATE TABLE IF NOT EXISTS WAVES_WINDOWED_TBL
WITH (KAFKA_TOPIC='wave-windowed',
VALUE_FORMAT='JSON',
PARTITIONS=6,
REPLICAS = 1
, retention_ms=3600000
) AS
SELECT
-- EARLIEST_BY_OFFSET(ROWKEY)
-- EARLIEST_BY_OFFSET(AS_VALUE(ROWKEY)) AS MQTT_KEY
NAME AS SENSOR
, COUNT(*) AS SAMPLES
, EARLIEST_BY_OFFSET(dimensions->frequency) AS FREQUENCY
, EARLIEST_BY_OFFSET(dimensions->msStampInterval) AS
MS_SAMPLING_INTERVAL
, COLLECT_LIST(timestamp) AS T
, COLLECT_LIST(`values`->`value`) AS V
, FROM_UNIXTIME(WINDOWSTART) AS WINDOW_START
, FROM_UNIXTIME(WINDOWEND) AS WINDOW_END
, FROM_UNIXTIME(max(ROWTIME)) AS WINDOW_EMIT
FROM WAVES_JET
-- WINDOW HOPPING (SIZE 30 SECONDS, ADVANCE BY 5 SECOND)
WINDOW HOPPING (SIZE 20 SECONDS, ADVANCE BY 2 SECOND)
GROUP BY NAME
EMIT FINAL;
{
"SAMPLES": 1596,
"FREQUENCY": null,
"MS_SAMPLING_INTERVAL": 10,
"T": [
1695756698005,
…
1695756717976,
1695756717988
],
"V": [
0.03141014189275034,
0.11285466728516695,
0.18738077087567337,
…
-0.15022507859000683,
-0.07532702075491876
],
"WINDOW_START": 1695756698000,
"WINDOW_END": 1695756718000,
"WINDOW_EMIT": 1695756717989
}
WAVES_WINDOWED_TBL
WAVES_WINDOWED_JET
CREATE STREAM IF NOT EXISTS WAVES_WINDOWED_JET (
SENSOR VARCHAR KEY,
MQTT_KEY VARCHAR,
SAMPLES INTEGER,
MS_SAMPLING_INTERVAL INTEGER,
T ARRAY<BIGINT>,
V ARRAY<DOUBLE>,
WINDOW_START TIMESTAMP,
WINDOW_END TIMESTAMP
) WITH (
kafka_topic = 'wave-windowed',
VALUE_FORMAT = 'JSON'
);
KStream - KTable duality
WAVES_STEP_1_JET
CREATE OR REPLACE STREAM IF NOT EXISTS WAVES_STEP_1_JET
WITH (
kafka_topic = 'wave-step-1',
VALUE_FORMAT = 'JSON'
) AS
SELECT
SENSOR,
AS_VALUE(SENSOR) AS SENSOR_VAL,
SAMPLES AS SAMPLES_NUM,
MS_SAMPLING_INTERVAL,
CAST(MS_SAMPLING_INTERVAL AS DOUBLE)/1000 AS SAMPLING_PERIOD,
UNIX_TIMESTAMP(WINDOW_START) AS UNIX_TIMESTAMP_START,
UNIX_TIMESTAMP(WINDOW_END) AS UNIX_TIMESTAMP_END,
T,
FFT(V) AS FOURIER
FROM WAVES_WINDOWED_JET EMIT CHANGES;
{
"SENSOR_VAL":
"sensor#1u0000u0000u0001 ",
"SAMPLES_NUM": 1588,
"MS_SAMPLING_INTERVAL": 10,
"SAMPLING_PERIOD": 0.01,
"UNIX_TIMESTAMP_START":
1695757040000,
"UNIX_TIMESTAMP_END":
1695757060000,
"T": [
1695757040007,
1695757040020,
…
],
"FOURIER ": [
0.03141014189275034,
…
-0.15022507859000683,
-0.07532702075491876
],
WAVES_STEP_2_JET
CREATE STREAM IF NOT EXISTS WAVES_STEP_2_JET
WITH (
kafka_topic = 'wave-step-2',
VALUE_FORMAT = 'JSON'
) AS
SELECT
SENSOR,
…
SAMPLING_PERIOD,
1 / SAMPLING_PERIOD AS SAMPLING_FREQ,
SAMPLES_NUM * SAMPLING_PERIOD * 2 AS TIME_VECTOR,
UNIX_TIMESTAMP_END - UNIX_TIMESTAMP_START AS MS_TIME_VECTOR,
1 / (SAMPLES_NUM * SAMPLING_PERIOD * 2) AS DELTA_FREQ,
ARRAY_LENGTH(T) AS T_LENGTH,
T,
TRANSFORM(T, x => x - UNIX_TIMESTAMP_START) AS NORM_T,
ARRAY_LENGTH(FOURIER) AS FOURIER_LENGTH,
FOURIER
FROM WAVES_STEP_1_JET EMIT CHANGES;
WAVES_STEP_3_JET
CREATE STREAM IF NOT EXISTS WAVES_STEP_3_JET
WITH (
kafka_topic = 'wave-step-3',
VALUE_FORMAT = 'JSON',
RETENTION_MS=600000
) AS
SELECT
SENSOR,
…
T_LENGTH,
T,
TRANSFORM(T, x => DELTA_FREQ * (SAMPLES_NUM-1) * ((CAST(x AS DOUBLE) -
UNIX_TIMESTAMP_START)/MS_TIME_VECTOR) ) AS FREQUENCIES,
FOURIER_LENGTH,
TRANSFORM(FOURIER, x => x/(SAMPLES_NUM/2)) AS MAGNITUDE
FROM WAVES_STEP_2_JET EMIT CHANGES;
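To make the arithmetic of steps 2 and 3 easier to follow, here is the same computation written out in plain Python, using the sample values from the step-1 record above (a sketch; only the expressions from the two CSAS statements are reproduced):

# Step-2/step-3 math: frequency resolution, timestamp-to-frequency mapping,
# and normalization of the FFT output to amplitudes.
samples_num = 1588
ms_sampling_interval = 10
unix_timestamp_start = 1695757040000
unix_timestamp_end = 1695757060000

sampling_period = ms_sampling_interval / 1000                  # 0.01 s
sampling_freq = 1 / sampling_period                            # 100 Hz
ms_time_vector = unix_timestamp_end - unix_timestamp_start     # 20000 ms
delta_freq = 1 / (samples_num * sampling_period * 2)           # DELTA_FREQ (Hz per bin)

def to_frequency(ts_ms):
    # TRANSFORM(T, x => DELTA_FREQ * (SAMPLES_NUM-1) * ((x - START)/MS_TIME_VECTOR))
    return delta_freq * (samples_num - 1) * ((ts_ms - unix_timestamp_start) / ms_time_vector)

def to_magnitude(fourier_value):
    # TRANSFORM(FOURIER, x => x / (SAMPLES_NUM/2))
    return fourier_value / (samples_num / 2)

print(round(to_frequency(unix_timestamp_end), 2), "Hz at the end of the window (~Fs/2)")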
WAVES_STEP_4_JET
CREATE STREAM IF NOT EXISTS WAVES_STEP_4_JET
WITH (
kafka_topic = 'wave-step-4',
VALUE_FORMAT = 'JSON',
RETENTION_MS=600000
) AS
SELECT
SENSOR,
SENSOR_VAL,
UNIX_TIMESTAMP_START,
UNIX_TIMESTAMP_END,
SAMPLES_NUM,
MS_SAMPLING_INTERVAL,
MS_TIME_VECTOR,
DELTA_FREQ,
T_LENGTH,
TRANSFORM(T, x => CAST(x AS VARCHAR)) AS T_VARCHAR,
FOURIER_LENGTH,
MAGNITUDE
FROM WAVES_STEP_3_JET EMIT CHANGES;
WAVES_STEP_5_JET
CREATE STREAM IF NOT EXISTS WAVES_STEP_5_JET
WITH (
kafka_topic = 'wave-step-5',
VALUE_FORMAT = 'JSON',
RETENTION_MS=600000
) AS
SELECT
SENSOR,
SENSOR_VAL,
UNIX_TIMESTAMP_START,
UNIX_TIMESTAMP_END,
SAMPLES_NUM,
MS_SAMPLING_INTERVAL,
MS_TIME_VECTOR,
DELTA_FREQ,
AS_MAP(T_VARCHAR, MAGNITUDE) AS POINTS
FROM WAVES_STEP_4_JET EMIT CHANGES;
WAVES_STEP_6_JET
CREATE STREAM IF NOT EXISTS WAVES_STEP_6_JET
WITH (
kafka_topic = 'wave-step-6',
VALUE_FORMAT = 'JSON',
RETENTION_MS=600000
) AS
SELECT
SENSOR,
SENSOR_VAL,
UNIX_TIMESTAMP_START,
UNIX_TIMESTAMP_END,
SAMPLES_NUM,
MS_SAMPLING_INTERVAL,
MS_TIME_VECTOR,
DELTA_FREQ,
FILTER(POINTS, (k,v) => v > 0.4 ) FILTERED
FROM WAVES_STEP_5_JET EMIT CHANGES;
WAVES_STEP_7_JET
CREATE STREAM IF NOT EXISTS WAVES_STEP_7_JET
WITH (
kafka_topic = 'wave-step-7',
VALUE_FORMAT = 'JSON',
RETENTION_MS=600000
) AS
SELECT
SENSOR,
SENSOR_VAL,
UNIX_TIMESTAMP_START,
UNIX_TIMESTAMP_END,
SAMPLES_NUM,
MS_SAMPLING_INTERVAL,
MS_TIME_VECTOR,
DELTA_FREQ,
MAP_KEYS(FILTERED) AS X_TIME,
MAP_VALUES(FILTERED) AS Y_AMPL
FROM WAVES_STEP_6_JET EMIT CHANGES;
WAVES_STEP_8_JET
CREATE STREAM IF NOT EXISTS WAVES_STEP_8_JET
WITH (
kafka_topic = 'wave-step-8',
VALUE_FORMAT = 'JSON',
RETENTION_MS=3600000
) AS
SELECT
SENSOR,
SENSOR_VAL,
UNIX_TIMESTAMP_START,
UNIX_TIMESTAMP_END,
SAMPLES_NUM,
MS_SAMPLING_INTERVAL,
MS_TIME_VECTOR,
DELTA_FREQ,
TRANSFORM(X_TIME, x => DELTA_FREQ * (SAMPLES_NUM-1) * ((CAST(x AS DOUBLE) -
UNIX_TIMESTAMP_START)/MS_TIME_VECTOR) ) AS X_FREQ,
Y_AMPL
FROM WAVES_STEP_7_JET EMIT CHANGES;
InfluxDB
Transform
CREATE TYPE IF NOT EXISTS TAGS_TYPE AS STRUCT<sensor_id VARCHAR, formula
VARCHAR>;
-- DROP TYPE TAGS_TYPE;
CREATE OR REPLACE STREAM IF NOT EXISTS WAVE_INFLUXDB_JET
(
`measurement` VARCHAR,
`tags` TAGS_TYPE,
`time` VARCHAR,
`wave` DOUBLE
)
WITH (
kafka_topic = 'wave-influxdb',
VALUE_FORMAT = 'JSON',
PARTITIONS = 1,
RETENTION_MS=600000
);
INSERT INTO SELECT FROM
INSERT INTO WAVE_INFLUXDB_JET
SELECT
'WAVE' AS `measurement`
, STRUCT(sensor_id := NAME, formula := dimensions->formula) AS `tags`
, TIMESTAMPTOSTRING(TIMESTAMP, 'yyyy-MM-dd HH:mm:ss.SSS') AS `time`
,`values`->`value` AS `wave`
FROM WAVES_JET
EMIT CHANGES;
Questions?