1. Fast Data for Fitness (10 Nov 2020)
Real-time Health Sensor Ingest
Timothy Spann - Principal DataFlow Field Engineer
3. © 2020 Cloudera, Inc. All rights reserved. 3
Timothy Spann
Data in Motion Principal Field Engineer
@PaasDev
DZone Zone Leader and Big Data MVB
Princeton NJ Future of Data Meetup
https://github.com/tspannhw
https://www.datainmotion.dev/
https://meetup.com/futureofdata-princeton/
4. Data Sources - From Images to Sub-Second Data Events
Heart Rate
Temperature
Blood Sugar
Altimeter
Blood Pressure
Weight
GPS
Live Video/Picture
AI for Pose Estimation
Improvement to Technique
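The sources listed above could be carried as one small event record per reading. The following is a minimal sketch of such a record and its JSON form; the class name, field names, and units are all assumptions made for illustration and do not come from the deck.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class SensorReading:
    """One wearable reading; every field name here is hypothetical."""
    device_id: str
    ts_ms: int                              # event time, epoch milliseconds
    heart_rate_bpm: Optional[int] = None
    temperature_c: Optional[float] = None
    altitude_m: Optional[float] = None
    lat: Optional[float] = None
    lon: Optional[float] = None


def to_json(reading: SensorReading) -> str:
    """Serialize a reading, dropping fields the device did not report."""
    present = {k: v for k, v in asdict(reading).items() if v is not None}
    return json.dumps(present, sort_keys=True)


reading = SensorReading(device_id="watch-01", ts_ms=1605016800000, heart_rate_bpm=142)
payload = to_json(reading)
```

Leaving unreported fields out of the payload keeps sub-second events small, at the cost of consumers having to treat every measurement as optional.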
5. Data for Fitness+ Pipeline
[Pipeline diagram. Labels: Sources, Weather, Climate, Pollution, Sensors, Wearables, REST, Aggregates, SQL, Analytics]
6. Fitness+ Data Streaming Pipeline
[Pipeline diagram. Labels: Device Data, Sensors, Logs, Weather, Energy, Aggregates, SQL, Analytics, MiNiFi Agent, Deep Learning Classification; tiers: Edge, Private Cloud, Multi-Public Cloud]
8. CLOUDERA FLOW AND EDGE MANAGEMENT
Enable easy ingestion, routing, management, and delivery of any data anywhere (edge, cloud, data center) to any downstream system, with built-in end-to-end security and provenance.
Advanced tooling to industrialize flow development (Flow Development Life Cycle)
ACQUIRE
• Over 300 Prebuilt Processors
• Easy to build your own
• Parse, Enrich & Apply Schema
• Filter, Split, Merge & Route
• Throttle & Backpressure
FTP, SFTP, HL7, UDP, XML, HTTP, EMAIL, HTML, IMAGE, SYSLOG
PROCESS
HASH, MERGE, EXTRACT, DUPLICATE, SPLIT, ENCRYPT, TAIL, EVALUATE, EXECUTE, GEOENRICH, SCAN, REPLACE, TRANSLATE, CONVERT, ROUTE TEXT, ROUTE CONTENT, ROUTE CONTEXT, ROUTE RATE, DISTRIBUTE LOAD
DELIVER
• Guaranteed Delivery
• Full data provenance from acquisition to delivery
• Diverse, Non-Traditional Sources
• Ecosystem integration
FTP, SFTP, HL7, UDP, XML, HTTP, EMAIL, HTML, IMAGE, SYSLOG
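Two of the ideas on this slide, backpressure and content-based routing, can be illustrated in a few lines of plain Python. This is only a sketch of the concepts: it does not use any NiFi or MiNiFi API, and the queue size, the 180 bpm threshold, and the route names are assumptions chosen for the example.

```python
import queue

# Bounded queue: when it is full, the producer is told to slow down
# instead of the consumer being flooded (a simple form of backpressure).
MAX_QUEUED = 3  # illustrative threshold, not a NiFi default


def offer(q: queue.Queue, event: dict) -> bool:
    """Try to enqueue without blocking; False means 'back off and retry'."""
    try:
        q.put_nowait(event)
        return True
    except queue.Full:
        return False


def route(event: dict) -> str:
    """Pick a downstream route from the event's content."""
    if "heart_rate_bpm" not in event:
        return "invalid"
    return "alert" if event["heart_rate_bpm"] >= 180 else "normal"


inbox = queue.Queue(maxsize=MAX_QUEUED)
events = [{"heart_rate_bpm": bpm} for bpm in (72, 185, 90, 110)]
accepted = [offer(inbox, e) for e in events]  # the fourth offer is refused

routed = {"normal": [], "alert": [], "invalid": []}
while not inbox.empty():
    event = inbox.get_nowait()
    routed[route(event)].append(event)
```

The refused event is not lost in this sketch; the caller still holds it and can retry once the queue drains, which is the behavior backpressure is meant to encourage.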
9. © 2019 Cloudera, Inc. All rights reserved. 9
Apache Kafka
• Highly reliable distributed messaging system
• Decouples applications and enables many-to-many patterns
• Publish-subscribe semantics
• Horizontal scalability
• Efficient implementation that operates at speed with big data volumes
• Organized by topic to support several use cases
[Diagram. Labels: Source System (x3), Kafka, Fraud Detection, Security Systems, Real-Time Monitoring; patterns: Many-To-Many Publish-Subscribe, Point-To-Point Request-Response]
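The publish-subscribe idea on this slide can be sketched with a toy in-memory broker. This is not a Kafka client and leaves out partitions, replication, and persistence; the class `TinyBroker`, the topic name, and the group names are hypothetical. It only shows that publishers write to a topic once and each consumer group reads the same messages independently, tracking its own position.

```python
from collections import defaultdict


class TinyBroker:
    """Toy topic log; a stand-in for the idea, not a Kafka client."""

    def __init__(self):
        self.logs = defaultdict(list)     # topic -> append-only list of messages
        self.offsets = defaultdict(int)   # (group, topic) -> next index to read

    def publish(self, topic, message):
        self.logs[topic].append(message)

    def poll(self, group, topic):
        """Return the messages this consumer group has not seen yet."""
        start = self.offsets[(group, topic)]
        batch = self.logs[topic][start:]
        self.offsets[(group, topic)] = len(self.logs[topic])
        return batch


broker = TinyBroker()
broker.publish("heart-rate", {"device": "watch-01", "bpm": 142})
broker.publish("heart-rate", {"device": "band-07", "bpm": 98})

# Two independent groups each receive every message on the topic.
monitoring = broker.poll("real-time-monitoring", "heart-rate")
fraud = broker.poll("fraud-detection", "heart-rate")
# A second poll by the same group returns nothing new.
again = broker.poll("real-time-monitoring", "heart-rate")
```

Because the producer only knows the topic name, adding a third consumer group requires no change on the publishing side, which is the decoupling the slide refers to.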
10. New features delivered with Cloudera Streaming Analytics (CSA) 1.2
Next Generation Streaming Analytics
• Flink SQL Support: agile streaming app development using SQL
• New Flink Atlas Hook: capture operational Flink app metadata and lineage
• Single View of Flink YARN Jobs: improve developer experience and operational visibility
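To give a feel for the kind of query that SQL-based stream development makes possible, here is a small windowed aggregate. It is a batch stand-in run with Python's built-in sqlite3 module, not Flink SQL, and it does not use Flink's own windowing syntax; the table name, columns, and sample values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE heart_rate (device_id TEXT, ts_s INTEGER, bpm INTEGER)")
conn.executemany(
    "INSERT INTO heart_rate VALUES (?, ?, ?)",
    [
        ("watch-01", 0, 100),
        ("watch-01", 30, 120),
        ("watch-01", 65, 150),
        ("watch-01", 90, 170),
    ],
)

# Group readings into 60-second tumbling windows by integer-dividing
# the timestamp, then aggregate within each window.
rows = conn.execute(
    """
    SELECT device_id,
           (ts_s / 60) * 60 AS window_start_s,
           AVG(bpm)         AS avg_bpm,
           MAX(bpm)         AS max_bpm
    FROM heart_rate
    GROUP BY device_id, ts_s / 60
    ORDER BY window_start_s
    """
).fetchall()
conn.close()
```

A streaming engine would evaluate a query of this shape continuously as events arrive and emit each window's result when the window closes, rather than scanning a finished table.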
11. References
● https://www.cloudera.com/tutorials/building-a-sentiment-analysis-application/3.html
● https://blog.cloudera.com/benchmarking-nifi-performance-and-scalability/
● https://www.datainmotion.dev/2020/06/no-more-spaghetti-flows.html
● https://www.datainmotion.dev/2020/06/the-rise-of-mega-edge-flank.html
● https://www.datainmotion.dev/2020/01/cloudera-edge2ai-minifi-java-agent-with.html
● https://www.datainmotion.dev/2019/08/rapid-iot-development-with-cloudera.html
● https://www.datainmotion.dev/2019/09/powering-edge-ai-for-sensor-reading.html
● https://www.datainmotion.dev/2019/05/dataworks-summit-dc-2019-report.html
● https://www.datainmotion.dev/2019/03/using-raspberry-pi-3b-with-apache-nifi.html
● https://www.datainmotion.dev/2020/06/unboxing-most-amazing-edge-ai-device.html
● https://www.datainmotion.dev/2019/08/edge-processing-with-jetson-nano-part-3.html
● https://www.datainmotion.dev/2020/04/predicting-sensor-readings-with-time.html