1 © Hortonworks Inc. 2011–2018. All rights reserved
Dataflow Management From Edge to Core with Apache NiFi
Andy LoPresto | @yolopey
Sr. Member of Technical Staff at Hortonworks, Apache NiFi PMC & Committer
11 October 2018, DataWorks Summit Singapore
Gauging Audience Familiarity with NiFi
“What’s a NeeFee?”
No experience with dataflow
No experience with NiFi
“I can pick this up pretty quickly”
Some experience with dataflow
Some experience with NiFi
“I refactored the Ambari integration endpoint to allow for mutual authentication TLS during my coffee break”
Forgotten more about NiFi than most of us will ever know
Agenda
• What is dataflow and what are the challenges?
• Apache NiFi
• Apache MiNiFi
• Apache NiFi Registry
• Complementary Tools
• Community
• All slides provided online, so no need to transcribe
What Is Dataflow?
• Moving some content from A to B
• Content could be any bytes
• Logs
• HTTP
• XML
• CSV
• Images
• Video
• Telemetry
Producers (a.k.a. Things): anything AND everything
Internet!
Consumers
• User
• Storage
• System
• …More Things
Moving Data Effectively Is Hard
“Data Pipeline” https://xkcd.com/2054/
Dataflow Challenges in 3 Categories
Data
• Standards
• Formats
• Protocols
• Veracity
• Validity
• Schemas
• Partitioning/Bundling
Infrastructure
• “Exactly Once” Delivery
• Ensuring Security
• Overcoming Security
• Credential Management
• Network
People
• Compliance
• “That [person|team|group]”
• Consumers Change
• Requirements Change
• “Exactly Once” Delivery
Raise your hand if you want to maintain Python scripts for the rest of your life
Let’s Connect Lots of As to Bs to As to Cs to Bs to Δs to Cs to ϕs
Apache NiFi
Apache NiFi: Key Features
• Guaranteed delivery
• Data buffering
• Backpressure
• Pressure release
• Prioritized queuing
• Flow specific QoS
• Latency vs. throughput
• Loss tolerance
• Data provenance
• Supports push and pull models
• Recovery/recording a rolling log of fine-grained history
• Visual command and control
• Flow templates
• Pluggable, multi-tenant security
• Designed for extension
• Clustering
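Two of the features listed above, backpressure and prioritized queuing, can be illustrated with a toy queue. This is a minimal sketch and not NiFi's implementation: the class name, the object-count threshold, and the "lower number means higher priority" rule are all assumptions made for the example.

```python
import heapq
import itertools


class BoundedPriorityQueue:
    """Toy connection queue: prioritized, with an object-count backpressure threshold."""

    def __init__(self, backpressure_threshold):
        self.backpressure_threshold = backpressure_threshold
        self._heap = []
        # Monotonic counter keeps FIFO order among items with equal priority.
        self._counter = itertools.count()

    def is_full(self):
        # When full, the upstream producer should pause (backpressure).
        return len(self._heap) >= self.backpressure_threshold

    def offer(self, item, priority):
        """Enqueue item; return False and leave the queue unchanged when full."""
        if self.is_full():
            return False
        heapq.heappush(self._heap, (priority, next(self._counter), item))
        return True

    def poll(self):
        """Dequeue the item with the lowest priority number, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = BoundedPriorityQueue(backpressure_threshold=2)
accepted = [
    queue.offer("bulk log", priority=5),
    queue.offer("alert", priority=1),
    queue.offer("another log", priority=5),  # rejected: threshold reached
]
first_out = queue.poll()  # the alert jumps ahead of the earlier bulk log
```

The producer sees the rejection instead of the queue growing without bound, which is the essence of backpressure; pressure release (aging data off) is not modeled here.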
FlowFiles Are Like HTTP Data
HTTP Data
Header
HTTP/1.1 200 OK
Date: Sun, 10 Oct 2010 23:26:07 GMT
Server: Apache/2.2.8 (CentOS) OpenSSL/0.9.8g
Last-Modified: Sun, 26 Sep 2010 22:04:35 GMT
ETag: "45b6-834-49130cc1182c0"
Accept-Ranges: bytes
Content-Length: 13
Connection: close
Content-Type: text/html
Content
Hello world!
FlowFile
Header
Standard FlowFile Attributes
Key: 'entryDate' Value: 'Fri Jun 17 17:15:04 EDT 2016'
Key: 'lineageStartDate' Value: 'Fri Jun 17 17:15:04 EDT 2016'
Key: 'fileSize' Value: '23609'
FlowFile Attribute Map Content
Key: 'filename' Value: '15650246997242'
Key: 'path' Value: './'
Content
Binary Content *
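The header/content analogy can be sketched in a few lines of Python. This is illustrative only: `FlowFileSketch`, `from_http_response`, and the `http.*` attribute names are hypothetical and are not part of the NiFi API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FlowFileSketch:
    """Hypothetical stand-in for a FlowFile: attribute map (header) plus opaque bytes (content)."""
    content: bytes
    attributes: Dict[str, str] = field(default_factory=dict)

    @property
    def file_size(self) -> int:
        # Derived from the content so it cannot drift out of sync with it.
        return len(self.content)


def from_http_response(raw: bytes) -> FlowFileSketch:
    """Split a raw HTTP response into headers (as attributes) and body (as content)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    attributes = {"http.status.line": lines[0]}  # attribute names are assumptions
    for line in lines[1:]:
        name, _, value = line.partition(":")
        attributes["http.header." + name.strip().lower()] = value.strip()
    return FlowFileSketch(content=body, attributes=attributes)


raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 13\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"Hello world!\n"
)
flowfile = from_http_response(raw_response)
```

As with HTTP, routing and filtering decisions can be made from the attributes alone, without reading the content bytes.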
User Interface
Less of this… … more of this
Deeper Ecosystem Integration: 274+ Processors, 57 Controller Services
Hash
Extract
Merge
Duplicate
Scan
GeoEnrich
Replace
Convert
Split
Translate
Route Content
Route Context
Route Text
Control Rate
Distribute Load
Generate Table Fetch
Jolt Transform JSON
Prioritized Delivery
Encrypt
Tail
Evaluate
Execute
All Apache project logos are trademarks of the ASF and the respective projects.
Fetch
HTTP
Syslog
Email
HTML
Image
HL7
FTP
UDP
XML
SFTP
AMQP
WebSocket
Parse Records Convert Records
Apache MiNiFi
IoT Challenges
• Limited computing capability
• Limited power/network
• Restricted software library/platform availability
• No UI
• Physically inaccessible
• Not frequently updated
• Competing standards/protocols
• Scalability
• Privacy & Security
@_lennart
So Why Do We Need a Different Solution?
• NiFi is designed to “own the box”
• NiFi 0.7.x started up in about 10-15 minutes on RP3 (593 MB)
• NiFi 1.x started up in about 30 minutes on RP3 (760 MB)
• 33 new processors
• Rewrite for multi tenant authorization
• Complete UI overhaul
Apache NiFi Subproject: MiNiFi
• Get the key parts of NiFi close to where data begins and provide bidirectional communication
• NiFi lives in the data center: give it an enterprise server or a cluster of them
• MiNiFi lives as close as possible to where data is born and is a guest on that device or system
• IoT
• Connected car
• Legacy hardware
Flavors of MiNiFi
• MiNiFi Java (v0.5.0)
• Modified version of NiFi
• No UI
• YAML configuration
• Reduced processor count
• 63+ by default, more available with additional NARs
• MiNiFi C++ (v0.5.0)
• Written from scratch
• 33 processors by default
• Bi-directional site-to-site & provenance data
How Does MiNiFi Interact with NiFi?
• NiFi
• Design flows
• Aggregate data from many sources
• Perform routing/analysis/SEP
• MiNiFi
• Receive flows
• Collect data
• Send for processing
Let’s Add Dimensionality
• We’ve been imagining EDGE to CORE as a bi-directional linear system
• Let’s expand that to the real world
What Does MiNiFi Provide?
• Data tagging/provenance
• Governance from edge (geopolitical restrictions)
• Security (encryption, certificate-based authentication)
• Low latency (immediate reactions & decision-making)
[Figure: Connected Car Reference Platform Box, with Tuner + DSRC Card and Connectivity Card]
MiNiFi Exfil
• Site-to-Site
• NiFi protocol
• Two implementations
• Raw socket
• HTTP(S)
• Secured with mutual authentication TLS
• HTTP(S), (S)FTP, JMS, Syslog, File, Email, Process
Apache NiFi Registry
Flow Development Lifecycle (FDLC)
• Origins of NiFi
• Operator Experience
• MC data, don’t drop, mitigate temporarily
• Version Control
• Environment Promotion
Operator Experience
Component Property History
• Shows previous values (user, time changed)
• Sensitive values are always encrypted at rest and never returned via the API
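The "never returned via the API" half of that guarantee can be sketched as a serialization step that masks sensitive values before anything leaves the server. This is a hedged illustration, not NiFi's code: `PropertySketch`, `to_api_view`, and the mask string are all hypothetical, and encryption at rest is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertySketch:
    """Hypothetical component property with a sensitivity flag."""
    name: str
    value: str
    sensitive: bool = False


MASK = "********"  # placeholder shown instead of a sensitive value (assumption)


def to_api_view(properties):
    """Build the dict an API response would expose, never including sensitive values."""
    return {p.name: (MASK if p.sensitive else p.value) for p in properties}


props = [
    PropertySketch("Hostname", "sftp.example.com"),
    PropertySketch("Password", "s3cret", sensitive=True),
]
api_view = to_api_view(props)
```

Because the mask is applied where the response is built, a client can see that a sensitive property is set without ever receiving its value.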
Exporting Flows
• XML templates
• Copying flow.xml.gz between systems
Challenges
• Templates
• Updates/replacement
• Sensitive property replacement
• Flow.xml.gz migration
• Key synchronization
• Environment promotion
• Approval processes
• Verifiability
Template Replacement
• Export a new version of template
• Transfer (somehow)
• Verify?
• Import onto canvas side-by-side with the existing flow
• Stop processors
• Empty queues
• Reconnect queues
• Start
• Pray?
Introducing Apache NiFi Registry 0.3.0
NiFi Registry for Dataflows
• Previously, flows were exported via XML templates
• Didn’t contain sensitive values
• Couldn’t be updated in-place
• No tracking system
• NiFi Registry brings asset management as a first-class citizen to NiFi
• Flows can be versioned
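The idea of versioned flows stored in buckets can be sketched as a small in-memory model. This is an assumption-laden toy, not the NiFi Registry data model or API: `BucketSketch`, `VersionedFlowSketch`, and the dict-shaped flow definition are invented for illustration.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VersionedFlowSketch:
    """Hypothetical versioned flow: an append-only list of snapshots."""
    name: str
    snapshots: List[dict] = field(default_factory=list)

    def commit(self, flow_definition: dict, comment: str) -> int:
        """Store an independent copy of the flow and return its 1-based version."""
        version = len(self.snapshots) + 1
        self.snapshots.append({
            "version": version,
            "comment": comment,
            "flow": copy.deepcopy(flow_definition),  # later edits must not alter history
        })
        return version

    def get(self, version: int) -> dict:
        return self.snapshots[version - 1]["flow"]


@dataclass
class BucketSketch:
    """Hypothetical bucket: a named container of versioned flows."""
    name: str
    flows: Dict[str, VersionedFlowSketch] = field(default_factory=dict)

    def flow(self, name: str) -> VersionedFlowSketch:
        return self.flows.setdefault(name, VersionedFlowSketch(name))


bucket = BucketSketch("dev")
flow = {"processors": ["GetFile", "PutFile"]}
v1 = bucket.flow("ingest").commit(flow, "initial version")
flow["processors"].append("LogAttribute")
v2 = bucket.flow("ingest").commit(flow, "add logging")
```

Unlike a template export, every committed version stays retrievable, so an instance can be updated in place or rolled back to an earlier snapshot.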
Flows Can Be Promoted Between Environments
• Connect multiple NiFi instances to a NiFi Registry instance
• Communicate between multiple NiFi Registry instances
• via multiple Registry Clients
• via NiFi CLI
Extensibility
• Git-backed persistence
• Share flows via GitHub, etc.
• Commit hooks
• Register a hook & action
• “When a new version of the flow is committed to QA Registry, email the QA team and post in the QA Deploy Slack channel”
• Pluggable DB implementations
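The "register a hook & action" pattern can be sketched as a small event dispatcher. This is a generic illustration rather than NiFi Registry's hook mechanism: the class, the event name `FLOW_VERSION_COMMITTED`, and both actions are hypothetical, and the actions return strings instead of sending real email or Slack messages.

```python
from collections import defaultdict


class HookRegistrySketch:
    """Hypothetical registry mapping event types to the actions to run for them."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, event_type, action):
        self._hooks[event_type].append(action)

    def fire(self, event_type, **details):
        # Run every action registered for this event, in registration order.
        return [action(**details) for action in self._hooks[event_type]]


def email_qa_team(bucket, flow, version):
    return f"email to QA: {flow} v{version} committed to {bucket}"


def post_to_slack(bucket, flow, version):
    return f"#qa-deploy: {flow} v{version} is ready in {bucket}"


hooks = HookRegistrySketch()
hooks.register("FLOW_VERSION_COMMITTED", email_qa_team)  # hypothetical event name
hooks.register("FLOW_VERSION_COMMITTED", post_to_slack)
notifications = hooks.fire(
    "FLOW_VERSION_COMMITTED", bucket="QA", flow="ingest", version=3
)
```

Keeping the actions separate from the commit path means new notifications can be added without touching the code that stores flow versions.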
Demo
Create Registry
• Install nifi-registry
• $ mvn clean install
• $ ./bin/nifi-registry.sh start
• Browse to http://localhost:18080
Create Bucket
Connect to NiFi
Create Process Group
Commit Version
View Flow in Registry
Import New Instance into NiFi
Modify the Original Flow
See Local Changes Before Committing
Commit
Update New Instance from Registry
Complementary Tools
• NiFi Toolkit
• NiPyAPI
• MiNiFi Converter Toolkit
NiFi Toolkit
• TLS Toolkit
• Generates, signs, and packages keys and certificates for NiFi services (node/cluster, clients)
• Encrypt Config
• Protects sensitive configuration values like passwords
• CLI
• Interacts with NiFi & NiFi Registry to operate on flows
NiPyAPI
• Python wrapper around NiFi REST API
• Community-provided by Daniel Chaffelson
• Exposes common operations for automation, batch processing, recursion, etc.
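import nipyapi

# Placeholder names (assumptions): use the bucket and flow names in your Registry
dev_bucket_name = 'dev'
dev_ver_flow_name = 'my_flow'

# Look up the bucket by name, then the versioned flow inside it
dev_bucket = nipyapi.versioning.get_registry_bucket(dev_bucket_name)
dev_ver_flow = nipyapi.versioning.get_flow_in_bucket(
    dev_bucket.identifier,
    identifier=dev_ver_flow_name
)
# Export the flow version as YAML text
dev_export = nipyapi.versioning.export_flow_version(
    bucket_id=dev_bucket.identifier,
    flow_id=dev_ver_flow.identifier,
    mode='yaml'
)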
MiNiFi Converter Toolkit
• Save as template from NiFi
• Run $ ./bin/config.sh transform template.xml config.yml
• MiNiFi flow ready to run
Community
More Resources
• FDLC with Apache NiFi, Kevin Doran
• NiPyAPI Docs, Daniel Chaffelson
• DevOps Tips, Tim Spann
• Automate Workflow, Pierre Villard
New Announcements
• NiFi 1.8.0: … Oct 2018 (170+ Jiras)
• Jetty, DB improvements
• Auto load-balancing queues
• TLS Toolkit w/ external CA
• Record processor improvements
• MiNiFi C++ 0.5.0: 6 June 2018
• MiNiFi Java 0.5.0: 7 July 2018
• NiFi Registry 0.3.0: 25 Sept 2018
Community Health
Learn More and Join Us
Apache NiFi site
https://nifi.apache.org
Subproject MiNiFi site
https://nifi.apache.org/minifi/
Subscribe to and collaborate at
dev@nifi.apache.org
users@nifi.apache.org
Submit Ideas or Issues
https://issues.apache.org/jira/browse/NIFI
Follow us on Twitter
@apachenifi
Thank you
alopresto@hortonworks.com | alopresto@apache.org | @yolopey
github.com/alopresto/slides

More Related Content

What's hot

Observability in Java: Getting Started with OpenTelemetry
Observability in Java: Getting Started with OpenTelemetryObservability in Java: Getting Started with OpenTelemetry
Observability in Java: Getting Started with OpenTelemetryDevOps.com
 
Kubernetes Networking with Cilium - Deep Dive
Kubernetes Networking with Cilium - Deep DiveKubernetes Networking with Cilium - Deep Dive
Kubernetes Networking with Cilium - Deep DiveMichal Rostecki
 
NiFi Best Practices for the Enterprise
NiFi Best Practices for the EnterpriseNiFi Best Practices for the Enterprise
NiFi Best Practices for the EnterpriseGregory Keys
 
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...NETWAYS
 
Best practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at RenaultBest practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at RenaultDataWorks Summit
 
Monitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusMonitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusGrafana Labs
 
OFI libfabric Tutorial
OFI libfabric TutorialOFI libfabric Tutorial
OFI libfabric Tutorialdgoodell
 
Manchester MuleSoft Meetup #6 - Runtime Fabric with Mulesoft
Manchester MuleSoft Meetup #6 - Runtime Fabric with Mulesoft Manchester MuleSoft Meetup #6 - Runtime Fabric with Mulesoft
Manchester MuleSoft Meetup #6 - Runtime Fabric with Mulesoft Akshata Sawant
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayDataWorks Summit
 
Apache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesApache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesIsheeta Sanghi
 
cLoki: Like Loki but for ClickHouse
cLoki: Like Loki but for ClickHousecLoki: Like Loki but for ClickHouse
cLoki: Like Loki but for ClickHouseAltinity Ltd
 
Learning how AWS implement AWS VPC CNI
Learning how AWS implement AWS VPC CNILearning how AWS implement AWS VPC CNI
Learning how AWS implement AWS VPC CNIHungWei Chiu
 
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFiReal-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFiTimothy Spann
 
OpenTelemetry For Operators
OpenTelemetry For OperatorsOpenTelemetry For Operators
OpenTelemetry For OperatorsKevin Brockhoff
 
Grafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for LogsGrafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for LogsMarco Pracucci
 
Drone Data Flowing Through Apache NiFi
Drone Data Flowing Through Apache NiFiDrone Data Flowing Through Apache NiFi
Drone Data Flowing Through Apache NiFiTimothy Spann
 
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFiIntelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFiDataWorks Summit
 

What's hot (20)

Observability in Java: Getting Started with OpenTelemetry
Observability in Java: Getting Started with OpenTelemetryObservability in Java: Getting Started with OpenTelemetry
Observability in Java: Getting Started with OpenTelemetry
 
Kubernetes Networking with Cilium - Deep Dive
Kubernetes Networking with Cilium - Deep DiveKubernetes Networking with Cilium - Deep Dive
Kubernetes Networking with Cilium - Deep Dive
 
Hadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash CourseHadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash Course
 
NiFi Best Practices for the Enterprise
NiFi Best Practices for the EnterpriseNiFi Best Practices for the Enterprise
NiFi Best Practices for the Enterprise
 
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...
OSMC 2023 | What’s new with Grafana Labs’s Open Source Observability stack by...
 
Best practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at RenaultBest practices and lessons learnt from Running Apache NiFi at Renault
Best practices and lessons learnt from Running Apache NiFi at Renault
 
Monitoring Kubernetes with Prometheus
Monitoring Kubernetes with PrometheusMonitoring Kubernetes with Prometheus
Monitoring Kubernetes with Prometheus
 
OFI libfabric Tutorial
OFI libfabric TutorialOFI libfabric Tutorial
OFI libfabric Tutorial
 
Manchester MuleSoft Meetup #6 - Runtime Fabric with Mulesoft
Manchester MuleSoft Meetup #6 - Runtime Fabric with Mulesoft Manchester MuleSoft Meetup #6 - Runtime Fabric with Mulesoft
Manchester MuleSoft Meetup #6 - Runtime Fabric with Mulesoft
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox Gateway
 
Nifi
NifiNifi
Nifi
 
Apache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesApache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup Slides
 
cLoki: Like Loki but for ClickHouse
cLoki: Like Loki but for ClickHousecLoki: Like Loki but for ClickHouse
cLoki: Like Loki but for ClickHouse
 
Learning how AWS implement AWS VPC CNI
Learning how AWS implement AWS VPC CNILearning how AWS implement AWS VPC CNI
Learning how AWS implement AWS VPC CNI
 
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFiReal-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
 
OpenTelemetry For Operators
OpenTelemetry For OperatorsOpenTelemetry For Operators
OpenTelemetry For Operators
 
Grafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for LogsGrafana Loki: like Prometheus, but for Logs
Grafana Loki: like Prometheus, but for Logs
 
Drone Data Flowing Through Apache NiFi
Drone Data Flowing Through Apache NiFiDrone Data Flowing Through Apache NiFi
Drone Data Flowing Through Apache NiFi
 
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFiIntelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFi
 
Tail f - Why ConfD
Tail f - Why ConfDTail f - Why ConfD
Tail f - Why ConfD
 

Similar to Dataflow Management From Edge to Core with Apache NiFi

The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFiDataWorks Summit
 
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFiThe First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFiDataWorks Summit
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataWorks Summit
 
State of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityState of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityAccumulo Summit
 
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiDataWorks Summit
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiJoe Percivall
 
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFiIntelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFiDataWorks Summit
 
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiData at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiAldrin Piri
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIsheeta Sanghi
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and FlinkBryan Bende
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIsheeta Sanghi
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIsheeta Sanghi
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkHortonworks
 
Integrating NiFi and Apex
Integrating NiFi and ApexIntegrating NiFi and Apex
Integrating NiFi and ApexBryan Bende
 
Integrating Apache NiFi and Apache Apex
Integrating Apache NiFi and Apache Apex Integrating Apache NiFi and Apache Apex
Integrating Apache NiFi and Apache Apex Apache Apex
 
Apache Deep Learning 101 - DWS Berlin 2018
Apache Deep Learning 101 - DWS Berlin 2018Apache Deep Learning 101 - DWS Berlin 2018
Apache Deep Learning 101 - DWS Berlin 2018Timothy Spann
 

Similar to Dataflow Management From Edge to Core with Apache NiFi (20)

The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile -- Edge and IoT Data Collection with Apache NiFi and MiNiFi
 
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFiThe First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
The First Mile - Edge and IoT Data Collection With Apache Nifi and MiniFi
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFi
 
State of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityState of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & Community
 
Apache Nifi Crash Course
Apache Nifi Crash CourseApache Nifi Crash Course
Apache Nifi Crash Course
 
Apache Nifi Crash Course
Apache Nifi Crash CourseApache Nifi Crash Course
Apache Nifi Crash Course
 
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
 
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFiIntelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
 
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiData at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and Flink
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 
Integrating NiFi and Apex
Integrating NiFi and ApexIntegrating NiFi and Apex
Integrating NiFi and Apex
 
Integrating Apache NiFi and Apache Apex
Integrating Apache NiFi and Apache Apex Integrating Apache NiFi and Apache Apex
Integrating Apache NiFi and Apache Apex
 
Nifi workshop
Nifi workshopNifi workshop
Nifi workshop
 
Apache Deep Learning 101 - DWS Berlin 2018
Apache Deep Learning 101 - DWS Berlin 2018Apache Deep Learning 101 - DWS Berlin 2018
Apache Deep Learning 101 - DWS Berlin 2018
 

More from DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

Dataflow Management From Edge to Core with Apache NiFi

  • 1. Dataflow Management From Edge to Core with Apache NiFi
      Andy LoPresto | @yolopey
      Sr. Member of Technical Staff at Hortonworks, Apache NiFi PMC & Committer
      11 October 2018, Dataworks Summit Singapore
      © Hortonworks Inc. 2011–2018. All rights reserved
  • 2. Gauging Audience Familiarity with NiFi
      • “What’s a NeeFee?”: no experience with dataflow, no experience with NiFi
      • “I can pick this up pretty quickly”: some experience with dataflow, some experience with NiFi
      • “I refactored the Ambari integration endpoint to allow for mutual authentication TLS during my coffee break”: forgotten more about NiFi than most of us will ever know
  • 3. Agenda
      • What is dataflow and what are the challenges?
      • Apache NiFi
      • Apache MiNiFi
      • Apache NiFi Registry
      • Complementary Tools
      • Community
      • All slides provided online, so no need to transcribe
  • 4. What Is Dataflow?
  • 5. What Is Dataflow?
      • Moving some content from A to B
      • Content could be any bytes: Logs, HTTP, XML, CSV, Images, Video, Telemetry
      • Producers, A.K.A. Things: Anything AND Everything
      • Internet!
      • Consumers: User, Storage, System, …More Things
  • 6. Moving Data Effectively Is Hard
      “Data Pipeline” https://xkcd.com/2054/
  • 7. Dataflow Challenges in 3 Categories
      • Data: Standards, Formats, Protocols, Veracity, Validity, Schemas, Partitioning/Bundling
      • Infrastructure: “Exactly Once” Delivery, Ensuring Security, Overcoming Security, Credential Management, Network
      • People: Compliance, “That [person|team|group]”, Consumers Change, Requirements Change, “Exactly Once” Delivery
  • 8. Let’s Connect Lots of As to Bs to As to Cs to Bs to Δs to Cs to ϕs
      Raise your hand if you want to maintain Python scripts for the rest of your life
  • 9. Apache NiFi
  • 10. Apache NiFi: Key Features
      • Guaranteed delivery
      • Data buffering
          • Backpressure
          • Pressure release
      • Prioritized queuing
      • Flow specific QoS
          • Latency vs. throughput
          • Loss tolerance
      • Data provenance
      • Supports push and pull models
      • Recovery/recording a rolling log of fine-grained history
      • Visual command and control
      • Flow templates
      • Pluggable, multi-tenant security
      • Designed for extension
      • Clustering
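To make the buffering, backpressure, and prioritized queuing bullets concrete, here is a minimal sketch of a bounded priority queue. It is not NiFi's implementation; the class name `BoundedPriorityQueue`, the `backpressure_threshold` parameter, and the sample items are hypothetical and exist only for illustration.

```python
import heapq
import itertools


class BoundedPriorityQueue:
    """Toy connection queue: a priority heap with an object-count threshold (hypothetical)."""

    def __init__(self, backpressure_threshold):
        self.backpressure_threshold = backpressure_threshold
        self._heap = []
        # The counter breaks ties so items with equal priority leave in arrival order.
        self._counter = itertools.count()

    def is_full(self):
        return len(self._heap) >= self.backpressure_threshold

    def offer(self, item, priority):
        """Enqueue unless backpressure is engaged; return True if accepted."""
        if self.is_full():
            return False  # the upstream producer should pause and retry later
        heapq.heappush(self._heap, (priority, next(self._counter), item))
        return True

    def poll(self):
        """Return the item with the lowest priority number, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = BoundedPriorityQueue(backpressure_threshold=3)
accepted = [
    queue.offer(name, priority)
    for name, priority in [("bulk-log", 5), ("alert", 1), ("metric", 3), ("late-log", 5)]
]
drained = [queue.poll() for _ in range(3)]
```

The fourth offer is refused because the queue has reached its threshold, and draining returns the alert first. Once the consumer has drained the queue, the producer can offer again.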
  • 11. Flowfiles Are Like HTTP Data
      HTTP Data
      • Header:
          HTTP/1.1 200 OK
          Date: Sun, 10 Oct 2010 23:26:07 GMT
          Server: Apache/2.2.8 (CentOS) OpenSSL/0.9.8g
          Last-Modified: Sun, 26 Sep 2010 22:04:35 GMT
          ETag: "45b6-834-49130cc1182c0"
          Accept-Ranges: bytes
          Content-Length: 13
          Connection: close
          Content-Type: text/html
      • Content:
          Hello world!
      FlowFile
      • Header: Standard FlowFile Attributes
          Key: 'entryDate' Value: 'Fri Jun 17 17:15:04 EDT 2016'
          Key: 'lineageStartDate' Value: 'Fri Jun 17 17:15:04 EDT 2016'
          Key: 'fileSize' Value: '23609'
      • Header: FlowFile Attribute Map Content
          Key: 'filename' Value: '15650246997242'
          Key: 'path' Value: './'
      • Content: Binary Content *
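The analogy on this slide can be sketched in a few lines: a FlowFile pairs a map of string attributes (like HTTP headers) with opaque content bytes (like the HTTP body). The `FlowFile` class, the `from_http_response` helper, and the `http.*` attribute names below are hypothetical illustrations, not NiFi's Java API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FlowFile:
    """Toy FlowFile: string attributes (the 'header') plus raw content bytes."""
    attributes: dict = field(default_factory=dict)
    content: bytes = b""

    def with_attribute(self, key, value):
        # Return a new FlowFile with one more attribute instead of mutating in place.
        return FlowFile({**self.attributes, key: value}, self.content)


def from_http_response(raw: bytes) -> FlowFile:
    """Split a raw HTTP response into attributes (headers) and content (body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    attributes = {"http.status.line": lines[0]}  # hypothetical attribute name
    for line in lines[1:]:
        name, _, value = line.partition(":")
        attributes["http.header." + name.strip().lower()] = value.strip()
    attributes["fileSize"] = str(len(body))
    return FlowFile(attributes, body)


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 13\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"Hello world!\n"
)
flowfile = from_http_response(raw)
tagged = flowfile.with_attribute("filename", "hello.html")
```

Keeping the metadata separate from the content means routing decisions can be made on attributes alone, without reading the bytes.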
  • 12. User Interface
      Less of this… … more of this
  • 13. Deeper Ecosystem Integration: 274+ Processors, 57 Controller Services
      • Actions: Hash, Extract, Merge, Duplicate, Scan, GeoEnrich, Replace, Convert, Split, Translate, Route Content, Route Context, Route Text, Control Rate, Distribute Load, Generate Table Fetch, Jolt Transform JSON, Prioritized Delivery, Encrypt, Tail, Evaluate, Execute, Fetch, Parse Records, Convert Records
      • Protocols and formats: HTTP, Syslog, Email, HTML, Image, HL7, FTP, UDP, XML, SFTP, AMQP, WebSocket
      All Apache project logos are trademarks of the ASF and the respective projects.
  • 14. Apache MiNiFi
  • 15. IoT Challenges
      • Limited computing capability
      • Limited power/network
      • Restricted software library/platform availability
      • No UI
      • Physically inaccessible
      • Not frequently updated
      • Competing standards/protocols
      • Scalability
      • Privacy & Security
      Image credit: @_lennart
  • 16. So Why Do We Need a Different Solution?
      • NiFi is designed to “own the box”
      • NiFi 0.7.x started up in about 10-15 minutes on RP3 (593 MB)
      • NiFi 1.x started up in about 30 minutes on RP3 (760 MB)
          • 33 new processors
          • Rewrite for multi-tenant authorization
          • Complete UI overhaul
  • 17. Apache NiFi Subproject: MiNiFi
      • Get the key parts of NiFi close to where data begins and provide bidirectional communication
      • NiFi lives in the data center: give it an enterprise server or a cluster of them
      • MiNiFi lives as close as possible to where data is born and is a guest on that device or system
          • IoT
          • Connected car
          • Legacy hardware
  • 18. Flavors of MiNiFi
      • MiNiFi Java (v0.5.0)
          • Modified version of NiFi
          • No UI
          • YAML configuration
          • Reduced processor count: 63+ by default, more available with additional NARs
      • MiNiFi C++ (v0.5.0)
          • Written from scratch
          • 33 processors by default
          • Bi-directional site-to-site & provenance data
  • 19. How Does MiNiFi Interact with NiFi?
      • NiFi
          • Design flows
          • Aggregate data from many sources
          • Perform routing/analysis/SEP
      • MiNiFi
          • Receive flows
          • Collect data
          • Send for processing
  • 20. Let’s Add Dimensionality
      • We’ve been imagining EDGE to CORE as a bi-directional linear system
      • Let’s expand that to the real world
  • 21. What Does MiNiFi Provide?
      • Data tagging/provenance
      • Governance from edge (geopolitical restrictions)
      • Security (encryption, certificate-based authentication)
      • Low latency (immediate reactions & decision-making)
      Figure labels: Connected Car Reference Platform Box; Tuner + DSRC Card; Connectivity Card
  • 22. MiNiFi Exfil
      • Site-to-Site
          • NiFi protocol
          • Two implementations: raw socket, HTTP(S)
          • Secured with mutual authentication TLS
      • HTTP(S), (S)FTP, JMS, Syslog, File, Email, Process
  • 23. Apache NiFi Registry
  • 24. Flow Development Lifecycle (FDLC)
      • Origins of NiFi
      • Operator Experience
          • MC data, don’t drop, mitigate temporarily
      • Version Control
      • Environment Promotion
  • 25. Operator Experience
  • 26. Component Property History
      • Shows previous values (user, time changed)
      • Sensitive values are always encrypted at rest and never returned via the API
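As a rough illustration of the property-history behavior described here, the sketch below records who changed a value and when, and masks sensitive values in the API-style view. It is a toy model with hypothetical names (`PropertyHistory`, `previous_values`, `MASK`), and it does not implement encryption at rest.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

MASK = "********"  # hypothetical placeholder returned instead of a sensitive value


@dataclass
class PropertyChange:
    user: str
    changed_at: datetime
    value: str


class PropertyHistory:
    """Toy history of one component property (not NiFi's implementation)."""

    def __init__(self, name, sensitive=False):
        self.name = name
        self.sensitive = sensitive
        self._changes = []

    def record(self, user, value, changed_at=None):
        when = changed_at or datetime.now(timezone.utc)
        self._changes.append(PropertyChange(user, when, value))

    def previous_values(self):
        """API-style view, newest first; sensitive values are never exposed."""
        return [
            {
                "user": change.user,
                "changed_at": change.changed_at.isoformat(),
                "value": MASK if self.sensitive else change.value,
            }
            for change in reversed(self._changes)
        ]


url_history = PropertyHistory("Remote URL")
url_history.record("alice", "https://dev.example.com")
url_history.record("bob", "https://qa.example.com")

password_history = PropertyHistory("Password", sensitive=True)
password_history.record("alice", "hunter2")
```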
  • 27. Exporting Flows
      • XML templates
      • Copying flow.xml.gz between systems
  • 28. Challenges
      • Templates
          • Updates/replacement
          • Sensitive property replacement
      • Flow.xml.gz migration
          • Key synchronization
      • Environment promotion
          • Approval processes
          • Verifiability
  • 29. Template Replacement
      • Export a new version of template
      • Transfer (somehow)
      • Verify?
      • Import onto canvas side-by-side existing flow
      • Stop processors
      • Empty queues
      • Reconnect queues
      • Start
      • Pray?
  • 30. Template Replacement
  • 31. NiFi Registry for Dataflows: Introducing Apache NiFi Registry 0.3.0
      • Previously, flows were exported via XML templates
          • Didn’t contain sensitive values
          • Couldn’t be updated in-place
          • No tracking system
      • NiFi Registry brings asset management as first-class citizen to NiFi
      • Flows can be versioned
  • 32. Flows Can Be Promoted Between Environments
      • Connect multiple NiFi instances to a NiFi Registry instance
      • Communicate between multiple NiFi Registry instances
          • via multiple Registry Clients
          • via NiFi CLI
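The idea of versioned flows promoted between environments can be sketched with a toy in-memory model. `ToyRegistry` and `promote` are hypothetical names, and a plain dict stands in for a flow snapshot; nothing here calls NiFi Registry itself.

```python
import copy


class ToyRegistry:
    """Toy stand-in for a registry: bucket name -> flow name -> list of versions."""

    def __init__(self, name):
        self.name = name
        self._buckets = {}

    def commit(self, bucket, flow, snapshot, comment=""):
        versions = self._buckets.setdefault(bucket, {}).setdefault(flow, [])
        versions.append({"snapshot": copy.deepcopy(snapshot), "comment": comment})
        return len(versions)  # version numbers start at 1

    def export(self, bucket, flow, version=None):
        versions = self._buckets[bucket][flow]
        chosen = versions[-1] if version is None else versions[version - 1]
        return copy.deepcopy(chosen["snapshot"])


def promote(source, target, bucket, flow, version=None):
    """Export a version from one registry and commit it to another."""
    snapshot = source.export(bucket, flow, version)
    return target.commit(bucket, flow, snapshot, comment=f"promoted from {source.name}")


dev = ToyRegistry("dev")
prod = ToyRegistry("prod")
dev.commit("ingest", "logs-to-hdfs", {"processors": ["TailFile", "PutHDFS"]})
dev.commit("ingest", "logs-to-hdfs", {"processors": ["TailFile", "MergeContent", "PutHDFS"]})
prod_version = promote(dev, prod, "ingest", "logs-to-hdfs", version=2)
```

Each environment keeps its own version numbering, so version 2 in the development registry becomes version 1 in production here.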
  • 33. Extensibility
      • Git-backed persistence
          • Share flows via GitHub, etc.
      • Commit hooks
          • Register a hook & action
          • “When a new version of the flow is committed to QA Registry, email the QA team and post in the QA Deploy Slack channel”
      • Pluggable DB implementations
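The "register a hook & action" idea can be illustrated with a small dispatcher. This is a generic sketch, not NiFi Registry's hook mechanism; `HookRegistry`, the event name, and both actions are hypothetical, and the actions return strings rather than sending real email or Slack messages.

```python
class HookRegistry:
    """Toy event dispatcher: maps an event name to a list of actions."""

    def __init__(self):
        self._hooks = {}

    def register(self, event, action):
        self._hooks.setdefault(event, []).append(action)

    def fire(self, event, **details):
        # Run every action registered for the event, in registration order.
        return [action(**details) for action in self._hooks.get(event, [])]


def email_qa_team(flow, version, **_):
    return f"email: QA team, {flow} v{version} is ready"


def post_to_slack(flow, version, **_):
    return f"slack #qa-deploy: {flow} v{version} committed"


hooks = HookRegistry()
hooks.register("flow_version_committed", email_qa_team)
hooks.register("flow_version_committed", post_to_slack)
notifications = hooks.fire("flow_version_committed", flow="logs-to-hdfs", version=3)
```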
  • 34. Demo
  • 35. Create Registry
      • Install nifi-registry
          • $ mvn clean install
          • $ ./bin/nifi-registry.sh start
      • Browse to http://localhost:18080
  • 36. Create Bucket
  • 37. Connect to NiFi
  • 38. Create Process Group
  • 39. Commit Version
  • 40. View Flow in Registry
  • 41. Import New Instance into NiFi
  • 42. Modify the Original Flow
  • 43. See Local Changes Before Committing
  • 44. Commit
  • 45. Update New Instance from Registry
  • 46. Complementary Tools
  • 47. Complementary Tools
      • NiFi Toolkit
      • NiPyAPI
      • MiNiFi Converter Toolkit
  • 48. NiFi Toolkit
      • TLS Toolkit
          • Generates, signs, and packages keys and certificates for NiFi services (node/cluster, clients)
      • Encrypt Config
          • Protects sensitive configuration values like passwords
      • CLI
          • Interacts with NiFi & NiFi Registry to operate on flows
  • 49. NiPyAPI
      • Python wrapper around NiFi REST API
      • Community-provided by Daniel Chaffelson
      • Exposes common operations for automation, batch processing, recursion, etc.

          import nipyapi

          # dev_bucket_name and dev_ver_flow_name are defined elsewhere (not shown on the slide)
          dev_bucket = nipyapi.versioning.get_registry_bucket(dev_bucket_name)
          dev_ver_flow = nipyapi.versioning.get_flow_in_bucket(
              dev_bucket.identifier,
              identifier=dev_ver_flow_name
          )
          # Export the latest version of the flow as YAML
          dev_export = nipyapi.versioning.export_flow_version(
              bucket_id=dev_bucket.identifier,
              flow_id=dev_ver_flow.identifier,
              mode='yaml'
          )

  • 50. MiNiFi Converter Toolkit
      • Save as template from NiFi
      • Run $ ./bin/config.sh transform template.xml config.yml
      • MiNiFi flow ready to run
  • 51. Community
  • 52. More Resources
      • FDLC with Apache NiFi, Kevin Doran
      • NiPyAPI Docs, Daniel Chaffelson
      • DevOps Tips, Tim Spann
      • Automate Workflow, Pierre Villard
  • 53. New Announcements
      • NiFi 1.8.0: … Oct 2018 (170+ Jiras)
          • Jetty, DB improvements
          • Auto load-balancing queues
          • TLS Toolkit w/ external CA
          • Record processor improvements
      • MiNiFi C++ 0.5.0: 6 June 2018
      • MiNiFi Java 0.5.0: 7 July 2018
      • NiFi Registry 0.3.0: 25 Sept 2018
  • 54. Community Health
  • 55. Learn More and Join Us
      • Apache NiFi site: https://nifi.apache.org
      • Subproject MiNiFi site: https://nifi.apache.org/minifi/
      • Subscribe to and collaborate at: dev@nifi.apache.org, users@nifi.apache.org
      • Submit Ideas or Issues: https://issues.apache.org/jira/browse/NIFI
      • Follow us on Twitter: @apachenifi
  • 56. Thank you
      alopresto@hortonworks.com | alopresto@apache.org | @yolopey
      github.com/alopresto/slides