SlideShare uma empresa Scribd logo
1 de 47
Apache NiFi Crash
Course Intro
Joe Percivall - @JPercivall
Hadoop Summit – Melbourne
1 Sept 2016
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
About Me
‱ Software Engineer at Hortonworks
‱ Apache NiFi committer and PMC member
‱ Github: github.com/JPercivall
3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Agenda
What is dataflow and what are the challenges?
Apache NiFi
Architecture
Live Demo
Community
4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Agenda
What is dataflow and what are the challenges?
Apache NiFi
Architecture
Live Demo
Community
5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Let’s Connect A to B
Producers A.K.A Things
Anything
AND
Everything
Internet!
Consumers
‱ User
‱ Storage
‱ System
‱ 
More Things
6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Moving data effectively is hard
Standards: http://xkcd.com/927/
7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Why is moving data effectively hard?
 Standards
 Formats
 “Exactly Once” Delivery
 Protocols
 Veracity of Information
 Validity of Information
 Ensuring Security
 Overcoming Security
 Compliance
 Schemas
 Consumers Change
 Credential Management
 “That [person|team|group]”
 Network
 “Exactly Once” Delivery
8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Let’s Connect Lots of As to Bs to As to Cs to Bs to Δs to Cs to ϕs
Let’s consider the needs of a courier service
Physical Store
Gateway
Server
Mobile Devices
Registers
Server Cluster
Distribution Center Core Data Center at HQ
Server Cluster
On Delivery Routes
Trucks Deliverers
Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/
Deliverer: Rigo Peter, https://thenounproject.com/rigo/
Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/
Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/
9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Great! I am collecting all this data! Let’s use it!
Finding our needles in the haystack
Physical Store
Gateway
Server
Mobile Devices
Registers
Server Cluster
Distribution Center
Kafka
Core Data Center at HQ
Server Cluster
Others
Storm / Spark /
Flink / Apex
Kafka
Storm / Spark / Flink / Apex
On Delivery Routes
Trucks Deliverers
Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/
Deliverer: Rigo Peter, https://thenounproject.com/rigo/
Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/
Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/
10 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Let’s Connect Lots of As to Bs to As to Cs to Bs to Δs to Cs to ϕs
Oh, that courier service is global
11 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Agenda
What is dataflow and what are the challenges?
Apache NiFi
Architecture
Live Demo
Community
12 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
‱ Web-based User Interface for creating, monitoring,
& controlling data flows
‱ Directed graphs of data routing and transformation
‱ Highly configurable - modify data flow at runtime,
dynamically prioritize data
‱ Easily extensible through development of custom
components
‱ Data Provenance tracks data through entire system
[1] https://nifi.apache.org/
Dataflow
Apache NiFi
13 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Apache NiFi
Key Features
‱ Guaranteed delivery
‱ Data buffering
- Backpressure
- Pressure release
‱ Prioritized queuing
‱ Flow specific QoS
- Latency vs. throughput
- Loss tolerance
‱ Data provenance
‱ Supports push and pull
models
‱ Recovery/recording
a rolling log of fine-
grained history
‱ Visual command and
control
‱ Flow templates
‱ Pluggable/multi-role
security*
‱ Designed for extension
‱ Clustering
14 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Revisit: Courier service from the perspective of NiFi
Physical Store
Gateway
Server
Mobile Devices
Registers
Server Cluster
Distribution Center Core Data Center at HQ
Server Cluster
Trucks Deliverers
Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/
Deliverer: Rigo Peter, https://thenounproject.com/rigo/
Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/
Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/
NiFi NiFi NiFi NiFi NiFi NiFi
On Delivery Routes
15 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Courier service from the perspective of NiFi & MiNiFi
Physical Store
Gateway
Server
Mobile Devices
Registers
Server Cluster
Distribution Center Core Data Center at HQ
Server Cluster
Trucks Deliverers
Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/
Deliverer: Rigo Peter, https://thenounproject.com/rigo/
Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/
Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/
Client
Libraries
Client
Libraries
MiNiFi
MiNiFi
NiFi NiFi NiFi NiFi NiFi NiFi
Client
Libraries
On Delivery Routes
16 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Apache NiFi MiNiFi
Key Features
‱ Guaranteed delivery
‱ Data buffering
- Backpressure
- Pressure release
‱ Prioritized queuing
‱ Flow specific QoS
- Latency vs. throughput
- Loss tolerance
‱ Data provenance
‱ Recovery/recording
a rolling log of fine-
grained history
‱ Designed for extension
‱ Design and Deploy
‱ Warm re-deploys
17 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Apache NiFi MiNiFi
Key Features
‱ Guaranteed delivery
‱ Data buffering
- Backpressure
- Pressure release
‱ Prioritized queuing
‱ Flow specific QoS
- Latency vs. throughput
- Loss tolerance
‱ Data provenance
‱ Recovery/recording
a rolling log of fine-
grained history
‱ Designed for extension
‱ Design and Deploy
‱ Warm re-deploys
18 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Visual Command and Control
vs.
Design and Deploy
19 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Apache NiFi Managed Dataflow
SOURCES
REGIONAL
INFRASTRUCTURE
CORE
INFRASTRUCTURE
20 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
NiFi is based on Flow Based Programming (FBP)
FBP Term NiFi Term Description
Information
Packet
FlowFile Each object moving through the system.
Black Box FlowFile
Processor
Performs the work, doing some combination of data routing, transformation,
or mediation between systems.
Bounded
Buffer
Connection The linkage between processors, acting as queues and allowing various
processes to interact at differing rates.
Scheduler Flow
Controller
Maintains the knowledge of how processes are connected, and manages the
threads and allocations thereof which all processes use.
Subnet Process
Group
A set of processes and their connections, which can receive and send data via
ports. A process group allows creation of entirely new component simply by
composition of its components.
21 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
FlowFiles & Data Agnosticism
 NiFi is data agnostic!
 But, NiFi was designed understanding that users
can care about specifics and provides tooling
to interact with specific formats, protocols, etc.
ISO 8601 - http://xkcd.com/1179/
Robustness principle
Be conservative in what you do,
be liberal in what you accept from others“
22 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
FlowFiles are like HTTP data
HTTP Data FlowFile
HTTP/1.1 200 OK
Date: Sun, 10 Oct 2010 23:26:07 GMT
Server: Apache/2.2.8 (CentOS) OpenSSL/0.9.8g
Last-Modified: Sun, 26 Sep 2010 22:04:35 GMT
ETag: "45b6-834-49130cc1182c0"
Accept-Ranges: bytes
Content-Length: 13
Connection: close
Content-Type: text/html
Hello world!
Standard FlowFile Attributes
Key: 'entryDate’ Value: 'Fri Jun 17 17:15:04 EDT 2016'
Key: 'lineageStartDate’ Value: 'Fri Jun 17 17:15:04 EDT 2016'
Key: 'fileSize’ Value: '23609'
FlowFile Attribute Map Content
Key: 'filename’ Value: '15650246997242'
Key: 'path’ Value: './’
Binary Content *
Header
Content
23 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Agenda
What is dataflow and what are the challenges?
Apache NiFi
Architecture
Live Demo
Community
24 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
OS/Host
JVM
Flow Controller
Web Server
Processor 1 Extension N
FlowFile
Repository
Content
Repository
Provenance
Repository
Local Storage
OS/Host
JVM
Flow Controller
Web Server
Processor 1 Extension N
FlowFile
Repository
Content
Repository
Provenance
Repository
Local Storage
NiFi Architecture 0.x line OS/Host
JVM
NiFi Cluster Manger – Request Replicator
Web Server
Master
NiFi Cluster
Manager (NCM)
OS/Host
JVM
Flow Controller
Web Server
Processor 1 Extension N
FlowFile
Repository
Content
Repository
Provenance
Repository
Local Storage
Slaves
NiFi Nodes
25 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
NiFi Architecture 1.x line
26 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
NiFi Architecture 1.x line
27 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
NiFi Architecture – Repositories - Pass by reference
FlowFile Content Provenance
F1 C1 C1 P1 F1
Excerpt of demo flow
 What’s happening inside the repositories

BEFORE
AFTER
F2 C1 C1 P3 F2 – Clone (F1)
F1 C1 P2 F1 – Route
P1 F1 – Create
28 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
NiFi Architecture – Repositories – Copy on Write
FlowFile Content Provenance
F1 C1 C1 P1 F1 - CREATE
Excerpt of demo flow
 What’s happening inside the repositories

BEFORE
AFTER
F1 C1
F1.1 C2 C2 (encrypted)
C1 (plaintext)
P2 F1.1 - MODIFY
P1 F1 - CREATE
29 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
NiFi vs MiNiFi Java Processes
NiFi Framework
Components
MiNiFi
NiFi Framework
User Interface
Components
NiFi
30 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
NiFi Java Processes
Bootstrap
NiFi
UI
bootstrap.conf
nifi.properties
flow.xml.gzreads &
modifies
reads
reads
starts
NiFi MiNiFi
31 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
MiNiFi Java Processes
MiNiFi
Bootstrap
Configuration
Change Notifier(s)
bootstrap.conf
nifi.properties
flow.xml.gz
reads
reads
starts
config.ymltransforms
reads
into
NiFi MiNiFi
32 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Simple Config.yml
Tail a rolling file -> Site to Site
33 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Config Change Notifiers
 Two implementations
– RestChangeNotifier
‱ Http(s)
– FileChangeNotifier
 Configured in bootstrap.conf
34 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Change notifier update
MiNiFi
Bootstrap
Configuration
Change Notifiers
1. Initial state
–Both running
35 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Change notifier update
MiNiFi
Bootstrap
Configuration
Change Notifiers
user creates new configuration
2. User sends update through
notifier
–HTTP(S) post request
–Change watched file
36 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Change notifier update
MiNiFi
Bootstrap
Configuration
Change Notifiers
3. Bootstrap validation
–Basic validation
–Rest notifier will respond
accordingly
–Results logged
validate new configuration
user creates new configuration
37 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Change notifier update
MiNiFi
Bootstrap
Configuration
Change Notifiers
config.yml
saves new
4. Bootstrap saves and
transforms
–Copy old config.yml to a
swap file
validate new configuration
user creates new configuration
nifi.properties
flow.xml.gz
transforms into
38 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Change notifier update
MiNiFi
Bootstrap
Configuration
Change Notifiers
nifi.properties
flow.xml.gz
attempt restart
config.yml
saves new
reads
transforms into
5. Bootstrap attempts restart
–MiNiFi reads in the new
nifi.properties and
flow.xml.gz
validate new configuration
user creates new configuration
39 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Change notifier update
6. Success or Fail
–Successful restart continue
processing
–Failure, rollback to old
config
–Existing Data is mapped or
orphaned
MiNiFi
Bootstrap
Configuration
Change Notifiers
nifi.properties
flow.xml.gz
attempt restart
config.yml
saves new
reads
transforms into
validate new configuration
user creates new configuration
40 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Agenda
What is dataflow and what are the challenges?
Apache NiFi
Architecture
Demo
Community
41 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Agenda
What is dataflow and what are the challenges?
Apache NiFi
Architecture
Demo
Community
42 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Matured at NSA 2006-2014
Brief history of the Apache NiFi Community
‱ Contributors from Government and several commercial industries
‱ Releases on a 6-8 week schedule
Code developed
at NSA
2006
Today
Achieved TLP
status in just
7 months
July 2015
Code available
open source
ASL v2
November 2014
43 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
MiNiFi Prospective Plans - Centralized Command and Control
 Design at a centralized place, deploy on the edge
– Flow deployment
– NAR deployment
– Agent deployment
 Version control of flows
 Agent status monitoring
 Bi-directional command and control
Centralized management console with a UI
44 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Hortonworks Dataflow
Ingestion
Simple Event Processing
Engine
Stream Processing
DestinationData Bus
45 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Questions?
46 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Learn more and join us!
Apache NiFi site
http://nifi.apache.org
Subproject MiNiFi site
http://nifi.apache.org/minifi/
Subscribe to and collaborate at
dev@nifi.apache.org
users@nifi.apache.org
Submit Ideas or Issues
https://issues.apache.org/jira/browse/NIFI
Follow us on Twitter
@apachenifi
47 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Thank you!

Mais conteĂșdo relacionado

Mais procurados

Introduction to Apache NiFi dws19 DWS - DC 2019
Introduction to Apache NiFi   dws19 DWS - DC 2019Introduction to Apache NiFi   dws19 DWS - DC 2019
Introduction to Apache NiFi dws19 DWS - DC 2019Timothy Spann
 
Nifi workshop
Nifi workshopNifi workshop
Nifi workshopYifeng Jiang
 
Apache Nifi Crash Course
Apache Nifi Crash CourseApache Nifi Crash Course
Apache Nifi Crash CourseDataWorks Summit
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...GetInData
 
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseDataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseAldrin Piri
 
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFiReal-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFiTimothy Spann
 
Introduction to data flow management using apache nifi
Introduction to data flow management using apache nifiIntroduction to data flow management using apache nifi
Introduction to data flow management using apache nifiAnshuman Ghosh
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataWorks Summit
 
Data Ingest Self Service and Management using Nifi and Kafka
Data Ingest Self Service and Management using Nifi and KafkaData Ingest Self Service and Management using Nifi and Kafka
Data Ingest Self Service and Management using Nifi and KafkaDataWorks Summit
 
Apache NiFi Record Processing
Apache NiFi Record ProcessingApache NiFi Record Processing
Apache NiFi Record ProcessingBryan Bende
 
Data ingestion and distribution with apache NiFi
Data ingestion and distribution with apache NiFiData ingestion and distribution with apache NiFi
Data ingestion and distribution with apache NiFiLev Brailovskiy
 
NiFi Best Practices for the Enterprise
NiFi Best Practices for the EnterpriseNiFi Best Practices for the Enterprise
NiFi Best Practices for the EnterpriseGregory Keys
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and FlinkBryan Bende
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiManish Gupta
 
Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016Timothy Spann
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentFlink Forward
 

Mais procurados (20)

Introduction to Apache NiFi dws19 DWS - DC 2019
Introduction to Apache NiFi   dws19 DWS - DC 2019Introduction to Apache NiFi   dws19 DWS - DC 2019
Introduction to Apache NiFi dws19 DWS - DC 2019
 
Nifi workshop
Nifi workshopNifi workshop
Nifi workshop
 
Apache Nifi Crash Course
Apache Nifi Crash CourseApache Nifi Crash Course
Apache Nifi Crash Course
 
Hadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash CourseHadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash Course
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
 
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseDataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
 
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFiReal-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
 
Introduction to data flow management using apache nifi
Introduction to data flow management using apache nifiIntroduction to data flow management using apache nifi
Introduction to data flow management using apache nifi
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFi
 
Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem
 
Data Ingest Self Service and Management using Nifi and Kafka
Data Ingest Self Service and Management using Nifi and KafkaData Ingest Self Service and Management using Nifi and Kafka
Data Ingest Self Service and Management using Nifi and Kafka
 
Apache NiFi Record Processing
Apache NiFi Record ProcessingApache NiFi Record Processing
Apache NiFi Record Processing
 
Data ingestion and distribution with apache NiFi
Data ingestion and distribution with apache NiFiData ingestion and distribution with apache NiFi
Data ingestion and distribution with apache NiFi
 
NiFi Best Practices for the Enterprise
NiFi Best Practices for the EnterpriseNiFi Best Practices for the Enterprise
NiFi Best Practices for the Enterprise
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and Flink
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFi
 
Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016
 
Openstack 101
Openstack 101Openstack 101
Openstack 101
 
Ipfs
IpfsIpfs
Ipfs
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production Deployment
 

Semelhante a Apache NiFi Crash Course Intro

Apache NiFi Crash Course - San Jose Hadoop Summit
Apache NiFi Crash Course - San Jose Hadoop SummitApache NiFi Crash Course - San Jose Hadoop Summit
Apache NiFi Crash Course - San Jose Hadoop SummitAldrin Piri
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiJoe Percivall
 
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiData at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiAldrin Piri
 
Connecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFiConnecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFiDataWorks Summit
 
Future of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveFuture of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveAldrin Piri
 
State of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityState of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityAccumulo Summit
 
You Can't Search Without Data
You Can't Search Without DataYou Can't Search Without Data
You Can't Search Without DataBryan Bende
 
Mission to NARs with Apache NiFi
Mission to NARs with Apache NiFiMission to NARs with Apache NiFi
Mission to NARs with Apache NiFiHortonworks
 
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Data Con LA
 
Apache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesApache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesIsheeta Sanghi
 
MiNiFi 0.0.1 MeetUp talk
MiNiFi 0.0.1 MeetUp talkMiNiFi 0.0.1 MeetUp talk
MiNiFi 0.0.1 MeetUp talkJoe Percivall
 
Apache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming MeetupApache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming MeetupJoseph Witt
 
Introduction to Apache NiFi - Seattle Scalability Meetup
Introduction to Apache NiFi - Seattle Scalability MeetupIntroduction to Apache NiFi - Seattle Scalability Meetup
Introduction to Apache NiFi - Seattle Scalability MeetupSaptak Sen
 
HDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionHDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionMilind Pandit
 
Integrating Apache NiFi and Apache Apex
Integrating Apache NiFi and Apache Apex Integrating Apache NiFi and Apache Apex
Integrating Apache NiFi and Apache Apex Apache Apex
 
Integrating NiFi and Apex
Integrating NiFi and ApexIntegrating NiFi and Apex
Integrating NiFi and ApexBryan Bende
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataWorks Summit
 
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data AnalysisApache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data AnalysisDataWorks Summit/Hadoop Summit
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIsheeta Sanghi
 

Semelhante a Apache NiFi Crash Course Intro (20)

Apache NiFi Crash Course - San Jose Hadoop Summit
Apache NiFi Crash Course - San Jose Hadoop SummitApache NiFi Crash Course - San Jose Hadoop Summit
Apache NiFi Crash Course - San Jose Hadoop Summit
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
 
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiData at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
 
Connecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFiConnecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFi
 
Future of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveFuture of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep Dive
 
State of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & CommunityState of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & Community
 
You Can't Search Without Data
You Can't Search Without DataYou Can't Search Without Data
You Can't Search Without Data
 
Mission to NARs with Apache NiFi
Mission to NARs with Apache NiFiMission to NARs with Apache NiFi
Mission to NARs with Apache NiFi
 
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
 
Apache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesApache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup Slides
 
MiNiFi 0.0.1 MeetUp talk
MiNiFi 0.0.1 MeetUp talkMiNiFi 0.0.1 MeetUp talk
MiNiFi 0.0.1 MeetUp talk
 
Apache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming MeetupApache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming Meetup
 
Introduction to Apache NiFi - Seattle Scalability Meetup
Introduction to Apache NiFi - Seattle Scalability MeetupIntroduction to Apache NiFi - Seattle Scalability Meetup
Introduction to Apache NiFi - Seattle Scalability Meetup
 
HDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi IntroductionHDF Powered by Apache NiFi Introduction
HDF Powered by Apache NiFi Introduction
 
Integrating Apache NiFi and Apache Apex
Integrating Apache NiFi and Apache Apex Integrating Apache NiFi and Apache Apex
Integrating Apache NiFi and Apache Apex
 
Integrating NiFi and Apex
Integrating NiFi and ApexIntegrating NiFi and Apex
Integrating NiFi and Apex
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFi
 
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data AnalysisApache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
 
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache FlinkIntegrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
 

Mais de DataWorks Summit/Hadoop Summit

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionDataWorks Summit/Hadoop Summit
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinDataWorks Summit/Hadoop Summit
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerDataWorks Summit/Hadoop Summit
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformDataWorks Summit/Hadoop Summit
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinDataWorks Summit/Hadoop Summit
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDataWorks Summit/Hadoop Summit
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...DataWorks Summit/Hadoop Summit
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLDataWorks Summit/Hadoop Summit
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient DataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)DataWorks Summit/Hadoop Summit
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopDataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...DataWorks Summit/Hadoop Summit
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesDataWorks Summit/Hadoop Summit
 

Mais de DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
 

Último

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 

Último (20)

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 

Apache NiFi Crash Course Intro

  • 1. Apache NiFi Crash Course Intro Joe Percivall - @JPercivall Hadoop Summit – Melbourne 1 Sept 2016
  • 2. 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved About Me ‱ Software Engineer at Hortonworks ‱ Apache NiFi committer and PMC member ‱ Github: github.com/JPercivall
  • 3. 3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Agenda What is dataflow and what are the challenges? Apache NiFi Architecture Live Demo Community
  • 4. 4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Agenda What is dataflow and what are the challenges? Apache NiFi Architecture Live Demo Community
  • 5. 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Let’s Connect A to B Producers A.K.A Things Anything AND Everything Internet! Consumers ‱ User ‱ Storage ‱ System ‱ 
More Things
  • 6. 6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Moving data effectively is hard Standards: http://xkcd.com/927/
  • 7. 7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Why is moving data effectively hard?  Standards  Formats  “Exactly Once” Delivery  Protocols  Veracity of Information  Validity of Information  Ensuring Security  Overcoming Security  Compliance  Schemas  Consumers Change  Credential Management  “That [person|team|group]”  Network  “Exactly Once” Delivery
  • 8. 8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Let’s Connect Lots of As to Bs to As to Cs to Bs to Δs to Cs to ϕs Let’s consider the needs of a courier service Physical Store Gateway Server Mobile Devices Registers Server Cluster Distribution Center Core Data Center at HQ Server Cluster On Delivery Routes Trucks Deliverers Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/ Deliverer: Rigo Peter, https://thenounproject.com/rigo/ Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/ Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/
  • 9. 9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Great! I am collecting all this data! Let’s use it! Finding our needles in the haystack Physical Store Gateway Server Mobile Devices Registers Server Cluster Distribution Center Kafka Core Data Center at HQ Server Cluster Others Storm / Spark / Flink / Apex Kafka Storm / Spark / Flink / Apex On Delivery Routes Trucks Deliverers Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/ Deliverer: Rigo Peter, https://thenounproject.com/rigo/ Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/ Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/
  • 10. 10 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Let’s Connect Lots of As to Bs to As to Cs to Bs to Δs to Cs to ϕs Oh, that courier service is global
  • 11. 11 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Agenda What is dataflow and what are the challenges? Apache NiFi Architecture Live Demo Community
  • 12. 12 © Hortonworks Inc. 2011 – 2016. All Rights Reserved ‱ Web-based User Interface for creating, monitoring, & controlling data flows ‱ Directed graphs of data routing and transformation ‱ Highly configurable - modify data flow at runtime, dynamically prioritize data ‱ Easily extensible through development of custom components ‱ Data Provenance tracks data through entire system [1] https://nifi.apache.org/ Dataflow Apache NiFi
  • 13. 13 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Apache NiFi Key Features ‱ Guaranteed delivery ‱ Data buffering - Backpressure - Pressure release ‱ Prioritized queuing ‱ Flow specific QoS - Latency vs. throughput - Loss tolerance ‱ Data provenance ‱ Supports push and pull models ‱ Recovery/recording a rolling log of fine- grained history ‱ Visual command and control ‱ Flow templates ‱ Pluggable/multi-role security* ‱ Designed for extension ‱ Clustering
  • 14. 14 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Revisit: Courier service from the perspective of NiFi Physical Store Gateway Server Mobile Devices Registers Server Cluster Distribution Center Core Data Center at HQ Server Cluster Trucks Deliverers Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/ Deliverer: Rigo Peter, https://thenounproject.com/rigo/ Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/ Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/ NiFi NiFi NiFi NiFi NiFi NiFi On Delivery Routes
  • 15. 15 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Courier service from the perspective of NiFi & MiNiFi Physical Store Gateway Server Mobile Devices Registers Server Cluster Distribution Center Core Data Center at HQ Server Cluster Trucks Deliverers Delivery Truck: Creative Stall, https://thenounproject.com/creativestall/ Deliverer: Rigo Peter, https://thenounproject.com/rigo/ Cash Register: Sergey Patutin, https://thenounproject.com/bdesign.by/ Hand Scanner: Eric Pearson, https://thenounproject.com/epearson001/ Client Libraries Client Libraries MiNiFi MiNiFi NiFi NiFi NiFi NiFi NiFi NiFi Client Libraries On Delivery Routes
  • 16. 16 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Apache NiFi MiNiFi Key Features ‱ Guaranteed delivery ‱ Data buffering - Backpressure - Pressure release ‱ Prioritized queuing ‱ Flow specific QoS - Latency vs. throughput - Loss tolerance ‱ Data provenance ‱ Recovery/recording a rolling log of fine- grained history ‱ Designed for extension ‱ Design and Deploy ‱ Warm re-deploys
  • 17. 17 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Apache NiFi MiNiFi Key Features ‱ Guaranteed delivery ‱ Data buffering - Backpressure - Pressure release ‱ Prioritized queuing ‱ Flow specific QoS - Latency vs. throughput - Loss tolerance ‱ Data provenance ‱ Recovery/recording a rolling log of fine- grained history ‱ Designed for extension ‱ Design and Deploy ‱ Warm re-deploys
  • 18. 18 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Visual Command and Control vs. Design and Deploy
  • 19. 19 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Apache NiFi Managed Dataflow SOURCES REGIONAL INFRASTRUCTURE CORE INFRASTRUCTURE
  • 20. 20 © Hortonworks Inc. 2011 – 2016. All Rights Reserved NiFi is based on Flow Based Programming (FBP) FBP Term NiFi Term Description Information Packet FlowFile Each object moving through the system. Black Box FlowFile Processor Performs the work, doing some combination of data routing, transformation, or mediation between systems. Bounded Buffer Connection The linkage between processors, acting as queues and allowing various processes to interact at differing rates. Scheduler Flow Controller Maintains the knowledge of how processes are connected, and manages the threads and allocations thereof which all processes use. Subnet Process Group A set of processes and their connections, which can receive and send data via ports. A process group allows creation of entirely new component simply by composition of its components.
  • 21. 21 © Hortonworks Inc. 2011 – 2016. All Rights Reserved FlowFiles & Data Agnosticism  NiFi is data agnostic!  But, NiFi was designed understanding that users can care about specifics and provides tooling to interact with specific formats, protocols, etc. ISO 8601 - http://xkcd.com/1179/ Robustness principle Be conservative in what you do, be liberal in what you accept from others“
  • 22. 22 © Hortonworks Inc. 2011 – 2016. All Rights Reserved FlowFiles are like HTTP data HTTP Data FlowFile HTTP/1.1 200 OK Date: Sun, 10 Oct 2010 23:26:07 GMT Server: Apache/2.2.8 (CentOS) OpenSSL/0.9.8g Last-Modified: Sun, 26 Sep 2010 22:04:35 GMT ETag: "45b6-834-49130cc1182c0" Accept-Ranges: bytes Content-Length: 13 Connection: close Content-Type: text/html Hello world! Standard FlowFile Attributes Key: 'entryDate’ Value: 'Fri Jun 17 17:15:04 EDT 2016' Key: 'lineageStartDate’ Value: 'Fri Jun 17 17:15:04 EDT 2016' Key: 'fileSize’ Value: '23609' FlowFile Attribute Map Content Key: 'filename’ Value: '15650246997242' Key: 'path’ Value: './’ Binary Content * Header Content
  • 23. 23 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Agenda What is dataflow and what are the challenges? Apache NiFi Architecture Live Demo Community
  • 24. 24 © Hortonworks Inc. 2011 – 2016. All Rights Reserved OS/Host JVM Flow Controller Web Server Processor 1 Extension N FlowFile Repository Content Repository Provenance Repository Local Storage OS/Host JVM Flow Controller Web Server Processor 1 Extension N FlowFile Repository Content Repository Provenance Repository Local Storage NiFi Architecture 0.x line OS/Host JVM NiFi Cluster Manger – Request Replicator Web Server Master NiFi Cluster Manager (NCM) OS/Host JVM Flow Controller Web Server Processor 1 Extension N FlowFile Repository Content Repository Provenance Repository Local Storage Slaves NiFi Nodes
  • 25. 25 © Hortonworks Inc. 2011 – 2016. All Rights Reserved NiFi Architecture 1.x line
  • 26. 26 © Hortonworks Inc. 2011 – 2016. All Rights Reserved NiFi Architecture 1.x line
  • 27. 27 © Hortonworks Inc. 2011 – 2016. All Rights Reserved NiFi Architecture – Repositories - Pass by reference FlowFile Content Provenance F1 C1 C1 P1 F1 Excerpt of demo flow
 What’s happening inside the repositories
 BEFORE AFTER F2 C1 C1 P3 F2 – Clone (F1) F1 C1 P2 F1 – Route P1 F1 – Create
  • 28. 28 © Hortonworks Inc. 2011 – 2016. All Rights Reserved NiFi Architecture – Repositories – Copy on Write FlowFile Content Provenance F1 C1 C1 P1 F1 - CREATE Excerpt of demo flow
 What’s happening inside the repositories
 BEFORE AFTER F1 C1 F1.1 C2 C2 (encrypted) C1 (plaintext) P2 F1.1 - MODIFY P1 F1 - CREATE
  • 29. 29 © Hortonworks Inc. 2011 – 2016. All Rights Reserved NiFi vs MiNiFi Java Processes NiFi Framework Components MiNiFi NiFi Framework User Interface Components NiFi
  • 30. 30 © Hortonworks Inc. 2011 – 2016. All Rights Reserved NiFi Java Processes Bootstrap NiFi UI bootstrap.conf nifi.properties flow.xml.gzreads & modifies reads reads starts NiFi MiNiFi
  • 31. 31 © Hortonworks Inc. 2011 – 2016. All Rights Reserved MiNiFi Java Processes MiNiFi Bootstrap Configuration Change Notifier(s) bootstrap.conf nifi.properties flow.xml.gz reads reads starts config.ymltransforms reads into NiFi MiNiFi
  • 32. 32 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Simple Config.yml Tail a rolling file -> Site to Site
  • 33. 33 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Config Change Notifiers  Two implementations – RestChangeNotifier ‱ Http(s) – FileChangeNotifier  Configured in bootstrap.conf
  • 34. 34 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Change notifier update MiNiFi Bootstrap Configuration Change Notifiers 1. Initial state –Both running
  • 35. 35 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Change notifier update MiNiFi Bootstrap Configuration Change Notifiers user creates new configuration 2. User sends update through notifier –HTTP(S) post request –Change watched file
  • 36. 36 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Change notifier update MiNiFi Bootstrap Configuration Change Notifiers 3. Bootstrap validation –Basic validation –Rest notifier will respond accordingly –Results logged validate new configuration user creates new configuration
  • 37. 37 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Change notifier update MiNiFi Bootstrap Configuration Change Notifiers config.yml saves new 4. Bootstrap saves and transforms –Copy old config.yml to a swap file validate new configuration user creates new configuration nifi.properties flow.xml.gz transforms into
  • 38. 38 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Change notifier update MiNiFi Bootstrap Configuration Change Notifiers nifi.properties flow.xml.gz attempt restart config.yml saves new reads transforms into 5. Bootstrap attempts restart –MiNiFi reads in the new nifi.properties and flow.xml.gz validate new configuration user creates new configuration
  • 39. 39 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Change notifier update 6. Success or Fail –Successful restart continue processing –Failure, rollback to old config –Existing Data is mapped or orphaned MiNiFi Bootstrap Configuration Change Notifiers nifi.properties flow.xml.gz attempt restart config.yml saves new reads transforms into validate new configuration user creates new configuration
  • 40. 40 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Agenda What is dataflow and what are the challenges? Apache NiFi Architecture Demo Community
  • 41. 41 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Agenda What is dataflow and what are the challenges? Apache NiFi Architecture Demo Community
  • 42. 42 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Matured at NSA 2006-2014 Brief history of the Apache NiFi Community ‱ Contributors from Government and several commercial industries ‱ Releases on a 6-8 week schedule Code developed at NSA 2006 Today Achieved TLP status in just 7 months July 2015 Code available open source ASL v2 November 2014
  • 43. 43 © Hortonworks Inc. 2011 – 2016. All Rights Reserved MiNiFi Prospective Plans - Centralized Command and Control  Design at a centralized place, deploy on the edge – Flow deployment – NAR deployment – Agent deployment  Version control of flows  Agent status monitoring  Bi-directional command and control Centralized management console with a UI
  • 44. 44 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Hortonworks Dataflow Ingestion Simple Event Processing Engine Stream Processing DestinationData Bus
  • 45. 45 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Questions?
  • 46. 46 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Learn more and join us! Apache NiFi site http://nifi.apache.org Subproject MiNiFi site http://nifi.apache.org/minifi/ Subscribe to and collaborate at dev@nifi.apache.org users@nifi.apache.org Submit Ideas or Issues https://issues.apache.org/jira/browse/NIFI Follow us on Twitter @apachenifi
  • 47. 47 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Thank you!

Notas do Editor

  1. 170 different actions bundled by default
  2. Can put NiFi on a Gateway server but probably don’t want to mess with a UI on ever single one Maybe not best fit
  3. Let me get the key parts of NiFi close to where data begins and provide bidrectional communication NiFi lives in the data center. Give it an enterprise server or a cluster of them. MiNiFi lives close to where data is born and may be a guest on that device or system
  4. Introduce the architecture of NiFi, describe major system components, and describe the single node and clustering models.
  5. Framework – put a new wrapper on the framework, or in maven terms, we kept the underlying modules and wrote minifi-framework-core replacing nifi-framework-core Talking about MiNiFi-Java, Cpp version also exists
  6. Initiates with ./bin/nifi.sh start
  7. user, only need bootstrap and config.yml nifi.properties and flow.xml are implementation details
  8. 1 of the main differences with NiFi Warm Redeploys
  9. Think NiFi Process Groups but MiNiFi Agents
  10. Kafka Reads events in memory and write to  distributed logÂ