Enviar pesquisa
Carregar
Apache NiFi in the Hadoop Ecosystem
•
Transferir como PPTX, PDF
•
11 gostaram
•
4,864 visualizações
DataWorks Summit/Hadoop Summit
Seguir
Apache NiFi in the Hadoop Ecosystem
Leia menos
Leia mais
Tecnologia
Vista de apresentação de diapositivos
Denunciar
Compartilhar
Vista de apresentação de diapositivos
Denunciar
Compartilhar
1 de 36
Baixar agora
Recomendados
Dataflow with Apache NiFi
Dataflow with Apache NiFi
DataWorks Summit/Hadoop Summit
Nifi workshop
Nifi workshop
Yifeng Jiang
Apache NiFi Crash Course Intro
Apache NiFi Crash Course Intro
DataWorks Summit/Hadoop Summit
Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4
Timothy Spann
Integrating NiFi and Flink
Integrating NiFi and Flink
Bryan Bende
Integrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data Lakes
DataWorks Summit/Hadoop Summit
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
DataWorks Summit
NiFi Best Practices for the Enterprise
NiFi Best Practices for the Enterprise
Gregory Keys
Recomendados
Dataflow with Apache NiFi
Dataflow with Apache NiFi
DataWorks Summit/Hadoop Summit
Nifi workshop
Nifi workshop
Yifeng Jiang
Apache NiFi Crash Course Intro
Apache NiFi Crash Course Intro
DataWorks Summit/Hadoop Summit
Introduction to Apache NiFi 1.11.4
Introduction to Apache NiFi 1.11.4
Timothy Spann
Integrating NiFi and Flink
Integrating NiFi and Flink
Bryan Bende
Integrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data Lakes
DataWorks Summit/Hadoop Summit
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
DataWorks Summit
NiFi Best Practices for the Enterprise
NiFi Best Practices for the Enterprise
Gregory Keys
Apache Nifi Crash Course
Apache Nifi Crash Course
DataWorks Summit
Apache Flink and what it is used for
Apache Flink and what it is used for
Aljoscha Krettek
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Aldrin Piri
CDC Stream Processing with Apache Flink
CDC Stream Processing with Apache Flink
Timo Walther
Introduction to Apache NiFi dws19 DWS - DC 2019
Introduction to Apache NiFi dws19 DWS - DC 2019
Timothy Spann
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
GetInData
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFi
DataWorks Summit
Deploying Flink on Kubernetes - David Anderson
Deploying Flink on Kubernetes - David Anderson
Ververica
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Noritaka Sekiyama
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
DataWorks Summit
Nifi
Nifi
Julio Castro
Real time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafka
Timothy Spann
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
Till Rohrmann
Introduction to data flow management using apache nifi
Introduction to data flow management using apache nifi
Anshuman Ghosh
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
DataWorks Summit
NiFi Developer Guide
NiFi Developer Guide
Deon Huang
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Timothy Spann
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
DataWorks Summit
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
StreamNative
Introduction to Kafka connect
Introduction to Kafka connect
Knoldus Inc.
State of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & Community
Accumulo Summit
NJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep Dive
Bryan Bende
Mais conteúdo relacionado
Mais procurados
Apache Nifi Crash Course
Apache Nifi Crash Course
DataWorks Summit
Apache Flink and what it is used for
Apache Flink and what it is used for
Aljoscha Krettek
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Aldrin Piri
CDC Stream Processing with Apache Flink
CDC Stream Processing with Apache Flink
Timo Walther
Introduction to Apache NiFi dws19 DWS - DC 2019
Introduction to Apache NiFi dws19 DWS - DC 2019
Timothy Spann
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
GetInData
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFi
DataWorks Summit
Deploying Flink on Kubernetes - David Anderson
Deploying Flink on Kubernetes - David Anderson
Ververica
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Noritaka Sekiyama
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
DataWorks Summit
Nifi
Nifi
Julio Castro
Real time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafka
Timothy Spann
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
Till Rohrmann
Introduction to data flow management using apache nifi
Introduction to data flow management using apache nifi
Anshuman Ghosh
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
DataWorks Summit
NiFi Developer Guide
NiFi Developer Guide
Deon Huang
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Timothy Spann
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
DataWorks Summit
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
StreamNative
Introduction to Kafka connect
Introduction to Kafka connect
Knoldus Inc.
Mais procurados
(20)
Apache Nifi Crash Course
Apache Nifi Crash Course
Apache Flink and what it is used for
Apache Flink and what it is used for
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
CDC Stream Processing with Apache Flink
CDC Stream Processing with Apache Flink
Introduction to Apache NiFi dws19 DWS - DC 2019
Introduction to Apache NiFi dws19 DWS - DC 2019
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFi
Deploying Flink on Kubernetes - David Anderson
Deploying Flink on Kubernetes - David Anderson
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
Nifi
Nifi
Real time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafka
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to data flow management using apache nifi
Introduction to data flow management using apache nifi
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
NiFi Developer Guide
NiFi Developer Guide
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Introduction to Kafka connect
Introduction to Kafka connect
Semelhante a Apache NiFi in the Hadoop Ecosystem
State of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & Community
Accumulo Summit
NJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep Dive
Bryan Bende
Future of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep Dive
Aldrin Piri
Mission to NARs with Apache NiFi
Mission to NARs with Apache NiFi
Hortonworks
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Seetharam Venkatesh
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
DataWorks Summit
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Data Con LA
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
DataWorks Summit/Hadoop Summit
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
Joe Percivall
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA
Connecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFi
DataWorks Summit
Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration Options
Timothy Spann
Intro to Big Data Analytics using Apache Spark and Apache Zeppelin
Intro to Big Data Analytics using Apache Spark and Apache Zeppelin
Alex Zeltov
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
Isheeta Sanghi
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
Isheeta Sanghi
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
Hortonworks
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
Isheeta Sanghi
Curb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure Cluster
ahortonworks
Hadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in Production
DataWorks Summit/Hadoop Summit
Semelhante a Apache NiFi in the Hadoop Ecosystem
(20)
State of the Apache NiFi Ecosystem & Community
State of the Apache NiFi Ecosystem & Community
NJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep Dive
Future of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep Dive
Mission to NARs with Apache NiFi
Mission to NARs with Apache NiFi
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
Big Data Day LA 2016/ Big Data Track - Building scalable enterprise data flow...
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Connecting the Drops with Apache NiFi & Apache MiNiFi
Connecting the Drops with Apache NiFi & Apache MiNiFi
Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration Options
Intro to Big Data Analytics using Apache Spark and Apache Zeppelin
Intro to Big Data Analytics using Apache Spark and Apache Zeppelin
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
Integrating Apache NiFi and Apache Flink
Curb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure Cluster
Hadoop & Cloud Storage: Object Store Integration in Production
Hadoop & Cloud Storage: Object Store Integration in Production
Mais de DataWorks Summit/Hadoop Summit
Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
DataWorks Summit/Hadoop Summit
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
DataWorks Summit/Hadoop Summit
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
DataWorks Summit/Hadoop Summit
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
DataWorks Summit/Hadoop Summit
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
DataWorks Summit/Hadoop Summit
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
DataWorks Summit/Hadoop Summit
Hadoop Crash Course
Hadoop Crash Course
DataWorks Summit/Hadoop Summit
Data Science Crash Course
Data Science Crash Course
DataWorks Summit/Hadoop Summit
Apache Spark Crash Course
Apache Spark Crash Course
DataWorks Summit/Hadoop Summit
Schema Registry - Set you Data Free
Schema Registry - Set you Data Free
DataWorks Summit/Hadoop Summit
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
DataWorks Summit/Hadoop Summit
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
DataWorks Summit/Hadoop Summit
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
DataWorks Summit/Hadoop Summit
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
DataWorks Summit/Hadoop Summit
HBase in Practice
HBase in Practice
DataWorks Summit/Hadoop Summit
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
DataWorks Summit/Hadoop Summit
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
DataWorks Summit/Hadoop Summit
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
DataWorks Summit/Hadoop Summit
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
DataWorks Summit/Hadoop Summit
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
DataWorks Summit/Hadoop Summit
Mais de DataWorks Summit/Hadoop Summit
(20)
Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
Hadoop Crash Course
Hadoop Crash Course
Data Science Crash Course
Data Science Crash Course
Apache Spark Crash Course
Apache Spark Crash Course
Schema Registry - Set you Data Free
Schema Registry - Set you Data Free
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
HBase in Practice
HBase in Practice
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Último
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
Michael W. Hawkins
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
Anna Loughnan Colquhoun
Slack Application Development 101 Slides
Slack Application Development 101 Slides
praypatel2
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
Radu Cotescu
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
The Digital Insurer
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
hans926745
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
Enterprise Knowledge
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
Maria Levchenko
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
ThousandEyes
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
Results
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
Principled Technologies
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Drew Madelung
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
V3cube
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
Delhi Call girls
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
Puma Security, LLC
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Katpro Technologies
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
Delhi Call girls
Último
(20)
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
Slack Application Development 101 Slides
Slack Application Development 101 Slides
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
Apache NiFi in the Hadoop Ecosystem
1.
Apache NiFi in
the Hadoop Ecosystem Bryan Bende – Member of Technical Staff Hadoop Summit 2016 Dublin
2.
2 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Outline • What is Apache NiFi? • Hadoop Ecosystem Integrations • Use-Case/Architecture Discussion • Future Work
3.
3 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved About Me • Member of Technical Staff on Hortonworks DataFlow Team • Apache NiFi Committer & PMC Member • Integrations for HBase, Solr, Syslog, Stream Processing • Twitter: @bbende / Blog: bryanbende.com
4.
4 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved What is Apache NiFi?
5.
5 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Simplistic View of Enterprise Data Flow Data Flow Process and Analyze Data Acquire Data Store Data
6.
6 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Different organizations/business units across different geographic locations… Realistic View of Enterprise Data Flow
7.
7 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Interacting with different business partners and customers Realistic View of Enterprise Data Flow
8.
8 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Apache NiFi • Created to address the challenges of global enterprise dataflow • Key features: – Visual Command and Control – Data Lineage (Provenance) – Data Prioritization – Data Buffering/Back-Pressure – Control Latency vs. Throughput – Secure Control Plane / Data Plane – Scale Out Clustering – Extensibility
9.
9 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Apache NiFi What is Apache NiFi used for? • Reliable and secure transfer of data between systems • Delivery of data from sources to analytic platforms • Enrichment and preparation of data: – Conversion between formats – Extraction/Parsing – Routing decisions What is Apache NiFi NOT used for? • Distributed Computation • Complex Event Processing • Joins / Complex Rolling Window Operations
10.
10 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Hadoop Ecosystem Integrations
11.
11 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved HDFS Ingest MergeContent • Merges into appropriately sized files for HDFS • Based on size, number of messages, and time UpdateAttribute • Sets the HDFS directory and filename • Use expression language to dynamically bin by date: /data/${now():format('yyyy/MM/dd/HH')}/ PutHDFS • Writes FlowFile content to HDFS • Supports Conflict Resolution Strategy and Kerberos authentication
12.
12 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved HDFS Retrieval ListHDFS • Periodically perform listing on HDFS directory • Produces FlowFile per HDFS file • Flow only contains HDFS path & filename FetchHDFS • Retrieves a file from HDFS • Use incoming FlowFiles to dynamically fetch: HDFS Filename: ${path}/${filename}
13.
13 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved HDFS Retrieval in a Cluster NCM Node 1 (Primary) ListHDFS HDFS FetchHDFS RPG Input Port Node 2 ListHDFS FetchHDFS RPG Input Port • Perform “list” operation on primary node • Send results to Remote Process Group pointing back to same cluster • Redistributes results to all nodes to perform “fetch” in parallel • Same approach for ListFile + FetchFile and ListSFTP + FetchSFTP
14.
14 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved HBase Integration • ControllerService wrapping HBase Client • Implementation provided for HBase 1.1.2 Client • Other implementations could be built as an extension
15.
15 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved HBase Ingest – Single Cell • Table, Row Id, Col Family, and Col Qualifier provided in processor, or dynamically from attributes • FlowFile content becomes the cell value • Batch Size to specify maximum number of cells for a single ‘put’ operation
16.
16 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved HBase Ingest – Full Row • Table and Column Family provided in processor, or dynamically from attributes • Row ID can be a field in JSON, or a FlowFile attribute • JSON Field/Values become Column Qualifiers and Values • Complex Field Strategy – Fail – Warn – Ignore – Text
17.
17 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Kafka Integration PutKafka • Provide Broker and Topic Name • Publishes FlowFile content as one or more messages • Ability to send large delimited content, split into messages by NiFi GetKafka • Provide ZK Connection String and Topic Name • Produces a FlowFile for each message consumed
18.
18 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Stream Processing Integration • Stream processing systems generally pull data, then push results • NiFi Site-To-Site pushes and pulls between NiFi instances • The Site-To-Site Client can be used from a stream processing platform https://github.com/apache/nifi/tree/master/nifi -commons/nifi-site-to-site-client Stream Processing Site-To-Site Client NiFi Output Port Stream Processing Site-To-Site Client NiFi Input Port Pull Push
19.
19 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Site-to-Site Client Overview • Push to Input Port, or Pull from Output Port • Communicate with NiFi clusters, or standalone instances • Handles load balancing and reliable delivery • Secure connections using certificates (optional) SiteToSiteClientConfig clientConfig = new SiteToSiteClient.Builder() .url(“http://localhost:8080/nifi”) .portName(”My Port”) .buildConfig();
20.
20 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Site-To-Site Client Pulling SiteToSiteClient client = ... Transaction transaction = client.createTransaction(TransferDirection.RECEIVE); DataPacket dataPacket = transaction.receive(); while (dataPacket != null) { ... } transaction.confirm(); transaction.complete();
21.
21 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Site-To-Site Client Pushing SiteToSiteClient client = ... Transaction transaction = client.createTransaction(TransferDirection.SEND); NiFiDataPacket data = ... transaction.send(data.getContent(), data.getAttributes()); transaction.confirm(); transaction.complete();
22.
22 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Current Stream Processing Integrations Spark Streaming - NiFi Spark Receiver • https://github.com/apache/nifi/tree/master/nifi-external/nifi-spark-receiver Storm – NiFi Spout • https://github.com/apache/nifi/tree/master/nifi-external/nifi-storm-spout Flink – NiFi Source & Sink • https://github.com/apache/flink/tree/master/flink-streaming-connectors/flink-connector-nifi Apex - NiFi Input Operators & Output Operators • https://github.com/apache/incubator-apex- malhar/tree/master/contrib/src/main/java/com/datatorrent/contrib/nifi
23.
23 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Other Relevant Integrations • GetSolr, PutSolrContentStream • FetchElasticSearch, PutElasticSearch • GetMongo, PutMongo • QueryCassandra, PutCassandraQL • GetCouchbaseKey, PutCouchbaseKey • QueryDatabaseTable, ExecuteSQL, PutSQL • GetSplunk, PutSplunk • And more! https://nifi.apache.org/docs.html
24.
24 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Use-Case/Architecture Discussion
25.
25 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Drive Data to Core for Analysis NiFi Stream Processing NiFi NiFi • Drive data from sources to central data center for analysis • Tiered collection approach at various locations, think regional data centers Edge Edge Core Batch Analytics
26.
26 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Dynamically Adjusting Data Flows • Push analytic results back to core NiFi • Push results back to edge locations/devices to change behavior NiFi NiFi NiFi Edge Edge Core Batch Analytics Stream Processing
27.
27 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved 1. Logs filtered by level and sent from Edge -> Core 2. Stream processing produces new filters based on rate & sends back to core 3. Edge polls core for new filter levels & updates filtering Example: Dynamic Log Collection Core NiFi Streaming Processing Edge NiFi Logs Logs New Filters Logs Output Log Input Log Output Result Input Store Result Service Fetch ResultPoll Service Filter New Filters New Filters Poll Analytic
28.
28 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Dynamic Log Collection – Edge NiFi
29.
29 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Dynamic Log Collection – Core NiFi
30.
30 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Dynamic Log Collection Summary NiFi NiFi NiFi Edge Edge Core Stream Processing Logs Logs Logs New FiltersNew Filters New Filters
31.
31 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Future Work
32.
32 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved The Future – Ecosystem Integrations • Ambari – Support a fully managed NiFi Cluster through Ambari – Monitoring, management, upgrades, etc. • Ranger – Ability to delegate authorization decisions to Ranger – Manage authorization controls through Ranger • Atlas – Track lineage from the source to destination – Apply tags to data as its acquired
33.
33 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved The Future – Apache NiFi • HA Control Plane – Zero Master cluster, Web UI accessible from any node – Auto-Election of “Cluster Coordinator” and “Primary Node” through ZooKeeper • HA Data Plane – Ability to replicate data across nodes in a cluster • Multi-Tenancy – Restrict Access to portions of a flow – Allow people/groups with in an organization to only access their portions of the flow • Extension Registry – Create a central repository of NARs and Templates – Move most NARs out of Apache NiFi distribution, ship with a minimal set
34.
34 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved The Future – Apache NiFi • Variable Registry – Define environment specific variables through the UI, reference through EL – Make templates more portable across environments/instances • Redesign of User Interface – Modernize look & feel, improve usability, support multi-tenancy • Continued Development of Integration Points – New processors added continuously! • MiNiFi – Complimentary data collection agent to NiFi’s current approach – Small, lightweight, centrally managed agent that integrates with NiFi for follow-on dataflow management
35.
35 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Thanks! • Questions? • Contact Info: – Email: bbende@hortonworks.com – Twitter: @bbende
36.
36 © Hortonworks
Inc. 2011 – 2016. All Rights Reserved Thank you
Baixar agora