What will be new in
Apache NiFi 1.2.0
(was not released as of this writing, but released on May 8th)
Apr 28th, 2017
Apache NiFi committer: Koji Kawamura (ijokarumawak)
Disclaimer: See the release notes for details!
• Apache NiFi 1.2.0 was released on May 8th, 2017! Please see the
official release notes for the official list of what's new.
https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.2.0
• The contents of this slide deck are derived from Apache NiFi JIRA issues
labeled with the next release target 1.2.0 and from source code available on
GitHub (already merged into the master branch). However, this does NOT mean
these features are guaranteed to be released; they are still subject to change.
• The motivation for this presentation is to share what has been introduced
into the project since the latest Apache NiFi release, 1.1.2.
• The contents are created from information available under the Apache NiFi
project. However, the way it is summarized reflects solely my personal
thoughts, not a consensus built among the Apache NiFi community.
Themes
• Schema Registry, Record Reader/Writer
• Multiple versions of NAR
• Support EL for various Processor properties
• Performance Improvement
• CDC (Change Data Capture)
• Rollback on Failure
• Flow control
• UX
• Security
Schema Management is a pain…
We need schemas to:
- Analyze data
- Convert one format to another
- Validate data
- … etc
Centralized schema management is needed.
Schema?
Schema Registry
• AvroSchemaRegistry
• Provides a service for registering and accessing schemas. You can register a
schema as a dynamic property where 'name' represents the schema name
and 'value' represents the textual representation of the actual schema
following the syntax and semantics of Avro's Schema format.
• HortonworksSchemaRegistry
• Provides a Schema Registry Service that interacts with a Hortonworks Schema
Registry, available at https://github.com/hortonworks/registry
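As a rough illustration of the AvroSchemaRegistry idea above, the following Python sketch keeps a mapping from schema name to Avro schema text, mirroring the 'name'/'value' dynamic property. The schema name `user`, its fields, and the helper `field_names` are hypothetical; the dictionary only stands in for the controller service and is not NiFi's API.

```python
import json

# Hypothetical Avro schema, written in Avro's JSON schema syntax.
USER_SCHEMA_TEXT = """
{
  "type": "record",
  "name": "User",
  "namespace": "example",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": ["null", "string"], "default": null}
  ]
}
"""

# Stand-in for the registry: schema name -> textual schema,
# like a dynamic property whose name is the schema name.
registry = {"user": USER_SCHEMA_TEXT}


def field_names(schema_name):
    """Look up a schema by name and return its field names in order."""
    schema = json.loads(registry[schema_name])
    return [field["name"] for field in schema["fields"]]


print(field_names("user"))
```

Any component that knows only the name `user` can then resolve the same schema text, which is the point of managing schemas centrally.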
Record Reader/Writer
Schema Registry and Reader/Writer
• ConvertRecord
• Converts records from one data format to another using configured Record Reader
and Record Writer Controller Services.
• SplitRecord
• Splits up an input FlowFile that is in a record-oriented data format into multiple
smaller FlowFiles
• PutDatabaseRecord
• The PutDatabaseRecord processor uses a specified RecordReader to input (possibly
multiple) records from an incoming flow file.
• QueryRecord
• Evaluates one or more SQL queries against the contents of a FlowFile. The result of
the SQL query then becomes the content of the output FlowFile. This can be used,
for example, for field-specific filtering, transformation, and row-level filtering.
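The reader/query/writer pattern behind ConvertRecord and QueryRecord can be sketched in plain Python. This is only an analogy under stated assumptions: the CSV content, column names, and helper functions are hypothetical, and an in-memory SQLite table named FLOWFILE stands in for the FlowFile content being queried.

```python
import csv
import io
import json
import sqlite3

# Hypothetical FlowFile content in CSV form.
CSV_CONTENT = "id,name,age\n1,alice,34\n2,bob,17\n3,carol,52\n"


def read_csv_records(text):
    """Record reader stand-in: parse CSV text into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def write_json_records(records):
    """Record writer stand-in: serialize records as a JSON array."""
    return json.dumps(records)


def query_records(records, sql):
    """Load records into an in-memory table named FLOWFILE and run SQL."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute("CREATE TABLE FLOWFILE (id INTEGER, name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO FLOWFILE VALUES (:id, :name, :age)", records)
    rows = [dict(row) for row in conn.execute(sql)]
    conn.close()
    return rows


# ConvertRecord-like step: CSV in, JSON out.
converted = write_json_records(read_csv_records(CSV_CONTENT))

# QueryRecord-like step: row-level and field-level filtering with SQL.
adults = query_records(
    read_csv_records(CSV_CONTENT),
    "SELECT name, age FROM FLOWFILE WHERE age >= 18 ORDER BY id",
)
print(write_json_records(adults))
```

The query result becomes the new content, so one SQL statement covers both picking fields and dropping rows.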
Multiple versions of NAR
MANIFEST.MF
New Processors: CDC
• CaptureChangeMySQL
Retrieves Change Data Capture (CDC) events from a MySQL database.
CDC Events include INSERT, UPDATE, DELETE operations. Events are
output as individual flow files ordered by the time at which the
operation occurred.
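To show why the ordering of CDC events matters, here is a minimal Python sketch that replays INSERT, UPDATE and DELETE events in time order to rebuild a table's state. The event shape (`ts`, `op`, `key`, `row`) is a hypothetical simplification and not the actual output format of CaptureChangeMySQL.

```python
# Hypothetical change events, deliberately listed out of order.
events = [
    {"ts": 3, "op": "DELETE", "key": 2},
    {"ts": 1, "op": "INSERT", "key": 1, "row": {"name": "alice"}},
    {"ts": 2, "op": "INSERT", "key": 2, "row": {"name": "bob"}},
    {"ts": 4, "op": "UPDATE", "key": 1, "row": {"name": "alice2"}},
]


def apply_events(change_events):
    """Replay events ordered by timestamp and return the resulting table."""
    table = {}
    for event in sorted(change_events, key=lambda e: e["ts"]):
        if event["op"] in ("INSERT", "UPDATE"):
            table[event["key"]] = event["row"]
        elif event["op"] == "DELETE":
            table.pop(event["key"], None)
    return table


replica = apply_events(events)
print(replica)
```

Replaying the same events in arrival order instead would delete key 2 before it was inserted, leaving a stale row behind.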
How CDC works
Rollback on Failure
• PutSQL
• PutHiveQL
• PutHiveStreaming
• PutDatabaseRecord
PutSQL : default behavior
[Figure: input FlowFiles 1, 2 and 3 enter PutSQL, which writes to the RDBMS; each FlowFile is routed to 'success' or 'failure']
PutSQL : Rollback on Failure
[Figure: input FlowFiles 1, 2 and 3 enter PutSQL, which writes to the RDBMS; nothing is routed to 'success' or 'failure']
Rollback!
Modified records are rolled back
No output FlowFiles are produced; the inputs are kept in the input queue
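The contrast between the default behavior and Rollback on Failure can be sketched with SQLite transactions in Python. This is a simplified model, not PutSQL's implementation: the table `t`, the three statements, and both functions are hypothetical, and each function returns (success, failure, kept in input queue).

```python
import sqlite3


def make_db():
    """Create an in-memory database with a column that rejects NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT NOT NULL)")
    conn.commit()
    return conn


def row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]


# Three input "FlowFiles"; the second one violates NOT NULL.
STATEMENTS = [
    ("INSERT INTO t VALUES (?, ?)", (1, "a")),
    ("INSERT INTO t VALUES (?, ?)", (2, None)),
    ("INSERT INTO t VALUES (?, ?)", (3, "c")),
]


def put_default(conn, queue):
    """Commit each statement on its own; route failures separately."""
    success, failure = [], []
    for sql, params in queue:
        try:
            conn.execute(sql, params)
            conn.commit()
            success.append(params[0])
        except sqlite3.Error:
            conn.rollback()
            failure.append(params[0])
    return success, failure, []


def put_rollback_on_failure(conn, queue):
    """Run all statements in one transaction; undo everything on error."""
    try:
        for sql, params in queue:
            conn.execute(sql, params)
        conn.commit()
        return [params[0] for _, params in queue], [], []
    except sqlite3.Error:
        conn.rollback()
        # Nothing is routed anywhere; all inputs stay queued for a retry.
        return [], [], [params[0] for _, params in queue]


print(put_default(make_db(), STATEMENTS))
print(put_rollback_on_failure(make_db(), STATEMENTS))
```

In the second mode the database is left untouched and the inputs stay in order, so the same batch can be retried once the cause of the failure is fixed.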
New Processors: Flow Control
• EnforceOrder
• Enforces the expected ordering of FlowFiles that belong to the same data group.
• Wait
• Routes incoming FlowFiles to the 'wait' relationship until a matching release
signal is stored in the distributed cache from a corresponding Notify
processor. When a matching release signal is identified, a waiting FlowFile is
routed to the 'success' relationship, with attributes copied from the FlowFile
that produced the release signal from the Notify processor.
• Notify
• Caches a release signal identifier in the distributed cache, optionally along
with the FlowFile's attributes. Any flow files held at a corresponding Wait
processor will be released once this signal in the cache is discovered.
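The Wait/Notify handshake described above can be modeled with a dictionary standing in for the distributed cache. This Python sketch is an assumption-laden simplification: the `signal_id` attribute name, the function names, and the rule that the waiting FlowFile's own attributes win on conflict are all hypothetical choices made for the example.

```python
# Stand-in for the distributed cache: signal id -> cached attributes.
cache = {}


def notify(signal_id, attributes):
    """Notify side: store a release signal along with attributes."""
    cache[signal_id] = dict(attributes)


def wait(flowfile):
    """Wait side: route to 'wait' until a matching signal exists."""
    signal = cache.get(flowfile["signal_id"])
    if signal is None:
        return "wait", flowfile
    # Copy attributes from the signal; in this sketch the waiting
    # FlowFile's own attributes take precedence on conflict.
    released = dict(signal)
    released.update(flowfile)
    return "success", released


waiting = {"signal_id": "batch-42", "filename": "part-1.csv"}
print(wait(waiting))                      # no signal yet
notify("batch-42", {"batch.status": "complete"})
print(wait(waiting))                      # signal found
```

A FlowFile routed to 'wait' would typically be looped back into the same step, so it is checked again until the signal arrives.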
Flow Control using Wait/Notify
New Processors: GCS
• DeleteGCSObject
• FetchGCSObject
• ListGCSBucket
• PutGCSObject
New Processors
• ConsumeEWS
• Consumes messages from Microsoft Exchange using Exchange Web Services. The
raw-bytes of each received email message are written as contents of the FlowFile
• ConvertExcelToCSVProcessor
• Consumes a Microsoft Excel document and converts each worksheet to CSV.
• ExtractCCDAAttributes
• Extracts information from a Consolidated CDA formatted FlowFile and provides
individual attributes as FlowFile attributes.
• ExtractGrok
• Evaluates one or more Grok Expressions against the content of a FlowFile, adding the
results as attributes or replacing the content of the FlowFile with a JSON notation of
the matched content
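For readers new to Grok, an expression such as `%{IP:client}` names a reusable pattern and the field to capture it into. The Python sketch below expands such references into named regex groups, using a tiny hypothetical pattern table (real Grok pattern libraries are far larger and stricter). It then shows both output modes mentioned for ExtractGrok: matches as attributes, or as JSON content.

```python
import json
import re

# Hypothetical, heavily simplified subset of a Grok pattern library.
GROK_PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "NUMBER": r"\d+(?:\.\d+)?",
}


def grok_to_regex(expression):
    """Expand %{PATTERN:field} references into named regex groups."""
    def expand(match):
        pattern, field = match.group(1), match.group(2)
        return "(?P<{}>{})".format(field, GROK_PATTERNS[pattern])
    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", expand, expression))


def extract(expression, content):
    """Return the matched fields as a dict, or an empty dict if no match."""
    match = grok_to_regex(expression).search(content)
    return match.groupdict() if match else {}


fields = extract("%{IP:client} %{WORD:method} %{NUMBER:bytes}",
                 "10.0.0.1 GET 512")
print(fields)              # matches usable as FlowFile attributes
print(json.dumps(fields))  # or as JSON replacing the content
```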
New Processors
• FetchHBaseRow
• Fetches a row from an HBase table.
• FuzzyHashContent
• Calculates a fuzzy/locality-sensitive hash value for the Content of a FlowFile and puts
that hash value on the FlowFile as an attribute whose name is determined by the
<Hash Attribute Name> property.
• ISPEnrichIP
• Looks up ISP information for an IP address and adds the information to FlowFile
attributes.
• ListenBeats
• Listens for messages sent by libbeat compatible clients (e.g. filebeats, metricbeats,
etc) using Libbeat's 'output.logstash', writing its JSON formatted payload to the
content of a FlowFile. This processor replaces the now deprecated ListenLumberjack.
New Processors
• ExecuteScript
• ClojureScriptEngine is added!
• UpdateCounter
• This processor allows users to set specific counters and key points in their flow. It is
useful for debugging and basic counting functions.
• AttributeRollingWindow
• Track a Rolling Window based on evaluating an Expression Language expression on
each FlowFile and add that value to the processor's state.
• GetTCP
• Connects over TCP to the provided endpoint(s). Received data will be written as
content to the FlowFile
• QueryDatabaseTable
• MSSQL2008DatabaseAdapter, MSSQLDatabaseAdapter
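The rolling-window idea behind AttributeRollingWindow can be sketched as follows: keep timestamped values in state, drop those older than the window, and report aggregates. The class name, the fixed window in seconds, and the returned keys are hypothetical; the processor's actual attribute names and state handling are not shown in the slides.

```python
from collections import deque


class RollingWindow:
    """Keep values seen within the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.entries = deque()  # (timestamp, value) pairs, oldest first

    def add(self, timestamp, value):
        """Record a value, evict expired ones, and return aggregates."""
        self.entries.append((timestamp, value))
        cutoff = timestamp - self.window_seconds
        while self.entries and self.entries[0][0] <= cutoff:
            self.entries.popleft()
        values = [v for _, v in self.entries]
        total = sum(values)
        return {"count": len(values), "sum": total, "mean": total / len(values)}


window = RollingWindow(window_seconds=10)
print(window.add(0, 1.0))
print(window.add(5, 3.0))
print(window.add(12, 2.0))  # the value recorded at t=0 has expired
```

In the processor, the value for each FlowFile would come from evaluating an Expression Language expression; here it is passed in directly.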
Deep Linking!
Align Components!
More context menu at root Process Group
BTW, did you know you can
‘Refresh’ flow by ‘Cmd + r’?
… and more!!
1.2.0 has been released and is available for download!
Thank you :)
https://nifi.apache.org/download.html

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 

What will be new in Apache NiFi 1.2.0

  • 1. What will be new in Apache NiFi 1.2.0 (was not released as of this writing, but released on May 8th) Apr 28th, 2017 Apache NiFi committer: Koji Kawamura (ijokarumawak)
  • 2. Disclaimer: See release note for detail! • Apache NiFi 1.2.0 was released on May 8th, 2017! Please see the official release notes for the official list of what's new. https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.2.0 • The contents of this slide deck are derived from Apache NiFi JIRA issues labeled with the next release target, 1.2.0, and from source code available on GitHub (already merged into the master branch). However, this does NOT mean these features are guaranteed to be released; they are still subject to change. • The motivation for this presentation is to share what has been introduced into the project since the latest release, Apache NiFi 1.1.2. • The contents are created from information available under the Apache NiFi project; however, the way it is summarized reflects solely my personal thoughts, not a consensus built among the Apache NiFi community.
  • 3. Themes • Schema Registry, Record Reader/Writer • Multiple versions of Nar • Support EL for various Processor properties • Performance Improvement • CDC (Change Data Capture) • Rollback on Failure • Flow control • UX • Security
  • 4. Schema Management is a pain… We need schema to: - Analyze data - Convert one format to another - Validate data - … etc - Centralized schema management is needed.
  • 6. Schema Registry • AvroSchemaRegistry • Provides a service for registering and accessing schemas. You can register a schema as a dynamic property where 'name' represents the schema name and 'value' represents the textual representation of the actual schema following the syntax and semantics of Avro's Schema format. • HortonworksSchemaRegistry • Provides a Schema Registry Service that interacts with a Hortonworks Schema Registry, available at https://github.com/hortonworks/registry
  • 8. Schema Registry and Reader/Writer • ConvertRecord • Converts records from one data format to another using configured Record Reader and Record Writer Controller Services. • SplitRecord • Splits up an input FlowFile that is in a record-oriented data format into multiple smaller FlowFiles. • PutDatabaseRecord • The PutDatabaseRecord processor uses a specified RecordReader to read (possibly multiple) records from an incoming flow file. • QueryRecord • Evaluates one or more SQL queries against the contents of a FlowFile. The result of the SQL query then becomes the content of the output FlowFile. This can be used, for example, for field-specific filtering, transformation, and row-level filtering.
  • 9. Multiple versions of Nar MANIFEST.MF
  • 10. New Processors: CDC • CaptureChangeMySQL Retrieves Change Data Capture (CDC) events from a MySQL database. CDC Events include INSERT, UPDATE, DELETE operations. Events are output as individual flow files ordered by the time at which the operation occurred.
  • 12. Rollback on Failure • PutSQL • PutHiveQL • PutHiveStreaming • PutDatabaseRecord
  • 13. PutSQL: default behavior [Diagram: Input FlowFiles 1, 2, 3 → PutSQL → RDBMS, with 'success' and 'failure' relationships]
  • 14. PutSQL: Rollback on Failure [Diagram: Input FlowFiles 1, 2, 3 → PutSQL → RDBMS, with 'success' and 'failure' relationships] Rollback! Modified records are rolled back. No FlowFiles are output; they are kept in the input queue.
  • 15. New Processors: Flow Control • EnforceOrder • Enforces the expected ordering of FlowFiles that belong to the same data group. • Wait • Routes incoming FlowFiles to the 'wait' relationship until a matching release signal is stored in the distributed cache from a corresponding Notify processor. When a matching release signal is identified, a waiting FlowFile is routed to the 'success' relationship, with attributes copied from the FlowFile that produced the release signal from the Notify processor. • Notify • Caches a release signal identifier in the distributed cache, optionally along with the FlowFile's attributes. Any flow files held at a corresponding Wait processor will be released once this signal in the cache is discovered.
  • 16. Flow Control using Wait/Notify
  • 17. New Processors GCS • DeleteGCSObject • FetchGCSObject • ListGCSBucket • PutGCSObject
  • 18. New Processors • ConsumeEWS • Consumes messages from Microsoft Exchange using Exchange Web Services. The raw bytes of each received email message are written as the contents of the FlowFile. • ConvertExcelToCSVProcessor • Consumes a Microsoft Excel document and converts each worksheet to CSV. • ExtractCCDAAttributes • Extracts information from a Consolidated CDA formatted FlowFile and provides individual attributes as FlowFile attributes. • ExtractGrok • Evaluates one or more Grok Expressions against the content of a FlowFile, adding the results as attributes or replacing the content of the FlowFile with a JSON notation of the matched content.
  • 19. New Processors • FetchHBaseRow • Fetches a row from an HBase table. • FuzzyHashContent • Calculates a fuzzy/locality-sensitive hash value for the Content of a FlowFile and puts that hash value on the FlowFile as an attribute whose name is determined by the <Hash Attribute Name> property. • ISPEnrichIP • Looks up ISP information for an IP address and adds the information to FlowFile attributes. • ListenBeats • Listens for messages sent by libbeat compatible clients (e.g. filebeats, metricbeats, etc) using Libbeat's 'output.logstash', writing its JSON formatted payload to the content of a FlowFile. This processor replaces the now deprecated ListenLumberjack.
  • 20. New Processors • ExecuteScript • ClojureScriptEngine is added! • UpdateCounter • This processor allows users to set specific counters and key points in their flow. It is useful for debugging and basic counting functions. • AttributeRollingWindow • Track a Rolling Window based on evaluating an Expression Language expression on each FlowFile and add that value to the processor's state. • GetTCP • Connects over TCP to the provided endpoint(s). Received data will be written as content to the FlowFile • QueryDatabaseTable • MSSQL2008DatabaseAdapter, MSSQLDatabaseAdapter
  • 23. More context menu at root Process Group BTW, did you know you can ‘Refresh’ flow by ‘Cmd + r’?
  • 24. … and more!! 1.2.0 has been released and is available for download! Thank you :) https://nifi.apache.org/download.html
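
The difference between slides 13 and 14 (PutSQL default behavior vs. Rollback on Failure) can be sketched in a few lines of Python. This is only an illustration, not NiFi code: `put_sql`, `fresh_db`, and the table `t` are hypothetical names, sqlite3 stands in for the RDBMS, and the sketch assumes the diagrams show FlowFile 2 as the one that fails. The real processor has more behavior than this models.

```python
import sqlite3


def fresh_db():
    """Create an in-memory database with one table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.commit()
    return conn


def put_sql(conn, statements, rollback_on_failure=False):
    """Hypothetical stand-in for a PutSQL-like step.

    Returns three lists of input indexes: (success, failure, kept_in_queue).
    """
    success, failure = [], []
    for index, (sql, params) in enumerate(statements):
        try:
            conn.execute(sql, params)
            success.append(index)
        except sqlite3.Error:
            if rollback_on_failure:
                # Undo everything written so far and output nothing:
                # all inputs stay in the input queue.
                conn.rollback()
                return [], [], list(range(len(statements)))
            # Default behavior: only the failing input goes to 'failure'.
            failure.append(index)
    conn.commit()
    return success, failure, []


# Three inputs; the second violates the NOT NULL constraint.
inputs = [
    ("INSERT INTO t VALUES (?, ?)", (1, "a")),
    ("INSERT INTO t VALUES (?, ?)", (2, None)),
    ("INSERT INTO t VALUES (?, ?)", (3, "c")),
]

for flag in (False, True):
    db = fresh_db()
    result = put_sql(db, inputs, rollback_on_failure=flag)
    rows = [r[0] for r in db.execute("SELECT id FROM t ORDER BY id")]
    print("rollback_on_failure =", flag, "->", result, "rows:", rows)
```

With the flag off, the two valid rows are committed and the bad input is reported as a failure. With the flag on, the table stays empty and every input is reported as still queued, which matches "Modified records are rolled back" on slide 14.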