Flume NG Basics




 Copyright © 2012, Oracle and/or its affiliates. All rights reserved.
Oracle’s Big Data Approach
    4 Steps to Greater Value
    • Acquire and organize all data

    • Enable greater access to wide data

    • Analyze and refine important data

    • Decide and publish insights

How do I get data to my Hadoop Cluster?
    Using Flume NG to collect distributed data




My log data is not near my Hadoop cluster
    [Diagram: Customer Logs sit on the Application Servers; a "?" marks the missing path to the Oracle Big Data Appliance]
Moving Data with Flume NG
    [Diagram: three parallel rows between the Application Servers and the Oracle Big Data Appliance, each reading Logs -> Flume NG Agent -> Avro -> Flume NG HDFS Write Agent]


Building a Basic Flume Agent
    One configuration file
    • Flume is flexible
          – Durable Transactions
          – In-Flight Data Modification
          – Compresses Data
    • Flume is simpler than it used to be
          – No ZooKeeper requirement
          – No Master-Slave architecture
    • 3 basic pieces
          – Source, Channel, Sink
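
As a rough mental model of the three pieces, the toy sketch below wires a source to a sink through a bounded in-memory channel. It is plain Python, not Flume code: the class and function names (`MemoryChannel`, `run_source`, `run_sink`) are hypothetical, and it leaves out transactions, durability, and everything else a real agent provides.

```python
from collections import deque
from typing import Callable, Deque, Iterable


class MemoryChannel:
    """Bounded FIFO buffer that sits between a source and a sink (toy model)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._events: Deque[str] = deque()

    def put(self, event: str) -> None:
        # Refuse new events once the buffer is full instead of dropping them.
        if len(self._events) >= self.capacity:
            raise OverflowError("channel is full")
        self._events.append(event)

    def take(self) -> str:
        # Events leave in the order they arrived.
        return self._events.popleft()

    def __len__(self) -> int:
        return len(self._events)


def run_source(lines: Iterable[str], channel: MemoryChannel) -> None:
    """Source: turn each input line into one event and buffer it."""
    for line in lines:
        channel.put(line.rstrip("\n"))


def run_sink(channel: MemoryChannel, write: Callable[[str], None]) -> int:
    """Sink: drain the channel and hand every event to `write`."""
    delivered = 0
    while len(channel) > 0:
        write(channel.take())
        delivered += 1
    return delivered
```

The point of the separation is that the source and the sink never talk to each other directly; the channel decouples how fast data arrives from how fast it is written out.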
Flume Configuration
flume-ng agent -f this_file -n hdfs-agent
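hdfs-agent.sources = …ollect
hdfs-agent.sinks = …e
hdfs-agent.channels = memoryChannel

hdfs-agent.sources.…ollect.type = netcat
hdfs-agent.sources.…ollect.bind = 127.0.0.1
hdfs-agent.sources.…ollect.port = 11111

hdfs-agent.sinks.…e.type = hdfs
hdfs-agent.sinks.…e.hdfs.path = hdfs://localhost:8020/user/oracle/sabre_example
hdfs-agent.sinks.…e.rollInterval = 30
hdfs-agent.sinks.…e.hdfs.writeFormat = Text
hdfs-agent.sinks.…e.hdfs.fileType = DataStream

hdfs-agent.channels.memoryChannel.type = memory
hdfs-agent.channels.memoryChannel.capacity = 10000

hdfs-agent.sources.…ollect.channels = memoryChannel
hdfs-agent.sinks.…e.channel = memoryChannel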
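
Every key in a Flume agent file has the shape `<agent>.<kind>.<component>.<property>`, where `<agent>` must match the name passed with `-n`. The sketch below assembles a complete file from that rule. The source and sink names are cut off in the capture of this slide, so `example-source` and `example-sink` are hypothetical stand-ins, and `flume_properties` is an illustrative helper, not part of Flume. The roll interval is written as `hdfs.rollInterval`, the key the HDFS sink documents; the slide shows it without the `hdfs.` prefix.

```python
from typing import Dict, List


def flume_properties(agent: str, source: str, sink: str, channel: str,
                     source_props: Dict[str, str],
                     sink_props: Dict[str, str],
                     channel_props: Dict[str, str]) -> List[str]:
    """Build the property lines for a one-source, one-channel, one-sink agent."""
    # Declare which components this agent owns.
    lines = [
        f"{agent}.sources = {source}",
        f"{agent}.sinks = {sink}",
        f"{agent}.channels = {channel}",
    ]
    # Per-component settings, each prefixed by agent, kind and component name.
    for key, value in source_props.items():
        lines.append(f"{agent}.sources.{source}.{key} = {value}")
    for key, value in sink_props.items():
        lines.append(f"{agent}.sinks.{sink}.{key} = {value}")
    for key, value in channel_props.items():
        lines.append(f"{agent}.channels.{channel}.{key} = {value}")
    # Wiring: a source lists its channels (plural key); a sink reads one channel.
    lines.append(f"{agent}.sources.{source}.channels = {channel}")
    lines.append(f"{agent}.sinks.{sink}.channel = {channel}")
    return lines


CONFIG = flume_properties(
    agent="hdfs-agent",
    source="example-source",  # hypothetical: the real name is truncated on the slide
    sink="example-sink",      # hypothetical: the real name is truncated on the slide
    channel="memoryChannel",
    source_props={"type": "netcat", "bind": "127.0.0.1", "port": "11111"},
    sink_props={
        "type": "hdfs",
        "hdfs.path": "hdfs://localhost:8020/user/oracle/sabre_example",
        "hdfs.rollInterval": "30",
        "hdfs.writeFormat": "Text",
        "hdfs.fileType": "DataStream",
    },
    channel_props={"type": "memory", "capacity": "10000"},
)

print("\n".join(CONFIG))
```

Saving the printed lines to a file and passing that file with `-f` should give an agent equivalent to the one on the slide, apart from the placeholder component names.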

Sending Data to the Agent

    • Connect netcat to the host

    • Pipe input to it

    • Records are transmitted on newline

    • head example.xml | nc localhost 11111
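
Where `nc` is not available, the same hand-off can be sketched with a plain TCP socket. The helper names `send_records` and `first_lines` are hypothetical; the host and port defaults are the ones used in the slides. Each record is terminated with a newline because the netcat source turns every line of text into one event.

```python
import socket
from typing import Iterable, List


def send_records(lines: Iterable[str], host: str = "localhost",
                 port: int = 11111) -> int:
    """Send each line as one newline-terminated record; return how many were sent."""
    sent = 0
    with socket.create_connection((host, port), timeout=5) as conn:
        for line in lines:
            # Normalise the terminator so every record ends with exactly one newline.
            record = line.rstrip("\n") + "\n"
            conn.sendall(record.encode("utf-8"))
            sent += 1
    return sent


def first_lines(path: str, count: int = 10) -> List[str]:
    """Read the first `count` lines of a file, like `head`."""
    with open(path, encoding="utf-8") as handle:
        return [line for _, line in zip(range(count), handle)]
```

With the agent running, `send_records(first_lines("example.xml"))` plays the role of the `head example.xml | nc localhost 11111` pipeline.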


Alternatives to Flume
    And Their Trade-Offs
    • Scribe
          – Thrift-based
          – Lightweight, but no support
          – Not designed around Hadoop
    • Kafka
          – Designed to resemble a publish-subscribe system
          – Explicitly distributed
          – Apache Incubator Project



Flume in 10minutes
