SlideShare uma empresa Scribd logo
1 de 67
961
•
•
• Hadoop         (Cloudera)

• Elastic MapReduce
•
•
•
•
• Hadoop         (Cloudera)

• Elastic MapReduce
•
•
•          (@sasata299)

• 2009 8                  JOIN

•
• Hadoop
•
•
• Hadoop         (Cloudera)

• Elastic MapReduce
•
•
•      961

• 30         3   1

•
   -
   -
•
•       (       ,   , ...)

    -
    -       (   ,       ,    , …)

    -
•         Hadoop     MySQL

•
    -   GROUP BY

•                   7000      …

•                  (´Д`   )
• MySQL
   -
•
   -
  -
•
•
• Hadoop         (Cloudera)

• Elastic MapReduce
•
•
Hadoop
• Google   MapReduce   OSS

•
   -
   -
   -
   -
Hadoop
master (   )

               slave (   )
Hadoop
master (   )

               slave (   )




Map
Hadoop
master (    )

                       slave (   )



           <key,value>
Map
           Shuffle & Sort
Hadoop
master (    )

                       slave (            )



           <key,value>
Map                              Reduce
           Shuffle & Sort
• Hadoop Streaming (Ruby       )

• EC2 Cloudera Hadoop
   - Cloudera CDH1
   - Hadoop           0.18.3

•                S3
MySQL → Hadoop
•
• GROUP BY        MapReduce
    -       (          )
    -           key

• JOIN   MapReduce
    •
(1)   master



(2)   S3
(1)   master



(2)   S3
(1)   master
               master   slave scp


(2)   S3
(1)        master
                    master      slave scp


(2)        S3


      S3            slave scp
MySQL vs Hadoop


   7000




    MySQL    Hadoop
   MySQL    Hadoop
MySQL vs Hadoop

            ( Д )
   7000
            30

    MySQL    Hadoop
   MySQL    Hadoop
Hadoop++
   ←Hadoop


        ↓MySQL
•
•
• Hadoop         (Cloudera)

• Elastic MapReduce
•
•
• Hadoop
   -
• Hadoop                        (HADOOP-6254)
   -   S3
   -   SocketTimeoutException
• EMR (Elastic MapReduce)
   -   Amazon               Hadoop

• Cloudera   CDH2
   -
AMI
            (Amazon Machine
       UP       Image)




EMR


CDH2
AMI
            (Amazon Machine
       UP       Image)




EMR


CDH2
EMR




Job Flow (   )
EMR
             BootStrap Action




Job Flow (               )
EMR
             BootStrap Action


             Step (Hadoop Job)




Job Flow (               )
EMR
             BootStrap Action


             Step (Hadoop Job)




Job Flow (               )
•
    -
    - --alive
• AMI
   -            AMI

    - BootStrap Action
Created job flow j-8IXS98OW1WEE
                         ID
Hadoop
•
    - mapred.child.java.opts
    -   streaming

•
    -
    -   ElasticMapReduce-master 5100
•
•
• Hadoop         (Cloudera)

• Elastic MapReduce
•
•
• Map
   -
• Reduce
   -       key   Reduce
   -
UU


      Map


     Reduce
UU


           Map
     ID

          Reduce
UU


           Map


          Reduce
     ID
UU


           Map


          Reduce
     ID
Map


Reduce
Map
ID key

         Reduce
Map


               Reduce
key   Reduce
Map
100
                 100

                       Reduce
  key   Reduce
×
                        Map
100
                 ×
                 100

                       Reduce
  key   Reduce
×
                           Map
100
                   ×100

                          Reduce
      Reduce   key sort
×
                           Map
100
                   ×100

                          Reduce
      Reduce   key sort
Hadoop

 •
     -
         Hadoop
Hadoop

 •
     -
         Hadoop
Hadoop

 •
     -
         Hadoop
Hadoop

 •
     -
         Hadoop
Hadoop

 •
     -
         Hadoop
•
•
• Hadoop         (Cloudera)

• Elastic MapReduce
•
•
•                Hadoop
    -
    -
    -   Reduce
961万人の食卓を支えるデータ解析

Mais conteúdo relacionado

Semelhante a 961万人の食卓を支えるデータ解析

COOKPADでのHadoop利用
COOKPADでのHadoop利用COOKPADでのHadoop利用
COOKPADでのHadoop利用
Tatsuya Sasaki
 
Hadoop入門とクラウド利用
Hadoop入門とクラウド利用Hadoop入門とクラウド利用
Hadoop入門とクラウド利用
Naoki Yanai
 
Hadoop and mysql by Chris Schneider
Hadoop and mysql by Chris SchneiderHadoop and mysql by Chris Schneider
Hadoop and mysql by Chris Schneider
Dmitry Makarchuk
 
[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)
[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)
[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)
npinto
 
Qubole Overview at the Fifth Elephant Conference
Qubole Overview at the Fifth Elephant ConferenceQubole Overview at the Fifth Elephant Conference
Qubole Overview at the Fifth Elephant Conference
Joydeep Sen Sarma
 
Hadoop Conference Japan 2011 Fall
Hadoop Conference Japan 2011 FallHadoop Conference Japan 2011 Fall
Hadoop Conference Japan 2011 Fall
Ryu Kobayashi
 
データ解析技術入門(Hadoop編)
データ解析技術入門(Hadoop編)データ解析技術入門(Hadoop編)
データ解析技術入門(Hadoop編)
Takumi Asai
 
サンプルから見るMap reduceコード
サンプルから見るMap reduceコードサンプルから見るMap reduceコード
サンプルから見るMap reduceコード
Shinpei Ohtani
 
Brust hadoopecosystem
Brust hadoopecosystemBrust hadoopecosystem
Brust hadoopecosystem
Andrew Brust
 

Semelhante a 961万人の食卓を支えるデータ解析 (20)

COOKPADでのHadoop利用
COOKPADでのHadoop利用COOKPADでのHadoop利用
COOKPADでのHadoop利用
 
Hadoop導入事例 in クックパッド
Hadoop導入事例 in クックパッドHadoop導入事例 in クックパッド
Hadoop導入事例 in クックパッド
 
Amebaサービスのログ解析基盤
Amebaサービスのログ解析基盤Amebaサービスのログ解析基盤
Amebaサービスのログ解析基盤
 
Hadoop入門とクラウド利用
Hadoop入門とクラウド利用Hadoop入門とクラウド利用
Hadoop入門とクラウド利用
 
Hadoop and mysql by Chris Schneider
Hadoop and mysql by Chris SchneiderHadoop and mysql by Chris Schneider
Hadoop and mysql by Chris Schneider
 
Hadoop london
Hadoop londonHadoop london
Hadoop london
 
[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)
[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)
[Harvard CS264] 08b - MapReduce and Hadoop (Zak Stone, Harvard)
 
HadoopThe Hadoop Java Software Framework
HadoopThe Hadoop Java Software FrameworkHadoopThe Hadoop Java Software Framework
HadoopThe Hadoop Java Software Framework
 
Qubole Overview at the Fifth Elephant Conference
Qubole Overview at the Fifth Elephant ConferenceQubole Overview at the Fifth Elephant Conference
Qubole Overview at the Fifth Elephant Conference
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 
Hadoop Conference Japan 2011 Fall
Hadoop Conference Japan 2011 FallHadoop Conference Japan 2011 Fall
Hadoop Conference Japan 2011 Fall
 
Hadoop Overview kdd2011
Hadoop Overview kdd2011Hadoop Overview kdd2011
Hadoop Overview kdd2011
 
データ解析技術入門(Hadoop編)
データ解析技術入門(Hadoop編)データ解析技術入門(Hadoop編)
データ解析技術入門(Hadoop編)
 
サンプルから見るMap reduceコード
サンプルから見るMap reduceコードサンプルから見るMap reduceコード
サンプルから見るMap reduceコード
 
サンプルから見るMapReduceコード
サンプルから見るMapReduceコードサンプルから見るMapReduceコード
サンプルから見るMapReduceコード
 
Hadoop
HadoopHadoop
Hadoop
 
Brust hadoopecosystem
Brust hadoopecosystemBrust hadoopecosystem
Brust hadoopecosystem
 
Rug hogan-10-03-2012
Rug hogan-10-03-2012Rug hogan-10-03-2012
Rug hogan-10-03-2012
 
Hadoop and MapReduce
Hadoop and MapReduceHadoop and MapReduce
Hadoop and MapReduce
 
Azure Data Warehouse
Azure Data WarehouseAzure Data Warehouse
Azure Data Warehouse
 

Mais de Tatsuya Sasaki (7)

からあげエンジニアについて
からあげエンジニアについてからあげエンジニアについて
からあげエンジニアについて
 
クックパッドでのemr利用事例
クックパッドでのemr利用事例クックパッドでのemr利用事例
クックパッドでのemr利用事例
 
からあげとビーチと私
からあげとビーチと私からあげとビーチと私
からあげとビーチと私
 
メタプログラミングでDSLを書こう
メタプログラミングでDSLを書こうメタプログラミングでDSLを書こう
メタプログラミングでDSLを書こう
 
NoSQLデータベースが登場した背景と特徴
NoSQLデータベースが登場した背景と特徴NoSQLデータベースが登場した背景と特徴
NoSQLデータベースが登場した背景と特徴
 
Hadoopをemr経由で利用する方法
Hadoopをemr経由で利用する方法Hadoopをemr経由で利用する方法
Hadoopをemr経由で利用する方法
 
YUI
YUIYUI
YUI
 

Último

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Último (20)

What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 

961万人の食卓を支えるデータ解析

Notas do Editor