MAP REDUCE PATTERN
               (cafe.naver.com/architect1)
               (itmentor@gmail.com)
• Automatic parallelization & distribution
• Fault-tolerant
• Provides status and monitoring tools
• Clean abstraction for programmers
MAP REDUCE
          • Google
               • Map reduce
               • PageRank, crawler, Google Maps
          • Hadoop
               • Map function, reduce function
          • Qizmt
               • C# Map function, reduce function
          • etc
               • C++, C#, Java, Haskell
               • http://en.wikipedia.org/wiki/MapReduce
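MAP

               map f lst: ('a->'b) -> ('a list) -> ('b list)
               Applies f to every element of lst and returns the list of results.
               In MapReduce, the map step emits <key, value> pairs.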
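
The map signature above can be sketched in Python. This is only an illustration: `map_list` is a hypothetical helper name, not part of any MapReduce library. Each element is transformed independently of the others, which is what allows the map step to be spread over many machines.

```python
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def map_list(f: Callable[[A], B], lst: Iterable[A]) -> List[B]:
    """Apply f to each element; no element depends on any other."""
    return [f(x) for x in lst]


# Plain element-wise transformation.
lengths = map_list(len, ["map", "reduce", "shuffle"])

# MapReduce-style use: every input element becomes a (key, value) pair.
pairs = map_list(lambda word: (word, 1), ["a", "b", "a"])
```

Python's built-in `map` does the same thing lazily; the explicit function is shown only to mirror the `('a->'b) -> ('a list) -> ('b list)` shape.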
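REDUCE
               (= fold, accumulate, compress, inject)

               fold f x0 lst: ('a*'b->'b)->'b->('a list)->'b
               Starting from the initial accumulator x0, applies f to each element of lst and the current accumulator.
               In MapReduce, reduce is applied to the values that share a key.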
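
A minimal Python sketch of the fold signature above, assuming the argument order given on the slide (f receives the element first and the accumulator second). The name `fold` is illustrative; the standard library equivalent is `functools.reduce`, whose callback takes the accumulator first.

```python
from functools import reduce
from typing import Callable, Iterable, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def fold(f: Callable[[A, B], B], x0: B, lst: Iterable[A]) -> B:
    """Thread an accumulator through lst, starting from x0."""
    acc = x0
    for x in lst:
        acc = f(x, acc)  # f(element, accumulator) -> new accumulator
    return acc


total = fold(lambda x, acc: x + acc, 0, [1, 2, 3, 4])

# Same computation with the standard library (note the swapped argument order).
same_total = reduce(lambda acc, x: acc + x, [1, 2, 3, 4], 0)
```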
MAPREDUCE ?
          • Input format
          • Input location
          • Map function
          • Reduce function
          • Output format
          • Output location

          Figure 2-1. Parts of a MapReduce job (Chapter 2, The Basics of a MapReduce Job, p. 28)
          Provided by User: Job Configuration, Input Format, Input Locations, Map Function, Number of Reduce Tasks, Reduce Function, Output Key Type, Output Value Type, Output Format, Output Location
          Provided by Hadoop Framework: Input Splitting & Distribution; Start of Individual Map Tasks; Shuffle, Partition/Sort per Map Output; Merge Sort for Map Outputs for Each Reduce Task; Start of Individual Reduce Tasks; Collection of Final Output

          The user is responsible for handling the job setup, specifying the input ...
MAP REDUCE
               Logical Flow
          Input -> Map -> Shuffle -> Reduce -> Output
          Map:     1. map() is called on the input
                   2. map() emits (key,val) pairs
          Shuffle: pairs are grouped by key
          Reduce:  reduce() is called for each key
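
The logical flow can be sketched as a small single-process driver in Python. Everything here is an assumption made for illustration: `run_mapreduce`, `map_url`, `reduce_count` and the sample request log are hypothetical, and a real framework runs the three phases on many machines.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def run_mapreduce(
    records: Iterable[str],
    map_fn: Callable[[str], List[Tuple[str, int]]],
    reduce_fn: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    # Map: each input record yields zero or more (key, value) pairs.
    intermediate: List[Tuple[str, int]] = []
    for record in records:
        intermediate.extend(map_fn(record))

    # Shuffle: collect all values that share a key.
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)

    # Reduce: one call per key, in sorted key order.
    return {key: reduce_fn(key, values) for key, values in sorted(groups.items())}


def map_url(line: str) -> List[Tuple[str, int]]:
    _method, url = line.split()
    return [(url, 1)]


def reduce_count(url: str, counts: List[int]) -> int:
    return sum(counts)


log = ["GET /index", "GET /about", "GET /index"]
result = run_mapreduce(log, map_url, reduce_count)
```

The user supplies only `map_url` and `reduce_count`; the driver owns the grouping step, which mirrors the split between user code and framework.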
MAP REDUCE
               Physical Flow
MAP REDUCE
               Physical Flow
                        Job
?

          PROGRAM                   Map function               Reduce function
          Distributed Grep          matched lines              pass
          Reverse Web link graph    <target, source>           <target, list(src)>
          URL access frequency      <URL, 1>                   <URL, total count>
          Term-Vector per Host      <hostname, term-vector>    <hostname, all-term-vector>
          Inverted Index            <word, doc id>             <word, list(doc id)>
          Distributed Sort          <key,value>                pass
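
As one worked row from the table, here is a sketch of the Inverted Index entry in Python. The function names and the three sample documents are made up for illustration, and the grouping loop stands in for the framework's shuffle.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def map_inverted_index(doc_id: int, text: str) -> List[Tuple[str, int]]:
    """Emit <word, doc id> once per distinct word in the document."""
    return [(word, doc_id) for word in set(text.lower().split())]


def reduce_inverted_index(word: str, doc_ids: List[int]) -> Tuple[str, List[int]]:
    """Emit <word, list(doc id)> with the ids in ascending order."""
    return (word, sorted(doc_ids))


docs = {1: "map reduce pattern", 2: "reduce function", 3: "map function"}

# Stand-in for the shuffle phase: group document ids by word.
grouped: Dict[str, List[int]] = defaultdict(list)
for doc_id, text in docs.items():
    for word, emitted_id in map_inverted_index(doc_id, text):
        grouped[word].append(emitted_id)

index = dict(reduce_inverted_index(word, ids) for word, ids in grouped.items())
```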
CLUSTER - HADOOP
• Master
      • Name node
      • Job tracker
• Slave (= Worker)
      • Data node
      • Task tracker

          Figure 3-2. A simple six-node cluster (Chapter 3, The Basics of Multimachine Clusters, p. 80)
          Master:              NameNode (http://master:50070/), JobTracker (http://master:50030/)
          Slave01 to Slave05:  DataNode and TaskTracker on each

          Both the JobTracker and the NameNode provide a web interface. With the job control parameter set, the JobTracker web interface adds options such as Change Job Priority to the per-job detail page, by default at the bottom-left corner of the page.
          In the sample configuration, the JobTracker and NameNode reside on the master machine, and the DataNodes and TaskTrackers share the slave machines.
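MAP REDUCE - GOOGLE
1.   The input files are split into pieces of 16MB ~ 64MB, and copies of the program are started on the machines of the cluster.
2.   One copy of the program is the Master. The others are Workers, which are assigned work (map task, reduce task) by the Master. The master picks idle workers and assigns each one a task.
3.   A worker that is assigned a Map task reads its input split, passes the data to the user's map function, and buffers the resulting intermediate key/value pairs in memory.
4.   The buffered pairs are written to local disk periodically, partitioned into one region per Reduce task. The locations of the pairs are reported to the master, which forwards them from the map workers to the reduce workers.
5.   When a reduce worker is notified of these locations by the master, it uses RPC to read the map workers' buffered data (intermediate key/value pairs). Once it has read all of the data, it sorts it by intermediate key so that equal keys are grouped together. If the data is too large for memory, an external sort is used.
6.   The reduce worker iterates over the sorted data and, for each unique key, passes the key and its values to the user's reduce function. The output of reduce is appended to the final output file (one per reduce partition).
7.   When all map and reduce tasks are complete, the master wakes up the user program, and the MapReduce call returns to the user code.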
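
Steps 4 to 6 can be sketched in Python as follows. This is a single-process illustration under stated assumptions: `partition` is a hypothetical partitioning function using a CRC32 hash modulo the number of reduce tasks, and in-memory lists stand in for the map workers' local disks and the RPC reads.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[str, int]


def partition(key: str, num_reduce_tasks: int) -> int:
    """Pick the reduce task for a key (assumed scheme: hash(key) mod R)."""
    return zlib.crc32(key.encode("utf-8")) % num_reduce_tasks


R = 3  # number of reduce tasks (illustrative value)

# Intermediate pairs produced by two map workers.
map_outputs: List[List[Pair]] = [
    [("apple", 1), ("pear", 1)],
    [("apple", 1), ("fig", 1)],
]

# Step 4: each map worker splits its buffered pairs into R regions.
regions: List[List[List[Pair]]] = [[[] for _ in range(R)] for _ in map_outputs]
for m, pairs in enumerate(map_outputs):
    for key, value in pairs:
        regions[m][partition(key, R)].append((key, value))

# Steps 5 and 6: each reduce worker fetches its region from every map worker,
# sorts by key so equal keys are adjacent, then reduces each key (here: sum).
final: Dict[str, int] = {}
for r in range(R):
    fetched = sorted(pair for m in range(len(map_outputs)) for pair in regions[m][r])
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in fetched:
        grouped[key].append(value)
    for key, values in grouped.items():
        final[key] = sum(values)
```

Because the partition function depends only on the key, every pair for a given key reaches the same reduce worker no matter which map worker produced it.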
?

• Distributed file system (DFS)

         • Google Map reduce - Bigtable

         • Hadoop - HBase

         • Hypertable (commercial)
EXAMPLE SOURCE CODE
               Google Mapreduce example: Word count

          A Word Frequency

          This section contains a program that counts the number
          of occurrences of each unique word in a set of input files
          specified on the command line.

          #include "mapreduce/mapreduce.h"

          // User's map function
          class WordCounter : public Mapper {
           public:
            virtual void Map(const MapInput& input) {
              const string& text = input.value();
              const int n = text.size();
              for (int i = 0; i < n; ) {
                // Skip past leading whitespace
                while ((i < n) && isspace(text[i]))
                  i++;

                // Find word end
                int start = i;
                while ((i < n) && !isspace(text[i]))
                  i++;
                if (start < i)
                  Emit(text.substr(start,i-start),"1");
              }
            }
          };
          REGISTER_MAPPER(WordCounter);

          // User's reduce function
          class Adder : public Reducer {
            virtual void Reduce(ReduceInput* input) {
              // Iterate over all entries with the
              // same key and add the values
              int64 value = 0;
              while (!input->done()) {
                value += StringToInt(input->value());
                input->NextValue();
              }

              // Emit sum for input->key()
              Emit(IntToString(value));
            }
          };
          REGISTER_REDUCER(Adder);

          int main(int argc, char** argv) {
            ParseCommandLineFlags(argc, argv);

            MapReduceSpecification spec;

            // Store list of input files into "spec"
            for (int i = 1; i < argc; i++) {
              MapReduceInput* input = spec.add_input();
              input->set_format("text");
              input->set_filepattern(argv[i]);
              input->set_mapper_class("WordCounter");
            }

            // Specify the output files:
            //     /gfs/test/freq-00000-of-00100
            //     /gfs/test/freq-00001-of-00100
            //     ...
            MapReduceOutput* out = spec.output();
            out->set_filebase("/gfs/test/freq");
            out->set_num_tasks(100);
            out->set_format("text");
            out->set_reducer_class("Adder");

            // Optional: do partial sums within map
            // tasks to save network bandwidth
            out->set_combiner_class("Adder");

            // Tuning parameters: use at most 2000
            // machines and 100 MB of memory per task
            spec.set_machines(2000);
            spec.set_map_megabytes(100);
            spec.set_reduce_megabytes(100);

            // Now run it
            MapReduceResult result;
            if (!MapReduce(spec, &result)) abort();

            // Done: 'result' structure contains info
            // about counters, time taken, number of
            // machines used, etc.

            return 0;
          }

          To appear in OSDI 2004
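
The C++ listing depends on Google's internal library, so it cannot be run as is. The Python sketch below mirrors the logic of WordCounter and Adder in a single process and shows what the optional combiner buys: partial sums inside each map task mean fewer pairs cross the network. The names `map_words`, `add` and `combine`, and the two sample input splits, are assumptions made for this illustration.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def map_words(text: str) -> List[Pair]:
    """Mirror of WordCounter::Map: emit (word, "1") for each word."""
    pairs: List[Pair] = []
    i, n = 0, len(text)
    while i < n:
        # Skip past leading whitespace.
        while i < n and text[i].isspace():
            i += 1
        # Find word end.
        start = i
        while i < n and not text[i].isspace():
            i += 1
        if start < i:
            pairs.append((text[start:i], "1"))
    return pairs


def add(values: List[str]) -> str:
    """Mirror of Adder::Reduce: sum string-encoded counts."""
    return str(sum(int(v) for v in values))


def combine(pairs: List[Pair]) -> List[Pair]:
    """Combiner: apply the reducer to one map task's output only."""
    grouped: Dict[str, List[str]] = {}
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    return [(key, add(values)) for key, values in grouped.items()]


splits = ["to be or not to be", "  to see  "]  # one string per map task
raw = [map_words(s) for s in splits]
combined = [combine(p) for p in raw]

# Shuffle and final reduce over the combined pairs.
shuffled: Dict[str, List[str]] = {}
for pairs in combined:
    for key, value in pairs:
        shuffled.setdefault(key, []).append(value)
counts = {key: add(values) for key, values in shuffled.items()}
```

Reusing the reducer as the combiner works here because addition is associative and commutative, so summing partial sums gives the same totals.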
QIZMT
               Qizmt - Map reduce framework on Windows
QIZMT FEATURES
CORE MYSPACE QIZMT FEATURES
                 •   C# mapreducer job
                 •   Built-in IDE/Debugger
                 •   Delta-only exchange option for Mapreduce jobs
                 •   Easily add machines to a cluster to increase processing power and capacity
                 •   CAC (Cluster Assembly Cache) for exposing .Net DLLs to mapreduce jobs
                 •   Job
                      ◦ Mapreduce
                      ◦ Remote
                      ◦ Local - For orchestrating a pipeline of Mapreducer and Remote jobs
                 •
                      ◦   Sorted - Shuffle, Key
                      ◦   Grouped
                      ◦   Hashsorted - core hashtable, Key

          Input -> Map -> Shuffle -> Reduce -> Output
          Map:     1. map() is called on the input
                   2. map() emits (key,val) pairs
          Shuffle: Sorted / Grouped / Hashsorted
          Reduce:  reduce() is called for each key
EXAMPLE - WORD COUNT
QIZMT EXAMPLE

WORDCOUNT
•        Hadoop

         •        C++ map, reduce

         •        But, cygwin

•        Qizmt

         •        Master

         •        IDE
Q&A
Map reduce

Mais conteúdo relacionado

Destaque

Hadoop - Introduction to map reduce programming - Reunião 12/04/2014
Hadoop - Introduction to map reduce programming - Reunião 12/04/2014Hadoop - Introduction to map reduce programming - Reunião 12/04/2014
Hadoop - Introduction to map reduce programming - Reunião 12/04/2014soujavajug
 
Secrets in Kubernetes
Secrets in KubernetesSecrets in Kubernetes
Secrets in KubernetesJerry Jalava
 
Talend Big Data Capabilities Overview
Talend Big Data Capabilities OverviewTalend Big Data Capabilities Overview
Talend Big Data Capabilities OverviewRajan Kanitkar
 
Statistical Significance | Statistics
Statistical Significance | StatisticsStatistical Significance | Statistics
Statistical Significance | StatisticsTransweb Global Inc
 
FTP Client and Server | Computer Science
FTP Client and Server | Computer ScienceFTP Client and Server | Computer Science
FTP Client and Server | Computer ScienceTransweb Global Inc
 
Apache Spark Streaming: Architecture and Fault Tolerance
Apache Spark Streaming: Architecture and Fault ToleranceApache Spark Streaming: Architecture and Fault Tolerance
Apache Spark Streaming: Architecture and Fault ToleranceSachin Aggarwal
 
Spark architecture
Spark architectureSpark architecture
Spark architecturedatamantra
 
Client server architecture
Client server architectureClient server architecture
Client server architectureBhargav Amin
 
Lecture 5 6 .ad hoc network
Lecture 5 6 .ad hoc networkLecture 5 6 .ad hoc network
Lecture 5 6 .ad hoc networkChandra Meena
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsAnton Kirillov
 
Mapreduce advanced
Mapreduce advancedMapreduce advanced
Mapreduce advancedChirag Ahuja
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Sparkdatamantra
 

Destaque (20)

Hadoop - Introduction to map reduce programming - Reunião 12/04/2014
Hadoop - Introduction to map reduce programming - Reunião 12/04/2014Hadoop - Introduction to map reduce programming - Reunião 12/04/2014
Hadoop - Introduction to map reduce programming - Reunião 12/04/2014
 
Bd class 2 complete
Bd class 2 completeBd class 2 complete
Bd class 2 complete
 
Big data gaurav
Big data gauravBig data gaurav
Big data gaurav
 
HadoopFileFormats_2016
HadoopFileFormats_2016HadoopFileFormats_2016
HadoopFileFormats_2016
 
Secrets in Kubernetes
Secrets in KubernetesSecrets in Kubernetes
Secrets in Kubernetes
 
Talend Big Data Capabilities Overview
Talend Big Data Capabilities OverviewTalend Big Data Capabilities Overview
Talend Big Data Capabilities Overview
 
Statistical Significance | Statistics
Statistical Significance | StatisticsStatistical Significance | Statistics
Statistical Significance | Statistics
 
Hadoop File System Shell Commands,
Hadoop File System Shell Commands,Hadoop File System Shell Commands,
Hadoop File System Shell Commands,
 
FTP Client and Server | Computer Science
FTP Client and Server | Computer ScienceFTP Client and Server | Computer Science
FTP Client and Server | Computer Science
 
Apache Spark Streaming: Architecture and Fault Tolerance
Apache Spark Streaming: Architecture and Fault ToleranceApache Spark Streaming: Architecture and Fault Tolerance
Apache Spark Streaming: Architecture and Fault Tolerance
 
Spark architecture
Spark architectureSpark architecture
Spark architecture
 
Ad hoc networks
Ad hoc networksAd hoc networks
Ad hoc networks
 
Networking
NetworkingNetworking
Networking
 
MPP vs Hadoop
MPP vs HadoopMPP vs Hadoop
MPP vs Hadoop
 
Client server architecture
Client server architectureClient server architecture
Client server architecture
 
Lecture 5 6 .ad hoc network
Lecture 5 6 .ad hoc networkLecture 5 6 .ad hoc network
Lecture 5 6 .ad hoc network
 
Apache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & InternalsApache Spark in Depth: Core Concepts, Architecture & Internals
Apache Spark in Depth: Core Concepts, Architecture & Internals
 
Density Function | Statistics
Density Function | StatisticsDensity Function | Statistics
Density Function | Statistics
 
Mapreduce advanced
Mapreduce advancedMapreduce advanced
Mapreduce advanced
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 

Semelhante a Map reduce

Buzz words
Buzz wordsBuzz words
Buzz wordscwensel
 
Big Data Analytics with Hadoop with @techmilind
Big Data Analytics with Hadoop with @techmilindBig Data Analytics with Hadoop with @techmilind
Big Data Analytics with Hadoop with @techmilindEMC
 
サンプルから見るMap reduceコード
Map reduce

  • 1. MAP REDUCE PATTERN (cafe.naver.com/architect1) (itmentor@gmail.com)
  • 2.
  • 3. • Automatic parallelization & distribution • Fault-tolerant • Provides status and monitoring tools • Clean abstraction for programmers
  • 4. MAP REDUCE
      • Google: Map reduce; PageRank, crawler, Google Maps
      • Hadoop: Map function, reduce function
      • Qizmt: C#; Map function, reduce function
      • etc: C++, C#, Java, Haskell; http://en.wikipedia.org/wiki/MapReduce
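  • 5. MAP: map f lst : ('a -> 'b) -> ('a list) -> ('b list). Applies f to every element of lst and returns the list of results. The map step of MapReduce works with <key, value> pairs.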
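The signature on this slide can be sketched in C++ as a small generic function. This is only an illustration: the names `map_list` and `word_length` are hypothetical (they do not come from the slides), and the code assumes C++14 or later for the deduced return type.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <type_traits>
#include <vector>

// Generic map: applies f to every element of lst and returns the results
// in the same order. Mirrors ('a -> 'b) -> ('a list) -> ('b list).
// map_list is a hypothetical name used only for this sketch.
template <typename A, typename F>
auto map_list(F f, const std::vector<A>& lst) {
    // B is the element type produced by f (the 'b of the signature).
    using B = std::decay_t<decltype(f(lst.front()))>;
    std::vector<B> out;
    out.reserve(lst.size());
    std::transform(lst.begin(), lst.end(), std::back_inserter(out), f);
    return out;
}

// Example f (hypothetical): maps a word to its length.
inline std::size_t word_length(const std::string& w) { return w.size(); }
```

Calling `map_list(word_length, words)` on a vector of words yields a vector of lengths; the input list is left unchanged, which is what makes the map step easy to run in parallel.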
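  • 6. REDUCE (= fold, accumulate, compress, inject): fold f x0 lst : ('a * 'b -> 'b) -> 'b -> ('a list) -> 'b. Starting from the initial value x0, f combines each element of lst with the accumulator. The reduce step of MapReduce combines the values that belong to one key.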
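The fold signature can be sketched the same way. The names `fold` and `add_count` below are hypothetical, chosen for illustration; `add_count` assumes every value is a decimal integer held in a string, in the spirit of the word-count example later in the deck.

```cpp
#include <string>
#include <utility>
#include <vector>

// Generic fold: threads an accumulator of type B through lst, starting
// at x0. Mirrors ('a * 'b -> 'b) -> 'b -> ('a list) -> 'b.
template <typename A, typename B, typename F>
B fold(F f, B x0, const std::vector<A>& lst) {
    B acc = std::move(x0);
    for (const A& item : lst) {
        acc = f(item, acc);  // f receives (element, accumulator)
    }
    return acc;
}

// Example f (hypothetical): adds a count stored as text to the running sum.
// Assumes value holds a valid decimal integer; std::stoi throws otherwise.
inline int add_count(const std::string& value, int acc) {
    return acc + std::stoi(value);
}
```

With an empty list the result is simply x0, so the initial value doubles as the answer for keys that have no values.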
  • 7. MAPREDUCE ?
      • Job Configuration • Input format • Input location • Map function • Reduce function • Output format • Output location
      [Figure 2-1. Parts of a MapReduce job (book page 28, Chapter 2: The Basics of a MapReduce Job)]
      Provided by User: Job Configuration, Input Format, Input Locations, Map Function, Number of Reduce Tasks, Reduce Function, Output Key Type, Output Value Type, Output Format, Output Location
      Provided by Hadoop Framework: Input Splitting & Distribution, Start of Individual Map Tasks, Shuffle, Partition/Sort per Map Output, Merge Sort for Map Outputs for Each Reduce Task, Start of Individual Reduce Tasks, Collection of Final Output
      The user is responsible for handling the job setup, specifying the input
  • 8. MAP REDUCE: Logical Flow. Input -> Map -> Shuffle -> Reduce -> Output. Diagram labels: map(), reduce(), Key, (key,val) pairs.
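The logical flow (Input -> Map -> Shuffle -> Reduce -> Output) can be imitated in a single process. The sketch below is a teaching aid under stated assumptions, not how Hadoop or Google's library is implemented: all function names are hypothetical, everything runs in memory on one machine, and word counting is used as the job.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using CountPair = std::pair<std::string, int>;

// Map: one input record (a line of text) -> intermediate (word, 1) pairs.
std::vector<CountPair> map_record(const std::string& line) {
    std::vector<CountPair> out;
    std::istringstream words(line);
    std::string word;
    while (words >> word) out.emplace_back(word, 1);
    return out;
}

// Shuffle: group intermediate values by key. std::map keeps keys sorted,
// standing in for the sort a real framework performs.
std::map<std::string, std::vector<int>> shuffle_by_key(
        const std::vector<CountPair>& pairs) {
    std::map<std::string, std::vector<int>> groups;
    for (const auto& kv : pairs) groups[kv.first].push_back(kv.second);
    return groups;
}

// Reduce: collapse all values of one key into a single result.
int reduce_values(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) total += v;
    return total;
}

// Driver: Input -> Map -> Shuffle -> Reduce -> Output.
std::map<std::string, int> run_job(const std::vector<std::string>& input) {
    std::vector<CountPair> intermediate;
    for (const auto& line : input) {
        auto pairs = map_record(line);
        intermediate.insert(intermediate.end(), pairs.begin(), pairs.end());
    }
    std::map<std::string, int> output;
    for (const auto& group : shuffle_by_key(intermediate)) {
        output[group.first] = reduce_values(group.second);
    }
    return output;
}
```

Only `map_record` and `reduce_values` are job-specific; the shuffle and the driver are the parts a framework would supply.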
  • 9. MAP REDUCE Physical Flow
  • 10. MAP REDUCE Physical Flow Job
  • 11. PROGRAM | Map function | Reduce function
      Distributed Grep | matched lines | pass
      Reverse Web link graph | <target, source> | <target, list(src)>
      URL | <URL, 1> | <URL, total count>
      Term-Vector per Host | <hostname, term-vector> | <hostname, all-term-vector>
      Inverted Index | <word, doc id> | <word, list(doc id)>
      Distributed Sort | <key,value> | pass
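As one worked row from this list, the inverted index can be sketched as follows: map emits <word, doc id>, and reduce turns the ids collected for a word into list(doc id). The `Document` struct and all function names are hypothetical, the grouping is done in memory rather than by a framework, and sorting and de-duplicating the id list is an assumption of this sketch, not something the slide specifies.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical input record: a document id plus its text.
struct Document {
    int id;
    std::string text;
};

// Map: emit <word, doc id> for every word in the document.
std::vector<std::pair<std::string, int>> map_document(const Document& doc) {
    std::vector<std::pair<std::string, int>> out;
    std::istringstream words(doc.text);
    std::string word;
    while (words >> word) out.emplace_back(word, doc.id);
    return out;
}

// Reduce: turn all ids seen for one word into list(doc id).
// Assumption of this sketch: the list is sorted and duplicates are removed.
std::vector<int> reduce_postings(const std::vector<int>& ids) {
    std::set<int> unique(ids.begin(), ids.end());
    return std::vector<int>(unique.begin(), unique.end());
}

// In-memory stand-in for the framework: group by word, then reduce.
std::map<std::string, std::vector<int>> build_inverted_index(
        const std::vector<Document>& docs) {
    std::map<std::string, std::vector<int>> grouped;
    for (const auto& doc : docs) {
        for (const auto& kv : map_document(doc)) {
            grouped[kv.first].push_back(kv.second);
        }
    }
    for (auto& entry : grouped) {
        entry.second = reduce_postings(entry.second);
    }
    return grouped;
}
```

The other rows follow the same shape: only the pair emitted by map and the way reduce combines the grouped values change.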
  • 12.
  • 13. CLUSTER - HADOOP
      • Master: Name node, Job tracker
      • Slave (= Worker): Data node, Task tracker
      [Figure 3-2. A simple six-node cluster (book page 80, Chapter 3: The Basics of Multimachine Clusters). Master: NameNode (http://master:50070/) and JobTracker (http://master:50030/). Slave01 to Slave05: DataNode and TaskTracker on each.]
      Book excerpt (cropped in the slide): both the JobTracker and the NameNode provide a web interface; in the sample six-node cluster the JobTracker and NameNode reside on the master machine, and the DataNodes and TaskTrackers share the slave machines.
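  • 14. MAP REDUCE - GOOGLE
      1. The input is split into pieces of 16MB ~ 64MB.
      2. One copy of the program is the Master and the rest are Workers. The Master hands out the work (map tasks, reduce tasks) and assigns each task to an idle worker.
      3. A worker that is given a map task runs the map function and produces intermediate key/value pairs.
      4. The pairs are divided up for the Reduce tasks. The locations of the pairs are reported to the master, and the master passes them on from the map workers to the reduce workers.
      5. When a reduce worker is notified by the master, it uses RPC to read the buffered data (intermediate key/value pairs) from the map workers and sorts it by intermediate key. If the data is too large, an external sort is used.
      6. The reduce worker iterates over the sorted data and passes it to the reduce function. The result of reduce goes to the output (file).
      7. When all map and reduce tasks have finished, control returns to the user program and the MapReduce call completes.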
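In Google's MapReduce the intermediate pairs are divided among the reduce tasks by a partitioning function, by default hash(key) mod R, where R is the number of reduce tasks. A minimal sketch of that idea follows; the names `partition_for` and `partition_pairs` are hypothetical, and `std::hash` stands in for whatever hash a real system uses.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using KeyValue = std::pair<std::string, std::string>;

// Picks the reduce task for a key: hash(key) mod R.
// Assumes num_reduce_tasks > 0. std::hash is a stand-in; its values may
// differ between standard library implementations.
std::size_t partition_for(const std::string& key,
                          std::size_t num_reduce_tasks) {
    return std::hash<std::string>{}(key) % num_reduce_tasks;
}

// Splits a map task's intermediate pairs into R regions, one per reduce
// task, so that all pairs with the same key land in the same region.
std::vector<std::vector<KeyValue>> partition_pairs(
        const std::vector<KeyValue>& pairs, std::size_t num_reduce_tasks) {
    std::vector<std::vector<KeyValue>> regions(num_reduce_tasks);
    for (const auto& kv : pairs) {
        regions[partition_for(kv.first, num_reduce_tasks)].push_back(kv);
    }
    return regions;
}
```

Because the region depends only on the key, every map worker sends a given key to the same reduce worker, which is what lets reduce see all values for that key.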
  • 15. ? • (DFS) • Google Map reduce - Bigtable • Hadoop - HBase • Hypertable ( commercial )
  • 16. EXAMPLE SOURCE CODE Google Mapreduce example Word count
  • 17. EXAMPLE - WORDCOUNT
      A Word Frequency: This section contains a program that counts the number of occurrences of each unique word in a set of input files specified on the command line.

      #include "mapreduce/mapreduce.h"

      // User's map function
      class WordCounter : public Mapper {
       public:
        virtual void Map(const MapInput& input) {
          const string& text = input.value();
          const int n = text.size();
          for (int i = 0; i < n; ) {
            // Skip past leading whitespace
            while ((i < n) && isspace(text[i]))
              i++;

            // Find word end
            int start = i;
            while ((i < n) && !isspace(text[i]))
              i++;
            if (start < i)
              Emit(text.substr(start,i-start),"1");
          }
        }
      };
      REGISTER_MAPPER(WordCounter);

      // User's reduce function
      class Adder : public Reducer {
        virtual void Reduce(ReduceInput* input) {
          // Iterate over all entries with the
          // same key and add the values
          int64 value = 0;
          while (!input->done()) {
            value += StringToInt(input->value());
            input->NextValue();
          }

          // Emit sum for input->key()
          Emit(IntToString(value));
        }
      };
      REGISTER_REDUCER(Adder);

      int main(int argc, char** argv) {
        ParseCommandLineFlags(argc, argv);

        MapReduceSpecification spec;

        // Store list of input files into "spec"
        for (int i = 1; i < argc; i++) {
          MapReduceInput* input = spec.add_input();
          input->set_format("text");
          input->set_filepattern(argv[i]);
          input->set_mapper_class("WordCounter");
        }

        // Specify the output files:
        //    /gfs/test/freq-00000-of-00100
        //    /gfs/test/freq-00001-of-00100
        //    ...
        MapReduceOutput* out = spec.output();
        out->set_filebase("/gfs/test/freq");
        out->set_num_tasks(100);
        out->set_format("text");
        out->set_reducer_class("Adder");

        // Optional: do partial sums within map
        // tasks to save network bandwidth
        out->set_combiner_class("Adder");

        // Tuning parameters: use at most 2000
        // machines and 100 MB of memory per task
        spec.set_machines(2000);
        spec.set_map_megabytes(100);
        spec.set_reduce_megabytes(100);

        // Now run it
        MapReduceResult result;
        if (!MapReduce(spec, &result)) abort();

        // Done: 'result' structure contains info
        // about counters, time taken, number of
        // machines used, etc.

        return 0;
      }
  • 18. QIZMT Qizmt - Map reduce framework on Windows
  • 20. CORE MYSPACE QIZMT FEATURES
      • C# mapreducer jobs
      • Built-in IDE/Debugger
      • Delta-only exchange option for Mapreduce jobs
      • Easily add machines to a cluster to increase processing power and capacity
      • CAC (Cluster Assembly Cache) for exposing .NET DLLs to mapreduce jobs
      • Job types: Mapreduce; Remote; Local (for orchestrating a pipeline of Mapreducer and Remote jobs)
      • Shuffle options: Sorted (shuffle by key); Grouped; Hashsorted (hashtable on the key)
      [Diagram: Input -> Map -> Shuffle -> Reduce -> Output, with Sorted / Grouped / Hashsorted shown at the Shuffle stage; labels: map(), reduce(), (key,val) pairs]
  • 21. EXAMPLE - WORD COUNT
  • 23.
  • 24. Hadoop • C++ map, reduce • But, cygwin • Qizmt • Master • IDE
  • 25. Q&A