
Introduction to Hadoop : A bird eye's view | Abhishek Mukherjee


Introduction to Hadoop and MapReduce: 'a head start to the algorithm'.

Published in: Technology

  2. Hello! I am Abhishek Mukherjee
  3. OVERVIEW ▷ What is Hadoop? ▷ Short intro to the HDFS architecture ▷ What is MapReduce? ▷ The components of the MapReduce algorithm
  4. What is Hadoop? Let’s start with the 1st set of slides
  5. “Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment.”
  6. Short introduction to the HDFS. Let’s start with the 2nd set of slides
  7. Each of these platforms has its own uses and will be dealt with in detail in our upcoming presentations
  8. HDFS Architecture ▷ Follows a master-slave architecture ▷ The master is the NameNode and the slaves are the DataNodes
  9. MapReduce algorithm. Let’s start with the 3rd set of slides
  10. Delving into the algorithm. Use case: word count
  11. Phases of MapReduce ▷ Map Phase ▷ Combiner Phase (Optional) ▷ Sort Phase ▷ Shuffle Phase ▷ Partition Phase (Optional) ▷ Reducer Phase
  12. Map Phase. Take this as the input file: "Hello my name is abhishek" / "Hello my name is utsav" / "Hello my passion is cricket" ▷ This file has 3 lines ▷ Each line in the file has a byte offset of its own, which serves as the key passed to the mapper; the value passed to the mapper is the text of that line
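The map step described on this slide can be sketched in plain Python. This is only an illustration of the idea, not Hadoop's Java API: the helper names `read_records` and `map_words` are hypothetical, and the input is assumed to be UTF-8 text with `\n` line endings.

```python
# Illustrative sketch only: plain Python, not the Hadoop API.
# read_records and map_words are hypothetical helper names.

INPUT_TEXT = (
    "Hello my name is abhishek\n"
    "Hello my name is utsav\n"
    "Hello my passion is cricket\n"
)


def read_records(text):
    """Yield (byte_offset, line) pairs, one per line of the input."""
    offset = 0
    for raw in text.encode("utf-8").splitlines(keepends=True):
        yield offset, raw.decode("utf-8").rstrip("\n")
        offset += len(raw)  # the next line starts after this line's bytes


def map_words(offset, line):
    """Mapper: the offset key is ignored; emit (word, 1) for every word."""
    for word in line.split():
        yield word, 1


records = list(read_records(INPUT_TEXT))
mapped = [pair for offset, line in records for pair in map_words(offset, line)]

for offset, line in records:
    print(offset, repr(line))
print(mapped)
```

Each record's key is the byte position at which its line starts, and the mapper emits one `(word, 1)` pair per word, 15 pairs in total for this input.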
  13. Operation on output of map phase. Output of mapper: Hello 1, my 1, name 1, is 1, abhishek 1, Hello 1, my 1, name 1, is 1, utsav 1, Hello 1, my 1, passion 1, is 1, cricket 1. Input to reducer: abhishek(1) cricket(1) Hello(1,1,1) is(1,1,1) my(1,1,1) name(1,1) passion(1) utsav(1)
  14. Explanation of sort and shuffle phase ▷ Sort the key-value pairs by key ▷ Shuffle the mapped output so that all values with the same key are collected into one tuple per key ▷ This output is fed to the reducer, which in turn reduces each tuple by returning a single value for the list of values it contains
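As a rough sketch of the sort and shuffle step, the snippet below groups the mapper's `(word, 1)` pairs by key and orders the groups. It is plain Python rather than Hadoop code; `shuffle_and_sort` is a hypothetical name, and the case-insensitive ordering is an assumption chosen to match the order shown on the slides.

```python
# Illustrative sketch only: plain Python, not the Hadoop API.
from collections import defaultdict

LINES = [
    "Hello my name is abhishek",
    "Hello my name is utsav",
    "Hello my passion is cricket",
]

# Mapper output: one (word, 1) pair per word, in input order.
pairs = [(word, 1) for line in LINES for word in line.split()]


def shuffle_and_sort(pairs):
    """Group values by key, then order the groups by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # Assumption: case-insensitive ordering, to match the slide's listing.
    return sorted(groups.items(), key=lambda item: item[0].lower())


grouped = shuffle_and_sort(pairs)
for word, ones in grouped:
    print(word, ones)
```

Every distinct word ends up with one list of values, for example `name` with `[1, 1]` because it occurs in two of the three lines.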
  15. Reducer phase. Reducer input: Hello(1,1,1) my(1,1,1) name(1,1) is(1,1,1) abhishek(1) utsav(1) passion(1) cricket(1). Reducer output: abhishek(1) cricket(1) Hello(3) is(3) my(3) name(2) passion(1) utsav(1)
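The reducer for word count simply sums the list of ones it receives for each key. A minimal Python sketch, with `reduce_counts` as a hypothetical name and the grouped input written out by hand from the three-line example file:

```python
# Illustrative sketch only: plain Python, not the Hadoop API.

# Grouped input as it would arrive after sort and shuffle.
reducer_input = [
    ("abhishek", [1]),
    ("cricket", [1]),
    ("Hello", [1, 1, 1]),
    ("is", [1, 1, 1]),
    ("my", [1, 1, 1]),
    ("name", [1, 1]),
    ("passion", [1]),
    ("utsav", [1]),
]


def reduce_counts(word, counts):
    """Reducer: collapse the list of values for one key into a single total."""
    return word, sum(counts)


reducer_output = [reduce_counts(word, counts) for word, counts in reducer_input]
for word, total in reducer_output:
    print(word, total)
```

Because "name" appears in only two of the three input lines, its total is 2, while "Hello", "is" and "my" each total 3.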
  16. Thanks! Any questions? You can find me at: scobbyabhi9@gmail.com