2. What is Hadoop?
● Open-source framework developed and maintained by the Apache Software Foundation
● Consists of a distributed file system (HDFS) that stores large files as blocks spread across a cluster
● MapReduce framework: a programming model for large-scale data processing
● YARN: a resource management platform for managing computing resources in clusters
3. MapReduce In Practice
Problem: We have a large number of attendees at this meetup and want to count the number of Java engineers, C++ engineers, and C engineers present.
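The counting problem can be sketched as a map step, a shuffle step, and a reduce step. The following is a minimal in-memory simulation in plain Python, not Hadoop's actual API; the attendee list and the function names (map_phase, shuffle, reduce_phase, count_engineers) are made up for illustration.

```python
from collections import defaultdict

# Hypothetical attendee records: (name, primary language).
ATTENDEES = [
    ("Asha", "Java"),
    ("Ben", "C++"),
    ("Chen", "C"),
    ("Dana", "Java"),
    ("Eli", "C++"),
    ("Fay", "Java"),
]


def map_phase(record):
    """Map: emit one (language, 1) pair per attendee."""
    _name, language = record
    yield (language, 1)


def shuffle(pairs):
    """Shuffle: group all emitted values by key.

    In a real cluster the framework does this between map and reduce.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: sum the ones collected for a single language."""
    return key, sum(values)


def count_engineers(records):
    """Run map, shuffle, and reduce over the records in one process."""
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(count_engineers(ATTENDEES))
```

On a real cluster the map calls would run in parallel on different nodes, each over its own slice of the attendee list, and each reduce call would receive all the values for one language.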