1. Welcome to the World of Big Data & Hadoop
www.easylearning.guru
2. Agenda
What is Big Data?
Different Kinds of Big Data
Big Data Global Market
Hadoop Global Job Trends
What is Hadoop?
3. What is Big Data?
Big data is the term for a collection of data
sets so large and complex that they become
difficult to process using on-hand database
management tools or traditional data
processing applications.
4. Types of Big Data
A traditional RDBMS deals
only with structured data.
Big data needs a technology that handles
semi-structured and unstructured data
as well as structured data.
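To make the three kinds of data concrete, here is a minimal sketch using only the Python standard library. The sample records, names, and field layout are invented for illustration and do not come from the slides.

```python
import csv
import io
import json
import re

# Structured: fixed columns, maps directly onto a relational table.
structured = "id,name,city\n1,Asha,Delhi\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing, but fields may vary from record to record.
semi_structured = '{"id": 2, "name": "Ravi", "tags": ["hadoop", "hdfs"]}'
record = json.loads(semi_structured)

# Unstructured: free text with no schema; any structure must be extracted.
unstructured = "Ravi wrote: loving the #hadoop demo today!"
hashtags = re.findall(r"#(\w+)", unstructured)

print(rows[0]["city"], record["tags"], hashtags)
```

The structured row can be loaded into an RDBMS as-is, while the JSON record and the free text need extra parsing before they fit a fixed schema.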
6. Sources of Data
Social Media & Networks
(All of us are generating data)
Mobile Devices
(Tracking all the objects all the time)
Sensor Technology & Networks
(Measuring all kinds of data)
Scientific Instruments
(Collecting all sorts of data)
10. Big Data Global Market
[Chart: Big Data Implementation (Implemented Big Data vs. Yet to Implement Big Data)]
[Chart: Big Data Growth (in USD Billions), 2012 to 2017]
Filled / Vacancy (%) by role:
Big Data Analyst: 50 filled, 50 unfilled
Big Data Architect: 43 filled, 57 unfilled
Big Data Engineer: 44 filled, 56 unfilled
Big Data Research Analyst: 31 filled, 69 unfilled
Big Data Visualizer: 23 filled, 77 unfilled
Data Scientist: 18 filled, 82 unfilled
Sources: Dice, LinkedIn.
11. Hadoop Global Job Trends
Top Hadoop Technology Companies
More than 17,000 employees with Hadoop
skills across these companies.
Sources: Dice, LinkedIn.
12. Hadoop Global Job Trends
[Chart: Demand for Big Data in Cities, as of February 2014. Shares shown: 2%, 2%, 3%, 4%, 8%, 8%, 10%, 11%, 14%, 38%]
[Chart: Salary (USD p.a., in thousands)]
Sources: Dice, LinkedIn.
13. What is Hadoop?
Hadoop was created by Doug Cutting and Mike Cafarella.
Hadoop provides a reliable shared storage and analysis
system.
It is designed to scale from a single server to thousands of
machines, with a high degree of fault tolerance.
15. Hadoop Core Components
Core Hadoop has two main systems:
• Hadoop Distributed File System (HDFS): a distributed file system
that stores large amounts of data across multiple nodes in a
cluster.
• MapReduce: a distributed programming paradigm used to analyze
the data stored in HDFS.
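The MapReduce paradigm can be illustrated with a small single-process word count. This is a plain-Python sketch of the map, shuffle, and reduce steps, not the Hadoop API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and a real job would run the phases in parallel across the cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce sequentially over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data big cluster", "data node"])
print(counts)
```

Because each call to `map_phase` and `reduce_phase` depends only on its own input, the work can be spread over many machines, which is what lets the paradigm scale.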
16. Hadoop Distributed File System (HDFS)
A file is split into blocks (default 64 MB in Hadoop 1.x, 128 MB in
Hadoop 2.x and later), and each block is replicated across the cluster
(default replication factor 3).
Optimized for throughput.
HDFS lets you put, get, and delete files.
Follows the philosophy
“Write once, read many times”
Block replication provides:
- Durability, high availability, and throughput.
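The block and replication arithmetic can be worked through with a short calculation. This is a rough sketch that assumes the slide's defaults (64 MB blocks, replication factor 3); the helper `hdfs_footprint` is hypothetical, and the 1000 MB file is an invented example. It also assumes the final partial block occupies only its actual size rather than a full block.

```python
import math


def hdfs_footprint(file_size_mb, block_size_mb=64, replication=3):
    """Estimate block count, replica count, and raw storage for one file."""
    # The last block may be partial, so round the block count up.
    blocks = math.ceil(file_size_mb / block_size_mb)
    # Every block is stored `replication` times across the cluster.
    block_replicas = blocks * replication
    # Raw storage is the file size times the replication factor.
    raw_storage_mb = file_size_mb * replication
    return blocks, block_replicas, raw_storage_mb


blocks, replicas, raw_mb = hdfs_footprint(1000)
print(blocks, replicas, raw_mb)  # 16 48 3000
```

With three copies of every block, the cluster can lose a node without losing data, and reads can be served from whichever replica is closest.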
23. Thank you for watching the Live Demo for Hadoop.
Your queries are always welcome.
You can always contact us at:
Phone : +91 124 4763660 (India)
Email : contact@easylearning.guru
Skype Id : easylearning.guru
Website : www.easylearning.guru