2. Outline
Introduction
What is Big Data
Generators of Big Data
Characteristics of Big Data
Benefits of Big Data
Hadoop
Hadoop components
Big Data vs. Hadoop
5. What is Big Data
Very large data sets that may be analyzed computationally to
reveal patterns, trends, and associations, especially relating to
customer behavior and interactions.
A technology term for data that has become too large to be
managed with the methods that previously worked
well.
6. Big Data generators
This data comes from everywhere:
sensors used to gather climate information,
posts to social media sites,
digital pictures,
online shopping,
airlines, and many more…
This data is “big data.”
7. Characteristics of Big Data
“Big data is the data characterized by 3 attributes: Volume, Velocity and Variety.”
8. Volume
Volume is the size of the data, which determines the value and potential of the data
under consideration. The name ‘Big Data’ itself refers to size, hence this
characteristic.
9. Variety
Data today comes in all types of formats: structured, numeric data in traditional
databases; unstructured text documents, email, stock ticker data and financial
transactions; and semi-structured data too.
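
As a rough illustration of the three kinds of format, the sketch below handles one tiny sample of each. All sample values and field names (`order_id`, `amount`, `user`, `tags`) are made up for this example and do not come from any real data set.

```python
import csv
import io
import json

# Structured: fixed columns, as in a traditional database table
# (hypothetical rows).
structured = io.StringIO("order_id,amount\n1001,25.50\n1002,13.00\n")
rows = list(csv.DictReader(structured))
total = sum(float(r["amount"]) for r in rows)

# Semi-structured: self-describing records whose fields may vary
# (hypothetical JSON document).
semi = json.loads('{"user": "ana", "tags": ["sale", "shoes"]}')
tag_count = len(semi.get("tags", []))

# Unstructured: free text with no schema; here we only count words.
text = "Loved the shoes, fast delivery!"
word_count = len(text.split())

print(total, tag_count, word_count)
```

Each kind needs a different tool just to read it, which is one reason variety makes big data harder to process than size alone.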
10. Velocity
Velocity is the speed at which data is generated and processed to meet the
demands and challenges that lie ahead in the path of growth and
development.
11. Facebook generates 100 TB of data daily
Twitter generates 8 TB of data daily
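
To put these daily volumes in terms of velocity, the small sketch below converts them to an average rate per second. It assumes decimal terabytes (1 TB = 10^12 bytes) and an even spread over the day, neither of which the slide states; the function name `bytes_per_second` is hypothetical.

```python
TB = 10**12                     # assumption: decimal terabyte
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def bytes_per_second(tb_per_day):
    """Average ingest rate in bytes/second for a given daily volume."""
    return tb_per_day * TB / SECONDS_PER_DAY


facebook_rate = bytes_per_second(100)  # figure taken from the slide
twitter_rate = bytes_per_second(8)     # figure taken from the slide

print(f"Facebook: {facebook_rate / 1e9:.2f} GB/s")  # about 1.16 GB/s
print(f"Twitter:  {twitter_rate / 1e6:.1f} MB/s")   # about 92.6 MB/s
```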
12. Benefits of Big Data
Cost Reduction from Big Data Technologies
Time Reduction from Big Data
Developing New Big Data-Based Offerings
Supporting Internal Business Decisions
Real-time big data isn’t just a process for storing petabytes or exabytes of
data in a data warehouse. It’s about the ability to make better decisions and
take meaningful actions at the right time.
15. What is Hadoop
Flexible and available architecture for large-scale computation and data
processing on a network of commodity hardware
Framework that allows for distributed processing of large data sets across
clusters of commodity servers
– Store large amounts of data
– Process the large amounts of data stored
Getting results from HDFS
16. Hadoop Components
Hadoop Distributed File System (HDFS)
MapReduce
NameNode
DataNode
Pig, Hive
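
The MapReduce idea can be sketched in plain Python as a word count. This is a single-process illustration of the map, shuffle, and reduce steps only; it does not use the Hadoop API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical. On a real cluster, the framework would run the map and reduce steps in parallel on many nodes over data stored in HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values by key, between the map and reduce steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data big", "data hadoop"])
print(counts)  # {'big': 2, 'data': 2, 'hadoop': 1}
```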