2. What is Data
What is Big Data
Concepts in Big Data
Three Structures of Big Data
Sources and the Term Big Data Analytics
How Big Data Differs from a
Traditional Database
Applications of Big Data in Various Fields
Tools for Big Data
Problems and Solutions
Conclusion
3. Data is information processed or
stored by a computer.
Data may be processed by the
computer's CPU and is stored
in files and folders on the
computer's hard disk.
5. Big data is a term that describes a large
volume of structured, semi-structured and
unstructured data.
It is difficult to process using
traditional database and software techniques.
Big data can be analyzed for better decisions
and strategic business moves.
7. Volume. Organizations collect data from a
variety of sources, including business
transactions, social media, etc.
Velocity. Data streams in at an
unprecedented speed and must be dealt with
in a timely manner.
8. Variety. Data comes in all types of
formats – from structured, numeric data in
traditional databases to unstructured text
documents, email, video, audio, stock and
financial transactions.
9. 1 – Structured Data
2 – Semi-structured Data
3 – Unstructured Data
10. In big data, the term structured data
generally refers to data that has a defined
length and format.
Examples of structured data include
numbers, dates, and groups of words and
numbers called strings.
Structured data is the data you’re probably
used to dealing with. It’s usually stored in a
database.
11. 1) Point of sale data: When the cashier
scans the bar code of any product that you
are purchasing, all the data associated with
the product is generated.
12. 2) Financial data: Financial systems are
programmatic: they operate based on
predefined rules that automate processes.
Stock-trading data is a good example of this.
It contains structured data such as the
company symbol and dollar value.
3) Input data: This is any piece of data that a
human might input into a computer, such as
name, age, income, non-free-form survey
responses, and so on.
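As a minimal sketch of what "defined length and format" means in practice, the stock-trading example can be modeled as rows that all share the same two typed fields. The `Trade` class, the field names and the sample rows are hypothetical and chosen only for illustration.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class Trade:
    symbol: str       # company symbol
    price_usd: float  # dollar value


# Hypothetical sample data: every row has exactly the same two fields.
RAW = "symbol,price_usd\nABC,101.50\nXYZ,42.00\n"


def load_trades(text: str) -> List[Trade]:
    """Parse CSV text into Trade records with a fixed schema."""
    reader = csv.DictReader(io.StringIO(text))
    return [Trade(row["symbol"], float(row["price_usd"])) for row in reader]


trades = load_trades(RAW)
print(trades)
```

Because every record has the same fields and types, data like this maps directly onto the rows and columns of a database table.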
13. Semi-structured data does not reside
in fixed fields or records, but does contain
elements that can separate the data
into various hierarchies.
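JSON is a common example of this kind of data. The sketch below, which uses hypothetical sample records and a hypothetical `flatten` helper, shows two records that do not share the same fields but still carry keys and nesting that describe their hierarchy.

```python
import json

# Hypothetical records: both are valid, yet their fields differ.
RAW = """
[
  {"name": "Asha", "contact": {"email": "asha@example.com"}},
  {"name": "Ravi", "contact": {"phone": "555-0100", "city": "Pune"}, "tags": ["vip"]}
]
"""


def flatten(value, prefix=""):
    """Return a dict mapping dotted paths (e.g. "contact.city") to leaf values."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, path))
    return flat


records = json.loads(RAW)
for record in records:
    print(flatten(record))
```

The dotted paths make the hierarchy visible: each record can be explored through its own keys even though there is no fixed set of columns.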
15. Unstructured data, in contrast, refers
to data that doesn’t fit neatly into the
traditional row and column structure
of relational databases. Examples of
unstructured data include: emails,
videos, audio files, web pages, and
social media messages.
16. 1) Satellite images: This includes weather
data or the data that the government
captures in its satellite surveillance imagery,
as well as images of Earth or other planets
collected by imaging satellites operated by
governments and businesses around the world.
2) Photographs and video: This includes
security surveillance video and traffic video.
17. 3) Text internal to your company: Think of all
the text within documents, logs, survey
results, etc. Enterprise information actually
represents a large percentage of the text
information in the world today.
4) Social media data: This data is generated
from social media platforms such as
YouTube, Facebook, Twitter and LinkedIn.
18. 5) Black Box Data
This is the data generated by aircraft,
including jets and helicopters.
Black box data includes flight crew voices,
microphone recordings, and aircraft
performance information.
21. Big data analytics is the often complex
process of examining large and varied
data sets, or big data, to uncover
information, including unknown
correlations, market trends and
customer preferences, that can help
organizations make informed
business decisions.
22. Capturing data
Data storage
Data analysis
Search
Sharing and transfer
Visualization
Querying
Updating
23. 1) ACCURACY OF THE DATA
With traditional data, it is difficult to
maintain the accuracy, confidentiality and
quality of the data. Big data, by contrast,
provides high accuracy and makes
the results more accurate.
24. 2) DATA STORAGE
With a traditional database, it is difficult
to store very large amounts of data. Big data
systems, however, can store huge volumes of
data easily. A traditional database
typically stores gigabytes to terabytes
of data, whereas big data systems can store
hundreds of terabytes, petabytes and
even more.
25. 3) MORE FLEXIBLE
Big data is flexible and can be handled easily without
any kind of disturbance. Previously, data
could only be saved in specific kinds of data structures.
4) FAST AND EASY
The whole process of getting the data, analyzing it and
producing the end reports becomes much simpler, easier
and faster.
26. 5) DATA VARIETY
Big data has the ability to process and
store all varieties of data, whether
structured, semi-structured or
unstructured. A traditional DBMS, by
contrast, can manage only structured and
semi-structured data.
28. 1) Big data in the Finance Sector
Financial services have widely
adopted big data analytics to
inform better investment
decisions with consistent returns.
29. 2) Big data in Telecommunication
Telecommunication networks need to share
data between cell towers, users and
processing centers. It is important to
process the data near the source and then
efficiently transfer it to various data centers
for further use.
30. 3) Big data in HealthCare
Big data is used for analyzing data in the
electronic medical record (EMR) system
with the goal of reducing costs and
improving patient care.
This data includes unstructured data
from physician notes, pathology reports,
etc.
31. 4) Big data in Social Media
One of the better examples of how big
data currently shapes our lives is social
media analytics. The user information
that is collected on social
networking platforms allows marketers
to have a better understanding of
customer behavior.
32. 5) Big data in Education
Today almost every course of
learning is available
online.
By analyzing big data, institutions can identify
at-risk students, make sure students
are making adequate progress, and
implement a better system for
evaluation and support of teachers.
33. 6) Big data in Banking
It helps uncover fraudulent activity.
It detects:
misuse of credit cards
misuse of debit cards
The Securities and Exchange Commission uses
big data to keep track of all
commercial market movements.
34. Retailers need to know the best way to
market to customers. Big data is used to find
the most effective way to handle transactions
and the most strategic way to bring back
lapsed business.
37. In 2005, Doug Cutting and
Michael Cafarella developed
Hadoop,
one of the most popular open source
big data solutions.
38. Hadoop is an open source framework
that allows us to store and process large
data sets in a parallel and distributed
fashion.
It can run applications on clusters with
thousands of nodes involving petabytes
of data.
It is a Java-based programming framework.
40. The Hadoop Distributed File System (HDFS) is
the primary data storage system used
by Hadoop applications.
It employs a NameNode and DataNode
architecture to implement a distributed file
system.
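The NameNode and DataNode split can be illustrated with a toy model in plain Python. This is a sketch only and not the Hadoop API: the class names, the `put` and `get` helpers, the tiny block size and the round-robin placement are all assumptions made for illustration. It relies on the general HDFS idea that a file is split into blocks, each block is replicated on several DataNodes, and the NameNode keeps only the metadata that records where each block lives.

```python
BLOCK_SIZE = 4    # bytes per block; tiny on purpose, real HDFS blocks are far larger
REPLICATION = 2   # number of copies kept of each block (assumed value)


class DataNode:
    """Stores the actual block contents."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block_id -> bytes


class NameNode:
    """Keeps only metadata: which blocks make up a file and where they live."""

    def __init__(self, datanodes):
        self.datanodes = datanodes
        self.block_map = {}  # filename -> [(block_id, [datanode names])]
        self._cursor = 0

    def allocate(self, filename, block_count):
        """Choose DataNodes for each block (simple round-robin placement)."""
        plan = []
        for index in range(block_count):
            holders = [
                self.datanodes[(self._cursor + r) % len(self.datanodes)]
                for r in range(REPLICATION)
            ]
            self._cursor += 1
            plan.append((f"{filename}#{index}", holders))
        self.block_map[filename] = [(bid, [n.name for n in hs]) for bid, hs in plan]
        return plan


def put(namenode, filename, data):
    """Client side: split data into blocks and copy each block to its DataNodes."""
    chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    plan = namenode.allocate(filename, len(chunks))
    for (block_id, holders), chunk in zip(plan, chunks):
        for node in holders:
            node.blocks[block_id] = chunk


def get(namenode, filename):
    """Client side: look up block locations, then read the first replica of each."""
    by_name = {n.name: n for n in namenode.datanodes}
    return b"".join(
        by_name[holders[0]].blocks[block_id]
        for block_id, holders in namenode.block_map[filename]
    )


nodes = [DataNode("dn1"), DataNode("dn2"), DataNode("dn3")]
nn = NameNode(nodes)
put(nn, "log.txt", b"hello hdfs!")
print(nn.block_map["log.txt"])
print(get(nn, "log.txt"))
```

The NameNode never holds file contents here, only the block map, which mirrors the division of labor between the two node types.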
45. MapReduce is a programming
framework that allows us to perform
parallel processing on
large data sets in a distributed
environment.
46. 1) Map stage: The map or mapper’s job is to
process the input data. Generally the input
data is in the form of a file or directory and is
stored in the Hadoop file system (HDFS). The
input file is passed to the mapper function
line by line. The mapper processes the data
and creates several small chunks of data.
47. 2) Reduce stage: This stage is the
combination of the Shuffle stage and
the Reduce stage. The Reducer’s job is to
process the data that comes from the
mapper. After processing, it produces a new
set of output, which is stored in
HDFS.
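The two stages can be sketched on a single machine with a word count in plain Python. This is not Hadoop code: the function names and the sample lines are hypothetical, and a real job would run many mappers and reducers in parallel across a cluster.

```python
from collections import defaultdict


def mapper(line):
    """Map stage: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle stage: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce stage: combine the grouped values for one key."""
    return key, sum(values)


def word_count(lines):
    mapped = [pair for line in lines for pair in mapper(line)]
    grouped = shuffle(mapped)
    return dict(reducer(key, values) for key, values in grouped.items())


LINES = ["big data is big", "data is everywhere"]  # hypothetical input
counts = word_count(LINES)
print(counts)
```

Each line is processed independently by the mapper, which is what allows the map stage to be spread across many nodes.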
48. MongoDB is an open source database that
uses a document-oriented data model and a
non-structured query language.
Being a NoSQL tool means it does not use the
usual rows and columns that we so much
associate with an RDBMS.
Its architecture is built on collections
and documents.
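The collection and document idea can be pictured with plain Python dictionaries. The sketch below does not use MongoDB or its driver: the `users` collection, its fields and the `find` helper are hypothetical, and the helper only imitates the general idea of matching documents against a query document by top-level field equality.

```python
# A "collection" is modeled as a list of documents; documents need not share fields.
users = [
    {"_id": 1, "name": "Asha", "city": "Pune"},
    {"_id": 2, "name": "Ravi", "city": "Delhi", "interests": ["cricket", "chess"]},
    {"_id": 3, "name": "Meera", "address": {"city": "Pune", "pin": "411001"}},
]


def find(collection, query):
    """Return documents whose top-level fields equal every field in the query."""
    return [
        doc
        for doc in collection
        if all(doc.get(field) == value for field, value in query.items())
    ]


pune = find(users, {"city": "Pune"})
print([doc["name"] for doc in pune])
```

The third document stores its city inside a nested `address` document, so this simple top-level match does not return it. That flexibility in document shape is what separates the document model from fixed rows and columns.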
49. Problem: The data is huge.
Solution: HDFS
Problem: Data appears in different formats.
Solution: HDFS allows storing any kind of
data (structured, semi-structured and
unstructured).