2. WHAT IS BIG DATA?
Big Data is a collection of data sets that are large and complex in
nature.
OR
Data that is so large in volume that it is difficult to store and
process. Big Data can be analyzed for insights that lead to better
decisions and strategic business moves.
Big Data mostly consists of semi-structured and unstructured data
that grows so large, so fast, that it cannot be managed by a
traditional relational database system or other conventional tools.
IT FOR MANAGERS 2
3. SOURCES OF BIG DATA
Social Networking: Facebook, Twitter, Instagram, Google+ etc.
Sensors: Used in Aircraft, Cars, Industrial Machines, Space
Technology, CCTV Footage etc.
Data Created From Transportation Services: Aviation,
Railways, Shipping etc.
Online Shopping Portals: Amazon, Flipkart, Snapdeal,
Alibaba etc.
Mobile Applications: WhatsApp, Google Hangouts, Hike etc.
Data Created by Different Organizations: Educational Institutes,
Banks, Hospitals, Software Companies etc.
4. CHARACTERISTICS OF BIG DATA
Big Data has three characteristics, known as the 3 V's:
1. VOLUME - Data is measured in terabytes, petabytes and zettabytes.
2. VELOCITY - Data grows so fast that storing and processing it
becomes a challenge.
3. VARIETY - Data arrives in different forms:
I. Unstructured data - Videos, Audio, Images, Texts.
II. Semi-structured data - Log Files.
5. Facts And Figures Related To Big Data
Video: "Big Data...What it Means to You" (YouTube).
Source: SAS Thailand
6. WHAT IS ANALYTICS?
Analytics is the process of taking data and applying
mathematical and statistical algorithms or tools to build
a model. The model can be predictive or exploratory. It
carries information that gives us insights, and those
insights allow us to take action.
USE OF DATA -> STATISTICAL ANALYSIS -> MODEL -> GAIN INSIGHTS -> ACT ON COMPLEX ISSUES
7. TYPES OF ANALYTICS
1. DESCRIPTIVE - What happened?
2. DIAGNOSTIC - Why did it happen?
3. PREDICTIVE - What is likely to happen?
4. PRESCRIPTIVE - What should I do about it?
[Figure: the four types (1 to 4) plotted by Level Of Impact against Skill Level Present.]
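The difference between the types can be illustrated with a small Python sketch. The sales figures, the function names and the `capacity` threshold are all hypothetical, chosen only for illustration, and the straight-line forecast is just one simple choice of predictive model. Diagnostic analytics is left out because it needs explanatory variables that this toy data does not have.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative data, not from the slides).
monthly_sales = [100, 110, 120, 130, 140, 150]

def describe(values):
    """Descriptive analytics: summarize what happened."""
    return {"total": sum(values), "average": mean(values)}

def fit_line(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, y_bar - slope * x_bar

def predict_next(values):
    """Predictive analytics: estimate what is likely to happen next period."""
    slope, intercept = fit_line(values)
    return slope * len(values) + intercept

def prescribe(values, capacity):
    """Prescriptive analytics: turn the forecast into a suggested action."""
    return "add capacity" if predict_next(values) > capacity else "hold"

print(describe(monthly_sales))
print(predict_next(monthly_sales))
print(prescribe(monthly_sales, capacity=155))
```

Each step builds on the one before it: the summary describes the past, the fitted line extends it into the future, and the last function converts the forecast into a decision.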
8. Tools Of Analytics
Most used statistical programming tools:
IBM SPSS
SAS
Stata
R (Open Source)
MATLAB
All of these tools except R are commercial and can be very
expensive.
R and MATLAB have the most comprehensive support for
statistical functions.
R is popular at companies such as Yahoo and Google.
9. BIG DATA ANALYTICS
Analytics applied to Big Data is called Big Data Analytics. It is the
process of collecting, organizing and analyzing data to discover
patterns and other useful information that allow us to take proper
action.
ANALYTICS CHALLENGES WITH BIG DATA
• Traditional RDBMSs cannot handle Big Data.
• Big Data cannot fit on a single computer.
• Processing Big Data on a single computer takes a lot of time.
• Analyzing Big Data with traditional analytics is costly.
• Scaling a traditional RDBMS to Big Data is expensive.
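A rough calculation shows why a single computer is too slow. The sketch below assumes one petabyte of data and a sequential read rate of 100 MB/s per machine. Both figures are illustrative assumptions, not measurements, and the model ignores all coordination overhead.

```python
def hours_to_scan(data_bytes, bytes_per_second_per_node, nodes=1):
    """Ideal time to read the data once, split evenly across nodes (no overhead)."""
    return data_bytes / (bytes_per_second_per_node * nodes) / 3600

PETABYTE = 10 ** 15        # decimal petabyte
DISK_RATE = 100 * 10 ** 6  # assumed 100 MB/s sequential read per machine

single = hours_to_scan(PETABYTE, DISK_RATE)            # one computer
cluster = hours_to_scan(PETABYTE, DISK_RATE, 1000)     # 1,000 machines in parallel

print(f"single machine: {single:.0f} hours (about {single / 24:.0f} days)")
print(f"1,000 machines: {cluster:.1f} hours")
```

Under these assumptions, one machine needs roughly 116 days just to read the data once, while 1,000 machines working in parallel need under three hours.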
10. Big Data Analytics Tool And Technology
HADOOP - An open-source framework that lets us analyze data more
cheaply and quickly on a cluster of commodity hardware. It provides
massive storage for any kind of data, along with enormous
processing power.
HDFS (Hadoop Distributed File System): A Java-based, scalable
file system that stores data across multiple machines without prior
organization.
MapReduce: A programming model for processing large sets of
data in parallel.
Hadoop = HDFS + MapReduce
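The MapReduce model can be sketched in plain Python with the classic word-count example. This is a single-process illustration of the map, shuffle and reduce steps, not Hadoop's actual API. The function names are hypothetical, and a real cluster would run the map and reduce calls on many machines at once.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

print(word_count(["big data", "Big analytics"]))
```

Because each line is mapped independently and each key is reduced independently, both phases can be spread across many nodes, which is what makes the model suitable for parallel processing.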
11. Benefits Of Hadoop
Computing Power: Its distributed computing model processes Big Data quickly.
The more computing nodes we use, the more processing power we have.
Flexibility: We can store as much data as we want and decide how to use it later.
This includes unstructured data like text, images and videos.
Low Cost: The open-source framework is free and uses commodity hardware to
store large quantities of data.
Scalability: We can easily grow our system simply by adding more nodes.
12. High Scale Computing Platform for Big Data Analytics
[Figure: platform architecture centred on HDFS. Components shown, in
the order they appear in the diagram:
Structured data in RDBMS - Sqoop
Unstructured data - Pig
Online data stream - Real time learning system
System/web logs - Flume
Internal data transformation - Pig, Hive
R Hadoop]
13. Big Data Analytics In Banks
1. Data creation
2. Collection of data
3. Storage in the bank's own HDFS
4. Fetching of data
5. Model formation
6. Drawing insights from the model
7. Taking action
15. Benefits Of Big Data Analytics in Banking Sector
Fraud Detection: It helps banks detect, prevent and eliminate
internal and external fraud, and reduce the associated costs.
Risk Management: Banks analyse transaction data to determine
risk and exposure based on simulated market behavior, and to score
customers and potential clients.
Contact Center Efficiency Optimization: It helps banks
resolve customers' problems quickly by allowing them to
anticipate customers' needs ahead of time.
Customer Segmentation For Optimized Offers: It
provides a way to understand customers' needs at a granular level
so that banks can deliver targeted offers more effectively.
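As a simplified illustration of the fraud detection idea, the sketch below flags transaction amounts that lie far from a customer's usual spending. The amounts, the function name and the threshold of 2 standard deviations are hypothetical choices for this example. Real fraud systems use far richer models and many more signals.

```python
from statistics import mean, stdev

def flag_unusual(amounts, threshold=2.0):
    """Return the amounts lying more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(amounts), stdev(amounts)
    if sigma == 0:
        return []  # all amounts identical, nothing stands out
    return [a for a in amounts if abs(a - mu) / sigma > threshold]

# Hypothetical card transactions for one customer (illustrative data).
transactions = [50, 52, 48, 51, 49, 50, 53, 47, 5000]
print(flag_unusual(transactions))
```

The single very large payment is the only amount flagged, because it is the only one that sits far outside the customer's normal pattern.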
16. Benefits Of Big Data Analytics in Banking Sector (Continued)
Customer Churn Analysis: It helps banks retain their
customers by analyzing their behavior and identifying patterns
that lead to customer abandonment.
Sentiment Analysis: This tool helps banks analyse social
media to monitor user sentiment toward a firm, brand or product.
Customer Experience Analytics: It can provide better
insight and understanding, allowing banks to match offers to a
customer's needs.
17. Conclusion
Banks create large amounts of data every day. The data
is created much faster than a single machine can process
it, so handling it in bulk on one system is difficult.
Storing and processing Big Data is faster when the data
is stored in a distributed manner.
The Hadoop framework provides this kind of cluster, in
which Big Data is distributed among different systems.
Adding more nodes lets data be stored in more locations,
and because HDFS keeps replicated copies of the data, the
failure of a single node does not cause data loss.
By using Big Data analytics, banks can run more profitably.
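The claim that a node failure causes no data loss rests on replication: HDFS keeps several copies of each block (three by default) on different nodes. The sketch below illustrates the idea with a simple round-robin placement. The function names and node labels are hypothetical, and real HDFS uses a rack-aware placement policy rather than this simplified scheme.

```python
def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Assumes replication <= len(nodes), so the copies land on different nodes.
    """
    placement = {}
    for block in range(num_blocks):
        placement[block] = [nodes[(block + i) % len(nodes)] for i in range(replication)]
    return placement

def lost_blocks(placement, failed):
    """Blocks with no surviving copy once the nodes in `failed` are down."""
    return [b for b, holders in placement.items() if all(n in failed for n in holders)]

placement = place_blocks(8, ["n1", "n2", "n3", "n4"])
print(lost_blocks(placement, {"n2"}))              # one node down
print(lost_blocks(placement, {"n1", "n2", "n3"}))  # three nodes down at once
```

With three copies of every block, losing one node leaves at least two copies of each block intact. Data is lost only if every node holding a given block fails at the same time.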
18. References
1. Introduction to big data analytics: A Webinar by WizIq Education Online.
2. Big Data analytics using Hadoop: A Lecture by Durga Software Solutions.
3. Book Followed: Information Technology for Management by Efraim Turban, Linda Volonino.
4. Websites Followed: www.flysas.com, www.smartdatacollective.com