The Codex of Business Writing Software for Real-World Solutions 2.pptx
Large Scale Data Analytics
1. Large Scale Data Analytics
Shankar Radhakrishnan
shankar.r3@cognizant.com
linkedin.com/in/connect2shankar
2. Scenario
• An insurer uses meteorological data for its pricing model
• At present, data from 2,000 weather stations is collected for analysis
• The plan is to use data from 10,000 (or more) weather stations
• A stochastic simulation must run to identify patterns in the weather data and determine pricing
• Data volume: petabytes of information (for one region)
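As a rough illustration of the stochastic simulation in this scenario, the sketch below runs a small Monte Carlo simulation over a handful of weather stations and derives a premium from the simulated losses. Everything in it is an assumption made for illustration: the function names, the normally distributed peak rainfall, the payout threshold, the payout amount, and the loading factor are hypothetical and do not come from the insurer's actual model.

```python
import random
import statistics


def simulate_annual_loss(stations, rng, threshold_mm=120.0, payout=1000.0):
    """Simulate one year of losses.

    Each station is a (mean_mm, sd_mm) pair describing a hypothetical
    normal distribution of peak daily rainfall. A fixed payout is assumed
    whenever the simulated peak exceeds the threshold.
    """
    loss = 0.0
    for mean_mm, sd_mm in stations:
        peak = rng.gauss(mean_mm, sd_mm)
        if peak > threshold_mm:
            loss += payout
    return loss


def estimate_premium(stations, n_trials=2000, loading=0.2, seed=42):
    """Average the simulated losses and add a loading factor (assumed 20%)."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    losses = [simulate_annual_loss(stations, rng) for _ in range(n_trials)]
    expected_loss = statistics.fmean(losses)
    return expected_loss * (1.0 + loading), expected_loss


# Three hypothetical stations; the real scenario has thousands.
stations = [(80.0, 25.0), (100.0, 30.0), (60.0, 20.0)]
premium, expected_loss = estimate_premium(stations)
```

At the scale described in the scenario (10,000 stations, petabytes of history), the per-trial loop would typically be distributed across a cluster rather than run in a single process.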
7. Large Scale Data Analytics
“Involves using different algorithms, distributed platforms, tools and techniques to analyze big data and provide actionable insights”
8. Big Data
“Data sets that are very large in volume and complex”
New platforms, tools and techniques have emerged to manage Big Data
We broke away from traditional ways of processing and analyzing it
9. Data Structures
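Columns (data and processing characteristics): Vector, Matrix, or Complex Structure; Free Text; Image or Binary Data; Data “Bags”; Iterative Logic or Complex Branching; Advanced Analytic Routines; Rapidly Repeated Measurements; Extreme Low Latency; Access to All Data Required
Rows (use cases), with one X per applicable characteristic: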
Search Ranking X X X X X X
Ad Tracking X X X X X X X X
Location or Proximity Tracking X X X X X
Social CRM X X X X X X X
Document Similarity Testing X X X X X X X X
Genomic Analysis X X X X X
Customer Cohort groups X X X X X X
Fraud Detection X X X X X X X X X
Smart Utility Metering X X X X X X
Churn Analysis X X X X X X X
Satellite Image Analysis X X X X
Game Gesture Analysis X X X X X X X X
Data Bag Exploration X X X X X X
10. Business Interests : Well Informed Customer Executive
Speech-to-Text Conversion
Voice Data
Unstructured Data Analytical System
Customer Persona
• Customer Persona: demographics, top interactions, channel preferences, dissatisfiers
• Customer Lifetime Value
• Recent Contact History
• Customer Sentiment & Trend during the call
Customer’s state of mind
Sentiment Analysis
Social media
Depositions
Complaints
Other channel information (ATM, Branch)
Big Data Warehouse
Traditional Warehouse
Decision Engine
• Customer Executive Dashboard presents all intelligence required to make a decision
• The decision engine also presents important decisions to be taken for the particular customer issue
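The "Customer Sentiment & Trend during the call" element on this slide can be sketched as a rolling average of per-utterance sentiment scores computed on the speech-to-text output. The example below is only a minimal sketch: the word lists, function names, and window size are hypothetical, and a production system would use a trained sentiment model rather than a hand-written lexicon.

```python
from collections import deque

# Hypothetical mini-lexicon, for illustration only.
POSITIVE = {"thanks", "great", "happy", "resolved", "good"}
NEGATIVE = {"angry", "cancel", "unacceptable", "fee", "frustrated"}


def score_utterance(text):
    """Count positive words minus negative words in one transcribed utterance."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def sentiment_trend(utterances, window=3):
    """Return the rolling mean score after each utterance (the 'barometer')."""
    recent = deque(maxlen=window)  # keeps only the last `window` scores
    trend = []
    for text in utterances:
        recent.append(score_utterance(text))
        trend.append(sum(recent) / len(recent))
    return trend


# A made-up call transcript, already converted from speech to text.
call = [
    "I am frustrated about this overdraft fee",
    "This is unacceptable, I want to cancel my account",
    "Okay, thanks for reversing it",
    "Great, I am happy that it is resolved",
]
trend = sentiment_trend(call)
```

A rising trend over the call suggests the customer's state of mind is improving; a falling one could be surfaced on the executive's dashboard as increased attrition risk.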
11. Well Informed Customer Executive…
Customer calls Banking Call Center
Executive understands the customer problem
Executive authenticates customer and pulls up Customer Persona
Executive reviews risk of attrition against Customer Lifetime Value
Executive reviews last 5 call center and banking transactions
Executive views customer’s state of mind (risk of attrition) through a barometer chart
Analytical Solution - converts speech to text
Analytical engine listens to customer voice
Decision Engine: suggested top 5 actions required
Executive performs the actions below based on his or her own analysis and the recommendations from the Decision Engine
1. Reversal of overdraft fee
2. One-time fee waiver on cheque book (predicting customer need based on historic usage cycles)
3. Cash-back reward card for a minimum spend of $X through debit card
4. Offer interest revision for investment products or mortgage
5. Promote new mutual funds or credit cards based on customer willingness
Analytical engine monitors sentiment
Executive analyzes Customer Persona (demographics / preferences / satisfiers / dissatisfiers, etc.)
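One way to picture how a decision engine could produce the "suggested top 5 actions" is a simple rule-based ranking over the customer's context. The sketch below is an assumption-laden illustration: the `CustomerContext` fields, the eligibility rules, and the weights are all hypothetical and are not taken from the solution described on this slide.

```python
from dataclasses import dataclass


@dataclass
class CustomerContext:
    """Hypothetical inputs gathered from the persona and the live call."""
    lifetime_value: float        # estimated Customer Lifetime Value
    attrition_risk: float        # 0.0 (loyal) to 1.0 (about to leave)
    overdraft_fee_charged: bool
    cheque_book_due: bool        # predicted from historic usage cycles
    open_to_offers: bool


def recommend_actions(ctx, top_n=5):
    """Score each eligible action and return the names of the best top_n."""
    # Value at stake if the customer leaves; weights below are made up.
    retention = ctx.attrition_risk * ctx.lifetime_value
    candidates = []
    if ctx.overdraft_fee_charged:
        candidates.append(("Reverse overdraft fee", retention * 1.0))
    if ctx.cheque_book_due:
        candidates.append(("One-time cheque book fee waiver", retention * 0.6))
    candidates.append(("Offer cash-back reward card", retention * 0.5))
    candidates.append(("Offer interest revision", retention * 0.4))
    if ctx.open_to_offers:
        upsell = (1.0 - ctx.attrition_risk) * ctx.lifetime_value * 0.3
        candidates.append(("Promote new mutual funds or credit cards", upsell))
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:top_n]]


ctx = CustomerContext(
    lifetime_value=10000.0,
    attrition_risk=0.8,
    overdraft_fee_charged=True,
    cheque_book_due=True,
    open_to_offers=True,
)
actions = recommend_actions(ctx)
```

In this sketch a high attrition risk pushes retention actions (fee reversal, waivers) above promotional ones; a real engine would learn such weights from outcomes rather than hard-code them.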
12. Business Interests : Fraud Prevention
Envisaged Benefits
▪ New fraud patterns can be identified by building ‘analytical models’ to run against historical data
▪ ‘Web crawling’, ‘contextual text analysis’, and ‘Natural Language Processing’ allow fraud behavior to be identified from social media, which may increase the fraud detection success rate
▪ ‘Real time’ models capture behavioral patterns and analyze them against history data to evaluate fraud case validity. The model learns on its own and updates the ‘Fraud pattern master’ sets.
▪ Brings ‘artificially intelligent’ fraud pattern detection and analysis
▪ ‘Real time’ alerts (refresh rate on the order of 0.5 to 1 minute) to fraud analysts about ‘self-learned’ fraud patterns based on new customer behavior patterns
Big Data Usage
▪ Formation of key-value groups on the order of C(X, Y) (“X choose Y”), where X is the number of attributes relevant to fraud and Y is the number of attributes that should be combined to identify patterns
▪ High-speed loading of history data from source systems
▪ Efficient real-time fraud detection by identifying patterns in customer behavioral events and processing them against X years of history data, e.g. using HBase
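To make the size of the key-value groups concrete, the sketch below enumerates every combination of Y attributes out of X and builds one composite key per combination for a single event. The attribute names, the event values, and the key format are hypothetical; only the "X choose Y" counting comes from the slide.

```python
from itertools import combinations
from math import comb

# Hypothetical fraud-relevant attributes (X = 5).
ATTRIBUTES = ["channel", "merchant_category", "country", "device", "hour_of_day"]
GROUP_SIZE = 2  # Y: number of attributes combined into one pattern key


def build_group_keys(event, attrs, group_size):
    """Return one composite key per combination of `group_size` attributes."""
    keys = []
    for combo in combinations(sorted(attrs), group_size):
        # Example key format (an assumption): "channel=web|country=US"
        keys.append("|".join(f"{name}={event[name]}" for name in combo))
    return keys


event = {
    "channel": "web",
    "merchant_category": "electronics",
    "country": "US",
    "device": "mobile",
    "hour_of_day": 3,
}
keys = build_group_keys(event, ATTRIBUTES, GROUP_SIZE)

# The number of keys per event equals C(X, Y).
n_groups = comb(len(ATTRIBUTES), GROUP_SIZE)
```

The count grows quickly: with 20 attributes combined three at a time there are already `comb(20, 3) = 1140` groups per event, which is part of why a distributed key-value store is suggested for this workload.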
Scenario
Formation of fraud pattern reference tables using
▪ Real-time data coming from different departments such as IVR, Web, Customer Profile, Transactions, etc.
▪ Real-time mining and analysis of history data to form prior patterns (several years of data, in the range of 50-100 TB)
13. Fraud Pattern Detection…
Data sources:
Legacy Fraud Data
Customer Profile Data
IVR Audio Data
Web / Online
Card Transaction Data
Source storage: RDBMS, (JSON Files)
Customer Behavior Change Events
History Data Processing to determine Fraud Patterns over X years
Real-time Customer Behavior Analysis for Fraud Detection
Real-time analysis of behavior patterns over historical data
Real-time update to Master Table on new fraud patterns
Fraud Pattern Master Table
Real-time alert to Fraud Analyst
Fraud Analyst
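The flow on this slide (learn patterns from history, match incoming behavior change events, update the master table, alert the analyst) can be sketched in a few lines. This is a minimal sketch under stated assumptions: the `FraudPatternMaster` class, its in-memory counter, the attribute names, and the `min_hits` threshold are all hypothetical stand-ins for the Fraud Pattern Master Table and its real storage.

```python
from collections import Counter
from itertools import combinations


class FraudPatternMaster:
    """In-memory stand-in (an assumption) for the Fraud Pattern Master Table."""

    def __init__(self, attrs, group_size=2, min_hits=2):
        self.attrs = sorted(attrs)
        self.group_size = group_size
        self.min_hits = min_hits  # hits needed before a combination counts as a pattern
        self.hits = Counter()

    def _keys(self, event):
        """Yield one key per combination of `group_size` attribute values."""
        for combo in combinations(self.attrs, self.group_size):
            yield tuple((name, event[name]) for name in combo)

    def learn(self, fraud_event):
        """Update the table from a confirmed fraud case (history or real time)."""
        self.hits.update(self._keys(fraud_event))

    def check(self, event):
        """Return the known fraud patterns that an incoming event matches."""
        return [key for key in self._keys(event) if self.hits[key] >= self.min_hits]


master = FraudPatternMaster(["channel", "country", "device"])

# History data processing: made-up confirmed fraud cases.
history = [
    {"channel": "web", "country": "XX", "device": "new"},
    {"channel": "ivr", "country": "XX", "device": "new"},
]
for case in history:
    master.learn(case)

# Real-time analysis of an incoming customer behavior change event.
incoming = {"channel": "web", "country": "XX", "device": "new"}
alerts = master.check(incoming)  # a non-empty list would trigger an analyst alert
```

Once an analyst confirms an alerted event as fraud, calling `master.learn(incoming)` would feed it back into the table, which mirrors the "real-time update to Master Table" step.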
15. Benefits
Industry: Benefits
Financial services
▪ Customer Insights – Integrating Transactional data (CRM/Payments) and unstructured Social feeds
▪ Regulatory Compliance – Risk exposures across asset classes, LOBs and firms
▪ Fraud Detection in Credit Cards & Financial Crimes (AML) in Banks
Travel, Hospitality & Retail
▪ Customer centricity – Customer behavior analysis from Omni channel retailing & Social feeds
▪ Markdown Optimization – Improve markdowns based on actual customer buying patterns
▪ Market basket analysis – Narrow down market basket analysis by demographics
Life Science
▪ Improve targeting & predictions – Automatic Detection of Adverse Drug Effects (ADEs)
▪ Patient data analysis – Longitudinal Patient Data (LPD) analysis
▪ Predictive Sciences – Analyze Preclinical Side Effect Profiles of Marketed Drugs
Healthcare (Payers & Providers)
▪ Cost of Care – Drug effectiveness & Cost of Care Analysis based on Electronic Medical Records (EMR)
▪ Self Service Healthcare – Increase in mHealth & eHealth to allow consumer access to health information
▪ Claims Analytics – Analyze insurance claims data for fraud detection & preferred treatment plans
Communication, Media & Entertainment
▪ Discover churn patterns based on Call data records (CDRs) and activity in subscribers’ networks
▪ Digital Asset Management (DAM) – Analyze & capitalize on digital data assets
Manufacturing
▪ Proactive Maintenance & Recommendation – Sensor monitoring for automobiles, buildings & machinery
▪ Energy Efficiency – Leveraging Smart meters for utility energy consumption
▪ Location or Proximity Tracking – Location based analytics using GPS Data
Hi-Tech
▪ Extend and complement the conventional information supply chain with a big data path
▪ Predictive analysis and real time decision support
28. Analytics - Trends
• Big Data Analytics In The Cloud
• AWS, Amazon Redshift
• Hadoop
• Enterprise Data Operating System
• Data Analytics Platform
• SQL on Hadoop
• NoSQL
• IoT (Internet of Things)
• Multi-polar Analytics
• Predictive Analytics (Spark)
• In-memory Analytics
• Data Lake
• Deep Learning
• Machine Learning
• Neural Networks
• Data Monetization