1. Knowledge is of the past, wisdom is of the future
Big Data Analytics Using R
AMZ Bank
RITUPARNA SARKAR
2. Banks look to 'big fast data' to meet regulator demands
Use cases of Analytics in Banking
◦ Consumer Behavior and Marketing
◦ Risk, Fraud and AML/KYC
◦ Product and Portfolio Optimization
Examples of Banking Analytics Impact
◦ A 15% increase in assets by designing unique offers for customers
◦ Improve time-to-market by 25%
◦ Cut marketing cost by 20%
Size and Growth of Analytics in Banking industry
3. AMZ Bank
◦ Recognized for their high standards and advanced service philosophy
◦ 7th largest lender in terms of assets
◦ Network of over 400 branches in Asia Pacific
◦ Relationships with 1000 banks in 70 countries around the world
Vision Statement: To become ultra-modern and data driven—an organization enabled to use
any and all of their data to drive business excellence
Goal and objective:
◦ Reward active, paying customers and reduce the overhead of maintaining low-paying customers.
◦ Maximize the number of active credit card customers by better targeting marketing incentives at those most likely to activate their cards and use them for business transactions
◦ Isolate the cards that are likely never to be activated, to reduce wasted marketing spend
◦ Reduce loss of customers to their competition
4. How did AMZ get here:
◦ Realized the potential of Big Data that could help them through predictive analytics and guided actions
for their business decisions
◦ Large institution with large data
ability to scale is mandatory, but they wanted to place themselves at the leading edge
◦ Efficiency was one of their top considerations
wanted an analytic architecture that was fast, flexible, affordable and simple-to-use, all at the
same time
Current Situation
◦ Recently set out to create a state-of-the-art analytic environment to support and fuel their fast
growing credit card business
◦ Bank Credit Card Center was already familiar with predictive analytics
◦ Had used conventional products in the past for BI and Reporting for their business outcomes and
decision making.
5. To Visualization & Reporting Layer
Connection to Data-lake
Open-source message broker: provides a unified, high-throughput, low-latency platform for handling real-time data feeds.
Open-source real-time computation system: processes unbounded streams of data, doing for real-time processing what Hadoop did for batch processing.
6. A. Setting up the right data warehouse platform: leveraging the large data
Many hundreds of systems are distributed throughout the organization; each system is largely
independent; any customer experience data is concentrated within that system
Option A: Traditional Datawarehouse
• extensive data definition work
• extensive transfer of data
• data sources are incomplete, do not use the same definitions, and not always available
• Sampling the data would have been very problematic, as the objective was to construct a customer view over time from all the events that took place.
• Timescale to implement considerably high
Option B: Big Data (MPP Database) Approach
• Need for elastic scalability, extreme performance, faster data access
• Workload Management, Fault-Tolerance, Advanced Analytics feature support
• Massively Parallel Processing data warehouse set up
• Can execute complex SQL analytics on very large data sets at speeds multiple times faster
8. B. Identifying the predictive analytics software
◦ Need to use the complete data, and not just a sample to get the complete picture towards making the
right decisions
◦ Tool would need to work with massive datasets, support the datawarehouse platform, provide
visualizations/predictive modelling capability
◦ Fully scalable data modeling against all data, improving analytic model accuracy and efficacy.
◦ Faster time from modeling to scoring, delivering rapid results and enabling iteration.
◦ Reduced data movement and latency, thus improving productivity of data analysts and IT staff.
◦ Efficient utilization of data assets and IT resources, reducing costs and increasing ROI ; essential to
factor in the cost of having dedicated data miners versus a tool based approach
◦ Scoring directly within the database, leveraging the database’s common security, auditing and
administration capabilities and reducing data movement and increasing data utilization.
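The idea of scoring inside the database can be illustrated with a minimal sketch. It assumes a logistic-regression model whose coefficients are already known, and it uses Python's built-in SQLite as a stand-in for the MPP warehouse. The table name `cards`, the columns, and the coefficient values are all hypothetical. The point is only that a single SQL statement writes the scores next to the data, so no rows are exported to a separate scoring tool.

```python
import math
import sqlite3

# Hypothetical coefficients; a real model would be trained on the bank's data.
COEFFS = {"intercept": -2.0, "txn_count_90d": 0.15, "has_online_login": 1.2}


def sigmoid(x: float) -> float:
    """Logistic function, registered below so that SQL can call it."""
    return 1.0 / (1.0 + math.exp(-x))


def score_in_database(conn: sqlite3.Connection) -> None:
    """Write an activation score for every card with one UPDATE statement."""
    conn.create_function("sigmoid", 1, sigmoid)
    conn.execute(
        """
        UPDATE cards
        SET activation_score = sigmoid(
            :intercept
            + :txn_count_90d * txn_count_90d
            + :has_online_login * has_online_login
        )
        """,
        COEFFS,
    )
    conn.commit()


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with two made-up cards."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cards (card_id TEXT PRIMARY KEY, txn_count_90d INTEGER, "
        "has_online_login INTEGER, activation_score REAL)"
    )
    conn.executemany(
        "INSERT INTO cards (card_id, txn_count_90d, has_online_login) VALUES (?, ?, ?)",
        [("C1", 0, 0), ("C2", 20, 1)],
    )
    return conn


conn = build_demo_db()
score_in_database(conn)
scores = dict(conn.execute("SELECT card_id, activation_score FROM cards"))
```

In a real MPP database the scoring expression would run in parallel on every segment; the SQLite version only shows the shape of the approach.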
9. Ad-hoc Data Analysis & Reporting
in Excel & R
Automated Reporting & Alerting system
Delivered to all form-factors
Advance analytics capabilities
Data-lake
Visualization & Reporting Platform
10. With centralization and democratization of data, business users become
more empowered to mitigate risk and propel development.
Rational and data-supported decisions will increase efficiency in
processes
2-layered data warehouse (MPP & HDFS) brings the best of both worlds.
Mix of flagship products & open-source gives technical flexibility with
unbounded scope.
11. Data Model
◦ HDFS
Unstructured data : JSON formatted Flat File schema.
Structured data : Flattened Star Schema.
◦ Pivotal Greenplum
Structured data : Snowflake schema
◦ Apache Storm
All data : JSON formatted Flat File schema.
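As an illustration of what a "JSON formatted flat file schema" might look like in practice, the following sketch flattens a nested JSON event into a single-level record whose keys can serve as column names. The event fields (`card_id`, `txn`, `merchant`) and the underscore naming rule are assumptions made for the example, not part of the original design.

```python
import json


def flatten(record: dict, parent_key: str = "", sep: str = "_") -> dict:
    """Recursively flatten nested dictionaries into one level of keys."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical transaction event as it might arrive from a real-time feed.
raw_event = (
    '{"card_id": "C1", "txn": {"amount": 42.5, '
    '"merchant": {"id": "M9", "city": "Pune"}}}'
)
flat_event = flatten(json.loads(raw_event))
```

Lists inside the JSON are left untouched by this sketch; a production schema would need an explicit rule for them.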
12. Analyzing the issue 1: Reduce the number of inactive credit cards /
Maximize the number of active credit card customers
Definition, Benefits and Cause
◦ Inactive credit cards: no transaction for over a year
◦ Benefits of reducing inactive credit cards
Restricted lines of credit can be redistributed among active users,
opening up new earning opportunities.
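The definition above (no transaction for over a year) can be expressed as a simple rule. The sketch below is only an illustration: it assumes "a year" means 365 days, and it treats a card with no recorded transaction at all as inactive, which the slide does not state. The card IDs and dates are made up.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: "over a year" is interpreted as more than 365 days.
INACTIVITY_WINDOW = timedelta(days=365)


def is_inactive(last_txn: Optional[date], today: date) -> bool:
    """Return True when the card has had no transaction within the window."""
    if last_txn is None:
        # Assumption: a card that was never used counts as inactive.
        return True
    return (today - last_txn) > INACTIVITY_WINDOW


# Hypothetical last-transaction dates per card.
last_txn_by_card = {
    "C1": date(2023, 1, 10),
    "C2": date(2024, 5, 1),
    "C3": None,
}
as_of = date(2024, 6, 30)
inactive_cards = sorted(
    card for card, last in last_txn_by_card.items() if is_inactive(last, as_of)
)
```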
Suggested Solution
◦ Design dynamic products that grow and shrink with customer
behavior.
◦ Restrict the pushing of low-fee cards.
◦ Define inactivity period limits, apply a reactivation strategy and
churn customers on expiry.
◦ Increase the presence of ATMs, POS card swipe machines and online
merchant partnerships to capture more information.
◦ Design a system to push customized 1-to-1 offers based on location and
situation.
Possible Causes: Low Quality Customer; Low Quality Products & Eco-system; Customer Product Mismatch
13. Analyzing the issue 2: Reduce customer churn
Definition, Benefits and Cause
◦ Loss to competition - Customers shifting to competitor’s product
◦ Benefits – Steady revenue, focused incentives, increased loyalty
Suggested Solution
◦ Push products with a high exit barrier to customers with multiple short-term credit
lines.
◦ Improve brand value and trust by showcasing the bank's strengths.
◦ Empower customer support with up-to-the-minute customer details and
prescribed recommendations so that agents can serve better.
◦ Revamp the customer acquisition process with the best mix of online and offline
processes.
◦ Achieve all regulatory and compliance certifications.
◦ Design dynamic products that grow and shrink with customer behavior.
◦ Restrict the pushing of low-fee cards.
◦ Define inactivity period limits, apply a reactivation strategy and churn customers
on expiry.
◦ Increase the presence of ATMs, POS card swipe machines and online merchant
partnerships to capture more information.
◦ Design a system to push customized 1-to-1 offers based on location and situation.
Possible Causes: Weaker brand value; Higher cost of ownership; Low quality customer service; Innovative products
14. Data Sources
Internal Data
• Customer acquisition history – Sales Team: how, what, when, why and who
• Customer lifetime history – Customer Relations Team
  • Current & historical satisfaction status
  • Movement between products
  • Movement in customer’s life and its effect on behavior
• Marketing & Products data – Marketing Teams
  • Products
  • Marketing campaigns
External Data
• Customer identification across borders
  • Across border relationship
• Credit scores and credit reports
  • Analyze open lines of credit
  • Analyze metrics like debt ratio, credit utilization etc.
• Domestic and International Fraud and Crime Data
  • Prevent fraud and money laundering
• Partner Banks, Merchants and Institutions
• Social media
  • Professional data from LinkedIn
  • Network and personal life from Facebook & Instagram
  • Social behavior from Twitter
• Others
  • Sensors, mobile devices and applications
  • PoS, ATM
  • Web logs and online shopping
15. Devise 360° view of customer
◦ Develop a customer spend signature to prevent fraud, using past transaction data, social media behavior, sensor data from mobile devices etc.
◦ Analyze all interactions with the customer, such as emails and call-center calls, to determine the current satisfaction status
◦ Develop a view of the customer's personal life and social connections, like “Last 5 significant events”.
◦ Link and analyze customer social media activities
◦ Develop recommendations for customer which can include offers, merchant offers, upgrade offer etc.
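One simple way to picture a "spend signature" is as the mean and spread of a customer's past transaction amounts, with new transactions flagged when they fall far outside that range. The sketch below uses a z-score rule with a threshold of 3 standard deviations; both the rule and the threshold are assumptions chosen for illustration, and a real signature would draw on many more signals (merchant, location, time of day, device data).

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def spend_signature(amounts: Sequence[float]) -> Tuple[float, float]:
    """Summarize past spending as (mean, sample standard deviation).

    Requires at least two past transactions, because stdev needs two points.
    """
    return mean(amounts), stdev(amounts)


def is_unusual(amount: float, signature: Tuple[float, float],
               threshold: float = 3.0) -> bool:
    """Flag an amount whose z-score against the signature exceeds the threshold."""
    mu, sigma = signature
    if sigma == 0:
        # All past amounts were identical; anything different is unusual.
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Hypothetical transaction history for one customer.
history = [20.0, 25.0, 22.0, 30.0, 24.0]
signature = spend_signature(history)
flag_large = is_unusual(500.0, signature)
flag_normal = is_unusual(26.0, signature)
```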
Revamp Marketing and Sales
◦ Build Social Media listening and monitoring center to capture, mediate and intervene in Social Media conversations.
◦ Analyze live data sources along with customer spend signature to push location-based, situation-based 1-to-1 targeted offers and notification.
◦ Devise aggregated regional potential, market share, brand sentiment to formulate resource allocation, sales targets etc.
◦ Empower ground staff with next-gen marketing & sale tools to help them maximize their personal ROI.
◦ Design and deploy market watch system to monitor and generate alerts on competitor activity like new product launch, change in interest rates
etc.
Empowering Operations
◦ Revamp dashboards to deliver details like customer satisfaction status, customer value, last 5 call highlights etc. with minimal delay.
◦ Design special call routing to build executive – customer relationship.
◦ Develop 1-click recommendation and live offer system which will use data from live current conversation to facilitate best cross selling.
◦ Deploy Robotic Process Automation tools to increase efficiency.
17. Setting up an MPP Data warehouse
EMC Greenplum Enterprise Data Cloud (EDC) as a powerful MPP data warehouse platform
◦ Greenplum Database® is an advanced, fully featured, open source data warehouse. It provides powerful and rapid analytics on petabyte
scale data volumes.
◦ True MPP architecture and features that meet mandatory requirements of enterprise-class data warehousing
◦ Gives AMZ market-leading power and scalability on commodity hardware.
◦ Lets AMZ affordably ride the curve of hardware advances and enjoy the simplicity and flexibility of a private cloud environment
◦ Can get market-leading load performance (10X faster than peer data warehouses), comprehensive ELT transformation capabilities
and enterprise-class high availability/disaster recovery
◦ Can use specific customer experience analytical packages (ClickFox and Merced) for data analysis
Setting up a predictive analytics software
◦ Alpine Miner as a Predictive analytics software
Alpine Miner on the Greenplum database provides a fully integrated environment for statistical transformation and modeling
methods for data analysis, modeling and scoring, with true scalability and top processing speed.
As a completely scalable in-database solution, these processes can be built entirely within the Alpine Miner interface, and then
executed directly where the data resides, with no limitations on size or complexity
With Alpine Miner, business and data analysts can flexibly and efficiently conduct end-to-end knowledge discovery and predictive
analytics—including data preparation, data transformation, data modeling and data scoring.
All the models built with Alpine Miner are automatically stored in the database and can be published or deployed directly within the
database at the press of a button, further assuring data reliability and integrity and accelerating model integration with business
applications.
The Alpine Miner architecture cuts weeks to months from the process because it reduces unnecessary data movement, and supports
better data governance and data-mining process standardization.
19. Lack of opportunity to use
Possible solutions
◦ Increase partner, merchants and network
◦ Don’t sell cards in such geographies
Lack of knowledge and faith
Possible solutions
◦ Invest in customer training
◦ Showcase your security and compliance
Already owns multiple credit cards
Possible solutions
◦ Analyze list of open lines of credit
20. Low product performance can be due to various reasons.
◦ Product Design, Pricing, Customer segment etc.
Improvement Approaches
◦ This can be improved by developing a service delivery strategy, for example “Mobile Wallet”.
◦ Mobile banking: Offers the convenience of banking anytime, anywhere.
◦ Mobile payments: Enables customers to send money to any mobile phone number.
◦ Mobile payments–contactless: Allows customers to save time with tap-and-go mobile payment service.
◦ Mobile marketing: Provides consumers with exclusive promotions, coupons, and alerts based on their current
location.
21. Market segment mismatch:
The product was not designed for this type of customer.
Provide a tailored product based on the historic data.
Customer-specific mismatch:
The unique needs of the customer may be causing some friction points even though
the customer is in the product’s target market.
Reach out to the customer to ask for feedback on the current offerings.
Geographical mismatch:
The customer may be based in a particular country or region with regulations or local
business norms that create unique challenges.
Find out features at a region level, which features are used most and which are not.
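Finding which features are used most and least in each region amounts to a grouped count. The sketch below shows one way to do this over a list of (region, feature) usage events; the region names and feature names are invented for the example, and real input would come from the bank's own usage logs.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def usage_by_region(events: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count how often each feature is used within each region."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for region, feature in events:
        counts[region][feature] += 1
    return counts


# Hypothetical usage events: (region, feature used).
events = [
    ("APAC-North", "contactless"),
    ("APAC-North", "contactless"),
    ("APAC-North", "emi"),
    ("APAC-South", "rewards"),
]
usage = usage_by_region(events)
top_north = usage["APAC-North"].most_common(1)
```

Features that never appear in a region's counter are the candidates for "not used" in that region, provided the full feature list is known separately.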
22. Banking industry : Data Landmine with tremendous capabilities
Mix of flagship products & open-source gives technical flexibility with unbounded scope: Essential for the
right architecture and design
Case Study
◦ 2 key objectives – Maximize active credit card users, Reduce Customer Churn
◦ High level architecture followed by a snapshot of proposed solution to meet the objectives
◦ Major Tasks for improvement
Challenges
◦ Transforming the approach to do business with Big Data.
◦ Align all layers of the organization to understand and leverage the benefits of Big Data.
◦ Hire, nurture and retain fresh talent to develop and build the new system.
◦ Sustain business through the transformation period, which ranges from 5 to 10 years on average.