1. New Data Project Plan
Anderson, Brykman, Gero, Lakhani, Matthews
PREDICT 480 – SECTION 55
2. Introduction
Company Background
DERAK is an analytical support team for a young credit card company
Company is 5 years old
Latest venture is to consolidate data sources across the organization as well as
collect data from mobile applications to gain customer loyalty
Why did we select this company?
The combination of credit card data, social media, GPS, mobile data, etc. offers
exciting opportunities for modeling that adds value for the customer
3. Introduction
SWOT Analysis
Strengths
Company is becoming analytically mature (Stage III on Maturity Model)
Diversity of products to spread risk
Weaknesses
Not yet a significant brand in Credit Card market
As a new company, we have limited data from the 2008 financial crisis
Opportunities
Gain new insights on customers through mobile app
Utilize data for predictive modeling
Explore uses of GPS data
Sell customer data to third party companies
Threats
Credit Card Market is mature with several key players. Gaining significant market share is
difficult and expensive
Customer privacy concerns and government regulations
4. Introduction
3 Key Issues to Address with Analytics
1. Limited understanding about our customer
1. We need to know more about our customer in order to offer them more value than
the competition
2. Credit card default risk
1. Increase volume but maintain quality
3. Fraud
1. We need to increase trust with the customer by protecting them financially
5. Introduction
Database and Data preparation plan
Our database and data preparation plan will help us solve these issues by:
Helping us to gather more intimate data on our customer to better understand
their shopping preferences, personal interests, and measure customer loyalty
We plan to utilize the new data to identify possible natural clusters of our
customers that may help us identify segments that carry a higher
probability of default
Through mobile applications, we can use GPS data to identify transactions that
may be fraudulent in order to better protect our customers
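The GPS check in the last point can be sketched as follows. This is a minimal illustration, not the team's implementation: it assumes the mobile app reports the phone's latitude and longitude at transaction time, and the function names and the 50 km threshold are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_suspicious(phone_location, merchant_location, max_km=50.0):
    """Flag a card-present transaction when the phone is far from the merchant.

    Both locations are (latitude, longitude) tuples in degrees.
    max_km is a hypothetical threshold, not a value from the plan.
    """
    return haversine_km(*phone_location, *merchant_location) > max_km
```

A flag like this would only be one input to a fraud model, since a customer may legitimately shop without carrying the phone.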
6. Literature and Data Sources Overview
Document your information and data sources.
Describe methods relevant to:
Acquiring
Storing
Maintaining
Accessing the data that you need
Cite references that you have used to guide your thinking about
data sources and methods, and include these in the reference list
at the end of the paper.
7. Acquiring Data
Company Databases1
Demographic information
Banking plans
Credit/Debit balances
Transactional data
Social media3
Twitter
Hashtags
@Branding
Facebook
Status updates
‘Likes’
Contact Network
Mobile Phone Data2
GPS Location
Mobile client application
Mobile web
Short Message Service (SMS)
Contact Lists
1 https://customers.microsoft.com/Pages/Download.aspx?id=13928/
2 http://www.sitetechsystems.com/top-10-ways-to-use-gis-in-retail-banking/
2 www.mmaglobal.com/files/mbankingoverview.pdf
3 http://www.bearingpoint.com/ecomaXL/files/0615_WP_EN_Social_CRM_final_web.pdf
8. Data Integration
Sources: Bank, Mobile, Social media
Central location
Dealing with data issues and data preparation
Integration of results from various data sources into a central location for analysis1
Integration of results from feeds (structured and unstructured)1
1 https://www.in.capgemini.com/resource-file-access/resource/pdf/A_Case_for_Enterprise_Data_Management_for_Banking.pdf
9. Storage
Considerations: close to real-time or longer-term access1
Governance for retention periods, data ownership
and entry of new data sources2
Maintain backups2
Accessible2
Document sources and validations taking place2
1 http://www.oracle.com/us/products/middleware/data-integration/oracle-goldengate-realtime-access-2031152.pdf
2 http://www.osfi-bsif.gc.ca/eng/docs/data_maint_ja06.pdf
10. Maintaining data
Backing up data1
Deleting data based on established retention periods1
Performance optimization2
1 http://www.osfi-bsif.gc.ca/eng/docs/data_maint_ja06.pdf
2 http://www.oracle.com/technetwork/database/bi-datawarehousing/twp-bp-for-stats-gather-12c-1967354.pdf
11. Accessing
Access via graphical user interfaces (GUIs) through
applications for
Structured data or unstructured data
Real time or ex-post data
Access for
System administration and maintenance
Validation, editing and estimating (VEE)
Analyzing and modelling processed data (Ultimate Goal)
Audit access controls1
1 http://www.osfi-bsif.gc.ca/eng/docs/data_maint_ja06.pdf
12. Criteria
Describe the systems and methods used to:
Acquire data
Store data
Maintain data
Access data
Describe the infrastructure.
Describe how these systems and methods work together.
13. Acquire data
Python and R languages will be used for data acquisition.
Lightweight and flexible.
Mature - industry standards.
Multiple libraries available for data handling and analytics.
Data acquisition
Facebook1 and Twitter2 provide APIs for access
Python supported
Returns JSON format
Company data
Years of company credit card history available.
1 https://developers.facebook.com/docs/graph-api/using-graph-api/v2.5
2 https://dev.twitter.com/overview/api
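As a rough illustration of handling the JSON these APIs return, the sketch below flattens a response into rows that could be loaded into a table. The payload and its field names are hypothetical and are not the actual Facebook or Twitter response schema; the API call itself is left out.

```python
import json

# Hypothetical payload: the field names are illustrative only and do not
# reproduce the real Facebook or Twitter response schema.
SAMPLE_RESPONSE = """
{
  "data": [
    {"id": "1", "text": "Loving the new rewards card #cashback", "user": {"id": "u42"}},
    {"id": "2", "text": "Card declined again...", "user": {"id": "u77"}}
  ]
}
"""


def flatten_posts(raw_json):
    """Turn a JSON response string into flat dicts ready for loading into a table."""
    payload = json.loads(raw_json)
    rows = []
    for post in payload.get("data", []):
        rows.append({
            "post_id": post["id"],
            "user_id": post.get("user", {}).get("id"),
            "text": post.get("text", ""),
        })
    return rows


rows = flatten_posts(SAMPLE_RESPONSE)
```

Keeping the flattening step separate from the request step means the same function can be reused if the source API or its version changes.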
14. Storing, Maintaining and Accessing
Oracle DBMS for internal data
PostgreSQL for analytics
Supports structured and unstructured data.
Better performance than MongoDB
Purchase history tied to customers maintained for 90 days.
Data used for analytic models maintained for 90 days.
Data anonymized and maintained for 2 years.
Used for general trending and analysis.
Sold to third parties.
Views will be created in the PostgreSQL environment to allow access for the
mobile application.
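The retention rules above can be expressed as a small decision function. This is a sketch under stated assumptions: record age is measured from the record's own date, the 2-year window is counted from that same date, and a keyed hash stands in for the anonymization step (strictly speaking this is pseudonymization, and the plan does not specify a method). All names are hypothetical.

```python
import hashlib
import hmac
from datetime import date

IDENTIFIED_DAYS = 90        # data tied to customers is kept for 90 days
ANONYMIZED_DAYS = 2 * 365   # anonymized data is kept for 2 years (assumed from record date)


def retention_action(record_date, today):
    """Return 'keep', 'anonymize' or 'delete' for a record of the given date."""
    age_days = (today - record_date).days
    if age_days <= IDENTIFIED_DAYS:
        return "keep"
    if age_days <= ANONYMIZED_DAYS:
        return "anonymize"
    return "delete"


def pseudonymize(customer_id, secret_key):
    """Replace a customer ID with a keyed SHA-256 digest (hex string).

    secret_key is bytes and must be stored separately from the analytics data.
    """
    return hmac.new(secret_key, customer_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

A scheduled job could apply `retention_action` to each record and act on the result; data sold to third parties would likely need stronger anonymization than a keyed hash alone.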
16. Data Preparation
Text Analysis
Cleanse the data for keywords
GPS Data
Broad scale demographic segmentation
Contact Networks
Identifiable features of people and groups
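The text analysis step might look like the sketch below, which pulls hashtags and cleansed keywords out of social media posts. It assumes English text and uses a deliberately tiny, illustrative stop-word list; the function names are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a production list would be much larger.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "on",
              "for", "my", "is", "at", "with"}


def extract_hashtags(text):
    """Return the hashtags in a post, lower-cased and without the '#'."""
    return [tag.lower() for tag in re.findall(r"#(\w+)", text)]


def keywords(text):
    """Lower-case the text, strip URLs, mentions and hashtags, drop stop words."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    text = re.sub(r"[@#]\w+", " ", text)
    tokens = re.findall(r"[a-z']+", text)
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


def keyword_counts(posts):
    """Count cleansed keywords across a list of posts."""
    counts = Counter()
    for post in posts:
        counts.update(keywords(post))
    return counts
```

Hashtags are extracted separately because they often carry brand or campaign information that is lost once they are stripped from the keyword stream.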
17. Data Issues
Problems
Application Permissions
Consumer Participation
Inaccurate Text Analysis
Data Storage and Processing
Solutions
Provide benefits for permissions
Entice consumers with offers
Segmented Analysis
Commodity Clustered Servers
18. Data Quality
Outliers & Incomplete data
Used where applicable; otherwise likely excluded
Bootstrapping
Estimation of additional sampling distributions
Influential observations
Used to further define possible segmentations
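Two of these steps can be sketched with the Python standard library: flagging outliers and bootstrapping a sampling distribution. The 1.5 x IQR rule and the percentile interval are common choices assumed here for illustration; the plan does not specify which methods will be used.

```python
import random
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def bootstrap_means(values, n_resamples=1000, seed=0):
    """Approximate the sampling distribution of the mean by resampling with replacement."""
    rng = random.Random(seed)
    size = len(values)
    return [statistics.fmean(rng.choices(values, k=size))
            for _ in range(n_resamples)]


def percentile_interval(samples, lower=0.025, upper=0.975):
    """Return a simple percentile interval from bootstrap samples."""
    ordered = sorted(samples)
    lo_idx = int(lower * (len(ordered) - 1))
    hi_idx = int(upper * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]
```

For example, outliers could be removed from monthly spend values before calling `bootstrap_means`, and `percentile_interval` would then give a rough range for the average spend.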
19. In Conclusion…
Opportunities
DERAK is well positioned to generate key analytical insights about our credit card
customers, which will benefit our services by:
Improving our market access
Gaining new insights on our credit card customers through the mobile app
Utilizing the collected data for predictive modeling (spending habits, etc.)
Exploring uses of GPS data
Selling anonymized customer data to third party companies
24. In Conclusion…
Direction and future state
Refining data collection and
analytics
Improving data mining methods
Expanding our services and market
reach to other industries
25. Team and Presenters (in order of appearance)
Aaron Matthews
Ketan Lakhani
Eric Gero
Daniel Anderson
Raphael Brykman
26. References
Royal Bank of Scotland Case Study (2013)
Mobile Banking Overview (January 2009)
10 ways to use GIS data in Retail Banking (2012)
A Case for Enterprise Data Management in Banking (2012)
Data Maintenance at IRB Institutions (2006)
Best Practices for Gathering Optimizer Statistics with Oracle Database 12c