Location:
ARPM Open Source Conference
8/13/2017
Machine Learning applications in Credit Risk
2017 Copyright QuantUniversity LLC.
Presented By:
Sri Krishnamurthy, CFA, CAP
sri@quantuniversity.com
www.analyticscertificate.com
Slides will be available at:
www.analyticscertificate.com/MachineLearning
• Founder of QuantUniversity LLC. and
www.analyticscertificate.com
• Advisory and Consultancy for Financial Analytics
• Prior Experience at MathWorks, Citigroup and
Endeca and 25+ financial services and energy
customers.
• Regular Columnist for the Wilmott Magazine
• Author of forthcoming book
“Financial Modeling: A case study approach”
published by Wiley
• Chartered Financial Analyst and Certified Analytics
Professional
• Teaches Analytics in the Babson College MBA
program and at Northeastern University, Boston
Sri Krishnamurthy
Founder and CEO
Quantitative Analytics and Big Data Analytics Onboarding
• Data Science, Quant Finance and
Machine Learning Advisory
• Trained more than 1000 students in
Quantitative methods, Data Science
and Big Data Technologies using
MATLAB, Python and R
• Launching
▫ Analytics Certificate Program (Spring 2018)
▫ Fintech Certification program (Fall 2017)
• Building
Credit risk in consumer credit
Credit-scoring models and techniques assess the risk in lending to
customers.
Typical decisions:
• Granting or denying credit to new applicants
• Increasing/Decreasing spending limits
• Increasing/Decreasing lending rates
• What new products can be given to existing applicants?
Credit assessment in consumer credit
History:
• Gut feel
• Social network
• Communities and influence
Traditional:
• Scoring mechanisms through credit bureaus
• Bank assessments through business rules
Newer approaches (FINTECH):
• Peer-to-Peer lending
• Lending Club, Prosper Marketplace
Types of algorithms
• Machine learning
▫ Supervised Learning: Prediction, Classification
▫ Unsupervised Learning: Clustering
Used to derive a relationship between dependent and independent
variables
• Prediction
▫ Regression
▫ Decision Trees (CART)
▫ Neural Networks
• Classification
▫ Logistic Regression
▫ CART, Random Forest, SVM
▫ Neural Networks
Supervised Learning
• Data pre-processing
• Split data into Training and Testing sets
• Train the model on Training data
• Test the model using Testing data to evaluate model performance
Methodology
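The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic (debt-to-income, defaulted) pairs invented for the example, and the "model" is a single cutoff chosen on the training set, standing in for whatever classifier would actually be used.

```python
import random

def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and split them into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(rows, cut):
    """Share of rows where the rule 'x >= cut' matches the default label."""
    hits = sum((x >= cut) == bool(y) for x, y in rows)
    return hits / len(rows)

def fit_threshold(train):
    """Pick the cutoff that maximizes accuracy on the training rows only."""
    best_cut, best_acc = None, -1.0
    for cut in sorted({x for x, _ in train}):
        acc = accuracy(train, cut)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut

# Hypothetical data: (debt_to_income, defaulted) pairs, not real loan records.
rng = random.Random(42)
data = []
for _ in range(200):
    dti = rng.uniform(0.0, 0.6)
    defaulted = 1 if dti + rng.gauss(0.0, 0.1) > 0.35 else 0
    data.append((dti, defaulted))

train, test = train_test_split(data, test_fraction=0.3, seed=1)  # split
cutoff = fit_threshold(train)                                    # train
test_accuracy = accuracy(test, cutoff)                           # test
print(f"cutoff={cutoff:.3f} test accuracy={test_accuracy:.2f}")
```

The point of the split is that `test_accuracy` is computed on rows the cutoff never saw, so it is a fairer estimate of performance than the training accuracy.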
• No distinction between independent variables and dependent
variables
• No result labels to determine “correct” results
• Goals:
▫ Data Reduction
▫ Clustering
Unsupervised Learning
• Partitioning Clustering
▫ Starts with K, the number of clusters sought
▫ Observations are divided at random into K groups, then reassigned to form cohesive clusters
▫ Example: K-means
• Hierarchical Agglomerative Clustering
▫ Each observation starts as its own cluster
▫ Clusters are combined two at a time until only one cluster remains
▫ Example: Hierarchical clustering using single linkage, Ward’s method, etc.
Types of Clustering
• Tries to separate samples into K groups, with the goal of maximizing between-group variance and minimizing within-group variance
• Requires K to be specified up front.
• Starts with K initial centroids and iterates to minimize the criterion, or until the specified number of iterations is reached.
• Suited for larger datasets
K-means
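As a rough sketch of the procedure described above, the following pure-Python version alternates between assigning points to the nearest centroid and recomputing centroids, stopping when the centroids no longer move or when `max_iter` is reached. The function name, the six sample points and the random initialization from the data are assumptions made for the illustration; production code would normally use a library implementation.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100, seed=0):
    """Return (centroids, labels) after at most max_iter refinement passes."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # K initial centroids drawn from the data
    labels = []
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical two-dimensional points forming two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

Minimizing the within-group sum of squares, as this loop does, is equivalent to maximizing the between-group sum of squares because the total is fixed for a given data set.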
• Goal is to derive a dendrogram starting from each record being its
own cluster
• Works well for smaller data sets
• Proximity is measured in multiple ways (more later)
Hierarchical clustering
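A minimal sketch of agglomerative clustering, assuming single linkage and Euclidean distance (one of several possible choices): every record starts as its own cluster, and the two closest clusters are merged repeatedly. The returned merge history is the information a dendrogram would draw. The function name and the sample points are made up for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(points):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    clusters = [[i] for i in range(len(points))]  # each record is its own cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(
                    euclidean(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return merges

# Hypothetical one-dimensional records.
points = [(0.0,), (1.0,), (5.0,), (5.5,)]
merges = single_linkage(points)
for left, right, height in merges:
    print(left, right, height)
```

The nested search over all cluster pairs is repeated at every merge, which is one reason this family of methods is better suited to smaller data sets.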
How do you measure similarity between two entities?
▫ Apples and Bananas
▫ Coke and Pepsi vs Orange juice
▫ Honda Civic vs Toyota Corolla
▫ New York and Boston
• The notion of distance
The notion of distance
• Euclidean distance
• Cosine distance
Distance measures
• Manhattan distance (taxicab distance)
• Jaccard distance
▫ Used to measure similarity or dissimilarity between binary and non-binary variables
▫ http://people.revoledu.com/kardi/tutorial/Similarity/Jaccard.html
Other distance measures
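The four measures named on these slides can be written out in a few lines each. The sketch below uses the standard textbook definitions; the function names and the sample inputs are chosen for the example, and the Jaccard version assumes each observation is represented as a set of items.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    """Taxicab distance: sum of the absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def jaccard_distance(a, b):
    """One minus |intersection| / |union| for two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(a | b)

print(euclidean_distance((0, 0), (3, 4)))
print(manhattan_distance((0, 0), (3, 4)))
print(cosine_distance((1, 0), (0, 1)))
print(jaccard_distance({"a", "b", "c"}, {"b", "c", "d"}))
```

Cosine distance ignores vector length and compares direction only, which is why `(1, 1)` and `(2, 2)` are at distance zero under it but not under the Euclidean or Manhattan measures.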
• Gower distance is used for calculating distances when we have mixed types of variables (continuous and categorical)
• Variables can be:
▫ Quantitative (such as rating scale)
▫ Binary (such as present/absent)
▫ Nominal (such as worker/teacher/clerk)
• The metrics used for each data type are described below:
▫ Quantitative: range-normalized Manhattan distance
▫ Ordinal: variable is first ranked, then Manhattan distance is used with a special adjustment for ties
▫ Nominal: variables of k categories are first converted into k binary columns and then the Dice coefficient is used (https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient)
Working with mixed-data
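A simplified sketch of the idea, assuming equal weights and only two variable kinds: quantitative variables contribute a range-normalized absolute difference, and nominal variables contribute 0 when the categories match and 1 otherwise (for a single nominal variable, the one-hot encoding plus Dice coefficient described above reduces to exactly this match/mismatch score). Ordinal ranking and tie adjustment are left out. The records, field meanings and function name are hypothetical.

```python
def gower_distance(a, b, kinds, ranges):
    """Average per-variable dissimilarity between two mixed-type records.

    kinds[i] is "numeric" or "nominal"; ranges[i] is max - min of variable i
    over the whole data set (ignored for nominal variables).
    """
    total = 0.0
    for x, y, kind, span in zip(a, b, kinds, ranges):
        if kind == "numeric":
            # Range-normalized Manhattan distance, so the value lies in [0, 1].
            total += abs(x - y) / span if span else 0.0
        else:
            # Nominal: 0 for the same category, 1 for different categories.
            total += 0.0 if x == y else 1.0
    return total / len(kinds)

# Hypothetical borrowers: (annual income, home ownership, loan purpose).
rows = [
    (30000, "rent", "car"),
    (80000, "own", "home"),
    (130000, "own", "car"),
]
kinds = ["numeric", "nominal", "nominal"]
incomes = [r[0] for r in rows]
ranges = [max(incomes) - min(incomes), None, None]

for i in range(len(rows)):
    for j in range(i + 1, len(rows)):
        print(i, j, round(gower_distance(rows[i], rows[j], kinds, ranges), 4))
```

Because every variable contributes a value between 0 and 1, no single variable (such as income in dollars) dominates the result simply because of its scale.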
• daisy: computes all the pairwise dissimilarities (distances) between observations in the data set
• pam: partitions (clusters) the data into k clusters “around medoids”, a more robust version of K-means
• agnes: computes agglomerative nesting (hierarchical clustering) of the data set
Support in R
Lending Club
The Data
https://www.lendingclub.com/info/download-data.action
The Data
https://www.kaggle.com/wendykan/lending-club-loan-data
Variable description
• Calculate dissimilarity between observations.
• Select algorithm to group observations together
• Choose the best number of clusters
• Visualize clusters on reduced dimensions
Objective
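The slides do not say which criterion was used to choose the number of clusters. One common option, assumed here purely for illustration, is the average silhouette width: for each observation, compare its mean distance to its own cluster (a) with its mean distance to the nearest other cluster (b), score it as (b - a) / max(a, b), and prefer the clustering with the highest average. The function name and toy data are hypothetical.

```python
def average_silhouette(dist, labels):
    """Average silhouette width for a distance matrix and cluster labels."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        # a: mean distance to the other members of the same cluster.
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the closest other cluster.
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)

# Hypothetical one-dimensional values and their pairwise distances.
values = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in values] for a in values]

good = average_silhouette(dist, [0, 0, 0, 1, 1, 1])
bad = average_silhouette(dist, [0, 1, 0, 1, 0, 1])
print(round(good, 3), round(bad, 3))
```

In practice the same score would be computed for each candidate number of clusters, and the value of k with the highest average kept.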
• Partitioning around medoids (PAM) is used in this case.
• PAM is an iterative clustering procedure with the following steps:
▫ Step 1: Choose k random entities to become the medoids.
▫ Step 2: Assign every entity to its closest medoid (using the distance matrix we have calculated).
▫ Step 3: For each cluster, identify the observation that would yield the lowest average distance if it were re-assigned as the medoid. If it is not already the medoid, make this observation the new medoid.
▫ Step 4: If at least one medoid has changed, return to step 2. Otherwise, end the algorithm.
Selecting number of clusters
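The four steps listed above translate almost line for line into the sketch below, which works directly on a precomputed distance matrix. It follows the steps as written on the slide; the `pam` function in R uses a more elaborate build-and-swap search that this sketch does not reproduce. The function names and toy values are assumptions for the example.

```python
import random

def assign(dist, medoids):
    """Step 2: label each observation with the index of its closest medoid."""
    return [min(medoids, key=lambda m: dist[i][m]) for i in range(len(dist))]

def k_medoids(dist, k, seed=0, max_iter=100):
    """Return (sorted medoid indices, labels) for a square distance matrix."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(dist)), k)  # Step 1: k random medoids
    for _ in range(max_iter):
        labels = assign(dist, medoids)         # Step 2
        updated = []
        for m in medoids:                      # Step 3: best medoid per cluster
            members = [i for i, lab in enumerate(labels) if lab == m]
            if not members:
                updated.append(m)  # possible only with duplicate observations
                continue
            updated.append(
                min(members, key=lambda c: sum(dist[c][j] for j in members))
            )
        if set(updated) == set(medoids):       # Step 4: stop when nothing changed
            break
        medoids = updated
    return sorted(medoids), assign(dist, medoids)

# Hypothetical one-dimensional values; any dissimilarity matrix would do.
values = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in values] for a in values]

medoids, labels = k_medoids(dist, k=2, seed=3)
print(medoids, labels)
```

Because the procedure needs only pairwise dissimilarities, it can be run on a Gower distance matrix for mixed data, where a K-means centroid (a mean of categorical values) would not be defined. Minimizing the summed distance in step 3 is the same as minimizing the average, since the cluster size is fixed within the step.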
• One way to visualize many variables in a lower-dimensional space is with t-distributed stochastic neighbor embedding (t-SNE)
• This method is a dimension reduction technique that tries to preserve local structure so as to make clusters visible in a 2D or 3D visualization.
• https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding
Visualization with reduced dimension
Alternative Credit scoring in the news
Fintech being noticed by Regulators
• The regulatory sandbox allows businesses to test innovative
products, services, business models and delivery mechanisms in the
real market, with real consumers.
• The sandbox is a supervised space, open to both authorized and
unauthorized firms, that provides firms with:
▫ reduced time-to-market at potentially lower cost
▫ appropriate consumer protection safeguards built into new products and services
▫ better access to finance
• https://www.fca.org.uk/firms/regulatory-sandbox
Regulatory Sandboxes
US Regulators catching up
Model Validation
• “Model risk is the potential for adverse consequences from decisions based on incorrect or misused model outputs and reports.” [1]
• “Model validation is the set of processes and activities intended to verify that models are performing as expected, in line with their design objectives and business uses.” [1]
• Ref:
• [1] Supervisory Letter SR 11-7: Guidance on Model Risk Management
Popularity of Open-source software in the enterprise
increasing
• Financial Services customers like Capital One, FINRA, and Pacific Life
are moving critical workloads to AWS
Cloud maturing
• Versions and packages
• Difficulty in replicating and reconciling differences in environments
• Deploying models built by Data Scientists is still a problem (Data Scientists / Enterprise IT)
• The try-before-adopt model is difficult with unproven open-source solutions
Challenges in adopting Open-source software in the enterprise
www.QuSandbox.com
Quant/Enterprise use cases
• Create an environment that can support multiple platforms and
programming languages
• Enable remote running of applications
• Ability to try out a GitHub submission or someone else’s code
• Facilitate creation of Docker images to create replicable containers
• Create prototyping environments for Data Science/Quant teams
• Enable Data scientists/Quants to deploy their solutions
• Enable running multiple experiments concurrently
• Integrate seamlessly with the cloud to scale up computations
Use cases
Fintech use cases
• To demonstrate solutions to enterprises
• Create customized enterprise trials for companies that don’t permit
installation of vendor software prior to procurement
• To manage quick updates
• Enable effective integration and hosting of services (REST APIs)
Use cases
Academic use cases
• Enable creation of course material and exercises that could be
shared
• Enable students and workshop participants to focus on the data science experiments rather than on environment setup
Use cases
Creating replicable environments
Create and manage replicable environments (Code + software + data) in a single portal
Creating replicable environments
Create replicable environments (Code + software + data) through an easy point & click tool and publish to Docker Hub or manage internally
Share it with target users
User portal
• Run multiple experiments in pre-created environments (Code + software + data)
• Deploy your own solutions
• Run any Docker image or GitHub submission on the cloud
Run Jupyter notebooks and prototype applications
Run RStudio and Shiny applications
Run any Docker application
Manage tasks and errors
User portal
• Dockerize and deploy applications on AWS in just a few steps
Deploy applications with ease
QU’s open source project – Project Mozaic
www.QuSandbox.com
www.analyticscertificate.com/MachineLearning
Thank you ARPM and enjoy the boot camp!
Check out our programs at:
www.analyticscertificate.com/fintech
www.qusandbox.com
Sri Krishnamurthy, CFA, CAP
Founder and CEO
QuantUniversity LLC.
srikrishnamurthy
www.QuantUniversity.com
Information, data and drawings embodied in this presentation are strictly a property of QuantUniversity LLC. and shall not be
distributed or used in any other publication without the prior written consent of QuantUniversity LLC.
58

More Related Content

What's hot

Model building in credit card and loan approval
Model building in credit card and loan approval Model building in credit card and loan approval
Model building in credit card and loan approval Venkata Reddy Konasani
 
Credit card payment_fraud_detection
Credit card payment_fraud_detectionCredit card payment_fraud_detection
Credit card payment_fraud_detectionPEIPEI HAN
 
Machine Learning for Fraud Detection
Machine Learning for Fraud DetectionMachine Learning for Fraud Detection
Machine Learning for Fraud DetectionNitesh Kumar
 
Machine Learning in Banking
Machine Learning in BankingMachine Learning in Banking
Machine Learning in Bankingaccenture
 
Applications of Big Data Analytics in Businesses
Applications of Big Data Analytics in BusinessesApplications of Big Data Analytics in Businesses
Applications of Big Data Analytics in BusinessesT.S. Lim
 
Machine Learning and AI in Risk Management
Machine Learning and AI in Risk ManagementMachine Learning and AI in Risk Management
Machine Learning and AI in Risk ManagementQuantUniversity
 
Unsupervised Anomaly Detection with Isolation Forest - Elena Sharova
Unsupervised Anomaly Detection with Isolation Forest - Elena SharovaUnsupervised Anomaly Detection with Isolation Forest - Elena Sharova
Unsupervised Anomaly Detection with Isolation Forest - Elena SharovaPyData
 
Data Governance_Notes.pptx
Data Governance_Notes.pptxData Governance_Notes.pptx
Data Governance_Notes.pptxVivekDubley
 
CRISP-DM: a data science project methodology
CRISP-DM: a data science project methodologyCRISP-DM: a data science project methodology
CRISP-DM: a data science project methodologySergey Shelpuk
 
Decision tree for Predictive Modeling
Decision tree for Predictive ModelingDecision tree for Predictive Modeling
Decision tree for Predictive ModelingEdureka!
 
Nitin sharma - Deep Learning Applications to Online Payment Fraud Detection
Nitin sharma - Deep Learning Applications to Online Payment Fraud DetectionNitin sharma - Deep Learning Applications to Online Payment Fraud Detection
Nitin sharma - Deep Learning Applications to Online Payment Fraud DetectionMLconf
 
Credit card fraud detection
Credit card fraud detectionCredit card fraud detection
Credit card fraud detectionvineeta vineeta
 
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...DATAVERSITY
 
Predictive analysis and modelling
Predictive analysis and modellingPredictive analysis and modelling
Predictive analysis and modellinglalit Lalitm7225
 
Loan approval prediction based on machine learning approach
Loan approval prediction based on machine learning approachLoan approval prediction based on machine learning approach
Loan approval prediction based on machine learning approachEslam Nader
 

What's hot (20)

Model building in credit card and loan approval
Model building in credit card and loan approval Model building in credit card and loan approval
Model building in credit card and loan approval
 
Credit card payment_fraud_detection
Credit card payment_fraud_detectionCredit card payment_fraud_detection
Credit card payment_fraud_detection
 
Machine Learning for Fraud Detection
Machine Learning for Fraud DetectionMachine Learning for Fraud Detection
Machine Learning for Fraud Detection
 
Machine Learning in Banking
Machine Learning in BankingMachine Learning in Banking
Machine Learning in Banking
 
Benefit of Predictive Analytics in Healthcare
Benefit of Predictive Analytics in HealthcareBenefit of Predictive Analytics in Healthcare
Benefit of Predictive Analytics in Healthcare
 
Applications of Big Data Analytics in Businesses
Applications of Big Data Analytics in BusinessesApplications of Big Data Analytics in Businesses
Applications of Big Data Analytics in Businesses
 
Machine Learning and AI in Risk Management
Machine Learning and AI in Risk ManagementMachine Learning and AI in Risk Management
Machine Learning and AI in Risk Management
 
Unsupervised Anomaly Detection with Isolation Forest - Elena Sharova
Unsupervised Anomaly Detection with Isolation Forest - Elena SharovaUnsupervised Anomaly Detection with Isolation Forest - Elena Sharova
Unsupervised Anomaly Detection with Isolation Forest - Elena Sharova
 
Data Governance_Notes.pptx
Data Governance_Notes.pptxData Governance_Notes.pptx
Data Governance_Notes.pptx
 
CRISP-DM: a data science project methodology
CRISP-DM: a data science project methodologyCRISP-DM: a data science project methodology
CRISP-DM: a data science project methodology
 
Decision tree for Predictive Modeling
Decision tree for Predictive ModelingDecision tree for Predictive Modeling
Decision tree for Predictive Modeling
 
Nitin sharma - Deep Learning Applications to Online Payment Fraud Detection
Nitin sharma - Deep Learning Applications to Online Payment Fraud DetectionNitin sharma - Deep Learning Applications to Online Payment Fraud Detection
Nitin sharma - Deep Learning Applications to Online Payment Fraud Detection
 
Credit card fraud detection
Credit card fraud detectionCredit card fraud detection
Credit card fraud detection
 
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
Data Architecture Strategies: Building an Enterprise Data Strategy – Where to...
 
Predictive analysis and modelling
Predictive analysis and modellingPredictive analysis and modelling
Predictive analysis and modelling
 
Credit Risk Model Building Steps
Credit Risk Model Building StepsCredit Risk Model Building Steps
Credit Risk Model Building Steps
 
BI Presentation
BI PresentationBI Presentation
BI Presentation
 
Loan approval prediction based on machine learning approach
Loan approval prediction based on machine learning approachLoan approval prediction based on machine learning approach
Loan approval prediction based on machine learning approach
 
Open Banking APIs on AWS
Open Banking APIs on AWSOpen Banking APIs on AWS
Open Banking APIs on AWS
 
Fraud Analytics
Fraud AnalyticsFraud Analytics
Fraud Analytics
 

Similar to Machine Learning Applications in Credit Risk

Regtech in Fintech + QuSandbox Demo
Regtech in Fintech + QuSandbox DemoRegtech in Fintech + QuSandbox Demo
Regtech in Fintech + QuSandbox DemoQuantUniversity
 
A flexible recommenndation system for Cable TV
A flexible recommenndation system for Cable TVA flexible recommenndation system for Cable TV
A flexible recommenndation system for Cable TVIntoTheMinds
 
A Flexible Recommendation System for Cable TV
A Flexible Recommendation System for Cable TVA Flexible Recommendation System for Cable TV
A Flexible Recommendation System for Cable TVFrancisco Couto
 
Practical model management in the age of Data science and ML
Practical model management in the age of Data science and MLPractical model management in the age of Data science and ML
Practical model management in the age of Data science and MLQuantUniversity
 
Time series analysis : Refresher and Innovations
Time series analysis : Refresher and InnovationsTime series analysis : Refresher and Innovations
Time series analysis : Refresher and InnovationsQuantUniversity
 
Digicorp - Supply Chain Analytics Apps
Digicorp - Supply Chain Analytics AppsDigicorp - Supply Chain Analytics Apps
Digicorp - Supply Chain Analytics AppsDigicorp
 
Scaling Analytics with Apache Spark
Scaling Analytics with Apache SparkScaling Analytics with Apache Spark
Scaling Analytics with Apache SparkQuantUniversity
 
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...QuantUniversity
 
The Future of BriteCore - Product Development
The Future of BriteCore - Product DevelopmentThe Future of BriteCore - Product Development
The Future of BriteCore - Product DevelopmentPhil Reynolds
 
Digital Transformation at the University of Edinburgh
Digital Transformation at the University of EdinburghDigital Transformation at the University of Edinburgh
Digital Transformation at the University of EdinburghMark Ritchie
 
Machine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and ApplicationsMachine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and ApplicationsQuantUniversity
 
Multi-Agency Multi-Media Interoperable Communication, Enabled By Redis: Paul ...
Multi-Agency Multi-Media Interoperable Communication, Enabled By Redis: Paul ...Multi-Agency Multi-Media Interoperable Communication, Enabled By Redis: Paul ...
Multi-Agency Multi-Media Interoperable Communication, Enabled By Redis: Paul ...Redis Labs
 
Using PySpark to Scale Markov Decision Problems for Policy Exploration
Using PySpark to Scale Markov Decision Problems for Policy ExplorationUsing PySpark to Scale Markov Decision Problems for Policy Exploration
Using PySpark to Scale Markov Decision Problems for Policy ExplorationDatabricks
 
Pistoia Alliance USA Conference 2016
Pistoia Alliance USA Conference 2016Pistoia Alliance USA Conference 2016
Pistoia Alliance USA Conference 2016Pistoia Alliance
 

Similar to Machine Learning Applications in Credit Risk (20)

Credit risk meetup
Credit risk meetupCredit risk meetup
Credit risk meetup
 
Regtech in Fintech + QuSandbox Demo
Regtech in Fintech + QuSandbox DemoRegtech in Fintech + QuSandbox Demo
Regtech in Fintech + QuSandbox Demo
 
A flexible recommenndation system for Cable TV
A flexible recommenndation system for Cable TVA flexible recommenndation system for Cable TV
A flexible recommenndation system for Cable TV
 
A Flexible Recommendation System for Cable TV
A Flexible Recommendation System for Cable TVA Flexible Recommendation System for Cable TV
A Flexible Recommendation System for Cable TV
 
Practical model management in the age of Data science and ML
Practical model management in the age of Data science and MLPractical model management in the age of Data science and ML
Practical model management in the age of Data science and ML
 
Time series analysis : Refresher and Innovations
Time series analysis : Refresher and InnovationsTime series analysis : Refresher and Innovations
Time series analysis : Refresher and Innovations
 
QuSandbox+NVIDIA Rapids
QuSandbox+NVIDIA RapidsQuSandbox+NVIDIA Rapids
QuSandbox+NVIDIA Rapids
 
Ds for finance day 4
Ds for finance day 4Ds for finance day 4
Ds for finance day 4
 
Digicorp - Supply Chain Analytics Apps
Digicorp - Supply Chain Analytics AppsDigicorp - Supply Chain Analytics Apps
Digicorp - Supply Chain Analytics Apps
 
Scaling Analytics with Apache Spark
Scaling Analytics with Apache SparkScaling Analytics with Apache Spark
Scaling Analytics with Apache Spark
 
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
 
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
 
The Future of BriteCore - Product Development
The Future of BriteCore - Product DevelopmentThe Future of BriteCore - Product Development
The Future of BriteCore - Product Development
 
Digital Transformation at the University of Edinburgh
Digital Transformation at the University of EdinburghDigital Transformation at the University of Edinburgh
Digital Transformation at the University of Edinburgh
 
Machine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and ApplicationsMachine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and Applications
 
Shikha fdp 62_14july2017
Shikha fdp 62_14july2017Shikha fdp 62_14july2017
Shikha fdp 62_14july2017
 
Multi-Agency Multi-Media Interoperable Communication, Enabled By Redis: Paul ...
Multi-Agency Multi-Media Interoperable Communication, Enabled By Redis: Paul ...Multi-Agency Multi-Media Interoperable Communication, Enabled By Redis: Paul ...
Multi-Agency Multi-Media Interoperable Communication, Enabled By Redis: Paul ...
 
Using PySpark to Scale Markov Decision Problems for Policy Exploration
Using PySpark to Scale Markov Decision Problems for Policy ExplorationUsing PySpark to Scale Markov Decision Problems for Policy Exploration
Using PySpark to Scale Markov Decision Problems for Policy Exploration
 
Pistoia Alliance USA Conference 2016
Pistoia Alliance USA Conference 2016Pistoia Alliance USA Conference 2016
Pistoia Alliance USA Conference 2016
 
21st century quant
21st century quant21st century quant
21st century quant
 

More from QuantUniversity

EU Artificial Intelligence Act 2024 passed !
EU Artificial Intelligence Act 2024 passed !EU Artificial Intelligence Act 2024 passed !
EU Artificial Intelligence Act 2024 passed !QuantUniversity
 
Managing-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdf
Managing-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdfManaging-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdf
Managing-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdfQuantUniversity
 
PYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALS
PYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALSPYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALS
PYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALSQuantUniversity
 
Qu for India - QuantUniversity FundRaiser
Qu for India  - QuantUniversity FundRaiserQu for India  - QuantUniversity FundRaiser
Qu for India - QuantUniversity FundRaiserQuantUniversity
 
Ml master class for CFA Dallas
Ml master class for CFA DallasMl master class for CFA Dallas
Ml master class for CFA DallasQuantUniversity
 
Algorithmic auditing 1.0
Algorithmic auditing 1.0Algorithmic auditing 1.0
Algorithmic auditing 1.0QuantUniversity
 
Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...
Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...
Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...QuantUniversity
 
Machine Learning: Considerations for Fairly and Transparently Expanding Acces...
Machine Learning: Considerations for Fairly and Transparently Expanding Acces...Machine Learning: Considerations for Fairly and Transparently Expanding Acces...
Machine Learning: Considerations for Fairly and Transparently Expanding Acces...QuantUniversity
 
Seeing what a gan cannot generate: paper review
Seeing what a gan cannot generate: paper reviewSeeing what a gan cannot generate: paper review
Seeing what a gan cannot generate: paper reviewQuantUniversity
 
AI Explainability and Model Risk Management
AI Explainability and Model Risk ManagementAI Explainability and Model Risk Management
AI Explainability and Model Risk ManagementQuantUniversity
 
Algorithmic auditing 1.0
Algorithmic auditing 1.0Algorithmic auditing 1.0
Algorithmic auditing 1.0QuantUniversity
 
Machine Learning in Finance: 10 Things You Need to Know in 2021
Machine Learning in Finance: 10 Things You Need to Know in 2021Machine Learning in Finance: 10 Things You Need to Know in 2021
Machine Learning in Finance: 10 Things You Need to Know in 2021QuantUniversity
 
Bayesian Portfolio Allocation
Bayesian Portfolio AllocationBayesian Portfolio Allocation
Bayesian Portfolio AllocationQuantUniversity
 
Constructing Private Asset Benchmarks
Constructing Private Asset BenchmarksConstructing Private Asset Benchmarks
Constructing Private Asset BenchmarksQuantUniversity
 
Machine Learning Interpretability
Machine Learning InterpretabilityMachine Learning Interpretability
Machine Learning InterpretabilityQuantUniversity
 
Responsible AI in Action
Responsible AI in ActionResponsible AI in Action
Responsible AI in ActionQuantUniversity
 
Qu speaker series 14: Synthetic Data Generation in Finance
Qu speaker series 14: Synthetic Data Generation in FinanceQu speaker series 14: Synthetic Data Generation in Finance
Qu speaker series 14: Synthetic Data Generation in FinanceQuantUniversity
 

More from QuantUniversity (20)

EU Artificial Intelligence Act 2024 passed !
EU Artificial Intelligence Act 2024 passed !EU Artificial Intelligence Act 2024 passed !
EU Artificial Intelligence Act 2024 passed !
 
Managing-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdf
Managing-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdfManaging-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdf
Managing-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdf
 
PYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALS
PYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALSPYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALS
PYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALS
 
Qu for India - QuantUniversity FundRaiser
Qu for India  - QuantUniversity FundRaiserQu for India  - QuantUniversity FundRaiser
Qu for India - QuantUniversity FundRaiser
 
Ml master class for CFA Dallas
Ml master class for CFA DallasMl master class for CFA Dallas
Ml master class for CFA Dallas
 
Algorithmic auditing 1.0
Algorithmic auditing 1.0Algorithmic auditing 1.0
Algorithmic auditing 1.0
 
Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...
Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...
Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...
 
Machine Learning: Considerations for Fairly and Transparently Expanding Acces...
Machine Learning: Considerations for Fairly and Transparently Expanding Acces...Machine Learning: Considerations for Fairly and Transparently Expanding Acces...
Machine Learning: Considerations for Fairly and Transparently Expanding Acces...
 
Seeing what a gan cannot generate: paper review
Seeing what a gan cannot generate: paper reviewSeeing what a gan cannot generate: paper review
Seeing what a gan cannot generate: paper review
 
AI Explainability and Model Risk Management
AI Explainability and Model Risk ManagementAI Explainability and Model Risk Management
AI Explainability and Model Risk Management
 
Algorithmic auditing 1.0
Algorithmic auditing 1.0Algorithmic auditing 1.0
Algorithmic auditing 1.0
 
Machine Learning in Finance: 10 Things You Need to Know in 2021
Machine Learning in Finance: 10 Things You Need to Know in 2021Machine Learning in Finance: 10 Things You Need to Know in 2021
Machine Learning in Finance: 10 Things You Need to Know in 2021
 
Bayesian Portfolio Allocation
Bayesian Portfolio AllocationBayesian Portfolio Allocation
Bayesian Portfolio Allocation
 
The API Jungle
The API JungleThe API Jungle
The API Jungle
 
Explainable AI Workshop
Explainable AI WorkshopExplainable AI Workshop
Explainable AI Workshop
 
Constructing Private Asset Benchmarks
Constructing Private Asset BenchmarksConstructing Private Asset Benchmarks
Constructing Private Asset Benchmarks
 
Machine Learning Interpretability
Machine Learning InterpretabilityMachine Learning Interpretability
Machine Learning Interpretability
 
Responsible AI in Action
Responsible AI in ActionResponsible AI in Action
Responsible AI in Action
 
Qu speaker series 14: Synthetic Data Generation in Finance
Qu speaker series 14: Synthetic Data Generation in FinanceQu speaker series 14: Synthetic Data Generation in Finance
Qu speaker series 14: Synthetic Data Generation in Finance
 
Qwafafew meeting 5
Qwafafew meeting 5Qwafafew meeting 5
Qwafafew meeting 5
 

Machine Learning Applications in Credit Risk

  • 1. Machine Learning applications in Credit Risk
    Location: ARPM Open Source Conference, 8/13/2017
    Presented by: Sri Krishnamurthy, CFA, CAP
    sri@quantuniversity.com
    www.analyticscertificate.com
    2017 Copyright QuantUniversity LLC.
  • 2. Slides will be available at: www.analyticscertificate.com/MachineLearning
  • 3. Sri Krishnamurthy, Founder and CEO
    • Founder of QuantUniversity LLC. and www.analyticscertificate.com
    • Advisory and consultancy for financial analytics
    • Prior experience at MathWorks, Citigroup and Endeca, and with 25+ financial services and energy customers
    • Regular columnist for Wilmott Magazine
    • Author of the forthcoming book “Financial Modeling: A case study approach”, published by Wiley
    • Chartered Financial Analyst and Certified Analytics Professional
    • Teaches analytics in the Babson College MBA program and at Northeastern University, Boston
  • 4. Quantitative Analytics and Big Data Analytics Onboarding
    • Data Science, Quant Finance and Machine Learning Advisory
    • Trained more than 1000 students in quantitative methods, data science and big data technologies using MATLAB, Python and R
    • Launching
      ▫ Analytics Certificate Program: Spring 2018
      ▫ Fintech Certification program: Fall 2017
    • Building
  • 6. Credit risk in consumer credit
    Credit-scoring models and techniques assess the risk in lending to customers.
    Typical decisions:
    • Grant or deny credit to new applicants
    • Increase or decrease spending limits
    • Increase or decrease lending rates
    • Decide which new products can be offered to existing applicants
  • 7. Credit assessment in consumer credit
    History:
    • Gut feel
    • Social network
    • Communities and influence
    Traditional:
    • Scoring mechanisms through credit bureaus
    • Bank assessments through business rules
    Newer approaches (FINTECH):
    • Peer-to-peer lending
    • Lending Club, Prosper Marketplace
  • 10. Supervised Learning
    Used to derive a relationship between dependent and independent variables
    • Prediction
      ▫ Regression
      ▫ Decision Trees (CART)
      ▫ Neural Networks
    • Classification
      ▫ Logistic Regression
      ▫ CART, Random Forest, SVM
      ▫ Neural Networks
  • 11. Methodology
    • Data pre-processing
    • Split data into Training and Testing sets
    • Train the model on Training data
    • Test the model using Testing data to evaluate model performance
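The methodology on slide 11 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy data (a debt-to-income ratio and a default flag), the 70/30 split, and the one-feature threshold "model" are all hypothetical and are not taken from the slides.

```python
import random

def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle rows reproducibly and split them into training and testing sets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_threshold(train):
    """'Train' a one-feature classifier: the midpoint between the two class means."""
    good = [x for x, label in train if label == 0]
    bad = [x for x, label in train if label == 1]
    return (sum(good) / len(good) + sum(bad) / len(bad)) / 2

def accuracy(threshold, rows):
    """Share of rows where 'feature above threshold' agrees with the default label."""
    hits = sum((x > threshold) == bool(label) for x, label in rows)
    return hits / len(rows)

# Hypothetical toy data: (debt-to-income ratio, defaulted? 0/1)
rng = random.Random(42)
data = [(rng.gauss(0.25, 0.05), 0) for _ in range(100)] + \
       [(rng.gauss(0.55, 0.05), 1) for _ in range(100)]

train, test = train_test_split(data, test_fraction=0.3, seed=1)  # split
threshold = fit_threshold(train)                                 # train on training data only
test_accuracy = accuracy(threshold, test)                        # evaluate on held-out data
```

The point of the split is that `test_accuracy` is measured on rows the model never saw while fitting, so it estimates performance on new applicants rather than on memorized ones.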
  • 12. Unsupervised Learning
    • No distinction between independent variables and dependent variables
    • No result labels to determine “correct” results
    • Goals:
      ▫ Data reduction
      ▫ Clustering
  • 13. Types of Clustering
    • Partitioning clustering
      ▫ Starts with K, the number of clusters sought
      ▫ Observations are initially divided at random and then reassigned to form cohesive clusters
      ▫ Example: K-means
    • Hierarchical agglomerative clustering
      ▫ Each observation starts as its own cluster
      ▫ Clusters are combined two at a time until only one cluster remains
      ▫ Example: hierarchical clustering using single linkage, Ward’s method, etc.
  • 14. K-means
    • Tries to separate samples into K groups, with the goal of maximizing between-group variance and minimizing within-group variance
    • Requires K to be specified up front
    • Starts with K initial centroids and optimizes to minimize the criterion, or until the specified number of iterations is reached
    • Suited for larger datasets
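A minimal sketch of the K-means loop described on slide 14, written in plain Python. Two assumptions to note: the initial centroids are simply the first K points (real implementations usually pick them at random or with k-means++), and the six 2D points are hypothetical toy data.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate between assignment and centroid update."""
    centroids = list(points[:k])       # naive initialization (an assumption of this sketch)
    labels = None
    for _ in range(max_iter):          # stop at the iteration cap
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:       # or earlier, once assignments no longer change
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                # keep the old centroid if a cluster is empty
                centroids[c] = mean_point(members)
    return labels, centroids

# Hypothetical toy data: two well-separated groups in 2D
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centroids = kmeans(points, k=2)
```

Each pass costs on the order of N times K distance evaluations, which is one reason the slide describes K-means as suited to larger datasets.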
  • 15. Hierarchical clustering
    • Goal is to derive a dendrogram starting from each record being its own cluster
    • Works well for smaller data sets
    • Proximity is measured in multiple ways (more later)
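The agglomerative procedure can be sketched as follows, assuming single linkage and Euclidean distance (one of several possible choices). The function name and the four toy points are hypothetical. The returned merge history is the information a dendrogram would draw.

```python
import math

def single_linkage(points):
    """Merge the two closest clusters repeatedly until a single cluster remains.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [[i] for i in range(len(points))]   # each record starts as its own cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance between the closest pair of members
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges

# Hypothetical toy data
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 2.0)]
merges = single_linkage(points)
```

This naive version recomputes all pairwise cluster distances at every step, which is consistent with the slide's remark that the approach works well for smaller data sets.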
  • 16. The notion of distance
    How do you measure similarity between two entities?
      ▫ Apples and bananas
      ▫ Coke and Pepsi vs orange juice
      ▫ Honda Civic vs Toyota Corolla
      ▫ New York and Boston
  • 17. Distance measures
    • Euclidean distance
    • Cosine distance
  • 18. Other distance measures
    • Manhattan distance (taxi-cab distance)
    • Jaccard distance
      ▫ Used to measure similarity or dissimilarity between binary and non-binary variables
      ▫ http://people.revoledu.com/kardi/tutorial/Similarity/Jaccard.html
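The four measures named on slides 17 and 18 have short standard definitions. The sketch below uses plain Python; the function names are illustrative, and the Jaccard version assumes the inputs are treated as sets with at least one element between them.

```python
import math

def euclidean(p, q):
    """Straight-line distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Sum of absolute coordinate differences (taxi-cab distance)."""
    return sum(abs(a - b) for a, b in zip(p, q))

def cosine_distance(p, q):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / (norm_p * norm_q)

def jaccard_distance(s, t):
    """One minus (size of intersection / size of union)."""
    s, t = set(s), set(t)
    return 1.0 - len(s & t) / len(s | t)

d_euclid = euclidean((3.0, 4.0), (0.0, 0.0))               # 5.0
d_manhattan = manhattan((3.0, 4.0), (0.0, 0.0))            # 7.0
d_cosine = cosine_distance((1.0, 0.0), (0.0, 1.0))         # 1.0 (orthogonal vectors)
d_jaccard = jaccard_distance({"a", "b", "c"}, {"b", "c", "d"})  # 0.5
```

Cosine distance ignores magnitude and compares direction only, so two vectors that differ just by scale are at distance zero, whereas Euclidean and Manhattan distance both grow with the size of the difference.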
  • 19. Working with mixed-data
    • Gower distance is used for calculating distances when we have mixed types of variables (continuous and categorical)
    • Variables can be:
      ▫ Quantitative (such as rating scale)
      ▫ Binary (such as present/absent)
      ▫ Nominal (such as worker/teacher/clerk)
    • The metrics used for each data type are described below:
      ▫ Quantitative: range-normalized Manhattan distance
      ▫ Ordinal: variable is first ranked, then Manhattan distance is used with a special adjustment for ties
      ▫ Nominal: variables of k categories are first converted into k binary columns and then the Dice coefficient is used (https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient)
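A simplified sketch of the Gower calculation for quantitative and nominal variables only. It omits the ordinal ranking step, variable weights, and missing-value handling, so it is not a substitute for a library implementation. For a single nominal variable expanded into k binary columns, the Dice coefficient is 1 when the categories match and 0 otherwise, so the code uses a plain match/mismatch test. The applicant records are hypothetical.

```python
def gower_distance(x, y, ranges):
    """Simplified Gower dissimilarity between two records.

    ranges[i] is the observed range (max - min) of numeric variable i,
    or None when variable i is nominal. Numeric ranges must be non-zero.
    """
    total = 0.0
    for a, b, r in zip(x, y, ranges):
        if r is None:
            total += 0.0 if a == b else 1.0   # nominal: match / mismatch
        else:
            total += abs(a - b) / r           # numeric: range-normalized Manhattan
    return total / len(ranges)                # average over all variables

# Hypothetical applicants: (age, annual income, occupation)
applicants = [
    (25, 40_000, "clerk"),
    (45, 90_000, "teacher"),
    (35, 40_000, "clerk"),
]

# Range of each numeric column across the data set; None marks a nominal column
columns = list(zip(*applicants))
ranges = [
    (max(col) - min(col)) if isinstance(col[0], (int, float)) else None
    for col in columns
]

d01 = gower_distance(applicants[0], applicants[1], ranges)  # differs on every variable
d02 = gower_distance(applicants[0], applicants[2], ranges)  # differs on age only
```

Because every per-variable term lies between 0 and 1, the result is also between 0 and 1, which lets age in years and income in dollars contribute on the same scale.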
  • 20. Support in R (functions from the cluster package)
    • daisy: computes all the pairwise dissimilarities (distances) between observations in the data set
    • pam: partitions (clusters) the data into k clusters “around medoids”, a more robust version of K-means
    • agnes: computes agglomerative nesting (hierarchical clustering) of the data set
  • 26. Objective
    • Calculate dissimilarity between observations
    • Select an algorithm to group observations together
    • Choose the best number of clusters
    • Visualize clusters on reduced dimensions
  • 27. Selecting number of clusters
    • Partitioning around medoids (PAM) is used in this case.
    • PAM is an iterative clustering procedure with the following steps:
      ▫ Step 1: Choose k random entities to become the medoids.
      ▫ Step 2: Assign every entity to its closest medoid (using the distance matrix we have calculated).
      ▫ Step 3: For each cluster, identify the observation that would yield the lowest average distance if it were to be re-assigned as the medoid. If that observation is not the current medoid, make it the new medoid.
      ▫ Step 4: If at least one medoid has changed, return to step 2. Otherwise, end the algorithm.
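The four steps on slide 27 can be sketched directly over a precomputed distance matrix. This follows the steps as the slide states them and is not the full swap-based PAM found in statistical packages. The function name, the seed, and the one-dimensional toy data are hypothetical, and the sketch assumes all observations are distinct.

```python
import random

def k_medoids(dist, k, seed=0, max_iter=100):
    """Alternate between assigning entities to medoids and updating the medoids."""
    n = len(dist)
    rng = random.Random(seed)
    medoids = rng.sample(range(n), k)                  # Step 1: k random medoids
    for _ in range(max_iter):
        # Step 2: assign every entity to its closest medoid
        clusters = {m: [] for m in medoids}
        for i in range(n):
            closest = min(medoids, key=lambda m: dist[i][m])
            clusters[closest].append(i)
        # Step 3: in each cluster, pick the member with the lowest total
        # (equivalently, average) distance to the other members
        new_medoids = [
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            for members in clusters.values()
        ]
        # Step 4: stop once no medoid has changed
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    labels = [min(medoids, key=lambda m: dist[i][m]) for i in range(n)]
    return sorted(medoids), labels

# Hypothetical toy data: six values on a line, forming two obvious groups
values = [1, 2, 3, 10, 11, 12]
dist = [[abs(a - b) for b in values] for a in values]
medoids, labels = k_medoids(dist, k=2)
```

Since the procedure only reads the distance matrix, it works with any dissimilarity, including one built for mixed data types, which K-means cannot use because it needs to average coordinates.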
  • 28. Visualization with reduced dimension
    • One way to visualize many variables in a lower-dimensional space is with t-distributed stochastic neighbor embedding (t-SNE)
    • This method is a dimension reduction technique that tries to preserve local structure so as to make clusters visible in a 2D or 3D visualization.
    • https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding
  • 31. Fintech being noticed by Regulators
  • 32. Regulatory Sandboxes
    • The regulatory sandbox allows businesses to test innovative products, services, business models and delivery mechanisms in the real market, with real consumers.
    • The sandbox is a supervised space, open to both authorized and unauthorized firms, that provides firms with:
      ▫ reduced time-to-market at potentially lower cost
      ▫ appropriate consumer protection safeguards built in to new products and services
      ▫ better access to finance
    • https://www.fca.org.uk/firms/regulatory-sandbox
  • 34. Model Validation
    • “Model risk is the potential for adverse consequences from decisions based on incorrect or misused model outputs and reports.” [1]
    • “Model validation is the set of processes and activities intended to verify that models are performing as expected, in line with their design objectives and business uses.” [1]
    Ref:
    [1] Supervisory Letter SR 11-7 on guidance on Model Risk
  • 35. Popularity of Open-source software in the enterprise increasing
  • 36. Cloud maturing
    • Financial Services customers like Capital One, FINRA, and Pacific Life are moving critical workloads to AWS
  • 37. Challenges in adopting Open-source software in the enterprise
    • Versions and packages
  • 38. Challenges in adopting Open-source software in the enterprise
    • Difficulty in replicating and reconciling differences in environments
  • 39. Challenges in adopting Open-source software in the enterprise
    • Deploying models built by Data Scientists still a problem
    (Diagram labels: Data Scientists, Enterprise IT)
  • 40. Challenges in adopting Open-source software in the enterprise
    • The try-before-adopt model is difficult with unproven open-source solutions
  • 42. Use cases: Quant/Enterprise use cases
    • Create an environment that can support multiple platforms and programming languages
    • Enable remote running of applications
    • Ability to try out a Github submission or someone else’s code
    • Facilitate creation of Docker images to create replicable containers
    • Create prototyping environments for Data Science/Quant teams
    • Enable Data scientists/Quants to deploy their solutions
    • Enable running multiple experiments concurrently
    • Integrate seamlessly with the cloud to scale up computations
  • 43. Use cases: Fintech use cases
    • To demonstrate solutions to enterprises
    • Create customized enterprise trials for companies that don’t permit installation of vendor software prior to procurement
    • To manage quick updates
    • Enable effective integration and hosting of services (REST APIs)
  • 44. Use cases: Academic use cases
    • Enable creation of course material and exercises that could be shared
    • Enable students and workshop participants to focus on the data science experiments rather than environment setting
  • 45. Creating replicable environments
    Create and manage replicable environments (Code + software + data) in a single portal
  • 46. Creating replicable environments
    Create replicable environments (Code + software + data) through an easy point-and-click tool, then publish to Dockerhub or manage internally
    Share them with target users
  • 47. User portal
    • Run multiple experiments in pre-created environments (Code + software + data)
    • Deploy your own solutions
    • Run any Docker image or Github submission on the cloud
  • 48. Run Jupyter notebooks and prototype applications
  • 49. Run Rstudio and Shiny applications
  • 50. Run any Docker application
  • 52. User portal
    • Dockerize and deploy applications on AWS in just a few steps
  • 54. QU’s open source project: Project Mozaic
  • 57. Thank you ARPM and enjoy the boot camp!
    Check out our programs at:
    www.analyticscertificate.com/fintech
    www.qusandbox.com
    Sri Krishnamurthy, CFA, CAP
    Founder and CEO, QuantUniversity LLC.
    srikrishnamurthy
    www.QuantUniversity.com
    Information, data and drawings embodied in this presentation are strictly a property of QuantUniversity LLC. and shall not be distributed or used in any other publication without the prior written consent of QuantUniversity LLC.