www.prmia.org © PRMIA 2020
10 Key Considerations for AI/ML Model Governance
Sri Krishnamurthy, CFA, CAP
Founder & CEO
www.QuantUniversity.com
Thought Leadership Webinar
Before We Begin
• Submit your questions anytime using the Questions pane.
• This session is being recorded.
• Use the Show/Hide panel arrow to download the handout.
Presenter
Sri Krishnamurthy, CFA, CAP
Founder & CEO, QuantUniversity
• Advisory and consultancy for financial analytics
• Prior experience at MathWorks, Citigroup, and Endeca; 25+ years in financial services and energy
• Columnist for Wilmott Magazine
• Teaches Analytics, AI, and ML related topics at Northeastern University, Boston
• Reviewer: Journal of Asset Management
About www.QuantUniversity.com
• Boston-based Data Science, Quant Finance and Machine Learning training and consulting advisory
• Trained more than 5,000 students in quantitative methods, Data Science and Big Data technologies using MATLAB, Python and R
• Building a platform for AI and Machine Learning enablement in the enterprise
AI is no longer science fiction!
Your challenge is to design an artificial intelligence and machine learning (AI/ML) framework capable of flying a drone through several professional drone racing courses without human intervention or navigational pre-programming.
Source: https://www.lockheedmartin.com/en-us/news/events/ai-innovation-challenge.html
RBC and BCG Patent Applications
RBC patents in 2019 [1]
• K-LSTM (long short-term memory) architecture for purchase prediction
• Machine learning architecture with adversarial attack defense
• Trade platform with reinforcement learning
• Machine natural language processing
BCG patent [2]
• Systems and methods for predicting transactions
1. https://www.fintechfutures.com/2020/01/canadas-rbc-files-patents-for-ai-inventions-as-bigtechs-soar/
2. https://patents.justia.com/patent/10002322
The Machine Learning and AI Workflow
Data Scraping/Ingestion → Data Exploration → Data Cleansing and Processing → Feature Engineering → Modeling (Supervised / Unsupervised) → Model Evaluation & Tuning → Model Selection → Model Deployment/Inference

Roles across the workflow:
• Data Engineer, DevOps Engineer
• Data Scientist / Quants
• Software / Web Engineer
• Analysts & Decision Makers
• Risk Management / Compliance (all stages)

Modeling:
• Supervised: Regression, KNN, Decision Trees, Naive Bayes, Neural Networks, Ensembles
• Unsupervised: Clustering, PCA, Autoencoders

Model evaluation & tuning:
• Metrics: RMSE, MAPE, MAE, Confusion Matrix, Precision/Recall, ROC
• Hyper-parameter tuning, parameter grids
• AutoML, model validation, interpretability

Model deployment/inference:
• SW: Web/REST API; HW: GPU, Cloud
• Monitoring
• Robotic Process Automation (RPA) (microservices, pipelines)
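The "parameter grids" step in the evaluation and tuning stage can be sketched in a few lines of plain Python. This is a minimal illustration, not the presenter's implementation: the grid keys, values, and scoring function are all hypothetical stand-ins for a real cross-validated model score.

```python
from itertools import product

# Hypothetical hyper-parameter grid for a tree-based model
# (names and values are illustrative only)
param_grid = {
    "max_depth": [3, 5, 10],
    "min_samples_leaf": [1, 5],
}

def evaluate(params):
    """Stand-in for cross-validated scoring of a model with these params.
    A real workflow would train and score a model here."""
    # Toy score: prefer shallow trees with larger leaves (illustrative only)
    return 1.0 / params["max_depth"] + 0.01 * params["min_samples_leaf"]

# Expand the grid into concrete parameter combinations
keys = list(param_grid)
candidates = [dict(zip(keys, values)) for values in product(*param_grid.values())]

best = max(candidates, key=evaluate)
print(len(candidates))  # 6 combinations
print(best)
```

In practice this loop is what tools like AutoML automate, and what the governance program needs to record: which grid was searched and which candidate was selected.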
Polling Question 1
Question: Has your organization formalized an MRM (model risk management) policy for handling machine learning models?
a) Considering it
b) Will be rolled out soon
c) In production
d) Not yet
The Decalogue
Decalogue: Ten best practices for an effective model risk management program, Sri Krishnamurthy,
https://onlinelibrary.wiley.com/doi/abs/10.1002/wilm.10348
1. Adopt a framework-driven approach for model risk management
2. Customize a model risk management program
3. Clearly define roles and responsibilities
4. Integrate model risk management effectively into the model life cycle
5. Don’t reinvent the wheel
6. All models weren’t born equal
7. A checklist is your friend
8. Monitor the health of the models and the program
9. Leverage your domain knowledge on the models
10. Own the model risk management program
NLP Pipeline
Stage 1: Data ingestion from EDGAR
Stage 2: Pre-processing
Stage 3: Invoking APIs to label data
Stage 4: Compare APIs
Stage 5: Build a new model for sentiment analysis

APIs:
• Amazon Comprehend API
• Google API
• Watson API
• Azure API
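The staged pipeline above can be sketched as a chain of small functions. This is a toy stand-in: the lexicon, documents, and labeling logic are invented for illustration, and a real Stage 3 would call the Amazon Comprehend, Google, Watson, or Azure APIs rather than a local scorer.

```python
# Toy sentiment lexicon (illustrative, not from the presentation)
POSITIVE = {"growth", "profit", "strong"}
NEGATIVE = {"loss", "decline", "weak"}

def ingest():                      # Stage 1: data ingestion (stub for EDGAR)
    return ["Strong growth in profit", "Decline and weak demand"]

def preprocess(docs):              # Stage 2: lower-case and tokenize
    return [doc.lower().split() for doc in docs]

def label(tokenized):              # Stage 3: stand-in for an API labeling call
    def score(tokens):
        s = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
        return "positive" if s > 0 else "negative" if s < 0 else "neutral"
    return [score(tokens) for tokens in tokenized]

labels = label(preprocess(ingest()))
print(labels)  # ['positive', 'negative']
```

Structuring each stage as a function with explicit inputs and outputs is what makes Stages 4 and 5 (comparing labelers, training a replacement model) straightforward to govern and audit.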
2. Governing the Machine Learning Model Process
Data cleansing → Feature engineering → Training and testing → Model building → Model selection → Model deployment
3. Model Verification vs. Validation of Machine Learning Models
Model Verification is defined as:
“The process of determining that a model or simulation implementation and its associated data accurately represent the developer’s conceptual description and specifications.”
Model Validation is defined as:
“The process of determining the degree to which a model or simulation and its associated data are an accurate representation of the real world from the perspective of the intended uses of the model.”
Ref: DoD Modeling and Simulation (M&S) Verification, Validation, and Accreditation (VV&A), DoD Instruction 5000.61, December 9, 2009.
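The distinction can be made concrete with a toy example. Everything below is hypothetical: a pricing model whose specification is simply price = notional × rate, a made-up market observation, and an illustrative 5% tolerance. Verification checks the code against the spec; validation checks the model against the world.

```python
# Hypothetical spec: price = notional * rate
def model_price(notional, rate):
    return notional * rate

# Verification: does the implementation match the developer's specification?
assert model_price(100, 0.05) == 100 * 0.05   # matches the spec exactly

# Validation: does the model represent the real world well enough for its
# intended use? Compare against observed outcomes within a tolerance.
observed = 5.2                                 # hypothetical market observation
predicted = model_price(100, 0.05)
assert abs(predicted - observed) / observed < 0.05  # within 5% of reality
print("verified and validated")
```

A model can pass verification (the code is exactly what was specified) and still fail validation (the specification itself does not describe reality), which is why ML governance needs both checks.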
4. Performance Metrics and Evaluation Criteria
Claim:
• Our Machine Learning models are better than
conventional models
Caution:
• What metrics do we use?
• Is accuracy the right metric?
• How do we evaluate the model? Accuracy or F1-score?
• How does the model behave in different regimes?
Source:
https://en.wikipedia.org/wiki/Confusion_matrix
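The "is accuracy the right metric?" caution is easy to demonstrate from a confusion matrix. The counts below are illustrative, not from the presentation; they describe a rare positive class, where accuracy looks excellent while the F1-score exposes a weak classifier.

```python
# Hypothetical confusion-matrix counts for a rare positive class
tp, fp, fn, tn = 10, 5, 20, 965

accuracy  = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)
f1        = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f}")   # 0.975 -- looks excellent
print(f"f1={f1:.3f}")               # 0.444 -- reveals the weak positive class
```

A model that predicted "negative" for everything would score 98.5% accuracy here, which is why precision, recall, and F1 belong in the evaluation criteria for imbalanced problems.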
5. Model Inventory and Tracking
Dimensions: Data, Model, Environment, Process
• Programming environment
• Execution environment
• Hardware specs (Cloud, GPU)
• Dependencies
• Lineage/provenance of individual components
• Model params
• Hyper-parameters
• Pipeline specifications
• Model-specific tests
• Data versions
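One inventory record covering those dimensions can be sketched as a plain dataclass. All field names and values here are illustrative assumptions, not part of the presentation; a real inventory would also capture lineage and dependency versions.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class ModelInventoryEntry:
    name: str
    data_version: str                      # Data: which dataset snapshot was used
    params: dict                           # Model: params and hyper-parameters
    environment: dict                      # Environment: runtime and hardware
    pipeline: list = field(default_factory=list)  # Process: pipeline stages

# Hypothetical entry for a credit model
entry = ModelInventoryEntry(
    name="credit_default_v1",
    data_version="2020-03-31",
    params={"max_depth": 5, "learning_rate": 0.1},
    environment={"python": "3.8", "hardware": "GPU"},
    pipeline=["cleanse", "feature_engineering", "train", "deploy"],
)
record = asdict(entry)     # dict form, ready to serialize into an inventory store
print(record["name"])
```

Keeping the record serializable means the same structure can feed both the inventory database and the tracking/audit trail discussed later.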
6. Data Governance and Model Governance
Source: Sculley et al., 2015 "Hidden Technical Debt in Machine Learning Systems"
7. Development Models vs. Production Models
Claim:
• Our models work on all the datasets we
have tested on.
Caution:
• Do we have enough data?
• How do we handle bias in datasets?
• Beware of overfitting
• Historical Analysis is not Prediction
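The "beware of overfitting" caution can be shown with a toy memorizing model. Everything below is synthetic: a 1-nearest-neighbour predictor simply memorizes its training points (including two deliberately mislabeled ones), so training accuracy is perfect while held-out accuracy is much lower.

```python
def predict_1nn(train, x):
    """Label of the closest training point (pure memorization)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

# Synthetic data: the true rule is "label = 1 if x >= 5", but two training
# points are mislabeled (noise) and the model faithfully memorizes them.
train = [(0, 0), (1, 0), (2, 1), (3, 0), (4, 0),
         (5, 1), (6, 0), (7, 1), (8, 1), (9, 1)]
test  = [(0.5, 0), (2.5, 0), (4.5, 0), (5.5, 1), (6.5, 1), (8.5, 1)]

train_acc = sum(predict_1nn(train, x) == y for x, y in train) / len(train)
test_acc  = sum(predict_1nn(train, x) == y for x, y in test) / len(test)

print(train_acc)  # 1.0 on memorized data
print(test_acc)   # noticeably lower on held-out data
```

A large gap between training and held-out performance is exactly the signal a development-vs-production review should look for before a model is promoted.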
Prototyping vs. Production: The Reality
Kristy Roth from HSBC:
• “It’s been somewhat easy - in a funny way - to get going using sample data, [but] then you hit the real problems,” Roth said. “I think our early track record on PoCs or pilots hides a little bit the underlying issues.”
Matt Davey from Societe Generale:
• “We’ve done quite a bit of work with RPA recently and I have to say we’ve been a bit disillusioned with that experience. The PoC is the easy bit: it’s how you get that into production and shift the balance.”
https://www.itnews.com.au/news/hsbc-societe-generale-run-into-ais-production-problems-477966
Leverage Technology to Scale Analytics in Production
1. 64-bit systems: addressable space ~8 TB
2. Multi-core processors
3. Parallel and Distributed Computing
4. General-purpose computing on graphics processing units
5. Cloud Computing
Ref: Gaining the Technology Edge: http://www.quantuniversity.com/w5.html
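Item 3 (parallel and distributed computing) can be sketched as a parallel map over chunks of work. This is a minimal illustration with an invented workload: a ThreadPoolExecutor is used for brevity, while CPU-bound Python workloads would typically use a ProcessPoolExecutor, GPUs (item 4), or a distributed framework.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate(chunk):
    """Stand-in for an expensive per-chunk computation (e.g. a Monte Carlo run)."""
    return sum(x * x for x in chunk)

# Split the workload into independent chunks
chunks = [range(0, 1000), range(1000, 2000), range(2000, 3000)]

# Map the computation over chunks in parallel, then combine partial results
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(simulate, chunks))

total = sum(partials)
assert total == sum(x * x for x in range(3000))  # matches the serial result
print(total)
```

The governance-relevant point is the final assertion: a parallelized model must be shown to reproduce the serial result, which is itself a verification task.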
9. Machine Learning Choices
• ML as a service
• Pre-trained models
• AutoML
• Models built using packages
• Models developed from scratch
10. Roles and Responsibilities
Development (Quants/Data Scientists):
• New algorithms
• Try new methods
• Effect of parameters and hyper-parameters
Production (Engineering/IT):
• Scaling
• Structuring
• Design of experiments
• Data parallel / task parallel
QuSandbox Research Suite
Components: QuSynthesize, QuSandbox, QuModelStudio, QuAnalyze, QuTrack, QuResearchHub
Capabilities: prototype, iterate and tune; standardize workflows; productionize and share; track models; prepare and evaluate datasets
Architecture: What’s Tracked?
Metadata
• Data about the information to be tracked
• Includes version number, timestamps, user information, MD5 of the artifacts and high-level notes
Data
• Pipelines, custom DSL, standard formats for representing models
• Events (updates, rollbacks)
• JSON, Amazon ION, YAML
Artifacts
• Model pickle files, ONNX, Core ML, model params
• Data, blobs, etc.
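A minimal sketch of one tracked metadata record: the MD5 of an artifact plus version, timestamp, and user information, serialized as JSON (one of the formats listed). The artifact bytes, user id, and notes below are illustrative stand-ins, not the actual QuTrack implementation.

```python
import hashlib
import json
import time

# Stand-in for a serialized model file's contents (e.g. a pickle or ONNX file)
artifact = b"serialized-model-bytes"
md5 = hashlib.md5(artifact).hexdigest()   # fingerprint of the tracked artifact

# Metadata record: version, timestamp, user, artifact hash, high-level notes
metadata = {
    "version": 3,
    "timestamp": time.time(),
    "user": "skrishnamurthy",             # illustrative user id
    "artifact_md5": md5,
    "notes": "retrained on Q1 data",
}

record = json.dumps(metadata, sort_keys=True)   # JSON form for the tracking store
assert json.loads(record)["artifact_md5"] == md5
print(md5)
```

Hashing the artifact rather than trusting its filename is what lets the tracker detect silent changes: any edit to the model file produces a different MD5, flagging a version mismatch.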
Q&A
Sri Krishnamurthy, CFA, CAP
Founder and CEO
Information, data and drawings embodied in this presentation are strictly the property of QuantUniversity LLC, except
where other sources are noted and shall not be distributed or used in any other publication without the prior written
consent of QuantUniversity LLC.