© 2018 Carnegie Mellon University
SATURN 2018
14th Annual SEI Architecture Technology User Network Conference
MAY 7–10, 2018 | PLANO, TEXAS
Marvin AI - An Open Source Platform to
Deploy and Manage Machine Learning Models
Daniel Takabayashi and Jeremy Elster
About us…
Daniel Takabayashi
• Technology Manager and Software Architect @ B2W Digital (São Paulo - Brazil)
• Startups Mentor @ Founder Institute (San Francisco - USA)
• Co-Founder of Boolabs, a Brazilian artificial intelligence startup acquired by B2W in 2016
• MSc in Computer Engineering (IPT - Brazil)
and my contacts…
daniel.takabayashi@b2wdigital.com
@DanTakabayashi
linkedin.com/danieltakabayashi
github.com/takabayashi
About us…
Jeremy Elster
• Data Scientist @ B2W Digital (Irvine - CA)
• University of California @ Berkeley
and my contacts…
jeremy.elster@b2wdigital.com
linkedin.com/in/jeremyelster/
github.com/jeremyelster
B2W Digital is the leading e-commerce company in Latin
America.
Agenda
1. Some Problems in Machine Learning Projects (8 slides)
2. Marvin AI Platform (10 slides)
2.1. Main Components
2.2. Architecture Views
3. Hands On (~30 minutes)
Some Problems in ML Projects
The knowledge domains needed to understand, research, build and deploy ML projects are huge and distinct.
Building a "team" with complementary profiles makes the project more expensive.
Some Problems in Machine Learning Projects (1 of 5)
Almost all data scientists (Type A) do not have the software engineering skills necessary to build a production-grade solution.
And good Type B professionals are unicorns!
Some Problems in Machine Learning Projects (2 of 5)
The faster the creation process ends, the sooner the improvement process starts.
Rapidly establishing a baseline MVP (within a few weeks) is strategic to the project's success!
Some Problems in Machine Learning Projects (3 of 5)
ML teams must save the hypothesis, data, code and metrics for each new iteration of the project.
Reproducibility is always a requirement!
Some Problems in Machine Learning Projects (4 of 5)
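As a rough illustration of what saving the hypothesis, data, code and metrics for one iteration could look like, here is a minimal sketch. The names `IterationRecord`, `fingerprint` and `record_iteration` are hypothetical and are not part of Marvin; the sketch only assumes that the data and code are available as bytes.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


def fingerprint(payload: bytes) -> str:
    """Return a SHA-256 hex digest identifying the exact bytes used."""
    return hashlib.sha256(payload).hexdigest()


@dataclass(frozen=True)
class IterationRecord:
    """Hypothetical record of one project iteration (not a Marvin API)."""
    hypothesis: str
    data_sha256: str
    code_sha256: str
    metrics: dict


def record_iteration(hypothesis: str, data: bytes, code: bytes, metrics: dict) -> str:
    """Serialize what is needed to identify and compare an iteration."""
    record = IterationRecord(
        hypothesis=hypothesis,
        data_sha256=fingerprint(data),
        code_sha256=fingerprint(code),
        metrics=metrics,
    )
    # sort_keys makes the JSON output deterministic for identical inputs.
    return json.dumps(asdict(record), sort_keys=True)
```

Two iterations with the same inputs produce the same record, while any change to the data or code changes a digest, which makes differences between iterations easy to spot.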
Code is prototyped locally
or in Jupyter notebooks
(interactive IDE) in any
language.
Models run over test
datasets, but are not
scalable for production.
Some Problems in Machine Learning Projects (5 of 5)
How to simplify the process of
exploring, building, testing and
deploying machine learning
projects in a reproducible way?
ABSTRACTION
+
STANDARDIZATION
github.com/marvin-ai
Marvin AI Platform
Marvin AI Platform: General Information
• Started at B2W Digital in 2016 to solve internal problems
• Released as open source in 09/2017 under the Apache 2 licence
• First paper published at the Papis.io conference (Boston) in 09/2017
• Three versions released since 09/2017
• Community is growing…
The project was submitted to the Apache incubation process!
Marvin AI Platform: Quality Attributes
For Data Scientists:
• Interoperability - to support different programming languages
• Usability - to accelerate and simplify the model creation process
For Administrators:
• Manageability - to simplify the distributed deployment and management process
• Scalability - to support loads from tiny to intensive
For Marvin Developers:
• Modifiability - to improve and release new versions constantly
• Maintainability - to allow all types of programmers (from beginners to experts) to contribute
Marvin AI Platform: Main components (1 of 3)
DASFE*
Data Acquisition, Serving, Feedback and Evaluation
Marvin AI Platform: Main components (2 of 3)
Engine - Language-specific project that contains the source code related to the model. It implements the DASFE pattern.
Artefacts - Persistent and versioned binaries (initial dataset, dataset, model, and metrics).
Engine Executor - Implementation of the architectural abstractions around the Engine, such as parallelism, distribution, versioning, REST APIs, availability and so on.
Toolbox - Set of CLIs, utilities, classes and libraries, specific to each programming language, that supports the whole process of exploring, developing, testing and deploying an engine (e.g. python-toolbox, scala-toolbox, r-toolbox).
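To make the idea of an Engine implementing the DASFE pattern more concrete, here is a minimal sketch with one method per stage named in the acronym. The class and method names (`DasfeEngine`, `acquire`, `serve`, `feedback`, `evaluate`) are assumptions made for illustration and do not reproduce the actual Marvin toolbox interfaces.

```python
from abc import ABC, abstractmethod
from statistics import mean


class DasfeEngine(ABC):
    """Hypothetical interface with one method per DASFE stage (not Marvin's API)."""

    @abstractmethod
    def acquire(self) -> None: ...

    @abstractmethod
    def serve(self, request): ...

    @abstractmethod
    def feedback(self, request, actual) -> None: ...

    @abstractmethod
    def evaluate(self) -> float: ...


class MeanEngine(DasfeEngine):
    """Toy engine: predicts the mean of all values seen so far."""

    def __init__(self, initial_values):
        self._initial_values = list(initial_values)
        self._values = []
        self._errors = []

    def acquire(self):
        # Data acquisition: load the initial dataset.
        self._values = list(self._initial_values)

    def serve(self, request=None):
        # Serving: answer a prediction request from the current data.
        return mean(self._values)

    def feedback(self, request, actual):
        # Feedback: record the prediction error, then keep the observed value.
        self._errors.append(abs(self.serve(request) - actual))
        self._values.append(actual)

    def evaluate(self):
        # Evaluation: mean absolute error over all feedback received.
        return mean(self._errors) if self._errors else 0.0
```

In this sketch an executor-like caller would only depend on `DasfeEngine`, so concerns such as distribution or versioning could be added around the engine without changing the model code.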
Marvin AI Platform: Main components (3 of 3)
Marvin AI Platform: Some Architectural Tactics
Quality Attribute - Main Tactics
Interoperability - gRPC connections between the EngineExecutor and the UserCode, and a DSL to describe the interfaces.
Usability - CLIs with default parameters and generic REST APIs to manage and request everything in the system. Marvin defines external and coherent concepts (e.g. Executor, Engine, Action and Toolbox).
Manageability - A Manager actor to control (locally or remotely) everything in the system, and a cluster concept to help in distributed installations.
Scalability - Actor model architecture to increase parallelism and distribution throughout the system. Containerisation as the deployment solution.
Modifiability - Encapsulation through the Actor model and base classes, minimum responsibility for each actor, and a lot of abstraction.
Maintainability - Scala as the implementation language, encapsulation, unit tests and continuous delivery. Virtualized development environment (Vagrant and Docker).
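The actor model named in the tactics above can be illustrated with a small sketch: each actor owns its state, receives messages through a mailbox, and processes them one at a time on its own thread. This is a simplified Python illustration using only the standard library; `Actor` and `CounterActor` are hypothetical names and do not reflect how Marvin's Scala actors are implemented.

```python
import queue
import threading


class Actor:
    """Minimal actor: private state, a mailbox, and one thread handling messages in order."""

    _STOP = object()

    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        # Senders never touch the actor's state; they only enqueue messages.
        self._mailbox.put(message)

    def stop(self):
        # Finish after the earlier messages are processed, then wait for the thread.
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self.receive(message)

    def receive(self, message):
        raise NotImplementedError


class CounterActor(Actor):
    """Hypothetical actor that counts the messages it receives."""

    def __init__(self):
        self.count = 0  # set before the worker thread starts
        super().__init__()

    def receive(self, message):
        self.count += 1
```

Because only the actor's own thread mutates `count`, no lock is needed, which is the encapsulation property the Modifiability and Scalability tactics rely on.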
Context Diagram
Marvin AI Platform: Architecture Views (1 of 4)
Components Diagram 1
Marvin AI Platform: Architecture Views (2 of 4)
Components Diagram 2
Marvin AI Platform: Architecture Views (3 of 4)
Deployment Diagram (draft)
Marvin AI Platform: Architecture Views (4 of 4)
Hands On…
Fork me on Github.com/marvin-ai
and feel free to contribute.
Thank you!
twitter.com/_marvin_ai
gitter.im/marvin-ai
