Enviar pesquisa
Carregar
Fri benghiat gil-odsc-data-kitchen-data science to dataops
•
4 gostaram
•
2,087 visualizações
DataKitchen
Seguir
DataOps
Leia menos
Leia mais
Dados e análise
Denunciar
Compartilhar
Denunciar
Compartilhar
1 de 37
Baixar agora
Baixar para ler offline
Recomendados
ODSC May 2019 - The DataOps Manifesto
ODSC May 2019 - The DataOps Manifesto
DataKitchen
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
DataKitchen
Open Data Science Conference Agile Data
Open Data Science Conference Agile Data
DataKitchen
Low-tech, Low-cost data management: Six insights from national reporting on f...
Low-tech, Low-cost data management: Six insights from national reporting on f...
srjbridge
Overcoming DataOps hurdles for ML in Production
Overcoming DataOps hurdles for ML in Production
Sandeep Uttamchandani
Do Agile Data in Just 5 Shocking Steps!
Do Agile Data in Just 5 Shocking Steps!
DataKitchen
Bridged Overview by CodeData
Bridged Overview by CodeData
Sam Sur
DataOps: An Agile Method for Data-Driven Organizations
DataOps: An Agile Method for Data-Driven Organizations
Ellen Friedman
Recomendados
ODSC May 2019 - The DataOps Manifesto
ODSC May 2019 - The DataOps Manifesto
DataKitchen
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
Strata+hadoop data kitchen-seven-steps-to-high-velocity-data-analytics-with d...
DataKitchen
Open Data Science Conference Agile Data
Open Data Science Conference Agile Data
DataKitchen
Low-tech, Low-cost data management: Six insights from national reporting on f...
Low-tech, Low-cost data management: Six insights from national reporting on f...
srjbridge
Overcoming DataOps hurdles for ML in Production
Overcoming DataOps hurdles for ML in Production
Sandeep Uttamchandani
Do Agile Data in Just 5 Shocking Steps!
Do Agile Data in Just 5 Shocking Steps!
DataKitchen
Bridged Overview by CodeData
Bridged Overview by CodeData
Sam Sur
DataOps: An Agile Method for Data-Driven Organizations
DataOps: An Agile Method for Data-Driven Organizations
Ellen Friedman
ODSC data science to DataOps
ODSC data science to DataOps
Christopher Bergh
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
DataKitchen
Monitoring data quality by Jos Gheerardyn of Yields.io
Monitoring data quality by Jos Gheerardyn of Yields.io
Dataops Ghent Meetup
DataOps, DevOps and the Developer: Treating Database Code Just Like App Code
DataOps, DevOps and the Developer: Treating Database Code Just Like App Code
DevOps.com
Redis rise of Dataops
Redis rise of Dataops
landoop
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Data Con LA
5 Simple Steps to Unleash Big Data Talend Connect
5 Simple Steps to Unleash Big Data Talend Connect
Talend
Mike Tuche, CEO of Talend: Enabling the Data Driven Enterprise
Mike Tuche, CEO of Talend: Enabling the Data Driven Enterprise
Talend
Talend 6.1 - What's New in Talend?
Talend 6.1 - What's New in Talend?
Talend
Operationalizing Data Analytics
Operationalizing Data Analytics
VMware Tanzu
Pivotal Big Data Roadshow
Pivotal Big Data Roadshow
VMware Tanzu
Unleash the Power of Big Data and Machine Learning
Unleash the Power of Big Data and Machine Learning
Talend
Metadata Mastery: A Big Step for BI Modernization
Metadata Mastery: A Big Step for BI Modernization
Eric Kavanagh
Achieving Agility and Scale for Your Data Lake - Talend
Achieving Agility and Scale for Your Data Lake - Talend
Talend
Embracing Cloud Agility to Maximize Flexibility & Performance
Embracing Cloud Agility to Maximize Flexibility & Performance
Talend
Embedded-ml(ai)applications - Bjoern Staender
Embedded-ml(ai)applications - Bjoern Staender
Dataconomy Media
Comcast Labs Connect - PHLAI Conference Philadelphia 2018
Comcast Labs Connect - PHLAI Conference Philadelphia 2018
Open Data Group
7 Habits for Big Data in Production - keynote Big Data London Nov 2018
7 Habits for Big Data in Production - keynote Big Data London Nov 2018
Ellen Friedman
Big Data Maturity Scorecard
Big Data Maturity Scorecard
DataWorks Summit
Making the Case for Hadoop in a Large Enterprise-British Airways
Making the Case for Hadoop in a Large Enterprise-British Airways
DataWorks Summit
seven steps to dataops @ dataops.rocks conference Oct 2019
seven steps to dataops @ dataops.rocks conference Oct 2019
DataKitchen
Washington DC DataOps Meetup -- Nov 2019
Washington DC DataOps Meetup -- Nov 2019
DataKitchen
Mais conteúdo relacionado
Mais procurados
ODSC data science to DataOps
ODSC data science to DataOps
Christopher Bergh
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
DataKitchen
Monitoring data quality by Jos Gheerardyn of Yields.io
Monitoring data quality by Jos Gheerardyn of Yields.io
Dataops Ghent Meetup
DataOps, DevOps and the Developer: Treating Database Code Just Like App Code
DataOps, DevOps and the Developer: Treating Database Code Just Like App Code
DevOps.com
Redis rise of Dataops
Redis rise of Dataops
landoop
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Data Con LA
5 Simple Steps to Unleash Big Data Talend Connect
5 Simple Steps to Unleash Big Data Talend Connect
Talend
Mike Tuche, CEO of Talend: Enabling the Data Driven Enterprise
Mike Tuche, CEO of Talend: Enabling the Data Driven Enterprise
Talend
Talend 6.1 - What's New in Talend?
Talend 6.1 - What's New in Talend?
Talend
Operationalizing Data Analytics
Operationalizing Data Analytics
VMware Tanzu
Pivotal Big Data Roadshow
Pivotal Big Data Roadshow
VMware Tanzu
Unleash the Power of Big Data and Machine Learning
Unleash the Power of Big Data and Machine Learning
Talend
Metadata Mastery: A Big Step for BI Modernization
Metadata Mastery: A Big Step for BI Modernization
Eric Kavanagh
Achieving Agility and Scale for Your Data Lake - Talend
Achieving Agility and Scale for Your Data Lake - Talend
Talend
Embracing Cloud Agility to Maximize Flexibility & Performance
Embracing Cloud Agility to Maximize Flexibility & Performance
Talend
Embedded-ml(ai)applications - Bjoern Staender
Embedded-ml(ai)applications - Bjoern Staender
Dataconomy Media
Comcast Labs Connect - PHLAI Conference Philadelphia 2018
Comcast Labs Connect - PHLAI Conference Philadelphia 2018
Open Data Group
7 Habits for Big Data in Production - keynote Big Data London Nov 2018
7 Habits for Big Data in Production - keynote Big Data London Nov 2018
Ellen Friedman
Big Data Maturity Scorecard
Big Data Maturity Scorecard
DataWorks Summit
Making the Case for Hadoop in a Large Enterprise-British Airways
Making the Case for Hadoop in a Large Enterprise-British Airways
DataWorks Summit
Mais procurados
(20)
ODSC data science to DataOps
ODSC data science to DataOps
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop...
Monitoring data quality by Jos Gheerardyn of Yields.io
Monitoring data quality by Jos Gheerardyn of Yields.io
DataOps, DevOps and the Developer: Treating Database Code Just Like App Code
DataOps, DevOps and the Developer: Treating Database Code Just Like App Code
Redis rise of Dataops
Redis rise of Dataops
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
5 Simple Steps to Unleash Big Data Talend Connect
5 Simple Steps to Unleash Big Data Talend Connect
Mike Tuche, CEO of Talend: Enabling the Data Driven Enterprise
Mike Tuche, CEO of Talend: Enabling the Data Driven Enterprise
Talend 6.1 - What's New in Talend?
Talend 6.1 - What's New in Talend?
Operationalizing Data Analytics
Operationalizing Data Analytics
Pivotal Big Data Roadshow
Pivotal Big Data Roadshow
Unleash the Power of Big Data and Machine Learning
Unleash the Power of Big Data and Machine Learning
Metadata Mastery: A Big Step for BI Modernization
Metadata Mastery: A Big Step for BI Modernization
Achieving Agility and Scale for Your Data Lake - Talend
Achieving Agility and Scale for Your Data Lake - Talend
Embracing Cloud Agility to Maximize Flexibility & Performance
Embracing Cloud Agility to Maximize Flexibility & Performance
Embedded-ml(ai)applications - Bjoern Staender
Embedded-ml(ai)applications - Bjoern Staender
Comcast Labs Connect - PHLAI Conference Philadelphia 2018
Comcast Labs Connect - PHLAI Conference Philadelphia 2018
7 Habits for Big Data in Production - keynote Big Data London Nov 2018
7 Habits for Big Data in Production - keynote Big Data London Nov 2018
Big Data Maturity Scorecard
Big Data Maturity Scorecard
Making the Case for Hadoop in a Large Enterprise-British Airways
Making the Case for Hadoop in a Large Enterprise-British Airways
Semelhante a Fri benghiat gil-odsc-data-kitchen-data science to dataops
seven steps to dataops @ dataops.rocks conference Oct 2019
seven steps to dataops @ dataops.rocks conference Oct 2019
DataKitchen
Washington DC DataOps Meetup -- Nov 2019
Washington DC DataOps Meetup -- Nov 2019
DataKitchen
Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You!
DataKitchen
Enabling Agility Through DevOps
Enabling Agility Through DevOps
Leland Newsom CSP-SM, SPC5, SDP
Bdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchen
Christopher Bergh
From zero to one - How we evolved our test automation processes and mindset i...
From zero to one - How we evolved our test automation processes and mindset i...
Jen-Chieh Ko
Data science tools of the trade
Data science tools of the trade
Fangda Wang
ENT206 Product Development in the Cloud
ENT206 Product Development in the Cloud
Amazon Web Services
Agile-plus-DevOps Testing for Packaged Applications
Agile-plus-DevOps Testing for Packaged Applications
Worksoft
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
SnapLogic
DataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data Architecture
DATAVERSITY
Building a Marketing Data Warehouse from Scratch - SMX Advanced 202
Building a Marketing Data Warehouse from Scratch - SMX Advanced 202
Christopher Gutknecht
Product Development in the Cloud
Product Development in the Cloud
Amazon Web Services
Cloud-native Enterprise Data Science Teams
Cloud-native Enterprise Data Science Teams
Boston Consulting Group
Transforming Product Development in the Cloud (ENT306) - AWS re:Invent 2018
Transforming Product Development in the Cloud (ENT306) - AWS re:Invent 2018
Amazon Web Services
Transforming Product Development - Transformation Day Montreal 2018
Transforming Product Development - Transformation Day Montreal 2018
Amazon Web Services
Product Development in the Cloud - ENT206 - Chicago AWS Summit
Product Development in the Cloud - ENT206 - Chicago AWS Summit
Amazon Web Services
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
Revolution Analytics
Business Process and Technology Evolution - Product Creation
Business Process and Technology Evolution - Product Creation
Vikram Singla FCILT
PayPal Notebooks at Jupytercon 2018
PayPal Notebooks at Jupytercon 2018
Romit Mehta
Semelhante a Fri benghiat gil-odsc-data-kitchen-data science to dataops
(20)
seven steps to dataops @ dataops.rocks conference Oct 2019
seven steps to dataops @ dataops.rocks conference Oct 2019
Washington DC DataOps Meetup -- Nov 2019
Washington DC DataOps Meetup -- Nov 2019
Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You!
Enabling Agility Through DevOps
Enabling Agility Through DevOps
Bdf16 big-data-warehouse-case-study-data kitchen
Bdf16 big-data-warehouse-case-study-data kitchen
From zero to one - How we evolved our test automation processes and mindset i...
From zero to one - How we evolved our test automation processes and mindset i...
Data science tools of the trade
Data science tools of the trade
ENT206 Product Development in the Cloud
ENT206 Product Development in the Cloud
Agile-plus-DevOps Testing for Packaged Applications
Agile-plus-DevOps Testing for Packaged Applications
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
DataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data Architecture
Building a Marketing Data Warehouse from Scratch - SMX Advanced 202
Building a Marketing Data Warehouse from Scratch - SMX Advanced 202
Product Development in the Cloud
Product Development in the Cloud
Cloud-native Enterprise Data Science Teams
Cloud-native Enterprise Data Science Teams
Transforming Product Development in the Cloud (ENT306) - AWS re:Invent 2018
Transforming Product Development in the Cloud (ENT306) - AWS re:Invent 2018
Transforming Product Development - Transformation Day Montreal 2018
Transforming Product Development - Transformation Day Montreal 2018
Product Development in the Cloud - ENT206 - Chicago AWS Summit
Product Development in the Cloud - ENT206 - Chicago AWS Summit
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
Business Process and Technology Evolution - Product Creation
Business Process and Technology Evolution - Product Creation
PayPal Notebooks at Jupytercon 2018
PayPal Notebooks at Jupytercon 2018
Último
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
VICTOR MAESTRE RAMIREZ
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
17djon017
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
Seán Kennedy
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
Seán Kennedy
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
dajasot375
Learn How Data Science Changes Our World
Learn How Data Science Changes Our World
Eduminds Learning
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
voginip
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
Amil Baba Dawood bangali
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
dataanalyticsqueen03
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
vhwb25kk
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
chwongval
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
jennyeacort
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
yuu sss
How we prevented account sharing with MFA
How we prevented account sharing with MFA
Andrei Kaleshka
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
Pramod Kumar Srivastava
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
GQ Research
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
Cathrine Wilhelmsen
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
Human37
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
John Sterrett
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Boston Institute of Analytics
Último
(20)
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Learn How Data Science Changes Our World
Learn How Data Science Changes Our World
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
How we prevented account sharing with MFA
How we prevented account sharing with MFA
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Fri benghiat gil-odsc-data-kitchen-data science to dataops
1.
Copyright © 2017
by DataKitchen, Inc. All Rights Reserved.
2.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Agenda How to go from Data Science to Data Operations (#DataOps) Introductions Data Science Challenges What is DataOps? Seven Shocking Steps to DataOps Pulling it together
3.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Keep this question in mind What can I take from this session and use on Monday?
4.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. For slides contact gil@DataKitchen.io
5.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Speaker – co-Founder of DataKitchen Gil Benghiat, Founder, VP of Products gil@datakitchen.io A series of data centric software projects 🎓 Applied Math / Biology @ Brown 🎓 Computer Science @ Stanford 🏢 Bell Labs, Sybase, PhaseForward, LeapFrogRx
6.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. DataKitchen DataOps Software Platform Main Features 1. Orchestrate complex data pipelines 2. Deploy new ideas to production 3. Automate tests and monitor quality Enables 1. Fast delivery of analytics 2. High data quality 3. Using your favorite tools and data stores
7.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Agenda Introductions • Data Science Challenges What is DataOps? Seven Shocking Steps to DataOps Pulling it together
8.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Figure 1: Only a small fraction of real-world ML systems is composed of the ML code, as shown by the small black box in the middle. The required surrounding infrastructure is vast and complex. Google Advances in Neural Information Processing Systems 28 (NIPS 2015)
9.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Business Need Prep Data Feature Extraction Build Model Evaluate Model Deploy Model Monitor Model Iterate, Test and Improve Model building
10.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Agenda Introductions Data Science Challenges • What is DataOps? Seven Shocking Steps to DataOps Pulling it together
11.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Genesis of DataOps People, Process, Organization Technical Environment = 7 steps
12.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Data Engineer Data Scientist Data Analyst Agile Development is a mindset: 1. Collaborate with your customers 2. Respond to change 3. Measure progress by working analytics 4. Release frequently (most important first) 5. Get feedback on your releases 6. Adjust your behavior to become more effective 4 Values 12 Principles Be Pragmatic Not Dogmatic DataOps: It Began With Agile Business Partner
13.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Focus on Value
14.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Agenda Introductions Data Science Challenges What is DataOps? • Seven Shocking Steps to DataOps Pulling it together
15.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Seven Steps to DataOps 1. Orchestrate Two Journeys 2. Add Tests 3. Use a Version Control System 4. Branch and Merge 5. Use Multiple Environments 6. Reuse & Containerize 7. Parameterize Your Processing
16.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Journey 1: Orchestrate data to customer value Analytic process are like manufacturing: materials (data) and production outputs (refined data, charts, graphs, models) Access: Python Code Transform: SQL Code, ETL Model: R Code Visualize: Tableau Workbook Report: Tableau Online ❶
17.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Journey 2: Speed ideas to production Analytic processes are like software development: deliverables continually move from development to production ❶ Data Engineers Data Scientists Data Analysts Diverse Team Diverse Tools Diverse Customers Business Customer Products & Systems
18.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Innovation and Value Pipeline Together Focus on both orchestration and deployment while automating & monitoring quality Don’t want break production when I deploy my changes Don’t want to learn about data quality issues from my customers ❶
19.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Add Tests Monitor quality Data Quality Monitoring: To ensure that during in the Value Pipeline, the data quality remains high. Code Quality Monitoring: Before promoting work, running new and old tests gives high confidence that the change did not break anything in the Innovation Pipeline ❷
20.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Automate Monitoring & Tests In Production Test Every Step And Every Tool in Your Value Pipeline Are your outputs consistent? And Save Test Results! Are data inputs free from issues? Is your business logic still correct? Access: Python Code Transform: SQL Code, ETL Model: R Code Visualize: Tableau Workbook Report: Tableau Online ❷
21.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Support Multiple Types Of Tests Testing Data Is Not Just Pass/Fail in Your Value Pipeline Support Test Types • Error – stop the line • Warning – investigate later • Info – list of changes Keep Test History • Statistical Process Control ❷
22.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Types of Tests ❷
23.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Example Tests Simple ❷
24.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. For the Innovation Pipeline Tests Are For Also Code: Keep Data Fixed Deploy Feature Run all tests here before promoting ❷
25.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Use a Version Control System At The End Of The Day, Analytic Work Is All Just Code Access: Python Code Transform: SQL Code, ETL Code Model: R Code Visualize: Tableau Workbook XML Report: Tableau Online Source Code Control ❸
26.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Branch & Merge Source Code Control Branching & Merging enables people to safely work on their own tasks ❹
27.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. ❹ Example branch and merge pattern Sprint 1 Sprint 2 f1 f2 f3 main / master / trunk f5
28.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Access: Python Code Transform: SQL Code, ETL Code Model: R Code Visualize: Tableau Workbook XML Report: Tableau Online Use Multiple Environments Analytic Environment Your Analytic Work Requires Coordinating Tools And Hardware ❺
29.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Use Multiple Environments Provide an Analytic Environment for each branch • Analysts and Data Scientists need a controlled environment for their experiments • Engineers need a place to develop outside of production • Update Production only after all tests are run! ❺
30.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Use Multiple Environments ❺ Provide an Analytic Environment for each branch • Analysts and Data Scientists need a controlled environment for their experiments • Engineers need a place to develop outside of production • Update Production only after all tests are run!
31.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Reuse & Containerize Containerize 1. Manage the environment for each model (e.g. Docker, VM, AMI) 2. Practice Environment Version Control make production and development areas identical Reuse 1. The code 2. Data ❻
32.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Parameterize Your Processing Think Of Your Pipeline Like A Big Function • Named sets of parameters will increase your velocity • With parameters, you can vary • Inputs • Outputs • Steps in the workflow • You can make a time machine ❼
33.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Agenda Introductions Data Science Challenges What is DataOps? Seven Shocking Steps to DataOps • Pulling it together
34.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Business Need Prep Data Feature Extraction Build Model Evaluate Model Deploy Model Monitor Model Iterate, Test and Improve Model building
35.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. The 7 Steps and Data Science Journeys Tests Version Control Branch and Merge Environments Reuse / Containerize Parameterize Business Need Agile Prep Data x x x x x x x Feature Extraction x x x x x x x Build Model x x x x x x x Evaluate Model x Deploy Model x x x x x x x Monitor Model x
36.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. Make a note to yourself What can I take from this session and use on Monday?
37.
Copyright © 2018
by DataKitchen, Inc. All Rights Reserved. For slides contact gil@DataKitchen.io Thank you for attending
Baixar agora