Learn more at datascience.com | Empower Your Data Scientists
November 7, 2017
Best Practices:
Implementing DataOps with a Data Science Platform
Agenda
• Evolving data science landscape
• Data growth and impacts
• Defining DataOps
• DataOps vs. DevOps
• Best practices in applying DataOps
• Q&A

Crystal Valentine, VP Technology Strategy, MapR (cvalentine@mapr.com)
William Merchan, CSO, DataScience.com (william@datascience.com)
EVOLVING LANDSCAPE
DOING DATA SCIENCE HAS GROWN IN COMPLEXITY
[Diagram: the fragmented data science toolchain. Labels recovered from the slide:]
• Environments: Windows, OSX, laptops, remote, cloud (AWS, Google, Azure), on prem, security
• Notebooks: Jupyter, RStudio, Zeppelin
• Languages: Python, Scala, R, SAS
• Tools, libraries
• Sharing & collaboration (?): results, models, code; chat, email, .ppt, shared drives
• Deployments: monitoring, support, logging (Style A, Style B); tools: PMML, Flask
• Lineage and repeatability (?)
• Data: data lake, database, data inventory; data tools: Spark, Pig, Hive, ETL, cron
• Users
DATA SCIENCE TRENDS: GROWING TEAMS & OPEN SOURCE AS THE NEW STANDARD
2017: 2,350,000 data science and analytics job listings*
*Source: Kaggle 2017 data science trend report, Burning Glass Quant Crunch Report, Microsoft Revolutions Blog 2017
DATA SCIENCE PLATFORMS ARE AN EMERGING CATEGORY BRINGING TOGETHER ESSENTIAL ELEMENTS FOR DATA SCIENCE SCALING
CLOUD PROVIDERS
ETL & DATA ENGINEERING
VERTICAL APPLICATIONS
BI & VISUALIZATION TOOLS
SECURITY
INFRASTRUCTURE
LIBRARIES / TOOLS
DATA PLATFORMS
DATA SCIENCE PLATFORMS
DATA GROWTH
DATA IS THE LEVERAGE POINT FOR COMPETITIVE ADVANTAGE
DATA VOLUMES GROWING FASTER THAN MOORE’S LAW
Source: McKinsey Global Institute
• 1987: 3 exabytes of data
• 2010: 1.2 zettabytes of data
• 2020: 44 zettabytes of data
Data diversity: emails, call detail records, clickstream, CSV data, documents, PDF, billing data, metadata, JSON, network data, mobile data, XML, product catalog, medical records, text files, video, text messages, merchant listings, sensor data, server logs, set-top box, social media, audio
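As a rough check on the headline, the volumes on this slide can be turned into implied annual growth rates and set beside a doubling every two years, a common shorthand for Moore's law. This is a minimal sketch under stated assumptions: the pairing of years to volumes (1987: 3 exabytes, 2010: 1.2 zettabytes, 2020: 44 zettabytes) is read off the slide, while the decimal conversion of 1 zettabyte = 1,000 exabytes, the two-year doubling period, and the function name are illustrative choices that do not come from the presentation.

```python
def annual_growth_rate(start_value, end_value, years):
    """Compound annual growth rate between two values, as a fraction."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

EXABYTES_PER_ZETTABYTE = 1000  # decimal (SI) units, assumed

# Data volumes in exabytes, as read from the slide.
volumes = {
    1987: 3,
    2010: 1.2 * EXABYTES_PER_ZETTABYTE,
    2020: 44 * EXABYTES_PER_ZETTABYTE,
}

# Assumed shorthand for Moore's law: a doubling every two years.
moore_rate = annual_growth_rate(1, 2, 2)

rate_1987_2010 = annual_growth_rate(volumes[1987], volumes[2010], 2010 - 1987)
rate_2010_2020 = annual_growth_rate(volumes[2010], volumes[2020], 2020 - 2010)

print(f"1987-2010: {rate_1987_2010:.1%} per year")
print(f"2010-2020: {rate_2010_2020:.1%} per year")
print(f"Doubling every two years: {moore_rate:.1%} per year")
```

The comparison is sensitive to the interval chosen and to the doubling period assumed for Moore's law, so it is best read as an order-of-magnitude check.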
THE VALUE OF DATA
[Two charts plotting $ against data size, each showing value, cost, and net value curves with a marked OPT point: the Legacy Value Model and the Next-Gen Value Model]
WE HAVE PASSED AN INFLECTION POINT
INVESTMENT IN NEXT-GEN VS. LEGACY TECHNOLOGIES FOR DATA
[Chart: legacy technology investment and next-gen technology investment in $ (millions), shown against total $ growth of the IT market]
90% of data is on next-gen technology by 2020
Next-gen consists of cloud, big data, software and hardware related expenses
Source: IDC, Gartner; Analysis & Estimates: MapR
DATAOPS
DATAOPS: AN AGILE METHODOLOGY FOR DATA-DRIVEN ORGANIZATIONS
Axioms:
1. Data is central to disruptive enterprise applications
a. Lightweight, stateless functions do not represent the majority of workloads
2. Data science and machine learning are an important paradigm
a. Scientists become active users -- no longer just application developers
b. Iterative workflow with different data usage patterns
3. Data volumes continue to grow
4. Moving data is a performance bottleneck
DataOps Goals:
• Continuous model deployment
• Promote repeatability
• Promote productivity -- focus on core competencies
• Promote agility
• Promote self-service
COMPARING DEVOPS AND DATAOPS: WHAT’S DIFFERENT OR THE SAME?
[Diagram comparing the roles covered by DevOps and DataOps. Labels recovered from the slide:]
• Developers & Architects
• Data Engineers
• Data Scientists
• Security & Governance
• Operations
CONTINUOUS MODEL DEPLOYMENT
Data Engineering
Model Development
Model Management
Model Deployment
Model Monitoring & Rescoring
Key Building Blocks for Agility:
1) Unified data platform
2) Data governance
3) Self-service data and compute access
4) Multitenancy and resource management
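One way to picture continuous model deployment is as an ordered list of stages that wraps around, so that monitoring and rescoring feeds back into data engineering. The sketch below is a hypothetical illustration only: the stage names come from the slide, while the `Stage` enum, the `next_stage` and `one_cycle` helpers, and the wrap-around itself are assumptions rather than anything the presenters specify.

```python
from enum import Enum


class Stage(Enum):
    """Stages named on the slide, in the order they appear."""
    DATA_ENGINEERING = "Data Engineering"
    MODEL_DEVELOPMENT = "Model Development"
    MODEL_MANAGEMENT = "Model Management"
    MODEL_DEPLOYMENT = "Model Deployment"
    MODEL_MONITORING_AND_RESCORING = "Model Monitoring & Rescoring"


def next_stage(stage):
    """Return the stage that follows `stage`, wrapping around at the end.

    The wrap-around (monitoring feeding back into data engineering) is an
    assumption made to illustrate a continuous cycle.
    """
    stages = list(Stage)
    return stages[(stages.index(stage) + 1) % len(stages)]


def one_cycle(start=Stage.DATA_ENGINEERING):
    """List the stages visited in one full pass, beginning at `start`."""
    visited = [start]
    while next_stage(visited[-1]) is not start:
        visited.append(next_stage(visited[-1]))
    return visited


for stage in one_cycle():
    print(stage.value)
```

In practice each stage would carry its own tooling and hand-off checks; the point here is only that the last stage leads back to the first instead of ending the process.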
BEST PRACTICES
INDUSTRY LEADING DATA SCIENCE ORGANIZATIONS ADOPTING DATAOPS
• Versioning
• Platform approach
• Team makeup and organization
• Self-service
DataOps Platform Checklist
• Unified platform for all data -- historical and real-time production
• Multitenancy and resource utilization
• Single security and access model for governance and self-service access
• Enterprise-grade for mission-critical applications and open source tools
• Run compute on the data platform -- leverage data locality
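The data locality item, like the earlier axiom that moving data is a performance bottleneck, can be made concrete with back-of-the-envelope arithmetic. The sketch below estimates how long it takes just to copy a dataset over a network link before any computation starts. The dataset size (1 petabyte), the link speed (10 Gbit/s), and the assumption of a fully utilized link with no protocol overhead are illustrative choices, not figures from the presentation.

```python
def transfer_time_seconds(size_bytes, link_bits_per_second):
    """Idealized time to move `size_bytes` over a link, ignoring overhead."""
    return size_bytes * 8 / link_bits_per_second


PETABYTE = 10**15          # bytes, decimal units (assumed dataset size)
TEN_GIGABIT = 10 * 10**9   # bits per second (assumed link speed)

seconds = transfer_time_seconds(PETABYTE, TEN_GIGABIT)
days = seconds / 86_400    # seconds in a day

print(f"Moving 1 PB over 10 Gbit/s takes about {days:.1f} days")
```

Under these assumptions the copy alone takes more than a week, which is the cost that running compute on the data platform is meant to avoid.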
Thank you!
NEW DATAOPS APPROACH FOR DATA SCIENCE TEAMS
DataOps
