3. Offering
Platform for emerging data scientists to graphically build and deploy experiments
Key Value Props
• Rapid experiment composition
• > 100 easily configured modules for data prep, training, evaluation
• Extensibility through R & Python
• Serverless training and deployment
Numbers
• Hundreds of thousands of deployed models serving billions of requests
Azure Machine Learning Studio
5. Infrastructure Can Get in Your Way
Clusters
• Provision GPUs
• Install drivers and software
• Interactive use
Scheduling
• Queue work
• Prioritize jobs
• Start MPI
• Monitor
• Handle failures
Data
• Scale access to training data
• Output logs & models
• Secure & compliant
Cost
• Scale up and down
• Share reserved instances
• Low priority
Workflow
• Choose efficient hardware
• Tooling integration
• Laptop to cloud
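The Scheduling bullets above (queue work, prioritize jobs, handle failures) can be illustrated with a small in-process sketch. This is not how any Azure service schedules work; `Job`, `Scheduler`, and the retry policy are hypothetical names and choices made for this example only.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass(order=True)
class Job:
    """Hypothetical job record; ordering uses (priority, seq) only."""
    priority: int                       # lower number runs first
    seq: int                            # tie-breaker: submission order
    name: str = field(compare=False)
    run: Callable[[], None] = field(compare=False)
    retries_left: int = field(default=2, compare=False)


class Scheduler:
    """Toy priority queue with bounded retries (illustrative only)."""

    def __init__(self) -> None:
        self._queue: List[Job] = []
        self._seq = 0
        self.log: List[Tuple[str, str]] = []   # (job name, status)

    def submit(self, name: str, run: Callable[[], None],
               priority: int = 10, retries: int = 2) -> None:
        heapq.heappush(self._queue, Job(priority, self._seq, name, run, retries))
        self._seq += 1

    def run_all(self) -> None:
        while self._queue:
            job = heapq.heappop(self._queue)
            try:
                job.run()
                self.log.append((job.name, "succeeded"))
            except Exception:
                if job.retries_left > 0:
                    # Re-queue the failed job at the same priority.
                    job.retries_left -= 1
                    job.seq = self._seq
                    self._seq += 1
                    heapq.heappush(self._queue, job)
                    self.log.append((job.name, "retrying"))
                else:
                    self.log.append((job.name, "failed"))


# Usage: a job that fails once (simulated transient failure), then succeeds.
attempts = {"n": 0}


def flaky_training() -> None:
    attempts["n"] += 1
    if attempts["n"] < 2:
        raise RuntimeError("simulated transient node failure")


sched = Scheduler()
sched.submit("evaluate", lambda: None, priority=20)
sched.submit("train", flaky_training, priority=1)
sched.run_all()
print(sched.log)
```

The higher-priority "train" job runs first, is retried after its simulated failure, and only then does "evaluate" run. A real cluster scheduler also has to handle monitoring, MPI startup, and node provisioning, which this sketch leaves out.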
6. Azure Batch AI Service
• Managed service
• Supports role-based access control
• Run any toolkit (CNTK, TensorFlow, Caffe/Caffe2, Chainer, Keras, …)
• Run experiments in parallel
• Run in containers or directly on VMs
• Supports various shared file systems
• Load-based automatic scaling
• Pay only for storage and compute; the service itself is free
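As a rough illustration of load-based automatic scaling, the sketch below picks a node count from the current job load and clamps it to a configured range. The function name, its parameters, and the rule itself are assumptions for this example; the slide does not describe the service's actual scaling policy.

```python
import math


def target_node_count(pending_jobs: int, running_jobs: int,
                      jobs_per_node: int = 1,
                      min_nodes: int = 0, max_nodes: int = 10) -> int:
    """Hypothetical scaling rule: enough nodes for all queued and running
    jobs, clamped to [min_nodes, max_nodes]."""
    if jobs_per_node < 1:
        raise ValueError("jobs_per_node must be at least 1")
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")
    demand = pending_jobs + running_jobs
    wanted = math.ceil(demand / jobs_per_node)
    return max(min_nodes, min(max_nodes, wanted))


# Usage: 7 queued + 3 running jobs, 2 jobs per node, cluster capped at 4 nodes.
print(target_node_count(7, 3, jobs_per_node=2, max_nodes=4))
# An idle cluster with min_nodes=0 scales down to zero nodes.
print(target_node_count(0, 0))
```

Allowing `min_nodes=0` matches the cost point on the slide: with no work queued, no compute is held.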
8. IaaS and PaaS Big Data Analytics
(Diagram axis: CONTROL → EASE OF USE, Reduced Administration)
BIG DATA STORAGE
• Azure Data Lake Store
• Azure Storage
BIG DATA ANALYTICS
IaaS Clusters
• Azure Marketplace (HDP | CDH | MapR): any Hadoop technology, any distribution
Managed Clusters
• Azure HDInsight: workload optimized, managed clusters
• Azure Databricks: frictionless & optimized Spark clusters
Big Data as-a-service
• Azure Data Lake Analytics: data engineering in a job-as-a-service model
14. The AI Development lifecycle
(Diagram: data sources flow through four stages into apps and insights)
Data sources: Social, LOB, Graph, IoT, Image, CRM
• INGEST: Data orchestration and monitoring
• STORE: Data lake and storage
• PREP & TRAIN: Hadoop/Spark/SQL and ML
• MODEL & SERVE: IoT, Azure Machine Learning
Output: Apps + insights
15. Experiment Anywhere
AZURE ML EXPERIMENTATION
Compute targets
• Local machine
• Scale up to DSVM
• Scale out with Spark on HDInsight
• Azure Batch AI (Coming Soon)
• ML Server (Coming Soon)
Tools
• Command line tools
• IDEs
• Notebooks in Workbench
• VS Code Tools for AI
21. R Server Overview
• Extends open source R to scale to big data
• Combines open source and commercial innovations
• Gives customers the support they trust
• Microsoft innovations:
o RevoScaleR: parallelized, distributed algorithms
o Microsoft Machine Learning: fast and deep learning, pretrained models
o Custom parallel frameworks
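The idea behind "parallelized, distributed algorithms" that scale to big data can be sketched generically: summarize each chunk of data independently, then merge the partial summaries, so the full dataset never has to fit in memory. The code below is plain Python, not the RevoScaleR API; `Summary`, `summarize_chunk`, `merge`, and `summarize_in_chunks` are hypothetical names. The merge step uses the standard pairwise update for mean and sum of squared deviations.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass
class Summary:
    """Partial statistics for one or more chunks."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0   # sum of squared deviations from the mean

    @property
    def variance(self) -> float:
        """Sample variance; NaN when fewer than two values were seen."""
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


def summarize_chunk(chunk: Sequence[float]) -> Summary:
    """Compute statistics for a single chunk that fits in memory."""
    n = len(chunk)
    if n == 0:
        return Summary()
    mean = sum(chunk) / n
    m2 = sum((x - mean) ** 2 for x in chunk)
    return Summary(n, mean, m2)


def merge(a: Summary, b: Summary) -> Summary:
    """Combine two partial summaries without revisiting the raw data."""
    n = a.n + b.n
    if n == 0:
        return Summary()
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Summary(n, mean, m2)


def summarize_in_chunks(chunks: Iterable[Sequence[float]]) -> Summary:
    """Fold chunk summaries together; chunks could come from disk or workers."""
    total = Summary()
    for chunk in chunks:
        total = merge(total, summarize_chunk(chunk))
    return total


# Usage: three chunks standing in for blocks of a larger file.
result = summarize_in_chunks([[1, 2, 3], [4, 5, 6, 7], [8, 9, 10]])
print(result.n, result.mean, result.variance)
```

Because `merge` does not depend on the order in which partial summaries are combined, the per-chunk work could in principle run on separate workers and be combined afterwards.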
22. ML Services Version 9.2 at a glance
(Diagram panels: Platforms & Data, Tools, Languages, Algorithms, Operationalization)
Data Sources
• .csv
• Microsoft .XDF
• ODBC/JDBC
Tools
• Rattle
• mrsdeploy
• Visualization tool integration
Algorithms
• Distributed parallelized algorithms:
o RevoScaleR and RevoScalePy libraries
o MicrosoftML library
o Custom parallelization frameworks
• Open source R algorithms & visualizations:
o CRAN
o Bioconductor
• Plus:
o Deep learning
o Pretrained models
o Prebuilt featurizers
Operationalization
• RESTful API deployment
• Real-time scoring
• In-database deployment
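To make "RESTful API deployment" and "real-time scoring" concrete, here is a minimal sketch of what a scoring handler does with one request: parse a JSON body, run the model, and return a JSON response with a status code. It does not use mrsdeploy or any ML Services API. The linear model, its weights, the feature names `x1` and `x2`, and the function names are all made up for illustration.

```python
import json
from typing import Dict, Tuple

# Hypothetical pretrained linear model (weights invented for this example).
WEIGHTS: Dict[str, float] = {"x1": 2.0, "x2": -1.0}
BIAS = 0.5


def predict(features: Dict[str, float]) -> float:
    """Score one record; raises KeyError if a required feature is missing."""
    return BIAS + sum(w * float(features[name]) for name, w in WEIGHTS.items())


def handle_score_request(body: str) -> Tuple[int, str]:
    """Return (HTTP status, JSON body) for a single scoring request."""
    try:
        payload = json.loads(body)
        score = predict(payload["features"])
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing fields, or non-numeric values.
        return 400, json.dumps({"error": "bad request"})
    return 200, json.dumps({"score": score})


# Usage: one valid request and one malformed request.
print(handle_score_request('{"features": {"x1": 3, "x2": 1}}'))
print(handle_score_request("not json"))
```

A web framework or a deployment tool would wrap a handler like this behind an HTTP endpoint; only the request-to-score step is shown here.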
25. TDSP objective
Integrate DevOps with data science workflows to improve collaboration,
quality, robustness and efficiency in data science projects
o Infrastructure as Code (IaC)
o Building
o Testing
o CI / CD
o …
o App performance monitoring
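The Testing and CI / CD items above can be illustrated with a data-preparation step that ships with its own automated tests. This is a generic sketch, not part of TDSP itself; `fill_missing_with_median` and the test names are hypothetical, and a CI job could run the tests with a runner such as pytest on every commit.

```python
from typing import List, Optional


def fill_missing_with_median(values: List[Optional[float]]) -> List[float]:
    """Replace None entries with the median of the values that are present."""
    present = sorted(v for v in values if v is not None)
    if not present:
        raise ValueError("cannot impute: no values present")
    mid = len(present) // 2
    if len(present) % 2:
        median = present[mid]
    else:
        median = (present[mid - 1] + present[mid]) / 2
    return [median if v is None else v for v in values]


# Tests written in the plain-function style that pytest discovers by name.
def test_fills_gap_with_median() -> None:
    assert fill_missing_with_median([1, None, 3]) == [1, 2.0, 3]


def test_keeps_original_order() -> None:
    assert fill_missing_with_median([None, 5, 1, 9]) == [5, 5, 1, 9]


def test_rejects_all_missing() -> None:
    try:
        fill_missing_with_median([None, None])
    except ValueError:
        return
    raise AssertionError("expected ValueError for all-missing input")
```

Keeping checks like these next to the data-prep code lets a build pipeline catch a broken transformation before a model is retrained on bad inputs.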