SlideShare uma empresa Scribd logo
1 de 52
Unsupervised Aspect
Based Sentiment
Analysis at Scale.
Disclaimer: My Opinions Are My Own
• Blog: https://medium.com/@aribornstein
• Twitter: https://twitter.com/pythiccoder
• GitHub: https://github.com/aribornstein
Trends In Language Understanding
Language
Understanding
Ambient
Computing
Automation and
Solution Scale
Unlocking
Unstructured Data
Demo Text Analytics
What is Azure Machine Learning Service?
Set of Azure
Cloud Services
Python
SDK
Prepare Data
Build Models
Train Models
Manage Models
Track Experiments
Deploy Models
That enables
you to:
Azure ML Service
Key Artifacts
Workspace
Advanced Azure ML Services
NLP-Architect Aspect Based Sentiment Analysis
Azure Machine Learning
Typical E2E Process
…
Prepare Experiment Deploy
Orchestrate
Azure Machine Learning
Typical E2E Process
…
Prepare Experiment Deploy
Orchestrate
Managing Tailwind Traders ML Challenges
Challenges of distributed training
 Dependencies and
Containers
Provision clusters of
VMs
Schedule jobs
Distribute data
Gather results
Handle failures
Scale resources
Secure Access
Features of Distributed Training
distributed training with
Horovod
/Docs alert
Get Started by setting up your
Azure ML Environment
https://aka.ms/amlsdk
My Computer
Data Store
Azure ML
Workspace
Compute Target
Docker Image
How Azure ML Works
The 8 Azure ML Train Steps
from azureml.core import Workspace
ws = Workspace.create(name='myworkspace',
subscription_id='<azure-subscription-id>',
resource_group='myresourcegroup',
create_resource_group=True,
location='eastus2' # or other supported Azure region
)
# see workspace details
ws.get_details()
Step 2 – Create an Experiment
experiment_name = ‘my-experiment-1'
from azureml.core import Experiment
exp = Experiment(workspace=ws, name=experiment_name)
Step 1 – Create a workspace
Step 3 – Create remote compute target
# choose a name for your cluster, specify min and max nodes
compute_name = os.environ.get("BATCHAI_CLUSTER_NAME", "cpucluster")
compute_min_nodes = os.environ.get("BATCHAI_CLUSTER_MIN_NODES", 0)
compute_max_nodes = os.environ.get("BATCHAI_CLUSTER_MAX_NODES", 4)
# This example uses CPU VM. For using GPU VM, set SKU to STANDARD_NC6
vm_size = os.environ.get("BATCHAI_CLUSTER_SKU", "STANDARD_D2_V2")
provisioning_config = AmlCompute.provisioning_configuration(
vm_size = vm_size,
min_nodes = compute_min_nodes,
max_nodes = compute_max_nodes)
# create the cluster
print(‘ creating a new compute target... ')
compute_target = ComputeTarget.create(ws, compute_name, provisioning_config)
# You can poll for a minimum number of nodes and for a specific timeout.
# if no min node count is provided it will use the scale settings for the cluster
compute_target.wait_for_completion(show_output=True,
min_node_count=None, timeout_in_minutes=20)
# note that while loading, we are shrinking the intensity values (X) from 0-255 to 0-1 so that the
model converge faster.
X_train = load_data('./data/train-images.gz', False) / 255.0
y_train = load_data('./data/train-labels.gz', True).reshape(-1)
X_test = load_data('./data/test-images.gz', False) / 255.0
y_test = load_data('./data/test-labels.gz', True).reshape(-1)
First load the compressed files into numpy arrays. Note the ‘load_data’ is a custom function that simply parses the
compressed files into numpy arrays.
Now make the data accessible remotely by uploading that data from your local machine into Azure so it can be
accessed for remote training. The files are uploaded into a directory named mnist at the root of the datastore.
ds = ws.get_default_datastore()
print(ds.datastore_type, ds.account_name, ds.container_name)
ds.upload(src_dir='./data', target_path='mnist', overwrite=True, show_progress=True)
We now have everything you need to start training a model.
Step 4 – Upload data to the cloud
%%time from sklearn.linear_model import LogisticRegression
clf = LogisticRegression()
clf.fit(X_train, y_train)
# Next, make predictions using the test set and calculate the accuracy
y_hat = clf.predict(X_test)
print(np.average(y_hat == y_test))
You should see the local model accuracy displayed. [It should be a number like 0.915]
Train a simple logistic regression model using scikit-learn locally. This should take a minute or two.
Step 5 – Train a local model
To submit a training job to a remote you have to perform the following tasks:
• 6.1: Create a directory
• 6.2: Create a training script
• 6.3: Create an estimator object
• 6.4: Submit the job
Step 6.1 – Create a directory
Create a directory to deliver the required code from your computer to the remote resource.
import os
script_folder = './sklearn-mnist' os.makedirs(script_folder, exist_ok=True)
Step 6 – Train model on remote cluster
%%writefile $script_folder/train.py
# load train and test set into numpy arrays
# Note: we scale the pixel intensity values to 0-1 (by dividing it with 255.0) so the model can
# converge faster.
# ‘data_folder’ variable holds the location of the data files (from datastore)
Reg = 0.8 # regularization rate of the logistic regression model.
X_train = load_data(os.path.join(data_folder, 'train-images.gz'), False) / 255.0
X_test = load_data(os.path.join(data_folder, 'test-images.gz'), False) / 255.0
y_train = load_data(os.path.join(data_folder, 'train-labels.gz'), True).reshape(-1)
y_test = load_data(os.path.join(data_folder, 'test-labels.gz'), True).reshape(-1)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape, sep = 'n’)
# get hold of the current run
run = Run.get_context()
#Train a logistic regression model with regularizaion rate of’ ‘reg’
clf = LogisticRegression(C=1.0/reg, random_state=42)
clf.fit(X_train, y_train)
Step 6.2 – Create a Training Script (1/2)
print('Predict the test set’)
y_hat = clf.predict(X_test)
# calculate accuracy on the prediction
acc = np.average(y_hat == y_test)
print('Accuracy is', acc)
run.log('regularization rate', np.float(args.reg))
run.log('accuracy', np.float(acc)) os.makedirs('outputs', exist_ok=True)
# The training script saves the model into a directory named ‘outputs’. Note files saved in the
# outputs folder are automatically uploaded into experiment record. Anything written in this
# directory is automatically uploaded into the workspace.
joblib.dump(value=clf, filename='outputs/sklearn_mnist_model.pkl')
Step 6.2 – Create a Training Script (2/2)
An estimator object is used to submit the run.
from azureml.train.estimator import Estimator
script_params = { '--data-folder': ds.as_mount(), '--regularization': 0.8 }
est = Estimator(source_directory=script_folder,
script_params=script_params,
compute_target=compute_target,
entry_script='train.py’,
conda_packages=['scikit-learn'])
Step 6.4 – Submit the job to the cluster for training
run = exp.submit(config=est)
run
Step 6.3 – Create an Estimator
Step 7 – Monitor a run
You can watch the progress of the run with a Jupyter widget. The widget is asynchronous and provides live
updates every 10-15 seconds until the job completes.
from azureml.widgets import RunDetails
RunDetails(run).show()
Here is a still snapshot of the widget shown at the end of training:
Step 8 – See the results
As model training and monitoring happen in the background. Wait until the model has completed training before
running more code. Use wait_for_completion to show when the model training is complete
run.wait_for_completion(show_output=False)
# now there is a trained model on the remote cluster
print(run.get_metrics())
{'regularization rate': 0.8, 'accuracy':
0.9204}
Step 9 – Register the model
This wrote the file ‘outputs/sklearn_mnist_model.pkl’ in a directory named ‘outputs’ in the VM of the cluster where
the job is executed.
• outputs is a special directory in that all content in this directory is automatically uploaded to your workspace.
• This content appears in the run record in the experiment under your workspace.
• Hence, the model file is now also available in your workspace.
joblib.dump(value=clf, filename='outputs/sklearn_mnist_model.pkl')
Recall that the last step in the training script is:
# register the model in the workspace
model = run.register_model (
model_name='sklearn_mnist’,
model_path='outputs/sklearn_mnist_model.pkl’)
The model is now available to query, examine, or deploy
Step 9 – Deploy the Model
Step 9.1 – Create the scoring script
Create the scoring script, called score.py, used by the web service call to show how to use the model.
It requires two functions – init() and run (input data)
from azureml.core.model import Model
def init():
global model
# retreive the path to the model file using the model name
model_path = Model.get_model_path('sklearn_mnist’)
model = joblib.load(model_path)
def run(raw_data):
data = np.array(json.loads(raw_data)['data’])
# make prediction
y_hat = model.predict(data) return json.dumps(y_hat.tolist())
Step 9.2 – Create environment file
Create an environment file, called myenv.yml, that specifies all of the script's package dependencies. This file is
used to ensure that all of those dependencies are installed in the Docker image. This example needs scikit-learn
and azureml-sdk.
from azureml.core.conda_dependencies import CondaDependencies
myenv = CondaDependencies()
myenv.add_conda_package("scikit-learn")
with open("myenv.yml","w") as f:
f.write(myenv.serialize_to_string())
Step 9.3 – Create configuration file
Create a deployment configuration file and specify the number of CPUs and gigabyte of RAM needed for the
ACI container. Here we will use the defaults (1 core and 1 gigabyte of RAM)
from azureml.core.webservice import AciWebservice
aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1,
tags={"data": "MNIST", "method" : "sklearn"},
description='Predict MNIST with sklearn')
Step 9.4 – Deploy the model to ACI
%%time
from azureml.core.webservice import Webservice
from azureml.core.image import ContainerImage
# configure the image
image_config = ContainerImage.image_configuration(
execution_script ="score.py",
runtime ="python",
conda_file ="myenv.yml")
service = Webservice.deploy_from_model(workspace=ws, name='sklearn-mnist-svc’,
deployment_config=aciconfig, models=[model],
image_config=image_config)
service.wait_for_deployment(show_output=True)
Step 10 – Test the deployed model using the HTTP end point
Test the deployed model by sending images to be classified to the HTTP endpoint
import requests
import json
# send a random row from the test set to score
random_index = np.random.randint(0, len(X_test)-1)
input_data = "{"data": [" + str(list(X_test[random_index])) + "]}"
headers = {'Content-Type':'application/json’}
resp = requests.post(service.scoring_uri, input_data, headers=headers)
print("POST to url", service.scoring_uri)
#print("input data:", input_data)
print("label:", y_test[random_index])
print("prediction:", resp.text)
Demo: Operationalizing Open
Source with Azure ML and NLP
Architect
Distributed Hyperparameter Tuning
What are Hyperparameters?
{
“learning_rate”: uniform(0, 1),
“num_layers”: choice(2, 4, 8)
…
}
Config1= {“learning_rate”: 0.2,
“num_layers”: 2, …}
Config2= {“learning_rate”: 0.5,
“num_layers”: 4, …}
Config3= {“learning_rate”: 0.9,
“num_layers”: 8, …}
…
Typical ‘manual’ approach to hyperparameter tuning
Dataset
Training
Algorithm 1
Hyperparameter
Values – config 1
Model 1
Hyperparameter
Values – config 2
Model 2
Hyperparameter
Values – config 3
Model 3
Model
Training
InfrastructureTraining
Algorithm 2
Hyperparameter
Values – config 4
Model 4
Complex
Tedious
Repetitive
Time consuming
Expensive
Automated Hyperparameter Tuning
Manage Active Jobs
Evaluate training runs for
specified primary metric
Early terminate poor performing
training runs. Early termination
policies include:
Early terminate poorly
performing runs
/Docs alert
Get started with Azure
Machine Learning Service
Hyperparameter Tuning
https://aka.ms/AzureMLHyperDrive
Wrapping Up
Takeaways
/Docs alert
Azure ML Best Practices Repos
github.com/microsoft/nlp/
• Computer Vision github.com/microsoft/ComputerVision
• Recommenders github.com/microsoft/Recommenders
https://aka.ms/TextCogSvc
https://aka.ms/AutomatedMLDoc
https://aka.ms/AzureMLHyperDrive
https://aka.ms/AA3dzht
https://aka.ms/9MLTips
Getting Started Series…
Other Materials
/MS Learn alert
aka.ms/AIML40MSLearnCollection
Learning Resources
aka.ms/AA3dzht
https://medium.com/microsoftazure/9-advanced-tips-for-production-
machine-learning-6bbdebf49a6f

Mais conteúdo relacionado

Mais procurados

Scaling up data science applications
Scaling up data science applicationsScaling up data science applications
Scaling up data science applicationsKexin Xie
 
Django Multi-DB in Anger
Django Multi-DB in AngerDjango Multi-DB in Anger
Django Multi-DB in AngerLoren Davie
 
Julien Simon "Scaling ML from 0 to millions of users"
Julien Simon "Scaling ML from 0 to millions of users"Julien Simon "Scaling ML from 0 to millions of users"
Julien Simon "Scaling ML from 0 to millions of users"Fwdays
 
NUS-ISS Learning Day 2019-Pandas in the cloud
NUS-ISS Learning Day 2019-Pandas in the cloudNUS-ISS Learning Day 2019-Pandas in the cloud
NUS-ISS Learning Day 2019-Pandas in the cloudNUS-ISS
 
Integrating cloud stack with puppet
Integrating cloud stack with puppetIntegrating cloud stack with puppet
Integrating cloud stack with puppetPuppet
 
SenchaCon 2016: Ext JS + React: A Match Made in UX Heaven - Mark Brocato
SenchaCon 2016: Ext JS + React: A Match Made in UX Heaven - Mark BrocatoSenchaCon 2016: Ext JS + React: A Match Made in UX Heaven - Mark Brocato
SenchaCon 2016: Ext JS + React: A Match Made in UX Heaven - Mark BrocatoSencha
 
From Zero to Hero – Web Performance
From Zero to Hero – Web PerformanceFrom Zero to Hero – Web Performance
From Zero to Hero – Web PerformanceSebastian Springer
 
Learn you some Ansible for great good!
Learn you some Ansible for great good!Learn you some Ansible for great good!
Learn you some Ansible for great good!David Lapsley
 
Pyclustering tutorial - K-means
Pyclustering tutorial - K-meansPyclustering tutorial - K-means
Pyclustering tutorial - K-meansAndrei Novikov
 
Autoscale a self-healing cluster in OpenStack with Heat
Autoscale a self-healing cluster in OpenStack with HeatAutoscale a self-healing cluster in OpenStack with Heat
Autoscale a self-healing cluster in OpenStack with HeatRico Lin
 
Data processing with celery and rabbit mq
Data processing with celery and rabbit mqData processing with celery and rabbit mq
Data processing with celery and rabbit mqJeff Peck
 
Inside Azure Diagnostics
Inside Azure DiagnosticsInside Azure Diagnostics
Inside Azure DiagnosticsMichael Collier
 
【Unite 2017 Tokyo】ScriptableObjectを使ってプログラマーもアーティストも幸せになろう
【Unite 2017 Tokyo】ScriptableObjectを使ってプログラマーもアーティストも幸せになろう【Unite 2017 Tokyo】ScriptableObjectを使ってプログラマーもアーティストも幸せになろう
【Unite 2017 Tokyo】ScriptableObjectを使ってプログラマーもアーティストも幸せになろうUnity Technologies Japan K.K.
 
Improvements in OpenStack Integration for Application Developers
Improvements in OpenStack Integration for Application DevelopersImprovements in OpenStack Integration for Application Developers
Improvements in OpenStack Integration for Application DevelopersRico Lin
 
(DEV301) Automating AWS with the AWS CLI
(DEV301) Automating AWS with the AWS CLI(DEV301) Automating AWS with the AWS CLI
(DEV301) Automating AWS with the AWS CLIAmazon Web Services
 
Functional Core, Reactive Shell
Functional Core, Reactive ShellFunctional Core, Reactive Shell
Functional Core, Reactive ShellGiovanni Lodi
 

Mais procurados (20)

Scaling up data science applications
Scaling up data science applicationsScaling up data science applications
Scaling up data science applications
 
Django Multi-DB in Anger
Django Multi-DB in AngerDjango Multi-DB in Anger
Django Multi-DB in Anger
 
Julien Simon "Scaling ML from 0 to millions of users"
Julien Simon "Scaling ML from 0 to millions of users"Julien Simon "Scaling ML from 0 to millions of users"
Julien Simon "Scaling ML from 0 to millions of users"
 
NUS-ISS Learning Day 2019-Pandas in the cloud
NUS-ISS Learning Day 2019-Pandas in the cloudNUS-ISS Learning Day 2019-Pandas in the cloud
NUS-ISS Learning Day 2019-Pandas in the cloud
 
Integrating cloud stack with puppet
Integrating cloud stack with puppetIntegrating cloud stack with puppet
Integrating cloud stack with puppet
 
SenchaCon 2016: Ext JS + React: A Match Made in UX Heaven - Mark Brocato
SenchaCon 2016: Ext JS + React: A Match Made in UX Heaven - Mark BrocatoSenchaCon 2016: Ext JS + React: A Match Made in UX Heaven - Mark Brocato
SenchaCon 2016: Ext JS + React: A Match Made in UX Heaven - Mark Brocato
 
From Zero to Hero – Web Performance
From Zero to Hero – Web PerformanceFrom Zero to Hero – Web Performance
From Zero to Hero – Web Performance
 
Celery
CeleryCelery
Celery
 
Ios - Introduction to memory management
Ios - Introduction to memory managementIos - Introduction to memory management
Ios - Introduction to memory management
 
Learn you some Ansible for great good!
Learn you some Ansible for great good!Learn you some Ansible for great good!
Learn you some Ansible for great good!
 
Pyclustering tutorial - K-means
Pyclustering tutorial - K-meansPyclustering tutorial - K-means
Pyclustering tutorial - K-means
 
Autoscale a self-healing cluster in OpenStack with Heat
Autoscale a self-healing cluster in OpenStack with HeatAutoscale a self-healing cluster in OpenStack with Heat
Autoscale a self-healing cluster in OpenStack with Heat
 
Typescript barcelona
Typescript barcelonaTypescript barcelona
Typescript barcelona
 
Data processing with celery and rabbit mq
Data processing with celery and rabbit mqData processing with celery and rabbit mq
Data processing with celery and rabbit mq
 
Firebase ng2 zurich
Firebase ng2 zurichFirebase ng2 zurich
Firebase ng2 zurich
 
Inside Azure Diagnostics
Inside Azure DiagnosticsInside Azure Diagnostics
Inside Azure Diagnostics
 
【Unite 2017 Tokyo】ScriptableObjectを使ってプログラマーもアーティストも幸せになろう
【Unite 2017 Tokyo】ScriptableObjectを使ってプログラマーもアーティストも幸せになろう【Unite 2017 Tokyo】ScriptableObjectを使ってプログラマーもアーティストも幸せになろう
【Unite 2017 Tokyo】ScriptableObjectを使ってプログラマーもアーティストも幸せになろう
 
Improvements in OpenStack Integration for Application Developers
Improvements in OpenStack Integration for Application DevelopersImprovements in OpenStack Integration for Application Developers
Improvements in OpenStack Integration for Application Developers
 
(DEV301) Automating AWS with the AWS CLI
(DEV301) Automating AWS with the AWS CLI(DEV301) Automating AWS with the AWS CLI
(DEV301) Automating AWS with the AWS CLI
 
Functional Core, Reactive Shell
Functional Core, Reactive ShellFunctional Core, Reactive Shell
Functional Core, Reactive Shell
 

Semelhante a Unsupervised Aspect Based Sentiment Analysis at Scale

Azure machine learning service
Azure machine learning serviceAzure machine learning service
Azure machine learning serviceRuth Yakubu
 
AzureMLDeployment.ppt
AzureMLDeployment.pptAzureMLDeployment.ppt
AzureMLDeployment.pptSiddharth Vij
 
Shift Remote AI: Build and deploy PyTorch Models with Azure Machine Learning ...
Shift Remote AI: Build and deploy PyTorch Models with Azure Machine Learning ...Shift Remote AI: Build and deploy PyTorch Models with Azure Machine Learning ...
Shift Remote AI: Build and deploy PyTorch Models with Azure Machine Learning ...Shift Conference
 
Build and deploy PyTorch models with Azure Machine Learning - Henk - CCDays
Build and deploy PyTorch models with Azure Machine Learning - Henk - CCDaysBuild and deploy PyTorch models with Azure Machine Learning - Henk - CCDays
Build and deploy PyTorch models with Azure Machine Learning - Henk - CCDaysCodeOps Technologies LLP
 
ML-Ops how to bring your data science to production
ML-Ops  how to bring your data science to productionML-Ops  how to bring your data science to production
ML-Ops how to bring your data science to productionHerman Wu
 
Viktor Tsykunov "Microsoft AI platform for every Developer"
Viktor Tsykunov "Microsoft AI platform for every Developer"Viktor Tsykunov "Microsoft AI platform for every Developer"
Viktor Tsykunov "Microsoft AI platform for every Developer"Lviv Startup Club
 
Automation with Ansible and Containers
Automation with Ansible and ContainersAutomation with Ansible and Containers
Automation with Ansible and ContainersRodolfo Carvalho
 
SCasia 2018 MSFT hands on session for Azure Batch AI
SCasia 2018 MSFT hands on session for Azure Batch AISCasia 2018 MSFT hands on session for Azure Batch AI
SCasia 2018 MSFT hands on session for Azure Batch AIHiroshi Tanaka
 
Competition 1 (blog 1)
Competition 1 (blog 1)Competition 1 (blog 1)
Competition 1 (blog 1)TarunPaparaju
 
Deploy and Serve Model from Azure Databricks onto Azure Machine Learning
Deploy and Serve Model from Azure Databricks onto Azure Machine LearningDeploy and Serve Model from Azure Databricks onto Azure Machine Learning
Deploy and Serve Model from Azure Databricks onto Azure Machine LearningDatabricks
 
Dive into DevOps | March, Building with Terraform, Volodymyr Tsap
Dive into DevOps | March, Building with Terraform, Volodymyr TsapDive into DevOps | March, Building with Terraform, Volodymyr Tsap
Dive into DevOps | March, Building with Terraform, Volodymyr TsapProvectus
 
Designing a production grade realtime ml inference endpoint
Designing a production grade realtime ml inference endpointDesigning a production grade realtime ml inference endpoint
Designing a production grade realtime ml inference endpointChandim Sett
 
Distributed Deep Learning Using Java on the Client and in the Cloud
Distributed Deep Learning Using Java on the Client and in the CloudDistributed Deep Learning Using Java on the Client and in the Cloud
Distributed Deep Learning Using Java on the Client and in the CloudData Science Leuven
 
Start machine learning in 5 simple steps
Start machine learning in 5 simple stepsStart machine learning in 5 simple steps
Start machine learning in 5 simple stepsRenjith M P
 
Running Intelligent Applications inside a Database: Deep Learning with Python...
Running Intelligent Applications inside a Database: Deep Learning with Python...Running Intelligent Applications inside a Database: Deep Learning with Python...
Running Intelligent Applications inside a Database: Deep Learning with Python...Miguel González-Fierro
 
Training course lect2
Training course lect2Training course lect2
Training course lect2Noor Dhiya
 

Semelhante a Unsupervised Aspect Based Sentiment Analysis at Scale (20)

Azure machine learning service
Azure machine learning serviceAzure machine learning service
Azure machine learning service
 
AzureMLDeployment.ppt
AzureMLDeployment.pptAzureMLDeployment.ppt
AzureMLDeployment.ppt
 
Shift Remote AI: Build and deploy PyTorch Models with Azure Machine Learning ...
Shift Remote AI: Build and deploy PyTorch Models with Azure Machine Learning ...Shift Remote AI: Build and deploy PyTorch Models with Azure Machine Learning ...
Shift Remote AI: Build and deploy PyTorch Models with Azure Machine Learning ...
 
Build and deploy PyTorch models with Azure Machine Learning - Henk - CCDays
Build and deploy PyTorch models with Azure Machine Learning - Henk - CCDaysBuild and deploy PyTorch models with Azure Machine Learning - Henk - CCDays
Build and deploy PyTorch models with Azure Machine Learning - Henk - CCDays
 
ML-Ops how to bring your data science to production
ML-Ops  how to bring your data science to productionML-Ops  how to bring your data science to production
ML-Ops how to bring your data science to production
 
Database connectivity in python
Database connectivity in pythonDatabase connectivity in python
Database connectivity in python
 
Viktor Tsykunov "Microsoft AI platform for every Developer"
Viktor Tsykunov "Microsoft AI platform for every Developer"Viktor Tsykunov "Microsoft AI platform for every Developer"
Viktor Tsykunov "Microsoft AI platform for every Developer"
 
Automation with Ansible and Containers
Automation with Ansible and ContainersAutomation with Ansible and Containers
Automation with Ansible and Containers
 
Database connectivity in python
Database connectivity in pythonDatabase connectivity in python
Database connectivity in python
 
SCasia 2018 MSFT hands on session for Azure Batch AI
SCasia 2018 MSFT hands on session for Azure Batch AISCasia 2018 MSFT hands on session for Azure Batch AI
SCasia 2018 MSFT hands on session for Azure Batch AI
 
Celery with python
Celery with pythonCelery with python
Celery with python
 
Competition 1 (blog 1)
Competition 1 (blog 1)Competition 1 (blog 1)
Competition 1 (blog 1)
 
Deploy and Serve Model from Azure Databricks onto Azure Machine Learning
Deploy and Serve Model from Azure Databricks onto Azure Machine LearningDeploy and Serve Model from Azure Databricks onto Azure Machine Learning
Deploy and Serve Model from Azure Databricks onto Azure Machine Learning
 
Dive into DevOps | March, Building with Terraform, Volodymyr Tsap
Dive into DevOps | March, Building with Terraform, Volodymyr TsapDive into DevOps | March, Building with Terraform, Volodymyr Tsap
Dive into DevOps | March, Building with Terraform, Volodymyr Tsap
 
Designing a production grade realtime ml inference endpoint
Designing a production grade realtime ml inference endpointDesigning a production grade realtime ml inference endpoint
Designing a production grade realtime ml inference endpoint
 
Distributed Deep Learning Using Java on the Client and in the Cloud
Distributed Deep Learning Using Java on the Client and in the CloudDistributed Deep Learning Using Java on the Client and in the Cloud
Distributed Deep Learning Using Java on the Client and in the Cloud
 
Start machine learning in 5 simple steps
Start machine learning in 5 simple stepsStart machine learning in 5 simple steps
Start machine learning in 5 simple steps
 
Running Intelligent Applications inside a Database: Deep Learning with Python...
Running Intelligent Applications inside a Database: Deep Learning with Python...Running Intelligent Applications inside a Database: Deep Learning with Python...
Running Intelligent Applications inside a Database: Deep Learning with Python...
 
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
Deep Learning for Computer Vision: Software Frameworks (UPC 2016)
 
Training course lect2
Training course lect2Training course lect2
Training course lect2
 

Mais de Aaron (Ari) Bornstein

The Importance of Developing Interpretable AI Applications
The Importance of Developing Interpretable AI ApplicationsThe Importance of Developing Interpretable AI Applications
The Importance of Developing Interpretable AI ApplicationsAaron (Ari) Bornstein
 
What startups need to know about NLP, AI, & ML on the cloud.
What startups need to know about NLP, AI, & ML on the cloud.What startups need to know about NLP, AI, & ML on the cloud.
What startups need to know about NLP, AI, & ML on the cloud.Aaron (Ari) Bornstein
 
PyConIL 2019 Beyond word Embeddings Slides
PyConIL 2019 Beyond word Embeddings SlidesPyConIL 2019 Beyond word Embeddings Slides
PyConIL 2019 Beyond word Embeddings SlidesAaron (Ari) Bornstein
 
Democratizing AI Istanbul Open Source Summit
Democratizing AI Istanbul Open Source SummitDemocratizing AI Istanbul Open Source Summit
Democratizing AI Istanbul Open Source SummitAaron (Ari) Bornstein
 
Data Hack 2018 Microsoft Math Teacher Challenge
Data Hack 2018 Microsoft Math Teacher ChallengeData Hack 2018 Microsoft Math Teacher Challenge
Data Hack 2018 Microsoft Math Teacher ChallengeAaron (Ari) Bornstein
 
PyconIL 2017 Realtime Sensor Anomaly Detection with Scikit-Learn and the Azu...
PyconIL 2017  Realtime Sensor Anomaly Detection with Scikit-Learn and the Azu...PyconIL 2017  Realtime Sensor Anomaly Detection with Scikit-Learn and the Azu...
PyconIL 2017 Realtime Sensor Anomaly Detection with Scikit-Learn and the Azu...Aaron (Ari) Bornstein
 
DLD TLV Cognitive Services: The Brains Behind Your Bot
DLD TLV Cognitive Services:The Brains Behind Your BotDLD TLV Cognitive Services:The Brains Behind Your Bot
DLD TLV Cognitive Services: The Brains Behind Your BotAaron (Ari) Bornstein
 

Mais de Aaron (Ari) Bornstein (13)

The Importance of Developing Interpretable AI Applications
The Importance of Developing Interpretable AI ApplicationsThe Importance of Developing Interpretable AI Applications
The Importance of Developing Interpretable AI Applications
 
Microsoft Breeze CA AI Workshop
Microsoft Breeze CA AI WorkshopMicrosoft Breeze CA AI Workshop
Microsoft Breeze CA AI Workshop
 
What startups need to know about NLP, AI, & ML on the cloud.
What startups need to know about NLP, AI, & ML on the cloud.What startups need to know about NLP, AI, & ML on the cloud.
What startups need to know about NLP, AI, & ML on the cloud.
 
PyConIL 2019 Beyond word Embeddings Slides
PyConIL 2019 Beyond word Embeddings SlidesPyConIL 2019 Beyond word Embeddings Slides
PyConIL 2019 Beyond word Embeddings Slides
 
Best practices for DevRel in Israel
Best practices for DevRel in IsraelBest practices for DevRel in Israel
Best practices for DevRel in Israel
 
NLP in the Industry
NLP in the Industry NLP in the Industry
NLP in the Industry
 
Big Data with Azure
Big Data with AzureBig Data with Azure
Big Data with Azure
 
Democratizing AI Istanbul Open Source Summit
Democratizing AI Istanbul Open Source SummitDemocratizing AI Istanbul Open Source Summit
Democratizing AI Istanbul Open Source Summit
 
Beyond word embeddings
Beyond word embeddingsBeyond word embeddings
Beyond word embeddings
 
Data Hack 2018 Microsoft Math Teacher Challenge
Data Hack 2018 Microsoft Math Teacher ChallengeData Hack 2018 Microsoft Math Teacher Challenge
Data Hack 2018 Microsoft Math Teacher Challenge
 
A walk through Azure IoT
A walk through Azure IoTA walk through Azure IoT
A walk through Azure IoT
 
PyconIL 2017 Realtime Sensor Anomaly Detection with Scikit-Learn and the Azu...
PyconIL 2017  Realtime Sensor Anomaly Detection with Scikit-Learn and the Azu...PyconIL 2017  Realtime Sensor Anomaly Detection with Scikit-Learn and the Azu...
PyconIL 2017 Realtime Sensor Anomaly Detection with Scikit-Learn and the Azu...
 
DLD TLV Cognitive Services: The Brains Behind Your Bot
DLD TLV Cognitive Services:The Brains Behind Your BotDLD TLV Cognitive Services:The Brains Behind Your Bot
DLD TLV Cognitive Services: The Brains Behind Your Bot
 

Último

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 

Último (20)

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 

Unsupervised Aspect Based Sentiment Analysis at Scale

  • 3. • Blog: https://medium.com/@aribornstein • Twitter: https://twitter.com/pythiccoder • GitHub: https://github.com/aribornstein
  • 4.
  • 5.
  • 6.
  • 7.
  • 8. Trends In Language Understanding Language Understanding Ambient Computing Automation and Solution Scale Unlocking Unstructured Data
  • 10.
  • 11.
  • 12. What is Azure Machine Learning Service? Set of Azure Cloud Services Python SDK Prepare Data Build Models Train Models Manage Models Track Experiments Deploy Models That enables you to:
  • 13. Azure ML Service Key Artifacts Workspace
  • 14. Advanced Azure ML Services
  • 15. NLP-Architect Aspect Based Sentiment Analysis
  • 16. Azure Machine Learning Typical E2E Process … Prepare Experiment Deploy Orchestrate
  • 17. Azure Machine Learning Typical E2E Process … Prepare Experiment Deploy Orchestrate
  • 18. Managing Tailwind Traders ML Challenges
  • 19. Challenges of distributed training  Dependencies and Containers Provision clusters of VMs Schedule jobs Distribute data Gather results Handle failures Scale resources Secure Access
  • 20. Features of Distributed Training distributed training with Horovod
  • 21. /Docs alert Get Started by setting up your Azure ML Environment https://aka.ms/amlsdk
  • 22. My Computer Data Store Azure ML Workspace Compute Target Docker Image How Azure ML Works
  • 23. The 8 Azure ML Train Steps
  • 24. from azureml.core import Workspace ws = Workspace.create(name='myworkspace', subscription_id='<azure-subscription-id>', resource_group='myresourcegroup', create_resource_group=True, location='eastus2' # or other supported Azure region ) # see workspace details ws.get_details() Step 2 – Create an Experiment experiment_name = ‘my-experiment-1' from azureml.core import Experiment exp = Experiment(workspace=ws, name=experiment_name) Step 1 – Create a workspace
  • 25. Step 3 – Create remote compute target # choose a name for your cluster, specify min and max nodes compute_name = os.environ.get("BATCHAI_CLUSTER_NAME", "cpucluster") compute_min_nodes = os.environ.get("BATCHAI_CLUSTER_MIN_NODES", 0) compute_max_nodes = os.environ.get("BATCHAI_CLUSTER_MAX_NODES", 4) # This example uses CPU VM. For using GPU VM, set SKU to STANDARD_NC6 vm_size = os.environ.get("BATCHAI_CLUSTER_SKU", "STANDARD_D2_V2") provisioning_config = AmlCompute.provisioning_configuration( vm_size = vm_size, min_nodes = compute_min_nodes, max_nodes = compute_max_nodes) # create the cluster print(‘ creating a new compute target... ') compute_target = ComputeTarget.create(ws, compute_name, provisioning_config) # You can poll for a minimum number of nodes and for a specific timeout. # if no min node count is provided it will use the scale settings for the cluster compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)
  • 26. # note that while loading, we are shrinking the intensity values (X) from 0-255 to 0-1 so that the model converge faster. X_train = load_data('./data/train-images.gz', False) / 255.0 y_train = load_data('./data/train-labels.gz', True).reshape(-1) X_test = load_data('./data/test-images.gz', False) / 255.0 y_test = load_data('./data/test-labels.gz', True).reshape(-1) First load the compressed files into numpy arrays. Note the ‘load_data’ is a custom function that simply parses the compressed files into numpy arrays. Now make the data accessible remotely by uploading that data from your local machine into Azure so it can be accessed for remote training. The files are uploaded into a directory named mnist at the root of the datastore. ds = ws.get_default_datastore() print(ds.datastore_type, ds.account_name, ds.container_name) ds.upload(src_dir='./data', target_path='mnist', overwrite=True, show_progress=True) We now have everything you need to start training a model. Step 4 – Upload data to the cloud
  • 27. %%time from sklearn.linear_model import LogisticRegression clf = LogisticRegression() clf.fit(X_train, y_train) # Next, make predictions using the test set and calculate the accuracy y_hat = clf.predict(X_test) print(np.average(y_hat == y_test)) You should see the local model accuracy displayed. [It should be a number like 0.915] Train a simple logistic regression model using scikit-learn locally. This should take a minute or two. Step 5 – Train a local model
  • 28. To submit a training job to a remote you have to perform the following tasks: • 6.1: Create a directory • 6.2: Create a training script • 6.3: Create an estimator object • 6.4: Submit the job Step 6.1 – Create a directory Create a directory to deliver the required code from your computer to the remote resource. import os script_folder = './sklearn-mnist' os.makedirs(script_folder, exist_ok=True) Step 6 – Train model on remote cluster
  • 29. %%writefile $script_folder/train.py # load train and test set into numpy arrays # Note: we scale the pixel intensity values to 0-1 (by dividing it with 255.0) so the model can # converge faster. # ‘data_folder’ variable holds the location of the data files (from datastore) Reg = 0.8 # regularization rate of the logistic regression model. X_train = load_data(os.path.join(data_folder, 'train-images.gz'), False) / 255.0 X_test = load_data(os.path.join(data_folder, 'test-images.gz'), False) / 255.0 y_train = load_data(os.path.join(data_folder, 'train-labels.gz'), True).reshape(-1) y_test = load_data(os.path.join(data_folder, 'test-labels.gz'), True).reshape(-1) print(X_train.shape, y_train.shape, X_test.shape, y_test.shape, sep = 'n’) # get hold of the current run run = Run.get_context() #Train a logistic regression model with regularizaion rate of’ ‘reg’ clf = LogisticRegression(C=1.0/reg, random_state=42) clf.fit(X_train, y_train) Step 6.2 – Create a Training Script (1/2)
  • 30. print('Predict the test set’) y_hat = clf.predict(X_test) # calculate accuracy on the prediction acc = np.average(y_hat == y_test) print('Accuracy is', acc) run.log('regularization rate', np.float(args.reg)) run.log('accuracy', np.float(acc)) os.makedirs('outputs', exist_ok=True) # The training script saves the model into a directory named ‘outputs’. Note files saved in the # outputs folder are automatically uploaded into experiment record. Anything written in this # directory is automatically uploaded into the workspace. joblib.dump(value=clf, filename='outputs/sklearn_mnist_model.pkl') Step 6.2 – Create a Training Script (2/2)
  • 31. An estimator object is used to submit the run. from azureml.train.estimator import Estimator script_params = { '--data-folder': ds.as_mount(), '--regularization': 0.8 } est = Estimator(source_directory=script_folder, script_params=script_params, compute_target=compute_target, entry_script='train.py’, conda_packages=['scikit-learn']) Step 6.4 – Submit the job to the cluster for training run = exp.submit(config=est) run Step 6.3 – Create an Estimator
  • 32. Step 7 – Monitor a run You can watch the progress of the run with a Jupyter widget. The widget is asynchronous and provides live updates every 10-15 seconds until the job completes. from azureml.widgets import RunDetails RunDetails(run).show() Here is a still snapshot of the widget shown at the end of training:
  • 33. Step 8 – See the results As model training and monitoring happen in the background. Wait until the model has completed training before running more code. Use wait_for_completion to show when the model training is complete run.wait_for_completion(show_output=False) # now there is a trained model on the remote cluster print(run.get_metrics()) {'regularization rate': 0.8, 'accuracy': 0.9204}
  • 34. Step 9 – Register the model This wrote the file ‘outputs/sklearn_mnist_model.pkl’ in a directory named ‘outputs’ in the VM of the cluster where the job is executed. • outputs is a special directory in that all content in this directory is automatically uploaded to your workspace. • This content appears in the run record in the experiment under your workspace. • Hence, the model file is now also available in your workspace. joblib.dump(value=clf, filename='outputs/sklearn_mnist_model.pkl') Recall that the last step in the training script is: # register the model in the workspace model = run.register_model ( model_name='sklearn_mnist’, model_path='outputs/sklearn_mnist_model.pkl’) The model is now available to query, examine, or deploy
  • 35. Step 9 – Deploy the Model
  • 36. Step 9.1 – Create the scoring script Create the scoring script, called score.py, used by the web service call to show how to use the model. It requires two functions – init() and run (input data) from azureml.core.model import Model def init(): global model # retreive the path to the model file using the model name model_path = Model.get_model_path('sklearn_mnist’) model = joblib.load(model_path) def run(raw_data): data = np.array(json.loads(raw_data)['data’]) # make prediction y_hat = model.predict(data) return json.dumps(y_hat.tolist())
  • 37. Step 9.2 – Create environment file Create an environment file, called myenv.yml, that specifies all of the script's package dependencies. This file is used to ensure that all of those dependencies are installed in the Docker image. This example needs scikit-learn and azureml-sdk. from azureml.core.conda_dependencies import CondaDependencies myenv = CondaDependencies() myenv.add_conda_package("scikit-learn") with open("myenv.yml","w") as f: f.write(myenv.serialize_to_string()) Step 9.3 – Create configuration file Create a deployment configuration file and specify the number of CPUs and gigabyte of RAM needed for the ACI container. Here we will use the defaults (1 core and 1 gigabyte of RAM) from azureml.core.webservice import AciWebservice aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1, tags={"data": "MNIST", "method" : "sklearn"}, description='Predict MNIST with sklearn')
  • 38. Step 9.4 – Deploy the model to ACI %%time from azureml.core.webservice import Webservice from azureml.core.image import ContainerImage # configure the image image_config = ContainerImage.image_configuration( execution_script ="score.py", runtime ="python", conda_file ="myenv.yml") service = Webservice.deploy_from_model(workspace=ws, name='sklearn-mnist-svc’, deployment_config=aciconfig, models=[model], image_config=image_config) service.wait_for_deployment(show_output=True)
  • 39. Step 10 – Test the deployed model using the HTTP end point Test the deployed model by sending images to be classified to the HTTP endpoint import requests import json # send a random row from the test set to score random_index = np.random.randint(0, len(X_test)-1) input_data = "{"data": [" + str(list(X_test[random_index])) + "]}" headers = {'Content-Type':'application/json’} resp = requests.post(service.scoring_uri, input_data, headers=headers) print("POST to url", service.scoring_uri) #print("input data:", input_data) print("label:", y_test[random_index]) print("prediction:", resp.text)
  • 40. Demo: Operationalizing Open Source with Azure ML and NLP Architect
  • 42. What are Hyperparameters? { “learning_rate”: uniform(0, 1), “num_layers”: choice(2, 4, 8) … } Config1= {“learning_rate”: 0.2, “num_layers”: 2, …} Config2= {“learning_rate”: 0.5, “num_layers”: 4, …} Config3= {“learning_rate”: 0.9, “num_layers”: 8, …} …
  • 43. Typical ‘manual’ approach to hyperparameter tuning Dataset Training Algorithm 1 Hyperparameter Values – config 1 Model 1 Hyperparameter Values – config 2 Model 2 Hyperparameter Values – config 3 Model 3 Model Training InfrastructureTraining Algorithm 2 Hyperparameter Values – config 4 Model 4 Complex Tedious Repetitive Time consuming Expensive
  • 44. Automated Hyperparameter Tuning Manage Active Jobs Evaluate training runs for specified primary metric Early terminate poor performing training runs. Early termination policies include: Early terminate poorly performing runs
  • 45. /Docs alert Get started with Azure Machine Learning Service Hyperparameter Tuning https://aka.ms/AzureMLHyperDrive
  • 48. /Docs alert Azure ML Best Practices Repos github.com/microsoft/nlp/ • Computer Vision github.com/microsoft/ComputerVision • Recommenders github.com/microsoft/Recommenders https://aka.ms/TextCogSvc https://aka.ms/AutomatedMLDoc https://aka.ms/AzureMLHyperDrive https://aka.ms/AA3dzht https://aka.ms/9MLTips Getting Started Series… Other Materials
  • 50.
  • 51.

Notas do Editor

  1. For this, the simplest solution would be to use Text analytics, which is Azure Cognitive service that enables you to enter your text and get automatically metadata and features. Tailwind Traders can use it to have insight into how people are interacting with their restaurants. Let’s have a look at how it works.
  2. Why choose these APIs ? They work, and it’s easy. Easy:  The APIs are easy to implement because of the simple REST calls.  Being REST APIs, there’s a common way to implement and you can get started with all of them for free simply by going to one place, one website, www.microsoft.com/cognitive.  (You don’t have to hunt around to different places.)  Flexible:  We’ve got a breadth of intelligence and knowledge APIs so developers will be able to find what intelligence feature they need; and importantly, they all work on whatever language, framework, or platform developers choose. So, devs can integrated into their apps—iOS, Android, Windows—using their own tools they know and love (such as python or node.js, etc.). Tested: Tap into an ever-growing collection of powerful AI algorithms developed by experts. Developers can trust the quality and expertise build into each by experts in their field from Microsoft’s Research organization, Bing, and Azure machine learning and these capabilities are used across many Microsoft first party products such as Cortana, Bing and Skype. 
  3. The pre-built AI services, whether it's in text or computer vision or language understanding usually will fill 80% of your business scenario needs. However, you're going to find that there are 20% of your business needs that might not be filled by a prebuilt model, and in those cases you're going to need something more custom. What tends to be really interesting, in my own personal experience, and as you're going to see this applies to Tailwind Traders as well, is that sometimes these 20% end up being the most valuable. In many cases, I might want to start with the cognitive services because it really makes it easy to get started, and that enables me to focus more deeply on the 20% of scenarios that I can't reach with the cognitive services. So let me give you a quick example in demo. We take the Text Analytics API, and we see here. I had a wonderful trip to Seattle last week. I even visited the Space Needle 2 times and we get really nice 84% positive sentiment. But here's something that's Interesting. If I changed one letter in “wonderful”, all of a sudden this typo causes the sentiment to drop to 50%. Now, this specific example is not a crazy example because you know it's something that I can change with maybe putting in a model for spelling detection. But there might be other things as I'm building and trying to build more robust and more custom versions of sentiment analysis that might be domain specific to my restaurant, and less applicable in the general model.
  4. In that case, I might want to build my own so the way that I do this is I can use the Azure Machine Learning Service. What the Azure Machine Learning Services is -- it's a set of cloud services and APIs, with a Python SDK, that enable you to prepare your data, build models, train the models, and manage them, as well as the different experiments that you're using to build these models, and finaly deploy them at scale.
  5. So the Azure ML Services are broken up to a few different artifacts. We have the models that we're going to build, the experiments of how we come up and develop them, the pipelines that allow us to take our experiments and build the pipeline for their deployment. This is especially useful for linking our data to model development and redeployment. We have compute management, which allows us to manage compute at scale, creating clusters of VMs as needed. Image management allows us to manage different environments for our models and deployments, allowing us to deploy them on edge devices, on the cloud, or even on prem. And finally the data storage is also integrated into one workspace.
  6. So now that we have our baseline and I think that's great. Let’s discuss some of the more advanced Azura ML Services. They provide a foundation, on top of which we can train any models, including some open source models, and then put them into production and manage.
  7. We're going to be taking an open source model developed by Intel AI Labs as part of their NLP Architect Library. What it does is it provides us with something called aspect based sentiment analysis. Before when we were looking at our restaurant reviews, we got just a very simple sentiment average for each restaurant. However, for Tailwind Traders that doesn't help to actually go in and fix any of the problems, because they still have one more step that they have to do - they have to send in an inspector, who can actually quantify and see why sentiment is so good at one branch versus another. Aspect based sentiment analysis allows us to address this challenge. What it does is is it takes our text an it extracts what's called a lexicon from it, of both aspect-related and opinion-related words. Each of the aspects represents things we're interested in, like the price of food, cleanness of the place, etc. And when we analyze actual review - it would give us a sentiment for each one of these individual aspects, which then in turn allows us to provide much more actionable responses from our models. These are the ABSA key messages: Producing knowledge regarding specific aspects hence enables to gain targeted business insight. Unsupervised - does not require costly manually tagged data for train Domain Adaptive – produces domain specific aspect and opinion terms   These are the additional NLP architect key messages: Develop low-resource and NLP models that require small amounts of training data. Increase adoption of NLP by developing deep-learning based tools that can be used by non-machine-learning experts. Produce optimized NLP models that are ready for deployment on Intel AI hardware.
  8. So, in order to put this Intel model into production, we’re going to use the Azure Machine Learning Service. I think it's important to understand how the end to end process works, from taking a model from open source up to deploying it into production. The first thing that we have to do is to prepare a data set, then we have to choose our experiment environment, in this case we're going to be using Jupyter notebooks. We have to go and train this model, and iterate a few of different versions. Once we do have a model that is ready for deployment, we need to work on putting the model into production.
  9. In this session, we're going to be focusing on the experimentation phase, though I strongly recommend that you check out the AIML 50 session, which talks about how you can take your models that you've developed and put them into production.
  10. So for Tailwind Traders when they're thinking of how to take an open source state of the art model and put it into production, they've noticed that they have a bunch of critical challenges. Things like being able to manage the environmental dependency across the different packages and their different versions. Training a model with a globally distributed team in a globally distributed company is also a challenge. Managing large data sets and understanding how to optimize costs for storing their data and compute, as well as scaling their compute. Also in terms of compliance and making sure that they are ensured data protection, that only certain people have access to data and only the correct parts of the data. When they build the models, making sure that the models are interpretable in a way that they can understand how they were trained how they can be reproduced and the last real big challenge that they have is managing distributed training. Each of these things is somehow provided by the Azure ML Service, but it would take a very long time to go through each and every single one of them individually So what will do here is we're going to drill down into one of them and you will see the complexity of it, and then as we go and we solve and put this aspect based sentiment analysis model for Tailwind Traders in production.
  11. 19
  12. 20
  13. I am providing links to relevant docs here for your convenience.
  14. When I need to train an Azure ML model, it works as follows. There are just a few steps that we need to get started. The first thing that I need to do is I need to take the code that I have for model training and I need to send it to the Azure Machine Learning Service. The service will then create the configuration for the experiment that I’ve just sent, and it will create a Docker image. It will deploy it to the compute target that I've configured which will be linked to the data that I’ve uploaded in my dataset. It will then launch the training script that I've configured and then in real time it will send me back the stream logs and metrics so that I can see how the process goes, as if I were developing on my own computer!
  15. To take this Aspect-Based Sentiment Analysis model into production, there would by 8 simple steps! 1. Configure Azure Machine Learning Workspace that we saw above, it would act like a central place for all our activities – from storing data to training models. 2. Define a compute target – on which our training code is going to run. 3. Upload our data set. 4. Create the experiment with the train code, encompassing all the dependencies that we have and the training script. 5. Training script can be taken from the existing Intel’s code. Sometimes we might need to adjust the train script from the original library for distributed training, in other cases changes to the code would be very minimal. 6. Run the estimator to perform the training 7. Once we're done with the results of training, we get the model which we then need to register. 8. Registered model can be deployed and tested, and becomes ready for production.
  16. Let’s have a look at how such a model can be trained using a compute cluster. Typically, as a data scientist, I will do most of my development from a Jupyter Notebook. So I can select any Jupyter environment (local, Azure Notebooks or a Notebook inside Azure ML Workspace), and start my work from there using Azure ML Python SDK.
  17. When we're dealing with model optimization, we also have a challenge of distributed hyperparameter tuning. What is distributed hyperparameter tuning?
  18. What are hyperparameters typically? When you see a new state of the art model that's been open sourced, whether that’s BERT, or in this case our aspect based sentiment analysis model, typically you see these models trained on certain data. The performance of the model is Dependent on the distribution of the data that it’s been trained on, and there are set of parameters that are configured as part of the models that are optimized for the given data set, so that the researcher who is publishing the model can get the best possible performance. Now, when you take these models and you run them on your own data or train them for your own scenario, you're going to have a different distribution of data, so these hyperparameters might need to be adjusted, otherwise you will see a big gap in performance of your model when you apply it to Tailwind Traders data, comparing to what’s written in the leaderboards or in the papers.
  19. This it can be a real challenge going through a manual hyperparameter tuning. Typically, what we do is we take a data set and we come up with an algorithm, and then we configure these parameters to build our first model, and then we change the parameters again and we build a second model and we tried to see whether it's better than the first one, and we do this a couple of times until we max out the potential of this algorithm. And then we might change our model and then we have to do this process all over again, and you can imagine it’s time-consuming tedious repetitive and really expensive process.
  20. Well, the Azure ML Service helps you manage this by running distributed versions of your model and allowing you to choose which are the primary metrics. You want to optimize based on the hyperparameters that you provide, and it provides a bunch of different policies, such as bandit policies, medium stopping policy and truncation policies that enable early termination, so if it sees that a certain set of hyperparameters isn't providing the best possible results, it will terminate those run jobs and release and then run and prioritize the compute for models that are much more promising -- until you find the optimal set of hyperparameters
  21. If you want to get started with that here's the documentation. We will be diving into it in a second so you can see how this service works now.
  22. So let's tie this all together what we've seen here in literally less than an hour.
  23. There are also a few takeways that I want you to get away with. First of all, AI is instrumental to companies success in the modern world, and most of the AI can be incorporated into production using Cognitve Services. However, for more complex tasks, Microsoft provides both simple “code-less” ways to train a baseline model, and very sophisticated Azure ML Service that can simplify life of a serious data scientist quite a bit. Whether you are experiences ML professional or a “traditional” developer – we have you covered!
  24. Here is the summary of all docs pages that we have touched upon – they all present a great material for you to study after the session. I want especially to draw your attention to the last link, which is an overview article explaining how the thights we have learnt today can be useful in developing production models.
  25. So thank you for your time and I hope you enjoyed.