
Flock: Data Science Platform @ CISL



In this talk, we will present the basic features and functionality of Flock, an end-to-end research platform that we are developing at CISL that simplifies and automates the integration of machine learning solutions into data engines. Flock makes use of MLflow for model and experiment tracking, but extends and complements it by providing automatic logging, model optimizations, and support for the ONNX model format.
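Flock's automatic logging rewrites user scripts so that hyperparameters and metrics are tracked without manual instrumentation. The idea can be pictured with a small decorator sketch (the `LOG` store, `autolog` decorator, and `train` function below are illustrative, not Flock's API; Flock instead injects MLflow tracking calls into the user's code):

```python
import functools

# Illustrative in-memory log store (Flock actually emits MLflow calls).
LOG = {}

def autolog(fn):
    """Record a training function's keyword hyperparameters automatically."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        for name, value in kwargs.items():
            LOG[f"{fn.__name__}.{name}"] = value  # record each hyperparameter
        return fn(*args, **kwargs)
    return wrapper

@autolog
def train(num_leaves=8, n_estimators=100):
    # Stand-in for a real model-fitting call.
    return f"model({num_leaves},{n_estimators})"

train(num_leaves=8, n_estimators=100)
print(LOG)  # {'train.num_leaves': 8, 'train.n_estimators': 100}
```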

We will showcase Flock's features through a demo using Microsoft's Azure Data Studio and SQL Server.

Published in: Software


  1. Chapter 1 Chapter 2
  2. Applied research group · Collaborating with the Azure Data product group · Open-sourcing our code: Apache Hadoop, REEF, Heron, MLflow
  3. Our labs by numbers: 63 Patents · 7 GAed or Public Preview features just this year · 0.5M LoC in OSS · 130+ Publications in top-tier conferences/journals · 1.1M LoC in products · 600k Servers running our code in Azure/Cosmos
  4. Systems considered thus far: Cloud Providers · Private Services · OSS
  5. Capability comparison matrix — Training: Experiment Tracking, Managed Notebooks, Pipelines / Projects, Multi-Framework, Proprietary Algos, Distributed Training, Auto ML; Serving: Batch prediction, On-prem deployment, Model Monitoring, Model Validation; Data Management: Data Provenance, Data testing, Feature Store, Featurization DSL, Labelling, In-DB ML. (Legend: Good Support / OK Support / No Support / Unknown)
  6. Let Data Scientists do Data Science!
  7. Data-driven development (offline) and solution deployment (online): model training (LightGBM) → NN model transform → ONNX → optimization → ONNX'; deployment of ONNX' as a pyfunc; application tracking via job-id and job telemetry; policies (Dhalion) close/update incidents.
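The "NN model transform" step compiles trained LightGBM trees into tensor operations so they can be exported to ONNX. A minimal sketch of the underlying idea, evaluating a single decision tree with matrix algebra (the toy tree, matrices, and `predict` helper are illustrative, not Flock's `LightGBMBinaryClassifier_Batched`):

```python
import numpy as np

# Toy tree: node0 tests x[0] < 0.5; its right child node1 tests x[1] < 0.3.
# Leaves: L0 = 1.0 (left of node0), L1 = 2.0, L2 = 3.0.
A = np.array([[1.0, 0.0],           # node0 reads feature 0
              [0.0, 1.0]])          # node1 reads feature 1
thresholds = np.array([0.5, 0.3])
# C[leaf, node] = +1 if the leaf sits in the node's left subtree,
# -1 if in its right subtree, 0 if the node is not on the leaf's path.
C = np.array([[ 1,  0],
              [-1,  1],
              [-1, -1]])
counts = np.array([1, 1, 0])        # number of +1 entries per leaf row
leaf_values = np.array([1.0, 2.0, 3.0])

def predict(x):
    d = (A @ x < thresholds).astype(int)    # decisions: 1 = go left
    leaf = (C @ d == counts).astype(float)  # one-hot over leaves
    return float(leaf @ leaf_values)

print(predict(np.array([0.4, 0.9])))  # reaches L0 -> 1.0
print(predict(np.array([0.6, 0.2])))  # reaches L1 -> 2.0
print(predict(np.array([0.6, 0.9])))  # reaches L2 -> 3.0
```

Because every step is a matrix multiply followed by an elementwise comparison, a whole forest can be batched into a few large tensor operations and exported with `torch.onnx.export`, as the demo code does.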
  8. DEMO — Python code

     User code:

        import pandas as pd
        import lightgbm as lgb
        from sklearn import metrics

        data_train = pd.read_csv("global_train_x_label_with_mapping.csv")
        data_test = pd.read_csv("global_test_x_label_with_mapping.csv")
        train_x = data_train.iloc[:, :-1].values
        train_y = data_train.iloc[:, -1].values
        test_x = data_test.iloc[:, :-1].values
        test_y = data_test.iloc[:, -1].values

        n_leaves = 8
        n_trees = 100
        clf = lgb.LGBMClassifier(num_leaves=n_leaves, n_estimators=n_trees)
        clf.fit(train_x, train_y)
        score = metrics.precision_score(test_y, clf.predict(test_x), average='macro')
        print("Precision Score on Test Data: " + str(score))

     Instrumented code (generated by Flock):

        import multiprocessing
        from functools import partial

        import pandas as pd
        import lightgbm as lgb
        import torch
        import onnx
        from onnx import optimizer
        from sklearn import metrics

        import mlflow
        import mlflow.onnx
        import mlflow.pyfunc
        import mlflow.sklearn
        from flock import get_tree_parameters, LightGBMBinaryClassifier_Batched

        data_train = pd.read_csv('global_train_x_label_with_mapping.csv')
        data_test = pd.read_csv('global_test_x_label_with_mapping.csv')
        train_x = data_train.iloc[:, :-1].values
        train_y = data_train.iloc[:, -1].values
        test_x = data_test.iloc[:, :-1].values
        test_y = data_test.iloc[:, -1].values

        n_leaves = 8
        n_trees = 100
        clf = lgb.LGBMClassifier(num_leaves=n_leaves, n_estimators=n_trees)
        mlflow.log_param('clf_init_n_estimators', n_trees)
        mlflow.log_param('clf_init_num_leaves', n_leaves)
        clf.fit(train_x, train_y)
        mlflow.sklearn.log_model(clf, 'clf_model')
        score = metrics.precision_score(test_y, clf.predict(test_x), average='macro')
        mlflow.log_param('precision_score_average', 'macro')
        mlflow.log_param('score', score)
        print('Precision Score on Test Data: ' + str(score))

        # Translate the trained LightGBM model into a neural network and export to ONNX
        activation = 'sigmoid'
        torch.set_num_threads(1)
        device = torch.device('cpu')
        model_name = 'griffon'
        model = clf.booster_.dump_model()
        n_features = clf.n_features_
        tree_infos = model['tree_info']
        pool = multiprocessing.Pool(8)
        parameters = pool.map(partial(get_tree_parameters, n_features=n_features), tree_infos)
        lgb_nn = LightGBMBinaryClassifier_Batched(parameters, n_features, activation).to(device)
        torch.onnx.export(lgb_nn, torch.randn(1, n_features).to(device),
                          model_name + '_nn.onnx', export_params=True,
                          operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK)

        # Optimize the ONNX graph and log it with MLflow
        passes = ['eliminate_deadend', 'eliminate_identity',
                  'eliminate_nop_monotone_argmax', 'eliminate_nop_transpose',
                  'eliminate_unused_initializer', 'extract_constant_to_initializer',
                  'fuse_consecutive_concats', 'fuse_consecutive_reduce_unsqueeze',
                  'fuse_consecutive_squeezes', 'fuse_consecutive_transposes',
                  'fuse_matmul_add_bias_into_gemm', 'fuse_transpose_into_gemm',
                  'lift_lexical_references']
        model = onnx.load(model_name + '_nn.onnx')
        opt_model = optimizer.optimize(model, passes)
        mlflow.onnx.log_model(opt_model, 'opt_model')

        # Score through the MLflow pyfunc interface
        pyfunc_loaded = mlflow.pyfunc.load_pyfunc('opt_model',
                                                  run_id=mlflow.active_run().info.run_uuid)
        scoring = pyfunc_loaded.predict(pd.DataFrame(test_x[:1].astype('float32'))).values
        print('Scoring through mlflow pyfunc: ', scoring)
        mlflow.log_param('pyfunc_scoring', scoring[0][0])
  9. Current OnCall Workflow: a job goes out of SLA and Support is alerted → a support engineer (SE) spends hours of manual labor looking through hundreds of metrics → after 5-6 hours of investigation, the reason for the job slowdown is found.
     Revised OnCall Workflow with Griffon: a job goes out of SLA and the SE is alerted → the Job ID is fed through Griffon and the top reasons for job slowdown are generated automatically → the reason is found in the top five generated by Griffon; all the metrics Griffon has looked at can be ruled out, and the SE can direct their efforts to a smaller set of metrics.
  10. ONNX: Interoperability across ML frameworks — an open format to represent ML models, backed by Microsoft, Amazon, Facebook, and several hardware vendors.
  11. Train a model using a popular framework such as TensorFlow → convert the model to ONNX format → perform inference efficiently across multiple platforms and hardware using ONNX Runtime.
  12. ONNX Runtime and optimizations. Key design points: a graph IR; support for multiple backends (e.g., CPU, GPU, FPGA); graph optimizations via a rule-based optimizer inspired by DB optimizers; improved inference time and memory consumption (examples: 117 msec → 34 msec; 250 MB → 200 MB).
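A rule-based graph optimizer applies local rewrite rules, such as the `eliminate_identity` pass listed in the demo code. A toy sketch of one such pass, using an invented tuple-based graph representation (real ONNX graphs are protobuf-based and the real passes are implemented in C++):

```python
# A node is (op, inputs, output); the list is in topological order.
graph = [
    ("MatMul",   ["x", "W"],  "t0"),
    ("Identity", ["t0"],      "t1"),   # no-op to be removed
    ("Add",      ["t1", "b"], "y"),
]

def eliminate_identity(nodes):
    """Remove Identity nodes and rewire their consumers to the original tensor."""
    alias = {out: ins[0] for op, ins, out in nodes if op == "Identity"}
    rewired = []
    for op, ins, out in nodes:
        if op == "Identity":
            continue  # drop the no-op node itself
        rewired.append((op, [alias.get(i, i) for i in ins], out))
    return rewired

optimized = eliminate_identity(graph)
print(optimized)
# [('MatMul', ['x', 'W'], 't0'), ('Add', ['t0', 'b'], 'y')]
```

Each pass preserves the graph's semantics while removing or fusing nodes; running a pipeline of such passes is what shrinks inference time and memory as described above.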
  13. ONNX Runtime in production: ~40 ONNX models in production; >10 orgs are migrating their models to ONNX Runtime; average speedup 2.7x.
  14. ONNX Runtime in production: Office — grammar-checking model, 14.6x reduction in latency.
  15. Train a sklearn model
  16. Deploy the server, then perform inference (ONNX Runtime is automatically invoked):

         mlflow models serve -m /artifacts/model -p 1234

         curl -X POST -H "Content-Type:application/json; format=pandas-split" --data '{"columns":["alcohol","chlorides","citric acid","density","fixed acidity","free sulfur dioxide","pH","residual sugar","sulphates","total sulfur dioxide","volatile acidity"],"data":[[12.8,0.029,0.48,0.98,6.2,29,3.33,1.2,0.39,75,0.66]]}'

         [6.379428821398614]
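The JSON body in the curl command is MLflow's pandas-split format, which can be produced directly from a DataFrame; a small sketch (column names and values taken from the request above):

```python
import json
import pandas as pd

# Build the pandas-split payload for the served model from a DataFrame.
row = pd.DataFrame(
    [[12.8, 0.029, 0.48, 0.98, 6.2, 29, 3.33, 1.2, 0.39, 75, 0.66]],
    columns=["alcohol", "chlorides", "citric acid", "density",
             "fixed acidity", "free sulfur dioxide", "pH",
             "residual sugar", "sulphates", "total sulfur dioxide",
             "volatile acidity"],
)
payload = row.to_json(orient="split", index=False)  # {"columns":[...],"data":[[...]]}
parsed = json.loads(payload)
print(parsed["columns"][0], parsed["data"][0][0])  # alcohol 12.8
```

This string can then be POSTed as the request body, exactly as the curl command does.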