Apache Submarine
Unified Machine Learning Platform
Keqiu Hu
Staff Software Engineer @ LinkedIn
Wangda Tan
Apache Hadoop PMC Member
Sr. Engineering Manager @ Cloudera
Agenda
• ML in Production & Requirements
• What is Apache Submarine?
• Demo
• Current state and future
Machine Learning In Production
Machine Learning in tutorial
What is included in an ML training lifecycle?
Data Pipeline For Machine Learning
• ETL
• Data Exploration
• Join / Sampling / Feature Extraction
• Split train/test data set, etc.
• Model Training
• Model Saving, Versioning, etc.
• Model Deployment (Online Serving)
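As a small illustration of the "split train/test data set" stage listed above, here is a minimal Python sketch. The function name `split_train_test` and its parameters are hypothetical and are not part of Submarine; it only shows a seeded shuffle so that the same split can be reproduced across runs.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle rows with a fixed seed and return (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    # A private Random instance keeps the split independent of global state.
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    train, test = split_train_test(range(100), test_fraction=0.25)
    print(len(train), len(test))
```

In a real pipeline this step would more likely run in Spark or Hive on the full dataset; the single-node version is only meant to make the stage concrete.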
Data Scientist
• Expert in ML algorithms, models, libraries, and feature engineering.
• Needs tools and platforms to gain insight into data, build models productively, and create ML pipelines (such as data labeling, transformation, etc.).
• Mostly familiar with Python, Spark, Hive, etc.
• Not familiar with platform stuff.
What Do Data Scientists Expect? (Cont.)
Model Exploration
• Pre-process using Spark/Hive (or some small-scale alternatives).
• Experiment using a sampled dataset with notebooks (single node).
• Experiment with the full dataset to get the best results (distributed).
Reproducible Experiment
• Record parameters, code, and metrics of each experiment.
• Dependency management: code once, run everywhere.
• Easy to fine-tune parameters; AutoML.
Model Management
• Easy to manage models and push them to production.
• Model assurance and monitoring.
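The "Reproducible Experiment" bullets above ask for parameters, code, and metrics to be recorded per run. The following Python sketch shows one possible shape for such a record, using only the standard library. The function `record_experiment`, the `run.json` file name, and the field names are assumptions made for illustration; they are not Submarine's API or storage format.

```python
import hashlib
import json
import time
from pathlib import Path


def record_experiment(run_dir, params, metrics, code_path):
    """Write one JSON record holding parameters, metrics, and a code hash."""
    # Hashing the training script ties the results to the exact code used.
    code_bytes = Path(code_path).read_bytes()
    record = {
        "timestamp": time.time(),
        "params": params,
        "metrics": metrics,
        "code_sha256": hashlib.sha256(code_bytes).hexdigest(),
    }
    run_dir = Path(run_dir)
    run_dir.mkdir(parents=True, exist_ok=True)
    out_file = run_dir / "run.json"  # hypothetical file name
    out_file.write_text(json.dumps(record, indent=2, sort_keys=True))
    return out_file
```

A caller would invoke it once at the end of training, for example `record_experiment("runs/tf-job-001", {"lr": 0.01}, {"accuracy": 0.9}, "train.py")`, where the paths and values are placeholders.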
What Are Data Scientists NOT Expected to Know?
• Deep understanding of resource management system concepts in YARN/Kubernetes (how the capacity scheduler works, k8s operations, etc.)
• Compute engine tuning (memory configuration, shuffle performance)
• Nitty-gritty details of the underlying infra; it should just work
What is Submarine?
Submarine Architecture
(Architecture diagram. Component labels: Submarine Workbench; Projects; Model; Data Hub; Metric Store; Notebook; SDK; Java/Python/REST API; Mini-Submarine; Submarine Service; Compute Engine Connector; Cluster Orchestrator; Runtime.)
Submarine “Hello World”
java -cp hadoop-submarine-<version>.jar submarine.cli job run \
  --framework tensorflow \
  --name tf-job-001 \
  --input_path "hdfs://default/dataset/cifar-10-data" \
  --checkpoint_path "hdfs://default/tmp/cifar-10-jobdir" \
  --num_workers 2 \
  --worker_resources memory=8G,vcores=2,gpu=2 \
  --worker_launch_cmd "cmd for worker … " \
  --num_ps 1 \
  --ps_resources memory=4G,vcores=2,gpu=0 \
  --ps_launch_cmd "cmd for ps ..."
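For readers who would rather assemble the command above from a script, here is a minimal Python sketch. The helpers `build_job_run_args` and `parse_resources` are hypothetical and do not come from Submarine. The jar name, the `submarine.cli` entry point, and the flag names are copied from the slide as-is, and the launch command stays a placeholder because the slide elides it.

```python
import shlex


def parse_resources(spec):
    """Turn 'memory=8G,vcores=2,gpu=2' into {'memory': '8G', ...}."""
    return dict(item.split("=", 1) for item in spec.split(","))


def build_job_run_args(jar, options):
    """Return the argument list for a 'job run' call as shown on the slide.

    options maps a flag name (without the leading '--') to its value.
    """
    argv = ["java", "-cp", jar, "submarine.cli", "job", "run"]
    for flag, value in options.items():
        argv += ["--" + flag, str(value)]
    return argv


if __name__ == "__main__":
    argv = build_job_run_args(
        "hadoop-submarine-<version>.jar",  # placeholder, as on the slide
        {
            "framework": "tensorflow",
            "name": "tf-job-001",
            "num_workers": 2,
            "worker_resources": "memory=8G,vcores=2,gpu=2",
            "worker_launch_cmd": "cmd for worker ...",  # elided on the slide
        },
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(argv))
```

Building a list instead of a single string avoids quoting mistakes when the launch command itself contains spaces or quotes.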
Demo: Mini Submarine
Mini-Submarine demo
Demo: Zeppelin integration
Submarine training and inference demo
Demo: Submarine Workbench
Submarine Workbench
Current state
Current State and Future
Current State
Submarine v0.1 released after Apache Hadoop v3.2.0
Submarine v0.2 released
PyTorch support
Thin and uber jar
Added LinkedIn TonY execution runtime (Hadoop 2.7.3+ compatibility)
Zeppelin Submarine interpreter
Mini-submarine (All-In-One image)
Future
• We are working with the Hadoop/Apache community to spin off Submarine into a new Apache project.
• Some more features we are working on:
Task | JIRA | Target Version
Submarine Workbench (Web/Server) | SUBMARINE-98 / SUBMARINE-131 | 0.3.0
Submarine Kubernetes Support | SUBMARINE-154 | 0.3.0
Metrics Support (like MLflow) | TBD | TBD
Submarine Community
Development Team
Community | Members
Hadoop Community | PMC & Committer
Zeppelin Community | PMC & Committer
Cloudera | Wangda Tan, Zhankun Tang, Sunil Govind …
NetEase | Xun Liu, Quan Zhou
LinkedIn | Keqiu Hu
Alibaba | Jeff Zhang
Ke.com | Guoxian Zhao, Feng Liu, Huiyang Jian
JD | Wanqiang Ji
Dahua | Linhao Zhu
Community Use Cases
NetEase
• One of the largest online game/news/music providers in China.
• A 245-GPU cluster runs Submarine.
• One of the models built is a music recommendation model, which is invoked 1B+ times per day.
LinkedIn
• 250+ GPU machines.
• 500+ TensorFlow trainings per day.
• Serves applications in recommendation systems and NLP.
• Collaboration on runtime and SDK development.
Ke.com
• Largest online real-estate brokerage website in China.
• 50+ GPU machines (including 19 multi-V100 GPU machines), based on Hadoop trunk (3.3.0).
• Serves applications such as image/voice recognition.
Thank you!
Please join the community!
Website:
https://hadoop.apache.org/submarine/
Weekly Community Meeting:
https://docs.google.com/document/d/1XkrcyVil_ORV1UP-JhosGzK8qWGXXX3wuplo4RtC7u0/edit
Code:
https://github.com/apache/hadoop
