Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved.
August 6, 2018
李 燮鳴
Spark AI Summit 2018 Report Session
Self-Introduction
李 燮鳴 (リ ショウメイ)
March 2017: Received a Ph.D. in Engineering from the Graduate School of the University of Tsukuba
• Researched schedulers for parallel file systems
April 2017: Joined Yahoo Japan
• Since joining, in charge of DevOps for Hadoop clusters
Spark AI Summit 2018
Dates: June 4-6, 2018
Venue: Moscone Center West, San Francisco
Attendees: about 6,000
Sessions: 193 in total, with roughly 9 to 11 running in parallel
Venue (exterior)
Venue (interior)
Meals
Agenda
• Frameworks that make ML easier (2 talks)
• Spark SQL (2 talks)
• Cluster architecture (2 talks)
Frameworks That Make ML Easier
MEET UP: Horovod
• A framework that speeds up distributed training in TensorFlow
• Uses MPI allreduce to speed up averaging the gradients
Alexander Sergeev, Uber
MEET UP: Horovod (1)
Alexander Sergeev, Uber
Uber. (2018, February 19). Meet Horovod: Uber's Open Source Distributed Deep Learning Framework for TensorFlow. Retrieved July 9, 2018, from
https://eng.uber.com/horovod/
Distributed training in TensorFlow uses parameter servers to average the gradients computed by each worker
Drawback: choosing the right parameter server configuration is difficult
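The parameter-server approach described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not TensorFlow's implementation: `average_gradients` and `parameter_server_step` are hypothetical names, gradients are plain lists of floats, and the update rule is ordinary SGD.

```python
from typing import List


def average_gradients(worker_grads: List[List[float]]) -> List[float]:
    """Average, per parameter, the gradients reported by all workers."""
    num_workers = len(worker_grads)
    return [sum(column) / num_workers for column in zip(*worker_grads)]


def parameter_server_step(params: List[float],
                          worker_grads: List[List[float]],
                          learning_rate: float = 0.1) -> List[float]:
    """Apply one SGD update on the (hypothetical) parameter server."""
    avg = average_gradients(worker_grads)
    return [p - learning_rate * g for p, g in zip(params, avg)]


# Three workers, each holding gradients for the same two parameters.
grads = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
new_params = parameter_server_step([10.0, 20.0], grads, learning_rate=0.5)
```

Every worker sends its full gradient to the server and receives the updated parameters back, which is why the number and placement of parameter servers matters for throughput.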
MEET UP: Horovod (2)
Alexander Sergeev, Uber
• Horovod does not use parameter servers; instead it exchanges and averages the gradients with the ring allreduce of NCCL (NVIDIA Collective Communications Library, used together with MPI)
Uber. (2018, February 19). Meet Horovod: Uber's Open Source Distributed Deep Learning Framework for TensorFlow. Retrieved July 9, 2018, from
https://eng.uber.com/horovod/
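To make the ring allreduce idea on this slide concrete, here is a single-process simulation in plain Python. It is a sketch under simplifying assumptions rather than how Horovod or NCCL is implemented: the function name `ring_allreduce_average` is hypothetical, each worker's gradient has exactly one element (chunk) per worker, and "sending" is just a list update.

```python
from typing import List


def ring_allreduce_average(worker_grads: List[List[float]]) -> List[List[float]]:
    """Simulate ring allreduce; every worker ends with the averaged gradient.

    Assumes each gradient has exactly one element (chunk) per worker.
    """
    n = len(worker_grads)
    data = [list(g) for g in worker_grads]  # copy each worker's buffer
    assert all(len(g) == n for g in data), "one chunk per worker assumed"

    # Phase 1, scatter-reduce: after n-1 steps worker i holds the full sum
    # of chunk (i + 1) % n.
    for step in range(n - 1):
        # Snapshot what each worker sends so all sends happen "simultaneously".
        sends = [(i, (i - step) % n, data[i][(i - step) % n]) for i in range(n)]
        for src, chunk, value in sends:
            data[(src + 1) % n][chunk] += value

    # Phase 2, allgather: circulate the finished chunks around the ring.
    for step in range(n - 1):
        sends = [(i, (i + 1 - step) % n, data[i][(i + 1 - step) % n]) for i in range(n)]
        for src, chunk, value in sends:
            data[(src + 1) % n][chunk] = value

    # Divide by the worker count to turn sums into averages.
    return [[x / n for x in buf] for buf in data]


result = ring_allreduce_average([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
```

Each worker only ever talks to its neighbor in the ring and sends one chunk per step, so no single node has to receive every worker's full gradient as a parameter server does.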
MEET UP: Horovod (3)
Alexander Sergeev, Uber
• In an evaluation using the official TensorFlow benchmarks, roughly a 2x performance improvement was observed
Uber. (2018, February 19). Meet Horovod: Uber's Open Source Distributed Deep Learning Framework for TensorFlow. Retrieved July 9, 2018, from
https://eng.uber.com/horovod/
MEET UP: Horovod (4)
Alexander Sergeev, Uber
• Using RDMA (Remote Direct Memory Access) over InfiniBand improved performance further
Uber. (2018, February 19). Meet Horovod: Uber's Open Source Distributed Deep Learning Framework for TensorFlow. Retrieved July 9, 2018, from
https://eng.uber.com/horovod/
KEYNOTE: Hydrogen
Reynold Xin, Databricks
A proposal to let DL frameworks run efficiently on Spark
• SPIP = Spark Project Improvement Proposal
• At present the design sketch is complete (design 15% done, SPARK-24374)
KEYNOTE: Hydrogen (1)
Reynold Xin, Databricks
Databricks. Project Hydrogen: Unifying State-of-the-art AI and Big Data in Apache Spark. Retrieved July 9, 2018, from https://databricks.com/session/databricks-keynote-2
KEYNOTE: Hydrogen (2)
Reynold Xin, Databricks
Databricks. Project Hydrogen: Unifying State-of-the-art AI and Big Data in Apache Spark. Retrieved July 9, 2018, from https://databricks.com/session/databricks-keynote-2
Spark SQL
Deep Dive into Spark SQL with Advanced Performance Tuning
Xiao Li, Wenchen Fan, Databricks
Introduced parameter tuning techniques that can be applied at each stage of Spark SQL, from query to execution
Databricks. (2018, June 20). Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao L... Retrieved July 10, 2018, from https://www.slideshare.net/databricks/deep-dive-into-spark-sql-with-advanced-performance-tuning-with-xiao-li-wenchen-fan
Spark SQL Adaptive Execution
Carson Wang, Intel, Yuanjian Li, Baidu
Makes Spark SQL execution more efficient by adjusting it at runtime
• Determines the optimal number of reducers at runtime
• Determines the appropriate join strategy at runtime
• Used in production at Baidu (SPARK-23128)
Spark SQL Adaptive Execution (1)
Carson Wang, Intel, Yuanjian Li, Baidu
Tuning the number of reducers
• Too few: spill, OOM
• Too many: scheduling overhead, more I/O requests, too many small output files
• It is hard to specify a number that suits every stage
Spark SQL Adaptive Execution (2)
Carson Wang, Intel, Yuanjian Li, Baidu
ShuffledRowRDD
• Partition 0 (70MB)
• Partition 1 (30MB)
• Partition 2 (20MB)
• Partition 3 (10MB)
• Partition 4 (50MB)
Target Size per Reducer = 64MB, Min-Max Shuffle Partition Number = 1 to 5
Partitions 1 to 3 are handled by a single reducer because 30+20+10 < 64MB
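The partition merging shown on this slide can be reproduced with a small greedy pass over adjacent partitions. This is a hedged sketch, not Spark's actual code: `coalesce_partitions` is a hypothetical name, sizes are given in MB as on the slide, and the minimum/maximum partition-count setting is not modelled.

```python
from typing import List


def coalesce_partitions(sizes_mb: List[int], target_mb: int = 64) -> List[List[int]]:
    """Group adjacent shuffle partitions so each group stays within target_mb.

    A partition that alone exceeds the target still gets its own reducer.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for index, size in enumerate(sizes_mb):
        # Close the current group if adding this partition would exceed the target.
        if current and current_size + size > target_mb:
            groups.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        groups.append(current)
    return groups


# Partition sizes from the slide: 70, 30, 20, 10 and 50 MB.
reducers = coalesce_partitions([70, 30, 20, 10, 50], target_mb=64)
```

With the slide's numbers, five shuffle partitions end up on three reducers: partition 0 alone, partitions 1 to 3 together, and partition 4 alone.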
Spark SQL Adaptive Execution (3)
Carson Wang, Intel, Yuanjian Li, Baidu
SQL Query → Logical Plan → Optimized Logical Plan → Multiple Physical Plans → Selected Physical Plan (evaluated with a cost model)
Example physical plan: Join2, Join1, Exchange1 (T1), Exchange2 (T2), Exchange3 (T3)
• A suboptimal join may be selected
• The planner's estimates can differ greatly from the actual values
Spark SQL Adaptive Execution (4)
Carson Wang, Intel, Yuanjian Li, Baidu
The plan (Join2, Join1, Exchange1 (T1), Exchange2 (T2), Exchange3 (T3)) is split into query stages:
• QueryStage1: Exchange1, T1
• QueryStage2: Exchange2, T2
• QueryStage3: Exchange3, T3
• QueryStage4: Join2, Join1, QueryStageInput1, QueryStageInput2, QueryStageInput3
The actual values, rather than the planner's estimates, can be observed
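Once the actual output size of each child query stage is known, the join strategy can be reconsidered. The following is a minimal sketch of that decision, not Spark's planner: `StageStats`, `choose_join` and the sizes are hypothetical, and the 10 MB threshold is an assumption for the example.

```python
from dataclasses import dataclass

BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024  # assumed threshold for this sketch


@dataclass
class StageStats:
    """Output size of a child query stage, in bytes."""
    name: str
    size_bytes: int


def choose_join(left: StageStats, right: StageStats,
                threshold: int = BROADCAST_THRESHOLD_BYTES) -> str:
    """Pick a join strategy from input sizes; broadcast if either side is small."""
    if min(left.size_bytes, right.size_bytes) <= threshold:
        return "BroadcastJoin"
    return "SortMergeJoin"


# Plan time: the estimate for T2 is large, so a sort-merge join is chosen.
estimated = choose_join(StageStats("T1", 500 << 20), StageStats("T2", 200 << 20))
# Run time: the stage that read T2 actually produced only 5 MB.
actual = choose_join(StageStats("T1", 500 << 20), StageStats("T2", 5 << 20))
```

The same function yields a different answer once it is fed measured sizes instead of estimates, which is the point of deferring the decision until the child stages have finished.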
Spark SQL Adaptive Execution (5)
Carson Wang, Intel, Yuanjian Li, Baidu
Results of applying Adaptive Execution at Baidu
• SortMergeJoin was changed to BroadcastJoin, and a performance improvement of 50% to 200% was confirmed
• For jobs running longer than one hour, an appropriate number of reducers was set, and a performance improvement of 50% to 100% was confirmed
Cluster Architecture
Taking Advantage of a Disaggregated Storage and Compute Architecture
Brian Cho, Facebook
• Introduced an architecture that separates data from compute
• Spark optimizations for an architecture that separates data from compute
1. Defining a file interface
2. Optimizing access to Spark temporary files
3. Optimizing Spark shuffle
Taking Advantage of a Disaggregated Storage and Compute Architecture (1)
Brian Cho, Facebook
Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from
https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
Taking Advantage of a Disaggregated Storage and Compute Architecture (2)
Brian Cho, Facebook
Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from
https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
Taking Advantage of a Disaggregated Storage and Compute Architecture (3)
Brian Cho, Facebook
Benefits of an architecture that separates data from compute
• Servers suited to data and servers suited to compute can be procured separately
• Capacity planning is simpler
• Each side can be maintained by its own team
Taking Advantage of a Disaggregated Storage and Compute Architecture (4)
Brian Cho, Facebook
Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from
https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
Taking Advantage of a Disaggregated Storage and Compute Architecture (5)
Brian Cho, Facebook
Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from
https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
Taking Advantage of a Disaggregated Storage and Compute Architecture (5)
Brian Cho, Facebook
Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from
https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
Taking Advantage of a Disaggregated Storage and Compute Architecture (6)
Brian Cho, Facebook
Executor Executor Executor
ESS ESS ESS
Local FS Local FS Local FS
Access here is local
Compute
Taking Advantage of a Disaggregated Storage and Compute Architecture (7)
Brian Cho, Facebook
Executor Executor Executor
ESS ESS ESS
Warm Storage
Access here is remote
*Network Transfer
Compute
Storage
Taking Advantage of a Disaggregated Storage and Compute Architecture (8)
Brian Cho, Facebook
Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from
https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
[Figure labels: Index, shuffle]
Apache Spark on Kubernetes Clusters
Sean Suchter, PepperData, Anirudh Ramanathan, Google
• Gave an overview of Kubernetes, the implementation of Spark on Kubernetes, and future plans
• The Spark driver is implemented as a Kubernetes custom controller
• Features to be added in the future (selected)
  • PySpark: SPARK-23984
  • Dynamic Allocation: SPARK-24432
  • Driver HA
Apache Spark on Kubernetes Clusters (1)
Sean Suchter, PepperData, Anirudh Ramanathan, Google
Databricks. Apache Spark on Kubernetes Clusters. Retrieved July 11, 2018, from https://databricks.com/session/apache-spark-on-kubernetes-clusters
Apache Spark on Kubernetes Clusters (2)
Sean Suchter, PepperData, Anirudh Ramanathan, Google
Databricks. Apache Spark on Kubernetes Clusters. Retrieved July 11, 2018, from https://databricks.com/session/apache-spark-on-kubernetes-clusters
Apache Spark on Kubernetes Clusters (3)
Sean Suchter, PepperData, Anirudh Ramanathan, Google
# Submit the SparkPi example to a Kubernetes cluster in cluster deploy mode
bin/spark-submit \
  --master k8s://<server:port> \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=5 \
  --conf spark.kubernetes.container.image=<spark-image> \
  local:///path/to/examples.jar
• Users submit jobs in almost the same way as before
Apache Spark on Kubernetes Clusters (4)
Sean Suchter, PepperData, Anirudh Ramanathan, Google
Databricks. Apache Spark on Kubernetes Clusters. Retrieved July 11, 2018, from https://databricks.com/session/apache-spark-on-kubernetes-clusters
• Spark on Kubernetes Roadmap
EOP
Backup Slides
KEYNOTE: MLflow
Matei Zaharia, Databricks
• A framework for managing the machine learning lifecycle on Spark
• The talk named three things that make ML on Spark difficult and offered a solution for each
KEYNOTE: MLflow (1)
Matei Zaharia, Databricks
Databricks. Project Hydrogen: Unifying State-of-the-art AI and Big Data in Apache Spark. Retrieved July 9, 2018, from
https://databricks.com/session/unifying-data-and-ai-for-better-data-products
KEYNOTE: MLflow (2)
Matei Zaharia, Databricks
Databricks. Project Hydrogen: Unifying State-of-the-art AI and Big Data in Apache Spark. Retrieved July 9, 2018, from
https://databricks.com/session/unifying-data-and-ai-for-better-data-products
KEYNOTE: MLflow (3)
Matei Zaharia, Databricks
import mlflow
from sys import argv
from sklearn.linear_model import ElasticNet

# load_data and eval_metrics are helper functions that are not shown on the slide.
def main():
    # Hyperparameters come from the command line and default to 0.
    alpha = float(argv[1]) if len(argv) > 1 else 0
    l1_ratio = float(argv[2]) if len(argv) > 2 else 0
    (x_train, y_train) = load_data("train.parquet")
    (x_test, y_test) = load_data("test.parquet")
    print("Using parameter alpha=%.1f l1_ratio=%.1f" % (alpha, l1_ratio))
    # Record the hyperparameters of this run.
    mlflow.log_param("alpha", alpha)
    mlflow.log_param("l1", l1_ratio)
    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
    model.fit(x_train, y_train)
    y_pred = model.predict(x_test)
    (mae, rmse, r2) = eval_metrics(y_test, y_pred)
    # Record the evaluation metrics of this run.
    mlflow.log_metric("MAE", mae)
    print("MAE", mae)
    mlflow.log_metric("RMSE", rmse)
    print("RMSE", rmse)
    mlflow.log_metric("R2", r2)
    print("R2", r2)
MLflow Tracking
KEYNOTE: MLflow (4)
Matei Zaharia, Databricks

モブデザインによる多職種チームのコミュニケーション改善 #yjtcモブデザインによる多職種チームのコミュニケーション改善 #yjtc
モブデザインによる多職種チームのコミュニケーション改善 #yjtc
 
「新しいおうち探し」のためのAIアシスト検索 #yjtc
「新しいおうち探し」のためのAIアシスト検索 #yjtc「新しいおうち探し」のためのAIアシスト検索 #yjtc
「新しいおうち探し」のためのAIアシスト検索 #yjtc
 
ユーザーの地域を考慮した検索入力補助機能の改善の試み #yjtc
ユーザーの地域を考慮した検索入力補助機能の改善の試み #yjtcユーザーの地域を考慮した検索入力補助機能の改善の試み #yjtc
ユーザーの地域を考慮した検索入力補助機能の改善の試み #yjtc
 

Último

MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 

Último (20)

MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

SAIS/DWS2018 Report Meeting #saisdws2018

  • 1. Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. August 6, 2018. 李 燮鳴. Spark AI Summit 2018 Report Meeting
  • 2. Self-introduction. 李 燮鳴 (リ ショウメイ). Received a Ph.D. in Engineering from the Graduate School of the University of Tsukuba in March 2017 (research on schedulers for parallel file systems). Joined Yahoo Japan in April 2017 and has been responsible for DevOps of Hadoop clusters since joining.
  • 3. Spark AI Summit 2018. Dates: 2018/06/04 to 2018/06/06. Venue: San Francisco Moscone Center West. Attendees: about 6,000. Sessions: roughly 9 to 11 presented in parallel, 193 sessions in total.
  • 4. Venue (exterior)
  • 5. Venue (interior)
  • 6. Meals
  • 7. Agenda. Frameworks that make ML more convenient (2 talks). Spark SQL (2 talks). Cluster architecture (2 talks).
  • 8. Frameworks that make ML more convenient
  • 9. MEET UP: Horovod. Alexander Sergeev, Uber. A framework that speeds up distributed training in TensorFlow. It uses MPI-style all-reduce to speed up computing the average of the gradients.
  • 10. MEET UP: Horovod (1). Alexander Sergeev, Uber. Distributed training in TensorFlow uses parameter servers to compute the average of the gradients obtained by each worker. Drawback: choosing the parameter server configuration is difficult. Uber. (2018, February 19). Meet Horovod: Uber's Open Source Distributed Deep Learning Framework for TensorFlow. Retrieved July 9, 2018, from https://eng.uber.com/horovod/
  • 11. MEET UP: Horovod (2). Alexander Sergeev, Uber. Horovod does not use a parameter server. Gradients are exchanged and averaged with the ring all-reduce of NCCL (NVIDIA Collective Communications Library), with MPI used alongside it to coordinate the worker processes. Uber. (2018, February 19). Meet Horovod: Uber's Open Source Distributed Deep Learning Framework for TensorFlow. Retrieved July 9, 2018, from https://eng.uber.com/horovod/
  • 12. MEET UP: Horovod (3). Alexander Sergeev, Uber. In an evaluation with the official TensorFlow benchmarks, roughly a 2x performance improvement was observed. Uber. (2018, February 19). Meet Horovod: Uber's Open Source Distributed Deep Learning Framework for TensorFlow. Retrieved July 9, 2018, from https://eng.uber.com/horovod/
  • 13. MEET UP: Horovod (4). Alexander Sergeev, Uber. Using RDMA (Remote Direct Memory Access) over InfiniBand improved performance further. Uber. (2018, February 19). Meet Horovod: Uber's Open Source Distributed Deep Learning Framework for TensorFlow. Retrieved July 9, 2018, from https://eng.uber.com/horovod/
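The ring all-reduce mentioned on the Horovod slides can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the function name `ring_allreduce_mean` is hypothetical, workers are modelled as plain Python lists, and the gradient length is assumed to be divisible by the number of workers. It is not Horovod's or NCCL's implementation.

```python
from typing import List


def ring_allreduce_mean(worker_grads: List[List[float]]) -> List[List[float]]:
    """Simulate ring all-reduce and return every worker's averaged gradient."""
    n = len(worker_grads)
    size = len(worker_grads[0])
    if size % n != 0:
        raise ValueError("sketch assumes gradient length divisible by worker count")
    chunk = size // n
    bufs = [list(g) for g in worker_grads]  # one private buffer per worker

    def span(c: int) -> range:
        # Index range of chunk c inside a gradient buffer.
        return range(c * chunk, (c + 1) * chunk)

    # Phase 1 (scatter-reduce): in step s, worker r sends chunk (r - s) mod n
    # to its right neighbour, which adds it to its own copy of that chunk.
    for s in range(n - 1):
        sends = [(r, (r - s) % n) for r in range(n)]
        payloads = [[bufs[r][i] for i in span(c)] for r, c in sends]
        for (r, c), data in zip(sends, payloads):
            dst = (r + 1) % n
            for k, i in enumerate(span(c)):
                bufs[dst][i] += data[k]

    # Worker r now holds the complete sum for chunk (r + 1) mod n.
    # Phase 2 (all-gather): completed chunks travel around the ring and
    # overwrite the stale partial sums on each neighbour.
    for s in range(n - 1):
        sends = [(r, (r + 1 - s) % n) for r in range(n)]
        payloads = [[bufs[r][i] for i in span(c)] for r, c in sends]
        for (r, c), data in zip(sends, payloads):
            dst = (r + 1) % n
            for k, i in enumerate(span(c)):
                bufs[dst][i] = data[k]

    # Divide the sums by the worker count to obtain the average gradient.
    return [[v / n for v in b] for b in bufs]
```

Each worker only ever talks to its right neighbour and sends 2(n-1) chunks in total, which is why no central parameter server is needed in this scheme.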
  • 14. KEYNOTE: Hydrogen. Reynold Xin, Databricks. A proposal to let deep learning frameworks run efficiently on Spark. SPIP = Spark Project Improvement Proposal. At the time of the talk the design sketch was complete (design about 15% done, SPARK-24374).
  • 15. KEYNOTE: Hydrogen (1). Reynold Xin, Databricks. Databricks. Project Hydrogen: Unifying State-of-the-art AI and Big Data in Apache Spark. Retrieved July 9, 2018, from https://databricks.com/session/databricks-keynote-2
  • 16. KEYNOTE: Hydrogen (2). Reynold Xin, Databricks. Databricks. Project Hydrogen: Unifying State-of-the-art AI and Big Data in Apache Spark. Retrieved July 9, 2018, from https://databricks.com/session/databricks-keynote-2
  • 17. Spark SQL
  • 18. Deep Dive into Spark SQL with Advanced Performance Tuning. Xiao Li, Wenchen Fan, Databricks. The talk introduced parameter-tuning techniques that can be applied at each stage between a Spark SQL query being submitted and being executed. Databricks. (2018, June 20). Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao L... Retrieved July 10, 2018, from https://www.slideshare.net/databricks/deep-dive-into-spark-sql-with-advanced-performance-tuning-with-xiao-li-wenchen-fan
  • 19. Deep Dive into Spark SQL with Advanced Performance Tuning. Xiao Li, Wenchen Fan, Databricks. Databricks. (2018, June 20). Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao L... Retrieved July 10, 2018, from https://www.slideshare.net/databricks/deep-dive-into-spark-sql-with-advanced-performance-tuning-with-xiao-li-wenchen-fan
  • 20. Deep Dive into Spark SQL with Advanced Performance Tuning. Xiao Li, Wenchen Fan, Databricks. Databricks. (2018, June 20). Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao L... Retrieved July 10, 2018, from https://www.slideshare.net/databricks/deep-dive-into-spark-sql-with-advanced-performance-tuning-with-xiao-li-wenchen-fan
  • 21. Deep Dive into Spark SQL with Advanced Performance Tuning. Xiao Li, Wenchen Fan, Databricks. Databricks. (2018, June 20). Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao L... Retrieved July 10, 2018, from https://www.slideshare.net/databricks/deep-dive-into-spark-sql-with-advanced-performance-tuning-with-xiao-li-wenchen-fan
  • 22. Deep Dive into Spark SQL with Advanced Performance Tuning. Xiao Li, Wenchen Fan, Databricks. The talk introduced parameter-tuning techniques that can be applied at each stage between a Spark SQL query being submitted and being executed. Databricks. (2018, June 20). Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao L... Retrieved July 10, 2018, from https://www.slideshare.net/databricks/deep-dive-into-spark-sql-with-advanced-performance-tuning-with-xiao-li-wenchen-fan
  • 23. Spark SQL Adaptive Execution. Carson Wang, Intel, Yuanjian Li, Baidu. Makes Spark SQL execution more efficient by changing the plan at runtime. The optimal number of reducers is decided at runtime. The appropriate join strategy is decided at runtime. Used in production at Baidu (SPARK-23128).
  • 24. Spark SQL Adaptive Execution (1). Carson Wang, Intel, Yuanjian Li, Baidu. Tuning the number of reducers. Too few: spill, OOM. Too many: scheduling overhead, more I/O requests, too many small output files. It is hard to specify one number that suits every stage.
  • 25. Spark SQL Adaptive Execution (2). Carson Wang, Intel, Yuanjian Li, Baidu. Diagram: a ShuffledRowRDD with Partition 0 (70MB), Partition 1 (30MB), Partition 2 (20MB), Partition 3 (10MB), and Partition 4 (50MB), shown twice. Target size per reducer = 64MB; min-max shuffle partition number = 1 to 5. Diagram note: 30+20+10 < 64MB, so Partitions 1 to 3 together fit within one reducer's target size.
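The reducer-count decision on slide 25 can be sketched as a greedy packing of adjacent shuffle partitions. The function name `coalesce_partitions` is hypothetical, and the sketch ignores the min-max partition bounds shown on the slide; it is an illustration of the idea, not Spark's implementation.

```python
from typing import List


def coalesce_partitions(sizes_mb: List[int], target_mb: int) -> List[List[int]]:
    """Group adjacent partition indices so each group stays within target_mb.

    A single partition larger than the target still gets its own group.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for idx, size in enumerate(sizes_mb):
        # Close the current group if adding this partition would exceed the target.
        if current and current_size + size > target_mb:
            groups.append(current)
            current, current_size = [], 0
        current.append(idx)
        current_size += size
    if current:
        groups.append(current)
    return groups


# Partition sizes and target taken from the slide.
print(coalesce_partitions([70, 30, 20, 10, 50], 64))
```

With the slide's numbers, the five shuffle partitions are handled by three reducers: Partition 0 alone, Partitions 1 to 3 together, and Partition 4 alone.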
  • 26. Spark SQL Adaptive Execution (3). Carson Wang, Intel, Yuanjian Li, Baidu. Diagram: SQL Query, Logical Plan, Optimized Logical Plan, Multiple Physical Plans, Selected Physical Plan (evaluated with a cost model). Plan nodes: Join2, Join1, Exchange1 (T1), Exchange2 (T2), Exchange3 (T3). Notes: a suboptimal join may be selected, because the planner's estimates can differ greatly from the actual values.
  • 27. Spark SQL Adaptive Execution (4). Carson Wang, Intel, Yuanjian Li, Baidu. Diagram: the plan (Join2, Join1, Exchange1/T1, Exchange2/T2, Exchange3/T3) is divided into query stages: QueryStage1 (Exchange1, T1), QueryStage2 (Exchange2, T2), QueryStage3 (Exchange3, T3), and QueryStage4 (Join2, Join1, QueryStage Input1, QueryStage Input2, QueryStage Input3). Note: the actual values become known.
  • 28. Spark SQL Adaptive Execution (5). Carson Wang, Intel, Yuanjian Li, Baidu. Results of applying Adaptive Execution at Baidu. SortMergeJoin was changed to BroadcastJoin, giving a 50% to 200% performance improvement. For jobs that ran for more than an hour, an appropriate number of reducers was chosen, giving a 50% to 100% performance improvement.
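The switch from SortMergeJoin to BroadcastJoin described above can be sketched as a size check that is repeated once the real shuffle output size is known. The function `choose_join` and the example sizes are hypothetical; the 10MB threshold mirrors the default of Spark's `spark.sql.autoBroadcastJoinThreshold`, but the logic here is a simplification rather than Spark's planner.

```python
BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024  # assumed threshold for this sketch


def choose_join(left_bytes: int, right_bytes: int,
                threshold: int = BROADCAST_THRESHOLD_BYTES) -> str:
    """Pick a join strategy from the sizes of the two join inputs."""
    # Broadcasting is only attractive when the smaller side fits under the threshold.
    if min(left_bytes, right_bytes) <= threshold:
        return "BroadcastJoin"
    return "SortMergeJoin"


MB = 1024 * 1024
# Planning time: the estimate for the filtered table is far too large (hypothetical).
planned = choose_join(left_bytes=2000 * MB, right_bytes=500 * MB)
# Runtime: the completed query stage reports the real size (hypothetical).
adapted = choose_join(left_bytes=2000 * MB, right_bytes=8 * MB)
print(planned, "->", adapted)
```

Re-running the same decision with measured sizes instead of estimates is the essence of the adaptive approach: the rule does not change, only the quality of its inputs.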
  • 29. Cluster architecture
  • 30. Taking Advantage of a Disaggregated Storage and Compute Architecture. Brian Cho, Facebook. Introduces an architecture that separates data from compute. Spark optimizations for that architecture: 1. defining a File interface; 2. optimizing access to Spark temporary files; 3. optimizing Spark shuffle.
  • 31. Taking Advantage of a Disaggregated Storage and Compute Architecture (1). Brian Cho, Facebook. Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
  • 32. Taking Advantage of a Disaggregated Storage and Compute Architecture (2). Brian Cho, Facebook. Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
  • 33. Taking Advantage of a Disaggregated Storage and Compute Architecture (3). Brian Cho, Facebook. Benefits of separating data from compute. Servers suited to storage and servers suited to compute can be procured separately. Capacity planning is simpler. Each side can be maintained by its own team.
  • 34. Taking Advantage of a Disaggregated Storage and Compute Architecture (4). Brian Cho, Facebook. Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
  • 35. Taking Advantage of a Disaggregated Storage and Compute Architecture (5). Brian Cho, Facebook. Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
  • 36. Taking Advantage of a Disaggregated Storage and Compute Architecture (5). Brian Cho, Facebook. Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
  • 37. Taking Advantage of a Disaggregated Storage and Compute Architecture (6). Brian Cho, Facebook. Diagram: three Executors, each with an ESS and a Local FS, all on the compute side. Note: access here is local.
  • 38. Taking Advantage of a Disaggregated Storage and Compute Architecture (7). Brian Cho, Facebook. Diagram: three Executors, each with an ESS, on the compute side; Warm Storage on the storage side. Note: access here is remote (*network transfer).
  • 39. Taking Advantage of a Disaggregated Storage and Compute Architecture (8). Brian Cho, Facebook. Diagram labels: Index, shuffle, shuffle, shuffle, Index. Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
  • 40. Apache Spark on Kubernetes Clusters. Sean Suchter, PepperData, Anirudh Ramanathan, Google. The talk gave an overview of Kubernetes, the implementation of Spark on Kubernetes, and future plans. The Spark driver is implemented as a Kubernetes custom controller. Features to be added in the future (selection): PySpark: SPARK-23984; Dynamic Allocation: SPARK-24432; Driver HA.
  • 41. Apache Spark on Kubernetes Clusters (1). Sean Suchter, PepperData, Anirudh Ramanathan, Google. Databricks. Apache Spark on Kubernetes Clusters. Retrieved July 11, 2018, from https://databricks.com/session/apache-spark-on-kubernetes-clusters
  • 42. Apache Spark on Kubernetes Clusters (2). Sean Suchter, PepperData, Anirudh Ramanathan, Google. Databricks. Apache Spark on Kubernetes Clusters. Retrieved July 11, 2018, from https://databricks.com/session/apache-spark-on-kubernetes-clusters
  • 43. Apache Spark on Kubernetes Clusters (3). Sean Suchter, PepperData, Anirudh Ramanathan, Google. Users submit jobs in almost the same way as before:
      bin/spark-submit \
        --master k8s://<server:port> \
        --deploy-mode cluster \
        --name spark-pi \
        --class org.apache.spark.examples.SparkPi \
        --conf spark.executor.instances=5 \
        --conf spark.kubernetes.container.image=<spark-image> \
        local:///path/to/examples.jar
  • 44. Apache Spark on Kubernetes Clusters (4). Sean Suchter, PepperData, Anirudh Ramanathan, Google. Spark on Kubernetes roadmap. Databricks. Apache Spark on Kubernetes Clusters. Retrieved July 11, 2018, from https://databricks.com/session/apache-spark-on-kubernetes-clusters
  • 45. EOP
  • 46. Backup slides
  • 47. KEYNOTE: MLflow. Matei Zaharia, Databricks. A framework for managing the machine learning lifecycle on Spark. The keynote named three points that make ML on Spark difficult and offered a solution for each.
  • 48. KEYNOTE: MLflow (1). Matei Zaharia, Databricks. Databricks. Project Hydrogen: Unifying State-of-the-art AI and Big Data in Apache Spark. Retrieved July 9, 2018, from https://databricks.com/session/unifying-data-and-ai-for-better-data-products
  • 49. KEYNOTE: MLflow (2). Matei Zaharia, Databricks. Databricks. Project Hydrogen: Unifying State-of-the-art AI and Big Data in Apache Spark. Retrieved July 9, 2018, from https://databricks.com/session/unifying-data-and-ai-for-better-data-products
  • 50. KEYNOTE: MLflow (3). Matei Zaharia, Databricks. MLflow Tracking:
      from sys import argv

      import mlflow
      from sklearn.linear_model import ElasticNet

      # load_data and eval_metrics are helpers that the slide does not show.
      def main():
          # Hyperparameters come from the command line and default to 0.
          alpha = float(argv[1]) if len(argv) > 1 else 0
          l1_ratio = float(argv[2]) if len(argv) > 2 else 0
          (x_train, y_train) = load_data("train.parquet")
          (x_test, y_test) = load_data("test.parquet")
          print("Using parameter alpha=%.1f l1_ratio=%.1f" % (alpha, l1_ratio))
          # Record the hyperparameters of this run.
          mlflow.log_param("alpha", alpha)
          mlflow.log_param("l1", l1_ratio)
          model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
          model.fit(x_train, y_train)
          y_pred = model.predict(x_test)
          (mae, rmse, r2) = eval_metrics(y_test, y_pred)
          # Record the evaluation metrics of this run.
          mlflow.log_metric("MAE", mae)
          print("MAE", mae)
          mlflow.log_metric("RMSE", rmse)
          print("RMSE", rmse)
          mlflow.log_metric("R2", r2)
          print("R2", r2)
  • 51. KEYNOTE: MLflow (4). Matei Zaharia, Databricks.
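The MLflow Tracking code on slide 50 calls an `eval_metrics` helper that the slide does not show. One plausible shape for it, assuming it returns MAE, RMSE, and R2 in that order for two equal-length sequences of numbers, is sketched below in plain Python. This is a guess at the helper's behaviour, not the presenter's code, and it assumes the actual values are not all identical (otherwise R2 is undefined).

```python
import math
from typing import Sequence, Tuple


def eval_metrics(actual: Sequence[float], pred: Sequence[float]) -> Tuple[float, float, float]:
    """Return (MAE, RMSE, R2) for predictions against actual values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, pred)]
    # Mean absolute error.
    mae = sum(abs(e) for e in errors) / n
    # Root mean squared error.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    # Coefficient of determination: 1 - residual variance / total variance.
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2
```

Returning the three values as a tuple matches the way the slide unpacks them into `mae`, `rmse`, and `r2` before passing each to `mlflow.log_metric`.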