SlideShare uma empresa Scribd logo
1 de 10
Baixar para ler offline
Chainer feature introduction
Training and dataset abstraction
5th Oct. 2016
GTC Japan @ Tokyo
Preferred Networks, Inc.
Kenta Oono
oono@preferred.jp
Trainer and Dataset abstraction
• New feature from v1.11.0
ü Free users from implementing training loops by themselves.
ü Support most of typical training procedures.
ü Easy to customize and extend.
• Note: We can also write manually training loops without this feature, as we
did in the examples of the previous versions.
2
Target Link
Dataset
Optimizer
Iterator
Main modules
• Dataset, Iterator: extract mini batches by iterating over datasets
• Trainer, Updater, Extension: customize the training loop with low cost
• Reporter: to collect statistics from inside of the models
3
Trainer
Extension
Extension
Extension
Updater Optimizer
Optimizer
Target Link
Target Link
Iterator
Iterator
Dataset
Dataset
We often use only one
optimizer and one
dataset. This diagram
shows a general case.
MNIST classification by
MLP with Trainer
class MLP(Link):
def __int__(self):
super(MLP, self).__init__(
l1=Linear(784, 1000),
l2=Linear(1000, 1000),
l3=Linear(1000, 10))
def __call__(x):
h1 = F.relu(self.l1(x))
h2 = F.relu(self.l2(l1))
return self.l3(h2)
Linear l1
x
W bias
5
ReLU
Linear l2
h1
W bias
ReLU
Linear l3
h2
W bias
4
# Prepare datasets and their iterators
train, test = get_mnist()
train_iter = SerialIterator(train, 128)
test_iter = SerialIterator(test, 128, repeat=False,
shuffle=False)
# Prepare links and their optimizers
model = L.Classifier(MLP())
optimizer = Adam()
optimizer.setup(model)
# Prepare trainer
updater = StandardUpdater(train_iter, optimizer)
trainer = Trainer(updater, (10, 'epoch'))
5
# Add extensions to augment trainer
trainer.extend(Evaluator(test_iter, model))
trainer.extend(dump_graph('main/loss'))
trainer.extend(snapshot())
trainer.extend(LogReport())
trainer.extend(PrintReport(
'epoch', 'main/accuracy',
'validation/main/accuracy']))
trainer.extend(ProgressBar())
# Execute
trainer.run()
6
Pseudo code of training loop abstraction
For each extension e:
Invoke e if specified
Until stop_trigger is fired:
Invoke updater
for each extension e:
if e’s trigger is fired:
Invoke e
For each extension e:
Finalize e
Finalize updater
7
• Trainer has stop trigger to determine
when to stop the training loop
• Each extension have a trigger to
determine when to invoke
8
Trainer-related modules
• Updater
– Fetch a mini-batch using Iterator, and update parameters using
Optimizer
– You can customize the update routine
– Built-in updater: StandardUpdater, ParallelUpdater
• Extension
– It adds an extra routine to the training loop
– Basic extensions are built-in:
Evaluator, LogReport, PrintReport, ProgressBar
snapshot, snapshot_object, ExponentialDecay,
LinearShift, dump_graph
– You can write your own extensions
9
Dataset-related modules
• Dataset is just a sequence of data points (a.k.a. examples)
• Iterator defines how to iterate over the dataset
• Built-in iterators:
– SerialIterator
– MultiprocessIterator
10

Mais conteúdo relacionado

Mais procurados

Chainer v2 and future dev plan
Chainer v2 and future dev planChainer v2 and future dev plan
Chainer v2 and future dev planSeiya Tokui
 
Chainer Update v1.8.0 -> v1.10.0+
Chainer Update v1.8.0 -> v1.10.0+Chainer Update v1.8.0 -> v1.10.0+
Chainer Update v1.8.0 -> v1.10.0+Seiya Tokui
 
Tokyo Webmining Talk1
Tokyo Webmining Talk1Tokyo Webmining Talk1
Tokyo Webmining Talk1Kenta Oono
 
Introduction to Chainer: A Flexible Framework for Deep Learning
Introduction to Chainer: A Flexible Framework for Deep LearningIntroduction to Chainer: A Flexible Framework for Deep Learning
Introduction to Chainer: A Flexible Framework for Deep LearningSeiya Tokui
 
Chainer ui v0.3 and imagereport
Chainer ui v0.3 and imagereportChainer ui v0.3 and imagereport
Chainer ui v0.3 and imagereportPreferred Networks
 
Comparison of deep learning frameworks from a viewpoint of double backpropaga...
Comparison of deep learning frameworks from a viewpoint of double backpropaga...Comparison of deep learning frameworks from a viewpoint of double backpropaga...
Comparison of deep learning frameworks from a viewpoint of double backpropaga...Kenta Oono
 
IIBMP2019 講演資料「オープンソースで始める深層学習」
IIBMP2019 講演資料「オープンソースで始める深層学習」IIBMP2019 講演資料「オープンソースで始める深層学習」
IIBMP2019 講演資料「オープンソースで始める深層学習」Preferred Networks
 
CuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPUCuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPUShohei Hido
 
Distributed implementation of a lstm on spark and tensorflow
Distributed implementation of a lstm on spark and tensorflowDistributed implementation of a lstm on spark and tensorflow
Distributed implementation of a lstm on spark and tensorflowEmanuel Di Nardo
 
Deep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryDeep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryKenta Oono
 
FCN-Based 6D Robotic Grasping for Arbitrary Placed Objects
FCN-Based 6D Robotic Grasping for Arbitrary Placed ObjectsFCN-Based 6D Robotic Grasping for Arbitrary Placed Objects
FCN-Based 6D Robotic Grasping for Arbitrary Placed ObjectsKusano Hitoshi
 
An Introduction to TensorFlow architecture
An Introduction to TensorFlow architectureAn Introduction to TensorFlow architecture
An Introduction to TensorFlow architectureMani Goswami
 
Introduction to Chainer
Introduction to ChainerIntroduction to Chainer
Introduction to ChainerShunta Saito
 
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...MLconf
 
Explore Deep Learning Architecture using Tensorflow 2.0 now! Part 2
Explore Deep Learning Architecture using Tensorflow 2.0 now! Part 2Explore Deep Learning Architecture using Tensorflow 2.0 now! Part 2
Explore Deep Learning Architecture using Tensorflow 2.0 now! Part 2Tyrone Systems
 
Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place Solution
Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place SolutionKaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place Solution
Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place SolutionPreferred Networks
 
Learn about Tensorflow for Deep Learning now! Part 1
Learn about Tensorflow for Deep Learning now! Part 1Learn about Tensorflow for Deep Learning now! Part 1
Learn about Tensorflow for Deep Learning now! Part 1Tyrone Systems
 

Mais procurados (20)

Chainer v3
Chainer v3Chainer v3
Chainer v3
 
Chainer v2 and future dev plan
Chainer v2 and future dev planChainer v2 and future dev plan
Chainer v2 and future dev plan
 
Chainer Update v1.8.0 -> v1.10.0+
Chainer Update v1.8.0 -> v1.10.0+Chainer Update v1.8.0 -> v1.10.0+
Chainer Update v1.8.0 -> v1.10.0+
 
Tokyo Webmining Talk1
Tokyo Webmining Talk1Tokyo Webmining Talk1
Tokyo Webmining Talk1
 
Introduction to Chainer: A Flexible Framework for Deep Learning
Introduction to Chainer: A Flexible Framework for Deep LearningIntroduction to Chainer: A Flexible Framework for Deep Learning
Introduction to Chainer: A Flexible Framework for Deep Learning
 
Chainer ui v0.3 and imagereport
Chainer ui v0.3 and imagereportChainer ui v0.3 and imagereport
Chainer ui v0.3 and imagereport
 
Comparison of deep learning frameworks from a viewpoint of double backpropaga...
Comparison of deep learning frameworks from a viewpoint of double backpropaga...Comparison of deep learning frameworks from a viewpoint of double backpropaga...
Comparison of deep learning frameworks from a viewpoint of double backpropaga...
 
IIBMP2019 講演資料「オープンソースで始める深層学習」
IIBMP2019 講演資料「オープンソースで始める深層学習」IIBMP2019 講演資料「オープンソースで始める深層学習」
IIBMP2019 講演資料「オープンソースで始める深層学習」
 
CuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPUCuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPU
 
Chainer v4 and v5
Chainer v4 and v5Chainer v4 and v5
Chainer v4 and v5
 
Distributed implementation of a lstm on spark and tensorflow
Distributed implementation of a lstm on spark and tensorflowDistributed implementation of a lstm on spark and tensorflow
Distributed implementation of a lstm on spark and tensorflow
 
Introduction to Chainer
Introduction to ChainerIntroduction to Chainer
Introduction to Chainer
 
Deep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryDeep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistry
 
FCN-Based 6D Robotic Grasping for Arbitrary Placed Objects
FCN-Based 6D Robotic Grasping for Arbitrary Placed ObjectsFCN-Based 6D Robotic Grasping for Arbitrary Placed Objects
FCN-Based 6D Robotic Grasping for Arbitrary Placed Objects
 
An Introduction to TensorFlow architecture
An Introduction to TensorFlow architectureAn Introduction to TensorFlow architecture
An Introduction to TensorFlow architecture
 
Introduction to Chainer
Introduction to ChainerIntroduction to Chainer
Introduction to Chainer
 
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
 
Explore Deep Learning Architecture using Tensorflow 2.0 now! Part 2
Explore Deep Learning Architecture using Tensorflow 2.0 now! Part 2Explore Deep Learning Architecture using Tensorflow 2.0 now! Part 2
Explore Deep Learning Architecture using Tensorflow 2.0 now! Part 2
 
Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place Solution
Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place SolutionKaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place Solution
Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th Place Solution
 
Learn about Tensorflow for Deep Learning now! Part 1
Learn about Tensorflow for Deep Learning now! Part 1Learn about Tensorflow for Deep Learning now! Part 1
Learn about Tensorflow for Deep Learning now! Part 1
 

Destaque

Introduction to Chainer and CuPy
Introduction to Chainer and CuPyIntroduction to Chainer and CuPy
Introduction to Chainer and CuPyKenta Oono
 
On the benchmark of Chainer
On the benchmark of ChainerOn the benchmark of Chainer
On the benchmark of ChainerKenta Oono
 
情報幾何学の基礎、第7章発表ノート
情報幾何学の基礎、第7章発表ノート情報幾何学の基礎、第7章発表ノート
情報幾何学の基礎、第7章発表ノートKenta Oono
 
VAE-type Deep Generative Models
VAE-type Deep Generative ModelsVAE-type Deep Generative Models
VAE-type Deep Generative ModelsKenta Oono
 
Deep Learning技術の最近の動向とPreferred Networksの取り組み
Deep Learning技術の最近の動向とPreferred Networksの取り組みDeep Learning技術の最近の動向とPreferred Networksの取り組み
Deep Learning技術の最近の動向とPreferred Networksの取り組みKenta Oono
 
提供AMIについて
提供AMIについて提供AMIについて
提供AMIについてKenta Oono
 
2015年9月18日 (GTC Japan 2015) 深層学習フレームワークChainerの導入と化合物活性予測への応用
2015年9月18日 (GTC Japan 2015) 深層学習フレームワークChainerの導入と化合物活性予測への応用 2015年9月18日 (GTC Japan 2015) 深層学習フレームワークChainerの導入と化合物活性予測への応用
2015年9月18日 (GTC Japan 2015) 深層学習フレームワークChainerの導入と化合物活性予測への応用 Kenta Oono
 
Learning Image Embeddings using Convolutional Neural Networks for Improved Mu...
Learning Image Embeddings using Convolutional Neural Networks for Improved Mu...Learning Image Embeddings using Convolutional Neural Networks for Improved Mu...
Learning Image Embeddings using Convolutional Neural Networks for Improved Mu...Kenta Oono
 
Introduction to Chainer (LL Ring Recursive)
Introduction to Chainer (LL Ring Recursive)Introduction to Chainer (LL Ring Recursive)
Introduction to Chainer (LL Ring Recursive)Kenta Oono
 
Encode勉強会:GENCODE: The reference human genome annotation for The ENCODE Proje...
Encode勉強会:GENCODE: The reference human genome annotation for The ENCODE Proje...Encode勉強会:GENCODE: The reference human genome annotation for The ENCODE Proje...
Encode勉強会:GENCODE: The reference human genome annotation for The ENCODE Proje...Kenta Oono
 
日本神経回路学会セミナー「DeepLearningを使ってみよう!」資料
日本神経回路学会セミナー「DeepLearningを使ってみよう!」資料日本神経回路学会セミナー「DeepLearningを使ってみよう!」資料
日本神経回路学会セミナー「DeepLearningを使ってみよう!」資料Kenta Oono
 
ディープラーニング最近の発展とビジネス応用への課題
ディープラーニング最近の発展とビジネス応用への課題ディープラーニング最近の発展とビジネス応用への課題
ディープラーニング最近の発展とビジネス応用への課題Kenta Oono
 
Development and Experiment of Deep Learning with Caffe and maf
Development and Experiment of Deep Learning with Caffe and mafDevelopment and Experiment of Deep Learning with Caffe and maf
Development and Experiment of Deep Learning with Caffe and mafKenta Oono
 
最先端NLP勉強会 “Learning Language Games through Interaction” Sida I. Wang, Percy L...
最先端NLP勉強会“Learning Language Games through Interaction”Sida I. Wang, Percy L...最先端NLP勉強会“Learning Language Games through Interaction”Sida I. Wang, Percy L...
最先端NLP勉強会 “Learning Language Games through Interaction” Sida I. Wang, Percy L...Yuya Unno
 
深層学習による機械とのコミュニケーション
深層学習による機械とのコミュニケーション深層学習による機械とのコミュニケーション
深層学習による機械とのコミュニケーションYuya Unno
 
How to Develop Experiment-Oriented Programs
How to Develop Experiment-Oriented ProgramsHow to Develop Experiment-Oriented Programs
How to Develop Experiment-Oriented ProgramsKenta Oono
 
集中不等式のすすめ [集中不等式本読み会#1]
集中不等式のすすめ [集中不等式本読み会#1]集中不等式のすすめ [集中不等式本読み会#1]
集中不等式のすすめ [集中不等式本読み会#1]Kentaro Minami
 
Caffeインストール
CaffeインストールCaffeインストール
CaffeインストールKenta Oono
 
大規模データ時代に求められる自然言語処理 -言語情報から世界を捉える-
大規模データ時代に求められる自然言語処理 -言語情報から世界を捉える-大規模データ時代に求められる自然言語処理 -言語情報から世界を捉える-
大規模データ時代に求められる自然言語処理 -言語情報から世界を捉える-Yuya Unno
 
Techtalk:多様体
Techtalk:多様体Techtalk:多様体
Techtalk:多様体Kenta Oono
 

Destaque (20)

Introduction to Chainer and CuPy
Introduction to Chainer and CuPyIntroduction to Chainer and CuPy
Introduction to Chainer and CuPy
 
On the benchmark of Chainer
On the benchmark of ChainerOn the benchmark of Chainer
On the benchmark of Chainer
 
情報幾何学の基礎、第7章発表ノート
情報幾何学の基礎、第7章発表ノート情報幾何学の基礎、第7章発表ノート
情報幾何学の基礎、第7章発表ノート
 
VAE-type Deep Generative Models
VAE-type Deep Generative ModelsVAE-type Deep Generative Models
VAE-type Deep Generative Models
 
Deep Learning技術の最近の動向とPreferred Networksの取り組み
Deep Learning技術の最近の動向とPreferred Networksの取り組みDeep Learning技術の最近の動向とPreferred Networksの取り組み
Deep Learning技術の最近の動向とPreferred Networksの取り組み
 
提供AMIについて
提供AMIについて提供AMIについて
提供AMIについて
 
2015年9月18日 (GTC Japan 2015) 深層学習フレームワークChainerの導入と化合物活性予測への応用
2015年9月18日 (GTC Japan 2015) 深層学習フレームワークChainerの導入と化合物活性予測への応用 2015年9月18日 (GTC Japan 2015) 深層学習フレームワークChainerの導入と化合物活性予測への応用
2015年9月18日 (GTC Japan 2015) 深層学習フレームワークChainerの導入と化合物活性予測への応用
 
Learning Image Embeddings using Convolutional Neural Networks for Improved Mu...
Learning Image Embeddings using Convolutional Neural Networks for Improved Mu...Learning Image Embeddings using Convolutional Neural Networks for Improved Mu...
Learning Image Embeddings using Convolutional Neural Networks for Improved Mu...
 
Introduction to Chainer (LL Ring Recursive)
Introduction to Chainer (LL Ring Recursive)Introduction to Chainer (LL Ring Recursive)
Introduction to Chainer (LL Ring Recursive)
 
Encode勉強会:GENCODE: The reference human genome annotation for The ENCODE Proje...
Encode勉強会:GENCODE: The reference human genome annotation for The ENCODE Proje...Encode勉強会:GENCODE: The reference human genome annotation for The ENCODE Proje...
Encode勉強会:GENCODE: The reference human genome annotation for The ENCODE Proje...
 
日本神経回路学会セミナー「DeepLearningを使ってみよう!」資料
日本神経回路学会セミナー「DeepLearningを使ってみよう!」資料日本神経回路学会セミナー「DeepLearningを使ってみよう!」資料
日本神経回路学会セミナー「DeepLearningを使ってみよう!」資料
 
ディープラーニング最近の発展とビジネス応用への課題
ディープラーニング最近の発展とビジネス応用への課題ディープラーニング最近の発展とビジネス応用への課題
ディープラーニング最近の発展とビジネス応用への課題
 
Development and Experiment of Deep Learning with Caffe and maf
Development and Experiment of Deep Learning with Caffe and mafDevelopment and Experiment of Deep Learning with Caffe and maf
Development and Experiment of Deep Learning with Caffe and maf
 
最先端NLP勉強会 “Learning Language Games through Interaction” Sida I. Wang, Percy L...
最先端NLP勉強会“Learning Language Games through Interaction”Sida I. Wang, Percy L...最先端NLP勉強会“Learning Language Games through Interaction”Sida I. Wang, Percy L...
最先端NLP勉強会 “Learning Language Games through Interaction” Sida I. Wang, Percy L...
 
深層学習による機械とのコミュニケーション
深層学習による機械とのコミュニケーション深層学習による機械とのコミュニケーション
深層学習による機械とのコミュニケーション
 
How to Develop Experiment-Oriented Programs
How to Develop Experiment-Oriented ProgramsHow to Develop Experiment-Oriented Programs
How to Develop Experiment-Oriented Programs
 
集中不等式のすすめ [集中不等式本読み会#1]
集中不等式のすすめ [集中不等式本読み会#1]集中不等式のすすめ [集中不等式本読み会#1]
集中不等式のすすめ [集中不等式本読み会#1]
 
Caffeインストール
CaffeインストールCaffeインストール
Caffeインストール
 
大規模データ時代に求められる自然言語処理 -言語情報から世界を捉える-
大規模データ時代に求められる自然言語処理 -言語情報から世界を捉える-大規模データ時代に求められる自然言語処理 -言語情報から世界を捉える-
大規模データ時代に求められる自然言語処理 -言語情報から世界を捉える-
 
Techtalk:多様体
Techtalk:多様体Techtalk:多様体
Techtalk:多様体
 

Semelhante a GTC Japan 2016 Chainer feature introduction

AWS re:Invent 2018 - AIM401 - Deep Learning using Tensorflow
AWS re:Invent 2018 - AIM401 - Deep Learning using TensorflowAWS re:Invent 2018 - AIM401 - Deep Learning using Tensorflow
AWS re:Invent 2018 - AIM401 - Deep Learning using TensorflowJulien SIMON
 
Start machine learning in 5 simple steps
Start machine learning in 5 simple stepsStart machine learning in 5 simple steps
Start machine learning in 5 simple stepsRenjith M P
 
[REPEAT] Deep Learning Applications Using TensorFlow (AIM401-R) - AWS re:Inve...
[REPEAT] Deep Learning Applications Using TensorFlow (AIM401-R) - AWS re:Inve...[REPEAT] Deep Learning Applications Using TensorFlow (AIM401-R) - AWS re:Inve...
[REPEAT] Deep Learning Applications Using TensorFlow (AIM401-R) - AWS re:Inve...Amazon Web Services
 
Understanding GBM and XGBoost in Scikit-Learn
Understanding GBM and XGBoost in Scikit-LearnUnderstanding GBM and XGBoost in Scikit-Learn
Understanding GBM and XGBoost in Scikit-Learn철민 권
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Universitat Politècnica de Catalunya
 
K-Fashion 경진대회 3등 수상자 솔루션
K-Fashion 경진대회 3등 수상자 솔루션K-Fashion 경진대회 3등 수상자 솔루션
K-Fashion 경진대회 3등 수상자 솔루션DACON AI 데이콘
 
A Tour of Tensorflow's APIs
A Tour of Tensorflow's APIsA Tour of Tensorflow's APIs
A Tour of Tensorflow's APIsDean Wyatte
 
MLPerf an industry standard benchmark suite for machine learning performance
MLPerf an industry standard benchmark suite for machine learning performanceMLPerf an industry standard benchmark suite for machine learning performance
MLPerf an industry standard benchmark suite for machine learning performancejemin lee
 
ECET 360 help A Guide to career/Snaptutorial
ECET 360 help A Guide to career/SnaptutorialECET 360 help A Guide to career/Snaptutorial
ECET 360 help A Guide to career/Snaptutorialpinck2380
 
ECET 360 help A Guide to career/Snaptutorial
ECET 360 help A Guide to career/SnaptutorialECET 360 help A Guide to career/Snaptutorial
ECET 360 help A Guide to career/Snaptutorialpinck200
 
Understand and Harness the Capabilities of Intel® Xeon Phi™ Processors
Understand and Harness the Capabilities of Intel® Xeon Phi™ ProcessorsUnderstand and Harness the Capabilities of Intel® Xeon Phi™ Processors
Understand and Harness the Capabilities of Intel® Xeon Phi™ ProcessorsIntel® Software
 
Horovod ubers distributed deep learning framework by Alex Sergeev from Uber
Horovod ubers distributed deep learning framework  by Alex Sergeev from UberHorovod ubers distributed deep learning framework  by Alex Sergeev from Uber
Horovod ubers distributed deep learning framework by Alex Sergeev from UberBill Liu
 
Inference accelerators
Inference acceleratorsInference accelerators
Inference acceleratorsDarshanG13
 
Uber's Journey in Distributed Deep Learning
Uber's Journey in Distributed Deep LearningUber's Journey in Distributed Deep Learning
Uber's Journey in Distributed Deep Learninginside-BigData.com
 
Accelerated Training of Transformer Models
Accelerated Training of Transformer ModelsAccelerated Training of Transformer Models
Accelerated Training of Transformer ModelsDatabricks
 

Semelhante a GTC Japan 2016 Chainer feature introduction (20)

AWS re:Invent 2018 - AIM401 - Deep Learning using Tensorflow
AWS re:Invent 2018 - AIM401 - Deep Learning using TensorflowAWS re:Invent 2018 - AIM401 - Deep Learning using Tensorflow
AWS re:Invent 2018 - AIM401 - Deep Learning using Tensorflow
 
Start machine learning in 5 simple steps
Start machine learning in 5 simple stepsStart machine learning in 5 simple steps
Start machine learning in 5 simple steps
 
[REPEAT] Deep Learning Applications Using TensorFlow (AIM401-R) - AWS re:Inve...
[REPEAT] Deep Learning Applications Using TensorFlow (AIM401-R) - AWS re:Inve...[REPEAT] Deep Learning Applications Using TensorFlow (AIM401-R) - AWS re:Inve...
[REPEAT] Deep Learning Applications Using TensorFlow (AIM401-R) - AWS re:Inve...
 
Hot sos em12c_metric_extensions
Hot sos em12c_metric_extensionsHot sos em12c_metric_extensions
Hot sos em12c_metric_extensions
 
Understanding GBM and XGBoost in Scikit-Learn
Understanding GBM and XGBoost in Scikit-LearnUnderstanding GBM and XGBoost in Scikit-Learn
Understanding GBM and XGBoost in Scikit-Learn
 
myslide1
myslide1myslide1
myslide1
 
myslide6
myslide6myslide6
myslide6
 
NewSeriesSlideShare
NewSeriesSlideShareNewSeriesSlideShare
NewSeriesSlideShare
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
 
K-Fashion 경진대회 3등 수상자 솔루션
K-Fashion 경진대회 3등 수상자 솔루션K-Fashion 경진대회 3등 수상자 솔루션
K-Fashion 경진대회 3등 수상자 솔루션
 
Compiler Design- Machine Independent Optimizations
Compiler Design- Machine Independent OptimizationsCompiler Design- Machine Independent Optimizations
Compiler Design- Machine Independent Optimizations
 
A Tour of Tensorflow's APIs
A Tour of Tensorflow's APIsA Tour of Tensorflow's APIs
A Tour of Tensorflow's APIs
 
MLPerf an industry standard benchmark suite for machine learning performance
MLPerf an industry standard benchmark suite for machine learning performanceMLPerf an industry standard benchmark suite for machine learning performance
MLPerf an industry standard benchmark suite for machine learning performance
 
ECET 360 help A Guide to career/Snaptutorial
ECET 360 help A Guide to career/SnaptutorialECET 360 help A Guide to career/Snaptutorial
ECET 360 help A Guide to career/Snaptutorial
 
ECET 360 help A Guide to career/Snaptutorial
ECET 360 help A Guide to career/SnaptutorialECET 360 help A Guide to career/Snaptutorial
ECET 360 help A Guide to career/Snaptutorial
 
Understand and Harness the Capabilities of Intel® Xeon Phi™ Processors
Understand and Harness the Capabilities of Intel® Xeon Phi™ ProcessorsUnderstand and Harness the Capabilities of Intel® Xeon Phi™ Processors
Understand and Harness the Capabilities of Intel® Xeon Phi™ Processors
 
Horovod ubers distributed deep learning framework by Alex Sergeev from Uber
Horovod ubers distributed deep learning framework  by Alex Sergeev from UberHorovod ubers distributed deep learning framework  by Alex Sergeev from Uber
Horovod ubers distributed deep learning framework by Alex Sergeev from Uber
 
Inference accelerators
Inference acceleratorsInference accelerators
Inference accelerators
 
Uber's Journey in Distributed Deep Learning
Uber's Journey in Distributed Deep LearningUber's Journey in Distributed Deep Learning
Uber's Journey in Distributed Deep Learning
 
Accelerated Training of Transformer Models
Accelerated Training of Transformer ModelsAccelerated Training of Transformer Models
Accelerated Training of Transformer Models
 

Mais de Kenta Oono

Minimax statistical learning with Wasserstein distances (NeurIPS2018 Reading ...
Minimax statistical learning with Wasserstein distances (NeurIPS2018 Reading ...Minimax statistical learning with Wasserstein distances (NeurIPS2018 Reading ...
Minimax statistical learning with Wasserstein distances (NeurIPS2018 Reading ...Kenta Oono
 
Overview of Machine Learning for Molecules and Materials Workshop @ NIPS2017
Overview of Machine Learning for Molecules and Materials Workshop @ NIPS2017Overview of Machine Learning for Molecules and Materials Workshop @ NIPS2017
Overview of Machine Learning for Molecules and Materials Workshop @ NIPS2017Kenta Oono
 
深層学習フレームワーク概要とChainerの事例紹介
深層学習フレームワーク概要とChainerの事例紹介深層学習フレームワーク概要とChainerの事例紹介
深層学習フレームワーク概要とChainerの事例紹介Kenta Oono
 
20170422 数学カフェ Part2
20170422 数学カフェ Part220170422 数学カフェ Part2
20170422 数学カフェ Part2Kenta Oono
 
20170422 数学カフェ Part1
20170422 数学カフェ Part120170422 数学カフェ Part1
20170422 数学カフェ Part1Kenta Oono
 
Stochastic Gradient MCMC
Stochastic Gradient MCMCStochastic Gradient MCMC
Stochastic Gradient MCMCKenta Oono
 
Chainer Contribution Guide
Chainer Contribution GuideChainer Contribution Guide
Chainer Contribution GuideKenta Oono
 
Chainerインストール
ChainerインストールChainerインストール
ChainerインストールKenta Oono
 
NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...
NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...
NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...Kenta Oono
 
NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...
NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...
NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...Kenta Oono
 

Mais de Kenta Oono (10)

Minimax statistical learning with Wasserstein distances (NeurIPS2018 Reading ...
Minimax statistical learning with Wasserstein distances (NeurIPS2018 Reading ...Minimax statistical learning with Wasserstein distances (NeurIPS2018 Reading ...
Minimax statistical learning with Wasserstein distances (NeurIPS2018 Reading ...
 
Overview of Machine Learning for Molecules and Materials Workshop @ NIPS2017
Overview of Machine Learning for Molecules and Materials Workshop @ NIPS2017Overview of Machine Learning for Molecules and Materials Workshop @ NIPS2017
Overview of Machine Learning for Molecules and Materials Workshop @ NIPS2017
 
深層学習フレームワーク概要とChainerの事例紹介
深層学習フレームワーク概要とChainerの事例紹介深層学習フレームワーク概要とChainerの事例紹介
深層学習フレームワーク概要とChainerの事例紹介
 
20170422 数学カフェ Part2
20170422 数学カフェ Part220170422 数学カフェ Part2
20170422 数学カフェ Part2
 
20170422 数学カフェ Part1
20170422 数学カフェ Part120170422 数学カフェ Part1
20170422 数学カフェ Part1
 
Stochastic Gradient MCMC
Stochastic Gradient MCMCStochastic Gradient MCMC
Stochastic Gradient MCMC
 
Chainer Contribution Guide
Chainer Contribution GuideChainer Contribution Guide
Chainer Contribution Guide
 
Chainerインストール
ChainerインストールChainerインストール
Chainerインストール
 
NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...
NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...
NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...
 
NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...
NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...
NIPS2013読み会:Inverse Density as an Inverse Problem: The Fredholm Equation Appr...
 

Último

How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 

Último (20)

How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 

GTC Japan 2016 Chainer feature introduction

  • 1. Chainer feature introduction Training and dataset abstraction 5th Oct. 2016 GTC Japan @ Tokyo Preferred Networks, Inc. Kenta Oono oono@preferred.jp
  • 2. Trainer and Dataset abstraction • New feature from v1.11.0 ü Free users from implementing training loops by themselves. ü Support most of typical training procedures. ü Easy to customize and extend. • Note: We can also write manually training loops without this feature, as we did in the examples of the previous versions. 2
  • 3. Target Link Dataset Optimizer Iterator Main modules • Dataset, Iterator: extract mini batches by iterating over datasets • Trainer, Updater, Extension: customize the training loop with low cost • Reporter: to collect statistics from inside of the models 3 Trainer Extension Extension Extension Updater Optimizer Optimizer Target Link Target Link Iterator Iterator Dataset Dataset We often use only one optimizer and one dataset. This diagram shows a general case.
  • 4. MNIST classification by MLP with Trainer class MLP(Link): def __int__(self): super(MLP, self).__init__( l1=Linear(784, 1000), l2=Linear(1000, 1000), l3=Linear(1000, 10)) def __call__(x): h1 = F.relu(self.l1(x)) h2 = F.relu(self.l2(l1)) return self.l3(h2) Linear l1 x W bias 5 ReLU Linear l2 h1 W bias ReLU Linear l3 h2 W bias 4
  • 5. # Prepare datasets and their iterators train, test = get_mnist() train_iter = SerialIterator(train, 128) test_iter = SerialIterator(test, 128, repeat=False, shuffle=False) # Prepare links and their optimizers model = L.Classifier(MLP()) optimizer = Adam() optimizer.setup(model) # Prepare trainer updater = StandardUpdater(train_iter, optimizer) trainer = Trainer(updater, (10, 'epoch')) 5
  • 6. # Add extensions to augment trainer trainer.extend(Evaluator(test_iter, model)) trainer.extend(dump_graph('main/loss')) trainer.extend(snapshot()) trainer.extend(LogReport()) trainer.extend(PrintReport( 'epoch', 'main/accuracy', 'validation/main/accuracy'])) trainer.extend(ProgressBar()) # Execute trainer.run() 6
  • 7. Pseudo code of training loop abstraction For each extension e: Invoke e if specified Until stop_trigger is fired: Invoke updater for each extension e: if e’s trigger is fired: Invoke e For each extension e: Finalize e Finalize updater 7 • Trainer has stop trigger to determine when to stop the training loop • Each extension have a trigger to determine when to invoke
  • 8. 8
  • 9. Trainer-related modules • Updater – Fetch a mini-batch using Iterator, and update parameters using Optimizer – You can customize the update routine – Built-in updater: StandardUpdater, ParallelUpdater • Extension – It adds an extra routine to the training loop – Basic extensions are built-in: Evaluator, LogReport, PrintReport, ProgressBar snapshot, snapshot_object, ExponentialDecay, LinearShift, dump_graph – You can write your own extensions 9
  • 10. Dataset-related modules • Dataset is just a sequence of data points (a.k.a. examples) • Iterator defines how to iterate over the dataset • Built-in iterators: – SerialIterator – MultiprocessIterator 10