SlideShare uma empresa Scribd logo
1 de 26
Baixar para ler offline
Once-for-All: Train One Network and
Specialize it for Efficient Deployment
[ICLR 2020]
2022. 03. 20. (Sun)
Presented by: 김동현
w/ Fundamental Team: 김채현, 박종익, 양현모, 이근배, 이재윤, 송헌
1
Contents
● Problem and Approach
● Key Challenge
● How to Train Once-for-all Network
● How to Deploy Once-for-all Network
● Evaluations
● Discussions
● Conclusion
2
Contents
● Problem and Approach
● Key Challenge
● How to Train Once-for-all Network
● How to Deploy Once-for-all Network
● Evaluations
● Discussions
● Conclusion
3
Main Problem to Solve
● There are various hardware platforms to deploy DNN models.
○ Survey says there are 23.14 billion IoT devices until 2018.
○ The devices have different resource constraints;
It is impossible to deploy the same model to all devices.
● The optimal neural network architecture varies by deployment environments
(e.g., #arithmetic units, application requirements).
4
Main Problem to Solve
● It is computationally prohibitive to find all the optimal architecture by training
on each environment.
● Then, how is it possible to cost-efficiently find the specialized model on
each platform?
5
target latency
= 20ms
Suggested Approach
● Train a Once-for-all(OFA) network, which enables serving on various
environment without additional training.
○ Various scales of sub-networks (about 1019
) are available from one OFA network.
○ Each hardware can find the specialized model for its requirements (e.g, latency).
6
Key Challenges for Once-for-All Network
Requirements
1. The sub-network architecture should be part of the largest network.
2. Sub-networks should share parameters with larger networks.
3. Optimal model architecture for specified hardwares should be easily found.
7
Key Challenges for Once-for-All Network
Requirements
1. The sub-network architecture should be part of the largest network.
2. Sub-networks should share parameters with larger networks.
3. Optimal model architecture for specified hardwares should be easily found.
Challenges
1. How to design sub-network architecture space based on a OFA network.
2. How to let sub-networks share parameters with larger networks.
3. How to select the optimal model for the hardware (in terms of latency,
accuracy).
8
Contents
● Problem and Approach
● Key Challenge
● How to Train Once-for-all Network : Challenges #1, #2
● How to Deploy Once-for-all Network: Challenges #3
● Evaluations
● Discussions
● Conclusion
9
Q&A
10
● Assumption: Follow the common practice of CNN models (e.g., ResNet).
○ A model consists of groups of Layers (i.e., units).
● Architecture Search Space
○ # Layers(L): the depth of each unit is chosen from {2, 3, 4}
○ # Channels(C): expansion ratio in each layer is chosen from {3, 4, 6}
○ Kernel Size(Ks): {3, 5, 7}
○ Input Dimension: ranges from 128 to 224 with a stride
● Num available sub-networks: ((3 * 3)2
+ (3 * 3)3
+ (3 * 3)4
)5
= about 1019
Training OFA Network - Network Architecture
… … …
…
L1 L2 L3
C
…
Ks
# units
11
How sub-networks share parameters:
● Elastic Kernel Size
○ Merely sharing the parameters of larger kernel can affect the performance.
○ When changing kernel size, pass through Transform Matrix:
■ For each layer, hold parameters for elastic kernels.
● # 25*25 parameters for 7x7 -> 5x5.
● # 9*9 parameters for 5x5 -> 3x3.
● E.g., 5x5 kernel = (Center of 7x7) * Transform Matrix
Training OFA Network - Sharing Parameters
12
How sub-networks share parameters:
● Elastic Depth (= #Layers)
○ The first D layers are shared when L layers exist in a unit.
○ Simpler depth settings compared to selecting random layers from L layers.
Training OFA Network - Sharing Parameters
L D
13
How sub-networks share parameters:
● Elastic Width (= #Channels)
○ For the given expansion ratio, select channels through a channel sorting method:
1. Calculate L1 Norm for each channel’s weights.
2. Sort the channels by the L1 Norm order.
3. Choose the top-K channels.
Training OFA Network - Sharing Parameters
L1 Norm
14
Progressive Shrinking
1. Train a full model (i.e. max vaule for each configuration).
● With the trained full-size model, Knowledge-Distillation techniques are leveraged.
● Note: Full model != Best model
Training OFA Network - Training Process
… … …
…
L1 L2 L3
Note1: Input image size is randomly chosen for each training batch
15
Progressive Shrinking
1. Train a full model (i.e. max vaule for each configuration).
2. Sample sub-networks varying kernel sizes and fine-tune.
a. For each step, sample one sub-net with different kernel sizes.
b. Calculate Loss. Loss = Full model loss * KD_raio + sub-net loss
c. Update the weights (updating sub-net’s weight -> updating the full model’s weight)
Training OFA Network - Training Process
… … …
L1 L2 L3 16
Note1: Input image size is randomly chosen for each training batch
Progressive Shrinking
1. Train a full model (i.e. max vaule for each configuration).
2. Sample sub-networks varying kernel sizes and fine-tune.
3. Sample sub-networks varying depth and fine-tune.
4. Sample sub-networks varying channel expansion ratio and fine-tune.
Training OFA Network - Training Process
… … …
L1 L2 L3
Note2: Refer to Appendix B for impl. details of progressive shrinking
Note1: Input image size is randomly chosen for each training batch
17
Deploying Specialized Model w/ OFA Network
Problem:
● derive the specialized sub-network for a given deployment scenario (e.g.,
latency constraints).
Solution:
● Train an accuracy predictor (3-layer FFNN)
○ f(architecture, input image size) => accuracy
○ randomly sample 16K sub-networks, measure the accuracy on 10K validation images
● Latency Lookup Table (Details in the ProxylessNAS paper)
○ On each hardware platform, build a latency lookup table .
● Conduct an evolutionary search leveraging the above information.
○ Mutate from the known sub-network by sampling and predicting the performance.
○ add the mutated sub-network to the child pool if it satisfies the constraint (latency).
18
Q&A
19
Evaluation
● ImageNet Dataset
● Eval on Various Hardware Platforms:
○ Samsung S7 Edge, Note8, Note10, Google Pixel1, Pixel2, LG G8, NVIDIA 1080Ti, V100
GPUs, Jetson TX2, Intel Xeon CPU, Xilinx ZU9EG, and ZU3EG FPGAs
● Please refer to the paper for the detailed training configurations.
20
Evaluation
Performance of sub-networks on ImageNet
● top-1 accuracy under 224x224 resolution.
● Can achieve higher performance through Progressive Shrinking.
○ 74.8% top1 accuracy (D=4, W=3, K=3), which is on par with MobileNetV3-Large.
○ Without PS, it achieves 71.5%, which is 3.3% lower.
21
get the same architecture from
the full model w/o PS
Evaluation
Reduced Design Cost
● reports comparison between OFA and hardware-aware NAS methods
○ NAS: The design cost is linear to the number of deployment scenarios (N).
○ the total CO2 emissions of OFA is:
■ 16× fewer than ProxylessNAS
■ 19× fewer than FBNet
■ 1,300× fewer than MnasNet
22
Evaluation
OFA under Different Computational Resource Constraints
● Better accuracy under the same constraints:
○ (Left): MACs, (Right): Latency
○ Achieves higher accuracy, Requires lower computations
○ Better than “OFA - Train from scratch”, which is trained from the scratch without pretraining.
23
Discussions
● Would it work if the same approach is applied to other models, tasks (e.g.,
Transformer, NLP)?
● The architecture search space is limited to certain models.
○ e.g. How to apply the method to models such as HRNet?
24
Conclusion
● Once-for-all(OFA) Network allows training one large model and deploying
various sub-networks without additional training.
● OFA suggests Progressive Shrinking algorithm to share and find
sub-networks, which highly reduces the design cost.
● The paper shows OFA can achieve higher performance with ImageNet
dataset.
● With a trained OFA network, optimal sub-networks can be found on various
deployment environments.
25
Q&A
26

Mais conteúdo relacionado

Mais procurados

GraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDBGraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDBArangoDB Database
 
U-Net: Convolutional Networks for Biomedical Image Segmentation
U-Net: Convolutional Networks for Biomedical Image SegmentationU-Net: Convolutional Networks for Biomedical Image Segmentation
U-Net: Convolutional Networks for Biomedical Image Segmentationfake can
 
Introduction to Graph neural networks @ Vienna Deep Learning meetup
Introduction to Graph neural networks @  Vienna Deep Learning meetupIntroduction to Graph neural networks @  Vienna Deep Learning meetup
Introduction to Graph neural networks @ Vienna Deep Learning meetupLiad Magen
 
Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...
Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...
Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...Edureka!
 
NEURAL NETWORKS
NEURAL NETWORKSNEURAL NETWORKS
NEURAL NETWORKSESCOM
 
Convolutional neural network
Convolutional neural network Convolutional neural network
Convolutional neural network Yan Xu
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnnKuppusamy P
 
[DL輪読会]EfficientDet: Scalable and Efficient Object Detection
[DL輪読会]EfficientDet: Scalable and Efficient Object Detection[DL輪読会]EfficientDet: Scalable and Efficient Object Detection
[DL輪読会]EfficientDet: Scalable and Efficient Object DetectionDeep Learning JP
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkKnoldus Inc.
 
Autoencoders
AutoencodersAutoencoders
AutoencodersCloudxLab
 
Graph neural networks overview
Graph neural networks overviewGraph neural networks overview
Graph neural networks overviewRodion Kiryukhin
 
Deep Belief Networks
Deep Belief NetworksDeep Belief Networks
Deep Belief NetworksHasan H Topcu
 

Mais procurados (20)

MobileNet V3
MobileNet V3MobileNet V3
MobileNet V3
 
Recurrent neural network
Recurrent neural networkRecurrent neural network
Recurrent neural network
 
Rnn and lstm
Rnn and lstmRnn and lstm
Rnn and lstm
 
GraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDBGraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDB
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
U-Net: Convolutional Networks for Biomedical Image Segmentation
U-Net: Convolutional Networks for Biomedical Image SegmentationU-Net: Convolutional Networks for Biomedical Image Segmentation
U-Net: Convolutional Networks for Biomedical Image Segmentation
 
Introduction to Graph neural networks @ Vienna Deep Learning meetup
Introduction to Graph neural networks @  Vienna Deep Learning meetupIntroduction to Graph neural networks @  Vienna Deep Learning meetup
Introduction to Graph neural networks @ Vienna Deep Learning meetup
 
Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...
Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...
Restricted Boltzmann Machine | Neural Network Tutorial | Deep Learning Tutori...
 
Restricted boltzmann machine
Restricted boltzmann machineRestricted boltzmann machine
Restricted boltzmann machine
 
NEURAL NETWORKS
NEURAL NETWORKSNEURAL NETWORKS
NEURAL NETWORKS
 
Convolutional neural network
Convolutional neural network Convolutional neural network
Convolutional neural network
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnn
 
Clusetrreport
ClusetrreportClusetrreport
Clusetrreport
 
[DL輪読会]EfficientDet: Scalable and Efficient Object Detection
[DL輪読会]EfficientDet: Scalable and Efficient Object Detection[DL輪読会]EfficientDet: Scalable and Efficient Object Detection
[DL輪読会]EfficientDet: Scalable and Efficient Object Detection
 
One shot learning
One shot learningOne shot learning
One shot learning
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural Network
 
Recurrent Neural Networks
Recurrent Neural NetworksRecurrent Neural Networks
Recurrent Neural Networks
 
Autoencoders
AutoencodersAutoencoders
Autoencoders
 
Graph neural networks overview
Graph neural networks overviewGraph neural networks overview
Graph neural networks overview
 
Deep Belief Networks
Deep Belief NetworksDeep Belief Networks
Deep Belief Networks
 

Semelhante a Once-for-All: Train One Network and Specialize it for Efficient Deployment

Tutorial-on-DNN-09A-Co-design-Sparsity.pdf
Tutorial-on-DNN-09A-Co-design-Sparsity.pdfTutorial-on-DNN-09A-Co-design-Sparsity.pdf
Tutorial-on-DNN-09A-Co-design-Sparsity.pdfDuy-Hieu Bui
 
PR-144: SqueezeNext: Hardware-Aware Neural Network Design
PR-144: SqueezeNext: Hardware-Aware Neural Network DesignPR-144: SqueezeNext: Hardware-Aware Neural Network Design
PR-144: SqueezeNext: Hardware-Aware Neural Network DesignJinwon Lee
 
Standardising the compressed representation of neural networks
Standardising the compressed representation of neural networksStandardising the compressed representation of neural networks
Standardising the compressed representation of neural networksFörderverein Technische Fakultät
 
Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)DonghyunKang12
 
(Im2col)accelerating deep neural networks on low power heterogeneous architec...
(Im2col)accelerating deep neural networks on low power heterogeneous architec...(Im2col)accelerating deep neural networks on low power heterogeneous architec...
(Im2col)accelerating deep neural networks on low power heterogeneous architec...Bomm Kim
 
intro-to-cnn-April_2020.pptx
intro-to-cnn-April_2020.pptxintro-to-cnn-April_2020.pptx
intro-to-cnn-April_2020.pptxssuser3aa461
 
Deep Learning for Computer Vision: Memory usage and computational considerati...
Deep Learning for Computer Vision: Memory usage and computational considerati...Deep Learning for Computer Vision: Memory usage and computational considerati...
Deep Learning for Computer Vision: Memory usage and computational considerati...Universitat Politècnica de Catalunya
 
Convolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular ArchitecturesConvolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular Architecturesananth
 
Lightweight DNN Processor Design (based on NVDLA)
Lightweight DNN Processor Design (based on NVDLA)Lightweight DNN Processor Design (based on NVDLA)
Lightweight DNN Processor Design (based on NVDLA)Shien-Chun Luo
 
PR243: Designing Network Design Spaces
PR243: Designing Network Design SpacesPR243: Designing Network Design Spaces
PR243: Designing Network Design SpacesJinwon Lee
 
Unit 1
Unit 1Unit 1
Unit 1sasi
 
Netflix machine learning
Netflix machine learningNetflix machine learning
Netflix machine learningAmer Ather
 
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio..."Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...Edge AI and Vision Alliance
 
Modern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentationModern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentationGioele Ciaparrone
 
Introduction to computer vision
Introduction to computer visionIntroduction to computer vision
Introduction to computer visionMarcin Jedyk
 

Semelhante a Once-for-All: Train One Network and Specialize it for Efficient Deployment (20)

Tutorial-on-DNN-09A-Co-design-Sparsity.pdf
Tutorial-on-DNN-09A-Co-design-Sparsity.pdfTutorial-on-DNN-09A-Co-design-Sparsity.pdf
Tutorial-on-DNN-09A-Co-design-Sparsity.pdf
 
B.tech_project_ppt.pptx
B.tech_project_ppt.pptxB.tech_project_ppt.pptx
B.tech_project_ppt.pptx
 
Multicore architectures
Multicore architecturesMulticore architectures
Multicore architectures
 
PR-144: SqueezeNext: Hardware-Aware Neural Network Design
PR-144: SqueezeNext: Hardware-Aware Neural Network DesignPR-144: SqueezeNext: Hardware-Aware Neural Network Design
PR-144: SqueezeNext: Hardware-Aware Neural Network Design
 
Standardising the compressed representation of neural networks
Standardising the compressed representation of neural networksStandardising the compressed representation of neural networks
Standardising the compressed representation of neural networks
 
Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)
 
(Im2col)accelerating deep neural networks on low power heterogeneous architec...
(Im2col)accelerating deep neural networks on low power heterogeneous architec...(Im2col)accelerating deep neural networks on low power heterogeneous architec...
(Im2col)accelerating deep neural networks on low power heterogeneous architec...
 
Clustering
ClusteringClustering
Clustering
 
intro-to-cnn-April_2020.pptx
intro-to-cnn-April_2020.pptxintro-to-cnn-April_2020.pptx
intro-to-cnn-April_2020.pptx
 
Deep Learning for Computer Vision: Memory usage and computational considerati...
Deep Learning for Computer Vision: Memory usage and computational considerati...Deep Learning for Computer Vision: Memory usage and computational considerati...
Deep Learning for Computer Vision: Memory usage and computational considerati...
 
Convolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular ArchitecturesConvolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular Architectures
 
Lightweight DNN Processor Design (based on NVDLA)
Lightweight DNN Processor Design (based on NVDLA)Lightweight DNN Processor Design (based on NVDLA)
Lightweight DNN Processor Design (based on NVDLA)
 
PR243: Designing Network Design Spaces
PR243: Designing Network Design SpacesPR243: Designing Network Design Spaces
PR243: Designing Network Design Spaces
 
Unit 1
Unit 1Unit 1
Unit 1
 
Netflix machine learning
Netflix machine learningNetflix machine learning
Netflix machine learning
 
Deep Learning Initiative @ NECSTLab
Deep Learning Initiative @ NECSTLabDeep Learning Initiative @ NECSTLab
Deep Learning Initiative @ NECSTLab
 
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio..."Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
"Quantizing Deep Networks for Efficient Inference at the Edge," a Presentatio...
 
Modern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentationModern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentation
 
Introduction to computer vision
Introduction to computer visionIntroduction to computer vision
Introduction to computer vision
 
VGG.pptx
VGG.pptxVGG.pptx
VGG.pptx
 

Mais de taeseon ryu

OpineSum Entailment-based self-training for abstractive opinion summarization...
OpineSum Entailment-based self-training for abstractive opinion summarization...OpineSum Entailment-based self-training for abstractive opinion summarization...
OpineSum Entailment-based self-training for abstractive opinion summarization...taeseon ryu
 
3D Gaussian Splatting
3D Gaussian Splatting3D Gaussian Splatting
3D Gaussian Splattingtaeseon ryu
 
Hyperbolic Image Embedding.pptx
Hyperbolic  Image Embedding.pptxHyperbolic  Image Embedding.pptx
Hyperbolic Image Embedding.pptxtaeseon ryu
 
MCSE_Multimodal Contrastive Learning of Sentence Embeddings_변현정
MCSE_Multimodal Contrastive Learning of Sentence Embeddings_변현정MCSE_Multimodal Contrastive Learning of Sentence Embeddings_변현정
MCSE_Multimodal Contrastive Learning of Sentence Embeddings_변현정taeseon ryu
 
LLaMA Open and Efficient Foundation Language Models - 230528.pdf
LLaMA Open and Efficient Foundation Language Models - 230528.pdfLLaMA Open and Efficient Foundation Language Models - 230528.pdf
LLaMA Open and Efficient Foundation Language Models - 230528.pdftaeseon ryu
 
Dataset Distillation by Matching Training Trajectories
Dataset Distillation by Matching Training Trajectories Dataset Distillation by Matching Training Trajectories
Dataset Distillation by Matching Training Trajectories taeseon ryu
 
Packed Levitated Marker for Entity and Relation Extraction
Packed Levitated Marker for Entity and Relation ExtractionPacked Levitated Marker for Entity and Relation Extraction
Packed Levitated Marker for Entity and Relation Extractiontaeseon ryu
 
MOReL: Model-Based Offline Reinforcement Learning
MOReL: Model-Based Offline Reinforcement LearningMOReL: Model-Based Offline Reinforcement Learning
MOReL: Model-Based Offline Reinforcement Learningtaeseon ryu
 
Scaling Instruction-Finetuned Language Models
Scaling Instruction-Finetuned Language ModelsScaling Instruction-Finetuned Language Models
Scaling Instruction-Finetuned Language Modelstaeseon ryu
 
Visual prompt tuning
Visual prompt tuningVisual prompt tuning
Visual prompt tuningtaeseon ryu
 
variBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdf
variBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdfvariBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdf
variBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdftaeseon ryu
 
Reinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdf
Reinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdfReinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdf
Reinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdftaeseon ryu
 
The Forward-Forward Algorithm
The Forward-Forward AlgorithmThe Forward-Forward Algorithm
The Forward-Forward Algorithmtaeseon ryu
 
Towards Robust and Reproducible Active Learning using Neural Networks
Towards Robust and Reproducible Active Learning using Neural NetworksTowards Robust and Reproducible Active Learning using Neural Networks
Towards Robust and Reproducible Active Learning using Neural Networkstaeseon ryu
 
BRIO: Bringing Order to Abstractive Summarization
BRIO: Bringing Order to Abstractive SummarizationBRIO: Bringing Order to Abstractive Summarization
BRIO: Bringing Order to Abstractive Summarizationtaeseon ryu
 

Mais de taeseon ryu (20)

VoxelNet
VoxelNetVoxelNet
VoxelNet
 
OpineSum Entailment-based self-training for abstractive opinion summarization...
OpineSum Entailment-based self-training for abstractive opinion summarization...OpineSum Entailment-based self-training for abstractive opinion summarization...
OpineSum Entailment-based self-training for abstractive opinion summarization...
 
3D Gaussian Splatting
3D Gaussian Splatting3D Gaussian Splatting
3D Gaussian Splatting
 
JetsonTX2 Python
 JetsonTX2 Python  JetsonTX2 Python
JetsonTX2 Python
 
Hyperbolic Image Embedding.pptx
Hyperbolic  Image Embedding.pptxHyperbolic  Image Embedding.pptx
Hyperbolic Image Embedding.pptx
 
MCSE_Multimodal Contrastive Learning of Sentence Embeddings_변현정
MCSE_Multimodal Contrastive Learning of Sentence Embeddings_변현정MCSE_Multimodal Contrastive Learning of Sentence Embeddings_변현정
MCSE_Multimodal Contrastive Learning of Sentence Embeddings_변현정
 
LLaMA Open and Efficient Foundation Language Models - 230528.pdf
LLaMA Open and Efficient Foundation Language Models - 230528.pdfLLaMA Open and Efficient Foundation Language Models - 230528.pdf
LLaMA Open and Efficient Foundation Language Models - 230528.pdf
 
YOLO V6
YOLO V6YOLO V6
YOLO V6
 
Dataset Distillation by Matching Training Trajectories
Dataset Distillation by Matching Training Trajectories Dataset Distillation by Matching Training Trajectories
Dataset Distillation by Matching Training Trajectories
 
RL_UpsideDown
RL_UpsideDownRL_UpsideDown
RL_UpsideDown
 
Packed Levitated Marker for Entity and Relation Extraction
Packed Levitated Marker for Entity and Relation ExtractionPacked Levitated Marker for Entity and Relation Extraction
Packed Levitated Marker for Entity and Relation Extraction
 
MOReL: Model-Based Offline Reinforcement Learning
MOReL: Model-Based Offline Reinforcement LearningMOReL: Model-Based Offline Reinforcement Learning
MOReL: Model-Based Offline Reinforcement Learning
 
Scaling Instruction-Finetuned Language Models
Scaling Instruction-Finetuned Language ModelsScaling Instruction-Finetuned Language Models
Scaling Instruction-Finetuned Language Models
 
Visual prompt tuning
Visual prompt tuningVisual prompt tuning
Visual prompt tuning
 
mPLUG
mPLUGmPLUG
mPLUG
 
variBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdf
variBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdfvariBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdf
variBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdf
 
Reinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdf
Reinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdfReinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdf
Reinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdf
 
The Forward-Forward Algorithm
The Forward-Forward AlgorithmThe Forward-Forward Algorithm
The Forward-Forward Algorithm
 
Towards Robust and Reproducible Active Learning using Neural Networks
Towards Robust and Reproducible Active Learning using Neural NetworksTowards Robust and Reproducible Active Learning using Neural Networks
Towards Robust and Reproducible Active Learning using Neural Networks
 
BRIO: Bringing Order to Abstractive Summarization
BRIO: Bringing Order to Abstractive SummarizationBRIO: Bringing Order to Abstractive Summarization
BRIO: Bringing Order to Abstractive Summarization
 

Último

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Onlineanilsa9823
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxolyaivanovalion
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 

Último (20)

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptx
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 

Once-for-All: Train One Network and Specialize it for Efficient Deployment

  • 1. Once-for-All: Train One Network and Specialize it for Efficient Deployment [ICLR 2020] 2022. 03. 20. (Sun) Presented by: 김동현 w/ Fundamental Team: 김채현, 박종익, 양현모, 이근배, 이재윤, 송헌 1
  • 2. Contents ● Problem and Approach ● Key Challenge ● How to Train Once-for-all Network ● How to Deploy Once-for-all Network ● Evaluations ● Discussions ● Conclusion 2
  • 3. Contents ● Problem and Approach ● Key Challenge ● How to Train Once-for-all Network ● How to Deploy Once-for-all Network ● Evaluations ● Discussions ● Conclusion 3
  • 4. Main Problem to Solve ● There are various hardware platforms to deploy DNN models. ○ Survey says there are 23.14 billion IoT devices until 2018. ○ The devices have different resource constraints; It is impossible to deploy the same model to all devices. ● The optimal neural network architecture varies by deployment environments (e.g., #arithmetic units, application requirements). 4
  • 5. Main Problem to Solve ● It is computationally prohibitive to find all the optimal architecture by training on each environment. ● Then, how is it possible to cost-efficiently find the specialized model on each platform? 5 target latency = 20ms
  • 6. Suggested Approach ● Train a Once-for-all(OFA) network, which enables serving on various environment without additional training. ○ Various scales of sub-networks (about 1019 ) are available from one OFA network. ○ Each hardware can find the specialized model for its requirements (e.g, latency). 6
  • 7. Key Challenges for Once-for-All Network Requirements 1. The sub-network architecture should be part of the largest network. 2. Sub-networks should share parameters with larger networks. 3. Optimal model architecture for specified hardwares should be easily found. 7
  • 8. Key Challenges for Once-for-All Network Requirements 1. The sub-network architecture should be part of the largest network. 2. Sub-networks should share parameters with larger networks. 3. Optimal model architecture for specified hardwares should be easily found. Challenges 1. How to design sub-network architecture space based on a OFA network. 2. How to let sub-networks share parameters with larger networks. 3. How to select the optimal model for the hardware (in terms of latency, accuracy). 8
  • 9. Contents ● Problem and Approach ● Key Challenge ● How to Train Once-for-all Network : Challenges #1, #2 ● How to Deploy Once-for-all Network: Challenges #3 ● Evaluations ● Discussions ● Conclusion 9
  • 11. ● Assumption: Follow the common practice of CNN models (e.g., ResNet). ○ A model consists of groups of Layers (i.e., units). ● Architecture Search Space ○ # Layers(L): the depth of each unit is chosen from {2, 3, 4} ○ # Channels(C): expansion ratio in each layer is chosen from {3, 4, 6} ○ Kernel Size(Ks): {3, 5, 7} ○ Input Dimension: ranges from 128 to 224 with a stride ● Num available sub-networks: ((3 * 3)2 + (3 * 3)3 + (3 * 3)4 )5 = about 1019 Training OFA Network - Network Architecture … … … … L1 L2 L3 C … Ks # units 11
  • 12. How sub-networks share parameters: ● Elastic Kernel Size ○ Merely sharing the parameters of larger kernel can affect the performance. ○ When changing kernel size, pass through Transform Matrix: ■ For each layer, hold parameters for elastic kernels. ● # 25*25 parameters for 7x7 -> 5x5. ● # 9*9 parameters for 5x5 -> 3x3. ● E.g., 5x5 kernel = (Center of 7x7) * Transform Matrix Training OFA Network - Sharing Parameters 12
  • 13. How sub-networks share parameters: ● Elastic Depth (= #Layers) ○ The first D layers are shared when L layers exist in a unit. ○ Simpler depth settings compared to selecting random layers from L layers. Training OFA Network - Sharing Parameters L D 13
  • 14. How sub-networks share parameters: ● Elastic Width (= #Channels) ○ For the given expansion ratio, select channels through a channel sorting method: 1. Calculate L1 Norm for each channel’s weights. 2. Sort the channels by the L1 Norm order. 3. Choose the top-K channels. Training OFA Network - Sharing Parameters L1 Norm 14
  • 15. Progressive Shrinking 1. Train a full model (i.e. max vaule for each configuration). ● With the trained full-size model, Knowledge-Distillation techniques are leveraged. ● Note: Full model != Best model Training OFA Network - Training Process … … … … L1 L2 L3 Note1: Input image size is randomly chosen for each training batch 15
  • 16. Progressive Shrinking 1. Train a full model (i.e. max vaule for each configuration). 2. Sample sub-networks varying kernel sizes and fine-tune. a. For each step, sample one sub-net with different kernel sizes. b. Calculate Loss. Loss = Full model loss * KD_raio + sub-net loss c. Update the weights (updating sub-net’s weight -> updating the full model’s weight) Training OFA Network - Training Process … … … L1 L2 L3 16 Note1: Input image size is randomly chosen for each training batch
  • 17. Progressive Shrinking 1. Train a full model (i.e. max vaule for each configuration). 2. Sample sub-networks varying kernel sizes and fine-tune. 3. Sample sub-networks varying depth and fine-tune. 4. Sample sub-networks varying channel expansion ratio and fine-tune. Training OFA Network - Training Process … … … L1 L2 L3 Note2: Refer to Appendix B for impl. details of progressive shrinking Note1: Input image size is randomly chosen for each training batch 17
  • 18. Deploying Specialized Model w/ OFA Network Problem: ● derive the specialized sub-network for a given deployment scenario (e.g., latency constraints). Solution: ● Train an accuracy predictor (3-layer FFNN) ○ f(architecture, input image size) => accuracy ○ randomly sample 16K sub-networks, measure the accuracy on 10K validation images ● Latency Lookup Table (Details in the ProxylessNAS paper) ○ On each hardware platform, build a latency lookup table . ● Conduct an evolutionary search leveraging the above information. ○ Mutate from the known sub-network by sampling and predicting the performance. ○ add the mutated sub-network to the child pool if it satisfies the constraint (latency). 18
  • 20. Evaluation ● ImageNet Dataset ● Eval on Various Hardware Platforms: ○ Samsung S7 Edge, Note8, Note10, Google Pixel1, Pixel2, LG G8, NVIDIA 1080Ti, V100 GPUs, Jetson TX2, Intel Xeon CPU, Xilinx ZU9EG, and ZU3EG FPGAs ● Please refer to the paper for the detailed training configurations. 20
  • 21. Evaluation Performance of sub-networks on ImageNet ● top-1 accuracy under 224x224 resolution. ● Can achieve higher performance through Progressive Shrinking. ○ 74.8% top1 accuracy (D=4, W=3, K=3), which is on par with MobileNetV3-Large. ○ Without PS, it achieves 71.5%, which is 3.3% lower. 21 get the same architecture from the full model w/o PS
  • 22. Evaluation Reduced Design Cost ● reports comparison between OFA and hardware-aware NAS methods ○ NAS: The design cost is linear to the number of deployment scenarios (N). ○ the total CO2 emissions of OFA is: ■ 16× fewer than ProxylessNAS ■ 19× fewer than FBNet ■ 1,300× fewer than MnasNet 22
  • 23. Evaluation OFA under Different Computational Resource Constraints ● Better accuracy under the same constraints: ○ (Left): MACs, (Right): Latency ○ Achieves higher accuracy, Requires lower computations ○ Better than “OFA - Train from scratch”, which is trained from the scratch without pretraining. 23
  • 24. Discussions ● Would it work if the same approach is applied to other models, tasks (e.g., Transformer, NLP)? ● The architecture search space is limited to certain models. ○ e.g. How to apply the method to models such as HRNet? 24
  • 25. Conclusion ● Once-for-all(OFA) Network allows training one large model and deploying various sub-networks without additional training. ● OFA suggests Progressive Shrinking algorithm to share and find sub-networks, which highly reduces the design cost. ● The paper shows OFA can achieve higher performance with ImageNet dataset. ● With a trained OFA network, optimal sub-networks can be found on various deployment environments. 25