[PR12] PR-063: Peephole predicting network performance before training
1. Paper reviewed by Taegyun Jeon
Peephole: Predicting Network
Performance Before Training
Boyang Deng, Junjie Yan, Dahua Lin,
“Peephole: Predicting Network Performance Before Training” (2017)
https://arxiv.org/abs/1712.03351
[TensorFlow-KR] PR12
2. Background | How do we achieve high performance?
▪ Conclusion: use a good network.
[PR12] Peephole: Predicting Network Performance Before Training (2017) Page 2
3. Background | How do we obtain a good network?
▪ Two considerations
▫ Large design space
• For Convolutional Neural Networks (CNN)
◦ the number of layers
◦ the number of channels within these layers
◦ whether to insert a pooling layer at certain points
▫ Costly training process
• Z. Zhong, J. Yan, and C. L. Liu. “Practical network blocks design with q-learning”. arXiv preprint
arXiv:1708.05552, 2017.
• B. Zoph and Q. V. Le. “Neural architecture search with reinforcement learning”. arXiv preprint
arXiv:1611.01578, 2016.
• B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le. “Learning transferable architectures for scalable
image recognition.” arXiv preprint arXiv:1707.07012, 2017.
4. Problem definition | Predicting model performance
5. Idea | Learn “performance as a function of network architecture”
𝑦 = 𝑓(𝑥, 𝑡)
6. Proposal | Representing the “network architecture”
▪ Unified Layer Code and Layer Embedding
▫ Integer code: TY, KW, KH, CH
• index of 8-bins: CH = [0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 2.5, 3.0]
▫ Layer embedding
• Hidden state of LSTM cell: structural features
• Epoch index: embedded into real-vector
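The integer code above can be sketched in a few lines. This is only an illustration under stated assumptions: CH is taken to be the output-to-input channel ratio, it is mapped to the nearest of the 8 bin values, and the type IDs in `LAYER_TYPES` are hypothetical because the slide does not list the actual mapping.

```python
# Bin values for CH, copied from the slide. CH is stored as an index into this list.
CH_BINS = [0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 2.5, 3.0]

# Hypothetical type IDs (assumption); the slide does not give the real mapping.
LAYER_TYPES = {"conv": 0, "max_pool": 1, "avg_pool": 2}


def encode_layer(layer_type, kernel_w, kernel_h, in_channels, out_channels):
    """Return the integer code (TY, KW, KH, CH) for one layer."""
    ratio = out_channels / in_channels
    # Assumption: choose the bin whose value is closest to the channel ratio.
    ch_index = min(range(len(CH_BINS)), key=lambda i: abs(CH_BINS[i] - ratio))
    return (LAYER_TYPES[layer_type], kernel_w, kernel_h, ch_index)


def encode_network(layers):
    """Encode a list of layer descriptions into a sequence of integer codes."""
    return [encode_layer(*layer) for layer in layers]
```

A sequence produced by `encode_network` is the kind of input that would then be embedded and fed to the LSTM, one layer code per step.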
7. Proposal | Data on “network architecture” and “performance”
▪ Naive idea
▫ Random sampling sequences of layers
• The design space grows exponentially as the number of layers increases.
• Many combinations of layers are not reasonable options from a practical point of view.
▪ Block-based generation
▫ Skeleton + generated blocks
▫ One block contains fewer than 10 layers
• The first layer is a convolution layer with a random kernel size.
▫ Markov chain
• Uses predefined transition probabilities derived from practical networks
▫ Restrict the number of convolution layers within a block to fewer than 4
▫ 1x1 convolution for dimension matching
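The block-generation rules above can be sketched as a small sampler. Everything numeric here is an assumption for illustration: the transition table, the set of layer types, the kernel sizes, and the way the block length is drawn are all hypothetical, since the slides only say that the probabilities are predefined from practical networks.

```python
import random

# Hypothetical transition probabilities (assumption, not from the slides).
TRANSITIONS = {
    "conv": {"conv": 0.2, "bn": 0.4, "relu": 0.3, "pool": 0.1},
    "bn":   {"conv": 0.2, "relu": 0.7, "pool": 0.1},
    "relu": {"conv": 0.6, "bn": 0.1, "pool": 0.3},
    "pool": {"conv": 0.7, "bn": 0.2, "relu": 0.1},
}
KERNEL_SIZES = [1, 3, 5]  # hypothetical choices for the random kernel size
MAX_LAYERS = 9            # a block contains fewer than 10 layers
MAX_CONVS = 3             # fewer than 4 convolution layers per block


def sample_block(rng):
    """Sample one block as a list of (layer_type, kernel_size or None)."""
    length = rng.randint(1, MAX_LAYERS)  # assumption: uniform block length
    # The first layer is always a convolution with a random kernel size.
    block = [("conv", rng.choice(KERNEL_SIZES))]
    n_convs = 1
    while len(block) < length:
        probs = dict(TRANSITIONS[block[-1][0]])
        if n_convs >= MAX_CONVS:
            probs.pop("conv", None)  # enforce the convolution limit
        kinds = list(probs)
        kind = rng.choices(kinds, weights=[probs[k] for k in kinds])[0]
        if kind == "conv":
            block.append(("conv", rng.choice(KERNEL_SIZES)))
            n_convs += 1
        else:
            block.append((kind, None))
    return block
```

Calling `sample_block(random.Random(0))` returns one candidate block; the 1x1 convolution for dimension matching and the insertion into the skeleton are left out of this sketch.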
8. Proposal | Existing network architectures = Markov chain
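9. Proposal | Dataset of “X: network, Y: performance”
▪ Dataset composition
▫ N networks: $\{x_i\}_{i=1}^{N}$
▫ Performance curves $y_i(t)$
• Validation accuracy measured on the validation data at epoch $t$ while training on the training data
▫ $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N}$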
▪ Objective function with smooth L1 loss
▫ $\mathcal{L}(\mathcal{D}; \theta) = \frac{1}{N} \sum_{i=1}^{N} l\big(f(x_i, T),\, y_i(T)\big)$
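As a concrete reading of the objective, the sketch below averages a smooth L1 loss over predicted and observed final accuracies. It assumes the commonly used smooth L1 definition (quadratic for small errors, linear beyond a threshold `beta`); the slide does not state the threshold, so `beta=1.0` is an assumption, and `predictions` stands in for the values $f(x_i, T)$.

```python
def smooth_l1(diff, beta=1.0):
    """Smooth L1 loss for one error value; beta=1.0 is an assumed threshold."""
    a = abs(diff)
    if a < beta:
        return 0.5 * a * a / beta  # quadratic region near zero
    return a - 0.5 * beta          # linear region for large errors


def objective(predictions, targets):
    """Mean smooth L1 loss between predicted and observed final accuracies."""
    if not predictions or len(predictions) != len(targets):
        raise ValueError("predictions and targets must be non-empty and equal in length")
    total = sum(smooth_l1(p - y) for p, y in zip(predictions, targets))
    return total / len(targets)
```

Since accuracies lie in [0, 1], errors stay inside the quadratic region under this assumed threshold, so the loss behaves like a halved squared error here.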
10. Experiments | What should be learned?
11. Experiments | What should be learned?
12. Experiments | What should be learned?
▪ Comparison
▫ Bayesian Neural Networks and ν-SVR (Support Vector Regression)
▪ Evaluation metrics
▫ Mean Squared Error (MSE)
▫ Kendall’s Tau (Tau)
▫ Coefficient of Determination (R²)
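For reference, the three metrics can be computed as in the sketch below. It uses the textbook definitions in plain Python; Kendall's Tau is implemented as the simple tau-a variant (tied pairs count as neither concordant nor discordant), which is an assumption since the slide does not say which variant is used.

```python
def mse(pred, true):
    """Mean squared error between predicted and observed accuracies."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)


def kendall_tau(pred, true):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(true)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (pred[i] - pred[j]) * (true[i] - true[j])
            if s > 0:
                score += 1  # pair ranked in the same order
            elif s < 0:
                score -= 1  # pair ranked in the opposite order
    return score / (n * (n - 1) / 2)


def r2(pred, true):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(true) / len(true)
    ss_res = sum((t - p) ** 2 for p, t in zip(pred, true))
    ss_tot = sum((t - mean) ** 2 for t in true)
    return 1 - ss_res / ss_tot
```

MSE and R² measure how close the predicted accuracies are to the observed ones, while Kendall's Tau only checks whether networks are ranked in the right order, which is what matters when the predictor is used to pick architectures.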
13. Experiments | Transfer to ImageNet
14. Conclusion
▪ Block-based generation
▫ Skeleton + generated blocks
▪ What about experiments on other components?
▫ Residual blocks, dense connections, etc.
▪ In the end, evaluation still requires training under every setting
▪ Is this the best approach for transfer learning?