
# GNR638_project ppt.pdf

GNR 638 IIT Bombay project presentation


1. GNR638 Course Project: Kervolutional Neural Networks. Nov 21, 2021. Sahasra Ranjan (190050102), Paarth Jain (190050076), Atul Verma (19B090004), Tirthankar Adhikari (190070003), Shrey Gupta (190100112)
2. Introduction ➢ Convolutional neural networks (CNNs) have been tremendously successful in computer vision, e.g. image recognition and object detection. ➢ However, convolution itself cannot express non-linear behaviour; an activation function adds non-linearity, but only pointwise. The paper therefore introduces kervolution, which uses the kernel trick to bring non-linearity into the operation itself.
3. Recent Approaches to the Problem: a minimal character-based CNN model: https://arxiv.org/ftp/arxiv/papers/1901/1901.06032.pdf https://www.analyticsvidhya.com/blog/2020/10/what-is-the-convolutional-neural-network-architecture/
4. Our Implementation of the Solution ● We implemented our model with kervolutional layers in PyTorch. ● With a linear kernel the layer is an ordinary CNN, but in our implementation we switched the kernel type between polynomial and Gaussian to introduce non-linearity, which in turn gave better performance.
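To make the switchable-kernel idea concrete, here is a minimal single-channel NumPy sketch, not our actual PyTorch layer; `kerv2d` and its parameter names are hypothetical, while `cp`, `dp`, and `gamma` follow the paper's notation (polynomial bias, polynomial degree, Gaussian width):

```python
import numpy as np

def kerv2d(image, weight, kernel="linear", cp=1.0, dp=3, gamma=1.0):
    """Single-channel 2D kervolution (illustrative sketch).

    Slides the filter window over `image` and replaces the usual
    inner product <x, w> with a kernel function k(x, w).
    """
    kh, kw = weight.shape
    H, W = image.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    w = weight.ravel()
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            x = image[i:i + kh, j:j + kw].ravel()  # local patch x(i)
            if kernel == "linear":        # plain convolution: <x, w>
                out[i, j] = x @ w
            elif kernel == "polynomial":  # (<x, w> + cp) ** dp
                out[i, j] = (x @ w + cp) ** dp
            elif kernel == "gaussian":    # exp(-gamma * ||x - w||^2)
                out[i, j] = np.exp(-gamma * np.sum((x - w) ** 2))
            else:
                raise ValueError(f"unknown kernel: {kernel}")
    return out
```

With `kernel="linear"` this reduces exactly to the cross-correlation a CNN computes, so switching the kernel type changes only the similarity measure, not the number of parameters.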
5. Dataset and Features: MNIST, CIFAR10
6. Baseline Model: Kervolution ● The ith element of the convolution output f(x) is a simple inner product between the patch vector x(i) and the filter vector w. ● Kervolution instead applies the kernel trick: the kernel implicitly maps the vectors into a non-linear feature space and takes the inner product there.
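The kernel trick can be verified numerically: for a degree-2 polynomial kernel (taking cp = 0 for brevity), the kernel value equals an inner product after an explicit feature map. The helper name `phi2` is hypothetical:

```python
import numpy as np

def phi2(v):
    """Explicit degree-2 polynomial feature map: all pairwise products
    v_a * v_b, so that <phi2(x), phi2(w)> == (<x, w>) ** 2."""
    return np.outer(v, v).ravel()

x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -1.0, 2.0])

kernel_value = (x @ w) ** 2   # kernel trick: no explicit mapping needed
explicit = phi2(x) @ phi2(w)  # inner product in the mapped feature space
```

The kernel evaluates the same non-linear similarity without ever constructing the higher-dimensional features, which is why kervolution adds capacity without adding parameters.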
7. Model Capacity and Features ● The kernel function takes kervolution into a non-linear space, so model capacity is increased without introducing extra parameters. ● Kervolution measures similarity via match kernels, which are equivalent to extracting specific features. ● One advantage of kervolution is that the non-linear properties can be customized without explicit calculation.
8. Polynomial Kervolution ● To show the behaviour of polynomial kervolution, the learned filters of LeNet-5 trained on MNIST are visualized: all six channels of the first kervolutional layer using the polynomial kernel (dp = 3, cp = 1).
9. Continued... ● For comparison, the learned filters from the CNN are also presented. Interestingly, some of the learned filters of the KNN (kervolutional neural network) and the CNN are quite similar. This matches our understanding of the polynomial kernel as a combination of linear and higher-order terms. ● It also indicates that polynomial kervolution introduces higher-order feature interactions in a more flexible and direct way than existing methods.
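The "combination of linear and higher-order terms" can be checked directly with the binomial expansion of the polynomial kernel; `poly_kervolution` and `expanded` are hypothetical helper names:

```python
import numpy as np
from math import comb

def poly_kervolution(x, w, cp=1.0, dp=3):
    """Polynomial kernel as used in polynomial kervolution."""
    return (x @ w + cp) ** dp

def expanded(x, w, cp=1.0, dp=3):
    """Binomial expansion: sum_k C(dp, k) * cp^(dp-k) * <x, w>^k.
    The k = 1 term is the linear (convolution) part; k >= 2 are the
    higher-order interaction terms."""
    s = x @ w
    return sum(comb(dp, k) * cp ** (dp - k) * s ** k for k in range(dp + 1))

x = np.array([0.2, -0.5, 1.0])
w = np.array([1.0, 0.3, -0.4])
```

Because the expansion always contains the degree-1 term, filters that rely mostly on the linear part can end up resembling plain CNN filters, consistent with the visualization above.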
10. Gaussian Kervolution The Gaussian RBF kernel extends kervolution to infinite dimensions: k(x, w) = exp(−γg ||x − w||²), where γg (γg ∈ R+) is a hyperparameter controlling the smoothness of the decision boundary.
11. Continued... It extends kervolution to infinite dimensions because the kernel factors as exp(−γg||x||²) · exp(−γg||w||²) · exp(2γg⟨x, w⟩), and the Taylor series of the last factor, Σ_{i=0}^∞ (2γg)^i ⟨x, w⟩^i / i!, contains terms of every degree i.
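A quick numerical sanity check of that decomposition, truncating the Taylor series after a finite number of terms; `gaussian_kerv` and `taylor_approx` are hypothetical names:

```python
import numpy as np
from math import factorial

def gaussian_kerv(x, w, gamma=0.5):
    """Gaussian RBF kernel: exp(-gamma * ||x - w||^2)."""
    return np.exp(-gamma * np.sum((x - w) ** 2))

def taylor_approx(x, w, gamma=0.5, terms=20):
    """exp(-g||x||^2) * exp(-g||w||^2) * sum_i (2g <x, w>)^i / i!
    Each degree i contributes one term, hence the infinite-dimensional
    feature space."""
    s = x @ w
    series = sum((2 * gamma * s) ** i / factorial(i) for i in range(terms))
    return np.exp(-gamma * (x @ x)) * np.exp(-gamma * (w @ w)) * series

x = np.array([0.3, -0.2, 0.5])
w = np.array([0.1, 0.4, -0.3])
```

With 20 terms the truncated series already agrees with the closed-form kernel to machine precision for these small inputs.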
12. Results on the MNIST Dataset ● Test accuracy (trained for 5 epochs): ● Convolution: 98.1% ● Poly-linear-linear: 98.4% ● Linear-poly-linear: 98.47%
13. Graph showing faster training with kervolution
14. ● Other Results:
15. Conclusions & Future Work ● Kervolution generalises convolution to non-linear space, ● extending convolutional neural networks to kervolutional neural networks. ● It not only retains the advantages of convolution (weight sharing and equivariance to translation) but also enhances model capacity and captures higher-order feature interactions via patch-wise kernel functions, without introducing additional parameters.
16. Future Work: Continued... ● With a carefully chosen kernel, the performance of a CNN can be significantly improved on the MNIST, CIFAR, and ImageNet datasets by replacing convolutional layers with kervolutional layers. ● Because there are so many possible kervolutions, a brute-force search over all choices is infeasible. ● We expect that introducing kervolutional layers into more architectures, together with extensive hyperparameter searches, can further improve performance.
17. Individual Contribution & Code ● Sahasra Ranjan (190050102): worked on the kervolutional neural networks and implemented the training procedure on GPU using PyTorch. ● Paarth Jain (190050076): worked on the training procedure and generated results for various hyperparameters and network settings. ● Atul Verma (19B090004): prepared the presentation and project report. ● Tirthankar Adhikari (190070003): debugged the implemented code and prepared the presentation. ● Shrey Gupta (190100112)
18. Github Repository Link for final code, README files, and results: GitHub Repo: https://github.com/Lhisoka/GNR-638-Project Project PPT: https://docs.google.com/presentation/d/1-VgwYgyPi4UW1CoTHDgVi7EISm5AbeZPVu62bCwqDsg/edit?usp=sharing Note: All of our code is based on the following paper: https://openaccess.thecvf.com/content_CVPR_2019/papers/Wang_Kervolutional_Neural_Networks_CVPR_2019_paper.pdf
19. Given the recent rapid development in this field, much remains to be explored.
20. Thank You!