Overview of Hinton's capsule networks, including vector and matrix capsules.

Published in: Technology

- 1. calculation | consulting capsule networks (TM) c|c (TM) charles@calculationconsulting.com
- 2. calculation|consulting capsule networks (TM) charles@calculationconsulting.com
- 3. Capsule networks by Hinton
- 4. Capsule networks by Hinton
- 5. Where ConvNets come from: LeNet 5. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE 86(11): 2278–2324, 1998.
- 6. Convolutions, usually w/ max pooling: we get gross spatial invariance by ignoring exactly where a feature occurs. “A vision system needs to use the same knowledge at all locations in the image” (Hinton). ConvNet: share weights + max pooling
- 7. Hierarchical model of the visual system: HMax model, Riesenhuber and Poggio (1999). Dotted line selects max pooled features from lower layer
- 8. Hierarchical model of the visual system: pooling proposed by Hubel and Wiesel in 1962. A. Receptive field (RF) of simple cell (green) formed by pooling over (center-surround) cells (yellow) in the same orientation row. B. RF of complex cell (green) formed by pooling over simple cells. Here: (crude) translation invariance
- 9. Hierarchical model of the visual system: ConvNets resemble hierarchical models (but notice the hyper-column). HMax model, Riesenhuber and Poggio (1999)
- 10. Hinton: why is max pooling bad? (If) the brain embeds things in rectangular space, then translation is easy; rotation is hard. Experiment: time for mind to process rotation ~ amount of rotation. Conv Nets: crude translation invariance; no explicit pose (orientation) information; cannot distinguish left from right (actually some people have stopped using pooling). A vision system needs to use the same knowledge at all locations in the image
- 11. 2 streams hypothesis: what and where. Ventral: what objects are. Dorsal: where objects are in space. How do we know? Neurological disorders. Simultanagnosia: can only see one object at a time. Idea dates back to 1968; lots of other evidence as well. https://www.youtube.com/watch?v=mCoYOFzSS9A
- 12. Cortical microcolumns: capsules may encode orientation, scale, velocity, color, … Column through cortical layers of the brain; 80-120 neurons (2X long in V1); share the same receptive field; part of Hubel and Wiesel, Nobel Prize 1981. Also see recent review: https://www.sciencedirect.com/science/article/pii/S0166223615001484
- 13. Canonical object based frames of reference: Hinton 1981. Hinton has been thinking about this a long time. A kind of inverse computer graphics
- 14. Capsule networks: inverse computer graphics. Computer graphics: rendering engine. Capsule network: inverse graphics, matrix of pose information. Hinton proposes that our brain does a kind of inverse computer graphics transformation.
- 15. Invariance vs Equivariance: max pooling provides spatial invariance, but Hinton argues we need spatial equivariance, so use vectors and affine transformations. Invariance: similar results if image is shifted or rotated. Equivariance: invariance under symmetry transformations (S, A, …). Group homomorphism: f(g*x) = g*f(x) = f(x)*g^-1. Geometric: i.e. triangle centers invariant under Similarity (S); centroid invariant under Affine (A). Statistics: mean: invariant under change of units; median: more generally invariant; a better statistic
- 16. Segmenting highly overlapping objects. Explaining away: even if two hidden causes are independent, they can become dependent when we observe an effect that they can both influence. (Hinton)
- 17. Capsule networks: architecture + unsupervised | reconstruction loss; supervised | max norm loss. Hinton et al., Dynamic Routing Between Capsules (2017)
- 18. Capsule networks by Hinton: conv2D. Keep first convolutional layer, but replace max pooling with …
- 19. Capsule networks by Hinton: conv2D. Reshape conv2d into primary capsule vectors (red), and replace max pooling with routing-by-agreement algo
- 20. Capsule networks by Hinton: “Active capsules at one level (red) make predictions, via transformation matrices, for the instantiation parameters of higher-level capsules (blue). When multiple predictions agree, a higher level capsule (blue) becomes active” conv2D
- 21. Primary layer: Conv2D reshaped. Keras implementation: https://github.com/XifengGuo/CapsNet-Keras
- 22. Capsule networks: encodes poses. Capsules can represent objects w/ different poses (3D orientations). Latest results (matrix capsules, below) reduce the number of test errors on smallNORB by 45% relative to the previous best
- 23. Capsules capture visual features: “A capsule is a group of neurons whose outputs represent different properties of the same entity.” Capsules encode SIFT-like features. Perturbing an image causes specific capsules to activate
- 24. Place-coding vs Rate-coding. Place-coding: convNet w/out pooling; low level features for small receptive fields; when a part moves, it may get a new capsule; position maps to active capsules (u) in primary layer. Rate-coding: traditional neurological way of coding (1926); stimulus info encoded in rate of firing (as opposed to magnitude, population, timing, …); when a part rotates or moves, the capsule values change; maps to real values of capsule output vectors (v); rates encoded in vector values. Aside: are ReLUs a kind of rate coding?
- 25. Hierarchy of parts: coupled layers. A higher level entity is present if the lower / primary layer capsules agree on their predictions for its pose.
- 26. Routing algo: some pose prose. Stuff Hinton says: An effective way to implement the “explaining away” that is needed for segmenting highly overlapping objects. Like an attention mechanism: the competition … is between the higher-level capsules that a lower-level capsule might send its vote to. A capsule is activated only if the transformed poses coming from the layer below match each other. This is a more effective way to capture covariance and leads to models with many fewer parameters that generalize better. …a powerful segmentation principle that allows knowledge of familiar shapes to drive segmentation, rather than just using low-level cues such as proximity or agreement in color or velocity.
- 27. Data-specific dynamic routes. Equation labels: prediction, weighted sum (weighted mean), squash, softmax. “c_ij are determined by an iterative dynamic routing process”
- 28. Capsule vs traditional neuron. https://github.com/naturomics/CapsNet-Tensorflow
- 29. Capsule: affine transformation. Primary rectangle and triangle capsules (prediction vectors) routed to boat and house capsules (parent layer), and then routes pruned. “CapsNet is moderately robust to small affine transformations of the training data”
- 30. Capsule: squashing function. Length of the capsule vector ~ probability of the entity represented by the capsule. https://medium.com/ai%C2%B3-theory-practice-business/understanding-hintons-capsule-networks-part-ii-how-capsules-work-153b6ade9f66
- 31. Routing by agreement: algo selects data-specific routes b_ij by matching primary outputs and squashed (secondary) outputs. First paper uses vector overlap / cosine distance to find cluster centers: ok, but cannot tell great from good. Second paper (matrix capsules) uses a Free Energy cost function
- 32. Routing algorithm: how can we implement it in Backprop? Fixed point equation
- 33. Routing algo: EM. Fixed point equation in forward pass of Backprop (like an EM step); must terminate to take dW; dot product ~ log likelihood (Energy*). *Similar to fixed point equation for TAP Free Energy in the EMF RBM. **And in the later matrix capsule paper, a Free Energy is used explicitly
- 34. Routing algo: fixed point unwound (3 steps). Similar to a 3 layer FCN w/ shared weights. W = 0
- 35. Routing algorithm: keras Layers. https://keras.io/layers/writing-your-own-keras-layers/
- 36. Routing algo: keras
- 37. Routing algo: matrix capsules. Cosine distance -> Free Energy cost: EM to find mean, variance, and mixing proportion of Gaussians. cluster score = [log p(x_i | mixture) - log p(x_i | uniform)]. “data-points that form a tight cluster from the perspective of one capsule may be widely scattered from the perspective of another capsule”. p(x | mixture) (subscripts i, h)
- 38. Matrix capsules: after 3 EM iterations. Recent results from matrix capsule paper (more later)
- 39. Capsule networks: architecture + unsupervised | reconstruction loss; supervised | multi-label max-norm loss. Each digit capsule ~ single digit for MNIST data; |v| ~ Prob(digit); image size
- 40. From max pool to max |vector|: mask selects (squashed) max vector (by length); does not throw away position information; inputs vector into Fully Connected Net; reconstructs the image from the vector; similar to a variational auto-encoder
- 41. From max pool to max |vector|
- 42. Reconstruction error: a regularizer
- 43. Reconstruction: overlapping images. Individual (8, 6) reconstructed after removing a specific capsule, and does not reconstruct absent (0, 1); trained on overlapping MNIST images like (8,1), (6,7); does have trouble with close images (like humans). https://www.youtube.com/watch?v=gq-7HgzfDBM&t=62s
- 44. Matrix capsules: Nov 2017. Capsule vectors -> matrices; cosine distance -> Free Energy cost function (Gaussian mixtures); + convolutions between layers; + lots more details … for another video
- 45. c|c charles@calculationconsulting.com
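
Slides 27 and 30 to 34 describe the squashing function and routing by agreement only through figure labels (prediction, weighted sum, squash, softmax, 3 unwound steps). The following is a minimal NumPy sketch of that loop, not the code from the linked Keras or TensorFlow repositories. The function names (`squash`, `softmax`, `route`), the array layout, and the toy input are assumptions made for illustration; the update rule follows the vector-capsule description of dot-product agreement between predictions and squashed outputs.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Shrink a vector so its length lies in [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(b, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(b - b.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def route(u_hat, n_iters=3):
    """Routing by agreement (sketch).

    u_hat: prediction vectors with shape (n_in, n_out, dim), i.e. what each
           lower capsule i predicts for each parent capsule j.
    Returns the parent outputs v (n_out, dim) and coupling coefficients c (n_in, n_out).
    """
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                    # routing logits b_ij, start at 0
    for _ in range(n_iters):
        c = softmax(b, axis=1)                     # each lower capsule splits its vote over parents
        s = np.einsum('ij,ijd->jd', c, u_hat)      # weighted sum of predictions per parent
        v = squash(s)                              # parent output, length ~ probability
        b = b + np.einsum('ijd,jd->ij', u_hat, v)  # agreement: dot product of prediction and output
    return v, c

# Toy input (hypothetical): 4 lower capsules, 2 parents, 3-dimensional poses.
# All lower capsules agree on parent 0 and disagree (cancel out) on parent 1.
u_hat = np.zeros((4, 2, 3))
u_hat[:, 0, :] = [1.0, 0.0, 0.0]
u_hat[0::2, 1, :] = [0.0, 1.0, 0.0]
u_hat[1::2, 1, :] = [0.0, -1.0, 0.0]

v, c = route(u_hat)
lengths = np.linalg.norm(v, axis=-1)
```

In this toy case the parent whose predictions agree ends up with the longer output vector and receives most of the coupling weight, which is the "explaining away" behaviour the slides attribute to routing.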

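
Slides 17, 39 and 40 mention a supervised "max norm" loss on capsule lengths and a mask that selects the longest capsule vector for reconstruction, without giving formulas. The sketch below assumes the margin loss from the Dynamic Routing Between Capsules paper, with its constants m+ = 0.9, m- = 0.1 and down-weighting 0.5; these values come from that paper, not from the slides. The function names `margin_loss` and `mask_longest` and the toy arrays are hypothetical.

```python
import numpy as np

def margin_loss(v, targets, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Margin loss on capsule lengths (sketch).

    v:       capsule outputs, shape (batch, n_classes, dim)
    targets: one-hot (or multi-hot) labels, shape (batch, n_classes)
    """
    lengths = np.linalg.norm(v, axis=-1)  # |v| ~ Prob(class present)
    present = targets * np.maximum(0.0, m_plus - lengths) ** 2
    absent = lam * (1.0 - targets) * np.maximum(0.0, lengths - m_minus) ** 2
    return np.mean(np.sum(present + absent, axis=-1))

def mask_longest(v):
    """Keep only the longest capsule vector per example and flatten the result.

    The flattened array is what a reconstruction (decoder) network would take as input.
    """
    lengths = np.linalg.norm(v, axis=-1)
    idx = np.argmax(lengths, axis=-1)       # index of the longest capsule
    mask = np.eye(v.shape[1])[idx]          # one-hot mask, shape (batch, n_classes)
    flat = (v * mask[..., None]).reshape(v.shape[0], -1)
    return flat, idx

# Toy batch (hypothetical): 1 example, 3 class capsules, 2-dimensional poses.
v_demo = np.array([[[0.95, 0.0], [0.05, 0.0], [0.0, 0.05]]])
loss_right = margin_loss(v_demo, np.array([[1.0, 0.0, 0.0]]))
loss_wrong = margin_loss(v_demo, np.array([[0.0, 1.0, 0.0]]))
flat, idx = mask_longest(v_demo)
```

Because the mask zeroes the other capsules rather than pooling over them, the pose values of the selected capsule are passed on intact, which is the point slide 40 makes about not throwing away position information.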