A biologically-motivated approach to computer vision

Thomas Serre




McGovern Institute for Brain Research
Department of Brain & Cognitive Sciences
Massachusetts Institute of Technology
The problem: invariant recognition in natural scenes

• Object recognition is hard!
• Our visual capabilities are computationally amazing
• Reverse-engineer the visual system and build machines that see and interpret the visual world as well as we do
Computer vision successes: face detection
The recipe

Lots of simple features + fancy classifier + lots of training examples

[Slide figure: page excerpts from Viola & Jones ’01]

Table 1: The AdaBoost algorithm for classifier learning. Each round of boosting selects one feature from the 180,000 potential features.
• Given example images labeled as negative or positive, initialize the weights according to the number of negatives and positives.
• For each round of boosting:
  1. Normalize the weights so that they form a probability distribution.
  2. For each feature, train a classifier that is restricted to using a single feature. The error is evaluated with respect to the weights.
  3. Choose the classifier with the lowest error.
  4. Update the weights, depending on whether each example is classified correctly.
• The final strong classifier combines the classifiers chosen in each round.

Figure 3: The first and second features selected by AdaBoost. The two features are shown in the top row and then overlaid on a typical training face in the bottom row. The first feature measures the difference in intensity between the region of the eyes and a region across the upper cheeks. The feature capitalizes on the observation that the eye region is often darker than the cheeks. The second feature compares the intensities in the eye regions to the intensity across the bridge of the nose.

Figure 5: Example of frontal upright face images used for training.

The Attentional Cascade: a cascade of classifiers achieves increased detection performance while radically reducing computation time. The key insight is that smaller, and therefore more efficient, boosted classifiers can be constructed which reject many of the negative sub-windows while detecting almost all positive instances. Simpler classifiers reject the majority of sub-windows before more complex classifiers are called upon to achieve low false positive rates.

Results quoted in the excerpt:
• In initial experiments, a frontal face classifier constructed from 200 features yields a detection rate of 95%.
• A 38 layer cascaded classifier was trained to detect frontal upright faces.
• The face training set consisted of 4916 hand labeled faces scaled and aligned to a base resolution of 24 by 24 pixels. The non-face sub-windows came from 9544 images which were manually inspected and found to not contain any faces.
• Evaluated on the MIT+CMU test set [12], an average of 10 features out of a total of 6061 are evaluated per sub-window, because a large majority of sub-windows are rejected by the first or second layer in the cascade.
• On a 700 MHz Pentium III processor, the face detector can process a 384 by 288 pixel image in about .067 seconds. This is roughly 15 times faster than the Rowley-Baluja-Kanade detector [12] and about 600 times faster than the Schneiderman-Kanade detector [15].

Face detection: Schneiderman & Kanade ’99; Viola & Jones ’01
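The boosting loop in Table 1 of the recipe slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The slide image does not preserve the formulas, so the weight initialization, the beta update and the final threshold used here follow a common textbook form of discrete AdaBoost and should be read as assumptions. The single-feature classifiers are plain threshold stumps on toy numeric features rather than rectangle features, and all function names and data are hypothetical.

```python
import math

def stump_predict(stump, x):
    """Predict 1 when polarity * feature value < polarity * threshold, else 0."""
    j, theta, polarity, _ = stump
    return 1 if polarity * x[j] < polarity * theta else 0

def train_stump(X, y, w):
    """Steps 2-3: try every single-feature stump, keep the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for theta in sorted({x[j] for x in X}):
            for polarity in (1, -1):
                candidate = (j, theta, polarity, 0.0)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(candidate, xi) != yi)
                if best is None or err < best[3]:
                    best = (j, theta, polarity, err)
    return best

def adaboost(X, y, rounds):
    n_neg, n_pos = y.count(0), y.count(1)
    # Assumed initialization: each class starts with half of the total weight.
    w = [1.0 / (2 * n_pos) if yi == 1 else 1.0 / (2 * n_neg) for yi in y]
    ensemble = []
    for _ in range(rounds):
        total = sum(w)
        w = [wi / total for wi in w]              # step 1: normalize
        stump = train_stump(X, y, w)              # steps 2-3: best single feature
        err = min(max(stump[3], 1e-10), 1 - 1e-10)  # keep beta finite
        beta = err / (1.0 - err)
        # Step 4 (assumed form): shrink the weights of correctly classified examples.
        w = [wi * beta if stump_predict(stump, xi) == yi else wi
             for xi, yi, wi in zip(X, y, w)]
        ensemble.append((math.log(1.0 / beta), stump))
    return ensemble

def strong_classify(ensemble, x):
    """Weighted vote of the selected stumps, thresholded at half the total weight."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0.5 * sum(alpha for alpha, _ in ensemble) else 0

# Toy data: feature 0 is uninformative, feature 1 separates the two classes.
X = [[5, 1], [3, 2], [8, 3], [4, 7], [6, 8], [2, 9]]
y = [0, 0, 0, 1, 1, 1]
ensemble = adaboost(X, y, rounds=3)
predictions = [strong_classify(ensemble, x) for x in X]
print(predictions)
```

On the toy data the first round selects a stump on feature 1, which mirrors the slide's point that each round of boosting selects one feature from the pool.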
• Face detection: 10K-1M training examples (Schneiderman & Kanade ’99; Viola & Jones ’01)
• Car detection: over 100K training examples (Schneiderman & Kanade ’99)
• Pedestrian detection: over 1K training examples (Dalal & Triggs ’05)
What’s wrong with this picture?

• Tens of thousands of manually annotated training examples
• ~30,000 object categories (Biederman, 1987)
• Approach unlikely to scale up ...




One-shot learning in humans
By age 6, a child knows 10-30K categories
What are the computational mechanisms underlying this amazing feat?

 1. Organization of the visual system
 2. Computational model of the visual cortex
 3. Application to computer vision

source: cerebral cortex
Hierarchical architecture: Anatomy
Rockland & Pandya ’79; Maunsell & Van Essen ’83; Felleman & Van Essen ’91
Hierarchical architecture: Latencies
Nowak & Bullier ’97; Schmolesky et al ’98
source: Thorpe & Fabre-Thorpe ’01
Hierarchical architecture: Function
ventral visual stream




Hierarchical architecture: Function
Hubel & Wiesel 1959, 1962, 1965, 1968 (Nobel prize 1981)
simple cells, complex cells
Hierarchical architecture: Function
Gradual increase in complexity of preferred stimulus
Kobatake & Tanaka 1994; see also Oram & Perrett 1993; Sheinberg & Logothetis 1996; Gallant et al 1996; Riesenhuber & Poggio 1999
Hierarchical architecture: Function
Parallel increase in invariance properties (position and scale) of neurons
Kobatake & Tanaka 1994; see also Oram & Perrett 1993; Sheinberg & Logothetis 1996; Gallant et al 1996; Riesenhuber & Poggio 1999
Hierarchical architecture: Function
Hung* Kreiman* Poggio & DiCarlo 2005

• Invariant object recognition in IT:
 • Robust invariant readout of category information from small population of neurons
• Single spikes after response onset carry most of the information
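To make "readout of category information from a small population of neurons" concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration: the population size, firing rates and noise level are invented, and a nearest-centroid decoder stands in for whatever classifier the cited study used, which the slide does not state. No recorded data is reproduced.

```python
import random

random.seed(0)
N_NEURONS = 32   # hypothetical population size
N_TRIALS = 40    # hypothetical number of trials per category
NOISE_SD = 2.0   # hypothetical trial-to-trial variability

# Hypothetical mean firing rate of each neuron for two object categories.
mean_rates = {
    "category_a": [random.uniform(5, 15) for _ in range(N_NEURONS)],
    "category_b": [random.uniform(5, 15) for _ in range(N_NEURONS)],
}

def simulate_trials(means, n_trials):
    """Draw noisy population responses around the given mean rates."""
    return [[random.gauss(m, NOISE_SD) for m in means] for _ in range(n_trials)]

def centroid(trials):
    """Average response of each neuron across trials."""
    return [sum(column) / len(trials) for column in zip(*trials)]

def classify(response, centroids):
    """Assign the category whose centroid is closest to the population response."""
    def squared_distance(label):
        return sum((r - c) ** 2 for r, c in zip(response, centroids[label]))
    return min(centroids, key=squared_distance)

train = {label: simulate_trials(m, N_TRIALS) for label, m in mean_rates.items()}
test = {label: simulate_trials(m, N_TRIALS) for label, m in mean_rates.items()}
centroids = {label: centroid(trials) for label, trials in train.items()}

correct = sum(classify(r, centroids) == label
              for label, trials in test.items() for r in trials)
accuracy = correct / (2 * N_TRIALS)
print(f"readout accuracy: {accuracy:.2f}")
```

The decoder is fit on one set of simulated trials and scored on a separate set, so the printed accuracy reflects generalization across trials rather than memorization.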
Hierarchical architecture: Feedforward processing
Thorpe Fize & Marlot ’96
What are the computational mechanisms used by brains to achieve this amazing feat?

 1. Organization of the visual system
 2. Computational model of the visual cortex
 3. Application to computer vision

source: cerebral cortex
Feedforward hierarchical model of object recognition

• Qualitative neurobiological models (Hubel & Wiesel ’58; Perrett & Oram ’93)
• Biologically-inspired (Fukushima ’80; Mel ’97; LeCun et al ’98; Thorpe ’02; Ullman et al ’02; Wersing & Koerner ’03)
• Quantitative neurobiological models (Wallis & Rolls ’97; Riesenhuber & Poggio ’99; Amit & Mascaro ’03; Deco & Rolls ’06)
[Figure: the model mapped onto the visual system. Left: dorsal stream ('where' pathway) and ventral stream ('what' pathway), from V1 through V2, V3, V4, PIT and AIT to prefrontal cortex. Right: model layers. Legend: simple cells, complex cells, tuning, main routes. Complexity (number of subunits), RF size and invariance increase up the hierarchy. Learning is supervised and task-dependent at the top of the hierarchy, and unsupervised and task-independent below.]

Model layer                                    RF size       Num. units
classification units (animal vs. non-animal)                 10^0
S4                                             7°            10^2
C3                                             7°            10^3
C2b                                            7°            10^3
S3                                             1.2°-3.2°     10^4
S2b                                            0.9°-4.4°     10^7
C2                                             1.1°-3.0°     10^5
S2                                             0.6°-2.4°     10^7
C1                                             0.4°-1.6°     10^4
S1                                             0.2°-1.1°     10^6

• Large-scale (10^8 units), spans several areas of the visual cortex
• Combination of forward and reverse engineering
• Shown to be consistent with many experimental data across areas of visual cortex
                                                                                                                                                         MAX                  Bypass routes




                                                                          Feedforward hierarchical
                                                                                           model
Simple units                    Complex units
 Template matching               Invariance
 Gaussian-like tuning            max-like operation
 ~ "AND"                         ~ "OR"

Selective pooling mechanisms
Riesenhuber & Poggio 1999 (building on Fukushima '80 and Hubel & Wiesel '62)
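The two unit types above can be illustrated with a minimal sketch. It assumes a simple unit computes a Gaussian function of the distance between its inputs and a stored template, and that a complex unit takes the maximum over simple units tuned to the same template at different positions. The function names, the `sigma` parameter, and the toy patches are hypothetical and are not taken from the slides.

```python
import math

def simple_unit(inputs, template, sigma=1.0):
    """Gaussian-like tuning: response is highest when inputs match the template."""
    sq_dist = sum((x - w) ** 2 for x, w in zip(inputs, template))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def complex_unit(afferent_responses):
    """Max-like pooling: respond as strongly as the most active afferent."""
    return max(afferent_responses)

# Hypothetical template and three input patches (e.g. three positions).
template = [1.0, 0.0, 1.0]
patches = [[0.0, 0.0, 0.0], [1.0, 0.0, 1.0], [0.5, 0.5, 0.5]]

# One simple unit per position, all tuned to the same template ("AND"-like:
# every input must be close to its template value for a strong response).
s_responses = [simple_unit(p, template) for p in patches]

# The complex unit pools over positions ("OR"-like: a match at any position
# is enough), which gives tolerance to where the pattern appears.
c_response = complex_unit(s_responses)
```

In this toy case the complex unit's response equals that of the best-matching simple unit, whichever position it sits at.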
Feedforward hierarchical model

• Large-scale (10^8 units), spans several areas of the visual cortex
• Combination of forward and reverse engineering
• Shown to be consistent with many experimental data across areas of visual cortex

Model layers            RF sizes        Num. units
classification units    -               10^0
S4                      7°              10^2
C3                      7°              10^3
C2b                     7°              10^3
S3                      1.2° - 3.2°     10^4
S2b                     0.9° - 4.4°     10^7
C2                      1.1° - 3.0°     10^5
S2                      0.6° - 2.4°     10^7
C1                      0.4° - 1.6°     10^4
S1                      0.2° - 1.1°     10^6

[Figure: model layers shown alongside a diagram of visual cortex. Ventral stream ('what' pathway): V1, V2, V3, V4, PIT, AIT, TE, areas 35 and 36, TG, STP, rostral STS (TPO, PGa, IPa, TEa, TEm), prefrontal cortex (areas 8, 11, 12, 13, 45, 46). Dorsal stream ('where' pathway): PO, V3A, MT, MST, FST, DP, VIP, LIP, 7a, PG. Classification units: animal vs. non-animal. Annotations: supervised, task-dependent learning at the top of the hierarchy (classification units); unsupervised, task-independent learning in the layers below; increase in complexity (number of subunits), RF size and invariance along the hierarchy. Legend: simple cells, complex cells; tuning, MAX; main routes, bypass routes.]
Basic circuit for the two operations

Both operations can be approximated by gain control circuits using shunting inhibition

Kouh & Poggio 2007; Knoblich, Bouvrie & Poggio 2007
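The slide does not give an equation for the circuit. As a rough sketch, one common way to write a gain control (divisive normalization) circuit is a weighted sum of powers of the inputs divided by a pooled power sum. The functional form below, the exponents `p`, `q`, `r`, and the constant `k` are assumptions made for illustration; different settings of the exponents give a tuning-like or a max-like response from the same circuit.

```python
def gain_control(x, w, p, q, r, k=1e-6):
    """Divisive normalization: weighted power sum over a pooled power sum.

    x holds non-negative input rates; p, q, r and k are assumed parameters.
    """
    numerator = sum(wj * xj ** p for wj, xj in zip(w, x))
    denominator = k + sum(xj ** q for xj in x) ** r
    return numerator / denominator

# Tuning-like regime (p=1, q=2, r=0.5): a normalized dot product that is
# largest when the input points in the same direction as the weights.
w = [0.6, 0.8]  # unit-length preferred pattern (hypothetical)
matched = gain_control([1.2, 1.6], w, p=1, q=2, r=0.5)
mismatched = gain_control([2.0, 0.0], w, p=1, q=2, r=0.5)

# Max-like regime (p=q+1, r=1, uniform weights): a softmax that approaches
# the largest input as q grows.
x = [0.2, 0.9, 0.5]
soft_max = gain_control(x, [1.0] * len(x), p=11, q=10, r=1)
```

With these settings the matched input gives a larger response than the mismatched one, and `soft_max` lands close to the largest element of `x`.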
[Figure: model layers with RF sizes (S1 0.2°, C1 0.4°, S2 0.6°, C2 1.1°, S2b 0.9°, S3 1.2°, C2b 7°, C3 7°, S4 7°) and PFC classification units (animal vs. non-animal), shown alongside visual areas V1, V2, V4, PIT, AIT and PFC; dorsal stream ('where' pathway) and ventral stream ('what' pathway). Legend: simple cells, complex cells; tuning, MAX.]




Learning and plasticity
Evidence for adult plasticity
• PFC, IT: very likely
• V4: likely
• V1/V2: limited evidence

[Figure: model layers S1 to S4 and PFC classification units (animal vs. non-animal), shown alongside visual areas V1, V2, V4, PIT, AIT and PFC.]




Learning and plasticity
Unsupervised developmental-like learning stage: frequent image features

[Figure: model layers S1 to S4 and PFC classification units (animal vs. non-animal), shown alongside visual areas V1, V2, V4, PIT, AIT and PFC.]
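The slide does not say how frequent image features are learned. One simple possibility, assumed here purely for illustration, is to store patches sampled at random positions from unlabeled images as templates: patterns that occur often are then more likely to be stored. The function name, parameters, and toy images below are hypothetical.

```python
import random

def sample_templates(images, patch_size, n_templates, seed=0):
    """Store square patches from random positions of unlabeled images."""
    rng = random.Random(seed)
    templates = []
    for _ in range(n_templates):
        img = rng.choice(images)
        rows, cols = len(img), len(img[0])
        # Pick a top-left corner so the patch fits inside the image.
        r = rng.randrange(rows - patch_size + 1)
        c = rng.randrange(cols - patch_size + 1)
        patch = [row[c:c + patch_size] for row in img[r:r + patch_size]]
        templates.append(patch)
    return templates

# Two toy 4x4 "images" given as nested lists; no class labels are used.
images = [
    [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1]],
    [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]],
]
templates = sample_templates(images, patch_size=2, n_templates=5)
```

Each stored patch could then serve as the template of a tuned unit. No labels enter this stage, which is what makes it unsupervised.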




                                                                                                           Tuning
                                                                                                           MAX




Learning and plasticity
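The slide does not say how the "frequent image features" are obtained. As a rough sketch only, one simple stand-in is to store randomly sampled patches from unlabeled images as unit templates, since patches that occur often are more likely to be drawn. The function name `sample_prototypes`, the patch size, and the random toy images below are all assumptions made for illustration, not the author's procedure.

```python
import numpy as np

def sample_prototypes(images, patch_size, n_prototypes, rng):
    """Store randomly sampled patches from unlabeled images as templates.

    Hypothetical helper: no labels are used, so this is an unsupervised,
    task-independent step.
    """
    prototypes = []
    for _ in range(n_prototypes):
        img = images[rng.integers(len(images))]        # pick a random image
        h, w = img.shape
        r = rng.integers(h - patch_size + 1)           # random top-left corner
        c = rng.integers(w - patch_size + 1)
        prototypes.append(img[r:r + patch_size, c:c + patch_size].copy())
    return np.stack(prototypes)

# Toy data: five random 16x16 "images" stand in for natural images.
rng = np.random.default_rng(0)
images = [rng.random((16, 16)) for _ in range(5)]
prototypes = sample_prototypes(images, patch_size=4, n_prototypes=8, rng=rng)
```

Each stored patch can then serve as the preferred stimulus of one tuned unit in a higher layer.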
Learning and plasticity

Learned V2/V4 units: stronger facilitation, stronger suppression

Unsupervised developmental-like learning stage: frequent image features

[Figure: feedforward hierarchical model, with ventral stream areas V1 to AIT shown alongside model layers S1 to S4 and the PFC classification units (animal vs. non-animal).]
Learning and plasticity

Beyond V4: combinations of those...

Unsupervised developmental-like learning stage: frequent image features

[Figure: feedforward hierarchical model, with ventral stream areas V1 to AIT shown alongside model layers S1 to S4 and the PFC classification units (animal vs. non-animal).]
Learning and plasticity

Supervised learning from a handful of training examples ~ linear perceptron

Unsupervised developmental-like learning stage: frequent image features

[Figure: feedforward hierarchical model, with ventral stream areas V1 to AIT shown alongside model layers S1 to S4 and the PFC classification units (animal vs. non-animal).]
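The supervised stage is described only as "~ linear perceptron" trained on a handful of examples. A minimal sketch of the classic perceptron rule follows, assuming labels of +1 (animal) and -1 (non-animal). The four two-dimensional feature vectors are made up for illustration and merely stand in for the responses of the model's top layer.

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Classic perceptron rule: update weights only on misclassified examples."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            if yi * (np.dot(w, xi) + b) <= 0:   # wrong side of (or on) the boundary
                w += lr * yi * xi
                b += lr * yi
                errors += 1
        if errors == 0:                          # every training example is correct
            break
    return w, b

def predict(X, w, b):
    """Return +1 or -1 for each row of X."""
    return np.where(np.asarray(X, dtype=float) @ w + b > 0, 1, -1)

# Hypothetical toy features: a handful of labeled examples.
X_train = np.array([[0.9, 0.1], [0.8, 0.3], [0.2, 0.9], [0.1, 0.7]])
y_train = np.array([1, 1, -1, -1])
w, b = train_perceptron(X_train, y_train)
labels = predict(X_train, w, b)
```

Because the readout is linear, only one weight per input feature has to be learned, which is one reason a small number of labeled examples can suffice when the features are already informative.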
Learning and sample complexity

[Figure: feedforward hierarchical model. Dorsal stream ('where' pathway) and ventral stream ('what' pathway) areas, from V1 up to prefrontal cortex, are shown alongside the model layers. Complexity (number of subunits), RF size and invariance increase from bottom to top. Side annotations: supervised, task-dependent learning at the top of the hierarchy; unsupervised, task-independent learning in the layers below. Legend: simple cells, complex cells, tuning, MAX, main routes, bypass routes.]

Model layers            RF sizes        Num. units
classification units                    10^0
S4                      7°              10^2
C3                      7°              10^3
C2b                     7°              10^3
S3                      1.2° - 3.2°     10^4
S2b                     0.9° - 4.4°     10^7
C2                      1.1° - 3.0°     10^5
S2                      0.6° - 2.4°     10^7
C1                      0.4° - 1.6°     10^4
S1                      0.2° - 1.1°     10^6
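The figure legend names two operations, tuning and MAX, alongside simple and complex cells. The sketch below pairs them in the usual way: a tuned unit responds most strongly when its input matches a stored template, and a pooling unit takes the maximum over several tuned units. The Gaussian form of the tuning function, the `sigma` parameter, and the toy vectors are assumptions; the slides name the operations but do not give formulas.

```python
import numpy as np

def tuning(x, template, sigma=1.0):
    """Tuned ('S'-like) unit: response peaks at 1.0 when x equals the template.

    The Gaussian shape is an assumption made for this sketch.
    """
    x = np.asarray(x, dtype=float)
    template = np.asarray(template, dtype=float)
    return float(np.exp(-np.sum((x - template) ** 2) / (2.0 * sigma ** 2)))

def max_pool(responses):
    """Pooling ('C'-like) unit: MAX over its afferent tuned units."""
    return float(np.max(responses))

# One template compared against inputs seen at three hypothetical positions.
template = np.array([1.0, 0.0, 1.0])
inputs = [
    np.array([0.0, 0.0, 0.0]),
    np.array([1.0, 0.0, 1.0]),   # the preferred pattern appears here
    np.array([0.0, 1.0, 0.0]),
]
s_responses = [tuning(x, template) for x in inputs]
c_response = max_pool(s_responses)

# Moving the preferred pattern to another position leaves the pooled response unchanged.
c_shifted = max_pool([tuning(x, template) for x in inputs[::-1]])
```

Tuning gives selectivity for a particular pattern, while MAX pooling over positions (or scales) gives tolerance to where that pattern appears, which is how RF size and invariance can grow from one layer to the next.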
 
Elettronica: Multimedia Information Processing in Smart Environments by Aless...
Elettronica: Multimedia Information Processing in Smart Environments by Aless...Elettronica: Multimedia Information Processing in Smart Environments by Aless...
Elettronica: Multimedia Information Processing in Smart Environments by Aless...
 
Accelarating Optical Quadrature Microscopy Using GPUs
Accelarating Optical Quadrature Microscopy Using GPUsAccelarating Optical Quadrature Microscopy Using GPUs
Accelarating Optical Quadrature Microscopy Using GPUs
 
Final presentation (1) (1)
Final presentation (1) (1)Final presentation (1) (1)
Final presentation (1) (1)
 
Poinsettia i e_a4
Poinsettia i e_a4Poinsettia i e_a4
Poinsettia i e_a4
 
173v1 wp prostate row electronic
173v1 wp prostate row electronic173v1 wp prostate row electronic
173v1 wp prostate row electronic
 
Session 2
Session 2Session 2
Session 2
 
Trace Pro Evaluation
Trace Pro EvaluationTrace Pro Evaluation
Trace Pro Evaluation
 
Structure of english (4 th q.)
Structure of english (4 th q.)Structure of english (4 th q.)
Structure of english (4 th q.)
 
What’s in the Cards for Enterprise IT Culture?
What’s in the Cards for Enterprise IT Culture?What’s in the Cards for Enterprise IT Culture?
What’s in the Cards for Enterprise IT Culture?
 
Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Learning Moving Cast Shadows for Foreground Detection (VS 2008)Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Learning Moving Cast Shadows for Foreground Detection (VS 2008)
 

Destaque

Mit6870 orsu lecture11
Mit6870 orsu lecture11Mit6870 orsu lecture11
Mit6870 orsu lecture11zukun
 
Ethics presentation final 20[1].09.10
Ethics presentation final 20[1].09.10Ethics presentation final 20[1].09.10
Ethics presentation final 20[1].09.10riddhipimputkar
 
Visual cortex
Visual cortexVisual cortex
Visual cortexhansvanni
 
Legalizing Marijuana[1]
Legalizing Marijuana[1]Legalizing Marijuana[1]
Legalizing Marijuana[1]annajensen
 
Binocular Rivalry and Visual Awareness in Human Extrastriate Cortex
Binocular Rivalry and Visual Awareness in Human Extrastriate CortexBinocular Rivalry and Visual Awareness in Human Extrastriate Cortex
Binocular Rivalry and Visual Awareness in Human Extrastriate CortexStan James
 
A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013Philip Zheng
 

Destaque (6)

Mit6870 orsu lecture11
Mit6870 orsu lecture11Mit6870 orsu lecture11
Mit6870 orsu lecture11
 
Ethics presentation final 20[1].09.10
Ethics presentation final 20[1].09.10Ethics presentation final 20[1].09.10
Ethics presentation final 20[1].09.10
 
Visual cortex
Visual cortexVisual cortex
Visual cortex
 
Legalizing Marijuana[1]
Legalizing Marijuana[1]Legalizing Marijuana[1]
Legalizing Marijuana[1]
 
Binocular Rivalry and Visual Awareness in Human Extrastriate Cortex
Binocular Rivalry and Visual Awareness in Human Extrastriate CortexBinocular Rivalry and Visual Awareness in Human Extrastriate Cortex
Binocular Rivalry and Visual Awareness in Human Extrastriate Cortex
 
A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013
 

Último

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 

Último (20)

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 

A biologically-motivated approach to computer vision

  • 1. A biologically-motivated approach to computer vision Thomas Serre McGovern Institute for Brain Research Department of Brain & Cognitive Sciences Massachusetts Institute of Technology
  • 2. The problem: invariant recognition in natural scenes • Object recognition is hard! • Our visual capabilities are computationally amazing • Reverse-engineer the visual system and build machines that see and interpret the visual world as well as we do
  • 3. Computer vision successes: Face detection
  • 4. Computer vision successes: Face detection
  • 5. The recipe: lots of simple features + fancy classifier + lots of training examples. [Slide reproduces excerpts from Viola & Jones ’01: Table 1 (the AdaBoost algorithm for classifier learning, in which each round of boosting selects one feature from the 180,000 potential features), Figure 3 (the first and second features selected by AdaBoost, overlaid on a typical training face) and Figure 5 (example frontal upright face images used for training).] Face detection: Schneiderman & Kanade ’99; Viola & Jones ’01
  • 6. Face detection: 10K-1M training examples (Schneiderman & Kanade ’99; Viola & Jones ’01)
  • 7. Car detection: over 100K training examples (Schneiderman & Kanade ’99)
  • 8. Pedestrian detection: over 1K training examples (Dalal & Triggs ’05)
  • 9. What’s wrong with this picture?
  • 10. What’s wrong with this picture? • Tens of thousands of manually annotated training examples • ~30,000 object categories (Biederman, 1987) • Approach unlikely to scale up ...
  • 11. One-shot learning in humans: by age 6, a child knows 10-30K categories
  • 12. One-shot learning in humans: by age 6, a child knows 10-30K categories
  • 13. One-shot learning in humans: by age 6, a child knows 10-30K categories
  • 14. What are the computational mechanisms underlying this amazing feat? source: cerebral cortex
  • 15. What are the computational mechanisms underlying this amazing feat? source: cerebral cortex
  • 16. What are the computational mechanisms underlying this amazing feat? 1. Organization of the visual system source: cerebral cortex
  • 17. What are the computational mechanisms underlying this amazing feat? 1. Organization of the visual system 2. Computational model of the visual cortex source: cerebral cortex
  • 18. What are the computational mechanisms underlying this amazing feat? 1. Organization of the visual system 2. Computational model of the visual cortex 3. Application to computer vision source: cerebral cortex
  • 19. What are the computational mechanisms underlying this amazing feat? 1. Organization of the visual system 2. Computational model of the visual cortex 3. Application to computer vision source: cerebral cortex
  • 20. Hierarchical architecture: Anatomy (Rockland & Pandya ’79; Maunsell & Van Essen ’83; Felleman & Van Essen ’91)
  • 21. Hierarchical architecture: Anatomy (Rockland & Pandya ’79; Maunsell & Van Essen ’83; Felleman & Van Essen ’91)
  • 22. Hierarchical architecture: Latencies (Nowak & Bullier ’97; Schmolesky et al ’98; source: Thorpe & Fabre-Thorpe ’01)
  • 24. Hierarchical architecture: Function (ventral visual stream)
  • 27. Hierarchical architecture: Function (Hubel & Wiesel 1959, 1962, 1965, 1968)
  • 28. Hierarchical architecture: Function. Simple cells and complex cells (Hubel & Wiesel 1959, 1962, 1965, 1968; Nobel prize 1981)
  • 29. Hierarchical architecture: Function. Gradual increase in complexity of preferred stimulus (Kobatake & Tanaka 1994; see also Oram & Perrett 1993; Sheinberg & Logothetis 1996; Gallant et al 1996; Riesenhuber & Poggio 1999)
  • 30. Hierarchical architecture: Function. Parallel increase in invariance properties (position and scale) of neurons (Kobatake & Tanaka 1994; see also Oram & Perrett 1993; Sheinberg & Logothetis 1996; Gallant et al 1996; Riesenhuber & Poggio 1999)
  • 33. Hierarchical architecture: Function (Hung*, Kreiman*, Poggio & DiCarlo 2005)
  • 34. Hierarchical architecture: Function (Hung*, Kreiman*, Poggio & DiCarlo 2005) • Invariant object recognition in IT: • Robust invariant readout of category information from small population of neurons • Single spikes after response onset carry most of the information
  • 35. Hierarchical architecture: Feedforward processing (Thorpe, Fize & Marlot ’96)
  • 36. Hierarchical architecture: Feedforward processing (Thorpe, Fize & Marlot ’96)
  • 39. What are the computational mechanisms used by brains to achieve this amazing feat? 1. Organization of the visual system 2. Computational model of the visual cortex 3. Application to computer vision source: cerebral cortex
  • 40. Feedforward hierarchical model of object recognition • Qualitative neurobiological models (Hubel & Wiesel ’58; Perrett & Oram ’93) • Biologically-inspired (Fukushima ’80; Mel ’97; LeCun et al ’98; Thorpe ’02; Ullman et al ’02; Wersing & Koerner ’03) • Quantitative neurobiological models (Wallis & Rolls ’97; Riesenhuber & Poggio ’99; Amit & Mascaro ’03; Deco & Rolls ’06)
  • 41. Feedforward hierarchical model • Large-scale (10^8 units), spans several areas of the visual cortex • Combination of forward and reverse engineering • Shown to be consistent with many experimental data across areas of visual cortex [Figure: model layers S1, C1, S2, C2, S2b, C2b, S3, C3 and S4 plus classification units, with RF sizes and numbers of units, aligned with the dorsal ('where') and ventral ('what') streams from V1 to prefrontal cortex. Simple cells perform tuning and complex cells a MAX operation; main routes and bypass routes are marked. Lower layers use unsupervised, task-independent learning; the animal vs. non-animal classification units use supervised, task-dependent learning. Complexity (number of subunits), RF size and invariance increase up the hierarchy.]
  • 42. Selective pooling mechanisms: simple units and complex units (Riesenhuber & Poggio 1999, building on Fukushima ’80 and Hubel & Wiesel ’62)
  • 43. Selective pooling mechanisms (Riesenhuber & Poggio 1999, building on Fukushima ’80 and Hubel & Wiesel ’62) • Simple units: template matching, Gaussian-like tuning, ~ “AND” • Complex units: invariance, max-like operation, ~ “OR”
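The two operations named on slide 43 can be illustrated on a one-dimensional signal. This is a minimal sketch under stated assumptions, not the model's code: the inputs are short lists of numbers rather than image patches, and the names `simple_unit`, `complex_unit`, `s_then_c` and the width `sigma` are hypothetical. A simple unit responds most strongly when its input matches a stored template (Gaussian-like tuning); a complex unit takes the maximum over simple units that share a template but look at different positions, which makes the response tolerant to shifts.

```python
import math

def simple_unit(patch, template, sigma=1.0):
    """Gaussian-like tuning: the response is 1.0 for an exact template match
    and falls off with squared distance between patch and template."""
    dist2 = sum((p - t) ** 2 for p, t in zip(patch, template))
    return math.exp(-dist2 / (2 * sigma ** 2))

def complex_unit(afferents):
    """Max-like pooling over the responses of several simple units."""
    return max(afferents)

def s_then_c(signal, template, sigma=1.0):
    """Apply the same simple unit at every position, then pool with a max."""
    n = len(template)
    responses = [simple_unit(signal[i:i + n], template, sigma)
                 for i in range(len(signal) - n + 1)]
    return complex_unit(responses)

template = [0.0, 1.0, 0.0]
bump_left = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
bump_right = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
flat = [0.0] * 6

left = s_then_c(bump_left, template)
right = s_then_c(bump_right, template)
none = s_then_c(flat, template)
```

The pooled response is identical for the two shifted inputs and lower for the input that contains no match, which is the selectivity-plus-invariance trade-off the slide summarises as “AND” followed by “OR”.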
  • 44. Feedforward hierarchical model • Large-scale (10^8 units), spans several areas of the visual cortex • Combination of forward and reverse engineering • Shown to be consistent with many experimental data across areas of visual cortex [Figure: model layers, RF sizes and numbers of units aligned with the dorsal ('where') and ventral ('what') streams; simple cells (tuning), complex cells (MAX), main routes and bypass routes]
  • 45. Basic circuit for the two operations: both operations can be approximated by gain control circuits using shunting inhibition (Kouh & Poggio 2007; Knoblich, Bouvrie & Poggio 2007)
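One way to see how a single gain-control circuit could yield both operations is a divisive-normalization formula of the form y = Σ w_j x_j^p / (k + (Σ x_j^q)^r). The slide does not give the formula; the form and the exponent choices below are an assumption, loosely following the cited work, and the function name `gain_control` is hypothetical. Inputs are taken to be non-negative, like firing rates.

```python
def gain_control(x, w, p, q, r, k=1e-9):
    """Weighted sum of powered inputs, divided by a pooled normalisation term.
    k is a small constant that keeps the denominator non-zero."""
    numerator = sum(wj * xj ** p for wj, xj in zip(w, x))
    denominator = k + sum(xj ** q for xj in x) ** r
    return numerator / denominator

# Tuning-like regime (p=1, q=2, r=1/2): a normalised dot product that is
# largest when the input pattern points in the same direction as w.
w = [0.6, 0.8]
matched = gain_control([0.6, 0.8], w, p=1, q=2, r=0.5)
scaled = gain_control([1.2, 1.6], w, p=1, q=2, r=0.5)
mismatched = gain_control([0.8, 0.6], w, p=1, q=2, r=0.5)

# Max-like regime (w=1, p=q+1, r=1): a softmax that approaches the largest
# input as q grows.
inputs = [0.1, 0.2, 0.9]
soft_max = gain_control(inputs, [1.0] * 3, p=9, q=8, r=1)
```

In the first regime the response depends on the direction of the input pattern rather than its overall strength; in the second it stays just below the largest input.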
  • 46. Learning and plasticity [Figure: model layers S1 to S4 and the animal vs. non-animal classification units, with RF sizes, aligned with V1, V2, V4, PIT, AIT and PFC]
  • 47. Learning and plasticity • Evidence for adult plasticity: PFC, IT very likely; V4 likely; V1/V2 limited evidence [Figure: model hierarchy from V1 to PFC]
  • 48. Learning and plasticity • Unsupervised developmental-like learning stage: frequent image features [Figure: model hierarchy from V1 to PFC]
  • 49. Learning and plasticity • Unsupervised developmental-like learning stage: frequent image features [Figure: model hierarchy from V1 to PFC]
  • 50. Learning and plasticity • Unsupervised developmental-like learning stage: frequent image features [Figure: model hierarchy from V1 to PFC]
  • 51. Learning and plasticity • Learned V2/V4 units: stronger facilitation, stronger suppression • Unsupervised developmental-like learning stage: frequent image features [Figure: model hierarchy from V1 to PFC]
  • 52. Learning and plasticity • Beyond V4: combinations of those... • Unsupervised developmental-like learning stage: frequent image features [Figure: model hierarchy from V1 to PFC]
  • 53. Learning and plasticity • Supervised learning from a handful of training examples ~ linear perceptron (PFC classification units) • Unsupervised developmental-like learning stage: frequent image features [Figure: model hierarchy from V1 to PFC]
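The supervised stage on slide 53 is described as roughly a linear perceptron trained on a handful of examples. A minimal sketch of that idea follows, assuming the lower layers have already turned each image into a fixed-length feature vector; the four two-dimensional vectors below are made-up stand-ins for such features, and `predict` and `train_perceptron` are hypothetical names.

```python
def predict(weights, bias, x):
    """Linear threshold unit: 1 (e.g. animal) if w.x + b > 0, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_perceptron(xs, ys, epochs=100, lr=1.0):
    """Classic perceptron rule: nudge the weights only on misclassified examples."""
    weights, bias = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(xs, ys):
            error = y - predict(weights, bias, x)   # -1, 0 or +1
            if error:
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
                mistakes += 1
        if mistakes == 0:   # every training example is classified correctly
            break
    return weights, bias

# A handful of labelled examples (hypothetical feature vectors).
features = [(0.9, 0.1), (0.8, 0.3), (0.2, 0.9), (0.1, 0.7)]
labels = [1, 1, 0, 0]
weights, bias = train_perceptron(features, labels)
```

Only the final linear readout is trained with labels here, which is why a few examples can suffice when the feature vectors already separate the classes.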
  • 54. Learning and sample complexity
  • 55. Feedforward hierarchical model [Figure: model layers with RF sizes and numbers of units, aligned with the dorsal ('where') and ventral ('what') streams. Layers (RF size; number of units): S1 (0.2°-1.1°; 10^6), C1 (0.4°-1.6°; 10^4), S2 (0.6°-2.4°; 10^7), C2 (1.1°-3.0°; 10^5), S2b (0.9°-4.4°; 10^7), S3 (1.2°-3.2°; 10^4), C2b (7°; 10^3), C3 (7°; 10^3), S4 (7°; 10^2), animal vs. non-animal classification units (10^0). Simple cells perform tuning and complex cells a MAX operation; main routes and bypass routes are marked. Lower layers use unsupervised, task-independent learning; the classification units use supervised, task-dependent learning. Complexity (number of subunits), RF size and invariance increase up the hierarchy.]