The NASA
                         Vision Workbench
                          Reflections on Image Processing in C++
                          Matt Hancher & Michael Broxton
                          Intelligent Robotics Group
                          January 7, 2009, Willow Garage
Intelligent Systems Division                                 NASA Ames Research Center
Talk Overview

            •      Overview and Background

            •      Introduction to the Vision Workbench

            •      Vision Workbench Modules and Applications

            •      Under the Hood: Templates, Views, and Lazy Evaluation

            •      Lessons Learned and Future Directions



NASA Ames Research Center

          • NASA’s Silicon Valley
                 research center
               •      Small spacecraft
               •      Supercomputers
               •      Lunar & Planetary Science
               •      Intelligent Systems
               •      Human Factors
               •      Thermal protection systems
               •      Aeronautics
               •      Astrobiology


Intelligent Robotics Group (IRG)
           •      Areas of expertise
                •      Applied computer vision
                •      Human-robot interaction
                •      Instrument deployment & placement
                •      Interactive 3D visualization
                •      Robot software architectures


           •      Science-driven exploration
                •      Instrument placement, resource
                       mapping, analysis support
                •      Low speed, deliberative operation


           •      Fieldwork-driven operations
                •      Precursor missions (site survey,
                       deployment, etc.)
                •      Manned missions (human-paced
                       interaction, inspection, etc.)


The NASA Vision Workbench

                  •       Open-source image processing and machine vision library
                          in C++

                  •       Developed as a foundation for unifying image processing
                          work at NASA Ames

                  •       A “second-generation” C++ image processing library,
                          drawing on lessons learned by VXL, GIL, VIGRA, etc.

                  •       Designed for easy, expressive coding of efficient image
                          processing algorithms




Obtaining the Vision Workbench

               • Available under the NASA Open Source
                      Agreement (NOSA), an OSI-approved
                      non-viral open source license.


               • VW version 2.0 alpha snapshots currently
                      being released for the brave. (We use it.)

                      http://ti.arc.nasa.gov/visionworkbench/




Image Module Basics




API Philosophy

                   •      Simple, natural, mathematical, expressive

                   •      Treat images as first-class mathematical data
                          types whenever possible
                        •      Example: IIR filtering for background subtraction
                               background += alpha * ( image - background );



                   •      Direct, intuitive function calls
                        • Example: A Gaussian smoothing filter
                               result = gaussian_filter( image, 3.0 );



The Core Image Type

                                ImageView<PixelT>

                   • Stores a reference-counted array of pixels.
                   • Templatized on the pixel type; e.g.
                        ImageView<PixelRGB<uint8> >


                   • Supports an arbitrary number of image planes.


The ImageView Public Interface
   Constructing          ImageView<...> img;
                         ImageView<...> img(cols,rows);
                         ImageView<...> img(cols,rows,planes);

   Changing dimensions   img.set_size(cols,rows);
                         img.set_size(cols,rows,planes);

   Getting dimensions    img.cols()
                         img.rows()
                         img.planes()

   Accessing pixels      img(col,row)
                         img(col,row,plane)

   STL iterator          ImageView<...>::iterator
                         img.begin()
                         img.end()

   Pixel accessor        ImageView<...>::pixel_accessor
                         img.origin()
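
To make the interface above concrete, here is a minimal sketch of a container with the same shape, written in plain standard C++. It is not the Vision Workbench implementation: the class name SimpleImage and its storage layout are assumptions made for illustration. It shows the reference-counted storage mentioned earlier, so copying an image is cheap and both copies see the same pixels.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an image container; not the Vision Workbench class.
template <class PixelT>
class SimpleImage {
public:
  SimpleImage() = default;
  SimpleImage(int cols, int rows, int planes = 1) { set_size(cols, rows, planes); }

  // Allocate fresh, value-initialized storage with the requested dimensions.
  void set_size(int cols, int rows, int planes = 1) {
    assert(cols >= 0 && rows >= 0 && planes >= 0);
    cols_ = cols; rows_ = rows; planes_ = planes;
    data_ = std::make_shared<std::vector<PixelT>>(
        static_cast<std::size_t>(cols) * rows * planes);
  }

  int cols() const { return cols_; }
  int rows() const { return rows_; }
  int planes() const { return planes_; }

  // Pixel access. Copies of the image share storage through the shared_ptr,
  // so a write through one copy is visible through the others.
  PixelT& operator()(int col, int row, int plane = 0) const {
    assert(col >= 0 && col < cols_ && row >= 0 && row < rows_);
    assert(plane >= 0 && plane < planes_);
    return (*data_)[(static_cast<std::size_t>(plane) * rows_ + row) * cols_ + col];
  }

private:
  std::shared_ptr<std::vector<PixelT>> data_;  // reference-counted pixel array
  int cols_ = 0, rows_ = 0, planes_ = 0;
};
```

With this sketch, `SimpleImage<float> b = a;` copies only the handle, which is the behavior the "shallow" operations on the following slides rely on.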



Built-In Pixel Types
   Grayscale                 PixelGray<float32>
                             PixelGrayA<uint8>

   RGB                       PixelRGB<double>
                             PixelRGBA<int16>

   Color spaces              PixelHSV<float32>
                             PixelXYZ<float32>
                             PixelLuv<float32>
                             PixelLab<float32>

   Unitless (e.g. kernels)   float32, float64 and 8-, 16-, 32-, and
                             64-bit signed and unsigned integers

   Vectors                   Vector<float64,4>

   Masked pixels             PixelMask<float>
                             PixelMask<PixelRGBA<uint8> >


             • Try something like this at the top of your code:
                       typedef ImageView<PixelRGB<double> > Image;

Simple ImageView Operations
              • Operations like these are inexpensive and
                    “shallow” or “lazy.”
                                transpose(img)                 rotate_180(img)

                          flip_vertical(img)              flip_horizontal(img)

                               rotate_90cw(img)               rotate_90ccw(img)

                                          crop(img,x,y,cols,rows)

                                            subsample(img,factor)
                                       subsample(img,xfactor,yfactor)


          • Use copy() to make a deep copy if you need one.
                                                  copy(img)
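
The difference between a shallow view and a deep copy can be sketched in plain standard C++ as follows. This is an illustration only, not the Vision Workbench code: GrayImage, TransposeView, CropView, and deep_copy are hypothetical names. The views store a reference plus a little index arithmetic, so creating one costs nothing per pixel; only the deep copy walks the data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal grayscale image; a stand-in for illustration only.
struct GrayImage {
  int cols, rows;
  std::vector<float> data;
  GrayImage(int c, int r) : cols(c), rows(r), data(static_cast<std::size_t>(c) * r, 0.0f) {}
  float& operator()(int col, int row) { return data[static_cast<std::size_t>(row) * cols + col]; }
  float operator()(int col, int row) const { return data[static_cast<std::size_t>(row) * cols + col]; }
};

// Transposed view: holds only a reference and swaps the indices on access.
struct TransposeView {
  const GrayImage& src;
  int cols() const { return src.rows; }
  int rows() const { return src.cols; }
  float operator()(int col, int row) const { return src(row, col); }
};

// Cropped view: holds a reference plus an offset and a size.
struct CropView {
  const GrayImage& src;
  int x, y, w, h;
  int cols() const { return w; }
  int rows() const { return h; }
  float operator()(int col, int row) const {
    assert(col >= 0 && col < w && row >= 0 && row < h);
    return src(x + col, y + row);
  }
};

// Deep copy: walks any view once and stores its pixels in a new image.
template <class ViewT>
GrayImage deep_copy(const ViewT& view) {
  GrayImage out(view.cols(), view.rows());
  for (int r = 0; r < view.rows(); ++r)
    for (int c = 0; c < view.cols(); ++c)
      out(c, r) = view(c, r);
  return out;
}
```

A view keeps tracking later changes to the source image, while the result of deep_copy does not.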


Slicing and Dicing
             • Select an individual plane or channel “slice”:
                                    select_plane(img,plane)

                                  select_channel(img,channel)



               • Interpret pixel channels as image planes:
                                    channels_to_planes(img)



               • Example: making a PixelRGBA<float32> image opaque:
                               fill( select_channel(img,3), 1.0 );
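
One way to picture a writable channel "slice" is a small object that refers to the original pixel buffer and exposes a single channel of each pixel. The sketch below uses plain standard C++; RGBA, ChannelView, and this fill are hypothetical stand-ins, not the library's types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical four-channel pixel.
struct RGBA { float r, g, b, a; };

// Writable view of one channel of an RGBA pixel buffer. No pixels are copied.
class ChannelView {
public:
  ChannelView(std::vector<RGBA>& pixels, int channel) : pixels_(pixels), channel_(channel) {
    assert(channel >= 0 && channel < 4);
  }
  std::size_t size() const { return pixels_.size(); }
  // Returns a reference into the original buffer, so writes go straight through.
  float& operator[](std::size_t i) const {
    RGBA& p = pixels_[i];
    switch (channel_) {
      case 0: return p.r;
      case 1: return p.g;
      case 2: return p.b;
      default: return p.a;
    }
  }
private:
  std::vector<RGBA>& pixels_;
  int channel_;
};

// Set every element of the selected channel to one value.
void fill(const ChannelView& view, float value) {
  for (std::size_t i = 0; i < view.size(); ++i) view[i] = value;
}
```

Calling `fill(ChannelView(pixels, 3), 1.0f)` makes the buffer opaque while leaving the color channels untouched.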


ImageView Filtering Operations

                                  convolution_filter(img,kernel)

                         separable_convolution_filter(img,xkernel,ykernel)

                                    gaussian_filter(img,sigma)

                               derivative_filter(img,xderiv,yderiv)

                                       laplacian_filter(img)

                                threshold_filter(img,thresh,hi,lo)

                                                ...


          • There are several options, including edge extensions:
                    img = gaussian_filter(img, 3.0, ZeroEdgeExtension());
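
To show what a Gaussian filter with zero edge extension computes, here is a minimal one-dimensional sketch in plain standard C++. It is not the library's implementation: the function names and the 3-sigma kernel radius are assumptions. Because the Gaussian is separable, a 2D filter can be built by applying this to every row and then to every column.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Normalized 1D Gaussian kernel. The radius of ceil(3*sigma) is an assumed cutoff.
std::vector<double> gaussian_kernel(double sigma) {
  assert(sigma > 0.0);
  const int radius = static_cast<int>(std::ceil(3.0 * sigma));
  std::vector<double> k(2 * radius + 1);
  double sum = 0.0;
  for (int i = -radius; i <= radius; ++i) {
    k[i + radius] = std::exp(-0.5 * i * i / (sigma * sigma));
    sum += k[i + radius];
  }
  for (double& v : k) v /= sum;  // weights sum to one
  return k;
}

// Convolve a signal with an odd-length kernel. Samples outside the signal
// are treated as zero, which is what "zero edge extension" means here.
std::vector<double> convolve_1d_zero(const std::vector<double>& signal,
                                     const std::vector<double>& kernel) {
  assert(kernel.size() % 2 == 1);
  const int n = static_cast<int>(signal.size());
  const int radius = static_cast<int>(kernel.size() / 2);
  std::vector<double> out(signal.size(), 0.0);
  for (int i = 0; i < n; ++i)
    for (int j = -radius; j <= radius; ++j) {
      const int src = i - j;
      if (src >= 0 && src < n) out[i] += kernel[j + radius] * signal[src];
    }
  return out;
}
```

With zero extension, a constant signal comes out darker near the borders, which is one reason the choice of edge extension is exposed as an option.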


Some Simple Filtering Examples
              Figure panels: Original, Gaussian, X Derivative, Laplacian




ImageView Operators
    • Mathematical operators on images work as you’d like.
    • Add, subtract, multiply, and divide images (per-pixel).
    • Add or subtract a constant pixel value offset.
    • Multiply or divide by scalars.
    • Example: IIR filtering for background subtraction.
                               bkg_img += 0.02 * (src_img - bkg_img);


     • Operators are the best way to do image arithmetic
           with the Vision Workbench.
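
Written out per pixel, the background-subtraction example is a first-order IIR (running average) update. The sketch below uses flat std::vector buffers in place of images; update_background is a hypothetical name, not a library function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// In-place running-average update, one pixel at a time:
//   background += alpha * (frame - background)
// A small alpha makes the background adapt slowly to changes in the scene.
void update_background(std::vector<float>& background,
                       const std::vector<float>& frame, float alpha) {
  assert(background.size() == frame.size());
  for (std::size_t i = 0; i < background.size(); ++i)
    background[i] += alpha * (frame[i] - background[i]);
}
```

The operator form on the slide expresses the same loop in a single line, with the image types supplying the per-pixel arithmetic.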

More ImageView Math
       • Most standard math functions work on images too.
                                abs                  exp                 log

                               sqrt                  pow                hypot

                                sin                  cos                 tan

                               asin                  acos                atan

                               sinh                  cosh                tanh

                               asinh                acosh               atanh

                                                  ...and more!


       • Example: Computing gradient orientation.
                                      orientation = atan2(grad_y, grad_x);
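
For reference, the per-pixel computation behind the gradient orientation example (and the gradient magnitude that usually accompanies it) can be sketched with the standard math library. GradientPolar and gradient_polar are hypothetical names, and flat vectors stand in for images.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Orientation (radians, in (-pi, pi]) and magnitude of a gradient field.
struct GradientPolar {
  std::vector<double> orientation;
  std::vector<double> magnitude;
};

GradientPolar gradient_polar(const std::vector<double>& grad_x,
                             const std::vector<double>& grad_y) {
  assert(grad_x.size() == grad_y.size());
  GradientPolar out;
  out.orientation.reserve(grad_x.size());
  out.magnitude.reserve(grad_x.size());
  for (std::size_t i = 0; i < grad_x.size(); ++i) {
    out.orientation.push_back(std::atan2(grad_y[i], grad_x[i]));  // note the (y, x) order
    out.magnitude.push_back(std::hypot(grad_x[i], grad_y[i]));
  }
  return out;
}
```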


ImageView Math Examples
              Figure panels: Gradient Orientation, Gradient Magnitude,
              Absolute Difference of Gaussians, Logarithmic Map




Per-Pixel ImageView Operations
           • Cast to a new pixel type or channel type:
                                   pixel_cast<NewPixelT>(img)

                                 channel_cast<NewChannelT>(img)



           • Explicit casts are generally not needed to convert
                between color spaces.

           • Apply an arbitrary function to each pixel, or to each
                channel of each pixel:
                                   per_pixel_filter(img,func)

                               per_pixel_channel_filter(img,func)

Example: Color Detection
      • E.g. in color fiducial tracking and object tracking
        ImageView<PixelRGB<double> > input = ...;
        double hue_ref = 0.54;

        ImageView<PixelHSV<double> > hsv_im = gaussian_filter( input, 1.0 );

        ImageView<double> hue = select_channel( hsv_im, 0 );
        ImageView<double> sat = select_channel( hsv_im, 1 );

        ImageView<double> match_im = ( 1.0 - 20.0*abs(hue-hue_ref) ) * sat*sat;
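
The match score in the last line can be checked by hand for a single pixel. The sketch below is plain standard C++ and makes two assumptions that the slide does not state: RGB channels lie in [0, 1], and hue is normalized to [0, 1). The helper names rgb_to_hsv and hue_match are hypothetical, and the smoothing step is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Hsv { double h, s, v; };

// Standard RGB to HSV conversion with hue scaled to [0, 1) (assumed convention).
Hsv rgb_to_hsv(double r, double g, double b) {
  assert(r >= 0.0 && r <= 1.0 && g >= 0.0 && g <= 1.0 && b >= 0.0 && b <= 1.0);
  const double mx = std::max({r, g, b});
  const double mn = std::min({r, g, b});
  const double delta = mx - mn;
  Hsv out{0.0, 0.0, mx};
  if (mx > 0.0) out.s = delta / mx;
  if (delta > 0.0) {
    double sextant;  // position around the color wheel, in [0, 6)
    if (mx == r)      sextant = std::fmod((g - b) / delta + 6.0, 6.0);
    else if (mx == g) sextant = (b - r) / delta + 2.0;
    else              sextant = (r - g) / delta + 4.0;
    out.h = sextant / 6.0;
  }
  return out;
}

// Same formula as the slide: high for saturated pixels whose hue is near hue_ref.
double hue_match(double r, double g, double b, double hue_ref) {
  const Hsv p = rgb_to_hsv(r, g, b);
  return (1.0 - 20.0 * std::fabs(p.h - hue_ref)) * p.s * p.s;
}
```

Note that the score goes negative once the hue differs from the reference by more than 0.05, and that gray pixels always score zero because of the squared saturation term.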




Image Transformation
           • Arbitrary image transformations via
                transform “functors” that define a mapping.
                               warped = transform( image, my_txform );



           • Simple wrappers for common cases.
             resample(img,xscale,yscale)             resize(img,xsize,ysize)

                 translate(img,xoff,yoff)               rotate(img,angle)



           • Customizable interpolation and image edge
                extension via optional arguments.
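
A common way to implement functor-driven warping is inverse mapping: for every output pixel, ask the functor where it came from in the source and interpolate there. The sketch below assumes that design and uses plain standard C++; Gray, Translate, its reverse() method, and warp are hypothetical names rather than the library's API. Interpolation is bilinear and the edge extension is zero.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical minimal grayscale image used only for this sketch.
struct Gray {
  int cols, rows;
  std::vector<double> data;
  Gray(int c, int r) : cols(c), rows(r), data(static_cast<std::size_t>(c) * r, 0.0) {}
  double& at(int col, int row) {
    assert(col >= 0 && col < cols && row >= 0 && row < rows);
    return data[static_cast<std::size_t>(row) * cols + col];
  }
  // Zero edge extension: reads outside the image return 0.
  double get(int col, int row) const {
    if (col < 0 || col >= cols || row < 0 || row >= rows) return 0.0;
    return data[static_cast<std::size_t>(row) * cols + col];
  }
};

// Bilinear interpolation at a fractional source location.
double sample_bilinear(const Gray& img, double x, double y) {
  const int x0 = static_cast<int>(std::floor(x));
  const int y0 = static_cast<int>(std::floor(y));
  const double fx = x - x0, fy = y - y0;
  return (1.0 - fx) * (1.0 - fy) * img.get(x0, y0) + fx * (1.0 - fy) * img.get(x0 + 1, y0) +
         (1.0 - fx) * fy * img.get(x0, y0 + 1) + fx * fy * img.get(x0 + 1, y0 + 1);
}

// Example transform functor: maps an output location back to the source.
struct Translate {
  double dx, dy;
  void reverse(double out_x, double out_y, double& src_x, double& src_y) const {
    src_x = out_x - dx;
    src_y = out_y - dy;
  }
};

// Warp by inverse mapping; works with any functor that provides reverse().
template <class TransformT>
Gray warp(const Gray& src, const TransformT& tx) {
  Gray out(src.cols, src.rows);
  for (int r = 0; r < out.rows; ++r)
    for (int c = 0; c < out.cols; ++c) {
      double sx = 0.0, sy = 0.0;
      tx.reverse(c, r, sx, sy);
      out.at(c, r) = sample_bilinear(src, sx, sy);
    }
  return out;
}
```

Rotation, homography, or lens-distortion warps would differ only in the body of reverse().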


Transformation Examples
              Figure panels: Rotation, Homography, Radial Distortion,
              Arbitrary Transformation




Modules & Applications




Interest Point & Alignment Module





                  Figure panels: Original Images, Aligned Images




Mosaic Module Basics




CTX Polar Mosaic


      •        Based on pre-release
               polar data captured by
               CTX on Mars
               Reconnaissance
               Orbiter
      •        Two weeks of
               development time

      •        Stats:
           •       1610 source images
           •       305 GB of source imagery
           •       40.3 Gigapixels




Cartography Module




High Dynamic Range Module
                  •       Merge multiple exposures of the same scene to increase
                          dynamic range.

                  •       Closely related to photometric calibration of orbital
                          imagery.
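
As a rough illustration of exposure merging, the sketch below combines several exposures into one radiance estimate per pixel. It is not the HDR module's algorithm. It assumes a linear sensor response with pixel values in [0, 1], and it uses a simple "hat" weight that distrusts values near black and near saturation. Exposure, hat_weight, and merge_exposures are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One exposure: shutter time plus linear pixel values in [0, 1] (assumed).
struct Exposure {
  double seconds;
  std::vector<double> pixels;
};

// Weight that is 1 at mid-gray and falls to 0 at pure black and at saturation.
double hat_weight(double v) { return 1.0 - std::fabs(2.0 * v - 1.0); }

// Weighted average of per-exposure radiance estimates (value / exposure time).
std::vector<double> merge_exposures(const std::vector<Exposure>& stack) {
  assert(!stack.empty());
  const std::size_t n = stack.front().pixels.size();
  for (const Exposure& e : stack) assert(e.seconds > 0.0 && e.pixels.size() == n);

  std::vector<double> radiance(n, 0.0);
  for (std::size_t i = 0; i < n; ++i) {
    double num = 0.0, den = 0.0;
    for (const Exposure& e : stack) {
      const double w = hat_weight(e.pixels[i]);
      num += w * e.pixels[i] / e.seconds;
      den += w;
    }
    // If every sample is clipped, fall back to the first exposure's estimate.
    radiance[i] = den > 0.0 ? num / den
                            : stack.front().pixels[i] / stack.front().seconds;
  }
  return radiance;
}
```

A pixel that saturates in the long exposure gets zero weight there, so its radiance comes from the short exposure alone, which is how the merged result gains dynamic range.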




                               LDR                                 HDR

HDR Module




Application: Image Matching
                               •   Problem: Given an image, find others like it.




                                Example database: Apollo Metric Camera images

Texture-Based Image Matching
   Input: Model image

   Filtering               Texture bank filtering
                           (Gaussian 1st derivative and LoG)

   Output Representation   Grouping to remove orientation
                           Energy in a window

   Segmentation            E-M Gaussian mixture model
                           Iterative tryouts, MDL

   Post-processing         Max vote

   Summarization           Grouping
                           Mean energy in segment

   Vector Comparison       Euclidean distance

   Output: Matched image


Texture Matching Filter Bank




Image Matching: Results




Stereo Module

1. Discrete Correlation
• Find the integer offset (disparity) that minimizes the sum of absolute
  differences between the template region and the right image.

For speed:
• Coarse-to-fine processing.
• Disparity search sub-regioning [1].
• Box filter-optimized correlator.

2. Sub-pixel Refinement
• Fit a 2D convex quadratic surface to the nine nearest points in correlation
  fitness space.

3. Consistency Checks
• Left/Right Cross Check
• Median Filtering

Other methods to be added soon:
• Epipolar, photometric, continuity/smoothness constraints.
• Robust Cost Function

Figure labels: Right Image; Template Region (from Left Image); Candidate Disparity (dx, dy);
Search Area; Discrete Correlation; Sub-pixel Correlation.

[1] Changming Sun. Rectangular Subregioning and 3-D Maximum-Surface Techniques for Fast
Stereo Matching. In Proceedings of the IEEE Workshop on Stereo and Multi-Baseline Vision (2001).
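
Steps 1 and 2 can be sketched on a single scanline. This is a deliberately simplified illustration in plain standard C++, not the stereo module's correlator: it searches along one dimension only, and it refines with a 1D parabola through three costs instead of the 2D quadratic surface over nine points described above. The names sad, Match, and correlate are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sum of absolute differences between the template left[start, start+width)
// and the window of the same size in `right`, shifted by disparity d.
double sad(const std::vector<double>& left, const std::vector<double>& right,
           int start, int width, int d) {
  double total = 0.0;
  for (int i = 0; i < width; ++i)
    total += std::fabs(left[start + i] - right[start + i + d]);
  return total;
}

struct Match {
  int disparity;   // best integer offset (step 1)
  double refined;  // sub-pixel estimate (step 2)
};

Match correlate(const std::vector<double>& left, const std::vector<double>& right,
                int start, int width, int d_min, int d_max) {
  assert(width > 0 && d_min <= d_max);
  assert(start >= 0 && start + width <= static_cast<int>(left.size()));
  assert(start + d_min >= 0 && start + width + d_max <= static_cast<int>(right.size()));

  // Step 1: exhaustive search for the integer disparity with the lowest cost.
  int best = d_min;
  double best_cost = sad(left, right, start, width, d_min);
  for (int d = d_min + 1; d <= d_max; ++d) {
    const double cost = sad(left, right, start, width, d);
    if (cost < best_cost) { best_cost = cost; best = d; }
  }

  // Step 2: fit a parabola through the costs at best-1, best, best+1 and
  // take its minimum. Skipped at the ends of the search range.
  double refined = best;
  if (best > d_min && best < d_max) {
    const double c0 = sad(left, right, start, width, best - 1);
    const double c2 = sad(left, right, start, width, best + 1);
    const double curvature = c0 - 2.0 * best_cost + c2;
    if (curvature > 0.0) refined = best + 0.5 * (c0 - c2) / curvature;
  }
  return {best, refined};
}
```

The speed-ups listed above (coarse-to-fine search, sub-regioning, box filtering) all reduce how often the inner sad() loop has to run.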
Improved Stereo Matching: Affine-adaptive Sub-pixel Correlation

•     Foreshortening is the geometric effect that makes stereo processing possible. However, the
      change in perspective on a sloped surface can confuse an area-based stereo correlator.

•     The solution is to use an iterative algorithm that adapts the shape of the correlation
      window (e.g. with an affine warp).

      Image pair: AS15-M-1134 and AS15-M-1135




Improved Stereo Matching: Handling “Noise”

•   The occasional speck of dust or lint on the Apollo scans can throw off our stereo correlator.

•   We have shown that we can mitigate this effect somewhat by using robust statistics.




                               Dust and lint on AS15-M-1134


                               DEM (Note error due to dust...)


                       DEM (with error corrected using Cauchy robust weighting)
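
The idea behind Cauchy robust weighting can be shown on a toy problem: estimating a mean in the presence of one gross outlier. The sketch below uses the standard Cauchy weight w(r) = 1 / (1 + (r/c)^2) inside iteratively reweighted averaging. How the weighting is applied inside the correlator is not described on the slide, so this is only an illustration of the principle; cauchy_weight and robust_mean are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Cauchy weight: near 1 for small residuals, falling toward 0 for large ones.
double cauchy_weight(double residual, double scale) {
  const double t = residual / scale;
  return 1.0 / (1.0 + t * t);
}

// Robust mean by iteratively reweighted averaging, starting from the plain mean.
double robust_mean(const std::vector<double>& samples, double scale, int iterations = 20) {
  assert(!samples.empty() && scale > 0.0 && iterations >= 0);
  double estimate = 0.0;
  for (double x : samples) estimate += x;
  estimate /= static_cast<double>(samples.size());

  for (int it = 0; it < iterations; ++it) {
    double num = 0.0, den = 0.0;
    for (double x : samples) {
      const double w = cauchy_weight(x - estimate, scale);
      num += w * x;
      den += w;
    }
    estimate = num / den;  // den > 0 because every Cauchy weight is positive
  }
  return estimate;
}
```

A speck of dust plays the role of the outlier: it still contributes, but with a weight so small that it barely moves the estimate.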


The Ames Stereo Pipeline
              •      Problem: Given multiple images, compute the 3D terrain.

              Figure panels: Mars Pathfinder & MarsMap; Mars Polar Lander & Viz;
              Mars Exploration Rovers (MER) & Viz




      NASA Ames has been developing surface reconstruction techniques for planetary exploration since the mid 1990s.


Architectural Overview
  The Stereo Pipeline is a relatively thin application built upon the open source ARC
  Vision Workbench and USGS ISIS toolkits.

  Vision Workbench Overview
 • Modular, extensible, C++ machine vision and image processing library (Linux, OS-X, Win32)
 • Developed as a framework for unifying image processing work at NASA Ames
 • Designed for easy, expressive coding of efficient image processing algorithms.

  Vision Workbench Modules
 •   Core (abstract datatypes & utilities)
 •   Camera (models & calibration)
 •   Cartography (geospatial images)
 •   GPU (HW accelerated processing)
 •   HDR (high-dynamic range images)
 •   Interest Point (tracking & matching)
 •   Mosaic (composite & blend huge images)
 •   Stereo Processing (high-quality DEMs & 3D models)

  Diagram labels
 •   Mission Specific Code, Stereo Pipeline, ISIS, Vision Workbench
 •   Image:          Image Processing
 •   FileIO:         Image File I/O, ISIS File I/O
 •   InterestPoint:  Image Alignment
 •   Camera:         VW Camera Models, ISIS Camera Models
 •   Stereo:         Dense Stereo Correlation, Stereo Camera Geometry
 •   Cartography:    DEM Generation, Georeferenced File I/O

  http://ti.arc.nasa.gov/visionworkbench/


Mars Stereo: MOC NA
                                          MGS MOC-Narrow Angle
                                          •   Malin Space Science Systems
                                          •   Altitude: 388.4 km (typical)
                                          •   Line Scan Camera: 2048 pixels
                                          •   Focal length: 3.437m
                                          •   Resolution: 1.5-12m / pixel
                                          •   FOV: 0.5 deg




Galaxius Fluctus Channel




               This VRML model was generated from MOC image pair M01-00115 and E02-01461 (34.66°N, 141.29°E). The complete
               stereo reconstruction process takes approximately five minutes on a 3.0GHz workstation (1024x8064 pixels). This model is
               shown without vertical elevation exaggeration.


Warrego Vallis System




                 Lower Left: This 3D model was generated from MOC-NA images E01-02032 and M07-02071 (42.66°S, 93.55°E).
                 Upper Right: Ortho-image overlay. Areas of interpolated data are colored red.



NE Terra Meridiani





             Upper Left: This DTM was generated from MOC images E04-01109 and M20-01357 (2.38°N, 6.40°E). The contour lines (20m
             spacing) overlay an ortho-image generated from the 3D terrain model. Lower Right: An oblique view of the corresponding VRML
             model.




Lunar Stereo: Apollo Orbiter Cameras

            ITEK Panoramic Camera
            • Focal length: 610 mm (24”)
            • Optical bar camera
            • Apollo 15,16,17 Scientific
              Instrument Module (SIM)
            • Film image: 1.149 x 0.1149 m
            • Resolution: 108-135 lines/mm




Apollo 17 Landing Site




             Top: Stereo reconstruction

             Right: Handheld photo taken by an
             orbiting Apollo 17 astronaut




Public Outreach: Hayden Planetarium




Recent Developments: Processing Large Satellite Imagery

•       The Vision Workbench handles arbitrarily large images via intelligent caching and a
        flexible abstraction of an image called an “image view.”
    •      Image operations are evaluated lazily, allowing for optimization down the line.
    •      Processing occurs one tile at a time, and is usually driven by the output
           operation (i.e. writing a tile to disk).
    •      DiskImageView, BlockCacheView, ImageViewRef, block_rasterize(), and
           block-savvy write_image()/FileIO
•       Scalable performance on multi-threaded machines (soon to include Columbia,
        NASA’s supercomputer)
    •      Thread and ThreadPool/WorkQueue objects
    •      Specifically targeting the stereo correlator, outlier rejection, and stereo
           intersection algorithms.

Figure: Nominal Resolutions for Various Imagers. All sizes given in pixels.

        Imager                    Size (pixels)
        Apollo Metric Camera      16,000 x 16,000
        HiRISE                    20,000 x 40,000
        LROC                      10,000 x 50,000
        HRSC                      5,184 x 16,000
        CTX                       5,064 x 16,000
        MOC-NA                    2,048 x 4,800
        MER                       1,024 x 1,024

        Apollo Panoramic Camera is not shown (25,400 x 244,000 pixels)!
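
The "one tile at a time, driven by the output" idea can be sketched without any library support. In the plain standard C++ below, the image is represented by a function that computes a pixel on demand, and the output stage pulls one tile at a time into a small buffer before handing it to a sink such as a file writer. Tile, make_tiles, and rasterize_by_tile are hypothetical names and do not correspond to the classes listed above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Tile { int x, y, cols, rows; };

// Split a cols x rows image into tiles of at most tile_size x tile_size pixels.
std::vector<Tile> make_tiles(int cols, int rows, int tile_size) {
  assert(cols >= 0 && rows >= 0 && tile_size > 0);
  std::vector<Tile> tiles;
  for (int y = 0; y < rows; y += tile_size)
    for (int x = 0; x < cols; x += tile_size)
      tiles.push_back({x, y, std::min(tile_size, cols - x), std::min(tile_size, rows - y)});
  return tiles;
}

// Evaluate a lazily defined image tile by tile. `pixel(x, y)` computes one
// output pixel on demand; `sink(tile, buffer)` consumes each finished tile.
// Only one tile's worth of pixels is ever held in memory.
template <class PixelFn, class SinkFn>
void rasterize_by_tile(int cols, int rows, int tile_size, PixelFn pixel, SinkFn sink) {
  std::vector<double> buffer;
  for (const Tile& t : make_tiles(cols, rows, tile_size)) {
    buffer.assign(static_cast<std::size_t>(t.cols) * t.rows, 0.0);
    for (int r = 0; r < t.rows; ++r)
      for (int c = 0; c < t.cols; ++c)
        buffer[static_cast<std::size_t>(r) * t.cols + c] = pixel(t.x + c, t.y + r);
    sink(t, buffer);
  }
}
```

Because tiles are independent, the loop over tiles is also the natural place to hand work to a thread pool.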


Recent Developments: Least Squares Bundle Adjustment

Refining Apollo SPICE Kernels

•    Camera position and pose in “historical” SPICE kernels
     provided by ASU provide a good initial solution, but they
     will require refinement.

•    Incorporate new Apollo Metric Camera tie-points into
     ULCN 2005 - or - tie these points to the preliminary
     LOLA control network in late 2009.

•    This work will be carried out as part of a USGS/ARC
     LASER proposal during FY09/FY10.


Automating Bundle Adjustment

•    Automate tie-point matching using the SIFT and SURF
     algorithms.

•    Experimenting with reducing sensitivity to outliers using
     robust statistics (i.e. error models with “heavy-tailed”
     probability distributions).

[Figure. Top: Partial view of Orbit 33 stereo reconstruction. Note the discontinuities in the colored, hillshaded terrain. Bottom: KSU “Bundlevis” visualization of bundle adjustment for AS15-M-113[5-7].]
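The effect of a heavy-tailed error model can be sketched in a few lines. The slides do not say which distribution is being tried, so the Cauchy cost below is only an assumed example, and the function names are hypothetical rather than part of the Vision Workbench.

```cpp
#include <cmath>

// Hypothetical helpers for illustration; not Vision Workbench API.

// Ordinary least-squares cost: grows quadratically with the residual r,
// so a single gross outlier can dominate the total error.
inline double squared_cost(double r) { return 0.5 * r * r; }

// Cauchy cost (assumed example of a heavy-tailed model): close to r^2/2
// near zero, but it grows only logarithmically for large residuals.
inline double cauchy_cost(double r, double sigma) {
  return 0.5 * sigma * sigma * std::log1p((r * r) / (sigma * sigma));
}

// Weight the residual would receive in iteratively reweighted least
// squares: 1 for r = 0, falling toward 0 as |r| grows.
inline double cauchy_weight(double r, double sigma) {
  return 1.0 / (1.0 + (r * r) / (sigma * sigma));
}
```

With sigma = 1, a residual of 3 receives a weight of 1/10, so a mismatched tie-point contributes far less to the adjustment than it would under plain least squares.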

A Peek Under the Hood




Problem: Intermediate Results
      • What happens when you chain operations?
        result = image1 + image2 + image3;

        result = transpose( crop(image,x,y,31,31) );


      • Normally those would be the same as these:
        Image tmp = image1 + image2;
        result = tmp + image3;

        Image tmp = crop(image,x,y,31,31);
        result = transpose(tmp);

      • That would be terribly inefficient! Computing the
            intermediate requires an extra pass over the data.

Solution: Lazy Evaluation
      • The + operator returns a special image sum object.
      • The actual computation is only performed when you
            set an ImageView equal to one of these objects.

      • The entire operation is performed in the inner loop,
            once per pixel.

      • No intermediate image is needed!
      • No second pass over the data is needed, either!
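A minimal sketch of the idea, assuming single-channel float images. The names ViewBase, SumView and SimpleImage are hypothetical stand-ins, not the actual Vision Workbench classes.

```cpp
#include <cstddef>
#include <vector>

// CRTP base class: lets operator+ accept only "view" types.
template <class ImplT>
struct ViewBase {
  ImplT const& impl() const { return static_cast<ImplT const&>(*this); }
};

// The "special image sum object": it stores references to its operands
// and computes a pixel only when asked for it. The references must not
// outlive the expression that created them.
template <class LhsT, class RhsT>
class SumView : public ViewBase<SumView<LhsT, RhsT> > {
  LhsT const& lhs_;
  RhsT const& rhs_;
 public:
  SumView(LhsT const& lhs, RhsT const& rhs) : lhs_(lhs), rhs_(rhs) {}
  int cols() const { return lhs_.cols(); }
  int rows() const { return lhs_.rows(); }
  float operator()(int col, int row) const {
    return lhs_(col, row) + rhs_(col, row);
  }
};

class SimpleImage : public ViewBase<SimpleImage> {
  int cols_, rows_;
  std::vector<float> data_;
 public:
  SimpleImage(int cols, int rows, float fill = 0.0f)
    : cols_(cols), rows_(rows),
      data_(static_cast<std::size_t>(cols) * rows, fill) {}

  // Rasterizing constructor: the only place where pixels are computed.
  // The whole nested expression is evaluated here, once per pixel.
  template <class ViewT>
  SimpleImage(ViewBase<ViewT> const& view)
    : cols_(view.impl().cols()), rows_(view.impl().rows()),
      data_(static_cast<std::size_t>(cols_) * rows_) {
    for (int r = 0; r < rows_; ++r)
      for (int c = 0; c < cols_; ++c)
        (*this)(c, r) = view.impl()(c, r);
  }

  int cols() const { return cols_; }
  int rows() const { return rows_; }
  float& operator()(int c, int r) {
    return data_[static_cast<std::size_t>(r) * cols_ + c];
  }
  float operator()(int c, int r) const {
    return data_[static_cast<std::size_t>(r) * cols_ + c];
  }
};

// operator+ does no arithmetic; it only builds the lazy sum object.
template <class LhsT, class RhsT>
SumView<LhsT, RhsT> operator+(ViewBase<LhsT> const& lhs,
                              ViewBase<RhsT> const& rhs) {
  return SumView<LhsT, RhsT>(lhs.impl(), rhs.impl());
}
```

Writing `SimpleImage result = image1 + image2 + image3;` with these definitions builds a `SumView<SumView<SimpleImage, SimpleImage>, SimpleImage>` and evaluates it in a single pass inside the constructor, with no intermediate image.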

Generalizing the View Concept
      • An image view is any object that you can access just
            like a regular old ImageView object.
                      Type definitions     Image::pixel_type
                                           Image::result_type

                    Getting dimensions     img.cols()
                                           img.rows()
                                           img.planes()

                      Accessing pixels     img(col,row)
                                           img(col,row,plane)

                        Pixel accessor     Image::pixel_accessor
                                           img.origin()

                         Rasterization     Image::prerasterize_type
                                           prerasterize(bbox)
                                           template <class DestT> rasterize(dest,bbox)


      • The data can be anywhere, or it can be computed.
The Pixel Accessor Public Interface
           • Pixel accessors are the most efficient way to move
                around the pixels in an image, and are typically
                used to implement rasterization functions.

           • They behave somewhat like standard C++
                iterators.
                               Iteration   acc.prev_col()
                                           acc.next_col()
                                           acc.prev_row()
                                           acc.next_row()
                                           acc.prev_plane()
                                           acc.next_plane()

                             Advancement   acc.advance(cols,rows)
                                           acc.advance(cols,rows,planes)

                            Pixel access   *acc
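As a rough illustration of this interface, the sketch below implements a subset of it for a single-plane, row-major float buffer. SimplePixelAccessor and fill_region are hypothetical names; the real accessors also handle planes and arbitrary pixel types.

```cpp
#include <cstddef>

// Hypothetical accessor for a single-plane, row-major float image.
class SimplePixelAccessor {
  float* ptr_;
  std::ptrdiff_t row_stride_;  // distance between rows, in pixels
 public:
  SimplePixelAccessor(float* origin, std::ptrdiff_t row_stride)
    : ptr_(origin), row_stride_(row_stride) {}

  SimplePixelAccessor& next_col() { ++ptr_; return *this; }
  SimplePixelAccessor& prev_col() { --ptr_; return *this; }
  SimplePixelAccessor& next_row() { ptr_ += row_stride_; return *this; }
  SimplePixelAccessor& prev_row() { ptr_ -= row_stride_; return *this; }

  SimplePixelAccessor& advance(std::ptrdiff_t cols, std::ptrdiff_t rows) {
    ptr_ += cols + rows * row_stride_;
    return *this;
  }

  float& operator*() const { return *ptr_; }
};

// Typical rasterization loop: one accessor walks down the rows, and a
// copy of it walks across the columns of each row.
inline void fill_region(SimplePixelAccessor origin,
                        int cols, int rows, float value) {
  SimplePixelAccessor row_acc = origin;
  for (int r = 0; r < rows; ++r) {
    SimplePixelAccessor col_acc = row_acc;
    for (int c = 0; c < cols; ++c) {
      *col_acc = value;
      col_acc.next_col();
    }
    row_acc.next_row();
  }
}
```

The inner loop contains only a pointer increment and a store, which is why accessors suit tight rasterization loops better than repeated img(col,row) index arithmetic.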

Views, Views, Everywhere!

      • None of the functions we’ve seen so far do anything.
      • Instead, they immediately return view objects that
            represent processed views of the underlying data.

      • Nested function calls produce nested view types.
      • The computation happens in either the assignment
            operator or the constructor of the destination.

      • We call this final step the “rasterization” of one view
            into another view.

Block Rasterization

              •       Ultra-large (larger than memory) images are
                      easily supported.
              •       All image views natively support block-by-block
                      computation (“rasterization”).
                    •          write_image() computes per-block or per-line
                    •          QuadTreeGenerator computes per-block
                    •          BlockCacheView allows you to manually
                               control block computation in a nested view.
         template <class DestT> Image::rasterize(DestT const& dest, BBox2i const& bbox);
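The block-by-block pattern can be sketched as follows. Box, make_blocks and rasterize_in_blocks are hypothetical names for illustration; the real library expresses the same idea through rasterize() and BBox2i.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for BBox2i.
struct Box { int min_col, min_row, cols, rows; };

// Split a cols x rows image into tiles of at most block x block pixels.
inline std::vector<Box> make_blocks(int cols, int rows, int block) {
  std::vector<Box> boxes;
  for (int r = 0; r < rows; r += block)
    for (int c = 0; c < cols; c += block)
      boxes.push_back(Box{c, r, std::min(block, cols - c),
                                std::min(block, rows - r)});
  return boxes;
}

// Evaluate view(col,row) one block at a time and hand each finished
// block to sink (for example, a file writer). Only one block of pixels
// is held in memory at any moment.
template <class ViewT, class SinkT>
void rasterize_in_blocks(ViewT const& view, int cols, int rows,
                         int block, SinkT sink) {
  std::vector<float> buffer;
  for (Box const& b : make_blocks(cols, rows, block)) {
    buffer.assign(static_cast<std::size_t>(b.cols) * b.rows, 0.0f);
    for (int r = 0; r < b.rows; ++r)
      for (int c = 0; c < b.cols; ++c)
        buffer[static_cast<std::size_t>(r) * b.cols + c] =
            view(b.min_col + c, b.min_row + r);
    sink(b, buffer);
  }
}
```

Because the view is lazy, the full-size result never has to exist in memory: peak usage is one block, regardless of the image dimensions.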


A Trivial First Example
      • SLOG: Sign of Laplacian of Gaussian

      Image slog =
        threshold_filter( laplacian_filter( gaussian_filter( img, 1.5 ) ) );




Generic View Types Can be Complicated!

     •     The type of the resulting view object becomes complex very
           quickly.

      Image slog =
        threshold_filter( laplacian_filter( gaussian_filter( img, 1.5 ) ) );




      UnaryPerPixelView<ConvolutionView<SeparableConvolutionView<ImageView<PixelRGB<float> >,
                                                                 double, ConstantEdgeExtension>,
                                        double, ConstantEdgeExtension>,
                        UnaryCompoundFunctor<ChannelThresholdFunctor<PixelRGB<float> > > >




Other Advantages to Views
      • Generalized views emerged as the solution to several
            problems at once.

      • On-disk images can be supported cleanly.
      • Procedurally-generated images can be, too.
      • If you only want a small number of processed pixel
            values, e.g. near interest points, make the view and just
            ask it for those values.

      • Lazy evaluation permits more sophisticated algorithmic
            optimizations down the road.
Naïve Laziness can be Very Bad™

      • What happens when you chain convolutions?
         result = convolution_filter(convolution_filter(image,kern1),kern2);



      • Now the intermediate result is an important cache:
        Image tmp = convolution_filter(image,kern1);
        result = convolution_filter(tmp,kern2);



      • Without this cache, performance will be terrible.
      • In the Vision Workbench, intermediate results are
            computed and cached when necessary.
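The cost difference is easy to demonstrate in one dimension by counting sample reads. The sketch below uses hypothetical classes (CountingBuffer, LazyBox3) and a 3-tap box filter; it does not show how the Vision Workbench decides when to cache.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// A 1-D buffer that counts how many times its samples are read.
class CountingBuffer {
  std::vector<float> data_;
  mutable long reads_;
 public:
  explicit CountingBuffer(std::vector<float> data)
    : data_(std::move(data)), reads_(0) {}
  int size() const { return static_cast<int>(data_.size()); }
  long reads() const { return reads_; }
  float operator()(int i) const {
    ++reads_;
    i = std::max(0, std::min(i, size() - 1));  // constant edge extension
    return data_[static_cast<std::size_t>(i)];
  }
};

// A 3-tap box filter evaluated lazily, on demand.
template <class InnerT>
class LazyBox3 {
  InnerT const& inner_;
 public:
  explicit LazyBox3(InnerT const& inner) : inner_(inner) {}
  int size() const { return inner_.size(); }
  float operator()(int i) const {
    return (inner_(i - 1) + inner_(i) + inner_(i + 1)) / 3.0f;
  }
};

// Evaluate every sample of a view into a concrete buffer.
template <class ViewT>
std::vector<float> rasterize(ViewT const& view) {
  std::vector<float> out(static_cast<std::size_t>(view.size()));
  for (int i = 0; i < view.size(); ++i)
    out[static_cast<std::size_t>(i)] = view(i);
  return out;
}
```

Rasterizing a `LazyBox3<LazyBox3<CountingBuffer> >` costs 3 x 3 = 9 source reads per output sample, while rasterizing the inner filter into a temporary buffer first costs 3 + 3 = 6 reads in total. For two-dimensional kernels the gap grows with the product of the kernel areas, which is why the intermediate result is worth caching.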


Generic vs. Abstract Views

      • Views could be either template-based (generic) or
            virtual-function-based (abstract).

      • Because pixel access often appears in tight inner
            loops, the template-based solution performs better.

      • Templates are also more flexible. Virtualization can
            only recover one hidden type at a time.

      • Alas, keeping track of complex types can be annoying.
            Fortunately, the end user usually doesn’t have to.

Virtualizing Image Views
     • Sometimes the abstract base class approach is better.
          • Run-time polymorphism.
          • Hiding complex types altogether.

     • The ImageViewRef class wraps an arbitrary view in a veil
           of abstraction.
          • Templatized only on the pixel type.
          • Contains a pointer to a special abstract base class.
          • Has reference semantics (but re-bindable).

                 ImageViewRef<float> img_ref = My(Complex(Image(View(Type(img)))));




     • Great for keeping a lazy view around if you only want
           to evaluate it at select points.
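The wrapping technique is commonly called type erasure. Below is a minimal sketch of how such a wrapper could be built; ViewRef, RampView and ConstantView are hypothetical names, and the real ImageViewRef is considerably richer (rasterization, pixel accessors, and so on).

```cpp
#include <memory>

// Hypothetical stand-in for ImageViewRef: templatized only on pixel type.
template <class PixelT>
class ViewRef {
  // Abstract base class that hides the concrete view type.
  struct Base {
    virtual ~Base() {}
    virtual int cols() const = 0;
    virtual int rows() const = 0;
    virtual PixelT pixel(int col, int row) const = 0;
  };

  // One subclass per wrapped view type; it holds a copy of the view.
  template <class ViewT>
  struct Impl : Base {
    ViewT view;
    explicit Impl(ViewT const& v) : view(v) {}
    int cols() const override { return view.cols(); }
    int rows() const override { return view.rows(); }
    PixelT pixel(int col, int row) const override { return view(col, row); }
  };

  std::shared_ptr<Base> impl_;  // shared pointer: reference semantics

 public:
  template <class ViewT>
  ViewRef(ViewT const& view)
    : impl_(std::make_shared<Impl<ViewT> >(view)) {}

  int cols() const { return impl_->cols(); }
  int rows() const { return impl_->rows(); }
  PixelT operator()(int col, int row) const { return impl_->pixel(col, row); }
};

// Two unrelated procedural views used to exercise the wrapper.
struct RampView {
  int cols() const { return 8; }
  int rows() const { return 8; }
  float operator()(int c, int r) const { return static_cast<float>(c + r); }
};

struct ConstantView {
  float value;
  int cols() const { return 8; }
  int rows() const { return 8; }
  float operator()(int, int) const { return value; }
};
```

A `ViewRef<float>` can hold a RampView now and be re-bound to a ConstantView later, at the price of one virtual call per pixel access. That cost is acceptable when only a few points are evaluated, and expensive inside a tight inner loop.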
Image Resources
                    •          Image resources, such as image files on disk, may
                               have unknown pixel/channel types.
                     Getting type info         PixelFormatEnum pixel_format()
                                               ChannelTypeEnum channel_type()

                    Getting dimensions         int32 img.cols()
                                               int32 img.rows()
                                               int32 img.planes()

                  Accessing pixel data         void read( ImageBuffer buf, BBox2i bbox )
                                               void write( ImageBuffer buf, BBox2i bbox )

                                 Other         Vector2i native_block_size()
                                               void flush()


               •       ImageBuffer is a simple struct describing a block of
                       contiguous pixels in memory.

               •       Read/write functions call helper functions to
                       convert to/from the desired pixel type.
Lessons Learned and Thoughts for the Future




Templates and Laziness Revisited
                    •          The image view framework currently serves
                               multiple purposes:

                          •      Lazy evaluation of pixels on demand

                          •      Block rasterization of gigantic images

                          •      Eliminating unwanted temporaries


                    •          This sometimes results in confused design.

                    •          Lazy views need not be fully statically defined: that is
                               a premature optimization that complicates design.


Example: Image Transformation
             •        This simple expression:
           rotate( image, 45*M_PI/180 )




             •        Returns this complex type (assuming an RGB8 image):
           TransformView< InterpolationView< EdgeExtensionView< ImageView<PixelRGB<uint8> >,
                                                                ZeroEdgeExtension >,
                                             BilinearInterpolation >,
                          RotateTransform >




            •       Nested views are very powerful, but the resulting view is
                    needlessly complex.

            •       Virtualizing the edge extension step has negligible impact
                    on performance. Virtualizing the interpolation step is
                    impossible.

Template Pitfalls
           • A common and frustratingly terrible idiom for
                supporting multiple pixel types:
           template <class PixelT>
           int do_something_useful(…) {
              // Your actual program code
           }

           int main(int argc, char *argv[]) {
             // Parse the arguments...

               DiskImageResource *resource = DiskImageResource::open(image_filename);
               ChannelTypeEnum channel_type = resource->channel_type();
               PixelFormatEnum pixel_format = resource->pixel_format();

               // Dispatch on the run-time pixel format and channel type:
               // one template instantiation per combination.
               switch(pixel_format) {
                 case VW_PIXEL_GRAY:
                   switch(channel_type) {
                     case VW_CHANNEL_UINT8: return do_something_useful<PixelGray<uint8> >(…);
                     case VW_CHANNEL_UINT16: return do_something_useful<PixelGray<uint16> >(…);
                     // And so on...
                   }
                 // And so on...
               }
           }


          • Annoying to write, takes forever to compile, and
                results in huge executables.
A More Pythonic Way
             •       Process an image using its native pixel type, as long as it’s
                     a standard type:
           >>>    import vw
           >>>    input = vw.read_image( 'my_image.jpg' )
           >>>    filtered = vw.gaussian_filter( input, 3 )
           >>>    vw.write_image( 'filtered_image.jpg', filtered )




             •        Coercion to a specific pixel type:
           >>> input = vw.read_image( 'my_image.jpg', ptype=vw.PixelRGB, ctype=vw.uint8 )




             •        Successfully implemented in the Python bindings.

             •        It’s great to use, and terrible to implement.

             •        Results in huge Python bindings, especially due to SWIG
                      limitations on multiple compilation units.
Proliferation of Image Concepts
                •       ImageView : Static pixel type, pixels stored
                        contiguously in memory.
                •       ImageViewRef : Static pixel type, abstracts
                        arbitrary block image computation.
                •       ImageResource : Dynamic pixel type, block
                        image access with conversion.
                •       ImageBuffer : Dynamic pixel type, pixels stored
                        in a block in memory.


                •       A dynamically typed version of ImageViewRef?
A Dynamic View Abstraction?
           •       ImageView needs to be templatized on the pixel type for fast and
                   easy pixel access, but this does not prevent it from also adhering
                   to a dynamically typed view abstraction.

           •       Automatic pixel type casting/coercion is needed to avoid a
                   combinatorial explosion.

           •       Existing ImageResource interface may be close.... (for 3.0?)

           •       Currently exploring an intermediate solution (essentially a
                   dynamic version of ImageViewRef) for 2.0 release.

                Getting type info       PixelFormatEnum pixel_format()
                                        ChannelTypeEnum channel_type()

               Getting dimensions       int32 img.cols()
                                        int32 img.rows()
                                        int32 img.planes()

                    Rasterization       void rasterize( ImageBuffer buf, BBox2i bbox )

An OpenCV – VW Bridge?
             •       OpenCV contains many algorithms that Vision
                     Workbench users would love to use.
             •       The simplest approach would be a direct bridge
                     between ImageView and IplImage.
             •       A more powerful approach would be to
                     produce Vision Workbench views whose
                     rasterizers invoke OpenCV algorithms.
             •       This would automatically support applying many
                     OpenCV algorithms to gigantic images, and fit
                     naturally into the VW view ecosystem.

Questions / Discussion




                        http://ti.arc.nasa.gov/visionworkbench/






The NASA Vision Workbench: Reflections on Image Processing in C++

  • 1. The NASA Vision Workbench Reflections on Image Processing in C++ Matt Hancher & Michael Broxton Intelligent Robotics Group January 7, 2009, Willow Garage Intelligent Systems Division NASA Ames Research Center
  • 2. Talk Overview • Overview and Background • Introduction to the Vision Workbench • Vision Workbench Modules and Applications • Under the Hood: Templates, Views, and Lazy Evaluation • Lessons Learned and Future Directions Intelligent Systems Division NASA Ames Research Center
  • 3. NASA Ames Research Center • NASA’s Silicon Valley research center • Small spacecraft • Supercomputers • Lunar & Planetary Science • Intelligent Systems • Human Factors • Thermal protection systems • Aeronautics • Astrobiology Intelligent Systems Division NASA Ames Research Center
  • 4. Intelligent Robotics Group (IRG) • Areas of expertise • Applied computer vision • Human-robot interaction • Instrument deployment & placement • Interactive 3D visualization • Robot software architectures • Science-driven exploration • Instrument placement, resource mapping, analysis support • Low speed, deliberative operation • Fieldwork-driven operations • Precursor missions (site survey, deployment, etc.) • Manned missions (human-paced interaction, inspection, etc.)
  • 5. The NASA Vision Workbench • Open-source image processing and machine vision library in C++ • Developed as a foundation for unifying image processing work at NASA Ames • A “second-generation” C++ image processing library, drawing on lessons learned by VXL, GIL, VIGRA, etc. • Designed for easy, expressive coding of efficient image processing algorithms Intelligent Systems Division NASA Ames Research Center
  • 6. Obtaining the Vision Workbench • Available under the NASA Open Source Agreement (NOSA), an OSI-approved non-viral open source license. • VW version 2.0 alpha snapshots currently being released for the brave. (We use it.) http://ti.arc.nasa.gov/visionworkbench/
  • 7. Image Module Basics Intelligent Systems Division NASA Ames Research Center
  • 8. API Philosophy • Simple, natural, mathematical, expressive • Treat images as first-class mathematical data types whenever possible • Example: IIR filtering for background subtraction background += alpha * ( image - background ); • Direct, intuitive function calls • Example: A Gaussian smoothing filter result = gaussian_filter( image, 3.0 ); Intelligent Systems Division NASA Ames Research Center
  • 9. The Core Image Type ImageView<PixelT> • Stores a reference-counted array of pixels. • Templatized on the pixel type; e.g. ImageView<PixelRGB<uint8> > • Supports an arbitrary number of image planes. Intelligent Systems Division NASA Ames Research Center
  • 10. The ImageView Public Interface • Constructing: ImageView<...> img; ImageView<...> img(cols,rows); ImageView<...> img(cols,rows,planes); • Changing dimensions: img.set_size(cols,rows); img.set_size(cols,rows,planes); • Getting dimensions: img.cols(), img.rows(), img.planes() • Accessing pixels: img(col,row), img(col,row,plane) • STL iterator: ImageView<...>::iterator, img.begin(), img.end() • Pixel accessor: ImageView<...>::pixel_accessor, img.origin()
  • 11. Built-In Pixel Types • Grayscale: PixelGray<float32>, PixelGrayA<uint8> • RGB: PixelRGB<double>, PixelRGBA<int16> • Color spaces: PixelHSV<float32>, PixelXYZ<float32>, PixelLuv<float32>, PixelLab<float32> • Unitless (e.g. kernels): float32, float64, and 8, 16, 32, 64 bit signed and unsigned integer • Vectors: Vector<float64,4> • Masked pixels: PixelMask<float>, PixelMask<PixelRGBA<uint8> > • Try something like this at the top of your code: typedef ImageView<PixelRGB<double> > Image;
  • 12. Simple ImageView Operations • Operations like these are inexpensive and “shallow” or “lazy.” transpose(img) rotate_180(img) flip_vertical(img) flip_horizontal(img) rotate_90cw(img) rotate_90ccw(img) crop(img,x,y,cols,rows) subsample(img,factor) subsample(img,xfactor,yfactor) • Use copy() to make a deep copy if you need one. copy(img) Intelligent Systems Division NASA Ames Research Center
  • 13. Slicing and Dicing • Select an individual plane or channel “slice”: select_plane(img,plane) select_channel(img,channel) • Interpret pixel channels as image planes: channels_to_planes(img) • Example: making a PixelRGBA<float32> image opaque: fill( select_channel(img,3), 1.0 ); Intelligent Systems Division NASA Ames Research Center
  • 14. ImageView Filtering Operations convolution_filter(img,kernel) separable_convolution_filter(img,xkernel,ykernel) gaussian_filter(img,sigma) derivative_filter(img,xderiv,yderiv) laplacian_filter(img) threshold_filter(img,thresh,hi,lo) ... • There are several options, including edge extensions: img = gaussian_filter(img, 3.0, ZeroEdgeExtension());
  • 15. Some Simple Filtering Examples Original Gaussian X Derivative Laplacian Intelligent Systems Division NASA Ames Research Center
  • 16. ImageView Operators • Mathematical operators on images work as you’d like. • Add, subtract, multiply, and divide images (per-pixel). • Add or subtract a constant pixel value offset. • Multiply or divide by scalars. • Example: IIR filtering for background subtraction. bkg_img += 0.02 * (src_img - bkg_img); • Operators are the best way to do image arithmetic with the Vision Workbench. Intelligent Systems Division NASA Ames Research Center
  • 17. More ImageView Math • Most standard math functions work on images too. abs exp log sqrt pow hypot sin cos tan asin acos atan sinh cosh tanh asinh acosh atanh ...and more! • Example: Computing gradient orientation. orientation = atan2(grad_y, grad_x); Intelligent Systems Division NASA Ames Research Center
  • 18. ImageView Math Examples Gradient Orientation Gradient Magnitude Absolute Difference of Gaussians Logarithmic Map Intelligent Systems Division NASA Ames Research Center
  • 19. Per-Pixel ImageView Operations • Cast to a new pixel type or channel type: pixel_cast<NewPixelT>(img) channel_cast<NewChannelT>(img) • Explicit casts are generally not needed to convert between color spaces. • Apply an arbitrary function to each pixel, or to each channel of each pixel: per_pixel_filter(img,func) per_pixel_channel_filter(img,func) Intelligent Systems Division NASA Ames Research Center
  • 20. Example: Color Detection • E.g. in color fiducial tracking and object tracking ImageView<PixelRGB<double> > input = ...; double hue_ref = 0.54; ImageView<PixelHSV<double> > hsv_im = gaussian_filter( input, 1.0 ); ImageView<double> hue = select_channel( hsv_im, 0 ); ImageView<double> sat = select_channel( hsv_im, 1 ); ImageView<double> match_im = ( 1.0 - 20.0*abs(hue-hue_ref) ) * sat*sat; Intelligent Systems Division NASA Ames Research Center
  • 21. Image Transformation • Arbitrary image transformations via transform “functors” that define a mapping. warped = transform( image, my_txform ); • Simple wrappers for common cases. resample(img,xscale,yscale) resize(img,xsize,ysize) translate(img,xoff,yoff) rotate(img,angle) • Customizable interpolation and image edge extension via optional arguments. Intelligent Systems Division NASA Ames Research Center
  • 22. Transformation Examples Rotation Homography Radial Distortion Arbitrary Transformation Intelligent Systems Division NASA Ames Research Center
  • 23. Modules & Applications Intelligent Systems Division NASA Ames Research Center
  • 24. Interest Point & Alignment Module Intelligent Systems Division NASA Ames Research Center
  • 25. Interest Point & Alignment Module Intelligent Systems Division NASA Ames Research Center
  • 26. Interest Point & Alignment Module Intelligent Systems Division NASA Ames Research Center
  • 27. Interest Point & Alignment Module Intelligent Systems Division NASA Ames Research Center
  • 28. Interest Point & Alignment Module Original Images Aligned Images Intelligent Systems Division NASA Ames Research Center
  • 29. Mosaic Module Basics Intelligent Systems Division NASA Ames Research Center
  • 30. Mosaic Module Basics Intelligent Systems Division NASA Ames Research Center
  • 31. Mosaic Module Basics Intelligent Systems Division NASA Ames Research Center
  • 32. CTX Polar Mosaic • Based on pre-release polar data captured by CTX on Mars Reconnaissance Orbiter • Two weeks of development time • Stats: • 1610 source images • 305-GB of source imagery • 40.3 Gigapixels Intelligent Systems Division NASA Ames Research Center
  • 33. Cartography Module Intelligent Systems Division NASA Ames Research Center
  • 34. High Dynamic Range Module • Merge multiple exposures of the same scene to increase dynamic range. • Closely related to photometric calibration of orbital imagery. LDR HDR Intelligent Systems Division NASA Ames Research Center
  • 35. HDR Module Intelligent Systems Division NASA Ames Research Center
  • 36. HDR Module Intelligent Systems Division NASA Ames Research Center
  • 37. HDR Module Intelligent Systems Division NASA Ames Research Center
  • 38. Application: Image Matching • Problem: Given an image, find others like it. Example database: Apollo Metric Camera images Intelligent Systems Division NASA Ames Research Center
  • 39. Texture-Based Image Matching Model image Texture bank filtering Filtering (Gaussian 1st derivative and LOG) Grouping to remove orientation Output Representation Energy in a window E-M Gaussian mixture model Segmentation Iterative tryouts, MDL Max vote Post-processing Grouping Summarization Mean energy in segment Euclidean distance Vector Comparison Matched image
  • 40. Texture Matching Filter Bank Intelligent Systems Division NASA Ames Research Center
  • 41. Image Matching: Results Intelligent Systems Division NASA Ames Research Center
  • 42. Stereo Module • 1. Discrete Correlation • Find the integer offset (disparity) that minimizes the sum of absolute differences between the template region (from the left image) and the right image. • For speed: coarse-to-fine processing; disparity search sub-regioning [1]; box filter-optimized correlator. • 2. Sub-pixel Refinement • Fit a 2D convex quadratic surface to the nine nearest points in correlation fitness space. • 3. Consistency Checks • Left/Right cross check • Median filtering • Other methods to be added soon: • Epipolar, photometric, continuity/smoothness constraints. • Robust cost function • Figure labels: Right Image, Template Region (from Left Image), Search Area, Candidate Disparity (dx, dy), Discrete Correlation, Sub-pixel Correlation. • [1] Changming Sun. Rectangular Subregioning and 3-D Maximum-Surface Techniques for Fast Stereo Matching. In Proceedings of the IEEE Workshop on Stereo and Multi-Baseline Vision (2001).
  • 43. Improved Stereo Matching: Affine-adaptive Sub-pixel Correlation • Foreshortening is the geometric effect that gives rise to stereo processing. However, the change in perspective on a sloped surface can confuse an area-based stereo correlator. • The solution is to use an iterative algorithm to adapt the correlation window (e.g. affine). • Images: AS15-M-1134, AS15-M-1135
  • 44. Improved Stereo Matching: Handling “Noise” • The occasional speck of dust or lint on the Apollo scans can throw off our stereo correlator. • We have shown that we can mitigate this effect somewhat by using robust statistics. • Image: Dust and lint on AS15-M-1134
  • 45. Improved Stereo Matching: Handling “Noise” • The occasional speck of dust or lint on the Apollo scans can throw off our stereo correlator. • We have shown that we can mitigate this effect somewhat by using robust statistics. • Image: DEM (Note error due to dust...)
  • 46. Improved Stereo Matching: Handling “Noise” • The occasional speck of dust or lint on the Apollo scans can throw off our stereo correlator. • We have shown that we can mitigate this effect somewhat by using robust statistics. • Image: DEM (with error corrected using Cauchy robust weighting)
  • 47. The Ames Stereo Pipeline • Problem: Given multiple images, compute the 3D terrain. Mars Pathfinder & Mars Exploration Rovers (MER) & Viz MarsMap Mars Polar Lander & Viz NASA Ames has been developing surface reconstruction techniques for planetary exploration since the mid 1990s. Intelligent Systems Division NASA Ames Research Center
  • 48. Architectural Overview • The Stereo Pipeline is a relatively thin application built upon the open source ARC Vision Workbench and USGS ISIS toolkits. • Vision Workbench Overview • Modular, extensible, C++ machine vision and image processing library (Linux, OS-X, Win32) • Developed as a framework for unifying image processing work at NASA Ames • Designed for easy, expressive coding of efficient image processing algorithms. • Vision Workbench Modules • Core (abstract datatypes & utilities) • Camera (models & calibration) • Cartography (geospatial images) • GPU (HW accelerated processing) • HDR (high-dynamic range images) • Interest Point (tracking & matching) • Mosaic (composite & blend huge images) • Stereo Processing (high-quality DEMs & 3D models) • Diagram layers: Mission Specific Code, Stereo Pipeline, ISIS, Vision Workbench • Diagram modules: Camera (VW Camera Models, ISIS Camera Models), Image (Image Processing), Stereo (Dense Stereo Correlation, Stereo Camera Geometry), FileIO (Image File I/O, ISIS File I/O), Cartography (DEM Generation, Georeferenced File I/O), InterestPoint (Image Alignment) http://ti.arc.nasa.gov/visionworkbench/
  • 49. Mars Stereo: MOC NA MGS MOC-Narrow Angle • Malin Space Science Systems • Altitude: 388.4 km (typical) • Line Scan Camera: 2048 pixels • Focal length: 3.437m • Resolution: 1.5-12m / pixel • FOV: 0.5 deg Intelligent Systems Division NASA Ames Research Center
  • 50. Galaxias Fluctus Channel This VRML model was generated from MOC image pair M01-00115 and E02-01461 (34.66°N, 141.29°E). The complete stereo reconstruction process takes approximately five minutes on a 3.0GHz workstation (1024x8064 pixels). This model is shown without vertical elevation exaggeration.
  • 51. Warrego Vallis System Lower Left: This 3D model was generated from MOC-NA images E01-02032 and M07-02071 (42.66°S, 93.55°E). Upper Right: Ortho-image overlay. Areas of interpolated data are colored red. Intelligent Systems Division NASA Ames Research Center
  • 52. NE Terra Meridiani Upper Left: This DTM was generated from MOC images E04-01109 and M20-01357 (2.38°N, 6.40°E). The contour lines (20m spacing) overlay an ortho-image generated from the 3D terrain model. Lower Right: An oblique view of the corresponding VRML model.
  • 53. Lunar Stereo: Apollo Orbiter Cameras ITEK Panoramic Camera • Focal length: 610 mm (24”) • Optical bar camera • Apollo 15,16,17 Scientific Instrument Module (SIM) • Film image: 1.149 x 0.1149 m • Resolution: 108-135 lines/mm Intelligent Systems Division NASA Ames Research Center
  • 54. Apollo 17 Landing Site Top: Stereo reconstruction Right: Handheld photo taken by an orbiting Apollo 17 astronaut Intelligent Systems Division NASA Ames Research Center
  • 55. Public Outreach: Hayden Planetarium
  • 56. Public Outreach: Hayden Planetarium
  • 57. Recent Developments: Processing Large Satellite Imagery • The Vision Workbench handles arbitrarily large images via intelligent caching and a flexible abstraction of an image called an “image view.” • Image operations are evaluated lazily, allowing for optimization down the line. • Processing occurs one tile at a time, and is usually driven by the output operation (i.e. writing a tile to disk). • DiskImageView, BlockCacheView, ImageViewRef, block_rasterize(), and block-savvy write_image()/FileIO • Scalable performance on multi-threaded machines (soon to include Columbia, NASA’s supercomputer) • Thread and ThreadPool/WorkQueue objects • Specifically targeting the stereo correlator, outlier rejection, and stereo intersection algorithms. • Figure: Nominal Resolutions for Various Imagers (all sizes given in pixels): Apollo Metric Camera (16,000x16,000), HiRISE (20,000x40,000), LROC (10000x50000), HRSC (5184x16000), CTX (5064x16000), MOC-NA (2048x4800), MER (1024x1024). Apollo Panoramic Camera is not shown (25400 x 244000 pixels)!
  • 58. Recent Developments: Least Squares Bundle Adjustment • Refining Apollo SPICE Kernels • Camera position and pose in “historical” SPICE kernels provided by ASU provide a good initial solution, but they will require refinement. • Incorporate new Apollo Metric Camera tie-points into ULCN 2005, or tie these points to the preliminary LOLA control network in late 2009. • This work will be carried out as part of a USGS/ARC LASER proposal during FY09/FY10. • Automating Bundle Adjustment • Automate tie-point matching using the SIFT and SURF algorithms. • Experimenting with reducing sensitivity to outliers using robust statistics (i.e. error models with “heavy-tailed” probability distributions). • Top: Partial view of Orbit 33 stereo reconstruction. Note the discontinuities in the colored, hillshaded terrain. Bottom: KSU “Bundlevis” visualization of bundle adjustment for AS15-M-113[5-7]
  • 59. A Peek Under the Hood Intelligent Systems Division NASA Ames Research Center
  • 60. Problem: Intermediate Results • What happens when you chain operations? result = image1 + image2 + image3; result = transpose( crop(image,x,y,31,31) ); • Normally those would be the same as these: Image tmp = image1 + image2; result = tmp + image3; Image tmp = crop(image,x,y,31,31); result = transpose(tmp); • That would be terribly inefficient! Computing the intermediate requires an extra pass over the data.
  • 61. Solution: Lazy Evaluation • The + operator returns a special image sum object. • The actual computation is only performed when you set an ImageView equal to one of these objects. • The entire operation is performed in the inner loop, once per pixel. • No intermediate image is needed! • No second pass over the data is needed, either! Intelligent Systems Division NASA Ames Research Center
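The mechanism described on this slide is commonly implemented with expression templates. The sketch below is a toy illustration of that technique and not Vision Workbench code: `ToyImage` stands in for `ImageView`, and `SumExpr` plays the role of the "special image sum object". Both names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Lazy sum node: holds references to its operands and computes a pixel
// only when asked for it.
template <class L, class R>
struct SumExpr {
    const L& lhs;
    const R& rhs;
    std::size_t size() const { return lhs.size(); }
    float operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

// Toy single-channel image used in place of ImageView.
struct ToyImage {
    std::vector<float> data;
    explicit ToyImage(std::size_t n, float v = 0.0f) : data(n, v) {}
    std::size_t size() const { return data.size(); }
    float operator[](std::size_t i) const { return data[i]; }

    // Assignment from a sum expression: the only place pixels are computed.
    // The whole chain is evaluated in this single loop, once per pixel.
    template <class L, class R>
    ToyImage& operator=(const SumExpr<L, R>& e) {
        data.resize(e.size());
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = e[i];
        return *this;
    }
};

// image + image returns a lightweight expression object, not an image.
inline SumExpr<ToyImage, ToyImage> operator+(const ToyImage& a,
                                             const ToyImage& b) {
    SumExpr<ToyImage, ToyImage> e = {a, b};
    return e;
}

// (expression) + image nests the expression type one level deeper.
template <class L, class R>
SumExpr<SumExpr<L, R>, ToyImage> operator+(const SumExpr<L, R>& a,
                                           const ToyImage& b) {
    SumExpr<SumExpr<L, R>, ToyImage> e = {a, b};
    return e;
}
```

Writing `result = a + b + c;` with three `ToyImage` operands builds a nested `SumExpr` and then runs one loop in `operator=`, so no temporary image is allocated. The expression objects hold references, so they are only safe to use within the statement that creates them.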
  • 62. Generalizing the View Concept • An image view is any object that you can access just like a regular old ImageView object. • Type definitions: Image::pixel_type, Image::result_type • Getting dimensions: img.cols(), img.rows(), img.planes() • Accessing pixels: img(col,row), img(col,row,plane) • Pixel accessor: Image::pixel_accessor, img.origin() • Rasterization: Image::prerasterize_type, prerasterize(bbox), template <class DestT> rasterize(dest,bbox) • The data can be anywhere, or it can be computed.
  • 63. The Pixel Accessor Public Interface • Pixel accessors are the most efficient way to move around the pixels in an image, and are typically used to implement rasterization functions. • They behave somewhat like standard C++ iterators. • Iteration: acc.prev_col(), acc.next_col(), acc.prev_row(), acc.next_row(), acc.prev_plane(), acc.next_plane() • Advancement: acc.advance(cols,rows), acc.advance(cols,rows,planes) • Pixel access: *acc
  • 64. Views, Views, Everywhere! • None of the functions we’ve seen so far do anything. • Instead, they immediately return view objects that represent processed views of the underlying data. • Nested function calls produce nested view types. • The computation happens in either the assignment operator or the constructor of the destination. • We call this final step the “rasterization” of one view into another view. Intelligent Systems Division NASA Ames Research Center
  • 65. Block Rasterization • Ultra-large (larger than memory) images are easily supported. • All image views natively support block-by-block computation (“rasterization”). • write_image() computes per-block or -line • QuadTreeGenerator computes per-block • BlockCacheView allows you to manually control block computation in a nested view. template <class DestT> Image::rasterize(DestT const& dest, BBox2i const& bbox);
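Block-by-block computation can be sketched independently of the library: split the output into tiles, compute each tile from a lazily defined pixel function, and hand the finished tile to a sink such as a file writer. In the sketch below, `BBox` is a hypothetical stand-in for `BBox2i`, and `make_blocks` and `rasterize_by_block` are illustrative names rather than Vision Workbench functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

struct BBox { int x, y, width, height; };  // hypothetical stand-in for BBox2i

// Split a cols x rows image into tiles no larger than block x block.
std::vector<BBox> make_blocks(int cols, int rows, int block) {
    std::vector<BBox> blocks;
    if (block <= 0) return blocks;
    for (int y = 0; y < rows; y += block) {
        for (int x = 0; x < cols; x += block) {
            BBox b = {x, y, std::min(block, cols - x), std::min(block, rows - y)};
            blocks.push_back(b);
        }
    }
    return blocks;
}

// Evaluate a lazily defined image one tile at a time. `pixel` computes a
// value on demand; `sink` receives each finished tile, so only one tile of
// pixels is held in memory at any moment.
void rasterize_by_block(
        int cols, int rows, int block,
        const std::function<float(int, int)>& pixel,
        const std::function<void(const BBox&, const std::vector<float>&)>& sink) {
    const std::vector<BBox> blocks = make_blocks(cols, rows, block);
    std::vector<float> tile;
    for (std::size_t n = 0; n < blocks.size(); ++n) {
        const BBox& b = blocks[n];
        tile.assign(static_cast<std::size_t>(b.width) * b.height, 0.0f);
        for (int j = 0; j < b.height; ++j) {
            for (int i = 0; i < b.width; ++i) {
                tile[static_cast<std::size_t>(j) * b.width + i] =
                    pixel(b.x + i, b.y + j);
            }
        }
        sink(b, tile);
    }
}
```

Because the sink drives the loop, peak memory depends on the tile size rather than the image size, which is the property that makes larger-than-memory images workable.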
  • 66. A Trivial First Example • SLOG: Sign of Laplacian of Gaussian Image slog = threshold_filter( laplacian_filter( gaussian_filter( img, 1.5 ) ) ); Intelligent Systems Division NASA Ames Research Center
  • 67. Generic View Types Can be Complicated! • The type of the resulting view object becomes complex very quickly. Image slog = threshold_filter( laplacian_filter( gaussian_filter( img, 1.5 ) ) ); UnaryPerPixelView<ConvolutionView<SeparableConvolutionView<ImageView<PixelRGB<float> >, double, ConstantEdgeExtension>, double, ConstantEdgeExtension>, UnaryCompoundFunctor<ChannelThresholdFunctor<PixelRGB<float> > > Intelligent Systems Division NASA Ames Research Center
  • 68. Other Advantages to Views • Generalized views emerged as the solution to several problems at once. • On-disk images can be supported cleanly. • Procedurally-generated images can be, too. • If you only want a small number of processed pixel values, e.g. near interest points, make the view and just ask it for those values. • Lazy evaluation permits more sophisticated algorithmic optimizations down the road. Intelligent Systems Division NASA Ames Research Center
  • 69. Naïve Laziness can be Very Bad™ • What happens when you chain convolutions? result = convolution_filter(convolution_filter(image,kern1),kern2); • Now the intermediate result is an important cache: Image tmp = convolution_filter(image,kern1); result = convolution_filter(tmp,kern2); • Without this cache, performance will be terrible. • In the Vision Workbench, intermediate results are computed and cached when necessary. Intelligent Systems Division NASA Ames Research Center
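The cost of chaining convolutions lazily can be made concrete by counting source reads. The sketch below uses toy 1-D views and is not Vision Workbench code; `CountingSource`, `Buffer`, `LazyConv`, and `materialize` are hypothetical names. With two kernels of length k over n pixels, the fully lazy chain reads the source n·k·k times, while materializing the intermediate first reads it n·k times.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Source image that counts how often its pixels are read.
struct CountingSource {
    std::vector<float> data;
    mutable std::size_t reads;
    explicit CountingSource(std::vector<float> d)
        : data(std::move(d)), reads(0) {}
    float at(long i) const {
        ++reads;
        if (i < 0 || i >= static_cast<long>(data.size())) return 0.0f;  // zero edge extension
        return data[static_cast<std::size_t>(i)];
    }
};

// Materialized image: plays the role of the cached intermediate result.
struct Buffer {
    std::vector<float> data;
    float at(long i) const {
        if (i < 0 || i >= static_cast<long>(data.size())) return 0.0f;
        return data[static_cast<std::size_t>(i)];
    }
};

// Lazy causal 1-D convolution: every at() call recomputes from its input.
template <class In>
struct LazyConv {
    const In& in;
    const std::vector<float>& kernel;
    float at(long i) const {
        float sum = 0.0f;
        for (std::size_t k = 0; k < kernel.size(); ++k)
            sum += kernel[k] * in.at(i - static_cast<long>(k));
        return sum;
    }
};

// Evaluate the first n pixels of any view into a Buffer.
template <class In>
Buffer materialize(const In& view, std::size_t n) {
    Buffer out;
    out.data.resize(n);
    for (std::size_t i = 0; i < n; ++i)
        out.data[i] = view.at(static_cast<long>(i));
    return out;
}
```

Nesting `LazyConv<LazyConv<CountingSource> >` and materializing it reproduces the slow path; materializing the inner convolution into a `Buffer` and convolving that reproduces the cached path, with identical output.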
  • 70. Generic vs. Abstract Views • Views could be either template-based (generic) or virtual-function-based (abstract). • Because pixel access often appears in tight inner loops, the template-based solution performs better. • Templates are also more flexible. Virtualization can only recover one hidden type at a time. • Alas, keeping track of complex types can be annoying. Fortunately, the end user usually doesn’t have to. Intelligent Systems Division NASA Ames Research Center
  • 71. Virtualizing Image Views • Sometimes the abstract base class approach is better. • Run-time polymorphism. • Hiding complex types altogether. • The ImageViewRef class wraps an arbitrary view in a veil of abstraction. • Templatized only on the pixel type. • Contains a pointer to a special abstract base class. • Has reference semantics (but re-bindable). ImageViewRef<float> img_ref = My(Complex(Image(View(Type(img))))); • Great for keeping a lazy view around if you only want to evaluate it at select points. Intelligent Systems Division NASA Ames Research Center
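The wrapping described on this slide is a form of type erasure. The following sketch shows the general pattern under stated assumptions: `ViewRef` is a simplified, hypothetical analogue of `ImageViewRef`, and `Ramp` and `Scaled` are toy views invented for the example. It does not reproduce the real class's interface.

```cpp
#include <memory>

// Abstract interface templatized only on the pixel type.
template <class PixelT>
struct ViewBase {
    virtual ~ViewBase() {}
    virtual int cols() const = 0;
    virtual int rows() const = 0;
    virtual PixelT pixel(int col, int row) const = 0;
};

// Adapter that hides an arbitrary concrete view type behind ViewBase.
template <class ViewT, class PixelT>
struct ViewImpl : ViewBase<PixelT> {
    ViewT view;
    explicit ViewImpl(const ViewT& v) : view(v) {}
    int cols() const override { return view.cols(); }
    int rows() const override { return view.rows(); }
    PixelT pixel(int col, int row) const override { return view(col, row); }
};

// Re-bindable handle with reference semantics: copies share the wrapped view.
template <class PixelT>
class ViewRef {
    std::shared_ptr<const ViewBase<PixelT> > impl_;
public:
    template <class ViewT>
    ViewRef(const ViewT& v)
        : impl_(std::make_shared<ViewImpl<ViewT, PixelT> >(v)) {}
    int cols() const { return impl_->cols(); }
    int rows() const { return impl_->rows(); }
    PixelT operator()(int col, int row) const { return impl_->pixel(col, row); }
};

// Two toy views with unrelated static types, used only for illustration.
struct Ramp {
    int cols() const { return 4; }
    int rows() const { return 2; }
    float operator()(int col, int row) const {
        return static_cast<float>(col + 10 * row);
    }
};

template <class ViewT>
struct Scaled {
    ViewT view;
    float factor;
    int cols() const { return view.cols(); }
    int rows() const { return view.rows(); }
    float operator()(int col, int row) const { return factor * view(col, row); }
};
```

A `ViewRef<float>` can hold a `Scaled<Ramp>` and later be re-bound to a plain `Ramp` without its own type changing. The price is one virtual call per pixel access, which is why the slides reserve this approach for cases where hiding the type matters more than inner-loop speed.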
  • 72. Image Resources • Image resources, such as image files on disk, may have unknown pixel/channel types. • Getting type info: PixelFormatEnum pixel_format(), ChannelTypeEnum channel_type() • Getting dimensions: int32 img.cols(), int32 img.rows(), int32 img.planes() • Accessing pixel data: void read( ImageBuffer buf, BBox2i bbox ), void write( ImageBuffer buf, BBox2i bbox ) • Other: Vector2i native_block_size(), void flush() • ImageBuffer is a simple struct describing a block of contiguous pixels in memory. • Read/write functions call helper functions to convert to/from the desired pixel type.
  • 73. Lessons Learned and Thoughts for the Future Intelligent Systems Division NASA Ames Research Center
  • 74. Templates and Laziness Revisited • The image view framework currently serves multiple purposes: • Lazy evaluation of pixels on demand • Block rasterization of gigantic images • Eliminating unwanted temporaries • This sometimes results in confused design. • Lazy views need not be fully statically defined: that is a premature optimization that complicates design. Intelligent Systems Division NASA Ames Research Center
  • 75. Example: Image Transformation • This simple expression: rotate( image, 45*M_PI/180 ) • Returns this complex type (assuming an RGB8 image): TransformView< InterpolationView< EdgeExtensionView< ImageView<PixelRGB<uint8> >, ZeroEdgeExtension >, BilinearInterpolation >, RotateTransform > • Nested views are very powerful, but the resulting view is needlessly complex. • Virtualizing the edge extension step has negligible impact on performance. Virtualizing the interpolation step is impossible. Intelligent Systems Division NASA Ames Research Center
  • 76. Template Pitfalls • A common and frustratingly terrible idiom for supporting multiple pixel types: template <class PixelT> int do_something_useful(…) { /* Your actual program code */ } int main(int argc, char *argv[]) { /* Parse the arguments... */ DiskImageResource *resource = DiskImageResource::open(image_filename); ChannelTypeEnum channel_type = resource->channel_type(); PixelFormatEnum pixel_format = resource->pixel_format(); switch(pixel_format) { case VW_PIXEL_GRAY: switch(channel_type) { case VW_CHANNEL_UINT8: return do_something_useful<PixelGray<uint8> >(…); case VW_CHANNEL_UINT16: return do_something_useful<PixelGray<uint16> >(…); /* And so on... */ } /* And so on... */ } } • Annoying to write, takes forever to compile, and results in huge executables.
  • 77. A More Pythonic Way • Process an image using its native pixel type, as long as it's a standard type: >>> import vw >>> input = vw.read_image( 'my_image.jpg' ) >>> filtered = vw.gaussian_filter( input, 3 ) >>> vw.write_image( 'filtered_image.jpg', filtered ) • Coercion to a specific pixel type: >>> input = vw.read_image( 'my_image.jpg', ptype=vw.PixelRGB, ctype=vw.uint8 ) • Successfully implemented in the Python bindings. • It’s great to use, and terrible to implement. • Results in huge Python bindings, especially due to SWIG limitations on multiple compilation units.
  • 78. Proliferation of Image Concepts • ImageView : Static pixel type, pixels stored contiguously in memory. • ImageViewRef : Static pixel type, abstracts arbitrary block image computation. • ImageResource : Dynamic pixel type, block image access with conversion. • ImageBuffer : Dynamic pixel type, pixels stored in a block in memory. • A dynamically typed version of ImageViewRef? Intelligent Systems Division NASA Ames Research Center
  • 79. A Dynamic View Abstraction? • ImageView needs to be templatized on the pixel type for fast and easy pixel access, but this does not prevent it from also adhering to a dynamically typed view abstraction. • Automatic pixel type casting/coercion is needed to avoid a combinatorial explosion. • Existing ImageResource interface may be close.... (for 3.0?) • Currently exploring an intermediate solution (essentially a dynamic version of ImageViewRef) for 2.0 release. • Getting type info: PixelFormatEnum pixel_format(), ChannelTypeEnum channel_type() • Getting dimensions: int32 img.cols(), int32 img.rows(), int32 img.planes() • Rasterization: void rasterize( ImageBuffer buf, BBox2i bbox )
  • 80. An OpenCV – VW Bridge? • OpenCV contains many algorithms that Vision Workbench users would love to use. • The simplest approach would be a direct bridge between ImageView and IplImage. • A more powerful approach would be to produce Vision Workbench views whose rasterizers invoke OpenCV algorithms. • This would automatically support applying many OpenCV algorithms to gigantic images, and fit naturally into the VW view ecosystem. Intelligent Systems Division NASA Ames Research Center
  • 81. Questions / Discussion http://ti.arc.nasa.gov/visionworkbench/ Intelligent Systems Division NASA Ames Research Center