Slides that accompanied a discussion of the NASA Vision Workbench, an open-source C++ image processing library developed by the Intelligent Robotics Group at the NASA Ames Research Center.
The NASA Vision Workbench: Reflections on Image Processing in C++
1. The NASA
Vision Workbench
Reflections on Image Processing in C++
Matt Hancher & Michael Broxton
Intelligent Robotics Group
January 7, 2009, Willow Garage
Intelligent Systems Division NASA Ames Research Center
2. Talk Overview
• Overview and Background
• Introduction to the Vision Workbench
• Vision Workbench Modules and Applications
• Under the Hood: Templates, Views, and Lazy Evaluation
• Lessons Learned and Future Directions
3. NASA Ames Research Center
• NASA’s Silicon Valley
research center
• Small spacecraft
• Supercomputers
• Lunar & Planetary Science
• Intelligent Systems
• Human Factors
• Thermal protection systems
• Aeronautics
• Astrobiology
4. Intelligent Robotics Group (IRG)
• Areas of expertise
• Applied computer vision
• Human-robot interaction
• Instrument deployment & placement
• Interactive 3D visualization
• Robot software architectures
• Science-driven exploration
• Instrument placement, resource
mapping, analysis support
• Low speed, deliberative operation
• Fieldwork-driven operations
• Precursor missions (site survey, deployment, etc.)
• Manned missions (human-paced
interaction, inspection, etc.)
5. The NASA Vision Workbench
• Open-source image processing and machine vision library
in C++
• Developed as a foundation for unifying image processing
work at NASA Ames
• A “second-generation” C++ image processing library,
drawing on lessons learned by VXL, GIL, VIGRA, etc.
• Designed for easy, expressive coding of efficient image
processing algorithms
6. Obtaining the Vision Workbench
• Available under the NASA Open Source
Agreement (NOSA), an OSI-approved non-
viral open source license.
• VW version 2.0 alpha snapshots currently
being released for the brave. (We use it.)
http://ti.arc.nasa.gov/visionworkbench/
8. API Philosophy
• Simple, natural, mathematical, expressive
• Treat images as first-class mathematical data
types whenever possible
• Example: IIR filtering for background subtraction
background += alpha * ( image - background );
• Direct, intuitive function calls
• Example: A Gaussian smoothing filter
result = gaussian_filter( image, 3.0 );
9. The Core Image Type
ImageView<PixelT>
• Stores a reference-counted array of pixels.
• Templatized on the pixel type; e.g.
ImageView<PixelRGB<uint8> >
• Supports an arbitrary number of image planes.
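The reference-counted, shallow-copy semantics described above can be sketched in a few lines of plain C++. This is an illustrative miniature, not the real ImageView implementation; the class name, member names, and planar row-major layout are assumptions made for the example.

```cpp
#include <memory>
#include <vector>
#include <cstdint>
#include <cstddef>

// A minimal sketch of the ImageView idea: a small header object sharing a
// reference-counted pixel array, templatized on the pixel type. Copying the
// view is shallow; both copies see the same pixel data.
template <class PixelT>
class MiniImageView {
  std::shared_ptr<std::vector<PixelT>> m_data;  // reference-counted pixel storage
  int m_cols = 0, m_rows = 0, m_planes = 1;
public:
  MiniImageView() = default;
  MiniImageView(int cols, int rows, int planes = 1)
    : m_data(std::make_shared<std::vector<PixelT>>(std::size_t(cols) * rows * planes)),
      m_cols(cols), m_rows(rows), m_planes(planes) {}
  int cols() const { return m_cols; }
  int rows() const { return m_rows; }
  int planes() const { return m_planes; }
  // Row-major pixel access, plane by plane.
  PixelT& operator()(int col, int row, int plane = 0) {
    return (*m_data)[(std::size_t(plane) * m_rows + row) * m_cols + col];
  }
  long use_count() const { return m_data.use_count(); }
};
```

Because copies share storage, an explicit deep-copy operation (VW's copy()) is needed when independent pixel data is required.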
10. The ImageView Public Interface
Constructing: ImageView<...> img; ImageView<...> img(cols,rows); ImageView<...> img(cols,rows,planes)
Changing dimensions: img.set_size(cols,rows); img.set_size(cols,rows,planes)
Getting dimensions: img.cols(), img.rows(), img.planes()
Accessing pixels: img(col,row), img(col,row,plane)
STL iterator: ImageView<...>::iterator, img.begin(), img.end()
Pixel accessor: ImageView<...>::pixel_accessor, img.origin()
11. Built-In Pixel Types
Grayscale: PixelGray<float32>, PixelGrayA<uint8>
RGB: PixelRGB<double>, PixelRGBA<int16>
Color spaces: PixelHSV<float32>, PixelXYZ<float32>, PixelLuv<float32>, PixelLab<float32>
Unitless (e.g. kernels): float32, float64, and 8-, 16-, 32-, and 64-bit signed and unsigned integers
Vectors: Vector<float64,4>
Masked pixels: PixelMask<float>, PixelMask<PixelRGBA<uint8> >
• Try something like this at the top of your code:
typedef ImageView<PixelRGB<double> > Image;
12. Simple ImageView Operations
• Operations like these are inexpensive and
“shallow” or “lazy.”
transpose(img) rotate_180(img)
flip_vertical(img) flip_horizontal(img)
rotate_90cw(img) rotate_90ccw(img)
crop(img,x,y,cols,rows)
subsample(img,factor)
subsample(img,xfactor,yfactor)
• Use copy() to make a deep copy if you need one.
copy(img)
13. Slicing and Dicing
• Select an individual plane or channel “slice”:
select_plane(img,plane)
select_channel(img,channel)
• Interpret pixel channels as image planes:
channels_to_planes(img)
• Example: making a PixelRGBA<float32> image opaque:
fill( select_channel(img,3), 1.0 );
14. ImageView Filtering Operations
convolution_filter(img,kernel)
separable_convolution_filter(img,xkernel,ykernel)
gaussian_filter(img,sigma)
derivative_filter(img,xderiv,yderiv)
laplacian_filter(img)
threshold_filter(img,thresh,hi,lo)
...
• There are several options, including edge extensions:
img = gaussian_filter(img, 3.0, ZeroEdgeExtension());
15. Some Simple Filtering Examples
(Figure panels: Original, Gaussian, X Derivative, Laplacian.)
16. ImageView Operators
• Mathematical operators on images work as you’d like.
• Add, subtract, multiply, and divide images (per-pixel).
• Add or subtract a constant pixel value offset.
• Multiply or divide by scalars.
• Example: IIR filtering for background subtraction.
bkg_img += 0.02 * (src_img - bkg_img);
• Operators are the best way to do image arithmetic
with the Vision Workbench.
17. More ImageView Math
• Most standard math functions work on images too.
abs exp log
sqrt pow hypot
sin cos tan
asin acos atan
sinh cosh tanh
asinh acosh atanh
...and more!
• Example: Computing gradient orientation.
orientation = atan2(grad_y, grad_x);
18. ImageView Math Examples
(Figure panels: Gradient Orientation, Gradient Magnitude, Absolute Difference of Gaussians, Logarithmic Map.)
19. Per-Pixel ImageView Operations
• Cast to a new pixel type or channel type:
pixel_cast<NewPixelT>(img)
channel_cast<NewChannelT>(img)
• Explicit casts are generally not needed to convert
between color spaces.
• Apply an arbitrary function to each pixel, or to each
channel of each pixel:
per_pixel_filter(img,func)
per_pixel_channel_filter(img,func)
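The per-pixel filtering idea above can be sketched without the Vision Workbench at all: apply an arbitrary functor to every pixel of a buffer. The function name `per_pixel` and the flat-vector "image" are stand-ins for this example only; VW's per_pixel_filter operates on whole views lazily.

```cpp
#include <algorithm>
#include <vector>

// A sketch of what per_pixel_filter does conceptually: apply a user-supplied
// functor to each pixel independently. Here the "image" is a flat pixel
// buffer and evaluation is eager, unlike the lazy VW version.
template <class PixelT, class FuncT>
std::vector<PixelT> per_pixel(const std::vector<PixelT>& img, FuncT func) {
  std::vector<PixelT> out(img.size());
  std::transform(img.begin(), img.end(), out.begin(), func);
  return out;
}
```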
20. Example: Color Detection
• E.g. in color fiducial tracking and object tracking
ImageView<PixelRGB<double> > input = ...;
double hue_ref = 0.54;
ImageView<PixelHSV<double> > hsv_im = gaussian_filter( input, 1.0 );
ImageView<double> hue = select_channel( hsv_im, 0 );
ImageView<double> sat = select_channel( hsv_im, 1 );
ImageView<double> match_im = ( 1.0 - 20.0*abs(hue-hue_ref) ) * sat*sat;
21. Image Transformation
• Arbitrary image transformations via
transform “functors” that define a mapping.
warped = transform( image, my_txform );
• Simple wrappers for common cases.
resample(img,xscale,yscale) resize(img,xsize,ysize)
translate(img,xoff,yoff) rotate(img,angle)
• Customizable interpolation and image edge
extension via optional arguments.
22. Transformation Examples
(Figure panels: Rotation, Homography, Radial Distortion, Arbitrary Transformation.)
32. CTX Polar Mosaic
• Based on pre-release
polar data captured by
CTX on Mars
Reconnaissance
Orbiter
• Two weeks of
development time
• Stats:
• 1610 source images
• 305 GB of source imagery
• 40.3 Gigapixels
34. High Dynamic Range Module
• Merge multiple exposures of the same scene to increase
dynamic range.
• Closely related to photometric calibration of orbital
imagery.
(Figure panels: LDR, HDR.)
38. Application: Image Matching
• Problem: Given an image, find others like it.
Example database: Apollo Metric Camera images
39. Texture-Based Image Matching
Pipeline from model image to matched image:
• Filtering: texture bank filtering (Gaussian 1st derivative and LoG); grouping to remove orientation; output representation: energy in a window
• Segmentation: E-M Gaussian mixture model; iterative tryouts, MDL; max vote
• Post-processing: grouping
• Summarization: mean energy in segment
• Vector comparison: Euclidean distance
42. Stereo Module
1. Discrete Correlation
• Find the integer offset (disparity) that minimizes the sum of absolute differences between the template region (from the left image) and the right image.
• For speed: coarse-to-fine processing, disparity search sub-regioning [1], and a box filter-optimized correlator.
2. Sub-pixel Refinement
• Fit a 2D convex quadratic surface to the nine nearest points in correlation fitness space.
3. Consistency Checks
• Left/right cross check
• Median filtering
• Other methods to be added soon: epipolar, photometric, and continuity/smoothness constraints; robust cost function.
(Figure: a template region from the left image is matched against candidate disparities (dx, dy) within a search area of the right image.)
[1] Changming Sun. Rectangular Subregioning and 3-D Maximum-Surface Techniques for Fast Stereo Matching. In Proceedings of the IEEE Workshop on Stereo and Multi-Baseline Vision (2001).
43. Improved Stereo Matching:
Affine-adaptive Sub-pixel Correlation
Foreshortening, the change in perspective on a sloped surface, is part of the geometry that makes stereo reconstruction possible, but it can also confuse an area-based stereo correlator.
• The solution is to use an iterative algorithm to adapt the correlation window (e.g. affine).
AS15-M-1134 AS15-M-1135
44. Improved Stereo Matching:
Handling “Noise”
• The occasional speck of dust or lint on the Apollo scans can throw off our stereo correlator.
• We have shown that we can mitigate this effect somewhat by using robust statistics.
Dust and lint on AS15-M-1134
45. Improved Stereo Matching:
Handling “Noise”
DEM (Note error due to dust...)
46. Improved Stereo Matching:
Handling “Noise”
DEM (with error corrected using Cauchy robust weighting)
47. The Ames Stereo Pipeline
• Problem: Given multiple images, compute the 3D terrain.
(Figure: Mars Pathfinder & MarsMap; Mars Polar Lander & Viz; Mars Exploration Rovers (MER) & Viz.)
NASA Ames has been developing surface reconstruction techniques for planetary exploration since the mid 1990s.
48. Architectural Overview
The Stereo Pipeline is a relatively thin application built upon the open-source ARC Vision Workbench and USGS ISIS toolkits.
Vision Workbench Overview
• Modular, extensible C++ machine vision and image processing library (Linux, OS X, Win32)
• Developed as a framework for unifying image processing work at NASA Ames
• Designed for easy, expressive coding of efficient image processing algorithms
Vision Workbench Modules
• Core (abstract datatypes & utilities)
• Camera (models & calibration)
• Cartography (geospatial images)
• GPU (HW-accelerated processing)
• HDR (high-dynamic-range images)
• Interest Point (tracking & matching)
• Mosaic (composite & blend huge images)
• Stereo Processing (high-quality DEMs & 3D models)
(Architecture diagram: mission-specific code sits atop the Stereo Pipeline, which draws on the Vision Workbench (Camera, Image Processing, Stereo, FileIO, Cartography, InterestPoint) and on ISIS (camera models and file I/O) for dense stereo correlation, stereo camera geometry, DEM generation, image alignment, and georeferenced file I/O.)
http://ti.arc.nasa.gov/visionworkbench/
49. Mars Stereo: MOC NA
MGS MOC-Narrow Angle
• Malin Space Science Systems
• Altitude: 388.4 km (typical)
• Line Scan Camera: 2048 pixels
• Focal length: 3.437m
• Resolution: 1.5-12m / pixel
• FOV: 0.5 deg
50. Galaxius Fluctus Channel
This VRML model was generated from MOC image pair M01-00115 and E02-01461 (34.66°N, 141.29°E). The complete
stereo reconstruction process takes approximately five minutes on a 3.0GHz workstation (1024x8064 pixels). This model is
shown without vertical elevation exaggeration.
51. Warrego Vallis System
Lower Left: This 3D model was generated from MOC-NA images E01-02032 and M07-02071 (42.66°S, 93.55°E).
Upper Right: Ortho-image overlay. Areas of interpolated data are colored red.
52. NE Terra Meridiani
Upper Left: This DTM was generated from MOC images E04-01109 and M20-01357 (2.38°N, 6.40°E). The contour lines (20m
spacing) overlay an ortho-image generated from the 3D terrain model. Lower Right: An oblique view of the corresponding VRML
model.
53. Lunar Stereo: Apollo Orbiter Cameras
ITEK Panoramic Camera
• Focal length: 610 mm (24”)
• Optical bar camera
• Apollo 15,16,17 Scientific
Instrument Module (SIM)
• Film image: 1.149 x 0.1149 m
• Resolution: 108-135 lines/mm
54. Apollo 17 Landing Site
Top: Stereo reconstruction
Right: Handheld photo taken by an
orbiting Apollo 17 astronaut
55. Public Outreach: Hayden Planetarium
57. Recent Developments:
Processing Large Satellite Imagery
• The Vision Workbench handles arbitrarily large images via intelligent caching and a flexible abstraction of an image called an "image view."
• Image operations are evaluated lazily, allowing for optimization down the line.
• Processing occurs one tile at a time, and is usually driven by the output operation (i.e. writing a tile to disk).
• DiskImageView, BlockCacheView, ImageViewRef, block_rasterize(), and block-savvy write_image()/FileIO
• Scalable performance on multi-threaded machines (soon to include Columbia, NASA's supercomputer)
• Thread and ThreadPool/WorkQueue objects
• Specifically targeting the stereo correlator, outlier rejection, and stereo intersection algorithms.
Nominal resolutions for various imagers, in pixels: Apollo Metric Camera (16,000x16,000), HiRISE (20,000x40,000), LROC (10,000x50,000), HRSC (5184x16,000), CTX (5064x16,000), MOC-NA (2048x4800), MER (1024x1024). The Apollo Panoramic Camera is not shown (25,400x244,000 pixels)!
58. Recent Developments:
Least Squares Bundle Adjustment
Refining Apollo SPICE Kernels
• The camera positions and poses in the "historical" SPICE kernels provided by ASU are a good initial solution, but they will require refinement.
• Incorporate new Apollo Metric Camera tie-points into
ULCN 2005 - or - tie these points to the preliminary
LOLA control network in late 2009.
• This work will be carried out as part of a USGS/ARC
LASER proposal during FY09/FY10.
Automating Bundle Adjustment
• Automate tie-point matching using the SIFT and SURF
algorithms.
• Experimenting with reducing sensitivity to outliers using robust statistics (i.e. error models with "heavy-tailed" probability distributions)
Top: Partial view of Orbit 33 stereo reconstruction. Note the discontinuities in the colored,
hillshaded terrain. Bottom: KSU “Bundlevis” visualization of bundle adjustment for AS15-M-113[5-7]
59. A Peek Under the Hood
60. Problem: Intermediate Results
• What happens when you chain operations?
result = image1 + image2 + image3;
result = transpose( crop(image,x,y,31,31) );
• Normally those would be the same as these:
Image tmp = image1 + image2;
result = tmp + image3;
Image tmp = crop(image,x,y,31,31);
result = transpose(tmp);
• That would be terribly inefficient! Computing the
intermediate requires an extra pass over the data.
61. Solution: Lazy Evaluation
• The + operator returns a special image sum object.
• The actual computation is only performed when you
set an ImageView equal to one of these objects.
• The entire operation is performed in the inner loop,
once per pixel.
• No intermediate image is needed!
• No second pass over the data is needed, either!
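The "special image sum object" described above is a classic expression-templates pattern, and it can be sketched in miniature. Everything here (the Image struct, SumView, the 1-D indexing) is invented for illustration; the real Vision Workbench views are far more general.

```cpp
#include <vector>
#include <cstddef>

// A stripped-down sketch of lazy evaluation via expression templates:
// operator+ returns a lightweight "sum view" that computes nothing. Pixels
// are only evaluated when the result is assigned into a concrete image, so
// `a + b + c` makes a single pass with no intermediate buffer.
struct Image {
  std::vector<double> px;
  explicit Image(std::size_t n = 0) : px(n) {}
  double operator()(std::size_t i) const { return px[i]; }
  std::size_t size() const { return px.size(); }
  template <class ViewT>
  Image& operator=(const ViewT& view) {        // "rasterization" happens here
    px.resize(view.size());
    for (std::size_t i = 0; i < px.size(); ++i) px[i] = view(i);
    return *this;
  }
};

template <class L, class R>
struct SumView {                               // the lazy image-sum object
  const L& lhs; const R& rhs;
  double operator()(std::size_t i) const { return lhs(i) + rhs(i); }
  std::size_t size() const { return lhs.size(); }
};

template <class L, class R>
SumView<L, R> operator+(const L& lhs, const R& rhs) { return {lhs, rhs}; }
```

The entire chained expression is evaluated once per pixel inside the assignment loop, which is exactly the property the slide claims.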
62. Generalizing the View Concept
• An image view is any object that you can access just
like a regular old ImageView object.
Type definitions: Image::pixel_type, Image::result_type
Getting dimensions: img.cols(), img.rows(), img.planes()
Accessing pixels: img(col,row), img(col,row,plane)
Pixel accessor: Image::pixel_accessor, img.origin()
Rasterization: Image::prerasterize_type, prerasterize(bbox), template <DestT> rasterize(dest,bbox)
• The data can be anywhere, or it can be computed.
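The "any object that quacks like an image" idea can be made concrete with a computed view. This is an illustrative sketch, not VW code: MemoryView stores pixels, while TransposedView stores nothing and answers pixel queries by forwarding to its child with swapped indices.

```cpp
#include <vector>
#include <cstddef>

// A sketch of the generalized-view concept: anything with cols(), rows(),
// and operator()(col,row) is a view, whether its pixels live in memory or
// are computed on demand.
struct MemoryView {
  int w, h;
  std::vector<int> px;
  MemoryView(int w_, int h_) : w(w_), h(h_), px(std::size_t(w_) * h_) {}
  int cols() const { return w; }
  int rows() const { return h; }
  int operator()(int col, int row) const { return px[std::size_t(row) * w + col]; }
  int& at(int col, int row) { return px[std::size_t(row) * w + col]; }
};

template <class ViewT>
struct TransposedView {          // no pixel storage at all
  const ViewT& child;
  int cols() const { return child.rows(); }
  int rows() const { return child.cols(); }
  int operator()(int col, int row) const { return child(row, col); }
};

template <class ViewT>
TransposedView<ViewT> transposed(const ViewT& v) { return {v}; }
```

This is why operations like transpose(img) on the earlier slide are "shallow": they build a view object, not a new pixel buffer.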
63. The Pixel Accessor Public Interface
• Pixel accessors are the most efficient way to move
around the pixels in an image, and are typically
used to implement rasterization functions.
• They behave somewhat like standard C++
iterators.
Iteration: acc.prev_col(), acc.next_col(), acc.prev_row(), acc.next_row(), acc.prev_plane(), acc.next_plane()
Advancement: acc.advance(cols,rows), acc.advance(cols,rows,planes)
Pixel access: *acc
64. Views, Views, Everywhere!
• None of the functions we’ve seen so far do anything.
• Instead, they immediately return view objects that
represent processed views of the underlying data.
• Nested function calls produce nested view types.
• The computation happens in either the assignment
operator or the constructor of the destination.
• We call this final step the “rasterization” of one view
into another view.
65. Block Rasterization
• Ultra-large (larger than memory) images are easily supported.
• All image views natively support block-by-block
computation (“rasterization”).
• write_image() computes per-block or -line
• QuadTreeGenerator computes per-block
• BlockCacheView allows you to manually
control block computation in a nested view.
template <DestT> Image::rasterize(DestT const& dest, BBox2i const& bbox);
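The block-by-block idea can be sketched independently of VW. In this miniature the "view" is any callable (col,row) -> pixel; the function name and tiling scheme are assumptions for the example, not the real block_rasterize() machinery.

```cpp
#include <vector>
#include <algorithm>
#include <cstddef>

// A sketch of block rasterization: walk the output in fixed-size tiles and
// ask the (possibly lazy or procedural) view only for the pixels in each
// tile. This is what lets images larger than memory be processed, since
// only one tile's worth of pixels needs to exist at a time.
template <class ViewT>
void rasterize_blocks(const ViewT& view, int cols, int rows,
                      int block, std::vector<int>& dest) {
  dest.assign(std::size_t(cols) * rows, 0);
  for (int by = 0; by < rows; by += block)
    for (int bx = 0; bx < cols; bx += block) {
      // Rasterize one tile; a real implementation would hand the view the
      // tile's bounding box so it can prerasterize just that region.
      int ymax = std::min(by + block, rows), xmax = std::min(bx + block, cols);
      for (int y = by; y < ymax; ++y)
        for (int x = bx; x < xmax; ++x)
          dest[std::size_t(y) * cols + x] = view(x, y);
    }
}
```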
66. A Trivial First Example
• SLOG: Sign of Laplacian of Gaussian
Image slog =
threshold_filter( laplacian_filter( gaussian_filter( img, 1.5 ) ) );
67. Generic View Types Can be Complicated!
• The type of the resulting view object becomes complex very
quickly.
Image slog =
threshold_filter( laplacian_filter( gaussian_filter( img, 1.5 ) ) );
UnaryPerPixelView<ConvolutionView<SeparableConvolutionView<ImageView<PixelRGB<float> >,
                                                           double, ConstantEdgeExtension>,
                                  double, ConstantEdgeExtension>,
                  UnaryCompoundFunctor<ChannelThresholdFunctor<PixelRGB<float> > > >
68. Other Advantages to Views
• Generalized views emerged as the solution to several
problems at once.
• On-disk images can be supported cleanly.
• Procedurally-generated images can be, too.
• If you only want a small number of processed pixel
values, e.g. near interest points, make the view and just
ask it for those values.
• Lazy evaluation permits more sophisticated algorithmic
optimizations down the road.
69. Naïve Laziness can be Very Bad™
• What happens when you chain convolutions?
result = convolution_filter(convolution_filter(image,kern1),kern2);
• Now the intermediate result is an important cache:
Image tmp = convolution_filter(image,kern1);
result = convolution_filter(tmp,kern2);
• Without this cache, performance will be terrible.
• In the Vision Workbench, intermediate results are
computed and cached when necessary.
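The cost blow-up from naive laziness is easy to demonstrate by counting source reads. Everything below is invented for the demonstration: a source that tallies how often it is sampled, and a lazy k-tap box filter that recomputes its child pixels on every query. Nesting two such filters costs k*k source reads per output pixel; caching the intermediate (as VW does) would bring that back to roughly k per pixel per stage.

```cpp
// A sketch of why naive lazy chaining of convolutions is quadratic in the
// kernel width: each query of the outer filter re-queries the inner filter,
// which re-queries the source. CountingSource tallies source-pixel reads.
struct CountingSource {
  mutable long reads = 0;
  double operator()(long i) const { (void)i; ++reads; return 1.0; }
};

template <class ChildT>
struct LazyBoxFilter {        // lazy k-tap box filter, recomputed per query
  const ChildT& child;
  int k;
  double operator()(long i) const {
    double sum = 0;
    for (int j = 0; j < k; ++j) sum += child(i + j);
    return sum / k;
  }
};
```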
70. Generic vs. Abstract Views
• Views could be either template-based (generic) or
virtual-function-based (abstract).
• Because pixel access often appears in tight inner
loops, the template-based solution performs better.
• Templates are also more flexible. Virtualization can
only recover one hidden type at a time.
• Alas, keeping track of complex types can be annoying.
Fortunately, the end user usually doesn’t have to.
71. Virtualizing Image Views
• Sometimes the abstract base class approach is better.
• Run-time polymorphism.
• Hiding complex types altogether.
• The ImageViewRef class wraps an arbitrary view in a veil
of abstraction.
• Templatized only on the pixel type.
• Contains a pointer to a special abstract base class.
• Has reference semantics (but re-bindable).
ImageViewRef<float> img_ref = My(Complex(Image(View(Type(img)))));
• Great for keeping a lazy view around if you only want
to evaluate it at select points.
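The mechanism behind ImageViewRef is type erasure, which can be sketched in a few dozen lines. The names here (ViewBase, ViewHolder, ViewRef) are illustrative, not VW's internals: an abstract base exposes the minimal view interface, a templated wrapper holds any concrete view, and the user-facing handle is templatized only on the pixel type.

```cpp
#include <memory>
#include <utility>

// A sketch of the ImageViewRef idea: hide an arbitrarily complex view type
// behind a small abstract interface, paying one virtual call per pixel.
template <class PixelT>
struct ViewBase {
  virtual ~ViewBase() = default;
  virtual int cols() const = 0;
  virtual int rows() const = 0;
  virtual PixelT pixel(int col, int row) const = 0;
};

template <class PixelT, class ViewT>
struct ViewHolder : ViewBase<PixelT> {
  ViewT view;                                   // the wrapped concrete view
  explicit ViewHolder(ViewT v) : view(std::move(v)) {}
  int cols() const override { return view.cols(); }
  int rows() const override { return view.rows(); }
  PixelT pixel(int col, int row) const override { return view(col, row); }
};

template <class PixelT>
class ViewRef {
  std::shared_ptr<const ViewBase<PixelT>> m_ptr;  // reference semantics
public:
  template <class ViewT>
  ViewRef(ViewT v)
    : m_ptr(std::make_shared<ViewHolder<PixelT, ViewT>>(std::move(v))) {}
  int cols() const { return m_ptr->cols(); }
  int rows() const { return m_ptr->rows(); }
  PixelT operator()(int col, int row) const { return m_ptr->pixel(col, row); }
};
```

The per-pixel virtual call is the price of hiding the type, which is why this wrapper suits occasional point queries better than bulk inner loops.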
72. Image Resources
• Image resources, such as image files on disk, may
have unknown pixel/channel types.
Getting type info: PixelFormatEnum pixel_format(), ChannelTypeEnum channel_type()
Getting dimensions: int32 img.cols(), int32 img.rows(), int32 img.planes()
Accessing pixel data: void read( ImageBuffer buf, BBox2i bbox ), void write( ImageBuffer buf, BBox2i bbox )
Other: Vector2i native_block_size(), void flush()
• ImageBuffer is a simple struct describing a block of
contiguous pixels in memory.
• Read/write functions call helper functions to
convert to/from the desired pixel type.
73. Lessons Learned and
Thoughts for the Future
74. Templates and Laziness Revisited
• The image view framework currently serves
multiple purposes:
• Lazy evaluation of pixels on demand
• Block rasterization of gigantic images
• Eliminating unwanted temporaries
• This sometimes results in confused design.
• Lazy views need not be fully statically defined: that is
a premature optimization that complicates design.
75. Example: Image Transformation
• This simple expression:
rotate( image, 45*M_PI/180 )
• Returns this complex type (assuming an RGB8 image):
TransformView< InterpolationView< EdgeExtensionView< ImageView<PixelRGB<uint8> >,
ZeroEdgeExtension >,
BilinearInterpolation >,
RotateTransform >
• Nested views are very powerful, but the resulting view is
needlessly complex.
• Virtualizing the edge extension step has negligible impact
on performance. Virtualizing the interpolation step is
impossible.
76. Template Pitfalls
• A common and frustratingly terrible idiom for
supporting multiple pixel types:
template <class PixelT>
int do_something_useful(…) {
// Your actual program code
}
int main(int argc, char *argv[]) {
// Parse the arguments...
DiskImageResource *resource = DiskImageResource::open(image_filename);
ChannelTypeEnum channel_type = resource->channel_type();
PixelFormatEnum pixel_format = resource->pixel_format();
switch(pixel_format) {
case VW_PIXEL_GRAY:
switch(channel_type) {
case VW_CHANNEL_UINT8: return do_something_useful<PixelGray<uint8> >(…);
case VW_CHANNEL_UINT16: return do_something_useful<PixelGray<uint16> >(…);
// And so on...
}
// And so on...
}
}
• Annoying to write, takes forever to compile, and
results in huge executables.
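One common way to tame the nested switch is to centralize the (pixel format, channel type) dispatch in a single lookup table, so each program writes the case list once. The sketch below uses stand-in enums, stand-in pixel structs, and a trivial do_something_useful body; none of it is the real Vision Workbench API.

```cpp
#include <cstdint>
#include <map>
#include <functional>
#include <utility>

// A sketch of table-driven dispatch over dynamic pixel types. The enums and
// pixel structs are illustrative stand-ins for VW's PixelFormatEnum and
// ChannelTypeEnum; the table maps each supported combination to one template
// instantiation.
enum PixelFormat { PF_GRAY, PF_RGB };
enum ChannelType { CT_UINT8, CT_UINT16 };

struct GrayU8  { std::uint8_t v; };
struct GrayU16 { std::uint16_t v; };
struct RgbU8   { std::uint8_t r, g, b; };

template <class PixelT>
int do_something_useful() { return sizeof(PixelT); }  // stand-in "program"

int dispatch(PixelFormat pf, ChannelType ct) {
  static const std::map<std::pair<int, int>, std::function<int()>> table = {
    {{PF_GRAY, CT_UINT8},  do_something_useful<GrayU8>},
    {{PF_GRAY, CT_UINT16}, do_something_useful<GrayU16>},
    {{PF_RGB,  CT_UINT8},  do_something_useful<RgbU8>},
    // ...one entry per supported combination
  };
  auto it = table.find({pf, ct});
  return it == table.end() ? -1 : it->second();
}
```

This does not shrink the executable (every instantiation still exists), but it removes the per-program boilerplate that the slide complains about.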
77. A More Pythonic Way
• Process an image using its native pixel type, as long as it's
a standard type:
>>> import vw
>>> input = vw.read_image( 'my_image.jpg' )
>>> filtered = vw.gaussian_filter( input, 3 )
>>> vw.write_image( 'filtered_image.jpg', filtered )
• Coercion to a specific pixel type:
>>> input = vw.read_image( 'my_image.jpg', ptype=vw.PixelRGB, ctype=vw.uint8 )
• Successfully implemented in the Python bindings.
• It’s great to use, and terrible to implement.
• Results in huge Python bindings, especially due to SWIG
limitations on multiple compilation units.
78. Proliferation of Image Concepts
• ImageView : Static pixel type, pixels stored
contiguously in memory.
• ImageViewRef : Static pixel type, abstracts
arbitrary block image computation.
• ImageResource : Dynamic pixel type, block
image access with conversion.
• ImageBuffer : Dynamic pixel type, pixels stored
in a block in memory.
• A dynamically typed version of ImageViewRef?
79. A Dynamic View Abstraction?
• ImageView needs to be templatized on the pixel type for fast and
easy pixel access, but this does not prevent it from also adhering
to a dynamically typed view abstraction.
• Automatic pixel type casting/coercion is needed to avoid a
combinatorial explosion.
• Existing ImageResource interface may be close.... (for 3.0?)
• Currently exploring an intermediate solution (essentially a
dynamic version of ImageViewRef) for 2.0 release.
Getting type info: PixelFormatEnum pixel_format(), ChannelTypeEnum channel_type()
Getting dimensions: int32 img.cols(), int32 img.rows(), int32 img.planes()
Rasterization: void rasterize( ImageBuffer buf, BBox2i bbox )
80. An OpenCV – VW Bridge?
• OpenCV contains many algorithms that Vision
Workbench users would love to use.
• The simplest approach would be a direct bridge
between ImageView and IplImage.
• A more powerful approach would be to
produce Vision Workbench views whose
rasterizers invoke OpenCV algorithms.
• This would automatically support applying many
OpenCV algorithms to gigantic images, and fit
naturally into the VW view ecosystem.
81. Questions / Discussion
http://ti.arc.nasa.gov/visionworkbench/