1. Image Formation
N.A.Thacker, ISBE, University of Manchester.
Intensity reflectivity
Object interaction
Lens models
Spectral sensitivity
Human eye
CCD array
Camera model
Point spread function
Images from Images
Error propagation
Monte-Carlo
Summary
Figures courtesy of Tim Ellis.
2. Intensity Reflectivity

Source Illumination (E)      Material Reflectivity (S)
sunshine       10,000        black velvet       0.01
cloudy day      1,000        stainless steel    0.65
office            100        white wall         0.80
moonlight        0.01        chrome             0.90
                             snow               0.93

Table 1: Typical values for relative levels of reflected light = E × S.

The brightest/best-lit objects are 10^7 times brighter than the darkest/poorest-lit.
3. Object Interaction with Light

The image of an object is subject to several physical processes:

• spectral reflectance
• specularities
• shading
• mutual illumination
• shadows

A diffuse body reflects equally in all directions (Lambertian).

A specular surface has the property that incident light reflects directly from the surface.

Torrance-Sparrow, dichromatic reflection model.
4. Shading

Let z = z(x, y) be a depth map with illuminant direction L = (Lx, Ly, Lz) = (cos τ sin σ, sin τ sin σ, cos σ), i.e. a single distant light source with slant σ and tilt τ.

Putting p = ∂z(x, y)/∂x and q = ∂z(x, y)/∂y,

I(x, y) = (p cos τ sin σ + q sin τ sin σ + cos σ) / (p^2 + q^2 + 1)^{1/2}
        ≈ const + p cos τ sin σ + q sin τ sin σ

Taking Fourier transforms,

F_z = m_z(f, θ) exp[i(φ_z(f, θ) + π/2)]
F_p = 2π cos θ f m_z(f, θ) exp[i(φ_z(f, θ) + π/2)]
F_q = 2π sin θ f m_z(f, θ) exp[i(φ_z(f, θ) + π/2)]

and ignoring the constant term,

F_I(f, θ) ≈ 2π sin σ f m_z(f, θ) exp[i(φ_z(f, θ) + π/2)] (cos θ cos τ + sin θ sin τ)

i.e. the image is related to the depth map by a convolution:

F_I(f, θ) ≈ F_z(f, θ) 2π sin σ f (cos θ cos τ + sin θ sin τ)

Extended light sources are linear combinations, so shape from shading can be solved by deconvolution. [A. Pentland]
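As a concrete illustration, the reflectance map above can be rendered directly from a synthetic depth map. This is a minimal numpy sketch; the Gaussian bump and the slant/tilt values are invented for illustration.

```python
import numpy as np

# Render Lambertian shading
#   I = (p cos(tau) sin(sig) + q sin(tau) sin(sig) + cos(sig)) / sqrt(p^2+q^2+1)
# from a synthetic depth map. Bump shape and light direction are illustrative.
x, y = np.meshgrid(np.linspace(-1, 1, 64), np.linspace(-1, 1, 64))
z = np.exp(-4.0 * (x**2 + y**2))      # a smooth Gaussian bump

p = np.gradient(z, axis=1)            # p = dz/dx (columns vary in x)
q = np.gradient(z, axis=0)            # q = dz/dy (rows vary in y)

tau, sig = np.pi / 4.0, np.pi / 6.0   # tilt and slant of the source
I = (p * np.cos(tau) * np.sin(sig)
     + q * np.sin(tau) * np.sin(sig)
     + np.cos(sig)) / np.sqrt(p**2 + q**2 + 1.0)
```

For small p and q this reduces to the linear approximation const + p cos τ sin σ + q sin τ sin σ used in the Fourier argument above.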
5. Optical Lens Characteristics
• f - focal length
• D - diameter of aperture (light flux ∝ D2)
• f-number = f/D

The ‘speed’ of the lens (luminous flux through the aperture) is halved for each ‘stop’:

1 : 1.4 : 2 : 2.8 : 4 : 5.6 : 8 : 11 : 16 : 22
• depth of field
• field of view
• geometric distortion (eg: radial)
• sharpness (e.g. PSF)
7. Thin Lens and Pin-Hole Models

1/f = 1/u + 1/v

As the aperture decreases (f-number → ∞) the model approximates a pin-hole camera.
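A quick numeric check of the thin-lens equation; the function name and the millimetre units are chosen for illustration.

```python
# Solve 1/f = 1/u + 1/v for the image distance v, given focal
# length f and object distance u in the same units (mm here).
def image_distance(f, u):
    if u <= f:
        raise ValueError("no real image: object at or inside the focal length")
    return 1.0 / (1.0 / f - 1.0 / u)

v = image_distance(50.0, 5000.0)   # 50 mm lens, object 5 m away
```

As u → ∞ the image distance approaches the focal length (v → f), which is why distant scenes are in focus with the sensor at the focal plane.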
8. Pin-Hole Camera Model.

The simplest camera model (pin-hole) is written as

u = (u, v, 1)^T = C (x, y, f)^T

with

C = | f k_u     0     u_0 |
    |   0    -f k_v   v_0 |
    |   0       0      1  |

The vector (x, y, f)^T is obtained from perspective geometry:

(x, y, f)^T = (f/Z) (X, Y, Z)^T = P (X, Y, Z, 1)^T

with

P = | f/Z   0    0   0 |
    |  0   f/Z   0   0 |
    |  0    0   f/Z  0 |
9. Finally, a point X_w in the external world co-ordinate system transforms to the camera frame using

X = R X_w + T

which can be written in homogeneous coordinates as

(X, Y, Z, 1)^T = H (X_w, Y_w, Z_w, 1)^T

with

H = | R    T |
    | 0^T  1 |

Thus the full process of projecting a point in 3D onto the image plane can be written as a single matrix product

u = (u, v, 1)^T = C P H (X_w, Y_w, Z_w, 1)^T

where CPH is the 3x4 projection matrix.

This allows camera calibration to be achieved via closed-form optimisation of a quadratic likelihood function.
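The full chain u = C P H X_w can be checked numerically. The intrinsic values below (f, k_u, k_v, u_0, v_0) and the world point are invented for illustration, and H is taken as the identity (world frame equal to camera frame).

```python
import numpy as np

# Full pin-hole projection u = C P H Xw with made-up intrinsics.
f, ku, kv, u0, v0 = 8.0, 80.0, 80.0, 256.0, 256.0
C = np.array([[f * ku, 0.0,    u0],
              [0.0,   -f * kv, v0],
              [0.0,    0.0,    1.0]])
H = np.eye(4)                          # identity world-to-camera transform

Xw = np.array([0.1, -0.2, 2.0, 1.0])   # homogeneous world point
X = H @ Xw
Z = X[2]
P = np.array([[f / Z, 0, 0, 0],        # note: P depends on the point's depth Z
              [0, f / Z, 0, 0],
              [0, 0, f / Z, 0]])

u_h = C @ (P @ X)                      # homogeneous pixel coordinates
u_h = u_h / u_h[2]                     # normalise (third component is f here)
```

This reproduces u = k_u (f X/Z) + u_0 and v = -k_v (f Y/Z) + v_0, matching the matrices on the slides.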
10. Radial Distortion

Conventional optics can generate images which are sharp but geometrically distorted. The mathematical form is best represented as a modification to the radius r:

r' = r (1 + k1 r^2 + k2 r^4 + k3 r^6 + ...)
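A sketch of applying the radial model to image points already expressed relative to the optical centre; the coefficient value is illustrative.

```python
import numpy as np

# r' = r (1 + k1 r^2 + k2 r^4 + k3 r^6), applied about the optical
# centre; points are assumed already shifted to centre coordinates.
def radial_distort(points, k1, k2=0.0, k3=0.0):
    r2 = np.sum(points**2, axis=1, keepdims=True)
    scale = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    return points * scale

pts = np.array([[0.5, 0.0],
                [0.0, 0.25]])
out = radial_distort(pts, k1=0.1)
```

Note that inverting the map (undistortion) has no closed form in general and is usually done iteratively.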
11. Point Spread Functions

The physical properties of the optics and the sensor often lead to spatial blurring, described by a Point Spread Function h(x, y):

i'(x, y) = i(x, y) ⊗ h(x, y)

which can be considered (via the convolution theorem) as an Optical Transfer Function (OTF) in Fourier space:

FFT(i'(x, y)) = FFT(i(x, y)) × H(θ, φ)
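The convolution-theorem relationship can be verified numerically; the sketch below compares a direct (circular) convolution with a crude box PSF against multiplication in Fourier space.

```python
import numpy as np

# Convolution theorem check: i' = i (x) h  <=>  FFT(i') = FFT(i) * H.
rng = np.random.default_rng(0)
i = rng.random((16, 16))

h = np.zeros((16, 16))
h[:3, :3] = 1.0 / 9.0              # crude 3x3 box PSF (unit sum)

via_fourier = np.fft.ifft2(np.fft.fft2(i) * np.fft.fft2(h)).real

# Direct circular convolution: sum of shifted copies weighted by h.
direct = np.zeros_like(i)
for n in range(3):
    for m in range(3):
        direct += h[n, m] * np.roll(i, (n, m), axis=(0, 1))
```

Because the PSF sums to one, the total image intensity is preserved by the blur.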
12. Optical Spectral Sensitivity

i(x, y) = ∫_0^∞ R(λ) S(x, y, λ) E(λ) dλ

where:
E(λ) - illuminant spectral distribution
S(x, y, λ) - surface reflectance distribution
R(λ) - sensor spectral sensitivity distribution
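A discretised version of the sensitivity integral; the illuminant, reflectance and sensor curves below are invented for illustration.

```python
import numpy as np

# i = integral of R(l) * S(l) * E(l) dl over the visible band,
# evaluated with the trapezoid rule. All three curves are made up.
lam = np.linspace(400.0, 700.0, 301)                # wavelength, nm
E = np.ones_like(lam)                               # flat illuminant
S = np.clip((lam - 400.0) / 300.0, 0.0, 1.0)        # reddish surface
R = np.exp(-((lam - 550.0) / 50.0) ** 2)            # sensor peaked at 550 nm

f = R * S * E
i = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(lam))   # trapezoid rule
```

The same integral with S = 1 (a white surface) gives a larger response, since reflectance only ever attenuates the illuminant.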
13. CCD Array

• Photosensitive detector: planar silicon
• CCD: mega-pixel (1,000 x 1,000 array)
• brightness range (at fixed exposure time) - 10^5, actually encoded 10^2
• uniform spatial distribution
• colour images constructed from mosaic sensors

CCD images have linear sensitivity (≈ 100 levels) and Gaussian noise:

I(x, y) = i(x, y) ± σ
14. Human Eye

• Photosensitive detector: hemispherical retina
• rods: 75-150 million high-sensitivity monochromatic receptors
• cones: 6-7 million low-sensitivity colour receptors (in fovea)
• non-uniform spatial distribution
• brightness range - 10^7, actually encoded 10^5

The human eye has sensitivity to proportional differences (≈ 100 levels), i.e.:

S(x, y) = i(x1, y1) − i(x2, y2) ± γ S(x, y)

These properties seem to be useful for exploiting invariances (both of geometry and illumination) during recognition. [Thacker 2007]
15. Making Images from Images.

In simple image processing the requirement may be purely to enhance the image for viewing. But the aim of advanced image processing is to produce an image that makes certain information explicit in the resulting image values for automated data extraction, e.g. edge strength maps. Generally, high values are located over features of interest.

What determines a good algorithm is its behaviour in the presence of noise: in particular, does the resulting image give results which really can be interpreted purely on the basis of output value, i.e. is a high value genuine, or just a product of the propagated noise?

There are two complementary ways of assessing algorithms: Error Propagation and Monte-Carlo techniques.
16. Error Propagation.
General Approach for Error Propagation.
∆f^2(X) = ∇f^T C_X ∇f

where C_X is the parameter covariance, ∇f is a vector of derivatives

∇f = (∂f/∂X1, ∂f/∂X2, ∂f/∂X3, ...)

and ∆f(X) is the standard deviation on the computed measure.

If we apply this to image processing, assuming that images have uniform random noise, then we can simplify this expression to

∆f^2_xy(I) = Σ_nm σ^2_nm (∂f_xy/∂I_nm)^2

i.e. the contribution to the output from each independent variance involved in the calculation is added in quadrature.
[Haralick 1994]
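The quadrature formula can be checked against a Monte-Carlo estimate. This sketch uses f(X) = X1 X2 with invented means and a diagonal covariance; with small errors the function is close to linear, so the two estimates should agree.

```python
import numpy as np

# Check Df^2 = grad(f)^T C_X grad(f) for f(X) = X1 * X2 against a
# Monte-Carlo estimate, assuming independent Gaussian inputs.
rng = np.random.default_rng(1)
mu = np.array([3.0, 4.0])
sigma = np.array([0.05, 0.02])          # small errors: near-linear regime

grad = np.array([mu[1], mu[0]])         # (df/dX1, df/dX2) at the mean
predicted = np.sum((grad * sigma) ** 2) # quadrature sum = 0.0436

samples = mu + sigma * rng.standard_normal((200000, 2))
measured = np.var(samples[:, 0] * samples[:, 1])
```

The exact variance of a product carries an extra σ1²σ2² term, but for small errors it is negligible, which is exactly the regime in which differential propagation is valid.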
17. Common Statistical Models.

Gaussian sensor (CCD, MRI, CT):

I(x, y) = i(x, y) ± σ

Simple non-linear transformations can be used to transform some statistical models to approximate Gaussian.

Logarithmic sensor (retina):

S(x, y) = i(x, y) ± γ i(x, y)  →  I(x, y) = log[i(x, y)] ± σ

Poisson sensor (low light, histogram, SAR):

P(x, y) = i(x, y) ± γ √i(x, y)  →  I(x, y) = √i(x, y) ± σ

All are of the form

I(x, y) = f(scene) ± σ
18. Analysis Chains.

Figure: propagation through an analysis chain F(I) → G(I) → H(I). A stable algorithm keeps the errors uniform at every stage; in a chain J(I) → K(I) → L(I) built from an unstable algorithm the errors become progressively more non-uniform at each stage.

Alternatively, every algorithm should provide detailed error estimates and make appropriate use of them in subsequent stages. [Courtney 2001]
19. Image Arithmetic.

We can drop the xy subscript as it is not needed.

Addition:
O = I1 + I2
∆O^2 = σ1^2 + σ2^2

Division:
O = I1 / I2
∆O^2 = σ1^2 / I2^2 + I1^2 σ2^2 / I2^4

Multiplication:
O = I1 · I2
∆O^2 = I2^2 σ1^2 + I1^2 σ2^2

Square-root:
O = √I1
∆O^2 = σ1^2 / (4 I1)
20. Logarithm:
O = log(I1)
∆O^2 = σ1^2 / I1^2

Polynomial Term:
O = I1^n
∆O^2 = (n I1^(n−1))^2 σ1^2

Square-root of Sum of Squares:
O = √(I1^2 + I2^2)
∆O^2 = (I1^2 σ1^2 + I2^2 σ2^2) / (I1^2 + I2^2)
Notice that some of these results are independent of the im-
age data. Thus these algorithms preserve uniform random
noise in the output image.
Such techniques form the basis of the most useful building
blocks for image processing algorithms.
Some, however (most notably multiplication and division), produce a result which is data dependent; thus each output pixel will have different noise characteristics. This complicates the process of algorithmic design.
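These propagation results are easy to confirm by simulation; the pixel values and noise levels below are invented for illustration.

```python
import numpy as np

# Monte-Carlo confirmation of the addition and division formulas.
rng = np.random.default_rng(2)
I1, I2, s1, s2 = 100.0, 20.0, 2.0, 1.0
n1 = I1 + s1 * rng.standard_normal(500000)
n2 = I2 + s2 * rng.standard_normal(500000)

add_var = np.var(n1 + n2)                          # predict s1^2 + s2^2 = 5.0
div_var = np.var(n1 / n2)                          # predict s1^2/I2^2 + I1^2 s2^2/I2^4
div_pred = s1**2 / I2**2 + I1**2 * s2**2 / I2**4   # = 0.0725
```

Addition matches exactly in expectation, while division agrees only to first order: with a 5% relative error on the denominator, the higher-order corrections are already visible at the sub-percent level.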
21. Linear Filters.

For linear filters we initially have to re-introduce the spatial subscript for the input and output images I and O:

O_xy = Σ_nm h_nm I_{x+n,y+m}

where h_nm are the linear coefficients.

Error propagation gives

∆O^2_xy = Σ_nm (h_nm σ_{x+n,y+m})^2

For uniform errors this can be rewritten as

∆O^2_xy = σ^2 Σ_nm h_nm^2 = K σ^2

Thus linear filters produce outputs that have uniform errors.

Unlike image arithmetic, although the errors are uniform they are no longer independent, because the same data is used in the calculation of neighbouring output pixels. Thus care has to be taken when applying further processing.
22. Histogram Equalisation.

For this algorithm we have a small problem, as the differential of the process is not well defined. If, however, we take the limiting case of the algorithm for a continuous signal, then the output image can be defined as

O_xy = ∫_0^{I_xy} f dI / ∫_0^∞ f dI

where f is the frequency distribution of the grey levels (i.e. the histogram).

This can now be differentiated, giving

∂O_xy / ∂I_xy = K f(I_xy)

i.e. the derivative is proportional to the frequency of occurrence of the grey-level value I_xy, and the expected variance is

∆O^2_xy = K^2 σ^2_xy f^2(I_xy)

Clearly this will not be uniform across the image, nor would it be in the quantised definition of the algorithm.

Thus although histogram equalisation is a popular process for displaying results (to make better use of the dynamic range available in the display), it should generally be avoided as part of a machine vision algorithm.
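The data-dependent gain can be seen in a small simulation. The empirical equalisation map below is a discrete stand-in for the continuous integral, with an invented Gaussian grey-level distribution.

```python
import numpy as np

# Histogram equalisation maps each grey level to its normalised
# cumulative frequency, so the local gain follows the histogram.
rng = np.random.default_rng(4)
grey = rng.normal(128.0, 20.0, 10000)    # invented grey-level sample

levels = np.sort(grey)

def equalise(v):
    # fraction of pixels with grey level <= v (the integral ratio)
    return np.searchsorted(levels, v) / levels.size

# Gain over a 2-grey-level window near the histogram peak vs in a tail:
gain_peak = equalise(129.0) - equalise(127.0)
gain_tail = equalise(181.0) - equalise(179.0)
```

gain_peak is far larger than gain_tail, so uniform input noise emerges from equalisation with strongly position-dependent variance, exactly as the K² σ² f² expression predicts.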
23. Monte-Carlo Techniques.
Differential propagation techniques are inappropriate when:
• Input errors are large compared to the range of linearity
of the function.
• Input distribution is non-Gaussian.
The most general technique for algorithm analysis which is
still applicable under these circumstances is known as the
Monte-Carlo technique.
This technique takes values from the expected input distribution and accumulates the statistical response of the output distribution.
Figure: the Monte-Carlo technique. Samples x, y drawn from the (non-Gaussian) input error distributions are passed through the (non-linear) algorithm f(x, y), and the output distribution is accumulated.
24. Generating Distributions.

Conventional random number generators produce uniform random variables x. The method for generating a variate from the distribution f(y) using x is to solve for y0 in

x = ∫_{−∞}^{y0} f(y) dy / ∫_{−∞}^{∞} f(y) dy

i.e. x is used to locate a variable some fraction of the way through the integrated distribution (cf. histogram equalisation).

e.g. a Gaussian distribution leads to the Box-Muller method

y1 = √(−2 ln(x1)) cos(2π x2)
y2 = √(−2 ln(x1)) sin(2π x2)

which generates two Gaussian random deviates y1 and y2 for every two input deviates x1 and x2.
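A direct implementation of the Box-Muller formulas, checked on its first two sample moments:

```python
import math
import random

# Box-Muller: two uniform deviates in (0, 1] -> two Gaussian deviates.
def box_muller(rand=random.random):
    x1 = 1.0 - rand()                  # shift into (0, 1] to avoid log(0)
    x2 = rand()
    r = math.sqrt(-2.0 * math.log(x1))
    return r * math.cos(2.0 * math.pi * x2), r * math.sin(2.0 * math.pi * x2)

random.seed(5)
ys = [y for _ in range(20000) for y in box_muller()]
mean = sum(ys) / len(ys)
var = sum(y * y for y in ys) / len(ys) - mean**2
```

The sample mean and variance converge on 0 and 1 respectively, confirming unit Gaussian deviates.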
Armed with distribution generators we can generate many
alternative images for statistical testing from only a few ex-
amples of image data. Examples:
• drop out noise in images.
• bit inaccuracy and thermal noise.
• feature location and orientation accuracy.
25. Summary.

Conventional image formation combines many aspects (optics, geometry and solid-state physics).

The human eye and sensor equipment are characteristically different. This seems to be a consequence of jointly evolving the processor and the sensor.

Many imaging systems generate images which can be modelled as

I = f(scene) ± σ

Though other models (e.g. Rician) exist, this is the conventional basis for statistical algorithm design.

Pre-processing alters the statistical properties of an image; these can be assessed via error propagation or Monte-Carlo.

Image-based matching metrics should be constructed using a valid noise model (e.g. least-squares for homogeneous Gaussian errors).
26. References.

www.tina-vision.net

A. Pentland, Neural Computation, 1(2), pp. 208-217, 1989.

R.M. Haralick, Performance Characterisation in Computer Vision, CVGIP-IE, 60, pp. 245-249, 1994.

P. Courtney and N.A. Thacker, Performance Characterisation in Computer Vision: The Role of Statistics in Testing and Design, ch. in "Imaging and Vision Systems: Theory, Assessment and Applications", J. Blanc-Talon and D. Popescu (eds.), NOVA Science Books, 2001.

N.A. Thacker and E.C. Leek, Retinal Sampling, Feature Detection and Saccades: A Statistical Perspective, Proc. BMVC 2007, vol. 2, pp. 990-1000, 2007.