3. What is OpenCV?
• OpenCV is an open-source computer vision library (written in C++, with widely used Python bindings), applied in Artificial Intelligence, Machine Learning, face recognition, etc.
• The purpose of computer vision is to understand the content of images.
• It extracts a description from a picture, which may be an object, a text description, a three-dimensional model, and so on.
4. Computer vision
• There are two main tasks, which are defined below:
• Object Classification - In object classification, we train a model on a dataset of particular objects, and the model classifies new objects as belonging to one or more of the training categories.
• Object Identification - In object identification, the model identifies a particular instance of an object - for example, parsing two faces in an image and tagging one as Virat Kohli and the other as Rohit Sharma.
6. How does a computer recognize an image?
• Machines "see" by converting an image into numbers and storing those numbers in memory.
• A pixel is the smallest unit of a digital image or graphic that can be displayed and represented on a digital display device.
• The picture intensity at a particular location is represented by these numbers.
7. Grayscale vs. RGB
• Grayscale images contain only shades of gray, ranging from black to white. In this contrast measurement of intensity, black is treated as the weakest intensity and white as the strongest.
• An RGB image is a combination of red, green, and blue channels, which together make up each color.
8. Install
• sudo pip3 install opencv-python
• OpenCV allows us to perform multiple operations on an image, but to do that we first need to read an image file as input; then we can perform various operations on it.
17. Read and Save Image
• cv2.imread(filename[, flag])
• flag: the flag specifies the color type of the loaded image:
• cv2.IMREAD_ANYDEPTH - when set, returns a 16-bit/32-bit image when the input has the corresponding depth; otherwise converts it to 8-bit.
• cv2.IMREAD_COLOR - when set, always returns the image converted to 3-channel color.
• cv2.IMREAD_GRAYSCALE - when set, always converts the image to grayscale.
18. 8-bit or 16-bit or 32-bit
• A one-bit image can only be black and white, because each pixel holds a single bit: one value for black and the other for white.
• When you add more bits, the color depth grows exponentially.
• So an 8-bit image can hold 256 tonal values in each of three channels (red, green, and blue).
• That equals about 16.7 million colors (256^3).
• A 16-bit image has 65,536 tonal values in the same three channels. That means about 281 trillion colors (65,536^3).
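The color counts quoted above follow directly from the per-channel arithmetic: an n-bit channel holds 2**n tonal values, and three independent channels multiply.

```python
# 8 bits per channel: 256 tones per channel, cubed across R, G, B.
tones_8bit = 2 ** 8
colors_8bit = tones_8bit ** 3
print(colors_8bit)  # 16777216, i.e. ~16.7 million colors

# 16 bits per channel: 65,536 tones per channel.
tones_16bit = 2 ** 16
colors_16bit = tones_16bit ** 3
print(colors_16bit)  # 281474976710656, i.e. ~281 trillion colors
```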
19. 8-bit or 16-bit or 32-bit
• When you're photographing, you can choose between shooting in JPEG, which generates 8-bit images, or RAW, which gives you images of 12 to 14 bits depending on the camera you're using.
• Is 16-bit or 32-bit color better?
21. Histogram
• A histogram is a graph or plot of the frequency of pixel intensities in a grayscale image.
• It quantifies the number of pixels for each intensity value considered.
22. Thresholding
• Thresholding is a technique in OpenCV that assigns pixel values in relation to a provided threshold value.
• In thresholding, each pixel value is compared with the threshold value.
• If the pixel value is smaller than the threshold, it is set to 0; otherwise, it is set to a maximum value (generally 255).
23. Thresholding
• If f(x, y) > T
  then f(x, y) = 255
  else
  f(x, y) = 0
• where
  f(x, y) = pixel value at coordinate (x, y)
  T = threshold value
24. Filter
• Here we set the lower_range and upper_range values of the color we want, in BGR format [Blue Green Red].
• Then we create a mask variable that holds that range.
• Then we perform a bitwise AND operation on the given image, passing the mask variable as the mask parameter, and store the result in the result variable.
28. Canny_edge_detector
• The Canny edge detector is an edge detection operator that uses a
multi-stage algorithm to detect a wide range of edges in images.
• It was developed by John F. Canny in 1986.
• http://cs.ucf.edu/~mikel/Research/Edge_Detection.htm
29. Canny_edge_detector
• The Canny edge detection algorithm is composed of 5 steps:
  - Noise reduction;
  - Gradient calculation;
  - Non-maximum suppression;
  - Double threshold;
  - Edge tracking by hysteresis.
30. Noise Reduction
• Since the mathematics involved behind the scenes is mainly based on derivatives (cf. Step 2: Gradient calculation), edge detection results are highly sensitive to image noise.
• One way to get rid of the noise in the image is by applying a Gaussian blur to smooth it.
• To do so, an image convolution is applied with a Gaussian kernel (3x3, 5x5, 7x7, etc.).
31. Noise Reduction
• The equation for a Gaussian filter kernel of size (2k+1)×(2k+1) is given by:
  H_{ij} = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{(i-(k+1))^2 + (j-(k+1))^2}{2\sigma^2} \right), \quad 1 \le i, j \le 2k+1
33. Gradient Calculation
• The gradient calculation step detects the edge intensity and direction by calculating the gradient of the image using edge detection operators.
• Edges correspond to a change of pixel intensity. To detect this, the easiest way is to apply filters that highlight the intensity change in both directions: horizontal (x) and vertical (y).
34. Gradient Calculation
• When the image is smoothed, the derivatives Ix and Iy w.r.t. x and y are calculated. This can be implemented by convolving I with the Sobel kernels Kx and Ky, respectively:
  K_x = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix}, \quad K_y = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix}
• Then, the magnitude G and the slope θ of the gradient are calculated as follows:
  |G| = \sqrt{I_x^2 + I_y^2}, \quad \theta(x, y) = \arctan\left( \frac{I_y}{I_x} \right)
35. Gradient Calculation
• This is how the Sobel filters are applied to the image, and how to get both the intensity and the edge direction.
• The result is almost the expected one, but we can see that some of the edges are thick and others are thin. The non-maximum suppression step will help us mitigate the thick ones.
36. Non-Maximum Suppression
• Ideally, the final image should have thin edges. Thus, we must perform non-maximum suppression to thin out the edges.
• The principle is simple: the algorithm goes through all the points on the gradient intensity matrix and finds the pixels with the maximum value in the edge directions.
37. Non-Maximum Suppression
• In the illustration (not reproduced here), the red box in the upper left corner represents an intensity pixel of the gradient intensity matrix being processed.
• The corresponding edge direction is represented by the orange arrow, with an angle of -pi radians (+/-180 degrees).
• The edge direction is the orange dotted line (horizontal, from left to right).
38. Non-Maximum Suppression
• The purpose of the algorithm is to check whether the pixels in the same direction are more or less intense than the one being processed.
• In the example above, the pixel (i, j) is being processed, and the pixels in the same direction, (i, j-1) and (i, j+1), are highlighted in blue.
• If one of those two pixels is more intense than the one being processed, then only the more intense one is kept.
39. Non-Maximum Suppression
• Pixel (i, j-1) seems to be more intense, because it is white (value of 255). Hence, the intensity value of the current pixel (i, j) is set to 0.
• If there are no pixels in the edge direction with more intense values, then the value of the current pixel is kept.
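The comparison described above can be sketched in pure NumPy. This is a simplified, hypothetical helper restricted to the horizontal edge direction discussed in the example; the real algorithm quantizes θ into four directions and compares along whichever one applies:

```python
import numpy as np

def nms_horizontal(G):
    """Keep a pixel only if it is at least as intense as its left
    and right neighbors, i.e. a local maximum along the edge
    direction; everything else is suppressed to 0."""
    out = np.zeros_like(G)
    for i in range(G.shape[0]):
        for j in range(1, G.shape[1] - 1):
            if G[i, j] >= G[i, j - 1] and G[i, j] >= G[i, j + 1]:
                out[i, j] = G[i, j]
    return out

# A one-row gradient magnitude with a single peak at 255.
G = np.array([[0, 100, 255, 100, 0]], dtype=np.float64)
print(nms_horizontal(G))  # only the 255 peak survives
```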
40. Double threshold
• The double threshold step aims at identifying 3 kinds of pixels: strong, weak, and non-relevant:
  - Strong pixels have an intensity so high that we are sure they contribute to the final edge.
  - Weak pixels have an intensity value that is not high enough to be considered strong, but not small enough to be considered non-relevant for the edge detection.
  - Other pixels are considered non-relevant for the edge.
41. Double threshold
• Now you can see what the double thresholds are for:
  - The high threshold is used to identify the strong pixels (intensity higher than the high threshold).
  - The low threshold is used to identify the non-relevant pixels (intensity lower than the low threshold).
  - All pixels with intensity between the two thresholds are flagged as weak, and the hysteresis mechanism (next step) will help us identify which of them should be considered strong and which non-relevant.
42. Edge Tracking by Hysteresis
• Based on the threshold results, hysteresis consists of transforming weak pixels into strong ones if and only if at least one of the pixels around the one being processed is a strong one.
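The double threshold and hysteresis steps can be sketched together in NumPy. This is a hypothetical single-pass helper with illustrative threshold values; a full implementation repeats the neighbor check (or does a flood fill) until no more weak pixels change, so that chains of weak pixels can propagate from a strong one:

```python
import numpy as np

def double_threshold_hysteresis(G, low, high):
    """Classify pixels as strong/weak/non-relevant, then promote a
    weak pixel to strong only if one of its 8 neighbors is strong."""
    strong = G >= high
    weak = (G >= low) & (G < high)
    out = np.zeros_like(G)
    out[strong] = 255
    rows, cols = G.shape
    for i in range(rows):
        for j in range(cols):
            if weak[i, j]:
                i0, i1 = max(i - 1, 0), min(i + 2, rows)
                j0, j1 = max(j - 1, 0), min(j + 2, cols)
                if strong[i0:i1, j0:j1].any():  # any strong neighbor?
                    out[i, j] = 255
    return out

G = np.array([[30, 80, 200],
              [10, 80, 10]], dtype=np.float64)
result = double_threshold_hysteresis(G, low=50, high=150)
print(result)  # both weak 80s touch the strong 200 and are promoted
```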
44. Scale Invariant Feature Transform [SIFT]
• The idea is to collect gradient directions and magnitudes around each keypoint.
• Then we figure out the most prominent orientation(s) in that region, and we assign this orientation(s) to the keypoint.
• Any later calculations are done relative to this orientation.
• Convert the image into a grayscale image.
• To initialize the SIFT object we can use the cv2.SIFT_create() method (older builds exposed it as cv2.xfeatures2d.SIFT_create()).
• Now, with the help of the SIFT object, we can detect all the features with its detectAndCompute() method.
45. Other Methods
• Speeded-Up Robust Features (SURF)
• Gradient Location and Orientation Histogram (GLOH)
• Histogram of Oriented Gradients (HOG)