Computer vision is an interdisciplinary field that focuses on enabling computers to interpret and analyze visual data from the world around us. It involves the development of algorithms and techniques that allow machines to understand images and videos, just as humans do.
The main goal of computer vision is to create machines that can "see" and understand the world around them, and then use that information to make decisions or take actions. This can involve tasks such as object recognition, scene reconstruction, facial recognition, and image segmentation.
Computer vision has a wide range of applications in various fields, such as healthcare, entertainment, transportation, robotics, and security. Some examples include medical image analysis, autonomous vehicles, augmented reality, and surveillance systems.
In recent years, the development of deep learning techniques, particularly convolutional neural networks (CNNs), has greatly advanced the field of computer vision, allowing machines to achieve state-of-the-art performance on various visual recognition tasks.
3. What is Computer Vision?
• Computer vision is the science and technology of machines that see.
• Concerned with the theory behind building artificial systems that obtain information from images.
• The image data can take many forms, such as a video sequence, depth images, views from multiple cameras, or multi-dimensional data from a medical scanner.
4. Computer Vision
Make computers understand images and videos.
What kind of scene?
Where are the cars?
How far away is the building?
…
5. Components of a computer vision system
Lighting
Scene
Camera
Computer
Scene Interpretation
Srinivasa Narasimhan’s slide
7. Vision is really hard
• Vision is an amazing feat of natural intelligence
• The visual cortex occupies about 50% of the Macaque brain
• More of the human brain is devoted to vision than to anything else
Is that a queen or a bishop?
10. A little story about Computer Vision
• In 1966, Marvin Minsky at MIT asked his undergraduate student Gerald Jay Sussman to "spend the summer linking a camera to a computer and getting the computer to describe what it saw". We now know that the problem is slightly more difficult than that. (Szeliski 2009, Computer Vision)
11. A little story about Computer Vision
Marvin Minsky: founder, MIT AI project
13. Ridiculously brief history of computer vision
• 1966: Minsky assigns computer vision as an undergrad summer project
• 1960s: interpretation of synthetic worlds
• 1970s: some progress on interpreting selected images
• 1980s: ANNs come and go; shift toward geometry and increased mathematical rigor
• 1990s: face recognition; statistical analysis in vogue
• 2000s: broader recognition; large annotated datasets available; video processing starts; vision & graphics; vision for HCI; internet vision, etc.
(Images: Guzman '68; Ohta & Kanade '78; Turk and Pentland '91)
15. Optical character recognition (OCR)
Technology to convert scanned documents to text
• If you have a scanner, it probably came with OCR software
Digit recognition, AT&T labs: http://www.research.att.com/~yann/
License plate readers: http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
18. Object recognition (in supermarkets)
• LaneHawk by Evolution Robotics
• "A smart camera is flush-mounted in the checkout lane, continuously watching for items. When an item is detected and recognized, the cashier verifies the quantity of items that were found under the basket, and continues to close the transaction. The item can remain under the basket, and with LaneHawk, you are assured to get paid for it…"
20. Login without a password…
• Fingerprint scanners on many new laptops and other devices
• Face recognition systems now beginning to appear more widely
http://www.sensiblevision.com/
24. Sports
• Sportvision first-down line
• Nice explanation on www.howstuffworks.com
• http://www.sportvision.com/video.html
25. Smart cars
• Mobileye [wiki article]
• Vision systems currently in high-end BMW, GM, and Volvo models
• By 2010: 70% of car manufacturers.
Slide content courtesy of Amnon Shashua
28. Vision in space
Vision systems (JPL) used for several tasks
• Panorama stitching
• 3D terrain modeling
• Obstacle detection, position tracking
• For more, read “Computer Vision on Mars” by Matthies et al.
NASA'S Mars Exploration Rover Spirit captured this westward view from atop
a low plateau where Spirit spent the closing months of 2007.
32. Prerequisites
A good working knowledge of C/C++, Java, or Matlab
A good understanding of math (linear algebra, basic calculus, basic probability)
Willingness to learn new topics (optimization, statistical learning, etc.)
37. Focus
Basic understanding of OpenCV face recognition software and algorithms
Methods and theory behind the EigenFace method for facial recognition
Implementation using Python in a Linux-based environment
Runs on a Raspberry Pi
38. Goal
One half research:
• General facial recognition methods
• EigenFaces
• OpenCV's facial recognition
One half implementation:
• Create a system capable of facial recognition
• Real-time
• Able to run on a Raspberry Pi
40. Different Facial Recognition Methods
Geometric
Eigenfaces
Fisherfaces
Local Binary Patterns
Active Appearance
3D Shape Models
41. Geometric
• First method of facial recognition
• Done by hand at first
• Automation came later
• Find the locations of key parts of the face
• And the distances between them
• Good initial method, but had flaws
• Unable to handle multiple views
• Required good initial guess
42. Eigenfaces
• Information theory approach
• Codes and then decodes face images to perform recognition
• Uses principal component analysis (PCA) to find the most important bits of information
43. Fisherfaces
• Same approach as Eigenfaces
• Instead of PCA, uses linear discriminant analysis (LDA)
• Better handles intrapersonal variability within images, such as lighting changes
44. Local Binary Patterns
• Describes local features of an object
• Compares each pixel to its neighbors
• The histogram of the image contains information about the distribution of the local micro-patterns
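The pixel-to-neighbor comparison can be sketched in NumPy. This is a minimal illustration of the basic (non-rotation-invariant) LBP operator, not OpenCV's implementation; `lbp_code` and `lbp_histogram` are hypothetical helper names.

```python
import numpy as np

def lbp_code(patch):
    """LBP code for the center pixel of a 3x3 patch.

    Each of the 8 neighbors is compared to the center: 1 if the
    neighbor is >= the center intensity, else 0. The 8 bits are
    packed into a single byte (0-255).
    """
    center = patch[1, 1]
    # Neighbors in clockwise order starting at the top-left corner.
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    code = 0
    for bit, value in enumerate(neighbors):
        if value >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels of a 2-D array."""
    h, w = image.shape
    codes = [lbp_code(image[y - 1:y + 2, x - 1:x + 2])
             for y in range(1, h - 1) for x in range(1, w - 1)]
    return np.bincount(codes, minlength=256)
```

Two face images can then be compared by comparing their (typically region-wise) LBP histograms.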
46. Basic Idea
• Let a face image 𝐼(𝑥, 𝑦) be a two-dimensional 𝑁 by 𝑁 array of (8-bit) intensity values
• The image can be considered a vector of dimension 𝑁²
• An image of 256 by 256 becomes a vector of dimension 65,536
• Or a point in 65,536-dimensional space
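A quick NumPy sketch of this view of an image as a point in a high-dimensional space (the image here is synthetic random data, for illustration only):

```python
import numpy as np

# Synthetic 256x256 grayscale image of 8-bit intensity values.
image = np.random.randint(0, 256, size=(256, 256), dtype=np.uint8)

# Flatten the N x N array into a single vector of dimension N^2;
# a 256 x 256 image becomes a point in 65,536-dimensional space.
vector = image.reshape(-1)
```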
47. Basic Idea
• Images of faces will not differ too much from one another
• This allows a much smaller dimensional subspace to be used to classify them
• PCA finds the vectors that best describe the distribution of face images
• These vectors are
• 𝑁² long
• Describe an 𝑁 by 𝑁 image
• Linear combinations of the original face images
48. Basic Idea
• These vectors are called eigenfaces
• They are the eigenvectors of the covariance matrix
• They resemble faces
49. Method
• Acquire an initial training set of face images
• Calculate the eigenfaces
• Keep only the 𝑀 eigenfaces that correspond to the highest eigenvalues
• These images now define the face space
• Calculate the corresponding distribution in 𝑀-dimensional weight space for each known individual
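The training steps above can be sketched with NumPy's SVD, which yields the eigenvectors of the covariance matrix without forming the 𝑁²×𝑁² matrix explicitly. This is an illustrative sketch, not OpenCV's implementation; `train_eigenfaces` and `project` are hypothetical names.

```python
import numpy as np

def train_eigenfaces(faces, M):
    """Compute the top-M eigenfaces from a training set.

    faces: array of shape (num_images, N*N), one flattened face per row.
    Returns the mean face and an (M, N*N) array of eigenfaces.
    """
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # The right singular vectors of the centered data are the
    # eigenvectors of its covariance matrix, sorted by decreasing
    # eigenvalue, so keeping the first M rows keeps the eigenfaces
    # with the highest eigenvalues.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = vt[:M]
    return mean_face, eigenfaces

def project(face, mean_face, eigenfaces):
    """Weights of a face in the M-dimensional face space."""
    return eigenfaces @ (face - mean_face)
```

Each known individual's stored representation is then simply their 𝑀-dimensional weight vector from `project`.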
50. Method
• Calculate weights for a new image by projecting the input image onto each of the eigenfaces
• Determine if the face is known
• Within some tolerance, close to face space
• If within face space, classify the weights as either known or unknown
• (Optional) Update eigenfaces and weights
• (Optional) If the same face repeats, add it to the known faces
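A minimal sketch of the known/unknown decision in weight space, assuming the weights of known individuals are stored in a dict; `recognize` is a hypothetical helper, not part of OpenCV's API.

```python
import numpy as np

def recognize(weights, class_weights, threshold):
    """Nearest face class by Euclidean distance in weight space.

    class_weights: dict mapping label -> stored M-dimensional weights.
    Returns (label, distance); label is None when the distance exceeds
    the tolerance threshold (unknown face).
    """
    label, distance = min(
        ((name, np.linalg.norm(weights - w))
         for name, w in class_weights.items()),
        key=lambda pair: pair[1])
    if distance > threshold:
        return None, distance
    return label, distance
```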
51. Classifying
• Four possibilities for an input image
• Near face space, near face class
• Known face
• Near face space, not near face class
• Unknown face
• Not near face space, near face class
• Not a face, but may look like one (false positive)
• Not near face space, not near face class
• Not a face
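The four cases above can be expressed with two distances: the distance to face space (the reconstruction error) and the distance to the nearest face class in weight space. A hedged sketch, with illustrative names and caller-supplied thresholds:

```python
import numpy as np

def classify(face, mean_face, eigenfaces, class_weights,
             space_threshold, class_threshold):
    """Four-way decision from the two distances.

    eigenfaces: (M, N*N) array of orthonormal eigenface rows.
    class_weights: dict mapping label -> stored M-dimensional weights.
    """
    weights = eigenfaces @ (face - mean_face)
    # Distance to face space: how well the eigenfaces reconstruct the input.
    reconstruction = mean_face + eigenfaces.T @ weights
    dist_space = np.linalg.norm(face - reconstruction)
    # Distance to the nearest known face class in weight space.
    dist_class = min(np.linalg.norm(weights - w)
                     for w in class_weights.values())

    near_space = dist_space <= space_threshold
    near_class = dist_class <= class_threshold
    if near_space and near_class:
        return "known face"
    if near_space:
        return "unknown face"
    if near_class:
        return "not a face (false positive)"
    return "not a face"
```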
52. OpenCV and Theory
• The beauty of OpenCV is that a lot of this process is completely automated
• Need:
• Training images
• Type of training to specify
• Number of eigenfaces
• Threshold
• Input image
56. Training
• Model was trained using positive and negative images
• Creates a training file that holds the 𝑀-dimensional face space
• Now have a base to recognize from
import cv2
import numpy as np
# Create the eigenface recognizer and train it on the face images
# and their integer labels (OpenCV 2.4 API).
model = cv2.createEigenFaceRecognizer()
model.train(np.asarray(faces), np.asarray(labels))
57. Recognition
• Steps to recognizing a face
• Capture image
• Detect face
• Crop and resize around the face
• Project across all eigenvectors
• Find the face class that minimizes the Euclidean distance
• Return the label from the face class and the Euclidean distance
• The Euclidean distance is also called the confidence level
import cv2
# Load the trained model and predict the label of a new face image
# (OpenCV 2.4 API); a lower confidence value means a closer match.
model = cv2.createEigenFaceRecognizer()
model.load(config.TRAINING_FILE)
label, confidence = model.predict(image)
58. Test
• Created four different tests
• First data set uses 24 positive training images
• Almost no pose and lighting variation
• Second data set uses 12 positive training images
• Good pose variation, little lighting variation
• Third data set uses 25 positive training images
• Good pose and lighting variation
• Fourth data set uses the second and third data sets but with the Fisherface method
59. Results
• Results from data sets 1–3, each from 20 input images
• Confidence represents distance from the known face class

Data Set | Mean Confidence | Max Confidence | Min Confidence
1        | 3462            | 3948           | 3040
2        | 2127            | 2568           | 1835
3        | 1709            | 2196           | 1217
60. Results
• Results from the eigenface vs. fisherface comparison

Algorithm | Data Set | # Training Images | Mean Confidence | Max Confidence | Min Confidence
Eigen     | 2        | 12                | 2127            | 2568           | 1835
Fisher    | 2        | 12                | 2029            | 2538           | 1468
Eigen     | 3        | 25                | 1709            | 2196           | 1217
Fisher    | 3        | 25                | 2017            | 2748           | 1530
61. Conclusion
• Theory behind eigenfaces
• Face space
• Training
• Simple implementation of OpenCV’s eigenface recognizer
• Compared different training models
• Number of training images
• Pose and lighting variations
• Compared eigenfaces and fisherfaces
62. Conclusion
• Future work
• Further testing of different training models
• Implement updating facial recognition
64. Computer vision can be used in the footwear manufacturing industry to automate certain processes and reduce the need for manual labor. Here are a few of the ways:

Process optimization: Computer vision can help optimize the manufacturing process by monitoring and controlling the production line. For example, it can be used to verify that the correct materials are being used and that they are being assembled correctly.

Defect detection: Computer vision can be used to detect defects in footwear products, such as stitching errors or material inconsistencies. By detecting defects early, the manufacturing process can be adjusted to correct the problem, reducing the need for manual labor to fix defects later.
65. Quality control: Computer vision can be used to inspect finished footwear products and ensure that they meet the required quality standards. This can reduce the need for manual inspection and increase the overall efficiency of the manufacturing process.

Inventory management: Computer vision can be used to monitor the inventory of materials and finished products. By automating the inventory management process, the need for manual labor can be reduced.

Automated assembly: Computer vision can be used to automate the assembly of footwear products, such as attaching soles to uppers. By automating this process, the need for manual labor can be greatly reduced.

Overall, there are many ways computer vision can reduce the need for labor in footwear manufacturing. By contributing to the development of computer vision models and applications, you can help improve the efficiency and cost-effectiveness of the manufacturing process while reducing the need for manual labor.