Department of Information Systems and Computing
BSc (Hons) Computer Science
Academic Year 2010 - 2011
Gesture Based Interaction and Image Processing
Sabnam Pandey (0822577)
A report submitted in partial fulfilment of the requirement for the degree of Bachelor
of Science
Brunel University
Department of Information Systems and Computing
Uxbridge, Middlesex UB8 3PH
United Kingdom
Tel: +44 (0) 1895 203397
Fax: +44 (0) 1895 251686
 
ABSTRACT
Gesture based interaction systems are becoming more and more popular both at the
workplace and at home. This project intends to develop a system that can recognize hand
gestures, which can then be used as input commands to interact with a PC, applied here
to picture gallery browsing. One of the key areas that needs to be looked at while
developing such systems is the image processing stage.
Since it would be very hard to produce an algorithm that can recognize gestures in the
time allocated, within my project I plan to design and implement a system that can
perform general image processing of the user image captured in real time. Most of my
work will be based on image processing techniques. My expected outcome is an
algorithm that can detect skin regions in the captured user image and detect contours
around the detected skin regions.
In order to manage my project I will be using the Waterfall model as the system
development methodology. Microsoft Visual C++ will be used for the implementation of
the code, in combination with the OpenCV library. I feel that if I successfully meet my
targets then I will have contributed towards the future of natural gesture based interfaces,
if only in a minimal way.
 
FYP Task 3 Project Report Student ID:0822577
3	
  
ACKNOWLEDGEMENTS
I would firstly like to thank my supervisor Dr. Mark Perry, whose input and feedback have
shaped the way I have approached this project. I would also like to thank my second
reader Dr. David Bell for his feedback. I could not have asked for a better supervisor or
second reader.
I would also like to thank my family; their encouragement and motivation have given me
the strength to keep going with my work and never lose hope even when struck by an
obstacle. I would also like to thank my close friends who have been there for me
throughout.
TOTAL NUMBER OF WORDS: 10,244
(Not including contents, abstract, appendix and diagrams, etc)
I certify that the work presented in the dissertation is my own unless referenced
Signature: ………………..
Date: 29th March, 2011
 
FYP Task 3 Project Report Student ID:0822577
4	
  
TABLE OF CONTENTS
CHAPTER 1: INTRODUCTION
1.1 Introduction
1.2 Research aim and objectives
1.3 Research approach
1.4 Project Scope
1.5 Dissertation outline
1.6 Summary
CHAPTER 2: LITERATURE REVIEW
2.1 Human-Computer Interaction
2.1.1 Interaction methods
2.2 Gesture Recognition
2.2.1 Hand gesture recognition
2.3 Vision Based Method
2.4 Image Processing
2.4.1 Color Spaces
2.4.1.1 RGB
2.4.1.2 HSV (Hue Saturation Value)
2.4.2 Image Segmentation
2.4.3 Skin Detection
2.4.3.1 Explicitly Defined Skin Region
2.4.4 Contour Detection
2.5 Summary
CHAPTER 3: METHODOLOGY
3.1 Introduction
3.2 System Development Life Cycle
3.2.1 The Waterfall Model
3.2.2 The Spiral Model
3.3 Development Methodology Chosen
3.4 Implementation Methodology
3.4.1 Programming Language
3.4.2 Open Source Library
3.5 Summary
CHAPTER 4: REQUIREMENTS ANALYSIS AND DESIGN
4.1 Introduction
4.2 Software Engineering
4.3 Requirements Specification
4.3.1 Functional Requirements
4.4 Block Diagram
4.5 UML Diagrams
4.5.1 UML Use Case Diagram
4.5.2 Activity Diagram
4.6 Assumptions
4.7 Summary
CHAPTER 5: IMPLEMENTATION
5.1 Development Tools
5.2 OpenCV Functions Used
5.3 Initialization
5.4 Image Processing
5.4.1 Image Conversion
5.4.2 Skin Detection
5.4.3 Contour Detection
5.5 Summary
CHAPTER 6: TESTING AND EVALUATION
6.1 Introduction
6.2 Black-Box Testing
6.3 Testing Evaluation
6.4 Project Evaluation
6.4.1 Project Aim
6.4.2 Objectives
CHAPTER 7: CONCLUSION
7.1 Summary of the Dissertation
7.2 Research Contributions
7.3 Limitations and Future Development
7.4 Personal Reflections
REFERENCES
APPENDIX
(A) Source Code
(B) Communications Log
 
CHAPTER 1: INTRODUCTION
This chapter highlights the purpose of the project and the project approach. Furthermore,
it outlines the aims and objectives, as well as the scope, to assist through each stage of
the project.
1.1 Introduction
Human–computer interaction (HCI) is the study, planning and design of the interaction
between users (people) and computers. As an alternative to traditional HCI interfaces
such as the keyboard and the mouse, the use of human movements, involving the face,
the whole body and especially hand gestures, has attracted more and more people in
recent years. A computer that can recognize and respond to the user's gestures could
provide a natural interface. The diverse logical and physical capabilities of users (e.g.
elderly people, children or people with disabilities) also require human-computer
interfaces that are easily learnable and usable, instead of traditional interaction
techniques such as the mouse and keyboard, which require a certain kind of skill and
restrict the user to a certain physical mode. One of the latest developments in the field of
gaming using such an interaction technique is Microsoft's Kinect for the Xbox 360
console. It enables users to control and interact with the Xbox 360 without the need to
touch a game controller; such gesture based systems have become extremely popular
among today's community.
Hands are among the most versatile tools in the human body for accomplishing
different tasks, and they are one of the important features in our user interfaces and
interaction applications. Interest in the field of computer vision based hand gesture
recognition has increased in recent times due to its potential application in the field of
Human Computer Interaction. The most important feature of this technique is that the
system uses hand gesture recognition as an input, through which users can control
the system or devices without having to touch any external interaction devices such as
a mouse or keyboard. It also gives users a sense of freedom and ease in accomplishing
different tasks. This serves as a motivating factor for carrying out this project.
This project aims to develop a system which captures certain hand gestures as
input from the users using a web camera, and performs the task associated with the
gesture recognized. The recognized gestures will be applied to browsing the picture
gallery of the PC being used. A program based on an open source library, which looks
into image processing and hand gesture recognition, will be developed to accomplish
the aim of this project.
 
1.2 Research aim and objectives
Aim
The aim of this project is to develop and implement a set of algorithms that utilizes
gesture recognition to browse through the picture gallery of a PC using a web camera,
with the purpose of making the PC more usable and improving the interactions between
users and computers.
*Redefined Aim
Due to the unexpected complexity of the problem, this project will look into developing
and implementing a set of algorithms that enables skin and contour detection of the
user's hand in real time.
Objectives
In order to fulfill the aims set for this project, the following objectives must be met:
1. Carry out a literature review on areas like gesture recognition, image processing,
and open source libraries.
2. Determine the appropriate methodology to be used to design and implement
the proposed system based on a detailed analysis of each methodology.
3. Formulate requirements based on techniques chosen from the literature found
during research.
4. Use the analyzed requirements to begin designing the steps involved in image
processing.
5. Implement the processes of skin detection and contour detection using a
suitable programming language and an appropriate open source library.
6. Assess the developed code by applying various testing techniques to ensure
that the test cases developed for the software conform to the requirements
specification.
7. Evaluate the implemented code and justify whether the derived results have
achieved the project aims and objectives.
1.3 Research approach
The literature on topics like computer vision, gesture recognition and image processing
will be researched and reviewed to gain information on existing gesture recognition
systems and the common techniques used, which can be applied to the system we
intend to develop. After gaining some knowledge about the topic, an appropriate
software development methodology will be chosen to plan the progress of the project.
The information gathered through the literature review will be analyzed to gather the
requirements of the proposed system, based on which the design of the system will
commence. The system will then be implemented and tested to ensure there are no
faults. The project will then be evaluated against the stated aims and objectives.
1.4 Project Scope
Initially, the aim of this project was to recognize hand gestures in real time to browse
through the picture gallery of the PC being used. Due to the strict time allocation and a
lack of knowledge of high level programming, it was necessary to limit the scope of the
project. Therefore, the scope of this project involves capturing images via a
vision-based sensor (web camera) and applying different image processing techniques
to analyze the image better, which can then be used in future work to detect hands and
recognize gestures. The main focus will be on image processing. The literature on
hand tracking and gesture recognition will be discussed only to give the readers a
basic knowledge of how this project can be further developed to reach the initial aim of
the project, but this will not be considered during the implementation phase.
1.5 Dissertation outline
This will outline the remaining chapters in the dissertation:
 
Chapter 2 (Literature Review): This chapter looks into various literature sources for
research into relevant topics of interest to the project and to determine its direction.
Chapter 3 (Methodologies): This chapter looks into the Software Development
Lifecycle that is suitable for this project and the various development tools used.
Chapter 4 (Design): This chapter models the requirements of the software and uses
various design methods to understand and develop the software.
Chapter 5 (Implementation): This chapter discusses how the implementation of the
algorithms was carried out and the problems faced during the process.
Chapter 6 (Testing and Evaluation): Determine and apply various testing techniques to
ensure that no errors occurred during the implementation, in order to make the
software reliable. Then evaluate the project against the aims and objectives.
Chapter 7 (Conclusion): Sum up the project, reflect on the problems faced throughout
the project, and discuss the limitations and future work.
1.6 Summary
This chapter gave a general introduction to this project and the aims and objectives set
out for it. It also outlined the different chapters that we will come across as we move
towards the completion of the report. The next chapter will discuss the literature in
relation to this project.
 
CHAPTER 2: LITERATURE REVIEW
This chapter focuses on research, covering relevant literature relating to this project on
touchless hand gesture based human computer interaction. Appropriate journals,
books and Internet sites will be used to gather the relevant literature. This chapter
describes the literature associated with gesture recognition and image processing. It
discusses the different steps that comprise image processing and the techniques used
to accomplish those stages. The literature also covers previous studies aimed at the
development of similar systems. The techniques used previously are analyzed and
appropriate ones chosen to devise a feasible solution.
2.1 Human-Computer Interaction
Human–computer interaction (HCI) is the study, planning and design of the interaction
between people and computers. The aim of HCI is to improve the interactions between
computers and users by making computers more usable and more responsive to users'
needs.
2.1.1 Interaction methods
There are different methods with which we can interact with the computer, the most
common being the use of a mouse. The mouse was developed at Stanford Research
Laboratory (now SRI) in 1965 as part of the NLS project (funded by ARPA, NASA, and
Rome ADC) [9] to be a cheap replacement for light pens, which had been used at least
since 1954 [10, p. 68]. The mouse was then made famous as a practical input device
by Xerox PARC in the 1970s.
Another interaction method that has become increasingly popular in recent years is
using gestures to interact with the computer. Instead of learning completely new ways
to interact, users may prefer to adopt the natural ways of communication that they are
familiar with in everyday life. These demands have resulted in research in which user
interfaces take advantage of the natural ways that people interact with each other and
the physical world, e.g. speech, gesture, eye-gaze, and physical tools (Grasso et al.
(1998); Oviatt (2001); Wang et al. (2001)). Such systems accept gestures as an input
from the user, recognize the input gestures and perform the task associated with each
gesture. This project will look into gesture-based interaction in real time.
 
2.2 Gesture Recognition
Gestures can be described as different types of human movements. These can be
two-dimensional or three-dimensional and can be specific to hand, arm or body
movements as well as facial expressions (Hoffman et al. (2004)).
Gesture recognition enables humans to interface with the machine and interact
naturally without any external devices such as the keyboard. It is a method of
assigning commands to the computer (machine) to perform specific tasks.
This project will focus specifically on hand gestures, as they are easier to perform
and recognize with less effort. Also, the users of the software are going to be a mixed
crowd, so it might be difficult for some people to perform elaborate gestures.
2.2.1 Hand gesture recognition
A hand gesture is a sequence of hand postures connected by continuous hand or
finger movements over a short period of time. Hand gestures provide a separate,
complementary modality to speech for expressing one's ideas, so a hand gesture
recognition system can be a natural way of communicating between the computer and
humans.
There are basically two approaches to hand gesture recognition: vision based and
non-vision based.
The non-vision based approach uses sensor devices like the data glove shown in
figure 2.1 below. The extra sensors make it easy to collect hand location and movement.
 
Figure 2.1
The vision based approach uses a camera as an input device, thus facilitating a natural
interaction between users and computers without the use of any extra devices.
This project looks into the vision based method, which is discussed in detail in the next
section.
2.3 Vision Based Method
Bare-hand gestures are probably the most straightforward interpretation when people
think about gestures. Here we refer to gestures that are defined entirely by the
movements of the user's hands and/or fingers. Typically, bare hand gestures are
captured using computer vision techniques, i.e. cameras watching the user's
movements (Hardenberg et al. (2001); Segen et al. (1998)). It can be a single camera,
stereo or multiple cameras, depending on the application and settings.
This project requires a gesture recognition method that is easy to use and allows the
user a certain level of freedom. This project uses a single camera (web camera) as an
input device to capture gestures performed by the users. Vision based gesture
recognition seems to be the better option due to its advantages over the non-vision
based method. The devices used in non-vision methods are expensive and
cumbersome for the users. Also, such devices are generally connected to the computer,
which restricts the free movement of the users to perform the activity they want,
whereas in the vision method the users are free to perform gestures without any
restrictions.
Similar research on gesture recognition has been done throughout the past, covering a
wide variety of approaches with successful outcomes. For example, Segen [13] describes
a system with two cameras that can recognize three gestures and track the hand in 3D.
The system detects two fingers (the thumb and pointing finger) by extracting the
feature points on the hand contour and outputs their poses. In [15] and [16], Quek
describes a system called the finger mouse that can replace the mouse with hand
gestures for certain actions. The system defines only one gesture (the pointing gesture),
and the SHIFT key on the keyboard is adopted to register a mouse button press. In [17]
Triesch presents a robust gesture recognition algorithm using multiple cues, including
motion, color and stereo cues. This algorithm is used to build a gesture interface for a
real robot grasping objects on a table. This project uses approaches from Segen's work.
 
The development of vision based gesture recognition software generally goes through
different phases:
• Image Processing
• Hand Tracking
• Hand Gesture Recognition
The following sections discuss these stages in detail and past works related to the
field.
2.4 Image processing
"Images are stored as a collection of pixels." [4] Color images consist of a red, green
and blue value, which are combined to allow colors to be represented. Grayscale
images are different, however, "as pixels are represented by a single number ranging
from 0 to 255, where 0 is very black and 255 is very white." [4]
Image processing in computing is used to extract useful information from images in
order to perform specific tasks. Image processing generally involves three basic steps:
image segmentation, which involves image conversion between different color spaces
to minimize the complexity of the image; skin detection, which gets rid of unwanted
background objects and noise associated with the image; and contour detection, to
locate an object in the image. Each of these stages will be discussed in detail below.
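To make these steps concrete, the sketch below shows how they map onto OpenCV's
C API. It is an illustrative outline only: the capture setup and the threshold values are
the ones used later in chapter 5, and error handling is omitted.

#include "cv.h"
#include "highgui.h"

// Illustrative outline of the three image processing steps using the
// OpenCV C API. Not the project's final code; see chapter 5 for the
// actual implementation and parameter choices.
int main()
{
    CvCapture* capture = cvCaptureFromCAM(0);
    IplImage* frame = cvQueryFrame(capture);                 // captured RGB frame
    IplImage* hsv   = cvCreateImage(cvGetSize(frame), 8, 3);
    IplImage* skin  = cvCreateImage(cvGetSize(frame), 8, 1);

    cvCvtColor(frame, hsv, CV_BGR2HSV);                      // 1. segmentation: RGB to HSV
    cvInRangeS(hsv, cvScalar(0, 30, 80, 0),                  // 2. skin detection by
               cvScalar(20, 150, 255, 0), skin);             //    thresholding
    CvMemStorage* storage = cvCreateMemStorage(0);
    CvSeq* contours = NULL;
    cvFindContours(skin, storage, &contours,                 // 3. contour detection
                   sizeof(CvContour), CV_RETR_LIST,
                   CV_CHAIN_APPROX_SIMPLE, cvPoint(0, 0));

    cvReleaseMemStorage(&storage);
    cvReleaseImage(&hsv);
    cvReleaseImage(&skin);
    cvReleaseCapture(&capture);
    return 0;
}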
2.4.1 Color Spaces
A device color space simply describes the range of colors, or gamut, that a camera can
see, a printer can print, or a monitor can display [18]. The various color spaces exist
because they present color information in ways that make certain calculations more
convenient or because they provide a way to identify colors that is more intuitive.
2.4.1.1 RGB
RGB is an initialism for Red, Green and Blue. RGB is one of the most widely used color
spaces for processing and storing digital image data. However, the high correlation
between channels and the mixing of chrominance and luminance data make RGB not a
very favorable choice for color analysis and color-based recognition algorithms
(Vezhnevets et al.).
 
2.4.1.2 HSV (Hue Saturation Value)
Hue-saturation based color space is another popular color space, based on human
color perception. Hue defines the major color of an area. Saturation measures the
colorfulness of an area in proportion to its brightness. The value is related to the color
luminance. HSV was introduced for users who need to define color properties
numerically. It is easy to implement and can also be converted to and from RGB at any
time (Vezhnevets et al.).
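To make these definitions concrete, the sketch below gives the standard per-pixel
RGB-to-HSV conversion. This is the textbook formula, not code from this project; in
practice OpenCV performs this conversion via cvCvtColor, as chapter 5 shows.

#include <cmath>
#include <algorithm>

// Textbook RGB-to-HSV conversion for a single pixel, with r, g, b in [0, 1].
// Outputs: hue h in [0, 360) degrees, saturation s and value v in [0, 1].
void rgbToHsv(float r, float g, float b, float& h, float& s, float& v)
{
    float mx = std::max(r, std::max(g, b));
    float mn = std::min(r, std::min(g, b));
    float d  = mx - mn;

    v = mx;                                 // value: the brightest channel
    s = (mx == 0.0f) ? 0.0f : d / mx;       // saturation: colorfulness vs. brightness
    if (d == 0.0f) {
        h = 0.0f;                           // gray pixel: hue is undefined, use 0
    } else if (mx == r) {
        h = 60.0f * std::fmod((g - b) / d + 6.0f, 6.0f);
    } else if (mx == g) {
        h = 60.0f * ((b - r) / d + 2.0f);
    } else {
        h = 60.0f * ((r - g) / d + 4.0f);
    }
}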
2.4.2 Image segmentation
In computing, segmentation refers to the process of partitioning an image into multiple
segments (sets of pixels). The main aim of segmentation is to simplify and/or change
the representation of an image into something more meaningful and easier to use and
analyze (Linda et al. (2001)).
Image segmentation is one of the first steps involved in the process of gesture
recognition, in our case hand gesture recognition. The image captured by the camera
cannot be used to track the hand or recognize gestures, as it contains other
background objects and generally exists in the RGB color space, which makes the skin
detection process complex due to the involvement of different color pixels in the image.
So, in order to make skin detection simpler, the image needs to be converted to a
simpler color space which is easier to analyze and involves fewer color pixels. After the
HSV skin color model is built, it can be used for skin detection.
2.4.3 Skin Detection
Skin color is one of the most important features of humans. Many color spaces have
been used in early work on skin detection, such as RGB, YCbCr and HSV (Yoo et al.
(1999)). Although RGB is one of the most used color spaces for processing images, it
is not widely used in skin detection algorithms because its chrominance and luminance
components are mixed (Hashem (2009)). Some work has been done to compare the
performance of different color spaces in skin detection problems. According to Zarit et
al., HSV gives the best performance for skin pixel detection. When building a system
that uses skin color as a feature for detection, several points must be kept in mind,
such as which color space to choose and how to model the skin color distribution
(Vezhnevets et al.).
 
In this project, “a skin color model based on HSV color space will be built because it
has only two components (H, S) which help to speed up the calculations and also the
transformations from RGB color space into HSV color space is done using simple and
fast transformations” (Hashem (2009)).
2.4.3.1 Explicitly Defined Skin Region
The first step in skin detection is pixel-based skin detection. This is one of the easiest
methods, as it explicitly defines skin-color boundaries in different color spaces.
Different threshold ranges are defined for each color space component, and the image
pixels that fall within the predefined ranges are considered skin pixels (Vezhnevets et
al.). The simplicity of this method has attracted (and still attracts) many researchers
(Peer et al. 2003).
How this is applied in this project to generate a set of algorithms is discussed in
chapter 4. The work of is taken as reference for skin detection.
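As an illustration of such an explicitly defined rule, the sketch below implements the
frequently cited RGB "daylight" rule from Peer et al. (2003). It is shown for comparison
only; this project's own detector thresholds in HSV instead, as described in chapter 5.

#include <algorithm>
#include <cstdlib>

// Illustrative pixel-based skin classifier: the RGB daylight rule from
// Peer et al. (2003). Not the project's detector, which uses HSV
// thresholds (see chapter 5). r, g, b are 8-bit values (0-255).
bool isSkinPixel(int r, int g, int b)
{
    int mx = std::max(r, std::max(g, b));
    int mn = std::min(r, std::min(g, b));
    return r > 95 && g > 40 && b > 20   // each channel above its floor
        && (mx - mn) > 15               // not a gray/achromatic pixel
        && std::abs(r - g) > 15         // red and green well separated
        && r > g && r > b;              // red is the dominant channel
}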
2.4.4 Contour Detection
The term contour can be defined as an outline or a boundary of an object.
Therefore, contour detection deals with detecting various objects in an image. Contour
detection in image processing is used to locate objects and their boundaries in
images. Also, the output of contour detection shows only the prominent region
boundaries, leaving behind unwanted edges in the image. Hence, detection of specific
objects in the image is only possible through contours (Lahari et al.).
 
Figure 2.2
So in this project it is very important to detect the contours of the hand before we can
extract the hand features from the image taken by the camera.
2.5 Summary
This chapter looked into different areas of gesture recognition and its complexity in
application. The research helped to gather knowledge on hand gesture recognition
systems and their design requirements. It also looked into the different image
processing steps, namely image segmentation, skin detection and contour detection.
Different techniques for skin detection were examined, and the decision was made to
use the HSV color space and an explicitly defined skin detection model. The gathered
literature covers the aspects required to develop initial ideas during the design and
implementation stages. The next chapter looks into the different methods and tools
used in the development of the project and the software.
CHAPTER 3: METHODOLOGY
3.1 Introduction
This chapter will illustrate the comparison between different system development
lifecycle approaches, namely the Waterfall and Spiral models, as well as the chosen
system development lifecycle approach used throughout the project. Different
methodologies to assist in the development of the software are discussed.
3.2 System Development Life Cycle
The systems development life cycle (SDLC) is a conceptual model used in project
management that describes the stages involved in an information system development
project, from an initial feasibility study through maintenance of the completed
application. For a software project we need to consider the project lifecycle and the
various approaches to system development for managing the project. There are
various models that evolved from the traditional development lifecycle, so selecting the
appropriate lifecycle to follow will contribute to the success of the project. The most
commonly used development lifecycles include the waterfall and spiral models. The
ways in which these differ from each other are discussed below.
 
3.2.1 Waterfall Model
The waterfall approach was the first process model to be introduced and widely
followed in software engineering to ensure the success of a project. In the waterfall
model, the whole process of software development is divided into separate phases:
requirements specification, software design, implementation, and testing &
maintenance, as shown in the figure below. These phases are related to each other
such that the second phase starts when the defined set of goals for the first phase has
been achieved (Parekh (2011)).
Figure 3.1
3.2.2 Spiral Model
The spiral model, suggested by Barry Boehm in 1988, is similar to the waterfall model
but follows an evolutionary or iterative approach to system development. The spiral
model, as seen in the figure below, combines the iterative nature of prototyping with the
controlled and systematic aspects of the waterfall model, thereby providing the potential
for rapid development of incremental versions of the software. In this model the
software is developed in a series of incremental releases, with the early stages being
either paper models or prototypes. Later iterations become increasingly complete
versions of the product. Further, there are no fixed phases for requirements
specification, design or testing in the spiral model, so it is used in cases where the
requirements are not well understood or are difficult to specify. The main feature of this
method is identifying potential risks to the system (Amber (2003)).
 
Figure 3.2
3.3 Development Methodology Chosen
The waterfall model is the most suitable development methodology because, as our
project is software oriented, it is important to make sure the program works at each
stage without any errors and produces the stated result. In the waterfall model there is
no user involvement during the development process, and it focuses on a minimal set
of requirements that will not be changed throughout the project. The users are
involved only after the end product is finished, which is the case in this project,
whereas the spiral model requires user involvement throughout. Unlike the spiral
model, which centers on risk analysis, in the waterfall model risk analysis is a sub-task.
As this project involves little risk, using a model that focuses heavily on a phase that is
not important would just lead to a loss of time. Also, the spiral model is time consuming
due to the fact that it is an iterative lifecycle.
Requirements – the requirements of the system were analyzed by researching the
existing literature on image processing techniques, making it possible to design the
required algorithms to perform basic image processing, namely image segmentation,
skin detection and contour detection.
Design – this stage will take the information gathered from the requirements and
design a suitable system that relates to the objectives set. With the relevant
requirements gathered from the literature researched, the system requirements are
obtained. UML diagrams will be used to understand the design of the software in
depth.
 
Implementation – once the design stage has been completed, the implementation of
the proposed system design will be initiated. The following steps will be carried out:
• Implementation of the algorithm which captures the image from the camera
• Implementation of the algorithm which looks into image processing
Testing – in this stage, the implemented algorithms will be tested using the testing
methodologies discussed in chapter 6.
Evaluation – finally, the evaluation of the application against the aims and objectives
will be performed.
3.4 Implementation Methodology
In developing software we need to consider the various resources available for us to
use. This part looks into the programming language and the open source library used
in the design and implementation of the software.
3.4.1 Programming language
The program will be written in C++ in the Microsoft Visual Studio 2008 Express Edition
environment, as it is freely available and able to identify problems or syntax errors,
reducing the amount of time needed to go through the source code.
3.4.2 Open Source Library
The open source library used in this project is the OpenCV library. It is an open
source library of programming functions mainly aimed at real time computer vision,
developed by Intel. OpenCV functions related to image processing and tracking were
used in the development of implementable algorithms which meet the aims of this
project.
3.5 Summary
This chapter explained that the chosen software development lifecycle the project will
adopt is the Waterfall model. It also looked into how this project goes about the
different stages of the development cycle, and it discussed the implementation
methodology chosen as the solution to the development of the software. The next
chapter will discuss how the system will be designed and how it functions, with the
help of different UML diagrams.
CHAPTER 4: REQUIREMENTS ANALYSIS AND DESIGN
4.1 Introduction
The previous chapter explained that the system development lifecycle adopted is the
Waterfall model and that the solutions applied to development issues will be used in
the design. This chapter will explain the design of the developed software and how it
will function, with the use of some UML diagrams. It applies the first two stages of the
Waterfall model, requirements analysis and design, to the overall project.
4.2 Software Engineering
The software is developed in C++, which is an object oriented language like Java;
therefore software engineering needs to be applied in an object oriented environment.
For the development of high quality software, it must meet the user requirements set at
the start of the project. The software should be reliable, with few bugs, and should be
well maintainable. It should have very good usability. The software engineering
approach applied to this project will be to define a process that is concerned with
fulfilling the requirements and using verification and validation to make sure the
product built is stable.
4.3 Requirements Specification
From the findings in chapter 2, the literature review, we can state the requirements of
the software that will be developed. A requirement is a statement that identifies a
necessary characteristic or quality of a system in order for it to have value and utility to
a user. Requirements are used as an input into the design stage of a system, as they
will be used in the design and implementation of the product. Requirements are also
an important input into the testing phase, as the tests should produce output as stated
in the requirements. Hence, it is very important for requirements to be simple and
specific.
As we can see from chapter 1, the aim of the project is to create software that is
capable of performing basic image processing techniques and recognizing hand
gestures performed by the users, which will be used to browse through the picture
gallery of the PC being used. This gives us a general idea of what the software needs
to do, but to get an in-depth idea of what the user will be able to do with the software
and how the system operates, we will state the detailed functional requirements. The
functional requirements describe how the system is required to function.
4.3.1 Functional Requirements
1. The camera used will be able to capture user images from the video sequences.
2. The software will be able to produce multiple frames and display the image in the
RGB color space.
3. The software will be able to display the converted RGB image in a new window.
4. The software will be able to detect the skin regions of the user in the image
captured.
5. The software will be able to detect the contours of the detected skin regions.
The implementation of each requirement will be discussed in chapter 5
(Implementation). The following section will look into the use of various UML diagrams
to represent the structure of the system.
4.4 Block Diagram
Figure 4.1
Figure 4.1 gives the block diagram of the overall system. This was the initial design,
but due to time constraints and a lack of knowledge of the programming language
used, which will be discussed in detail in chapter 5 (Implementation), this project will
look into the initial stages of hand gesture recognition. The redefined aim is to develop
and implement a set of algorithms which enables the detection of skin and contours in
a real time environment.
4.5 UML Diagrams
UML (Unified Modelling Language) is a useful method for visualizing and documenting
software systems design. UML includes a set of graphic notation techniques to create
visual models of software systems. It is used to specify, visualize, modify, construct
and document the artifacts of an object-oriented software system under development
(Mishra 1997).
4.5.1 UML Use Case Diagram
The use case diagram is used to identify the primary elements and processes that form
the system. It defines a goal-oriented set of interactions between external actors and
the system. The primary elements are termed "actors" and the processes are called
use cases or actions. Actors are entities that utilize the application in order to
complete a task. An actor may be a class of users, roles users can play, or other
systems. Cockburn (1997) distinguishes between primary and secondary actors. A
primary actor is one having a goal requiring the assistance of the system. A secondary
actor is one from which the system needs assistance. Since this project focuses more
on the image processing phase, this particular use case diagram will focus on the
secondary actor, i.e. the web camera, and its part in the functionality of the software.
 
Figure 4.2
Figure 4.2 shows the use case diagram, which includes two actors, where "user" is the
primary actor and "camera" the secondary actor. As said earlier, the diagram focuses
more on the secondary actor, as this project deals mainly with the image processing
phase.
The user actor, which is a human accessing the system, can quit the program
externally after it is initiated. Also, the image captured by the camera is the image of
the user accessing the system.
The camera actor, which is the camera of the PC being used, is responsible for
capturing the user image in real time. Since this project looks into vision based hand
gesture recognition in real time, meaning that we are trying to identify the hand in the
video sequences captured by the camera without the help of any external device like
the data glove (described in chapter 2, the literature review), image processing
becomes an important phase of the system: it helps to eliminate unwanted background
objects, to focus on the more important parts of the image, and to make the image
easier to analyze in order to extract useful information.
Below are the use cases; these go through each use case and detail what actions are
performed by the actor and what outcome is expected.
Use Case 1: Image capture
Actor: Camera
Goal: To capture an image of the user from the video sequence.
Overview: The web camera will grab an image of the user, which will then be
processed to extract useful information.
Use Case 2: Image processing
Actor: Camera
Goal: Process the image captured by the web camera.
Overview: Convert the initial RGB image captured by the camera into the HSV color
space, then undergo the skin detection and contour detection phases to get the image
ready for the hand tracking phase.
The detail of each use case is provided above, including the different steps involved
and the expected outcome. The use case diagram does not provide information
regarding cases where the expected result is not produced. In order to cover this risk,
an activity diagram was produced to give a detailed view of the different paths within
an activity.
4.5.2 Activity Diagram
The process flows in the system are captured in the activity diagram. An activity
diagram consists of activities, actions, transitions, initial and final states, and guard
conditions.
It is used for modeling the logic captured by a single use case scenario (Amber (2003)).
Figure 4.3
In figure 4.3, the image captured by the camera is first in the RGB color space. If the
camera is not initialized or is not able to capture video, the user will be required to
externally restart the program. In order to detect skin in the captured image, it first
needs to be converted into the HSV color space, as HSV was introduced for users who
need to define color properties numerically. After the image has been converted into
the HSV color space, we use the explicitly defined skin model for skin detection. In the
explicitly defined skin model, different threshold ranges are defined for each color
space component, and the image pixels that fall within the predefined ranges are
considered skin pixels. This process helps to distinguish skin regions from non-skin
regions. The importance of this stage is that it helps eliminate unwanted background
objects and extract the important regions from the image, in our case a hand. The next
step is contour detection, where the outline of the object in the HSV image is detected.
Hence, the skin regions in the HSV image are outlined. This simplifies the feature
extraction of the hand, which makes tracking of the hand easier. The next three steps
will not be considered, due to the complexity of the algorithms for hand tracking and
hand gesture recognition, and also due to a lack of programming knowledge and the
shortage of time available for the completion of the project. Since there are no more
actions involved after contour detection, the user can quit the program in order to end
the process.
4.6 Assumptions
Throughout the design, many assumptions were made. This section briefly describes
some of the important ones. The main assumption was made regarding the
environment in which the system is operated. The place where the user intends to use
the system is expected to have enough lighting for the camera to capture video;
insufficient lighting conditions may lead to undesirable outputs. Also, the user is
assumed to be wearing full-sleeved clothing with only the hand area showing, as the
program is not able to exclude other skin regions like the arms. It would be possible to
program around such problems, for example by segmenting the hand area from other
regions once the skin detection phase is implemented and then applying the contour
detection algorithm to the segmented hand image. However, the development of such
algorithms requires as much work as another project all by itself, is not feasible in the
time allocated, and requires high level programming knowledge.
4.7 Summary
This chapter looked into the functional requirements that were gathered by carrying
out in-depth literature research on image processing as well as on past projects. It also
used various UML diagrams, the use case and activity diagrams, to represent the
different aspects of the system. The use cases outlined the primary and secondary
actors (user and camera respectively) associated with the system and the tasks that
they should be able to accomplish. The activity diagram showed the flow of tasks
within the system. The chapter also covered some assumptions made while designing
the system that should be considered by the user while using the system. The next
chapter looks into how the requirements stated above are implemented.
CHAPTER 5: IMPLEMENTATION
The previous chapter discussed how the software was designed and how it would
function. This chapter will focus on the implementation stage of the project. It will use
the design phase information to develop the algorithms required to successfully
accomplish the defined aims and objectives.
Gathering the requirements and the proposed design concepts, the algorithm can be
implemented to capture the image, detect the skin in the captured image and detect
the contours that will make tracking of the hands easier for gesture recognition. To do
this, the algorithm must be implemented using a flexible programming language which
has already been used by other programmers for developing such software.
5.1 Development Tools
After looking at many past projects on similar software, the decision was taken to use
Microsoft Visual C++ as the programming environment. Also, the OpenCV library used
in this project, an open source library containing functions that specialize in image
processing and gesture recognition, is compatible with the C++ programming
language.
The main problem that arose in the implementation phase was familiarizing myself
with the programming language as well as with OpenCV. Having to learn a whole new
programming language led to a major loss of time. Also, understanding how the
OpenCV functions work and how I could use them in Visual C++ was a hectic task.
After being able to successfully implement some simple functions in C++, I finally
started coding the software while continuing my research to get more in-depth
knowledge about the C++ programming language. Then it came to my knowledge that
C++ would not provide the expected output effectively, i.e. implementing the gesture
recognition phase would involve a lot of complicated processes. A solution would have
been to use C sharp (C#), which deals with high level programming in topics like
gesture recognition. At this point I was completely baffled, as I had already finished
coding the first phase of the system, i.e. image processing. After losing so much time
learning C++, I could not afford to go back and learn a new programming language all
over again and lose the precious time I had left. Therefore, I decided to carry on coding
in Visual C++ and implement as much as I could in the limited time available. This was
one of the major reasons for not being able to implement the whole hand gesture
recognition system. Hence, the sections below look at how the implementation of the
image processing part of the system was carried out.
5.2 OpenCV Functions Used
cvGetSize(): returns the CvSize structure giving the size of an existing image.
IplImage: an OpenCV construct; OpenCV uses this structure to handle all kinds of
images.
cvCreateImage(): creates a new image to hold the changes made to the original image.
cvCvtColor(): converts from one color space to another, expecting the data type to be
the same.
CV_BGR2HSV: conversion code to convert a BGR image to the HSV (Hue Saturation
Value) color space.
cvNamedWindow(): opens a window on the screen that can contain and display an
image.
cvScalar(): takes one, two, three or four arguments and assigns them to the
corresponding elements of val[]; it is used here to hold per-channel threshold values.
cvInRangeS(): checks whether the pixels in an image fall within a particular specified
range.
cvSmooth(): used to reduce noise or camera artifacts.
cvFindContours(): computes contours from binary images.
cvDrawContours(): draws a contour on the screen.
5.3 Initialization
Since this project deals with real time images captured from the web camera, we used
the HighGUI portion of the OpenCV library, which deals with input/output routines and
functions for storing and loading videos and images. We used the OpenCV function
cvCreateCameraCapture(), called via its alias cvCaptureFromCAM() in the code below,
which initializes video capture from a camera.
 
#include "highgui.h"
int main()
{
CvCapture* capture = cvCaptureFromCAM(0);
if(!cvQueryFrame(capture)){ cout<<"Video capture failed, please check the
camera."<<endl;}else{cout<<"Video camera capture status: OK"<<endl;};
}
cvReleaseCapture( &capture);
}
The function cvCreateCameraCapture takes as its argument the ID of the camera to be
initialized and returns a pointer to CvCapture, which is OpenCV's video capture
structure. Since there is just one camera associated with the PC used, the camera ID
in our case is 0. The CvCapture structure will contain all the information about the
image frames captured from the camera that was initialized. The next step is to let the
users know whether initializing and capturing a video stream from the camera was
successful; the 'if' statement is used for this check. The cvReleaseCapture function
frees the memory associated with the CvCapture structure.
Figure 5.1
 
Figure 5.2
Figure 5.3
Figures 5.1 and 5.2 are the initial frames that appear to give the user information on
whether the camera has been initialized. Figure 5.1 shows the frame that appears
when the program cannot find a camera to initialize for capturing images. Figure 5.2
shows the frame that appears when the camera is initialized and starts to capture
images. Figure 5.3 shows the initial image in the RGB color space.
5.4 Image Processing
As seen in section 2.4 of the literature review, the image processing phase involves
the sub-stages of image conversion, skin detection and contour detection. Below, we
look at how each stage was implemented using the different image processing
functions available.
5.4.1 Image Conversion
int main()
{
    int c = 0;
    // "capture" is the CvCapture* initialized in section 5.3
    CvSize size = cvGetSize(cvQueryFrame(capture));
    IplImage* src = cvCreateImage(size, 8, 3);
    IplImage* hsv = cvCreateImage(size, 8, 3);

    while (c != 27)   // loop until the ESC key (code 27) is pressed
    {
        src = cvQueryFrame(capture);
        cvCvtColor(src, hsv, CV_BGR2HSV);
        cvNamedWindow("src", 1);
        cvNamedWindow("hsv", 1);
        cvShowImage("hsv", hsv);
        cvShowImage("src", src);
From the code snippet above, we can see that the variable "size" is declared as a
CvSize structure obtained from the OpenCV function cvGetSize; it gives the size of the
image captured in the frame. An image named "src" is created using the function
cvCreateImage to hold the actual image captured by the camera in the RGB color
space. The first argument is the size of the image captured by the camera, the second
argument indicates the bits available in each channel, in our case 8 bits per channel,
and the last argument indicates the number of channels, which in our case is 3.
Similarly, another image called "hsv" is created, with exactly the same arguments, to
hold the image after it is converted into the HSV color space. Next, the function
cvCvtColor converts the original RGB image stored in the variable "src" into the HSV
color space using the conversion code CV_BGR2HSV, and places the output in the
variable "hsv". Lastly, the code creates windows named "src" and "hsv" using the
function cvNamedWindow and calls the function cvShowImage, which displays the
RGB image and the HSV image in the "src" and "hsv" windows respectively.
Figure 5.4
Figure 5.4 shows the RGB image (left), which is converted into the HSV color space
(right).
5.4.2 Skin Detection
IplImage* mask = cvCreateImage(size, 8, 1);   // single-channel mask for skin pixels
CvScalar min = cvScalar(0, 30, 80, 0);        // lower HSV bound
CvScalar max = cvScalar(20, 150, 255, 0);     // upper HSV bound
In the above code snippet, an image is created to hold the skin regions detected in the
HSV image. The first argument is the size of the actual RGB image frame captured,
with 1 channel of 8 bits. The two variables "min" and "max" are declared using the
function cvScalar. The "min" takes the arguments 0, 30, 80 as the lower HSV bounds
and "max" takes the arguments 20, 150, 255 as the upper HSV bounds.
cvInRangeS(hsv, min, max, mask);
The above code compares the HSV image with the constant (CvScalar) values in the
lower and upper bounds (min and max). If a value in the HSV image is greater than or
equal to the value in the lower bound (min) and less than the value in the upper bound
(max), the corresponding value in the output image is set to 255. Hence, an
intermediate image (mask) with the detected skin pixels shown in white is displayed;
the non-skin parts are displayed in black.
Figure 5.5
Figure 5.5 shows the HSV image on the left and the skin detected parts of the image
on the right. But as we can see, the skin detected parts are not very clear and contain
a large amount of noise. In order to get rid of the noise in the image we use the
function cvSmooth.
cvSmooth( mask, mask, CV_MEDIAN, 27, 0, 0, 0 );
 
The code above takes the intermediate skin detected image and removes the noise
associated with it using the CV_MEDIAN smoothing type (a median filter with an
aperture size of 27).
Figure 5.6
Figure 5.6 is the output produced after applying the smoothing function to filter out the
noise. As we can see, we get a clearer image with a minimal amount of noise. It also
helped to get rid of the unwanted background objects that could be seen in figure 5.5.
5.4.3 Contour Detection
IplImage* out = cvCreateImage(size, 8, 3);    // image to draw the contours on
                                              // ("new" is a reserved word in C++,
                                              // so the image is named "out" here)
CvSeq* contours = NULL;
CvMemStorage* storage = cvCreateMemStorage(0);
cvFindContours(mask, storage, &contours, sizeof(CvContour), CV_RETR_LIST,
               CV_CHAIN_APPROX_SIMPLE, cvPoint(0, 0));
In the above code snippet, the first argument is the input image in which the contours
need to be found. This image needs to be an 8-bit single channel image; in our case
the input image is the image where all the skin parts are detected, represented by the
variable "mask". The next argument, storage, indicates a place where the function
cvFindContours() can find memory in which to record the contours; the storage area
has been allocated with cvCreateMemStorage(). Next is &contours, which is a pointer
to a CvSeq*. CV_RETR_LIST and CV_CHAIN_APPROX_SIMPLE are the mode and
method respectively. The mode set in this case retrieves all the contours and puts
them in a list. The method tells us how the contours are approximated; the method set
here compresses horizontal, vertical, and diagonal segments, leaving only their end
points.
cvDrawContours(out, contours, CV_RGB(0, 200, 0), CV_RGB(0, 100, 0), 1, 1, 8,
               cvPoint(0, 0));
In the above code snippet, the function cvDrawContours() takes as its first argument
the image in which the contours are drawn; in our case, the image created to hold the
output contour image, represented by the variable "out". The next arguments, the
CV_RGB values, indicate the colors with which the contours are drawn; in our case the
contours are drawn in green.
Figure 5.7
 
Figure 5.7 shows the contour drawn with the use of the function cvDrawContours(). As
we can see, the contour is drawn in green as discussed above.
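For completeness, the per-frame loop opened in section 5.4.1 presumably terminates
as sketched below; this is a reconstruction based on the while( c != 27 ) condition, not
the verbatim project code.

        // Reconstruction: cvWaitKey() pumps the HighGUI event loop and
        // returns the pressed key, so pressing ESC (key code 27) ends the
        // while loop opened in section 5.4.1.
        c = cvWaitKey(10);
    }
    cvDestroyAllWindows();        // close the "src" and "hsv" windows
    cvReleaseCapture(&capture);   // free the camera
    return 0;
}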
Figure 5.8
 
Figure 5.9
5.5 Summary
To summarize, the implementation phase used the initial design stages to successfully
develop a set of algorithms that detects skin regions in the user image captured in real
time and draws contours around them. The annotations above give an insight into the
algorithms' functionality. To determine the overall success of the implemented
algorithms, test cases must be applied; these will involve images of the user captured
under various conditions, thereby proving the robustness of the implemented
algorithms once results are achieved during the testing phase.
CHAPTER 6: TESTING AND EVALUATION
Software testing is the process of analyzing a software item to detect the differences
between existing and required conditions (that is, bugs) and to evaluate the features of
the software item [19, 20]. Software testing generally involves validation and
verification processes. Verification is the process of evaluating a system or component
to determine whether the products of a given development phase satisfy the conditions
imposed at the start of that phase [21]; verification activities include testing and
reviews. Validation is the process of evaluating a system or component during or at the
end of the development process to determine whether it satisfies the specified
requirements [21].
6.1 Introduction
Once the code has been implemented, the testing of the code is carried out. Testing
ensures that the software developed functions as expected. There are two basic
testing techniques available: black-box testing and white-box testing.
White-box testing focuses on the internal structure of the code. The white-box tester
(most often the developer of the code) looks into the code of the system that has been
implemented and searches for errors by writing test cases. Here the inputs are chosen
specifically to exercise particular paths, so the results can be evaluated more clearly,
as discussed in “Redstone Software Inc” (2008). White-box testing is often used
for verification.
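As an illustration of the white-box approach, the skin-thresholding step can be
exercised directly in a small unit test. The following is a minimal sketch rather than
part of the submitted test suite: the two pixel values are chosen by hand, one inside
and one outside the HSV skin range used in chapter 5, and the expected mask values
are checked by printing them.
#include "cv.h"
#include <cstdio>

int main()
{
    IplImage* hsv  = cvCreateImage( cvSize(2, 1), 8, 3 );
    IplImage* mask = cvCreateImage( cvSize(2, 1), 8, 1 );

    cvSet2D( hsv, 0, 0, cvScalar(10, 100, 150, 0) );  // inside the skin range
    cvSet2D( hsv, 0, 1, cvScalar(90, 200, 50, 0) );   // outside the skin range

    CvScalar min = cvScalar(0, 30, 80, 0);            // thresholds from chapter 5
    CvScalar max = cvScalar(20, 150, 255, 0);
    cvInRangeS( hsv, min, max, mask );

    printf( "skin pixel  -> %d (expected 255)\n", (int)cvGet2D(mask, 0, 0).val[0] );
    printf( "other pixel -> %d (expected 0)\n",   (int)cvGet2D(mask, 0, 1).val[0] );

    cvReleaseImage( &hsv );
    cvReleaseImage( &mask );
    return 0;
}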
Black-box test design is usually described as focusing on testing functional
requirements. Knowledge of the system's internal code is not required; the test cases
are built around what the software is supposed to do. Valid and invalid inputs are
taken to determine whether the correct output is produced, as discussed in “Redstone
Software Inc” (2008). Black-box testing is often used for validation. The black-box
testing technique was applied here to test the stated functionality, which ensures that
the right product has been built. This type of testing aims to uncover errors including
interface errors, performance errors, and initialization and termination errors, to name
a few that are relevant to this project (Pressman, 2005).
6.2 Black-Box Testing
Test | Description       | Desired Outcome                                  | Actual Outcome                                          | Result
1    | Initialize camera | Display user image in new window                 | Displayed user original image in new window             | PASS
2    | Image conversion  | Display HSV image in new window                  | Displayed HSV image in new window                       | PASS
3    | Filtering         | Smooth out image                                 | Smoothed out image                                      | PASS
4    | Skin detection    | Display skin areas in the image in a new window  | Displayed skin regions as white with background removed | PASS
5    | Contour detection | Draw contour around skin region                  | Green outline drawn along the detected skin region      | PASS
6    | Exit              | Exit when user presses the 'ESC' key             | Exits when the 'ESC' key is pressed by the user         | PASS
Table 6.1
Table 6.1 shows the black-box tests that were run on the code developed; tests
labeled “PASS” passed at the first iteration of the code. All of the test cases listed
passed.
6.3 Testing Evaluation
Gathering all the results from testing, the software clearly performs all the functions
that it was intended to perform. It can capture a user image and carry out the color
space conversion in real time. The filtering was successfully performed on the image
to get rid of any noise associated with it. The skin regions in the image were
successfully detected, and the contour was successfully drawn along the outer edge of
the detected skin regions.
6.4 Project Evaluation
This section evaluates each project phase and will discuss each stage in regards to
success and constraints. This chapter will include a summary of project and further
research and development that can improve this project further.
This section looks into the objectives set under aims and objectives stage are
discussed to determine the success.
6.4.1 The Project Aim
The original aim was to develop gesture recognition software that uses hand gestures
to browse through the picture gallery of the PC being used. Due to the various
problems discussed earlier, the project instead focused on image processing, and the
basic image processing code developed can be used as a base algorithm for any
image processing or gesture recognition system. The code uses the image processing
functions provided by OpenCV, an open source computer vision library that specializes
in real-time image processing and vision algorithms. The code developed was able to
convert the real-time image captured in the RGB color space to the HSV color space,
which makes it easier to detect the skin pixels in the image.
6.4.2 Objective
In this chapter we will consider how well our objectives were met and how useful they
were to the overall project.
The first objective (O1) aimed to gather literature on gesture recognition, image
processing and open source libraries. This objective was achieved through chapter 2
(literature review), which looked at the different techniques used by past researchers
in the image processing field. Decisions on which image processing techniques the
software would use were made from this research. The research in this area
uncovered many technologies and gave me an insight into basic image processing
processes and the development environment. Overall, I would consider this objective
an important phase of the project for gaining an understanding of the problem
definition, and I would consider the research into this area successful.
The second objective (O2) aimed to shed light on the possible methodologies that
could be used to implement the techniques discussed for O1. This was achieved
through the techniques discussed in chapter 3 (methodology). The Waterfall model
was adopted as the development methodology and was applied throughout for the
successful management of the project. The Microsoft Visual C++ environment, in
combination with the OpenCV library functions, was used to implement the code
developed using the techniques described in chapter 2 (literature review). I believe
this objective was met and has shaped the successful completion of the project.
The third objective (O3) aimed to gather the requirements needed to start the design
of the intended system. Chapter 4 (requirements analysis and design) looked into the
detailed functional requirements of the system, which were gathered through the
literature review. The successful analysis of the requirements led to the achievement
of O3.
The fourth objective (O4) was the design of the system. Chapter 4 (design) detailed
the design of the system; use case and activity diagrams were used to describe the
internal processes of the image processing system. I believe that the detail in the
design phase helped me to identify the different algorithms that needed to be
implemented in order to get the system working.
The fifth objective (O5) looked into the implementation; chapter 5 gives the details of
the implementation of the various functions. It shows that OpenCV functions were
used to perform the different image processing tasks, namely image conversion, skin
detection and contour detection, and it presents the outputs produced by the code
developed, which marks the achievement of this objective. Even though the code
developed is quite stable, it still needs some improvement and could be made more
usable by adding further functionality, as discussed in section 7.3 (chapter 7:
conclusion).
The sixth objective (O6) was covered in chapter 6. Testing was carried out to ensure
that the code was stable and usable: of the black-box and white-box techniques
described in section 6.1, black-box test cases were applied to validate the project,
setting a desired outcome for each function and reviewing the outputs produced. The
software was tested successfully.
The final objective (O7) was to evaluate the project against the aims and objectives
set. I feel that this section has done this well: not only have we evaluated the project
as a whole against the aim, but we have also evaluated each objective and discussed
which sections played a part in achieving it. The software effectively captures the
image in real time and goes through the conversion, skin detection and contour
detection processes, which covers the redefined aim stated in chapter 1 (introduction).
CHAPTER 7: CONCLUSION
This chapter provides a recap of the key points of the dissertation, a brief overview of
the research contributions made, the scope for improvements in future research and
development of the system evolved from this study, and personal reflections on the
overall project, to emphasize its achievements, limitations and scope.
7.1 Summary of the dissertation
Chapter 1: Introduction
This chapter provided an introduction to the project topic, gesture recognition and
image processing, and stated the aim of the project: to develop software that performs
basic image processing and hand gesture recognition. It also discussed the research
approach and the project scope. I later decided to narrow the project scope and
concentrate solely on image processing.
Chapter 2: Literature Review
This chapter looked into the literature related to gesture recognition and image
processing in order to gain in-depth knowledge of the topic. The literature was then
analyzed and various past projects reviewed in order to identify the key techniques to
be used in the project. The vision-based method of hand gesture recognition was
chosen, which does not require the use of any external devices. The requirements
needed to design the system were gathered after analyzing the different techniques.
Chapter 3: Methodology
This chapter discussed the Software Development Life Cycle (SDLC) methods that
could be considered for this project, namely the Waterfall model and the Spiral model.
Based on an analysis of each, the Waterfall model was chosen as the most suitable
for this project. The research methodology involved analyzing the literature on the
techniques used by researchers and any past projects that looked into similar
problems. Finally, Visual C++ was chosen as the platform to implement the software,
incorporating OpenCV, an open source library containing functions that specialize in
image processing and gesture recognition.
Chapter 4: Requirements Analysis & Design
This chapter discussed the functional requirements determined from the literature
reviewed. UML (Unified Modeling Language) diagrams, namely use case and activity
diagrams, were used to plan the design of the software and to get an overview of the
tasks the software is supposed to perform. This was followed by the design of the
system according to the requirements specified.
Chapter 5: Implementation
The final design was implemented using Microsoft Visual C++. The implementation
also made use of the image processing functions provided by the OpenCV library.
Some of the main image processing functions used were discussed, and the outputs
produced by the code are shown to demonstrate what the system can perform.
Chapter 6: Testing And Evaluation
The developed software was then tested using black-box techniques to compare the
implemented functionality with the functionality defined in the requirements
specification. All the test cases produced passed the black-box tests. The project was
then evaluated against the aims and objectives.
7.2 Research contributions
This project initially aimed to look at the development and implementation of a hand
gesture recognition system that could be used to browse through the picture gallery of
the PC being used. Due to various factors, such as time constraints, limited
programming knowledge and the other problems discussed earlier, the project focused
more on the image processing phase. Therefore, this project contributes to the area of
real-time image processing. I feel that further research can be initiated from the
implemented algorithm. As image processing is a vital, basic phase of every
gesture-based system, the algorithm developed can be seen as a contribution to the
field of image processing and gesture recognition. The algorithms developed are very
basic and easy to understand, and every function used is documented clearly. Hence,
they can be very useful to those who do not have any knowledge of OpenCV and
image processing and are looking for basic information on the topic. I feel that my
code can be used as a base on which to build more complex systems, as it already
deals with basic image processing; researchers and developers can therefore save
time and effort by not having to look into the image processing phase themselves. I
hope that this will help future developers in making decisions for similar software.
7.3 Limitations and Future Development
The popularity of gesture-based systems has grown over the years, and I believe that
they will continue to expand and become an essential part of everyday life as the
technology advances. This project has expanded my knowledge of the different
techniques used in the development of such systems and their useful applications in
different fields.
My system is now capable of performing basic image processing: it converts between
two different color spaces, differentiates between skin pixels and non-skin pixels in the
image, performs background subtraction and cancels out major noise associated with
the image. However, there are limitations to the software's functions, and it may not
always produce the expected outputs. One of the major limitations is that I have only
tested my system in a few environments, so low or varying lighting conditions may
affect the outputs produced. Also, more complex image filtering should be applied, as
the final image still contains some amount of noise.
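One filtering improvement that could be tried is a morphological opening (an erosion
followed by a dilation) applied to the skin mask, which removes small isolated blobs
while keeping the large skin regions intact. A minimal sketch using the standard
OpenCV functions, applied to the mask after cvInRangeS():
cvErode( mask, mask, NULL, 1 );   // erosion removes small isolated noise blobs
cvDilate( mask, mask, NULL, 1 );  // dilation restores the surviving regions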
With regard to further development, I feel that the project can be expanded if the
minor environmental and filtering issues are resolved. To improve the project further,
the filters used should be critically examined. I feel that once these issues have been
addressed, the algorithms developed can be used as a basis for future development.
Functions for feature extraction, hand tracking and gesture recognition can be added
to make the system usable. Right now the project just looks into image processing,
but after adding further functions to the current algorithm, the system could be used to
assist disabled and elderly people to interact with the computer with ease. The project
therefore has immense scope in terms of the functionality that can be changed to gain
better results. Gesture-based systems can also be used in the gaming field to provide
gamers with an ultimate touchless gaming experience, and image processing can be
used in robotics, for example to develop an aid that helps the elderly commute
autonomously in a building or house, which would help immensely.
7.4 Personal Reflections
This research project was a great learning experience, as it made me realize my
strengths as well as areas where I could improve my skills for future projects. This
project was the most challenging of all the projects I have completed in the past, yet
enjoyable. Even knowing the complexity of the project, the eagerness to explore a
whole new field of technology was what led me to choose the topic of touchless
interaction through gesture recognition.
I feel that my strength has been in writing my code. The code developed in the project
was written completely from scratch, so developing it was challenging, but it has made
me realize that I do have the ability to code. Before this project I saw coding as
tedious, but the project has completely changed my view, and I have embraced this
change.
Although this project does not completely meet the initial aim that was set out, I am
pleased with how far I have come: from having no clue about which programming
environment to use, to gradually designing small parts of the system and making them
work together.
My weaknesses, however, are software and time management. Time management
was one of my main weaknesses. I feel that I could have handled this project much
better in terms of sticking to a scheduled completion time and to the approaches that I
specified. I did manage to stick to the Waterfall model of development, but I feel that
this could have been done better. The time required for the implementation and testing
was underestimated, which led to the incomplete development of the software and to
the limited testing applied. I spent the majority of my time familiarizing myself with the
new programming language and development tools and then developing the code. In
future, a more realistic plan for completion and execution could be used to avoid
disappointment and delays in task completion.
 
REFERENCES
1) Ambler, W.S. (2003-2010) UML 2 Activity Diagrams [WWW] Agile Modelling.
Available from: http://www.agilemodeling.com/artifacts/activityDiagram.htm
[Accessed: 10/03/2011]
2) Baudel, T. and Beaudouin-Lafon, M. (1993) Charade: remote control of objects
using free-hand gestures. Communications of the ACM, 36(7), p. 28-35.
3) Bolt, R. (1980) Put-that-there: Voice and gesture at the graphics interface. ACM
SIGGRAPH Computer Graphics, 14(3), p. 262-270.
4) Boehm, B. (1986) "A Spiral Model of Software Development and Enhancement,"
ACM SIGSOFT Software Engineering Notes, 11(4), p. 14-24, August 1986.
5) Bradski, G. and Kaehler, A. (2008) Learning OpenCV. 1st ed. United States:
O'Reilly Media.
6) Creek Photo (2011) Introduction to Color Spaces [WWW]. Available from:
http://www.drycreekphoto.com/Learn/color_spaces.htm [Accessed: 26/01/2011]
7) English, W.K., Engelbart, D.C. and Berman, M.L. (1967) "Display Selection
Techniques for Text Manipulation," IEEE Transactions on Human Factors in
Electronics, HFE-8(1).
8) Feris, R., Turk, M., Raskar, R., Tan, K. and Ohashi, G. (2005) "Recognition of
Isolated Fingerspelling Gestures Using Depth Edges," in Real-Time Vision for
Human-Computer Interaction, B. Kisacanin, V. Pavlovic and T.S. Huang, Eds.
Springer, p. 43-56.
9) Grasso, M., Ebert, D. and Finin, T. (1998) The integrality of speech in multimodal
interfaces. ACM CHI 1998 Conference on Human Factors in Computing Systems,
p. 303-325.
10) Goldberg, A., ed. (1988) A History of Personal Workstations. New York, NY:
Addison-Wesley Publishing Company. 537 p.
11) Hashem, H.F. (2009) "Adaptive technique for human face detection using HSV
color space and neural networks," Radio Science Conference (NRSC 2009), p. 1-7.
12) Hardenberg, C.V. and Berard, F. (2001) Bare-Hand Human-Computer Interaction.
Proceedings of the ACM Workshop on Perceptive User Interfaces.
13) Hofmann, F., Heyer, P. and Hommel, G. (2004) Velocity Profile Based Recognition
of Dynamic Gestures with Discrete Hidden Markov Models. Proceedings of the
International Gesture Workshop on Gesture and Sign Language in Human-Computer
Interaction, Springer London, p. 81-95.
14) Image Processing (2007) [WWW]. Available from:
http://pyrorobotics.org/?page=Introduction_20to_20Computer_20Vision [Accessed:
28/02/2011]
15) Kovac, J., Peer, P. and Solina, F. (2003) Human Skin Color Clustering for Face
Detection. EUROCON 2003.
16) Lahari, T. and Aneesha, V. Contour Detection in Computer Vision [WWW]
V. Siddhartha Engineering College. Available from:
http://www.scribd.com/Contour-Detection-My-Ppt-1/d/41564107 [Accessed: 01/03/2011]
17) Shapiro, L.G. and Stockman, G.C. (2001) Computer Vision, p. 279-325.
New Jersey: Prentice-Hall.
18) Mysliwiec, T.A. (1994) FingerMouse: A Freehand Computer Pointing Interface.
Vision Interfaces and Systems Laboratory Technical Report VISlab-94-001, Electrical
Engineering and Computer Science Department, The University of Illinois at Chicago.
19) Nadgeri, S.M., Sawarkar, S.D. and Gawande, A.D. (2010) "Hand Gesture
Recognition Using CAMSHIFT Algorithm," 3rd International Conference on Emerging
Trends in Engineering and Technology (ICETET), p. 37-41, 19-21 Nov. 2010.
20) Nallaperumal, K., Ravi, S., Babu, C.N.K., Selvakumar, R.K., Fred, A.L.C.,
Seldev, C. and Vinsley, S.S. (2007) Skin Detection using Color Pixel Classification
with Application to Face Detection: A Comparative Study. International Conference on
Computational Intelligence and Multimedia Applications 2007.
21) Pavlovic, V.I., Sharma, R. and Huang, T.S. (1997) "Visual Interpretation of Hand
Gestures for Human-Computer Interaction: A Review," IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 19, p. 677-695.
22) Oviatt, S. (2001) Designing Robust Multimodal Systems for Diverse Users and
Environments. EC/NSF Workshop on Universal Accessibility of Ubiquitous Computing.
23) Parekh, N. (2011) The Waterfall Model Explained [WWW]. Available from:
http://www.buzzle.com/editorials/1-5-2005-63768.asp [Accessed: 03/03/2011]
24) Pressman, R.S. (2010) Software Engineering: A Practitioner's Approach,
7th Edition. New York: McGraw-Hill.
25) Quek, F.K.H. (1994) Non-Verbal Vision-Based Interfaces. Vision Interfaces and
Systems Laboratory Technical Report, Electrical Engineering and Computer Science
Department, The University of Illinois at Chicago.
26) [Figure 2.2] rickyc (2004) GitHub Social Coding [online image]. Available from:
https://github.com/rickyc/contour-detection [Accessed: 21/02/2011]
27) Redstone Software Inc (2008) Black-box vs. White-box Testing: Choosing the
Right Approach to Deliver Quality Applications [WWW]. Available from:
http://www.testplant.com/download_files/BB_vs_WB_Testing.pdf [Accessed: 05/03/2011]
28) Rodrigo, R. (2009) Image Processing and Computer Vision [WWW] University of
Moratuwa. Available from:
http://www.ent.mrt.ac.lk/~ranga/files/mathematics_society_talk.pdf [Accessed:
28/02/2011]
29) Segen, J. and Kumar, S. (1998) Gesture VR: Vision-based 3D hand interface for
spatial interaction. ACM International Conference on Multimedia, p. 455-464.
30) Search Software Quality (2001) Systems development life cycle (SDLC) [WWW].
Available from:
http://searchsoftwarequality.techtarget.com/definition/systems-development-life-cycle
[Accessed: 06/03/2011]
31) Segen, J. and Kumar, S. (1998) Human-Computer Interaction using Gesture
Recognition and 3D Hand Tracking. Proceedings of ICIP'98, Vol. 3, p. 188-192,
Chicago, October 1998.
32) [Figure 3.1] Software Engineering Philippines (2008) Waterfall Model [WWW].
Available from: http://shannonxj.blogspot.com/2008/01/waterfall-model.html
[Accessed: 05/03/2011]
33) The-Software-Experts. Software Process Models [WWW]. Available from:
http://www.the-software-experts.de/e_dta-sw-process.htm [Accessed: 04/03/2011]
34) Triesch, J. and von der Malsburg, C. (1998) Robotic Gesture Recognition.
Proceedings of ECCV, p. 233-244.
35) Utsumi, A., Miyasato, T., Kishino, F. and Nakatsu, R. (1996) Hand Gesture
Recognition System using Multiple Cameras. Proceedings of ICPR'96, Vol. 1,
p. 667-671, August 1996.
36) [Figure 2.1] Virtual Technologies, Inc. (1996) Computer Desktop Encyclopedia
[online image]. Available from:
http://images.yourdictionary.com/images/computer/_GLOVE.GIF [Accessed: 21/01/2011]
37) Wang, J., Zhai, S. and Su, H. (2001) Chinese input with keyboard and
eye-tracking: an anatomical study. ACM CHI 2001 Conference on Human Factors in
Computing Systems, p. 349-356.
38) Wikipedia (2011) Segmentation (image processing) [WWW]. Available from:
http://en.wikipedia.org/wiki/Segmentation_(image_processing) [Accessed: 01/03/2011]
39) [Figure 3.2] Wikipedia (2011) Spiral Model [WWW]. Available from:
http://en.wikipedia.org/wiki/Spiral_model [Accessed: 05/03/2011]
40) Wu, Y. and Huang, T.S. (1999) "Vision-Based Gesture Recognition: A Review,"
in International Gesture Workshop, Lecture Notes in Computer Science, p. 103-115.
41) Yoo, T. and Oh, S. (1999) "A fast algorithm for tracking human faces based on
chromatic histograms," Pattern Recognition Letters, 20, p. 967-978.
42) IEEE (1986) "ANSI/IEEE Standard 1008-1987, IEEE Standard for Software Unit
Testing."
43) IEEE (1990) "IEEE Standards Collection: Glossary of Software Engineering
Terminology," IEEE Standard 610.12-1990.
44) IEEE (1990) "IEEE Standard 610.12-1990, IEEE Standard Glossary of Software
Engineering Terminology."
45) Zarit, B.D., Super, B.J. and Quek, F.K.H. (1999) Comparison of five color models
in skin pixel classification. ICCV'99 International Workshop on Recognition, Analysis
and Tracking of Faces and Gestures in Real-Time Systems.
APPENDIX A: SOURCE CODE
#include "stdafx.h"
#include "cv.h"
#include "cxcore.h"
#include "highgui.h"
#include "math.h"
#include <iostream>
using namespace std;

int main()
{
    int c = 0;

    // Open the default camera and check that frames can be captured.
    CvCapture* capture = cvCaptureFromCAM(0);
    if( !cvQueryFrame(capture) )
    {
        cout << "Video capture failed, please check the camera." << endl;
        return 1;
    }
    cout << "Video camera capture status: OK" << endl;

    CvSize size = cvGetSize( cvQueryFrame(capture) );

    IplImage* hsv  = cvCreateImage( size, 8, 3 );   // HSV copy of the frame
    IplImage* mask = cvCreateImage( size, 8, 1 );   // single-channel skin mask

    // Explicitly defined skin thresholds in HSV space.
    CvScalar min = cvScalar(0, 30, 80, 0);
    CvScalar max = cvScalar(20, 150, 255, 0);

    CvMemStorage* storage    = cvCreateMemStorage(0);  // contour storage
    CvMemStorage* minStorage = cvCreateMemStorage(0);  // for cvMinAreaRect2
    CvMemStorage* dftStorage = cvCreateMemStorage(0);  // for cvConvexityDefects
    CvSeq* contours = NULL;

    cvNamedWindow( "src", 1 );
    cvNamedWindow( "img", 1 );
    cvNamedWindow( "msk", 1 );
    cvNamedWindow( "image", 0 );

    while( c != 27 )  // loop until the user presses ESC
    {
        IplImage* image = cvCreateImage( size, 8, 3 );

        // Frames returned by cvQueryFrame() are owned by the capture
        // structure and must not be released by the caller.
        IplImage* src = cvQueryFrame( capture );

        // Skin detection: convert to HSV, threshold, median-filter the mask.
        cvCvtColor( src, hsv, CV_BGR2HSV );
        cvInRangeS( hsv, min, max, mask );
        cvSmooth( mask, mask, CV_MEDIAN, 27, 0, 0, 0 );

        // Find all contours in the mask (cvFindContours modifies the mask).
        cvClearMemStorage( storage );
        cvFindContours( mask, storage, &contours, sizeof(CvContour),
                        CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE, cvPoint(0,0) );
        cvDrawContours( image, contours, CV_RGB(0, 200, 0), CV_RGB(0, 100, 0),
                        1, 1, 8, cvPoint(0,0) );

        // Select the contour with the largest area (assumed to be the hand).
        CvSeq* contours2 = NULL;
        double result = 0, result2 = 0;
        while( contours )
        {
            result = fabs( cvContourArea( contours, CV_WHOLE_SEQ ) );
            if( result > result2 ) { result2 = result; contours2 = contours; }
            contours = contours->h_next;
        }

        if( contours2 )
        {
            // Bounding box around the largest skin region.
            CvRect rect = cvBoundingRect( contours2, 0 );
            cvRectangle( image, cvPoint(rect.x, rect.y + rect.height),
                         cvPoint(rect.x + rect.width, rect.y),
                         CV_RGB(200, 0, 200), 1, 8, 0 );

            // Convex hull and convexity defects, computed for possible
            // future gesture analysis (not used further here).
            CvSeq* hull = cvConvexHull2( contours2, 0, CV_CLOCKWISE, 0 );
            cvConvexityDefects( contours2, hull, dftStorage );

            // Mark the centre of the region and fit an ellipse around it.
            CvBox2D box = cvMinAreaRect2( contours2, minStorage );
            cvCircle( image, cvPoint(cvRound(box.center.x), cvRound(box.center.y)),
                      3, CV_RGB(200, 0, 200), 2, 8, 0 );
            cvEllipse( image, cvPoint(cvRound(box.center.x), cvRound(box.center.y)),
                       cvSize(cvRound(box.size.height/2), cvRound(box.size.width/2)),
                       box.angle, 0, 360, CV_RGB(220, 0, 220), 1, 8, 0 );

            // Redraw the largest contour in red on top of the others.
            cvDrawContours( image, contours2, CV_RGB(255, 0, 0), CV_RGB(0, 0, 100),
                            1, 4, 8, cvPoint(0,0) );
        }

        cvShowImage( "src", src );
        cvShowImage( "img", hsv );
        cvShowImage( "msk", mask );
        cvShowImage( "image", image );
        cvReleaseImage( &image );

        c = cvWaitKey( 10 );
    }

    cvReleaseImage( &hsv );
    cvReleaseImage( &mask );
    cvReleaseMemStorage( &storage );
    cvReleaseMemStorage( &minStorage );
    cvReleaseMemStorage( &dftStorage );
    cvReleaseCapture( &capture );
    cvDestroyAllWindows();
    return 0;
}
 
APPENDIX B: COMMUNICATIONS LOG
 
FYP Task 3 Project Report Student ID:0822577
57	
  

Mais conteúdo relacionado

Mais procurados

Hand gesture recognition system(FYP REPORT)
Hand gesture recognition system(FYP REPORT)Hand gesture recognition system(FYP REPORT)
Hand gesture recognition system(FYP REPORT)
Afnan Rehman
 
Virtual reality in hci
Virtual reality in hciVirtual reality in hci
Virtual reality in hci
jeet patalia
 

Mais procurados (20)

Gesture Recognition Technology
Gesture Recognition TechnologyGesture Recognition Technology
Gesture Recognition Technology
 
screen less display
screen less displayscreen less display
screen less display
 
Hand Gesture Recognition system for deaf and dump people
Hand Gesture Recognition system for deaf and dump peopleHand Gesture Recognition system for deaf and dump people
Hand Gesture Recognition system for deaf and dump people
 
Hand gesture recognition system(FYP REPORT)
Hand gesture recognition system(FYP REPORT)Hand gesture recognition system(FYP REPORT)
Hand gesture recognition system(FYP REPORT)
 
Virtual Mouse
Virtual MouseVirtual Mouse
Virtual Mouse
 
Virtual mouse
Virtual mouseVirtual mouse
Virtual mouse
 
Seminar report on augmented and virtual reality
Seminar report on augmented and virtual realitySeminar report on augmented and virtual reality
Seminar report on augmented and virtual reality
 
E ball technology
E ball technologyE ball technology
E ball technology
 
SMART DUSTBIN FOR SMART CITIES
SMART DUSTBIN FOR SMART CITIESSMART DUSTBIN FOR SMART CITIES
SMART DUSTBIN FOR SMART CITIES
 
SPEECH BASED EMOTION RECOGNITION USING VOICE
SPEECH BASED  EMOTION RECOGNITION USING VOICESPEECH BASED  EMOTION RECOGNITION USING VOICE
SPEECH BASED EMOTION RECOGNITION USING VOICE
 
screen less display documentation
screen less display documentationscreen less display documentation
screen less display documentation
 
Gesture Recogntion Technology
Gesture Recogntion TechnologyGesture Recogntion Technology
Gesture Recogntion Technology
 
Virtual reality in hci
Virtual reality in hciVirtual reality in hci
Virtual reality in hci
 
SCREENLESS DISPLAY
SCREENLESS DISPLAYSCREENLESS DISPLAY
SCREENLESS DISPLAY
 
skinput technology
skinput technologyskinput technology
skinput technology
 
Gesture Recognition
Gesture RecognitionGesture Recognition
Gesture Recognition
 
EMOTION DETECTION USING AI
EMOTION DETECTION USING AIEMOTION DETECTION USING AI
EMOTION DETECTION USING AI
 
Virtual keyboard
Virtual keyboardVirtual keyboard
Virtual keyboard
 
Smart note taker
Smart note taker Smart note taker
Smart note taker
 
5 pen pc technology
5 pen pc technology5 pen pc technology
5 pen pc technology
 

Destaque

Density based traffic light controlling (2)
Density based traffic light controlling (2)Density based traffic light controlling (2)
Density based traffic light controlling (2)
hardik1240
 
Interfaces to ubiquitous computing
Interfaces to ubiquitous computingInterfaces to ubiquitous computing
Interfaces to ubiquitous computing
swati sonawane
 
Skiena algorithm 2007 lecture07 heapsort priority queues
Skiena algorithm 2007 lecture07 heapsort priority queuesSkiena algorithm 2007 lecture07 heapsort priority queues
Skiena algorithm 2007 lecture07 heapsort priority queues
zukun
 

Destaque (20)

Software Eng. for Critical Systems - Traffic Controller
Software Eng. for Critical Systems - Traffic ControllerSoftware Eng. for Critical Systems - Traffic Controller
Software Eng. for Critical Systems - Traffic Controller
 
WATER LEVEL INDICATOR
WATER LEVEL INDICATORWATER LEVEL INDICATOR
WATER LEVEL INDICATOR
 
Density based traffic light controlling (2)
Density based traffic light controlling (2)Density based traffic light controlling (2)
Density based traffic light controlling (2)
 
Agile for Embedded & System Software Development : Presented by Priyank KS
Agile for Embedded & System Software Development : Presented by Priyank KS Agile for Embedded & System Software Development : Presented by Priyank KS
Agile for Embedded & System Software Development : Presented by Priyank KS
 
3.7 heap sort
3.7 heap sort3.7 heap sort
3.7 heap sort
 
iBeacons: Security and Privacy?
iBeacons: Security and Privacy?iBeacons: Security and Privacy?
iBeacons: Security and Privacy?
 
Interfaces to ubiquitous computing
Interfaces to ubiquitous computingInterfaces to ubiquitous computing
Interfaces to ubiquitous computing
 
Agile London: Industrial Agility, How to respond to the 4th Industrial Revolu...
Agile London: Industrial Agility, How to respond to the 4th Industrial Revolu...Agile London: Industrial Agility, How to respond to the 4th Industrial Revolu...
Agile London: Industrial Agility, How to respond to the 4th Industrial Revolu...
 
Dependency Injection with Apex
Dependency Injection with ApexDependency Injection with Apex
Dependency Injection with Apex
 
Demystifying dependency Injection: Dagger and Toothpick
Demystifying dependency Injection: Dagger and ToothpickDemystifying dependency Injection: Dagger and Toothpick
Demystifying dependency Injection: Dagger and Toothpick
 
Agile Methodology PPT
Agile Methodology PPTAgile Methodology PPT
Agile Methodology PPT
 
HUG Ireland Event Presentation - In-Memory Databases
HUG Ireland Event Presentation - In-Memory DatabasesHUG Ireland Event Presentation - In-Memory Databases
HUG Ireland Event Presentation - In-Memory Databases
 
Software Requirements Specification (SRS) for Online Tower Plotting System (O...
Software Requirements Specification (SRS) for Online Tower Plotting System (O...Software Requirements Specification (SRS) for Online Tower Plotting System (O...
Software Requirements Specification (SRS) for Online Tower Plotting System (O...
 
Skiena algorithm 2007 lecture07 heapsort priority queues
Skiena algorithm 2007 lecture07 heapsort priority queuesSkiena algorithm 2007 lecture07 heapsort priority queues
Skiena algorithm 2007 lecture07 heapsort priority queues
 
Privacy Concerns and Social Robots
Privacy Concerns and Social Robots Privacy Concerns and Social Robots
Privacy Concerns and Social Robots
 
ScrumGuides training: Agile Software Development With Scrum
ScrumGuides training: Agile Software Development With ScrumScrumGuides training: Agile Software Development With Scrum
ScrumGuides training: Agile Software Development With Scrum
 
Design & Analysis of Algorithms Lecture Notes
Design & Analysis of Algorithms Lecture NotesDesign & Analysis of Algorithms Lecture Notes
Design & Analysis of Algorithms Lecture Notes
 
09 Machine Learning - Introduction Support Vector Machines
09 Machine Learning - Introduction Support Vector Machines09 Machine Learning - Introduction Support Vector Machines
09 Machine Learning - Introduction Support Vector Machines
 
Going native with less coupling: Dependency Injection in C++
Going native with less coupling: Dependency Injection in C++Going native with less coupling: Dependency Injection in C++
Going native with less coupling: Dependency Injection in C++
 
In-Memory Database Performance on AWS M4 Instances
In-Memory Database Performance on AWS M4 InstancesIn-Memory Database Performance on AWS M4 Instances
In-Memory Database Performance on AWS M4 Instances
 

Semelhante a Final Year Project-Gesture Based Interaction and Image Processing

Detection-and-Verification-System-of-Handwriting-and-Signature-using-Raspberr...
Detection-and-Verification-System-of-Handwriting-and-Signature-using-Raspberr...Detection-and-Verification-System-of-Handwriting-and-Signature-using-Raspberr...
Detection-and-Verification-System-of-Handwriting-and-Signature-using-Raspberr...
SiegriqueCeasarAJalw
 
Software engg. pressman_ch-12
Software engg. pressman_ch-12Software engg. pressman_ch-12
Software engg. pressman_ch-12
Dhairya Joshi
 
Bachelor's Thesis Sander Ginn
Bachelor's Thesis Sander GinnBachelor's Thesis Sander Ginn
Bachelor's Thesis Sander Ginn
Sander Ginn
 
Head_Movement_Visualization
Head_Movement_VisualizationHead_Movement_Visualization
Head_Movement_Visualization
Hongfu Huang
 

Semelhante a Final Year Project-Gesture Based Interaction and Image Processing (20)

Detection-and-Verification-System-of-Handwriting-and-Signature-using-Raspberr...
Detection-and-Verification-System-of-Handwriting-and-Signature-using-Raspberr...Detection-and-Verification-System-of-Handwriting-and-Signature-using-Raspberr...
Detection-and-Verification-System-of-Handwriting-and-Signature-using-Raspberr...
 
IRJET- Sign Language Interpreter
IRJET- Sign Language InterpreterIRJET- Sign Language Interpreter
IRJET- Sign Language Interpreter
 
Real Time Head & Hand Tracking Using 2.5D Data
Real Time Head & Hand Tracking Using 2.5D Data Real Time Head & Hand Tracking Using 2.5D Data
Real Time Head & Hand Tracking Using 2.5D Data
 
DMDI
DMDIDMDI
DMDI
 
Minor Project Synopsis on Data Structure Visualizer
Minor Project Synopsis on Data Structure VisualizerMinor Project Synopsis on Data Structure Visualizer
Minor Project Synopsis on Data Structure Visualizer
 
Theproject done
 Theproject done Theproject done
Theproject done
 
Software engg. pressman_ch-12
Software engg. pressman_ch-12Software engg. pressman_ch-12
Software engg. pressman_ch-12
 
Bachelor's Thesis Sander Ginn
Bachelor's Thesis Sander GinnBachelor's Thesis Sander Ginn
Bachelor's Thesis Sander Ginn
 
Demonstration of visual based and audio-based hci system
Demonstration of visual based and audio-based hci systemDemonstration of visual based and audio-based hci system
Demonstration of visual based and audio-based hci system
 
Image recognition
Image recognitionImage recognition
Image recognition
 
HCI 1st and 2nd sessions
HCI  1st and 2nd sessionsHCI  1st and 2nd sessions
HCI 1st and 2nd sessions
 
Synopsis of Facial Emotion Recognition to Emoji Conversion
Synopsis of Facial Emotion Recognition to Emoji ConversionSynopsis of Facial Emotion Recognition to Emoji Conversion
Synopsis of Facial Emotion Recognition to Emoji Conversion
 
Obj report
Obj reportObj report
Obj report
 
Object and pose detection
Object and pose detectionObject and pose detection
Object and pose detection
 
Sign Language Recognition using Mediapipe
Sign Language Recognition using MediapipeSign Language Recognition using Mediapipe
Sign Language Recognition using Mediapipe
 
Sixth sense technology
Sixth sense technologySixth sense technology
Sixth sense technology
 
Head_Movement_Visualization
Head_Movement_VisualizationHead_Movement_Visualization
Head_Movement_Visualization
 
TSA Eyebox project
TSA Eyebox project TSA Eyebox project
TSA Eyebox project
 
Saksham presentation
Saksham presentationSaksham presentation
Saksham presentation
 
TOUCHLESS ECOSYSTEM USING HAND GESTURES
TOUCHLESS ECOSYSTEM USING HAND GESTURESTOUCHLESS ECOSYSTEM USING HAND GESTURES
TOUCHLESS ECOSYSTEM USING HAND GESTURES
 

Final Year Project-Gesture Based Interaction and Image Processing

  • 1.   Department of Information Systems and Computing BSc (Hons) Computer Science Academic Year 2010 - 2011 Gesture Based Interaction and Image Processing Sabnam Pandey (0822577) A report submitted in partial fulfilment of the requirement for the degree of Bachelor of Science Brunel University Department of Information Systems and Computing Uxbridge, Middlesex UB8 3PH United Kingdom Tel: +44 (0) 1895 203397 Fax: +44 (0) 1895 251686
  • 2.   FYP Task 3 Project Report Student ID:0822577 2   ABSTRACT Gesture based interaction systems are becoming more and more popular both at workplace and home. The projects intends to develop a system that can recognize hand gestures which can be used as an input command to interact with the PC which can be applied to picture gallery browsing. One of the key areas that needs to be looked at while developing such systems is the image processing stage. Since it would be very hard to produce an algorithm that can recognize gestures in the time allocated, within my project, I plan to design and implement a system that can perform general image processing of the user image captured in real time. Most of my work will be based on image processing techniques. My expected outcome will be an algorithm that can detect skin regions of the image user image captured and detect contours around the detected skin regions. In order to manage my project I will be using Waterfall model as the system development methodology. Microsoft Visual C++ will be used for the implementation of the code developed with combination with the OpenCV library. I feel that if I successfully meet my targets then I will have contributed towards the future of natural gesture based interfaces, if only in a minimal way.
  • 3.   FYP Task 3 Project Report Student ID:0822577 3   ACKNOWLEDGEMENTS I would firstly like to thank my supervisor Dr. Mark Perry whose input and feedback has shaped the way I have approached this project, I would also like to thank my second reader Dr. David Bell for his feedback. I could not have asked for a better supervisor or second reader. I would also like to thank my family; their encouragement and motivation have given me the strength to keep going on with my work an never loose hope even when struck by an obstacle. I would also like to thank my close friends who have been there for me throughout. TOTAL NUMBER OF WORDS: 10,244 (Not including contents, abstract, appendix and diagrams, etc) I certify that the work presented in the dissertation is my own unless referenced Signature: ……………….. Date: 29th March, 2011
  • 4.   FYP Task 3 Project Report Student ID:0822577 4   TABLE OF CONTENTS CHAPTER 1: INTRODUCTION………………………………………………………………7 1.1 Introduction…………………………………………………………………………………7 1.2 Research aim and objectives……………………………………………………………..8 1.3 Research approach………………………………………………………………………..8 1.4 Project Scope………………………………………………………………………………9 1.5 Dissertation outline………………………………………………………………………...9 1.6 Summary……………………………………………………………………………………10 CHAPTER 2: LITERATURE REVIEW………………………………………………………..10 2.1 Human-Computer Interaction……………………………………………………………10 2.1.1 Interaction methods………………………………………………………………..11 2.2 Gesture Recognition……………………………………………………………………….11 2.2.1 Hand gesture recognition…………………………………………………………11 2.3 Vision Based Method……………………………………………………………………..12 2.4 Image processing…………………………………………………………………………..13 2.4.1 Color Spaces…………………………………………………………………………14 2.4.1.1 RGB………………………………………………………………………….14 2.4.1.2 HSV(Hue Saturation Value)……………………………………………….15 2.4.2 Image segmentation…………………………………………………………………14 2.4.3 Skin Detection………………………………………………………………………...15 2.4.3.1 Explicitly Defined Skin Region……………………………………………..16 2.4.4 Contour Detection…………………………………………………………………….16 2.5 Summary……………………………………………………………………………………..16 CHAPTER 3: METHODOLOGY………………………………………………………………...17 3.1 Introduction……………………………………………………………………………………17 3.2 Software development methodology……………………………………………………….17 3.2.1 The waterfall model……………………………………………………………………18 3.2.2 The spiral model……………………………………………………………………….18 3.3 Development Methodology Chosen………………………………………………………..19 3.4 Implementation methodology……………………………………………………………….20
  • 5.   FYP Task 3 Project Report Student ID:0822577 5   3.4.1 Programming Language………………………………………………………………20 3.4.2 Open Source Library …………………………………………………………………20 3.5 Summary……………………………………………………………………………………...20 CHAPTER 4: REQUIREMENTS ANALYSIS & DESIGN ……………………………………20 4.1 Introduction…………………………………………………………………………………...21 4.2 Software Engineering ……………………………………………………………………….21 4.3 Requirements specification………………………………………………………………….21 4.3.1 Functional requirements………………………………………………………………21 4.4 Block Diagram………………………………………………………………………………..22 4.5 UML Diagrams…………………………………………………………………… …………23 4.5.1 UML Use Case Diagram………………………………………………………………….23 4.5.2 Activity diagram………………………………………………………………………..23 4.6 Assumptions………………………………………………………………………………….26 4.7 Summary……………………………………………………………………………………...26 CHAPTER 5: IMPLEMENTATION……………………………………………………………..26 5.1 Development Tools…………………………………………………………………………..27 5.2 OpenCV Functions Used……………………………………………………………………28 5.3 Initialization…………………………………………………………………………………...28 5.4 Image Processing……………………………………………………………………………30 5.4.1 Image Conversion……………………………………………………………………..30 5.4.2 Skin Detection………………………………………………………………………….32 5.4.3 Contour Detection……………………………………………………………………...34 5.5 Summary………………………………………………………………………………………37 CHAPTER 6: TESTING AND EVALUATION ………………………………………………….37 6.1 Introduction…………………………………………………………………………………….37 6.2 Black-box Testing……………………………………………………………………………..38 6.3 Testing Evaluation…………………………………………………………………………….38 6.4 Project Evaluation……………………………………………………………………………..39 6.4.1 Project Aim………………………………………………………………………………39 6.4.2 Objective………………………………………………………………………………...39 CHAPTER 7: CONCLUSION…………………………………………………………………….41
  • 6.   FYP Task 3 Project Report Student ID:0822577 6   7.1 Summary of the dissertation…………………………………………………………………41 7.2 Research contributions……………………………………………………………………….42 7.3 Limitations And Future Development……………………………………………………….43 7.4 Personal Reflections………………………………………………………………………….44 REFERENCES…………………………………………………………………………………….46 APPENDIX…………………………………………………………………………………………48 (A) Source Code…………………………………………………………………………………..51 (B) Communications Log…………………………………………………………………………56
  • 7.   CHAPTER 1: INTRODUCTION This chapter highlights the purpose of the project and project approach. Furthermore, it will outline the aims and objectives, as well scope, to assist through each stages of the project. 1.1 Introduction Human–computer interaction (HCI) is the study, planning and design of the interaction between users(people) and computers. As an alternative to the traditional human- computer interaction interface (HCI) such as the keyboard and the mouse, the use of human movements, involving face, the whole body and specially hand gestures has attracted more and more people in recent years. A computer that can recognize and respond to the users gestures could provide a natural interface. The diverse logical and physical capabilities of users (e.g. elderly, children or people with disabilities) also require human-computer interfaces that are easily learnable and usable, instead of traditional interaction techniques such as the mouse and keyboard, which require a certain kind of skills, and restrict the user in a certain kind of physical mode. One of the latest developments made in the field of gaming using such interaction technique is the Microsoft’s Kinect for the Xbox 360 console. It enables users to control and interact with the Xbox 360 without the need to touch a game controller like remote, such gesture based have become extremely popular among today’s community. Hands are one of the most multipurpose tools in our human body to accomplish different tasks. They are one of the important features in our user interfaces and interaction applications. Interest in the field of computer vision based hand gesture recognition has increased in recent times due to its potential application in the field of Human Computer Interaction. The most important feature of this technique is that the system uses hand gesture recognition as an input, through which the users can control the system or devices without having to touch any external interaction devices such as mouse or keyboard. Also, it gives users a sense of freedom and ease to accomplish different tasks. This serves as a motivating factor to carry out this project. This project aims to develop a system, which captures certain hand gestures as an input from the users using a web camera, and performs the task associated with the gesture recognized. The recognized will be applied in gallery browsing of the PC being used. A program based on an open source library, which looks into image processing, and hand gesture recognition developed to accomplish the aim of this project.
  • 8.   FYP Task 3 Project Report Student ID:0822577 8   1.2 Research aim and objectives Aim The aim of this project is to develop and implement a set of algorithm that utilizes gesture recognition to browse through the picture gallery of a PC using a web camera for the purpose of making the PC more usable and improve the interactions between the users and the computers. *Redefined Aim Due to the unexpected complexity of the problem, this project will look into developing and implementing a set of algorithms that enables skin and contour detection of the user’s hand in real time. Objectives In order to fulfill the aims set for this project, the following objectives must be followed: 1. Begin literature review on areas like gesturing recognition, image processing, and open source libraries. 2. Determine the appropriate methodology to be used to design and implement the proposed system based on a detailed analysis of each methodology. 3. Formulate requirements based on techniques chosen from the literature found during research. 4. Use analyzed requirements to begin designing the steps involved in image processing. 5. Implement the processes of skin detection and contour detection using a suitable programming language and an appropriate open source library. 6. Assess the developed code, by applying various testing techniques to ensure that the test cases developed for the software conform to the requirements specification. 7. Evaluate the implemented code and justify whether the derived results have achieve the project aims and objectives. 1.3 Research approach The literature on topics like computer vision, gesture recognition and image processing will be researched and reviewed to gain information on the existing gesture recognition
  • 9.   FYP Task 3 Project Report Student ID:0822577 9   system and the common techniques used which can be applied to the system we intend to develop. After gaining some knowledge about the topic, the appropriate software development methodology will be chosen to plan the progress of the project. The information through literature review will be analyzed to gather the requirements of the proposed system based on which the design of the system will commence. This system will then be implemented and tested to ensure there are no faults. The project will then be evaluated against the aims and objectives of the project stated. 1.4 Project Scope Initially, the aim of this project was to recognize hand gestures in real time to browse through the picture gallery of the PC being used. Due to the strict time allocation and lack of knowledge on high level programming, it was necessary to limit the scope of the project. Therefore, the scope of this project involves converting captured images via a vision-based sensor (web-cam) and apply different image processing techniques to analyze the image better which can then be used in future works to detect hands and recognize gestures. The main focus will be on image processing. The literature on hand tracking and gesture recognition will be discussed only to give the readers a basic knowledge on how this project can be further developed to reach the initial aim of the project but this will not be considered during the implementation phase. 1.5 Dissertation outline This will outline the remaining chapters in the dissertation:
  • 10.   FYP Task 3 Project Report Student ID:0822577 10   1.5 Dissertation outline This will outline the remaining chapters in the dissertation: Chapter 2 Literature review This chapter will look into various literature sources and read it for research into relevant topics of interest to the project to determine the outcome. Chapter 3 Methodologies This chapter looks into the Software Development Lifecycle that is suitable for this project and the various developments tools used. Chapter 4 Design This chapter will model the requirements of the software and use various design methods to understand and develop the software. Chapter 5 Implementation The chapter will discuss about how the implementation of the algorithms was carried out and the problems faced during the process. Chapter 6 Testing and Evaluation Determine and apply various testing techniques to ensure that there are no errors occurred during the implementation in order to make the software reliable. Then evaluate the project against the aim and objects. Chapter 7 Conclusion Sum up the project, reflect the problems faced throughout the project, and discuss the limitations and future work. 1.6 Summary This chapter gives a general introduction of this project and the aims and objectives set out for this project. It also outlines the different chapters that we will come across as we go further towards the completion of the report. The next chapter will discuss the literature in relation to this project.
  • 11.   FYP Task 3 Project Report Student ID:0822577 11   CHAPTER 2: LITERATURE REVIEW This chapter focuses on research, covering relevant literature relating to this project on touchless hand gesture based human computer interaction. Appropriate journals, books, Internet sites will be used to gather the relevant literature. This chapter describes the literature associated with gesture recognition and image processing. It discusses the different steps that comprise image processing and the techniques used to accomplish the stages. The literature also involves previous studies that aimed at development of similar systems. The techniques used previously are analyzed and appropriate ones chosen to devise a feasible solution. 2.1 Human-Computer Interaction Human–computer interaction (HCI) is the study, planning and design of the interaction between people and computers. The aim of HCI is to improve the interactions between computers and users by making computers more usable and more responsive to user’s needs. 2.1.1 Interaction methods There are different interaction methods with which we can interact with the computer the most common being the use of a mouse. The mouse was developed at Stanford Research Laboratory (now SRI) in 1965 as part of the NLS project (funding from ARPA, NASA, and Rome ADC) [9] to be a cheap replacement for light pens, which had been used at least since 1954 [10, p. 68]. The mouse was then made famous as a practical input device by Xerox PARC in the 1970's. Another interaction method that has been increasingly popular in the recent years is using gestures for interacting with the computer. Instead of learning completely different new ways to interact, the users may prefer to adopt the natural ways of communication that they are familiar with in everyday life. These demands have resulted research in which the user interfaces take advantages of the natural ways that people interact with each other and the physical world, e.g. speech, gesture, eye-gaze, and physical tools (Grasso et al (1998);Oviatt (2001);Wang et al (2001)). Such systems accept gestures as an input form the user recognize the inputted gestures and perform the task associated with that gesture. This project will look into the gesture-based interaction in real time.
  • 12.   FYP Task 3 Project Report Student ID:0822577 12   2.2 Gesture Recognition Gestures can be described as different types of human movements. These can be two- dimensional or three-dimensional and can be specific to the hand, arm or body movements as well as facial expressions. (Hoffman et al (2004)). Gesture recognition enables humans to interface with the machine and interact naturally without any external devices such as the keyboard. It is a method of assigning commands to the computer (machine) to perform specific tasks. This project will be focusing specifically on hand gesture, as they are easier to perform and recognize with less effort. Also, the users of the software are going to be a mixed crowd so it might be difficult for some people to perform elaborate gestures. 2.2.1 Hand gesture recognition A hand gesture us a sequence of hand postures connected by continuous hand or finger movements over a short period of time. Hand gestures provide a separate complementary modality to speech for expressing ones ideas. So, hand gestures recognizing system can be a natural way of communicating between the computer and humans. There is basically two approaches to hand gesture recognition; Vision based and Non- vision based approaches. Non- vison based approach uses sensor devices like data glove as shown in figure 2.1 below.The extra sensors makes it easy to collect hand location and movement.
  • 13.   FYP Task 3 Project Report Student ID:0822577 13   Figure 2.1 The vision based approach uses camera as an input device, thus facilitating a natural interaction between users and computers without the use of any extra devices. This project looks into the vision based method, which is discussed in detail in the next section. 2.3 Vision Based Method Bare-hand gestures are probably the most straightforward interpretations when people think about gestures. Here we refer to the gestures that are defined entirely by the movements of the users hands and/or fingers. Typically the bare hand gestures are captured using computer vision techniques, i.e. cameras watching the users movements (Hardenberg et al (2001);Segen et al (1998)), It can be a single camera, stereo or multiple cameras depending on the application and settings. This project requires a gesture recognition method that is easy to use and allows the user a certain level of freedom. This project uses a single camera (web camera) as an input device to capture gestures performed by the users. The vision based gesture recognition seems to be a better option due to its advantages over non-vision based method. The devices used in non-vision methods are expensive and bring weighty experience to the users. Also, the devices are generally connected to the computer, which restricts free movement of the users to perform the activity they want. Whereas in vision method, the users are free to perform gestures without any restrictions. Similar research of gesture recognition has been done through out the past covering a wide variety approaches with successful outcome. For example,Segen[ l 3] describes a system with two cameras that can recognize three gestures and tracking hand in 3D. The system detects two fingers (thumb finger and pointing finger) by extracting the feature points on hand contour and output their poses. In [15] and [16], Quek describes a system called finger mouse that can replace mouse with hand gestures for certain actions. The system defines only one gesture (pointing gesture), and the SHIFT key on the keyboard is adopted to register a mouse button press. In [17] Triesch presents a robust gesture recognition algorithm using multiple cues including motion cue, color cue and stereo cue. This algorithm is used to build a gesture interface for a real robot grasping objects on a table.This project uses approaches used in Segen’s work.
The development of vision based gesture recognition software generally goes through different phases:

• Image Processing
• Hand Tracking
• Hand Gesture Recognition

The following sections discuss these stages in detail along with past work related to the field.

2.4 Image processing

"Images are stored as a collection of pixels." [4] Color image pixels consist of a red, green and blue value, which are combined to represent colors. Grayscale images are different, however, "as pixels are represented by a single number ranging from 0 to 255, where 0 is very black and 255 is very white." [4] Image processing in computing is used to extract useful information from images in order to perform specific tasks. Image processing here involves three basic steps: image segmentation, which involves converting the image between color spaces to reduce its complexity; skin detection, which gets rid of unwanted background objects and noise associated with the image; and contour detection, which locates an object in the image. Each of these stages is discussed in detail below.

2.4.1 Color Spaces

A device color space simply describes the range of colors, or gamut, that a camera can see, a printer can print, or a monitor can display (18). The various color spaces exist because they present color information in ways that make certain calculations more convenient, or because they provide a way to identify colors that is more intuitive.

2.4.1.1 RGB

RGB stands for Red, Green and Blue. RGB is one of the most widely used color spaces for processing and storing digital image data. However, the high correlation between channels and the mixing of chrominance and luminance data make RGB not a very favorable choice for color analysis and color-based recognition algorithms (Vezhnevets et al).
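To make the pixel representation described above concrete, the short sketch below (written against the OpenCV C API used later in this project) reads one pixel of a color image and derives a grayscale value from it. The file name and the luminance weights are illustrative assumptions, not part of this project's code.

#include "cv.h"
#include "highgui.h"
#include <stdio.h>

int main()
{
    // load a color image; "sample.jpg" is a placeholder file name
    IplImage* img = cvLoadImage( "sample.jpg", CV_LOAD_IMAGE_COLOR );
    if( !img ) { printf( "Could not load image.\n" ); return 1; }

    // read the pixel at row 10, column 20; OpenCV stores channels as B, G, R
    CvScalar p = cvGet2D( img, 10, 20 );
    printf( "B=%.0f G=%.0f R=%.0f\n", p.val[0], p.val[1], p.val[2] );

    // one common grayscale approximation weights the channels by perceived brightness
    double gray = 0.299 * p.val[2] + 0.587 * p.val[1] + 0.114 * p.val[0];
    printf( "grayscale value (0 = black, 255 = white): %.0f\n", gray );

    cvReleaseImage( &img );
    return 0;
}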
2.4.1.2 HSV (Hue Saturation Value)

The hue-saturation based color space is another popular color space, based on human perception of color. Hue defines the dominant color of an area, saturation measures the colorfulness of an area in proportion to its brightness, and value is related to the color's luminance. It was introduced for users who need to define color properties numerically. It is easy to implement and can be converted to and from RGB at any time (Vezhnevets et al).

2.4.2 Image segmentation

In computing, segmentation refers to the process of partitioning an image into multiple segments (sets of pixels). The main aim of segmentation is to simplify and/or change the representation of an image into something more meaningful and easier to use and analyze (Linda et al (2001)).

Image segmentation is one of the first steps involved in the process of gesture recognition, in our case hand gesture recognition. The image captured by the camera cannot be used directly to track the hand or recognize gestures, as it contains other background objects and generally exists in the RGB color space, which makes the skin detection process complex because color and brightness information are mixed in each pixel. So, in order to make skin detection simpler, the image needs to be converted into a color space that is easier to analyze. After the HSV skin color model is built, it can be used for skin detection.

2.4.3 Skin Detection

Skin color is one of the most important features of humans. Many color spaces have been used in early work on skin detection, such as RGB, YCbCr and HSV (Yoo et al (1999)). Although RGB is one of the most used color spaces for processing images, it is not widely used in skin detection algorithms because the chrominance and luminance components are mixed (Hashem (2009)). Some work has been done to compare the performance of different color spaces in skin detection problems; according to Zarit et al., HSV gives the best performance for skin pixel detection. When building a system that uses skin color as a feature for detection, several points must be kept in mind, such as what color space to choose and how to model the skin color distribution (Vezhnevets et al).
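Since HSV is the color space this project settles on, the sketch below illustrates how a single 8-bit RGB pixel maps into HSV. It follows OpenCV's 8-bit convention, in which S and V are scaled to 0-255 and the hue angle (0-360 degrees) is halved to fit 0-180; the sample pixel value is an illustrative assumption.

#include <algorithm>
#include <cstdio>

// Convert one 8-bit RGB pixel to HSV using OpenCV's 8-bit convention.
void rgbToHsv( int r, int g, int b, int &h, int &s, int &v )
{
    int mx = std::max( r, std::max( g, b ) );
    int mn = std::min( r, std::min( g, b ) );
    v = mx;                                          // value = brightest channel
    s = ( mx == 0 ) ? 0 : 255 * ( mx - mn ) / mx;    // saturation relative to brightness
    if( mx == mn ) { h = 0; return; }                // grey pixel: hue undefined, use 0
    double hd;
    if( mx == r )      hd = 60.0 * ( g - b ) / ( mx - mn );
    else if( mx == g ) hd = 120.0 + 60.0 * ( b - r ) / ( mx - mn );
    else               hd = 240.0 + 60.0 * ( r - g ) / ( mx - mn );
    if( hd < 0 ) hd += 360.0;
    h = (int)( hd / 2.0 );                           // halve so hue fits in one byte
}

int main()
{
    int h, s, v;
    rgbToHsv( 200, 120, 90, h, s, v );               // a typical skin-like RGB value
    std::printf( "H=%d S=%d V=%d\n", h, s, v );
    return 0;
}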
In this project, "a skin color model based on HSV color space will be built because it has only two components (H, S) which help to speed up the calculations and also the transformations from RGB color space into HSV color space is done using simple and fast transformations" (Hashem (2009)).

2.4.3.1 Explicitly Defined Skin Region

The first step in skin detection is pixel-based skin detection. This is one of the simplest methods, as it explicitly defines skin-color boundaries in different color spaces. Different threshold ranges are defined for each color space component, and the image pixels that fall within the predefined ranges are considered skin pixels (Vezhnevets et al). The simplicity of this method has attracted (and still does attract) many researchers (Peer et al. 2003). How it is applied in this project to generate a set of algorithms is discussed in chapter 4; a small illustration is also given below. The work cited above is taken as a reference for skin detection.
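As an illustration of explicitly defined skin-color boundaries, the sketch below implements one widely quoted RGB rule for daylight illumination attributed to Peer et al. (2003), cited above. It is an example of the method rather than the thresholds this project ends up using (the project's own HSV thresholds appear in chapter 5), and the sample pixel values are illustrative assumptions.

#include <algorithm>
#include <cstdlib>
#include <cstdio>

// Explicitly defined skin region: a pixel is classified as skin if it
// satisfies a fixed set of rules on its color components.
bool isSkinRGB( int r, int g, int b )
{
    int mx = std::max( r, std::max( g, b ) );
    int mn = std::min( r, std::min( g, b ) );
    return r > 95 && g > 40 && b > 20 &&    // each channel bright enough
           ( mx - mn ) > 15 &&              // enough spread between the channels
           std::abs( r - g ) > 15 &&        // red clearly separated from green
           r > g && r > b;                  // red is the strongest channel
}

int main()
{
    std::printf( "%d\n", isSkinRGB( 200, 120, 90 ) );   // prints 1: skin-like pixel
    std::printf( "%d\n", isSkinRGB(  60, 120, 90 ) );   // prints 0: non-skin pixel
    return 0;
}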
2.4.4 Contour Detection

The term contour can be defined as an outline or boundary of an object. Contour detection therefore deals with detecting the various objects in an image; its use in image processing is to locate objects and their boundaries. The output of contour detection shows only the prominent region boundaries, leaving behind unwanted edges in the image. Hence, detection of specific objects in the image is only possible through contours (Lahari et al).

Figure 2.2

So in this project, it is very important to detect the contours of the hand before we can extract the hand features from the image taken by the camera.

2.5 Summary

This chapter looked into the different areas of gesture recognition and the complexity of its application. The research helped to gather knowledge on hand gesture recognition systems and their design requirements. It also looked into the different image processing steps, namely image segmentation, skin detection and contour detection. Different techniques for skin detection were examined, and the decision was made to use the HSV color space and an explicitly defined skin detection model. The gathered literature covers the aspects required to develop the initial ideas used during the design and implementation stages. The next chapter looks into the different methods and tools used in the development of the project and the software.

CHAPTER 3: METHODOLOGY

3.1 Introduction

This chapter illustrates the comparison between different system development lifecycle approaches, namely the Waterfall and Spiral models, as well as the chosen approach used throughout the project. Different methodologies to assist in the development of the software are also discussed.

3.2 System Development Life Cycle

The systems development life cycle (SDLC) is a conceptual model used in project management that describes the stages involved in an information system development project, from an initial feasibility study through to maintenance of the completed application. For a software project we need to consider the project lifecycle and the various approaches to system development for managing the project. There are various models that evolved from the traditional development lifecycle, so selecting the appropriate lifecycle to follow will contribute to the success of the project. The commonly used development lifecycles are the waterfall model, the spiral model and many more. The ways in which these differ from each other are discussed below.
3.2.1 Waterfall Model

The Waterfall approach was the first process model to be introduced and widely followed in software engineering. In the waterfall model, the whole process of software development is divided into separate phases: Requirements Specification, Software Design, Implementation, and Testing & Maintenance, as shown in the figure below. These phases are related to each other in such a way that the second phase is started only when the defined set of goals for the first phase has been achieved (Parekh (2011)).

Figure 3.1

3.2.2 Spiral Model

The spiral model, suggested by Barry Boehm in 1988, is similar to the waterfall model but follows an evolutionary or iterative approach to system development. The spiral model, as seen in the figure below, combines the iterative nature of prototyping with the controlled and systematic aspects of the waterfall model, thereby providing the potential for rapid development of incremental versions of the software. In this model the software is developed in a series of incremental releases, with the early stages being either paper models or prototypes; later iterations become increasingly complete versions of the product. Further, there are no fixed phases for requirements specification, design or testing in the spiral model, so it is used in cases where the requirements are not understood or are difficult to specify. The main feature of this method is identifying potential risks to the system (Ambler (2003)).
Figure 3.2

3.3 Development Methodology Chosen

The waterfall model is the most suitable development methodology here: as this project is software oriented, it is important to make sure the program works at each stage without errors and produces the stated result. In the waterfall model there is no user involvement during the development process, and the focus is on a minimal set of requirements that will not change throughout the project; the users are involved only after the end product is finished, which is the case in this project. The spiral model, by contrast, requires user involvement throughout the project. Unlike the spiral model, which centres on risk analysis, in the waterfall model risk analysis is a sub-task. As this project involves little risk, using a model that focuses heavily on a phase that is not important would only lead to a loss of time. Also, the spiral model is time consuming, due to its iterative lifecycle.

Requirements – the requirements of the system were analyzed by researching the existing literature on image processing techniques, making it possible to design the algorithms required to perform basic image processing, namely image segmentation, skin detection and contour detection.

Design – this stage takes the information gathered from the requirements and designs a suitable system that relates to the objectives set. With the relevant requirements gathered from the literature researched, the system requirements are obtained. UML diagrams will be used to understand the design of the software in depth.
Implementation – once the design stage has been completed, the implementation of the proposed system design is initiated. The following steps will be carried out:

• Implementation of the algorithm which captures the image from the camera
• Implementation of the algorithm which performs the image processing

Testing – in this stage, the implemented algorithms will be tested using the testing methodologies discussed in chapter 6.

Evaluation – finally, the evaluation of the application against the aims and objectives will be performed.

3.4 Implementation Methodology

In developing software we need to consider the various resources available for us to use. This part looks into the programming language and the open source library used in the design and implementation of the software.

3.4.1 Programming language

The program will be written in C++ in the Microsoft Visual Studio 2008 Express Edition environment, as it is freely available and able to identify problems and syntax errors, reducing the amount of time needed to go through the source code.

3.4.2 Open Source Library

The open source library used in this project is OpenCV, an open source library of programming functions mainly aimed at real-time computer vision, developed by Intel. OpenCV functions related to image processing and tracking were used in the development of implementable algorithms which meet the aims of this project.

3.5 Summary

This chapter explained that the software development lifecycle the project will adopt is the Waterfall model, and looked into how this project goes about the different stages of the development cycle. It also discussed the implementation methodology chosen as the solution for the development of the software.
The testing methodologies to be adopted during the testing phase of the software are discussed in chapter 6. The next chapter will discuss how the system is designed and functions, with the help of different UML diagrams to represent it.

CHAPTER 4: REQUIREMENTS ANALYSIS AND DESIGN

4.1 Introduction

The previous chapter explained that the system development lifecycle adopted is the Waterfall model, and the solutions applied to development issues there will be used in the design. This chapter will explain, with the use of some UML diagrams, how the developed software will function. It applies the first two stages of the Waterfall model, requirements analysis and design, to the overall project.

4.2 Software Engineering

The software is developed in C++, which is an object oriented language like Java, so software engineering will need to be applied in an object oriented environment. For the development of high quality software, it must meet the user requirements set at the start of the project. The software should be reliable with few bugs, should be well maintainable and should have very good usability. The software engineering approach applied to this project will be to define a process that is concerned with fulfilling the requirements and to use verification and validation to make sure the product built is stable.

4.3 Requirements Specification

From the findings in chapter 2, the literature review, we can state the requirements of the software that will be developed. A requirement is a statement that identifies a necessary characteristic or quality of a system in order for it to have value and utility to a user. Requirements are used as an input into the design stage of a system, as they drive the design and implementation of the product. They are also an important input into the testing phase, as the tests should produce the outputs stated in the requirements. Hence, it is very important for requirements to be simple and specific. As we can see from chapter 1, the aim of the project is to create software that is capable of performing basic image processing techniques and recognizing hand gestures
performed by the users, which will be used to browse through the picture gallery of the PC being used. This gives us a general idea of what the software needs to do, but to get an in-depth idea of what the user will be able to do with the software and how the system operates, we state the detailed functional requirements below. The functional requirements describe how the system is required to function.

4.3.1 Functional Requirements

1. The camera used will be able to capture user images from the video sequences.
2. The software will be able to produce multiple frames and display the image in the RGB color space.
3. The software will be able to display the converted HSV image in a new window.
4. The software will be able to detect the skin regions of the user in the image captured.
5. The software will be able to detect the contours of the detected skin regions.

The implementation of each requirement will be discussed in chapter 5 (Implementation). The following section will look into the use of various UML diagrams to represent the structure of the system.

4.4 Block Diagram

Figure 4.1 (block diagram: camera, image processing, gesture recognition)

Figure 4.1 gives the block diagram of the overall system. This was the initial design, but due to time constraints and a lack of knowledge of the programming language used, which
will be discussed in detail in chapter 5 (implementation), this project looks into the initial stages of hand gesture recognition only. The redefined aim is to develop and implement a set of algorithms which enables the detection of skin and contours in a real-time environment.

4.5 UML Diagrams

UML (Unified Modelling Language) is a very useful method of visualizing and documenting software system designs. UML includes a set of graphic notation techniques to create visual models of software systems. It is used to specify, visualize, modify, construct and document the artifacts of an object-oriented software system under development (Mishra 1997).

4.5.1 UML Use Case Diagram

The use case diagram is used to identify the primary elements and processes that form the system. It defines a goal-oriented set of interactions between external actors and the system. The primary elements are termed "actors" and the processes are called use cases or actions. Actors are entities that will utilize the application in order to complete a task; an actor may be a class of users, a role users can play, or another system. Cockburn (1997) distinguishes between primary and secondary actors: a primary actor is one having a goal requiring the assistance of the system, while a secondary actor is one from which the system needs assistance. Since this project focuses more on the image processing phase, this particular use case diagram will focus on the secondary actor, i.e. the web camera, and its part in the functionality of the software.
Figure 4.2

Figure 4.2 shows the use case diagram, which includes two actors, where "user" is the primary actor and "camera" the secondary actor. As said earlier, the diagram focuses more on the secondary actor, as this project deals mainly with the image processing phase. The user actor, a human accessing the system, can quit the program externally after it is initiated; the image captured by the camera is the image of this user. The camera actor, the camera of the PC being used, is responsible for capturing the user image in real time. Since this project looks into vision based hand gesture recognition in real time, identifying the hand in the video sequences captured by the camera without the help of any external device like the data glove (described in chapter 2, literature review), image processing becomes an important phase of the system: it helps to eliminate unwanted background objects, focuses on the more important parts of the image, and makes it easier to analyze the image to extract useful information.

Below are the use cases; these detail what actions are performed by each actor and what outcome is expected.

Use Case 1: image capture
Actor: Camera
Goal: To capture an image of the user from the video sequence.
Overview: The web camera will grab an image of the user that will be processed to extract useful information.

Use Case 2: image processing
Actor: Camera
Goal: Process the image captured by the web camera.
Overview: Convert the initial RGB image captured by the camera into the HSV color space, then undergo the skin detection and contour detection phases to get the image ready for
the hand tracking phase.

The detail of each use case is provided above, giving information about the different steps involved and the outcome expected. The use case diagram does not provide information about the case where the expected result is not produced. In order to cover this risk, an activity diagram was produced to give a detailed view of the different paths within an activity.

4.5.2 Activity Diagram

The process flows in the system are captured in the activity diagram. An activity diagram consists of activities, actions, transitions, initial and final states, and guard conditions. It is used for modeling the logic captured by a single use case scenario (Ambler (2003)).

Figure 4.3

In figure 4.3, the image captured by the camera is first in the RGB color space. If the camera is not initialized or is not able to capture video, the user is required to externally restart the program. In order to detect the skin in the captured image, it first needs to be converted into the HSV color space, which makes it easier to define the skin color properties numerically. After the image has been converted into the HSV color space, we use the explicitly defined skin model for skin detection. In the explicitly defined skin model, different threshold ranges are defined for each color space component, and the image pixels that fall within the
predefined ranges are considered skin pixels. This process helps to distinguish skin regions from non-skin regions. The importance of this stage is that it helps eliminate unwanted background objects and extract the important regions from the image, in our case a hand. The next step is contour detection, where the outline of the object in the HSV image is detected; the skin regions in the HSV image are thus outlined. This simplifies the feature extraction of the hand, which makes tracking the hand easier. The next three steps will not be considered due to the complexity of the algorithms for hand tracking and hand gesture recognition, and also due to a lack of programming knowledge and the shortage of time available for the completion of the project. Since there are no more actions involved after contour detection, the user can quit the program in order to end the process.

4.6 Assumptions

Throughout the design, many assumptions were made; this section briefly describes some of the important ones. The main assumption concerns the environment in which the system is operated. The place where the user intends to use the system is expected to have enough lighting for the camera to capture the video; insufficient lighting conditions may lead to undesirable outputs. Also, the user is assumed to be wearing full-sleeved clothing with only the hand area showing, as the program is not able to exclude other skin regions like the arms. It would be possible to program around such problems, for example by segmenting the hand area from other skin regions once the skin detection phase is implemented and then applying the contour detection algorithm to the segmented hand image. However, the development of such algorithms requires as much work as another project in itself, is not feasible in the time allocated and requires high-level programming knowledge.

4.7 Summary

This chapter looked into the functional requirements that were gathered by carrying out in-depth literature research on image processing as well as on available past projects. It also used UML diagrams, namely use case and activity diagrams, to represent the different aspects of the system. The use cases outlined the primary and secondary actors (user and camera respectively) associated with the system and the tasks that they should be able to accomplish, while the activity diagram showed the flow of tasks within the system. It also looked into some assumptions made while designing the system that should be considered by the user
while using the system. The next chapter looks into how the requirements stated above are implemented.

CHAPTER 5: IMPLEMENTATION

The previous chapter discussed how the software was designed and how it would function. This chapter focuses on the implementation stage of the project. It uses the information from the design phase to develop the algorithms required to accomplish the defined aims and objectives. Gathering the requirements and the proposed design concepts, the algorithm can be implemented to capture the image, detect the skin in the captured image and detect the contours that will make tracking of the hands easier for gesture recognition. To do this, the algorithm must be implemented using a flexible programming language which has already been used by other programmers for developing such software.

5.1 Development Tools

After looking at many past projects on similar software, the decision was taken to use Microsoft Visual C++ as the programming environment. Also, the OpenCV library used in this project, an open source library containing functions that specialize in image processing and gesture recognition, is compatible with the C++ programming language.

The main problem that arose at the implementation phase was familiarizing myself with the programming language as well as with OpenCV. Having to learn a whole new programming language led to a major loss of time, and understanding how the OpenCV functions work and how I could use them in Visual C++ was a hectic task. After successfully implementing some simple functions in C++, I finally started coding the software while continuing my research to gain more in-depth knowledge of the C++ programming language. It then came to my knowledge that C++ might not provide the expected output effectively, i.e. implementing the gesture recognition phase would involve a lot of complicated processes. A solution would have been to use C sharp (C#), which deals with high-level programming for topics like gesture recognition. At this point I was completely baffled, as I had already finished coding the first phase of the system, i.e. image
processing. After losing so much time learning C++, I could not afford to go back to the stage of learning a new programming language all over again and lose the precious time I had left. I therefore decided to carry on coding in Visual C++ and to implement as much as I could in the limited time available. This was one of the major reasons for not being able to implement the whole hand gesture recognition system. Hence, the sections below look at how the implementation of the image processing part of the system was carried out.

5.2 OpenCV Functions Used

cvGetSize(): returns a CvSize structure giving the dimensions of an existing image.
IplImage: an OpenCV construct; OpenCV uses this structure to handle all kinds of images.
cvCreateImage(): creates a new image to hold the changes made to the original image.
cvCvtColor(): converts from one color space to another, expecting the data type to be the same.
CV_BGR2HSV: conversion code to convert a BGR image to the HSV (Hue Saturation Value) color space.
cvNamedWindow(): opens a window on the screen that can contain and display an image.
cvScalar(): takes one, two, three or four arguments and assigns them to the corresponding elements of val[]; here it is used to hold the color threshold values.
cvInRangeS(): checks whether the pixels in an image fall within a particular specified range.
cvSmooth(): used to reduce noise or camera artifacts.
cvFindContours(): computes contours from binary images.
cvDrawContours(): draws a contour on the screen.

5.3 Initialization

Since this project deals with real-time images captured from the web camera, we used the highGUI portion of the OpenCV library, which deals with input/output routines and functions for storing and loading videos and images. We used the OpenCV function cvCreateCameraCapture() (called below via its alias cvCaptureFromCAM()), which initializes video capture from the camera.
#include "highgui.h"
#include <iostream>
using namespace std;

int main()
{
    // initialize capture from the first camera attached to the PC (ID 0)
    CvCapture* capture = cvCaptureFromCAM(0);

    if( !capture || !cvQueryFrame(capture) ) {
        cout << "Video capture failed, please check the camera." << endl;
    } else {
        cout << "Video camera capture status: OK" << endl;
    }

    cvReleaseCapture( &capture );
    return 0;
}

The OpenCV function cvCreateCameraCapture takes as its argument the ID of the camera to be initialized and returns a pointer to CvCapture, OpenCV's video capture structure. Since there is just one camera associated with the PC used, the camera ID in our case is 0. The CvCapture structure contains all the information about the image frames captured from the camera that was initialized. The next step is to let the users know whether initializing and capturing a video stream from the camera was successful; the if statement is used for this. The cvReleaseCapture function frees the memory associated with the CvCapture structure.

Figure 5.1
Figure 5.2

Figure 5.3

Figures 5.1 and 5.2 are the initial frames that appear, giving the user information on whether the camera has been initialized. Figure 5.1 shows the frame that appears when the program cannot find a camera to initialize for capturing images. Figure 5.2 shows the frame that appears when the camera is initialized and starts to capture images. Figure 5.3 shows the initial image in the RGB color space.

5.4 Image Processing

As seen in section 2.4 of the literature review, the image processing phase involves three sub-stages: image conversion, skin detection and contour detection. Below, we look at how each stage was implemented using the different image processing functions available.

5.4.1 Image Conversion
int c = 0;
CvSize sz = cvGetSize( cvQueryFrame( capture ) );
IplImage* src = cvCreateImage( sz, 8, 3 );
IplImage* hsv = cvCreateImage( sz, 8, 3 );

while( c != 27 )   // loop until the ESC key (ASCII code 27) is pressed
{
    // cvQueryFrame returns a pointer to an internal frame buffer (do not release it)
    src = cvQueryFrame( capture );
    cvCvtColor( src, hsv, CV_BGR2HSV );
    cvNamedWindow( "src", 1 );
    cvNamedWindow( "hsv", 1 );
    cvShowImage( "hsv", hsv );
    cvShowImage( "src", src );
    // ... skin detection and contour detection follow here (sections 5.4.2 and 5.4.3)
    c = cvWaitKey( 10 );   // refresh the windows and read any key pressed
}

In the code snippet above, the variable sz is declared as a CvSize structure obtained through the OpenCV function cvGetSize; it gives the size of the image captured in the frame. An image named "src" is created using the function cvCreateImage to hold the actual image captured by the camera in the RGB color space. The first argument is the size of the image captured by the camera, the second argument indicates the bits available in each channel, in our case 8 bits per channel, and the last argument indicates the number of channels, which in our case is 3. Similarly, another image called "hsv" is created with exactly the same arguments to hold the image after it is converted into the HSV color space. Next, the function cvCvtColor converts the original RGB image stored in the variable src into the HSV color space using the conversion code CV_BGR2HSV and stores the output in the variable hsv. Lastly, windows named "src" and "hsv" are created using the function
cvNamedWindow, and the function cvShowImage displays the RGB image and the HSV image in the "src" and "hsv" windows respectively.

Figure 5.4

Figure 5.4 shows the RGB image (left) converted into the HSV color space (right).

5.4.2 Skin Detection

IplImage* mask = cvCreateImage( sz, 8, 1 );
CvScalar min = cvScalar( 0, 30, 80, 0 );     // lower H, S, V bounds
CvScalar max = cvScalar( 20, 150, 255, 0 );  // upper H, S, V bounds

In the code snippet above, an image is created to hold the skin-detected HSV image. The first argument is the size of the actual RGB image frame captured, with 1 channel of 8 bits. The two variables min and max are declared as CvScalar values: min takes the
arguments 0, 30, 80 as the lower HSV bounds and max takes the arguments 20, 150, 255 as the upper HSV bounds.

cvInRangeS (hsv, min, max, mask);

The code above compares the HSV image with the constant (CvScalar) lower and upper bounds (min and max). If a pixel value in the HSV image is greater than or equal to the lower bound (min) and less than the upper bound (max), the corresponding value in the output image is set to 255; otherwise it is set to 0. Hence, an intermediate image (mask) is produced in which the detected skin pixels are displayed in white and the non-skin parts in black.

Figure 5.5

Figure 5.5 shows the HSV image on the left and the skin-detected parts of the image on the right. As we can see, the skin-detected parts are not that clear and contain a large amount of noise. In order to get rid of the noise in the image we use the function cvSmooth.

cvSmooth( mask, mask, CV_MEDIAN, 27, 0, 0, 0 );
The code above takes the skin-detected intermediate image and removes the noise associated with it using the CV_MEDIAN smoothing type (a median filter over a 27 by 27 neighbourhood).

Figure 5.6

Figure 5.6 is the output produced after applying the smoothing function to filter out the noise. As we can see, we get a clearer image with a minimal amount of noise. It also helped to get rid of the unwanted background objects which could be seen in figure 5.5.
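Median smoothing is the approach used here; as a possible alternative or complement, morphological opening and closing are also commonly used to clean up binary skin masks. The fragment below is an illustrative sketch rather than part of the implemented program, and assumes mask is the 8-bit single-channel image produced by cvInRangeS above.

// Illustrative alternative (not part of the implemented program): clean the
// binary skin mask with morphological opening and closing.
// Assumes "mask" is the 8-bit single-channel image produced by cvInRangeS.
IplImage* temp = cvCreateImage( cvGetSize( mask ), 8, 1 );

// opening (erosion followed by dilation) removes small white speckles of noise
cvMorphologyEx( mask, mask, temp, NULL, CV_MOP_OPEN, 1 );

// closing (dilation followed by erosion) fills small black holes inside skin regions
cvMorphologyEx( mask, mask, temp, NULL, CV_MOP_CLOSE, 1 );

cvReleaseImage( &temp );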
5.4.3 Contour Detection

IplImage* contourImg = cvCreateImage( sz, 8, 3 );   // ("new" is a reserved word in C++, so the output image is named contourImg)
CvSeq* contours = NULL;
CvMemStorage* storage = cvCreateMemStorage(0);

cvFindContours( mask, storage, &contours, sizeof(CvContour),
                CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE, cvPoint(0,0) );

In the code snippet above, the first argument of cvFindContours is the input image in which the contours are to be found; this needs to be an 8-bit single-channel image, in our case the image in which all the skin parts were detected, represented by the variable mask. The next argument, storage, indicates a place where the function cvFindContours() can find memory in which to record the contours; the storage area has been allocated with cvCreateMemStorage(). Next is &contours, a pointer to a CvSeq*. CV_RETR_LIST and CV_CHAIN_APPROX_SIMPLE are the mode and method respectively. The mode set here retrieves all the contours and puts them in a list, while the method tells us how the contours are approximated; the method set here compresses horizontal, vertical and diagonal segments, leaving only their end points.

cvDrawContours( contourImg, contours, CV_RGB( 0, 200, 0 ), CV_RGB( 0, 100, 0 ), 1, 1, 8, cvPoint(0,0) );

In the code snippet above, the function cvDrawContours() takes as its first argument the image in which the contours are drawn; in our case we created an image to hold the output, represented by the variable contourImg. The CV_RGB arguments indicate the colors with which the contours are drawn, in our case green.
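A natural refinement, sketched below as a possible extension rather than implemented functionality, is to scan the contour list and keep only the largest contour by area, which in this setting is usually the hand.

// Illustrative extension (not implemented in this project): keep only the
// largest contour found by cvFindContours, which is usually the hand.
CvSeq* biggest = NULL;
double maxArea = 0;
for( CvSeq* c = contours; c != NULL; c = c->h_next )
{
    double area = fabs( cvContourArea( c ) );   // fabs: the sign depends on contour orientation
    if( area > maxArea ) { maxArea = area; biggest = c; }
}
if( biggest )   // a max_level of 0 draws only this one contour
    cvDrawContours( contourImg, biggest, CV_RGB( 0, 200, 0 ),
                    CV_RGB( 0, 100, 0 ), 0, 1, 8, cvPoint(0,0) );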
Figure 5.7

Figure 5.7 shows the contours drawn with the use of the function cvDrawContours(). As we can see, the contours are drawn in green, as discussed above.

Figure 5.8
Figure 5.9

5.5 Summary

To summarize, the implementation phase used the initial design concepts to successfully develop an algorithm that detects skin regions and their contours in images captured in real time. The annotation of the code gives an insight into the algorithm's functionality and uses. To determine the overall success of the implemented algorithm, test cases must be applied; these involve running the program under different conditions, thereby establishing the robustness of the implemented algorithm once results are achieved during the testing phase.

CHAPTER 6: TESTING

Software testing is the process of analyzing a software item to detect the differences between existing and required conditions (that is, bugs) and to evaluate the features of the software item [19, 20]. Software testing generally involves validation and verification processes. Verification is the process of evaluating a system or component to determine whether the products of a given development phase satisfy the conditions imposed at the start of that phase [21]; verification activities include testing and reviews. Validation is the process of evaluating a system or component during or at the end of the development process to determine whether it satisfies the specified requirements [21].

6.1 Introduction

Once the code has been implemented, the testing of the code is carried out. Testing ensures that the software developed functions as expected. There are two basic testing techniques available: black box testing and white box testing. White-box testing focuses on the internal structure of the code. The white box tester (most often the developer of the code) looks into the implemented code for errors by writing test cases; the inputs are chosen specifically to determine an appropriate output, so the result can be evaluated more clearly, as stated in "Redstone Software Inc" (2008). White-box testing is often used for verification. Black-box test design is usually described as focusing on testing functional
requirements. Knowledge of the system's code and programming knowledge is not required; the test cases are built around what the software is supposed to do, and valid and invalid inputs are taken to determine the correct output, as discussed in "Redstone Software Inc" (2008). Black-box testing is often used for validation. The black-box testing technique was applied here to test the stated functionality, ensuring that the right product has been built. This type of testing aims to uncover errors including interface errors, performance errors, and initialization and termination errors, to name a few that are relevant to this project (Pressman, 2005).

6.2 Black-Box Testing

Test | Description | Desired Outcome | Actual Outcome | Result
1 | Initialize camera | Display user image in new window | Displayed user's original image in new window | PASS
2 | Image conversion | Display HSV image in new window | Displayed HSV image in new window | PASS
3 | Filtering | Smooth out image | Smoothed out image | PASS
4 | Skin detection | Display skin areas of the image in a new window | Displayed skin regions as white with background removed | PASS
5 | Contour detection | Draw contour around skin region | Green outline drawn along the detected skin region | PASS
6 | Exit | Exits when user presses the ESC key | Exited when the ESC key was pressed by the user | PASS

Table 6.1

Table 6.1 shows the black box tests that were run against the code developed; tests labeled "PASS" passed at the first iteration of the code. All of the test cases passed.

6.3 Testing Evaluation
Gathering all the results from testing, the software clearly performs all the functions it was intended to perform. It can capture a user image and carry out the color space conversion in real time. The filtering was successfully performed on the image to get rid of the noise associated with it. The skin regions in the image were successfully detected, and the contour was successfully drawn along the outer edge of the detected skin regions.

6.4 Project Evaluation

This section evaluates each project phase and discusses each stage in terms of successes and constraints. It includes a summary of the project and of further research and development that could improve it. The objectives set in the aims and objectives stage are discussed to determine the project's success.

6.4.1 The Project Aim

The original aim was to develop gesture recognition software that uses hand gestures to browse through the picture gallery of the PC being used. Due to the various problems discussed earlier, however, this project looked into image processing, and the basic image processing code developed can be used further as a base algorithm for any image processing or gesture recognition system. The code uses the image processing functions provided by OpenCV, an open source computer vision library that specializes in real-time image processing and vision algorithms. The code developed was able to convert the real-time image captured in the RGB color space into the HSV color space, which makes it easier to detect the skin pixels in the image.

6.4.2 Objectives

In this section we consider how well our objectives were met and how useful they were to the overall project.

The first objective (O1) aimed to gather literature on gesture recognition, image processing and open source libraries. This objective was achieved through chapter 2 (literature review). It looked at different techniques used by past researchers in the image processing field, and decisions on which image processing techniques the software would use were made from this research. The research in this area uncovered many technologies and gave me an insight into basic image processing
processes and the development environment. Overall, I consider this objective an important phase of the project for gaining information on the problem definition, and I consider the research into this area successful.

The second objective (O2) aimed to shed light on the possible methodologies that could be used in order to implement the techniques discussed for O1. This was achieved through the techniques discussed in chapter 3 (methodology). The Waterfall model was adopted as the development methodology and was successfully applied throughout for the management of the project. The Microsoft Visual C++ environment, in combination with OpenCV library functions, was used to implement the code developed using the techniques described in chapter 2 (literature review). I believe this objective was met and has shaped the successful completion of the project.

The third objective (O3) aimed to gather the requirements to start the design of the intended system. Chapter 4 (requirements analysis and design) looked into the detailed functional requirements of the system, which were gathered through the literature review. The successful analysis of the requirements led to the achievement of O3.

The fourth objective (O4) was the design of the system. Chapter 4 (design) detailed the design of the system well. The use case and activity diagrams were used to describe the internal processes of the image processing system. I believe that the detail in the design phase helped me to identify the different algorithms that needed to be implemented in order to get the system working.

The fifth objective (O5) looked into the implementation; chapter 5 gives the details of the implementation of the various functions. It shows that OpenCV functions were used to perform the different image processing tasks, namely image conversion, skin detection and contour detection. It also shows the output produced by the code developed, which marks the achievement of this objective. Even though the code developed is quite stable, it still needs some improvement and can be made more usable by adding further functionality, which is discussed in section 7.3 (chapter 7, conclusion).

The sixth objective (O6) was covered in chapter 6. Various testing methods were
utilized in order to ensure that the code was stable and usable. Black box and white box testing techniques were used, and unit testing was performed in order to identify any bugs present in the code. A test plan was used for the validation of the project by setting certain objectives and reviewing the output produced. The software was tested successfully.

The final objective (O7) was to evaluate the project against the aims and objectives set. I feel that this section has done this well: not only have we evaluated the project as a whole against the aim, but we have also evaluated all the objectives and discussed which sections played a part in achieving them. The software effectively captures the image in real time and goes through the conversion, skin detection and contour detection processes, which covers the redefined aim stated in chapter 4 (design).

CHAPTER 7: CONCLUSION

This chapter provides a recap of the key points of the dissertation, a brief overview of the research contributions made, the scope for improvements in future research and development of the system evolved from this study, and personal reflections on the overall project to emphasize its achievements, limitations and scope.

7.1 Summary of the dissertation

Chapter 1: Introduction

This chapter provided an introduction to the project topic, gesture recognition and image processing, stating the aim of the project to develop software that performs basic image processing and hand gesture recognition. It also discussed the research approaches and the project scope. I later decided to narrow the project scope and concentrate solely on image processing.

Chapter 2: Literature Review

This chapter looked into the literature related to gesture recognition and image processing in order to gain in-depth knowledge of the topic. The literature was analyzed and various past projects reviewed in order to identify the key techniques to be used in the project. The vision-based method of hand gesture recognition was chosen, which does
not require the use of any external devices. The requirements needed to design the system were gathered after analyzing the different techniques.

Chapter 3: Methodology

This chapter discussed the Software Development Life Cycle (SDLC) methods that could be considered for this project, namely the Waterfall model and the Spiral model. Based on an analysis of each, the Waterfall model was chosen as the most suitable for this project. The research methodology involved analyzing the literature on the techniques used by researchers and any past projects that looked into similar problems. Finally, Visual C++ was chosen as the platform to implement the software, incorporating OpenCV, an open source library containing functions that specialize in image processing and gesture recognition.

Chapter 4: Requirements Analysis & Design

This chapter discussed the functional requirements determined from the literature reviewed. UML (Unified Modeling Language) diagrams, namely use case and activity diagrams, were used to plan the design of the software and to get an overview of the tasks the software is supposed to perform. This was followed by the design of the system according to the requirements specified.

Chapter 5: Implementation

The final design was implemented using Microsoft Visual C++. The implementation also made use of the image processing functions provided by the OpenCV library. Some of the main image processing functions used were discussed, and the outputs produced by the code were shown to demonstrate what the system can do.

Chapter 6: Testing And Evaluation

The developed software was then tested using black-box techniques to compare the implemented functionality with the functionality defined in the requirements specification. All of the test cases passed the black box test. The project was then evaluated against the aims and objectives.

7.2 Research contributions

This project initially aimed to look at the development and implementation of a hand gesture recognition system which could be used to browse through the picture gallery of
the PC being used, but due to various factors like time constraints, a lack of programming knowledge and the other problems discussed earlier, the project focused more on the image processing phase. Therefore, this project contributes to the area of real-time image processing, and I feel that further research can be initiated from the implemented algorithm. As image processing is a vital, basic phase of every gesture-based system, the algorithm developed can be seen as a contribution to the field of image processing and gesture recognition. The algorithms developed are very basic and easy to understand, and every function used is documented clearly. Hence, they can be very useful to those who do not have any knowledge of OpenCV and image processing and are looking for basic information on the topic. I feel that my code can be used as a base on which to build more complex systems, as it already deals with basic image processing; researchers and developers can thus save time and effort by not having to look into the image processing phase themselves. I hope that this will help future developers in making decisions for similar software.

7.3 Limitations and Future Development

The popularity of gesture-based systems has grown over the years, and I believe that they will continue to expand and become an essential part of everyday life as the technology advances. This project has expanded my knowledge of the different techniques used in the development of such systems and their useful applications in different fields. My system is now capable of performing basic image processing: converting between two color spaces, differentiating between skin pixels and non-skin pixels in the image, performing background subtraction and cancelling out the major noise associated with the image. However, there are limitations to the software's functions, and it may not always produce the expected outputs. One of the major limitations is that I have only tested my system in a few environments, so low or varying lighting conditions may affect the outputs produced. Also, more advanced image filtering should be applied, as the final image still contains some noise.

In regards to further development, I feel that the project can be expanded further if the minor environmental and filter issues are resolved. In order to improve the project further, the filters used should be critically
examined through further research and development. I feel that once these issues have been resolved, the algorithms developed can be used as a basis for future development. Functions for feature extraction, hand tracking and gesture recognition can be added to make the system usable; a sketch of one possible feature extraction step is given at the end of this section. Right now the project just looks into image processing, but after adding further functions to the current algorithm, the system could be used to assist the disabled and elderly to interact with the computer with ease. The project therefore has immense scope in terms of the functionality that can be added to gain better results. Gesture-based systems can also be used in the gaming field to provide gamers with the ultimate touchless gaming experience, and image processing can be used in robotics, for example to develop an aid that helps the elderly move autonomously around a building or house.
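As one concrete direction for the feature extraction step mentioned above, a convex hull computed around the detected hand contour is a common starting point for locating fingertips. The fragment below is a speculative sketch building on the contour code from chapter 5, not implemented functionality; it assumes biggest is the hand contour found by cvFindContours and storage its memory storage, and the depth threshold of 20 pixels is an illustrative assumption.

// Speculative sketch of a future feature extraction step (not implemented in
// this project). Assumes "biggest" is the hand contour found by
// cvFindContours and "storage" its memory storage.
CvSeq* hull = cvConvexHull2( biggest, storage, CV_CLOCKWISE, 0 );

// with return_points = 0 the hull stores pointers back into the contour,
// which is the form cvConvexityDefects requires
CvSeq* defects = cvConvexityDefects( biggest, hull, storage );

// deep convexity defects roughly correspond to the valleys between fingers
for( int i = 0; i < defects->total; i++ )
{
    CvConvexityDefect* d = (CvConvexityDefect*)cvGetSeqElem( defects, i );
    if( d->depth > 20 )   // depth threshold in pixels: an illustrative assumption
        printf( "possible gap between fingers at (%d, %d)\n",
                d->depth_point->x, d->depth_point->y );
}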
7.4 Personal Reflections

This research project was a great learning experience, as it made me realize my strengths as well as the areas where I could improve my skills for future projects. This project was the most challenging of all the projects I have completed, yet enjoyable. Even knowing the complexity of the project, the eagerness to explore a whole new field of technology was what led me to choose the topic of touchless interaction through gesture recognition.

I feel that my strength has been in writing my code. The code developed in this project is completely from scratch, so developing it was challenging, but it made me realize that I do have the ability to code. Before this project I saw coding as tedious, but this project has completely changed my view, and I have embraced that change. Although the project does not completely meet the initial aim that was set out, I am pleased with how far I have come, from having no clue about which programming environment to use to gradually designing small parts of the system and making them work together.

My weakness, however, is my software and time management. Time management was one of my main weaknesses; I feel that I could have handled this project much better in terms of sticking to a scheduled completion time and to the approaches that I had specified. I did manage to stick to the Waterfall model of development, but I feel that this could have been done better. The time required for the implementation and testing was underestimated, which led to the incomplete development of the software and to the limited testing applied. I spent the maximum amount of my time familiarizing myself with the new programming language and development tools and then developing the code. In future, a more realistic plan for completion and execution could be used to avoid disappointment and delays in task completion.
REFERENCES

1) Ambler, S.W. (2003-2010) UML 2 Activity Diagrams [WWW] Agile Modelling. Available from: http://www.agilemodeling.com/artifacts/activityDiagram.htm [Accessed: 10/03/2011]

2) Baudel, T. and Beaudouin-Lafon, M. (1993). Charade: remote control of objects using free-hand gestures. Communications of the ACM, 36(7), p. 28-35.

3) Bolt, R. (1980). Put-that-there: Voice and gesture at the graphics interface. ACM SIGGRAPH Computer Graphics, 14(3), p. 262-270.

4) Boehm, B. "A Spiral Model of Software Development and Enhancement", ACM SIGSOFT Software Engineering Notes, ACM, 11(4): 14-24, August 1986.

5) Bradski, G. and Kaehler, A. (2008) Learning OpenCV. 1st ed. United States: O'Reilly Media.

6) Creek Photo (2011) Introduction to Color Spaces [WWW]. Available from: http://www.drycreekphoto.com/Learn/color_spaces.htm [Accessed: 26/01/2011]

7) English, W.K., Engelbart, D.C. and Berman, M.L., "Display Selection Techniques for Text Manipulation." IEEE Transactions on Human Factors in Electronics, 1967, HFE-8(1).

8) Feris, R., Turk, M., Raskar, R., Tan, K. and Ohashi, G. "Recognition of Isolated Fingerspelling Gestures Using Depth Edges," in Real-Time Vision for Human-Computer Interaction, B. Kisacanin, V. Pavlovic and T.S. Huang, Eds.: Springer, p. 43-56, 2005.
9) Grasso, M., Ebert, D. and Finin, T. (1998) The integrality of speech in multimodal interfaces. ACM CHI 1998 Conference on Human Factors in Computing Systems, p. 303-325.

10) Goldberg, A., ed. A History of Personal Workstations. 1988, Addison-Wesley Publishing Company: New York, NY. 537.

11) Hashem, H.F., "Adaptive technique for human face detection using HSV color space and neural networks," Radio Science Conference, 2009. NRSC 2009. National, pp. 1-7, 2009.

12) Hardenberg, C.V. and Berard, F. (2001) Bare-Hand Human Computer Interaction. Proceedings of the ACM Workshop on Perceptive User Interfaces.

13) Hofmann, F., Heyer, P. and Hommel, G. Velocity Profile Based Recognition of Dynamic Gestures with Discrete Hidden Markov Models. Proc. of the International Gesture Workshop on Gesture and Sign Language in Human-Computer Interaction, Springer London (2004), 81-95.

14) Image processing (2007) [WWW] Available from: http://pyrorobotics.org/?page=Introduction_20to_20Computer_20Vision [Accessed 28/02/2011]

15) Kovac, J., Peer, P. and Solina, F. (2003). Human Skin Color Clustering for Face Detection. EUROCON 2003.

16) Lahari, T. and Aneesha, V. Contour Detection in Computer Vision [WWW] V. Siddhartha Engineering College. Available from: http://www.scribd.com/Contour-Detection-My-Ppt-1/d/41564107 [Accessed: 01/03/2011]
17) Shapiro, L.G. and Stockman, G.C. (2001): "Computer Vision", pp. 279-325, New Jersey: Prentice-Hall.

18) Mysliwiec, T.A. 1994. FingerMouse: A Freehand Computer Pointing Interface, Vision Interfaces and Systems Laboratory Technical Report, VISlab-94-001, Electrical Engineering and Computer Science Department, The University of Illinois at Chicago.

19) Nadgeri, S.M., Sawarkar, S.D. and Gawande, A.D., "Hand Gesture Recognition Using CAMSHIFT Algorithm," Emerging Trends in Engineering and Technology (ICETET), 2010 3rd International Conference on, pp. 37-41, 19-21 Nov. 2010.

20) Nallaperumal, K., Ravi, S., Babu, C.N.K., Selvakumar, R.K., Fred, A.L.C., Seldev and Vinsley, S.S. (2007). Skin Detection using Color Pixel Classification with Application to Face Detection: A Comparative Study. International Conference on Computational Intelligence and Multimedia Applications 2007.

21) Pavlovic, V.I., Sharma, R. and Huang, T.S. "Visual Interpretation of Hand Gestures for Human-Computer Interaction: A Review." IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, pp. 677-695, 1997.

22) Oviatt, S. (2001) Designing Robust Multimodal Systems for Diverse Users and Environments. EC/NSF Workshop on Universal Accessibility of Ubiquitous Computing.

23) Parekh, N. (2011) The Waterfall Model Explained [WWW]. Available from: http://www.buzzle.com/editorials/1-5-2005-63768.asp [Accessed: 03/03/2011]

24) Pressman, R.S. (2010). Software Engineering: A Practitioner's Approach, (7th Edition) New York: McGraw-Hill.
25) Quek, F.K.H. (1994) Non-Verbal Vision-Based Interfaces. Vision Interfaces and Systems Laboratory Technical Report, Electrical Engineering and Computer Science Department, The University of Illinois at Chicago.

26) [Figure 2.2] rickyc (2004) Contour Detection [online image]. GitHub. Available from: https://github.com/rickyc/contour-detection [Accessed: 21/02/2011]

27) Redstone Software Inc. (2008) Black-box vs. White-box Testing: Choosing the Right Approach to Deliver Quality Applications [WWW]. Available from: http://www.testplant.com/download_files/BB_vs_WB_Testing.pdf [Accessed: 05/03/2011]

28) Rodrigo, R. (2009) Image Processing and Computer Vision [WWW] University of Moratuwa. Available from: http://www.ent.mrt.ac.lk/~ranga/files/mathematics_society_talk.pdf [Accessed: 28/02/2011]

29) Segen, J. and Kumar, S. (1998) GestureVR: Vision-based 3D hand interface for spatial interaction. ACM International Conference on Multimedia, pp. 455-464.

30) Search Software Quality (2001) Systems development life cycle (SDLC) [WWW]. Available from: http://searchsoftwarequality.techtarget.com/definition/systems-development-life-cycle [Accessed: 06/03/2011]

31) Segen, J. and Kumar, S. (1998) Human-Computer Interaction using Gesture Recognition and 3D Hand Tracking. Proceedings of ICIP'98, Chicago, October 1998, Vol. 3, pp. 188-192.

32) [Figure 3.1] Software Engineering Philippines (2008) Waterfall Model [WWW]. Available from: http://shannonxj.blogspot.com/2008/01/waterfall-model.html [Accessed: 05/03/2011]
33) Target - The Software Experts. Software Process Models [WWW]. Available from: http://www.the-software-experts.de/e_dta-sw-process.htm [Accessed: 04/03/2011]

34) Triesch, J. and von der Malsburg, C. (1998) Robotic Gesture Recognition. Proceedings of ECCV, pp. 233-241.

35) Utsumi, A., Miyasato, T., Kishino, F. and Nakatsu, R. (1996) Hand Gesture Recognition System using Multiple Cameras. Proceedings of ICPR'96, August 1996, Vol. 1, pp. 667-671.

36) [Figure 2.1] Virtual Technologies, Inc. (1996) Computer Desktop Encyclopedia [online image]. Available from: http://images.yourdictionary.com/images/computer/_GLOVE.GIF [Accessed: 21/01/2011]

37) Wang, J., Zhai, S. and Su, H. (2001) Chinese input with keyboard and eye-tracking: an anatomical study. ACM CHI 2001 Conference on Human Factors in Computing Systems, pp. 349-356.

38) Wikipedia (2011) Segmentation (image processing) [WWW]. Available from: http://en.wikipedia.org/wiki/Segmentation_(image_processing) [Accessed: 01/03/2011]

39) [Figure 3.2] Wikipedia (2011) Spiral Model [WWW]. Available from: http://en.wikipedia.org/wiki/Spiral_model [Accessed: 05/03/2011]

40) Wu, Y. and Huang, T.S. (1999) Vision-Based Gesture Recognition: A Review. International Gesture Workshop, Lecture Notes in Computer Science, pp. 103-115.

41) Yoo, T. and Oh, S. (1999) A fast algorithm for tracking human faces based on chromatic histograms. Pattern Recognition Letters, 20, pp. 967-978.
42) IEEE (1986) ANSI/IEEE Standard 1008-1987: IEEE Standard for Software Unit Testing.

43) IEEE (1990) IEEE Standard 610.12-1990: IEEE Standard Glossary of Software Engineering Terminology.

44) Zarit, B.D., Super, B.J. and Quek, F.K.H. (1999) Comparison of five color models in skin pixel classification. ICCV'99 International Workshop on Recognition, Analysis and Tracking of Faces and Gestures in Real-Time Systems.

APPENDIX A: SOURCE CODE
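The listing below is the complete prototype as implemented in C++ with the OpenCV library. In outline, the program grabs frames from the default webcam, converts each frame from BGR to HSV colour space, thresholds the pixels that fall within the assumed skin-tone range, median-filters the resulting binary mask, and finds the contours of the detected skin regions; the largest contour is taken to be the hand and is annotated with its bounding box, convex hull and minimum-area ellipse.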
#include "stdafx.h"
#include "cv.h"
#include "cxcore.h"
#include "highgui.h"
#include <math.h>
#include <iostream>

using namespace std;

int main()
{
    int c = 0;

    // Open the default camera and check that frames can be grabbed.
    CvCapture* capture = cvCaptureFromCAM(0);
    if (!capture || !cvQueryFrame(capture))
    {
        cout << "Video capture failed, please check the camera." << endl;
        return 1;
    }
    cout << "Video camera capture status: OK" << endl;

    CvSize size = cvGetSize(cvQueryFrame(capture));

    // Working images: HSV copy of the frame, binary skin mask and an
    // output image for the detected contours.
    IplImage* hsv   = cvCreateImage(size, 8, 3);
    IplImage* mask  = cvCreateImage(size, 8, 1);
    IplImage* image = cvCreateImage(size, 8, 3);

    // HSV range assumed to cover skin tones: low hue, moderate
    // saturation, medium-to-high value.
    CvScalar hsvMin = cvScalar(0, 30, 80, 0);
    CvScalar hsvMax = cvScalar(20, 150, 255, 0);

    CvMemStorage* storage    = cvCreateMemStorage(0);
    CvMemStorage* minStorage = cvCreateMemStorage(0);
    CvMemStorage* dftStorage = cvCreateMemStorage(0);
    CvSeq* contours = NULL;

    cvNamedWindow("src", 1);
    cvNamedWindow("img", 1);
    cvNamedWindow("msk", 1);
    cvNamedWindow("image", 0);

    // Main loop: runs until the user presses Esc (key code 27).
    while (c != 27)
    {
        // Frames returned by cvQueryFrame are owned by the capture
        // structure and must not be released by the caller.
        IplImage* src = cvQueryFrame(capture);
        if (!src) break;

        // Reset per-frame state so the storages do not grow unbounded.
        cvClearMemStorage(storage);
        cvClearMemStorage(minStorage);
        cvClearMemStorage(dftStorage);
        cvZero(image);

        // Segment likely skin pixels in HSV space, then median-filter
        // the mask to suppress speckle noise.
        cvCvtColor(src, hsv, CV_BGR2HSV);
        cvInRangeS(hsv, hsvMin, hsvMax, mask);
        cvSmooth(mask, mask, CV_MEDIAN, 27, 0, 0, 0);

        // Extract the contours of the skin regions and draw them all.
        cvFindContours(mask, storage, &contours, sizeof(CvContour),
                       CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE, cvPoint(0, 0));
        cvDrawContours(image, contours, CV_RGB(0, 200, 0),
                       CV_RGB(0, 100, 0), 1, 1, 8, cvPoint(0, 0));

        // Keep the largest contour by area, assumed to be the hand.
        CvSeq* biggest = NULL;
        double area = 0, maxArea = 0;
        while (contours)
        {
            area = fabs(cvContourArea(contours, CV_WHOLE_SEQ));
            if (area > maxArea) { maxArea = area; biggest = contours; }
            contours = contours->h_next;
        }

        if (biggest)
        {
            // Bounding box around the hand region.
            CvRect rect = cvBoundingRect(biggest, 0);
            cvRectangle(image,
                        cvPoint(rect.x, rect.y + rect.height),
                        cvPoint(rect.x + rect.width, rect.y),
                        CV_RGB(200, 0, 200), 1, 8, 0);

            // Convex hull and convexity defects, computed so that a later
            // gesture-recognition stage can analyse the finger valleys.
            CvSeq* hull = cvConvexHull2(biggest, 0, CV_CLOCKWISE, 0);
            cvConvexityDefects(biggest, hull, dftStorage);

            // Minimum-area rectangle: its centre and an ellipse of the
            // same size and orientation are drawn as markers.
            CvBox2D box = cvMinAreaRect2(biggest, minStorage);
            CvPoint centre = cvPointFrom32f(box.center);
            cvCircle(image, centre, 3, CV_RGB(200, 0, 200), 2, 8, 0);
            cvEllipse(image, centre,
                      cvSize((int)(box.size.height / 2), (int)(box.size.width / 2)),
                      box.angle, 0, 360, CV_RGB(220, 0, 220), 1, 8, 0);

            // Redraw the hand contour in red on the output image.
            cvDrawContours(image, biggest, CV_RGB(255, 0, 0),
                           CV_RGB(0, 0, 100), 1, 4, 8, cvPoint(0, 0));
        }

        cvShowImage("src", src);
        cvShowImage("img", hsv);
        cvShowImage("msk", mask);
        cvShowImage("image", image);

        c = cvWaitKey(10);
    }

    // Release everything that was explicitly created.
    cvReleaseImage(&hsv);
    cvReleaseImage(&mask);
    cvReleaseImage(&image);
    cvReleaseMemStorage(&storage);
    cvReleaseMemStorage(&minStorage);
    cvReleaseMemStorage(&dftStorage);
    cvReleaseCapture(&capture);
    cvDestroyAllWindows();
    return 0;
}
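For readers working with a newer release of OpenCV, the core skin-segmentation step above can be expressed far more compactly through the C++ API. The sketch below is illustrative only and is not part of the submitted prototype; it assumes the same HSV thresholds as the listing above, camera index 0, and an arbitrary window name.

// Minimal sketch of the skin-segmentation step using the modern OpenCV
// C++ API. Illustrative only; not part of the submitted prototype.
#include <opencv2/opencv.hpp>

int main()
{
    cv::VideoCapture capture(0);          // default webcam (assumption)
    if (!capture.isOpened()) return 1;

    cv::Mat frame, hsv, mask;
    for (;;)
    {
        capture >> frame;                 // grab a BGR frame
        if (frame.empty()) break;

        cv::cvtColor(frame, hsv, cv::COLOR_BGR2HSV);
        // Same assumed skin-tone range as the prototype above.
        cv::inRange(hsv, cv::Scalar(0, 30, 80), cv::Scalar(20, 150, 255), mask);
        cv::medianBlur(mask, mask, 27);   // suppress speckle noise

        cv::imshow("skin mask", mask);
        if (cv::waitKey(10) == 27) break; // Esc quits
    }
    return 0;
}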
APPENDIX B: COMMUNICATIONS LOG