This paper proposes a method for tracking people in indoor surveillance videos under challenges such as illumination changes and occlusions. It uses background subtraction to detect moving objects and extracts color features to distinguish between occluded objects. The method tracks people by matching color clusters between frames and handles occlusions by using color information to assign a unique tag to each tracked person. Experiments on the PETS dataset demonstrate the effectiveness of color features for occlusion handling and person tracking in challenging indoor scenes.
form groups and separate from one another using color information. In [7], I. Haritaoglu et al. employ a histogram-based approach to locate human body parts such as the head, hands, feet and torso, and then use the head information to find the number of people.
A. J. Lipton et al. [8] use shape and color information to detect and track multiple people and vehicles in a crowded scene and to monitor activities over a large area and extended periods of time. To survive occlusion conditions, one should take advantage of multiple cues, such as color, motion and edges, as none of these features alone provides a universal result across different environments. The color histogram is robust against partial occlusion, but sensitive to illumination changes in the scene.
In [9], color cues are combined with motion cues to provide a better result. Color and shape cues are also used in [10], where shape is described using a parameterized rectangle or ellipse. In [11], color, shape and edge cues are combined under a particle filter framework to provide a robust tracking result; it also involves an adaptation scheme to select the most effective cues in different conditions.

3. PROPOSED METHODOLOGY

Our algorithm aims to assign a consistent identifier to each object appearing in the scene when individuals merge into or split from a group, and involves several methods to obtain the lowest possibility of false tracking and tagging. In tracking the object of interest (a human), shadows affect the performance of tracking and lead to false tagging. To avoid this problem, we apply a mean filter to remove noise, which causes the image sequence to blur. Since we use color information for tracking, the blurring causes no loss of data. The structural design of our proposed method is shown in Fig. 1.

Fig. 1 System architecture

1. Motion detection

The most basic form of motion detection is the method of subtracting a known background image containing no objects from an image under test. There are several approaches to background subtraction, including averaging background frames over time and statistical modeling of each pixel. Preprocessing based on mean filtering is done on the input video (i.e., the image sequences) to equalize illumination changes and to suppress the presence of shadows.

A. Background subtraction

Preprocessing is done on the video frames to reduce the presence of noise. We apply a mean filter, which blurs the image frames and thereby helps in shadow removal. After preprocessing, motion detection is performed.
Background subtraction is the common method of motion detection. It is a technique that uses the difference between the current image and the background image to detect the motion region. Its calculation is simple and easy to implement. Background subtraction delineates the foreground from the background in the images:

Dk(x,y) = 1 if |Fk(x,y) − Bk−1(x,y)| > T, and 0 otherwise,   (1)

where Dk(x,y) is the resultant difference, Fk(x,y) is the current frame, Bk−1(x,y) is the initialized background frame, and T is the threshold, which suppresses shadows depending on the value assigned.

Fig. 2 Background subtraction
(a) Background image initialization
(b) Current frame with moving objects
(c) Resultant background-subtracted image

There are many ways to initialize the background image: for example, taking the first frame as the background directly, averaging the pixel brightness of the first few frames, or using an image sequence without any moving objects to estimate the background model parameters, depending on the application. Among these, we prefer an image sequence having no objects as the background image, since we use indoor videos (which have little illumination change). Figure 2 illustrates the result of background subtraction.
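As an illustrative sketch (in Python rather than the Matlab used later in the paper; `init_background` and the toy frames are our own names, not the authors'), the averaging option for background initialization might look like:

```python
import numpy as np

def init_background(frames):
    """Estimate a background model by averaging pixel brightness over a
    few object-free frames (one of the initialization options above)."""
    stack = np.stack([f.astype(np.float64) for f in frames], axis=0)
    return stack.mean(axis=0)  # per-pixel mean brightness

# Toy usage: three constant 2x2 "frames" containing no moving objects.
frames = [np.full((2, 2), v, dtype=np.uint8) for v in (10, 12, 14)]
bg = init_background(frames)
print(bg)  # every pixel averages to 12.0
```

Averaging suppresses sensor noise but, as the text notes, a genuinely object-free sequence is preferable for indoor scenes with stable lighting.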
Drastic changes in a pixel's intensity indicate that the pixel is in motion. The background subtraction step generates a binary image in which black pixels represent the background and white pixels represent motion.
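A minimal Python sketch of the thresholded difference of Equation (1) (the function name `background_subtract` and the toy threshold are our own, not from the paper):

```python
import numpy as np

def background_subtract(frame, background, T):
    """Equation (1): mark a pixel as moving (1, white) when its absolute
    difference from the background exceeds threshold T, else 0 (black)."""
    diff = np.abs(frame.astype(np.int32) - background.astype(np.int32))
    return (diff > T).astype(np.uint8)

bg = np.zeros((3, 3), dtype=np.uint8)
cur = bg.copy()
cur[1, 1] = 200                     # a single bright "moving" pixel
mask = background_subtract(cur, bg, T=50)
print(int(mask.sum()))              # 1: only the moving pixel is white
```

Raising T suppresses weak differences such as soft shadows, at the risk of losing dimly lit foreground pixels, which is why the text treats the threshold as application-dependent.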
A post-processing step is then applied to the binary image to label groups of motion pixels as motion blobs using connected component analysis. The key idea of connected component analysis is to attach adjacent foreground pixels (i.e., white pixels) in order to construct a region. Connected component labeling is used in computer vision to detect connected regions in binary digital images. Blobs may be counted, filtered, and tracked.
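The labeling step can be sketched as a breadth-first flood fill over the binary mask. This is a self-contained Python illustration (the paper does not specify the connectivity; 8-connectivity is assumed here, and `label_blobs` is our own name):

```python
from collections import deque

def label_blobs(mask):
    """Connected component labeling: group adjacent foreground (1) pixels
    into motion blobs via BFS with 8-connectivity.
    Returns (labels, n_blobs); labels has the mask's shape, 0 = background."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    n = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not labels[y][x]:
                n += 1                       # start a new blob
                labels[y][x] = n
                q = deque([(y, x)])
                while q:
                    cy, cx = q.popleft()
                    for dy in (-1, 0, 1):    # visit the 8 neighbors
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and mask[ny][nx] and not labels[ny][nx]):
                                labels[ny][nx] = n
                                q.append((ny, nx))
    return labels, n

mask = [[1, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 1]]
labels, n = label_blobs(mask)
print(n)  # 2: two separate blobs
```

Small blobs can then be filtered by pixel count before tracking, which removes residual noise that survives thresholding.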
2. Object tracking
Fig. 3 The workflow of the color-based motion tracking component
Once the object areas are determined in each frame, tracking is performed to trace the objects from frame to frame. The color information of each blob is derived, and tracking is performed by matching blob colors. To handle occlusion, each motion blob is segmented into clusters of similar color.
The key feature of the proposed method is that the color information of each object is extracted cluster by cluster. Each cluster has its own weightage for comparison. The color information is extracted from the motion blocks in the current frame to categorize matching color information between motion blocks in the current frame and previous frames. Subsequently, a tag is assigned to the motion blocks in the current frame.
The first sub-task in object tracking is that each motion block in the current frame is segmented into areas of almost similar color as clusters (head, torso and feet). For each cluster of the motion block, the color information is then derived as HSV values and stored, which helps in comparison.
The second sub-task is to identify matching color information between motion blocks in the current frame and motion blocks in the previous frames. This is done by comparing the color information of a cluster of the motion block in the current frame with the color information of the clusters in all motion blocks in the previous frames using weighted matching. For each comparison made, the processor computes a respective comparison score. The comparison score for each cluster of the motion block in the current frame is stored, and the processor then identifies the highest comparison score of each cluster in the current frame. This is repeated for every cluster of the motion block in the current frame.
The third sub-task is that, once the average comparison score of the motion block in the current frame is computed, the processor assigns a tag to the motion blocks in the current frame: either a tag similar to that of the previous frames or a new tag. The decision to retain a tag or assign a new tag depends on the average comparison score computed for the motion block in the current frame and all motion blocks in the previous frame.

If (comparison score > threshold)
    Assign previous tag
Else
    Assign new tag

4. TESTS ON PETS DATASET

The above algorithm is implemented using Matlab on a Windows 7 platform with 4GB RAM. The test video for this example is from the PETS-ECCV'2004 CAVIAR database, which is an open database for research on visual surveillance.

A. Motion detection

Accuracy in motion detection is important for efficient tracking. The threshold should be set so as to avoid shadows to a greater extent; the blob size should also be maintained properly, and this depends on the application. Figure 4 illustrates the results with various threshold values.
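The three tracking sub-tasks described in the object-tracking section (cluster HSV extraction, weighted matching, tag retention vs. new tag) can be sketched in Python as follows. The cluster weights, the 0.8 threshold, and the simple HSV distance are our own illustrative assumptions; the paper does not publish its weightage or threshold values:

```python
import colorsys

# Hypothetical per-cluster weightage (head, torso, feet) -- not the paper's values.
WEIGHTS = {"head": 0.2, "torso": 0.5, "feet": 0.3}
THRESHOLD = 0.8  # assumed tag-retention threshold

def mean_hsv(pixels):
    """Sub-task 1: derive a cluster's color information as a mean HSV
    triple from its RGB pixels (components in 0..1)."""
    hsv = [colorsys.rgb_to_hsv(*p) for p in pixels]
    n = len(hsv)
    return tuple(sum(c[i] for c in hsv) / n for i in range(3))

def cluster_score(a, b):
    """Similarity of two HSV triples in [0, 1]; 1 means identical."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / 3.0

def compare_blocks(cur, prev):
    """Sub-task 2: weighted matching of cluster color information
    between a current and a previous motion block."""
    return sum(WEIGHTS[k] * cluster_score(cur[k], prev[k]) for k in WEIGHTS)

def assign_tag(cur, tagged_prev, next_tag):
    """Sub-task 3: retain the best-matching previous tag if its score
    clears the threshold; otherwise issue a new tag."""
    best_tag, best = None, -1.0
    for tag, prev in tagged_prev.items():
        s = compare_blocks(cur, prev)
        if s > best:
            best_tag, best = tag, s
    return best_tag if best > THRESHOLD else next_tag

# Usage: a red-shirted person keeps tag 1 across frames.
person = {"head":  mean_hsv([(0.8, 0.7, 0.6)]),
          "torso": mean_hsv([(0.9, 0.1, 0.1), (0.8, 0.2, 0.2)]),
          "feet":  mean_hsv([(0.2, 0.2, 0.2)])}
print(assign_tag(person, {1: person}, next_tag=2))  # 1: identical colors keep the tag
```

Because the comparison is per cluster, a partially occluded person can still match on the visible clusters, which is the rationale for weighted matching given in the text.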
Fig. 4 Motion detection

B. Object tracking

Assigning a suitable tag accurately under occlusion conditions is illustrated below. Color feature extraction and matching provide a good solution for assigning tags, and clustering helps in reducing the cost of comparison. Fig. 5 shows the handling of occlusions during tracking.

Fig. 5 Occlusion handling

5. CONCLUSION

The advantages of using color as a feature to establish object similarity were analyzed, and color was found to be robust against complex, deformed and changeable shapes (i.e., different human profiles). In addition, it is scale and rotation invariant, as well as fast in terms of processing time. Color information is extracted, stored and compared to establish the uniqueness of each object.
REFERENCES

[1] R. Cucchiara, C. Grana, G. Neri, M. Piccardi and A. Prati, "The Sakbot system for moving object detection and tracking," Video-Based Surveillance Systems: Computer Vision and Distributed Processing, pp. 145-157, 2001.
[2] C. Stauffer and W. E. L. Grimson, "Adaptive background mixture models for real-time tracking," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 1999.
[3] C. Wren, A. Azarbayejani, T. Darrell and A. Pentland, "Pfinder: Real-time tracking of the human body," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 780-785, 1997.
[4] L. Qiu and L. Li, "Contour extraction of moving objects," in Proc. IEEE Int'l Conf. Pattern Recognition, vol. 2, pp. 1427-1432, 1998.
[5] Tang Sze Ling, Liang Kim Meng, Lim Mei Kuan, Zulaikha Kadim and Ahmed A. Baha'a Al-Deen, "Colour-based Object Tracking in Surveillance Application," in Proceedings of the International MultiConference of Engineers and Computer Scientists 2009 (IMECS 2009), Vol. I, March 18-20, 2009, Hong Kong.
[6] S. J. McKenna, S. Jabri, Z. Duric, A. Rosenfeld and H. Wechsler, "Tracking groups of people," Computer Vision and Image Understanding, vol. 80, no. 1, pp. 42-56, 2000.
[7] I. Haritaoglu, D. Harwood and L. S. Davis, "W4: Real-time surveillance of people and their activities," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 809-830, 2000.
[8] A. Lipton, H. Fujiyoshi and R. Patil, "Moving target classification and tracking from real-time video," in DARPA Image Understanding Workshop, pp. 129-136, November 1998.
[9] P. Pérez, J. Vermaak and A. Blake, "Data fusion for visual tracking with particles," Proceedings of the IEEE, vol. 92, no. 3, pp. 495-513, 2004.
[10] C. Shen, A. van den Hengel and A. Dick, "Probabilistic multiple cue integration for particle filter based tracking," in Proc. VIIth Digital Image Computing: Techniques and Applications, C. Sun, H. Talbot, S. Ourselin and T. Adriaansen, Eds., 2003.
[11] H. Wang et al., "Adaptive object tracking based on an effective appearance filter," IEEE Trans. Pattern Analysis and Machine Intelligence, 2007.