Aditya Venkatraman et al Int. Journal of Engineering Research and Applications
ISSN : 2248-9622, Vol. 3, Issue 6, Nov-Dec 2013, pp.2137-2142

RESEARCH ARTICLE                                                        OPEN ACCESS

Improvised Filter Approach for Estimating Depth from Monocular Images

Aditya Venkatraman¹, Sheetal Mahadik²
¹(Department of Electronics and Telecommunication, ST Francis Institute of Technology, Mumbai, India)
²(Department of Electronics and Telecommunication, ST Francis Institute of Technology, Mumbai, India)

ABSTRACT
Depth estimation or extraction refers to the set of techniques and algorithms aiming to obtain the distance of each and every pixel from the camera viewpoint. Depth estimation poses various challenges and has a wide range of applications. Depth can be estimated using different monocular and binocular cues. In [1], depth is estimated from monocular images, i.e. by using monocular cues. The filters used for texture gradient estimation in [1] detect false edges and are susceptible to noise. In this paper, for depth estimation from monocular images, we obtain a monocular cue named texture gradient using the Canny edge detector, which is more robust to noise and false edges as compared to the six oriented edge detection filters (Nevatia Babu filters) used in [1]. We have also reduced the dimensions of the feature vector as compared to [1], and hence feature optimization for texture gradient extraction is achieved. Even though a reduced set of features is used, our outputs gave better results as compared to [1].
Keywords – Canny, depth estimation, linear least squares problem, monocular cue, texture gradient

I. INTRODUCTION

Depth estimation is an important research area because of its extensive use in robotic vision and machine vision, where it makes it possible to construct robots and cars that move on their own without human assistance. In related work, Saxena's algorithm [1] generates a depth map from monocular images. Depth estimation has important applications in robotics, scene understanding and 3-D reconstruction.
By using depth estimation techniques, the distance of each point seen in the scene can be obtained from the camera viewpoint. Depth estimation has long been a demanding subject in visual computing. Conventionally, depths for a monocular image are estimated by using a laser scanner or a binocular stereo vision system. However, a binocular stereo vision system requires careful adjustment of the cameras, and a laser scanner is expensive, so both of these apparatuses bring significant complexity. Therefore, algorithms have been developed that process monocular cues in the picture for depth estimation. In related work, Michels, Saxena & Ng [2] used supervised learning to estimate 1-D distances to obstacles, for the application of autonomously driving a remote control car. Most work on depth estimation has focused on binocular vision (stereopsis) [3] and on other algorithms that require multiple images, such as shape from shading [4] and depth from focus [5]. Gini & Marchi [6] used single-camera vision to drive an indoor robot, but relied heavily on known ground colors and textures.

In this paper, a more accurate depth map than that of [1] is obtained by using an improvised filter for texture gradient extraction.

II. METHODOLOGY

In this paper, monocular cues are used for depth estimation. There are different monocular cues such as texture variations, texture gradients, interposition, occlusion, known object sizes, light and shading, haze and defocus which can be used for depth estimation. The monocular cues used in this paper are haze, texture gradient and texture energy, as these cues are present in most images, as in [1]. The texture of many objects appears different depending on their distance from the camera viewpoint, which helps in indicating depth. Texture gradients, which capture the distribution of the direction of edges, also help to indicate depth. Haze is another cue, resulting from atmospheric light scattering.
To obtain haze, averaging filters are used. To obtain texture energy, the nine Laws' filters are used, and to obtain the texture gradient, the Canny edge detector is used, as compared to the six edge detectors (Nevatia Babu filters) used in [1].
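As an illustration of the filtering described above (not part of the original implementation), the following Python sketch computes the 12 cue responses for an image with OpenCV and NumPy: two locally averaged color channels for haze, nine Laws' mask responses on the intensity channel for texture energy, and one Canny edge map for the texture gradient. The averaging-kernel size and Canny thresholds are assumptions, since this paper does not specify them.

```python
# Sketch (not from the paper): the 12 cue responses -- local averaging for
# haze, Laws' 3x3 masks for texture energy, Canny edges for texture gradient.
import cv2
import numpy as np

def cue_responses(bgr_image):
    """Return 12 filter-response images: 2 averaged chrominance channels,
    9 Laws-mask responses on intensity, and 1 Canny edge map."""
    ycbcr = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)
    y, cr, cb = cv2.split(ycbcr)

    responses = []

    # Haze: local averaging of the two color (chrominance) channels.
    # The 5x5 kernel size is an assumption.
    for chan in (cb, cr):
        responses.append(cv2.blur(chan.astype(np.float32), (5, 5)))

    # Texture energy: the nine 3x3 Laws' masks (outer products of the
    # standard L3, E3 and S3 vectors) convolved with the intensity channel.
    l3 = np.array([1.0, 2.0, 1.0])
    e3 = np.array([-1.0, 0.0, 1.0])
    s3 = np.array([-1.0, 2.0, -1.0])
    for a in (l3, e3, s3):
        for b in (l3, e3, s3):
            responses.append(cv2.filter2D(y.astype(np.float32), -1, np.outer(a, b)))

    # Texture gradient: Canny edge map of the intensity channel.
    # The 100/200 thresholds are illustrative only.
    responses.append(cv2.Canny(y, 100, 200).astype(np.float32))

    return responses
```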

Figure 1: Block diagram for depth estimation from monocular images. The steps shown are: convert the original image from the RGB space to the YCbCr space; segment the entire image into a plurality of patches; use filters to obtain the monocular cues and form the initial feature vector; incorporate local and global features into the initial feature vector to obtain the absolute feature vector; and incorporate the absolute feature vector in a supervised learning algorithm which predicts the depth map as a function of the image.
After obtaining these monocular cues, an initial feature vector is obtained from them. Although this initial feature vector gives some information about depth using local information such as the variation in texture and color of a patch, this is insufficient to determine depth accurately, and thus global properties have to be used. For example, just by looking at a blue patch it is difficult to tell whether the patch belongs to the sky or to part of a blue object. Due to these difficulties, one needs to look at both the local and the global properties of an image to determine depth.
Thus the final feature vector, which includes both local and global features, is used in a supervised learning algorithm which predicts the depth map as a function of the image. The detailed explanation of feature calculation and learning is given in sections 2.1, 2.2 and 2.3.
2.1 FEATURE VECTOR
The entire image is initially divided into small rectangular patches arranged in a uniform grid, and a single depth value is estimated for each patch. Absolute depth features, which are used to determine the absolute depth of a patch, are employed here. Three types of monocular cues are selected: texture variations, texture gradients and haze, as these are present in most indoor and outdoor images. Texture information is mostly contained within the image intensity channel, so Laws' masks are applied to this channel to compute the texture energy. Haze is reflected in the low-frequency information in the color channels, and is captured by applying a local averaging filter to the color channels. To compute an estimate of the texture gradient, the Canny edge detector [7] is used, which is more robust to noise and false edges as compared to the six oriented edge filters (Nevatia Babu filters) used in [1].

Figure 2: Filters used for depth estimation, wherein the nine filters indicate the Laws' 3x3 masks.

2.2 ABSOLUTE FEATURE VECTOR
For a patch i in an image I(x,y), the outputs of the two local averaging filters, the Canny edge detector and the nine Laws' mask filters, which are 12 outputs in total, are used as:

E_i(n) = \sum_{(x,y) \in \text{patch}(i)} |I(x,y) * F_n(x,y)|^k        (1)

where F_n(x, y) indicates the filter used and n = 1 to 12 indexes the filters. Here k = {1, 2} gives the sum absolute energy and the sum squared energy respectively. Thus an initial feature vector of dimension 12 is obtained, as compared to dimension 17 in [1].
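A minimal sketch of equation (1) is given below: for one patch it accumulates |I * F_n|^k over the patch pixels for each of the 12 filter responses and for both k = 1 and k = 2, yielding the 24 energy values per patch that are used later to build the absolute feature vector. The function and argument names are illustrative, not from the paper.

```python
# Sketch of equation (1): per-patch filter energies for k = 1 and k = 2.
# `responses` is assumed to be the list of 12 filter-response images
# (the image already convolved with filters F_1..F_12).
import numpy as np

def patch_energy_features(responses, row_slice, col_slice):
    """Return the 24 energy features (12 filters x k in {1, 2}) for the
    patch defined by the given row/column slices."""
    feats = []
    for resp in responses:                  # n = 1..12
        patch = np.abs(resp[row_slice, col_slice])
        feats.append(patch.sum())           # k = 1: sum absolute energy
        feats.append((patch ** 2).sum())    # k = 2: sum squared energy
    return np.array(feats)
```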
With these filters, local image features for a patch are obtained. But to obtain the absolute depth of a patch, local image features centered on the patch are insufficient, and more global properties of the image have to be used. Image features extracted at multiple image resolutions are used for this purpose. Objects at different depths exhibit very different behaviors at different resolutions, and using multi-scale features (scale 1, scale 3 and scale 9) allows us to capture these variations. Computing features at multiple spatial scales also helps to account for different relative sizes of objects. A closer object appears larger in the image, and hence will be captured in the larger-scale features. The same object, when far away, will be small and hence be captured in the smaller-scale features. To capture additional global features, the features used to predict the depth of a particular patch are computed from that patch as well as its four neighboring patches. This is repeated at each of the three scales, so that the feature vector of a patch includes features of its immediate neighbors at the large scale, features of far neighbors at a larger spatial scale, and again features of very far neighbors at an even larger spatial scale. Along with local and global features, many structures found in outdoor scenes show vertical structure, so additional summary features of the column that the patch lies in are added to the features of a patch.
For each patch, after including features from itself and its four neighbors at 3 scales, and summary features for its four column patches, an absolute feature vector of dimension 24*19 = 456 is obtained, as compared to a dimension of 34*19 = 646 in [1].
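The assembly of the 456-dimensional absolute feature vector can be sketched as below, assuming the 24 energy features of every patch are already available at the three scales (stored here at the same grid resolution, a simplifying assumption). How the four column patches are summarized is not detailed in this paper, so the averaging used here is only an illustrative choice.

```python
# Sketch: patch + 4 neighbours at 3 scales (15 patches) plus 4 column-summary
# patches, 24 energy features each -> 19 x 24 = 456 features.
import numpy as np

def absolute_feature_vector(features, r, c):
    """`features[s]` is assumed to be a (rows, cols, 24) array of patch
    features at scale s; (r, c) is the patch of interest."""
    rows, cols, _ = features[0].shape

    def feat(scale, rr, cc):
        rr = int(np.clip(rr, 0, rows - 1))      # clamp at image borders
        cc = int(np.clip(cc, 0, cols - 1))
        return features[scale][rr, cc]

    parts = []
    # The patch itself and its four neighbours, at each of the three scales.
    for s in range(3):
        for dr, dc in [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]:
            parts.append(feat(s, r + dr, c + dc))

    # Four column-summary patches: the patch's column split into four vertical
    # sections, each summarised by averaging its patch features (an assumption).
    bounds = np.linspace(0, rows, 5, dtype=int)
    for b0, b1 in zip(bounds[:-1], bounds[1:]):
        parts.append(features[0][b0:b1, c].mean(axis=0))

    return np.concatenate(parts)                # 19 patches x 24 = 456 values
```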
2.3 SUPERVISED LEARNING
The change in depth along a row of an image is much smaller than the change along a column. This is clearly evident in outdoor images, since the depth along a column can extend to infinity because the outdoor scene is unbounded due to the presence of the sky. Since depth is estimated for each patch in an image, a feature vector is calculated for each patch, whereas learning is done for each row, as the change along a row is very small. The linear least squares method is used for learning, whose equation is given by:

\theta_r = \arg\min_{\theta_r} \sum_{i=1}^{N} (d_i - x_i^T \theta_r)^2        (2)

In equation (2), N represents the total number of patches in the image, d_i is the ground truth depth for patch i, and x_i is the absolute feature vector for patch i. Here \theta_r, where r is the row of the image, is estimated by solving a linear least squares problem.
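A sketch of this per-row learning with NumPy's least-squares solver is given below; X_rows and d_rows are assumed containers holding, for each image row, the absolute feature vectors and the corresponding ground-truth depths of the training patches in that row.

```python
# Sketch of equation (2): fit one parameter vector theta_r per image row by
# linear least squares, then predict patch depths as x_i^T theta_r.
import numpy as np

def fit_row_parameters(X_rows, d_rows):
    thetas = []
    for X, d in zip(X_rows, d_rows):            # X: (N, 456), d: (N,)
        theta_r, *_ = np.linalg.lstsq(X, d, rcond=None)  # minimises ||d - X theta||^2
        thetas.append(theta_r)
    return thetas

def predict_row_depths(thetas, X_rows):
    # Predicted depth for each patch in row r: d_hat_i = x_i^T theta_r
    return [X @ theta for theta, X in zip(thetas, X_rows)]
```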

III. EXPERIMENTS

3.1 DATA
The data set is available online at http://make3d.cs.cornell.edu/data.html and consists of 400 images with their corresponding ground truth depth maps; it includes real-world images of forests, campus (roadside) areas and indoor scenes.
3.2 RESULTS
Depth is estimated from real-world test-set images of forests (containing trees, bushes, etc.) and campus areas (buildings, trees and roads). The algorithm was trained on a training set comprising images from all of these environments. Here 300 images are used for training and the remaining 100 images are used for testing.
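A possible way to pair the images with their depth maps and perform the 300/100 split is sketched below; the directory layout and file extensions are assumptions and not part of the data set description above.

```python
# Sketch (layout and file names are assumptions): pair the 400 images with
# their ground-truth depth maps and split them 300 / 100 for train / test.
import glob
import random

image_files = sorted(glob.glob("make3d/images/*.jpg"))
depth_files = sorted(glob.glob("make3d/depths/*.mat"))
pairs = list(zip(image_files, depth_files))

random.seed(0)                    # fixed seed so the split is reproducible
random.shuffle(pairs)
train_pairs, test_pairs = pairs[:300], pairs[300:400]
```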
TABLE 1 shows the comparison between the depth map obtained using the Canny edge detector and the depth map obtained by [1], i.e. by using Nevatia Babu filters, based on RMS (root mean square) errors in various environments such as campus, forest, indoor, and areas which include both campus and forest. The results on the test set show that our output has a lower RMS error as compared to [1].
TABLE 2 shows the comparison between the depth map obtained using the Canny edge detector and the depth map obtained by [1], i.e. by using Nevatia Babu filters, based on computation time, the set of features used and the average RMS error. It can be seen that we have a reduced set of features, only 456 as compared to the 646 features used by [1]. Since we have used fewer features, our total computation time is only 24 sec as compared to the total computation time of 44 sec in [1]. Even though we have used fewer features, our average RMS error is less than that of [1].
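The RMS error reported in Tables 1 and 2 can be computed as sketched below; whether the error is taken on raw or log depth values is not stated here, so the sketch simply uses the depth values directly.

```python
# Sketch of the evaluation metric: RMS error between predicted and
# ground-truth patch depths, averaged over a set of test images.
import numpy as np

def rms_error(predicted, ground_truth):
    predicted = np.asarray(predicted, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)
    return float(np.sqrt(np.mean((predicted - ground_truth) ** 2)))

def average_rms(pred_maps, gt_maps):
    return float(np.mean([rms_error(p, g) for p, g in zip(pred_maps, gt_maps)]))
```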
Table 1: Comparison of RMS errors resulting from the depth map using the Canny edge detector and the depth map obtained using [1]

Test Environment    Depth map using          Depth map using
                    Nevatia Babu filters     Canny edge detector
Forest              1.1035                   1.0365
Campus              0.6367                   0.5968
Forest & Campus     0.6254                   0.4386
Indoor              1.234                    1.220


Table 2: Comparison of the depth map using the Canny edge detector and the depth map using Nevatia Babu filters based on different parameters

Parameter                Depth map using          Depth map using
                         Nevatia Babu filters     Canny edge detector
Average RMS error        0.899                    0.8229
Set of features          646                      456
Computation time (sec)   44                       24

Figure 3: Original image (Forest)


Figure 3.1: Ground truth depth map (Forest)

Figure 3.2: Depth map using Nevatia Babu filters
(Forest)

Figure 3.3: Depth map using Canny edge detector
(Forest)


Figure 4: Original image (Campus-roadside (combined))

Figure 4.1: Ground truth depth map (Campus-roadside (combined))

Figure 4.2: Depth map using Nevatia Babu filters (Campus-roadside (combined)).


Figure 4.3: Depth map using Canny edge detector (Campus-roadside (combined))

Figure 5: Original image (campus-roadside and forest (combined))

Figure 5.1: Ground truth depth map (campus-roadside and forest (combined))

Figure 5.2: Depth map using Nevatia Babu filters (campus-roadside and forest (combined)).

Figure 5.3: Depth map using Canny edge detector (campus-roadside and forest (combined)).

Figure 6: Original image (indoor)


Figure 6.1: Ground truth depth map (indoor)

Figure 6.2: Depth map using Nevatia Babu filters (indoor).
Figure 6.3: Depth map using Canny edge detector
(indoor).

IV. CONCLUSION

A detailed comparison of our method and the method used in [1] has been carried out in terms of average RMS error, computation time and the set of features used. This comparison covers a range of indoor and outdoor environments. It is found that by using the Canny edge detector for estimating a monocular cue named texture gradient, not only is the set of features reduced, but the depth map is also predicted more accurately as compared to [1].

REFERENCES
[1] Ashutosh Saxena, Sung H. Chung, and Andrew Y. Ng, "Learning depth from single monocular images", in NIPS 18, 2006.
[2] Jeff Michels, Ashutosh Saxena, and A. Y. Ng, "High speed obstacle avoidance using monocular vision", in Proceedings of the Twenty-First International Conference on Machine Learning (ICML), 2005.
[3] D. Scharstein and R. Szeliski, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms", Int'l Journal of Computer Vision, 47:7–42, 2002.
[4] M. Shao, T. Simchony, and R. Chellappa, "New algorithms for reconstruction of a 3-D depth map from one or more images", in Proc. IEEE CVPR, 1988.
[5] S. Das and N. Ahuja, "Performance analysis of stereo, vergence, and focus as depth cues for active vision", IEEE Trans. Pattern Analysis & Machine Intelligence, 17:1213–1219, 1995.
[6] G. Gini and A. Marchi, "Indoor robot navigation with single camera vision", in PRIS, 2002.
[7] J. Canny, "A computational approach to edge detection", IEEE Trans. Pattern Anal. Mach. Intell., vol. 8, no. 6, pp. 679–698, 1986.
