Modeling of buildings with a flat roof in
aerial photographs
T.H.M Derks(0569074)
Contents
1. Summary
2. Introduction
2.1. Literature Overview
2.2. Analysis and discussion
2.3. Conclusions and Recommendations in problem approach
3. System overview
4. Algorithm description
4.1. Segmentation
4.1.1. Watershed
4.1.2. Edge preserving filter
4.2. Region merging
4.2.1. Seed point
4.2.2. Efficient merging implementation
4.2.3. Merge condition
4.2.4. Optimal merging distance
4.3. Polygon simplification
4.3.1. Weighted Graph
4.3.2. Shortest Path algorithm
4.4. Shape detection
4.4.1. Corner penalty function
4.4.2. Optimization cost function
4.5. Building segment merging
4.5.1. Combining segments
5. Experiments and results
5.1. Quality measurement
6. Results
6.1. Detection statistics
6.2. Examples of correct detections
6.3. Examples of incorrect detections
6.3.1. Building to background contrast
6.3.2. Segment combining
6.3.3. Segment finding
6.3.4. Multi-color roofs
6.3.5. Correctness
6.3.6. Shadows / Dark patches
7. Recommendations
7.1. Merging distance
7.2. Graph weight function
7.3. Segments
7.4. Low contrast buildings
7.5. Shadows
7.6. Shape detection optimizer
8. Conclusion
9. References
10. Appendix A: Matlab scripts
11. Appendix B: Detection results
1. Summary
Nowadays, very-high-resolution color aerial images of the Netherlands are captured on an
annual basis. Automated interpretation of aerial images can lead to faster inspection, resulting
in more frequent updating of civil community databases. Existing research work addresses the
localization of buildings with a gable roof. As a significant percentage of buildings have a flat
roof, this system needs to be extended to detect this kind of building. Finding flat-roofed
buildings is a difficult problem since many look-alike objects exist. Therefore this work focuses
on a semi-automatic system, capable of modeling a building given a seed point.
The developed system (Figure 1) uses a region-based detection approach based on
segmentation by the watershed method. A region-merge step is used to merge all the regions
probably belonging to the building at the given seed point. The optimal region-merge distance
is found by exploiting the fact that a building can be represented by a polygon with a low vertex
count. A robust polygon simplification method is employed to convert the building region to a
building model. A shape detection step modifies the angles of the model to values which are
most likely to occur in roof shapes.
Figure 1: System diagram (image and seed point → segmentation → region merge → polygon simplification → shape detection → building model)
The system performs reasonably well, modeling 48% of the buildings in the dataset correctly.
The biggest weakness of the system is partly detected roofs: finding all segments
belonging to a roof is a difficult problem.
2. Introduction
Nowadays, very-high-resolution color aerial images of the Netherlands are captured on an
annual basis, resulting in an accurate and recent view of the country's infrastructure. Due to the
time-consuming nature of manual inspection of the infrastructure, automated interpretation of
aerial images can lead to faster inspection and more frequent updating of civil community
databases.
Existing research work addresses the localization of buildings with a gable roof [1]. As a significant
percentage of buildings have a flat roof, this system needs to be extended to detect this kind
of building. Finding flat-roofed buildings is a difficult problem since many look-alike objects
exist. Therefore, this work focuses on a semi-automatic system, capable of modeling a
building given a point on its roof, without solving the detection problem itself.
First a literature study is done to get an overview of existing implementations for the detection
and modeling of flat roofs. From this overview techniques are selected to be used in the
developed system.
2.1. Literature Overview
To get an overview of existing solutions for the detection of flat roofs a short summary is given
of interesting papers. The papers are chronologically ordered.
C. Lin and R. Nevatia [2] propose a fully automatic building extraction method that is based on
the detection of edges in the image. It is assumed that the searched rectangular buildings can
be found by finding parallelograms in the image. The edges are taken as building hypothesis
and classified by use of a feature vector and additional features like shadow.
S. Müller and D. W. Zaum [3] deal with the roof detection problem by starting with a seeded
region growing algorithm to segment the entire image. Then photometric and geometric
features are calculated for all regions. A numerical classification based on these features is
performed to differentiate between building and non-building regions to detect roof tops.
The automated building-extraction strategy by X. Jin and C. H. Davis [4] uses structural,
contextual, and spectral information to extract buildings. A series of geodesic opening and
closing operations is used to build a differential morphological profile (DMP) that provides
image structural information. Building hypotheses are generated and verified through shape
analysis applied to the DMP. Shadows are extracted using the DMP to provide reliable
contextual information to hypothesize position and size of adjacent buildings. Building shapes
were reconstructed by starting a region-merge algorithm on candidate buildings on a
watershed segmented image.
L. Hai-yue et al. [5] created an automatic building extraction system that uses a potential
clustering function to segment the image. The Hough transform is used on the region contour
to detect the most dominant line. Building regions are verified by judging the length of this line.
A grid matching method is used to map the target buildings areas into regular polygons.
Y. Wei et al. [6] established a semi-automatic building rooftop extraction method applied on
high resolution satellite imagery. Two different segmentation methods are used to create
building regions. The first segmentation method is a seeded region growing algorithm that
merges pixels into a building region. The second segmentation method is the mean-shift which
is applied in the target building area. A model matching technique based on node graph search
is used to convert the found regions to the correct shape of the rooftop.
B. Sirmac and C. Unsalan [7] proposed a color invariant that detects roofs based on the color
red. A different color invariant is used to detect shadows in the image. The illumination angle of
the image is calculated from the shadow of detected red roofs. Roofs with different colors can
now be detected solely based on the shadow. The shape of the building is determined with a
box fitting method.
The detection method by A. Katartzis and H. Sahli [8] is based on a stochastic image
interpretation model. Rooftop hypotheses are extracted using a contour based grouping
hierarchy that emanates from the principles of perceptual organization. A Markov random field
model is used to describe dependencies between all available hypotheses. The hypothesis
verification step is treated as a stochastic optimization process that operates on the whole
grouping hierarchy to find its optimum configuration for the interacting group hypothesis.
Z. Liu et al. [9] constructed a general semi-automatic rooftop extraction method using high
resolution satellite imagery. A seeded region growth segmentation or localized multi-scale
object oriented segmentation is applied to extract small and simple rectilinear rooftops from their
background. Model matching techniques based on node graph search are used for finding the
correct building rooftop shape.
K. Karantzalos and N. Paragios [10] established a recognition-driven variational framework for
fully automatic building extraction from aerial photos. Competing building shape priors are
considered and used in building extraction by using the prior models in the segmentation
process.
M. Kabolizade et al. [11] proposed a boundary extraction method based on a GVF snake model.
This method has the advantage of integrating edge-based and region-based snakes by
minimizing internal and external energy forces. A genetic algorithm has been used to optimize
the parameters of the snake model.
E. Pakizeh and M. Palhang [12] presented an approach for building detection using Hough
transform and intensity information. Building locations are first detected with the use of
intensity information. Morphological operations are applied to filter out small non-building
regions. The Hough transform is used to verify the existence of buildings on candidate regions.
The proposed method by M. Izadi and P. Saeedi [13] incorporates a hierarchical multilayer
feature based image segmentation technique using color. A number of geometrical or regional
attributes are defined to identify potential regions in multiple layers of segmented images. A
tree-based mechanism is utilized to search for regions that maximize a set of rooftop definition
measures. Candidate regions are verified through shadow evidence.
2.2. Analysis and discussion
Flat roofs have numerous properties that can be used in the detection process. Most detection
systems employ more than one property. By using multiple properties, more information is
considered, which results in more accurate output. The following properties are often
employed:
• Color (both value and homogeneity)
• Features (e.g. lines, corners)
• Texture
• Size
• Shadows
• Height information
Not all properties can be used at the same time in a single detection step. Most algorithms
apply multiple detection steps that use one building property at a time to improve results.
Algorithms start with the property that, according to the makers, gives the most information.
The extraction techniques listed previously can be classified into two main categories.
• Region based (e.g. color homogeneity)
• Feature based (e.g. lines and corners)
The first category is a region based extraction method. Color homogeneity of the building roof
is used to find building contours. A region growing method is employed to find a building by
merging similar pixels into an area. Another option is to apply a segmentation algorithm like
mean-shift or watershed to segment the image first. A region-merging algorithm is applied to
get the total building region from the segmented image. The result of a region based detection
step is the contour of a region. There are several ways to convert the building contour into a
building model. Possibilities for this conversion are polygon simplification or approximation and
box fitting or prior shape matching techniques.
The second category is based on features. Lines or corners (or both) are detected in the image.
A popular method to detect lines in an image is the Hough transform. Detected lines can be
combined by detecting intersections. By forming a closed contour with detected lines possible
building polygons are formed.
For both detection categories, other properties are used to refine the detection results. Shape,
size and shadows are used as constraints or probability models. This helps to reduce the
number of false positives or assist in the detection of difficult cases.
2.3. Conclusions and Recommendations in problem approach
An overview of existing papers was given to get an idea of the techniques applied in detecting
buildings in images. This resulted in a list of building properties that can be employed for detection. All
detection techniques have their weaknesses, associated with the properties that are used. The
best results were accomplished by combining building properties in the detection process.
Since almost all papers could be categorized as either a region based or a feature based method, the starting
point of the new system will be based on one of these two.
The remainder of this report describes the system that was developed. The system overview
can be found in Section 3, and each module is explained in detail in Section 4. Section 5
explains the testing setup and contains the quality measurement definitions for evaluation of
the results. Section 6 evaluates the performance of the system with the quality measurement
statistics. It also shows examples of correct and incorrect detections to get insight in the
practical limitations and possibilities of the system. Section 7 gives recommendations to
improve the results of the system. Section 8 presents the conclusion of the report.
3. System overview
A system is developed to find flat roof buildings in a high resolution aerial image, given a seed
point. The system consists of four modules:
1. Segmentation
2. Region-merge
3. Polygon simplification
4. Shape detection
The system overview is shown in Figure 2. The first step is the image segmentation process. The
whole image is converted into segments by applying the watershed method. The result is a
heavily oversegmented image, where individual buildings consist of multiple regions that need
to be merged together.
The region-merging step starts at the given building seed point. The region-merge step tries to
merge all the building segments probably belonging to the building without including any non-
building regions. Adjacent regions are merged based on the mean region color. Multiple
merging distances are evaluated and the best merging distance is automatically selected based
on a cost function applied to the result.
Figure 2: System block diagram (image and seed point → segmentation → region merge → polygon simplification → shape detection → building model)
The resulting region is converted into a contour. This contour contains all the boundary points
of the region, which makes it a polygon with a high vertex count. The contour is transformed
into a building model by polygon simplification. To find the best low vertex count
approximation of the contour, a weighted graph is constructed that interconnects all the
contour points with each other. A weight is assigned to each edge, based on the quality of the
approximation between those points. Dijkstra’s shortest path algorithm is used to select the
best simplified polygon by calculating the shortest path for a cycle through the graph.
The last step of the system is the shape detection. Prior knowledge about building corners is
used to improve the corners of the building model. A corner penalty function is constructed
that favors perpendicular and 45-degree angles, since these angles occur most frequently. An
optimization function is used to minimize the corner penalties without deviating too much from
the initial building contour.
4. Algorithm description
This chapter explains the system implementation. A separate section is dedicated to each of the
four modules of the system. The fifth section explains the segment finding feature which
resides in the region merge module.
4.1. Segmentation
The first step of the system segments the image into regions. The regions need to be as large as
possible without introducing regions that are located only partially inside the building contour.
The watershed method is used to segment the whole image. This method requires a gradient
map as input. Before calculating the gradient map, an optional edge preserving smoothing filter
can be applied. Pre-filtering the image with an edge preserving filter reduces the number of
regions in the segmentation output without deteriorating the gradient map at the straight
building edges. A block diagram of the segmentation step is shown in Figure 3. For all the
regions, statistics like the mean color and size are calculated to be used in the region merging
step.
Figure 3: Segmentation block diagram (image → optional edge preserving filter → gradient map → watershed → segmented image)
4.1.1. Watershed
A gradient map of the image is required for the watershed algorithm. For the gradient map, a
Sobel operator is applied horizontally and vertically on the grey scale intensity image. The Sobel
operator has been selected because it is less sensitive to noise than the Roberts operator and
has a more isotropic response than the Prewitt. An isotropic response is uniform in all
directions, which is desired because the building orientation is unknown. The gradient
magnitude is the Euclidean norm of the horizontal and vertical responses of the Sobel
operator. Minima that are too shallow are suppressed to reduce oversegmentation. The
suppression depth is determined empirically as the maximum value that does not merge non-building
with building regions in an image with very low contrast between building and environment.
This is important because a region that mixes building and non-building sections cannot be
split again, whereas oversegmented regions can still be merged in the next step.
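The gradient map described above can be sketched in a few lines of Python. This is only an illustration, not the Matlab implementation of the report: the function name `sobel_gradient_magnitude` and the clamped border handling are assumptions, and the suppression of shallow minima is not shown.

```python
import math

# 3x3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_gradient_magnitude(gray):
    """Return the gradient magnitude of a 2-D list of grey values.

    Border pixels are handled by clamping coordinates to the image
    (an assumption; the report does not state its border policy).
    """
    height, width = len(gray), len(gray[0])

    def pixel(r, c):
        r = min(max(r, 0), height - 1)
        c = min(max(c, 0), width - 1)
        return gray[r][c]

    magnitude = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            gx = gy = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    value = pixel(r + dr, c + dc)
                    gx += SOBEL_X[dr + 1][dc + 1] * value
                    gy += SOBEL_Y[dr + 1][dc + 1] * value
            # Euclidean norm of the horizontal and vertical responses.
            magnitude[r][c] = math.hypot(gx, gy)
    return magnitude
```

On an image with a single vertical step edge, the magnitude is non-zero only in the two pixel columns next to the step, which is the ridge the watershed method later uses as a region boundary.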
4.1.2. Edge preserving filter
Before calculating the gradient map, a smoothing filter can be applied to reduce the number of
regions of the watershed result. However, a normal smoothing filter also degrades the building
edges, which results in building and non-building areas in one region. A bilateral filter [14]
smooths an image while preserving the strong edges of the buildings. This results in a less over-
segmented image that still separates building from non-building regions. Figure 4 displays a
comparison of the watershed segmentation result based on the average region color with and
without bilateral filtering applied. The filtered case has fewer regions, while there are additional
regions for straight edges. This can be seen in the brown area below the right part of the
building.
Figure 4: Result of watershed method without (left) and with bilateral filtering (right).
4.2. Region merging
The watershed method results in an oversegmented image. The edge preserving filter and the
local minimum constraint reduced the oversegmentation, but similarly colored regions still
need to be merged to get the building contour. The seed point determines the start of the
region merge algorithm. The initial region is expanded by merging adjacent regions based on
color similarity. As the color variation on the roof and the contrast between building and
surrounding varies, multiple merging distances are evaluated and the best merging distance is
automatically selected based on a cost function applied to the result.
4.2.1. Seed point
The region merging algorithm starts with a region selected by the seed point. Simply picking the
region which contains the seed point can give an improper start region. The seed point may
select a small region inside the building with a different color than the roof, as shown in Figure 5.
Therefore, a different method is applied to select the start region. The region that has the most
pixels inside a 60 pixel radius from the seed point is selected as the start region. Because the
biggest region inside the circle is selected, the chance of selecting an incorrect start region is
reduced.
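A minimal sketch of this start-region selection, assuming the segmentation is available as a 2-D list of integer region labels without separate boundary pixels. The function name `select_start_region` is hypothetical; the 60-pixel default radius is the value stated above.

```python
from collections import Counter


def select_start_region(labels, seed_row, seed_col, radius=60):
    """Return the label covering the most pixels within `radius` of the seed.

    `labels` is a 2-D list of integer region labels (for example the
    watershed output). Only pixels inside the image are counted.
    """
    height, width = len(labels), len(labels[0])
    counts = Counter()
    for r in range(max(0, seed_row - radius), min(height, seed_row + radius + 1)):
        for c in range(max(0, seed_col - radius), min(width, seed_col + radius + 1)):
            # Keep only pixels inside the circle around the seed point.
            if (r - seed_row) ** 2 + (c - seed_col) ** 2 <= radius ** 2:
                counts[labels[r][c]] += 1
    return counts.most_common(1)[0][0]
```

With this rule a seed point that lands on a small, differently colored patch still selects the surrounding roof region, as long as that region dominates the circle.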
4.2.2. Efficient merging implementation
When a region is merged into the building region, the adjacent regions of this region need to
be considered for merging as well. Accessing the watershed image contour to find neighbors
each time a region is merged is inefficient. A region adjacency graph (RAG) is employed to
identify the adjacent regions for all watershed regions. An array of lists is created, where each
element is linked to the list of neighbors of that region. These lists can be calculated efficiently
for the whole image, eliminating the need to search the watershed image for neighbors when a
region is merged. Another advantage is that the neighbors of a region only need to be looked up
once when multiple merging distances are evaluated.
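The neighbor lists can be built in a single pass over the label image. The sketch below is an illustration under assumptions: it uses a dictionary of sets instead of an array of lists, 4-connectivity, and a label image without separate boundary pixels. The name `build_region_adjacency` is hypothetical.

```python
def build_region_adjacency(labels):
    """Map every region label to the set of labels it touches (4-connectivity)."""
    height, width = len(labels), len(labels[0])
    neighbours = {}
    for r in range(height):
        for c in range(width):
            here = labels[r][c]
            neighbours.setdefault(here, set())
            # Looking only right and down visits every adjacent pixel pair once.
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < height and nc < width:
                    other = labels[nr][nc]
                    if other != here:
                        neighbours[here].add(other)
                        neighbours.setdefault(other, set()).add(here)
    return neighbours
```

Because the structure is computed once for the whole image, it can be reused unchanged for every merging distance that is evaluated.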
4.2.3. Merge condition
The mean color of a region is used as the merging condition. When the color difference to an
adjacent region is below a threshold distance it is merged into the building region. The Lab
color space has been used because it is perceptually uniform. In this color space, equally
perceived color differences correspond to approximately equal Euclidean distances between the color components.
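The merge condition can be sketched as a breadth-first traversal of the region adjacency graph. This is an illustration only: the function name `merge_regions` is hypothetical, and the choice to compare a candidate with the already merged region it touches (rather than with the start region or a running mean) is an assumption, since the text does not specify the reference color.

```python
import math
from collections import deque


def merge_regions(start, neighbours, mean_lab, max_distance):
    """Grow a building region from `start` over a region adjacency graph.

    neighbours:   dict label -> iterable of adjacent labels
    mean_lab:     dict label -> (L, a, b) mean colour of the region
    max_distance: merging distance (Euclidean distance in Lab space)

    Assumption: a candidate is compared with the already merged region
    it touches; the report does not say which reference colour is used.
    """
    merged = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for candidate in neighbours[current]:
            if candidate in merged:
                continue
            if math.dist(mean_lab[current], mean_lab[candidate]) < max_distance:
                merged.add(candidate)
                queue.append(candidate)
    return merged
```

Calling the function with increasing values of `max_distance` yields the nested sequence of candidate building regions from which the best merging distance is chosen.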
4.2.4. Optimal merging distance
The optimal merging distance depends on the color contrast at the border regions of the
building and the color variation of the regions that belong to it. The distance should be high
enough to merge all the building regions but low enough not to include any surrounding
regions. Multiple merging distances are evaluated and the best distance is selected by
evaluating the result with a cost function. A building shape can be represented by a low vertex
count polygon. The optimal merging distance is selected using this property.
Figure 5: Regions that cannot be used as seed point
A boundary tracing algorithm converts the region merge area into a polygon. This polygon is
simplified by a polygon simplification method (Douglas-Peucker). The first component of the
cost function is the mean contour difference. This is the area between the original and the
simplified contour divided by the contour length. For a good merging distance choice this
difference is low, because a building can be accurately represented by a polygon with a low
vertex count. When surrounding non-building regions are merged, the area difference
between the two contours will be higher, since the simplified polygon does not represent the
building anymore. The Douglas-Peucker polygon simplification method varies in the number of
vertices for differently shaped contours. Additional vertices will always improve the mean
contour difference. Therefore a cost per vertex is added as the second component. The third
component is the ratio of the initial contour length to the simplified contour length. The
simplified contour length is always lower than the initial contour length. When the simplified
contour is a good approximation of the original contour the ratio will be smaller. The last
component subtracts a portion of the merging distance to favor a bigger merging distance
slightly, as this favors larger building contours over building subsections with a low vertex count
shape.
$$C = \frac{\text{AreaDifference}}{\text{ContourLength}} + a \cdot \text{DpVertices} + b \cdot \frac{\text{Points}}{\text{DpLength}} - c \cdot \text{MergeDistance}$$ (Formula 1)
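To make the selection step concrete, Formula 1 can be evaluated for each candidate merging distance and the distance with the lowest cost kept. The sketch below assumes the four measurements per candidate are already available. The function names and the default weights `a`, `b` and `c` are placeholders, not values from this report.

```python
def merge_cost(area_difference, contour_length, dp_vertices, points, dp_length,
               merge_distance, a=1.0, b=1.0, c=0.1):
    """Evaluate Formula 1 for one merging distance.

    area_difference: area between original and simplified contour
    contour_length:  length of the original contour
    dp_vertices:     vertex count of the Douglas-Peucker result
    points:          number of boundary points of the original contour
    dp_length:       length of the simplified contour
    a, b, c:         placeholder weights, NOT values from the report
    """
    return (area_difference / contour_length
            + a * dp_vertices
            + b * points / dp_length
            - c * merge_distance)


def best_merge_distance(candidates, **weights):
    """Return the merging distance with the lowest cost.

    candidates: dict merge_distance ->
        (area_difference, contour_length, dp_vertices, points, dp_length)
    """
    return min(candidates,
               key=lambda d: merge_cost(*candidates[d], merge_distance=d, **weights))
```

A candidate that has leaked into the surroundings typically shows a larger area difference and more vertices, so its cost rises even though the subtracted merging distance term is larger.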
4.2.4.1. Douglas-Peucker
The Douglas-Peucker algorithm aims at the reduction of the number of vertices of a polygon. It
starts with the start and end point of the polygon. It then iteratively adds the point which is the
furthest away from the polygon approximation until all points are within a tolerance distance.
An example of this process is shown in Figure 6. The approximation of a closed polygon induces
one problem. The closed contour is opened at a random point on the polygon to apply the
algorithm. This start/end point will always be in the polygon approximation. Therefore the result
of the Douglas-Peucker simplification is opened at a different point in the contour and the
algorithm is applied again. This removes the start/end point of the first simplification if it lies
within the tolerance distance.
Figure 6: Douglas-Peucker algorithm steps
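The procedure, including the second pass for closed contours, can be sketched as follows. This is a recursive textbook formulation rather than the implementation used in this work. The helper names are hypothetical, and reopening the contour halfway along the first result is an assumption about how the "different point" is chosen.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: happens when a closed contour is opened.
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Simplify an open polyline; the first and last point are always kept."""
    if len(points) < 3:
        return list(points)
    distances = [point_segment_distance(p, points[0], points[-1])
                 for p in points[1:-1]]
    worst = max(range(len(distances)), key=distances.__getitem__)
    if distances[worst] <= tolerance:
        return [points[0], points[-1]]
    split = worst + 1  # index of the furthest point in `points`
    left = douglas_peucker(points[:split + 1], tolerance)
    right = douglas_peucker(points[split:], tolerance)
    return left[:-1] + right


def simplify_closed(contour, tolerance):
    """Two-pass simplification of a closed contour (list of distinct points).

    Pass 1 opens the contour at index 0. Pass 2 reopens the result at a
    different kept vertex so the arbitrary first start/end point can go.
    """
    first = douglas_peucker(contour + [contour[0]], tolerance)[:-1]
    if len(first) < 3:
        return first
    shift = len(first) // 2
    rotated = first[shift:] + first[:shift]
    return douglas_peucker(rotated + [rotated[0]], tolerance)[:-1]
```

For a square contour that starts in the middle of an edge, the first pass keeps that midpoint as a fifth vertex and the second pass removes it, leaving the four corners.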
4.3. Polygon simplification
The resulting building region mask should be converted into a building model. The first step
is a boundary tracing algorithm to convert the region mask into a polygon. This polygon
contains all the boundary points of the region mask. The high vertex count polygon should be
simplified into a low vertex count building model. The Douglas-Peucker algorithm is not suitable
for this simplification. Sometimes non-building regions are merged into the building mask, since
this is the best possible result the region merging step could find. The Douglas-Peucker algorithm
is very sensitive to wrongly merged regions, since it always adds a vertex for a point which lies
too far from the existing simplification. A more robust simplification is required that ignores
outliers caused by wrongly merged regions. A good way to accomplish this is to find a
simplified polygon that minimizes the area between the original and the simplified contour.
A shortest path method similar to the one described in [15] is employed to implement this
robust polygon simplification step. A directed weighted graph is constructed with vertices for all
the points on the polygon. All vertices are connected by edges whose weight reflects the area
between the corresponding straight line segment and the original polygon. The exact function used to calculate the
weights is explained in the next subsection. By calculating the shortest path for a cycle in the
graph, the simplified polygon can be constructed. Figure 7 shows an example of the Douglas-
Peucker (green) and the shortest path (red) algorithm.
Figure 7: Building detection result with the boundary points of region merge (blue), Douglas-Peucker (green) and shortest
path (yellow) algorithm
4.3.1. Weighted Graph
A weighted graph is used to generate a better polygon simplification for the building region
than the Douglas-Peucker algorithm. All boundary points of the building are added as vertices.
The weight function explained in the next subsection is applied to assign values to the edges
between all the vertices. This weight function is very important, since it determines the criteria
the shortest path algorithm will use to simplify the polygon.
4.3.1.1. Weight function
The weight function defines the criteria used to get a robust simplification of the building
contour. The edge weight between two vertices in the graph is calculated using the following
function:
$$W(i, i+k) = \text{average\_error\_distance} + \text{vertex\_cost} + g \cdot \text{ContourPoints}$$ (Formula 2)
Figure 9: Calculation of average error distance
The first component of the weight function is the average distance from all intermediate points to
the straight line interpolated between the two end points (Figure 9). This results in the following formula for the
average error distance between a point A and a point B with k intermediate points:
$$\text{average\_error\_distance} = \frac{1}{k} \sum_{i=1}^{k} D_i$$
To calculate the distance d between the line and the point (Figure 8), first the line equation
through the points P1 and P2 is calculated:
$$A = P_1 = (x_1, y_1), \quad B = P_2 = (x_2, y_2)$$
$$a \cdot x + b \cdot y + c = 0$$
With the variables a, b and c defined as:
$$a = y_2 - y_1$$
$$b = x_1 - x_2$$
$$c = -(a \cdot x_2 + b \cdot y_2)$$
Then the distance between the point 𝑃0 = ( 𝑥0, 𝑦0) and the line is given by:
$$\text{distance} = \frac{|a \cdot x_0 + b \cdot y_0 + c|}{\sqrt{a^2 + b^2}}$$ (Formula 3)
Figure 8: Distance between point and a line
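The distance computation and the resulting average error distance can be checked with a short sketch. The function names are hypothetical, and the two end points are assumed to be distinct so that the line is well defined.

```python
import math


def line_through(p1, p2):
    """Coefficients (a, b, c) of a*x + b*y + c = 0 through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    a = y2 - y1
    b = x1 - x2
    c = -(a * x2 + b * y2)  # chosen so that both end points satisfy the equation
    return a, b, c


def point_line_distance(p0, p1, p2):
    """Perpendicular distance from p0 to the line through p1 and p2."""
    a, b, c = line_through(p1, p2)
    x0, y0 = p0
    return abs(a * x0 + b * y0 + c) / math.hypot(a, b)


def average_error_distance(contour, i, j):
    """Mean distance of the contour points strictly between index i and j
    to the straight line from contour[i] to contour[j]."""
    between = contour[i + 1:j]
    if not between:
        return 0.0
    total = sum(point_line_distance(p, contour[i], contour[j]) for p in between)
    return total / len(between)
```

For the contour (0,0), (1,1), (2,2), (3,1), (4,0), approximating the whole run by the line from the first to the last point gives distances 1, 2 and 1 for the three intermediate points, so the average error distance is 4/3.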
The second component of the weight function is the vertex cost. Adding an additional edge will
always decrease the average distance, therefore a penalty for the vertex is added. The third
component of the weight function adds another contribution to this edge cost, proportional to
the number of contour points. For a large building, error distances are generally bigger. The
penalty for a vertex therefore needs to be increased to
prevent the simplification from using more vertices for large buildings.
To increase the speed of the algorithm, some restrictions are used to prevent calculating the
weight between every single point of the polygon. When the total error distance, which can be
regarded as an area, is above one eighth of the building area and the mean distance is above
ten pixels, the vertex weight is not calculated. This does not have an effect on the result of the
algorithm, but prevents calculation of vertex weights that will never appear in the shortest
path. Furthermore, only vertices with an average line error below four are added to save
calculation time in the shortest path algorithm.
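A minimal Python sketch of Formula 2 combined with the last pruning rule is given below. Several things here are assumptions made for illustration: ContourPoints is read as the number of contour points of the building, the parameter names are hypothetical, and a pruned edge is signalled by returning None. The report does not state values for the vertex cost or for g in this section, so none are filled in.

```python
def edge_weight(avg_error, vertex_cost, g, n_contour_points, max_avg_error=4.0):
    """Weight of one graph edge following Formula 2, or None when the edge is pruned.

    avg_error        -- average error distance of the skipped contour points
    vertex_cost      -- fixed penalty per vertex (value not given in the text)
    g                -- scale factor for the size-dependent penalty (assumed name)
    n_contour_points -- number of contour points of the building (assumed meaning)
    max_avg_error    -- edges with an average error of this value or more are dropped
    """
    if avg_error >= max_avg_error:
        return None  # edge is never added to the graph
    return avg_error + vertex_cost + g * n_contour_points
```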
4.3.2. Shortest Path algorithm
Now that a weight is assigned to each edge of the directed graph, the best simplification is
determined by calculating the shortest path through the directed graph. First, the shortest
distance between all points is calculated for the directed graph. Then, the minimum distance
between each point and its preceding point is calculated. Since the graph is directed, a path
around the whole contour is forced. Since the end point lies only one pixel away from the start
point, these two points are merged to create the simplified polygon.
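The idea of a forced path around the contour can be sketched in Python as a dynamic program over the directed graph. This is a simplified sketch under stated assumptions, not the method of this report: it fixes the start vertex at index 0 instead of searching over all points, the function name is hypothetical, and `weight(i, j)` is an assumed callback that returns the edge weight or None when the edge is not in the graph.

```python
def simplify_closed_contour(n, weight):
    """Shortest path from vertex 0 once around a closed contour of n points.

    Index n is an alias for vertex 0, so every path is forced around the
    whole contour. Returns the kept vertex indices, without the duplicate
    end point.
    """
    inf = float("inf")
    best = [inf] * (n + 1)   # best[j]: cheapest cost to reach vertex j
    prev = [-1] * (n + 1)    # prev[j]: predecessor of j on that path
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(j):   # edges only point forward along the contour
            if best[i] == inf:
                continue
            w = weight(i, j)
            if w is None:
                continue
            if best[i] + w < best[j]:
                best[j] = best[i] + w
                prev[j] = i
    if best[n] == inf:
        raise ValueError("no path around the contour")
    path = []
    j = n
    while j != 0:            # walk back from the end point to the start
        j = prev[j]
        path.append(j)
    path.reverse()
    return path
```

Because all edges point forward along the contour, the graph is acyclic and a single pass in index order is sufficient; no general-purpose shortest path routine is needed for this simplified variant.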
4.4. Shape detection
Knowledge about building shapes can be employed to further enhance the detected building
shape [2]. Building footprints can have lots of different shapes. However, certain angles are far
more likely than others. The corners in buildings are usually right angles or have a 45° angle. This
prior information about building corners can be applied to correct angles in the detected
building. Corner angles that lie close to a minimum of the corner penalty function will be
shifted to the minimum. A cost function is constructed that has two components (Formula 4).
The first component is the average error distance to all the building contour points, also used in
the weight function of Section 4.3.1.1. The second component is the sum of the penalties of all the
corners of the polygon. The coordinates of vertices are varied to find an optimum for the given
cost function.
$H = \text{average\_error} + b \sum_{i} f(\alpha_i)$ (Formula 4)
4.4.1. Corner penalty function
The corner penalty function defines the penalty for each corner angle. The corner penalty
function defined in [15] is applied and is shown in Figure 10. Since a square angle is preferred, it
has zero penalty. Angles of 180 degrees also have zero penalty, since this eliminates falsely
detected corners. Angles of 45 and 135 degrees are also favored, so they have lower penalties
than the angles nearby.
Figure 10: Corner penalty function
4.4.2. Optimization cost function
To find an optimum for the cost function, the Matlab optimizer fmincon is applied. The cost
function is minimized with the coordinates of the vertices as optimization variables. All the
angles will be shifted towards 90, 45 and 135 degrees as long as the average error distance to all
the contour points doesn’t become too large. The weighting factor b determines the balance
between the penalty costs for corners and the deviation from the contour points. Good results
are reached with a factor of 20.
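Evaluating the cost function of Formula 4 for a given polygon can be sketched in Python as shown below. The sketch only evaluates H; the optimization over the vertex coordinates is not shown. The function names are hypothetical, and `example_penalty` is an illustrative stand-in that is zero at 90 and 180 degrees only. It is not the penalty function of [15] and it does not model the lower penalties at 45 and 135 degrees.

```python
import math

def corner_angles(polygon):
    """Angle in degrees (0 to 180) at each vertex of a closed polygon."""
    n = len(polygon)
    angles = []
    for i in range(n):
        px, py = polygon[i - 1]          # previous vertex (wraps around)
        cx, cy = polygon[i]              # current corner
        nx, ny = polygon[(i + 1) % n]    # next vertex (wraps around)
        v1 = (px - cx, py - cy)
        v2 = (nx - cx, ny - cy)
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        angles.append(math.degrees(math.atan2(abs(cross), dot)))
    return angles

def example_penalty(angle_deg):
    """Illustrative penalty (assumption): 0 at 90 and 180 degrees, capped at 1."""
    nearest = min(abs(angle_deg - 90.0), abs(angle_deg - 180.0))
    return min(1.0, nearest / 45.0)

def shape_cost(polygon, average_error, b=20.0, penalty=example_penalty):
    """H = average_error + b * sum of corner penalties (Formula 4)."""
    return average_error + b * sum(penalty(a) for a in corner_angles(polygon))
```

The default b = 20 follows the value quoted above; any other penalty function can be passed through the `penalty` argument.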
[Figure 10 plot: penalty (0 to 1) versus angle in degrees (0 to 180)]
4.5. Building segment merging
Some buildings consist of multiple segments. The region merging algorithm only detects a
single segment since the region merging stops at the edge of a segment. The whole building
should be detected, so a method is needed to find the other building segments belonging to the
same building.
First, a contour is made at a fixed distance d from the region contour found in the first region
merging step; this is the cyan contour in Figure 11. All regions that occur more than r times in
the contour and have a close color similarity to the first segment are considered a new start
region. The Euclidean distance in Lab color space is used as the metric for color similarity, with a
maximum distance of 7. These restrictions limit the number of starting regions, since the
algorithm will try the region-merging step on all these regions. After localization of the optimum
merging distance for a given start region, the shape of the region is evaluated with a corner
penalty function similar to the one used in Section 4.4.1 (Figure 10). If the average corner
penalty is low enough, the segment is added to the building footprint. This check is added to
only add segments which are very likely to be a building segment.
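The start-region test described above can be sketched in Python as follows. The function names are hypothetical, colors are assumed to be (L, a, b) tuples holding the mean color of a region, and the value of r is left as a parameter because it is not given in the text.

```python
import math

MAX_LAB_DISTANCE = 7.0  # maximum color distance quoted in the text

def lab_distance(lab1, lab2):
    """Euclidean distance between two colors in Lab color space."""
    return math.dist(lab1, lab2)

def is_candidate_start_region(region_lab, first_segment_lab, occurrences, r):
    """True when a region qualifies as a new start region.

    occurrences -- how often the region occurs in the offset contour
    r           -- minimum occurrence count that must be exceeded (not given in the text)
    """
    close_in_color = lab_distance(region_lab, first_segment_lab) <= MAX_LAB_DISTANCE
    return occurrences > r and close_in_color
```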
Both the maximum color distance and the average corner penalty are set very strictly to prevent
surroundings from being detected as a building segment. The ability to detect building segments
without false positives based solely on shape and color is very limited. Building segments can differ in color and
can have some corners with a high penalty cost. Height estimation would be a great addition to
find other building segments, since they are always higher than the surroundings. This would
give an extra constraint, so that the current ones can be loosened.
Figure 11: Segment finding (O initial seed point, + seed point of segment)
4.5.1. Combining segments
When multiple segments are obtained, the areas of the region-merge results are combined
with a morphological closing operation. This is an operation that employs dilation and
erosion to merge two segments which are close to each other. Since the orientation of the
building on the image is random, a circular element is used for the operation. In Figure 12 both
operations are shown. When two areas are close together they will be connected by the
dilation operation. The erosion operation will shrink the area back to the original size.
Figure 12: Morphological dilation (left) and erosion (right) with a circular element (input blue, output cyan )
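The closing operation can be sketched in plain Python on a binary mask stored as a list of rows. This is an illustrative sketch, not the implementation of this report: the function names and the integer `radius` parameter are assumptions, and pixels outside the image are treated as background during erosion.

```python
def disk_offsets(radius):
    """Offsets (dy, dx) of a circular structuring element with the given radius."""
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= radius * radius]

def dilate(mask, radius):
    """Set every pixel that lies within the disk around any set pixel."""
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for dy, dx in disk_offsets(radius):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        out[yy][xx] = True
    return out

def erode(mask, radius):
    """Keep a pixel only when the whole disk around it is set and inside the image."""
    h, w = len(mask), len(mask[0])
    offsets = disk_offsets(radius)
    return [[all(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                 for dy, dx in offsets)
             for x in range(w)]
            for y in range(h)]

def closing(mask, radius):
    """Morphological closing: dilation followed by erosion with the same disk."""
    return erode(dilate(mask, radius), radius)
```

With a small radius, two areas separated by a narrow gap are bridged while the outer outline returns to its original size after the erosion.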
5. Experiments and results
The performance of the detection algorithm is evaluated. A test set of 81 buildings from 11
aerial photos is used to benchmark the algorithm. For all buildings, the footprint is entered
manually as a reference. The centroid of the reference footprint is used as the seed point in the
detection algorithm, because the result is expected to be independent of the chosen seed point
as long as the building is not only partially found. The detection algorithm is run for the complete data
set and evaluated with the quality measurements given in Section 5.1. When a building has a
low quality measurement, the result is inspected graphically to determine the cause of the
weaknesses in the detection algorithm.
5.1. Quality measurement
A quality measurement is needed to evaluate the performance of the detection algorithm. The
result of the algorithm is compared with a manually entered reference footprint of the building.
The extracted building and the reference footprint are compared pixel-by-pixel and categorized
into four types [4]:
• True positive (TP). Both the manual and the automated method label the pixel as belonging to
the building.
• True negative (TN). Both the manual and the automated method label the pixel as belonging to
the background.
• False positive (FP). The automated method incorrectly labels the pixel as belonging to a
building.
• False negative (FN). The automated method incorrectly labels a pixel that truly belongs to a
building as background.
The total number of pixels in each category is determined for the building. With these
numbers the following quality measurements can be calculated:
• Completeness: $\frac{TP}{TP + FN}$
• Correctness: $\frac{TP}{TP + FP}$
• Quality: $\frac{TP}{TP + FP + FN}$
The completeness measurement gives the fraction of the reference building pixels that is detected
by the algorithm. The correctness measurement gives the fraction of detected pixels which were
correctly denoted as building pixels. The quality measurement is the best overall performance
evaluation method. To get a high quality measurement the algorithm must correctly label every
building pixel, without mislabeling any background pixels. The completeness and correctness
show whether the false positives or the false negatives have a bigger influence on the quality.
This is important to be able to determine the weakest aspect of the detection algorithm.
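The three measurements can be computed from two binary masks as in the following Python sketch. The function name and the mask representation (lists of rows with truthy values for building pixels) are assumptions made for illustration, as is returning 0.0 when a denominator is zero.

```python
def detection_scores(detected, reference):
    """Completeness, correctness and quality from two equally sized binary masks."""
    tp = fp = fn = 0
    for det_row, ref_row in zip(detected, reference):
        for d, r in zip(det_row, ref_row):
            if d and r:
                tp += 1      # building in both masks
            elif d:
                fp += 1      # detected, but background in the reference
            elif r:
                fn += 1      # building in the reference, but missed
    return {
        "completeness": tp / (tp + fn) if tp + fn else 0.0,
        "correctness": tp / (tp + fp) if tp + fp else 0.0,
        "quality": tp / (tp + fp + fn) if tp + fp + fn else 0.0,
    }
```

True negatives do not appear in any of the three ratios, so large background areas do not inflate the scores.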
6. Results
The detection algorithm was run on the building dataset of 81 buildings. The detection results
for all the individual buildings are listed in Appendix B. To get an overview of the overall
performance, the detection statistics are shown as probability density functions. Afterwards, we
will show graphical examples of correctly detected buildings. Finally, examples of incorrectly
detected buildings will give an overview of the problems encountered.
6.1. Detection statistics
The probability density functions of the correctness, completeness and quality are used to get
an overview of the detection results of the system. It follows from Figure 13 that the
correctness of the system is very high. For 94 percent of the buildings, the correctness factor is
above 95 percent. This means that the percentage of marked building pixels which were actual
building pixels is very high. However, a high correctness says little about the detection when the
completeness factor is low.
Figure 13: Probability density function of the correctness
The completeness distribution shown in Figure 14 is more widely spread. This means that for a
lot of buildings not all the building pixels were correctly labeled as such. The completeness
factors below eighty percent can be explained by the fact that some buildings are only partly
detected. Other reasons for a low completeness are shadows, low building contrast or
discolorations on the roof. Examples of all these causes are shown in Section 6.3. Even though
there are problems for the completeness, a big part of the distribution is still in the higher
segment: fifty percent of the buildings have a completeness factor above ninety percent.
[Figure 13 plot: number of buildings versus correctness percentage]
Figure 14: Probability density function of the completeness
The quality probability density function is shown in Figure 15. The quality measurement is a
combination between the correctness and the completeness measurement, since it accounts
for both the mislabeled building pixels and the missed building pixels. The spread in the quality
is caused by the spread in completeness. To improve the quality measurement, the completeness
performance needs to be improved. In the quality measurement, 43 percent of the buildings
have a factor above ninety percent.
Figure 15: Probability density function of the quality
[Figure 14 plot: number of buildings versus completeness percentage (0 to 100)]
[Figure 15 plot: number of buildings versus quality percentage (0 to 100)]
To make an easy comparison between the different metrics, the probability density functions of
the correctness, completeness and quality are plotted into one plot in Figure 16. Here it is even
clearer that the completeness performance needs to be improved to increase the quality
factor.
Figure 16: Probability density function of correctness, completeness and quality
[Figure 16 plot: building percentage versus measurement percentage (0 to 100) for correctness, completeness and quality]
6.2. Examples of correct detections
This section gives graphical examples of correctly detected buildings to give an overview of
which kinds of buildings are correctly handled and how the individual steps of the detection
algorithm function. In Figure 17 the legend for the detection results is shown. Each step in the
detection process is shown as a colored line. For polygons, the vertices are shown by a cross.
Figure 17: Legend for the detection results
In the left image in Figure 18, building 1 is easily detected due to the high contrast with the
surroundings. The Douglas-Peucker algorithm already gives a correct result. The blue region
merge result for buildings 2 and 3 is more tortuous due to the lower contrast of the buildings.
The resulting polygon simplification step extracts the correct building model out of the contour.
On the right image, two building segments are correctly merged together into a single building.
Figure 18: Examples of correctly detected buildings left: dataset A4, right: dataset A5
In the left picture of Figure 19, two uncommonly shaped buildings are correctly detected. In the
right picture, it can be seen that rectangular objects on the roof do not cause any
detection problems.
Figure 19: Examples of correctly detected buildings left: dataset G1, right: dataset J4
In Figure 20, two buildings with uncommon building angles are correctly detected. In the right
picture, the polygon simplification filters out the blue contour spike going into the building. The
two buildings also show that the system works on a large variety of building shapes. The
building on the right is 8 times bigger in area than the building on the left.
Figure 20: Examples of correctly detected buildings left: dataset H1, right: dataset J2
In Figure 21, a very big, complex building shape is detected correctly. The shape fit step of the
detection process adjusts all the angles to be exactly 90 degrees.
Figure 21: Examples of correctly detected buildings dataset J1
6.3. Examples of incorrect detections
This subchapter gives examples of incorrectly detected buildings. These are buildings with a
quality factor below ninety percent. This overview will illustrate the weaknesses of the current
algorithm. The goal is to give a good impression of all the problem cases that were encountered
in the dataset.
The reasons for an incorrect detection can be divided into six main categories:
1. Low building to background contrast: When the contrast between the roof and the
surroundings is low, it can be impossible to find a correct merging distance.
2. Incorrect segment combining: Detected roof segments are incorrectly combined with
each other.
3. Segment finding: Not all segments belonging to a building are found.
4. Shadows or dark patches: Color variations on the roof caused by shadow or rain
water prevent a part of the roof from being found.
5. Correctness: In some cases non-building segments are included in the building contour.
6. Multicolor roofs: When a roof consists of multiple colors, the roof is only partly found
because the region-merging is based on color.
For each category, this chapter has a section that shows the detection results graphically. This
will give a clear view of the detection process by showing the output of all the steps in the
detection process. The reason for the false detection is discussed and, where possible, a method
for improvement is given.
6.3.1. Building to background contrast
The first step of the detection algorithm tries to find the optimum merging distance to merge
all regions of the building contour. When the contrast between the building and the
background is low at a certain point in the building contour, the merging distance has to be
short. However, a short merging distance might not include all the regions of the building. A
segment of a building is almost always a better fit for a low vertex count polygon than included
background regions (for an exception, see the subchapter “Correctness”). Therefore the maximum
distance that selects as many building regions as possible without including background regions
is chosen.
A few examples of this problem are shown in Figure 22. Building 1 is only partly detected due to
low contrast of the building at the right edge. This is also the case for building 3: not all the
regions belonging to the building are found, which results in a bad detection. On the right
picture, building 7 has a lower contrast to the background than the other similar buildings. This
is due to the road on the right that almost matches the color of the building. Because of the
resulting short merging distance not all building regions are found. The steps following this first
step can’t correct for so many missing building regions.
Buildings with low contrast to the surroundings are fundamentally harder to detect. Detection
based on region merging is more sensitive to low contrast situations, because the weakest link
around the contour determines the maximum usable contrast for the merging distance. A
possibility is to assist the region merging step with line detection techniques. This will be
further discussed in the recommendations section.
Figure 22: Examples of falsely detected buildings due to low building to background contrast (dataset I)
6.3.2. Segment combining
Segments are combined with the morphological closing operation. This is a very simple
operation that uses dilation and erosion to connect separate areas with each other. The
downside of this operation is that inner corners of the segments are rounded (Figure 23). This
limits the maximum distance between segments that can be merged. When the circle used for
the morphological closing is large, the unwanted rounding has a great impact on the building
shape.
To minimize the rounding effect of the closing operations, the size of the circular element is
minimized. However, this gives rise to other problems in some situations. For example, when the
gap between two segments is too large, the segments are not completely connected, which can be
seen in Figure 24.
Another problem is visible in the polygon (red line) of both buildings in Figure 24. The segments
that are detected have parallel lines that are near each other, but are not aligned. This results
in an approximation of the two individual lines with one line diagonally through both of them.
Segment merging is currently performed before the polygon simplification step, but the result
is expected to be better when this order is reversed. First, a polygon is determined for each
individual segment and then the separate polygons are merged into one building polygon
afterwards. This option will be further discussed in the recommendations section.
Figure 24: Examples of falsely detected buildings due to segment combining. left: dataset J2, right: dataset A2
Figure 23: Morphological closing
6.3.3. Segment finding
The segment finding searches for similar colored regions a short distance from the initially
found segment. The distance is relatively small and can’t be increased, since the current
segment merging algorithm is limited to merging segments close to each other as explained in
the previous subchapter.
Figure 25: Examples of falsely detected buildings due to unfound segments. Dataset J2
Due to the small search distance for additional segments a lot of building segments are not
found, especially in large buildings. In Figure 25 and Figure 26 examples of buildings with
undetected segments are shown. The main reason for the unfound segments is the short
searching distance. The aerial photos are captured at an angle with the ground. When two
segments don’t have the exact same height, the side view of the walls is in-between the two
segments. A difference in height can also cast a shadow on the adjacent segment. Both effects
require a larger searching distance for segments than is currently possible due to the limitation
of the merging step. Therefore a new method to merge building segments is needed. Ideas for a
better method will be discussed in the recommendations section.
Figure 26: Examples of falsely detected buildings due to unfound segments. left/top: dataset F, bottom: dataset B
6.3.4. Multi-color roofs
Flat roofs are not always uniformly colored. Since the region-merging algorithm is solely based
on color differences, it is unable to correctly detect multi-colored roofs. Examples of this
problem are shown in Figure 27 and Figure 28. The only way to solve this problem is with
additional height information that can link segments with a different color based on a similar
height. At the moment radar height information is not available. Extracting height information
based on perspective might be a possibility.
Figure 27: Examples of falsely detected buildings due to multi-color roofs. left: dataset H2, right: dataset A6
Figure 28: Examples of falsely detected buildings due to multi-color roofs. dataset A1
6.3.5. Correctness
The overall performance for the correctness is very high for the building detection algorithm.
Only three buildings out of 81 have a correctness factor below 95 percent, which will all be
discussed in this section.
In Figure 29, the shadow of the building is merged into the building contour. This is caused by a
combination of three factors:
1. A lot of building segments due to a line grid on the roof.
2. Low contrast between roof and shadow.
3. Straight edges on the shadow of the building.
A large merging distance is required to merge all of the small building segments. However, due
to the low contrast of the building with its shadow it cannot merge all the building regions
without including the shadow. Since a subset of the building segments cannot be correctly
approximated by a low count polygon, the detection includes the shadow because of the long
straight lines. The shadow of the building has the same shape property as the building itself and
is therefore incorrectly seen as such.
Figure 29: Examples of falsely detected buildings due to wrongly merged regions. dataset J1
The other low correctness factors are caused by incorrectly detected segments
(Figure 30). For the left building, the algorithm incorrectly added a segment. The manual
reference was entered separately due to the difference in orientation and the objects on the
roof. The segment is however attached to the building and has a similar color. A clear definition
to verify that a segment belongs to a building is hard to formulate, which makes the reference
choice debatable.
The right building has merged a segment which clearly does not belong to it. The segments
seem very different, but the mean color of the striped segment is exactly the same as the mean
color of the other segment. An additional comparison on the color distribution of both
segments would solve this incorrect match.
Figure 30: Examples of falsely detected buildings due to an incorrect building segment. left: dataset H2, right: dataset H1
6.3.6. Shadows / Dark patches
Shadows from nearby objects like trees, chimneys and towers can cause color differences on
the roof. When the contour of the shadow has straight lines, a merging distance that excludes
the shadow from the building roof can be selected. Examples of this problem are shown below
in Figure 31. The shadow on the picture at the left is caused by a tree right next to the building.
In the right picture the tower of a church casts a shadow over the roof.
Figure 31: Examples of falsely detected buildings due to shadows on the roof. left: dataset H2, right: dataset A3
Flat roofs sometimes have strong discoloration on the edges due to rain water. These dark
patches are visible on buildings two and three in Figure 32. The polygon simplification step
compensates for the missing corners in building three (red). The shape fit method further
improves the detection by preferring perpendicular angles (yellow). Small patches in relation to
the building contour are normally fixed by the polygon simplification step as can also be seen in
Figure 26 and Figure 27.
For building two, this correction cannot be made because the contour misses a whole part of
the building. On close inspection of the picture, the region-merge could grow around the dark
patch. However, the top regions of the building are so light that it would also include non-building regions.
Therefore this faulty detection is also partly caused by low building contrast.
Figure 32: Examples of falsely detected buildings due to dark patches on the roof. dataset C (roof 2+3)
7. Recommendations
Despite the accurate performance in some cases, the current detection algorithm has some
weaknesses. In this chapter, recommendations are given to address these weaknesses to
improve the detection results. These ideas followed from analysis of the detection results.
7.1. Merging distance
Since there is a metric to qualify the result of the region merging step, more merging criteria
could be evaluated. Currently, the merging distance is varied to find the optimum distance for
the Euclidean distance in Lab color-space. Occasionally, no distance gives a good result for the
region-merge, such that all the building regions could not be matched without including some
of the background regions. The search for an optimal merging distance could be extended by
using various definitions of this distance. A different definition of the merging distance might
include a distance that does merge all building regions without inclusion of background. A
different definition of the distance could mean a different color-space or assigning different
weights to the components of the Lab color-space. Another possibility is an asymmetrical
distance, where a value above the mean is treated differently than a value below the mean.
Multiple definitions of the merging distance will increase the time spent in the region-merging
step significantly. For every definition, multiple values will be tried. However, it will help to get
the maximum result out of the region merging step in the algorithm.
7.2. Graph weight function
The graph weight function is meant to calculate the difference in area between the line
approximation and the contour. The difference in terms of area is chosen since mistakes of the
region-merge step need to be corrected. Since the region-merge step can make mistakes, the
polygon approximation tries to find the best solution by finding a low count polygon that
minimizes the area difference.
The sum of distances between all contour points and the polygon line is a good approximation
of the area when distances between the points and the line are small. When these distances
become larger and the area is mainly perpendicular to the line, this is not valid anymore. An
example is shown in Figure 33. In the left figure the sum of all the points on the blue line is a
good approximation for the difference in terms of area. In the right figure this is not the case.
The area difference should be small since the area has a very small width. The sum of all the
points on the blue line is not a good approximation.
Figure 33: Examples of good (left) and bad (right) situation when calculating the graph weight
A better method would be to calculate the real area of the closed polygon constructed by the
contour segment (blue line) and the approximation (red line). The resulting polygon is complex
because the polygon intersects itself. Most standard polygon functions, including area
calculating functions, do not give the expected results on complex polygons. Therefore the
complex polygon needs to be converted to one or multiple simple polygons first. Then the area
of the simple polygons can be determined to calculate the exact area difference. This
calculation will probably be more intensive, but this might be compensated by simplifying the
blue contour first.
7.3. Segments
The biggest weakness of the current detection algorithm is the segment finding. This accounts
for most of the low completeness results within our dataset. The problem with the current
segment finding lies mainly in the segment merging. Due to limitations in the merging step, the
searching distance should be kept low. A larger searching distance is required to find all
segments. Therefore, a different segment merging method is needed. Currently the segment
merging is performed by connecting the different areas by morphological closing. Then, the
polygon simplification step is applied to the combined area. Applying the polygon simplification
step for each building segment individually would be a better solution. When the polygon for
each segment is determined the segment-polygons need to be combined into one building-
polygon. This will be more complex than the current merging method, but will allow larger
distances between building segments.
When the searching distance for segments is increased, a lot more segment candidates will be
found. This also implies that there will be more candidates falsely detected, increasing the
chances to incorrectly mark a false candidate as a building segment. Height information that
can compare estimated height between regions would help distinguish between correctly and
falsely found segments. Furthermore, height information would make it possible to detect
multi-colored roofs correctly.
7.4. Low contrast buildings
Buildings with low contrast are obviously more difficult to detect. With a region-merging based
algorithm the weakest contrast around the building contour determines the overall contrast. A
very clear building with a small segment of low contrast will not be detected correctly. Line
detection could assist in these problematic cases. Detected lines on the image could be used as
an aid to increase contrast in the low contrast sections of the building contour.
7.5. Shadows
Shadows cause problems since they cause color differences on building segments, which
prevent the merging algorithm from finding all the segments. The opposite can also occur: a
non-building segment in shadow matches the color of the building. Shadows have specific color
properties, therefore shadow regions can be assessed based on this information. The shadow
presence indicator based on the YCbCr color space defined by Tsai [16] is used:
$S = \frac{C_r + 1}{Y + 1}$
This indicator is applied on two problem cases with shadows in Figure 34. In the first case the
shadow on the roof caused by the church tower is clearly defined. This makes it possible to
treat shadow regions differently. In the second case the shadow presence indicator is of no use.
The color of the building matches the shadowed background. Since it has the same color value,
the building will also have a high value on the shadow indicator. This makes a distinction
between building and non-building segment impossible based on the indicator. A more complex
detection method will be needed that detects shadow not solely based on color.
Figure 34: Shadow presence indicator applied on buildings where shadow is a problem
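The indicator S can be computed per pixel as in the Python sketch below. The RGB to YCbCr conversion is an assumption: the full-range ITU-R BT.601 coefficients (as used in JPEG) are taken here, since the text does not state which YCbCr variant was used. The function name is hypothetical.

```python
def shadow_indicator(r, g, b):
    """Shadow presence indicator S = (Cr + 1) / (Y + 1) for 8-bit RGB values.

    Assumes full-range BT.601 YCbCr, where Y lies in 0..255 and Cr is
    centered on 128.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (cr + 1.0) / (y + 1.0)
```

Dark pixels have a low luminance Y and therefore a high S, which also illustrates the second problem case above: a dark roof and a shadowed background with the same color get the same indicator value.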
7.6. Shape detection optimizer
In the current shape detection step, the default Matlab optimizer (fmincon) is used. The cost
function explained in Section 4.4 is applied and is minimized with the coordinates of the
polygon points as optimization variables. This is working correctly for simple cases where the
angles need to be adjusted slightly. For more complex situations the results are not yet verified.
The optimizer could find a local minimum or might not converge.
8. Conclusion
This report has described a semi-automatic algorithm to detect buildings with a flat roof on
very-high-resolution aerial photos. The algorithm uses a region-based detection approach that
automatically finds the optimal region-merge distance. This is performed by exploiting the fact
that a building can be represented by a polygon with a low vertex count. A robust polygon
simplification method is employed to convert the building region to a building model. This is
accomplished by applying the shortest path algorithm through a graph that is constructed from
the region. A shape detection step modifies the angles of the model to values which are most
likely to occur in roof shapes.
The algorithm was evaluated on a test set of 81 buildings. The correctness factor of the system
is very high: for 94% of the buildings at least 95% of the marked pixels belong to the roof. The
weak point of the system is the completeness factor, which represents the percentage of building
pixels that are detected. The main reason for not detecting all the roof pixels is that locating all
segments belonging to a building is difficult. The segment merging method only merges
segments close to each other, which limits the search range. A more complex segment merging
is required that can correctly connect segments at a greater distance without distorting inside
corners. Increasing the search range greatly increases the number of new segment candidates,
therefore additional criteria like height estimation might be useful.
The detection algorithm performs reasonably well, with a quality factor above 90% for 48% of
the buildings in the dataset. There is, however, still considerable room for improvement.
Recommendations have been given to address the problem cases in the dataset.
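The three quality measures can be computed from a detected region and a manually entered
reference region as in the Python sketch below. It assumes the usual pixel-based definitions
(completeness = TP / (TP + FN), correctness = TP / (TP + FP), quality = TP / (TP + FP + FN)),
which agree with the values listed in Appendix B. The function name detection_metrics and the
representation of regions as sets of pixel coordinates are hypothetical; the definitions used
in the evaluation are given in Section 5.1.

```python
def detection_metrics(detected, reference):
    """Return (completeness, correctness, quality) for two sets of (row, col) pixels."""
    true_pos = len(detected & reference)    # roof pixels that were marked
    false_pos = len(detected - reference)   # marked pixels outside the roof
    false_neg = len(reference - detected)   # roof pixels that were missed

    def ratio(numerator, denominator):
        # Guard against empty regions (assumption: report 0.0 in that case).
        return numerator / denominator if denominator else 0.0

    completeness = ratio(true_pos, true_pos + false_neg)
    correctness = ratio(true_pos, true_pos + false_pos)
    quality = ratio(true_pos, true_pos + false_pos + false_neg)
    return completeness, correctness, quality


# Toy example: a 4x4 reference roof and a detection shifted by one column.
reference = {(r, c) for r in range(4) for c in range(4)}
detected = {(r, c) for r in range(4) for c in range(1, 5)}
print(detection_metrics(detected, reference))
```

In the toy example 12 of the 16 roof pixels are found and 4 marked pixels lie outside the roof,
so completeness and correctness are both 0.75 while the quality drops to 0.6.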
9. References
[1] Hazelhoff, L., & de With, P. H. N. (2011). Localization of buildings with a gable roof in
very-high-resolution aerial images.
[2] Lin, C., & Nevatia, R. (1998). Building Detection and Description from a Single Intensity
Image.
[3] Müller, S., & Zaum, D. W. (2005). Robust building detection in aerial images.
[4] Jin, X., & Davis, C. H. (2005). Automated Building Extraction from High-Resolution Satellite
Imagery in Urban Areas Using Structural, Contextual, and Spectral Information. EURASIP
Journal on Applied Signal Processing.
[5] Hai-yue, L. I., Hong-qi, W., & Chi-biao, D. (2006). A New Solution of Automatic Building
Extraction in Remote Sensing Images.
[6] Wei, Y., Zhao, Z., & Song, J. (2008). Urban building extraction from high-resolution satellite
panchromatic image using clustering and edge detection.
[7] Sirmacek, B., & Unsalan, C. (2008). Building Detection from Aerial Images using
Invariant Color Features and Shadow.
[8] Katartzis, A., & Sahli, H. (2008). A Stochastic Framework for the Identification of Building
Rooftops Using a Single Remote Sensing Image.
[9] Liu, Z., Cui, S., & Yan, Q. (2008). Building Extraction from High Resolution Satellite Imagery
Based on Multi-scale Image Segmentation and Model Matching. Earth Observation and Remote
Sensing.
[10] Karantzalos, K., & Paragios, N. (2009). Recognition-Driven Two-Dimensional Competing
Priors Toward Automatic and Accurate Building Detection.
[11] Kabolizade, M., Ebadi, H., & Ahmadi, S. (2010). An Improved Snake Model for Automatic
Extraction of Buildings from Urban Aerial Images and LiDAR Data Using Genetic Algorithm.
[12] Pakizeh, E., & Palhang, M. (2010). Building Detection from Aerial Images Using Hough
Transform and Intensity Information.
[13] Izadi, M., & Saeedi, P. (2010). Automatic Building Detection in Aerial Images Using a
Hierarchical Feature Based Image Segmentation. 2010 20th International Conference on Pattern
Recognition, 472–47
[14] Tomasi, C., & Manduchi, R. (1998). Bilateral Filtering for Gray and Color Images.
Proceedings of the 1998 IEEE International Conference on Computer Vision, Bombay, India.
[15] Wang, O., Lodha, S. K., & Helmbold, D. P. (2006). A Bayesian Approach to Building
Footprint Extraction from Aerial LIDAR Data. Third International Symposium on 3D Data
Processing, Visualization, and Transmission.
[16] Tsai, V. J. D. (2006). A comparative study on shadow compensation of color aerial images in
invariant color models. IEEE Transactions on Geoscience and Remote Sensing, 44(6), 1661–1671.
10. Appendix A: Matlab scripts
Test Framework
enter_roof_data.m           Script to manually enter building contours for reference
show_roof_data.m            Shows the entered reference building contours
test_framework.m            Runs the building localization algorithm on all images in the
                            test set and saves the results
show_result.m               Shows the contours of all steps of the detection algorithm
show_table.m                Groups all the detection results in a table

Roof detection
bfilter2.m                  Bilateral filter for RGB images
detect_building.m           The main function of the roof detection algorithm
DisplayWatershedRegions.m   Displays the watershed result of the image with the mean
                            region color for each region
imRAG.m                     Builds the region adjacency graph for the watershed image
line_eq.m                   Calculates the equation of the line through two points
linortfit2.m                Fits a line to data by orthogonal least squares (2-dimensional)
linortfitn.m                Fits a line to data by orthogonal least squares (N-dimensional)
polygon_corner_penalty.m    Calculates the corner penalty of a polygon based on the
                            defined corner cost function
polygon_fit.m               Polygon simplification by calculating the shortest path through
                            a weighted graph, as explained in Section 4.3
region_merge.m              Region merging of the segmented image, starting from a seed region
shape_fit.m                 Shape detection as explained in Section 4.4
watershed_regions.m         Performs the watershed algorithm
11. Appendix B: Detection results
ID Dataset Building ID Completeness Correctness Quality Comment
1 A1 1 0.541 1.000 0.541 Building with multicolor segments
2 A1 2 0.605 0.996 0.603 Building with multicolor segments
3 A2 1 0.971 0.970 0.943
4 A3 1 0.640 0.959 0.623 Shadow on roof
5 A3 2 0.879 0.984 0.866
6 A4 1 0.987 0.999 0.987
7 A4 2 0.904 0.983 0.890 Low contrast on building edge
8 A4 3 0.973 1.000 0.973
9 A5 1 0.982 1.000 0.982
10 A5 2 0.982 0.948 0.932
11 A6 1 0.668 0.993 0.665 Building with multicolor segments
12 A6 2 0.989 0.970 0.960
13 B 1 0.813 1.000 0.813 Shadow on building segment
14 B 2 0.953 0.982 0.937
15 B 3 0.609 0.953 0.591 Shadow on building segment
16 C 1 0.924 0.962 0.891 Segment merging limitation
17 C 2 0.583 1.000 0.583 Dark patches on roof
18 C 3 0.904 0.990 0.896 Shadow on roof
19 C 4 0.909 1.000 0.909
20 D 1 0.762 0.991 0.756 Building with multicolor segments
21 E 1 0.974 1.000 0.974
22 F 1 0.515 0.949 0.501 Multi color roof. Shadow on segment and search not far enough
23 F 2 0.084 0.932 0.084 Multi color roof/segment search not far enough
24 F 3 0.107 1.000 0.107 Edges inside the roof contour
25 G1 1 0.960 0.986 0.947
26 G1 2 0.121 1.000 0.121 Segment search not far enough
27 G1 3 0.272 1.000 0.272 Multi color roof
28 G1 4 0.916 0.965 0.886
29 G1 5 0.970 0.991 0.962
30 G2 1 0.803 0.981 0.790 Shadow on segment
31 G2 2 0.379 1.000 0.379 Multi color roof, rails near roof edge
32 H1 1 0.953 0.994 0.948
33 H1 2 0.975 0.990 0.966
34 H1 3 0.955 0.970 0.927
35 H1 4 0.934 0.986 0.922
36 H1 5 0.978 0.724 0.712 False detected segment
37 H1 6 0.817 0.979 0.803 Segment not detected
38 H1 7 0.505 0.994 0.504 Segment not detected
39 H1 8 0.963 0.980 0.945
40 H1 9 0.980 0.973 0.954
41 H2 1 0.278 0.996 0.277 Multi color roof
42 H2 2 0.824 0.985 0.814 Dark patches on roof
43 H2 3 0.906 0.991 0.899
44 H2 4 0.938 0.774 0.737 Shadow on roof/false segment
45 H2 5 0.558 0.983 0.553 Shadow on roof
46 H2 6 0.962 0.996 0.958
47 I 1 0.427 0.988 0.425 Low contrast on building edge
48 I 2 0.981 0.924 0.908
49 I 3 0.319 0.927 0.311 Low contrast on building edge
50 I 4 0.864 0.982 0.850 Low contrast on building edge
51 I 5 0.846 0.999 0.845 Low contrast on building edge
52 I 6 0.859 0.998 0.858 Low contrast on building edge
53 I 7 0.408 1.000 0.408 Low contrast on building edge
54 I 8 0.488 0.836 0.445 Low contrast on building edge
55 J1 1 0.977 0.999 0.976
56 J1 2 0.900 0.718 0.665 Shadow included in region merging
57 J2 1 0.749 0.955 0.724 Multi color roof
58 J2 2 0.469 0.973 0.463 Multi color roof
59 J2 3 0.459 0.978 0.455 Shadow on segment
60 J2 4 0.811 0.985 0.801 Multi color roof
61 J2 5 0.901 0.988 0.891 Shadow on segment
62 J2 6 1.000 0.948 0.948
63 J2 7 0.963 0.996 0.959
64 J3 1 0.932 0.999 0.932
65 J3 2 0.963 0.998 0.961
66 J3 3 0.987 0.974 0.962
67 J3 4 0.814 0.991 0.808 Shadow on segment
68 J3 5 0.972 0.990 0.963
69 J3 6 0.930 0.998 0.928
70 J3 7 0.527 1.000 0.527 Segment not detected
71 J3 8 0.766 1.000 0.766 Segment not detected
72 J4 1 0.981 0.990 0.971
73 J4 2 0.974 0.992 0.967
74 J4 3 0.986 0.992 0.978
75 J4 4 0.964 0.992 0.956
76 J4 5 0.961 0.998 0.959
77 J5 1 0.731 0.986 0.724 Pipes on roof
78 J5 2 0.639 0.997 0.638 Solar panels/rails on roof
79 J6 1 0.943 0.992 0.935
80 J6 2 0.490 0.998 0.490 Dark patches on roof
81 K 1 0.833 0.974 0.815 Multi color roof
Average 0.776 0.973 0.758
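
The quality column can be cross-checked against the other two columns. Assuming the usual
definitions completeness = TP / (TP + FN), correctness = TP / (TP + FP) and
quality = TP / (TP + FP + FN), it follows that 1/quality = 1/completeness + 1/correctness - 1.
The Python sketch below (the function name quality_from is hypothetical) applies this relation
to three rows of the table; small deviations are expected because the table values are rounded
to three decimals.

```python
def quality_from(completeness, correctness):
    """Quality implied by completeness and correctness under the assumed definitions."""
    if completeness == 0 or correctness == 0:
        return 0.0  # no true positives, so the quality is zero as well
    return 1.0 / (1.0 / completeness + 1.0 / correctness - 1.0)


# (ID, completeness, correctness, reported quality) copied from the table above.
rows = [
    (3, 0.971, 0.970, 0.943),
    (36, 0.978, 0.724, 0.712),
    (56, 0.900, 0.718, 0.665),
]
for row_id, completeness, correctness, reported in rows:
    computed = quality_from(completeness, correctness)
    print(row_id, round(computed, 3), reported)
```

The relation also shows why the quality can never exceed the lower of the completeness and
correctness values of a building.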

Image Processing_Modeling of buildings with a flat roof

  • 1. 1 Modeling of buildings with a flat roof in aerial photographs T.H.M Derks(0569074)
  • 2. 2 Contents 1. Summary ................................................................................................................................. 4 2. Introduction ............................................................................................................................ 5 2.1. Literature Overview ......................................................................................................... 5 2.2. Analysis and discussion.................................................................................................... 7 2.3. Conclusions and Recommendations in problem approach ............................................. 8 3. System overview ..................................................................................................................... 9 4. Algorithm description ........................................................................................................... 10 4.1. Segmentation ................................................................................................................. 10 4.1.1. Watershed............................................................................................................... 10 4.1.2. Edge preserving filter.............................................................................................. 10 4.2. Region merging .............................................................................................................. 11 4.2.1. Seed point ............................................................................................................... 11 4.2.2. Efficient merging implementation.......................................................................... 12 4.2.3. Merge condition...................................................................................................... 12 4.2.4. Optimal merging distance....................................................................................... 12 4.3. 
Polygon simplification.................................................................................................... 14 4.3.1. Weighted Graph...................................................................................................... 14 4.3.2. Shortest Path algorithm.......................................................................................... 16 4.4. Shape detection ............................................................................................................. 16 4.4.1. Corner penalty function.......................................................................................... 16 4.4.2. Optimization cost function ..................................................................................... 17 4.5. Building segment merging ............................................................................................. 18 4.5.1. Combining segments............................................................................................... 19 5. Experiments and results........................................................................................................ 20 5.1. Quality measurement .................................................................................................... 20 6. Results................................................................................................................................... 21 6.1. Detection statistics......................................................................................................... 21 6.2. Examples of correct detections...................................................................................... 24 6.3. Examples of incorrect detections................................................................................... 27
  • 3. 3 6.3.1. Building to background contrast............................................................................. 28 6.3.2. Segment combining ................................................................................................ 29 6.3.3. Segment finding ...................................................................................................... 30 6.3.4. Multi-color roofs ..................................................................................................... 32 6.3.5. Correctness ............................................................................................................. 33 6.3.6. Shadows / Dark patches.......................................................................................... 35 7. Recommendations ................................................................................................................ 37 7.1. Merging distance............................................................................................................ 37 7.2. Graph weight function ................................................................................................... 37 7.3. Segments ........................................................................................................................ 38 7.4. Low contrast buildings ................................................................................................... 39 7.5. Shadows ......................................................................................................................... 39 7.6. Shape detection optimizer............................................................................................. 40 8. Conclusion............................................................................................................................. 41 9. References............................................................................................................................. 42 10. 
Appendix A: Matlab scripts................................................................................................ 44 11. Appendix B: Detection results ........................................................................................... 45
  • 4. 4 1. Summary Nowadays, very-high-resolution color aerial images are captured from The Netherlands at an annual basis. Automated interpretation of aerial images can lead to faster inspection resulting in more frequent updating of civil community databases. Existing research work copes with localization of buildings with a gable roof. As a significant percentage of the buildings has a flat roof, this system needs to be extended to detect this kind of buildings. Finding flat roofed buildings is a difficult problem since many look-alike objects exist. Therefore this work focuses at a semi-automatic system, capable of modeling the buildings given a seed point. The developed system (Figure 1) uses a region-based detection approach based on segmentation by the watershed method. A region-merge step is used to merge all the regions probably belonging to the building at the given seed point. The optimal region-merge distance is found by exploiting the fact that a building can be represented by a polygon with a low vertex count. A robust polygon simplification method is employed to convert the building region to a building model. A shape detection step modifies the angles of the model to values which are most likely to occur in roof shapes. Segmentation Region merge Polygon simplification Shape detection Seedpoint Image Building model Figure 1: System diagram The system performs reasonably well since it models 48% of the buildings correctly in the dataset. The biggest weakness of the system is partly detected roofs, finding all segments belonging to a roof is a difficult problem.
  • 5. 5 2. Introduction Nowadays, very-high-resolution color aerial images are captured from The Netherlands at an annual basis, resulting in an accurate and recent view of the country infrastructure. Due to the time-consuming nature of manual inspection of the infrastructure, automated interpretation of aerial images can lead to faster inspection and more frequent updating of civil community databases. Existing research work copes with localization of buildings with a gable roof [1]. As a significant percentage of the buildings has a flat roof, this system needs to be extended to detect this kind of buildings. Finding flat roofed buildings is a difficult problem since many look-alike objects exist. Therefore, this work focuses at a semi-automatic system, capable of modeling the buildings given a point found on the roof, but not solving the detection problem. First a literature study is done to get an overview of existing implementations for the detection and modeling of flat roofs. From this overview techniques are selected to be used in the developed system. 2.1.Literature Overview To get an overview of existing solutions for the detection of flat roofs a short summary is given of interesting papers. The papers are chronologically ordered. C. Lin and R. Nevatia [2] propose a fully automatic building extraction method that is based on the detection of edges in the image. It is assumed that the searched rectangular buildings can be found by finding parallelograms in the image. The edges are taken as building hypothesis and classified by use of a feature vector and additional features like shadow. S Müller and D. W Zaum [3] deal with the roof detection problem by starting with a seeded region growing algorithm to segment the entire image. Then photometric and geometric features are calculated for all regions. A numerical classification based on these features is performed to differentiate between building and non-building regions to detect roof tops. 
The automated building-extraction strategy by X. Jin and C. H. Davis [4] uses structural, contextual, and spectral information to extract buildings. A series of geodesic opening and closing operations is used to build a differential morphological profile (DMP) that provides image structural information. Building hypotheses are generated and verified through shape analysis applied to the DMP. Shadows are extracted using the DMP to provide reliable contextual information to hypothesize position and size of adjacent buildings. Building shapes were reconstructed by starting a region-merge algorithm on candidate buildings on a watershed segmented image.
  • 6. 6 L. Hai-yue et al [5] created an automatic building extraction system that uses a potential clustering function to segment the image. The Hough transform is used on the region contour to detect the most dominant line. Building regions are verified by judging the length of this line. A grid matching method is used to map the target buildings areas into regular polygons. Y. Wei et al [6] established a semi-automatic building rooftop extraction method applied on high resolution satellite imagery. Two different segmentation methods are used to create building regions. The first segmentation method is a seeded region growing algorithm that merges pixels into a building region. The second segmentation method is the mean-shift which is applied in the target building area. A model matching technique based on node graph search is used to convert the found regions to the correct shape of the rooftop. B. Sirmac and C. Unsalan [7] proposed a color invariant that detects roofs based on the color red. A different color invariant is used to detect shadows in the image. The illumination angle of the image is calculated from the shadow of detected red roofs. Roofs with different colors can now be detected solely based on the shadow. The shape of the building is determined with a box fitting method. The detection method by A. Katartzis and H. Sahli [8] is based on a stochastic image interpretation model. Rooftop hypotheses are extracted using a contour based grouping hierarchy that emanates from the principles of perceptual organization. A Markov random field model is used to describe dependencies between all available hypotheses. The hypothesis verification step is treated as a stochastic optimization process that operates on the whole grouping hierarchy to find it’s optimum configuration for the interacting group hypothesis. Z. Liu et al [9] constructed a general semi-automatic rooftop extraction method using high resolution satellite imagery. 
A seeded region growth segmentation or localized multi-scale object oriented segmentation is applied to extract small and simple rectilinear rooftops from its background. Model matching techniques based on node graph search are used for finding the correct building rooftop shape. K. Karantzalos and N. Paragios [10] established a recognition-driven variational framework for fully automatic building extraction from aerial photos. Competing building shape priors are considered and used in building extraction by using the prior models in the segmentation process. M. Kabolizade et al [11] proposed a boundary extraction method based on a GVF snake model. This method has the advantage of integrating edge-based and region-based snakes by minimizing internal and external energy forces. A genetic algorithm has been used to optimize the parameters of the snake model.
  • 7. 7 E. Pakizeh and M. Palhang [12] presented an approach for building detection using Hough transform and intensity information. Building locations are first detected with the use of intensity information. Morphological operations are applied to filter out small non-building regions. The Hough transform is used to verify the existence of buildings on candidate regions. The proposed method by M. Izadi and P. Saeedi [13] incorporates a hierarchical multilayer feature based image segmentation technique using color. A number of geometrical or regional attributes are defined to identify potential regions in multiple layers of segmented images. A tree-based mechanism is utilized to search for regions that maximize a set of rooftop definition measures. Candidate regions are verified through shadow evidence. 2.2.Analysis and discussion Flat roofs have numerous properties that can be used in the detection process. Most detection systems employ more than one property. By using multiple properties, more information is considered, which results in more accurate output. The following properties are often employed:  Color (both value and homogeneity)  Features (i.e. lines, corners)  Texture  Size  Shadows  Height information Not all properties can be used at the same time in a single detection step. Most algorithms apply multiple detection steps that use one building property at a time to improve results. Algorithms start with the property that, according to the makers, gives the most information. The extraction techniques listed previously can be classified into two main categories.  Region based (i.e. color homogeneity)  Feature based (i.e. features) The first category is a region based extraction method. Color homogeneity of the roof building is used to find building contours. A region growing method is employed to find a building by merging similar pixels into an area. 
Another option is to apply a segmentation algorithm like mean-shift or watershed to segment the image first. A region-merging algorithm is applied to get the total building region from the segmented image. The result of a region based detection step is the contour of a region. There are several ways to convert the building contour into a building model. Possibilities for this conversion are polygon simplification or approximation and box fitting or prior shape matching techniques.
  • 8. 8 The second category is based on features. Lines or corners (or both) are detected in the image. A popular method to detect lines in an image is the Hough transform. Detected lines can be combined by detecting their intersections. Possible building polygons are formed by joining detected lines into a closed contour. For both detection categories, other properties are used to refine the detection results. Shape, size and shadows are used as constraints or probability models. This helps to reduce the number of false positives and assists in the detection of difficult cases.
2.3. Conclusions and Recommendations in problem approach
An overview of existing papers was given to get an idea of the techniques applied to detect buildings in images. This resulted in a list of building properties that can be employed for detection. All detection techniques have weaknesses that are associated with the properties they use. The best results were accomplished by combining building properties in the detection process. Since almost all papers could be categorized as either a region-based or a feature-based method, the starting point of the new system will be one of these two. The remainder of this report describes the system that was developed. The system overview can be found in Section 3, and each module is explained in detail in Section 4. Section 5 explains the testing setup and contains the quality measurement definitions used to evaluate the results. Section 6 evaluates the performance of the system with the quality measurement statistics. It also shows examples of correct and incorrect detections to give insight into the practical limitations and possibilities of the system. Section 7 gives recommendations to improve the results of the system. Section 8 presents the conclusion of the report.
  • 9. 9 3. System overview
A system is developed to find flat roof buildings in a high resolution aerial image, given a seed point. The system consists of four modules:
1. Segmentation
2. Region-merge
3. Polygon simplification
4. Shape detection
The system overview is shown in Figure 2. The first step is the image segmentation process. The whole image is converted into segments by applying the watershed method. The result is a heavily oversegmented image, in which individual buildings consist of multiple regions that need to be merged together. The region-merging step starts at the given building seed point. It tries to merge all the segments that probably belong to the building without including any non-building regions. Adjacent regions are merged based on the mean region color. Multiple merging distances are evaluated and the best merging distance is selected automatically, based on a cost function applied to the result.
Figure 2: System block diagram (inputs: image and seed point; modules: Segmentation, Region merge, Polygon simplification, Shape detection; output: building model)
The resulting region is converted into a contour. This contour contains all the boundary points of the region, which makes it a polygon with a high vertex count. The contour is transformed into a building model by polygon simplification. To find the best low vertex count approximation of the contour, a weighted graph is constructed that interconnects all the contour points with each other. A weight is assigned to each edge, based on the quality of the approximation between those points. Dijkstra’s shortest path algorithm is used to select the best simplified polygon by calculating the shortest path for a cycle through the graph. The last step of the system is the shape detection. Prior knowledge about building corners is used to improve the corners of the building model.
A corner penalty function is constructed that favors perpendicular and 45-degree angles, since these angles occur most frequently. An optimizer then minimizes the corner penalties without deviating too much from the initial building contour.
  • 10. 10 4. Algorithm description
This chapter explains the system implementation. A separate section is dedicated to each of the four modules of the system. The fifth section explains the segment finding feature, which resides in the region merge module.
4.1. Segmentation
The first step of the system segments the image into regions. The regions need to be as large as possible without introducing regions that are located only partially inside the building contour. The watershed method is used to segment the whole image. This method requires a gradient map as input. Before the gradient map is calculated, an optional edge preserving smoothing filter can be applied. Pre-filtering the image with an edge preserving filter reduces the number of regions in the segmentation output without deteriorating the gradient map at the straight building edges. A block diagram of the segmentation step is shown in Figure 3. For all the regions, statistics such as the mean color and size are calculated for use in the region merging step.
Figure 3: Segmentation block diagram (image, optional edge preserving filter, gradient map, watershed, segmented image)
4.1.1. Watershed
A gradient map of the image is required for the watershed algorithm. For the gradient map, a Sobel operator is applied horizontally and vertically to the grey scale intensity image. The Sobel operator has been selected because it is less sensitive to noise than the Roberts operator and has a more isotropic response than the Prewitt operator. An isotropic response is uniform in all directions, which is desired because the building orientation is unknown. The gradient magnitude is the Euclidean length of the horizontal and vertical responses of the Sobel operator. Minima that are too shallow are suppressed to reduce oversegmentation. The suppression depth is determined empirically as the maximum value that does not merge non-building with building regions in an image with very low contrast between building and environment.
This is important because a region that contains both building and non-building sections cannot be split again, whereas separate building regions can still be merged in the next step.
4.1.2. Edge preserving filter
Before the gradient map is calculated, a smoothing filter can be applied to reduce the number of regions in the watershed result. However, a normal smoothing filter also degrades the building edges, which results in building and non-building areas ending up in one region. A bilateral filter [14]
  • 11. 11 smooths an image while preserving the strong edges of the buildings. This results in a less oversegmented image that still separates building from non-building regions. Figure 4 compares the watershed segmentation results, displayed using the average region color, with and without bilateral filtering. The filtered case has fewer regions overall, while there are additional regions along straight edges. This can be seen in the brown area below the right part of the building.
Figure 4: Result of watershed method without (left) and with bilateral filtering (right).
4.2. Region merging
The watershed method results in an oversegmented image. The edge preserving filter and the local minimum constraint reduce the oversegmentation, but similarly colored regions still need to be merged to get the building contour. The seed point determines the start of the region merge algorithm. The initial region is expanded by merging adjacent regions based on color similarity. Because the color variation on the roof and the contrast between building and surroundings vary, multiple merging distances are evaluated and the best merging distance is selected automatically, based on a cost function applied to the result.
4.2.1. Seed point
The region merging algorithm starts with a region selected by the seed point. Simply picking the region that contains the seed point can give an improper start region. The seed point may select a small region inside the building with a different color than the roof, as shown in Figure 5. Therefore, a different method is applied to select the start region. The region that has the most pixels inside a 60 pixel radius around the seed point is selected as the start region. Because the biggest region inside the circle is selected, the chance of selecting an incorrect start region is reduced.
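The start region rule of Section 4.2.1 can be sketched as follows. This is a minimal illustration, not the implementation used in the report: it assumes the segmentation is available as a 2-D list of integer region labels, and the function name `select_start_region` is hypothetical. Only the 60 pixel radius comes from the text.

```python
from collections import Counter


def select_start_region(labels, seed, radius=60):
    """Return the label that covers the most pixels within `radius` of `seed`.

    labels: 2-D list (rows) of integer region labels (assumed layout).
    seed:   (row, col) of the seed point, assumed to lie inside the image.
    """
    seed_r, seed_c = seed
    counts = Counter()
    # Only scan the bounding box of the circle, clipped to the image.
    r_min = max(0, seed_r - radius)
    r_max = min(len(labels) - 1, seed_r + radius)
    for r in range(r_min, r_max + 1):
        row = labels[r]
        c_min = max(0, seed_c - radius)
        c_max = min(len(row) - 1, seed_c + radius)
        for c in range(c_min, c_max + 1):
            # Keep only pixels inside the circle around the seed.
            if (r - seed_r) ** 2 + (c - seed_c) ** 2 <= radius ** 2:
                counts[row[c]] += 1
    return counts.most_common(1)[0][0]
```

With a small region directly under the seed and a larger roof region around it, the larger region wins as soon as the radius reaches beyond the small one.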
  • 12. 12 Figure 5: Regions that cannot be used as seed point
4.2.2. Efficient merging implementation
When a region is merged into the building region, the regions adjacent to it need to be considered for merging as well. Accessing the watershed image contour to find neighbors each time a region is merged is inefficient. A region adjacency graph (RAG) is employed to identify the adjacent regions for all watershed regions. An array of lists is created, where each element is linked to the list of neighbors of that region. These lists can be calculated efficiently for the whole image, eliminating the need to search the watershed image for neighbors when a region is merged. Another advantage is that the neighbors of a region only need to be looked up once when multiple merging distances are evaluated.
4.2.3. Merge condition
The mean color of a region is used as the merging condition. When the color difference to an adjacent region is below a threshold distance, that region is merged into the building region. The Lab color space has been used because it is approximately perceptually uniform. In this color space, a given change in the Euclidean distance between the color components corresponds to roughly the same change in perceived color.
4.2.4. Optimal merging distance
The optimal merging distance depends on the color contrast at the border regions of the building and the color variation of the regions that belong to it. The distance should be high enough to merge all the building regions but low enough not to include any surrounding regions. Multiple merging distances are evaluated and the best distance is selected by
  • 13. 13 evaluating the result with a cost function. A building shape can be represented by a low vertex count polygon. The optimal merging distance is selected using this property. A boundary tracing algorithm converts the region merge area into a polygon. This polygon is simplified by a polygon simplification method (Douglas-Peucker). The first component of the cost function is the mean contour difference. This is the area between the original and the simplified contour, divided by the contour length. For a good choice of merging distance this difference is low, because a building can be accurately represented by a polygon with a low vertex count. When surrounding non-building regions are merged, the area difference between the two contours will be higher, since the simplified polygon no longer represents the building. The number of vertices produced by the Douglas-Peucker polygon simplification method varies for differently shaped contours. Additional vertices will always improve the mean contour difference. Therefore a cost per vertex is added as the second component. The third component is the ratio of the initial contour length to the simplified contour length. The simplified contour length is always lower than the initial contour length. When the simplified contour is a good approximation of the original contour, the ratio will be smaller. The last component subtracts a portion of the merging distance to slightly favor a bigger merging distance, as this favors larger building contours over building subsections with a low vertex count shape.
$$C = \frac{\text{AreaDifference}}{\text{ContourLength}} + a \cdot \text{DpVertices} + b \cdot \frac{\text{Points}}{\text{DpLength}} - c \cdot \text{MergeDistance} \qquad \text{(Formula 1)}$$
4.2.4.1. Douglas-Peucker
The Douglas-Peucker algorithm aims to reduce the number of vertices of a polygon. It starts with the start and end point of the polygon. It then iteratively adds the point that is furthest away from the polygon approximation until all points are within a tolerance distance.
An example of this process is shown in Figure 6. The approximation of a closed polygon introduces one problem. The closed contour is opened at a random point on the polygon to apply the algorithm. This start/end point will always be part of the polygon approximation. Therefore the result of the Douglas-Peucker simplification is opened at a different point in the contour and the algorithm is applied again. This removes the start/end point of the first simplification if it lies within the tolerance distance.
Figure 6: Douglas-Peucker algorithm steps
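The two-pass treatment of a closed contour described in Section 4.2.4.1 can be sketched as below. This is an illustrative version under stated assumptions, not the report's code: it uses the common recursive formulation of Douglas-Peucker, reopens the contour at the middle kept vertex (the report only says "a different point"), and the names `douglas_peucker` and `simplify_closed` are hypothetical. The recursion depth grows with contour length in unfavorable cases, so very long contours may need an iterative variant.

```python
import math


def _point_line_distance(p, a, b):
    """Distance from p to the line through a and b (to a itself if a == b)."""
    (x0, y0), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return math.hypot(x0 - x1, y0 - y1)
    return abs(dy * (x0 - x1) - dx * (y0 - y1)) / norm


def douglas_peucker(points, tolerance):
    """Simplify an open polyline; the first and last point are always kept."""
    if len(points) < 3:
        return list(points)
    # Find the intermediate point farthest from the chord first -> last.
    index, max_dist = 0, -1.0
    for i in range(1, len(points) - 1):
        d = _point_line_distance(points[i], points[0], points[-1])
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split point


def simplify_closed(contour, tolerance):
    """Two-pass simplification of a closed contour (no repeated end point)."""
    # Pass 1: open the contour at its first point.
    first = douglas_peucker(contour + [contour[0]], tolerance)[:-1]
    if len(first) < 3:
        return first
    # Pass 2: reopen at another kept vertex, so the original start/end
    # point becomes an ordinary candidate for removal.
    k = len(first) // 2
    rotated = first[k:] + first[:k]
    return douglas_peucker(rotated + [rotated[0]], tolerance)[:-1]
```

For a square traced from the middle of one side, the first pass keeps that midpoint as start/end point, and the second pass removes it, leaving the four corners.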
  • 14. 14 4.3. Polygon simplification
The resulting building region mask should be converted into a building model. The first step is a boundary tracing algorithm that converts the region mask into a polygon. This polygon contains all the boundary points of the region mask. The high vertex count polygon should be simplified into a low vertex count building model. The Douglas-Peucker algorithm is not suitable for this simplification. Sometimes non-building regions are merged into the building mask, because this is the best possible result the region merging step could find. The Douglas-Peucker algorithm is very sensitive to wrongly merged regions, since it always adds a vertex for a point that lies too far from the existing simplification. A more robust simplification is required that ignores outliers caused by wrongly merged regions. A good way to accomplish this is to find a simplified polygon that minimizes the area between the original and the simplified contour. A shortest path method similar to the one described in [15] is employed to implement this robust polygon simplification step. A directed weighted graph is constructed with vertices for all the points on the polygon. The vertices are connected by edges whose weight reflects the area between that edge and the original polygon. The exact function used to calculate the weights is explained in the next subsection. By calculating the shortest path for a cycle in the graph, the simplified polygon can be constructed. Figure 7 shows an example of the Douglas-Peucker (green) and the shortest path (red) algorithm.
Figure 7: Building detection result with the boundary points of region merge (blue), Douglas-Peucker (green) and Shortest Path (yellow) algorithm
4.3.1. Weighted Graph
A weighted graph is used to generate a better polygon simplification for the building region than the Douglas-Peucker algorithm. All boundary points of the building are added as vertices.
  • 15. 15 The weight function explained in the next subsection is applied to assign values to the edges between all the vertices. This weight function is very important, since it determines the criteria the shortest path algorithm will use to simplify the polygon.
4.3.1.1. Weight function
The weight function defines the criteria used to get a robust simplification of the building contour. The edge weight between two vertices in the graph is calculated using the following function:
$$W(i, i+k) = \text{average\_error\_distance} + \text{vertex\_cost} + g \cdot \text{ContourPoints} \qquad \text{(Formula 2)}$$
Figure 9: Calculation of average error distance
The first component of the weight function is the average distance from the contour points that lie between the two end points to the straight line through those end points (Figure 9). This results in the following formula for the average error distance between a point A and a point B with k intermediate points, where $D_i$ is the distance of the i-th intermediate point to the line:
$$\text{average\_error\_distance} = \frac{1}{k} \sum_{i=1}^{k} D_i$$
To calculate the distance between the line and a point (Figure 8), the equation of the line through the points $P_1$ and $P_2$ is calculated first:
$$A = P_1 = (x_1, y_1), \quad B = P_2 = (x_2, y_2)$$
$$a \cdot x + b \cdot y + c = 0$$
With the variables a, b and c defined as:
$$a = y_2 - y_1, \quad b = x_1 - x_2, \quad c = -(a \cdot x_2 + b \cdot y_2)$$
Then the distance between the point $P_0 = (x_0, y_0)$ and the line is given by:
$$\text{distance} = \left| \frac{a \cdot x_0 + b \cdot y_0 + c}{\sqrt{a^2 + b^2}} \right| \qquad \text{(Formula 3)}$$
Figure 8: Distance between point and a line
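The point-to-line distance and the average error distance can be sketched in a few lines. This is a hedged illustration: the function names are hypothetical, the coefficient c is chosen so that both end points satisfy the line equation, and `ContourPoints` is assumed to mean the total number of points on the contour, which the text does not state explicitly. The values of `vertex_cost` and `g` are left to the caller because the report does not give them.

```python
import math


def line_coefficients(p1, p2):
    """Coefficients (a, b, c) of the line a*x + b*y + c = 0 through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    a = y2 - y1
    b = x1 - x2
    c = -(a * x2 + b * y2)  # makes p2 (and therefore p1) lie on the line
    return a, b, c


def point_line_distance(p0, p1, p2):
    """Distance from p0 to the line through p1 and p2 (p1 != p2 assumed)."""
    a, b, c = line_coefficients(p1, p2)
    x0, y0 = p0
    return abs(a * x0 + b * y0 + c) / math.hypot(a, b)


def average_error_distance(contour, i, j):
    """Mean distance of the points strictly between i and j to the chord i -> j."""
    inner = contour[i + 1:j]
    if not inner:
        return 0.0
    total = sum(point_line_distance(p, contour[i], contour[j]) for p in inner)
    return total / len(inner)


def edge_weight(contour, i, j, vertex_cost, g):
    """Edge weight in the style of Formula 2 (ContourPoints assumed = len(contour))."""
    return average_error_distance(contour, i, j) + vertex_cost + g * len(contour)
```

For example, `average_error_distance([(0, 0), (5, 2), (10, 0)], 0, 2)` evaluates the single intermediate point against the horizontal chord.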
  • 16. 16 The second component of the weight function is the vertex cost. Adding an additional edge will always decrease the average distance, so a penalty per vertex is added. The third component of the weight function adds another contribution to this edge cost. For a large building, error distances are generally bigger. The penalty for a vertex needs to be increased to prevent the simplification from using more vertices for large buildings. To increase the speed of the algorithm, some restrictions are used to avoid calculating the weight between every pair of points of the polygon. When the total error distance, which can be regarded as an area, is above one eighth of the building area and the mean distance is above ten pixels, the edge weight is not calculated. This does not affect the result of the algorithm, but avoids the calculation of edge weights that will never appear in the shortest path. Furthermore, only edges with an average line error below four are added, to save calculation time in the shortest path algorithm.
4.3.2. Shortest Path algorithm
Now that a weight is assigned to each edge of the directed graph, the best simplification is determined by calculating the shortest path through the directed graph. First, the shortest distances between all pairs of points are calculated for the directed graph. Then, the minimum is taken over the distances from each point to its preceding point on the contour. Because the graph is directed, such a path is forced to run around the whole contour. Since the end point lies only one pixel away from the start point, these two points are merged to create the simplified polygon.
4.4. Shape detection
Knowledge about building shapes can be employed to further enhance the detected building shape [2]. Building footprints can have many different shapes. However, certain angles are far more likely than others. The corners of buildings are usually square or have a 45° angle.
This prior information about building corners can be applied to correct angles in the detected building. Corner angles that lie close to a minimum of the corner penalty function will be shifted to that minimum. A cost function is constructed that has two components (Formula 4). The first component is the average error distance to all the building contour points, which is also used in the weight function of Section 4.3.1.1. The second component is the sum of the penalties of all the corners of the polygon. The coordinates of the vertices are varied to find an optimum for the given cost function.
$$H = \text{average\_error} + b \sum_i f(\alpha_i) \qquad \text{(Formula 4)}$$
4.4.1. Corner penalty function
The corner penalty function defines the penalty for each corner angle. The corner penalty function defined in [15] is applied and is shown in Figure 10. Since a square angle is preferred, it has zero penalty. Angles of 180 degrees also have zero penalty, since this eliminates falsely
  • 17. 17 detected corners. Angles of 45 and 135 degrees are also favored, so they have lower penalties than the angles nearby.
Figure 10: Corner penalty function (penalty from 0 to 1 versus angle from 0 to 180 degrees)
4.4.2. Optimization cost function
To find an optimum for the cost function, the Matlab optimizer fmincon is applied. The cost function is minimized with the coordinates of the vertices as optimization variables. All the angles will be shifted towards 90, 45 and 135 degrees as long as the average error distance to all the contour points doesn’t become too large. The weighting factor b determines the balance between the penalty costs for corners and the deviation from the contour points. Good results are reached with a factor of 20.
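Evaluating the cost of Formula 4 for a candidate polygon can be sketched as follows. Several parts are assumptions: the exact penalty curve exists only as Figure 10, so `toy_penalty` is a hypothetical stand-in that merely reproduces the properties stated in the text (zero at 90 and 180 degrees, local minima at 45 and 135 degrees, values between 0 and 1). The average error term is passed in rather than computed, and the optimization itself (fmincon in the report) is not shown.

```python
import math


def corner_angles(polygon):
    """Angle in degrees (0..180) between the two edges meeting at each vertex."""
    n = len(polygon)
    angles = []
    for i in range(n):
        px, py = polygon[i - 1]
        cx, cy = polygon[i]
        nx, ny = polygon[(i + 1) % n]
        v1 = (px - cx, py - cy)
        v2 = (nx - cx, ny - cy)
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        angles.append(math.degrees(math.atan2(abs(cross), dot)))
    return angles


def toy_penalty(angle):
    """Hypothetical penalty curve, NOT the function from Figure 10."""
    candidates = [
        abs(angle - 90.0) / 45.0,          # zero penalty at 90 degrees
        abs(angle - 180.0) / 45.0,         # zero penalty at 180 degrees
        0.3 + abs(angle - 45.0) / 45.0,    # shallow local minimum at 45
        0.3 + abs(angle - 135.0) / 45.0,   # shallow local minimum at 135
    ]
    return min(1.0, min(candidates))


def shape_cost(polygon, average_error, b=20.0, penalty=toy_penalty):
    """H = average_error + b * sum of corner penalties, as in Formula 4."""
    return average_error + b * sum(penalty(a) for a in corner_angles(polygon))
```

An optimizer would call `shape_cost` repeatedly while moving the vertex coordinates, recomputing the average error for each candidate polygon.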
  • 18. 18 4.5. Building segment merging
Some buildings consist of multiple segments. The region merging algorithm only detects a single segment, since the region merging stops at the edge of a segment. The whole building should be detected, so a method is needed to find the other segments belonging to the same building. First, a contour is made at a fixed distance d from the region contour found in the first region merging step. This is the cyan contour in Figure 11. All regions that occur more than r times in the contour and have a close color similarity to the first segment are considered as a new start region. The Euclidean distance in Lab color space is used as the metric for color similarity, with a maximum distance of 7. These restrictions limit the number of start regions, since the algorithm will try the region-merging step on all these regions. After the optimum merging distance has been found for a given start region, the shape of the region is evaluated with a corner penalty function similar to the one used in Section 4.4.1 (Figure 10). If the average corner penalty is low enough, the segment is added to the building footprint. This check ensures that only segments that are very likely to be building segments are added. Both the maximum color distance and the average corner penalty are set very strictly to prevent the surroundings from being detected as a building segment. The possibilities for detecting building segments without false positives based solely on shape and color are very limited. Building segments can differ in color and can have some corners with a high penalty cost. Height estimation would be a great addition for finding other building segments, since they are higher than the surroundings. This would give an extra constraint, so that the current ones can be loosened.
Figure 11: Segment finding (O initial seed point, + seed point of segment)
  • 19. 19 4.5.1. Combining segments When multiple segments are obtained, the areas of the region-merge results are combined together with a morphology closing method. This is an operation that employs dilation and erosion to merge two segments which are close to each other. Since the orientation of the building on the image is random, a circular element is used for the operation. In Figure 12 both operations are shown. When two areas are close together they will be connected by the dilation operation. The erosion operation will shrink the area back to the original size. Figure 12: Morphological dilation (left) and erosion (right) with a circular element (input blue, output cyan )
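Morphological closing with a circular element can be sketched without any image library by treating each segment as a set of pixel coordinates. This is an illustrative version only: the set-based representation, the function names and the radius value are assumptions, and a real implementation would more likely use an image processing library on binary masks.

```python
def disk_offsets(radius):
    """Pixel offsets of a circular structuring element with the given radius."""
    return [
        (dr, dc)
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
        if dr * dr + dc * dc <= radius * radius
    ]


def dilate(pixels, offsets):
    """Grow the area: every pixel reachable from a set pixel by one offset."""
    return {(r + dr, c + dc) for (r, c) in pixels for (dr, dc) in offsets}


def erode(pixels, offsets):
    """Shrink the area: keep pixels whose whole neighborhood is set."""
    return {
        (r, c)
        for (r, c) in pixels
        if all((r + dr, c + dc) in pixels for (dr, dc) in offsets)
    }


def close_segments(segments, radius):
    """Combine segment areas by dilation followed by erosion (closing)."""
    offsets = disk_offsets(radius)
    union = set().union(*segments)
    return erode(dilate(union, offsets), offsets)
```

Two segments separated by a gap narrower than the element are bridged by the dilation, and the erosion then restores the outer boundary while the bridge remains.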
  • 20. 20 5. Experiments and results
The performance of the detection algorithm is evaluated. A test set of 81 buildings from 11 aerial photos is used to benchmark the algorithm. For all buildings, the footprint is entered manually as a reference. The centroid of the reference footprint is used as the seed point in the detection algorithm, because the result is expected to be independent of the chosen seed point as long as the building is not only partially found. The detection algorithm is run on the complete data set and evaluated with the quality measurements given in Section 5.1. When a building has a low quality measurement, the result is inspected graphically to determine the cause of the weaknesses in the detection algorithm.
5.1. Quality measurement
A quality measurement is needed to evaluate the performance of the detection algorithm. The result of the algorithm is compared with a manually entered reference footprint of the building. The extracted building and the reference footprint are compared pixel by pixel and each pixel is assigned to one of four types [4]:
- True positive ($TP$). Both the manual and the automated method label the pixel as belonging to the building.
- True negative ($TN$). Both the manual and the automated method label the pixel as belonging to the background.
- False positive ($FP$). The automated method incorrectly labels the pixel as belonging to a building.
- False negative ($FN$). The automated method incorrectly labels a pixel that truly belongs to a building as background.
The total number of pixels in each category is determined for the building. With these numbers the following quality measurements can be calculated:
- Completeness: $\frac{TP}{TP + FN}$
- Correctness: $\frac{TP}{TP + FP}$
- Quality: $\frac{TP}{TP + FP + FN}$
The completeness measurement gives the fraction of the building that is detected by the algorithm. The correctness measurement gives the fraction of detected pixels that are actual building pixels. The quality measurement is the best overall performance evaluation method.
To get a high quality measurement the algorithm must correctly label every building pixel without mislabeling any background pixels. The completeness and correctness show whether the false positives or the false negatives have a bigger influence on the quality. This is important for determining the weakest aspect of the detection algorithm.
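The three measurements of Section 5.1 can be computed directly from two binary masks. The sketch below is a minimal illustration, assuming both masks are 2-D lists of equal size in which a truthy value marks a building pixel; the function name `quality_measures` is hypothetical, and returning 0.0 for an empty denominator is a choice made here, not something the report specifies.

```python
def quality_measures(detected, reference):
    """Return (completeness, correctness, quality) for two binary masks."""
    tp = fp = fn = 0
    for det_row, ref_row in zip(detected, reference):
        for det, ref in zip(det_row, ref_row):
            if det and ref:
                tp += 1   # building in both masks
            elif det:
                fp += 1   # detected, but background in the reference
            elif ref:
                fn += 1   # building in the reference, but missed
    completeness = tp / (tp + fn) if tp + fn else 0.0
    correctness = tp / (tp + fp) if tp + fp else 0.0
    quality = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return completeness, correctness, quality
```

True negatives are not counted because none of the three measurements uses them.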
  • 21. 21 6. Results
The detection algorithm was run on the dataset of 81 buildings. The detection results for all the individual buildings are listed in Appendix B. To give an overview of the overall performance, the detection statistics are shown as probability density functions. Afterwards, graphical examples of correctly detected buildings are shown. Finally, examples of incorrectly detected buildings give an overview of the problems encountered.
6.1. Detection statistics
The probability density functions of the correctness, completeness and quality are used to get an overview of the detection results of the system. It follows from Figure 13 that the correctness of the system is very high. For 94 percent of the buildings, the correctness factor is above 95 percent. This means that the percentage of marked building pixels that are actual building pixels is very high. However, a high correctness means little when the completeness factor is low.
Figure 13: Probability density function of the correctness (number of buildings versus correctness percentage)
The completeness distribution shown in Figure 14 is more widely spread. This means that for many buildings not all the building pixels were correctly labeled as such. The completeness factors below eighty percent can be explained by the fact that some buildings are only partly detected. Other reasons for a low completeness are shadows, low building contrast or discolorations on the roof. Examples of all these causes are shown in Section 6.3. Even though there are problems with the completeness, a large part of the distribution still lies in the higher segment: fifty percent of the buildings have a completeness factor above ninety percent.
  • 22. 22 Figure 14: Probability density function of the completeness (number of buildings versus completeness percentage)
The quality probability density function is shown in Figure 15. The quality measurement is a combination of the correctness and the completeness measurement, since it accounts for both the mislabeled building pixels and the missed building pixels. The spread in the quality is caused by the spread in completeness. To improve the quality measurement, the completeness performance needs to be improved. For the quality measurement, 43 percent of the buildings have a factor above ninety percent.
Figure 15: Probability density function of the quality (number of buildings versus quality percentage)
To allow an easy comparison between the different metrics, the probability density functions of the correctness, completeness and quality are combined into one plot in Figure 16. Here it is even more clear that the completeness performance needs to be improved to increase the quality factor.
  • 23. 23 Figure 16: Probability density function of correctness, completeness and quality (percentage of buildings versus measurement percentage)
  • 24. 24 6.2.Examples of correct detections This chapter gives graphical examples of correctly detected buildings to get an overview of which kind of buildings are correctly handled and how the individual steps of the detection algorithm function. In Figure 17 the legend for the detection results is shown. Each step in the detection process is shown as a colored line. For polygons, the vertices are shown by a cross. Figure 17: Legend for the detection results In the left image in Figure 18, building 1 is easily detected due to the high contrast with the surroundings. The Douglas-Peucker algorithm already gives a correct result. The blue region merge result for building 2 and 3 is more tortuous due to the lower contrast of the buildings. The resulting polygon simplification step extracts the correct building model out of the contour. On the right image, two building segments are correctly merged together into a single building. Figure 18: Examples of correctly detected buildings left: dataset A4, right: dataset A5
  • 25. 25 In the left picture of Figure 19, two uncommonly shaped buildings are correctly detected. In the right picture, it can be seen that rectangular objects on the roof do not cause any detection problems.
Figure 19: Examples of correctly detected buildings left: dataset G1, right: dataset J4
In Figure 20, two buildings with uncommon building angles are correctly detected. In the right picture, the polygon simplification filters out the spike of the blue contour that runs into the building. The two buildings also show that the system works on a large variety of building shapes and sizes. The building on the right is 8 times bigger in area than the building on the left.
Figure 20: Examples of correctly detected buildings left: dataset H1, right: dataset J2
  • 26. 26 In Figure 21, a very big, complex building shape is detected correctly. The shape fit step of the detection process adjusts all the angles to exactly 90 degrees.
Figure 21: Examples of correctly detected buildings dataset J1
  • 27. 27 6.3. Examples of incorrect detections
This subchapter gives examples of incorrectly detected buildings. These are buildings with a quality factor below ninety percent. The overview illustrates the weaknesses of the current algorithm. The goal is to give a good impression of all the problem cases that were encountered in the dataset. The reasons for an incorrect detection can be divided into six main categories:
1. Low building to background contrast: When the contrast between the roof and the surroundings is low, it can be impossible to find a correct merging distance.
2. Incorrect segment combining: Detected roof segments are incorrectly combined with each other.
3. Segment finding: Not all segments belonging to a building are found.
4. Shadows or dark patches: Because of color variations on the roof caused by shadow or rain water, a part of the roof is not found.
5. Correctness: In some cases non-building segments are included in the building contour.
6. Multicolor roofs: When a roof consists of multiple colors, the roof is only partly found because the region merging is based on color.
For each category, this chapter has a section that shows the detection results graphically. This gives a clear view of the detection process by showing the output of all its steps. The reason for the false detection is discussed and, when possible, a method for improvement is given.
  • 28. 28 6.3.1. Building to background contrast
The first step of the detection algorithm tries to find the optimum merging distance to merge all regions inside the building contour. When the contrast between the building and the background is low at a certain point in the building contour, the merging distance has to be short. However, a short merging distance might not include all the regions of the building. A segment of a building is almost always a better fit for a low vertex count polygon than a result that includes background regions (for an exception, see the subchapter “Correctness”). Therefore the maximum distance that selects as many building regions as possible without including background regions is chosen. A few examples of this problem are shown in Figure 22. Building 1 is only partly detected due to the low contrast of the building at the right edge. This is also the case for building 3: not all the regions belonging to the building are found, which results in a bad detection. In the right picture, building 7 has a lower contrast to the background than the other, similar buildings. This is due to the road on the right, which almost matches the color of the building. Because of the resulting short merging distance, not all building regions are found. The steps following this first step can’t compensate for so many missing building regions. Buildings with low contrast to the surroundings are fundamentally harder to detect. Detection based on region merging is more sensitive to low contrast situations, because the weakest link around the contour determines the maximum usable merging distance. One possibility is to assist the region merging step with line detection techniques. This will be further discussed in the recommendations section.
Figure 22: Examples of false detected buildings due to low building to background contrast. (dataset I)
  • 29. 29 6.3.2. Segment combining Segments are combined with the morphological closing operation. This is a very simple operation that uses dilation followed by erosion to connect separate areas with each other. The downside of this operation is that inner corners of the segments are rounded (Figure 23). This limits the maximum distance between segments that can be merged: when the circle used for the morphological closing is large, the unwanted rounding has a great impact on the building shape. To minimize the rounding effect of the closing operation, the size of the circular element is kept small. However, this gives rise to other problems in some situations. For example, when the gap between two segments is too large, the segments are not completely connected, which can be seen in Figure 24. Another problem is visible in the polygon (red line) of both buildings in Figure 24. The detected segments have parallel lines that are near each other but not aligned. This results in an approximation of the two individual lines by one line running diagonally through both of them. Segment merging is currently performed before the polygon simplification step, but the result is expected to be better when this order is reversed: first, a polygon is determined for each individual segment, and afterwards the separate polygons are merged into one building polygon. This option will be further discussed in the recommendations section.
Figure 23: Morphological closing
Figure 24: Examples of falsely detected buildings due to segment combining. Left: dataset J2, right: dataset A2
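The trade-off between bridging gaps and rounding inner corners can be reproduced with a minimal sketch of morphological closing on sets of pixel coordinates. This is an illustration under assumptions, not the report's MATLAB code: the pixel sets, the helper names and the block sizes are hypothetical.

```python
def disk(radius):
    """Offsets (dy, dx) of a discrete circular structuring element."""
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= radius * radius]


def dilate(pixels, element):
    """Add every pixel that the element reaches from a foreground pixel."""
    return {(y + dy, x + dx) for (y, x) in pixels for (dy, dx) in element}


def erode(pixels, element):
    """Keep only pixels whose whole element neighbourhood is foreground."""
    return {(y, x) for (y, x) in pixels
            if all((y + dy, x + dx) in pixels for (dy, dx) in element)}


def close(pixels, radius):
    """Morphological closing: dilation followed by erosion."""
    element = disk(radius)
    return erode(dilate(pixels, element), element)


def block(rows, cols):
    """Axis-aligned rectangle of pixels, given as two ranges."""
    return {(y, x) for y in rows for x in cols}


def gap_is_bridged(gap, radius):
    """Two 10x10 segments separated by `gap` empty columns: is the middle
    row of the gap completely filled after closing?"""
    segments = block(range(10), range(10)) | block(range(10), range(10 + gap, 20 + gap))
    closed = close(segments, radius)
    return all((5, x) in closed for x in range(10, 10 + gap))


# L-shaped segment: closing fills the pixel in its inner corner.
l_shape = block(range(10), range(4)) | block(range(6, 10), range(10))
rounded = close(l_shape, 1) - l_shape
```

With these toy segments a radius of 1 bridges a gap of two pixels but not a gap of four, while a radius of 2 bridges the wider gap. The same operation also adds the pixel in the inner corner of the L-shaped segment, which is the rounding effect described above.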
  • 30. 30 6.3.3. Segment finding The segment finding step searches for similarly colored regions within a short distance of the initially found segment. The distance is relatively small and cannot be increased, since the current segment merging algorithm is limited to merging segments close to each other, as explained in the previous subchapter. Figure 25: Examples of falsely detected buildings due to unfound segments. Dataset J2 Due to the small search distance for additional segments, a lot of building segments are not found, especially in large buildings. Examples of buildings with undetected segments are shown in Figure 25 and Figure 26. The main reason for the unfound segments is the short search distance. The aerial photos are captured at an angle with the ground. When two segments do not have exactly the same height, the side view of the walls lies in between the two segments. A difference in height can also cast a shadow on the adjacent segment. Both effects require a larger search distance for segments than is currently possible due to the limitation of the merging step. Therefore a new method to merge building segments is needed. Ideas for a better method will be discussed in the recommendations section.
  • 31. 31 Figure 26: Examples of falsely detected buildings due to unfound segments. left/top: dataset F, bottom: dataset B
  • 32. 32 6.3.4. Multi-color roofs Flat roofs are not always uniformly colored. Since the region-merging algorithm is solely based on color differences, it is unable to correctly detect multi-colored roofs. Examples of this problem are shown in Figure 27 and Figure 28. The only way to solve this problem is with additional height information that can link segments with a different color based on a similar height. At the moment radar height information is not available. Extracting height information based on perspective might be a possibility. Figure 27: Examples of falsely detected buildings due to multi-color roofs. Left: dataset H2, right: dataset A6 Figure 28: Examples of falsely detected buildings due to multi-color roofs. Dataset A1
  • 33. 33 6.3.5. Correctness The overall correctness of the building detection algorithm is very high. Only three buildings out of 81 have a correctness factor below 80 percent (entries 36, 44 and 56 in Appendix B); all three are discussed in this section. In Figure 29, the shadow of the building is merged into the building contour. This is caused by a combination of three factors:
1. A large number of building segments due to a line grid on the roof.
2. Low contrast between roof and shadow.
3. Straight edges on the shadow of the building.
A large merging distance is required to merge all of the small building segments. However, due to the low contrast of the building with its shadow, the algorithm cannot merge all the building regions without including the shadow. Since a subset of the building segments cannot be correctly approximated by a polygon with a low vertex count, the detection includes the shadow because of its long straight lines. The shadow of the building has the same shape property as the building itself and is therefore incorrectly seen as part of it. Figure 29: Examples of falsely detected buildings due to wrongly merged regions. Dataset J1
  • 34. 34 The low correctness factor of the other two buildings is caused by incorrectly detected segments, see Figure 30. For the left building, the algorithm incorrectly added a segment. In the manual reference this segment was entered as a separate building because of the difference in orientation and the objects on the roof. The segment is, however, attached to the building and has a similar color. A clear definition to verify that a segment belongs to a building is hard to formulate, which makes the reference choice debatable. The right building has merged a segment which clearly does not belong to it. The segments look very different, but the mean color of the striped segment is exactly the same as the mean color of the other segment. An additional comparison of the color distribution of both segments would solve this incorrect match. Figure 30: Examples of falsely detected buildings due to an incorrect building segment. Left: dataset H2, right: dataset H1
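The suggested comparison of color distributions can be sketched as follows. The pixel values, the tolerances and the use of the standard deviation as the distribution measure are assumptions chosen for illustration; a histogram comparison would serve the same purpose.

```python
from statistics import mean, pstdev

# Hypothetical gray values: a striped segment and a uniform segment
# that share the same mean color.
striped_segment = [40, 160] * 50
uniform_segment = [98, 100, 102] * 30


def similar_by_mean(a, b, mean_tolerance):
    """Match on mean color only (assumed form of the current criterion)."""
    return abs(mean(a) - mean(b)) <= mean_tolerance


def similar_by_distribution(a, b, mean_tolerance, spread_tolerance):
    """Additionally require a similar spread of the color values."""
    same_mean = similar_by_mean(a, b, mean_tolerance)
    same_spread = abs(pstdev(a) - pstdev(b)) <= spread_tolerance
    return same_mean and same_spread


mean_only_match = similar_by_mean(striped_segment, uniform_segment, 5)
distribution_match = similar_by_distribution(striped_segment, uniform_segment, 5, 10)
```

The mean-only test accepts the striped segment, while the additional spread test rejects it.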
  • 35. 35 6.3.6. Shadows / Dark patches Shadows from nearby objects like trees, chimneys and towers can cause color differences on the roof. When the contour of the shadow has straight lines, the algorithm may select a merging distance that excludes the shadowed part from the building roof, because the remaining region can still be approximated well by a polygon with a low vertex count. Examples of this problem are shown below in Figure 31. The shadow in the left picture is caused by a tree right next to the building. In the right picture the tower of a church casts a shadow over the roof. Figure 31: Examples of falsely detected buildings due to shadows on the roof. Left: dataset H2, right: dataset A3 Flat roofs sometimes have strong discoloration on the edges due to rain water. These dark patches are visible on buildings two and three in Figure 32. The polygon simplification step compensates for the missing corners in building three (red). The shape fit method further improves the detection by preferring perpendicular angles (yellow). Patches that are small in relation to the building contour are normally fixed by the polygon simplification step, as can also be seen in Figure 26 and Figure 27. For building two, this correction cannot be made because the contour misses a whole part of the building. A close look at the picture shows that the region merging could grow around the dark patch. However, the top regions of the building are so light that the required merging distance would also include non-building regions. Therefore this faulty detection is also partly caused by low building contrast.
  • 36. 36 Figure 32: Examples of falsely detected buildings due to dark patches on the roof. Dataset C (roofs 2 and 3)
  • 37. 37 7. Recommendations Despite the accurate performance in some cases, the current detection algorithm has some weaknesses. In this chapter, recommendations are given to address these weaknesses and improve the detection results. These ideas followed from analysis of the detection results. 7.1. Merging distance Since there is a metric to qualify the result of the region merging step, more merging criteria could be evaluated. Currently, only the threshold on the Euclidean distance in Lab color space is varied to find the optimum merging distance. Occasionally, no threshold gives a good result for the region merging, because not all building regions can be merged without including some of the background regions. The search for an optimal merging distance could be extended by using various definitions of this distance. One of these alternative definitions might yield a distance that does merge all building regions without including background. A different definition could mean a different color space or different weights for the components of the Lab color space. Another possibility is an asymmetrical distance, where a value above the mean is treated differently than a value below the mean. Multiple definitions of the merging distance will increase the time spent in the region-merging step significantly, because multiple values have to be tried for every definition. However, it will help to get the maximum result out of the region merging step in the algorithm. 7.2. Graph weight function The graph weight function is meant to calculate the difference in area between the line approximation and the contour. The difference is expressed in terms of area because mistakes of the region-merge step need to be corrected: the polygon approximation tries to find the best solution by finding a polygon with a low vertex count that minimizes the area difference. 
The sum of distances between all contour points and the polygon line is a good approximation of the area when the distances between the points and the line are small. When these distances become larger and the area extends mainly perpendicular to the line, this is no longer valid. An example is shown in Figure 33. In the left figure the sum of the distances of all points on the blue line is a good approximation of the area difference. In the right figure this is not the case. The area difference should be small since the area has a very small width, but the sum of the distances of all points on the blue line is large and therefore not a good approximation.
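The difference between the two measures can be made concrete with a small sketch. It assumes contour points spaced roughly one pixel apart and a contour that stays on one side of the approximation line, so that the enclosed polygon is simple and the shoelace formula applies directly. The coordinates and function names are hypothetical and do not come from the report's scripts.

```python
import math


def distance_to_line(point, start, end):
    """Perpendicular distance from a point to the line through start and end."""
    (px, py), (x1, y1), (x2, y2) = point, start, end
    cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
    return abs(cross) / math.hypot(x2 - x1, y2 - y1)


def sum_of_distances(contour, start, end):
    """Graph weight as the sum of point-to-line distances."""
    return sum(distance_to_line(p, start, end) for p in contour)


def enclosed_area(contour, start, end):
    """Shoelace area of the polygon formed by the contour and the line.
    Valid only for a simple (non self-intersecting) polygon."""
    vertices = list(contour) + [end, start]
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


start, end = (0.0, 0.0), (10.0, 0.0)

# Contour running close to the line: both measures agree.
shallow = [(float(x), 0.5) for x in range(11)]

# Narrow spike perpendicular to the line: the measures diverge.
spike = ([(float(x), 0.0) for x in range(5)]
         + [(4.9, float(y)) for y in range(1, 11)]
         + [(5.1, float(y)) for y in range(10, 0, -1)]
         + [(float(x), 0.0) for x in range(6, 11)])

shallow_sum = sum_of_distances(shallow, start, end)
shallow_area = enclosed_area(shallow, start, end)
spike_sum = sum_of_distances(spike, start, end)
spike_area = enclosed_area(spike, start, end)
```

For the shallow contour the sum of distances (5.5) is close to the enclosed area (5.0). For the narrow spike the sum of distances is about 110 while the enclosed area is only about 2.9, so the distance sum overestimates the area difference by a large factor.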
  • 38. 38 Figure 33: Examples of a good (left) and a bad (right) situation when calculating the graph weight A better method would be to calculate the real area of the closed polygon constructed from the contour segment (blue line) and the approximation (red line). The resulting polygon is complex because it intersects itself. Most standard polygon functions, including area calculating functions, do not give the expected results on complex polygons. Therefore the complex polygon needs to be converted to one or multiple simple polygons first. Then the area of the simple polygons can be determined to calculate the exact area difference. This calculation will probably be more computationally intensive, but this might be compensated by simplifying the blue contour first. 7.3. Segments The biggest weakness of the current detection algorithm is the segment finding. It accounts for most of the low completeness results within our dataset. The problem with the current segment finding lies mainly in the segment merging. Due to limitations in the merging step, the search distance has to be kept small, whereas a larger search distance is required to find all segments. Therefore, a different segment merging method is needed. Currently the segment merging is performed by connecting the different areas by morphological closing. Then, the polygon simplification step is applied to the combined area. Applying the polygon simplification step to each building segment individually would be a better solution. Once the polygon for each segment is determined, the segment polygons need to be combined into one building polygon. This will be more complex than the current merging method, but will allow larger distances between building segments. When the search distance for segments is increased, a lot more segment candidates will be found. This also implies that more candidates will be falsely detected, increasing the chance of incorrectly marking a false candidate as a building segment. 
Height information that can compare the estimated height between regions would help to distinguish between correctly and falsely found segments. Furthermore, height information would make it possible to detect multi-colored roofs correctly.
  • 39. 39 7.4. Low contrast buildings Buildings with low contrast are more difficult to detect. With a region-merging based algorithm the weakest contrast around the building contour determines the overall contrast. A very clear building with a small section of low contrast will therefore not be detected correctly. Line detection could assist in these problematic cases. Lines detected in the image could be used as an aid to increase the contrast in the low contrast sections of the building contour. 7.5. Shadows Shadows cause problems since they cause color differences on building segments, which prevents the merging algorithm from finding all the segments. The opposite can also occur: a non-building segment in shadow matches the color of the building. Shadows have specific color properties, therefore shadow regions can be assessed based on this information. The shadow presence indicator based on the YCbCr color space defined by Tsai [16] is used: $S = \frac{C_r + 1}{Y + 1}$ This indicator is applied to two problem cases with shadows in Figure 34. In the first case the shadow on the roof caused by the church tower is clearly defined. This makes it possible to treat shadow regions differently. In the second case the shadow presence indicator is of no use. The color of the building matches the shadowed background. Since it has the same color value, the building will also have a high value on the shadow indicator. This makes a distinction between building and non-building segments impossible based on the indicator. A more complex detection method will be needed that detects shadow not solely based on color.
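A minimal sketch of the shadow presence indicator is given below. The RGB to YCbCr conversion uses the full-range BT.601 (JFIF) coefficients, which is an assumption because the report does not state which YCbCr variant was used. The pixel values are hypothetical.

```python
def rgb_to_y_cr(r, g, b):
    """Luma Y and red-difference chroma Cr for 8-bit RGB values
    (full-range BT.601 / JFIF coefficients, assumed here)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cr


def shadow_indicator(r, g, b):
    """Shadow presence indicator S = (Cr + 1) / (Y + 1)."""
    y, cr = rgb_to_y_cr(r, g, b)
    return (cr + 1.0) / (y + 1.0)


# Hypothetical pixel colors for the two cases discussed in the text.
sunlit_roof = shadow_indicator(200, 200, 200)
shadow_on_roof = shadow_indicator(40, 45, 60)
dark_roof = shadow_indicator(45, 45, 50)
```

A bright roof pixel gives a low indicator value and a shadowed pixel a high one, which corresponds to the first case. A roof that is itself dark gives almost the same value as the shadow, which corresponds to the second case where the indicator cannot separate building from shadowed background.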
  • 40. 40 Figure 34: Shadow presence indicator applied to buildings where shadow is a problem 7.6. Shape detection optimizer In the current shape detection step, the default MATLAB optimizer (fmincon) is used. The cost function explained in Section 4.4 is minimized with the coordinates of the polygon points as optimization variables. This works correctly for simple cases where the angles need to be adjusted slightly. For more complex situations the results have not yet been verified: the optimizer could find a local minimum or might not converge.
  • 41. 41 8. Conclusion This report has described a semi-automatic algorithm to detect buildings with a flat roof on very-high-resolution aerial photos. The algorithm uses a region-based detection approach that automatically finds the optimal region-merge distance. This is performed by exploiting the fact that a building can be represented by a polygon with a low vertex count. A robust polygon simplification method is employed to convert the building region to a building model. This is accomplished by applying the shortest path algorithm to a graph that is constructed from the region. A shape detection step modifies the angles of the model to values which are most likely to occur in roof shapes. The algorithm was evaluated on a test set of 81 buildings. The correctness factor of the system is very high: for 94% of the buildings at least 95% of the marked pixels belong to the roof. The weak point of the system is the completeness factor, which represents the percentage of building pixels that are detected. The main reason for not detecting all the roof pixels is that locating all segments belonging to a building is difficult. The segment merging method only merges segments close to each other, which limits the search range. A more complex segment merging method is required that can correctly connect segments at a greater distance without distorting inside corners. Increasing the search range greatly increases the number of new segment candidates, therefore additional criteria like height estimation might be useful. The detection algorithm performs reasonably well, with a quality factor above 90% for 48% of the buildings in the dataset. There is however still a lot of room for improvement. Recommendations have been given to solve the problem cases in the used dataset.
  • 42. 42 9. References
[1] Hazelhoff, L., & de With, P. H. N. (2011). Localization of buildings with a gable roof in very-high-resolution aerial images.
[2] Lin, C., & Nevatia, R. (1998). Building Detection and Description from a Single Intensity Image.
[3] Zaum, D. W. (2005). Robust building detection in aerial images.
[4] Jin, X., & Davis, C. H. (2005). Automated Building Extraction from High-Resolution Satellite Imagery in Urban Areas Using Structural, Contextual, and Spectral Information. EURASIP Journal on Applied Signal Processing.
[5] Hai-yue, L. I., Hong-qi, W., & Chi-biao, D. (2006). A New Solution of Automatic Building Extraction in Remote Sensing Images.
[6] Wei, Y., Zhao, Z., & Song, J. (2008). Urban building extraction from high-resolution satellite panchromatic image using clustering and edge detection.
[7] Sirmacek, B., & Unsalan, C. (2008). Building Detection from Aerial Images using Invariant Color Features and Shadow.
[8] Katartzis, A., & Sahli, H. (2008). A Stochastic Framework for the Identification of Building Rooftops Using a Single Remote Sensing Image.
[9] Liu, Z., Cui, S., & Yan, Q. (2008). Building Extraction from High Resolution Satellite Imagery Based on Multi-scale Image Segmentation and Model Matching. Earth Observation and Remote Sensing.
[10] Karantzalos, K., & Paragios, N. (2009). Recognition-Driven Two-Dimensional Competing Priors Toward Automatic and Accurate Building Detection.
[11] Kabolizade, M., Ebadi, H., & Ahmadi, S. (2010). An Improved Snake Model for Automatic Extraction of Buildings from Urban Aerial Images and LiDAR Data Using Genetic Algorithm.
[12] Pakizeh, E., & Palhang, M. (2010). Building Detection from Aerial Images Using Hough Transform and Intensity Information.
[13] Izadi, M., & Saeedi, P. (2010). Automatic Building Detection in Aerial Images Using a Hierarchical Feature Based Image Segmentation. 2010 20th International Conference on Pattern Recognition, 472–47
  • 43. 43
[14] Tomasi, C., & Manduchi, R. (1998). Bilateral Filtering for Gray and Color Images. Proceedings of the 1998 IEEE International Conference on Computer Vision, Bombay, India.
[15] Wang, O., Lodha, S. K., & Helmbold, D. P. (2006). A Bayesian Approach to Building Footprint Extraction from Aerial LIDAR Data. Third International Symposium on 3D Data Processing, Visualization, and Transmission.
[16] Tsai, V. J. D. (2006). A comparative study on shadow compensation of color aerial images in invariant color models. IEEE Transactions on Geoscience and Remote Sensing, 44(6), 1661–1671.
  • 44. 44 10. Appendix A: Matlab scripts
Test Framework
enter_roof_data.m: Script to manually enter building contours for reference.
show_roof_data.m: Shows the entered reference building contours.
test_framework.m: Runs the building localization algorithm on all images in the test set and saves the results.
show_result.m: Shows the contours of all steps of the detection algorithm.
show_table.m: Groups all the detection results in a table.
Roof detection
bfilter2.m: Bilateral filter for RGB images.
detect_building.m: The main function of the roof detection algorithm.
DisplayWatershedRegions.m: Displays the watershed result of the image with the mean region color for each region.
imRAG.m: Builds the region adjacency graph for the watershed image.
line_eq.m: Calculates the line equation for the line crossing two points.
linortfit2.m: Fits a line to data by orthogonal least-squares (2 dimensional).
linortfitn.m: Fits a line to data by orthogonal least-squares (N dimensional).
polygon_corner_penalty.m: Calculates the corner penalty of a polygon based on the defined corner cost function.
polygon_fit.m: Polygon simplification by calculating the shortest path of a weighted graph as explained in Chapter 6.
region_merge.m: Region merging of the segmented image starting with a seed region.
shape_fit.m: Shape detection as explained in Chapter 7.
watershed_regions.m: Performs the watershed algorithm.
  • 45. 45 11. Appendix B: Detection results
ID   Dataset  Building ID  Completeness  Correctness  Quality  Comment
1    A1       1            0.541         1.000        0.541    Building with multicolor segments
2    A1       2            0.605         0.996        0.603    Building with multicolor segments
3    A2       1            0.971         0.970        0.943
4    A3       1            0.640         0.959        0.623    Shadow on roof
5    A3       2            0.879         0.984        0.866
6    A4       1            0.987         0.999        0.987
7    A4       2            0.904         0.983        0.890    Low contrast on building edge
8    A4       3            0.973         1.000        0.973
9    A5       1            0.982         1.000        0.982
10   A5       2            0.982         0.948        0.932
11   A6       1            0.668         0.993        0.665    Building with multicolor segments
12   A6       2            0.989         0.970        0.960
13   B        1            0.813         1.000        0.813    Shadow on building segment
14   B        2            0.953         0.982        0.937
15   B        3            0.609         0.953        0.591    Shadow on building segment
16   C        1            0.924         0.962        0.891    Segment merging limitation
17   C        2            0.583         1.000        0.583    Dark patches on roof
18   C        3            0.904         0.990        0.896    Shadow on roof
19   C        4            0.909         1.000        0.909
20   D        1            0.762         0.991        0.756    Building with multicolor segments
21   E        1            0.974         1.000        0.974
22   F        1            0.515         0.949        0.501    Multi color roof. Shadow on segment and search not far enough
23   F        2            0.084         0.932        0.084    Multi color roof/segment search not far enough
24   F        3            0.107         1.000        0.107    Edges inside the roof contour
25   G1       1            0.960         0.986        0.947
26   G1       2            0.121         1.000        0.121    Segment search not far enough
27   G1       3            0.272         1.000        0.272    Multi color roof
28   G1       4            0.916         0.965        0.886
29   G1       5            0.970         0.991        0.962
30   G2       1            0.803         0.981        0.790    Shadow on segment
31   G2       2            0.379         1.000        0.379    Multi color roof, rails near roof edge
32   H1       1            0.953         0.994        0.948
33   H1       2            0.975         0.990        0.966
  • 46. 46
34   H1       3            0.955         0.970        0.927
35   H1       4            0.934         0.986        0.922
36   H1       5            0.978         0.724        0.712    False detected segment
37   H1       6            0.817         0.979        0.803    Segment not detected
38   H1       7            0.505         0.994        0.504    Segment not detected
39   H1       8            0.963         0.980        0.945
40   H1       9            0.980         0.973        0.954
41   H2       1            0.278         0.996        0.277    Multi color roof
42   H2       2            0.824         0.985        0.814    Dark patches on roof
43   H2       3            0.906         0.991        0.899
44   H2       4            0.938         0.774        0.737    Shadow on roof/false segment
45   H2       5            0.558         0.983        0.553    Shadow on roof
46   H2       6            0.962         0.996        0.958
47   I        1            0.427         0.988        0.425    Low contrast on building edge
48   I        2            0.981         0.924        0.908
49   I        3            0.319         0.927        0.311    Low contrast on building edge
50   I        4            0.864         0.982        0.850    Low contrast on building edge
51   I        5            0.846         0.999        0.845    Low contrast on building edge
52   I        6            0.859         0.998        0.858    Low contrast on building edge
53   I        7            0.408         1.000        0.408    Low contrast on building edge
54   I        8            0.488         0.836        0.445    Low contrast on building edge
55   J1       1            0.977         0.999        0.976
56   J1       2            0.900         0.718        0.665    Shadow included in region merging
57   J2       1            0.749         0.955        0.724    Multi color roof
58   J2       2            0.469         0.973        0.463    Multi color roof
59   J2       3            0.459         0.978        0.455    Shadow on segment
60   J2       4            0.811         0.985        0.801    Multi color roof
61   J2       5            0.901         0.988        0.891    Shadow on segment
62   J2       6            1.000         0.948        0.948
63   J2       7            0.963         0.996        0.959
64   J3       1            0.932         0.999        0.932
65   J3       2            0.963         0.998        0.961
66   J3       3            0.987         0.974        0.962
67   J3       4            0.814         0.991        0.808    Shadow on segment
68   J3       5            0.972         0.990        0.963
69   J3       6            0.930         0.998        0.928
70   J3       7            0.527         1.000        0.527    Segment not detected
71   J3       8            0.766         1.000        0.766    Segment not detected
72   J4       1            0.981         0.990        0.971
73   J4       2            0.974         0.992        0.967
  • 47. 47
74   J4       3            0.986         0.992        0.978
75   J4       4            0.964         0.992        0.956
76   J4       5            0.961         0.998        0.959
77   J5       1            0.731         0.986        0.724    Pipes on roof
78   J5       2            0.639         0.997        0.638    Solar panels/rails on roof
79   J6       1            0.943         0.992        0.935
80   J6       2            0.490         0.998        0.490    Dark patches on roof
81   K        1            0.833         0.974        0.815    Multi color roof
Average                    0.776         0.973        0.758
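The three columns of the table are related. Assuming the common pixel-based definitions completeness = TP / (TP + FN), correctness = TP / (TP + FP) and quality = TP / (TP + FP + FN), which are an assumption here because the definitions are not repeated in this appendix, the quality factor follows from the other two. A short sketch that checks this against a few rows of the table:

```python
def quality(completeness, correctness):
    """Quality factor derived from completeness and correctness, assuming
    completeness = TP/(TP+FN), correctness = TP/(TP+FP) and
    quality = TP/(TP+FP+FN)."""
    return 1.0 / (1.0 / completeness + 1.0 / correctness - 1.0)


# (completeness, correctness, reported quality) copied from the table above.
rows = [
    (0.971, 0.970, 0.943),  # ID 3
    (0.978, 0.724, 0.712),  # ID 36
    (0.488, 0.836, 0.445),  # ID 54
    (0.900, 0.718, 0.665),  # ID 56
]

max_deviation = max(abs(quality(c, r) - q) for c, r, q in rows)
```

For these rows the derived quality agrees with the reported value to within the rounding of the table entries.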