Lecturer, Computer Science and Information Technology, Mid-West University, Nepal, at Graduate School of Science and Technology, Birendranagar, Surkhet, Nepal
Spatial Data
Spatial data is data related to space.
Data that pertains to the space occupied by objects.
Data that define a location.
It takes the form of graphic primitives, usually points,
lines, polygons, or pixels.
Spatial data includes location, shape, size and orientation.
For example: consider a particular square:
• Its center ( the intersection of its diagonals ) specifies its location
• Its shape is square
• The length of one of its sides specifies its size
• The angle its diagonals make with, say, the x-axis specifies its orientation.
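The square example can be sketched in a few lines of Python. This is only an illustration: the function name `describe_square` and the convention that the four corners are listed in order around the square are assumptions, not part of the original material.

```python
import math

def describe_square(vertices):
    """Return location, shape, size, and orientation of a square.

    `vertices` is a list of four (x, y) corners listed in order around
    the square (an assumed input convention for this sketch).
    """
    (x0, y0), (x1, y1), (x2, y2), _ = vertices
    # Location: the diagonals intersect at the midpoint of opposite corners.
    center = ((x0 + x2) / 2.0, (y0 + y2) / 2.0)
    # Size: the length of one side.
    side = math.hypot(x1 - x0, y1 - y0)
    # Orientation: angle of the diagonal (corner 0 to corner 2) with the x-axis.
    orientation = math.degrees(math.atan2(y2 - y0, x2 - x0))
    return {
        "location": center,
        "shape": "square",
        "size": side,
        "orientation": orientation,
    }

unit_square = describe_square([(0, 0), (1, 0), (1, 1), (0, 1)])
```

For the unit square the location is its center, the size is the side length 1, and the diagonal makes a 45-degree angle with the x-axis.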
The space of interest can be, for example:
• The two-dimensional abstraction of (parts of) the surface
of the earth, that is, geographic space (the most
prominent example),
• A man-made space like the layout of a VLSI design,
• A volume containing a model of the human brain, or
• Another 3D space representing the arrangement of
chains of protein molecules.
Non-Spatial Data
Non-spatial data (also called attribute or
characteristic data) is information that is
independent of all geometric considerations.
For example: a person's height, mass, and age are
non-spatial data because they are independent of
the person's location.
It's interesting to note that, while mass is non-spatial
data, weight is spatial data in the sense that
something's weight is very much dependent on its
location.
Spatial Data
Two types of spatial data are particularly important:
• Computer-aided-design (CAD) data, which includes spatial
information about how objects such as buildings, cars, or
aircraft, are constructed. Other important examples of
computer-aided-design databases are integrated-circuit and
electronic-device layouts.
• Geographic data such as road maps, land-usage maps,
topographic elevation maps, political maps showing boundaries,
land-ownership maps, and so on. Geographic information
systems are special-purpose databases tailored for storing
geographic data.
How Are Spatial Data Organized?
Coordinates are used to specify location of geographic objects in either
two or three dimensional space. The coordinates can be specified as (x,y)
in 2D or (x,y,z) in 3D or spherical coordinates (latitude, longitude).
Discrete geographic features like points, lines and polygons can be used
to represent different types of objects. A point might be a house address, a
line might be a road, and a polygon might be a land parcel or a building
footprint. These are also known as vector data types.
Continuous geographic features describe phenomena that exist
continuously across the landscape. Examples include elevation, temperature,
relative humidity, gravity, wind, atmospheric pressure and so on. These are
considered raster data types.
The features can also be summarized by a geographic area. Examples
include population, socio-economic characteristics and other demographic
information.
Properties of Spatial Data
There are four main properties of the spatial data that set it apart from
traditional relational data.
i. Geometry
ii. Distribution of Objects in Space
iii. Temporal Changes
iv. Data Volume
Geometry
Geometry deals with the mathematical properties of an
object. These properties include measurement (metric),
relationships of points, lines, angles, surfaces, and solids
(topology), and order.
A simple geometry is usually constructed from geometric
primitives such as points, lines, curves, and areas.
Complex geometries are constructed from collections of
simple geometries. In addition, there are a number of
geometric relationships between geometries that are
important in handling spatial data.
Distribution of Objects in Space
Usually spatial objects are very irregularly distributed in
space.
Consider the case where we model the town halls of all the
cities in the United States as spatial objects (points).
The distribution of cities on the east coast is very dense
compared to the distribution of cities in Arizona and Nevada,
which is sparse. In addition, different objects have largely
varying extents.
Temporal Changes
Spatial data often has an associated temporal
property.
An example is a navigation system that helps
travelers find directions from place A to B in a major
city.
If there is an accident and some road is temporarily
closed, the system has to incorporate this new data
and recompute a suitable path from point A to B.
Data Volume
Several GIS applications deal with very large
databases of the order of terabytes.
For example, remote sensing applications gather
terabytes of data from satellites every day.
Similarly, data warehousing applications and NASA's
Earth Observing System are other examples of systems
with terabytes of spatial data.
Spatial Data Types
Spatial data types are special data types necessary to model
geometry and to suitably represent geometric data in database
systems.
The basic types are point, line, and region; more complex types
include partitions (maps) and graphs (networks).
Conceptual examples: points, lines, rectangles, surfaces, volumes, etc.
Physical examples: cities, rivers, roads, states, crop coverage, mountain
ranges, etc.
Spatial data types provide a fundamental abstraction for modeling
the geometric structure of objects in space, their relationships,
properties and operations.
Spatial Database
A spatial database system is a full-fledged database system with
additional capabilities for handling spatial data.
A spatial database system is a database system that:
• Offers spatial data types in its model and query language.
• Supports spatial data types in its implementation, providing at least
spatial indexing and efficient algorithms for spatial join.
Spatial data types, e.g. POINT, LINE, REGION, provide a
fundamental abstraction for modeling the structure of geometric
entities in space as well as their relationships (l intersects r),
properties (area(r) > 1000), and operations (intersection(l, r) –
the part of l lying within r).
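The property `area(r)` and a simple containment relationship can be sketched for a REGION stored as a list of vertices. This is a minimal illustration, assuming a simple polygon without holes; the function names and the sample parcel are hypothetical, and a real spatial DBMS exposes such operations through its own type system.

```python
def area(region):
    """Area of a simple polygon given as (x, y) vertices (shoelace formula)."""
    total = 0.0
    n = len(region)
    for i in range(n):
        x1, y1 = region[i]
        x2, y2 = region[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def contains_point(region, point):
    """Ray-casting test; points exactly on the boundary are not treated specially."""
    px, py = point
    inside = False
    n = len(region)
    for i in range(n):
        x1, y1 = region[i]
        x2, y2 = region[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

# Hypothetical region: a 40 x 30 rectangular parcel.
parcel = [(0, 0), (40, 0), (40, 30), (0, 30)]
is_large = area(parcel) > 1000          # the property area(r) > 1000
has_well = contains_point(parcel, (10, 10))
```

The predicate `area(parcel) > 1000` corresponds to the kind of property condition a spatial query language would let a user write directly.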
Spatial databases
In general, a spatial database stores objects that have spatial
characteristics that describe them and that have spatial
relationships among them.
The spatial relationships among the objects are important, and
they are often needed when querying the database.
A spatial database is optimized to store and query data related to
objects in space, including points, lines and polygons.
Whereas typical databases process numeric and character data,
additional functionality needs to be added for databases to
process spatial data types.
Queries posed on these spatial data, where predicates for selection
deal with spatial parameters, are called spatial queries.
For example, a query such as “List all the customers located within
twenty miles of company headquarters” will require the processing
of spatial data types.
Effectively, each customer will be associated with a <latitude,
longitude> position.
A traditional B+-tree index based on customers’ zip codes or other
non-spatial attributes cannot be used to process this query since
traditional indexes are not capable of ordering multidimensional
coordinate data.
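The customer query can be sketched as a brute-force scan that computes a great-circle distance for every customer. This only illustrates what the query means, not how a spatial DBMS would evaluate it; the customer names, coordinates, and the approximate Earth radius are assumptions made for the example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumed constant)

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two <latitude, longitude> positions."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def customers_within(customers, headquarters, radius_miles):
    """Return names of customers located within `radius_miles` of headquarters."""
    hq_lat, hq_lon = headquarters
    return [
        name
        for name, (lat, lon) in customers.items()
        if haversine_miles(lat, lon, hq_lat, hq_lon) <= radius_miles
    ]

# Hypothetical data: one customer about 7 miles away, one about 69 miles away.
headquarters = (40.0, -75.0)
customers = {"near": (40.1, -75.0), "far": (41.0, -75.0)}
nearby = customers_within(customers, headquarters, 20)
```

Every customer is examined here, which is exactly the cost a spatial index is meant to avoid.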
Why Spatial databases?
Applications of spatial data initially stored data as files in a file
system, as did early-generation business applications.
But as the complexity and volume of the data, and the number of
users, have grown, ad hoc approaches to storing and retrieving
data in a file system have proved insufficient for the needs of
many applications that use spatial data.
Spatial-data applications require facilities offered by a database
system— in particular, the ability to store and query large
amounts of data efficiently.
Therefore, there is a special need for databases tailored for
handling spatial data and spatial queries.
Spatial data support in databases is important for efficiently
storing, indexing, and querying of data on the basis of spatial
locations.
Efficient processing of the above query requires special-purpose
index structures such as R-trees.
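R-trees organize objects by their minimum bounding rectangles (MBRs). The sketch below is not an R-tree; it shows only the two MBR primitives such an index relies on, computing a bounding rectangle and testing two rectangles for overlap as a cheap filter before any exact geometry test. The function names and sample geometries are hypothetical.

```python
def mbr(points):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

def mbr_overlap(a, b):
    """True if two rectangles (xmin, ymin, xmax, ymax) share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

# Hypothetical geometries: a road polyline and two query windows.
road = [(1, 1), (4, 2), (6, 5)]
road_box = mbr(road)
window_hit = (5, 4, 8, 8)
window_miss = (7, 0, 9, 1)
candidates = [w for w in (window_hit, window_miss) if mbr_overlap(road_box, w)]
```

Objects whose rectangles do not overlap the query window can be discarded without looking at their full geometry.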
Spatial databases incorporate functionality that provides
support for databases that keep track of objects in a
multidimensional space.
For example, cartographic databases that store maps include
two-dimensional spatial descriptions of their objects—from
countries and states to rivers, cities, roads, seas, and so on.
Other databases, such as meteorological databases for weather
information, are three-dimensional, since temperatures and
other meteorological information are related to three-
dimensional spatial points.
What needs to be represented?
The main application driving research in spatial database
systems is GIS.
There are two important alternative views of what needs to be
represented in this area:
i. Objects in space: This view assumes distinct entities arranged in
space, each of which has its own geometric description. It allows one
to model, for example, cities, forests, or rivers.
ii. Space: This view describes space itself, that is, it says something
about every point in space. It is the view of thematic maps describing, e.g.,
land use or the partition of a country into districts. Since raster
images say something about every point in space, they are also
closely related to the second view.
The above views can to some extent be reconciled by offering
concepts for modeling (i) single objects, and (ii) spatially related
collections of objects.
For modeling single objects, the fundamental abstractions are point,
line, and region.
A point represents (the geometric aspect of) an object for which only
its location in space, but not its extent, is relevant.
For example, a city may be modeled as a point in a model describing a
large geographic area (a small-scale map).
A line (in this context always to be understood as meaning a curve in
space, usually represented by a polyline, a sequence of line segments)
is the basic abstraction for facilities for moving through space, or
connections in space (roads, rivers, cables for phone, electricity, etc.).
A region is the abstraction for something having an extent in 2d-
space, e.g. a country, a lake, or a national park.
A region may have holes and may also consist of several disjoint
pieces. Figure 1 shows the three basic abstractions for single objects.
The two most important instances of spatially related collections of objects
are partitions (of the plane) and networks (Figure 2).
A partition can be viewed as a set of region objects that are required
to be disjoint.
The adjacency relationship is of particular interest, that is, there often exist
pairs of region objects with a common boundary. Partitions can be used to
represent thematic maps.
A network can be viewed as a graph embedded into the plane, consisting
of a set of point objects, forming its nodes, and a set of line objects
describing the geometry of the edges.
Networks are ubiquitous in geography, for example, highways,
rivers, public transport, or power supply lines.
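A network of point nodes and line edges can be sketched as a small graph, with a shortest-path search that is simply rerun when an edge is removed, as in the road-closure case from the navigation example earlier. The node names, coordinates, and the use of straight-line edge length as cost are assumptions for illustration; Dijkstra's algorithm is used here as one common choice, not as the method of any particular system.

```python
import heapq
import math

def shortest_path(nodes, edges, start, goal):
    """Dijkstra over an undirected network; edge cost is straight-line length."""
    adjacency = {name: [] for name in nodes}
    for u, v in edges:
        length = math.dist(nodes[u], nodes[v])
        adjacency[u].append((v, length))
        adjacency[v].append((u, length))
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, length in adjacency[node]:
            new_cost = cost + length
            if new_cost < best.get(neighbor, math.inf):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if goal not in best:
        return None, math.inf
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1], best[goal]

# Hypothetical network: three junctions (points) joined by three roads (lines).
nodes = {"A": (0, 0), "B": (3, 0), "C": (3, 4)}
roads = [("A", "B"), ("B", "C"), ("A", "C")]
direct_path, direct_cost = shortest_path(nodes, roads, "A", "C")

# The road A-C is temporarily closed: drop the edge and recompute.
open_roads = [r for r in roads if r != ("A", "C")]
detour_path, detour_cost = shortest_path(nodes, open_roads, "A", "C")
```

With the direct road open the route is A to C with length 5; after the closure the recomputed route goes through B with length 7.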
Figure 2
Examples
In Figure 3 (a) the European countries are represented as polygons,
whereas in Figure 3(b) a GIS map is shown which contains information about a
specific geographic area of Northern Greece.
Spatial Query Processing
In traditional database systems user queries are usually expressed by
SQL statements containing conditions among the attributes of the
relations (database tables).
A spatial database system must be equipped with additional functionality
to answer queries containing conditions among the spatial attributes of
the database objects, such as location, extent and geometry.
The most common spatial query types are:
• Topological queries (e.g., find all objects that overlap or cover a given
object),
• Directional queries (e.g., find all objects that lie north of a given object),
• Distance queries (e.g., find all objects that lie within a given distance
of a given object).
Spatial Queries
Let us examine three queries that are widely used in spatial applications.
• Range query: the most common topological query. A query area R is
given and all objects that intersect or are contained in R are requested.
• Nearest neighbor (NN) query: the most common distance query.
Given a query point P and a positive integer k, the query returns the k
objects that are closest to P, based on a distance metric (e.g., Euclidean
distance).
• Spatial join query: used to determine pairs of spatial objects that
satisfy a particular property. Given two spatial datasets DA and DB and
a predicate θ, the output of the spatial join query is the set of pairs
(Oa, Ob) such that Oa ∈ DA, Ob ∈ DB and θ(Oa, Ob) is true.
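The three query types above can be sketched as brute-force scans over small in-memory datasets. This is a minimal illustration of their semantics only; the function names, the rectangle encoding `(xmin, ymin, xmax, ymax)`, and the sample data are assumptions, and a real system would use spatial indexes rather than full scans.

```python
import math

def range_query(points, rect):
    """Return all points that fall inside the query rectangle R."""
    xmin, ymin, xmax, ymax = rect
    return [p for p in points if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]

def knn_query(points, query_point, k):
    """Return the k points closest to the query point (Euclidean distance)."""
    return sorted(points, key=lambda p: math.dist(p, query_point))[:k]

def spatial_join(dataset_a, dataset_b, theta):
    """Return all pairs (Oa, Ob) for which the predicate theta(Oa, Ob) holds."""
    return [(a, b) for a in dataset_a for b in dataset_b if theta(a, b)]

def rects_intersect(a, b):
    """Predicate for an intersection join on rectangles."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def rect_encloses(a, b):
    """Predicate for a containment join: b lies entirely within a."""
    return a[0] <= b[0] and a[1] <= b[1] and b[2] <= a[2] and b[3] <= a[3]

# Hypothetical data.
points = [(1, 1), (2, 2), (5, 5), (9, 9)]
in_range = range_query(points, (0, 0, 3, 3))
nearest_two = knn_query(points, (0, 0), 2)

dataset_a = [(0, 0, 4, 4)]
dataset_b = [(1, 1, 2, 2), (3, 3, 6, 6), (7, 7, 8, 8)]
intersecting = spatial_join(dataset_a, dataset_b, rects_intersect)
enclosed = spatial_join(dataset_a, dataset_b, rect_encloses)
```

The same `spatial_join` function serves both join variants; only the predicate θ changes.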
Figure 1.2 presents examples of range and NN queries for a database
consisting of points in 2-d space. In Figure 1.2(a) the answer to the range
query consists of the three data points that are enclosed by R. In
Figure 1.2(b) the answer to the NN query consists of the five data points
that are closest to P.
Figure 1.3 gives two examples of spatial join queries. In Figure 1.3(a) the query asks for all
intersecting pairs of the two datasets (intersection spatial join), whereas in Figure 1.3(b) the
query asks for all pairs Oa, Ob such that Ob is totally enclosed by Oa (containment spatial join).
Architecture: Spatial Database
The two main approaches are layered architecture and dual
architecture.
Layered architecture: Here spatial functionality is implemented on top
of a given DBMS, often a commercially available relational system, as
shown in Figure 3 below.
Figure 3: Layered Architecture
Dual Architecture: Here a top layer integrates two rather independent
subsystems: the DBMS, which handles non-spatial data, and a spatial
subsystem storing and manipulating geometries, as shown in Figure 4 below.
Figure 4: Dual Architecture
GIS : Management of Spatial Data
Linking location to information is a process that applies to
many aspects of decision making in business and
community.
Choosing a site, targeting a market segment, planning a
distribution network, zoning a neighborhood, allocating
resources, and responding to emergencies – all these
problems involve the questions of geography.
Where are the current and potential customers? In which
area do customers with particular profiles live? Which
areas of the city are most vulnerable to seasonal flooding and
other natural disasters? Where are power poles located,
and when did they last receive maintenance?
Intelligent digital maps are made possible by Geographic
Information Systems (GIS).
A GIS represents the features on the earth (buildings, roads, cities,
rivers, and states) on a computer.
People use GIS to visualize, question, analyze and understand data
about the world and human activity.
Often this data is viewed on a map, which provides advantages over
using a spreadsheet or database.
This is because maps and spatial analysis can reveal patterns,
point out problems and show connections that may not be apparent
in tables or text.
What is GIS ?
A GIS is computer software that links geographic information
(where things are) with descriptive information (what things are).
It is a system that manages geographic data and related
applications.
GIS are widely used in areas such as environmental applications,
transportation systems, emergency response systems, and battle
management.
Geographic information systems (GIS) are used to collect, model,
store, and analyze information describing physical properties of
the geographical world.
Power of GIS
Unlike a flat paper map, where what you see is what you get, a GIS
can present many layers of different information.
Each layer represents a particular theme or feature of the map.
One theme could be made up of all the roads in an area. Another
theme could represent all the lakes in the same area. Yet
another could represent all the cities.
A GIS-based map is not much more difficult to use than a paper
map. As on the paper map, there are dots or points that
represent features on the map such as cities, lines that represent
features such as roads, and small areas that represent features
such as lakes.
Data in GIS
The scope of GIS broadly encompasses two types of data:
• Spatial data, originating from maps, digital images,
administrative and political boundaries, roads,
transportation networks; physical data such as rivers,
soil characteristics, climatic regions, land elevations
• Non-spatial data, such as socio-economic data (like
census data), economic data, or sales or marketing
information
GIS is a rapidly developing domain that offers highly
innovative approaches to meet some challenging
technical demands.
DBMSs to GIS
Three main types of DBMS are available to GIS users today:
relational (RDBMS), object (ODBMS), and object-relational
(ORDBMS).
A relational database comprises a set of tables, each a two-
dimensional list (or array) of records containing attributes about
the objects under study.
Object database management systems (ODBMS) were initially
designed to address weaknesses of RDBMS, including the inability
to store complete objects directly in the database (both object
state and behavior) and poor performance for many types of
geographic query.
Hybrid object-relational DBMS (ORDBMS) can be thought of
as an RDBMS engine with an extensibility framework for
handling objects.
The ideal geographic ORDBMS is one that has been extended
to support geographic object types and functions through the
addition of a geographic query parser, a geographic query
optimizer, a geographic query language, multidimensional
indexing services, storage management for large files, long
transaction services and replication services.
Spatial DBMS extensions
The commercial DBMS vendors have released spatial database
extensions to their standard ORDBMS products:
• IBM: DB2 Spatial Extender and Informix Spatial DataBlade
• Oracle: Oracle Spatial
• Microsoft: spatial capabilities in the core of SQL Server
The open-source DBMS PostgreSQL has also been extended with
spatial types and functions (PostGIS).
None of these is a complete GIS software system.
GIS Applications
Figure: classification of GIS applications.
• Cartographic: irrigation, crop yield analysis, land evaluation,
planning and facilities management, landscape studies, traffic
pattern analysis
• Digital terrain modeling applications: earth science, civil engineering
and military evaluation, soil surveys, air and water pollution studies,
flood control, water resource management
• Geographic objects applications: car navigation systems, geographic
market analysis, utility distribution and consumption, consumer
product and services (economic analysis)
It is possible to divide GISs into three categories:
(1) cartographic applications,
(2) digital terrain modeling applications, and
(3) geographic objects applications
Cartography is the study and practice of making maps. It involves
graphically representing a geographical area on a flat surface.
In cartographic and terrain modeling applications, variations in spatial
attributes are captured – for example, soil characteristics, crop
density, and air quality
In geographic object applications, objects of interest
are identified from a physical domain – for example,
power plants, electoral districts, property parcels,
product distribution districts, and city landmarks;
These objects are related to pertinent application
data – for example, power consumption, voting
patterns, property sales volumes, product sales
volume, and traffic density.
Data Management Requirements of GIS
Data Modeling and Representation
• GIS data can be broadly represented in two formats:
(1) vector and (2) raster
• Vector data represents geometric objects such as points, lines, and
polygons.
• Vector models are useful for storing data that has discrete
boundaries, such as country borders, land parcels and streets.
• Thus a lake may be represented as a polygon, a river by a series of
line segments.
• It gives higher geographic accuracy because data isn't dependent on
grid size. But continuous data is poorly stored and displayed as
• Raster data is characterized as an array of points, where
each point represents the value of an attribute for a real-
world location.
• That is, raster data is made up of pixels or grid cells.
• Informally, raster images are n-dimensional arrays where
each entry is a unit of the image and represents an
attribute.
• Two-dimensional units are called pixels, while three-
dimensional units are called voxels.
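The idea that each raster cell holds the attribute value for a real-world location can be sketched as a lookup from world coordinates into a 2D array. The layout conventions here (row 0 is the northernmost row, the origin is the top-left corner, cells are square) and all names and values are assumptions for illustration.

```python
def cell_value(raster, origin, cell_size, x, y):
    """Return the attribute value of the cell covering world location (x, y).

    raster:    list of rows, with row 0 the northernmost row (assumed layout)
    origin:    (x_min, y_max), the world coordinates of the top-left corner
    cell_size: side length of one square cell in world units
    Returns None when (x, y) lies outside the raster.
    """
    x_min, y_max = origin
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    if 0 <= row < len(raster) and 0 <= col < len(raster[0]):
        return raster[row][col]
    return None

# Hypothetical 2 x 2 elevation raster with 100-unit cells.
elevation = [
    [10, 20],
    [30, 40],
]
top_left = cell_value(elevation, (0, 200), 100, 50, 150)
bottom_right = cell_value(elevation, (0, 200), 100, 150, 50)
outside = cell_value(elevation, (0, 200), 100, 250, 50)
```

The accuracy of such a lookup is bounded by the cell size, which is the grid-size dependence that vector data avoids.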
Three-dimensional elevation data is stored in a raster-based digital
elevation model (DEM) format.
A Digital Elevation Model (DEM) is a digital model or three
dimensional (3D) representation of a terrain's surface created from
elevation data.
It represents height information without any further definition about
the surface.
As topography is one of the major factors in most types of hazard
analysis, the generation of a Digital Elevation Model (DEM) plays a
major role.
Another format, the triangular irregular network
(TIN), is a topological vector-based approach that
models surfaces by connecting sample points as vertices
of triangles and has a point density that may vary with
the roughness of the terrain.
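Because a TIN stores elevations only at triangle vertices, the elevation at a point inside a triangle is typically derived from those vertices. The sketch below uses linear (barycentric) interpolation within a single triangle, which is one common choice and an assumption here; the function name and sample coordinates are hypothetical.

```python
def interpolate_elevation(triangle, x, y):
    """Linearly interpolate z at (x, y) inside one TIN triangle.

    triangle: three (x, y, z) vertices.
    Returns None when (x, y) lies outside the triangle.
    """
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    # Barycentric weights of the query point with respect to each vertex.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None  # outside this triangle
    return w1 * z1 + w2 * z2 + w3 * z3

# Hypothetical triangle with elevations 100, 200 and 300 at its vertices.
tin_triangle = [(0, 0, 100), (10, 0, 200), (0, 10, 300)]
z_at_vertex = interpolate_elevation(tin_triangle, 0, 0)
z_mid_edge = interpolate_elevation(tin_triangle, 5, 5)
z_outside = interpolate_elevation(tin_triangle, 10, 10)
```

At a vertex the result is that vertex's elevation, and at the midpoint of the edge between the 200 and 300 vertices it is their average.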
In digital terrain modeling (DTM), the model may
also be used by substituting the elevation with
some attribute of interest such as population
density or air temperature.
GIS data often includes a temporal structure in
addition to a spatial structure.
Data Analysis
• GIS data undergoes various types of analysis.
• For example, in applications such as soil erosion studies,
environmental impact studies, or hydrological runoff simulations,
DTM data may undergo various types of geomorphometric
analysis: measurements such as slope values, gradients (the
rate of change in altitude), aspect (the compass direction of the
gradient), profile convexity (the rate of change of gradient), plan
convexity (the convexity of contours), and other parameters.
• When GIS data is used for decision support applications, it may
undergo aggregation and expansion operations using data
warehousing.
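Slope and aspect can be sketched for one interior cell of a raster DEM using central differences. The conventions are assumptions for this example and vary between tools: row 0 is the northernmost row, columns increase eastward, slope is reported in degrees, and aspect is reported as the compass bearing of the steepest downhill direction (0 = north, 90 = east).

```python
import math

def slope_and_aspect(dem, row, col, cell_size):
    """Slope (degrees) and aspect (compass bearing, degrees) at an interior cell.

    Aspect is None on flat terrain, where no downhill direction exists.
    """
    # Rate of change of elevation toward the east and toward the north.
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2.0 * cell_size)
    dz_dy = (dem[row - 1][col] - dem[row + 1][col]) / (2.0 * cell_size)
    slope_deg = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return slope_deg, None
    # Downhill direction is the negative gradient; convert to a compass bearing.
    aspect_deg = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope_deg, aspect_deg

# Hypothetical DEM rising toward the east: 10 units of height per 10-unit cell.
dem_east = [
    [0, 10, 20],
    [0, 10, 20],
    [0, 10, 20],
]
east_slope, east_aspect = slope_and_aspect(dem_east, 1, 1, 10)

dem_flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
flat_slope, flat_aspect = slope_and_aspect(dem_flat, 1, 1, 10)
```

For the east-rising surface the slope is 45 degrees and the surface faces west (bearing 270) under the downhill convention assumed here.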
In addition, geometric operations (to compute distances, areas,
volumes), topological operations (to compute overlaps, intersections,
shortest paths), and temporal operations (to compute interval-based or
event-based queries) are involved.
Data Integration
• GISs must integrate both vector and raster data from a variety of
sources
• Sometimes edges and regions are inferred from a raster image to form
a vector model, or conversely, raster images such as aerial
photographs are used to update vector models
• Several coordinate systems such as Universal Transverse Mercator
(UTM), latitude/longitude, and local cadastral systems are used to
identify locations
Data Capture
The first step in developing a spatial database for cartographic
modeling is to capture the two-dimensional or three-dimensional
geographical information in digital form – a process that is
sometimes impeded by source map characteristics such as
resolution, type of projection, map scales, cartographic licensing,
diversity of measurement techniques, and coordinate system
differences
Spatial data can also be captured from remote sensors on
satellites such as Landsat, NOAA, and Advanced Very High
Resolution Radiometer (AVHRR) as well as SPOT HRV (High
Resolution Visible Range Instrument).
For digital terrain modeling, data capture methods
range from manual to fully automated
Ground surveys are the traditional approach and the
most accurate, but they are very time consuming.
Other techniques include photogrammetric sampling
and digitizing cartographic documents.
Specific GIS Data Operations
GIS applications are conducted through the use of special operators
such as the following:
Interpolation – derives elevation data for points at which no
samples have been taken. Most interpolation methods are based
on triangulation that uses the TIN method for interpolating
elevations inside the triangle based on those of its vertices.
Interpretation – involves the interpretation of operations on terrain
data such as editing, smoothing, reducing details, and enhancing.
Proximity analysis – computation of “zones of interest” around
objects, such as the determination of a buffer around a car on a
highway. Shortest path algorithms using 2D or 3D information are
an important class of proximity analysis.
Raster image processing – can be divided into
1. map algebra which is used to integrate geographic
features on different map layers to produce new maps
algebraically and
2. digital image analysis which deals with analysis of a
digital image for features such as edge detection and
object detection. Detecting roads in a satellite image of a
city is an example of the latter.
Analysis of networks – analysis of networks for segmentation,
overlays, and so on. Network overlay refers to a type of
spatial join where a given network—for example, a highway
network—is joined with a point database—for example,
accident locations—to yield, in this case, a profile of high-
The functionality of a GIS database is also subject to other
considerations:
• Extensibility – GISs are required to be extensible to
accommodate a variety of constantly evolving applications
and corresponding data types
• Data quality control – the quality of the source data is of
paramount importance for providing accurate results to
queries
• Visualization – the graphical display of terrain information
Since standard RDBMSs or ODBMSs do not meet the special
needs of GIS, it is necessary to design systems that support the
vector and raster representations and the spatial functionality