Spatial databases

Lecturer, Computer Science and Information Technology, Mid-West University, Nepal, at Graduate School of Science and Technology, Birendranagar, Surkhet, Nepal
15 May 2018

  1. ADBMS By: Dabbal S. Mahara 2018
  2. Spatial Data  Spatial data means data related to space.  It is data that pertains to the space occupied by objects.  It is data that defines a location.  It takes the form of graphic primitives, usually points, lines, polygons, or pixels.  Spatial data includes location, shape, size, and orientation.  For example, consider a particular square: • Its center (the intersection of its diagonals) specifies its location • Its shape is square • The length of one of its sides specifies its size • The angle its diagonals make with, say, the x-axis specifies its orientation.
  3. Spatial Data  The space of interest can be, for example: • The two-dimensional abstraction of (parts of) the surface of the earth, that is, geographic space (the most prominent example), • A man-made space like the layout of a VLSI design, • A volume containing a model of the human brain, or another 3D space representing the arrangement of chains of protein molecules.
  4. Non-Spatial Data  Non-spatial data (also called attribute or characteristic data) is information that is independent of all geometric considerations.  For example, a person's height, mass, and age are non-spatial data because they are independent of the person's location.  It is interesting to note that, while mass is non-spatial data, weight is spatial data in the sense that something's weight depends on its location.
  5. Spatial Data  Two types of spatial data are particularly important: • Computer-aided-design (CAD) data, which includes spatial information about how objects such as buildings, cars, or aircraft are constructed. Other important examples of computer-aided-design databases are integrated-circuit and electronic-device layouts. • Geographic data such as road maps, land-usage maps, topographic elevation maps, political maps showing boundaries, land-ownership maps, and so on. Geographic information systems are special-purpose databases tailored for storing geographic data.
  6. How Are Spatial Data Organized?  Coordinates are used to specify the location of geographic objects in either two- or three-dimensional space. The coordinates can be specified as (x, y) in 2D, (x, y, z) in 3D, or spherical coordinates (latitude, longitude).  Discrete geographic features like points, lines, and polygons can be used to represent different types of objects. A point might be a house address, a line might be a road, and a polygon might be a land parcel or a building footprint. These are also known as vector data types.  Continuous geographic features describe phenomena that exist continuously across the landscape. Examples include elevation, temperature, relative humidity, gravity, wind, and atmospheric pressure. These are considered raster data types.  Features can also be summarized by geographic area. Examples include population, socio-economic characteristics, and other demographic information.
  7. Properties of Spatial Data  Four main properties of spatial data set it apart from traditional relational data: i. Geometry ii. Distribution of Objects in Space iii. Temporal Changes iv. Data Volume
  8. Geometry  Geometry deals with the mathematical properties of an object. These properties include measurement (metric), relationships of points, lines, angles, surfaces, and solids (topology), and order.  A simple geometry is usually constructed from geometric primitives such as points, lines, curves, and areas.  Complex geometries are constructed from collections of simple geometries. In addition, there are a number of geometric relationships between geometries that are important in handling spatial data.
  9. Distribution of Objects in Space  Usually spatial objects are very irregularly distributed in space.  Consider the case where we model the town halls of all the cities in the United States as spatial objects (points).  The distribution of cities on the east coast is very dense compared to the distribution of cities in Arizona and Nevada, which is sparse. In addition, different objects have largely varying extents.
  10. Temporal Changes  Spatial data often has an associated temporal property.  An example is a navigation system that helps travelers find directions from place A to B in a major city.  If there is an accident and some road is temporarily closed, the system has to incorporate this new data and recompute a suitable path from point A to B.
  11. Data Volume  Several GIS applications deal with very large databases of the order of terabytes.  For example, remote sensing applications gather terabytes of data from satellites every day.  Similarly, data warehousing applications and NASA’s Earth Observation System are other examples of systems with terabytes of spatial data.
  13. Spatial Data Types  Spatial data types are special data types needed to model geometry and to suitably represent geometric data in database systems.  The basic types are point, line, and region; more complex types include partitions (maps) and graphs (networks).  Conceptual examples: points, lines, rectangles, surfaces, and volumes.  Physical examples: cities, rivers, roads, states, crop coverage, and mountain ranges.  Spatial data types provide a fundamental abstraction for modeling the geometric structure of objects in space, their relationships, properties, and operations.
  14. Spatial Database  A spatial database system is a full-fledged database system with additional capabilities for handling spatial data.  A spatial database system: • Offers spatial data types in its data model and query language. • Supports spatial data types in its implementation, providing at least spatial indexing and efficient algorithms for spatial join.  Spatial data types, e.g. POINT, LINE, REGION, provide a fundamental abstraction for modeling the structure of geometric entities in space as well as their relationships (l intersects r), properties (area(r) > 1000), and operations (intersection(l, r), the part of l lying within r).
  15. Spatial databases  In general, a spatial database stores objects that have spatial characteristics that describe them and that have spatial relationships among them.  The spatial relationships among the objects are important, and they are often needed when querying the database.  A spatial database is optimized to store and query data related to objects in space, including points, lines, and polygons.  Whereas typical databases process numeric and character data, additional functionality needs to be added for databases to process spatial data types.
  16. Spatial databases  Queries posed on spatial data, where the selection predicates deal with spatial parameters, are called spatial queries.  For example, a query such as “List all the customers located within twenty miles of company headquarters” requires the processing of spatial data types.  Effectively, each customer is associated with a <latitude, longitude> position.  A traditional B+-tree index based on customers’ zip codes or other non-spatial attributes cannot be used to process this query, since traditional indexes are not capable of ordering multidimensional coordinate data.
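As a rough illustration of the "within twenty miles" query, the sketch below scans every customer and applies the haversine great-circle formula. The names `haversine_miles` and `customers_within`, the Earth-radius constant, and the sample coordinates are assumptions made for this example, not part of the slides. A real spatial database would use a spatial index rather than a linear scan.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumed constant)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def customers_within(customers, hq, radius_miles):
    """Linear scan: names of customers at most radius_miles from hq."""
    return [name for name, (lat, lon) in customers.items()
            if haversine_miles(lat, lon, hq[0], hq[1]) <= radius_miles]


# Hypothetical sample data: <latitude, longitude> positions.
hq = (40.7128, -74.0060)
customers = {
    "A": (40.7306, -73.9352),  # a few miles from headquarters
    "B": (40.9000, -74.0060),  # roughly 13 miles north
    "C": (42.3601, -71.0589),  # far outside the radius
}
nearby = customers_within(customers, hq, 20)
print(nearby)
```

The scan touches every row, which is exactly the cost that a spatial index is meant to avoid on large tables.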
  17. Why Spatial databases?  Applications of spatial data initially stored data as files in a file system, as did early-generation business applications.  But as the complexity and volume of the data, and the number of users, have grown, ad hoc approaches to storing and retrieving data in a file system have proved insufficient for the needs of many applications that use spatial data.  Spatial-data applications require facilities offered by a database system, in particular the ability to store and query large amounts of data efficiently.
  18. Why Spatial databases?  Therefore, there is a special need for databases tailored for handling spatial data and spatial queries.  Spatial data support in databases is important for efficiently storing, indexing, and querying data on the basis of spatial locations.  Efficient processing of a query such as the twenty-mile customer search requires special-purpose index structures, such as R-trees.
  19. Why Spatial databases?  Spatial databases incorporate functionality that provides support for databases that keep track of objects in a multidimensional space.  For example, cartographic databases that store maps include two-dimensional spatial descriptions of their objects, from countries and states to rivers, cities, roads, seas, and so on.  Other databases, such as meteorological databases for weather information, are three-dimensional, since temperatures and other meteorological information are related to three-dimensional spatial points.
  20. What needs to be represented?  The main application driving research in spatial database systems is GIS.  There are two important alternative views of what needs to be represented in this area: i. Objects in space: This view assumes distinct entities arranged in space, each of which has its own geometric description. It allows one to model, for example, cities, forests, or rivers. ii. Space: This view describes space itself, that is, it says something about every point in space. This second view is the one of thematic maps describing, e.g., land use or the partition of a country into districts. Since raster images say something about every point in space, they are also closely related to the second view.
  21. What needs to be represented?  The above views can to some extent be reconciled by offering concepts for modeling (i) single objects, and (ii) spatially related collections of objects.  For modeling single objects, the fundamental abstractions are point, line, and region.  A point represents (the geometric aspect of) an object for which only its location in space, but not its extent, is relevant.  For example, a city may be modeled as a point in a model describing a large geographic area (a small-scale map).
  22. What needs to be represented?  A line (in this context always to be understood as meaning a curve in space, usually represented by a polyline, a sequence of line segments) is the basic abstraction for facilities for moving through space, or connections in space (roads, rivers, cables for phone, electricity, etc.).  A region is the abstraction for something having an extent in 2D space, e.g. a country, a lake, or a national park.  A region may have holes and may also consist of several disjoint pieces. Figure 1 shows the three basic abstractions for single objects.
  23. What needs to be represented?  Figure 1: The three basic abstractions for single objects (point, line, region).
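The three abstractions can be sketched as small Python classes. This is only an illustrative model under assumed names (`Point`, `Line`, `Polygon`, `Region`, `ring_area`); it is not how any particular spatial DBMS defines its types. A region is modeled as a list of polygons, each with optional holes, to reflect that a region may have holes and may consist of several disjoint pieces.

```python
import math
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Point:
    """Location only; no extent."""
    x: float
    y: float


@dataclass
class Line:
    """A polyline: a sequence of connected line segments."""
    vertices: list

    def length(self):
        return sum(math.dist((a.x, a.y), (b.x, b.y))
                   for a, b in zip(self.vertices, self.vertices[1:]))


def ring_area(ring):
    """Unsigned area of a closed ring of Points (shoelace formula)."""
    n = len(ring)
    twice = sum(ring[i].x * ring[(i + 1) % n].y - ring[(i + 1) % n].x * ring[i].y
                for i in range(n))
    return abs(twice) / 2


@dataclass
class Polygon:
    """One connected piece: an outer ring minus any holes."""
    outer: list
    holes: list = field(default_factory=list)

    def area(self):
        return ring_area(self.outer) - sum(ring_area(h) for h in self.holes)


@dataclass
class Region:
    """One or more disjoint polygons."""
    polygons: list

    def area(self):
        return sum(p.area() for p in self.polygons)


# Hypothetical objects: a road, and a park with a pond plus a detached island.
road = Line([Point(0, 0), Point(3, 4), Point(3, 10)])
park = Polygon(outer=[Point(0, 0), Point(4, 0), Point(4, 4), Point(0, 4)],
               holes=[[Point(1, 1), Point(2, 1), Point(2, 2), Point(1, 2)]])
island = Polygon(outer=[Point(10, 10), Point(12, 10), Point(12, 12), Point(10, 12)])
region = Region([park, island])
print(road.length(), region.area())
```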
  24. What needs to be represented?  The two most important instances of spatially related collections of objects are partitions (of the plane) and networks (Figure 2).  A partition can be viewed as a set of region objects that are required to be disjoint.  The adjacency relationship is of particular interest, that is, there often exist pairs of region objects with a common boundary. Partitions can be used to represent thematic maps.  A network can be viewed as a graph embedded into the plane, consisting of a set of point objects, forming its nodes, and a set of line objects describing the geometry of the edges.
  25. What needs to be represented?  Networks are ubiquitous in geography, for example, highways, rivers, public transport, or power supply lines.  Figure 2
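One way to picture the network view is as a graph whose nodes are points and whose edges are line objects. The sketch below, with hypothetical junction names and straight-line edges, runs Dijkstra's algorithm over such a graph; shortest paths are only one of many operations on networks, and `shortest_path_length` is a name chosen for this example.

```python
import heapq
import math


def shortest_path_length(nodes, edges, source, target):
    """Dijkstra over an undirected network.

    nodes: name -> (x, y) position; edges: (u, v) pairs joined by straight
    segments, so each edge weight is the Euclidean distance between its ends.
    """
    adjacency = {name: [] for name in nodes}
    for u, v in edges:
        weight = math.dist(nodes[u], nodes[v])
        adjacency[u].append((v, weight))
        adjacency[v].append((u, weight))

    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, u = heapq.heappop(heap)
        if u == target:
            return dist
        if dist > best.get(u, math.inf):
            continue  # stale queue entry
        for v, weight in adjacency[u]:
            candidate = dist + weight
            if candidate < best.get(v, math.inf):
                best[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return math.inf  # target not reachable


# Hypothetical road network with four junctions.
junctions = {"A": (0, 0), "B": (3, 4), "C": (3, 0), "D": (6, 4)}
roads = [("A", "B"), ("A", "C"), ("C", "B"), ("B", "D")]
print(shortest_path_length(junctions, roads, "A", "D"))
```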
  26. Examples  In Figure 3(a) the European countries are represented as polygons, whereas Figure 3(b) shows a GIS map that contains information about a specific geographic area of Northern Greece.
  27. Spatial Query Processing  In traditional database systems, user queries are usually expressed by SQL statements containing conditions among the attributes of the relations (database tables).  A spatial database system must be equipped with additional functionality to answer queries containing conditions among the spatial attributes of the database objects, such as location, extent, and geometry.  The most common spatial query types are: • Topological queries (e.g., find all objects that overlap or cover a given object), • Directional queries (e.g., find all objects that lie north of a given object), • Distance queries (e.g., find all objects that lie within a given distance of a given object).
  28. Spatial Queries  Let us examine three queries that are widely used in spatial applications. • Range query: the most common topological query. A query area R is given and all objects that intersect or are contained in R are requested. • Nearest neighbor (NN) query: the most common distance query. Given a query point P and a positive integer k, the query returns the k objects that are closest to P, based on a distance metric (e.g., Euclidean distance). • Spatial join query: used to determine pairs of spatial objects that satisfy a particular property. Given two spatial datasets DA and DB and a predicate θ, the output of the spatial join query is the set of pairs (Oa, Ob) such that Oa ∈ DA, Ob ∈ DB, and θ(Oa, Ob) is true.
  29. Spatial Queries  Figure 1.2 presents examples of range and NN queries for a database consisting of points in 2D space. In Figure 1.2(a) the answer to the range query consists of the three data points that are enclosed by R. In Figure 1.2(b) the answer to the NN query consists of the five data points that are closest to P.
  30. Spatial Queries  Figure 1.3 gives two examples of spatial join queries. In Figure 1.3(a) the query asks for all intersecting pairs of the two datasets (intersection spatial join), whereas in Figure 1.3(b) the query asks for all pairs (Oa, Ob) such that Ob is totally enclosed by Oa (containment spatial join).
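The three query types can be sketched with brute-force Python functions. These are illustrations under simplifying assumptions: objects are 2D points or axis-aligned rectangles written as `(xmin, ymin, xmax, ymax)`, the function names are made up for this example, and every query scans all data. A real spatial DBMS would answer the same queries through a spatial index such as an R-tree.

```python
import heapq
import math
from itertools import product


def range_query(points, rect):
    """All points inside the query rectangle R = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    return [p for p in points if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]


def knn_query(points, query_point, k):
    """The k points closest to query_point under Euclidean distance."""
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query_point))


def rects_intersect(a, b):
    """Intersection predicate for two axis-aligned rectangles."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def spatial_join(dataset_a, dataset_b, theta):
    """All pairs (Oa, Ob) with Oa in DA, Ob in DB, and theta(Oa, Ob) true."""
    return [(oa, ob) for oa, ob in product(dataset_a, dataset_b) if theta(oa, ob)]


# Hypothetical data.
points = [(1, 1), (2, 3), (5, 5), (6, 1), (8, 7)]
in_range = range_query(points, (0, 0, 5, 5))
nearest = knn_query(points, (2, 2), 2)
pairs = spatial_join([(0, 0, 2, 2), (5, 5, 6, 6)],
                     [(1, 1, 3, 3), (10, 10, 11, 11)],
                     rects_intersect)
print(in_range, nearest, pairs)
```

Swapping `rects_intersect` for a containment predicate would give the containment spatial join instead of the intersection join.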
  31. Architecture: Spatial Database  The two main approaches are layered architecture and dual architecture.  Layered architecture: Spatial functionality is implemented on top of a given DBMS, often a commercially available relational system, as shown in Figure 3 below.  Figure 3: Layered Architecture
  32. Architecture: Spatial Database  Dual architecture: A top layer integrates two rather independent subsystems: the DBMS, which handles non-spatial data, and a spatial subsystem that stores and manipulates geometries, as shown in Figure 4 below.  Figure 4: Dual Architecture
  33. GIS: Management of Spatial Data  Linking location to information is a process that applies to many aspects of decision making in business and the community.  Choosing a site, targeting a market segment, planning a distribution network, zoning a neighborhood, allocating resources, and responding to emergencies: all these problems involve questions of geography.  Where are the current and potential customers? In which areas do customers with particular profiles live? Which areas of the city are most vulnerable to seasonal flooding and other natural disasters? Where are power poles located, and when did they last receive maintenance?
  34. GIS: Management of Spatial Data  Intelligent digital maps are made possible by Geographic Information Systems (GIS).  A GIS represents features on the earth, such as buildings, roads, cities, rivers, and states, on a computer.  People use GIS to visualize, question, analyze, and understand data about the world and human activity.  Often this data is viewed on a map, which provides advantages over using a spreadsheet or database.  This is because maps and spatial analysis can reveal patterns, point out problems, and show connections that may not be apparent in tables or text.
  35. What is GIS?  A GIS is computer software that links geographic information (where things are) with descriptive information (what things are).  It is a system that manages geographic data and related applications.  GISs are widely used in areas such as environmental applications, transportation systems, emergency response systems, and battle management.  Geographic information systems (GIS) are used to collect, model, store, and analyze information describing physical properties of the geographical world.
  36. Power of GIS  Unlike a flat paper map, where what you see is what you get, a GIS can present many layers of different information.  Each layer represents a particular theme or feature of the map. One theme could be made up of all the roads in an area. Another theme could represent all the lakes in the same area. Yet another could represent all the cities.  A GIS-based map is not much more difficult to use than a paper map. As on the paper map, there are dots or points that represent features such as cities, lines that represent features such as roads, and small areas that represent features such as lakes.
  37. Data in GIS  The scope of GIS broadly encompasses two types of data: • Spatial data, originating from maps, digital images, administrative and political boundaries, roads, and transportation networks, as well as physical data such as rivers, soil characteristics, climatic regions, and land elevations. • Non-spatial data, such as socio-economic data (like census data), economic data, or sales or marketing information.  GIS is a rapidly developing domain that offers highly innovative approaches to meet some challenging technical demands.
  38. DBMSs to GIS  Three main types of DBMS are available to GIS users today: relational (RDBMS), object (ODBMS), and object-relational (ORDBMS).  A relational database comprises a set of tables, each a two-dimensional list (or array) of records containing attributes about the objects under study.  Object database management systems (ODBMS) were initially designed to address weaknesses of RDBMS, including the inability to store complete objects directly in the database (both object state and behavior) and poor performance for many types of geographic query.
  39. DBMSs to GIS  A hybrid object-relational DBMS (ORDBMS) can be thought of as an RDBMS engine with an extensibility framework for handling objects.  The ideal geographic ORDBMS is one that has been extended to support geographic object types and functions through the addition of a geographic query parser, a geographic query optimizer, a geographic query language, multidimensional indexing services, storage management for large files, long transaction services, and replication services.
  40. Spatial DBMS extensions  The commercial DBMS vendors have released spatial database extensions to their standard ORDBMS products:  IBM: DB2 Spatial Extender and Informix Spatial DataBlade  Oracle Spatial  Spatial capabilities in the core of Microsoft SQL Server  The open-source DBMS PostgreSQL has also been extended with spatial types and functions (PostGIS).  None is a complete GIS software system.
  41. GIS Applications  • Cartographic: irrigation, crop yield analysis, land evaluation, planning and facilities management, landscape studies, traffic pattern analysis • Digital terrain modeling applications: earth science, civil engineering and military evaluation, soil surveys, air and water pollution studies, flood control, water resource management • Geographic objects applications: car navigation systems, geographic market analysis, utility distribution and consumption, consumer product and services (economic analysis)
  42. GIS Applications  It is possible to divide GISs into three categories: (1) cartographic applications, (2) digital terrain modeling applications, and (3) geographic objects applications.  Cartography is the study and practice of making maps. It involves graphically representing a geographical area on a flat surface.  In cartographic and terrain modeling applications, variations in spatial attributes are captured, for example, soil characteristics, crop density, and air quality.
  43. GIS Applications  In geographic object applications, objects of interest are identified from a physical domain, for example, power plants, electoral districts, property parcels, product distribution districts, and city landmarks.  These objects are related to pertinent application data, for example, power consumption, voting patterns, property sales volumes, product sales volume, and traffic density.
  44. Data Management Requirements of GIS  Data Modeling and Representation • GIS data can be broadly represented in two formats: (1) vector and (2) raster. • Vector data represents geometric objects such as points, lines, and polygons. • Vector models are useful for storing data that has discrete boundaries, such as country borders, land parcels, and streets. • Thus a lake may be represented as a polygon and a river by a series of line segments. • The vector format gives higher geographic accuracy because the data does not depend on grid size. But continuous data is poorly stored and displayed as
  45. Data Modeling and Representation • Raster data is characterized as an array of points, where each point represents the value of an attribute for a real-world location. • That is, raster data is made up of pixels or grid cells. • Informally, raster images are n-dimensional arrays where each entry is a unit of the image and represents an attribute. • Two-dimensional units are called pixels, while three-dimensional units are called voxels.
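A minimal sketch of the raster idea: a two-dimensional grid of attribute values plus the georeferencing needed to map a real-world coordinate to a cell. The upper-left origin, square cells, and the names `cell_index` and `sample` are assumptions made for this example; real raster formats carry this information in their own metadata.

```python
def cell_index(x, y, origin_x, origin_y, cell_size):
    """Map a world coordinate to (row, col).

    Assumes (origin_x, origin_y) is the upper-left corner of the grid,
    rows increase southward, and cells are square.
    """
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    return row, col


def sample(grid, x, y, origin_x, origin_y, cell_size):
    """Attribute value of the cell containing (x, y), or None if outside."""
    row, col = cell_index(x, y, origin_x, origin_y, cell_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


# Hypothetical 3 x 4 elevation raster with 10-unit cells.
elevation = [
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
]
value = sample(elevation, 125.0, 185.0, origin_x=100.0, origin_y=200.0, cell_size=10.0)
print(value)
```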
  46. Data Management Requirements of GIS  Three-dimensional elevation data is stored in a raster-based digital elevation model (DEM) format.  A Digital Elevation Model (DEM) is a digital model or three-dimensional (3D) representation of a terrain's surface created from elevation data.  It represents height information without any further definition about the surface.  As topography is one of the major factors in most types of hazard analysis, the generation of a DEM plays a major role.
  48. Data Management Requirements of GIS  Another format, the triangular irregular network (TIN), is a topological vector-based approach that models surfaces by connecting sample points as vertices of triangles and has a point density that may vary with the roughness of the terrain.
  49. Data Management Requirements of GIS  In digital terrain modeling (DTM), the model may also be used by substituting the elevation with some attribute of interest such as population density or air temperature.  GIS data often includes a temporal structure in addition to a spatial structure.
  50. Data Management Requirements of GIS  Data Analysis • GIS data undergoes various types of analysis. • For example, in applications such as soil erosion studies, environmental impact studies, or hydrological runoff simulations, DTM data may undergo various types of geomorphometric analysis: measurements such as slope values, gradients (the rate of change in altitude), aspect (the compass direction of the gradient), profile convexity (the rate of change of gradient), plan convexity (the convexity of contours), and other parameters. • When GIS data is used for decision support applications, it may undergo aggregation and expansion operations using data warehousing.
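The slope and aspect measurements mentioned above can be approximated on a raster DEM with central differences. The sketch below is one simple scheme among several; it assumes row 0 is the northern edge, square cells, an interior cell, and the common convention that aspect is the compass bearing of steepest descent. The function name `slope_aspect` is hypothetical.

```python
import math


def slope_aspect(dem, row, col, cell_size):
    """Slope (degrees) and aspect (compass degrees) at an interior DEM cell.

    Uses central differences. Aspect is reported as the downslope bearing,
    measured clockwise from north; it is None on flat ground.
    """
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)  # change toward east
    dz_dy = (dem[row - 1][col] - dem[row + 1][col]) / (2 * cell_size)  # change toward north
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return slope, None
    # Downhill is the negative gradient; bearing = atan2(east, north).
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360
    return slope, aspect


# Hypothetical DEM rising 10 units per 10-unit cell toward the east.
dem = [
    [0, 10, 20],
    [0, 10, 20],
    [0, 10, 20],
]
slope, aspect = slope_aspect(dem, 1, 1, cell_size=10)
print(slope, aspect)
```

For this sample surface the terrain climbs toward the east at a 1:1 ratio, so the cell is steep and faces west.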
  51. Data Management Requirements of GIS  In addition, geometric operations (to compute distances, areas, volumes), topological operations (to compute overlaps, intersections, shortest paths), and temporal operations (to compute interval-based or event-based queries) are involved.  Data Integration • GISs must integrate both vector and raster data from a variety of sources. • Sometimes edges and regions are inferred from a raster image to form a vector model, or conversely, raster images such as aerial photographs are used to update vector models. • Several coordinate systems such as Universal Transverse Mercator (UTM), latitude/longitude, and local cadastral systems are used to identify locations.
  52. Data Management Requirements of GIS  Data Capture  The first step in developing a spatial database for cartographic modeling is to capture the two-dimensional or three-dimensional geographical information in digital form – a process that is sometimes impeded by source map characteristics such as resolution, type of projection, map scales, cartographic licensing, diversity of measurement techniques, and coordinate system differences.  Spatial data can also be captured from remote sensors in satellites such as Landsat, NOAA, and Advanced Very High Resolution Radiometer (AVHRR) as well as SPOT HRV (High Resolution Visible Range Instrument).
  53. Data Management Requirements of GIS  For digital terrain modeling, data capture methods range from manual to fully automated.  Ground surveys are the traditional approach and the most accurate, but they are very time consuming.  Other techniques include photogrammetric sampling and digitizing cartographic documents.
  54. Specific GIS Data Operations  GIS applications are conducted through the use of special operators such as the following:  Interpolation – derives elevation data for points at which no samples have been taken. Most interpolation methods are based on triangulation, which uses the TIN method to interpolate elevations inside a triangle from those of its vertices.  Interpretation – involves the interpretation of operations on terrain data such as editing, smoothing, reducing details, and enhancing.  Proximity analysis – computation of “zones of interest” around objects, such as the determination of a buffer around a car on a highway. Shortest path algorithms using 2D or 3D information are an important class of proximity analysis.
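The interpolation operator can be illustrated for a single TIN triangle. The sketch below uses barycentric weights, which is one standard way to interpolate linearly from the three vertex elevations; the function name and the sample triangle are assumptions for this example, and locating the triangle that contains the query point is left out.

```python
def interpolate_in_triangle(p, a, b, c):
    """Linearly interpolate elevation at p = (x, y) inside a TIN triangle.

    a, b, c are vertices given as (x, y, z). Returns None when p lies
    outside the triangle; raises ValueError for a degenerate triangle.
    """
    x, y = p
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    # Barycentric weights of p with respect to a, b, c.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None
    return w1 * z1 + w2 * z2 + w3 * z3


# Hypothetical triangle lying on the plane z = 100 + 10x + 20y.
a, b, c = (0, 0, 100), (10, 0, 200), (0, 10, 300)
z = interpolate_in_triangle((2, 2), a, b, c)
print(z)
```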
  55. Specific GIS Data Operations  Raster image processing – can be divided into (1) map algebra, which is used to integrate geographic features on different map layers to produce new maps algebraically, and (2) digital image analysis, which deals with analysis of a digital image for features such as edge detection and object detection. Detecting roads in a satellite image of a city is an example of the latter.  Analysis of networks – analysis of networks for segmentation, overlays, and so on. Network overlay refers to a type of spatial join where a given network (for example, a highway network) is joined with a point database (for example, accident locations) to yield, in this case, a profile of high-
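Map algebra can be sketched as a cell-by-cell operation over raster layers that share the same grid. The layer names, threshold, and the helper `local_op` below are hypothetical; the example combines an elevation layer with a "near river" layer to produce a new flood-risk layer.

```python
def local_op(layer_a, layer_b, fn):
    """Combine two equally shaped raster layers cell by cell with fn."""
    same_shape = len(layer_a) == len(layer_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(layer_a, layer_b))
    if not same_shape:
        raise ValueError("layers must have the same shape")
    return [[fn(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(layer_a, layer_b)]


# Hypothetical layers on the same 2 x 2 grid.
elevation = [[5, 12],
             [3, 20]]
near_river = [[1, 1],
              [0, 1]]  # 1 = cell lies close to a river

# New layer: 1 where the ground is low AND close to a river.
flood_risk = local_op(elevation, near_river,
                      lambda elev, near: 1 if elev < 10 and near == 1 else 0)
print(flood_risk)
```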
  56. Specific GIS Data Operations  The functionality of a GIS database is also subject to other considerations: • Extensibility – GISs are required to be extensible to accommodate a variety of constantly evolving applications and corresponding data types. • Data quality control – the quality of the source data is of paramount importance for providing accurate results to queries. • Visualization – the graphical display of terrain information.  Since standard RDBMSs or ODBMSs do not meet the special needs of GIS, it is necessary to design systems that support the vector and raster representations and the spatial functionality.
  57. Thank You!