This document provides an overview of MySQL 5.7's new and improved GIS capabilities. It begins with introductions to geographic information systems and common GIS concepts. It then outlines the key new features in MySQL 5.7, such as its integration of Boost.Geometry for geometry representations and comparisons. The document also provides examples of how GIS data can be imported and analyzed using MySQL, and concludes with suggestions for further enhancements to MySQL's GIS functionality.
The first law of geography according to Waldo Tobler is "Everything is related to everything else, but near things are more related than distant things."[1]
This observation is embedded in the gravity model of trip distribution. It is also related to the law of demand, in that interactions between places are inversely proportional to the cost of travel, much as the probability of purchasing a good is inversely proportional to its cost.
It is also related to the ideas of Isaac Newton's law of universal gravitation and is essentially synonymous with the concept of spatial dependence that forms the foundation of spatial analysis. Furthermore, it is the founding principle on which the understanding of, and corrective measures for, spatial autocorrelation have been based.[2]
The link structure of Wikipedia's collection of geolocated articles has been demonstrated to be consistent with Tobler's first law of geography.[3]
References:
- [1] Tobler, W. (1970). "A computer movie simulating urban growth in the Detroit region". Economic Geography, 46(2): 234-240.
- [2] Anselin, L. (1999). Spatial Econometrics. <https://csiss.ncgia.ucsb.edu/aboutus/presentations/files/baltchap.pdf>
- [3] Hecht, B., Moxley, E. (2009). "Terabytes of Tobler: Evaluating the first law in a massive, domain-neutral representation of world knowledge". In Hornsby, K.S., Claramunt, C., Denis, M., Ligozat, G. (eds.), Spatial Information Theory, 9th International Conference, COSIT 2009, Aber Wrac'h, France, September 21-25, 2009, Proceedings. Lecture Notes in Computer Science, vol. 5756. Springer, pp. 88-105.
GIS: A computer-based system that stores geographically referenced data layers (features) and links them with non-graphic data tables (attributes), allowing for a wide range of information processing, including manipulation, analysis, and modeling. A GIS also allows for map display and production.
We’re going to focus on simple location services in our examples.
EPSG was folded into The International Association of Oil & Gas Producers (OGP) in 2005.
X or easting, which will typically be a longitude value; Y or northing, which will typically be a latitude value; Z or height (optionally a true geodetic value); and M or measure (which can be used for time, for example).
University Consortium for Geographic Information Science
USGS or United States Geological Survey
Started out as the Generic Geometry Library by OSGeo. Now it’s of course part of Boost.
R-trees are the most common index type used for spatial data. An R-tree is a wide search tree, with some similarities to a B-tree: it is a search tree, and it has a root node, branch nodes, and leaf nodes.
The main differences are:
1. R-trees use pages at each level of the tree, and each page can contain a number of nodes, up to a maximum set by the specific implementation. The search is therefore not binary at any level.
2. You search by bounding box. If the search box overlaps with the MBR (minimum bounding rectangle) stored in a node, then you continue to search down that path, moving to the next child node.
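The bounding-box search described above can be sketched in a few lines of Python. This is only an illustration of the overlap test and the descent into child nodes, not how MySQL or InnoDB implements its R-tree; the Node class, the tuple layout of an MBR, and the example tree are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed layout for an MBR: (min_x, min_y, max_x, max_y).
MBR = Tuple[float, float, float, float]


def overlaps(a: MBR, b: MBR) -> bool:
    """True if the two rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class Node:
    """Hypothetical tree node: branch nodes have children, leaves a payload."""
    mbr: MBR
    children: List["Node"] = field(default_factory=list)
    payload: Optional[str] = None


def search(node: Node, box: MBR) -> List[str]:
    """Collect the payloads of all leaves whose MBR overlaps the search box."""
    if not overlaps(node.mbr, box):
        return []  # prune this whole path
    if node.payload is not None:
        return [node.payload]
    hits: List[str] = []
    for child in node.children:  # a wide fan-out, not a binary choice
        hits.extend(search(child, box))
    return hits


def build_example_tree() -> Node:
    """A tiny hand-built tree: one root, two branch nodes, four leaves."""
    a = Node((0, 0, 1, 1), payload="a")
    b = Node((2, 2, 3, 3), payload="b")
    c = Node((10, 10, 11, 11), payload="c")
    d = Node((12, 12, 13, 13), payload="d")
    left = Node((0, 0, 3, 3), [a, b])
    right = Node((10, 10, 13, 13), [c, d])
    return Node((0, 0, 13, 13), [left, right])
```

Calling search(build_example_tree(), (0.5, 0.5, 2.5, 2.5)) only ever descends into the left branch, because the right branch's MBR does not overlap the search box.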
You can set the SRID of geometries in MySQL to any 32-bit unsigned integer, and we will refuse to mix geometries of different SRIDs in the same operation. In calculations, everything will be treated as SRID 0, which in MySQL is a Cartesian system without units (what you get if you don't specify an SRID).
Common alternatives are a fixed address like this, or a GPS location from your mobile device.
I’m using the spherical law of cosines (SLC) formula because it’s simpler, and thus faster, than Haversine, while giving us virtually the same accuracy. Haversine is generally more accurate though, particularly for short distances, which is why it’s the most common method that you’ll see used.
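To make the comparison concrete, here is a minimal Python sketch of both formulas side by side. It is not the stored function from the slides. The function names are made up for illustration, and the radius is the spherical-earth figure of 6,370,986 meters that these notes quote for ST_Distance_Sphere; any other mean radius could be passed in instead.

```python
import math

# Spherical-earth radius in meters (the figure quoted for ST_Distance_Sphere).
EARTH_RADIUS_M = 6_370_986.0


def slc_distance_m(lon1, lat1, lon2, lat2, radius=EARTH_RADIUS_M):
    """Great-circle distance via the spherical law of cosines."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(p1) * math.sin(p2)
                 + math.cos(p1) * math.cos(p2) * math.cos(dlon))
    # Clamp to [-1, 1]: rounding can push the value just outside acos's domain.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return radius * math.acos(cos_angle)


def haversine_distance_m(lon1, lat1, lon2, lat2, radius=EARTH_RADIUS_M):
    """Great-circle distance via the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))
```

Both functions take longitude first, matching the LON/LAT ordering used for points in these notes. Over city-to-city distances the two results should agree closely; the difference only shows up for points that are very close together, where the acos in the SLC version loses precision.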
It’s possible that this need goes away later in 5.7 too. We’re looking into possibly adding an ST_Distance_Sphere() function that returns the great-circle earth distance calculation, in meters, between two geometries (which are points containing X,Y or LON/LAT coordinate pairs).
ST_Distance_Sphere — Returns minimum distance in meters between two lon/lat geometries. Uses a spherical earth and radius of 6,370,986 meters.
We’ll simply use the average length of a degree of longitude and latitude in our coming example. We don’t need the envelope to be too accurate, as we’re simply passing it down to the spatial index and later calculating the actual distance using our new SLC function.
The need for this may go away later in 5.7 too. We’re considering adding an ST_MakeEnvelope() function that takes a Geometry parameter (again a POINT containing a LON/LAT coordinate pair) and an integer to create an envelope that would contain all points within the specified number of meters from the Geometry.
ST_MakeEnvelope(point, distance) --- Make a rectangle around a point (x,y) so that the rectangle's boundaries are 'distance' units (km) away from the point. This is done on an abstract Cartesian plane and simply creates an envelope that contains all points within a radius of approximately <distance> km around the <point>.
In our example query we’re simplifying the bounding box calculation and using <km>/111, 111 km being roughly the length of one degree of latitude (and of one degree of longitude at the equator). It’s not very accurate for longitude, particularly the further you get from the equator, but it doesn’t have to be very accurate for our bounding box, as we’re calculating the actual distance with our SLC function. The bounding box is just passed down to the spatial index so that we can weed out all of the irrelevant points.
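The two-step approach (a coarse <km>/111 bounding box first, then the exact SLC distance) can be sketched in Python as follows. This mirrors the idea of the example query rather than reproducing it: the function names, the (name, lon, lat) tuple layout, and the in-memory list standing in for the indexed table are all assumptions for illustration.

```python
import math

KM_PER_DEGREE = 111.0       # rough length of one degree, as used in the notes
EARTH_RADIUS_KM = 6370.986  # spherical-earth radius quoted in the notes


def envelope(lon, lat, km):
    """Coarse bounding box (min_lon, min_lat, max_lon, max_lat) around a point."""
    d = km / KM_PER_DEGREE
    return (lon - d, lat - d, lon + d, lat + d)


def slc_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km via the spherical law of cosines."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(p1) * math.sin(p2)
                 + math.cos(p1) * math.cos(p2) * math.cos(dlon))
    return EARTH_RADIUS_KM * math.acos(max(-1.0, min(1.0, cos_angle)))


def nearby(points, lon, lat, km):
    """Return (name, distance_km) for points within km, nearest first.

    points is an iterable of (name, lon, lat) tuples.
    """
    min_lon, min_lat, max_lon, max_lat = envelope(lon, lat, km)
    hits = []
    for name, plon, plat in points:
        # Step 1: cheap box test (the part a spatial index would handle).
        if not (min_lon <= plon <= max_lon and min_lat <= plat <= max_lat):
            continue
        # Step 2: exact distance, computed only for the box survivors.
        dist = slc_km(lon, lat, plon, plat)
        if dist <= km:
            hits.append((name, dist))
    return sorted(hits, key=lambda hit: hit[1])
```

A point in the corner of the box passes step 1 but is rejected in step 2, which is why the box itself can be rough. One caveat: away from the equator a degree of longitude is shorter than 111 km, so this simplified box is narrower east-west than the search radius; dividing the longitude half-width by cos(latitude) would be one way to widen it.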
Projections allow you to more easily and accurately measure distance and location within a small area using a CRS and SRID values.
With 3D support, you have the Z axis in addition to the X and Y axes. The Z axis is then the height, but it's taken from a simpler North/South measurement of a sphere.
Geodetic data essentially allows you to measure height as well, but far more accurately, using transformations based on various real-world phenomena and the measurements of them: for example, the actual shape of the earth (which is an oblate spheroid), the tilt of the earth, and the movement/spin of the earth in space. For example, the Helmert transformation is one common method used for taking the X, Y, and Z axis values and transforming them into a very accurate real-world position in 3-dimensional space.
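For a feel of what a Helmert transformation does to an X, Y, Z triple, here is a minimal Python sketch of the common seven-parameter, small-angle form: three translations, three small rotations, and one scale factor. Several things here are assumptions for illustration only: the function name, the "position vector" sign convention for the rotations (the opposite convention is also in use), and all parameter values. Real parameter sets come from published datum definitions.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def helmert(point: Vec3,
            translation: Vec3 = (0.0, 0.0, 0.0),
            rotation_rad: Vec3 = (0.0, 0.0, 0.0),
            scale_ppm: float = 0.0) -> Vec3:
    """Seven-parameter Helmert transform, small-angle approximation.

    Assumes the position-vector rotation convention; rotations are in
    radians and must be small for the linearised matrix to be valid.
    """
    x, y, z = point
    tx, ty, tz = translation
    rx, ry, rz = rotation_rad
    m = 1.0 + scale_ppm * 1e-6  # scale factor from parts per million
    return (
        tx + m * (x - rz * y + ry * z),
        ty + m * (rz * x + y - rx * z),
        tz + m * (-ry * x + rx * y + z),
    )
```

With all seven parameters left at zero the point comes back unchanged, and a translation-only call simply shifts it, which makes the individual contributions easy to check one at a time.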
Spatial reference systems, in addition to supporting projections, also allow you to easily generate envelopes, or MBRs, to push down to the spatial index for your searches, alleviating the need to create them by hand as we did in our previous example.
The OGC INFORMATION_SCHEMA (I_S) tables provide a wealth of information. For example, you can see the definitions of all SRIDs as defined by the OGC and by EPSG.