A talk I gave at the 2014 MAC URISA Conference in Atlantic City. GIS users often have little exposure to SQL. This talk gives a brief overview of SQL from a GIS user's perspective and provides some examples of how it can be used in place of common ArcGIS/desktop GIS tasks to improve efficiency.
2. #MACURISA2014
Database Management Systems (DBMSs)
! Many of the modern DBMSs support spatial data.
! Oracle, Microsoft SQL Server, and PostgreSQL are the most often used.
! PostgreSQL is
! open source/free to use and modify
! incredibly reliable, extensible, powerful
! provides spatial capabilities through PostGIS
! DBMSs allow for “enterprise” functionality, like
multiple users/concurrency, high output, etc.
3.
Structured Query Language
! SQL is the standardized method of interacting with
a database
! Even Access allows you to use SQL
! Common, hopefully familiar, statements:
! Select (read from database)
! Insert (new records into a DBMS)
! Update (existing records in DBMS)
! Delete (remove records from DBMS)
! Where (a clause rather than a statement; limits your results)
4.
Select Statements
! Most common SQL
query you will
encounter
! “Select By Attributes”
has this as the
foundation
! Nothing more than
“SELECT * FROM
gis_layer WHERE…”
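As a minimal sketch of the SELECT ... WHERE pattern, the snippet below runs a query against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for a full DBMS here, and the columns and rows of gis_layer are made up for illustration.

```python
import sqlite3

# In-memory SQLite database; the columns and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gis_layer ("
    "objectid INTEGER PRIMARY KEY, mun TEXT, county TEXT, pop INTEGER)"
)
conn.executemany(
    "INSERT INTO gis_layer (mun, county, pop) VALUES (?, ?, ?)",
    [
        ("Alpha", "North", 12000),
        ("Bravo", "North", 3000),
        ("Charlie", "South", 8000),
        ("Delta", "North", 7500),
    ],
)

# The WHERE clause plays the role of the expression typed into
# "Select By Attributes".
selected = conn.execute(
    "SELECT mun, pop FROM gis_layer "
    "WHERE county = ? AND pop > ? "
    "ORDER BY mun",
    ("North", 5000),
).fetchall()

print(selected)
```

Only the rows that satisfy both conditions come back; everything else in the table is left out of the result.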
5.
6.
Joins
! In ArcGIS or Access, you join two (or more) tables
together using a shared key field.
! If the keys match, the fields from the secondary tables
are tacked on to the first
! Again, geospatial is special, so GIS has another
type of join
7.
Combining Tables
! The simplest combination of two tables would be to
combine each record from table A with each record
from table B.
! The Cartesian Product.
! Example: A has 2 records, B has 3.
! A ✕ B: {(A1, B1), (A1, B2), (A1, B3),
(A2, B1), (A2, B2), (A2, B3)}
! Let’s take a deck of cards as an example.
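One way to sketch the deck of cards idea is a CROSS JOIN of a ranks table with a suits table. The table and column names below are assumptions, and SQLite (via Python's sqlite3) stands in for the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ranks (card_rank TEXT)")
conn.execute("CREATE TABLE suits (suit_name TEXT)")
conn.executemany(
    "INSERT INTO ranks VALUES (?)",
    [(r,) for r in "A 2 3 4 5 6 7 8 9 10 J Q K".split()],
)
conn.executemany(
    "INSERT INTO suits VALUES (?)",
    [(s,) for s in ("clubs", "diamonds", "hearts", "spades")],
)

# CROSS JOIN has no join condition: every rank is paired with every suit.
deck = conn.execute(
    "SELECT card_rank, suit_name FROM ranks CROSS JOIN suits"
).fetchall()

print(len(deck))  # 13 ranks x 4 suits = 52 cards
```

The row count of a Cartesian product is always the product of the two table sizes, which is why an unconstrained join of two large layers gets expensive quickly.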
8.
9.
10.
Joins
! Think of a Join as limiting the Cartesian Product of
two tables down to just the specific records desired.
! The manner in which you form your SELECT … JOIN
will be important:
! Ensure the desired records and columns are returned.
! Speed of the JOIN performed.
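The idea of a join as a filtered Cartesian product can be sketched with two small tables. The parcels and owners tables are hypothetical, and SQLite through Python's sqlite3 is used only so the example is runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parcels (parcel_id INTEGER, owner_id INTEGER);
    CREATE TABLE owners (owner_id INTEGER, name TEXT);
    INSERT INTO parcels VALUES (1, 10), (2, 20), (3, 10);
    INSERT INTO owners VALUES (10, 'Smith'), (20, 'Jones');
    """
)

# No condition: every parcel paired with every owner (3 x 2 rows).
cartesian = conn.execute(
    "SELECT p.parcel_id, o.name FROM parcels p, owners o"
).fetchall()

# Same product, limited to the pairs whose keys match.
filtered = conn.execute(
    "SELECT p.parcel_id, o.name FROM parcels p, owners o "
    "WHERE p.owner_id = o.owner_id ORDER BY p.parcel_id"
).fetchall()

# The explicit JOIN ... ON form returns the same rows.
joined = conn.execute(
    "SELECT p.parcel_id, o.name FROM parcels p "
    "JOIN owners o ON p.owner_id = o.owner_id ORDER BY p.parcel_id"
).fetchall()

print(len(cartesian), filtered, joined)
```

Both forms describe the same result; how quickly it is produced depends on the DBMS, the indexes available, and how the predicate is written.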
11.
Spatial Joins
! Relationship not determined by key, but by
proximity or connectivity
! Contains/Within/Overlaps
! One feature falls entirely within another
! Touches/Intersects/Crosses
! One feature touches another
! Equals or Disjoint
12.
Set Theory
! General terms first, because these concepts are
used across GIS and not just in SQL.
! Union
! Intersection
! Relative Complement
! Symmetric Difference
! Terms should be
somewhat familiar…
13.
Union
! ArcToolbox: returns all features from both inputs, with
new features created where they intersect.
! SQL: Set of all values from both tables.
! Join: A FULL JOIN – all values from two tables,
with NULL values where there are not shared values.
! Venn:
14.
Unions are not Cartesian
! Union / FULL JOIN will leave NULLs where there
are no matches across tables. All records will be
returned, but the records will not be “shuffled”
together like the cards example.
! FULL JOINs still require an ON (or USING) predicate
to create the join.
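A small sketch of the difference between a full join and a Cartesian product follows. It assumes two hypothetical single-column tables, a and b. Because SQLite only gained FULL JOIN in version 3.39, the full join is emulated here with a LEFT JOIN plus the unmatched rows from the other side; in PostgreSQL a plain FULL JOIN ... ON would do the same job.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (value INTEGER);
    CREATE TABLE b (value INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
    """
)

# Emulated FULL JOIN: all rows of a with their matches in b,
# plus the rows of b that found no match in a.
full_join = conn.execute(
    """
    SELECT * FROM (
        SELECT a.value AS a_value, b.value AS b_value
        FROM a LEFT JOIN b ON a.value = b.value
        UNION ALL
        SELECT a.value, b.value
        FROM b LEFT JOIN a ON a.value = b.value
        WHERE a.value IS NULL
    )
    ORDER BY COALESCE(a_value, b_value)
    """
).fetchall()

# For comparison: the Cartesian product pairs everything with everything.
cartesian_count = conn.execute(
    "SELECT COUNT(*) FROM a CROSS JOIN b"
).fetchone()[0]

print(full_join)
print(cartesian_count)
```

The full join yields four rows, with None (NULL) filling the side that has no match, while the Cartesian product of the same tables has nine.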
18.
Intersection
! ArcToolbox: returns a set where the geometries of
two different feature classes overlap.
! SQL: Only where the two tables share values.
! Join: An INNER JOIN – intersection of two tables.
! Venn:
19.
LEFT & RIGHT JOINs
! ArcToolbox: called Update.
! SQL: All records in Table A, along with some columns/
records from Table B.
! Join: A LEFT JOIN – columns from B will contain NULL if
there is no match. All records from A returned. (A RIGHT
JOIN is just an easy way of writing the reverse.)
! Venn:
! Examples?
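One possible example, using hypothetical towns and inspections tables in SQLite via Python's sqlite3: a LEFT JOIN keeps every town and fills in NULL where no inspection record matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE towns (town_id INTEGER PRIMARY KEY, mun TEXT);
    CREATE TABLE inspections (town_id INTEGER, inspected_year INTEGER);
    INSERT INTO towns VALUES (1, 'Alpha'), (2, 'Bravo'), (3, 'Charlie');
    INSERT INTO inspections VALUES (1, 2013), (3, 2014);
    """
)

# Every town is returned; inspected_year is NULL (None) without a match.
left_joined = conn.execute(
    """
    SELECT t.mun, i.inspected_year
    FROM towns t
    LEFT JOIN inspections i ON i.town_id = t.town_id
    ORDER BY t.mun
    """
).fetchall()

# Filtering on the NULL side lists the towns with no matching record.
unmatched = conn.execute(
    """
    SELECT t.mun
    FROM towns t
    LEFT JOIN inspections i ON i.town_id = t.town_id
    WHERE i.town_id IS NULL
    """
).fetchall()

print(left_joined)
print(unmatched)
```

Swapping the table order and writing RIGHT JOIN would describe the same result in a DBMS that supports it.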
20.
Symmetric Difference
! ArcToolbox: returns the portions of feature classes A
and B that do not overlap each other.
! SQL: Only where the two tables do not share
values.
! Join: A FULL JOIN ON a.value = b.value, WHERE a.value
IS NULL OR b.value IS NULL
! Venn:
! Examples?
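One way to get only the rows that the two tables do not share is to keep the unmatched rows from each side of a full join. The sketch below assumes hypothetical tables a and b and runs on SQLite, which only gained FULL JOIN in version 3.39, so each unmatched side is collected with its own LEFT JOIN. In PostgreSQL the same result should come from a FULL JOIN ON a.value = b.value filtered with WHERE a.value IS NULL OR b.value IS NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (value INTEGER);
    CREATE TABLE b (value INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
    """
)

# Rows of a with no match in b, then rows of b with no match in a.
symmetric_difference = conn.execute(
    """
    SELECT * FROM (
        SELECT a.value AS a_value, b.value AS b_value
        FROM a LEFT JOIN b ON a.value = b.value
        WHERE b.value IS NULL
        UNION ALL
        SELECT a.value, b.value
        FROM b LEFT JOIN a ON a.value = b.value
        WHERE a.value IS NULL
    )
    ORDER BY COALESCE(a_value, b_value)
    """
).fetchall()

print(symmetric_difference)
```

The shared values 2 and 3 drop out, leaving 1 (only in a) and 4 (only in b).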
21.
Many types of Joins
! INNER and OUTER (LEFT, RIGHT, FULL)
! Different from Cartesian Product because some
comparison value needs to be tested for truth.
! Truth testing can use =, <>, <, or >, or can be the
result of a function.
! Spatial Joins in SQL:
! ST_Intersects(a.shape, b.shape)
! ST_Contains(a.shape, b.shape), ST_Within()
! ST_Overlaps(a.shape, b.shape)
! ST_Touches(a.shape, b.shape)
23.
Fire Stations in Town
! How can we calculate
the number of fire
stations within a
municipality?
! Can we find the most?
! Can we find the least?
! How about those towns
with no fire stations?
! How about those with a
specific number of fire
stations?
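These questions can be sketched with a LEFT JOIN and GROUP BY. The example is a toy stand-in rather than real PostGIS: municipalities are axis-aligned boxes, fire stations are points, and a plain coordinate comparison takes the place of a predicate such as ST_Contains(m.shape, s.shape). All table names, column names, and coordinates are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE munis (mun TEXT, xmin REAL, ymin REAL, xmax REAL, ymax REAL);
    CREATE TABLE stations (name TEXT, x REAL, y REAL);
    INSERT INTO munis VALUES
        ('Alpha', 0, 0, 10, 10),
        ('Bravo', 10, 0, 20, 10),
        ('Charlie', 0, 10, 10, 20);
    INSERT INTO stations VALUES ('S1', 2, 3), ('S2', 5, 5), ('S3', 15, 5);
    """
)

# LEFT JOIN keeps towns with zero stations; COUNT(s.name) ignores NULLs.
COUNT_SQL = """
SELECT m.mun, COUNT(s.name) AS n_stations
FROM munis m
LEFT JOIN stations s
  ON s.x > m.xmin AND s.x < m.xmax
 AND s.y > m.ymin AND s.y < m.ymax
GROUP BY m.mun
"""

# Sorted so the most stations come first and the fewest last.
counts = conn.execute(
    COUNT_SQL + " ORDER BY n_stations DESC, m.mun"
).fetchall()


def towns_with_exactly(n):
    """Towns whose station count equals n (n = 0 gives towns with none)."""
    return conn.execute(
        COUNT_SQL + " HAVING COUNT(s.name) = ?", (n,)
    ).fetchall()


print(counts)
print(towns_with_exactly(0))
```

Using an INNER JOIN instead would silently drop the towns with no fire stations, which is why the outer join matters for the "no stations" question.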
30.
Bus Routes
! How can we find the towns
that are along a given bus
route?
! How do we find the routes
that cross through a town?
! How do we find the towns
without service?
! bus.line = 553
AND ST_Intersects(bus.shape, mun.shape)
31.
Self-Joins
! A table can be referenced
twice in the same query.
! How could we use this to
generate a “neighbor” list?
! How would we generate
that list of towns?
! FROM nj_munis m, nj_munis x
WHERE m.mun <> x.mun
AND ST_Touches(m.shape, x.shape)
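A runnable toy version of the neighbor list is sketched below. It is not PostGIS: towns are axis-aligned boxes in SQLite, and a pair of coordinate comparisons stands in for ST_Touches (the boxes share a boundary or corner, but their interiors do not overlap). The box coordinates and town names are made up.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE nj_munis (mun TEXT, xmin REAL, ymin REAL, xmax REAL, ymax REAL);
    INSERT INTO nj_munis VALUES
        ('Alpha', 0, 0, 10, 10),
        ('Bravo', 10, 0, 20, 10),
        ('Charlie', 0, 10, 10, 20),
        ('Delta', 30, 30, 40, 40);
    """
)

# The same table appears twice under two aliases, m and x.
pairs = conn.execute(
    """
    SELECT m.mun, x.mun
    FROM nj_munis m, nj_munis x
    WHERE m.mun <> x.mun
      -- boxes meet or overlap, boundaries included ...
      AND m.xmin <= x.xmax AND x.xmin <= m.xmax
      AND m.ymin <= x.ymax AND x.ymin <= m.ymax
      -- ... but their interiors do not overlap
      AND NOT (m.xmin < x.xmax AND x.xmin < m.xmax
           AND m.ymin < x.ymax AND x.ymin < m.ymax)
    ORDER BY m.mun, x.mun
    """
).fetchall()

# Collapse the (town, neighbor) pairs into one list per town.
neighbors = defaultdict(list)
for town, neighbor in pairs:
    neighbors[town].append(neighbor)

print(dict(neighbors))
```

Bravo and Charlie meet only at a corner and still count as neighbors in this toy test. Delta touches nothing, so it never appears in the list.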
32.
Denny's & La Quinta
! Using SQL to remove
the humor from jokes…
SELECT d.city, d.state,
ST_Transform(d.shape, 2163) <-> ST_Transform(l.shape, 2163) AS distance
FROM dennys d, laquinta l
WHERE (ST_Transform(d.shape, 2163) <-> ST_Transform(l.shape, 2163)) < 150
ORDER BY 3;
34.
Power of SQL
! Speed.
! Flexibility.
! Data integrity and control.
! Automated reports as data changes.
! Views and functions can help automate and
streamline your GIS workflow.
! A bit of a learning curve, but SQL is a standard
and is supported and understood by a wide variety
of applications and data stores.
35.
More Info and Thanks!
! John Reiser
email: jreiser@njgeo.org
twitter: @johnjreiser
code: github.com/johnjreiser
! Articles on New Jersey Geographer:
http://njgeo.org/
! "Mitch Hedberg & GIS" (using PostgreSQL):
http://njgeo.org/2014/01/30/mitch-hedberg-and-gis/