Some Developments in Space-Time Modelling with GIS
Tao Cheng – University College London (U.K)
Intelligent Analysis of Environmental Data (S4 ENVISA Workshop 2009)
1. Integrated Spatio-Temporal Data Mining for Network Complexity
Tao Cheng
Senior Lecturer
Department of Civil, Environmental & Geomatic Engineering
University College London
Email: tao.cheng@ucl.ac.uk
2. Dr Tao Cheng – Background
• Data quality and uncertainty of spatial objects [1]
• Multi-scale spatio-temporal data modelling and analysis
• Intelligent spatio-temporal data mining [3]
• Some relevant projects:
– 4D GIS for decision support system [1]
– Managing uncertainty and temporal updating (EU)
– Experimental modelling of changing activity patterns using GIS (HK) [2]
– Location-Based Services and the Beijing Olympics (HK)
– Spatio-temporal data mining (PRC) [3] [2]
3. Outline
• Why integrated spatio-temporal data mining?
• Existing ST analysis methods
– STARIMA, ANN, SVM
• Our approach
– A hybrid model – ANN + STARIMA
– Space-Time Neural Networks – STANN
– Space-Time Support Vector Machines – STSVM
• ISTDM for network complexity?
4. Characteristics of ST Data
• Dynamic, multi-dimensional, multi-scale
• Spatial dependence
“Everything is related to everything else, but near things are more
related than distant things” — Tobler, First Law of Geography
“If the presence of some quantity in a county (sampling unit)
makes its presence in neighbouring counties (sampling units)
more or less likely, we say that the phenomenon exhibits spatial
autocorrelation” — Cliff and Ord
• Temporal dependence
• Heterogeneity & nonlinearity
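The spatial autocorrelation described by Cliff and Ord is commonly summarised with global Moran's I. The sketch below is a minimal illustration only: the four sampling units, their adjacency matrix, and the two value patterns are hypothetical and do not come from the slides.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for one snapshot of values at n sampling units."""
    z = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    dev = z - z.mean()                      # deviations from the mean
    n = z.size
    s0 = w.sum()                            # sum of all spatial weights
    return (n / s0) * (dev @ w @ dev) / (dev @ dev)

# Hypothetical example: four units on a line, neighbours share a border.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

smooth = [1.0, 2.0, 3.0, 4.0]        # similar values sit next to each other
alternating = [1.0, 4.0, 1.0, 4.0]   # neighbours are dissimilar

i_smooth = morans_i(smooth, W)
i_alternating = morans_i(alternating, W)
```

A positive value indicates that neighbouring units tend to carry similar values, and a negative value indicates that they tend to differ.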
5. Existing ST analysis methods
• time series analysis + spatial correlation
• spatial statistics + the time dimension
• time series analysis + artificial neural networks
ST dependence ≠ space + time
Integrated modelling of ST is needed –
• seamless & simultaneous
• ST-association/autocorrelation
6. Principle of ST Modelling
Space-time data = global (deterministic) space-time trends +
local (stochastic) space-time variations
$$z_i(t) = \mu_i(t) + e_i(t)$$
• $z_i(t)$: the observation of the data series at spatial location $i$ and time $t$.
• $\mu_i(t)$: space-time patterns that explain large-scale deterministic space-time trends; they can be expressed as a nonlinear function in space and time.
• $e_i(t)$: the residual term, a zero-mean space-time correlated error that explains small-scale stochastic space-time variations.
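The decomposition into trend and residual can be illustrated on synthetic data. This is a sketch under stated assumptions: the data are simulated, and a linear least-squares trend in location index and time is used as a simple stand-in for the (generally nonlinear) trend function; all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sites, n_times = 5, 40

# Index grids: i_idx[i, t] = i and t_idx[i, t] = t.
i_idx, t_idx = np.meshgrid(np.arange(n_sites), np.arange(n_times), indexing="ij")

# Simulated observations: a deterministic trend plus a zero-mean error.
mu_true = 2.0 + 0.5 * i_idx + 0.1 * t_idx
z = mu_true + rng.normal(0.0, 0.3, size=mu_true.shape)

# Estimate the trend by least squares on the basis [1, i, t] (assumed linear here).
X = np.column_stack([np.ones(z.size), i_idx.ravel(), t_idx.ravel()])
beta, *_ = np.linalg.lstsq(X, z.ravel(), rcond=None)
mu_hat = (X @ beta).reshape(z.shape)

# The residual is whatever the trend does not explain.
e_hat = z - mu_hat
```

Because the basis includes a constant term, the estimated residual has zero mean by construction; any remaining space-time correlation in it is what the stochastic model is meant to capture.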
7. Model 1 - STARIMA - Spatio-Temporal Auto-Regressive Integrated Moving Average
$$z_i(t) = \sum_{k=1}^{p} \sum_{h=0}^{m_k} \phi_{kh} W^{(h)} z(t-k) - \sum_{l=1}^{q} \sum_{h=0}^{n_l} \theta_{lh} W^{(h)} \varepsilon(t-l) + \varepsilon(t)$$
(Pfeifer P E and Deutsch S J, 1980)
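To make the autoregressive part of the formula concrete, the sketch below simulates and fits the simplest special case, STAR(1,1), with $p = 1$, $m_1 = 1$, $q = 0$, $W^{(0)} = I$ and $W^{(1)} = W$. The ring layout of four sites, the coefficient values, and the function names are assumptions made for illustration, and ordinary least squares is used here rather than any particular published estimation procedure.

```python
import numpy as np

def ring_weights(n):
    """Row-normalised first-order weights for n sites arranged on a ring."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5
        W[i, (i + 1) % n] = 0.5
    return W

def simulate_star11(W, phi10, phi11, T, sigma, rng):
    """Simulate z(t) = phi10*z(t-1) + phi11*W z(t-1) + eps(t); returns shape (T, n)."""
    n = W.shape[0]
    Z = np.zeros((T, n))
    for t in range(1, T):
        Z[t] = phi10 * Z[t - 1] + phi11 * (W @ Z[t - 1]) + rng.normal(0.0, sigma, n)
    return Z

def fit_star11(Z, W):
    """Least-squares estimate of (phi10, phi11) from a (T, n) array."""
    y = Z[1:].ravel()
    # Row t of Z[:-1] @ W.T holds the spatially lagged values W z(t).
    X = np.column_stack([Z[:-1].ravel(), (Z[:-1] @ W.T).ravel()])
    phi, *_ = np.linalg.lstsq(X, y, rcond=None)
    return phi

W = ring_weights(4)
Z = simulate_star11(W, phi10=0.5, phi11=0.3, T=2000, sigma=1.0,
                    rng=np.random.default_rng(0))
phi_hat = fit_star11(Z, W)  # should land close to (0.5, 0.3)
```

The spatial lag $W z(t-1)$ is what distinguishes this from fitting an independent AR(1) model at each site.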
8. Model 2 - ANN - Artificial Neural Networks
(Mandic D P and Chambers JA, 2001)
SFNN – spatial interpolation; DRNN – time series analysis
(a) static neuron: $\hat{z}_i = \sum_{j=1}^{n} iw_{ij} \cdot z_j + b$
(b) dynamic neuron: $\hat{z}(t) = iw \cdot z(t) + lw \cdot \hat{z}(t-1) + b$
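The difference between the two neuron types can be sketched in a few lines. This assumes linear neurons exactly as written in the slide equations (no activation function), and assumes that the recurrent term of the dynamic neuron feeds back the neuron's own previous output; the function names and numbers are hypothetical.

```python
import numpy as np

def static_neuron(z_neighbours, iw, b):
    """Static neuron: weighted sum of values observed at other locations."""
    return float(np.dot(iw, z_neighbours) + b)

def dynamic_neuron(z_series, iw, lw, b, z_hat0=0.0):
    """Dynamic neuron: current input plus feedback of the previous output."""
    z_hat_prev = z_hat0
    outputs = []
    for z_t in z_series:
        z_hat = iw * z_t + lw * z_hat_prev + b
        outputs.append(z_hat)
        z_hat_prev = z_hat          # the output is fed back at the next step
    return outputs

# Spatial interpolation from three neighbouring values.
interpolated = static_neuron([1.0, 2.0, 3.0], iw=[0.2, 0.3, 0.5], b=0.1)

# Time series filtering of a constant input.
filtered = dynamic_neuron([1.0, 1.0, 1.0], iw=0.5, lw=0.5, b=0.0)
```

The static neuron has no memory, so it suits interpolation across space; the dynamic neuron carries state between time steps, so it suits time series.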
9. ANN for space-time trend analysis
$$\hat{\mu}_i(t) = f\left(\sum_{k=1}^{n} \beta_k f(i,t) + \beta_0\right)$$
Tao Cheng, Jiaqiu Wang, Xia Li, Accommodating Spatial Associations in
DRNN for Space-Time Analysis, Computers, Environment and Urban Systems,
under review
10. Model 3 – SVM - Support Vector Machines
SVC & SVR (Vapnik et al, 1996)
11. Our approach – Integrated modelling of ST
Model 1 – STARIMA
$$z_i(t) = \sum_{k=1}^{p} \sum_{h=0}^{m_k} \phi_{kh} W^{(h)} z(t-k) - \sum_{l=1}^{q} \sum_{h=0}^{n_l} \theta_{lh} W^{(h)} \varepsilon(t-l) + \varepsilon(t)$$
• define weights based upon spatial distance and
spatial adjacency
• consider anisotropy
• able to model spatially continuous phenomena
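One possible way to build distance-based weights with a crude form of anisotropy is sketched below. It is an assumption-laden illustration, not the weighting scheme used in the talk: the inverse-distance form, the `cutoff` parameter, and the `axis_scale` stretch factors are all hypothetical choices.

```python
import numpy as np

def inverse_distance_weights(coords, cutoff, axis_scale=(1.0, 1.0)):
    """Row-normalised inverse-distance weights between 2D locations.

    axis_scale stretches each coordinate axis before distances are computed,
    so that separation along one direction counts for more than along the
    other (a simple, assumed way to express anisotropy).
    """
    pts = np.asarray(coords, dtype=float) * np.asarray(axis_scale, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    w = np.zeros_like(dist)
    neighbours = (dist > 0) & (dist <= cutoff)   # exclude self and far sites
    w[neighbours] = 1.0 / dist[neighbours]

    # Normalise each row to sum to 1; rows without neighbours stay all zero.
    row_sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

# Three hypothetical sites along the x axis.
sites = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
W_iso = inverse_distance_weights(sites, cutoff=2.5)
```

With the cutoff at 2.5, the first and last sites are not neighbours of each other, and the middle site weights its nearer neighbour twice as heavily as the farther one.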
12. Model 2 - Hybrid Modelling
$$z_i(t) = \mu_i(t) + e_i(t)$$
• ANN to model nonlinear space-time trends, $\mu_i(t)$
• STARIMA to model stochastic space-time variations, $e_i(t)$
• overcome the limits of STARIMA
– stationarity
– linearity
Tao Cheng, Jiaqiu Wang, Xia Li, A Hybrid Framework for Space-Time Modeling of
Environmental Data, Geographical Analysis, under review
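The two-stage idea can be sketched on simulated data. Everything here is an assumption made for illustration: the data are synthetic, a least-squares fit on a sinusoidal basis stands in for the ANN trend model, and a STAR(1,1) model fitted by least squares stands in for the full STARIMA stage.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 4, 300

# Hypothetical ring of four sites with row-normalised neighbour weights.
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

# Synthetic data: nonlinear trend plus a space-time correlated residual.
e = np.zeros((T, n))
for t in range(1, T):
    e[t] = 0.4 * e[t - 1] + 0.3 * (W @ e[t - 1]) + rng.normal(0.0, 0.2, n)
t_grid, s_grid = np.meshgrid(np.arange(T), np.arange(n), indexing="ij")
Z = 5.0 + np.sin(2 * np.pi * t_grid / 50) + 0.5 * s_grid + e

# Stage 1: estimate the trend (least squares here, as a stand-in for the ANN).
basis = np.column_stack([
    np.ones(Z.size),
    np.sin(2 * np.pi * t_grid.ravel() / 50),
    np.cos(2 * np.pi * t_grid.ravel() / 50),
    s_grid.ravel(),
])
beta, *_ = np.linalg.lstsq(basis, Z.ravel(), rcond=None)
mu_hat = (basis @ beta).reshape(T, n)
e_hat = Z - mu_hat

# Stage 2: model the residual with STAR(1,1) (a special case of STARIMA).
X = np.column_stack([e_hat[:-1].ravel(), (e_hat[:-1] @ W.T).ravel()])
phi, *_ = np.linalg.lstsq(X, e_hat[1:].ravel(), rcond=None)

# One-step fitted values: trend alone versus trend plus residual model.
Z_hybrid = mu_hat[1:] + phi[0] * e_hat[:-1] + phi[1] * (e_hat[:-1] @ W.T)
rmse_trend = np.sqrt(np.mean((Z[1:] - mu_hat[1:]) ** 2))
rmse_hybrid = np.sqrt(np.mean((Z[1:] - Z_hybrid) ** 2))
```

Comparing `rmse_hybrid` with `rmse_trend` shows how much of the variation left over by the trend model is picked up by the space-time residual model.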
13. Model 3 - STANN
Space-Time Neuron: $\hat{z}_i(t) = \sum_{j=1}^{n} iw_{ji}^{(1)} \cdot z_j(t-1) + lw^{(0)} \cdot \hat{z}_i(t-1) + b$
• One-step implementation of ANN + STARIMA
• Accommodate ST associations in ANN
• Deal with nonlinearity & heterogeneity in BP learning
Jiaqiu Wang, Tao Cheng, STANN – Modeling Space-Time Series by Artificial Neural
Networks, International Journal of Geographical Information Science, under review
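A forward pass through a layer of space-time neurons might look like the sketch below. It assumes the linear form of the neuron equation, with spatial input weights applied to the previous observations at all locations and a single recurrent weight applied to the neuron's own previous output; the array layout and names are hypothetical, and no training (BP learning) is shown.

```python
import numpy as np

def stann_forward(Z, IW, lw, b):
    """Forward pass of space-time neurons over a (T, n) array of observations.

    IW[j, i] is the weight from location j to location i, so the spatial
    term for all locations at once is IW.T @ z(t-1).
    """
    T, n = Z.shape
    Z_hat = np.zeros((T, n))
    for t in range(1, T):
        spatial_term = IW.T @ Z[t - 1]       # sum over j of iw_ji * z_j(t-1)
        temporal_term = lw * Z_hat[t - 1]    # feedback of the previous output
        Z_hat[t] = spatial_term + temporal_term + b
    return Z_hat

# Two hypothetical locations observed at three time steps.
Z_obs = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])
IW = np.array([[0.5, 0.0],
               [0.5, 1.0]])
Z_hat = stann_forward(Z_obs, IW, lw=0.5, b=0.0)
```

Setting entries of `IW` to zero for non-neighbouring locations is one way the spatial structure could be imposed on the network.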
14. Model 4 - STSVR
• Nonlinear Spatio-Temporal Regression by SVM
• Develop ST kernel function
• Overcome over-fitting in STANN
• Deal with errors
• Model nonlinearity & heterogeneity
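One common way to construct a space-time kernel is to multiply a spatial kernel by a temporal kernel, since the product of two positive semi-definite kernels is again positive semi-definite. The sketch below uses Gaussian kernels for both factors; this is an assumed construction for illustration and not necessarily the ST kernel developed in this work. The bandwidth names `sigma_s` and `sigma_t` are hypothetical.

```python
import numpy as np

def st_kernel(A, B, sigma_s=1.0, sigma_t=1.0):
    """Product of a spatial and a temporal Gaussian kernel.

    A and B are arrays whose rows are (x, y, t) samples.
    Returns the Gram matrix of shape (len(A), len(B)).
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    ds2 = ((A[:, None, :2] - B[None, :, :2]) ** 2).sum(axis=-1)  # squared spatial distance
    dt2 = (A[:, None, 2] - B[None, :, 2]) ** 2                   # squared time difference
    return np.exp(-ds2 / (2 * sigma_s ** 2)) * np.exp(-dt2 / (2 * sigma_t ** 2))

# Hypothetical samples: same place at three times, plus one distant place.
samples = np.array([[0.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0],
                    [0.0, 0.0, 5.0],
                    [3.0, 4.0, 0.0]])
K = st_kernel(samples, samples)
```

A Gram matrix like `K` can be supplied to a support vector regressor that accepts precomputed kernels, such as scikit-learn's `SVR(kernel="precomputed")`.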
25. Team (April 2009 – March 2012)
• UCL
– Dr Tao Cheng (PI), Senior Lecturer in GIS
– Prof. Benjamin Heydecker (Co-I), Professor of Transport Studies
– Dr Jingxin Dong, Transport Modelling (F1)
– Dr Jiaqiu Wang, GIS (F2)
– RS, MSc in GIS – SVM/GWR
– EngD, MSc in Transport – Simulation
– 3 visiting scholars, each 2 months
Other PhDs
– Mr Berk Anbaroglu (RS), BSc in Computer Science – outlier
detection
– Ms Garavig Tanaksaranond (RS), MSc in GIS – dynamic
visualization
• TfL RNP&R
– Mr Andy Emmonds, Principal Transport Analyst
– Mr Mike Tarrier, Head of RNP&R
– Mr Jonathan Turner, Performance Analyst
26. Aim
• To quantitatively measure road network
performance
• To understand causes of traffic congestion
– association between traffic and interventions
• traffic flow, speed/journey time
• incidents, road works, signal changes and bus lane changes
• Case study – London
27. What’s new?
• data-driven, mining
• integrated space and time
– ST associations
• combine regression analysis with machine
learning
– improve the sensitivity and explanatory power
• study the heterogeneity and scale of road
performance
– optimal scale for monitoring
28. ISTDM for Network Complexity
1) Dynamics
2) Spatial dependence
3) Spatio-temporal interactions
4) Heterogeneity
Modelling spatiality and spatio-temporal dependence (autocorrelation) of networks is the bottleneck.
29. London Road Networks
• Cordons: Central, Inner, Outer
• Screenlines: Thames, Northern, five radials, four peripherals
30. Challenge (2) - Data issues
• massive – 20GB monthly
• multi-sourced related to 5 different networks
• different scales (density & frequency)
• variable data quality
• contain conflicts, errors, mistakes and gaps
31. Methodology: some preliminary thoughts
• accommodate network structure (topology &
geometry)
• model spatio-temporal correlation
• investigate network heterogeneity
– STGWR
• model impacts of interventions
– STARIMA & DRNN; hybrid; STANN
• traffic pattern clustering and long-term
prediction
– STANN; STSVM
• sensitivity analysis and accuracy assessment
• simulate congestion in the short term
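As a first look at spatio-temporal correlation on a network, one simple diagnostic is the lagged cross-correlation between series on two connected links. The sketch below is illustrative only: the upstream and downstream series are synthetic, the three-step delay is made up, and the function name is hypothetical.

```python
import numpy as np

def lagged_cross_correlation(upstream, downstream, lag):
    """Pearson correlation between upstream(t) and downstream(t + lag), lag >= 0."""
    x = np.asarray(upstream, dtype=float)
    y = np.asarray(downstream, dtype=float)
    if lag > 0:
        x, y = x[:-lag], y[lag:]    # align upstream(t) with downstream(t + lag)
    return float(np.corrcoef(x, y)[0, 1])

rng = np.random.default_rng(2)
upstream = rng.normal(size=200)

# Hypothetical downstream link that repeats the upstream signal 3 steps later.
downstream = np.roll(upstream, 3)

correlations = [lagged_cross_correlation(upstream, downstream, lag)
                for lag in range(6)]
best_lag = int(np.argmax(correlations))
```

The lag with the highest correlation gives a rough indication of how long a disturbance takes to travel between the two links.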