Spatial Data Analysis
2/2
Johan Blomme | Leenstraat 11 | 8340 Damme
info@data-insights.be
There is an increased interest in understanding spatially varying processes to explain various social,
political and economic outcomes. Using global and local statistics can lead to completely
different insights into the relationship between area-level characteristics and outcomes.
Two types of spatial analysis are especially relevant :
– spatial autocorrelation : the application of local clustering analysis to establish
significant local patterns and the use of spatial econometrics to account for spatial
effects in regression analysis ;
– spatial heterogeneity : the application of geographically weighted regression analysis to
explore the spatial variation in the relationships between area-level characteristics and
various outcomes.
In this guide, various techniques to perform global and local spatial regression analysis are
explored. The examples used are for illustrative purposes only and are not intended to test the
theoretical underpinnings that exist in the research field of the chosen cases.
Introduction
• In recent years, there has been a growing interest in adding a spatial perspective to the study
of complex patterns of interrelated social, behavioral, economic and environmental
phenomena. It is increasingly argued that spatial thinking and spatial analytical
perspectives have an important role to play in uncovering answers that could prove helpful in
addressing research and policy questions*.
• Spatial analysis of data focuses on four methodological areas :
– spatial econometrics ;
– geographically weighted regression ;
– multilevel models ;
– spatial pattern analysis.
* It is worth noting that the term “spatial analysis” applies equally to the study of incident level point patterns (e.g. crime hot
spots) as well as to the study of aggregated counts or rates at the area level (e.g. census block groups, tracts or
“neighborhoods”).
1. Spatial econometrics
• Spatial econometrics accounts for spatial effects in regression analysis. If geography or place
matters (and it frequently does), then things that are more related geographically (i.e. more
proximate geographically) are also correlated in other ways. Therefore, assumptions about
the independence of covariates and about the independence and distribution of error terms
are violated in an OLS regression framework.
• Let’s take the example of the analysis of crime data. The growing spatial analysis of crime
data enabled criminologists to move beyond simply mapping crime and demonstrating that
crime does indeed cluster in space. An important question became why crime
clusters in space. Spatial regression models were estimated to explain the observed
patterns of spatial clusters. In addition to crime, many researchers began to use spatial
regression models to demonstrate that many negative health issues such as low birth weight,
infant mortality and depression cluster spatially. From these studies emerged a consistent
set of explanatory variables that characterise “bad” neighborhoods (e.g. concentrated
poverty, stability of residents, female headed households, minority population), as well as
evidence of an aggregate “neighborhood” effect. For instance, concentrated
poverty negatively impacts all residents of a community regardless of one’s own level of
personal income.
• That such places also cluster in space suggests that neighborhoods are not independent units
of observation. There might be forces at work that make the level of crime in one
neighborhood dependent upon the actions and activities occurring in other areas. That is,
social processes might be at work that result in the diffusion of crime across space.
• In trying to understand these patterns, spatial regression became the methodology of choice.
As noted, spatial autocorrelation occurs when the values of variables sampled at nearby
locations are not independent of each other. This lack of independence makes the use of OLS
regression inappropriate. To address spatial autocorrelation, spatial lag and spatial error
models became the most popular choices.
• When the level of crime in one neighborhood is directly dependent upon the activities or
social processes occurring in a neighboring area, one must apply a spatial lag (spatial
dependence) model. Spatial error models are appropriate for modeling unobservable
processes (e.g. norms or beliefs) that are shared among individuals residing in proximate
places, or when boundaries that separate “places” are arbitrary to the extent that two
different places are actually very similar across various social, economic or demographic
features.
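As a compact summary of the two specifications, the block below writes them in the notation commonly used in the spatial econometrics literature. The symbols are not taken from the slides and are introduced here as assumptions: W is a spatial weights matrix (typically row-standardized), ρ is the spatial lag parameter and λ is the spatial error parameter.

```latex
% Spatial lag model: the outcome in one area depends on outcomes in neighboring areas
y = \rho W y + X\beta + \varepsilon

% Spatial error model: the dependence sits in the unobserved part of the model
y = X\beta + u, \qquad u = \lambda W u + \varepsilon
```

Setting ρ = 0 in the first model, or λ = 0 in the second, gives back the ordinary OLS specification.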
• By examining the statistically significant coefficient on the spatially lagged dependent variable
or the spatial error term, specific explanations were offered regarding the forces driving the
diffusion of the study object.
• The selection of which model, lag or error, has been, and continues to be, driven by goodness-of-fit
tests rather than theory.
• What causes spatial autocorrelation ?
• Feedback. For most social processes, individuals and households interact with each other
and thereby influence each other. The influence of such an interaction is likely to be stronger
for those who are in frequent contact. Residential proximity generally increases the
frequency of contact. However, it is also possible to
geographically “unbound” the autocorrelation matrix. For example, social similarity
increases the probability of communication and social interaction. In this way, events in an
area can be influenced more by events in non-adjacent but socially similar areas than in
adjacent but socially dissimilar areas. One might model the diffusion of youth violence by
considering social interactions that occur within schools. In such a case, neighborhoods
would be linked if and only if they send students to the same school buildings. Studies that
capture social networks and communication networks can provide an empirical validation of
this approach (instead of using a geographically based matrix, the potential for activities in
one area to influence other areas can be based on social distance between places).
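The school-based linkage described above can be sketched as a small weights construction. This is a minimal illustration only: the function name `school_linked_weights`, the neighborhood names and the school names are hypothetical and do not come from any study cited in the slides.

```python
from itertools import combinations


def school_linked_weights(schools_by_neighborhood):
    """Return row-standardized weights in which two neighborhoods are
    linked if and only if they send students to a common school."""
    names = sorted(schools_by_neighborhood)
    links = {name: set() for name in names}
    for a, b in combinations(names, 2):
        # A non-empty intersection means the two areas share a school
        if schools_by_neighborhood[a] & schools_by_neighborhood[b]:
            links[a].add(b)
            links[b].add(a)
    weights = {}
    for name in names:
        k = len(links[name])
        # Each linked neighborhood gets an equal share of the total weight;
        # an unlinked neighborhood simply gets an empty row
        weights[name] = {other: 1.0 / k for other in sorted(links[name])}
    return weights


# Hypothetical example: which schools each neighborhood sends students to
example = {
    "North": {"School A"},
    "South": {"School A", "School B"},
    "East": {"School B"},
    "West": {"School C"},
}
W = school_linked_weights(example)
```

Here "North" and "East" are not linked to each other even if they happen to be adjacent, because they share no school, while "West" has no links at all.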
• Grouping forces. Individuals and households with common characteristics sometimes are
found clustered together by choice or they are constrained to co-locate by the coercive
operation of social, economic or political forces. When this type of constraint is responsible
for spatial autocorrelation in a dependent variable, it may be possible to identify the variable
or variables involved in the process and operationalize them on the right-hand side of the
regression equation. Sometimes the spatial autocorrelation in the dependent variable (and
the regression residuals) can be explained by autocorrelated covariates (independent
variables), and standard regression approaches will work fine. If a causal variable cannot be
identified, then the source of the autocorrelation will remain in the error term, necessitating
what is referred to as a spatial error model.
• Grouping responses. Individuals or households that share a common attribute or a set of
common characteristics may respond similarly to external forces. Often there exist
contextual forces that affect individuals and households in an area (e.g. geophysical
conditions, cultural influences). A data analyst can deal with these contextual influences by
declaring different “spatial regimes”. If not, spatial autocorrelation will remain in the
regression error term, the result of an omitted variable in the specification, and spatial
econometric approaches must again be considered.
• Nuisance autocorrelation. This occurs when the underlying spatial process creates regions
that are much larger than the units of observation chosen or available to the analyst. The
choice of the proper level of aggregation when estimating neighborhood effects remains
problematic. Data are typically aggregated to geographical areas which serve as the units of
analysis (e.g. census tracts). The modifiable areal unit problem (MAUP) arises from the fact
that units are usually arbitrarily defined in the sense that they can be aggregated or
disaggregated to form units of different size. Innovative advances are being undertaken that
no longer define the geography of a community by boundaries drawn for administrative purposes
(e.g. census tracts, zip codes) but instead capture the spatial dimension of social networks.
• The challenges for future work are not those that pertain to the development of new
mapping technologies or more sophisticated statistical methodologies : “That is, regardless
of how sophisticated our methodologies become for the estimation of spatial models, the key
will always be that the specification of these models be sound in terms of the measurement
and definition of place and the manner in which areas are deemed “neighbors”. … Though the
ability for a crime in a focal area to influence crime in another area might decay over
distance, it is possible that there are other networks of social interactions (e.g. interactions
that occur outside the neighborhood at work or school, participation in voluntary or religious
organizations, …) that make events in one area extremely salient in the commission of future
events in otherwise geographically distant areas” (Tita & Radil, 2010, p. 476).
2. Geographically weighted regression
• Recently techniques for the analysis of local spatial relationships have been developed.
• In conventional regression, one parameter is estimated for the relationship between each
independent variable and the dependent variable and the relationship is assumed to be
constant across the study area. The term “global” implies that all of the data are used to
compute a single statistic or model, and that the relationships between variables in the
model are stationary across the study area. The GWR approach extends this framework to
estimate local rather than global parameters. Instead of calibrating a single regression
equation, GWR generates a separate regression equation for each observation. Each
equation is calibrated using a different weighting of the observations contained in the data.
• In traditional OLS all places have the same weight as if all places shared the same location.
In GWR, as we move over space observations are weighted according to their proximity to a
location.
• Two problems arise with GWR. First, if the subset of the full sample is too small, standard errors
will be high. Second, if the subsample is too large, coefficients will be biased because they
drift across space. If the process is spatially non-stationary, a regression with a large
subsample will result in estimates that are spatial averages. To overcome these problems, GWR
uses a weighted calibration.
• Observations in close spatial proximity to region i have a larger influence in the estimation of
the parameters for region i than those further away. That is why those observations have
larger weight in the sample than the observations from regions further away. This weighted
calibration implies that the weighting of an observation is not constant but varies with i.
Region j has a large weight in the estimation of region i if they are close to each other, and
the weight of region j in the estimation of region m might be small if the regions are
separated by a larger distance. Every single region i has a different weight matrix.
• GWR is run in several steps. The first point is how observations should be weighted. The
two most applied weighting functions (Kernel functions) are the Gaussian and the bi-square
kernel. Using the Gaussian kernel, in which space is considered continuous, the weighting of
data will decrease according to a Gaussian curve as the distance between i and j increases.
Up to a certain bandwidth, the observations will have a weight of at least 0.5. A binary
scheme implies the notion that space is discrete or discontinuous. Beyond a bandwidth, the
weights are set to zero.
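The two kernels named above can be sketched as follows. The formulas are one common parameterization and are stated here as an assumption, since the slides do not give them; the function names are hypothetical.

```python
import math


def gaussian_weight(d, bandwidth):
    """Gaussian kernel: the weight decays smoothly and never reaches zero."""
    return math.exp(-0.5 * (d / bandwidth) ** 2)


def bisquare_weight(d, bandwidth):
    """Bi-square kernel: the weight decays smoothly and is exactly zero
    at and beyond the bandwidth."""
    if d >= bandwidth:
        return 0.0
    return (1.0 - (d / bandwidth) ** 2) ** 2


# Weights at a few distances for a bandwidth of 10 (units are arbitrary)
for d in (0.0, 5.0, 10.0, 20.0):
    print(d, round(gaussian_weight(d, 10.0), 3), round(bisquare_weight(d, 10.0), 3))
```

With this parameterization the Gaussian weight at a distance equal to the bandwidth is exp(-0.5), roughly 0.61, whereas the bi-square weight has already dropped to zero there.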
• It is often stated that the GWR results are relatively insensitive to the choice of the weighting
function, but they are not insensitive to the choice of the bandwidth. As the density of
regions in a dataset can vary, we cannot use just one bandwidth. For example, in a study of
European regions a fixed bandwidth of 800 km is too small for the estimation of coefficients
in Finland, because there are few regions and, accordingly, few data points in close proximity.
The northernmost Finnish region would have only 3 neighbors. Such a small sample would
result in large standard errors. Similarly, this bandwidth is too large for a place like Austria,
where the density of regions is much higher. The region of Tirol would have 129 neighboring
regions within a distance of 800 km. Such a large sample could result in serious drift bias.
That is why an adaptive kernel is most appropriate : an optimal adaptive number of
neighbors will be applied. Adaptive kernel means a fixed proportion of all observations is
included in the estimation, for example 20 percent of all regions.
• An adaptive kernel is smaller in regions where the density of observations is high (like in
Austrian regions) and larger in regions where the density is low (like in Finnish regions). While
the advantage of an adaptive kernel is obvious for regions with a high density of
observations, the coefficients of regions with a low density of observations are likely to be
drift biased, as they are also influenced by observations from regions that are a long
distance away.
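An adaptive bandwidth can be sketched as the distance from each location to its k-th nearest neighbor, where k might be set to a fixed share of all observations (for example 20 percent). The function name `adaptive_bandwidths` and the coordinates below are hypothetical.

```python
import math


def adaptive_bandwidths(coords, k):
    """For each point, return the distance to its k-th nearest neighbor.

    Dense areas get short bandwidths, sparse areas get long ones, and
    every local regression uses the same number of neighbors."""
    bandwidths = []
    for i, (xi, yi) in enumerate(coords):
        dists = sorted(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        )
        bandwidths.append(dists[k - 1])
    return bandwidths


# Four points close together and one isolated point
coords = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
bw = adaptive_bandwidths(coords, k=2)
```

The isolated point receives a much longer bandwidth than the clustered points, which illustrates why its local estimates draw on observations that are far away.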
• The bandwidth can be understood as the area of influence of each place. A small bandwidth
means a small area of influence, meaning a rapid distance decay function, whereas a large
bandwidth implies a larger area of influence, thus a smoother weighting scheme. In a
regression context, a small bandwidth (less smoothing) produces estimates with large local
variation, whereas large bandwidths (greater smoothing) produce estimates with little spatial
variation (larger bandwidths will make local coefficient estimates similar to OLS global
estimates).
• There are two methods for the estimation of the optimal bandwidth : AICc (corrected
Akaike Information Criterion) and CV (cross validation). When comparing between GWR
models with different bandwidths, the model with the lowest AICc or CV can be considered
the most appropriate as it will determine which radius size (bandwidth) is optimal.
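Bandwidth selection by cross validation can be sketched as a leave-one-out procedure: each observation is predicted from a local fit that excludes it, and the bandwidth with the smallest sum of squared prediction errors is chosen. The sketch assumes a Gaussian kernel, a fixed bandwidth and a single predictor; all names and data are hypothetical, and the AICc criterion is not shown.

```python
import math


def gaussian_weight(d, bandwidth):
    return math.exp(-0.5 * (d / bandwidth) ** 2)


def wls_predict(x, y, w, x0):
    """Fit y = a + b * x by weighted least squares and predict at x0."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    return ybar + (sxy / sxx) * (x0 - xbar)


def cv_score(coords, x, y, bandwidth):
    """Leave-one-out CV score: sum of squared errors when each observation
    is predicted from a local fit that leaves that observation out."""
    n = len(y)
    score = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        w = [gaussian_weight(math.dist(coords[i], coords[j]), bandwidth)
             for j in others]
        pred = wls_predict([x[j] for j in others], [y[j] for j in others], w, x[i])
        score += (y[i] - pred) ** 2
    return score


def best_bandwidth(coords, x, y, candidates):
    """Return the candidate bandwidth with the lowest CV score."""
    return min(candidates, key=lambda b: cv_score(coords, x, y, b))


# Two distant groups of locations with different relationships:
# y = 1 * x in the first group and y = 5 * x in the second
coords = [(float(t), 0.0) for t in range(5)] + [(1000.0 + t, 0.0) for t in range(5)]
x = [0.0, 1.0, 2.0, 3.0, 4.0] * 2
y = [1.0 * xi for xi in x[:5]] + [5.0 * xi for xi in x[5:]]
chosen = best_bandwidth(coords, x, y, candidates=[10.0, 1.0e6])
```

In this constructed example the small bandwidth wins, because the very large bandwidth behaves like a global regression and averages two different relationships.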
• The output from GWR is a set of surfaces that can be mapped, with each surface depicting
the spatial variation of a relationship. Standard global modeling techniques, such as OLS or
spatial regression models, cannot detect nonstationarity, and thus their use may obscure
regional or local variation in the relationships between predictors and the outcome variable.
Public policy inferences based on results from global models in which nonstationarity is
present but not detected may be quite poor in specific local/regional settings.
• GWR analysis and interpretation are largely dependent on GWR maps. Such maps can be
problematic if they illustrate the size of parameter estimates while failing to illustrate their
relative significance. A method to address this issue is the mapping of GWR statistics by
combining local parameter estimates and t-values on a single map.
• It is important to note that GWR is an exploratory technique, and, as with ESDA and spatial
econometric approaches, the insights gained from GWR can be utilized to improve model
specification in global models. Limitations associated with GWR include the computationally
demanding calculation of multiple regressions, multicollinearity and kernel bandwidth
selection. It should also be taken into account that GWR studies that find that parameter
coefficients vary across space have a tendency to focus on this result and do not always seek
to explain the results with further analysis. It is important that GWR and ESDA methods are
utilized to help improve model specification, and that efforts be made to find explanations.
3. Multilevel modeling
• A methodological focus on multilevel or hierarchical modeling is relevant when assessing to
what extent individual behaviors and demographic and health outcomes are influenced by an
individual’s own characteristics, and by the attributes of the larger geographic area
(neighborhood, village, district, state).
• To some extent, nested data are inherently spatial. Statistical methods that incorporate
neighborhood, city or regional effects are in essence considering the effects of places and
spaces on their outcome(s) of interest. While traditional research has looked at de jure
classifications of space (e.g. census tracts), it is increasingly acknowledged that legal and
political boundaries frequently have little to do with actual lived spaces. Furthermore, many
scientists are working in regions that do not have synonymous spatial categories : that is,
neighborhoods and other administrative bounded areas may have different meanings in
some of the non-industrialized and/or industrializing nations than they do in the developed
world.
4. Spatial pattern analysis
• A wide range of methods now exist for analyzing spatial clusters of point data, such as
disease or crime events, in which the goal is to discover whether the observed events exhibit
any systematic pattern, as opposed to being distributed at random within a study area.
Recent applications of spatial pattern analysis include the use of local statistics of spatial
association.
What does the near future hold for spatial data analysis ?
• We can predict with some confidence that things will change rapidly, as the geospatial data
and methodological development environment is dynamic.
• It must be emphasized that the volume, sources and forms of geospatial data are growing
rapidly. Data from wireless and sensor technologies and developments in data storage and
handling (e.g. cloud computing, geospatial data warehouses, data mining techniques) will
continue to change what, how and when we collect data on individuals and their
environments. New data formats will be tagged with both a geographic location and a time
stamp, providing unparalleled spatial and temporal precision.
Global and Local Spatial Regression
• Traditional regression analysis describes a modelled relationship between a dependent variable and a set
of independent variables. When applied to spatial data, the regression analysis often assumes that the
modelled relationship is stationary over space and produces a global model which is supposed to describe
the relationship at every location in the study area. This would be misleading, however, if relationships
being modelled are intrinsically different across space. One of the spatial statistical methods that attempts
to solve this problem and explain local variation in complex relationships is Geographically Weighted
Regression (GWR).
• In a global regression model, the dependent variable is often modelled as a linear combination of the
independent variables, with parameters that are assumed to be stationary over the whole area (i.e. the model returns one value for each parameter). GWR extends this
framework by dropping the stationarity assumption: the parameters are assumed to be continuous
functions of location. The result of the GWR analysis is a set of continuous localised parameter estimate
surfaces, which describe the geography of the parameter space. These estimates are usually mapped or
analysed statistically to examine the plausibility of the stationarity assumption of the traditional
regression and different possible causes of nonstationarity.
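The contrast between the global and the local specification can be written out as follows. The notation is the one commonly used in the GWR literature and is introduced here as an assumption: (u_i, v_i) are the coordinates of location i, and W(u_i, v_i) is a diagonal matrix holding the kernel weights of all observations relative to location i.

```latex
% Global model: one parameter per variable for the whole study area
y_i = \beta_0 + \sum_{k} \beta_k x_{ik} + \varepsilon_i

% GWR model: the parameters are functions of location
y_i = \beta_0(u_i, v_i) + \sum_{k} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i

% Local estimator at location i (weighted least squares)
\hat{\beta}(u_i, v_i) = \left( X^{\top} W(u_i, v_i)\, X \right)^{-1} X^{\top} W(u_i, v_i)\, y
```

If every observation receives the same weight at every location, the local estimator reduces to the ordinary OLS estimator.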
The definitive text on GWR is : Fotheringham, A.S., Brunsdon, C. & Charlton, M.E., Geographically Weighted Regression : The Analysis of
Spatially Varying Relationships, Chichester, Wiley, 2002.
• The use of linear regression is common in many areas of science. Ordinary linear regression implicitly
assumes spatial stationarity of the regression model, that is, the relationships between the variables
remain constant over geographical space. We refer to a model in which the parameter estimates for every
observation in the sample are identical as a global model.
• Spatial nonstationarity occurs when a relationship (or pattern) that applies in one region does not apply in
another. Global models are statements about processes or patterns which are assumed to be stationary
and as such are location independent, i.e. are assumed to apply to all locations. In contrast, local models are
spatial disaggregations of global models, the results of which are location-specific. The template of the
model is the same : the model is a linear regression model with certain variables, but the coefficients alter
geographically. If the parameter estimates are allowed to vary across the study area such that every
observation has its own separate set of parameter estimates we have a local model.
• GWR does not assume the relationships between independent and dependent variables are constant
across space. Instead, GWR explores whether the relationships between a set of predictors and an
outcome vary by geographical location. GWR is suggested to be a powerful tool for investigating spatial
nonstationarity in the relationship between predictors and the outcome variable.
• GWR4 is a new release of a Microsoft Windows-based application for calibrating geographically weighted
regression models, which can be used to explore geographically varying relationships between
dependent/response variables and independent/explanatory variables.
Running a GWR4 session involves five steps :
– Give the session a name
– Specify the regression type and variable settings
– Choose a geographic kernel type
– Specify names for the files storing the modelling results
– Execute the session
For an extensive review of these 5 steps, see Nakaya, T., GWR4 User Manual, update 7 May 2012.
• Theoretically, spatial nonstationarity is based on the concept of the social construction of space. The
interaction between individuals with each other and their physical environment produces space. Human
beings are just as much spatial as temporal beings. By spatial, we mean that we are most influenced
by what is immediate in space. What happens near us matters more than non-proximal events. Humans’
spatiality and temporality are essential and equally powerful in explaining human behavior. Consequently,
everything that is social is inherently spatial, just as everything spatial is inherently socialized.
• From this perspective, we analyse how the macro-level relationship between crime and various socio-
economic and demographic variables unfolds over geographical space.
• Processes and characteristics of urban areas at the human-environment interface (e.g. social stratification,
segregation, urban poverty) depend on a diverse set of socio-demographic, economic and environmental
factors. Due to the heterogeneity of urban areas, it can be assumed that the strength and direction of the
influence of these factors varies over space.
• Special properties of geospatial data are spatial autocorrelation and spatial heterogeneity
(nonstationarity). Spatial autocorrelation implies a spatial association between an attribute value at a
particular location and attribute values at other locations close by. Spatial heterogeneity describes
systematic spatial variation of attribute values across space. These spatial effects must be taken into
account when modeling spatial relationships in a regression model.
• Traditionally, global statistical regression approaches are applied to study the influence of explanatory
variables on a target variable. These approaches emphasize similarities across space. In the following
analysis, this global or “one size fits all” approach is juxtaposed against spatial autocorrelation and
nonstationarity. In particular, we explore a global non-spatial regression model and both a global and
local spatial regression model of the relationship between indicators of socio-economic disadvantage and
neighborhood demographic context and crime rates in Belgian municipalities in the period 2006-2008.
• An exploration of the spatial patterns of crime is warranted. The causal processes driving crime may vary
over space, that is, predictor variables may operate differently in different locations. This may be
especially relevant in policy studies where there is growing recognition that understanding the context of
crime – the where and when of criminal events – is key to understanding how crime can be controlled and
prevented. Crime studies that highlight local variations – local contexts of crime – will likely have more
relevance to real-world policy applications. Empirically, if these variations in causal processes do exist and
are not accounted for, the statistical model will be inaccurate.
• Estimations provided by a global model might be inadequate in capturing spatially varying relationships, as
global statistics are only describing average relations between the dependent variable and the considered
explanatory variables. With increasing spatial variation of local observations, the reliability of global
model estimates decreases.
• There might be spatial dependencies, in which attribute values in one location depend on
the values of the attributes in neighboring locations.
• The assumption of spatial heterogeneity can be suggested by the fact that criminality and its
determinants may be distributed unevenly across space. Another source of spatial heterogeneity is the
dynamics between population and location. That is, cultural differences and differences in attitudes and
behaviour across locations may alter how people react to various contextual variables. Given the potential
of spatial heterogeneity, it would be naïve to assume that the spatial processes between criminality and
its determinants are stationary (or universal) and can be captured by a conventional “global” model.
• Following Tobler’s first law of geography which states that “everything is related to everything else, but
near things are more related than distant things”, GWR has to be calibrated in a way that observations
near to observation i have more influence on the estimation of the parameters than data located further
away from i.
• GWR takes advantage of spatial dependence in the data. Spatial dependence implies that data available in
locations near the focal location are more informative about the relationship between the independent
and the dependent variables in the focal location. When evaluating estimates for a focal location, GWR
gives more weight to data from closer locations than to data from more distant locations. It is assumed
that the relative weight of the contributing locations decays at an empirically determined rate as the
distance from the focal location increases.
Spatial dependence refers to (socio-economic) interaction among agents, whereas spatial heterogeneity regards the aspects of the socio-economic
structure over space.
1. Analytical framework
• Our analysis strategy entails estimating regression models that summarize the “global”, or average, effects
of the predictor variables on crime rates across our sample of Belgian municipalities. Given the well-
known spatial autocorrelation evident in crime data, we generate the global models using Ordinary Least
Squares (OLS) and Spatial AutoRegression (SAR) estimators. OLS and the spatial autoregression model are
“global” models in the sense that they both assume that a single set of parameters sufficiently describes
the relationships between predictor variables and crime rates.
• The classical Ordinary Least Squares (OLS) model is widely used to model the global relationship between a
response variable and one or more explanatory variables. OLS assumes, among other things, that residuals
are spatially independent. Residual autocorrelation captures unexplained similarities between
neighboring municipalities, which can be the result of omitted variables or a misspecification of the
regression model. Assuming a global model does exist, an exploration of spatial patterns in the data can
help determine whether a global model is misspecified – whether the model is missing important
predictor variables (spatial error model) or if a spatial term should be included in the model (spatial lag
model) – which would improve the accuracy of the global model in explaining crime levels across the study
area.
• Global models that account for spatial effects are spatial autoregressive models (SAR). The spatial error
model addresses the presence of spatial autocorrelation by defining a spatial autoregressive process for
the error term and, by doing so, captures unexplained similarities. The spatial lag model extends the
standard OLS regression model by including a spatially lagged dependent variable, which can mostly be
interpreted as capturing spill-over effects.
• Global regression models assume a homogeneous behavior of the estimated parameters across space. We
expect spatial homogeneity to be rare and assume that most social phenomena are not geographically
stationary. A way to deal with spatial heterogeneity is the application of geographically weighted
regression (GWR) to investigate spatially varying relationships.
• GWR models spatial autocorrelation and spatial heterogeneity for subsets of the entire data set. Each
subset is established around a regression point with near data points exhibiting a higher influence than
more distant data points. This weighting is often based on a bi-square kernel function. Of crucial
importance is the specification of an appropriate bandwidth length. The most common is the adaptive
bandwidth, whose length is allowed to vary across space, depending on the density of the data points.
In densely populated areas the kernel has a shorter bandwidth, in contrast to regions with larger
inter-point distances, where the bandwidth is longer.
• While it is often argued that GWR is more suitable for exploratory analysis, it can also be used to test
whether local models yield a significant improvement in fit over the global models.
• The following analysis models both spatial autocorrelation and nonstationarity by means of global and
local spatial statistical models. An exploratory spatial data analysis, a global non-spatial regression model,
a global spatial regression model and finally a local spatial regression model were applied to explore the
association between various predictors and crime in Belgian municipalities. We rely on crime data in
municipalities, the main political and administrative unit of the Belgian territory.
• The dependent variable in this study is the crime rate/1000 residents (calculated as a mean over the
period 2006-2008) in Belgian municipalities (N= 589, source data : statistics Belgian Federal Police, period
2006-2008).
• To test social deprivation theory we collected data at municipality level about various indicators of
inequality. Besides mean family income and the percentage unemployed, we use the Gini coefficient as a
measure of income variation, indicating the distribution of income in each municipality (between
extremes of 0 (absolute equality) and 1 (maximum inequality)). As control variables we include various
socio-demographic indicators : population density, the share of males in the age group 15 to 64, the
percentage of young people (15-24) in the population, the percentage of residents that are foreign born,
the percentage of non-Euro foreign born residents and the degree of female labour force participation
(source data : National Institute of Statistics and statistics Federal Government, period 2006-2008).
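For readers unfamiliar with the Gini coefficient, the sketch below computes it from a list of individual incomes using the standard rank-based formula. It is an illustration only: the slides do not state how the municipal Gini values were computed, and the function name `gini` and the income values are hypothetical.

```python
def gini(incomes):
    """Gini coefficient of a list of non-negative incomes.

    0 means everyone has the same income; for n incomes the maximum is
    (n - 1) / n, which approaches 1 as n grows."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one income and a positive total")
    # Rank-weighted sum of the sorted incomes (ranks start at 1)
    weighted = sum(rank * v for rank, v in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


equal = gini([5, 5, 5, 5])          # perfectly equal incomes
moderate = gini([1, 2, 3, 4])       # moderately unequal incomes
concentrated = gini([0, 0, 0, 10])  # one person holds all income
```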
Since the original data for the dependent variable and five of the independent variables are strongly skewed (skewness marked in red in
the above table), natural log values (ln) were used for these variables. Ordinary least squares regression and spatial regression do not require
normally distributed variables as such, but inference in both relies on approximately normal residuals, which strongly skewed variables tend to undermine.
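The effect of the log transformation on skewness can be sketched as follows. The crime rates below are hypothetical and only illustrate a right-skewed distribution; the skewness statistic is the plain moment-based version, which may differ slightly from the one reported by statistical packages.

```python
import math


def skewness(values):
    """Moment-based sample skewness: m3 / m2^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical crime rates per 1000 residents with one extreme municipality.
rates = [12.0, 15.0, 18.0, 22.0, 30.0, 45.0, 160.0]
log_rates = [math.log(v) for v in rates]

print(skewness(rates))      # strongly right-skewed
print(skewness(log_rates))  # skewness is reduced after taking ln
```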
• The first step in an exploratory spatial data analysis (ESDA) is to verify if spatial data are randomly
distributed. To do this, it is necessary to use global autocorrelation statistics. The global indicators of
spatial autocorrelation are not capable of identifying local patterns of spatial association, such as local
spatial clusters or local outliers in data that are statistically significant. To overcome this obstacle, it is
necessary to implement a spatial clustering analysis (we made use of GeoDa open-source spatial
regression software of the GeoDa Center for Geospatial Analysis and Computation,
http://geodacenter.asu.edu).
• A significant Moran’s I statistic is a first clue that parameter estimates in an OLS regression can be affected
by spatial residual autocorrelation. For this reason, the Moran’s I statistic was calculated for the
dependent variable and the nine independent variables included in this study. The neighborhood
relationships for calculating the Moran’s I statistic are defined as first order queen contiguity, which is
commonly used (a municipality’s spatial lag is a weighted average of its neighboring localities ; neighbors
are typically defined in terms of their physical proximity to the local geographic unit).
• Results indicate that both the dependent and all independent variables exhibit significant positive spatial
autocorrelation. The hypothesis of spatial randomness is clearly rejected. A positive and significant
spatial dependence in the dependent variable (crime rate) indicates that the crime rate in a particular
municipality is associated with (not independent of) crime rates in surrounding municipalities. The value of the
spatial autocorrelation coefficient (0,297) can be read as the slope of the regression of the spatially lagged
crime rate on the crime rate itself : a 10% higher crime rate in a municipality goes together with an average
crime rate that is nearly 3% higher in its neighboring municipalities. This,
together with the results of the LISA cluster analysis, points to spillover
effects between municipalities with respect to crime, and implies a need to coordinate
the municipal efforts to fight criminal activities that spill over the municipal borders.
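The global Moran’s I statistic with first order queen contiguity can be sketched in a few lines. This is a minimal illustration on a regular grid with row-standardised weights, not the GeoDa implementation; the grid and its values are hypothetical stand-ins for the municipality polygons.

```python
def queen_neighbors(n_rows, n_cols):
    """Map each grid cell to the cells sharing an edge or a corner with it."""
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            neighbors[r * n_cols + c] = [
                rr * n_cols + cc
                for rr in range(max(r - 1, 0), min(r + 2, n_rows))
                for cc in range(max(c - 1, 0), min(c + 2, n_cols))
                if (rr, cc) != (r, c)
            ]
    return neighbors


def morans_i(values, neighbors):
    """Global Moran's I with row-standardised contiguity weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Spatial lag: the average deviation among each unit's neighbors.
    lag = [sum(dev[j] for j in neighbors[i]) / len(neighbors[i]) for i in range(n)]
    return sum(d * l for d, l in zip(dev, lag)) / sum(d * d for d in dev)


grid = queen_neighbors(4, 4)
clustered = [10, 10, 1, 1] * 4  # high values on the left, low on the right
checkerboard = [(r + c) % 2 for r in range(4) for c in range(4)]

print(morans_i(clustered, grid))     # positive: similar values are neighbors
print(morans_i(checkerboard, grid))  # negative: dissimilar values are neighbors
```

Significance is usually assessed by permutation, which this sketch leaves out.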
15
2. Exploratory spatial data analysis
16
Prevalence of crime in Belgian municipalities (N = 589)
17
Global Moran’s I statistic for variables included in this study
18
Cartograms of the geographical distribution of independent variables
19
LISA cluster map for criminality in Belgian municipalities, N= 589
• Exploring the relationship between the independent variables and crime rates starts with a multivariate OLS
regression model. None of the correlations between the predictors is high enough to raise a major
concern about multicollinearity. Nevertheless, we evaluated the diagnostics to assess the issue of
multicollinearity more formally. In particular, Variance Inflation Factors (VIFs) were investigated. Since all VIF
scores are below the critical value of 5, multicollinearity is not considered a problem*.
• Results show that the nine predictors explain about 54,2% of the variance in crime rates. Of those, the variables
representing the percentage of males in the age group 15-64, the percentage of the age group 15-24 in the
population and the percentage of foreign born residents do not contribute significantly to the explanation of the
variability in crime rates between municipalities**.
• A more detailed analysis of the error residuals reveals that they are not normally distributed (Jarque Bera test =
410.059 ; p < 0.001) but not heteroscedastic (Koenker-Bassett test = 14.115 ; p=0.118). Finally, residual
independence is tested by the Moran I-statistic. This test shows significant spatial residual autocorrelation
(Moran’s I = 0.155 ; p < 0.001), violating the model’s independence assumption. This residual pattern in the OLS
model can be the result of existing spatial effects and can be accounted for by means of a spatial regression
model.
* Collinearity diagnostics were estimated using SPSS 21 and no problems of multicollinearity were found among the independent variables. The collinearity diagnostics
used were the variance inflation factors (VIF) and tolerances for individual variables. Multicollinearity is said to exist if the VIF is 5 or higher (or equivalently, tolerances of
0,20 or less). The highest VIF –value in this analysis was 4,852 and the lowest tolerance was 0,206, both for mean income.
** Initially, two dummy variables representing the regions in Belgium were added to the regression equation. However, VIF scores indicated the presence of
multicollinearity. Therefore, these dummy variables were not retained in the OLS regression.
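The variance inflation factor of a predictor is 1 / (1 - R2), where R2 comes from regressing that predictor on all other predictors. The sketch below shows the calculation in plain Python; it is not the SPSS procedure used in the study, and the three predictor columns are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(y, predictors):
    """R2 of an OLS regression of y on the predictors plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + [list(p) for p in predictors]
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(a * b for a, b in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif(predictors):
    """VIF per predictor: regress each one on all the others."""
    return [
        1.0 / (1.0 - r_squared(target, predictors[:j] + predictors[j + 1:]))
        for j, target in enumerate(predictors)
    ]


# Hypothetical predictor columns, for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
x3 = [5.0, 3.0, 6.0, 1.0, 2.0, 4.0]
print(vif([x1, x2, x3]))
```

Tolerance is the reciprocal of the VIF, so the highest reported VIF of 4,852 corresponds to 1 / 4.852 ≈ 0.206, which matches the lowest tolerance reported in the footnote.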
20
3. Global non spatial regression model
21
• The clustering of crime rates indicates that the data are not randomly distributed, but instead follow a systematic
pattern. The spatial clustering of variables, and the possibility of omitted variables that relate to the connectivity
of neighboring localities, raise model specification issues. Evidence for the latter also comes from the residual
autocorrelations present in the OLS model.
• We employ two alternative specifications to correct for spatial dependence. One is the spatial lag model. This
specification is relevant when the spatial dependence works through a spatial lag of the dependent variable. The
other specification is the spatial error model. This specification is relevant when the spatial dependence works
through the disturbance term (spatial regression models were developed using GeoDa, regression
software of the GeoDa Center for Geospatial Analysis and Computation, http://geodacenter.asu.edu).
• The value of the LMLAG-test is only weakly significant (LMLAG = 3.598 ; p < 0.1) but the results of the LMERROR-test
(56.900 ; p < 0.001) suggest that a spatial error term must be considered in the global spatial regression model.
• The results from the spatial lag model shown in the table on the next slide, suggest that this model does not
perform as well as the spatial error model. The effect of the spatial lag term is statistically weak (rho = 0.084 ; p=
0.101). The robust Lagrange Multiplier (LM) test also recommends the use of the spatial error model and the
lower AIC value combined with the higher R2 value for the spatial error model signals that this model outperforms
the spatial lag model. In the spatial error model, all predictor variables except one (the percentage of foreign born
residents) yield a statistically significant effect.
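For reference, the two specifications compared above can be written in conventional spatial econometrics notation. The symbols are assumptions drawn from that convention rather than from the slides: y is the vector of (log) crime rates, X the matrix of predictors, W the row-standardised queen contiguity weights matrix, rho the spatial autoregressive coefficient, lambda the spatial error coefficient and epsilon an independent error term.

```latex
\begin{align*}
\text{spatial lag model:}   \quad & y = \rho W y + X\beta + \varepsilon \\
\text{spatial error model:} \quad & y = X\beta + u, \qquad u = \lambda W u + \varepsilon
\end{align*}
```

In the lag model the neighbors' outcomes enter the equation directly, whereas in the error model only the unexplained part of the outcome is spatially correlated.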
22
4. Global spatial regression model
23
Global OLS versus global spatial regression models
• Based on the results of the global spatial regression model it is difficult to defend similarities in
municipality-level crime as arising from imitation of one’s neighbors, that is, a spatial lag process.
Criminality results from a complex mix of social, economic and cultural factors, only a small number of
which can be brought into a statistical model of the process. Much of it remains unaccounted for and is
summarized in the model’s error term.
• The very small Moran’s I value (-0.022) associated with the spatial error model indicates that the
residuals are no longer spatially autocorrelated. However, the residuals do not comply with the assumption
of constant variance (Breusch-Pagan test for heteroscedasticity = 54.060 ; p < 0.001).
24
• A global regression model carries the assumption that the processes being modeled are
uniform throughout the study area : the relationships between the dependent and the independent
variables remain stationary (constant) across the entire study area of Belgium. Local spatial regression
models take nonstationarity into account. We use GWR4 to perform geographically weighted regression
analysis (Nakaya, 2012).
• The results of fitting the dataset to different GWR models are shown below. Four alternatives
of GWR modeling were applied, covering the four possible combinations of two types of
kernels (fixed or adaptive) and two bandwidth selection methods (AICc and CV). GWR models 3 and 4
(both models use an adaptive kernel) yielded lower residual sums of squares, meaning that these models provided
a better fit to the data. The R2 value of both GWR models is nearly the same. We chose GWR model 3,
the model with the lowest AICc value, to provide an exploratory analysis of the data.
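The difference between a fixed and an adaptive kernel can be illustrated with two weighting functions. A fixed kernel applies the same bandwidth distance at every regression point, while an adaptive kernel widens or narrows the bandwidth so that the same number of neighbors is used everywhere. The kernel shapes below (Gaussian and bi-square) are assumptions for illustration; the slides do not state which functions were selected in GWR4.

```python
import math


def gaussian_fixed(distances, bandwidth):
    """Fixed kernel: the same bandwidth distance at every regression point."""
    return [math.exp(-0.5 * (d / bandwidth) ** 2) for d in distances]


def bisquare_adaptive(distances, n_neighbors):
    """Adaptive kernel: bandwidth is the distance to the n-th nearest point."""
    h = sorted(distances)[n_neighbors - 1]
    return [(1.0 - (d / h) ** 2) ** 2 if d < h else 0.0 for d in distances]


# Hypothetical distances (km) from one regression point to five municipalities,
# the first being the regression point itself.
distances = [0.0, 1.0, 2.0, 3.0, 10.0]

print(gaussian_fixed(distances, bandwidth=5.0))
print(bisquare_adaptive(distances, n_neighbors=4))
```

In sparsely populated areas a fixed bandwidth may capture very few observations, which is one common reason for preferring an adaptive kernel.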
25
5. Local spatial regression model
GWR models applied to dataset of Belgian municipalities
                               GWR model 1   GWR model 2   GWR model 3   GWR model 4
kernel                         fixed         fixed         adaptive      adaptive
bandwidth method               AICc          CV            AICc          CV
adjusted R2                    0,628         0,570         0,633         0,639
residual sum of squares        342571,621    444329,261    337812,559    323290,833
AICc                           5584,352      5630,776      5578,562      5581,763
ANOVA test residuals OLS/GWR   p < 0,01      p < 0,01      p < 0,01      p < 0,01
26
27
28
29
UTM conversion of lat/lon-coordinates (polygon centroids)
Intermediate steps of the conversion : DeltaLong (rad), LatRad, LongRad, radii of curvature (rho, nu), meridional arc length, UTM coefficients Ki to Kv and A6, raw northing.
latitude (DD)   longitude (DD)   UTM zone   zone central meridian   northing   easting
51,453768       4,468343         31         3                       5701311    602022,3
51,405344       4,469669         31         3                       5695928    602222,5
51,394896       4,606091         31         3                       5694965    611736,2
51,267046       4,370611         31         3                       5680415    595620,1
51,335968       4,628302         31         3                       5688446    613426,8
51,342434       4,440327         31         3                       5688891    600319,1
51,334729       4,371499         31         3                       5687942    595541,4
51,313418       4,501749         31         3                       5685750    604663
51,293475       4,724967         31         3                       5683875    620271,1
51,268405       4,512477         31         3                       5680760    605513,8
51,256666       4,584644         31         3                       5679561    610576,2
51,258890       4,673666         31         3                       5679946    616782,1
51,229417       4,325406         31         3                       5676172    592542
51,211117       4,685177         31         3                       5674652    617707,1
• Results reveal that the GWR model exhibits a clear improvement in explained variance as compared
to the OLS regression model (63,3% vs. 54,2%). The AIC score for the GWR model (5578.562) is
substantially lower than the AIC score for the global OLS model (5657.654), which reflects a better
goodness of fit (AIC is a relative measure of model quality that balances goodness of fit against model
complexity. The lower its value, the better the model fits the observed data).
• Another method to evaluate the GWR model is the ANOVA test which verifies the null hypothesis that the
GWR model represents no improvement over the global model. The computed F-value of 2.753 is in
excess of the critical value of F (2.41 ; α = 0.01) with 10 and 496 degrees of freedom. The ANOVA test thus
suggests that the GWR model is a significant improvement on the global model for the data of Belgian
municipalities.
• The results obtained by the GWR method provide information about locally differing estimation
coefficients. Therefore, the GWR results do not report a global estimate for each explanatory variable but
rather they provide insights into local ranges of the estimates (minimum, 25% quantile, median, 75%
quantile and maximum). The 5-number summary (see next slide) is helpful to get a feel of the degree of
spatial nonstationarity in a relationship by comparing the range of the local parameter estimates with a
confidence interval around the global estimate of the parameter. This is accomplished by dividing the
interquartile range of the GWR coefficient by twice the standard error of the same variable from the
global regression (OLS). Ratio values > 1 suggest nonstationarity in the relationship between an
independent variable and the dependent variable.
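The nonstationarity ratio described above is simple arithmetic and can be sketched as follows. The quartiles are those reported for ln(unemployment) in the 5-number parameter summary; the global standard error is a hypothetical placeholder, because the OLS standard errors are not reproduced in this text.

```python
def stationarity_ratio(lower_quartile, upper_quartile, global_se):
    """Interquartile range of the local estimates over twice the global SE."""
    return (upper_quartile - lower_quartile) / (2.0 * global_se)


# Quartiles of the local ln(unemployment) estimates from the GWR summary.
lower_q, upper_q = 0.216, 0.444
# Hypothetical OLS standard error, assumed only for the sake of the example.
assumed_global_se = 0.05

ratio = stationarity_ratio(lower_q, upper_q, assumed_global_se)
print(ratio)        # interquartile range 0.228 divided by 0.10
print(ratio > 1.0)  # values above 1 suggest nonstationarity
```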
30
• The results of the Monte Carlo test indicate that the parameter estimates do vary significantly across
space. As shown on the map on the next slide, the total variance explained by the local model ranges
from 47,8% to 83,4%. In general, there is a north-south divide with higher R2 values in the northern part of
the country. Explained variance is lowest in the southern part of the province of East Flanders and its
surrounding municipalities in Wallonia.
31
Geographically weighted regression 5-number parameter summary results and
Monte Carlo significance test for spatial variability of parameters (Belgian municipalities, N = 589)
variable                            minimum    lower quartile   median     upper quartile   maximum    status                   significance
Intercept                           -513,965   -28,093          210,299    361,918          512,299    non-stationary           p < 0.001
ln(Gini inequality)                 -0,706     0,285            1,062      1,439            2,529      non-stationary           p < 0.001
mean income                         -0,010     -0,006           -0,004     -0,002           0,001      non-stationary           p < 0.001
ln(unemployment)                    -0,012     0,216            0,325      0,444            0,720      non-stationary           p < 0.001
ln(population density)              -0,057     0,006            0,056      0,132            0,227      non-stationary           p < 0.001
% male in age group 15-64           -0,100     0,019            0,044      0,065            0,129      non-stationary           p < 0.001
% 15-24 in population               -0,119     -0,058           -0,011     0,022            0,087      non-stationary           p < 0.001
ln(% foreign born)                  -0,119     -0,009           0,036      0,083            0,167      non-stationary           p < 0.001
ln(% non-Euro foreign)              -0,015     0,076            0,106      0,147            0,319      no spatial variability   p < 0.001
female labour force participation   0,008      0,038            0,048      0,064            0,096      non-stationary           p < 0.001
(columns minimum to maximum : 5-number parameter summary ; columns status and significance : Monte Carlo test)
32
Local R2 values of the GWR model (Belgian municipalities, N = 589)
• To better understand and interpret nonstationarity in individual parameters it is necessary to visualize the
local parameter estimates and their associated diagnostics. The output of a GWR analysis includes data
that can be used to generate surfaces for each model parameter that can be mapped, where each surface
depicts the spatial variation of the relationship between a predictor and the outcome variable.
• One of the main challenges is the presentation and synthesis of the large number of results that are
generated in local GWR models. Mapping only the parameter estimates is misleading, as the map reader
has no way of knowing whether the local parameter estimates are significant. As Mennis (2006 : 172)
notes, a main issue is that “the spatial distribution of the parameter estimates must be presented in
concert with the distribution of significance, as indicated by the t-value, in order to yield meaningful
interpretation of results”.
• Because the patterns of t-values for the parameter estimates are important to reveal which areas have
statistically significant estimates, we provide maps of t-values for all variables (see next slides). The maps
provide strong evidence of significant spatial heterogeneity in the effect of predictor variables on crime
across municipalities.
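The logic behind the t-value maps can be sketched as a small classification step : each local estimate is divided by its local standard error and labelled by sign and significance. The threshold of 1.96 (a two-sided 5% level) and all numbers below are assumptions for illustration; the slides do not state the cut-off used for the maps.

```python
from collections import Counter


def classify_local_estimate(estimate, std_error, critical_t=1.96):
    """Label one local parameter estimate by the sign and size of its t-value."""
    t_value = estimate / std_error
    if t_value >= critical_t:
        return "significant positive"
    if t_value <= -critical_t:
        return "significant negative"
    return "not significant"


def significance_shares(estimates, std_errors, critical_t=1.96):
    """Percentage of municipalities falling in each significance class."""
    labels = [
        classify_local_estimate(b, se, critical_t)
        for b, se in zip(estimates, std_errors)
    ]
    return {label: 100.0 * n / len(labels) for label, n in Counter(labels).items()}


# Hypothetical local estimates and standard errors for four municipalities.
local_estimates = [0.40, 0.10, -0.30, 0.05]
local_std_errors = [0.10, 0.10, 0.10, 0.10]

print(significance_shares(local_estimates, local_std_errors))
```

Because one test is carried out per municipality and per variable, a stricter threshold that corrects for multiple testing may be preferable in practice.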
33
34
Significance of t-values for parameter estimates (1/3)
35
Significance of t-values for parameter estimates (2/3)
36
Significance of t-values for parameter estimates (3/3)
• The results of the geographically weighted regression analysis indicate that spatially varying processes operate in
Belgian municipalities with respect to the relationships between socio-economic and socio-demographic variables
and crime rates.
• Several local results are of particular note. First, when we examine the incidence of significant parameter
estimates at the local level, 61 % of all parameter estimates are insignificant. With the exception of
unemployment and female labour force participation, the majority of parameter estimates for all other
independent variables and the intercept are insignificant. Positively or negatively signed global effects of
covariates do not hold across all municipalities. This shows that it is important to look beyond the global level
(OLS) and to examine variation at the local level (GWR).
• Secondly, the global parameter estimates mask a great deal of variation at the local level. For example, while
the global parameter estimate for unemployment is 0,217, the parameter estimates at the local level range from
-0,012 to 0,720. Where the global estimate for the percentage of non-Euro foreign born inhabitants is 0,114, the
local parameter estimates range from -0,015 to 0,319.
• Finally, insignificant global results mask countervailing positive and negative effects of covariates at the local
level. The negatively signed but insignificant global effect of the percentage of young people aged 15-24 in the
population is significantly negative in 23,2 % of the municipalities, while the effect of this covariate reverses
to significantly positive in a minority (2,9 %) of all municipalities. In a similar way, the positively signed but
insignificant global effect of the percentage of males aged 15-64 in the local population is significantly
positive in 39,2 % of the municipalities, while the effect of this variable is significantly negative in 2 % of the
municipalities.
37
38
GWR model significant estimates
• We can further explore the results of the GWR analysis by clustering locations with similar parameter
values for the variables considered. This synthesizes the output that is generated by the GWR model and
can help to interpret the results.
• A two-step cluster analysis based on the nine parameter estimates and the intercept was applied. We
experimented with a range of clusters between 4 and 8. The optimal choice in terms of the number of
clusters was 6 (municipalities were divided in evenly sized clusters).
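The idea of grouping municipalities by similar local parameter values can be sketched with a plain k-means routine. Note that this swaps in k-means for the two-step cluster procedure used in the study, which is not reproduced here; the two-dimensional points (two local coefficients per municipality) and the starting centroids are hypothetical.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, initial_centroids, n_iter=50):
    """Plain k-means: assign each point to the nearest centroid, then update."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(n_iter):
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            updated.append(mean_point(members) if members else old)
        if updated == centroids:  # assignments are stable, stop early
            break
        centroids = updated
    return labels, centroids


# Hypothetical local estimates (unemployment, % 15-24) for six municipalities.
estimates = [(0.20, 0.10), (0.30, 0.10), (0.25, 0.12),
             (1.00, -0.40), (1.10, -0.50), (0.90, -0.45)]

labels, centroids = kmeans(estimates, [estimates[0], estimates[3]])
print(labels)
print(centroids)
```

In an application with ten parameters on different scales, the estimates would normally be standardised before clustering so that no single coefficient dominates the distance measure.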
39
• Although latitude and longitude were not included in clustering municipalities’ parameter estimates, the
six clusters are geographically coherent. A discriminant analysis with cluster membership as the
dependent variable and both lat/lon-coordinates as predictors confirms that 70,6% of the cluster
members are correctly classified based on their location which means that 70,6% of the municipalities
were geographically near other members of the same cluster. By cluster, the percentage of correctly
classified members varies from 57,8% to 83,8%.
40
41
Significance of t-values for parameter estimates by cluster (1/3)
42
Significance of t-values for parameter estimates by cluster (2/3)
43
Significance of t-values for parameter estimates by cluster (3/3)
• Although the parameter estimate of non-Euro inhabitants does not vary spatially (see 5-number
parameter summary), it is by far the most important predictor of criminality in cluster 1. In comparison
with the other clusters, the effect of the percentage of males in the age group 15-64 is significant in a large
majority of municipalities covered by cluster 1.
• Like cluster 1, cluster 2 represents a contiguous area of municipalities but its percentage of correctly
classified municipalities is the lowest (57,8%) of all clusters. Within this cluster the percentage of explained
variance differs strongly when moving from west to east (R2 between 47,7% and 80,7%).
• Cluster 3 covers large parts of Wallonia, where local R2 values are relatively low. In cluster 3 as well as in
cluster 4 and cluster 6, the parameter estimates for socio-economic variables (Gini inequality, mean
income and unemployment) are significant in respectively 80,6 %, 65,9 % and 80,6 % of the municipalities. In the
other clusters, the effect of these variables is significant in less than one third of the municipalities.
• Apart from the effect of socio-economic variables, the effect of non-Euro inhabitants on crime is also
significant in a majority of municipalities in cluster 4.
• In cluster 5 the local R2 values are also relatively low and the estimate of the intercept is significant.
In the East Cantons of cluster 5, criminality also correlates significantly, and independently of other predictors,
with population density, while the presence of young people in the population goes together with lower
criminality in this area.
• As stated, in the area that represents cluster 6 (the largest cluster in terms of the number of
municipalities), the measures of inequality are the most significant determinants of crime. Criminality also
varies in an independent way with population density.
44
• The objective of this study was to examine the extent of geographic variation in the relationship between
socio-economic and demographic variables on the one hand and crime rates on the other. The goals of our study were
(i) to compare the performance of global and local spatial regression with OLS regression, (ii) to examine spatial
nonstationarity through the use of GWR, (iii) to map the parameter coefficients of GWR for further interpretation
and (iv) to examine whether there are spatial groupings of parameter estimates.
• The analysis revealed that there is evidence of overall clustering in crime rates in Belgium. Local spatial analysis
uncovered that places with the highest crime rates are often proximate.
• The finding of the existence of local spatial autocorrelation in crime rates suggests that failing to utilize spatially-
oriented methodologies may result in biased parameter values in explanatory models. As far as global models are
concerned, this analysis demonstrated that a spatial error model adds significantly to the understanding and
interpretation of spatially varying crime rates.
• The use of a GWR model allowed for an assessment of spatial heterogeneity when exploring the relationships
between predictor variables and crime rates by local area. Geographically weighted estimations provided the best
fit to the data. The strong local variation in predictor variables as well as in crime rates points to problems that
policy makers can best address at the local level, taking into account the situation in particular areas.
• Significant local parameter estimates were found for the predictor variables, confirming spatial heterogeneity in
the effects of these variables on crime and providing insights into the spatial scale at which processes may be
operating. Furthermore, a two-step cluster analysis revealed distinct zones of spatial effects.
45
6. Conclusions
46
References Spatial Data Analysis 1 & 2
Anselin, L., Spatial regression analysis in R. A workbook, Spatial Analysis Laboratory, Dept. of Geography, University of Illinois, Urbana-Champaign, May 2007.
Arnio, A.N. & Baumer, E.P., Demography, foreclosure and crime : Assessing spatial heterogeneity in contemporary models of neighborhood crime rates, Demographic
Research, 26, 2012, pp.449-488.
Cahill, M. & Mulligan, G., Using geographically weighted regression to explore local crime patterns, Social Science Computer Review, 25, 2007, pp. 174-193.
Fotheringham, A.S., Brunsdon, C. & Charlton, M.E., Geographically weighted regression : The analysis of spatially varying relationships, Chichester UK, John Wiley & Sons,
2002.
Helbich, M., Leitner, M. & Kapusta, N.D., Geospatial examination of lithium in drinking water and suicide mortality, International Journal of Health Geographics, 2012, pp.
11-19.
Matthews, S.A. & Yang, T.Ch., Mapping the results of local statistics : Using geographically weighted regression, Demographic Research, 26, 2012, pp. 151-166.
Matthews, S.A. & Parker, D.M., Progression in spatial demography, Demographic Research, 28, 2013, pp. 271-312.
Mennis, J.L., Mapping the results of geographically weighted regression, The Cartographic Journal, 43, 2006, pp. 171-179.
Shoff, C., Yang, T.Ch. & Matthews, S.A., What has geography got to do with it ? Using GWR to explore place-specific associations with prenatal care utilization, GeoJournal,
77, June 2012, pp. 331-341.
Siordia, C., Saenz, J. & Tom, S.E., An introduction to macro-level spatial nonstationarity : A geographically weighted regression analysis of diabetes and poverty, Journal of
Studies and Research in Human Geography, 6, 2012, pp. 5-13.
Tita, G.E. & Radil, S.M., Making space for theory : The challenges of theorizing space and place for spatial analysis in criminology, Journal of Quantitative Criminology, 26,
2010, pp. 467-479.
Tobler, W., A computer movie simulating urban growth in the Detroit region, Economic Geography, 46, 1970, pp. 234-240.
Vilalta, C.J., How exactly does place matter in crime analysis ? Place, space and spatial heterogeneity, Journal of Criminal Justice Education, 2012, pp. 1-26.
Voss, P.R., Long, D.D., Hammer, R.B. & Friedman, S., County child poverty rates in the U.S. : A spatial regression approach, Population Research Policy Review, 25, 2006, pp.
369-391.
Yamashita, K., Understanding urban fire : Modeling fire incidence using classical and geographically weighted regression, ProQuest, UMI Dissertation Publishing, 2012.

Spatial statistics presentation Texas A&M Census RDCCorey Sparks
 
Role of Modern Geographical Knowledge in National Development
Role  of Modern Geographical Knowledge in National DevelopmentRole  of Modern Geographical Knowledge in National Development
Role of Modern Geographical Knowledge in National DevelopmentProf Ashis Sarkar
 
Sa Presentation 20070917111 Thomas
Sa Presentation 20070917111 ThomasSa Presentation 20070917111 Thomas
Sa Presentation 20070917111 Thomasnspiropo
 
Geographic Information: Aspects of Phenomenology and Cognition
Geographic Information: Aspects of Phenomenology and CognitionGeographic Information: Aspects of Phenomenology and Cognition
Geographic Information: Aspects of Phenomenology and CognitionRobert (Bob) Williams
 
A Qualitative Study To Show How Other Affect Individual...
A Qualitative Study To Show How Other Affect Individual...A Qualitative Study To Show How Other Affect Individual...
A Qualitative Study To Show How Other Affect Individual...Stephanie King
 
Assessing spatial heterogeneity
Assessing spatial heterogeneityAssessing spatial heterogeneity
Assessing spatial heterogeneityJohan Blomme
 
ASSIGNMENT 2AReview Question 6 the below question and select o.docx
ASSIGNMENT 2AReview Question 6 the below question and select o.docxASSIGNMENT 2AReview Question 6 the below question and select o.docx
ASSIGNMENT 2AReview Question 6 the below question and select o.docxrock73
 
Agent-Based Modeling
Agent-Based ModelingAgent-Based Modeling
Agent-Based ModelingKelly Lipiec
 
5.1 major analytical techniques
5.1 major analytical techniques5.1 major analytical techniques
5.1 major analytical techniquesmd Siraj
 
Vulnerable Groups and Communities in The Context of Adaptation and Developmen...
Vulnerable Groups and Communities in The Context of Adaptation and Developmen...Vulnerable Groups and Communities in The Context of Adaptation and Developmen...
Vulnerable Groups and Communities in The Context of Adaptation and Developmen...Tariq A. Deen
 
Vulnerable Groups and Communities in The Context of Adaptation and Developme...
 Vulnerable Groups and Communities in The Context of Adaptation and Developme... Vulnerable Groups and Communities in The Context of Adaptation and Developme...
Vulnerable Groups and Communities in The Context of Adaptation and Developme...NAP Events
 
r The Impact of Social Policy Pranab Chatterjee an.docx
r The Impact of Social Policy Pranab Chatterjee an.docxr The Impact of Social Policy Pranab Chatterjee an.docx
r The Impact of Social Policy Pranab Chatterjee an.docxaudeleypearl
 
Edwards, Mary E - Regional and Urban Economics and Economic Development _ The...
Edwards, Mary E - Regional and Urban Economics and Economic Development _ The...Edwards, Mary E - Regional and Urban Economics and Economic Development _ The...
Edwards, Mary E - Regional and Urban Economics and Economic Development _ The...theinko1
 

Semelhante a Spatial data analysis 2 (20)

Spatial analysis of house price determinants
Spatial analysis of house price determinantsSpatial analysis of house price determinants
Spatial analysis of house price determinants
 
Spatial Analysis of House Price Determinants
Spatial Analysis of House Price DeterminantsSpatial Analysis of House Price Determinants
Spatial Analysis of House Price Determinants
 
Process_Method_versus_The_Hypothesis_Method
Process_Method_versus_The_Hypothesis_MethodProcess_Method_versus_The_Hypothesis_Method
Process_Method_versus_The_Hypothesis_Method
 
Spatial statistics presentation Texas A&M Census RDC
Spatial statistics presentation Texas A&M Census RDCSpatial statistics presentation Texas A&M Census RDC
Spatial statistics presentation Texas A&M Census RDC
 
Role of Modern Geographical Knowledge in National Development
Role  of Modern Geographical Knowledge in National DevelopmentRole  of Modern Geographical Knowledge in National Development
Role of Modern Geographical Knowledge in National Development
 
Sa Presentation 20070917111 Thomas
Sa Presentation 20070917111 ThomasSa Presentation 20070917111 Thomas
Sa Presentation 20070917111 Thomas
 
Spatial Essay
Spatial EssaySpatial Essay
Spatial Essay
 
Geographic Information: Aspects of Phenomenology and Cognition
Geographic Information: Aspects of Phenomenology and CognitionGeographic Information: Aspects of Phenomenology and Cognition
Geographic Information: Aspects of Phenomenology and Cognition
 
Scalable
ScalableScalable
Scalable
 
A Qualitative Study To Show How Other Affect Individual...
A Qualitative Study To Show How Other Affect Individual...A Qualitative Study To Show How Other Affect Individual...
A Qualitative Study To Show How Other Affect Individual...
 
Assessing spatial heterogeneity
Assessing spatial heterogeneityAssessing spatial heterogeneity
Assessing spatial heterogeneity
 
ASSIGNMENT 2AReview Question 6 the below question and select o.docx
ASSIGNMENT 2AReview Question 6 the below question and select o.docxASSIGNMENT 2AReview Question 6 the below question and select o.docx
ASSIGNMENT 2AReview Question 6 the below question and select o.docx
 
5.pdf
5.pdf5.pdf
5.pdf
 
Agent-Based Modeling
Agent-Based ModelingAgent-Based Modeling
Agent-Based Modeling
 
5.1 major analytical techniques
5.1 major analytical techniques5.1 major analytical techniques
5.1 major analytical techniques
 
Systems 550 ppt_art
Systems 550 ppt_artSystems 550 ppt_art
Systems 550 ppt_art
 
Vulnerable Groups and Communities in The Context of Adaptation and Developmen...
Vulnerable Groups and Communities in The Context of Adaptation and Developmen...Vulnerable Groups and Communities in The Context of Adaptation and Developmen...
Vulnerable Groups and Communities in The Context of Adaptation and Developmen...
 
Vulnerable Groups and Communities in The Context of Adaptation and Developme...
 Vulnerable Groups and Communities in The Context of Adaptation and Developme... Vulnerable Groups and Communities in The Context of Adaptation and Developme...
Vulnerable Groups and Communities in The Context of Adaptation and Developme...
 
r The Impact of Social Policy Pranab Chatterjee an.docx
r The Impact of Social Policy Pranab Chatterjee an.docxr The Impact of Social Policy Pranab Chatterjee an.docx
r The Impact of Social Policy Pranab Chatterjee an.docx
 
Edwards, Mary E - Regional and Urban Economics and Economic Development _ The...
Edwards, Mary E - Regional and Urban Economics and Economic Development _ The...Edwards, Mary E - Regional and Urban Economics and Economic Development _ The...
Edwards, Mary E - Regional and Urban Economics and Economic Development _ The...
 

Mais de Johan Blomme

Curieuzeneuzen ww belgie
Curieuzeneuzen ww belgieCurieuzeneuzen ww belgie
Curieuzeneuzen ww belgieJohan Blomme
 
Text mining and social network analysis of twitter data part 1
Text mining and social network analysis of twitter data part 1Text mining and social network analysis of twitter data part 1
Text mining and social network analysis of twitter data part 1Johan Blomme
 
Trends voor data analyse 2014
Trends voor data analyse 2014Trends voor data analyse 2014
Trends voor data analyse 2014Johan Blomme
 
Trends in business_intelligence_2013
Trends in business_intelligence_2013Trends in business_intelligence_2013
Trends in business_intelligence_2013Johan Blomme
 
Trends in business intelligence 2012
Trends in business intelligence 2012Trends in business intelligence 2012
Trends in business intelligence 2012Johan Blomme
 
The new normal in business intelligence
The new normal in business intelligenceThe new normal in business intelligence
The new normal in business intelligenceJohan Blomme
 
Business intelligence in the real time economy
Business intelligence in the real time economyBusiness intelligence in the real time economy
Business intelligence in the real time economyJohan Blomme
 
E Business Integration. Enabling the Real Time Enterprise
E Business Integration. Enabling the Real Time EnterpriseE Business Integration. Enabling the Real Time Enterprise
E Business Integration. Enabling the Real Time EnterpriseJohan Blomme
 
Correspondentie Analyse
Correspondentie AnalyseCorrespondentie Analyse
Correspondentie AnalyseJohan Blomme
 
Knowledge Discovery In Data. Van ad hoc data mining naar real-time predictie...
Knowledge Discovery In Data.  Van ad hoc data mining naar real-time predictie...Knowledge Discovery In Data.  Van ad hoc data mining naar real-time predictie...
Knowledge Discovery In Data. Van ad hoc data mining naar real-time predictie...Johan Blomme
 
Operational B I In Supply Chain Planning
Operational  B I In Supply Chain PlanningOperational  B I In Supply Chain Planning
Operational B I In Supply Chain PlanningJohan Blomme
 
What is data mining ?
What is data mining ?What is data mining ?
What is data mining ?Johan Blomme
 

Mais de Johan Blomme (12)

Curieuzeneuzen ww belgie
Curieuzeneuzen ww belgieCurieuzeneuzen ww belgie
Curieuzeneuzen ww belgie
 
Text mining and social network analysis of twitter data part 1
Text mining and social network analysis of twitter data part 1Text mining and social network analysis of twitter data part 1
Text mining and social network analysis of twitter data part 1
 
Trends voor data analyse 2014
Trends voor data analyse 2014Trends voor data analyse 2014
Trends voor data analyse 2014
 
Trends in business_intelligence_2013
Trends in business_intelligence_2013Trends in business_intelligence_2013
Trends in business_intelligence_2013
 
Trends in business intelligence 2012
Trends in business intelligence 2012Trends in business intelligence 2012
Trends in business intelligence 2012
 
The new normal in business intelligence
The new normal in business intelligenceThe new normal in business intelligence
The new normal in business intelligence
 
Business intelligence in the real time economy
Business intelligence in the real time economyBusiness intelligence in the real time economy
Business intelligence in the real time economy
 
E Business Integration. Enabling the Real Time Enterprise
E Business Integration. Enabling the Real Time EnterpriseE Business Integration. Enabling the Real Time Enterprise
E Business Integration. Enabling the Real Time Enterprise
 
Correspondentie Analyse
Correspondentie AnalyseCorrespondentie Analyse
Correspondentie Analyse
 
Knowledge Discovery In Data. Van ad hoc data mining naar real-time predictie...
Knowledge Discovery In Data.  Van ad hoc data mining naar real-time predictie...Knowledge Discovery In Data.  Van ad hoc data mining naar real-time predictie...
Knowledge Discovery In Data. Van ad hoc data mining naar real-time predictie...
 
Operational B I In Supply Chain Planning
Operational  B I In Supply Chain PlanningOperational  B I In Supply Chain Planning
Operational B I In Supply Chain Planning
 
What is data mining ?
What is data mining ?What is data mining ?
What is data mining ?
 

Spatial data analysis 2

  • 1. Spatial Data Analysis 2/2 Johan Blomme | Leenstraat 11 | 8340 Damme info@data-insights.be
  • 2. Introduction. There is an increased interest in understanding spatially varying processes to explain various social, political and economic outcomes. Using global and local statistics can lead to completely different insights into the relationship between area-level characteristics and outcomes. Two types of spatial analysis are especially relevant : – spatial autocorrelation : the application of local clustering analysis to establish significant local patterns and the use of spatial econometrics to account for spatial effects in regression analysis ; – spatial heterogeneity : the application of geographically weighted regression analysis to explore the spatial variation in the relationships between area-level characteristics and various outcomes. In this guide, various techniques to perform global and local spatial regression analysis are explored. The examples used are for illustrative purposes only and are not intended to test the theoretical underpinnings that exist in the research field of the chosen cases.
  • 3. • In recent years, there has been a growing interest in adding a spatial perspective to the study of complex patterns of interrelated social, behavioral, economic and environmental phenomena. It is increasingly argued that spatial thinking and spatial analytical perspectives have an important role to play in uncovering answers that could prove helpful in addressing research and policy questions*. • Spatial analysis of data focuses on four methodological areas : – spatial econometrics ; – geographically weighted regression ; – multilevel models ; – spatial pattern analysis. * It is worth noting that the term “spatial analysis” applies equally to the study of incident-level point patterns (e.g. crime hot spots) and to the study of aggregated counts or rates at the area level (e.g. census block groups, tracts or “neighborhoods”).
  • 4. 1. Spatial econometrics • Spatial econometrics accounts for spatial effects in regression analysis. If geography or place matters (and it frequently does), then things that are more related geographically (i.e. more proximate geographically) are also correlated in other ways. Therefore, assumptions about the independence of covariates and about the independence and distribution of error terms are violated in an OLS regression framework. • Let’s take the example of the analysis of crime data. The growing spatial analysis of crime data enabled criminologists to move beyond simply mapping crime and demonstrating that crime does indeed cluster in space. An important issue became the question of why crime clusters in space. Spatial regression models were estimated to explain the observed patterns of spatial clusters. In addition to crime, many researchers began to use spatial regression models to demonstrate that many negative health issues such as low birth weight, infant mortality and depression cluster spatially. From these studies emerged a consistent set of explanatory variables that characterise “bad” neighborhoods (e.g. concentrated poverty, stability of residents, female headed households, minority population), together with the finding that there appeared to be an aggregate “neighborhood” effect. For instance, concentrated poverty negatively impacts all residents of a community regardless of one’s own level of personal income.
  • 5. • That such places also cluster in space suggests that neighborhoods are not independent units of observation. There might be forces at work that make the level of crime in one neighborhood dependent upon the actions and activities occurring in other areas. That is, social processes might be at work that result in the diffusion of crime across space. • In trying to understand these patterns, spatial regression became the methodology of choice. As noted, spatial autocorrelation occurs when the values of variables sampled at nearby locations are not independent of each other. This lack of independence makes the use of OLS regression inappropriate. To address spatial autocorrelation, spatial lag and spatial error models became most popular. • When the level of crime in one neighborhood is directly dependent upon the activities or social processes occurring in a neighboring area, one must apply a spatial lag (spatial dependence) model. Spatial error models are appropriate for modeling unobservable processes (e.g. norms or beliefs) that are shared among individuals residing in proximate places, or when boundaries that separate “places” are arbitrary to the extent that two different places are actually very similar across various social, economic or demographic features.
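The two model families named on this slide are usually written as follows. This is a sketch in a common textbook notation; the symbols are assumptions and do not appear on the slides. Here $y$ is the outcome vector (e.g. crime rates), $X$ the matrix of covariates, $W$ a spatial weight matrix that encodes which areas count as neighbors, and $\varepsilon$ an independent error term.

```latex
% Spatial lag model: the outcome in one area depends directly on outcomes in neighboring areas.
y = \rho W y + X\beta + \varepsilon

% Spatial error model: the dependence sits in the unobserved part of the model.
y = X\beta + u, \qquad u = \lambda W u + \varepsilon
```

With $\rho = 0$ or $\lambda = 0$, each model reduces to the ordinary regression $y = X\beta + \varepsilon$, which is why a significant $\rho$ or $\lambda$ is read as evidence of spatial dependence.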
  • 6. • By examining the statistically significant coefficient on the spatially lagged dependent variable or the spatial error term, specific explanations were offered regarding the forces driving the diffusion of the study object. • The selection of which model, lag or error, has been, and continues to be, driven by goodness of fit tests rather than theory. • What causes spatial autocorrelation ? • Feedback. For most social processes, individuals and households interact with each other and thereby influence each other. The influence of such an interaction is likely to be stronger for those who are in frequent contact. Residential proximity generally increases the frequency of such contact. However, it is also possible to geographically “unbound” the autocorrelation matrix. For example, social similarity increases the probability of communication and social interaction. In this way, events in an area can be influenced more by events in non-adjacent but socially similar areas than in adjacent but socially dissimilar areas. One might model the diffusion of youth violence by considering social interactions that occur within schools. In such a case, neighborhoods would be linked if and only if they send students to the same school buildings. Studies that capture social networks and communication networks can provide an empirical validation of this approach (instead of using a geographically based matrix, the potential for activities in one area to influence other areas can be based on social distance between places).
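The school-based linkage described on this slide can be sketched as a weight matrix in which two neighborhoods are neighbors if and only if they share at least one school. This is a minimal illustration only: the function name `school_weights`, the input layout and the row standardization are assumptions, not something the slides specify.

```python
import numpy as np

def school_weights(sends_to):
    """Build a row-standardized weight matrix from school attendance.

    sends_to: dict mapping a neighborhood name to the set of schools
              its students attend (hypothetical input layout).
    Returns the sorted neighborhood names and the matrix W.
    """
    names = sorted(sends_to)
    n = len(names)
    W = np.zeros((n, n))
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            # Link two different neighborhoods when they share a school.
            if i != j and sends_to[a] & sends_to[b]:
                W[i, j] = 1.0
    # Row-standardize; rows without any link stay all zero.
    row_sums = W.sum(axis=1, keepdims=True)
    W = W / np.where(row_sums > 0, row_sums, 1.0)
    return names, W

names, W = school_weights({
    "A": {"school_1"},
    "B": {"school_1", "school_2"},
    "C": {"school_2"},
    "D": {"school_3"},
})
print(names)
print(W)
```

In this toy input, A and C are not linked to each other because they share no school, even though both are linked to B; D is isolated.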
  • 7. • Grouping forces. Individuals and households with common characteristics sometimes are found clustered together by choice or they are constrained to co-locate by the coercive operation of social, economic or political forces. When this type of constraint is responsible for spatial autocorrelation in a dependent variable, it may be possible to identify the variable or variables involved in the process and operationalize them on the right-hand side of the regression equation. Sometimes the spatial autocorrelation in the dependent variable (and the regression residuals) can be explained by autocorrelated covariates (independent variables), and standard regression approaches will work fine. If a causal variable cannot be identified, then the source of the autocorrelation will remain in the error term, necessitating what is referred to as a spatial error model. • Grouping responses. Individuals or households that share a common attribute or a set of common characteristics may respond similarly to external forces. Often there exist contextual forces that affect individuals and households in an area (e.g. geophysical conditions, cultural influences). A data analyst can deal with these contextual influences by declaring different “spatial regimes”. If not, spatial autocorrelation will remain in the regression error term, the result of an omitted variable in the specification, and spatial econometric approaches must again be considered.
  • 8. • Nuisance autocorrelation. This occurs when the underlying spatial process creates regions that are much larger than the units of observation chosen or available to the analyst. The choice of the proper level of aggregation when estimating neighborhood effects remains problematic. Data is typically aggregated to geographical areas which serve as the units of analysis (e.g. census tracts). The modifiable areal unit problem (MAUP) arises from the fact that units are usually arbitrarily defined in the sense that they can be aggregated or disaggregated to form units of different size. Innovative advances are being undertaken that define the geography of a community no longer by boundaries drawn for administrative purposes (e.g. census tracts, zip codes) but by capturing the spatial dimension of social networks.
  • 9. • The challenges for future work are not those that pertain to the development of new mapping technologies or more sophisticated statistical methodologies : “That is, regardless of how sophisticated our methodologies become for the estimation of spatial models, the key will always be that the specification of these models be sound in terms of the measurement and definition of place and the manner in which areas are deemed “neighbors”. … Though the ability for a crime in a focal area to influence crime in another area might decay over distance, it is possible that there are other networks of social interactions (e.g. interactions that occur outside the neighborhood at work or school, participation in voluntary or religious organizations, …) that make events in one area extremely salient in the commission of future events in otherwise geographically distant areas” (Tita & Radil, 2010, p. 476).
  • 10. • Recently techniques for the analysis of local spatial relationships have been developed. • In conventional regression, one parameter is estimated for the relationship between each independent variable and the dependent variable and the relationship is assumed to be constant across the study area. The term “global” implies that all of the data are used to compute a single statistic or model, and that the relationships between variables in the model are stationary across the study area. The GWR approach extends this framework to estimate local rather than global parameters. Instead of calibrating a single regression equation, GWR generates a separate regression equation for each observation. Each equation is calibrated using a different weighting of the observations contained in the data. • In traditional OLS all places have the same weight as if all places shared the same location. In GWR, as we move over space observations are weighted according to their proximity to a location. 2. Geographically weighted regression viii
  • 11. • Two problems arise with GWR. First, if the subset of the full sample is too small, standard errors will be high. Second, if the subsample is too large, coefficients will be biased because they drift across space. If the process is spatially non-stationary, a regression with a large subsample will result in estimates that are spatial averages. To overcome these problems in GWR a weighted calibration is used. • Observations in close spatial proximity to region i have a larger influence on the estimation of the parameters for region i than those further away. That is why those observations have a larger weight in the sample than the observations from regions further away. This weighted calibration implies that the weighting of an observation is not constant but varies with i. Region j has a large weight in the estimation for region i if they are close to each other, and the weight of region j in the estimation for region m might be small if the regions are separated by a larger distance. Every single region i has a different weight matrix. • GWR is run in several steps. The first step is to decide how observations should be weighted. The two most applied weighting functions (kernel functions) are the Gaussian and the bi-square kernel. Using the Gaussian kernel, in which space is considered continuous, the weighting of data will decrease according to a Gaussian curve as the distance between i and j increases. Up to a certain bandwidth, the observations will have a weight of at least 0.5. A binary scheme, by contrast, implies the notion that space is discrete or discontinuous: beyond the bandwidth, the weights are set to zero (the bi-square kernel likewise assigns zero weight beyond its bandwidth).
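The two kernels named above can be written down in a few lines. This is a minimal sketch using the functional forms commonly given in the GWR literature; the function names, the distances and the bandwidth of 100 are illustrative assumptions, not values from the study.

```python
import math

def gaussian_weight(d, bandwidth):
    """Gaussian kernel: the weight decays smoothly and never reaches zero."""
    return math.exp(-0.5 * (d / bandwidth) ** 2)

def bisquare_weight(d, bandwidth):
    """Bi-square kernel: the weight falls to exactly zero at the bandwidth."""
    if d >= bandwidth:
        return 0.0
    return (1.0 - (d / bandwidth) ** 2) ** 2

# Illustrative distances between a focal region i and other regions j
distances = [0.0, 25.0, 50.0, 100.0, 150.0]
bandwidth = 100.0  # hypothetical bandwidth, same units as the distances

gauss = [gaussian_weight(d, bandwidth) for d in distances]
bisq = [bisquare_weight(d, bandwidth) for d in distances]
for d, g, b in zip(distances, gauss, bisq):
    print(f"d={d:6.1f}  gaussian={g:.4f}  bisquare={b:.4f}")
```

With this form of the Gaussian kernel every observation within one bandwidth keeps a weight above 0.5, whereas the bi-square kernel drops observations beyond the bandwidth entirely.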
  • 12. • It is often stated that the GWR results are relatively insensitive to the choice of the weighting function, but they are not insensitive to the choice of the bandwidth. As the density of regions in a dataset can vary, a single fixed bandwidth is often inappropriate. For example, in a study of European regions a fixed bandwidth of 800 km is too small for the estimation of coefficients in Finland, because there are few regions and, accordingly, few data points in close proximity. The northernmost Finnish region would have only 3 neighbors. Such a small sample would result in large standard errors. Similarly, this bandwidth is too large for a place like Austria, where the density of regions is much higher. The region of Tirol would have 129 neighboring regions within a distance of 800 km. Such a large sample could result in serious drift bias. That is why an adaptive kernel is most appropriate : an optimal adaptive number of neighbors will be applied. An adaptive kernel means that a fixed proportion of all observations is included in the estimation, for example 20 percent of all regions. • An adaptive kernel is smaller in regions where the density of observations is high (as in the Austrian regions) and larger in regions where the density is low (as in the Finnish regions). While the advantage of an adaptive kernel is obvious for regions with a high density of observations, the coefficients of regions with a low density of observations are likely to be drift biased, as they are also influenced by observations from regions that are far away.
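A simple way to picture the adaptive kernel is to set the bandwidth at each focal point to the distance of its k-th nearest neighbor. The sketch below assumes planar coordinates and an invented set of points (a dense cluster and a sparse string); the helper name `adaptive_bandwidth` is hypothetical and not part of GWR4.

```python
import math

def adaptive_bandwidth(focal, points, k):
    """Distance from `focal` to its k-th nearest neighbor among `points`."""
    dists = sorted(math.dist(focal, p) for p in points if p != focal)
    return dists[k - 1]

# Invented coordinates: four regions packed closely, three spread far apart
points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 0), (20, 0), (30, 0)]

dense_bw = adaptive_bandwidth((0, 0), points, k=3)    # focal point in the dense cluster
sparse_bw = adaptive_bandwidth((30, 0), points, k=3)  # focal point in the sparse area
print(dense_bw, sparse_bw)
```

The same k yields a short bandwidth in the dense cluster and a long one in the sparse area, which is the behavior described for the Austrian and Finnish regions.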
  • 13. • The bandwidth can be understood as the area of influence of each place. A small bandwidth means a small area of influence, and thus a rapid distance decay function, whereas a large bandwidth implies a larger area of influence, and thus a smoother weighting scheme. In a regression context, a small bandwidth (less smoothing) produces estimates with large local variation, whereas large bandwidths (greater smoothing) produce estimates with little spatial variation (larger bandwidths will make local coefficient estimates similar to OLS global estimates). • There are two methods for the estimation of the optimal bandwidth : AICc (corrected Akaike Information Criterion) and CV (cross validation). When comparing GWR models with different bandwidths, the model with the lowest AICc or CV score can be considered the most appropriate, and its bandwidth (radius) is taken as optimal.
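For reference, the two bandwidth criteria are usually written as follows in the GWR literature (Fotheringham, Brunsdon & Charlton, 2002). The notation is an assumption introduced here: b is the bandwidth, n the number of observations, \hat{y}_{\neq i}(b) the fitted value for observation i when that observation is left out of the calibration, \hat{\sigma} the estimated standard deviation of the error term and \mathbf{S} the hat matrix of the GWR model. Whether GWR4 implements exactly these expressions has not been checked here.

```latex
\mathrm{CV}(b) = \sum_{i=1}^{n} \bigl[\, y_i - \hat{y}_{\neq i}(b) \,\bigr]^2
\qquad
\mathrm{AIC_c}(b) = 2n \ln(\hat{\sigma}) + n \ln(2\pi)
  + n \,\frac{n + \operatorname{tr}(\mathbf{S})}{n - 2 - \operatorname{tr}(\mathbf{S})}
```

In both cases the bandwidth that minimizes the criterion is selected.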
  • 14. • The output from GWR is a set of surfaces that can be mapped , with each surface depicting the spatial variation of a relationship. Standard global modeling techniques, such as OLS or spatial regression models, cannot detect nonstationarity, and thus their use may obscure regional or local variation in the relationships between predictors and the outcome variable. Public policy inferences based on results from global models in which nonstationarity is present but not detected may be quite poor in specific local/regional settings. • GWR analysis and interpretation are largely dependent on GWR maps. Such maps can be problematic if they illustrate the size of parameter estimates while failing to illustrate their relative significance. A method to address this issue is the mapping of GWR statistics by combining local parameter estimates and t-values on a single map. • It is important to note that GWR is an exploratory technique, and, as with ESDA and spatial econometric approaches, the insights gained from GWR can be utilized to improve model specification in global models. Limitations associated with GWR include the computationally demanding calculation of multiple regressions, multicollinearity and kernel bandwidth selection. It should also be taken into account that GWR studies that find that parameter coefficients vary across space have a tendency to focus on this result and do not always seek to explain the results with further analysis. It is important that GWR and ESDA methods are utilized to help improve model specification, and that efforts be made to find explanations. xii
  • 15. • A methodological focus on multilevel or hierarchical modeling is relevant when assessing to what extent individual behaviors and demographic and health outcomes are influenced by an individual’s own characteristics, and by the attributes of the larger geographic area (neighborhood, village, district, state). • To some extent, nested data are inherently spatial. Statistical methods that incorporate neighborhood, city or regional effects are in essence considering the effects of places and spaces on their outcome(s) of interest. While traditional research has looked at de jure classifications of space (e.g. census tracts), it is increasingly acknowledged that legal and political boundaries frequently have little to do with actual lived spaces. Furthermore, many scientists are working in regions that do not have synonymous spatial categories : that is, neighborhoods and other administrative bounded areas may have different meanings in some of the non-industrialized and/or industrializing nations than they do in the developed world. 3. Multilevel modeling xiii
  • 16. • A wide range of methods now exist for analyzing spatial clusters of point data, such as disease or crime events, in which the goal is to discover whether the observed events exhibit any systematic pattern, as opposed to being distributed at random within a study area. Recent applications of spatial pattern analysis include the use of local statistics of spatial association. 4. Spatial pattern analysis
  • 17. What does the near future hold for spatial data analysis ? • We can predict with some confidence that things will change rapidly, as the geospatial data and methodological development environment is dynamic. • It must be emphasized that the volume, sources and forms of geospatial data are growing rapidly. Data from wireless and sensor technologies and developments in data storage and handling (e.g. cloud computing, geospatial data warehouses, data mining techniques) will continue to change what, how and when we collect data on individuals and their environments. New data formats will be tagged with both a geographic location and a time stamp, providing unparalleled spatial and temporal precision. xv
  • 18. Global and Local Spatial Regression 1
  • 19. • Traditional regression analysis describes a modelled relationship between a dependent variable and a set of independent variables. When applied to spatial data, the regression analysis often assumes that the modelled relationship is stationary over space and produces a global model which is supposed to describe the relationship at every location in the study area. This would be misleading, however, if the relationships being modelled are intrinsically different across space. One of the spatial statistical methods that attempts to solve this problem and explain local variation in complex relationships is Geographically Weighted Regression (GWR). • In a global regression model, the dependent variable is often modelled as a linear combination of the independent variables, and the parameters are assumed to be stationary over the whole area (i.e. the model returns one value for each parameter). GWR extends this framework by dropping the stationarity assumption: the parameters are assumed to be continuous functions of location. The result of the GWR analysis is a set of continuous localised parameter estimate surfaces, which describe the geography of the parameter space. These estimates are usually mapped or analysed statistically to examine the plausibility of the stationarity assumption of the traditional regression and different possible causes of nonstationarity. The definitive text on GWR is : Fotheringham, A.S., Brunsdon, C. & Charlton, M.E., Geographically Weighted Regression : The Analysis of Spatially Varying Relationships, Chichester, Wiley, 2002.
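In the notation of the text cited above, the contrast between the global and the local model can be written as follows. The symbols are assumptions introduced for illustration: (u_i, v_i) are the coordinates of observation i, x_{ik} is the value of predictor k at i, and \mathbf{W}(u_i, v_i) is the diagonal matrix of kernel weights centred on i.

```latex
\text{Global model:}\quad y_i = \beta_0 + \sum_{k} \beta_k x_{ik} + \varepsilon_i

\text{GWR model:}\quad y_i = \beta_0(u_i, v_i) + \sum_{k} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i

\hat{\boldsymbol{\beta}}(u_i, v_i) =
  \bigl(\mathbf{X}^{\top} \mathbf{W}(u_i, v_i)\, \mathbf{X}\bigr)^{-1}
  \mathbf{X}^{\top} \mathbf{W}(u_i, v_i)\, \mathbf{y}
```

When every observation receives the same weight, the local estimator reduces to the ordinary least squares estimator of the global model.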
  • 21. • The use of linear regression is common in many areas of science. Ordinary linear regression implicitly assumes spatial stationarity of the regression model, that is, the relationships between the variables remain constant over geographical space. We refer to a model in which the parameter estimates for every observation in the sample are identical as a global model. • Spatial nonstationarity occurs when a relationship (or pattern) that applies in one region does not apply in another. Global models are statements about processes or patterns which are assumed to be stationary and as such are location independent, i.e. are assumed to apply to all locations. In contrast, local models are spatial disaggregations of global models, the results of which are location-specific. The template of the model is the same : the model is a linear regression model with certain variables, but the coefficients alter geographically. If the parameter estimates are allowed to vary across the study area such that every observation has its own separate set of parameter estimates, we have a local model. • GWR does not assume that the relationships between independent and dependent variables are constant across space. Instead, GWR explores whether the relationships between a set of predictors and an outcome vary by geographical location. GWR is suggested to be a powerful tool for investigating spatial nonstationarity in the relationship between predictors and the outcome variable.
  • 22. • GWR4 is a new release of a Microsoft Windows-based application for calibrating geographically weighted regression models, which can be used to explore geographically varying relationships between dependent/response variables and independent/explanatory variables.
  • 23. A GWR4 session consists of five steps : (1) give the session a name ; (2) specify the regression type and variable settings ; (3) choose a geographic kernel type ; (4) specify names for the files storing the modelling results ; (5) execute the session. For an extensive review of these 5 steps, see NAKAYA, T., GWR4 User Manual, update 7 May 2012.
  • 24. • Theoretically, spatial nonstationarity is based on the concept of the social construction of space. The interaction of individuals with each other and with their physical environment produces space. Human beings are just as much spatial as temporal beings. By spatial, we mean that we are most influenced by what is immediate in space. What happens near us matters more than non-proximal events. Human spatiality and temporality are essential and equally powerful in explaining human behavior. Consequently, everything that is social is inherently spatial, just as everything spatial is inherently socialized. • From this perspective, we analyse how the macro-level relationship between crime and various socio-economic and demographic variables unfolds over geographical space.
  • 25. • Processes and characteristics of urban areas at the human-environment interface (e.g. social stratification, segregation, urban poverty) depend on a diverse set of socio-demographic, economic and environmental factors. Due to the heterogeneity of urban areas, it can be assumed that the strength and direction of the influence of these factors vary over space. • Special properties of geospatial data are spatial autocorrelation and spatial heterogeneity (nonstationarity). Spatial autocorrelation implies a spatial association between an attribute value at a particular location and attribute values at other locations close by. Spatial heterogeneity describes systematic spatial variation of attribute values across space. These spatial effects must be taken into account when modeling spatial relationships in a regression model. • Traditionally, global statistical regression approaches are applied to study the influence of explanatory variables on a target variable. These approaches emphasize similarities across space. In the following analysis, this global or “one-size-fits-all” approach is juxtaposed against spatial autocorrelation and nonstationarity. In particular, we explore a global non-spatial regression model and both a global and a local spatial regression model of the relationship between indicators of socio-economic disadvantage and neighborhood demographic context and crime rates in Belgian municipalities in the period 2006-2008.
  • 26. • An exploration of the spatial patterns of crime is warranted. The causal processes driving crime may vary over space, that is, predictor variables may operate differently in different locations. This may be especially relevant in policy studies where there is growing recognition that understanding the context of crime – the where and when of criminal events – is key to understanding how crime can be controlled and prevented. Crime studies that highlight local variations – local contexts of crime – will likely have more relevance to real-world policy applications. Empirically, if these variations in causal processes do exist and are not accounted for, the statistical model will be inaccurate. • Estimations provided by a global model might be inadequate in capturing spatially varying relationships, as global statistics only describe average relations between the dependent variable and the considered explanatory variables. With increasing spatial variation of local observations, the reliability of global model estimates decreases. • There might be spatial dependencies, in which attribute values in one location depend on the values of the attributes in neighboring locations. • The assumption of spatial heterogeneity is suggested by the fact that criminality and its determinants may be distributed unevenly across space. Another source of spatial heterogeneity is the dynamics between population and location. That is, cultural differences and differences in attitudes and behaviour across locations may alter how people react to various contextual variables. Given the potential for spatial heterogeneity, it would be naïve to assume that the spatial processes between criminality and its determinants are stationary (or universal) and can be captured by a conventional “global” model.
  • 27. • Following Tobler’s first law of geography, which states that “everything is related to everything else, but near things are more related than distant things”, GWR has to be calibrated in such a way that observations near to observation i have more influence on the estimation of the parameters than data located further away from i. • GWR takes advantage of spatial dependence in the data. Spatial dependence implies that data available in locations near the focal location are more informative about the relationship between the independent and the dependent variables in the focal location. When evaluating estimates for a focal location, GWR gives more weight to data from closer locations than to data from more distant locations. It is assumed that the relative weight of the contributing locations decays at an empirically determined rate as the distance from the focal location increases. Spatial dependence refers to (socio-economic) interaction among agents, whereas spatial heterogeneity regards the aspects of the socio-economic structure over space.
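The calibration described above can be illustrated with a toy example: a weighted least squares fit of one outcome on one predictor, repeated at different focal locations. This is a minimal sketch under stated assumptions (invented coordinates and values, a Gaussian kernel, a single predictor, a hypothetical helper `local_fit`); it is not the procedure implemented in GWR4.

```python
import math

def local_fit(focal, coords, x, y, bandwidth):
    """Weighted least squares of y on x, with Gaussian weights centred on `focal`.

    Returns (intercept, slope) of the local regression line.
    """
    w = [math.exp(-0.5 * (math.dist(focal, c) / bandwidth) ** 2) for c in coords]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Invented data: a western cluster where y = 2x and an eastern cluster where y = 6 - x
coords = [(0, 0), (1, 0), (2, 0), (100, 0), (101, 0), (102, 0)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0, 5.0, 4.0, 3.0]

_, slope_west = local_fit((1, 0), coords, x, y, bandwidth=5.0)
_, slope_east = local_fit((101, 0), coords, x, y, bandwidth=5.0)
_, slope_global = local_fit((50, 0), coords, x, y, bandwidth=1e9)  # near-equal weights
print(slope_west, slope_east, slope_global)
```

The local fits recover the opposite slopes of the two clusters, while the fit with a very large bandwidth behaves like a global regression and returns a single averaged slope that describes neither cluster.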
  • 28. 1. Analytical framework • Our analysis strategy entails estimating regression models that summarize the “global”, or average, effects of the predictor variables on crime rates across our sample of Belgian municipalities. Given the well-known spatial autocorrelation evident in crime data, we generate the global models using Ordinary Least Squares (OLS) and Spatial AutoRegression (SAR) estimators. OLS and the spatial autoregression model are “global” models in the sense that they both assume that a single set of parameters sufficiently describes the relationships between predictor variables and crime rates. • The classical ordinary least squares (OLS) model is widely used to model the global relationship between a response variable and one or more explanatory variables. OLS assumes, among other things, that residuals are spatially independent. Residual autocorrelation captures unexplained similarities between neighboring municipalities, which can be the result of omitted variables or a misspecification of the regression model. Assuming a global model does exist, an exploration of spatial patterns in the data can help determine whether a global model is misspecified – whether the model is missing important predictor variables (spatial error model) or whether a spatial term should be included in the model (spatial lag model) – which would improve the accuracy of the global model in explaining crime levels across the study area.
  • 29. • Global models that account for spatial effects are spatial autoregressive models (SAR). The spatial error model addresses the presence of spatial autocorrelation by defining a spatial autoregressive process for the error term and, by doing so, captures unexplained similarities. The spatial lag model extends the standard OLS regression model by including a spatially lagged dependent variable, which can mostly be interpreted as capturing spill-over effects. • Global regression models assume a homogeneous behavior of the estimated parameters across space. We expect spatial homogeneity to be rare and assume that most social phenomena are not geographically stationary. A way to deal with spatial heterogeneity is the application of geographically weighted regression (GWR) to investigate spatially varying relationships. • GWR models spatial autocorrelation and spatial heterogeneity for subsets of the entire data set. Each subset is established around a regression point, with near data points exerting a higher influence than more distant data points. This weighting is often based on a bi-square kernel function. Of crucial importance is the specification of an appropriate bandwidth length. The most common is the adaptive bandwidth, whose length is allowed to vary across space, depending on the density of the data points. In densely populated areas the kernel has a shorter bandwidth, in contrast to regions with larger inter-point distances, where the bandwidth is longer. • While it is often argued that GWR is more suitable for exploratory analysis, it is also a technique for testing whether local models yield a significant improvement in fit over the global models.
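The two global spatial specifications are conventionally written as follows. The notation is an assumption introduced here: \mathbf{W} is the spatial weights matrix, \rho the spatial lag coefficient, \lambda the spatial error coefficient, and \boldsymbol{\varepsilon} an independent error term.

```latex
\text{Spatial lag model:}\quad
  \mathbf{y} = \rho\, \mathbf{W}\mathbf{y} + \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}

\text{Spatial error model:}\quad
  \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{u},
  \qquad \mathbf{u} = \lambda\, \mathbf{W}\mathbf{u} + \boldsymbol{\varepsilon}
```

Setting \rho = 0 or \lambda = 0 returns the standard OLS model in either case.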
  • 30. • The following analysis models both spatial autocorrelation and nonstationarity by means of global and local spatial statistical models. An exploratory spatial data analysis, a global non-spatial regression model, a global spatial regression model and finally a local spatial regression model were applied to explore the association between various predictors and crime in Belgian municipalities. We rely on crime data in municipalities, the main political and administrative unit of the Belgian territory. • The dependent variable in this study is the crime rate per 1000 residents (calculated as a mean over the period 2006-2008) in Belgian municipalities (N = 589, source data : statistics Belgian Federal Police, period 2006-2008). • To test social deprivation theory we collected data at municipality level on various indicators of inequality. Besides mean family income and the percentage unemployed, we use the Gini coefficient as a measure of income variation, indicating the distribution of income in each municipality (between the extremes of 0 (absolute equality) and 1 (maximum inequality)). As control variables we include various socio-demographic indicators : population density, the share of males in the age group 15 to 64, the percentage of young people (15-24) in the population, the percentage of residents that are foreign born, the percentage of non-Euro foreign born residents and the degree of female labour force participation (source data : National Institute of Statistics and statistics Federal Government, period 2006-2008).
  • 31. 14 Since the original data for the dependent variable and five of the independent variables are not normally distributed (skewness marked in red in the above table) and normality of data is a basic assumption for both ordinary least squares regression and spatial regression, natural log values (ln) were used for these variables.
  • 32. • The first step in an exploratory spatial data analysis (ESDA) is to verify whether spatial data are randomly distributed. To do this, it is necessary to use global autocorrelation statistics. The global indicators of spatial autocorrelation are not capable of identifying local patterns of spatial association, such as local spatial clusters or local outliers in data that are statistically significant. To overcome this obstacle, it is necessary to implement a spatial clustering analysis (we made use of GeoDa, the open-source spatial regression software of the GeoDa Center for Geospatial Analysis and Computation, http://geodacenter.asu.edu). • A significant Moran’s I statistic is a first clue that parameter estimates in an OLS regression can be affected by spatial residual autocorrelation. For this reason, the Moran’s I statistic was calculated for the dependent variable and the nine independent variables included in this study. The neighborhood relationships for calculating the Moran’s I statistic are defined as first order queen contiguity, which is commonly used (a municipality’s spatial lag is a weighted average of its neighboring localities ; neighbors are typically defined in terms of their physical proximity to the local geographic unit). • Results indicate that both the dependent and all independent variables exhibit significant positive spatial autocorrelation. The hypothesis of spatial randomness is clearly rejected. A positive and significant spatial dependence in the dependent variable (crime rate) indicates that the crime rate in a particular municipality is associated with (not independent of) crime rates in surrounding municipalities. The value of the spatial autocorrelation coefficient (0,297) indicates that a 10 percentage point increase in the crime rate in a municipality is associated with an increase of nearly 3% in the crime rate in neighboring municipalities. This, together with the results of the LISA cluster analysis, is evidence of the existence of significant spillover effects between municipalities with respect to crime, and implies that there is a need for coordination of the municipal efforts to fight criminal activities that spill over the municipal borders. 2. Exploratory spatial data analysis
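The global Moran's I statistic used in this step can be computed directly from a list of values and a contiguity structure. The sketch below is a simplified illustration with binary weights and four invented units arranged on a line; GeoDa works with row-standardised weights by default, so this is not a reproduction of the analysis reported here.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary contiguity weights.

    `neighbors[i]` lists the indices of the units adjacent to unit i.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(len(nb) for nb in neighbors)  # sum of all weights
    cross = sum(z[i] * z[j] for i, nb in enumerate(neighbors) for j in nb)
    return (n / s0) * cross / sum(zi * zi for zi in z)

# Four units in a row: 0 - 1 - 2 - 3
neighbors = [[1], [0, 2], [1, 3], [2]]

i_clustered = morans_i([1.0, 1.0, 5.0, 5.0], neighbors)    # similar values side by side
i_alternating = morans_i([1.0, 5.0, 1.0, 5.0], neighbors)  # dissimilar values side by side
print(i_clustered, i_alternating)
```

Clustered values give a positive statistic and alternating values a negative one, which is the distinction the test for spatial randomness relies on.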
  • 33. 16 Prevalence of crime in Belgian municipalities (N = 589)
  • 34. 17 Global Moran’s I statistic for variables included in this study
  • 35. 18 Cartograms of the geographical distribution of independent variables
  • 36. 19 LISA cluster map for criminality in Belgian municipalities, N= 589
  • 37. • Exploring the relationship between the independent variables and crime rates starts with a multivariate OLS regression model. None of the correlations between the predictors is high enough to raise a major concern about multicollinearity. Nevertheless, we evaluated the diagnostics to assess the issue of multicollinearity more formally. In particular, Variance Inflation Factors (VIFs) were investigated. Since all VIF scores are below the critical value of 5, problematic multicollinearity is rejected*. • Results show that the nine predictors explain about 54,2% of the variance in crime rates. Of those, the variables representing the percentage of males in the age group 15-64, the percentage of the age group 15-24 in the population and the percentage of foreign born residents do not contribute significantly to the explanation of the variability in crime rates between municipalities**. • A more detailed analysis of the residuals reveals that they are not normally distributed (Jarque-Bera test = 410.059 ; p < 0.001) but not heteroscedastic (Koenker-Bassett test = 14.115 ; p = 0.118). Finally, residual independence is tested by the Moran’s I statistic. This test shows significant spatial residual autocorrelation (Moran’s I = 0.155 ; p < 0.001), violating the model’s independence assumption. This residual pattern in the OLS model can be the result of existing spatial effects and can be accounted for by means of a spatial regression model. * Collinearity diagnostics were estimated using SPSS 21 and no problems of multicollinearity were found among the independent variables. The collinearity diagnostics used were the variance inflation factors (VIF) and tolerances for individual variables. Multicollinearity is said to exist if the VIF is 5 or higher (or equivalently, tolerances of 0,20 or less). The highest VIF value in this analysis was 4,852 and the lowest tolerance was 0,206, both for mean income. ** Initially, two dummy variables representing the regions in Belgium were added to the regression equation. However, VIF scores indicated the presence of multicollinearity. Therefore, these dummy variables were not retained in the OLS regression. 3. Global non spatial regression model
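The equivalence between the VIF and tolerance thresholds mentioned in the footnote follows from their definitions: tolerance is 1 minus the R2 obtained when a predictor is regressed on the other predictors, and the VIF is its reciprocal. A minimal sketch with hypothetical helper names, checked against the values reported for mean income:

```python
def vif_from_r2(r2):
    """Variance inflation factor from the R2 of a predictor regressed on the others."""
    return 1.0 / (1.0 - r2)

def tolerance_from_vif(vif):
    """Tolerance is the reciprocal of the VIF."""
    return 1.0 / vif

reported_vif = 4.852                            # highest VIF reported above (mean income)
tolerance = tolerance_from_vif(reported_vif)    # close to the reported tolerance of 0,206
threshold_tolerance = tolerance_from_vif(5.0)   # the VIF cut-off of 5 as a tolerance
print(round(tolerance, 3), threshold_tolerance)
```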
  • 39. • The clustering of crime rates indicates that the data are not randomly distributed, but instead follow a systematic pattern. The spatial clustering of variables, and the possibility of omitted variables that relate to the connectivity of neighboring localities, raise model specification issues. Evidence for the latter also comes from the residual autocorrelation present in the OLS model. • We employ two alternative specifications to correct for spatial dependence. One is the spatial lag model. This specification is relevant when the spatial dependence works through a spatial lag of the dependent variable. The other specification is the spatial error model. This specification is relevant when the spatial dependence works through the disturbance term (spatial regression models were estimated using GeoDa, the regression software of the GeoDa Center for Geospatial Analysis and Computation, http://geodacenter.asu.edu). • The value of the LMLAG-test is only weakly significant (LMLAG = 3.598 ; p < 0.1) but the results of the LMERROR-test (56.900 ; p < 0.001) suggest that a spatial error term must be considered in the global spatial regression model. • The results from the spatial lag model, shown in the table on the next slide, suggest that this model does not perform as well as the spatial error model. The effect of the spatial lag term is statistically weak (rho = 0.084 ; p = 0.101). The robust Lagrange Multiplier (LM) test also recommends the use of the spatial error model, and the lower AIC value combined with the higher R2 value for the spatial error model signals that this model outperforms the spatial lag model. In the spatial error model, all predictor variables except one (the percentage of foreign born residents) yield a statistically significant effect. 4. Global spatial regression model
  • 40. 23 Global OLS versus global spatial regression models
  • 41. • Based on the results of the global spatial regression model it is difficult to defend similarities in municipality-level crime as arising from imitation of one’s neighbors, that is, a spatial lag process. Criminality results from a complex mix of social, economic and cultural factors, only a small number of which can be brought into a statistical model of the process. Much of it remains unaccounted for and is summarized in the model’s error term. • Although the very small Moran’s I value (-0.022) associated with the spatial error model indicates that little spatial autocorrelation remains in the residuals, the residuals do not comply with the assumption of constant variance (Breusch-Pagan test for heteroscedasticity = 54.060 ; p < 0.001).
  • 42. • A global regression model carries the assumption that the processes being modeled are uniform throughout the study area : the relationships between the dependent and the independent variables remain stationary (constant) across the entire study area of Belgium. Local spatial regression models take nonstationarity into account. We use GWR4 to perform geographically weighted regression analysis (Nakaya, 2012). • The results of fitting the dataset to different GWR descriptive models are shown below. Four alternative GWR models were applied, covering the four possible combinations of two types of kernel (fixed or adaptive) and two bandwidth selection methods (AICc and CV). GWR models 3 and 4 (both models use an adaptive kernel) yielded lower residual sums of squares, meaning that these models provided a better fit to the data. The R2 value of both GWR models is nearly the same. We chose GWR model 3, which has the lowest AICc value, to provide an exploratory analysis of the data. 5. Local spatial regression model
GWR models applied to dataset of Belgian municipalities
                                GWR model 1   GWR model 2   GWR model 3   GWR model 4
kernel                          fixed         fixed         adaptive      adaptive
bandwidth method                AICc          CV            AICc          CV
adjusted R2                     0,628         0,570         0,633         0,639
residual sum of squares         342571,621    444329,261    337812,559    323290,833
AICc                            5584,352      5630,776      5578,562      5581,763
ANOVA test residuals OLS/GWR    p < 0,01      p < 0,01      p < 0,01      p < 0,01
  • 46. UTM conversion of lat/lon-coordinates (polygon centroids). All points fall in UTM longitude zone 31 (central meridian 3°) ; intermediate calculation columns omitted.
LatDD.dddd    LongDD.ddd    LongZone    Northing    Easting
51,453768     4,468343      31          5701311     602022,3
51,405344     4,469669      31          5695928     602222,5
51,394896     4,606091      31          5694965     611736,2
51,267046     4,370611      31          5680415     595620,1
51,335968     4,628302      31          5688446     613426,8
51,342434     4,440327      31          5688891     600319,1
51,334729     4,371499      31          5687942     595541,4
51,313418     4,501749      31          5685750     604663
51,293475     4,724967      31          5683875     620271,1
51,268405     4,512477      31          5680760     605513,8
51,256666     4,584644      31          5679561     610576,2
51,258890     4,673666      31          5679946     616782,1
51,229417     4,325406      31          5676172     592542
51,211117     4,685177      31          5674652     617707,1
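The conversion from latitude/longitude to UTM coordinates can be sketched with the standard transverse Mercator series expansion. The code below is an illustration under stated assumptions: it uses the WGS84 ellipsoid (the slide does not name the datum), handles northern-hemisphere points only, and the function name `latlon_to_utm` is hypothetical. It is not the spreadsheet used for the slide.

```python
import math

# WGS84 ellipsoid and UTM scale factor (assumed; the slide does not name the datum)
A_AXIS = 6378137.0
FLATTENING = 1 / 298.257223563
K0 = 0.9996

def latlon_to_utm(lat_deg, lon_deg):
    """Return (zone, easting, northing) for a northern-hemisphere point."""
    zone = int((lon_deg + 180) // 6) + 1
    lon0 = math.radians((zone - 1) * 6 - 180 + 3)  # central meridian of the zone
    phi = math.radians(lat_deg)
    e2 = FLATTENING * (2 - FLATTENING)             # first eccentricity squared
    ep2 = e2 / (1 - e2)                            # second eccentricity squared
    n = A_AXIS / math.sqrt(1 - e2 * math.sin(phi) ** 2)
    t = math.tan(phi) ** 2
    c = ep2 * math.cos(phi) ** 2
    a = (math.radians(lon_deg) - lon0) * math.cos(phi)
    # Meridional arc length from the equator to latitude phi
    m = A_AXIS * (
        (1 - e2 / 4 - 3 * e2**2 / 64 - 5 * e2**3 / 256) * phi
        - (3 * e2 / 8 + 3 * e2**2 / 32 + 45 * e2**3 / 1024) * math.sin(2 * phi)
        + (15 * e2**2 / 256 + 45 * e2**3 / 1024) * math.sin(4 * phi)
        - (35 * e2**3 / 3072) * math.sin(6 * phi)
    )
    easting = 500000 + K0 * n * (
        a + (1 - t + c) * a**3 / 6
        + (5 - 18 * t + t**2 + 72 * c - 58 * ep2) * a**5 / 120
    )
    northing = K0 * (
        m + n * math.tan(phi) * (
            a**2 / 2 + (5 - t + 9 * c + 4 * c**2) * a**4 / 24
            + (61 - 58 * t + t**2 + 600 * c - 330 * ep2) * a**6 / 720
        )
    )
    return zone, easting, northing

# First centroid listed on the slide
zone, easting, northing = latlon_to_utm(51.453768, 4.468343)
print(zone, round(easting, 1), round(northing))
```

Projected coordinates in metres are convenient for GWR because kernel distances and bandwidths can then be expressed directly in metres or kilometres.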
  • 47. • Results reveal that the GWR model exhibits a significant improvement in explained variance as compared to the OLS regression model (63,3% vs. 54,2%). The AICc score for the GWR model (5578.562) is substantially lower than the AIC score for the global OLS model (5657.654), which reflects a better goodness of fit (AIC is a measure of relative model quality that balances goodness of fit against model complexity. The lower its value, the better the fit of the model to the observed data).
• Another method to evaluate the GWR model is the ANOVA test, which tests the null hypothesis that the GWR model represents no improvement over the global model. The computed F-value of 2.753 is in excess of the critical value of F (2.41 ; α = 0.01) with 10 and 496 degrees of freedom. The ANOVA test thus suggests that the GWR model is a significant improvement on the global model for the data of Belgian municipalities.
• The results obtained by the GWR method provide information about locally differing coefficient estimates. Therefore, the GWR results do not report a global estimate for each explanatory variable but rather provide insights into local ranges of the estimates (minimum, 25% quantile, median, 75% quantile and maximum). The 5-number summary (see next slide) helps to get a feel for the degree of spatial nonstationarity in a relationship by comparing the range of the local parameter estimates with a confidence interval around the global estimate of the parameter. This is accomplished by dividing the interquartile range of the GWR coefficient by twice the standard error of the same variable from the global regression (OLS). Ratio values > 1 suggest nonstationarity in the relationship between an independent variable and the dependent variable.
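The nonstationarity ratio described above is simple arithmetic and can be sketched as follows. The quartiles for ln(unemployment) are taken from the 5-number summary; the global standard error of 0.05 is a hypothetical placeholder, since the OLS standard errors are not reported in this part of the guide. The function name is also hypothetical.

```python
def stationarity_ratio(lower_quartile, upper_quartile, global_se):
    """Interquartile range of the local estimates over twice the global SE."""
    if global_se <= 0:
        raise ValueError("global_se must be positive")
    return (upper_quartile - lower_quartile) / (2.0 * global_se)

# ln(unemployment): quartiles from the GWR summary, SE is an assumed value.
ratio = stationarity_ratio(0.216, 0.444, global_se=0.05)
print(round(ratio, 2), "nonstationary" if ratio > 1 else "stationary")
```

With these inputs the interquartile range (0.228) is more than twice as wide as the assumed two-standard-error band (0.1), so the relationship would be flagged as nonstationary.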
  • 48. • The results of the Monte Carlo test indicate that the parameter estimates vary significantly across space for all parameters except ln(% non-Euro foreign), which is classified as showing no spatial variability. As shown on the map on the next slide, the total variance explained by the local model ranges from 47,8% to 83,4%. In general, there is a north-south divide with higher R2 values in the northern part of the country. Explained variance is lowest in the southern part of the province of East Flanders and its surrounding municipalities in Wallonia.
Geographically weighted regression : 5-number parameter summary results and Monte Carlo significance test for spatial variability of parameters (Belgian municipalities, N = 589)
parameter | minimum | lower quartile | median | upper quartile | maximum | status (Monte Carlo test) | significance (Monte Carlo test)
Intercept | -513,965 | -28,093 | 210,299 | 361,918 | 512,299 | non-stationary | p < 0.001
ln(Gini inequality) | -0,706 | 0,285 | 1,062 | 1,439 | 2,529 | non-stationary | p < 0.001
mean income | -0,010 | -0,006 | -0,004 | -0,002 | 0,001 | non-stationary | p < 0.001
ln(unemployment) | -0,012 | 0,216 | 0,325 | 0,444 | 0,720 | non-stationary | p < 0.001
ln(population density) | -0,057 | 0,006 | 0,056 | 0,132 | 0,227 | non-stationary | p < 0.001
% male in age group 15-64 | -0,100 | 0,019 | 0,044 | 0,065 | 0,129 | non-stationary | p < 0.001
% 15-24 in population | -0,119 | -0,058 | -0,011 | 0,022 | 0,087 | non-stationary | p < 0.001
ln(% foreign born) | -0,119 | -0,009 | 0,036 | 0,083 | 0,167 | non-stationary | p < 0.001
ln(% non-Euro foreign) | -0,015 | 0,076 | 0,106 | 0,147 | 0,319 | no spatial variability | p < 0.001
female labour force participation | 0,008 | 0,038 | 0,048 | 0,064 | 0,096 | non-stationary | p < 0.001
  • 49. Local R2 values of the GWR model (Belgian municipalities, N = 589)
  • 50. • To better understand and interpret nonstationarity in individual parameters it is necessary to visualize the local parameter estimates and their associated diagnostics. The output of a GWR analysis includes data that can be used to generate a mappable surface for each model parameter, where each surface depicts the spatial variation of the relationship between a predictor and the outcome variable.
• One of the main challenges is the presentation and synthesis of the large number of results that are generated in local GWR models. Mapping only the parameter estimates is misleading, as the map reader has no way of knowing whether the local parameter estimates are significant. As Mennis (2006 : 172) notes, a main issue is that “the spatial distribution of the parameter estimates must be presented in concert with the distribution of significance, as indicated by the t-value, in order to yield meaningful interpretation of results”.
• Because the patterns of t-values for the parameter estimates reveal which areas have statistically significant estimates, we provide maps of t-values for all variables (see next slides). The maps provide strong evidence of significant spatial heterogeneity in the effect of predictor variables on crime across municipalities.
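One way to present estimates "in concert with" their significance is to blank out local estimates whose t-value falls below a threshold before mapping them. The sketch below assumes a two-sided 5% threshold of |t| >= 1.96 and does not correct for multiple testing; the threshold, the function name and the example values are illustrative assumptions, not the settings used for the maps in this guide.

```python
def mask_insignificant(estimates, t_values, critical_t=1.96):
    """Keep a local estimate only where its t-value is significant, else None."""
    if len(estimates) != len(t_values):
        raise ValueError("estimates and t_values must have the same length")
    return [b if abs(t) >= critical_t else None
            for b, t in zip(estimates, t_values)]

# Hypothetical local estimates and t-values for four municipalities.
local_estimates = [0.325, 0.012, -0.210, 0.444]
local_t_values = [3.10, 0.40, -2.25, 1.20]
print(mask_insignificant(local_estimates, local_t_values))
```

Municipalities that come back as None can then be drawn in a neutral colour, so the map reader sees at once where a coefficient is not distinguishable from zero.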
  • 51. Significance of t-values for parameter estimates (1/3)
  • 52. Significance of t-values for parameter estimates (2/3)
  • 53. Significance of t-values for parameter estimates (3/3)
  • 54. • The results of the geographically weighted regression analysis indicate that spatially varying processes operate in Belgian municipalities with respect to the relationships between socio-economic and socio-demographic variables and crime rates.
• Several local results are of particular note. First, when we examine the incidence of significant parameter estimates at the local level, 61 % of all parameter estimates are insignificant. With the exception of unemployment and female labour force participation, the majority of local parameter estimates for each independent variable and for the intercept are insignificant. Positively or negatively signed global effects of covariates do not hold across all municipalities. This shows that it is important to analyze beyond the global level (OLS) and to examine variation at the local level (GWR).
• Secondly, the global parameter estimates mask a great deal of variation at the local level. For example, while the global parameter estimate for unemployment is 0,217, the parameter estimates at the local level range from -0,012 to 0,720. Where the global estimate for the percentage of non-Euro foreign born inhabitants is 0,114, the local parameter estimates range from -0,015 to 0,319.
• Finally, insignificant global results mask countervailing positive and negative effects of covariates at the local level. The negatively signed but insignificant global effect of the percentage of youngsters aged 15-24 in the population reaches negative significance in 23,2 % of the municipalities, while the effect of this covariate reverses to positive significance in a minority (2,9 %) of all municipalities. In a similar way, the positively signed but insignificant global effect of the percentage of males aged 15-64 in the local population reaches positive significance in 39,2 % of the municipalities, while the effect of this variable is significantly negative in 2 % of the municipalities.
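Percentages such as "significantly negative in 23,2 % of the municipalities" can be derived by tallying the local t-values of one covariate. The sketch below assumes a two-sided threshold of |t| >= 1.96; the threshold, the function name and the ten example t-values are hypothetical and do not reproduce the figures quoted above.

```python
from collections import Counter

def significance_shares(t_values, critical_t=1.96):
    """Percentage of locations with significant positive, significant
    negative and insignificant local estimates."""
    def label(t):
        if t >= critical_t:
            return "positive"
        if t <= -critical_t:
            return "negative"
        return "insignificant"

    counts = Counter(label(t) for t in t_values)
    total = len(t_values)
    return {key: 100.0 * counts[key] / total
            for key in ("positive", "negative", "insignificant")}

# Hypothetical local t-values of one covariate in ten municipalities.
shares = significance_shares([2.5, -3.0, 0.4, 1.0, 2.1, -0.2, 0.0, 1.2, -1.5, 0.9])
print(shares)
```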
  • 56. • We can further explore the results of the GWR analysis by clustering locations with similar parameter values for the variables considered. This synthesizes the output that is generated by the GWR model and can help to interpret the results.
• A two-step cluster analysis based on the nine parameter estimates and the intercept was applied. We experimented with solutions of 4 to 8 clusters. The optimal number of clusters was 6 (municipalities were divided into evenly sized clusters).
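The idea of grouping municipalities by their local coefficient profile can be sketched in code. The example below uses plain k-means rather than the two-step cluster procedure applied in this guide, so it illustrates the principle only. It assumes the coefficient vectors are already on comparable scales, initialises the centroids with the first k points to stay deterministic, and works on made-up two-coefficient profiles. The function name is hypothetical.

```python
import math

def kmeans(points, k, max_iterations=100):
    """Plain k-means; returns (labels, centroids)."""
    centroids = [list(p) for p in points[:k]]   # deterministic start (assumption)
    labels = None
    for _ in range(max_iterations):
        # Assign every point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:                 # assignments stable: converged
            break
        labels = new_labels
    return labels, centroids

# Hypothetical local estimates (two coefficients) for six municipalities.
profiles = [(0.20, 0.30), (-0.10, 0.90), (0.22, 0.28),
            (-0.12, 0.95), (0.18, 0.33), (-0.08, 0.88)]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```

In practice the coefficients would usually be standardised first, since the intercept and the slopes in the GWR summary differ in scale by several orders of magnitude.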
  • 57. • Although latitude and longitude were not included in clustering municipalities’ parameter estimates, the six clusters are geographically coherent. A discriminant analysis with cluster membership as the dependent variable and both lat/lon-coordinates as predictors shows that 70,6% of the cluster members are correctly classified based on their location alone, which means that 70,6% of the municipalities are geographically near other members of the same cluster. By cluster, the percentage of correctly classified members varies from 57,8% to 83,8%.
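The logic of checking geographic coherence can be illustrated with a simplified stand-in for the discriminant analysis: assign each municipality to the cluster whose geographic centroid is nearest and count how often that matches its actual cluster. This nearest-centroid rule is not the discriminant analysis used in this guide (it ignores the covariance structure and prior probabilities), and the coordinates and labels below are made up. The function name is hypothetical.

```python
import math
from collections import defaultdict

def nearest_centroid_accuracy(coords, labels):
    """Percentage of points whose nearest cluster centroid (in coordinate
    space) belongs to their own cluster."""
    groups = defaultdict(list)
    for xy, lab in zip(coords, labels):
        groups[lab].append(xy)
    centroids = {lab: tuple(sum(col) / len(pts) for col in zip(*pts))
                 for lab, pts in groups.items()}
    correct = sum(
        1 for xy, lab in zip(coords, labels)
        if min(centroids, key=lambda c: math.dist(xy, centroids[c])) == lab)
    return 100.0 * correct / len(labels)

# Hypothetical projected coordinates: cluster "a" has one member far from the rest.
coords = [(0, 0), (1, 0), (0, 1), (10, 9), (10, 10), (11, 10), (10, 11)]
labels = ["a", "a", "a", "a", "b", "b", "b"]
print(round(nearest_centroid_accuracy(coords, labels), 1))
```

A high percentage indicates that clusters formed purely on coefficient similarity also occupy compact areas on the map.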
  • 58. Significance of t-values for parameter estimates by cluster (1/3)
  • 59. Significance of t-values for parameter estimates by cluster (2/3)
  • 60. Significance of t-values for parameter estimates by cluster (3/3)
  • 61. • Although the parameter estimate of non-Euro inhabitants does not vary spatially (see 5-number parameter summary), it is by far the most important predictor of criminality in cluster 1. In comparison with the other clusters, the effect of the percentage of males in the age group 15-64 is significant in a large majority of municipalities covered by cluster 1.
• Like cluster 1, cluster 2 represents a contiguous area of municipalities, but its percentage of correctly classified municipalities is the lowest of all clusters (57,8%). Within this cluster the percentage of explained variance differs strongly when moving from west to east (R2 between 47,7% and 80,7%).
• Cluster 3 covers large parts of Wallonia, where local R2 values are relatively low. In cluster 3 as well as in cluster 4 and cluster 6, the parameter estimates for socio-economic variables (Gini inequality, mean income and unemployment) are significant in respectively 80,6 %, 65,9 % and 80,6 % of the municipalities. In the other clusters, the effect of these variables is significant in less than one third of the municipalities.
• Apart from the effect of socio-economic variables, the effect of non-Euro inhabitants on crime is also significant in a majority of municipalities in cluster 4.
• In cluster 5 the local R2 values are also relatively low and the estimate of the intercept is significant. Criminality in the east cantons of cluster 5 also correlates significantly, and independently of other predictors, with population density, and the presence of young people in the population inhibits criminality in this area of cluster 5.
• As stated, in the area that represents cluster 6 (the largest cluster in terms of the number of municipalities), the measures of inequality are the most significant determinants of crime. Criminality also varies independently with population density.
  • 62. 6. Conclusions
• The objective of this study was to examine the extent of geographic variation in the relationship between socio-economic and demographic variables on the one hand and crime rates on the other. The goals of our study were (i) to compare the performance of global and local spatial regression with OLS regression, (ii) to examine spatial nonstationarity through the use of GWR, (iii) to map the parameter coefficients of GWR for further interpretation and (iv) to examine whether there are spatial groupings of parameter estimates.
• The analysis revealed that there is evidence of overall clustering in crime rates in Belgium. Local spatial analysis uncovered that places with the highest crime rates are often proximate.
• The finding of local spatial autocorrelation in crime rates suggests that failing to use spatially oriented methodologies may result in biased parameter values in explanatory models. As far as global models are concerned, this analysis demonstrated that a spatial error model adds significantly to the understanding and interpretation of spatially varying crime rates.
• The use of a GWR model allowed for an assessment of spatial heterogeneity when exploring the relationships between predictor variables and crime rates by local area. Geographically weighted estimations provided the best fit to the data. Predictor variables as well as crime rates that show strong local variation point to problems that policy makers can best address at the local level, taking into account the situation in particular areas.
• Significant local parameter estimates were found for the predictor variables, confirming spatial heterogeneity in the effects of these variables on crime and providing insights into the spatial scale at which processes may be operating. Furthermore, a two-step cluster analysis revealed distinct zones of spatial effects.
  • 63. References Spatial Data Analysis 1 & 2
Anselin, L., Spatial regression analysis in R. A workbook, Spatial Analysis Laboratory, Dept. of Geography, University of Illinois, Urbana-Champaign, May 2007.
Arnio, A.N. & Baumer, E.P., Demography, foreclosure and crime : Assessing spatial heterogeneity in contemporary models of neighborhood crime rates, Demographic Research, 26, 2012, pp. 449-488.
Cahill, M. & Mulligan, G., Using geographically weighted regression to explore local crime patterns, Social Science Computer Review, 25, 2007, pp. 174-193.
Fotheringham, A.S., Brunsdon, C. & Charlton, M.E., Geographically weighted regression : The analysis of spatially varying relationships, Chichester UK, John Wiley & Sons, 2002.
Helbich, M., Leitner, M. & Kapusta, N.D., Geospatial examination of lithium in drinking water and suicide mortality, International Journal of Health Geographics, 2012, pp. 11-19.
Matthews, S.A. & Yang, T.Ch., Mapping the results of local statistics : Using geographically weighted regression, Demographic Research, 26, 2012, pp. 151-166.
Matthews, S.A. & Parker, D.M., Progression in spatial demography, Demographic Research, 28, 2013, pp. 271-312.
Mennis, J.L., Mapping the results of geographically weighted regression, The Cartographic Journal, 43, 2006, pp. 171-179.
Shoff, C., Yang, T.Ch. & Matthews, S.A., What has geography got to do with it ? Using GWR to explore place-specific associations with prenatal care utilization, GeoJournal, 77, June 2012, pp. 331-341.
Siordia, C., Saenz, J. & Tom, S.E., An introduction to macro-level spatial nonstationarity : A geographically weighted regression analysis of diabetes and poverty, Journal of Studies and Research in Human Geography, 6, 2012, pp. 5-13.
Tita, G.E. & Radil, S.M., Making space for theory : The challenges of theorizing space and place for spatial analysis in criminology, Journal of Quantitative Criminology, 26, 2010, pp. 467-479.
Tobler, W., A computer movie simulating urban growth in the Detroit region, Economic Geography, 46, 1970, pp. 234-240.
Vilalta, C.J., How exactly does place matter in crime analysis ? Place, space and spatial heterogeneity, Journal of Criminal Justice Education, 2012, pp. 1-26.
Voss, P.R., Long, D.D., Hammer, R.B. & Friedman, S., County child poverty rates in the U.S. : A spatial regression approach, Population Research and Policy Review, 25, 2006, pp. 369-391.
Yamashita, K., Understanding urban fire : Modeling fire incidence using classical and geographically weighted regression, ProQuest, UMI Dissertation Publishing, 2012.