Worldgrids.org: building global covariates for automated mapping
1. Worldgrids.org
building global covariates for automated
mapping
Tomislav Hengl & Hannes I. Reuter
ISRIC World Soil Information, Wageningen University
SS2010 conference, Mar 26th 2011
6. About ISRIC
ISRIC World Soil Information.
ISRIC = International Soil Reference Information Center.
Non-profit organization, affiliated with Wageningen University
and Research.
Mandate: serve soil data; serve international soil standards;
moderate collaboration and partnerships.
Projects: GlobalSoilMap.net, SOTER, Green Water Credits ...
7. This talk
Global repository of publicly available data (worldgrids.org).
A global multiscale approach to geostat mapping.
Some examples: Malawi.
Upcoming activities.
8. Main thesis
Global (multiscale) modeling is now!
9. Analysis objectives
According to Diggle and Ribeiro (2007), geostatistics has three scientific
objectives:
1. model estimation, i.e. inference about the model parameters;
2. prediction, i.e. inference about the unobserved values of the
target variable;
3. hypothesis testing.
10. Regression-kriging
Target variable z is a sum of deterministic and stochastic
components:
z(s) = m(s) + ε(s) (1)
where m(s) is the deterministic part of the variation (i.e. a linear
function of the auxiliary variables) and ε(s) is the residual at every location s.
11. BLUP for spatial data
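\hat{z}(s_0) = q_0^T \cdot \hat{\beta} + \hat{\lambda}_0^T \cdot (z - q \cdot \hat{\beta})
\hat{\beta} = (q^T \cdot C^{-1} \cdot q)^{-1} \cdot q^T \cdot C^{-1} \cdot z    (2)
\hat{\lambda}_0 = C^{-1} \cdot c_0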
This is the dominant model used in ∼90% of our mapping projects
(Minasny and McBratney, 2007)
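As a rough illustration of Eq. (2), the Python sketch below computes a regression-kriging prediction with NumPy. It reads q as the matrix of covariates at the sampling locations, C as the covariance matrix of the residuals, and c_0 as the covariances between the samples and the new location s_0. The exponential covariance function, its `sill` and `range_par` parameters, and all function names are assumptions made for this example; they are not taken from the slides.

```python
import numpy as np

def exp_cov(h, sill=1.0, range_par=1.0):
    """Exponential covariance model (an assumed choice for this sketch)."""
    return sill * np.exp(-h / range_par)

def rk_predict(coords, z, q, s0, q0, sill=1.0, range_par=1.0):
    """Regression-kriging (BLUP) prediction at one new location s0.

    coords : (n, 2) sample coordinates
    z      : (n,)   observed values
    q      : (n, p) covariates at the samples (include a column of ones)
    s0     : (2,)   coordinates of the prediction location
    q0     : (p,)   covariates at the prediction location
    """
    # C: covariances between all pairs of sampling locations
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    C = exp_cov(d, sill, range_par)
    # c0: covariances between the samples and the new location
    c0 = exp_cov(np.linalg.norm(coords - s0, axis=1), sill, range_par)
    # GLS estimate: beta = (q' C^-1 q)^-1 q' C^-1 z
    Ci_q = np.linalg.solve(C, q)
    Ci_z = np.linalg.solve(C, z)
    beta = np.linalg.solve(q.T @ Ci_q, q.T @ Ci_z)
    # Kriging weights: lambda0 = C^-1 c0
    lam = np.linalg.solve(C, c0)
    # Trend at s0 plus kriged residuals
    return q0 @ beta + lam @ (z - q @ beta)
```

Without a nugget term this predictor reproduces the observed value when s_0 coincides with a sampling location, which is a quick sanity check for any implementation.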
19. Worldgrids.org
I was asked to write a review of publicly available global data
sets of interest for species distribution modeling.
I discovered that at 1 to 5 km resolution there is A LOT of
publicly available data that is under-used.
The original images need to be processed before you can use
them as global covariates.
Produce grids → prepare data for upload → geo-serve it.
The result is a repository with ca. 100 unique rasters that can
be obtained directly from
http://spatial-analyst.net/worldmaps/.
Each gridded map consists of 7200 columns and 3600 rows;
the cell size is 0.05 arcdegrees, which corresponds to about
5.6 km at the equator; all maps fall on the same grid.
PS: I also have a lot of data at 1 km.
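The grid dimensions and cell size quoted above can be checked with a few lines of Python. This is only an arithmetic sketch: the function names are made up for the example, and the WGS84 equatorial radius is an assumption about which Earth radius to use.

```python
import math

EARTH_RADIUS_KM = 6378.137  # WGS84 equatorial radius (assumed for this check)

def grid_shape(cell_deg):
    """Columns and rows of a global longitude/latitude grid."""
    cols = round(360.0 / cell_deg)
    rows = round(180.0 / cell_deg)
    return cols, rows

def cell_size_km_at_equator(cell_deg):
    """East-west width of one cell along the equator, in km."""
    return math.radians(cell_deg) * EARTH_RADIUS_KM

print(grid_shape(0.05))
print(round(cell_size_km_at_equator(0.05), 2))
```

For a 0.05 arcdegree cell this gives a 7200 by 3600 grid and a cell width of roughly 5.6 km. The east-west width shrinks towards the poles, so the figure applies at the equator only.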
20. Read more (or see a gallery)
29. PyWPS
Overlay, subset, reproject, aggregate functionality (example):
GNworldgrids(layername=globcov, xcoord=6.848911, ycoord=52.245427)
[1] 50
under construction.
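The slides do not show how the overlay is implemented. As a hedged sketch of what a point overlay on a regular global grid involves, the Python code below converts a longitude/latitude pair to a row and column index and reads the cell value. The function names, the default 0.05 arcdegree cell size, and the assumption that the grid origin is the upper-left corner at (-180, 90) are all illustrative choices, not the actual service.

```python
import math

def lonlat_to_rowcol(lon, lat, cell_deg=0.05):
    """Row/column of the cell containing (lon, lat) on a global grid
    whose upper-left corner is assumed to be at (-180, 90)."""
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError("coordinates outside the global extent")
    ncols = round(360.0 / cell_deg)
    nrows = round(180.0 / cell_deg)
    # Clamp so that lon = 180 and lat = -90 fall in the last column/row
    col = min(int(math.floor((lon + 180.0) / cell_deg)), ncols - 1)
    row = min(int(math.floor((90.0 - lat) / cell_deg)), nrows - 1)
    return row, col

def overlay(grid, lon, lat, cell_deg=0.05):
    """Value of a row-major grid (list of rows or 2D array) at a point."""
    row, col = lonlat_to_rowcol(lon, lat, cell_deg)
    return grid[row][col]
```

A subset or aggregate operation can reuse the same index arithmetic on the corners of a bounding box.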
30. Global Soil Information
An international initiative to make soil property maps (7+3) at
six depths at 3 arcsecs (ca. 100 m).
The leitmotif is to assemble, collate, and rescue as much of
the world's existing soil data as possible.
Some 30 people directly involved (ISRIC is the main project
coordinator).
International compilation of soil data.
The soil equivalent of OneGeology.org, GBIF, GlobCover
and similar projects.
See full specifications at
http://globalsoilmap.org/specifications
31. My dream is to build an Open multipurpose GLIS
32. The six pillars of open geo-data production [1]
1. open data, in real time
2. open source geospatial software
3. open, reproducible procedures
4. open, web-based methods for data and processing models
(interoperability)
5. open and explicitly quantified significance and accuracy levels
of research findings
6. managed, open user and developer communities
[1] Edzer Pebesma (OpenGeostatistic.org)
37. GSM in numbers
The total productive soil area: about 104 million square
km.
To map the world's soils at 100 m (1:200k) would cost about
5 billion EUR (0.5 EUR per ha) using traditional methods.
According to Pedro Sanchez, world soils could be mapped for
0.20 USD per ha (300 million USD).
We would require some 65M profiles according to the strict
rules of Avery (1987).
A world map at 0.008333333 arcdegrees (ca. 1 km) resolution is
an image of 43,200 × 21,600 pixels.
One image of the world at 100 m resolution contains 27
billion pixels (productive soil areas only!).
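Some of these figures can be reproduced with simple arithmetic. The Python sketch below recomputes the traditional mapping cost from the area and the per-hectare rate, the size of a global 1 km grid, and the area each profile would have to represent. The function names are made up for the example, and the inputs are the numbers quoted on the slide.

```python
HA_PER_KM2 = 100  # 1 square km = 100 ha

def mapping_cost_eur(area_km2, eur_per_ha):
    """Total cost of mapping an area at a fixed price per hectare."""
    return area_km2 * HA_PER_KM2 * eur_per_ha

def global_grid_pixels(cell_deg):
    """Columns, rows and total pixel count of a global lon/lat grid."""
    cols = round(360.0 / cell_deg)
    rows = round(180.0 / cell_deg)
    return cols, rows, cols * rows

area_km2 = 104e6                                  # productive soil area
print(mapping_cost_eur(area_km2, 0.5) / 1e9)      # cost in billion EUR
print(global_grid_pixels(1.0 / 120.0))            # 0.008333333 arcdegrees
print(area_km2 / 65e6)                            # square km per profile
```

At 0.5 EUR per ha the cost comes out at about 5.2 billion EUR, in line with the rounded 5 billion quoted, and 65M profiles correspond to one profile per 1.6 square km.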
41. Our proposal
Build global repositories of point and gridded data
(covariates).
Encourage people to contribute to the data repositories
(crowdsourcing).
Implement the six pillars of open geo-data production
(especially open infrastructures and open code).
Prove that it is doable (showcases).
42. Soil profiles from various projects (65k points)
43. Critical question:
How to produce soil property maps @ 100 m
with such limited data?
44. Global Multiscale Nested RK
We propose using a nested RK model:
z(s_B) = m_0(s_{B-k}) + e_1(s_{B-k} | s_{B-(k+1)}) + \ldots + e_k(s_{B-2} | s_{B-1}) + \varepsilon(s_B)    (3)
where z(s_B) is the value of the target variable estimated at ground
scale (B), s_{B-1}, \ldots, s_{B-k} are the higher order components,
e_k(s_{B-k} | s_{B-(k+1)}) is the residual variation from scale s_{B-(k+1)} to a
higher resolution scale s_{B-k}, and \varepsilon is the spatially auto-correlated
residual soil variation (dealt with by ordinary kriging).
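To make the additive structure of Eq. (3) concrete, the Python sketch below splits a fine-resolution grid into a coarse component plus one residual layer per finer scale, so that the layers sum back to the original grid. Block averaging stands in here for the regression and kriging steps of the actual method, which the slides do not spell out. The function names and the aggregation factors are assumptions for illustration only.

```python
import numpy as np

def block_mean(a, f):
    """Aggregate a 2D grid by averaging non-overlapping f x f blocks."""
    r, c = a.shape
    return a.reshape(r // f, f, c // f, f).mean(axis=(1, 3))

def upsample(a, f):
    """Resample a coarse grid back to the fine grid (nearest neighbour)."""
    return np.repeat(np.repeat(a, f, axis=0), f, axis=1)

def nested_components(z, factors):
    """Split a fine grid into a coarse component and finer-scale residuals.

    factors lists aggregation factors relative to the finest grid,
    coarsest first (e.g. [4, 2]); each must divide the grid dimensions.
    Returns a list of grids whose sum equals z.
    """
    comps = []
    prev = np.zeros_like(z, dtype=float)
    for f in factors:
        level = upsample(block_mean(z, f), f)
        comps.append(level - prev)  # what this scale adds to the coarser one
        prev = level
    comps.append(z - prev)          # remaining finest-scale residual
    return comps
```

In the nested RK model each layer would instead be predicted from covariates available at that scale, with the final residual interpolated by ordinary kriging.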
51. Showcase
Let us see some real examples
54. GM-NRK in action: Malawi showcase
2740 soil observations, of which some 800 to 1000 contain
complete analytical and descriptive data.
1:800k polygon soil map.
Some 30-40 gridded layers at various resolutions
(covariates).
55. Data sets available for Malawi
[Figure: three map panels (a), (b), (c); map axes span 33° to 35° and 10° to 17°; legend ranges 0.5 to 48.8 and 22000 to 38000.]
58. pH visualized in GE (1 degree block)
62. Conclusions
Global models and global multiscale predictions are possible
now.
It is very probable that, in the near future, any geostatistical
analysis will be global.
We probably need to rewrite the geostatistical algorithms
so they work with spherical geometry (3D + time).
There is an enormous amount of publicly available RS and
GIS data waiting to be used for geostatistical
mapping: use it!
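One small piece of working with spherical geometry is replacing planar Euclidean distances with great-circle distances when computing variograms or covariances. The Python sketch below uses the haversine formula. The function name and the mean Earth radius are assumptions for the example, and a full spherical treatment would also need covariance models that remain valid on a sphere.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation

def great_circle_km(lon1, lat1, lon2, lat2, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance between two lon/lat points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # min() guards against rounding pushing sqrt(a) slightly above 1
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))
```

Such a function can be dropped into a distance-matrix step in place of a Euclidean norm when the sample coordinates are geographic.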
63. In one sentence:
Take a broader view!
67. Next steps
Launch 5 and 1 km worldgrids.
Provide geo-service and spatial analysis functionality
(overlay, subset, aggregate).
Start making cyber-infrastructure for 250 m and 100 m
grids.
Provide geo-processing services for automated mapping.