Maps are easy, right? Right. Except... when they aren't.
What about when you need to mix different data types on your maps?
What about when the simple solutions (paging) to complex problems (too much data) don't cut it with your users?
What about when your scaling problems exceed the bounds of all the available solutions?
When building any kind of GIS on the web, whether with Bing Maps or Google Maps or something else, one of the things you need to realize is that the web is a far different environment from the desktop. Massive datasets have serious performance problems on the web. Although there are some built-in and add-on scaling solutions (clustering, polyline encoding), you can quickly run into "unresponsive script" warnings or just plain old horrible, laggy performance when you attempt to zoom or pan with too many markers or polylines on the map.
In this talk, I'll walk through several such problems we encountered in developing our Oil & Gas data browsing app, eTriever, along with our initial, simplistic solutions, followed by our total rewrite to provide a more robust and high-performance web map.
Include detail data for display when items are clicked.
Google & Bing used to have problems with > 100 items (markers, lines, polygons) on the map. Limit the number displayed to avoid this.
But if you limit the number of things you display on the map, then you need to page to additional items.
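Limiting plus paging can be sketched as a simple server-side helper. Everything here (the names and the 100-item cap) is illustrative, not from eTriever:

```python
# Minimal paging sketch (names hypothetical): cap what one request
# returns and let the client ask for the next page via Ajax.
PAGE_SIZE = 100  # stay under the ~100-item comfort zone

def page_of_markers(markers, page):
    """Return one page of markers plus whether more pages remain."""
    start = page * PAGE_SIZE
    chunk = markers[start:start + PAGE_SIZE]
    has_more = start + PAGE_SIZE < len(markers)
    return chunk, has_more

markers = [{"id": i} for i in range(250)]
first, more = page_of_markers(markers, 0)
print(len(first), more)   # 100 True
last, more = page_of_markers(markers, 2)
print(len(last), more)    # 50 False
```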
Now, let's talk about the scale problems you can run into when building a complex GIS on the web.
Not just markers, but lots of data types on the same map:
Point locations (wells, facilities)
Pipelines, roads, trails, routes
Land leases, buildings, counties, park boundaries
What does that mean anyway? In the context of a web application, that is one hell of a lot of data to sling from the server to the client, and imagine the network latency...
This happens when your polylines have discontinuities, or gaps.
The finer-grained the data, the more “things” you are adding to your map.
On your polygons, you may have size variations that don’t display well at varying zoom levels. For example, when you zoom out, the polygons become dots of color, lost on the map.
Land leases can consist of multiple separate areas; how do you handle clicks? The different areas need to respond as one.
Some leases will have holes cut out of the middle.
So, this is starting to look not so easy any more, isn’t it?

Libraries that sit between you and your mapping library (Google/Bing) are, at these levels of complexity, just going to get in your way. So get down and understand your mapping library API, IN DETAIL.
Make certain that your wrapper library (ym4r, etc.) doesn’t generate JavaScript. That never ends well. Supply data to your map via Ajax and JSON.
With 2.5 million segments
When people pan and zoom on your map, if you continue to re-load, you can beat up your server.
Data size and distance are both going to affect performance.
What kind of solutions?
Images. Massive amounts of data, but ... cheap to store on Amazon S3, cheap to generate using Elastic Compute Cloud (EC2).
Is there a higher-level item that can represent a group of items? Counties? States? For our data, wells generally belong to fields, so we can present fields at higher zoom levels.
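The roll-up idea can be sketched in a few lines; the field names and record structure here are made up for illustration:

```python
from collections import defaultdict

# Sketch of rolling points up to a parent feature: at low zoom,
# send one field marker per group of wells instead of every well.
wells = [
    {"id": 1, "field": "Eagle Ford", "lat": 28.9, "lng": -97.9},
    {"id": 2, "field": "Eagle Ford", "lat": 29.0, "lng": -98.0},
    {"id": 3, "field": "Permian",    "lat": 31.8, "lng": -102.1},
]

def fields_for_low_zoom(wells):
    groups = defaultdict(list)
    for w in wells:
        groups[w["field"]].append(w)
    # One marker per field, centered on its wells, with a count.
    return [
        {"field": name,
         "lat": sum(w["lat"] for w in ws) / len(ws),
         "lng": sum(w["lng"] for w in ws) / len(ws),
         "count": len(ws)}
        for name, ws in groups.items()
    ]

fields = fields_for_low_zoom(wells)
print(len(fields))  # 2
```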
Send just what you need: an id for lookup (when needed/clicked on), display text, marker type, and location. Reduce the size of your JSON by using single-letter keys.
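A quick illustration of the single-letter-key trick; the key mapping below is an assumption for the example, not eTriever's actual schema:

```python
import json

# The same marker payload with descriptive keys vs. single-letter
# keys. Over thousands of markers, the key names themselves become
# a meaningful fraction of the bytes on the wire.
verbose = [{"id": i, "display": f"Well {i}", "marker": "oil",
            "lat": 31.5, "lng": -102.3} for i in range(1000)]
compact = [{"i": m["id"], "d": m["display"], "m": m["marker"],
            "y": m["lat"], "x": m["lng"]} for m in verbose]

long_bytes = len(json.dumps(verbose))
short_bytes = len(json.dumps(compact))
print(short_bytes < long_bytes)  # True
```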
Google has the MarkerClusterer and the MarkerManager, which are great for managing up to about 10K items. But what about > 100K?
Server-side clustering... really, it’s the only solution. You can’t throw 100K items around between the server and the browser. You have to figure out how to reduce the data on the server side.
You may have never used an analytic function. They are present in most databases; Oracle, SQL Server, and PostgreSQL specifically have NTILE. (Not MySQL, sorry.)
NTILE can partition your query into buckets. Group the data by the lat/long buckets, and voilà!
Remember to include the counts for your server-side clusters. We modified the MarkerClusterer to understand cluster data.
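A runnable sketch of NTILE-based server-side clustering. SQLite (3.25+, via Python's sqlite3) stands in here for the Oracle/SQL Server/PostgreSQL case, and the table, bucket count, and coordinate ranges are all illustrative:

```python
import random
import sqlite3

# Illustrative well table: 10,000 random points.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wells (id INTEGER, lat REAL, lng REAL)")
random.seed(42)
conn.executemany(
    "INSERT INTO wells VALUES (?, ?, ?)",
    [(i, 30 + random.random() * 5, -100 + random.random() * 5)
     for i in range(10_000)],
)

# NTILE(4) splits the rows into 4 equal buckets along each axis,
# giving a 4x4 grid. Grouping by the bucket pair yields one cluster
# row per grid cell, with a centroid and a count for the client.
clusters = conn.execute("""
    SELECT lat_bucket, lng_bucket,
           AVG(lat) AS center_lat, AVG(lng) AS center_lng,
           COUNT(*) AS n
    FROM (SELECT lat, lng,
                 NTILE(4) OVER (ORDER BY lat) AS lat_bucket,
                 NTILE(4) OVER (ORDER BY lng) AS lng_bucket
          FROM wells)
    GROUP BY lat_bucket, lng_bucket
""").fetchall()

# 16 cluster rows go to the browser instead of 10,000 markers.
print(len(clusters))  # 16
```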
I call them cells to differentiate them from (static) tiles. Cells are like tiles, but dynamically populated (and cached) by AJAX calls to the server.
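The cell idea in miniature; all names here are hypothetical, and the fetch function is a stand-in for the real AJAX call:

```python
# Key each cell by zoom/x/y like a tile, but fill it from the
# server on demand and cache the result so revisits cost nothing.
cell_cache = {}

def fetch_cell_from_server(zoom, x, y):
    # Stand-in for the real AJAX call that returns cell contents.
    return {"markers": [], "cell": (zoom, x, y)}

def get_cell(zoom, x, y):
    key = (zoom, x, y)
    if key not in cell_cache:
        cell_cache[key] = fetch_cell_from_server(zoom, x, y)
    return cell_cache[key]

get_cell(8, 41, 93)
get_cell(8, 41, 93)      # second request is served from the cache
print(len(cell_cache))   # 1
```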
So, that was a lot about markers; what about polylines? Google now includes polyline encoding by default, but for massive numbers of segments, you will still have data size/latency issues.
Reduce data on the server side by filtering on sizes/lengths or some other attribute that makes sense. We do pipe diameter and pipe length filtering.

There are some drawbacks. The polylines get suspiciously regular when filtered this way. We are currently investigating a more sophisticated solution.
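The diameter/length filtering can be sketched like this; the thresholds and field names are made up for the example:

```python
# Attribute filtering on the server: at low zoom, keep only the
# large-diameter or long pipe segments worth drawing.
pipes = [
    {"id": 1, "diameter_in": 36, "length_mi": 120.0},
    {"id": 2, "diameter_in": 4,  "length_mi": 0.3},
    {"id": 3, "diameter_in": 12, "length_mi": 15.0},
]

def visible_pipes(pipes, min_diameter=10, min_length=5.0):
    return [p for p in pipes
            if p["diameter_in"] >= min_diameter
            and p["length_mi"] >= min_length]

print([p["id"] for p in visible_pipes(pipes)])  # [1, 3]
```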
At some levels, you just can’t display the data effectively. For land, the leases become very small at lower zoom levels (zoomed out). At that point, we switch to another layer and display land markers.

But sometimes you just need to post a message that says “zoom in to see data”.
This is a major gotcha. The number of items that will appear on the map varies depending on your screen resolution.
At zoom level X on one screen, you see Y states/provinces.
At the same zoom level X on a smaller screen, you see a greatly reduced set.
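You can see why with the standard Web Mercator ground-resolution figure (~156,543 m per pixel at zoom 0 on the equator); the zoom level and screen widths below are just example numbers:

```python
import math

# How much ground a map viewport covers at a given zoom depends on
# the viewport's width in pixels, not just the zoom level.
def visible_width_km(zoom, screen_px, lat=0.0):
    meters_per_px = 156543.03392 * math.cos(math.radians(lat)) / 2 ** zoom
    return screen_px * meters_per_px / 1000

# The same zoom level shows far more ground on a wide monitor,
# so many more items fall inside the viewport.
print(round(visible_width_km(6, 1024)))  # 2505
print(round(visible_width_km(6, 1920)))  # 4696
```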
My main app, eTriever, is not publicly available, but WIMBY (Wells In My Back Yard) is.