There’s a ton of material out there on the web by people who are plotting things like push-pins to Virtual Earth maps. But what has always interested me, and has always been somewhat of a struggle, is doing the same for complex polygons like U.S. States and counties. In fact, to date I haven’t run into a single example out there of anyone working with large numbers of polygons, or complex ones at that. And I think for good reason.
For starters, let’s face it: Internet Explorer 7 doesn’t have the best-performing JavaScript engine out there. In fact, it’s probably not even in the top 3 browsers. I don’t think I need to link to any of the many studies, tests, and opinion articles out there (good and bad) that make this case. The point here is that performance quickly becomes a concern when attempting to develop a Virtual Earth map with a layer of, oh, let’s say 50 or more complex polygons. And highly complex polygons, like those with thousands of points, become almost impossible to render.
And unfortunately, geographic entities like U.S. States fall into the latter category due to all the wonderful natural features that delineate them. We’re not talking about squares and rectangles here. In Virtual Earth, there are ways to reduce accuracy to save on computation time, but users in the era of CNN election poll maps are familiar with these shapes and come to expect them to "look right".
So if that didn’t scare you away and you’re interested in displaying geographic entities like U.S. States, counties, and zip codes using Virtual Earth, you can download the Shapefiles from two places free of charge. Each location is courtesy of the U.S. Census Bureau:
- The Tiger/Line(R) Shapefiles. These Shapefiles are the most detailed out there.
- The Generalized Cartographic Boundary Files. These are simplified Shapefiles and do not include the extended shoreline that the Tiger/Line Shapefiles have.
Assuming you’re working with SQL 2008, you can easily import these files to a SQL table using Morten Nielsen’s excellent Shape2SQL utility. Once you get that done, let’s see what the resulting data looks like in SQL.
If you opted for the Generalized Cartographic Boundary files, you’ll discover the following:
That’s 273 rows for only 50 U.S. States! I can tell you now that you typically don’t want to try to draw more than about 100 polygons on a given map. States in Tiger/Line look like this:
That’s one line per state. Granted, several states are of MULTIPOLYGON type, meaning multiple polygons exist for that one entity. But it turns out there are significantly fewer polygons per state in the Tiger/Line data than in the Generalized files. This is because the shoreline is extended out three miles from the mainland in the Tiger/Line data, which eliminates all those pesky little islands that end up getting drawn with the Generalized Boundary files. As a rule, the fewer polygons you can get away with drawing, the better.
The drawback to using Tiger/Line is the fact that the shapes are so complex. For example, the Shapefile for Wyoming alone consists of 4,187 points! So you have a choice between a set of simplified shapes that force you into a large number of polygons, or a set of complex shapes that result in fewer polygons (a little over 50). What do you do?
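To put numbers on that trade-off for your own data, a query along these lines may help. It is a minimal sketch that counts member polygons and total points per shape using the STNumGeometries() and STNumPoints() methods of the geometry type. The @shapes table variable and its two toy rows are hypothetical stand-ins for the imported state table.

```sql
-- Hypothetical stand-in for the imported table: two toy shapes, not real state data.
DECLARE @shapes TABLE (Name nvarchar(50), Geom geometry);

-- A single polygon (think: a landlocked state).
INSERT INTO @shapes
VALUES (N'Boxland', geometry::STGeomFromText(
    'POLYGON((0 0, 4 0, 4 3, 0 3, 0 0))', 0));

-- A MULTIPOLYGON (think: a mainland plus one offshore island).
INSERT INTO @shapes
VALUES (N'Islandia', geometry::STGeomFromText(
    'MULTIPOLYGON(((0 0, 4 0, 4 3, 0 3, 0 0)), ((5 1, 6 1, 6 2, 5 2, 5 1)))', 0));

SELECT s.Name,
       s.Geom.STNumGeometries() AS PolygonCount, -- polygons to draw for this row
       s.Geom.STNumPoints()     AS PointCount    -- total vertices across all polygons
FROM @shapes AS s
ORDER BY PointCount DESC;
```

Against the table from this post, the same SELECT would read from [GeoDemo].[dbo].[fe_2007_us_state00], with [NAME00] and [geom] in place of the toy columns.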
This is an example where the new spatial functionality in SQL 2008 really shines. Since we’re using the geometry datatype, we can apply the following function to the query:
1: SELECT [NAME00], [geom].Reduce(0.05).ToString()
2: FROM [GeoDemo].[dbo].[fe_2007_us_state00]
On line 1, the Reduce() function simplifies each shape to the given tolerance using an extension of the Douglas-Peucker algorithm, a simplification similar in effect to the one behind the Generalized Boundary files. In addition, the ToString() function converts the spatial data to Well-Known Text (WKT) format, which is needed for the data to be used by the Virtual Earth ASP.NET application. Using this technique, you get the best of both worlds.
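To get a feel for what a given tolerance does before pointing it at real state data, a small experiment like the following can be useful. It is only a sketch: the polygon is a made-up shape with small wiggles along its bottom edge, the two tolerances are arbitrary, and @border and @tolerances are hypothetical names.

```sql
-- Made-up shape: a rectangle whose bottom edge wobbles by 0.01 to 0.02 units.
DECLARE @border geometry = geometry::STGeomFromText(
    'POLYGON((0 0, 1 0.01, 2 0, 3 0.02, 4 0, 4 3, 0 3, 0 0))', 0);

-- Arbitrary tolerances to compare: one smaller than the wobble, one larger.
DECLARE @tolerances TABLE (Tolerance float);
INSERT INTO @tolerances VALUES (0.001);
INSERT INTO @tolerances VALUES (0.05);

SELECT t.Tolerance,
       @border.STNumPoints()                     AS OriginalPoints,
       @border.Reduce(t.Tolerance).STNumPoints() AS ReducedPoints,
       @border.Reduce(t.Tolerance).ToString()    AS Wkt  -- WKT as sent to the map
FROM @tolerances AS t
ORDER BY t.Tolerance;
```

The larger tolerance should drop the small wiggles while the smaller one keeps them. Because the geometry type is planar, the tolerance is in the same units as the coordinates, which for this data are presumably degrees.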
However, there is one thing to point out. Applying the same tolerance for all U.S. State shapes will not yield good results. More complex shapes tend to maintain their expected shapes better than simpler ones. Here is a screenshot of the entire map set to a tolerance of .008:
The good news is that it took about 5 seconds to run on my laptop. The bad news is the result we get for North Dakota, Montana, Wyoming, Kansas, and others. Reducing the tolerance to .004 results in slight improvements, but increases the load time to about 20 seconds. That’s longer than I think anyone would want to tolerate. A good way to deal with this is to handle the tolerance for problem shapes on a case-by-case basis:
SELECT [NAME00],
       -- Simpler outlines get a smaller tolerance so they keep their shape.
       [geom] = CASE
           WHEN [NAME00] IN ('Wyoming') THEN [geom].Reduce(0.0005).ToString()
           WHEN [NAME00] IN ('Arkansas', 'Colorado') THEN [geom].Reduce(0.001).ToString()
           WHEN [NAME00] IN ('Idaho', 'Montana', 'Missouri', 'North Dakota', 'Utah', 'Kansas', 'Oklahoma', 'Pennsylvania') THEN [geom].Reduce(0.0025).ToString()
           WHEN [NAME00] IN ('Texas', 'Louisiana', 'Maryland', 'Illinois') THEN [geom].Reduce(0.004).ToString()
           -- Everything else tolerates the coarser default.
           ELSE [geom].Reduce(0.05).ToString()
       END
FROM [GeoDemo].[dbo].[fe_2007_us_state00]
WHERE [NAME00] NOT IN ('Hawaii', 'Puerto Rico', 'Alaska')
On my laptop, this approach has allowed me to draw all 48 contiguous states in just under 10 seconds. I’ll take it! The polygons aren’t perfect, but the tolerances can always be adjusted, if desired.
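If the CASE expression gets unwieldy as more states need their own setting, one alternative is to keep the overrides in a table and join to it. The following is a sketch of that idea rather than the query used for the timings in this post. @states, @tolerances, and the toy shapes are hypothetical stand-ins, and 0.05 is the same default used in the CASE expression.

```sql
-- Hypothetical stand-in for the imported state table (toy shapes, not real borders).
DECLARE @states TABLE (Name nvarchar(50), Geom geometry);
INSERT INTO @states
VALUES (N'Wyoming', geometry::STGeomFromText(
    'POLYGON((0 0, 2 0.001, 4 0, 4 3, 0 3, 0 0))', 0));
INSERT INTO @states
VALUES (N'Texas', geometry::STGeomFromText(
    'POLYGON((0 0, 2 0.5, 4 0, 4 3, 0 3, 0 0))', 0));
INSERT INTO @states
VALUES (N'Ohio', geometry::STGeomFromText(
    'POLYGON((0 0, 1 0.2, 2 0, 2 2, 0 2, 0 0))', 0));

-- Per-state overrides; states without a row fall back to the default.
DECLARE @tolerances TABLE (Name nvarchar(50) PRIMARY KEY, Tolerance float NOT NULL);
INSERT INTO @tolerances VALUES (N'Wyoming', 0.0005);
INSERT INTO @tolerances VALUES (N'Texas', 0.004);

SELECT s.Name,
       COALESCE(t.Tolerance, 0.05) AS ToleranceUsed,
       s.Geom.Reduce(COALESCE(t.Tolerance, 0.05)).ToString() AS Wkt
FROM @states AS s
LEFT JOIN @tolerances AS t ON t.Name = s.Name
ORDER BY s.Name;
```

With this layout, tuning a problem state means updating one row instead of editing the query, and the ToleranceUsed column makes it easy to see which setting produced each shape.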
I would expect the situation with drawing polygons to improve with Internet Explorer 8, which I hear tell has an improved JavaScript engine (fingers crossed). This is good news for folks who want to be more like that guy on CNN.