Geospatial and the Entity Framework: Half Full, Half Empty, or Wrong Sized ORM?

In late 2004, Ted Neward famously called Object-Relational-Mapping (ORM) the Vietnam of Computer Science.  Recently I switched to .NET 4.5, hoping to reap the benefits of LINQ-to-Entities‘ support for the spatial datatypes.  For SQL Server, this works every bit as well as the Entity Framework does in the first place – great for some databases, a hassle for others (particularly legacy databases).

When LINQ-to-SQL first came out – things didn’t really work too well for spatial.  Back then, it took a little modification of the generated SQL queries before things could get rolling using WKT.  Few people manage their spatial objects as WKT, of course, so you sprinkled some conversion code into your DAL.  Nothing worked out of the box, but the solutions were clear and made sense.

With the Entity Framework’s new spatial support in the System.Data.Spatial namespace, did things improve?  They certainly did if you’re just shuffling geometries from the web to SQL Server.  But what about people who do real work?  Their applications were all built using geometries from Vertesaur, DotSpatial, SharpMap, or NTS.  So we’re still looking at conversion, mostly likely via WKT.  Beyond that letdown, how is the database support?  I personally ran into a lack of naive DbGeometry support when using SQLite.  I wouldn’t have much cared if it were serialized to WKB or WKT, as long as something worked out of the box.

The plain truth is, its often easier to do things yourself than to learn the weird things other people do.  So despite some great use cases, the new geospatial support in .NET 4.5, for me, is the wrong sized glass.  This GIS-specific realization mirrors ORM’s issues in general.  Synapses firing, my brain dug up an old Sam Saffron / Marc Gravell project called Dapper.  Dapper has been called a “micro-ORM”; less of a ground assault and more of a smart bomb – you still manage your ADO connections and write your own SQL, it does fast binding of objects to query parameters and results.

In the end, I moved to Dapper.  Its code-base was small enough for me to grok and hack geospatial support into in a few hours.  Writing SQL is a fair trade for control, particularly when you need control – geospatial data storage being a prime candidate.  It is great to generate object models using the Entity Framework; but I’ve grabbed my POCOs and switched to a smaller, easier to modify ORM with a more stable codebase.

Free, public base-map imagery data?

I’m looking for a decent, truly public, imagery base-map for offline use.

The OnEarth Global Mosaic (15m pan-sharpened pseudo-color Landsat 7) may be years old, but its the best I’ve found so far.  Unfortunately, the old download links appear to be dead, and I don’t imagine they’d appreciate me scraping their WMS.

Does anybody have a link to the entire (~1.3TB) OnEarth Global Mosaic dataset, or a recommendation for a prettier / newer / better data-set at 15-30m?

Lessons in wrestling an ESRI compact cache

I have a 13.2 terabyte ESRI compact-cache.  For those of you not familiar, a compact-cache is ESRI’s proprietary bundle format.  You need to bundle these large caches, because if you stored “just a bunch of tiles”, when you went to move/back-up the data, your OS would spend an eternity worrying about the details.  My cache has 44,500 bundles, 16384 tiles per bundle, and 512×512 pixels per tile.  A comparable tile-cache with 256×256 pixels per file would have around three trillion tiles.

If you look around for about 5 minutes, you’ll find a myriad of tile-cache servers capable of serving traditional tile caches.  For ESRI compact-caches, you’ll find ArcGIS Server.  Now, ArcGIS Server isn’t free, and that’s enough to make one wonder if there are other bundle formats out there.  There are.

As it turns out, other people have previously cracked open ESRI compact-caches, and documented the cache format on the interwebs.  Basically, the files are in that bundle, and they’re whole files.  Mine are whole JPEGs.  I wanted at those JPEGs, and I wanted to put them into new bundles.

Forget the details of my bundle format; I used .NET to extract the JPEGs, and GDAL to write them to the new format.  Along the way, I resized the JPEGs from 512×512 into the 256×256 tiles my new format expected.

Well, making a 512×512-pixel JPEG into four 256×256-pixel JPEGs is a comparatively CPU-intensive operation.  For me, it’s ~125x slower than a simple file copy.  Also, sending a JPEG to GDAL from .NET is similarly slow.  GDAL’s .NET interop layer won’t store a JPEG directly, you first have to convert to a raw bitmap.  On the other side, GDAL turns that bitmap back into a JPEG.  Again, this is ~125x slower than a simple file copy.

In the end, when one tries to convert 13TB using a single box, one makes compromises.  I ended up dropping GDAL from the equation (favoring lower level libraries), and kept the 512×512 pixel internal tile size.  My conversion code is now 250x faster, and operates at near file-copy speed.  Viola.

Upgrading SQL’08 to SQL’12 for better Geography support

Users of SQL Server 2008′s native Geography type were a little disappointed when try tried to use geometries such as this one… MULTIPOINT (55 55, -55 55, -55 -55).  SQL Server 2008 responds with the error:

A .NET Framework error occurred during execution of user-defined routine or aggregate “geography”: 

Microsoft.SqlServer.Types.GLArgumentException: 24205: The specified input does not represent a valid geography instance because it exceeds a single hemisphere. Each geography instance must fit inside a single hemisphere. A common reason for this error is that a polygon has the wrong ring orientation. To create a larger than hemisphere geography instance, upgrade the version of SQL Server and change the database compatibility level to at least 110.

If those directions are cryptic, the solution is pretty easy.  

First, back up all your databases, in case something goes wrong.  Second, upgrade your SQL 2008 instance to SQL 2012 (you may have to update your service packs on 2008 first).  Finally, for each database, go into SQL Server Management Studio, and right-click your database to get its properties.  Under Options, change the Compatilibity level to “SQL Server 2012″.

Image

That should be enough to fix these Geography errors.  You can test your success with a simple query such as SELECT geography::STGeomFromText(‘MULTIPOINT (55 55, -55 55, -55 -55)’, 4326)

Disk / File Management in .NET is Lamentable

Its surprising how poor nearly everything in .NET is that has to do with file management.  Here’s two quick examples:

Directory.GetFiles() / EnumerateFiles() / GetDirectories() / EnumerateDirectories()

The methods are the fast replacement methods for the oh-so-slow recursive search technique.  The problem?  If they encounter a folder they don’t have permission for they throw a UnauthorizedAccessException during MoveNext().  In other words, the first folder it can’t access, you cannot continue.  At all.

File.Copy()

This one’s easy.  You cannot cancel File.Copy(), nor get any progress reporting.  If you want those things, you have to do a stream copy or pInvoke the Win32 API.