On Cartographic Symbology and the GIS Software Business

I began my GIS career using ArcIMS.  These were ancient times… in order to specify “this line should be red”, one had to know ArcIMS’s proprietary XML.  It was ghastly then, and it’s ghastly now.  Using Google Earth?  The cartography is represented by KML.  Using TileMill?  Your CartoCSS is translated to XML behind the scenes.  Using ESRI?  At best, your symbology is stored as JSON.  QGIS?  QML.  MapServer?  Arcane plain text.  GeoServer?  SLD.

This is not meant to be a comparison of data formats, but of the breadth of different cartographic styling languages.

A raster in any of these programs will look the same.  With rasters, the software doesn’t matter – this is why OGC WMS was so successful.  Things did fall apart for WMS around the edges – OGC Styled Layer Descriptors (SLDs) were seldom used, and that style specification never really gained traction.  Seldom did a client really need to supply alternative cartography to a WMS.  The idea of a WFS server passing an SLD to a client as rendering instructions would be great, but it’s something I’ve never seen real-world implementations of.

Hence vector styling has remained the wild west.  ESRI recently said they’d use a “format based on existing community specifications” for vector tile maps.  Presumably, that means CartoCSS or some variant.  The question looms: “can I use my ArcMap symbology for ESRI vector tiles?” [It’s worth plugging Arc2Earth’s TileMill Connect feature here.]  The opposite is also true.  It’s become simple to export OpenStreetMap data into ESRI-friendly formats.  Nevertheless, it will look terrible out of the box, you’ll have a devil of a time finding a cartographer to work on it, and it’s impossible for ESRI’s rendering engine to match OSM’s / Mapnik’s rendering engine.

We are blessed in GIS, until the words cartographer and style trigger the cringe-worthy vector rasterization engine.  Nevertheless, this world is upon us.  Cloud resources such as OpenStreetMap and Twitter are defining the new worlds of cartography.  MapD’s Twitter Demo exemplifies how big data requires new types of rasterization engines.  Recently MapBox has shifted from a server-side Mapnik rendering engine to a client-side MapBox GL, no doubt vastly reducing their storage and processing overhead.

Those of us building new GIS applications – even the mundane – should start by worrying about the cartography.  Data set support, application capabilities, application performance, cartographic costs, data storage sizes, and interoperability with other programs are just a few of the critical reasons why having good style is important.

Windows Search and Image GPS Metadata

Windows Search is a feature that on paper is fantastic but that completely fails in its default implementation.  I won’t wax poetic on what Windows Search claims to do, but it’s an amazing set of features given that nobody seems to be able to find anything with it.  However, if you fiddle around with your settings and have a concrete goal, things get better.

I can instantly find pictures from my iPhone by querying for “Apple *.jpg”.  This search utilizes the full-text index; a more precise search could have read “System.Photo.CameraManufacturer:Apple *.jpg”.  Herein lies the first challenge of Windows Search: for non-text search, you usually need to know the name of the field you’re looking for.

A little digging reveals that image location data is stored as System.GPS.Latitude and System.GPS.Longitude.  Sweet!  Type “System.GPS.Latitude:>0” in your search box and prepare for disappointment.  There are a number of issues at hand here.  One of these issues is the format of the data, which is not the decimal you expect.  It’s actually a “computed property”, and there’s a lot of detail there, which I will skip over.
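
To make the format issue concrete, here is a minimal sketch of the conversion a search client would need.  It assumes the latitude arrives as a (degrees, minutes, seconds) triple with the hemisphere held in a separate reference value (System.GPS.LatitudeRef, as I understand the property docs); the function name and sample values are hypothetical.

```python
from typing import Sequence


def dms_to_decimal(dms: Sequence[float], ref: str) -> float:
    """Convert (degrees, minutes, seconds) plus a hemisphere letter
    into signed decimal degrees."""
    if len(dms) != 3:
        raise ValueError("expected (degrees, minutes, seconds)")
    degrees, minutes, seconds = dms
    value = degrees + minutes / 60.0 + seconds / 3600.0
    ref = ref.strip().upper()
    if ref not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere reference: {ref!r}")
    # South and West are negative in signed decimal degrees.
    return -value if ref in ("S", "W") else value


if __name__ == "__main__":
    # Hypothetical values, shaped the way photo GPS metadata is assumed to be.
    print(dms_to_decimal((38.0, 53.0, 23.0), "N"))
    print(dms_to_decimal((77.0, 0.0, 32.0), "W"))
```

A query like “latitude greater than zero” only makes sense after a conversion of this kind, which is part of why the raw property is awkward to search.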

The bigger issue is that latitude and longitude simply aren’t being indexed.

If the property is indexed so that it is searchable: This means it is included in the Windows Search property store (denoted by isColumn = "true" in the property description schema) or available for full text searches (inInvertedIndex = "true")

Referring to the System.GPS.Latitude property description, isColumn and inInvertedIndex are both false.  I’m not yet aware how one might change these settings, but I’ll post again if I have any luck.


On Windows 8.1 and Windows Server 2012 R2, there’s a System.GPS.LatitudeDecimal property, which appears to be searchable by default.  Unfortunately, it appears that only Panoramic (.pano) files are associated with this property.  Prop.exe is a great tool for further exploring the Windows Property System.

Kicking the tires of TileMill’s support for File Geodatabases

Back in April, MapBox announced that TileMill now supports ESRI File Geodatabases.  The support appears to come via GDAL’s integration of Even Rouault’s work to reverse engineer the FGDB format.

When I first looked at Even’s work, there was no support for reading the spatial indexing files of a FGDB.  Of course, without spatial indexing, large data sets would perform quite poorly.  It’s worth noting that Even’s project now supports spatial indexing, but GDAL 1.11 uses the older version.  The current latest TileMill dev build to include an installer – TileMill-v0.10.1-291 – should similarly lack spatial indexing.

To make my test exciting, then, I decided to use a large dataset.  I fired up ogr2ogr and created an FGDB dump of the full OpenStreetMap globe (OSM2PGSQL schema).  I tested the data in ArcMap and OGR and everything was quite zippy.  Upon attempting to load the FGDB in TileMill, it crashed.  I can’t say I didn’t expect this.

It’s worth noting that ESRI’s File Geodatabase API is free as in beer.  I think Even’s work is fantastic for the community, but I’m not sure why MapBox didn’t use the other GDAL FGDB driver, the one built on that API.  Nevertheless, OSS marches on, and I expect we’ll see these recent features bubble their way up.  I look forward to seeing FGDB spatial-indexing support hit TileMill, as I believe the idea has real legs.

Plasio, king content, and the browser

Gary Bernhardt’s The Birth and Death of JavaScript presents an interesting vision of the future – where web apps are cross-compiled into JavaScript / asm.js.  Closer to home, Howard Butler just posted about his plasio project.  It’s fantastic stuff – lidar in the browser – built on Emscripten + WebGL.  Cesium is another whiz-bang WebGL GIS application – but contrary to Bernhardt’s vision of the future – Cesium is coded in JavaScript, no cross-compiler necessary.  Still, Bernhardt is right about the death of JavaScript: the value of these apps is not in the language they were coded in.

Immortal sage Jim Morrison once said “Whoever controls the media, controls the mind.”  I prefer “whoever controls the medium controls the mind.”  Cesium and Plasio shine in their presentation of 3D datasets, but there’s a subtle undertone here: these web maps were designed for certain browsers.  Plasio goes so far as to use Chrome’s NaCl processing – eschewing web development norms.  Repeating myself, Bernhardt is right.  Even for web apps, JavaScript doesn’t rule the show.  Content is still king.



Thus if content is king, there must be a queen.  In web-based GIS, the value of content is limited by its visualization medium – the browser.  Bill Gates’s oft-quoted saying is undercut by IE’s slow adoption of emerging browser technologies.  Plasio’s brave new world where WebGL and asm.js are required features is already upon us.  Programming language may be becoming an implementation detail, but browser choice is not.

Browsers are becoming the medium in which content – including web-based GIS content – is being delivered.  Nevertheless, “web” is a loose term.  Application development platforms – such as the Qt Project – have embedded browsers into their core capabilities.  PhoneGap has swept the mobile world with its embedded browser technology.  Even GIS applications such as TileMill are being built with the Chromium Embedded Framework.  As technologists, our ability to control the browser directly impacts our ability to create compelling content.

The value in our web-based GIS is not in the language it was created in; it is in content dissemination and visualization.  As we attempt to integrate better content in more compelling ways, we must re-examine its relationship with the stand-alone “browser”, and attempt to better control the medium.

The Go Programming Language

A few days ago, I referenced Imposm 3 and joked that developers who use the Go programming language must be hipsters.  

Yet, I’m close to being a hipster myself.  I’m variously obsessed with libuv, ZeroMQ, the C10k problem, and all things threading minus locking.  I deliberately write many .NET classes akin to JavaBeans, disregarding elements of object-oriented design in favor of trivial serialization, trivial cloning, and therefore easy distributed processing.  I was a crack shot JavaScript programmer long before it was cool and my obsession with reflection and auto-generated code goes beyond healthy.  In short, people pay me to write .NET code, but I long for something more.

Lo and behold, I discovered something truly lovable in Go.  It’s simple, type-safe, and garbage collected.  Lightweight threading constructs (goroutines) are the eponymous, standout feature.  It’s a fast, capable alternative to C/C++.  The compiler is fast and build tools include package management, making statically linked native binaries totally sensible.  The syntax is succinct, and it’s relatively easy to call existing C/C++ libraries.

GIS packages are beginning to show up in Go, even if they’re mostly C/C++ wrappers, e.g. GoGEOS and go-mapnik.

Geospatial and the Entity Framework: Half Full, Half Empty, or Wrong Sized ORM?

In late 2004, Ted Neward famously called Object-Relational-Mapping (ORM) the Vietnam of Computer Science.  Recently I switched to .NET 4.5, hoping to reap the benefits of LINQ-to-Entities’ support for the spatial datatypes.  For SQL Server, this works every bit as well as the Entity Framework does in the first place – great for some databases, a hassle for others (particularly legacy databases).

When LINQ-to-SQL first came out – things didn’t really work too well for spatial.  Back then, it took a little modification of the generated SQL queries before things could get rolling using WKT.  Few people manage their spatial objects as WKT, of course, so you sprinkled some conversion code into your DAL.  Nothing worked out of the box, but the solutions were clear and made sense.

With the Entity Framework’s new spatial support in the System.Data.Spatial namespace, did things improve?  They certainly did if you’re just shuffling geometries from the web to SQL Server.  But what about people who do real work?  Their applications were all built using geometries from Vertesaur, DotSpatial, SharpMap, or NTS.  So we’re still looking at conversion, most likely via WKT.  Beyond that letdown, how is the database support?  I personally ran into a lack of native DbGeometry support when using SQLite.  I wouldn’t have much cared if it were serialized to WKB or WKT, as long as something worked out of the box.
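
For readers who haven’t had to write it, the WKT conversion in question looks roughly like the sketch below.  It is in Python rather than .NET for brevity, handles only 2D points, and the Point class is a hypothetical stand-in for whatever geometry type an application already uses.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """Hypothetical stand-in for an application's own geometry type."""
    x: float
    y: float


# Matches only 2D points, e.g. "POINT (-77.0089 38.8897)".
_POINT_RE = re.compile(r"^\s*POINT\s*\(\s*(\S+)\s+(\S+)\s*\)\s*$", re.IGNORECASE)


def to_wkt(p: Point) -> str:
    # repr() keeps full float precision, so a round trip is lossless.
    return f"POINT ({p.x!r} {p.y!r})"


def from_wkt(text: str) -> Point:
    m = _POINT_RE.match(text)
    if m is None:
        raise ValueError(f"not a 2D WKT point: {text!r}")
    return Point(float(m.group(1)), float(m.group(2)))


if __name__ == "__main__":
    original = Point(-77.0089, 38.8897)
    text = to_wkt(original)
    print(text)
    print(from_wkt(text) == original)
```

Every geometry library ends up with a pair of functions like these at its boundary, which is exactly the code one hopes an ORM would make unnecessary.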

The plain truth is, it’s often easier to do things yourself than to learn the weird things other people do.  So despite some great use cases, the new geospatial support in .NET 4.5, for me, is the wrong sized glass.  This GIS-specific realization mirrors ORM’s issues in general.  Synapses firing, my brain dug up an old Sam Saffron / Marc Gravell project called Dapper.  Dapper has been called a “micro-ORM”; less of a ground assault and more of a smart bomb – you still manage your ADO connections and write your own SQL, but it does fast binding of objects to query parameters and results.
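
The micro-ORM idea is small enough to sketch.  What follows is not Dapper’s API; it is a Python analogy using the standard sqlite3 module, with hypothetical helper names (execute, query) and a hypothetical Place class whose geometry is assumed to be stored as WKT text.  The SQL stays hand-written; the helpers only bind objects to parameters and rows back to objects.

```python
import sqlite3
from dataclasses import asdict, dataclass, fields
from typing import Iterator, Optional, Type, TypeVar

T = TypeVar("T")


@dataclass
class Place:
    """Plain object; geometry travels as WKT text (an assumption of this sketch)."""
    name: str
    wkt: str


def execute(conn: sqlite3.Connection, sql: str, obj: object) -> None:
    # Bind the dataclass fields to the :named parameters in hand-written SQL.
    conn.execute(sql, asdict(obj))


def query(conn: sqlite3.Connection, cls: Type[T], sql: str,
          params: Optional[dict] = None) -> Iterator[T]:
    # Map result columns onto dataclass fields by name, ignoring extras.
    cur = conn.execute(sql, params or {})
    columns = [d[0] for d in cur.description]
    wanted = {f.name for f in fields(cls)}
    for row in cur:
        yield cls(**{c: v for c, v in zip(columns, row) if c in wanted})


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE place (name TEXT, wkt TEXT)")
    execute(conn, "INSERT INTO place (name, wkt) VALUES (:name, :wkt)",
            Place("Capitol", "POINT (-77.0089 38.8897)"))
    for place in query(conn, Place,
                       "SELECT name, wkt FROM place WHERE name = :name",
                       {"name": "Capitol"}):
        print(place)
```

Because the whole mapping layer fits on one screen, adding a geometry conversion step to it is a small, local change, which is the appeal of the approach.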

In the end, I moved to Dapper.  Its code-base was small enough for me to grok and hack geospatial support into in a few hours.  Writing SQL is a fair trade for control, particularly when you need control – geospatial data storage being a prime candidate.  It is great to generate object models using the Entity Framework; but I’ve grabbed my POCOs and switched to a smaller, easier to modify ORM with a more stable codebase.

Free, public base-map imagery data?

I’m looking for a decent, truly public, imagery base-map for offline use.

The OnEarth Global Mosaic (15m pan-sharpened pseudo-color Landsat 7) may be years old, but it’s the best I’ve found so far.  Unfortunately, the old download links appear to be dead, and I don’t imagine they’d appreciate me scraping their WMS.

Does anybody have a link to the entire (~1.3TB) OnEarth Global Mosaic dataset, or a recommendation for a prettier / newer / better data-set at 15-30m?