Mapping Tools for Developers

09 Sep


This is a great time to be a geodeveloper. There are more spatial data sets, geoprocessing tools, geo-enabled storage options, and mapping tools than ever.

Let’s start with storage. Not too long ago, geo developers had two choices: file formats or proprietary object-relational databases. Today there are production-ready open source object-relational databases such as PostgreSQL/PostGIS and MySQL; even mobile devices have lightweight databases with spatial capabilities, such as SQLite (through the SpatiaLite extension). In addition to traditional object-relational databases, NoSQL databases such as Cassandra, CouchDB, and MongoDB have spatial capabilities. Bigtable clones such as HBase can also store spatial data, and there is ongoing work on a spatial index to facilitate spatial queries and operations. Neo4j is a graph database that also handles spatial data. Finally, even full-text search engines such as ElasticSearch provide geospatial search capabilities.
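To make "spatial query" concrete, here is a minimal sketch of the most basic one, a bounding-box lookup, using only the SQLite module in Python's standard library. It deliberately uses plain SQL and an ordinary B-tree index, not a spatial extension or a true spatial index; the table name, column names, and sample coordinates are all made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical "places" table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (name TEXT, lon REAL, lat REAL)")
# A plain B-tree index; a real spatial database would use an R-tree or similar.
conn.execute("CREATE INDEX idx_places_lon_lat ON places (lon, lat)")
conn.executemany(
    "INSERT INTO places VALUES (?, ?, ?)",
    [
        ("Austin", -97.74, 30.27),
        ("Dallas", -96.80, 32.78),
        ("Denver", -104.99, 39.74),
    ],
)


def places_in_bbox(conn, min_lon, min_lat, max_lon, max_lat):
    """Return the names of places inside the bounding box, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM places "
        "WHERE lon BETWEEN ? AND ? AND lat BETWEEN ? AND ? "
        "ORDER BY name",
        (min_lon, max_lon, min_lat, max_lat),
    )
    return [row[0] for row in rows]


# Rough box around Texas: picks up Austin and Dallas, but not Denver.
print(places_in_bbox(conn, -100.0, 29.0, -95.0, 34.0))
```

The spatially enabled databases listed above go well beyond this, with geometry types, spatial indexes, and operations such as intersection and distance, but the query pattern is the same: ask for the features that fall inside a region.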

Manipulating spatial data and performing analysis used to be dominated by specialized proprietary Geographic Information Systems (GIS) desktop software. The geospatial software landscape has since expanded to include many open source desktop products such as QGIS, uDig, and gvSIG. While desktop products are typically used for spatial analysis or cartographic production, they also provide a quick way to visualize data and the results of API queries. Many open source desktop applications are built on standard geospatial libraries such as JTS (the Java Topology Suite) and GEOS, the C++ port of JTS. These spatial libraries also have bindings to popular scripting languages like Python and Ruby, which lets developers process geospatial data in their language of choice. For example, Shapely is a Python library, and RGeo and GeoRuby are Ruby geospatial libraries. Software for data extraction, translation, and loading (ETL) tasks is also available, both as open source and as proprietary software. GDAL/OGR is an open source geospatial ETL library and collection of utilities that works with most of the common raster and vector formats. FME (Feature Manipulation Engine) is a commercial product that can perform ETL on most geospatial formats.

Developers want to see their results on a map quickly, and there are many services that provide base maps for applications. Google Maps is the most popular map service for mashups, but OpenStreetMap and derivative services such as Cloudmade provide maps that use data gathered by volunteers. For some applications, custom base maps are needed, especially when certain features such as roads need to be de-emphasized. Both Google Maps and Cloudmade support changing map styles; however, there are also a number of ways to generate custom maps yourself. WMS (Web Map Service) is a common one. The main advantage of WMS servers is that they are interoperable: many mapping clients understand how to talk to a WMS to get a map. Their downsides are complexity of configuration, quality of output, and lack of scalability for high-volume sites. A fast and cartographically attractive alternative is Mapnik, one of the default rendering engines behind OpenStreetMap. Its main advantages are speed and high-quality output. It has a simpler XML-based configuration, but lacks a graphical interface to preview maps. A newer alternative is TileMill, which uses a CSS-like language to style maps. TileMill is built on top of Node.js (the server-side JavaScript engine) and includes a display that lets you see the map while editing styles. Finally, there are geospatial portals that allow you to import your data, perform analysis, and create maps. ArcGIS Online is a portal based on proprietary ESRI technology that allows users to build mashups in its environment. GeoCommons by GeoIQ provides similar capabilities and offers a wide range of user-contributed data.
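The interoperability of WMS comes from the fact that a map request is just an HTTP GET with a standard set of query parameters. As a rough sketch, assuming a WMS 1.1.1 server, a GetMap URL can be assembled like this. The server address and layer names below are placeholders, and the helper function is my own, not part of any library.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layers, bbox, width, height,
                     srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL.

    bbox is (minx, miny, maxx, maxy) in the units of the given SRS.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value asks the server for its default styles
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Placeholder server and layer names, for illustration only.
url = build_getmap_url(
    "http://example.com/wms",
    ["roads", "parks"],
    (-74.3, 40.4, -73.6, 41.0),
    512,
    512,
)
print(url)
```

Any client that can build a URL like this can ask any compliant server for a map image, which is why so many mapping clients support WMS out of the box.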

Maps have evolved from simple mashups of pushpins overlaid on street maps. They are now frequently used for interactive visualizations that integrate many types of data to tell a complex story. For example, the New York Times map of Hurricane Irene’s path along the Eastern Seaboard illustrates the effects and aftermath of the hurricane. Developers have several JavaScript libraries to choose from when building custom maps. OpenLayers is the largest and most flexible of the JavaScript map client libraries. Its strengths are that it can handle many different data formats and mapping services, has a large developer community, and has been integrated into web frameworks such as Drupal and Django. Polymaps is another map client library; it uses SVG to make interactive maps. Data can be attached to the graphic elements of the map to create fast interactive data displays. Recently, there has been a trend towards lightweight clients that do just the minimum, keeping the library small and quick to download. Leading this trend is Leaflet by Cloudmade, which provides a minimal framework for displaying map tiles and data. Even smaller than Leaflet is Modest Maps JS, a JavaScript port of the Modest Maps ActionScript library. Modest Maps JS in conjunction with Wax, a library of UI widgets, is a 28K download, making it the smallest of the client libraries.
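Much of what a tile-based client does comes down to working out which tiles cover the current view. As a small sketch, assuming the widely used Web Mercator "XYZ" tiling scheme (the one OpenStreetMap tiles follow), the tile containing a given longitude and latitude can be computed like this. The function name is my own.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) tile indices containing a lon/lat point.

    Assumes the Web Mercator XYZ scheme: 2**zoom tiles per side,
    with tile (0, 0) in the top-left (north-west) corner.
    """
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that edge values such as lon = 180 stay on the tile grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


# Tile containing a point in Manhattan at zoom level 12.
print(lonlat_to_tile(-73.99, 40.73, 12))
```

A client repeats this for the corners of the viewport, then requests every tile in the resulting x/y range at the current zoom level.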

 

Mapping Ecosystem

The growth of geospatial developer tools has been driven by the availability of spatial data. Collecting spatial data was once the domain of government agencies, but the widespread availability of consumer GPS on smartphones has created an explosion of spatial data generated through social media and check-in services. Transparency efforts at all levels of government have added to the growing amount of spatial data. You can store your own data in one of the solutions mentioned previously or host it on a service such as Google Fusion Tables; another alternative is to use a service that provides a consistent API to spatial data. A number of data providers, including Infochimps, offer spatial data, but when working with spatial data it is easy to overwhelm browser-based map clients with the volume of data. Schuyler Erle coined the term “red dot fever” to describe the situation where data markers obscure the map and any discernible patterns. Two ways to overcome data overload are clustering, which groups nearby points and lets outliers stand out, and aggregation, which decreases spatial resolution but reveals patterns. Infochimps provides the Summarizer tool to make data query results more usable by organizing data points into intelligent geographic clusters. Another advantage of the Infochimps Geo API is consistency across data sets through a unifying schema called the Infochimps Simple Schema (ICSS). ICSS is based on schema.org to provide a consistent and web-friendly way to access data. The mind map above is a first stab at organizing the ever-growing array of geospatial tools and data. It’s a work in progress and comments are welcome.
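To illustrate the aggregation approach to "red dot fever", here is a minimal sketch that bins points into a regular longitude/latitude grid and returns one count per occupied cell. This is a generic technique and not how the Infochimps Summarizer works internally; the function name and sample points are made up.

```python
import math
from collections import Counter


def aggregate_to_grid(points, cell_size):
    """Bin (lon, lat) points into square cells of cell_size degrees.

    Returns a list of (center_lon, center_lat, count) tuples,
    one per occupied cell, sorted by cell position.
    """
    counts = Counter()
    for lon, lat in points:
        col = math.floor(lon / cell_size)
        row = math.floor(lat / cell_size)
        counts[(col, row)] += 1
    return [
        ((col + 0.5) * cell_size, (row + 0.5) * cell_size, n)
        for (col, row), n in sorted(counts.items())
    ]


# Illustrative points: three around New York, one in Paris.
points = [(-73.99, 40.73), (-73.98, 40.74), (-73.20, 40.90), (2.35, 48.85)]
result = aggregate_to_grid(points, 1.0)
print(result)
```

Instead of four markers, the map client now draws two symbols sized or colored by count. Larger cells hide more detail but make regional patterns easier to see.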
