Displaying large amounts of data with Woosmap

Jean-Thomas Rouzin - Reading time: 10 min


One of the central challenges of our mapping stack is displaying a large number of points of sale while keeping response times acceptable and navigation smooth. It is a complex problem that requires control over the whole chain, from the storage of geographic data to rendering on the client.

Woosmap can display hundreds of thousands of points of sale on any interface, mobile or not. To achieve this, we integrated a “multi-scale” map rendering technology based on tiling.

This article defines that approach and explains how we implemented it to handle large volumes of data while still providing a good user experience.

Tiling Method

Tiling consists of cutting “large” geographic datasets into many rectangles, which can then be reassembled on demand on the client side. Today this method is used in many web-mapping applications, from Google Maps to Mapbox. Below is a basic schema of a raster tile, that is, an image tile.

For this specific tile, the image is pre-generated on the server: there is no rendering delay, and it is sent to the client immediately.

Tiled Maps

Tiling becomes necessary when the size of a vector or raster layer degrades map navigation: loading takes too long and panning is no longer smooth. Tiles make navigation faster and generally make it easier to find the information you want on a map. There are several arguments in favour of implementing tiles:

Cache efficiently on the client: if you download tiles of Paris to view a map of La Defense your browser can make use of those same tiles from cache instead of downloading them again when showing neighbouring areas.
Load progressively: the center of the map will load before the outer edges, letting you pan or zoom into a particular spot even if tiles at the edges of your map view haven’t finished loading.
Simple to use: the coordinate scheme describing map tiles is simple, making it easy to implement integrating technologies on the server, web, desktop, and mobile devices.
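
To illustrate how simple the coordinate scheme is, here is a minimal sketch of the widely used Web Mercator “z/x/y” convention, in which zoom level z divides the world into 2^z by 2^z tiles. The function name is ours, and we assume this is the scheme in use, as it is for Google Maps style tiles.

```python
import math
from typing import Tuple


def latlon_to_tile(lat_deg: float, lon_deg: float, zoom: int) -> Tuple[int, int]:
    """Return the (x, y) index of the Web Mercator tile containing a point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or latitudes near the poles stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


# Central London at zoom 10.
print(latlon_to_tile(51.5074, -0.1278, 10))
```

Because a tile is fully identified by three integers, the same key can be used in a URL, a cache entry or a file name on any platform.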

Originally, the term Tiled Maps referred primarily to images (rasterized tiles) pre-built on the server. In the past several years, a new data storage format called “vector tiles” has gained popularity. Below is a brief introduction to both of these tiling methods.

Raster Tiles

A raster image is made of pixels, each with its own color, arranged to display an image. Raster tiles are simply image tiles representing the map data. Most web mapping technologies are raster based. Their maps consist of many tiles organized in a pyramidal scheme. Such tiles load quickly because, most of the time, they are already rendered on the server.

Vector Tiles

Much like raster tiles, vector tiles are simply vector representations of geographic data whereby vector features are clipped to the boundary of each tile. They are the vector data equivalent of image tiles for web mapping, applying the strengths of tiling – developed for caching, scaling and serving map imagery rapidly – to vector data.

The idea behind vector tiles is that it is more efficient to keep data styling separate from the data coordinates and attributes. The client can use a predefined set of styling rules to draw tiles of raw vector coordinates and attribute data sent by the server. This allows data to be restyled on the fly, something rasterized tiles cannot do. Vector tiles have several advantages over fully rendered image tiles:

On-Demand Styling: as vectors, tiles can be styled when requested, allowing many map styles.
Small Size: vector tiles are really small, enabling fast map loads and efficient caching.
Client Resolution: raster tiles are pre-rendered at what is assumed to be a normal screen resolution. Vector tiles are delivered to the client device so the shape rendering appears as clear as the screen resolution.

Benchmark of Vector vs Raster

Test Case

First of all, it is difficult to benchmark vector tiles against raster tiles: the features available with each method are not equivalent, and the JavaScript API used for rendering (WebGL or not) significantly influences the results. Our test case was the following:

Dataset of 85K locations
Small data attributes (e.g. id, name and address fields)
Screen resolution of 2880px x 1800px (MacBook Pro 15” Retina)
Map centered on London
No use of WebGL

Aggregated Tile Sizes

The following diagram shows the aggregated tile sizes of the vector map versus the raster map. Because the vector data carries attribute information (needed for styling, for example) that the raster does not, this chart should be read with caution: the result depends on your data attributes.

In our case, we use the Google Maps DataLayer to display the vector tiles. As this diagram shows, performance degrades as markers accumulate. Beyond 1,000 markers coming from vector tiles, the web map becomes unusable.

Given this, and having decided not to use WebGL for browser compatibility reasons, we developed a hybrid tiling technology that combines both approaches, raster and vector. The idea is to take the best of both worlds to better meet users’ needs.

Woosmap Implementation

Multi-Scale Rendering

We implemented a combination of raster tiles for the zoomed-out levels, where many points are visible at once, and vector tiles for the more detailed levels. This approach keeps the map fast at every scale while preserving smooth navigation and a crisp rendering on high-resolution screens.

To move from one to the other, we define a zoom level as the switch point. Since tiled vector rendering happens on the client, it is relatively simple to offer a good user experience at that level. But for the levels above it, where raster rendering takes over, it is a different story!
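
The switch point can be sketched as a single comparison. This is only an illustration: the threshold value 12 and the function name are hypothetical, and we assume zoom numbers grow with the level of detail, as they do in Google Maps.

```python
SWITCH_ZOOM = 12  # hypothetical threshold, not the value used by Woosmap


def tile_kind(zoom: int, switch_zoom: int = SWITCH_ZOOM) -> str:
    """Pick the tile type for a zoom level (larger zoom = more detail)."""
    # Zoomed out: many points per tile, so serve a pre-rendered image.
    # Zoomed in: few points per tile, so let the client draw vectors.
    return "vector" if zoom >= switch_zoom else "raster"


for z in (5, 11, 12, 16):
    print(z, tile_kind(z))
```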

Interactivity

For a fast web mapping experience, interactivity is essential. A typical map can contain thousands of features, all of which must be usable on a variety of devices. The UTFGrid specification defines a mechanism for delivering interactive data (a tooltip, for example) to a map interface in a progressive and efficient way, including on older browsers and modern mobile devices.

In the following jsFiddle example, you can get information (the identifier on hover, the name on click) about each point even though the map uses raster images. This is made possible by UTF Grid tile support.

This technology was developed by Mapbox and is open source. The specification is available on their GitHub.

Woosmap Technology Stack

We built an architecture based on open standards (Mapnik and UTF Grid) and proprietary Woosmap components. Here is the schema of our tiling architecture.
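
As an illustration of how a client resolves a hover or click against a UTFGrid tile, here is a minimal sketch of the lookup described in the specification: each character of the grid encodes an index into the keys array, and the key points to the feature data. The sample grid, the store names and the function names are invented for the example; real grids are usually 64 by 64 characters for a 256 pixel tile.

```python
from typing import Any, Dict, Optional


def decode_utfgrid_char(char: str) -> int:
    """Turn one grid character back into an index in the `keys` array."""
    code = ord(char)
    if code >= 93:  # the encoder skipped the backslash (code 92)
        code -= 1
    if code >= 35:  # the encoder skipped the double quote (code 34)
        code -= 1
    return code - 32


def feature_at(utfgrid: Dict[str, Any], px: int, py: int,
               tile_size: int = 256) -> Optional[Dict[str, Any]]:
    """Return the data of the feature under pixel (px, py), or None."""
    rows = utfgrid["grid"]
    resolution = tile_size // len(rows)  # pixels covered by one grid cell
    char = rows[py // resolution][px // resolution]
    key = utfgrid["keys"][decode_utfgrid_char(char)]
    return utfgrid["data"].get(key) if key else None


# Tiny hypothetical 2x2 grid: " " is empty, "!" is keys[1], "#" is keys[2].
sample = {
    "grid": [" !", "# "],
    "keys": ["", "store_1", "store_2"],
    "data": {
        "store_1": {"id": "store_1", "name": "Store A"},
        "store_2": {"id": "store_2", "name": "Store B"},
    },
}

print(feature_at(sample, 200, 10))
print(feature_at(sample, 10, 10))
```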

User Interface

Users interact with your mapping application primarily through a JavaScript library that listens to user events, requests tiles from the map server, assembles tiles in the viewport, and draws additional elements on the map, such as popups, markers, and vector shapes. We provide a small JavaScript API to display your data over Google Maps and to implement essential features, like search by address and get directions from user location. For example, the TiledView class enables the use of the raster tiles (see official doc). Our WebApp natively supports the hybrid tiling system.

Tile Cache on CDN

A tile cache is a server that sits between the browser and the map server. It checks to see if a requested map tile is already hanging around in a cache somewhere, where it can be served up quickly to short-circuit the call to the map server. If the map tile has not been generated, the tile cache gets it from the map server and saves it to speed up subsequent requests. In our configuration, we’re using a Content Delivery Network as it allows us to manage tile cache as well as other traffic to end users. The principle of a CDN is quite simple: dispersing all of your static content across multiple servers geographically closer to your users will make your web pages load much faster.

We cache tiles on demand only, without pre-seeding, to keep the data as up to date as possible. However, our CDN instructs a visitor’s browser to cache files for two hours to prevent heavy loads from concurrent users. During this period, the browser loads the files from its local cache, which speeds up page loads.
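
The on-demand behaviour can be sketched as a small cache-aside class, together with the kind of HTTP header that asks a browser to keep a file for two hours. This is an in-memory illustration, not the CDN itself: the class, the function names and the server-side expiry are assumptions, and only the two-hour browser lifetime comes from our configuration.

```python
import time
from typing import Callable, Dict, Tuple

TileKey = Tuple[int, int, int]
BROWSER_TTL_SECONDS = 2 * 60 * 60  # two hours


class OnDemandTileCache:
    """Render a tile on the first request only, then serve the stored copy."""

    def __init__(self, render: Callable[[int, int, int], bytes],
                 ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._render = render    # stand-in for the call to the map server
        self._ttl = ttl_seconds  # hypothetical expiry of a cached tile
        self._clock = clock
        self._store: Dict[TileKey, Tuple[float, bytes]] = {}

    def get(self, z: int, x: int, y: int) -> bytes:
        key = (z, x, y)
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # cache hit: the map server is not called
        body = self._render(z, x, y)  # cache miss or expired entry
        self._store[key] = (now, body)
        return body


def browser_cache_headers(max_age: int = BROWSER_TTL_SECONDS) -> Dict[str, str]:
    """Response header telling browsers how long they may reuse the file."""
    return {"Cache-Control": f"public, max-age={max_age}"}


print(browser_cache_headers())
```

Nothing is stored until a tile is actually requested, which is what keeps rarely viewed areas from going stale in the cache.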

Map Server

Basically, the map server component takes geospatial data as input and renders graphical output. In our case, it produces a series of map tiles: uniformly sized graphic (raster) and JSON (vector) files that are served to the browser and assembled there into the displayed map.

Raster Tiler

For raster tiling, we use Mapnik through its Python bindings (Mapnik is originally written in C++). It is an open source toolkit for developing mapping applications that renders beautifully, has a developer-friendly interface and offers strong performance. The toolkit is also aimed primarily at web-based development, and it has a large developer community.
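
Before a raster tile can be rendered, the tiler has to know which area of the world it covers. The sketch below computes that extent for a z/x/y tile, assuming the map uses the Web Mercator projection (EPSG:3857) like Google Maps. It deliberately does not import Mapnik; with the Python bindings, a box like this would typically be handed to the map object before rendering. The function name is ours.

```python
import math
from typing import Tuple

# Half the width of the world in Web Mercator metres (EPSG:3857).
ORIGIN_SHIFT = math.pi * 6378137.0


def tile_bounds_3857(z: int, x: int, y: int) -> Tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) of a tile in Web Mercator metres."""
    size = 2.0 * ORIGIN_SHIFT / (2 ** z)  # side length of one tile at zoom z
    min_x = -ORIGIN_SHIFT + x * size
    max_y = ORIGIN_SHIFT - y * size       # tile rows are counted from the top
    return (min_x, max_y - size, min_x + size, max_y)


# Extent of the zoom 10 tile that contains central London.
print(tile_bounds_3857(10, 511, 340))
```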

Homemade Vector Tiler

For the vector tiles, we could use Mapnik, as it supports this feature natively (see this implementation of the Mapbox Vector Tile specification for Mapnik), but our data is lightweight (point geometry only) and we don’t need all the complexity of this spec. That’s why we developed our own custom vector tile server. For now, we do not use a dedicated encoding to transport the vector data from the map server to the client. That could be a great improvement, so we are seriously looking at implementing the Google Protobuf data interchange format.
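
For point-only data, a vector tiler can be very small: assign each point to the tile that contains it and serialize each group as JSON. The sketch below shows the idea under that assumption. The JSON layout, the field names and the function names are hypothetical and are not the format served by Woosmap.

```python
import json
import math
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple

TileKey = Tuple[int, int, int]


def latlon_to_tile(lat_deg: float, lon_deg: float, zoom: int) -> Tuple[int, int]:
    """Index of the Web Mercator tile containing a point."""
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat_deg))) / math.pi) / 2.0 * n)
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def build_point_tiles(stores: Iterable[Dict[str, Any]], zoom: int) -> Dict[TileKey, str]:
    """Group stores by tile and return one JSON document per non-empty tile."""
    groups: Dict[TileKey, List[Dict[str, Any]]] = defaultdict(list)
    for store in stores:
        x, y = latlon_to_tile(store["lat"], store["lng"], zoom)
        groups[(zoom, x, y)].append(store)
    return {key: json.dumps({"features": feats}) for key, feats in groups.items()}


stores = [
    {"id": "a", "name": "Store A", "lat": 51.5074, "lng": -0.1278},
    {"id": "b", "name": "Store B", "lat": 51.5100, "lng": -0.1200},
    {"id": "c", "name": "Store C", "lat": 48.8566, "lng": 2.3522},
]
tiles = build_point_tiles(stores, zoom=10)
for key in sorted(tiles):
    print(key, tiles[key])
```

Plain JSON is easy to produce and to read in the browser, at the cost of larger payloads than a binary encoding would give.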

Geospatial Data

All the geographic data we manage is stored in a spatial database, which provides a geometry type and functions that operate on it. This lets us write SQL queries with spatial predicates such as “within two miles of this latitude and longitude”, which is especially useful for guiding our users to the displayed places.
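
To make the “within two miles” predicate concrete, here is a plain Python sketch of what such a query computes, using the haversine great-circle distance. In a spatial database the same question is answered by a built-in predicate backed by a spatial index rather than by scanning every row; the function names and the sample stores are invented for the example.

```python
import math
from typing import Any, Dict, Iterable, List

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(stores: Iterable[Dict[str, Any]], lat: float, lon: float,
                  radius_miles: float) -> List[Dict[str, Any]]:
    """Stores no farther than radius_miles from (lat, lon), nearest first."""
    scored = [(haversine_miles(lat, lon, s["lat"], s["lng"]), s) for s in stores]
    return [s for d, s in sorted(scored, key=lambda pair: pair[0]) if d <= radius_miles]


stores = [
    {"id": "near", "lat": 51.5155, "lng": -0.1410},
    {"id": "far", "lat": 48.8566, "lng": 2.3522},
]
print(within_radius(stores, 51.5074, -0.1278, radius_miles=2.0))
```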

Final Thoughts

Our platform can dynamically serve many concurrent users displaying very large geographic datasets. The development of a hybrid tiling strategy and the use of a content delivery network make us comfortable supporting heavy loads.

We are continuously working to improve the user experience and need to explore certain topics in greater detail. For instance, the Google ProtoBuf data interchange format could be adapted to our homemade tiler to increase the performance. Also, the use of a WebGL technology, like the CanvasLayer Utility for Google Maps, on the client side would be a great improvement for a smoother navigation.

If you have any questions about the content or the process described, please don’t hesitate to reach out to me through the contact page.

Useful Links

Take control of your maps, an old but great introduction to maps for developers
Why tiled maps, teachers from The Pennsylvania State University
Mapnik, the indispensable map server toolkit
Mastering the interactivity of your rasterized maps with UTF Grid