Machine Learning to Explore Geographical Data

Self-Organising Maps (SOMs)

Exploratory data analysis methods such as Self-Organising Maps (SOMs) can provide an alternative way to analyse spatial data, especially when the data is multi-dimensional.

1 What are SOMs?

Self-Organising Maps (SOMs) are a type of unsupervised Artificial Neural Network (ANN). They were developed by Teuvo Kohonen in the early 1980s and are mostly used for clustering, visualisation and data exploration. SOMs reduce n-dimensional data and display it on a two-dimensional map, where similar data is placed into the same grid cells, hereinafter referred to as neurons or nodes.

2 How do SOMs work?

Imagine 1000 people in a (big) room. We define a number of attributes (e.g. gender, age, height, income) and ask the people in the room to move closer to the people who are most similar to them according to all these attributes. After a while, everyone in the room is surrounded by people who share similar attribute values. This configuration is an example of a two-dimensional representation of multi-dimensional data points. Of course, the SOM algorithm is slightly more complicated. For a more detailed yet easy-to-understand explanation, click here.
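To make the room analogy a little more concrete, here is a minimal sketch of the basic SOM training loop in plain Python. It is not the code used in the study: the function names, the linear decay of the learning rate and neighbourhood radius, and the rectangular grid are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(weights, sample):
    """Return the grid cell whose weight vector is closest to the sample."""
    return min(
        weights,
        key=lambda cell: sum((w - x) ** 2 for w, x in zip(weights[cell], sample)),
    )


def train_som(data, rows, cols, iterations, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    # Initialise every node with a copy of a randomly chosen sample.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    for t in range(iterations):
        remaining = 1.0 - t / iterations
        learning_rate = 0.5 * remaining                   # assumed linear decay
        radius = 0.5 + (max(rows, cols) / 2) * remaining  # neighbourhood shrinks
        sample = rng.choice(data)
        bmu = best_matching_unit(weights, sample)
        for (r, c), w in weights.items():
            # Nodes close to the winner on the grid are pulled harder.
            grid_dist_sq = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
            influence = math.exp(-grid_dist_sq / (2 * radius ** 2))
            for i, x in enumerate(sample):
                w[i] += learning_rate * influence * (x - w[i])
    return weights


# Hypothetical people described by two attributes already scaled to 0..1.
people = [[0.1, 0.2], [0.15, 0.25], [0.9, 0.8], [0.85, 0.9]]
som = train_som(people, rows=3, cols=3, iterations=500)
```

Each step picks one sample, finds the node that resembles it most (the "best matching unit") and pulls that node and its grid neighbours towards the sample. This is what makes similar samples end up in nearby cells.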

3 An Example

SOMs offer insights that can’t be explored or displayed with linear indices. The following example provides an alternative look at deprivation patterns in Edinburgh, Scotland. The study uses the 2016 SIMD dataset containing 26 deprivation variables (excluding absolute measures), an overall rank and rankings for each of the 7 domains (income, employment, education, health, access to services, crime and housing). To closely monitor deprivation patterns, the study area was reduced to the 597 data zones in Edinburgh. The 26 variables were summed within each of the themes, and the SOM was then trained with seven variables representing the seven themes. After the SOM training, seven clusters were created using hierarchical clustering. All analysis was conducted in RStudio and a copy of the code can be found here.

3.1 Training the SOM

The process starts with the user defining the size, shape and topology of the SOM grid. These factors are determined by the number of observations and can be altered to reduce edge effects. Once they are selected, the SOM is trained to determine how many iterations are required. As the training iterations progress, the distance from each node’s weights to the samples represented by that node is reduced. Ideally, this distance should reach a minimum plateau. A plot of this distance against the iteration number shows the training progress over time. If the curve is still decreasing, more iterations are required. Once the plateau has been reached, further iterations do not improve the quality. Defining the size, shape and topology and training the SOM is an iterative process, and several combinations are tested before selecting the final parameters.
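The plateau check described above can be sketched in a few lines of Python. This is an illustrative assumption rather than the study's code: the helper names, the window of five iterations and the 1% tolerance are all hypothetical choices.

```python
import math


def mean_distance_to_closest_node(node_weights, data):
    """Average Euclidean distance from each sample to its closest node."""
    total = 0.0
    for sample in data:
        total += min(math.dist(sample, w) for w in node_weights)
    return total / len(data)


def has_plateaued(history, window=5, tolerance=0.01):
    """True if the last `window` steps improved the distance by less than
    `tolerance`, measured relative to the value `window` steps ago."""
    if len(history) <= window:
        return False          # not enough history to judge yet
    old, new = history[-window - 1], history[-1]
    if old == 0:
        return True           # already at zero distance, nothing to improve
    return (old - new) / old < tolerance
```

Recording `mean_distance_to_closest_node` after each training iteration gives the curve to inspect. While `has_plateaued` returns `False`, the curve is still falling and more iterations are probably worthwhile.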

3.2 Clustering

Clustering within SOMs was first carried out by Ritter & Kohonen (1989) when developing their “semantic bird maps”. Since then, it has become a widely used technique in the field of SOMs. In this study, the number of clusters was chosen based on the Within Clusters Sum of Squares (WCSS) metric, a rough indicator of the ideal number of clusters. In addition, the fact that the clusters are spatially contiguous within the SOM, and do not display divided clusters or islands, indicates that the clustering was successful. For this reason, a hierarchical clustering method with seven clusters was used.
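As a rough illustration of how WCSS and hierarchical clustering fit together, the sketch below clusters a handful of points and scores the result. It is an assumption-laden toy, not the study's R code: the source does not state which linkage was used, so complete linkage is chosen here arbitrarily, and the function names are hypothetical.

```python
def wcss(points, labels):
    """Within-cluster sum of squared distances to each cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        for point in members:
            total += sum((x - c) ** 2 for x, c in zip(point, centroid))
    return total


def agglomerate(points, k):
    """Naive complete-linkage hierarchical clustering, cut at k clusters."""
    def dist_sq(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def linkage(c1, c2):
        # Complete linkage: distance between the two farthest members.
        return max(dist_sq(points[i], points[j]) for i in c1 for j in c2)

    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        pairs = (
            (a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        a, b = min(pairs, key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]))
        merged = clusters.pop(b)      # b > a, so index a stays valid
        clusters[a].extend(merged)
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels
```

Applied to the trained node weights, computing `wcss(points, agglomerate(points, k))` for a range of `k` gives the curve whose "elbow" suggests a reasonable number of clusters.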

Using component plane analysis and a display of the results on a geographic map, the clusters were characterised to facilitate interpretation. Note that the cluster descriptions are not based purely on scientific knowledge; rather, they aim to tell a narrative backed by common knowledge about the city.

Characteristics of the seven clusters. The node background colours represent the seven clusters. The codes show the seven variable properties for each neuron, with larger symbols indicating higher disadvantage. They are displayed to visualise similarities and differences between adjacent neurons.


3.3 Cluster Characteristics


  • Cluster 1 (7%) | The Precariats — The Precariats is a cluster defined by very high disadvantage in the income and employment domains, low education and health issues, whilst access, crime and housing do not seem to be major issues. The Precariats are the poorest and most deprived cluster.
  • Cluster 2 (23%) | Rough Edinburgh — The inhabitants of Rough Edinburgh share similarities with The Precariats but are less vulnerable. However, they still score very low in the income, employment, education and health domains. Typical “Rough Edinburgh” districts are areas with social housing such as Dumbiedykes.
  • Cluster 3 (13%) | The Wealthy Commuters — The Wealthy Commuters are the cluster most disadvantaged by access – but by choice. Apart from the access domain, they show very low disadvantage across all the themes. They typically live on the outskirts or in suburban areas, where they are homeowners and commute to work every day.
  • Cluster 4 (41%) | The Waitrose Shoppers (Edinburgh Posh) — This cluster is defined by a uniformly low disadvantage across the seven themes. Typical middle-class families and professionals with high income and education, living in urban and suburban areas, belong to this cluster. Areas such as Marchmont and Stockbridge are typical of this cluster.
  • Cluster 5 (3%) | The Urban Intermediates — This cluster is defined by excellent access and intermediate characteristics across the income, employment, health and education domains. Notably, there is a relatively high crime rate and rather poor housing conditions.
  • Cluster 6 (1%) | The Crime Triangle (Edinburgh Nightlife) — The Crime Triangle is defined by very high crime and bad housing. It represents only a very small area in the city centre near Princes Street. Data zones within this cluster are not typical residential areas but rather areas where people congregate and where young people enjoy the nightlife, which explains the high crime rate per inhabitant.
  • Cluster 7 (12%) | The Hotchpotch — The Hotchpotch is characterised by very bad housing and excellent access. It encompasses areas in the city centre with a high proportion of students and presumably flat shares, but also more ethnically diverse areas near the city centre.


Finally, the clusters can be projected back onto a map to allow analysis of their distribution.



4 Scientific Sources

Kohonen, T. 1982. Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43, 59-69, 10.1007/BF00337288.

Kohonen, T. 2001. Self-Organizing Maps. 3rd ed. Berlin Heidelberg: Springer-Verlag. Available at: (Accessed March 29, 2018).

Skupin, A. & Agarwal, P. 2008. Introduction: What is a Self-Organizing Map? In Agarwal, P. & Skupin, A., eds. Self-Organising Maps. Chichester, UK: John Wiley & Sons, Ltd, 1–20, 10.1002/9780470021699.ch1.

London Underground – The Journey of Life

This interactive web map explores differences in life expectancy between London Underground stations. The output is a startling reminder of the importance of ‘place’ to people’s lives. It displays sharp contrasts in life expectancy on a small spatial scale. For example, Ladbroke Grove and Latimer Road on the Hammersmith and City line are separated by one stop (800 metres), but the average life expectancy is seven years higher in Ladbroke Grove.

My intention in designing this map was to create a memorable impression of the spatial inequalities along the routes travelled by Londoners each day. Within ArcGIS Online, I uploaded a dataset containing the average life expectancy from health clinics in London (1436 clinics in total) and used a spatial join to link each clinic to its closest Underground station (312 stations in total). Each station then displays the average life expectancy (male and female combined).
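The map itself was built with ArcGIS Online's spatial join, but the underlying idea can be sketched in plain Python for readers without access to that tool. Everything below is a hypothetical illustration: the function names, the input layout and the use of great-circle (haversine) distance are assumptions, and the coordinates in the usage example are made up.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def average_by_nearest_station(clinics, stations):
    """Assign each clinic to its closest station and average the values.

    clinics:  list of (lat, lon, life_expectancy)
    stations: dict mapping station name -> (lat, lon)
    """
    values = defaultdict(list)
    for lat, lon, life_expectancy in clinics:
        nearest = min(
            stations,
            key=lambda name: haversine_km(lat, lon, *stations[name]),
        )
        values[nearest].append(life_expectancy)
    # Stations with no clinic assigned do not appear in the result.
    return {name: sum(v) / len(v) for name, v in values.items()}


# Made-up coordinates and values, purely to show the call.
stations = {"Station A": (51.50, -0.10), "Station B": (51.60, -0.20)}
clinics = [(51.501, -0.101, 80.0), (51.499, -0.099, 82.0), (51.601, -0.199, 78.0)]
averages = average_by_nearest_station(clinics, stations)
```

With 1436 clinics and 312 stations, this brute-force comparison is still cheap. A spatial index would only become worthwhile for much larger datasets.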

Webmap URL: Click here

Distributed GIS

What are Distributed Geographical Information Systems (GIS)? Distributed GIS are systems that do not have all their components in the same physical location. Here is a diagram, in keeping with the old saying, “A picture is worth a thousand words”.




Peng, Z. and Tsou, M. (2003). Internet GIS: Distributed Geographic Information Services for the Internet and Wireless Network. London: Wiley.

Young, C. (2017). Distributed Geospatial Computing (DGC). In: Encyclopedia of GIS. 2nd ed. New York: Springer.

Icon Sources within the Diagram

Clip art images: Available at: [Accessed 23. Sept. 2017].

Globe image: Available at: [Accessed 23. Sept. 2017].