In the 21st century, ongoing rapid urbanization highlights the need for deeper insight into the social structure of cities. While work on this challenge can profit from abundant data sources, the complexity of the data is itself a challenge. In this paper we use diffusion maps, a manifold learning method, to discover hidden manifolds in the UK 2011 census data set. The census key statistics and quick statistics report 1450 different statistical features for each census output area. Here we focus primarily on the city of Bristol and the surrounding countryside, comprising 3490 of these output areas. Our analysis identifies the main variables that span the census responses and shows that university student density and poverty are the most important explanatory variables for variation in those responses.
Figure 1: Diffusion maps find effective variables describing city neighborhoods. Shown is the second most important variable characterizing the census in Bristol. This variable is essentially a nonlinear principal component, identified by a deterministic, non-parametric algorithm from a dataset containing 1450 statistics for each area. We interpret this variable as a proxy for economic deprivation. Highlighted in yellow are buildings with a high density of social housing.
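To make the idea of a "nonlinear principal component" concrete, the following is a minimal sketch of a generic diffusion map in NumPy. It is not the exact pipeline used in this paper: the Gaussian kernel, the median-distance bandwidth, the feature standardization, and the name `diffusion_map` are all assumptions chosen for illustration, and the input is small synthetic data rather than census statistics.

```python
import numpy as np


def diffusion_map(X, n_components=2, epsilon=None):
    """Illustrative diffusion map (hypothetical helper, not the paper's code).

    X has shape (n_areas, n_features). Returns the leading nontrivial
    eigenvalues and the corresponding diffusion coordinates.
    """
    X = np.asarray(X, dtype=float)

    # Standardize each feature so that no single statistic dominates.
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Z = (X - X.mean(axis=0)) / std

    # Pairwise squared Euclidean distances between areas.
    sq = np.sum(Z**2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)

    # Assumed bandwidth heuristic: median of the nonzero squared distances.
    if epsilon is None:
        epsilon = np.median(D2[D2 > 0])

    # Gaussian affinity between areas.
    K = np.exp(-D2 / epsilon)

    # The Markov matrix is P = D^{-1} K. We diagonalize its symmetric
    # conjugate S = D^{-1/2} K D^{-1/2}, which has the same eigenvalues.
    d = K.sum(axis=1)
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)

    # Sort eigenpairs by decreasing eigenvalue.
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]

    # Convert to right eigenvectors of P.
    psi = vecs / np.sqrt(d)[:, None]

    # Drop the trivial pair (eigenvalue 1, constant eigenvector).
    keep = slice(1, n_components + 1)
    return vals[keep], psi[:, keep] * vals[keep]


# Synthetic stand-in for the census table: two groups of 40 "areas"
# with 5 features each, offset from one another.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(40, 5)),
    rng.normal(5.0, 1.0, size=(40, 5)),
])
vals, coords = diffusion_map(X, n_components=2)
```

In this sketch each column of `coords` plays the role of one effective variable, ordered by eigenvalue. A map like the one in Figure 1 would correspond to coloring each area by its value in one such column.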