
G2. BUILDING FOOTPRINTS SOURCES

Mihyun Kim edited this page Oct 10, 2024 · 70 revisions

Introduction

This section provides information on:


Building footprint use in CCRP platforms

Building footprints form the basic building blocks for Colouring Cities platforms. Footprints allow CCRP open data to be captured, collated, verified and visualised. They can also be used to infer other building characteristics such as typology, height and land use, on their own or in conjunction with other attributes. To produce the highest quality open data possible, Colouring Cities platforms require access to the highest quality footprints available: regularly updated, comprehensive and geometrically precise. These are most commonly held by national mapping agencies. Where access to national mapping data is restricted, charged for, or not available, other open datasets will need to be used. In recent years open footprints have become more widely available owing to: the release, in a number of countries (mainly since 2015), of open property tax datasets; increased interest from OpenStreetMap in collating building footprint datasets; and open datasets now produced by Microsoft and Google using satellite imagery and AI, which provide block level footprints at global scale. This has radically changed the level of interest in granular building attribute data and significantly increased opportunities to understand and analyse stocks. It has also raised a number of security and privacy issues which now need to be addressed.


Privacy and security issues

Building level footprints are only used in CCRP platforms to capture, collate, verify and visualise non-personal open attribute data at building level. CCRP platforms do not aim to release open footprint datasets at building level. This is due to security and privacy concerns regarding the use of footprints to collect and visualise personal data relating to building occupants, e.g. relating to income or health. Colouring Cities national platforms are asked to keep building level footprints behind firewalls regardless of whether these are released by public bodies as open data. Data relating to the interior of buildings is also considered by the CCRP to be private. Visualisation at building level of, for example, energy ratings, energy use and heat loss is seen as a grey area requiring further discussion, with visualisation currently recommended at block rather than building level, again even where these data are publicly released at building/property level. Wider discussion regarding regulation of spatial data collection is promoted by the CCRP, especially owing to growing commercial interest in this area. (Concerns regarding microscale visualisations also extend to georeferenced point data at building level, though these are not used in CCRP platforms.)
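The recommendation to visualise sensitive attributes at block rather than building level can be illustrated with a minimal sketch. It assumes each record carries a hypothetical `block_id` and a numeric value (for example an energy rating score). The field names, the choice of the median and the `min_buildings` threshold are illustrative assumptions, not CCRP specifications.

```python
from collections import defaultdict
from statistics import median


def aggregate_to_blocks(records, min_buildings=3):
    """Summarise building-level values per block.

    records: iterable of (block_id, value) pairs (hypothetical structure).
    min_buildings: assumed threshold below which a block is withheld, so
    that a single building cannot be identified from its block summary.
    """
    values_by_block = defaultdict(list)
    for block_id, value in records:
        values_by_block[block_id].append(value)

    # Only blocks with enough buildings are published, each as one median.
    return {
        block_id: median(values)
        for block_id, values in values_by_block.items()
        if len(values) >= min_buildings
    }
```

Only the per-block summary would then be passed to the map layer; the building-level values stay on the server side.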


Computational generation of building footprint datasets/comparative analysis, papers and links

Papers and links providing information on building footprint generation, comparison and quality are collected below:


Footprint data sources for CCRP international platforms

Below is a brief summary of sources of footprint data used by Colouring Cities academic hosts in partner countries.

Current building footprints are an amalgamated set of polygons obtained from OpenStreetMap and Microsoft Bing Maps. Work and negotiations are currently underway to obtain authoritative building datasets from governmental and private custodians including Geoscape and Vexcel. For Colouring Australia's paper explaining issues with use of Microsoft/Bing footprints see here

Currently being discussed with Bahrain's national mapping agency Survey and Land Registration Bureau and Ministry of Works, Municipality Affairs and Urban Planning. -Esri representative office in GCC.

The Colouring London prototype has, since 2016, used Ordnance Survey MasterMap (OSMM) polygons as its sole source of footprint data. OSMM offers the highest quality, most comprehensive and most up-to-date footprints available for the UK. Updates are ongoing and available for integration. OSMM data are accessed from Ordnance Survey (OS), the UK's national mapping agency, under an OS Public Sector Geospatial Agreement signed by the Greater London Authority. Additional authorisation to use OSMM was also required from OS.

The agreement with OS requires that vectorised footprints, to which Crown copyright restrictions apply, must be protected from download. Footprints are therefore transformed into raster tiles, which facilitate data capture and visualisation but are not released. Coordinates are not permitted to be derived from footprints/polygons, though the release by OS since 2020 of Unique Property Reference Numbers (UPRNs) now enables coordinates to be freely accessed.
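As an illustration of how raster tiles are addressed, the sketch below converts a longitude/latitude pair to tile indices. It assumes the widely used XYZ ("slippy map") Web Mercator scheme; the source does not state which tiling scheme the platform uses, and the function name is hypothetical.

```python
import math


def lonlat_to_tile(lon_deg: float, lat_deg: float, zoom: int) -> tuple[int, int]:
    """Return the (x, y) index of the XYZ tile containing a WGS84 point.

    Assumes the standard Web Mercator tiling used by most web maps, where
    zoom level z has 2**z tiles along each axis and y grows southwards.
    """
    n = 2 ** zoom
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay in range.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y
```

A client requesting the tile at `(zoom, x, y)` receives only a rendered image, which is why tiling supports visualisation without exposing the underlying polygons.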

Each OS building feature is taken as a "building", so building_id <-> toid is 1 to 1. Commonsense buildings could span multiple features, or multiple buildings could be represented by a single feature. Buildings may have multiple Unique Property Reference Numbers (UPRNs), as when multiple properties, each with its own UPRN, sit within one building. UPRNs may also span several buildings (on a more complex site, covered by one parent UPRN).
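The relationships described above can be sketched as a small data model: one `toid` per `building_id`, and a many-to-many link between buildings and UPRNs. The class and function names are hypothetical and are not taken from the Colouring Cities codebase.

```python
from dataclasses import dataclass, field


@dataclass
class Building:
    building_id: int                          # platform identifier
    toid: str                                 # the single OS feature it maps to
    uprns: set = field(default_factory=set)   # zero or more property references


def ids_and_toids_are_one_to_one(buildings):
    """Check that no building_id and no toid occurs more than once."""
    ids = [b.building_id for b in buildings]
    toids = [b.toid for b in buildings]
    return len(set(ids)) == len(ids) and len(set(toids)) == len(toids)


def buildings_by_uprn(buildings):
    """Invert the link: map each UPRN to the set of building_ids it covers."""
    index = {}
    for b in buildings:
        for uprn in b.uprns:
            index.setdefault(uprn, set()).add(b.building_id)
    return index
```

A UPRN that maps to more than one `building_id` in the inverted index corresponds to the parent UPRN case on a complex site.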

The Colouring London prototype is now expanding to become Colouring Britain, with CCRP platforms engineered to be easily scaled. Unactivated OSMM footprints can be viewed on the Colouring London site by zooming out of London. It was originally envisaged that Colouring Cities code would be tested by individual cities and towns in Britain/UK, and that separate OSMM licences would be negotiated. OS has said that no further access to OSMM can be permitted, other than for possibly two other test sites.

Microsoft Bing footprints have been identified as too aggregated for use in data capture for most CCRP classes. Experiments are currently being made using the UK INSPIRE open land parcel dataset and OS Open Map Local, which is a more accurate equivalent of the Microsoft Bing footprints (i.e. at block level). Building level footprints will continue to be kept behind a firewall for security purposes until the wider implications of releasing building level footprint data are better understood. (Last updated 7.10.2023)

OS historical maps also offer access to historical footprints at c20 year increments from c1870. Vectorisation of footprints within historical maps is discussed further under 'Historical footprints' below.

For further information on footprint use in Colouring London please see here. (Last updated 7.10.2023)

COLOURING CANADA/MONTREAL

The geospatial data collected for building footprints come from various sources published by Natural Resources Canada and the City of Montreal. The primary semantic information was extracted from a city-run cadastral data portal, which provides georeferenced building lot shapefiles and building footprints for six boroughs in Montreal. To expand coverage, we incorporated satellite-based image segmentation data from Natural Resources Canada, covering the entire province. For higher accuracy, we generated individual lot-level buildings by segmenting the satellite-derived footprints using the city's cadastral parcel data. Building heights were then added using LiDAR data. After a spatial join operation, we cleaned the data by eliminating duplicate entries and removing polygon artifacts through area-based filters.
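The final cleaning step (duplicate removal and area-based filtering) might look like the minimal sketch below. It uses plain coordinate lists rather than a GIS library, assumes coordinates in a projected system measured in metres, and uses hypothetical thresholds; the actual values and tooling used for Colouring Canada are not given in the source.

```python
def polygon_area(ring):
    """Unsigned area of a simple polygon [(x, y), ...] via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def clean_footprints(rings, min_area=10.0, max_area=100_000.0, precision=2):
    """Drop duplicate polygons and polygons with implausible areas.

    min_area / max_area are assumed thresholds in square metres: slivers
    left over from segmentation fall below min_area, and merged blobs
    exceed max_area. Duplicates are detected by comparing rounded vertex
    sets, a simple heuristic that ignores vertex order and start point.
    """
    seen = set()
    kept = []
    for ring in rings:
        key = frozenset((round(x, precision), round(y, precision)) for x, y in ring)
        if key in seen:
            continue  # same vertices already processed: duplicate entry
        seen.add(key)
        if min_area <= polygon_area(ring) <= max_area:
            kept.append(ring)
    return kept
```

In practice the thresholds would need tuning per city, since a fixed minimum area can also remove legitimate small structures such as sheds.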

Data Sources

  1. Montreal Property Assessment Units
  2. Montreal 3D Buildings (LOD2 model with textures)
  3. Montreal Aerial Lidar Dataset
  4. NRCAN Building Footprints
  5. Administrative boundaries of the agglomeration of Montréal (boroughs and related cities). (Last updated 6.10.2023).

The data source used for initial testing will be provided by IDECA (the spatial data infrastructure of Bogotá). This local authority integrates more than 98 local departments and their spatial layers under an open and reusable licence. Base Map of Bogota: https://www.ideca.gov.co/recursos/mapas/mapa-de-referencia-para-bogota-dc Buildings: https://www.ideca.gov.co/recursos/mapas/predios-bogota-dc. Other datasets considered include OpenStreetMap (OSM) and local data sources. The local team will implement a GIS system to support the validation and integration required to access all layers.

To be completed https://opendata.dresden.de/ https://www.geodaten.sachsen.de/

A hybrid model is currently being tested for Athens, combining footprints from the 2001 Census, held by the Hellenic Statistical Authority, with OSM data. The advantage of the Hellenic Statistical Authority data is that it is comprehensive government data. The disadvantage is that it is out of date, and new data will not be released until the next Census in XXX. The advantage of OSM data is that it is relatively up to date for central Athens and the semi-central municipalities, and allows platform users to download Colouring Athens footprints as open shapefiles. The disadvantage is that footprints only cover parts of the city, and it is not known when more will be added or when existing footprints will be updated.


Building footprint attributes used in the construction of Colouring Beirut were obtained from the National Center for Scientific Research (CNRS) Lebanon. Footprints were manually digitized over the entire city of Beirut by CNRS-L using aerial photos at 15cm resolution and VHR panchromatic satellite images from Pléiades-1A at 70cm resolution.

Building footprints are available free of charge from the Indonesian government via the Jakarta Satu website. Link and image to be added

Building footprints are available free from the Swedish government at national level. However, as with Colouring Britain, though footprints can be used to capture and visualise building attribute data, they are not currently permitted to be downloaded for public use.


Historical building footprint data - integration within CCRP platforms

Historical maps and other types of historical information are integrated into CCRP platforms wherever possible to support public understanding of the evolution of urban areas, to improve predictive models, to increase understanding of the way in which typologies change and adapt over time, and to better understand stocks as complex dynamic systems. However, the availability, granularity, precision and openness of digitised historical maps vary considerably across countries. Some countries, such as the UK, are lucky enough to have access to building level national mapping agency maps at national scale at c20 year increments from c1870 to the late 20th century; in other countries no historical maps may be available. A key role of CCRP platforms is also to demonstrate the value of existing paper map archives and the importance of archiving building footprint datasets year on year to allow change to be tracked.

Digital copies of historical paper maps (georeferenced and in raster format) can be easily integrated as layers within CCRP platform interfaces. These can be overlaid with either current or historical vectorised footprints at building level. An example of overlaying current vectorised footprints onto historical maps and collecting data on survivals and demolitions can be seen at https://colouringlondon.org/view/age/.

A key challenge has been to access historical vectorised footprints at building level for specific timepoints, necessary to collate and collect historical information, as well as to track change. The CCRP works with the Alan Turing Institute's Computer Vision for Digital Heritage Special Interest Group to support research into AI and machine learning approaches to footprint vectorisation, and offers use of the platform prototype for experimentation. It currently collaborates on historical footprint research with a group of research institutions based in Switzerland, Germany and the UK: the Digital Humanities Laboratory at the École Polytechnique Fédérale de Lausanne (EPFL), the Leibniz Institute of Ecological Urban and Regional Development (IOER), the University of Bristol's MAPHIS initiative (part of the Mapping History international project), and the University of Cambridge. As part of the collaboration, vectorised building footprints will be generated from Ordnance Survey historical maps and integrated within the Colouring London prototype. The aim is to provide:

  • Open historical footprints able to be downloaded and experimented with, within diverse areas of research including analysis of typology adaptation, mutation, persistence and resilience, and identification of underlying rules of operation within stocks, and locked-in patterns and cycles;
  • Mini filing cabinets into which historians, and others, can upload historical data (e.g. historical land use or attributes of the building if demolished);
  • Links to historical sources and archive collections providing information on the history of individual sites.

Relevant publication links may be found below

  • Petitpierre, R., & Guhennec, P. (2023). Effective annotation for the automatic vectorization of cadastral maps. Digital Scholarship in the Humanities. https://doi.org/10.1093/llc/fqad006

Notes on non-profit & commercial sources

National Mapping Agencies

Government organisations provide maps for national use at the highest quality possible. Their remit is not international. They will usually have a strategy for updating and are likely to have high accuracy standards. They hold the official record of buildings, though this may not include temporary or illegal buildings (ref).

(1) Ordnance Survey (OS)

OS offers high-resolution mapping at building level. It provides highly detailed data, but these often require advanced technical literacy to utilise effectively. Interoperability challenges further limit the usability of the data across different software platforms and for users who lack specialised GIS knowledge. Agencies with open data initiatives tend to offer more widely usable datasets, enhancing accessibility for researchers, businesses, and the general public (ref).

(2) United States Geological Survey (USGS)

USGS provides complex geographic data that includes both topographical details and metadata associated with infrastructure: layers of data that encompass not just building footprints but also geological and environmental data. However, more complex data structures can create challenges for end-users, particularly those with limited technical expertise. National agencies also often face challenges in maintaining up-to-date data, particularly in rapidly developing urban areas. Many agencies update data periodically but, due to cost and resource limitations, the data can quickly become outdated. This is a critical issue in regions undergoing rapid development or environmental change. USGS adheres to local standards (ref).

(3) Swiss Federal Office of Topography (Swisstopo)

Swisstopo has restrictive licences that limit commercial use or public availability of data. By contrast, open access initiatives, such as that of USGS, provide some data freely. Swisstopo follows national and EU standards and aims to ensure interoperability between systems through open standards. However, technological innovations such as cloud computing and AI-driven mapping are not evenly adopted across all national agencies, leading to gaps in the efficiency and capability of using these datasets (ref).

(4) Canada National Topographic System (NTS)

NTS provides data more focused on broader land parcels and zones, leading to limitations in detailed urban mapping. It has more rigid commercial licensing agreements, limiting wider economic use. This can impact industries such as urban development, real estate, and environmental monitoring. NTS adheres to local standards.

(5) Geoscience Australia

Geoscience Australia aims to ensure interoperability between systems through open standards. However, technological innovations such as cloud computing and AI-driven mapping are not evenly adopted across all national agencies, leading to gaps in the efficiency and capability of using these datasets.

Non-Profit foundations/open-crowd sourcing platform

Open crowdsourcing platforms and non-profit foundations play a pivotal role in the geospatial data ecosystem by providing open access to mapping data. These platforms rely on community contributions and aim to make geographic information available to the public, fostering innovation and collaboration in the process (ref). Below are key platforms.

(1) OpenStreetMap (OSM)

In 2004, Steve Coast founded OpenStreetMap (OSM) in response to the lack of accessible UK mapping data, which was restricted by high fees and licensing imposed by Ordnance Survey. Initially focused on visualising road networks, OSM has since grown into a global, community-driven project that allows users to map streets, buildings, and landscapes, contributing to a continuously expanding open mapping database. With the mission of providing free geospatial data for anyone to use and share, OSM has had a profound influence on the open data movement. Despite its success, OSM faces challenges, including privacy concerns in sensitive areas and the lack of a unified update strategy or global quality standards, leading to occasional inconsistencies in data accuracy (ref).

(2) Mapillary

Mapillary was founded in 2013 and focuses on crowdsourced street-level imagery, enabling users to capture and share photos of streetscapes from around the world. The platform combines imagery and map data to create an ever-evolving, detailed visual representation of cities and rural areas. By integrating user-uploaded photos, Mapillary helps provide insights into infrastructure, road conditions, and other public assets. One challenge faced by Mapillary is ensuring the privacy of individuals in the imagery, and there are efforts to use AI to blur faces and licence plates. The platform’s data has proven useful for city planners, infrastructure maintenance, and autonomous driving research (ref).

(3) MapTiler

MapTiler was established in 2017 and is another platform in the open mapping ecosystem. It specialises in providing base maps and geospatial data for integration with web applications and GIS software, and focuses on easy-to-use, highly customisable maps for both private and public use. MapTiler prides itself on its adherence to open standards, which promotes interoperability across platforms. However, like other platforms, it faces challenges in maintaining consistently high-quality data across different regions and ensuring frequent updates in rapidly changing environments. Additionally, its accessibility may be limited for users without advanced technical skills in geospatial technologies (ref).

Academia

The academic sector plays a pivotal role in advancing geospatial research and the development of innovative tools and methodologies for spatial data management. Various initiatives and research networks contribute to this growing field, fostering collaboration across disciplines and sectors to enhance the understanding and use of geospatial information (ref).

(1) European Spatial Data Research (EuroSDR)

EuroSDR is a federated European research network focused on advancing the science and technology behind spatial data collection, processing, and dissemination. It brings together national mapping and cadastral agencies, academic institutions, and industry stakeholders to promote research collaboration and innovation in geospatial data infrastructure. Through cross-border cooperation, EuroSDR aims to enhance the quality and accessibility of spatial data across Europe, contributing significantly to the development of interoperable systems and spatial data standards (ref).

(2) International Council for Science Committee on Data for Science and Technology (CODATA-GEO)

CODATA-GEO is a global initiative under the International Council for Science, dedicated to improving the availability and usability of scientific data, with a special emphasis on geospatial data. This committee works to advance interdisciplinary research by promoting best practices in data sharing, management, and interoperability. CODATA-GEO is a key player in ensuring that geographic data supports a wide array of scientific research, particularly in fields such as climate science, biodiversity, and environmental monitoring (ref).

(3) Colouring Cities Research Programme (CCRP)

The Colouring Cities Research Programme (CCRP) is an innovative academic initiative that focuses on developing open platforms for the collection, visualisation, and analysis of building-level data. The programme’s primary goal is to enhance urban data infrastructure by mapping and visualising buildings across cities to inform urban planning, sustainability, and heritage preservation efforts. Through collaboration between universities, government bodies, and local communities, CCRP aims to provide detailed, openly accessible building data to support academic research and public engagement in urban development. By involving citizens in data collection and utilising a participatory approach, CCRP helps bridge the gap between academia, policy-making, and the public (ref).

Commercial

The commercial geospatial landscape has evolved with the advent of large-scale mapping platforms that offer a mix of proprietary and open data. These platforms play a significant role in providing geospatial solutions to various industries. The following section explores both open and closed commercial platforms and highlights their approaches to data collection, monetisation, and contribution to open-source initiatives (ref).

A. Open commercial platform

(1) Google Maps

Google has created its own proprietary dataset for Google Maps, built using a combination of various data sources, including public and private data, and AI-driven automated data generation. Although these datasets are proprietary, Google has significant influence over the commercial mapping landscape due to the scope and accuracy of its mapping service. Google Maps has become integral for many businesses, providing APIs that allow developers to integrate mapping services into their own platforms. However, the proprietary nature of the dataset means that access to this data is typically behind a paywall or restricted for non-commercial use (ref).

Google Maps is one of the most widely used commercial mapping platforms globally. It offers extensive geospatial data through a highly accurate and reliable system, supported by regular updates and comprehensive global coverage. It operates on a paid service model, with businesses and developers paying for access to its API, which powers a wide range of commercial applications. However, Google’s dominance in the field has led to scrutiny regarding competition, especially with regard to the use of proprietary data. Despite this, Google continues to be a market leader, and its continuous updates and vast dataset make it a valuable tool for commercial and individual users alike (ref).

(2) Microsoft Bing Maps

Microsoft Bing Maps focuses heavily on generating aerial imagery and contributes to the open mapping community, including OpenStreetMap (OSM). A significant contribution of Microsoft is the generation and provision of free building footprint data, including for the UK; it has reportedly mapped over a billion buildings. By sharing this data openly, Microsoft fosters a collaborative approach to mapping while also maintaining a competitive edge against Google. Bing Maps, like Google Maps, serves as both a complementary and competitive product within the mapping and geospatial industries (ref).

Microsoft offers a charged commercial platform alongside its open contributions. Whilst Microsoft supports OSM and provides free data such as building footprints, its commercial platform is a direct challenge to Google Maps. Bing Maps allows companies to access rich geospatial data and tools for integration into their own services. This highlights Microsoft's strategy of maintaining a balance between complementary and competitive products. By supporting open data initiatives and offering a paid service, Microsoft enhances its public relations while also generating revenue from premium mapping services. The company’s integration of free data into OSM, coupled with its competitive stance toward Google Maps, demonstrates a dual strategy aimed at providing affordable mapping solutions while undercutting its rival (ref).

B. Charging commercial platform

(1) Overture Maps

Overture Maps represents a new model of commercialisation by repackaging open data, such as building footprint data, for profit. Unlike platforms that generate their own original datasets, Overture leverages data from open sources and member contributions. Major companies, including Amazon and Facebook, are part of its membership, using the platform's data to enhance their own mapping services. Overture occupies a unique position, operating between the open and commercial sectors by providing tailored solutions while monetising open data (ref).

One of Overture's strengths is its alignment with UN Sustainable Development Goals (SDGs), particularly in urban development and infrastructure. The platform repurposes open data to assist in city planning efforts, contributing to the SDGs in meaningful ways. However, its contribution compared to non-commercial initiatives remains a topic of debate, as its commercial focus contrasts with the more altruistic aims of open-data advocates.

Data privacy and licensing are central to Overture's business model, with the platform's use of the Community Database License Agreement (CDLA) – Permissive v2 allowing for broad data use with minimal restrictions. This licensing structure supports both commercial and non-commercial innovation. However, compared to fully open platforms like OpenStreetMap (OSM), Overture's more closed approach raises ethical concerns. The monetisation of open data and restrictions on access to repackaged information challenge the open data ethos, leading to questions about the balance between open access and commercial interests.

Overture offers tailored consultation and co-development services, providing bespoke solutions for its commercial partners, often backed by major tech companies. This collaboration drives innovation, but it also raises concerns about corporate control, transparency, and data ownership. The platform works closely with member companies to align its data with specific industry needs. However, the focus on commercial outcomes can limit the openness of the platform, creating potential conflicts between community-driven goals and corporate priorities.

At the core of Overture’s operations is its ability to collect and process data from various sources, including public datasets, aerial imagery, and open platforms like OSM. Data collection and large-scale ingestion are crucial to its commercial operations, enabling the platform to manage vast amounts of information. While this offers commercial advantages, maintaining data quality across such a broad spectrum remains a challenge, particularly when relying on open data sources that may not meet the high standards expected by commercial users.

The data ecosystem and performance of Overture depend on continuous updates from external sources. While aerial imagery and public data contribute to this ecosystem, relying on these external updates can lead to inconsistencies, particularly when mapping agencies are not directly involved in ensuring the accuracy of the data. This reliance raises issues of consistency and reliability, especially when repackaged data is resold to commercial clients. The quality and timeliness of the original data can significantly impact the platform's performance in various commercial contexts.

To address some of these concerns, Overture has implemented a feedback loop within its ecosystem, allowing users and partners to provide input on the accuracy and relevance of the data. This feedback mechanism is designed to improve the platform’s offerings over time. However, the effectiveness of this system depends on how well Overture integrates the feedback into future updates, ensuring that the data remains reliable and valuable for its commercial users.

As such, whilst Overture’s approach to repackaging open data contrasts with Google’s proprietary datasets, it also introduces competition by offering tailored solutions through open-source collaboration. Unlike Google Maps, which operates as a proprietary service, Overture actively works with open-source platforms, including OpenStreetMap, contributing to the growing open data ecosystem. However, its commercialisation of open data has raised questions about whether this model contributes to monopolistic control by large corporations over key aspects of the data market.
