Exploring road and points of interest (POIs) associations in OpenStreetMap, a new paradigm for OSM road class prediction

Proceedings of the OSM Science

Published On 2023

1 GIScience Research Group, Heidelberg University, Heidelberg, Germany; francis.andorful@uni-heidelberg.de, nir.fulman@uni-heidelberg.de

2 HeiGIT (Heidelberg Institute for Geoinformation Technology), 69120 Heidelberg, Germany; sven.lautenbach@uni-heidelberg.de, christina.ludwing@uni-heidelberg.de, herfort@uni-heidelberg.de, zipf@uni-heidelberg.de

Page

69-72

Authors

Alexander Zipf

Ruprecht-Karls-Universität Heidelberg

Position

Chair of GIScience; HeiGIT (Heidelberg Institute for Geoinformation Technology)

H-Index(all)

56

H-Index(since 2020)

39

Research Interests

Geoinformatics

GIScience

VGI

Geomatics

Geographic Information Science

Sven Lautenbach

Ruprecht-Karls-Universität Heidelberg

H-Index(all)

39

H-Index(since 2020)

32

Research Interests

Ecosystem services

GIScience

integrated modelling

land use change

VGI

Nir Fulman

Tel Aviv University

Position

PhD candidate

H-Index(all)

5

H-Index(since 2020)

5

Research Interests

Spatial modeling

Transportation

GIS

Other Articles from authors

Alexander Zipf

Ruprecht-Karls-Universität Heidelberg

Geo-spatial Information Science

An investigation of the temporality of OpenStreetMap data contribution activities

OpenStreetMap (OSM) is a dataset in constant change and this dynamic needs to be better understood. Based on 12-year time series of seven OSM data contribution activities extracted from 20 large cities worldwide, we investigate the temporal dynamic of OSM data production, more specifically, the auto- and cross-correlation, temporal trend, and annual seasonality of these activities. Furthermore, we evaluate and compare nine different temporal regression methods for forecasting such activities in horizons of 1–4 weeks. Several insights could be obtained from our analyses, including that the contribution activities tend to grow linearly in a moderate intra-annual cycle. Also, the performance of the temporal forecasting methods shows that they yield in general more accurate estimations of future contribution activities than a baseline metric, i.e. the arithmetic average of recent previous observations. In particular, the …

Alexander Zipf

Ruprecht-Karls-Universität Heidelberg

arXiv preprint arXiv:2401.04218

Distortions in Judged Spatial Relations in Large Language Models: The Dawn of Natural Language Geographic Data?

We present a benchmark for assessing the capability of Large Language Models (LLMs) to discern intercardinal directions between geographic locations and apply it to three prominent LLMs: GPT-3.5, GPT-4, and Llama-2. This benchmark specifically evaluates whether LLMs exhibit a hierarchical spatial bias similar to humans, where judgments about individual locations' spatial relationships are influenced by the perceived relationships of the larger groups that contain them. To investigate this, we formulated 14 questions focusing on well-known American cities. Seven questions were designed to challenge the LLMs with scenarios potentially influenced by the orientation of larger geographical units, such as states or countries, while the remaining seven targeted locations less susceptible to such hierarchical categorization. Among the tested models, GPT-4 exhibited superior performance with 55.3% accuracy, followed by GPT-3.5 at 47.3%, and Llama-2 at 44.7%. The models showed significantly reduced accuracy on tasks with suspected hierarchical bias. For example, GPT-4's accuracy dropped to 32.9% on these tasks, compared to 85.7% on others. Despite these inaccuracies, the models identified the nearest cardinal direction in most cases, suggesting associative learning, embodying human-like misconceptions. We discuss the potential of text-based data representing geographic relationships directly to improve the spatial reasoning capabilities of LLMs.

Sven Lautenbach

Ruprecht-Karls-Universität Heidelberg

HEUREKA'24-Optimierung in Verkehr und Transport, Stuttgart, 13th-14th March 2024

Vollständigkeit von OpenStreetMap-POI-Daten für die Nutzung in der Verkehrsplanung [Completeness of OpenStreetMap POI data for use in transport planning]

Information on points of interest (POIs) from OpenStreetMap (OSM) is often used to describe the attractiveness of areas in destination choice modelling. We checked the completeness of the OSM POI database for 129 study areas by means of full surveys. The completeness of the OSM database differs considerably between individual categories. In the categories gastronomy and retail, OSM is largely complete and, after spot checks for the specific use case, usable for modelling. In the categories customer-facing services and medical care, OSM is usually missing a large number of POIs. Structural influences of spatial or intrinsic indicators could not be demonstrated.

Nir Fulman

Tel Aviv University

Cities

A project-based view of urban dynamics: Analyzing ‘leapfrogging’ and fringe development in Israel

Analyzing urban pattern dynamics based on construction projects, we classify them into three types - infilling, fringe, and leapfrogging, and focus on the role of leapfrogging projects as seeds for new developments, leading to uncontrolled urban sprawl. To study the leapfrogging phenomenon, we investigate the sprawl of three Israeli cities - Netanya, Haifa, and Safed over 54 years from 1964 to 2018 and conduct a country-wide analysis of the urban sprawl of all 66 Israeli municipalities between 2013 and 2018. Our analysis is based on a country-wide GIS database of roads, buildings, other infrastructure elements, and development plans, as well as high-resolution aerial photos covering the investigated areas and periods. We uncover and characterize a positive feedback mechanism of rapid leapfrogging developments that attract further developments in their proximity and emphasize the potential of leapfrogging …

Sven Lautenbach

Ruprecht-Karls-Universität Heidelberg

ERDKUNDE

How to assess the needs of vulnerable population groups towards heat-sensitive routing? An evidence-based and practical approach to reducing urban heat stress

Heat poses a significant risk to human health, particularly for vulnerable populations, such as pregnant women, older individuals, young children and people with pre-existing medical conditions. In view of this, we formulated a heat stress-avoidant routing approach in Heidelberg, Germany, to ensure mobility and support day-to-day activities in urban areas during heat events. Although the primary focus is on pedestrians, it is also applicable to cyclists. To obtain a nuanced understanding of the needs and demands of the wider population, especially vulnerable groups, and to address the challenge of reducing urban heat stress, we used an inter- and transdisciplinary approach. The needs of vulnerable groups, the public, and the city administration were identified through participatory methods and various tools, including interactive city walks. Solution approaches and adaptation measures to prevent heat stress were evaluated and integrated into the development of a heat-avoiding route service through a co-design process. The findings comprise the identification of perceived hotspots for heat (such as large public spaces in the city centre with low shading levels), the determination of commonly reported symptoms resulting from severe heat (e.g., fatigue or lack of concentration), and the assessment of heat adaptation measures that were rated positively, including remaining in the shade and delaying errands. Additionally, we analysed and distinguished between individual and community adaptation strategies. Overall, many respondents did not accurately perceive the risk of heat stress in hot weather, despite severe limitations. As a result, the heat …

Nir Fulman

Tel Aviv University

arXiv preprint arXiv:2401.04218

Distortions in Judged Spatial Relations in Large Language Models: The Dawn of Natural Language Geographic Data?

Alexander Zipf

Ruprecht-Karls-Universität Heidelberg

ERDKUNDE

How to assess the needs of vulnerable population groups towards heat-sensitive routing? An evidence-based and practical approach to reducing urban heat stress

Nir Fulman

Tel Aviv University

Epidemiology

Residential Greenness and Long-term Mortality Among Patients Who Underwent Coronary Artery Bypass Graft Surgery

Background: Studies have reported inverse associations between exposure to residential greenness and mortality. Greenness has also been associated with better surgical recovery. However, studies have had small sample sizes and have been restricted to clinical settings. We investigated the association between exposure to residential greenness and all-cause mortality among a cohort of cardiac patients who underwent coronary artery bypass graft (CABG) surgery. Methods: We studied this cohort of 3,128 CABG patients between 2004 and 2009 at seven cardiothoracic departments in Israel and followed patients until death or 1st May 2021. We collected covariate information at the time of surgery and calculated the patient-level average normalized difference vegetation index (NDVI) over the entire follow-up in a 300 m buffer from the home address. We used Cox proportional hazards regression models to …

Sven Lautenbach

Ruprecht-Karls-Universität Heidelberg

Nature Communications

A spatio-temporal analysis investigating completeness and inequalities of global urban building data in OpenStreetMap

OpenStreetMap (OSM) has evolved as a popular dataset for global urban analyses, such as assessing progress towards the Sustainable Development Goals. However, many analyses do not account for the uneven spatial coverage of existing data. We employ a machine-learning model to infer the completeness of OSM building stock data for 13,189 urban agglomerations worldwide. For 1,848 urban centres (16% of the urban population), OSM building footprint data exceeds 80% completeness, but completeness remains lower than 20% for 9,163 cities (48% of the urban population). Although OSM data inequalities have recently receded, partially as a result of humanitarian mapping efforts, a complex unequal pattern of spatial biases remains, which vary across various human development index groups, population sizes and geographic regions. Based on these results, we provide recommendations for data …

Alexander Zipf

Ruprecht-Karls-Universität Heidelberg

Proceedings of the OSM Science

OpenStreetMap Data for Automated Labelling Machine Learning Examples: The Challenge of Road Type Imbalance

Advances in Artificial Intelligence (AI) and, specifically, in Deep Learning (DL) have fostered geospatial analysis and remote sensing, culminating in the establishment of GeoAI [1, 2] and the solidification of research on methodologies and techniques for AI-assisted mapping [3-7]. Nevertheless, a particular challenge lies in the substantial demand for training examples in DL. Manual labelling of these examples is labour-intensive, consuming a considerable amount of time and financial resources. Alternatively, semi or automated labelling of data emerges as a prominent solution, as exemplified by the tool ohsome2label [8], which harnesses data from the OpenStreetMap [9] to label satellite images. However, moving from characterising object types (road, river, building) based on geometry to categorising them by attributes might result in an imbalanced class distribution in the utilised Machine Learning (ML) dataset. Such imbalances are common in numerous practical applications. Learning from skewed datasets can be particularly challenging and often requires non-conventional ML techniques. A comprehensive awareness of the issues associated with class imbalance, as well as strategies for mitigating them, is essential [10]. In the context of spatial data, the distribution of classes can vary from country to country and region to region, adding a new layer of complexity and exacerbating this issue. In this context, an analysis was conducted on the distribution of road types, defined by the values of the OSM "highway" tag, in diverse-profile nations. The aim was to evaluate the extent of class imbalance and to identify any consistent patterns in the …

Alexander Zipf

Ruprecht-Karls-Universität Heidelberg

Environmental Monitoring and Assessment

Carbon fluxes related to land use and land cover change in Baden-Württemberg

Spatially explicit information on carbon fluxes related to land use and land cover change (LULCC) is of value for the implementation of local climate change mitigation strategies. However, estimates of these carbon fluxes are often aggregated to larger areas. We estimated committed gross carbon fluxes related to LULCC in Baden-Württemberg, Germany, using different emission factors. In doing so, we compared four different data sources regarding their suitability for estimating the fluxes: (a) a land cover dataset derived from OpenStreetMap (OSMlanduse); (b) OSMlanduse with removal of sliver polygons (OSMlanduse cleaned), (c) OSMlanduse enhanced with a remote sensing time series analysis (OSMlanduse+); (d) the LULCC product of Landschaftsveränderungsdienst (LaVerDi) from the German Federal Agency of Cartography and Geodesy. We produced a high range of carbon flux estimates, mostly caused …

Sven Lautenbach

Ruprecht-Karls-Universität Heidelberg

ISPRS International Journal of Geo-Information

Assessing completeness of OpenStreetMap building footprints using MapSwipe

Natural hazards threaten millions of people all over the world. To address this risk, exposure and vulnerability models with high resolution data are essential. However, in many areas of the world, exposure models are rather coarse and are aggregated over large areas. Although OpenStreetMap (OSM) offers great potential to assess risk at a detailed building-by-building level, the completeness of OSM building footprints is still heterogeneous. We present an approach to close this gap by means of crowd-sourcing based on the mobile app MapSwipe, where volunteers swipe through satellite images of a region collecting user feedback on classification tasks. For our application, MapSwipe was extended by a completeness feature that allows a tile to be classified as “no building”, “complete” or “incomplete”. To assess the quality of the produced data, the completeness feature was applied to four regions. The MapSwipe-based assessment was compared with an intrinsic approach to quantify completeness and with the prediction of an existing model. Our results show that the crowd-sourced approach yields a reasonable classification performance of the completeness of OSM building footprints. Results showed that the MapSwipe-based assessment produced consistent estimates for the case study regions while the other two approaches showed a higher variability. Our study also revealed that volunteers tend to classify nearly completely mapped tiles as “complete”, especially in areas with a high OSM building density. Another factor that influenced the classification performance was the level of alignment of the OSM layer with the satellite imagery.

Alexander Zipf

Ruprecht-Karls-Universität Heidelberg

Engineering Proceedings

Urban Heat Island Intensity Prediction in the Context of Heat Waves: An Evaluation of Model Performance

Urban heat islands, characterized by higher temperatures in cities compared to surrounding areas, have been studied using various techniques. However, during heat waves, existing models often underestimate the intensity of these heat islands compared to empirical measurements. To address this, an hourly time-series-based model for predicting heat island intensity during heat wave conditions is proposed. The model was developed and validated using empirical data from the National Monitoring Network in Temuco, Chile. Results indicate a strong correlation (r > 0.98) between the model’s predictions and actual monitoring data. Additionally, the study emphasizes the importance of considering the unique microclimatic characteristics and built environment of each city when modelling urban heat islands. Factors such as urban morphology, land cover, and anthropogenic heat emissions interact in complex ways, necessitating tailored modelling approaches for the accurate representation of heat island phenomena.

Sven Lautenbach

Ruprecht-Karls-Universität Heidelberg

Environmental Monitoring and Assessment

Carbon fluxes related to land use and land cover change in Baden-Württemberg

Nir Fulman

Tel Aviv University

Transport Policy

Investigating occasional travel patterns based on smartcard transactions

Public transportation (PT) studies often neglect non-routine trips focusing predominantly on commuting. However, recent research revealed that occasional trips make up a substantial portion of public transport journeys, and traveler preferences for non-routine trips diverge from their preferences for regular commuting. We study non-routine trips based on a database of 63 million smartcard (SC) records of PT boardings made in Israel during June 2019. The characteristics of these trips are revealed by clustering PT users’ boarding records based on the location of the boarding stops and time of day, applying an extended DBSCAN algorithm. Our major findings are that (1) conventional home-work-home commuters are a minority in Israel and constitute less than 15% of the riders; (2) at least 30% of the PT trips do not belong to any cluster and can be classified as occasional; (3) The vast majority of users make both …

Alexander Zipf

Ruprecht-Karls-Universität Heidelberg

International Journal of Applied Earth Observation and Geoinformation

Semi-supervised water tank detection to support vector control of emerging infectious diseases transmitted by Aedes Aegypti

The disease transmitting mosquito Aedes Aegypti is an increasing global threat. It breeds in small artificial containers such as rainwater tanks and can be characterized by a short flight range. The resulting high spatial variability of abundance is challenging to model. Therefore, we tested an approach to map water tank density as a spatial proxy for urban Aedes Aegypti habitat suitability. Water tank density mapping was performed by a semi-supervised self-training approach based on open accessible satellite imagery for the city of Rio de Janeiro. We ran a negative binomial generalized linear regression model to evaluate the statistical significance of water tank density for modeling inner-urban Aedes Aegypti distribution measured by an entomological surveillance system between January 2019 and December 2021. Our proposed semi-supervised model outperformed a supervised model for water tank detection …

Alexander Zipf

Ruprecht-Karls-Universität Heidelberg

Nature Communications

A spatio-temporal analysis investigating completeness and inequalities of global urban building data in OpenStreetMap

Nir Fulman

Tel Aviv University

AGILE: GIScience Series

Exploring Non-Routine Trips Through Smartcard Transaction Analysis

Public transportation (PT) studies often overlook non-routine trips, focusing on commuting trips. However, recent research reveals that occasional trips comprise a significant portion of public transportation trips. Furthermore, traveler preferences for non-routine trips essentially differ from their preferences for regular commuting. We investigate non-routine trips based on a database of 63 million records of PT boardings made in Israel during June 2019. The behavioral patterns of PT users are revealed by clustering their boarding records based on the location of the boarding stops and time of day, applying an extended DBSCAN algorithm. Our major findings are that (1) conventional home-work-home commuters are a minority and constitute less than 15% of Israeli riders; (2) at least 30% of the PT trips do not belong to any cluster and can be classified as occasional; (3) the vast majority of users make both recurrent and occasional trips. A linear regression model provides a good estimate (R² = 0.85) of the number of occasional boardings at a stop as a function of the total number of boardings, time of day, and land use composition around the trip origin.

Alexander Zipf

Ruprecht-Karls-Universität Heidelberg

Challenges and solution approach for greenhouse gas emission inventories at fine spatial resolutions–the example of the Rhine-Neckar district

This discussion paper originated as the concluding publication of one of the pilot projects of the "Climate Action Science" research initiative at Heidelberg Center for the Environment (HCE), focusing on the Rhine-Neckar district and the city of Heidelberg. The aim of the explorative project was to generate a first overview on greenhouse gas emission data in order to initiate climate action of various actors and to provide well-founded support by using accurate information. The focus during the pilot phase was on the collection, compilation and evaluation of the quality of heterogeneous data sets and methods for a greenhouse gas emission inventory, as well as on the information preparation and evaluation of different inventory and presentation options. These should in turn be adapted to the needs of different users and fields of application. The study focused on different German approaches to greenhouse gas accounting, especially in Baden-Württemberg compared to other German states, and in detail on the City of Heidelberg compared to the surrounding municipalities in the Rhine-Neckar district. The overarching goal is to use the results beyond the case study projected here as a stimulus and preliminary work for further projects and activities in the overall "Climate Action Science" project. Several difficulties were encountered in processing the emissions inventory and compiling various data sets on emissions in the study area. Three basic situations were identified: 1. Desired data is not available (measurements required), 2. Desired data is not freely accessible (stakeholder involvement), 3. Data generation via proxy data. In the pilot phase …

Other articles from Proceedings of the OSM Science journal

Oliver O'Brien

University College London

Proceedings of the OSM Science

Towards an open high-resolution land use dataset in Great Britain–Comparing and consolidating retail centre areas from open data sources

Great Britain does not have a comprehensive and openly licenced high-resolution land use dataset that includes detail on building usage, but OpenStreetMap (OSM) has potential as a good base for creation of such a dataset [1]. OSM’s quality and completeness is highly variable, but often good and improving, including for land use mapping [2, 3]. This research evaluates use of separate open datasets to augment OSM for Great Britain. The research focuses on retail areas as these have recently been impacted both by internet shopping and the COVID pandemic [4]. This paper evaluates three generally openly available datasets showing retail centre extents across Great Britain, analysing each by areal footprint and, where available, premises counts. Firstly, the Consumer Data Research Centre (CDRC)’s Retail Centres Boundaries 2022 product, secondly non-domestic Energy Performance Certificates (EPCs) geolocated with Unique Property Reference Numbers (UPRNs), filtered for retail categories, and finally OSM land use retail polygons on their own.

Florian Ledermann

Technische Universität Wien

Proceedings of the OSM Science

Mapping public space in urban neighbourhoods using OpenStreetMap data

OpenStreetMap (OSM) enriches the exploration and study of urban landscapes. In this research project, we aim to use OSM data to investigate urban public spaces from a distributional justice perspective. While public spaces are acknowledged as an important resource for urban society, it becomes important, in light of ongoing trends towards privatization, commercialization, and festivalization, to critically observe and reflect on the extent to which resources, rights and opportunities regarding public space are distributed equally. The amount, accessibility, and character of public space can differ between cities and neighbourhoods. A quantitative analysis of public space could offer insights into the distribution and availability of public space. To this end, we propose a framework for the identification and categorization of these spaces based on OSM data. The framework aims to enable both the mapping of public spaces as well as an evaluation of the share of public space. We also hope to investigate the potential of OSM data. Some preliminary findings and an introductory overview of the research process, with an emphasis on its cartographic aspects, were presented in a previous publication [1]. The inspiration for this research is the so-called Nolli map, a map of Rome dating back over 250 years. Giovanni Battista Nolli, an Italian architect, engineer and cartographer, analysed the urban fabric beyond the structure of roads and buildings. In his work, titled 'La Nuova Topografia di Roma', Nolli mapped the interior and exterior spaces of Rome in high detail as a figure-ground map with contrasting dark and light sections. This distinction is commonly …

2023/12/29

Alexander Zipf

Ruprecht-Karls-Universität Heidelberg

Proceedings of the OSM Science

OpenStreetMap Data for Automated Labelling Machine Learning Examples: The Challenge of Road Type Imbalance

Randall Guensler

Georgia Institute of Technology

Proceedings of the OSM Science

Assessing bike-transit accessibility with OpenStreetMap

Low-density land use, sprawl, and Euclidean zoning (i.e., separation of commercial and residential land uses) can reduce the effectiveness of public transit by reducing the number of homes, amenities, services, and jobs near transit stops [1, 2]. This gives rise to the first-last mile problem, where transit riders must travel long distances to access transit from their origin and from transit to their destination. Bicycles as a first and/or last-mile mode (henceforth referred to as bike-transit) can extend the service coverage area of a transit stop or station by allowing transit users to cover a greater distance in the same amount of time [3]. Not only can people reach transit stops faster on a bicycle than they could by walking; people using bicycles can also reach more transit stops within the same time frame. Lastly, people may be able to avoid bus feeder routes and cycle directly to higher service quality transit routes (such as rail). Despite bike-transit's potential for shortening travel times, bike-transit is not commonly modeled in traditional travel demand modeling or trip planners. This is because bike-transit trips are computationally intensive to calculate given the number of possible transit stop pairs and departure times. Our solution to this is to use bicycle and transit shortest path algorithms to demonstrate how bike-transit improves public transit's accessibility to destinations by reducing overall travel times, transit waiting times, and the number of transit transfers needed. Previous work has assessed how bike-transit improves the effectiveness of public transit [4-7]; in this case, we will take an in-depth look at three different locations that are varying in …

Andy South

Liverpool School of Tropical Medicine

Proceedings of the OSM Science

Developing a data validation method with OpenStreetMap Senegal and the Ministry of Health in support of accurate health facility data

This research examines the collaboration between a local OpenStreetMap chapter and health authorities to improve health facility data accuracy. By utilizing open data and statistical methods, communities can empower Ministries of Health, address Sustainable Development Goals (SDGs) indicators, and enhance emergency response. The healthsites.io Digital Public Good [1] has been working with OpenStreetMap Senegal [2] since 2017. We have established a data collaborative focused on health facility data that lives in OpenStreetMap. The collaborative is a semi-formal network that identifies and shares geospatial data on health to OpenStreetMap. It works to identify gaps and barriers to sharing, defines methodologies and data models for sharing, and supports stakeholders with sharing and the use of data, especially for decision-making. Crucially, the collaborative saves validated data to OpenStreetMap, which means that successive projects are able to benefit from the work even when programs end. Accurate health data plays a vital role in effective healthcare planning, resource allocation, and emergency response. However, existing data sources often suffer from inaccuracies and limited sharing, hindering the potential for informed decision-making and comprehensive health interventions. In response, we have developed an Emergency Health data validation method [3]. The method involves local stakeholders and the healthsites.io open data platform as a means to enhance data quality and accessibility. The Global Fund's COVID-19 response mechanism underscores the significance of accurate health facility data [4]. This mechanism …

Alexander Zipf

Ruprecht-Karls-Universität Heidelberg

Proceedings of the OSM Science

Exploring road and points of interest (POIs) associations in OpenStreetMap, a new paradigm for OSM road class prediction

1 GIScience Research Group, Heidelberg University, Heidelberg, Germany; francis.andorful@uni-heidelberg.de, nir.fulman@uni-heidelberg.de 2 HeiGIT-Heidelberg Institute for Geoinformation Technology, 69120 Heidelberg, Germany; sven.lautenbach@uni-heidelberg.de, christina.ludwing@uni-heidelberg.de, herfort@uni-heidelberg.de, zipf@uni-heidelberg.de

Hao Li

Heidelberg University

Proceedings of the OSM Science

Beyond Two Dimensions: Large-Scale Building Height Mapping in OpenStreetMap via Synthetic Aperture Radar and Street-View Imagery

In the past decades, the world has been comprehensively mapped in 2D; however, the vertical dimension remains underexplored despite its huge potential. For instance, as of August 2023, more than 571 million buildings are mapped in OpenStreetMap (OSM) according to statistics from Taginfo, but less than 3% of them are associated with height values via the key/value pairs heights=*, though one can often estimate the height information via OSM key/value pairs such as building:levels=* and stories=*. Mapping human settlements as a 3D representation of reality requires an accurate description of vertical dimensions besides the 2D footprints and shapes. A 3D representation of human settlement is important in many aspects, including public health, urban planning, environment monitoring, and disaster management. In this context, a list of the most relevant 3D building attributes mainly includes but is not limited to building height, building floor, and roof type [1, 2]. For instance, building height is a key and fundamental factor in post-disaster (e.g., earthquake and flood) damage and situation assessment. Similarly, the roof type information is beneficial in estimating photovoltaic electricity potential at scale. As defined in CityGML 2.0 [3], 3D building models are divided into five levels of detail (LoDs) [4]. In LoD0, only the 2D footprint information is involved in the model. In LoD1, the LoD0 model is extruded by the building height, and the cuboid obtained after extrusion is the LoD1 model. In LoD2, the 3D roof structure information is added to the LoD1 model. The LoD3 model further contains facade elements such as windows and doors. The …

Nir Fulman

Tel Aviv University

Proceedings of the OSM Science

Exploring road and points of interest (POIs) associations in OpenStreetMap, a new paradigm for OSM road class prediction

Sven Lautenbach

Ruprecht-Karls-Universität Heidelberg

Proceedings of the OSM Science

Exploring road and points of interest (POIs) associations in OpenStreetMap, a new paradigm for OSM road class prediction

Edson Augusto Melanda

Universidade Federal de São Carlos

Proceedings of the OSM Science

OpenStreetMap Data for Automated Labelling Machine Learning Examples: The Challenge of Road Type Imbalance

Kari Watkins

Georgia Institute of Technology

Proceedings of the OSM Science

Assessing bike-transit accessibility with OpenStreetMap

Silvana Camboim

Universidade Federal do Paraná

Proceedings of the OSM Science

Fostering OSM's Micromapping Through Combined Use of Artificial Intelligence and Street-View Imagery

Map scale is fundamental to cartography. The International Cartographic Association's definition of a map [1] already explicitly emphasizes the selection of specific features, pointing to the process of cartographic generalization. This operation simplifies the representation of geographic data to produce a map at a given scale [2]. In the past, geospatial data was collected in a standardized way. Representations at larger scales could then be derived. However, OpenStreetMap (OSM) has changed this approach. Each object can be captured individually, resulting in digital representations of varying and distinct accuracies. Nevertheless, products such as OSM's Slippy Map Tiles are designed for consistent scales, maintaining a uniform scale for each tile. This characteristic gives users a seemingly seamless view of the map. The tile layers on the OpenStreetMap website range in maximum scale from around 1:2000 to 1:250 [3, 4], suitable for mapping urban detail. In the context of collaborative mapping, the term Micromapping was coined as the "mapping of small geographic objects" [5] and appears as a topic of growing interest among the OpenStreetMap and general Volunteered Geographic Information community [5, 6, 7, 8]. It can be helpful in many applications, such as mapping large-scale infrastructure [5]; pedestrian security and flow prediction [6]; detailed 3D model generation and indoor mapping [7]; assistive technologies like tactile map generation [8]; and general-purpose micromapping rendering [9]. There is also some discussion among the OSM community about their idiosyncrasies, comprising issues that may arise when there is a bigger …