Craig A. Knoblock

University of Southern California

H-index: 79

North America-United States

Description

Craig A. Knoblock is a researcher at the University of Southern California who specializes in knowledge graphs, data integration, data science, and machine learning. He has an h-index of 79 overall and 34 since 2020.

His recent articles reflect a diverse array of research interests and contributions to the field:

Automatically Constructing Geospatial Feature Taxonomies from OpenStreetMap Data

Detecting Semantic Errors in Tables using Textual Evidence

Indirect Cooperation in Distributed Stationary-Resource Searching with Predefined Destinations

Exploiting Polygon Metadata to Understand Raster Maps-Accurate Polygonal Feature Extraction

Building spatio-temporal knowledge graphs from vectorized topographic historical maps

GeoAI for the Digitization of Historical Maps

Sand: A tool for creating semantic descriptions of tabular sources

Towards the automated large-scale reconstruction of past road networks from historical maps

Professor Information

University	University of Southern California
Position	Keston Executive Director, USC Information Sciences Institute
Citations(all)	24662
Citations(since 2020)	4545
Cited By	21654
hIndex(all)	79
hIndex(since 2020)	34
i10Index(all)	246
i10Index(since 2020)	96
University Profile Page	University of Southern California
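
The h-index and i10-index figures listed above follow standard definitions: the h-index is the largest h such that h publications have at least h citations each, and the i10-index counts publications with at least 10 citations. A minimal sketch of both calculations follows. The citation counts in it are made up for illustration and are not this author's per-paper data.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# Hypothetical citation counts, for illustration only.
sample = [25, 12, 10, 8, 5, 3, 0]
print(h_index(sample), i10_index(sample))
```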

Research & Interests List

Knowledge graphs

data integration

data science

machine learning

Top articles of Craig A. Knoblock

Automatically Constructing Geospatial Feature Taxonomies from OpenStreetMap Data

This paper presents a method for constructing a lightweight taxonomy of geospatial features using OpenStreetMap (OSM) data. Leveraging the OSM data model, our process mines frequent tags to efficiently produce a structured hierarchy, enriching the semantic representation of geo-features. This data-driven taxonomy supports various geospatial analysis applications. Accompanying the methodology, we release the source code of our tool and demonstrate its practical application with tailored taxonomies for California (US) and Greece, underscoring our approach’s adaptability and scalability.

Authors

Basel Shbita,Craig A Knoblock

Published Date

2024/2/5
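
The abstract above describes mining frequent OSM tags to produce a hierarchy but does not give the algorithm. As a rough illustration only, the sketch below counts key=value tag pairs and groups the frequent ones under their key. The function name `build_tag_taxonomy`, the `min_support` threshold, and the sample features are all assumptions made for this example; this is not the authors' released tool.

```python
from collections import Counter, defaultdict


def build_tag_taxonomy(features, min_support=2):
    """Group frequent key=value tags under their key.

    `features` is a list of dicts mapping OSM-style tag keys to values.
    Tag pairs seen fewer than `min_support` times are dropped.
    """
    counts = Counter(
        (key, value) for tags in features for key, value in tags.items()
    )
    taxonomy = defaultdict(list)
    for (key, value), n in counts.most_common():
        if n >= min_support:
            taxonomy[key].append(value)
    return dict(taxonomy)


# Hypothetical features, for illustration only.
features = [
    {"amenity": "cafe"},
    {"amenity": "cafe", "cuisine": "coffee_shop"},
    {"amenity": "school"},
    {"amenity": "school"},
    {"highway": "residential"},
]
print(build_tag_taxonomy(features))
```

With these sample features, only the `amenity` values occur often enough to be kept; the rarer `cuisine` and `highway` tags fall below the threshold.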

Detecting Semantic Errors in Tables using Textual Evidence

Tables can contain various types of errors, including both syntactic and semantic errors. Semantic errors relate to the meaning of the data and can be detrimental for downstream applications. The existing approaches for semantic error detection use structured knowledge sources such as Wikidata and DBpedia, but the coverage of such sources is quite limited. There is much more information available in free text to validate the contents of tables. In this paper, we present a novel semantic-error-detection approach that exploits open-domain textual data to verify the semantic correctness of tables. Our approach leverages contrastive learning, table linearization, and pre-trained language models to implement the error detection process. We implement our approach in a system called SEED and show in the evaluation that it significantly outperforms the other competing approaches.

Authors

Minh Pham,Craig A Knoblock,Muhao Chen

Published Date

2023/12/15
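
The abstract above names table linearization as one ingredient of SEED without specifying the format. One common way to linearize a table row for a pre-trained language model is to join header-value pairs into a single string, sketched below. The `linearize_row` helper, the separator, and the sample row are assumptions for illustration and may differ from what SEED does.

```python
def linearize_row(headers, row, sep=" ; "):
    """Flatten one table row into a 'header: value' text sequence."""
    if len(headers) != len(row):
        raise ValueError("headers and row must have the same length")
    return sep.join(f"{h}: {v}" for h, v in zip(headers, row))


# Hypothetical table row, for illustration only.
headers = ["name", "country"]
row = ["Athens", "Greece"]
print(linearize_row(headers, row))
```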

Indirect Cooperation in Distributed Stationary-Resource Searching with Predefined Destinations

Private vehicles are a direct means to bring people from one place to their desired destinations. However, no omniscient dispatcher is handling the origin-destination of vehicles and the availability of stationary resources, such as parking spaces or charging stations. Competitive cruising for stationary resources leads to environmental pollution and is a waste of drivers' time. We focus on the problem of distributed stationary-resource searching with predefined destinations under a multi-agent scenario. It is a distributed route planning problem with global optimization objectives. We present a probabilistic approach to achieving indirect resource coordination and latent agent cooperation in a distributed manner. Our approach treats the estimated availability of stationary resources as a reference and guides each agent based on their preferences. We evaluate our approach on four real-world datasets. Our approach …

Authors

Fandel Lin,Craig A Knoblock

Published Date

2023/11/13

Exploiting Polygon Metadata to Understand Raster Maps-Accurate Polygonal Feature Extraction

Locating undiscovered deposits of critical minerals requires accurate geological data. However, most of the 100,000 historical geological maps of the United States Geological Survey (USGS) are in raster format. This hinders critical mineral assessment. We target the problem of extracting geological features represented as polygons from raster maps. We exploit the polygon metadata that provides information on the geological features, such as the map keys indicating how the polygon features are represented, to extract the features. We present a metadata-driven machine-learning approach that encodes the raster map and map key into a series of bitmaps and uses a convolutional model to learn to recognize the polygon features. We evaluated our approach on USGS geological maps; our approach achieves a median F1 score of 0.809 and outperforms state-of-the-art methods by 4.52%.

Authors

Fandel Lin,Craig A Knoblock,Basel Shbita,Binh Vu,Zekun Li,Yao-Yi Chiang

Published Date

2023/11/13

Building spatio-temporal knowledge graphs from vectorized topographic historical maps

Historical maps provide rich information for researchers in many areas, including the social and natural sciences. These maps contain detailed documentation of a wide variety of natural and human-made features and their changes over time, such as changes in transportation networks or the decline of wetlands or forest areas. Analyzing changes over time in such maps can be labor-intensive for a scientist, even after the geographic features have been digitized and converted to a vector format. Knowledge Graphs (KGs) are the appropriate representations to store and link such data and support semantic and temporal querying to facilitate change analysis. KGs combine expressivity, interoperability, and standardization in the Semantic Web stack, thus providing a strong foundation for querying and analysis. In this paper, we present an automatic approach to convert vector geographic features extracted from multiple …

Authors

Basel Shbita,Craig A Knoblock,Weiwei Duan,Yao-Yi Chiang,Johannes H Uhl,Stefan Leyk

Journal

Semantic Web

Published Date

2023/1/1

GeoAI for the Digitization of Historical Maps

Historical maps capture past landscapes' natural and anthropogenic features, with geohistorical data from periods before the 1970s (before the Landsat program's launch) primarily found, barring a few exceptions, only on printed map sheets. In the past decade, numerous maps have been digitized and made publicly accessible. This chapter overviews cutting-edge AI methods and systems for processing historical maps to generate valuable data, insights, and knowledge. Individual sections highlight our recently published research findings across various domains, including the semantic web, big data, data mining, machine learning, document understanding, natural language processing, remote sensing, and geographic information systems.

Contact: Yao-Yi Chiang. Email: yaoyi@umn.edu. All other authors are listed in alphabetical order.

Authors

Yao-Yi Chiang,Muhao Chen,Weiwei Duan,Jina Kim,Craig A Knoblock,Stefan Leyk,Zekun Li,Yijun Lin,Min Namgung,Basel Shbita,Johannes H Uhl

Published Date

2023/12/29

Sand: A tool for creating semantic descriptions of tabular sources

Building semantic descriptions of tables is a vital step in data integration. However, this task is expensive and time-consuming as users often need to examine the table data, its metadata, and ontologies to find the most appropriate description. In this paper, we present SAND, a tool for creating semantic descriptions semi-automatically. SAND makes it easy to integrate with semantic modeling systems to predict or suggest semantic descriptions to the users, as well as to use different knowledge graphs (KGs). Besides its modeling capabilities, SAND is equipped with browsing/querying tools to enable users to explore data in the table and discover how it is often modeled in KGs.

Authors

Binh Vu,Craig A Knoblock

Published Date

2022/5/29

Towards the automated large-scale reconstruction of past road networks from historical maps

Transportation infrastructure, such as road or railroad networks, represents a fundamental component of our civilization. For sustainable planning and informed decision making, a thorough understanding of the long-term evolution of transportation infrastructure such as road networks is crucial. However, spatially explicit, multi-temporal road network data covering large spatial extents are scarce and rarely available prior to the 2000s. Herein, we propose a framework that employs increasingly available scanned and georeferenced historical map series to reconstruct past road networks, by integrating abundant, contemporary road network data and color information extracted from historical maps. Specifically, our method uses contemporary road segments as analytical units and extracts historical roads by inferring their existence in historical map series based on image processing and clustering techniques. We tested …

Authors

Johannes H Uhl,Stefan Leyk,Yao-Yi Chiang,Craig A Knoblock

Journal

Computers, environment and urban systems

Published Date

2022/6/1