Published October 8, 2022 | Version v1
Presentation Open

A Framework for Extracting Scientific Measurements and Geo-Spatial Information from Scientific Literature

  • 1. Christian-Albrechts-University Kiel, Germany
  • 2. GEOMAR Helmholtz Centre for Ocean Research Kiel, Germany

Description

Research papers are often the primary means of disseminating scientific information, as researchers encapsulate their findings in these documents. Such findings are generally of complex types, are expressed in diverse ways, and carry rich context. The traditional approach to extracting specific scientific information from these documents is manual extraction, which is very time-consuming. Due to the rapid increase in the number of publications, exploiting the full potential of these rich data sources by manual extraction is becoming infeasible. In this paper, we propose a framework for the automatic extraction of targeted (user-defined) quantitative information, e.g. temperature sensor values, together with its geo-spatial context from scientific documents. Given a database of scientific documents and targeted, user-defined, geo-taggable measurement variables, such as mass accumulation rate (MAR) and sedimentation rate (SR), the problem we address is to retrieve all values of these variables together with their respective geo-spatial information. Although much work has been done in information retrieval, to the best of our knowledge this problem has not yet been explored. We design a novel heterogeneous linking solution that links measurements with locations, which are found by our tailored extraction pipeline. In experimental studies based on our novel dataset of Marine Geology papers, we showcase the capabilities of our linking framework using common geo-taggable Marine Geology measurements.
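To make the problem statement concrete, the following is a minimal Python sketch of the task: find measurement values and location mentions in text, then link each value to a location. It is not the framework described above. The regular expressions, the unit strings (`cm/kyr`, `g/cm2/kyr`), the `Link` type, the `link_measurements` function, and the "most recent preceding coordinate" linking rule are all hypothetical choices made for illustration, since the description does not specify how extraction or linking is performed.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns: the description does not say how values or
# locations are recognised, so these are illustrative assumptions only.
MEASUREMENT = re.compile(
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>cm/kyr|g/cm2/kyr)"
)
COORDINATE = re.compile(
    r"(?P<lat>\d+(?:\.\d+)?)°\s*(?P<ns>[NS]),?\s*"
    r"(?P<lon>\d+(?:\.\d+)?)°\s*(?P<ew>[EW])"
)


@dataclass
class Link:
    """A measurement value paired with a signed latitude/longitude."""
    value: float
    unit: str
    lat: float
    lon: float


def link_measurements(text):
    """Link each measurement to the most recent coordinate mentioned before it.

    This naive proximity rule is an assumption of the sketch, not the
    heterogeneous linking solution proposed in the presentation.
    """
    # Collect coordinate mentions in document order with their offsets.
    coords = []
    for m in COORDINATE.finditer(text):
        lat = float(m["lat"]) * (1 if m["ns"] == "N" else -1)
        lon = float(m["lon"]) * (1 if m["ew"] == "E" else -1)
        coords.append((m.start(), lat, lon))

    links = []
    for m in MEASUREMENT.finditer(text):
        preceding = [c for c in coords if c[0] < m.start()]
        if not preceding:
            continue  # no location seen yet, so the value stays unlinked
        _, lat, lon = preceding[-1]
        links.append(Link(float(m["value"]), m["unit"], lat, lon))
    return links


sample = (
    "At core site A (54.3° N, 10.1° E) the sedimentation rate was "
    "12.5 cm/kyr. Further south (12.0° S, 77.5° W) the mass accumulation "
    "rate reached 3.2 g/cm2/kyr."
)
links = link_measurements(sample)
```

On the made-up sample text, each value is paired with the coordinate that precedes it. Real papers express locations and measurements in far more varied ways (place names, tables, figure captions), which is the difficulty the proposed framework is meant to address.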

Files

Files (2.6 MB)

md5:e041759abf6eb755d40c424d8689388b, 2.6 MB