Published August 5, 2017 | Version v1
Conference paper Open

Mining Social Science Publications for Survey Variables

  • 1. GESIS - Leibniz Institute for the Social Sciences

Description

Research in the social sciences is usually based on survey data, where individual research questions relate to observable concepts (variables). However, because standards for data citation are lacking, reliably identifying the variables used in a publication is often difficult. In this paper, we present a work-in-progress study that addresses the variable detection task with supervised machine learning algorithms, using a linguistic analysis pipeline to extract a rich feature set that includes terminological concepts and similarity metric scores. We also present preliminary results on a small dataset designed specifically for this task, which show a significant increase in performance over the random baseline.
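
As a rough illustration of what "similarity metric scores" could look like as features, the following minimal sketch computes two lexical similarity scores between a publication sentence and the text of each candidate survey variable. Everything in it is an assumption made for illustration: the function names, the choice of Jaccard and cosine similarity, and the toy sentence and variables are hypothetical and are not taken from the paper. In a supervised setup, scores like these would be passed to a classifier along with other features.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def jaccard(a, b):
    """Jaccard overlap between the token sets of two token lists."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine(a, b):
    """Cosine similarity between two bag-of-words count vectors."""
    count_a, count_b = Counter(a), Counter(b)
    dot = sum(count_a[t] * count_b[t] for t in count_a)
    norm_a = math.sqrt(sum(v * v for v in count_a.values()))
    norm_b = math.sqrt(sum(v * v for v in count_b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def similarity_features(sentence, variable_text):
    """Return a small feature dict for one (sentence, variable) pair."""
    s, v = tokenize(sentence), tokenize(variable_text)
    return {"jaccard": jaccard(s, v), "cosine": cosine(s, v)}


# Hypothetical survey variables and publication sentence (illustration only).
variables = {
    "v1": "How much do you trust the national parliament?",
    "v2": "How many hours do you watch television per day?",
}
sentence = "Respondents reported low trust in the national parliament."

features = {
    vid: similarity_features(sentence, text) for vid, text in variables.items()
}

# Rank candidates by one score; a trained classifier would combine many.
best = max(features, key=lambda vid: features[vid]["cosine"])
```

Ranking by a single score, as in the last line, is only a stand-in for the supervised model described above. It shows how the features separate a related variable from an unrelated one in this toy case.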

Files

document(1).pdf (140.5 kB)
md5:4e128a4fd9a433597db7ea9f9f36cd7d

Additional details

Funding

OpenMinTeD – Open Mining INfrastructure for TExt and Data (grant no. 654021)
European Commission