There is a newer version of the record available.

Published June 26, 2023 | Version v1.0
Software Open

Divide and Conquer the EmpiRE: A Community-Maintainable Knowledge Graph of Empirical Research in Requirements Engineering - A Sustainable Literature Review for Analyzing the State and Evolution of Empirical Research in Requirements Engineering

  • 1. TIB - Leibniz Information Centre for Science and Technology

Description

This project contains the continuously updated data, analysis, and results of a sustainable literature review on the state and evolution of empirical research in requirements engineering (RE) using the developed KG-EmpiRE.

KG-EmpiRE is a community-maintainable knowledge graph (KG) of empirical research in requirements engineering, based on scientific data extracted from currently 570 papers published in the research track of the IEEE International Requirements Engineering Conference from 2000 to 2022. We currently organize the scientific data in KG-EmpiRE using a defined template for six themes: research paradigm, research design, research method, data collection, data analysis, and bibliographic metadata. The long-term plan is to expand these themes.

KG-EmpiRE itself is maintained in the Open Research Knowledge Graph (ORKG). The ORKG is a cross-domain and cross-topic research knowledge graph (RKG) with a corresponding technical infrastructure and services for organizing scientific data from papers in accordance with the FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. The TIB - Leibniz Information Centre for Science and Technology develops and maintains the ORKG permanently and has committed itself to the long-term archiving of all data. As a central access point to all curated papers in KG-EmpiRE, we established a more general ORKG observatory on empirical research in software engineering. In addition, the ORKG provides an RDF dump of all its data that includes the most recent data from KG-EmpiRE. We also store the data used for analysis as CSV files, which can be distinguished by date.

Note: The CSV files with the date "2023-06-26" enable the replication of the results of the related publication. The details on the replication of the results can be found in the usage instructions.
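As a minimal sketch of how the dated CSV files could be picked out for replication, the following Python snippet filters a list of file names by an ISO date contained in the name. The file names and the helper files_for_date are hypothetical and only serve as an illustration; the actual file layout is described in the usage instructions of the repository.

```python
import re
from pathlib import Path

# Matches an ISO date such as 2023-06-26 anywhere in a file name.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def files_for_date(names, date):
    """Return the sorted CSV file names whose name contains the given ISO date."""
    selected = []
    for name in names:
        path = Path(name)
        match = DATE_PATTERN.search(path.stem)
        if path.suffix.lower() == ".csv" and match and match.group() == date:
            selected.append(name)
    return sorted(selected)


# Hypothetical file names, for illustration only (not taken from the project).
example_names = [
    "query_1_data_2023-06-26.csv",
    "query_1_data_2023-09-01.csv",
    "query_2_data_2023-06-26.csv",
    "README.md",
]

replication_files = files_for_date(example_names, "2023-06-26")
print(replication_files)
```

Under these assumptions, only the files dated "2023-06-26" would be passed on to the analysis when replicating the published results.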

In this project, we perform the data analysis of KG-EmpiRE, which has two purposes:

(1) We evaluate the coverage of the curated topic of empirical research in RE by KG-EmpiRE.

(2) We gain insights into the state and evolution of empirical research in RE.

The data analysis is based on competency questions regarding empirical research in SE, including RE, derived from the vision of Sjøberg et al. (2007). They describe their vision of the role of empirical methods in SE, including RE, for the period 2020–2025 by comparing the "current" state of practice (2007) with their target state (2020–2025). We analyzed these descriptions and derived a total of 77 competency questions. The number of competency questions answered reflects the coverage of the curated topic in KG-EmpiRE (1), and the answers to the competency questions provide insights into the state and evolution of empirical research in RE (2). For each competency question that can be answered with KG-EmpiRE (currently 16 of 77), we specified SPARQL queries to retrieve and analyze the data of KG-EmpiRE from the ORKG. We provide all details of the analysis, with its SPARQL queries, data, visualizations, and explanations, in the Jupyter Notebook hosted on Binder for interactive replication and (re-)use, always using the most recent data from KG-EmpiRE.
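To illustrate what retrieving data with a SPARQL query can look like, here is a minimal sketch that uses only the Python standard library. The endpoint URL is an assumption about the public ORKG SPARQL endpoint and should be checked against the ORKG documentation; the query is a generic placeholder, not one of the project's queries; and the helper names build_request, parse_bindings, and run_query are hypothetical. The actual queries are specified in the Jupyter Notebook.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public ORKG SPARQL endpoint; verify against the ORKG documentation.
ENDPOINT_URL = "https://orkg.org/triplestore"

# Generic placeholder query; the project's real queries live in the notebook.
EXAMPLE_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_request(query, endpoint=ENDPOINT_URL):
    """Build a SPARQL protocol POST request that asks for JSON results."""
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )


def parse_bindings(payload):
    """Flatten a SPARQL JSON result into a list of {variable: value} rows."""
    variables = payload["head"]["vars"]
    rows = []
    for binding in payload["results"]["bindings"]:
        # Unbound variables are simply absent from a binding; map them to None.
        rows.append(
            {var: binding[var]["value"] if var in binding else None for var in variables}
        )
    return rows


def run_query(query, endpoint=ENDPOINT_URL):
    """Send the query over the network and return the parsed rows."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=60) as response:
        return parse_bindings(json.load(response))


# Offline demonstration with a hand-written payload in the SPARQL JSON results format.
sample_payload = {
    "head": {"vars": ["paper", "year"]},
    "results": {
        "bindings": [
            {"paper": {"type": "uri", "value": "http://example.org/P1"},
             "year": {"type": "literal", "value": "2022"}},
            {"paper": {"type": "uri", "value": "http://example.org/P2"}},
        ]
    },
}
sample_rows = parse_bindings(sample_payload)
print(sample_rows)
```

Calling run_query(EXAMPLE_QUERY) would contact the endpoint; the offline demonstration only shows how a result in the standard SPARQL JSON format is turned into rows that can then be loaded into a table for analysis.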

The analysis of the individual competency questions always follows the same structure:

  1. Data Selection: Explaining the competency question and the required data for the analysis.
  2. Data Collection: Executing the specified SPARQL query to retrieve the data.
  3. Data Exploration: Exploring the data, including its cleaning and validation, to prepare the data for data analysis.
  4. Data Analysis: Analyzing the data and creating visualizations.
  5. Data Interpretation: Interpreting the data and deriving insights.
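The five steps above can be sketched on a toy example. The competency question, the rows, and the helper names explore and analyze below are invented for illustration and do not come from KG-EmpiRE; the visualization part of step 4 is omitted to keep the sketch dependency-free.

```python
from collections import Counter

# 1. Data Selection: a hypothetical question, "What share of papers per year
#    reports a data collection method?", needs paper, year, and method.

# 2. Data Collection: stand-in rows in place of a SPARQL query result.
#    These values are made up for illustration only.
raw_rows = [
    {"paper": "P1", "year": "2000", "method": "survey"},
    {"paper": "P2", "year": "2000", "method": ""},
    {"paper": "P3", "year": "2022", "method": "interview"},
    {"paper": "P3", "year": "2022", "method": "interview"},  # exact duplicate
    {"paper": "P4", "year": "n/a", "method": "experiment"},  # invalid year
]


def explore(rows):
    """3. Data Exploration: drop duplicates and invalid years, cast year to int."""
    seen = set()
    cleaned = []
    for row in rows:
        key = (row["paper"], row["year"], row["method"])
        if key in seen or not row["year"].isdigit():
            continue
        seen.add(key)
        cleaned.append({**row, "year": int(row["year"])})
    return cleaned


def analyze(rows):
    """4. Data Analysis: share of rows per year with a non-empty method."""
    total = Counter()
    reported = Counter()
    for row in rows:
        total[row["year"]] += 1
        if row["method"]:
            reported[row["year"]] += 1
    return {year: reported[year] / total[year] for year in sorted(total)}


shares = analyze(explore(raw_rows))

# 5. Data Interpretation: print the shares as the basis for deriving insights.
for year, share in shares.items():
    print(f"{year}: {share:.0%} of papers report a data collection method")
```

Keeping each step in its own function mirrors the fixed structure of the analysis, so a different competency question would mainly change the query in step 2 and the aggregation in step 4.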

Overall, this project serves to make the data, analysis, and results openly available in the long term according to the FAIR data principles to enable a replicable, (re-)usable and thus sustainable literature review.

In this way, this project can be used for:

  1. Replication of the results from the related publication.

  2. (Re-)use of KG-EmpiRE with its most recent data.

  3. Repetition of our research approach for sustainable literature reviews on other topics.

Files

okarras/EmpiRE-Analysis-v1.0.zip

Files (23.6 MB)

md5:cb1b7b0c1e13762bcda39ed1c8ce9e14
23.6 MB

Additional details

Related works

Is supplement to
Computational notebook: https://github.com/okarras/EmpiRE-Analysis/tree/v1.0 (URL)
Conference paper: 10.1109/ESEM56168.2023.10304795 (DOI)
Preprint: 10.48550/arXiv.2306.16791 (DOI)

Software

Repository URL
https://github.com/okarras/EmpiRE-Analysis
Programming language
Jupyter Notebook, Python
Development Status
Active