ConnOSS and Metadata Extraction for Research Software

1. Carl von Ossietzky Universität Oldenburg
2. GESIS - Leibniz-Institute for the Social Sciences
3. ZB MED - Information Centre for Life Sciences
4. Oldenburger Institut für Informatik

Metadata and software descriptors help realize the FAIR principles for software. Several efforts address research software metadata (e.g., CodeMeta, Bioschemas, maSMP schema) and metadata extraction (e.g., SOMEF, HERMES, MAUS). Despite these efforts, existing tools and schemas remain fragmented, cover only part of the relevant metadata, and are rarely built for large-scale, automated processing or for enrichment with modern AI techniques. To address this gap, the Connected Open Source Software (ConnOSS) project aims to provide a consistent infrastructure for metadata extraction and publication, enabling researchers to create harmonized software descriptions and making it easier for registries and aggregators to harvest metadata. The project aims to analyze and extend existing research software metadata schemas, identify metadata sources, and develop a harmonized pipeline for extracting metadata from platforms such as GitHub and GitLab. Machine learning models trained on a curated corpus are planned to extract, enrich, and validate metadata from README files, addressing current automation gaps. A publication workflow is then intended to make the metadata accessible to humans and machines via GitHub/GitLab Pages. In this poster, we introduce ConnOSS and present a preliminary comparison of research software metadata extractors, which will later be used to define the requirements for the ConnOSS metadata extractor.
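As a rough illustration of the kind of README-based extraction described above, the following minimal sketch derives a few fields from a README and emits them as a CodeMeta-style JSON-LD record. It is not the ConnOSS extractor: the abstract plans machine learning models, whereas this sketch substitutes simple regular-expression heuristics, and the function name `extract_readme_metadata`, the sample README, and the `example-org/demotool` repository are hypothetical.

```python
import json
import re

# JSON-LD context published for CodeMeta 2.0.
CODEMETA_CONTEXT = "https://doi.org/10.5063/schema/codemeta-2.0"


def extract_readme_metadata(readme: str) -> dict:
    """Rule-based stand-in for the planned ML extraction step (hypothetical helper)."""
    record = {"@context": CODEMETA_CONTEXT, "@type": "SoftwareSourceCode"}

    # Heuristic: the first level-1 Markdown heading is taken as the software name.
    title = re.search(r"^#[ \t]+(.+)$", readme, flags=re.MULTILINE)
    if title:
        record["name"] = title.group(1).strip()

    # Heuristic: the first non-empty line that is not a heading, link, or badge
    # is taken as the description.
    for line in readme.splitlines():
        text = line.strip()
        if text and not text.startswith(("#", "[", "!")):
            record["description"] = text
            break

    # Heuristic: the first GitHub or GitLab URL is taken as the code repository.
    repo = re.search(r"https://(?:github|gitlab)\.com/[\w.-]+/[\w.-]+", readme)
    if repo:
        record["codeRepository"] = repo.group(0).rstrip(".")

    return record


if __name__ == "__main__":
    # Hypothetical README used only for demonstration.
    sample = (
        "# DemoTool\n\n"
        "DemoTool converts survey data into tidy tables.\n\n"
        "Source: https://github.com/example-org/demotool\n"
    )
    print(json.dumps(extract_readme_metadata(sample), indent=2))
```

The resulting dictionary could be written to a `codemeta.json` file in the repository, which is one conventional place for registries and aggregators to look for such descriptions.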

This work is a contribution to the deRSE 2026 Conference; see https://events.hifis.net/event/2945/contributions/21334/

This work has been supported by the German Research Foundation (DFG) through the project ConnOSS with project number 561044496.

Files (5.9 MB)

Name	MD5 checksum	Size
deRSE26 ConnOSS Poster.pdf	63e284c5520f92286e1e1c37e37a900a	930.0 kB
deRSE26 ConnOSS Poster.png	3330ba743c3127315f4bf458f340768a	1.6 MB
deRSE26 ConnOSS Poster.svg	cfe4dd6ae32741e7ddc86cd2d4c46c2e	3.4 MB

Additional details

Deutsche Forschungsgemeinschaft
Connected Open-Source Software (ConnOSS) 561044496


DOI

Resource type

Poster

Publisher

Zenodo

Conference

deRSE26 - Conference for Research Software Engineering in Germany (deRSE26), Stuttgart, Germany, 03-05 March 2026 (Session Poster session, Part 255)

Languages

English

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: March 3, 2026
Modified: March 5, 2026
