Published July 7, 2025 | Version 01
Dataset Open

Data and code for the article "Beyond Human Gold Standards: A Multi-Model Framework for Automated Abstract Classification and Information Extraction"

  • 1. University of Geneva

Description

This is the public repository for the article "Beyond Human Gold Standards: A Multi-Model Framework for Automated Abstract Classification and Information Extraction" by Delphine S. Courvoisier, Diana Buitrago Garcia, Nils Burgisser, Clément P. Buclin, Michele Iudici, and Denis Mongin.
The up-to-date repository can be found here: https://gitlab.unige.ch/trial_integrity/llm_majority_public

The structure of the repository is as follows:


- The folder [LLM_inference](./LLM_inference) contains the LLM inferences for the two tasks, performed on the abstracts listed in the [abstract csv file](./LLM_inference/abstract.csv) by the LLMs listed in the [model_list.csv](./LLM_inference/model_list.csv) file. The two tasks are the classification of the intervention (folder [abstract_classification](./LLM_inference/abstract_classification)) and the extraction of the number of participants (folder [participant_numbers](./LLM_inference/participant_numbers)). The initial list contained 1080 abstracts, some of which were excluded from our final analysis because they were protocols, or were not randomized.
  - Both task folders contain the Python script used for the inference, which uses the prompt in the `prompt` folder, and the two bash scripts used to run it on the university HPC.
  - All inference results are in the `results` folder, which contains the log files and one csv file per model.
  - The file gold.csv contains, for the final list of 1020 abstracts, the tasks performed by each reviewer, the human gold standard, and the platinum standard, with a 0/1 variable `platine_check` indicating which gold results were re-checked.
- The folder [R_analysis](./R_analysis) contains the R files used to perform the analysis and produce the tables and figures:
  - the file [analysis.R](./R_analysis/analysis.R) contains the code to read the LLM inference results and calculate the accuracy for the different model combinations. It outputs a file in the [results](./R_analysis/results) folder.
  - the file [figure_tables.R](./R_analysis/figure_tables.R) contains the R code that uses the output of analysis.R to produce the tables and figures of the article. The figures and tables are written to the [figures_tables](./R_analysis/figures_tables) folder. The file [trial_publication_info.csv](./R_analysis/trial_publication_info.csv) contains the information about the RCTs used for this analysis, taken from the data of the study doi.org/10.1016/j.jclinepi.2024.111586.
  - the file [help_func.R](./R_analysis/help_func.R) contains the functions used to format the table results and is loaded in `figure_tables.R`.
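
To illustrate what "accuracy for the different model combinations" can mean in practice, here is a minimal Python sketch of a majority vote over several models, scored against a reference standard. It is not the code in `analysis.R` (which is written in R): the function names, the dictionary layout, the toy labels, and the choice to count a tied vote as incorrect are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def majority_vote(labels):
    """Return the most frequent label, or None when the top count is tied."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # assumption: a tie yields no decision
    return counts[0][0]


def combination_accuracy(predictions, gold, size):
    """Majority-vote accuracy for every combination of `size` models.

    predictions: {model_name: {abstract_id: label}}  (hypothetical layout)
    gold:        {abstract_id: label}
    """
    results = {}
    for combo in combinations(sorted(predictions), size):
        correct = 0
        for abstract_id, truth in gold.items():
            vote = majority_vote([predictions[m][abstract_id] for m in combo])
            correct += vote == truth  # a tie (None) never matches
        results[combo] = correct / len(gold)
    return results


# Toy data: three hypothetical models, three abstracts.
predictions = {
    "model_a": {1: "drug", 2: "surgery", 3: "drug"},
    "model_b": {1: "drug", 2: "drug", 3: "drug"},
    "model_c": {1: "other", 2: "surgery", 3: "other"},
}
gold = {1: "drug", 2: "surgery", 3: "other"}

accuracy = combination_accuracy(predictions, gold, size=3)
```

With real data, the nested dictionaries would be built from the per-model csv files in the `results` folders and from gold.csv; the column names in those files are not described here, so the loading step is left out.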

Files

llm_majority_public.zip

Files (7.3 MB)

md5:77f09ae0c0a4f194cbd4fe8d3d39fe39
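
The MD5 checksum listed above can be used to check that the archive was downloaded intact. A minimal sketch, assuming the file was saved locally as `llm_majority_public.zip` (the local path and the helper names are assumptions for this example):

```python
import hashlib

# Checksum as listed for llm_majority_public.zip on this page.
EXPECTED_MD5 = "77f09ae0c0a4f194cbd4fe8d3d39fe39"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `archive_is_intact("llm_majority_public.zip")` after the download should return `True` if the file matches the listed checksum.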

Additional details

Funding

Swiss National Science Foundation
Fostering transparency in rheumatology randomized clinical trials 212393

Software

Programming language
R, Python
Development Status
Active