Published March 12, 2026 | Version 1.0
Dataset
Open
Sample Dataset for AI-Generated Scientific Storytelling
Authors/Creators
Description
This repository contains the dataset used to fine-tune the models evaluated in the paper.
The released data is the training material; it is provided to illustrate the structure, format, and intermediate representations used in the scientific storytelling pipeline.
Contents:
- dataset.json: metadata describing scientific papers and associated narrative sources.
- paper_transcriptions.json: parsed text of scientific papers used as model input.
- stories_with_text.json: narrative texts used as supervision for story generation.
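A minimal sketch for a first look at the released files, assuming they sit in a local directory and are plain UTF-8 JSON. This record does not document the internal schema of any file, so the code reports only the top-level type and entry count of each one. The names `describe_json_file` and `summarize_release` are hypothetical helpers introduced for illustration, not part of the release.

```python
import json
from pathlib import Path

# File names are taken from the Contents list above. Their internal schema is
# not documented here, so only the top-level structure is inspected.
RELEASE_FILES = (
    "dataset.json",
    "paper_transcriptions.json",
    "stories_with_text.json",
)


def describe_json_file(path):
    """Return (top-level type name, number of top-level entries or None)."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    size = len(data) if isinstance(data, (list, dict)) else None
    return type(data).__name__, size


def summarize_release(directory="."):
    """Describe each released file found in `directory`; skip missing ones."""
    summary = {}
    for name in RELEASE_FILES:
        path = Path(directory) / name
        if path.is_file():
            summary[name] = describe_json_file(path)
    return summary


if __name__ == "__main__":
    # Assumption: the files were downloaded into the current directory.
    for name, (kind, size) in summarize_release().items():
        print(f"{name}: top-level {kind}, {size} entries")
```

Files that are not present are skipped rather than treated as errors, so the sketch also works when only part of the release has been downloaded.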
Files
dataset.json
Additional details
Dates
- Updated: 2026-03-12