Published January 19, 2026 | Version v1
Dataset Open

Sample Dataset for AI-Generated Scientific Storytelling

Authors/Creators

Description

This repository contains the dataset used to fine-tune an AI Scientific Storyteller.

The released data represents the training material and is provided to illustrate the structure, format, and intermediate representations used in the scientific storytelling pipeline.

Contents:

  • new_dataset.json: metadata describing scientific papers and associated narrative sources.
  • new_parsed_output.json: parsed text of scientific papers used as model input.
  • new_stories_with_text.json: narrative texts used as supervision for story generation.
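The record does not document the internal schema of these files, so a first step is to inspect their top-level shape before relying on any field names. The following is a minimal sketch under stated assumptions: the three file names come from the list above, while `DATA_DIR`, `summarize`, and `summarize_file` are hypothetical names introduced here, and the files are assumed to sit in the current directory.

```python
import json
from pathlib import Path

# Assumption: the files were downloaded into the current working directory.
DATA_DIR = Path(".")
# File names as given in the record's contents list.
FILES = [
    "new_dataset.json",
    "new_parsed_output.json",
    "new_stories_with_text.json",
]


def summarize(value):
    """Describe the top-level shape of a decoded JSON value without assuming a schema."""
    if isinstance(value, dict):
        return {"type": "object", "size": len(value), "sample_keys": sorted(value)[:5]}
    if isinstance(value, list):
        first = value[0] if value else None
        # Only report keys when the first element is itself a JSON object.
        keys = sorted(first)[:5] if isinstance(first, dict) else []
        return {"type": "array", "size": len(value), "sample_keys": keys}
    return {"type": type(value).__name__, "size": None, "sample_keys": []}


def summarize_file(path):
    """Load one JSON file and summarize its top-level structure."""
    with open(path, encoding="utf-8") as handle:
        return summarize(json.load(handle))


if __name__ == "__main__":
    for name in FILES:
        path = DATA_DIR / name
        if path.exists():
            print(name, summarize_file(path))
        else:
            print(name, "not found in", DATA_DIR.resolve())
```

The summary reports only what is actually in each file (container type, number of entries, a few key names), which makes it a safe starting point for working out how papers, parsed text, and stories are linked.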

Files

Total size: 6.4 MB

Checksum Size
md5:611293fd0647195d804a63c86e1d0eab 896.1 kB
md5:0e23506aaa4b74c93f86f73aa1d62092 4.0 MB
md5:86664348918a06e46cca7c69dd584aa2 1.6 MB
md5:c1055d8e75f82741515d8cc3a2e00d2b 545 Bytes
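
The MD5 checksums listed above can be used to confirm that a download is intact. The sketch below uses Python's standard `hashlib`. Because the listing does not show which checksum belongs to which file name, it only checks whether a file's digest is among the four published values. The names `LISTED_MD5`, `md5_of_chunks`, `md5_of_file`, and `matches_listing` are hypothetical helpers introduced for illustration.

```python
import hashlib

# Checksums copied from the record's file listing (the "md5:" prefix is dropped).
LISTED_MD5 = {
    "611293fd0647195d804a63c86e1d0eab",
    "0e23506aaa4b74c93f86f73aa1d62092",
    "86664348918a06e46cca7c69dd584aa2",
    "c1055d8e75f82741515d8cc3a2e00d2b",
}


def md5_of_chunks(chunks):
    """Return the hex MD5 digest of an iterable of bytes objects."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large files are never fully in memory."""
    with open(path, "rb") as handle:
        return md5_of_chunks(iter(lambda: handle.read(chunk_size), b""))


def matches_listing(path):
    """True if the file's MD5 equals one of the published checksums."""
    return md5_of_file(path) in LISTED_MD5
```

For example, `matches_listing("new_dataset.json")` would return `True` for an uncorrupted copy, assuming the file is in the working directory. MD5 is suitable here as an integrity check against transfer errors, not as protection against deliberate tampering.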