Short Stories Dataset

Medrado Gondim, João; Bonil da Silva, Gustavo; Pedrini, Helio; Bitencourt dos Santos, Marina; Avila, Sandra; Hashiguti, Simone

doi:10.5281/zenodo.16883667

Published August 15, 2025 | Version 1.1.0

Dataset Open

1. Universidade Estadual de Campinas (UNICAMP)
2. University of Campinas

The corpus was created to investigate how narratives about Black and white women are constructed in short stories generated in Portuguese. Each of the 2100 instances in the dataset is a short story generated with the model meta-llama/Llama-3.2-3B-Instruct from Hugging Face. The data is stored in a CSV file in which each row contains: the prompt used; the short story output by the model; the name used to create the story, or the tag “no name” if no name was used; and the race of the main character (as set in the prompt; this tag was mainly used for visualization purposes).
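A minimal sketch of reading the file with Python's standard `csv` module. The record does not list the column headers, so the sketch relies on whatever header row the file actually has. The helper names `load_rows` and `count_values` and the UTF-8 encoding are assumptions made for illustration, not part of the dataset's documentation.

```python
import csv
from collections import Counter


def load_rows(path, encoding="utf-8"):
    """Read the CSV into a list of dicts keyed by the file's own header row.

    The encoding is an assumption; adjust it if the file fails to decode.
    """
    # newline="" lets the csv module handle line breaks inside quoted stories.
    with open(path, newline="", encoding=encoding) as handle:
        return list(csv.DictReader(handle))


def count_values(rows, column):
    """Count how often each value appears in the given column."""
    return Counter(row[column] for row in rows)
```

Calling `load_rows("shortstories_name_noname_pt.csv")` should return one dict per story, and `list(rows[0].keys())` shows the real column names. Once those are known, `count_values(rows, "race")`, with the hypothetical name `race` replaced by the actual header, gives the number of stories per category.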

The datasheet, which gives more information on the corpus, and the generation and analysis code are available in this repository: https://github.com/hiaac-nlp/clusteringdiscourses

Files

Files (5.5 MB)

Name	Size	Checksum
shortstories_name_noname_pt.csv	5.5 MB	md5:3eac2acd3479c3dfc2939649d0dcf172
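The published MD5 value can be used to check that a downloaded copy is intact. The following is a sketch using Python's standard `hashlib`. The function names `md5_of_file` and `matches_published_checksum` are hypothetical helpers, and the file is assumed to have been saved locally under its original name.

```python
import hashlib

# Checksum listed on the record page for shortstories_name_noname_pt.csv.
EXPECTED_MD5 = "3eac2acd3479c3dfc2939649d0dcf172"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Return True if the local file's MD5 equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

A result of `False` from `matches_published_checksum("shortstories_name_noname_pt.csv")` would suggest an incomplete download or a different version of the file.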

Additional details

Available: 2025-08-15

Dataset made available

Repository URL: https://github.com/hiaac-nlp/clusteringdiscourses
Programming language: Python

Metric	All versions	This version
Views	312	211
Downloads	92	76
Data volume	615.4 MB	500.4 MB

Resource type: Dataset
Publisher: Zenodo
Conference: Simpósio em Tecnologia da Informação e da Linguagem Humana (STIL), Fortaleza/CE, Brazil, 29 September to 2 October
Languages: Portuguese

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: August 15, 2025
Modified: August 15, 2025
