Evaluating LLM Generalization of Literary Style: Fine-tuning on Federico García Lorca's Poetry
Description
We present an annotated dataset of 283 poems by Federico García Lorca spanning nine major works (1921-1940), enriched with publication metadata, structural features, and GPT-4-generated thematic and contextual annotations. We describe the construction pipeline, from EPUB extraction through synthetic annotation to bibliographic resolution, and demonstrate its application to evaluating the generalization capacity of large language models (LLMs) in the domain of literary style transfer. As a case study, we fine-tune Llama 3 8B on this corpus and evaluate its ability to generate original poetry that reflects Lorca's distinctive stylistic patterns. Our results suggest that even sub-10B-parameter models can internalize non-trivial aspects of a specific author's voice from fewer than 300 training examples, opening questions about the nature of stylistic representation in neural language models.
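Fine-tuning on a small annotated corpus like this typically means rendering each poem record into the target model's chat format. As a minimal sketch, the function below converts one record into a Llama 3 instruction/response training string; the record field names (`themes`, `text`) are assumptions for illustration, not the dataset's documented schema.

```python
# Sketch: render one annotated poem record as a Llama 3 chat-format
# training example. The field names used here ("title", "collection",
# "themes", "text") are hypothetical, not the dataset's actual schema.

def to_training_example(record: dict) -> str:
    """Build a single instruction/response pair in Llama 3's chat template."""
    prompt = (
        "Write a poem in the style of Federico García Lorca "
        f"on the themes of {', '.join(record['themes'])}."
    )
    # Llama 3 special tokens delimit the user turn and the assistant turn.
    return (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{prompt}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
        f"{record['text']}<|eot_id|>"
    )

example = to_training_example({
    "title": "Romance de la luna, luna",
    "collection": "Romancero gitano",
    "themes": ["the moon", "death"],
    "text": "La luna vino a la fragua...",
})
print(example)
```

With fewer than 300 such examples, a parameter-efficient method (e.g. LoRA) on the 8B model is the natural choice; the released GGUF artifact suggests the fine-tuned weights were then quantized for local inference.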
Files
(347.7 kB)
| Name | Size |
|---|---|
| main.pdf (md5:734aa3327da70fa1d40c5fb62c16562b) | 347.7 kB |
Additional details
Related works
- Is supplemented by
  - Model: https://huggingface.co/xaviviro/Lorca-LLama3-8B-GGUF
  - Dataset: https://huggingface.co/datasets/xaviviro/FEDERICO-GARCIA-LORCA-canciones-poemas-romances-annotated
Dates
- Available: 2026-03-12
References
- Brown, T., Mann, B., Ryder, N., Subbiah, M., et al. Language models are few-shot learners.
- Celikyilmaz, A., Clark, E., and Gao, J. Evaluation of text generation: a survey.
- Chakrabarty, T., Padmakumar, V., and He, H. Help me write a poem: instruction tuning as a vehicle for collaborative poetry writing.