Published February 28, 2025 | Version v2
Journal article Open

Out of Context! Managing the Limitations of Context Windows in ChatGPT-4o Text Analyses

  • 1. Finnish Environment Institute
  • 2. University of Helsinki

Description

Large language model (LLM) applications have surged in popularity in recent years, and academia has followed suit. Researchers frequently seek to automate text annotation, often a tedious task, and, to some extent, text analysis. Popular LLMs such as ChatGPT have been studied as both research assistants and analysis tools, and these studies have raised several concerns regarding transparency and the nature of AI-generated content. This study assesses ChatGPT’s usability and reliability for text analysis, specifically keyword extraction and topic classification, in an “out-of-the-box” zero-shot or few-shot setting, with emphasis on how the size of the context window and the variety of text types influence the resulting analyses. Our findings indicate that both the text type and the order in which texts are presented significantly affect ChatGPT’s analysis. Context-building tends to be less problematic when the texts analyzed are similar to one another. Lengthy texts and documents, however, pose serious challenges: once the context window is exceeded, “hallucinated” results often emerge. While some of these issues stem from the core functioning of LLMs, others can be mitigated through transparent research planning.

Files

OutOfContext_Mervaala_Kousa.pdf

Files (940.4 kB)

md5:3eb9025bcaf998d283fd5c060824e116

Additional details

Identifiers

Other
15090

Related works

Continues
Conference proceeding: 10.18653/v1/2024.nlp4dh-1.51 (DOI)

Dates

Accepted
2025-02-09