AeroEngQA
Description
Dataset name:
AeroEngQA
Description:
AeroEngQA is a low-volume, high-quality benchmark Question Answering (QA) dataset for aircraft design, intended to support qualitative evaluation of Large Language Models (LLMs).
Dataset DOI:
10.5281/zenodo.14215677
Paper citation:
Silva, E.A., Marsh, R., Yong, H.K., Middleton, S.E., Sóbester, A., "Retrieval-Augmented Generation and In-Context Prompted Large Language Models in Aircraft Engineering", AIAA-2025, AIAA, doi:10.2514/6.2025-0700
Abstract:
With the aerospace industry taking its first steps towards exploiting the rapidly evolving technology of Large Language Models (LLMs), this study explores the potential of the latest generation of LLMs to become an effective link in the aircraft design tool chain of the future. Our focus is on the task of Question Answering (QA) in engineering, which has the potential to augment future aircraft design team meetings with an intelligent LLM-based agent able to engage with the team via a chatbot interface. We compare three of the most effective and popular classes of LLM QA prompting today – LLM zero-shot prompting, LLM in-context prompting and LLM-based Retrieval-Augmented Generation (RAG). We describe a new, low-volume, high-quality benchmark aircraft design QA dataset (AeroEngQA) and use it to qualitatively evaluate each class of LLM, exploring properties including answer accuracy and answer simplicity. We provide domain-specific insights into the usefulness of today’s LLMs for engineering design tasks such as aircraft design, and a view on how this might evolve in the future as the next generation of LLMs emerges.
Acknowledgements:
The DAWS 2 (Development of Advanced Wing Solutions 2) project is supported by the ATI Programme, a joint Government and industry investment to maintain and grow the UK’s competitive position in civil aerospace design and manufacture. The programme, delivered through a partnership between the Aerospace Technology Institute (ATI), Department for Business, Energy & Industrial Strategy (BEIS) and Innovate UK, addresses technology, capability and supply chain challenges.
Files (170.5 kB)
AeroEngQA_multi-hop-unanswerable.json

| MD5 checksum | Size |
|---|---|
| 05ff4897523e60757d182f793892aff9 | 69.1 kB |
| 59a615a1e174cc17782b506d59b8df40 | 26.6 kB |
| 487e94feb1c02358b86fea7a9050e88d | 28.8 kB |
| 840a8dd313c509571048f00d8ca02f4a | 22.5 kB |
| c697b0dff639d4abda58577285063525 | 21.5 kB |
| 1b695df0cebc31543e1d177a2d79c0bf | 2.2 kB |
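The md5 values listed with each file can be used to confirm that a download is intact. The following is a minimal sketch in Python using only the standard library. The helper names `md5_of_file` and `matches_checksum` are illustrative and not part of the dataset, and the pairing of any particular local file with a particular checksum is an assumption the reader needs to confirm against the record.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with a listed value.

    Accepts the value with or without the "md5:" prefix used on the record page.
    """
    normalised = expected_md5.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == normalised
```

For example, `matches_checksum("downloads/some_file.json", "md5:05ff4897523e60757d182f793892aff9")` returns `True` only if the local file (a hypothetical path here) has exactly that digest. MD5 is suitable for detecting accidental corruption during transfer, not for guarding against deliberate tampering.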
Additional details
Related works
- Is published in
- Conference paper: https://arc.aiaa.org/doi/10.2514/6.2025-0700