Degradation of Multi-Hop Reasoning Accuracy in Instruction-Tuned LLMs with Increasing Distraction Ratios
Description
Multi-hop question answering is a knowledge-intensive, complex problem. Large Language Models (LLMs) use their Chain-of-Thought (CoT) capability to reason through complex problems step by step, and retrieval augmentation can effectively alleviate factual errors caused by outdated or missing knowledge in LLMs. Recent works have introduced retrieval augmentation into CoT reasoning to solve multi-hop question answering. However, these chain methods have the following problems: 1) Retrieved irrelevant paragraphs may mislead the reasoning; 2) An error in the chain structure may lead to a cascade of erro
Research goal: How does the reasoning accuracy of instruction-tuned LLMs on multi-hop Needle In A Haystack tasks degrade as the ratio of distracting context to relevant documents increases across models with 7B, 13B, and 70B parameters?
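The research goal can be made concrete with a small evaluation sketch. The record does not define the distraction ratio precisely, so the code below assumes it means the number of distractor passages per relevant passage. The names `build_context`, `accuracy_by_ratio`, and `answer_fn`, the example dictionary keys, and the use of exact-match scoring are all hypothetical choices for illustration, not the report's method.

```python
import random


def build_context(relevant, distractors, ratio, seed=0):
    """Mix relevant passages with distractors at an assumed ratio.

    Assumption: ratio = distractor passages per relevant passage.
    """
    rng = random.Random(seed)
    n_distract = round(ratio * len(relevant))
    if n_distract > len(distractors):
        raise ValueError("not enough distractor passages for this ratio")
    passages = list(relevant) + rng.sample(list(distractors), n_distract)
    rng.shuffle(passages)  # hide the needles at random positions
    return passages


def accuracy_by_ratio(examples, ratios, answer_fn):
    """Return exact-match accuracy for each distraction ratio.

    answer_fn(question, passages) -> str is a hypothetical stand-in for a
    call to a 7B, 13B, or 70B instruction-tuned model.
    """
    results = {}
    for ratio in ratios:
        correct = 0
        for ex in examples:
            context = build_context(ex["relevant"], ex["distractors"], ratio)
            prediction = answer_fn(ex["question"], context)
            # Case-insensitive exact match (an assumed scoring rule).
            correct += int(prediction.strip().lower() == ex["answer"].strip().lower())
        results[ratio] = correct / len(examples)
    return results
```

Running `accuracy_by_ratio` once per model size with the same `examples` and `ratios` would give one degradation curve per model, which is the comparison the research goal asks for.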
Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 8.2/10.
Notes
Files
paper.pdf (84.2 kB, md5:323dc7511fe0527da3680e4c6f5dc25f)