Comparative Analysis of Qwen3 MoE and Llama-3.1-8B Context Retrieval Robustness Under Adversarial Noise
Description
While Dense Retrieval Models (DRMs) have advanced Information Retrieval (IR), these neural models remain limited in generalizability and robustness. The Mixture-of-Experts (MoE) architecture is one way to address this limitation. Whereas previous IR studies have incorporated MoE architectures within the Transformer layers of DRMs, our work investigates an architecture that integrates a single MoE block (SB-MoE) after the output of the final Transformer layer. Our empirical evaluation investigates how SB-MoE compares, in terms of retrieval effectiveness, to standard fine-tuning…
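To make the SB-MoE idea concrete, the following is a minimal sketch of a single MoE block applied to the output embedding of the final Transformer layer. It is an illustration under stated assumptions, not the implementation evaluated in the report: the class name `SingleBlockMoE`, the use of linear experts, the dense softmax gate over all experts (no top-k routing), and the random weights are all hypothetical choices made here for brevity.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def make_matrix(rows, cols, rng, scale=0.1):
    """Random matrix standing in for trained weights (assumption)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class SingleBlockMoE:
    """Hypothetical single MoE block placed after the final encoder layer."""

    def __init__(self, dim, num_experts, seed=0):
        rng = random.Random(seed)
        # Gating network: one score per expert, computed from the embedding.
        self.gate = make_matrix(num_experts, dim, rng)
        # Each expert is a single linear map here (assumption).
        self.experts = [make_matrix(dim, dim, rng) for _ in range(num_experts)]

    def gate_weights(self, embedding):
        """Softmax-normalized mixing weights, one per expert."""
        return softmax(matvec(self.gate, embedding))

    def forward(self, embedding):
        """Return the gate-weighted sum of all expert outputs."""
        weights = self.gate_weights(embedding)
        mixed = [0.0] * len(embedding)
        for weight, expert in zip(weights, self.experts):
            expert_out = matvec(expert, embedding)
            mixed = [m + weight * e for m, e in zip(mixed, expert_out)]
        return mixed


# Stand-in for the final Transformer layer's output embedding (made-up values).
encoder_output = [0.2, -0.1, 0.4, 0.0]
moe = SingleBlockMoE(dim=4, num_experts=3)
refined = moe.forward(encoder_output)
```

In this sketch the block leaves the encoder untouched and only post-processes its output, which is the structural difference from designs that place experts inside the Transformer layers.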
Research goal: How does Qwen3's Mixture-of-Experts (MoE) architecture compare to dense models like Llama-3.1-8B in terms of context retrieval robustness under noisy or adversarial inputs on benchmarks like ANLI or HellaSwag?
Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 8.6/10.
Notes
Files
paper.pdf (82.0 kB), md5:6208432a843ad5cc5a155c8799a2e0de