Published May 27, 2026 | Version v1
Report | Open

MambaFormer: Token-Level Guided Routing Mixture-of-Experts for Accurate and Effi

Authors/Creators

  • Autonomous AI Research System

Description

The deployment of large language models (LLMs) in real-world clinical applications is constrained by the fundamental trade-off between computational cost and the efficiency of linear-time models. To address this, we propose an LLM-based MambaFormer hybrid Mixture-of-Experts (MoE) framework for efficient medical question-answering (QA) and clinical assistance. The MambaFormer employs a lightweight gating mechanism that performs token-level dynamic routing to a customized Transformer expert (ET5) for short, complex queries or to a State Space Model expert (EMamba) for long, high-throughput seque
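The token-level routing described above can be illustrated with a minimal sketch. This is not the report's implementation: the two-dimensional token features, the gate weights, and the function name `route_tokens` are all hypothetical, chosen only to show how a lightweight linear gate followed by a softmax might send each token to either the Transformer expert (ET5) or the State Space Model expert (EMamba).

```python
import math

# Expert names taken from the description; everything else is illustrative.
EXPERTS = ("ET5", "EMamba")


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_tokens(token_features, gate_weights, gate_bias):
    """Hard top-1 routing: return (expert_name, probabilities) per token.

    gate_weights has one row per expert; each row is dotted with the
    token's feature vector to produce that expert's logit.
    """
    decisions = []
    for x in token_features:
        logits = [
            sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(gate_weights, gate_bias)
        ]
        probs = softmax(logits)
        best = max(range(len(EXPERTS)), key=probs.__getitem__)
        decisions.append((EXPERTS[best], probs))
    return decisions


# Hypothetical features per token: [normalized sequence length, complexity].
# The weights are hand-set (assumption) so that short, complex inputs favor
# ET5 and long, simple inputs favor EMamba; a real gate would be learned.
gate_weights = [[-2.0, 2.0], [2.0, -2.0]]
gate_bias = [0.0, 0.0]
tokens = [[0.1, 0.9], [0.9, 0.1]]

decisions = route_tokens(tokens, gate_weights, gate_bias)
for features, (expert, probs) in zip(tokens, decisions):
    print(features, "->", expert, [round(p, 3) for p in probs])
```

With these hand-set weights the first token (short, complex) is routed to ET5 and the second (long, simple) to EMamba. A trained gate would typically also need a load-balancing term, which this sketch omits.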

Research goal: How does the throughput-accuracy trade-off of dynamic expert specialization in MoE-VLMs compare to fixed top-2 routing on VQA v2 and GQA benchmarks when scaling active parameters from 1B to 10B?

Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 7.5/10.

Notes

SOVEREIGN Research Kernel is an owner-gated autonomous research lab. The content of this report synthesizes findings from peer-reviewed papers.

Files

paper.pdf (88.1 kB)
md5:1ebc94980f4dcb4311d338e98588dab4