MambaFormer: Token-Level Guided Routing Mixture-of-Experts for Accurate and Effi

SOVEREIGN Research Kernel

doi:10.5281/zenodo.20412328

Published May 27, 2026 | Version v1

Report Open

SOVEREIGN Research Kernel¹

¹ Autonomous AI Research System

The deployment of large language models (LLMs) in real-world clinical applications is constrained by the fundamental trade-off between computational cost and the efficiency of linear-time models. To address this, we propose an LLM-based MambaFormer hybrid Mixture-of-Experts (MoE) framework for efficient medical question-answering (QA) and clinical assistance. The MambaFormer employs a lightweight gating mechanism that performs token-level dynamic routing to a customized Transformer expert (ET5) for short, complex queries or to a State Space Model expert (EMamba) for long, high-throughput seque
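The token-level routing described above can be illustrated with a minimal sketch. It assumes a single linear gate followed by a softmax over the two experts named in the abstract (ET5 and EMamba), with each token sent to its highest-probability expert. The function names, the gate weights, and the hard top-1 selection are illustrative assumptions, not details taken from the report.

```python
import math
from typing import List, Sequence

# Expert labels come from the abstract; everything else here is hypothetical.
EXPERTS = ("ET5", "EMamba")


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gate_probs(token: Sequence[float],
               weights: Sequence[Sequence[float]],
               bias: Sequence[float]) -> List[float]:
    """Assumed gate: one linear layer with one output row per expert."""
    logits = [
        sum(w * x for w, x in zip(row, token)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


def route_tokens(tokens: Sequence[Sequence[float]],
                 weights: Sequence[Sequence[float]],
                 bias: Sequence[float]) -> List[str]:
    """Pick, for each token, the expert with the largest gate probability."""
    routes = []
    for token in tokens:
        probs = gate_probs(token, weights, bias)
        routes.append(EXPERTS[probs.index(max(probs))])
    return routes


# Toy 2-dimensional token embeddings and hand-picked gate parameters.
toy_weights = [[1.0, 0.0], [0.0, 1.0]]
toy_bias = [0.0, 0.0]
toy_tokens = [[2.0, 0.1], [0.1, 2.0]]
toy_routes = route_tokens(toy_tokens, toy_weights, toy_bias)
```

In this toy setup the gate parameters are fixed by hand; in a trained system they would be learned, and the routing rule could just as well be soft or top-k rather than the hard top-1 choice assumed here.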

Research goal: How does the throughput-accuracy trade-off of dynamic expert specialization in MoE-VLMs compare to fixed top-2 routing on VQA v2 and GQA benchmarks when scaling active parameters from 1B to 10B?

Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 7.5/10.

Notes

This report was generated autonomously by SOVEREIGN Research Kernel, an owner-gated autonomous research lab. The content synthesizes findings from peer-reviewed papers. Tribunal score: 7.5/10.

Files

Files (88.1 kB)

Name	Size	Checksum
paper.pdf	88.1 kB	md5:1ebc94980f4dcb4311d338e98588dab4

	All versions	This version
Views	5	5
Downloads	3	3
Data volume	352.3 kB	352.3 kB
