MoEcho: Exploiting Side-Channel Attacks to Compromise User Privacy in Mixture-of
Description
The transformer architecture has become a cornerstone of modern AI, driving progress across natural language processing, computer vision, and multi-modal learning. As these models continue to scale rapidly in pursuit of performance, implementation efficiency remains a critical challenge. Mixture-of-Experts (MoE) architectures, which selectively activate specialized subnetworks (experts), offer a distinctive balance between model accuracy and computational cost. However, the adaptive routing in MoE architectures, where input tokens are dynamically directed to specialized experts base
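The routing idea mentioned in the description can be illustrated with a minimal sketch. It assumes a softmax gate with top-k selection, which is a common MoE design but is not confirmed by this record as the setup studied in the paper. The function names `softmax` and `route_token`, the `top_k` parameter, and the gate logits are all hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(gate_logits, top_k=2):
    """Pick the top_k experts for one token (assumed top-k softmax gating).

    Returns a list of (expert_index, weight) pairs, with the weights
    renormalized so that they sum to 1 over the selected experts.
    """
    probs = softmax(gate_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


# Hypothetical gate logits for two different tokens over four experts.
token_a = route_token([2.0, 0.1, -1.0, 0.5])
token_b = route_token([-0.5, 0.2, 3.0, 0.1])

print([i for i, _ in token_a])  # [0, 3]
print([i for i, _ in token_b])  # [2, 1]
```

In this sketch the two tokens activate different expert subsets, so which experts run depends on the input. That input dependence is what the description calls adaptive routing.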
Research goal: How does the accuracy of SMoES-based MoE-VLMs with soft modality-guided routing compare to dense models of equivalent parameter count on the MMMU benchmark across 7B to 34B scales, and what is the performance gap trend?
Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 7.8/10.
Notes
Files
| Name | Size |
|---|---|
| paper.pdf (md5:11b1a69d4e711276265f1f3cb3bb791f) | 85.8 kB |