What is the performance gap trend in visual reasoning accuracy between SMoES-based MoE-VLMs and dense models a

SOVEREIGN Research Kernel

doi:10.5281/zenodo.20432411

Published May 28, 2026 | Version v1

Report Open

SOVEREIGN Research Kernel¹

1. Autonomous AI Research System

As data volumes grow, large-scale pre-trained models increasingly store knowledge in an enormous number of parameters. Training these models consists largely of dense algebraic operations and requires a huge amount of hardware resources. Recently, sparsely-gated Mixture-of-Experts (MoE) models have become more popular and have demonstrated impressive pretraining scalability on various downstream tasks. However, such sparse conditional computation may not be as effective as expected in practical systems because of routing imbalance and fluctuation problems. Generall

Research goal: What is the performance gap trend in visual reasoning accuracy between SMoES-based MoE-VLMs and dense models across 7B, 13B, and 34B parameter scales on the MMMU benchmark under varying expert activation ratios?
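
The terms "expert activation ratio" and "routing imbalance" used in this record can be made concrete with a minimal sketch. It assumes plain top-k routing, in which each token is sent to the k experts with the largest router logits. The function names, the toy router, and the imbalance metric (maximum expert load divided by mean load) are illustrative assumptions and are not taken from the report.

```python
import random
from collections import Counter


def top_k_route(logits, k):
    """Return the indices of the k largest router logits for one token."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return order[:k]


def activation_ratio(k, num_experts):
    """Fraction of experts that process each token under top-k routing."""
    return k / num_experts


def load_imbalance(assignments, num_experts):
    """Maximum expert load divided by mean load; 1.0 means perfectly balanced."""
    counts = Counter(e for experts in assignments for e in experts)
    loads = [counts.get(e, 0) for e in range(num_experts)]
    mean_load = sum(loads) / num_experts
    return max(loads) / mean_load


if __name__ == "__main__":
    rng = random.Random(0)
    num_experts, k, num_tokens = 8, 2, 1000  # hypothetical settings
    assignments = []
    for _ in range(num_tokens):
        # Toy router (assumption): Gaussian logits with a bias toward expert 0,
        # which produces the kind of skewed load the abstract calls imbalance.
        logits = [rng.gauss(0.0, 1.0) for _ in range(num_experts)]
        logits[0] += 1.0
        assignments.append(top_k_route(logits, k))
    print("activation ratio:", activation_ratio(k, num_experts))
    print("load imbalance:", load_imbalance(assignments, num_experts))
```

Under these assumptions, varying k at a fixed number of experts changes the activation ratio, while the imbalance value shows how unevenly tokens are spread across experts.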

Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 7.7/10.

Notes

This report was generated autonomously by SOVEREIGN Research Kernel, an owner-gated autonomous research lab. The content synthesizes findings from peer-reviewed papers. Tribunal score: 7.7/10.

Files

Name	Size
paper.pdf (md5:ae587a77ad5dd3d1b7770cae8d9e4a39)	84.2 kB

