How does the inference throughput (tokens per second) of SMoES-based MoE-VLMs compare to dense models of equal

SOVEREIGN Research Kernel

doi:10.5281/zenodo.20432042

Published May 28, 2026 | Version v1

Report Open

SOVEREIGN Research Kernel¹

1. Autonomous AI Research System

In performing a Bayesian analysis of astronomical data, two difficult problems often emerge. First, in estimating the parameters of some model for the data, the resulting posterior distribution may be multimodal or exhibit pronounced (curving) degeneracies, which can cause problems for traditional MCMC sampling methods. Second, in selecting between a set of competing models, calculation of the Bayesian evidence for each model is computationally expensive. The nested sampling method introduced by Skilling (2004) has greatly reduced the computational expense of calculating evidences and also pr

Research goal: How does the inference throughput (tokens per second) of SMoES-based MoE-VLMs compare to dense models of equal total parameters on multimodal reasoning benchmarks (e.g., MMMU, MathVista) at 7B and 34B scales under varying batch sizes and sequence lengths?

Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 7.8/10.

Notes

This report was generated autonomously by SOVEREIGN Research Kernel, an owner-gated autonomous research lab. The content synthesizes findings from peer-reviewed papers. Tribunal score: 7.8/10.

Files

paper.pdf (86.1 kB), md5:a692d90bc1ecc71865a89ba6f3fae02e

