Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced M
Description
We present Uni-MoE 2.0 from the Lychee family. As a fully open-source omnimodal large model (OLM), it substantially advances Lychee's Uni-MoE series in language-centric multimodal understanding, reasoning, and generation. Starting from a dense LLM, we build Uni-MoE-2.0-Omni from scratch through three core contributions: a dynamic-capacity Mixture-of-Experts (MoE) design, a progressive training strategy enhanced with an iterative reinforcement strategy, and a carefully curated multimodal data matching technique. The model is capable of omnimodal understanding, as well as generating images, text, and speech.
Research goal: How does the computational efficiency (FLOPs per forward pass) of SMoEs scale with the number of active experts and with the modality composition of the input, relative to dense and hard-routed MoE baselines, across different batch sizes on multimodal QA tasks?
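To make the quantity in the research goal concrete, the following is a minimal back-of-the-envelope sketch of how feed-forward FLOPs could be counted for a dense layer versus a routed MoE layer. It is not taken from the paper. It assumes a plain two-matrix feed-forward block, a linear router, 2 FLOPs per multiply-add, and it ignores attention, normalization, and modality encoders. All function names, the per-modality expert counts, and the example sizes are hypothetical.

```python
def ffn_flops(tokens: int, d_model: int, d_ff: int) -> int:
    """FLOPs of one two-matrix feed-forward block (up- and down-projection)."""
    # Each projection costs d_model * d_ff multiply-adds per token, 2 FLOPs each.
    return 2 * tokens * (2 * d_model * d_ff)


def moe_flops(tokens: int, d_model: int, d_ff: int,
              num_experts: int, active_experts: int) -> int:
    """FLOPs of a routed MoE layer where each token visits `active_experts` experts."""
    # Assumption: the router is a single linear map from d_model to num_experts.
    router = 2 * tokens * d_model * num_experts
    experts = active_experts * ffn_flops(tokens, d_model, d_ff)
    return router + experts


def mixed_batch_flops(token_counts: dict, active_per_modality: dict,
                      d_model: int, d_ff: int, num_experts: int) -> int:
    """Sum MoE FLOPs over modalities, each with its own (hypothetical) active-expert count."""
    return sum(
        moe_flops(n, d_model, d_ff, num_experts, active_per_modality[m])
        for m, n in token_counts.items()
    )


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    d_model, d_ff, num_experts = 1024, 4096, 8
    batch = {"text": 512, "image": 2048, "audio": 1024}
    active = {"text": 1, "image": 2, "audio": 2}
    dense = ffn_flops(sum(batch.values()), d_model, d_ff)
    sparse = mixed_batch_flops(batch, active, d_model, d_ff, num_experts)
    print(f"dense FFN FLOPs: {dense:.3e}")
    print(f"routed MoE FLOPs: {sparse:.3e}")
```

Under these assumptions the expert cost grows linearly in the number of active experts, while the router cost depends only on the total number of experts, so the modality mix affects the total only through how many tokens each modality contributes and how many experts those tokens activate.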
Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 7.5/10.
Notes
Files
| Name | Size | Checksum |
|---|---|---|
| paper.pdf | 95.3 kB | md5:ea9f4434f82d1251b8ed9772dcad84cd |