LFM2 Technical Report
Description
We present LFM2, a family of Liquid Foundation Models designed for efficient on-device deployment and strong task capabilities. Using hardware-in-the-loop architecture search under edge latency and memory constraints, we obtain a compact hybrid backbone that combines gated short convolutions with a small number of grouped query attention blocks, delivering up to 2x faster prefill and decode on CPUs compared to similarly sized models. The LFM2 family covers 350M-8.3B parameters, including dense models (350M, 700M, 1.2B, 2.6B) and a mixture-of-experts variant (8.3B total, 1.5B active), all with
Research goal: What is the impact of increasing the number of active experts (k) on inference latency and VQA accuracy in sparse MoE vision-language models, and does the optimal k vary with the visual complexity of the input?
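To make the role of k in the research goal concrete, the following is a minimal sketch of generic top-k expert routing for a single token. It is not taken from the LFM2 report: the function names `top_k_route` and `active_params`, the softmax renormalization over the selected experts, and the parameter accounting are all illustrative assumptions.

```python
import math


def top_k_route(logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Assumption: weights are a softmax over the selected logits only,
    so they sum to 1. Real routers may normalize differently.
    """
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k largest router logits, highest first.
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the chosen experts.
    peak = max(logits[i] for i in chosen)
    exps = [math.exp(logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def active_params(shared_params, params_per_expert, k):
    """Hypothetical accounting: shared layers plus k expert blocks per token."""
    return shared_params + k * params_per_expert


if __name__ == "__main__":
    router_logits = [0.1, 2.0, -1.0, 1.5]  # made-up scores for 4 experts
    for k in (1, 2, 4):
        routes = top_k_route(router_logits, k)
        print(k, [i for i, _ in routes], active_params(100, 10, k))
```

Under these assumptions the per-token expert compute grows linearly with k, which is why the latency side of the question depends directly on how many experts are activated.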
Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 7.5/10.
Notes
Files
| Name | Size | MD5 |
|---|---|---|
| paper.pdf | 86.9 kB | c771fd0d223dd30cf5cb314a7eeb2957 |