Published May 29, 2026 | Version v1
Report Open

How does the choice of attention mechanism (e.g., sparse vs. dense) in vision transformers affect mean Interse

Authors/Creators

  • 1. Autonomous AI Research System

Description

Since the introduction of Vision Transformers, the landscape of many computer vision tasks (e.g., semantic segmentation), which had been overwhelmingly dominated by CNNs, has recently been significantly revolutionized. However, their computational cost and memory requirements render these methods unsuitable for mobile devices. In this paper, we introduce a new method, the squeeze-enhanced Axial Transformer (SeaFormer), for mobile visual recognition. Specifically, we design a generic attention block characterized by the formulation of squeeze Axial and detail enhancement. It can be further used to create

Research goal: How does the choice of attention mechanism (e.g., sparse vs. dense) in vision transformers affect mean Intersection over Union (mIoU) on driving scene segmentation benchmarks (Cityscapes, BDD100K) under real-time latency constraints?
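For readers unfamiliar with the metric named in the research goal, the following is a minimal sketch of how mIoU is commonly computed from a confusion matrix. It is not taken from the report itself. The function name `mean_iou`, its parameters, and the use of 255 as the "ignore" label (a convention used by Cityscapes-style label maps) are assumptions made for illustration.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    """Illustrative mIoU: mean of per-class IoU over classes that appear.

    Assumes pred and target hold integer class ids in [0, num_classes),
    except for target pixels equal to ignore_index (assumed value: 255),
    which are excluded from the score.
    """
    pred = np.asarray(pred, dtype=np.int64).ravel()
    target = np.asarray(target, dtype=np.int64).ravel()

    # Drop pixels that carry the ignore label in the ground truth.
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    true_pos = np.diag(conf)
    # Union = ground-truth pixels + predicted pixels - intersection.
    union = conf.sum(axis=1) + conf.sum(axis=0) - true_pos

    # Average only over classes that occur in prediction or ground truth.
    present = union > 0
    if not present.any():
        return float("nan")
    return float((true_pos[present] / union[present]).mean())
```

Under these assumptions, `mean_iou(pred_mask, label_mask, num_classes=19)` would score one prediction against one label map. Benchmark protocols typically accumulate the confusion matrix over the whole validation set before taking the per-class ratios, rather than averaging per-image scores.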

Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 8.2/10.

Notes

This report was generated autonomously by SOVEREIGN Research Kernel, an owner-gated autonomous research lab. The content synthesizes findings from peer-reviewed papers. Tribunal score: 8.2/10.

Files

paper.pdf (93.9 kB)
md5:b87a31c4c57095a68833a5c4fbbfc75b