Stepwise Asymmetric Quantization for Flow Matching Models
Description
I investigate the per-step quantization sensitivity of flow matching diffusion models through a systematic evaluation of FLUX.1-dev across eight GGUF quantization levels (Q2_K–Q8_0). Experiments on 2,000 images spanning five content categories, each designed to probe a distinct failure mode (text rendering, spatial reasoning, portraits, landscapes, artistic styles), reveal three key findings. First, the initial denoising steps (1–3) are robust to extreme quantization: running them with a 2.5-bit model preserves absolute image quality equivalent to 8-bit inference, as measured by two independent reference-free metrics (Quality Score 0.0308 vs 0.0306; ImageReward 1.291 vs 1.252). Second, reference-based metrics (LPIPS, FID), which are widely used for quantization evaluation, conflate trajectory divergence with quality degradation under stepwise model switching: configurations with an FID of 62.3 and an LPIPS of 0.233 achieve reference-free quality matching or exceeding the 8-bit baseline. Third, CLIP Score remains nearly invariant across all quantization levels (0.358–0.362), indicating that semantic composition is preserved even at 2-bit precision. I release a ComfyUI custom node (Dual Model KSampler) that enables practical asymmetric quantization scheduling, together with my full evaluation toolkit, as open-source tools.
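The stepwise switching described above can be illustrated with a minimal sketch. It is not the Dual Model KSampler implementation: the function names, the Euler integrator, the 20-step schedule, and the toy velocity fields standing in for the low-bit and 8-bit models are all assumptions made for illustration. It shows only the scheduling idea, in which the first few steps call one velocity model and the remaining steps call another.

```python
from typing import Callable, List, Tuple

# A "model" here is any function v(x, t) returning a velocity (hypothetical interface).
Velocity = Callable[[List[float], float], List[float]]


def dual_model_euler_sample(
    x: List[float],
    low_bit_model: Velocity,
    high_bit_model: Velocity,
    num_steps: int = 20,   # assumed step count, not taken from the paper
    switch_step: int = 3,  # steps 1-3 (indices 0-2) use the low-bit model
) -> Tuple[List[float], List[str]]:
    """Euler-integrate dx/dt = v(x, t) from t=1 (noise) to t=0 (sample),
    switching velocity models after `switch_step` steps."""
    ts = [1.0 - i / num_steps for i in range(num_steps + 1)]
    used: List[str] = []
    for i in range(num_steps):
        use_low = i < switch_step
        model = low_bit_model if use_low else high_bit_model
        used.append("low" if use_low else "high")
        v = model(x, ts[i])
        dt = ts[i + 1] - ts[i]  # negative: time runs from 1 down to 0
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x, used


def make_toy_velocity(target: List[float], scale: float = 1.0) -> Velocity:
    """Toy stand-in for a model: the straight-line velocity toward `target`,
    optionally mis-scaled to mimic quantization error."""
    def velocity(x: List[float], t: float) -> List[float]:
        return [scale * (xi - ti) / t for xi, ti in zip(x, target)]
    return velocity


if __name__ == "__main__":
    target = [0.5, -1.0]
    high = make_toy_velocity(target)             # stands in for the 8-bit model
    low = make_toy_velocity(target, scale=1.05)  # stands in for the low-bit model
    sample, schedule = dual_model_euler_sample([2.0, 3.0], low, high)
    print(schedule)
    print(sample)
```

In this toy setting the perturbed early steps change the intermediate trajectory while the later, unperturbed steps still arrive at the target. That loosely mirrors the distinction the abstract draws between trajectory divergence and final quality, although the toy proves nothing about real models.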
Files
| Name | Size | Checksum |
|---|---|---|
| Stepwise Asymmetric Quantization for Flow Matching Models.pdf | 512.2 kB | md5:92784ad5d17a5763d601eaab285488f7 |
Additional details
Dates
- Created
- 2026-04-10 (Initial release)
Software
- Repository URL
- https://github.com/lee09lee26/ComfyUI-AsymQuantSampler
- Programming language
- Python
- Development Status
- Active