Published December 1, 2025 | Version v1
Journal article Open

Optimizing YOLOv8: OpenVINO standard quantization vs accuracy-controlled for edge deployment

Description

Object detection models such as You Only Look Once (YOLO) are widely used in real-time applications; however, their computational complexity often restricts deployment on edge devices. This research investigates the optimization of YOLO models using OpenVINO quantization, both with and without accuracy control, to enable efficient inference while preserving model accuracy. A two-step pipeline is proposed: first, YOLO models are converted into OpenVINO’s intermediate representation (IR) format; second, post-training quantization (PTQ) is applied to reduce model size and inference latency. Additionally, an accuracy-aware quantization approach is introduced, which preserves model accuracy by calibrating against a validation dataset. Experimental results illustrate the tradeoffs between standard and accuracy-controlled quantization, demonstrating improvements in inference speed with minimal accuracy degradation. This study provides a practical framework for deploying lightweight object detection models on edge devices, particularly in real-world scenarios such as autonomous systems, smart surveillance, and smart queue management systems.
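The difference between standard and accuracy-controlled quantization can be illustrated with a small, library-free sketch. It is not the paper's pipeline and does not call OpenVINO; in practice these steps would typically be handled by OpenVINO's NNCF functions `nncf.quantize` and `nncf.quantize_with_accuracy_control`. Every name below (`toy_ptq`, `toy_accuracy_controlled_ptq`, `toy_metric`, the layer names, and the weight values) is hypothetical, and the "metric" is a stand-in for a validation score such as mAP. The assumed idea: quantize everything, and if the metric drops by more than `max_drop`, revert the most damaging layers to floating point one at a time.

```python
from typing import Callable, Dict, List, Tuple

Weights = Dict[str, List[float]]


def fake_quantize_int8(values: List[float]) -> List[float]:
    """Symmetric per-tensor int8 quantization followed by dequantization."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    scale = max_abs / 127.0
    return [max(-127, min(127, round(v / scale))) * scale for v in values]


def toy_ptq(weights: Weights) -> Weights:
    """Standard PTQ (toy): quantize every layer, never check accuracy."""
    return {name: fake_quantize_int8(w) for name, w in weights.items()}


def toy_accuracy_controlled_ptq(
    weights: Weights,
    metric_fn: Callable[[Weights], float],
    max_drop: float,
) -> Tuple[Weights, List[str]]:
    """Quantize all layers, then revert layers to float until the metric
    drop relative to the float model is at most max_drop."""
    baseline = metric_fn(weights)
    quantized = toy_ptq(weights)
    kept_float: List[str] = []

    while baseline - metric_fn(quantized) > max_drop:
        candidates = [n for n in weights if n not in kept_float]
        if not candidates:
            break

        def metric_if_reverted(name: str) -> float:
            trial = dict(quantized)
            trial[name] = weights[name]
            return metric_fn(trial)

        # Revert the layer whose restoration recovers the most metric.
        worst = max(candidates, key=metric_if_reverted)
        quantized[worst] = weights[worst]
        kept_float.append(worst)

    return quantized, kept_float


# Hypothetical two-layer "model"; the head has an outlier weight, so its
# small weights collapse to zero under per-tensor int8 quantization.
float_weights: Weights = {
    "backbone": [0.5, -0.25, 0.125],
    "head": [10.0, 0.01, 0.02],
}


def toy_metric(w: Weights) -> float:
    """Stand-in for validation accuracy: 1 minus total weight error."""
    error = sum(
        abs(a - b)
        for name in float_weights
        for a, b in zip(w[name], float_weights[name])
    )
    return 1.0 - error


standard = toy_ptq(float_weights)
controlled, kept_float = toy_accuracy_controlled_ptq(
    float_weights, toy_metric, max_drop=0.01
)
print("standard drop:  ", toy_metric(float_weights) - toy_metric(standard))
print("controlled drop:", toy_metric(float_weights) - toy_metric(controlled))
print("layers kept in float:", kept_float)
```

In this toy setup the accuracy-controlled variant keeps the sensitive layer in floating point and quantizes the rest, which mirrors the tradeoff the abstract describes: slightly less compression in exchange for a bounded accuracy drop.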

Files

34 43579.pdf

Files (843.7 kB)

md5:2ad3b4aeddbe0f9c9fd83f5f462d0fa8