Sport-ROI: A Live Sports Dataset for Subjective Quality Assessment of Semantically-Driven Video Coding
Authors/Creators
Description
This dataset addresses the perceptual evaluation of Region-of-Interest (ROI)-based coded videos in live streaming scenarios. ROI-based encoding is a well-known video coding approach that allocates higher quality to semantically important regions while saving bits in less important areas. This strategy enables bitrate reductions while maintaining a comparable viewing experience for end users. The technique is particularly important in live streaming scenarios, where high quality must be delivered under tight encoding time constraints.
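The bit-allocation idea described above can be sketched as a per-block quantization parameter (QP) map. This is only an illustration under stated assumptions, not the encoding pipeline used to build the dataset: the function name `roi_qp_map`, the block-level mask, the base QP, and the two offsets are hypothetical, and the 0 to 51 clamp assumes an H.264/HEVC-style QP range for 8-bit video.

```python
def roi_qp_map(block_mask, base_qp=32, roi_delta=-4, bg_delta=4,
               qp_min=0, qp_max=51):
    """Build a per-block QP grid from a binary ROI mask.

    block_mask: 2D list, 1 where a coding block overlaps the ROI, else 0.
    A lower QP means finer quantization, i.e. more bits and higher quality.
    All default values are illustrative assumptions.
    """
    qp_map = []
    for row in block_mask:
        qp_row = []
        for is_roi in row:
            # ROI blocks get a negative offset, background blocks a positive one.
            qp = base_qp + (roi_delta if is_roi else bg_delta)
            # Clamp to the assumed valid QP range.
            qp_row.append(max(qp_min, min(qp_max, qp)))
        qp_map.append(qp_row)
    return qp_map


# Hypothetical 3x4 block grid with a two-block ROI in the middle row.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
qp = roi_qp_map(mask)
```

In this sketch the background blocks are quantized more coarsely than the ROI blocks, which is where the bitrate saving would come from; how large the offsets can be before viewers notice is exactly the kind of question a subjective dataset is meant to answer.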
Although this approach has a long history, recent advances in semantic segmentation have opened new opportunities for optimizing encoding based on semantic analysis and enabling bit allocation according to semantic importance. However, optimizing this type of encoding remains challenging, as the relative quality importance of different semantic objects must be quantified.
When predicting the quality of such videos, current state-of-the-art video quality metrics struggle, as most of them have been trained on videos in which ROI-based encoding was not used. While saliency-weighted metrics exist, they are limited, as they often neglect the impact of coding artifacts on shifts in visual attention and also overlook the role of semantics.
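For readers unfamiliar with saliency weighting, the following minimal sketch shows one common form of it: a PSNR in which each pixel's squared error is weighted by a saliency value. It is not a specific published metric and is not claimed to be among those evaluated on this dataset; the function name `weighted_psnr` and the toy frames are assumptions made for illustration.

```python
import math


def weighted_psnr(ref, dist, weights, peak=255.0):
    """PSNR computed from a saliency-weighted mean squared error.

    ref, dist, weights: 2D lists of equal shape (single-channel frame).
    weights must be non-negative and sum to a positive value.
    """
    num = 0.0
    den = 0.0
    for r_row, d_row, w_row in zip(ref, dist, weights):
        for r, d, w in zip(r_row, d_row, w_row):
            num += w * (r - d) ** 2
            den += w
    wmse = num / den
    if wmse == 0:
        return math.inf  # identical frames under these weights
    return 10 * math.log10(peak ** 2 / wmse)


# Toy 2x2 frame with a single distorted pixel (values are made up).
ref = [[100, 100], [100, 100]]
dist = [[100, 110], [100, 100]]
uniform = [[1, 1], [1, 1]]
salient = [[0.1, 0.7], [0.1, 0.1]]  # most weight on the distorted pixel

psnr_uniform = weighted_psnr(ref, dist, uniform)
psnr_salient = weighted_psnr(ref, dist, salient)
```

Note that the weight map here is fixed in advance. If coding artifacts themselves draw the viewer's attention elsewhere, a precomputed map of this kind does not capture that shift, which is the limitation the paragraph above points to.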
To address the need for metrics better suited to evaluating quality in semantically weighted encoding, we introduce a new dataset containing different ROI-based coding approaches and evaluate the performance of existing metrics on this dataset. Our results show that both full-reference and no-reference video quality prediction models are challenged by this dataset.
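Metric performance on a subjective dataset is commonly summarized by correlating metric outputs with mean opinion scores (MOS), for example with the Pearson linear correlation coefficient (PLCC) and the Spearman rank-order correlation coefficient (SROCC). The sketch below shows that computation in plain Python. Whether the dataset's own evaluation uses exactly these indicators is an assumption, the function names are hypothetical, and the numbers are made-up placeholders rather than values from the dataset.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation (SROCC): Pearson on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder values for illustration only, not taken from the dataset.
mos = [1.5, 2.8, 3.1, 4.2, 4.6]
metric_scores = [30.0, 45.0, 60.0, 70.0, 95.0]

plcc = pearson(metric_scores, mos)
srocc = spearman(metric_scores, mos)
```

SROCC only checks whether the metric orders the videos the same way viewers do, while PLCC also rewards a linear relationship; in practice a nonlinear mapping is often fitted before computing PLCC, a step omitted here for brevity.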