Published January 10, 2022 | Version v1
Conference paper | Open Access

Language Based Image Quality Assessment

  • University of Florence, Italy

Description

Evaluation of generative models in the visual domain is often performed by providing anecdotal results to the reader. In the case of image enhancement, reference images are usually available. Nonetheless, using signal-based metrics often leads to counterintuitive results: highly natural, crisp images may obtain worse scores than blurry ones. On the other hand, blind (no-reference) image quality assessment may rank images reconstructed with GANs higher than the original undistorted images. To avoid time-consuming human-based image assessment, semantic computer vision tasks may be exploited instead [9, 25, 33]. In this paper we advocate the use of language generation tasks to evaluate the quality of restored images. We show experimentally that image captioning, used as a downstream task, may serve as a method to score image quality. Captioning scores align better with human rankings than signal-based metrics or no-reference image quality metrics. We also offer insights into how the corruption of local image structure by artifacts may steer image captions in the wrong direction.
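As a rough illustration of the idea described above, the sketch below captions a restored image with an off-the-shelf model and scores it by how similar that caption is to captions of the pristine reference image. This is a minimal sketch under stated assumptions: BLIP stands in for a generic pre-trained captioning model and BLEU for the caption-similarity metric, neither of which is necessarily what the paper uses, and the function names are hypothetical.

    # Sketch: captioning-based image quality scoring (assumed stand-ins:
    # BLIP as the captioning model, BLEU as the caption-similarity metric).
    from PIL import Image
    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
    from transformers import BlipProcessor, BlipForConditionalGeneration

    processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
    model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

    def caption(image_path: str) -> str:
        # Generate a caption for one image with the pre-trained model.
        inputs = processor(images=Image.open(image_path).convert("RGB"),
                           return_tensors="pt")
        out = model.generate(**inputs, max_new_tokens=30)
        return processor.decode(out[0], skip_special_tokens=True)

    def captioning_quality_score(restored_path: str,
                                 reference_captions: list[str]) -> float:
        # Score a restored image by how close its caption stays to the
        # captions of the undistorted reference; higher = less semantic drift.
        hypothesis = caption(restored_path).split()
        references = [c.split() for c in reference_captions]
        return sentence_bleu(references, hypothesis,
                             smoothing_function=SmoothingFunction().method1)

In practice, such per-image scores would be averaged over a benchmark and their induced ranking compared against human judgments, which is how the description frames the evaluation.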

Files

mmasia-21.pdf (3.3 MB)
md5:15d371165aaa7036ba4ee579b3bdbce1

Additional details

Funding

AI4Media – A European Excellence Centre for Media, Society and Democracy (Grant No. 951911)
European Commission