An Open Access User-Generated Video Dataset from the 2016 Edinburgh Festival

Abstract—This paper presents a user-generated video dataset captured during the 2016 Edinburgh Festival. The dataset was collected using a smartphone and is provided without any post-processing. It can be used for the evaluation of various research tools, such as video quality assessment and enhancement algorithms. The details and characteristics of each video in the dataset are listed, and usage guidelines for researchers are also stated.


Introduction
With the explosion of social networks and media sharing platforms, users are increasingly involved in capturing and recording moments and highlights of public events such as festivals and sports events. The huge amount of content recorded on smartphones creates an opportunity for professional broadcasters to expand their coverage by using user-generated content (UGC) from particular events. It is, however, of major importance to analyze and post-process the UGC before using it in production, to ensure a high quality of experience for the viewers. Designing and implementing such UGC processing tools requires proper test sets to validate the performance of the research modules.
The COGNITUS project aims to combine the advances in UHD broadcasting technologies with the explosion of UGC in order to create new interactive, immersive modes of production. The project is intended to optimize how UHD content is produced and distributed, by capitalizing on the knowledge of professional producers, the ubiquity of UGC, and the power of interactive networked social creativity [1]. In the context of this project, and given that public festivals, in particular the annual Edinburgh Festival, are among the targeted events in COGNITUS, a collection of 7 user-generated videos was created at the 2016 Edinburgh Festival for the evaluation of different UGC quality assessment and enhancement components. The public dataset can be used by researchers to validate their work on UGC processing and media synchronization.
The rest of this paper is structured as follows. The dataset is described in Section 2. Section 3 presents the characteristics of the UGC dataset in terms of spatiotemporal diversity and the complexity of the textures and features present in the videos, and finally Section 4 explains the terms of use and guidelines for users.

Description of the Collection
The dataset includes 7 videos recorded at the Edinburgh Festival in August 2016 using a Samsung Galaxy S5 smartphone. The videos mainly cover the Edinburgh streets and the festival atmosphere, and do not cover any performances. Specific descriptions of the videos are provided in Table 1, and screenshots are shown in Figure 1. The dataset includes videos with UHD (3840×2160), Full-HD (1920×1080), HD (1280×720), and VGA (640×480) spatial resolutions. One of the Full-HD sequences is captured at a frame rate of 60 fps, and the rest are at 30 fps. The bit depth of all videos is 8 bits. All except one of the videos are recorded in landscape mode.
All videos are provided in their native format as MP4 files, as produced by the mobile device, with H.264/AVC video encoding [2] and AAC audio encoding [3]. All original metadata are retained; only the file names have been changed.

Analysis of the Content
We analyzed the sequences from different perspectives to evaluate their suitability for use in research activities. In particular, as this dataset is intended mainly for the validation of quality assessment, quality enhancement, and compression algorithms, it is important to investigate the spatial and temporal diversity of the videos. In this regard, we computed the spatial and temporal indexes of each video according to the ITU-R specifications [4]. Moreover, we re-encoded the sequences to evaluate the rate-distortion performance of each video. For re-encoding, we first decoded the videos using the FFmpeg library [5], and then encoded them in H.265/HEVC [6] using the open source Turing codec [7] in the random access configuration with four QPs: 22, 27, 32, and 37.
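As a minimal sketch of the first analysis step, the spatial index (SI) and temporal index (TI) can be computed in the spirit of ITU-R Rec. P.910: SI is the maximum over frames of the spatial standard deviation of the Sobel-filtered luma, and TI is the maximum standard deviation of successive frame differences. The function name and the synthetic frames below are illustrative, and the sketch assumes the luma of each decoded frame is already available as a NumPy array.

```python
import numpy as np
from scipy import ndimage

def spatial_temporal_index(frames):
    """SI/TI in the style of ITU-R P.910 for a list of grayscale frames.

    SI = max over frames of the spatial std-dev of the Sobel gradient magnitude.
    TI = max over frame pairs of the std-dev of the frame difference.
    """
    si_values, ti_values = [], []
    prev = None
    for frame in frames:
        f = frame.astype(np.float64)
        # Sobel gradients along both image axes, combined into a magnitude
        gx = ndimage.sobel(f, axis=1)
        gy = ndimage.sobel(f, axis=0)
        si_values.append(np.sqrt(gx ** 2 + gy ** 2).std())
        if prev is not None:
            ti_values.append((f - prev).std())
        prev = f
    return max(si_values), (max(ti_values) if ti_values else 0.0)

# Illustrative input: three identical flat frames give SI = TI = 0
frames = [np.full((64, 64), 128, dtype=np.uint8)] * 3
si, ti = spatial_temporal_index(frames)
```

In practice the frames would come from the FFmpeg-decoded sequences; a flat, static clip yields zero for both indexes, while textured, fast-moving content pushes both values up, which is what the spread in Table 2 captures.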
Table 2 summarizes the spatial and temporal indexes of each video, while Figure 2 shows the diversity of the spatiotemporal activity in the dataset using these indexes. Table 3 summarizes the re-encoding results for the dataset, where the bitrates and PSNR values represent the average over the four QPs. Moreover, Figure 3 illustrates the rate-distortion curves of all videos in one plot, where a significant variation in the results can be seen.
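For reference, the PSNR values averaged in Table 3 follow the standard definition for 8-bit video, PSNR = 10 log10(255² / MSE) between the decoded source and the re-encoded frames. A minimal sketch follows; the per-QP numbers in the averaging example are purely illustrative and are not taken from the paper.

```python
import numpy as np

def psnr(ref, dist, peak=255.0):
    """PSNR in dB between two 8-bit frames (higher means closer to the reference)."""
    mse = np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)

# Averaging per-QP results as done for Table 3
# (illustrative PSNR values, not measurements from the dataset)
per_qp_psnr = {22: 42.1, 27: 39.5, 32: 36.8, 37: 34.0}
avg_psnr = sum(per_qp_psnr.values()) / len(per_qp_psnr)  # ≈ 38.1 dB
```

Averaging PSNR (and bitrate) over the four QPs condenses each rate-distortion curve in Figure 3 into a single row of Table 3, at the cost of hiding the curve's shape.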

Usage Guidelines
The dataset is owned by Queen Mary University of London © 2017, and is provided under the Creative Commons Attribution-NonCommercial-NoDerivatives license. The dataset is accessible through the Zenodo research data repository. Any queries should be directed to the authors of this paper.
Users are highly encouraged to use this dataset along with other publicly available datasets from the 2016 Edinburgh Festival, in particular [8], which contains overlapping videos covering the same instances as our dataset.

Figure 1. Screenshots of the videos.

Figure 2. Spatio-temporal index space of the dataset.

Figure 3. Rate-distortion results of re-encoding the dataset.

Table 1. Description of the dataset.

Table 2. Spatial and temporal indexes of the dataset.

Table 3. Bitrate and PSNR results of re-encoding the dataset.