This dataset contains two records. The first record is the annotated dataset. The second record is a prebuilt Singularity image containing the code and trained model for predicting on new videos.
We generated 1,253 video clips totaling 2,637,363 frames. Clip duration varies, depending on the length of the grooming prediction. Annotators were required to provide a "Grooming" or "Not Grooming" annotation for every frame.
The annotated dataset is stored in the h5 record and is described as follows:
First-level grouping is the Train/Validation split
Second-level grouping is by video clip
Each video clip contains 5 datasets:
  Number of frames in this video
  Shape: nframes x 112 x 112
  Labels for each frame
    0 = not grooming, 1 = grooming
  Information on whether or not annotators agreed
    0 = disagree, 1 = agree
    When annotators disagree, the label contains the value from the first person to annotate the frame
  Number of annotators that have labeled the video clip
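The layout described above can be explored with a short Python sketch. This is an illustration under stated assumptions, not the authors' loading code: the function names `describe_file` and `summarize_labels` are hypothetical, the file path is whatever name the downloaded h5 record has, and the dataset names inside each clip are not given here, so the sketch discovers them by walking the file instead of hard-coding them. It also assumes the agreement dataset holds one value per frame. Reading the file requires the third-party `h5py` package.

```python
def summarize_labels(labels, agreement):
    """Return (fraction of frames labeled grooming, fraction with annotator agreement).

    Both inputs are per-frame sequences of 0/1 values, following the encoding
    described above (1 = grooming, 1 = agree).
    """
    labels = [int(v) for v in labels]
    agreement = [int(v) for v in agreement]
    if len(labels) != len(agreement):
        raise ValueError("labels and agreement must have one entry per frame")
    if not labels:
        return 0.0, 0.0
    n = len(labels)
    return sum(labels) / n, sum(agreement) / n


def describe_file(h5_path):
    """Print split, clip, dataset name, shape, and dtype for every dataset."""
    import h5py  # third-party dependency: pip install h5py

    with h5py.File(h5_path, "r") as f:
        for split_name, split in f.items():        # first level: Train/Validation
            for clip_name, clip in split.items():  # second level: one group per clip
                for ds_name, ds in clip.items():   # datasets stored for this clip
                    if isinstance(ds, h5py.Dataset):
                        print(split_name, clip_name, ds_name, ds.shape, ds.dtype)
```

Once `describe_file` has revealed the actual dataset names, the label and agreement arrays of a clip can be passed to `summarize_labels` to see how much of the clip is grooming and how often annotators agreed.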
Example usage of the trained model:
singularity run --nv GroomingInferRelease.sif Input_Movie.avi
The input movie must be a 480×480 video and appear visually similar to our training dataset. Because this model does not employ distortion augmentation, even slight differences in lighting can degrade performance.
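To run the example command over several movies, a small Python wrapper such as the following could be used. This is a minimal sketch: the helper names `build_infer_command` and `run_inference` are hypothetical, and it assumes that `singularity` is on the PATH and that the image is invoked exactly as in the example usage, with one movie per call. The `--nv` flag asks Singularity to expose the host's NVIDIA GPU to the container. The sketch does not check that each movie is 480×480; that remains the caller's responsibility.

```python
import subprocess


def build_infer_command(video_path, image_path="GroomingInferRelease.sif"):
    """Build the argument list for one inference call, mirroring the example usage."""
    return ["singularity", "run", "--nv", str(image_path), str(video_path)]


def run_inference(video_paths, image_path="GroomingInferRelease.sif"):
    """Run the container once per movie, stopping at the first failure."""
    for video in video_paths:
        # check=True raises CalledProcessError if the container exits non-zero.
        subprocess.run(build_infer_command(video, image_path), check=True)
```

For example, `run_inference(["Input_Movie.avi"])` issues the same command as the example usage.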