People Aggregators

Aggregate frame-level information to detect people in shots.

The aggregator functions here take face embeddings and try to predict the identity of the people within each shot.

class dvt.aggregate.people.PeopleAggregator(**kwargs)

Bases: dvt.abstract.Aggregator

Uses face embeddings to identify the people in the frame.

You will need to provide baseline faces for the aggregator to compare against. Note that the aggregator returns the nearest faces along with the distance to each face.
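The comparison against baseline faces can be pictured as a nearest-neighbour lookup. The following is a minimal sketch and not dvt's own implementation: the helper name nearest_face, the toy three-dimensional embeddings, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import numpy as np


def nearest_face(embedding, fprint, face_names):
    """Return (name, distance) for the reference face closest to `embedding`.

    Hypothetical helper for illustration; not part of dvt.
    """
    fprint = np.asarray(fprint, dtype=float)
    embedding = np.asarray(embedding, dtype=float)
    # One row per reference face, one column per embedding dimension.
    if fprint.shape[0] != len(face_names):
        raise ValueError("fprint needs one row per entry in face_names")
    if fprint.shape[1] != embedding.shape[0]:
        raise ValueError("embedding length must match the columns of fprint")
    # Euclidean distance (an assumed metric) to every reference row.
    dists = np.linalg.norm(fprint - embedding, axis=1)
    idx = int(np.argmin(dists))
    return face_names[idx], float(dists[idx])


# Toy reference set; real face embeddings have many more columns.
fprint = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
face_names = ["alice", "bob"]
name, dist = nearest_face([0.9, 0.1, 0.0], fprint, face_names)
```

The shape checks mirror the requirement that each row of the reference array corresponds to one name and that its column count matches the embedding size.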

face_names

List of names associated with each face in the set of predefined faces

Type: list
fprint

A numpy array giving the embedding vectors for the predefined faces. Each row should correspond with one face id and the number of columns should match the number of columns in your embedding.

Type: numpy array
name

A description of the aggregator. Used as a key in the output data.

Type: str
aggregate(ldframe, **kwargs)

Aggregate faces.

Parameters: ldframe (dict) – A dictionary of DictFrames from a FrameAnnotator. Must contain an entry with the key ‘face’, which is used in the annotation.
Returns: A dictionary frame giving the detected people, with one row per detected face.
name = 'people'
dvt.aggregate.people.make_fprint_from_images(dinput)

Create face fingerprints from a directory of faces.

This function takes as input a directory containing image files, with each image given the name of a person or character. The function returns the ‘fingerprints’ (stereotypical embeddings) of the faces in a format that can be passed to the PeopleAggregator.

Parameters: dinput – Path to a directory of image files, with each file named after the person or character it shows.
Returns: A tuple giving the numpy array of embedding vectors and a list of the names of the people in the images.
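To illustrate the directory convention described above, here is a small sketch of how names might be derived from image file names. It is not the library's code: the helper names_from_image_dir, the set of accepted extensions, and the sample file names are assumptions, and the step that computes embeddings is omitted.

```python
from pathlib import Path
import tempfile


def names_from_image_dir(dinput):
    """List (name, path) pairs, using each file stem as the person's name.

    Hypothetical helper for illustration; not part of dvt.
    """
    # Assumed set of image extensions; adjust to the files you actually use.
    exts = {".jpg", ".jpeg", ".png"}
    paths = sorted(p for p in Path(dinput).iterdir() if p.suffix.lower() in exts)
    return [(p.stem, p) for p in paths]


# Build a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for fname in ("bob.jpg", "alice.png", "notes.txt"):
        (Path(tmp) / fname).touch()
    pairs = names_from_image_dir(tmp)
    names = [name for name, _ in pairs]
```

Sorting the paths keeps the order of the names stable, which matters because each name has to line up with the matching row of the embedding array.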