Face Annotations
Annotators to detect and identify faces.
Identifying individuals in an image generally requires two distinct steps: first, detecting bounding boxes for faces in the image, and second, identifying the faces themselves. Currently, the most common method for the second step is to project a detected face into a high-dimensional space designed so that different images of the same person are close together and images of different people are farther apart. This module is built around this paradigm, allowing custom detectors and embeddings to be plugged into the model.
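The second step can be illustrated with a minimal sketch: given embeddings in such a space, a distance comparison decides whether two face images show the same person. The vectors below are synthetic stand-ins, not the output of a real embedding model.

```python
import numpy as np

def cosine_distance(u, v):
    """Distance between two face embeddings; small values suggest the same person."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Hypothetical 128-dimensional embeddings of three face crops.
rng = np.random.default_rng(0)
anchor = rng.normal(size=128)
same_person = anchor + rng.normal(scale=0.05, size=128)  # near the anchor
other_person = rng.normal(size=128)                      # unrelated vector

print(cosine_distance(anchor, same_person) < cosine_distance(anchor, other_person))
```

A real system would threshold this distance (or use nearest-neighbour search over a gallery of known faces) to assign identities.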
class dvt.annotate.face.FaceAnnotator(**kwargs)
Bases: dvt.abstract.FrameAnnotator

Annotator for detecting faces and embedding them as a face vector.

The annotator returns a list with one DictList item for every frame with a detected face. If an embedding is supplied, the DictList items will contain a numpy array with the face embeddings.
detector
An object with a method called detect that takes an image and returns a set of detected faces. Can be set to None (default) as a pass-through option for testing.
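The detector contract can be satisfied by any object exposing a detect method. Below is a hedged sketch of a trivial pass-through detector (the class name is hypothetical, not part of the library) that "detects" a single face covering the whole frame, useful for testing a pipeline:

```python
import numpy as np

class TrivialFaceDetector:
    """A hypothetical stand-in detector: its detect method takes an image and
    returns detected faces in the dictionary format described in these docs
    (top, left, bottom, right, confidence)."""

    def detect(self, img):
        h, w = img.shape[0], img.shape[1]
        return [{"top": 0, "left": 0, "bottom": h, "right": w, "confidence": 1.0}]

img = np.zeros((480, 640, 3), dtype=np.uint8)   # a blank 640x480 test frame
faces = TrivialFaceDetector().detect(img)
print(faces[0]["bottom"], faces[0]["right"])     # 480 640
```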
embedding
An object with a method embed that takes an image along with a set of bounding boxes and returns embeddings of the faces as a numpy array. Set to None (default) to only run the face detector.
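The embedding contract can likewise be sketched with a toy class (hypothetical, not part of the library): an embed method taking an image and a set of bounding boxes, and returning one vector per face as rows of a numpy array. Real face embeddings are high-dimensional; the 3-dimensional mean-colour vector here is only a stand-in.

```python
import numpy as np

class MeanColorEmbedding:
    """A toy 'embedding' matching the described interface. Each face crop is
    reduced to its mean colour, giving a (n_faces, n_channels) array."""

    def embed(self, img, faces):
        vecs = []
        for f in faces:
            crop = img[f["top"]:f["bottom"], f["left"]:f["right"]]
            vecs.append(crop.reshape(-1, img.shape[2]).mean(axis=0))
        return np.stack(vecs)

img = np.full((50, 50, 3), 128, dtype=np.uint8)  # uniform grey test image
boxes = [{"top": 0, "left": 0, "bottom": 50, "right": 50}]
emb = MeanColorEmbedding().embed(img, boxes)
print(emb.shape)  # (1, 3)
```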
freq
How often to perform the embedding. For example, setting the frequency to 2 will embed every other frame in the batch.
Type: int
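A small sketch of the freq behaviour described above: with freq set to 2, every other frame in the batch is selected (assuming, for illustration, that selection starts at the first frame of the batch).

```python
# Frame indices within one batch of 8 frames.
freq = 2
batch_frames = list(range(8))
selected = batch_frames[::freq]   # every freq-th frame is embedded
print(selected)  # [0, 2, 4, 6]
```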
frames
An optional list of frames to process. This should be a list of integers or a 1D numpy array of integers. If set to something other than None, the freq input is ignored.
Type: array of ints
name
A description of the annotator. Used as a key in the output data.
Type: str
annotate(batch)
Annotate the batch of frames with the face annotator.
Parameters: batch (FrameBatch) – A batch of images to annotate.
Returns: A list of dictionaries containing the video name, frame, and bounding box coordinates (top, bottom, left, and right). If an embedding is included, the result will also contain a numpy array of the embedding for each face.
name = 'face'
class dvt.annotate.face.FaceDetectMtcnn(cutoff=0)
Bases: object

Detect faces using the Multi-task Cascaded CNN (MTCNN) model.
cutoff
A confidence cutoff determining which faces to include in the final output; detections scoring below this value are excluded. Set to zero (default) to include all faces.
Type: float
detect(img)
Detect faces in an image.
Parameters: img (numpy array) – A single image stored as a three-dimensional numpy array.
Returns: A list of dictionaries where each dictionary represents a detected face. Keys include the bounding box (top, left, bottom, right) as well as a confidence score.
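Consuming this return format typically means filtering by the confidence score and cropping each bounding box. The helper below is a hedged sketch (filter_and_crop is not a library function) that mirrors the cutoff attribute described above:

```python
import numpy as np

def filter_and_crop(img, faces, cutoff=0.9):
    """Keep detections at or above the confidence cutoff and crop each face
    using the (top, left, bottom, right) keys described in these docs."""
    crops = []
    for face in faces:
        if face["confidence"] >= cutoff:
            crops.append(img[face["top"]:face["bottom"], face["left"]:face["right"]])
    return crops

img = np.zeros((100, 100, 3), dtype=np.uint8)
faces = [
    {"top": 10, "left": 20, "bottom": 50, "right": 60, "confidence": 0.99},
    {"top": 0, "left": 0, "bottom": 5, "right": 5, "confidence": 0.30},
]
crops = filter_and_crop(img, faces, cutoff=0.9)
print(len(crops), crops[0].shape)  # 1 (40, 40, 3)
```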