Object Annotations

Annotator to detect objects.

Detecting objects in an image is an import task for the analysis of both still and moving images. This modules provides the generic annotator ObjectAnnotator to which a specific object detector can be inserted. The module also supplies a direct wrapper to the RetinaNet algorithm, which includes 80 classes of common objects.

class dvt.annotate.obj.ObjectAnnotator(**kwargs)[source]

Bases: dvt.abstract.FrameAnnotator

Annotator for detecting objects in frames or images.

The annotator will return a list with one DictList item for every frame with a detected object.

detector

An object with a method called detect that takes an image and returns a set of detect objects.

freq

How often to perform the embedding. For example, setting the frequency to 2 will embed every other frame in the batch.

Type:int
frames

An optional list of frames to process. This should be a list of integers or a 1D numpy array of integers. If set set to something other than None, the freq input is ignored.

Type:array of ints
name

A description of the aggregator. Used as a key in the output data.

Type:str
annotate(batch)[source]

Annotate the batch of frames with the object detector.

Parameters:batch (FrameBatch) – A batch of images to annotate.
Returns:A list of dictionaries containing the video name, frame, and any additional information (i.e., bounding boxes or object names) supplied by the detector.
name = 'obj'
class dvt.annotate.obj.ObjectDetectRetinaNet(cutoff=0.5)[source]

Bases: object

Detect objects using RetinaNet.

An object detector that locates 80 object types.

cutoff

A cutoff value for which objects to include in the final output. Set to zero (default) to include all object. The default is 0.5.

Type:float
detect(img)[source]

Detect objects in an image.

Parameters:img (numpy array) – A single image stored as a three-dimensional numpy array.
Returns:A list of dictionaries where each dictionary represents a detected object. Keys include the bounding box (top, left, bottom, right), a confidence score, and the class of the object.