Core Objects

Core objects for working with the distant viewing toolkit.

The objects VisualInput, FrameAnnotator, and Aggregator are abstract classes. They correspond to methods for extracting visual inputs from digitized files, methods for extracting metadata from visual images, and methods for aggregating the data across a collection. The toolkit provides many ready-made implementations of these classes in order to solve common tasks.

A DataExtraction class is constructed for each input object. Annotators and aggregators can be iteratively passed to the object to extract metadata. The FrameBatch object serves as the primary internal structure for storing visual information. Users who construct their own FrameAnnotator subclasses will need to interface with these objects and their methods for grabbing visual data inputs. See the example in DataExtraction for the basic usage of the classes.
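
Users writing their own annotators mainly implement a per-batch computation. As a stand-alone illustration (independent of the toolkit: the class name, the returned column names, and the nested-list pixel layout are all hypothetical), the following sketch shows the kind of annotate method a FrameAnnotator subclass provides, assuming the batch argument exposes the get_batch and get_frame_names methods documented below for FrameBatch.

```python
class MeanBrightnessAnnotator:
    """Toy annotator: mean pixel value for each frame in the batch."""

    name = "brightness"  # key under which results appear in the data

    def annotate(self, batch):
        out = {"frame": [], "brightness": []}
        # pair each frame name with its pixel data from the current batch
        for fname, frame in zip(batch.get_frame_names(), batch.get_batch()):
            # flatten the (height, width, channel) nested lists into values
            values = [c for row in frame for pixel in row for c in pixel]
            out["frame"].append(fname)
            out["brightness"].append(sum(values) / len(values))
        return out
```

A real subclass would derive from dvt's FrameAnnotator base class; the output dictionary of equal-length lists is what gets collapsed into one DataFrame per annotator.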

class dvt.core.DataExtraction(vinput, ainput=None, sinput=None)[source]

Bases: object

The core class of the toolkit. Used to apply algorithms to visual data.

Each instance of a data extraction is tied to a particular input object. Collections of annotators or individual aggregators can be passed to the relevant methods to extract metadata from the associated input.

vinput

The input object associated with the dataset.

Type:VisualInput
ainput

Path to audio input as wav file. Optional.

Type:str
sinput

Path to subtitle input as srt file. Optional.

Type:str
data

Extracted metadata.

Type:OrderedDict

Example

Assuming we have an input named “input.mp4”, the following example shows a straightforward usage of DataExtraction. We create an extraction object, pass through a difference annotator, and then aggregate using the cut detector. Finally, the output from the cut aggregator is obtained by calling the get_data method and grabbing the relevant key (“cut”).

>>> from dvt.core import DataExtraction, FrameInput
>>> from dvt.annotate.diff import DiffAnnotator
>>> from dvt.aggregate.cut import CutAggregator
>>> dextra = DataExtraction(FrameInput(input_path="input.mp4"))
>>> dextra.run_annotators([DiffAnnotator(quantiles=[40])])
>>> dextra.run_aggregator(CutAggregator(cut_vals={'q40': 3}))
>>> dextra.get_data()['cut']

Using the input file in the Distant Viewing Toolkit test directory yields the following output:

>>> dextra.get_data()['cut']
   frame_start  frame_end
0            0         74
1           75        154
2          155        299
3          300        511
get_data()[source]

Get dataset from the object.

Returns:An ordered dictionary where each key corresponds to an annotator or aggregator and the values are all pandas DataFrame objects.
get_json(path=None, exclude_set=None, exclude_key=None)[source]

Get dataset as a JSON object.

Parameters:
  • path – Location to store the output. If set to None, return as a string.
  • exclude_set – Set of dataset names to ignore when creating the output. None, the default, includes all data in the output.
  • exclude_key – Set of column names to ignore when creating the output. None, the default, includes all keys in the output.
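
A minimal stand-alone sketch of the assumed exclusion semantics (drop whole datasets named in exclude_set, drop columns named in exclude_key), using plain dictionaries of columns in place of DataFrame objects; the helper name to_json is hypothetical:

```python
import json
from collections import OrderedDict

def to_json(data, exclude_set=None, exclude_key=None):
    """Serialize an ordered dict of {dataset: {column: values}} tables,
    skipping excluded datasets and excluded columns."""
    exclude_set = exclude_set or set()
    exclude_key = exclude_key or set()
    out = OrderedDict()
    for name, table in data.items():
        if name in exclude_set:
            continue  # drop the whole dataset
        out[name] = {k: v for k, v in table.items() if k not in exclude_key}
    return json.dumps(out)
```
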
run_aggregator(aggregator)[source]

Run an aggregator over the extracted annotations.

Parameters:aggregator (Aggregator) – Aggregator object used for processing the input data.
run_annotators(annotators, max_batch=None, msg='Progress: ')[source]

Run a collection of annotators over the input material.

Batches of inputs are grabbed from vinput and passed to the annotators. Output is collapsed into one DataFrame per annotator, and stored as keys in the data attribute. An additional key is included (“meta”) that contains metadata about the collection.

Parameters:
  • annotators (list) – A list of annotator objects.
  • max_batch (int) – The maximum number of batches to process. Useful for testing and debugging. Set to None (default) to process all available frames.
  • msg (str) – Message to display in the progress bar. Set to None to suppress the message.
run_audio_annotator()[source]

Run the audio annotator.

After running this method, two new annotations are added: ‘audio’ and ‘audiometa’. They contain all of the sound data as DataFrame objects.

run_subtitle_annotator()[source]

Run the subtitle annotator.

After running this method, a new annotation called ‘subtitle’ will be added to the DataExtraction object. Requires that the attribute sinput is set to a valid path.

class dvt.core.FrameBatch(**kwargs)[source]

Bases: object

A collection of frames and associated metadata.

The batch contains an array of size (bsize * 2, width, height, 3). At the start and end of the video file, the array is padded with zeros (an all black frame). The batch includes twice as many frames as given in the batch size, but an annotator should only return results from the first half of the data (the “batch”). The other data is included for annotators that need to look ahead of the current batch, such as the cut detectors.
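
A short stand-alone sketch of this windowing scheme, using integers as stand-in frames and 0 as the all-black padding frame (only end-of-file padding is shown here; the function name make_batches is hypothetical):

```python
def make_batches(frames, bsize):
    """Split frames into look-ahead windows: each window holds 2 * bsize
    entries, the current batch followed by the next bsize frames, with
    zero (black) padding past the end of the input."""
    black = 0  # stand-in for an all-black frame
    padded = list(frames) + [black] * bsize
    batches = []
    for start in range(0, len(frames), bsize):
        window = padded[start:start + 2 * bsize]
        # pad the final window out to the full 2 * bsize length
        window += [black] * (2 * bsize - len(window))
        batches.append(window)
    return batches
```

In this scheme, an annotator calling get_batch would see only the first bsize entries of each window, while get_frames would return all 2 * bsize entries, including the look-ahead half.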

img

A four-dimensional array containing pixels from the next 2*bsize images.

Type:np.array
start

Time code at the start of the current batch.

Type:float
end

Time code at the end of the current batch.

Type:float
continue_read

Indicates whether there are more frames to read from the input.

Type:bool
fnames

Names of frames in the batch.

Type:list
bnum

The batch number.

Type:int
bsize

Number of frames in a batch.

Type:int
get_batch()[source]

Return image data for just the current batch.

Use this method unless you have a specific need to look ahead at new values in the data. Images are given in RGB space.

Returns:A four-dimensional array containing pixels from the current batch of images.
get_frame_names()[source]

Return frame names for the current batch of data.

Returns:A list of names of length equal to the batch size.
get_frames()[source]

Return the entire image dataset for the batch.

Use this method if you need to look ahead at the following batch for an annotator to work. Images are given in RGB space.

Returns:A four-dimensional array containing pixels from the current and next batches of data.