Utility Functions
Utility functions used across the toolkit.
Public methods may be useful in producing new annotators, aggregators, and pipeline methods.
-
dvt.utils.setup_tensorflow() Set up options for TensorFlow.
These options should allow most users to run TensorFlow with either a GPU or a CPU. The function sets several options that keep Keras from taking up too much memory and suppresses a common warning about library conflicts that can occur on macOS. It also silences verbose warnings from TensorFlow that most users can safely ignore.
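The kind of configuration described above can be sketched with environment variables alone. This is an illustration, not the toolkit's implementation: the helper name configure_tensorflow_environment is hypothetical, and the choice of these three variables is an assumption about what "memory" and "warning" options might look like. The variables themselves are read by TensorFlow and by the OpenMP runtime, and should be set before TensorFlow is imported.

```python
import os


def configure_tensorflow_environment():
    """Hypothetical helper: set environment variables before importing TensorFlow."""
    # Hide INFO, WARNING, and ERROR log messages from TensorFlow's C++ backend.
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
    # Ask TensorFlow to grow GPU memory on demand instead of reserving it all.
    os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    # Let the process continue when more than one OpenMP runtime is loaded,
    # a library conflict that is commonly reported on macOS.
    os.environ["KMP_DUPLICATE_LIB_OK"] = "True"


configure_tensorflow_environment()
```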
-
dvt.utils.process_output_values(values) Take input and create a pandas DataFrame.
This function standardizes the output from annotators and aggregators in order to create the output stored in a DataExtraction object.
Parameters: values – Either a DataFrame object, None, or a dictionary object. If a dictionary, each key should map to a list or ndarray, and all of these must have the same (leading) dimension. It is also possible to pass a dictionary in which every value is a scalar.
Returns: A list of length one containing a single DataFrame object, or a value of None (returned if and only if the input is None).
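A minimal sketch of this kind of standardization follows. It is not the toolkit's code: the name standardize_output is hypothetical, and it assumes that a None input is returned as a bare None rather than wrapped in a list, which is one reading of the description above. Arrays with more than one dimension are split along their leading dimension so that each row holds one sub-array as an object.

```python
import numpy as np
import pandas as pd


def standardize_output(values):
    """Hypothetical re-implementation of the behaviour described above."""
    if values is None:
        return None  # assumption: None passes through unwrapped
    if isinstance(values, pd.DataFrame):
        return [values]
    if not isinstance(values, dict):
        raise TypeError("expected a DataFrame, None, or a dictionary")

    # A dictionary of scalars becomes a one-row DataFrame.
    if all(np.isscalar(val) for val in values.values()):
        return [pd.DataFrame(values, index=[0])]

    columns = {}
    for key, val in values.items():
        arr = np.asarray(val)
        # Multidimensional arrays are stored as one object per row.
        columns[key] = list(arr) if arr.ndim > 1 else val
    return [pd.DataFrame(columns)]


out = standardize_output({"frame": [0, 1], "embed": np.zeros((2, 3))})
print(out[0].shape)
```

Here the "embed" column holds two arrays of length three, one per row, while "frame" is an ordinary integer column.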
-
dvt.utils.pd_col_asarray(pdf, column) Takes a pandas DataFrame and a column name and returns a numpy array.
Pandas DataFrame columns cannot store numpy arrays with more than one dimension. This is a problem for objects such as image embeddings. We instead store these multidimensional arrays as an array of objects. This function reconstructs the original array.
Parameters: - pdf (DataFrame) – The pandas DataFrame from which to extract the column.
- column (str) – Name of the column to extract as a numpy array.
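The round trip described above can be sketched as follows. The function name column_as_array is hypothetical, and stacking the per-row objects with numpy.stack is an assumption about how the reconstruction could be done, not a statement of how the toolkit does it.

```python
import numpy as np
import pandas as pd


def column_as_array(pdf, column):
    """Hypothetical sketch: stack the per-row arrays of a column into one array."""
    return np.stack(pdf[column].to_list())


# A 2 x 3 embedding stored as one length-3 array per row (object dtype).
embedding = np.arange(6, dtype=float).reshape(2, 3)
pdf = pd.DataFrame({"frame": [0, 1], "embed": list(embedding)})

restored = column_as_array(pdf, "embed")
print(restored.shape)
```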
-
dvt.utils.sub_image(img, top, right, bottom, left, fct=1, output_shape=None) Take a subset of an input image and return a (resized) subimage.
Parameters: - img (numpy array) – Image stored as a three-dimensional array (rows, columns, and color channels).
- top (int) – Top coordinate of the new image.
- right (int) – Right coordinate of the new image.
- bottom (int) – Bottom coordinate of the new image.
- left (int) – Left coordinate of the new image.
- fct (float) – Factor by which to expand the bounding box. Defaults to 1, which uses the input coordinates as given.
- output_shape (tuple) – Size to scale the output image, in pixels. Set to None (default) to keep the native resolution.
Returns: A three-dimensional numpy array describing the new image.
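A sketch of the cropping logic using only numpy is given below. Several details are assumptions rather than documented behaviour: the box is expanded symmetrically about its center, it is clipped to the image borders, output_shape is read as (rows, columns), and a nearest-neighbor resampling stands in for whatever resizing routine the toolkit uses. The name crop_box is hypothetical.

```python
import numpy as np


def crop_box(img, top, right, bottom, left, fct=1, output_shape=None):
    """Hypothetical sketch: crop a (possibly expanded) box and optionally resize it."""
    # Describe the box by its center and half-sizes, scaling the latter by fct.
    center_row = (top + bottom) // 2
    center_col = (left + right) // 2
    half_height = int((bottom - top) / 2 * fct)
    half_width = int((right - left) / 2 * fct)

    # Clip the expanded box to the image borders.
    row0 = max(0, center_row - half_height)
    row1 = min(img.shape[0], center_row + half_height)
    col0 = max(0, center_col - half_width)
    col1 = min(img.shape[1], center_col + half_width)
    crop = img[row0:row1, col0:col1, :]

    if output_shape is None:
        return crop

    # Nearest-neighbor resampling; output_shape is assumed to be (rows, columns).
    out_rows, out_cols = output_shape
    rows = (np.arange(out_rows) * crop.shape[0]) // out_rows
    cols = (np.arange(out_cols) * crop.shape[1]) // out_cols
    return crop[rows][:, cols]


img = np.arange(10 * 10 * 3).reshape(10, 10, 3)
native = crop_box(img, top=2, right=8, bottom=6, left=4)
doubled = crop_box(img, top=2, right=8, bottom=6, left=4, fct=2)
small = crop_box(img, top=2, right=8, bottom=6, left=4, output_shape=(2, 2))
print(native.shape, doubled.shape, small.shape)
```

With fct=2 the box grows to twice its height and width before clipping, so a box near the edge of the image may come back smaller than the nominal expanded size.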