hyperstream.workflow package¶
Submodules¶
hyperstream.workflow.factor module¶
-
class
hyperstream.workflow.factor.Factor(tool, source_nodes, sink_node, alignment_node, plates)[source]¶ Bases:
hyperstream.workflow.factor.FactorBase
A factor in the graph. This defines the element of computation: the tool along with the source and sink nodes.
-
execute(time_interval)[source]¶ Execute the factor over the given time interval
Parameters: time_interval – The time interval to execute the factor over
-
get_alignment_stream(plate=None, plate_value=None)[source]¶ Gets the alignment stream for a particular plate value
Parameters: - plate – The plate on which the alignment node lives
- plate_value – The plate value to select the stream from the node
Returns: The alignment stream
-
get_sources(plate, plate_value, sources=None)[source]¶ Gets the source streams for a given plate value on a plate. The result also includes source streams that are valid for the parent plates of this plate, with the appropriate meta-data for the parent plate.
Parameters: - plate (Plate) – The plate being operated on
- plate_value (tuple) – The specific plate value of interest
- sources (list[Stream] | None) – The currently found sources (for recursion)
Returns: The appropriate source streams
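The recursive lookup described for get_sources can be sketched as follows. This is only an illustration under stated assumptions, not HyperStream's implementation: SimplePlate, collect_sources and the dictionary-based stream lookup are hypothetical, and a parent's plate value is assumed to be the child's value with its last (key, value) pair removed.

```python
class SimplePlate:
    """Hypothetical stand-in for a plate: an id plus an optional parent."""

    def __init__(self, plate_id, parent=None):
        self.plate_id = plate_id
        self.parent = parent


def collect_sources(streams, plate, plate_value, sources=None):
    """Collect streams for plate_value, then recurse into the parent plates.

    streams maps (plate_id, plate_value) to a list of stream names
    (an assumed layout chosen for this sketch).
    """
    if sources is None:
        sources = []
    sources.extend(streams.get((plate.plate_id, plate_value), []))
    if plate.parent is not None:
        # Assumption: dropping the last pair gives the parent's plate value.
        collect_sources(streams, plate.parent, plate_value[:-1], sources)
    return sources


house = SimplePlate("H")
location = SimplePlate("H.L", parent=house)
streams = {
    ("H", (("house", "1"),)): ["house_stream"],
    ("H.L", (("house", "1"), ("location", "kitchen"))): ["kitchen_stream"],
}
found = collect_sources(
    streams, location, (("house", "1"), ("location", "kitchen"))
)
print(found)
```

The accumulator argument mirrors the sources parameter, which the documentation says exists for recursion.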
-
-
class
hyperstream.workflow.factor.FactorBase(tool)[source]¶ Bases:
hyperstream.utils.utils.Printable
-
factor_id¶
-
-
class
hyperstream.workflow.factor.MultiOutputFactor(tool, source_node, splitting_node, sink_node, input_plate, output_plates)[source]¶ Bases:
hyperstream.workflow.factor.FactorBase
A multi-output factor in the graph. As with a factor, this links source nodes to sink nodes. However, in this case the source node is being split onto multiple plates. Note there is no concept of an alignment node here.
hyperstream.workflow.meta_data_manager module¶
-
class
hyperstream.workflow.meta_data_manager.MetaDataManager[source]¶ Bases:
hyperstream.utils.utils.Printable
-
contains(identifier)[source]¶ Determines if the meta data with the given identifier is in the database
Parameters: identifier – The identifier Returns: Whether the identifier is present
-
delete(identifier)[source]¶ Delete the meta data with the given identifier from the database
Parameters: identifier – The identifier Returns: None
-
global_meta_data¶ Get the global meta data, which will be stored in a tree structure
Returns: The global meta data
-
insert(tag, identifier, parent, data)[source]¶ Insert the given meta data into the database
Parameters: - tag – The tag (equates to meta_data_id)
- identifier – The identifier (a combination of the meta_data_id and the plate value)
- parent – The parent plate identifier
- data – The data (plate value)
Returns: None
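The relationship between tag, identifier, parent and data can be illustrated with a small in-memory stand-in. InMemoryMetaData is hypothetical and does not talk to a database; the identifier format ("house_1") and the rule that delete removes only the named entry are assumptions made for this sketch.

```python
class InMemoryMetaData:
    """Hypothetical in-memory analogue of a tree-structured meta data store."""

    def __init__(self):
        # Every entry hangs off a root node, giving a tree structure.
        self._nodes = {"root": {"tag": "root", "parent": None, "data": None}}

    def contains(self, identifier):
        return identifier in self._nodes

    def insert(self, tag, identifier, parent, data):
        if identifier in self._nodes:
            raise KeyError("identifier already present: " + identifier)
        if parent not in self._nodes:
            raise KeyError("unknown parent: " + parent)
        self._nodes[identifier] = {"tag": tag, "parent": parent, "data": data}

    def delete(self, identifier):
        # Assumption: only the named entry is removed, not its descendants.
        del self._nodes[identifier]


meta = InMemoryMetaData()
meta.insert(tag="house", identifier="house_1", parent="root", data="1")
meta.insert(
    tag="location",
    identifier="house_1.location_kitchen",
    parent="house_1",
    data="kitchen",
)
print(meta.contains("house_1.location_kitchen"))
```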
-
hyperstream.workflow.node module¶
Node module.
Nodes are a collection of streams defined by shared meta-data keys (plates), and are connected in the computational graph by factors.
-
class
hyperstream.workflow.node.Node(channel, node_id, streams, plates)[source]¶ Bases:
hyperstream.utils.utils.Printable
A node in the graph. This consists of a set of streams defined over a set of plates.
-
difference(other)[source]¶ Summarise the differences between this node and the other node.
Parameters: other (Node) – The other node Returns: A tuple containing the diff, the counts of the diff, and whether this plate is a sub-plate of the other
-
factor¶
-
intersection(meta)[source]¶ Get the intersection between the meta data given and the meta data contained within the plates. Since all of the streams have the same meta data keys (but differing values) we only need to consider the first stream.
Parameters: meta (dict) – The meta data to compare Returns: A stream id with the intersection between this node’s meta data and the given meta data
Return type: StreamId
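The idea of comparing only the keys of the first stream can be sketched with plain tuples and dicts. The function meta_intersection is hypothetical; the real method returns a StreamId, whereas this sketch returns a sorted tuple of (key, value) pairs to stay self-contained.

```python
def meta_intersection(first_stream_meta, meta):
    """Keep the pairs of meta whose key also appears in the stream's meta data."""
    keys = {key for key, _ in first_stream_meta}
    return tuple(sorted((k, v) for k, v in meta.items() if k in keys))


# All streams in a node share the same keys, so one stream is representative.
first_stream_meta = (("house", "1"), ("location", "kitchen"))
common = meta_intersection(first_stream_meta, {"house": "1", "wearable": "A"})
print(common)
```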
-
is_leaf¶
-
plate_ids¶
-
plate_values¶
-
print_head(parent_plate_value, plate_values, interval, n=10, print_func=<function info>)[source]¶ Print the first n values from the streams in the given time interval. The parent plate value is the value of the parent plate, and then the plate values are the values for the plate that are to be printed. e.g. print_head()
Parameters: - parent_plate_value – The (fixed) parent plate value
- plate_values – The plate values over which to loop
- interval – The time interval
- n – The maximum number of elements to print
- print_func – The function used for printing (e.g. logging.info() or print())
Returns: None
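The roles of n and print_func can be shown with a standalone helper. The name head is hypothetical and the helper works on any iterable rather than on HyperStream streams; it only illustrates truncating output to n items and routing it through a caller-supplied function.

```python
from itertools import islice


def head(items, n=10, print_func=print):
    """Send at most the first n items to print_func."""
    for item in islice(items, n):
        print_func(item)


# Any callable taking one argument works, e.g. logging.info or list.append.
lines = []
head(range(100), n=3, print_func=lines.append)
print(lines)
```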
-
streams¶
-
hyperstream.workflow.plate module¶
Plate definition.
-
class
hyperstream.workflow.plate.Plate(plate_id, meta_data_id, values, parent_plate=None)[source]¶ Bases:
hyperstream.utils.utils.Printable
A plate in the execution graph. This can be thought of as a “for loop” over the streams in a node.
-
ancestor_meta_data_ids¶ The meta data ids of all ancestor plates in the tree
-
ancestor_plate_ids¶ The plate ids of all ancestor plates in the tree
-
ancestor_plates¶ All ancestor plates in the tree
-
static
combine_values(parent_plate_value, plate_value)[source]¶ Combine the plate value(s) with the parent plate value(s)
Parameters: - parent_plate_value – The parent plate value(s)
- plate_value – The plate value(s)
Returns: The combined plate values
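One plausible reading of combining plate values is tuple concatenation. The sketch below assumes plate values are tuples of (key, value) string pairs and that a single pair may be passed on its own; the function combine is hypothetical and may differ from what combine_values does.

```python
def combine(parent_plate_value, plate_value):
    """Append one pair, or several pairs, to the parent's pairs."""
    if parent_plate_value is None:
        parent_plate_value = ()
    if plate_value and isinstance(plate_value[0], str):
        # A single ("key", "value") pair: wrap it so it can be concatenated.
        plate_value = (plate_value,)
    return tuple(parent_plate_value) + tuple(plate_value)


single = combine((("house", "1"),), ("location", "kitchen"))
several = combine((("house", "1"),), (("location", "kitchen"), ("wearable", "A")))
print(single)
print(several)
```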
-
static
get_overlapping_values(plates)[source]¶ Need to find where in the tree the two plates intersect, e.g.
We are given as input plates D, E, whose positions in the tree are:
    root -> A -> B -> C -> D
    root -> A -> B -> E
The results should then be the cartesian product between C, D, E looped over A and B
If there’s a shared plate in the hierarchy, we need to join on this shared plate, e.g.:
    [self.plates[p].values for p in plate_ids][0] =
        [(('house', '1'), ('location', 'hallway'), ('wearable', 'A')),
         (('house', '1'), ('location', 'kitchen'), ('wearable', 'A'))]
    [self.plates[p].values for p in plate_ids][1] =
        [(('house', '1'), ('scripted', '15')),
         (('house', '1'), ('scripted', '13'))]
Result should be one stream for each of:
    [(('house', '1'), ('location', 'hallway'), ('wearable', 'A'), ('scripted', '15')),
     (('house', '1'), ('location', 'hallway'), ('wearable', 'A'), ('scripted', '13')),
     (('house', '1'), ('location', 'kitchen'), ('wearable', 'A'), ('scripted', '15')),
     (('house', '1'), ('location', 'kitchen'), ('wearable', 'A'), ('scripted', '13'))]
Parameters: plates (list[Plate] | list[Plate]) – The input plates Returns: The plate values
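The join on a shared plate described above can be reproduced with a small standalone function. join_plate_values is hypothetical and handles exactly two lists of plate values; it is a sketch of the join in the example, not the library's algorithm.

```python
from itertools import product


def join_plate_values(first, second):
    """Pair up values that agree on every shared key, merging their pairs."""
    joined = []
    for a, b in product(first, second):
        a_dict = dict(a)
        # Keep the combination only if shared keys carry the same value.
        if all(a_dict.get(key, value) == value for key, value in b):
            extra = tuple(pair for pair in b if pair[0] not in a_dict)
            joined.append(a + extra)
    return joined


first = [
    (("house", "1"), ("location", "hallway"), ("wearable", "A")),
    (("house", "1"), ("location", "kitchen"), ("wearable", "A")),
]
second = [
    (("house", "1"), ("scripted", "15")),
    (("house", "1"), ("scripted", "13")),
]
result = join_plate_values(first, second)
for value in result:
    print(value)
```

With the input from the example, every combination shares house 1, so all four combinations survive. A value from a different house in either list would be dropped by the shared-key check.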
-
identifier¶
-
is_ancestor(other)[source]¶ Determines if this plate is an ancestor plate of the other (i.e. other is contained in the ancestors)
Parameters: other – The other plate Returns: True if this plate is an ancestor of the other plate
-
is_child(other)[source]¶ Determines if this plate is a child plate of the other
Parameters: other (Plate) – The other plate Returns: True if this plate is a child of the other plate
-
is_descendant(other)[source]¶ Determines if this plate is a descendant plate of the other (i.e. self is contained in the other’s ancestors)
Parameters: other (Plate) – The other plate Returns: True if this plate is a descendant of the other plate
-
is_parent(other)[source]¶ Determines if this plate is a parent plate of the other
Parameters: other (Plate) – The other plate Returns: True if this plate is a parent of the other plate
-
is_root¶ True if this plate is at the root of the tree, i.e. has no parent plate
-
is_sub_plate(other)[source]¶ Determines if this plate is a sub-plate of another plate - i.e. has the same meta data but a restricted set of values
Parameters: other – The other plate Returns: True if this plate is a sub-plate of the other plate
-
is_super_plate(other)[source]¶ Determines if this plate is a super-plate of another plate - i.e. has the same meta data but a larger set of values
Parameters: other – The other plate Returns: True if this plate is a super-plate of the other plate
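The parent/child and sub-plate/super-plate predicates can be sketched on a minimal tree of plates. TreePlate is hypothetical; it stores values as a set of plain strings, and it treats equal value sets as both sub- and super-plates, which is an assumption rather than documented behaviour.

```python
class TreePlate:
    """Hypothetical minimal plate: id, meta data id, values, optional parent."""

    def __init__(self, plate_id, meta_data_id, values, parent=None):
        self.plate_id = plate_id
        self.meta_data_id = meta_data_id
        self.values = set(values)
        self.parent = parent

    @property
    def is_root(self):
        return self.parent is None

    @property
    def ancestor_plates(self):
        # Walk up the tree, listing the outermost ancestor first.
        found, plate = [], self.parent
        while plate is not None:
            found.insert(0, plate)
            plate = plate.parent
        return found

    def is_child(self, other):
        return self.parent is other

    def is_parent(self, other):
        return other.parent is self

    def is_sub_plate(self, other):
        # Same meta data, restricted set of values.
        return self.meta_data_id == other.meta_data_id and self.values <= other.values

    def is_super_plate(self, other):
        # Same meta data, larger set of values.
        return self.meta_data_id == other.meta_data_id and self.values >= other.values


house = TreePlate("H", "house", ["1", "2"])
house_1 = TreePlate("H1", "house", ["1"])
location = TreePlate("H.L", "location", ["kitchen", "hallway"], parent=house)
print(location.is_child(house), house_1.is_sub_plate(house))
```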
-
parent¶
-
values¶
-
hyperstream.workflow.plate_manager module¶
-
class
hyperstream.workflow.plate_manager.PlateManager[source]¶ Bases:
hyperstream.utils.utils.Printable
Plate manager. Manages the mapping between plates defined in the database with the global meta data definition.
-
add_plate(plate_definition)[source]¶ Add a plate using the plate definition
Parameters: plate_definition (PlateDefinitionModel) – The plate definition Returns: None
-
create_plate(plate_id, description, meta_data_id, values, complement, parent_plate)[source]¶ Create a new plate, and commit it to the database
Parameters: - plate_id (str | unicode) – The plate id - required to be unique
- description – A human readable description
- meta_data_id – The meta data id, which should correspond to the tag in the global meta data
- values (list | tuple) – Either a list of string values, or the empty list (for use with complement)
- complement (bool) – If complement is true, then the complement of the values list will be used when getting values from the global meta data
- parent_plate – The parent plate identifier
Returns: The newly created plate
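The interaction between values and complement can be illustrated in isolation. select_values is a hypothetical helper operating on a flat list of available values; how the real plate manager reads values from the global meta data is not shown here.

```python
def select_values(available, values, complement):
    """Pick plate values from those available in the global meta data."""
    chosen = set(values)
    if complement:
        # Complement: keep everything that is NOT listed in values.
        return [v for v in available if v not in chosen]
    return [v for v in available if v in chosen]


available = ["1", "2", "3"]
everything = select_values(available, [], complement=True)
all_but_two = select_values(available, ["2"], complement=True)
only_two = select_values(available, ["2"], complement=False)
print(everything, all_but_two, only_two)
```

Under this reading, the empty list combined with complement selects every available value, which matches the note that the empty list is intended for use with complement.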
-
static
get_parent_data(tree, node, current)[source]¶ Recurse up the tree getting parent data
Parameters: - tree – The tree
- node – The current node
- current – The current list
Returns: The hierarchical dictionary
-
hyperstream.workflow.workflow module¶
Workflow and WorkflowManager definitions.
-
class
hyperstream.workflow.workflow.Workflow(channels, plate_manager, workflow_id, name, description, owner, online=False)[source]¶ Bases:
hyperstream.utils.utils.Printable
Workflow. This defines the graph of operations through “nodes” and “factors”.
-
static
check_plate_compatibility(tool, source_plate, sink_plate)[source]¶ Checks whether the source and sink plate are compatible given the tool
Parameters: Returns: Either an error, or None
Return type: None | str
-
create_factor(tool, sources, sink, alignment_node=None)[source]¶ Creates a factor. Instantiates a single tool for all of the plates, and connects the source and sink nodes with that tool.
Note that the tool parameters are currently fixed over a plate. For parameters that vary over a plate, an extra input stream should be used.
Parameters: - alignment_node (Node | None) –
- tool (Tool | dict) – The tool to use. This is either an instantiated Tool object or a dict with “name” and “parameters”
- sources (list[Node] | tuple[Node] | None) – The source nodes
- sink (Node) – The sink node
Returns: The factor object
Return type:
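Accepting a tool either as an instance or as a dict with "name" and "parameters" can be sketched as a small normalisation step. Scale, TOOL_REGISTRY and resolve_tool are all hypothetical names invented for this illustration; how HyperStream actually looks up tools by name is not described in this page.

```python
class Scale:
    """Hypothetical tool that multiplies its input by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor

    def apply(self, value):
        return value * self.factor


# Hypothetical registry mapping tool names to tool classes.
TOOL_REGISTRY = {"scale": Scale}


def resolve_tool(tool):
    """Return a tool instance from an instance or a name/parameters dict."""
    if isinstance(tool, dict):
        tool_class = TOOL_REGISTRY[tool["name"]]
        return tool_class(**tool.get("parameters", {}))
    return tool


from_dict = resolve_tool({"name": "scale", "parameters": {"factor": 3}})
from_instance = resolve_tool(Scale(factor=2))
print(from_dict.apply(5), from_instance.apply(5))
```

Because one instance is built per factor, its parameters are the same for every plate value, in line with the note that tool parameters are fixed over a plate.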
-
create_multi_output_factor(tool, source, splitting_node, sink)[source]¶ Creates a multi-output factor. This takes a single node and applies a MultiOutputTool to create multiple nodes on a new plate. Instantiates a single tool for all of the input plate values, and connects the source and sink nodes with that tool.
Note that the tool parameters are currently fixed over a plate. For parameters that vary over a plate, an extra input stream should be used.
Parameters: Returns: The factor object
Return type:
-
create_node(stream_name, channel, plate_ids)[source]¶ Create a node in the graph. Note: assumes that the streams already exist
Parameters: - stream_name – The name of the stream
- channel – The channel where this stream lives
- plate_ids – The plate ids. The stream meta-data will be auto-generated from these
Returns: The streams associated with this node
-
create_node_creation_factor(tool, source, output_plate, plate_manager)[source]¶ Creates a factor that itself creates an output node, and ensures that the plate for the output node exists along with all relevant meta-data
Parameters: - tool – The tool
- source – The source node
- output_plate (dict) – The details of the plate that will be created
- plate_manager (PlateManager) – The hyperstream plate manager
Returns: The created factor
-
execute(time_interval)[source]¶ Here we execute the factors over the streams in the workflow. The factors are executed in reverse order. We can’t just execute the last factor because there may be multiple “leaf” factors that aren’t triggered by upstream computations.
Parameters: time_interval – The time interval to execute this workflow over
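The reverse-order traversal can be sketched with stand-in factors that simply record when they are called. RecordingFactor and execute_all are hypothetical; the sketch shows only the iteration order, not how real factors pull data from upstream streams.

```python
class RecordingFactor:
    """Hypothetical factor that logs each execution request."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def execute(self, time_interval):
        self.log.append((self.name, time_interval))


def execute_all(factors, time_interval):
    """Execute every factor, last-created first, so no leaf is missed."""
    for factor in reversed(factors):
        factor.execute(time_interval)


log = []
factors = [RecordingFactor(name, log) for name in ("load", "clean", "summarise")]
execute_all(factors, (0, 10))
print([name for name, _ in log])
```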
-
requested_intervals¶
-
hyperstream.workflow.workflow_manager module¶
-
class
hyperstream.workflow.workflow_manager.WorkflowManager(channel_manager, plate_manager)[source]¶ Bases:
hyperstream.utils.utils.Printable
Workflow manager. Responsible for reading and writing workflows to the database, and can execute all of the workflows.
-
add_workflow(workflow, commit=False)[source]¶ Add a new workflow and optionally commit it to the database
Parameters: - workflow (Workflow) – The workflow
- commit (bool) – Whether to commit the workflow to the database
Returns: None
-
commit_workflow(workflow_id)[source]¶ Commit the workflow to the database
Parameters: workflow_id – The workflow id Returns: None
-
delete_workflow(workflow_id)[source]¶ Delete a workflow from the database
Parameters: workflow_id – The workflow id Returns: None
-
load_workflow(workflow_id)[source]¶ Load workflow from the database and store in memory
Parameters: workflow_id – The workflow id Returns: The workflow
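The split between workflows held in memory and workflows committed to the database can be illustrated with two dictionaries. InMemoryWorkflowManager and SimpleWorkflow are hypothetical, and a plain dict stands in for the database; the real manager's constructor takes a channel manager and a plate manager, which this sketch omits.

```python
from collections import namedtuple

# Hypothetical stand-in for a workflow: just an id and a name.
SimpleWorkflow = namedtuple("SimpleWorkflow", ["workflow_id", "name"])


class InMemoryWorkflowManager:
    def __init__(self):
        self.workflows = {}  # workflows currently held in memory
        self.database = {}   # stand-in for the persistent store

    def add_workflow(self, workflow, commit=False):
        self.workflows[workflow.workflow_id] = workflow
        if commit:
            self.commit_workflow(workflow.workflow_id)

    def commit_workflow(self, workflow_id):
        self.database[workflow_id] = self.workflows[workflow_id]

    def delete_workflow(self, workflow_id):
        # Assumption: deleting affects the stored copy only.
        self.database.pop(workflow_id, None)

    def load_workflow(self, workflow_id):
        workflow = self.database[workflow_id]
        self.workflows[workflow_id] = workflow
        return workflow


manager = InMemoryWorkflowManager()
manager.add_workflow(SimpleWorkflow("w1", "demo"))
stored_before_commit = "w1" in manager.database
manager.commit_workflow("w1")
loaded = manager.load_workflow("w1")
print(stored_before_commit, loaded.name)
```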
-