bnlearn.bnlearn¶
Bayesian techniques for structure learning, parameter learning, inference and sampling.
-
bnlearn.bnlearn.adjmat2dict(adjmat)¶ Convert adjacency matrix to dict.
- Parameters
adjmat (pd.DataFrame) – Adjacency matrix.
- Returns
graph – Graph.
- Return type
dict
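To illustrate the idea of the conversion, here is a minimal sketch in plain Python. It is not bnlearn's implementation: a nested dict stands in for the pd.DataFrame, the helper name adjmat_to_dict is hypothetical, and the output layout (node mapped to a list of its children) is an assumption.

```python
# Sketch only: a nested dict stands in for the pd.DataFrame adjacency matrix,
# and the output layout (node -> list of children) is an assumption.
def adjmat_to_dict(adjmat):
    """Map each source node to the list of targets it points to."""
    return {
        source: [target for target, connected in row.items() if connected]
        for source, row in adjmat.items()
    }

nodes = ["Cloudy", "Sprinkler", "Rain", "Wet_Grass"]
edges = {("Cloudy", "Sprinkler"), ("Cloudy", "Rain"),
         ("Sprinkler", "Wet_Grass"), ("Rain", "Wet_Grass")}

# Rows are sources, columns are targets.
adjmat = {s: {t: (s, t) in edges for t in nodes} for s in nodes}

graph = adjmat_to_dict(adjmat)
print(graph)
```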
-
bnlearn.bnlearn.adjmat2vec(adjmat, min_weight=1)¶ Convert adjacency matrix into vector with source and target.
- Parameters
adjmat (pd.DataFrame()) – Adjacency matrix.
min_weight (float) – Only edges with at least this weight are returned. The default is 1.
- Returns
nodes that are connected based on source and target
- Return type
pd.DataFrame()
Examples
>>> import bnlearn as bn
>>> source = ['Cloudy', 'Cloudy', 'Sprinkler', 'Rain']
>>> target = ['Sprinkler', 'Rain', 'Wet_Grass', 'Wet_Grass']
>>> adjmat = bn.vec2adjmat(source, target)
>>> vector = bn.adjmat2vec(adjmat)
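The effect of min_weight can be sketched without the library. The helper below is hypothetical and works on a nested dict rather than a pd.DataFrame; it assumes the threshold is inclusive (weight >= min_weight), which the description suggests but does not state outright.

```python
# Sketch only: nested dict instead of pd.DataFrame; inclusive threshold is an assumption.
def adjmat_to_vec(adjmat, min_weight=1):
    """Return (source, target, weight) rows for edges with weight >= min_weight."""
    return [
        (source, target, weight)
        for source, row in adjmat.items()
        for target, weight in row.items()
        if weight >= min_weight
    ]

adjmat = {
    "Cloudy":    {"Cloudy": 0, "Sprinkler": 1, "Rain": 2},
    "Sprinkler": {"Cloudy": 0, "Sprinkler": 0, "Rain": 0},
    "Rain":      {"Cloudy": 0, "Sprinkler": 0, "Rain": 0},
}

vector = adjmat_to_vec(adjmat)               # keeps both edges
heavy = adjmat_to_vec(adjmat, min_weight=2)  # keeps only the weight-2 edge
print(vector)
print(heavy)
```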
-
bnlearn.bnlearn.compare_networks(model_1, model_2, pos=None, showfig=True, figsize=(15, 8), verbose=3)¶ Compare networks of two models.
- Parameters
model_1 (dict) – Results of model 1.
model_2 (dict) – Results of model 2.
pos (graph, optional) – Coordinates of the network. If provided, the same layout is used to plot the network. The default is None.
showfig (bool, optional) – Plot the figure. The default is True.
figsize (tuple, optional) – Figure size. The default is (15, 8).
verbose (int, optional) – Print progress to screen. The default is 3. 0: None, 1: ERROR, 2: WARN, 3: INFO (default), 4: DEBUG, 5: TRACE
- Returns
scores : Score of differences between the two input models.
adjmat_diff : Adjacency matrix depicting the differences between the two input models.
- Return type
tuple containing (scores, adjmat_diff)
-
bnlearn.bnlearn.df2onehot(df, y_min=10, perc_min_num=0.8, dtypes='pandas', excl_background=None, verbose=3)¶ Convert dataframe to one-hot matrix.
- Parameters
df (pd.DataFrame()) – Input dataframe for which the rows are the features, and columns are the samples.
dtypes (list of str or 'pandas', optional) – Representation of the columns in the form of [‘cat’, ‘num’]. By default the dtype is determined based on the pandas dataframe.
y_min (int [0..len(y)], optional) – Minimal number of samples that must be present in a group. All groups with fewer than y_min samples are labeled as _other_ and are not used in the enriching model. The default is 10.
perc_min_num (float [None, 0..1], optional) – Force column (int or float) to be numerical if the percentage of unique non-zero values is above this threshold. The default is 0.8; None is also accepted.
verbose (int, optional) – Print message to screen. The default is 3. 0: None, 1: ERROR, 2: WARN, 3: INFO (default), 4: DEBUG, 5: TRACE
- Returns
One-hot dataframe.
- Return type
pd.DataFrame()
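The role of y_min can be sketched for a single categorical column. This is a plain-Python illustration, not bnlearn's implementation: the helper name one_hot_column and the output column naming (name_category) are assumptions.

```python
from collections import Counter

# Sketch only: one categorical column, plain lists instead of a DataFrame.
def one_hot_column(name, values, y_min=2):
    """One-hot encode values; groups with fewer than y_min samples become '_other_'."""
    counts = Counter(values)
    labels = [v if counts[v] >= y_min else "_other_" for v in values]
    categories = sorted(set(labels))
    # Hypothetical naming scheme: "<column>_<category>".
    return {f"{name}_{c}": [label == c for label in labels] for c in categories}

encoded = one_hot_column("color", ["red", "red", "blue", "green", "red"], y_min=2)
for column, flags in encoded.items():
    print(column, flags)
```

Here 'blue' and 'green' each occur once, so both fall below y_min=2 and are merged into a single _other_ column.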
-
bnlearn.bnlearn.get_edge_properties(model, color='#000000', weight=1, verbose=3)¶ Collect edge properties.
- Parameters
model (dict) – dict containing (initialized) model.
color (str, (Default: '#000000')) – The default color of the edges.
weight (float, (Default: 1)) – The default weight of the edges.
verbose (int, optional) – Print progress to screen. The default is 3. 0: None, 1: ERROR, 2: WARN, 3: INFO (default), 4: DEBUG, 5: TRACE
- Returns
edges – Edge properties.
- Return type
dict.
Examples
>>> # Example 1
>>> import bnlearn as bn
>>> edges = [('A', 'B'), ('A', 'C'), ('A', 'D')]
>>> # Create DAG and store in model
>>> model = bn.make_DAG(edges)
>>> edge_properties = bn.get_edge_properties(model)
>>> # Adjust the properties
>>> edge_properties[('A', 'B')]['weight'] = 10
>>> edge_properties[('A', 'B')]['color'] = '#8A0707'
>>> # Make plot
>>> bn.plot(model, interactive=False, edge_properties=edge_properties)
-
bnlearn.bnlearn.get_node_properties(model, node_color='#1f456e', node_size=None, verbose=3)¶
-
bnlearn.bnlearn.import_DAG(filepath='sprinkler', CPD=True, checkmodel=True, verbose=3)¶ Import Directed Acyclic Graph.
- Parameters
filepath (str, (default: sprinkler)) – Name of a pre-defined example (‘sprinkler’, ‘alarm’, ‘andes’, ‘asia’, ‘sachs’) or the absolute file path to a .bif model file, such as ‘filepath/to/model.bif’. The default is ‘sprinkler’.
CPD (bool, optional) – Whether to include the conditional probability distributions (CPDs) in the DAG. The default is True.
checkmodel (bool) – Check the validity of the model. The default is True.
verbose (int, optional) – Print progress to screen. The default is 3. 0: None, 1: ERROR, 2: WARN, 3: INFO (default), 4: DEBUG, 5: TRACE
- Returns
model : BayesianModel
adjmat : Adjacency matrix
- Return type
dict containing model and adjmat.
Examples
>>> import bnlearn as bn
>>> model = bn.import_DAG('sprinkler')
>>> bn.plot(model)
-
bnlearn.bnlearn.import_example(data='sprinkler', n=10000, verbose=3)¶ Load example dataset.
- Parameters
data (str, (default: sprinkler)) – Pre-defined examples. ‘titanic’, ‘sprinkler’, ‘alarm’, ‘andes’, ‘asia’, ‘sachs’, ‘water’, ‘random’
n (int, optional) – Number of samples to generate. The default is 10000.
verbose (int, (default: 3)) – Print progress to screen. 0: None, 1: Error, 2: Warning, 3: Info, 4: Debug, 5: Trace
- Returns
df
- Return type
pd.DataFrame()
-
bnlearn.bnlearn.load(filepath='bnlearn_model.pkl', verbose=3)¶ Load learned model.
- Parameters
filepath (str) – Pathname to stored pickle files.
verbose (int, optional) – Show message. A higher number gives more information. The default is 3.
- Returns
- Return type
Object.
-
bnlearn.bnlearn.make_DAG(DAG, CPD=None, methodtype='bayes', checkmodel=True, verbose=3)¶ Create Directed Acyclic Graph based on list.
- Parameters
DAG (list) – list containing source and target in the form of [(‘A’,’B’), (‘B’,’C’)].
CPD (list, array-like) – Containing TabularCPD for each node.
methodtype (str (default: 'bayes')) –
‘bayes’: Bayesian model
‘nb’ or ‘naivebayes’: Special case of Bayesian Model where the only edges in the model are from the feature variables to the dependent variable. In other words, each tuple should start with the same variable name, such as: edges = [(‘A’, ‘B’), (‘A’, ‘C’), (‘A’, ‘D’)]
checkmodel (bool) – Check the validity of the model. The default is True.
verbose (int, optional) – Print progress to screen. The default is 3. 0: None, 1: ERROR, 2: WARN, 3: INFO (default), 4: DEBUG, 5: TRACE
- Returns
‘adjmat’: Adjacency matrix
‘model’: pgmpy.models
‘methodtype’: methodtype
‘model_edges’: Edges
- Return type
dict keys
Examples
>>> import bnlearn as bn
>>> edges = [('A', 'B'), ('A', 'C'), ('A', 'D')]
>>> DAG = bn.make_DAG(edges, methodtype='naivebayes')
>>> bn.plot(DAG)
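The edge-list rule for the naive Bayes method type (every tuple starts with the same variable) can be checked before calling make_DAG. The helper below is a hypothetical sketch in plain Python and is not part of bnlearn.

```python
# Sketch only: validates the "same first variable" rule described for 'naivebayes'.
def is_naive_bayes_edges(edges):
    """Return True when every (source, target) tuple shares the same source."""
    sources = {source for source, _ in edges}
    return len(sources) == 1

star = [("A", "B"), ("A", "C"), ("A", "D")]   # all edges start at 'A'
chain = [("A", "B"), ("B", "C")]              # two different sources

print(is_naive_bayes_edges(star))
print(is_naive_bayes_edges(chain))
```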
-
bnlearn.bnlearn.plot(model, pos=None, scale=1, interactive=False, title='bnlearn causal network', node_color=None, node_size=None, node_properties=None, edge_properties=None, params_interactive={'bgcolor': '#ffffff', 'font_color': False, 'height': '800px', 'layout': None, 'notebook': False, 'width': '70%'}, params_static={'alpha': 0.8, 'arrowsize': 30, 'arrowstyle': '-|>', 'edge_alpha': 0.8, 'facecolor': 'white', 'font_color': '#000000', 'font_family': 'sans-serif', 'font_size': 14, 'height': 8, 'layout': 'fruchterman_reingold', 'node_shape': 'o', 'width': 15}, verbose=3)¶ Plot the learned structure.
- Parameters
model (dict) – Learned model from the .fit() function.
pos (graph, optional) – Coordinates of the network. If provided, the same layout is used to plot the network. The default is None.
scale (int, optional) – Scaling parameter for the network. A larger number will linearly increase the size of the network. The default is 1.
interactive (bool, (default: False)) – True: Interactive web-based graph. False: Static plot.
title (str, optional) – Title for the plots.
node_color (str, optional) – Color each node in the network using a hex-color, such as ‘#8A0707’
node_size (int, optional) – Set the node size for each node in the network. The default size when using static plots is 800, and for interactive plots it is 10.
node_properties (dict (default: None)) –
Dictionary containing custom node_color and node_size parameters for the network. The node properties can be retrieved using the function: node_properties = bn.get_node_properties(model)
node_properties = {‘node1’: {‘node_color’: ‘#8A0707’, ‘node_size’: 10}, ‘node2’: {‘node_color’: ‘#000000’, ‘node_size’: 30}}
edge_properties (dict (default: None)) – Dictionary containing custom edge color and weight parameters for the network. The edge properties can be retrieved using the function: edge_properties = bn.get_edge_properties(model)
params_interactive (dict.) – Dictionary containing various settings in case of creating interactive plots.
params_static (dict.) – Dictionary containing various settings in case of creating static plots.
verbose (int, optional) – Print progress to screen. The default is 3. 0: None, 1: Error, 2: Warning, 3: Info (default), 4: Debug, 5: Trace
- Returns
- pos: list.
Positions of the nodes.
- G: Graph.
Graph model.
- node_properties: dict.
Node properties.
- Return type
dict containing pos and G
Examples
>>> import bnlearn as bn
>>>
>>> # Load asia example dataset
>>> df = bn.import_example(data='asia')
>>>
>>> # Structure learning of sampled dataset
>>> model = bn.structure_learning.fit(df)
>>>
>>> # plot static
>>> G = bn.plot(model)
>>>
>>> # plot interactive
>>> G = bn.plot(model, interactive=True)
>>>
>>> # plot interactive with various settings
>>> bn.plot(model, node_color='#8A0707', node_size=35, interactive=True, params_interactive = {'height':'800px', 'width':'70%', 'layout':None, 'bgcolor':'#0f0f0f0f'})
>>>
>>> # plot with node properties
>>> node_properties = bn.get_node_properties(model)
>>> # Make some changes
>>> node_properties['xray']['node_color'] = '#8A0707'
>>> node_properties['xray']['node_size'] = 50
>>> # Plot
>>> bn.plot(model, interactive=True, node_properties=node_properties)
-
bnlearn.bnlearn.predict(model, df, variables, to_df=True, method='max', verbose=3)¶ Predict on data from a Bayesian network.
The inference on the dataset is performed sample-wise by using all the available nodes as evidence (obviously, with the exception of the node whose values we are predicting). The states with highest probability are returned.
- Parameters
model (Object) – An object of class from bn.fit.
df (pd.DataFrame) – Each row in the DataFrame will be predicted
variables (str or list of str) – The label(s) of node(s) to be predicted.
to_df (Bool, (default is True)) – The output is converted to dataframe output. Note that this heavily impacts the speed.
method (str) – The method used to select the output of the inferences. ‘max’ : Return the variable values with the maximum probability. None : Return all probabilities.
verbose (int, optional) – Print progress to screen. The default is 3. 0: None, 1: ERROR, 2: WARN, 3: INFO (default), 4: DEBUG, 5: TRACE
- Returns
P – Predict() returns a dict with the evidence and states that resulted in the highest probability for the input variable.
- Return type
dict or DataFrame
Examples
>>> import bnlearn as bn
>>> model = bn.import_DAG('sprinkler')
>>>
>>> # Make single inference
>>> query = bn.inference.fit(model, variables=['Rain', 'Cloudy'], evidence={'Wet_Grass':1})
>>> print(query)
>>> print(bn.query2df(query))
>>>
>>> # Let's create an example dataset with 1000 samples and make inferences on the entire dataset.
>>> df = bn.sampling(model, n=1000)
>>>
>>> # Each sample will be assessed and the states with highest probability are returned.
>>> Pout = bn.predict(model, df, variables=['Rain', 'Cloudy'])
>>>
>>> print(Pout)
>>> #      Cloudy  Rain         p
>>> # 0         0     0  0.647249
>>> # 1         0     0  0.604230
>>> # ..      ...   ...       ...
>>> # 998       0     0  0.604230
>>> # 999       1     1  0.878049
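The difference between method='max' and method=None can be sketched for one sample. The posterior below is made up for illustration, and select_states is a hypothetical helper, not bnlearn's code; it only shows the selection step applied to an already computed distribution.

```python
# Sketch only: the posterior values are hypothetical, not computed by bnlearn.
def select_states(posterior, method="max"):
    """With 'max', return the most probable joint state; with None, return everything."""
    if method == "max":
        state = max(posterior, key=posterior.get)
        return {"state": state, "p": posterior[state]}
    return dict(posterior)

# Joint states are (Cloudy, Rain) pairs for a single evidence row.
posterior = {(0, 0): 0.55, (0, 1): 0.15, (1, 0): 0.10, (1, 1): 0.20}

best = select_states(posterior)                  # single most probable state
full = select_states(posterior, method=None)     # all probabilities
print(best)
print(full)
```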
-
bnlearn.bnlearn.print_CPD(DAG, checkmodel=False)¶ Print DAG-model to screen.
- Parameters
DAG (pgmpy.models.BayesianModel.BayesianModel) – model of the DAG.
checkmodel (bool) – Check the validity of the model. The default is False.
- Returns
- Return type
None.
-
bnlearn.bnlearn.query2df(query, variables=None)¶ Convert query from inference model to a dataframe.
- Parameters
query (Object from the inference model.) – Convert query object to a dataframe.
variables (list) – Order or select variables.
- Returns
df – Dataframe with inferences.
- Return type
pd.DataFrame()
-
bnlearn.bnlearn.sampling(DAG, n=1000, verbose=3)¶ Generate sample(s) using forward sampling from the joint distribution of the Bayesian network.
- Parameters
DAG (dict) – Contains model and adjmat of the DAG.
n (int, optional) – Number of samples to generate. The default is 1000.
verbose (int, optional) – Print progress to screen. The default is 3. 0: None, 1: ERROR, 2: WARN, 3: INFO (default), 4: DEBUG, 5: TRACE
- Returns
df – Dataframe containing sampled data from the input DAG model.
- Return type
pd.DataFrame()
Example
>>> import bnlearn
>>> DAG = bnlearn.import_DAG('sprinkler')
>>> df = bnlearn.sampling(DAG, n=1000)
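Forward sampling draws each node in topological order, conditioning on the values already drawn for its parents. The sketch below shows the idea on a sprinkler-shaped network in plain Python; the probabilities are hypothetical placeholders rather than the CPDs shipped with bnlearn, and forward_sample is not a library function.

```python
import random

# Sketch only: hypothetical CPDs giving P(node = 1 | parent values).
PARENTS = {
    "Cloudy": [],
    "Sprinkler": ["Cloudy"],
    "Rain": ["Cloudy"],
    "Wet_Grass": ["Sprinkler", "Rain"],
}
P_ONE = {
    "Cloudy": {(): 0.5},
    "Sprinkler": {(0,): 0.5, (1,): 0.1},
    "Rain": {(0,): 0.2, (1,): 0.8},
    "Wet_Grass": {(0, 0): 0.0, (0, 1): 0.9, (1, 0): 0.9, (1, 1): 0.99},
}
ORDER = ["Cloudy", "Sprinkler", "Rain", "Wet_Grass"]  # parents before children

def forward_sample(n, seed=0):
    """Draw n joint samples by sampling each node given its sampled parents."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        row = {}
        for node in ORDER:
            parent_values = tuple(row[p] for p in PARENTS[node])
            row[node] = int(rng.random() < P_ONE[node][parent_values])
        samples.append(row)
    return samples

samples = forward_sample(1000)
print(samples[:3])
```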
-
bnlearn.bnlearn.save(model, filepath='bnlearn_model.pkl', overwrite=False, verbose=3)¶ Save learned model in pickle file.
- Parameters
filepath (str, (default: 'bnlearn_model.pkl')) – Pathname to store pickle files.
overwrite (bool, (default=False)) – Overwrite the file if it exists.
verbose (int, optional) – Show message. A higher number gives more information. The default is 3.
- Returns
bool – Status whether the file is saved.
- Return type
[True, False]
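The save and load behavior, including the overwrite flag and the boolean status, can be sketched with the standard library. This assumes a plain pickle round trip; the helpers save_model and load_model are hypothetical and bnlearn's own storage code may differ.

```python
import os
import pickle
import tempfile

# Sketch only: stdlib pickle round trip mirroring the documented overwrite flag.
def save_model(model, filepath, overwrite=False):
    """Pickle model to filepath; return True if written, False if skipped."""
    if os.path.exists(filepath) and not overwrite:
        return False
    with open(filepath, "wb") as handle:
        pickle.dump(model, handle)
    return True

def load_model(filepath):
    """Load a previously pickled model."""
    with open(filepath, "rb") as handle:
        return pickle.load(handle)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "bnlearn_model.pkl")
    model = {"model_edges": [("A", "B"), ("A", "C")]}
    first = save_model(model, path)                   # file is new, so it is written
    second = save_model(model, path)                  # file exists, overwrite=False
    third = save_model(model, path, overwrite=True)   # explicitly replaced
    restored = load_model(path)

print(first, second, third)
```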
-
bnlearn.bnlearn.to_bayesianmodel(model, verbose=3)¶ Convert adjacency matrix to BayesianModel.
Convert an adjacency matrix to a BayesianModel. This is required as some of the functionalities, such as structure_learning, output a DAG model. If the output of structure_learning is provided, the adjmat is extracted and processed.
- Parameters
model (pd.DataFrame()) – Adjacency matrix.
- Raises
Exception – The input should not be None and if a model (as dict) is provided, the key ‘adjmat’ should be included.
- Returns
bayesianmodel – BayesianModel that can be used in parameter_learning.fit.
- Return type
Object
-
bnlearn.bnlearn.to_undirected(adjmat)¶ Transform directed adjacency matrix to undirected.
- Parameters
adjmat (np.array()) – Adjacency matrix.
- Returns
Undirected adjacency matrix – Converted adjmat with undirected edges.
- Return type
pd.DataFrame()
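Making a directed adjacency matrix undirected amounts to mirroring every edge. A minimal sketch in plain Python, with a nested dict in place of the array or DataFrame and a hypothetical helper name:

```python
# Sketch only: nested dict of booleans instead of np.array / pd.DataFrame.
def symmetrize(adjmat):
    """Mark a pair as connected if an edge exists in either direction."""
    nodes = list(adjmat)
    return {a: {b: bool(adjmat[a][b] or adjmat[b][a]) for b in nodes} for a in nodes}

directed = {
    "A": {"A": False, "B": True},
    "B": {"A": False, "B": False},
}
undirected = symmetrize(directed)
print(undirected)
```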
-
bnlearn.bnlearn.topological_sort(adjmat, start=None)¶ Topological sort.
Get nodes list in the topological sort order.
- Parameters
adjmat (pd.DataFrame or bnlearn object.) – Adjacency matrix.
start (str, optional) – Start position. The default is None and the whole network is examined.
- Returns
Topological sort order.
- Return type
list
Example
>>> import bnlearn as bn
>>> DAG = bn.import_DAG('sprinkler', verbose=0)
>>> bn.topological_sort(DAG, 'Rain')
>>> bn.topological_sort(DAG)
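For readers unfamiliar with the technique, a depth-first topological sort (reverse postorder) can be sketched as follows. This is not bnlearn's implementation: the graph is a plain dict of children, topo_sort is a hypothetical name, and treating start as "only nodes reachable from start" is an assumption about the parameter's meaning.

```python
# Sketch only: depth-first topological sort on a dict of node -> children.
def topo_sort(graph, start=None):
    """Return nodes so that every parent appears before its children."""
    visited, postorder = set(), []

    def visit(node):
        if node in visited:
            return
        visited.add(node)
        for child in graph.get(node, []):
            visit(child)
        postorder.append(node)  # appended only after all descendants

    roots = [start] if start is not None else list(graph)
    for node in roots:
        visit(node)
    return postorder[::-1]

graph = {
    "Cloudy": ["Sprinkler", "Rain"],
    "Sprinkler": ["Wet_Grass"],
    "Rain": ["Wet_Grass"],
    "Wet_Grass": [],
}
full_order = topo_sort(graph)
from_rain = topo_sort(graph, "Rain")
print(full_order)
print(from_rain)
```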
-
bnlearn.bnlearn.vec2adjmat(source, target, symmetric=True)¶ Convert source and target into adjacency matrix.
- Parameters
source (list) – The source node.
target (list) – The target node.
symmetric (bool, optional) – Make the adjacency matrix symmetric with the same number of rows as columns. The default is True.
- Returns
adjacency matrix.
- Return type
pd.DataFrame
Examples
>>> import bnlearn as bn
>>> source = ['Cloudy', 'Cloudy', 'Sprinkler', 'Rain']
>>> target = ['Sprinkler', 'Rain', 'Wet_Grass', 'Wet_Grass']
>>> bn.vec2adjmat(source, target)
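What the symmetric flag does to the matrix shape can be sketched in plain Python. The helper vec_to_adjmat is hypothetical and returns a nested dict rather than a pd.DataFrame; the sorted node order is an arbitrary choice for the sketch.

```python
# Sketch only: nested dict of booleans instead of pd.DataFrame.
def vec_to_adjmat(source, target, symmetric=True):
    """Build a source-by-target adjacency matrix from two parallel lists."""
    rows = sorted(set(source))
    cols = sorted(set(target))
    if symmetric:
        # Use the same node set on both axes so the matrix is square.
        rows = cols = sorted(set(source) | set(target))
    adjmat = {r: {c: False for c in cols} for r in rows}
    for s, t in zip(source, target):
        adjmat[s][t] = True
    return adjmat

source = ["Cloudy", "Cloudy", "Sprinkler", "Rain"]
target = ["Sprinkler", "Rain", "Wet_Grass", "Wet_Grass"]

square = vec_to_adjmat(source, target)
compact = vec_to_adjmat(source, target, symmetric=False)
print(list(square))
print(list(compact))
```

With symmetric=True every node appears as both a row and a column, so 'Wet_Grass' gets an all-False row even though it is never a source.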