Figure 2e: Drug combination associations with host and microbiome

This script prepares input for Figure 2e: Mediation analysis of the host and microbiome features for the selected drug combinations. Drug-feature effect graph representing potential feature mediation effects between host and microbiome features. Solid lines represent drug effects on the feature, colour represents direction of the effect. Dashed lines between features indicate potential mediation (general mediation model one-sided P < 0.1 ), colour represents the sign of Pearson’s correlation coefficient (P < 0.1).

For visualization purposes, the edges between drugs and features in this graph have the value of drug combination effect size for the corresponding combination. (E.g. if combination of statin wih calcium antagonist synergistically decrease VLDL Cholesterol, both statin and calcium antagonist have an edge connecting them with VLDL cholesterol, which has the edge value of the combination effect size).

Required file in the input_data folder:

  • Supplementary_Table_6_2019-09-13434.xlsx

Supplementary Table 6: Features of microbiome, host and metabolome impacted by different drug groups and drug compounds. Results of drug group (or drug compound according to the ATC classification) assessment for its impact on host and microbiome features for each patient group. Compound comparison with Maier et al., Nature 2018, tab shows microbiome features negatively impacted by the drug treatment (for the ATC-level compounds) in at least one patient group, and bacterial species whose growth was inhibited by the same drug in the in vitro experiment.

  • Supplementary_Table_8_2019-09-13434.xlsx

Supplementary Table 8: Features of microbiome, host and metabolome impacted by different drug combinations. Analysis of the effect of drug combinations, assessed for impact on host and microbiome falling within different measurement categories in each patient group.

  • Supplementary_Table_10_2019-09-13434.xlsx

Supplementary Table 10. Mediation analysis of host and microbiome features for drug intake, dosage and combinations. Mediation analysis via a regression model of drug effect on each host feature mediated through a microbiome feature or vice versa.

Figure is based on the data from Supplementary Tables 6, 8 and 10.

In [1]:
# load required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib.patches as mpatches
%matplotlib inline
In [2]:
# read combination feature mediation file
fileFolder = './input_data/'
fileName = 'Supplementary_Table_10_2019-09-13434.xlsx'
sheetName = 'Drug combinations'

mediation_df_filtered_combined_filtered = pd.read_excel(fileFolder + fileName,
                           sheet_name = sheetName)

Get feature infromation for drug and drug combination associations for further filtering

In [3]:
# read single drug features files
fileName = 'Supplementary_Table_6_2019-09-13434.xlsx'
sheetName = 'Drug group effect'
drugEffect = pd.read_excel(fileFolder + fileName,
                           sheet_name = sheetName)
In [4]:
# read drug combination features files
fileName = 'Supplementary_Table_8_2019-09-13434.xlsx'
sheetName = 'Data'
drugCombinationEffect = pd.read_excel(fileFolder + fileName,
                           sheet_name = sheetName)
In [5]:
# concatenate single drug effects and combination effects
allEffects_df = pd.concat([drugCombinationEffect, drugEffect])
In [6]:
# get the patient groups (Sample set column)
listconditions = list(set(allEffects_df['Sample set']))

For each mediation pair, get effect sizes of drugs and combinations and diseases

In [7]:
curset = 'T2D (3)' # select Group T2D (3)
# copy all effects for one condition to reduce number
selectedEffectd_df = allEffects_df[(allEffects_df['Sample set'].str.find(curset)>=0)]
In [8]:
# select only features with congruence opposite of disease (drug and disease effects are opposite)
selectedEffectd_df = selectedEffectd_df[selectedEffectd_df['Congruence']=='Opposite'].copy()
In [9]:
# create a dataframe containing information on the potential features and mediators
mediation_df_filtered_combined_filtered_features = []

for i in range(len(mediation_df_filtered_combined_filtered)):
    curfeature1 = mediation_df_filtered_combined_filtered['Feature1'].iloc[i]
    curfeature2 = mediation_df_filtered_combined_filtered['Feature2_med'].iloc[i]
    curspace1 = mediation_df_filtered_combined_filtered['FeatureSpace1'].iloc[i]
    curspace2 = mediation_df_filtered_combined_filtered['FeatureSpace2'].iloc[i]
    cureffector = mediation_df_filtered_combined_filtered['Effector'].iloc[i]

    cureffector_drugs = cureffector.replace('Combination: ', '')
    cureffector_drugs = cureffector_drugs.split(', ')

    cureffect_df = selectedEffectd_df[
                (((selectedEffectd_df['Feature display name']==curfeature1) &
                  (selectedEffectd_df['Feature space']==curspace1)) |
                 ((selectedEffectd_df['Feature display name']==curfeature2) &
                  (selectedEffectd_df['Feature space']==curspace2)))&
                ((selectedEffectd_df['Effector']=='Group contrast') |
                 (selectedEffectd_df['Effector']==cureffector) |
                 (selectedEffectd_df['Effector']==cureffector_drugs[0]) |
                 (selectedEffectd_df['Effector']==cureffector_drugs[1]))]
    if len(mediation_df_filtered_combined_filtered_features)==0:
        mediation_df_filtered_combined_filtered_features = cureffect_df.copy()
    else:
        mediation_df_filtered_combined_filtered_features = pd.concat([
            mediation_df_filtered_combined_filtered_features, cureffect_df])
In [10]:
mediation_df_filtered_combined_filtered_features_subset = \
    mediation_df_filtered_combined_filtered.copy()

Select statin, metformin, aspirin and calcium antagonists to extract the mediation graph.

In [11]:
ploteffectors = list(set(mediation_df_filtered_combined_filtered_features_subset['Effector']))
In [12]:
ploteffectors = [item for item in ploteffectors 
                 if (('Statin' in item) & ('Metformin' in item)) |
                    (('Statin' in item) & ('Aspirin' in item)) |
                    (('Statin' in item) & ('Calcium' in item))]
ploteffectors
Out[12]:
['Combination: Calcium antagonist intake, Statin intake',
 'Combination: Metformin intake, Statin intake',
 'Combination: Statin intake, Aspirine intake']

Filter mediation results by effectors and patient type

In [13]:
mediation_df_filtered_combined_filtered_features_subset
Out[13]:
Effector Sample set Feature1 Feature2_med FeatureSpace1 FeatureSpace2 ACME (average)_Estimate ADE (average)_Estimate Total effect_Estimate ACME (average)_P-value ... Total effect_P-value FeatureID1 FeatureID2 FeatureDrugCorr FeatureDrugCorrP FeatureFeatureCorr FeatureFeatureCorrP Feature1_Type Feature2_Type FeatureTypes
0 Combination: ACE intake, Aspirine intake MetS (1) Unannotated serum metabolite (7.055425 PPM) unclassified Firmicutes (CAG01153) Serum, unannotated MGS 232.682304 1506.937363 1739.619666 0.100 ... 0.000 NMR_v2_SERUM_242_PPM7.055425 TaxonAdjustedID.MGS.down.10000000.v6.1165 0.135916 0.031353 0.050925 0.421805 Host Bact Opposite
1 Combination: ACE intake, Aspirine intake Chronic CAD (5) Blautia sp. CAG:257 (CAG01293) Alanine MGS Serum, NMR absolute -0.007616 0.010583 0.002967 0.076 ... 0.888 TaxonAdjustedID.MGS.down.10000000.v6.1315 SERUM_NMR_ABSOLUTE_Alanine -0.041737 0.601416 0.155095 0.050930 Bact Host Opposite
2 Combination: ACE intake, Beta-blocker intake T2D (3) unclassified Firmicutes (CAG01223) Unannotated serum metabolite (7.3439 PPM) MGS Serum, unannotated 0.070816 0.123033 0.193849 0.046 ... 0.470 TaxonAdjustedID.MGS.down.10000000.v6.1240 NMR_v2_SERUM_260_PPM7.3439 0.076553 0.071794 0.046753 0.271965 Bact Host Opposite
3 Combination: ACE intake, Beta-blocker intake T2D (3) unclassified Firmicutes (CAG01223) Unannotated urine metabolite (3.2692 PPM) MGS Urine, unannotated -0.117279 0.303841 0.186561 0.028 ... 0.464 TaxonAdjustedID.MGS.down.10000000.v6.1240 NMR_v2_URINE_321_PPM3.2692 0.076553 0.071794 0.019295 0.650421 Bact Host Opposite
4 Combination: ACE intake, Diuretic intake Obesity, morbid (2b) Methanobrevibacter smithii 2 (CAG01189) Acetic_acid (urine) MGS Urine, annotated 0.021168 0.095218 0.116386 0.046 ... 0.000 TaxonAdjustedID.MGS.down.10000000.v6.1204 URINE_V2_Acetic_acid 0.360765 0.000015 0.124180 0.148239 Bact Host Opposite
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
5764 Combination: Thiazidique intake, Statin intake MetS (1) IDL Cholesterol mg/dL unclassified Oscillibacter (CAG00985) Serum, lipoproteins MGS -0.601905 -3.088073 -3.689978 0.060 ... 0.120 IDCH TaxonAdjustedID.MGS.down.10000000.v6.938 -0.120623 0.056332 -0.106974 0.090800 Host Bact Opposite
5765 Combination: Thiazidique intake, Statin intake MetS (1) Unannotated serum metabolite (3.099 PPM) unclassified Oscillibacter (CAG00985) Serum, unannotated MGS 156.598622 1474.221448 1630.820070 0.080 ... 0.000 NMR_v2_SERUM_130_PPM3.099 TaxonAdjustedID.MGS.down.10000000.v6.938 0.167163 0.007958 0.058964 0.352205 Host Bact Opposite
5766 Combination: Thiazidique intake, Statin intake MetS (1) HDL-cholesterol in mmol/l unclassified Oscillibacter (CAG00985) Phenotype MGS 0.057962 0.096212 0.154174 0.040 ... 0.080 FASTLIPHDLC_C TaxonAdjustedID.MGS.down.10000000.v6.938 0.080407 0.204231 0.196394 0.001769 Host Bact Opposite
5767 Combination: Thiazidique intake, Statin intake MetS (1) unclassified Oscillibacter (CAG00985) LDL-cholesterol in mmol/l MGS Phenotype 0.031986 0.201756 0.233742 0.038 ... 0.004 TaxonAdjustedID.MGS.down.10000000.v6.938 FASTLIPLDL_C 0.240168 0.000122 -0.162244 0.010033 Bact Host Opposite
5768 Combination: Thiazidique intake, Statin intake MetS (1) LDL-cholesterol in mmol/l unclassified Oscillibacter (CAG00985) Phenotype MGS -0.098259 -0.582586 -0.680845 0.080 ... 0.000 FASTLIPLDL_C TaxonAdjustedID.MGS.down.10000000.v6.938 -0.195988 0.001809 -0.162244 0.010033 Host Bact Opposite

5769 rows × 21 columns

In [14]:
mediation_df_filtered_combined_filtered_features_subset = \
    mediation_df_filtered_combined_filtered_features_subset[
    (mediation_df_filtered_combined_filtered_features_subset['Effector']==ploteffectors[0]) |
    (mediation_df_filtered_combined_filtered_features_subset['Effector']==ploteffectors[1]) |
    (mediation_df_filtered_combined_filtered_features_subset['Effector']==ploteffectors[2])].copy()
In [15]:
feature_set = 'T2D (3)'
mediation_df_filtered_combined_filtered_features_subset = \
    mediation_df_filtered_combined_filtered_features_subset[
    mediation_df_filtered_combined_filtered_features_subset['Sample set']==feature_set].copy()
In [16]:
mediation_df_filtered_combined_filtered_features_subset = \
    mediation_df_filtered_combined_filtered_features_subset[
    mediation_df_filtered_combined_filtered_features_subset['ACME (average)_P-value']<=0.1].copy()

Select feature pairs that are present in drug combo or single drugs feature matrices

In [17]:
newcolumn_names = ['Feature1_drug1','Feature1_drug2',
                   'Feature1_drug1effect','Feature1_drug2effect',
                   'Feature1_drugcombo',
                   'Feature2_drug1','Feature2_drug2',
                   'Feature2_drug1effect','Feature2_drug2effect',
                   'Feature2_drugcombo']
for col in newcolumn_names:
    mediation_df_filtered_combined_filtered_features_subset[col] = np.zeros(
        [len(mediation_df_filtered_combined_filtered_features_subset),1])
In [18]:
# prepare a subset of features to record effect of each drug on each feature
mediation_df_filtered_combined_filtered_features_subset = \
    mediation_df_filtered_combined_filtered_features_subset.reset_index()
In [19]:
# populate the dataframe with information on which drugs are associated with each feature
for i in range(len(mediation_df_filtered_combined_filtered_features_subset)):
    curfeatures = [mediation_df_filtered_combined_filtered_features_subset['Feature1'].iloc[i],
                   mediation_df_filtered_combined_filtered_features_subset['Feature2_med'].iloc[i]]
    curspaces = [mediation_df_filtered_combined_filtered_features_subset['FeatureSpace1'].iloc[i],
                   mediation_df_filtered_combined_filtered_features_subset['FeatureSpace2'].iloc[i]]
    cureffector = mediation_df_filtered_combined_filtered_features_subset['Effector'].iloc[i]

    cureffector_drugs = cureffector.replace('Combination: ', '')
    cureffector_drugs = cureffector_drugs.split(', ')

    for curfeat_i in range(len(curfeatures)):
            for curdrug_i in range(len(cureffector_drugs)):
                cureffect_df = selectedEffectd_df[
                    (selectedEffectd_df['Feature display name']==curfeatures[curfeat_i]) &
                    (selectedEffectd_df['Feature space']==curspaces[curfeat_i]) &
                    (selectedEffectd_df['Effector']==cureffector_drugs[curdrug_i])]
                if len(cureffect_df)>0:
                    mediation_df_filtered_combined_filtered_features_subset.loc[i,
                        'Feature'+str(curfeat_i+1)+'_drug'+str(curdrug_i+1)] = \
                    cureffector_drugs[curdrug_i]
                    mediation_df_filtered_combined_filtered_features_subset.loc[i,
                        'Feature'+str(curfeat_i+1)+'_drug'+str(curdrug_i+1)+'effect'] = \
                    cureffect_df['Effect size'].values[0]
            cureffect_df = selectedEffectd_df[
                    (selectedEffectd_df['Feature display name']==curfeatures[curfeat_i]) &
                    (selectedEffectd_df['Feature space']==curspaces[curfeat_i]) &
                    (selectedEffectd_df['Effector'].str.find(cureffector_drugs[0])>=0) &
                    (selectedEffectd_df['Effector'].str.find(cureffector_drugs[1])>=0)]
            if len(cureffect_df)>0:
                mediation_df_filtered_combined_filtered_features_subset.loc[i,
                    'Feature'+str(curfeat_i+1)+'_drugcombo'] = \
                cureffect_df['Effect size'].values[0]
In [20]:
# for plotting, select only mediated features that are both affected by the drug combination
plotdata_both = mediation_df_filtered_combined_filtered_features_subset[
   (mediation_df_filtered_combined_filtered_features_subset['Feature1_drugcombo']!=0) &
   (mediation_df_filtered_combined_filtered_features_subset['Feature2_drugcombo']!=0)].copy()
In [21]:
# select only features that pass correlation p-value threshold of 0.1
plotdata_both_corr = plotdata_both[plotdata_both['FeatureFeatureCorrP']<=0.1].copy()

Save mediator features to graph

In [22]:
plotdata = plotdata_both_corr.copy()
In [23]:
# represent the data as graph nodes (drugs and features)
# connected by edges (effect sizes and correlations)
graphnodes = []
graphsources = []
graphedges = []
graphedgesP = []
nodesnames = []
nodestypes=[]
edgetype=[]
for i in range(len(plotdata)):
    cureffector = plotdata['Effector'].iloc[i]
    cureffector_drugs = cureffector.replace('Combination: ', '')
    cureffector_drugs = cureffector_drugs.split(', ')
    
    # feature 1 drug 1
    graphnodes.append(cureffector_drugs[0])
    graphsources.append(plotdata['Feature1'].iloc[i])
    graphedges.append(plotdata['Feature1_drugcombo'].iloc[i])
    edgetype.append('drug_feat')
    # feature 1 drug 2
    graphnodes.append(cureffector_drugs[1])
    graphsources.append(plotdata['Feature1'].iloc[i])
    graphedges.append(plotdata['Feature1_drugcombo'].iloc[i])
    edgetype.append('drug_feat')
    # feature 2 drug 1
    graphnodes.append(cureffector_drugs[0])
    graphsources.append(plotdata['Feature2_med'].iloc[i])
    graphedges.append(plotdata['Feature2_drugcombo'].iloc[i])
    edgetype.append('drug_feat')
    # feature 2 drug 2
    graphnodes.append(cureffector_drugs[1])
    graphsources.append(plotdata['Feature2_med'].iloc[i])
    graphedges.append(plotdata['Feature2_drugcombo'].iloc[i])
    edgetype.append('drug_feat')
    # feature 1 feature 2
    graphnodes.append(plotdata['Feature1'].iloc[i])
    graphsources.append(plotdata['Feature2_med'].iloc[i])
    graphedges.append(plotdata['FeatureFeatureCorr'].iloc[i])
    graphedgesP.append(plotdata['FeatureFeatureCorrP'].iloc[i])
    edgetype.append('feat_feat')
    # nodes info
    nodesnames.append(plotdata['Feature1'].iloc[i])
    nodestypes.append('Feature')
    nodesnames.append(plotdata['Feature2_med'].iloc[i])
    nodestypes.append('Feature')
    nodesnames.append(cureffector_drugs[0])
    nodestypes.append('Drug')
    nodesnames.append(cureffector_drugs[1])
    nodestypes.append('Drug')
In [24]:
# make a dataframe of the graph nodes
graph_df = {'Node1': graphnodes,
            'Node2': graphsources,
            'EdgeValue': graphedges,
            'EdgeType': edgetype}

graph_df = pd.DataFrame(graph_df)

graph_df = graph_df.drop_duplicates()
In [25]:
graph_df
Out[25]:
Node1 Node2 EdgeValue EdgeType
0 Calcium antagonist intake Alistipes sp. (motu_linkage_group_621) 0.161954 drug_feat
1 Statin intake Alistipes sp. (motu_linkage_group_621) 0.161954 drug_feat
2 Calcium antagonist intake VLDL-3 Cholesterol mg/dL -0.176885 drug_feat
3 Statin intake VLDL-3 Cholesterol mg/dL -0.176885 drug_feat
4 Alistipes sp. (motu_linkage_group_621) VLDL-3 Cholesterol mg/dL -0.073258 feat_feat
9 VLDL-3 Cholesterol mg/dL Alistipes sp. (motu_linkage_group_621) -0.073258 feat_feat
10 Calcium antagonist intake VLDL Cholesterol mg/dL -0.179195 drug_feat
11 Statin intake VLDL Cholesterol mg/dL -0.179195 drug_feat
14 VLDL Cholesterol mg/dL Alistipes sp. (motu_linkage_group_621) -0.069984 feat_feat
15 Metformin intake methanogenesis (trimethylamine degradation) (M... 0.151456 drug_feat
16 Statin intake methanogenesis (trimethylamine degradation) (M... 0.151456 drug_feat
17 Metformin intake IDL Phospholipids mg/dL -0.177343 drug_feat
18 Statin intake IDL Phospholipids mg/dL -0.177343 drug_feat
19 methanogenesis (trimethylamine degradation) (M... IDL Phospholipids mg/dL -0.079177 feat_feat
24 IDL Phospholipids mg/dL methanogenesis (trimethylamine degradation) (M... -0.079177 feat_feat
25 Metformin intake Xylene degradation, xylene => methylbenzoate (... 0.163755 drug_feat
26 Statin intake Xylene degradation, xylene => methylbenzoate (... 0.163755 drug_feat
29 Xylene degradation, xylene => methylbenzoate (... IDL Phospholipids mg/dL -0.087335 feat_feat
34 IDL Phospholipids mg/dL Xylene degradation, xylene => methylbenzoate (... -0.087335 feat_feat
35 Metformin intake Toluene degradation, toluene => benzoate (M00538) 0.163755 drug_feat
36 Statin intake Toluene degradation, toluene => benzoate (M00538) 0.163755 drug_feat
39 Toluene degradation, toluene => benzoate (M00538) IDL Phospholipids mg/dL -0.087335 feat_feat
44 IDL Phospholipids mg/dL Toluene degradation, toluene => benzoate (M00538) -0.087335 feat_feat
45 Statin intake IDL Phospholipids mg/dL -0.313802 drug_feat
46 Aspirine intake IDL Phospholipids mg/dL -0.313802 drug_feat
47 Statin intake glutamate degradation III (MF0032) 0.291080 drug_feat
48 Aspirine intake glutamate degradation III (MF0032) 0.291080 drug_feat
49 IDL Phospholipids mg/dL glutamate degradation III (MF0032) -0.090746 feat_feat
50 Statin intake observed species richness 0.227687 drug_feat
51 Aspirine intake observed species richness 0.227687 drug_feat
52 Statin intake IDL Apo-B mg/dL -0.296421 drug_feat
53 Aspirine intake IDL Apo-B mg/dL -0.296421 drug_feat
54 observed species richness IDL Apo-B mg/dL -0.093991 feat_feat
59 IDL Apo-B mg/dL observed species richness -0.093991 feat_feat
64 observed species richness IDL Phospholipids mg/dL -0.094625 feat_feat
69 IDL Phospholipids mg/dL observed species richness -0.094625 feat_feat
70 Statin intake unclassified Oscillibacter (CAG00961) 0.211984 drug_feat
71 Aspirine intake unclassified Oscillibacter (CAG00961) 0.211984 drug_feat
72 Statin intake Fat mass in % -0.344896 drug_feat
73 Aspirine intake Fat mass in % -0.344896 drug_feat
74 unclassified Oscillibacter (CAG00961) Fat mass in % -0.141452 feat_feat
79 Fat mass in % unclassified Oscillibacter (CAG00961) -0.141452 feat_feat
In [26]:
# UNCOMMENT TO PRINT GRAPH EDGES TO FILE
#graph_df.to_csv('fig2e_mediation_graph_edges.tsv', sep='\t')
In [27]:
# make a dataframe of node types
nodetype_df = pd.DataFrame({'Node': nodesnames, 'Type': nodestypes})
nodetype_df = nodetype_df.drop_duplicates()
In [28]:
nodetype_df
Out[28]:
Node Type
0 Alistipes sp. (motu_linkage_group_621) Feature
1 VLDL-3 Cholesterol mg/dL Feature
2 Calcium antagonist intake Drug
3 Statin intake Drug
8 VLDL Cholesterol mg/dL Feature
12 methanogenesis (trimethylamine degradation) (M... Feature
13 IDL Phospholipids mg/dL Feature
14 Metformin intake Drug
20 Xylene degradation, xylene => methylbenzoate (... Feature
28 Toluene degradation, toluene => benzoate (M00538) Feature
37 glutamate degradation III (MF0032) Feature
39 Aspirine intake Drug
40 observed species richness Feature
41 IDL Apo-B mg/dL Feature
56 unclassified Oscillibacter (CAG00961) Feature
57 Fat mass in % Feature
In [29]:
# UNCOMMENT TO PRINT NODE TYPES TO FILE
#nodetype_df.to_csv('fig2e_mediation_graph_node_types.tsv', sep='\t')
In [30]:
import pkg_resources
import sys

#print package versions
print('Sesssion info:')
print('Python: ', sys.version)
print('numpy: ', pkg_resources.get_distribution('numpy').version)
print('pandas: ', pkg_resources.get_distribution('pandas').version)
print('matplotlib: ', pkg_resources.get_distribution('matplotlib').version)
print('seaborn: ', pkg_resources.get_distribution('seaborn').version)
Sesssion info:
Python:  3.7.7 (default, Apr 15 2020, 05:09:04) [MSC v.1916 64 bit (AMD64)]
numpy:  1.18.1
pandas:  1.0.3
matplotlib:  3.1.3
seaborn:  0.10.0