This README file was generated on 2022-10-28 by Mengyang Fan.


GENERAL INFORMATION

1. Title of Dataset: Data from: Covalent disruptor of YAP-TEAD association suppresses defective Hippo signaling.

2. Corresponding Author Information

    Corresponding Investigator 1
        Name: Prof Nathanael Gray
        Institution: Department of Chemical and Systems Biology, ChEM-H, Stanford Cancer Institute, School of Medicine, Stanford University, Stanford, CA 94305, USA
        Email: nsgray01@stanford.edu

    Corresponding Investigator 2
        Name: Dr Tinghu Zhang
        Institution: Department of Chemical and Systems Biology, ChEM-H, Stanford Cancer Institute, School of Medicine, Stanford University, Stanford, CA 94305, USA
        Email: ztinghu8@stanford.edu

3. Date of data collection: 2019-2021

4. Geographic location of data collection: Boston, USA


DATA & FILE OVERVIEW

1. Description of dataset

These data were generated to characterize the TEAD inhibitor MYF-03-69.

2. File List:

File 1 Name: Supplementary_Dataset_1._Proteome-wide_selectivity_profile_of_MYF-03-69_on_cysteines_labeling_using_SLC-ABPP_approach.xlsx
File 1 Description: Competition of MYF-03-69 with DBIA on proteome-wide cysteine sites.
Dimensions: 12499 rows x 35 columns
Variables:
    Protein Id: the unique UniProtKB identification code for each protein
    gene_symbol: the corresponding gene name for each protein
    prot_description: a short description of each protein
    Site Position: the site or sites labeled by the cysteine-reactive desthiobiotin iodoacetamide (DBIA) probe
    sequence: the sequence of the enriched peptide, identified by mass spectrometry; the labeled cysteine is marked with a #
    Columns F through H: the relative intensity of the corresponding peptide in the mass spectrum from the DMSO group, in triplicate
    Columns I through K: the relative intensity of the corresponding peptide in the mass spectrum from the group treated with 0.5 µM MYF-03-69, in triplicate
    Columns L through N: the relative intensity of the corresponding peptide in the mass spectrum from the group treated with 2.0 µM MYF-03-69, in triplicate
    Columns O through Q: the relative intensity of the corresponding peptide in the mass spectrum from the group treated with 10 µM MYF-03-69, in triplicate
    Columns R through T: the relative intensity of the corresponding peptide in the mass spectrum from the group treated with 25 µM MYF-03-69, in triplicate
    DMSO Avg: the average relative intensity of the corresponding peptide in the mass spectrum from the DMSO group
    0.5uM Avg: the average relative intensity of the corresponding peptide in the mass spectrum from the group treated with 0.5 µM MYF-03-69
    2.0uM Avg: the average relative intensity of the corresponding peptide in the mass spectrum from the group treated with 2.0 µM MYF-03-69
    10uM Avg: the average relative intensity of the corresponding peptide in the mass spectrum from the group treated with 10 µM MYF-03-69
    25uM Avg: the average relative intensity of the corresponding peptide in the mass spectrum from the group treated with 25 µM MYF-03-69
    0.5uM CR: the competition ratio of the corresponding peptide for the group treated with 0.5 µM MYF-03-69, calculated as DMSO Avg/0.5uM Avg
    2.0uM CR: the competition ratio of the corresponding peptide for the group treated with 2.0 µM MYF-03-69, calculated as DMSO Avg/2.0uM Avg
    10uM CR: the competition ratio of the corresponding peptide for the group treated with 10 µM MYF-03-69, calculated as DMSO Avg/10uM Avg
    25uM CR: the competition ratio of the corresponding peptide for the group treated with 25 µM MYF-03-69, calculated as DMSO Avg/25uM Avg
    Gene + Site: the corresponding gene name and labeling site or sites for each protein
    CV DMSO: coefficient of variation for each identified peptide from the DMSO group
    CV 0.5: coefficient of variation for each identified peptide from the group treated with 0.5 µM MYF-03-69
    CV 2.0: coefficient of variation for each identified peptide from the group treated with 2.0 µM MYF-03-69
    CV 10: coefficient of variation for each identified peptide from the group treated with 10 µM MYF-03-69
    CV 25: coefficient of variation for each identified peptide from the group treated with 25 µM MYF-03-69

File 2 Name: Supplementary_Dataset_2._List_of_differentially_expressed_genes_under_MYF-03-69_treatments.xlsx
File 2 Description: The genes that were differentially expressed with statistical significance (Fold change > 1.5 and adjusted p value < 0.05) in RNA sequencing.
Dimensions: 339 rows x 1 column in the sheet "2uM Treatment", 99 rows x 1 column in the sheet "0.5uM Treatment", 1 row x 1 column in the sheet "0.1uM Treatment".
Variables: the single column in each sheet lists the genes that were differentially expressed with statistical significance (Fold change > 1.5 and adjusted p value < 0.05).
    The sheet "2uM Treatment" lists eligible genes from the group treated with 2.0 µM MYF-03-69.
    The sheet "0.5uM Treatment" lists eligible genes from the group treated with 0.5 µM MYF-03-69.
    The sheet "0.1uM Treatment" lists eligible genes from the group treated with 0.1 µM MYF-03-69.

File 3 Name: Supplementary_Dataset_3.xlsx
File 3 Description: Area under the curve (AUC) as a measurement of MYF-03-69's effect on cell viability in PRISM screening.
CERES scores of YAP1 and the TEADs from the CRISPR (Avana) Public 21Q1 dataset, downloaded from the DepMap portal.
Dimensions: 904 rows x 10 columns
Variables:
    Depmap_ID: the unique DepMap portal identification code for each cancer cell line
    Disease: the corresponding lineage of each cancer cell line
    Disease subtype: the corresponding cancer name of each cancer cell line
    Cell line: the name of each cancer cell line
    AUC: In the PRISM screen, cell viability was measured at 8 doses (3-fold dilutions from 10 μM MYF-03-69) and a dose-response curve was fitted for each cell line. The area under the curve (AUC) was calculated as a measure of the compound's effect on cell viability.
    YAP1 CRISPR (Avana) Public 21Q1: The dependency score of YAP1 for each cell line from the CRISPR (Avana) Public 21Q1 dataset (https://depmap.org/portal/download/). Cell lines with no reported score are marked #N/A.
    TEAD1 CRISPR (Avana) Public 21Q1: The dependency score of TEAD1 for each cell line from the CRISPR (Avana) Public 21Q1 dataset (https://depmap.org/portal/download/). Cell lines with no reported score are marked #N/A.
    TEAD2 CRISPR (Avana) Public 21Q1: The dependency score of TEAD2 for each cell line from the CRISPR (Avana) Public 21Q1 dataset (https://depmap.org/portal/download/). Cell lines with no reported score are marked #N/A.
    TEAD3 CRISPR (Avana) Public 21Q1: The dependency score of TEAD3 for each cell line from the CRISPR (Avana) Public 21Q1 dataset (https://depmap.org/portal/download/). Cell lines with no reported score are marked #N/A.
    TEAD4 CRISPR (Avana) Public 21Q1: The dependency score of TEAD4 for each cell line from the CRISPR (Avana) Public 21Q1 dataset (https://depmap.org/portal/download/). Cell lines with no reported score are marked #N/A.
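
Illustrative note on the AUC variable: the Python sketch below reconstructs the 8-point, 3-fold dose series starting at 10 μM and computes a normalized area under a viability curve. It is only a simplified illustration. The values in this dataset come from the PRISM R pipeline, which integrates a fitted dose-response curve, whereas this sketch applies the trapezoidal rule directly to the measured points on a log10 dose axis. The function names and the normalization by the log-dose span are assumptions made for the example.

```python
import math


def dose_series(top_um=10.0, n_points=8, fold=3.0):
    """Return doses in µM from highest to lowest: top, top/fold, top/fold^2, ..."""
    return [top_um / fold ** i for i in range(n_points)]


def normalized_auc(doses_um, viabilities):
    """Trapezoidal area under viability vs. log10(dose), divided by the
    log-dose span (assumed normalization), so that a constant viability v
    gives an AUC of v.
    """
    # Sort by dose so the x axis is increasing.
    pairs = sorted(zip(doses_um, viabilities))
    xs = [math.log10(dose) for dose, _ in pairs]
    ys = [viability for _, viability in pairs]
    area = sum(
        (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
        for i in range(len(xs) - 1)
    )
    return area / (xs[-1] - xs[0])


if __name__ == "__main__":
    doses = dose_series()
    # Hypothetical viabilities (fraction of vehicle control), highest dose first.
    example = [0.2, 0.3, 0.5, 0.7, 0.9, 1.0, 1.0, 1.0]
    print([round(d, 4) for d in doses])
    print(normalized_auc(doses, example))
```

Under this convention a lower AUC means a stronger effect on viability: a cell line that is unaffected at every dose has an AUC close to 1.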

File 4 Name: Supplementary_Dataset_4.xlsx
File 4 Description: Correlation analysis results of "Supplementary_Dataset_3". Pearson correlation coefficients and associated p-values from the correlation analysis between MYF-03-69's PRISM sensitivity (log2.AUC of each cell line) and the dependency on each gene (CRISPR knockout score for each cell line, from the DepMap Public 20Q4 Achilles_gene_effect.csv dataset) across the PRISM cell line panel. The q-values (corrected significance values accounting for the false discovery rate) are computed from the p-values using the Benjamini-Hochberg algorithm.
Dimensions: 49 rows x 4 columns
Variables:
    Gene: names of the top 48 genes whose dependency scores across the cancer cell lines correlate with PRISM sensitivity to MYF-03-69.
    Pearson correlation coefficient: Pearson correlation coefficient between dependency scores and PRISM sensitivity (AUC value) across the cancer cell lines for each gene.
    minus log q value: negative logarithm of the q value of each gene.
    q value (FDR): a corrected significance value accounting for the false discovery rate for each gene.


METHODOLOGICAL INFORMATION

For "Supplementary_Dataset_1._Proteome-wide_selectivity_profile_of_MYF-03-69_on_cysteines_labeling_using_SLC-ABPP_approach", the data were collected on NCI-H226 cells using the methods reported in the reference paper below.

Kuljanin, M.; Mitchell, D. C.; Schweppe, D. K.; Gikandi, A. S.; Nusinow, D. P.; Bulloch, N. J.; Vinogradova, E. V.; Wilson, D. L.; Kool, E. T.; Mancias, J. D.; Cravatt, B. F.; Gygi, S. P., Reimagining high-throughput profiling of reactive cysteines for cell-based screening of large electrophile libraries. Nature Biotechnology 2021, 39, 630-641.

The competition ratio (CR) was also calculated as described in the reference paper.

For "Supplementary_Dataset_2._List_of_differentially_expressed_genes_under_MYF-03-69_treatments", the data were collected on NCI-H226 cells treated with MYF-03-69 at the indicated concentrations for 6 hours (n=3).
The RNA was extracted using the RNeasy Plus Mini Kit (Qiagen, cat no. 74134) according to the manufacturer's instructions. Libraries were then prepared using Roche Kapa mRNA HyperPrep strand-specific sample preparation kits from 200 ng of purified total RNA according to the manufacturer’s protocol on a Beckman Coulter Biomek i7. The finished dsDNA libraries were quantified by Qubit fluorometer and Agilent TapeStation 4200. Uniquely dual-indexed libraries were pooled in an equimolar ratio and shallowly sequenced on an Illumina MiSeq to further evaluate library quality and pool balance. The final pool was sequenced on an Illumina NovaSeq 6000 targeting 40 million 100 bp read pairs per library at the Dana-Farber Cancer Institute Molecular Biology Core Facilities. Sequenced reads were aligned to the UCSC hg19 reference genome assembly and gene counts were quantified using STAR (v2.7.3a). RNA sequencing data have been deposited in the BioSample database under accession codes SAMN19288936, SAMN19288937, SAMN19288938, SAMN19288939, SAMN19288940, SAMN19288941, SAMN19288942, SAMN19288943, SAMN19288944, SAMN19288945 and SAMN19288946. Differential gene expression testing was performed with DESeq2 (v1.22.1). RNA-seq analysis was performed using the VIPER snakemake pipeline. KEGG pathway enrichment analysis was performed through the Metascape web portal.

For "Supplementary_Dataset_3", the data were collected using the methods reported in the reference paper below.

Corsello, S. M.; Nagari, R. T.; Spangler, R. D.; Rossen, J.; Kocak, M.; Bryan, J. G.; Humeidi, R.; Peck, D.; Wu, X.; Tang, A. A.; Wang, V. M.; Bender, S. A.; Lemire, E.; Narayan, R.; Montgomery, P.; Ben-David, U.; Garvie, C. W.; Chen, Y.; Rees, M. G.; Lyons, N. J.; McFarland, J. M.; Wong, B. T.; Wang, L.; Dumont, N.; O’Hearn, P. J.; Stefan, E.; Doench, J. G.; Harrington, C. N.; Greulich, H.; Meyerson, M.; Vazquez, F.; Subramanian, A.; Roth, J. A.; Bittker, J. A.; Boehm, J. S.; Mader, C. C.; Tsherniak, A.; Golub, T. R., Discovering the anticancer potential of non-oncology drugs by systematic viability profiling. Nature Cancer 2020, 1 (2), 235-248.

Briefly, up to 931 barcoded cell lines in pools of 20-25 were thawed and plated into 384-well plates (1250 cells/well for adherent cell pools, 2000 cells/well for suspension or mixed suspension/adherent cell pools) containing compound (top concentration: 10 µM, 8-point, threefold dilution). All conditions were tested in triplicate. Cells were lysed after 5 days of treatment, and mRNA-based Luminex detection of barcode abundance from lysates was carried out as in the reference paper above. Luminex median fluorescence intensity (MFI) data were input to a standardized R pipeline (https://github.com/broadinstitute/prism_data_processing) to generate viability estimates relative to vehicle treatment for each cell line and treatment condition, and to fit dose-response curves from the viability data. CERES scores of YAP1 and the TEADs from the CRISPR (Avana) Public 21Q1 dataset were downloaded from the DepMap portal (https://depmap.org/portal/download/) and listed alongside the viability data.

For "Supplementary_Dataset_4", the data are the results of a correlation analysis of "Supplementary_Dataset_3", performed with the R pipeline mentioned above (https://github.com/broadinstitute/prism_data_processing).
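
Illustrative note on the correlation analysis: the analysis behind "Supplementary_Dataset_4" was run in the R pipeline linked above. The Python sketch below is not that pipeline. It only shows, under stated assumptions, the two calculations the file description names: a Pearson correlation between log2(AUC) and a gene's dependency scores, and Benjamini-Hochberg q-values computed from a list of p-values. The function names are hypothetical, cell lines with a missing score (#N/A in the spreadsheet, None here) are assumed to be dropped pairwise, and the computation of the p-values themselves is omitted.

```python
import math


def pearson_log2_auc(auc_values, dependency_scores):
    """Pearson r between log2(AUC) and dependency scores.

    Pairs where the dependency score is None (e.g. #N/A in the spreadsheet)
    are skipped; this pairwise dropping is an assumption of the sketch.
    """
    pairs = [
        (math.log2(auc), score)
        for auc, score in zip(auc_values, dependency_scores)
        if score is not None
    ]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


if __name__ == "__main__":
    # Hypothetical values for four cell lines; one score is missing.
    auc = [0.5, 1.0, 2.0, 4.0]
    scores = [-1.0, None, 1.0, 2.0]
    print(pearson_log2_auc(auc, scores))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

With this sign convention, a positive coefficient means that cell lines more dependent on a gene (more negative score) tend to have a lower AUC, i.e. to be more sensitive to the compound.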