Single-cell datasets for "Cell cycle plasticity underlies fractional resistance to palbociclib in ER+/HER2- breast tumor cells"
Description
The following files are included in this upload.
tumor_preprocessed.h5ad: Full primary tumor dataset post-feature selection and standardization across three treatment conditions (0, 10, and 100 nM palbociclib). AnnData object format. 14 cell cycle features, phase labels and other cell metadata, and two PHATE dimensions for manifold visualization.
T47D_preprocessed.h5ad: Full main-text T47D dataset post-feature selection and standardization across three treatment conditions (0, 10, and 100 nM palbociclib). AnnData object format. 14 cell cycle features, phase labels and other cell metadata, and two PHATE dimensions for manifold visualization.
sketched_integrated.h5ad: After downsampling to 6,000 cells (2,000 per condition) from each of T47D_preprocessed and tumor_preprocessed, the two datasets were integrated into one joint latent space using TRANSACT. The data also include the consensus component columns ('0', ..., '13'). AnnData object.
sketched_integrated_df.csv: sketched_integrated.h5ad in .csv format.
T47D_replicate_preprocessed: Replicate T47D experimental dataset for supplementary analysis, post-feature selection and standardization across three treatment conditions (0, 10, and 100 nM palbociclib). 15 cell cycle features (the same 14 plus CDK6).
sketched_rep.h5ad: Representative downsample of T47D_replicate_preprocessed, selecting 6,000 cells (2,000 for each of the three treatment conditions) using kernel herding sketching. AnnData object.
sketched_rep_df.csv: Same data as sketched_rep.h5ad in csv format.
T47D_triplicate_preprocessed.h5ad: T47D biological replicate sample collected in triplicate (three wells per condition for 0, 10, and 100 nM palbociclib). Wells were joined and the data were sketched down to 20,000 cells per condition.
T47D_triplicate_preprocessed.csv: T47D triplicate in .csv form.
tumor_2_preprocessed.h5ad: An additional tumor sample from a new patient with the same treatment conditions of palbociclib. Sketched down to 2,000 cells per condition.
tumor_2_preprocessed.csv: The additional tumor sample in .csv form.
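The per-condition cell counts stated above (for example, 2,000 cells per condition in the sketched files) can be checked directly from the .csv versions. The following is a minimal sketch using only the Python standard library. It assumes each .csv has a header row and one row per cell; the name of the column that holds the treatment condition is not given in this description, so it is passed in as a parameter and should be treated as hypothetical.

```python
import csv
from collections import Counter


def count_cells_per_condition(csv_file, condition_column):
    """Count rows (cells) for each distinct value of a condition column.

    csv_file: an open text file (or any iterable of lines) with a header row.
    condition_column: name of the column holding the treatment condition.
        The actual column name in the uploaded files is an assumption here.
    """
    reader = csv.DictReader(csv_file)
    # Accessing fieldnames reads the header row; it is None for an empty file.
    if condition_column not in (reader.fieldnames or []):
        raise KeyError(f"column {condition_column!r} not found in header")
    # Values are kept as the raw strings found in the file.
    return Counter(row[condition_column] for row in reader)
```

To use it, open one of the .csv files with `open(path, newline="")` and pass the file object together with the condition column name found in the header. The .h5ad files can be opened with `anndata.read_h5ad` from the anndata package, where per-cell annotations are typically stored in `.obs`.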
Further description of sketched_integrated: This is the joint dataset of T47D cells and primary tumor cells resected from a consented patient, after subsampling with kernel herding sketching. The samples were imaged using iterative indirect immunofluorescence imaging (4i) to obtain proteomic measurements at the single-cell level. The T47D and tumor samples were gathered, cultured, and imaged separately. Each sample was treated with three conditions of the CDK4/6 inhibitor palbociclib (control, 10 nM, and 100 nM). We then used kernel herding sketching to representatively downsample each dataset, selecting 2,000 cells from each of the three treatment conditions (6,000 cells from each of the two sources), and used an integration method called TRANSACT to integrate the two datasets into one shared latent space. The dataset here consists of these 12,000 cells.
The columns ('0', '1', ..., '13') are the principal vectors of the joint latent space. They are followed by the columns of standardized proteomic measurements of different cell cycle effectors and by biological annotations of interest. Standardization was done for each data source separately. Well refers to the treatment condition. 'prb_ratio' marks whether a cell is still proliferating or arrested, found by selecting the upper mode of the pRB/RB values. 'phase' holds the cell cycle phase labels found by unsupervised clustering on a handful of known cell cycle markers.
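When working with sketched_integrated_df.csv, it can help to separate the latent-space columns from the measurement and annotation columns. The sketch below does this from the header alone, using only the Python standard library. It assumes the latent columns are named exactly '0' through '13' as described above; the helper names `read_header` and `split_columns` are illustrative and not part of the dataset.

```python
import csv


def read_header(csv_file):
    """Return the list of column names from the first row of a CSV file."""
    return next(csv.reader(csv_file))


def split_columns(header, n_components=14):
    """Separate latent component columns from all remaining columns.

    Latent columns are assumed to be named '0', '1', ..., str(n_components - 1).
    The original column order is preserved within each group.
    """
    component_names = {str(i) for i in range(n_components)}
    components = [name for name in header if name in component_names]
    others = [name for name in header if name not in component_names]
    return components, others
```

The second list returned by `split_columns` would then contain the standardized protein measurements and annotations such as 'phase' and 'prb_ratio', plus any index column the file may carry.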