Enhanced Protein Isoform Characterization Through Long-Read Proteogenomics - Jurkat Samples and Reference Data
Creators
- 1. University of Wisconsin - Madison
- 2. University of Virginia
- 3. Lifebit Biotech Ltd.
- 4. University of Zurich
- 5. University of Florida
- 6. Science and Technology Consulting LLC
Description
The detection of physiologically relevant protein isoforms encoded by the human genome is critical to biomedicine. Mass spectrometry (MS)-based proteomics is the preeminent method for protein detection, but isoform-resolved proteomic analysis relies on accurate reference databases that match the sample; neither a subset nor a superset database is ideal. Long-read RNA sequencing (e.g., PacBio, Oxford Nanopore) provides full-length transcript sequencing, which can be used to predict full-length proteins. Here, we describe a long-read proteogenomics approach for integrating matched long-read RNA-seq and MS-based proteomics data to enhance isoform characterization. We introduce a classification scheme for protein isoforms, discover novel protein isoforms, and present the first protein inference algorithm that directly incorporates long-read transcriptome data, enabling detection of protein isoforms that are otherwise intractable to MS detection. We have released an open-source Nextflow pipeline that integrates long-read RNA sequencing into a proteomic workflow for isoform-resolved analysis.
Companion Repositories:
- Long-Read-Proteogenomics Workflow GitHub Repository Release
- Long-Read-Proteogenomics Analysis GitHub Repository Release
Companion Datasets
- TEST Data for Long-Read-Proteogenomics Workflow GitHub Actions
- Long-Read-Proteogenomics Workflow Results using Jurkat Sample data
This repository contains the Jurkat samples and reference data.
Files (49.6 GB)
Previewed file: jurkat_classification.txt
Checksum | Size |
---|---|
md5:3e7e167cf2a1756280a12e2c731613de | 1.4 GB |
md5:06386647ccb0e9942208a659ca761ee1 | 125.0 kB |
md5:d6bfd335a049ce7173ba7366dc0d48bc | 3.1 MB |
md5:55eb7d15f2b68b460a6b784b6baf9306 | 57.4 MB |
md5:423b8fdf5e45c857d12411ede6e008c0 | 71.2 MB |
md5:19fce83d8361f6e0116500ddb723c3d0 | 188.9 MB |
md5:0f3a0a1525ece57a15ad053674c88c1f | 362.6 kB |
md5:4358213de663b01e20c606d0b772c2aa | 7.4 GB |
md5:2d369ab988f06df9af7b1250e2751219 | 3.6 GB |
md5:86657ef4692edee4029a59a196dcca36 | 3.8 GB |
md5:0b6ec27c462889cb8854e129c3420441 | 1.4 MB |
md5:222d467d8ef8d532be30b29b25472740 | 5.9 GB |
md5:1ef7d3d031b223776fca759f1e16df2e | 70 Bytes |
md5:1d6945c27e7207ff3a74039c7b92b3e2 | 2.2 GB |
md5:7450045ac7d583dea6345143e0826a14 | 24.9 GB |
md5:64aebe205d4ef6b1f33a50cd22ecbef9 | 2.5 kB |
md5:0dbedc2a724f50b4a19b6a7a625f3c2b | 9.2 MB |
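Because several of these files are multiple gigabytes, it may be worth checking a download against its listed MD5 checksum before using it. The following is a minimal sketch, not part of the released pipeline: the function names `md5_of_file` and `matches_listing` are hypothetical, and the file is read in chunks so that the larger archives need not fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a local file against a listed value such as 'md5:3e7e16...'.

    The optional 'md5:' prefix used in the file listing is stripped
    before the comparison, which is case-insensitive.
    """
    expected = listed_checksum.strip().split(":", 1)[-1].lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing(local_path, "md5:3e7e167cf2a1756280a12e2c731613de")` would return `True` only if the file at `local_path` (a placeholder for wherever the first listed file was saved) downloaded intact.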