A systematic benchmark of high-accuracy PacBio long-read RNA sequencing for transcript-level quantification
Authors/Creators
Description
Background: Assembling fragmented RNA-sequencing reads into complete transcripts is error-prone, particularly for genes with complex splicing, which creates ambiguity in transcript discovery and quantification. PacBio long-read RNA sequencing resolves transcripts with greater clarity than short-read technologies. PacBio Kinnex employs a cDNA concatenation approach that increases read yield by 8-fold on average relative to previous protocols. However, its quantitative performance has not yet been thoroughly evaluated at scale.
Results: Here, we benchmark the high-throughput PacBio Kinnex platform against Illumina short-read RNA-seq using matched, deeply sequenced datasets across a time course of endothelial cell differentiation. Compared to Illumina, Kinnex achieves comparable gene-level quantification and more accurate transcript discovery and transcript quantification. Although Illumina detects more transcripts overall, many of these reflect potentially unstable or ambiguous estimates in complex genes. Kinnex largely avoids these issues and produces more reliable differential transcript expression calls, despite a mild bias against short transcripts (shorter than 1.25 kb). After correcting the Illumina estimates for inferential variability, Kinnex and Illumina quantifications are highly concordant, demonstrating equivalent performance. We also benchmark long-read quantification tools and identify Oarfish as the most efficient for our Kinnex data.
Conclusions: Together, our results establish Kinnex as a reliable platform for full-length transcript quantification.
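
For readers who want a concrete picture of the platform comparison summarized in the Results, the following is a minimal sketch of one way concordance between two sets of transcript-level estimates could be measured, with transcripts split at the 1.25 kb length mentioned above. It is not the authors' pipeline: the Spearman rank correlation is an assumed choice of concordance metric, and the record fields (`length`, `kinnex`, `illumina`), the function names, and the toy values are all hypothetical.

```python
from statistics import mean

SHORT_CUTOFF_BP = 1250  # 1.25 kb length threshold mentioned in the Results


def ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation, computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def split_by_length(records, cutoff=SHORT_CUTOFF_BP):
    """Partition transcript records into (short, long) by annotated length."""
    short = [r for r in records if r["length"] < cutoff]
    long_ = [r for r in records if r["length"] >= cutoff]
    return short, long_


# Toy, made-up abundance estimates (not data from the study).
records = [
    {"id": "tx1", "length": 800, "kinnex": 2.0, "illumina": 9.0},
    {"id": "tx2", "length": 1100, "kinnex": 5.0, "illumina": 4.0},
    {"id": "tx3", "length": 1000, "kinnex": 1.0, "illumina": 6.0},
    {"id": "tx4", "length": 2400, "kinnex": 30.0, "illumina": 28.0},
    {"id": "tx5", "length": 3100, "kinnex": 12.0, "illumina": 15.0},
    {"id": "tx6", "length": 5200, "kinnex": 48.0, "illumina": 51.0},
]

short, long_ = split_by_length(records)
for label, group in (("short", short), ("long", long_)):
    rho = spearman([r["kinnex"] for r in group], [r["illumina"] for r in group])
    print(f"{label}: n={len(group)}, Spearman rho={rho:.2f}")
```

Stratifying by length in this way makes a length-dependent discrepancy visible as a lower correlation in the short group, whereas a single correlation over all transcripts could mask it.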