category
bioRxiv
date
Feb 12, 2026
slug
status
Published
summary
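Introduces the LRAA framework, which combines splice-graph structural modeling with an expectation-maximization algorithm to resolve read-assignment ambiguity in long-read data; supports three analysis modes (quantification-only, reference-guided, and reference-free); develops a new benchmarking strategy based on MORFs; and achieves isoform resolution at single-cell and single-nucleus level, including detection of disease-associated isoforms.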
tags
Sequencing technology
Single-cell sequencing
type
Post
📄 Original title
Accurate strand-specific long-read transcript isoform discovery and quantification at bulk, single-cell, and single-nucleus resolution
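🔗 Original link
💡 AI key takeaways
Introduces the LRAA framework, which combines splice-graph structural modeling with an expectation-maximization algorithm to resolve read-assignment ambiguity in long-read data; supports three analysis modes (quantification-only, reference-guided, and reference-free); develops a new benchmarking strategy based on MORFs; and achieves isoform resolution at single-cell and single-nucleus level, including detection of disease-associated isoforms.
📝 Original English abstract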
Recent advances in long-read transcriptome sequencing enable high-throughput profiling of full-length RNA isoforms in bulk, single-cell, and single-nucleus samples. However, long-read datasets typically contain a mixture of complete and partial transcripts, leading to pervasive ambiguity in read-to-isoform assignment and complicating accurate isoform identification and quantification, particularly in the absence of reliable reference annotations. These challenges are further amplified in single-cell and single-nucleus samples, where coverage is sparse and transcriptional heterogeneity is high. Here, we present the Long Read Alignment Assembler (LRAA), a unified and versatile computational framework for isoform identification and quantification from long-read RNA sequencing data across bulk, single-cell, and single-nucleus transcriptomic samples. LRAA combines splice-graph based structural modeling with expectation maximization based optimization to probabilistically resolve ambiguous read assignments and improve isoform abundance estimation. The framework supports quantification-only, reference-guided, and fully reference-free (de novo) modes of analysis within a single methodological paradigm. We benchmarked LRAA using both simulated and genuine long-read datasets spanning sequencing standards and whole transcriptomes. Central to this evaluation is a novel benchmarking strategy based on Multiplexed Overexpression of Regulatory Factors (MORFs), which provides biologically expressed, barcoded isoforms with unambiguous read-level ground truth. Across all benchmarks, including MORFs, synthetic spike-ins, and whole-transcriptome datasets, LRAA consistently outperformed state-of-the-art methods in isoform identification accuracy, sensitivity, and expression quantification. Finally, we demonstrate the biological utility of LRAA by resolving cell-type-specific isoform usage across peripheral blood immune cell populations and by detecting a pathogenic cryptic isoform of STMN2 with associated transcriptional changes in single-nucleus RNA-seq data from frontal cortex tissue of an individual with frontotemporal dementia (FTD). Together, these results establish LRAA as a robust and general solution for resolving transcript diversity in complex biological systems, from development to disease.
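The abstract describes using expectation maximization to resolve reads that are compatible with more than one isoform. The sketch below illustrates that general idea only; it is not LRAA's implementation. The function name `em_isoform_abundance` and the input format (one set of compatible isoform IDs per read) are assumptions made for illustration, and the model ignores splice-graph structure, read length, and alignment quality.

```python
from collections import defaultdict


def em_isoform_abundance(read_compat, n_iter=200, tol=1e-9):
    """Toy EM over read-to-isoform compatibility sets (hypothetical helper).

    read_compat: list of sets; each set holds the isoform IDs one read
    is compatible with. Returns a dict of relative abundances.
    """
    # Drop reads with no compatible isoform; they carry no information here.
    reads = [compat for compat in read_compat if compat]
    isoforms = sorted({iso for compat in reads for iso in compat})
    # Start from a uniform abundance estimate.
    theta = {iso: 1.0 / len(isoforms) for iso in isoforms}

    for _ in range(n_iter):
        expected = defaultdict(float)
        # E-step: split each read across its compatible isoforms
        # in proportion to the current abundance estimates.
        for compat in reads:
            total = sum(theta[iso] for iso in compat)
            for iso in compat:
                expected[iso] += theta[iso] / total
        # M-step: renormalize expected read counts into abundances.
        new_theta = {iso: expected[iso] / len(reads) for iso in isoforms}
        delta = max(abs(new_theta[iso] - theta[iso]) for iso in isoforms)
        theta = new_theta
        if delta < tol:
            break
    return theta


# Three reads unique to isoform A, one unique to B, two ambiguous.
example_reads = [{"A"}, {"A"}, {"A"}, {"B"}, {"A", "B"}, {"A", "B"}]
abundances = em_isoform_abundance(example_reads)
print(abundances)
```

In this toy input the uniquely assigned reads favor isoform A three to one, so the two ambiguous reads are split mostly toward A and the estimates settle near 0.75 for A and 0.25 for B.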
- Author: NotionNext
- Link: https://tangly1024.com/article/30648bd6-1f96-8150-8a73-d0de8cfa7b9f
- License: This article is licensed under CC BY-NC-SA 4.0. Please credit the source when reposting.
