category
NAR
date
Feb 17, 2026
slug
status
Published
summary
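Presents the GeMoSeq algorithm, which combines combinatorial enumeration of candidate transcripts with a heuristic splitting strategy and likelihood-based quantification to reconstruct transcripts; coding sequence prediction is built into the algorithm, and combining it with the homology-based method GeMoMa improves genome annotation.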
tags
Sequencing technology
Proteomics
type
Post

📄 Original Title

Improved reconstruction of transcripts and coding sequences from RNA-seq data

🔗 Original Link

💡 AI Key Takeaways

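Presents the GeMoSeq algorithm, which combines combinatorial enumeration of candidate transcripts with a heuristic splitting strategy and likelihood-based quantification to reconstruct transcripts; coding sequence prediction is built into the algorithm, and combining it with the homology-based method GeMoMa improves genome annotation.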

📝 Original English Abstract

Annotation of genes and transcripts is a key prerequisite for understanding the information that is encoded in newly sequenced genomes. One source of information suited for this purpose is RNA-seq data mapped to the respective genome sequence. RNA-seq-based approaches for transcript reconstruction generate transcript models from these data by combining regions of contiguous coverage (exons) and split read mappings (introns). Understanding phenotypes as a consequence of proteins encoded in a genome further requires the annotation of coding sequences within transcript models. We present GeMoSeq, a novel approach for transcript reconstruction from RNA-seq data that combines combinatorial enumeration of candidate transcripts with heuristics for splitting candidate transcripts into regions of contiguous coverage and subsequent likelihood-based quantification. Prediction of coding sequences is an integral part of the GeMoSeq algorithm. We benchmark GeMoSeq against previous approaches using a large collection of public RNA-seq data for seven species. For the majority of species, we observe an improved prediction performance of GeMoSeq, especially on the level of coding sequences and for species with dense genomes. We combine GeMoSeq with the homology-based approach GeMoMa to re-annotate two recently sequenced genomes of Nicotiana benthamiana lab strains, which illustrates the main purpose of GeMoSeq: the initial annotation of newly sequenced genomes with protein-coding genes.
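
As a rough illustration of what "combining exons and introns" and "combinatorial enumeration of candidate transcripts" can mean, the sketch below enumerates exon chains in a toy splice graph. It is not GeMoSeq's implementation: the data layout, the function name `enumerate_transcripts`, and all coordinates are hypothetical. It also leaves out the heuristic splitting, the likelihood-based quantification, and the coding sequence prediction described in the abstract.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy representation (an assumption, not GeMoSeq's data model):
# an exon is (first_base, last_base) of a region of contiguous coverage,
# an intron is (first_base, last_base) of the gap spanned by split reads.
Exon = Tuple[int, int]
Intron = Tuple[int, int]


def enumerate_transcripts(exons: List[Exon], introns: List[Intron]) -> List[List[Exon]]:
    """Enumerate every maximal chain of exons linked by observed introns."""
    exons = sorted(set(exons))
    successors: Dict[Exon, List[Exon]] = {e: [] for e in exons}
    has_predecessor: Set[Exon] = set()

    # An intron links exon a to exon b when it starts right after a ends
    # and ends right before b starts.
    for i_start, i_end in sorted(set(introns)):
        for a in exons:
            if a[1] + 1 != i_start:
                continue
            for b in exons:
                if b[0] == i_end + 1:
                    successors[a].append(b)
                    has_predecessor.add(b)

    transcripts: List[List[Exon]] = []

    def extend(path: List[Exon]) -> None:
        # Depth-first extension; a path is complete when no intron leaves it.
        nxt = successors[path[-1]]
        if not nxt:
            transcripts.append(path)
            return
        for b in nxt:
            extend(path + [b])

    # Start a candidate transcript at every exon that no intron leads into.
    for e in exons:
        if e not in has_predecessor:
            extend([e])
    return transcripts


if __name__ == "__main__":
    toy_exons = [(100, 200), (300, 400), (500, 600)]
    # The third intron skips the middle exon.
    toy_introns = [(201, 299), (401, 499), (201, 499)]
    for candidate in enumerate_transcripts(toy_exons, toy_introns):
        print(candidate)
```

In this toy case the exon-skipping intron yields two candidates, one with all three exons and one without the middle exon. The number of such chains can grow quickly with alternative splicing, which is one reason a practical method needs additional steps to narrow down and quantify the candidates.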