📄 Original Title
Identification and Classification of Expressed Orphan Genes, Spurious Orphan Genes, and Conserved Genes in the Human Gut Microbiome
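🔗 Original Link
💡 AI Key Takeaways
The study combines metatranscriptomic data with machine learning in a novel way to distinguish expressed orphan genes from spurious ones. Through large-scale analysis of nearly 5,000 metatranscriptome libraries, it reveals systematic differences between expressed orphan genes and conserved genes in sequence composition, structural constraints, and evolutionary signals, and it develops a SHAP-based method for interpreting the biological signals.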
📝 Original English Abstract
Orphan genes - genes lacking detectable homologs outside a species - are widespread in microbial genomes and are thought to contribute to their adaptation and molecular innovation. However, not all predicted orphan genes may represent novel functional coding sequences. False positive orphan genes, also called spurious orphan genes, can arise from gene prediction errors. We reason that orphan genes lacking detectable expression are more likely to be spurious. To this end, we combined large-scale metatranscriptomic profiling of the human gut microbiome with machine learning to distinguish expressed orphan genes from spurious ones and to compare them with conserved genes found in multiple species. Using nearly 5,000 metatranscriptome libraries, we identified ~218,000 orphan genes supported by expression evidence, while ~330,000 predicted orphan genes lacked detectable expression, and were classified as spurious. We extracted 154 sequence, structural, and evolutionary features for each gene and trained XGBoost classifiers while accounting for genomic representation. The models achieved an area under the receiver operating characteristic curve (AUC) of 0.82 in distinguishing expressed orphan genes from spurious orphan genes and 0.93 in distinguishing expressed orphan genes from conserved genes. SHAP-based interpretation revealed clear biological signals. E.g., expressed orphans were present in more genomes than spurious ones and expressed orphan genes were shorter than conserved genes. This work improves orphan gene discovery and suggests that expressed orphan genes differ systematically from conserved genes and spurious orphan genes in sequence composition, structural constraints, and evolutionary signals.
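The abstract reports classifier quality as AUC (0.82 and 0.93). As a reminder of what that number measures, here is a minimal sketch that computes ROC AUC directly from its pairwise definition: the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the paper, and this is not the authors' XGBoost pipeline.

```python
def roc_auc(labels, scores):
    """ROC AUC from its pairwise definition (ties count as 0.5).

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of classifier scores, higher = more likely positive.
    Hypothetical helper for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = "expressed orphan", 0 = "spurious orphan".
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

In the toy data, eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.82 and 0.93 should be read. The double loop is quadratic in the number of examples; at the scale described in the abstract, a rank-based implementation from an established library would be the practical choice.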
- Author: NotionNext
- Link: https://tangly1024.com/article/2ec48bd6-1f96-81cf-9658-fec5c1ef6f1b
- License: This article is licensed under CC BY-NC-SA 4.0. Please credit the source when republishing.
