category
bioRxiv
date
Mar 20, 2026
slug
status
Published
summary
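1. Introduces the first multimodal biological reasoning large language model (LLM) framework, integrating protein sequence, structure, domain, and interaction information; 2. Introduces the GO-GPT model for hierarchical prediction of GO terms; 3. Optimizes model performance through supervised fine-tuning and reinforcement learning; 4. Achieves 73.6% Fmax on GO term prediction and an LLM judge score of 8/10; 5. Localizes the exact contact residues resolved in cryo-EM structures and predicts experimentally confirmed binding partners.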
tags
Proteomics
type
Post
📄 Original Title
BioReason-Pro: Advancing Protein Function Prediction with Multimodal Biological Reasoning
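🔗 Original Link
💡 AI Key Takeaways
1. Introduces the first multimodal biological reasoning large language model (LLM) framework, integrating protein sequence, structure, domain, and interaction information; 2. Introduces the GO-GPT model for hierarchical prediction of GO terms; 3. Optimizes model performance through supervised fine-tuning and reinforcement learning; 4. Achieves 73.6% Fmax on GO term prediction and an LLM judge score of 8/10; 5. Localizes the exact contact residues resolved in cryo-EM structures and predicts experimentally confirmed binding partners.
📝 Original English Abstract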
Protein function annotation is fundamental to understanding biological mechanisms, designing therapeutics, and advancing biomedical research. Current computational methods either rely on shallow sequence similarity or treat function prediction as isolated classification tasks, failing to capture the integrative reasoning across sequence, structure, domains, and interactions that expert biologists perform to infer function. We introduce BioReason-Pro, the first multimodal reasoning large language model (LLM) for protein function prediction that integrates protein embeddings with biological context to generate structured reasoning traces. A key input into BioReason-Pro is the set of GO term predictions made by GO-GPT, our autoregressive transformer that captures hierarchical and cross-aspect dependencies of GO terms. BioReason-Pro is trained via supervised fine-tuning on synthetic reasoning traces generated by GPT-5 for over 130K proteins and further optimized through reinforcement learning. It achieves 73.6% Fmax on GO term prediction and an LLM judge score of 8/10 on functional summaries, substantially outperforming previous methods. Evaluations with human protein experts show that BioReason-Pro annotations are preferred over ground truth UniProt annotations in 79% of cases. Remarkably, BioReason-Pro de novo predicted experimentally confirmed binding partners with per-residue attention localizing to the exact contact residues resolved in cryo-EM structures of those complexes. Together, GO-GPT and BioReason-Pro establish a framework for protein function prediction that combines precise ontology modeling with interpretable biological reasoning.
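The abstract reports Fmax without defining it. As a rough illustration, the sketch below computes a protein-centric Fmax in the style commonly used for GO term benchmarks: sweep a score threshold, average precision over proteins that have at least one prediction at that threshold, average recall over all proteins, and keep the best F1. This is an assumption about the metric, not the paper's evaluation code; the function name `fmax`, the `steps` parameter, and the toy protein and GO identifiers are all hypothetical.

```python
from typing import Dict, Set


def fmax(
    predictions: Dict[str, Dict[str, float]],
    truth: Dict[str, Set[str]],
    steps: int = 100,
) -> float:
    """Best F1 over a sweep of score thresholds (protein-centric averaging).

    Assumes every protein in `truth` has at least one true term.
    """
    best = 0.0
    for k in range(1, steps + 1):
        t = k / steps
        precisions = []  # only proteins with at least one prediction at t
        recall_sum = 0.0  # accumulated over all proteins in `truth`
        for protein, true_terms in truth.items():
            scores = predictions.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue  # no protein has a prediction at this threshold
        pr = sum(precisions) / len(precisions)
        rc = recall_sum / len(truth)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


# Toy data with placeholder identifiers (not real proteins or GO terms).
truth = {"P1": {"GO:A", "GO:B"}, "P2": {"GO:C"}}
predictions = {
    "P1": {"GO:A": 0.9, "GO:B": 0.4, "GO:D": 0.6},
    "P2": {"GO:C": 0.8},
}
score = fmax(predictions, truth)
print(round(score, 3))  # 0.909
```

In this toy case the best threshold is a low one: all true terms are recovered (recall 1.0) at the cost of one false positive on P1 (average precision 5/6), giving F1 = 10/11.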
- Author: NotionNext
- Link: https://tangly1024.com/article/32948bd6-1f96-8154-a331-e66c95844f93
- License: This article is licensed under CC BY-NC-SA 4.0. Please credit the source when reposting.
