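Philip A. Romero's group at the Morgridge Institute for Research has reported a new advance: biophysics-based protein language models for protein engineering. The work was published in the journal Nature Methods on September 11, 2025.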
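The group proposes mutational effect transfer learning (METL), a protein language model framework that unites advanced machine learning and biophysical modeling. Using the METL framework, the team pretrains transformer-based neural networks on biophysical simulation data to capture fundamental relationships between protein sequence, structure and energetics. The researchers then fine-tune METL on experimental sequence-function data to harness these biophysical signals, and apply them when predicting protein properties such as thermostability, catalytic activity and fluorescence. METL excels at challenging protein engineering tasks such as generalizing from small training sets and position extrapolation, although existing methods that train on evolutionary signals remain powerful for many types of experimental assays. The team demonstrated METL's ability to design functional green fluorescent protein variants when trained on only 64 examples, showcasing the potential of biophysics-based protein language models for protein engineering.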
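According to the researchers, protein language models trained on evolutionary data have emerged as powerful tools for predictive problems involving protein sequence, structure and function. However, these models overlook decades of research into the biophysical factors governing protein function.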
Appendix: original English abstract
Title: Biophysics-based protein language models for protein engineering
Author: Sam Gelman, Bryce Johnson, Chase R. Freschlin, Arnav Sharma, Sameer D'Costa, John Peters, Anthony Gitter, Philip A. Romero
Issue&Volume: 2025-09-11
Abstract: Protein language models trained on evolutionary data have emerged as powerful tools for predictive problems involving protein sequence, structure and function. However, these models overlook decades of research into biophysical factors governing protein function. We propose mutational effect transfer learning (METL), a protein language model framework that unites advanced machine learning and biophysical modeling. Using the METL framework, we pretrain transformer-based neural networks on biophysical simulation data to capture fundamental relationships between protein sequence, structure and energetics. We fine-tune METL on experimental sequence–function data to harness these biophysical signals and apply them when predicting protein properties like thermostability, catalytic activity and fluorescence. METL excels in challenging protein engineering tasks like generalizing from small training sets and position extrapolation, although existing methods that train on evolutionary signals remain powerful for many types of experimental assays. We demonstrate METL’s ability to design functional green fluorescent protein variants when trained on only 64 examples, showcasing the potential of biophysics-based protein language models for protein engineering.
DOI: 10.1038/s41592-025-02776-2
Source: https://www.nature.com/articles/s41592-025-02776-2
Nature Methods: launched in 2004 and published by Springer Nature. Latest impact factor: 47.99
Official website: https://www.nature.com/nmeth/
Submission link: https://mts-nmeth.nature.com/cgi-bin/main.plex