1. School of Physics and Astronomy, Shanghai Jiao Tong University, Shanghai 200240, China
2. Shanghai National Center for Applied Mathematics (SJTU Center), Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
3. Shanghai Artificial Intelligence Laboratory, Shanghai 200240, China
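Abstract (Chinese version, in translation):
Naturally occurring enzymes have a wide variety of functions and are already used in industrial production and academic research, but the properties and functions of many of them do not yet fully meet application needs; improving certain characteristics of such enzymes through modification is an important task of enzyme engineering. This paper describes the main stages in the development of enzyme engineering and focuses on research progress in artificial intelligence (AI)-assisted enzyme engineering. Enzyme engineering mainly comprises the strategies of rational design, directed evolution, semi-rational design, and AI-assisted design. Rational design modifies an enzyme on the basis of prior knowledge such as its catalytic mechanism and structure. Directed evolution improves properties of the target enzyme, such as stability and activity, by constructing random mutant libraries and screening them at high throughput. Semi-rational design uses a range of computational methods to construct mutant libraries that are smaller and more rational than those of directed evolution, reducing the screening workload. Driven by large amounts of data, AI techniques can learn feature information about protein composition and evolution. By learning directly from naturally occurring protein sequences, co-evolutionary information, and structures, deep neural networks can already address many types of enzyme engineering problems, such as predicting beneficial mutations, optimizing protein stability, and improving catalytic activity. By analyzing the current state of enzyme engineering, this paper aims to further promote the development and optimization of enzymes for broader application and to offer researchers and practitioners more valuable insights.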
Abstract:
Enzymes have garnered significant attention in both research and industry because of their unparalleled specificity and functionality, yet opportunities remain for enhancing their physicochemical properties and fitness to improve catalytic performance. The primary objective of enzyme engineering is to optimize the fitness of targeted enzymes through various strategies for modifying, or even redesigning, them. This review provides a comprehensive overview of the progress made in enzyme engineering, with a focus on artificial intelligence (AI)-guided design methodology. Several key strategies have been employed in enzyme engineering, including rational design, directed evolution, semi-rational design, and AI-guided design. Rational design relies on an extensive knowledge base encompassing protein structures and catalytic mechanisms, allowing for purposeful manipulation of enzyme properties. Directed evolution, on the other hand, involves generating a library of random variants for subsequent high-throughput screening to identify beneficial mutations. Semi-rational design combines rational design and directed evolution, resulting in a smaller, yet more targeted, library of variants, which mitigates the high cost of extensively screening the large libraries produced by directed evolution. In recent years, AI technologies, particularly deep neural networks, have emerged as a promising approach for enzyme engineering. AI-guided methods leverage a vast amount of information on protein sequences, multiple sequence alignments, and protein structures to learn key features and the correlations among them. These learned features can then be applied to various downstream tasks in enzyme engineering, such as predicting beneficial mutations, optimizing protein stability, and enhancing catalytic activity.
Here, we delve into the advances and successes of each of these strategies for enzyme engineering, highlighting the growing impact of AI-guided design on the process. By offering a detailed examination of the current state of enzyme engineering, we aim to provide valuable insight for researchers and engineers to further advance the development and optimization of enzymes for a broader range of applications.
Key words: enzyme engineering, directed evolution, artificial intelligence, deep neural network
Liqi KANG, Pan TAN, Liang HONG. Enzyme engineering in the age of artificial intelligence[J]. Synthetic Biology Journal, 2023, 4(3): 524-534, doi: 10.12211/2096-8280.2023-009.
Correlation between the predictions of unsupervised models and experimental results on different datasets

| Protein fitness category | Dataset | ESM-IF1 | ESM-1v | MSA transformer | ProGen2 | Tranception |
|---|---|---|---|---|---|---|
| Catalytic activity | B3VI55_LIPST | 0.291 | 0.272 | 0.316 | 0.239 | 0.290 |
| | MTH3_HAEAESTABILIZED | 0.423 | 0.488 | 0.564 | 0.507 | 0.479 |
| | KKA2_KLEPN | 0.204 | 0.198 | 0.153 | 0.191 | 0.077 |
| | MK01_HUMAN | 0.155 | 0.164 | 0.182 | 0.200 | 0.203 |
| | AMIE_PSEAE | 0.295 | 0.537 | 0.523 | 0.526 | 0.517 |
| | RASH_HUMAN | 0.070 | 0.131 | 0.089 | 0.135 | 0.085 |
| | UBC9_HUMAN | 0.485 | 0.518 | 0.425 | 0.484 | 0.476 |
| | BG_STRSQ | 0.665 | 0.670 | 0.727 | 0.749 | 0.656 |
| | TRPC_THEMA | 0.392 | 0.488 | 0.462 | 0.397 | 0.444 |
| | TIM_SULSO | 0.506 | 0.617 | 0.613 | 0.529 | 0.594 |
| | P84126_THETH | 0.519 | 0.564 | 0.656 | 0.558 | 0.548 |
| | BLAT_ECOLX | 0.673 | 0.692 | 0.538 | 0.601 | 0.622 |
| Stability | PTEN_HUMAN | 0.559 | 0.458 | 0.366 | 0.471 | 0.430 |
| | TPMT_HUMAN | 0.560 | 0.531 | 0.530 | 0.458 | 0.478 |
| Peptide binding | DLG4_RAT | 0.468 | 0.531 | 0.224 | 0.361 | 0.446 |
| | WW | 0.415 | 0.399 | 0.441 | 0.309 | 0.563 |
| Protein binding | IF1_ECOLI | 0.337 | 0.356 | 0.363 | 0.246 | 0.347 |
| | SUMO1_HUMAN | 0.543 | 0.548 | 0.565 | 0.482 | 0.423 |
| | RL40B_YEAST | 0.372 | 0.365 | 0.647 | 0.473 | 0.455 |
| DNA binding | FOSJUN | 0.532 | 0.464 | 0.366 | 0.515 | 0.536 |
| | GAL4_YEAST | 0.326 | 0.476 | 0.386 | 0.468 | 0.458 |
| RNA binding | RRM | 0.443 | 0.536 | 0.509 | 0.512 | 0.407 |
| | TDP43 | 0.158 | 0.026 | 0.117 | 0.013 | 0.125 |
| IgG binding | GB1 | 0.337 | 0.105 | 0.329 | 0.232 | 0.254 |
| Average | | 0.413 | 0.453 | 0.438 | 0.376 | 0.374 |
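
The table above reports, for each dataset, how well a model's predicted mutation effects agree with the experimentally measured ones. The table does not say which correlation coefficient was used. Spearman's rank correlation is a common choice for this kind of benchmark and is assumed in the minimal sketch below. The function names and the toy numbers are hypothetical illustrations and are not taken from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(predicted, measured):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(predicted), average_ranks(measured))


# Hypothetical toy data: one model score and one assay value per variant.
model_scores = [-2.1, -0.3, -5.4, 0.8, -1.0]
assay_fitness = [0.40, 0.95, 0.10, 1.20, 1.00]

rho = spearman(model_scores, assay_fitness)
print(f"Spearman correlation: {rho:.3f}")
```

Because only the ordering of variants matters, a rank correlation suits unsupervised model scores, which are typically not on the same scale as the assay readout.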