Abstract
The field of protein design has made remarkable progress over the past decade. Historically, the low reliability of purely structure-based design methods limited their application, but recent strategies that combine structure-based and sequence-based calculations, as well as machine learning tools, have dramatically improved protein engineering and design. In this Review, we discuss how these methods have enabled the design of increasingly complex structures and therapeutically relevant activities. Additionally, protein optimization methods have improved the stability and activity of complex eukaryotic proteins. Thanks to their increased reliability, computational design methods have been applied to improve therapeutics and enzymes for green chemistry and have generated vaccine antigens, antivirals and drug-delivery nano-vehicles. Moreover, the high success of design methods reflects an increased understanding of basic rules that govern the relationships among protein sequence, structure and function. However, de novo design is still limited mostly to α-helix bundles, restricting its potential to generate sophisticated enzymes and diverse protein and small-molecule binders. Designing complex protein structures is a challenging but necessary next step if we are to realize our objective of generating new-to-nature activities.
Acknowledgements
We thank A. Tennenhouse for critical reading. Work in the Fleishman lab was funded by the Volkswagen Foundation grant 9474, the Israel Science Foundation grant 1844, the European Research Council through a Consolidator Award grant 815379, the Dr. Barry Sherman Institute for Medicinal Chemistry, and a donation in memory of Sam Switzer. Work in the Correia lab was supported by the Swiss National Foundation, the National Center of Competence in Molecular Systems Engineering and Fondation Leenaards.
Author information
Contributions
D.L. and C.A.G. researched data for the article. All authors contributed substantially to discussion of the content, wrote the article, and reviewed and/or edited the manuscript before submission.
Ethics declarations
Competing interests
S.J.F. and B.E.C. are named inventors on patents relating to methods and designs described in the manuscript and consult on the application of protein design methods.
Peer review
Peer review information
Nature Reviews Molecular Cell Biology thanks Haiyan Liu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Related links
Protein Data Bank: https://www.rcsb.org/
Glossary
- Backbone designability
  The ability of amino acid sequences to fold into the desired backbone. A backbone that has many solutions is highly designable.
- Backbone generation
  Generating a spatial arrangement of the protein backbone excluding the amino acid side chains.
- Epistasis
  Non-additive effects of combinatorial mutations; for instance, when mutations are tolerated in combination but not individually, or vice versa.
- Fold design
  Design of a protein backbone that shares no significant sequence homology with natural proteins. Sometimes denoted de novo design.
- Function design
  Implementing a new function into a protein scaffold.
- Idealized topologies
  Simplified geometric representation of protein structure, mostly comprising secondary structure elements connected by short linkers.
- Negative design
  Designing elements that destabilize undesired (for example, non-functional or aggregation-prone) structural states.
- Physics-based methods
  Computational methods that apply the laws of physics, usually in the form of forcefields, to minimize protein structures and analyse or design three-dimensional protein structures.
- Positive design
  Designing protein elements that improve the stability of a desired structural state.
- Protein backbone
  The protein mainchain of amino acids connected through covalent amide linkages. Also known as protein scaffold.
- Protein optimization
  Design with the goal of optimizing desired protein functional aspects such as thermodynamic and kinetic stabilities, production yields, catalytic efficiency, binding affinity, and specificity.
- Protein switches
  Proteins that toggle between several different conformations by interacting with a specific molecule or environment.
- Relative contact order
  Represents the relative complexity of a protein fold. Computed as the extent to which amino acids that are far in the primary sequence are physically close in the 3D structure.
- Sequence space
  The theoretical space of possible combinations of protein sequence changes. This space is often too large for experimental or computational enumeration, and design methods must find ways to restrict and sample it efficiently.
- Stability design
  Design with the goal of improving protein thermodynamic and kinetic stabilities.
- Structure-based design
  Design based on computed or experimentally determined molecular structures using physical principles.
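The glossary entry on relative contact order describes a quantity that is straightforward to compute, and a short sketch may make the definition more concrete. The Python below is an illustration only, not the authors' implementation: it assumes one representative coordinate per residue (for example, the Cα atom), and the function name `relative_contact_order`, the 8 Å distance cutoff and the minimum sequence separation of 2 are assumptions chosen for the example. It averages the sequence separation over all residue pairs in spatial contact and normalizes by chain length.

```python
import math


def relative_contact_order(coords, cutoff=8.0, min_separation=2):
    """Sketch of relative contact order for a single chain.

    coords: list of (x, y, z) tuples, one per residue, in sequence order
            (assumed here to be C-alpha positions, in angstroms).
    cutoff: distance below which two residues count as in contact
            (8.0 is an assumption made for this example).
    min_separation: smallest sequence separation considered, so that
            directly bonded neighbours are ignored (also an assumption).
    """
    n_res = len(coords)
    total_separation = 0
    n_contacts = 0
    for i in range(n_res):
        for j in range(i + min_separation, n_res):
            if math.dist(coords[i], coords[j]) <= cutoff:
                total_separation += j - i  # distance along the sequence
                n_contacts += 1
    if n_contacts == 0:
        return 0.0
    # Mean sequence separation of contacts, normalized by chain length.
    return total_separation / (n_contacts * n_res)


# Toy coordinates: an extended strand versus a hairpin of the same length.
extended = [(3.8 * k, 0.0, 0.0) for k in range(8)]
hairpin = [(3.8 * k, 0.0, 0.0) for k in range(4)] + [
    (3.8 * k, 5.0, 0.0) for k in range(3, -1, -1)
]

print(relative_contact_order(extended))
print(relative_contact_order(hairpin))
```

In this toy comparison the hairpin scores higher than the extended strand, because residues that are far apart in the sequence come close together in space, which is the sense in which the glossary uses the term to describe fold complexity.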
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Listov, D., Goverde, C.A., Correia, B.E. et al. Opportunities and challenges in design and optimization of protein function. Nat Rev Mol Cell Biol 25, 639–653 (2024). https://doi.org/10.1038/s41580-024-00718-y
DOI: https://doi.org/10.1038/s41580-024-00718-y