Previously met with skepticism, AI won scientists the Nobel Prize in Chemistry in 2024 after they used it to tackle the protein folding and design problems, and it has now been adopted by biologists across the globe. AI models like artificial neural networks and language models help scientists solve a variety of problems, from predicting the 3D structure of proteins to designing novel antibiotics from scratch. Researchers continue to refine AI models, addressing their limitations and demonstrating widespread applications in biology.
Nobel Prize for AI: A Recap of Protein Folding and Design History
Nobel laureate David Baker uses deep learning models to create de novo proteins that are better suited to solving modern problems than natural proteins.
Image credit: Ian C Haydon
A major sore spot for protein biologists, the protein-folding problem has now largely been solved by AI, winning University of Washington biochemist David Baker and DeepMind researchers Demis Hassabis and John Jumper a Nobel Prize in Chemistry. After struggling for around two decades to determine the tertiary structure of proteins from their amino acid sequences, scientists established the Critical Assessment of Structure Prediction (CASP) competition in 1994 to foster collaboration in this area. In 1998, Baker’s team built the Rosetta software for modeling protein energy configurations; a few years later, the team turned their computational model into a game called Foldit to recruit volunteers to help solve protein structures. At the 2018 edition of CASP, the DeepMind team unveiled their breakthrough AlphaFold program, trained on real protein sequences and structures. Two years later, the success of the AlphaFold2 program at accurately predicting protein structure led experts to declare the protein-folding problem largely solved. In 2024, Baker, Hassabis, and Jumper were awarded the Nobel Prize for work that has enabled a deeper understanding of protein functions and applications.
Read up on the background behind the Nobel Prize-winning discovery here.
AlphaFold Inspired the Rapid Adoption of AI in Biology
With the 2018 release of AlphaFold, an AI deep learning model, scientists were finally able to predict the 3D structure of proteins, a decades-old challenge in biology. Trained on 100,000 known protein sequences and structures, the model can not only predict protein structures with near-experimental accuracy but can also be used to design de novo proteins for a variety of applications in therapeutics and beyond. Inspired by the success of AlphaFold, scientists are now using deep learning models to create spatiotemporal maps of cells, analyze images of cells to detect changes in morphology that indicate disease, and estimate the efficacy of new drugs in halting disease progression to minimize losses in the drug discovery pipeline. Experts like Maddison Masaeli, an engineer scientist and chief executive officer at Deepcell, welcome the rapid adoption of AI in biology but caution that researchers need significant expertise to harness AI for biological applications.
Explore the broad applications of AI in biology here.
De Novo Proteins Tackle 21st Century Problems
Using advanced machine learning tools, researchers can create artificial proteins with new functions.
Image credit: Ian C Haydon
Harnessing the power of AI models, scientists are now able to design bespoke proteins with specific biological functions, allowing them to solve problems that cannot be addressed by the proteins found in nature. Traditional protein engineering is based on making incremental changes and observing their effects, but machine learning models can both design better proteins and significantly speed up the process. Protein design specialist David Baker and his team at the University of Washington used several different AI models to design stable luciferase enzymes that glow when they bind synthetic luciferin, with applications in the deep imaging of animal tissue. While this type of protein design has room for improvement and isn’t yet fully automated, it could be used in the future to create a variety of proteins for therapeutic and other purposes.
Learn more about de novo proteins here.
AI Discovers New Antibiotic for Drug-Resistant Bacteria
Jon Stokes and his team developed SyntheMol, a generative artificial intelligence model that they used to create novel antibiotics with predicted efficacy against the ESKAPE pathogen, Acinetobacter baumannii.
Image credit: McMaster University
AI-guided de novo design could also be a major boon in antibiotic development. With the incidence of antimicrobial resistance increasing worldwide and few new antibiotics being discovered, researchers at McMaster University have turned to AI to design novel antibiotics that can be easily synthesized. Led by biochemist Jon Stokes, the team developed a generative AI model called SyntheMol to design small molecules that possess antibacterial activity against Acinetobacter baumannii, a drug-resistant pathogen that the World Health Organization considers a major threat to global health. Although the molecules haven’t been tested in human subjects yet, several of them inhibited the growth of the target bacteria as well as other drug-resistant microbes in vitro.
Delve into AI-generated antibiotics in this article.
Artificial Neural Networks Learn Like Human Brains
Inspired by the human brain, artificial neural networks (ANNs) are a type of machine learning model containing multiple layers of interconnected nodes (or neurons) that can process data. Each node in the network applies a mathematical function to its weighted input data and, based on a threshold value, determines whether the output will be passed forward to the next layer of nodes. Scientists train the ANN using datasets that have known values or features, then allow it to compare its predicted outputs against the true answer for each sample so it can improve its accuracy over time. The ANN can then be used to predict outcomes from new datasets. Despite some key limitations, ANNs can identify patterns in complex data that humans might not be able to detect and can perform menial tasks to free up time for researchers.
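To make the idea of a node concrete, here is a minimal Python sketch of a single threshold node trained on a toy dataset. It is an illustration only, not code from any study mentioned here: the function names, the logical-AND dataset, the learning rate, and the simple perceptron update rule are all assumptions chosen for brevity. Real ANNs stack many such nodes in layers and are typically trained with gradient-based methods rather than this rule.

```python
from typing import List, Sequence, Tuple


def node_output(weights: Sequence[float], bias: float,
                inputs: Sequence[float], threshold: float = 0.0) -> int:
    """Weight the inputs, sum them, and fire (1) only if the sum exceeds the threshold."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if weighted_sum > threshold else 0


def train(samples: Sequence[Sequence[float]], labels: Sequence[int],
          epochs: int = 20, learning_rate: float = 1.0) -> Tuple[List[float], float]:
    """Adjust weights by comparing each prediction with the known true answer."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in zip(samples, labels):
            # error is 0 when the prediction is right, +1 or -1 when it is wrong
            error = target - node_output(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical toy dataset with known answers: the logical AND of two inputs.
SAMPLES = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
LABELS = [0, 0, 0, 1]

weights, bias = train(SAMPLES, LABELS)
predictions = [node_output(weights, bias, s) for s in SAMPLES]
print(predictions)  # [0, 0, 0, 1]
```

After a few passes over the data, the node's predictions match the known labels, which is the "improve its accuracy over time" step in miniature.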
Read more about neural nets in this explainer article.
Large Language Models Help Us Understand the Brain
Researchers have now developed a language model (the type of deep learning model responsible for ChatGPT) that can determine a person’s thoughts from MRI scans of their brain. Alexander Huth, a researcher at the University of Texas at Austin, created the technique with the goal of allowing people who are unable to speak to communicate, but it has also revealed insights about the function of the human brain. Huth’s model showed that all parts of the brain use meaning-related information even if MRI scans show that only the prefrontal cortex is active. While the model isn’t generalizable across different subjects, meaning it can’t read minds, experts advise caution as these models become more accurate in the future.
Learn more about language models and their application in biology in this article.
Predicting Gene Expression Using Artificial Intelligence
While ChatGPT is used to predict the next words in a sentence, scientists have now created similar deep learning models that can predict gene expression in individual cells. Created by computational biologist Bo Wang and his team at the University of Toronto, the single-cell generative pretrained transformer (scGPT) can analyze single-cell RNA sequencing data more effectively than several of the most popular current methods. The model was also able to more accurately predict the effects of genetic perturbation than a standard model. Originally trained on bone marrow and immune cells, a new iteration of scGPT has now been adapted for the analysis of a variety of other cell types and could be used to answer important biological questions in the near future.
Continue reading about scGPT here.
AI models have enormous potential in biology, from helping us understand the brain to creating novel therapeutics, yet experts have warned that their use should be tempered with caution, and that their success depends on having a depth and breadth of knowledge. Researchers continue to explore, develop, and refine deep learning models for a variety of applications, including the interpretation and prediction of biological data.