Proteins are one of nature’s most incredible and versatile inventions. These essential building blocks of life catalyze virtually every chemical reaction in the body, regulate gene expression and the immune system, make up the major structural elements of all cells, and form the major components of muscle.
But why’s this AI called ‘AlphaFold’? Proteins are chains of amino acids that fold spontaneously to form a 3D structure crucial to the protein’s biological function. You can look at their components and sequences on paper, but if you don’t know what 3D shape they fold up into, you can’t predict what they’ll do, or how they’ll interact with other molecules.
Various human-operated experimental techniques have been used to determine protein structures, but they’re time-consuming and expensive. Considering that there are over 200 million known proteins across all lifeforms – and, so far, only 170,000 protein structures have been identified – it made sense to co-opt AI to speed up the process. But, until AlphaFold, programs hadn’t been able to predict protein structures as accurately as human-operated experimental techniques.
AlphaFold 2, released in 2021, was a game-changing breakthrough, predicting 3D structures for nearly every protein in the human body and enabling all sorts of cutting-edge science. In less than three years, it’s been used by researchers worldwide to accelerate discoveries in cancer treatment, malaria vaccines, the creation of plastic-eating enzymes, and countless other areas – the AlphaFold 2 paper currently lists more than 14,000 citations.
So, how is AlphaFold 3 an improvement? Well, the newest version moves beyond just predicting the structure and interactions of proteins to include everything from multiple proteins to DNA, RNA, and small-molecule ligands. (Most drugs are ligands that bind to proteins to change how they interact in human health and disease.)
This makes it an absolutely unprecedented resource for simulating how specific proteins in the body will interact with specific drug molecules.
The GIF above gives you an idea of the structure that AlphaFold 3 generates. It shows its structural prediction for a spike protein (blue) of a common cold virus as it interacts with antibodies (turquoise) and simple sugars (yellow). You can see how accurately AlphaFold 3’s model matches the protein’s true structure, which is underlaid in gray.
To achieve these advanced capabilities, AlphaFold 3 was trained on global molecular structural data held within the Protein Data Bank. DeepMind says it can process over 99% of all known biomolecular complexes in that database. In addition, its Evoformer module, the architecture that allowed AlphaFold 2 to perform like it did, was improved.
This is how the AlphaFold system works in very, very basic terms. (Thanks to the University of Oxford’s Protein Informatics Group for its easier-to-understand explanation.) It takes the amino acid sequence that’s input, searches databases for similar sequences already identified in other living organisms, extracts all of the relevant information using a transformer (that’s the Evoformer), and passes that information on to a neural network that produces a 3D structure – a long list of coordinates representing the position of each atom of the protein, including side chains.
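The “searches databases for similar sequences” step can be pictured with a deliberately tiny sketch. Everything in it is an illustrative assumption: the sequences, the organism names, the 50% threshold, and the `identity` and `find_similar` helpers are made up for this example, and real pipelines use dedicated alignment tools over enormous databases rather than a position-by-position comparison.

```python
def identity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("toy comparison expects equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def find_similar(query, database, threshold=0.5):
    """Return (name, identity) pairs at or above the threshold, best match first."""
    hits = [
        (name, identity(query, seq))
        for name, seq in database.items()
        if len(seq) == len(query)  # toy shortcut: skip anything we cannot compare directly
    ]
    return sorted(
        (hit for hit in hits if hit[1] >= threshold),
        key=lambda hit: hit[1],
        reverse=True,
    )


# Hypothetical amino acid sequences in one-letter code.
query = "MKTAYIAK"
database = {
    "organism_a": "MKTAYIAR",  # differs from the query at one position
    "organism_b": "MKSAYLAK",  # differs at two positions
    "organism_c": "GGGGGGGG",  # unrelated
}

hits = find_similar(query, database)
for name, score in hits:
    print(f"{name}: {score:.0%} identical")
```

In the real system, the related sequences found this way are what the Evoformer mines for clues about which parts of the chain sit close together in 3D.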
Once the new-and-improved Evoformer has processed the inputs, AlphaFold 3 assembles its structural predictions using a diffusion network, like those found in AI image generators. As the joint blog post by DeepMind and Isomorphic Labs announcing AlphaFold 3 explains, the diffusion process “starts with a cloud of atoms, and over many steps converges on its final, most accurate molecular structure.”
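The “cloud of atoms” idea can be illustrated with a toy loop. This is only a sketch under heavy assumptions, not AlphaFold 3’s method: the real model uses a trained neural network to decide how to move each atom, whereas the stand-in “denoiser” below simply cheats by looking at a known target. The target coordinates, the step size and the noise schedule are all invented for the example.

```python
import math
import random


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length lists of 3D points."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def toy_denoise(target, steps=50, seed=0):
    """Start from random atom positions and refine them over many small steps."""
    rng = random.Random(seed)
    # The initial "cloud of atoms": random positions unrelated to the target.
    coords = [[rng.gauss(0.0, 10.0) for _ in range(3)] for _ in target]
    history = [rmsd(coords, target)]
    for step in range(steps):
        # Injected noise shrinks linearly and reaches zero on the final step.
        noise_scale = 0.1 * (1 - (step + 1) / steps)
        for atom, goal in zip(coords, target):
            for axis in range(3):
                # Stand-in for a learned denoiser: move 20% of the way to the answer.
                atom[axis] += 0.2 * (goal[axis] - atom[axis])
                atom[axis] += rng.gauss(0.0, noise_scale)
        history.append(rmsd(coords, target))
    return coords, history


# Hypothetical four-atom "structure", in arbitrary units.
target = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0), (0.0, 1.5, 1.5)]
final_coords, history = toy_denoise(target)
print(f"error at start: {history[0]:.2f}, error at end: {history[-1]:.4f}")
```

The error shrinks step by step as the cloud settles into shape, which is the behavior the quote describes. What the toy cannot show is the hard part: learning where the atoms should go without being told the answer.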
In a recent interview with Bloomberg’s Tom Mackenzie, Google DeepMind CEO and co-founder (and CEO and founder of Isomorphic Labs) Demis Hassabis discussed the implications of using AlphaFold 3 in drug discovery.
“The holy grail of drug discovery is not just knowing the protein structure, which is what AlphaFold 2 did, but actually designing drug compounds called ligands that bind to the protein’s surface,” Hassabis said. “And you want to know where it binds, and how strongly it binds, in order for you to design the right kind of drug compound. So, AlphaFold 3 is a big step in that direction of predicting protein-ligand binding and how that interaction will work.”
In January this year, Isomorphic Labs announced strategic partnerships with pharmaceutical giants Eli Lilly and Novartis, worth a combined value of around US$3 billion. But what’s incredible here is the drug development timeline that’s expected to result from these partnerships.
“So, we’re already working on real drug programs,” said Hassabis. “And I would be expecting maybe in the next couple of years the first AI-designed drugs [appearing] in the clinic.”
“If you ask me the number one thing AI could do for humanity,” he continued, “it would be to solve, you know, hundreds of terrible diseases. I can’t imagine a better use case for AI. So that’s partly the motivation behind Isomorphic and AlphaFold and all the work we do in science, it’s to advance society in these big ways.”
The full interview between Hassabis and Mackenzie can be viewed in the video below.
Google DeepMind CEO on Drug Discovery, Hype, Isomorphic
When it was tested, AlphaFold 3 demonstrated state-of-the-art accuracy in predicting drug-like interactions, including proteins bound with ligands and antibodies bound with target proteins.
Using the PoseBusters benchmark, it was found to be 50% more accurate than the best existing methods – without the need to input any structural information. PoseBusters checks the chemical and physical plausibility of molecular and protein-ligand ‘poses’ generated by a deep-learning model.
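To give a feel for what a plausibility check on a pose involves, here is a minimal sketch of two such tests: whether any atoms are packed impossibly close together, and how far the predicted ligand atoms sit from a reference structure. It does not use or reproduce the actual PoseBusters software. The function names, the 1.0 and 2.0 angstrom cutoffs and the three-atom “ligand” are assumptions chosen purely for illustration.

```python
import math
from itertools import combinations


def has_clash(coords, min_distance=1.0):
    """True if any two atoms are closer together than min_distance (angstroms)."""
    return any(math.dist(a, b) < min_distance for a, b in combinations(coords, 2))


def ligand_rmsd(predicted, reference):
    """Root-mean-square deviation between matched predicted and reference atoms."""
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(reference))


def pose_passes(predicted, reference, rmsd_cutoff=2.0, min_distance=1.0):
    """A pose passes if it has no clashes and lies close to the reference."""
    no_clash = not has_clash(predicted, min_distance)
    return no_clash and ligand_rmsd(predicted, reference) <= rmsd_cutoff


# Hypothetical three-atom ligand, coordinates in angstroms.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
good_pose = [(0.1, 0.0, 0.0), (1.6, 0.0, 0.0), (3.1, 0.0, 0.0)]
clashing_pose = [(0.0, 0.0, 0.0), (0.2, 0.0, 0.0), (3.0, 0.0, 0.0)]

print("good pose passes:", pose_passes(good_pose, reference))
print("clashing pose passes:", pose_passes(clashing_pose, reference))
```

A real benchmark runs many more checks than these two, covering things like bond lengths, bond angles and how the ligand sits within the protein pocket.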
And you can play with it yourself. AlphaFold 3 is available via the AlphaFold Server, which includes a database of 200 million protein structures. This phenomenal resource is free to scientists conducting non-commercial research – or indeed just curious Web users worldwide.
AlphaFold Server Demo – Google DeepMind
Determining a protein’s structure experimentally, without a tool like this, can take… well, about as long as it takes to complete a PhD, and can cost hundreds of thousands of dollars. Much like how DeepMind’s GNoME tool has catapulted materials and crystals discovery hundreds of years into the future, AlphaFold 3 promises to radically accelerate vast areas in biological science and pharma.
“This new window on the molecules of life reveals how they’re all connected and helps understand how those connections affect biological functions – such as the actions of drugs, the production of hormones and the health-preserving process of DNA repair,” said Google DeepMind and Isomorphic Labs. “This leap could unlock more transformative science, from developing biorenewable materials and more resilient crops, to accelerating drug design and genomics research.”
We can’t wait to see where this technology takes us.
Research into the predictive capabilities of AlphaFold 3 was published in Nature.
Sources: Google DeepMind, Isomorphic Labs