Written by event reporter Ahmed Adel Ezat
The past decade has witnessed several attempts to apply artificial intelligence (AI) techniques, such as deep learning (DL), to solve grand challenges in the structural biology and biophysics of biomolecules. One of these challenges is the protein folding problem. These efforts culminated in AlphaFold, a breakthrough in structural bioinformatics developed by Google DeepMind and published in 2021 for the scientific community.
AlphaFold solves half of the protein folding problem, namely predicting the native 3D structure of proteins with near-experimental accuracy. It leverages the co-evolutionary information encoded in multiple sequence alignments (MSAs) to predict the native structure of monomeric proteins.
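To give a feel for what "co-evolutionary information" in an alignment means, here is a minimal sketch that measures how strongly two alignment columns vary together, using mutual information. This is a classic, simple statistic for illustration only; it is not how AlphaFold itself processes alignments. The toy alignment and the function name `mutual_information` are made up for this example.

```python
from collections import Counter
from math import log2


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    `msa` is a list of equal-length aligned sequences (hypothetical toy input).
    """
    n = len(msa)
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    freq_i = Counter(col_i)              # residue counts in column i
    freq_j = Counter(col_j)              # residue counts in column j
    freq_ij = Counter(zip(col_i, col_j))  # counts of residue pairs

    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment: columns 1 and 3 change together (K with E, R with D),
# while column 0 is fully conserved and so carries no covariation signal.
toy_msa = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
]

covarying = mutual_information(toy_msa, 1, 3)
conserved = mutual_information(toy_msa, 0, 1)
print(f"columns 1 and 3: {covarying:.2f} bits")
print(f"columns 0 and 1: {conserved:.2f} bits")
```

Pairs of positions that mutate together across related sequences are often close in the folded structure, which is the kind of signal structure predictors can exploit.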
The AlphaFold breakthrough prompted computational and experimental structural biologists and biophysicists to organize workshops and conferences to discuss the pros and cons of AI applications in structural biology and bioinformatics, and how to harness the power of AI to accelerate the structure determination of proteins and their complexes.
Two EMBO workshops were organised in 2021 and 2022 with this aim. The 2023 edition, ‘Computational structural biology’, took place at the beginning of December in Heidelberg.
I am Ahmed Adel Ezat, a Lecturer in the Biophysics Department, Faculty of Science, Cairo University, and I was lucky and proud to attend all these workshops and keep up to date with the different applications and new advancements of AI in structural biology and biophysics.
I was very keen to take part in such a meeting to become acquainted with the current capabilities and applications of AI in structural biology and biophysics, its new frontiers and perspectives, and how other experts in the field see its future.
The recent workshop discussed the AI revolution in structural biology, from the structure determination of proteins, nucleic acids and their complexes, structure-based function annotation and the prediction of the effects of mutations on protein structure and function, to the integrative modeling of large cellular supercomplexes and even dynamical whole-cell modeling.
I am very thankful to EMBL for giving me the opportunity to attend as an event reporter. It was a very nice experience for me.
Briefly, I will highlight the five key takeaway messages from the meeting.
Correct cellular functioning depends on the binding of different biomolecules, such as proteins, peptides and nucleic acids, to other proteins. Most of these interactions are transient, which makes the experimental modeling of such complexes challenging.
Current advances in AI-assisted protein modeling (AlphaFold and RoseTTAFold) have helped to predict protein – protein and protein – peptide interactions (PPIs) and to start building the human cellular proteome and its interaction networks.
In recognition of their contributions to the AI revolution in protein structure determination, John Jumper, developer of AlphaFold, and Minkyung Baek, developer of RoseTTAFold, presented the state of the art of their software, its new capabilities and potential future improvements.
John Jumper presented the evolution of AlphaFold and what has been built on it so far, from the modeling of protein monomers to the accurate prediction of larger multimeric proteins (AlphaFold-Multimer) and the prediction of the impact of sequence variation on protein structure (AlphaMissense).
Baek highlighted the current state of RoseTTAFold, its recent improvements, and its performance relative to AlphaFold2 and other AI-assisted protein modeling software. She showed the current power of AI in predicting protein – nucleic acid complexes through RoseTTAFoldNA.
Several groups presented their contributions to improving AI-assisted protein modeling by building neural network energy models, based on co-evolutionary information and the laws of biophysics, for the accurate modeling and design of proteins.
The accurate modeling of RNA 3D structure is a challenging problem in computational structural biology owing to the high flexibility and dynamics of RNA.
Two groups presented their current advances in RNA 3D modeling through the integration of experimental data, such as solid-state mass spectroscopy at cryogenic temperatures, electron microscopy and chemical probing, with molecular modeling and simulations.
Others shed light on the power of machine learning and protein language models to improve the modeling accuracy of protein – protein interactions, and applied them to antigen – antibody complex prediction and T-cell receptor (TCR) recognition of diverse antigenic proteins.
Computational microscopy is a term coined by Klaus Schulten several years ago to describe the power of molecular dynamics simulations to capture biomolecular structural dynamics at different temporal and spatial resolutions.
The ability to adjust the spatial (atomistic, coarse-grained, ML/MM and QM/MM) and temporal (from femtoseconds to milliseconds) resolution of simulations has allowed researchers to study complex cellular processes, thanks to current developments in both hardware and software.
Several research groups showcased different applications, such as the interaction of the SARS-CoV-2 spike with cellular receptors, mitochondrial membrane bending by large supercomplexes of the electron transport chain (ETC), the specific binding of Cas9 from different species to CRISPR, antibiotic and drug resistance through non-equilibrium molecular simulations, the association and dissociation mechanisms of protein – protein interactions, and the integrative modeling of monomeric proteins by combining molecular dynamics simulations with experimental techniques.
Correctly predicting protein function requires accurate modeling of protein structure and dynamics. Thanks to the AI revolution in structure modeling, we can now predict the structure of monomeric and multimeric proteins within a few minutes.
AI models (machine learning and protein language models) can be trained on sequence and structure features to predict protein function.
Several research groups presented their methods for extracting features and building models to solve specific problems, such as building a catalogue of enzymes to study and predict enzyme evolution and mechanism, predicting the selectivity and specificity of G – protein coupled receptors (GPCRs), and functionally annotating microbial proteins to understand microbiome – phenotype relationships.
Combining diverse information from multiple experimental and computational techniques aids the modeling of large cellular supercomplexes, such as proteasomes and nuclear pore complexes containing thousands of proteins. AlphaFold-predicted monomers and multimers accelerate such modeling. Thus, integrative modeling is revolutionizing cellular biology and biophysics and mapping the entire cell at different spatial and temporal scales. It adds new molecular, kinetic and dynamical details to our description of cellular processes.
Several groups showed that combining experiments such as cryo-EM, cryo-ET and mass spectrometry (MS) with AI predictions (AlphaFold) increases confidence in the modeled complexes, and that larger assemblies can be built on them.
Two groups integrated AlphaFold models with low-resolution data from cryo-EM and MS to reveal molecular details of the eIF2 dephosphorylation macromolecular complex and the native architecture of motile cilia.
Another two research groups harnessed the power of deep learning models to predict binding sites, localize intrinsically disordered regions (IDRs) in large assemblies, and visualize protein – protein interactions and interfaces inside cells at near-atomic resolution.
Two more research groups presented their work on modeling the conformational dynamics of proteins and their complexes in solution from hydrogen-deuterium exchange mass spectrometry (HDX-MS), and on the automatic and precise modeling of water molecules in multiconformer protein – ligand models solved from high-resolution X-ray maps.
Andrej Sali presented an approach to address the problems of integrative modeling. It combines several input models, built from different data, to obtain a more accurate output model of the structure of interest. This process is called metamodeling.
Zaida Luthey-Schulten constructed a dynamical 3D whole-cell model by integrating structural details with kinetic measurements and other experimental data to simulate a growing minimal cell and visualize cellular processes in action.
AI can aid not only in modeling the structure of proteins and their complexes but also in learning the complex physical rules of protein – ligand binding, predicting protein – membrane interfaces, predicting the effects of mutations, and designing new functional proteins and ligands.
One group presented their attempts to design a deep learning-based scoring function trained on protein – ligand interactions and showed its performance in identifying important binding interactions.
Other groups presented the capabilities of deep learning-based models to predict protein – membrane interfaces and the effects of mutations on protein structure and function, and their relation to various diseases.
Others presented their efforts to design new protein functionalities using deep learning methods, and how these methods are being developed to design new sequences that are predicted to fold into specific structures and perform functions of interest.
Finally, I wish to say that the field is growing rapidly, and that the “Cambrian explosion” of AI applications in every aspect of structural biology calls for frequent discussion meetings to present the current state and future trends of the field.
The scientific organisers are planning to organise workshops on this topic regularly, and the next edition will likely be in 2025.
If you are sorry to have missed this workshop, I would recommend attending the ‘AI and biology’ workshop from 12 – 15 March at EMBL Heidelberg.
The EMBO Workshop ‘Computational structural biology’ took place from 6 – 9 December 2023 at EMBL Heidelberg and virtually.
Did you know that you can become an event reporter and receive a conference fee waiver in exchange? Find out how to do that by visiting our Become an event reporter page.