Studying how the environment impacts human health
By Eleonora Mastrorilli, MSc, Scientific Programmer, Zimmermann Group, EMBL Heidelberg
The human gut microbiome plays a critical role in human health and disease through its metabolic interactions with the host and with environmental factors, including pharmaceuticals. Although several studies have accumulated evidence of microbiome-mediated drug biotransformation at scale, producing such high-throughput screens requires a huge effort, which limits their scalability. The IMPACT project aims to predict and prioritize gut microbiome–drug interactions by leveraging untargeted metabolomics and machine learning.
As a pilot test, we assessed the utility of Genome-Scale Metabolic Networks (GSMNs) as a framework to reduce the complexity of untargeted metabolomics data acquired from gut bacteria under physiological conditions, while also providing annotations for the retained features. To this end, a strain-specific metabolite library was constructed using the AGORA2 resource [1], encompassing 1,748 unique metabolites mapped to physiological metabolic pathways. By using this library for MS1 annotation of our data, we reduced the feature space 25-fold and enhanced the biological interpretability of the retained data.
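To illustrate how library-based MS1 annotation reduces the feature space, here is a minimal sketch. It is not the IMPACT pipeline. It assumes positive-mode [M+H]+ adducts and a 5 ppm mass tolerance, and the function name annotate_ms1, the three-metabolite library, and the feature list are all hypothetical stand-ins for the AGORA2-derived library and the measured data.

```python
# Minimal sketch of MS1 annotation by accurate-mass matching.
# Assumptions (not from the article): [M+H]+ adducts only, 5 ppm tolerance,
# and a toy library standing in for the AGORA2-derived metabolite library.

PROTON_MASS = 1.007276  # mass of a proton in Da, added for the [M+H]+ adduct


def annotate_ms1(feature_mzs, library, ppm_tol=5.0):
    """Return {feature m/z: [matching metabolite names]} for features with a hit.

    Features without any library match within ppm_tol are dropped, which is
    what shrinks the feature space.
    """
    annotations = {}
    for mz in feature_mzs:
        hits = []
        for name, mono_mass in library.items():
            expected_mz = mono_mass + PROTON_MASS
            ppm_error = abs(mz - expected_mz) / expected_mz * 1e6
            if ppm_error <= ppm_tol:
                hits.append(name)
        if hits:
            annotations[mz] = hits
    return annotations


# Hypothetical library: metabolite name -> monoisotopic neutral mass (Da)
library = {
    "D-glucose": 180.06339,
    "L-tryptophan": 204.08988,
    "butyric acid": 88.05243,
}

# Hypothetical measured MS1 features (m/z values)
features = [181.0707, 205.0972, 150.1234, 89.0597, 300.2000]

annotated = annotate_ms1(features, library)
for mz, names in annotated.items():
    print(mz, names)
print(f"kept {len(annotated)} of {len(features)} features")
```

In this toy case, two of the five features have no library match and are discarded. A real workflow would also have to handle multiple adducts, isotopes, and isobaric metabolites that cannot be distinguished at the MS1 level.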
We then used the annotated data to test predictive modeling of bacterial drug metabolism using different machine learning algorithms. Microbial drug metabolism was encoded as a binary outcome (yes/no) describing whether the drug is metabolized by the strain in an exposure assay, according to the criteria defined in [2]. Three classification techniques (Random Forest, Support Vector Machine, and Partial Least Squares Discriminant Analysis) were used, and model performance was assessed using accuracy and the Matthews correlation coefficient (MCC).
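The two performance metrics can be computed directly from the confusion matrix of a binary classifier. The sketch below is illustrative only: the labels are invented, and the helper names are hypothetical. It shows why MCC is a useful complement to accuracy when one outcome is rarer than the other. Whether this applies to a given drug depends on how many strains metabolize it.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives/negatives for binary labels (1 = metabolized)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of strains whose label was predicted correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Invented example: 10 strains, 2 of which metabolize the drug.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_no = [0] * 10                          # trivial model: never predicts metabolism
some_signal = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]  # finds one true metabolizer, one false alarm

print(accuracy(y_true, always_no), mcc(y_true, always_no))
print(accuracy(y_true, some_signal), mcc(y_true, some_signal))
```

Both sets of predictions reach the same accuracy, but only the second has a non-zero MCC, because MCC rewards correct predictions in both classes.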
We tested this approach on a set of 50 gut bacterial strains and predicted their ability to biotransform 31 drugs.
Overall, we determined:
Investigation of the most important features showed overlaps, although limited, between all three classifiers for a given drug, as well as between drugs within the same classifier.
Overall, our attempt to link the strain-specific metabolic fingerprint under physiological conditions to the ability of a strain to convert a specific drug demonstrated the potential of machine learning models to predict bacterial drug metabolism. For instance, the drug famciclovir showed good modelling performance (in terms of both accuracy and MCC), independently of the ML model chosen.
Looking Ahead: Expanding Beyond GSMNs
While GSMNs proved valuable in this pilot as a feature reduction approach, they come at the cost of largely discarding both unannotated features and strains for which no metabolic model is available. Therefore, based on the results of the IMPACT project, we will seek to integrate the full untargeted metabolomics dataset to explore the metabolic landscape of gut microbes.
[1] Heinken, A., Hertel, J., Acharya, G. et al. Genome-scale metabolic reconstruction of 7,302 human microorganisms for personalized medicine. Nat Biotechnol 41, 1320–1331 (2023). https://doi.org/10.1038/s41587-022-01628-0
[2] Zimmermann, M., Zimmermann-Kogadeeva, M., Wegmann, R. & Goodman, A. L. Mapping human microbiome drug metabolism by gut bacteria and their genes. Nature 570, 462–467 (2019).