Machine learning begins to understand the human gut
The communities formed by human gut microbes can now be predicted more accurately with a new model developed in a collaboration between biologists and engineers, led by the University of Wisconsin and the University of Michigan.
The making of the model also suggests a route toward scaling from the 25 microbe species explored to the thousands that may be present in human digestive systems.
“Whenever we increase the number of species, we get an exponential increase in the number of possible communities,” said Alfred Hero, the John H. Holland Distinguished University Professor of Electrical Engineering and Computer Science and co-corresponding author of the study in the journal eLife.
“That’s why it’s so important that we can extrapolate from the data collected on a few hundred communities to predict the behaviors of the millions of communities we haven’t seen.”
While research continues to unveil the multifaceted ways that microbial communities influence human health, probiotics often don’t live up to the hype. We don’t have a good way of predicting how the introduction of one strain will affect the existing community. But machine learning, an approach to artificial intelligence in which algorithms learn to make predictions based on data sets, could help change that.
“Problems of this scale required a complete overhaul in terms of how we model community behavior,” said Mayank Baranwal, an adjunct professor of systems and control engineering at the Indian Institute of Technology, Bombay, and co-first author of the paper. He explained that the new algorithm could map out the entire landscape of 33 million possible communities in minutes, compared to the days to months needed for conventional ecological models.
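The 33 million figure is consistent with simple counting. If a community is defined only by which of the 25 species are present, there are 2^25 = 33,554,432 presence/absence combinations, or one fewer if the empty community is excluded. The short Python sketch below shows that arithmetic and how the count doubles with each added species. It is an illustration of the scaling argument, not the study's code; the function names are hypothetical, and defining a community purely by presence or absence is an assumption made here.

```python
from math import comb


def count_communities(n_species: int) -> int:
    """Count non-empty presence/absence combinations of n_species."""
    return 2 ** n_species - 1


def count_by_size(n_species: int) -> int:
    """Same count, summed over community sizes 1..n_species."""
    return sum(comb(n_species, k) for k in range(1, n_species + 1))


if __name__ == "__main__":
    # Each added species roughly doubles the number of possible communities.
    for n in (5, 10, 25):
        print(n, count_communities(n))
```

Under the same assumption, a few hundred measured communities cover only a tiny fraction of the possibilities, which is why the model has to extrapolate.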
Microbial Sim Cities
Integral to this major step was Ophelia Venturelli, an assistant professor of biochemistry at the University of Wisconsin–Madison and co-corresponding author of the study. Venturelli’s lab runs experiments with microbial communities, keeping them in low-oxygen environments that mimic the environment of the mammalian gut.
Her team created hundreds of different communities with microbes that are prevalent in the human large intestine, emulating the healthy state of the gut microbiome. They then measured how these communities evolved over time, along with the concentrations of key health-relevant metabolites, the chemicals produced as the microbes break down foods.
“Metabolites are produced in very high concentrations in the intestines,” said Venturelli. “Some are beneficial to the host, like butyrate. Others have more complex interactions with the host and gut community.”
The machine learning model enabled the team to design communities with desired metabolite profiles. This sort of control may eventually help doctors discover ways to treat or protect against diseases by introducing the right microbes.
Feedback for faster model building
While human gut microbiome research has a long way to go before it can offer this kind of intervention, the approach developed by the team could help get there faster. Machine learning algorithms are often produced with a two-step process: accumulate the training data, and then train the algorithm. But the feedback step added by Hero and Venturelli’s team provides a template for rapidly improving future models.
Hero’s team initially trained the machine learning algorithm on an existing data set from the Venturelli lab. The team then used the algorithm to predict the evolution and metabolite profiles of new communities that Venturelli’s team constructed and tested in the lab. While the model performed very well overall, some of the predictions identified weaknesses in the model performance, which Venturelli’s team shored up with a second round of experiments, closing the feedback loop.
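The train, predict, measure, retrain loop described above can be sketched in a few lines of Python. Everything in the sketch is a stand-in chosen for illustration: the model is a nearest-neighbor lookup rather than the team's neural network, `toy_lab` is a made-up function in place of a real experiment, and picking the communities least like anything already measured is an assumed heuristic, since the article does not say how weak predictions were identified.

```python
from itertools import product


def hamming(a, b):
    """Number of species whose presence differs between two communities."""
    return sum(x != y for x, y in zip(a, b))


def predict(train, community):
    """Toy model: return the measurement of the most similar known community."""
    nearest = min(train, key=lambda t: hamming(t, community))
    return train[nearest]


def least_covered(train, candidates, k):
    """Pick the k candidates farthest from every measured community."""
    def distance_to_data(c):
        return min(hamming(c, t) for t in train)
    return sorted(candidates, key=distance_to_data, reverse=True)[:k]


def toy_lab(community):
    """Hypothetical stand-in for a lab measurement of one metabolite."""
    weights = (1.0, 0.5, -0.3, 0.8)  # invented per-species contributions
    return sum(w for w, present in zip(weights, community) if present)


def feedback_round(train, candidates, k):
    """Measure the k least-covered communities and add them to the data."""
    updated = dict(train)
    for community in least_covered(train, candidates, k):
        updated[community] = toy_lab(community)
    return updated


if __name__ == "__main__":
    candidates = list(product((0, 1), repeat=4))  # all 4-species combinations
    train = {(1, 0, 0, 0): toy_lab((1, 0, 0, 0))}
    train = feedback_round(train, candidates, k=2)
    print(sorted(train.items()))
```

The point of the design is that each round of experiments is chosen by looking at where the current model is least supported by data, rather than by collecting all of the data up front.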
“This new modeling approach, coupled with the speed at which we could test new communities in the Venturelli lab, could enable the design of useful microbial communities,” said Ryan Clark, co-first author of the paper, who was a postdoctoral researcher in Venturelli’s lab when he ran the microbial experiments. “It was much easier to optimize for the production of multiple metabolites at once.”
For the machine learning algorithm, the group settled on a long short-term memory neural network, a type of model well suited to sequence prediction problems. However, like most machine learning models, the model itself is a “black box.” To figure out what factors went into its predictions, the team used the mathematical map produced by the trained algorithm. It revealed how each kind of microbe affected the abundance of the others and what kinds of metabolites it supported. They could then use these relationships to design communities worth exploring through the model and in follow-up experiments.
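One common way to probe a black-box model is to nudge each input slightly and see how the output moves. The Python sketch below does this with finite differences on a toy function. It is an assumption that something along these lines resembles the team's analysis: the article does not describe the method, and `toy_model` with its coefficients is invented here purely for illustration.

```python
def toy_model(abundances):
    """Hypothetical stand-in for a trained model.

    Maps the abundances of three species to one metabolite level.
    """
    a, b, c = abundances
    return 2.0 * a + 0.5 * b - 1.0 * a * c


def sensitivities(model, abundances, eps=1e-4):
    """Estimate how the output changes per unit change in each input."""
    base = model(abundances)
    result = []
    for i in range(len(abundances)):
        bumped = list(abundances)
        bumped[i] += eps  # nudge one species, hold the others fixed
        result.append((model(bumped) - base) / eps)
    return result


if __name__ == "__main__":
    # Positive values suggest a species supports the metabolite,
    # negative values suggest it suppresses it, at this operating point.
    print(sensitivities(toy_model, [1.0, 1.0, 1.0]))
```

In this toy case the third species has a negative sensitivity only because it interacts with the first, which is the kind of relationship such an analysis is meant to surface.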
The model can also be applied to microbial communities beyond medicine, including those that accelerate the breakdown of plastics and other materials for environmental cleanup, produce valuable compounds for bioenergy applications, or improve plant growth.
This study was supported by the Army Research Office, grant number W911NF1910269, and the National Institutes of Health, grant number R35GM124774.
Jaron Thompson, a PhD student in chemical and biological engineering at UW–Madison, analyzed the sensitivity of the model to the type of training data, providing insight into new experiments that would improve model performance. Zeyu Sun, a PhD student in electrical and computer engineering at U-M, also contributed to the machine learning model.
Hero is also the R. Jamison and Betty Williams Professor of Engineering, and a professor of biomedical engineering and statistics. Venturelli is also a professor of bacteriology and chemical and biological engineering. Clark is now a senior scientist at Nimble Therapeutics. Baranwal is also a scientist in the division of data and decision sciences at Tata Consultancy Services Research and Innovation.