EZSpecificity combines comprehensive new enzyme-substrate docking data with a new machine learning algorithm to predict the best enzyme-substrate pairing for making a desired product, with up to 91.7% accuracy. Illinois professor Huimin Zhao led the study. Photo by Fred Zwicky.
By Liz Ahlberg Touchstone
A new AI-powered tool can help researchers determine the compatibility of an enzyme with its desired target, helping them find the best combination of enzyme and substrate for applications from catalysis to medicine to manufacturing.
Led by Huimin Zhao, a professor of chemical and biomolecular engineering at the University of Illinois Urbana-Champaign, the researchers developed EZSpecificity using new enzyme-substrate pair data and a new machine learning algorithm. They have made the tool freely available online and published their results in the journal Nature.
“If we want to make a particular product using an enzyme, we want to use the best combination of enzyme and substrate,” said Zhao, who is also the director of the NSF Molecule Maker Lab Institute and of the NSF iBioFoundry at the University of Illinois. “EZSpecificity is an artificial intelligence model that can analyze the sequence of an enzyme and then predict the best substrate that would fit that enzyme. It is very much complementary to CLEAN, the AI model we developed more than two years ago to predict an enzyme’s function from its sequence.”
Enzymes are large proteins that catalyze molecular reactions. They have pocket-like regions that their target molecules, called substrates, fit into. How well an enzyme and a substrate fit together is called specificity. The typical analogy for the interaction between enzyme and substrate is that of a lock and key: only the right key will open the lock. However, enzyme function is not that simple, Zhao said.
“It’s hard to figure out the best combination because the pocket isn’t static,” he said. “An enzyme actually changes its shape when it interacts with a substrate. It’s more of an induced fit. And some enzymes are promiscuous and can catalyze different types of reactions. This makes it very difficult to predict. That’s why we need a machine learning model and experimental data to really prove which pairing will work best.”
While other enzyme specificity models have been presented, they are limited in accuracy and in the types of enzymatic reactions that can be predicted.
Zhao’s group realized that to improve AI’s ability to predict specificity, they needed to improve and expand the dataset used to train the machine learning model. They partnered with the group led by Diwakar Shukla, a professor of chemical and biomolecular engineering at the University of Illinois. Shukla’s group conducted docking studies of different classes of enzymes to create a large database containing information not only on enzyme sequence and structure, but also on how enzymes from different classes conform around different types of substrates.
“Experiments that capture how enzymes interact with their substrates are often slow and complex, so we performed large-scale simulations to complement and extend existing experimental data,” Shukla said. “We have zoomed in on the atomic-level interactions between enzymes and their substrates. Millions of docking calculations have provided us with this missing piece of the puzzle to build a highly accurate enzyme specificity prediction tool.”
Next, the researchers tested EZSpecificity alongside ESP, the current leading model, in four scenarios designed to mimic real-world applications. EZSpecificity outperformed ESP in all four scenarios. Finally, the researchers experimentally validated EZSpecificity by testing eight halogenases, a class of enzymes that has not been well characterized but is increasingly used to make bioactive molecules, against 78 substrates. EZSpecificity achieved 91.7% accuracy for the best pairing predictions, while ESP showed only 58.3% accuracy.
“I can’t say it works for every enzyme, but for some enzymes, we’ve shown that EZSpecificity works very well indeed,” Zhao said. “We want to make this tool available to others, so we developed a user interface. Researchers can now enter a substrate and protein sequence, and then they can use our tool to predict whether that substrate could work well or not.”
Next, the researchers plan to expand their AI tools to analyze enzyme selectivity, which indicates whether an enzyme has a preference for a particular site on the substrate, to help rule out enzymes with off-target effects. They also plan to continue improving EZSpecificity with more experimental data.
