What algebraic topology could frame thy fearful symmetry?

Proteins are the workhorses of the cell. They are required for the structure, function and regulation of the body’s tissues and organs. Although half of all proteins in modern cells are symmetric complexes, less is known about them because they are more difficult to work with, both experimentally and computationally. Bruce Donald is the PI on a new NIH R01 grant that aims to develop tools that make these symmetric proteins easier to work with.

Symmetric proteins have two or more components – called subunits – that fit together to form a complex. Like synchronized swimmers, these subunits are identical in shape, but they are translated and rotated to form symmetrical patterns. And they are fearful! Symmetric proteins coat the outside of viruses like HIV, Zika and Ebola. They also form channels in cell membranes and are the target of many modern drugs.
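To make the geometry concrete, here is a minimal sketch of how identical subunits can be arranged into a ring by rotation alone. It assumes the simplest case, cyclic symmetry about the z-axis, and uses made-up coordinates; the function names `rotate_z` and `build_cyclic_complex` are hypothetical and are not taken from the Donald Lab's software.

```python
import math

def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def build_cyclic_complex(subunit, n):
    """Return n copies of `subunit`; copy k is rotated by 2*pi*k/n about z."""
    return [
        [rotate_z(p, 2 * math.pi * k / n) for p in subunit]
        for k in range(n)
    ]

# Toy subunit: three invented atom positions (not real protein data).
subunit = [(5.0, 0.0, 0.0), (6.0, 1.0, 0.5), (7.0, 0.0, 1.0)]

# A three-subunit ring: every copy has the same shape, only its placement differs.
trimer = build_cyclic_complex(subunit, 3)
```

Each copy keeps the same internal distances as the original subunit, which is what "identical in shape, but translated and rotated" means in practice.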

Typically, researchers use three techniques to determine the 3D molecular architecture of proteins: X-ray crystallography, nuclear magnetic resonance (NMR) and cryo-electron microscopy. Each technique, though, has some drawbacks. Proteins can be difficult to crystallize for X-ray diffraction. This is particularly true of membrane proteins, which sit in the cell membrane. NMR is difficult to use with larger proteins and can provide ambiguous measurements. Cryo-electron microscopy is best used for very large proteins, and many symmetric proteins are too small. In all of these techniques, the measurements are indirect, so researchers have to make inferences.

NMR distance measurements of symmetric proteins are inherently ambiguous. Because the subunits are identical, a measured distance between two atoms could lie within one subunit or span two different subunits. These ambiguities lead to multiple ways of assembling the subunits of a protein. To build models that fit the data, Donald and his team started developing new algorithms based on algebraic topology that better handle the challenges of studying symmetric proteins.
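The ambiguity can be illustrated with a toy calculation. The sketch below assumes a ring-shaped complex with cyclic symmetry about the z-axis and invented coordinates; `consistent_assignments` is a hypothetical helper written for this illustration, not the team's algorithm. It asks which placements of the second atom, in the same subunit or in a neighboring copy, are compatible with a measured upper bound on the distance.

```python
import math

def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def consistent_assignments(atom_a, atom_b, n, max_dist):
    """Return the subunit offsets k that satisfy the distance bound.

    atom_a is taken from subunit 0; atom_b is taken from subunit k,
    so k = 0 means both atoms sit in the same subunit.
    """
    hits = []
    for k in range(n):
        b_in_copy_k = rotate_z(atom_b, 2 * math.pi * k / n)
        if math.dist(atom_a, b_in_copy_k) <= max_dist:
            hits.append(k)
    return hits

# Invented dimer example: both readings fit, so the measurement is ambiguous.
ambiguous = consistent_assignments((2.0, 0.0, 0.0), (0.0, 2.0, 0.0), 2, 5.0)
# → [0, 1]

# A tighter pair of atoms: only the within-subunit reading fits.
unambiguous = consistent_assignments((2.0, 0.0, 0.0), (2.5, 0.0, 0.0), 2, 1.0)
# → [0]
```

When more than one offset survives, the same measurement supports more than one way of assembling the complex, which is the situation the paragraph above describes.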

They began studying diacylglycerol kinase (DAGK), a drug target in bacterial cell walls, and found two structure models. One had previously been solved using X-ray crystallography, and one using NMR. The structures appeared vastly different. “They can’t both be right,” Donald said. “So, first we wanted to see if we could make the NMR data match what we found with the X-ray crystallography model.”

It turns out they could, which led to another question: How many other models are out there? By making small mathematical changes to the protein structure diagrams, they could generate more folds and test each one against the data. They generated all of the possible structures and analyzed which ones fit best. In addition to recovering the NMR and crystal structures, they found other folds that are predicted to fit the data even better.

Donald worked with Jeff Martin, a software engineer and computer science Ph.D. student (’14) in the Donald Lab, to develop these methods. “We applied them in our labs, and they looked promising,” Donald said, “so we wanted to see how widespread this is.” They took a case from the literature where there was disagreement. “This could actually be a very big problem for experimental protocols, so we are proposing to fix this problem in our grant.” The team will develop algorithms that work for any symmetric protein and that also apply to particular symmetric systems of interest in virology and immunology, like the Zika virus and HIV.

“Mathematical symmetry is important in how biology structures the interactions between molecular partners, and we can use topology to understand those interactions and analyze the data that tells us what the structures are,” Donald said.

This research is supported by National Institutes of Health grant R01GM118543. Other collaborators include Leonard Spicer, Ph.D.; Pei Zhou, Ph.D.; Hashim Al-Hashimi, Ph.D.; Scott Schmidler, Ph.D.; and Jeffrey Hoch, Ph.D.
