This new paper shows one reason why it’s so tricky to calculate compound binding in an active site (which is what we’d want to do in order to do effective in silico virtual screening). The authors (a multicenter team from York, Demuris, Vernalis, and St. Jude) are looking at a virulence protein from H. influenzae called SiaP. It binds sialic acid, but it also binds a whole set of water molecules that can be seen in the X-ray data. These are arranged into a hydrogen-bonded network that bridges the protein surface and the ligand, and while this is an unusually extensive example, this sort of thing is quite common.
Even single water molecules can have a big influence, and to make things more amusing, they can show up with either an enthalpic or entropic effect (or both, why not) and with either positive or negative influence on the overall free energy of binding. You’d think that something the size of a water molecule would be only a pawn-level piece on the binding-pocket chess board, and sometimes that’s all they are. But they can suddenly turn into rooks and queens depending on the situation (not a bad analogy, considering that their hydrogen bonds are so directional) and they can at the same time turn into either white pieces or black ones. Which situation would make it rather hard to write a useful chess-playing program, and it makes it hard to write a useful binding-prediction one, too.
At right are two X-ray structures of N-acetylneuraminic acid (Neu5Ac) bound to SiaP – the left one is the native protein, and you can count ten water molecules involved in the binding. On the right, though, is a point mutant (A11N, alanine to asparagine, down at the bottom of the figure). Most of the water molecules are still present, and in the same place. But the network around water #3 has been disrupted. And now is the time to mention that the Kd of Neu5Ac with the wild-type SiaP is 30 nM, while the Kd with the mutant form is about 42,000 nM. You lose 1400x binding affinity with that one distortion of the water hydrogen-bonding network. Otherwise, the positions of the protein residues and of the ligand itself are within experimental error between the two structures.
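For a sense of scale, that fold-change can be converted into a free energy difference with the standard relation ΔΔG = RT ln(Kd,mutant / Kd,wild-type). Here is a minimal sketch of the arithmetic; the temperature of 298.15 K is my assumption (the assay temperature isn't given here), and the function name is just for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ddg_from_kd(kd_ref_nm, kd_new_nm, temp_k=298.15):
    """Free energy change (kcal/mol) on going from kd_ref to kd_new.

    Positive values mean weaker binding. temp_k is an assumed
    temperature, not a value taken from the paper.
    """
    return R_KCAL * temp_k * math.log(kd_new_nm / kd_ref_nm)

fold = 42_000 / 30                 # Kd values quoted above, in nM
ddg = ddg_from_kd(30, 42_000)
print(f"{fold:.0f}-fold weaker, ddG = {ddg:.1f} kcal/mol")
# prints "1400-fold weaker, ddG = 4.3 kcal/mol"
```

So under that assumption the single mutation costs a bit over 4 kcal/mol of binding free energy.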
This brings to mind another effect that will be familiar to fragment-based drug discovery types. If you look at a series of X-ray structures of different fragments against the same target and ask someone to rank-order their binding affinities just from those, they can’t do it. In some cases you might be able to pick out the best one, if it makes a notably different set of hydrogen bonds or other interactions, but if they’re all roughly the same binding mode you can generally forget it. The differences are too subtle, and they’re too subtly distributed between entropic and enthalpic effects on the ligand, the associated water molecules, and the protein structure itself.
If you’d shown me those two structures above with no other information, I probably would have guessed that the one on the left would be better, because most of the time another hydrogen bonded water interaction is a good thing. But not always: that bound water #4 very likely helped the entropic term when it left its structured state, for example, and sometimes that wins out over the enthalpic effects. And I certainly wouldn’t have guessed a 1400x drop in affinity!
One interesting aspect of this paper is that the crystal structures of both species were collected at cryogenic temperature and at room temperature, and the latter revealed more disorder in the mutant Asn residue on warming. That’s another factor that not everyone appreciates when they look at an X-ray structure: you’re generally seeing the lowest frozen well of the energy landscape, and that may or may not be as informative as you’d want about the situation at RT (which is where you’re probably running your binding assays!). We have a hard enough time understanding static structures, and adding in dynamic effects just makes things even messier, but things do tend to move around out here in the real world. The movement in the Asn side chain complicated the authors’ attempts to work out a thermodynamic model, and this is actually a pretty well-behaved system compared to many others that you’ll see.
At any rate, the more detailed thermodynamic study laid out in the paper indicates that the majority of the energy change is indeed due to disruption of the water network, and there’s a strong contribution from the enthalpy term. As the paper shows, if you go through the literature looking at the estimates of enthalpic and entropic energy changes for desolvation of a single water molecule, you get numbers all over the place, and the differences between them are all of a magnitude to show up very strongly as differences in binding affinity. It looks like this case agrees more with the high side of those estimates, but anyone who’s in love with virtual screening and computational drug design should consider the facts that (A) these values can be so high and so meaningful to binding, and (B) the best experts in the field are still disagreeing about what those numbers are. It’s quite likely that trying to assign a single average value to such things isn’t too useful anyway, given the wide variety of water molecule environments, but that’s not helping to simplify anyone’s life, is it?
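To see why disagreements of that size matter, it helps to run the free energy relation the other way: an error of ΔΔG in a per-water estimate turns into a factor of exp(ΔΔG / RT) in the predicted Kd. The sketch below uses made-up error sizes purely for illustration (they are not values from the paper or the literature), and again assumes 298.15 K.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def fold_change(ddg_kcal, temp_k=298.15):
    """Factor by which Kd shifts for a free energy error of ddg_kcal.

    temp_k is an assumed temperature, not taken from the paper.
    """
    return math.exp(ddg_kcal / (R_KCAL * temp_k))

# Illustrative error sizes in kcal/mol (hypothetical, not literature values)
for err in (0.5, 1.0, 2.0, 3.0):
    print(f"{err:.1f} kcal/mol off -> Kd off by ~{fold_change(err):.0f}x")
```

At room temperature each 1.36 kcal/mol or so is a factor of ten in Kd, so being off by a couple of kcal/mol on even one water molecule is enough to scramble a rank-ordering.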