Cryptic binding sites: now there’s a puzzle for you. When you look at a protein structure, even if you know nothing about its function, you can usually spot small-molecule binding sites without too much trouble. They tend to be pocket-like folds, often with particular polar motifs. (If the protein is an enzyme, the binding site/active site determination is even easier, in most cases, since there will be functional residues involved there as well.) Protein-protein binding surfaces are harder, but there are still a number of standard patterns that you can look for.
But there are other binding sites that take people by surprise, often because the protein changes its conformation to cause them to appear. That’s not something that you’re going to be able to eyeball, and it can turn into a major computational problem, too, depending on the protein. Those trees of possible/plausible conformations can branch out pretty wildly, and each leaf on each one of them has to be searched for something that looks like a believable binding site.
There have been several attempts at this problem over the years, and here’s the latest, from a large multicenter academic team. A combination of methods is used simultaneously: sequence conservation, molecular dynamics, fragment docking, etc. Their protocol (“CryptoSite”) seems to accurately predict many non-obvious binding sites that have been shown to exist in different proteins, and if it really is working, it also predicts that there are a lot more of them waiting to be targeted. Looking at a set of about 4000 proteins, 1420 of which are known to be disease-related, they find a high proportion of cryptic binding sites:
In contrast to pockets, cryptic sites were predicted in 72% of the disease-associated proteins, 38% of which have no apparent pockets. However, some of the predictions may be false positives (the sites may in fact not bind any ligands). Moreover, for some sites, it may be very difficult to find a ligand (even if it does exist), and even if the ligand is found, it may not be a drug because it does not target the disease-modifying function of a protein or because it does not meet clinical development criteria. Nevertheless, the prediction of cryptic sites on the disease-associated proteins of known structure indicates that small molecules might be used to target significantly more disease-associated proteins than were previously thought druggable.
That immediately brings up the question, though, of how come we don’t see these things in high-throughput screens very often. Here’s how the authors deal with that one:
It has been shown that small-molecule libraries are biased toward traditional drug targets, such as G-protein-coupled receptors, ion channels, and kinases, while they are not as suitable for antimicrobial targets and those identified from genomic studies. It is conceivable that the existing libraries are also less suitable for cryptic sites. Moreover, cryptic sites may tend to bind ligands more weakly than binding pockets due to the need to compensate for the free energy of site formation and may thus be ranked lower on the high-throughput screening lists. Therefore, different approaches based on larger and more diverse chemical libraries, including small fragments, peptides, peptidomimetics, and natural products, may be needed for more efficient discovery of cryptic site ligands.
Well, that’s a big open question in drug discovery, and those are some of the traditional answers, which may be right. Chemical space is very large indeed, and any given compound library is only going to cover a tiny amount of it, especially since the compound libraries we use are already biased by the kinds of starting materials we can get and the kinds of transformations we can run on them. It’s entirely possible that “nontraditional” targets are still waiting for their princes to ride up for them, but at the same time, this argument has always made me uneasy. Maybe the reason we don’t find hits for Targets Like That is because we just haven’t gotten to the chemical matter that Targets Like That want to see, but that same reasoning can be applied (and misapplied) pretty freely. I’ve never tripped over a bar of gold while walking around, but is that because I just haven’t gone down the right street yet? Given what I know (my priors, in Bayesian terms) about the chances of gold bars on the sidewalk, I think I can reject that hypothesis. But I don’t have enough information about the number of good screening hits for tough targets, so my prior is not very useful, and I know that the number of possible streets to walk down (and their variety) is very large. And that’s why this question is still open, and why that answer to it is still at some unknown point on the scale running from “Silly Evasion” to “Perfectly Logical”.
What I can tell you, though, is that people do get impatient with the conclusion that gosh, the screening library must just be inadequate, because we get no hits (again). I think that the fragment screening aspect is key here. If you can come up with fragment hits (or even plausible fragment docking, as nervous as I am about that approach), then it does seem to strengthen the case that your weirdo binding site really is a potential binding site, and you just haven’t given it what it wants, you lazy dog. But if you can run a good, solid fragment screen and come up totally dry, that (to me) is a good argument that this “site” is just not going to bind things – or is at least as close to that situation as you need it to be to walk away.
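To put rough numbers on that intuition, here is a back-of-the-envelope Bayesian sketch in Python. Everything in it is an assumption made up for illustration: the 50/50 prior, the per-compound hit probabilities, the library sizes, and the function name `posterior_ligandable`. None of it comes from the CryptoSite paper. The model is deliberately crude: if the site can bind something, each screened compound independently has some small chance of showing up as a hit, and if it can't, the screen always comes up dry.

```python
def posterior_ligandable(prior, per_compound_hit_prob, n_screened):
    """P(site is ligandable | a screen of n_screened compounds gave zero hits).

    All inputs are illustrative assumptions, not measured values.
    """
    # Likelihood of a dry screen if the site really can bind something:
    # every one of the n compounds independently fails to register as a hit.
    p_dry_if_ligandable = (1.0 - per_compound_hit_prob) ** n_screened
    # Assumption: a site that binds nothing always gives a dry screen
    # (false positives are ignored).
    p_dry_if_not = 1.0
    numerator = prior * p_dry_if_ligandable
    return numerator / (numerator + (1.0 - prior) * p_dry_if_not)


prior = 0.5  # hypothetical: no strong opinion about the site beforehand

# Hypothetical fragment screen: small library, but fragments are
# promiscuous, so assume a 1% chance per fragment of hitting a real site.
fragment = posterior_ligandable(prior, per_compound_hit_prob=0.01, n_screened=1000)

# Hypothetical HTS deck biased toward traditional targets: huge library,
# but assume only a one-in-a-million chance per compound for this odd site.
hts = posterior_ligandable(prior, per_compound_hit_prob=1e-6, n_screened=500_000)

print(f"after a dry fragment screen: {fragment:.5f}")
print(f"after a dry HTS campaign:    {hts:.5f}")
```

Under these made-up inputs, the dry fragment screen drives the posterior well below 1%, while the dry HTS campaign only pulls it down from 50% to somewhere in the high 30s. The whole argument lives in that per-compound hit probability, of course, which is exactly the number nobody knows for tough targets.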
I actually hope that the CryptoSite people are right, and that there are a lot more binding sites than we realize. Anything that expands the field of action like that would be welcome, and I look forward to seeing how this holds up now that it’s out there in the real world.