Peter Kenny has a paper out on ligand efficiency that’s required reading for medicinal chemists using (or thinking about) that concept as a design tool. I’d recommend reading it with this recent paper – between the two of them, you’re going to have references to a huge swath of the literature on how to measure drug optimization, and you’ll emerge with a strong background on just how hard that is to do.
Broadly put, LE is (or is supposed to be) a way to think about binding of a molecule to its target, adjusted for the size of the molecule. At that level, things aren’t really controversial: there are ligands that do a better job of binding to their targets because they involve more of their own structures in various interactions. Imagine some compound with a molecular weight of 230 (or the equivalent number of heavy atoms, etc.) that has a binding constant of 1 micromolar versus its protein target. Now imagine another molecule, also with 1 micromolar binding, with a molecular weight of 600. It seems clear that the smaller molecule is using its structure more thoroughly, whereas the larger one would seem to have a good deal of its structure not involved very productively at all. If there aren’t other good reasons to keep all that stuff, why should you? Don’t add things to your compound’s structure unless you’re getting some sort of return on them. This we can agree on.
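The thought experiment above can be put into numbers. Below is a minimal sketch using the commonly cited LE definition (1.37 × pKd divided by heavy atom count, in kcal/mol per heavy atom at 298 K); the heavy-atom counts are rough estimates from molecular weight (~13.3 Da per heavy atom for drug-like molecules, a heuristic rather than a rule), and the compounds themselves are hypothetical:

```python
# Back-of-the-envelope LE comparison for the two hypothetical compounds above.
# Common definition: LE = 1.37 * pKd / HAC (kcal/mol per heavy atom),
# where 1.37 = 2.303 * RT at 298 K and HAC is the heavy (non-hydrogen) atom count.

def ligand_efficiency(pKd: float, heavy_atoms: int) -> float:
    """LE in kcal/mol per heavy atom at 298 K."""
    return 1.37 * pKd / heavy_atoms

pKd = 6.0  # both compounds bind at 1 micromolar
small = ligand_efficiency(pKd, heavy_atoms=17)  # MW ~230 -> ~17 heavy atoms (estimate)
large = ligand_efficiency(pKd, heavy_atoms=45)  # MW ~600 -> ~45 heavy atoms (estimate)

print(f"MW ~230 compound: LE = {small:.2f} kcal/mol per heavy atom")
print(f"MW ~600 compound: LE = {large:.2f} kcal/mol per heavy atom")
```

By the often-quoted rule of thumb of ~0.3 kcal/mol per heavy atom, the smaller compound (LE ≈ 0.48) looks fine and the larger one (LE ≈ 0.18) looks wasteful, which is exactly the intuition the metric is meant to capture.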
The problem, as Kenny shows, is the way that most of us define LE. He says that it’s “a good concept that is poorly served by a bad metric”. The problem is that most of those definitions have a log function of some sort in them that causes trouble (as pointed out several years ago, responses to that here and here), and this makes some of the LE definitions play rather loose with the math, and/or causes the ligand efficiency numbers themselves to depend hugely on how you define the drug concentrations. The standard ligand efficiency metric, indeed, assumes an arbitrary concentration as a starting point to make the numbers come out “right”, and there’s just no thermodynamic basis for this. As Kenny observes, acidly but appropriately, “In thermodynamic analysis, a change in perception resulting from a change in a standard state definition would generally be regarded as a serious error rather than a penetrating insight.” Later on in the paper, after showing through example how LE varies with the units chosen for concentration, he puts it this way: “A physical quantity that is expressed in different units is still the same quantity. If perception changes when a quantity is expressed using a different unit then neither the change in perception nor the quantity itself can be regarded as physically meaningful.”
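That standard-state dependence can be made concrete with a toy calculation. The two compounds below are hypothetical; the only real input is the thermodynamic relation ΔG° = RT·ln(Kd/C°), which ties the free energy of binding (and hence any per-atom version of it) to an arbitrary choice of standard concentration C°:

```python
# Sketch of Kenny's unit-dependence argument: LE = -dG/HAC, with
# dG = RT * ln(Kd / C0), depends on the arbitrary standard concentration C0.
import math

RT = 0.593  # kcal/mol at 298 K

def le(kd_molar: float, heavy_atoms: int, std_conc_molar: float) -> float:
    """Per-heavy-atom binding free energy, relative to standard state C0."""
    return -RT * math.log(kd_molar / std_conc_molar) / heavy_atoms

# Hypothetical pair: a weak fragment-sized binder vs. a potent larger lead.
frag = dict(kd=1e-4, atoms=10)   # 100 uM, 10 heavy atoms
lead = dict(kd=1e-9, atoms=40)   # 1 nM, 40 heavy atoms

for c0 in (1.0, 1e-4):           # standard state: 1 M vs. 100 uM
    le_frag = le(frag["kd"], frag["atoms"], c0)
    le_lead = le(lead["kd"], lead["atoms"], c0)
    winner = "fragment" if le_frag > le_lead else "lead"
    print(f"C0 = {c0:g} M: LE(frag) = {le_frag:.2f}, "
          f"LE(lead) = {le_lead:.2f} -> {winner} looks more efficient")
```

With C° = 1 M the fragment looks like the more efficient binder; shift the standard state to 100 µM and the ranking flips, even though nothing about either compound has changed. That ranking flip is the “change in perception” Kenny is objecting to.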
And here’s where we look under the hood of property-based drug design. You can see some of its assumptions in my paragraph above – the idea (to quote Kenny) is “balancing the risk associated with poor physicochemical characteristics against the risk of not being able to achieve the necessary level of affinity”. Both of these risks are necessarily rather hard to get a handle on, but that hasn’t stopped people from trying (see that 2013 blog in the paragraph above). And the whole situation has been muddied thoroughly by the “if it can be measured it will be managed” phenomenon, a problem that’s found in more places than just the drug industry. Over the years, too many people have seized on the idea of measuring and calculating their way to drug-discovery success – crudely, “Just fix X and you’ll be OK”, where X is any number of physical properties or structural characteristics. By this time, it should be clear that there is no X that fits a “Just fix X” mentality. One hates to fall back on saying “It’s too complicated for that”, but you know, it really is too complicated for that.
Drug design guidelines are typically based on trends observed in data and the strengths of these trends indicate how rigidly guidelines should be adhered to. While excessive molecular size and lipophilicity are widely accepted as primary risk factors in design, it is unclear how directly predictive they are of more tangible risks such as poor oral absorption, inadequate intracellular exposure and rapid turnover by metabolic enzymes. This is an important consideration because the strength of the rationale for using LE depends on the degree to which molecular size is predictive of risk.
That’s very sensibly put. Sometimes the trends we can measure are useful predictors, and other times they aren’t. And even the things that we tend to think are useful a greater part of the time (such as molecular weight and lipophilicity) have real problems as predictive tools. But if these things aren’t predictive on some given project, why use ligand efficiency as a tool at all? And why do we think that those properties are useful in general? Kenny lays down the challenge: “Drug designers should not automatically assume that conclusions drawn from analysis of large, structurally-diverse data sets are necessarily relevant to the specific drug design projects on which they are working.”
I’m getting mildly infamous for a talk that I give which compares the workings of drug discovery to those of Wall Street, but in this case it’s impossible not to be reminded of the investing classic “Where Are the Customers’ Yachts?”, a book that is a rich concentration of good sense. In it, Fred Schwed reviews a few of the classic ideas about stock market prices and concludes that “All of these theories are true part of the time; none of them true all of the time. They are, therefore, dangerous, though sometimes useful.” This applies word-for-word to the use of compound metrics for drug discovery. And the provocative line above is powerfully reminiscent of one from the 1930s fund manager John W. Pope, whom Schwed quotes: “It is the belief of the management of this corporation that a diversified list of carefully selected securities, held over a period of time, will not increase in value.”
Both sound like heresy, but they should be given a careful hearing. One of Kenny’s points is that drug discovery is too various to make broad averages of behavior broadly useful. There are too many special considerations in this business that can override the rules of thumb – a situation that is not improved by the way that this always sounds like special pleading when it’s invoked.
There’s also a solidly argued section in the paper that goes into the problems of using LE to try to break down the contributions of individual parts of a molecule to its overall affinity. Read the paper, but what you’ll find is that this dream – and it is a dream, breaking everything down into independent pieces this way – is generally unworkable. Problems of stoichiometry and non-local effects (one end of the molecule affecting another in indirect ways) confound this approach. Not that it keeps people from trying it, under many guises – it’s just too appealing.
What about all the papers that talk about using ligand efficiency metrics for guiding a project along? Here’s Kenny’s take on that, and anyone familiar with the med-chem literature from the inside may well cringe in recognition:
. . .a depiction of an optimization path for a project that has achieved a satisfactory endpoint is not direct evidence that consideration of molecular size or lipophilicity made a significant contribution toward achieving that endpoint. Furthermore, explicit consideration of lipophilicity and molecular size in design does not mean that efficiency metrics were actually used for this purpose.
By the time a project is written up in J. Med. Chem. or wherever, the story you’re reading is likely not quite the story as it happened. The delays in industrial publication are one factor – people have left, and even without that, it can be hard to reconstruct the order that things happened in, which ideas came from where, why certain things lasted as long as they did or why others weren’t realized earlier. Every project looks that way on close inspection, when such inspection is possible. (Amateur astronomy analogy: every reflecting telescope’s mirror looks filthy if you shine a flashlight across it at a low angle, even if it’s perfectly serviceable). So while you should be able to believe the data, believing the rationales advanced along with those data is a riskier move. This goes (as the paper demonstrates) for many fragment-based drug discovery papers that use LE to tell their story. Kenny’s fine with the idea behind using fragments as starting points (as am I!) and he’s fine with the idea of trying to make additions to a fragment’s structure prove their worth along the way. He’s just saying that LE is not the way to do it.
Well, my overall advice is “read the paper”! You’ll find some provocative (but difficult to refute) statements, a thorough review of the literature, and plenty of food for thought. It looks like a thorough re-evaluation of the whole idea of measured compound metrics is underway, now with years of evidence behind it, and we need to decide if we’re getting enough utility out of these things or not. The answer at this point, frankly, looks like “not”.