Here’s a piece to start some arguing: “AI in Drug Discovery is Overhyped”, by Mostapha Benhenda. I realize that a lot of people will read that title and go “Well, yeah, sure”, but it’s definitely worth seeing some specific examples (which the post has).
Update: some of the authors involved have left detailed comments – definitely make sure to see these if you have an interest in this area.
One of these is this paper, from AstraZeneca, on using neural networks to generate molecular structures for screening. Benhenda’s complaint is that the paper spends a lot of time and effort showing how different the AI-generated structures are from a “natural” set, but little or no time showing how different they are from each other. If you’re using this to produce libraries for virtual screening (and what else would you be using it for?), then you’d want to demonstrate that internal diversity, because sampling a huge, physically-hard-to-realize spread of chemical space is one of the whole points of virtual screening. (Benhenda himself has more detailed objections here).
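To make that distinction concrete, here’s a toy sketch (mine, not anything from the paper) of what an internal-diversity check looks like: treat each “molecule” as a set of structural features – a crude stand-in for a real chemical fingerprint – and average the pairwise Tanimoto distances across the library. A library of near-duplicates scores low even if every member is far from the reference set.

```python
from itertools import combinations

def tanimoto(a, b):
    """Tanimoto similarity between two feature sets (|A∩B| / |A∪B|)."""
    return len(a & b) / len(a | b) if (a | b) else 1.0

def internal_diversity(fingerprints):
    """Mean pairwise Tanimoto *distance*: 0 = all identical, 1 = all disjoint."""
    pairs = list(combinations(fingerprints, 2))
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)

# Toy "libraries" (feature strings are illustrative, not real descriptors):
clones = [{"C=O", "c1ccccc1"}, {"C=O", "c1ccccc1"}, {"C=O", "c1ccccc1", "N"}]
varied = [{"C=O", "N"}, {"c1ccccc1", "S"}, {"Cl", "C#N"}]

print(internal_diversity(clones) < internal_diversity(varied))  # True
```

The point of the sketch is just that “different from the natural set” and “different from each other” are separate measurements, and only the second one tells you whether the generator is actually exploring.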
The second paper he’s looking at is from a group at Harvard, using “Generative Adversarial Networks”. As I understand it, this is a technique where the output of one network gets critiqued by another one – in this case, downranking structures that get too strange-looking – to try to improve the whole process. But it appears (as with the AZ work) that the molecules don’t get compared to each other very much, and (as Benhenda dug more into the work) that the second network seems to spend most of its time penalizing whatever comes out of the first one, which indicates that something is not quite right.
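For readers who haven’t seen the adversarial setup before, here’s a deliberately tiny one-dimensional sketch (again mine, with hand-coded gradients – nothing to do with the Harvard paper’s actual architecture): a generator maps noise to samples, a logistic discriminator learns to tell them from “real” ones, and each update pushes against the other. If training is healthy, the generator’s output drifts toward the real distribution and the discriminator ends up roughly unable to separate the two – the failure mode Benhenda describes is one where the critic just keeps winning.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_gan(steps=2000, lr=0.05, seed=0):
    """One-sample SGD on a scalar GAN: G(z) = a*z + b, D(x) = sigmoid(w*x + c)."""
    random.seed(seed)
    a, b = 1.0, 0.0   # generator parameters
    w, c = 0.0, 0.0   # discriminator parameters
    for _ in range(steps):
        z = random.gauss(0.0, 1.0)
        x_real = random.gauss(4.0, 0.5)   # the "natural" distribution
        x_fake = a * z + b
        # Discriminator step: ascend log D(real) + log(1 - D(fake)).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        w += lr * ((1 - d_real) * x_real - d_fake * x_fake)
        c += lr * ((1 - d_real) - d_fake)
        # Generator step: descend -log D(fake), i.e. try to fool the critic.
        d_fake = sigmoid(w * (a * z + b) + c)
        grad = (1 - d_fake) * w   # chain rule through the fake sample
        a += lr * grad * z
        b += lr * grad

    return a, b

a, b = train_gan()
print(round(b, 2))  # since E[z] = 0, b is the generator's mean output
```

Even in this toy, the two updates are in genuine tension, which is why a discriminator that penalizes essentially everything the generator produces is a red flag rather than a sign of rigor.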
He then goes to a third recent example, from Vijay Pande’s group at Stanford. Some of that has come up here on this blog before, with mixed reviews. This paper is related to the MoleculeNet project, which is being funded by Andreessen Horowitz, whose moves into biopharma I’ve written about as well. I won’t get into the details of this one, but the criticism is basically that the work described seems (to Benhenda) to be both lacking in depth and part of a move to make a particular format/data standard (DeepChem) the default for the field, as opposed to others that could be as useful or more so. I have no idea whether that’s true, but I would be interested in hearing from practitioners on both sides of the issue.
Now, to be sure, Benhenda himself is selling something – the services of an outfit called Startcrowd, which he touts as an independent way to evaluate AI claims in chemistry and drug discovery. And I’m not qualified to evaluate them, either, but his claims about these recent papers can be addressed independently of whether you want to hire someone else. So here are the questions: first, are these recent papers representative of the field? Second, how well-founded are the objections to them? Third, if these are indeed problematic, what work should people be looking at instead to get a higher-quality read on what’s going on?
I am far from being an expert in this area, but I’m also very much interested in learning about it and keeping an eye on it. The whole AI/machine learning field is something that I think we should all be watching, because it has the potential to be both wildly helpful and wildly disruptive, and it would behoove us to be ready for what might happen. I doubt very strongly that I’m going to turn into a neural-network programmer, but I don’t want to just ignore all that stuff, either, because it could change very drastically by the next time I get around to paying attention!