I have wondered several times around here about how (and if) IBM’s Watson platform is going to be able to help out with drug discovery, and it looks like we may be able to find that out. Pfizer has signed up with IBM to use the Watson technology in its immuno-oncology research. Here we go:
Watson for Drug Discovery is a cloud-based platform that will use deep learning, natural language processing and other cognitive reasoning tech to support researchers seeking new drug targets and new drug indications, IBM said in a statement. The platform has been fed more than 25 million abstracts, more than 1 million full-text journal articles and 4 million patents in order to streamline the drug discovery process. By contrast, a researcher will read between 200 and 300 articles in a given year, according to a 2015 PhRMA report.
I have a number of comments about that, and I’ll bet every scientist who reads it has some similar ones. First off, I do not dispute that we all need help digging through, collating, and understanding the mass of the scientific literature. There surely are insights in there that we have all missed, connections that we have not made. I really have no doubts about that at all. Where the doubts come in is how we’re going to find those insights and connections, and whether Watson will be able to do so in any kind of useful way.
I hope that the software can do it, but it’s important to understand the barriers to this working. None of these are insurmountable, but none of them are going to be conquered by issuing press releases, either. In no particular order, some of the big issues are:
Problem Number One: A significant amount of what is to be found in those articles, abstracts, and patents is crap. Moreover, it is several kinds of crap. Some of it is research, meant in earnest, that is simply incorrect. A lot of the earlier kinase literature is like that, because the so-called “selective kinase inhibitors” of the time were mostly nothing of the kind, invalidating a lot of hypotheses. Similarly, there’s been a continuing problem with chemical tool compounds that are supposed to be useful probes but are largely (or completely) unsuitable for the job, and many of the results obtained with them are suspect. There are misidentified and contaminated cell lines out there that have done (and continue to do) the same mischief, and plenty of not-so-great antibodies, too.
This sort of noise is inevitable; the challenge will be in seeing that your machine-learning tool doesn’t use one of these crumbly bricks to build a new structure. Beyond that, though, there are papers so sloppy as to be unreliable, and a depressing amount of outright fraud. Watson does not need to be assembling connections and hypotheses from material like this, and there is far too much of it out there. I would assume (and hope) that the Pfizer folks have enough sense to cut a whole set of possible journals out of the mix entirely – when you see an editorial board that includes Dr. Hoss Cartwright at the Ponderosa Institute for Bovine Studies, well, it’s time to keep on movin’, pardner.
Problem Number Two: One of the really neat things about biomedical research is how much we don’t know. There are huge, important things going on in cells right now that we really have no idea about, and we’ve seen this proven many times over the years (the various small RNA interference mechanisms are one example). So any attempt to get any kind of full picture of what’s going on should, if the readout is honest and useful, come back as “Insufficient Data For Meaningful Answer”.
That doesn’t mean that you can’t find new things by this literature-collating approach. Inside limited fields, there should be sufficient data, for one thing, and if Watson is really good, it might be able to discern that certain mechanisms have to exist that haven’t been discovered yet. That sort of result would impress me greatly, and it’s definitely not impossible for it to be realized. Just really hard. But it’s also going to be very hard to know when you’re working on a question that has enough data and when you’re spinning your wheels, sort of like it can be hard to know if you’re in a local minimum or a global one.
Problem Number Three: This might be a big one. From what I understand, a key requirement for any machine-learning approach is having negative data to work with as well as positive data (which makes sense). The problem is, the literature is extremely sparse on negative results. There are so many things that have been done that have not worked, and we’re just sort of taking that information and tossing it aside. Now, it’s not so straightforward to use it, either, because there are an infinite number of reasons that an experiment can give you negative results, starting with “something screwed up”, and it’s notoriously hard to tell what happened. But there are indeed solid negative results out there, real hypothesis-wreckers, that never get reported because there are so few places to report them.
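To put a number on that worry, here’s a deliberately toy sketch (my own made-up figures, nothing to do with Watson’s actual internals) of how a literature built almost entirely from positive results can mislead a naive text-mining score: it can’t tell “never really tested” apart from “tested repeatedly and failed”.

```python
# Toy illustration with hypothetical counts: a naive co-occurrence score built
# only from published (positive) results cannot distinguish an untested
# target-indication pair from one that has quietly failed many times.

# Papers linking a target to an indication, as a text-miner might tally them.
published_hits = {
    ("Target A", "Indication X"): 12,   # well-studied, reproducible link
    ("Target B", "Indication X"): 3,    # three small positive papers
    ("Target C", "Indication X"): 0,    # nothing published
}

# What actually happened in labs, almost never written up.
unpublished_failures = {
    ("Target A", "Indication X"): 1,
    ("Target B", "Indication X"): 40,   # tried many times, almost always failed
    ("Target C", "Indication X"): 0,    # simply never tested
}

def naive_score(pair):
    """Score the pair from the published record alone."""
    return published_hits[pair]

def honest_score(pair):
    """Score if failures were reported too: hits / total attempts."""
    attempts = published_hits[pair] + unpublished_failures[pair]
    return published_hits[pair] / attempts if attempts else None  # None = never tested

for pair in published_hits:
    print(pair, "naive:", naive_score(pair), "honest:", honest_score(pair))

# The naive score ranks Target B as a live lead (3 vs. 0) and treats Target C
# the same as a known failure; once the failures are counted, B collapses to
# about 0.07 and C is correctly flagged as merely untested.
```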
Problem Number Four: Taken together, these difficulties will place some tricky bounds on the answers that a machine learning system can give you by rooting through the biomedical literature. Depending on how the software is tuned up, I can imagine that you could easily end up underinterpreting or overinterpreting (just as in the statistical problem of fitting a model to a set of data). The first case will give you that “insufficient information” answer to every single interesting question you ask, and will only tell you things that you already know (or should have known, anyway). The second case will give you spurious correlations that you have no good way of knowing are spurious. Ideally, these would come out ranked by confidence – I would also be very impressed if such software were to rank them by testability as well, but I think for now that’s going to be a job for us humans.
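The curve-fitting analogy is the standard one, and a rough sketch makes the trade-off concrete (again, a toy example of my own, not anything about how Watson is actually tuned): the same noisy data can be underfit or overfit depending on how flexible the model is, and only held-out data tells you which mistake you’re making.

```python
import numpy as np

# Toy illustration of the under-/over-interpretation trade-off via polynomial fits.
rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 20)
x_test = np.linspace(0, 1, 200)
true_fn = lambda x: np.sin(2 * np.pi * x)
y_train = true_fn(x_train) + rng.normal(scale=0.2, size=x_train.size)
y_test = true_fn(x_test)

for degree in (1, 3, 12):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train error {train_err:.3f}, held-out error {test_err:.3f}")

# Degree 1 misses the real structure (the "insufficient information" failure),
# degree 12 chases the noise (spurious correlations that look like findings),
# and only the held-out error exposes the difference.
```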
What I don’t know is how wide the gap between these two extremes really is. It might be pretty narrow, with Watson giving you either way too little to work with or way too much. The latter situation, a big ol’ steaming heap of false positives, is arguably the worse of the two. I hope that we’ll eventually hear something (well, other than happy-talk) about how this has worked out for Pfizer, but if we hear nothing at all, that’s hearing something too, isn’t it?