I wanted to mention a timely new book, Deep Learning for the Life Sciences, that I’ve received a copy of. It’s by Bharath Ramsundar at Computable, Peter Eastman at Stanford, Pat Walters at Relay, and Vijay Pande at Andreessen Horowitz, and I’ve been using it to shore up my knowledge in this area. From what I can see, there are not too many people who have much understanding of what deep learning/machine learning really entails – not that this stops folks from delivering their opinions on it. So actually acquiring some of that understanding will make you stand out from the crowd (!)
This book is written for those of us out in biology and chemistry who would like to get up to speed on the topic; it’s not a detailed dive into any one area. But I think that’s a large market: if you would like to know in brief about (say) what a neural network is, the general scheme by which it processes inputs and generates outputs, and how one goes about applying such a thing to a pile of chemical structures or cell images, this would be an excellent place to start. The authors recommend further reading at many points, since they touch on a whole range of topics that have far more detail to them than they’re trying to cover.
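To make that "general scheme" concrete, here is a minimal sketch, in plain NumPy, of how a small neural network turns a batch of numeric inputs into outputs. Everything here (the layer sizes, the idea that each molecule is eight numbers, the random weights standing in for trained ones) is invented for illustration; the book goes into the real details:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Nonlinearity applied between layers
    return np.maximum(0.0, x)

def forward(x, w1, b1, w2, b2):
    # First layer: weighted sum of the inputs, then a nonlinearity
    hidden = relu(x @ w1 + b1)
    # Second layer: a linear readout of the hidden activations
    return hidden @ w2 + b2

# Pretend each molecule is described by 8 numeric features
x = rng.normal(size=(4, 8))           # a batch of 4 "molecules"
w1 = rng.normal(size=(8, 16)) * 0.1   # weights (these would be learned)
b1 = np.zeros(16)
w2 = rng.normal(size=(16, 1)) * 0.1
b2 = np.zeros(1)

y = forward(x, w1, b1, w2, b2)        # one predicted value per molecule
print(y.shape)
```

Training is then the business of adjusting those weight arrays so the outputs match known answers, which is where all the interesting machinery lives.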
Several parts of the book make use of the open-source DeepChem toolbox – there are examples of processing chemical structure and property data, genomic data, protein structural information, imaging data, and so on. Since it’s written for a wide audience, there are introductory sections throughout explaining to the non-life-science computational types what (for example) pi-stacking is and how a SMILES string is generated, and explaining to the chemists and biologists what (for example) a convolutional neural network is and how it might be less susceptible to overfitting than some other architectures. A good feature is that the authors have a realistic view of the problems:
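As a flavor of what "processing chemical structure data" can mean at its very simplest, here is a toy featurizer that turns a SMILES string into a fixed-length vector of character counts. This is a deliberately crude stand-in for the real fingerprint featurizers a toolkit like DeepChem provides; the vocabulary and function here are invented for illustration and are not DeepChem's actual API:

```python
# A small alphabet of SMILES symbols (atoms, bonds, ring/branch markers)
VOCAB = "CNOSPFclBrI=#()[]123456789@+-"

def featurize(smiles: str) -> list:
    # Count each recognized character into a fixed-length vector,
    # so every molecule maps to the same-shaped numeric input
    counts = [0] * len(VOCAB)
    for ch in smiles:
        idx = VOCAB.find(ch)
        if idx >= 0:
            counts[idx] += 1
    return counts

aspirin = "CC(=O)Oc1ccccc1C(=O)O"  # aspirin, as a SMILES string
vec = featurize(aspirin)
print(len(vec), sum(vec))
```

Real featurizers capture far more (connectivity, substructures, charges), but the principle is the same: a molecule goes in, a fixed-length numeric vector comes out, and that vector is what the network actually sees.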
At present [the PDB] contains over 142,000 structures. . .that may seem like a lot, but it is far less than we really want. The number of known proteins is orders of magnitude larger, with more being discovered all the time. For any protein that you want to study, there is a good chance that its structure is still unknown. And you really want many structures for each protein, not just one. Many proteins can exist in multiple functionally different states. . .the PDB is a fantastic resource, but the field is still in its “low data” stage. We have far less data than we want, and a major challenge is figuring out how to make the most of what we have. That is likely to remain true for decades.
The book also highlights the limits of what software can accomplish and when it needs human assistance. The same section quoted above goes on to warn that PDB files often contain problematic regions where the protein or ligand is not modeled well, and advises that (at present) there’s no substitute for having an experienced modeler look over the structure for a reality check. Similarly, from the other end, the chapter on image processing notes that generating good segmentation masks (read the book!) is often not feasible without some human input as well. That’s something that people outside the field don’t always realize: these things are not 100% machine, but rather what Garry Kasparov calls centaur systems, using humans and machines in tandem, with each doing what it does best.
As someone without much expertise (compared to the authors!), I’ve been particularly enjoying the discussions of “meta” topics such as choosing between different architectures (and how to evaluate such choices), interpretability of the results (and how to quantify that), and testing the validity of output datasets. You may not be surprised to learn that some of these topics are complex enough to be candidates for deep-learning approaches of their own, a recursive feature that will set you wondering what techniques are then appropriate to evaluate the evaluations. The answer to the question of quis custodiet ipsos custodes turns out, perhaps, to be “this subroutine right over here”, but those are human judgment calls as well.
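One of those meta topics, choosing between architectures, comes down in its simplest form to comparing candidates on data held out from training. Here is a toy sketch of that idea with synthetic data; the "architectures" are just polynomials of different degrees, and nothing here is taken from the book:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: a linear truth plus a little noise
x = rng.uniform(-1.0, 1.0, size=200)
y = 2.0 * x + rng.normal(scale=0.1, size=200)

# Hold out the last 50 points; the models never train on them
train_x, val_x = x[:150], x[150:]
train_y, val_y = y[:150], y[150:]

def val_mse(degree):
    # Fit on the training split, score on the held-out split
    coeffs = np.polyfit(train_x, train_y, degree)
    preds = np.polyval(coeffs, val_x)
    return float(np.mean((preds - val_y) ** 2))

mse_simple = val_mse(1)    # matches the underlying linear truth
mse_flexible = val_mse(9)  # more flexible, freer to chase the noise
print(mse_simple, mse_flexible)
```

The held-out error is what you trust; how much to trust it, and which candidates to try in the first place, remain exactly the sort of human judgment calls the book keeps pointing out.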
So overall, this book should make you much more able to digest what people are talking about when they start talking deep learning, and if you’re motivated to try some yourself, it will show you how to get started and where to learn more. And it will also (perhaps paradoxically) reassure you about the current limits of the technique in general and the continued need for intelligent human oversight and intervention. Making ourselves a bit more intelligent about that is no bad thing.