Artificial Intelligence and Drug Discovery

By Aayush Goradia & Leah Strickling 

For decades, the time-consuming and vastly expensive nature of drug discovery has been a significant pain point for pharmaceutical and biotech companies, which continually seek new technologies to improve the process. Recently, artificial intelligence (AI) has shown promise as a way to speed up drug discovery, cut its formidable costs, and open a world of learning for drug developers.

AI has made waves in early drug discovery already, resulting in growing interest from investors. In 2018 alone, venture investments in AI companies focusing on drug discovery totaled $1.08 billion, a 350% increase from the ~$240 million raised in 2016. In the first half of 2019, similar companies have raised ~$700 million.

With AI becoming integrated into many industries, it is not surprising that the technology is used by many startups and big pharma companies looking for a development solution. However, despite the advancements made in AI drug discovery over the past several years, significant challenges remain for applying AI to tackle the issues posed.

There are two paradigms in traditional drug discovery: physiology-based and target-based. Physiology-based designs have an unknown target and phenotypic read-outs, meaning that researchers may not know which target a disease-mitigating compound is acting on.

Target-based designs have a known target and read-outs based on the activity or expression of the target. After a target molecule is chosen and validated, high-throughput screening is conducted to identify hits with the desired activity, from which a lead compound is selected and optimized.

While these designs are not mutually exclusive, target-based drug discovery has been the standard for the past ~20 years, and we will focus on applications of AI to the target-based approach.

Two specific challenges arise from target-based drug discovery. First, a vast number of diseases are not well understood by the scientific community. For example, the pathophysiology behind central nervous system (CNS) disorders, a hot area for drug development, is highly complex, making the identification of appropriate biomarkers challenging. Second, there is a high attrition rate for compounds being explored, which drives up the cost of drug discovery. These problems provide an opportunity for companies with new technologies to offer innovative and potentially cost-lowering solutions.

Some of the most promising applications of AI to drug discovery are in target identification and lead optimization, including the following:

  • Benevolent AI is a UK-based company aiming to improve both target identification and lead optimization using data mining. The company’s platform mines the vast amount of clinical trial data and academic research that has been conducted to date, generating a network of relationships between genes, proteins, diseases, and compounds. The company then uses the information to derive insights into potential targets and associated lead candidates with the goal of reducing the number of failed drug candidates.

  • Atomwise feeds a large amount of organic chemistry data through its convolutional neural network-based machine learning (ML) model to predict potential drug candidates. As the first deep learning technology for small molecule discovery, it is well reputed for its speed, efficacy, and focus on using the fundamentals of structural chemistry to design diverse molecules.

  • DeepMind, the AI arm of Google’s parent company Alphabet Inc., is Google’s entry into the drug discovery realm. Though still in early stages, DeepMind’s use of AI to predict the 3D structure of a protein based on the protein’s genetic sequence has intrigued the scientific community. As a part of last year’s Critical Assessment of Structure Prediction competition (CASP13), DeepMind presented data showing that its platform remarkably predicted the shapes of proteins that form the foundation of disease more accurately than veteran biologists, potentially solving the problem of determining the structure of proteins that underlie complex diseases. In addition, GV, the venture capital arm of Google, has made several early-stage investments in the space with its funding of Schrodinger, Relay Therapeutics, and OWKIN, further underlining the company’s commitment to the space.
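
To make the idea of a relationship network in the Benevolent AI example more concrete, here is a minimal sketch in Python. It is not the company's method or data model; the entity names (GENE_A, PROTEIN_A, DISEASE_X, COMPOUND_1, and so on), the relation labels, and both helper functions are hypothetical and chosen only for illustration. The sketch stores mined facts as subject-relation-object triples and uses a breadth-first search to ask whether a compound is connected to a disease through a shared protein.

```python
from collections import defaultdict, deque


def build_graph(triples):
    """Build an undirected adjacency map from (subject, relation, object) triples."""
    graph = defaultdict(set)
    for subject, _relation, obj in triples:
        graph[subject].add(obj)
        graph[obj].add(subject)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest list of nodes or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical toy facts, standing in for relationships mined from literature.
triples = [
    ("GENE_A", "encodes", "PROTEIN_A"),
    ("PROTEIN_A", "implicated_in", "DISEASE_X"),
    ("COMPOUND_1", "binds", "PROTEIN_A"),
    ("GENE_B", "encodes", "PROTEIN_B"),
    ("COMPOUND_2", "binds", "PROTEIN_B"),
]

graph = build_graph(triples)
linked = shortest_path(graph, "COMPOUND_1", "DISEASE_X")
unlinked = shortest_path(graph, "COMPOUND_2", "DISEASE_X")
print(linked)    # path through the shared protein
print(unlinked)  # None: no known connection in this toy graph
```

In this toy graph, COMPOUND_1 reaches DISEASE_X through PROTEIN_A, while COMPOUND_2 has no path to the disease. A real platform would work with far larger graphs and would weigh the strength of the evidence behind each link, which this sketch ignores.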

Companies exploring the intersection of AI and drug discovery are emerging rapidly, and Big Pharma companies are forming more partnerships with these players. However, despite the growing investments, these companies will have to address several challenges before they become the answer to fast and precise drug discovery:

Biased Data: Any ML model is only as good as the dataset used to train it. One significant concern for developers is potential bias against a minority class within the dataset, which could lead to an algorithm that makes less advantageous decisions for that class. In the context of drug discovery, an unbalanced dataset could mean that rare disease samples are underrepresented, leading to incorrect predictions for drug targets and candidates. In the coming years, researchers will need to become more proficient at removing biases from biomedical datasets to mitigate the risk of jeopardizing a patient’s health.
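
A small Python sketch can illustrate why an unbalanced dataset is a problem. The label counts below (95 "common" samples and 5 "rare" samples) are made up for illustration, and the function names are hypothetical. The sketch shows that a model that always predicts the majority class looks accurate overall while missing every rare-disease sample, and it computes inverse-frequency class weights, one common heuristic for making minority-class errors count more during training. It is only one of several possible mitigations, not a complete fix for biased data.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def recall(true_labels, predicted_labels, positive):
    """Fraction of true `positive` samples that were predicted as `positive`."""
    predicted_for_positives = [
        p for t, p in zip(true_labels, predicted_labels) if t == positive
    ]
    hits = sum(p == positive for p in predicted_for_positives)
    return hits / len(predicted_for_positives)


# Hypothetical toy dataset: rare-disease samples are heavily underrepresented.
labels = ["common"] * 95 + ["rare"] * 5

# A naive "model" that always predicts the majority class.
predictions = ["common"] * len(labels)

accuracy = sum(t == p for t, p in zip(labels, predictions)) / len(labels)
rare_recall = recall(labels, predictions, "rare")
weights = inverse_frequency_weights(labels)

print(f"accuracy: {accuracy:.2f}")        # high, despite ignoring the rare class
print(f"rare recall: {rare_recall:.2f}")  # every rare sample is missed
print(weights)                            # the rare class gets a much larger weight
```

The overall accuracy hides the failure on the rare class, which is why per-class metrics matter when the minority class is the one of clinical interest.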

Need for Scientist Intervention: Unlike other applications of AI, AI in drug discovery still relies on heavy involvement by scientists, who perform most of the work. Currently, AI helps humans to complete sub-tasks within drug discovery more efficiently and effectively, but it cannot complete those sub-tasks autonomously. Ideally, as models are optimized, they will play a more significant role, reducing the current reliance on human oversight and input.

Lack of Tangible Results: Though ML is making a good deal of noise within drug discovery, no molecule discovered using an ML drug discovery platform has made it to market. This can be attributed primarily to a lack of quality data, such as biased data or images with insufficient resolution, which translates to interesting but still inadequate results. For example, while Google’s DeepMind simulation can predict 3D structures accurately, it does not produce atomic-level resolution for proteins, which is crucial for drug discovery, so there may still be a long way to go. Ultimately, the validation of AI’s utility in drug discovery will hinge on the success of these candidates in the clinic.

Despite these challenges, it would have been tough for researchers several years ago to predict that AI would make the kind of impact on drug discovery that it has to date. At Back Bay Life Science Advisors, we are interested in seeing if AI’s place in drug discovery is validated in the coming years, with at least one company successfully bringing an AI-discovered molecule to market.

About the Authors

Aayush Goradia is a Summer Analyst at Back Bay Life Science Advisors, where he works with the strategy consulting team. This fall, Aayush will enter his third year at Duke University and is studying Biomedical Engineering and Computer Science.

Leah Strickling is a consultant at Back Bay Life Science Advisors. She works across an array of therapeutic areas and recently authored an article on digital solutions in healthcare.