Inside Big Pharma's AI Playbook: From Molecule Discovery to Clinical Trials

7 fronts where Big Pharma explores AI via partnerships and internal programs

Oct 09, 2025

The traditional drug discovery process is among the most complex, costly, and time-consuming endeavors in science. Developing a single medicine can take over a decade of research and $2B in investment. This inefficiency stems largely from the linear structure of discovery: it begins with target identification, moves through hit discovery and lead optimization, and continues with preclinical testing and lengthy clinical trials. Each stage requires substantial resources and meticulous validation, and too often ends in disappointment.

Despite the extraordinary effort, the odds of success remain bleak. Only about 1 in 10 drug candidates entering clinical trials ultimately achieve regulatory approval, with failures most often linked to safety issues or insufficient efficacy. Even high-throughput screening (HTS), once celebrated as a breakthrough, delivers a discouraging hit rate of just 2.5%. Such low yields amplify delays, inflate costs, and exhaust resources.

In this article: Target Identification — Virtual Screening — De novo Design — Drug Repurposing — ADMET Prediction — AI-backed Synthesis Planning and Execution — Clinical Trials — (When) Will AI Cure the World?

Artificial intelligence (AI) and machine learning (ML) are emerging as powerful tools for accelerating discovery, improving prediction accuracy, and overcoming the limitations of traditional methods. Importantly, AI in drug discovery did not appear overnight. It has evolved alongside advances in computing, building on decades of incremental progress:

1960s: The drug discovery field took its first computational step with the development of the QSAR (Quantitative Structure–Activity Relationship) method. The groundwork for QSAR was laid by Corwin Hansch and his colleagues in 1962, when they studied how molecular properties correlate with biological activity.
1980s: The release of the CHARMM program in 1983 enabled general molecular simulation. Around the same time, DOCK, developed by Kuntz’s group at UCSF, became the first molecular docking software.
1990s: Molecular modelling software platforms like Schrödinger made computer-aided drug design widely accessible, embedding computational tools into the pharmaceutical workflow. Open-source alternatives like GROMACS also appeared (University of Groningen, 1991).
2010s: Deep learning catalyzed a wave of AI-first biotech startups like Recursion and Insilico Medicine. In June 2025, Insilico released Phase IIa results for Rentosertib, an AI-designed compound against idiopathic pulmonary fibrosis, which is believed to be a considerable milestone to date for a drug candidate that originated entirely from AI.
2020s: In CASP14, DeepMind’s AlphaFold achieved a median Global Distance Test (GDT, the main CASP metric for evaluating predictions) score of 92.4, an unprecedented leap in accuracy. In 2024, the Nobel Prize in Chemistry recognized Demis Hassabis and John Jumper (for AlphaFold) and David Baker (for computational protein design).
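
For readers unfamiliar with the GDT score mentioned above, here is a minimal Python sketch of the commonly used GDT_TS variant: the average, over distance cutoffs of 1, 2, 4, and 8 Å, of the percentage of residues whose predicted position falls within that cutoff of the experimental one. It is a simplification built on assumptions: the function name `gdt_ts` and the toy coordinates are hypothetical, and the two structures are assumed to be already superimposed, whereas the real CASP evaluation searches over many superpositions.

```python
import math

def gdt_ts(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for two structures that are ALREADY superimposed.

    predicted, reference: equal-length lists of (x, y, z) C-alpha coordinates
    in angstroms. Returns a score between 0 and 100.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("structures must be non-empty and of equal length")
    # Per-residue distance between predicted and experimental positions.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    # Fraction of residues within each cutoff, averaged over the cutoffs.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(fractions)

# Toy example (made-up coordinates): four residues whose predicted positions
# are off by 0.5, 1.5, 3.0, and 10.0 angstroms respectively.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

score = gdt_ts(predicted, reference)
print(f"GDT_TS = {score:.2f}")
```

In this toy case one residue is within 1 Å, two within 2 Å, and three within both 4 Å and 8 Å, so the score lands well below the 90+ range that is generally treated as competitive with experimental structures.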

Over a decade ago, strategists at Big Pharma recognized AI’s broader potential to transform R&D, given the progress in deep learning at the time, and began testing the ground for wider adoption. One of the earliest examples dates back to 2012, when Merck tapped the online data science community Kaggle to crowdsource solutions to a core drug discovery challenge: predicting the biological activity of molecules, both on-target and off-target. The 60-day competition awarded $40,000, with the top prize going to a team led by George Dahl (University of Toronto) for their use of neural networks and deep learning, a hint at how these methods would reshape drug R&D.

Today, we will follow the stages of the drug discovery pipeline as outlined in the January 2025 Nature article by researchers from Wenzhou Medical University. We’ll look at each stage in detail and highlight how leading biopharma companies are leveraging AI across the entire drug development lifecycle.

The pipeline unfolds across six critical phases: target identification → discovery → preclinical → clinical → regulatory → post-market. The stages from target identification through clinical testing present unique challenges and opportunities for innovation, and AI is increasingly shaping how the pharmaceutical industry approaches them.

Target Identification

Finding the right molecular target, usually a protein or nucleic acid, has always been a bottleneck in drug discovery. Classic approaches like pull-down assays or genome-wide screens work, but they’re slow and costly.

AI is reshaping target identification by uncovering hidden molecular patterns and disease links that traditional tools miss.

NLP models like word2vec have been used to map gene functions in high-dimensional space, boosting sensitivity when data overlap is sparse.
Graph deep learning takes this further by combining network structure with deep models to identify key targets and explain the reasoning behind them. A good example is CGMega, a GNN-based tool for cancer gene module dissection.
Platforms like Insilico’s PandaOmics show what’s possible: by linking omics data with biomedical literature, it flagged TRAF2- and NCK-interacting kinase (TNIK) as an anti-fibrotic target, leading to a new inhibitor (INS018_055).
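
To make the embedding idea concrete, here is a minimal Python sketch of how genes mapped into a vector space can be ranked by similarity to a known disease gene. Everything in it is an assumption made for illustration: the gene names, the four-dimensional vectors, and the helper `rank_candidates` are hypothetical and do not come from any of the tools mentioned above. Real embeddings would be learned from data and have hundreds of dimensions.

```python
import math

# Hypothetical 4-dimensional gene embeddings (toy values, not real data).
EMBEDDINGS = {
    "GENE_A": [0.9, 0.1, 0.3, 0.0],  # treated as the known disease gene (seed)
    "GENE_B": [0.8, 0.2, 0.4, 0.1],
    "GENE_C": [0.0, 0.9, 0.1, 0.7],
    "GENE_D": [0.1, 0.8, 0.0, 0.9],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_candidates(seed, embeddings):
    """Rank every other gene by embedding similarity to the seed gene."""
    seed_vec = embeddings[seed]
    scores = [
        (gene, cosine_similarity(seed_vec, vec))
        for gene, vec in embeddings.items()
        if gene != seed
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

ranking = rank_candidates("GENE_A", EMBEDDINGS)
for gene, score in ranking:
    print(f"{gene}: {score:.3f}")
```

Genes whose vectors point in nearly the same direction as the seed rank first, which is the intuition behind using embeddings to surface functionally related candidates even when the genes rarely appear together in the underlying data.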

In 2025, AstraZeneca accelerated its AI-driven oncology strategy with a clear focus on target identification and validation. In April, it launched a $200M partnership with Tempus AI and Pathos AI to build a multimodal foundation model intended to uncover novel targets and speed up therapeutic development.

Where Tech Meets Bio
