Cancer is among the leading causes of illness and death across the world. In 2020, it accounted for an estimated 19.3 million new cases and 10 million deaths [1], and by 2022 the global burden was reported to have risen to nearly 20 million new cases [2]. The economic costs are no less striking: between 2020 and 2050, cancer care and the loss of productivity are projected to amount to around 25.2 trillion US dollars [3]. Even with significant ongoing investments, the path of oncology drug discovery remains slow and highly inefficient [4]. Drug development often carries costs of several billion dollars, and only about 3.4% to 5.1% of oncology drugs that reach clinical testing ever gain regulatory approval [5, 6]. These statistics show the need to identify and prioritize promising drug candidates much earlier in the pipeline.
The traditional drug discovery process is a multi-stage endeavor that begins with target identification and validation, where researchers determine whether a biological molecule is causally linked to cancer progression and assess its suitability as a druggable target. Once a target is established, hit discovery is conducted, often through high-throughput experimental screening or virtual screening of chemical libraries, to identify initial “hit” compounds. This is followed by hit-to-lead optimization, where chemical structures are modified to improve potency, selectivity, and physicochemical properties. In the lead optimization stage, medicinal chemistry and computational modelling are used to enhance pharmacokinetic and pharmacodynamic characteristics, focusing on absorption, distribution, metabolism, excretion, and toxicity (ADME-Tox). Promising leads progress into preclinical development, which includes in vitro assays and in vivo animal studies to evaluate efficacy and safety. Only a very small fraction of candidates advances into clinical trials, which are divided into Phase I (safety and dosage), Phase II (efficacy and side effects), and Phase III (comparative effectiveness) before any regulatory approval is possible [7]. Each stage is characterized by high attrition rates, lengthy timelines, and escalating costs, which collectively explain why oncology remains one of the most resource-intensive areas in drug development [7, 8].
Generative artificial intelligence (AI) and machine learning (ML) have recently gained attention as potential ways to address this gap. Unlike traditional computational methods, which mainly screen compounds from existing chemical libraries, generative models are designed to learn patterns from chemical and biological data and propose entirely new molecules [9]. Several families of models are now being applied in drug discovery, including recurrent neural networks, generative adversarial networks, and transformer-based architectures [9, 10]. These systems can suggest compounds that balance potency, selectivity, and drug-like properties, which, in theory, reduces both the time and cost of searching chemical space [11]. Proof-of-concept studies have even shown that generative AI can reproduce known drugs or design new scaffolds within days or weeks [9]. Such demonstrations have raised expectations that AI-driven design could play a major role in accelerating the discovery of anticancer agents.
In this article, we present a systematic review covering roughly a decade of literature (January 2015 to June 20, 2025) that maps and quantitatively synthesizes the application of generative AI and ML to de novo cancer drug discovery. Unlike previous reviews, which were either method-oriented or disease-agnostic, this work specifically targets oncology. Our aim is to establish an evidence base that highlights both the opportunities and the limitations of these models in producing compounds with potential clinical relevance.
Our initial findings show that out of 1,130 records screened, 57 studies met the inclusion criteria. Kinases were by far the most common drug targets (49%), while enzymes, GPCRs, and immune proteins were less frequently studied. Publication activity rose sharply after 2021, mirroring the broader surge in generative AI research. Fewer than half of the studies reported docking or in vitro potency data, and only 14% described in vivo validation. Binding free energy values appeared in just over one-quarter of studies, while ADME-Tox assessments were reported in about one-third. Roughly half of the included studies claimed that AI-generated compounds outperformed reference drugs, but most of these claims were based only on computational evidence. Reproducibility was inconsistent, with public code available in just over half of the papers. Taken together, these findings suggest that while generative AI can design molecules with promising anticancer potential, experimental validation and reproducibility remain major gaps.
This review takes the form of a PRISMA-guided, cancer-specific, quantitative synthesis of de novo generative AI and ML studies. Using a defined corpus, we (i) map targets, model families, and time trends; (ii) quantify how often studies report numeric docking and binding-free-energy values and summarize their distributions overall and by target type and model class; (iii) track progression to biochemical, cell, and in vivo endpoints; (iv) examine claims that AI-generated compounds outperform reference drugs alongside the numerical evidence provided; and (v) assess openness and reproducibility indicators such as code availability and dataset disclosure. Where prior reviews explain how models work, our goal is to put oncology-specific numbers on what is being reported and to highlight practical steps that would make studies more comparable and more informative for translation.
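As an illustration of the kind of tabulation described in step (ii), the following minimal Python sketch computes how often a numeric value is reported, overall and by target type, and summarizes the reported values with a median. It is a sketch under stated assumptions, not the extraction pipeline used in this review: the record layout, the field names (`target_type`, `model_class`, `docking_score`), and the function names are hypothetical, and the four toy records are invented for demonstration and are not data from the included studies.

```python
from collections import defaultdict
from statistics import median

# Hypothetical toy records for illustration only; NOT data from the review.
# A value of None means the study did not report a numeric docking score.
studies = [
    {"target_type": "kinase", "model_class": "RNN", "docking_score": -9.1},
    {"target_type": "kinase", "model_class": "GAN", "docking_score": None},
    {"target_type": "GPCR", "model_class": "transformer", "docking_score": -7.4},
    {"target_type": "enzyme", "model_class": "RNN", "docking_score": None},
]


def reporting_rate(records, field):
    """Return the fraction of records in which `field` is reported (not None)."""
    if not records:
        return 0.0
    return sum(r.get(field) is not None for r in records) / len(records)


def summarize_by(records, group_key, value_field):
    """Group records by `group_key` and summarize reporting of `value_field`."""
    groups = defaultdict(list)
    for r in records:
        groups[r[group_key]].append(r)

    summary = {}
    for name, members in groups.items():
        # Keep only the values that were actually reported.
        values = [m[value_field] for m in members if m.get(value_field) is not None]
        summary[name] = {
            "n": len(members),                              # studies in the group
            "reported": len(values),                        # studies reporting a value
            "rate": reporting_rate(members, value_field),   # reporting fraction
            "median": median(values) if values else None,   # None if nothing reported
        }
    return summary


overall_rate = reporting_rate(studies, "docking_score")
by_target = summarize_by(studies, "target_type", "docking_score")
```

Passing `"model_class"` instead of `"target_type"` as the grouping key gives the corresponding breakdown by model family. Keeping unreported values as `None` rather than dropping the record lets the same table yield both the reporting rate and the distribution of reported values.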
The remainder of this paper is organized as follows: Section 2 presents the state of the art, Section 3 outlines methods and eligibility criteria, Section 4 presents the results, and Section 5 discusses implications, limitations, and future directions.