[GATE Approach Reveals Novel Candidate Drugs for COVID-19 Treatment]
To systematically identify and validate novel drug repurposing candidates, we developed a comprehensive analytical framework that addresses the limitations of standalone computational approaches. Our methodology consists of three integrated phases: (1) computational identification of candidate drugs using a GATE approach that combines attention mechanisms with data obtained from Open Targets [20], (2) concept validation of clinical potential using real-world evidence, and (3) molecular biological assessment to generate mechanism of action (MOA) hypotheses from phenome data (Fig. 1). This multilayered evaluation strategy enables us to move beyond simple similarity predictions and provide actionable insights for therapeutic development. To demonstrate the utility of this system, we applied it to identify and validate novel drug repurposing candidates for COVID-19.
AI network analysis (GATE) selects candidate drugs for COVID-19, real-world data analysis (signal detection) narrows them to drugs with protective signals, and phenome analysis (pathway profiling) evaluates their molecular plausibility and mechanisms of action.
Based on the latent space representation of GATE, we identified drug nodes showing high similarity to the COVID-19 node (Table 1). In interpreting the results, we considered different salt or derivative forms of the same compound (e.g., cilastatin and cilastatin sodium) as representing a single drug entity. Among the 17 drugs identified, 13 were already known to have therapeutic effects against COVID-19 symptoms, indicating that our model identifies clinically meaningful drug-disease associations. The remaining 4 drugs (cilastatin, megestrol, drotrecogin alfa, ethacrynic acid) showed minimal prior associations with COVID-19 in the literature, pointing to therapeutic opportunities that conventional computational discovery approaches may overlook. Importantly, these candidate drugs and COVID-19 were not directly connected in the original knowledge graph (Fig. 2, Figures S1–S3), demonstrating the ability of GATE to discover latent associations implicit in network relationships.
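The ranking step described above can be illustrated with a minimal sketch. It assumes that each node already has an embedding vector and that candidates are ordered by cosine similarity to the disease vector; the function names (`cosine_similarity`, `rank_drugs`) and the toy three-dimensional vectors are hypothetical and are not taken from the GATE implementation or from Table 1.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_drugs(disease_vec, drug_vecs, top_k=3):
    """Return the top_k (drug, similarity) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(disease_vec, vec))
        for name, vec in drug_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (assumption): learned embeddings would have many more dimensions.
disease = [1.0, 0.0, 1.0]
drugs = {
    "drug_a": [0.9, 0.1, 1.1],     # points in nearly the same direction
    "drug_b": [0.0, 1.0, 0.0],     # orthogonal to the disease vector
    "drug_c": [-1.0, 0.0, -1.0],   # points in the opposite direction
}
ranking = rank_drugs(disease, drugs)
```

Because cosine similarity depends only on direction, two nodes can score highly without sharing an edge in the graph, which is consistent with the latent associations discussed above.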
Table 1
Top-ranked drug candidates based on embedding similarity to COVID-19 in the latent space learned by GATE model
| ID | Name | Description | Cosine Similarity | OpenTargets Association Score | OpenTargets Rank of Score |
| --- | --- | --- | --- | --- | --- |
| CHEMBL1201057 | CILASTATIN SODIUM | Antibiotic, Dipeptidyl Peptidase I Inhibitor | 0.86067 | 0.01 | 2672 |
| CHEMBL766 | CILASTATIN | Antibiotic, Dipeptidyl Peptidase I Inhibitor | 0.86006 | 0.01 | 2672 |
| CHEMBL1201014 | PREDNISOLONE SODIUM PHOSPHATE | Steroid Drug | 0.85693 | 0.609 | 4 |
| CHEMBL1201231 | PREDNISOLONE PHOSPHORIC ACID | Steroid Drug | 0.85526 | 0.609 | 4 |
| CHEMBL679 | EPINEPHRINE | Adrenergic Receptor Agonist | 0.85111 | 0.372 | 169 |
| CHEMBL2134724 | IPRATROPIUM BROMIDE | Anticholinergic Drug, COPD and Bronchial Asthma Treatment | 0.84696 | 0.076 | 737 |
| CHEMBL1621597 | IPRATROPIUM | Anticholinergic Drug, COPD and Bronchial Asthma Treatment | 0.84621 | 0.076 | 737 |
| CHEMBL1200689 | NITRIC OXIDE | Nitric Oxide, Pulmonary Hypertension Treatment | 0.84535 | 0.345 | 397 |
| CHEMBL1201335 | GLYCOPYRRONIUM | Anticholinergic Drug, COPD | 0.84439 | 0.076 | 737 |
| CHEMBL1437 | NOREPINEPHRINE | Adrenergic Receptor Agonist | 0.84266 | 0.372 | 169 |
| CHEMBL1215 | PHENYLEPHRINE | Adrenergic Receptor Agonist | 0.84121 | 0.372 | 169 |
| CHEMBL2108429 | MEPOLIZUMAB | Anti-IL-5 Antibody | 0.83772 | 0.068 | 799 |
| CHEMBL1201139 | MEGESTROL ACETATE | Antineoplastic Drug, Contraceptive, Progesterone Receptor Agonist | 0.83667 | 0.086 | 636 |
| CHEMBL1370 | BUDESONIDE | Steroid Drug | 0.83639 | 0.609 | 4 |
| CHEMBL1364144 | METHYLPREDNISOLONE ACETATE | Steroid Drug | 0.83551 | 0.609 | 4 |
| CHEMBL1201027 | GLYCOPYRROLATE | Anticholinergic Drug | 0.83493 | 0.076 | 737 |
| CHEMBL2109065 | DROTRECOGIN ALFA (ACTIVATED) | Antithrombotic, Anti-inflammatory, Fibrinolysis-promoting Human Activated Protein C | 0.83448 | 0.026 | 1339 |
| CHEMBL52440 | DEXTROMETHORPHAN | Cough Suppressant and Expectorant (Medicon) | 0.83335 | 0.365 | 334 |
| CHEMBL1473 | FLUTICASONE PROPIONATE | Steroid Drug | 0.83256 | 0.609 | 4 |
| CHEMBL3707243 | GLYCOPYRRONIUM TOSYLATE | Anticholinergic Drug, Primary Axillary Hyperhidrosis Treatment | 0.83239 | 0.076 | 737 |
| CHEMBL456 | ETHACRYNIC ACID | Loop Diuretic Drug | 0.83171 | 0.004 | 4822 |
Drug candidates are ranked by cosine similarity scores in GATE latent space. Rows shown in bold indicate drugs not previously used for COVID-19 treatment, representing novel repositioning candidates. The OpenTargets association scores range from 0 to 1, with higher values indicating stronger relevance.
Subnetwork extracted from a multimodal knowledge graph consisting of drugs, diseases, and genes. Nodes are colored by type: orange for diseases, blue for drugs, and green for genes.
To further evaluate the findings, we investigated the association scores between COVID-19 and the target genes of each candidate drug using the Open Targets Platform, an integrated drug discovery database. Association scores are comprehensive relatedness indicators based on literature information, genetic evidence, experimental data, and other sources. This analysis revealed that all four candidate drugs had low association scores with COVID-19 (Table 1), suggesting that these drugs would not be visible through existing knowledge bases. In particular, our top candidate, cilastatin, had an association score of 0.01, which is substantially lower than the scores (0.068–0.609) of drugs that are known to have effects on COVID-19.
To benchmark against other approaches, we investigated whether these four drugs had been reported as COVID-19 repurposing candidates in existing studies. Cilastatin was reported as one of 78 candidates for SARS-CoV-2 3CLpro inhibition through docking simulation approaches, but detailed validation was not performed [31]. Megestrol was proposed as one of 34 COVID-19 therapeutic candidates through network-based approaches, though detailed investigation was not conducted [32]. Drotrecogin alfa has no existing reports, representing the first identification of this drug as a COVID-19 candidate. Ethacrynic acid was reported as a promising candidate for SARS-CoV-2 main protease (Mpro) inhibition through docking simulation approaches, with experimental validation also performed [33].
Although some candidate drugs identified by the GATE approach have been previously reported by other computational methods, these studies provided only preliminary identification without comprehensive evaluation of therapeutic potential. Importantly, of the four candidates, only megestrol had been identified by conventional network-based approaches; the other three (cilastatin, drotrecogin alfa, and ethacrynic acid) were first detected at the network level by the GATE approach. This indicates that GATE can identify complex associations that remain undetected by conventional network analysis, as exemplified by the novel identification of drotrecogin alfa.
These results suggest that the GATE approach is complementary to conventional methods and has the potential to efficiently discover previously unrecognized drug-disease relationships. Furthermore, the identification of candidates such as ethacrynic acid, for which experimental validation has been conducted, provides additional evidence supporting the reliability of our approach. However, these remain computational predictions that require experimental validation. To provide mechanistic insights that could guide future preclinical studies, we next analyzed the potential mechanisms of action of these candidate drugs.
[Evaluation of Clinical Potential of Candidate Drugs through FAERS Disproportionality Analysis (DPA)]
We further analyzed the four drugs identified by the GATE approach using real-world adverse event data from the FDA Adverse Event Reporting System (FAERS) to examine potential clinical associations with COVID-19 by DPA. In DPA, the reporting odds ratio (ROR) and its 95% confidence interval (CI) are used to evaluate associations between drugs and diseases. An inverse association (ROR < 1) is considered protective, and a 95% CI with an upper limit < 1 indicates statistical significance. This analysis revealed protective signals for cilastatin and megestrol, with RORs for COVID-19 infection of 0.19 (95% CI: 0.09–0.43) and 0.63 (95% CI: 0.42–0.94), respectively (Fig. 3). Notably, cilastatin demonstrated a strong protective association, with patients taking this medication showing approximately 5-fold lower odds of COVID-19 infection compared to non-cilastatin users. Megestrol also exhibited a significant protective association, with an approximately 40% reduction in infection odds. These findings suggest clinical potential for both drugs in COVID-19 infection, with cilastatin showing the stronger of the two signals.
(A) Reporting odds ratios (ROR) and 95% confidence intervals (CI) for candidate drugs identified through GATE-based analysis. “NA” indicates values that could not be calculated due to insufficient sample size. (B) Forest plot visualizing RORs for the same set of drugs. The horizontal bars represent 95% CIs. RORs for drotrecogin alfa and ethacrynic acid were not calculable and are indicated with asterisks (*).
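For readers unfamiliar with DPA, the following sketch shows the standard way an ROR and its approximate 95% CI are derived from a 2×2 table of report counts (log-normal approximation). It is an illustration under stated assumptions rather than the pipeline used in this study: the function names and the example counts are hypothetical and are not FAERS values.

```python
import math


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """ROR with an approximate 95% CI from a 2x2 table of report counts.

    a: reports with the drug of interest and the event of interest
    b: reports with the drug of interest and any other event
    c: reports with any other drug and the event of interest
    d: reports with any other drug and any other event
    Returns (ror, lower, upper), or None when any cell is empty.
    """
    if min(a, b, c, d) <= 0:
        return None  # not calculable with an empty cell
    ror = (a / b) / (c / d)
    # Standard error of ln(ROR) under the usual log-normal approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se)
    upper = math.exp(math.log(ror) + z * se)
    return ror, lower, upper


def is_inverse_signal(result):
    """True when the upper CI limit is below 1 (the criterion stated above)."""
    return result is not None and result[2] < 1.0


# Hypothetical counts chosen only to illustrate the arithmetic.
example = reporting_odds_ratio(10, 990, 5000, 95000)
sparse = reporting_odds_ratio(0, 12, 5000, 95000)
```

The `None` return mirrors the "NA" entries in Fig. 3: when a drug has no co-reported events of interest, the ratio and its interval cannot be computed.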
To examine whether the observed inverse associations were specific to each drug, we conducted DPA for drugs in the same class as each candidate drug or for drugs that are often co-administered with the candidate drug. Cilastatin is co-administered with imipenem, a carbapenem (β-lactam) antibiotic. DPA for β-lactamase inhibitors showed ROR values around 1.0, without the marked inverse association observed with cilastatin (Figure S4). This result suggests that the observed inverse association with COVID-19 infection may be specific to cilastatin. Megestrol is a progesterone receptor agonist. Progesterone preparations classified as progesterone receptor agonists showed RORs around 0.5 with 95% CI upper limits below 1, indicating inverse association trends similar to that of megestrol (Figure S5). This result suggests that the observed effect may represent a pharmacological class effect mediated through progesterone receptors rather than a mechanism specific to megestrol. However, more detailed analysis is needed to clearly distinguish the influence of class effects.
The marked inverse association observed for cilastatin is consistent with previous studies demonstrating that this drug may attenuate COVID-19 pathology by reducing SARS-CoV-2 viral replication and providing protection from the cytokine storm observed in severe COVID-19 [34]. Based on these findings, cilastatin warrants further investigation as a potential therapeutic candidate for COVID-19.
Regarding megestrol, no literature has reported direct associations with COVID-19. However, progesterone, a drug in the same class that our DPA has indicated to be protective against COVID-19, has anti-inflammatory properties, suggesting potential as a COVID-19 treatment [35–37]. Clinical studies on COVID-19 treatment using progesterone have been conducted (NCT04365127, NCT04865029) and in the NCT04865029 trial, progesterone administration in moderate to severe COVID-19 patients resulted in significant improvement in clinical status (a score of 1.5 points, corresponding to a 3-day reduction in oxygen supplementation and 2.5-day reduction in hospitalization duration) [38]. Given the observed class effect of progesterone receptor agonists and the clinical evidence supporting progesterone, megestrol merits evaluation to determine whether it confers similar protective effects.
Note that drotrecogin alfa and ethacrynic acid could not be evaluated using this DPA due to insufficient co-occurrence reports with COVID-19-related AEs in the FAERS database during the analysis period.
[Evaluation of Molecular Potential of Cilastatin and Megestrol for COVID-19 Treatment]
To further explore these findings at the phenome level, we conducted correlation analysis using pathway profiling with LINCS 2020 for cilastatin and megestrol, the two drugs that showed statistically significant inverse associations in DPA. LINCS 2020 is the successor database to the Connectivity Map (CMAP), an established platform for comparing phenomic signatures of any given drug with those associated with specific diseases or disease therapeutics [22]. For each drug, we explored compounds with correlated pathway profiles from 17,170 compounds (241,597 profiles) using Fisher's exact test.
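As an illustration of how Fisher's exact test can score the similarity of two pathway profiles, the sketch below tests whether the overlap between two sets of significantly altered pathways is larger than expected by chance. The exact contingency table used in this study is not specified here, so the set-overlap formulation, the one-sided alternative, the function name, and the counts are all assumptions made for illustration.

```python
from math import comb


def fisher_exact_greater(overlap, n_query, n_reference, n_universe):
    """One-sided Fisher's exact test for enrichment of the overlap.

    overlap:     pathways altered in both the query and the reference profile
    n_query:     pathways altered in the query profile
    n_reference: pathways altered in the reference profile
    n_universe:  total number of pathways tested
    Returns P(X >= overlap) under the hypergeometric null distribution.
    """
    max_overlap = min(n_query, n_reference)
    total = comb(n_universe, n_query)
    # Sum the hypergeometric probabilities of all overlaps at least as large.
    tail = sum(
        comb(n_reference, k) * comb(n_universe - n_reference, n_query - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical example: 5 of 5 altered pathways shared, out of 20 tested.
p_full_overlap = fisher_exact_greater(5, 5, 5, 20)
# With no overlap required, the tail covers every outcome.
p_no_overlap = fisher_exact_greater(0, 5, 5, 20)
```

When many profiles are compared, as with the 241,597 profiles here, the resulting p-values would additionally need multiple-testing adjustment, which corresponds to the adjusted p-values reported in Tables 2 and 3.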
The compounds showing the strongest correlations with cilastatin (Table 2) included artesunate and sirolimus, which are drugs that have undergone clinical trials for COVID-19. The analysis also revealed ipidacrine, ursolic acid, wortmannin, and thapsigargin, which have been reported in the literature as COVID-19 therapeutic candidates. These results suggest that cilastatin has potential as a therapeutic candidate at the phenome level.
Table 2
Top-ranked compounds positively correlated with the pathway profile of cilastatin
| PertID | Name | MOA | Cell | Conc | Time | Adj. P-value | COVID-19 relation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BRD-K79759031 | artesunate | | HT29 | 10uM | 6h | 4.4E-42 | 10 clinical studies |
| BRD-K66896231 | ipidacrine | Acetylcholinesterase inhibitor | HT29 | 10uM | 6h | 1.0E-40 | 1 report |
| BRD-K60067222 | BRD-K60067222 | | PC3 | 10uM | 24h | 8.7E-40 | |
| BRD-K68185022 | ursolic-acid | 11-beta hydroxysteroid dehydrogenase inhibitor | PC3 | 70uM | 24h | 4.2E-39 | 4 reports |
| BRD-A19037878 | BRD-A19037878 | | K562 | 2.5uM | 24h | 8.0E-39 | |
| BRD-A75409952 | wortmannin | PI3K inhibitor | PC3 | 10uM | 6h | 2.6E-38 | 2 reports |
| BRD-K81855038 | roxatidine | | HT29 | 10uM | 6h | 4.9E-38 | |
| BRD-A79768653 | sirolimus | MTOR inhibitor | PC3 | 3.33uM | 24h | 1.5E-37 | 9 clinical studies |
| BRD-K69023402 | thapsigargin | ATPase inhibitor | JURKAT | 0.37uM | 24h | 1.5E-37 | 3 reports |
| BRD-K56032964 | AP-26113 | | HT29 | 0.37uM | 24h | 1.6E-37 | |
The "COVID-19 relation" column indicates whether each compound has been previously associated with COVID-19 through publications or clinical trials. Details of the corresponding reports and studies are provided in Supporting Information 1 and 2.
Similarly, the compounds showing the strongest correlations with megestrol (Table 3) included tianeptine, fedratinib (TG-101348), AZD-6482 and vorinostat. Tianeptine and fedratinib (TG-101348) have undergone clinical trials for COVID-19. AZD-6482 and vorinostat have been reported as COVID-19 therapeutic candidates. These results suggest that megestrol has potential as a COVID-19 therapeutic candidate at the phenome level.
Table 3
Top-ranked compounds positively correlated with the pathway profile of megestrol
| PertID | Name | MOA | Cell | Conc | Time | Adj. P-value | COVID-19 relation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BRD-K37142460 | MI-2 | | OCILY3 | 10uM | 24h | 1.3E-08 | |
| BRD-A81370665 | BI-D1870 | Ribosomal protein inhibitor | HBL1 | 10uM | 24h | 1.3E-08 | |
| BRD-K01436366 | XMD-1150 | Leucine rich repeat kinase inhibitor | BJAB | 10uM | 24h | 3.4E-08 | |
| BRD-K36363294 | I-BET-151 | | HIMG002 | 2.5uM | 24h | 3.4E-08 | |
| BRD-A53077924 | tianeptine | Selective serotonin reuptake enhancer | HIMG001 | 15uM | 24h | 4.1E-08 | 1 clinical study |
| BRD-K12502280 | TG-101348 | FLT3 inhibitor | OCILY3 | 10uM | 4h | 2.5E-07 | 1 clinical study |
| BRD-K58772419 | AZD-6482 | PI3K inhibitor | MCF7 | 1.11uM | 24h | 3.0E-07 | 1 report |
| BRD-K12502280 | TG-101348 | FLT3 inhibitor | TMD8 | 2.5uM | 4h | 5.7E-07 | 1 clinical study |
| BRD-K81418486 | vorinostat | HDAC inhibitor | JURKAT | 10uM | 24h | 5.7E-07 | 3 reports |
| BRD-K54606188 | BRD-K54606188 | | TMD8 | 0.66uM | 4h | 1.9E-06 | |
The "COVID-19 relation" column indicates whether each compound has been previously associated with COVID-19 through publications or clinical trials. Details of the corresponding reports and studies are provided in Supporting Information 1 and 2.
[Hypothesized Mechanisms of Action of Cilastatin for COVID-19 Treatment]
The correlation with established COVID-19 therapeutic candidates provided supporting evidence for the potential of cilastatin for clinical application. To further strengthen this rationale and evaluate whether cilastatin has a sufficient mechanistic basis to justify clinical development, we sought to elucidate the underlying biological mechanisms based on pathway profiling data. We reasoned that because pathway-level information represents a functional layer closer to biological responses than individual genes, a detailed analysis of the LINCS 2020 data can facilitate hypothesis generation regarding mechanisms of action.
In the LINCS 2020 data, cilastatin treatment significantly decreased the expression of 42 pathways in the HT29 cell line (Table S2). Enrichment Map analysis revealed that gene sets related to splicing, ribosome, mitochondria, telomere, and ubiquitin pathways were altered (Fig. 4). Based on these pathway alterations, we constructed a mechanistic hypothesis for how cilastatin might intervene in the pathophysiological processes of COVID-19 induced by SARS-CoV-2 infection (Fig. 5).
This network visualizes gene ontology (GO) biological process terms significantly down-regulated by cilastatin treatment. Each node represents a gene set corresponding to a GO term, with node size proportional to the number of genes in the set and node color reflecting statistical significance (log₁₀-transformed p-value). Edges indicate gene overlap between gene sets. Functionally related clusters are annotated.
This schematic illustrates potential therapeutic mechanisms by which cilastatin may alleviate COVID-19 pathology, inferred from pathway-level gene expression changes. Cilastatin-mediated inhibition of DPEP1 is proposed to influence multiple downstream biological processes, including ribosome biogenesis, RNA splicing, and mitochondrial metabolism. These changes converge on antiviral and anti-inflammatory outcomes such as suppression of viral replication and inflammation. Pathways shaded in blue indicate those with down-regulated gene expression following cilastatin treatment.
Cilastatin is clinically used as a dipeptidase-1 (DPEP1) inhibitor in combination with imipenem to prevent renal degradation of the antibiotic by DPEP1. Beyond this established role in antibiotic protection, our analysis of LINCS 2020 data revealed that cilastatin modulates multiple cellular pathways related to splicing, ribosomal function, and mitochondrial processes. DPEP1, in addition to its primary function of dipeptide hydrolysis in the kidney, also functions as a neutrophil adhesion receptor on vascular endothelial cells and is known to promote neutrophil recruitment during inflammation [39]. DPEP1 may restrict host ribosomes utilized by SARS-CoV-2 by controlling the PI3K/Akt/mTOR pathway [40], which regulates cellular protein synthesis and metabolism [41]. This may suppress viral replication through a mechanism similar to that of mTOR inhibitors such as sirolimus (rapamycin) [42].
Of particular interest, it is known that the nonstructural protein 16 (NSP16) of SARS-CoV-2 binds to U1/U2 snRNA and interferes with the host splicing mechanism [43]. Suppression of splicing-related pathways may function as a cellular stress response that limits host gene expression and potentially suppresses viral replication by restricting protein synthesis. Moreover, splicing abnormalities may activate antiviral responses via RIG-I-like receptors [44].
Beyond the splicing mechanism, cilastatin also affects mitochondrial pathways. Decreased expression of mitochondria-related genes suggests changes in cellular metabolic activity and immune signaling. These changes are expected to achieve balanced immune regulation by limiting energy available for viral replication while maintaining type I interferon responses through mitochondrial antiviral signaling protein (MAVS) [45].
[Study Limitations]
While our multilayered computational approach successfully identified several promising therapeutic candidates and elucidated potential mechanisms of action, there are several important limitations that must be addressed.
Limitations of the GATE approach
The predictive performance of our knowledge graph-based method heavily depends on the comprehensiveness and quality of the base knowledge graph being used. In biomedical knowledge graphs, data bias tends to occur based on research progress and attention levels, with particularly insufficient information regarding rare diseases and drugs in early development stages. This deficiency may limit the discovery of novel drug-disease associations and reduce predictability. Therefore, developing comprehensive knowledge graphs that systematically incorporate data on rare diseases and emerging therapeutics will be essential for improving comprehensiveness and reducing bias in future studies.
Limitations of real-world data analysis
Since FAERS is a spontaneous AE reporting system, it inherently contains problems such as reporting bias, underreporting, and selective reporting. A critical limitation emerged from our analysis of drotrecogin alfa and ethacrynic acid, which revealed that the method cannot analyze drugs with insufficient data in the database. This particularly affects newer drugs that lack extensive real-world usage data, limiting our ability to detect potential associations. Conversely, this suggests that classical, widely used drugs may have higher detection sensitivity due to larger volumes of available data. Furthermore, inferring causal relationships from observed associations remains challenging, and the influence of confounding factors cannot be completely eliminated.
Limitations of phenome analysis
The LINCS database used is primarily based on drug responses in cancer cell lines, which exhibit different characteristics from alveolar epithelial cells, vascular endothelial cells, and immune cells, which are important in COVID-19 pathogenesis. Additionally, drug responses under the inflammatory environment in patients during SARS-CoV-2 infection may differ significantly from normal cell culture conditions. Furthermore, it is difficult to capture host-virus interactions and other complex biological processes that can only be detected in in vivo conditions using cell line-based phenome data. While extensive animal data have been accumulated and could potentially be curated and integrated into such databases, comprehensive incorporation remains challenging due to species differences in drug target molecules, including variations in binding site homology and pharmacological responses between species.
Need for experimental validation
This study primarily focused on computational predictions and analysis of existing databases to systematically identify therapeutic candidates and elucidate potential mechanisms. While this approach provided valuable insights into drug repurposing opportunities for COVID-19 treatment, future studies should include direct validation using SARS-CoV-2-infected models and/or clinical evaluation.