3.1 Cohort characteristics
The analyzed dataset included 111 unique donors: 57 controls (51.4%) and 54 AD cases (48.6%). All six major cell types were represented in both diagnostic groups, with per-cell-type donor counts ranging from 107 (vascular cells) to 111 (oligodendrocytes). Balanced representation enabled robust classifier training (Fig. 1).
3.2 Classification performance by cell type
Classification performance varied substantially across cell types, with non-neuronal populations showing greater discriminative capacity than neurons (Fig. 2, Table 1). Astrocytes achieved the highest AUC in a single-seed configuration (AUC = 0.646), while microglia/immune cells showed the most robust multi-seed performance (AUC = 0.574 ± 0.088). Vascular cells showed moderate performance (AUC = 0.540 ± 0.069). Notably, both excitatory neurons (AUC = 0.480 ± 0.079) and inhibitory neurons (AUC = 0.459 ± 0.123) exhibited limited discriminative capacity, with mean multi-seed AUCs at or below the chance level of 0.5. Although modest, the reported AUCs were significantly higher than those obtained in permutation tests, indicating a reproducible biological signal.
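As an illustration of how an AUC can be compared against a permutation null, the following is a minimal sketch using only the Python standard library. The function names, the number of permutations, the one-sided test, and the synthetic scores are assumptions made for this example; the section does not specify the exact permutation procedure used in the study.

```python
import random

def auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_p_value(labels, scores, n_perm=1000, seed=0):
    """One-sided p-value: share of label shuffles whose AUC >= the observed AUC."""
    rng = random.Random(seed)
    observed = auc(labels, scores)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if auc(shuffled, scores) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)

# Synthetic data only (not study data): 57 controls and 54 cases
# whose scores differ by a small, assumed shift.
rng = random.Random(1)
labels = [0] * 57 + [1] * 54
scores = [rng.gauss(0.4 * y, 1.0) for y in labels]
observed_auc, p_value = permutation_p_value(labels, scores)
print(f"AUC = {observed_auc:.3f}, permutation p = {p_value:.4f}")
```

Shuffling the labels while keeping the scores fixed breaks any association between prediction and diagnosis, so the shuffled AUCs approximate the distribution expected under chance.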
Table 1
Classification performance by cell type and analytical configuration.
| Cell type | Config.1 | Config.2 | Config.3 | Average | Rank |
| Astrocytes | 0.583 | 0.646 | 0.550 ± 0.067 | 0.593 | 1 |
| Microglia/Immune | 0.590 | 0.571 | 0.574 ± 0.088 | 0.578 | 2 |
| Vascular cells | 0.600 | 0.525 | 0.540 ± 0.069 | 0.555 | 3 |
| Excitatory neurons | 0.516 | 0.559 | 0.480 ± 0.079 | 0.518 | 4 |
| Oligodendrocytes | 0.463 | 0.524 | 0.529 ± 0.113 | 0.505 | 5 |
| Inhibitory neurons | 0.523 | 0.482 | 0.459 ± 0.123 | 0.488 | 6 |
Config.1: tun5 (single split with Naive Bayes comparison); Config.2: tun8 (single split, extended analysis); Config.3: tun9 (cross-validation with 10 seeds, values expressed as mean ± SD). Bold values indicate best performance per configuration.
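The Average and Rank columns can be recomputed from the three configuration values. The sketch below assumes that Average is the unweighted mean of the three configurations (using the Config.3 mean across seeds) and that Rank orders cell types by descending Average; both assumptions are inferred from the tabulated numbers rather than stated in the text.

```python
# AUC values transcribed from Table 1 (Config.3 entry is the mean across seeds).
auc_by_config = {
    "Astrocytes": (0.583, 0.646, 0.550),
    "Microglia/Immune": (0.590, 0.571, 0.574),
    "Vascular cells": (0.600, 0.525, 0.540),
    "Oligodendrocytes": (0.463, 0.524, 0.529),
    "Excitatory neurons": (0.516, 0.559, 0.480),
    "Inhibitory neurons": (0.523, 0.482, 0.459),
}

# Unweighted mean across the three configurations (assumed definition of "Average").
averages = {cell: sum(vals) / len(vals) for cell, vals in auc_by_config.items()}

# Rank 1 is the highest average AUC.
ranked = sorted(averages, key=averages.get, reverse=True)
ranks = {cell: position + 1 for position, cell in enumerate(ranked)}

for cell in ranked:
    print(f"{ranks[cell]}  {cell:<20} {averages[cell]:.3f}")
```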
3.3 Discriminative genes and associated biological processes
Model coefficients identified genes whose expression contributed most strongly to classification. Top contributing genes clustered functionally into the following processes (Fig. 3, Table 2):
- Neuroinflammation and immune activation (especially in microglia): IKZF3, HLA-DRA, CCL4, CXCL1, CXCR4, VAV1, LAT2
- Lipid metabolism and cholesterol transport: CH25H, CETP, SCD, PIPOX
- Oxidative stress response and detoxification: MT1E, MT1X, NXN, PARP14
- Vascular integrity and blood-brain barrier: F3, CLDN5, FBLN1, ABCB1, AQP1
These patterns were consistent across seeds and functionally related cell types.
Table 2
Top discriminative genes with biological function.
| Gene | Cell type | Coef. | Biological function |
| MCC | Microglia | -0.444 | Cell cycle regulator; tumor suppressor |
| IKZF3 | Microglia | +0.375 | Ikaros family transcription factor; lymphocyte development |
| HLA-DRA | Astrocytes | -0.110 | MHC class II antigen presentation; immune response |
| BCL2A1 | Astrocytes | +0.094 | BCL-2 family anti-apoptotic protein |
| CH25H | Vascular | +0.062 | Cholesterol 25-hydroxylase; oxysterol synthesis, inflammation |
| MT1E | Vascular | +0.060 | Metallothionein 1E; oxidative stress, heavy metal binding |
| CETP | Microglia | +0.224 | Cholesteryl ester transfer protein; lipid metabolism |
| TFPI | Astrocytes | -0.082 | Tissue factor pathway inhibitor; coagulation regulation |
Coef.: logistic regression model coefficient. Positive values indicate that higher expression shifts the prediction toward AD; negative values indicate that higher expression shifts the prediction toward control.
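For readers less familiar with logistic regression, a coefficient can be read as a multiplicative change in the odds of an AD prediction. The sketch below converts the Table 2 coefficients to odds ratios via exp(coef). It assumes a one-unit increase in the feature as scaled for the model; the feature scaling is not stated in this section, so the magnitudes are illustrative only.

```python
import math

# Coefficients transcribed from Table 2, keyed by (gene, cell type).
coefficients = {
    ("MCC", "Microglia"): -0.444,
    ("IKZF3", "Microglia"): 0.375,
    ("HLA-DRA", "Astrocytes"): -0.110,
    ("BCL2A1", "Astrocytes"): 0.094,
    ("CH25H", "Vascular"): 0.062,
    ("MT1E", "Vascular"): 0.060,
    ("CETP", "Microglia"): 0.224,
    ("TFPI", "Astrocytes"): -0.082,
}

def odds_ratio(coef):
    """Multiplicative change in the odds of AD per one-unit increase in the feature."""
    return math.exp(coef)

# List genes from largest to smallest absolute coefficient.
for (gene, cell_type), coef in sorted(coefficients.items(), key=lambda kv: -abs(kv[1])):
    print(f"{gene:<8} {cell_type:<11} coef = {coef:+.3f}  odds ratio = {odds_ratio(coef):.3f}")
```

An odds ratio above 1 corresponds to a positive coefficient and an odds ratio below 1 to a negative one; a coefficient of zero leaves the odds unchanged.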
3.4 Differential expression analysis
Differential expression analysis identified genes with significant changes (p < 0.05) between AD and controls (Fig. 4). In astrocytes: ITGA1 (log2FC = +1.11, p = 0.008), TLR5 (log2FC = +1.33, p = 0.012), TYMP (log2FC = -0.98, p = 0.012), IGFBP6 (log2FC = -1.73, p = 0.023), TFPI (log2FC = -1.43, p = 0.027). In vascular cells, MT1E showed the most statistically significant change (log2FC = +0.16, p = 0.00026), corresponding to a fold change of approximately 1.12, or about a 12% increase in AD relative to controls.
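The relation between a log2 fold change and a percent change in expression can be made explicit with a short calculation. The sketch below assumes the standard definition, fold change = 2^log2FC, and applies it to the values reported above; the helper name is hypothetical.

```python
def percent_change(log2fc):
    """Percent change in AD relative to controls implied by a log2 fold change."""
    return (2.0 ** log2fc - 1.0) * 100.0

# log2FC values transcribed from the differential expression results.
reported = {
    "ITGA1": 1.11,
    "TLR5": 1.33,
    "TYMP": -0.98,
    "IGFBP6": -1.73,
    "TFPI": -1.43,
    "MT1E": 0.16,
}

for gene, log2fc in reported.items():
    print(f"{gene:<7} log2FC = {log2fc:+.2f}  fold change = {2.0 ** log2fc:.3f}  "
          f"change = {percent_change(log2fc):+.1f}%")
```

Under this definition, a log2FC of +1 is a doubling (a 100% increase) and a log2FC of -1 is a halving (a 50% decrease), so small log2FC values correspond to small percent changes.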
3.5 Convergent pathological axes
Integration of discriminative genes across cell types revealed convergence on four main biological axes (Fig. 5): (1) Neuroinflammatory axis: multiple chemokines (CCL4, CXCL1, CXCR4) and immune signaling molecules (VAV1, LAT2, IKZF3), suggesting chronic immune activation and peripheral cell recruitment. (2) Cholesterol/lipid metabolism axis: CH25H (produces 25-hydroxycholesterol with immunomodulatory properties), CETP, and SCD, implicating altered brain cholesterol homeostasis. (3) Oxidative stress response axis: upregulated metallothioneins (MT1E, MT1X), indicating activation of antioxidant defense mechanisms. (4) Vascular/blood-brain barrier axis: F3 (tissue factor), CLDN5 (claudin-5), FBLN1 (fibulin-1), and ABCB1 (P-glycoprotein), suggesting compromised BBB integrity.
3.6 In silico perturbation
Simulated normalization of the top 50 genes with the highest absolute weights (0.25 standard deviation shift toward control levels) resulted in marked changes in model output probabilities (Fig. 6). In microglia, the simulated perturbation shifted the mean predicted AD probability from 0.436 to 0.082 (Δprob = −0.354), indicating high model sensitivity to coordinated transcriptomic variation within this cell type. In astrocytes, the effect was more modest (Δprob = −0.019), potentially reflecting greater redundancy or complexity in astrocytic transcriptional programs.
This analysis does not establish causality or imply biological reversibility of disease processes. Instead, it serves as an exploratory computational approach to prioritize genes and pathways contributing disproportionately to the observed transcriptomic associations.
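The perturbation procedure described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names, the toy weights and samples, and the rule that a shifted value never overshoots the control mean are all hypothetical choices made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, weights, bias):
    """Predicted AD probability from a logistic regression model."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))

def perturb(x, weights, control_mean, sd, top_k=50, shift_sd=0.25):
    """Move the top_k highest-|weight| features shift_sd SDs toward the control mean."""
    top = sorted(range(len(weights)), key=lambda j: abs(weights[j]), reverse=True)[:top_k]
    shifted = list(x)
    for j in top:
        gap = control_mean[j] - x[j]
        step = min(shift_sd * sd[j], abs(gap))  # assumption: never overshoot the control mean
        shifted[j] = x[j] + math.copysign(step, gap)
    return shifted

# Toy model with three genes (illustrative values only, not study data).
weights = [0.8, -0.5, 0.1]
bias = 0.0
control_mean = [0.0, 0.0, 0.0]
sd = [1.0, 1.0, 1.0]
ad_samples = [[1.0, -1.0, 0.2], [0.6, -0.4, 0.0]]

before = sum(predict(x, weights, bias) for x in ad_samples) / len(ad_samples)
after = sum(
    predict(perturb(x, weights, control_mean, sd, top_k=2), weights, bias)
    for x in ad_samples
) / len(ad_samples)
delta_prob = after - before
print(f"mean P(AD): {before:.3f} -> {after:.3f} (delta = {delta_prob:+.3f})")
```

Because only the model inputs are altered, the resulting change in probability measures the model's sensitivity to the selected genes and nothing more.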