Brain samples
We analyzed 840 cortical postmortem brain samples obtained from three independent cohorts to investigate epigenomic-based heterogeneity in LOAD. We included 249 prefrontal cortex (PFC) samples from the UK Brain Banks Network (UKBBN), and 220 PFC samples from the University of Pittsburgh’s Alzheimer’s Disease Research Center (PITT-ADRC) [10], and 371 dorsolateral PFC (DLPFC) samples from the Religious Orders Study and Memory and Aging Project (ROSMAP) [11]. Samples were selected from donors with no or minimal AD pathology and definite AD (see Diagnostic Criteria).
Diagnostic Criteria
The diagnosis of LOAD and control samples was established based on clinical and neuropathological data from donors aged over 65 at the time of death. Dementia was defined by an antemortem Mini-Mental State Examination (MMSE) score below 24 or a Clinical Dementia Rating (CDR) score of ≥ 1. Within the dementia group, samples were classified as AD if they met the criteria of a "probable" or "definite" Consortium to Establish a Registry for Alzheimer's Disease (CERAD) score for neuritic plaques and a Braak neurofibrillary tangle (NFT) stage of ≥ 3. In cases where CERAD data were unavailable, samples with a Braak stage of ≥ 5 were included in the AD group.
Control samples were defined using stringent criteria specific to each brain bank. They were required to have a Braak stage of 0–2 for NFT pathology and negative for amyloid (CERAD neuritic plaque score 0). Additionally, when available from brain banks, both AD and control samples were screened for co-occurring other tau pathologies, including argyrophilic grain disease, as well as alpha-synuclein and TDP-43 pathologies. Only samples negative for these co-pathologies, with none to moderate cerebrovascular disease and rare microinfarcts, were included. Furthermore, samples from donors with known systemic hematopoietic malignancies, such as leukemia, were excluded to prevent the inclusion of malignant cells circulating in the brain.
Methylomic profiling and data harmonization
For the UKBBN and PITT-ADRC cohorts, genomic DNA was extracted from 30 mg of fresh-frozen tissue using the AllPrep DNA/RNA/miRNA Universal Kit (QIAGEN), followed by bisulfite treatment with the EZ-96 DNA Methylation-Gold Kit (Zymo Research). The treated DNA was analyzed using Illumina's Infinium MethylationEPIC v1.0 BeadChip arrays, quantifying DNAm at over 850,000 CpG sites. For the ROSMAP cohort, Illumina 450K methylation raw data files were obtained from the Accelerated Medicine Partnership (AMP-AD) portal (synID: syn7357283). Uniform and stringent quality control (QC) and normalization procedures were then applied across all three cohorts.
The raw methylation IDAT files were imported into R (version 4.3.0) using the 'readEPIC' function from the wateRmelon R package (v.2.6.0) [12], generating "MethylumiSet" objects. For quality control (QC), samples with a median signal intensity below 1000 were excluded. Additional outliers were removed using the 'outlyx' function, which detects anomalies based on the Inter-Quartile Range (IQR) and Mahalanobis distance. Bisulfite conversion efficiency was assessed with the 'bscon' function, discarding samples with efficiency below 0.8. Low-quality samples and probes were filtered using the 'pfilter' function, removing samples with a detection P-value > 0.05 in more than 5% of probes and CpGs with fewer than three bead counts in 5% of samples or a detection P-value > 0.05 in 1% of samples.
Methylation sex was estimated using the ‘estimateSex’ function based on CpG sites on chromosomes X and Y, and any mismatches with true gender were excluded. Data normalization was performed using 'BMIQ' (Beta-Mixture Quantile intra-sample normalization), preserving inter-sample variance. Probe-Wise Outlier Detection (PWOD) via the 'homonymous' function removed probes associated with known variants to prevent downstream biases. Additional filtering excluded probes with SNP IDs, cross-hybridizing probes [13], and those on sex chromosomes.
Cell type composition in bulk PFC DNAm data was computed using CETYGO (v0.99.0) [14], employing a reference panel to distinguish inhibitory (GABAergic) neurons, excitatory (glutamatergic) neurons, oligodendrocytes, microglia, and astrocytes. To account for mutual technical and biological effects (e.g., age, sex, postmortem interval, plates, cell composition, and cohort-specific batches), a linear multiple regression model was applied to each methylation probe. Non-variant probes were filtered using the 90th percentile absolute deviation, resulting in 325,834 CpGs from EPIC arrays for subtyping. In the ROSMAP cohort, the CpG set was defined by the overlap between the processed 450K array and the available 450K probes within the final EPIC CpGs, yielding 148,951 CpGs.
Following pre-processing and QC, the final dataset included 240 UKBBN samples (129 LOAD, 111 controls), 220 PITT-ADRC samples (192 LOAD, 28 controls), and 371 ROSMAP samples (248 LOAD, 123 controls). This dataset, detailed in Supplementary Table 1, integrates diverse demographics, clinical profiles, and neuropathological information.
Bulk RNA sequencing
Total RNA quality in the UKBBN and PITT-ADRC cohorts was assessed using the Agilent 4200 TapeStation System, with samples having an RNA Integrity Number (RIN) below 3 excluded. Library preparation was conducted using the Illumina Stranded mRNA Prep kit, followed by sequencing on the NovaSeq 6000 platform at the Exeter Sequencing Service. Raw sequencing data in FASTQ format underwent quality trimming to remove low quality bases and adapter sequences. Processed reads were aligned to the human reference genome (GRCh37) using the STAR aligner (v2.7.3a) [15] with default parameters. After alignment, a raw count matrix was generated, and genes with low expression, defined as counts below 10 in over 80% of samples. Gene counts were normalized using the Trimmed Mean of M-values (TMM) method from the edgeR package (v3.42.4) [16]. The normalized counts were then transformed into log-transformed counts per million (logCPM) to stabilize variance and improve interpretability. Following quality control and processing, the RNA sequencing dataset comprised 483 cortical samples from three independent postmortem brain cohorts: UKBBN, PITT-ADRC, and ROSMAP. The dataset included 174 samples classified as LOAD-S1, with 63 from UKBBN, 57 from PITT-ADRC, and 54 from ROSMAP. LOAD-S2 consisted of 107 samples, with 45 from UKBBN, 20 from PITT-ADRC, and 42 from ROSMAP. The control group contained 202 samples, including 109 from UKBBN, 25 from PITT-ADRC, and 68 from ROSMAP.
Genotyping, imputation, and generation of polygenic scores
Genotyping was conducted using the Illumina Global Screening Array (GSA) for the UKBBN and PITT-ADRC cohorts. For the ROSMAP cohort, genotyping data from the Affymetrix GeneChip 6.0 (Affymetrix, Inc.) and the Illumina HumanOmniExpress chip array were obtained from (https://www.synapse.org/ ). Quality control was performed using PLINK (v2.0), excluding samples with > 5% missing values and SNPs with > 1% missing values. Additionally, SNPs with a minor allele frequency (MAF) < 0.05 and a Hardy-Weinberg equilibrium p-value < 1.0 × 10 were filtered out to retain the most reliable signals.
Processed genotype data were merged with the 1000 Genomes Project dataset, and principal component analysis (PCA) was performed to determine sample ethnicity. Samples of non-European ancestry were removed based on PCA results. The quality-controlled genotype data were uploaded to the Michigan Imputation Server and imputed using Minimac4 with the 1000 Genomes reference panel (phase 3, version 5). The imputed data for each chromosome were compiled into a unified VCF file using BCFtools (v1.9) [17] with tri-allelic SNPs removed using VCFtools (v0.1.16) [18]. The final VCF file was converted into PLINK binary format.
Polygenic scores (PGS) were generated using the imputed genotype data and PRSice-2 software [19]. Genetic variants associated with AD were selected based on the genome-wide association study (GWAS) by Bellenguez et al., 2022 [20].
Data analysis
Clustering algorithms
To identify cortical brain subgroups with similar DNAm profiles, agglomerative hierarchical clustering using the Ward D2 method with Euclidean distance and K-means clustering as a nonhierarchical approach were applied to LOAD samples, with each cohort analyzed independently. The optimal number of clusters for both methods was determined using the Elbow method. To assess clustering consistency, Normalized Mutual Information (NMI) analysis was performed for each cohort using the Aricode R package (v1.0.3) [21] (https://github.com/jchiquet/aricode), comparing hierarchical and K-means clustering results. The cluster number with the highest NMI value was selected for further analysis.
To assess cluster specificity, Sparse Partial Least Squares Discriminant Analysis (sPLSDA) was performed using the mixOmics package (v6.24.0) [22] for each clustering algorithm to identify the most discriminative features. From each sPLS-DA model, 12,000 features corresponding to six components were extracted. To evaluate robustness, cluster labels were randomized 10 times across population proportions ranging from 100–1%, followed by repeated sPLS-DA modeling for each iteration. The overlap between the 12,000 extracted features from each iteration and those from the original cluster labels was then examined. Technical batch effects, including plate and chip variations, were assessed using Spearman’s correlation test for continuous variables and Normalized Mutual Information (NMI) for discrete variables, the latter being particularly suited for evaluating clustering outcomes.
To verify the generalizability of findings across Illumina EPIC arrays (UKBBN and PITT-ADRC cohorts) and 450K arrays (ROSMAP cohort), Weighted Gene Co-expression Network Analysis (WGCNA) was employed. Co-expression networks were constructed using the 'blockwiseModules' function in the WGCNA R package (v1.72-5) [23] based on selected EPIC array probes. Module eigenprobes were then calculated using the 'moduleEigengenes' function, and their associations with categorical cluster labels were assessed through ANOVA followed by post hoc Tukey’s tests. For modules showing significant differences between identified clusters, eigenprobes were recalculated using the subset of probes available on the 450K array. Cluster associated co-methylated probe preservation was confirmed by demonstrating that eigenvalue differences between cluster labels remained significant with the 450K probes and that eigenvalue differences for specific subtypes across both platforms were non-significant.
Cross-cohort replication
To identify matching clusters with maximum similarity in DNAm profiles across the UKBBN, PITT-ADRC, and ROSMAP cohorts, we employed two complementary approaches at both the probe and sample levels.
In the first approach, we calculated the median DNAm value per probe for each clustered brain sample within each cohort. Pearson correlation tests were then used to assess the relationship between these median values across the three cohorts.
In the second approach, the UKBBN cohort was designated as the discovery dataset. Using the ‘sPLSDA’ function from the mixOmics package (v6.24.0) [22], we reduced data into a lower dimensional latent space while preserving class separation. The first two latent spaces of sPLSDA were mapped against the clusters identified in this cohort. Convex hulls were established around each UKBBN cluster by connecting the outermost points. DNAm profiles from a replication cohort (either PITT-ADRC or ROSMAP) were then projected onto these latent spaces, and the number of replication cohort samples falling within the UKBBN cluster hulls was recorded. To assess significance, contingency tables were constructed, and a hypergeometric test was performed to determine overlaps between the discovery and replication cohort clusters. This procedure was repeated with PITT-ADRC and ROSMAP serving as discovery sets, while the other two cohorts acted as replication cohorts. An epigenomic-based LOAD subtype was confirmed only when both approaches validated the DNAm profile similarities across all three cohorts, at both the probe and sample levels, using two clustering methods.
Subtype-specific epigenome-wide association analysis
Epigenome-wide association studies (EWASs) were conducted using multiple linear regression models to identify differentially methylated positions (DMPs) associated with each predicted subtype in each cohort. Methylation data were rigorously adjusted for potential confounders, including age, sex, cell type composition, brain banks, and plates, ensuring consistency with previous clustering steps. Surrogate variable analysis (SVA) was applied to adjust for unwanted variations unrelated to the main outcome, further refining accuracy. P-value inflation was assessed using the inflation index for each EWAS and estimates and standard errors were corrected if needed using the Bacon package in R (v1.28.0) [24] before proceeding to the meta-analysis. The meta-analysis was performed using the inverse variance method via the rma.uni function in the metafor R package (v4.6-0) [25]. Subtype-specific DNAm signatures were identified using relaxed criteria. Specifically, DMPs were required to have a Bonferroni adjusted p-value under 0.05. The overlap between DMPs identified across subtypes in the meta-analysis was assessed using the geneOverlap R package (v1.40.0) (http://shenlab-sinai.github.io/shenlab-sinai/).
Cell-type specific DNAm enrichment analysis
To investigate the cell type specificity of the methylation signatures associated with the identified subtypes, we employed the CEAM (Cell-type Enrichment Analysis for Methylation) framework developed by Müller et al. [26] (https://um-dementia-systems-biology.shinyapps.io/CEAM/). CEAM uses fluorescence-activated nuclei sorting (FANS) derived methylation profiles from purified brain cell populations, neurons, oligodendrocytes, microglia, and astrocytes, to construct cell type-specific CpG sets. These CpG sets are stratified into three specificity levels, ranging from uniquely cell type-specific CpGs to those shared across multiple cell types, enabling nuanced enrichment analyses. For each LOAD subtype, meta-analysis-derived DMPs were independently assessed against these CpG panels.
Colocalization analysis
To investigate whether subtype-specific methylation signals share a common genetic basis with AD genetic risk loci, we performed colocalization analysis using methylation quantitative trait loci (mQTL) data from the ROSMAP cohort, accessed via Brain xQTLServe (https://mostafavilab.stat.ubc.ca/xqtl/). As input, we extracted cis-mQTLs (p < 1.0 × 10⁻⁵) associated with DMPs identified in the subtype-specific EWASs, using the suggestive nominal p-value threshold of 1.0 × 10⁻⁵. Bayesian colocalization was conducted using the coloc R package (v5.2.3) [27] for all pairwise comparisons between genomic regions associated with AD risk (as identified by Bellenguez et al.) [20] and variants linked to LOAD subtype DMPs. The coloc.abf function was used, and colocalization was considered significant when the combined posterior probability of hypotheses H3 and H4 exceeded 0.90.
Bulk transcriptomic analysis
Differential expression analysis (DEA) was performed using the Limma R package (v3.55.10) [28], comparing LOAD subtypes to their respective controls across the three cohorts. To account for potential confounders, the analysis included covariates such as age, sex, RIN, brain banks, and surrogate variables, which were incorporated to adjust for unmeasured variation, including differences in cell type composition. Bacon correction was applied to estimates and standard errors to reduce bias. Genes were considered significantly differentially expressed if they showed a nominal p-value below 0.05 in all cohorts, a fold change greater than 1.5 in at least one cohort, and a consistent direction of effect across all cohorts. The lenient p-value threshold was selected to balance sensitivity and reproducibility by emphasizing consistent signals across datasets while avoiding false positives from single-cohort noise. The fold-change cutoff, however, ensures biological relevance, making this approach suitable for exploratory downstream subtype analyses at the transcriptional level.
Single-cell transcriptomic analysis
Subtype-specific analyses of gene expressions across 12 microglial states were conducted using single cell RNA sequencing data from the ROSMAP study. This dataset comprises 427 samples, including 70 definite AD cases and 64 control samples, overlapping with the subset of ROSMAP data utilized in this project. Subtype-specific differentially expressed genes (DEGs) were identified for each of the 12 microglial states characterized by Sun et al.[29].
The muscat R package (v1.14.0) was used with default parameters. Non-expressed or lowly expressed genes were filtered out, and outlier detection was performed at the cell level. To generate a total expression value for each sample, cells were aggregated in a pseudobulk manner by summing RNA expression from all cells within each microglial state. Pseudobulk differential DEA was then conducted for each microglial state, comparing each subtype both to healthy controls and to one another. Genes that were significantly differentially expressed in subtype-to-subtype comparisons, and that were also differentially expressed versus controls in at least one subtype were subjected to GO enrichment analysis. For this analysis, a more lenient significance threshold of p < 0.05 was chosen to broaden the candidate gene set and uncover subtler, potentially biologically relevant pathways.
GO enrichment analysis
GO enrichment analysis was conducted using the clusterProfiler R package (v4.12.6) [30] to identify biological processes, molecular functions, and cellular components significantly associated with DEGs for each subtype. For subtype-specific microglial transcriptomic signatures, enrichment analysis focused exclusively on GO terms related to microglial activity, immune responses, and pro- and anti-inflammatory states [31].