Impact of genetic variation on genome-wide DNA methylation patterns. Filtering and quality control of DNA methylation data
Following alignment and preprocessing, we assessed CpG coverage to evaluate overall data quality and to determine appropriate filtering thresholds. Descriptive statistics calculated prior to filtering indicated that the average read depth per CpG site across samples was ~ 15x. Based on the max read count distributions, filters were applied to remove cytosines with excessively high read counts (total read counts > 100, or with counts > 40 in more than five samples). CpG sites with 0% or 100% methylation across all samples were also removed prior to differential methylation analysis.
Of the initial 22.3 million CpG sites, approximately 6.5 million were excluded due to overlap with annotated SNPs. To further reduce the potential impact of unannotated genetic variants, we excluded CpG sites that had zero coverage in any sample, as such sites may represent mismatches due to genetic variation. Next, methylated and unmethylated reads from both DNA strands were summed per CpG, and sites were retained only if strand-specific methylation proportions differed by no more than 50%. After applying all quality control and filtering steps, approximately 5.4 million CpGs remained for downstream analysis. Filtering reduced CpG island representation from 5–1.50% on autosomes and from 2.80–0.20% on the X chromosome (Fig. 1B), with the requirement for non-zero coverage having the greatest impact on this reduction.
Principal component analysis (PCA) on the 2,500 most variable CpGs showed clear separation of samples by sex and cross on both the autosomes and X chromosome. Separation by sex is consistent with prior reports of sex-biased methylation in mouse liver (3). Among male samples, variability between the two wild-type groups (C + and D+) is likely to reflect genetic background effects, as they clustered by cross rather than forming a unified group (Fig. 1C).
Mutations in Kdm5c or Kdm5d are associated with both loss and gain of methylation on autosomes
Next, we identified DMCs in 6 contrasts: C+/- vs C+/+ (DMCs sensitive to Kdm5c dosage in female mice); C- vs C+ (DMCs sensitive to the Kdm5c mutation); D- vs D+ (DMCs sensitive to the Kdm5d mutation); C + vs D+ (DMCs sensitive to genetic background), and sex-biased DMCs obtained by contrasting wild type (C+/+ vs C+) and mutant (C+/- vs C-) females and males (Supplementary Tables 1 and 2). The substantial number of DMCs from the C + vs D + male contrast suggests that the SNP filtering was insufficient to eliminate genetic background effects. To minimize the likelihood of detecting the genetic background effects on methylation rather than the impact of the mutations, all DMCs that overlapped with those identified in the C + vs D + contrast were excluded from downstream analyses (Table 1, Supplementary Fig. 1, Supplementary Tables 1 and 2). The largest numbers of DMCs were found in the contrasts comparing males and females.
To examine the distribution of sex-biased DNA methylation across the X chromosome, sDMCs were binned into 2 Mb windows (Supplementary Fig. 2). In both the wild-type (C+/+ vs C+; Supplementary Fig. 2A), and the mutant (C+/- vs C-; Supplementary Fig. 2B) comparisons, we observed widespread sex-bias in DNA methylation across the X chromosome. Although sDMCs that were hypermethylated in males were more numerous, this skew is likely attributable to the reduced representation of CGIs in our data. The overall distribution of sDMCs in the mutant contrast closely resembled that of the wild type.
Contrasts comparing the effects of the mutations on methylation in male livers detected comparable numbers of DMCs. In females, the Kdm5c mutation affected fewer CG sites than in males. DMCs were distributed across chromosomes with both hypo- and hypermethylated CGs detected on each chromosome (Supplementary Fig. 1 and Supplementary Table 2).
Table 1
Number of DMCs (p < 0.05) and DMRs (≥ 50% of CpGs with p < 0.05) identified by DSS after filtering out overlaps with the WT male contrast (C + vs D+).
Contrast | Autosomes | X chromosome | Y chromosome |
|---|
| | DMCs (hypo vs hypermethylated) | DMRs (hypo vs hypermethylated) | DMCs (hypo vs hypermethylated) | DMRs (hypo vs hypermethylated) | DMCs (hypo vs hypermethylated) | DMRs (hypo vs hypermethylated) |
C+/- vs C+/+ | 1150 (289 vs 861) | 35 (9 vs 26) | 25 (6 vs 19) | 0 | - | - |
C- vs C+ | 1999 (483 vs 1516) | 93 (36 vs 57) | 57 (12 vs 45) | 2 (1 vs 1) | 4 (1 vs 3) | 0 |
D- vs D+ | 2208 (1159 vs 1049) | 89 (59 vs 30) | 58 (24 vs 34) | 0 | 5 (1 vs 4) | 0 |
C+/+ vs C+ | 3470 (743 vs 2727) | 184 (35 vs 149) | 3298 (2270 vs 1028) | 513 (355 vs 158) | - | - |
C+/- vs C- | 3811 (961 vs 2850) | 185 (23 vs 162) | 3268 (2331 vs 937) | 459 (317 vs 142) | - | - |
Next, DMCs were functionally annotated based on sequence context, genic location, and co-localization with 15 chromatin signatures, and enrichment patterns were compared by contrast (Fig. 2). For the chromatin state annotations, we used ENCODE data from neonatal liver as the first step, as P0 is currently the latest developmental stage with all 8 chromatin marks available in the ENCODE database (32).
Loss of Kdm5c in C-mutant males was associated with lower methylation of CGI shores (Fig. 2A), gene promoters (Fig. 2B) and regions with mostly active chromatin marks (Fig. 2C). Chromatin state annotation of DMCs using the ChromHMM data for neonatal liver (32) showed that regions with lower methylation in C- mutant livers (referred to as hypomethylated DMCs from this point on) were significantly enriched at active promoters (OR = 3.90, p = 2.83x10− 6), flanking promoter regions (OR = 6.42, p = 3.14x10− 6), strong enhancer regions (OR = 14.38, p = 1.24x10− 20) and regions of transcription initiation (OR = 8.87, p = 9.50x10− 10) (Fig. 2C). All the above enriched chromatin states share a common mark, the H3K4me1. Hypomethylated DMCs were additionally enriched in CpG islands (OR = 3.39, p = 1.54x10− 5), whereas DMCs with higher methylation in mutant livers (referred to as hypermethylated) were depleted in CpG islands (OR = 0.17, p = 6.31x10− 5) but modestly enriched in CpG shores (OR = 1.57, p = 4.32x10− 4) (Fig. 2A). Both hypo- and hypermethylated DMCs were underrepresented in regions associated with strong transcription (OR = 0.47, p = 0.005; OR = 0.17, p = 7.33x10− 39, respectively). These observations are largely consistent with the earlier reported functions of KDM5C at both promoters and enhancers (36).
Genomic annotation of Kdm5c dosage-sensitive DMCs (C+/- vs C+/+) revealed no significant enrichment for specific genomic features, such as CpG islands, unlike the pronounced enrichment observed in DMCs from C- mutant males (Fig. 2A). However, hypomethylated DMCs in C+/- mutants were significantly enriched in promoter regions (OR = 2.25, p = 8.48x10− 4), although to a lesser extent than in C- mutant males (OR = 3.14, p = 6.17x10− 17). Similarly, CpG shore hypomethylation was more pronounced in male C- mutants (OR = 4.68, p = 2.98x10− 28) than in C+/- females (OR = 2.59, p = 1.06x10− 04), suggesting that CpG shores and promoters may be sensitive to Kdm5c dosage, though the effect is more pronounced with complete loss of KDM5C.
The Kdm5d mutation was associated with both lower and higher methylation at enhancer regions, and DMCs were underrepresented in CpG islands (Fig. 2A). Hypomethylated DMCs were enriched at weak (OR = 5.47, p = 4.04x10− 19) and poised enhancers (OR = 2.74, p = 6.76x10− 10). Hypermethylated DMCs were enriched at strong TSS-proximal enhancers (OR = 3.39, p = 0.006) and weak enhancers (OR = 3.26 p = 1.36x10− 5). While both hypo- and hypermethylated DMCs were enriched at weak enhancers, the magnitude of enrichment was greater for hypomethylated DMCs (Fig. 2C).
All mutant contrasts showed enrichment of hypermethylated DMCs in intergenic regions (C+/- vs C+/+ p = 3.17x10− 06; C- vs C + p = 1.47x10− 18; D- vs D + p = 1.34x10− 10) and slight enrichment of hypermethylated DMCs in polycomb-associated chromatin (H3K27me3) (C+/- vs C+/+ p = 0.02; C- vs C + p = 0.001; D- vs D + p = 0.01) (Fig. 2C).
Kdm5c and Kdm5d mutations influence DNA methylation at H3K4me1-enriched regions
Since our methylation analysis was done in adult livers, next, we used ENCODE data for adult male livers (33) to examine whether the trends observed in P0 livers were also present in adults. ENCODE data for adult liver are limited to male mice, hence, our colocalization tests were performed using DMCs from the mutant male contrasts only. We investigated whether mutant-associated DMCs tended to occur closer to specific chromatin features than expected by chance by analyzing their distances to peaks of H3K4me3 (active promoters), H3K4me1, H3K27ac (active enhancers), and H3K27me3 (polycomb-mediated repression) (Fig. 3). For each DMC, we calculated the distance to the nearest histone mark peak and compared the resulting distribution to a null distribution generated by randomly sampling CpG sites 100 times and recalculating distances.
Hypomethylated DMCs in male C- mutants were located closer to H3K4me3 peaks compared to the null distribution, with approximately 25% mapping within 1 kb of a H3K4me3 peak (Fig. 3). In contrast, hypomethylated DMCs in D- mutants showed a distribution that closely followed the null. Hypomethylated DMCs in both mutant male groups tended to reside near H3K4me1-enriched regions, with ~ 50% of C- vs C + and ~ 45% of D- vs D + DMCs located within 1 kb of a H3K4me1 peak. Both groups also showed a strong tendency for hypomethylated DMCs to be located near H3K27ac peaks, with this pattern more pronounced in the C- mutant contrast compared to the null distribution. Specifically, ~ 25% of C- hypomethylated DMCs were located within 1 kb of an H3K27ac peak, compared to ~ 15% in the D- mutant contrast. Hypermethylated DMCs in both groups were more frequently located near regions marked by the repressive histone modification H3K27me3.
KDM5C and KDM5D are not major contributors to sex-chromosome complement dependent DNA methylation in mouse liver
To test our hypothesis that KDM5C or KDM5D contributed to sex-chromosome complement-dependent DNA methylation in adult mouse liver, we identified DMRs in the same six contrasts, as described above (Table 1 and Supplementary Tables 1 and 2). As seen with DMCs, the C + vs D + contrast detected quite a few DMRs that reflected genetic variation among ICR mice (Supplementary Tables 1 and 2). To minimize the effects of genetic background, DMRs that overlapped with those from the C + vs D + contrast were excluded from further analyses. The numbers of DMRs identified in each contrast are listed in Table 1, and all DMRs are provided in Supplementary Table 2. Similar to DMCs, both hypo- and hypermethylated DMRs were detected in contrasts and on all chromosomes (Supplementary Fig. 1). For both sex-biased contrasts (C+/+ vs C + and C+/- vs C-), DMRs were enriched on the X chromosome compared to the autosomes (Supplementary Fig. 1C-E). In both contrasts, approximately 70% of the X chromosomal DMRs were hypermethylated in the male groups (C + and C-) (Table 1, Supplementary Fig. 1E). The lower-than-expected number of DMRs hypermethylated in females likely reflects the depletion of X-linked CGIs following stringent filtering, as described in Methods.
We next asked whether Kdm5c- and Kdm5d-sensitive DMRs overlapped with sex-biased DMRs from wild type mice (C+/+ vs C+) or those identified in our previous study, which used mice with different combinations of sex-chromosome complement and gonadal sex that permitted separating the effects of the sex chromosomes from those of gonadal sex hormone pathways (3). The contrast between XX females (XX.F) and sex-reversed XY females (XY.F) detected DMRs that were sensitive to the sex-chromosome complement, whereas the DMRs detected in the contrast between females with monosomy X (XO.F) and XX littermates (XXPaf.F) were presumably sensitive to X-dosage. If KDM5C or KDM5D were responsible for the effects of X-dosage or the Y chromosome on DNA methylation, respectively, we would expect to see overlaps between mutation-sensitive DMRs and those associated with either X-dosage or the presence of the Y chromosome. Moreover, some of the wild type sex-biased DMRs would be lost and new DMRs would appear in the C+/- vs C- contrast.
DMRs from different contrasts were intersected and considered overlapping if they shared at least one nucleotide (Table 2). A total of 35 DMRs were identified as sensitive to Kdm5c dosage in the C+/- vs C+/+ contrast, all of which were located on the autosomes (Table 1 and Supplementary Fig. 1). Contrary to expectations, no overlaps were found between autosomal Kdm5c-dosage sensitive and X-dosage dependent DMRs (C+/- vs C+/+ females and XXPaf.F vs XO.F or XX.F vs XY.F) (Table 2). Similarly, only one overlap was observed between Kdm5c dosage-sensitive DMRs and sex-biased DMRs identified by comparing wild-type male and female littermates. These findings suggest that Kdm5c dosage is unlikely to be the primary driver of sex chromosome complement-dependent DNA methylation in the adult mouse liver.
Table 2
Overlaps between autosomal DMRs from mutant and sex-biased and sex-chromosome complement associated contrasts.
Contrast (number of autosomal DMRs) | C+/- vs C+/+ (35) | C- vs C+ (93) | D- vs D+ (89) |
|---|
C+/+ vs C+ (184) | 1 | 7 | 8 |
*XX.F vs XY.F (802) | 0 | 1 | 0 |
*XX.F vs XO.F (649) | 0 | 1 | 1 |
| * Data from Zhuang et al.2020 (3) |
In the C- vs C + and D- vs D + contrasts, 7 and 8 autosomal DMRs, respectively, overlapped with sex-biased DMRs from the same genetic background (C+/+ vs C+). However, there was minimal overlap between DMRs identified from the C- and D- mutant contrasts and those associated with sex-chromosome complement in mice with a C57BL/6J or mixed C57BL/6J x C3H genetic background (XXPaf.F vs XO.F and XX.F vs XY.F) (3) (Table 2).
Mutations in several genes that contribute to sex-biased regulation of transcription lead not only to loss of sex bias in gene expression or methylation but also gain of sex bias in genes or regions that were not biased in wild type samples (16, 37–40). Comparison of mutant (C+/- vs C-) and wild type (C+/+ vs C+) contrasts identified several regions with loss or gain of sex bias in mutants (Supplementary Fig. 3A and B). Methylation changes in the C- mutants resulted in a gain of sex-bias in the mutants which was not present in the wild-type contrast (Supplementary Fig. 3B), whereas at certain sex-associated DMRs the C-mutation influenced the magnitude of methylation differences between females and males (Supplementary Fig. 3C).
In general, the C- and D-mutations affected DNA methylation at different loci with only two DMRs that overlapped between the mutant male contrasts (Supplementary Fig. 4A). Certain DMRs that were hypomethylated in D- mutants had similar methylation levels in wild type and in C- mutant males (Supplementary Fig. 4B), while other DMRs were affected by the C-mutation, but not the D-mutation (Supplementary Figs. 3B and 5).
DNA methylation signature of Kdm5c mutations
It has been suggested that loss of Kdm5c was associated with failure to repress germ line genes in the somatic cells of mutation carriers (34, 41). To explore whether the Kdm5c mutation that we used had a similar impact in adult liver, we examined our dataset for DMRs that overlapped with regions that were previously reported as hypomethylated in epiblast-like Kdm5c-knock-out cells (EpiLCs) (34). Since the EpiLC dataset included 1kb-regions surrounding transcription start sites (TSS) (34), whereas our dataset was depleted for CGIs that tend to reside in promoters, we used relaxed criteria and allowed a maximum distance of 3 kb between DMRs from the two datasets. We identified 12 DMRs from the C- vs C + contrast that were within less than 3 kb from those reported in EpiLCs.
These DMRs were located near promoters of the following genes: Gm37912, D1Pas1, Nr6a1os, Gm43697, Gm44020, Rps4l, Zfp469, Rad51c, Tex14, Rnf113a2, D430020J02Rik, Nsd1, Gm6723, Usp16, and Zfp532 (Fig. 4, Supplementary Figs. 3B and 5). All DMRs had the same direction of methylation change in both datasets, except for Gm43697, which was hypomethylated in EpiLCs but hypermethylated in adult liver. It is worth noting that several DMRs were associated with pairs of genes expressed in opposite orientations. Several of these genes are expressed predominantly in germ cells, but some are also highly expressed in adult liver (data from (3, 10), GSE248074) and not sensitive to X-chromosome dosage (Nsd1, Usp16, Rps4l, etc.) (Supplementary Fig. 6). Hence, our data from adult liver are consistent with the data from EpiLCs. This suggests that hypomethylation of these regions is a hallmark of a loss-of-function mutations in Kdm5c.