Data Acquisition
Publicly available gene-expression datasets related to osteoarthritis (OA) were systematically screened in the Gene Expression Omnibus (GEO) database. To ensure analytical consistency and biological relevance, datasets were included according to the following criteria: (1) samples must contain human osteoarthritic synovial tissue or cartilage, enabling direct comparison between OA and non-OA conditions; (2) datasets must include a sufficient sample size (≥ 20 total samples) to ensure statistical power; (3) transcriptomic profiles must have been generated on commonly used microarray platforms (GPL570 or GPL6244), allowing dataset integration and cross-validation; and (4) datasets must contain both OA patients and healthy controls, ensuring appropriate group comparisons. mRNA expression profiles of OA patients were obtained from three GEO datasets: GSE12021, GSE143514, and GSE206848. All three datasets were derived from the synovial membranes of OA patients. GSE143514 and GSE206848 were merged using the ComBat algorithm in the sva package(16), and the integrated dataset was used as the training cohort for variable screening and model training. GSE12021 was used as an external validation dataset for the model. All datasets were log2-transformed.
DEG identification and functional enrichment analysis
Differential expression analysis was carried out using the R package "limma"(17). Differentially expressed genes (DEGs) were screened at a threshold of P < 0.05 to avoid omitting candidate genes, and the ability of the DEGs to separate the sample groups was assessed using principal component analysis (PCA). Functional enrichment analysis of the DEGs, covering GO, KEGG, and DO terms, was then performed using the R package "clusterProfiler"(18), and pathways with FDR < 0.05 were considered significant. In addition, gene set enrichment analysis (GSEA)(19) was performed. Gene sets were considered significantly enriched if both P < 0.05 and FDR < 0.05 were satisfied.
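The FDR < 0.05 cut-off above can be illustrated with a short Python sketch. It assumes the FDR values are Benjamini-Hochberg adjusted p-values, which the text does not state explicitly; the pathway names and p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical enrichment p-values (illustration only)
pathway_p = {
    "pathway_A": 0.001,
    "pathway_B": 0.012,
    "pathway_C": 0.030,
    "pathway_D": 0.200,
}
names = list(pathway_p)
fdr = benjamini_hochberg([pathway_p[n] for n in names])

# Keep pathways that pass the FDR < 0.05 threshold
significant = [n for n, q in zip(names, fdr) if q < 0.05]
print(significant)
```

In the actual R workflow this adjustment is presumably handled inside the enrichment functions; the sketch only shows what the threshold means for a list of raw p-values.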
Robust prediction models built by machine learning methods
The R packages glmnet and e1071 were used to build the machine learning models. First, to filter the key DEGs, LASSO regression (nfold = 5, type.measure = "class") and SVM (number = 20) were applied to the entire dataset. The crossover genes, that is, the genes selected by both methods, were regarded as the key DEGs associated with OA and were used to build and train the prediction models. Classifier performance was assessed using ROC curves and confusion matrices. Ultimately, the 13 crossover genes were taken as the best classifiers for constructing the OA prediction model, which was then applied to the external validation cohort to evaluate its predictive capability.
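As an illustration of the two evaluation measures named above, the following Python sketch computes a confusion matrix and the area under the ROC curve for a binary classifier. The labels, scores, and the 0.5 decision threshold are hypothetical; the study itself used R, and this is not its code.

```python
def confusion_matrix(y_true, y_pred):
    """Count outcomes for a binary problem where 1 = OA and 0 = control."""
    pairs = list(zip(y_true, y_pred))
    return {
        "TP": sum(1 for t, p in pairs if t == 1 and p == 1),
        "TN": sum(1 for t, p in pairs if t == 0 and p == 0),
        "FP": sum(1 for t, p in pairs if t == 0 and p == 1),
        "FN": sum(1 for t, p in pairs if t == 1 and p == 0),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted OA probabilities (illustration only)
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

cm = confusion_matrix(labels, predicted)
auc = roc_auc(labels, scores)
sensitivity = cm["TP"] / (cm["TP"] + cm["FN"])
specificity = cm["TN"] / (cm["TN"] + cm["FP"])
print(cm, auc, sensitivity, specificity)
```

The pairwise definition of AUC is used here because it needs no plotting or threshold sweep; it gives the same value as integrating the empirical ROC curve.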
Determination of immune cell patterns in the microenvironment
GEO samples were analyzed for immune cell components and immune pathways using ssGSEA(20), and the corresponding gene sets were obtained from a previous study(21). We then evaluated the association of these immune cells and pathways with the expression of the key diagnostic genes.
Cell culture and OA model induction in vitro
Primary human synoviocytes were obtained from ScienCell (Carlsbad, CA, USA) and cultured in DMEM/F12 supplemented with 10% fetal bovine serum and 1% penicillin–streptomycin at 37°C in a 5% CO₂ incubator. For experiments, cells were seeded at 5 × 10⁵ cells per well in 6-well plates for protein and RNA extraction and at 2 × 10⁵ cells per well in 12-well plates for transfection assays. To establish an OA-like inflammatory environment, cells were treated with 10 ng/mL recombinant human IL-1β (PeproTech, Cat. No. 200-01B) for 24 hours prior to downstream assays.
Gene knockdown and overexpression
Three small interfering RNAs (siRNAs) targeting human DLX2 and a scrambled negative control siRNA were synthesized by RiboBio (Guangzhou, China). The sequences are listed below.
| Name | Sequence |
|---|---|
| siDLX2-1 | 5′-GCCTGAAATTCGGATAGTGAA-3′ |
| siDLX2-2 | 5′-CGCACCATCTACTCCAGTTTC-3′ |
| siDLX2-3 | 5′-TGATATGCACTCGACCCAGAT-3′ |
Transfection was performed using Lipofectamine 3000 (Thermo Fisher Scientific, Cat. No. L3000001) according to the manufacturer’s protocol. For siRNA knockdown, cells were transfected with 50 nM siRNA in 12-well plates (2 × 10⁵ cells/well). For plasmid overexpression, the full-length SCN4B coding sequence used for cloning is provided in Supplementary Information SI1, and the complete plasmid map of the OE-SCN4B construct is shown in Supplementary Figure S1 (SI2).
The SCN4B CDS was inserted into the pcDNA3.1(+) vector (Invitrogen, Cat. No. V79020), and 1 µg plasmid DNA per well (6-well plate) was transfected. Knockdown/overexpression efficiency was validated by RT-qPCR 48 h post-transfection. All transfection experiments were performed in biological triplicate (n = 3).
Quantitative real-time PCR (RT-qPCR)
Total RNA was extracted from cells or tissues using TRIzol Reagent (Invitrogen, Cat. No. 15596026), and reverse-transcribed using the PrimeScript RT reagent kit (Takara, Cat. No. RR037A). RT-qPCR was performed using SYBR Green Master Mix (Yeasen, Hieff™ qPCR SYBR Green Master Mix, Cat. No. 11184ES) on a QuantStudio 5 Real-Time PCR System (Applied Biosystems; Cat. No. A28574/A28568). Each reaction contained 10 ng cDNA in a 20 µL system, and all samples were run in technical triplicate.
The detailed PCR cycling conditions, reaction composition, and melting curve analysis are provided in Supplementary Information SI3. All qPCR primer sets generated a single, sharp melting-curve peak, confirming amplification specificity (Supplementary Fig. S2). Amplification efficiencies for all genes, calculated using a 5-point 10-fold serial dilution, ranged from 97.8% to 102.0%, with R² values ≥ 0.996, as summarized in Supplementary Table S1.
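For readers unfamiliar with how amplification efficiency is derived from a dilution series, the following Python sketch fits a standard curve (Ct against log10 of relative template input) and converts the slope to an efficiency with the standard relation E = 10^(−1/slope) − 1. The Ct values are hypothetical and chosen only for illustration; the measured values are those in Supplementary Table S1.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit; returns slope, intercept and R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def amplification_efficiency(slope):
    """Percent efficiency from the standard-curve slope."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


# 5-point, 10-fold serial dilution expressed as log10 of relative input
log10_input = [0.0, -1.0, -2.0, -3.0, -4.0]
# Hypothetical mean Ct values for each dilution (illustration only)
ct_values = [15.0, 18.3, 21.7, 25.0, 28.3]

slope, intercept, r_squared = linear_fit(log10_input, ct_values)
efficiency = amplification_efficiency(slope)
print(slope, r_squared, efficiency)
```

A slope near −3.32 corresponds to a doubling of product per cycle, that is, an efficiency close to 100%.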
Relative mRNA expression levels were calculated using the 2^(−ΔΔCt) method, with GAPDH serving as the internal reference gene. Primer sequences used for qPCR are listed below.
| Name | Forward primer (F) | Reverse primer (R) |
|---|---|---|
| CES3 | 5′-CTGAGATGGTGCAGTGCCTTCA-3′ | 5′-GGAAGACAGTGCCATCAACGGT-3′ |
| SCN4B | 5′-GACCAGGAGGAGGAGGACGA-3′ | 5′-CTGCTGCTGCTGCTGCTGC-3′ |
| GNC8 | 5′-CCAGCAGCTGACCAGCAGGAG-3′ | 5′-GCTGCTGCTGCTGCTGCTGC-3′ |
| DLX2 | 5′-TACTCCGCCAAGAGCAGCTATG-3′ | 5′-CGAATTTCAGGCTCAAGGTCCTC-3′ |
| SIRPB2 | 5′-GGATGAAGGCACCTCAGTGCTT-3′ | 5′-GTCTCCAAGCACTGTGCAGTTC-3′ |
| EPHX1 | 5′-GTTTTCCACCTGGACCAATACGG-3′ | 5′-TGGTGCCTGTTGTCCAGTAGAG-3′ |
| DPEP3 | 5′-AGGAGCAGGAGGAGCAGGAG-3′ | 5′-CTGCTGCTGCTGCTGCTGC-3′ |
| ABCA3 | 5′-ACGAGGAGGAGGAGGAGGAG-3′ | 5′-GTCTCCAAGCACTGTGCAGTTC-3′ |
| VPREB3 | 5′-ACCATCAGGGACTACGGTGTGT-3′ | 5′-CTCATCCTTGGCTGCCGAGAAT-3′ |
| MMP13 | 5′-CCTTGATGCCATTACCAGTCTCC-3′ | 5′-AAACAGCTCCGCATCAACCTGC-3′ |
| ADAMTS5 | 5′-CCTGGTCCAAATGCACTTCAGC-3′ | 5′-TCGTAGGTCTGTCCTGGGAGTT-3′ |
| COL2A1 | 5′-CCTGGCAAAGATGGTGAGACAG-3′ | 5′-CCTGGTTTTCCACCTTCACCTG-3′ |
| SOX9 | 5′-AGGAAGCTCGCGGACCAGTAC-3′ | 5′-GGTGGTCCTTCTTGTGCTGCAC-3′ |
| GAPDH | 5′-AAGGTCGGAGTCAACGGATTTG-3′ | 5′-TGACAAAGTGGTCGTTGAGTCA-3′ |
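The relative quantification described above (ΔΔCt with GAPDH as the reference gene) can be written out as a short Python sketch. All Ct values below are hypothetical technical triplicates used purely for illustration, and the function and variable names are not from the study.

```python
def mean(values):
    """Arithmetic mean of technical replicates."""
    return sum(values) / len(values)


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene in treated versus control samples,
    normalised to the reference gene (here GAPDH)."""
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical technical triplicates (illustration only)
target_treated = mean([24.1, 24.0, 23.9])
gapdh_treated = mean([18.0, 18.1, 17.9])
target_control = mean([26.0, 26.1, 25.9])
gapdh_control = mean([18.0, 18.0, 18.0])

fold_change = relative_expression(
    target_treated, gapdh_treated, target_control, gapdh_control
)
print(round(fold_change, 3))
```

The calculation assumes that target and reference amplify with near-100% efficiency, which is consistent with the 97.8% to 102.0% range reported for the primer sets.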
ELISA assay for cytokines
Cell culture supernatants and serum samples were collected and analyzed for IL-6, TNF-α, and IL-10 levels using commercial ELISA kits (MultiSciences, China; IL-6: Cat. No. EK106; TNF-α: Cat. No. EK182; IL-10: Cat. No. EK110) according to the manufacturer’s instructions. Absorbance was measured at 450 nm using a microplate reader (BioTek, ELx800).
Ethical Considerations
All animal experiments were reviewed and approved by the Animal Ethics Committee of Nantong University (Approval No. P20230224-025) and were conducted in accordance with institutional guidelines and the ARRIVE guidelines.
Animal model of OA and treatment
The study was conducted in accordance with the ARRIVE guidelines. All animal experiments were approved by the Experimental Animal Center of Nantong University and the Animal Ethics Committee of Nantong University (Approval No. P20230224-025). Male C57BL/6 mice (8 weeks old) were obtained from the Experimental Animal Center of Nantong University. Mice were randomly divided into four groups (n = 6 per group): sham, OA, OA + siDLX2-2, and OA + OE-SCN4B. OA was induced via anterior cruciate ligament transection (ACLT) surgery on the right knee. Anesthesia was induced by intraperitoneal injection of ketamine (100 mg/kg) and xylazine (10 mg/kg) prior to surgery. siRNA (5 nmol) or plasmid DNA (20 µg) was delivered intra-articularly once a week for four weeks, starting one week after surgery. At 6 weeks post-surgery, all mice were euthanized by carbon dioxide inhalation under deep anesthesia. All procedures involving animals were performed in strict accordance with institutional guidelines and ethical standards.
Histological evaluation
Knee joints were fixed in 4% paraformaldehyde, decalcified, and embedded in paraffin. Sections (5 µm) were stained with Safranin O/Fast Green to assess cartilage integrity. The severity of cartilage damage was graded using the Osteoarthritis Research Society International (OARSI) scoring system. Synovitis was scored based on synovial lining hyperplasia, inflammatory cell infiltration, and subintimal fibrosis, as previously described.
Patient and Public Involvement
It was not appropriate or possible to involve patients or the public in the design, conduct, reporting, or dissemination of this laboratory-based study.
Data Availability Statement
The datasets analyzed during the current study are publicly available in the Gene Expression Omnibus (GEO) database under the accession numbers GSE12021, GSE143514, and GSE206848. All data are freely accessible at https://www.ncbi.nlm.nih.gov/geo/.
Statistical Analysis
R software (version 4.2.2) was used for all statistical analyses. Associations between gene expression and immune cells or pathways were assessed using Spearman correlation. The Wilcoxon test was used to compare differences between two groups. A p-value less than 0.05 was considered statistically significant.
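To make the correlation step concrete, here is a minimal Python sketch of Spearman correlation, computed as the Pearson correlation of average ranks so that tied values are handled. The expression values and immune scores are hypothetical; the study's own analysis was run in R.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical gene expression and ssGSEA immune scores (illustration only)
gene_expression = [5.1, 6.3, 7.0, 8.2, 9.5]
immune_score = [0.10, 0.25, 0.20, 0.40, 0.55]

rho = spearman(gene_expression, immune_score)
print(round(rho, 3))
```

Because it operates on ranks, the coefficient is insensitive to the scale of either variable and to monotone transformations such as log2.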