The internal dataset comprised 220 patients: 127 with mutated KRAS (mean age, 65.22 years ± 12.79) and 93 with wild-type KRAS (mean age, 61.96 years ± 10.76). The external dataset comprised 61 patients: 42 with mutated KRAS (mean age, 62.93 years ± 14.16) and 19 with wild-type KRAS (mean age, 60.21 years ± 11.96). Except for extramural vascular invasion (EMVI) (P = 0.018), clinical and pathological characteristics, including age, sex, CEA and CA199 levels, T stage, N stage, histological grade, and perineural invasion, did not differ significantly across the training, internal validation, and external validation datasets (P = 0.106–0.797). In addition, none of these clinicopathological characteristics differed significantly between the mutated and wild-type KRAS groups in any dataset (P = 0.055–1.000) (Table 2).
Table 2
Demographic and Clinical Characteristics of Patients with Rectal Cancer in the Internal Training, Internal Validation, and External Validation Cohorts
Characteristic | Training dataset | Internal validation | External validation | P value |
|---|---|---|---|---|
Mutation status | Mutated (n = 78), wild-type (n = 74) | Mutated (n = 49), wild-type (n = 19) | Mutated (n = 42), wild-type (n = 19) | |
Age, mean ± SD, years | 64.01 ± 11.34 | 64.21 ± 12.94 | 62.41 ± 13.44 | 0.640 |
Gender (%) | | | | 0.530 |
Male | 103 (67.8%) | 44 (64.7%) | 45 (73.8%) | |
Female | 49 (32.2%) | 24 (35.3%) | 16 (26.2%) | |
Tumor differentiation (%) | | | | 0.618 |
Moderate | 109 (71.7%) | 51 (75.0%) | 41 (67.2%) | |
Poor | 43 (28.3%) | 17 (25.0%) | 20 (32.8%) | |
CEA (%) | | | | 0.106 |
normal ≤ 5 | 94 (61.8%) | 48 (70.6%) | 32 (52.5%) | |
abnormal > 5 | 58 (38.2%) | 20 (29.4%) | 29 (47.5%) | |
CA199 (%) | | | | 0.771 |
normal ≤ 20 | 122 (80.3%) | 52 (76.5%) | 47 (77.0%) | |
abnormal > 20 | 30 (19.7%) | 16 (23.5%) | 14 (23.0%) | |
pT stage (%) | | | | 0.510 |
T1 | 9 (5.9%) | 6 (8.8%) | 0 (0%) | |
T2 | 46 (30.3%) | 19 (27.9%) | 3 (4.9%) | |
T3 | 97 (63.8%) | 43 (63.2%) | 58 (95.1%) | |
pN stage (%) | | | | 0.797 |
N0 | 91 (59.9%) | 44 (64.7%) | 37 (60.7%) | |
N1 | 52 (34.2%) | 18 (26.5%) | 20 (32.8%) | |
N2 | 9 (5.9%) | 6 (8.8%) | 4 (6.6%) | |
pEMVI (%) | | | | 0.018* |
| | 86 (56.6%) | 40 (58.8%) | 47 (77.0%) | |
| | 66 (43.4%) | 28 (41.2%) | 14 (23.0%) | |
pNeural invasion (%) | | | | 0.751 |
Negative | 88 (57.9%) | 38 (55.9%) | 38 (62.3%) | |
Positive | 64 (42.1%) | 30 (44.1%) | 23 (37.7%) | |
CEA: carcinoembryonic antigen; CA199: carbohydrate antigen 19-9; EMVI: extramural vascular invasion.
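The cohort comparison reported for pEMVI in Table 2 can be checked with a short calculation. The sketch below assumes a Pearson chi-square test of independence, which this section does not name, and the function names are illustrative. It uses the pEMVI counts from Table 2 as a 3 × 2 table (cohort by EMVI category) and relies on the fact that the chi-square upper-tail probability with 2 degrees of freedom is exactly exp(−x/2).

```python
import math


def chi_square_independence(table):
    """Return the Pearson chi-square statistic and degrees of freedom
    for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi2_sf_dof2(stat):
    """Upper-tail probability of a chi-square variable with exactly 2 degrees
    of freedom (closed form; not valid for other degrees of freedom)."""
    return math.exp(-stat / 2.0)


# pEMVI counts from Table 2: one row per cohort, one column per EMVI category.
emvi_counts = [
    [86, 66],  # training dataset
    [40, 28],  # internal validation
    [47, 14],  # external validation
]

stat, dof = chi_square_independence(emvi_counts)
p_value = chi2_sf_dof2(stat)
print(f"chi2 = {stat:.3f}, dof = {dof}, P = {p_value:.3f}")
```

Under this assumption the calculation gives P ≈ 0.018, in line with the value reported for pEMVI.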
After feature selection, a final subset of four radiomic features and two ADC histogram features was used for model construction. The selected radiomic features were original shape Surface Volume Ratio (SVR), original glcm Correlation (GLCM-Corr), wavelet-LHH gldm Small Dependence High Gray Level Emphasis (GLDM-SDHGLE), and wavelet-LLL glszm Zone Entropy (GLSZM-ZE). SVR is the ratio of tumor surface area to tumor volume and reflects the compactness or irregularity of tumor shape. GLCM-Corr is a second-order texture feature derived from the gray-level co-occurrence matrix that measures the linear dependency of gray-level intensities between neighboring voxels. GLDM-SDHGLE describes the emphasis of high gray-level values associated with small spatial dependencies after wavelet transformation, reflecting fine-scale intensity variations. GLSZM-ZE is the entropy of the gray-level size zone distribution and characterizes the randomness and complexity of homogeneous regions within the tumor. The two ADC histogram features were Skewness and Kurtosis. Skewness measures the asymmetry of the ADC value distribution; higher (more positive) skewness indicates a right-skewed distribution, in which most voxels are concentrated at low ADC values (low diffusivity) and a longer tail extends toward high ADC values. Kurtosis quantifies the peakedness of the distribution, with lower values suggesting broader dispersion of ADC values and greater variation in tissue diffusion characteristics. The differences in each feature between the two genotypic groups are shown in Fig. 3.
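As a rough illustration of the two ADC histogram features, the sketch below computes skewness and kurtosis from a list of voxel values. It assumes population (biased) central moments and non-excess kurtosis (a normal distribution gives 3); the source does not state which convention or software was used, and both the function name and the ADC values are hypothetical.

```python
def histogram_skewness_kurtosis(values):
    """Return (skewness, kurtosis) of a sample using population central moments.

    Kurtosis is the non-excess form m4 / m2**2, so a normal distribution
    gives a value of about 3.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return skewness, kurtosis


# Hypothetical ADC values (x10^-3 mm^2/s) for one tumor: most voxels have low
# diffusivity and a single voxel lies far out in the high-ADC tail.
adc_values = [0.8, 0.9, 0.9, 1.0, 1.0, 1.1, 1.8]

skew, kurt = histogram_skewness_kurtosis(adc_values)
print(f"skewness = {skew:.3f}, kurtosis = {kurt:.3f}")
```

For these values the skewness is positive, which corresponds to the right-skewed case described above: the bulk of the voxels sit at low ADC and the tail extends toward high ADC.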
The combined model (ADC histogram and radiomic features) for predicting KRAS mutation achieved an AUC of 0.823, an accuracy of 0.765, a sensitivity of 0.737, and a specificity of 0.776 in the internal test set. In the external test set, it yielded an AUC of 0.759, an accuracy of 0.645, a sensitivity of 0.850, and a specificity of 0.548. In both test sets, the combined model achieved the highest AUC and accuracy among the three models, although the radiomics-only model had higher specificity in both test sets and the ADC histogram-only model had higher sensitivity in the internal test set. Detailed results are shown in Table 3.
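For readers who want to see how accuracy, sensitivity, and specificity relate to one another, the sketch below derives them from confusion-matrix counts. The counts are hypothetical: the source does not report a confusion matrix, and these values were chosen only because they reproduce the rounded internal test figures, so they should not be read as the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical counts (an assumption, not taken from the source).
acc, sens, spec = classification_metrics(tp=14, fn=5, tn=38, fp=11)
print(f"accuracy = {acc:.3f}, sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, a model can rank first on accuracy without ranking first on either of the other two metrics.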
Table 3
Performance Comparison of KRAS Mutation Classification
Model | Accuracy | Sensitivity | Specificity | AUC |
|---|---|---|---|---|
Internal test set | | | | |
Combined | 0.765 [0.662, 0.853] | 0.737 [0.500, 0.933] | 0.776 [0.644, 0.889] | 0.823 [0.701, 0.931] |
Radiomics | 0.750 [0.588, 0.809] | 0.579 [0.429, 0.850] | 0.816 [0.592, 0.837] | 0.751 [0.623, 0.873] |
ADC histogram | 0.529 [0.412, 0.647] | 0.947 [0.833, 1.000] | 0.367 [0.235, 0.500] | 0.702 [0.571, 0.819] |
External test set | | | | |
Combined | 0.645 [0.516, 0.758] | 0.850 [0.609, 0.960] | 0.548 [0.419, 0.711] | 0.759 [0.625, 0.870] |
Radiomics | 0.613 [0.452, 0.694] | 0.500 [0.280, 0.708] | 0.667 [0.475, 0.766] | 0.668* [0.514, 0.803] |
ADC histogram | 0.323 [0.323, 0.565] | 0.750 [0.286, 0.714] | 0.119 [0.278, 0.575] | 0.464* [0.298, 0.626] |
KRAS: Kirsten rat sarcoma viral oncogene homolog; ADC: apparent diffusion coefficient; AUC: area under the ROC curve. AUCs were compared using the DeLong test; an asterisk (*) indicates a statistically significant difference (P < 0.05). Values in square brackets are 95% confidence intervals obtained via bootstrap resampling (1000 iterations).
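The bootstrap confidence intervals mentioned in the Table 3 footnote can be sketched as follows. This is a minimal percentile-bootstrap example, assuming case-level resampling with replacement; the source states only that 1000 bootstrap iterations were used, not the interval method, and the labels, scores, and function names here are hypothetical.

```python
import random


def auc_score(labels, scores):
    """AUC as the probability that a positive case scores above a negative one
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_iter=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_iter:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # keep only resamples containing both classes
            stats.append(auc_score(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_index = max(0, round((alpha / 2) * n_iter))
    hi_index = min(n_iter - 1, round((1 - alpha / 2) * n_iter) - 1)
    return stats[lo_index], stats[hi_index]


# Hypothetical labels (1 = positive class) and predicted probabilities.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.5, 0.3, 0.35, 0.2, 0.1]

point = auc_score(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.3f} [{lower:.3f}, {upper:.3f}]")
```

With small test sets such intervals are wide, which is consistent with the broad intervals shown in Table 3.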
The ROC curves of the three models are compared in Fig. 4. In the internal test set, the DeLong test showed no statistically significant difference in AUC between the combined model and either the radiomics-only model (0.823 vs. 0.751, P = 0.069) or the ADC histogram-only model (0.823 vs. 0.702, P = 0.103). In the external test set, however, the AUC of the combined model was significantly higher than that of the radiomics-only model (0.759 vs. 0.668, P = 0.022) and the ADC histogram-only model (0.759 vs. 0.464, P = 0.003).
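The DeLong comparison of two correlated AUCs can be sketched in a few lines. The version below is a plain, unoptimized rendering of the standard DeLong variance estimate for two models scored on the same cases, with a two-sided normal P value; it is not the authors' code, and the labels, scores, and function names are hypothetical.

```python
import math


def _placements(pos, neg):
    """Per-case components: V10 for each positive case, V01 for each negative case."""
    def psi(x, y):
        return 1.0 if x > y else 0.5 if x == y else 0.0
    v10 = [sum(psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01


def _cov(a, b):
    """Sample covariance (n - 1 denominator) of two equally long lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(labels, scores_a, scores_b):
    """Return (auc_a, auc_b, z, two-sided p) for two models on the same cases."""
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    m, n = len(pos_idx), len(neg_idx)
    v10_a, v01_a = _placements([scores_a[i] for i in pos_idx],
                               [scores_a[i] for i in neg_idx])
    v10_b, v01_b = _placements([scores_b[i] for i in pos_idx],
                               [scores_b[i] for i in neg_idx])
    auc_a, auc_b = sum(v10_a) / m, sum(v10_b) / m
    var_a = _cov(v10_a, v10_a) / m + _cov(v01_a, v01_a) / n
    var_b = _cov(v10_b, v10_b) / m + _cov(v01_b, v01_b) / n
    cov_ab = _cov(v10_a, v10_b) / m + _cov(v01_a, v01_b) / n
    var_diff = var_a + var_b - 2.0 * cov_ab
    if var_diff <= 0.0:
        raise ValueError("variance of the AUC difference is not positive")
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return auc_a, auc_b, z, p


# Hypothetical example: two models scored on the same eight cases.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]
model_b = [0.9, 0.3, 0.6, 0.2, 0.5, 0.4, 0.7, 0.1]

auc_a, auc_b, z, p = delong_test(labels, model_a, model_b)
print(f"AUC A = {auc_a:.4f}, AUC B = {auc_b:.4f}, z = {z:.3f}, P = {p:.3f}")
```

The covariance term matters because both models are evaluated on the same patients; ignoring it would treat the two AUCs as independent and misstate the variance of their difference.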