Abstract
PURPOSE
Neoadjuvant chemoradiotherapy (CRT) is known to increase sphincter preservation rates and decrease the risk of postoperative recurrence in patients with locally advanced rectal tumors. However, the response to CRT in patients with locally advanced rectal cancer (LARC) varies significantly. The objective of this study was to compare the performance of models based on radiomics features of the tumor alone, the mesorectum alone, and a combination of both in predicting tumor response to neoadjuvant CRT in LARC.
METHODS
This retrospective study included 101 patients with LARC. Patients were categorized as responders (modified Ryan score 0–1) and non-responders (modified Ryan score 2–3). Pre-CRT magnetic resonance imaging evaluations included tumor-T2 weighted imaging (T2WI), tumor-diffusion weighted imaging (DWI), tumor-apparent diffusion coefficient (ADC) maps, and mesorectum-T2WI. The first radiologist segmented the tumor and mesorectum from T2-weighted images, and the second radiologist performed tumor segmentation using DWI and ADC maps. Feature reproducibility was assessed by calculating the intraclass correlation coefficient (ICC) using a two-way mixed-effects model with absolute agreement for single measurements [ICC(3,1)]. Radiomic features with ICC values <0.60 were excluded from further analysis. Subsequently, the least absolute shrinkage and selection operator method was applied to select the most relevant radiomic features. The top five features with the highest coefficients were selected for model training. To address class imbalance between groups, the synthetic minority over-sampling technique was applied exclusively to the training folds during cross-validation. Thereafter, classification learner models were developed using 10-fold cross-validation to achieve the highest performance. The performance metrics of the final models, including accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC), were calculated to evaluate the classification performance.
RESULTS
Among the 101 patients, 36 were classified as responders and 65 as non-responders. A total of 25 radiomic features from the tumor and 20 from the mesorectum were found to be statistically significant (P < 0.05). The AUC values for predicting treatment response were 0.781 for the tumor-only model (random forest), 0.726 for the mesorectum-only model (logistic regression), and 0.837 for the combined model (logistic regression).
CONCLUSION
Radiomic features derived from both the tumor and mesorectum demonstrated complementary prognostic value in predicting treatment response. The inclusion of mesorectal features substantially improved model performance, with the combined model achieving the highest AUC value. These findings highlight the added predictive contribution of the mesorectum as a key peritumoral structure in radiomics-based assessment.
CLINICAL SIGNIFICANCE
Currently, the response of locally advanced rectal tumors to neoadjuvant therapy cannot be reliably predicted using conventional methods. Recently, the significance of the mesorectum in predicting treatment response has gained attention, although the number of studies focusing on this area remains limited. In our study, we performed radiomics analyses of both the tumor tissue and the mesorectum to predict neoadjuvant treatment response.
Main points
• In this study, we developed machine-learning models to predict tumor response to neoadjuvant therapy using radiomics analysis of both the tumor and mesorectum. The area under the receiver operating characteristic curve values were 0.781 for the tumor-only model, 0.726 for the mesorectum-only model, and 0.837 for the combined tumor and mesorectum model.
• Molecular alterations in peritumoral adipocytes may induce subtle magnetic resonance imaging signal changes that are not visually apparent, highlighting the value of radiomics in quantitatively capturing these hidden imaging features.
• Radiomic-based assessment of the mesorectum underscores its added prognostic value in evaluating neoadjuvant treatment response, providing complementary insights beyond tumor-derived radiomic signatures.
The standard imaging modality for locally advanced rectal cancer (LARC) is magnetic resonance imaging (MRI), which is used to assess rectal wall invasion (T stage), locoregional lymph nodes, macroscopic tumor invasion into the mesorectum, mesorectal fascia involvement, and extramural vascular invasion.1, 2 Neoadjuvant chemoradiotherapy (CRT) plays a crucial role in the management of LARC by not only increasing sphincter preservation rates but also facilitating organ preservation through non-operative strategies, such as the watch-and-wait approach, in carefully selected patients who achieve a complete clinical response. Furthermore, CRT has been shown to reduce the risk of postoperative recurrence significantly.3, 4 However, the response of patients with LARC to neoadjuvant CRT is variable. Neoadjuvant CRT results in tumor stage regression in 50% of patients, and pathologic complete response is observed in 15%–20% of patients.5 Currently, the response of locally advanced rectal tumors to neoadjuvant therapy cannot be estimated by conventional methods. Predicting tumor response to neoadjuvant treatment at the time of diagnosis can contribute to patient-specific tailoring of radiation doses and thus increase pathologic complete response and organ preservation rates.6 Therefore, estimating the tumor’s response to neoadjuvant treatment is important for treatment management.
The influence of adipocytes on tumor pathogenesis has been intensively investigated in recent years. The molecular interaction between tumor cells and adipocytes has been associated with an increase in inflammatory markers and angiogenic factors, such as vascular endothelial growth factor (VEGF) and insulin-like growth factor 1 (IGF-1), that may locally and systemically provoke tumor growth and metastasis. The interaction between rectal cancer and mesorectal adipose tissue has been demonstrated to induce molecular alterations in adipocytes. These changes may lead to subtle MRI findings that are not readily detectable with conventional radiologic methods.7, 8 Some radiomics studies in the literature have evaluated peritumoral adipose tissue to predict clinical outcomes and prognosis. In breast tumors, evaluation of the peritumoral area has been proven to improve the differentiation between benign and malignant breast lesions.9 Likewise, in non-small cell lung cancers, peritumoral lung parenchyma may also predict recurrence after surgery.10
In this study, we performed radiomics analyses of the tumor and mesorectum to predict the response to neoadjuvant CRT; a tumor-only model, mesorectum-only model, and combined tumor-mesorectum model were constructed.
Methods
Study participants
This study was approved by the Non-Interventional Research Ethics Committee of Dokuz Eylül University Hospital (approval number: 2023/33-18, date: August/2023). Due to the study’s retrospective nature, the requirement for informed consent was waived. Details of patients with LARC who underwent neoadjuvant CRT followed by total mesorectal excision between March 2017 and May 2022 were retrieved from the hospital database. Patients who underwent rectal MRI before CRT were included in the study. Patients were excluded if their MRI examinations were acquired with different parameters, if pathologic evaluation was performed outside the hospital, if image quality was poor, or if they declined surgery. The patient accrual is summarized in Figure 1.
Image acquisition
Examinations were performed on a 1.5-T MRI machine (Philips Achieva Release 1.8, Eindhoven, The Netherlands) with a pelvic phased-array coil. Turbo spin-echo T2-weighted images (T2WI) were acquired in the sagittal, para-axial (perpendicular to the long axis of the tumor), and para-coronal (parallel to the long axis of the tumor) planes using a repetition time (TR) of 4,500 ms, a field of view (FOV) of 180–220 mm, a matrix size of 256 × 512, a slice thickness of 3 mm, an intersection gap of 0.8 mm, and an echo train length of 16. Diffusion-weighted images (b = 0 and b = 1,000 s/mm2) were acquired in the axial and sagittal planes with a single-shot echo-planar sequence using a TR/echo time of 4,200/95 ms, an FOV of 350–400 mm, a flip angle of 90°, and a slice thickness of 5 mm. Apparent diffusion coefficient (ADC) maps were generated automatically by the software. Fat suppression techniques and contrast agents were not used. Scopolamine butyl bromide (20 mg) was injected intravenously 10 minutes before scanning to reduce intestinal motility.
Protocol for neoadjuvant chemoradiotherapy
All patients received 45 gray (Gy) of pelvic radiotherapy before surgery. Subsequently, a boost of 5.4 Gy in three fractions was administered to the primary tumor. After the first and fifth weeks of radiotherapy, patients received 400 mg/m2/day fluorouracil and 20 mg/m2/day leucovorin for 3 days. Restaging MRI was performed approximately 6–8 weeks after completion of neoadjuvant CRT.
Evaluation of the pathologic response to treatment
In this study, the modified Ryan scoring system was used as the gold standard (Table 1). The modified Ryan scoring has proven to be a reliable tool for classifying tumor regression due to its high reproducibility and inter-observer agreement.11 It is based on the ratio of residual cancer cells to the fibrosis amount. In the modified Ryan scoring system, 0 points are given for complete response, and a score of 3 points indicates a poor response or no response to neoadjuvant treatment.
In the study, the patients were divided into two groups. Patients with a modified Ryan score of 0–1 were classified as responders to neoadjuvant treatment, and patients with a modified Ryan score of 2–3 were classified as non-responders to neoadjuvant treatment.
Image interpretation–texture feature extraction
Data in Digital Imaging and Communications in Medicine format were transferred to a workstation and analyzed by dedicated software (LIFEx version 7.4, Inserm, Orsay, France).
Both tumor tissue and mesorectal adipose tissue were examined in this study. Tumor tissue and mesorectum were segmented separately from T2WI. In addition, tumor tissue was segmented using diffusion-weighted imaging (DWI).
Gray-level normalization and gray-level discretization were performed to minimize the impact of differences in acquisition protocols on texture features and to generate a homogeneous dataset. For this reason, the voxel values of each lesion in three axes (x, y, z) were recorded, and the median values of these recorded data were obtained. These median values were then utilized as optimized parameters in the texture analysis of each lesion.12 The intensity range was normalized using Z-scoring [mean ± 3 standard deviation (SD)]. Image intensities were discretized into 128 fixed bins.
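The normalization and discretization steps described above can be illustrated with a minimal sketch. It assumes that intensities are Z-scored within the region of interest, clipped to mean ± 3 SD, and mapped linearly onto 128 equal-width bins; the function name `normalize_and_discretize` and the simulated intensities are hypothetical, and the exact procedure implemented in LIFEx may differ.

```python
import numpy as np


def normalize_and_discretize(voxels, n_bins=128, n_sd=3.0):
    """Z-score ROI intensities, clip to mean +/- n_sd SD, map to n_bins gray levels."""
    voxels = np.asarray(voxels, dtype=float)
    sd = voxels.std()
    if sd == 0:
        # A constant ROI has no contrast; place every voxel in the first bin.
        return np.zeros(voxels.shape, dtype=int)
    z = (voxels - voxels.mean()) / sd
    z = np.clip(z, -n_sd, n_sd)
    # Map the interval [-n_sd, n_sd] onto bin indices 0 .. n_bins - 1.
    bins = np.floor((z + n_sd) / (2 * n_sd) * n_bins).astype(int)
    return np.minimum(bins, n_bins - 1)


# Hypothetical T2WI intensities for one ROI (simulated, not study data).
rng = np.random.default_rng(0)
roi = rng.normal(loc=300.0, scale=40.0, size=1000)
levels = normalize_and_discretize(roi)
print(levels.min(), levels.max())
```

Fixing the number of bins after a relative (Z-score) rescaling keeps the gray-level range comparable across examinations whose raw MRI signal intensities are in arbitrary units.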
In the study, the MRI images obtained at the time of diagnosis (pre-treatment MRI) were evaluated. Three radiologists with 5 years (AC), 4 years (RCY), and 33 years (FB) of experience in radiology evaluated the images of 10 patients together. The first radiologist (AC) performed a three-dimensional (3D) semi-automatic segmentation of the entire tumor (Figure 2a, b) and mesorectal adipose tissue (Figure 3) from the axial T2WI without fat suppression of all patients. Mesorectum segmentation was conducted from the point of attachment of the anterior peritoneal reflection to the rectal wall in the cranial section to the intersphincteric area in the caudal section. The second radiologist (RCY) performed a 3D semi-automatic segmentation of the entire tumor using DWI images (Figure 2c, d) and ADC mapping (Figure 2e, f) in the axial plane of all patients. The total number of radiomics features obtained was 17,978.
Statistical analysis
Statistical analyses were performed using IBM SPSS Statistics version 24.0 (IBM Corp., Armonk, NY, USA). The normality of numerical variables, such as age, was assessed using the Kolmogorov–Smirnov test. Correlation analyses between radiomic features were performed using the Spearman rank correlation coefficient, as the features did not follow a normal distribution. Continuous variables were expressed as mean ± SD, and differences in mean age between groups were analyzed using the independent samples t-test. Categorical variables, including sex, distance of extramural extension, and distance to the mesorectal fascia, were compared between groups using the Pearson chi-squared test, as all expected cell frequencies were ≥5. For the comparison of pretreatment T stage, where expected cell counts were below the acceptable threshold, the Fisher–Freeman–Halton test was applied. A P value of <0.05 was considered statistically significant.
Feature selection and machine learning models
Radiomic analysis was conducted using LIFEx software to extract features from tumor and mesorectal segmentations. Prior to feature selection, all radiomic features were normalized using Z-score normalization. To ensure reproducibility, interobserver agreement was assessed on 20 randomly selected patients using ICC(3,1) (two-way mixed-effects model, absolute agreement, single measures). Features with ICC values of <0.60 were excluded from further analysis. Feature selection was performed using the least absolute shrinkage and selection operator (LASSO) regression method to reduce dimensionality and retain the most predictive features while minimizing the risk of overfitting. The top five features with the highest coefficients were selected for model training. To address class imbalance between groups, the synthetic minority over-sampling technique (SMOTE) was applied exclusively to the training folds during cross-validation to avoid data leakage (Figure 4). The extracted radiomic data were transferred to Python (version 3.9). Machine learning classifiers, including logistic regression, random forest, extreme gradient boosting (XGBoost), support vector machine (SVM) with radial basis function (RBF) kernel, and k-nearest neighbors (KNN), were implemented using the scikit-learn and XGBoost libraries. Model performance was evaluated using 10-fold cross-validation. In each iteration, the dataset was split into 9 folds for training and 1 fold for testing, repeated 10 times to calculate average performance.13, 14 Evaluation metrics included accuracy, precision, recall, F1-score, and area under the receiver operating characteristic (ROC) curve (AUC).
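The workflow described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: the feature matrix is simulated, the helper names `smote_like` and `cross_validated_auc` are hypothetical, the oversampler is a simplified SMOTE-style interpolation written out for self-containment, and the LASSO penalty `alpha` is arbitrary. The sketch nests scaling and LASSO selection inside each training fold; whether feature selection was nested in this way is not stated here, so that choice is an assumption.

```python
import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler


def smote_like(X, y, k=5, seed=0):
    """Simplified SMOTE: interpolate between minority samples and their neighbors."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_new = counts.max() - counts.min()
    X_min = X[y == minority]
    if n_new == 0 or len(X_min) < 2:
        return X, y
    k = min(k, len(X_min) - 1)
    # Each point is normally its own nearest neighbor, so request k + 1.
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X_min).kneighbors(X_min)
    base = rng.integers(0, len(X_min), size=n_new)
    neigh = idx[base, rng.integers(1, k + 1, size=n_new)]
    gap = rng.random((n_new, 1))
    X_new = X_min[base] + gap * (X_min[neigh] - X_min[base])
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, minority)])


def cross_validated_auc(X, y, n_splits=10, n_keep=5, alpha=0.01, seed=0):
    """Mean AUC with scaling, selection and oversampling fitted on training folds only."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    aucs = []
    for train, test in cv.split(X, y):
        scaler = StandardScaler().fit(X[train])
        X_tr, X_te = scaler.transform(X[train]), scaler.transform(X[test])
        # Rank features by absolute LASSO coefficient and keep the top n_keep.
        coef = Lasso(alpha=alpha, max_iter=10000).fit(X_tr, y[train]).coef_
        keep = np.argsort(np.abs(coef))[::-1][:n_keep]
        # Oversample the minority class in the training fold only.
        X_bal, y_bal = smote_like(X_tr[:, keep], y[train], seed=seed)
        model = LogisticRegression(max_iter=1000).fit(X_bal, y_bal)
        prob = model.predict_proba(X_te[:, keep])[:, 1]
        aucs.append(roc_auc_score(y[test], prob))
    return float(np.mean(aucs))


# Simulated data with the cohort's class ratio (36 responders, 65 non-responders).
rng = np.random.default_rng(42)
y = np.array([1] * 36 + [0] * 65)
X = rng.normal(size=(101, 50))   # hypothetical feature matrix
X[:, :3] += y[:, None] * 1.0     # make three features informative
print(round(cross_validated_auc(X, y), 3))
```

Because the held-out fold never passes through the oversampler, synthetic samples cannot leak into the test data, which is the purpose of restricting SMOTE to the training folds.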
The methodological quality of this study was evaluated using the METhodological RadiomICs Score (METRICS), a standardized tool comprising 30 parameters that assess key aspects of radiomics research, including imaging acquisition, feature extraction, and model validation. The METRICS tool categorizes studies into quality ratings of very low (0%–20%), low (20%–40%), moderate (40%–60%), good (60%–80%), and excellent (80%–100%).15
Results
In this study, a total of 101 patients [mean age 61.6 ± 13.59 years, 34 women (33.7%) and 67 men (66.3%)] with LARC were evaluated using high-resolution rectal MRI.
In the initial MRI, of 101 patients, 15.8% (n = 16) were staged as T2, 41.6% (n = 42) were staged as T3b, 20.8% (n = 21) were staged as T3c, 11.9% (n = 12) were staged as T3d, 6% (n = 6) were staged as T4a, and 4% (n = 4) were staged as T4b. In the MRI images obtained for re-staging after neoadjuvant CRT, 8% (n = 8) were in the T0 stage, 15% (n = 15) were in the T1 stage, 52.5% (n = 53) were in the T2 stage, 12% (n = 12) were in the T3b stage, 6% (n = 6) were in the T3c stage, 2% (n = 2) were in the T3d stage, 2% (n = 2) were in the T4a stage, and 3% (n = 3) were in the T4b stage.
The response to neoadjuvant treatment, according to the findings in the postoperative pathological material, was divided into groups by modified Ryan scoring. There were 21 patients (20%) with a modified Ryan score of 0, 15 patients (15%) with a score of 1, 50 patients (50%) with a score of 2, and 15 patients (15%) with a score of 3. Patients with modified Ryan scores of 0–1 were classified as responding, and patients with modified Ryan scores of 2–3 were classified as non-responding (Figure 5).
A total of 101 patients were included in the study, of whom 36 were classified as good responders (36%) and 65 as poor responders (64%). The mean age of good responders was 62 ± 12.5 years, and the mean age of poor responders was 65 ± 9.5 years. No statistically significant difference was observed between the mean age of patients who responded well and poorly to neoadjuvant treatment (P = 0.115). No significant association was identified between neoadjuvant treatment response and T stage (P = 0.196), extramural extension (P = 0.167), or proximity of the tumor to the mesorectal fascia (P = 0.316) (Table 2).
A radiomic analysis was conducted on the tumor and mesorectum to predict the response to neoadjuvant CRT.
Prediction of treatment response
In the analyses performed to predict neoadjuvant CRT response, 25 radiomics features from the tumor (Table 3) and 20 radiomics features from the mesorectum (Table 3) were found to be significant (P < 0.05).
Radiomic features were extracted from the tumor region on T2WI and DWI MRI images to construct the tumor-only model. The five most predictive parameters were selected using LASSO. Multiple machine learning models were constructed. The random forest classifier achieved an accuracy of 69.2%, a precision of 70.2%, a recall of 66.7%, an F1-score of 68.4%, and an AUC of 0.781. The XGBoost model yielded an AUC of 0.737. The logistic regression, SVM (RBF kernel), and KNN models resulted in AUCs of 0.714, 0.676, and 0.700, respectively. A detailed summary of the performance metrics for all classifiers in the models is presented in Table 4. The ROC curves for all five classifiers constructed in the tumor-only model are illustrated in Figure 6a. The odds ratios (ORs) and 95% confidence intervals (CIs) from the logistic regression models are summarized in Table 5.
Radiomic features were extracted from the mesorectum on T2WI images to construct the mesorectum-only model. The five most predictive parameters were selected using LASSO. Multiple machine learning models were constructed. The logistic regression model achieved an accuracy of 66.7%, a precision of 66.1%, a recall of 68.3%, an F1-score of 67.2%, and an AUC of 0.726. The XGBoost and random forest models yielded AUCs of 0.708 and 0.700, respectively. The SVM (RBF kernel) and KNN models resulted in AUCs of 0.711 and 0.661, respectively. A detailed summary of the performance metrics for all classifiers in the models is presented in Table 4. The ROC curves for all five classifiers constructed in the mesorectum-only model are illustrated in Figure 6b. The ORs and 95% CIs from the logistic regression models are summarized in Table 5.
Radiomic features extracted from both the tumor and mesorectum regions were combined to construct the combined model. The five most predictive parameters were selected using LASSO. Multiple machine learning models were constructed. The logistic regression model achieved an accuracy of 81%, a precision of 82.1%, a recall of 81.4%, an F1-score of 81.9%, and an AUC of 0.837. The random forest model yielded an AUC of 0.816. The AUCs for the XGBoost, SVM (RBF kernel), and KNN models were 0.789, 0.811, and 0.754, respectively. A detailed summary of the performance metrics for all classifiers in the models is presented in Table 4. The ROC curves for all five classifiers constructed in the combined model are illustrated in Figure 6c. The ORs and 95% CIs from the logistic regression models are summarized in Table 5.
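For reference, the classification metrics reported for the three models follow the standard confusion-matrix definitions, which can be sketched as below. The function name `classification_metrics` and the counts are hypothetical and chosen only for illustration; they are not taken from Table 4.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
m = classification_metrics(tp=40, fp=10, fn=10, tn=40)
print(m)
```

When metrics are averaged over cross-validation folds, the mean F1-score need not equal the harmonic mean of the mean precision and mean recall, so small differences between the two are expected.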
Based on the METRICS assessment, the study achieved a score of 80.3%, classifying it as “excellent quality” (80% ≤ score ≤ 100%) (Appendix 1).
Discussion
In this study, we constructed a series of machine-learning models to predict tumor response to neoadjuvant therapy by analyzing radiomic features extracted from the tumor, mesorectum, and their combination. The AUC values for the three segmentation approaches were as follows: 0.781 for the tumor-only model (random forest), 0.726 for the mesorectum-only model (logistic regression), and 0.837 for the combined model (logistic regression). This finding highlights the complementary value of the mesorectal compartment in radiomics modeling and its contribution to improving the performance of prediction models in LARC.
Personalized treatment protocols have become a prominent feature of clinical practice to minimize side effects, increase the frequency of organ-sparing surgery, and improve the clinical complete response rate in LARCs.6, 16, 17 The prediction of CRT response has emerged as a valuable marker for guiding the development of personalized therapies. The potential of radiomics for predicting the response to LARC treatment has been the subject of numerous studies. In the majority of studies, radiomics models of tumor tissue were constructed from MRI obtained before and/or after CRT.18-20
Mesorectal adipocytes not only act as an anatomical barrier surrounding the tumor but also actively contribute to the tumor microenvironment. The dynamic crosstalk between tumor cells and adipocytes induces profound morphological and functional changes in adipocytes, altering the secretion of adipokines (e.g., leptin, adiponectin) and angiogenic factors (e.g., VEGF, IGF-1). These changes promote key biological processes, such as tumor progression, angiogenesis, and therapeutic and radiotherapy resistance.21-25 Furthermore, molecular profile alterations within peritumoral adipocytes can lead to subtle MRI signal changes that may not be detectable through conventional visual assessment. This underscores the importance of radiomic analyses, which facilitate the extraction of hidden imaging data and provide a quantitative evaluation of subtle changes that would otherwise remain undetected.26, 27
In our study, we aimed to detect changes at the cellular level by performing radiomics measurements from morphologically non-pathologic mesorectum, which did not include tumor deposits, extramural tumor extension, or lymph nodes. The mesorectum contains adipocytes whose molecular profiles are altered in response to tumor processes, as well as venous and lymphatic structures that facilitate the drainage of waste products from both the tumor and surrounding tissues. Recent literature suggests that this microenvironment harbors prognostic information comparable with the tumor itself.7, 8
Relatively few MRI-based studies have incorporated mesorectal features into radiomics modeling. Shaish et al.8 reported an AUC of 0.800 using both tumor and mesorectal features from pretreatment MRI in 132 patients. Jayaprakasam et al.7 evaluated mesorectal features alone in a larger cohort of 236 patients and achieved an AUC of 0.890 for predicting pathological complete response. Kaval et al.28 assessed tumor-only and combined models in 93 patients, reporting AUCs of 0.850 and 0.830, respectively. Although tumor segmentation yielded the highest AUC in that study, the addition of mesorectal features led to improved sensitivity (90%) and overall accuracy (79%), further supporting the complementary role of the mesorectum in response assessment.
Although variations in study design, sample size, and endpoints may account for differences in performance, our results remain consistent with the existing literature, highlighting the importance of including mesorectal features for more accurate prediction of treatment response.
Compared with our models, which relied solely on MRI-based tumor and mesorectal features, the computed tomography-based radiomics approach developed by Wang et al.29 demonstrated lower predictive performance, with an AUC of 0.68 for identifying high-risk neoadjuvant rectal (NAR) scores. Notably, their analysis found mesorectal features to be more predictive than intratumoral features. In contrast, our results indicated that tumor-derived features contributed more strongly to treatment response prediction, suggesting that differences in imaging modality, feature representation, and endpoint definition (Ryan score vs. NAR) may explain the discrepancy. These findings support the utility of MRI-based radiomics as a more accurate and robust non-invasive tool for individualized response prediction.
Multiple models were developed to predict treatment response using radiomic features extracted from the tumor, mesorectal, and combined regions. Although classification performance varied across models, logistic regression, by providing ORs, enabled clinically meaningful interpretation across the three datasets.30, 31 The tumor-only logistic regression model was primarily driven by texture- and intensity-based features, reflecting intratumoral heterogeneity. In contrast, the mesorectum-only model included several morphological descriptors, though only a zone-based texture feature showed statistical significance. These findings indicate that mesorectal adipose tissue may reflect structural or spatial texture changes relevant to treatment response, even in the absence of pronounced intensity heterogeneity.
The combined logistic regression model demonstrated a more balanced and robust predictive performance than the individual models. All five features selected via LASSO contributed significantly to the model’s performance. Notably, features indicative of tissue homogeneity, such as gray-level autocorrelation and smooth intensity gradient transitions, were associated with favorable response, whereas heterogeneity-related features, including zone size non-uniformity and local textural complexity, were linked to poor response. These results support the hypothesis that radiomic heterogeneity reflects underlying biological disorganization or resistance, whereas homogeneity may indicate a more organized and treatment-sensitive tumor architecture. This interpretation aligns with existing literature. Tumor heterogeneity has been widely associated with treatment resistance and poor prognosis.32, 33
According to our most predictive model (tumor + mesorectum), second-order radiomic features, particularly those derived from the gray-level co-occurrence matrix (GLCM), gray-level size zone matrix (GLSZM), and neighboring gray tone difference matrix (NGTDM), demonstrated the highest predictive value. These matrices assess tissue heterogeneity at different levels: GLCM captures local structural variability; GLSZM quantifies the size and uniformity of homogeneous zones; and NGTDM evaluates visual texture by comparing a central voxel to its neighbors. Supporting our findings, Shaish et al.8 reported similar prognostic relevance of these features in evaluating response to neoadjuvant therapy. Additionally, Mazzei et al.33 showed that changes in GLCM features before and after treatment in patients with gastric cancer correlated with response, emphasizing the potential of these features as imaging biomarkers.
In radiomics-based machine learning, model performance is strongly shaped by factors such as limited sample size, high feature dimensionality, multicollinearity, and class imbalance.34 Our study reflects these challenges, as we analyzed 101 patients with rectal cancer using 17,978 radiomic features extracted from pretreatment MRI images. To mitigate the risk of overfitting and improve model generalizability, we applied LASSO-based feature selection and SMOTE-based class balancing. Among the tested algorithms, logistic regression with LASSO stood out by consistently providing robust and interpretable predictions, especially in the combined and mesorectum-only models, with strong AUC and F1 scores.35 Ensemble methods, such as random forest and XGBoost, also performed well, reflecting their ability to model complex, non-linear relationships in high-dimensional data.36, 37 Notably, in the tumor-only model, random forest yielded the highest predictive performance, possibly due to its inherent ensemble structure, which reduces variance and captures localized, non-linear dependencies within tumor-derived radiomic features.
Conversely, distance-based algorithms, such as KNN and SVM, showed moderate but generally lower performance than other models. Their results may reflect methodological limitations, such as sensitivity to feature scaling, reduced robustness to noise, and a higher risk of overfitting in high-dimensional, low-sample-size contexts, issues that often require careful tuning and preprocessing to overcome.38, 39 Nevertheless, SVM yielded relatively strong performance in the combined model, suggesting that, when provided with sufficiently rich and diverse input features, distance-based algorithms may perform competitively despite their known limitations.
This study has several limitations, including its retrospective nature, single-center origin, and limited sample size. Although external validation was not feasible due to the small cohort, we employed 10-fold cross-validation to support model robustness. A holdout test set was not used due to the limited number of cases, as dividing the data into training and test sets would have resulted in information loss. Furthermore, one mucinous tumor was not excluded from our patient cohort. Images with different pixel and FOV sizes were registered in the picture archiving and communication system in our study. This limitation was addressed by utilizing techniques such as pixel size readjustment, normalization, and gray-level discretization.
In conclusion, our study showed that combining radiomic features from both the tumor and mesorectum improves the prediction of response to neoadjuvant CRT in LARC. The combined model outperformed tumor-only and mesorectum-only models, achieving the highest AUC (0.837) and superior overall classification metrics. Incorporating mesorectal features resulted in a more balanced and more accurate model, highlighting the complementary role of the mesorectum in individualized response prediction. To enable the routine clinical application of these findings, further validation through large-scale, multicenter prospective studies is warranted.