Radiomics-driven prediction of pathologic complete response in non-mass breast cancer using post-neoadjuvant chemotherapy preoperative dynamic contrast-enhanced magnetic resonance imaging

Oleksandr Moroz; Zhiqiang Liu; Cheng Liu; Qian Yin; XingRui Li; Tao Ai

doi:10.4274/dir.2026.263773

ABSTRACT

This study aims to evaluate the clinical utility of a radiomics model derived from post-neoadjuvant chemotherapy (post-NAC) preoperative dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) for predicting pathologic complete response (pCR) to NAC in patients with breast cancer exhibiting non-mass lesions (NMLs).

METHODS

This retrospective study included patients with biopsy-proven breast cancer with NMLs who underwent pre-treatment DCE-MRI and completed standard NAC. Patients were randomly assigned to training and validation cohorts in a 7:3 ratio. Three-dimensional regions of interest (ROIs) of the tumors were manually delineated on pre-NAC DCE-MRI images and spatially registered to post-NAC preoperative images. Radiomic features were extracted from the post-NAC preoperative DCE-MRI ROIs using the Deepwise Multimodal Research Platform. After dimensionality reduction and feature selection, predictive classifiers were constructed based on a logistic regression algorithm for the radiomics-only model and the combined radiomics–clinical model. Subsequently, a triple-integration model further incorporated radiologist assessment. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis. SHapley Additive exPlanations (SHAP) analysis was applied to identify the most influential features.

RESULTS

A total of 121 patients were included (training: n = 85; validation: n = 36), of whom 56 achieved pCR. The radiomics-only model demonstrated good discriminative performance (training AUC: 0.927; validation AUC: 0.867), outperforming both the clinical data model (AUCs: 0.577, 0.608) and radiologist assessment (AUCs: 0.708, 0.571). Incorporating clinical variables further improved predictive accuracy (training AUC: 0.933; validation AUC: 0.870). The triple-integration model attained AUCs of 0.936 and 0.810, with no statistically significant difference compared with the radiomics-only model (P = 0.450 and P = 0.235 for training and validation, respectively). In addition, SHAP analysis showed that radiomic features contributed most to prediction, followed by human epidermal growth factor receptor 2 and hormone receptor status.

CONCLUSION

Post-NAC preoperative DCE-MRI-based radiomics provides a non-invasive method for predicting pCR in non-mass enhancement breast cancer. The combined radiomics–clinical model achieves superior performance and offers potential value for individualized NAC response assessment. Radiomic features effectively characterize the chemotherapy-altered tissue phenotype, offering an objective and quantitative approach for preoperative treatment response assessment in complex NML-type breast cancer, supporting individualized treatment planning.

CLINICAL SIGNIFICANCE

Accurate early prediction of pCR could help identify patients most likely to benefit from NAC and avoid ineffective treatment in non-responders. The developed radiomics model offers an interpretable and reproducible tool; upon successful external validation, it has the potential to support personalized treatment planning in patients with NML-type breast cancer.

Keywords:

Breast cancer, non-mass lesion, neoadjuvant chemotherapy, magnetic resonance imaging, radiomics, pathologic complete response

Main points

• Magnetic resonance imaging-based radiomics can non-invasively predict pathologic complete response (pCR) to neoadjuvant chemotherapy in patients with non-mass enhancement breast cancer.

• The combined radiomics–clinical model shows higher discriminative performance than clinical features or radiomics alone.

• Shape and first-order radiomic features contributed most strongly to pCR prediction, with SHapley Additive exPlanations analysis providing a transparent explanation of feature impact.

• The model demonstrates good generalizability in internal validation, supporting its potential use for treatment decision-making in non-mass lesion breast cancer.

Neoadjuvant chemotherapy (NAC) is a standard treatment for locally advanced breast cancer, aimed at reducing tumor burden and improving surgical outcomes.¹,² Its use is increasing globally, with registry data showing that the proportion of patients receiving NAC more than doubled in high-income countries such as Australia between 2011 and 2016.³ Consequently, the achievement of postoperative pathologic complete response (pCR) is a critical metric for evaluating NAC efficacy and predicting patient survival.¹,²,⁴

At present, dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is frequently employed in clinical settings to assess the efficacy of NAC in patients with breast cancer,⁵ with imaging complete response (iCR) serving as a surrogate for pCR.⁶,⁷ Nevertheless, there are major discrepancies in the criteria used to define iCR and pCR. iCR is generally determined according to the Response Evaluation Criteria in Solid Tumors (RECIST 1.1), which characterizes it as the complete disappearance of all measurable enhancing lesions on MRI following NAC.⁸ By contrast, pCR is often evaluated using the Miller–Payne grading system, which permits the presence of ductal carcinoma in situ (DCIS) components at the primary tumor site.⁹,¹⁰ This inconsistency results in a lack of concordance between iCR and pCR, leading to substantial discrepancies between imaging assessments and postoperative pathological diagnoses.¹¹,¹² This issue is particularly pronounced in breast cancers that manifest as non-mass enhancement (NME). Lesions of the non-mass type typically display an indistinct, diffusely distributed enhancement pattern¹³,¹⁴ and frequently exhibit non-centric shrinkage during NAC,¹⁵⁻¹⁷ complicating the assessment of NAC efficacy. Furthermore, the tumor region post-NAC may undergo a series of complex alterations, including tissue necrosis, fibrosis, mucinous degeneration, and calcification. These changes can obscure the true pathological state in MRI signals and enhancement patterns, potentially leading radiologists to underestimate the efficacy of NAC, as evidenced by discrepancies between imaging assessments indicating non-iCR and pathological diagnoses confirming pCR. Consequently, there is a pressing need to develop a novel evaluation method capable of more accurately quantifying tumor imaging characteristics and reflecting their microstructural heterogeneity to assess NAC efficacy in non-mass-type breast cancer.

Radiomics offers a powerful way to non-invasively reveal biological information within tumors by extracting many high-dimensional quantitative features from medical images.¹⁸⁻²¹ Although some studies have explored radiomics for predicting NAC efficacy in breast cancer, most focus on morphologically regular mass-type lesions,¹⁷,²²,²³ and research specifically addressing non-mass-type breast cancer remains limited. Therefore, this study aims to construct and validate a predictive model based on post-NAC preoperative DCE-MRI radiomic features. The goal is to accurately predict the pCR status of patients with non-mass-type breast cancer following NAC. Through quantitative analysis of the radiomic features within the tumor region, we aim to develop an objective and standardized preoperative predictive tool that provides reliable assessments for patients with non-mass-type breast cancer, which poses major challenges in imaging evaluation.

Methods

Study participants

This single-center retrospective study screened 538 consecutive patients diagnosed with invasive breast cancer who received standard NAC at our hospital between August 2018 and July 2024. The inclusion criteria were as follows: 1) aged ≥ 18 years; 2) invasive breast cancer confirmed through core needle biopsy; 3) lesions meeting the imaging definition of NME on baseline MRI according to the 5th edition of the Breast Imaging Reporting and Data System (BI-RADS);¹³ 4) completion of 6–8 cycles of a standard NAC regimen (anthracycline-, taxane-, or platinum-based regimens selected based on tumor subtype); and 5) radical surgery after NAC, with complete postoperative pathology data. The exclusion criteria were as follows: 1) previous breast cancer treatment; 2) lack of preoperative MRI assessment after NAC; 3) incomplete NAC treatment course or no surgical treatment; 4) poor MRI image quality affecting delineation of the region of interest (ROI); and 5) missing key clinical or pathological data. The final cohort size (121 cases, with 56 achieving pCR and 65 without pCR), though sufficient for initial model development, is a study limitation that underscores the exploratory nature of these findings and necessitates future validation in larger cohorts. A stratified random sampling method was used to split the cases into a training set (n = 85) and a validation set (n = 36) in a 7:3 ratio (Figure 1). Because no patient appears in both sets, data leakage is prevented and model generalizability can be evaluated on unseen data. The Tongji Hospital Institutional Review Board approved this study and waived the requirement for informed patient consent (grant number: TJ-IRB202502119, date: 21.02.2025). All procedures complied with the Declaration of Helsinki.
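A stratified 7:3 split of this kind can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `stratified_split`, the random seed, and the largest-remainder rounding rule are assumptions. Only the cohort sizes (121 patients, 56 with pCR, 36 in the validation set) are taken from the text.

```python
import random
from collections import defaultdict


def stratified_split(labels, n_val, seed=0):
    """Return (train_idx, val_idx) so that class proportions are preserved.

    Hypothetical helper: per-class validation counts are allocated
    proportionally and rounded with the largest-remainder rule.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    n = len(labels)
    # Ideal (fractional) number of validation cases per class.
    quotas = {y: n_val * len(ix) / n for y, ix in by_class.items()}
    counts = {y: int(q) for y, q in quotas.items()}
    # Hand out the remaining slots to the largest fractional parts.
    leftover = n_val - sum(counts.values())
    ranked = sorted(quotas, key=lambda y: quotas[y] - counts[y], reverse=True)
    for y in ranked[:leftover]:
        counts[y] += 1

    val_idx = []
    for y, ix in by_class.items():
        rng.shuffle(ix)
        val_idx.extend(ix[:counts[y]])
    val_set = set(val_idx)
    train_idx = [i for i in range(n) if i not in val_set]
    return train_idx, sorted(val_idx)


# 1 = pCR (56 patients), 0 = non-pCR (65 patients)
labels = [1] * 56 + [0] * 65
train_idx, val_idx = stratified_split(labels, n_val=36)
print(len(train_idx), len(val_idx))
```

Under these assumptions the validation set holds 17 pCR and 19 non-pCR cases; the per-class counts actually used in the study are not stated in this section.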

Breast magnetic resonance imaging scanning protocol

MRI scans of all patients were acquired using a Siemens Skyra 3.0T MR scanner (equipped with a 16-channel dedicated breast phased-array coil; Siemens Healthcare, Erlangen, Germany). Patients were positioned face down with both breasts naturally hanging in the coil. The DCE-MRI scanning parameters were as follows: repetition time/echo time, 5.24/2.46 ms; flip angle, 10°; matrix, 320 × 260; field of view, 320 × 260 mm²; and slice thickness, 1.5 mm. DCE-MRI was continuously scanned for 60 phases with a temporal resolution of 5.74 s/phase. At the end of the third phase, a high-pressure injector was used for intravenous injection of the gadolinium contrast agent (Omniscan, GE Healthcare, Milwaukee, WI) via the elbow vein at a dose of 0.1 mmol/kg body weight, with an injection rate of 2.5 mL/s, followed by a 20-mL saline flush. The total DCE-MRI scanning time was 5 minutes and 57 seconds.

Clinical and pathological data collection

Clinical and pathological data were collected, including age, body mass index (BMI), clinical tumor stage (cT), clinical lymph node stage (cN), hormone receptor (HR) status, human epidermal growth factor receptor 2 (HER-2) status, Ki-67 index, and postoperative pCR status. All data were obtained from the electronic medical record system and pathology reports. HR, HER-2, and Ki-67 status were determined according to the Chinese Society of Clinical Oncology breast cancer diagnosis and treatment guidelines,²⁴ whereas clinical staging (cT, cN) referenced the American Joint Committee on Cancer (AJCC) Cancer Staging Manual.¹⁰

Pathological response was assessed according to the AJCC Cancer Staging Manual,¹⁰ with histopathologic evaluation conducted using the Miller–Payne grading system.⁹ pCR was defined as the absence of invasive carcinoma in the breast and lymph nodes, allowing for the presence of residual DCIS (ypT0/is ypN0),¹⁰ corresponding to Miller–Payne grade 5.⁹ This pathology evaluation system was chosen for its alignment with current clinical guidelines⁶,¹⁰,²⁴ and common use in clinical practice, both within China and internationally,²⁵ thereby facilitating comparison with other studies. Variations in pCR definitions can affect cross-study comparisons.

Imaging data collection and lesion segmentation

All patients underwent DCE-MRI scans before NAC (baseline) and before surgery following NAC. Two radiologists with over 3 years of experience in breast imaging independently assessed and measured the imaging features of the lesions (including maximum diameter, morphology, and enhancement pattern) and were blinded to the patients’ clinical information and pathological results. If the assessments were inconsistent, a chief physician with 25 years of diagnostic experience served as an arbitrator to resolve discrepancies and finalize the results. All descriptions of imaging features followed the 5th edition of BI-RADS.¹³

Imaging efficacy assessment was performed according to the RECIST 1.1 standards, defining iCR as the complete disappearance of all measurable enhancing lesions on MRI after NAC.⁸ To evaluate the predictive efficacy of radiologists for pCR in patients with non-mass-type breast cancer after NAC, this study defined iCR as a “positive” indicator for predicting pCR. Based on postoperative pathological pCR status, the post-NAC preoperative imaging assessment results were categorized into four types: true positive (TP, iCR and pCR), true negative (TN, non-iCR and non-pCR), false positive (FP, iCR but non-pCR), and false negative (FN, non-iCR but pCR).

Lesion segmentation was performed on the enhanced 90-second images of the baseline DCE-MRI. A radiologist with 3 years of experience manually outlined the three-dimensional (3D) ROI of the lesion using ITK-SNAP software (version 3.8.0) ²⁶ while being blinded to clinical information and pathological results. The delineation was performed layer by layer along the most prominent edges of lesion enhancement, including cystic and necrotic tissue areas, ultimately generating a 3D volume of interest (VOI). Subsequently, after image registration, this VOI was automatically aligned with the corresponding post-NAC preoperative DCE-MRI images to match the lesion areas before and after NAC. The second radiologist reviewed all VOIs, and if discrepancies occurred, consensus was reached through discussion with a third chief physician. For VOI mismatches due to patient positioning or changes in breast morphology, the following steps were taken to correct them: (1) manually adjusting the VOI position using stable anatomical structures such as the sternum, aortic arch, and anterior border of the pectoralis major as references and (2) revising the VOI range based on the actual breast tissue boundary if the lesion significantly shrank after NAC, causing the original VOI to exceed the breast tissue boundary.

Radiomic feature extraction, selection, and model construction

Radiomic feature extraction was completed on the Deepwise Multimodal Research Platform (version 2.3; Beijing Deepwise & League of PHD Technology Co., Ltd, Beijing, China; https://keyan.deepwise.com). All post-NAC preoperative images were resampled to a standard resolution of 1 × 1 × 1 mm³ using B-spline interpolation to standardize the data. A total of 1,409 radiomic features were extracted from each patient’s post-NAC preoperative DCE-MRI VOIs, including first-order features, morphological features, texture features [gray-level co-occurrence matrix (GLCM), gray-level size zone matrix, gray-level run length matrix, and gray-level dependence matrix], and neighborhood gray-tone difference matrix features.

To select robust, non-redundant features and to prevent data leakage or optimistic bias, all feature selection steps were performed exclusively on the training cohort. First, intraclass correlation coefficients (ICCs) were calculated, and features with ICC > 0.90 were retained to ensure reproducibility. Second, Pearson correlation analysis was performed, and for each pair of features with an absolute correlation coefficient greater than 0.90, one feature was randomly removed. Finally, an F-test was performed on the remaining features to select those significantly associated with pCR status. This finalized feature set, derived solely from the training data, was then applied without further modification to the validation cohort for model testing.
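The redundancy filter and the F-test described above can be sketched in plain Python. This is an illustrative reading of the procedure rather than the platform's implementation: the helper names, the toy feature values, and the seed are hypothetical, the ICC step is omitted, and only the F statistic (not its P value) is computed.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(features, threshold=0.90, seed=0):
    """For each pair with |r| > threshold, randomly drop one member."""
    rng = random.Random(seed)
    kept = set(features)
    names = sorted(features)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a in kept and b in kept:
                if abs(pearson(features[a], features[b])) > threshold:
                    kept.discard(rng.choice([a, b]))
    return sorted(kept)


def anova_f(values, labels):
    """One-way ANOVA F statistic of a feature across outcome groups."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = sum(values) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    ss_between = sum(len(g) * (means[y] - grand) ** 2 for y, g in groups.items())
    ss_within = sum((v - means[y]) ** 2 for y, g in groups.items() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy training data (hypothetical): f2 is nearly a multiple of f1.
features = {
    "f1": [1, 2, 3, 4, 5, 6],
    "f2": [2, 4, 6, 8, 10, 12.5],
    "f3": [5, 1, 4, 2, 6, 3],
}
pcr = [0, 0, 0, 1, 1, 1]

kept = drop_correlated(features)
f_stats = {name: anova_f(features[name], pcr) for name in kept}
print(kept, f_stats)
```

In this toy example exactly one of `f1` and `f2` survives the correlation filter, and the surviving features are then ranked by their F statistic against the outcome.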

Using the selected features, five common machine learning algorithms [decision tree, AdaBoost, random forest, support vector machine (SVM), and logistic regression (LR)] were applied to build predictive models. The area under the receiver operating characteristic (ROC) curve (AUC), F1 score, sensitivity, and specificity of each model in the training set and internal validation set were compared. The LR model, which showed the best performance, was chosen as the base algorithm. Subsequently, the radiomic score (Rad-score) for each patient was calculated as follows:

$$\text{Rad-score} = \sum_{i=1}^{n} \beta_i X_i$$

where β_i represents the coefficient of the i-th feature, X_i represents the value of the i-th feature for each patient, and n is the number of selected features.
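As a minimal sketch, the Rad-score described above is a coefficient-weighted sum of the selected feature values, which a logistic function can map to a probability. The coefficients and feature values below are hypothetical placeholders, not the fitted values from this study, and the function names are illustrative.

```python
import math


def rad_score(coefs, x):
    """Linear combination of feature values weighted by model coefficients."""
    if len(coefs) != len(x):
        raise ValueError("coefs and x must have the same length")
    return sum(b * v for b, v in zip(coefs, x))


def logistic(z):
    """Map a linear score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients and standardized feature values for one patient.
coefs = [0.5, -1.0]
x = [2.0, 1.0]

score = rad_score(coefs, x)
print(score, logistic(score))
```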

This score was combined with the independent clinical predictive factors identified by multivariate LR and the radiologists’ imaging assessment results. Then, “radiomics–clinical” and “radiomics–clinical–physician assessment” fusion models were sequentially constructed to evaluate the predictive efficacy of different models and to assess the incremental value of the radiomics model over physicians’ subjective judgment.

Model performance was compared using the ROC curve and DeLong test;²⁷ the AUC, sensitivity, specificity, accuracy, and F1 score were also determined. Calibration curves were plotted to assess agreement between predicted probabilities and actual outcomes. Decision curve analysis was performed to evaluate the clinical net benefit of each model at different decision thresholds. Additionally, the SHapley Additive exPlanations (SHAP) method was used to analyze feature contributions of the best model, enhancing its interpretability and clinical applicability.
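For orientation, the AUC used throughout can be computed as the probability that a randomly chosen pCR case receives a higher predicted score than a randomly chosen non-pCR case, with ties counted as one half. The sketch below uses hypothetical scores; it does not implement the DeLong test or confidence intervals.

```python
def auc(scores, labels):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities and true pCR labels.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(auc(scores, labels))
```

The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of this size.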

Statistical analysis

Statistical analysis was performed using SAS® OnDemand for Academics (Release 3.81, Enterprise Edition). Continuous variables that followed a normal distribution (such as age and BMI) were expressed as mean ± standard deviation, and inter-group comparisons were conducted using independent sample t-tests. Non-normally distributed continuous variables were expressed as median (interquartile range) and compared using the Mann–Whitney U test. Categorical variables (such as cT stage, cN stage, and molecular subtype) were presented as counts (percentages), and groups were compared using the chi-squared test or Fisher’s exact test. Univariate and multivariate LR analyses were also performed to identify independent clinical predictive factors associated with pCR and to build a clinical predictive model using these factors. A P value <0.05 was considered statistically significant.

Results

Clinical and imaging data assessment

This study ultimately included 121 patients with pathologically confirmed breast cancer showing NME on MRI. All patients underwent NAC followed by surgery, with 56 confirmed to have achieved pCR and 65 not achieving it. Comparative analysis between groups showed significant differences in cN stage and molecular subtype between the pCR and non-pCR groups (P < 0.05). However, there were no significant differences between the two groups in age, BMI, maximum tumor diameter, cT stage, and Ki-67 levels (P > 0.05, Table 1).

To ensure the validity of model training, cases were randomly split into a training set (85 cases) and a validation set (36 cases) in a 7:3 ratio. Statistical comparison showed no significant differences in baseline clinicopathological features or pCR status between the two groups (P > 0.05, Table 2), indicating a balanced and comparable dataset.

Subsequently, univariate and multivariate LR analyses were performed on the training set to identify independent predictive factors associated with pCR. The results showed that a higher cN stage [odds ratio (OR): 0.364, P < 0.05] and HR-positive status (OR: 0.284, P < 0.05) were negatively associated with pCR, whereas HER-2-positive status (OR: 15.243, P < 0.05) was positively associated with pCR (Table 3). The clinical predictive model constructed using these three factors had AUC values of 0.577 and 0.608 in the training and validation sets, respectively.

Efficacy of radiologists’ imaging assessment

The efficacy of iCR, assessed according to the RECIST 1.1 standard, in predicting pCR was evaluated as follows: among the 121 patients, 28 were determined to have iCR, of whom 23 were pathologically confirmed as pCR (TP) and 5 were non-pCR (FP). Among the 93 patients assessed as non-iCR, 60 were non-pCR (TN) and 33 were pCR (FN). Based on these data, the overall accuracy of iCR in predicting pCR was 68.6% (83/121), with a sensitivity of 41.1% (23/56), specificity of 92.3% (60/65), positive predictive value of 82.1% (23/28), and negative predictive value of 64.5% (60/93). These results indicate that although radiologists demonstrate high specificity in assessing the pCR status of NME-type breast cancer based on MRI, their sensitivity is limited. The main sources of error are likely non-centric shrinkage of lesions and residual DCIS, both of which can produce residual enhancement that misleads radiologists’ judgment.
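The reported percentages follow directly from the four confusion-matrix counts given above (TP = 23, FP = 5, TN = 60, FN = 33). The short sketch below reproduces that arithmetic; the function name is illustrative.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from a 2 x 2 table."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts reported for radiologist iCR assessment versus pathological pCR.
metrics = diagnostic_metrics(tp=23, fp=5, tn=60, fn=33)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```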

Radiomic feature selection and model construction

In this study, we extracted 1,409 radiomic features from each patient’s post-NAC preoperative DCE-MRI 3D VOIs. After screening based on ICCs, Pearson correlation analysis (threshold: 0.9), and ANOVA F-tests, 11 features that were significantly associated with pCR and exhibited low redundancy were retained for model construction (Table 4, Figure 2).

To identify the optimal predictive machine learning algorithm, we built models using a decision tree, AdaBoost, random forest, SVM, and LR based on these 11 features. By comparing the performance of each model in the training and internal validation sets (Table 5, Figure 3), the LR model showed the best performance in the validation set, achieving the highest AUC [training set AUC: 0.927, 95% confidence interval (CI): 0.872–0.982; validation set AUC: 0.867, 95% CI: 0.750–0.983]. Therefore, the LR model was chosen as the basis for subsequent fusion models.

Construction, validation, and interpretability analysis of fusion models

To enhance predictive efficacy, two fusion models were constructed: Fusion Model 1 combined the radiomic score (Rad-score) with independent clinicopathological predictive factors (cN stage, HR status, and HER-2 status); Fusion Model 2 further integrated the radiologists’ imaging assessment (iCR) results into Fusion Model 1. The predictive performance of each model is summarized in Table 6 and Figure 4. In the training set, the AUC values were as follows: radiologists’ assessment, 0.708; clinical model, 0.577; radiomics model, 0.927; Fusion Model 1, 0.933; and Fusion Model 2, 0.936. In the validation set, the AUC values were 0.571, 0.608, 0.867, 0.870, and 0.810, respectively. Both fusion models demonstrated good predictive performance in the training and validation sets. The DeLong test was performed for pairwise comparisons of model performance (Figure 4). The results showed that, in the validation set, Fusion Model 1 and Fusion Model 2 had significantly higher AUCs than the single clinical model and radiologists’ assessment (P < 0.05), but there was no significant difference compared with the single radiomic model (P > 0.05).

We plotted calibration curves to assess how well predicted probabilities matched actual outcomes (Figure 5) and found that the radiomics model and its fusion models showed good consistency (Figure 6). Decision curve analysis further indicated that, at most clinical decision thresholds, Fusion Model 1 and Fusion Model 2 offered better clinical net benefits than single models. In the validation set, Fusion Model 1 maintained stable net benefits within a threshold probability range of 0.2–0.8, whereas the radiologists’ model showed lower net benefits, approaching zero or negative values at some thresholds.
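The net benefit plotted in a decision curve is conventionally computed at each threshold probability p_t as TP/n − (FP/n) × p_t/(1 − p_t). The sketch below applies that standard definition to hypothetical predictions; it is not the study's analysis code, and the "treat all" comparison is included only as the usual reference strategy.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of acting on predictions at or above the threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: classify every patient as positive."""
    return net_benefit([1.0] * len(labels), labels, threshold)


# Hypothetical predicted probabilities and true pCR labels.
probs = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

for pt in (0.2, 0.5, 0.8):
    print(pt, net_benefit(probs, labels, pt), net_benefit_treat_all(labels, pt))
```

A model adds clinical value at a given threshold when its net benefit exceeds both the "treat all" line and zero (the "treat none" strategy).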

Finally, SHAP analysis was performed on the best-performing model, Fusion Model 1, to improve interpretability (Figure 7). The results showed that the Rad-score and its radiomic features were the largest contributors to the model’s predictions, followed by HER-2 status, HR status, and cN stage. These findings highlight the key role of radiomic features in predicting pCR in NME-type breast cancer after NAC.
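For a linear model such as LR, and under the assumption that features are treated as independent, the SHAP value of feature i on the log-odds scale reduces to β_i × (x_i − mean of x_i over a background set). The sketch below illustrates that closed form with hypothetical coefficients and data; the SHAP variant actually used by the analysis platform is not specified in the text.

```python
def linear_shap(coefs, x, background):
    """SHAP values of a linear model under the feature-independence assumption.

    Returns one contribution per feature, in the units of the linear
    predictor (log-odds for logistic regression).
    """
    means = [sum(col) / len(col) for col in zip(*background)]
    return [b * (xi - m) for b, xi, m in zip(coefs, x, means)]


def linear_predictor(coefs, x):
    """Linear score without an intercept (the intercept cancels in SHAP)."""
    return sum(b * v for b, v in zip(coefs, x))


# Hypothetical two-feature model, background sample, and one patient.
coefs = [2.0, -1.0]
background = [[0.0, 0.0], [2.0, 4.0]]
patient = [3.0, 1.0]

contributions = linear_shap(coefs, patient, background)
baseline = sum(linear_predictor(coefs, row) for row in background) / len(background)
print(contributions, baseline)
```

The contributions sum to the difference between the patient's linear score and the mean background score, which is the additivity property that makes SHAP rankings comparable across features.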

Discussion

This study assessed the value of a post-NAC preoperative DCE-MRI-based radiomics model for predicting pCR after NAC in patients with breast cancer presenting as non-mass lesions (NMLs). The results showed that the radiomics model had stable and strong discriminative ability in both the training and validation sets (AUCs of 0.927 and 0.867, respectively), significantly outperforming the clinical model (AUCs of 0.577 and 0.608) and the qualitative assessment by radiologists based on the RECIST 1.1 standard⁸ (AUCs of 0.708 and 0.571). After adding clinical risk factors, model performance improved further (AUCs of 0.933 and 0.870), although the magnitude of improvement was modest, indicating that radiomic features play a central role in predicting NAC effectiveness. The validation AUC of the triple-integration model (Fusion Model 2) was lower than that of the radiomics–clinical model (0.810 vs. 0.870), which may indicate instability or modest overfitting when adding the radiologists’ binary assessment, a variable potentially subject to higher variance, to a model already robustly defined by radiomic and clinical data.

The imaging assessment by radiologists in this study showed high specificity (92.3%) but low sensitivity (41.1%). This is consistent with previous studies reporting a wide range of sensitivities (37.9%–78.3%), which depend on tumor subtype and lesion type.²²,²³,²⁸⁻³⁰ These studies highlight the complex response patterns of NMLs after chemotherapy, as they often exhibit non-centric shrinkage and may be accompanied by changes such as mucinous degeneration, fibrosis, calcification, and residual DCIS.

These factors can cause residual enhancement, which can mislead radiologists’ judgment.¹¹⁻¹³,¹⁶,³¹ By contrast, radiomics, by extracting and analyzing a large number of quantitative features from tumor regions, can better capture these microstructural changes, leading to more accurate predictions of actual pathological responses.¹⁸,³²

The findings of this study match earlier reports indicating that radiomics has significant advantages in predicting NAC efficacy.¹⁹,²⁰,³¹,³³⁻³⁷ However, compared with earlier research, this study includes new aspects. First, regarding study participants, most previous research has focused on mass-type breast cancers with regular morphology and clear boundaries, whereas this study focuses specifically on complex and poorly defined NMLs. Due to the higher heterogeneity and more complex shrinkage patterns exhibited by NMLs on MRI,¹⁵⁻¹⁷ assessing their treatment efficacy is more challenging. This study confirmed the usefulness and superiority of the radiomics approach in NML-type breast cancer, thereby filling a research gap. Second, because it is difficult to accurately outline the tumor bed after NAC, clinical models based on this area often do not predict well,³⁸ leading most earlier studies to use pre-NAC imaging for modeling.²⁵ For example, Joo et al.³⁶ found that a fusion model using clinical information and pretreatment MRI images performed significantly better than a purely clinical model in predicting pCR (AUCs of 0.888 and 0.827, respectively, P < 0.05). Similarly, Chen et al.³⁷ combined clinical factors with radiomic features to achieve an AUC of 0.879 in the validation set. These studies mainly demonstrate how sensitive tumors might be to treatment rather than their actual responses. By contrast, this study built models using post-NAC preoperative MRI, which provides a more direct assessment of the final tumor response to treatment. It is important to note that currently, only a few studies have attempted to analyze post-NAC imaging, and these studies usually combined post-NAC visible tumor areas with pre-NAC imaging features for joint modeling,³³,³⁴ with fewer studies independently utilizing post-NAC imaging features. This study, however, based its models on the complete tumor volume before treatment, accurately aligning it with post-NAC images and adjusting based on stable anatomical landmarks. Even when lesions shrink substantially or the enhancement signal completely disappears, features can still be extracted within the same volume range, fully reflecting changes in the tissue microenvironment after chemotherapy. This strategy works well for the non-centric shrinkage of NMLs.

Finally, we compared models built from radiomic features, clinical factors, and radiologist assessment. The results showed that although the AUC of the fusion models was numerically higher than that of the radiomics-only model, the difference was not statistically significant (DeLong test P > 0.05). This key finding suggests that the radiomic signature derived from the post-NAC preoperative MRI itself possesses strong predictive capability. Crucially, our model characterizes the definitive post-NAC tissue state rather than measuring change from baseline (a delta-radiomics approach). This endpoint phenotype appears to be a potent integrator of the complex biological processes that determine the final pathological outcome. Consequently, although clinical factors remain crucial for treatment planning, their incremental value for predicting the achieved pCR status from the post-NAC imaging phenotype was limited.

Using SHAP interpretability analysis, this study found that texture features (such as square_glcmMCC and wavelet-LHL_glcm_Correlation) and first-order features (such as log-sigma-3-0-mm-3D_firstorder_Maximum) were key factors in predicting pCR. These features might reflect microheterogeneity changes caused by necrosis, fibrosis, and vascular remodeling within the tumor after NAC, which is consistent with earlier studies on texture heterogeneity and changes in the tumor microenvironment.¹⁸,¹⁹,³⁹

It is important to note that the independent clinical predictive factors identified in this study (cN stage, HR status, and HER-2 status) are highly consistent with previous clinical knowledge.⁴⁰,⁴¹ A higher cN stage usually indicates greater aggressiveness and a heavier tumor burden,⁶ making it difficult to achieve pCR; HR-positive breast cancers often have estrogen receptors and/or progesterone receptors that promote cancer cell growth, spread, and resistance to treatment, rendering them less sensitive to chemotherapy.⁴² Conversely, HER-2-positive tumors have much higher pCR rates after combined anti-HER-2 targeted therapy.⁴³ Therefore, the clinical model results in this study are consistent with the biological principles of breast cancer treatment responses. Additionally, these results support the reliability of the data used in this study. More importantly, the SHAP results show that these traditional clinical factors have less importance in the fusion model than the radiomic features. This suggests that radiomics has partially captured and integrated the imaging manifestations of the aforementioned biological differences, which is one of the main reasons for its improved predictive performance. For example, there is a notable interaction between HER-2-positive status and the log-sigma-3-0-mm-3D_firstorder_Maximum feature, which indicates that differences in treatment responses among molecular subtypes are already reflected in imaging signal intensity and texture features. In other words, the radiomics model can identify the effects of clinically measurable subtypes and may also capture more complex microtissue changes and heterogeneity patterns, thus outperforming traditional clinical indicators in predicting pCR after NAC in NML-type breast cancer.

Although this study demonstrates good performance in predicting pCR after NAC for NML breast cancer, there are still certain limitations. First, the initial imaging assessment was performed by radiologists with 3 years of experience. Although all diagnoses were reviewed and arbitrated by a senior specialist, the subjective evaluation of complex post-NAC NML patterns remains challenging and may influence the performance benchmark against which the radiomics model was compared. This inherent difficulty underscores the value of developing objective tools. Second, model specificity requires further optimization. Integrating multimodal data such as DWI-ADC, DCE-MRI quantitative parameters, and genomic data will be key to boosting the model’s discriminative power and robustness. Third, clinical translation relies on efficient lesion segmentation. Developing (semi-)automatic segmentation tools for NML breast cancer is essential for advancing the clinical application of this model. Fourth, the single-center, retrospective design and the sample size, particularly of the validation set (n = 36), limit the generalizability of our findings. The considerable reduction from the initially screened population (n = 538) to the final cohort (n = 121) may introduce selection bias. Therefore, our results should be interpreted as a promising proof of concept, and external validation in larger, multicenter prospective cohorts is essential before clinical application. Fifth, we also acknowledge that the linear feature selection approach used in this research, although providing stability in a smaller sample, may overlook complex non-linear associations that could be captured by more advanced algorithms in future studies with larger datasets.

This study constructed and validated a radiomics model for predicting pCR after NAC for NML-type breast cancer based on post-NAC preoperative DCE-MRI. The radiomics model, based on LR, showed strong discriminative performance in both the training and validation sets (AUCs of 0.927 and 0.867, respectively), significantly outperforming the clinical models and qualitative assessments by radiologists. Furthermore, incorporating independent clinical predictive factors (cN stage, HR status, and HER-2 status) resulted in a combined model integrating radiomics and clinical data, which showed the most consistent performance in discriminative ability, calibration, and clinical decision-making benefits (AUCs of 0.933 and 0.870 for the training and validation sets, respectively). The results indicate that radiomic features derived from post-NAC MRI can effectively characterize the treatment-altered tissue environment and are key factors in predicting the efficacy of NAC for NML-type breast cancer. This model provides an objective, quantitative, and interpretable method for preoperative efficacy assessment of complex NMLs, which could help clinicians identify patients likely to achieve pCR, thereby supporting individualized treatment decision-making.

Conflict of interest disclosure

The authors declared no conflicts of interest.

References

Cortazar P, Zhang L, Untch M, et al. Pathological complete response and long-term clinical benefit in breast cancer: the CTNeoBC pooled analysis. Lancet. 2014;384(9938):164-172.