ABSTRACT
PURPOSE
The study aims to investigate the predictability of the radiological response in intrahepatic cholangiocarcinoma (iCC) patients undergoing Yttrium-90 transarterial radioembolization (TARE) with a combined model built on dynamic magnetic resonance imaging (MRI)-based radiomics and clinical features.
METHODS
Thirty-six naive iCC patients who underwent TARE were included in this study. The tumor segmentation was performed on the axial T2-weighted (T2W) without fat suppression, axial T2W with fat suppression, and axial T1-weighted (T1W) contrast-enhanced (CE) sequence in equilibrium phase (Eq). At the sixth month MRI follow-up, all patients were divided into responders and non-responders according to the modified Response Evaluation Criteria in Solid Tumors. Subsequently, a radiomics score (rad-score) and a combined model of the rad-score and clinical features for each sequence were generated and compared between the groups.
RESULTS
Thirteen (36.1%) patients were considered responders, and the remaining 23 (63.9%) were non-responders. Responders exhibited significantly lower rad-scores than non-responders (P < 0.050 for all sequences). The radiomics models showed good discriminatory ability with an area under the curve (AUC) of 0.696 [95% confidence interval (CI), 0.522–0.870] for the axial T1W-CE-Eq, AUC of 0.839 (95% CI, 0.709–0.970) for the axial T2W with fat suppression, and AUC of 0.836 (95% CI, 0.678–0.995) for the axial T2W without fat suppression.
CONCLUSION
Radiomics models created from pre-treatment MRIs can predict the radiological response to Yttrium-90 TARE in iCC patients with high accuracy. Combining radiomics with clinical features could increase the power of the test. Large-scale studies of multi-parametric MRIs with internal and external validations are needed to determine the clinical value of radiomics in iCC patients.
Main points
• Intrahepatic cholangiocarcinoma (iCC) is the second most common primary hepatic malignancy.
• Transarterial radioembolization (TARE) is used as a first-line treatment in iCC patients due to the radiation-sensitivity of this tumor.
• TARE is a costly and laborious treatment method; therefore, predicting the response to the treatment is crucial for accurate patient selection.
• In radiomics models created by pre-treatment magnetic resonance imaging, the response to TARE in iCC patients can be predicted with high accuracy.
Intrahepatic cholangiocarcinoma (iCC) is the second most common primary hepatic malignancy after hepatocellular carcinoma.1 Its worldwide incidence has increased over the past few decades.2 If left untreated, the prognosis is poor, with an estimated median survival of 3 to 8 months. Treatment options for iCC include surgical resection and transplantation. Unfortunately, most patients will present with metastatic or locally advanced disease at diagnosis and are not candidates for surgery.3 For unresectable iCCs, systemic chemotherapy with cisplatin-gemcitabine results in a relatively poor median overall survival (OS) of 11.7 months.4 Currently, transarterial radioembolization (TARE) is used as the first-line treatment due to radiation sensitivity and high arterial perfusion of the tumor.5 The results of TARE are mixed, with median response rates ranging from 5% to 36% and median OS from 9 to 22 months.6 Additionally, TARE is a costly and laborious treatment method; therefore, predicting response to treatment is crucial for accurate patient selection.
Radiomics is the post-processing analysis of medical images with custom-made software to obtain texture data imperceptible to the human eye. The data obtained are analyzed with machine learning algorithms to develop predictive models.7 The number of studies in radiomics, particularly for predicting the treatment response of hepatic malignancies, including hepatocellular carcinoma and hepatic metastasis, has increased exponentially in recent years.8, 9 However, although iCC is the second most common primary liver cancer, it is a relatively rare tumor, and studies on radiomics in iCC patients are limited and derived from computed tomography (CT) examinations.10, 11 As far as we know, no study has yet examined whether radiomics analyses based on magnetic resonance imaging (MRI) can predict the radiological response to TARE in iCC patients.
This study aims to investigate the predictability of the treatment response in iCC patients undergoing Yttrium-90 TARE with a combined model created with dynamic MRI-based radiomics and clinical features.
Methods
Study design
The Institutional Clinical Research Çukurova University, Faculty of Medicine, Clinical Ethics Committee (decision number: 114/09-2021) approved this single-center retrospective study. Informed consent was obtained from all patients prior to all diagnostic and therapeutic procedures in accordance with the principles of the 1964 Declaration of Helsinki.
Fifty-five treatment-naive iCC patients who underwent TARE between September 2015 and January 2022 were assessed for eligibility. The inclusion criteria were a biopsy-proven diagnosis of iCC and dynamic MRI before and after TARE. The exclusion criteria were prior local or systemic treatments, an inability to clearly distinguish tumor boundaries due to the infiltrative pattern on the pre-treatment MRI, and images unsuitable for analysis due to motion artifacts. Nineteen patients were excluded after application of the exclusion criteria. As a result, a total of 36 patients who met the selection criteria were included in the study.
Pre-treatment clinical characteristics, including age, gender, alpha-fetoprotein, carcinoembryonic antigen, carbohydrate antigen 19-9, alanine aminotransferase, aspartate aminotransferase, total bilirubin, albumin, international normalized ratio (INR), intrahepatic tumor distribution, positron emission tomography/CT-based extrahepatic disease spread, and nodal involvement, were noted. The laboratory examination results were obtained from blood tests the day before TARE and during planned follow-ups.
MRI examinations
The MRI examinations were acquired using a 1.5 Tesla system (Optima, General Electric Healthcare, USA) or a 3.0 Tesla system (Ingenia, Philips Medical Systems, the Netherlands). The MRI protocol comprised an axial T2-weighted (T2W) sequence without fat suppression, an axial T2W sequence with fat suppression, and an axial T1-weighted (T1W) contrast-enhanced (CE) sequence in the equilibrium phase (Eq). The specific parameters of axial T2W imaging were as follows: time of repetition (TR) 10,000 ms, time of echo (TE) 66 ms, layer thickness 6 mm, layer spacing 1 mm, matrix 320 × 320, field of view (FOV) 400 mm × 400 mm, piecewise collection times or average times 1, and parallel collection factor 0, fs. The parameters of dynamic CE MRI were as follows: TR 4.2 ms, TE 1 min full, layer thickness 5 mm, layer spacing 0 mm, matrix 260 × 224, FOV 380 mm × 342 mm, and parallel acceleration factor 2. T1W was acquired using 0.1 mmol/kg gadolinium-diethylenetriamine penta-acetic acid (Gd-DTPA) at a rate of 2.5 mL/s in the Eq (a scanning delay of 180 s). MRI sequences have been abbreviated as “phase 1: axial T1W CE Eq, phase 2: axial T2W with fat suppression, and phase 3: axial T2W sequence without fat suppression” in relevant places in the text.
Transarterial radioembolization
All patients underwent splanchnic angiography via the femoral approach, and the tumor-feeding arteries were determined by cone-beam CT, followed by a technetium-99m macroaggregated albumin (MAA) injection. The lung shunt fraction and distribution of MAA within the tumors and non-tumor tissue were evaluated with single-photon emission CT. The desired dose was calculated using partition model dosimetry.12 During TARE, infusion of the previously calculated dose of Yttrium-90-loaded resin (SIR-Spheres, Sirtex Medical, Australia) or glass microspheres (TheraSphere, Boston Scientific, US) was carried out under fluoroscopic guidance in a super-selective or selective manner depending on the defined vascular anatomy. After the TARE procedure, the patients were observed for complications for 24 hours. All patients were scheduled for follow-up, including MRI and laboratory tests.
Evaluation of the radiological response to treatment
Following TARE, dynamic CE MRI was performed at three-month intervals. The response of the index tumor to the treatment was evaluated according to the modified Response Evaluation Criteria in Solid Tumors.13 The objective response of the index tumor represented the primary outcome measure and was defined as the sum of the complete response and partial response. Based on the 6-month MRI follow-up, the patients were divided into two groups: responders and non-responders.
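The grouping rule described above (objective response = complete response + partial response) can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names are hypothetical, the 30% and 20% thresholds are the commonly published mRECIST target-lesion cut-offs rather than values stated in this paper, and the sketch ignores new lesions and non-target lesions.

```python
def mrecist_category(baseline_sum_mm, followup_sum_mm, nadir_sum_mm=None):
    """Simplified target-lesion category from sums of viable (enhancing) tumor diameters.

    Assumed thresholds (not stated in the paper): PR = at least 30% decrease
    from baseline, PD = at least 20% increase from the smallest recorded sum.
    """
    if nadir_sum_mm is None:
        nadir_sum_mm = baseline_sum_mm  # no earlier follow-up available
    if followup_sum_mm == 0:
        return "CR"  # no remaining enhancing tumor
    if followup_sum_mm >= 1.20 * nadir_sum_mm:
        return "PD"
    if followup_sum_mm <= 0.70 * baseline_sum_mm:
        return "PR"
    return "SD"


def is_responder(category):
    """Objective response is defined in the text as complete plus partial response."""
    return category in {"CR", "PR"}


# Hypothetical measurements in millimetres (baseline, 6-month follow-up).
examples = [(50.0, 0.0), (50.0, 30.0), (50.0, 45.0), (50.0, 65.0)]
categories = [mrecist_category(b, f) for b, f in examples]
responders = [is_responder(c) for c in categories]
```

In this toy input, only the first two patients would fall into the responder group.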
Tumor segmentation
Digital Imaging and Communications in Medicine data were transferred to a workstation and analyzed by dedicated software (Olea Sphere v.3 SP2, Olea Medical, France). The raw images were normalized using a Z-score to reduce the possible effects of different MRI devices. Subsequently, two radiologists blinded to the aim of this study segmented the axial T2W without fat suppression, axial T2W with fat suppression, and axial T1W-CE-Eq images by manually drawing the boundaries of the tumors slice-by-slice. After this, a volume of interest (VOI) that covered the entire tumor was created (Figure 1). One hundred eight grey-level properties (first and second order) of the generated VOI were extracted.
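As an illustration of the Z-score step, the following Python sketch rescales intensities to zero mean and unit standard deviation. The normalization in the study was performed inside the commercial software, so this is only an assumed equivalent; the function name and the voxel values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_normalize(intensities):
    """Rescale a list of voxel intensities to zero mean and unit standard deviation."""
    mu = mean(intensities)
    sigma = pstdev(intensities)  # population SD over all supplied voxels
    if sigma == 0:
        raise ValueError("constant image: standard deviation is zero")
    return [(value - mu) / sigma for value in intensities]


# Hypothetical intensities of the same tissue on two scanners with different scales.
scanner_a = [100.0, 120.0, 140.0, 160.0]
scanner_b = [1000.0, 1200.0, 1400.0, 1600.0]

norm_a = zscore_normalize(scanner_a)
norm_b = zscore_normalize(scanner_b)
```

After normalization the two toy images have matching values, which is the sense in which a Z-score removes a scanner-dependent offset and scale. It does not remove other scanner-related differences such as noise texture.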
Selection of the treatment response-related features and construction of a radiomics score
One hundred eight features were extracted based on the MRI for each patient. Because the number of features exceeded the number of patients, radiomic feature selection was performed using the least absolute shrinkage and selection operator (LASSO).14 The logistic radiomics models for predicting the treatment response for all phases were fitted to select the treatment response-related features with nonzero coefficients. Three-fold cross-validation with minimum criteria was employed to find an optimal tuning parameter, where the final value of the tuning parameter yielded minimum cross-validation error and maximum area under the curve (AUC). Then, the radiomics score (rad-score) was calculated for each patient by a linear combination of the selected features (with nonzero coefficients) and their respective coefficients.
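The last step, building the rad-score from the features that LASSO leaves with nonzero coefficients, can be sketched in Python as follows. The feature names, coefficient values, and the optional intercept are illustrative assumptions; the paper's actual coefficients come from the glmnet fit and are given in its supplementary material.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Linear combination of the features whose LASSO coefficient is nonzero.

    `features` maps feature name -> value for one patient;
    `coefficients` maps feature name -> fitted coefficient.
    The intercept is an optional assumption and defaults to zero.
    """
    selected = {name: c for name, c in coefficients.items() if c != 0.0}
    return intercept + sum(c * features[name] for name, c in selected.items())


# Hypothetical coefficients: LASSO shrank the third feature to exactly zero.
lasso_coefficients = {
    "glcm_contrast": 0.8,
    "firstorder_entropy": -0.5,
    "glrlm_run_variance": 0.0,
}
# Hypothetical normalized feature values for one patient.
patient_features = {
    "glcm_contrast": 1.5,
    "firstorder_entropy": 2.0,
    "glrlm_run_variance": 9.0,
}

score = rad_score(patient_features, lasso_coefficients, intercept=0.25)
```

The feature with a zero coefficient does not contribute, however large its value, which is what makes the score a compact summary of the selected features only.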
Statistical analysis
All analyses were performed using IBM SPSS software (version 20; IBM Corp, USA) and R software (version 1.0.143). Categorical variables were expressed as numbers and percentages, whereas continuous variables were summarized as mean, standard deviation, median, and minimum-maximum where appropriate. The chi-squared test was used to compare categorical variables between patient groups. The normality of distribution for continuous variables was assessed with the Shapiro–Wilk test. The Student’s t-test or Mann–Whitney U test was used to compare the continuous clinical characteristics between patient groups depending on whether the assumptions of the tests were met. The glmnet package (https://cran.r-project.org/web/packages/glmnet/index.html) was used for the LASSO binary logistic regression. The distribution of the rad-scores in the treatment response groups was demonstrated via a violin plot, which is a hybrid of a box plot and a kernel density plot. Violin plots were plotted using the ggplot2 package (https://cran.r-project.org/web/packages/ggplot2/index.html). Logistic regression analysis was performed to determine significant predictors of the treatment response. Clinical features that were significant at the P < 0.250 level in the univariate analysis were entered into the stepwise logistic regression analysis using the backward logistic regression method. Features with a P < 0.050 after the stepwise analysis were included in the clinical model. In addition, three combined models were built: (1) a model adding the rad-score in the axial T2W without fat suppression to the clinical model, (2) a model adding the rad-score in the axial T2W with fat suppression to the clinical model, and (3) a model adding the rad-score in the axial T1W CE Eq to the clinical model. The goodness-of-fit of the models was assessed with Nagelkerke’s R-squared.
The predictive ability of the models was assessed with receiver operating characteristic curves and associated performance diagnostics (AUC, sensitivity, and specificity). The best cut-off value was based on the index of union method.15 The AUCs of the models were compared with the DeLong test (https://www.rdocumentation.org/packages/Daim/versions/1.1.0/topics/DeLong.test). The net reclassification index (NRI) and integrated discrimination improvement (IDI) were used to assess the improvement in discrimination and reclassification gained by adding the rad-score.16 Each combined model was compared with the clinical model as the reference. The PredictABEL package was used to calculate the NRI and IDI (https://cran.r-project.org/web/packages/PredictABEL/index.html). The statistical level of significance for all tests was P < 0.050.
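To make the cut-off selection concrete, the sketch below computes an empirical AUC and then picks the threshold that minimizes the index of union, taken here as |sensitivity − AUC| + |specificity − AUC|. This is a plain-Python illustration under the assumption that higher scores indicate the positive class; the function names and the data are hypothetical, and the study itself used R.

```python
def auc(scores, labels):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly, ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def index_of_union_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity, auc) minimizing the index of union."""
    area = auc(scores, labels)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for c in sorted(set(scores)):
        # A case is called positive when its score is at or above the cutoff.
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= c)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < c)
        se, sp = tp / n_pos, tn / n_neg
        iu = abs(se - area) + abs(sp - area)
        if best is None or iu < best[0]:
            best = (iu, c, se, sp)
    return best[1], best[2], best[3], area


# Hypothetical predicted probabilities and true classes (1 = positive class).
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

cutoff, sensitivity, specificity, area = index_of_union_cutoff(scores, labels)
```

The index of union favors a threshold at which sensitivity and specificity are both close to the AUC, so it tends to select balanced operating points rather than maximizing one measure at the expense of the other.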
Results
Clinical characteristics
Thirteen (36.1%) patients were considered responders, and the remaining 23 (63.9%) were non-responders at the 6-month follow-up. Table 1 presents the baseline clinical characteristics of the patients in the treatment groups. There were no significant differences in any of the characteristics between the two treatment response groups (P > 0.050 for all).
Radiomics signature calculation and evaluation
To investigate the effectiveness of the treatment response discrimination, we performed LASSO modeling of the texture features; 108 features were chosen to construct the rad-score for the axial T1W-CE-Eq. Similarly, four features were selected for the axial T2W with fat suppression and eight features for the axial T2W without fat suppression. Using these features, rad-scores were generated for each patient in three phases, and Supplementary Material 1 contains the details of the feature selection process.
Responders exhibited significantly lower rad-scores than non-responders in all phases (P = 0.039 for the axial T1W-CE-Eq, P = 0.001 for the axial T2W with fat suppression, and P = 0.001 for the axial T2W without fat suppression). Figure 2 presents the violin plot of the rad-scores for all phases.
Model building and performances
Table 2 summarizes the results of the multivariate logistic regression analysis. After the stepwise regression analysis, results for the clinical model (before the rad-score was added to the clinical features) revealed that bilobar disease [odds ratio (OR): 4.53, 95% confidence interval (CI): 1.06–19.41, P = 0.042] was a significant independent risk factor for the treatment response, whereas the INR (OR: 2.31, 95% CI: 0.93–5.74, P = 0.072) was retained in the model without reaching statistical significance. Results of the combined models (obtained by integrating the significant clinical features and the rad-score in each phase) demonstrated that bilobar disease and the rad-score in the axial T2W with fat suppression (OR: 7.97, 95% CI: 1.03–62.03, P = 0.047 and OR: 1.33, 95% CI: 1.06–1.68, P = 0.015, respectively) and the rad-score in the axial T2W without fat suppression (OR: 1.31, 95% CI: 1.04–1.65, P = 0.023) were independent predictors of the treatment response.
The radiomics models (fitted only from the rad-scores in each phase) showed good discriminatory ability with an AUC of 0.696 (95% CI: 0.522–0.870) for the axial T1W-CE-Eq, 0.839 (95% CI: 0.709–0.970) for the axial T2W with fat suppression, and 0.836 (95% CI: 0.678–0.995) for the axial T2W without fat suppression (Table 3). There was no significant difference in AUCs between the radiomics models (DeLong’s test, P > 0.050 for all pairwise comparisons).
The clinical model resulted in an AUC of 0.769, followed by the combined model-1 (0.816), the combined model-2 (0.863), and the combined model-3 (0.880) (Figure 3). Although the AUC of the combined model-3 was not significantly higher than those of the other models, the combined model-3 showed a favorable AUC of 0.880 (95% CI: 0.730–0.999) (Table 3). The sensitivity and specificity of the combined model-3 were 92% and 78%, respectively. Relative to the clinical model, the use of the combined model-2 resulted in an NRI of 93.0% (P = 0.002) and an IDI of 20.0% (P = 0.003), and the use of the combined model-3 resulted in an NRI of 86.0% (P = 0.006) and an IDI of 22.0% (P < 0.001). The reclassification measures confirmed that adding rad-scores to the clinical model (the combined model-2 and the combined model-3) performed better than the clinical model alone. Table 3 presents the detailed information for the prediction performance of the models.
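For readers unfamiliar with these reclassification measures, the following Python sketch shows how an IDI and a category-free (continuous) NRI can be computed from the predicted probabilities of a reference model and an extended model. It assumes the continuous form of the NRI, which the paper does not state explicitly; the function names and probabilities are hypothetical and unrelated to the study data.

```python
def _split(labels):
    """Indices of events (label 1) and non-events (label 0)."""
    events = [i for i, y in enumerate(labels) if y == 1]
    nonevents = [i for i, y in enumerate(labels) if y == 0]
    return events, nonevents


def idi(p_old, p_new, labels):
    """Integrated discrimination improvement of the new model over the old one."""
    events, nonevents = _split(labels)

    def mean_change(idx):
        return sum(p_new[i] - p_old[i] for i in idx) / len(idx)

    # Gain in mean predicted risk among events minus gain among non-events.
    return mean_change(events) - mean_change(nonevents)


def continuous_nri(p_old, p_new, labels):
    """Category-free NRI: net upward movement in events minus that in non-events."""
    events, nonevents = _split(labels)

    def net_up(idx):
        up = sum(1 for i in idx if p_new[i] > p_old[i])
        down = sum(1 for i in idx if p_new[i] < p_old[i])
        return (up - down) / len(idx)

    return net_up(events) - net_up(nonevents)


# Hypothetical predicted probabilities from a reference and an extended model.
labels = [1, 1, 1, 0, 0, 0]
p_old = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
p_new = [0.75, 0.75, 0.25, 0.25, 0.25, 0.75]

idi_value = idi(p_old, p_new, labels)
nri_value = continuous_nri(p_old, p_new, labels)
```

Because the continuous NRI sums two net proportions, it ranges from −2 to 2 (−200% to 200%), so values close to 100% are possible.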
Discussion
In this study, the predictability of the treatment response in iCC patients undergoing TARE was investigated with a combined model created with MRI-based (including the axial T1W-CE-Eq, axial T2W without fat suppression, and axial T2W with fat suppression sequences) radiomics and clinical features. The radiomics models predicted the radiological response with high accuracy. Bilobar disease and rad-scores were independent predictors of the treatment response. There was no statistically significant difference between the models combining clinical characteristics with radiomics features. This study is important because it is the first one in which the response to TARE in iCC patients has been predicted with MRI-based radiomics.
Patients with unresectable iCC have a poor prognosis. The previously published studies revealed that TARE has great potential to improve patients’ prognosis and OS. However, they have reported a wide range of median OS in iCC (6.1–22 months), probably reflecting the heterogeneous biological behavior of this relatively rare tumor.17, 18, 19 Therefore, the pre-treatment determination of the prognostic factors is important in the patient selection for TARE and the implementation of personalized treatments. In this study, the tumor responded to therapy in about a third of the treated patients. Texture analysis based on pre-treatment MRI was a valuable marker for predicting the treatment response in unresectable iCC patients who underwent TARE.
Previous studies have identified clinical prognostic factors in patients with iCC who underwent TARE. Tumors with bilobar disease had a lower OS rate after the administration of TARE than tumors with unilobar disease. On the other hand, it was established that extrahepatic disease and liver function did not affect the prognosis.20 In this study, it was found that extrahepatic metastases and liver function did not have prognostic significance. However, bilobar disease was associated with a treatment response in iCC patients.
Mosconi et al.11 analyzed the data of 53 iCC patients who underwent TARE and investigated the relationship between CT textural features prior to TARE and the objective response. They used the arterial phase images for texture analysis to show that iCCs with a high uptake of iodine contrast in the arterial phase had a higher objective response rate after TARE. Combining these textural features provided an AUC for objective response prediction of 0.896 (95% CI 0.814–0.977). In the present study, MRI was used, as it has a better soft-tissue contrast resolution than CT and shows more tumor tissue features. The AUC in the present study (0.880) was similar to Mosconi et al.’s11 results.
Zhang et al.21 investigated the prediction of the immunophenotyping (IP) and OS of iCC patients using preoperative MRI texture analysis. They found that the MRI tissue signature could serve as a potential predictive biomarker for IP and OS using arterial phase images for tissue analysis.21 Mosconi et al.11 considered that tumor enhancement in the arterial phase indicated hyperperfusion and therefore suitability for TARE. Zhang et al.21 thought that the arterial phase revealed the amount of inflammation better than other MRI sequences. In the present study, tissue analysis was performed in the axial T2W with and without fat suppression and axial T1W CE-Eq because the amount of fibrous component associated with poor prognosis is better visualized on MRI as a peripheral hypointensity in T2W and CE images in the delayed phase.21 In this study, the arterial phase was not used since the truncation artifact negatively affects tumor segmentation in MRI with Gd-DTPA. Therefore, this study reveals the importance of using MRI sequences other than the arterial phase (axial T2W without fat suppression and axial T1W-CE-Eq) for texture analysis.
The rad-scores constructed with the LASSO were significantly associated with the treatment response for all phases in this study. Although the inclusion of the rad-score in the clinical model did not significantly improve the AUC, it increased the sensitivity in predicting the treatment response and improved model performance. The combined model-2 and model-3 showed enhanced AUCs of 0.863 and 0.880, with significant NRI and IDI values.
There were several limitations to this study. First, the number of patients was limited, and the study was retrospective. Therefore, internal or external validation analysis could not be performed. Second, the images analyzed in the study were obtained from two devices with different field strengths. This could have affected the texture analysis. However, to mitigate this, normalization was applied to all images before segmentation. Third, the study did not evaluate other MRI sequences and dynamic contrast phases (portal phase). Despite all these limitations, the present study demonstrated that the treatment outcomes of iCC patients undergoing TARE could be predicted with high accuracy by MRI-based radiomics prior to treatment.
Radiomics models created from pre-treatment MRIs can predict the response to TARE in iCC patients with high accuracy. Combining clinical factors, such as bilobar disease, with texture analysis could increase the power of the test. However, large-scale studies with multiparametric MRIs with internal and external validations are needed to reach a definitive conclusion and determine the advantages and disadvantages of the radiomics models.