ABSTRACT
PURPOSE
We aimed to compare diagnostic performance and interobserver variability in breast tumor classification with and without the aid of an innovative dual-mode artificial intelligence (AI) architecture that automatically integrates information from ultrasonography (US) and shear-wave elastography (SWE).
METHODS
Diagnostic performance was assessed on a test subset of 599 images (from September 2018 to February 2019) from 91 patients with 64 benign and 27 malignant breast tumors. Six radiologists (three inexperienced, three experienced) first read the images independently (independent diagnosis) and then made a secondary diagnosis with knowledge of the AI results. Sensitivity, specificity, accuracy, receiver operating characteristic (ROC) curve analysis, and Cohen's κ statistics were calculated.
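As an illustration of the reader-level statistics named above, the following is a minimal sketch in Python of how sensitivity, specificity, accuracy, and Cohen's κ can be computed from binary reads. It is not the study's analysis code: the function names `binary_metrics` and `cohen_kappa`, the coding 1 = malignant and 0 = benign, and the toy reads are all assumptions made for this example.

```python
from collections import Counter


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy (assumed coding: 1 = malignant, 0 = benign)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),   # malignant tumors correctly called malignant
        "specificity": tn / (tn + fp),   # benign tumors correctly called benign
        "accuracy": (tp + tn) / len(pairs),
    }


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters: (observed - expected) / (1 - expected)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / n**2
    return (observed - expected) / (1 - expected)  # undefined if expected == 1


# Hypothetical toy data, not taken from the study.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
reader_a = [1, 1, 0, 0, 0, 0, 1, 0]
reader_b = [1, 0, 0, 0, 0, 0, 1, 0]

metrics_a = binary_metrics(truth, reader_a)
kappa_ab = cohen_kappa(reader_a, reader_b)
print(metrics_a, round(kappa_ab, 3))
```

Cohen's κ is defined for a pair of raters; how pairwise values were combined into group-level agreement is not stated here, so the sketch stops at a single pair.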
RESULTS
In the inexperienced radiologists’ group, the average area under the ROC curve (AUC) increased from independent to secondary diagnosis from 0.722 to 0.765 (p = 0.050) in US-mode and from 0.794 to 0.834 (p = 0.019) in dual-mode. In the experienced radiologists’ group, the average AUC increased significantly with the AI system in US-mode (from 0.812 to 0.833, p = 0.039) but not in dual-mode (from 0.858 to 0.866, p = 0.458). In US-mode, interobserver agreement among all radiologists improved from fair to moderate (p = 0.003). In dual-mode, agreement was substantial and increased among the experienced radiologists (κ from 0.65 to 0.74, p = 0.017) and among all radiologists (κ from 0.62 to 0.73, p = 0.001).
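The agreement labels used above (fair, moderate, substantial) match the widely used Landis and Koch bands for κ. The sketch below maps a κ value to those bands; that this particular scale was the one applied is an assumption, and the helper name `agreement_label` is hypothetical.

```python
def agreement_label(kappa):
    """Map a kappa value to the Landis and Koch agreement bands (assumed scale)."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Dual-mode kappa values reported in the results.
labels = {k: agreement_label(k) for k in (0.62, 0.65, 0.73, 0.74)}
print(labels)
```

Under these bands, all four dual-mode values fall in the substantial range both before and after AI assistance, so the reported gain is an increase within one band rather than a change of category.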
CONCLUSION
AI assistance yields a more pronounced improvement in diagnostic performance for inexperienced radiologists, whereas experienced radiologists benefit more from AI through reduced interobserver variability.