ABSTRACT
Artificial intelligence (AI) is entering routine radiology practice, but most studies evaluate algorithms in isolation rather than their interaction with radiologists in clinical workflows. This narrative review summarizes current knowledge on human–AI interaction in radiology and highlights practical risks and opportunities for clinical teams. First, simple conceptual models of human–AI collaboration are described, such as diagnostic complementarity, which explain when radiologists and AI can achieve synergistic performance exceeding that of either alone. Then, AI tool integration strategies along the imaging pathway are reviewed, from acquisition and triage to interpretation, reporting, and teaching, outlining common interaction models and physician-in-the-loop workflows. Cognitive and professional effects of AI integration are also discussed, including automation bias, algorithmic aversion, deskilling, workload management, and burnout, with specific vulnerabilities for trainees. Furthermore, key elements of responsible implementation are summarized, such as liability and oversight implications, continuous monitoring for performance drift, usable explanations, basic AI literacy, and co-design with radiology teams. Finally, emerging systems are introduced, including vision–language models and adaptive learning loops. This review aims to provide a clear and accessible overview to help the radiology community recognize where human–AI collaboration can add value, where it can cause harm, and which questions future studies must address.
Main points
• Human–artificial intelligence (AI) collaboration in radiology is best understood through diagnostic complementarity, where combined performance can exceed that of either the human or the algorithm alone.
• Workflow-embedded, physician-in-the-loop designs do more to determine AI’s real clinical value than marginal gains in standalone accuracy.
• Automation bias, algorithmic aversion, skill erosion, and context-dependent workload effects are central risks that require explicit mitigation, especially for trainees.
• Responsible implementation depends on formal governance structures, continuous post-market surveillance, explainability standards, and systematic AI literacy for radiology teams.
• Future research should prioritize prospective, multi-institutional evaluation of team performance, equity, and long-term training outcomes over isolated model-centric metrics.
Artificial intelligence (AI), defined as the development of methods enabling machines to perform tasks that historically required human intelligence, is considered a revolutionary development in healthcare, particularly in diagnostic imaging, and has the potential to transform the medical imaging profession.1-3 Early perspectives on AI adoption in radiology were often characterized by automation anxiety, driven by impressive demonstrations of algorithmic performance that led some to speculate about the potential substitution of human practitioners.4, 5 However, this narrative has matured, and a strong argument has emerged that AI’s role is not to supplant human expertise but rather to function in a human–AI symbiosis as a cognitive partner.6, 7
The prevailing professional viewpoint is that AI should serve as a complementary assistive tool, augmenting human intelligence in the diagnostic process.5, 7-9 The imaging workforce, including both radiologists and radiation technologists, demonstrates generally positive reception and optimism regarding the potential of AI.10-13 This favorable outlook is motivated by the expected advantages of AI, including improved efficiency, reduced workload, and optimized management of clinical practice.14-16 The primary goal is to establish a predominantly assistive and collaborative symbiotic relationship between humans and AI systems, yielding collective performance that exceeds what either could achieve alone.6 Realizing these clinical benefits requires the imaging workforce to adapt to and actively collaborate with such systems.
This narrative review first investigates the conceptual frameworks guiding effective human–AI collaboration, followed by an examination of the practical dynamics of integrating AI into clinical radiology workflows. Subsequently, it considers the resulting cognitive and professional impacts, details the required governance and ethical safeguards, and concludes by exploring emerging technological trajectories and proposing directions for future research.
This narrative review is based on targeted, topic-driven searches of the literature, complemented by expert knowledge and backward and forward reference chaining, and does not follow a formal systematic search strategy.
Conceptual frameworks for human–AI collaboration
The strategic adoption of AI systems in medical imaging relies on established conceptual models that clarify the optimal nature of human–AI cooperation. The key concepts used to describe human–AI interaction and collaboration in this review are summarized in Table 1.
The foundational principle guiding this interaction is diagnostic complementarity.17 This concept posits that combining two distinct interpretive agents, the human radiologist and the AI system, results in an overall diagnostic performance superior to that of either component acting alone. Importantly, complementarity does not imply continuous or unstructured decision fusion; rather, it reflects the effective alignment of distinct strengths through clearly defined roles within the diagnostic workflow. This synergy, often described as human–AI symbiosis, stems from inherent differences in strengths and vulnerabilities.6 AI systems, particularly those using advanced algorithms such as neural networks, can identify subtle patterns and anomalies in medical images, efficiently perform repetitive, high-volume tasks at scale, and dramatically speed up the process of image interpretation.18, 19 Conversely, human radiologists provide critical clinical context, common sense, intuition, and medical judgment, which are essential for synthesizing findings into meaningful patient care narratives.20-25 Because the errors made by human readers (often perceptual or related to fatigue or distraction) and those made by AI systems (often related to generalization or contextual limitations) only partially overlap,6, 17 well-structured role separation may allow each to compensate for the other’s limitations, whereas poorly specified interaction can undermine this advantage.26, 27
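To make the intuition behind complementarity concrete, the toy simulation below (a minimal sketch with hypothetical error rates, not data from the cited studies) shows how a team rule that counts a finding as detected when either reader flags it can exceed the sensitivity of the human or the AI alone whenever their misses only partially overlap.

```python
import random

random.seed(0)
N = 10_000                      # hypothetical positive cases
P_HUMAN_MISS = 0.15             # assumed human false-negative rate
P_AI_MISS = 0.15                # assumed AI miss rate on cases the human reads correctly
P_AI_MISS_IF_HUMAN_MISS = 0.30  # assumed AI miss rate on cases the human also misses

human_hits = ai_hits = team_hits = 0
for _ in range(N):
    human_miss = random.random() < P_HUMAN_MISS
    ai_miss = random.random() < (P_AI_MISS_IF_HUMAN_MISS if human_miss else P_AI_MISS)
    human_hits += not human_miss
    ai_hits += not ai_miss
    team_hits += not (human_miss and ai_miss)   # "either detects" team rule

print(f"Human sensitivity: {human_hits / N:.3f}")   # ~0.85
print(f"AI sensitivity:    {ai_hits / N:.3f}")      # ~0.83
print(f"Team sensitivity:  {team_hits / N:.3f}")    # ~0.95 when misses only partially overlap
```

In practice, such an “either detects” rule trades specificity for sensitivity, which is precisely why the structured role separation and arbitration steps discussed in the following paragraphs matter.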
AI is implemented across a spectrum of involvement, ranging from low-autonomy augmentative systems,28 which often use a physician-in-the-loop model,29 to highly autonomous systems that operate with limited direct human oversight.28 In this paper, the term “physician-in-the-loop” is used to emphasize the radiologist’s diagnostic liability; however, this role exists within a broader human-in-the-loop ecosystem in which technologists provide critical upstream functions, such as image acquisition and quality control.
Within this spectrum, AI can function as a decision support tool by marking suspicious areas or supplying confidence scores to assist the radiologist’s final determination.30-32 Alternatively, it can assume the role of an independent second opinion or safety net, flagging potentially overlooked regions to minimize false negatives and ensure quality control.33-35 Evidence strongly supports that this interaction model can yield measurable performance improvements, demonstrating the value of collaboration over fully autonomous operation. For instance, Lee et al.36 explored this concept in a retrospective reader study involving 30 radiologists and residents who evaluated 120 chest radiographs (60 containing malignant nodules) in a controlled simulation. They found that a human–AI interaction model can improve performance, although the extent of improvement depends on the quality of the AI system. A conceptual overview of diagnostic complementarity and human–AI team performance is shown in Figure 1.
In mammography screening, different AI workflows are being evaluated to reduce radiologist workload and improve cancer detection. One model is the “AI as supporting reader” workflow, proposed by Ng et al.37 in a large-scale simulation study. The evaluation was based on a retrospective, multisite screening cohort of more than 280,000 mammography examinations from two countries, using multiple vendors, and compared simulated AI-supported workflows with standard human double reading. In this model, the AI acts as the second reader only when it agrees with the first human reader; discordant cases are referred to a second human reader. This simulation indicated that the “AI as supporting reader” workflow could maintain screening performance while substantially reducing the number of cases requiring a second human reading by up to 87%.
In contrast to this simulation-based evaluation, a prospective study evaluated an “AI as additional reader” workflow in mammography screening.38 This implementation was conducted in routine clinical practice across multiple screening sites and evaluated tens of thousands of screening examinations using a commercially deployed AI system. Here, AI was used as a safety net after standard human double reading was complete: if the two human readers agreed not to recall the patient but the AI flagged the case as suspicious, it was referred to an arbitrator for final review. The study found that the additional reader workflow improved the cancer detection rate by 0.7–1.6 additional cancers per 1,000 women screened, with only a minimal increase in recalls. The additional cancers detected were primarily invasive (83.3%) and small (47.0% were 10 mm or less).
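The two mammography workflow rules described above can be expressed as simple decision functions. The sketch below is illustrative only: reader outputs are reduced to recall/no-recall booleans, the function names are hypothetical, and the human double-reading pathway is simplified.

```python
from enum import Enum

class Decision(Enum):
    NO_RECALL = "no_recall"
    RECALL = "recall"
    SECOND_HUMAN_READ = "second_human_read"
    ARBITRATION = "arbitration"

def supporting_reader(first_human_recall: bool, ai_recall: bool) -> Decision:
    """'AI as supporting reader': the AI replaces the second human reader only
    when it agrees with the first reader; discordant cases go to a second human."""
    if first_human_recall == ai_recall:
        return Decision.RECALL if first_human_recall else Decision.NO_RECALL
    return Decision.SECOND_HUMAN_READ

def additional_reader(reader1_recall: bool, reader2_recall: bool, ai_recall: bool) -> Decision:
    """'AI as additional reader': the AI acts as a safety net after human double
    reading; if both humans agree not to recall but the AI flags the case, it is
    referred to an arbitrator."""
    if reader1_recall or reader2_recall:
        return Decision.RECALL          # human double-reading pathway (simplified here)
    if ai_recall:
        return Decision.ARBITRATION     # AI-only flag triggers human arbitration
    return Decision.NO_RECALL

print(supporting_reader(first_human_recall=True, ai_recall=True))   # Decision.RECALL
print(additional_reader(False, False, ai_recall=True))              # Decision.ARBITRATION
```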
Radiologists, in a study by Zając et al.,39 envisioned triage workflows in which AI could pre-screen cases, for example, by filtering normal radiographs. This would allow senior radiologists to concentrate their efforts more efficiently on challenging cases with positive findings. In this cross-regional qualitative field study, which involved in situ observations and interviews with 18 radiologists across nine clinical sites in Denmark and Kenya, participants conceptualized AI-based case distribution as a potential tool to help clear reading backlogs during periods of high workload. The visions articulated by the participating radiologists focused on AI providing actionable support to help them work better or faster, rather than automating their tasks.
In a recent diagnostic study by the Prostate Imaging–Cancer AI Consortium,40 AI assistance was associated with improved diagnostic accuracy for clinically significant prostate cancer on magnetic resonance imaging (MRI) compared with unassisted readings. The study, which included 61 readers from 53 centers across 17 countries who assessed 360 MRI examinations, found that AI assistance was associated with a statistically significant increase in the area under the receiver operating characteristic curve, from 0.882 to 0.916. At a Prostate Imaging Reporting and Data System threshold of 3 or more, sensitivity improved from 94.3% to 96.8%, and specificity increased from 46.7% to 50.1%. This resulted in three additional true-positive diagnoses and 10 fewer false-positive diagnoses with AI assistance.
Taken together, these studies illustrate both convergent and contrasting approaches to human–AI collaboration that can be organized into a limited number of recurring workflow patterns. Representative workflow architectures for these interaction models are shown in Figure 2, which arranges common diagnostic and screening workflows along a spectrum from low to high AI autonomy. To complement this visual overview, Table 2 presents a workflow-level taxonomy of human–AI collaboration across this autonomy spectrum.
For this partnership to succeed, mutual adaptation is required.29, 41 The AI system must be iteratively optimized to incorporate human feedback, creating closed-loop learning that aligns algorithmic updates with expert judgment. Active learning frameworks operationalize this process by automatically identifying uncertain or informative cases for labeling by radiologists, enabling continuous model refinement while minimizing annotation burden.29, 42, 43 Conversely, radiologists must adapt their own practice by developing AI literacy to interpret model outputs, recognize failure modes, and calibrate trust according to task and context.6, 44 This bidirectional adaptation fosters appropriate reliance, preventing both overtrust and algorithmic aversion and ensuring that human oversight remains central as systems evolve.
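As a concrete illustration of uncertainty-based case selection, the following minimal sketch (with hypothetical names and a simple least-confidence criterion) queues the cases whose predicted probability lies closest to the decision boundary for radiologist labeling.

```python
from typing import Sequence

def select_for_annotation(case_ids: Sequence[str],
                          probabilities: Sequence[float],
                          budget: int = 10) -> list[str]:
    """Return the `budget` cases the model is least certain about
    (predicted probability closest to 0.5)."""
    uncertainty = {cid: abs(p - 0.5) for cid, p in zip(case_ids, probabilities)}
    return sorted(uncertainty, key=uncertainty.get)[:budget]

# The 0.48 prediction is the most uncertain, so it is queued for labeling first.
print(select_for_annotation(["a", "b", "c"], [0.97, 0.48, 0.03], budget=1))  # ['b']
```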
The considerations above can be operationalized as a set of design principles for human–AI collaboration in radiology, as outlined in Table 3.
Practical integration and workflow dynamics
Translating complementarity into clinical action requires strategic placement of AI within existing, often fast-paced operational workflows. AI systems interface with radiology at various stages, including image acquisition, triage, worklist prioritization, interpretation, and final reporting.45-48 The corresponding stages, along with human and AI tasks, are depicted in Figure 3.
AI can fundamentally affect the selection and prioritization of cases, for example, by providing prioritization cues or alerts through screening incoming studies for time-sensitive or high-suspicion findings, such as intracranial hemorrhage, pulmonary emboli, or pneumothorax.49-52 This AI-based triage has demonstrated substantial reductions in report turnaround times for critical cases. For example, one simulation showed that AI prioritization reduced the average reporting delay for critical chest radiograph findings from 11.2 to 2.7 days.53 Similarly, another simulation study found that AI significantly reduced the average report turnaround time for critical chest X-ray findings (e.g., pneumothorax, from 80.1 to 35.6 min), although it also noted that the maximum report turnaround time could increase without specific safeguards.51 In contrast to simulations, a real-world clinical deployment found that, for computed tomography pulmonary angiography examinations positive for pulmonary embolism, AI reprioritization significantly shortened the mean report turnaround time, from 59.9 to 47.6 min.54 Furthermore, based on qualitative field studies, radiologists envision that future AI systems could be configured to route cases by user expertise, for example, by directing studies with positive findings to senior radiologists while filtering normal studies for junior radiologists to verify.39
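A minimal sketch of such triage logic is shown below; the finding labels, abnormality score, thresholds, and routing rule are illustrative assumptions rather than recommendations from the cited studies.

```python
from dataclasses import dataclass, field
from datetime import datetime

URGENT_FINDINGS = {"intracranial_hemorrhage", "pulmonary_embolism", "pneumothorax"}

@dataclass
class Study:
    accession: str
    received: datetime
    ai_findings: set[str] = field(default_factory=set)
    ai_abnormality_score: float = 0.0   # 0 = confidently normal, 1 = clearly abnormal

def prioritize(worklist: list[Study]) -> list[Study]:
    """Urgent AI-flagged studies first, then by abnormality score, then first in, first out."""
    return sorted(
        worklist,
        key=lambda s: (
            0 if s.ai_findings & URGENT_FINDINGS else 1,
            -s.ai_abnormality_score,
            s.received,
        ),
    )

def route(study: Study, normal_threshold: float = 0.1) -> str:
    """Hypothetical routing rule: likely-normal studies to a junior verification queue,
    AI-positive studies to senior readers."""
    return "junior_verification" if study.ai_abnormality_score < normal_threshold else "senior_reading"

stat = Study("A2", datetime(2025, 1, 1, 8, 5), {"pneumothorax"}, 0.92)
routine = Study("A1", datetime(2025, 1, 1, 8, 0), set(), 0.04)
print([s.accession for s in prioritize([routine, stat])])  # ['A2', 'A1']
print(route(routine))                                      # junior_verification
```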
During image interpretation, AI can function as an interactive reporting assistant by automating highly manual tasks.48 For instance, AI can perform time-consuming measurements or calculations.55, 56 Additionally, AI can expedite the retrieval of historical examinations and automatically compare changes over time, highlighting relevant progression or new developments for the radiologist’s attention.57 Emerging vision–language models promise further assistance, enabling capabilities such as draft report generation and automated summarization of patient imaging history, highlighting key events from prior imaging records.58, 59 Early expert-rated vision–language model systems (e.g., Flamingo-CXR) produced clinically acceptable draft reports under constrained conditions, although rigorous evaluation and guardrails remain essential.59
Successful integration requires configurable AI tools to accommodate the varied needs of different clinical sites, local resources, and user expertise.39, 60-62 Interface design must prioritize utility and efficiency, ensuring seamless workflow integration to minimize cognitive disruption.58, 63 Clinicians generally prefer AI to be deployed as tool-based interactions for specific, functional tasks (e.g., quantification or data retrieval) rather than as open-ended, generalized conversational agents.58
Maintaining clinical authority requires that final radiological supervision remain an indispensable component of all AI-supported activities.11, 41, 64 This human oversight is imperative for managing medicolegal liabilities and ensuring patient safety. Therefore, AI systems must be designed with explicit mechanisms for manual oversight and decision arbitration.41 For instance, in sophisticated prostate MRI protocols, AI may highlight suspicious lesions or provide risk scores for cancer detection; however, the human radiologist must retain the ultimate authority to validate or override these AI predictions based on contextual clinical information.65 Similarly, in highly automated triage systems, although clear-cut cases may be filtered, all borderline or ambiguous examinations must be deferred to expert radiologists for final arbitration.6, 44, 65
Furthermore, AI can be leveraged for internal quality assurance without imposing undue cognitive load. A practical example involves using natural language processing to compare the radiologist’s transcribed report with image detection findings, triggering an alert only when a discrepancy or missed finding is identified.39 This approach provides seamless quality control by engaging the radiologist only when a potential error is detected, thereby avoiding workflow disruption in routine cases. However, the usefulness of such systems depends heavily on their performance, as excessive false positives or false negatives can undermine trust and disrupt workflow.
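The discrepancy-alert idea can be illustrated with a deliberately crude sketch in which keyword matching stands in for real natural language processing; the finding terms and negation handling are assumptions for demonstration only.

```python
NEGATIONS = ("no ", "without ", "negative for ")

def is_negated(report_text: str, term: str) -> bool:
    text = report_text.lower()
    return any(neg + term in text for neg in NEGATIONS)

def discrepancy_alerts(report_text: str, ai_findings: set[str]) -> set[str]:
    """Return AI-detected findings that the report either omits or explicitly negates."""
    text = report_text.lower()
    alerts = set()
    for finding in ai_findings:
        term = finding.replace("_", " ")
        if term not in text or is_negated(report_text, term):
            alerts.add(finding)
    return alerts

report = "Small right pleural effusion. Heart size normal."
print(discrepancy_alerts(report, {"pleural_effusion"}))  # set(): documented, no alert
print(discrepancy_alerts(report, {"pneumothorax"}))      # {'pneumothorax'}: possible missed finding
```

A production system would need far more robust negation and uncertainty handling; as noted above, excessive false alerts are exactly what undermines trust in such quality-assurance tools.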
Table 4 summarizes the characteristic interaction patterns, including their main advantages and risks, across commonly used human–AI collaboration configurations.
Cognitive and professional impacts
Integrating AI into clinical radiology fundamentally alters the cognitive processes, trust, and diagnostic reasoning of human practitioners. These dynamics introduce specific behavioral risks, such as overreliance, as well as interaction mismatches that arise when workflow design or model context diverges from clinical reality.
A major behavioral concern is the management of automation bias, which refers to the uncritical acceptance of automated outputs, even when those outputs are incorrect (Figure 4).66-68 This bias represents a considerable risk, particularly for users with lower subject matter expertise, such as radiology trainees, who are more prone to accepting AI recommendations.66, 67 For example, in a controlled mammography reader study, trainees’ diagnostic accuracy dropped from approximately 80% when the AI was correct to 20% when misled by incorrect AI output. The same study also found that when AI incorrectly suggested downgrading a correct finding (an error of omission), all experience levels, including experts, were equally susceptible to this bias.67
Studies show that the accuracy of the AI tool is a key factor. High diagnostic accuracy in an AI model is associated with improved radiologist performance and increased susceptibility to accepting AI suggestions.36 Conversely, flawed AI output can be detrimental. Randomized evidence shows that systematically biased AI predictions significantly reduced clinician diagnostic accuracy by approximately 9.1–11.3 percentage points compared with baseline performance.69 This negative impact persisted even when explanations were provided; the study found that AI-generated explanations did not mitigate the adverse effects of biased predictions. This aligns with separate findings that interpretability methods can be flawed, as saliency maps in medical imaging may localize pathology poorly, with one study reporting localization utility, measured as the area under the precision–recall curve, as low as 0.024–0.16 for some methods on chest radiographs.70
The opposite challenge, algorithmic aversion (mistrust) (Figure 4), can also impede successful integration.71 A lack of trust may arise from perceived inconsistencies in AI tools’ technical performance or concerns about frequent false positives or false negatives, which impose an additional burden on radiologists who must double-check AI interpretations.10 In a 2024 EuroAIM/EuSoMII survey, 47.2% of respondents anticipated an increased total reporting workload; other major barriers cited were costs (49.5%), legal issues (43.7%), and lack of validation (35.5%).12 This aversion may also be linked to ethical concerns regarding AI’s vulnerability to bias and discrimination.10, 71-73 Research has demonstrated that algorithmic bias can lead to unequal diagnostic performance across patient subgroups, thereby undermining clinician confidence. For example, a large study showed that deep learning chest radiograph models trained on heterogeneous hospital data systematically underdiagnosed disease in female and Black patients compared with other groups.74 In a large international survey of 1,041 radiologists and residents, 37% of respondents cited a “lack of trust in AI by stakeholders” as a hurdle to implementation; this view was independently and significantly more often observed among those working outside Europe (adjusted odds ratio: 1.77; 95% confidence interval: 1.24–2.53; P = 0.002).75
These behaviors can be conceptualized as a spectrum of reliance on AI, ranging from algorithmic aversion to automation bias, with optimal performance occurring at intermediate, calibrated reliance (Figure 5). Users may also shift along this spectrum over time when working with AI tools in clinical practice. For example, an expert radiologist’s initial algorithmic aversion may give way to automation bias as familiarity and confidence in the AI medical device increase, or as the user learns to rely excessively on model output while losing confidence in their own reading skills.
The widespread deployment of AI detection tools may accelerate deskilling and hinder skill acquisition, particularly in training contexts.49, 76 The introduction of tools designed to detect focal abnormalities, such as pulmonary nodules, intracranial hemorrhage, and pneumothorax, may unintentionally disrupt the training process required to acquire fundamental perceptual skills and efficient search patterns.49 Furthermore, reader studies have shown that incorrect AI prompts can modify diagnostic judgment.67 For instance, when AI automatically triages studies containing emergent findings to the top of a worklist, the immediate alert deprives the trainee of the opportunity to conduct an initial blinded evaluation of the imaging.49 Beyond perceptual skills, AI systems that offer contextual diagnostic suggestions or provide automated scores for standardized reporting systems could compromise mastery of foundational knowledge, such as learning complex differential diagnoses. If trainees become overly dependent on AI to identify findings, they may become more susceptible to automation bias, as studies show that inexperienced users are more vulnerable to following incorrect automated suggestions.67, 77
AI integration introduces an additional layer of cognitive workload and digital fatigue, contrary to expectations that it would uniformly reduce labor.63 A nationwide survey of 6,726 radiologists from 1,143 hospitals in China demonstrated a dose–response relationship between the frequency of AI use for image interpretation and work-related emotional exhaustion and burnout.78, 79 This unexpected burden is partly attributable to the effort required to review and dismiss frequent false-positive detections, a problem already well recognized with mammography computer-aided detection systems and now echoed in newer AI tools.63, 80 In a prospective study of 18,680 chest radiographs, AI use reduced overall reading time (13.3 s vs. 14.8 s) and clearly shortened reading times for studies without AI-detected abnormalities (10.8 s vs. 13.1 s). However, for cases with AI-detected abnormalities, reading times did not differ significantly (18.6 s vs. 18.4 s) and increased more steeply as abnormality scores rose.81 Taken together, these findings show that the workload impact of AI in radiology is highly context-dependent, being beneficial for some tasks, such as chest X-ray screening, but potentially burden-increasing in high-complexity settings or under conditions of high workload and low AI acceptance.
Survey data indicate that attitudes toward AI often differ by age and experience, although the patterns vary. One Italian survey found a U-shaped relationship, in which the youngest (< 30 years) and oldest (> 60 years) radiologists were the most optimistic, whereas a large international survey found that younger age was a positive predictor of a proactive attitude toward AI.11, 13 Younger radiologists and residents frequently report feeling inadequately informed about AI; in the Italian survey, 46% of younger members shared this sentiment.11 A Singaporean survey found that a majority (64.8%) of residents and faculty described themselves as novices in their understanding of AI and machine learning (ML), and 59.2% of respondents felt that their residency programs had not adequately implemented AI/ML teaching, despite strong interest in the topic.82, 83 This perceived gap in AI literacy is considered a factor inhibiting adoption; surveys suggest that limited AI knowledge is associated with fear of replacement, whereas intermediate to advanced knowledge correlates with a more positive attitude toward AI.11, 13
Professionals generally recognize that AI will necessitate an expansion of their roles, as evidenced by a 2024 EuroAIM/EuSoMII survey of 572 European Society of Radiology (ESR) members, in which 98% agreed that radiology teams should participate in the development and validation of AI tools, and 45% stated that radiologists should retain full responsibility for AI outputs influencing clinical decisions.12 However, skepticism remains regarding the delegation of high-risk functions, such as prognostication or complex treatment decisions, to AI. Surveys repeatedly show that radiologists favor AI as a second reader or workflow aid and insist that final image interpretation and clinical supervision remain their nondelegable responsibility.11, 13, 75, 83 Accordingly, several education and human–computer interaction studies have warned that if core interpretive and reporting tasks become heavily automated, AI may contribute to progressive deskilling of radiologists unless training and system design explicitly safeguard independent perceptual and decision-making skills.84-86
Governance, ethics, and responsible implementation
The complex behavioral and cognitive challenges posed by AI necessitate strict systemic responses, robust governance, and continuous oversight to ensure safe and responsible adoption.73, 87, 88
A critical concern in scenarios involving AI assistance is medicolegal liability for errors arising from joint human–AI decisions.41, 89-91 There is currently no transfer of liability to AI systems as long as the radiologist or clinician makes the final decision.92-95 Globally, experts affirm that final assessment and supervision of AI results by the radiologist are essential for managing legal risks and ensuring patient safety.11, 88, 96-98 Regulators, including the US Food and Drug Administration (FDA), treat AI-based diagnostic tools as medical devices whose potential harms include increased false-positive and false-negative rates and other incorrect outputs that can delay or misdirect care, leading to patient harm.41, 99 Therefore, institutional governance bodies must establish safeguards to prevent patient harm, especially when deploying high-risk applications, such as screening tools for healthy populations or, in the future, models that might evolve toward treatment support roles.87, 93 Institutional policies must also address ethical considerations, such as patient consent and the potential misuse of data for other purposes.73, 87, 91, 97
Effective AI implementation requires formal governance structures to guide the entire life cycle of clinical AI, encompassing evaluation, procurement, and ongoing support.28 Daye et al.41 describe radiology-led, enterprise-level, and hybrid AI governance committees that oversee the selection, implementation, and continuous monitoring of imaging AI tools within large health systems. These governing bodies should be interdisciplinary, integrating clinical, technical, and governance expertise, including ethics and regulatory perspectives, as recommended by radiology AI governance frameworks that emphasize multidisciplinary teams and shared decision-making.41, 100 Importantly, for such shared decision-making to be meaningful, governance models must also clearly delineate professional accountability for AI-informed clinical actions, as influence without responsibility risks undermining trust and patient safety.
Before any AI tool is deployed, it must undergo a rigorous assessment covering its clinical value, efficacy (benchmarked against average radiologist performance), technical readiness, and ethical implications.41, 93, 101-103 Multiple radiology-specific evaluation frameworks now formalize these dimensions, including the methodological guide by Park et al.,103 the ECLAIR guidelines for commercial tools,104 and the RADAR deployment and assessment rubric.105 Implementation should ideally follow a phased approach, beginning with shadow deployment, in which the AI runs in the background without influencing reports, followed by tightly scoped pilot deployment before full rollout.105, 106
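The phased approach can be enforced with a simple deployment gate, sketched below under assumed phase names and an assumed hash-based pilot enrollment rule.

```python
import hashlib
from enum import Enum

class DeploymentPhase(Enum):
    SHADOW = "shadow"   # AI runs silently; outputs logged for offline comparison only
    PILOT = "pilot"     # AI results shown for a limited, predefined subset of studies
    FULL = "full"       # AI results shown for all eligible studies

def show_ai_output(phase: DeploymentPhase, study_id: str, pilot_fraction: float = 0.10) -> bool:
    """Gate that decides whether the AI result is displayed to the reader."""
    if phase is DeploymentPhase.SHADOW:
        return False
    if phase is DeploymentPhase.PILOT:
        # stable hash keeps the pilot cohort reproducible across sessions
        bucket = int(hashlib.sha256(study_id.encode()).hexdigest(), 16) % 100
        return bucket < pilot_fraction * 100
    return True
```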
Continuous post-market surveillance and monitoring are crucial for detecting performance degradation or drift after implementation, which can occur due to changes in patient populations, disease prevalence, acquisition protocols, or data pipelines.71, 107-111 A robust monitoring plan must track established metrics and include mechanisms for early intervention if performance declines, as emphasized in radiology-specific monitoring frameworks and quality assurance proposals.105, 112, 113 Recently, the ESR published consensus recommendations clarifying that although legal responsibility for post-market surveillance lies with software providers, radiologists (acting as clinical deployers) are expected to actively contribute to the ongoing monitoring of AI safety and performance in routine practice, including output oversight, incident reporting, and structured clinical feedback.114 For algorithms designed for continuous learning, adherence to regulatory guidelines—including a Predetermined Change Control Plan for anticipated updates—is critical, as reflected in recent FDA guidance for AI/ML-enabled medical devices and in radiology AI governance statements.28, 29, 115, 116 Collaboration among radiologists, AI scientists, and information technology staff is necessary for continuous quality control, as real-world implementation studies consistently show that sustained AI performance depends on this joint clinical–technical oversight.41, 106, 112
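A monitoring plan of this kind can be as simple as a rolling agreement check between AI output and the final report, as in the sketch below; the metric, window size, and alert threshold are illustrative choices that would need local definition rather than values drawn from the cited frameworks.

```python
from collections import deque

class DriftMonitor:
    def __init__(self, window: int = 500, alert_threshold: float = 0.85):
        self.outcomes = deque(maxlen=window)   # 1 = AI agreed with the final report
        self.alert_threshold = alert_threshold

    def record(self, ai_positive: bool, report_positive: bool) -> None:
        self.outcomes.append(1 if ai_positive == report_positive else 0)

    def agreement_rate(self) -> float:
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 1.0

    def needs_review(self) -> bool:
        # only alert once the rolling window is full enough to be meaningful
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.agreement_rate() < self.alert_threshold
```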
To cultivate appropriate trust and counter bias, AI tools must provide transparency.117-119 Explainable AI systems aim to offer interpretive assistance; however, current user-level explanation tools, such as saliency maps (heat maps), have repeatedly been shown to be unstable and only weakly aligned with radiologists’ localization needs, making them insufficient as the primary interface for human–AI interaction.70, 120-122 Poorly articulated or nonsensical explanations can erode trust, whereas clear explanations aligned with established clinical reasoning may increase trust.123-127 Similarly, unreliable explanations may promote algorithmic aversion, whereas overly persuasive ones may increase automation bias, illustrating how explainability, trust, and user behavior are closely interconnected. Beyond post hoc explanations, clinically useful AI systems should expose calibrated confidence or uncertainty estimates so that radiologists can preferentially scrutinize low-confidence cases and more readily detect potential AI errors.128-131
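One standard way to check whether exposed confidence scores are actually calibrated is a reliability computation such as the expected calibration error, sketched below; the binning scheme is a common convention rather than a requirement from the cited work.

```python
def expected_calibration_error(probs: list[float], labels: list[int], n_bins: int = 10) -> float:
    """Bin predictions by confidence and compare mean predicted probability with
    the observed positive rate in each bin; lower means better calibrated."""
    bins: list[list[tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    ece, total = 0.0, len(probs)
    for b in bins:
        if not b:
            continue
        mean_conf = sum(p for p, _ in b) / len(b)
        observed = sum(y for _, y in b) / len(b)
        ece += (len(b) / total) * abs(mean_conf - observed)
    return ece

# Perfectly calibrated toy example: confidences match observed outcomes, so ECE is 0.0.
print(expected_calibration_error([1.0, 1.0, 0.0, 0.0], [1, 1, 0, 0]))
```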
AI literacy, defined as the competency to critically evaluate and collaborate with AI systems, remains a major barrier to safe integration and is also mandatory for deployers within the European Union under the AI Act.10, 132, 133 To address this, education must begin at the undergraduate level, as radiologists and radiation technologists—including residents and bachelor-level graduates—express a strong desire to enhance their AI and ML knowledge for practice improvement.10, 83, 134 Educational frameworks should be stratified by role (e.g., foundational, clinical user, and expert) to enable professionals to understand algorithmic principles and safety concerns appropriate to their scope of practice.135 Several recent initiatives exemplify this structured approach. For instance, a multi-society collaboration (AAPM, ACR, RSNA, and SIIM) has released a comprehensive syllabus detailing competencies across different personas, from general users to purchasers and developers.135 Similarly, practical implementation frameworks have been proposed, including a five-step model for integrating AI curricula into residency programs and condensed workshops focused on foundational literacy rather than technical proficiency, which have been shown to considerably improve resident confidence.136, 137
Finally, involving radiologists and radiation technologists in co-design efforts is vital to ensure that AI solutions address genuine clinical needs and integrate seamlessly with existing workflows.10, 138 This collaboration aims to foster a symbiotic relationship with the technology, ensuring that standardized processes align machine-recommended procedures with professional judgment.
Emerging technologies and future directions
Recent advances in foundation models, particularly vision–language models, have extended the boundaries of human–AI interaction in radiology.139-141 These systems combine image understanding and language generation to enable functions such as report drafting, segmentation, classification, image retrieval, and longitudinal case summarization.59, 141-146 However, current vision–language models trained on general data remain limited in domain-specific reasoning and often underperform in specialized perception tasks.144, 145, 147, 148 Therefore, their immediate utility is expected in constrained, task-specific roles—such as structured summarization, quantitative measurement, and retrieval—rather than open-ended conversational support.39, 58 Qualitative studies suggest that radiologists and clinicians tend to prefer workflow tools (such as tool buttons or alerts) embedded within their reporting environment over general free-text conversational assistants, citing time constraints and a lack of trust in open-ended chat systems.58
Physician-in-the-loop active learning, which facilitates interactive and continuous model improvement (Figure 6), aims to enhance physician–AI interaction and collaboration.29 These frameworks allow radiologists to iteratively refine models through feedback collected during routine practice, with updates performed under predefined change control protocols and independent validation. Such designs support regulatory compliance and improve model adaptability and generalizability, although challenges such as annotation variability must be carefully managed to preserve data integrity.
To further strengthen human–AI collaboration and trust, models should incorporate both uncertainty quantification and transparent reasoning mechanisms.119, 149-151 Uncertainty-aware systems can guide role arbitration by allowing AI to handle clear, low-ambiguity cases while deferring complex or equivocal findings to expert review.149-151 For explainability to be effective, it must provide human-centered, decision-relevant feedback; this includes not only visualizations linking predictions to evidential image regions but also calibrated measures of model confidence and uncertainty.119, 149
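Such uncertainty-guided role arbitration reduces, in its simplest form, to a thresholded deferral rule, as in the sketch below; the thresholds are hypothetical and would require local validation before any clinical use.

```python
def arbitrate(probability: float, low: float = 0.05, high: float = 0.95) -> str:
    """Route a case based on the model's calibrated probability of a finding."""
    if probability <= low:
        return "ai_suggests_normal"        # still verified by the radiologist
    if probability >= high:
        return "ai_suggests_finding"       # highlighted for radiologist confirmation
    return "defer_to_expert_review"        # equivocal: no AI suggestion is shown

print(arbitrate(0.02))   # ai_suggests_normal
print(arbitrate(0.50))   # defer_to_expert_review
```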
The near-term priority is not the development of larger models but the implementation of effective human–AI collaboration through trustworthy, auditable, and workflow-embedded systems that demonstrably enhance collective diagnostic accuracy, efficiency, and safety. Future research should focus on evaluating team performance metrics, workload implications, and long-term cognitive effects in prospective, multi-institutional settings.
Final remarks
Radiology is no longer debating whether AI will replace radiologists but rather how to structure accountable and effective human–AI partnerships. The evidence reviewed here demonstrates that performance gains are fragile when workflow integration, cognitive effects, and governance are neglected. Robust collaboration requires physician-in-the-loop design, calibrated trust, continuous monitoring, and explicit protection of training pathways and professional autonomy. Future work should prioritize prospective, multi-institutional studies of team performance, workload, equity, and long-term learning outcomes rather than isolated accuracy metrics. Under these conditions, AI can evolve from an opportunistic add-on into core clinical infrastructure that strengthens the safety and reliability of imaging care.