ABSTRACT
Artificial intelligence (AI) continues to change paradigms in medicine with new applications suited to daily practice. Ultrasonography, which has been developing since the 1950s and remains one of the most powerful diagnostic tools, is also a subject of AI studies despite its unique problems. It is predicted that many operations, such as appropriate diagnostic tool selection, use of the most relevant parameters, improvement of low-quality images, automatic lesion detection and diagnosis from the image, and classification of pathologies, will be performed using AI tools in the near future. With convolutional neural networks in particular, successful results can be obtained for lesion detection, segmentation, and classification from images. In this review, relevant developments are summarized based on the literature, and examples of the tools used in the field are presented.
Main points
• Artificial intelligence (AI) can assist with difficult, time-consuming processes that demand accuracy, such as image quality improvement, lesion detection, segmentation, and classification, in ultrasonography examinations.
• The main challenges of AI studies in ultrasonography are the real-time nature of the examination, its dependency on the user, and the abundance of noise sources in the image.
• AI, as a new field with its own terminology and knowledge, requires multidisciplinary collaboration, and radiologists need to take on the necessary roles in field-specific data management.
Ultrasonography has been an effective diagnostic tool since the 1950s. As a cross-sectional imaging method, its outstanding strengths are its real-time evaluation capacity and the absence of ionizing radiation. Over the years, higher-quality images have been obtained, and the use of the method has expanded with applications such as Doppler and power Doppler imaging, tissue harmonic imaging, contrast-enhanced ultrasonography, three-dimensional (3D) and four-dimensional imaging, and elastography. Significant gains have also been made in the clinical field with applications such as transrectal and intraoperative ultrasonography.
However, there are still various problems in ultrasonography imaging that can make diagnoses challenging. The main problems are the noise and artifacts that occur in imaging.1 Artifacts usually arise from the transmission and reflection behavior of sound waves in tissues and from sampling problems.2 Noise sources are also varied. The main types of noise observed in ultrasound images are salt-and-pepper noise, Poisson noise, Gaussian noise, and speckle. Salt-and-pepper noise, also called random or impulse noise, appears as isolated pixels whose color and intensity differ sharply from those of neighboring pixels because of sudden signal changes. Speckle is caused by the interference of reflected and returning sound waves with other waves and produces distortions in the image that can result in diagnostic errors. Gaussian noise, also known as electronic noise, originates in the device. Poisson noise is also caused by the electronic system. Various filter methods have been developed to eliminate these problems and improve image quality.3
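The noise types described above can be illustrated with a minimal simulation. The sketch below is an assumption-laden toy, not a method taken from the cited studies: the image is a plain list of pixel rows, speckle is approximated as multiplicative Gaussian noise (a common simplification), Poisson noise is omitted for brevity, and a 3x3 median filter stands in for the "various filter methods". All function names are hypothetical.

```python
import random

def add_salt_and_pepper(image, prob, rng):
    """Impulse noise: each pixel becomes pure black (0) or white (255) with probability `prob`."""
    noisy = [row[:] for row in image]
    for row in noisy:
        for x in range(len(row)):
            if rng.random() < prob:
                row[x] = rng.choice((0, 255))
    return noisy

def add_gaussian(image, sigma, rng):
    """Additive, signal-independent noise: pixel + N(0, sigma)."""
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in image]

def add_speckle(image, sigma, rng):
    """Multiplicative noise (simplified speckle model): pixel * (1 + N(0, sigma))."""
    return [[p * (1.0 + rng.gauss(0.0, sigma)) for p in row] for row in image]

def median_filter3(image):
    """3x3 median filter; border pixels are copied unchanged."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = sorted(image[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out

def mean_abs_error(a, b):
    """Average absolute pixel difference between two equally sized images."""
    total = sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return total / (len(a) * len(a[0]))

if __name__ == "__main__":
    rng = random.Random(0)
    clean = [[100] * 32 for _ in range(32)]
    noisy = add_salt_and_pepper(clean, 0.03, rng)
    print("error before filtering:", mean_abs_error(noisy, clean))
    print("error after filtering: ", mean_abs_error(median_filter3(noisy), clean))
```

A median filter suits impulse noise because an isolated outlier never reaches the middle of the sorted window; multiplicative speckle generally calls for other filters.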
In addition to the problems in image quality, as in all radiological examinations, we need to detect, differentiate, and define the lesions and specify which pathology they are associated with. After diagnosis, procedures such as classification and staging are required. However, because the ultrasonography examination is performed in real time and is user dependent, it poses problems specific to this modality. This can lead to major differences in pathology detection, identification, and diagnosis. There are also image quality differences between devices from different manufacturers, and the sensitivity with which the same pathology is observed may vary between devices.4
With the artificial intelligence (AI) studies that have developed in recent years, a new opportunity has emerged to solve the image quality and diagnostic problems described above. This opportunity arose when deep learning, one of the machine-learning methods that form a sub-element of AI, began to use convolutional neural networks. In addition, the data obtained are very large scale, and improvements in computing speeds accelerate this process.5
Machine learning aims to produce an output for a problem that needs to be solved by analyzing the data at hand and making sense of the variables in the data using different techniques.5 In this way, machines can readily perform operations that are complex, time-consuming, or beyond the sensitivity of human observers. In recent years, with the methods that have been developed, deep learning has started to replace conventional machine learning. Machine learning requires a training process, similar to education in humans. For this purpose, appropriate training sets are prepared where necessary. Three types of learning methods are used: supervised, unsupervised, and reinforcement learning. The most common among them is supervised learning. In supervised learning, the preparation of the training set and the labeling of the data content are carried out by an expert. Thus, a gold standard is prepared for the machine, which is called ground truth in the field of machine learning.5 For example, for an AI study intended to detect thyroid nodules in ultrasonography images, it is necessary to draw the nodule boundaries in a certain number of images and to define the content and boundary properties that indicate whether the lesion is benign or malignant. Supervised learning is very useful in operations such as classifying lesions, characterizing them, and comparing their similarities. As might be expected, these operations require a large amount of time-consuming labeling, and the labels may contain serious errors. Therefore, unsupervised learning methods have been developed.6,7
In unsupervised learning, attributes, patterns, and clusters in a data set can be extracted through generated pathways (algorithms) without the need to label the data. Attributes are the descriptive properties that define the organ or lesion to be distinguished, such as the size, shape, edge properties, internal structure, and echogenicity of the lesion. Reinforcement learning is a unified and dynamic form of these two methods, in which learning is performed with continuous positive or negative feedback.
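As an illustration of extracting clusters from unlabeled attributes, the following is a minimal sketch of the K-means algorithm. The two attributes per lesion (for instance, a size and an echogenicity score) and all numeric values are invented for demonstration, and the starting centroids are passed in explicitly so that the run is deterministic; none of this comes from the cited studies.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two attribute vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, initial_centroids, max_iter=100):
    """Cluster `points` without labels; returns (cluster index per point, centroids)."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

if __name__ == "__main__":
    # Hypothetical (attribute_1, attribute_2) pairs forming two loose groups.
    lesions = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
               (10.0, 10.0), (10.2, 9.8), (9.9, 10.1)]
    labels, centroids = kmeans(lesions, initial_centroids=[(0.0, 0.0), (5.0, 5.0)])
    print(labels)
    print(centroids)
```

No expert labels are supplied at any point; the grouping emerges from the attribute values alone, which is what distinguishes this from supervised learning.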
Several pathways are available, depending on the learning method selected. For supervised learning, a method such as support vector machines, logistic regression, naive Bayes, random forest, K-nearest neighbor, or decision tree is chosen. Clustering methods, self-organizing maps, principal component analysis, and K-means are used in unsupervised learning. Choosing the right method to provide the most appropriate solution is important for success.8,9
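To make the supervised case concrete, here is a minimal sketch of one of the listed methods, K-nearest neighbor. The training set plays the role of the expert-labeled ground truth; the two attributes (a size in millimeters and a margin irregularity score) and every value are hypothetical and carry no clinical meaning.

```python
from collections import Counter

def squared_distance(a, b):
    """Squared Euclidean distance between two attribute vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_predict(training_set, query, k=3):
    """Return the majority label among the k labeled examples closest to `query`."""
    nearest = sorted(training_set, key=lambda item: squared_distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical expert-labeled examples: ((size_mm, margin_irregularity), label).
TRAINING_SET = [
    ((5.0, 0.10), "benign"),
    ((6.0, 0.20), "benign"),
    ((7.0, 0.15), "benign"),
    ((20.0, 0.80), "malignant"),
    ((22.0, 0.90), "malignant"),
    ((25.0, 0.70), "malignant"),
]

if __name__ == "__main__":
    print(knn_predict(TRAINING_SET, (6.0, 0.10)))
    print(knn_predict(TRAINING_SET, (23.0, 0.80)))
```

In this toy the size attribute dominates the distance because the attributes are not scaled; a real pipeline would normally normalize attributes before comparing them.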
Deep learning, an important area of machine learning, uses multilayered artificial neural networks. These networks, which imitate natural neural networks, evaluate the input data at every node encountered on the network in terms of compliance with the gold standard obtained from the training set. If a threshold value is exceeded, the information is passed on as ready-to-use input for other pathways in the network.
Because the layers are 3D, the data follow a nonlinear path through hidden layers. In multilayered convolutional neural networks, attributes in the input are first collected in a representative pool. Attribute information that reaches the pool from different paths of the network is collected in new pools and, after being progressively filtered by threshold values in the network, as in neurons, reaches a fully connected layer. After the classification has been made, the output (action) and gain are obtained. This flow in deep learning is schematized in Figure 1. Convolutional networks can be used successfully for purposes such as classification, lesion detection, and segmentation.
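The flow described above (feature extraction, thresholding, pooling, and a fully connected layer followed by classification) can be sketched as a single forward pass. This is a toy with hand-picked, untrained weights, written only to show the order of operations; the rectified linear unit stands in for the threshold behavior, and the edge-detecting kernel, layer sizes, and function names are assumptions rather than any architecture from the cited studies.

```python
import math

def conv2d(image, kernel):
    """'Valid' 2-D convolution (no padding, stride 1) of a 2-D list with a 2-D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]

def relu(feature_map):
    """Threshold step: values below zero are suppressed."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool2(feature_map):
    """2x2 max pooling with stride 2: keeps the strongest response in each block."""
    return [[max(feature_map[y][x], feature_map[y][x + 1],
                 feature_map[y + 1][x], feature_map[y + 1][x + 1])
             for x in range(0, len(feature_map[0]) - 1, 2)]
            for y in range(0, len(feature_map) - 1, 2)]

def fully_connected(features, weights, biases):
    """One output score per weight row: dot(weights_row, features) + bias."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(image, kernel, weights, biases):
    """Convolution -> threshold -> pooling -> fully connected -> class probabilities."""
    pooled = max_pool2(relu(conv2d(image, kernel)))
    flat = [v for row in pooled for v in row]
    return softmax(fully_connected(flat, weights, biases))

if __name__ == "__main__":
    # 6x6 toy image with a vertical edge: dark left half, bright right half.
    image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
    edge_kernel = [[-1, 0, 1]] * 3           # responds to dark-to-bright transitions
    weights = [[0.25] * 4, [-0.25] * 4]      # two classes: "edge" vs "no edge"
    print(forward(image, edge_kernel, weights, [0.0, 0.0]))
```

In a trained network the kernel and the fully connected weights are learned from the labeled training set rather than chosen by hand.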
Since there are hidden layers in deep-learning paths, and it is not known how feature extraction is achieved, this process has been likened to a black box. This created a credibility problem and led to new studies called “explainable AI” in the field of AI.10 Most of these studies are ongoing and have not yet reached full maturity.
Different elements play a role in a successful AI application. The most important of these is large, well-labeled data sets. There are various limitations when it comes to ultrasonography. Apart from automatically obtained 3D breast ultrasound scans, ultrasonography images are obtained in real time, with the image quality, cross-section angle, and probe type preferred by the user. In daily practice, most images are not archived, and only selected sample images are stored. This greatly undermines image standardization and prevents the creation of training sets that reflect real life. Differences between devices from different companies create another obstacle. The spatial resolution of ultrasonography images is low, and artifacts can be very numerous. Therefore, serious preliminary image improvement work is required for reliable feature extraction in AI studies related to ultrasonography.11
Various methods have been tried to obtain an artifact-free ultrasonography image. One of the methods developed for this is real-time spatial compound imaging.12 In this method, special transducers are used that can acquire sections of the imaged object from different angles at the same time by steering the sound beams. In general, the average of the three to nine sections obtained is taken as the real-time representative image. Speckle, clutter, and other acoustic artifacts are significantly reduced in these images. Another method of image enhancement is harmonic imaging. This method uses the harmonics, which are multiples of the transmitted (fundamental) frequency and arise from the nonlinear propagation of sound waves through body tissues, and it reduces reverberation and side-lobe artifacts. A clearer appearance of cysts, improvements in the signal-to-noise ratio, and better results are achieved, especially in overweight patients.13
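The effect of averaging several sections can be shown with a small simulation. This sketch assumes an idealized case in which each angle yields the same underlying tissue value corrupted by independent multiplicative noise; in real compound imaging the frames are only partially decorrelated, so the gain is smaller. The frame count of nine matches the upper end of the range quoted above; everything else is hypothetical.

```python
import random
import statistics

def simulate_frame(true_value, n_pixels, speckle_sigma, rng):
    """One acquisition angle: uniform tissue with independent multiplicative noise."""
    return [true_value * (1.0 + rng.gauss(0.0, speckle_sigma)) for _ in range(n_pixels)]

def compound(frames):
    """Pixel-wise average of frames acquired from different angles."""
    return [sum(pixels) / len(pixels) for pixels in zip(*frames)]

if __name__ == "__main__":
    rng = random.Random(0)
    frames = [simulate_frame(100.0, 2000, 0.3, rng) for _ in range(9)]
    print("noise (std) in a single frame:  ", statistics.pstdev(frames[0]))
    print("noise (std) in the compound image:", statistics.pstdev(compound(frames)))
```

Under the independence assumption, averaging N frames lowers the noise standard deviation by roughly a factor of the square root of N, so nine frames give about a threefold reduction.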
General image enhancement tools have also been used to improve the quality of ultrasonography images. In addition to filters and iterative back-projection methods, studies have been conducted in this field using machine-learning or deep-learning methods.14,15 One focus in studies on ultrasonography has been pathways for detecting acoustic shadows. Geometric and statistical methods have been tried for this.16 In these studies, the inability to prepare high-quality training sets remains the largest obstacle.
Ultrasonography image analysis studies fall into three groups: classification, detection, and segmentation.14 Classification is used to select the most suitable sections from the numerous section images acquired or to separate the attributes extracted in deep-learning studies. The five basic methods used in classification are logistic regression, naive Bayes, K-nearest neighbors, decision tree, and support vector machines.5
Image detection aims to distinguish between anatomical formations or pathological findings. Effective segmentation is required for these studies, after which classification paths are used. This creates a path to diagnosis. Segmentation has formed the basis of computer-aided diagnosis. The main methods used in image segmentation are summarized in Table 1.17
As presented in Table 1, the methods have advantages and weaknesses, and the performance levels are generally increased with the combination or successive use of these methods. Significant gains have been made in this field through the utilization of learning systems.18 Algorithms used in the segmentation studies performed using convolutional neural networks and their descriptive properties are listed in Table 2.18,19,20,21,22,23
In deep learning, architectural models are developed by different researchers for convolutional neural networks and subsequent classification algorithms. Examples include the residual neural network, visual geometry group (VGG), auxiliary classifier generative adversarial network (GAN), and neuro-fuzzy system.24,25,26,27
Clinical applications
After many years of studies, products that offer solutions based on machine learning and AI have started to enter clinical practice. One of the areas where studies have intensified is the diagnosis and classification of thyroid nodules. There are also products related to the breast and bladder. In addition, most US Food and Drug Administration-approved products are in the field of cardiac ultrasound. Table 3 provides examples of these products.11
Many publications describe deep-learning methods and applications based on convolutional networks whose diagnostic sensitivity and specificity are equivalent to or better than those of radiologists.28,29,30,31 In a study of 589 thyroid nodules, 396 of which were malignant and 193 of which were benign, Ko et al.30 found that the area under the curve (AUC) for diagnosing thyroid malignancy was 0.805–0.860 for radiologists and 0.845, 0.835, and 0.850 for three convolutional neural networks, respectively. According to the results of this study, there was no significant difference in the AUC between radiologists and convolutional neural networks.30 In a retrospective multi-cohort study conducted by Li et al.28 using ultrasound images obtained from three different hospitals, models trained with a set of 131,731 ultrasound images with thyroid cancer were compared with the diagnoses of 16 radiologists. When the results of the models and radiologists were compared on the images from the three different centers, the sensitivity was between 84.3% and 93.4% for the models, whereas it was 89.0%–96.9% for the radiologists. Specificity ranged from 86.1% to 87.8% for the models versus 57.1% to 68.6% for the radiologists.28
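For readers less familiar with the metrics quoted above, the following minimal sketch shows how sensitivity, specificity, and the AUC are computed from a model's outputs. The labels and scores are invented toy values, not data from the cited studies, and the function names are hypothetical. The AUC is computed through its rank interpretation: the probability that a randomly chosen malignant case receives a higher score than a randomly chosen benign case.

```python
def sensitivity_specificity(labels, predictions):
    """labels/predictions use 1 for malignant and 0 for benign."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

if __name__ == "__main__":
    labels = [1, 1, 1, 0, 0, 0]                 # toy ground truth
    scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]     # toy malignancy scores
    predictions = [1 if s >= 0.5 else 0 for s in scores]
    print(sensitivity_specificity(labels, predictions))
    print(roc_auc(labels, scores))
```

Sensitivity and specificity depend on the chosen decision threshold (0.5 here), whereas the AUC summarizes performance across all thresholds, which is why studies often report both.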
Studies that predict the prognosis of a lesion and support an appropriate decision for biopsy are also promising. Ultrasonography devices have started to be produced that analyze a thyroid nodule, score it according to the American College of Radiology Thyroid Imaging Reporting and Data System criteria, and provide these data to the radiologist before the decision.32
Studies on breast lesions are also of interest to researchers. Benign-malignant differentiation of lesions in the breast can be successfully made with the help of convolutional networks. Fujioka et al.33 retrospectively gathered 480 images of 96 benign breast masses and 467 images of 144 malignant breast masses for training data. The deep-learning model was constructed using the convolutional neural network architecture GoogLeNet, and three radiologists interpreted the test data. The convolutional neural network model and radiologists had a sensitivity of 0.958 and 0.583–0.917, specificity of 0.925 and 0.604–0.771, and accuracy of 0.925 and 0.658–0.792, respectively. The convolutional neural network model had an equal or better diagnostic performance compared with the radiologists (AUC: 0.913 and 0.728–0.845, P = 0.010–0.140).33
Attributes obtained with convolutional networks can be analyzed with classification methods, such as VGG and support vector machines, and matched with the Breast Imaging Reporting and Data System criteria. With the models developed, lymph node metastasis can be predicted from ultrasound images.34 Different researchers have used the methods presented in Table 2 to segment breast lesions, with some success. In one of these studies, researchers used the Multi-U-Net segmentation pathway to segment suspicious breast masses in ultrasonography with high performance.35 Another group of researchers achieved better results than other segmentation methods by automatically segmenting ultrasound images using a GAN architecture.36
Similar studies continue to be conducted to evaluate malignancy in cystic and solid masses of the ovaries, detect prostate cancer, evaluate ultrasound images of liver masses, and segment kidney masses.37,38,39,40,41 Schmauch et al.40 trained an algorithm on a data set proposed during a data challenge. The data set was composed of 367 two-dimensional ultrasound images from 367 individual livers, captured at various institutions. Their model reached mean receiver operating characteristic curve–AUC scores of 0.935 for focal liver lesion detection and 0.916 for focal liver lesion characterization over three shuffled three-fold cross-validations performed using the training data.40
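The evaluation scheme mentioned above, repeated shuffled three-fold cross-validation, can be sketched as follows. Only the sample count (367) and the three-by-three layout come from the text; the splitting code used in the cited study is not described there, so the function below and its seeding scheme are assumptions.

```python
import random

def shuffled_kfold(n_samples, n_folds, seed):
    """Yield (train_indices, validation_indices) for one shuffled k-fold pass."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal groups.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

if __name__ == "__main__":
    # Three repeats, each with a different shuffle, of three-fold cross-validation.
    for repeat in range(3):
        for fold, (train, validation) in enumerate(shuffled_kfold(367, 3, seed=repeat)):
            print(f"repeat {repeat}, fold {fold}: "
                  f"{len(train)} training / {len(validation)} validation images")
```

Each image is used for validation exactly once per repeat, and the reported score is the mean over all nine train and validation runs. Because this data set has one image per liver, splitting by image also keeps patients separate; with several images per patient, the split would need to be made by patient instead.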
Vascular evaluation through ultrasound is another area of interest to researchers. A success rate of up to 90% has been achieved with convolutional networks in the evaluation of images of carotid vascular walls and plaques and in the segmentation of the lipid core, fibrous cap, and calcified sections in plaques.42 With carotid intima-media thickness classification, the early diagnosis of atherosclerosis will become possible. In the same way, models that predict stroke have been created.43 There are also studies on the detection and segmentation of deep vein thrombi.44
Segmenting organs or structures through selected areas or the entire image in the field of obstetric ultrasonography can contribute to diagnoses. Therefore, a considerable amount of research has been conducted on this subject. The cropping–segmentation–calibration method provided in Table 2 is one of the most commonly used tools in this field. In moving structures, such as heart valves, this model can produce very successful segmentation.21 Software has also been developed that uses the same methods to take automatic fetal measurements, such as abdominal circumference and femur length, as well as amniotic fluid volume and placenta volume.45,46,47 With these measurements, many clinical studies, especially those predicting pregnancy complications, become possible. As mentioned above, the biggest challenge in these studies is the lack of a standard in image quality and section level and the potential for errors in the data set to be processed. It is thought that deep-learning networks can help to overcome this problem through the selection of suitable series. The automatic detection of congenital anomalies is one of the most anticipated developments in this field. If a large enough data set becomes available, significant developments can be expected in the near future.
In conclusion, with the widespread use of AI, several gains may be expected: increased diagnostic accuracy, reliable support for radiologists in clinical decisions, a reduced workload and greater efficiency for radiologists, and new opportunities for patients with limited access to health care. To be part of this process, all parties must be included in multidisciplinary working groups, and highly accurate algorithms must be developed by creating excellent data sets.