Deep learning (DL) has revolutionized image recognition and analysis, enabling unprecedented performance leaps between 2010 and 2014. These rapid advances made it possible to develop automated, accurate, accessible, and cost-effective image recognition AI for medical diagnostics.
Major technical advancements over the past decade have laid the groundwork for commercial growth – IDTechEx forecasts the market for AI-enabled image-based medical diagnostics to exceed $3bn by 2030.
This article assesses the technological status of image recognition AI in medical diagnostics today, detailing its current capabilities and value proposition. The article also identifies key technological issues currently limiting the uptake of image recognition AI and discusses opportunities for improvement. It draws from IDTechEx’s report ‘AI in Medical Diagnostics 2020-2030: Image Recognition, Players, Clinical Applications, Forecasts’.
Today, an algorithm's accuracy is its number-one selling point. Companies that achieve super-human accuracy possess a crucial competitive advantage. The speed of the algorithm is also key – in fact, many companies base their marketing strategy on how much analysis time their AI saves. However, if the AI software cannot produce an assessment faster than an experienced radiologist, its value decreases greatly. IDTechEx interviewed over 15 industry experts and found that fifteen minutes is generally the cut-off point beyond which the AI algorithm ceases to offer significant value to a radiologist.
Due to the fierce competition in this emerging field, marketing teams in AI companies have a tendency to overhype the capabilities of AI to boost its appeal. The truth is that, whilst AI has the potential to revolutionize the disease diagnosis process, its current value proposition remains below the expectations of most radiologists.
AI performance in medical diagnostics is often measured by assessing sensitivity, the algorithm’s ability to correctly identify those with the disease (true positive rate), and specificity, its ability to correctly identify those without the disease (true negative rate). IDTechEx gathered the performance measurements of over 35 AI algorithms and found that most are on par with human radiologists for disease detection performance, showing comparable accuracy, sensitivity and specificity. The results of IDTechEx’s analysis are shown in the chart below.
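The two metrics above follow directly from the confusion-matrix counts. As a minimal illustration (the patient labels below are invented for the example, not drawn from IDTechEx's data):

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute sensitivity (true positive rate) and specificity
    (true negative rate) from binary labels: 1 = disease, 0 = healthy."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity

# Six hypothetical patients: three with disease (1), three without (0).
truth = [1, 1, 1, 0, 0, 0]
preds = [1, 1, 0, 0, 0, 1]   # one missed case, one false alarm
sens, spec = sensitivity_specificity(truth, preds)
```

Here one missed disease case and one false alarm give a sensitivity and a specificity of 2/3 each; an algorithm that misses cases loses sensitivity, while one that raises false alarms loses specificity.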
The goal, however, is for AI algorithms to outperform human radiologists. Ultimately, AI must be more reliable and more accurate than even the most highly trained experts in order to gain credibility as a decision support tool. Achieving this will facilitate the uptake of this technology in medical settings as the benefit of automated actionable quantitative insights will outweigh the short-term inconvenience of changing the workflow.
Opportunities for Improvement to Boost Uptake
A number of improvements could help image recognition AI to reach its full potential as a decision support tool for radiologists.
1) Increasing diversity in training data sets will widen the software’s applicability
Currently, a common limitation of image recognition AI algorithms is their restriction to a certain disease or population type. As a result, the software’s ability to detect disease can be reduced if the patient profile or condition does not match the data type that it was previously exposed to.
A key technical and business advantage lies in demonstrating success across a wide range of patient demographics, as this widens the software's applicability. When training DL algorithms, the training data should encompass numerous disease types, lesions, and other parameters so that the algorithm can recognize a multitude of demographics, tissue types, and abnormalities and perform to the level required by radiologists. It must work equally well for males and females, for different ethnicities, and so on.
For instance, skin cancer algorithms must be able to assess moles in all skin types and colors. Otherwise, if the algorithm encounters a lesion or abnormality that does not match any condition it recognizes, it may classify it as 'not dangerous' simply because it cannot associate it with any condition it knows.
Having a diverse data set also helps to prevent bias – the tendency of an algorithm to systematically favor the patterns that dominate its training data while overlooking under-represented ones. Aside from using a diverse training data set, the diversity issue in detecting skin cancer can be addressed by using a higher-resolution camera. For instance, MetaOptima provides a dermoscope-equivalent device that clips onto the user's smartphone and distinguishes features in different skin tones or mole types that would go unnoticed by a standard smartphone camera.
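One common way to build such a balanced training set is stratified sampling: draw the same number of examples from each demographic stratum so that no group dominates. The sketch below uses invented records labelled with a hypothetical `skin_type` field (Fitzpatrick types I–VI); the field name and group sizes are assumptions for illustration only:

```python
import random
from collections import defaultdict

def stratified_sample(records, key, per_stratum, seed=0):
    """Draw an equal number of training examples from each stratum
    (e.g. each skin type) so no single group dominates the training set."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    sample = []
    for group in strata.values():
        # Take at most per_stratum examples from each group.
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample

# 600 hypothetical lesion records spread across skin types 1-6.
records = [{"id": i, "skin_type": (i % 6) + 1} for i in range(600)]
balanced = stratified_sample(records, key=lambda r: r["skin_type"],
                             per_stratum=50)
```

The resulting set contains exactly 50 records per skin type, so the algorithm sees each demographic equally often during training.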
2) Including more negative cases in the training data can raise algorithm specificity
During algorithm training, confirmed disease cases often take priority over negative cases to raise the algorithm’s disease detection performance. Training this way results in a high sensitivity, or disease detection capability, which is a significant competitive advantage already described earlier in this article. While this approach is beneficial for rapidly identifying patients at risk of disease, it limits the AI’s ability to recognize healthy or benign cases. As a result, low specificity is a recurring issue in medical image recognition AI.
From the chart above, it is clear that more algorithms achieve super-human sensitivity than super-human specificity. In other words, AI might outperform humans in disease detection but is still inferior at identifying disease-free patients. False positives in testing pose a significant problem, as patients suffer undue anxiety while the results are confirmed by other means.
In some instances, unnecessary invasive procedures may even be conducted, which can considerably and needlessly raise costs. Overdiagnosis costs healthcare systems billions of dollars every year and AI has the potential to dramatically reduce this problem by providing more detailed insights into a patient’s condition.
To achieve this, image recognition AI algorithms must improve at differentiating afflicted patients from healthy ones, which can be addressed by using more curated negative cases during the training process.
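Beyond adding more negative cases, a standard complementary technique is to re-weight the training loss by inverse class frequency so that the scarcer class (here, healthy cases) is not drowned out. A minimal sketch, using an invented 80/20 case split for illustration:

```python
def class_weights(labels):
    """Inverse-frequency weights: the under-represented class receives
    a proportionally larger weight, so the training loss does not favor
    sensitivity at the expense of specificity."""
    n = len(labels)
    counts = {c: labels.count(c) for c in set(labels)}
    return {c: n / (len(counts) * cnt) for c, cnt in counts.items()}

# Hypothetical training set: 80 confirmed disease cases, 20 negatives.
labels = [1] * 80 + [0] * 20
weights = class_weights(labels)
```

With this split, each negative (healthy) example carries a weight of 2.5 against 0.625 for each positive one, pushing the algorithm to take misclassified healthy cases seriously during training.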
3) Using high-resolution images will maximize algorithm performance
The use of poor-quality data during training negatively impacts the development process and performance levels of DL algorithms. Unclear images reduce the accuracy of insights generated by AI, which can damage its chances for widespread implementation.
Methods that enable doctors and radiologists to capture better images or enhance their resolution can boost the value of image recognition AI in medical settings. This in turn improves user experience and demonstrates the value of AI more effectively. These methods can also reduce the demand on the servers of AI companies, as only high-quality images are uploaded rather than noisy, unusable data, which translates to lower cloud server costs.
Ensuring picture quality is becoming increasingly important, and AI-driven methods for assessing or improving image quality are already commercialized. For instance, USA-based Subtle Medical uses image recognition AI to transform blurry images unsuitable for analysis into high-resolution scans.
Another approach is to assess image quality immediately after acquisition to determine whether the image is sufficient for reliable diagnosis or if the image should be re-taken. India-based Artelus has incorporated this feature into its diabetic retinopathy AI detection tool so that image quality assessment is automated upon capture.
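A common building block for this kind of capture-time quality gate is the variance of the image's Laplacian: sharp images contain strong edges and score high, while blurry ones score low. The sketch below is a generic illustration of the idea, not Artelus's actual method, and the `threshold` value is an invented assumption that would need tuning per imaging device:

```python
import numpy as np

def laplacian_variance(img):
    """Variance of the discrete Laplacian over the interior pixels.
    Sharp images have strong edges and a high value; a low value
    suggests the image is blurry."""
    img = img.astype(float)
    lap = (-4 * img[1:-1, 1:-1]
           + img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:])
    return lap.var()

def needs_retake(img, threshold=50.0):
    """Hypothetical quality gate applied immediately after capture:
    flag the image for re-acquisition if it looks too blurry."""
    return bool(laplacian_variance(img) < threshold)

# A high-contrast checkerboard (sharp) versus a featureless image (no detail).
sharp = (np.indices((64, 64)).sum(axis=0) % 2) * 255
flat = np.full((64, 64), 128)
```

Gating on such a score means only images clear enough for reliable diagnosis are forwarded to the AI, while the rest are re-taken on the spot.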
Companies focused on data quality hold a competitive advantage as dealing only with high-resolution images heightens the reliability of AI-generated insights.
IDTechEx’s report ‘AI in Medical Diagnostics 2020-2030: Image Recognition, Players, Clinical Applications, Forecasts’ provides a detailed analysis of emerging solutions and innovations in the medical image recognition AI space. It cuts through the technological landscape from both a commercial and technical perspective by benchmarking the products of over 60 companies across 12 disease applications according to performance, market readiness, technical maturity, value proposition and other factors.
In-depth insights into the company and market landscape are also provided, including ten-year forecasts for each application.