All I Want for Christmas is… Reliable AI
As 2022 comes to an end, it’s time to reflect on the progress made in AI-enabled health tech. Investors meaningfully turned their attention to the space about a decade ago, which gives us a long enough period to take a step back and review the bigger picture with more clarity.
Background
Let’s take radiology AI as the “canary in the coal mine” to set the scene for our reflections. Computer-aided tools for medical imaging made their first appearance long before VCs got interested, dating back to the early 2000s. In the eyes of computer scientists and techno-utopians, radiology presented the perfect field for AI adoption: it could be reductively framed as a “computer vision” problem just waiting to be solved. The failure of that ambition is not only old news - the status quo still hasn’t changed in 2022.
Fast forward to the present day: the year spanning Q4 2021 to Q4 2022 witnessed some sizable acquisitions that signalled the integration of AI tools within the broader radiology infrastructure and workflow (Sirona Medical acquired Nines, RadNet acquired Aidence, and Nanox acquired Zebra). But let’s be clear - an acquisition or exit does not always equate to delivering concrete value for patients, professionals, healthcare systems or investors. And that’s what we’re going to focus on in this blog.
The last decade (and hubris) of radiology AI has already served up some hard lessons for other fields of medicine that are earlier in their AI development journey. Here, we’ll cover the three biggest obstacles that AI-enabled health tech faces, and offer some pointers on how we can work to overcome them. And no, radiology AI still hasn’t quite overcome any of them either.
Obstacle 1: reproducibility
Real-world reproducibility of clinical AI performance is all over the place, and the data is worryingly sparse. Of the 139 non-radiology, AI-enabled, FDA-approved devices (as of December 2022), a simple literature search uncovers only a handful of independent external validation studies of these tools in their post-deployment phase. Where these studies do exist, they leave a lot to be desired, and this gap has been described as an implementation chasm between development and deployment. Performance metrics submitted to regulators aren’t being matched in real-world settings. For example, external validation of the Apple Watch ECG analysis showed that accuracy varied with watch positioning (the highest accuracy was recorded when the wristwatch was, conveniently, worn around the chest). What’s worse, this lack of transparent, reproducible, published data isn’t just an AI-enabled health tech problem - it’s a general health tech problem.
Solutions to Obstacle 1
1: Changes to the regulatory landscape. This year, medical device regulators continued to take steps towards increasing transparency and offering frameworks to help developers deploy, monitor and update AI-based algorithms in a safe and effective way. The ongoing collaboration between the MHRA, FDA and Health Canada has already produced “Good Machine Learning Practice” guidance, and 2023 should bring more progress following the FDA’s call for feedback on managing pre-specified algorithm changes for on-market AI-enabled products.
More broadly, regulators are increasing the transparency of their processes. Whilst most data submitted to regulators is not public, the FDA’s 510(k) database offers some information on the clinical data that developers submit for approval. In Europe, the staggered and lacklustre rollout of EUDAMED is still ongoing, but it someday promises to increase the amount of publicly available information for CE-marked products too. This will help buyers and external validators alike to see what data was submitted for regulatory approval, and to better understand how to benchmark local deployments of AI tools.
2: Third party validation. While the regulatory frameworks remain in flux, some companies now offer a “try before you buy” service, often bundled with a distribution platform (such as Lakera, MONAI, carpl.ai and Meta Lynx), to help buyers evaluate the potential local performance of AI-enabled products on their own native data before finalising a purchase. These independent validation tools are a step in the right direction. But until the regulations allow for site-dependent algorithmic change, continuous improvement of deployed devices will remain a pipe dream, limiting how impactful these adjunct platforms can be. By facilitating pre-specified changes in a safe way, AI-enabled tech may be able to extinguish the ongoing reproducibility crisis.
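To make the idea of local validation concrete, here is a minimal Python sketch (all figures hypothetical, not drawn from any real device) of the kind of check a buyer might run: compare a vendor’s claimed sensitivity and specificity against those observed on the site’s own labelled data, with Wilson score intervals to make the uncertainty explicit.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a binomial proportion."""
    if total == 0:
        return (0.0, 1.0)
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    margin = (z / denom) * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2))
    return (centre - margin, centre + margin)


def local_validation(tp: int, fn: int, tn: int, fp: int,
                     claimed_sens: float, claimed_spec: float) -> dict:
    """Compare locally observed sensitivity/specificity with vendor claims.

    A claim is marked 'supported' only if it falls inside the 95% interval
    computed from the site's own confusion matrix.
    """
    sens_lo, sens_hi = wilson_interval(tp, tp + fn)
    spec_lo, spec_hi = wilson_interval(tn, tn + fp)
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_95ci": (sens_lo, sens_hi),
        "sens_claim_supported": sens_lo <= claimed_sens <= sens_hi,
        "specificity": tn / (tn + fp),
        "specificity_95ci": (spec_lo, spec_hi),
        "spec_claim_supported": spec_lo <= claimed_spec <= spec_hi,
    }


# Hypothetical local confusion matrix, checked against a vendor-claimed
# 95% sensitivity and 90% specificity
report = local_validation(tp=85, fn=15, tn=450, fp=50,
                          claimed_sens=0.95, claimed_spec=0.90)
print(report)
```

In this invented example the local sensitivity (85%) falls well short of the claimed 95%, which is exactly the kind of gap a pre-purchase evaluation on native data is meant to surface.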
Obstacle 2: demonstrating economic value
The reproducibility crisis (Obstacle 1) needs to be solved first. But once performance metrics stabilise and are easier to monitor and collect, the next step will involve incorporating this data into formal value assessments. Again, with radiology being the furthest along the journey, many buyers and investors are now catching onto the fact that the value promised during initial investments in the mid 2010s is on the whole yet to be realised. The reason? There are currently no robust health economic models demonstrating how valuable these tools are in real life, and how their performance impacts the efficiencies of healthcare systems.
Solution to Obstacle 2
Get intentional about collecting the right type of health economic data. Developing the infrastructure to collect real-time clinical (and non-clinical!) data is vital to building rigorous health economic models. This year, Hardian Health and Aidence kicked off a pioneering initiative, which aims to collect the data needed to demonstrate the real-world value of AI-assisted lung nodule detection by undertaking a thorough health economic analysis. With the work set for completion next year, it will complement NICE’s Early Value Assessment for similar radiology solutions.
Pivoting to mental health digital therapeutics (DTx) as a contrast to radiology, there is already well-established economic evidence to support their implementation. NICE is now undertaking more Early Value Assessments focusing on digital therapies for depression and anxiety. The question that remains is how the AI-enabled features of next-generation DTx platforms can be evaluated. These newer interventions may allow for highly personalised treatment (which loops us back to the reproducibility problem outlined in Obstacle 1) and a more variable patient journey, which is trickier to model without granular data across the entire care pathway. It seems likely that most AI-enabled solutions will need formally designed studies to specifically assess the health economic value they propose to deliver. The outcomes of such studies would also provide invaluable inputs for the financial models of the more sophisticated investors now entering the health tech space.
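At the heart of these value assessments sits a simple calculation: the incremental cost-effectiveness ratio (ICER), the extra cost per extra quality-adjusted life year (QALY) gained versus standard care. As a sketch with entirely invented figures (not from any real evaluation):

```python
def icer(cost_new: float, cost_old: float,
         qaly_new: float, qaly_old: float) -> float:
    """Incremental cost-effectiveness ratio: the extra cost per extra
    quality-adjusted life year (QALY) gained over the comparator."""
    delta_qaly = qaly_new - qaly_old
    if delta_qaly == 0:
        raise ValueError("No QALY difference; ICER is undefined")
    return (cost_new - cost_old) / delta_qaly


# Hypothetical figures: an AI-assisted pathway costs £200 more per patient
# than standard care, but yields 0.01 extra QALYs
ratio = icer(cost_new=1200.0, cost_old=1000.0, qaly_new=0.76, qaly_old=0.75)
print(f"ICER: £{ratio:,.0f} per QALY")

# NICE's commonly cited willingness-to-pay range is £20,000-£30,000 per QALY
cost_effective = ratio <= 20_000
```

The hard part isn’t the arithmetic - it’s populating the cost and QALY inputs with real, granular data from across the care pathway, which is exactly what the infrastructure described above needs to deliver.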
Obstacle 3: not putting clinical need first
Comparing health economic data for mental health DTx with radiology AI raises the question: why is there a discrepancy in value proposition between AI-enabled health tech fields that gained traction around the same time? Part of the reason is a misalignment in when, and more importantly why, AI/machine-learning-based tools were incorporated into these medical specialties.
Mental health DTx evolved to address a simply articulated clinical need: give more patients quicker, easier access to the psychological help they need. The journey started with online platforms for tele-therapy in the early 2000s (like SilverCloud), followed by self-guided interventions in the late 2000s (like Deprexis), and has incorporated AI-based features since the late 2010s (like Wysa). Newer companies are working to provide this entire service stack in stepwise offerings (like Koa). In this field, the solutions were designed to complement current clinical pathways (e.g. IAPT in the NHS). Radiology AI (and other diagnostic fields), by contrast, was perhaps swayed more by the technological innovations of deep learning in the 2010s than by pressing clinical needs such as the backlog of scans awaiting interpretation. It’s not intuitively clear that improved detection and diagnostic accuracy (which is what most radiology AI focuses on) directly increases radiologist efficiency (which is where most of the value narrative is focused). Economic evidence for these efficiency improvements will need to be generated if the claims are to be believed.
Solutions to Obstacle 3
1: Put clinical needs first. As obvious as it sounds, starting with a target problem to solve helps to clarify what needs to be built. More often than not, brilliant health tech that puts clinical need first does not require complex AI at all.
2: Bundling AI tools with existing care pathways and infrastructure. Building tools that integrate into the current workflow and ecosystem is vital. Many companies do this only superficially (for example via API links to their siloed platforms), but the real value comes from designing tools specifically built for pathway integration. This nudges product designers to prioritise clinical need and utility over cramming in more features. Bundling can happen in four main ways:
Strategic acquisition of complementary services such as:
Pfizer doubling down on its partnership with CytoReason, an AI-powered drug discovery platform
Integration across technological verticals such as:
Siemens building compatible radiology AI tools that can stack directly onto their MRI and CT scanning machines
Integration of digital platforms with clinical verticals and service delivery such as:
EQL’s “Phio” platforms complementing health insurance packages and facilitating local clinical referrals in the digital physiotherapy space
Building standalone AI-based tools that are specifically designed with a care pathway in mind such as:
Qure.ai’s CT head tool which facilitates NICE Head Injury reporting standards
The in-development CURIAL AI tool for COVID-19 triage specifically in emergency care settings
These bundling strategies also create beneficial feedback loops, yielding useful data for training newer iterations of these tools.
Conclusion
2022 brings a decade of ups and downs for clinical AI to a close. There have been some great breakthroughs, and a fair share of hiccups along the way. With many lessons learnt from our radiology AI trailblazers, it’s time to take these insights and build on them across the field. Hardian and our clients remain optimistic and committed to doing the hard work of bringing reliable, economically valuable, regulated and clinically useful tech to healthcare. Bring on 2023!
Hardian Health is a clinical digital consultancy focused on leveraging technology in healthcare markets through clinical strategy, scientific validation, regulation, health economics and intellectual property.