Development of trustworthy AI-based image segmentation and uncertainty quantification in advanced NSCLC

Published on 03/06/25
Dive into the groundbreaking efforts being made in academia to develop trustworthy AI algorithms that address challenges related to medical imaging analysis, including delineation, in stage IV non-small cell lung cancer (NSCLC) patients, a crucial step leading towards truly personalized medicine approaches.

Context & Study Overview

Lung cancer is a leading cause of cancer-related deaths worldwide, presenting substantial clinical and imaging challenges, especially in its most advanced stages, such as stage IV NSCLC.

Medical imaging, such as computed tomography (CT) and magnetic resonance imaging (MRI), plays a crucial role throughout the care pathway of lung cancer patients, from initial diagnosis, through treatment response assessment, to longitudinal monitoring.

Accurate tumor segmentation remains a notable gap in clinical workflows. It is particularly difficult in patients with stage IV NSCLC because of heterogeneous tumor morphology and the comorbidities and conditions that accompany advanced disease (e.g., atelectasis and pleural effusion).

As such, there is a clear and growing need for robust and reproducible radiomics segmentation tools to advance personalized medicine in NSCLC and improve patient care.

This study addresses the persistent obstacle of achieving high-performance tumor segmentation with artificial intelligence (AI) on highly heterogeneous, multicentric 3D CT scans of metastatic NSCLC (mNSCLC), with an emphasis on model trustworthiness via uncertainty quantification.

Dataset & Methods

To conduct this study, Dedeken et al. used a curated subset of data from the DEEP-Lung-IV (DLIV) study (NCT04994795), comprising 387 stage IV NSCLC patients from 13 European centers. These patients were treated either with a combination of chemotherapy and immunotherapy (study cohort B) or with chemotherapy alone (study cohort C).

The images were acquired using different device models and acquisition protocols across centers, increasing the heterogeneity of the dataset, which reflected the variety found in real-world clinical practice.

All images were annotated by expert radiologists and pre-processed (Figure 1), to ensure consistency, before running AI models to delineate the tumors.

The team evaluated three segmentation models widely used in the literature: U-Net, Attention U-Net, and UNEt TRansformers (UNETR). They also introduced a “confidence score” to help identify when the AI model might be unsure, especially in more complex cases like small or hard-to-segment tumors. This score helps flag uncertain cases for further review, ensuring more reliable downstream analyses.

Figure 1. Dataset selection and imaging processing workflow for AI-driven segmentation model development.

Key Findings

1. Segmentation Performance

The best-performing AI model was the Attention U-Net, which was trained with a special method to boost accuracy. It segmented lung tumors with a score of 0.76 (± 0.20).

This performance surpassed that of U-Net (0.66 ± 0.23) and UNETR (0.60 ± 0.23). The model was also reported to be reliable across different patient subsets and types of settings.
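The post does not name the metric behind these scores. Overlap metrics such as the Dice similarity coefficient are a common choice for segmentation, so the sketch below assumes a Dice-style score purely for illustration. The function `dice_score` and the toy masks are hypothetical and are not taken from the study.

```python
import numpy as np


def dice_score(pred, truth):
    """Dice overlap between two binary masks of the same shape (0 = none, 1 = identical)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement here (a convention, not a rule).
        return 1.0
    return float(2.0 * np.logical_and(pred, truth).sum() / denom)


# Toy 3D volume: the prediction covers 8 voxels, the reference 12, sharing 8.
pred = np.zeros((4, 4, 4), dtype=bool)
truth = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 1:3] = True
truth[1:3, 1:3, 1:4] = True

score = dice_score(pred, truth)  # 2 * 8 / (8 + 12) = 0.8
```

Under this assumption, a figure such as 0.76 (± 0.20) would be the average and spread of per-scan scores over the test set.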

As expected, the model showed lower scores in cases with small tumor volumes or complex scenarios, such as atelectasis and pleural effusion.

Additionally, the researchers found that using a narrower imaging range focused on the abdomen yielded better precision in solid tumor segmentation, critical for advanced NSCLC imaging needs.
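One possible reading of a "narrower imaging range" is CT intensity windowing, where Hounsfield units (HU) are clipped to a soft-tissue window before being fed to the model. The post gives no details, so this interpretation is an assumption. The sketch below uses a commonly quoted soft-tissue setting (center 40 HU, width 400 HU) as a placeholder rather than the study's actual values, and `apply_window` is a hypothetical helper.

```python
import numpy as np


def apply_window(hu, center=40.0, width=400.0):
    """Clip Hounsfield units to [center - width/2, center + width/2] and rescale to [0, 1]."""
    low = center - width / 2.0
    high = center + width / 2.0
    clipped = np.clip(np.asarray(hu, dtype=np.float32), low, high)
    return (clipped - low) / (high - low)


# Air (-1000 HU) and dense bone (+1000 HU) saturate; soft tissue keeps its contrast.
voxels = np.array([-1000.0, -160.0, 40.0, 240.0, 1000.0])
windowed = apply_window(voxels)
```

A narrower window spends the full intensity scale on soft-tissue densities, which may be why it would help with solid tumors.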

Figure 2. Influence of architecture model on test score.

2. Confidence Score Reliability

The developed confidence score, following a Deep Ensembles approach, showed how certain the AI segmentation model was about each result. This score demonstrated high accuracy, closely matching the actual performance, with a strong correlation of 0.86.
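As a rough illustration of the idea, an ensemble-based confidence score can be built from how much the independently trained models agree with one another. The sketch below assumes the confidence is the mean pairwise Dice overlap between member masks, and that the reported correlation is a Pearson coefficient. Neither detail is stated in the post, and all function names here are hypothetical.

```python
import itertools

import numpy as np


def dice(a, b):
    """Dice overlap between two binary masks (1.0 when both are empty)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    return 1.0 if denom == 0 else float(2.0 * np.logical_and(a, b).sum() / denom)


def ensemble_confidence(member_masks):
    """Mean pairwise Dice between the masks predicted by each ensemble member."""
    if len(member_masks) < 2:
        raise ValueError("need at least two ensemble members")
    pairs = itertools.combinations(member_masks, 2)
    return float(np.mean([dice(a, b) for a, b in pairs]))


def pearson_r(confidences, scores):
    """Pearson correlation between confidence scores and measured segmentation scores."""
    return float(np.corrcoef(confidences, scores)[0, 1])


# Three hypothetical members: two identical masks and one shifted by a row.
base = np.zeros((8, 8), dtype=bool)
base[2:6, 2:6] = True
shifted = np.roll(base, 1, axis=0)
confidence = ensemble_confidence([base, base, shifted])
```

When members disagree strongly, the score drops toward 0. A high correlation between this score and the measured segmentation score is what would make it usable as a proxy when no reference annotation is available.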

Low confidence scores were associated with known complex cases (e.g., small tumor volumes, pleural effusion, atelectasis), showing that the system can automatically flag tricky images for additional manual verification by an expert.
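In practice, flagging can be as simple as thresholding the per-scan confidence. The sketch below is a minimal example of that idea. The 0.5 threshold, the scan identifiers, and the function `flag_for_review` are all hypothetical, and a real cut-off would need to be tuned on validation data.

```python
def flag_for_review(confidences, threshold=0.5):
    """Return case IDs whose confidence is below `threshold`, least confident first."""
    flagged = [case_id for case_id, conf in confidences.items() if conf < threshold]
    return sorted(flagged, key=lambda case_id: confidences[case_id])


# Hypothetical per-scan confidence scores.
confidences = {"scan_01": 0.91, "scan_02": 0.34, "scan_03": 0.48, "scan_04": 0.77}
review_queue = flag_for_review(confidences)
```

Sorting by confidence puts the least certain segmentations at the top of the review queue.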

A Monte Carlo Dropout (MCDO) method was also tested for computing uncertainty. Although its results were satisfactory (correlation of 0.77), the Deep Ensembles approach produced more reliable results.
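For readers unfamiliar with MCDO, the idea is to keep dropout active at inference time and run several stochastic forward passes through a single model. The spread of the predictions then serves as an uncertainty estimate. The sketch below shows this on a tiny two-layer network written in NumPy. It is a toy stand-in for a segmentation network, with random weights and hypothetical names, and does not reproduce the study's setup.

```python
import numpy as np


def mc_dropout_predict(x, w_hidden, w_out, drop_rate=0.5, n_passes=30, seed=0):
    """Run `n_passes` forward passes with dropout left on; return mean and std of outputs."""
    rng = np.random.default_rng(seed)
    keep = 1.0 - drop_rate  # drop_rate must be below 1
    hidden = np.maximum(x @ w_hidden, 0.0)  # ReLU hidden layer
    passes = []
    for _ in range(n_passes):
        keep_mask = rng.random(hidden.shape) < keep  # fresh Bernoulli mask per pass
        dropped = hidden * keep_mask / keep  # inverted-dropout rescaling
        logits = dropped @ w_out
        passes.append(1.0 / (1.0 + np.exp(-logits)))  # sigmoid foreground probability
    passes = np.stack(passes)
    return passes.mean(axis=0), passes.std(axis=0)


rng = np.random.default_rng(42)
x = rng.normal(size=(5, 4))          # 5 toy "voxels" with 4 features each
w_hidden = rng.normal(size=(4, 16))  # random weights, for illustration only
w_out = rng.normal(size=16)
mean_prob, spread = mc_dropout_predict(x, w_hidden, w_out)
```

Unlike Deep Ensembles, this needs only one trained model, which makes it cheaper, although the post reports a lower correlation for it.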

Figure 3. Correlation between confidence score and segmentation model test score. 

Conclusion and Clinical Implications

The development of accurate tumor segmentation from CT scans is essential for extracting reliable imaging biomarkers that support clinical decision-making and personalized treatment strategies in advanced NSCLC.

This study by Dedeken et al. aligns with these growing needs and demonstrates that trustworthy deep-learning-based tumor segmentation can effectively address the complexity of segmenting stage IV NSCLC CT scans, even using diversified, heterogeneous, and multicentric datasets.

By running extensive experiments, this study showed that:

  • The Attention U-Net was the best-performing of the three models tested for tumor segmentation, enhancing research efficiency and robustness in this field.
  • A simple confidence scoring system, based on a Deep Ensembles approach, effectively measures uncertainty at the image level, with potential practical applications in making automated clinical workflows safer and more reliable.

Next steps include testing the developed algorithms on a larger and more diverse group of patients from the DEEP-Lung-IV clinical study, and assessing them on stage III NSCLC patients undergoing radiotherapy and chemotherapy to evaluate the generalizability of the segmentation model.

Overall, this work lays the foundation for robust and explainable radiomics algorithms, offering insights and tools to accelerate the integration of AI in medical imaging workflows.

This study was led by Sacha Dedeken, Pierre-Henri Conze, and Dimitris Visvikis from the Laboratory of Medical Information Processing (LaTIM), in collaboration with SOPHiA GENETICS.

LaTIM is a joint research laboratory (UMR) of Inserm (French National Institute of Health and Medical Research), the University of West Brittany (UBO), and IMT Atlantique, associated with the CHRU (University Research Hospital) of Brest.

SOPHiA DDM™ for Radiomics and SOPHiA DDM™ for Multimodal are concepts in development. May not be available for sale.
