0198

Abdominal Foundation Model: Bootstrapping artificial intelligence for MRI organ volume biomarker analysis in ADPKD
Chenglin Zhu1, Xinzi He2, Zhongxiu Hu1, Hreedi Dev1, Dominick J. Romano1, Arman Sharbatdaran1, Anna Prince1, Andrea Soto Figueroa1, Sophie J. Wang1, Hui Yi Ng He1, Jon D. Blumenfeld3, and Martin R. Prince1,4
1Weill Cornell Medicine, New York City, NY, United States, 2Cornell University and Cornell Tech, New York City, NY, United States, 3The Rogosin Institute, New York City, NY, United States, 4Columbia University Vagelos College of Physicians and Surgeons, New York City, NY, United States

Synopsis

Keywords: Kidney, Segmentation, ADPKD

Motivation: Abdominal organ volumes are critical MRI biomarkers in many diseases including autosomal dominant polycystic kidney disease.

Goal(s): We aim to develop a segmentation model with an enhanced ability to generalize across various abdominal organs and MR pulse sequences.

Approach: Expanding upon our existing ADPKD kidney model, we construct a multi-modality abdominal foundation model that adapts to diverse organs and tissues with minimal new training data.

Results: The model was trained using a model-in-loop methodology and evaluated against radiologist benchmarks, yielding a Dice score of 0.94 for in-distribution sequences and 0.73 for organ segmentations on out-of-distribution sequences.

Impact: This foundational model can seamlessly integrate into clinical workflows, utilizing routine cases to enhance its performance and extending its application to additional organs and tissues. This advance also marks a significant step toward the automation of MRI reporting.

Introduction

Segmenting organs on abdominal MRI scans is a nuanced task that has historically been constrained by the specificity of datasets to certain MR pulse sequences and organ types. We introduce a foundation model tailored for clinical utility, capable of adapting to the segmentation of various organs and tissues with minimal new training data through iterative training.
Autosomal Dominant Polycystic Kidney Disease (ADPKD) affects approximately 1 in 1000 live births and is marked by the progressive enlargement of renal cysts, ultimately leading to renal failure 1. MRI is vital in the longitudinal management of ADPKD: tracking total kidney volume (TKV) to monitor disease progression, characterizing renal cysts, and evaluating the involvement of other organs.
While existing algorithms have provided solutions for measuring kidney and liver volumes for healthy patients on MRI, our approach expands the scope of automated MRI analysis to encompass all abdominal organs impacted by ADPKD. Integrating this multi-modality, multi-organ auto-segmentation algorithm into clinical practice allows model performance to improve continuously as additional training data are collected.

Methods

The core of our method is the abdominal foundation model, which is designed as an encoder-decoder framework. A vision transformer serves as the bottleneck, bridging the encoder and decoder. This transformer is initialized from large-scale pre-training (such as OpenCLIP), enabling it to leverage a wide array of visual features. It has an embedding dimension of 1024, a multi-layer perceptron (MLP) expansion ratio of 4, 16 attention heads, and 24 layers, giving it the capacity to process complex imaging data.
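As a rough illustration of the scale these hyperparameters imply, the sketch below records them in a small configuration class and counts the parameters of the transformer blocks. The class name, the standard block layout (QKV and output projections, a two-layer MLP, two layer norms, all with biases), and the exclusion of patch embedding and encoder/decoder layers are assumptions made for illustration, not details taken from the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTBottleneckConfig:
    """Hyperparameters quoted in the text; the class itself is hypothetical."""
    embed_dim: int = 1024   # token embedding dimension
    mlp_ratio: int = 4      # MLP hidden width = mlp_ratio * embed_dim
    num_heads: int = 16     # attention heads per layer
    depth: int = 24         # number of transformer layers

    @property
    def head_dim(self) -> int:
        # Each head attends over an equal slice of the embedding.
        return self.embed_dim // self.num_heads

    def block_params(self) -> int:
        """Weights and biases in one standard transformer block (assumed layout)."""
        d = self.embed_dim
        hidden = self.mlp_ratio * d
        attention = 3 * (d * d + d) + (d * d + d)        # QKV + output projection
        mlp = (d * hidden + hidden) + (hidden * d + d)   # two linear layers
        layer_norms = 2 * (2 * d)                        # scale and shift, twice
        return attention + mlp + layer_norms

    def total_block_params(self) -> int:
        # Excludes patch embedding, positional embeddings, encoder and decoder.
        return self.depth * self.block_params()


config = ViTBottleneckConfig()
print(config.head_dim)
print(f"{config.total_block_params() / 1e6:.1f}M parameters in transformer blocks")
```

Under these assumptions the blocks alone hold roughly 0.3 billion parameters, the size commonly referred to as ViT-Large.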
The training dataset includes both manual and model-corrected segmentations from previous studies 2,3,4,5 and clinical ADPKD cases. The continuous integration of real-world data keeps model evaluation aligned with clinical data, with the aim of clinical-grade accuracy. Clinical deployment of the model aids in generating automated ADPKD radiological reports, offering critical metrics for disease progression and treatment management (Figure 1).
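The organ volume metrics in such a report follow directly from the segmentation: the number of voxels carrying a label multiplied by the physical volume of one voxel. The minimal sketch below shows this for total kidney volume (TKV) as the sum of the right and left kidney volumes. The label IDs, the toy label map, and the voxel spacing are hypothetical; a real pipeline would typically read the label map and spacing from the image header and operate on arrays rather than Python lists.

```python
from typing import Iterable, Sequence


def organ_volume_ml(labels: Iterable[int], label: int,
                    spacing_mm: Sequence[float]) -> float:
    """Volume of one label in millilitres: voxel count times voxel volume."""
    dx, dy, dz = spacing_mm
    voxel_volume_mm3 = dx * dy * dz
    voxel_count = sum(1 for value in labels if value == label)
    return voxel_count * voxel_volume_mm3 / 1000.0  # 1 mL = 1000 mm^3


# Hypothetical label IDs; the source does not specify an encoding.
RIGHT_KIDNEY, LEFT_KIDNEY = 1, 2

# Toy flattened label map (0 = background) and assumed voxel spacing in mm.
toy_labels = [0, 1, 1, 1, 2, 2, 0, 0]
toy_spacing = (1.5, 1.5, 4.0)

# TKV is the sum of the two kidney volumes.
tkv = (organ_volume_ml(toy_labels, RIGHT_KIDNEY, toy_spacing)
       + organ_volume_ml(toy_labels, LEFT_KIDNEY, toy_spacing))
print(f"TKV = {tkv:.3f} mL")
```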

Results

The training data set includes expert-verified abdominal organ annotations in 2226 MRI sequences from 708 patients: right kidney, left kidney, spleen, liver, aorta, IVC, stomach, pancreas, and gallbladder labeled on axial/coronal T2, axial/coronal SSFP, and axial T1 water phase images, and exophytic renal cysts and hepatic cysts labeled exclusively on coronal T2 and axial T2 images, respectively (Figure 2). Seven clinical ADPKD MRI exams from patients outside the training data set were used to evaluate model performance. The Dice score between the model output and the expert-corrected standard was calculated (Figure 3). Organs with the most extensive training data (right kidney, left kidney, spleen, and liver) achieved excellent average Dice scores of 0.99 to 1.00 on the five primary trained MR sequences, and good performance (0.83 to 0.97) on untrained MR sequences such as axial T1 in-phase, axial T1 opposed-phase, and axial DWI. Other abdominal organs, except for the gallbladder, achieved mean Dice scores of 0.85 to 1.00 on trained sequences and 0.24 to 0.96 on untrained sequences (Figure 4). Cyst segmentation, trained only on T2 sequences, generalized well to sequences such as SSFP but poorly to T1 sequences. For hepatic cysts, although the manually labeled training dataset was limited (n=20) and predominantly featured cases with fewer cysts, the model generalized accurately to cases with a greater abundance of hepatic cysts (Figure 5).
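For reference, the Dice score for one label is twice the overlap between the model output and the expert-corrected standard, divided by the sum of the two label sizes. The sketch below computes it per label over flattened label maps. The function name, the toy label maps, and the choice to return 1.0 when a label is absent from both maps are assumptions for illustration and do not describe the authors' evaluation code.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], ref: Sequence[int], label: int) -> float:
    """Dice = 2 * |P & R| / (|P| + |R|) for one label over flattened label maps."""
    if len(pred) != len(ref):
        raise ValueError("prediction and reference must have the same size")
    pred_count = sum(1 for p in pred if p == label)
    ref_count = sum(1 for r in ref if r == label)
    overlap = sum(1 for p, r in zip(pred, ref) if p == label and r == label)
    if pred_count + ref_count == 0:
        # Label absent from both maps; returning 1.0 is an assumed convention.
        return 1.0
    return 2.0 * overlap / (pred_count + ref_count)


# Toy flattened label maps: 0 = background, 1 and 2 = two hypothetical organs.
model_output = [0, 1, 1, 1, 2, 2, 0, 0]
expert_corrected = [0, 1, 1, 0, 2, 2, 2, 0]
for label in (1, 2):
    print(label, round(dice_score(model_output, expert_corrected, label), 3))
```

Averaging such per-label scores across test exams, separately for trained and untrained pulse sequences, gives summary figures of the kind reported here.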

Discussion and Conclusion

These data, from 2226 MRI sequences in 708 patients with segmentation of 8 organs and cystic tissues on multiple types of MRI pulse sequences, show how this foundation model, starting from basic kidney segmentations in ADPKD subjects, can be expanded to perform well even on tissues and pulse sequences that the model has not seen. With the model incorporated into the routine clinical workflow, its performance improves and expands continuously with the daily addition of radiologist-corrected training data. This is gradually evolving toward more automatic and accurate reporting of quantitative MR abdominal organ volume biomarkers.

References

1. Davies, Felicity, Gerald A. Coles, Peter S. Harper, Andrew J. Williams, Christine Evans, and Dennis Cochlin. "Polycystic kidney disease re-evaluated: a population-based study." QJM: An International Journal of Medicine 79, no. 3 (1991): 477-485.

2. He, Xinzi, Zhongxiu Hu, Hreedi Dev, Dominick J. Romano, Arman Sharbatdaran, Syed I. Raza, Sophie J. Wang et al. "Test Retest Reproducibility of Organ Volume Measurements in ADPKD Using 3D Multimodality Deep Learning." Academic Radiology (2023).

3. Dev, Hreedi, Chenglin Zhu, Arman Sharbatdaran, Syed I. Raza, Sophie J. Wang, Dominick J. Romano, Akshay Goel et al. "Effect of Averaging Measurements From Multiple MRI Pulse Sequences on Kidney Volume Reproducibility in Autosomal Dominant Polycystic Kidney Disease." Journal of Magnetic Resonance Imaging (2023).

4. Sharbatdaran, Arman, Dominick Romano, Kurt Teichman, Hreedi Dev, Syed I. Raza, Akshay Goel, Mina C. Moghadam et al. "Deep learning automation of kidney, liver, and spleen segmentation for organ volume measurements in autosomal dominant polycystic kidney disease." Tomography 8, no. 4 (2022): 1804-1819.

5. Goel, Akshay, George Shih, Sadjad Riyahi, Sunil Jeph, Hreedi Dev, Rejoice Hu, Dominick Romano et al. "Deployed deep learning kidney segmentation for polycystic kidney disease MRI." Radiology: Artificial Intelligence 4, no. 2 (2022): e210205.

Figures

Figure 1. Diagram illustrating the integrated model-in-loop workflow utilized for automated generation of radiological reports for ADPKD and concurrent enhancement of the model's training set.

Figure 2. Distribution of the training dataset illustrated by the number of annotations for each organ or tissue type across different MR pulse sequences.

Figure 3. Comparison of Dice scores for the test dataset, showcasing the agreement between model-generated segmentations and expert-corrected segmentations (considered the standard) across various organ or tissue types and both familiar (trained) and unfamiliar (untrained) MR pulse sequences.

Figure 4. A snapshot of the model's segmentation performance for a test case, illustrated through 2D segmentation overlays and 3D reconstructed volumes, evaluated across MR pulse sequences with varying levels of training exposure. The average Dice score for all segmented labels per sequence type is documented.

Figure 5. Evaluation of model efficacy in segmenting hepatic cysts in a test case with cyst volumes exceeding those in the training dataset: (a) A histogram showcasing the distribution of hepatic cyst volumes within the training dataset (blue), contrasted against the volume of the test case (indicated by the red arrow). (b) The achieved Dice scores for model segmentations on axial T2 and coronal T2 images were 0.88, demonstrating the model's robust performance even with cyst volumes outside the training scope.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)
0198
DOI: https://doi.org/10.58530/2024/0198