Srivathsa Pasumarthi1, Sidharth Kumar2, and Ryan Chamberlain1
1R&D, Subtle Medical Inc, Menlo Park, CA, United States, 2Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, United States
Synopsis
Keywords: Analysis/Processing, Brain, Unsupervised Anomaly Detection
Motivation: Unsupervised anomaly detection (UAD) approaches on T1 contrast-enhanced (T1CE) images are currently not feasible as T1CE images of healthy individuals are not typically available.
Goal(s): In this work, we aim to eliminate the need for the large labeled datasets that are required for supervised anomaly detection on T1CE images.
Approach: Using deep learning (DL), we synthesized healthy T1CE images from non-contrast images available in public datasets. We also synthesized healthy-anomalous paired images and forced the DL network to learn the healthy reconstruction. The anomalies were localized by subtracting the reconstruction from the input image.
Results: The proposed method achieves state-of-the-art Dice score coefficients.
Impact: This work opens up new avenues of research in unsupervised anomaly detection on T1CE images, which has been infeasible due to the lack of healthy post-contrast images. We also propose a novel contrastive learning paradigm based on the synthesis of healthy-anomalous image pairs.
Introduction
Unsupervised anomaly detection (UAD) is the process in which the healthy image distribution is learned during training from unlabelled datasets. At test time, this distribution is used to localize anomalies such as tumors and motion artifacts. Recently, UAD has gained traction as it eliminates the need for large labeled datasets. Traditionally, UAD in brain MRI [1] has been performed on non-contrast images (T2w, FLAIR), which show hyperintensities around the pathology and are available for healthy individuals. Performing UAD on T1 contrast-enhanced (T1CE) images is a challenge, as they are prescribed only to patients with suspected tumors. We propose a novel UAD approach by synthesizing healthy T1CE images from non-contrast images. We further refine the variational auto-encoder (VAE) learning paradigm through contrastive learning, where the network is trained to push the manifolds of healthy and anomalous slices as far apart as possible.
Methods
Healthy T1CE Synthesis
We synthesized healthy T1CE images from non-contrast T1w and T2w images of healthy individuals available in the IXI dataset [3], by using the vision transformer based MMT network [4]. This data was used to learn the distribution of healthy T1CE images.
VAE Architecture
The overall architecture of the VAE network [5] is shown in Fig1.
Synthetic Tumor Data
For the contrastive learning approach, each healthy image needs a corresponding anomalous pair. We therefore created realistic synthetic tumors on the healthy images using the BraTS 2021 [7,8] training dataset, which includes ground truth tumor masks. Let $$$x \in \mathbb{R}^{W \times H}$$$ be a healthy image from the training set, $$$y \in \mathbb{R}^{W \times H}$$$ an anomalous BraTS image, and $$$M \in \{0,1\}^{W \times H}$$$ the binary tumor mask. The synthetic image is generated as follows:
$$\tilde{x} = M \cdot y + (1 -M) \cdot x$$
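The compositing step above can be sketched in a few lines of NumPy. This is only an illustration of the equation, not the authors' implementation: the function name `synthesize_anomalous` and the toy arrays are hypothetical, and any extra steps needed to make the tumors look realistic (for example intensity matching or registration) are not covered.

```python
import numpy as np


def synthesize_anomalous(x: np.ndarray, y: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Paste the masked tumor region of y onto the healthy image x."""
    if not (x.shape == y.shape == mask.shape):
        raise ValueError("x, y and mask must share the same W x H shape")
    m = mask.astype(bool)
    # Element-wise form of x_tilde = M * y + (1 - M) * x for a binary M.
    return np.where(m, y, x)


# Toy 4 x 4 example; the arrays are made up for illustration only.
x = np.zeros((4, 4), dtype=np.float32)        # stands in for a healthy slice
y = np.full((4, 4), 5.0, dtype=np.float32)    # stands in for an anomalous slice
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1                            # binary tumor mask
x_tilde = synthesize_anomalous(x, y, mask)
```

Inside the mask the result takes its values from `y`; everywhere else it keeps the healthy values of `x`.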
Contrastive training
To push the distributions of healthy and unhealthy slices as far apart as possible, we used a contrastive learning approach in which the network was forced to convert anomalous slices to healthy slices and to keep healthy slices unchanged.
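One way to read this training scheme is as a paired dataset in which every target is a healthy slice. The sketch below assumes that interpretation; `make_training_pairs` is a hypothetical helper, and the batching and sampling actually used in this work are not specified in the text.

```python
import numpy as np


def make_training_pairs(healthy: np.ndarray, anomalous: np.ndarray):
    """Return (inputs, targets) so that every target is the healthy slice.

    healthy, anomalous: arrays of shape (N, W, H), where anomalous[i] is the
    synthetic-tumor version of healthy[i].
    """
    if healthy.shape != anomalous.shape:
        raise ValueError("healthy and anomalous stacks must have the same shape")
    # Anomalous input -> healthy target, healthy input -> same healthy target.
    inputs = np.concatenate([anomalous, healthy], axis=0)
    targets = np.concatenate([healthy, healthy], axis=0)
    return inputs, targets


# Toy data for illustration only.
healthy = np.zeros((2, 4, 4), dtype=np.float32)
anomalous = healthy.copy()
anomalous[:, 1:3, 1:3] = 5.0
inputs, targets = make_training_pairs(healthy, anomalous)
```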
Loss function
We used a VAE that encodes input images to a lower-dimensional manifold through latent variables. The decoder then reconstructs the sample from the latent representation, where the distribution is parameterized by the latent variables. The network was trained by minimizing the KL-divergence between the prior and the learnt distributions. Furthermore, a pixel-wise reconstruction loss was used to make the network generate high-quality healthy images. With N training samples, $$$x_i$$$ being the input and $$$\hat{x}_i$$$ being the output, the loss function for model training has two components: the VAE (KL-divergence) loss and the L1 reconstruction loss. The total loss of the network is
$$\mathcal{L}_{G} = L_1(x, \hat{x}) + \lambda_{KL}D_{KL}(q(z) || p(z))$$
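As a minimal sketch of this loss, the snippet below assumes a diagonal Gaussian posterior with mean `mu` and log-variance `logvar` and a standard normal prior, for which the KL term has a closed form. These distributional choices, the function name `vae_loss`, and the example value of `lambda_kl` are assumptions for illustration and are not stated in the text.

```python
import numpy as np


def vae_loss(x, x_hat, mu, logvar, lambda_kl=1.0):
    """L1 reconstruction loss plus weighted KL divergence.

    x, x_hat: arrays of shape (N, W, H), target and reconstruction.
    mu, logvar: arrays of shape (N, D), parameters of the assumed Gaussian
    posterior q(z) = N(mu, diag(exp(logvar))); the prior p(z) is N(0, I).
    """
    # Pixel-wise L1 loss, averaged over all pixels and samples.
    l1 = np.mean(np.abs(x - x_hat))
    # Closed-form KL(q || p) per sample, then averaged over the batch.
    kl_per_sample = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar), axis=1)
    kl = np.mean(kl_per_sample)
    return l1 + lambda_kl * kl


# Toy values for illustration only.
x = np.ones((2, 4, 4))
x_hat = np.full((2, 4, 4), 0.5)
mu = np.ones((2, 3))
logvar = np.zeros((2, 3))
loss = vae_loss(x, x_hat, mu, logvar, lambda_kl=0.1)
```

In the contrastive setting, `x` would be the healthy target slice, so an anomalous input is penalized for any tumor signal that survives in `x_hat`.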
Dataset
For synthesizing healthy T1CE images, we used the IXI public dataset [3], which consists of non-contrast T1w and T2w images from 600 healthy individuals. The images were skull-stripped [6], centered using a bounding box, and resized to make them compatible with the MMT network [4]. For testing the UAD model, 50 random subjects from the BraTS 2021 validation dataset were used.
Experiments
For comparative analysis, we trained a variant of the VAE model with only the healthy images (UAD-VAE). The proposed model is the VAE with the contrastive learning approach (Contrastive-VAE).
Post Processing
Fig4 shows the overall post-processing pipeline. The difference between the input and the VAE output is first computed, and an eroded brain mask is applied to it to remove hyperintensities at the brain-skull boundary. Additional morphological operations are then applied, followed by thresholding. The final UAD mask is obtained after removing all connected components smaller than 20 pixels.
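The pipeline can be approximated with the NumPy-only sketch below. Several details are assumptions rather than the authors' settings: only positive residuals are kept, erosion uses a 4-connected structuring element with `erosion_iters=2`, connectivity for component removal is 4-connected, the threshold is a free parameter, and the unspecified "additional morphological operations" are omitted.

```python
from collections import deque

import numpy as np


def erode(mask: np.ndarray, iterations: int = 1) -> np.ndarray:
    """Binary erosion with a 4-connected (cross-shaped) structuring element."""
    m = mask.astype(bool)
    for _ in range(iterations):
        p = np.pad(m, 1, constant_values=False)
        m = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
             & p[1:-1, :-2] & p[1:-1, 2:])
    return m


def remove_small_components(binary: np.ndarray, min_size: int = 20) -> np.ndarray:
    """Keep only 4-connected components with at least min_size pixels."""
    binary = binary.astype(bool)
    keep = np.zeros_like(binary)
    seen = np.zeros_like(binary)
    h, w = binary.shape
    for r, c in zip(*np.nonzero(binary)):
        if seen[r, c]:
            continue
        seen[r, c] = True
        queue, component = deque([(r, c)]), []
        while queue:  # breadth-first flood fill of one component
            i, j = queue.popleft()
            component.append((i, j))
            for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if 0 <= ni < h and 0 <= nj < w and binary[ni, nj] and not seen[ni, nj]:
                    seen[ni, nj] = True
                    queue.append((ni, nj))
        if len(component) >= min_size:
            rows, cols = zip(*component)
            keep[list(rows), list(cols)] = True
    return keep


def postprocess(image, recon, brain_mask, threshold, erosion_iters=2, min_size=20):
    """Residual -> eroded brain mask -> threshold -> small-component removal."""
    residual = np.clip(image - recon, 0.0, None)  # assumption: positive residuals only
    residual = residual * erode(brain_mask, erosion_iters)
    return remove_small_components(residual > threshold, min_size)


# Toy 32 x 32 example; all values are made up for illustration.
image = np.zeros((32, 32))
image[10:16, 10:16] = 1.0   # 36-pixel blob, should survive
image[20:22, 20:22] = 1.0   # 4-pixel blob, should be removed
image[2, 2:30] = 1.0        # bright rim on the brain-mask boundary
recon = np.zeros((32, 32))
brain_mask = np.zeros((32, 32), dtype=bool)
brain_mask[2:30, 2:30] = True
uad_mask = postprocess(image, recon, brain_mask, threshold=0.5)
```

In practice, library routines such as those in `scipy.ndimage` would likely replace the hand-written erosion and labelling; they are written out here only to keep the sketch dependency-light.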
Results
The average Dice score coefficient (DSC) on the 50 test subjects was 0.3996. Fig5 also shows the qualitative performance of the proposed Contrastive-VAE in comparison to the UAD-VAE variant.
Discussions
Qualitative and quantitative results show that introducing synthetic anomalies in the training data can force VAEs to learn better latent representations. Future work includes analyzing the effect of the Gaussian distribution constraint on the latent variables by replacing the VAE with a UNet. The proposed method can be used to design an anomaly-aware loss function for low-dose [9] or zero-dose [4] T1CE synthesis. This method can also be used to generate areas of interest on T1CE images, thus helping radiologists read and triage more efficiently.
Conclusion
A novel contrastive learning paradigm has been proposed that can help train a network to translate anomalous images to their healthy counterparts, thus helping in unsupervised tumor detection.
Acknowledgements
We would like to acknowledge the grant support of NIH R44EB027560.
References
- Baur, Christoph, et al. "Autoencoders for unsupervised anomaly segmentation in brain MR images: a comparative study." Medical Image Analysis 69 (2021): 101952.
- Lagogiannis, Ioannis, et al. "Unsupervised Pathology Detection: A Deep Dive Into the State of the Art." arXiv preprint arXiv:2303.00609 (2023).
- https://brain-development.org/ixi-dataset/
- Liu, J., Pasumarthi, S., Duffy, B., Gong, E., Datta, K., & Zaharchuk, G. (2023). One model to synthesize them all: Multi-contrast multi-scale transformer for missing data imputation. IEEE Transactions on Medical Imaging.
- Kingma, Diederik P., and Max Welling. "Auto-encoding variational bayes." arXiv preprint arXiv:1312.6114 (2013).
- Isensee F, Schell M, Tursunova I, Brugnara G, Bonekamp D, Neuberger U, Wick A, Schlemmer HP, Heiland S, Wick W, Bendszus M, Maier-Hein KH, Kickingereder P. Automated brain extraction of multi-sequence MRI using artificial neural networks. Hum Brain Mapp. 2019; 1–13. https://doi.org/10.1002/hbm.24750
- U. Baid, et al. The RSNA-ASNR-MICCAI BraTS 2021 Benchmark on Brain Tumor Segmentation and Radiogenomic Classification. arXiv:2107.02314, 2021.
- B. H. Menze, A. Jakab, S. Bauer, J. Kalpathy-Cramer, K. Farahani, J. Kirby, et al. The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS), IEEE Transactions on Medical Imaging 34(10), 1993-2024 (2015) DOI: 10.1109/TMI.2014.2377694
- Pasumarthi, S., Tamir, J. I., Christensen, S., Zaharchuk, G., Zhang, T., & Gong, E. (2021). A generic deep learning model for reduced gadolinium dose in contrast-enhanced brain MRI. Magnetic Resonance in Medicine, 86(3), 1687-1700.