1762

A Contrastive Learning Approach for Unsupervised Anomaly Detection on Contrast-Enhanced Brain MRI Images

Srivathsa Pasumarthi¹, Sidharth Kumar², and Ryan Chamberlain¹
¹R&D, Subtle Medical Inc, Menlo Park, CA, United States, ²Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, United States

Synopsis

Keywords: Analysis/Processing, Brain, Unsupervised Anomaly Detection

Motivation: Unsupervised anomaly detection (UAD) approaches on T1 contrast-enhanced (T1CE) images are currently not feasible as T1CE images of healthy individuals are not typically available.

Goal(s): In this work, we aim to eliminate the need for the large, manually labeled datasets otherwise required for anomaly detection on T1CE images.

Approach: Using deep learning (DL), we synthesized healthy T1CE images from non-contrast images available in public datasets. We also synthesized healthy-anomalous paired images and forced the DL network to learn the healthy reconstruction. The anomalies were localized by subtracting the reconstruction from the input image.

Results: The proposed method achieves state-of-the-art Dice score coefficients, with an average of 0.3996 on 50 test subjects.

Impact: This work opens up new avenues of research in unsupervised anomaly detection on T1CE images, which has been infeasible due to the lack of healthy post-contrast images. We also propose a novel contrastive learning paradigm based on the synthesis of healthy-anomalous image pairs.

Introduction

Unsupervised anomaly detection (UAD) learns the distribution of healthy images from unlabelled datasets during training. At test time, this distribution is used to localize anomalies such as tumors and motion artifacts. UAD has recently gained traction because it eliminates the need for large labeled datasets. Traditionally, UAD in brain MRI [1] has been performed on non-contrast images (T2w, FLAIR), which show hyperintensities around the pathology and are available for healthy individuals. Performing UAD on T1 contrast-enhanced (T1CE) images is challenging because these scans are prescribed only to patients with suspected tumors. We propose a novel UAD approach that synthesizes healthy T1CE images from non-contrast images. We further refine the variational auto-encoder (VAE) learning paradigm through contrastive learning, in which the network is trained to push the manifolds of healthy and anomalous slices as far apart as possible.

Methods

Healthy T1CE Synthesis
We synthesized healthy T1CE images from non-contrast T1w and T2w images of healthy individuals available in the IXI dataset [3], using the vision-transformer-based MMT network [4] (Fig. 2). The synthesized images were used to learn the distribution of healthy T1CE images.

VAE Architecture
The overall architecture of the VAE network [5] is shown in Fig. 1.

Synthetic Tumor Data
For the contrastive learning approach, each healthy image needs a corresponding anomalous counterpart. We created realistic synthetic tumors on the healthy images using the BraTS 2021 [7,8] training dataset, which provides ground truth tumor masks (Fig. 3a). Let $$$x \in \mathbb{R}^{W \times H}$$$ be a healthy image from the training set, $$$y \in \mathbb{R}^{W \times H}$$$ the anomalous BraTS image, and $$$M \in \{0,1\}^{W \times H}$$$ the binary tumor mask. The synthetic image is generated by element-wise blending:

$$\tilde{x} = M \cdot y + (1 - M) \cdot x$$
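The blending equation above can be illustrated with a minimal NumPy sketch. The function name `synthesize_anomalous` and the toy 4x4 arrays are hypothetical and only serve the illustration; the sketch assumes the BraTS slice has already been preprocessed to the same size and intensity range as the healthy image.

```python
import numpy as np

def synthesize_anomalous(x, y, mask):
    """Blend tumor pixels of y into x: x_tilde = M * y + (1 - M) * x (element-wise)."""
    if not (x.shape == y.shape == mask.shape):
        raise ValueError("x, y and mask must share the same W x H shape")
    m = mask.astype(x.dtype)          # binary mask as 0/1 floats
    return m * y + (1.0 - m) * x

# Toy example (hypothetical data): healthy image of zeros, anomalous image of ones.
x = np.zeros((4, 4), dtype=np.float32)
y = np.ones((4, 4), dtype=np.float32)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True                 # 2x2 "tumor" region
x_tilde = synthesize_anomalous(x, y, mask)
```

Inside the mask the result copies the anomalous image, and everywhere else it keeps the healthy image unchanged.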

Contrastive training
To push the distributions of healthy and unhealthy slices as far apart as possible, we used a contrastive learning approach in which the network was forced to convert anomalous slices to healthy slices and to keep healthy slices unchanged (Fig. 3b).
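One way to read this training setup is as a set of (input, target) pairs in which both a healthy slice and its synthetic anomalous counterpart map to the same healthy target. The sketch below only illustrates that pairing; the function name `build_training_pairs`, the array layout (N, H, W), and the equal mix of healthy and anomalous inputs are assumptions, not details given in the abstract.

```python
import numpy as np

def build_training_pairs(healthy, synthetic_anomalous):
    """Return (inputs, targets) where every target is the healthy slice.

    healthy:             array of shape (N, H, W)
    synthetic_anomalous: array of shape (N, H, W); slice i is derived from healthy[i]
    """
    if healthy.shape != synthetic_anomalous.shape:
        raise ValueError("paired arrays must have identical shapes")
    # Healthy inputs should be reconstructed as is; anomalous inputs
    # should be mapped back to their healthy counterparts.
    inputs = np.concatenate([healthy, synthetic_anomalous], axis=0)
    targets = np.concatenate([healthy, healthy], axis=0)
    return inputs, targets

# Toy example with random stand-in data (hypothetical).
rng = np.random.default_rng(0)
healthy = rng.random((3, 8, 8))
anomalous = healthy.copy()
anomalous[:, 2:4, 2:4] = 1.0          # stand-in for pasted tumor signal
inputs, targets = build_training_pairs(healthy, anomalous)
```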

Loss function
We used a VAE that encodes input images to a lower-dimensional manifold through latent variables. The decoder then reconstructs the sample from the latent representation, where the distribution is parameterized by the latent variables. The network was trained by minimizing the KL divergence between the learnt distribution $$$q(z)$$$ and the prior $$$p(z)$$$. In addition, a pixel-wise reconstruction loss was used to make the network generate high-quality healthy images. With N training samples, $$$x_i$$$ being the input and $$$\hat{x}_i$$$ being the output, the loss function for model training has two components: the VAE (KL) loss and the reconstruction L1 loss. The total loss of the network, with $$$\lambda_{KL}$$$ weighting the KL term, is
$$\mathcal{L}_{G} = L_1(x, \hat{x}) + \lambda_{KL} D_{KL}(q(z) \,\|\, p(z))$$
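As a rough numerical illustration of the total loss, the sketch below assumes a diagonal Gaussian posterior and a standard normal prior, the usual choice in [5], for which the KL term has a closed form. The function names, the value of `lambda_kl`, and the reductions (mean over pixels, sum over latent dimensions, mean over the batch) are assumptions; the abstract does not specify them. In the contrastive setup, `x` would be the healthy target rather than the anomalous input.

```python
import numpy as np

def l1_loss(x, x_hat):
    """Mean absolute error between target and reconstruction."""
    return float(np.mean(np.abs(x - x_hat)))

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), averaged over the batch.

    mu, logvar: arrays of shape (batch, latent_dim)
    """
    kl_per_sample = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1)
    return float(np.mean(kl_per_sample))

def total_loss(x, x_hat, mu, logvar, lambda_kl=1e-3):
    """L_G = L1(x, x_hat) + lambda_KL * D_KL(q(z) || p(z)); lambda_kl is a placeholder."""
    return l1_loss(x, x_hat) + lambda_kl * kl_to_standard_normal(mu, logvar)

# Toy example (hypothetical values).
x = np.ones((2, 4, 4))
x_hat = np.zeros((2, 4, 4))
mu = np.ones((2, 3))
logvar = np.zeros((2, 3))
loss = total_loss(x, x_hat, mu, logvar, lambda_kl=0.1)
```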

Dataset
For synthesizing healthy T1CE images, we used the IXI public dataset [3], which consists of non-contrast T1w and T2w images from 600 healthy individuals. The images were skull-stripped [6], centered using a bounding box, and resized to make them compatible with the MMT network [4]. For testing the UAD model, 50 random subjects from the BraTS 2021 validation dataset were used.

Experiments
For comparative analysis, we trained a variant of the VAE model with only the healthy images (UAD-VAE). The proposed model is the VAE with the contrastive learning approach (Contrastive-VAE).

Post Processing
Fig. 4 shows the overall post-processing pipeline. The difference between the input and the VAE output is first computed, and an eroded brain mask is applied to it to remove hyperintensities at the brain-skull boundary. Additional morphological operations are then applied, followed by thresholding. The final UAD mask is obtained after removing all connected components smaller than 20 pixels.
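The pipeline described above could be sketched as follows using `scipy.ndimage`. Only the order of the steps and the 20-pixel component size come from the text; the brain mask taken as `image > 0` on a skull-stripped slice, the number of erosion iterations, the 3x3 grey opening standing in for the unspecified "additional morphological operations", and the threshold value are all assumptions.

```python
import numpy as np
from scipy import ndimage

def postprocess(image, recon, threshold=0.2, erosion_iters=3, min_size=20):
    """Turn an input/reconstruction pair into a binary anomaly mask."""
    diff = image - recon                                  # residual: input minus output
    brain = image > 0                                     # assumed brain mask
    eroded = ndimage.binary_erosion(brain, iterations=erosion_iters)
    diff = diff * eroded                                  # suppress brain-skull boundary
    diff = ndimage.grey_opening(diff, size=(3, 3))        # assumed morphological cleanup
    binary = diff > threshold                             # assumed threshold
    labels, _ = ndimage.label(binary)                     # connected components
    sizes = np.bincount(labels.ravel())                   # sizes[0] is the background
    keep = sizes >= min_size
    keep[0] = False
    return keep[labels]

# Toy example (hypothetical data): a 10x10 "tumor" and a 4x4 speck.
image = np.zeros((64, 64))
image[8:56, 8:56] = 0.5
image[20:30, 20:30] = 1.0
image[40:44, 40:44] = 1.0
recon = np.zeros((64, 64))
recon[8:56, 8:56] = 0.5                                   # idealized healthy reconstruction
mask = postprocess(image, recon)
```

In this toy case the 16-pixel speck survives thresholding but is dropped by the component-size filter, while the 100-pixel region is kept.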

Results

The average Dice score coefficient (DSC) on the 50 test subjects was 0.3996. Fig. 5 also shows the qualitative performance of the proposed Contrastive-VAE in comparison to the UAD-VAE variant.
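For reference, the Dice score coefficient between a predicted mask P and a ground truth mask G is 2|P ∩ G| / (|P| + |G|). A minimal sketch is given below; the function name and the convention of returning 1.0 when both masks are empty are assumptions, and the abstract does not state whether the score was averaged per slice or per subject volume.

```python
import numpy as np

def dice_score(pred, truth):
    """Dice coefficient 2|P ∩ G| / (|P| + |G|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0                    # assumed convention: two empty masks agree
    return 2.0 * np.logical_and(pred, truth).sum() / denom

# Toy example (hypothetical masks): 4 predicted pixels, 4 true pixels, 2 shared.
pred = np.zeros((4, 4), dtype=bool)
pred[0, :] = True
truth = np.zeros((4, 4), dtype=bool)
truth[0:2, 2:] = True
score = dice_score(pred, truth)       # 2 * 2 / (4 + 4) = 0.5
```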

Discussions

Qualitative and quantitative results show that introducing synthetic anomalies in the training data can force VAEs to learn better latent representations. Future work includes analyzing the effect of the Gaussian distribution constraint on the latent variables by replacing the VAE with a UNet. The proposed method can be used to design an anomaly-aware loss function for low-dose [9] or zero-dose [4] T1CE synthesis. It can also be used to generate areas of interest on T1CE images, thus helping radiologists read and triage more efficiently.

Conclusion

A novel contrastive learning paradigm has been proposed that can help train a network to translate anomalous images to their healthy counterparts, thus helping in unsupervised tumor detection.

Acknowledgements

We would like to acknowledge the grant support of NIH R44EB027560.

References

[1] Baur, Christoph, et al. "Autoencoders for unsupervised anomaly segmentation in brain MR images: a comparative study." Medical Image Analysis 69 (2021): 101952.
[2] Lagogiannis, Ioannis, et al. "Unsupervised Pathology Detection: A Deep Dive Into the State of the Art." arXiv preprint arXiv:2303.00609 (2023).
[3] IXI dataset. https://brain-development.org/ixi-dataset/
[4] Liu, J., Pasumarthi, S., Duffy, B., Gong, E., Datta, K., & Zaharchuk, G. (2023). One model to synthesize them all: Multi-contrast multi-scale transformer for missing data imputation. IEEE Transactions on Medical Imaging.
[5] Kingma, Diederik P., and Max Welling. "Auto-encoding variational bayes." arXiv preprint arXiv:1312.6114 (2013).
[6] Isensee F, Schell M, Tursunova I, Brugnara G, Bonekamp D, Neuberger U, Wick A, Schlemmer HP, Heiland S, Wick W, Bendszus M, Maier-Hein KH, Kickingereder P. Automated brain extraction of multi-sequence MRI using artificial neural networks. Hum Brain Mapp. 2019; 1–13. https://doi.org/10.1002/hbm.24750
[7] U. Baid, et al. The RSNA-ASNR-MICCAI BraTS 2021 Benchmark on Brain Tumor Segmentation and Radiogenomic Classification. arXiv:2107.02314, 2021.
[8] B. H. Menze, A. Jakab, S. Bauer, J. Kalpathy-Cramer, K. Farahani, J. Kirby, et al. The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS). IEEE Transactions on Medical Imaging 34(10), 1993-2024 (2015). DOI: 10.1109/TMI.2014.2377694
[9] Pasumarthi, S., Tamir, J. I., Christensen, S., Zaharchuk, G., Zhang, T., & Gong, E. (2021). A generic deep learning model for reduced gadolinium dose in contrast‐enhanced brain MRI. Magnetic Resonance in Medicine, 86(3), 1687-1700.

Figures

Figure 1: Overview of the VAE model trained to learn the input distribution

Figure 2: Overview of the healthy T1CE synthesis process using the MMT network. Difference image between the T1CE and the T1w image shows that the T1CE synthesis is accurate and realistic.

Figure 3: a) Overview of the synthetic tumor image generation. The BraTS slice with tumor is first preprocessed to have similar size and signal intensity as the T1CE. Then, the signals corresponding to the tumor mask are selected from the BraTS image and added to the brain signals from the T1CE. b) Overview of the proposed contrastive learning approach.

Figure 4: Post-processing steps for anomaly mask generation. The input image is shown in the top left corner with the tumor highlighted by a red circle. The top row shows the output image, the brain mask calculated from the input image, and the ground truth tumor mask. The bottom row shows the difference image and the subsequent images as the different post-processing steps are applied. The difference image is first multiplied by the eroded brain mask before thresholding. The Dice similarity coefficient (DSC) was calculated to show the similarity between the generated and ground truth tumor masks.

Figure 5: Qualitative results for the unsupervised anomaly detection network, with each row showing a different slice. The 1st column is the input image to the network, the 2nd column shows the output image of the UAD-VAE trained network, and the 3rd column shows the corresponding difference image (x5). The 4th column shows the output of the same VAE trained using the contrastive setup, and the 5th column is the corresponding difference image. The 6th column shows the predicted segmentation map, with the Dice score given as inline text, and the last column is the ground truth map.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)

DOI: https://doi.org/10.58530/2024/1762