1963

Application of Swin Transformer for Alzheimer’s Disease Classification on Structural MRI and FDG-PET Brain Scans Using ADNI Data

Chiara Weber¹, Jakob Seeger¹, Ben Isselmann¹, Matthias Günther²,³,⁴, Andreas Weinmann¹, and Johannes Gregori¹,²
¹Department of Mathematics and Natural Sciences, University of Applied Sciences Darmstadt, Darmstadt, Germany; ²mediri GmbH, Heidelberg, Germany; ³Fraunhofer MEVIS, Bremen, Germany; ⁴University of Bremen, Bremen, Germany

Synopsis

Keywords: Diagnosis/Prediction, PET/MR, Brain

Motivation: The detection of Alzheimer’s Disease (AD) can be supported by automated computer vision solutions. Such solutions can potentially enable an earlier diagnosis and facilitate improved patient treatment.

Goal(s): Our goal is to apply a state-of-the-art deep learning approach to the field of AD diagnosis based on brain scans.

Approach: A pretrained Swin Transformer model is fine-tuned on FDG-PET and structural MRI brain scans to classify AD.

Results: Our model achieves a competitive area under the curve of 97.8% / 99.7% and an accuracy of 97.0% / 99.5% (MRI / PET) on independent test data.

Impact: We show how a modern deep neural network can be trained with reasonable effort while still achieving results comparable to established approaches. This procedure can lead the way towards classifying AD on more challenging modalities, such as ASL.

Introduction

With demographic change, the number of people developing Alzheimer’s Disease (AD) is expected to grow¹. The diagnostic process can be supported by deep neural networks trained to predict the disease from brain scans. This prediction can serve as an additional indicator for clinicians.
Recently, Transformer-based models have gained popularity for image classification tasks on benchmark datasets²,³,⁴ and can outperform more established approaches such as convolutional neural networks. First applications of the Swin Transformer⁵ in the medical field, such as tumor segmentation⁶, COVID-19 detection⁷, or lesion classification⁸, show promising results. However, these models require large amounts of training data to tune their many parameters. Since the available datasets do not meet this requirement, using a pretrained model is recommended⁴.
This work explores the application of such a model to preprocessed data, investigating the performance on two medical image modalities that cover distinct AD-related physiological processes: T1-weighted MR images showing structural changes and FDG-PET images indicating glucose metabolism rates.

Methods

Two datasets were created from the ADNI database⁹, containing MRI and FDG-PET data, respectively. The three diagnostic categories Cognitively Normal (CN), Mild Cognitive Impairment (MCI), and Alzheimer’s Disease (AD) were used as labels. The datasets were split into training, validation, and test sets, using an oversampling strategy to ensure a balanced class distribution. This led to 549 training samples (183 CN, 183 AD, 183 MCI) and 66 samples for test and validation (22 CN, 22 AD, 22 MCI) for each modality.
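The exact balancing procedure is not specified here. As a minimal sketch of one common form of random oversampling, the following duplicates randomly chosen samples of the smaller classes until every class matches the largest one. The function name `oversample_balanced` and the toy labels are hypothetical, and the sketch assumes balancing is applied within a single split so that copies of one scan cannot end up in both training and test data.

```python
import random
from collections import Counter, defaultdict

def oversample_balanced(samples, labels, seed=0):
    """Randomly duplicate samples until every class matches the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = max(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        # Draw the missing number of samples with replacement from this class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((s, label) for s in items + extra)
    rng.shuffle(balanced)
    return balanced

# Toy example with an imbalanced label distribution (hypothetical data).
scan_ids = list(range(10))
diagnoses = ["CN"] * 5 + ["MCI"] * 3 + ["AD"] * 2
balanced = oversample_balanced(scan_ids, diagnoses)
counts = Counter(label for _, label in balanced)
print(counts)
```

Every original sample is kept, and only the minority classes receive duplicates.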
For data preparation, a preprocessing pipeline based on SPM12¹⁰ was implemented. It included registration to MNI space and resampling to a voxel size of 1.5 × 1.5 × 1.5 mm³, followed by an intensity normalization (SUVR) for FDG-PET¹¹. To make use of a model pretrained on 2D images, a mosaic representation was prepared by slicing through the volume in z-direction and selecting 16 equidistant tiles of 121 × 121 pixels each. The data was augmented by adding random noise, smoothing, and normalization.
The model was composed of a Swin-B V2¹² model pretrained on ImageNet1K V1 and a subsequent classifier consisting of two linear layers separated by a ReLU layer, followed by a softmax activation. It was trained for 200 epochs on an NVIDIA A40 graphics card using a batch size of 15, an initial learning rate of 10⁻⁵, and an Adam optimizer.
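The mosaic construction can be sketched as follows. This is an illustration under assumptions, not the pipeline used in this work: the 4 × 4 grid layout, the inclusion of the first and last slice, and the function name `volume_to_mosaic` are hypothetical, and the input is assumed to be already cropped or resampled to 121 × 121 pixels in-plane.

```python
import numpy as np

def volume_to_mosaic(volume, n_tiles=16, grid=(4, 4)):
    """Arrange n_tiles equidistant axial slices of a 3D volume into one 2D image."""
    rows, cols = grid
    if rows * cols != n_tiles:
        raise ValueError("grid must hold exactly n_tiles tiles")
    depth = volume.shape[2]
    # Equidistant slice indices along z (assumption: first and last slice included).
    z_idx = np.linspace(0, depth - 1, n_tiles).round().astype(int)
    tiles = [volume[:, :, z] for z in z_idx]
    # Concatenate tiles row by row into the grid.
    mosaic_rows = [np.hstack(tiles[r * cols:(r + 1) * cols]) for r in range(rows)]
    return np.vstack(mosaic_rows)

# Placeholder volume standing in for a preprocessed scan (random data).
volume = np.random.rand(121, 121, 121).astype(np.float32)
mosaic = volume_to_mosaic(volume)
print(mosaic.shape)  # (484, 484)
```

In practice the outermost slices contain little brain tissue, so restricting the z-range before selecting tiles may be preferable.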
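To illustrate the shape of the classifier head described above, the following NumPy sketch runs a forward pass through two linear layers separated by a ReLU and followed by a softmax. It is not the trained model: the weights are random, the hidden width of 256 is a hypothetical choice, and the head is assumed to receive 1024-dimensional pooled features from the Swin-B backbone.

```python
import numpy as np

def classifier_head(features, w1, b1, w2, b2):
    """Linear -> ReLU -> Linear -> softmax, applied to a batch of feature vectors."""
    hidden = np.maximum(features @ w1 + b1, 0.0)          # first linear layer + ReLU
    logits = hidden @ w2 + b2                              # second linear layer
    logits = logits - logits.max(axis=1, keepdims=True)    # shift for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)            # softmax over classes

rng = np.random.default_rng(0)
feat_dim, hidden_dim, n_classes = 1024, 256, 3   # hidden_dim is an assumption
w1 = rng.normal(0.0, 0.02, (feat_dim, hidden_dim))
b1 = np.zeros(hidden_dim)
w2 = rng.normal(0.0, 0.02, (hidden_dim, n_classes))
b2 = np.zeros(n_classes)

features = rng.normal(size=(15, feat_dim))       # one batch of 15 feature vectors
probs = classifier_head(features, w1, b1, w2, b2)
print(probs.shape)  # (15, 3)
```

Each row of `probs` holds the predicted probabilities for CN, MCI, and AD and sums to one.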
A feature representation was computed by calculating t-SNE plots from the outputs of the final Swin layer. In addition, Grad-CAM overlays were generated from the output of the last two normalization layers in the Swin model.
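The core Grad-CAM computation can be sketched in a few lines. The sketch assumes that the activations of the chosen layer and the gradients of the class score with respect to them have already been extracted and reshaped from tokens to a spatial grid of shape (H, W, C); how this was done in this work is not described, and the function name `grad_cam` is hypothetical.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM map for one image and one class.

    activations, gradients: arrays of shape (H, W, C).
    """
    # Channel weights: spatial average of the gradients.
    weights = gradients.mean(axis=(0, 1))
    # Weighted sum over channels, keeping only positive evidence.
    cam = np.maximum((activations * weights).sum(axis=2), 0.0)
    # Scale to [0, 1] for overlaying on the input image.
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam

# Tiny hand-made example: channel 0 supports the class, channel 1 opposes it.
activations = np.zeros((2, 2, 2))
activations[0, 0, 0] = 2.0
activations[1, 1, 1] = 1.0
gradients = np.stack([np.ones((2, 2)), -np.ones((2, 2))], axis=2)
cam = grad_cam(activations, gradients)
print(cam)
```

The resulting low-resolution map is typically upsampled to the size of the input mosaic before being displayed as an overlay.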

Results

Our model showed promising results on both modalities (Figure 1): an area under the curve of 97.8% / 99.7%, an accuracy of 97.0% / 99.5%, a sensitivity of 100% / 96.5%, and a specificity of 95.7% / 100% (MRI / FDG-PET). In particular, we achieved a competitive performance on FDG-PET compared to results obtained with the Inception V3 architecture¹⁴ (ROC AUC 98%). On MRI, our approach ranks slightly behind current state-of-the-art solutions such as AlzheimerNet¹³ (ACC 98.7%).
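For reference, accuracy, sensitivity, and specificity can be computed from predicted and true labels as sketched below. How the three-class predictions were reduced to a single sensitivity and specificity value is not stated here, so the one-vs-rest treatment of a chosen positive class is an assumption, and the labels in the example are made up.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Overall accuracy plus sensitivity/specificity for one class versus the rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

# Made-up labels for illustration only.
y_true = ["AD", "AD", "CN", "CN", "MCI", "MCI"]
y_pred = ["AD", "MCI", "CN", "CN", "MCI", "AD"]
metrics = one_vs_rest_metrics(y_true, y_pred, positive="AD")
print(metrics)
```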

Discussion

These preliminary results were achieved with a standard data preparation procedure and a minimal extension of the model’s architecture. However, there is still room for improvement. For example, the volumetric data representation via 2D mosaics could be optimized by augmenting the slice positions or raising the tile count, and the size of the datasets could be increased.
The features extracted by the model show good clustering in the t-SNE plot (Figure 2), which indicates the model’s ability to extract relevant information for class separation. This is an improvement over the Inception model¹⁴. The CAM visualizations (Figure 3) demonstrate how the model can attend to separate tiles in the mosaic and even considers particular brain regions within these slices, which will be a topic for further investigation.
Given further development, this potential makes the Swin model a candidate to become a state-of-the-art approach not only in the specific domain of AD classification but also in the larger scope of medical image analysis.

Conclusion

In this work, the application of the Swin Transformer for AD classification on structural MRI and FDG-PET was examined. While first results are already promising, more experiments have to be conducted to determine the model’s full potential in this domain. The use of this recent, pretrained architecture has the potential to deliver good classification results while being relatively easy and fast to train, which might lower the barrier to entry for further research. Future work will include investigating the model’s applicability to Arterial Spin Labeling (ASL) data within the Eurostars project ASPIRE¹⁵.

Acknowledgements

This research has been conducted in the scope of the "E! 113701 - ASPIRE" project funded by the Eurostars program via the Federal Ministry of Education and Research Germany, Innovate UK, and the Netherlands Enterprise Agency RvO.

References

1. Jia, Jianping, Cuibai Wei, Shuoqi Chen, Fangyu Li, Yi Tang, Wei Qin, Lina Zhao, et al. “The Cost of Alzheimer’s Disease in China and Re-Estimation of Costs Worldwide.” Alzheimer’s & Dementia 14, no. 4 (April 1, 2018): 483–91. https://doi.org/10.1016/j.jalz.2017.12.006.

2. Dimitrovski, Ivica, Ivan Kitanovski, Dragi Kocev, and Nikola Simidjievski. “Current Trends in Deep Learning for Earth Observation: An Open-Source Benchmark Arena for Image Classification.” ISPRS Journal of Photogrammetry and Remote Sensing 197 (March 1, 2023): 18–35. https://doi.org/10.1016/j.isprsjprs.2023.01.014.

3. Bhojanapalli, Srinadh, Ayan Chakrabarti, Daniel Glasner, Daliang Li, Thomas Unterthiner, and Andreas Veit. “Understanding Robustness of Transformers for Image Classification.” In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 10231–41, 2021. https://openaccess.thecvf.com/content/ICCV2021/html/Bhojanapalli_Understanding_Robustness_of_Transformers_for_Image_Classification_ICCV_2021_paper.html.

4. Ma, DongAo, Mohammad Reza Hosseinzadeh Taher, Jiaxuan Pang, Nahid Ul Islam, Fatemeh Haghighi, Michael B. Gotway, and Jianming Liang. “Benchmarking and Boosting Transformers for Medical Image Classification.” In Domain Adaptation and Representation Transfer, edited by Konstantinos Kamnitsas, Lisa Koch, Mobarakol Islam, Ziyue Xu, Jorge Cardoso, Qi Dou, Nicola Rieke, and Sotirios Tsaftaris, 12–22. Lecture Notes in Computer Science. Cham: Springer Nature Switzerland, 2022. https://doi.org/10.1007/978-3-031-16852-9_2.

5. Liu, Ze, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. “Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows.” arXiv, August 17, 2021. https://doi.org/10.48550/arXiv.2103.14030.

6. Li, Gary Y., Junyu Chen, Se-In Jang, Kuang Gong, and Quanzheng Li. “SwinCross: Cross-Modal Swin Transformer for Head-and-Neck Tumor Segmentation in PET/CT Images.” arXiv, February 7, 2023. https://doi.org/10.48550/arXiv.2302.03861.

7. Tuncer, Ilknur, Prabal Datta Barua, Sengul Dogan, Mehmet Baygin, Turker Tuncer, Ru-San Tan, Chai Hong Yeong, and U. Rajendra Acharya. “Swin-Textural: A Novel Textural Features-Based Image Classification Model for COVID-19 Detection on Chest Computed Tomography.” Informatics in Medicine Unlocked 36 (January 1, 2023): 101158. https://doi.org/10.1016/j.imu.2022.101158.

8. Li, Yuheng, Boran Zhou, Jing Wang, Shaoyan Pan, Ashesh Jani, Tian Liu, Pretesh Patel, and Xiaofeng Yang. “Ultrasound-Based Dominant Intraprostatic Lesion Classification with Swin Transformer.” In Medical Imaging 2023: Ultrasonic Imaging and Tomography, 12470:146–51. SPIE, 2023. https://doi.org/10.1117/12.2653634.

9. “ADNI | Alzheimer’s Disease Neuroimaging Initiative.” Accessed November 6, 2023. https://adni.loni.usc.edu/

10. “SPM12 Software - Statistical Parametric Mapping.” Accessed November 6, 2023. https://www.fil.ion.ucl.ac.uk/spm/software/spm12/

11. Samper-González, Jorge, Ninon Burgos, Simona Bottani, Sabrina Fontanella, Pascal Lu, Arnaud Marcoux, Alexandre Routier, et al. “Reproducible Evaluation of Classification Methods in Alzheimer’s Disease: Framework and Application to MRI and PET Data.” NeuroImage 183 (December 1, 2018): 504–21. https://doi.org/10.1016/j.neuroimage.2018.08.042.

12. Liu, Ze, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, et al. “Swin Transformer V2: Scaling Up Capacity and Resolution.” arXiv, April 11, 2022. http://arxiv.org/abs/2111.09883.

13. Shamrat, F M Javed Mehedi, Shamima Akter, Sami Azam, Asif Karim, Pronab Ghosh, Zarrin Tasnim, Khan Md. Hasib, Friso De Boer, and Kawsar Ahmed. “AlzheimerNet: An Effective Deep Learning Based Proposition for Alzheimer’s Disease Stages Classification From Functional Brain Changes in Magnetic Resonance Images.” IEEE Access 11 (2023): 16376–95. https://doi.org/10.1109/ACCESS.2023.3244952.

14. Ding, Yiming, Jae Ho Sohn, Michael G. Kawczynski, Hari Trivedi, Roy Harnish, Nathaniel W. Jenkins, Dmytro Lituiev, et al. “A Deep Learning Model to Predict a Diagnosis of Alzheimer Disease by Using 18F-FDG PET of the Brain.” Radiology 290, no. 2 (February 2019): 456–64. https://doi.org/10.1148/radiol.2018180958.

15. “ASPIRE.” Accessed November 4, 2023. http://aspire-mri.eu/.

Figures

Figure 1: Metric evaluation on independent test data with a tile count of 16

Figure 2: t-SNE plot of features extracted from FDG-PET data

Figure 3: CAM overlays from a forward pass on FDG-PET data with labels CN (a), MCI (b) and AD (c)

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)

DOI: https://doi.org/10.58530/2024/1963