Synopsis
Keywords: Diagnosis/Prediction, Tumor, Hemangioblastoma, Contrastive Learning
Motivation: Brainstem and cerebellar hemangioblastoma (HB) is a rare tumor with a high risk of hemorrhage during biopsy. However, distinguishing HB from other types of intracranial tumors based solely on neuroimaging remains challenging.
Goal(s): To propose a computer-aided diagnosis method to classify hemangioblastoma.
Approach: We propose a patient-level classification framework using multi-task supervised contrastive learning, named LaSCL-PLC, for hemangioblastoma classification.
Results: We evaluated the proposed model on a local MRI dataset of brainstem-and-cerebellum tumors, consisting of 97 patients with HB and 143 patients with other tumors. The experimental results show that our model achieves performance comparable to that of neuroradiologists.
Impact: The method could improve the preoperative diagnosis of hemangioblastoma, which is crucial for clinical treatment.
Background
Hemangioblastoma (HB) is a rare tumor characterized by hypervascularity that occurs either sporadically (75%) or as a component of von Hippel-Lindau (VHL) disease (25%)[1]. Despite their benign histology, intracranial HBs are associated with a high risk of hemorrhage due to their vascular nature[2] and may lead to severe neurological complications or patient mortality. Kuharic et al.[3] reported a postoperative mortality rate of 10.3% among 1106 patients with solid HBs. Thus, accurate preoperative diagnosis plays an important role in selecting appropriate surgical plans and mitigating the risk of hemorrhagic complications.
Methods
As depicted in Fig.1, patient-level classification derives the final diagnosis for a patient by fusing information from several sources, such as demographic information, medical images, and other clinical data. In the proposed framework, we first extract features from the different data sources, standardize and concatenate them, and then train a classifier for the final prediction. In the present task, the T1CE images serve as the medical images because no other modalities are available, and age and gender serve as the demographic data. The encoder in this scheme is our proposed LaSCL, which extracts features from 3D images, and the classifier is XGBoost[4]. Other models could be substituted for the encoder and the classifier.
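The fusion step described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `standardize` and `fuse_patient_features`, the 0/1 gender encoding, and the random stand-in embeddings are all assumptions. The fused matrix would then be passed to a classifier such as XGBoost.

```python
import numpy as np


def standardize(features, eps=1e-8):
    """Z-score each column (in practice, use statistics from the training split)."""
    features = np.asarray(features, dtype=np.float64)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)


def fuse_patient_features(image_embeddings, ages, genders):
    """Concatenate standardized image embeddings with standardized demographics.

    image_embeddings: (n_patients, d) array from an image encoder (hypothetical input)
    ages:             (n_patients,) array in years
    genders:          (n_patients,) array encoded as 0/1 (assumed encoding)
    """
    demographics = np.column_stack([ages, genders])
    return np.concatenate(
        [standardize(image_embeddings), standardize(demographics)], axis=1
    )


# Toy data: 6 patients with 4-dimensional stand-in embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(6, 4))
ages = np.array([34, 51, 47, 29, 63, 40])
genders = np.array([0, 1, 1, 0, 1, 0])

fused = fuse_patient_features(embeddings, ages, genders)
print(fused.shape)  # (6, 6): 4 image features + 2 demographic features
```

Standardizing before concatenation keeps features with large numeric ranges, such as age, on the same scale as the embeddings, which matters if the tree-based classifier is later replaced by a scale-sensitive one.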
We propose a multi-task network incorporating supervised contrastive learning (SCL), as illustrated in Fig.1. The main idea of this approach is to minimize the distance between features extracted from samples of the same class. To that end, we define samples from the same class as similar image pairs. Supervised contrastive learning can therefore be viewed as a query task, in which we train a network to identify samples from the same class among many other samples. Inspired by research on self-supervised contrastive learning[5], we use a query encoder to generate query representations and a key encoder to generate key embeddings. To prevent the key encoder from changing too rapidly and impeding learning, we update its parameters as follows:
$$\theta_{k} \leftarrow m\theta_{k} + (1-m)\theta_{q}$$
Here, $$$\theta_{k}$$$ and $$$\theta_{q}$$$ represent the parameters of the key encoder and query encoder, respectively, while $$$m \in [0, 1)$$$ is a momentum coefficient. To increase the number of negative pairs available for comparison, we use queues to store the extracted embeddings. During each training iteration, we compare the sampled representations from the same class with all representations in the other queues, thereby generating more negative pairs for comparison. Additionally, we integrate an auxiliary segmentation task to constrain the SCL to learn representations from the tumor region. In detail, we employ ResNet50 as both the query encoder and the key encoder, whereas the decoder consists of three convolutional blocks, each comprising a deconvolutional layer and two 3D convolutional layers, followed by batch normalization and ReLU activation.
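The momentum update and the per-class queues described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the names `momentum_update` and `ClassQueues`, the dictionary parameter layout, the queue length, and the default `m=0.999` are all hypothetical choices.

```python
from collections import deque

import numpy as np


def momentum_update(key_params, query_params, m=0.999):
    """Apply theta_k <- m * theta_k + (1 - m) * theta_q to every parameter array."""
    return {
        name: m * key_params[name] + (1.0 - m) * query_params[name]
        for name in key_params
    }


class ClassQueues:
    """One fixed-length FIFO queue of key embeddings per class (assumed layout)."""

    def __init__(self, num_classes, max_len):
        # deque(maxlen=...) discards the oldest entry once the queue is full.
        self.queues = [deque(maxlen=max_len) for _ in range(num_classes)]

    def enqueue(self, embedding, label):
        self.queues[label].append(np.asarray(embedding, dtype=np.float64))

    def negatives_for(self, label):
        """Stack the embeddings stored in every queue except that of `label`."""
        items = [e for c, q in enumerate(self.queues) if c != label for e in q]
        return np.stack(items) if items else np.empty((0, 0))


# Momentum update on a toy parameter set.
theta_q = {"w": np.ones(3)}
theta_k = {"w": np.zeros(3)}
theta_k = momentum_update(theta_k, theta_q, m=0.9)
print(theta_k["w"])  # [0.1 0.1 0.1]

# Two classes, queues of length 2: the oldest class-1 embedding is evicted.
queues = ClassQueues(num_classes=2, max_len=2)
queues.enqueue([1.0, 0.0], label=0)
queues.enqueue([0.0, 1.0], label=1)
queues.enqueue([0.6, 0.8], label=1)
queues.enqueue([0.8, 0.6], label=1)
negatives = queues.negatives_for(0)
print(negatives.shape)  # (2, 2)
```

With `m` close to 1, the key encoder drifts slowly toward the query encoder, so embeddings already stored in the queues stay roughly consistent with newly computed ones.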
The loss function of our network has two parts. The first part is the SCL loss, which is defined as follows:
$$\mathcal{L}^{\mathrm{SCL}}_{q,k}=-\log \frac{\exp(z_{q}\cdot z_{k}/\tau)}{\sum_{K_{j} \in P(q)} \sum_{i \in K_{j}} \exp(z_{q} \cdot z_{i} / \tau)}$$
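As a numerical illustration of the expression above, the following minimal NumPy sketch evaluates the loss for a single query-key pair. It reads the denominator as a sum over the embeddings stored in the queues of all classes other than that of the query. The function name `scl_loss`, the default temperature `tau=0.07`, and the toy unit-norm embeddings are assumptions, not values reported by the authors.

```python
import numpy as np


def scl_loss(z_q, z_k, other_class_queues, tau=0.07):
    """-log( exp(z_q.z_k / tau) / sum_i exp(z_q.z_i / tau) ), i over other-class queues."""
    pos_logit = np.dot(z_q, z_k) / tau
    negatives = np.concatenate(other_class_queues, axis=0)  # (n_neg, d)
    neg_logits = negatives @ z_q / tau
    # Log-sum-exp with max subtraction for numerical stability.
    max_logit = neg_logits.max()
    log_denominator = max_logit + np.log(np.exp(neg_logits - max_logit).sum())
    return -(pos_logit - log_denominator)


z_q = np.array([1.0, 0.0])
z_k = np.array([1.0, 0.0])  # same-class key, perfectly aligned with the query

# Negatives orthogonal to the query versus negatives aligned with it.
loss_far = scl_loss(z_q, z_k, [np.array([[0.0, 1.0], [0.0, 1.0]])], tau=1.0)
loss_near = scl_loss(z_q, z_k, [np.array([[1.0, 0.0], [1.0, 0.0]])], tau=1.0)
print(loss_far < loss_near)  # True
```

The loss is smaller when other-class embeddings lie far from the query. Because the denominator in this reading contains only other-class embeddings, the value is not bounded below by zero.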
Here, $$$q$$$ and $$$k$$$ are samples from the same class, $$$P(q)$$$ represents the set of queues that do not contain the class of sample $$$q$$$, and $$$K_{j}$$$ denotes the embedding queue of class $$$j$$$. The second part is the segmentation loss, which combines a Dice loss and a cross-entropy loss.
Results
We collected preoperative brain MR images of 97 patients with HB located in the brainstem and cerebellum, as well as 143 patients with other brainstem-and-cerebellum tumors, including PA, HGG, and AC. For low-sample 3D image classification, transfer learning and regularization are the main strategies for avoiding overfitting. We therefore compared our method with the pre-trained Med3D[6], Genesis[7], and SwinUNETR[8] models. Additionally, we assessed two representative regularization methods, dropout and mixUp, on Med3D. To make the comparison fairer, these methods were modified to support classification and to integrate the age and gender information through a classification head. We also compared our method with experts to evaluate its utility. The results are shown in Table 1 and Fig.2.
Our method achieved competitive performance compared with the transfer learning and regularization methods. Specifically, dropout improved classification performance significantly, but performance declined when dropout and mixUp were combined. Med3D outperformed SwinUNETR, which may be attributed to the larger size of SwinUNETR making it prone to overfitting when the training sample is small. The Genesis 3D model performed poorly, likely because it was pre-trained on CT images, which makes generalization to MRI challenging.
Acknowledgements
This study was supported in part by grants from the National Natural Science Foundation of China (82171903, 92043301).
References
[1]Conway, J. E., Chou, D., Clatterbuck, R. E., Brem, H., Long, D. M., & Rigamonti, D. (2001). Hemangioblastomas of the central nervous system in von Hippel-Lindau syndrome and sporadic disease. Neurosurgery, 48(1), 55-63.
[2]Gläsker, S., & Van Velthoven, V. (2005). Risk of hemorrhage in hemangioblastomas of the central nervous system. Neurosurgery, 57(1), 71-76.
[3]Kuharic, M., Jankovic, D., Splavski, B., Boop, F. A., & Arnautovic, K. I. (2018). Hemangioblastomas of the posterior cranial fossa in adults: demographics, clinical, morphologic, pathologic, surgical features, and outcomes. A systematic review. World Neurosurgery, 110, e1049-e1062.
[4]Chen, T., & Guestrin, C. (2016, August). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785-794).
[5]Khosla, P., Teterwak, P., Wang, C., Sarna, A., Tian, Y., Isola, P., ... & Krishnan, D. (2020). Supervised contrastive learning. Advances in neural information processing systems, 33, 18661-18673.
[6]Chen, S., Ma, K., & Zheng, Y. (2019). Transfer learning for 3D medical image analysis. arXiv preprint arXiv:1904.
[7]Zhou, Z., Sodha, V., Pang, J., Gotway, M. B., & Liang, J. (2021). Models genesis. Medical image analysis, 67, 101840.
[8]Hatamizadeh, A., Nath, V., Tang, Y., Yang, D., Roth, H. R., & Xu, D. (2021, September). Swin UNETR: Swin transformers for semantic segmentation of brain tumors in MRI images. In International MICCAI Brainlesion Workshop (pp. 272-284). Cham: Springer International Publishing.