
Multi-class brain lesion detection: establishing a baseline for fastMRI+ dataset
Lifeng Mei1,2, Sixing Liu1,2, Guoxiong Deng1,2, Shaojun Liu1,2, Yali Zheng1,2, and Mengye Lyu1,2
1College of Health Science and Environmental Engineering, Shenzhen Technology University, Shenzhen, China, 2College of Applied Sciences, Shenzhen University, Shenzhen, China

Synopsis

Computer-aided diagnosis (CAD) is widely considered an important application of deep learning in healthcare. However, brain MRI data with lesion location labels are rare. Recently, Microsoft Research released a dataset of clinical pathology annotations based on the raw images from fastMRI, named fastMRI+. Here, the fastMRI+ brain dataset was analyzed and used to train deep learning-based models for lesion detection. Well-known object detection architectures, including YOLOv3, Faster R-CNN, and YOLOX, were compared. Overall, this abstract establishes a baseline and offers improvement suggestions for future studies.

Introduction

Computer-aided diagnosis (CAD) is widely considered an important application of deep learning in healthcare. However, brain MRI data with lesion location labels are rare. Recently, Microsoft Research released a dataset of clinical pathology annotations based on the raw images from fastMRI, named fastMRI plus (fastMRI+)1. For brain MRI, it comprises 7570 bounding box annotations and 643 study-level labels across 30 pathology categories. In this abstract, the fastMRI+ brain dataset was analyzed and used to train deep learning-based models for lesion detection. Well-known object detection architectures, including YOLOv32, Faster R-CNN3, and YOLOX4, were compared. Overall, this abstract establishes a baseline and offers improvement suggestions for future studies.

Methods

Analysis on fastMRI+ brain dataset
We first analyzed the fastMRI+ brain dataset in terms of the distribution of labels and bounding boxes. The findings are summarized in the Results section.
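As an illustration of this analysis, a minimal sketch is given below. It assumes the fastMRI+ annotations are loaded from the released CSV with per-box `label`, `width`, and `height` columns; the column names are assumptions for illustration and may not match the release exactly.

```python
import pandas as pd

# Hypothetical sketch: assumes the fastMRI+ brain annotation CSV provides
# one bounding box per row with pixel dimensions and a class label.
df = pd.read_csv("brain.csv")

# Class distribution: how many boxes exist per pathology label.
label_counts = df["label"].value_counts()
print(label_counts)

# Per-class bounding-box area statistics (mean and standard deviation),
# exposing the large within-class area variation discussed in the Results.
df["area"] = df["width"] * df["height"]
area_stats = df.groupby("label")["area"].agg(["mean", "std", "count"])
print(area_stats.sort_values("count", ascending=False))
```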
Model selection
At present, there are two mainstream families of detection models: anchor-based and anchor-free. The former can be further divided into two-stage and one-stage models. We selected a representative model of each of the three types: the one-stage model YOLOv3, the two-stage model Faster R-CNN, and the state-of-the-art anchor-free model YOLOX. The performance of the three models trained on the fastMRI+ brain dataset was compared. Training was performed with the MMDetection framework7.
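For context, a hypothetical MMDetection 2.x configuration sketch is shown below. The base-config path, dataset paths, and checkpoint name are assumptions for illustration, not the exact files used in this work; only the number of classes (12, see below) follows the abstract.

```python
# Hypothetical MMDetection 2.x config sketch: inherit a stock Faster R-CNN
# config and override only the class count and dataset locations.
_base_ = "configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py"

# 12 lesion classes retained after removing classes with AP < 0.01.
model = dict(roi_head=dict(bbox_head=dict(num_classes=12)))

data = dict(
    train=dict(
        type="CocoDataset",
        ann_file="annotations/fastmri_plus_train.json",  # assumed COCO-style export
        img_prefix="images/train/",
    ),
    val=dict(
        type="CocoDataset",
        ann_file="annotations/fastmri_plus_val.json",
        img_prefix="images/val/",
    ),
)

# Start from COCO-pretrained weights, as in Table 1.
load_from = "checkpoints/faster_rcnn_r50_fpn_1x_coco.pth"
```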
Training and evaluation
We randomly divided the fastMRI+ brain data into training and validation sets (7:3). We initially trained on all annotations; then, based on the preliminary per-class mean average precision (mAP), we removed classes with extremely poor average precision (AP < 0.01). The 12 classes retained for final training were Nonspecific white matter lesion, Mass, Resection cavity, Craniotomy, Lacunar infarct, Extra-axial mass, Edema, Nonspecific lesion, Possible artifact, Encephalomalacia, Enlarged ventricles, and Normal variant. In addition, we introduced negative samples: subjects labeled normal for age were identified from the 643 study-level labels, and 4 slices were taken from each subject and added to the dataset. The models were evaluated on the fastMRI+ validation set.
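A minimal sketch of such a split is shown below. It splits at the subject (file) level so that slices from one study never appear in both sets; this granularity is an assumption, as the abstract does not specify it.

```python
import random
from collections import defaultdict

random.seed(0)

def split_subjects(annotations, train_frac=0.7):
    """Split a list of annotation dicts (each with a 'file' key) 7:3 by subject."""
    by_subject = defaultdict(list)
    for ann in annotations:
        by_subject[ann["file"]].append(ann)
    subjects = list(by_subject)
    random.shuffle(subjects)
    n_train = int(len(subjects) * train_frac)
    # Gather all annotations belonging to each side of the split.
    train = [a for s in subjects[:n_train] for a in by_subject[s]]
    val = [a for s in subjects[n_train:] for a in by_subject[s]]
    return train, val
```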
Cross-dataset test using Brats2020
To investigate the generalization ability of the trained models, we also conducted a cross-dataset test on BraTS20205, which is originally a tumor segmentation dataset.
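Since BraTS2020 ships segmentation masks rather than boxes, scoring a detector on it requires deriving boxes from the masks. The sketch below shows one common convention (the tight box around all nonzero mask pixels in a slice); the abstract does not detail the exact protocol used, so treat this as an assumption.

```python
import numpy as np

def mask_to_bbox(mask_2d):
    """Return (x_min, y_min, x_max, y_max) enclosing all nonzero pixels of a
    2D tumor mask slice, or None if the slice contains no tumor."""
    ys, xs = np.nonzero(mask_2d)  # rows are y coordinates, columns are x
    if xs.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```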

Results

Some typical bounding box annotations are shown in Fig. 1a. Lesions of different classes could have very subtle visual differences, and bounding boxes could overlap with each other. In addition, the statistical analysis in Fig. 1b and 1c shows that the fastMRI+ brain dataset has an imbalanced class distribution, and for many classes the bounding-box area had a large standard deviation. All of these characteristics make training an object detection model difficult.
The performance of the trained models on the fastMRI+ validation set is shown in Table 1. In addition to training the three models with their basic configurations, we also replaced the backbone of YOLOX-m. Among the three basic configurations, the two-stage Faster R-CNN performed relatively well, while YOLOX improved after replacing DarkNet53 with Swin-Transformer-Tiny (Swin-T)6 as the backbone, even surpassing Faster R-CNN.
Typical detection results are visualized in Fig. 2. Some lesions such as nonspecific white matter lesion, Edema, and Mass could be well detected. However, there were also many false negatives and false positives.
Fig. 3 shows a quantitative analysis of the three models with PR curves over all classes and some major classes. It is worth noting that Faster R-CNN, which performed well over all classes, was weaker than the other two models on the nonspecific white matter lesion class.
Regarding training strategies, as shown in Fig. 4, a pretrained backbone improved model performance, even when pretraining was on natural images. For YOLOX, strong data augmentation methods such as Mixup and Mosaic increased accuracy. However, they hindered model convergence for YOLOv3 and Faster R-CNN.
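For context, the sketch below illustrates the general idea of detection-style Mixup; it is a simplified illustration, not the exact YOLOX implementation (which uses its own blending ratios and combines Mixup with Mosaic).

```python
import numpy as np

def mixup(img_a, boxes_a, img_b, boxes_b, alpha=1.5):
    """Blend two same-shaped images and concatenate their box lists.
    boxes_a / boxes_b are lists of (x_min, y_min, x_max, y_max, class) tuples."""
    lam = np.random.beta(alpha, alpha)
    mixed = lam * img_a.astype(np.float32) + (1 - lam) * img_b.astype(np.float32)
    # Both images' boxes are kept; losses are typically weighted by lam and 1-lam.
    return mixed, boxes_a + boxes_b
```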
Fig. 5 shows the results of applying the YOLOX model trained on fastMRI+ to BraTS2020. The model could still detect some lesions despite the obvious contrast difference between the two datasets, indicating a certain generalization ability.

| Models | Backbone | Size | mAP (%) at IoU=0.50:0.95 | mAP (%) at IoU=0.50 |
|---|---|---|---|---|
| YOLOv3 | DarkNet53, MS COCO pretrain | (608, 608) | 8.7 | 23.2 |
| YOLOX-m | DarkNet53, MS COCO pretrain | (640, 640) | 9.4 | 22.7 |
| Faster R-CNN | ResNet50, MS COCO pretrain | (1333, 640) | 10.0 | 25.9 |
| YOLOX | Swin-T, ImageNet-1K pretrain | (640, 640) | 10.8 | 26.0 |

Table 1. Performance of different models evaluated on the fastMRI+ brain validation set.
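For reference, the Table 1 metrics follow the standard COCO evaluation protocol: mAP at IoU=0.50:0.95 (averaged over 10 thresholds) and mAP at IoU=0.50. A minimal pycocotools sketch is shown below; the file names are placeholders, not paths from the original work.

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Load ground-truth annotations and model detections (COCO result format).
gt = COCO("annotations/fastmri_plus_val.json")
dt = gt.loadRes("results/detections.json")

ev = COCOeval(gt, dt, iouType="bbox")
ev.evaluate()
ev.accumulate()
ev.summarize()  # stats[0] = mAP@[.50:.95], stats[1] = mAP@.50
```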

Discussion and Conclusion

We have conducted an in-depth analysis of the fastMRI+ brain dataset and tried three types of object detection models. The fastMRI+ brain dataset is characterized by an imbalanced class distribution, small visual differences among classes, and large area variation within classes. Thus, all models had relatively low accuracy compared to what they achieve on natural image datasets such as COCO. Of the three models, the anchor-free detector YOLOX performed relatively well and can serve as a baseline for the fastMRI+ brain dataset. Some improvements may be readily investigated: for example, one can 1) remove outliers in the annotations, 2) add more data from normal subjects, and 3) try 3D-based detection.

Acknowledgements

No acknowledgement found.

References

1. Zhao, R., et al. "fastMRI+: Clinical Pathology Annotations for Knee and Brain Fully Sampled Multi-Coil MRI Data". arXiv:2109.03812, 2021. https://arxiv.org/abs/2109.03812

2. Redmon, J. and Farhadi, A. "YOLOv3: An Incremental Improvement". arXiv:1804.02767, 2018. https://arxiv.org/abs/1804.02767

3. Ren, S., He, K., Girshick, R., and Sun, J. "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks". arXiv:1506.01497, 2015. https://arxiv.org/abs/1506.01497

4. Ge, Z., Liu, S., Wang, F., Li, Z., and Sun, J. "YOLOX: Exceeding YOLO Series in 2021". arXiv:2107.08430, 2021. https://arxiv.org/abs/2107.08430

5. BraTS2020: https://www.med.upenn.edu/cbica/brats2020/

6. Liu, Z., et al. "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows". arXiv:2103.14030, 2021. https://arxiv.org/abs/2103.14030

7. Chen, K., et al. "MMDetection: Open MMLab Detection Toolbox and Benchmark". arXiv:1906.07155, 2019. https://arxiv.org/abs/1906.07155

Figures

Figure 1. Examples and statistical analysis of the fastMRI+ brain dataset. Besides the imbalanced class distribution, some classes had extremely large variation in bounding-box area, e.g., Dural thickening, Possible artifact, and Nonspecific white matter lesion.


Figure 2. Visualization of typical detection results by the three models on the fastMRI+ brain dataset. Some lesions such as nonspecific WM lesion, Edema, and Mass could be well detected, and even a few small targets were identified. But detection errors were still common: in row (b), YOLOv3 could not differentiate the overlapping bounding boxes well, and both YOLOX and Faster R-CNN mistook the nonspecific WM lesion for Mass; in row (e), none of the three models identified the correct lesion location, and some other areas were falsely detected as nonspecific white matter lesions.


Figure 3. PR curves of the three models over all classes and four major classes. C75 is the area under the PR curve at IoU=0.75, C50 is the area under the PR curve at IoU=0.50, Loc is the PR curve at IoU=0.10 (localization errors ignored, but not duplicate detections), and Oth is the PR curve after all class confusions are removed. These metrics are derived from the COCO API. It is worth noting that Faster R-CNN, which performed well over all classes, was weaker than the other two models on the nonspecific white matter lesion class.


Figure 4. Influence of different training strategies on mean average precision (mAP). A pretrained backbone improved performance, even when pretraining was on natural images. For YOLOX, strong data augmentation methods such as Mixup and Mosaic increased accuracy. However, strong data augmentation caused training failure for YOLOv3 and Faster R-CNN.


Figure 5. Results of tumor detection on the BraTS2020 dataset by the YOLOX model trained on fastMRI+.


Proc. Intl. Soc. Mag. Reson. Med. 30 (2022)
3186
DOI: https://doi.org/10.58530/2022/3186