Yi-Tien Li1,2, Hsiao-Wen Chung3, and David Yen-Ting Chen4
1Neuroscience Research Center, Taipei Medical University, Taipei, Taiwan, 2Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan, 3Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan, 4Department of Medical Imaging, Taipei Medical University - Shuang Ho Hospital, New Taipei, Taiwan
Synopsis
We propose a multi-output segmentation approach that incorporates non-lesion brain tissue maps into additional output layers, forcing the model to learn both lesion and tissue characteristics. We constructed a cross-vendor study by training a white matter hyperintensity (WMH) segmentation model on cases collected from one vendor and testing its performance on eight different data sets. Model performance improved significantly, especially on testing sets with low image contrast similarity to the training data, suggesting that incorporating non-lesion tissue characteristics into a segmentation model is a feasible way to enhance its resistance to cross-vendor image contrast variability.
Introduction
Deep learning-based techniques for automatic brain lesion segmentation are well established; however, their generalization performance varies across MRI vendors owing to differences in image intensity and contrast. Enhancing model resistance to these cross-vendor variabilities, so that performance is maintained under different MRI acquisition parameters, remains a critical issue. We propose a multi-output segmentation approach that incorporates the intensity information of each brain tissue type to achieve higher cross-vendor performance. Here, we trained the model on cases collected from one MRI scanner and tested its performance on independent data sets. We hypothesized that 1) model performance is closely related to the image contrast similarity between the training and testing sets, and 2) the multi-class segmentation approach is more resilient to varying MR acquisition parameters than the widely used single-label segmentation approach and can therefore achieve better performance.
Methods
We recruited patients referred to Shuang Ho Hospital (SHH), Taipei Medical University Hospital (TMUH), and Wang-Fang Hospital (WFH) with suspected WMHs during the past 5 years (Table 1). The MICCAI WMH segmentation challenge training data set (MWSC (1); http://wmh.isi.uu.nl/) was also used as a testing set in this study. Before the data were supplied to the auto-segmentation model, the T2-FLAIR images from all subjects were first segmented by a unified algorithm that included bias field correction, tissue classification, and registration, using SPM12 software (Wellcome Department, University College London, UK), to generate probability maps of each non-lesion brain tissue (gray matter [GM], white matter [WM], and cerebrospinal fluid [CSF]) as well as the non-brain tissue area (skull and scalp).
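How the tissue probability maps are combined with the WMH label into the multi-output targets is not fully specified here; below is a minimal sketch of one plausible construction, assuming the SPM12 probability maps are converted to mutually exclusive tissue labels by argmax, remaining voxels are treated as air, and the manual WMH mask takes priority. The file paths and thresholds are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
import nibabel as nib  # assumed NIfTI reader; any volume I/O library would do

def build_multiclass_targets(wmh_path, gm_path, wm_path, csf_path, skull_path):
    """Combine a WMH mask with SPM12 tissue probability maps into six
    mutually exclusive binary target maps (illustrative sketch)."""
    wmh = nib.load(wmh_path).get_fdata() > 0.5
    probs = np.stack([nib.load(p).get_fdata()
                      for p in (gm_path, wm_path, csf_path, skull_path)], axis=-1)
    air = probs.sum(axis=-1) < 0.5            # assumed: low total probability = air
    tissue = np.argmax(probs, axis=-1)        # 0=GM, 1=WM, 2=CSF, 3=skull/scalp
    targets = {
        "WMH": wmh,
        "GM": (tissue == 0) & ~air & ~wmh,
        "WM": (tissue == 1) & ~air & ~wmh,
        "CSF": (tissue == 2) & ~air & ~wmh,
        "skull_scalp": (tissue == 3) & ~air & ~wmh,
        "air": air & ~wmh,
    }
    return {name: m.astype(np.float32) for name, m in targets.items()}
```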
The axial slices of the T2-FLAIR volumes were then automatically resampled to 512×512 to guarantee a uniform input size for the neural network, and Gaussian normalization was applied to normalize the brain voxel intensity distribution of each axial slice.
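As a concrete illustration of this preprocessing step, the sketch below resamples one axial slice to 512×512 and applies Gaussian (z-score) normalization over brain voxels; the interpolation orders and the use of a brain mask to restrict the statistics are assumptions.

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess_slice(flair_slice, brain_mask, out_size=512):
    """Resample an axial T2-FLAIR slice to out_size x out_size and apply
    Gaussian (z-score) normalization over brain voxels (illustrative sketch)."""
    factors = (out_size / flair_slice.shape[0], out_size / flair_slice.shape[1])
    img = zoom(flair_slice.astype(np.float32), factors, order=1)          # bilinear
    mask = zoom(brain_mask.astype(np.float32), factors, order=0) > 0.5    # nearest neighbor
    mu, sigma = img[mask].mean(), img[mask].std() + 1e-8
    return (img - mu) / sigma, mask                                       # zero mean, unit variance
```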
Next, the image intensity distributions of WMH and the non-lesion tissue types were computed for the training set and each testing set, to quantify the degree of voxel intensity overlap between WMH and the non-lesion brain tissues as well as the image contrast similarity between the training set and each testing set.
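The precise overlap and similarity measures are not detailed in this text; a minimal sketch consistent with the description uses the histogram intersection between WMH and GM intensities as the overlap ratio, and the cosine similarity between training- and testing-set histograms as the contrast similarity (binning over the z-scored intensity range is an assumption).

```python
import numpy as np

def intensity_histogram(values, bins=100, rng=(-3.0, 3.0)):
    """Normalized intensity histogram over the z-scored intensity range."""
    hist, _ = np.histogram(values, bins=bins, range=rng)
    return hist / max(hist.sum(), 1)

def overlap_ratio(wmh_values, gm_values):
    """Histogram intersection between WMH and GM voxel intensities."""
    h1, h2 = intensity_histogram(wmh_values), intensity_histogram(gm_values)
    return float(np.minimum(h1, h2).sum())

def contrast_similarity(train_values, test_values):
    """Cosine similarity between training- and testing-set intensity histograms."""
    h1, h2 = intensity_histogram(train_values), intensity_histogram(test_values)
    return float(np.dot(h1, h2) / (np.linalg.norm(h1) * np.linalg.norm(h2) + 1e-8))
```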
The segmentation model architecture was constructed with either one output layer for single-label segmentation or six output layers (Figure 1A), corresponding to WMH, GM, WM, CSF, skull/scalp, and air, for multi-class segmentation. To further investigate whether performance could be improved by adjusting common architecture variants, three architectures (Figure 1B), U-Net (2), SegNet (3), and U-SegNet (4), the last a hybrid of the first two, were each built and combined with either the single-output or the multi-output segmentation scheme.
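The actual U-Net, SegNet, and U-SegNet configurations are shown in Figure 1; as a minimal illustration of the single-output versus multi-output design only, the PyTorch sketch below attaches either one sigmoid head (WMH) or six heads (WMH, GM, WM, CSF, skull/scalp, air) to a deliberately simplified encoder-decoder. The channel widths and the equally weighted per-head loss are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiOutputSegNet(nn.Module):
    """Toy encoder-decoder with one output head per target map (sketch)."""
    def __init__(self, classes=("WMH", "GM", "WM", "CSF", "skull_scalp", "air")):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
        )
        # One 1x1 convolution head per output map; a single-label model
        # would keep only the "WMH" head.
        self.heads = nn.ModuleDict({name: nn.Conv2d(16, 1, 1) for name in classes})

    def forward(self, x):                      # x: (N, 1, 512, 512)
        feat = self.decoder(self.encoder(x))
        return {name: torch.sigmoid(head(feat)) for name, head in self.heads.items()}

def multi_output_loss(preds, targets):
    """Sum of per-head binary cross-entropy losses, equally weighted (assumption)."""
    bce = nn.BCELoss()
    return sum(bce(preds[k], targets[k]) for k in preds)
```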
Results & Discussion
All three multi-output architectures achieved better model performance (Table 2) than the corresponding single-output structures across all testing sets, especially the external testing sets. The predictions of the single-label segmentation models produced false positives mainly distributed in GM regions (Figure 2; yellow arrows) and false negatives near GM regions (Figure 2; red arrows). Among the 8 testing sets used in this study, the voxel intensity distributions of WMH and GM showed varying overlap ratios (Figure 3A-H), which correlated significantly with model sensitivity for both the multi-class (R²=0.4719, p=0.0001) and single-label (R²=0.5134, p<0.0001) architectures, suggesting that less similar voxel intensities between GM and WMH lead to better model performance regardless of model structure (Figure 3I).
We then extracted the voxel intensity distributions of GM, WM, and WMH in all testing sets and compared them with the corresponding distributions in the training set (Figure 4A-C). Based on the cosine similarity of these histograms with the training set, the 8 testing sets could be further divided into 3 high-similarity and 5 low-similarity testing sets.
In the low-similarity testing sets, the multi-class segmentation approaches showed significantly superior Dice coefficient (Figure 4D; right), precision (Figure 4E; right), sensitivity (Figure 4F; right), and specificity (Figure 4G; right) compared with the single-label methods, supporting our hypothesis that multi-output algorithms enhance resistance to image contrast variability in MRI data from different hospitals and/or MR vendors.
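For reference, the voxel-wise metrics compared above can be computed from binary prediction and reference masks as in the sketch below, using standard confusion-matrix definitions; the 0.5 threshold is an assumption.

```python
import numpy as np

def segmentation_metrics(pred, ref, threshold=0.5):
    """Voxel-wise Dice, precision, sensitivity, and specificity (sketch)."""
    p, r = pred > threshold, ref > 0.5
    tp = np.sum(p & r)          # true positives
    fp = np.sum(p & ~r)         # false positives
    fn = np.sum(~p & r)         # false negatives
    tn = np.sum(~p & ~r)        # true negatives
    eps = 1e-8
    return {
        "dice": 2 * tp / (2 * tp + fp + fn + eps),
        "precision": tp / (tp + fp + eps),
        "sensitivity": tp / (tp + fn + eps),
        "specificity": tn / (tn + fp + eps),
    }
```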
Conclusion
In this cross-hospital and cross-vendor study (Table 1), we systematically delineated and quantified the image contrast between WMH lesions and the other non-lesion tissue types in the training and testing sets using the intersection of their image intensity distributions. MR images from different hospitals/vendors can show varying image contrast (Figure 3), which in turn may affect model performance. The multi-class segmentation approaches performed significantly better on the low-similarity testing sets than the single-label segmentation methods (Figure 4D-G), supporting the idea that multi-output algorithms enhance resistance to image contrast variability in MRI data from different hospitals and/or MR vendors. Our study characterizes the effects of varying MR image contrast on model performance and generalizability, and further demonstrates the advantage of multi-output models that combine the lesion label with non-lesion tissue labels for brain lesion segmentation tasks.
Acknowledgements
This work was partially supported by the Ministry of Science and Technology, Taiwan (MOST110-2314-B-038-086-MY3) and Taipei Medical University, Taiwan (TMU109-AE1-B18).
References
1. Kuijf HJ, et al. (2019) Standardized assessment of automatic segmentation of white matter hyperintensities and results of the WMH segmentation challenge. IEEE Transactions on Medical Imaging 38(11):2556-2568.
2. Ronneberger O, Fischer P, & Brox T (2015) U-Net: Convolutional networks for biomedical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention, (Springer), pp 234-241.
3. Badrinarayanan V, Kendall A, & Cipolla R (2017) SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 39(12):2481-2495.
4. Kumar P, Nagar P, Arora C, & Gupta A (2018) U-SegNet: Fully convolutional neural network based automated brain tissue segmentation tool. 2018 25th IEEE International Conference on Image Processing (ICIP), (IEEE), pp 3503-3507.