Diffuse white matter abnormality (DWMA) is observed in 50-80% of very preterm infants at term-equivalent age. Despite autopsy studies showing correlation with neuropathology, the relationship of DWMA with long-term neurodevelopmental impairments remains controversial. The controversy may stem from the qualitative nature of previous DWMA studies, which likely introduced measurement error and may explain the lack of association with neurodevelopmental impairments in some studies. In this study, we developed a deep learning approach to objectively and automatically segment DWMA regions on T2-weighted MRI images. Internal and external validation demonstrated accurate and reproducible DWMA segmentation performance.
INTRODUCTION
Diffuse white matter abnormality (DWMA), or diffuse excessive high signal intensity, is observed in 50-80% of very preterm infants at term-equivalent age.1 It is subjectively defined as higher than normal signal intensity in periventricular and subcortical white matter in comparison to normal unmyelinated white matter on T2-weighted MRI images. Despite the well-documented presence of DWMA and emerging evidence of its pathological nature, the significance of DWMA for long-term neurodevelopment remains debatable.2 Much of this debate has been fueled by the nearly universal use of qualitative reporting of DWMA, which is subjective and unreliable, likely resulting in measurement error and lack of association with neurodevelopmental impairments.3 Few studies have attempted to develop reproducible quantitative methods for evaluating DWMA in preterm infants.1,4,5 Moreover, these approaches classified individual voxels without considering neighboring spatial information, which contributed to a higher false positive rate. In this work, we developed a deep convolutional neural network (CNN) approach that uses the spatial information around individual voxels to automatically segment DWMA regions on T2-weighted MRI images. Specifically, the detection of DWMA was formulated as an image voxel classification task. Small image patches centered on the given voxels were used to represent the regional spatial information of individual voxels. We evaluated the proposed model using both internal and external validation.
METHODS
We formulated DWMA segmentation as an image voxel classification task. Each T2-weighted white matter voxel is assigned exclusively to either the DWMA group or the normal group. To use the image spatial information around each voxel, a small neighborhood patch (13 x 13) centered on the voxel is sampled. This typically results in a set of ~10^5 image patches for each subject. The deep CNN model takes each image patch as input and assigns a label to its center voxel (Figure 1).
The data for this study were derived from two independent cohorts of very preterm infants (≤32 weeks gestational age). The Institutional Review Board of Nationwide Children’s Hospital approved both studies, and written parental informed consent was obtained for every subject. The first cohort included 95 subjects scanned on a 3T Siemens MAGNETOM Skyra scanner. We used these data for deep model development and internal 10-fold cross-validation. The second cohort included 28 subjects scanned on a 3T GE HDX scanner. We used these data for external validation. More specifically, models trained using 50 subjects (~50×10^5 image patches) from cohort 1 were tested on the independent cohort 2 of 28 subjects (~28×10^5 image patches). Demographic information is listed in Table 1.
We designed a 12-layer deep CNN architecture6 (Figure 2). Gold-standard DWMA labels were annotated by two experts guided by an atlas-based method.5 Compared to normal voxels, the number of DWMA voxels is relatively small, which results in an imbalanced classification problem (a disproportionate ratio of observations in each class). We therefore used the Dice index and balanced accuracy7 for model evaluation. As baselines for comparison with the proposed deep CNN model, we also implemented deep neural network (DNN)8 and support vector machine (SVM) models.
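For reference, the two evaluation metrics can be computed from binary voxel labels as in the sketch below. It uses the standard definitions, Dice = 2TP / (2TP + FP + FN) and balanced accuracy = (sensitivity + specificity) / 2; the function names and the handling of empty classes are illustrative assumptions, not details taken from the study.

```python
import numpy as np

def dice_index(pred, truth):
    """Dice index between predicted and gold-standard DWMA labels."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.count_nonzero(pred & truth)
    denom = np.count_nonzero(pred) + np.count_nonzero(truth)
    # Assumption: two empty segmentations count as perfect agreement.
    return 2.0 * tp / denom if denom else 1.0

def balanced_accuracy(pred, truth):
    """Mean of sensitivity and specificity; robust to class imbalance."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.count_nonzero(pred & truth)
    tn = np.count_nonzero(~pred & ~truth)
    fp = np.count_nonzero(pred & ~truth)
    fn = np.count_nonzero(~pred & truth)
    if tp + fn == 0 or tn + fp == 0:
        # Undefined when one class is absent from the gold standard.
        return float("nan")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (sensitivity + specificity)
```

Plain accuracy would be dominated by the large normal class, whereas balanced accuracy weights the DWMA and normal classes equally, which is why it suits the imbalanced setting described above.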
RESULTS
Table 2 shows the DWMA segmentation performance of the different models based on cross-validation. The CNN model exhibited a significantly higher Dice index than DNN (p = 0.019) and SVM (p ≤ 0.001), and higher balanced accuracy than DNN (p = 0.043) and SVM (p ≤ 0.001). The external validation performance is also shown in Table 2. The CNN performance remained robust and again significantly outperformed DNN (p = 0.018 for Dice index; p = 0.009 for balanced accuracy) and SVM (p = 0.021 for Dice index; p = 0.006 for balanced accuracy). Figure 3 displays a representative DWMA segmentation using the deep CNN.
DISCUSSIONS AND CONCLUSIONS
We developed a deep CNN approach to objectively and automatically quantify DWMA regions on T2-weighted MRI images. This is the first study to segment DWMA regions in preterm infants by using a deep learning algorithm. We evaluated our deep CNN model using both internal and external validation. Our deep CNN model achieved Dice similarity index values of 0.85-0.86 and balanced accuracy of 0.87-0.94 for DWMA segmentation and exhibited significantly better performance than other popular machine learning models, such as DNN and SVM. The generalizability of our approach, tested on two preterm cohorts and scanner platforms, suggests that we can achieve consistent, reliable, and accurate segmentation of DWMA. Future studies to investigate the association between CNN-detected DWMA volumes and long-term neurodevelopmental outcomes will be important to further validate the clinical significance of this work.
REFERENCES
1 Parikh, N.A., et al., Automatically quantified diffuse excessive high signal intensity on MRI predicts cognitive development in preterm infants. Pediatric neurology, 2013. 49(6): p. 424-430.
2 Volpe, J.J., Confusions in Nomenclature:“Periventricular Leukomalacia” and “White Matter Injury”—Identical, Distinct, or Overlapping? Pediatric neurology, 2017. 73: p. 3-6.
3 Hart, A.R., et al., Appearances of diffuse excessive high signal intensity (DEHSI) on MR imaging following preterm birth. Pediatric radiology, 2010. 40(8): p. 1390-1396.
4 He, L. and N.A. Parikh, Automated detection of white matter signal abnormality using T2 relaxometry: application to brain segmentation on term MRI in very preterm infants. NeuroImage, 2013. 64: p. 328-340.
5 He, L. and N.A. Parikh, Atlas-guided quantification of white matter signal abnormalities on term-equivalent age MRI in very preterm infants: findings predict language and cognitive development at two years of age. PloS one, 2013. 8(12): p. e85475.
6 Zhang, W., et al., Deep convolutional neural networks for multi-modality isointense infant brain image segmentation. NeuroImage, 2015. 108: p. 214-224.
7 Brodersen, K.H., et al., The balanced accuracy and its posterior distribution. In 20th International Conference on Pattern Recognition (ICPR), 2010. IEEE.
8 Hinton, G.E. and R.R. Salakhutdinov, Reducing the dimensionality of data with neural networks. Science, 2006. 313(5786): p. 504-507.