Deep Learning with Attention to Predict Gestational Age of the Fetal Brain Using MRI
Liyue Shen1, Katie Shpanskaya2, Edward Lee1, Emily McKenna2, Maryam Maleki2, Quin Lu3, John Pauly1, and Kristen Yeom2

1Electrical Engineering, Stanford University, Stanford, CA, United States, 2Radiology, Stanford University, Stanford, CA, United States, 3Philips Healthcare North America, Gainesville, FL, United States

Synopsis

Fetal brain imaging is a cornerstone of prenatal screening and early diagnosis of congenital anomalies. Knowledge of fetal gestational age is key to the accurate assessment of brain development. This study develops an attention-based deep learning model to predict the gestational age of the fetal brain. The proposed model is an end-to-end framework that combines key insights from multi-view MRI, including axial, coronal, and sagittal views. The model uses age-activated, weakly supervised attention maps to enable rotation-invariant localization of the fetal brain amid background clutter. We evaluate our method on a collected fetal brain MRI cohort and achieve promising age-prediction performance.

Purpose

Given the dramatic structural changes of fetal neurodevelopment, precise knowledge of gestational age and the associated neuroanatomical hallmarks is key to the accurate interpretation of fetal brain MRI1. In this study, we aim to develop a novel deep learning approach to accurately predict gestational age from fetal brain MRI.

Methods

We propose the model pipeline shown in Fig.1, which takes an MRI image X and predicts the gestational age y. We frame the objective as a regression problem, minimizing the mean squared error between the true fetal brain ages and the model's predictions. The input images are selected from the center of each MRI sequence. Specifically, we integrate three views into the model: axial, sagittal, and coronal orientations, which are combined through multi-view learning to form one final age prediction. Computational analysis of fetal brain images is extremely challenging because the positions and orientations of fetal brain sub-regions vary randomly across patients, and unrelated structures (the mother's placenta and organs) act as clutter that degrades performance. We therefore introduce a self-attention mechanism2 into the model to crop out the part of the input image that pertains only to the fetus. As shown in Fig.1, an attention heatmap is extracted from the feature maps after the last residual block stage. This attention map is then used for automatic fetal brain segmentation: pixel values are thresholded, and the input image is cropped with the rectangular bounding box that captures the largest number of thresholded pixels with the minimum perimeter. Finally, the cropped image, after interpolation, is fed into the local branch of the model to learn a local feature representation. We study different ways of predicting age: using only the global branch, using local-branch predictions from the masked inputs, averaging the predictions of the two branches, and using a fusion branch that concatenates features from the two branches followed by a regressor.
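The attention-guided cropping step can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the threshold value is an assumption, the heatmap is assumed to already be upsampled to the input resolution, and the box here is simply the minimal axis-aligned rectangle enclosing all above-threshold pixels.

```python
import numpy as np

def attention_crop(image, heatmap, threshold=0.6):
    """Crop `image` to the smallest rectangle covering the thresholded
    attention heatmap. `heatmap` is assumed to be the same size as
    `image` (i.e., already upsampled from the feature maps); the
    threshold of 0.6 is an illustrative choice, not from the paper."""
    # Normalize the heatmap to [0, 1] before thresholding.
    rng = heatmap.max() - heatmap.min()
    hm = (heatmap - heatmap.min()) / (rng + 1e-8)
    ys, xs = np.where(hm >= threshold)
    if len(ys) == 0:
        # Nothing passed the threshold; fall back to the full image.
        return image
    # Minimal axis-aligned bounding box around all above-threshold pixels.
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    return image[y0:y1, x0:x1]
```

In the pipeline above, the crop returned here would be interpolated back to the network's input size and fed to the local branch.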

Because no public database of fetal MRI currently exists, we collected a database of 1927 fetal brain MRI studies from our clinical picture archiving and communication system (PACS). Each MRI was manually interpreted by an expert pediatric neuroradiologist to identify developmentally normal studies. Among those, the optimal T2-fast sequences in the three standard planes (axial, coronal, and sagittal) were identified. A total of 741 studies that had all three MRI planes were included in this study. Fetal gestational age (in days) was calculated from the estimated date of delivery, in accordance with current obstetrical guidelines and standard of care. These ages serve as ground-truth labels and range from 125 to 273 days. The entire dataset was split into training (70%), validation (10%), and test (20%) sets.
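The 70/10/20 split described above can be sketched as a simple study-level shuffle-and-slice. This is a hedged sketch, not the authors' procedure: the seed and the use of Python's standard `random` module are assumptions, and the key point is only that the split is by study, so all three views of a fetus land in the same partition.

```python
import random

def split_studies(study_ids, train_frac=0.7, val_frac=0.1, seed=0):
    """Shuffle study IDs and split them into train/val/test sets
    (70%/10%/20% as in the text). Splitting at the study level keeps
    all views of one fetus in the same partition. The seed is an
    illustrative assumption."""
    ids = list(study_ids)
    random.Random(seed).shuffle(ids)
    n = len(ids)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    # The remainder (about 20%) becomes the test set.
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]
```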

Results

For evaluation, the R2 score (ranging from 0 to 1) and the mean absolute error (MAE) quantify the difference between model predictions and ground-truth labels. All quantitative results are summarized in Tables 1 and 2. For comparison, we report results for base architectures of different depths (ResNet-18 and ResNet-50)3. The results show that a deeper network benefits age regression, as more complex feature representations are learned in the deeper layers. The ablation studies further indicate that both the attention mechanism and multi-view learning are beneficial to age prediction. Specifically, in single-view prediction, the axial and coronal planes provide the most useful information for estimating fetal age. Combining the views through multi-view learning substantially increases regression accuracy compared to any single plane. With the attention mechanism, the model better captures local image features such as brain shape and contour. Finally, with both the attention mechanism and multi-view learning, the best results are achieved: an R2 score of 0.94 and a mean absolute error of 6.8 days. Fig.2 shows the regression performance of this best model on the test set; visual inspection confirms that the deep regression model yields accurate estimates of fetal brain age. Moreover, Fig.3 visualizes the attention maps, mask inference, and automatic sub-image cropping from the trained model. By visual inspection, we verify that the model learns the location of the fetal brain in an unsupervised manner, despite the small size of the brain relative to its surroundings and the absence of ground-truth locations. This is strong evidence that the attention-aware model learns local features from the correct brain regions.
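For reference, the two evaluation metrics are standard and can be computed directly; the minimal NumPy implementations below follow their usual definitions (not code from this work).

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

def mean_absolute_error(y_true, y_pred):
    """MAE in the same units as the labels (days of gestational age)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))
```

An MAE of 6.8 days thus means the predicted gestational age differs from the ground-truth age by 6.8 days on average over the test set.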

Conclusion

This work proposes an end-to-end framework for efficient prediction of the gestational age of the fetal brain. In particular, experimental evidence shows that the attention mechanism and multi-view learning both benefit fetal brain age regression. For future work, external cohorts from outside institutions will be used for further validation.

Acknowledgements

We thank Philips Healthcare for funding support to some co-authors of this abstract.

References

1. Public Health England. National congenital anomaly and rare disease registration service. 2017.

2. Guan Q, Huang Y, Zhong Z, Zheng Z, Zheng L, and Yang Y. Diagnose like a radiologist: Attention guided convolutional neural network for thorax disease classification. CoRR abs/1801.09927, 2018.

3. He K, Zhang X, Ren S, and Sun J. Deep residual learning for image recognition. CVPR. 770–778, IEEE Computer Society, 2016.

Figures

Figure 1: Model architecture for fetal brain age regression with attention mechanism and automatic mask inference. After two parallel streams (global and local branches), the global and local features are fused to predict the gestational age of the fetal brain.

Figure 2: Visualization of regression performance on the test set with the attention-guided multi-view model. The x-axis denotes the ground-truth ages and the y-axis denotes the model-predicted ages for the corresponding samples. Ideally, predictions align with the ground-truth values along the black line y = x.

Figure 3: Example visualizations of the attention mechanism. The top row shows input images in three views: (a) sagittal, (b) axial, (c) coronal. The middle row shows the attention heatmaps extracted from the feature maps of the last convolutional layer. The bottom row shows the attention-guided mask detection and sub-image cropping, annotated with red bounding boxes.

Table 1: Quantitative evaluation of R2 score (ranging from 0 to 1) on test dataset.

Table 2: Quantitative evaluation of mean absolute error (in days) on test dataset.

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)
1099