0739

Automatic Assessment of Fetal Gestational Age using Bayesian Deep learning Method
Axel Largent1, Jonathan Murnick1,2, Yuan-Chiao Lu1, Kushal Kapse1, Nicole Andersen1, Todd Richmann1, Josepheen De Asis-Cruz1, Jessica Quistorff1, Catherine Lopez1, Nickie Andescavage1,3, and Catherine Limperopoulos1,2,4
1Department of Diagnostic Imaging and Radiology, Children’s National Hospital, Developing Brain Institute, Washington, DC, United States, 2Departments of Radiology and Pediatrics, George Washington University, Washington, DC, United States, 3Department of Neonatology, Children's National Hospital, Washington, DC, United States, 4Neurology School of Medicine and Health Sciences, George Washington University, 20010, DC, United States

Synopsis

Monitoring fetal brain development is crucial for early diagnosis of brain malformations and other congenital disorders. Standard methods to monitor brain maturation rely mainly on subjective and time-consuming visual analysis of the progression of sulcation. Our study proposes a Bayesian deep learning method (DLM) for automatic assessment of fetal gestational age (GA) and for accurate and efficient identification of fetuses with abnormal brain development. Our Bayesian DLM showed excellent performance in predicting GA (mean absolute error = 0.928 weeks) and compared favorably with other state-of-the-art methods. This method may be used in clinical practice for monitoring fetal brain development and for early diagnosis of fetal brain malformations.

Introduction

The fetal brain rapidly develops gyri and sulci between approximately 18 and 36 weeks of gestation, evolving from smooth cerebral hemispheres to mature primary and secondary sulcation. Sulcation follows a characteristic progression over this period, and knowing where a particular fetus lies along this pathway is important for the early identification of brain malformations such as lissencephaly and schizencephaly. Assessing whether sulcation is appropriate for gestational age (GA) can be challenging even for trained fetal radiologists. A manual scoring system has been developed to facilitate this process1, but it is both subjective and time-consuming to apply. An automated tool to determine sulcal maturation from fetal brain images would allow earlier and more accurate clinical diagnosis of fetal brain malformations and facilitate research studies of fetal brain development.

Purpose

To propose a 2D Bayesian deep learning method2–4 (DLM) for automatic GA assessment of healthy fetuses and for accurate and efficient identification of fetal brain malformations, and to compare this Bayesian method with three state-of-the-art DLMs: a one-middle-slice 2D DLM5, a multi-slice 2D DLM, and a fully 3D DLM6.

Methods

A total of 482 3D-reconstructed T2-weighted MRI scans from 341 healthy fetuses (GA mean = 30.45 ± 5.65 weeks, GA range = [16.71; 39.71] weeks) were included in this study (Fig 1). These images were acquired using a single-shot fast spin-echo sequence on a 1.5 T MRI scanner (GE Healthcare, Milwaukee, IL) and reconstructed with a slice-to-volume reconstruction algorithm7. Our 2D Bayesian DLM comprised three steps: (1) slice-wise GA prediction with uncertainty assessment, performed by an AlexNet8 using Monte Carlo dropout2; (2) removal of all slice-wise GA predictions with an uncertainty greater than 0.80 weeks; and (3) averaging of the remaining slice-wise GA predictions to obtain the final GA per scan. The backbone of the two other 2D DLMs was an AlexNet, and that of the fully 3D DLM was a standard 3D convolutional neural network9. All DLMs were trained using 6-fold cross-validation on the whole dataset, with axial MRI slices as inputs for the 2D DLMs and whole 3D MRI volumes as inputs for the 3D DLM. The cross-validation was repeated five times to reduce the influence of stochastic variation in DLM training. For evaluation and comparison of the DLMs, endpoints including the mean absolute error (MAE), mean error (ME), standard deviation of the absolute error (SDAE), and standard deviation of the error (SDE) between the true and predicted GAs were considered. For each DLM, Friedman tests were used to compare the absolute error and error distributions across the repeated cross-validations. Additionally, Wilcoxon tests were used to compare the absolute errors and errors of the Bayesian DLM with those of the other DLMs. For both tests, p-values < 0.01 were considered significant.
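
The three steps of the Bayesian DLM can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that the Monte Carlo dropout passes have already been run, that the slice-wise GA is the mean over passes, and that the uncertainty is the sample standard deviation over passes (the abstract does not state how uncertainty was defined). The function name `predict_scan_ga` and the choice to return `None` when every slice is rejected are hypothetical.

```python
from statistics import mean, stdev


def predict_scan_ga(mc_predictions, max_uncertainty=0.80):
    """Aggregate slice-wise Monte Carlo dropout predictions into one GA per scan.

    mc_predictions: one list per axial slice, each holding the GA (in weeks)
    predicted by several stochastic forward passes with dropout kept active.
    Each slice needs at least two passes so that a spread can be computed.
    """
    kept = []
    for passes in mc_predictions:
        slice_ga = mean(passes)      # step 1: slice-wise GA estimate
        uncertainty = stdev(passes)  # step 1: spread over passes (assumed uncertainty measure)
        if uncertainty <= max_uncertainty:
            kept.append(slice_ga)    # step 2: keep only sufficiently certain slices
    if not kept:
        return None                  # hypothetical fallback: no reliable slice in this scan
    return mean(kept)                # step 3: average the remaining slice-wise GAs


# Toy example: three slices, three stochastic passes each (made-up numbers).
scan = [
    [30.0, 30.2, 29.8],  # small spread, kept
    [31.0, 31.0, 31.0],  # no spread, kept
    [25.0, 35.0, 30.0],  # large spread, discarded
]
final_ga = predict_scan_ga(scan)
```

In this toy example the third slice is discarded because its spread exceeds the 0.80-week threshold, so the final GA is the average of the first two slice-wise estimates.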

Results

Table 1 shows the endpoint values of all DLMs. The Bayesian DLM provided the lowest endpoint values among all DLMs (mean MAE = 0.928 weeks). The Bayesian DLM absolute errors were significantly lower than those of the fully 3D DLM and the one-middle-slice DLM. Additionally, the Bayesian DLM errors were significantly different from those of the other DLMs. The fully 3D DLM provided the highest endpoint values. All DLMs showed endpoint results with low standard deviations (range = [0.027; 0.126] weeks) across the repeated cross-validations. The distribution of the Bayesian DLM absolute errors was not significantly different across the repeated cross-validations. Fig 2 shows the Bayesian DLM absolute errors as a function of GA. Overall, the Bayesian DLM absolute errors were evenly distributed across the GA range. Fig 3 shows the T2-weighted brain MRIs of three subjects with Bayesian DLM absolute errors greater than 5 weeks. For these subjects, the Bayesian DLM failed to predict GA accurately because of the poor quality of their MRIs (blur and bias field artifacts).
The computational times (per scan) of all DLMs were lower than 12 seconds.
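
For readers who wish to reproduce the endpoints, the four quantities could be computed along the following lines. This is a sketch under stated assumptions rather than the authors' code: the error is taken as predicted minus true GA (the sign convention is not given in the abstract), the standard deviations are sample standard deviations, and the function name `endpoints` is hypothetical. The input values are made up for illustration and are not study data.

```python
from statistics import mean, stdev


def endpoints(true_ga, predicted_ga):
    """Return MAE, ME, SDAE and SDE (in weeks) for paired true and predicted GAs."""
    # Signed error per scan; the "predicted minus true" convention is an assumption.
    errors = [p - t for t, p in zip(true_ga, predicted_ga)]
    abs_errors = [abs(e) for e in errors]
    return {
        "MAE": mean(abs_errors),    # mean absolute error
        "ME": mean(errors),         # mean (signed) error, indicates bias
        "SDAE": stdev(abs_errors),  # standard deviation of the absolute error
        "SDE": stdev(errors),       # standard deviation of the error
    }


# Made-up values for illustration only.
result = endpoints(true_ga=[20.0, 30.0, 38.0], predicted_ga=[21.0, 29.0, 38.0])
```

Reporting ME alongside MAE helps distinguish a systematic over- or under-estimation of GA from errors that cancel out on average.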

Conclusion

The Bayesian DLM showed the best GA prediction results among all investigated methods, with a computational time suitable for clinical practice. This method may be used for monitoring fetal brain development and for early diagnosis of fetal brain malformations.

References

1. Vossough, A. et al. Development and Validation of a Semiquantitative Brain Maturation Score on Fetal MR Images: Initial Results. Radiology 268, 200–207 (2013).

2. Gal, Y. & Ghahramani, Z. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning. in International Conference on Machine Learning 1050–1059 (PMLR, 2016).

3. MacKay, D. J. C. A Practical Bayesian Framework for Backpropagation Networks. Neural Computation 4, 448–472 (1992).

4. McAllister, R. et al. Concrete problems for autonomous vehicle safety: advantages of Bayesian deep learning. in Proceedings of the 26th International Joint Conference on Artificial Intelligence 4745–4753 (AAAI Press, 2017).

5. Kojita, Y. et al. Deep learning model for predicting gestational age after the first trimester using fetal MRI. Eur Radiol 31, 3775–3782 (2021).

6. He, S. et al. Multi-channel attention-fusion neural network for brain age estimation: Accuracy, generality, and interpretation with 16,705 healthy MRIs across lifespan. Medical Image Analysis 72, 102091 (2021).

7. Kainz, B. et al. Fast Volume Reconstruction From Motion Corrupted Stacks of 2D Slices. IEEE Transactions on Medical Imaging 34, 1901–1913 (2015).

8. Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet Classification with Deep Convolutional Neural Networks. in Advances in Neural Information Processing Systems 25 (eds. Pereira, F., Burges, C. J. C., Bottou, L. & Weinberger, K. Q.) 1097–1105 (Curran Associates, Inc., 2012).

9. LeCun, Y., Kavukcuoglu, K. & Farabet, C. Convolutional networks and applications in vision. in Proceedings of 2010 IEEE International Symposium on Circuits and Systems 253–256 (2010). doi:10.1109/ISCAS.2010.5537907.

Figures

Fig 1. Examples of T2-weighted brain MRIs of three subjects: (a) subject with gestational age = 18 weeks; (b) subject with gestational age = 29 weeks; (c) subject with gestational age = 38 weeks


Fig 2. Absolute errors of the Bayesian deep learning method as a function of the gestational ages (over the whole cohort and one 6-fold cross-validation)


Fig 3. Examples of T2-weighted brain MRIs of challenging subjects (with absolute errors > 5 weeks for the Bayesian deep learning method): (a) subject with gestational age = 19 weeks and predicted gestational age = 26 weeks; (b) subject with gestational age = 20 weeks and predicted gestational age = 25 weeks; (c) subject with gestational age = 38 weeks and predicted gestational age = 31 weeks

Table 1. Endpoint values (mean ± standard deviation across the five repeated 6-fold cross-validations) of the deep learning methods (DLM). Significant differences (p-values < 0.01) found by the Friedman and Wilcoxon tests are displayed respectively with (*) and (#).


Proc. Intl. Soc. Mag. Reson. Med. 30 (2022)
DOI: https://doi.org/10.58530/2022/0739