1302

3D DUAL RECURSIVE REFINER NETWORK FOR ROBUST SEGMENTATION: APPLICATION TO BRAIN EXTRACTION

Maxime Bertrait¹, Pascal Ceccaldi¹, Boris Mailhé¹, Youngjin Yoo¹, and Mariappan S. Nadar¹
¹Digital Technology and Innovation, Siemens Healthineers, Princeton, NJ, United States

Synopsis

In Magnetic Resonance Imaging, acquisition protocols vary from one clinical task to another, which affects the field of view and resolution of the reconstructed scan. In research, 3D-acquired MRI scans providing high-quality isotropic images are widely available, but they are far from what is found in clinical environments, where 2D multi-slice acquisitions with thick slices yield anisotropic images. We present a framework called Dual Recursive Refiner, demonstrated on a brain extraction task, that works with both types of acquisition. The presented framework outperforms baseline segmentation architectures on both isotropic and anisotropic data.

Introduction

In Magnetic Resonance Imaging, image resolution and field of view can be adapted by changing protocol parameters. In research-oriented neuroimaging datasets, the data are often acquired in 3D, resulting in a high isotropic resolution (e.g. 1 mm³). However, in clinical environments, 2D multi-slice protocols with thick slices are also used (e.g. 5 mm). Therefore, image analytics tools that aim to be generic must be able to work with both isotropic and anisotropic data. In this abstract, we present a generic architecture trained for brain segmentation using two independent fully convolutional networks: a first network provides an initial segmentation as contextual information to a second network, which recursively refines that segmentation.

Methods

The proposed supervised framework relies on the Dual Network architecture, in which two networks are combined as a pipeline and the second network uses the output of the first to produce the final prediction. Building on this concept, the proposed architecture is a 3D Dual Network composed of two Fully Convolutional Networks (FCN): a first FCN whose output serves as contextual initialization for a second FCN, which recursively refines the segmentation. The network architecture is shown in Figure 1. The first FCN is trained separately until convergence before being used with the second network. Both networks are U-Nets with the same configuration, except for the number of input channels (2 for the second, recursive network instead of 1 for the first): 3 levels of max pooling, cubic convolution kernels of size 3, a growth rate of 12, instance normalization, ReLU activations, and two output channels to which a final SoftMax activation is applied to produce probability masks. Both networks are trained to minimize the cross-entropy loss with the Adam optimizer, using a learning rate of 10⁻³ and a batch size of 8. Once the first network has converged, its probability output is passed to the second, recursive network as an additional input channel. Throughout the recursion steps, this contextual initialization is replaced by the successive predictions of the recursive network. For training, we empirically fixed the number of recursive iterations of the second network at 5, as we observed that this was a good compromise between training time and segmentation performance. For data augmentation, we applied a random resolution perturbation to the input volumes: we resample a randomly chosen axis by a value drawn uniformly from the real interval [2, 6] to simulate thick-slice acquisition.
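A minimal sketch of such a resolution perturbation is given here, under assumptions that the abstract does not spell out: the random value is interpreted as a subsampling factor along the chosen axis, nearest-neighbour selection stands in for whatever interpolation was actually used, and the volume is a plain nested list so that the example needs no external library. The names `subsample_axis` and `random_thick_slice_perturbation` are hypothetical.

```python
import random
from typing import List, Optional, Tuple

Volume = List[List[List[float]]]  # nested lists indexed as volume[x][y][z]


def subsample_axis(volume: Volume, axis: int, factor: float) -> Volume:
    """Keep roughly one sample every `factor` voxels along `axis`.

    Assumption: nearest-neighbour selection; the abstract does not state
    which interpolation was used.
    """
    shape = (len(volume), len(volume[0]), len(volume[0][0]))
    n_out = max(1, round(shape[axis] / factor))
    # Source index of each kept slice, clamped to stay inside the volume.
    idx = [min(shape[axis] - 1, int(i * factor)) for i in range(n_out)]
    if axis == 0:
        return [volume[i] for i in idx]
    if axis == 1:
        return [[plane[j] for j in idx] for plane in volume]
    return [[[row[k] for k in idx] for row in plane] for plane in volume]


def random_thick_slice_perturbation(
    volume: Volume,
    rng: Optional[random.Random] = None,
    low: float = 2.0,
    high: float = 6.0,
) -> Tuple[Volume, int, float]:
    """Subsample one randomly chosen axis by a factor drawn from U[low, high]."""
    rng = rng or random.Random()
    axis = rng.randrange(3)            # pick one of the three axes
    factor = rng.uniform(low, high)    # real-valued, as in the interval [2, 6]
    return subsample_axis(volume, axis, factor), axis, factor
```

Calling `random_thick_slice_perturbation(volume)` returns the perturbed volume together with the axis and factor that were drawn, which makes it possible to apply the same transformation again at test time.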
The final dataset is a balanced combination of two datasets, the original one and the randomly perturbed one, for a total of 3,994 volumes.
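The data flow between the two networks described in this section can be sketched as follows. This is an illustrative outline and not the authors' implementation: `initializer` and `refiner` are hypothetical callables standing in for the two trained U-Nets, the toy functions below are placeholders with no learned weights, and volumes are flattened to plain lists for brevity.

```python
from typing import Callable, List

Volume = List[float]    # voxel intensities (flattened, toy representation)
ProbMap = List[float]   # foreground probability per voxel


def dual_recursive_refine(
    volume: Volume,
    initializer: Callable[[Volume], ProbMap],
    refiner: Callable[[Volume, ProbMap], ProbMap],
    n_iterations: int = 5,
) -> ProbMap:
    """Run the first network once, then refine its output recursively."""
    # First network: image only (1 input channel) -> initial probability map.
    context = initializer(volume)
    for _ in range(n_iterations):
        # Second network: image + current probability map (2 input channels).
        # Its prediction replaces the context used at the next step.
        context = refiner(volume, context)
    return context


# Toy stand-ins (assumptions for illustration, NOT the trained U-Nets).
def toy_initializer(volume: Volume) -> ProbMap:
    return [1.0 if v > 0.5 else 0.0 for v in volume]


def toy_refiner(volume: Volume, context: ProbMap) -> ProbMap:
    # Pull the previous estimate halfway toward the image intensity.
    return [0.5 * c + 0.5 * v for v, c in zip(volume, context)]


probabilities = dual_recursive_refine([0.9, 0.2, 0.7], toy_initializer, toy_refiner)
mask = [p > 0.5 for p in probabilities]  # binarize the final probability map
```

Keeping the iteration count as a parameter mirrors the text, where 5 iterations were chosen empirically as a trade-off between training time and segmentation performance.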

Results

We used the WU-Minn Human Connectome Project 900¹ (HCP-900) dataset, composed of 895 T1w 3D MPRAGE scans acquired at 3T, and 1,102 T1w 3D MPRAGE scans acquired at 3T from the Autism Brain Imaging Data Exchange² (ABIDE-I). Both datasets contain isotropic data: HCP-900 scans are 0.7 mm isotropic and ABIDE-I scans are 1 mm isotropic. We split the combined data 80/10/10 (training/validation/testing); hence, for testing we used 90 HCP scans and 111 ABIDE scans. To evaluate the capability of the model to generalize to both isotropic and anisotropic data, we ran experiments on non-perturbed and perturbed scans, using the same transformation as during training. The evaluation metric is the Dice similarity score between the model prediction and the pseudo ground-truth brain mask of the original, non-perturbed dataset. We compared the 3D Dual Recursive Refiner network, with 5 recursive iterations for the second network, with two other architectures: a simple U-Net and a Recursive U-Net with 5 recursive iterations. These networks were trained using the same data processing and augmentation. As shown in Figure 2, our framework achieves the highest Dice score on both isotropic and anisotropic data. Figure 3 shows the differences between the predictions of the three models on an anisotropic example.
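For reference, the Dice similarity score between two binary masks can be computed as in the sketch below. It assumes both masks are already binarized, flattened, and defined on the same voxel grid; the function name `dice_score` and the convention for two empty masks are choices made here for illustration, not details taken from the abstract.

```python
from typing import Sequence


def dice_score(prediction: Sequence[int], reference: Sequence[int]) -> float:
    """Dice = 2 * |P and R| / (|P| + |R|) for binary masks of equal length."""
    if len(prediction) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, r in zip(prediction, reference) if p and r)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in prediction if p) + sum(1 for r in reference if r)
    # Convention assumed here: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


predicted = [1, 1, 1, 0, 0, 0]
reference = [0, 1, 1, 1, 0, 0]
print(round(dice_score(predicted, reference), 3))  # 0.667
```

A score of 1 means the two masks coincide exactly, and a score of 0 means they share no foreground voxel.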

Discussion

The results in Figure 2 show the benefits of combining an initializer with a recursive network for segmentation of both isotropic and anisotropic data. Here we applied it to brain segmentation, but the approach can in principle be generalized to other segmentation tasks.

Conclusion

Medical imaging data vary widely in quality and quantity. This architecture allows us to deal with the large variety of resolutions found in this environment. In this abstract, we demonstrate the benefits of initializing a recursive model to obtain a refined segmentation.

Acknowledgements

Data were provided [in part] by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University. Primary support for the work by Adriana Di Martino was provided by the NIMH (K23MH087770) and the Leon Levy Foundation. Primary support for the work by Michael P. Milham and the INDI team was provided by gifts from Joseph P. Healy and the Stavros Niarchos Foundation to the Child Mind Institute, as well as by an NIMH award to MPM (R03MH096321).

The concepts and information presented in this abstract are based on research results that are not commercially available.

References

[1] Van Essen, D. C., Smith, S. M., Barch, D. M., Behrens, T. E. J., Yacoub, E., Ugurbil, K., for the WU-Minn HCP Consortium. (2013). The WU-Minn Human Connectome Project: An overview. NeuroImage, 80, 62-79.
[2] Di Martino, A., Yan, C.-G., Li, Q., Denio, E., Castellanos, F. X., Alaerts, K., Milham, M. P. et al. (2013). The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism. Molecular Psychiatry, 19(6), 659-667.

Figures

Figure 1: Dual Recursive Refiner network architecture

Figure 2: Dice score comparison on 90 T1w scans from the Human Connectome Project dataset and 111 T1w scans from the Autism Brain Imaging Data Exchange dataset.

Figure 3: Sagittal slice extracted from an example of the perturbed (anisotropic) HCP test set, as segmented by a simple U-Net (left), a Recursive U-Net (middle) and the Dual Recursive Refiner model (right). Orange circles and arrows indicate segmentation errors.

Proc. Intl. Soc. Mag. Reson. Med. 28 (2020)
