Benjamin Roussel1,2, Julien Oster2,3, and Mattias Paul Heinrich4
1Université de Lorraine, Nancy, France, 2U1254, INSERM, Nancy, France, 3Université de Lorraine, Nancy, France, Metropolitan, 4Universität zu Lübeck · Institut für Medizinische Informatik, Lübeck, Germany
Synopsis
To perform fully automated segmentation of cardiac volumes,
current Convolutional Neural Networks (CNNs) process each slice
independently and do not take the depth information into consideration.
Because networks using 3D convolutions are memory-hungry, we propose a CNN
model that has a low memory demand yet processes the whole volume. The
network propagates the redundant depth information from slice to slice.
In a 4-fold cross-validation on the MICCAI/ACDC challenge dataset, our
network obtained better results than a standard 2D network, improving the
average DICE score, computed over three cardiac structures (myocardium,
left ventricle and right ventricle), by 1.7%.
Introduction
Cardiac Magnetic Resonance Imaging is a highly effective diagnostic tool,
as it offers both a morphological [1] and a functional [2] view of the heart.
To provide a diagnosis, radiologists have to manually delineate the different
structures (left ventricle, myocardial scar,…), which can be very
time-consuming and is prone to variability. There is therefore a growing need
for accurate, fully automated cardiac segmentation methods. Recent techniques
relying on Deep Learning and Convolutional Neural Networks (CNNs) have shown
great promise for pattern recognition and are now being applied to medical
image processing [3].
Purpose
2D CNNs have shown better segmentation results than 3D models in the
literature [4] and require less memory. The purpose of this study is to
further increase the segmentation accuracy by incorporating the redundant
depth information provided by a whole 3D heart volume.
Method
We developed a CNN based on the DenseNet structure [5]. The network consists
of a contracting path and an expanding path, a bottleneck, and skip
connections. Feature extraction is performed by dense blocks, each consisting
of a sequence of convolutional layers that extract the same number of feature
maps (the growth rate), batch normalisation, ReLU activation, and direct
connections between the layers. The exact architecture is depicted in Figure 1.
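As a rough illustration of what the growth rate implies, the sketch below tracks how many feature maps each layer of a dense block receives. It assumes the standard DenseNet wiring, in which every layer sees the concatenation of the block input and all earlier layer outputs; the function name and the example sizes are hypothetical and are not taken from Figure 1.

```python
from typing import List


def dense_block_channels(in_channels: int, growth_rate: int, num_layers: int) -> List[int]:
    """Channel bookkeeping for one dense block (assumed standard DenseNet wiring).

    Entry i (for i < num_layers) is the number of feature maps fed to layer i.
    The last entry is the width of the concatenated block output.
    """
    if in_channels <= 0 or growth_rate <= 0 or num_layers < 0:
        raise ValueError("in_channels and growth_rate must be positive, num_layers non-negative")
    # Each layer adds exactly `growth_rate` new feature maps to the running concatenation.
    return [in_channels + i * growth_rate for i in range(num_layers + 1)]


# Hypothetical sizes, chosen only for illustration.
widths = dense_block_channels(in_channels=48, growth_rate=16, num_layers=4)
```

Because every layer contributes the same number of maps, the width grows linearly with depth, which is why the growth rate is the main knob for the size of a dense block.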
The proposed network is divided into 3 sub-networks (all based on the
DenseNet structure): an initialization network, a forward propagation network
and a backward propagation network. The initialization network is a standard
2D CNN that takes the slice located in the middle of the stack as its single
input and segments it. Once this middle slice has been segmented, the two
propagation networks propagate the existing segmentation to the neighboring
slices. Each of these sub-networks is fed with the slice to be segmented and
the existing segmentation of the neighboring slice. Both start from the
already segmented middle slice and propagate the segmentation toward opposite
ends of the stack, one slice at a time.
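The inference order described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the three sub-networks are stood in for by arbitrary callables, all names are hypothetical, and the choice that "forward" means increasing slice index is an assumption.

```python
from typing import Any, Callable, List, Sequence


def segment_volume(
    volume: Sequence[Any],
    init_net: Callable[[Any], Any],
    forward_net: Callable[[Any, Any], Any],
    backward_net: Callable[[Any, Any], Any],
) -> List[Any]:
    """Segment a stack of slices by propagating from the middle slice outward.

    init_net(slice) -> mask
    forward_net(slice, neighbor_mask) -> mask   (assumed: toward higher indices)
    backward_net(slice, neighbor_mask) -> mask  (assumed: toward lower indices)
    """
    if len(volume) == 0:
        raise ValueError("volume must contain at least one slice")
    n = len(volume)
    mid = n // 2
    masks: List[Any] = [None] * n

    # Step 1: the initialization network segments the middle slice on its own.
    masks[mid] = init_net(volume[mid])

    # Step 2: propagate toward the end of the stack, one slice at a time,
    # feeding each slice together with the mask of its already segmented neighbor.
    for i in range(mid + 1, n):
        masks[i] = forward_net(volume[i], masks[i - 1])

    # Step 3: propagate toward the start of the stack in the same way.
    for i in range(mid - 1, -1, -1):
        masks[i] = backward_net(volume[i], masks[i + 1])

    return masks
```

Only one slice and one neighboring mask are processed at any time, which is consistent with the low memory demand the method aims for.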
The initialization network is trained first, independently. The forward and
backward networks are then trained jointly.
The proposed network was then compared to standard 2D and 3D DenseNet models.
The sub-networks of the proposed model were designed so that the total number
of trainable parameters matches that of the 2D network (around 8M), for a
fair comparison. Notably, the proposed network required less than one tenth
of the memory of the 3D network.
The 2017 MICCAI/ACDC challenge dataset [4] was used to assess our
architectures. It contains 3D stacks of short-axis images and their
segmentations for 100 patients at both end diastole and end systole. The
images were resized to 128×128 pixels and their intensities were normalized.
The networks were trained and tested using a 4-fold cross-validation. The
DICE score, computed over each of the 3 classes of interest (right ventricle,
left ventricle and myocardium), was used to assess the network performance.
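For reference, the per-class DICE score for a class c is 2|P ∩ T| / (|P| + |T|), where P and T are the sets of pixels labeled c in the prediction and in the ground truth. The sketch below computes it over flattened label maps. The integer label values and the convention of returning 1.0 when a class is absent from both maps are assumptions made for this example, not details taken from the study.

```python
from typing import Dict, Sequence


def dice_per_class(
    pred: Sequence[int], truth: Sequence[int], labels: Sequence[int]
) -> Dict[int, float]:
    """Per-class DICE score between two flattened label maps of equal length."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of pixels")
    scores: Dict[int, float] = {}
    for label in labels:
        pred_count = sum(1 for p in pred if p == label)
        truth_count = sum(1 for t in truth if t == label)
        overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
        denom = pred_count + truth_count
        # Assumed convention: a class missing from both maps counts as a perfect match.
        scores[label] = 2.0 * overlap / denom if denom else 1.0
    return scores


# Hypothetical label ids for the three structures; 0 is treated as background.
pred = [0, 1, 1, 2, 2, 3]
truth = [0, 1, 2, 2, 2, 3]
scores = dice_per_class(pred, truth, labels=(1, 2, 3))
mean_dice = sum(scores.values()) / len(scores)
```

Averaging the three per-class values, as in the last line, gives a single summary number of the kind used to compare networks.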
Training was performed over 100 epochs using the Adam optimizer. The
unweighted cross-entropy was chosen as the loss function, as it led to better
performance than optimizing the DICE score directly.
Results
The DICE scores obtained by the different networks are reported in Table 1.
The propagation network showed the best results, with an average DICE score
of 93.5% computed over the three structures, an improvement of 1.7% over the
2D network. The 3D network could not be tested under optimal conditions due
to memory limitations, resulting in poor performance.
Discussion
The propagation network showed superior results compared to the 2D one,
implying that the segmentation process does benefit from the depth
information contained in the stack of images. These results also demonstrate
the need for volume segmentation networks with a low memory demand. One
useful property of the propagation network is its flexibility: trained here
to propagate information spatially, it could also be used to propagate
information temporally and capture the heart motion during a cardiac cycle,
making it a useful tool for motion correction.
Acknowledgements
We gratefully acknowledge the support of NVIDIA
Corporation with the donation of the Titan Xp GPU used for this research.
The authors would also like to acknowledge the Region Grand Est and the
Doctoral School "IAEM" from the Université de Lorraine for funding
Benjamin Roussel's PhD.
References
[1] Rickers, et al. "Utility of cardiac magnetic resonance imaging in the diagnosis of hypertrophic cardiomyopathy." Circulation 112.6 (2005): 855-861.
[2] Gatehouse, et al. "Applications of phase-contrast flow and velocity imaging in cardiovascular MRI." European Radiology 15.10 (2005): 2172-84.
[3] Litjens, et al. "A survey on deep learning in medical image analysis." Medical Image Analysis 42 (2017): 60-88.
[4] Pop, et al. Statistical Atlases and Computational Models of the Heart. ACDC and MMWHS Challenges: STACOM 2017, Revised Selected Papers. 10663, 2018.
[5] Huang, et al. "Densely Connected Convolutional Networks." CVPR. Vol. 1. No. 2. 2017.