Dimitri A Kessler¹, James W MacKay¹,², Fiona J Gilbert¹, Martin J Graves¹, and Joshua D Kaggie¹
¹Department of Radiology, University of Cambridge, Cambridge, United Kingdom, ²Norwich Medical School, University of East Anglia, Norwich, United Kingdom
Synopsis
In this study we evaluated whether transfer learning can improve the
segmentation accuracy of femoral and tibial knee articular cartilage
in a small, locally acquired and annotated dataset. Two conditional Generative Adversarial
Networks were trained: one with pretraining on the much larger SKI10 (Segmentation of Knee Images 2010) dataset and
the other with random weight initialisation and no pretraining. Pretraining not only increased cartilage segmentation
accuracy on the local fine-tuning dataset, but also helped the network
retain its ability to segment the dataset used for pretraining.
Introduction
Advances in magnetic resonance imaging (MRI) techniques can assist in the
quantification of early degenerative changes present in osteoarthritis, the most
common life-limiting joint disorder [1–3]. However, the clinical translation of these
techniques is slow because the required validation against manual segmentations
of different tissues is laborious.
Medical image segmentation is expensive and time-consuming, and it has
benefited significantly from developments in deep learning. Deep neural networks
such as conditional generative adversarial networks (cGANs) [4,5] have shown great
promise for automating the image segmentation process. However, the number of
high-quality label maps available for local datasets is typically very small, and the
performance of a network trained on few examples is limited by the lack of
heterogeneity presented during training. Transfer learning has been introduced
to mitigate this limitation: a network is pretrained on another, larger dataset
with distant or near similarities to the actual task and is then fine-tuned
on the limited dataset [6].
The aim of this study was to improve the cGAN-generated semantic label maps,
which contain segmentations of the femoral and tibial cartilages and their
underlying bone structures, for a small local dataset and to increase
overall network capacity by exploring the use of transfer learning.
Methods
cGANs model a min-max game between two networks: a
generative network, which learns to generate realistic semantic label maps from
a source image, and a discriminative network, which learns to distinguish
generated from ground truth label maps. The cGAN implemented in this study uses
a coarse-to-fine generator (global generator and local enhancer), multi-scale discriminators
(discriminator networks at different image scales) and a robust GAN loss (intermediate
feature map matching between generated and ground truth images) [5] to generate high-resolution outputs. To assess whether
transfer learning could enhance the segmentations of a small local
dataset, two cGANs were trained and compared.
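For orientation, the min-max game with multi-scale discriminators and feature matching can be written in the form given in reference 5. This is a sketch of that published formulation, not a statement of the exact loss configuration used in this study: the number of discriminators K, the number of matched layers T and the weight λ are not given in this abstract. Here s denotes the source MR image, x the ground truth label map, G(s) the generated label map, D_k^{(i)} the i-th layer feature map of discriminator D_k, and N_i the number of elements in that layer.

```latex
\min_{G}\left[\left(\max_{D_1,\dots,D_K}\sum_{k=1}^{K}\mathcal{L}_{\mathrm{GAN}}(G,D_k)\right)
  +\lambda\sum_{k=1}^{K}\mathcal{L}_{\mathrm{FM}}(G,D_k)\right]

\mathcal{L}_{\mathrm{GAN}}(G,D_k)=\mathbb{E}_{(s,x)}\left[\log D_k(s,x)\right]
  +\mathbb{E}_{s}\left[\log\left(1-D_k(s,G(s))\right)\right]

\mathcal{L}_{\mathrm{FM}}(G,D_k)=\mathbb{E}_{(s,x)}\sum_{i=1}^{T}\frac{1}{N_i}
  \left\lVert D_k^{(i)}(s,x)-D_k^{(i)}(s,G(s))\right\rVert_1
```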
Network 1: Three-dimensional fat-saturated spoiled gradient recalled-echo images
(matrix size = 512×380, zero-filled interpolated to 512×512; voxel size = 0.29×0.29×1 mm³)
from ten participants (5 healthy volunteers; 5 patients with mild-to-moderate
osteoarthritis (OA; Kellgren-Lawrence grade 2-3)) were acquired on a 3.0T MRI
system (MR750, GE Healthcare, Waukesha, WI) and used as source images in this network.
Semantic label maps including the tibia and femur bones
and the tibial and femoral cartilages were created from the images by manual
segmentation. The dataset was split into eight participants for training and two
for testing. Training was performed for 100 epochs with batch size 1.
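The zero-filled interpolation mentioned above can be illustrated with a short NumPy sketch. It assumes the interpolation is done by padding the centred k-space of each 2D slice with zeros; the function name `zero_fill_interpolate` is hypothetical, and the sketch starts from a magnitude image, whereas zero-filling on a scanner is normally applied to the complex raw data. The abstract does not state where in the pipeline this step was performed.

```python
import numpy as np

def zero_fill_interpolate(image, target_shape):
    """Interpolate a 2D image by zero-padding its centred k-space."""
    rows, cols = image.shape
    t_rows, t_cols = target_shape
    if t_rows < rows or t_cols < cols:
        raise ValueError("target_shape must be at least as large as the image")
    # Centred 2D Fourier transform of the image (k-space).
    kspace = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(image)))
    # Place the k-space data so that its centre (DC) lands on the centre
    # of the larger zero-filled matrix.
    padded = np.zeros(target_shape, dtype=complex)
    r0 = t_rows // 2 - rows // 2
    c0 = t_cols // 2 - cols // 2
    padded[r0:r0 + rows, c0:c0 + cols] = kspace
    # Back to image space; rescale so intensities match the input.
    out = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(padded)))
    scale = (t_rows * t_cols) / (rows * cols)
    return np.abs(out) * scale
```

For the local data this would correspond to a call such as `zero_fill_interpolate(slice_512x380, (512, 512))`, which adds no new information but resamples the slice onto the square matrix expected by the network.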
Network 2: Instead of using random weight initialisation, this network
was pretrained for 50 epochs with batch size 10 on 70% of the SKI10 dataset from the MICCAI grand challenge
workshop “Segmentation of Knee Images 2010” [7]. The other 30% of the SKI10 dataset was used for testing. The SKI10 dataset
consists of about 90% 1.5T and 10% 3.0T MR
images from multiple MR system vendors and includes both T1-weighted and
T2-weighted images. Before pretraining, the SKI10 images were zero-filled interpolated to 512×512 to match the matrix
size of the local dataset. Following pretraining, the network was fine-tuned for 50 epochs with batch
size 1 on the local dataset as described above for ‘Network 1’.
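The two training schedules and the data splits can be summarised in a few lines of Python. This is only a bookkeeping sketch: the helper `split_cases`, the random seed and the SKI10 case count are hypothetical placeholders, and the abstract does not say how cases were assigned to the training and testing sets.

```python
import random

def split_cases(case_ids, train_fraction, seed=0):
    """Shuffle case IDs reproducibly and split them into train/test lists."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# Training stages as (dataset, epochs, batch size), taken from the text.
NETWORK_1 = [("local", 100, 1)]                    # random initialisation
NETWORK_2 = [("SKI10", 50, 10), ("local", 50, 1)]  # pretrain, then fine-tune

# Local dataset: ten participants, eight for training and two for testing.
local_train, local_test = split_cases(range(10), 0.8)

# SKI10: 70% for pretraining, 30% for testing.
n_ski10_cases = 20  # placeholder count, NOT the real dataset size
ski10_train, ski10_test = split_cases(range(n_ski10_cases), 0.7)

def total_epochs(stages):
    """Sum the epochs over all training stages of a network."""
    return sum(epochs for _, epochs, _ in stages)
```

Written this way, it is easy to see that both networks are trained for 100 epochs in total, so the comparison differs in the data seen and the initialisation rather than in the total number of epochs.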
To assess the influence of transfer learning, both networks were
evaluated on the local and SKI10 testing sets. The Sørensen–Dice
Similarity Coefficient (DSC) and Volumetric Overlap Error (VOE) were used to evaluate
segmentation performance.
Results
Figure 1 shows the segmentation maps generated from a local dataset source
image (Fig 1A, left) by ‘Network 1’ without transfer learning (Fig 1B, left) and
by ‘Network 2’ with transfer learning (Fig 1B, right). Figure 1C shows overlays
of the ground truth cartilage segmentation (Fig 1A, right) and the cartilage
segmentations generated by the networks
without (Fig 1C, left) and with transfer learning (Fig 1C, right). The
overlapping area is shown in cyan, with the ground truth segmentation in blue
and the generated segmentation in green.
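The colour coding of such an overlay follows directly from additive RGB mixing: a pixel that is set in both the blue and the green channel appears cyan. The sketch below shows one way to build an overlay of this kind from two binary masks. The function name `overlay_masks` is hypothetical, and this is not necessarily how the figure in this abstract was produced.

```python
import numpy as np

def overlay_masks(ground_truth, generated):
    """Return an RGB overlay: ground truth in blue, generated in green.

    Pixels present in both masks have blue and green set, i.e. cyan.
    """
    gt = np.asarray(ground_truth, dtype=bool)
    gen = np.asarray(generated, dtype=bool)
    rgb = np.zeros(gt.shape + (3,), dtype=np.uint8)
    rgb[..., 1] = 255 * gen  # green channel: generated segmentation
    rgb[..., 2] = 255 * gt   # blue channel: ground truth segmentation
    return rgb
```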
Figure 2 illustrates the segmentation
results of ‘Network 1’ without (Fig 2B, left) and ‘Network 2’ with transfer
learning (Fig 2B, right) from a SKI10 dataset test image (Fig 2A, left).
DSCs and VOEs for all segmented regions, for the networks with and without
transfer learning, are given in Table 1 for the local test dataset and in
Table 2 for the SKI10 test dataset.
Discussion
The model used in this work achieved
high segmentation accuracy for all labelled regions despite the low number of
participants in the local dataset. Pretraining the network on the ten-fold
larger SKI10 dataset improved cartilage segmentation accuracy and decreased the
VOE for femoral and tibial cartilage by 3.4% and 2.6%, respectively, compared
with the network without pretraining.
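To make the two reported metrics concrete, the following sketch computes DSC and VOE for a pair of binary masks. It assumes the commonly used definitions DSC = 2|A∩B| / (|A| + |B|) and VOE = 1 − |A∩B| / |A∪B|; the abstract does not spell out its formulas, and the function names and the handling of empty masks are illustrative choices.

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Sørensen-Dice similarity coefficient of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / total

def volumetric_overlap_error(pred, truth):
    """Volumetric overlap error, i.e. 1 minus the Jaccard index."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 0.0  # both masks empty: treated here as zero error
    return 1.0 - np.logical_and(pred, truth).sum() / union
```

Under these definitions the two metrics are linked by VOE = 1 − DSC / (2 − DSC), so a higher DSC always corresponds to a lower VOE. Multiplying the returned VOE by 100 gives it as a percentage.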
Transfer learning not only
improved the segmentation accuracy on the local dataset but also enhanced the
network's ability to segment the SKI10 test dataset by introducing more
heterogeneity into the model. Even though the SKI10-pretrained network was
subsequently fine-tuned to segment the local dataset, it segmented the SKI10
test dataset with better performance than the network without
SKI10 pretraining.
Conclusion
The results
highlight the strength of transfer learning and suggest that cGANs can perform
accurate segmentation of different knee compartments, which could be used to
improve the efficiency of MRI-based joint health quantification.
Acknowledgements
We acknowledge the support of Robert L Janiczek (Experimental Medicine Imaging,
GlaxoSmithKline, London, UK). This work was supported by GlaxoSmithKline, the
European Union's Horizon 2020 Research and Innovation Programme (Grant Agreement
no. 761214), Addenbrooke's Charitable Trust, and the National Institute for Health Research
Cambridge Biomedical Research Centre.
References
1. Martel-Pelletier J, Barr AJ, Cicuttini FM, et al. Osteoarthritis. Nat Rev Dis Prim 2016; 2. DOI: 10.1038/nrdp.2016.72.
2. Matzat SJ, van Tiel J, Gold GE, et al. Quantitative MRI techniques of cartilage composition. Quant Imaging Med Surg 2013; 3: 162–74.
3. Li X, Majumdar S. Quantitative Magnetic Resonance Imaging of Articular Cartilage and its Clinical Applications. J Magn Reson Imaging 2013; 38: 991–1008.
4. Isola P, Zhu JY, Zhou T, et al. Image-to-image translation with conditional adversarial networks. Proc 30th IEEE Conf Comput Vis Pattern Recognition (CVPR 2017) 2017; 5967–5976.
5. Wang T-C, Liu M-Y, Zhu J-Y, et al. High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. arXiv preprint arXiv:1711.11585v2, 2018; 1–14.
6. Shie CK, Chuang CH, Chou CN, et al. Transfer representation learning for medical image analysis. Proc Annu Int Conf IEEE Eng Med Biol Soc EMBS 2015; 711–714.
7. Heimann T, Styner M, Warfield SK. Segmentation of Knee Images: A Grand Challenge, http://www.ski10.org/ski10.pdf (2010).