2652

Automated Segmentation of Knee Articular Cartilage on MRI Data: Increasing Network Capacity with Transfer Learning

Dimitri A Kessler¹, James W MacKay¹,², Fiona J Gilbert¹, Martin J Graves¹, and Joshua D Kaggie¹
¹Department of Radiology, University of Cambridge, Cambridge, United Kingdom, ²Norwich Medical School, University of East Anglia, Norwich, United Kingdom

Synopsis

In this study we evaluated whether transfer learning can improve the segmentation accuracy of femoral and tibial knee articular cartilage in a small, locally acquired and annotated dataset. Two conditional generative adversarial networks (cGANs) were trained: one pretrained on the much larger SKI10 (Segmentation of Knee Images 2010) dataset, and the other with random weight initialisation and no pretraining. Pretraining not only increased cartilage segmentation accuracy on the local fine-tuning dataset, but also increased the network’s capacity to preserve its segmentation capabilities on the pretraining dataset.

Introduction

Advances in magnetic resonance imaging (MRI) techniques can assist in the quantification of early degenerative changes present in osteoarthritis, the most common life-limiting joint disorder¹⁻³. However, the clinical translation of these techniques is slow because the required validation against manual segmentations of different tissues is laborious.

Manual segmentation of medical images is expensive and time-consuming, and its automation has benefited significantly from developments in deep learning. Deep neural networks such as conditional generative adversarial networks (cGANs)⁴,⁵ have shown great promise for automating the image segmentation process. However, the number of high-quality label maps available for local datasets is typically very small, and the performance of a network trained on so little data is limited by the lack of heterogeneity presented during training. Transfer learning mitigates this limitation by pretraining a network on another, larger dataset that is closely or distantly related to the actual task, followed by fine-tuning on the limited dataset⁶.

The aim of this study was to explore whether transfer learning can improve the cGAN-generated semantic label maps of a small local dataset, which contain segmentations of the femoral and tibial cartilages and their underlying bone structures, and increase overall network capacity.

Methods

cGANs model a min-max game between two networks: a generative network, which learns to generate realistic semantic label maps from a source image, and a discriminative network, which learns to distinguish generated from ground truth label maps. The cGAN implemented in this study uses a coarse-to-fine generator (global generator and local enhancer), multi-scale discriminators (discriminator networks at different image scales) and a robust GAN loss (intermediate feature map matching between generated and ground truth images)⁵ to generate high-resolution outputs. To assess whether transfer learning could enhance the segmentations of a small local dataset, two cGANs were trained and compared.
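For orientation, the min-max game with multi-scale discriminators and feature matching can be written in the general form used in reference 5. This is a sketch of that published formulation, not a statement of the exact settings used here: the number of discriminators K, the number of feature layers T, the layer sizes N_i and the weight λ are not given in this abstract, and s denotes a source image with ground truth label map x.

```latex
\min_{G} \left( \left( \max_{D_1,\dots,D_K} \sum_{k=1}^{K} \mathcal{L}_{\mathrm{GAN}}(G, D_k) \right)
  + \lambda \sum_{k=1}^{K} \mathcal{L}_{\mathrm{FM}}(G, D_k) \right)

\mathcal{L}_{\mathrm{GAN}}(G, D) =
  \mathbb{E}_{(s,x)}\left[ \log D(s, x) \right]
  + \mathbb{E}_{s}\left[ \log\left( 1 - D(s, G(s)) \right) \right]

\mathcal{L}_{\mathrm{FM}}(G, D_k) =
  \mathbb{E}_{(s,x)} \sum_{i=1}^{T} \frac{1}{N_i}
  \left\lVert D_k^{(i)}(s, x) - D_k^{(i)}(s, G(s)) \right\rVert_1
```

Here D_k^{(i)} is the i-th layer feature map of discriminator D_k. The feature matching term only updates the generator; the discriminators are trained through the adversarial term alone.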

Network 1: Three-dimensional fat-saturated spoiled gradient recalled-echo images (matrix size = 512×380, zero-filled interpolated to 512×512; voxel size = 0.29×0.29×1 mm³) from ten participants (5 healthy volunteers; 5 patients with mild-to-moderate OA, Kellgren-Lawrence grade 2-3) were acquired on a 3.0T MRI system (MR750, GE Healthcare, Waukesha, WI) and used as source images for this network. Semantic label maps including the tibia and femur bones and the tibial and femoral cartilages were created from the images by manual segmentation. The dataset was split into eight participants for training and two for testing. Training was performed for 100 epochs with batch size 1.

Network 2: Instead of using random weight initialisation, this network was pretrained for 50 epochs with batch size 10 on 70% of the SKI10 dataset from the MICCAI grand challenge workshop “Segmentation of Knee Images 2010”⁷. The remaining 30% of the SKI10 dataset was used for testing. The SKI10 dataset consists of about 90% 1.5T and 10% 3.0T MR images from multiple MR system vendors and includes both T1-weighted and T2-weighted images. Before pretraining, the SKI10 images were zero-filled interpolated to 512×512 to match the matrix size of the local dataset. Following pretraining, the network was fine-tuned for 50 epochs with batch size 1 on the local dataset described above for ‘Network 1’.
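The two training protocols can be summarised in a short sketch. The epoch counts, batch sizes and split ratios come from the text above; everything else is an assumption made for illustration, including the seeded random shuffle (the abstract does not state how cases were assigned), the helper name `split_cases`, the case identifiers, and the count of 100 SKI10 cases.

```python
import random
from typing import List, Sequence, Tuple


def split_cases(
    case_ids: Sequence[str], train_fraction: float, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle case IDs reproducibly and split them into train/test lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers; the number of SKI10 cases is an assumption.
ski10_ids = [f"ski10_{i:03d}" for i in range(100)]
local_ids = [f"local_{i:02d}" for i in range(10)]

ski10_train, ski10_test = split_cases(ski10_ids, train_fraction=0.7)
local_train, local_test = split_cases(local_ids, train_fraction=0.8)

# Training stages per network as (dataset, epochs, batch size), from the text.
SCHEDULES = {
    "network_1": [("local", 100, 1)],                   # random initialisation
    "network_2": [("ski10", 50, 10), ("local", 50, 1)],  # pretrain, then fine-tune
}
```

Both networks see the same eight local training participants; only the starting weights and the number of local epochs differ.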

To assess the influence of transfer learning, both networks were evaluated on the local and SKI10 test sets. The Sørensen–Dice Similarity Coefficient (DSC) and Volumetric Overlap Error (VOE) were used to evaluate segmentation performance.
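As a minimal sketch of the two metrics, assuming their standard definitions DSC = 2|A∩B| / (|A| + |B|) and VOE = 1 − |A∩B| / |A∪B|, computed per label on flattened label maps. The function name `dice_and_voe` and the toy label values are hypothetical, and returning perfect agreement when a label is absent from both maps is a convention chosen here, not something stated in the abstract.

```python
from typing import Sequence, Tuple


def dice_and_voe(
    pred: Sequence[int], truth: Sequence[int], label: int
) -> Tuple[float, float]:
    """Return (DSC, VOE) for one label over two flattened label maps."""
    if len(pred) != len(truth):
        raise ValueError("label maps must have the same number of voxels")
    pred_mask = [v == label for v in pred]
    truth_mask = [v == label for v in truth]
    intersection = sum(p and t for p, t in zip(pred_mask, truth_mask))
    total = sum(pred_mask) + sum(truth_mask)
    union = total - intersection
    if union == 0:
        # Label absent from both maps: treated as perfect agreement (assumption).
        return 1.0, 0.0
    dsc = 2.0 * intersection / total
    voe = 1.0 - intersection / union
    return dsc, voe


# Toy example: 0 = background, 1 = cartilage, 2 = bone (labels are illustrative).
predicted = [0, 1, 1, 1, 0, 2]
reference = [0, 1, 1, 0, 1, 2]
dsc, voe = dice_and_voe(predicted, reference, label=1)
print(f"DSC = {dsc:.3f}, VOE = {voe:.3f}")
```

The two metrics are monotonically related through VOE = 1 − DSC / (2 − DSC), so a higher DSC always corresponds to a lower VOE for the same pair of masks.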

Results

Figure 1 shows the segmentation maps generated from a local dataset source image (Fig 1A, left) by ‘Network 1’ without transfer learning (Fig 1B, left) and by ‘Network 2’ with transfer learning (Fig 1B, right). Figure 1C shows overlays of the ground truth cartilage segmentation (Fig 1A, right) and the cartilage segmentations generated by the networks without (Fig 1C, left) and with transfer learning (Fig 1C, right). The overlapping area is shown in cyan, with the ground truth segmentation in blue and the generated segmentation in green.

Figure 2 illustrates the segmentation results for a SKI10 test image (Fig 2A, left) from ‘Network 1’ without transfer learning (Fig 2B, left) and from ‘Network 2’ with transfer learning (Fig 2B, right).

DSCs and VOEs for all segmented regions, for the networks with and without transfer learning, are given in Table 1 for the local test set and in Table 2 for the SKI10 test set.

Discussion

The model used in this work achieved high segmentation accuracy for all labelled regions with a low number of participants in the local dataset. Pretraining the network on the ten-fold larger SKI10 dataset improved cartilage segmentation accuracy and decreased the VOE for femoral and tibial cartilage by 3.4% and 2.6%, respectively, compared to the network scores without pretraining.

Transfer learning not only improved the segmentation accuracy on the local dataset but also enhanced the network’s ability to segment the SKI10 test dataset by introducing more heterogeneity into the model. Even though the SKI10-pretrained network was subsequently fine-tuned to segment the local dataset, it segmented the SKI10 dataset with better performance than the network without SKI10 pretraining.

Conclusion

The obtained results highlight the strength of transfer learning and suggest that cGANs can perform accurate segmentation of different knee compartments, which could be used to improve the efficiency of MRI-based joint health quantification.

Acknowledgements

We acknowledge the support of Robert L Janiczek (Experimental Medicine Imaging, GlaxoSmithKline, London, UK). This work was supported by GlaxoSmithKline, the European Union's Horizon 2020 Research and Innovation Programme (Grant Agreement no. 761214), Addenbrooke's Charitable Trust, and the National Institute for Health Research Cambridge Biomedical Research Centre.

References

1. Martel-Pelletier J, Barr AJ, Cicuttini FM, et al. Osteoarthritis. Nat Rev Dis Prim 2016; 2. DOI: 10.1038/nrdp.2016.72.

2. Matzat SJ, van Tiel J, Gold GE, et al. Quantitative MRI techniques of cartilage composition. Quant Imaging Med Surg 2013; 3: 162–74.

3. Li X, Majumdar S. Quantitative Magnetic Resonance Imaging of Articular Cartilage and its Clinical Applications. J Magn Reson Imaging 2013; 38: 991–1008.

4. Isola P, Zhu JY, Zhou T, et al. Image-to-image translation with conditional adversarial networks. Proc 30th IEEE Conf Comput Vis Pattern Recognit (CVPR) 2017; 5967–5976.

5. Wang T-C, Liu M-Y, Zhu J-Y, et al. High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. arXiv preprint arXiv:1711.11585v2 2018; 1–14.

6. Shie CK, Chuang CH, Chou CN, et al. Transfer representation learning for medical image analysis. Proc Annu Int Conf IEEE Eng Med Biol Soc EMBS 2015; 711–714.

7. Heimann T, Styner M, Warfield SK. Segmentation of Knee Images: A Grand Challenge. http://www.ski10.org/ski10.pdf (2010).

Figures

Figure 1 – (A) 3D-SPGR FS source MR image from the local dataset (left), which was used to create the manual ground truth semantic map of all regions (right). (B) Semantic maps predicted by the cGAN without SKI10-pretraining (left) and with SKI10-pretraining (right), tested on the local MRI dataset. (C) Overlap of the manual cartilage segmentation map (blue) and the cGAN-predicted map (green) without SKI10-pretraining (left) and with SKI10-pretraining (right). Arrows indicate areas where transfer learning (TL) improved cartilage segmentation.

Figure 2 – (A) Source MR image from the SKI10 dataset (left), which was used to create the manual ground truth semantic map (right). (B) Semantic maps predicted by the cGAN without SKI10-pretraining (left) and with SKI10-pretraining (right), tested on the SKI10 MRI dataset. Arrows indicate areas where transfer learning (TL) improved cartilage segmentation.

Table 1 - Sørensen–Dice Similarity Coefficient (DSC) and Volumetric Overlap Error (VOE) for regional performance evaluation of the proposed cGANs without and with pretraining on the SKI10 dataset, tested on the local dataset. *Evaluation metrics presented as without / with transfer learning.

Table 2 - Sørensen–Dice Similarity Coefficient (DSC) and Volumetric Overlap Error (VOE) for regional performance evaluation of the proposed cGANs without and with pretraining on the SKI10 dataset, tested on the SKI10 dataset. *Evaluation metrics presented as without / with transfer learning.

Proc. Intl. Soc. Mag. Reson. Med. 28 (2020)
