2156

A privacy-preserving federated learning infrastructure for prostate segmentation on T2-Weighted MRI

Fadila Zerka¹, Mohammed Sunoqrot¹, Bendik Abrahamsen¹, Alexandros Patsanis¹, Tone Frost Bathen¹,², and Mattijs Elschot¹,²
¹Department of Circulation and Medical Imaging, NTNU, Trondheim, Norway, ²Department of Radiology and Nuclear Medicine, St. Olavs Hospital, Trondheim, Norway

Synopsis

Access to medical data is tightly restricted by law and ethics, making data sharing difficult and time-consuming. Distributed learning, in its various forms, allows models to learn from medical data without the data ever leaving the medical institutions. In this study, we evaluate the Flower federated learning framework for prostate segmentation on T2-weighted MRI. The results show that the federated learning model performs comparably to the reference (centralized learning) model.

INTRODUCTION

Segmentation of the prostate on magnetic resonance imaging (MRI) is essential for accurate prostate cancer diagnosis and treatment planning.1 Manual segmentation is subject to intra- and inter-reader variability. It is also time-consuming, owing to the lack of clear boundaries at the apex and base and the large variability in prostate shape. Fully automatic segmentation of the prostate in T2-weighted (T2W) MR images using deep learning techniques has the potential to address these issues.1,2 However, existing models for automatic prostate segmentation are typically trained on small cohorts, which limits their generalizability.1,3 Generalizability is improved by exposing the model to diverse, high-quality training data, ideally originating from multiple sources. However, patient data are highly sensitive, and sharing them raises ethical and legal concerns. Federated learning (FL) can overcome this limitation by training a model without centralizing or physically sharing sensitive patient data. The aim of this work was to evaluate the Flower FL framework4 for prostate segmentation on T2W MR images.

METHODS

In this study, we compared the performance of a centralized model and an FL model for segmenting the whole prostate gland. The experiments were performed on two separate NVIDIA GeForce GTX 1080 8GB GPUs under Ubuntu 20.04.3 LTS.
Data
We used the transverse T2W images and corresponding manual ground truth segmentations from the publicly available PROMISE12 (n=50) and PROSTATEx (n=346) challenge datasets.6,7 Three PROSTATEx subjects were excluded because their ground truth segmentations were missing. We randomly selected a global test set from the PROSTATEx data (n=45) to evaluate both the centralized and federated learning models. The remaining 298 PROSTATEx subjects were used for model training. Data were distributed as follows to simulate the centralized and FL experiments:

The centralized repository model mimics the conventional approach where training, validation, and testing occur on the same device. The centralized repository holds a combination of PROMISE12 data (n=50) and PROSTATEx data (n=298).6,7

Federated client 1 contains PROMISE12 data (n=50).6

Federated client 2 contains PROSTATEx data (n=298).7

For each experiment, pre-processing was performed as proposed by Mirzaev et al.8 for the training (80%), validation (20%), and global test sets.
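The subject counts above can be checked with a short sketch. It is illustrative only: the subject identifiers, the random seed, the helper name `split_subjects`, and the choice to split at the subject level (rather than per slice) are assumptions, not details reported in this abstract.

```python
import random

def split_subjects(subject_ids, n_test, val_fraction, seed=0):
    """Hold out a global test set, then split the rest into train/validation."""
    rng = random.Random(seed)          # fixed seed (assumed) for reproducibility
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    test = shuffled[:n_test]
    remainder = shuffled[n_test:]
    n_val = round(len(remainder) * val_fraction)
    val = remainder[:n_val]
    train = remainder[n_val:]
    return train, val, test

# Hypothetical identifiers: 346 PROSTATEx subjects minus 3 without ground truth.
prostatex_ids = [f"prostatex_{i:04d}" for i in range(346 - 3)]
train, val, test = split_subjects(prostatex_ids, n_test=45, val_fraction=0.2)

# 343 - 45 = 298 subjects remain for training and validation.
print(len(train), len(val), len(test))  # prints: 238 60 45
```

With an 80%/20% split applied to the 298 remaining subjects, this sketch yields 238 training and 60 validation subjects; the exact counts used in the study may differ if the split was made on slices.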
Federated learning
We propose a federated learning infrastructure to train a prostate segmentation model without centralizing or physically sharing sensitive patient data. We simulated an FL network of two clients. The network is based on the Flower framework,4 where each client trains a local copy of the same model on its own data before sending the updated weights to the server for aggregation. Federated averaging was employed for model weight aggregation.5 After aggregation, the server sends the aggregated weights back to the clients as an update. This process is repeated until a predefined number of rounds is reached, as depicted in Figure 1.
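The aggregation step can be sketched in a few lines of plain Python. This is a minimal illustration of federated averaging as described by McMahan et al.,5 not the Flower implementation or the code used in this study: the functions `federated_average`, `local_update`, and `run_rounds` are hypothetical, local training is replaced by a toy gradient step, and weighting clients by their subject counts (50 and 298) is an assumption.

```python
def federated_average(client_weights, client_sizes):
    """Sample-size-weighted average of per-client weight vectors."""
    total = sum(client_sizes)
    averaged = [0.0] * len(client_weights[0])
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged

def local_update(weights, target, lr=0.5):
    """Toy stand-in for local training: one gradient step on
    0.5 * (w - target)^2 for each weight."""
    return [w - lr * (w - t) for w, t in zip(weights, target)]

def run_rounds(global_weights, client_targets, client_sizes, num_rounds):
    """Repeat: broadcast global weights, update locally, aggregate."""
    for _ in range(num_rounds):
        updates = [local_update(global_weights, t) for t in client_targets]
        global_weights = federated_average(updates, client_sizes)
    return global_weights

# Two simulated clients with different (hypothetical) local optima.
targets = [[0.0, 2.0], [1.0, 4.0]]
sizes = [50, 298]  # assumed weighting by number of subjects per client
print(run_rounds([0.0, 0.0], targets, sizes, num_rounds=20))
```

In this toy setting the global weights approach the size-weighted mean of the client optima, which is why the larger client dominates the aggregated model.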
Model architecture
The model architecture is based on 2D U-Net.8,9 The hyperparameters were set according to Mirzaev et al.8
Training
First, the centralized model was trained on the data in the centralized repository; this served as our reference model. Second, the federated learning model was trained with each dataset's training data held by a separate client. The same architecture was used for the centralized model and for the two local models in the federated learning infrastructure. After training, both models were tested on the global test set.

RESULTS

We used the Dice Similarity Coefficient (DSC) to evaluate the per-slice segmentation accuracy. On the global test set, the centralized and FL models achieved a DSC (mean ± standard deviation) of 0.87±0.34 and 0.85±0.36, respectively. Figure 2 shows examples of the segmentation results for both models. Spearman's rank correlation showed a high correlation between the DSCs of the centralized and FL models, as shown in Figure 3.
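For reference, both metrics can be written out compactly. The sketch below uses the standard definitions, DSC = 2|P ∩ T| / (|P| + |T|) and Spearman's rho as the Pearson correlation of average ranks; it is not the evaluation code used in this study. The function names are hypothetical, and returning 1.0 when both masks are empty is an assumed convention that the abstract does not specify.

```python
def dice_coefficient(pred, truth, empty_value=1.0):
    """DSC for two binary masks given as flat sequences of 0/1 values."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return empty_value  # assumed convention for two empty masks
    return 2.0 * intersection / total

def average_ranks(values):
    """1-based ranks, with ties assigned the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation computed on the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# Toy per-slice masks: one overlapping pixel out of two in each mask.
print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))  # prints: 0.5
```

Given two lists of per-slice DSCs, one per model, `spearman_rho` would give the kind of rank correlation summarized in Figure 3.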

DISCUSSION

We demonstrated that the Flower FL framework can be successfully used for organ segmentation across multiple institutions (the prostate, in our use case), and that the resulting model would benefit from learning from each distributed dataset as if the data were centralized. It has already been shown that FL improves the performance of single-client models;10 our results indicate that the global FL model also performed comparably to the centralized model. Our infrastructure is based on the Flower framework, which, compared with other FL frameworks,11,12,13 scales to a larger number of clients and supports a wider range of machine learning frameworks.
Figure 3 shows some outliers for both the centralized and federated learning models. These represent poor predictions, as shown in Figures 4 and 5.
Existing FL studies of prostate segmentation are limited in network size, with approximately three clients per study.10,14 In future work, we will 1) test the scalability of the network to hundreds of clients, 2) secure the network against possible malicious attempts to extract patient-level information from the shared model weights, 3) test different weight averaging methods for federated prostate segmentation on T2W MR images, and 4) improve the model architecture to reduce poor predictions and add a pre/post-processing module to eliminate predictions outside the prostate gland.

CONCLUSION

Federated learning for prostate segmentation ensures the preservation of learning quality on distributed datasets without the need to transfer data across institutions.

References

1. Aldoj, N., Biavati, F., Michallek, F., Stober, S. & Dewey, M. Automatic prostate and prostate zones segmentation of magnetic resonance images using DenseNet-like U-net. Sci. Rep. 10, 14315 (2020).
2. Khan, Z., Yahya, N., Alsaih, K., Ali, S. S. A. & Meriaudeau, F. Evaluation of Deep Neural Networks for Semantic Segmentation of Prostate in T2W MRI. Sensors 20, 3183 (2020).
3. Zabihollahy, F., Schieda, N., Krishna Jeyaraj, S. & Ukwatta, E. Automated segmentation of prostate zonal anatomy on T2‐weighted (T2W) and apparent diffusion coefficient (ADC) map MR images using U‐Nets. Med. Phys. 46, 3078–3090 (2019).
4. Beutel, D. J. et al. Flower: A Friendly Federated Learning Research Framework. arXiv:2007.14390 [cs, stat] (2021).
5. McMahan, H. B., Moore, E., Ramage, D., Hampson, S. & Arcas, B. A. y. Communication-Efficient Learning of Deep Networks from Decentralized Data. arXiv:1602.05629 [cs] (2017).
6. Litjens, G. et al. Evaluation of prostate segmentation algorithms for MRI: The PROMISE12 challenge. Med. Image Anal. 18, 359–373 (2014).
7. Armato, S. G. et al. PROSTATEx Challenges for computerized classification of prostate lesions from multiparametric magnetic resonance images. J. Med. Imaging 5, 1 (2018).
8. Mirzaev, I. Fully convolutional neural network with residual connections for automatic segmentation of prostate structures from MR images. https://www.semanticscholar.org/paper/Fully-convolutional-neural-network-with-residual-of-Mirzaev/7d39de8b799ba1d99aaae91861dd6f773c34f6e3 (2017).
9. Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv:1505.04597 [cs] (2015).
10. Sarma, K. V. et al. Federated learning improves site performance in multicenter deep learning without data sharing. J. Am. Med. Inform. Assoc. 28, 1259–1264 (2021).
11. TensorFlow Federated. TensorFlow https://www.tensorflow.org/federated.
12. Ziller, A. et al. PySyft: A Library for Easy Federated Learning. in Federated Learning Systems: Towards Next-Generation AI (eds. Rehman, M. H. ur & Gaber, M. M.) 111–139 (Springer International Publishing, 2021). doi:10.1007/978-3-030-70604-3_5.
13. Moncada-Torres, A., Martin, F. & Sieswerda, M. VANTAGE6: an open source priVAcy preserviNg federaTed leArninG infrastructurE for Secure Insight eXchange. 8.
14. Roth, H. R. et al. Federated Whole Prostate Segmentation in MRI with Personalized Neural Architectures. arXiv:2107.08111 [cs, eess] (2021).

Figures

Figure 1: Federated learning workflow consisting of 1) a server that initializes and aggregates the learning, and 2) clients that first update the weights with local data and then send the updated weights back to the server for aggregation. After aggregation, the server sends the global model weights back to the clients, and the same steps are repeated until the maximum number of rounds is reached.

Figure 2: Segmentation results of the centralized and federated learning models.

Figure 3: Correlation analysis between the DSCs of the centralized and federated learning models.

Figure 4: Poor prediction examples of both the centralized and federated learning models.

Figure 5: A) Examples of the federated learning model correctly predicting most or large parts of the prostate, while the centralized model yields poor or no predictions. B) Examples of the centralized model correctly predicting most or large parts of the prostate, while the federated learning model yields poor or no predictions.

Proc. Intl. Soc. Mag. Reson. Med. 30 (2022) 2156

DOI: https://doi.org/10.58530/2022/2156