1690

Predicting Total Knee Replacement Surgery Using Radiomic Features Extracted from MRI Scans

Eros Montin¹,², Ozkan Cigdem¹,², Haresh Rajamohan³, Kyunghyun Cho³, Richard Kijowski⁴, Cem Deniz¹,², and Riccardo Lattanzi¹,²
¹Bernard and Irene Schwartz Center for Biomedical Imaging, Department of Radiology, New York University Grossman School of Medicine, New York, NY, United States, ²Center for Advanced Imaging Innovation and Research (CAI²R), Department of Radiology, New York University Grossman School of Medicine, New York, NY, United States, ³Center for Data Science, New York University, New York, NY, United States, ⁴Department of Radiology, New York University Grossman School of Medicine, New York, NY, United States

Synopsis

Keywords: Cartilage, Joints, Knee, MRI, Radiomics

Motivation: Predicting total knee replacement surgery can help patients and healthcare providers make informed treatment decisions.

Goal(s): To develop a machine learning model that uses radiomics to predict which patients will undergo a total knee replacement.

Approach: To extract radiomic features from images of patients and healthy subjects and train different machine learning models to predict patients' outcomes.

Results: The best model achieved an accuracy of 87.2%. Three out of the four most significant radiomic features selected in the All subset were derived from the meniscus areas, suggesting that the meniscus may play a crucial role in predicting patient outcomes.

Impact: Radiomic features from MRI scans, particularly those derived from the meniscus, effectively classify TKR-positive patients. These models could potentially predict patient outcomes and guide treatment decisions, but further research is needed to enhance performance and validate the findings in broader patient populations.

Introduction

Knee osteoarthritis (KOA) is a prevalent condition affecting millions worldwide. Total knee replacement (TKR) is a common treatment for advanced KOA. Deep learning (DL) models have emerged as promising tools for predicting KOA progression using MRI¹. Machine learning algorithms have also been employed to predict whether a patient will require TKR surgery². These advances have the potential to improve the precision of decision-making for TKR surgery. Radiomics can be used for the same purpose³⁻⁶. Radiomics is a quantitative approach that extracts a large number of features from medical images, such as MRI scans. These features can then be used to build predictive models of disease progression and treatment outcomes. In this study, we applied radiomics to distinguish patients in need of TKR from a control group.

Material & Methods

Data:
The study cohort was derived from the Osteoarthritis Initiative (OAI) database⁷, using only the sagittal intermediate-weighted turbo spin echo (TSE) images (Figure 1). We included 143 TKR-positive patients, defined as those who underwent TKR within 9 years of their baseline measurement. The control group, also comprising 143 individuals, did not undergo TKR within the same timeframe. TKR-positive patients may have undergone TKR in one or both knees.
Image Segmentation:
The pre-trained deep learning-based knee MRI cartilage segmentation model described in⁸ was used to segment the articular cartilage (femoral, medial and lateral tibial, and patellar) and the medial and lateral menisci, resulting in a total of six regions of interest (ROIs), as shown in Figure 1.
Radiomic Features Extraction:
A comprehensive set of radiomic features was extracted from the TSE images using the pyradiomics library⁹. These features encompassed first-order statistics, shape-based descriptors, and texture features⁹. Pyradiomics can also extract features from derived versions of the original TSE image, obtained through normalization, filtering, and wavelet decomposition. A total of 10,494 features were extracted per subject.
Features Selection:
Six distinct subsets of features were established based on the six ROIs employed for feature extraction, along with an additional comprehensive set comprising all extracted features (Figure 2). Due to the large number of features, Sequential Forward Floating Selection (SFFS)¹⁰ was employed to identify the four most relevant features within each subset. This approach was chosen as it effectively reduces the dimensionality of the feature space while preserving the most informative features. The SFFS algorithm was based on a K-Nearest Neighbors (KNN) classifier¹⁰.
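The floating selection step can be illustrated with a minimal sketch. It is not the implementation used in the study: it is a plain SFFS loop written with the standard library only, and it assumes a leave-one-out k-NN accuracy as the selection criterion, k = 3, and a synthetic dataset in which only feature 1 carries a (deliberately strong) signal. The function names `knn_loo_accuracy` and `sffs` are hypothetical.

```python
import random


def knn_loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of a k-NN classifier restricted to `features`."""
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample, paired with its label.
        dists = sorted(
            (sum((xi[f] - xj[f]) ** 2 for f in features), y[j])
            for j, xj in enumerate(X)
            if j != i
        )
        votes = sum(label for _, label in dists[:k])  # labels are 0 or 1
        correct += int((2 * votes > k) == bool(y[i]))
    return correct / len(X)


def sffs(X, y, n_select, score=knn_loo_accuracy):
    """Sequential Forward Floating Selection; returns (best score, feature subset)."""
    n_features = len(X[0])
    selected = []
    best = {}  # subset size -> (best score seen, subset that achieved it)
    while len(selected) < n_select:
        # Forward step: add the feature that maximizes the criterion.
        candidates = [f for f in range(n_features) if f not in selected]
        f_best = max(candidates, key=lambda f: score(X, y, selected + [f]))
        selected = selected + [f_best]
        s = score(X, y, selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, list(selected))
        # Floating step: drop a feature only if the smaller subset beats
        # the best subset of that size found so far.
        while len(selected) > 2:
            r_best = max(
                selected,
                key=lambda r: score(X, y, [f for f in selected if f != r]),
            )
            reduced = [f for f in selected if f != r_best]
            s_red = score(X, y, reduced)
            if s_red > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (s_red, list(reduced))
            else:
                break
    return best[n_select]


# Synthetic data (assumption): 20 subjects per class, 6 features,
# only feature 1 separates the two classes.
random.seed(0)
y = [0] * 20 + [1] * 20
X = [[random.gauss(0, 1) for _ in range(6)] for _ in y]
for row, label in zip(X, y):
    row[1] += 10 * label

score_value, subset = sffs(X, y, n_select=3)
print(subset, score_value)
```

On real radiomic data the criterion would typically be evaluated with a proper cross-validation scheme, and the candidate pool would be far larger than six features.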
Training and Testing:
To effectively and reliably evaluate the model's performance, a 10-fold cross-validation training procedure was implemented for each subset. Each cross-validation fold utilized a random forest classification algorithm with parameters n_estimators=100 and max_depth=5. The training data for each fold consisted of 90% of the subset's data, while the remaining 10% was used for testing.
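As a rough illustration of this evaluation scheme, the sketch below runs a 10-fold cross-validation of a random forest with n_estimators=100 and max_depth=5 using scikit-learn. The library choice, the stratified and shuffled folds, the random seeds, and the synthetic four-feature data are all assumptions made for the example; they are not taken from the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

# Synthetic stand-in for one feature subset (assumption):
# 143 controls and 143 TKR-positive subjects, four selected features.
rng = np.random.default_rng(0)
n_per_class = 143
y = np.repeat([0, 1], n_per_class)
X = rng.normal(size=(2 * n_per_class, 4))
X[y == 1] += 0.8  # hypothetical group difference

# Random forest with the parameters reported in the text.
clf = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=0)

# 10 folds: roughly 90% of the subjects for training, 10% for testing.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_validate(
    clf, X, y, cv=cv, scoring=("accuracy", "precision", "recall")
)

# Report mean (standard deviation) across folds.
for name in ("accuracy", "precision", "recall"):
    values = scores[f"test_{name}"]
    print(f"{name}: {values.mean():.3f} ({values.std():.3f})")
```

With 286 subjects, each test fold holds 28 or 29 subjects, which matches the approximate 90%/10% split described above.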
Finally, to assess how well each of the four selected features in a given subset discriminated between the two groups, a Wilcoxon rank-sum test was performed, with p < 0.01 considered significant.
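For readers unfamiliar with the test, the following sketch shows how a two-sided Wilcoxon rank-sum test can be computed for one feature. It uses the normal approximation without tie or continuity correction, which is an assumption of this example; the function name `rank_sum_test` and the feature values are hypothetical and are not data from the study.

```python
import math


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p). Ties receive midranks; no tie correction is applied
    to the variance.
    """
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # ranks are 1-based
        for t in range(i, j + 1):
            ranks[t] = midrank
        i = j + 1

    n1, n2 = len(a), len(b)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p


# Hypothetical feature values for controls and TKR-positive subjects.
controls = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
tkr_positive = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

z, p = rank_sum_test(controls, tkr_positive)
print(f"z = {z:.3f}, p = {p:.5f}, significant at 0.01: {p < 0.01}")
```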

Results

Model accuracy ranged from 62.4% to 87.2%, and precision ranged from 62.4% to 89.1% (Figure 3). The best model overall was the one including all features, which had an accuracy of 87.2% and a precision of 89.1%.
However, its recall was lower (85.8%), indicating that it was less effective at identifying all TKR-positive cases. Interestingly, three of the four most significant radiomic features selected in the “All” subset were derived from the meniscus areas (p < 0.01) (Figure 4).
The medial meniscus model was the second-best performer, with an accuracy of 79.3%, a precision of 80.2%, and a recall of 78.8%. Among the ROI-specific models, it also yielded the highest specificity (79.9%) and the lowest error rate (20.7%).
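The metrics reported here all derive from the confusion matrix of a binary classifier. The sketch below shows the standard definitions; the counts are hypothetical and chosen for easy arithmetic, and they are not the study's results. The function name `classification_metrics` is likewise an assumption of this example.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "precision": tp / (tp + fp),    # fraction of predicted positives that are correct
        "recall": tp / (tp + fn),       # fraction of actual positives that are found
        "specificity": tn / (tn + fp),  # fraction of actual negatives that are found
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }


# Hypothetical counts: TKR-positive is the positive class.
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that recall depends on false negatives (missed TKR-positive cases), whereas specificity depends on false positives, so a model can have high precision and still miss positive cases.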

Discussion and Conclusions

Our study evaluated the performance of seven machine learning models (one per ROI plus one using all features) based on radiomic features for classifying TKR status. The menisci appear to be the most important regions for predicting clinical outcome: three of the four features selected for the “All” model, which achieved 87.2% accuracy, were derived from the menisci.
Furthermore, the medial meniscus model demonstrated the best performance among the individual tissue-specific models, achieving an accuracy of 79.3% and a recall of 78.8%. Further research is warranted to improve the models' recall and to validate their performance in larger and more diverse patient populations.

Acknowledgements

This work was supported in part by NIH grant R01 AR074453, and was performed under the rubric of the Center for Advanced Imaging Innovation and Research (CAI²R, www.cai2r.net), a NIBIB Biomedical Technology Resource Center (NIH P41 EB017183).

References

1. O. Cigdem, C. M. Deniz, Artificial intelligence in knee osteoarthritis: A comprehensive review for 2022, Osteoarthritis Imaging 3 (3) (2023) 100161.
2. Y.X. Teoh, K.W. Lai, J. Usman, S.L. Goh, H. Mohafez, K. Hasikin, P. Qian, Y. Jiang, Y. Zhang, S. Dhanalakshmi, Discovering knee osteoarthritis imaging features for diagnosis and prognosis: review of manual imaging grading and machine learning approaches, J. Healthc. Eng., 2022 (2022), Article 4138666, 10.1155/2022/4138666.
3. Bologna, M., Corino, V. D. A., Montin, E., Messina, A., Calareso, G., Greco, F. G., Sdao, S., & Mainardi, L. T. (2018). Assessment of Stability and Discrimination Capacity of Radiomic Features on Apparent Diffusion Coefficient Images. Journal of Digital Imaging. https://doi.org/10.1007/s10278-018-0092-9
4. Corino, V. D. A., Montin, E., Messina, A., Casali, P. G., Gronchi, A., Marchianò, A., & Mainardi, L. T. (2018). Radiomic analysis of soft tissue sarcomas can distinguish intermediate from high-grade lesions. Journal of Magnetic Resonance Imaging, 47(3), 829–840. https://doi.org/10.1002/jmri.25791
5. Bologna, M., Calareso, G., Resteghini, C., Sdao, S., Montin, E., Corino, V., Mainardi, L., Licitra, L., & Bossi, P. (2020). Relevance of apparent diffusion coefficient features for a radiomics-based prediction of response to induction chemotherapy in sinonasal cancer. NMR in Biomedicine, March 2019.
6. Montin E, Kijowski R, Youm T, Lattanzi R. A radiomics approach to the diagnosis of femoroacetabular impingement. Front Radiol. 2023 Mar 20;3:1151258. doi: 10.3389/fradi.2023.1151258. PMID: 37492381; PMCID: PMC10365279.
7. G. Lester, Clinical research in OA–the NIH osteoarthritis initiative, J. Musculoskelet. Neuronal Interact. 8 (4) (2008) 313–314.
8. A. D. Desai, F. Caliva, C. Iriondo, A. Mortazi, S. Jambawalikar, U. Bagci, M. Perslev, C. Igel, E. B. Dam, S. Gaj, M. Yang, X. Li, C. M. Deniz, V. Juras, R. Regatte, G. E. Gold, B. A. Hargreaves, V. Pedoia, A. S. Chaudhari, IWOAI Segmentation Challenge Writing Group, The international workshop on osteoarthritis imaging knee MRI segmentation challenge: A multi-institute evaluation and analysis framework on a standardized dataset, Radiol. Artif. Intell. 3 (3) (2021) e200078. https://github.com/denizlab/2019_IWOAI_Challenge.
9. van Griethuysen, J. J. M., Fedorov, A., Parmar, C., Hosny, A., Aucoin, N., Narayan, V., Beets-Tan, R. G. H., Fillion-Robin, J. C., Pieper, S., Aerts, H. J. W. L. (2017). Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Research, 77(21).
10. Nakariyakul, Songyot and David Paul Casasent. “Improved forward floating selection algorithm for feature subset selection.” 2008 International Conference on Wavelet Analysis and Pattern Recognition 2 (2008): 793-798.

Figures

Figure 1: Images and ROIs used in this study. Images were acquired using a sagittal intermediate-weighted turbo spin echo sequence (TE = 30 ms, TR = 3200 ms, FOV = 160 mm, slice thickness = 3.0 mm, in-plane resolution = 0.36 mm × 0.51 mm, bandwidth = 248 Hz/pixel, matrix size = 444 × 448 × 37). The femoral cartilage is shown in yellow, the patellar cartilage in brown, the medial meniscus in red, the lateral tibial cartilage in green, the medial tibial cartilage in blue, and the lateral meniscus in purple.

Figure 2: Workflow for training and validating random forest (RF) models to distinguish TKR and non-TKR patients based on radiomic features. The 6 ROIs were automatically segmented. The blue square marks the parts of the workflow that were repeated for each of the 7 subsets.

Figure 3: Mean (standard deviation) performance across the 20 cross-validation (CV) models for the different feature subsets. Each model was evaluated on the testing dataset associated with the specific subset. The All subset represents the model trained using the four most informative features among all 10,494 features. The other subsets group the four most informative features associated with specific ROIs. The Matthews correlation coefficient (MCC) and the area under the receiver operating characteristic curve (AUC) were among the metrics used to assess the performance of the models.

Figure 4: Two radial plots showing the normalized values of the 4 most informative features among all features (left) and in the most informative region, the medial meniscus subset (right). The bold line represents the average value across all subjects, whereas the shaded area spans the 25th to 75th percentiles of the feature distribution. The asterisk (*) indicates a p-value < 0.01 in differentiating TKR patients.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)

DOI: https://doi.org/10.58530/2024/1690