2970

3D Texture Analysis of 3D DESS Cartilage Images for Prediction of Knee Osteoarthritis
Daniel Uher1, Ari Väärälä1, Antti Isosalo1, Victor Casula1,2, and Miika T. Nieminen1,2,3
1Research Unit of Medical Imaging, Physics and Technology, University of Oulu, Oulu, Finland, 2Medical Research Center, University of Oulu and Oulu University Hospital, Oulu, Finland, 3Department of Diagnostic Radiology, Oulu University Hospital, Oulu, Finland

Synopsis

In this study, a gray level co-occurrence matrix (GLCM) based 3D Texture Analysis method was used for early prediction of knee osteoarthritis (OA) from 3D DESS images. Twenty subjects with a Kellgren-Lawrence (KL) score of 0 at baseline were selected from the Osteoarthritis Initiative. Ten of the selected subjects developed the disease and showed KL ≥ 2 at the 36-month visit. Knee DESS images were analyzed using various quantization schemes, and three machine learning models were trained on the resulting GLCM features. A Naïve Bayes model trained on tibial features showed the highest accuracy (86.8%) for predicting OA onset after 36 months.

Introduction

Three-dimensional texture analysis of cartilage DESS images based on gray level co-occurrence matrices (GLCM) has previously demonstrated the ability to predict osteoarthritis (OA) onset years before the manifestation of radiographic signs.1 The prior study used only a single set of fixed gray levels with 8-bin quantization and an offset of 1. The aim of this study was to identify the optimal set of 3D Texture Analysis input parameters for classifying subjects who will develop OA. This was done by applying several machine learning models and varying the gray level quantization range, the number of bins, and the offset.

Methods

Baseline knee MRI scans of twenty subjects were obtained from the Osteoarthritis Initiative2 (OAI) database. None of the selected subjects showed any radiographic signs of the disease at the baseline screening (Kellgren-Lawrence score, KL = 0). Ten control subjects were selected from the OAI Control cohort with KL = 0 at baseline and follow-up visits (CTRL group). Another ten subjects were selected from the OAI Incidence or Progression cohort with rapid progression from KL = 0 at baseline to KL ≥ 2 at the 36-month visit (PRGS group). The CTRL and PRGS groups were matched for sex, age (±2 years) and body mass index (BMI, ±2 kg/m²). For each subject, the relative change in BMI over 36 months was within 10%.

Femoral and tibial cartilages were automatically segmented from DESS images using a previously validated deep learning tool3 (Figure 1). Nineteen GLCM features were extracted for four cartilage layers (Figure 2): at 10% (L10), 50% (L50) and 90% (L90) of the cartilage thickness, and over the full cartilage thickness (SUM). GLCMs were calculated using different gray level ranges, with the minimum and maximum levels assigned either statically or dynamically. The dynamic approach assigned gray levels according to the cartilage pixel intensity distribution, with a 2% margin to disregard outliers. The static maximum gray levels were 300, 400, 500, 600, 700 and 800. Pixel intensities outside the quantization range were assigned to the first or last bin. Quantization schemes of 4, 8, 12, 16, 32 and 64 bins were used, with offsets of 1, 2, 3 and 4.
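The quantization and co-occurrence counting described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the 2% margin is read as trimming 2% of the voxels at each tail of the intensity distribution, the 13 unique 3D directions are used for every offset, the GLCM is made symmetric and normalized, and contrast is shown only as one commonly used GLCM feature (the 19 features are not listed in this abstract). The names dynamic_range, quantize, glcm_3d and contrast are hypothetical.

```python
from itertools import product


def dynamic_range(intensities, margin=0.02):
    """Return (lo, hi) after trimming `margin` of the values at each tail.

    Assumption: this is one possible reading of the "2% margin".
    """
    ordered = sorted(intensities)
    k = int(margin * len(ordered))
    return ordered[k], ordered[-1 - k]


def quantize(value, lo, hi, n_bins):
    """Map an intensity to a bin index in [0, n_bins - 1].

    Values outside [lo, hi] fall into the first or last bin.
    """
    if value <= lo:
        return 0
    if value >= hi:
        return n_bins - 1
    return min(int((value - lo) / (hi - lo) * n_bins), n_bins - 1)


# Assumption: 13 unique directions (half of the 26-voxel neighbourhood).
DIRECTIONS = [d for d in product((-1, 0, 1), repeat=3) if d > (0, 0, 0)]


def glcm_3d(voxels, lo, hi, n_bins, offset=1):
    """Symmetric, normalized GLCM of a masked 3D region.

    `voxels` maps (z, y, x) -> intensity for cartilage voxels only, so
    pairs that reach outside the mask are skipped automatically.
    """
    levels = {pos: quantize(v, lo, hi, n_bins) for pos, v in voxels.items()}
    glcm = [[0] * n_bins for _ in range(n_bins)]
    for (z, y, x), a in levels.items():
        for dz, dy, dx in DIRECTIONS:
            b = levels.get((z + dz * offset, y + dy * offset, x + dx * offset))
            if b is not None:
                glcm[a][b] += 1
                glcm[b][a] += 1  # count each pair in both orders
    total = sum(map(sum, glcm))
    if total == 0:
        return glcm
    return [[count / total for count in row] for row in glcm]


def contrast(glcm):
    """GLCM contrast: sum over i, j of p(i, j) * (i - j)^2."""
    return sum(
        p * (i - j) ** 2
        for i, row in enumerate(glcm)
        for j, p in enumerate(row)
    )
```

For a dynamic range, `lo, hi = dynamic_range(voxels.values())` would be passed to `glcm_3d`; for a static range, `hi` would be one of the listed maxima and `lo` a fixed minimum, whose value is not stated here.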

Naïve Bayes (NB), Support Vector Machine (SVM, using a radial basis function kernel) and Multilayer Perceptron4 (MLP, 2 hidden layers, each containing 11 neurons) models were trained and tested using either all 19 features or selected features from either femoral or tibial cartilage. Training and testing were repeated 200 times to estimate the accuracy of each GLCM output feature set, and the average accuracy across the 200 repetitions was calculated. Each repetition used random initial conditions and a random training and testing split. The testing set consisted of either a single subject pair or 1-3 randomly selected subjects regardless of their cohort. The remaining subjects were used for training. The training set was bootstrapped to always contain 20 subjects. Each run of 200 repetitions yielded one set of results for the chosen model.
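The repeated evaluation protocol can be sketched as follows. This is an illustrative reading rather than the code used in the study: it covers only the variant with 1-3 randomly selected test subjects, it assumes the bootstrap resamples the remaining subjects with replacement up to 20, and it uses a small hand-written Gaussian Naïve Bayes as a stand-in classifier. The names gaussian_nb and repeated_holdout, and the var_floor parameter, are hypothetical.

```python
import math
import random
import statistics


def gaussian_nb(train_x, train_y, test_x, var_floor=1e-6):
    """Fit a Gaussian Naive Bayes model and return labels for test_x.

    var_floor is an assumed safeguard against zero variance, which can
    occur when a bootstrap sample repeats the same subject.
    """
    model = {}
    for c in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == c]
        cols = list(zip(*rows))
        means = [statistics.mean(col) for col in cols]
        variances = [statistics.pvariance(col) + var_floor for col in cols]
        model[c] = (math.log(len(rows) / len(train_y)), means, variances)

    def log_posterior(x, c):
        prior, means, variances = model[c]
        return prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances)
        )

    return [max(model, key=lambda c: log_posterior(x, c)) for x in test_x]


def repeated_holdout(features, labels, fit_predict,
                     n_reps=200, train_size=20, seed=0):
    """Average test accuracy over repeated random splits."""
    rng = random.Random(seed)
    subjects = range(len(labels))
    accuracies = []
    for _ in range(n_reps):
        # Hold out 1-3 random subjects regardless of cohort.
        test_idx = rng.sample(subjects, rng.randint(1, 3))
        pool = [i for i in subjects if i not in test_idx]
        # Bootstrap the remaining subjects up to a fixed training size.
        train_idx = [rng.choice(pool) for _ in range(train_size)]
        preds = fit_predict(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [features[i] for i in test_idx],
        )
        hits = sum(p == labels[i] for p, i in zip(preds, test_idx))
        accuracies.append(hits / len(test_idx))
    return statistics.mean(accuracies)
```

Calling `repeated_holdout(features, labels, gaussian_nb)` with one feature vector and one cohort label per subject returns a single averaged accuracy, which corresponds to one entry in a set of results for a given GLCM parameter combination.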

Results

The highest accuracy (86.8%) was achieved using Naïve Bayes on tibial cartilage features (Table 1). The average difference between best and worst accuracies was 30.8 ± 11.2%. Naïve Bayes yielded the highest overall accuracy for tibial predictions (Table 2). SVM showed a performance spike (79.3%) with selected L50 tibial features; otherwise, the accuracies of SVM predictions remained below 50%. MLP showed the highest overall accuracies in femoral L10 and L50 predictions (62.8% and 63.2%, respectively). The best predictions were obtained using tibial L50 and SUM features. Tibial features outperformed femoral features across all models by 13.1% for L10, 16.2% for L50, 18.2% for L90 and 14.8% for SUM. The five best performers across all sets of results for each layer are listed in Table 3.

Discussion

The average difference between best and worst accuracies highlights the importance of input parameter selection. Based on the results, dynamic gray level assignments and/or higher static maximum gray levels seem to benefit the prediction. The best accuracies were associated with lower numbers of bins (4-12 bins seemed to be optimal for this subject dataset). Naïve Bayes provided the best results for tibial features, and MLP seemed to be most beneficial for femoral features. SVM has shown good performance in OA studies reported in the literature.5 This and the observed performance spike suggest that a further parameter search with different kernel functions should be conducted.

The dataset in this study consisted of only 20 subjects and a single timepoint (baseline) because of the time-consuming nature and complexity of the 3D texture analysis. The small sample size is arguably the biggest limitation of this study because of its probable impact on the classifiers. Future studies with larger sample sizes will be performed to build upon these findings.

Conclusion

The GLCM features calculated with 3D Texture Analysis showed promising predictive capabilities. Adjusting the minimum and maximum gray levels to the analyzed cartilage had a positive impact on the predictions. Lower numbers of quantization bins also yielded better accuracy. Naïve Bayes yielded the most convincing results. A further study is warranted to implement machine learning classification models and thereby maximize the predictive potential.

Acknowledgements

Financial assistance by Jane and Aatos Erkko Foundation is gratefully acknowledged. The authors would also like to thank Egor Panfilov for his valuable advice and expertise.

References

1. Väärälä A., Peuna A., Panfilov E., Casula V., Haapea M., Lammentausta E., and Nieminen M.T. (2020). Gray Level Co-occurrence Matrix Based 3D Texture Analysis of Knee Articular Cartilage using 3D DESS Images. Proc. Intl. Soc. Magn. Reson. Med. 28 (350).

2. Osteoarthritis Initiative (OAI) study protocol. URL: https://nda.nih.gov/oai/study-details

3. Panfilov, E., Tiulpin, A., Klein, S., Nieminen, M. T., & Saarakkala, S. (2019). Improving robustness of deep learning based knee MRI segmentation: Mixup and adversarial domain adaptation. In Proceedings of the IEEE International Conference on Computer Vision Workshops (pp. 0-0).

4. Eraqi H. (2016). MLP Neural Network with Backpropagation. MATLAB Central File Exchange. URL: https://www.mathworks.com/matlabcentral/fileexchange/54076-mlp-neural-networkwith-backpropagation

5. Kokkotis C., Moustakidis S., Papageorgiou E., Giakas G. & Tsaopoulos D.E. (2020). Machine learning in knee osteoarthritis: A review. Osteoarthritis and Cartilage Open, Volume 2, Issue 3. DOI: https://doi.org/10.1016/j.ocarto.2020.100069

Figures

Table 1. The five best-performing 3D Texture Analysis outputs (sorted by accuracy) for each layer using Naïve Bayes and selected tibial features.

Table 2. Overall accuracies calculated for each trained model.

Table 3. A list of best performing 3D Texture Analysis outputs across all sets of results.

Figure 1. 3D DESS image data for a randomly selected patient. The white region outlines the segmented cartilage mask.

Figure 2. Example of the cartilage layers used in the 3D Texture Analysis. The blue dashed line represents the thickness of the cartilage. L10, L50 and L90 indicate the layer heights at which the cartilage was analyzed. SUM represents the full cartilage thickness.

Proc. Intl. Soc. Mag. Reson. Med. 29 (2021)