3247

Importance of Clinical MRI Features in Predicting Epilepsy Drug Treatment Outcome for Pediatric Tuberous Sclerosis Complex

Jun Yang^1,2, Cailei Zhao³, Shi Su⁴, Zhanqi Hu⁵, Jianxiang Liao⁵, Dong Liang^1,2,4, and Haifeng Wang^2,4
¹Research Centre for Medical AI, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China, ²University of Chinese Academy of Sciences, Beijing, China, ³Department of Radiology, Shenzhen Children’s Hospital, Shenzhen, China, ⁴Paul C. Lauterbur Research Centre for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China, ⁵Department of Neurology, Shenzhen Children’s Hospital, Shenzhen, China

Synopsis

Predicting epilepsy drug treatment outcome is important for treating children with tuberous sclerosis complex (TSC). Here, the best performing model was selected to explore the contribution of the features, using permutation importance (PIMP). An approach similar to PIMP was used to compare the magnetic resonance imaging (MRI) and non-MRI features. The best model multilayer perceptron (MLP) with a hidden layer size of 60 and 30 features selected by F-test achieved the best performance. The results based on 103 children patients showed that some features were more important than others, and MRI features contributed more than non-MRI features in prediction.

INTRODUCTION

TSC is caused by the mutation of the TSC1 or TSC2 genes ^{1, 2}. Different symptoms may appear with the development of hamartomas in many organs ¹. Almost 2/3 of the TSC patients with epilepsy have seizure onset in the first year of life, and 60% of them have drug-resistant epilepsy ³, leading to higher mortality ⁴. Therefore, to better treat TSC patients, it is important to predict drug-resistant epilepsy. Based on clinical experience, the locations and types of the cortical tubers have a certain relationship with the drug treatment outcomes, and machine learning models have been applied to predict the outcomes ⁵. However, the contributions of the features to the results are still unclear. In this study, we further improve the model performance and explore the feature contributions. The results based on 103 pediatric patients show that MRI features contribute more than non-MRI features.

METHODS

The analysis of variance F-test was used for feature selection. Three models, including MLP ⁶, Naive Bayes, and random forest ⁷ were used for the prediction. PIMP and a similar method were used to explore the contribution of the features. The area under curve (AUC) was used as the evaluation metric in the experiments.
The IRB approved data were obtained from 103 TSC children patients who had anti-epileptic drug (AED) treatments for at least one year, and consisted of MRI and non-MRI parts. Also, the brain was divided into 5 lobes, denoted as FL, PL, TL, OL, and IC. Cortical tubers were classified into 16 types based on medical knowledge, and each type of cortical tubers in each lobe was counted, respectively. These numbers were used as features whose naming formats were Lobe-Type. Other clinical features were also added, to get better performance. Furthermore, the target was whether epilepsy had been controlled by the AEDs.
Experiments were implemented using Scikit-learn ⁸. Stratified 10-fold cross-validation was used. Every experiment was repeated 50 times to calculate the mean value and 95% confidence interval (CI) of the AUCs. Initially, the hyper-parameters of the three models were kept at default. Multinomial Naive Bayes was used because the MRI features were lesion counts, and other features were natural numbers or had been converted. The number of features selected by F-test is denoted as k. Due to the small amount of data, making a trade-off on k was needed. Multiple k, ranging from 5 to 109, were tried on the 3 models. The best model MLP with k = 30, was selected for the following experiment. The number of the hidden layer size of the MLP is denoted as n. Similarly, a trade-off was made on the value of n. Multiple n, ranging from 5 to 200, were evaluated. Then, the best performing n was selected for the next experiment. The PIMP of feature f is the decrease in model performance after randomly permuting f. The PIMPs of the selected 30 features were calculated. For comparison, their F-values in the feature selection step were gathered. Based on the idea of PIMP, the MRI and non-MRI features were permuted in two settings. For comparison, two additional settings were added, one of which permuted all features, and the other did nothing. Then the receiver operating characteristic (ROC) curves of the four settings were calculated.

RESULTS & DISCUSSION

Fig. 1 shows an example of cortical tubers observed in an MRI of a pediatric TSC patient. The AUCs of the three models with different k are shown in Fig. 2. The highest AUCs (95% CI) of MLP, Naïve Bayes, and random forest, were 0.837 (±0.004) at k=30, 0.811 (±0.003) at k=30, and 0.745 (±0.005) at k=10, respectively. Also, the AUCs of the MLP with different n are shown in Fig. 3, and the highest AUC (95% CI) was 0.841 (±0.004), at n=60. Furthermore, the PIMPs and F-values of the selected 30 features are shown in Fig. 4. The top 3 features of PIMP were OL-B2 (of 0.031), hypomelanotic macules (of 0.024), and FL-A3 (of 0.0084), where the importance increases with the PIMP. Meanwhile the top 3 features of F-values were hypomelanotic macules, TL-B2, and PL-B4. The results showed that a higher F value does not necessarily mean higher PIMP. There were 20 MRI features and 10 non-MRI features in the 30 selected features. The ROC curves of the four settings are shown in Fig. 5. The AUC of permuting MRI features (0.615) was lower than permuting non-MRI features (0.721). Based on this, we can consider that MRI features were more important than non-MRI features in prediction.

CONCLUSION

The model MLP with 60 neurons in the hidden layer and using 30 features selected by F-test achieves the best AUC in predicting epilepsy drug treatment outcome for TSC children patients. Also, the PIMPs of the selected 30 features were calculated, and the results based on 103 pediatric patients show that some features are more important than others. Especially, among the selected 30 features, MRI features contribute more than non-MRI features in prediction.

Acknowledgements

Some of the work was partially supported by the Sanming Project of Medicine in Shenzhen (SZSM201812005), the Shenzhen Peacock Plan Team Program (KQTD20180413181834876), and the Natural Science Foundation of Guangdong Province (2018A0303130132).

References

1. M. Słowińska, S. Jóźwiak, A. Peron, J. Borkowska, D. Chmielewski, K. Sadowski, E. Jurkiewicz, A. Vignoli, F. L. Briola, M. P. Canevini, and K. Kotulska-Jóźwiak, “Early diagnosis of tuberous sclerosis complex: a race against time. How to make the diagnosis before seizures?,” Orphanet Journal of Rare Diseases, vol. 13, no. 1, 2018.

2. K. A. Orlova and P. B. Crino, “The tuberous sclerosis complex,” Annals of the New York Academy of Sciences, vol. 1184, no. 1, pp. 87–105, 2010.

3. M. Fohlen, D. Taussig, S. Ferrand-Sorbets, M. Chipaux, N. Dorison, O. Delalande, and G. Dorfmüller, “Refractory epilepsy in preschool children with tuberous sclerosis complex: Early surgical treatment and outcome,” Seizure, vol. 60, pp. 71–79, 2018.

4. S. An, K. Malhotra, C. Dilley, E. Han-Burgess, J. N. Valdez, J. Robertson, C. Clark, M. B. Westover, and J. Sun, “Predicting drug-resistant epilepsy — A machine learning approach based on administrative claims data,” Epilepsy & Behavior, vol. 89, pp. 118–125, 2018.

5. J. Yang, C. Zhao, S. Su, D. Liang, Z. Hu, H. Wang, H. Liao. “Machine Learning in Epilepsy Drug Treatment Outcome Prediction Using Multi-modality Data in Children with Tuberous Sclerosis Complex,” Proc. of the 6th International Conference on Big Data and Information Analytics (BigDIA), Shenzhen, China, pp. 100-103, Dec. 2020.

6. K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Networks, vol. 2, no. 5, pp. 359–366, 1989.

7. L. Breiman, “Random Forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

8. F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.

Figures

Fig. 1. One example of cortical tubers of one pediatric TSC patient observed in MR images.

Fig. 2. Mean AUCs and their 95% CIs of three models using different numbers of features selected by the F-test method.

Fig. 3. Mean AUCs and their 95% CIs of MLP with different hidden layer size, using 30 features selected by the F-test method.

Fig. 4. PIMPs with their 95% CIs and F-values of the selected 30 features.

Fig. 5. Mean ROC curves of 4 settings of feature permutation. Mean AUCs and their 95% CIs are shown in the legend.

Proc. Intl. Soc. Mag. Reson. Med. 29 (2021)

3247