Cindy Xue1,2, Jing Yuan1, Yihang Zhou1, Oilei Wong1, Kin Yin Cheung3, and Siu Ki Yu3
1Research Department, Hong Kong Sanatorium and Hospital, Hong Kong, Hong Kong, 2Department of Imaging and Interventional Radiology, The Chinese University of Hong Kong, Hong Kong, Hong Kong, 3Medical Physics Department, Hong Kong Sanatorium and Hospital, Hong Kong, Hong Kong
Synopsis
Radiomics features are increasingly used as potential quantitative imaging biomarkers for head and neck cancer diagnosis and treatment. However, the test-retest repeatability of radiomics features in the head and neck (HN) region has rarely been investigated. This study investigates the test-retest repeatability of MRI radiomics features in the HN region. Radiomics features were extracted from 15 volunteers who each received 4 re-positioned MRI scans with thermoplastic mask immobilization and flexible coils using 3D-T1w-TSE and 3D-T2w-TSE. The test-retest repeatability of MRI radiomics features varied with tissue and pulse sequence, and only a small percentage of radiomics features showed excellent test-retest repeatability.
Introduction
Head and neck cancer (HNC) is a clinically and biologically complex and heterogeneous disease [1]. Radiomics has been increasingly investigated as a source of potential quantitative imaging biomarkers to facilitate personalized HNC diagnosis and treatment [2]. However, the generalizability of radiomics is hindered by limited feature reliability, which stems from the many sources of variability and uncertainty throughout the complex radiomics workflow, especially in the presence of highly heterogeneous head and neck (HN) tissues [3]. Furthermore, the repeatability of radiomics features across repeated magnetic resonance imaging (MRI) acquisitions, which is considered a crucial factor in feature reliability, has rarely been investigated. This study was prospectively conducted to examine the multi-scan test-retest repeatability of MRI radiomics features in HN tissues, with a view to potential magnetic resonance-guided radiotherapy (MRgRT) of HNC.
Methods
Fifteen healthy volunteers (age range: 24-40 years) each received four re-positioned scans within one day on a 1.5 T MRI scanner dedicated to radiotherapy simulation, with thermoplastic mask immobilization and flexible coils, using 3D-T1w-TSE and 3D-T2w-TSE sequences. The imaging parameters of both sequences are shown in Table 1.
MR images were rigidly registered to the first-scan images using 3D Slicer. Ten spherical VOIs (pons, left (L) and right (R) parotid glands, mandible, tongue, L/R pterygoid muscles, thyroid, and L/R submandibular glands) were manually drawn by an MRI physicist on the first-scan (reference) 3D-T1w-TSE images, as shown in Figure 1. All VOIs were then propagated to the other registered image sets and visually checked by the same MRI physicist.
Radiomics features were extracted using PyRadiomics (v2.2.0) with the default fixed bin width of 25. A total of 93 radiomics features in 6 categories (first-order, n=18; texture_GLCM, n=24; texture_GLDM, n=14; texture_GLRLM, n=16; texture_GLSZM, n=16; texture_NGTDM, n=5) were extracted.
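The extraction settings described above could be configured roughly as follows. This is a minimal sketch, not the authors' script: the class names are PyRadiomics' own identifiers and the per-class counts are copied from the text, while the helper name `build_extractor` and the omission of shape features (inferred from the category list) are assumptions.

```python
# Feature classes and per-class feature counts as reported in the text.
FEATURE_CLASSES = {
    "firstorder": 18,
    "glcm": 24,
    "gldm": 14,
    "glrlm": 16,
    "glszm": 16,
    "ngtdm": 5,
}

# Fixed bin width reported in the text.
SETTINGS = {"binWidth": 25}

# Sanity check on the reported total number of features.
TOTAL_FEATURES = sum(FEATURE_CLASSES.values())


def build_extractor():
    """Hypothetical helper: an extractor limited to the six classes above."""
    # Imported lazily so the constants above are usable without PyRadiomics.
    from radiomics import featureextractor

    extractor = featureextractor.RadiomicsFeatureExtractor(**SETTINGS)
    # Start from an empty selection, then enable only the listed classes
    # (shape features are left out; this is an assumption).
    extractor.disableAllFeatures()
    for class_name in FEATURE_CLASSES:
        extractor.enableFeatureClassByName(class_name)
    return extractor
```

Calling `build_extractor().execute(image_path, mask_path)`, with hypothetical paths to one scan and its VOI mask, would return a dictionary of feature values together with diagnostic entries.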
Radiomics feature test-retest repeatability was evaluated using the intraclass correlation coefficient (ICC; two-way mixed effects, absolute agreement). Feature repeatability was classified as excellent (ICC>0.9), good (0.75<ICC<0.9), moderate (0.5<ICC<0.75), or poor (ICC<0.5) when the calculated ICC and its 95% confidence interval (CI) both fell within the corresponding thresholds, following Koo and Li [4].
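The ICC and the CI-based classification can be sketched as below for one feature in one VOI. Several points are assumptions rather than details given in the abstract: the single-measure form of the absolute-agreement ICC is used, the function names are hypothetical, the 95% CI is assumed to come from a statistics package, and returning `None` when the CI straddles a threshold is one possible reading of the classification rule.

```python
def icc_absolute_agreement(ratings):
    """Single-measure, absolute-agreement ICC from a two-way ANOVA.

    ratings: list of rows, one row per subject, one column per repeated scan.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of repeated scans
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), scans (columns) and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def classify_repeatability(ci_low, ci_high):
    """Label a feature only if its whole 95% CI lies inside one band.

    The ICC itself lies inside its CI, so checking the bounds is sufficient.
    """
    bands = [
        ("excellent", 0.9, float("inf")),
        ("good", 0.75, 0.9),
        ("moderate", 0.5, 0.75),
        ("poor", float("-inf"), 0.5),
    ]
    for label, lower, upper in bands:
        if lower < ci_low and ci_high < upper:
            return label
    return None  # CI straddles a threshold (assumed handling)
```

Because the absolute-agreement form penalizes systematic offsets between scans, two scans that differ by a constant shift (for example `[[1, 2], [2, 3], [3, 4]]`) give an ICC below 1 even though the subjects are ranked identically.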
ICC values of the two sequences (3D-T1w-TSE and 3D-T2w-TSE) were compared using Student's t-test. ANOVA with Bonferroni correction was used to compare the ICC values among the VOIs. A P-value smaller than 0.05 was considered statistically significant. All statistical tests were conducted using RStudio (RStudio PBC, Boston, MA, USA).
Results
The radiomics feature test-retest repeatability evaluated by ICC varied with VOI and pulse sequence, as shown in the boxplots of ICC for both pulse sequences (Figure 2). Overall, ICC values were significantly higher for 3D-T2w-TSE (0.473±0.249) than for 3D-T1w-TSE (0.418±0.270) (P<0.001). This suggests that better test-retest repeatability in the head and neck region may be achievable with a 3D-T2w-TSE sequence. The ICC values also varied significantly among regions for both pulse sequences (P<0.001). Among the paired VOIs (L/R parotid glands, L/R pterygoid muscles, and L/R submandibular glands), there was no significant difference in ICC values between the left and right sides (P>0.05) for either sequence, except in the submandibular glands (P<0.05) for the 3D-T2w-TSE sequence.
Figure 3 shows the percentages of radiomics features with excellent, good, moderate, and poor ICCs in the different VOIs for the 3D-T1w-TSE and 3D-T2w-TSE sequences, respectively. Only a small percentage of radiomics features showed excellent repeatability for 3D-T1w-TSE (5.27%±4.00%) and 3D-T2w-TSE (4.41%±2.66%) across the VOIs. The largest number of features with excellent test-retest repeatability was found in the R parotid gland VOI for both sequences (3D-T1w-TSE: 10.75%, 10/93; 3D-T2w-TSE: 7.53%, 7/93), while the smallest number was found in the tongue VOI, where none of the features was excellent.
Discussion
This study prospectively examined the multi-scan test-retest repeatability of MRI radiomics features in HN tissues for a 3D-T1w-TSE sequence and a 3D-T2w-TSE sequence in a cohort of healthy volunteers, with a view to potential applications of MRI radiomics in HNC MRgRT. To the best of our knowledge, this is the first study to rigorously examine the test-retest repeatability of MRI radiomics features in the head and neck region.
The test-retest repeatability of MRI radiomics features varied with tissue and pulse sequence. Only a small percentage of the radiomics features showed excellent test-retest repeatability (ICC>0.9), in line with other studies [5,6]. However, the proportion of features with excellent ICC in this study was even smaller than in those studies [5,6], which were conducted in different anatomies. This could result from the higher heterogeneity of HN tissues and from the more stringent ICC classification based on the 95% confidence interval. Because test-retest repeatability was tissue-dependent, different features might need to be chosen when building tissue-specific radiomics models for clinical use.
The higher ICC values for the 3D-T2w-TSE sequence compared with 3D-T1w-TSE could result from the wider voxel intensity range found in 3D-T2w-TSE. However, the number of features showing excellent ICC was not necessarily larger for 3D-T2w-TSE in every VOI.
This study should support a better understanding of radiomics feature reliability and the selection of reliable radiomics features for modeling in future HNC MRgRT applications.
Acknowledgements
No acknowledgement found.
References
[1] Chow LQM (2020) Head and Neck Cancer. N Engl J Med 382:60-72
[2] Tanadini-Lang S, Balermpas P, Guckenberger M et al (2020) Radiomic biomarkers for head and neck squamous cell carcinoma. Strahlenther Onkol 196:868-878
[3] Xue C, Yuan J, Lo GG et al (2021) Radiomics feature reliability assessed by intraclass correlation coefficient: a systematic review. Quant Imaging Med Surg 11:4431-4460
[4] Koo TK, Li MY (2016) A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research. J Chiropr Med 15:155-163
[5] Xue C, Yuan J, Poon DM et al (2021) Reliability of MRI radiomics features in MR-guided radiotherapy for prostate cancer: Repeatability, reproducibility, and within-subject agreement. Med Phys. doi:10.1002/mp.15232
[6] Shiri I, Hajianfar G, Sohrabi A et al (2020) Repeatability of radiomic features in magnetic resonance imaging of glioblastoma: Test-retest and image registration analyses. Med Phys 47:4265-4280