3310

Super resolution permits fast, low resolution bSSFP imaging of the temporal bone…but do radiologists agree on image quality?
Sarah Reeve1,2, Alessandro Guida2,3, Chris Bowen1,2,3, James Rioux1,2,3, David Volders3,4, Jens Heidenreich3,4, and Steven Beyea1,3,5,6
1Physics and Atmospheric Science, Dalhousie University, Halifax, NS, Canada, 2Biomedical Translational Imaging Centre, QEII Health Sciences Centre, Halifax, NS, Canada, 3Diagnostic Radiology, Dalhousie University, Halifax, NS, Canada, 4Diagnostic Imaging, QEII Health Sciences Centre, Halifax, NS, Canada, 5Biomedical Translational Imaging Centre, IWK Health Centre, Halifax, NS, Canada, 6School of Biomedical Engineering, Dalhousie University, Halifax, NS, Canada

Synopsis

Keywords: Machine Learning/Artificial Intelligence, Head & Neck/ENT, Super Resolution

Low resolution (LR) balanced steady-state free precession (bSSFP) acquisitions confer decreased acquisition times, reduced patient motion and heating, and increased artifact tolerance due to a decrease in TR. The application of a pre-trained super resolution network to LR bSSFP images of the temporal bone allows these advantages to be realized without significantly degrading image quality. In the absence of a matched high resolution image, quality was judged by two radiologists. However, the radiologists’ ratings were not in agreement, highlighting the fact that there is no single definition of task-specific image quality, which must be considered when super resolution is performed.

Introduction

In many studies that implement super resolution (SR) models, a low resolution (LR) image is synthesized from a high resolution (HR) acquisition (e.g. 1, 2) to generate a paired set of images that is appropriate for use in training and testing a model. However, synthesis of a LR image does not allow for the advantages that a true LR acquisition may provide such as reduced patient heating, less patient motion, and reduced scan times. Balanced steady-state free precession (bSSFP) is a sequence which has the additional benefit of reduced off-resonance induced banding artifacts when performed at LR, due to a reduction in TR. Application of a SR pipeline to LR bSSFP images of the temporal bone could, therefore, enable fast, artifact-tolerant acquisitions of this complex space, without forfeiting the high resolution required to visualize the fine structures of the inner ear.

The consequence of this approach, however, is that there is no HR “reference” image against which to compare the model output. Although paired HR and synthesized LR datasets make it easy to quantify model performance via full-reference image quality metrics (IQMs), studies have shown that IQMs fail to correlate strongly with the gold standard of radiologist opinion of image quality or diagnostic utility3. Instead, we examine radiologists’ opinions of images resulting from a super resolution pipeline in comparison to standard interpolation techniques. To do so, radiologists rated their ability to visualize specific structures of the inner ear in images generated from an “originally” protocoled resolution as well as those from a LR acquisition, both brought to HR with the super resolution pipeline and with standard interpolation techniques.

Methods

Informed written consent was obtained from 10 healthy volunteers (6M, 4F, average age 36 ± 11 years) who were scanned under an NSH REB-approved protocol.

Imaging was performed on the head-only 0.5T point-of-care system from Synaptive Medical4. The “original” resolution (OR) acquisitions (FOV 18cm, isotropic resolution 0.6mm, 164 slices, RBW 70kHz, TR/TE 7.0/3.4ms, flip angle 60°, scan time 4min 19s) and LR acquisitions (FOV 18cm, isotropic resolution 0.7mm, 140 slices, RBW 70kHz, TR/TE 6.3/3.1ms, flip angle 60°, scan time 3min 11s) were brought to an isotropic HR of 0.3mm using standard interpolation techniques as well as with a SR pipeline, resulting in ORInt., ORSr, LRInt., and LRSrInt. image sets (Figure 1). The SR images were generated with a publicly available model that was pre-trained on natural images5, and the weights used were the result of a weighted interpolation between two of the sets provided.
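The resolution bookkeeping behind these four image sets can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes a square in-plane matrix derived from the stated FOV and voxel sizes, uses the factor-of-2 limit of the SR network described in the Figure 1 caption, and shows a generic linear blend of two weight sets. The blending coefficient `alpha`, the layer name, and the function names are hypothetical, since the abstract does not state which weight sets were combined or in what proportion.

```python
from typing import Dict, List

FOV_MM = 180.0     # 18 cm field of view, from the protocol
TARGET_MM = 0.3    # final isotropic voxel size
SR_FACTOR = 2      # upscaling factor of the SR network (Figure 1 caption)


def matrix_size(fov_mm: float, voxel_mm: float) -> int:
    """Matrix size along one axis implied by a field of view and voxel size."""
    return round(fov_mm / voxel_mm)


def residual_interp_factor(voxel_mm: float) -> float:
    """Matrix growth still required after SR to reach the target voxel size."""
    return (voxel_mm / SR_FACTOR) / TARGET_MM


def blend_weights(
    w_a: Dict[str, List[float]], w_b: Dict[str, List[float]], alpha: float
) -> Dict[str, List[float]]:
    """Layer-by-layer linear blend: alpha * w_a + (1 - alpha) * w_b.

    alpha is a hypothetical parameter; the abstract does not report its value.
    """
    if w_a.keys() != w_b.keys():
        raise ValueError("weight sets must describe the same layers")
    return {
        name: [alpha * a + (1.0 - alpha) * b for a, b in zip(w_a[name], w_b[name])]
        for name in w_a
    }


if __name__ == "__main__":
    for label, voxel in (("OR", 0.6), ("LR", 0.7)):
        print(
            label,
            "acquired matrix:", matrix_size(FOV_MM, voxel),
            "after SR:", matrix_size(FOV_MM, voxel / SR_FACTOR),
            "residual interpolation:", round(residual_interp_factor(voxel), 3),
        )
```

Under these assumptions the OR acquisition reaches the 0.3mm grid with the SR step alone, whereas the LR acquisition lands at 0.35mm after SR and needs a further interpolation factor of about 1.17, which is why the LR super resolution images carry the additional "Int." label.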

Each participant’s dataset was anonymized and shown to two board-certified neuroradiologists in a randomized manner. Raters were asked to independently rate their ability to visualize the superior semicircular canal (SSC), facial nerve, cochlear nerve, vestibule, and cochlea, as well as their impression of overall image quality, on a Likert scale ranging from 1 to 5. Cohen’s kappa was calculated to quantify inter-rater reliability, and Wilcoxon signed-rank tests were used to compare ratings between the ORInt. and LRSrInt., ORInt. and ORSr, and LRInt. and LRSrInt. image sets for each structure examined. A Bonferroni-corrected P value of < 0.02 was considered significant.
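The paired comparison and the corrected threshold described above can be illustrated with a small, dependency-free sketch. It is not the authors' analysis code. It assumes a family-wise level of 0.05 divided across the three comparisons (the abstract reports only the rounded threshold of 0.02), drops zero differences, assigns mean ranks to ties, and obtains a two-sided exact P value by enumerating every sign assignment, which is practical for 10 participants. In practice a library routine such as `scipy.stats.wilcoxon` would normally be used, and its handling of ties and zeros may differ from this sketch.

```python
from itertools import product
from typing import List, Sequence, Tuple

ALPHA = 0.05        # assumed family-wise significance level
N_COMPARISONS = 3   # OR-Int vs LR-SR-Int, OR-Int vs OR-SR, LR-Int vs LR-SR-Int
THRESHOLD = ALPHA / N_COMPARISONS  # about 0.0167, reported as 0.02


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values ascending from 1, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def wilcoxon_signed_rank(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (W+, two-sided exact P value) for paired ratings x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    if not diffs:
        return 0.0, 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    centre = sum(ranks) / 2
    observed = abs(w_plus - centre)
    extreme = 0
    # Under the null hypothesis every sign pattern is equally likely.
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - centre) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** len(ranks)


if __name__ == "__main__":
    # Hypothetical Likert ratings for 10 participants, for illustration only.
    sr = [4, 5, 4, 5, 4, 5, 4, 5, 4, 5]
    interp = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]
    w, p = wilcoxon_signed_rank(sr, interp)
    print(f"W+ = {w}, P = {p:.4f}, significant: {p < THRESHOLD}")
```

The ratings in the usage example are invented to exercise the function and do not correspond to any data in this study.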

Results and Discussion

A representative dataset from one participant is shown in Figure 2. Inter-rater reliability tests revealed poor to fair agreement for visualizing all structures in all image types (Figure 3). The ratings from each rater are therefore considered as two unique case studies. The results of analysis on ratings from both raters are shown in Figure 4.
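For readers unfamiliar with the agreement statistic used here, the following is a minimal sketch of Cohen's kappa for two raters. It assumes the unweighted form of the statistic; the abstract does not state whether a weighted variant was used for the ordinal Likert scores, and the example ratings are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_1: Sequence[Hashable], rater_2: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    if len(rater_1) != len(rater_2) or len(rater_1) == 0:
        raise ValueError("both raters must score the same, non-empty set of items")
    n = len(rater_1)
    # Observed agreement: fraction of items given identical scores.
    p_o = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    # Chance agreement from each rater's marginal score frequencies.
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    p_e = sum(counts_1[k] * counts_2[k] for k in counts_1) / n ** 2
    if p_e == 1.0:
        return 1.0  # both raters used one identical score throughout
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical scores from two raters on four items.
    print(cohens_kappa([1, 1, 2, 2], [1, 1, 2, 1]))
```

Because kappa subtracts chance agreement, two raters who each cluster their scores in a narrow range can agree on many items and still obtain a low kappa.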

Comparisons of rater 1’s ratings yielded no significant differences. In contrast, comparison of rater 2’s ratings yielded significant preference for LRSrInt. images over ORInt. images when visualizing the cochlea, vestibule, and for overall image quality. When comparing methods used to bring acquisitions to HR, rater 2 preferred ORSr images over ORInt. images for visualizing the cochlea, vestibule, and for overall image quality. Similarly, LRSrInt. images were preferred over LRInt. images for visualizing the SSC, facial nerve, cochlea, and vestibule.

The finding that the raters did not agree is itself an interesting one that often goes overlooked (e.g. 6). Post-evaluation interviews with the raters revealed that the inter-rater variation arose from differing noise characteristics between pipelines. This indicates that inter-rater variation, as much as IQM calculations, can be a significant challenge when evaluating image quality.

Neither rater rated the LRSrInt. images significantly lower than the ORInt. images for any structure examined, nor for overall image quality, permitting a reduction in scan time of 1min 8s. Notably, the LR protocol maintained the RBW of the original protocol so as to maintain comparable SNR. A further reduction in TR, and therefore scan time, would likely be feasible if this were not a concern.
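The scan time saving quoted above follows directly from the two protocol durations given in the Methods. A short worked check, using only those reported values (the variable names are illustrative):

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a min/s duration to seconds."""
    return 60 * minutes + seconds


OR_SCAN_S = to_seconds(4, 19)   # original resolution protocol, 4min 19s
LR_SCAN_S = to_seconds(3, 11)   # low resolution protocol, 3min 11s

saving_s = OR_SCAN_S - LR_SCAN_S          # absolute saving in seconds
saving_fraction = saving_s / OR_SCAN_S    # saving relative to the OR scan
tr_ratio = 6.3 / 7.0                      # LR TR relative to OR TR (ms / ms)

if __name__ == "__main__":
    minutes, seconds = divmod(saving_s, 60)
    print(f"Saving: {minutes}min {seconds}s ({saving_fraction:.0%} of the OR scan time)")
    print(f"LR TR is {tr_ratio:.0%} of the OR TR")
```

The saving of 68 s is about 26% of the original scan time, while the TR itself is only 10% shorter, so most of the saving comes from acquiring a smaller matrix and fewer slices.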

Conclusions

The use of a SR pipeline permitted the advantages of acquiring a bSSFP image at LR to be exploited, without reducing overall image quality, as rated by two board-certified neuroradiologists. Importantly, however, radiologists’ ratings were not in agreement, demonstrating that while IQMs are not an ideal method for assessing image quality, radiologists’ opinions can show similar disadvantages.

Acknowledgements

Funding for this research was provided by the Natural Sciences and Engineering Research Council (Discovery Grant), INOVAIT (with matching funding provided by Synaptive Medical), and the David Fraser Radiology Research Foundation.

References

1. de Leeuw den Bouter ML, Ippolito G, O’Reilly TPA, et al. Deep learning-based single image super-resolution for low-field MR brain images. Scientific Reports. 2022;12(1):6362.

2. Chen Y, Xie Y, Zhou Z, et al. Brain MRI super resolution using 3D deep densely connected neural networks. Proceedings - International Symposium on Biomedical Imaging. April 2018, 739-742.

3. Mason A, Rioux J, Clarke SE, et al. Comparison of Objective Image Quality Metrics to Expert Radiologists’ Scoring of Diagnostic Quality of MR Images. IEEE Transactions on Medical Imaging. 2020;39(4):1064–1072.

4. Stainsby JA, Bindseil GA, Connell IRO, et al. Imaging at 0.5 T with high-performance system components. ISMRM 27th Annual Meeting and Exhibition, 2019, Montreal, Quebec, Canada.

5. Cardinale F, et al. Image Super Resolution. 2018. http://github.com/idealo/image-super-resolution.

6. Rudie JD, Gleason T, Barkovich MJ, et al. Clinical Assessment of Deep Learning–based Super-Resolution for 3D Volumetric Brain MRI. Radiology: Artificial Intelligence. 2022;4(2).

Figures

Figure 1: A flow chart indicating how one experimental dataset was generated from 3 balanced steady-state free precession images. Both the low and original resolution images were brought to high resolution with standard interpolation methods and via a super resolution (SR) pipeline. The SR network only permitted an increase in resolution by a factor of 2, so the LR acquisition required additional interpolation after the SR pipeline to bring it onto the same image matrix size as the other images.

Figure 2: An example dataset. The axial images (left) include a zoomed-in view of the right lateral semicircular canal, cochlea, cochlear nerve, and facial nerve. The coronal images (center) include a zoomed-in view of the right cochlea. The sagittal images (right) include a zoomed-in view of the internal auditory canal where cross-sections of the facial, cochlear, and superior and inferior vestibular nerves are visible.

Figure 3: The results of Cohen’s kappa tests for inter-rater reliability reveal poor to fair agreement for each structure examined in each type of image.

Figure 4: Barplots showing the results of Wilcoxon signed-rank tests that compare two radiologists’ ability to visualize structures of the inner ear, as well as overall image quality. Cohen’s kappa tests revealed poor to fair agreement between raters, so the results are considered two independent case studies. Error bars depict 95% confidence intervals. * indicates 0.003 < P ≤ 0.02, ** indicates 0.0003 < P ≤ 0.003, and *** indicates 0.00003 < P ≤ 0.0003.

Proc. Intl. Soc. Mag. Reson. Med. 31 (2023) 3310
DOI: https://doi.org/10.58530/2023/3310