2099

Large-scale image quality assessment using deep learning: impact of physiological factors and acquisition settings in whole-heart coronary MRA

Aurélien Maillot¹,², John Heerfordt¹,², Robin Demesmaeker¹,³,⁴, Jonas Richiardi¹, Dimitri Van De Ville³,⁵, Tobias Kober¹,²,⁶, Juerg Schwitter⁷, Matthias Stuber²,⁸, and Davide Piccini¹,²,⁶

¹Advanced Clinical Imaging Technology, Siemens Healthcare AG, Lausanne, Switzerland, ²Department of Radiology, University Hospital (CHUV) and University of Lausanne (UNIL), Lausanne, Switzerland, ³Institute of Bioengineering/Center for Neuroprosthetics, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland, ⁴Institute of Electrical Engineering, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland, ⁵Department of Radiology and Medical Informatics, University Hospital of Geneva (HUG), Geneva, Switzerland, ⁶LTS5, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland, ⁷Division of Cardiology and Cardiac MR Center, University Hospital of Lausanne (CHUV), Lausanne, Switzerland, ⁸Center for Biomedical Imaging (CIBM), Lausanne, Switzerland

Synopsis

Understanding which factors affect image quality is essential for performing high-quality MRI acquisitions. Using a deep convolutional neural network, we performed automated Image Quality Assessment of 1102 heterogeneous whole-heart coronary MRA volumes acquired with a respiratory self-navigated ECG-triggered bSSFP sequence. A non-parametric multivariate rank regression was then used to predict image quality from the available physiological and acquisition parameters. Relatively good agreement was found between the Image Quality Scores (IQSs) estimated by the neural network and the fitted IQSs from the regression model (Spearman correlation 0.57). Gender, age, BMI, average RR interval, voxel size, trigger time and flip angle were found to be significant predictors of the IQS.

Introduction

Image Quality Assessment (IQA) is crucial in medical imaging, since image quality directly impacts the reliability of both diagnostic reading and post-processing analyses. Understanding the main factors that may affect image quality is therefore of major importance. Variability in image quality should be investigated in a large cohort in order to minimize bias. However, visual IQA becomes extremely tedious and time-consuming as the number of datasets to review increases. Moreover, visual IQA can, like most repetitive tasks, be strongly affected by the expert's concentration, which varies with tiredness and external conditions¹. To overcome some of these limitations, a deep Convolutional Neural Network that performs automated IQA (IQA-CNN) has recently been proposed². This network was trained and tested on 3D respiratory self-navigated whole-heart coronary MRA images. In our institution, the very same acquisition protocol has been used extensively in a clinical setting, with varying acquisition parameters, on hundreds of patients exhibiting variable physiological attributes³,⁴. In this work, we apply the IQA-CNN to this large-scale database and quantitatively investigate whether physiological parameters or acquisition settings can explain image quality.

Methods

The large-scale database consisted of N=1102 3D whole-heart patient datasets acquired between 2015 and 2018 on a 1.5T clinical MRI scanner (MAGNETOM Aera, Siemens Healthcare, Erlangen, Germany) using a prototype ECG-triggered respiratory self-navigated 3D radial bSSFP sequence⁵. All datasets were graded with the IQA-CNN, which outputs a single Image Quality Score (IQS) between 0 (non-diagnostic) and 4 (excellent) for every volume, as described in detail previously². Additionally, physiological attributes that varied between subjects and acquisition settings that varied between scans were identified. We then investigated the univariate effects of these parameters (physiological and acquisition) on the IQS, before using them as predictor variables in a non-parametric multivariate linear rank regression⁶ in which the IQS was the response variable. Exploratory analysis of the bivariate relationships between the IQS and individual parameters motivated the choice of a linear model. In particular, a rank regression model was chosen because it reflects the ordinal nature of the IQS. Regression diagnostics were then performed by inspecting the residuals (residual plot and Quantile-Quantile (QQ) plot of the studentized residuals). Finally, Spearman’s rank correlation between the IQSs from the IQA-CNN and the fitted IQSs from the rank regression model was computed in order to examine how well the model fitted the data.
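As an illustration of the regression step, the following Python sketch fits a rank-based linear model by minimizing Jaeckel's dispersion with Wilcoxon scores and then computes Spearman's correlation between observed and fitted responses. It is not the code used in this work (the analysis cites the R package Rfit⁶); the function names, the Nelder-Mead optimizer, the median-of-residuals intercept and the synthetic data in the usage note are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import rankdata, spearmanr


def jaeckel_dispersion(beta, X, y):
    """Jaeckel's dispersion of the residuals y - X @ beta (Wilcoxon scores)."""
    residuals = y - X @ beta
    n = residuals.size
    # Wilcoxon scores: a linear function of the residual ranks, centred on zero.
    scores = np.sqrt(12.0) * (rankdata(residuals) / (n + 1) - 0.5)
    return float(np.sum(scores * residuals))


def fit_rank_regression(X, y):
    """Return (intercept, slopes) of a rank-based linear fit (illustrative)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Least-squares slopes serve as the starting point for the optimizer.
    X_centred = X - X.mean(axis=0)
    start, *_ = np.linalg.lstsq(X_centred, y - y.mean(), rcond=None)
    result = minimize(jaeckel_dispersion, start, args=(X, y), method="Nelder-Mead")
    slopes = result.x
    # The dispersion does not depend on the intercept, so it is estimated
    # separately, here as the median of the residuals (an assumed choice).
    intercept = float(np.median(y - X @ slopes))
    return intercept, slopes
```

A possible usage on synthetic data (the predictors and response below are hypothetical and unrelated to the cohort): draw `X` with `np.random.default_rng(0).normal(size=(300, 2))`, build `y` as a linear combination plus noise, call `intercept, slopes = fit_rank_regression(X, y)`, and evaluate the fit with `rho, p_value = spearmanr(y, intercept + X @ slopes)`.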

Results

The IQS distribution across the full cohort is depicted in Fig.1 together with examples of representative transversal, sagittal and coronal slices from volumes with different scores. The distribution is non-Gaussian with an average IQS of 2.12 ± 1.04. The mean values and standard deviations of the physiological parameters and main adjustable acquisition settings that we investigated are listed in Table 1. The IQS distributions with respect to the different parameters are depicted as scatter plots in Fig.2. Bivariate relationships, when observed, tend to be linear (cf. BMI and average RR interval). The regression diagnostics did not reveal any issues (homoscedastic distribution of the residuals with a mean of zero). The predictor coefficients with their standard errors, t-values and p-values are reported in Table 2. Gender, age, BMI, average RR interval, voxel size, trigger time and flip angle were statistically significant predictor variables, with respective t-values of -6.02, -12.11, -11.13, 4.69, 5.23, -3.89 and 4.36 (cf. Table 2). The IQSs predicted by the model from the physiological and acquisition parameters are plotted against the real IQSs in the scatter plot in Fig.3. The Spearman rank correlation between the actual IQS and the fitted IQS was significant at 0.57 (p-value<2.2e-16), indicating relatively good agreement between the real and predicted IQS.
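The residual checks mentioned above (zero mean, homoscedasticity, QQ plot) could be summarized numerically as in the sketch below. It is only an illustration: the function name, the quantile binning of the fitted values and the simple standardization (which ignores leverages and is therefore not a true studentization) are assumptions, not the procedure used to produce the reported diagnostics.

```python
import numpy as np
from scipy import stats


def residual_diagnostics(fitted, residuals, n_bins=4):
    """Summarize residual centring, spread across fitted values, and QQ fit."""
    fitted = np.asarray(fitted, dtype=float)
    residuals = np.asarray(residuals, dtype=float)

    # Crude standardization; true studentized residuals also need leverages.
    z = (residuals - residuals.mean()) / residuals.std(ddof=1)

    # Standard deviation of the residuals within quantile bins of the fitted
    # values; similar values across bins are consistent with homoscedasticity.
    edges = np.quantile(fitted, np.linspace(0.0, 1.0, n_bins + 1))
    bin_index = np.searchsorted(edges, fitted, side="right") - 1
    bin_index = np.clip(bin_index, 0, n_bins - 1)
    bin_sd = np.array(
        [residuals[bin_index == b].std(ddof=1) for b in range(n_bins)]
    )

    # Normal QQ coordinates; r is the correlation of the points with the QQ line.
    (theoretical_q, ordered_z), (slope, intercept, r) = stats.probplot(z, dist="norm")

    return {"mean": float(residuals.mean()), "bin_sd": bin_sd, "qq_r": float(r)}
```

A mean close to zero, comparable `bin_sd` values and a `qq_r` close to 1 would be in line with the visual inspection of the residual and QQ plots; the thresholds for "close" are a judgment call and are not specified here.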

Discussion and Conclusion

We performed a multivariate rank-based analysis on a large heterogeneous dataset in order to predict whole-heart MRI image quality from available physiological parameters and acquisition settings. The IQSs fitted by our model agreed relatively well with those estimated by the IQA-CNN. All physiological parameters and three of the acquisition parameters were found to be significant predictors. The negative correlation between image quality and BMI (the strongest predictor) might be explained by an increased distance between the coil and the heart and by the intrinsic difficulty for radial sequences to achieve good fat suppression due to the repeated sampling of low frequencies in k-space. It is worth noting that important parameters that may affect the IQS, such as patient motion or respiratory patterns, were excluded because of the limited information available in the DICOM headers. Including them might increase the predictive power of the model. Finally, it remains to be investigated how knowledge of the factors that affect image quality can be actively used to tailor future acquisitions.

Acknowledgements

No acknowledgement found.

References

1. S. Danziger et al. “Extraneous factors in judicial decisions,” in Proceedings of the National Academy of Sciences of the United States of America, 2011.

2. R. Demesmaeker et al. “Deep Learning for Automated Medical Image Quality Assessment: Proof of Concept in Whole-Heart Magnetic Resonance Imaging,” in Proceedings of the Joint Annual ISMRM-ESMRMB Meeting, 2018.

3. D. Piccini et al. “Respiratory Self-navigated Postcontrast Whole-Heart Coronary MR Angiography: Initial Experience in Patients,” Radiology, vol. 270, no. 2, pp. 378–386, 2014.

4. P. Monney et al. “Single centre experience of the application of self navigated 3D whole heart cardiovascular magnetic resonance for the assessment of cardiac anatomy in congenital heart disease,” J. Cardiovasc. Magn. Reson., vol. 17, p. 55, 2015.

5. D. Piccini, A. Littmann, S. Nielles-Vallespin, and M. O. Zenge, “Respiratory self-navigation for whole-heart bright-blood coronary MRI: Methods for robust isolation and automatic segmentation of the blood pool,” Magn. Reson. Med., vol. 68, no. 2, pp. 571–579, 2012.

6. D. Kloke and J. McKean, “Rfit: Rank-based Estimation for Linear Models,” The R Journal, 2012.

Figures

Figure 1: Image quality score distribution of the full dataset and examples of slices in transversal, sagittal and coronal orientations from volumes with different scores.

Table 1: Mean values and standard deviations of the analyzed physiological parameters and acquisition settings. BMI stands for Body Mass Index and is defined as weight/height². The average RR interval is the mean interval between two successive heartbeats and is calculated prior to the scan. The RR variability index is an indirect measure of both the variability of the RR interval during the scan and the number of missed heartbeats (non-detected R peak or trigger time falling outside the cardiac cycle). It is calculated as (Scan time − Segments × Average RR interval) / Scan time. The acquisition window length is defined as Number of readouts per segment × Repetition time.
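The derived quantities defined in the Table 1 caption can be written out as a short Python sketch. The function names and units (kilograms, metres, seconds, milliseconds) are assumptions chosen for illustration, and the RR variability index is read here as (scan time − segments × average RR interval) / scan time, which is the only grouping that yields a dimensionless index.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI = weight / height^2 (assumed units: kg and m)."""
    return weight_kg / height_m ** 2


def rr_variability_index(
    scan_time_s: float, n_segments: int, average_rr_s: float
) -> float:
    """Fraction of the scan time not explained by n_segments nominal heartbeats.

    Zero when the scan lasted exactly n_segments * average_rr_s; larger values
    reflect RR variability during the scan and/or missed heartbeats.
    """
    return (scan_time_s - n_segments * average_rr_s) / scan_time_s


def acquisition_window_ms(
    readouts_per_segment: int, repetition_time_ms: float
) -> float:
    """Acquisition window length = readouts per segment * repetition time."""
    return readouts_per_segment * repetition_time_ms
```

For example, under these assumptions a hypothetical scan of 600 s with 500 segments and an average RR interval of 1.0 s would have an RR variability index of 100/600, i.e. roughly 0.17.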

Figure 2: Scatter plots of the bivariate relationships between individual parameters and the Image Quality Score (IQS). Blue dots represent men whereas orange dots represent women. Bivariate relationships are mainly linear as seen for the BMI and the Average RR interval.

Table 2: Coefficient summary for the predictor variables of the rank regression analysis. Gender, age, BMI, average RR interval, voxel size, trigger time and flip angle are statistically significant predictor variables.

Figure 3: Scatter plot of the real Image Quality Scores (IQSs) as estimated by the Image Quality Assessment Convolutional Neural Network (IQA-CNN) with respect to the fitted IQSs from the rank model. The closer the points lie to the diagonal, the better the rank model predicted the IQSs. The Spearman rank correlation between the actual IQS and the fitted IQS was significant at 0.57 (p-value < 2.2e-16), indicating relatively good agreement between the real and predicted IQS.

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)
