0545

How frequently should we use a phantom for QA? A preliminary assessment

Kalina V Jordanova¹, Stephen E Ogier¹, Stephen E Russek¹, Cassandra M Stoffer¹, Guido Buonincontri², Mathias Nittka², and Kathryn E Keenan¹
¹NIST: National Institute of Standards and Technology, Boulder, CO, United States, ²Siemens Healthcare GmbH, Erlangen, Germany

Synopsis

Keywords: Phantoms, Precision & Accuracy, Quality Assurance, MR Fingerprinting, Quantitative Imaging, Relaxometry, Measurement & Correction

Motivation: Currently, we do not know how frequently quality assurance (QA) should be performed on an MRI scanner to detect changes that impact quantitative measurements.

Goal(s): Our goal is to determine the frequency of QA measurements needed during the course of a quantitative in vivo study to have confidence in the in vivo measurements.

Approach: Phantom quantitative QA measurements were made immediately before or after the in vivo measurements over the duration of a repeatability study.

Results: All quantitative phantom measurements had variation well below 10% over the course of the 99-day study.

Impact: These preliminary results suggest that, for measurements using magnetic resonance fingerprinting on this system, QA using phantom measurements is only necessary at the start and end of an in vivo study when the study duration is less than approximately 3 months.

Introduction

Standardized reference objects, or phantoms, help assess MRI protocol accuracy. Studies have shown the value of phantom use following scanner upgrades¹,² and in translating quantitative MRI methods to the clinic³. Despite this, it remains unclear how often phantom measurements should be conducted during in vivo studies.

This work aims to determine how frequently phantom measurements should be conducted during a single-site in vivo study. Repeat in vivo T1 and T2 measurements were acquired using magnetic resonance fingerprinting (MRF) on 20 participants over a 99-day period, and a phantom was scanned twice immediately before or after an in vivo measurement. Phantom test-retest repeatability was evaluated along with longitudinal stability, while considering temperature variation.

Methods

An MRF sequence was used to acquire T1 and T2 measurements on a commercially available breast phantom⁴ (CaliberMRI, Figure 1). The phantom contains aqueous solutions of polyvinylpyrrolidone (PVP), with PVP concentrations ranging from 0% to 40%. Some solutions appear in more than one vial. Additionally, a fibroglandular (FBG) mimic material serves as the background fill of the phantom.

All scans were conducted on a 3T Siemens scanner (MAGNETOM PrismaFit, Siemens Healthcare, Erlangen, Germany) using the body transmit coil and a 20-channel receive head/neck coil. MRF data were acquired using a 2D spiral acquisition research sequence with 1500 measurements, 12.1–15.0 ms repetition times, and 0–74° flip angles. Seven slices were acquired with a scan time of 20 s each. The dictionary range was 10–4500 ms for T1 and 2–3000 ms for T2, with increasing step sizes as T1 or T2 increased. The sequence was optimized for T1>400 ms, so voxels with a lower T1 were masked in the quantitative images.

Phantom regions of interest (ROIs) were identified by selecting a central region of each sample vial. FBG ROIs were defined as four regions of the background fill, each the same size as the vial ROIs.
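As an illustration of the ROI step, the following minimal sketch extracts mean values from a central vial ROI and a same-size background ROI. The circular ROI shape, the pixel radii, the helper names `circular_roi_mask` and `roi_mean`, and the synthetic T1 map are all assumptions made for this example; the abstract does not specify how the ROIs were drawn.

```python
import numpy as np

def circular_roi_mask(shape, center, radius):
    """Boolean mask that is True within `radius` pixels of `center` (row, col)."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= radius ** 2

def roi_mean(param_map, center, radius):
    """Mean of a quantitative map inside a circular ROI, ignoring NaN (masked) voxels."""
    mask = circular_roi_mask(param_map.shape, center, radius)
    return float(np.nanmean(param_map[mask]))

# Synthetic 64x64 T1 map in ms (hypothetical values): background fill plus one vial.
t1_map = np.full((64, 64), 1300.0)
vial = circular_roi_mask(t1_map.shape, center=(32, 32), radius=8)
t1_map[vial] = 900.0

# Central region of the vial, and a background ROI of the same size.
vial_t1 = roi_mean(t1_map, center=(32, 32), radius=4)
fbg_t1 = roi_mean(t1_map, center=(8, 8), radius=4)
```

Using `np.nanmean` lets voxels masked for low T1 be stored as NaN without biasing the ROI mean, which is one possible way to handle the masking described above.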

To assess MRF sequence repeatability, test-retest measurements were taken. Phantom test-retest agreement was assessed using Bland-Altman plots. Test-retest variation was calculated for each material and relaxation parameter as
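$$\mathrm{Var}=\frac{1}{N}\sum_{i=1}^{N}\frac{\left|\mathrm{test}_i-\mathrm{retest}_i\right|}{\left(\mathrm{test}_i+\mathrm{retest}_i\right)/2}\times 100$$
where $N$ is the number of test-retest measurement pairs and $\mathrm{test}_i$ and $\mathrm{retest}_i$ are the test and retest measurements of the $i$-th pair.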
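The variation formula above can be sketched in a few lines of Python. The function name `mean_percent_variation` and the example T1 values are hypothetical and chosen only to illustrate the arithmetic.

```python
import numpy as np

def mean_percent_variation(test, retest):
    """Mean absolute test-retest difference, relative to each pair's mean, in percent."""
    test = np.asarray(test, dtype=float)
    retest = np.asarray(retest, dtype=float)
    pair_mean = (test + retest) / 2.0
    return float(np.mean(np.abs(test - retest) / pair_mean) * 100.0)

# Hypothetical T1 values (ms) for one material across three scan days.
t1_test = [1000.0, 1010.0, 990.0]
t1_retest = [1010.0, 1000.0, 990.0]
variation = mean_percent_variation(t1_test, t1_retest)  # about 0.66 %
```

Because each difference is normalized by its own pair mean, materials with very different relaxation times can be compared on the same percent scale.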

To examine measurement stability over time, the mean measurement for each {day, repeat, material} was expressed as a percent difference from the overall mean measurement for that material. The daily scan room temperature was recorded. To assess measurement stability over temperature, an aliquot of each phantom material was measured using an NMR system in a temperature-controlled environment⁵ to determine its T1 and T2 temperature dependence. Using this temperature dependence, we calculated the expected range of relaxation values for the observed temperatures and compared that to the measured relaxation value range.
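A minimal sketch of the longitudinal percent-difference calculation follows. It assumes the overall mean for a material is the mean of its per-{day, repeat} ROI means; the record layout, the function name `percent_diff_from_overall_mean`, and the example values are hypothetical.

```python
from collections import defaultdict

import numpy as np

def percent_diff_from_overall_mean(records):
    """records: iterable of (day, repeat, material, mean_value) tuples.

    Returns {(day, repeat, material): percent difference from the overall
    mean of that material}.
    """
    by_material = defaultdict(list)
    for _, _, material, value in records:
        by_material[material].append(value)
    overall = {m: float(np.mean(v)) for m, v in by_material.items()}
    return {
        (day, repeat, material): (value - overall[material]) / overall[material] * 100.0
        for day, repeat, material, value in records
    }

# Hypothetical FBG T1 means (ms) on two scan days, two repeats per day.
records = [
    (1, 1, "FBG", 1290.0),
    (1, 2, "FBG", 1310.0),
    (50, 1, "FBG", 1300.0),
    (50, 2, "FBG", 1300.0),
]
diffs = percent_diff_from_overall_mean(records)
within_10_percent = all(abs(d) <= 10.0 for d in diffs.values())
```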

Results

Bland-Altman plots for each ROI in the test-retest measurements (Figure 2) show that most measurements are within the 95% confidence interval (CI) of the mean, taken as the mean ± 1.96 standard deviations. For the 18% and 25% PVP measurements that fell outside the 95% CI, the test-retest percent difference was below 5%. The FBG measurements outside the 95% CI had percent differences below 10%. Test-retest variation (Figure 3) was below 1% for all measurements except FBG T2, which had 2.44% variation.
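The Bland-Altman quantities referred to above can be sketched as follows. The sign convention (test minus retest), the use of the sample standard deviation, the function name `bland_altman_percent`, and the example values are assumptions of this sketch rather than details stated in the abstract.

```python
import numpy as np

def bland_altman_percent(test, retest):
    """Return pair means, percent differences, mean difference, limits, and outlier flags."""
    test = np.asarray(test, dtype=float)
    retest = np.asarray(retest, dtype=float)
    pair_mean = (test + retest) / 2.0
    pct_diff = (test - retest) / pair_mean * 100.0
    bias = float(np.mean(pct_diff))
    # Sample standard deviation (ddof=1) is an assumption of this sketch.
    half_width = 1.96 * float(np.std(pct_diff, ddof=1))
    limits = (bias - half_width, bias + half_width)
    outside = np.abs(pct_diff - bias) > half_width
    return pair_mean, pct_diff, bias, limits, outside

# Hypothetical T2 values (ms) for four test-retest pairs.
t2_test = [100.0, 102.0, 98.0, 100.0]
t2_retest = [100.0, 98.0, 102.0, 100.0]
pair_mean, pct_diff, bias, limits, outside = bland_altman_percent(t2_test, t2_retest)
```

Plotting `pct_diff` against `pair_mean`, with horizontal lines at `bias` and at the two `limits`, gives a plot of the kind described for Figure 2.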

Over 99 days, the daily mean T1 and T2 stayed well within 10% of the overall mean measurement for all materials and all temperatures (Figure 4). During this time, the scan room temperature ranged from 19.28 °C to 20.56 °C. While the T1 measurements stayed within the variation range expected from the temperature-controlled NMR values, the T2 measurements often varied by more than the NMR-based calculations predicted.
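One way to make the comparison between expected and measured ranges concrete is sketched below. It assumes a linear temperature dependence around a reference temperature; the reference values, the slopes, the function names `expected_range` and `exceeds_expected`, and the measured values are all hypothetical, since the abstract does not report the NMR-derived coefficients. Only the 19.28 °C to 20.56 °C interval comes from the text.

```python
def expected_range(ref_value, slope_per_degc, ref_temp_c, temp_min_c, temp_max_c):
    """Expected (low, high) relaxation time over a temperature interval, linear model."""
    at_min = ref_value + slope_per_degc * (temp_min_c - ref_temp_c)
    at_max = ref_value + slope_per_degc * (temp_max_c - ref_temp_c)
    return min(at_min, at_max), max(at_min, at_max)

def exceeds_expected(measured_values, expected):
    """True if the measured spread (max - min) is wider than the expected spread."""
    measured_spread = max(measured_values) - min(measured_values)
    return measured_spread > (expected[1] - expected[0])

# Observed scan room temperature interval from the study.
TEMP_MIN_C, TEMP_MAX_C = 19.28, 20.56

# Hypothetical coefficients: T1 = 1000 ms at 20 °C with 20 ms/°C,
# T2 = 100 ms at 20 °C with 1 ms/°C.
t1_expected = expected_range(1000.0, 20.0, 20.0, TEMP_MIN_C, TEMP_MAX_C)
t2_expected = expected_range(100.0, 1.0, 20.0, TEMP_MIN_C, TEMP_MAX_C)

# Hypothetical measured daily means (ms).
t1_exceeds = exceeds_expected([990.0, 1005.0, 1010.0], t1_expected)
t2_exceeds = exceeds_expected([95.0, 100.0, 108.0], t2_expected)
```

Comparing spreads rather than absolute bounds is a design choice of this sketch: it avoids penalizing a constant offset between the NMR reference values and the MRF measurements.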

Discussion

Phantom results indicate that the MRF sequence is repeatable in test-retest acquisition and reproducible over time. Previous studies have found stability in MRF measurements⁶⁻⁸. Some variation in measurement is expected as this experiment was not temperature-controlled. The measurement variation did not correlate with the variation in temperature, indicating that the measurement variation source was likely noise in the measurement itself rather than variations in temperature.

Conclusion

Over the 99-day study, phantom measurements were stable, and no changes were detected on this system using MRF. For this measurement type and study duration, phantom measurements should be made at the study’s start and end to verify no changes occurred during the study. There is no indication that intermediate measurements are necessary based on the results here; however, phantom measurements should be conducted before and after known system changes. This work represents an initial effort to answer the clinically relevant question of how often phantom QA must be performed alongside in vivo measurements.

Acknowledgements

NIST acknowledges research funding from the National Research Council Postdoctoral Fellowship.

References

1. Keenan KE, Gimbutas Z, Dienstfrey A, Stupic KF. Assessing effects of scanner upgrades for clinical studies. J Magn Reson Imaging. 2019;50(6):1948-1954. doi:10.1002/jmri.26785

2. Lee Y, Callaghan MF, Acosta-Cabronero J, Lutti A, Nagy Z. Establishing intra- and inter-vendor reproducibility of T1 relaxation time measurements with 3T MRI. Magn Reson Med. 2019;81(1):454-465. doi:10.1002/mrm.27421

3. Keenan KE, Ainslie M, Barker AJ, et al. Quantitative magnetic resonance imaging phantoms: A review and the need for a system phantom. Magn Reson Med. 2018;79(1):48-61. doi:10.1002/mrm.26982

4. Keenan KE, Wilmes LJ, Aliu SO, et al. Design of a breast phantom for quantitative MRI. J Magn Reson Imaging. 2016;44(3):610-619. doi:10.1002/jmri.25214

5. Boss MA, Dienstfrey AM, Gimbutas Z, et al. Magnetic Resonance Imaging Biomarker Calibration Service: Proton Spin Relaxation Times. National Institute of Standards and Technology; 2018:NIST SP 250-97. doi:10.6028/NIST.SP.250-97

6. Jiang Y, Ma D, Keenan KE, Stupic KF, Gulani V, Griswold MA. Repeatability of magnetic resonance fingerprinting T1 and T2 estimates assessed using the ISMRM/NIST MRI system phantom. Magn Reson Med. 2017;78(4):1452-1457. doi:10.1002/mrm.26509

7. Buonincontri G, Kurzawski JW, Kaggie JD, et al. Three dimensional MRF obtains highly repeatable and reproducible multi-parametric estimations in the healthy human brain at 1.5T and 3T. NeuroImage. 2021;226:117573. doi:10.1016/j.neuroimage.2020.117573

8. Dupuis A, et al. Quantifying 3D-MRF Reproducibility Across Subjects, Sessions, and Scanners Automatically Using MNI Atlases. Proc. Intl. Soc. Mag. Reson. Med. 31. 2023; 2182.

Figures

Figure 1: Example MRF T1 (left) and T2 (right) maps for the breast phantom. The phantom consists of 12 PVP samples and 4 fat mimics (masked), with an FBG mimic background fill. The phantom was imaged in the same coil as in vivo imaging using a 3D printed holder for repeatable positioning, and it was stored in the scan room between imaging sessions.

Figure 2: Bland-Altman plots of T1, T2, and the T1/T2 ratio for each phantom sample test-retest measurement. Percent differences between each test-retest measurement are calculated and plotted as a function of the mean test-retest value. The mean (solid line) and the mean ± 1.96 standard deviations (dashed lines) were calculated for each measurement type.

Figure 3: T1 and T2 test-retest variation for each material over the course of the study shows a test-retest variation of less than 1% for all measurements except FBG T2.

Figure 4: T1 (left) and T2 (right) percent variation of mean measurement by {day, repeat} compared to overall mean for each material (rows). Scan room temperature is shown in color (colorbar in °C). X-axes show measurement day, where day 1 is the first day of in vivo acquisition. Coefficients of variation are plotted (error bars), with dashed lines showing ±10% variation. Shaded gray regions indicate amount of expected variation given the temperature range, as per temperature-controlled NMR data.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)

DOI: https://doi.org/10.58530/2024/0545