1084

Self-navigated Subspace Reconstruction for Real-time MRI Speech Tracking

Peng Cao¹, Wenting Jiang¹, Changhe Chen², Yiang Wang¹, and Jonathan Havenhill²
¹Department of Diagnostic Radiology, The University of Hong Kong, Hong Kong, China, ²Department of Linguistics, The University of Hong Kong, Hong Kong, China

Synopsis

Keywords: Image Reconstruction, Motion Correction

Motivation: Real-time MRI offers a continuous and dynamic view of the object being imaged. Researchers have applied real-time MRI to speech tracking, which allows for the visualization of the vocal tract during speech production.

Goal(s): In this study, we propose applying self-navigated subspace reconstruction to real-time MRI for speech tracking.

Approach: During reconstruction, 1000 frames were compressed to a few principal components, and iterative low-rank approximation was performed on compressed k-space, greatly reducing computation costs.

Results: The proposed method allows for the joint reconstruction of all time frames and provides the dynamic motion pattern of the vocal tract at a high frame rate.

Impact: Our study presents a subspace reconstruction technique that requires no navigator echo and can be used for real-time MRI, particularly in speech tracking applications.

INTRODUCTION

Real-time MRI captures a series of cross-sectional images of the subject's vocal tract, synchronized with audio recording [1]. Although real-time MRI has conventionally been reconstructed using regularized iterative reconstruction [2, 3], advanced MRI reconstruction and acceleration methods could improve its speed further. Various algorithms are used for subspace reconstruction, such as spatial-temporal separable models, dictionary learning, and low-rank matrix completion [4]. Moreover, studies have shown the feasibility of using a multi-scale low-rank approximation to reconstruct 100 gigabytes of dynamic volumetric image data [5]. Previously, subspace reconstruction with a navigator echo was used for speech tracking [6, 7]. Another study showed the feasibility of updating the subspace basis in conjunction with the iterative reconstruction [8]. Therefore, in this study, we propose applying self-navigated subspace reconstruction to real-time MRI for speech tracking. We performed experiments on a clinical 3T MRI system using standard RF coils and rapid acquisition.

METHODS

MRI acquisition, postprocessing, and subspace basis generation

The in vivo experiments were approved by the local institutional research ethics committee. A healthy volunteer underwent a real-time MRI scan using the FIDALL sequence on a GE 3T system (General Electric Healthcare) with a clinical 21-channel head-neck coil for signal reception. The scan parameters were: flip angle = 15°, FISP acquisition, constant echo time (TE)/repetition time (TR) = 1.77/7 ms, slice thickness = 5 mm, matrix size = 128 × 128, field of view = 300 × 300 mm², number of TRs = 1000, total number of spiral interleaves = 32, nominal temporal resolution = 7 ms/frame, golden-angle rotation of the spiral (56.25°), and scan time = 7 s/slice. The k-space data were compressed into singular value components. Truncating at an error level of 5% retained around 10-20 singular values.

Gradient descent for subspace image reconstruction

Note that the images are compressed to singular images. All computations in this study were performed in MATLAB (MathWorks, Natick) on a laptop computer.
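The compression and reconstruction steps described above can be sketched in a few lines. The example below is a minimal illustration in Python with NumPy (not the MATLAB code used in this study), and it rests on several assumptions that are not stated in the abstract: the 5% figure is read as a relative Frobenius-norm truncation error, a binary sampling mask on a samples × frames matrix stands in for the multi-coil spiral acquisition, and the basis is refreshed by an SVD of the data-consistent estimate. The function names `compress_to_subspace` and `subspace_gradient_descent` are hypothetical.

```python
import numpy as np


def compress_to_subspace(casorati, rel_error=0.05):
    """Truncated SVD of a (samples x frames) matrix at a relative error level.

    Assumption: "5%" means the relative Frobenius error of the truncation.
    """
    _, s, vh = np.linalg.svd(casorati, full_matrices=False)
    energy = np.cumsum(s**2)
    # residual[k-1]: relative error left after keeping k singular values
    residual = np.sqrt(np.maximum(1.0 - energy / energy[-1], 0.0))
    rank = int(np.argmax(residual <= rel_error)) + 1
    basis = vh[:rank].conj().T      # frames x rank, orthonormal columns
    coeffs = casorati @ basis       # samples x rank (compressed components)
    return basis, coeffs, rank


def subspace_gradient_descent(data, mask, rank, n_outer=20, n_inner=5):
    """Toy self-navigated subspace reconstruction on masked data.

    data : (samples x frames) measurements
    mask : boolean array of the same shape, True where a sample was acquired
    """
    data = np.where(mask, data, 0)  # enforce zeros where nothing was sampled

    def top_basis(matrix):
        _, _, vh = np.linalg.svd(matrix, full_matrices=False)
        return vh[:rank].conj().T

    basis = top_basis(data)         # initial basis taken from the data itself
    coeffs = data @ basis
    history = []                    # data-consistency cost per iteration
    for _ in range(n_outer):
        for _ in range(n_inner):
            residual = mask * (coeffs @ basis.conj().T) - data
            history.append(0.5 * np.linalg.norm(residual) ** 2)
            # Gradient of 0.5*||mask*(C B^H) - data||^2 with respect to C.
            # A unit step is safe: B is orthonormal and the mask is binary.
            coeffs = coeffs - residual @ basis
        # Basis update: refit the subspace to the data-consistent estimate.
        estimate = coeffs @ basis.conj().T
        filled = estimate - (mask * estimate - data)
        basis = top_basis(filled)
        coeffs = filled @ basis
    final = mask * (coeffs @ basis.conj().T) - data
    history.append(0.5 * np.linalg.norm(final) ** 2)
    return coeffs @ basis.conj().T, basis, history
```

In this toy setting, neither the unit-step gradient update nor the basis refresh can increase the data-consistency cost, so the returned `history` is non-increasing. A spiral acquisition would replace the mask with a non-uniform Fourier operator and coil sensitivities, which changes the admissible step size.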

RESULTS

In a simulation with 32-fold acceleration, the proposed method produced a root mean square error (RMSE) of 0.154, compared with 0.278 for sliding window reconstruction and 0.294 for low-rank reconstruction. In vivo, a typical sagittal image was obtained with a temporal resolution of 7 ms/frame. The cross-sectional view from 1000 frames acquired over 7 seconds revealed the dynamics of the vocal tract while the volunteer produced several vowels. The magnitude and phase images of a typical vowel production spanned about 200 ms. The high temporal resolution of the images from the proposed method enabled clear visualization of the soft palate. The phase maps showed a modest variation of the static magnetic field due to airflow, which could cause phase cancellation if these dynamics were summed in a conventional sliding window approach. Furthermore, Figure 5 compares the proposed method with sliding window reconstruction. The images reconstructed with the proposed method resolved the motion of the soft palate in almost every frame.
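To make the comparison metric and the phase-cancellation argument concrete, the sketch below (Python with NumPy) defines a normalised RMSE and a simple moving-average sliding window. Both are assumptions for illustration: the abstract does not state how the RMSE was normalised, and a real sliding window combines spiral interleaves in k-space rather than averaging image frames. The names `nrmse` and `sliding_window` are hypothetical.

```python
import numpy as np


def nrmse(estimate, reference):
    """Root mean square error normalised by the RMS of the reference."""
    return np.linalg.norm(estimate - reference) / np.linalg.norm(reference)


def sliding_window(frames, width):
    """Average each frame with its neighbours along the last (time) axis."""
    kernel = np.ones(width) / width
    return np.apply_along_axis(
        lambda series: np.convolve(series, kernel, mode="same"), -1, frames
    )


# One pixel with constant magnitude and a slowly drifting phase over 1000 frames.
signal = np.exp(1j * np.linspace(0.0, 4.0 * np.pi, 1000))
averaged = sliding_window(signal, 32)  # assumed window of 32 frames
interior_magnitude = abs(averaged[500])
```

Because the complex samples inside each window do not share the same phase, `interior_magnitude` falls below the true magnitude of 1, which is the cancellation effect in miniature. The window width of 32 frames is an assumption corresponding to one full set of interleaves.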

DISCUSSION

In this study, we presented a subspace reconstruction method for real-time MRI. The proposed method allows for the joint reconstruction of all time frames and provides the dynamic motion pattern of the vocal tract at a high frame rate. It delivers a nominal temporal resolution of 7 ms/frame and enables the linguistic study of the position of the soft palate and the opening or closing of the passage between the nasopharynx and oropharynx.

CONCLUSION

Our study presents a subspace reconstruction technique that requires no navigator echo and can be used for real-time MRI, particularly in speech tracking applications.

Acknowledgements

No acknowledgement found.

References

1. Ramanarayanan, V., et al., Analysis of speech production real-time MRI. Computer Speech and Language, 2018. 52: p. 1-22.

2. Lingala, S.G., et al., A fast and flexible MRI system for the study of dynamic vocal tract shaping. Magn Reson Med, 2017. 77(1): p. 112-125.

3. Lingala, S.G., et al., State-of-the-art MRI Protocol for Comprehensive Assessment of Vocal Tract Structure and Function. 17th Annual Conference of the International Speech Communication Association (Interspeech 2016), Vols 1-5, 2016: p. 475-479.

4. Shafieizargar, B., et al., Systematic review of reconstruction techniques for accelerated quantitative MRI. Magn Reson Med, 2023. 90(3): p. 1172-1208.

5. Ong, F., et al., Extreme MRI: Large-scale volumetric dynamic imaging from continuous non-gated acquisitions. Magn Reson Med, 2020. 84(4): p. 1763-1780.

6. Fu, M., et al., High-frame-rate full-vocal-tract 3D dynamic speech imaging. Magn Reson Med, 2017. 77(4): p. 1619-1629.

7. Fu, M., et al., High-resolution dynamic speech imaging with joint low-rank and sparsity constraints. Magn Reson Med, 2015. 73(5): p. 1820-1832.

8. Bhave, S., et al., Accelerated whole-brain multi-parameter mapping using blind compressed sensing. Magn Reson Med, 2016. 75(3): p. 1175-1186.

9. Cao, P., et al., Motion-resolved and free-breathing liver MRF. Magnetic Resonance Imaging, 2022. 91: p. 69-80.

Figures

Figure 1. Schematic drawing of the key steps of the proposed method: 1) dynamic images were obtained from the spiral acquisition, 2) the dynamic images were compressed into singular images, 3) iterative subspace reconstruction was performed on the singular images, and 4) the subspace basis U was updated during reconstruction. Updating the subspace basis might have preserved the temporal features of the reconstructed image series and rejected undersampling artifacts.

Figure 2. The simulation result showed that the proposed subspace reconstruction preserved the dynamic features of the dataset. The ground truth image had a temporal resolution of 7 ms/frame. The acceleration factor was 32. The measured RMSE was 0.217 for the proposed method, 0.278 for sliding window reconstruction, and 0.294 for low-rank reconstruction. The cross-sectional views were taken at the image position indicated by the vertical dashed line on the sagittal image. The sliding window reconstruction caused some ambiguities during the transitions in the cross-sectional views.

Figure 3. Typical sagittal image from the in vivo measurement with a temporal resolution of 7 ms/frame. The left panel is the cross-sectional view from 1000 frames over a duration of 7 s. The cross-sectional view was taken at the image position indicated by the vertical dashed line on the sagittal image.

Figure 4. The magnitude and phase images of a typical vowel production, spanning about 200 ms. The high temporal resolution of the images from the proposed method enabled clear visualization of the soft palate. The phase maps also showed a modest variation of the static magnetic field due to airflow, which can cause phase cancellation if these dynamics are summed in a conventional sliding window approach.

Figure 5. Compared with sliding window reconstruction, the proposed method had a high temporal resolution and resolved the motion of the soft palate in almost every frame. In contrast, the sliding window reconstruction, with its low temporal resolution, could not recover the soft palate because of motion and airflow-induced phase cancellation.

Proc. Intl. Soc. Mag. Reson. Med. 32 (2024)

DOI: https://doi.org/10.58530/2024/1084