Real-time MRI of Speech at Very High Temporal Resolution
Arun Antony Joseph1,2, Dirk Voit1, Klaus-Dietmar Merboldt1, and Jens Frahm1,2

1Biomedizinische NMR Forschungs GmbH, Max-Planck-Institute for Biophysical Chemistry, Goettingen, Germany, 2DZHK, German Center for Cardiovascular Research, Goettingen, Germany

Synopsis

MRI has become the preferred technique for studying the dynamics of the tongue and vocal tract during speech, singing or instrument playing. Recent advances provide access to qualitative information about fast tongue movements at very high temporal resolution. In this study, real-time MRI using highly undersampled radial FLASH with regularized nonlinear inverse reconstruction was used to monitor the dynamics of the tongue at 10 ms, 18 ms and 33 ms resolution. The effect of the different temporal resolutions on the depiction of the tongue and oral cavity during speech is compared and analyzed.

Purpose

Interest in studying the movements of the articulators during speech, singing or instrument playing has grown in recent years. MRI has become the preferred technique for these studies because of its excellent anatomic and temporal resolution. However, most current MRI techniques rely on view sharing and sliding-window reconstructions to increase the nominal frame rate. Recent advances in real-time MRI using highly undersampled radial FLASH and regularized nonlinear inverse reconstruction now make it possible to observe the dynamics of the tongue and oral cavity at very high temporal resolution [1,2]. The aim of this study was to analyze the effects of three temporal resolutions (10 ms, 18 ms and 33 ms) on the depiction of tongue movements during fast speech.

Materials and Methods

Speech experiments were performed on a 3T MRI system (Prisma, Siemens Healthcare, Erlangen, Germany). The standard 64-element head receiver coil was used for the measurements, covering the head and neck. The experiments were performed on healthy volunteers (n=6) according to the recommendations of the local ethics committee. Real-time T1-weighted images were acquired with a highly undersampled radial FLASH sequence in combination with regularized nonlinear inverse reconstruction (NLINV) [3,4]. The MRI measurements consisted of acquisitions at three frame rates (30, 55 and 100 fps) with otherwise similar parameters: 1.4×1.4×8 mm³ spatial resolution, FOV 192×192 mm², flip angle 5°. The values for TR/TE/number of spokes were 1.96 ms/1.28 ms/17, 2.02 ms/1.28 ms/9 and 2.00 ms/1.28 ms/5 for 33 ms, 18 ms and 10 ms resolution, respectively. The subjects were asked to read an English-language pangram during the measurements. To assess the influence of higher temporal resolution, the subjects were further instructed to read at different speeds. The audio signals were recorded using a recording system provided by Optoacoustics (Or Yehuda, Israel) and later synchronized with the images for the analysis.
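The relation between the listed TR values, spoke counts and the quoted temporal resolutions can be checked with a short calculation. The sketch below assumes that the acquisition time per frame is simply TR multiplied by the number of spokes, with no additional dead time between frames; the function and variable names are illustrative and not part of the original protocol.

```python
# Assumption: time per frame = TR x number of spokes (no dead time between frames).
# TR values and spoke counts are taken from the Methods text; the names are hypothetical.

protocols = {
    "33 ms": (1.96, 17),  # (TR in ms, spokes per frame)
    "18 ms": (2.02, 9),
    "10 ms": (2.00, 5),
}


def frame_time_ms(tr_ms, n_spokes):
    """Acquisition time of one frame in milliseconds."""
    return tr_ms * n_spokes


def frames_per_second(tr_ms, n_spokes):
    """Nominal frame rate corresponding to one frame time."""
    return 1000.0 / frame_time_ms(tr_ms, n_spokes)


if __name__ == "__main__":
    for label, (tr, spokes) in protocols.items():
        t = frame_time_ms(tr, spokes)
        fps = frames_per_second(tr, spokes)
        print(f"{label}: {t:.2f} ms per frame, {fps:.1f} fps")
```

Under this assumption the three protocols give frame times of about 33.3 ms, 18.2 ms and 10.0 ms, consistent with the stated rates of roughly 30, 55 and 100 fps.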

Results and Discussion

Images obtained at the three temporal resolutions (33 ms, 18 ms and 10 ms) are shown in the figure. High-quality images were obtained at all temporal resolutions, as seen from the depiction of important structures of the vocal tract such as the tongue, palate and larynx. A minimal increase in background noise, due to the reduced number of spokes, was observed in the images obtained at 10 ms. The images show the tongue positions at rest and during two different phases of speech. The images obtained at higher temporal resolution (10 ms, 18 ms) provide better definition of the tongue during fast tongue movements, as shown in phase 1. For very fast tongue movements, as shown in phase 2, the tongue tip was best defined in the image at 10 ms temporal resolution.

Conclusion

MRI of the tongue is generally performed to understand its dynamics during speech, singing or instrument playing. The fast movements of the tongue during these tasks are difficult to visualize when measured at low temporal resolution. Real-time MRI using highly undersampled radial FLASH and NLINV reconstruction enables monitoring of tongue movements at very high temporal resolution. During rapid tongue movements, real-time acquisitions with temporal resolutions of 10 ms and 18 ms were found to provide much clearer definition of all articulators in the oral cavity.

References

[1] Niebergall A et al. MRM 2013; 69: 477–485.

[2] Iltis PW et al. Quant Imaging Med Surg 2015; 5: 374–381.

[3] Uecker M et al. NMR Biomed 2010; 23: 986–994.

[4] Uecker M et al. MRM 2008; 60: 674–682.

Figures

Figure 1: Selected frames from real-time MRI movies depict the oral cavity at different temporal resolutions (33 ms, 18 ms, 10 ms) during rest and speech (phases 1 and 2).



Proc. Intl. Soc. Mag. Reson. Med. 24 (2016)
3209