Sarah E Johnson1, Marissa Barlaz1, Shuju Shi1, Ryan K Shosted1, and Brad P Sutton2
1Linguistics, University of Illinois at Urbana-Champaign, Urbana, IL, United States, 2Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL, United States
Synopsis
The present study assesses the ability of real-time MRI (rt-MRI)
to detect subtle changes in laryngeal configuration across phonation
contrasts. One subject lay supine within a
3T Siemens Trio scanner while producing a variety of phonation types including
breathy, modal, and creaky voice. An analysis of axial and coronal slices of
the larynx detected predictable changes at the ventricular folds, vocal folds
and arytenoid cartilages. We conclude that rt-MRI of the larynx may have
further application in the study of phonation in both research and clinical
settings as a non-invasive measure of laryngeal function.
Introduction
Advances in rt-MRI allow researchers to image the posterior vocal tract,
including the structures of the pharynx, epilarynx, and larynx. This method has
advantages for speech and clinical research, including its non-invasive nature
and the ability to image multiple regions of the vocal tract simultaneously. In
the past, radial fast low-angle shot (FLASH) MRI has been used to obtain
time-varying coronal images of the larynx. Using this method, researchers have
reliably detected glottal adduction during swallowing1 and voiced consonants.2 However, it is not clear whether more subtle laryngeal
configurations, such as phonation contrasts, can also be detected using this
method. Accordingly, the present study explores whether rt-MR images can be
used to differentiate glottal closure, breathy, creaky, and modal phonation.
Methods
The participant (a trained phonetician) lay supine within a 3T Siemens Trio
scanner while producing a variety of test utterances containing phonation type
contrasts, as well as complete glottal closure (glottal stop). rt-MR images
were obtained using the partial separability model,3,4 yielding
approximately 25 frames per second when acquiring 4 slices composed of 128 x
128 voxels at 2.2 mm x 2.2 mm x 8.0 mm (through-plane depth). Two 5-minute
scans were collected. In the first scan, 4 axial slices transected the
laryngeal region, including both the arytenoid cartilages and the vocal folds.
In the second scan, 4 coronal slices transected the neck from anterior at the
thyroid notch to posterior behind the cricoid cartilage. We analyzed three
slices: an axial slice at the arytenoid cartilages, an axial slice at the
glottis, and a coronal slice at the center of the anterior-posterior axis
through the vocal folds (Figure 1).
Results
Visual inspection of the images revealed greater constriction at the
vocal and ventricular folds, and a lowered larynx, during creaky phonation and
glottal stop (Figure 2). Principal component analyses of pixel intensity2 corroborate these qualitative observations in both the coronal and axial
orientations. A pixel intensity analysis of the entire laryngeal vestibule
across phonation conditions revealed differences in the adduction of tissues
over time (Figure 3). For this measure, higher pixel intensity reflects
more soft tissue within a given region of interest. Results indicate
the presence of more soft tissue within the glottal and ventricular regions,
most likely due to medial adduction of the soft tissue surrounding the
laryngeal lumen. When quantified, the greatest degree of adduction was observed
in glottal stop, followed by creaky, modal, and breathy phonation in that
order. A control set of images of quiet breathing showed the lowest pixel
intensity, indicative of a wider laryngeal lumen.
Discussion
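The two pixel-intensity measures reported in the Results (mean intensity within a region of interest, and PCA of pixel intensity) can be illustrated with a minimal sketch. It is not the analysis pipeline used in this study: the function names `roi_mean_intensity` and `roi_pca`, the rectangular region of interest, and the synthetic frames are all assumptions made for illustration, and the PCA is computed with a plain singular value decomposition in NumPy.

```python
import numpy as np


def roi_mean_intensity(frames, roi):
    """Mean pixel intensity inside a boolean ROI mask, one value per frame.

    frames: array of shape (n_frames, height, width)
    roi:    boolean array of shape (height, width)
    """
    return frames[:, roi].mean(axis=1)


def roi_pca(frames, roi, n_components=2):
    """PCA of ROI pixel intensities across frames, computed via SVD.

    Returns (scores, components, explained_variance_ratio).
    """
    X = frames[:, roi].astype(float)   # (n_frames, n_roi_pixels)
    X = X - X.mean(axis=0)             # centre each pixel over time
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    variance = S ** 2
    ratio = variance[:n_components] / variance.sum()
    return scores, Vt[:n_components], ratio


# Synthetic stand-in data (hypothetical, not the study's images):
# the first half of the frames mimics an open lumen (dark ROI),
# the second half mimics adducted tissue (bright ROI).
rng = np.random.default_rng(0)
n_frames, height, width = 100, 16, 16
roi = np.zeros((height, width), dtype=bool)
roi[6:10, 6:10] = True                 # hypothetical rectangular ROI
closure = np.repeat([0.2, 0.8], n_frames // 2)
frames = rng.normal(0.0, 0.01, size=(n_frames, height, width))
frames[:, roi] += closure[:, None]

intensity = roi_mean_intensity(frames, roi)
scores, components, ratio = roi_pca(frames, roi)
```

In this toy setting the mean ROI intensity rises when the simulated tissue closes the lumen, and the first principal component separates the two simulated conditions. With real images, the region of interest would be drawn over the laryngeal vestibule and the frames grouped by phonation condition.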
We interpret these results as reflecting increased laryngeal and ventricular
approximation and a lowered larynx during the production of glottal stop and creaky
voice. These articulations are traditionally associated with a raised larynx.6 One explanation for this discrepancy may be that the participant is a native Mandarin speaker.
The low Mandarin tone is often produced with creaky phonation and sometimes a
lowered larynx.7 The speaker may have extended the creaky phonation task to
coincide with a low tone and hence a lowered larynx. The relatively nuanced
articulatory contrast between breathy and modal phonation, which is
traditionally characterized in part by relative abduction of the glottis, was
automatically captured in our analysis using PCA of pixel intensity. These
results demonstrate that current rt-MRI methods and technology can be used to
extract features of fine articulatory distinctions between phonation type and
degree of vocal and ventricular fold approximation.
Conclusion
We find that rt-MRI can be used to accurately distinguish laryngeal
configurations associated with differing phonation types and degrees of glottal
closure. We also find that rt-MRI can be used to reliably detect differences in
laryngeal height associated with phonation contrasts. We conclude that rt-MRI
of the larynx can have further application in the study of phonation in both
research and clinical settings. Our methods of data collection, reconstruction,
and automatic machine learning of rt-MR image features may be extended to
provide a non-invasive measure of laryngeal function in clinical settings for
patients with dysphonia and other voice or resonance disorders.
Acknowledgements
This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-1144245.
References
1. Zhang S, Olthoff A, Frahm J. Real-time magnetic resonance
imaging of normal swallowing. J Magn Reson Imaging. 2012 Jun;35(6):1372-9.
2. Niebergall A, Zhang S, Kunay E, Keydana G, Job M, Uecker M, Frahm J.
Real-time MRI of speaking at a resolution of 33 ms: undersampled
radial FLASH with nonlinear inverse reconstruction. Magn Reson Med. 2013 Feb;69(2):477-85.
3. Fu M, Zhao B, Carignan C, Shosted RK, Perry JL, Kuehn DP, Liang ZP, Sutton BP.
High-resolution dynamic speech imaging with joint low-rank and sparsity constraints.
Magn Reson Med. 2015 May;73(5):1820-32.
4. Liang Z-P. Spatiotemporal imaging with partially separable functions. 4th IEEE International Symposium on Biomedical
Imaging: From Nano to Macro; 2007; Arlington, Virginia. New York: Curran Associates, Inc; p. 988–91.
5. Carignan C, Shosted RK, Fu M, Liang Z-P, Sutton BP. A
real-time MRI investigation of the role of lingual and pharyngeal
articulation in the production of the nasal vowel system of French.
J Phonetics. 2015;50:34-51.
6. Hardcastle WJ, Beck JM. A figure of speech: a Festschrift for John Laver.
Mahwah, New Jersey: Erlbaum; 2005. Esling JH, Harris JG. States of the glottis: an articulatory phonetic model based on laryngoscopic observations; p. 347-83.
7. Moisik S, Lin H, Esling JH. A study of laryngeal gestures in
Mandarin citation tones using simultaneous laryngoscopy and laryngeal
ultrasound (SLLUS). J International Phonetic Association. 2014;44:21–58.
8. Otsu N. A threshold selection method from gray-level
histograms. IEEE Transactions on Systems, Man, and Cybernetics. 1979;9(1):62-6.