Michael Burdumy1,2, Matthias Echternach2, Jan Gerrit Korvink3, Bernhard Richter2, Jürgen Hennig1, and Maxim Zaitsev1
1Medical Physics, University Medical Center Freiburg, Freiburg, Germany, 2Institute of Musicians' Medicine, University Medical Center Freiburg, Freiburg, Germany, 3Institute of Microstructure Technology, Karlsruhe Institute of Technology, Karlsruhe, Germany
Synopsis
To accelerate dynamic 3-D imaging of the vocal tract during
articulation, a stack-of-stars sequence with golden-angle rotation
and iterative reconstruction was implemented. Phase correction,
peripheral under-sampling, and temporal and spatial regularization
were applied to reach an acquisition time of 1.3 seconds per volume. The vocal tract
modifications of one subject could be successfully analyzed at discrete
time steps during phonation of a long note.
Purpose
This work aims to improve the temporal
resolution of three-dimensional vocal tract (VT)
imaging with sufficient spatial resolution
to analyze morphometric changes during speech or singing. The method
enables dynamic speech or singing tasks in the MR system, and the
recorded 3-D images can be analyzed via distance measurements,
volumetric acoustic simulations or rapid-prototyping print-outs of
segmentations.
Methods
A radial stack-of-stars sequence with golden-angle rotation was modified such that dynamic changes in the VT
could be depicted at high spatial and temporal resolutions. A previously reported1 radio-frequency-spoiled radial GRE
sequence was extended to 3-D imaging by adding a phase-encoding gradient and
phase spoilers in the sagittal direction. Sequence parameters:
RF-spoiled stack-of-stars, TR = 2.9 ms, TE = 1.4 ms, FA = 6°, FOV
200 × 200 × 62 mm³, resolution 1.56 × 1.56 × 1.3 mm³, BW 1524 Hz/pixel.
The inner loop of the sequence cycles through all phase encodes at
different projection angles. Hence, all partitions are covered
within a short time. In
the outer loop, each projection is rotated with respect to the
previous one of the same partition. Furthermore,
the peripheral parts of k-space in the
phase-encoding direction are sampled less frequently, with the number
of projections decreasing linearly from the k-space center towards
both peripheries (Figure 1). The measurement was split into
temporally separated parts (time frames), such that 21 projections were measured in
the center partition and 5 projections in the outermost partition,
leading to a total of 456 projections per frame and a temporal resolution of
1.322 s. Coil sensitivities were calculated from the data itself for
each time frame, as described previously1.
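The sampling scheme described above can be illustrated with a minimal Python sketch. It is not the sequence code: the function names, the 9-partition example and the per-partition angle ordering are assumptions made for illustration, and the number of partitions actually acquired is not stated in the text. Only TR, the 21/5 projection counts, the total of 456 projections and the FOV and resolution values are taken from the abstract.

```python
import math

TR_S = 2.9e-3            # repetition time, from the sequence parameters
TOTAL_PROJECTIONS = 456  # projections per time frame, as reported
# Golden angle for radial sampling: 180 degrees divided by the golden ratio.
GOLDEN_ANGLE_DEG = 180.0 / ((1 + math.sqrt(5)) / 2)  # about 111.246 degrees

def projections_per_partition(n_partitions, n_center=21, n_edge=5):
    """Linear decrease from n_center (k-space center) to n_edge (outermost).

    n_partitions is a hypothetical input; use an odd value > 1 so that one
    partition sits exactly at the k-space center.
    """
    center = (n_partitions - 1) / 2
    counts = []
    for p in range(n_partitions):
        frac = abs(p - center) / center  # 0 at the center, 1 at both edges
        counts.append(round(n_center - frac * (n_center - n_edge)))
    return counts

def golden_angles(n_projections, start_index=0):
    """Projection angles (degrees, modulo 180) for consecutive golden-angle steps."""
    return [((start_index + i) * GOLDEN_ANGLE_DEG) % 180.0
            for i in range(n_projections)]

# Temporal resolution: every projection costs one TR.
frame_time_s = TOTAL_PROJECTIONS * TR_S

# Assumption: the nominal Cartesian matrix follows from FOV / resolution.
nominal_lines = round(200 / 1.56) * round(62 / 1.3)
undersampling_estimate = nominal_lines / TOTAL_PROJECTIONS

example_counts = projections_per_partition(9)   # illustrative 9 partitions
example_angles = golden_angles(3)
```

With these numbers, `frame_time_s` evaluates to about 1.322 s, matching the reported temporal resolution. The under-sampling estimate depends on the assumed matrix size and should be read as a rough plausibility check only.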
With the measured signal $$$S$$$, the original
image $$$x$$$ can be estimated by minimizing a functional of the
form $$f(x)=\|Ax-S\|_2+\lambda_1 TV_{temporal}(x)+\lambda_2 TV_{spatial}(x),$$
where $$$A$$$ is the
forward encoding operator that includes coil sensitivities, Fourier
encoding and projection onto a grid, and $$$\lambda_1$$$ and $$$\lambda_2$$$
are the regularization weights. The
pixel-wise total variation operator $$$TV_{temporal}$$$ enforces
sparsity in time2, while $$$TV_{spatial}$$$ enforces
sparsity in the spatial domain. The
minimization problem was solved with the nonlinear
conjugate gradient method in MATLAB (R2014b). $$$A$$$
was implemented using gpuNUFFT3, which is based on the Image
Reconstruction Toolbox4.
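To make the structure of the functional concrete, the following Python sketch evaluates the two regularization terms on a tiny image series. It is an illustration under assumptions, not the MATLAB implementation: frames are 2-D nested lists rather than 3-D volumes, the total variation is taken in its anisotropic (sum of absolute finite differences) form, which the text does not specify, and the data term is passed in as a precomputed number because the encoding operator $$$A$$$ requires a NUFFT. All function names are hypothetical.

```python
def tv_temporal(frames):
    """Sum of absolute differences between consecutive frames, pixel by pixel."""
    total = 0.0
    for prev, curr in zip(frames, frames[1:]):
        for row_prev, row_curr in zip(prev, curr):
            for a, b in zip(row_prev, row_curr):
                total += abs(b - a)
    return total

def tv_spatial(frames):
    """Anisotropic spatial TV: absolute differences to the right and lower
    neighbour of every pixel, summed over all frames."""
    total = 0.0
    for img in frames:
        for r, row in enumerate(img):
            for c, value in enumerate(row):
                if c + 1 < len(row):
                    total += abs(row[c + 1] - value)
                if r + 1 < len(img):
                    total += abs(img[r + 1][c] - value)
    return total

def objective(data_term, frames, lam_temporal, lam_spatial):
    """f(x) = data_term + lam_temporal * TV_temporal(x) + lam_spatial * TV_spatial(x).

    data_term stands in for ||Ax - S||_2 and must be computed elsewhere.
    """
    return (data_term
            + lam_temporal * tv_temporal(frames)
            + lam_spatial * tv_spatial(frames))

# Two 2x2 frames: one pixel switches on in the second frame.
frames = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.0, 1.0], [0.0, 0.0]],
]
value = objective(0.5, frames, lam_temporal=2.0, lam_spatial=0.1)
```

A larger temporal weight penalizes frame-to-frame changes more strongly, which is why the choice of regularization weights affects how well fast articulator movements are preserved.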
As an example of practical application, results of a 28-year-old female subject are shown. Data were acquired in the
supine position in a 3T Prisma (Siemens, Erlangen, Germany) with the
manufacturer’s 64-channel head/neck coil. The untrained
singer was required to hold the note C5 for as long as possible and
explicitly instructed to sing past her comfortable resting expiratory
level. In the reconstructed images, the larynx height was measured
from a mid-sagittal slice, as described in Figure 2 and previously1. The
air-filled VT was
manually segmented for all time points using ITK-SNAP5 and the binary
volume segmentations were summed voxel-wise to find regions
of morphometric changes.
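The summation of binary segmentations can be sketched as follows. This is a minimal illustration with hypothetical function names and flattened toy masks, not the analysis pipeline used here: a voxel that is inside the air-filled VT at every time point sums to the number of time points, a voxel that is never inside sums to zero, and any value in between marks a region that changed during phonation.

```python
def sum_masks(masks):
    """Voxel-wise sum of equally shaped binary masks given as flat lists."""
    return [sum(voxels) for voxels in zip(*masks)]

def changing_voxels(masks):
    """True where a voxel belongs to the segmentation at some, but not all,
    time points."""
    n_time_points = len(masks)
    return [0 < s < n_time_points for s in sum_masks(masks)]

# Three time points, four voxels each (toy data).
masks = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
]
summed = sum_masks(masks)
changed = changing_voxels(masks)
```

In this toy example the first voxel is always air-filled, the last never is, and the two middle voxels are flagged as changing.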
Results
The subject was able to hold the note for 20 s, while the sequence
ran for 25 s. The correct pitch was confirmed by a trained
musician with the help of the scanner’s microphone system. Image
reconstruction of this data set took
about four hours on a single CPU and GPU. The contrast and artifact levels
were sufficient to identify the landmarks for the larynx height. This
parameter was approximately constant for about 12 s, but then
decreased towards the end of phonation (Figure 3). A closing of
the mouth, raising of the tongue and opening of the uvula could be
seen at the end of the recording, when the subject started normal breathing. Regarding the segmentations (Figure 4), the
addition of the binary models of all time steps showed changes in the
region of the lips, frontal part of the tongue and in all three
dimensions of the larynx region (Figure 5).
Discussion
It has been suspected that untrained singers
modify the configuration of the VT when air runs out at
the end of a long note, because of a lack of sub-glottic pressure. This
study could confirm such modifications in a singing subject in all
three dimensions, especially of the tongue and the larynx. Both the presented images and previous studies
confirm that spatial resolutions finer than 2 mm are required to
identify the small structures of the VT6.
However, previous studies were limited to static or repetitive tasks,
due to longer acquisition times.
Although the presented method enables high under-sampling
factors, the singing or speech tasks and the
regularization parameters must be chosen carefully; otherwise, fast modifications are
suppressed.
Conclusion
With an under-sampling factor of 13 compared to full Cartesian
sampling, the acquisition of one volume every 1.3 s opens up new
possibilities for research on vocal tract acoustics and articulator modifications
during dynamic tasks.
Acknowledgements
This work was supported by DFG grants ZA422/3-3 and RI1050/4-3.
References
1Burdumy, M; Traser,
L; Richter, B; Echternach, M; Korvink, JG; Hennig, J; Zaitsev, M.
“Acceleration of MRI of the Vocal Tract Provides Additional Insight
into Articulator Modifications.” Journal of Magnetic Resonance
Imaging, 2015, doi:10.1002/jmri.24857.
2Feng, L; Grimm, R;
Block, KT; Chandarana, H; Kim, S; Xu, J; Axel, L; Sodickson, DK;
Otazo, R. “Golden-Angle Radial Sparse Parallel MRI: Combination of
Compressed Sensing, Parallel Imaging, and Golden-Angle Radial
Sampling for Fast and Flexible Dynamic Volumetric MRI: iGRASP:
Iterative Golden-Angle RAdial Sparse Parallel MRI.” Magnetic
Resonance in Medicine, 2013, doi:10.1002/mrm.24980.
3Knoll, F; Schwarzl,
A; Diwoky, C; Sodickson, DK. “gpuNUFFT - An Open-Source GPU Library
for 3D Gridding with Direct Matlab Interface.” Proc ISMRM p4297, 2014.
4Fessler, JA;
Sutton, BP. “Nonuniform Fast Fourier Transforms Using Min-Max
Interpolation.” IEEE Transactions on Signal Processing 51,
no. 2, 2003: 560–74, doi:10.1109/TSP.2002.807005.
5Yushkevich, PA;
Piven, J; Hazlett, HC; Smith, RG; Ho, S; Gee, JC; Gerig, G.
“User-guided 3D active contour segmentation of anatomical
structures: Significantly improved efficiency and reliability.”
Neuroimage, 2006, 31(3):1116-28.
6Scott, AD;
Wylezinska, M; Birch, MJ; Miquel, ME. “Speech
MRI: Morphology and Function.” Physica Medica 30, no. 6: 604–18, 2014, doi:10.1016/j.ejmp.2014.05.001.