Hybrid MRI-ultrasound acquisitions, and scannerless real-time imaging
Frank Preiswerk1, Matthew Toews2, Cheng-Chieh Cheng1, Jr-yuan George Chiou1, Chang-Sheng Mei3, Lena F. Schaefer1, W. Scott Hoge1, Benjamin M. Schwartz4, Lawrence P. Panych1, and Bruno Madore1

1Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, United States, 2The Laboratory for Imagery, Vision and Artificial Intelligence, École de Technologie Supérieure, Montréal, QC, Canada, 3Department of Physics, Soochow University, Taipei, Taiwan, 4Google Inc, New York, NY, United States


The goal of this project was to combine MRI, ultrasound (US) and computer science methodologies to generate MR images at high frame rates, inside and even outside the bore. A small US transducer, fixed to the abdomen, collected signals during MRI. Based on these signals and their correlations with MRI, a machine-learning algorithm created synthetic MR images at up to 100 frames per second. In one particular implementation, volunteers were taken out of the MRI bore with the US sensor still in place, and MR images were generated on the basis of the ultrasound signal and learned correlations alone, in a 'scannerless' manner.


In an era of ever-increasing computational power, dataset sizes and detector types, machine learning often allows traditional problems to be tackled in new and possibly better ways. Since the inception of MRI, acquisition speed and frame rate have typically been considered its weaknesses, especially compared to ultrasound (US) imaging. The purpose of the present work was to explore the benefits of combining MRI and ultrasound signals through machine-learning algorithms. In contrast to previous work on hybrid US and MRI1-3, our work employs a single-element US transducer and integrates US into the MRI reconstruction directly4,5.

We developed a hybrid-imaging setup that includes a 3D-printed capsule and an MR-compatible ultrasound transducer (Fig. 1a,b). The capsule, transducer, US gel and double-sided tape are assembled and affixed to the skin as shown in Fig. 1c-h. While Fig. 1 shows the latest version of our setup, the results shown here were obtained with our previous version, which was similar in spirit but not quite as elegant in pictures, having been carved in parts from a kid’s flip-flops. We named the resulting US assembly an ‘organ-configuration motion’ (OCM) sensor.

The combination of MRI and US was motivated by their highly complementary strengths: image contrast, speed, patient access, simplicity and cost. Our Bayesian algorithm allows for two modes of operation. In the first mode, once the algorithm has learned sufficiently (~1 to 2 min), it predicts many intermediate images in between the actually acquired ones, boosting temporal resolution by up to two orders of magnitude. This mode might be useful for monitoring ablation in moving organs, for example, by providing rapid updates on target location. In the second mode, with the subject outside the MRI scanner, the fully trained algorithm continues predicting MR images based on OCM signals alone. Possible applications to image-guided therapies and/or to multi-modality image registration are being evaluated.


Two hybrid MRI-OCM acquisitions were performed on each of 8 subjects (16 acquisitions). A pulser-receiver (5072PR, Olympus) synced with the MRI scanner fired the OCM sensor once per TR. Data were recorded using a digitizer card (NI5122, National Instruments). MRI was performed on either a 3T GE Signa or a Siemens Verio scanner (TR = 10-18 ms, flip angle = 30°, matrix size 192x192, FOV 38x38 cm2, ~0.6 s/image). Volunteers were requested to breathe normally, with occasional (and intentional) coughing and/or gasping to challenge the algorithm. For each TR increment, the algorithm computed a synthetic MR image It through Kernel Density Estimation6,7, using the following weighted sum over all previously acquired MR images IT:
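
A plausible form of this weighted sum is sketched below. It assumes a normalized, Nadaraya-Watson-type estimator; the normalizing denominator and the hat notation for the synthetic image are assumptions made for illustration, while the remaining symbols follow the definitions given in the text.

```latex
\hat{I}_t \;=\; \frac{\sum_{T=1}^{N_T} I_T \, \mathcal{N}(U_t;\, U_T,\, \Sigma)}
                     {\sum_{T=1}^{N_T} \mathcal{N}(U_t;\, U_T,\, \Sigma)}
```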


where Ut is the latest OCM signal, D={IT,UT} the collection of all NT MR images acquired up to time t and their time-matched OCM data, and N() is a Gaussian kernel with covariance matrix Σ. The quality of this estimation increases over time as the algorithm keeps learning and NT grows.
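
As an illustration only, a kernel-weighted sum of this kind can be sketched in a few lines of NumPy. This is not the released source code9: the function name `synthesize_image`, the array layout, and the normalization of the weights are assumptions, and the OCM signals are assumed to have been reduced to feature vectors of modest length.

```python
import numpy as np

def synthesize_image(u_t, ocm_db, img_db, sigma_inv):
    """Kernel-weighted average of stored MR images (hypothetical helper).

    u_t       : (d,) latest OCM feature vector U_t
    ocm_db    : (N, d) time-matched OCM vectors U_T from the database D
    img_db    : (N, H, W) MR images I_T from the database D
    sigma_inv : (d, d) inverse of the Gaussian kernel covariance
    """
    diff = ocm_db - u_t
    # Squared Mahalanobis distance between U_t and every stored U_T.
    d2 = np.einsum("nd,de,ne->n", diff, sigma_inv, diff)
    # Gaussian kernel weights; subtracting the minimum avoids underflow
    # and cancels out once the weights are normalized (assumed step).
    w = np.exp(-0.5 * (d2 - d2.min()))
    w /= w.sum()
    # Weighted sum over all stored images.
    return np.tensordot(w, img_db, axes=1), w
```

Calling the function once per OCM firing, and appending each newly acquired MR image and its time-matched OCM vector to `ocm_db` and `img_db`, would mirror the growing database described above.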

OCM data were further acquired from 4 subjects outside the scanner room, for 'scannerless' MRI. In both modes of operation, a 'cough-detector' algorithm robustly detected rapid motion, such as that from a cough or a gasp, based on the derivative of the OCM signal. Time points labeled as coughs/gasps were excluded from the learning database, and flags were generated for downstream applications. For example, if our hybrid method were used to track the location of a lesion during ablation, the ablation process should be paused during coughs/gasps and resumed when motion activity returns to normal.
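
A derivative-based detector of this kind might look like the following minimal sketch. The abstract does not specify the actual criterion, so the RMS frame-to-frame difference, the median baseline, and the threshold factor `k` are all assumptions, as is the function name `detect_rapid_motion`.

```python
import numpy as np

def detect_rapid_motion(ocm, k=5.0):
    """Flag OCM firings with unusually fast signal change (hypothetical).

    ocm : (n_firings, n_samples) array, one row per OCM firing
    k   : assumed threshold, in multiples of the median change rate
    """
    ocm = np.asarray(ocm, dtype=float)
    # Finite-difference derivative along the firing (slow-time) axis.
    d = np.diff(ocm, axis=0)
    # RMS change per firing; repeat the first value so that the output
    # has one entry per firing.
    energy = np.sqrt((d ** 2).mean(axis=1))
    energy = np.concatenate([[energy[0]], energy])
    # Firings that change much faster than the typical (median) rate
    # are flagged as cough/gasp candidates.
    return energy > k * np.median(energy)
```

Flagged firings could then be left out of the learning database and passed on as pause signals to a downstream application.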


The inside-the-scanner mode was manually validated by a trained radiologist, and results are shown in Fig. 2. A mean error of 1 pixel was found, after learning had converged. Figure 3 shows results from inside the scanner in M-mode format, along with results from cough-detection. The outside-the-scanner mode was qualitatively validated using optically-tracked 2D ultrasound imaging, see Fig. 4. Good agreement of motion between MRI and ultrasound images was found.
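
For readers who wish to reproduce this type of landmark-based validation, the error statistics reported in Fig. 2 (mean, median, and 90th percentile of the Euclidean error) can be computed as in the sketch below. The function name and the input layout are hypothetical, and landmark positions are assumed to be given in pixel units.

```python
import numpy as np

def landmark_error_stats(pred_xy, true_xy):
    """Euclidean landmark error statistics (hypothetical helper).

    pred_xy, true_xy : (..., 2) landmark positions, in pixels, in the
                       synthetic and the acquired images, respectively
    """
    pred_xy = np.asarray(pred_xy, dtype=float)
    true_xy = np.asarray(true_xy, dtype=float)
    # Euclidean distance between matching landmarks.
    err = np.linalg.norm(pred_xy - true_xy, axis=-1)
    return {
        "mean": float(err.mean()),
        "median": float(np.median(err)),
        "p90": float(np.percentile(err, 90)),
    }
```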

Discussion and Conclusion

Combining MRI data with ultrasound data through a machine-learning algorithm improved temporal resolution by up to two orders of magnitude compared to MRI alone. Synthetic images from outside the bore might pave the way for interesting image-guided therapy applications. A movie8 and source code9 have been made available online. Future improvements include the use of several OCM sensors simultaneously (instead of a single one), and additional hardware for this purpose is currently under development. Possible applications such as cardiac gating and image fusion are also being investigated. We believe that OCM sensors might prove valuable enough as an adjunct to MRI, in various applications, to possibly become a standard feature of future MRI systems.


Financial support from grants NIH R01CA149342, P41EB015898, R21EB019500, and SNSF P2BSP2 155234 is duly acknowledged.


Fig. 1: (a,b) A 3D-printed holder was designed to accommodate the transducer, gel, and a fiberoptic temperature probe. (c) Double-sided tape sealed the lower side, and (d) gel was squeezed inside. (e) The OCM sensor displaced excess gel outward, through holes in the bayonet mechanism, and (f) the sensor was closed by twisting the lid. (g) The OCM sensor with its holder is compact, about 3x3x1 cm, with flexible coaxial (white) and fiberoptic (yellow) cables. (h) For application, the protective layer of the tape is peeled off and the OCM sensor is affixed to the skin. Steps (c-g) can be performed in advance, for an optimized clinical workflow.

Fig. 2: Landmarks in the liver, indicated with green circles in Fig. 3, were tracked both in synthetic OCM-MRI and in acquired MRI images. The Euclidean error between synthetic and acquired images is plotted as a function of time: The algorithm learned from incoming timeframes, rapidly at first (see pink dashed line) and plateaued after a few breathing cycles (blue dashed line). The mean, median, and 90th percentile error, over all landmarks and all subjects, plateaued at 1.0, 0.6, and 2.2 pixels, respectively.

Fig. 3: Hybrid OCM-MRI results are shown for subjects A-C in M-mode format. The horizontal axis is time and the vertical axis represents a 1D image. Orange lines in the images on the left mark the 1D locations selected for M-mode display. Green circles in these images indicate the landmarks employed for validation purposes (Fig. 2). Red intervals represent cough/gasp events as detected by our algorithm, and cough/gasp events as detected by a human reader looking at OCM signals are highlighted in yellow. Volunteer B proved especially enthusiastic in challenging the algorithm with coughing events.

Fig. 4: MRI data were acquired (left column), and then the subject was taken out of the MRI scan room. Based on the OCM signals and learned correlations only, streams of synthetic MRI images could still be generated (middle column). Green lines indicate examples of full inspiration and expiration, whereas blue markings show the time segment zoomed in the lower row. Note the coarser nature of the temporal resolution from acquired MRI (1.7 fps) compared with synthetic MRI (100 fps). Images in the rightmost column show the 1D locations selected for M-mode display.

