One of the most critical aspects limiting the application of ultra-high field MRI is the evaluation of the local Specific Absorption Rate (SAR). The key problem is that local SAR information can only be obtained by off-line simulation using generic body models, which rarely match the patient's body and positioning. In this work we present a first deep learning approach for local SAR assessment. Results show that the relation between local SAR on the one hand, and MR Dixon images and B1-field maps on the other, can be accurately and instantaneously mapped by a Convolutional Neural Network (CNN).
Setup and Dataset
Using our database of 23 subject-specific models [6] with an 8-channel fractionated dipole array for prostate imaging at 7T [7], we generated 5750 image sets (23 models × 250 random phase settings with uniform amplitudes). These correspond to 5750 unique SAR and B1+ distributions. For each set, the corresponding water-fat image (Dixon reconstruction) and the simulated (Sim4Life, Zurich MedTech, Switzerland) B1+ distribution serve as input for predicting the local SAR distribution. The water-fat images were acquired at 1.5T in an earlier study [6]. The B1+ phase of one channel is subtracted from the B1+ phase of the dataset to provide a relative phase distribution that can be realistically acquired. We trained our CNN on the mid-plane 2D slice, where the antenna feeds are located and the maximum local SAR is expected. The actual SAR distribution for each set serves as ground truth for training or validation, given the input maps (i.e. water-fat images and real and imaginary B1+ distributions) (Figure 1).
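The construction of one network input from the B1+ data can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the array shapes, the random stand-in fields, the choice of channel 0 as the phase reference, and the function names `combine_b1` and `relative_phase_inputs` are all assumptions made for the example.

```python
import numpy as np

def combine_b1(b1_channels, phases):
    """Superpose per-channel complex B1+ maps driven with unit amplitudes."""
    weights = np.exp(1j * np.asarray(phases))        # uniform amplitude, phase-only drive
    driven = weights[:, None, None] * b1_channels    # shape (channels, H, W)
    # Return the combined field and the driven field of channel 0
    # (assumed here to be the phase reference).
    return driven.sum(axis=0), driven[0]

def relative_phase_inputs(b1_combined, b1_reference):
    """Subtract the reference-channel phase and split into real/imaginary maps."""
    rel = b1_combined * np.exp(-1j * np.angle(b1_reference))
    return rel.real, rel.imag

rng = np.random.default_rng(0)
# Hypothetical stand-in for simulated per-channel B1+ maps on a 4x4 slice.
b1_channels = rng.standard_normal((8, 4, 4)) + 1j * rng.standard_normal((8, 4, 4))
phases = rng.uniform(0.0, 2.0 * np.pi, size=8)       # one random phase setting

b1_combined, b1_ref = combine_b1(b1_channels, phases)
b1_real, b1_imag = relative_phase_inputs(b1_combined, b1_ref)
```

Under these assumptions the resulting maps do not change when a common phase offset is added to all channels, which is why a relative phase is something that can be measured in practice.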
Network Architectures and Cost Functions
The best network architecture and the most suitable cost function for training are still under investigation. We present results using two different CNN architectures: a U-Net and a conditional generative adversarial network (cGAN).
Voxels with high local SAR require higher accuracy. Accordingly, we trained our U-Net to minimize a weighted L1 distance $$$\mathcal{L}_{WL1}$$$ between the ground-truth and the output, with weights $$$w_i$$$ proportional to the ground-truth.
$$\mathcal{L}_{WL1}=\sum_{i=1}^{N_{\text{pixel}}}w_i\left|\text{ground-truth}_i-\text{output}_i\right|$$
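A minimal NumPy sketch of this weighted L1 distance is given below. The text only states that the weights are proportional to the ground-truth, so the normalization used here (weights summing to one) and the function name `weighted_l1` are assumptions for illustration.

```python
import numpy as np

def weighted_l1(ground_truth, output, eps=1e-12):
    """Weighted L1 distance with weights proportional to the ground-truth SAR."""
    gt = np.asarray(ground_truth, dtype=float)
    out = np.asarray(output, dtype=float)
    weights = gt / (gt.sum() + eps)   # assumed normalization: weights sum to 1
    return float(np.sum(weights * np.abs(gt - out)))

# Toy 2x2 "SAR maps": the largest error sits on the highest-SAR voxel.
gt = np.array([[0.0, 1.0], [1.0, 2.0]])
out = np.array([[1.0, 1.0], [1.0, 0.0]])
loss_wl1 = weighted_l1(gt, out)
```

In this toy case the error on the zero-SAR voxel contributes nothing, while the error on the peak voxel dominates the loss, which is the intended emphasis on high-SAR regions.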
Moreover, to reduce the peak local SAR (pSAR) underestimation, the optional use of an additional loss term $$$\mathcal{L}_{Peak}$$$ was evaluated.
$$\mathcal{L}_{\text{Peak}}=\max(0,\max(\text{ground-truth})-\max(\text{output}))$$
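The peak term is one-sided: it only penalizes predictions whose maximum falls below the ground-truth maximum. A short sketch, with the function name `peak_loss` chosen for illustration:

```python
import numpy as np

def peak_loss(ground_truth, output):
    """Penalize only underestimation of the peak local SAR."""
    return max(0.0, float(np.max(ground_truth)) - float(np.max(output)))

gt = np.array([[0.0, 1.0], [1.0, 2.0]])
under = np.array([[0.0, 1.0], [1.0, 1.5]])   # peak underestimated by 0.5
over = np.array([[0.0, 1.0], [1.0, 2.5]])    # peak overestimated by 0.5

loss_under = peak_loss(gt, under)
loss_over = peak_loss(gt, over)
```

Overestimating the peak is conservative from a safety standpoint, so it is left unpenalized by this term.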
Likewise, the cGAN objective $$$\mathcal{L}_{cGAN}$$$ was combined with $$$\mathcal{L}_{WL1}$$$, and the optional use of $$$\mathcal{L}_{Peak}$$$ was evaluated.
$$G^{*}=\arg\min_{G}\max_{D}\mathcal{L}_{\text{cGAN}}(G,D)+\lambda_{WL1}\mathcal{L}_{WL1}(G)+\lambda_{\text{Peak}}\mathcal{L}_{\text{Peak}}(G)$$
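From the generator's side, the combined objective can be sketched as a single scalar to minimize. The sketch below assumes the standard cGAN generator term $$$\log(1-D(x,G(x)))$$$, the same weight normalization as an assumption for $$$\mathcal{L}_{WL1}$$$, and placeholder values for $$$\lambda_{WL1}$$$ and $$$\lambda_{\text{Peak}}$$$; the actual weights are not given in the text.

```python
import numpy as np

def weighted_l1(gt, out, eps=1e-12):
    weights = gt / (gt.sum() + eps)           # assumed normalization
    return float(np.sum(weights * np.abs(gt - out)))

def peak_loss(gt, out):
    return max(0.0, float(np.max(gt)) - float(np.max(out)))

def generator_objective(d_fake, gt, out, lam_wl1, lam_peak, eps=1e-12):
    """Generator-side loss: adversarial term plus weighted L1 and peak terms.

    d_fake holds the discriminator's probabilities that generated maps are real.
    """
    adversarial = float(np.mean(np.log(1.0 - np.asarray(d_fake) + eps)))
    return adversarial + lam_wl1 * weighted_l1(gt, out) + lam_peak * peak_loss(gt, out)

gt = np.array([[0.0, 1.0], [1.0, 2.0]])
out = np.array([[1.0, 1.0], [1.0, 1.0]])
# Hypothetical lambda values, for illustration only.
g_loss = generator_objective(d_fake=[0.5], gt=gt, out=out, lam_wl1=100.0, lam_peak=10.0)
```

Setting `lam_peak` to zero recovers the variant without the optional peak term.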
Training and Validation
To evaluate the robustness of this approach and identify the best setting (CNN and cost function), the dataset was partitioned into 3 sub-datasets according to the models that generated the images, and a 3-fold cross-validation was performed. Afterwards, the achievable performance using the best setting was assessed by a leave-one-out cross-validation and compared to SAR assessment using virtual observation points (VOPs) for one generic model (Duke) [9] with a safety factor of 2 ($$$\text{pSAR}_{\text{Duke},2}$$$).
The SAR assessment performance was evaluated by calculating the error in the predicted pSAR with respect to the ground-truth.
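Partitioning by body model keeps all images of one subject on the same side of the split, so validation always uses anatomies the network has not seen. A minimal sketch of such a grouped split follows; the round-robin assignment of models to folds and the function name `grouped_folds` are assumptions, since the actual partition is not specified.

```python
def grouped_folds(model_ids, n_folds):
    """Yield (train_indices, val_indices) with all samples of a model in one fold."""
    models = sorted(set(model_ids))
    # Assumed assignment: models are dealt to folds in round-robin order.
    fold_of_model = {m: k % n_folds for k, m in enumerate(models)}
    for fold in range(n_folds):
        val = [i for i, m in enumerate(model_ids) if fold_of_model[m] == fold]
        train = [i for i, m in enumerate(model_ids) if fold_of_model[m] != fold]
        yield train, val

# 23 models with 250 phase settings each, i.e. 5750 samples.
model_ids = [m for m in range(23) for _ in range(250)]

three_fold = list(grouped_folds(model_ids, 3))
leave_one_out = list(grouped_folds(model_ids, 23))   # one model held out per fold
```

With one fold per model, the same function gives the leave-one-out scheme, where each model's 250 samples are validated against a network trained on the other 22 models.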
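One way to express this error is as a signed relative deviation of the predicted peak from the ground-truth peak. The text does not state whether an absolute or relative error was used, so the percentage form and the function name `psar_error_percent` below are assumptions.

```python
import numpy as np

def psar_error_percent(sar_true, sar_pred):
    """Signed relative pSAR error in percent; negative means underestimation."""
    p_true = float(np.max(sar_true))
    p_pred = float(np.max(sar_pred))
    return 100.0 * (p_pred - p_true) / p_true

sar_true = np.array([[0.5, 2.0], [1.0, 1.5]])
sar_pred = np.array([[0.4, 1.5], [1.2, 1.0]])
err = psar_error_percent(sar_true, sar_pred)   # peak 1.5 vs. 2.0
```

A signed error keeps underestimation, which is the safety-relevant case, distinguishable from overestimation.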
Each network was implemented in TensorFlow and trained in less than three hours on a GPU (NVIDIA Tesla P100-PCIe-16GB).
The relation between local SAR and MR images (Dixon/B1) can be mapped by a CNN. This approach is robust, fast and requires relatively few models for training. It allows subject-specific local SAR prediction in a few milliseconds. It outperforms the traditional approach and allows the over-conservative safety factors currently in use to be reduced (by around 73%).
In this study, 3D prediction could be enabled by using multiple 2D transverse slices (and/or sagittal/coronal). Furthermore, as local SAR reflects 3D electromagnetic scattering, we are investigating the use of 3D input data and 3D CNNs. The robustness of this approach to noise and inhomogeneities (measured B1+ map), and extension to other coil geometries, anatomical sites and field strengths, are under investigation.
Figure 1: (a) 8-channel transmit array configuration, body models present in the database and data of their respective volunteers.
(b) Dataset construction pipeline.