0469

Exploring Complex-Valued Neural Networks with Trainable Activation Functions for Magnetic Resonance Imaging
Guillaume Daval-Frerot1,2, Xiao Chen1, Simon Arberet1, Boris Mailhé1, Peter Speier3, Mathias Nittka3, Heiko Meyer3, and Mariappan S Nadar1

1Digital Technology and Innovation, Siemens Healthineers, Princeton, NJ, United States, 2EPITA, Le Kremlin-Bicêtre, France, 3Magnetic Resonance, Siemens Healthcare GmbH, Erlangen, Germany

Synopsis

MR signals are complex-valued by nature. However, most current deep neural networks for MR are derived from applications dealing with real-valued images. Recent studies have adapted neural networks to the complex domain to learn a better representation of the signal. In this study, several complex-valued neural networks (CVNNs) with trainable complex-valued activation functions are proposed and validated on an MR fingerprinting regression problem. 2D activation functions with trainable parameters are shown to suit CVNNs well and to provide a significant improvement over the non-trainable versions.

INTRODUCTION

MR signals are complex-valued, and the signal phase carries crucial information. However, most current deep neural network (DNN) architectures for MR are derived from applications dealing with real-valued images. Earlier1-3 and more recent4-7 studies have shown that complex-valued DNNs (CVNNs) can learn better representations than real-valued DNNs (RVNNs). A key topic in CVNN design is the choice of complex non-linearities (activation functions) that fully exploit complex-valued features. In this study, several CVNNs with trainable complex-valued non-linearities are proposed and validated on an MR fingerprinting regression problem.

METHODS

A common approach to handling complex values in DNNs is to apply real-valued non-linearities separately to the real and imaginary components, as in the complex ReLU (CReLU, Fig1.A). This approach may alter the phase information, because a negative real or imaginary component is set to zero independently of the other. The Cardioid function, defined as $$$f(z) = \frac{1}{2}(1+\cos(\angle z))z$$$, preserves the phase because it only scales the input by a non-negative real factor (Cardioid, Fig1.B). However, the Cardioid function has a fixed orientation toward the real axis, which may be unsuitable for signals and features that have non-zero phases.
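
The difference between the two fixed non-linearities can be checked numerically. The following is a minimal NumPy sketch written for illustration, not the implementation used in this work; the function names `crelu` and `cardioid` are chosen here for convenience.

```python
import numpy as np

def crelu(z):
    """Split ReLU: ReLU applied separately to the real and imaginary parts."""
    z = np.asarray(z, dtype=complex)
    return np.maximum(z.real, 0.0) + 1j * np.maximum(z.imag, 0.0)

def cardioid(z):
    """Cardioid: f(z) = 0.5 * (1 + cos(angle(z))) * z."""
    z = np.asarray(z, dtype=complex)
    return 0.5 * (1.0 + np.cos(np.angle(z))) * z

# One sample per quadrant of the complex plane.
z = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j])
print("input phase   :", np.angle(z))
print("CReLU phase   :", np.angle(crelu(z)))     # phase changes outside the first quadrant
print("Cardioid phase:", np.angle(cardioid(z)))  # phase is kept, only the magnitude is scaled
```

Inputs on the negative real axis are mapped to zero by the Cardioid, which is the fixed orientation discussed above.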

Our first proposition is a trainable Cardioid function $$$f(z) = \frac{1}{2}(1+\cos(\angle z +\angle b))z$$$ (Rotated Cardioid, Fig1.C) that can orient the Cardioid function differently in the 2D complex plane for each neuron by learning the bias term $$$\angle b$$$ through training.
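
A minimal NumPy sketch of the rotated Cardioid is given below, assuming one bias angle per neuron stored along the last axis; the name `rotated_cardioid` and the array layout are illustrative assumptions, and the training of the bias is not shown.

```python
import numpy as np

def rotated_cardioid(z, angle_b):
    """Rotated Cardioid: f(z) = 0.5 * (1 + cos(angle(z) + angle_b)) * z.

    z       : complex array of shape (..., n_neurons)
    angle_b : real bias angle(s), broadcast against the last axis of z
    """
    z = np.asarray(z, dtype=complex)
    return 0.5 * (1.0 + np.cos(np.angle(z) + angle_b)) * z

# Hypothetical batch of 2 samples for a layer with 3 neurons.
z = np.array([[1j, 1 + 0j, -1j],
              [1 + 1j, -1 + 0j, 2j]])
angle_b = np.array([-np.pi / 2, 0.0, np.pi / 2])  # one bias per neuron
print(rotated_cardioid(z, angle_b))
```

With this sign convention, an input passes unchanged when its phase equals $$$-\angle b$$$, and is suppressed when its phase is opposite to that direction.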

Our second proposition is to use a Gaussian kernel activation function (KAF) to learn the non-linearities of the DNN. The base version5 is given as $$$f(x) = \sum_{n=1}^D\alpha_nk_n(x)$$$, where $$$\alpha_n$$$ are the trainable mixing coefficients, $$$k_n$$$ is the 1D Gaussian kernel defined as $$$k_n(x) = e^{-\gamma(x-d_n)^2}$$$, $$$D$$$ is the number of kernels, and $$$d_n$$$ are the positions of the evenly distributed kernels. An improved version with a bivariate Gaussian kernel (Bivariate KAF, Fig1.D) is proposed, which uses 2D mixing coefficients and is given as $$$f(z) = \sum_{n=1}^{D_1}\sum_{m=1}^{D_2}\alpha_{n,m}k_{n,m}(z)$$$. The 2D kernel was designed as $$$k_{n,m}(z) = e^{-(z-d_{n,m})^T\Sigma^{-1}(z-d_{n,m})}$$$, with $$$\Sigma$$$ the kernel covariance and $$$z$$$ and $$$d_{n,m}$$$ treated as 2D vectors of their real and imaginary parts. Another variation uses kernels in polar representation (Polar KAF, Fig1.E). This is formally described as $$$f(z)=\sum_{n=1}^{D_1}\alpha_nk_n(|z|)e^{ig_n(z)}$$$ where $$$g_n(z)=\sum_{m=1}^{D_2}\alpha_{n,m}k_{n,m}(\angle z)$$$, and $$$k_n(x) = e^{-\gamma_n(x-d_n)^2}$$$, $$$k_{n,m}(x) = e^{-\gamma_m(x-d_{n,m})^2}$$$. Complex-valued DNN components such as fully connected layers and batch normalization were also implemented6.
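
The three KAF variants can be sketched in NumPy as forward passes for a single neuron. This is an illustrative sketch under stated assumptions rather than the implementation used here: the function names, argument layout and grid sizes are hypothetical, the bivariate mixing coefficients are assumed to be complex so that the output is complex, and the training of the coefficients is omitted.

```python
import numpy as np

def kaf_1d(x, alpha, d, gamma):
    """1D KAF: f(x) = sum_n alpha_n * exp(-gamma * (x - d_n)^2)."""
    x = np.asarray(x, dtype=float)[..., None]      # (..., 1)
    k = np.exp(-gamma * (x - d) ** 2)              # (..., D)
    return k @ alpha

def bivariate_kaf(z, alpha, d, sigma_inv):
    """Bivariate KAF with kernel centres d (complex, D1 x D2) in the complex plane."""
    z = np.asarray(z, dtype=complex)
    diff = z[..., None, None] - d                  # (..., D1, D2)
    v = np.stack([diff.real, diff.imag], axis=-1)  # real/imag parts as 2-vectors
    q = np.einsum("...i,ij,...j->...", v, sigma_inv, v)
    return np.sum(alpha * np.exp(-q), axis=(-2, -1))

def polar_kaf(z, alpha_mag, d_mag, gamma_mag, alpha_ph, d_ph, gamma_ph):
    """Polar KAF: f(z) = sum_n alpha_n k_n(|z|) exp(i g_n(z))."""
    zz = np.asarray(z, dtype=complex)[..., None]   # (..., 1)
    r = np.abs(zz)                                 # (..., 1)
    theta = np.angle(zz)[..., None]                # (..., 1, 1)
    k_mag = np.exp(-gamma_mag * (r - d_mag) ** 2)  # (..., D1)
    k_ph = np.exp(-gamma_ph * (theta - d_ph) ** 2) # (..., D1, D2)
    g = np.sum(alpha_ph * k_ph, axis=-1)           # (..., D1), learnt phase response
    return np.sum(alpha_mag * k_mag * np.exp(1j * g), axis=-1)

# Hypothetical example: a 4 x 4 grid of kernel centres on [-1, 1] x [-1, 1].
rng = np.random.default_rng(0)
grid = np.linspace(-1.0, 1.0, 4)
centres = grid[:, None] + 1j * grid[None, :]
alpha = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
z = np.array([0.3 - 0.2j, -0.8 + 0.5j])
print(bivariate_kaf(z, alpha, centres, np.eye(2)))
```

The polar variant separates the magnitude response (through $$$k_n(|z|)$$$) from the phase response (through $$$g_n$$$), whereas the bivariate variant mixes both on a Cartesian grid.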

Experiment: All networks were tested using MR fingerprinting, where the DNNs regressed a complex-valued MRF signal to a vector of tissue parameters. 100k TrueFISP fingerprint signals8 were generated for training (90%) and validation (10%) with a wide range of T1s, T2s and off-resonances. Brainweb data9 was used for testing. Root mean square error (RMSE) and relative error percentage (err%) were measured. The DNNs had three hidden fully-connected layers of 256, 256 and 128 neurons, respectively. The NN structure was heuristically optimized for the RVNN with ReLU. Both RVNNs (ReLU and KAF) and CVNNs (CReLU, Cardioid, rotated Cardioid, bivariate KAF and polar KAF) were tested. Training used the MSE loss and the Adam optimizer.
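
The two error measures can be sketched as follows. The exact definition of err% is not stated in this abstract, so the mean absolute relative error used below is an assumption, as are the function names and the `eps` guard against division by zero.

```python
import numpy as np

def rmse(pred, true):
    """Root mean square error between estimated and reference parameter maps."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.sqrt(np.mean((pred - true) ** 2)))

def relative_error_percent(pred, true, eps=1e-12):
    """Mean absolute relative error in percent (assumed definition of err%)."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(100.0 * np.mean(np.abs(pred - true) / (np.abs(true) + eps)))

# Hypothetical T1 values in ms, for illustration only.
t1_true = np.array([800.0, 1200.0, 4000.0])
t1_pred = np.array([820.0, 1150.0, 3900.0])
print(rmse(t1_pred, t1_true), relative_error_percent(t1_pred, t1_true))
```

In practice, background voxels with a reference value of zero would need to be masked out before computing the relative error.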

RESULTS

The learnt $$$\angle b$$$ of the rotated Cardioid is shown in Fig2.A. The learnt $$$\alpha_n$$$ of the 1D Gaussian KAF is shown in Fig2.B. Exemplary learnt bivariate KAFs and learnt polar KAFs are shown in Fig3. Structures can be observed from the learnt parameters and the KAFs. Both the bivariate and polar KAF show little variation of the output in the regions with a strong output magnitude, i.e. these non-linearities learnt to normalize the phase of their input. Fig4 shows the error maps of the T1, T2 estimation from all the tested NNs. The RMSE and err% of the Brainweb test are summarized in Table1. KAF methods performed better than the others, and polar KAF achieved the best scores.

DISCUSSIONS AND CONCLUSIONS

In this study, we have focused on the non-linearity part of the CVNN. Split real-valued activation functions are less expressive in the 2D complex domain. Using complex-valued TrueFISP fingerprints with varying phases, 2D activation functions with trainable parameters have been shown to suit the CVNN well and to provide a significant improvement over the non-trainable versions. They could also be useful as research tools to observe the complex non-linearities that emerge during training and to inspire the design of simpler complex activation functions. More generally, CVNNs could be used in a broad range of phase-sensitive MR applications, as well as in image reconstruction from k-space10.

DISCLAIMER

This feature is based on research, and is not commercially available. Due to regulatory reasons its future availability cannot be guaranteed.

References

1. Mandic DP, Su Lee Goh V. Complex valued nonlinear adaptive filters: noncircularity, widely linear and neural models. Wiley. 2009.

2. Savitha R, Suresh S, Sundararajan N. A fully complex-valued radial basis function network and its learning algorithm. Int J Neural Syst. 2009;19(4):253-67.

3. Kim T, Adali T. Fully complex multi-layer perceptron network for nonlinear signal processing. JVLSI. 2002;32:29.

4. Virtue P, Yu SX, Lustig M. Better than real: complex-valued neural nets for MRI fingerprinting. ICIP. 2017;3953-57.

5. Scardapane S, Van Vaerenbergh S, Hussain A, Uncini A. Complex-valued Neural Networks with Non-parametric Activation Functions. arXiv. 2018;1802.08026 [cs.NE]

6. Trabelsi C, Bilaniuk O, Zhang Y, Serdyuk D, Subramanian S, Santos JF, Mehri S, Rostamzadeh N, Bengio Y, Pal CJ. Deep complex networks. arXiv. 2017;1705.09792 [cs.NE]

7. Lee D, Yoo J, Tak S, Ye JC. Deep Residual Learning for Accelerated MRI using Magnitude and Phase Networks. arXiv. 2018;1804.00432 [cs.CV]

8. Ma D, Gulani V, Seiberlich N, Liu K, Sunshine JL, Duerk JL, Griswold MA. Magnetic Resonance Fingerprinting. Nature. 2013;495(7440):187-92

9. Cocosco CA, Kollokian V, Kwan R K-S, Pike GB, Evans AC. BrainWeb: online interface to a 3D MRI simulated brain database. NeuroImage. 1997;5(4), part2/4, S425

10. Moreau A, Gbelidji F, Mailhé B, Arberet S, Chen X, Nadar MS. Deep transform networks for scalable learning of MR reconstruction. ISMRM workshop on machine learning Part II. 2018.

Figures

Figure 1: Example non-learnable activation functions (A,B) and learnable activation functions (C-E). A: CReLU. B: Cardioid. C: Rotated Cardioid. D: Bivariate KAF. E: Polar KAF.

Figure 2: Distributions of learnt activation function parameters. A: distribution of learnt bias terms in the rotated Cardioid activation functions from the three hidden layers. B: distribution of learnt mixing coefficients in the 1D KAF from the three hidden layers. The number of kernels was determined empirically for the 1D KAF as a grid of 20 elements.

Figure 3: Example learnt 2D activation functions. The 2D activation functions are shown in real and imaginary 2D grids. Magnitude |f(z)| and phase ∠f(z) responses are shown in the first and second row, respectively. A: Bivariate KAF. B: Polar KAF. The number of kernels was determined empirically as a grid of 10 by 10 for the bivariate kernel, and a magnitude grid of 20 and a phase grid of 20 by 1 for the polar kernel.

Figure 4: Error maps of T1 and T2 estimated from the tested NNs with different activation functions. NNs with trainable activation functions are rotated Cardioid, KAF, bivariate KAF and polar KAF. Complex-valued NNs are CReLU, Cardioid, rotated Cardioid, bivariate KAF and polar KAF.

Table 1: RMSE and Err% measured on maps of T1 and T2 estimated with different activation functions.

Proc. Intl. Soc. Mag. Reson. Med. 27 (2019)