MR signals are complex-valued by nature. However, most current deep neural networks for MR are derived from applications dealing with real-valued images. Recent studies have proposed adapting neural networks to the complex domain to learn a better representation of the signal. In this study, multiple complex-valued neural networks (CVNNs) with trainable complex-valued activation functions are proposed and validated on an MR fingerprinting regression problem. 2D activation functions with trainable parameters are shown to suit CVNNs well and to provide significant improvement over their non-trainable counterparts.
A common approach to handling complex values in DNNs is to apply a real-valued non-linearity separately to the real and imaginary components, as in the complex ReLU (CReLU, Fig1.A). This approach may alter the phase information. The Cardioid function, defined as $$$f(z) = \frac{1}{2}(1+\cos(\angle z))z$$$, preserves the phase (Cardioid, Fig1.B). However, the Cardioid function has a fixed orientation toward the real axis, which may be unsuitable for signals and features with non-zero phases.
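The two baseline activations can be sketched with Python's standard `cmath`/`math` modules; this is a minimal illustration, not the study's implementation (the function names are ours):

```python
import cmath
import math

def crelu(z: complex) -> complex:
    """CReLU: ReLU applied separately to the real and imaginary parts.
    Zeroing one component can change the phase of the output."""
    return complex(max(z.real, 0.0), max(z.imag, 0.0))

def cardioid(z: complex) -> complex:
    """Cardioid: f(z) = 0.5 * (1 + cos(angle z)) * z.
    The scaling factor is real and in [0, 1], so the phase of z
    is preserved; inputs on the positive real axis pass unchanged."""
    return 0.5 * (1.0 + math.cos(cmath.phase(z))) * z
```

Note how `crelu(1 - 2j)` returns `1 + 0j`, rotating the input onto the real axis, while `cardioid` only attenuates the magnitude as the phase moves away from zero.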
Our first proposition is a trainable Cardioid function $$$f(z) = \frac{1}{2}(1+\cos(\angle z +\angle b))z$$$ (Rotated Cardioid, Fig1.C) that can orient the Cardioid differently in the complex plane for each neuron by learning the bias term $$$\angle b$$$ during training.
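A minimal sketch of the rotated Cardioid, again using the standard `cmath` module; the argument `angle_b` stands in for the learned per-neuron bias term $$$\angle b$$$:

```python
import cmath
import math

def rotated_cardioid(z: complex, angle_b: float) -> complex:
    """Rotated Cardioid: f(z) = 0.5 * (1 + cos(angle z + angle b)) * z.
    angle_b is a trainable per-neuron parameter that rotates the
    pass-band of the Cardioid away from the real axis. The scaling
    factor stays real and non-negative, so the phase of z is preserved."""
    return 0.5 * (1.0 + math.cos(cmath.phase(z) + angle_b)) * z
```

With `angle_b = 0` this reduces to the standard Cardioid; with `angle_b = -pi/4`, for example, inputs at phase $$$\pi/4$$$ pass through unattenuated.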
Our second proposition is to use a Gaussian kernel activation function (KAF) to learn the non-linearities of the DNN. The base version5 is given as $$$f(x) = \sum_{n=1}^D\alpha_nk_n(x)$$$, where $$$\alpha_n$$$ are the trainable mixing coefficients, $$$k_n$$$ is a 1D Gaussian kernel defined as $$$k_n(x) = e^{-\gamma(x-d_n)^2}$$$, $$$D$$$ is the number of kernels, and $$$d_n$$$ are the positions of the evenly distributed kernels. An improved version with bivariate Gaussian kernels (Bivariate KAF, Fig1.D) is proposed, adopting 2D mixing coefficients: $$$f(z) = \sum_{n=1}^{D_1}\sum_{m=1}^{D_2}\alpha_{n,m}k_{n,m}(z)$$$. The 2D kernel is designed as $$$k_{n,m}(z) = e^{-(z-d_{n,m})^T\Sigma^{-1}(z-d_{n,m})}$$$, with $$$\Sigma$$$ the kernel covariance. Another variation uses a kernel in polar representation (Polar KAF, Fig1.E), formally described as $$$f(z)=\sum_{n=1}^{D_1}\alpha_nk_n(|z|)e^{ig_n(z)}$$$, where $$$g_n(z)=\sum_{m=1}^{D_2}\alpha_{n,m}k_{n,m}(\angle z)$$$, $$$k_n(x) = e^{-\gamma_n(x-d_n)^2}$$$, and $$$k_{n,m}(x) = e^{-\gamma_m(x-d_{n,m})^2}$$$. Complex-valued DNN components such as fully-connected layers and batch normalization were also implemented6.
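The 1D and bivariate KAF forward passes can be sketched in plain Python (function names and the flat-list parameterization are ours; a real implementation would vectorize these and backpropagate through the mixing coefficients):

```python
import math

def kaf_1d(x, alphas, centers, gamma):
    """1D KAF: f(x) = sum_n alpha_n * exp(-gamma * (x - d_n)^2).
    alphas are the trainable mixing coefficients; the centers d_n
    lie on a fixed, evenly spaced grid."""
    return sum(a * math.exp(-gamma * (x - d) ** 2) for a, d in zip(alphas, centers))

def bivariate_kaf(z, alphas, centers, inv_sigma):
    """Bivariate KAF on the complex plane:
    f(z) = sum alpha * exp(-(z - d)^T Sigma^{-1} (z - d)),
    where z and the grid centers d are complex numbers read as
    2-vectors (re, im) and inv_sigma is the 2x2 inverse covariance."""
    (a, b), (c, d_) = inv_sigma
    out = 0.0
    for alpha, center in zip(alphas, centers):
        dx, dy = z.real - center.real, z.imag - center.imag
        out += alpha * math.exp(-(a * dx * dx + (b + c) * dx * dy + d_ * dy * dy))
    return out
```

In the bivariate case the mixing coefficients `alphas` may themselves be complex, so the output stays in the complex domain.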
Experiment: All networks were tested on MR fingerprinting, where the DNNs regressed complex-valued MRF signals to a vector of tissue parameters. 100k TrueFISP fingerprint signals8 were generated for training (90%) and validation (10%), covering a wide range of T1, T2 and off-resonance values. BrainWeb data9 was used for testing. Root mean square error (RMSE) and relative error percentage (err%) were measured. The DNNs have three hidden fully-connected layers of 256, 256 and 128 neurons, respectively; this structure was heuristically optimized for the RVNN with ReLU. Both RVNNs (ReLU and KAF) and CVNNs (CReLU, Cardioid, rotated Cardioid, bivariate KAF and polar KAF) were tested. Training used the MSE loss and the Adam algorithm.
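For reference, the two metrics can be computed as below; the abstract does not spell out the exact err% formula, so the per-sample relative-error definition here is one plausible reading:

```python
import math

def rmse(pred, target):
    """Root mean square error between predicted and reference tissue parameters."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))

def rel_err_pct(pred, target):
    """Mean relative error percentage, 100 * |pred - target| / |target|
    averaged over samples (assumed definition, not stated in the abstract)."""
    return 100.0 * sum(abs(p - t) / abs(t) for p, t in zip(pred, target)) / len(pred)
```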
In this study, we focused on the non-linearity of the CVNN. Split real activation functions are less expressive in the 2D complex domain. On complex-valued TrueFISP fingerprints with varying phases, the proposed trainable 2D activation functions suited the CVNN well and provided significant improvement over the non-trainable versions. They could also serve as research tools to observe the complex non-linearities that emerge during training and inspire the design of simpler complex activation functions. CVNNs could be useful in a broad range of phase-sensitive MR applications and, more generally, in image reconstruction from k-space10.
1. Mandic DP, Su Lee Goh V. Complex valued nonlinear adaptive filters: noncircularity, widely linear and neural models. Wiley. 2009.
2. Savitha R, Suresh S, Sundararajan N. A fully complex-valued radial basis function network and its learning algorithm. Int J Neural Syst. 2009;19(4):253-67.
3. Kim T, Adali T. Fully complex multi-layer perceptron network for nonlinear signal processing. JVLSI. 2002;32:29.
4. Virtue P, Yu SX. Lustig M. Better than real: complex-valued neural nets for MRI fingerprinting. ICIP. 2017;3953-57.
5. Scardapane S, Van Vaerenbergh S, Hussain A, Uncini A. Complex-valued neural networks with non-parametric activation functions. arXiv. 2018;1802.08026 [cs.NE].
6. Trabelsi C, Bilaniuk O, Zhang Y, Serdyuk D, Subramanian S, Santos JF, Mehri S, Rostamzadeh N, Bengio Y, Pal CJ. Deep complex networks. arXiv. 2017;1705.09792 [cs.NE].
7. Lee D, Yoo J, Tak S, Ye JC. Deep residual learning for accelerated MRI using magnitude and phase networks. arXiv. 2018;1804.00432 [cs.CV].
8. Ma D, Gulani V, Seiberlich N, Liu K, Sunshine JL, Duerk JL, Griswold MA. Magnetic resonance fingerprinting. Nature. 2013;495(7440):187-92.
9. Cocosco CA, Kollokian V, Kwan RK-S, Pike GB, Evans AC. BrainWeb: online interface to a 3D MRI simulated brain database. NeuroImage. 1997;5(4), part 2/4, S425.
10. Moreau A, Gbelidji F, Mailhé B, Arberet S, Chen X, Nadar MS. Deep transform networks for scalable learning of MR reconstruction. ISMRM workshop on machine learning Part II. 2018.