Dipika Sikka1,2, Noah Igra3,4, Sabrina Gjerswold-Sellec1, Cynthia Gao5, Ed Wu6, and Jia Guo7
1Department of Biomedical Engineering, Columbia University, New York, NY, United States, 2VantAI, New York, NY, United States, 3Department of Applied Mathematics, Columbia University, New York, NY, United States, 4Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel, 5Department of Computer Science, Columbia University, New York, NY, United States, 6Department of Electrical and Electronic Engineering, The University of Hong Kong, Pok Fu Lam, Hong Kong, China, 7Department of Psychiatry, Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, United States
Synopsis
While the application of deep learning in MR image analysis has gained significant popularity, using raw MR k-space data as part of deep learning analysis remains underexplored. Here we develop CU-Net, a completely complex U-Net deep learning architecture in which deep learning components and operations are applied in the complex space. CU-Net trains a U-Net with Attention and Residual components directly on k-space MR signals, as opposed to the processed spatial (real) data typically used in MRI deep learning applications. In a proof-of-concept study, the complex networks demonstrated their utility and potential superiority over their spatial counterparts.
Introduction
Despite the rapid expansion of deep learning applications in biomedicine, most deep learning architectures are designed for the real space. Studies have shown that complex numbers offer various advantages, including better generalization and less noisy retrieval from associative memory.1,2 As MR signals are intrinsically complex-valued, this motivates incorporating the complex space into neural network structures. While previous studies have extended deep learning networks to the frequency domain through spectral pooling, complex batch normalization, and complex convolution, some attempts are not fully complex in their convolution kernels, and others are limited exclusively to convolutional neural networks rather than extending to other architectures.3,4 In our study, we implement a fully-complex U-Net model with Residual and Attention components (Residual Attention U-Net). We hypothesized that a network receiving complex k-space MRI data has access to more information per instance than one receiving processed, magnitude-only input. Thus, U-Net encoding in the complex domain may improve the feature extraction and data encoding necessary for image translation. To our knowledge, these encoder-decoder networks demonstrate novel complex applications of otherwise commonly used components, including Pooling, Upsampling, and Attention. As an initial demonstration of utility, we apply the complex neural networks to a mouse-brain extraction task and compare them to their spatial (real-space) counterparts.
Methods
Data Generation
A dataset of 120 T2-weighted mouse MRI scans acquired on a Bruker 9.4T scanner was used. The k-space data were extracted from raw data using the Bruker-supplied MATLAB package PVtools.5 Spatial scans were computed as the magnitude of the inverse Fourier transform of the k-space data. Ground truth brain masks were generated from the spatial scans with an in-house deep learning-based brain extraction tool that employs a 3D Residual Attention U-Net. Spatial and k-space scans were then randomly split into train, test, and validation sets of 89, 15, and 16 scans, respectively.
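As a minimal illustration of this reconstruction step, the sketch below assumes NumPy, single-coil 2D slices, and a centered k-space layout; the function name and shift conventions are our assumptions, and Bruker's conventions may differ.

```python
import numpy as np

def kspace_to_magnitude(kspace: np.ndarray) -> np.ndarray:
    """Inverse Fourier transform a centered 2D k-space slice and take
    the magnitude to obtain a real-valued spatial image."""
    image = np.fft.ifft2(np.fft.ifftshift(kspace))  # back to image space
    return np.abs(np.fft.fftshift(image))           # magnitude, recentered

# Example on a random complex-valued "k-space" slice
k = np.random.randn(128, 128) + 1j * np.random.randn(128, 128)
mag = kspace_to_magnitude(k)  # real-valued 128x128 magnitude image
```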
Model Architecture Design
The model architecture was developed as individual modules, including complex Convolution, Batch Normalization, Pooling, Upsampling, Rectified Linear Unit (ReLU), and Sigmoid. These modules were used to build complex Attention and Residual blocks. For the complex data, real and imaginary components are stored as real numbers in separate channels, over which complex operations are simulated. The underlying theory for complex Convolution, Batch Normalization, and ReLU has been demonstrated and implemented in other frameworks in previous work.4 Like complex ReLU, complex Sigmoid is simulated by applying the operation to the real and imaginary components individually. Our complex Attention block uses the same components as its spatial counterpart, adjusted to accommodate the 2-channel complex input. Unlike conventional stride-based compression methods (e.g., max pooling), our implementation of complex Pooling exploits the local properties of data constrained to the frequency space: image edges map to high frequencies, while smooth gray-level variations map to low frequencies.6 This separation allows us to implement Pooling as a low-pass filter, conserving low-frequency data while truncating high-frequency data. Frequency-space Upsampling is the inverse of complex Pooling: zero-padding the high-frequency regions truncated during feature encoding. A fully-complex Residual Attention U-Net architecture was constructed from the blocks described above.
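The sketch below illustrates these modules in PyTorch under stated assumptions: real and imaginary parts travel as a pair of real tensors (one possible realization of the 2-channel layout), feature maps are kept fftshifted so low frequencies sit at the array center, and all module and function names are ours rather than the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ComplexConv2d(nn.Module):
    """Complex convolution via two real convolutions (Trabelsi et al.4).
    For weights W = A + iB and input x = a + ib:
    Re(Wx) = A*a - B*b, Im(Wx) = A*b + B*a."""
    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
        super().__init__()
        self.A = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding)
        self.B = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding)

    def forward(self, re, im):
        return self.A(re) - self.B(im), self.A(im) + self.B(re)

def complex_relu(re, im):
    # As in the text: ReLU applied to real and imaginary parts separately.
    return torch.relu(re), torch.relu(im)

def complex_sigmoid(re, im):
    # Sigmoid simulated the same way, component-wise.
    return torch.sigmoid(re), torch.sigmoid(im)

def complex_pool(re, im):
    """Low-pass pooling for centered frequency-space feature maps: keep
    the central (low-frequency) half of each spatial dimension and
    truncate the high-frequency border."""
    h, w = re.shape[-2:]
    rows = slice(h // 4, h // 4 + h // 2)
    cols = slice(w // 4, w // 4 + w // 2)
    return re[..., rows, cols], im[..., rows, cols]

def complex_upsample(re, im):
    """Inverse of complex_pool: zero-pad the truncated high-frequency
    border so each spatial dimension doubles."""
    h, w = re.shape[-2:]
    pad = (w // 2, w - w // 2, h // 2, h - h // 2)  # left, right, top, bottom
    return F.pad(re, pad), F.pad(im, pad)

# Round trip on a complex feature map carried as (real, imaginary) tensors
re, im = torch.randn(1, 8, 64, 64), torch.randn(1, 8, 64, 64)
re, im = ComplexConv2d(8, 16)(re, im)
re, im = complex_relu(re, im)
re, im = complex_pool(re, im)      # 64x64 -> 32x32
re, im = complex_upsample(re, im)  # 32x32 -> 64x64
```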
Training and Testing
2D complex Residual Attention U-Nets were trained at various network depths and numbers of first-layer kernel outputs (see Figure 2 for a sample architecture). To compare performance against spatial counterparts, 2D Residual Attention U-Nets in the real space were also trained at the same network depths and kernel outputs. All other initial model and training parameters were kept constant.
Results
All complex and real-space models were evaluated using DICE coefficients on their respective test sets. Figure 3 summarizes these results, demonstrating the utility of complex networks: at shallower depths, they surpassed their real counterparts across all numbers of first-layer kernel outputs. The difference was most pronounced for the 3-layer network, which showed the greatest performance increase. In contrast, complex networks performed worse at deeper architectures, with real networks showing higher DICE scores for 5- and 6-layer networks. Complex networks, however, demonstrated improved performance stability compared to the real networks.
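For reference, the DICE coefficient used for evaluation can be computed as below (a minimal sketch; the 0.5 binarization threshold for network outputs is our assumption):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray, thr: float = 0.5) -> float:
    """DICE = 2*|P intersect T| / (|P| + |T|) on binarized masks."""
    p, t = pred > thr, truth > thr
    denom = p.sum() + t.sum()
    return 2.0 * np.logical_and(p, t).sum() / denom if denom else 1.0
```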
Conclusions and Discussion
The results highlight the potential of completely complex neural networks for k-space data when using a U-Net structure. While we have used a U-Net architecture here, other structures, such as RNNs, can also be built from the complex modules. Moreover, the success of our complex Pooling, Upsampling, and Attention is also demonstrated. Comparing the complex and real networks across various network depth and kernel settings demonstrates the utility and stability of the complex networks, particularly under memory restrictions: when limited to shallow networks by memory, complex networks have the potential to surpass their spatial counterparts. For deeper networks, complex networks may show poorer, though comparable, performance. As brain extraction is a simple task, our work should be regarded as a proof-of-concept study; future studies will focus on more difficult tasks, such as MRI reconstruction, to understand the full potential of these networks with k-space data.
Author Contributions
Authors 1 and 2 are co-first authors and contributed equally.
Acknowledgements
This work was performed at the Zuckerman Mind Brain Behavior Institute MRI Platform, a shared resource and Columbia MR Research Center site.
References
1. A. Hirose and S. Yoshida, "Generalization characteristics of complex-valued feedforward neural networks in relation to signal coherence," IEEE Transactions on Neural Networks and Learning Systems, 23(4): pp. 541-551, 2012.
2. I. Danihelka, G. Wayne, B. Uria, N. Kalchbrenner, and A. Graves, "Associative long short-term memory," arXiv preprint, arXiv:1602.03032, 2016.
3. O. Rippel, J. Snoek, and R. Adams, "Spectral representations for convolutional neural networks," in Advances in Neural Information Processing Systems, pp. 2449-2457, 2015.
4. C. Trabelsi, O. Bilaniuk, Y. Zhang, D. Serdyuk, S. Subramanian, J. Santos, S. Mehri, N. Rostamzadeh, Y. Bengio, and C. Pal, "Deep Complex Networks," presented at the 31st Conference on Neural Information Processing Systems, Long Beach, CA, USA, Dec 4-9, 2017.
5. Bruker BioSpin, PVmatlab [MATLAB package], 2020.
6. R.C. Gonzalez and R.E. Woods, Digital Image Processing. New Jersey: Prentice Hall, 2002.