
CU-Net: A Completely Complex U-Net for MR k-space Signal Processing
Dipika Sikka1,2, Noah Igra3,4, Sabrina Gjerswold-Sellec1, Cynthia Gao5, Ed Wu6, and Jia Guo7
1Department of Biomedical Engineering, Columbia University, New York, NY, United States, 2VantAI, New York, NY, United States, 3Department of Applied Mathematics, Columbia University, New York, NY, United States, 4Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel, 5Department of Computer Science, Columbia University, New York, NY, United States, 6Department of Electrical and Electronic Engineering, The University of Hong Kong, Pok Fu Lam, Hong Kong, China, 7Department of Psychiatry, Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, United States

Synopsis

While the application of deep learning in MR image analysis has gained significant popularity, the use of raw MR k-space data in deep learning analysis remains underexplored. Here we develop CU-Net, a completely complex U-Net deep learning architecture in which deep learning components and operations are applied in the complex space. CU-Net trains a U-Net with Attention and Residual components directly on k-space MR signals, as opposed to the processed spatial (real) data typically seen in MRI deep learning applications. As part of a proof-of-concept study, the complex networks demonstrated their utility and potential superiority over their spatial counterparts.

Introduction

Despite the rapid expansion of deep learning applications in biomedicine, most deep learning architectures are designed for real-valued data. Studies have shown that complex numbers offer various advantages, including better generalization and less noisy retrieval from associative memory.1,2 Because MR signals are intrinsically complex-valued, this motivates incorporating the complex space into neural network structures. While previous studies have extended deep learning networks to the frequency domain through spectral pooling, complex batch normalization, and complex convolution, some attempts are not fully complex in their convolution kernels, and others are limited exclusively to convolutional neural networks rather than extending to other architectures.3,4 In our study, we implement a fully-complex U-Net model with Residual and Attention components (Residual Attention U-Net). We hypothesized that a network receiving complex k-space MRI data has access to more information per instance than one receiving processed magnitude-only input; U-Net encoding in the complex domain may therefore improve the feature extraction and data encoding needed for image translation. To our knowledge, these encoder-decoder networks demonstrate novel complex implementations of otherwise commonly used components, including Pooling, Upsampling, and Attention. As an initial demonstration of utility, we apply the complex neural networks to a mouse-brain extraction task and compare them to their spatial (real space) counterparts.

Methods

Data Generation
A dataset of 120 T2-weighted mouse MRI scans acquired on a Bruker 9.4T scanner was used. The k-space data were extracted from the raw data using the Bruker-supplied MATLAB package PVtools.5 Spatial scans were computed as the magnitude of the inverse Fourier transform of the k-space data. Ground truth brain masks were generated from the spatial scans with an in-house deep learning-based brain extraction tool that employs a 3D Residual Attention U-Net. Spatial and k-space scans were then randomly split into train, test, and validation sets of 89, 15, and 16 scans, respectively.
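As a minimal sketch of this reconstruction step (in NumPy; the function name and centering convention are assumptions on our part, not details of the released pipeline):

    import numpy as np

    def kspace_to_magnitude(kspace: np.ndarray) -> np.ndarray:
        """Reconstruct a magnitude image from a 2D complex k-space array.

        Assumes the DC component sits at the center of the array, as is
        conventional for raw MR k-space data.
        """
        # Undo the centering, apply the inverse 2D FFT, then re-center the image.
        image = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(kspace)))
        return np.abs(image)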
Model Architecture Design
The model architecture was developed as individual modules, including complex Convolution, Batch Normalization, Pooling, Upsampling, Rectified Linear Unit (ReLU), and Sigmoid; these modules were then used to build complex Attention and Residual blocks. For the complex data, real and imaginary components are stored as real numbers in separate channels, over which complex operations are simulated. The underlying theory for complex Convolution, Batch Normalization, and ReLU has been demonstrated and implemented in other frameworks in previous work.4 Like complex ReLU, complex Sigmoid is simulated by applying the operation to the real and imaginary components individually. Our complex Attention block reuses the components of its spatial counterpart, adjusted to accommodate the 2-channel complex input. Unlike conventional stride-based compression methods (e.g., max pooling), our implementation of complex Pooling exploits the local properties of data constrained to the frequency space: sharp edges of the spatial image are represented by high frequencies, while smooth gray-level variation is represented by low frequencies.6 This separation allows us to implement Pooling that functions like a low-pass filter: low-frequency data is conserved while high-frequency data is truncated. Frequency-space Upsampling is the inverse of complex Pooling: zero-padding the high-frequency region truncated during feature encoding. A fully-complex Residual Attention U-Net architecture was constructed from the blocks described above.
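A minimal PyTorch sketch of these simulated complex modules follows. Class and function names are illustrative; this reflects our reading of the two-channel formulation (ref. 4) and the frequency-domain pooling scheme described above, not released code:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ComplexConv2d(nn.Module):
        """Complex convolution simulated with two real-valued kernels.

        For kernel W = A + iB and input h = x + iy:
            W * h = (A*x - B*y) + i(B*x + A*y)
        Real and imaginary parts are carried as separate tensors.
        """
        def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
            super().__init__()
            self.conv_r = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding)
            self.conv_i = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding)

        def forward(self, x, y):
            return (self.conv_r(x) - self.conv_i(y),
                    self.conv_i(x) + self.conv_r(y))

    def complex_relu(x, y):
        # Applied to the real and imaginary components individually.
        return F.relu(x), F.relu(y)

    def complex_sigmoid(x, y):
        # Simulated the same way as complex ReLU.
        return torch.sigmoid(x), torch.sigmoid(y)

    def complex_pool(x, y, factor=2):
        """Frequency-domain pooling: keep the central low-frequency block.

        Assumes the DC component is at the center of the feature map, so
        cropping the center acts as a low-pass filter.
        """
        _, _, h, w = x.shape
        h2, w2 = h // factor, w // factor
        top, left = (h - h2) // 2, (w - w2) // 2
        crop = lambda t: t[:, :, top:top + h2, left:left + w2]
        return crop(x), crop(y)

    def complex_upsample(x, y, factor=2):
        # Inverse of complex_pool: zero-pad the truncated high frequencies.
        _, _, h, w = x.shape
        ph, pw = (h * factor - h) // 2, (w * factor - w) // 2
        pad = (pw, pw, ph, ph)  # (left, right, top, bottom)
        return F.pad(x, pad), F.pad(y, pad)

The complex Attention and Residual blocks can then be assembled from these same primitives, in the same arrangement as their spatial counterparts.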
Training and Testing
2D complex Residual Attention U-Nets were trained at various network depths and numbers of first-layer kernel outputs (see Figure 2 for a sample architecture). To compare performance against their spatial counterparts, 2D Residual Attention U-Nets in the real space were also trained at the same network depths and kernel outputs. All other initial model and training parameters were kept constant.

Results

All complex and real space models were evaluated using DICE coefficients on their respective test sets. Figure 3 summarizes these results, demonstrating the utility of the complex networks: at shallower depths, they surpassed their real counterparts across all numbers of first-layer kernel outputs, with the 3-layer network showing the greatest performance increase. In contrast, the complex networks performed worse at greater depths, where the real networks achieved higher DICE scores for the 5- and 6-layer networks. The complex networks, however, demonstrated improved performance stability compared to the real networks.
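For reference, the DICE coefficient is the standard overlap score DICE = 2|A ∩ B| / (|A| + |B|) between predicted and ground truth masks. A minimal sketch for binary masks (the 0.5 threshold is our assumption):

    import numpy as np

    def dice_coefficient(pred: np.ndarray, truth: np.ndarray, thr: float = 0.5) -> float:
        """DICE = 2|A n B| / (|A| + |B|) for a predicted and a ground truth mask."""
        a, b = pred > thr, truth > thr
        denom = a.sum() + b.sum()
        # Two empty masks overlap perfectly by convention.
        return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0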

Conclusions and Discussion

The results highlight the potential of completely complex neural networks for k-space data when using a U-Net structure. While we have used a U-Net architecture here, other structures, such as RNNs, can also be built from the complex modules. The results further demonstrate the viability of our complex Pooling, Upsampling, and Attention implementations. Comparing the complex and real networks across various network depth and kernel settings demonstrates the utility and stability of the complex networks, particularly under memory restrictions: when limited to shallow networks by memory, complex networks have the potential to surpass their spatial counterparts, while for deeper networks they may show poorer, though comparable, performance. As brain extraction is a simple task, our work should be regarded as a proof-of-concept study; future studies will focus on more difficult tasks, such as MRI reconstruction, to understand the full potential of these networks with k-space data.

Author Contributions

Authors 1 and 2 are co-first authors and contributed equally.

Acknowledgements

This work was performed at the Zuckerman Mind Brain Behavior Institute MRI Platform, a shared resource and Columbia MR Research Center site.

References

1. A. Hirose and S. Yoshida, "Generalization characteristics of complex-valued feedforward neural networks in relation to signal coherence," IEEE Transactions on Neural Networks and Learning Systems, 23(4): pp. 541–551, 2012.

2. I. Danihelka, G. Wayne, B. Uria, N. Kalchbrenner, and A. Graves, “Associative long short-term memory,” arXiv preprint, arXiv:1602.03032, 2016.

3. O. Rippel, J. Snoek, and R. Adams, "Spectral representations for convolutional neural networks," in Advances in Neural Information Processing Systems, pp. 2449–2457, 2015.

4. C. Trabelsi, O. Bilaniuk, Y. Zhang, D. Serdyuk, S. Subramanian, J. Santos, S. Mehri, N. Rostamzadeh, Y. Bengio and C. Pal, “Deep Complex Networks,” presented at the 31st Conference on Neural Information Processing Systems, Long Beach, CA, USA. Dec 4-9, 2017.

5. Bruker BioSpin, PVmatlab [MATLAB package], 2020.

6. R.C. Gonzalez and R.E. Woods. Digital Image Processing. New Jersey: Prentice Hall, 2002.

Figures

Figure 1. Complex domain pipeline for mouse-brain extraction. Raw data are used to generate the real and imaginary components of the k-space data, which were used to train the complex Residual Attention U-Net. Model output was compared to ground truth mouse-brain masks.

Figure 2. Sample complex Residual Attention U-Net architecture. This network, one of the complex networks trained for the mouse-brain extraction task, consists of 4 encoding layers and 4 decoding layers. The spatial dimensions decrease by a factor of 2 and the channel dimension increases by a factor of 2 as data propagates through the encoding layers, with the reverse along the decoding layers. A similar structure is used for the 3-, 5-, and 6-layer networks.

Figure 3. DICE comparisons between the complex and real space networks. a. DICE coefficients across networks with a constant number of layers (4) and varying numbers of first-layer kernel outputs (4, 8, and 16), as well as with a constant number of first-layer kernel outputs (8) and varying numbers of layers (3, 5, and 6). b. Mean DICE scores and standard errors for the complex and real space networks.

Figure 4. Network complex operations and modules. These were used within the complex Residual Attention U-Net architecture and are listed with their corresponding references.

Proc. Intl. Soc. Mag. Reson. Med. 29 (2021) 2618