We propose DCM-RNN, a new model for effective connectivity estimation from fMRI data.
Dynamic Causal Modelling (DCM)1 is a highly nonlinear generative model used to infer the causal architecture of coupled dynamical systems in the brain, namely effective connectivity2, from functional MRI (fMRI) data. It is considered both the most biologically plausible and the most technically advanced fMRI modeling method3,4. Its parameters are estimated primarily with variational methods5–9.
We propose to cast DCM as a Generalized Recurrent Neural Network (GRNN), which we call DCM-RNN, and to use backpropagation-based methods to estimate the causal architecture. This approach has the following potential advantages:
1. DCM-RNN is a more flexible framework. One can pursue model parameter sparsity and data fidelity simultaneously by specifying an appropriate loss function for network training, whereas these objectives must be pursued separately in traditional DCM10,11. It is also easier to add further biophysical constraints in DCM-RNN, such as the sigmoid nonlinearity suggested previously12,13.
2. DCM-RNN can leverage efficient parameter estimation methods developed for RNNs, e.g. Truncated Backpropagation Through Time (TBPTT)14, which differ significantly from all existing methods for DCM. This circumvents some limitations of variational methods: DCM-RNN does not rely on the non-biophysically-motivated Gaussian assumption about the target parameters5–8, and it optimizes its loss function directly rather than a bound on it, as variational methods do.
3. DCM-RNN is biophysically meaningful and compatible with other neural networks (NNs). Deep learning offers exciting opportunities for medical applications15. Previous works16,17 have applied generic RNNs to understand brain functional responses in complex tasks such as movie watching; however, generic RNNs lack biophysical interpretability. DCM-RNN can be used instead to extract deeper biophysical insights.
An overview of DCM is shown in Fig. 1 and the notation is summarized in TABLE I. DCM is a continuous-time model, so the first step is to discretize it for discrete-time signals. Take the neural activity state $$$\mathbf{x}_t$$$ as an example. Using the approximation
$$\dot{\mathbf{x}}_t \approx \frac{\mathbf{x}_{t+1}-\mathbf{x}_t}{\Delta t}$$
where $$$\Delta t$$$ is the time interval between adjacent time points, the neural state equation becomes
$$\mathbf{x}_{t+1} \approx (\Delta t\, A + I)\,\mathbf{x}_t + \sum_{j=1}^{m} \Delta t\, u_t^{(j)} B^{(j)} \mathbf{x}_t + \Delta t\, C\,\mathbf{u}_t$$
The same trick can be applied to all the differential equations in DCM; a minimal sketch of the neural update follows.
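As a concrete illustration, here is a minimal Python sketch of one step of the discretized neural equation above. The function name and array layouts are our own choices for illustration, not part of the original DCM formulation:

```python
import numpy as np

def neural_state_step(x, u, A, B, C, dt):
    """One forward-Euler step of the discretized DCM neural equation.

    x : (N_n,) neural state at time t
    u : (m,) external stimuli at time t
    A : (N_n, N_n) fixed connectivity matrix
    B : list of m modulatory matrices, each (N_n, N_n)
    C : (N_n, m) driving-input matrix
    dt: discretization interval in seconds
    """
    n = x.shape[0]
    # x_{t+1} ~ (dt*A + I) x_t + sum_j dt * u_t^(j) * B^(j) x_t + dt * C u_t
    x_next = (dt * A + np.eye(n)) @ x
    for j, Bj in enumerate(B):
        x_next += dt * u[j] * (Bj @ x)
    x_next += dt * (C @ u)
    return x_next
```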
To accommodate the complexity of DCM, we propose a generalization of the RNN. The classic RNN18 models inputs $$$\mathbf{x}_t$$$ and outputs $$$\mathbf{y}_t$$$ as
$$\mathbf{h}_t=f^h (W^{hx}\mathbf{x}_t+W^{hh}\mathbf{h}_{t-1}+\mathbf{b}^h )$$
$$\mathbf{y}_t=f^y (W^{yh}\mathbf{h}_t+\mathbf{b}^y )$$
where $$$\mathbf{h}_t$$$ is the hidden state, and the $$$W$$$ and $$$\mathbf{b}$$$ with various superscripts are weighting matrices and bias vectors. $$$f^h$$$ and $$$f^y$$$ are nonlinear functions chosen by the researcher for the targeted application. We generalize the classic RNN to
$$\mathbf{h}_t=f^h\left(W^h \phi^h(\mathbf{x}_t,\mathbf{h}_{t-1};\boldsymbol{\xi}^h)+\mathbf{b}^h\right)$$
$$\mathbf{y}_t=f^y\left(W^{yh} \phi^y(\mathbf{h}_t;\boldsymbol{\xi}^y)+\mathbf{b}^y\right)$$
The difference lies in the newly introduced, generally nonlinear functions $$$\phi^h$$$ and $$$\phi^y$$$, which are parameterized by $$$\boldsymbol{\xi}^h$$$ and $$$\boldsymbol{\xi}^y$$$. The GRNN is trainable by backpropagation as long as $$$\phi^h$$$ and $$$\phi^y$$$ are partially differentiable. The original DCM1 can be cast as a GRNN involving three $$$\phi$$$ functions, as illustrated in Fig. 2. We refer to this model as DCM-RNN.
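To make the generalization concrete, below is a minimal NumPy sketch of a single GRNN step. The specific choices of $$$\phi^h$$$ (concatenation), $$$\phi^y$$$ (identity), $$$f^h$$$ (tanh), and $$$f^y$$$ (identity) are illustrative stand-ins only; under these choices the GRNN reduces to the classic RNN above, while DCM-RNN would replace them with the discretized DCM equations:

```python
import numpy as np

def phi_h(x_t, h_prev, xi_h=None):
    # Stand-in for the generally nonlinear phi^h(x_t, h_{t-1}; xi^h).
    # Plain concatenation here: with W_h = [W^hx | W^hh],
    # W_h @ phi_h equals W^hx x_t + W^hh h_{t-1} (the classic RNN).
    return np.concatenate([x_t, h_prev])

def phi_y(h_t, xi_y=None):
    # Stand-in for phi^y(h_t; xi^y); identity for illustration.
    return h_t

def grnn_step(x_t, h_prev, W_h, b_h, W_yh, b_y):
    """One GRNN step:
    h_t = f^h(W^h phi^h(x_t, h_{t-1}; xi^h) + b^h)
    y_t = f^y(W^yh phi^y(h_t; xi^y) + b^y)
    """
    h_t = np.tanh(W_h @ phi_h(x_t, h_prev) + b_h)  # f^h = tanh (stand-in)
    y_t = W_yh @ phi_y(h_t) + b_y                  # f^y = identity (stand-in)
    return h_t, y_t
```

The step remains trainable by backpropagation as long as whatever replaces `phi_h` and `phi_y` is partially differentiable in its inputs and parameters.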
We propose to infer the parameters $$$\Theta=\{A,B,C,\kappa_n,\gamma_n,\tau_n,{E_0}_n,\alpha_n \mid n=1,2,\dots,N_n\}$$$ of DCM-RNN using a loss function that promotes model sparsity and prediction accuracy simultaneously:
$$L(\Theta)= \sum_{t=1}^{T}\left\|\mathbf{y}_t-\hat{\mathbf{y}}_t\right\|_2^2 +\lambda\sum_{\theta\in\{A,B,C\}}\left\|\theta\right\|_1 +\beta\sum_{n=1}^{N_n}\sum_{\theta\in\{\kappa_n,\gamma_n,\tau_n,{E_0}_n,\alpha_n\}}\frac{\left(\theta-\mathrm{mean}(\theta)\right)^2}{\mathrm{variance}(\theta)}$$
where $$$\mathbf{y}_t$$$ and $$$\hat{\mathbf{y}}_t$$$ are the measured and estimated fMRI signals, $$$T$$$ is the scan length, $$$n$$$ indexes brain regions, and $$$N_n$$$ is the total number of brain regions. $$$\lambda$$$ and $$$\beta$$$ are user-defined weighting factors. The means and variances of the hemodynamic parameters come from previous studies and are listed in1. The loss is minimized over the whole set $$$\Theta$$$.
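A direct transcription of this loss into Python might look as follows; the array layouts and the dictionary structure for the hemodynamic parameters and their priors are assumptions made for illustration:

```python
import numpy as np

def dcm_rnn_loss(y_meas, y_est, A, B, C, hemo, priors, lam, beta):
    """Loss L(Theta): data fidelity + L1 sparsity on {A, B, C} + a
    penalty keeping hemodynamic parameters near their literature priors.

    y_meas, y_est : (T, N_n) measured and estimated fMRI signals
    A, C          : connectivity and input matrices; B: list of matrices
    hemo          : dict name -> (N_n,) per-region values for
                    kappa, gamma, tau, E0, alpha
    priors        : dict name -> (mean, variance), e.g. from ref. 1
    lam, beta     : user-defined weighting factors lambda and beta
    """
    fidelity = np.sum((y_meas - y_est) ** 2)             # sum_t ||y_t - y_hat_t||^2
    sparsity = sum(np.abs(M).sum() for M in [A, *B, C])  # L1 norms of A, B, C
    prior = sum(np.sum((hemo[k] - m) ** 2 / v)           # prior-consistency term
                for k, (m, v) in priors.items())
    return fidelity + lam * sparsity + beta * prior
```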
This work was supported in part by R01 NS039135-11 from the National Institute of Neurological Disorders and Stroke (NINDS) and was performed under the rubric of the Center for Advanced Imaging Innovation and Research (CAI2R, www.cai2r.net), a NIBIB Biomedical Technology Resource Center (NIH P41 EB017183).
We thank Quanyan Zhu, Assistant Professor, NYU, for suggestions during the development of this work.
1. Friston KJ, Harrison L, Penny W. Dynamic causal modelling. Neuroimage 2003; 19(4):1273–1302.
2. Friston KJ. Functional and effective connectivity: a review. Brain Connect. 2011; 1(1):13–36.
3. Smith SM. The future of FMRI connectivity. Neuroimage 2012; 62(2):1257–1266.
4. Smith SM, Vidaurre D, Beckmann CF et al. Functional connectomics from resting-state fMRI. Trends Cogn. Sci. 2013; 17(12):666–682.
5. Friston K, Mattout J, Trujillo-Barreto N et al. Variational free energy and the Laplace approximation. Neuroimage 2007; 34(1):220–234.
6. Daunizeau J, Friston KJ, Kiebel SJ. Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models. Phys. D Nonlinear Phenom. 2009; 238(21):2089–2118.
7. Friston KJ, Trujillo-Barreto N, Daunizeau J. DEM: A variational treatment of dynamic systems. Neuroimage 2008; 41(3):849–885.
8. Friston K, Stephan K, Li B, Daunizeau J. Generalised filtering. Math. Probl. Eng. 2010. doi:10.1155/2010/621670.
9. Friston KJ. Variational filtering. Neuroimage 2008; 41(3):747–766.
10. Penny WD, Stephan KE, Mechelli A, Friston KJ. Comparing dynamic causal models. Neuroimage 2004; 22(3):1157–1172.
11. Friston KJ, Li B, Daunizeau J, Stephan KE. Network discovery with DCM. Neuroimage 2011; 56(3):1202–1221.
12. Marreiros AC, Kiebel SJ, Friston KJ. Dynamic causal modelling for fMRI: A two-state model. Neuroimage 2008; 39(1):269–278.
13. Wilson HR, Cowan JD. A mathematical theory of the functional dynamics of cortical and thalamic nervous tissue. Biol. Cybern. 1973; 13(2):55–80.
14. Williams RJ, Zipser D. A learning algorithm for continually running fully recurrent neural networks. Neural Comput. 1989; 1(2):270–280.
15. Greenspan H, van Ginneken B, Summers RM. Guest editorial: Deep learning in medical imaging: Overview and future promise of an exciting new technique. IEEE Trans. Med. Imaging 2016; 35(5):1153–1159.
16. Güçlü U, van Gerven MAJ. Modeling the dynamics of human brain activity with recurrent neural networks. Front. Syst. Neurosci. 2016:1–19.
17. Qian P, Qiu X, Huang X. Bridging LSTM architecture and the neural dynamics during reading. Proc. IJCAI 2016.
18. Lipton ZC. A critical review of recurrent neural networks for sequence learning. arXiv preprint, 2015:1–35.
19. Stephan KE, Weiskopf N, Drysdale PM et al. Comparing hemodynamic models with DCM. Neuroimage 2007; 38(3):387–401.