Dynamical Components Analysis (DCA)
Model Formulation
Briefly, DCA seeks a projection of time series data \(y_t=V^T\cdot x_t\) such that the predictive information (PI) [Bialek1999] is maximized. The PI of a stationary, multivariate time series \(y_t\) is defined as the mutual information between two consecutive windows of length \(T\):

\[ \mathrm{PI}(Y) = \mathrm{MI}\big(\{y_{-T+1},\dots,y_{0}\};\ \{y_{1},\dots,y_{T}\}\big) = 2H_y(T) - H_y(2T) \]
where \(H_y(T)\) is the entropy of a length-\(T\) window of \(y\). Estimating the mutual information or entropy of continuous, high-dimensional signals is difficult. Furthermore, we require an estimator of PI that is differentiable in \(V\) so that PI can be maximized.
To solve both of these problems, we assume that \(X\) is a stationary, discrete-time Gaussian process. Then \(Y\) is also stationary and Gaussian, since it is a linear projection of \(X\), and the PI takes the closed form

\[ \mathrm{PI}(Y) = \log\big|\Sigma_T(Y)\big| - \tfrac{1}{2}\log\big|\Sigma_{2T}(Y)\big| \]
where \(\Sigma_T(Y)\) and \(\Sigma_{2T}(Y)\) are the space-time cross covariance matrices for windows of length \(T\) and \(2T\), respectively. The space-time cross covariance matrix for \(X\) is the block-Toeplitz matrix of lagged cross covariances

\[ \Sigma_T(X) = \begin{pmatrix} C_0 & C_1 & \cdots & C_{T-1} \\ C_1^T & C_0 & \cdots & C_{T-2} \\ \vdots & \vdots & \ddots & \vdots \\ C_{T-1}^T & C_{T-2}^T & \cdots & C_0 \end{pmatrix}, \qquad C_{\Delta t} = \operatorname{Cov}(x_t, x_{t+\Delta t}). \]
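As an illustration, the space-time cross covariance \(\Sigma_T(X)\) can be assembled from empirical lagged cross covariances. This is a minimal sketch; the helper names `lagged_cross_cov` and `spacetime_cov` are ours for illustration, not part of the dca package, which has its own estimator.

```python
import numpy as np

def lagged_cross_cov(X, max_lag):
    """Estimate C_dt = Cov(x_t, x_{t+dt}) for dt = 0 .. max_lag-1.

    X: array of shape (num_samples, n). Returns an array (max_lag, n, n).
    """
    X = X - X.mean(axis=0, keepdims=True)
    num_samples, n = X.shape
    C = np.empty((max_lag, n, n))
    for dt in range(max_lag):
        # average outer products over all sample pairs separated by dt
        C[dt] = X[:num_samples - dt].T @ X[dt:] / (num_samples - dt)
    return C

def spacetime_cov(C, T):
    """Assemble the nT x nT block-Toeplitz matrix Sigma_T(X)
    from the lagged covariances C_0 .. C_{T-1}."""
    n = C.shape[1]
    Sigma = np.empty((T * n, T * n))
    for i in range(T):
        for j in range(T):
            dt = j - i
            # blocks below the diagonal are transposes of the lags above it
            block = C[dt] if dt >= 0 else C[-dt].T
            Sigma[i*n:(i+1)*n, j*n:(j+1)*n] = block
    return Sigma

rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 4))          # 2000 time steps, n = 4
C = lagged_cross_cov(X, max_lag=5)
Sigma_5 = spacetime_cov(C, T=5)
print(Sigma_5.shape)                        # (20, 20)
```

The resulting matrix is symmetric, and its diagonal blocks all equal the zero-lag covariance \(C_0\), as the block-Toeplitz structure requires.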
Finally, the space-time cross covariance for \(Y\) can be computed by projecting each time step:

\[ \Sigma_T(Y) = \left(I_T \otimes V\right)^T \Sigma_T(X) \left(I_T \otimes V\right), \]

where \(I_T\) is the \(T \times T\) identity matrix and \(\otimes\) is the Kronecker product.
This allows us to both compute the Gaussian PI and take derivatives with respect to \(V\). More details can be found in [Clark2019].
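Putting the pieces together, the Gaussian PI can be computed directly from data for a fixed projection \(V\). The sketch below estimates \(\Sigma_{2T}(Y)\) by stacking consecutive windows of the projected signal; the function name `gaussian_pi` and this windowing estimator are our illustrative choices, not the dca package's internals.

```python
import numpy as np

def gaussian_pi(X, V, T):
    """Gaussian predictive information of the projection y_t = V^T x_t.

    Estimates Sigma_{2T}(Y) empirically from stacked windows of length 2T,
    then uses PI = log|Sigma_T(Y)| - 0.5 * log|Sigma_{2T}(Y)|.
    """
    num_samples, n = X.shape
    d = V.shape[1]
    Y = (X - X.mean(axis=0)) @ V
    # stack consecutive windows of length 2T: shape (num_windows, 2T*d)
    W = np.stack([Y[t:t + 2*T].ravel() for t in range(num_samples - 2*T + 1)])
    Sigma_2T = np.cov(W, rowvar=False)
    # by stationarity, the top-left T*d block of Sigma_2T is Sigma_T(Y)
    Sigma_T = Sigma_2T[:T*d, :T*d]
    return (np.linalg.slogdet(Sigma_T)[1]
            - 0.5 * np.linalg.slogdet(Sigma_2T)[1])

rng = np.random.default_rng(1)
n, d, T = 5, 2, 3
V = np.linalg.qr(rng.standard_normal((n, d)))[0]  # orthonormal projection

white = rng.standard_normal((5000, n))            # temporally unpredictable
ar = np.zeros((5000, n))                          # AR(1): predictable dynamics
for t in range(1, 5000):
    ar[t] = 0.9 * ar[t - 1] + rng.standard_normal(n)

pi_white = gaussian_pi(white, V, T)   # near zero: white noise carries no PI
pi_ar = gaussian_pi(ar, V, T)         # clearly positive for AR(1) dynamics
print(pi_white, pi_ar)
```

Because the estimator is built from matrix products and log-determinants, the same expression is differentiable in \(V\), which is what allows PI to be maximized by gradient ascent.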
References
W. Bialek and N. Tishby. Predictive information. arXiv preprint cond-mat/9902341 (1999).
D. G. Clark, J. A. Livezey, and K. E. Bouchard. Unsupervised discovery of temporal structure in noisy data with dynamical components analysis. Advances in Neural Information Processing Systems (2019).
Python Implementation
The DCA models are designed to mimic the scikit-learn estimator interface (fit/transform).
import numpy as np
from dca import DynamicalComponentsAnalysis as DCA

# toy data: 1000 time steps of a 9-dimensional signal
X = np.random.randn(1000, 9)

# project to d=3 dimensions, maximizing PI over windows of length T=10
model = DCA(d=3, T=10)
model.fit(X)
Y = model.transform(X)  # Y has shape (1000, 3)