Base Components Analysis Classes

class dca.base.BaseComponentsAnalysis(T=None, init='random_ortho', n_init=1, stride=1, chunk_cov_estimate=None, tol=1e-06, verbose=False, device='cpu', dtype=torch.float64, rng_or_seed=None)

Base Components Analysis class.

Parameters:
  • T (int) – Size of time windows.

  • init (str) – Method for initializing the projection matrix. Options: “random_ortho”, “random”, or “PCA”.

  • n_init (int) – Number of random restarts. Default is 1.

  • stride (int) – Number of samples to skip when estimating cross covariance matrices. Setting stride > 1 will speed up covariance estimation but may reduce the quality of the covariance estimate for small datasets.

  • chunk_cov_estimate (None or int) – If None, the covariance is estimated from the entire time series. If an int, the covariance is estimated by splitting the time series into chunks and averaging the covariances from the chunks. This can use less memory and be faster for long time series. Requires that the shortest time series in the batch be longer than T * chunk_cov_estimate.

  • tol (float) – Tolerance for stopping optimization. Default is 1e-6.

  • ortho_lambda (float) – Coefficient on term that keeps V close to orthonormal.

  • verbose (bool) – Verbosity during optimization.

  • device (str) – PyTorch device on which to run the computation.

  • dtype (torch.dtype) – PyTorch dtype to use for the computation.

  • rng_or_seed (None, int, or NumPy RandomState) – Random number generator or seed.
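To make the stride parameter concrete, here is a minimal NumPy sketch of a lagged cross covariance estimate that uses only every stride-th sample. The function name lagged_cross_cov and its arguments are hypothetical and are not part of dca; the library's own estimator may differ in detail.

```python
import numpy as np


def lagged_cross_cov(X, lag, stride=1):
    """Estimate the lag-`lag` cross covariance of a (time, features) array.

    Hypothetical helper, not part of dca: only every `stride`-th time
    point is used as the start of a (t, t + lag) pair.
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)                       # remove the mean first
    starts = np.arange(0, len(X) - lag, stride)  # subsampled start indices
    early, late = X[starts], X[starts + lag]
    return early.T @ late / len(starts)


rng = np.random.RandomState(0)
X = rng.normal(size=(1000, 3))
full = lagged_cross_cov(X, lag=2, stride=1)  # averages over 998 pairs
fast = lagged_cross_cov(X, lag=2, stride=5)  # averages over 200 pairs
```

With stride=5 the estimate averages over one fifth as many pairs, which is cheaper but noisier, in line with the caveat about small datasets.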

T

Default T used for predictive information (PI).

Type:

int

T_fit

T used for last cross covariance estimation.

Type:

int

estimate_data_statistics()

Estimate the data statistics needed for projection fitting.

fit()

Estimate the data statistics and fit the projections.

fit_projection()

Fit the projections, with n_init restarts.

fit_transform(X, d=None, T=None, n_init=None, *args, **kwargs)

Estimate the data statistics and fit the projection matrix. Then project the data onto the components.

score()

Calculate the score of the data.

transform()

Project the data onto the components after removing the training mean.

class dca.base.SingleProjectionComponentsAnalysis(d=None, T=None, init='random_ortho', n_init=1, stride=1, chunk_cov_estimate=None, tol=1e-06, verbose=False, device='cpu', dtype=torch.float64, rng_or_seed=None)

Base class for Components Analysis with 1 projection.

Runs a Components Analysis method on multidimensional timeseries data X to discover a projection onto a d-dimensional subspace of an N-dimensional space which maximizes the score of the d-dimensional dynamics over windows of length T.

Parameters:
  • d (int) – Number of basis vectors onto which the data X are projected.

  • T (int) – Size of time windows across which to compute mutual information. Total window length will be 2 * T. When fitting a model, the length of the shortest timeseries must be greater than 2 * T and for good performance should be much greater than 2 * T.

  • init (str) – Method for initializing the projection matrix. Options: “random_ortho”, “random”, or “PCA”.

  • n_init (int) – Number of random restarts. Default is 1.

  • stride (int) – Number of samples to skip when estimating cross covariance matrices. Setting stride > 1 will speed up covariance estimation but may reduce the quality of the covariance estimate for small datasets.

  • chunk_cov_estimate (None or int) – If None, the covariance is estimated from the entire time series. If an int, the covariance is estimated by splitting the time series into chunks and averaging the covariances from the chunks. This can use less memory and be faster for long time series. Requires that the shortest time series in the batch be longer than T * chunk_cov_estimate.

  • tol (float) – Tolerance for stopping optimization. Default is 1e-6.

  • verbose (bool) – Verbosity during optimization.

  • device (str) – PyTorch device on which to run the computation.

  • dtype (torch.dtype) – PyTorch dtype to use for the computation.

  • rng_or_seed (None, int, or NumPy RandomState) – Random number generator or seed.
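The two length requirements stated for T and chunk_cov_estimate can be checked before fitting. The helper below is a hypothetical pre-flight check written for illustration; it is not part of dca, and the library may validate its inputs differently.

```python
def check_series_lengths(lengths, T, chunk_cov_estimate=None):
    """Check the documented length requirements before fitting.

    Hypothetical helper, not part of dca. `lengths` holds the number of
    samples in each time series of the batch.
    """
    shortest = min(lengths)
    # The shortest series must be longer than the total window, 2 * T.
    if shortest <= 2 * T:
        raise ValueError(
            f"shortest series ({shortest}) must be longer than 2 * T = {2 * T}"
        )
    # With chunking, it must also be longer than T * chunk_cov_estimate.
    if chunk_cov_estimate is not None and shortest <= T * chunk_cov_estimate:
        raise ValueError(
            f"shortest series ({shortest}) must be longer than "
            f"T * chunk_cov_estimate = {T * chunk_cov_estimate}"
        )
    return shortest


shortest = check_series_lengths([500, 320, 410], T=10, chunk_cov_estimate=20)
```

Passing these checks only satisfies the minimum; as noted above, good performance needs series much longer than 2 * T.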

T

Default T used for predictive information (PI).

Type:

int

T_fit

T used for last cross covariance estimation.

Type:

int

d

Default d used for fitting the projection.

Type:

int

d_fit

d used for last projection fit.

Type:

int

cross_covs

Cross covariance matrices from the last covariance estimation.

Type:

torch tensor

coef_

Projection matrix from fit.

Type:

ndarray (N, d)

fit(X, d=None, T=None, n_init=None, *args, **kwargs)

Estimate the data statistics and fit the projection matrix.

Parameters:
  • X (ndarray or list of ndarrays) – Data to estimate the cross covariance matrix.

  • d (int) – Dimensionality of the projection (optional).

  • T (int) – T for PI calculation (optional).

  • n_init (int) – Number of random restarts (optional).

fit_projection(d=None, T=None, n_init=None)

Fit the projection matrix.

Parameters:
  • d (int) – Dimensionality of the projection (optional).

  • T (int) – T for PI calculation (optional). Default is self.T. If T is set here it must be less than or equal to self.T or self.estimate_cross_covariance() must be called with a larger T.

  • n_init (int) – Number of random restarts (optional).
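The role of n_init can be sketched as a loop that repeats a single fit from different random starting points and keeps the best result. This is an assumption about how restarts are combined, not the library's code: fit_with_restarts and toy_fit_once are hypothetical names, and the toy run below only scores a random orthonormal basis instead of running an optimizer.

```python
import numpy as np


def fit_with_restarts(fit_once, n_init, rng):
    """Run `fit_once` n_init times and keep the highest-scoring result.

    Hypothetical helper, not part of dca. `fit_once(rng)` must return a
    (score, coef) pair for one randomly initialized run.
    """
    best_score, best_coef = -np.inf, None
    for _ in range(n_init):
        score, coef = fit_once(rng)
        if score > best_score:
            best_score, best_coef = score, coef
    return best_score, best_coef


def toy_fit_once(rng, C=np.diag([3.0, 2.0, 1.0]), d=1):
    """Stand-in for one run: score a random orthonormal basis V."""
    V, _ = np.linalg.qr(rng.normal(size=(C.shape[0], d)))
    return float(np.trace(V.T @ C @ V)), V


score_1, _ = fit_with_restarts(toy_fit_once, 1, np.random.RandomState(0))
score_5, V_5 = fit_with_restarts(toy_fit_once, 5, np.random.RandomState(0))
```

Because both calls start from the same seed, the five-restart score can never be lower than the single-run score; more restarts cost more time in exchange for a better chance of avoiding a poor local optimum.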

fit_transform(X, d=None, T=None, n_init=None, *args, **kwargs)

Estimate the data statistics and fit the projection matrix. Then project the data onto the components.

Parameters:
  • X (ndarray or list of ndarrays) – Data to estimate the cross covariance matrix.

  • d (int) – Dimensionality of the projection (optional).

  • T (int) – T for PI calculation (optional).

  • n_init (int) – Number of random restarts (optional).

transform(X)

Project the data onto the components after removing the training mean.

Parameters:
  • X (ndarray or list of ndarrays) – Data to project onto the components.
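As a rough illustration of projecting after removing the training mean, the NumPy sketch below subtracts a stored mean and multiplies by an (N, d) projection matrix, handling both a single array and a list of arrays. The function project and the variables mean and coef are hypothetical stand-ins; the library's internal attribute names may differ.

```python
import numpy as np


def project(X, coef, mean):
    """Subtract the training mean, then project onto the columns of coef.

    Hypothetical helper, not part of dca. Accepts one (time, N) array or
    a list of such arrays, mirroring the documented input types.
    """
    if isinstance(X, (list, tuple)):
        return [(np.asarray(Xi) - mean) @ coef for Xi in X]
    return (np.asarray(X) - mean) @ coef


rng = np.random.RandomState(0)
X_train = rng.normal(loc=5.0, size=(200, 4))
mean = X_train.mean(axis=0, keepdims=True)       # training mean, shape (1, N)
coef, _ = np.linalg.qr(rng.normal(size=(4, 2)))  # stand-in for coef_, shape (N, d)

Y = project(X_train, coef, mean)                          # shape (200, 2)
Y_list = project([X_train[:50], X_train[50:]], coef, mean)
```

The same mean is applied to every array in the list, so splitting the data into segments does not change the projected values.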

dca.base.init_coef(N, d, rng, init)

Initialize a projection coefficient matrix.

Parameters:
  • N (int) – Original dimensionality.

  • d (int) – Projected dimensionality.

  • rng (np.random.RandomState) – Random state for generation.

  • init (str or ndarray) – Initialization type.
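A minimal sketch of the “random” and “random_ortho” options, plus a user-supplied ndarray, might look as follows. This is not the library's implementation: init_coef_sketch is a hypothetical name, the orthonormal basis is obtained here from a QR decomposition, and normalizing the columns to unit length is an assumption. The “PCA” option is omitted because it needs the data covariance.

```python
import numpy as np


def init_coef_sketch(N, d, rng, init="random_ortho"):
    """Return an (N, d) starting matrix with unit-norm columns.

    Hypothetical stand-in for dca.base.init_coef, for illustration only.
    """
    if isinstance(init, np.ndarray):
        V = init.astype(float)  # astype returns a copy of the given matrix
    elif init == "random":
        V = rng.normal(size=(N, d))
    elif init == "random_ortho":
        # Q from a reduced QR of a Gaussian matrix has orthonormal columns.
        V, _ = np.linalg.qr(rng.normal(size=(N, d)))
    else:
        raise ValueError(f"unsupported init: {init!r}")
    return V / np.linalg.norm(V, axis=0, keepdims=True)


rng = np.random.RandomState(0)
V_ortho = init_coef_sketch(5, 2, rng, "random_ortho")
V_rand = init_coef_sketch(5, 2, rng, "random")
```

The orthonormal start satisfies V.T @ V = I exactly (up to rounding), while the plain random start only has unit-length columns.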

dca.base.ortho_reg_fn(ortho_lambda, *Vs)

Regularization term which encourages the basis vectors in the columns of the Vs to be orthonormal.

Parameters:
  • Vs (np.ndarrays, shape (N, d)) – Matrices whose columns are basis vectors.

  • ortho_lambda (float) – Regularization hyperparameter.

Returns:

reg_val – Value of regularization function.

Return type:

float
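One common way to write such a penalty is the squared Frobenius distance between V.T @ V and the identity, summed over the matrices and scaled by ortho_lambda. The sketch below assumes that form; ortho_reg_sketch is a hypothetical NumPy-only stand-in and may not match the library's exact expression.

```python
import numpy as np


def ortho_reg_sketch(ortho_lambda, *Vs):
    """Penalty that is zero when every V has orthonormal columns.

    Hypothetical stand-in for dca.base.ortho_reg_fn, assuming the form
    ortho_lambda * sum over V of ||V.T @ V - I||_F ** 2.
    """
    total = 0.0
    for V in Vs:
        d = V.shape[1]
        gram_gap = V.T @ V - np.eye(d)  # zero matrix for orthonormal columns
        total += np.sum(gram_gap ** 2)
    return ortho_lambda * total


V_ortho = np.eye(3)[:, :2]   # orthonormal columns
V_scaled = 2.0 * V_ortho     # orthogonal columns, but length 2

penalty_ortho = ortho_reg_sketch(0.5, V_ortho)    # 0.0
penalty_scaled = ortho_reg_sketch(0.5, V_scaled)  # 9.0
```

For V_scaled, V.T @ V is 4 times the 2 x 2 identity, so the gap is 3 times the identity, its squared entries sum to 18, and the penalty is 0.5 * 18 = 9.0.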