Synthetic Datasets

dca.synth_data.calc_pi_for_gp(kernel, T_pi, N)

Calculates the predictive information in a spatiotemporal Gaussian process with a given kernel.

Parameters:

T_pi (int) – Length of temporal windows across which to compute mutual information.
N (int) – Number of spatial steps in the Gaussian process.
kernel (function) – Should be of the form kernel = K(t1, t2, x1, x2). The kernel can choose to implement temporal or spatial stationarity; however, this is not enforced.

Returns:

PI – (Temporal) predictive information in the Gaussian process.

Return type:

float
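As a rough illustration of the quantity described above, the predictive information of a Gaussian process can be computed from log-determinants of its covariance. The sketch below is self-contained NumPy and is not taken from the library source; the helper name `gaussian_predictive_info` and the toy kernel values are hypothetical, and it assumes temporal stationarity so that the past and future windows share the same covariance.

```python
import numpy as np


def gaussian_predictive_info(cov_2T, T_pi, N):
    """Mutual information between adjacent past and future windows.

    cov_2T is the (2*T_pi*N, 2*T_pi*N) covariance of the process over
    2*T_pi time steps, with time as the outer index and space as the inner.
    """
    cov_T = cov_2T[:T_pi * N, :T_pi * N]
    _, logdet_T = np.linalg.slogdet(cov_T)
    _, logdet_2T = np.linalg.slogdet(cov_2T)
    # I(past; future) = 0.5*logdet(past) + 0.5*logdet(future) - 0.5*logdet(joint).
    # Under temporal stationarity the past and future blocks are identical.
    return logdet_T - 0.5 * logdet_2T


# Toy check with N = 1 and an exponential temporal kernel (hypothetical values).
T_pi, N = 3, 1
t = np.arange(2 * T_pi)
cov = np.exp(-np.abs(t[:, None] - t[None, :]) / 2.0)
pi_value = gaussian_predictive_info(cov, T_pi, N)

# White noise carries no information about its own future.
white_pi = gaussian_predictive_info(np.eye(2 * T_pi), T_pi, N)
```

The result is in nats because natural logarithms are used.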

dca.synth_data.embed_gp(T, N, d, kernel, noise_cov, T_pi, num_to_concat=1)

Embed a d-dimensional Gaussian process into N-dimensional space, then add (potentially) spatially structured white noise.

Parameters:

T (int) – Length in time.
N (int) – Ambient dimension.
d (int) – Gaussian process dimension.
kernel (function) – Kernel of the form K(t1, t2, x1, x2).
noise_cov (np.ndarray, shape (N, N)) – Covariance matrix from which to sample Gaussian noise to add to each time point in an iid fashion.
num_to_concat (int) – Number of samples of length T to concatenate before returning the result.

Returns:

X – Embedding of GP into high-dimensional space, plus noise.

Return type:

np.ndarray, size (T*num_to_concat, N)
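The embedding step can be pictured as projecting a low-dimensional trajectory through a random orthonormal basis and then adding noise that is independent across time but correlated across dimensions. The following is a minimal sketch under those assumptions, not the library's implementation: the latent trajectory `Y` is a white-noise stand-in for a GP sample, and the QR-based basis and the example `noise_cov` are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, d = 200, 10, 3

# Stand-in latent trajectory; in practice this would be a GP sample of shape (T, d).
Y = rng.standard_normal((T, d))

# Random orthonormal basis for a d-dimensional subspace of R^N (columns of Q).
Q, _ = np.linalg.qr(rng.standard_normal((N, d)))

# Spatially structured noise: iid over time, correlated across the N dimensions.
A = rng.standard_normal((N, N))
noise_cov = A @ A.T / N + 0.1 * np.eye(N)
noise = rng.multivariate_normal(np.zeros(N), noise_cov, size=T)

# Embedded data: one row per time point, one column per ambient dimension.
X = Y @ Q.T + noise
```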

dca.synth_data.embedded_lorenz_cross_cov_mats(N, T, snr=1.0, noise_dim=7, return_samples=False, num_lorenz_samples=10000, num_subspace_samples=5000, V_dynamics=None, V_noise=None, X_dynamics=None, seed=20200326)

Embed the Lorenz system into high dimensions with additive spatially structured white noise. Signal and noise subspaces are oriented with the median subspace angle.

Parameters:

N (int) – Embedding dimension.
T (int) – Number of timesteps (2 * T_pi)
snr (float) – Signal-to-noise ratio. Specifically it is the ratio of the largest eigenvalue of the signal covariance to the largest eigenvalue of the noise covariance.
noise_dim (int) – Dimension at which noise eigenvalues fall to 1/e. If noise_dim is np.inf then a flat spectrum is used.
return_samples (bool) – Whether to return cross_cov_mats or data samples.
num_lorenz_samples (int) – Number of data samples to use.
num_subspace_samples (int) – Number of random subspaces used to calculate the median relative angle.
seed (int) – Seed for Numpy random state.
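To make the `snr` and `noise_dim` definitions concrete, the sketch below builds a noise covariance whose spectrum decays to 1/e of its largest value at index `noise_dim` and whose largest eigenvalue is scaled to hit a target signal-to-noise ratio. The exponential decay profile, the random eigenvectors, and the helper name `noise_cov_for_snr` are assumptions made for illustration; the library's actual spectrum shape is not stated here.

```python
import numpy as np


def noise_cov_for_snr(signal_cov, snr, noise_dim, rng):
    """Hypothetical helper: noise covariance matched to a target SNR."""
    N = signal_cov.shape[0]
    if np.isinf(noise_dim):
        spectrum = np.ones(N)  # flat spectrum
    else:
        # Assumed profile: eigenvalue k is exp(-k / noise_dim) times the largest.
        spectrum = np.exp(-np.arange(N) / noise_dim)
    # Scale so that (largest signal eigenvalue) / (largest noise eigenvalue) == snr.
    signal_max = np.linalg.eigvalsh(signal_cov)[-1]
    spectrum = spectrum * signal_max / (snr * spectrum[0])
    # Random orthonormal eigenvectors for the noise covariance.
    V, _ = np.linalg.qr(rng.standard_normal((N, N)))
    return (V * spectrum) @ V.T


rng = np.random.default_rng(0)
B = rng.standard_normal((8, 3))
signal_cov = B @ B.T  # rank-3 signal covariance in 8 dimensions
noise_cov = noise_cov_for_snr(signal_cov, snr=2.0, noise_dim=4, rng=rng)

noise_eigs = np.linalg.eigvalsh(noise_cov)[::-1]  # descending order
ratio = np.linalg.eigvalsh(signal_cov)[-1] / noise_eigs[0]
```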

dca.synth_data.gen_gp_cov(kernel, T, N)

Generates a T*N-by-T*N covariance matrix for a spatiotemporal Gaussian process (2D Gaussian random field) with a provided kernel.

Parameters:

T (int) – Number of time-steps.
N (int) – Number of spatial steps.
kernel (function) – Should be of the form kernel = K(t1, t2, x1, x2). The kernel can choose to implement temporal or spatial stationarity; however, this is not enforced.

Returns:

C – Covariance matrix for the Gaussian process. Time is the “outer” variable and space is the “inner” variable.

Return type:

np.ndarray, shape (T*N, T*N)
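The "time outer, space inner" ordering means that flat index i corresponds to time i // N and spatial position i % N. The sketch below shows one way such a matrix could be assembled; `build_gp_cov` and `toy_kernel` are hypothetical names, and the loop is written for clarity, not speed.

```python
import numpy as np


def build_gp_cov(kernel, T, N):
    """Hypothetical helper: (T*N, T*N) covariance with time as the outer index."""
    # Map each flat index to its (time, space) coordinates.
    t, x = np.divmod(np.arange(T * N), N)
    C = np.empty((T * N, T * N))
    for i in range(T * N):
        for j in range(T * N):
            C[i, j] = kernel(t[i], t[j], x[i], x[j])
    return C


def toy_kernel(t1, t2, x1, x2):
    # Illustrative stationary kernel; the scales are arbitrary.
    return np.exp(-((t1 - t2) ** 2) / 4.0 - ((x1 - x2) ** 2) / 4.0)


T, N = 3, 2
C = build_gp_cov(toy_kernel, T, N)
```

With this layout, `C[0, 1]` relates two spatial positions at the same time, while `C[0, N]` relates the same position at consecutive times.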

dca.synth_data.gen_gp_kernel(kernel_type, spatial_scale, temporal_scale, local_noise=0.0)

Generates a specified type of kernel for a spatiotemporal Gaussian process.

Parameters:

kernel_type (string) – ‘squared_exp’ or ‘exp’
spatial_scale (float) – Spatial autocorrelation scale.
temporal_scale (float) – Temporal autocorrelation scale.

Returns:

K – Kernel of the form K(t1, t2, x1, x2).

Return type:

function
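A kernel factory of this kind can be sketched as a closure over the two scales. The version below is an illustration only: the exact normalization of the exponents used by the library is not stated here, so the forms shown are assumptions, the name `make_kernel` is hypothetical, and `local_noise` is left out because its semantics are not documented above.

```python
import numpy as np


def make_kernel(kernel_type, spatial_scale, temporal_scale):
    """Hypothetical factory returning K(t1, t2, x1, x2)."""

    def squared_exp(t1, t2, x1, x2):
        # Assumed form: Gaussian falloff in both time and space.
        return np.exp(-((t1 - t2) / temporal_scale) ** 2
                      - ((x1 - x2) / spatial_scale) ** 2)

    def exp(t1, t2, x1, x2):
        # Assumed form: exponential falloff in both time and space.
        return np.exp(-abs(t1 - t2) / temporal_scale
                      - abs(x1 - x2) / spatial_scale)

    kernels = {"squared_exp": squared_exp, "exp": exp}
    if kernel_type not in kernels:
        raise ValueError(f"unknown kernel_type: {kernel_type!r}")
    return kernels[kernel_type]


K_exp = make_kernel("exp", spatial_scale=2.0, temporal_scale=4.0)
K_sq = make_kernel("squared_exp", spatial_scale=2.0, temporal_scale=4.0)
```

Both kernels depend only on differences of their arguments, so they are stationary in time and space.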

dca.synth_data.gen_lorenz_system(T, seed, integration_dt=0.005)

The period is approximately 1 unit of time (the total time is T), so make sure that integration_dt << 1.

Uses known-to-be-good chaotic parameters; see the Sussillo LFADS paper.
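For orientation, a Lorenz trajectory can be generated with a fixed-step integrator whose step is much smaller than the oscillation period. The sketch below is not the library's implementation: it assumes the classic chaotic parameters (sigma = 10, rho = 28, beta = 8/3), uses a hand-written fourth-order Runge-Kutta step, and draws a random initial state; the function names are hypothetical.

```python
import numpy as np


def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (assumed classic parameters)."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])


def integrate_lorenz(total_time, dt=0.005, seed=0):
    """Hypothetical helper: RK4 integration from a random initial state."""
    rng = np.random.default_rng(seed)
    state = rng.standard_normal(3)
    n_steps = int(round(total_time / dt))
    traj = np.empty((n_steps, 3))
    for i in range(n_steps):
        k1 = lorenz_rhs(state)
        k2 = lorenz_rhs(state + 0.5 * dt * k1)
        k3 = lorenz_rhs(state + 0.5 * dt * k2)
        k4 = lorenz_rhs(state + dt * k3)
        state = state + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        traj[i] = state
    return traj


traj = integrate_lorenz(total_time=10.0, dt=0.005, seed=0)
```

Fixing the seed makes the initial condition, and therefore the whole trajectory, reproducible.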

dca.synth_data.sample_gp(T, N, kernel, num_to_concat=1)

Draw a sample from a spatiotemporal Gaussian process.

Parameters:

T (int) – Length in time of sample.
N (int) – Size in space of sample.
kernel (function) – Kernel of the form K(t1, t2, x1, x2).
num_to_concat (int) – Number of samples of length T to concatenate before returning the result.

Returns:

sample – Sample from the Gaussian process.

Return type:

np.ndarray, size (T*num_to_concat, N)
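Sampling and concatenation can be illustrated by drawing independent vectors from a multivariate normal with the flattened covariance and reshaping each one to (T, N). This is a sketch under assumptions, not the library's code: the exponential kernel and its scales are arbitrary, and the covariance is built inline with the time-outer, space-inner ordering.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, num_to_concat = 5, 3, 4

# Flat index i corresponds to time i // N and spatial position i % N.
t, x = np.divmod(np.arange(T * N), N)
C = np.exp(-np.abs(t[:, None] - t[None, :]) / 2.0
           - np.abs(x[:, None] - x[None, :]) / 2.0)

# Each row of draws is one independent length-T sample, flattened to T*N values.
draws = rng.multivariate_normal(np.zeros(T * N), C, size=num_to_concat)

# Unflatten each draw to (T, N) and stack the draws along the time axis.
sample = draws.reshape(num_to_concat * T, N)
```

Because the draws are independent, the concatenated array has a discontinuity in its temporal statistics at every multiple of T.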