cedalion.sim.datasets package
Submodules
cedalion.sim.datasets.synthetic_fnirs_eeg module
Simulate bimodal data as a toy model for concurrent EEG–fNIRS data.
Follows an approach inspired by Dähne et al., 2014.
Reference: https://doi.org/10.1016/j.neuroimage.2014.01.014
- class cedalion.sim.datasets.synthetic_fnirs_eeg.BimodalToyDataSimulation(config, seed=None, mixing_type='structured')[source]
Bases: object
Simulate bimodal data as a toy model for concurrent EEG-fNIRS data.
This class creates coupled (bimodal) synthetic signals representing an “X” modality (e.g., EEG) and a “Y” modality (e.g., fNIRS). It provides utilities for simulating sources, generating spatial mixing patterns, projecting to channels via a simple forward model, computing power, and basic visualization.
- Toy data description:
EEG (x) and fNIRS (y) recordings are generated from a pseudo-random linear mixing forward model. Sources s_x and s_y are divided into background sources (independent between modalities) and target sources (co-modulated across modalities). Each EEG background source is constructed from a random oscillatory signal in a chosen frequency band multiplied by a slowly varying random amplitude-modulation function. This amplitude modulation acts as the envelope of s_x and provides an estimate of its bandpower timecourse. fNIRS background sources are generated directly from slowly varying amplitude-modulation functions in the same way.
Target sources are built similarly, except that the same envelope modulating s_x is also used for the corresponding fNIRS source s_y. This coupling may be delayed by a time-lag parameter dT to simulate realistic physiological delays between modalities. fNIRS sources and recordings are then downsampled to epoch intervals of length T_epoch.
Background mixing matrices A_x and A_y are drawn from normal distributions, while target mixing matrices use Gaussian radial basis functions (RBFs) centered on shared positions, plus white noise, to create spatial patterns with local structure.
The signal-to-noise ratio (SNR) is controlled by the parameter gamma, which weights target versus background contributions in channel space. The relationship with decibel units is given by SNR [dB] = 20 * log10(gamma).
EEG channel recordings and their bandpower timecourses are available via the attributes x (high-rate channels) and x_power (downsampled per epoch). The downsampled x_power is aligned with the fNIRS recordings y. Target sources are accessible as sx_t and sy_t; the bandpower timecourse of the former is available as sx_power.
- Parameters:
config (str | dict) – Path to a YAML config file or a dictionary containing simulation parameters. See generate_args() for derived fields.
seed (int | None) – Random seed for reproducibility. If None, uses the current UNIX timestamp.
mixing_type (str) – Type of mixing to generate for target sources. 'structured' assigns localized RBF-like patterns; any other value leaves patterns purely random.
- sy_t[source]
Target sources for Y modality, dims ('source', 'time'), sampled over epochs.
- Type: xr.DataArray
- sy_ba[source]
Background sources for Y modality, dims ('source', 'time'), sampled over epochs.
- Type: xr.DataArray
- sx_power[source]
Power of target sources for X modality, dims ('source', 'time'), sampled over epochs.
- Type: xr.DataArray
- ax_t[source]
Mixing patterns for target sources in X, dims ('channel', 'source').
- Type: xr.DataArray
- ax_ba[source]
Mixing patterns for background sources in X, dims ('channel', 'source').
- Type: xr.DataArray
- ay_t[source]
Mixing patterns for target sources in Y, dims ('channel', 'source').
- Type: xr.DataArray
- ay_ba[source]
Mixing patterns for background sources in Y, dims ('channel', 'source').
- Type: xr.DataArray
- x_power[source]
Power of observed channels for X modality, dims ('channel', 'time'), sampled over epochs.
- Type: xr.DataArray
- y[source]
Observed channels for Y modality, dims ('channel', 'time'), sampled over epochs.
- Type: xr.DataArray
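A minimal usage sketch. It assumes the constructor runs the full simulation and populates the documented attributes; the config path, seed, and SNR value are illustrative, not verified defaults::

    import numpy as np
    from cedalion.sim.datasets.synthetic_fnirs_eeg import BimodalToyDataSimulation

    # Convert a desired SNR in dB to gamma via SNR [dB] = 20 * log10(gamma).
    snr_db = -10.0                      # hypothetical target SNR
    gamma = 10 ** (snr_db / 20)

    # "config.yaml" is a placeholder path; a dict with the expected keys works too.
    sim = BimodalToyDataSimulation("config.yaml", seed=42, mixing_type="structured")

    # Epoch-sampled channel data for both modalities share the same time axis.
    print(sim.x_power.dims, sim.y.dims)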
- preprocess_data(train_test_split=0.8)[source]
Normalize and split the simulated data into train/test subsets.
- Parameters:
train_test_split (float) – Fraction of the time axis to use for the training split (0 < train_test_split < 1). The remainder is used for testing.
- Returns:
A dictionary with keys:
'x_train' (xr.DataArray): X channels (train).
'x_test' (xr.DataArray): X channels (test).
'x_power_train' (xr.DataArray): X power (train).
'x_power_test' (xr.DataArray): X power (test).
'y_train' (xr.DataArray): Y channels (train).
'y_test' (xr.DataArray): Y channels (test).
'sx' (xr.DataArray): Target X sources, test portion only.
'sx_power' (xr.DataArray): Target X source power, test only.
'sy' (xr.DataArray): Target Y sources, test portion only.
- Return type:
dict
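Usage sketch for the train/test split (assumes sim is an already-constructed BimodalToyDataSimulation)::

    data = sim.preprocess_data(train_test_split=0.8)
    x_train, x_test = data["x_train"], data["x_test"]
    y_train, y_test = data["y_train"], data["y_test"]
    # Ground-truth targets are provided for the test portion only.
    sx_test, sy_test = data["sx"], data["sy"]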
- static generate_montage(Nc, channel_label)[source]
Create a symmetric 2D montage of channels within [0, 1] x [0, 1].
The function builds the smallest grid with at least Nc points, ranks points by distance to the center, and selects the closest Nc points for an aesthetically symmetric montage.
- Parameters:
Nc (int) – Number of channels to place.
channel_label (str) – Prefix used to name channels (e.g., 'X' or 'Y'). Channels are labeled as '<label><idx>' starting at 1.
- Returns:
Array of shape (channel, dim) with coordinates channel=[<label>1, ..., <label>Nc] and dim=['x', 'y'] containing the 2D positions in the unit square.
- Return type:
xr.DataArray
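A sketch of the grid-and-rank strategy described above, as an independent re-implementation for illustration rather than the library code::

    import numpy as np
    import xarray as xr

    def montage_sketch(Nc: int, channel_label: str) -> xr.DataArray:
        # Smallest square grid with at least Nc points in the unit square.
        n = int(np.ceil(np.sqrt(Nc)))
        g = np.linspace(0.0, 1.0, n)
        xx, yy = np.meshgrid(g, g)
        pts = np.column_stack([xx.ravel(), yy.ravel()])
        # Rank points by distance to the center and keep the closest Nc.
        order = np.argsort(np.linalg.norm(pts - 0.5, axis=1))
        pts = pts[order[:Nc]]
        labels = [f"{channel_label}{i + 1}" for i in range(Nc)]
        return xr.DataArray(
            pts,
            dims=("channel", "dim"),
            coords={"channel": labels, "dim": ["x", "y"]},
        )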
- simulate_sources()[source]
Simulate target and background sources for both modalities.
Target sources are constructed from band-limited oscillations with amplitude modulation and a modality-dependent temporal shift. Background sources are independent amplitude-modulated oscillations.
- Parameters:
None
- Returns:
(sx_t, sy_t, sx_ba, sy_ba) where:
sx_t: target sources for X, dims ('source', 'time') of shape (Ns_target, Nt).
sy_t: target sources for Y, dims ('source', 'time') of shape (Ns_target, Ne) (epoch-averaged amplitude).
sx_ba: background sources for X, dims ('source', 'time') of shape (Ns_ba, Nt).
sy_ba: background sources for Y, dims ('source', 'time') of shape (Ns_ba, Ne).
- Return type:
tuple[xr.DataArray, xr.DataArray, xr.DataArray, xr.DataArray]
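Usage sketch for unpacking the return value (sim assumed constructed as above)::

    sx_t, sy_t, sx_ba, sy_ba = sim.simulate_sources()
    # High-rate X sources versus epoch-sampled Y sources.
    print(sx_t.sizes["time"], sy_t.sizes["time"])  # Nt vs. Ne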
- simulate_random_source(T=None, Nt=None)[source]
Simulate a single random oscillatory source within (f_min, f_max).
The source is synthesized in the frequency domain with unit amplitude and random phases in the desired band, then transformed back to time via an inverse FFT and envelope-normalized to unity.
- Parameters:
T (float | None) – Total duration in seconds. Defaults to self.args.T when None.
Nt (int | None) – Number of time samples. Defaults to self.args.Nt when None.
- Returns:
Time-domain signal of shape (Nt,) with unit envelope (via analytic signal magnitude normalization).
- Return type:
np.ndarray
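A sketch of the frequency-domain construction described above; this is an independent illustration and the actual implementation may differ in details such as spectrum indexing or normalization::

    import numpy as np
    from scipy.signal import hilbert

    def random_band_source(Nt, fs, f_min, f_max, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        freqs = np.fft.rfftfreq(Nt, d=1.0 / fs)
        spec = np.zeros(freqs.size, dtype=complex)
        band = (freqs >= f_min) & (freqs <= f_max)
        # Unit amplitude, random phase inside the band.
        spec[band] = np.exp(1j * rng.uniform(0, 2 * np.pi, band.sum()))
        s = np.fft.irfft(spec, n=Nt)
        # Normalize the analytic-signal envelope to unity.
        return s / np.abs(hilbert(s))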
- simulate_amplitude(Nt=None)[source]
Simulate an amplitude modulation signal from low-pass noise.
- Parameters:
Nt (int | None) – Number of samples to generate. Defaults to self.args.Nt when None.
- Returns:
Positive amplitude modulation of shape (Nt,) scaled to [0, 1].
- Return type:
np.ndarray
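A sketch of low-pass-filtered noise rescaled to [0, 1]; the cutoff frequency here is an assumption, not the configured value::

    import numpy as np
    from cedalion.sim.datasets.synthetic_fnirs_eeg import butter_lowpass_filter

    def amplitude_sketch(Nt, fs, cut=0.1, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        a = butter_lowpass_filter(rng.standard_normal(Nt), cut=cut, fs=fs, order=5)
        # Rescale to the unit interval so it can serve as a positive envelope.
        return (a - a.min()) / (a.max() - a.min())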
- generate_patterns(mixing_type='structured')[source]
Generate mixing matrices and split them into target/background.
- Parameters:
mixing_type (str) – If 'structured', assign localized RBF-like patterns to target sources based on distances from true source positions to channel locations; otherwise leave as random.
- Returns:
(Ax, ax_t, ax_ba, Ay, ay_t, ay_ba) where each is an xr.DataArray with dims ('channel', 'source'). *_t contains only the first Ns_target sources and *_ba the background sources.
- Return type:
tuple
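A sketch of one RBF-structured pattern column, i.e. a Gaussian of the channel-to-source distance plus white noise. The length scale ell and noise level are assumptions (ell loosely mirrors the ellx/elly config keys)::

    import numpy as np

    def rbf_pattern(channel_pos, source_pos, ell=0.2, noise=0.1, rng=None):
        """channel_pos: (n_channels, 2); source_pos: (2,), both in the unit square."""
        rng = np.random.default_rng() if rng is None else rng
        d = np.linalg.norm(channel_pos - source_pos, axis=1)
        return np.exp(-(d ** 2) / (2 * ell ** 2)) + noise * rng.standard_normal(d.size)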
- forward_model(a_t, a_ba, s_t, s_ba)[source]
Project sources to channels and add background/noise.
- Parameters:
a_t (xr.DataArray) – Mixing patterns for target sources with dims ('channel', 'source').
a_ba (xr.DataArray) – Mixing patterns for background sources with dims ('channel', 'source').
s_t (xr.DataArray) – Target source time series with dims ('source', 'time').
s_ba (xr.DataArray) – Background source time series with dims ('source', 'time').
- Returns:
(x_t, x_noise) where x_t is the projected target-only signal and x_noise is the combination of background and Gaussian noise (both Frobenius normalized), each with dims ('channel', 'time').
- Return type:
tuple[xr.DataArray, xr.DataArray]
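A sketch of the projection step under the documented conventions (Frobenius normalization of both components); the exact noise scaling via sigma_noise is an assumption::

    import numpy as np
    from cedalion.sim.datasets.synthetic_fnirs_eeg import f_normalize

    def forward_sketch(a_t, a_ba, s_t, s_ba, sigma_noise=0.1, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        x_t = f_normalize(a_t.values @ s_t.values)            # target-only channels
        noise = sigma_noise * rng.standard_normal(x_t.shape)  # additive Gaussian noise
        x_noise = f_normalize(a_ba.values @ s_ba.values + noise)
        return x_t, x_noise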
- get_channels(x_t, x_noise, gamma, calculate_power=False)[source]
Combine target and noise components to form observed channels.
- Parameters:
x_t (xr.DataArray) – Target-only channels with dims ('channel', 'time').
x_noise (xr.DataArray) – Background+noise channels with dims ('channel', 'time').
gamma (float) – Mixture weight controlling SNR; higher values weight the target more heavily.
calculate_power (bool) – If True, also compute envelope-based per-epoch power for each channel.
- Returns:
If calculate_power is False, returns x (channels). If True, returns (x, x_power) where x_power has dims ('channel', 'time') over epochs.
- Return type:
xr.DataArray | tuple[xr.DataArray, xr.DataArray]
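One combination consistent with SNR [dB] = 20 * log10(gamma) when both inputs are Frobenius-normalized; this is a plausible reading rather than the verified implementation, and it assumes x_t and x_noise come from forward_model()::

    # Hand-written sketch of the gamma-weighted mixture.
    x = gamma * x_t + x_noise

    # The documented call, additionally returning per-epoch power:
    x, x_power = sim.get_channels(x_t, x_noise, gamma=gamma, calculate_power=True)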
- plot_targets(xlim=None, ylim=None)[source]
Plot target sources and their envelopes/power with optional limits.
- Parameters:
xlim (tuple[float, float] | None) – (xmin, xmax) for the x-axis.
ylim (tuple[float, float] | None) – (ymin, ymax) for the y-axis.
- Returns:
None
- plot_channels(N=2, xlim=None, ylim=None)[source]
Plot N pairs of channels for the X and Y modalities.
- Parameters:
N (int) – Number of top-index channels to plot.
xlim (tuple[float, float] | None) – (xmin, xmax) for the x-axis.
ylim (tuple[float, float] | None) – (ymin, ymax) for the y-axis.
- Returns:
None
- plot_mixing_patterns(Ax=None, Ay=None, cmap='viridis', activity_size=200, title=None)[source]
Plot mixing patterns for target sources across both modalities.
This method generates a scatter plot of the mixing patterns Ax and Ay for sources, on 2D grids following the X and Y montages, respectively. The scatter points represent the activity level at each channel. Each source is shown in a separate row, with Ax on the left and Ay on the right.
- Parameters:
Ax (np.ndarray | xr.DataArray | None) – Mixing matrix for X modality with dims/shape (n_channels, n_sources). If None, uses self.ax_t.
Ay (np.ndarray | xr.DataArray | None) – Mixing matrix for Y modality with dims/shape (n_channels, n_sources). If None, uses self.ay_t.
cmap (str) – Matplotlib colormap name used to render activity.
activity_size (float) – Marker size for scatter points.
title (str | None) – Optional figure title.
- Returns:
None
- cedalion.sim.datasets.synthetic_fnirs_eeg.generate_args(config)[source]
Read arguments from config and extend with derived parameters.
- Parameters:
config (str | dict) – Path to the configuration YAML file or a dict containing simulation parameters. Expected base keys include T, rate, T_epoch, dT, f_min, f_max, Ns_all, Ns_target, Nx, Ny, ellx, elly, sigma_noise, gamma, and gamma_e.
- Returns:
Namespace with original and derived fields (e.g., Nt, NdT, e_len, Ne, Nde, and Ns_ba).
- Return type:
argparse.Namespace
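An illustrative config dict with the expected base keys; all values are placeholders rather than recommended defaults, and the inline comments reflect one reading of the parameter names::

    from cedalion.sim.datasets.synthetic_fnirs_eeg import generate_args

    config = {
        "T": 600,                   # total duration [s]
        "rate": 20,                 # X-modality sampling rate [Hz]
        "T_epoch": 1.0,             # epoch length [s]
        "dT": 2.0,                  # X -> Y coupling delay [s]
        "f_min": 8, "f_max": 12,    # oscillation band [Hz]
        "Ns_all": 10, "Ns_target": 2,
        "Nx": 16, "Ny": 16,         # number of X / Y channels
        "ellx": 0.2, "elly": 0.2,
        "sigma_noise": 0.1,
        "gamma": 0.3, "gamma_e": 0.3,
    }
    args = generate_args(config)
    print(args.Nt, args.Ne, args.Ns_ba)  # derived fields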
- cedalion.sim.datasets.synthetic_fnirs_eeg.set_seed(seed)[source]
Set the global random seeds for reproducibility.
- Parameters:
seed (int) – Seed value to set for numpy and Python's random.
- Returns:
None
- cedalion.sim.datasets.synthetic_fnirs_eeg.butter_lowpass(cut, fs, order)[source]
Construct a digital Butterworth low-pass filter.
- Parameters:
cut (float) – Cutoff frequency in Hz.
fs (float) – Sampling rate in Hz.
order (int) – Filter order.
- Returns:
Numerator (b) and denominator (a) filter coefficients suitable for scipy.signal.lfilter().
- Return type:
tuple[np.ndarray, np.ndarray]
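A common way to build such coefficients with SciPy; a sketch of the typical pattern, not necessarily the exact implementation::

    from scipy.signal import butter

    def butter_lowpass_sketch(cut, fs, order):
        # Normalize the cutoff by the Nyquist frequency for scipy.signal.butter.
        b, a = butter(order, cut / (0.5 * fs), btype="low")
        return b, a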
- cedalion.sim.datasets.synthetic_fnirs_eeg.butter_lowpass_filter(data, cut, fs, order=5)[source]
Apply a Butterworth low-pass filter to a 1D signal.
- Parameters:
data (np.ndarray) – Input time series of shape (N,).
cut (float) – Cutoff frequency in Hz.
fs (float) – Sampling rate in Hz.
order (int) – Filter order.
- Returns:
Filtered signal of the same shape as data.
- Return type:
np.ndarray
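Usage sketch on a noisy test signal (values are illustrative)::

    import numpy as np
    from cedalion.sim.datasets.synthetic_fnirs_eeg import butter_lowpass_filter

    fs = 20.0                                   # sampling rate [Hz]
    t = np.arange(0, 60, 1 / fs)
    noisy = np.sin(2 * np.pi * 0.05 * t) + 0.5 * np.random.randn(t.size)
    slow = butter_lowpass_filter(noisy, cut=0.2, fs=fs, order=5)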
- cedalion.sim.datasets.synthetic_fnirs_eeg.f_normalize(x)[source]
Frobenius-normalize an array.
The array is divided by the square root of the sum of squares of all entries. Useful to normalize channel matrices or multichannel signals.
- Parameters:
x (np.ndarray | xr.DataArray) – Input array.
- Returns:
Normalized array with the same shape and type as x.
- Return type:
np.ndarray | xr.DataArray
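Equivalent one-liner for NumPy inputs, illustrating the definition above::

    import numpy as np

    def f_normalize_sketch(x):
        # Divide by the square root of the sum of squares over all entries.
        return x / np.sqrt(np.sum(np.asarray(x) ** 2))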
- cedalion.sim.datasets.synthetic_fnirs_eeg.standardize(x, dim='time')[source]
Z-score standardize along a dimension (for xarray) or globally (numpy).
- Parameters:
x (xr.DataArray | np.ndarray) – Input array to standardize.
dim (str) – Dimension name along which to standardize when x is an xr.DataArray. Ignored for np.ndarray.
- Returns:
Standardized array with mean 0 and std 1.
- Return type:
xr.DataArray | np.ndarray
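For xarray inputs this amounts to the usual z-score along the named dimension (sketch)::

    def standardize_sketch(x, dim="time"):
        # Subtract the mean and divide by the standard deviation along `dim`.
        return (x - x.mean(dim=dim)) / x.std(dim=dim)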
- cedalion.sim.datasets.synthetic_fnirs_eeg.split_epochs(x, e_len)[source]
Split a 1D signal into non-overlapping epochs of a given length.
- Parameters:
x (np.ndarray) – 1D input signal of length N.
e_len (int) – Epoch length in samples.
- Returns:
Array of shape (Ne, e_len) where Ne = N // e_len.
- Return type:
np.ndarray
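The reshape that implements this, dropping any trailing samples that do not fill a complete epoch (sketch)::

    import numpy as np

    def split_epochs_sketch(x, e_len):
        Ne = x.size // e_len
        return np.asarray(x)[: Ne * e_len].reshape(Ne, e_len)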
Module contents
Synthetic toy datasets.