cedalion.mlutils package
Submodules
cedalion.mlutils.cv module
- cedalion.mlutils.cv.create_cv_splits(
- df_stim: DataFrame,
- n_splits: int,
Split stimulus events into train and test sets for cross-validation.
- Parameters:
df_stim – Stimulus events, sorted by onset times with ordered index.
n_splits – number of folds
- Yields:
For each fold, the stimulus data frame split into a train and a test set.
The test trials are consecutive and not randomized.
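The splitting behaviour described above can be illustrated with a minimal sketch. It is not cedalion's implementation: `consecutive_cv_splits` is a hypothetical stand-in, the stimulus table is made up, and the sketch assumes folds of nearly equal size and a unique index.

```python
import numpy as np
import pandas as pd


def consecutive_cv_splits(df_stim: pd.DataFrame, n_splits: int):
    """Yield (train, test) frames; each test fold is a block of consecutive trials.

    Hypothetical helper for illustration only, not part of cedalion.
    """
    # Split the row positions into n_splits contiguous, nearly equal blocks.
    for test_pos in np.array_split(np.arange(len(df_stim)), n_splits):
        df_test = df_stim.iloc[test_pos]
        # Everything that is not in the test block is used for training.
        df_train = df_stim.drop(index=df_test.index)
        yield df_train, df_test


# Six made-up trials, already sorted by onset time.
df_stim = pd.DataFrame(
    {
        "onset": [10.0, 40.0, 70.0, 100.0, 130.0, 160.0],
        "duration": [10.0] * 6,
        "trial_type": ["A", "B", "A", "B", "A", "B"],
    }
)

folds = list(consecutive_cv_splits(df_stim, n_splits=3))
for df_train, df_test in folds:
    print(df_test["onset"].tolist(), "held out;", len(df_train), "trials for training")
```

Because the test trials are never shuffled, each test fold covers one uninterrupted stretch of the recording, which is the property that mask_design_matrix relies on.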
- cedalion.mlutils.cv.mask_design_matrix(
- dms: ~cedalion.models.glm.design_matrix.DesignMatrix,
- df_stim_test: ~pandas.core.frame.DataFrame,
- before: ~pint.Annotated[~pint.Quantity, '[time]'] = <Quantity(5, 'second')>,
- after: ~pint.Annotated[~pint.Quantity, '[time]'] = <Quantity(20, 'second')>,
Mask a segment of the design matrix by setting it to zero.
When using GLM parameters as features, the fit must not have access to the test trials. This function zeros out a contiguous segment of the design matrix, ensuring that the model cannot explain the time course in the masked segment for any choice of parameters. The segment extends from the earliest to the latest trial in df_stim_test, padded by additional time specified by the before and after parameters. Because the masked segment is continuous, the train-test split must be chosen such that the test trials are consecutive.
- Parameters:
dms – The design matrix to mask
df_stim_test – test set of stimulus events.
before – time to pad before the earliest test trial
after – time to pad after the latest test trial
- Returns:
A copy of the design matrix with the masked segment set to zero.
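The masking idea can be sketched with plain NumPy arrays. This is an illustration under stated assumptions, not the library's code: `mask_window` is a hypothetical helper, the design matrix is a simple (time, regressor) array rather than a DesignMatrix object, times are plain floats in seconds instead of pint quantities, and the window is assumed to end at the onset of the latest test trial plus `after`.

```python
import numpy as np


def mask_window(time, design, test_onsets, before=5.0, after=20.0):
    """Return a zero-masked copy of `design` and the boolean mask that was applied.

    Hypothetical helper for illustration only, not part of cedalion.
    """
    # One contiguous window from the earliest to the latest test trial, padded.
    t_start = min(test_onsets) - before
    t_stop = max(test_onsets) + after
    in_window = (time >= t_start) & (time <= t_stop)

    # Work on a copy so the original design matrix stays untouched.
    masked = design.copy()
    masked[in_window, :] = 0.0
    return masked, in_window


time = np.arange(0.0, 100.0, 1.0)   # made-up 1 Hz time axis, in seconds
design = np.ones((time.size, 2))    # two made-up regressors
masked, in_window = mask_window(time, design, test_onsets=[40.0, 50.0])

print("masked samples:", int(in_window.sum()))
```

With all regressors zero inside the window, no choice of GLM coefficients can reproduce the signal there, so the fitted parameters carry no information from the test trials.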
cedalion.mlutils.features module
Feature extraction from epoched fNIRS data for use with scikit-learn pipelines.
- cedalion.mlutils.features.epoch_features(
- epochs: DataArray,
- feature_types: list[Literal['slope', 'mean', 'max', 'min', 'auc']],
- reltime_slices: dict[Literal['slope', 'mean', 'max', 'min', 'auc'], slice] | None = None,
Extract scalar features from epoched data for use in ML classifiers.
For each requested feature type, a scalar value is computed over the "reltime" axis (optionally restricted to a sub-window). All non-epoch dimensions (channel, chromo, …) are then stacked into a flat "feature" dimension so the result is suitable as a 2-D feature matrix for scikit-learn estimators (rows = epochs, columns = features).
- Parameters:
epochs – DataArray with at least an "epoch" dimension and a "reltime" dimension.
feature_types – One or more of "slope", "mean", "max", "min", "auc". A string is also accepted as a shorthand for a single-element list.
reltime_slices – Optional mapping from feature type to a slice of relative-time values used to restrict the window for that feature. Unspecified feature types use the full reltime range.
- Returns:
xr.DataArray with dimensions (epoch, feature), where feature is a multi-index stacking all non-epoch, non-reltime dimensions and the feature_type label.
- Raises:
ValueError – If an unrecognised feature type is requested.
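To make the shape of the resulting feature matrix concrete, here is a rough NumPy sketch of the same idea. It does not call epoch_features and does not reproduce its internals: the data are made up, the array layout (epoch, channel, reltime) is an assumption, and the slope is assumed to be an ordinary least-squares slope, which may differ from what the library computes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_epochs, n_channels = 4, 3
reltime = np.linspace(0.0, 10.0, 11)  # made-up relative time axis, in seconds

# Made-up epochs: one linear ramp per (epoch, channel), shape (epoch, channel, reltime).
slopes = rng.normal(size=(n_epochs, n_channels))
epochs = slopes[:, :, None] * reltime[None, None, :]


def slope_feature(x, t):
    """Least-squares slope along the last axis (assumed definition of 'slope')."""
    flat = x.reshape(-1, t.size)
    coef = np.polyfit(t, flat.T, deg=1)[0]  # highest-order coefficient is the slope
    return coef.reshape(x.shape[:-1])


def mean_feature(x):
    """Mean along the last axis."""
    return x.mean(axis=-1)


# Restrict the mean to a sub-window, analogous to an entry in reltime_slices.
window = (reltime >= 2.0) & (reltime <= 8.0)

# Stack the per-channel features side by side: rows = epochs, columns = features.
X = np.concatenate(
    [slope_feature(epochs, reltime), mean_feature(epochs[..., window])],
    axis=1,
)
print(X.shape)
```

A matrix of this shape can be passed directly as the feature argument of a scikit-learn estimator's fit method, with one label per epoch.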