GLM Illustrative Example

This notebook explores cedalion’s GLM functionality. Using simulated timeseries with a known ‘ground truth’ allows us to clearly see how the GLM works. We fit several different models to the timeseries and showcase the GLM statistics afforded by cedalion’s statsmodels integration.

[1]:

# This cells setups the environment when executed in Google Colab.
try:
    import google.colab
    !curl -s https://raw.githubusercontent.com/ibs-lab/cedalion/dev/scripts/colab_setup.py -o colab_setup.py
    # Select branch with --branch "branch name" (default is "dev")
    %run colab_setup.py
except ImportError:
    pass

[2]:

import cedalion
import cedalion.io
import cedalion.nirs
import cedalion.dataclasses as cdc
import cedalion.models.glm as glm
import cedalion.plots as plots
import numpy as np
import xarray as xr
import matplotlib.pyplot as p
import pandas as pd
from cedalion import units
from cedalion.sim.synthetic_hrf import RandomGaussianSum
xr.set_options(display_expand_data=False)

[2]:

<xarray.core.options.set_options at 0x7fce3d320650>

Creating a simple simulated timeseries

In this section, we simulate an fNIRS timeseries by generating some random noise and then adding a simple HRF regressor.

1. Build a NDTimeSeries with noise

First, we’ll use the utility function build_timeseries to create a timeseries containing normally distributed noise. The build_timeseries function takes the following arguments:

data : The data values.
dims : The dimension names.
time : The time values.
channel : The channel names.
value_units : The units of the data values.
time_units : The units of the time values.
other_coords : dict[str, ArrayLike] Additional coordinates

[3]:

fs = 10.0 * cedalion.units.Hz # sampling rate
T = 240 * cedalion.units.s # time series length
channel = ["S1D1", "S1D2"] # two channels
chromo = ["HbO", "HbR"] # two chromophores
nsample = int(T * fs)  # number of samples

# create a NDTimeSeries that contains normal distributed noise
ts = cdc.build_timeseries(
    np.random.normal(0, 0.05, (nsample, len(channel), len(chromo))),
    dims=["time", "channel", "chromo"],
    time=np.arange(nsample) / fs,
    channel=channel,
    value_units=units.uM,
    time_units=units.s,
    other_coords={"chromo": chromo},
)
display(ts)

<xarray.DataArray (time: 2400, channel: 2, chromo: 2)> Size: 77kB
[µM] -0.05248 0.003438 -0.003954 -0.005354 ... 0.0001534 0.02894 0.01462
Coordinates:
  * time     (time) float64 19kB 0.0 0.1 0.2 0.3 0.4 ... 239.6 239.7 239.8 239.9
    samples  (time) int64 19kB 0 1 2 3 4 5 6 ... 2394 2395 2396 2397 2398 2399
  * channel  (channel) <U4 32B 'S1D1' 'S1D2'
  * chromo   (chromo) <U3 24B 'HbO' 'HbR'

2. Build the Stimulus DataFrame

A stimulus Dataframe contains information about the experimental stimuli. It should have columns ‘trial_type’, ‘onset’, ‘duration’ and ‘value’.

In this example, we specify two trial types: ‘StimA’, ‘StimB’ and define for each 3 trials with a duration of 10s.

The trials get different values assigned, which we’ll use to control the amplitude of the hemodynamic response.

[4]:

stim = pd.concat(
    (
        pd.DataFrame({"onset": o, "trial_type": "StimA"} for o in [10, 80, 150]),
        pd.DataFrame({"onset": o, "trial_type": "StimB"} for o in [45, 115, 185]),
    )
)

stim["value"] = [0.5, 1, 1.5, 1.25, 0.75, 1.0]
stim["duration"] = 10.
display(stim)

	onset	trial_type	value	duration
0	10	StimA	0.50	10.0
1	80	StimA	1.00	10.0
2	150	StimA	1.50	10.0
0	45	StimB	1.25	10.0
1	115	StimB	0.75	10.0
2	185	StimB	1.00	10.0

4. Add regressors to time series with noise

The time series has two channels: ‘S1D1’ and ‘S1D2’. In this toy example the stimuli will correspond to distinct channels: activations from trial ‘StimA’ should occur only in ‘S1D1’, and ‘StimB’ activations occur only in ‘S1D2’.

The regressors are added with different offsets and scaling factors, which determine the amplitude of the function. Later on we will compare our GLM results to these ‘ground truth’ amplitude values.

[6]:

# define offsets and scaling factors
SCALE_STIMA = 1.25
OFFSET_STIMA = 0.5
SCALE_STIMB = 0.75
OFFSET_STIMB = 0.25

# add scaled regressor and offsets to time series, which up to now contains only noise
ts.loc[:, "S1D1", :] += (
    SCALE_STIMA * dm.sel(regressor="HRF StimA").pint.quantify("uM")
    + OFFSET_STIMA * cedalion.units.uM
)
ts.loc[:, "S1D2", :] += (
    SCALE_STIMB * dm.sel(regressor="HRF StimB").pint.quantify("uM")
    + OFFSET_STIMB * cedalion.units.uM
)

# plot original regressors for StimA and StimB
f, ax = p.subplots(1, 2, sharex=True, sharey=True, figsize=(12,3))
for i, reg in enumerate(["HRF StimA", "HRF StimB"]):
    ax[i].plot(dm.time, dm.sel(regressor=reg, chromo="HbO"), "r-")
    ax[i].plot(dm.time, dm.sel(regressor=reg, chromo="HbR"), "b-")
    ax[i].set_title(f"Reg {reg}")
    plots.plot_stim_markers(ax[i], stim, y=1)
    ax[i].grid(True)
p.tight_layout()

# plot the resulting time series
f, ax = p.subplots(1, 2, sharex=True, sharey=True, figsize=(12,3))
for i, ch in enumerate(ts.channel.values):
    ax[i].plot(ts.time, ts.sel(channel=ch, chromo="HbO"), "r-")
    ax[i].plot(ts.time, ts.sel(channel=ch, chromo="HbR"), "b-")
    ax[i].set_title(f"Ch {ch}")
    ax[i].grid(True)
    plots.plot_stim_markers(ax[i], stim, y=1)
p.tight_layout()

../../_images/examples_modeling_33_glm_illustrative_example_10_0.png

../../_images/examples_modeling_33_glm_illustrative_example_10_1.png

Fitting the GLM - using the same design matrix

In this section, we’ll fit our simulated timeseries using the same design matrix that we used to create it, just to make sure that everything is working as expected.

The method glm.fit is used to fit the GLM to the time series. The required arguments are timeseries and design matrix. We can optionally specify the noise model from the following currently available options:

ols (default): ordinary least squares
rls: recursive least squares
wls: weighted least squares
ar_irls: autoregressive iteratively reweighted least squares (:cite:t:Barker2013)
gls: generalized least squares
glsar: generalized least squares with autoregressive covariance structure

The fit method returns an xr.DataArray of statsmodels RegressionResults objects with dimensions (channel, chromo). Any RegressionResults method can be called on this DataArray using the .sm accessor. For example, we access the betas or model coefficients by using result.sm.params. Please refer to the statsmodels documentation for a full list of methods and attributes.

[7]:

result = glm.fit(ts, dms, noise_model="ols")
display(result)

 50%|█████     | 1/2 [00:00<00:00, 84.90it/s]
 50%|█████     | 1/2 [00:00<00:00, 82.79it/s]

[8]:

betas = result.sm.params
betas

Next, we’ll display our coefficiients in table.

The weights of the fitted model should correspond to the amplitudes of our original timeseries, which we know are just the scaling factors and offsets that we used above. We’ll add these amplitude values in an additional column.

[9]:

df = betas.rename("betas_S1D1").to_dataframe()
# add a column with expected values
df["Expected"] = [
    SCALE_STIMA, 0.0, OFFSET_STIMA,
    SCALE_STIMA, 0.0, OFFSET_STIMA,
    0.0, SCALE_STIMB, OFFSET_STIMB,
    0.0, SCALE_STIMB, OFFSET_STIMB,
]

display(df)

			betas_S1D1	Expected
channel	chromo	regressor
S1D1	HbO	HRF StimA	1.249031	1.25
		HRF StimB	0.003441	0.00
		Drift 0	0.500236	0.50
	HbR	HRF StimA	1.256020	1.25
		HRF StimB	-0.002000	0.00
		Drift 0	0.500822	0.50
S1D2	HbO	HRF StimA	0.001762	0.00
		HRF StimB	0.746638	0.75
		Drift 0	0.249380	0.25
	HbR	HRF StimA	0.007096	0.00
		HRF StimB	0.762541	0.75
		Drift 0	0.250797	0.25

Finally, we use the design matrix and fitted coefficients to predict the timeseries using glm.predict.

[10]:

# helper function to compare original time series and model prediction
def plot_data_to_fit_comparison(ts, pred, stim):
    f, ax = p.subplots(2,1, sharex=True, figsize=(12,4))
    for i, ch in enumerate(ts.channel.values):
        ax[i].plot(ts.time, ts.sel(channel=ch, chromo="HbO"), "r-")
        ax[i].plot(ts.time, ts.sel(channel=ch, chromo="HbR"), "b-")
        ax[i].plot(pred.time, pred.sel(channel=ch, chromo="HbO"), "-", c="#e41a1c", lw=2)
        ax[i].plot(pred.time, pred.sel(channel=ch, chromo="HbR"), "-", c="#377eb8", lw=2)
        ax[i].set_title(f"Ch {ch}")
        plots.plot_stim_markers(ax[i], stim, y=1)
    p.tight_layout()

# use all regressors of the design matrix to predict the time series
pred = glm.predict(ts, betas, dms)
display(pred)
plot_data_to_fit_comparison(ts, pred, stim)


# use only HRF-related regressors, i.e. remove the drift/offset
pred = glm.predict(
    ts,
    # select regressor whose label start with HRF Stim
    betas.sel(regressor=betas.regressor.str.startswith("HRF Stim")),
    dms
)
plot_data_to_fit_comparison(ts, pred, stim)

../../_images/examples_modeling_33_glm_illustrative_example_17_1.png

../../_images/examples_modeling_33_glm_illustrative_example_17_2.png

Fitting the GLM - this time using a slightly different model

In this section, we create a new design matrix that encodes less prior knowledge about the timeseries.

[11]:

# copy the stimulus DataFrame and set all values to 1, i.e.
# there is no prior knowledge about amplitude differences between trials
stim_other = stim.copy()
stim_other["value"] = 1.
display(stim_other)

# this design matrix also uses Gamma basis functions but
# the onset (tau) is delayed and the HRF width (sigma) is longer.
dms_other = (
    glm.design_matrix.hrf_regressors(
        ts, stim, glm.Gamma(tau=1 * units.s, sigma=7 * units.s)
    )
    & glm.design_matrix.drift_regressors(ts, drift_order=0)
)


betas = glm.fit(ts, dms_other, noise_model="ols").sm.params

# display the fitted betas as a DataFrame
display(betas.rename("betas_S1D1").to_dataframe())

	onset	trial_type	value	duration
0	10	StimA	1.0	10.0
1	80	StimA	1.0	10.0
2	150	StimA	1.0	10.0
0	45	StimB	1.0	10.0
1	115	StimB	1.0	10.0
2	185	StimB	1.0	10.0

 50%|█████     | 1/2 [00:00<00:00, 86.59it/s]
 50%|█████     | 1/2 [00:00<00:00, 84.90it/s]

			betas_S1D1
channel	chromo	regressor
S1D1	HbO	HRF StimA	1.032896
		HRF StimB	-0.012818
		Drift 0	0.510603
	HbR	HRF StimA	-0.260037
		HRF StimB	0.005664
		Drift 0	0.498111
S1D2	HbO	HRF StimA	-0.012891
		HRF StimB	0.611035
		Drift 0	0.257623
	HbR	HRF StimA	0.002374
		HRF StimB	-0.160355
		Drift 0	0.249162

[12]:

pred = glm.predict(ts, betas, dms_other)
display(pred)
plot_data_to_fit_comparison(ts, pred, stim_other)


pred = glm.predict(
    ts,
    betas.sel(regressor=betas.regressor.str.startswith("HRF Stim")),
    dms_other
)
plot_data_to_fit_comparison(ts, pred, stim_other)

../../_images/examples_modeling_33_glm_illustrative_example_21_1.png

../../_images/examples_modeling_33_glm_illustrative_example_21_2.png

Creating a new, more complicated time series

In this section, we’ll build a more complicated simulated HRF, following the same steps as in the first part of this notebook. The only thing we do differently is pass a different basis function (RandomGaussianSum) to glm.design_matrix.hrf_regressors.

[13]:

fs = 10.0 * cedalion.units.Hz # sampling rate
T = 240 * cedalion.units.s # time series length
channel = ["S1D1", "S1D2"] # two channels
chromo = ["HbO", "HbR"] # two chromophores
nsample = int(T * fs)  # number of samples

# create a NDTimeSeries that contains normal distributed noise
ts = cdc.build_timeseries(
    np.random.normal(0, 0.05, (nsample, len(channel), len(chromo))),
    dims=["time", "channel", "chromo"],
    time=np.arange(nsample) / fs,
    channel=channel,
    value_units=units.uM,
    time_units=units.s,
    other_coords={"chromo": chromo},
)

# Create a design matrix with a random Gaussian sum HRF
dms = (
    glm.design_matrix.hrf_regressors(
        ts, stim, RandomGaussianSum(
    t_start=0 * units.s,
    t_end=30 * units.s,
    t_delta=2 * units.s,
    t_std=3 * units.s,
    seed=22,
)
    )
    & glm.design_matrix.drift_regressors(ts, drift_order=0)
)

dm = dms.common # there are no channel-wise regressors in this example

# For this simulated example we again want the HbR regressors to be
# inverted and smaller in amplitude than their HbO counterparts.
dm.loc[:, ["HRF StimA", "HRF StimB"], "HbR"] *= -0.25

# define offsets and scaling factors
SCALE_STIMA = 1.25
OFFSET_STIMA = 0.5
SCALE_STIMB = 0.75
OFFSET_STIMB = 0.25

# add scaled regressor and offsets to time series, which up to now contains only noise
ts.loc[:, "S1D1", :] += (
    SCALE_STIMA * dm.sel(regressor="HRF StimA").pint.quantify("uM")
    + OFFSET_STIMA * cedalion.units.uM
)
ts.loc[:, "S1D2", :] += (
    SCALE_STIMB * dm.sel(regressor="HRF StimB").pint.quantify("uM")
    + OFFSET_STIMB * cedalion.units.uM
)

# plot original regressors for StimA and StimB
f, ax = p.subplots(1, 2, sharex=True, sharey=True, figsize=(12,3))
for i, reg in enumerate(["HRF StimA", "HRF StimB"]):
    ax[i].plot(dm.time, dm.sel(regressor=reg, chromo="HbO"), "r-")
    ax[i].plot(dm.time, dm.sel(regressor=reg, chromo="HbR"), "b-")
    ax[i].set_title(f"Reg {reg}")
    plots.plot_stim_markers(ax[i], stim, y=1)
    ax[i].grid(True)
p.tight_layout()

# plot the resulting time series
f, ax = p.subplots(1, 2, sharex=True, sharey=True, figsize=(12,3))
for i, ch in enumerate(ts.channel.values):
    ax[i].plot(ts.time, ts.sel(channel=ch, chromo="HbO"), "r-")
    ax[i].plot(ts.time, ts.sel(channel=ch, chromo="HbR"), "b-")
    ax[i].set_title(f"Ch {ch}")
    ax[i].grid(True)
    plots.plot_stim_markers(ax[i], stim, y=1)
p.tight_layout()

../../_images/examples_modeling_33_glm_illustrative_example_24_0.png

../../_images/examples_modeling_33_glm_illustrative_example_24_1.png

Fitting the GLM - Uncertainty

The aim of this section is to showcase some uncertainty measurements of the GLM fit that are accessible through statsmodels. First, we’ll create a simple design matrix using gamma basis functions, following the same steps as in the previous sections.

[14]:

dms_simple = (
    glm.design_matrix.hrf_regressors(
        ts, stim, glm.Gamma(tau=0 * units.s, sigma=5 * units.s)
    )
    & glm.design_matrix.drift_regressors(ts, drift_order=0)
)

dm_simple = dms_simple.common

dm_simple.loc[:, ["HRF StimA", "HRF StimB"], "HbR"] *= -0.25 # invert HbR regressors

f, ax = p.subplots(1,2,figsize=(12,5))
dm_simple.sel(chromo="HbO").plot(ax=ax[0], vmin=-1, vmax=1, cmap='RdBu_r')
dm_simple.sel(chromo="HbR").plot(ax=ax[1], vmin=-1, vmax=1, cmap='RdBu_r')
p.xticks(rotation=90)
p.show()

f, ax = p.subplots(1,2,figsize=(12,3))
for i,chromo in enumerate(dm_simple.chromo.values):
    for reg in dm_simple.regressor.values:
        ax[i].plot(dm_simple.time, dm_simple.sel(chromo=chromo, regressor=reg), label=reg)
    plots.plot_stim_markers(ax[i], stim, y=1)
    ax[i].set_title(f"chromo={chromo}")
    ax[i].legend()

../../_images/examples_modeling_33_glm_illustrative_example_27_0.png

../../_images/examples_modeling_33_glm_illustrative_example_27_1.png

Now, we fit the GLM and showcase some useful uncertainty measures. As noted above, the glm.fit function returns a DataArray of statsmodels RegressionResults objects, and the full list of available methods can be found in the statsmodels documentation. Here, we calculate the standard error, variance, and confidence intervals of our GLM coefficients.

Even though our HRF is more complicated now, the weights of the fitted simple model should still roughly correspond to the max amplitudes of our original timeseries (TS Amplitude), which we know are just the scaling factors and offsets that we used above.

[15]:

# fit the model to the time series
result = glm.fit(ts, dms_simple, noise_model="ols")
betas = result.sm.params

# display the fitted betas as a DataFrame
df = betas.rename("betas_S1D1").to_dataframe()
# add a column with expected values
df["TS Amplitude"] = [
    SCALE_STIMA, 0.0, OFFSET_STIMA,
    SCALE_STIMA, 0.0, OFFSET_STIMA,
    0.0, SCALE_STIMB, OFFSET_STIMB,
    0.0, SCALE_STIMB, OFFSET_STIMB,
]

# calculate uncertainty measures and add to table
df["Standard Error"] = result.sm.bse.rename("Standard Error").to_dataframe() # standard error
conf_int_df = result.sm.conf_int().to_dataframe(name="conf_int").unstack() # confidence intervals
conf_int_df.columns = ["Confidence Interval Lower", "Confidence Interval Upper"]
df = df.join(conf_int_df)
df["Variance"] = result.sm.regressor_variances().rename("Variance").to_dataframe() # variance

# reorder columns
df = df[["TS Amplitude", "Confidence Interval Lower", "betas_S1D1", "Confidence Interval Upper", "Standard Error", "Variance"]]

display(df)

 50%|█████     | 1/2 [00:00<00:00, 84.51it/s]
 50%|█████     | 1/2 [00:00<00:00, 83.59it/s]

			TS Amplitude	Confidence Interval Lower	betas_S1D1	Confidence Interval Upper	Standard Error	Variance
channel	chromo	regressor
S1D1	HbO	HRF StimA	1.25	0.778787	0.821603	0.864418	0.021834	0.000477
		HRF StimB	0.00	-0.168463	-0.130309	-0.092155	0.019457	0.000379
		Drift 0	0.50	0.573850	0.584339	0.594828	0.005349	0.000029
	HbR	HRF StimA	1.25	0.747414	0.804164	0.860914	0.028940	0.000838
		HRF StimB	0.00	-0.175365	-0.124794	-0.074223	0.025789	0.000665
		Drift 0	0.50	0.475257	0.478733	0.482209	0.001772	0.000003
S1D2	HbO	HRF StimA	0.00	-0.136103	-0.106755	-0.077406	0.014966	0.000224
		HRF StimB	0.75	0.447867	0.474021	0.500174	0.013337	0.000178
		Drift 0	0.25	0.304411	0.311601	0.318791	0.003667	0.000013
	HbR	HRF StimA	0.00	-0.143067	-0.093813	-0.044559	0.025117	0.000631
		HRF StimB	0.75	0.432181	0.476072	0.519964	0.022383	0.000501
		Drift 0	0.25	0.231895	0.234911	0.237928	0.001538	0.000002

Next, we plot the model prediction and compare it to the time series.

[16]:

# use all regressors of the design matrix to predict the time series
pred = glm.predict(ts, betas, dms_simple)
display(pred)
plot_data_to_fit_comparison(ts, pred, stim)


# use only HRF-related regressors, i.e. remove the drift/offset
pred = glm.predict(
    ts,
    # select regressor whose label start with HRF Stim
    betas.sel(regressor=betas.regressor.str.startswith("HRF Stim")),
    dms_simple
)
plot_data_to_fit_comparison(ts, pred, stim)

../../_images/examples_modeling_33_glm_illustrative_example_31_1.png

../../_images/examples_modeling_33_glm_illustrative_example_31_2.png

Fitting with multiple gaussian kernels

In this section, we’ll build a new design matrix using the GaussianKernels basis function. This basis function consists of multiple time-shifted gaussians. Therefore, the model is capable of describing also more complex hemodynamic repsonse - like the one we just simulated. However, the obtained coefficients are not as easy to interprete as in the simpler model above. There the size of the coefficient directly related to the amplitude of the response. Here, the coefficients of the Gaussians encode both the shape and the amplitude.

The GaussianKernels basis function takes the following arguments:

t_pre : time before trial onset
t_post : time after trial onset
t_delta : the temporal spacing between consecutive gaussians
t_std : time width of the gaussians

[17]:

# Build design matrix with Gaussian kernels
dms_gauss = glm.design_matrix.hrf_regressors(
    ts,
    stim,
    glm.GaussianKernels(
        t_pre=5 * units.s, t_post=30 * units.s, t_delta=3 * units.s, t_std=2 * units.s
    ),
) & glm.design_matrix.drift_regressors(ts, drift_order=0)

dm_gauss = dms_gauss.common

# Plot the regressors
f,ax = p.subplots(1,1, figsize=(12,4))
for reg in dm_gauss.regressor.values:
    p.plot(dm_gauss.time, dm_gauss.sel(chromo="HbO", regressor=reg), label=reg)
plots.plot_stim_markers(ax, stim, y=1.)
p.legend(ncol=3, loc="center right")
p.xlim(0,90)

[17]:

(0.0, 90.0)

../../_images/examples_modeling_33_glm_illustrative_example_33_1.png

Now we use the new design matrix to fit the GLM.

Note that we have many regressors and thus no longer can easily infer the HRF amplitude from the coefficients. We also need a more sophisticated way to display the uncertainty of our fit.

[18]:

# Fit the model to the time series
results = glm.fit(ts, dms_gauss, noise_model="ols")
betas = results.sm.params

# translate the xr.DataArray into a pd.DataFrame which are displayed as tables
display(betas.rename("betas_S1D1").to_dataframe())

 50%|█████     | 1/2 [00:00<00:00, 44.06it/s]
 50%|█████     | 1/2 [00:00<00:00, 84.48it/s]

			betas_S1D1
channel	chromo	regressor
S1D1	HbO	HRF StimA 00	0.106093
		HRF StimA 01	-0.213654
		HRF StimA 02	0.321471
		HRF StimA 03	-0.477283
		HRF StimA 04	1.007852
...	...	...	...
S1D2	HbR	HRF StimB 07	0.102107
		HRF StimB 08	-0.205848
		HRF StimB 09	-0.028270
		HRF StimB 10	0.005917
		Drift 0	0.251470

92 rows × 1 columns

We again use the fitted model to predict the time series. This time, the prediction is quite accurate.

[19]:

pred = glm.predict(ts, betas, dms_gauss)
plot_data_to_fit_comparison(ts, pred, stim)


# use only HRF-related regressors, i.e. remove the drift/offset
pred_stim = glm.predict(
    ts,
    # select regressor whose label start with HRF Stim
    betas.sel(regressor=betas.regressor.str.startswith("HRF Stim")),
    dms_gauss
)
plot_data_to_fit_comparison(ts, pred_stim, stim)

../../_images/examples_modeling_33_glm_illustrative_example_37_0.png

../../_images/examples_modeling_33_glm_illustrative_example_37_1.png

In the last part of the notebook, we’ll explore some options for using a derived statistic to enable quantitative comparisons between the prediction and the original timeseries.

First, we compare the peak amplitude and area under the curve of the prediction and the original timeseries.

TODO: uncertainty measurements for these values

[20]:

# calculate peak amplitude and area under the curve (AUC) for each channel and chromophore
peak_amp_pred = pred.max(dim=["time"]).to_dataframe(name="peak_amp_pred")
auc_pred = pred.integrate(coord="time").to_dataframe(name="auc_pred")

# calculate peak amplitude and AUC for the original time series
peak_amp_ts = ts.pint.dequantify().max(dim=["time"]).to_dataframe(name="peak_amp_ts")
auc_ts = ts.pint.dequantify().integrate(coord="time").to_dataframe(name="auc_ts")

# merge the results into a single DataFrame and display
df = peak_amp_ts.join(peak_amp_pred).join(auc_ts).join(auc_pred)

display(df)

		peak_amp_ts	peak_amp_pred	auc_ts	auc_pred
channel	chromo
S1D1	HbO	1.806876	1.739051	153.865968	153.870876
S1D1	HbR	0.669799	0.516425	111.479789	111.481772
S1D2	HbO	1.128190	0.988238	84.260954	84.260495
S1D2	HbR	0.404885	0.261051	53.905973	53.898974

Finally, we try out a way to visualize the uncertainty of the GLM fit. In the simple case, we could just look at the uncertainties of each coefficient. Now that we have so many coefficients, this is no longer feasible.

This method visualizes the uncertainty in a GLM fit by drawing multiple samples of the beta coefficients from their estimated covariance (via multivariate normal sampling). It then uses these sampled betas to generate predicted time courses, and plots the mean prediction with a shaded band representing ±3 standard deviations across samples, thus capturing the variability due to model uncertainty. The band is quite narrow because uncertainty is low in this toy example.

[21]:

cov = results.sm.cov_params()
sampled_betas = xr.zeros_like(betas).expand_dims({"sample" : 100}, axis=-1).copy()
for i_ch in range(sampled_betas.shape[0]):
    for i_cr in range(sampled_betas.shape[1]):
        sampled_betas[i_ch, i_cr, :, :] = np.random.multivariate_normal(
            betas[i_ch, i_cr, :],
            cov[i_ch, i_cr, :, :],
            size=100,
        ).T
pred = glm.predict(ts, sampled_betas, dms_gauss)

pred_mean = pred.mean("sample")
pred_std = pred.std("sample")
mm = pred_mean.loc[slice(0,400), "S1D1", "HbO"]
ss = pred_std.loc[slice(0,400), "S1D1", "HbO"]

p.plot(mm.time, mm, c="r")
p.fill_between(mm.time, mm-3*ss, mm+3*ss, fc="y", alpha=.8)

p.xlabel("time / s")
p.ylabel(r"$\Delta$  c / uM")

[21]:

Text(0, 0.5, '$\\Delta$  c / uM')

../../_images/examples_modeling_33_glm_illustrative_example_41_1.png