Hide code cell source
%matplotlib inline
import warnings

# Ignore the first specific warning
warnings.filterwarnings(
    "ignore",
    message="plotting functions contained within `_documentation_utils` are intended for nemos's documentation.",
    category=UserWarning,
)

# Ignore the second specific warning
warnings.filterwarnings(
    "ignore",
    message="Ignoring cached namespace 'core'",
    category=UserWarning,
)

warnings.filterwarnings(
    "ignore",
    message=(
        "invalid value encountered in div "
    ),
    category=RuntimeWarning,
)

Fit Calcium Imaging#

For the example dataset, we will be working with a recording of a freely-moving mouse imaged with a Miniscope (1-photon imaging at 30Hz using the genetically encoded calcium indicator GCaMP6f). The area recorded for this experiment is the postsubiculum - a region that is known to contain head-direction cells, or cells that fire when the animal’s head is pointing in a specific direction.

The data were collected by Sofia Skromne Carrasco from the Peyrache Lab.

import jax
import jax.numpy as jnp
import matplotlib.pyplot as plt
import pynapple as nap
from sklearn.linear_model import LinearRegression

import nemos as nmo

configure plots

plt.style.use(nmo.styles.plot_style)

Data Streaming#

Here we load the data from OSF. The data is a NWB file.

path = nmo.fetch.fetch_data("A0670-221213.nwb")
Downloading file 'A0670-221213.nwb' from 'https://osf.io/sbnaw/download' to '/home/docs/.cache/nemos'.

pynapple preprocessing#

Now that we have the file, let’s load the data. The NWB file contains multiple entries.

data = nap.load_file(path)
print(data)
/home/docs/checkouts/readthedocs.org/user_builds/nemos/envs/stable/lib/python3.11/site-packages/hdmf/spec/namespace.py:535: UserWarning: Ignoring cached namespace 'hdmf-common' version 1.5.1 because version 1.8.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
/home/docs/checkouts/readthedocs.org/user_builds/nemos/envs/stable/lib/python3.11/site-packages/hdmf/spec/namespace.py:535: UserWarning: Ignoring cached namespace 'hdmf-experimental' version 0.2.0 because version 0.5.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
A0670-221213
┍━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━┑
│ Keys                  │ Type        │
┝━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━┥
│ position_time_support │ IntervalSet │
│ RoiResponseSeries     │ TsdFrame    │
│ z                     │ Tsd         │
│ y                     │ Tsd         │
│ x                     │ Tsd         │
│ rz                    │ Tsd         │
│ ry                    │ Tsd         │
│ rx                    │ Tsd         │
┕━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━┙

In the NWB file, the calcium traces are saved the RoiResponseSeries field. Let’s save them in a variable called ‘transients’ and print it.

transients = data['RoiResponseSeries']
print(transients)
Time (s)    0        1        2        3        4        ...
----------  -------  -------  -------  -------  -------  -----
3.1187      0.27546  0.79973  0.16383  0.20118  0.02926  ...
3.15225     0.26665  0.86751  0.15879  0.23682  0.02719  ...
3.18585     0.25796  0.89419  0.15352  0.25074  0.03651  ...
3.2194      0.24943  0.89513  0.14812  0.25215  0.05627  ...
3.253       0.24111  0.88023  0.14898  0.24651  0.07095  ...
3.28655     0.233    0.85584  0.14858  0.23706  0.08147  ...
3.32015     0.22513  1.0996   0.14715  0.22572  0.08859  ...
...         ...      ...      ...      ...      ...      ...
1203.38945  0.20815  0.17535  0.12126  0.09446  0.87427  ...
1203.42305  0.20247  0.17243  0.11807  0.08992  1.2578   ...
1203.4566   0.19654  0.17056  0.11461  0.08508  1.62     ...
1203.4902   0.19052  0.16645  0.11096  0.0802   1.8811   ...
1203.52375  0.18449  0.16105  0.10717  0.07542  2.0599   ...
1203.55735  0.17851  0.15494  0.10331  0.07081  2.2176   ...
1203.5909   0.17264  0.14851  0.09942  0.06643  2.311    ...
dtype: float64, shape: (35757, 65)

transients is a TsdFrame. Each column contains the activity of one neuron.

The mouse was recorded for a 20 minute recording epoch as we can see from the time_support property of the transients object.

ep = transients.time_support
print(ep)
  index    start      end
      0   3.1187  1203.59
shape: (1, 2), time unit: sec.

There are a few different ways we can explore the data. First, let’s inspect the raw calcium traces for neurons 4 and 35 for the first 250 seconds of the experiment.

fig, ax = plt.subplots(1, 2, figsize=(12, 4))
ax[0].plot(transients[:, 4].get(0,250))
ax[0].set_ylabel("Firing rate (Hz)")
ax[0].set_title("Trace 4")
ax[0].set_xlabel("Time(s)")
ax[1].plot(transients[:, 35].get(0,250))
ax[1].set_title("Trace 35")
ax[1].set_xlabel("Time(s)")
plt.tight_layout()
../_images/ae49d9cd9c2de9d05ba548d003571b031b3537acc66a415633701da121521c4a.png

You can see that the calcium signals are both nonnegative, and noisy. One (neuron 4) has much higher SNR than the other. We cannot typically resolve individual action potentials, but instead see slow calcium fluctuations that result from an unknown underlying electrical signal (estimating the spikes from calcium traces is known as deconvolution and is beyond the scope of this demo).

We can also plot tuning curves, plotting mean calcium activity as a function of head direction, using the function compute_1d_tuning_curves_continuous. Here data['ry'] is a Tsd that contains the angular head-direction of the animal between 0 and 2\(\pi\).

tcurves = nap.compute_1d_tuning_curves_continuous(transients, data['ry'], 120)

The function returns a pandas DataFrame. Let’s plot the tuning curves for neurons 4 and 35.

fig, ax = plt.subplots(1, 2, figsize=(12, 4))
ax[0].plot(tcurves.iloc[:, 4])
ax[0].set_xlabel("Angle (rad)")
ax[0].set_ylabel("Firing rate (Hz)")
ax[0].set_title("Trace 4")
ax[1].plot(tcurves.iloc[:, 35])
ax[1].set_xlabel("Angle (rad)")
ax[1].set_title("Trace 35")
plt.tight_layout()
../_images/5c07fd8f87dea525d0fc598cb80ad72ae9c903fadef30df344faf45a3d510de4.png

As a first processing step, let’s bin the calcium traces to a 100ms resolution.

Y = transients.bin_average(0.1, ep)

We can visualize the downsampled transients for the first 50 seconds of data.

plt.figure()
plt.plot(transients[:,0].get(0, 50), linewidth=5, label="30 Hz")
plt.plot(Y[:,0].get(0, 50), '--', linewidth=2, label="10 Hz")
plt.xlabel("Time (s)")
plt.ylabel("Fluorescence")
plt.legend()
plt.show()
../_images/8a1799a92683b5a46d1c07062e048353d0594e2b64398b206e2dcdb5234e6368.png

The downsampling did not destroy the fast transient dynamics, so seems fine to use. We can now move on to using NeMoS to fit a model.

Basis instantiation#

We can define a cyclic-BSpline for capturing the encoding of the heading angle, and a log-spaced raised cosine basis for the coupling filters between neurons. Note that we are not including a self-coupling (spike history) filter, because in practice we have found it results in overfitting.

We can combine the two bases.

heading_basis = nmo.basis.CyclicBSplineEval(n_basis_funcs=12)
coupling_basis = nmo.basis.RaisedCosineLogConv(3, window_size=10)

We need to make sure the design matrix will be full-rank by applying identifiability constraints to the Cyclic Bspline, and then combine the bases (the resturned object will be an AdditiveBasis object).

heading_basis.identifiability_constraints = True
basis = heading_basis + coupling_basis

Gamma GLM#

Until now, we have been modeling spike trains, and have used a Poisson distribution for the observation model. With calcium traces, things are quite different: we no longer have counts but continuous signals, so the Poisson assumption is no longer appropriate. A Gaussian model is also not ideal since the calcium traces are non-negative. To satisfy these constraints, we will use a Gamma distribution from NeMoS with a soft-plus non linearity.

Non-linearity

Different option are possible. With a soft-plus we are assuming an “additive” effect of the predictors, while an exponential non-linearity assumes multiplicative effects. Deciding which firing rate model works best is an empirical question. You can fit different configurations to see which one capture best the neural activity.

model = nmo.glm.GLM(
    solver_kwargs=dict(tol=10**-13),
    regularizer="Ridge",
    regularizer_strength=0.02,
    observation_model=nmo.observation_models.GammaObservations(inverse_link_function=jax.nn.softplus)
)

We select one neuron to fit later, so remove it from the list of predictors

neu = 4
selected_neurons = jnp.hstack(
    (jnp.arange(0, neu), jnp.arange(neu+1, Y.shape[1]))
)

print(selected_neurons)
[ 0  1  2  3  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64]

We need to bring the head-direction of the animal to the same size as the transients matrix. We can use the function bin_average of pynapple. Notice how we pass the parameter ep that is the time_support of the transients.

head_direction = data['ry'].bin_average(0.1, ep)

Let’s check that head_direction and Y are of the same size.

print(head_direction.shape)
print(Y.shape)
(12005,)
(12005, 65)

Design matrix#

We can now create the design matrix by combining the head-direction of the animal and the activity of all other neurons.

X = basis.compute_features(head_direction, Y[:, selected_neurons])
/home/docs/checkouts/readthedocs.org/user_builds/nemos/envs/stable/lib/python3.11/site-packages/pynapple/core/utils.py:196: UserWarning: Converting 'd' to numpy.array. The provided array was of type 'ArrayImpl'.
  warnings.warn(
/home/docs/checkouts/readthedocs.org/user_builds/nemos/envs/stable/lib/python3.11/site-packages/pynapple/core/time_series.py:300: DeprecationWarning: `newshape` keyword argument is deprecated, use `shape=...` or pass shape positionally instead. (deprecated in NumPy 2.1)
  out = func._implementation(*new_args, **kwargs)

Train & test set#

Let’s create a train epoch and a test epoch to fit and test the models. Since X is a pynapple time series, we can create IntervalSet objects to restrict them into a train set and test set.

train_ep = nap.IntervalSet(start=X.time_support.start, end=X.time_support.get_intervals_center().t)
test_ep = X.time_support.set_diff(train_ep) # Removing the train_ep from time_support

print(train_ep)
print(test_ep)
  index    start      end
      0   3.1187  603.355
shape: (1, 2), time unit: sec.
  index    start      end
      0  603.355  1203.59
shape: (1, 2), time unit: sec.

We can now restrict the X and Y to create our train set and test set.

Xtrain = X.restrict(train_ep)
Ytrain = Y.restrict(train_ep)

Xtest = X.restrict(test_ep)
Ytest = Y.restrict(test_ep)

Model fitting#

It’s time to fit the model on the data from the neuron we left out.

model.fit(Xtrain, Ytrain[:, neu])
GLM(
    observation_model=GammaObservations(inverse_link_function=softplus),
    regularizer=Ridge(),
    regularizer_strength=0.02,
    solver_name='GradientDescent',
    solver_kwargs={'tol': 1e-13}
)

Model comparison#

We can compare this to scikit-learn linear regression.

mdl = LinearRegression()
valid = ~jnp.isnan(Xtrain.d.sum(axis=1)) # Scikit learn does not like nans.
mdl.fit(Xtrain[valid], Ytrain[valid, neu])
LinearRegression()
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

We now have 2 models we can compare. Let’s predict the activity of the neuron during the test epoch.

yp = model.predict(Xtest)
ylreg = mdl.predict(Xtest) # Notice that this is not a pynapple object.
/home/docs/checkouts/readthedocs.org/user_builds/nemos/envs/stable/lib/python3.11/site-packages/pynapple/core/utils.py:196: UserWarning: Converting 'd' to numpy.array. The provided array was of type 'ArrayImpl'.
  warnings.warn(

Unfortunately scikit-learn has not adopted pynapple yet. Let’s convert ylreg to a pynapple object to make our life easier.

ylreg = nap.Tsd(t=yp.t, d=ylreg, time_support = yp.time_support)

Let’s plot the predicted activity for the first 60 seconds of data.

# mkdocs_gallery_thumbnail_number = 3

ep_to_plot = nap.IntervalSet(test_ep.start+20, test_ep.start+80)

plt.figure()
plt.plot(Ytest[:,neu].restrict(ep_to_plot), "r", label="true", linewidth=2)
plt.plot(yp.restrict(ep_to_plot), "k", label="gamma-nemos", alpha=1)
plt.plot(ylreg.restrict(ep_to_plot), "g", label="linreg-sklearn", alpha=0.5)
plt.legend(loc='best')
plt.xlabel("Time (s)")
plt.ylabel("Fluorescence")
plt.show()
../_images/14343033b398daf1693050b21d39f8f6fd41e2514f25e29cc8a0071307fc8739.png

While there is some variability in the fit for both models, one advantage of the gamma distribution is clear: the nonnegativity constraint is followed with the data. This is required for using GLMs to predict the firing rate, which must be positive, in response to simulated inputs. See Peyrache et al. 2018\(^{[1]}\) for an example of simulating activity with a GLM.

Another way to compare models is to compute tuning curves. Here we use the function compute_1d_tuning_curves_continuous from pynapple.

real_tcurves = nap.compute_1d_tuning_curves_continuous(transients, data['ry'], 120, ep=test_ep)
gamma_tcurves = nap.compute_1d_tuning_curves_continuous(yp, data['ry'], 120, ep=test_ep)
linreg_tcurves = nap.compute_1d_tuning_curves_continuous(ylreg, data['ry'], 120, ep=test_ep)

Let’s plot them.

fig = plt.figure()
plt.plot(real_tcurves[neu], "r", label="true", linewidth=2)
plt.plot(gamma_tcurves, "k", label="gamma-nemos", alpha=1)
plt.plot(linreg_tcurves, "g", label="linreg-sklearn", alpha=0.5)
plt.legend(loc='best')
plt.ylabel("Fluorescence")
plt.xlabel("Head-direction (rad)")
plt.show()
../_images/04e3e1f6c5945129b9662629f345110771bad2ec07e6a7a32214a4c1de770821.png
Hide code cell source
# save image for thumbnail
from pathlib import Path
import os

root = os.environ.get("READTHEDOCS_OUTPUT")
if root:
   path = Path(root) / "html/_static/thumbnails/tutorials"
# if local store in assets
else:
   path = Path("../_build/html/_static/thumbnails/tutorials")
 
# make sure the folder exists if run from build
if root or Path("../assets/stylesheets").exists():
   path.mkdir(parents=True, exist_ok=True)

if path.exists():
  fig.savefig(path / "plot_06_calcium_imaging.svg")

Gamma-GLM for Calcium Imaging Analysis

Using Gamma-GLMs for fitting calcium imaging data is still in early stages, and hasn’t been through the levels of review and validation that they have for fitting spike data. Users should consider this a relatively unexplored territory, and we hope that we hope that NeMoS will help researchers explore this new space of models.

References#

[1] Peyrache, A., Schieferstein, N. & Buzsáki, G. Transformation of the head-direction signal into a spatial code. Nat Commun 8, 1752 (2017). https://doi.org/10.1038/s41467-017-01908-3