The BernoulliObservations class models an observed binary variable with a Bernoulli distribution
of a given success probability. With a logit link function (i.e. a logistic inverse link function),
the resulting model is equivalent to logistic regression. The class provides methods for computing the
negative log-likelihood, generating samples, and computing the residual deviance of binary observations.
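As a rough illustration of the quantities named above, the following dependency-free sketch computes a Bernoulli negative log-likelihood and draws samples from predicted success probabilities. The function names (inverse_logit, bernoulli_nll, sample_bernoulli) are hypothetical and are not the class's API; averaging the log-likelihood over samples is an assumption made for this example.

```python
import math
import random


def inverse_logit(x):
    """Logistic inverse link: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bernoulli_nll(observations, predicted_rate):
    """Negative log-likelihood of binary observations, averaged over samples.

    Assumes every predicted rate lies strictly between 0 and 1.
    """
    total = 0.0
    for y, p in zip(observations, predicted_rate):
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(observations)


def sample_bernoulli(predicted_rate, seed=0):
    """Draw one binary sample per predicted success probability."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for p in predicted_rate]


# A linear predictor passed through the logistic inverse link,
# which is the logistic regression setting.
linear_predictor = [-2.0, 0.0, 2.0]
rates = [inverse_logit(x) for x in linear_predictor]
observations = [0, 1, 1]

nll = bernoulli_nll(observations, rates)
samples = sample_bernoulli(rates)
```

The sketch uses plain Python lists for readability; an array library would vectorize the same arithmetic.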
This uses PEP-487 [1] to set the set_{method}_request methods. It
reads the default request values, which are either declared through
__metadata_request__* class attributes or inferred
from method signatures.
The __metadata_request__* class attributes are used when a method
does not explicitly accept a piece of metadata through its arguments, or when the
developer would like to specify a request value for that metadata
that differs from the default None.
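The mechanism described above can be sketched with the PEP-487 hook __init_subclass__. This is a toy illustration only, not the actual implementation: RequestMixin, ToyObservations, the _requests attribute and the helper _make_setter are all hypothetical names, and inference from method signatures is left out.

```python
def _make_setter(method, defaults):
    """Build a set_<method>_request function (hypothetical helper)."""

    def setter(self, **requests):
        unknown = set(requests) - set(defaults)
        if unknown:
            raise TypeError(f"unexpected metadata for {method}: {sorted(unknown)}")
        # Start from the declared defaults, then apply the user's requests.
        store = self.__dict__.setdefault("_requests", {})
        store.setdefault(method, dict(defaults)).update(requests)
        return self

    setter.__name__ = f"set_{method}_request"
    return setter


class RequestMixin:
    """Toy base class that generates setters when it is subclassed."""

    _MARKER = "__metadata_request__"

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Attributes written as __metadata_request__<method> in a class body
        # are name-mangled, so search for the marker instead of a prefix.
        for attr, defaults in list(vars(cls).items()):
            if cls._MARKER not in attr:
                continue
            method = attr[attr.index(cls._MARKER) + len(cls._MARKER):]
            setattr(cls, f"set_{method}_request", _make_setter(method, defaults))


class ToyObservations(RequestMixin):
    # Declares that `score` may receive `sample_weight`, with default request None.
    __metadata_request__score = {"sample_weight": None}


obj = ToyObservations()
returned = obj.set_score_request(sample_weight=True)
```

Because the setter returns the instance, calls can be chained in the usual fluent style.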
Compute the residual deviance for a Bernoulli model.
Parameters:
observations (Array) – The binary observations. Shape (n_time_bins,) or (n_time_bins,n_observations) for population
models (i.e. multiple observations).
predicted_rate (Array) – The predicted rate (success probability). Shape (n_time_bins,) or (n_time_bins,n_observations)
for population models (i.e. multiple observations).
scale (Union[float, Array]) – Scale parameter of the model. For a Bernoulli model, this should be equal to 1.
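The residual deviance is defined as
\[ D(y, \hat{y}) = 2 \left[ \text{LL}(y \mid y) - \text{LL}(y \mid \hat{y}) \right], \]
where \(y\) is the observed data, \(\hat{y}\) is the predicted data, and \(\text{LL}\) is
the model log-likelihood. Lower values of deviance indicate a better fit.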
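A minimal sketch of this computation, assuming the standard definition of residual deviance as twice the gap between the saturated log-likelihood \(\text{LL}(y \mid y)\) and the model log-likelihood \(\text{LL}(y \mid \hat{y})\). The function names are hypothetical, and returning one deviance value per observation is an assumption of this example.

```python
import math


def bernoulli_log_likelihood(y, p):
    """Log-likelihood of one binary observation y under success probability p.

    Terms with a zero coefficient are skipped, so 0 * log(0) counts as 0.
    """
    ll = 0.0
    if y > 0:
        ll += y * math.log(p)
    if y < 1:
        ll += (1 - y) * math.log(1.0 - p)
    return ll


def bernoulli_deviance(observations, predicted_rate):
    """Per-observation residual deviance: 2 * [LL(y | y) - LL(y | y_hat)]."""
    return [
        2.0 * (bernoulli_log_likelihood(y, y) - bernoulli_log_likelihood(y, p))
        for y, p in zip(observations, predicted_rate)
    ]


deviance = bernoulli_deviance([1, 0, 1], [0.9, 0.2, 0.5])
```

For binary observations the saturated term is zero, so each entry reduces to \(-2\,\text{LL}(y \mid \hat{y})\).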
predicted_rate (Array) – The predicted rate values (success probabilities). This is not used in the Bernoulli model for estimating
scale, but is retained for compatibility with the abstract method signature.
dof_resid (Union[float, Array]) – The degrees of freedom of the residuals.
Compute the pseudo-\(R^2\) metric for the GLM, as defined by McFadden et al. [2]
or by Cohen et al. [3].
This metric evaluates the goodness-of-fit of the model relative to a null (baseline) model that assumes a
constant mean for the observations. While the pseudo-\(R^2\) is bounded between 0 and 1 for the
training set, it can yield negative values on out-of-sample data, indicating potential over-fitting.
Parameters:
y (Array) – The neural activity. Expected shape: (n_time_bins,)
predicted_rate (Array) – The mean neural activity. Expected shape: (n_time_bins,)
score_type (Literal['pseudo-r2-McFadden', 'pseudo-r2-Cohen']) – The pseudo-\(R^2\) type.
Returns:
The pseudo-\(R^2\) of the model. A value closer to 1 indicates a better model fit,
whereas a value closer to 0 suggests that the model doesn’t improve much over the null model.
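The two pseudo-\(R^2\) variants are
\[ R^2_{\text{McFadden}} = 1 - \frac{\log(L_M)}{\log(L_0)}, \qquad R^2_{\text{Cohen}} = \frac{D_0 - D_M}{D_0} = 1 - \frac{\log(L_s) - \log(L_M)}{\log(L_s) - \log(L_0)}, \]
where \(L_M\), \(L_0\) and \(L_s\) are the likelihoods of the fitted model, the null model (a
model with only the intercept term), and the saturated model (a model with one parameter per
sample, which attains the maximum value that the likelihood could possibly achieve). \(D_M\) and \(D_0\) are
the model and the null deviance, \(D_i = 2 \left[ \log(L_s) - \log(L_i) \right]\) for \(i=M,0\).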
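The two score types can be sketched as follows, assuming the usual definitions: McFadden's score is one minus the ratio of the model to the null log-likelihood, and Cohen's score is the relative reduction in deviance. The helper names are hypothetical; the null model is taken to predict the mean of the observations, as described above.

```python
import math


def log_likelihood(y, predicted_rate):
    """Summed Bernoulli log-likelihood; 0 * log(0) terms are skipped."""
    total = 0.0
    for yi, pi in zip(y, predicted_rate):
        if yi > 0:
            total += yi * math.log(pi)
        if yi < 1:
            total += (1 - yi) * math.log(1.0 - pi)
    return total


def pseudo_r2(y, predicted_rate, score_type="pseudo-r2-McFadden"):
    """Pseudo-R^2 relative to a constant-mean null model.

    Assumes y contains both zeros and ones, otherwise the null
    log-likelihood is zero and the ratio is undefined.
    """
    null_rate = [sum(y) / len(y)] * len(y)  # intercept-only prediction
    ll_model = log_likelihood(y, predicted_rate)
    ll_null = log_likelihood(y, null_rate)
    if score_type == "pseudo-r2-McFadden":
        return 1.0 - ll_model / ll_null
    if score_type == "pseudo-r2-Cohen":
        ll_sat = log_likelihood(y, y)  # saturated model predicts each sample exactly
        dev_model = 2.0 * (ll_sat - ll_model)
        dev_null = 2.0 * (ll_sat - ll_null)
        return (dev_null - dev_model) / dev_null
    raise ValueError(f"unknown score_type: {score_type!r}")


y = [0, 1, 1, 0, 1]
predicted = [0.2, 0.8, 0.7, 0.1, 0.9]
r2_mcfadden = pseudo_r2(y, predicted, "pseudo-r2-McFadden")
r2_cohen = pseudo_r2(y, predicted, "pseudo-r2-Cohen")
```

For binary data the saturated log-likelihood is zero, so the two scores coincide in this sketch; they differ in general for observation models whose saturated likelihood is not one.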
The method works on simple estimators as well as on nested objects
(such as Pipeline). The latter have
parameters of the form <component>__<parameter> so that it’s
possible to update each component of a nested object.
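The <component>__<parameter> convention can be illustrated with a toy class. This is a sketch of the naming scheme only, not the library's implementation: ToyEstimator and the parameter names used below are hypothetical.

```python
class ToyEstimator:
    """Minimal stand-in for an estimator with get_params/set_params."""

    def __init__(self, **params):
        self._params = dict(params)

    def get_params(self):
        return dict(self._params)

    def set_params(self, **params):
        for key, value in params.items():
            # Split "component__parameter" at the first double underscore.
            name, sep, sub_key = key.partition("__")
            if name not in self._params:
                raise ValueError(f"invalid parameter {name!r}")
            if sep:
                # Delegate the remainder of the key to the nested component.
                self._params[name].set_params(**{sub_key: value})
            else:
                self._params[name] = value
        return self


observation_model = ToyEstimator(scale=1.0)
model = ToyEstimator(observation_model=observation_model, regularizer_strength=0.1)

# Update a top-level parameter and a parameter of the nested component.
model.set_params(regularizer_strength=0.5, observation_model__scale=2.0)
```

Splitting at the first double underscore lets the delegation recurse, so deeper nesting such as a__b__c resolves one level at a time.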