nemos.basis.Category#

class nemos.basis.Category(categories, out_of_category=True, label=None)[source]#

Bases: EvalBasisMixin, AtomicBasisMixin, Basis

Categorical one-hot encoding basis.

Encodes a categorical variable with n_categories unique labels as a one-hot feature matrix of shape (n_samples, n_categories). Each column corresponds to one category: the entry is 1 when the input equals that category, and 0 everywhere else.

Parameters:

categories (Union[List, NDArray, int]) –
The set of valid category labels. Accepted forms:
- int: interpreted as the number of categories; labels default to [0, 1, ..., categories-1].
- list or NDArray: the explicit list of unique category labels. Note that the provided category labels will be sorted and stored as an attribute. Column i of the one-hot encoding will correspond to basis.categories[i]. When a list is provided, it is converted to an NDArray via np.asarray. Mixed-type lists will be cast to a common dtype (e.g., ["a", 1] becomes array(['a', '1'], dtype='<U21')).
out_of_category (bool) – If False, raise an error when the input contains labels that do not belong to categories. If True (the default), encode out-of-category labels as all 0s.
label (Optional[str]) – The label of the basis, intended to be descriptive of the task variable being processed. For example: "trial_type", "stimulus_id".
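
The handling of `categories` described above can be mimicked with plain NumPy. The following is a minimal sketch of the documented behavior (an int read as a count, lists converted with np.asarray and then sorted), not the NeMoS implementation; `normalize_categories` is a hypothetical helper introduced only for illustration.

```python
import numpy as np


def normalize_categories(categories):
    """Hypothetical helper mirroring the documented handling of `categories`."""
    if isinstance(categories, int):
        # An int is read as a count: labels are 0, 1, ..., categories - 1.
        return np.arange(categories)
    # Lists go through np.asarray, so mixed types are cast to a common dtype,
    # and the resulting labels are sorted.
    return np.sort(np.asarray(categories))


int_cats = normalize_categories(3)
str_cats = normalize_categories(["R", "L"])
mixed_cats = normalize_categories(["a", 1])  # the integer 1 becomes the string '1'
```

Under this reading, passing ["R", "L"] would place "L" in column 0 of the encoding, because the labels are sorted before being stored.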

Notes

Design matrix identifiability.

This basis produces a full encoding: one column per category. Because NeMoS GLMs include an intercept, using all columns of a Category basis as a standalone predictor introduces perfect collinearity: the columns sum to a column of ones, which is exactly the intercept column. Always drop one column per categorical variable when using categories as main effects; the dropped category becomes the reference level, and all retained coefficients are contrasts against it.

When Category is multiplied with a continuous basis (the recommended use), the intercept is not involved and no column needs to be dropped.

For a detailed discussion of identifiability, reference-level choice, and the effect of regularization, see the identifiability guide.
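
The collinearity argument can be checked numerically. The sketch below uses only NumPy: it builds a one-hot matrix by hand (a stand-in for the output of compute_features), prepends an explicit intercept column, and compares matrix ranks with and without the reference column. The explicit intercept column is an illustration device; NeMoS GLMs handle the intercept internally.

```python
import numpy as np

labels = np.array([0, 1, 2, 0, 1, 2])
# Hand-built one-hot encoding, shape (n_samples, n_categories).
one_hot = (labels[:, None] == np.arange(3)).astype(float)
intercept = np.ones((labels.size, 1))

# Intercept plus all category columns: 4 columns, but they are linearly dependent.
full = np.hstack([intercept, one_hot])
rank_full = np.linalg.matrix_rank(full)

# Intercept plus all but the first category column: 3 independent columns.
reduced = np.hstack([intercept, one_hot[:, 1:]])
rank_reduced = np.linalg.matrix_rank(reduced)
```

Here `full` has four columns but rank three, while `reduced` has full column rank, so its coefficients are identifiable.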

Examples

Encode a categorical variable with 3 integer labels:

>>> import numpy as np
>>> from nemos.basis import Category
>>> basis = Category(3)
>>> basis.n_basis_funcs
3
>>> labels = np.array([0, 1, 2, 0])
>>> features = basis.compute_features(labels)
>>> features
Array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.],
       [1., 0., 0.]], dtype=...)

The 4 input labels map to a (4, 3) matrix: a 1 in column k of row i means sample i has label basis.categories[k].
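
Since column k corresponds to basis.categories[k], labels can be recovered from a one-hot matrix by taking the position of the 1 in each row. A minimal NumPy sketch, with the feature matrix from the example typed in by hand and `categories` set to the default labels implied by Category(3):

```python
import numpy as np

categories = np.arange(3)  # default labels for 3 categories: 0, 1, 2
features = np.array(
    [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0]]
)

# The column index of the 1 in each row selects the label.
decoded = categories[features.argmax(axis=1)]

# All-zero rows (out-of-category samples) would also give argmax 0,
# so flag the rows that actually contain a 1 before trusting `decoded`.
is_known = features.sum(axis=1) == 1
```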

Standalone categorical predictor with reference coding (drop one column):

>>> basis = Category(["L", "R"])
>>> X = basis.compute_features(np.array(["L", "R", "L", "R"]))
>>> X = X[:, 1:]  # "L" is the reference; remaining column is the contrast between "R" and "L"
>>> X
Array([[0.],
       [1.],
       [0.],
       [1.]], dtype=...)

Category-specific tuning curves via basis product (no column dropping needed):

>>> from nemos.basis import RaisedCosineLinearEval
>>> speed = np.random.randn(20)
>>> context = np.random.choice(["L", "R"], size=20)
>>> bas = Category(["L", "R"]) * RaisedCosineLinearEval(5)
>>> X = bas.compute_features(context, speed)
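
To see why the product yields category-specific tuning curves, it can help to write the feature construction out by hand. The sketch below assumes that the product of two bases is computed as a row-wise outer product of their feature matrices; both that assumption and the resulting column ordering are illustrative, and a random matrix stands in for the continuous basis evaluated at `speed`.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_funcs = 20, 5

context = rng.choice(["L", "R"], size=n_samples)
one_hot = (context[:, None] == np.array(["L", "R"])).astype(float)
# Stand-in for the continuous basis features, shape (n_samples, n_funcs).
continuous = rng.normal(size=(n_samples, n_funcs))

# Row-wise outer product, flattened to (n_samples, n_categories * n_funcs).
X = np.einsum("ni,nj->nij", one_hot, continuous).reshape(n_samples, -1)

is_L = context == "L"
```

In each row only the block belonging to the active category is non-zero, so each category gets its own set of coefficients for the continuous variable.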

Attributes

`bounds`	Returns bounds, as provided.
`categories`
`input_shape`	Expected per-sample input shape.
`is_complex`	Whether the basis is intrinsically complex.
`label`	Label for the basis.
`n_basis_funcs`	Number of basis functions.
`n_output_features`	Number of features returned by the basis.
`out_of_category`

__init__(categories, out_of_category=True, label=None)[source]#

Parameters:

categories (List | NDArray | int)
out_of_category (bool)
label (str | None)

Methods

`__init__`(categories[, out_of_category, label])
`compute_features`(xi)	Evaluate basis at sample points.
`evaluate`(xi)	Evaluate the categorical basis at the provided samples.
`evaluate_on_grid`(*n_samples)	Evaluate the categorical basis on its natural grid.
`get_params`([deep])	From scikit-learn, get parameters by inspecting init.
`set_input_shape`(xi)	Set the expected input shape for the basis object.
`set_params`(**params)	Set the parameters of this estimator.
`setup_basis`(*xi)	Set all basis states.
`split_by_feature`(x[, axis])	Decompose an array along a specified axis into sub-arrays based on the number of expected inputs.
`to_transformer`()	Turn the Basis into a TransformerBasis for use with scikit-learn.

__add__(other)#

Add two Basis objects together.

Parameters:: other (BasisMixin) – The other Basis object to add.
Returns:: The resulting Basis object.
Return type:: AdditiveBasis

classmethod __init_subclass__(**kwargs)#

Set the set_{method}_request methods.

This uses PEP-487 [1] to set the set_{method}_request methods. It looks for the information available in the set default values which are set using __metadata_request__* class attributes, or inferred from method signatures.

The __metadata_request__* class attributes are used when a method does not explicitly accept a metadata through its arguments or if the developer would like to specify a request value for those metadata which are different from the default None.

References

[1] PEP 487, Simpler customisation of class creation: https://peps.python.org/pep-0487/

__iter__()#: Make basis iterable. Re-implemented for additive.

__len__()#: Return the number of additive basis components.

__mul__(other)#

Multiply two Basis objects together.

Parameters:: other (BasisMixin | int) – The other Basis object to multiply.
Return type:: Basis
Returns:: The resulting Basis object.

__pow__(exponent)#

Exponentiation of a Basis object.

Define the power of a basis by repeatedly applying the method __mul__. The exponent must be a positive integer.

Parameters:

exponent (int) – Positive integer exponent

Return type:

BasisMixin

Returns:

The product of the basis with itself “exponent” times. Equivalent to self * self * ... * self.

Raises:

TypeError – If the provided exponent is not an integer.
ValueError – If the integer is zero or negative.
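
The repeated-multiplication rule and its error conditions can be illustrated with a toy class. `ToyBasis` is hypothetical and only records the product structure as a string; it is not a NeMoS class, and the error messages are invented for the sketch.

```python
import operator
from functools import reduce


class ToyBasis:
    """Hypothetical stand-in for a basis; only tracks the product structure."""

    def __init__(self, name):
        self.name = name

    def __mul__(self, other):
        return ToyBasis(f"({self.name} * {other.name})")

    def __pow__(self, exponent):
        if not isinstance(exponent, int):
            raise TypeError("Exponent should be an integer.")
        if exponent <= 0:
            raise ValueError("Exponent should be a positive integer.")
        # self * self * ... * self, with `exponent` factors.
        return reduce(operator.mul, [self] * exponent)


cubed = ToyBasis("b") ** 3
print(cubed.name)
```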

__rmul__(other)#

Right multiplication operator for basis.

Parameters:: other (BasisMixin | int)

__sklearn_clone__()#

Clone the basis while preserving attributes related to input shapes.

This method ensures that input shape attributes (e.g., _input_shape_product, _input_shape_) are preserved during cloning. Reinitializing the class as in the regular sklearn clone would drop these attributes, rendering cross-validation unusable.

Return type:: Basis

property bounds: List[Tuple[float, float]] | Tuple[float, float] | None#: Returns bounds, as provided.

property categories#

compute_features(xi)[source]#

Evaluate basis at sample points.

The basis is evaluated at the locations specified in the inputs. For example, compute_features(np.array([0, .5])) would return the array:

b_1(0) ... b_n(0)
b_1(.5) ... b_n(.5)

where b_i is the i-th basis.

Parameters:: *xi (ArrayLike) – The input samples over which to apply the basis transformation. The samples can be passed as multiple arguments, each representing a different dimension for multivariate inputs.
Return type:: FeatureMatrix
Returns:: A matrix with the transformed features.

Examples

>>> import numpy as np
>>> from nemos.basis import Category
>>> labels = np.array([0, 0, 2, 1])
>>> basis = Category(3)
>>> basis.compute_features(labels)
Array([[1., 0., 0.],
       [1., 0., 0.],
       [0., 0., 1.],
       [0., 1., 0.]], dtype=float...)

evaluate(xi)[source]#

Evaluate the categorical basis at the provided samples.

Encodes each sample label as a one-hot vector of length n_basis_funcs (equal to the number of categories).

Parameters:: xi (ArrayLike | Tsd | TsdFrame | TsdTensor) – Array of category labels. When out_of_category is False, every value must belong to the set of categories defined at construction time. Shape is arbitrary; the returned array appends the category axis as the last dimension.
Return type:: FeatureMatrix
Returns:: One-hot encoded array of shape (*xi.shape, n_basis_funcs).
Raises:: ValueError – If out_of_category is False and any label in xi is not in the set of known categories.

Notes

The evaluate method returns an array of shape (*xi.shape, n_basis_funcs). The method preserves the input shape and appends an extra basis axis.

Examples

>>> import numpy as np
>>> from nemos.basis import Category
>>> basis = Category(3)
>>> x = np.array([[0, 1, 2, 0], [2, 1, 0, 0]])
>>> out = basis.evaluate(x)
>>> out
Array([[[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.],
        [1., 0., 0.]],

       [[0., 0., 1.],
        [0., 1., 0.],
        [1., 0., 0.],
        [1., 0., 0.]]], dtype=...)
>>> x.shape, out.shape
((2, 4), (2, 4, 3))

The input shape (2, 4) is preserved and a length-3 category axis is appended, giving (2, 4, 3). A 1 at out[i, j, k] means that x[i, j] equals basis.categories[k].
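
The shape rule follows naturally from a broadcast comparison. The sketch below is a hypothetical NumPy re-implementation of the documented encoding, including the two documented out_of_category behaviors; `one_hot_sketch` is not part of NeMoS, and the error message is made up.

```python
import numpy as np


def one_hot_sketch(x, categories, out_of_category=True):
    """Hypothetical re-implementation of the documented one-hot encoding."""
    x = np.asarray(x)
    categories = np.asarray(categories)
    # Compare every entry with every category: shape (*x.shape, n_categories).
    out = x[..., None] == categories
    if not out_of_category and not out.any(axis=-1).all():
        raise ValueError("Input contains labels that are not in `categories`.")
    return out.astype(float)


x = np.array([[0, 1, 2, 0], [2, 1, 0, 7]])  # 7 is not a known label
out = one_hot_sketch(x, np.arange(3))
```

With the permissive setting, the unknown label at x[1, 3] is encoded as an all-zero vector; with out_of_category=False the same input would raise.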

evaluate_on_grid(*n_samples)[source]#

Evaluate the categorical basis on its natural grid.

For a categorical basis the grid is intrinsically fixed: one point per category. The n_samples argument is kept for interface compatibility with continuous bases but must equal self.n_basis_funcs.

Parameters:

n_samples (int) – Must be a single integer equal to n_basis_funcs (the number of categories).

Return type:

Tuple[NDArray, NDArray]

Returns:

X – The category labels, shape (n_basis_funcs,).
Y – One-hot encoding of the categories, shape (n_basis_funcs, n_basis_funcs).

Raises:

ValueError – If more than one value is provided, or if the provided value does not match n_basis_funcs.
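
Because the grid has exactly one point per category, the one-hot encoding on the grid is an identity matrix. A minimal sketch of the documented contract, where `grid_sketch` is a hypothetical helper and not the NeMoS method:

```python
import numpy as np


def grid_sketch(categories, *n_samples):
    """Hypothetical helper following the documented evaluate_on_grid contract."""
    categories = np.asarray(categories)
    n_basis_funcs = categories.shape[0]
    # Exactly one value is accepted, and it must equal the number of categories.
    if len(n_samples) != 1 or n_samples[0] != n_basis_funcs:
        raise ValueError(f"n_samples must be a single integer equal to {n_basis_funcs}.")
    # One grid point per category; encoding each category gives the identity.
    return categories, np.eye(n_basis_funcs)


X, Y = grid_sketch(np.array(["L", "R"]), 2)
```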

get_metadata_routing()#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:: routing – A MetadataRequest encapsulating routing information.
Return type:: MetadataRequest

get_params(deep=True)#

From scikit-learn, get parameters by inspecting init.

Parameters:: deep – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Return type:: dict
Returns:: A dictionary containing the parameters. Key is the parameter name, value is the parameter value.

property input_shape#

Expected per-sample input shape.

Returns:: If inputs are shaped (n_samples, *shape), returns shape.

property is_complex#

Whether the basis is intrinsically complex.

Returns:: True if the basis is complex; False otherwise.

Notes

compute_features() always returns a real-valued design matrix. For complex bases (e.g., FourierEval), the real and imaginary parts are returned as separate columns.

property label: str#: Label for the basis.

property n_basis_funcs#: Number of basis functions.

property n_output_features: int | None#

Number of features returned by the basis.

Notes

The number of output features can be determined only when the number of inputs provided to the basis is known. Therefore, before the first call to compute_features, this property will return None. After that call, or after setting the input shape with set_input_shape, n_output_features will be available.

property out_of_category: int | str | float | None#

set_input_shape(xi)[source]#

Set the expected input shape for the basis object.

This method configures the shape of the input data that the basis object expects. xi can be specified as an integer, a tuple of integers, or derived from an array. The method also calculates the total number of input features and output features based on the number of basis functions.

Parameters:

xi (Union[int, tuple[int, ...], NDArray]) –

The input shape specification.
- An integer: Represents the dimensionality of the input. A value of 1 is treated as scalar input.
- A tuple: Represents the exact input shape excluding the first axis (sample axis). All elements must be integers.
- An array: The shape is extracted, excluding the first axis (assumed to be the sample axis).

Raises:

ValueError – If a tuple is provided and it contains non-integer elements.

Returns:

Returns the instance itself to allow method chaining.

Return type:

self

Notes

All state attributes that depend on the input must be set in this method for the basis API to work correctly. In particular, this method is called by setup_basis, which is equivalent to fit for a transformer. If any input-dependent state is not set in this method, then compute_features (equivalent to fit_transform) will break.

Examples

>>> import nemos as nmo
>>> import numpy as np
>>> basis = nmo.basis.Category(3)
>>> # Configure with an integer input:
>>> _ = basis.set_input_shape(1)
>>> basis.n_output_features
3
>>> # Configure with a tuple:
>>> _ = basis.set_input_shape((4, 5))
>>> basis.n_output_features
60
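
The values 3 and 60 in the example are consistent with counting one set of basis functions per input element. A small sketch of that arithmetic, where `n_output_features_sketch` is a hypothetical helper and the treatment of integer inputs follows the parameter description above:

```python
import math


def n_output_features_sketch(input_shape, n_basis_funcs):
    """Hypothetical helper: (inputs per sample) * n_basis_funcs."""
    if isinstance(input_shape, int):
        # A value of 1 is treated as scalar input, i.e. an empty per-sample shape.
        input_shape = () if input_shape == 1 else (input_shape,)
    # math.prod of an empty tuple is 1, which covers the scalar case.
    return math.prod(input_shape) * n_basis_funcs


scalar_case = n_output_features_sketch(1, 3)
tuple_case = n_output_features_sketch((4, 5), 3)  # 4 * 5 inputs, 3 functions each
```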

set_params(**params)#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:: **params (Any) – Estimator parameters.
Returns:: self – Estimator instance.
Return type:: estimator instance
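
The <component>__<parameter> convention can be illustrated without scikit-learn. The sketch below only shows how such keys split into a component name and a parameter name; `split_nested_params` is a hypothetical helper, not the routine scikit-learn or NeMoS uses.

```python
def split_nested_params(params):
    """Group `<component>__<parameter>` keys by component (hypothetical helper)."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split at the first double underscore only.
        component, sep, sub_key = key.partition("__")
        if sep:
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params(
    {"label": "stimulus", "glm__regularizer_strength": 0.1, "basis__n_basis_funcs": 5}
)
```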

setup_basis(*xi)#

Set all basis states.

This method corresponds to the scikit-learn transformer fit. Like fit, it must receive the input and set all basis states, i.e. kernel_ and all the states related to the input shape. The difference between this method and the transformer fit lies in the expected input structure: the transformer fit method requires the inputs to be concatenated in a 2D array, while here each input is provided as a separate time series for each basis element.

Parameters:: xi (NDArray) – Input arrays.
Return type:: Basis
Returns:: The basis, ready for evaluation.

split_by_feature(x, axis=1)[source]#

Decompose an array along a specified axis into sub-arrays based on the number of expected inputs.

This function takes an array (e.g., a design matrix or model coefficients) and splits it along a designated axis.

How it works:

If the basis expects an input shape (n_samples, n_inputs), then the feature axis length will be total_n_features = n_inputs * n_basis_funcs. This axis is reshaped into dimensions (n_inputs, n_basis_funcs).
If the basis expects an input of shape (n_samples,), then the feature axis length will be total_n_features = n_basis_funcs. This axis is reshaped into (1, n_basis_funcs).

For example, if the input array x has shape (1, 2, total_n_features, 4, 5), then after applying this method, it will be reshaped into (1, 2, n_inputs, n_basis_funcs, 4, 5).

The specified axis (axis) determines where the split occurs, and all other dimensions remain unchanged. See the example section below for the most common use cases.
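
The reshaping rule can be reproduced with a plain NumPy reshape. This sketch assumes that the features belonging to each input are contiguous along the feature axis; the values of n_inputs and n_basis_funcs are chosen arbitrarily for illustration.

```python
import numpy as np

n_inputs, n_basis_funcs = 2, 3
total_n_features = n_inputs * n_basis_funcs

# Array with the feature axis in position 2, as in the example above.
x = np.zeros((1, 2, total_n_features, 4, 5))
axis = 2

# Replace the feature axis with (n_inputs, n_basis_funcs); keep all other axes.
new_shape = x.shape[:axis] + (n_inputs, n_basis_funcs) + x.shape[axis + 1:]
split = x.reshape(new_shape)
```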

Parameters:

x (NDArray) –
The input array to be split, representing concatenated features, coefficients, or other data. The shape of x along the specified axis must match the total number of features generated by the basis, i.e., self.n_output_features.

Examples:
- For a design matrix: (n_samples, total_n_features)
- For model coefficients: (total_n_features,) or (total_n_features, n_neurons).
axis (int) – The axis along which to split the features. Defaults to 1. Use axis=1 for design matrices (features along columns) and axis=0 for coefficient arrays (features along rows). All other dimensions are preserved.

Raises:

ValueError – If the shape of x along the specified axis does not match self.n_output_features.

Returns:

A dictionary where:

Key: Label of the basis.
Value: the array reshaped to: (..., n_inputs, n_basis_funcs, ...)

Return type:

dict

Examples

>>> import numpy as np
>>> from nemos.basis import Category
>>> basis = Category(3, label="stimulus")
>>> X = basis.compute_features(np.array([0, 1, 2, 0, 1]))
>>> split_features = basis.split_by_feature(X, axis=1)
>>> for feature, arr in split_features.items():
...     print(f"{feature}: shape {arr.shape}")
stimulus: shape (5, 3)

to_transformer()#

Turn the Basis into a TransformerBasis for use with scikit-learn.

Return type:: TransformerBasis

Examples

Jointly cross-validating basis and GLM parameters with scikit-learn.

>>> import nemos as nmo
>>> import numpy as np
>>> from sklearn.pipeline import Pipeline
>>> from sklearn.model_selection import GridSearchCV
>>> # generate some synthetic data
>>> X, y = np.random.normal(size=(30, 1)), np.random.poisson(size=30)
>>> basis = nmo.basis.RaisedCosineLinearEval(10).set_input_shape(1).to_transformer()
>>> glm = nmo.glm.GLM(regularizer="Ridge", regularizer_strength=1.)
>>> pipeline = Pipeline([("basis", basis), ("glm", glm)])
>>> param_grid = dict(
...     glm__regularizer_strength=(0.1, 0.01, 0.001, 1e-6),
...     basis__n_basis_funcs=(3, 5, 10, 20, 100),
... )
>>> gridsearch = GridSearchCV(
...     pipeline,
...     param_grid=param_grid,
...     cv=5,
... )
>>> gridsearch = gridsearch.fit(X, y)