nemos.basis.Category#
- class nemos.basis.Category(categories, out_of_category=True, label=None)[source]#
Bases: EvalBasisMixin, AtomicBasisMixin, Basis
Categorical one-hot encoding basis.
Encodes a categorical variable with n_categories unique labels as a one-hot feature matrix of shape (n_samples, n_categories). Each column corresponds to one category: the entry is 1 when the input equals that category, and 0 everywhere else.
- Parameters:
categories (Union[List, NDArray, int]) – The set of valid category labels. Accepted forms:
- int: interpreted as the number of categories; labels default to [0, 1, ..., categories-1].
- list or NDArray: the explicit list of unique category labels. The provided labels are sorted and stored as an attribute, so column i of the one-hot encoding corresponds to basis.categories[i]. A list is converted to an NDArray via np.asarray, and mixed-type lists are cast to a common dtype (e.g., ["a", 1] becomes array(['a', '1'], dtype='<U21')).
out_of_category (bool) – If False, raise an error when labels that do not belong to categories are provided; otherwise encode out-of-category labels as all 0s.
label (Optional[str]) – The label of the basis, intended to be descriptive of the task variable being processed. For example: "trial_type", "stimulus_id".
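The encoding rules described by these parameters can be sketched in plain NumPy. This is an illustrative stand-in, not the NeMoS implementation: the helper name one_hot and the error message are assumptions made for this example.

```python
import numpy as np


def one_hot(labels, categories, out_of_category=True):
    """Hypothetical helper mimicking the documented encoding rules."""
    cats = np.sort(np.asarray(categories))  # labels are sorted before use
    labels = np.asarray(labels)
    # Compare every label with every category: shape (n_samples, n_categories).
    encoded = (labels[:, None] == cats[None, :]).astype(float)
    unknown = encoded.sum(axis=1) == 0  # rows that matched no category
    if unknown.any() and not out_of_category:
        raise ValueError(f"Labels not in categories: {np.unique(labels[unknown])}")
    return cats, encoded


cats, X = one_hot(["R", "L", "X"], ["R", "L"])
print(cats)  # sorted categories; column i of X corresponds to cats[i]
print(X)     # the out-of-category label "X" is encoded as a row of zeros
```

With out_of_category=False the same call would raise instead of returning a zero row.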
Notes
Design matrix identifiability.
This basis produces a full encoding: one column per category. Because NeMoS GLMs include an intercept, including all columns of a Category basis as a standalone predictor introduces perfect collinearity: the columns sum to the intercept column. Always drop one column per categorical variable when using categories as main effects; the dropped category becomes the reference level and all retained coefficients are contrasts against it.
When Category is multiplied with a continuous basis (the recommended use), the intercept is not involved and no column needs to be dropped.
For a detailed discussion of identifiability, reference-level choice, and the effect of regularization, see the identifiability guide.
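The collinearity argument can be checked numerically. The sketch below uses plain NumPy with an explicit column of ones standing in for the GLM intercept; the variable names are illustrative and nothing here calls NeMoS.

```python
import numpy as np

labels = np.array([0, 1, 2, 0, 2, 1])
one_hot = (labels[:, None] == np.arange(3)).astype(float)  # shape (6, 3)
intercept = np.ones((labels.size, 1))                      # stand-in for the GLM intercept

# Full encoding next to an intercept: 4 columns, but they are linearly dependent.
full = np.hstack([intercept, one_hot])
# Reference coding: drop the column of category 0.
reduced = np.hstack([intercept, one_hot[:, 1:]])

rows_sum_to_one = np.allclose(one_hot.sum(axis=1), 1.0)
rank_full = np.linalg.matrix_rank(full)
rank_reduced = np.linalg.matrix_rank(reduced)
print(rows_sum_to_one, full.shape[1], rank_full, reduced.shape[1], rank_reduced)
```

The full design has more columns than its rank, while the reduced design is full column rank, which is why one column is dropped for main effects.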
Examples
Encode a categorical variable with 3 integer labels:
>>> import numpy as np
>>> from nemos.basis import Category
>>> basis = Category(3)
>>> basis.n_basis_funcs
3
>>> labels = np.array([0, 1, 2, 0])
>>> features = basis.compute_features(labels)
>>> features
Array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.],
       [1., 0., 0.]], dtype=...)
The 4 input labels map to a (4, 3) matrix: a 1 in column k of row i means sample i has label basis.categories[k].
Standalone categorical predictor with reference coding (drop one column):
>>> basis = Category(["L", "R"])
>>> X = basis.compute_features(np.array(["L", "R", "L", "R"]))
>>> X = X[:, 1:]  # "L" is the reference; the remaining column is the contrast between "R" and "L"
>>> X
Array([[0.],
       [1.],
       [0.],
       [1.]], dtype=...)
Category-specific tuning curves via basis product (no column dropping needed):
>>> from nemos.basis import RaisedCosineLinearEval
>>> speed = np.random.randn(20)
>>> context = np.random.choice(["L", "R"], size=20)
>>> bas = Category(["L", "R"]) * RaisedCosineLinearEval(5)
>>> X = bas.compute_features(context, speed)
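To see why a product with a one-hot encoding yields category-specific tuning curves, the sketch below builds the features by hand. It assumes the product is a row-wise outer product of the two feature matrices, and it uses random numbers as a stand-in for the continuous basis output; neither detail is taken from this page.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_funcs = 20, 5
context = np.array(["L", "R"] * 10)
one_hot = (context[:, None] == np.array(["L", "R"])).astype(float)  # (20, 2)
tuning = rng.normal(size=(n_samples, n_funcs))  # stand-in for continuous basis features

# Row-wise outer product, flattened to (n_samples, 2 * n_funcs).
# Columns 0..4 form the "L" block and columns 5..9 form the "R" block.
X = np.einsum("ni,nj->nij", one_hot, tuning).reshape(n_samples, -1)

is_left = context == "L"
print(X.shape)
print(np.allclose(X[is_left, :n_funcs], tuning[is_left]))  # "L" rows carry the tuning features
print(np.allclose(X[is_left, n_funcs:], 0.0))              # and are zero in the "R" block
```

Each category gets its own block of coefficients over the continuous variable, so every sample contributes only to the block of its own category.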
Attributes
bounds – Returns bounds, as provided.
input_shape – Expected per-sample input shape.
is_complex – Whether the basis is intrinsically complex.
label – Label for the basis.
n_basis_funcs – Number of basis functions.
n_output_features – Number of features returned by the basis.
Methods
__init__(categories[, out_of_category, label])
compute_features(xi) – Evaluate basis at sample points.
evaluate(xi) – Evaluate the categorical basis at the provided samples.
evaluate_on_grid(*n_samples) – Evaluate the categorical basis on its natural grid.
get_params([deep]) – From scikit-learn, get parameters by inspecting init.
set_input_shape(xi) – Set the expected input shape for the basis object.
set_params(**params) – Set the parameters of this estimator.
setup_basis(*xi) – Set all basis states.
split_by_feature(x[, axis]) – Decompose an array along a specified axis into sub-arrays based on the number of expected inputs.
to_transformer() – Turn the Basis into a TransformerBasis for use with scikit-learn.
- __add__(other)#
Add two Basis objects together.
- Parameters:
other (BasisMixin) – The other Basis object to add.
- Returns:
The resulting Basis object.
- Return type:
- classmethod __init_subclass__(**kwargs)#
Set the set_{method}_request methods.
This uses PEP-487 [1] to set the set_{method}_request methods. It looks for the information available in the set default values which are set using __metadata_request__* class attributes, or inferred from method signatures.
The __metadata_request__* class attributes are used when a method does not explicitly accept metadata through its arguments, or if the developer would like to specify a request value for that metadata which is different from the default None.
References
- __iter__()#
Make basis iterable. Re-implemented for additive.
- __len__()#
Return the number of additive basis.
- __mul__(other)#
Multiply two Basis objects together.
- __pow__(exponent)#
Exponentiation of a Basis object.
Define the power of a basis by repeatedly applying the __mul__ method. The exponent must be a positive integer.
- Parameters:
exponent (int) – Positive integer exponent.
- Return type:
BasisMixin
- Returns:
The product of the basis with itself “exponent” times. Equivalent to self * self * ... * self.
- Raises:
TypeError – If the provided exponent is not an integer.
ValueError – If the integer is zero or negative.
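The contract above (repeated multiplication, TypeError for non-integers, ValueError for non-positive integers) can be sketched generically. The helper repeated_product and its error messages are hypothetical; integers stand in for basis objects because they also support the * operator.

```python
from functools import reduce
import operator


def repeated_product(obj, exponent):
    """Hypothetical helper: obj * obj * ... * obj, `exponent` times."""
    if not isinstance(exponent, int):
        raise TypeError("Exponent must be an integer.")
    if exponent <= 0:
        raise ValueError("Exponent must be a positive integer.")
    # Fold the * operator over `exponent` copies of the object.
    return reduce(operator.mul, [obj] * exponent)


print(repeated_product(3, 4))  # same as 3 * 3 * 3 * 3
```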
- __sklearn_clone__()#
Clone the basis while preserving attributes related to input shapes.
This method ensures that input shape attributes (e.g., _input_shape_product, _input_shape_) are preserved during cloning. Reinitializing the class as in the regular sklearn clone would drop these attributes, rendering cross-validation unusable.
- Return type:
- property bounds: List[Tuple[float, float]] | Tuple[float, float] | None#
Returns bounds, as provided.
- property categories#
- compute_features(xi)[source]#
Evaluate basis at sample points.
The basis is evaluated at the locations specified in the inputs. For example, compute_features(np.array([0, .5])) would return the array:
b_1(0) ... b_n(0)
b_1(.5) ... b_n(.5)
where b_i is the i-th basis.
- Parameters:
*xi (ArrayLike) – The input samples over which to apply the basis transformation. The samples can be passed as multiple arguments, each representing a different dimension for multivariate inputs.
- Return type:
FeatureMatrix
- Returns:
A matrix with the transformed features.
Examples
>>> import numpy as np
>>> from nemos.basis import Category
>>> labels = np.array([0, 0, 2, 1])
>>> basis = Category(3)
>>> basis.compute_features(labels)
Array([[1., 0., 0.],
       [1., 0., 0.],
       [0., 0., 1.],
       [0., 1., 0.]], dtype=float...)
- evaluate(xi)[source]#
Evaluate the categorical basis at the provided samples.
Encodes each sample label as a one-hot vector of length n_basis_funcs (equal to the number of categories).
- Parameters:
xi (ArrayLike | Tsd | TsdFrame | TsdTensor) – Array of category labels. Every value must belong to the set of categories defined at construction time. Shape is arbitrary; the returned array appends the category axis as the last dimension.
- Return type:
FeatureMatrix
- Returns:
One-hot encoded array of shape (*xi.shape, n_basis_funcs).
- Raises:
ValueError – If any label in xi is not in the set of known categories.
Notes
The evaluate method returns an array of shape (*xi.shape, n_basis_funcs). The method preserves the input shape and appends an extra basis axis.
Examples
>>> import numpy as np
>>> from nemos.basis import Category
>>> basis = Category(3)
>>> x = np.array([[0, 1, 2, 0], [2, 1, 0, 0]])
>>> out = basis.evaluate(x)
>>> out
Array([[[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.],
        [1., 0., 0.]],
       [[0., 0., 1.],
        [0., 1., 0.],
        [1., 0., 0.],
        [1., 0., 0.]]], dtype=...)
>>> x.shape, out.shape
((2, 4), (2, 4, 3))
The input shape (2, 4) is preserved and a length-3 category axis is appended, giving (2, 4, 3). A 1 at out[i, j, k] means that x[i, j] equals basis.categories[k].
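The shape rule can be reproduced with NumPy broadcasting alone. This is a minimal sketch of the idea and does not claim to match how evaluate is implemented internally.

```python
import numpy as np

categories = np.arange(3)
x = np.array([[0, 1, 2, 0], [2, 1, 0, 0]])

# Add a trailing axis to x and compare against all categories at once;
# broadcasting appends the category axis: (2, 4, 1) vs (3,) gives (2, 4, 3).
out = (x[..., None] == categories).astype(float)
print(x.shape, out.shape)

# The position of the 1 along the last axis recovers the original label.
recovered = categories[out.argmax(axis=-1)]
print(np.array_equal(recovered, x))
```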
- evaluate_on_grid(*n_samples)[source]#
Evaluate the categorical basis on its natural grid.
For a categorical basis the grid is intrinsically fixed: one point per category. The n_samples argument is kept for interface compatibility with continuous bases but must equal self.n_basis_funcs.
- Parameters:
n_samples (int) – Must be a single integer equal to n_basis_funcs (the number of categories).
- Return type:
Tuple[NDArray, NDArray]
- Returns:
X – The category labels, shape (n_basis_funcs,).
Y – One-hot encoding of the categories, shape (n_basis_funcs, n_basis_funcs).
- Raises:
ValueError – If more than one value is provided, or if the provided value does not match n_basis_funcs.
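Because the grid contains each category exactly once, in order, its one-hot encoding is an identity matrix. The sketch below illustrates that under the documented contract; categorical_grid is a hypothetical helper with an assumed error message, not the library method.

```python
import numpy as np


def categorical_grid(categories, *n_samples):
    """Hypothetical helper: return (labels, one-hot encoding of the labels)."""
    cats = np.sort(np.asarray(categories))
    if len(n_samples) != 1 or n_samples[0] != cats.size:
        raise ValueError(f"n_samples must be a single integer equal to {cats.size}.")
    # One-hot encoding of every category in sorted order is the identity matrix.
    return cats, np.eye(cats.size)


X, Y = categorical_grid(["R", "L"], 2)
print(X.shape, Y.shape)
```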
- get_metadata_routing()#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
routing – A MetadataRequest encapsulating routing information.
- Return type:
MetadataRequest
- get_params(deep=True)#
From scikit-learn, get parameters by inspecting init.
- Parameters:
deep – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Return type:
- Returns:
A dictionary containing the parameters. Key is the parameter name, value is the parameter value.
- property input_shape#
Expected per-sample input shape.
- Returns:
If inputs are shaped (n_samples, *shape), returns shape.
- property is_complex#
Whether the basis is intrinsically complex.
- Returns:
True if the basis is complex; False otherwise.
Notes
compute_features() always returns a real-valued design matrix. For complex bases (e.g., FourierEval), the real and imaginary parts are returned as separate columns.
- property n_basis_funcs#
Number of basis functions.
- property n_output_features: int | None#
Number of features returned by the basis.
Notes
The number of output features can be determined only when the number of inputs provided to the basis is known. Therefore, before the first call to compute_features, this property will return None. After that call, or after setting the input shape with set_input_shape, n_output_features will be available.
- set_input_shape(xi)[source]#
Set the expected input shape for the basis object.
This method configures the shape of the input data that the basis object expects. xi can be specified as an integer, a tuple of integers, or derived from an array. The method also calculates the total number of input features and output features based on the number of basis functions.
- Parameters:
xi (Union[int, tuple[int, ...], NDArray]) – The input shape specification.
- An integer: Represents the dimensionality of the input. A value of 1 is treated as scalar input.
- A tuple: Represents the exact input shape excluding the first axis (sample axis). All elements must be integers.
- An array: The shape is extracted, excluding the first axis (assumed to be the sample axis).
- Raises:
ValueError – If a tuple is provided and it contains non-integer elements.
- Returns:
Returns the instance itself to allow method chaining.
- Return type:
self
Notes
All state attributes that depend on the input must be set in this method in order for the API of basis to work correctly. In particular, this method is called by setup_basis, which is equivalent to fit for a transformer. If any input-dependent state is not set in this method, then compute_features (equivalent to fit_transform) will break.
Examples
>>> import nemos as nmo
>>> import numpy as np
>>> basis = nmo.basis.Category(3)
>>> # Configure with an integer input:
>>> _ = basis.set_input_shape(1)
>>> basis.n_output_features
3
>>> # Configure with a tuple:
>>> _ = basis.set_input_shape((4, 5))
>>> basis.n_output_features
60
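The numbers in the example are consistent with the number of output features being the product of the per-sample input shape and the number of basis functions. A minimal sketch of that arithmetic, using a hypothetical helper rather than any NeMoS internals:

```python
import math


def count_output_features(input_shape, n_basis_funcs):
    """Hypothetical helper: (number of inputs per sample) * n_basis_funcs."""
    if isinstance(input_shape, int):
        input_shape = (input_shape,)  # an integer is read as a 1D input shape
    return math.prod(input_shape) * n_basis_funcs


print(count_output_features(1, 3))       # scalar input with 3 categories
print(count_output_features((4, 5), 3))  # 4 * 5 inputs per sample, 3 categories each
```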
- set_params(**params)#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (Any) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
- setup_basis(*xi)#
Set all basis states.
This method corresponds to the scikit-learn transformer fit. Like fit, it must receive the input and set all basis states, i.e. kernel_ and all the states relative to the input shape. The difference between this method and the transformer fit is in the expected input structure: the transformer fit method requires the inputs to be concatenated in a 2D array, while here each input is provided as a separate time series for each basis element.
- Parameters:
xi (NDArray) – Input arrays.
- Return type:
- Returns:
The basis, ready for evaluation.
- split_by_feature(x, axis=1)[source]#
Decompose an array along a specified axis into sub-arrays based on the number of expected inputs.
This function takes an array (e.g., a design matrix or model coefficients) and splits it along a designated axis.
How it works:
- If the basis expects an input shape (n_samples, n_inputs), then the feature axis length will be total_n_features = n_inputs * n_basis_funcs. This axis is reshaped into dimensions (n_inputs, n_basis_funcs).
- If the basis expects an input of shape (n_samples,), then the feature axis length will be total_n_features = n_basis_funcs. This axis is reshaped into (1, n_basis_funcs).
For example, if the input array x has shape (1, 2, total_n_features, 4, 5), then after applying this method, it will be reshaped into (1, 2, n_inputs, n_basis_funcs, 4, 5).
The specified axis (axis) determines where the split occurs, and all other dimensions remain unchanged. See the example section below for the most common use cases.
- Parameters:
x (NDArray) –
The input array to be split, representing concatenated features, coefficients, or other data. The shape of x along the specified axis must match the total number of features generated by the basis, i.e., self.n_output_features.
Examples:
- For a design matrix: (n_samples, total_n_features)
- For model coefficients: (total_n_features,) or (total_n_features, n_neurons).
axis (int) – The axis along which to split the features. Defaults to 1. Use axis=1 for design matrices (features along columns) and axis=0 for coefficient arrays (features along rows). All other dimensions are preserved.
- Raises:
ValueError – If the shape of x along the specified axis does not match self.n_output_features.
- Returns:
A dictionary where:
Key: Label of the basis.
Value: the array reshaped to:
(..., n_inputs, n_basis_funcs, ...)
- Return type:
Examples
>>> import numpy as np
>>> from nemos.basis import Category
>>> basis = Category(3, label="stimulus")
>>> X = basis.compute_features(np.array([0, 1, 2, 0, 1]))
>>> split_features = basis.split_by_feature(X, axis=1)
>>> for feature, arr in split_features.items():
...     print(f"{feature}: shape {arr.shape}")
stimulus: shape (5, 3)
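The multi-input reshaping rule described under "How it works" amounts to replacing one axis with a pair of axes. The NumPy sketch below shows only that reshape; split_feature_axis is a hypothetical helper, and it assumes the feature axis is ordered input by input (all basis functions of the first input, then the second, and so on).

```python
import numpy as np


def split_feature_axis(x, n_inputs, n_basis_funcs, axis=1):
    """Hypothetical helper: reshape one axis into (n_inputs, n_basis_funcs)."""
    x = np.asarray(x)
    if x.shape[axis] != n_inputs * n_basis_funcs:
        raise ValueError("Feature axis length must equal n_inputs * n_basis_funcs.")
    new_shape = x.shape[:axis] + (n_inputs, n_basis_funcs) + x.shape[axis + 1:]
    return x.reshape(new_shape)


n_inputs, n_basis_funcs = 2, 3
x = np.zeros((1, 2, n_inputs * n_basis_funcs, 4, 5))
print(split_feature_axis(x, n_inputs, n_basis_funcs, axis=2).shape)

# Coefficient vector: features along axis 0.
coef = np.arange(n_inputs * n_basis_funcs)
print(split_feature_axis(coef, n_inputs, n_basis_funcs, axis=0))
```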
- to_transformer()#
Turn the Basis into a TransformerBasis for use with scikit-learn.
- Return type:
Examples
Jointly cross-validating basis and GLM parameters with scikit-learn.
>>> import numpy as np
>>> import nemos as nmo
>>> from sklearn.pipeline import Pipeline
>>> from sklearn.model_selection import GridSearchCV
>>> # load some data
>>> X, y = np.random.normal(size=(30, 1)), np.random.poisson(size=30)
>>> basis = nmo.basis.RaisedCosineLinearEval(10).set_input_shape(1).to_transformer()
>>> glm = nmo.glm.GLM(regularizer="Ridge", regularizer_strength=1.)
>>> pipeline = Pipeline([("basis", basis), ("glm", glm)])
>>> param_grid = dict(
...     glm__regularizer_strength=(0.1, 0.01, 0.001, 1e-6),
...     basis__n_basis_funcs=(3, 5, 10, 20, 100),
... )
>>> gridsearch = GridSearchCV(
...     pipeline,
...     param_grid=param_grid,
...     cv=5,
... )
>>> gridsearch = gridsearch.fit(X, y)