SToG.selectors

Feature selector implementations.

Classes

CorrelatedSTGLayer(input_dim[, sigma, ...])
    STG with explicit handling of correlated features.

GumbelLayer(input_dim[, temperature, device])
    Gumbel-Softmax based feature selector.

L1Layer(input_dim[, device])
    L1 regularization on input layer weights.

STELayer(input_dim[, device])
    Straight-Through Estimator for feature selection.

STGLayer(input_dim[, sigma, device])
    Stochastic Gates (STG) - Original implementation from Yamada et al. 2020.

class SToG.selectors.STGLayer(input_dim: int, sigma: float = 0.5, device: str = 'cpu')[source]

Bases: BaseFeatureSelector

Stochastic Gates (STG) - original implementation from Yamada et al. 2020. Uses a Gaussian-based continuous relaxation of Bernoulli variables.

Reference: “Feature Selection using Stochastic Gates” (Yamada et al., ICML 2020)

__init__(input_dim: int, sigma: float = 0.5, device: str = 'cpu')[source]

Initialize the layer: one learnable stochastic gate per input feature, with Gaussian noise scale sigma, allocated on the given device.

forward(x: Tensor) → Tensor[source]

Apply stochastic gates to input features.

regularization_loss() → Tensor[source]

Compute regularization: sum of selection probabilities.

get_selection_probs() → Tensor[source]

Get selection probabilities for each feature.
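The gate mechanism can be illustrated with a minimal, self-contained sketch. The class name `STGSketch` and its internals are illustrative assumptions, not SToG's actual implementation: each feature receives a learnable gate mean; during training, Gaussian noise of scale `sigma` is added and the result is hard-clipped to [0, 1].

```python
import torch
import torch.nn as nn

class STGSketch(nn.Module):
    """Illustrative sketch of Gaussian-relaxed stochastic gates (not SToG's actual code)."""

    def __init__(self, input_dim: int, sigma: float = 0.5):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(input_dim))  # learnable gate means
        self.sigma = sigma                              # noise scale of the relaxation

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # During training, relax the Bernoulli gate with Gaussian noise;
        # at inference, use the deterministic mean.
        noise = torch.randn_like(self.mu) * self.sigma if self.training else 0.0
        z = torch.clamp(self.mu + 0.5 + noise, 0.0, 1.0)  # hard clip to [0, 1]
        return x * z

    def regularization_loss(self) -> torch.Tensor:
        # Sum over features of P(gate > 0) under the Gaussian relaxation (Gaussian CDF).
        return torch.sum(0.5 * (1 + torch.erf((self.mu + 0.5) / (self.sigma * 2 ** 0.5))))

layer = STGSketch(10).eval()
out = layer(torch.randn(4, 10))  # at initialization, every gate equals 0.5
```

In a training loop one would minimize the task loss plus a multiple of `layer.regularization_loss()`, which pushes gate means below zero so that the clipped gates close and the corresponding features are dropped.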

class SToG.selectors.STELayer(input_dim: int, device: str = 'cpu')[source]

Bases: BaseFeatureSelector

Straight-Through Estimator for feature selection. Uses binary gates with gradient flow through sigmoid.

Reference: “Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation” (Bengio et al., 2013)

__init__(input_dim: int, device: str = 'cpu')[source]

Initialize the layer: one learnable binary gate per input feature, allocated on the given device.

forward(x: Tensor) → Tensor[source]

Apply straight-through gates to input features.

regularization_loss() → Tensor[source]

Compute regularization: sum of selection probabilities.

get_selection_probs() → Tensor[source]

Get selection probabilities for each feature.
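The straight-through trick can be sketched in a few lines. The class name `STESketch` and its internals are illustrative assumptions, not SToG's actual implementation: the forward pass uses a hard binary gate, while gradients flow through the underlying sigmoid.

```python
import torch
import torch.nn as nn

class STESketch(nn.Module):
    """Illustrative sketch of straight-through binary gates (not SToG's actual code)."""

    def __init__(self, input_dim: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(input_dim))  # one logit per feature

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        probs = torch.sigmoid(self.logits)
        hard = (probs > 0.5).float()  # binary gate used in the forward pass
        # Straight-through estimator: the forward value is `hard`, but the
        # gradient flows through `probs` as if the gate were the sigmoid itself.
        gate = hard + probs - probs.detach()
        return x * gate

    def regularization_loss(self) -> torch.Tensor:
        return torch.sigmoid(self.logits).sum()  # sum of selection probabilities
```

Numerically `probs - probs.detach()` is zero, so the output uses the hard gate; in the backward pass `detach()` blocks the second term, leaving the sigmoid's gradient.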

class SToG.selectors.GumbelLayer(input_dim: int, temperature: float = 1.0, device: str = 'cpu')[source]

Bases: BaseFeatureSelector

Gumbel-Softmax based feature selector. Uses a categorical distribution over {off, on} for each feature.

Reference: “Categorical Reparameterization with Gumbel-Softmax” (Jang et al., ICLR 2017)

This implementation properly handles the batch dimension and sampling.

__init__(input_dim: int, temperature: float = 1.0, device: str = 'cpu')[source]

Initialize the layer: per-feature {off, on} logits and the Gumbel-Softmax temperature, allocated on the given device.

forward(x: Tensor) → Tensor[source]

Apply Gumbel-Softmax gates to input features.

Parameters:
    x – Input tensor of shape [batch_size, input_dim]

Returns:
    Gated input tensor

regularization_loss() → Tensor[source]

Compute regularization: sum of “on” state probabilities.

get_selection_probs() → Tensor[source]

Get selection probabilities for each feature.

set_temperature(temperature: float)[source]

Update temperature for annealing schedule.
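A per-feature two-state Gumbel-Softmax gate can be sketched as follows. The class name `GumbelSketch` and its internals are illustrative assumptions, not SToG's actual implementation: during training, a differentiable sample is drawn from each feature's {off, on} distribution; at inference, the deterministic softmax probabilities are used.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GumbelSketch(nn.Module):
    """Illustrative sketch of per-feature {off, on} Gumbel-Softmax gates (not SToG's actual code)."""

    def __init__(self, input_dim: int, temperature: float = 1.0):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(input_dim, 2))  # [off, on] logits per feature
        self.temperature = temperature

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Differentiable sample from the two-state categorical distribution.
            y = F.gumbel_softmax(self.logits, tau=self.temperature, hard=False)
        else:
            y = F.softmax(self.logits, dim=-1)  # deterministic probabilities at inference
        return x * y[:, 1]  # gate each feature by its "on" weight

    def set_temperature(self, temperature: float) -> None:
        self.temperature = temperature  # hook for an annealing schedule
```

A typical annealing schedule starts the temperature near 1.0 and decays it each epoch via `set_temperature`, so the soft samples harden toward discrete {0, 1} gates as training proceeds.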

class SToG.selectors.CorrelatedSTGLayer(input_dim: int, sigma: float = 0.5, group_penalty: float = 0.1, device: str = 'cpu')[source]

Bases: BaseFeatureSelector

STG with explicit handling of correlated features, using group structure to accommodate feature correlation.

Reference: “Adaptive Group Sparse Regularization for Deep Neural Networks”

__init__(input_dim: int, sigma: float = 0.5, group_penalty: float = 0.1, device: str = 'cpu')[source]

Initialize the layer: stochastic gates with Gaussian noise scale sigma and group penalty weight group_penalty, allocated on the given device.

forward(x: Tensor) → Tensor[source]

Apply correlated stochastic gates to input features.

regularization_loss() → Tensor[source]

Compute regularization with correlation penalty.

get_selection_probs() → Tensor[source]

Get selection probabilities for each feature.
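One plausible form of a correlation-aware regularizer is sketched below. The function `correlated_gate_penalty` is a hypothetical illustration, not the paper's or SToG's exact formulation: on top of the usual sum of selection probabilities, it penalizes pairwise gate disagreement weighted by the strength of feature correlation, so strongly correlated features are encouraged to be selected or dropped together.

```python
import torch

def correlated_gate_penalty(probs: torch.Tensor, corr: torch.Tensor,
                            group_penalty: float = 0.1) -> torch.Tensor:
    """Hypothetical regularizer: base sparsity plus a correlation-weighted
    penalty on pairwise gate disagreement.

    probs: [d] per-feature selection probabilities.
    corr:  [d, d] feature correlation matrix.
    """
    diff = probs.unsqueeze(0) - probs.unsqueeze(1)   # pairwise gate differences
    group_term = (corr.abs() * diff.pow(2)).sum()    # weighted by |correlation|
    return probs.sum() + group_penalty * group_term
```

When all gates agree the group term vanishes and the loss reduces to the plain STG sparsity penalty.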

class SToG.selectors.L1Layer(input_dim: int, device: str = 'cpu')[source]

Bases: BaseFeatureSelector

L1 regularization on input layer weights. Baseline comparison method for feature selection.

__init__(input_dim: int, device: str = 'cpu')[source]

Initialize the layer: one learnable input weight per feature, allocated on the given device.

forward(x: Tensor) → Tensor[source]

Apply L1 weights to input features.

regularization_loss() → Tensor[source]

Compute L1 regularization: sum of absolute weights.

get_selection_probs() → Tensor[source]

Get feature importance (absolute weights).

get_selected_features(threshold: float = 0.1) → ndarray[source]

Get selected features based on weight magnitude.
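The baseline can be sketched in a few lines. The class name `L1Sketch` and its internals are illustrative assumptions, not SToG's actual implementation: each feature gets one multiplicative weight, the regularizer is the plain L1 norm, and features are selected by thresholding the weight magnitudes.

```python
import torch
import torch.nn as nn

class L1Sketch(nn.Module):
    """Illustrative sketch of the L1 feature-selection baseline (not SToG's actual code)."""

    def __init__(self, input_dim: int):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(input_dim))  # one weight per feature

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.weights

    def regularization_loss(self) -> torch.Tensor:
        return self.weights.abs().sum()  # classic L1 penalty

    def get_selected_features(self, threshold: float = 0.1):
        # Indices of features whose absolute weight exceeds the threshold,
        # returned as a NumPy array.
        w = self.weights.detach().abs()
        return torch.nonzero(w > threshold, as_tuple=True)[0].numpy()
```

Unlike the gate-based selectors above, L1 weights shrink toward zero but rarely reach it exactly, which is why selection here requires an explicit magnitude threshold.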