The feature suppresses or identifies specific output tokens, acting as a filter within the model's decision-making process.
The feature is often analyzed alongside similar features within Gemma Scope to understand how the model represents complex concepts or steers its own internal logic.
This feature is a computational "neuron" that activates in response to specific linguistic patterns. According to Neuronpedia, its primary function is associated with negative logits for specific characters or tokens.
In the context of machine learning and neural networks, 124095 identifies a specific "feature", or SAE (Sparse Autoencoder) latent, within the Gemma 2 9b (IT) model, specifically in the 20-GEMMASCOPE-RES-131K source (the layer-20 residual-stream SAE with roughly 131K latents).

Feature 124095: Negative Logits
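To make "negative logits" concrete, here is a minimal sketch of one common way such a list can be derived: project a feature's decoder direction through the model's unembedding matrix and take the most negative entries. Everything in it is an assumption for illustration. The function name `top_negative_logits`, the toy vocabulary, and the tiny matrices are hypothetical and are not the real weights of feature 124095, and the sketch ignores the final normalization that a real model applies before unembedding.

```python
import numpy as np

def top_negative_logits(decoder_direction, unembedding, vocab, k=3):
    """Return the k tokens whose logits the feature pushes down the most.

    decoder_direction: (d_model,) vector the SAE latent writes to the residual stream.
    unembedding:       (d_model, vocab_size) matrix mapping residual stream to logits.
    """
    # Approximate change in each token's logit per unit of feature activation.
    logit_effect = decoder_direction @ unembedding
    # argsort is ascending, so the first k indices are the most negative effects.
    order = np.argsort(logit_effect)[:k]
    return [(vocab[i], float(logit_effect[i])) for i in order]

# Hypothetical toy values (2-dimensional residual stream, 4-token vocabulary).
vocab = ["a", "b", "c", "d"]
decoder_direction = np.array([1.0, 0.0])
unembedding = np.array([
    [0.5, -2.0, 1.0, -0.5],
    [0.0, 1.0, 0.0, 3.0],
])

result = top_negative_logits(decoder_direction, unembedding, vocab, k=2)
print(result)  # [('b', -2.0), ('d', -0.5)]
```

In this toy setup, tokens "b" and "d" are the ones the feature would suppress most strongly when it activates. Using `np.argsort(-logit_effect)` instead would give the tokens the feature promotes.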