Provides guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. Use when discovering interpretable features, analyzing superposition, or studying monosemantic representations in language models.
This skill does not declare a tool allowlist. The agent host applies whatever default tools are available at runtime.
SKILL.md / Manifest
https://raw.githubusercontent.com/zechenzhangagi/ai-research-skills/main/04-mechanistic-interpretability/saelens/SKILL.md
Registry
GitHub (via claudemarketplaces.com)