sparse-autoencoder-training

Community

Provides guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. Use when discovering interpretable features, analyzing superposition, or studying monosemantic representations in language models.
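
For readers unfamiliar with the technique, the sketch below illustrates what a sparse autoencoder computes: an activation vector is encoded into non-negative features, decoded back, and scored by reconstruction error plus an L1 sparsity penalty. It is a plain-Python illustration and not the SAELens API. The function names (`sae_forward`, `sae_loss`), the weight layout, and the `l1_coeff` parameter are assumptions made for this example, and real SAE variants differ in details such as bias handling and the exact penalty.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sae_forward(x, w_enc, b_enc, w_dec, b_dec):
    """Encode x into sparse features, then reconstruct it.

    Hypothetical layout: w_enc has one row per feature (d_sae x d_in),
    w_dec has one row per input dimension (d_in x d_sae).
    """
    # Subtract the decoder bias before encoding (one common convention).
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre_acts = [h + b for h, b in zip(matvec(w_enc, centered), b_enc)]
    features = relu(pre_acts)
    recon = [r + b for r, b in zip(matvec(w_dec, features), b_dec)]
    return features, recon


def sae_loss(x, features, recon, l1_coeff):
    """Mean squared reconstruction error plus an L1 penalty on the features."""
    mse = sum((xi - ri) ** 2 for xi, ri in zip(x, recon)) / len(x)
    l1 = sum(abs(f) for f in features)
    return mse + l1_coeff * l1


# Toy example with identity weights: the negative input is zeroed by ReLU.
x = [1.0, -2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [0.0, 0.0]
features, recon = sae_forward(x, identity, zeros, identity, zeros)
loss = sae_loss(x, features, recon, l1_coeff=0.5)
```

In training, the weights would be adjusted by gradient descent to lower this loss over many activation vectors; the L1 term is what pushes most features to zero on any given input.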

Claude

8,991 stars, updated 4 days ago

Allowed Tools

This skill does not declare a tool allowlist. The agent host applies whatever default tools are available at runtime.

Source

SKILL.md / Manifest

https://raw.githubusercontent.com/zechenzhangagi/ai-research-skills/main/04-mechanistic-interpretability/saelens/SKILL.md

Registry

github (via claudemarketplaces.com)

Trust Score

55 (Fair)

Verification: 10/30
