Sparse Autoencoders (SAE)
- a type of neural network used for unsupervised feature learning and dimensionality reduction
- a variant of the standard autoencoder with an added sparsity constraint on the hidden-layer activations, so only a few hidden units are active for any given input
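The two bullets above can be sketched as a forward pass plus a loss. This is a minimal, dependency-free illustration, not a reference implementation. It assumes a single hidden layer with ReLU and an L1 penalty as the sparsity constraint (other choices, such as a KL-divergence penalty, exist). All names here (`sae_forward`, `sae_loss`, `l1_coeff`) are made up for the example.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


def matvec(W, x):
    # W is a list of rows; returns the matrix-vector product W @ x.
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    # Encoder: hidden activations h = ReLU(W_enc x + b_enc).
    h = [relu(z + b) for z, b in zip(matvec(W_enc, x), b_enc)]
    # Decoder: reconstruction x_hat = W_dec h + b_dec.
    x_hat = [z + b for z, b in zip(matvec(W_dec, h), b_dec)]
    return h, x_hat


def sae_loss(x, h, x_hat, l1_coeff):
    # Standard autoencoder part: mean squared reconstruction error.
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    # Sparsity constraint (assumed here to be an L1 penalty on h).
    l1 = sum(abs(v) for v in h)
    return mse + l1_coeff * l1


def init_matrix(rows, cols, rng, scale=0.1):
    # Small random weights; the scale is an arbitrary illustrative choice.
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_in, n_hidden = 4, 8  # hypothetical sizes
W_enc = init_matrix(n_hidden, n_in, rng)
W_dec = init_matrix(n_in, n_hidden, rng)
b_enc = [0.0] * n_hidden
b_dec = [0.0] * n_in

x = [1.0, 0.0, -0.5, 2.0]
h, x_hat = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
loss = sae_loss(x, h, x_hat, l1_coeff=0.01)
print(len(h), len(x_hat), loss)
```

Training would minimise this loss over many inputs. The `l1_coeff` term trades reconstruction quality against how many hidden units stay at zero.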
Use Cases
- can be used for LLM interpretability: an SAE trained on a model's internal activations breaks them into sparse features that are easier to inspect than raw neurons
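As a rough sketch of what inspecting features can look like: because the hidden activations are sparse, only a handful are non-zero for a given input, so they can simply be listed. The function name `top_active_features` and the example activation values are hypothetical, not taken from any particular library or model.

```python
def top_active_features(h, k=3):
    # Keep only features that actually fired (non-zero activation),
    # then return the k strongest as (feature_index, activation) pairs.
    active = [(i, v) for i, v in enumerate(h) if v > 0.0]
    active.sort(key=lambda pair: pair[1], reverse=True)
    return active[:k]


# Hypothetical SAE hidden activations for one input: mostly zeros.
h = [0.0, 2.5, 0.0, 0.1, 0.0, 1.2]
print(top_active_features(h, k=2))  # [(1, 2.5), (5, 1.2)]
```

In practice, each feature index would then be examined across many inputs to guess what it responds to.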