Small Vectors, Big Effects: A Mechanistic Study of RL-Induced Reasoning via Steering Vectors Paper • 2509.06608 • Published Sep 8, 2025
Train One Sparse Autoencoder Across Multiple Sparsity Budgets to Preserve Interpretability and Accuracy Paper • 2505.24473 • Published May 30, 2025
Interpreting and Steering a Text-to-Speech Language Model with Sparse Autoencoders Paper • 2606.10029 • Published 7 days ago • 12
Trust-Region Behavior Blending for On-Policy Distillation Paper • 2605.31159 • Published 18 days ago • 66