
Painless Activation Steering

Published on February 15, 2026 5:49 PM GMT

We introduce an automated activation-steering approach that plugs into standard labeled datasets, with no handcrafted prompt pairs or feature annotation. On 18 tasks and 3 open-weight models, the introspective variant (iPAS) yields the strongest behavior improvements, and it stacks on top of in-context learning (ICL) and supervised fine-tuning (SFT).

Full write-up: https://open.substack.com/pub/sashacui/p/painless-activation-steering-pas

Paper: arxiv.org/abs/2509.22739
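For readers new to the idea, here is a minimal sketch of how a steering vector can be derived from labeled examples using a generic difference-of-means recipe. This is an assumption for illustration only: it is not taken from the paper and is not the PAS or iPAS procedure. The function names (`mean_vector`, `steering_vector`, `steer`), the scale `alpha`, and the toy two-dimensional "activations" are all hypothetical.

```python
# Generic difference-of-means steering sketch (NOT the PAS/iPAS algorithm).
# Activations are plain lists of floats standing in for hidden states
# captured at one layer of a model; labels are 1 (target behavior) or 0.

def mean_vector(rows):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def steering_vector(activations, labels):
    """Mean activation of label-1 examples minus mean of label-0 examples."""
    pos = [a for a, y in zip(activations, labels) if y == 1]
    neg = [a for a, y in zip(activations, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each label")
    return [p - q for p, q in zip(mean_vector(pos), mean_vector(neg))]

def steer(hidden, vector, alpha=1.0):
    """Shift a hidden state along the steering vector by a factor alpha."""
    return [h + alpha * v for h, v in zip(hidden, vector)]

# Toy data: four 2-d "activations" with binary labels (hypothetical values).
acts = [[1.0, 0.0], [3.0, 0.0], [0.0, 1.0], [0.0, 3.0]]
labels = [1, 1, 0, 0]

v = steering_vector(acts, labels)
steered = steer([0.0, 0.0], v, alpha=0.5)
```

In a real model, the shift would be applied to the hidden state at a chosen layer during the forward pass; which layer and what scale to use are choices this sketch does not address.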

