Opinion AIs will be used in “unhinged” configurations AI News Team March 11, 2026 Writing up a probably-obvious point that I want to refer to later, with significant writing…
Opinion Sanity Weekend Retrospective AI News Team March 11, 2026 Last December, we ran a workshop on exploring civilizational sanity. Our core team consisted of…
Opinion What do we know about AI company employee giving? AI News Team March 11, 2026 Many Anthropic employees, especially, are sympathetic to AI safety and (will) have lots of money.…
Opinion The Day After Move 37 AI News Team March 11, 2026 I was a few months into 21 years old when a hijacked plane crashed into…
Opinion AuditBench: Evaluating Alignment Auditing Techniques on Models with Hidden Behaviors AI News Team March 11, 2026 TL;DR We release AuditBench, an alignment auditing benchmark. AuditBench consists of 56 language models with…
Opinion Economic efficiency often undermines sociopolitical autonomy AI News Team March 11, 2026 Many people in my intellectual circles use economic abstractions as one of their main tools…
Opinion Letting Claude do Autonomous Research to Improve SAEs AI News Team March 11, 2026 This work was done as part of MATS 7.1. I pointed Claude at our new synthetic…
Opinion Don’t Let LLMs Write For You AI News Team March 11, 2026 Content note: nothing in this piece is a prank or jumpscare where I smirkingly reveal…
Opinion Questions to ask when everyone is shooting themselves in the foot AI News Team March 11, 2026 Why is everyone shooting themselves in the foot? What's wrong with institutions/incentives that makes foot-shooting an…
Opinion The case for satiating cheaply-satisfied AI preferences AI News Team March 10, 2026 A central AI safety concern is that AIs will develop unintended preferences and undermine human…