Opinion Anthropic: “Statement from Dario Amodei on our discussions with the Department of War” AI News Team February 27, 2026 I believe deeply in the existential importance of using AI to defend the United States…
Opinion Asymmetric Risks of Unfaithful Reasoning: Omission as the Critical Failure Mode for AI Monitoring AI News Team February 27, 2026 TLDR: Faithful reasoning is a representation in a comprehensible lower dimension that is understandable and only…
Opinion Getting Back To It AI News Team February 27, 2026 Artist: Lily Taylor. It’s been a while since I’ve written anything lately, and that doesn’t feel…
Opinion Inference-time Generative Debates on Coding and Reasoning Tasks for Scalable Oversight AI News Team February 27, 2026 By Ethan Elasky and Frank Nakasako (equal contribution). We tested generative debate (where participants freely make…
Opinion A minor point about instrumental convergence that I would like feedback on AI News Team February 27, 2026 Preamble: My current understanding: the EY/MIRI perspective is that superintelligent AI will invariably instrumentally converge on…
Opinion AI welfare as a demotivator for takeover. AI News Team February 27, 2026 TLDR: Superhuman AI may consider takeover the risky option, and we can influence its choices…
Opinion Frontier AI companies probably can’t leave the US AI News Team February 26, 2026 It’s plausible that, over the next few years, US-based frontier AI companies will become very…
Opinion Improving Internal Model Principle AI News Team February 26, 2026 Funded by the Advanced Research + Invention Agency (ARIA) through project code MSAI-SE01-P005. This post was…
Opinion A Positive Case for Faithfulness: LLM Self-Explanations Help Predict Model Behavior AI News Team February 26, 2026 This is a summary of our new paper. TL;DR: Existing faithfulness metrics are not suitable for…
Opinion How Robust Is Monitoring Against Secret Loyalties? AI News Team February 26, 2026 If monitoring is robust, secret loyalties may be very hard to act on—at least for…