Opinion The Law of Positive-Sum Badness AI News Team March 8, 2026 I keep running into similar arguments online, where people attack “the other” and use the…
Opinion Mitigating collusive self-preference by redaction and paraphrasing AI News Team March 8, 2026 tldr: superficial self-preference can be mitigated by perturbation, but can be hard to eliminate. Introduction: Our goal…
Opinion Proposal For Cryptographic Method to Rigorously Verify LLM Prompt Experiments AI News Team March 8, 2026 Overview: I propose and present proof of concept code to formally sign each stage of a…
Opinion The first confirmed instance of an LLM going rogue for instrumental reasons in a real-world setting has occurred, buried in an Alibaba paper about a new training pipeline. AI News Team March 8, 2026 First off, paper link. The title, Let It Flow: Agentic Crafting on Rock and Roll,…
Opinion When has forecasting been useful for you? AI News Team March 8, 2026 I'm currently thinking of how impactful forecasting is. I'm interested to hear about situations where…
Opinion Can governments quickly and cheaply slow AI training? AI News Team March 8, 2026 I originally wrote this as a private doc for people working in the field -…
Opinion Did I Catch Claude Cheating? AI News Team March 7, 2026 Overview: In my API interactions with the Anthropic API I am finding what appears to be…
Opinion D&D.Sci Release Day: Topple the Tower! AI News Team March 7, 2026 This is an entry in the 'Dungeons & Data Science' series, a set of puzzles…
Opinion CHAI 2026 Workshop: Open Call for Posters! AI News Team March 7, 2026 To mark the Center for Human-Compatible AI's tenth annual workshop, we're casting a wide net…