Opinion Operationalizing FDT AI News Team March 13, 2026 This post is an attempt to better operationalize FDT (functional decision theory). It answers the…
Opinion Steering Awareness: Models Can Be Trained to Detect Activation Steering AI News Team March 13, 2026 TL;DR: LLMs can be trained to detect activation steering robustly. With lightweight fine-tuning, models learn to…
Opinion All technical alignment plans are steps in the dark AI News Team March 13, 2026 One reason aligning superintelligent AI will be hard is that we don’t get to test…
Opinion A Plan ‘B’ for AI safety AI News Team March 13, 2026 TL;DR: Teaching AI the value of biology in solving future AI-relevant problems may serve as…
Opinion Anthropic vs USG. What will happen by May 1st? Long careful forecast. AI News Team March 13, 2026 On March 4th, 2026, the Pentagon did something it had never done to an American company…
Opinion Ideologies Embed Taboos Against Common Knowledge Formation: a Case Study with LLMs AI News Team March 12, 2026 LLMs are searchable holograms of the text corpus they were trained on. RLHF LLM chat…
Opinion Are AIs more likely to pursue on-episode or beyond-episode reward? AI News Team March 12, 2026 Consider an AI that terminally pursues reward. How dangerous is this? It depends on how…
Opinion Modeling a Constant-Compute Automated AI R&D Process AI News Team March 12, 2026 We’d like to know how much limits on compute scaling will constrain AI R&D. This…
Opinion Forecasting Dojo Meetup – Open discussion about our forecasting process AI News Team March 12, 2026 Hi Everyone, The next meetup of the forecasting practice group is here! This time we'll…