Milder temperature makes a hell stable

The hell of "Hell is Game Theory Folk Theorems" is not robust.

To recap: in an iterated game, 100 agents each choose a number between 30 and 100, and for the next 10 seconds they all experience a temperature (in Celsius) equal to the average of their chosen numbers, without being damaged by it. Now it is declared that:

- For round 1, the equilibrium temperature is 99.
- If in round N all agents choose the equilibrium, then the equilibrium temperature for round N+1 is 99.
- Otherwise it's 100.

Since all the other agents seem to follow the equilibrium, it's not in the interest of any individual agent to set its dial lower than 99. Even if it sets the dial to 30, the others will set theirs to 100, and they'll all end up at a temperature of 99.3 °C, worse than if everyone had picked 99.

But suppose an agent decides to just set 30 and disregard whatever the other agents are doing. Now the penalty is saturated for all the other agents too. Each of them could set the punishment value of 100, and the temperature would be 99.3 °C in the next round. Or any one of them could set 30 instead, and... they won't get punished any more than if they set 100. But by setting 30 they lower the temperature through their own choice. So all agents pick 30, and everyone is merely uncomfortably hot instead of boiling. Much better!

So we can fix this particular hell with some more reasoning within game theory.

However, it's possible to set up a more robust hell, at the "cost" of it being milder in its robust state. The original one breaks because a single agent can saturate the penalty, and when that happens the other agents are free to make the "prosocial" choice.

This suggests a solution.
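A minimal sketch of these escape dynamics, assuming one stubborn agent in the first round and best-responding agents afterwards (the function name and round count are illustrative, not part of the setup):

```python
def simulate_original_hell(n_agents=100, n_rounds=4):
    """One stubborn agent sets 30 in round 1; once the penalty is
    saturated, everyone else follows, since defecting costs nothing."""
    equilibrium = 99
    temps = []
    for rnd in range(n_rounds):
        if rnd == 0:
            # One agent sets 30 regardless; the rest follow the equilibrium.
            choices = [30] + [equilibrium] * (n_agents - 1)
        else:
            # The equilibrium is now 100: the punishment is already maximal,
            # so each agent's best response is simply 30.
            choices = [30] * n_agents
        temps.append(sum(choices) / n_agents)
        # Declared rule: stay at 99 only if everyone conformed, else 100.
        equilibrium = 99 if all(c == equilibrium for c in choices) else 100
    return temps
```

In the first round the lone defector barely moves the average (98.31 °C), but once the equilibrium jumps to 100 the punishment is saturated and the temperature drops to 30 °C for every round after.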
You can make your hell robust to m < 30[1] agents deciding to set 30 anyway, with the following rules:

- For round 1, the equilibrium is 99 - m.
- Let dN be the number of agents "defecting" (choosing anything other than the equilibrium) in round N.
- The equilibrium in round N+1 is min(100, 99 - m + dN).

This way the penalty doesn't get saturated until at least m agents decide to pick 30 regardless of what everyone else is doing.

This hell is harder to escape. But I suspect it can still be done with some decision theory (though that requires more knowledge about the other agents).

[1] If only 70 agents are cooperating, then each of them turning their dial up by 1 moves the average temperature up by 0.7 °C, which exactly balances a single agent changing its dial from 100 to 30. So somewhere around m = 30 this breaks, and you'd need to start introducing penalties in bigger steps. Since this is just an illustration, I'm skipping working this out exactly.
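The hardened rule can be sketched the same way, assuming a fixed bloc of stubborn defectors smaller than the threshold m (the particular values of m and the defector count are illustrative):

```python
def robust_hell(n_agents=100, m=10, n_defectors=5, n_rounds=3):
    """Hardened rule: the equilibrium starts at 99 - m and rises by 1
    per defector in the previous round, capped at 100."""
    equilibrium = 99 - m
    temps = []
    for _ in range(n_rounds):
        # The stubborn bloc always picks 30; everyone else conforms.
        choices = [30] * n_defectors + [equilibrium] * (n_agents - n_defectors)
        temps.append(sum(choices) / n_agents)
        d = sum(1 for c in choices if c != equilibrium)
        equilibrium = min(100, 99 - m + d)
    return temps, equilibrium
```

With m = 10 and only 5 defectors, the equilibrium settles at 94 rather than 100: the penalty never saturates, so each conforming agent still has something to lose by joining the defectors.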