
An Informal Definition of Goals for Embedded Agents

This post was written as part of research done at MATS 9.0 under the mentorship of Richard Ngo.

You can conceptualise embedded agents as inducing a partition[1] of the world into “the agent”, “the external world”, and the dynamics that mediate their interaction; the dynamics include observations and actions. The agent has beliefs, which can be thought of as a generative model of the world. This model contains a generative self-model of “the agent” and its relationship to the world through the intermediate dynamics[2].

An agent’s self-model contains probable events that are statistically dependent on her actions. They are likely, but only if she acts to make them happen. These events are her goals.

[1] See Demski 2025 and Critch 2022 for mathematical treatments of partitions.
[2] This model may or may not be interpretable.
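One way to make the informal definition concrete is a toy sketch. Everything in it is an assumption made for illustration, not part of the post's claims: the action and event names, the probabilities, the two thresholds, and the use of the spread in conditional probability as a crude stand-in for "statistically dependent". It treats the self-model as a prior over the agent's own actions plus a table of event probabilities given each action, and labels an event a goal when it is both probable overall and sensitive to which action is taken.

```python
# Hypothetical toy self-model: the agent's beliefs about her own next action,
# and about how likely each event is given each action.
action_prior = {"water_plant": 0.9, "do_nothing": 0.1}
event_given_action = {
    "plant_survives": {"water_plant": 0.95, "do_nothing": 0.05},
    "sun_rises": {"water_plant": 0.99, "do_nothing": 0.99},
    "plant_wilts": {"water_plant": 0.05, "do_nothing": 0.95},
}


def marginal(event: str) -> float:
    """Probability of the event under the self-model, averaging over actions."""
    return sum(action_prior[a] * p for a, p in event_given_action[event].items())


def dependence(event: str) -> float:
    """Crude dependence measure: spread of P(event | action) across actions."""
    probs = event_given_action[event].values()
    return max(probs) - min(probs)


def is_goal(event: str, prob_threshold: float = 0.5, dep_threshold: float = 0.1) -> bool:
    """An event counts as a goal if it is probable AND depends on the action.

    Both thresholds are arbitrary choices made for this sketch.
    """
    return marginal(event) >= prob_threshold and dependence(event) >= dep_threshold


goals = [e for e in event_given_action if is_goal(e)]
print(goals)
```

In this toy, "sun_rises" is probable but does not depend on the agent's action, and "plant_wilts" depends on her action but is improbable under her self-model, so only "plant_survives" comes out as a goal. A more careful treatment might replace the spread with mutual information between action and event.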

