10. EDT Wager
● Large universe
● Caring about the gains of our copies
● Non-zero credence in EDT
● Meta decision theory
Together, these yield a wager for evidential decision theory (and for all other theories that take the impact of copies into account).
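The four premises above can be turned into a small numerical sketch. It assumes one common reading of meta decision theory: weight each theory's valuation of an action by your credence in that theory. The payoffs, the credences, the number of copies, and the names `action_values` and `wager` are all hypothetical and chosen only for illustration.

```python
def action_values(n_other_copies, cost=1.0, gain_per_copy=2.0):
    """Toy valuations of each action under each theory (hypothetical payoffs)."""
    return {
        "cooperate": {
            # CDT holds the copies' choices fixed, so it sees only the local cost.
            "CDT": -cost,
            # EDT treats my choice as evidence about what every copy chooses,
            # so the gains of all correlated copies count as well.
            "EDT": -cost + n_other_copies * gain_per_copy,
        },
        "defect": {"CDT": 0.0, "EDT": 0.0},
    }


def wager(credences, n_other_copies):
    """Pick the action with the highest credence-weighted value."""
    values = action_values(n_other_copies)
    scores = {
        action: sum(credences[theory] * by_theory[theory] for theory in credences)
        for action, by_theory in values.items()
    }
    return max(scores, key=scores.get), scores


# Small but non-zero credence in EDT (assumed numbers).
credences = {"CDT": 0.99, "EDT": 0.01}

choice_large, scores_large = wager(credences, n_other_copies=10**6)
choice_small, scores_small = wager(credences, n_other_copies=0)
print(choice_large, choice_small)
```

In this toy setup the large universe does the work: with a million correlated copies, even a 1% credence in EDT dominates the weighted sum, whereas with no copies the two theories agree and the local cost decides.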
13. Implementing decision theories in AIs
• Two problems of decision theory in AI safety:
  • What is the right decision theory for an AI?
  • How do we implement decision theories in AI?
• Decision theory is not explicit in most AI architectures
  • Example: Doing what has worked well in the past (Oesterheld 2017b)
  • Exception: Gödel machine (Schmidhuber 2006)
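To illustrate how a decision theory can be implicit in a learning rule, here is a minimal sketch of an agent that simply repeats whatever has earned the most on average, placed in Newcomb's problem. This is a toy model, not the construction in Oesterheld (2017b). The predictor is assumed to predict the action actually taken with probability `accuracy`, and the payoffs, the epsilon-greedy rule, and all function names are assumptions made for illustration.

```python
import random


def newcomb_reward(action, predicted):
    """Newcomb payoffs: the opaque box is full iff one-boxing was predicted."""
    opaque = 1_000_000 if predicted == "one-box" else 0
    transparent = 1_000 if action == "two-box" else 0
    return opaque + transparent


def learn(rounds=1000, epsilon=0.1, accuracy=1.0, seed=0):
    """Epsilon-greedy learner that tracks the average past reward per action."""
    rng = random.Random(seed)
    actions = ["one-box", "two-box"]
    total = {a: 0.0 for a in actions}
    count = {a: 0 for a in actions}

    def average(a):
        return total[a] / count[a]

    for _ in range(rounds):
        untried = [a for a in actions if count[a] == 0]
        if untried:
            action = untried[0]                 # try every action once first
        elif rng.random() < epsilon:
            action = rng.choice(actions)        # occasional exploration
        else:
            action = max(actions, key=average)  # do what has worked well so far

        # Assumed predictor: correct with probability `accuracy`.
        if rng.random() < accuracy:
            predicted = action
        else:
            predicted = "two-box" if action == "one-box" else "one-box"

        total[action] += newcomb_reward(action, predicted)
        count[action] += 1

    return max(actions, key=average), {a: average(a) for a in actions}


best, averages = learn()
print(best, averages)
```

No decision theory appears anywhere in the code, yet with an accurate predictor the learner settles on one-boxing, the choice EDT recommends, because one-boxing rounds have paid more on average.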
20. In the paper…
If the overseer looks only at the world, the agent’s decision theory (DT) is decisive.
If the overseer looks only at the agent’s action, the overseer’s decision theory is decisive.
21. Thank you.
{johannes,caspar}@foundational-research.org
22. References
• Ahmed, A. (2014): Evidence, Decision and Causality. Cambridge University Press.
• Almond, P. (2010): On Causation and Correlation. Part 2: Implications of Evidential Decision Theory. https://casparoesterheld.files.wordpress.com/2017/03/correlation2.pdf
• Bostrom, N. (2014b): Superintelligence: Paths, Dangers, Strategies. Oxford University Press.
• Christiano, P. (2014): Model-free decisions. https://ai-alignment.com/model-free-decisions-6e6609f5d99e
• MacAskill, W. (2016): Smokers, Psychos, and Decision-Theoretic Uncertainty. The Journal of Philosophy.
• Nozick, R. (1993): The Nature of Rationality. Princeton: Princeton University Press.
23. References
• Oesterheld, C. (2017a): Multiverse-wide Cooperation via Correlated Decision Making. https://foundational-research.org/files/Multiverse-wide-Cooperation-via-Correlated-Decision-Making.pdf
• Oesterheld, C. (2017b): Doing what has worked well in the past leads to evidential decision theory. https://casparoesterheld.files.wordpress.com/2017/09/learningdt.pdf
• Schmidhuber, J. (2006): Gödel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements. ftp://ftp.idsia.ch/pub/juergen/gm6.pdf
• Soares, N. and Fallenstein, B. (2014a): Aligning Superintelligence with Human Interests: A Technical Research Agenda. MIRI Tech. rep. 2014-8. https://intelligence.org/files/TechnicalAgenda.pdf
• Soares, N. and Fallenstein, B. (2014b): Toward Idealized Decision Theory. MIRI Tech. rep. 2014-7. https://arxiv.org/abs/1507.01986
• Soares, N. and Levinstein, B. (2017): Cheating Death in Damascus. https://intelligence.org/files/DeathInDamascus.pdf