Johannes Treutlein
Foundational Research Institute
Decision theory research at FRI
Johannes Treutlein
Foundational Research Institute
A wager for evidential decision theory
Altruistic Newcomb problem
[Figure: predictor Ω and two boxes. Transparent box: one wish. Opaque box (contents unknown, "?"): two wishes if Ω predicts one-boxing; nothing if Ω predicts two-boxing.]
Altruistic Newcomb problem
      S1   S2
A1    2    0
A2    3    1
● A1: One-box; A2: Two-box
● S1: opaque box contains two wishes; S2: opaque box empty
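A minimal sketch of how the payoff table above can be evaluated under an evidential and a causal expected-value rule. The payoffs come from the table; the predictor accuracy, the action-independent probability used for the causal rule, and all function names are assumptions made here for illustration.

```python
# Payoffs from the table: rows are actions (A1, A2), columns are states (S1, S2).
PAYOFF = {
    ("one-box", "full"): 2,   # A1, S1
    ("one-box", "empty"): 0,  # A1, S2
    ("two-box", "full"): 3,   # A2, S1
    ("two-box", "empty"): 1,  # A2, S2
}
ACTIONS = ("one-box", "two-box")


def edt_value(action, accuracy):
    """Expected payoff using P(state | action).

    `accuracy` is an assumed probability that the predictor is right.
    """
    p_full = accuracy if action == "one-box" else 1.0 - accuracy
    return p_full * PAYOFF[(action, "full")] + (1.0 - p_full) * PAYOFF[(action, "empty")]


def cdt_value(action, p_full):
    """Expected payoff using a fixed P(full) that does not depend on the action."""
    return p_full * PAYOFF[(action, "full")] + (1.0 - p_full) * PAYOFF[(action, "empty")]


def best(value_fn, param):
    """Return the action with the highest value under the given rule."""
    return max(ACTIONS, key=lambda a: value_fn(a, param))


if __name__ == "__main__":
    print("EDT picks:", best(edt_value, 0.9))
    print("CDT picks:", best(cdt_value, 0.5))
```

With these payoffs, the evidential value of one-boxing is 2a and that of two-boxing is 3 − 2a for assumed accuracy a, so the sketch favors one-boxing whenever a exceeds 0.75. The causal value of two-boxing exceeds that of one-boxing by exactly 1 for every fixed probability.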
Evidential decision theory
Causal decision theory
Meta decision theory
(Nozick 1993; MacAskill 2016)
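A hedged sketch of how the three slide titles above relate, written in standard textbook notation rather than the slides' own. The symbols $C(T)$ (credence in decision theory $T$), $V_T(A)$ (value that $T$ assigns to action $A$) and $U(A,S)$ (payoff) are assumptions introduced here, and the causal formula uses the simplification that states are causally independent of the action.

```latex
\[
  V_{\mathrm{EDT}}(A) = \sum_{S} P(S \mid A)\, U(A, S),
  \qquad
  V_{\mathrm{CDT}}(A) = \sum_{S} P(S)\, U(A, S),
\]
\[
  V_{\mathrm{meta}}(A) = \sum_{T \in \{\mathrm{EDT},\, \mathrm{CDT}\}} C(T)\, V_{T}(A).
\]
```

On this reading, a meta decision theory picks the action with the highest credence-weighted value, so a theory with small credence can still decide the choice if the stakes it assigns are large enough.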
Altruistic Newcomb problem in a large universe
[Figure: multiple Ω predictors]
EDT Wager
● Large universe
● Caring about the gains of our copies
● Non-zero credence in EDT
● Meta decision theory
Wager for evidential decision theory (and all other theories that take impact of copies into account)
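The four premises above can be combined into a toy calculation. This is a sketch under strong assumptions that are not stated on the slide: copies are perfectly correlated, their gains add linearly, the two theories' values are directly comparable, and the predictor accuracy and copy counts are made-up numbers.

```python
# Payoffs reused from the altruistic Newcomb table.
ONE_BOX_FULL, ONE_BOX_EMPTY = 2, 0
TWO_BOX_FULL, TWO_BOX_EMPTY = 3, 1


def edt_gain_per_copy(accuracy):
    """Evidential value of one-boxing minus two-boxing for a single instance."""
    one_box = accuracy * ONE_BOX_FULL + (1.0 - accuracy) * ONE_BOX_EMPTY
    two_box = (1.0 - accuracy) * TWO_BOX_FULL + accuracy * TWO_BOX_EMPTY
    return one_box - two_box


def meta_advantage_of_one_boxing(credence_edt, n_copies, accuracy):
    """Credence-weighted advantage of one-boxing over two-boxing.

    Assumption: under EDT the choice is evidence about all n correlated
    copies, so the evidential gain scales with n_copies. Under CDT only the
    agent's own box is affected, where two-boxing is better by exactly 1.
    """
    edt_term = n_copies * edt_gain_per_copy(accuracy)
    cdt_term = -1.0
    return credence_edt * edt_term + (1.0 - credence_edt) * cdt_term


if __name__ == "__main__":
    for n in (1, 1000):
        adv = meta_advantage_of_one_boxing(credence_edt=0.01, n_copies=n, accuracy=0.9)
        print(n, "copies ->", "one-box" if adv > 0 else "two-box")
```

In this toy model, a credence of 0.01 in EDT is too small to matter with a single instance, but the evidential term grows with the number of copies while the causal term stays fixed, which is the shape of the wager.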
Relevance
● AI Safety
● Macrostrategy
● Multiverse-wide superrationality (Oesterheld 2017a)
Caspar Oesterheld
Foundational Research Institute
Decision theory and approval-directed agents
Implementing decision theories in AIs
• Two problems of decision theory in AI safety:
  • What is the right decision theory for an AI?
  • How do we implement decision theories in AI?
• Decision theory not explicit in AI architecture
  • Example: Doing what has worked well in the past (Oesterheld 2017b)
  • Exception: Gödel machine (Schmidhuber 2006)
Approval-directed agency
(Christiano 2014)
Two decision theories
Example
In the paper…
If the overseer only looks at the world, the agent’s decision theory (DT) is decisive.
If the overseer only looks at the agent’s action, the overseer’s DT is decisive.
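The two claims above can be illustrated with a toy model. This is a sketch, not the paper's formal setup: it reuses the altruistic Newcomb payoffs, represents a decision theory as a function giving the probability that the opaque box is full, and assumes the overseer's approval equals the payoff (when rating worlds) or the overseer's own expected payoff (when rating actions). All names and numbers are hypothetical.

```python
PAYOFF = {
    ("one-box", "full"): 2,
    ("one-box", "empty"): 0,
    ("two-box", "full"): 3,
    ("two-box", "empty"): 1,
}
ACTIONS = ("one-box", "two-box")


def edt_p_full(action, accuracy=0.9):
    """Evidential probability that the box is full, conditional on the action."""
    return accuracy if action == "one-box" else 1.0 - accuracy


def cdt_p_full(action, prior=0.5):
    """Causal probability that the box is full; it ignores the action."""
    return prior


def expected_payoff(action, p_full_fn):
    p = p_full_fn(action)
    return p * PAYOFF[(action, "full")] + (1.0 - p) * PAYOFF[(action, "empty")]


def choose(agent_dt, overseer_dt, overseer_sees):
    """Pick the action with the highest expected approval."""
    if overseer_sees == "world":
        # Approval depends only on the outcome, so the agent must predict
        # approval itself and does so with its own decision theory.
        score = lambda a: expected_payoff(a, agent_dt)
    elif overseer_sees == "action":
        # The overseer rates each action with its own decision theory;
        # the agent takes those ratings at face value.
        score = lambda a: expected_payoff(a, overseer_dt)
    else:
        raise ValueError("overseer_sees must be 'world' or 'action'")
    return max(ACTIONS, key=score)


if __name__ == "__main__":
    print(choose(edt_p_full, cdt_p_full, "world"))
    print(choose(edt_p_full, cdt_p_full, "action"))
```

In the first call the evidential agent's own probabilities drive the choice; in the second the causal overseer's ratings do, even though the agent is unchanged.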
Thank you.
{johannes,caspar}@foundational-research.org
References
• Ahmed, A. (2014): Evidence, Decision and Causality. Cambridge University Press.
• Almond, P. (2010): On Causation and Correlation. Part 2: Implications of Evidential Decision Theory. https://casparoesterheld.files.wordpress.com/2017/03/correlation2.pdf
• Bostrom, N. (2014b): Superintelligence: Paths, Dangers, Strategies. Oxford University Press.
• Christiano, P. (2014): Model-free decisions. https://ai-alignment.com/model-free-decisions-6e6609f5d99e
• MacAskill, W. (2016): Smokers, Psychos, and Decision-Theoretic Uncertainty. The Journal of Philosophy.
• Nozick, R. (1993): The Nature of Rationality. Princeton: Princeton University Press.
• Oesterheld, C. (2017a): Multiverse-wide Cooperation via Correlated Decision Making. https://foundational-research.org/files/Multiverse-wide-Cooperation-via-Correlated-Decision-Making.pdf
• Oesterheld, C. (2017b): Doing what has worked well in the past leads to evidential decision theory. https://casparoesterheld.files.wordpress.com/2017/09/learningdt.pdf
• Schmidhuber, J. (2006): Gödel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements. ftp://ftp.idsia.ch/pub/juergen/gm6.pdf
• Soares, N. and Fallenstein, B. (2014a): Aligning Superintelligence with Human Interests: A Technical Research Agenda. MIRI Tech. rep. 2014-8. https://intelligence.org/files/TechnicalAgenda.pdf
• Soares, N. and Fallenstein, B. (2014b): Toward Idealized Decision Theory. MIRI Tech. rep. 2014-7. https://arxiv.org/abs/1507.01986
• Soares and Levinstein (2017): Cheating Death in Damascus. https://intelligence.org/files/DeathInDamascus.pdf
