An Adaptive Proportional Value-per-Click Agent for Bidding in Ad Auctions
      Trading Agent Design and Analysis Workshop 2011

  Kyriakos C. Chatzidimitriou              AUTH/CERTH
  Lampros C. Stavrogiannis        Univ. of Southampton
  Andreas L. Symeonidis                    AUTH/CERTH
  Pericles A. Mitkas                       AUTH/CERTH
Introduction
• Basic idea: working paper of Dr. Yevgeniy
  Vorobeychik regarding QuakTAC 2009 entry
• Since this initial work, we have:
      –   Conducted more Game-Theoretic experiments
      –   Improved conversions estimation
      –   Improved user distribution estimation
      –   Included an adaptive component
• Ended up with (more or less) the same
  “Ultimate Answer to the Ultimate Question of Life, The
  Universe, and Everything” for the TAC Ad Auctions Game: α = 0.3

TADA@IJCAI 2011                Mertacor                          2
Basic Strategy: VPC
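       \[ \text{bid}^q_{d+1} = \alpha \cdot \hat{v}^q_{d+1} \]
       \[ \hat{v}^q = \widehat{\Pr}\{\text{conversion}^q \mid \text{click}\} \cdot E[\text{revenue}^q \mid \text{conversion}] \]
       \[ \widehat{\Pr}\{\text{conversion}^q \mid \text{click}\} = \widehat{\text{focusedPercentage}}^q \cdot \Pr\{\text{conversion}^q \mid \text{focused}\}(\hat{I}_{d+1}) \]

  Labelled terms: A = E[revenue^q | conversion], B = focusedPercentage^q,
  C = I_{d+1}, D = α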
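The decomposition above can be sketched in Python as follows. This is a minimal illustration rather than the Mertacor code: the function names and all numeric inputs except α = 0.3 are assumptions made for the example, and Pr{conversion | focused} is passed in already evaluated at the estimated I_{d+1}.

```python
def conversion_prob_given_click(focused_pct, pr_conv_focused):
    """Pr{conversion | click} = focusedPercentage * Pr{conversion | focused}."""
    return focused_pct * pr_conv_focused


def value_per_click(pr_conv_click, expected_revenue):
    """v = Pr{conversion | click} * E[revenue | conversion]."""
    return pr_conv_click * expected_revenue


def bid(alpha, vpc):
    """bid = alpha * v: a fixed fraction of the estimated value per click."""
    return alpha * vpc


# Illustrative inputs only (hypothetical, not taken from the slides).
pr_click = conversion_prob_given_click(focused_pct=0.5, pr_conv_focused=0.2)
vpc = value_per_click(pr_click, expected_revenue=10.0)
print(bid(0.3, vpc))
```

Keeping the three factors as separate functions mirrors the slide's split into the terms A, B, C and D, so each estimator can be swapped out on its own.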
A) Expected Revenue
• Solely depends on Manufacturer’s Specialty
  (MS)

       \[ E[\text{revenue}^q \mid \text{conversion}] =
          \begin{cases}
            USP \cdot (3 + MSB) / 3 & \text{MS not defined in } q \\
            USP \cdot (1 + MSB)     & \text{MS matched in } q \\
            USP                     & \text{MS not matched in } q
          \end{cases} \]
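A small Python sketch of the three cases above. The function name, the string labels for the cases, and the example values USP = 10 and MSB = 0.5 are assumptions for illustration. The "not defined" case is read here as the average over three equally likely manufacturers, one of which earns the bonus, which is one way to arrive at the (3 + MSB) / 3 factor.

```python
def expected_revenue(usp, msb, ms_in_query):
    """Expected revenue per conversion for a query.

    ms_in_query is one of:
      "matched"   - the query names the advertiser's manufacturer specialty
      "unmatched" - the query names a different manufacturer
      "undefined" - the query names no manufacturer
    """
    if ms_in_query == "matched":
        return usp * (1 + msb)
    if ms_in_query == "unmatched":
        return usp
    if ms_in_query == "undefined":
        # Assumed reading: (1 * usp * (1 + msb) + 2 * usp) / 3
        return usp * (3 + msb) / 3
    raise ValueError(f"unknown case: {ms_in_query!r}")


# Hypothetical values, for illustration only.
for case in ("undefined", "matched", "unmatched"):
    print(case, expected_revenue(10.0, 0.5, case))
```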
B) Focused Percentage
• Monte Carlo Simulations
• First Method (Vorobeychik)
      – focusedPercentage_query = conversions_query /
        [clicks_query * Pr(conversion_query)]
      – Average over query class (F0, F1, F2)
• Second Method (2011)
      – Use server source files
      – MC states (NS, IS, F0, F1, F2, T) per product (x9)
      – focusedPercentage_query = Fi_query / (Fi_query + IS_query)

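Both estimators reduce to simple ratios, sketched below in Python. The function names and the example counts are hypothetical, and the zero-denominator fallback to 0.0 is an assumption the slides do not address.

```python
def focused_pct_from_counts(conversions, clicks, pr_conversion):
    """First method: conversions / (clicks * Pr(conversion))."""
    expected = clicks * pr_conversion
    if expected == 0:
        return 0.0  # assumed fallback when there is no data
    return conversions / expected


def focused_pct_from_states(focused_users, informational_users):
    """Second method: F_i / (F_i + IS), using simulated user-state counts."""
    total = focused_users + informational_users
    if total == 0:
        return 0.0  # assumed fallback when no users are in either state
    return focused_users / total


def average_over_class(values):
    """Average the per-query estimates of one query class (F0, F1 or F2)."""
    return sum(values) / len(values)


# Hypothetical counts, for illustration only.
print(focused_pct_from_counts(conversions=10, clicks=100, pr_conversion=0.25))
print(focused_pct_from_states(focused_users=30, informational_users=70))
```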
Graph for query (pg, null)




C) I_d Estimation
       \[ \hat{I}_{d+1} = g\left(c_{d-3} + c_{d-2} + c_{d-1} + \hat{c}_d + \hat{c}_{d+1} - C^{cap}\right) \]

• kNN
      – Inspired by periodic conversions behavior
      – Time series matching using Euclidean Distance as a
        similarity criterion
      – k = 5, t = 5, N = 600
• Heuristic Baseline
      – Underestimate for bidding higher: \hat{c}_d = (c_{d-1} + c_{d-2} + c_{d-3}) / 4
• Aggregate
      – \hat{c}_d     = (kNN + Baseline) / 2
      – \hat{c}_{d+1} = ((kNN + Baseline) / 2) / 2
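The kNN, baseline and aggregate steps above can be sketched in Python as follows. This is a guess at the mechanics, not the authors' implementation: it assumes that t is the length of the matching window, that the forecast is the mean of the values that followed the k nearest windows, and that the history is a list of past daily-conversion series. The slide does not define N, so it is not modelled here.

```python
import math


def knn_forecast(history, recent, k=5):
    """Forecast the value following `recent` from the k most similar windows.

    history: list of past daily-conversion series (lists of numbers)
    recent:  the last t observed values (t = len(recent))
    """
    t = len(recent)
    candidates = []
    for series in history:
        # Slide a window of length t; keep the value that came right after it.
        for start in range(len(series) - t):
            window = series[start:start + t]
            candidates.append((math.dist(window, recent), series[start + t]))
    if not candidates:
        raise ValueError("history too short for the requested window")
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(nxt for _, nxt in nearest) / len(nearest)


def baseline(c_d1, c_d2, c_d3):
    """Deliberate underestimate: sum of three days divided by four."""
    return (c_d1 + c_d2 + c_d3) / 4


def aggregate(knn_value, baseline_value):
    """Return (c_d, c_d+1): the blended estimate and half of it."""
    c_d = (knn_value + baseline_value) / 2
    return c_d, c_d / 2


# Toy data, for illustration only.
past = [[1, 2, 3, 4, 5, 6, 7, 8]]
print(aggregate(knn_forecast(past, [2, 3, 4], k=1), baseline(4, 4, 4)))
```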
kNN example




Cyclic behavior
  [Figure: cycle diagram] Low bid -> No ad display -> No conversions ->
  High conversion prob. -> High VPC -> High bid -> High ranking ->
  Conversions -> Low conversion prob. -> Low VPC -> Low bid
• 5-day long pulses
• Pulse Height & Width related to factors like user distribution at the
  time, competition
• Large peaks in daily profits come from “catching the wave”
Rest of the strategy
• Budget unconstrained
• Hard-coded ad selection strategy
      – F0 => generic
      – F2 => if user preference matched => targeted
      – F1 => if one of the preferences is matched =>
        targeted, else generic




Simulation-based Game Theoretical Analysis
• One-shot Bayesian game
• Myopic linear strategies b = α ∙ vpc -> find
  optimal shading, α
• Iterative best response to find a symmetric
  Bayes-Nash equilibrium
• Most profitable single deviation from a
  homogeneous set of opponents until self-play
  is best response -> BNE

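The iterative best-response loop described above can be sketched in Python. In the real analysis the payoff would come from game simulations. Here `toy_payoff` is a made-up stand-in, tuned only so that the loop has a fixed point, and it says nothing about the actual game. All names are hypothetical.

```python
def iterative_best_response(payoff, candidates, start, max_iters=50):
    """Search for a symmetric profile where self-play is a best response.

    payoff(a, opp): profit of one agent playing `a` while every opponent
    plays `opp` (a homogeneous set of opponents).
    Returns (alpha, path), where path lists the profiles visited.
    """
    current = start
    path = [current]
    for _ in range(max_iters):
        # Most profitable single deviation against the homogeneous profile.
        best = max(candidates, key=lambda a: payoff(a, current))
        if payoff(best, current) <= payoff(current, current):
            return current, path  # no profitable deviation: stop
        current = best
        path.append(current)
    raise RuntimeError("no symmetric fixed point found within max_iters")


def toy_payoff(a, opp):
    """Made-up payoff: the best reply is pulled toward 0.3 (assumption)."""
    target = 0.3 + 0.1 * (opp - 0.3)
    return -(a - target) ** 2


grid = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
print(iterative_best_response(toy_payoff, grid, start=1.0))
```

With a noisy simulator in place of `toy_payoff`, each payoff would need to be averaged over many games before comparing deviations.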
D) alpha
• Vorobeychik
      – “α = 0.2, 0.3 more robust to aggressive
        opponents”
      – The previous best values, α = 0.1 and 0.2 (2009),
        are not profitable on the 2010 platform
• We have re-run the algorithm under the 2010
  specs
      – α = 0.3 is the optimal value (1 -> 0.4 -> 0.3)


Simulation-based Game Theoretical Analysis
• Instead of a single α -> (α_F0, α_F1, α_F2) x (α_C_LOW, α_C_MED, α_C_HIGH)

• Start from optimal α = 0.3, explore all possible
  deviations for each α, first for query levels then
  capacity levels

• 0.3 seems to be optimal in all cases

• Points in between do not yield different results (0.3
  still the best)
Simulation-based Game Theoretical Analysis (two figure-only slides)
Adaptive component
• Problem Statement
       We want to capture the case where, given the
       current environment (competition conditions),
       an α other than 0.3 yields a competitive
       advantage
• GT analysis is “a good starting point”
• Model it as an associative k-armed bandit
  problem with optimistic initial values and an
  ε-greedy action selection strategy
State, Action, Reward
• State
      – Quantized VPC (x11)
      – Capacity (x3)
      – Query Type (x3)
      – Manufacturer Specialty Bonus (x2)
      – Component Specialty Bonus (x2)
• Action: a ∈ {0.28, 0.29, 0.3, 0.31, 0.32} (the value of α to use)
• Reward: r = daily profits

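A minimal Python sketch of an associative (per-state) bandit over the five α values, with optimistic initial values and ε-greedy selection. The class name, the constant step size, the value of ε, the optimistic initial value and the way the state tuple is built are all assumptions for illustration. The slides give only the state factors, the action set and the reward.

```python
import random
from collections import defaultdict

ACTIONS = (0.28, 0.29, 0.30, 0.31, 0.32)
# State factors from the slide: VPC bins x capacity x query type x MSB x CSB.
N_STATES = 11 * 3 * 3 * 2 * 2


class AlphaBandit:
    def __init__(self, epsilon=0.1, optimistic_value=100.0, step=0.1, seed=None):
        self.epsilon = epsilon
        self.step = step
        self.rng = random.Random(seed)
        # Every unseen (state, action) pair starts at an optimistic estimate.
        self.q = defaultdict(lambda: [optimistic_value] * len(ACTIONS))

    def select(self, state):
        """Return the index into ACTIONS chosen for this state."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(ACTIONS))  # explore
        values = self.q[state]
        return max(range(len(ACTIONS)), key=values.__getitem__)  # exploit

    def update(self, state, action, reward):
        """Move the estimate toward the observed reward (daily profit)."""
        values = self.q[state]
        values[action] += self.step * (reward - values[action])


# Hypothetical state: (vpc_bin, capacity, query_type, msb_applies, csb_applies)
bandit = AlphaBandit(seed=0)
state = (5, "MEDIUM", "F2", True, False)
chosen = bandit.select(state)
bandit.update(state, chosen, reward=42.0)
print(ACTIONS[chosen], N_STATES)
```

Because untried actions keep their optimistic estimate, the greedy choice tends to cycle through all five α values in a state before settling, provided the initial value exceeds typical rewards.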
Experiment (1/2)
• Self-play
   – 210 games
   – All capacities to 450 (MEDIUM)
• The standard agent is unbeatable since it is created that way

  Agent Name        Score
  Mertacor-Std-1    53.042
  Mertacor-Std-2    52.763
  Mertacor-kNN-1    52.673
  Mertacor-kNN-2    52.703
  Mertacor-RL-1     52.270
  Mertacor-RL-2     52.233
  Mertacor-Full-1   51.673
  Mertacor-Full-2   51.899

Experiment (2/2)
• Mix things up: include more agents with different strategies
      – 250 games
      – All capacities to 450 (MEDIUM)
• Better estimation leads to better performance
• Adaptiveness is suited for even more complicated
  environments (capacity and strategy wise)

  Agent Name         Score
  Mertacor-kNN       53.223
  Mertacor-Std       52.245
  Schlemazl (2010)   51.975
  Mertacor-Full      51.796
  Mertacor-RL        51.790
  Epflagent (2010)   49.232
  Tau (2010)         45.987
  Crocodile (2010)   45.858

Also tested/under development (2011)
• Daily Campaign Budget Threshold algorithms
      – Estimation
      – Simulation
• Particle Filtering for user state estimation
      – TacTex




Conclusions & Future Work
• α = 0.3 is a very powerful conclusion/hard to
  beat
• Better estimates for B) user state and C) Id
  could further improve performance
• On-line learning is still in a very crude form: we are
  not yet satisfied, but it seems a reasonable thing to
  do
• Competition-wise: fitted-Q learning from data
  logs
Thank you for your attention

         Questions?
