Voting Advice via Direct Access to the Relevant Data                  1




  Voting Advice via Direct Access to
          the Relevant Data

                                                       Maarten Marx

                                         Universiteit van Amsterdam

                       Politicologen etmaal, Amsterdam, 2011-06-09



                                                       Outline

• Two types of voting advice systems

• Lipschits on the web

• Technical details

• Conclusions and what next?



                        Two types of voting advice systems

 • Lipschits method (1977–1998)

 • Stemwijzer method (on the web, from 1998)

Same users: voters

Same motivation: help voters in making a choice, based on
  manifesto data



                                                  Primary goals

Lipschits
   • quickly find the standpoint of a party in its manifesto on topic X
   • easily compare the standpoints of parties on topic X

Stemwijzer
   • quickly find which parties best fit the user on a fixed set of
     topics (the same for all voters)



                                               Secondary goals

                          Reuse obtained data for scientific research.

Lipschits
   • standardize manifestos
   • rich list of salient topics for each election
   • rich controlled vocabulary
   • (Now) excellent training data for creating classifiers

Kieskompas
   • positions of parties on several topics
   • positions of “the electorate” on these topics



        Differences between Lipschits and Stemwijzer

• One size fits all vs user decides on topics and parties

• Direct vs indirect access to primary sources

• Different input-output behaviour



                                                       Input-output




                             In                             Out
        Stemwijzer           answers to questions           ranked list of parties
        Kieskompas           answers to questions           model of the user as a party
        Lipschits            controlled vocabulary terms    relevant paragraphs for each party
        VerkiezingsKijker    controlled vocabulary terms    relevant paragraphs for each party
                             or free search terms



                             Demo: “Lipschits on the web”

                                                  verkiezingskijker.nl



                                   History VerkiezingsKijker

TK 2006 (Tweede Kamer elections) UvA-Stemwijzer.
  • Eddy Habben Jansen: take Lipschits as inspiration
  • Motivation for Stemwijzer: add “proof” for party positions from
    their manifestos

PS 2007 (Provinciale Staten elections) UvA-Kieskompas.
  Verkiezingskijker used to facilitate a large number of party
  placements (12 provinces × 10 parties × 36 positions = 4320
  placements).

DNPP corpus UvA BSc thesis: search engine for the DNPP manifesto
  corpus.

TK 2010 Google



                                   Technical Details: outline

1. Idea

2. How to do it

3. Main problem

4. Solutions



                                        Idea verkiezingskijker

• Replicate Lipschits, “Google style”

• Add free keyword search

• Make it scalable, faster to make (and without a Lipschits . . . )



                                              How to do that?

 • Collect manifestos (in time . . . )

 • Standardize them into one data format

 • Partition each manifesto into meaningful units (paragraphs)

Outcome: a basic Google-style search engine that returns a ranked
  list of paragraphs for each search term

Advanced search: restrict the results to selected parties
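
The pipeline on this slide (collect, standardize, partition into paragraphs, search) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the VerkiezingsKijker implementation: paragraphs are assumed to be separated by blank lines, ranking uses a simple smoothed TF-IDF score, and the names `build_units` and `search` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def split_paragraphs(manifesto_text):
    """Assumption: paragraphs are separated by blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", manifesto_text) if p.strip()]


def build_units(manifestos):
    """Turn {party: manifesto text} into a list of (party, paragraph, term counts)."""
    units = []
    for party, text in manifestos.items():
        for paragraph in split_paragraphs(text):
            units.append((party, paragraph, Counter(tokenize(paragraph))))
    return units


def search(units, query, parties=None):
    """Return (score, party, paragraph) tuples, best match first.

    `parties` is the optional "advanced search" restriction to some parties.
    """
    terms = tokenize(query)
    n = len(units)
    # Document frequency: in how many paragraphs does each query term occur?
    df = {t: sum(1 for _, _, counts in units if t in counts) for t in terms}
    results = []
    for party, paragraph, counts in units:
        if parties is not None and party not in parties:
            continue
        # Smoothed TF-IDF: rare terms weigh more than common ones.
        score = sum(counts[t] * math.log(1 + n / df[t]) for t in terms if counts[t])
        if score > 0:
            results.append((score, party, paragraph))
    return sorted(results, key=lambda r: -r[0])


# Toy data, invented for illustration only.
manifestos = {
    "Party A": "We want more trains.\n\nSchiphol may not grow.",
    "Party B": "Schiphol should grow. Schiphol is vital.\n\nLower taxes.",
}
units = build_units(manifestos)
for score, party, paragraph in search(units, "schiphol"):
    print(party, "|", paragraph)
```

Passing `parties={"Party A"}` to `search` restricts the result list to that party, which mirrors the advanced search option.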



                              Main problem: Semantic gap

 Voters and manifestos use different terms to talk about the same topic

• different parties use different terms to talk about the same topic

• small amounts of text per retrieval unit make this problem worse

• Recall problem: the system does not retrieve all relevant paragraphs.



                                         Two solutions to this

Hierarchical controlled vocabulary
   • Basically back to Lipschits.
   • Burden on the user.

Document expansion
  • find related terms (schiphol, vliegveld [airfield], luchthaven
    [airport], vliegtuig [airplane], . . . )
  • expand paragraphs: if a paragraph contains one term, add all the others
  • Aim: Improve recall.
  • Danger: topic drift (thus more false positives)
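
A minimal sketch of document expansion, assuming the groups of related terms are already given as hand-written lists. The function name `expansion_terms` and the example paragraphs are hypothetical and only illustrate the idea: index each paragraph together with the related terms it does not contain itself.

```python
import re


def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def expansion_terms(paragraph, term_groups):
    """Return the related terms to index alongside `paragraph`.

    If the paragraph contains at least one term of a group, all other
    terms of that group are added (in group order, without duplicates).
    """
    present = set(tokenize(paragraph))
    extra = []
    for group in term_groups:
        if present.intersection(group):
            for term in group:
                if term not in present and term not in extra:
                    extra.append(term)
    return extra


# The related-term group from the slide.
GROUPS = [["schiphol", "vliegveld", "luchthaven", "vliegtuig"]]

# Improves recall: this paragraph never mentions "schiphol", yet after
# expansion a search for "schiphol" can retrieve it.
print(expansion_terms("De luchthaven mag niet groeien.", GROUPS))

# Topic drift: a paragraph about defence also receives "schiphol",
# so it becomes a false positive for that query.
print(expansion_terms("Defensie koopt geen nieuw vliegtuig.", GROUPS))
```

Returning the added terms separately, rather than appending them to the paragraph text, keeps the text shown to the user unchanged while the index still benefits from the expansion.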



                                                       Predicament

With both solutions we seem to be back at Lipschits and need to do
all the work he did . . .



                   Our solution: learning from examples

• Stemwijzer created a list of 100 important election topics.

• For each topic, Stemwijzer found 5 highly relevant paragraphs

• From these paragraphs we harvested all overused terms (using
  corpus comparison techniques [Rayson, Garside 2000])

• For each topic we took the top k terms

• Quick manual check to remove outliers

• Output: classifier for each topic, and set of expansion terms.
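
The term-harvesting step can be sketched with the log-likelihood statistic commonly used for corpus comparison. This is an illustration under assumptions, not the authors' code: the example paragraphs of one topic are compared against a reference set of paragraphs, only terms that are relatively more frequent in the topic set count as "overused", and the function names are hypothetical. The sketch covers the expansion terms only, not the per-topic classifier.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def log_likelihood(a, b, n1, n2):
    """Log-likelihood score for a term occurring `a` times in a corpus of
    `n1` tokens and `b` times in a reference corpus of `n2` tokens."""
    e1 = n1 * (a + b) / (n1 + n2)  # expected frequency in the topic corpus
    e2 = n2 * (a + b) / (n1 + n2)  # expected frequency in the reference corpus
    total = 0.0
    if a > 0:
        total += a * math.log(a / e1)
    if b > 0:
        total += b * math.log(b / e2)
    return 2 * total


def top_overused_terms(topic_paragraphs, reference_paragraphs, k):
    """Return the k terms most overused in the topic paragraphs."""
    topic = Counter(t for p in topic_paragraphs for t in tokenize(p))
    ref = Counter(t for p in reference_paragraphs for t in tokenize(p))
    n1, n2 = sum(topic.values()), sum(ref.values())
    if n1 == 0 or n2 == 0:
        raise ValueError("both corpora must contain at least one token")
    scored = []
    for term, a in topic.items():
        b = ref[term]
        # Keep only terms that are relatively more frequent in the topic set.
        if a / n1 > b / n2:
            scored.append((log_likelihood(a, b, n1, n2), term))
    # Highest score first; ties broken alphabetically for a stable result.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [term for _, term in scored[:k]]


# Toy data, invented for illustration only.
topic = ["schiphol groeit schiphol vliegtuig", "de luchthaven schiphol"]
reference = ["de belasting omlaag", "de zorg en de school", "de trein"]
print(top_overused_terms(topic, reference, 3))
```

The resulting list still needs the quick manual check from the slide, since a purely statistical ranking will also surface terms that are frequent in the examples by accident.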



                                 Conclusion and what next?

• Both systems are complementary.
  • Modern Lipschits system is useful for both makers and users of
    stemwijzer-like systems.

• Fine-grained classification of manifestos (and alternatives . . . ) is
  useful for comparative research (e.g., Breeman-Timmermans,
  Louwerse)



                                       What next/Discussion

• Standardization of controlled vocabularies and development of
  high quality gold standard data is desirable

• Soon: Lipschits 1998 available in Excel and as a fully searchable
  hyperlinked web document.

• Wish? same for the “Verkiezingsprogramma’s met cd-rom”
  (Holsteyn et al) series?
  Or is the system by Google sufficient?
