1. Forecasting Elections from Voters’ Perceptions
of Candidates’ Ability to Handle Issues
A Test of the Index Method
Andreas Graefe, Karlsruhe Institute of Technology
J. Scott Armstrong, Wharton School, University of Pennsylvania
The full paper for this talk can be downloaded at: tinyurl.com/issueindex
Bucharest Dialogues on
Expert Knowledge, Prediction, Forecasting: A Social Sciences Perspective
November 21, 2010
2. Outline
1. Status quo in election forecasting
2. Limitations of multiple regression for social science
predictions
3. Index method
4. Issue-index model
5. Future applications of the index method
6. Conclusions
3. Evolution of election forecasting
1978: Economist Ray Fair publishes regression model
that focuses on economic growth and inflation as
predictor variables.
[Fair, 1978]
Over the next three decades, others would follow with
models that use slightly different variables.
4. Status quo in election forecasting
In a review of 14 quantitative models by economists and
political scientists
All were regression models
12 used a measure of the state of the economy
7 used a measure of the incumbent’s popularity
5 used both
[Jones & Cuzan, 2008]
5. Common view in election forecasting
Most models are economic vote models.
On average, these models perform well.
Widely-held view that a presidential election is a
referendum on
- the incumbent president’s popularity or
- his ability to handle the economy
Presidential campaigns, individual differences among
candidates, and parties are assumed to have no – or only
little – impact on the election outcome.
6. But what about the candidates?
Candidates play a vital role in election campaigns and
are extensively discussed in the media, e.g. their
- Biography (experience)
- Personality
- Stands on the issues
- Endorsed policies
Yet, no existing model uses such information.
Models are unable to aid decision-making in campaigns.
7. Research with decision-making implications
Can better forecasting help to advise…
A candidate’s decision on whether to run for office?
A party’s decision about who to nominate?
Decisions as to what issues a candidate should
emphasize in a campaign?
Decisions as to which policies to endorse?
This talk
8. Why do all existing models ignore individual
differences between candidates?
Insufficient use of
forecasting methods?
9. Multiple regression
Almost all existing forecasting models use multiple
regression.
Multiple regression is useful to estimate the relative
impact of certain variables on the outcome variable
for a given data set, i.e. data fitting.
11. Limitations of multiple regression
Multiple regression is limited for predicting new data.
Variable weights are typically estimated from the dataset.
- Limited in the number of variables
- Limited ability to incorporate prior domain knowledge
- Extracts too much information (i.e. noise) from given
data, which does not generalize when predicting new
data
12. Conditions for multiple regression
The performance of multiple regression depends to a
large extent on the ratio of the number of predictor
variables to the number of observations available,
especially when using non-experimental data.
[Einhorn & Hogarth, 1975]
Regression should not be used for social science
predictions unless sample size is larger than 100
observations per predictor variable.
[Dana & Dawes, 2004]
13. Conditions for forecasting U.S. Presidential Elections
Data for the majority of regression models is limited to
about 25 elections. Many models use
- no more than 15 observations and
- from two to five predictor variables
Multiple regression is limited for forecasting U.S.
Presidential Elections.
In particular, if one wants to incorporate individual
differences between candidates: too many variables.
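The gap between the rule of thumb and the available election data can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name and constant are hypothetical, and the inputs are the figures quoted on the slides (15 observations, two to five predictors, 100 observations per predictor).

```python
# Rule of thumb quoted on the slide (Dana & Dawes, 2004):
# at least 100 observations per predictor variable.
MIN_OBS_PER_PREDICTOR = 100


def observations_per_predictor(n_observations: int, n_predictors: int) -> float:
    """Return the number of observations available per predictor variable."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return n_observations / n_predictors


# Figures from the slide: up to 15 observations, two to five predictors.
for n_predictors in (2, 5):
    ratio = observations_per_predictor(15, n_predictors)
    meets_rule = ratio >= MIN_OBS_PER_PREDICTOR
    print(f"{n_predictors} predictors: {ratio:.1f} obs/predictor, meets rule: {meets_rule}")
```

With 15 elections, a model has between 3 and 7.5 observations per predictor, far below the threshold of 100.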
15. Index method
Structured approach for summarizing domain knowledge
(e.g., prior research, expert knowledge).
Long history in forecasting and decision-making.
16. Index method procedure
(1) Identify variables from domain knowledge
(i.e. prior empirical studies and/or subjective judgment
by experts)
(2) Use prior evidence to determine their directional
influence on the outcome
(e.g. 1: favorable; 0: unfavorable)
(3) Sum up scores. Weight variables equally unless strong
evidence exists for differential weighting.
(4) Select the option that is favored by most variables.
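The four steps can be sketched in a few lines of code. This is a minimal illustration under assumed names: the functions, the option labels, and the variable ratings are all hypothetical, and equal (unit) weights are used throughout, as step (3) recommends by default.

```python
from typing import Dict


def index_score(ratings: Dict[str, int]) -> int:
    """Sum unit-weighted ratings (1 = favorable, 0 = unfavorable)."""
    for name, value in ratings.items():
        if value not in (0, 1):
            raise ValueError(f"rating for {name!r} must be 0 or 1")
    return sum(ratings.values())


def select_option(options: Dict[str, Dict[str, int]]) -> str:
    """Return the option favored by the most variables.

    Ties are resolved in favor of the option listed first; a real
    application would need an explicit tie rule.
    """
    scores = {name: index_score(ratings) for name, ratings in options.items()}
    return max(scores, key=scores.get)


# Hypothetical ratings for two options on four made-up variables.
options = {
    "option_a": {"var1": 1, "var2": 0, "var3": 1, "var4": 1},
    "option_b": {"var1": 0, "var2": 1, "var3": 0, "var4": 0},
}
best = select_option(options)
```

Adding a variable only means adding one more key to each rating dictionary; no weights have to be re-estimated.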
17. Early use of the index method
Predicting the success of paroling individuals from prison.
[Burgess, 1939]
Based on a list of 25 factors, an index score was calculated for
each individual to determine the chance of successful parole.
This method was criticized as it did not consider
- the relative importance of different variables or
- how favorable the ratings were for a certain variable
However, a follow-up study did not find evidence that supported
the use of regression over index scores for predicting parole.
[Gough, 1962]
18. Relative performance of the index method and
multiple regression: Theoretical evidence
The index method will outperform regression if
- sample size is small and
- the number of predictor variables, and the
inter-correlation among them, is high.
[Einhorn & Hogarth, 1975]
19. Relative performance of the index method and
multiple regression: Empirical evidence
In a summary of 8 studies, 5 favored the index method
over regression.
[Armstrong, 1985]
Unit-weighting showed higher out-of-sample accuracy
than multiple regression for 20 prediction problems
taken from statistical textbooks.
[Czerlinski et al., 1999]
“Equal-weights” versions of three established election
forecasting models yielded a lower MAE and higher
hit-rate than the original econometric models for the
32 elections from 1880 to 2004.
[Cuzán & Bundrick, 2009]
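One common way to build an equal-weights (unit-weighted) predictor is to standardize each variable, give it a sign for its expected direction, and add the results. The sketch below illustrates that general idea; it is not necessarily the procedure used in the studies cited above, and the function name and sample values are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def unit_weight_scores(rows: Sequence[Sequence[float]],
                       directions: Sequence[int]) -> List[float]:
    """Sum of standardized predictors, each weighted +1 or -1.

    rows: one sequence of predictor values per observation.
    directions: +1 if a higher value favors the outcome, -1 otherwise.
    Assumes every predictor varies (non-zero standard deviation).
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [pstdev(col) for col in columns]
    return [
        sum(d * (v - m) / s for v, m, s, d in zip(row, means, sds, directions))
        for row in rows
    ]


# Hypothetical data: three observations, two predictors.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scores = unit_weight_scores(rows, directions=[1, 1])
```

No coefficients are estimated from the data; only the direction of each variable's effect has to be known in advance.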
20. Advantages of Index models
No limit in the number of variables
(regression usually limited to 3 or 4 variables).
No need to estimate weights from the data and thus
- No sample necessary for building the model.
- Easy to incorporate new variables in the model
Index models can utilize all prior knowledge.
21. Disadvantages of Index models
Expensive to summarize prior knowledge.
Difficult to account for effect size (coefficients and
the amount of change in predictor variables).
Difficult to base model on theory
22. The first index model for forecasting elections
Lichtman’s index model, the 13 Keys to the White House,
has been in publication for years. Lichtman has made forecasts
for the past 38 presidential elections (7 prospectively).
[Lichtman, 2008]
In all cases, the model’s predictions of the popular vote
winner have been correct. No other approach has come
close to this record.
From 1984 to 2004, Lichtman’s “Keys” yielded forecast
errors of the popular vote shares almost as low as three
established econometric models.
[Armstrong & Cuzán, 2006]
23. The bio-index model
Predicts U.S. presidential election winners based on 59
variables that incorporate biographical information
about candidates.
Examples: height, weight, birth order, married, beauty
Correctly predicted the winner in 27 of the 29 elections
from 1896 to 2008 and thereby outperformed polls,
prediction markets, and econometric models.
[Armstrong & Graefe, 2010]
24. Conditions favoring the index method
Many variables
Few observations
Much prior domain knowledge (based on expertise
and empirical studies)
26. Prediction problem
Forecast the U.S. presidential election outcome from
information about how voters perceive the candidates’
ability to handle issues
Conditions:
1. Large number of issues / variables (sometimes more than 40)
2. Issues change or new issues arise over time
(e.g., global warming, wars, financial crisis)
3. Few observations
Conditions favor the index method
27. Data
Polls which asked voters to name the candidate who would
be more successful in solving a specific issue.
“Now I'm going to mention a few issues and for each one, please tell me if
you think Barack Obama or John McCain would better handle that issue if
they were elected president...the economy”
[CNN/Opinion Research Corporation Poll. July 27-29, 2008]
Examples for issues: terrorism, war in Iraq, economy, jobs,
budget deficit, health care, immigration
Obtained data: 427 polls with voters’ opinions on 314 issues
for the 10 elections from 1972 to 2008
28. Coding and generation of the issue index
For each issue, the candidate with higher voter support
received a score of 1; the other candidate received 0.
29. The issue-index heuristic for predicting the winner
Calculate the overall index score for each candidate.
Decision rule (issue index heuristic)
Predict the candidate with the higher index score to
win the popular vote.
Performance
- Correctly predicted 9 of 10 popular vote winners.
- Missed Reagan in 1980.
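The coding step and the decision rule can be sketched as follows. The poll percentages are invented for illustration, the function names are hypothetical, and the treatment of ties (no point for either candidate, no prediction on an overall tie) is an assumption that the slides do not specify.

```python
from typing import Dict, Optional, Tuple


def issue_index(support: Dict[str, Tuple[float, float]]) -> Tuple[int, int]:
    """Count the issues on which each candidate has higher voter support.

    support maps an issue to (percent favoring A, percent favoring B).
    An exact tie on an issue scores 0 for both (assumed rule).
    """
    score_a = sum(1 for pct_a, pct_b in support.values() if pct_a > pct_b)
    score_b = sum(1 for pct_a, pct_b in support.values() if pct_b > pct_a)
    return score_a, score_b


def predict_winner(support: Dict[str, Tuple[float, float]]) -> Optional[str]:
    """Issue-index heuristic: the higher index score wins the popular vote."""
    score_a, score_b = issue_index(support)
    if score_a == score_b:
        return None  # no prediction on an overall tie (assumed rule)
    return "A" if score_a > score_b else "B"


# Invented poll numbers, for illustration only.
polls = {
    "economy": (52.0, 44.0),
    "terrorism": (40.0, 55.0),
    "health care": (56.0, 38.0),
}
scores = issue_index(polls)
winner = predict_winner(polls)
```

Because each issue contributes at most one point, new issues can be added in any election year without refitting anything.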
30. Issue-index model for predicting vote-shares
The simple heuristic performs well in predicting the winner.
But it does not allow for predicting the popular vote share.
Issue-index model
Simple linear regression to relate the relative index score (I)
of the incumbent to the popular vote (V)
Vote equation: V = 40.3 + 0.22 * I
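The vote equation is easy to apply once the relative index score is known. The sketch below assumes that I is the incumbent's share of the combined index score in percent and that V is the incumbent's popular vote share in percent; the slides do not spell out either definition, so both are assumptions. The function names and input scores are hypothetical.

```python
def relative_index_score(incumbent_score: int, challenger_score: int) -> float:
    """Incumbent's share of the combined index score, in percent (assumed definition)."""
    total = incumbent_score + challenger_score
    if total == 0:
        raise ValueError("at least one issue must be scored")
    return 100.0 * incumbent_score / total


def predicted_vote_share(relative_index: float) -> float:
    """Vote equation from the slide: V = 40.3 + 0.22 * I."""
    return 40.3 + 0.22 * relative_index


# Hypothetical index scores: incumbent ahead on 12 issues, challenger on 8.
i_score = relative_index_score(incumbent_score=12, challenger_score=8)
vote = predicted_vote_share(i_score)  # 40.3 + 0.22 * 60, about 53.5
```

Under this reading, the equation predicts more than 50% for the incumbent once I exceeds roughly 44.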
31. Relative performance:
Issue-index model vs. econometric models
We compared out-of-sample (ex ante) forecast accuracy of the
issue index model and 8 established econometric models for
predicting the three elections from 2000 to 2008.
Early September forecast:
Issue index model provided a lower MAE than all 8 benchmark
models but did not predict the correct winner in the 2004
election.
Election Eve forecast:
Issue index model also correctly predicted the 2004 election
winner.
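The two accuracy measures used in this comparison, mean absolute error (MAE) and whether the winner was called correctly, can be computed as in the sketch below. The forecast and outcome values are invented, the function names are hypothetical, and the 50% threshold assumes vote shares are expressed as the incumbent's two-party share.

```python
from typing import Sequence


def mean_absolute_error(forecasts: Sequence[float], actuals: Sequence[float]) -> float:
    """Average absolute difference between forecast and actual vote shares."""
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)


def hit_rate(forecasts: Sequence[float], actuals: Sequence[float],
             threshold: float = 50.0) -> float:
    """Share of elections in which forecast and outcome fall on the same side of the threshold."""
    hits = sum((f > threshold) == (a > threshold) for f, a in zip(forecasts, actuals))
    return hits / len(actuals)


# Invented vote shares (percent), for illustration only.
forecasts = [52.0, 49.0, 54.0]
actuals = [50.5, 51.0, 53.0]
mae = mean_absolute_error(forecasts, actuals)
hits = hit_rate(forecasts, actuals)
```

A model can have a low MAE and still miss a winner when the race is close, which is why both measures are reported.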
32. Summary
Issue-index model yielded a higher hit rate and more
accurate long-term vote-share forecasts than the
IEM.
Issue-index model yielded more accurate out-of-sample
forecasts of the vote shares than 8 econometric
models (early September forecasts)
33. Benefits of issue indexes
Contribute to forecasting accuracy, in particular for
long-term forecasting and the PollyVote.
Can help political candidates decide which issues to
focus on in their campaign.
- Increase marketing effort to gain ownership of an issue.
- Raise and promote issues that favor them but which
have not received attention in the public yet.
- Adopt new or revised positions and diverge from
traditional party views.
Simple to use and easy to understand.
34. Limitations of issue indexes
Costs
Must summarize prior knowledge about the field.
Acceptability
Easy to understand and thus easy to criticize.
People wrongly believe that complex methods are
necessary to solve complex problems. They exhibit a
general resistance to simple solutions.
[Hogarth, in press]
36. Future work on index models (1)
Predict the election outcome based on how voters
perceive the candidates’ personalities.
E.g., which of the candidates is more likable, honest, etc.
Implications for decision-making:
- Helps candidates to decide whether to run
- Helps parties to decide who to nominate
37. Future work on index models (2)
Predict the election outcome based on how closely voters
agree with candidates’ positions on policies.
Implications for decision-making:
- Helps candidates to decide which policies to pursue.
Examine policies related to issues such as gun control,
income taxes, free trade, abortion, government spending
to see which candidate is closest to the opinions of the
voters on more policies.
38. The Index Model Challenge
Index method will be more accurate than econometric
models in situations with
- many variables
- much prior knowledge (especially experiments)
and
- lack of data, measurement errors, and collinearity.
Examples: Selecting CEOs, drafting athletes, marriages,
economic growth rates of nations, value of real
estate, medical treatments, effectiveness of ads.
40. Conclusions
We used the index method to develop the issue index model, which
is based on information about how voters perceive the
candidates’ ability to handle the issues.
The model correctly predicted the winner in 9 of the last 10 U.S.
Presidential Elections. Its out-of-sample forecasts for the past
three elections outperformed established econometric models.
The model improves the accuracy of long-term election forecasting
and can help candidates decide which issues to focus
on in their campaign.
Further improvements in accuracy are expected based on the index
method – which itself can be used in many other applications.
41. References
Armstrong, J. S. (1985). Long-range forecasting: From crystal ball to computer, New York: John Wiley.
Armstrong, J. S. & Graefe, A. (2010). Predicting elections from biographical information about candidates, Journal of
Business Research, in press.
Berg, J., Nelson, F. & Rietz, T. A. (2008). Prediction market accuracy in the long run. International Journal of Forecasting,
24, 285-300.
Burgess, E. W. (1939). Predicting success or failure in marriage. New York: Prentice-Hall.
Cuzán, A. G. & Bundrick, C. M. (2009). Predicting presidential elections with equally-weighted regressors in Fair's
equation and the fiscal model, Political Analysis 17, 333-340.
Czerlinski, J., Gigerenzer, G. & Goldstein, D. G. (1999). How good are simple heuristics? In: Gigerenzer, G. & Todd, P.
M. (Eds.), Simple heuristics that make us smart. Oxford University Press, pp. 97-118.
Einhorn, H. J. & Hogarth, R. M. (1975). Unit weighting schemes for decision-making, Organizational Behavior & Human
Performance, 13, 171-192.
Fair, R. C. (1978). The effect of economic events on votes for president, Review of Economics and Statistics, 60, 159-173.
Gough, H. G. (1962). Clinical versus statistical prediction in psychology. In: L. Postman (Ed.), Psychology in the making.
New York: Knopf, pp. 526-584.
Hogarth, R. M. (2006). When simple is hard to accept. In: Todd, P. M. & Gigerenzer, G. (Eds.), Ecological rationality:
Intelligence in the world (in press). Oxford: Oxford University Press, pp.
Jones, R. J. & Cuzán, A. G. (2008). Forecasting U.S. Presidential Elections: A brief review, Foresight, Issue 10, 29-34.
Lichtman, A. J. (2008). The keys to the white house: An index forecast for 2008, International Journal of Forecasting, 24,
299-307.
Rhode, P. W. & Strumpf, K. S. (2004). Historic presidential betting markets, Journal of Economic Perspectives, 18, 127-
142.
43. Relative performance:
issue index model vs. prediction markets
We compared the issue index forecasts to the market
prices from the Iowa Electronic Markets.
45. Performance of prediction markets
Iowa Electronic Markets (IEM) yielded a lower
forecasting error than the typical poll 74% of the
time from 1988 to 2004.
[Berg et al., 2008]
46. Relative performance:
Issue-index vs. IEM (hit rate)
For the last 150 days prior to election day, the issue-
index model (forecasts derived by jackknifing)
yielded a higher hit rate than the IEM (1988-2008).
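Jackknifing here means that each election is forecast from a model fitted to all other elections. The sketch below shows that leave-one-out procedure for a simple linear regression; it is an illustration of the general technique, not the authors' code, and the function names and data points are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def jackknife_forecasts(x: List[float], y: List[float]) -> List[float]:
    """Forecast each observation from a line fitted to all the others."""
    forecasts = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        forecasts.append(intercept + slope * x[i])
    return forecasts


# Invented index scores and vote shares, for illustration only.
index_scores = [30.0, 45.0, 50.0, 60.0, 80.0]
vote_shares = [47.0, 49.5, 51.0, 53.5, 57.0]
out_of_sample = jackknife_forecasts(index_scores, vote_shares)
```

Each forecast is out of sample because the election being predicted never contributes to the fitted line.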
47. Relative performance:
Issue-index vs. IEM (forecast error)
Across the last 150 days prior to election day, the issue-index model and the
IEM yielded an identical MAE of 2.3 percentage points (1988-2008).
Accuracy varied depending on forecast horizon.
- issue index model more accurate for long-term forecasts
- prediction markets superior for short-term forecasts
48. Procedure for selecting the issues
Operational definition
“A political issue is a matter of public concern and is something
that the next president can be expected to take action about.
An issue always focuses on a particular problem. Issues do not
include policies for solving problems.”
Four coders decided independently on whether to
include an issue.
In case of ties, the authors made the final decision.
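The inclusion rule can be written as a small function. The sketch assumes that a majority of the four coders decides and that only an even split goes to the authors; the slide states the tie rule but not the majority rule, so the latter is an assumption. The function name is hypothetical.

```python
from typing import Sequence


def include_issue(coder_votes: Sequence[bool], author_decision: bool) -> bool:
    """Decide whether an issue is included.

    coder_votes: one independent include/exclude vote per coder.
    author_decision: used only when the coders are evenly split.
    """
    yes = sum(coder_votes)
    no = len(coder_votes) - yes
    if yes == no:
        return author_decision
    return yes > no


# Four hypothetical coder votes; a 2-2 split falls back on the authors.
decision = include_issue([True, True, False, False], author_decision=True)
```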
Editor's Notes
Fair's model correctly predicted the winners in the 1980, 1984, and 1988 elections but wrongly predicted an easy re-election of Bush in 1992.
Even with massive data, it is difficult for regression to model the interactions, non-linearities, and multicollinearity that are associated with observational data in the social sciences.
Keys To The White House
1: Party Mandate
2: Party Contest
3: Incumbency
4: Third Party
5: Short-term Economy
6: Long-term Economy
7: Policy Change
8: Social Unrest
9: Scandal
10: Foreign or Military Failure
11: Foreign or Military Success
12: Incumbent Charisma/Hero
13: Challenger Charisma/Hero