Wish List for 2020 US Presidential Polling

Here are a few observations before I step back onto the sidelines to watch the critiques of 2020 US presidential election polling methods.  Some of these points are being made elsewhere, even if they haven’t attracted much notice.  There’s a perception that polls failed in 2016.  While it’s true they were accurate in aggregate, they certainly didn’t predict the actual winner: the candidate they favored won the popular vote but came up short in the Electoral College anyway.

Attention to this topic put pollsters in a defensive stance, and in May 2017 AAPOR (the American Association for Public Opinion Research) issued its evaluation of what went wrong.  It is thorough and useful.  Sound scientific methods, when applied without practical knowledge, can still produce bad results.  What can be done differently in 2020?

1. Fund high-quality polls in the few states that matter

The implicit sample frame for US presidential polling is all 50 states.  Except for a few key primary races, however, most of them are irrelevant to the outcome.  Well-funded public polling was very scarce in the most important states during the last days of the 2016 campaign: MI, PA, and WI.  As Pew Research has warned, “high-quality state-level polling in the U.S. remains sparse and underfunded”.

In California, The Field Poll was an example of a high-quality, geographically intelligent polling organization that operated for 70 years before closing in 2016.  It enjoyed an excellent track record and was long heralded as an example of a savvy intrastate operation.  But evidently, even in a large state, the economics weren’t compelling enough to secure continued funding.  Like many traditional newspaper functions, public polls have trouble surviving.  The remedy to this predicament is going to be financial.  Perhaps it can happen once media consumers focus their attention on the handful of states that determine presidential election outcomes.  Resources are needed to develop polling infrastructure in states like AZ, FL, GA, and TX, in addition to the three states that determined the 2016 result.

2. Resist the temptation for “horse-race” reporting (unless your sample is large and it is November already)

In key states, voters who decided late broke decisively for Trump, and their sentiments could have shifted over time.  Measuring sentiment shouldn’t be confused with forecasting outcomes.  Incessantly checking on a small proportion of late deciders is pointless: unless the election is truly imminent, the precision of the results will rapidly decay.

The most accurate poll that I worked with was at Harris Interactive for the 2000 election, when internet polling was in its infancy (Bush/Gore error: 0.0%).  Functionally, it replicated the election by interviewing hundreds of thousands of voters in every state.  All states were called correctly except Florida, which arguably was an accurate finding.  The ginormous sample size enabled solid weighting, but what likely mattered more was being able to capture respondents on the weekend prior to Election Day.  It was as if a dry-run, simulated election were being conducted.  Unless conditions like these can be satisfied, specifically in swing states, much polling is merely “directional” and falls short of a forecast.

As boring and as disappointing as it may seem, it is smarter to express probabilities than to predict “winners”.  Media commentators frequently drone on about percentage results, as if they were announcing baseball scores.  There needs to be a more appealing way to convey the concept of confidence intervals.
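To make the arithmetic concrete, here is a minimal sketch of the margin of error behind a poll result, assuming a simple random sample and a normal approximation; the 48%-among-800 figures are hypothetical.  Reporting the resulting interval, rather than the point estimate alone, is one small step from “winners” toward probabilities.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p
    estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical poll: 48% support among 800 likely voters.
p, n = 0.48, 800
moe = margin_of_error(p, n)
print(f"{p:.0%} ± {moe:.1%}")                      # 48% ± 3.5%
print(f"95% CI: [{p - moe:.1%}, {p + moe:.1%}]")   # [44.5%, 51.5%]
```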

3. The value of psychographic weighting is scarcely tapped

It’s widely recognized that 2016 polling fell down by inadequately weighting for non-college graduates; graduates were over-represented in samples.  In fact, it is increasingly psychological variables that distinguish American voters, not the familiar economic or educational ones; educational attainment may simply be a crude proxy for them.  This is a fascinating topic for further study, at least for those who want to dive into the recesses of the US collective psyche.
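The education problem can be illustrated with simple post-stratification arithmetic.  This is a toy sketch: the sample shares, benchmark shares, and support rates below are all invented, and real weighting schemes use many more cells.

```python
# Hypothetical: the sample over-represents college graduates relative to a
# population benchmark (e.g., Census education figures). Numbers invented.
sample_share = {"college_grad": 0.55, "non_grad": 0.45}      # in the poll
population_share = {"college_grad": 0.35, "non_grad": 0.65}  # benchmark

# Post-stratification weight per group: population share / sample share.
weights = {g: population_share[g] / sample_share[g] for g in sample_share}
print(weights)  # non-grads weighted up (~1.44), grads down (~0.64)

# Invented support rates: 30% among grads, 55% among non-grads.
support = {"college_grad": 0.30, "non_grad": 0.55}
unweighted = sum(sample_share[g] * support[g] for g in support)
weighted = sum(population_share[g] * support[g] for g in support)
print(f"unweighted: {unweighted:.1%}, weighted: {weighted:.1%}")  # 41.2% vs 46.2%
```

An unweighted sample like this one understates the candidate’s support by about five points, which is roughly the shape of the 2016 education miss.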

Social psychologists have defined attributes that identify people who are attracted to Trump and contemporary Republicans, including “authoritarianism, social dominance orientation, prejudice, relative deprivation, and intergroup contact”.  Some of these can reasonably be marked out within a few well-designed queries.  For example, the frequency of exposure to, or contact with, individuals outside one’s own group can be self-reported.  It’s also possible to gauge beliefs about imagined threats, or preferences for authoritarian leaders.  Direct questioning isn’t always needed: passive data harvesting about media consumption or product purchases is less intrusive and may cost less.

Psychographic profiling is controversial precisely because it is so powerful.  Smart candidates have already recognized that identifying the psychological attributes of specific voters can be decisive in racking up votes.  Pollsters may benefit from using similar attributes to ensure adequate representation of the entire voting public.  As in certain kinds of epidemiological research, the distribution of these psychological variables could be estimated in order to establish weighting targets, and their prevalence within defined geographies might be demarcated.  That would support achieving representative samples.
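One way such targets might be applied is raking (iterative proportional fitting), which alternately adjusts respondent weights until the sample matches each target margin.  The sketch below assumes hypothetical 0/1 codings and invented target distributions; it illustrates the mechanics, not actual prevalence figures.

```python
import numpy as np

# Hypothetical respondents: a demographic trait (education) and a psychographic
# one (say, high authoritarianism), each coded 0/1. Target margins invented.
rng = np.random.default_rng(0)
educ = rng.integers(0, 2, size=1000)   # 0 = non-grad, 1 = grad
auth = rng.integers(0, 2, size=1000)   # 0 = low, 1 = high
targets = {"educ": np.array([0.65, 0.35]), "auth": np.array([0.40, 0.60])}

w = np.ones(1000)
for _ in range(25):  # rake: alternately match each variable's margins
    for var, tgt in ((educ, targets["educ"]), (auth, targets["auth"])):
        current = np.array([w[var == k].sum() for k in (0, 1)]) / w.sum()
        w *= (tgt / current)[var]  # scale each respondent's weight

print([round(w[educ == k].sum() / w.sum(), 3) for k in (0, 1)])  # ~[0.65, 0.35]
print([round(w[auth == k].sum() / w.sum(), 3) for k in (0, 1)])  # ~[0.40, 0.60]
```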

4. Account for the “Shy Trump” effect

Some Trump voters will be reluctant to admit their candidate preference.  Social desirability biases survey results when respondents tell you what they think you want to hear, not what they believe.  In the 2020 US context, the stakes of expressing an affinity for Trump are high.  Even in a more private mode, such as online instead of by phone, there is a perceived risk of being categorized as immoral or racist.

The “Shy Trump” effect is a variant of a previously established form of bias, the Bradley effect, named for Tom Bradley’s loss of the 1982 California gubernatorial election.  In that pattern, a voter says they will vote for a non-white candidate (Bradley) while actually voting for a white candidate.  It may be a particular concern among striving, upper-middle-class Republicans.  Conceivably, it could be partially remedied through tiered question phrasing: for example, a respondent is instead asked how their neighbors or colleagues will vote, deflecting the sensitivity of the question.  This year, there may be opportunities to validate questions designed to positively identify this subset, whose presence may meaningfully affect measurement accuracy.
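As a toy illustration of the tiered-question idea, one could compare the share of respondents who report their own support with the share who say their neighbors support the candidate; a persistent gap is one rough, imperfect signal of social-desirability bias.  The numbers below are invented.

```python
# Invented numbers illustrating the tiered-question comparison.
direct_support = 0.41    # "Who will you vote for?"
neighbor_support = 0.47  # "Who do your neighbors plan to vote for?"

shy_gap = neighbor_support - direct_support  # rough social-desirability signal
print(f"direct: {direct_support:.0%}, indirect: {neighbor_support:.0%}, "
      f"gap: {shy_gap:+.0%}")
```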

5. Be wary of aggregate polls

From now until November, expect poll results to sometimes be massaged and manipulated beyond their intended use.   One example is the meta-poll, in which results from many polls are aggregated.  Enough bad apples can spoil this bushel.   Put another way, an aggregate of polls may be no stronger than its weakest link.
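Here is a toy example of that weakest-link problem, assuming a simple sample-size-weighted average, which is one common aggregation rule; all of the polls are invented, including a large “bad apple” with a skewed frame.

```python
# Hypothetical meta-poll: a simple sample-size-weighted average.
polls = [  # (candidate share, sample size); all invented
    (0.48, 600),
    (0.49, 800),
    (0.47, 700),
    (0.54, 2500),  # the "bad apple": big sample, skewed frame
]

aggregate = sum(p * n for p, n in polls) / sum(n for _, n in polls)
print(f"aggregate: {aggregate:.1%}")  # 51.3%, pulled toward the outlier

clean = polls[:3]  # drop the outlier for comparison
clean_avg = sum(p * n for p, n in clean) / sum(n for _, n in clean)
print(f"without the bad apple: {clean_avg:.1%}")  # 48.0%
```

Note that weighting by sample size makes things worse here: the biased poll dominates precisely because it is large.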


These observations suggest opportunities to improve polling methods.  Most require funding and will therefore be difficult to implement.  Only one requires an attitudinal shift in how data are interpreted.  With so much public attention on the US presidential election this year, the circumstances may be ideal for conducting applied research on the rest.
