Mark 2012 down as the year that we finally saw traditional political polls for what they are – a form of voodoo black magic mixed with Machiavellian pseudoscience. With only 10 days to go until the general election, polls seem to tell us everything - and nothing - at the same time. Romney has opened up an insurmountable lead! No, wait, Obama will be the winner by a landslide! Hold on, hold on, the latest polls have the candidates dead even! Sometimes, poll results are so blatantly contradictory (zigzagging in the course of a single day) that even veteran political pollsters throw their hands up in despair. So is there any way that some of the most exciting advances made possible by the Internet - everything from Big Data analysis to social networking analysis - could be used to create an Internet Prediction Machine capable of forecasting political results?
The Internet has already proven to have remarkably strong predictive power, being used for everything from forecasting box office revenues to forecasting the outbreak of epidemics. This has largely been made possible by studying activity on social networking sites such as Facebook or Twitter in order to gauge momentum, sentiment and intent. Unlike traditional polls, which are essentially point-in-time estimates of future voting behavior, social networking analysis is much better at smoothing out all the mood swings and sentiment shifts that occur throughout a campaign. Moreover, one could argue that social networking analysis is inherently more democratic than telephone polling - you can measure a much broader swathe of the population (i.e. all known Internet users), instead of just reaching out to the weirdo 9% who actually respond to polls.
The next step is to move beyond expressed intent – I tell you what I plan to do – to something much more difficult to measure, and that’s implied intent – my actions let you know what I’m going to do, even when I don’t tell you. The New York Times ran a fascinating article last weekend by Seth Stephens-Davidowitz on how Google searches are a remarkably effective way to gauge implied intent, especially when it comes to the likelihood of an undecided voter actually going to the polls on Election Day. As a result, they are also good at predicting the demographic makeup of the electorate on Election Day. This is extraordinarily relevant for pollsters, since it helps them adjust poll results for higher-than-normal or lower-than-normal election turnouts. For a candidate's "ground game" to work to perfection, this type of information is critical.
But Google searches are just the tip of the predictive iceberg. There's so much Big Data out there that we don't even know what to do with all of it. Think of how much time you spend interacting with your digital devices (laptops, smartphones, tablets) all day long -- how could all of those check-ins, tweets, SMS messages, blog posts and emails be used to create increasingly sophisticated psychological profiles of voters? By understanding the psychological profile of a voter, you can be far more certain of how they think and of what's in their head. Certainly, it would go way beyond how we classify voters today, with simplistic labels such as "undecided" or "partisan".
Researchers within the government and the private sector have constructed a few Internet Prediction Machines that build on the Big Data promise of the Internet to actually put numbers against all that implied intent. As you might imagine, however, these Internet Prediction Machines also have a flip side - they could have a very real negative impact on our personal privacy. The scariest of these efforts resemble Total Information Awareness-style initiatives that automatically collect data from every aspect of our online lives, without us even knowing about it. In some scenarios, they even go beyond our online lives to include things like images from traffic webcams or information from public records.
If this all sounds vaguely like some kind of science fiction scenario that you've heard about, that’s because it is. As John Markoff of the New York Times pointed out last year, Internet Prediction Machines are related to the concept of "psychohistory" first outlined by Isaac Asimov in his "Foundation" series of science fiction works 60 years ago. For Asimov, psychohistory was an amalgam of history, statistics and psychology that functions much like today’s data mining and social networking analysis – it attempts to predict the future by extrapolating from individual psychology and crunching the numbers on all known data in the universe.
Who knows? By 2016, we might not need polls, and we may not even need campaigns. Thanks to the creation of Internet Prediction Machines, we'll know the eventual winner months ahead of time.