What does our online data say about who we are? That's the question at the heart of Christian Rudder's best-selling book Dataclysm: Who We Are (When We Think No One's Looking), released earlier this year. Rudder is co-founder and president of the online dating site OkCupid, which serves as the source of many of his analytics. In fact, Dataclysm is very similar in subject matter to the OkTrends blog he maintained for several years at OkCupid. The popular blog offered a unique brand of social analysis relying on statistics gathered from the site's user base. Dataclysm continues in this vein while also speaking to how data scientists have become the newest breed of demographers. Rudder offers some examples of his findings in today's featured Big Think interview:

Several of Rudder's initial observations appear to substantiate commonly held perceptions of heterosexual dating:

"You see that men are the kind of pursuers in relationships at a four to one ratio and kind of correspondingly, women, because they're getting four messages to every one they send out, like they respond a lot less and response rates track directly with how hot the writer was, is."

Beneath the surface, though, Rudder's unique access to data such as message length, time spent composing messages, and message response rates reveals some surprising behavioral trends:

"You also see that once people start talking and they establish a rapport, which for OkCupid is four messages going back-and-forth, that attractiveness kind of goes out the window at that point. Your personality takes over after the fourth message." 

Rudder also takes note of implicit racial biases among OkCupid users, all in spite of the site's politically progressive demographics:

"We're all highly coastal. Very little red state, very blue. On a piece of paper OkCupid should be a very progressive place... But the data that we have, you know, black users get three quarters of the messages, the positive votes. Their attractiveness ratings are three quarters of an average white user, or Latino user for that matter. They get replied to about three quarters of the time. It's pretty blanket."

What Rudder is saying is that black users are only 75% as likely as white or Latino users to get positive feedback from other people on the site. Asian men experience similar stats, though not Asian women. Rudder compared his OkCupid data to stats from other sites like Match.com and DateHookup. He found that these percentages stayed true across the board. This isn't a matter of small sample sizes; data from those three sites is drawn from 30 million people. Rudder notes that this is about half of the United States' "single-and-looking" population.

Rudder goes on to comment on other trends he's spotted in his data. Shorter, more concise messages on OkCupid tend to do better than longer ones, though not by a huge margin. Copying and pasting the same message to multiple users is probably the best strategy for achieving a high return per unit of effort; it's certainly more efficient than writing a unique message to every person you connect with. Rudder makes sure to note that, even though these bits of info are interesting in their own right, the truly fascinating piece of this puzzle is that all these observations were derived from social media user stats. Outside of a government census, when in history have we ever had the ability to collect data from such a large pool of people and draw conclusions about the nature of society and human behavior?

"It's the best data set in the world because it's people, all strangers, all making judgments of one another, all probably trying to sleep with each other, which also adds a certain piquancy to the whole thing. So, you know, you look at the data and you really get a kind of special window into people's psyche."