Who's in the Video
Sam Yagan is co-founder and CEO of OkCupid.com, the fastest-growing free online dating service. Yagan was previously co-founder and CEO of TheSpark.com, maker of SparkNotes, and president of MetaMachine, which[…]

OkCupid records and publishes data on the interactions, profiles, and preferences of its members. This information has plenty of implications for the social-scientific quest to understand human behavior.

Question: What’s new about OkCupid?

Sam Yagan: I think of OkCupid as an online bar; a place where singles can go, have fun with each other and in the process hopefully meet new people, some of whom might turn into a romantic relationship.

Imagine if you had a video camera in every bar in the country and you could watch and log every transaction, every interaction that took place.  You could watch what each person was wearing; you could watch what pick-up line they used.  You could watch the person of this race and age approach the person of this race and age.  And was that a successful interaction?  Imagine how amazing that data set would be and how much fun you could have learning about people, dating, society. 

We’ve got that essentially on our online bar.  We can watch every time someone looks at a profile.  Do they choose to send that person a message?  We can look at every message that’s sent and we can determine, was that message replied to or not.  And we can run a bunch of regressions and a bunch of analyses to determine what were the driving factors of that decision being made to send a message or that decision being made not to reply to a message. 

So we have this data.  And what’s really interesting about the data is that it is not survey data.  It’s not an experiment; it’s not in a laboratory.  If you remember back to the 2008 elections, there was this thing called “The Bradley Effect.”  And nobody knew if the surveys were able to actually quantify the potential impact of race.  If you call someone up and say, are you racist?  They say, no.  And will race impact your vote?  They’re going to say, no.  But when you get into that ballot box, you just don’t know what’s going to happen. 

And so what we’ve got is we’re actually observing these behaviors and race is the easiest thing to talk about because it’s so salient.  But even beyond that, you can look at hair color, you can look at height.  You can look at pickup lines. 

I think the data-based approach actually appeals much more broadly than a psychology-based approach which might just be dismissed by people.  Even if you’re not a math person, you’re unlikely to dismiss something that was brought to you by a scientific process.

Question: What has the reaction been to OkCupid’s approach?

Sam Yagan: We have published some things that have made other dating executives worry about us and wonder why we’re "hurting the industry."  For example, we published that the average response rate on OkCupid is about 33%.  So about one out of every three messages you send gets responded to.  And we got several inquiries from our competitors saying, why would you say that?  Why would you tell people that most messages they send don't get replied to?

Our entire brand is about transparency.  We want that data out there because you know what?  If you are only getting one in three messages replied to, you’re normal.  You’re right there in the middle of everything with everyone else.  So rather than wondering to yourself, “Wow, am I the only person that’s not getting most of my messages responded to?”  Quite the opposite.  You are, again, in the norm.

So we really believe that transparency is the best approach.  We think we have the best product.  We think we have the best matching algorithm, we think we have the best members.  So why wouldn’t we want to just shine the light onto just how our processes work, what the real data are, and let people come to their own conclusions.

Question: How is OkCupid different from other dating sites?

Sam Yagan: The biggest difference between a psychology-based approach to matching and a data-based approach to matching, in particular in the way that we do it as opposed to somebody like eHarmony, is, you know, eHarmony employs a centralized model.  They have a specific way to match people up.  A specific belief on what drives a successful relationship.  They may be right, they may be wrong, and that’s the flaw in their model... is that they may be wrong.

Our system is decentralized.  We don’t have any preconceived notion of what makes a good match.  We don’t have any preconceived notion about whether like wants like, whether opposites attract.  We don’t have any preconceived notion that God is more important than pets.  What we believe, fundamentally, is that people are single and people therefore turn to online dating, not because they don’t know what they’re looking for, but because they don’t meet enough people in their day-to-day life. 

Ask yourself, how many single people of the appropriate gender, the appropriate orientation and the appropriate age do you meet in a given week?  If you’re like most people, you have routines.  You go to the same job, you go to the same gym, you hang out with the same friends and you don’t meet that many new people.  You turn to online dating, not because you say, “Gosh, I have no idea what I’m looking for.” You turn to online dating because you want to meet more people.  And in that way, I think the decentralized model of you come and tell us what you’re looking for, we’ll use data, we’ll use algorithms to sort through the millions of profiles and find the best people for you.  We think that’s much more powerful and, in particular, not exposed to that risk of our algorithm being wrong.  It’s possible the eHarmony algorithm is perfect and right, but it’s also possible that it’s right for some people and wrong for other people.  And if you fall into one of those... if you’re one of those people for whom their algorithm is wrong, then their product can’t service you by definition. And our model doesn’t have that fault.  We don’t believe that X is looking for Y or that A and B are matches for each other.  We simply give you the platform to express your preferences. 

We have this term in the office called “three ways,” which isn’t probably what you’re thinking about.  A three way in OkCupid’s parlance is a message, a reply and a reply back.  We believe that if you’re in a "three way," that is, you are having a conversation with someone, that is a metric of success.  We’ve done a good job of matching you up.  So we do feed back whether or not a message led to a "three way" into our matching algorithms.
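As an editorial illustration of the "three way" metric described above, the following is a minimal sketch of how a message, a reply and a reply back could be detected in a message log. The function name `three_way_pairs` and the `(sender, recipient)` log format are assumptions made for this example; the interview does not describe how OkCupid actually stores or processes messages.

```python
from collections import defaultdict


def three_way_pairs(messages):
    """Return the pairs of users whose conversation reached a "three way".

    `messages` is a hypothetical log: a list of (sender, recipient)
    tuples in chronological order. A pair qualifies once the log shows
    a message, a reply from the other person, and a reply back.
    """
    turns = defaultdict(int)   # conversational turns seen per pair
    last_sender = {}           # who spoke last in each pair's thread

    for sender, recipient in messages:
        pair = frozenset((sender, recipient))
        # Consecutive messages from the same person count as one turn.
        if last_sender.get(pair) != sender:
            turns[pair] += 1
            last_sender[pair] = sender

    return {pair for pair, count in turns.items() if count >= 3}


log = [
    ("ann", "bob"), ("bob", "ann"), ("ann", "bob"),   # message, reply, reply back
    ("cat", "dan"), ("dan", "cat"),                   # message and reply only
    ("eve", "fay"), ("eve", "fay"), ("eve", "fay"),   # three unanswered messages
]
print(three_way_pairs(log))
```

Only the first conversation qualifies here: three unanswered messages from the same sender count as a single turn, so volume alone does not register as success.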

There is a risk of people gaming the system.  And that’s a question we get a lot because these are expressed preferences rather than observed behaviors.  There is a chance for someone to answer questions in a particular way that they might think is more attractive to someone that they’re trying to impress.  We have a few different ways of countering that.  The most powerful of which is, our system inherently penalizes inconsistency.  What do I mean by that?  There might be 10 questions, imagine, that we have learned are highly correlated.  Right?  We know that people who tend to answer question A a certain way also answer question B a certain way and answer question C a certain way.

So if you go through, you’re trying to game the system and you don’t actually have these preferences, you’re just trying to guess what answers are going to be the most attractive to a given person, you’re unlikely to know and to expose all those same correlations that we’ve seen in the data. 

Sort of a more visceral way of thinking about that is: if you try to game the system, you’re unlikely to... you’re going to have a set of preferences that is likely to piss everyone off in some way or another.  Right?  Because you’re not a sort of coordinated set of answers that’s likely to make one person very happy and one person very unhappy, you’re likely just to make every person relatively unhappy because you haven’t answered the questions in a consistent way that people are looking for. 

And so the system inherently penalizes someone who tries to game it.  It inherently penalizes inconsistency.  Now you could sort of put yourself in some sort of Zen mode where you just say, "I am going to become this person.  I’m going to become a 45-year-old devout Catholic who is looking for marriage." And you could sort of live out this whole lifestyle and answer every question exactly that way, and sure, in that case you’ll game the system.  But for most cases, you’re going to have missteps along the way and the system is going to penalize you.
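To make the idea of correlated answers concrete, here is a rough editorial sketch. It is not OkCupid's method: the interview says the penalty arises inherently from how matching works, whereas this example computes an explicit score purely for illustration. The function `inconsistency_score`, the rule format, and the sample questions are all hypothetical.

```python
def inconsistency_score(answers, correlated_rules):
    """Share of applicable learned correlations that a profile breaks.

    `answers` maps a question id to the user's answer.
    `correlated_rules` is a hypothetical list of (q1, a1, q2, a2)
    tuples meaning: people who answer a1 to q1 usually answer a2 to q2.
    Returns a value between 0.0 (fully consistent) and 1.0.
    """
    checked = 0
    violated = 0
    for q1, a1, q2, a2 in correlated_rules:
        # A rule only applies if the user gave the triggering answer
        # and also answered the second question.
        if answers.get(q1) == a1 and q2 in answers:
            checked += 1
            if answers[q2] != a2:
                violated += 1
    return violated / checked if checked else 0.0


# Made-up correlations, for illustration only.
rules = [
    ("religion_important", "yes", "attends_services", "yes"),
    ("religion_important", "yes", "wants_marriage", "yes"),
    ("likes_outdoors", "yes", "weekend_plan", "hiking"),
]

genuine = {"religion_important": "yes", "attends_services": "yes",
           "wants_marriage": "yes"}
gamed = {"religion_important": "yes", "attends_services": "no",
         "wants_marriage": "yes"}

print(inconsistency_score(genuine, rules))  # 0.0
print(inconsistency_score(gamed, rules))    # 0.5
```

Someone guessing at attractive answers would have to know every such correlation to keep the score at zero, which is the "missteps along the way" point made above.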

A lot of times what people think they want and what people actually want turn out to be different.  So you may say that I want someone in this age range, or you may be searching for someone of a certain ethnicity, of a certain race, but it may turn out that you may be sending messages to people outside of that range, or outside of those constraints.  And so we will sometimes loosen constraints for you if we know that you’re actually more receptive to people outside of those boundaries.
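The loosening of constraints described above could look something like the following editorial sketch, which widens a stated age range when enough of a user's sent messages go to people outside it. The function `loosen_age_range` and the 20% threshold are assumptions chosen for the example; the interview gives no detail on the actual rule.

```python
def loosen_age_range(stated, messaged_ages, min_outside_share=0.2):
    """Widen a stated (low, high) age range based on observed messaging.

    `messaged_ages` holds the ages of people the user actually messaged.
    If at least `min_outside_share` of them fall outside the stated
    range, the range is stretched to include them; otherwise the stated
    preference is kept. The threshold is an illustrative assumption.
    """
    low, high = stated
    if not messaged_ages:
        return stated

    outside = [age for age in messaged_ages if age < low or age > high]
    if len(outside) / len(messaged_ages) < min_outside_share:
        return stated

    return (min(low, min(outside)), max(high, max(outside)))


# Two of five messages went to people older than the stated maximum.
print(loosen_age_range((25, 30), [26, 27, 33, 34, 28]))  # (25, 34)
# One stray message out of ten is not enough to change anything.
print(loosen_age_range((25, 30), [26, 27, 28, 29, 31, 26, 27, 28, 29, 30]))  # (25, 30)
```

The threshold keeps a single out-of-range message from overriding what the user said they wanted.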

Recorded on November 4, 2010
Interviewed by Teddy Sherrill

Directed & Produced by Jonathan Fowler

