How a Math Genius Hacked OkCupid to Find True Love - Wired Science


Page 1: How a Math Genius Hacked OkCupid to Find True Love - Wired Science

How a Math Genius Hacked OkCupid to Find True Love

BY KEVIN POULSEN 01.21.14 6:30 AM

Mathematician Chris McKinlay hacked OkCupid to find the girl of his dreams. Emily Shur

Chris McKinlay was folded into a cramped fifth-floor cubicle in UCLA’s math sciences building, lit by a single bulb and the glow from his monitor. It was 3 in the morning, the optimal time to squeeze cycles out of the supercomputer in Colorado that he was using for his PhD dissertation. (The subject: large-scale data processing and parallel numerical methods.) While the computer chugged, he clicked open a second window to check his OkCupid inbox.


McKinlay, a lanky 35-year-old with tousled hair, was one of about 40 million Americans looking for romance through websites like Match.com, J-Date, and e-Harmony, and he’d been searching in vain since his last breakup nine months earlier. He’d sent dozens of cutesy introductory messages to women touted as potential matches by OkCupid’s algorithms. Most were ignored; he’d gone on a total of six first dates.

On that early morning in June 2012, his compiler crunching out machine code in one window, his forlorn dating profile sitting idle in the other, it dawned on him that he was doing it wrong. He’d been approaching online matchmaking like any other user. Instead, he realized, he should be dating like a mathematician.

OkCupid was founded by Harvard math majors in 2004, and it first caught daters’ attention because of its computational approach to matchmaking. Members answer droves of multiple-choice survey questions on everything from politics, religion, and family to love, sex, and smartphones.

On average, respondents select 350 questions from a pool of thousands—“Which of the following is most likely to draw you to a movie?” or “How important is religion/God in your life?” For each, the user records an answer, specifies which responses they’d find acceptable in a mate, and rates how important the question is to them on a five-point scale from “irrelevant” to “mandatory.” OkCupid’s matching engine uses that data to calculate a couple’s compatibility. The closer to 100 percent—mathematical soul mate—the better.

But mathematically, McKinlay’s compatibility with women in Los Angeles was abysmal. OkCupid’s algorithms use only the questions that both potential matches decide to answer, and the match questions McKinlay had chosen—more or less at random—had proven unpopular. When he scrolled through his matches, fewer than 100 women would appear above the 90 percent compatibility mark. And that was in a city containing some 2 million women (approximately 80,000 of them on OkCupid). On a site where compatibility equals visibility, he was practically a ghost.
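A rough sketch of how a score like this could be computed from the three ingredients described above (an answer, a set of acceptable answers, and an importance rating), using only the questions both people answered. The article does not give the formula, so the point values per importance level and the geometric-mean combination below are assumptions made for illustration, as are the sample profiles and the middle three level names.

```python
from math import sqrt

# ASSUMPTION: illustrative point values for a five-level importance scale.
# The article names only the endpoints ("irrelevant", "mandatory").
IMPORTANCE_POINTS = {
    "irrelevant": 0,
    "a little important": 1,
    "somewhat important": 10,
    "very important": 50,
    "mandatory": 250,
}


def satisfaction(rater, other):
    """Fraction of the rater's importance points that `other` earns,
    counting only questions both people answered."""
    shared = rater.keys() & other.keys()
    possible = earned = 0
    for q in shared:
        points = IMPORTANCE_POINTS[rater[q]["importance"]]
        possible += points
        if other[q]["answer"] in rater[q]["accepts"]:
            earned += points
    return earned / possible if possible else 0.0


def match_percent(a, b):
    """ASSUMPTION: combine the two one-way scores with a geometric mean."""
    return 100 * sqrt(satisfaction(a, b) * satisfaction(b, a))


# Hypothetical profiles: question -> own answer, acceptable answers, importance.
alice = {
    "religion": {"answer": "not at all", "accepts": {"not at all", "not very"},
                 "importance": "very important"},
    "movies": {"answer": "plot", "accepts": {"plot", "cast"},
               "importance": "a little important"},
}
bob = {
    "religion": {"answer": "not very", "accepts": {"not very"},
                 "importance": "mandatory"},
    "movies": {"answer": "plot", "accepts": {"plot"},
               "importance": "somewhat important"},
    "pets": {"answer": "dogs", "accepts": {"dogs"}, "importance": "irrelevant"},
}

print(round(match_percent(alice, bob), 1))
```

In this toy case Bob fully satisfies Alice, but Alice misses the one question Bob marked "mandatory," so the combined score is low. Bob's "pets" answer is ignored because Alice never answered that question, which is the effect that left McKinlay with so few high matches.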

He realized he’d have to boost that number. If, through statistical sampling, McKinlay could ascertain which questions mattered to the kind of women he liked, he could construct a new profile that honestly answered those questions and ignored the rest. He could match every woman in LA who might be right for him, and none that weren’t.

Chris McKinlay used Python scripts to riffle through hundreds of OkCupid survey questions. He then sorted female daters into seven clusters, like “Diverse” and “Mindful,” each with distinct characteristics. Mauricio Alejo

Even for a mathematician, McKinlay is unusual. Raised in a Boston suburb, he graduated from Middlebury College in 2001 with a degree in Chinese. In August of that year he took a part-time job in New York translating Chinese into English for a company on the 91st floor of the north tower of the World Trade Center. The towers fell five weeks later. (McKinlay wasn’t due at the office until 2 o’clock that day. He was asleep when the first plane hit the north tower at 8:46 am.) “After that I asked myself what I really wanted to be doing,” he says. A friend at Columbia recruited him into an offshoot of MIT’s famed professional blackjack team, and he spent the next few years bouncing between New York and Las Vegas, counting cards and earning up to $60,000 a year.

The experience kindled his interest in applied math, ultimately inspiring him to earn a master’s and then a PhD in the field. “They were capable of using mathematics in lots of different situations,” he says. “They could see some new game—like Three Card Pai Gow Poker—then go home, write some code, and come up with a strategy to beat it.”

Now he’d do the same for love. First he’d need data. While his dissertation work continued to run on the side, he set up 12 fake OkCupid accounts and wrote a Python script to manage them. The script would search his target demographic (heterosexual and bisexual women between the ages of 25 and 45), visit their pages, and scrape their profiles for every scrap of available information: ethnicity, height, smoker or nonsmoker, astrological sign—“all that crap,” he says.

To find the survey answers, he had to do a bit of extra sleuthing. OkCupid lets users see the responses of others, but only to questions they’ve answered themselves. McKinlay set up his bots to simply answer each question randomly—he wasn’t using the dummy profiles to attract any of the women, so the answers didn’t matter—then scooped the women’s answers into a database.

McKinlay watched with satisfaction as his bots purred along. Then, after about a thousand profiles were collected, he hit his first roadblock. OkCupid has a system in place to prevent exactly this kind of data harvesting: It can spot rapid-fire use easily. One by one, his bots started getting banned.

He would have to train them to act human.

He turned to his friend Sam Torrisi, a neuroscientist who’d recently taught McKinlay music theory in exchange for advanced math lessons. Torrisi was also on OkCupid, and he agreed to install spyware on his computer to monitor his use of the site. With the data in hand, McKinlay programmed his bots to simulate Torrisi’s click-rates and typing speed. He brought in a second computer from home and plugged it into the math department’s broadband line so it could run uninterrupted 24 hours a day.

After three weeks he’d harvested 6 million questions and answers from 20,000 women all over the country. McKinlay’s dissertation was relegated to a side project as he dove into the data. He was already sleeping in his cubicle most nights. Now he gave up his apartment entirely and moved into the dingy beige cell, laying a thin mattress across his desk when it was time to sleep.

For McKinlay’s plan to work, he’d have to find a pattern in the survey data—a way to roughly group the women according to their similarities. The breakthrough came when he coded up a modified Bell Labs algorithm called K-Modes. First used in 1998 to analyze diseased soybean crops, it takes categorical data and clumps it like the colored wax swimming in a Lava Lamp. With some fine-tuning he could adjust the viscosity of the results, thinning it into a slick or coagulating it into a single, solid glob.

He played with the dial and found a natural resting point where the 20,000 women clumped into seven statistically distinct clusters based on their questions and answers. “I was ecstatic,” he says. “That was the high point of June.”
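For readers curious what clustering categorical answers looks like in practice, here is a minimal sketch of the basic k-modes idea: measure distance by counting mismatched answers, and summarize each cluster by its most common answer to each question. It is not McKinlay’s modified version, which the article does not describe. The sample answers are invented, and seeding the clusters with the first k distinct records is a simplification chosen to keep the sketch deterministic; practical implementations usually try several randomized starts. The number of clusters `k` is presumably the kind of dial the article alludes to.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions where two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most common value in each column of a non-empty list of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(rows, k, max_iter=100):
    """Cluster categorical records (tuples) around k modes.

    ASSUMPTION: initial modes are simply the first k distinct records.
    """
    modes = list(dict.fromkeys(rows))[:k]
    if len(modes) < k:
        raise ValueError("need at least k distinct records")
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record joins its nearest mode.
        new_labels = [
            min(range(k), key=lambda j: hamming(r, modes[j])) for r in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each mode becomes the column-wise mode of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                modes[j] = column_modes(members)
    return labels, modes


# Hypothetical answers to three survey questions, one tuple per person.
answers = [
    ("rest of life", "6+ dates", "very important"),
    ("one night", "1-2 dates", "not important"),
    ("rest of life", "6+ dates", "somewhat important"),
    ("few months", "1-2 dates", "not important"),
    ("rest of life", "3-5 dates", "very important"),
    ("one night", "1-2 dates", "not very important"),
]

labels, modes = k_modes(answers, k=2)
print(labels)  # → [0, 1, 0, 1, 0, 1]
print(modes)
```

Raising or lowering `k` splits the data into finer or coarser groups, so choosing it is a judgment call much like the one described above.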

He retasked his bots to gather another sample: 5,000 women in Los Angeles and San Francisco who’d logged on to OkCupid in the past month. Another pass through K-Modes confirmed that they clustered in a similar way. His statistical sampling had worked.


Now he just had to decide which cluster best suited him. He checked out some profiles from each. One cluster was too young, two were too old, another was too Christian. But he lingered over a cluster dominated by women in their mid-twenties who looked like indie types, musicians and artists. This was the golden cluster. The haystack in which he’d find his needle. Somewhere within, he’d find true love.

Actually, a neighboring cluster looked pretty cool too—slightly older women who held professional creative jobs, like editors and designers. He decided to go for both. He’d set up two profiles and optimize one for the A group and one for the B group.

He text-mined the two clusters to learn what interested them; teaching turned out to be a popular topic, so he wrote a bio that emphasized his work as a math professor. The important part, though, would be the survey. He picked out the 500 questions that were most popular with both clusters. He’d already decided he would fill out his answers honestly—he didn’t want to build his future relationship on a foundation of computer-generated lies. But he’d let his computer figure out how much importance to assign each question, using a machine-learning algorithm called adaptive boosting to derive the best weightings.
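The article does not say how the boosting was set up, so the following is only one plausible illustration of adaptive boosting (AdaBoost) producing per-question weights. It assumes each question has been reduced to a hypothetical 0/1 feature (say, whether a woman’s answer is compatible with his) and each woman is labeled +1 if she belongs to the target cluster and -1 otherwise. Each round picks the single question that best separates the classes, credits it with a weight, and then makes the misclassified examples count for more in the next round.

```python
from math import exp, log


def adaboost_question_weights(X, y, rounds=10):
    """Discrete AdaBoost with one-question decision stumps.

    X: list of 0/1 feature rows, one column per question (hypothetical encoding).
    y: list of +1/-1 labels.
    Returns the total stump weight (alpha) accumulated by each question.
    """
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n          # per-example weights, initially uniform
    totals = [0.0] * d
    for _ in range(rounds):
        # Pick the stump (question, polarity) with the lowest weighted error.
        best = None
        for q in range(d):
            for polarity in (1, -1):
                preds = [polarity * (1 if row[q] else -1) for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, q, preds)
        err, q, preds = best
        if err >= 0.5:
            break              # no stump beats chance on the current weights
        err = max(err, 1e-10)  # guard against log/division blow-up at err == 0
        alpha = 0.5 * log((1 - err) / err)
        totals[q] += alpha
        # Re-weight: misclassified examples gain weight, then normalize.
        w = [wi * exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return totals


# Toy data: question 0 tracks the label closely, question 1 only loosely.
X = [[1, 1], [1, 0], [1, 1], [0, 0], [0, 1], [1, 0]]
y = [1, 1, 1, -1, -1, -1]

weights = adaboost_question_weights(X, y, rounds=3)
print(weights)
```

Under this setup the more predictive question accumulates the larger weight. Mapping such weights onto a site’s five importance levels would be a further step that the article leaves unspecified.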


Emily Shur (Grooming by Andrea Pezzillo/Artmix Beauty)

With that, he created two profiles, one with a photo of him rock climbing and the other of him playing guitar at a music gig. “Regardless of future plans, what’s more interesting to you right now? Sex or love?” went one question. Answer: Love, obviously. But for the younger A cluster, he followed his computer’s direction and rated the question “very important.” For the B cluster, it was “mandatory.”

When the last question was answered and ranked, he ran a search on OkCupid for women in Los Angeles sorted by match percentage. At the top: a page of women matched at 99 percent. He scrolled down … and down … and down. Ten thousand women scrolled by, from all over Los Angeles, and he was still in the 90s.

He needed one more step to get noticed. OkCupid members are notified when someone views their pages, so he wrote a new program to visit the pages of his top-rated matches, cycling by age: a thousand 41-year-old women on Monday, another thousand 40-year-old women on Tuesday, looping back through when he reached 27-year-olds two weeks later. Women reciprocated by visiting his profiles, some 400 a day. And messages began to roll in.
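The arithmetic of that rotation is easy to check: stepping down one age per day from 41 to 27 covers 15 ages, so the 27-year-olds come up on day 15, two weeks after the first Monday, and the cycle restarts the next day. A small sketch of just the schedule follows; the function name is made up for illustration, and nothing here touches the site.

```python
from itertools import cycle, islice

OLDEST, YOUNGEST = 41, 27  # age range described in the article


def age_schedule(days):
    """Return (day_number, age) pairs, stepping down one age per day and
    looping back to the oldest age after the youngest."""
    ages = cycle(range(OLDEST, YOUNGEST - 1, -1))
    return list(enumerate(islice(ages, days), start=1))


plan = age_schedule(16)
print(plan[0], plan[14], plan[15])  # → (1, 41) (15, 27) (16, 41)
```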

“I haven’t until now come across anyone with such winning numbers, AND I find your profile intriguing,” one woman wrote. “Also, something about a rugged man who’s really good with numbers … Thought I’d say hi.”

“Hey there—your profile really struck me and I wanted to say hi,” another wrote. “I think we have quite a lot in common, maybe not the math but certainly a lot of other good stuff!”

“Can you really translate Chinese?” yet another asked. “I took a class briefly but it didn’t go well.”

The math portion of McKinlay’s search was done. Only one thing remained. He’d have to leave his cubicle and take his research into the field. He’d have to go on dates.

On June 30, McKinlay showered at the UCLA gym and drove his beat-up Nissan across town for his first data-mined date. Sheila was a web designer from the A cluster of young artist types. They met for lunch at a cafe in Echo Park. “It was scary,” McKinlay says. “Up until this point it had almost been an academic exercise.”

By the end of his date with Sheila, it was clear to both that the attraction wasn’t there. He went on his second date the next day—an attractive blog editor from the B cluster. He’d planned a romantic walk around Echo Park Lake but found it was being dredged. She’d been reading Proust and feeling down about her life. “It was kind of depressing,” he says.

Date three was also from the B group. He met Alison at a bar in Koreatown. She was a screenwriting student with a tattoo of a Fibonacci spiral on her shoulder. McKinlay got drunk on Korean beer and woke up in his cubicle the next day with a painful hangover. He sent Alison a follow-up message on OkCupid, but she didn’t write back.

The rejection stung, but he was still getting 20 messages a day. Dating with his computer-endowed profiles was a completely different game. He could ignore messages consisting of bad one-liners. He responded to the ones that showed a sense of humor or displayed something interesting in their bios. Back when he was the pursuer, he’d swapped three to five messages to get a single date. Now he’d send just one reply. “You seem really cool. Want to meet?”

By date 20, he noticed latent variables emerging. In the younger cluster, the women invariably had two or more tattoos and lived on the east side of Los Angeles. In the other, a disproportionate number owned midsize dogs that they adored.

His earliest dates were carefully planned. But as he worked feverishly through his queue, he resorted to casual afternoon meetups over lunch or coffee, often stacking two dates in a day. He developed a set of personal rules to get through his marathon love search. No more drinking, for one. End the date when it’s over, don’t let it trail off. And no concerts or movies. “Nothing where your attention is directed at a third object instead of each other,” he says. “It’s inefficient.”

LOVE IS A DATA FIELD

McKinlay’s code found that the women clustered into statistically identifiable groups who tended to answer their OkCupid survey questions in similar ways. One group, which he dubbed the Greens, were online dating newbies; another, the Samanthas, tended to be older and more adventuresome. Here’s how each cluster answered four of the most popular questions.

The Questions

(1) About how long do you want your next relationship to last?
One night / A few months to a year / Several years / The rest of my life

(2) Say you’ve started seeing someone you really like. As far as you’re concerned, how long will it take before you have sex?
1-2 dates / 3-5 dates / 6 or more dates / Only after the wedding

(3) Have you ever had a sexual encounter with someone of the same sex?
Yes, and I enjoyed myself / Yes, and I did not enjoy myself / No, and I would never / No, but I’d like to

(4) How important is religion/God in your life?
Extremely important / Somewhat important / Not very important / Not important at all

After a month of dating equally from both of his profiles, he decided he was spending too much time on the freeway reaching east-side women from the tattoo cluster. He deleted his A-group profile. His efficiency improved, but the results were the same. As summer drew to a close, he’d been on more than 55 dates, each one dutifully logged in a lab notebook. Only three had led to second dates; only one had led to a third.

Most unsuccessful daters confront self-esteem issues. For McKinlay it was worse. He had to question his calculations.

Then came the message from Christine Tien Wang, a 28-year-old artist and prison abolition activist. McKinlay had popped up in her search for 6-foot guys with blue eyes near UCLA, where she was pursuing her master’s in fine arts. They were a 91 percent match.

He met her at the sculpture garden on campus. From there they walked to a college sushi joint. He felt it immediately. They talked about books, art, music. When she confessed that she’d made some tweaks to her profile before messaging him, he responded by telling her all about his love hacking. The whole story.


“I thought it was dark and cynical,” she says. “I liked it.”

It was first date number 88. A second date followed, then a third. After two weeks they both suspended their OkCupid accounts.

“I think that what I did is just a slightly more algorithmic, large-scale, and machine-learning-based version of what everyone does on the site,” McKinlay says. Everyone tries to create an optimal profile—he just had the data to engineer one.

It’s one year after their first date, and McKinlay and Tien Wang have met me at the Westwood sushi bar where their relationship began. McKinlay has his PhD; he’s teaching math and is now working on a postgraduate degree in music. Tien Wang was accepted into a one-year art fellowship in Qatar. She’s in California to visit McKinlay. They’ve been staying connected on Skype, and she has returned for a couple of visits.

At my request, McKinlay has brought his lab notebook. Tien Wang hasn’t seen it before today. It’s page after page of formulas and equations in McKinlay’s tight handwriting, ending in a neatly ordered list of women and dates, a few terse notes about each. Tien Wang leafs through it, laughing at some of the highlights. On August 24, she notices, he took two women to the same beach on the same day. “That’s horrible,” she says.

To Tien Wang, McKinlay’s OkCupid hacking is a funny story to tell. But all the math and coding is merely prologue to their story together. The real hacking in a relationship comes after you meet. “People are much more complicated than their profiles,” she says. “So the way we met was kind of superficial, but everything that happened after is not superficial at all. It’s been cultivated through a lot of work.”

“It’s not like, we matched and therefore we have a great relationship,” McKinlay agrees. “It was just a mechanism to put us in the same room. I was able to use OkCupid to find someone.”

She bristles at that. “You didn’t find me. I found you,” she says, touching his elbow. McKinlay pauses to think, then admits she’s right.

A week later Tien Wang is back in Qatar, and the couple is on one of their daily Skype calls when McKinlay pulls out a diamond ring and holds it up to the webcam. She says yes.

They’re not entirely sure when they’ll get married. There’s research to be done to determine the optimal wedding day.