Open Data talk at the World Bank

Upload: anthony-goldbloom

Post on 02-Aug-2015

TRANSCRIPT

Page 1: Open Data talk at the World Bank

Anthony Goldbloom, Kaggle

Making data science a sport

Photo by mikebaird, www.flickr.com/photos/mikebaird

Page 2: Open Data talk at the World Bank
Page 3: Open Data talk at the World Bank
Page 4: Open Data talk at the World Bank

Competitions are judged on objective criteria

Competition Mechanics
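
A minimal sketch of what judging on objective criteria can look like: each submission is scored against a held-out answer key with a fixed metric, and teams are ranked by that score. The metric (root mean squared error), the function names and the toy data are assumptions for illustration; the slides do not describe Kaggle's actual scoring code.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("submission and answer key must have the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def leaderboard(submissions, answer_key):
    """Score every team with the same metric and rank lowest error first."""
    scores = {team: rmse(preds, answer_key) for team, preds in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical held-out answers and two hypothetical submissions.
answer_key = [3.0, 5.0, 2.0, 7.0]
submissions = {
    "team_a": [3.0, 5.0, 2.0, 7.0],
    "team_b": [2.0, 6.0, 3.0, 8.0],
}
ranking = leaderboard(submissions, answer_key)
```

Because every team is scored by the same function on the same hidden answers, the ranking depends only on prediction quality.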

Page 5: Open Data talk at the World Bank
Page 6: Open Data talk at the World Bank
Page 7: Open Data talk at the World Bank
Page 8: Open Data talk at the World Bank

“In less than a week, Martin O’Leary, a PhD student in glaciology, outperformed the state-of-the-art algorithms”

“The world’s brightest physicists have been working for decades on solving one of the great unifying problems of our universe”

Kaggle’s Dark Matter Competition on the White House blog

Page 9: Open Data talk at the World Bank
Page 10: Open Data talk at the World Bank

User base: 60,000 data scientists

Page 11: Open Data talk at the World Bank

Our User Base

Page 12: Open Data talk at the World Bank

• neural networks
• logistic regression
• support vector machine
• decision trees
• ensemble methods
• AdaBoost
• Bayesian networks
• genetic algorithms
• random forest
• Monte Carlo methods
• principal component analysis
• Kalman filter
• evolutionary fuzzy modeling

Users apply different techniques

Page 13: Open Data talk at the World Bank

EXAMPLE ESSAY QUESTION —

We all understand the benefits of laughter. For example, someone once said, “Laughter is the shortest distance between two people.”

Many other people believe that laughter is an important part of any relationship. Tell a true story in which laughter was one element or part.

Page 14: Open Data talk at the World Bank

“Have you ever experienced a time with your friends or family where you laughed so hard your stomach hurt, and your eyes were filled with tears? Laughing is something every person needs.

A great laugh can make a persons day and put a smile on their face. If no one laughed the world would be a terribly sad place. My friends and I are always laughing, to the point where were rolling on the ground, clutching our stomachs laughing.”

Automated results by the winning algorithm are as reliable as manual assessment by teachers.
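
One way to put a number on "as reliable as manual assessment" is an agreement statistic between machine scores and teacher scores. The sketch below computes quadratic weighted kappa in plain Python. The choice of metric, the score range and the toy scores are assumptions; the slide does not say how reliability was measured.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_score, max_score):
    """Agreement between two raters: 1.0 is perfect, 0.0 is chance level.

    Assumes integer scores in [min_score, max_score], at least two
    possible scores, and that the raters do not both give every essay
    the same single score (otherwise the denominator is zero).
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same essays")
    n = max_score - min_score + 1
    total = len(rater_a)

    # Observed counts: rows are rater A's score, columns are rater B's.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_score][b - min_score] += 1

    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(n):
            # Disagreements are penalised by squared distance.
            weight = (i - j) ** 2 / (n - 1) ** 2
            expected = hist_a[i] * hist_b[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected
    return 1.0 - numerator / denominator


# Hypothetical scores on a 1-4 scale for four essays.
teacher = [1, 2, 3, 4]
algorithm = [1, 2, 3, 3]
kappa = quadratic_weighted_kappa(teacher, algorithm, 1, 4)
```

Comparing this value with the kappa between two human graders on the same essays would show whether the algorithm agrees with teachers about as well as teachers agree with each other.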

Page 15: Open Data talk at the World Bank
Page 16: Open Data talk at the World Bank
Page 17: Open Data talk at the World Bank

Probability of going to hospital in the next six months (chart labels: Diabetes, & Obesity, & Hypertension, & High Cholesterol)
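
A minimal sketch of how a model can turn a patient's conditions into a probability of hospitalisation, using a logistic function over condition indicators. The intercept and weights are invented placeholders chosen only so the example runs, and the use of logistic regression is itself an assumption; none of it comes from the slide or from any real model.

```python
import math

# Hypothetical log-odds values, for illustration only.
INTERCEPT = -3.0
WEIGHTS = {
    "diabetes": 0.8,
    "obesity": 0.5,
    "hypertension": 0.6,
    "high_cholesterol": 0.4,
}


def hospital_probability(conditions):
    """Logistic model: probability given the conditions a patient has."""
    unknown = set(conditions) - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown conditions: {sorted(unknown)}")
    log_odds = INTERCEPT + sum(WEIGHTS[c] for c in conditions)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Conditions accumulate in the order of the chart labels.
profiles = [
    ["diabetes"],
    ["diabetes", "obesity"],
    ["diabetes", "obesity", "hypertension"],
    ["diabetes", "obesity", "hypertension", "high_cholesterol"],
]
probabilities = [hospital_probability(p) for p in profiles]
```

With positive weights, each added condition raises the predicted probability, which is the pattern the chart labels suggest.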

Page 18: Open Data talk at the World Bank

RTA Competition: Travel Time Prediction

Page 19: Open Data talk at the World Bank

Boehringer Ingelheim Competition: Data

Mutates Molecule +1700 fields

True Molecule2

False Molecule3

True Molecule4

True Molecule 5

True 0

Page 20: Open Data talk at the World Bank

Is it a lemon?

Page 21: Open Data talk at the World Bank

What could the world’s best analysts find in your data?

e-mail [email protected] +1 650 283 9781