using data for evil (2)
@fhr @duncan3ross @datakinduk
Before we start
We aren’t actually evil*
We don’t want you to do these things**
But, if you catch yourself thinking that “this is a little bit like what we do” or “if we aren’t careful that could be us…” then maybe you ought to think carefully***
* mileage may vary
** other than for comedy purposes
*** and NOT do them. No, seriously
USING DATA FOR EVIL
Fran Bennett,
CEO Mastodon C
@fhr
Duncan Ross,
Data and Analytics Director, Times Higher Education
@duncan3ross
The Categorical Imperative
“Act only according to that maxim whereby you can, at the same time, will that it should become a universal law”
Kant said that humans are ends-in-themselves, not means to an end
TOO PIOUS?
DAMN RIGHT!
Your quest for Global Data Dominance begins here
Doing evil doesn’t need to be about the big things
- We all have the ability to make the world a little worse
- Where would Khan have got to without his minions?
So, in 2014, how did people manage to do so much evil with data?
Our hypothesis
For a data scientist
- Doing good deliberately is hard
- Doing evil deliberately is hard
- Doing evil accidentally is easy
Hypothesis:
- Every data scientist has the capability to do good by thinking about what they do
Null hypothesis:
- Every data scientist has the capability to do good by not thinking about what they do
Source: xkcd
The problem
Too many people have been thinking
Things have got worse (good) not better (bad)
Quiz: which of these is most exciting?
① Analysing call ‘metadata’ to find out when people are entering (and leaving) a relationship
② Finding out how different people react to different drugs using medical data
③ Predicting and changing behaviours through examining purchasing behaviour
Don’t care about impact!
Analysis is cool, and powerful
Your analysis can have a really great effect on people
Effects aren’t always as predicted
- Never measure the results
- Don’t worry about what might happen
The (possibly ex) PM on Facebook
Facebook is EVIL
Terrorists are EVIL
Surely we can do something with this?
What could possibly go wrong?
Let’s not confuse politicians with statistics
- Yes, the Royal Statistical Society is trying…
Gloss over issues like Bayes and Type I and Type II errors
Type II error: Don’t intercept terrorist
Type I error: Send police after ‘innocent’ victims
Facebook: the maths
P(bad guy | +) = ??
P(+ | bad guy) = 0.999
P(bad guy) = 100/60,000,000
P(+ | good guy) = 0.001
Then:
P(bad guy | +) = 99.9 / (99.9 + 59,999.9) ≈ 99.9/60,000 ≈ 0.17%
https://duncan3ross.wordpress.com/2014/11/26/why-david-cameron-is-wrong-the-maths/
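The arithmetic on this slide can be checked in a few lines of Python. This is a minimal sketch, not the authors' code: the function name `posterior` and the variable names are illustrative assumptions, and the figures (100 bad guys in a population of 60,000,000, a 99.9% hit rate, a 0.1% false positive rate) are the ones quoted on the slide.

```python
def posterior(prior, p_pos_given_bad, p_pos_given_good):
    """Bayes' rule: P(bad guy | +) from the prior and the two test rates."""
    true_pos = p_pos_given_bad * prior
    false_pos = p_pos_given_good * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


# Figures taken from the slide.
population = 60_000_000
bad_guys = 100
p_pos_given_bad = 0.999
p_pos_given_good = 0.001

prior = bad_guys / population
p = posterior(prior, p_pos_given_bad, p_pos_given_good)

# Expected counts if the whole population is screened once.
expected_true_pos = p_pos_given_bad * bad_guys                    # bad guys flagged
expected_false_pos = p_pos_given_good * (population - bad_guys)   # Type I errors
expected_missed = (1.0 - p_pos_given_bad) * bad_guys              # Type II errors

print(f"P(bad guy | +) = {p:.2%}")
print(f"bad guys flagged: {expected_true_pos:.1f}")
print(f"'innocent' people flagged (Type I): {expected_false_pos:.1f}")
print(f"bad guys missed (Type II): {expected_missed:.1f}")
```

With these inputs the posterior comes out at roughly 0.17%: about 60,000 ‘innocent’ people are flagged for every 100 or so bad guys.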
Take advantage when intuition is systematically wrong
http://duckofminerva.com/2013/07/bayes-stereotypin-and-rare-events.html
But wait a minute!
If you have independent priors (I think Fran is a terrorist…) then won’t this make Facebook’s predictions much, much better?
Yes!
So we should
- Give Facebook all of MI5’s data
- Just have mass surveillance of Facebook by MI5…
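How much an independent prior helps is easiest to see with Bayes' rule in odds form: posterior odds = prior odds × likelihood ratio. The sketch below assumes the test rates from the maths slide (0.999 and 0.001). The function name and every example prior other than 100/60,000,000 are hypothetical, chosen only for illustration.

```python
def posterior_from_prior(prior, p_pos_given_bad=0.999, p_pos_given_good=0.001):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    likelihood_ratio = p_pos_given_bad / p_pos_given_good  # about 999 here
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# First prior is the slide's base rate; the others are hypothetical
# stand-ins for "someone already suspects this person".
priors = [100 / 60_000_000, 1 / 10_000, 1 / 100, 1 / 2]
results = [posterior_from_prior(pr) for pr in priors]

for pr, post in zip(priors, results):
    print(f"prior {pr:.7f} -> posterior {post:.4f}")
```

At the base rate the posterior stays near 0.17%, while a hypothetical prior of 1 in 100 lifts it to about 91%. The test itself has not changed; only the prior has.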
3 Evil Personal Data Options
1. Know your enemy (by which we mean everyone)
2. Change their behaviour (make more money!)
3. If you can’t do 1 or 2, sell the data
But doesn’t anonymisation protect personal data?
There is nothing better than revealing people’s dirty secrets
Taxi & Limo Commission released every 2013 NYC taxi ride (detailed locations, timings, fares)
What could possibly go wrong?
http://research.neustar.biz/2014/09/15/riding-with-the-stars-passenger-privacy-in-the-nyc-taxicab-dataset/
Celebrity stalking!
Good news! It’s really hard to guarantee anonymity.
The more data you link together, the easier it is to work backwards
Home addresses of strip club visitors!
Find the residential addresses where cabs get to after leaving dodgy locations between midnight and 6am
Filter down to specific individuals using property records
Find where else those people go to in cabs (maybe to where they work?)
Change behaviour
Casinos:
Great place for plotting, or full of secret agents? We ask the important questions…
Data is everywhere: use it
Image: Casino Enterprise Management - Where’s the Money? Part 5: Gaming Density and Yielding the Floor
From a reputable source
Hell Cycle
Figure labels along the curve: Kickstarter, Peak of Hysteria, Pit of Despair, Appears on Dave, Slope of Acceptance, M25
Items plotted: IoT, Mass surveillance, Spark, Bad Visualisation, Big Table, IBM Watson, Pre-Crime Profiling, Apple Watch, Deanonymisation, care.data, Google. Just Google
Where can you do most evil?
In an organisation committed to evil?
In an organisation committed to good?
In an organisation committed to shareholder value?
In an organisation that isn’t sure?
Image: mattbuck
Necromantic Quadrant™ 2013
Axes: EVIL to GOOD, EFFECTIVENESS
Organisations plotted: SMERSH, NSA, E.ON, RSPCA, OWCA, UNIT, SPECTRE, COTT
Necromantic Quadrant™ 2015
Axes: EVIL to GOOD, EFFECTIVENESS
Quadrants: Devils, Angels, Self-regulators, Challengers
Organisations plotted: SMERSH, NSA, RSPCA, SPECTRE, Royal Mail, CAB, Wonga, Uber
@duncan3ross @DataKindUK
• DataKind UK is a charity that believes we can make the world better by using data
• We work by linking data volunteers (you) with charities
COME AND JOIN DATAKIND
WHO HAVE WE WORKED WITH?
Children
Education
Health
Young people
Advice and support
International and community