Making Sense of ‘Making a Murderer’

Lucas Mentch SAMSI UG Workshop February 22, 2016


Overview

“Making a Murderer” Netflix Documentary Series (MaM)

Follows case of Steven Avery from defense perspective

A number of issues suggested in relation to evidence used to convict

Trailer

https://www.youtube.com/watch?v=qxgbdYaR_KQ

Summary

1985: Avery convicted of sexual assault

2003: After 18 years in prison, DNA evidence overturns conviction

Files lawsuit against county

2005: Arrested for murder

2007: Convicted and sentenced to life in prison

Documentary series suggests that much of the evidence used in the murder trial was suspect/unreliable

Various forms of bias

Conditional and Random match probabilities

Threshold-based testing procedures

How common are these issues and why are they important?

Bias

Several sources, but we’ll focus on two:

Contextual Bias: can occur when examiners are exposed to “additional” case details

e.g. A fingerprint examiner is asked whether fingerprints match, but is aware that a DNA match has already been confirmed

Confirmation Bias: can occur when patterns seen in exemplar are used to “find” match in latent evidence

Confirmation Bias

Quality of latent evidence can vary greatly, but can assume that quality is bounded above by exemplar quality

https://en.wikipedia.org/wiki/William_Heirens

http://bwhiteforensics.blogspot.com/2011_12_01_archive.html

http://stoneyforensic.com/latent-print-evaluation.htm

Contextual Bias

How much are the results/conclusions of forensic examiners influenced by other details of the case?

Key Question: How much information should forensic examiners have access to?

Knowing the details surrounding the forensic evidence could be helpful in making decisions, but too much information could allow for bias

See more background and references in Bill Thompson’s slides from the SAMSI opening workshop

Implications

Nature of court proceedings implies that conclusions were arrived at independently:

$$P[E_1, \ldots, E_n] = \prod_{i=1}^{n} P[E_i]$$

e.g. Even if each test is only correct 60% of the time, probability that all 3 tests are wrong is just 6.4%

Onus on opposing counsel to challenge results and explain implications to jury
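The 6.4% figure follows directly from the product rule. A minimal Python sketch, assuming every test errs independently and with the same probability; the function name `prob_all_wrong` is illustrative and not from the slides:

```python
def prob_all_wrong(accuracy: float, n_tests: int) -> float:
    """Probability that n independent tests are all wrong.

    Assumes every test has the same accuracy and that errors are
    independent, so the joint probability is a simple product.
    """
    error_rate = 1.0 - accuracy
    return error_rate ** n_tests


# Three tests, each correct 60% of the time (the slide's example).
p = prob_all_wrong(accuracy=0.6, n_tests=3)
print(f"P[all 3 wrong] = {p:.3f}")  # evaluates 0.4 ** 3
```

The product form is only valid under independence. If examiners share case information, their errors may be correlated and the joint probability can be much larger than this calculation suggests.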

Relationship to MaM …

DNA Examiner testifies that in conversations with detectives, it was said that the lab would be receiving items of interest that they would like to use to “try to put her [the victim] in his [Avery’s] house or garage.” *

Defense: So you’re being told before you do any of these tests that [the detective] wants you to come up with results that put [the victim] in Mr. Avery’s house or garage, isn’t that right? *

Examiner: I had that information but that had no bearing on my analysis at all. *

Defense: Of course not. *

* Making a Murderer. Dir. Laura Ricciardi and Moira Demos. Netflix, 2015. Documentary Film.

Roadblocks & Paths Forward

Can design tests to measure the effects of various types of bias, but requires cooperation of labs

No upside for examiners

Labs already woefully behind

“Expert-only” tests

Known testing can induce its own bias

Some labs are interested in establishing internal quality control

OJ Simpson Paradox

1994-1995: OJ Simpson arrested on suspicion of double murder; found not guilty after 8-month trial

Prosecutors detailed a history of spousal abuse but defense argued that

P[husband murdered wife | husband abused wife] = 1/2500 (0.0004)

But …

P[husband murdered wife | husband abused wife & wife is dead] = 8/9 (0.889) *

* Gigerenzer, G., Reckoning with Risk: Learning to Live with Uncertainty, Penguin, (2003)
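The gap between the two probabilities is easiest to see with counts. The Python sketch below uses a hypothetical population of 100,000 abused wives (an assumption; any size gives the same ratios) and the count of deaths not caused by the husband that is implied by the 8/9 figure above. That second count is inferred from the slide's numbers, not stated in it.

```python
from fractions import Fraction

# Illustrative population size (an assumption; only the ratios matter).
abused_wives = 100_000

# Defense figure from the slide: 1 in 2,500 abused wives is murdered
# by her husband.
killed_by_husband = abused_wives * Fraction(1, 2500)  # 40 women

# Count implied by the slide's 8/9 figure (inferred, not stated):
# abused wives who are dead but were not killed by the husband.
killed_by_someone_else = 5

p_defense = killed_by_husband / abused_wives
p_given_dead = killed_by_husband / (killed_by_husband + killed_by_someone_else)

print(f"P[husband murdered wife | abuse]        = {p_defense} ({float(p_defense):.4f})")
print(f"P[husband murdered wife | abuse & dead] = {p_given_dead} ({float(p_given_dead):.3f})")
```

The defense's number conditions only on abuse, so it is dominated by the large majority of abused wives who are still alive. Once the wife is known to be dead, the relevant comparison is among the small group of deaths alone.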

Relationship to MaM …

Examiner testifies that bullet found in Avery’s garage contains DNA matching the victim

Why do DNA testing in the first place?

(*) P[Positive DNA test | Not a match] = very small

But in this case, the control sample was contaminated

(*) Not meaningful

What is meaningful?

P[Positive DNA test | Not a match & control contaminated] = …?

SOP was to throw out these results

Likely implies that this probability is either meaninglessly large, untested, or not generalizable

Random Match Probabilities

Gets at larger issue: given evidence E and suspect S, what is

P[S Matches E | S Not the source of E]

Depends on E

DNA, fingerprints, shoe prints, tire tracks, bite marks …

Shirt color? Eyewitness ID?

Outside of DNA, these are largely unknown and difficult to calculate

Back to Bias

If any sources of bias are present, then real question becomes (for 3 pieces of evidence)

P[S Matches E1 | S Not the source of E1]

x P[S Matches E2 | S Not the source of E2 and S matches E1]

x P[S Matches E3 | S Not the source of E3 and S matches E1 and S matches E2]
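To see why the conditioning matters, here is a small Python sketch comparing the product of unconditional match probabilities with the chained conditional version. Every number in it is hypothetical and chosen only for illustration; outside of DNA, real values are largely unknown.

```python
from math import prod

# Hypothetical probabilities that a suspect who is not the source
# "matches" each of three pieces of evidence when every examination
# is carried out without knowledge of the others.
unconditional = [0.01, 0.01, 0.01]

# Hypothetical chained values: each later probability is conditioned
# on the earlier reported matches, which a biased examiner may know.
chained = [0.01, 0.10, 0.30]

p_unconditional = prod(unconditional)
p_chained = prod(chained)

print(f"Assuming independence: {p_unconditional:.2e}")
print(f"Allowing for bias:     {p_chained:.2e}")
print(f"Ratio:                 {p_chained / p_unconditional:.0f}x")
```

Under these made-up values, treating the three matches as independent understates the chance of a coincidental triple match by a large factor. The size of that factor depends entirely on how strongly earlier results influence later examinations.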

EDTA Blood Tests

One key piece of evidence against Avery is blood (matching Avery) found in victim’s car

Defense uncovers that a vial of Avery’s blood, taken during the previous assault arrest, has been tampered with

These vials would have been stored with the preservative EDTA

If the samples collected from the victim’s car originated from the vial, they should contain traces of EDTA

FBI Testing

FBI developed test to determine presence/absence of EDTA in dried blood samples

Total of 6 samples taken from the car

Only 3 samples were tested; none found to contain EDTA

Testing Set-up

Let’s frame this as a hypothesis test. For each sample

H0: EDTA = 0

H1: EDTA > 0

Defense witness testifies that, based on the available information, it is reasonably clear that if the test detects EDTA, then EDTA is likely present. However, when EDTA is not detected, it is not clear whether it is absent or simply went undetected

What statistical issues is this getting at?

Testing Set-up

Alpha-level: probability of incorrectly rejecting H0

Power: probability of correctly rejecting H0

In this case, the concern is that the tests may have low alpha level (false positive rate), but may also have low power

Samples that do not test positive for EDTA may actually contain it
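The consequence of low power can be made concrete. If a test detects EDTA in a sample that contains it with probability equal to its power, then the chance that all three tested samples come back negative even though EDTA is present is (1 - power)^3, assuming the samples are tested independently. The power values in this Python sketch are hypothetical; the slides do not report the actual power of the FBI test.

```python
def prob_all_negative(power: float, n_samples: int = 3) -> float:
    """P[no sample tests positive | EDTA present in every sample].

    Assumes independent tests that all share the same power.
    """
    return (1.0 - power) ** n_samples


# Hypothetical power values, for illustration only.
for power in (0.2, 0.5, 0.8, 0.95):
    p = prob_all_negative(power)
    print(f"power = {power:.2f} -> P[all 3 negative | EDTA present] = {p:.4f}")
```

With low power, three negative results are close to what one would expect whether or not EDTA is present, so they carry little evidential weight in either direction.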

Designing a Test

How could you design such a test?

Define a sequence of control samples C1, …, Cn

Chemically analyze each sample to determine levels of EDTA present L1, …, Ln

Define threshold T at appropriate false positive level

[Figure: axis of measured levels running from 0 to threshold T]

The Issue:

What if when you included positive samples, the distribution of levels looked like this:

[Figure: distributions labeled Control Samples and Positive Samples on the same axis from 0 to T]

Now positive samples would routinely be marked as not containing EDTA
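The calibration steps and the resulting problem can be sketched with a short simulation. This is a minimal Python illustration under made-up assumptions: the noise model, the units, and the "true" EDTA level of the positive samples are all hypothetical and are not taken from the FBI procedure.

```python
import random

random.seed(0)  # reproducible illustration

# Hypothetical measured EDTA levels (arbitrary units). Control samples
# contain no EDTA, so their readings reflect measurement noise only.
controls = [abs(random.gauss(0.0, 1.0)) for _ in range(10_000)]

# Hypothetical positive samples whose true level is small relative to
# the noise, so their readings overlap heavily with the controls.
positives = [abs(random.gauss(1.0, 1.0)) for _ in range(10_000)]

# Pick threshold T so that about 1% of control readings exceed it,
# i.e. a target false positive rate (alpha) of 0.01.
alpha = 0.01
n_allowed = round(alpha * len(controls))
threshold = sorted(controls)[-(n_allowed + 1)]

false_positive_rate = sum(x > threshold for x in controls) / len(controls)
power = sum(x > threshold for x in positives) / len(positives)

print(f"threshold T         = {threshold:.2f}")
print(f"false positive rate = {false_positive_rate:.3f}")
print(f"power               = {power:.3f}")
```

Calibrating T on control samples alone fixes the false positive rate but says nothing about power. In this setup most positive samples fall below T and would be reported as containing no EDTA.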

Bottom Line

Calibrating threshold-based testing procedures is an intensive process; need to account for

Measurement error

Chemical concentration amounts

Reasonable concentrations seen in practice

Concern is that these FBI testing procedures were developed very quickly

Takeaway

Regardless of guilt or innocence, ‘Making a Murderer’ shed light on some potentially suspicious aspects of the Steven Avery murder case

While this might represent an extreme example, the concern is how easy and relatively common these missteps are in general

In my own opinion, I don’t expect to see any vast improvements until a more automated approach is embraced