Reproducible research: assessing spatial predictions of crime
Matthew Daws
Leeds
LIDA Seminar, November 2017
Matthew Daws (Leeds) Assessing predictions LIDA, Nov 2017 1 / 20
Project
UK Home Office Police Innovation Fund: "More with Less: Authentic
Implementation of Evidence-Based Predictive Patrol Plans". With
Andy Evans and Monsuru Adepeju here at Leeds.
My task:
Take crime prediction algorithms from the literature, and
implement in an open source way
(https://github.com/QuantCrimAtLeeds/PredictCode/)
Allow other researchers to see what benefit different crime
prediction algorithms are likely to give.
My background is in Mathematics and Software Development.
Runs until February 2018.
The (Near-)repeat hypothesis
"The tendency of victims of crime to, in the nearby future, be repeat
victims; and of near-by (say) buildings to also be future victims."
(Principally interested in Burglary.)
That is, a crime event at a spatial/temporal location tends to imply a
higher risk, localised in space and time, for nearby locations.
Classical prediction techniques tend to generate "hot spots"
around previous locations.
Part I: How do we do this? (Plea for reproducible research.)
Part II: And what do we mean by "prediction" anyway? What
makes a "good" prediction?
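The hypothesis above (one event raises risk nearby in space and time) can be made concrete with a small sketch. This is only an illustration: the Gaussian fall-off in space, the exponential fall-off in time, and the parameter names `bandwidth` and `decay_days` are assumptions chosen for the example, not a method taken from the talk or from the PredictCode repository.

```python
import math

def near_repeat_risk(point, time, events, bandwidth=100.0, decay_days=14.0):
    """Illustrative risk at (point, time) contributed by past events.

    events: iterable of (x, y, t), with x, y in metres and t in days.
    bandwidth and decay_days are hypothetical tuning parameters.
    """
    px, py = point
    risk = 0.0
    for x, y, t in events:
        age = time - t
        if age < 0:
            continue  # events after the prediction time contribute nothing
        dist_sq = (x - px) ** 2 + (y - py) ** 2
        space = math.exp(-dist_sq / (2.0 * bandwidth ** 2))  # Gaussian in space
        recency = math.exp(-age / decay_days)                # exponential in time
        risk += space * recency
    return risk

# Two hypothetical past events, ten days apart and 50 m apart.
events = [(0.0, 0.0, 0.0), (50.0, 0.0, 10.0)]
near = near_repeat_risk((10.0, 0.0), 12.0, events)    # close in space and time
far = near_repeat_risk((1000.0, 0.0), 12.0, events)   # far away in space
late = near_repeat_risk((10.0, 0.0), 120.0, events)   # long after the events
```

Evaluating such a function on a grid gives the kind of "hot spot" surface mentioned above: `near` is larger than both `far` and `late`.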
Publications
The algorithm
The code
Reproducible Research
"An article about computational science in a scientific publication is
not the scholarship itself, it is merely advertising of the scholarship.
The actual scholarship is the complete software development
environment and the complete set of instructions which generated the
figures." Buckheit, Donoho, "WaveLab and Reproducible Research", 1995.
"In my own experience, error is ubiquitous in scientific computing..."
Donoho, "An invitation to reproducible computational research", Biostatistics (2010).
Merton's norms: universalism, communalism, disinterestedness,
organized scepticism.
With thanks to Victoria Stodden.
Resources
http://reproducibleresearch.net/
https://rroxford.github.io/
http://www.bmj.com/content/344/bmj.e4383
But to continue
Wikipedia entry "Hobby horse"
"My Uncle Toby on his Hobby-horse", Wikipedia
What is crime prediction?
"Precrime: It Works!"
Wikipedia entry "The Minority Report"
From IMDB
What is crime prediction actually?
"Although much news coverage promotes the meme that predictive
policing is a crystal ball, these algorithms predict the risk of future
events, not the events themselves." Perry, McInnis, Price, Smith, Hollywood,
"Predictive Policing", RAND report.
"Prior to each shift, Santa Cruz police officers receive information
identifying 15 such squares with the highest probability of crime, and
are encouraged, though not required, to provide greater attention
to these areas." Joh, "Policing by numbers: Big data and the fourth amendment".
"Despite the increased emphasis on proactive policing, the core of
police work remains that of responding to calls for service..." Groff, La
Vigne, "Forecasting the future of predictive crime mapping".
Analogy with weather forecasting
I have found analogies with probabilistic forecasting within
Meteorology to be very profitable.
"There is a 20% chance of rain in Leeds tomorrow."
What does this mean?
If we make this prediction many times, then on about 1 in 5 of
those occasions it should rain the next day: "reliability".
But maybe it rains 20% of the time in Leeds anyway (over a year,
say)? Then the forecast adds nothing beyond the base rate.
Telling days apart is "resolution" (which is hard to actually define).
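One formalisation used in meteorology, offered here as an assumption rather than as the definition intended in the talk, is Murphy's decomposition of the Brier score into reliability minus resolution plus uncertainty. The sketch below groups forecasts by their stated probability. The reliability term is a miscalibration penalty, so smaller is better, while a larger resolution term is better. The function name and the toy data are hypothetical.

```python
from collections import defaultdict

def brier_decomposition(forecasts, outcomes):
    """Return (reliability, resolution, uncertainty) for binary outcomes.

    forecasts: stated probabilities (e.g. 0.2); outcomes: 1 if it rained, else 0.
    Forecasts are grouped by their exact stated value.
    """
    n = len(forecasts)
    groups = defaultdict(list)
    for f, o in zip(forecasts, outcomes):
        groups[f].append(o)
    base_rate = sum(outcomes) / n
    reliability = 0.0
    resolution = 0.0
    for f, obs in groups.items():
        freq = sum(obs) / len(obs)  # observed frequency when f was forecast
        reliability += len(obs) * (f - freq) ** 2
        resolution += len(obs) * (freq - base_rate) ** 2
    return reliability / n, resolution / n, base_rate * (1 - base_rate)

# Always forecasting the base rate: well calibrated, but no resolution.
climatology = brier_decomposition([0.2] * 10, [1, 1, 0, 0, 0, 0, 0, 0, 0, 0])
# Forecasts that separate wet days from dry days: calibrated, with resolution.
sharp = brier_decomposition([0.0] * 8 + [1.0] * 2, [0] * 8 + [1] * 2)
```

Both toy forecasters have a reliability term of zero, yet only the second has non-zero resolution, which is the distinction drawn above.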
Lack of analogy
[Figure: two rows of three map panels each, titled "Naive prediction", "KDE prediction" and "Actual events"; axes are map coordinates.]
Northside of Chicago, predictions and reality for 5th Nov 2016, and
23rd October 2016.
The probabilities involved are tiny.
Hit rate
The de facto standard.
Pick a "coverage level", say 10% of the
area, which might be chosen given
Policing resources.
Pick that % of grid cells, by picking
those with the highest risk first.
Then calculate the fraction of actual
events which fall in the selected grid
cells.
[Figure: two map panels, "Top 10% of 'naive' prediction" and "Top 10% of 'kde' prediction"; axes are map coordinates.]
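The hit rate recipe on this slide can be sketched in a few lines. This is a minimal illustration, not the PredictCode implementation: the dictionary-based grid, the function name `hit_rate`, rounding the number of selected cells down, and breaking ties by input order are all assumptions made for the example.

```python
def hit_rate(risk, event_cells, coverage):
    """Fraction of events falling in the top `coverage` fraction of grid cells.

    risk: dict mapping a grid cell (any hashable id) to its predicted risk.
    event_cells: list with the cell of each actual event (repeats allowed).
    coverage: fraction of cells to select, e.g. 0.1 for 10%.
    """
    if not event_cells:
        raise ValueError("hit rate is undefined when there are no events")
    n_selected = int(len(risk) * coverage)  # rounds down; a choice made here
    ranked = sorted(risk, key=risk.get, reverse=True)  # highest risk first
    selected = set(ranked[:n_selected])
    hits = sum(1 for cell in event_cells if cell in selected)
    return hits / len(event_cells)

# Hypothetical 1 x 10 grid: risk falls off from cell 0 to cell 9.
risk = {cell: 10 - cell for cell in range(10)}
events = [0, 0, 1, 5]
rate_10 = hit_rate(risk, events, 0.1)  # selects cell 0 only
rate_20 = hit_rate(risk, events, 0.2)  # selects cells 0 and 1
```

Calling the function over a range of coverage levels gives the hit rate against coverage curve, instead of a single number that depends on one chosen level.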
The good, the bad, the ugly
Easy to understand, tied to usage of the
prediction;
But seems to me to confuse prediction
with hot-spot / patrol plan creation.
Notice the huge quantitative difference
in the two examples.
How do you deal with the selection of a
coverage level?
[Figure: two map panels, "Top 10% of 'naive' prediction" and "Top 10% of 'kde' prediction"; axes are map coordinates.]
Interpret the results
Usual to plot mean hitrate against
coverage. Then use some statistical test.
But what's the model?
Let's suppose that each trial is an
independent draw from a binomial with
unknown p.
Use a at prior. Compute the predictive
posterior, plot the median and
inter-quartile range.
Gives much the same result (the number
of events per day doesn't vary that much).
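A rough sketch of the binomial model with a flat prior, under stated assumptions. With a flat Beta(1, 1) prior and binomial trials, the posterior for p is Beta(1 + hits, 1 + misses). The sketch below summarises that posterior by its median and inter-quartile range; whether the talk's plots show this distribution or the predictive distribution for a new day is not stated, so treat the choice as an assumption. The daily counts are invented, and `beta_posterior_quantiles` is a hypothetical helper.

```python
import math


def beta_posterior_quantiles(hits, misses, probs=(0.25, 0.5, 0.75), grid=20000):
    """Quantiles of p under a flat Beta(1, 1) prior and binomial trials.

    The posterior is Beta(1 + hits, 1 + misses). Quantiles are found by
    summing the (unnormalised) density over a regular grid on (0, 1).
    `probs` must be sorted in increasing order.
    """
    a, b = 1 + hits, 1 + misses
    # Midpoints of `grid` equal-width bins on (0, 1).
    xs = [(i + 0.5) / grid for i in range(grid)]
    log_dens = [(a - 1) * math.log(x) + (b - 1) * math.log(1 - x) for x in xs]
    peak = max(log_dens)  # subtract the peak to avoid underflow
    weights = [math.exp(v - peak) for v in log_dens]
    total = sum(weights)

    out, running, targets = [], 0.0, list(probs)
    for x, w in zip(xs, weights):
        running += w / total
        while targets and running >= targets[0]:
            out.append(x)
            targets.pop(0)
    # Guard against rounding leaving a target just short of being reached.
    out.extend(1.0 for _ in targets)
    return out


# Made-up data: (events captured, total events) for each day at one coverage level.
days = [(2, 7), (1, 5), (3, 8)]
hits = sum(k for k, n in days)
misses = sum(n - k for k, n in days)
q25, q50, q75 = beta_posterior_quantiles(hits, misses)
print("median", q50, "IQR", (q25, q75))
```

Repeating this at each coverage level gives a median curve with an inter-quartile band, in place of the mean hit rate curve.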
[Figure: three panels comparing the 'naive' and 'kde' predictions. (1) "Mean hit rate": hit rate (%) against coverage (%), 0 to 100. (2) Successful capture probability against coverage (%), 0 to 100. (3) Successful capture probability against coverage (%), zoomed to 0 to 14.]
Matthew Daws (Leeds) Assessing predictions LIDA, Nov 2017 16 / 20
Brier scores
\[
\mathrm{BS} = \frac{1}{N} \sum_{t=1}^{N} (f_t - o_t)^2
\]
\[
F = \frac{1}{K} \sum_{k=1}^{K} \left( p_k - \frac{n_k}{N} \right)^2
\]
Return to Meteorology and probabilistic forecasting.
Binary events: either happens (1) or not (0).
For $t = 1, \dots, N$ make a prediction $f_t \in [0, 1]$.
Have actual events $(o_t)$.
We follow a variant from Roberts, "Assessing the spatial and temporal variation in the skill of precipitation forecasts from an NWP model":
  - $K$ grid cells
  - predicted probability $p_k$
  - $n_k$ actual events, so $n_k/N$ is the fraction.
"Fractional Brier Score"
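The two scores on this slide can be sketched directly from their formulas. This assumes that in the fractional score N is the total number of events, so that the fractions n_k/N sum to 1; the function names and the toy inputs are hypothetical.

```python
def brier_score(forecasts, outcomes):
    """BS = (1/N) * sum_t (f_t - o_t)^2, with binary outcomes o_t in {0, 1}."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equally many forecasts and outcomes")
    n = len(forecasts)
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / n


def fractional_brier(p, counts):
    """F = (1/K) * sum_k (p_k - n_k/N)^2 over K grid cells.

    p:      predicted probability for each cell
    counts: number of actual events n_k in each cell
    N is taken to be the total number of events (an assumption).
    """
    if len(p) != len(counts) or not p:
        raise ValueError("need one count per grid cell")
    total = sum(counts)
    if total == 0:
        raise ValueError("need at least one event")
    k = len(p)
    return sum((pk - nk / total) ** 2 for pk, nk in zip(p, counts)) / k


print(brier_score([0.9, 0.2, 0.6], [1, 0, 0]))
print(fractional_brier([0.5, 0.3, 0.2], [4, 1, 0]))
```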
Matthew Daws (Leeds) Assessing predictions LIDA, Nov 2017 17 / 20
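Skill score; results
\[
F_{\mathrm{worst}} = \frac{1}{K} \sum_{k=1}^{K} \left( p_k^2 + \left( \frac{n_k}{N} \right)^2 \right),
\qquad
FS = 1 - \frac{F}{F_{\mathrm{worst}}}
\]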
What are units of F?
FS is the "skill"; closer to 1 is better.
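A short sketch of the skill score, again assuming N is the total number of events. Expanding the square in F shows that F differs from F_worst only by the cross term, which is non-negative, so FS lies between 0 and 1 and does not depend on the scale of F. `fractional_skill` is a hypothetical name and the inputs are invented.

```python
def fractional_skill(p, counts):
    """FS = 1 - F / F_worst for predicted probabilities p and event counts.

    F       = (1/K) * sum_k (p_k - n_k/N)^2
    F_worst = (1/K) * sum_k (p_k^2 + (n_k/N)^2)
    N is taken to be the total number of events (an assumption).
    """
    if len(p) != len(counts) or not p:
        raise ValueError("need one count per grid cell")
    total = sum(counts)
    if total == 0:
        raise ValueError("need at least one event")
    k = len(p)
    fracs = [nk / total for nk in counts]
    f = sum((pk - fk) ** 2 for pk, fk in zip(p, fracs)) / k
    f_worst = sum(pk ** 2 + fk ** 2 for pk, fk in zip(p, fracs)) / k
    return 1.0 - f / f_worst


# Prediction matches the observed fractions exactly: skill 1.
print(fractional_skill([0.5, 0.5], [1, 1]))
# Prediction puts all mass where no events occurred: skill 0.
print(fractional_skill([1.0, 0.0], [0, 2]))
```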
[Figure: four panels. "Brier score": KDE - Naive against the naive prediction. "Brier score; CDF of difference". "Brier skill": KDE predictor against the naive prediction. "Brier skill; CDF of difference".]
Matthew Daws (Leeds) Assessing predictions LIDA, Nov 2017 18 / 20
Bayesian information gain
Want to capture the feeling that if we see more events on a given
day, we should learn more about the quality of the prediction.
My idea is to use the prediction to form a prior, then update this
given the data to form a posterior, and then compare these with
the Kullback-Leibler divergence.
Measures the information gain from prior to posterior: a good
prediction should mean less gained on learning the result.
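One way this idea could be sketched, under assumptions the slide does not spell out: the prediction p over grid cells is turned into a Dirichlet prior with parameters c * p_k, where the concentration c is a hypothetical tuning constant and every p_k must be positive. The observed counts n_k are added to get the posterior, and the gain is KL(posterior || prior). The digamma function is not in the Python standard library, so a small approximation is included.

```python
import math


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    if x <= 0:
        raise ValueError("x must be positive")
    result = 0.0
    while x < 10.0:  # shift upwards using psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result


def dirichlet_kl(a, b):
    """KL( Dir(a) || Dir(b) ) for positive parameter lists of equal length."""
    a0, b0 = sum(a), sum(b)
    kl = math.lgamma(a0) - math.lgamma(b0)
    psi_a0 = digamma(a0)
    for ak, bk in zip(a, b):
        kl += math.lgamma(bk) - math.lgamma(ak)
        kl += (ak - bk) * (digamma(ak) - psi_a0)
    return kl


def information_gain(p, counts, concentration):
    """Gain from prior Dir(c * p) to posterior Dir(c * p + counts)."""
    prior = [concentration * pk for pk in p]  # assumed form of the prior
    posterior = [alpha + nk for alpha, nk in zip(prior, counts)]
    return dirichlet_kl(posterior, prior)


# Made-up example: events concentrate in the first of three cells.
counts = [7, 2, 1]
p_good = [0.7, 0.2, 0.1]
p_bad = [0.1, 0.2, 0.7]
gain_good = information_gain(p_good, counts, concentration=10.0)
gain_bad = information_gain(p_bad, counts, concentration=10.0)
print(gain_good, gain_bad)
```

The prediction that matches the data gives the smaller gain, as the slide suggests it should, and days with more events move the posterior further, so they contribute more.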
[Figure: four panels. "Dirichlet distribution": kde against naive. "CDF of differences". "Predictive distribution": kde against naive. "CDF of differences".]
Matthew Daws (Leeds) Assessing predictions LIDA, Nov 2017 19 / 20
Conclusions?
Seems a little inconclusive.
Hit rate, Brier scores, (other ideas we develop) show roughly a tie.
The information gain idea is more of a clear win for the KDE
method.
Original aim was to get beyond the "hit rate" as being the only game
in town.
Bit of a work in progress: any ideas much appreciated!
https://github.com/QuantCrimAtLeeds/PredictCode/
Matthew Daws (Leeds) Assessing predictions LIDA, Nov 2017 20 / 20