Post on 17-Dec-2015

Crime Forecasting Using Boosted Ensemble Classifiers Chung-Hsien Yu

Crime Forecasting Using Boosted Ensemble Classifiers

Department of Computer Science University of Massachusetts Boston

2012 GRADUATE STUDENTS SYMPOSIUM

Presented by: Chung-Hsien Yu

Advisor: Prof. Wei Ding

Abstract

• Retaining spatiotemporal knowledge by applying multi-clustering to monthly aggregated crime data.

• Training baseline learners on the resulting clusters.

• Adapting a greedy algorithm to find a rule-based ensemble classifier during each boosting round.

• Pruning the ensemble classifier to prevent it from overfitting.

• Constructing a strong hypothesis based on the ensemble classifiers obtained from each round.

Original Data

Residential Burglary

911 Calls

Arrest

Foreclosure

Street Robbery

Aggregated Data

[Figure: crime counts aggregated per grid cell]

Monthly Data

[Figure: monthly grids of per-cell crime counts]
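The aggregation behind the Aggregated Data and Monthly Data slides can be sketched as follows. This is a minimal illustration, assuming each incident is a hypothetical `(x, y, month)` record and that the study area is divided into square cells of side `cell_size`. The record layout, the cell size, and the sample incidents are made up for the example and do not come from the slides.

```python
from collections import Counter

def aggregate_monthly(incidents, cell_size):
    """Count incidents per (month, grid cell).

    incidents: iterable of (x, y, month) tuples (hypothetical layout).
    cell_size: side length of a square grid cell, in the units of x and y.
    """
    counts = Counter()
    for x, y, month in incidents:
        # Map the coordinates to integer grid indices.
        cell = (int(x // cell_size), int(y // cell_size))
        counts[(month, cell)] += 1
    return counts

# Made-up incidents, for illustration only.
incidents = [
    (10.0, 20.0, "2011-01"),
    (12.0, 25.0, "2011-01"),
    (110.0, 20.0, "2011-01"),
    (10.0, 20.0, "2011-02"),
]
monthly = aggregate_monthly(incidents, cell_size=100.0)
```

Each key of `monthly` is a `(month, cell)` pair, so one month's grid can be read off by filtering on the first element of the key.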

Monthly Clusters (k=3)

Monthly Clusters (k=4)

Flow Chart

Algorithm (Part I)

Algorithm (Part II)

Confidence Value

From AdaBoost (Schapire & Singer 1998), writing $w(i)$ for the weight of example $i$ and ignoring the boosting round $t$, we have

$$Z = \sum_i w(i) \exp(-C_R\, y_i)$$

$C_R$ is defined as the confidence value for the rule $R$, and $C_R = 0$ if $x_i \notin R$.

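Objective Function

Therefore,

$$Z = W_0 + W_+ \exp(-C_R) + W_- \exp(C_R)$$

where

$$W_0 = \sum_{\{i \mid x_i \notin R\}} w(i), \qquad W_+ = \sum_{\{i \mid x_i \in R \text{ and } y_i = 1\}} w(i), \qquad W_- = \sum_{\{i \mid x_i \in R \text{ and } y_i = -1\}} w(i)$$

and

$$W_0 + W_+ + W_- = 1$$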
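The objective $Z = W_0 + W_+ \exp(-C_R) + W_- \exp(C_R)$ can be checked numerically against the per-example sum $\sum_i w(i) \exp(-C_R\, y_i)$, in which uncovered examples contribute a confidence of 0. The sketch below is illustrative only: the function names, the `covered` flags, and the toy weights are assumptions, not part of the slides.

```python
import math

def weight_sums(weights, labels, covered):
    """Return (W0, W_plus, W_minus) for a rule R.

    weights: example weights w(i), assumed to sum to 1.
    labels:  class labels y_i, each +1 or -1.
    covered: booleans, True when x_i is covered by R.
    """
    w0 = sum(w for w, c in zip(weights, covered) if not c)
    w_plus = sum(w for w, y, c in zip(weights, labels, covered) if c and y == 1)
    w_minus = sum(w for w, y, c in zip(weights, labels, covered) if c and y == -1)
    return w0, w_plus, w_minus

def z_value(w0, w_plus, w_minus, c_r):
    """Z = W0 + W+ exp(-C_R) + W- exp(C_R)."""
    return w0 + w_plus * math.exp(-c_r) + w_minus * math.exp(c_r)

# Toy data, for illustration only.
weights = [0.25, 0.25, 0.25, 0.25]
labels = [1, 1, -1, -1]
covered = [True, True, True, False]
c_r = 0.3

w0, w_plus, w_minus = weight_sums(weights, labels, covered)
grouped = z_value(w0, w_plus, w_minus, c_r)

# Per-example form: the confidence is C_R inside R and 0 outside R.
direct = sum(
    w * math.exp(-(c_r if c else 0.0) * y)
    for w, y, c in zip(weights, labels, covered)
)
```

Both `grouped` and `direct` compute the same quantity, which is why the three weight sums are all that a rule needs to report.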

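Minimum Z Value

$$\frac{dZ}{dC_R} = -W_+ \exp(-C_R) + W_- \exp(C_R) = 0$$

$$\rightarrow W_- \exp(C_R) = W_+ \exp(-C_R)$$

$$\rightarrow \ln\left(W_- \exp(C_R)\right) = \ln\left(W_+ \exp(-C_R)\right)$$

$$\rightarrow \ln(W_-) + C_R = \ln(W_+) - C_R$$

$$\rightarrow 2C_R = \ln(W_+) - \ln(W_-)$$

$$\rightarrow C_R = \frac{1}{2} \ln\left(\frac{W_+}{W_-}\right)$$

$Z$ has the minimum value when $C_R = \frac{1}{2}\ln(W_+/W_-)$, because

$$\frac{d^2Z}{dC_R^2} = W_+ \exp(-C_R) + W_- \exp(C_R) > 0$$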
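The closed form $C_R = \frac{1}{2}\ln(W_+/W_-)$ can be sanity-checked numerically. The sketch below assumes both $W_+$ and $W_-$ are strictly positive (otherwise the logarithm is undefined), and the weight sums are toy values chosen for illustration.

```python
import math

def z_value(w0, w_plus, w_minus, c_r):
    """Z = W0 + W+ exp(-C_R) + W- exp(C_R)."""
    return w0 + w_plus * math.exp(-c_r) + w_minus * math.exp(c_r)

def optimal_confidence(w_plus, w_minus):
    """C_R that sets dZ/dC_R to zero; requires w_plus > 0 and w_minus > 0."""
    return 0.5 * math.log(w_plus / w_minus)

# Toy weight sums, for illustration only.
w0, w_plus, w_minus = 0.25, 0.5, 0.25

c_star = optimal_confidence(w_plus, w_minus)
z_star = z_value(w0, w_plus, w_minus, c_star)

# Z evaluated slightly to either side of the stationary point.
z_left = z_value(w0, w_plus, w_minus, c_star - 0.01)
z_right = z_value(w0, w_plus, w_minus, c_star + 0.01)
```

Because the second derivative is positive everywhere, `z_star` is no larger than `z_left` or `z_right`, and it equals $W_0 + 2\sqrt{W_+ W_-}$.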

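BuildChain Function

Substituting $C_R = \frac{1}{2}\ln(W_+/W_-)$ into $Z$ and using

$$W_0 + W_+ + W_- = 1$$

gives

$$Z = W_0 + 2\sqrt{W_+ W_-} = 1 - \left(\sqrt{W_+} - \sqrt{W_-}\right)^2$$

Repeatedly add a classifier to $R$ until $\left|\sqrt{W_+} - \sqrt{W_-}\right|$ is maximized. This minimizes $Z$ as well.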
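A greedy chain-building loop of this kind might look like the sketch below. It rests on assumptions that the slides do not state: a point is treated as covered by $R$ only when every classifier in the chain fires, the score being maximized is $|\sqrt{W_+} - \sqrt{W_-}|$, and growth stops as soon as no remaining candidate improves the score. The classifiers, features, and data are hypothetical.

```python
import math

def chain_score(weights, labels, covered):
    """|sqrt(W+) - sqrt(W-)| over the examples covered by the chain."""
    w_plus = sum(w for w, y, c in zip(weights, labels, covered) if c and y == 1)
    w_minus = sum(w for w, y, c in zip(weights, labels, covered) if c and y == -1)
    return abs(math.sqrt(w_plus) - math.sqrt(w_minus))

def build_chain(candidates, xs, labels, weights):
    """Greedily append the candidate that most improves the score."""
    chain = []
    covered = [True] * len(xs)  # the empty chain covers every example
    best = chain_score(weights, labels, covered)
    remaining = list(candidates)
    while remaining:
        scored = []
        for clf in remaining:
            new_cov = [c and clf(x) for c, x in zip(covered, xs)]
            scored.append((chain_score(weights, labels, new_cov), clf, new_cov))
        score, clf, new_cov = max(scored, key=lambda item: item[0])
        if score <= best:
            break  # no candidate improves the objective
        chain.append(clf)
        remaining.remove(clf)
        covered, best = new_cov, score
    return chain, covered

# Hypothetical baseline classifiers over two made-up features.
def feature_a_high(x):
    return x[0] >= 3

def feature_b_high(x):
    return x[1] >= 1

xs = [(5, 1), (4, 0), (1, 1), (0, 0)]
labels = [1, 1, -1, -1]
weights = [0.25, 0.25, 0.25, 0.25]

chain, covered = build_chain([feature_a_high, feature_b_high], xs, labels, weights)
```

On this toy data the first classifier alone separates the positives, so adding the second would only shrink the covered positive weight and the loop stops after one step.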

PruneChain Function

Minimize the loss function by removing the last classifier from $R$.

The confidence value is obtained from GrowSet.

The weight sums are obtained by applying $R$ to PruneSet.
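A minimal sketch of the pruning step, under stated assumptions: a point is covered by a chain when every classifier in it fires, the loss has the same form as $Z$ with the confidence value estimated on GrowSet and the weight sums measured on PruneSet, and a small smoothing constant `eps` keeps the logarithm finite. These choices, and all names and data below, are illustrative rather than taken from the slides.

```python
import math

def covered_sums(chain, xs, labels, weights):
    """(W+, W-) over examples on which every classifier in the chain fires."""
    w_plus = w_minus = 0.0
    for x, y, w in zip(xs, labels, weights):
        if all(clf(x) for clf in chain):
            if y == 1:
                w_plus += w
            else:
                w_minus += w
    return w_plus, w_minus

def prune_loss(chain, grow, prune, eps=1e-3):
    """Z-style loss: confidence from GrowSet, weight sums from PruneSet.

    eps is an assumed smoothing term, not something the slides specify.
    """
    g_plus, g_minus = covered_sums(chain, *grow)
    c_hat = 0.5 * math.log((g_plus + eps) / (g_minus + eps))
    v_plus, v_minus = covered_sums(chain, *prune)
    return (1.0 - v_plus - v_minus) + v_plus * math.exp(-c_hat) + v_minus * math.exp(c_hat)

def prune_chain(chain, grow, prune):
    """Drop trailing classifiers while doing so lowers the loss."""
    best = list(chain)
    best_loss = prune_loss(best, grow, prune)
    current = list(chain)
    while len(current) > 1:  # assumed: never prune down to an empty chain
        current = current[:-1]
        loss = prune_loss(current, grow, prune)
        if loss < best_loss:
            best, best_loss = list(current), loss
    return best

# Hypothetical classifiers and data; weights in each set sum to 1.
def feature_a_high(x):
    return x[0] >= 3

def feature_b_high(x):
    return x[1] >= 1

grow = ([(5, 1), (4, 1), (4, 0), (1, 0)], [1, 1, -1, -1], [0.25] * 4)
prune = ([(5, 1), (4, 0), (3, 1), (0, 0)], [1, 1, -1, -1], [0.25] * 4)

pruned = prune_chain([feature_a_high, feature_b_high], grow, prune)
```

In this toy case the second classifier looks perfect on GrowSet but misfires on PruneSet, so its overconfident value inflates the loss and the chain is trimmed back to the first classifier.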

Update Weights

Calculate the confidence value $C_R$ with ensemble classifier $R$ on the entire data set, then update the weights $w(i)$.
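A minimal sketch of the weight update, assuming the standard confidence-rated AdaBoost form $w(i) \leftarrow w(i)\exp(-y_i c_i)/Z$, with $c_i = C_R$ for examples covered by $R$ and $c_i = 0$ otherwise. Whether the slides use exactly this form is an assumption, and the function name and toy data are illustrative.

```python
import math

def update_weights(weights, labels, covered, c_r):
    """Reweight examples after one boosting round and renormalize.

    Covered examples are scaled by exp(-y_i * c_r); uncovered examples
    keep their weight before normalization.
    """
    unnormalized = [
        w * math.exp(-y * (c_r if c else 0.0))
        for w, y, c in zip(weights, labels, covered)
    ]
    z = sum(unnormalized)  # normalization factor Z
    return [u / z for u in unnormalized]

# Toy data, for illustration only.
weights = [0.25, 0.25, 0.25, 0.25]
labels = [1, 1, -1, -1]
covered = [True, True, True, False]

# Optimal confidence for W+ = 0.5 and W- = 0.25.
c_r = 0.5 * math.log(0.5 / 0.25)
new_weights = update_weights(weights, labels, covered, c_r)
```

With the optimal $C_R$, the covered negative example gains weight and the covered positives lose weight until the two groups balance, which is what pushes the next round toward a different rule.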

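Strong Hypothesis

At the end of boosting, there are $T$ chains, $R_1, \dots, R_T$, one from each boosting round. The strong hypothesis combines their confidence values, where

$$\hat{C}_{R_t} = 0 \quad \text{if } x \notin R_t$$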
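One common way to combine the chains, used in confidence-rated rule boosting, is to sum the confidence values of the chains that cover $x$ and take the sign. The sketch below assumes that form, assumes a chain covers $x$ when all of its classifiers fire, and breaks a tie at zero toward the negative class. None of these details is confirmed by the slides, and the chains and confidences are made up.

```python
def strong_hypothesis(chains, x):
    """Predict +1 or -1 from the chains that cover x.

    chains: list of (classifiers, confidence) pairs; a chain that does not
    cover x contributes 0 to the sum.
    """
    total = 0.0
    for classifiers, confidence in chains:
        if all(clf(x) for clf in classifiers):
            total += confidence
    return 1 if total > 0 else -1  # assumed tie-break: zero maps to -1

# Hypothetical classifiers and confidence values.
def feature_a_high(x):
    return x[0] >= 3

def feature_b_high(x):
    return x[1] >= 1

chains = [
    ([feature_a_high], 0.8),
    ([feature_b_high], -0.3),
    ([feature_a_high, feature_b_high], 0.2),
]

covered_by_all = strong_hypothesis(chains, (5, 1))
covered_by_one = strong_hypothesis(chains, (0, 1))
covered_by_none = strong_hypothesis(chains, (0, 0))
```

Prediction costs one pass over the chains, which matches the summary's remark that the strong hypothesis is easy to calculate.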

SUMMARY

1. Grid cells with similar crime counts that are clustered together are also geographically close to each other on the map. In addition, the high-crime-rate and low-crime-rate areas fall into separate clusters.

2. The original data set is randomly divided into two subsets in each round. The greedy weak-learner algorithm uses confidence-rated evaluation to "chain" the baseline classifiers on one subset, and then "trims" the chain on the other subset.

3. The strong hypothesis is easy to calculate.

Q & A

THANK YOU!!
