Association Rule Mining to Remotely Sensed Data

Upload: pearlmillee

Post on 30-May-2018


  • 8/9/2019 Association rule mining to Remotely sensed data

    Presented by

    Madhusmita Sahu

    (CSE, 950014)

    Contents

    - Introduction
    - Apriori Algorithm
    - Mining Rules to Imagery Data
        - Problem definition
        - Partitioning quantitative attributes
        - Finding large itemsets from imagery data
    - New pruning techniques for fast data mining
        - Technique one
        - Technique two
    - An example of applying the new algorithm
    - Conclusion
    - References

    REMOTE SENSING

    Remote sensing is the science of acquiring information about the Earth's surface without actually being in contact with it.

    - Recording reflected energy
    - Images collected in multiple bands of the electromagnetic spectrum

    Association Rule Mining

    - Associations
        - Simple rules in categorical data
    - Sample applications
        - Market Basket Analysis: Buys(Milk) ⇒ Buys(Eggs)
        - Transaction Processing: Income(Hi) & Single(Y) ⇒ Owns(Computer)
    - Search for Strong Rules
        - Support: $R(A \Rightarrow B) = P(A \cup B)$, the fraction of transactions that contain every item of both A and B
        - Confidence: $R(A \Rightarrow B) = P(B \mid A) = P(A \cup B) / P(A)$
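The two measures above can be computed directly from a list of transactions. The following is a minimal Python sketch, not part of the original slides; the function names `support` and `confidence` and the basket data are invented for illustration.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def support(transactions: List[Transaction], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[Transaction],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """P(B | A): support of A and B together divided by support of A."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    return support(transactions, both) / support(transactions, a)


# Hypothetical market-basket data, invented for illustration only.
baskets = [
    frozenset({"milk", "eggs", "bread"}),
    frozenset({"milk", "eggs"}),
    frozenset({"milk", "bread"}),
    frozenset({"eggs"}),
]

rule_support = support(baskets, {"milk", "eggs"})          # 2 of 4 baskets
rule_confidence = confidence(baskets, {"milk"}, {"eggs"})  # 2 of 3 milk baskets
```

A rule is usually called strong when both values reach user-chosen thresholds (minsup and minconf).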

    The Apriori Algorithm: Pseudo-code

    - Join step: C_k is generated by joining L_{k-1} with itself.
    - Prune step: any (k-1)-itemset that is not frequent cannot be a subset of a frequent k-itemset.
    - Pseudo-code:

        C_k: candidate itemsets of size k
        L_k: frequent itemsets of size k

        L_1 = {frequent items};
        for (k = 1; L_k != ∅; k++) do begin
            C_{k+1} = candidates generated from L_k;
            for each transaction t in database do
                increment the count of all candidates in C_{k+1} that are contained in t;
            L_{k+1} = candidates in C_{k+1} with min_support;
        end
        return ∪_k L_k;
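The pseudo-code can be turned into a short runnable sketch. The Python version below is one possible reading of the join, prune and count steps, not the presenter's implementation; the function name `apriori` and the toy database are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List

Itemset = FrozenSet[str]


def apriori(transactions: List[Itemset], min_support: float) -> Dict[Itemset, int]:
    """Return every frequent itemset with its absolute support count."""
    min_count = min_support * len(transactions)

    # L1: count single items and keep the frequent ones.
    counts: Dict[Itemset, int] = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 1
    while current:
        # Join step: unite pairs of frequent k-itemsets that form a (k+1)-itemset.
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) == k + 1:
                # Prune step: every k-subset must itself be frequent.
                if all(frozenset(sub) in current for sub in combinations(union, k)):
                    candidates.add(union)

        # Count step: one pass over the database per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1

    return frequent


# Hypothetical toy database, invented for illustration.
toy = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("bc")]
result = apriori(toy, min_support=0.5)
```

In this toy run every single item and every pair is frequent, while {a, b, c} is generated as a candidate but dropped after counting because it occurs in only one transaction.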

    MINING ASSOCIATION RULES FROM IMAGERY DATA

    - Problem definition
    - Partitioning quantitative attributes
    - Finding large itemsets from imagery data

    NEW PRUNING TECHNIQUES FOR FAST DATA MINING

    - Technique one

        Lemma 1: A pixel value cannot belong to two different intervals from the same band.

        Lemma 2: Any combination of k intervals (k > 1) from the same band has support zero.

    Notation:

    - $C_k$: candidate k-itemsets
    - $L_k$: large k-itemsets
    - $*$: the concatenation operation
    - $|C_k|$: number of itemsets in the candidate k-itemsets
    - $R_j$: number of intervals in band j
    - $|L_k|$: number of itemsets in the large k-itemsets

    1. According to the Apriori algorithm: Apriori uses $L_1 * L_1$ to generate the candidate set of 2-itemsets $C_2$.

        $$|C_2|_{\text{apriori}} = \frac{|L_1|\,(|L_1| - 1)}{2}$$

    2. According to the new algorithm: assume $|L_1| = R_1 + R_2 + \dots + R_n$.

        $$|C_2|_{\text{new}} = R_1(R_2 + R_3 + \dots + R_n) + R_2(R_3 + R_4 + \dots + R_n) + \dots + R_{n-2}(R_{n-1} + R_n) + R_{n-1}R_n = \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} R_i R_j$$

    Technique one (contd.)

    The number of candidate 2-itemsets generated by the new algorithm is much smaller than the number generated by Apriori.

    $$|C_2|_{\text{prune 1}} = |C_2|_{\text{apriori}} - |C_2|_{\text{new}} = \sum_{j=1}^{n} \frac{R_j (R_j - 1)}{2}$$

    When n is large and $R_j$ is large, $|C_2|_{\text{prune 1}}$ becomes an extremely large number. For example, if the imagery data has 8 bands and each band has 16 intervals, the number of pruned candidate 2-itemsets is $8 \times 16 \times (16 - 1) / 2 = 960$. This sharply reduces the processing cost.
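The counts for technique one are easy to check numerically. The Python sketch below assumes, as the slides do, that every interval of every band is a large 1-itemset; the function names are invented for illustration.

```python
from itertools import combinations
from typing import List


def c2_apriori(intervals_per_band: List[int]) -> int:
    """|L1| choose 2, assuming every interval of every band is a large 1-itemset."""
    n_items = sum(intervals_per_band)
    return n_items * (n_items - 1) // 2


def c2_new(intervals_per_band: List[int]) -> int:
    """Pairs whose two intervals come from different bands."""
    return sum(ri * rj for ri, rj in combinations(intervals_per_band, 2))


def c2_prune1(intervals_per_band: List[int]) -> int:
    """Same-band pairs, which Lemma 2 says have zero support."""
    return sum(r * (r - 1) // 2 for r in intervals_per_band)


bands = [16] * 8  # 8 bands with 16 intervals each, as in the example above
apriori_count = c2_apriori(bands)  # 128 * 127 / 2
new_count = c2_new(bands)
pruned = c2_prune1(bands)
```

For this input the difference between the Apriori count and the new count equals the number of same-band pairs, which is the quantity the slide reports as 960.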

    Technique two

    - During the data mining process, allowing user interaction with the mining engine and using the user's prior knowledge helps to speed up the mining algorithm by restricting the search space.
    - Consider only one band, "bandN", in the output. The association rule has the form band1 ∧ ... ∧ band(N-1) ⇒ bandN. The number of candidate 2-itemsets is

        $$|C_2|_{\text{new}} = \sum_{i=1}^{N-1}\sum_{j=i+1}^{N} R_i R_j$$

    - We are not interested in itemsets that do not contain an interval from bandN, so we prune every candidate itemset in which none of the intervals is chosen from bandN. The number of pruned candidate 2-itemsets is

        $$|C_2|_{\text{prune 2}} = \sum_{i=1}^{N-2}\sum_{j=i+1}^{N-1} R_i R_j$$

    Technique two (contd.)

    - Apply the new pruning technique described in technique one:

        $$|C_2|_{\text{prune 1}} = \sum_{j=1}^{N} \frac{R_j (R_j - 1)}{2}$$

    - The total number of pruned candidate 2-itemsets is

        $$|C_2|_{\text{prune}} = |C_2|_{\text{prune 1}} + |C_2|_{\text{prune 2}} = \sum_{j=1}^{N} \frac{R_j (R_j - 1)}{2} + \sum_{i=1}^{N-2}\sum_{j=i+1}^{N-1} R_i R_j$$

    - The remaining steps are the same as in the Apriori algorithm.
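The bookkeeping for a single output band can be sketched in a few lines of Python. This is an illustration under one reading of the slides: technique one removes pairs of intervals from the same band, and technique two removes cross-band pairs that do not involve bandN. The function name and the input list are assumptions.

```python
from itertools import combinations
from typing import Dict, List


def prune_counts_single_output(r: List[int]) -> Dict[str, int]:
    """Candidate 2-itemset counts when only the last band may appear in the output.

    r[j] is the number of large intervals in band j+1; the last entry is bandN.
    """
    n_items = sum(r)
    apriori_count = n_items * (n_items - 1) // 2
    # Technique one: pairs of intervals taken from the same band.
    prune1 = sum(x * (x - 1) // 2 for x in r)
    # Technique two: cross-band pairs that do not involve bandN.
    prune2 = sum(a * b for a, b in combinations(r[:-1], 2))
    return {
        "apriori": apriori_count,
        "prune1": prune1,
        "prune2": prune2,
        "remaining": apriori_count - prune1 - prune2,
    }


# Four bands with two large intervals each (an assumed input for illustration).
counts = prune_counts_single_output([2, 2, 2, 2])
```

The remaining candidates are exactly the pairs that combine one interval of bandN with one interval of another band.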

    Technique two (contd.)

    - If there are (N-M) bands in the output, the rule has the form band1 ∧ ... ∧ bandM ⇒ band(M+1) ∧ ... ∧ bandN.

      The total number of pruned candidate 2-itemsets is

        $$|C_2|_{\text{prune}} = |C_2|_{\text{prune 1}} + |C_2|_{\text{prune 2}} = \sum_{j=1}^{N} \frac{R_j (R_j - 1)}{2} + \sum_{i=1}^{M-1}\sum_{j=i+1}^{M} R_i R_j$$

    - The remaining steps are the same as in the Apriori algorithm.
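Instead of counting, the surviving candidates can be enumerated. The Python sketch below is a hedged interpretation of the two techniques for any set of output bands: it drops same-band pairs, then drops pairs that contain no interval from an output band. The dictionary layout, the function name `candidate_pairs` and the choice of band3 and band4 as output bands are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def candidate_pairs(
    large_items: Dict[str, List[str]], output_bands: Set[str]
) -> List[Tuple[str, str]]:
    """Enumerate candidate 2-itemsets that survive both pruning techniques.

    large_items maps a band name to its large intervals (hypothetical layout).
    """
    tagged = [(band, item) for band, items in large_items.items() for item in items]
    kept = []
    for (band_a, item_a), (band_b, item_b) in combinations(tagged, 2):
        if band_a == band_b:
            continue  # technique one: same-band pairs have zero support
        if band_a not in output_bands and band_b not in output_bands:
            continue  # technique two: no interval from an output band
        kept.append((item_a, item_b))
    return kept


# Assumed input: four bands with two large intervals each.
large = {
    "band1": ["b11", "b12"],
    "band2": ["b25", "b26"],
    "band3": ["b32", "b37"],
    "band4": ["b42", "b44"],
}
pairs_two_outputs = candidate_pairs(large, {"band3", "band4"})
pairs_one_output = candidate_pairs(large, {"band4"})
```

With two output bands only the four band1/band2 pairs are removed by technique two, so more candidates survive than with a single output band.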

    Steps

    - Step 1: Choose one of the partition methods (equal depth, uneven depth or discontinuous partition) to determine the intervals.
    - Step 2: From the large 1-itemset, apply the new pruning techniques (technique one and technique two) to generate the candidate 2-itemsets.
    - Step 3: Apply the remaining steps of the Apriori algorithm.

    An example of applying the new algorithm

    Assume the user selects equal depth partitioning, with diameter two for band1 and band4 and diameter three for band2 and band3.

    Pixel  Band1  Band2  Band3  Band4
    1      40     140    200    240
    2      50     130    210    250
    3      45     135    210    190
    4      100    180    50     100
    5      110    170    40     120

    Intervals for band1 and band4:

    Band   [0,63]  [64,127]  [128,191]  [192,255]
    band1  b11     b12       b13        b14
    band4  b41     b42       b43        b44

    Intervals for band2 and band3:

    Band   [0,31]  [32,63]  [64,95]  [96,127]  [128,159]  [160,191]  [192,223]  [224,255]
    band2  b21     b22      b23      b24       b25        b26        b27        b28
    band3  b31     b32      b33      b34       b35        b36        b37        b38

    An example of partitioning the values into intervals

    Pixel  b11  b12  b13  b14  b21  b25  b26  b28  b31  b32  b37  b38  b41  b42  b43  b44
    1      1    0    0    0    0    1    0    0    0    0    1    0    0    0    0    1
    2      1    0    0    0    0    1    0    0    0    0    1    0    0    0    0    1
    3      1    0    0    0    0    1    0    0    0    0    1    0    0    0    1    0
    4      0    1    0    0    0    0    1    0    0    1    0    0    0    1    0    0
    5      0    1    0    0    0    0    1    0    0    1    0    0    0    1    0    0

    - After selecting a partition method, map each value in the pixel table into its interval.
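The mapping step can be sketched in Python. The slides call the scheme equal depth partitioning, while the interval table shows equally wide bins over the 8-bit range; this sketch assumes the equally wide bins (4 for band1 and band4, 8 for band2 and band3). The helper names `to_item` and `to_transaction` are invented for illustration.

```python
from typing import List

# Pixel table from the example: one row per pixel, one value per band.
PIXELS: List[List[int]] = [
    [40, 140, 200, 240],
    [50, 130, 210, 250],
    [45, 135, 210, 190],
    [100, 180, 50, 100],
    [110, 170, 40, 120],
]

# Intervals per band: 4 for band1 and band4, 8 for band2 and band3.
INTERVALS_PER_BAND = [4, 8, 8, 4]


def to_item(band: int, value: int, n_intervals: int, max_value: int = 256) -> str:
    """Map an 8-bit value to an item label such as 'b25' (band 2, interval 5)."""
    width = max_value // n_intervals  # equally wide bins over [0, 255]
    return f"b{band}{value // width + 1}"


def to_transaction(pixel: List[int]) -> List[str]:
    """One item per band, so a pixel never hits two intervals of the same band."""
    return [
        to_item(band, value, n)
        for band, (value, n) in enumerate(zip(pixel, INTERVALS_PER_BAND), start=1)
    ]


transactions = [to_transaction(p) for p in PIXELS]
```

Each pixel becomes a transaction with exactly one item per band, which is the property Lemma 1 relies on.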

    An example (contd.)

    - Apply the new pruning techniques for candidate 2-itemset generation. Assume minsup = 40% and minconf = 60%.
    - Candidate 1-itemsets:

        {b11, b12, b13, b14, b21, b22, b23, b24, b25, b26, b27, b28, b31, b32, b33, b34, b35, b36, b37, b38, b41, b42, b43, b44}

    - Large 1-itemsets (support count in parentheses):

        {b11(3), b12(2), b25(3), b26(2), b32(2), b37(3), b42(2), b44(2)}

    - Candidate 2-itemsets:

        {{b42,b11}, {b42,b12}, {b42,b25}, {b42,b26}, {b42,b32}, {b42,b37}, {b44,b11}, {b44,b12}, {b44,b25}, {b44,b26}, {b44,b32}, {b44,b37}}

    An example (contd.)

    - Applying pruning technique one: $|C_2|_{\text{prune 1}} = 1 + 1 + 1 + 1 = 4$
    - Applying pruning technique two: $|C_2|_{\text{prune 2}} = 2 \times (2 + 2) + 2 \times 2 = 12$
    - Total number of pruned candidate 2-itemsets: 12 + 4 = 16
    - Applying the Apriori algorithm, the number of candidate 2-itemsets is $|C_2|_{\text{apriori}} = (8 \times 7) / 2 = 28$
    - The percentage of pruning is 16/28 ≈ 57%, so the execution efficiency of the mining process is improved. The remaining steps are the same as in the Apriori algorithm.
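The numbers in this example can be reproduced with a short Python sketch. The transactions are read off the interval table of the example; treating band4 as the only output band is an assumption inferred from the candidate list, and the helper `band_of` is invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Transactions read off the interval table in the example (one item per band).
TRANSACTIONS = [
    ["b11", "b25", "b37", "b44"],
    ["b11", "b25", "b37", "b44"],
    ["b11", "b25", "b37", "b43"],
    ["b12", "b26", "b32", "b42"],
    ["b12", "b26", "b32", "b42"],
]
MIN_COUNT = 2      # minsup = 40% of 5 pixels
OUTPUT_BAND = "4"  # assumption: band4 is the only band allowed in the output


def band_of(item: str) -> str:
    """Band digit of a label like 'b25'; valid only for single-digit bands."""
    return item[1]


item_counts = Counter(item for t in TRANSACTIONS for item in t)
large_1 = sorted(i for i, c in item_counts.items() if c >= MIN_COUNT)

# Plain Apriori: every pair of large 1-itemsets.
all_pairs = list(combinations(large_1, 2))
# Technique one: drop pairs of intervals from the same band.
cross_band = [p for p in all_pairs if band_of(p[0]) != band_of(p[1])]
# Technique two: keep only pairs that contain an interval of the output band.
candidates = [
    p for p in cross_band if OUTPUT_BAND in (band_of(p[0]), band_of(p[1]))
]

pruned = len(all_pairs) - len(candidates)
pruned_percent = round(100 * pruned / len(all_pairs))
```

The enumeration leaves 12 candidate pairs, which agrees with 28 minus the 16 pruned itemsets computed above.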

    Conclusion

    - In this seminar, we defined a new data mining problem: mining association rules from imagery data, and its application in precision agriculture.
    - Since the efficiency of a mining algorithm is a very important issue in data mining, we proposed two simple and effective pruning techniques for candidate 2-itemset generation.
    - By exploiting the nature of the problem and the characteristics of imagery data, we can prune a significant number of unnecessary candidate itemsets during the very early phase of the mining process.

    References

    - Jianning Dong, William Perrizo, Qin Ding and Jingkai Zhou, "Association rule mining to Remotely sensed data," North Dakota State University, Fargo, ND 58105.
    - Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques (Hardcover, Mar 2006).
    - J. Zhang, H. Wynne, M. L. Lee, "Image mining: issues, frameworks, and techniques," in Proceedings of 2nd International Workshop on Multimedia Data Mining, San Francisco, Aug 2001, pp. 13-20.
    - J. Li and R. M. Narayanan, "Integrated spectral and spatial information mining in remote sensing," IEEE Transactions on Geoscience and Remote Sensing, vol. 42, no. 3, pp. 673-685, March 2004.


    Thank You!!
