-
Optimizing Scoring Functions and Indexes for Proximity Search in Type-annotated Corpora
Soumen Chakrabarti, Kriti Puniyani, Sujatha Das
IIT Bombay
-
(In fewer words)
Ranking and Indexing for Semantic Search
Soumen Chakrabarti, Kriti Puniyani, Sujatha Das
IIT Bombay
-
Working notion of semantic search
* Exploiting in conjunction
  * "Strings with meaning": entities and relations
  * Uninterpreted strings, as in IR
* This paper
  * Only the is-a relation
  * Token match
  * Token proximity
* Can approximate many info needs
-
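Type-annotated corpus and query
* e.g. "Born in New York in 1934, Sagan was a noted astronomer whose lifelong passion was searching for intelligent life in the cosmos."
* [Figure: tokens of the sentence attached by is-a edges to types in the hierarchy. Type labels shown: person, scientist, physicist, astronomer; entity, region, city, district, state; abstraction, time, year; surface-pattern annotations hasDigit, isDDDD.]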
-
The query class we address
* Find a token span w (in context) such that
  1. w is a mention of entity e
     * "Carl Sagan" or "Sagan" is a mention of the concept of that specific physicist
  2. e is an instance of atype a given in the query
     * "Which a=physicist"
  3. w is NEAR a set of selector strings
     * "searched", "intelligent", "life", "cosmos"
* All uncertain/imprecise; we focus on #3
* Yet surprisingly powerful: correct answer within top 34 w's for TREC QA benchmark
-
Contribution 1: What is NEAR?
* XQuery and XPath full-text support
  * (distance at most | window) 10 words [ordered]: a hard proximity clause, not learnt
  * ftcontains with thesaurus at relationship "narrower terms" at most ℓ levels
* No implementation combining "narrower terms" and soft proximity ranking
* Search engines favor proximity in proprietary ways
* A learning framework for proximity
-
Contribution 2: Indexing annotations
* type=person NEAR theory relativity → type in {physicist, politician, cricketer, ...} NEAR theory relativity
  * Large fanout at query time, impractical
* Complex annotation indexes tend to be large
  * Binding Engine (WWW 2005): 10x index size blowup with only a handful of entity types
  * Our target: 18000 atypes today, more later
* Workload-driven index and query optimization
  * Exploit skew in query atype workload
-
Part-1: Learning to score token spans
* Example query: type=person NEAR "television" "invent*"
* Rarity of selectors
* Distance from candidate position to selectors
* Many occurrences of one selector
  * Closest is good
* Combining scores from many selectors
  * Sum is good
* [Figure: the token sequence "television was invented in 1925. Inventor John Baird was born" with token offsets from -6 to +2 relative to the candidate position "John Baird" (annotated person via is-a). Labels: candidate position to score; selectors; closest stem "invent"; second-closest stem "invent"; energy.]
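The scoring idea on this slide (rarity of a selector as its "energy", only the closest occurrence of each selector counts, contributions are summed) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `score_candidate`, the pre-stemmed token list, the energy values, and the `1/gap` decay are all assumptions made for the example.

```python
def score_candidate(tokens, cand, selectors, energy, decay, window):
    """Score the candidate position `cand` in a list of (already stemmed) tokens.

    For each selector, only its closest occurrence within `window` tokens
    contributes energy(selector) * decay(gap); contributions are summed.
    `energy` and `decay` are supplied by the caller (hypothetical here).
    """
    total = 0.0
    for s in selectors:
        gaps = [abs(i - cand) for i, t in enumerate(tokens)
                if t == s and i != cand and abs(i - cand) <= window]
        if gaps:  # closest occurrence only
            total += energy[s] * decay(min(gaps))
    return total


# Toy version of the slide's example; "invent" stands for the stem of
# "invented" and "Inventor", and the energy values are made up.
tokens = ["television", "was", "invent", "in", "1925",
          "invent", "john_baird", "was", "born"]
energy = {"television": 2.0, "invent": 1.0}
example_score = score_candidate(tokens, 6, ["television", "invent"],
                                energy, lambda gap: 1.0 / gap, window=10)
```

Here the second-closest occurrence of "invent" (gap 4) is ignored, while "television" (gap 6) and the closest "invent" (gap 1) both contribute.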
-
Learning the shape of the decay function
* For simplicity assume left-right symmetry
* Parameters $\beta = (\beta_1, \ldots, \beta_W)$, where $W$ is the max gap window
* Candidate position characterized by a feature vector $f = (f[1], \ldots, f[W])$
  * If there is a matched selector $s$ at distance $j$, and this is the closest occurrence of $s$, then set $f[j]$ to energy($s$), else 0
* Score of candidate position is $\beta \cdot f$
* If we like candidate $u$ less than $v$ ($u \prec v$), we want $\beta \cdot f_u \le \beta \cdot f_v$
  * Assess a penalty proportional to $\exp(\beta \cdot f_u - \beta \cdot f_v)$
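A small sketch of the quantities on this slide, assuming the parameter vector is written beta, the score is the dot product of beta with the feature vector, and the penalty for a pair where u is less preferred than v is exp(score(u) - score(v)). The function names are illustrative, and adding energies when two selectors fall at the same gap is an assumption of this sketch, not something the slide states.

```python
import math


def feature_vector(closest_gap, energy, W):
    """Build f[1..W] (stored 0-based) from {selector: gap of its closest occurrence}."""
    f = [0.0] * W
    for s, j in closest_gap.items():
        if 1 <= j <= W:
            f[j - 1] += energy[s]  # assumption: energies add if gaps coincide
    return f


def score(beta, f):
    """Score of a candidate position: dot product of beta and f."""
    return sum(b * x for b, x in zip(beta, f))


def preference_loss(beta, pairs):
    """Sum of exp(score(u) - score(v)) over pairs (f_u, f_v) with u less preferred than v."""
    return sum(math.exp(score(beta, fu) - score(beta, fv)) for fu, fv in pairs)


energy = {"a": 2.0, "b": 1.0}
f_u = feature_vector({"a": 3}, energy, W=3)           # [0, 0, 2]
f_v = feature_vector({"a": 1, "b": 2}, energy, W=3)   # [2, 1, 0]
beta = [1.0, 0.5, 0.25]
loss = preference_loss(beta, [(f_u, f_v)])
```

The loss is small when the preferred candidate v already scores higher than u, and grows exponentially when the order is violated, which is what a learner would minimize over beta.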
-
Learning decay function: results
* Discourage adjacent $\beta_j$'s from differing a lot
* Penalize violations of preference order
* Mean reciprocal rank (MRR): average over questions of the reciprocal of the first rank where an answer token was found (larger is better)
* Learned decay is roughly unimodal around gap = 4 and 5

Train (TREC year)   Test (TREC year)   MRR
IR baseline         2000               0.16
2001                2000               0.29
[Figure: learned decay weight $\beta(j)$ (smoothed) plotted against gap j.]
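The mean reciprocal rank used as the evaluation measure on this slide can be computed as in the sketch below. Treating a question with no answer found as contributing 0 is an assumption of this sketch; the function name is illustrative.

```python
def mean_reciprocal_rank(first_answer_ranks):
    """Average of 1/rank over questions.

    `first_answer_ranks` holds, per question, the 1-based rank of the first
    position where an answer token was found, or None if none was found
    (counted as 0 here, which is an assumption of this sketch).
    """
    if not first_answer_ranks:
        return 0.0
    total = sum(1.0 / r for r in first_answer_ranks if r)
    return total / len(first_answer_ranks)


mrr = mean_reciprocal_rank([1, 2, None, 4])
```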
-
Part-2: Workload-driven indexing
* Type hierarchies are large and deep
  * 18000 internal and 80000 leaf types in WordNet
* Runtime atype expansion is time-intensive
  * Even WordNet knows 650 scientists, 860 cities
* Index each token as all generalizations
  * Sagan → physicist, scientist, person, living thing
  * Large index space bloat
* Index a subset of atypes
Corpus/Index       Gbytes
Original corpus    5.72
Gzipped corpus     1.33
Stem index         0.91
Full type index    4.30
-
Pre-generalize (and post-filter)
Full set of atypes (answer types) is A
Index only a registered subset R of A
Say query has atype a; want k answers
Find a's best generalization g ∈ R
Get best k' > k spans that are instances of g
  Given index on R, this is standard IR (see paper)
[Diagram: atype a and its generalization g]
-
(Pre-generalize and) post-filter
Fetch each high-scoring span w
Check if w is-a a
  Fast compact forward index (doc, offset) → token
  Fast small reachability index, common in XML
If fewer than k survive, restart with larger k'
  Expensive
  Pick conservative k'
[Diagram: atype a and its generalization g]
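The pre-generalize and post-filter loop on the two slides above can be sketched roughly as follows. This is only an illustration under stated assumptions: `best_generalization`, `top_spans`, and `is_instance` are hypothetical callbacks standing in for the lookup of g in R, the ranked index probe, and the forward/reachability check, and the doubling policy for k' is an assumption, not something the slides specify.

```python
from typing import Callable, List, Optional, Tuple

Span = Tuple[int, int]  # hypothetical span id: (document, token offset)


def pregeneralize_postfilter(
    a: str,
    k: int,
    best_generalization: Callable[[str], str],
    top_spans: Callable[[str, int], List[Span]],
    is_instance: Callable[[Span, str], bool],
    k_prime: Optional[int] = None,
    growth: int = 2,
) -> List[Span]:
    """Return up to k spans that are instances of atype a, best first."""
    g = best_generalization(a)  # g is in R; g == a when a itself is registered
    kp = max(1, k_prime if k_prime is not None else k)
    while True:
        candidates = top_spans(g, kp)  # ranked spans that are instances of g
        survivors = [w for w in candidates if is_instance(w, a)]
        exhausted = len(candidates) < kp  # index has no further spans for g
        if len(survivors) >= k or exhausted:
            return survivors[:k]
        # Too few candidates survived the is-a check: restart with a larger k'.
        kp = max(kp * growth, kp + 1)
```

Every restart repeats the whole index probe, which is why a conservative (larger) initial k' can be cheaper overall than starting at k.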
-
Estimates needed by optimizer
If we index token ancestors in R as against ancestors in all of A, how much index space will we save?
  Cannot afford to try it out and see for many Rs
If query atype a is not found in R and we must generalize to g, what will be the bloat factor in query processing time?
  Need to average over a representative workload
-
Index space estimate given R
Each token occurrence leads to one posting entry
Assume index compression is a constant factor
Then total estimated index size is proportional to
  Σ_{r ∈ R} corpusCount(r)
  corpusCount(r): number of tokens in corpus that connect up to r
Surprisingly accurate!
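A small sketch of this estimate, assuming the index size is proportional to the total number of postings, i.e. the sum over registered atypes r of the number of token occurrences that connect up to r. The toy hierarchy, the `parents` map, and the function names are illustrative assumptions, not taken from the talk.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def ancestors_of(atype: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Reflexive, transitive set of generalizations of atype."""
    seen: Set[str] = set()
    stack = [atype]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, ()))
    return seen


def corpus_counts(token_types: Iterable[str],
                  parents: Dict[str, Set[str]]) -> Counter:
    """For every atype, count the token occurrences that connect up to it."""
    counts: Counter = Counter()
    for t in token_types:
        for anc in ancestors_of(t, parents):
            counts[anc] += 1
    return counts


def estimated_postings(counts: Counter, registered: Set[str]) -> int:
    """One posting per (token occurrence, registered ancestor) pair."""
    return sum(counts[r] for r in registered)


# Toy corpus: each entry is the most specific atype of one token occurrence.
parents = {"astronomer": {"scientist"}, "physicist": {"scientist"},
           "scientist": {"person"}, "city": {"region"}}
counts = corpus_counts(["astronomer", "physicist", "city", "astronomer"], parents)
full = estimated_postings(counts, set(counts))            # R = A
subset = estimated_postings(counts, {"person", "region"})  # smaller R
```

Because the counts are computed once, the estimate for any candidate R is a cheap sum, which is the point of not having to build an index per R.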
-
Processing time bloat for one query
If R = A, query takes time approximated by t_scan × corpusCount(a)
  t_scan: time to score one candidate position while scanning postings
  corpusCount(a): number of occurrences of descendants of type a
If a cannot be found in R, the price paid for generalization to g consists of
  Scanning more posting entries: t_scan × corpusCount(g)
  Post-filtering k' responses: k' × t_forward
  t_forward: time to check if answer is instance of a as well
Therefore, overall bloat factor is
  (t_scan × corpusCount(g) + k' × t_forward) / (t_scan × corpusCount(a))
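As a worked illustration of the bloat factor, the sketch below assumes the cost model described on this slide: baseline time is the scan cost per posting times the number of occurrences under a, and generalizing to g adds a longer scan plus one forward-index check per post-filtered response. The parameter names and the frequency-weighted average over a workload are assumptions made for the example.

```python
from typing import Iterable, Tuple


def query_bloat(count_a: float, count_g: float, k_prime: int,
                t_scan: float, t_forward: float) -> float:
    """Ratio of generalized query time to the time with atype a indexed."""
    baseline = t_scan * count_a                          # scan postings of a
    generalized = t_scan * count_g + k_prime * t_forward  # scan g, then filter
    return generalized / baseline


def average_bloat(workload: Iterable[Tuple[float, float]]) -> float:
    """Frequency-weighted mean bloat over (query frequency, bloat) pairs."""
    pairs = list(workload)
    total = sum(freq for freq, _ in pairs)
    return sum(freq * bloat for freq, bloat in pairs) / total


# a has 100 occurrences, its generalization g has 400; 20 spans are
# post-filtered; a forward-index check costs 5x one posting scan step.
example = query_bloat(100, 400, 20, t_scan=1.0, t_forward=5.0)
```

When a is itself in R, g equals a and no post-filtering is needed, so calling `query_bloat` with equal counts and `k_prime=0` gives a factor of 1.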
-
Query time bloat: results
Observed bloat fit is not as good as the index space estimate
While the observed/estimated ratio for one query is noisy, the average over many queries is much better
[Embedded spreadsheet: per-query timings with doGeneralize = TRUE. Columns: QID, ExpectedBloat, t1 to t9, MeanTime.]
[Embedded spreadsheet: per-query timings with doGeneralize = FALSE. Columns: QID, t1 to t10, MedianTime.]
[Charts: histograms of the Estimated/Observed and Observed/Estimated bloat ratios. Axis labels: Ratio, Count of Occurrences.]
Non-Zero Bloats
Bloats Histogram (nonZeroBuckets Only)
Bin Center   Estimated/Observed Histogram   Observed/Estimated Histogram
0.5          60                             5
1.5          73                             58
2.5          1                              22
3.5          2                              4
4.5          2                              4
5.5          0                              2
6.5          0                              4
7.5          0                              3
8.5          0                              1
9.5          0                              1
10.5         0                              0
11.5         0                              1
12.5         0                              1
13.5         0                              1
14.5         0                              0
15.5         0                              0
16.5         0                              1
17.5         0                              0
18.5         0                              0
19.5         0                              1
20.5         0                              1
21.5         0                              0
22.5         0                              1
23.5         0                              0
24.5         0                              0
25.5         0                              0
26.5         0                              0
27.5         0                              0
28.5         0                              1
29.5         0                              0
30.5         0                              0
31.5         0                              0
32.5         0                              1
33.5         0                              0
34.5         0                              1
35.5         0                              0
36.5         0                              0
37.5         0                              0
38.5         0                              0
39.5         0                              0
40.5         0                              0
41.5         0                              2
42.5         0                              1
43.5         0                              1
44.5         0                              1
45.5         0                              0
46.5         0                              1
47.5         0                              0
48.5         0                              0
49.5         0                              1
50.5         0                              2
51.5         0                              1
52.5         0                              0
53.5         0                              0
54.5         0                              0
55.5         0                              0
56.5         0                              0
57.5         0                              0
58.5         0                              0
59.5         0                              0
60.5         0                              0
61.5         0                              0
62.5         0                              0
63.5         0                              0
64.5         0                              0
65.5         0                              2
76.5         0                              1
80.5         0                              1
104.5        0                              1
110.5        0                              1
119.5        0                              1
121.5        0                              1
128.5        0                              1
129.5        0                              1
131.5        0                              1
161.5        0                              1
187.5        0                              1
201.5        0                              1
[Chart: Graph2. Series: Estimated Bloat, Observed Bloat.]
Sheet2
QID   Estimated       Observed
203   15.7947274374   5.317749905
206   1               0.9052823315
210   1               0.8554040896
212   2.5459486798    18.1386138614
214   1               0.8687921521
215   2.5459486798    3.0062038405
216   1               1.823699422
219   1               0.687804878
220   2.5459486798    3.5931582162
222   2.5459486798    124
223   6.2820742558    407.3333333333
226   3.9752636794    197.1578947368
229   2.5459486798    3.1119000284
232   2.5459486798    3.9935773924
233   2.5459486798    2.5123526745
234   2.5459486798    2.574192477
235   1               0.9764275554
236   2.5459486798    129.3636363636
239   2.5459486798    5.7227866473
240   1               0.9602388939
243   3.9752636794    5.1263978312
245   3.9752636794    4.1449619772
248   1               1
249   3.9752636794    9.504539559
250   3.9752636794    4.1342717065
251   1               0.9679144385
253   2.5459486798    510.5
254   10.3752963776   57.9916317992
255   2.5459486798    2.5503755813
258   3.9752636794    4.1708399366
263   1               0.9447983015
264   2.5459486798    3.6701977401
268   2.5459486798    3.4721537703
269   2.5459486798    103.5757575758
270   3.9752636794    641.6666666667
271   1               1
272   3.9752636794    259.5
273   2.5459486798    2.4759711654
274   2.5459486798    3.8063858696
276   1               0.9723488748
277   1               1.0156794425
279   2.5459486798    279.25
280   1               0.9781931464
283   3.9752636794    17.2261146497
287   2.5459486798    104.0952380952
291   1               0.980620155
292   1               0.8461538462
293   2.5459486798    3.3805473373
294   2.5459486798    2.8428630193
295   1               0.9574340528
299   1               0.8429708223
301   1               2.5454933789
302   1               0.9816120135
303   1               7.9171374765
304   1               29.753968254
306   1               4.59375
307   2.5459486798    5.0467196819
309   2.5459486798    302.1111111111
310   3.9752636794    4.16380182
311   1               0.9779715466
312   1               1
317   3.9752636794    173.9523809524
319   1               0.9616541353
320   3.9752636794    9.2961783439
321   1               0.9678923767
323   2.5459486798    3.2141531323
324   1               1.858974359
326   1               1.0696517413
327   1               0.9737588652
331   1               0.896434635
332   1               0.9748511339
335   1               1
338   2.5459486798    9.7206703911
340   2.5459486798    85.5
341   1               0.7237851662
343   1               2.1363636364
346   2.5459486798    475.8333333333
347   2.5459486798    193.8
351   2.5459486798    308.25
352   2.5459486798    4.1554878049
356   1.3307705999    2.0086145011
357   1               0.8376764387
359   3.9752636794    127.8064516129
360   1               1.0087005333
368   3.9752636794    197.25
369   3.9752636794    4.3307651435
372   1               0.9477739726
373   3.9752636794    511.8333333333
375   27.0006128143   318.4757281553
376   1               1
377   1               1.0061983471
378   59.6581765216   1328.55
380   1               0.9757281553
382   1               0.9850620068
383   1               1.1428571429
384   2.5459486798    4.4425427873
385   3.9752636794    4.1699446835
389   1               2.4545977957
390   3.9752636794    8.3601108033
391   1               0.9733333333
392   2.5459486798    326
3962.545948679814.3106060606
39810.8888888889
4012.545948679817.7222222222
40311.0117263844
4133.97526367943.9655471917
41610.9599640126
4192.5459486798334
42410.9574940101
42510.9377834831
4283.97526367944.2245540399
4303.97526367944.2768605537
43210.9970457903
43611.471641791
4403.97526367948.5574229692
4411.60846858418.72
4443.97526367946.906749556
4483.97526367946.1631372549
4513.97526367949.3445825933
4522.54594867983.4083175803
45311
4573.97526367946.6123276561
46011.0082304527
4643.975263679413.8646934461
4663.975263679478.82
4672.54594867983.4180138568
4692.54594867982.4001347255
4732.54594867983.1468451243
4792.54594867982.7633812576
4822.545948679820.2587412587
4853.975263679442.1071428571
4913.97526367944.1751932884
49411.0070323488
49511.0141277641
49610.978978979
5042.54594867983.6256188119
5062.54594867982.6316730875
5083.975263679429.5176470588
5113.97526367947.0146425495
5162.54594867989.0918918919
5186.28207425587.2014437951
525128.97296761215913.3529411765
5263.97526367948.9132569558
5292.5459486798106.9090909091
5332.54594867983.103909465
5343.975263679434.1962616822
54611.0487804878
5482.68374324962.8647907648
5552.545948679813.6793248945
55710.7091633466
5592.54594867983.4762996942
5643.975263679422.4622222222
56610.9723352319
56810.8822975518
56915.79472743746.8597203967
57111.0689035351
57210.9578029468
57410.9558635394
5803.97526367945.620617841
58111.5641025641
5923.97526367944.044406602
5953.975263679464.6785714286
60110.9499259259
60310.9615384615
6052.54594867982.5885947047
60911
6122.54594867983.7836946277
6133.9752636794319.3333333333
6171.3307705999139
62010.9863013699
62411.573943662
62511.03187251
62810.9943181818
62910.9433447099
63310.995785777
6367.64418642749.3672748676
63811
64311.0512820513
64510.9564285714
6462.54594867983.962562396
64711.0193933366
65110.9931972789
6523.975263679411.0615384615
65310.4707409727195.0943396226
6583.97526367941
6593.975263679422.5193370166
66010.8270588235
66111.0312184166
6623.97526367944.3134965831
66610.9969879518
67310.8617363344
6782.54594867982.6099056604
6863.97526367945.4659977703
68810.9797153025
69410.9987029831
69510.993485342
6963.97526367941
6983.975263679452.6617647059
70111.3931623932
70310.9517241379
7113.97526367945.0060331825
7123.97526367943.8338019831
7133.97526367947.4198113208
7143.97526367946.815133276
7153.97526367944.1115498519
7163.97526367945.2576520728
7173.97526367943.8699800311
72310.934084272
7243.97526367943.8204029147
7253.97526367943.7965006129
72610.9369791667
7273.97526367943.7200407955
72915.79472743745.6096666667
7312.4182254242.3600580446
7422.54594867984.0945359931
7432.54594867982.8012557353
76210.9813829787
76310.9804131054
76410.9614243323
76510.9474893918
7712.54594867983.0473758865
7723.975263679410.3683168317
77310.9933184855
77410.9568755085
77510.9270199826
77710.9815634218
79011.3012048193
79111.04
79210.9647058824
79311.0181268882
7993.9752636794112.7692307692
80011
80110.9829545455
80211.0781758958
80310.988700565
80410.988700565
80720.409001670329.1877619447
81810.9875827815
81910.9587878788
82011.1745200698
82110.9504132231
8223.97526367945.7973086627
8372.5459486798110.32
84210.9041666667
84711.2144469526
84810.8893528184
84910.7953431373