Thesis slides - Definition and evaluation of collaborative information retrieval models based on...

Definition and evaluation of collaborative information retrieval models based on users' domain expertise and roles. Laure Soulier. Thesis supervisor: Lynda Tamine-Lechani. Co-supervisor: Wahiba Bahsoun.

Upload: upmc-sorbonne-universities

Post on 15-Apr-2017


Category:

Technology



TRANSCRIPT

Page 1: Thesis slides - Definition and evaluation of collaborative information retrieval models based on users' domain expertise and roles

Definition and evaluation of collaborative information retrieval models based on users' domain expertise and roles

Original French title: Définition et évaluation de modèles de recherche d'information collaborative basés sur l'expertise de domaine et les rôles des utilisateurs

Laure Soulier

Thesis supervisor: Lynda Tamine-Lechani
Co-supervisor: Wahiba Bahsoun

Page 2

Defense overview

1. Information retrieval and collaboration
2. Expertise-based collaborative information retrieval models
3. User-driven system-mediated collaborative information retrieval
4. Conclusion and perspectives

Contributions: parts 2 and 3

Page 3

Part 1

Information retrieval and collaboration

Page 4

Information retrieval and collaboration
From individual to collaborative information retrieval

Ad-hoc information retrieval: an information need is addressed to an information retrieval system.

Collaborative information retrieval: a shared information need is addressed to a collaborative information retrieval system, for complex or exploratory tasks [Denning and Yaholkovsky, CACM 2008; Twidale et al., IPM 1997].

Page 5

Information retrieval and collaboration
The 5Ws of collaboration [Morris and Teevan, 2009]

What? Complex tasks: exploratory or fact-finding. Bibliographic, medical, e-Discovery, academic search…
Why? Requirement or setup need, shared interests, insufficient knowledge, division of labor
Who? Group
When? Synchronous vs. asynchronous
Where? Colocated vs. remote
How? Explicit intent, user mediation, system mediation

Page 6

Information retrieval and collaboration
Collaboration paradigms

Division of labor [Kelly and Payne, CSCW 2013]
- Role-based division of labor
- Document-based division of labor

Sharing of knowledge [Foley and Smeaton, ECIR 2009]
- Communication and shared workspace
- Ranking based on relevance judgements

Awareness [Dourish and Bellotti, CSCW 1992]
- Collaborators' actions
- Collaboration context

[Figure: collaborators exchanging feedback on a document, for example "Doc2: Yes, good"]

Page 7

Information retrieval and collaboration
Challenges of collaborative information retrieval

How to collaborate? How to collaboratively rank documents?

Information retrieval system (Query1 and Query2 processed independently)
- No consideration of the shared information need
- No collaboration optimisation
- Well-known ranking models

Collaborative information retrieval system
- Individuals as a whole
- Collaborators' coordination
- Collaboration paradigms
- Difficult to evaluate

Synergic effect: 1+1 > 2 [Shah and Gonzalez-Ibanez, SIGIR 2011]

Page 8

Research contributions
Overview and comparison with previous work

Domain expertise-based CIR models
- Vertical distinction (expert and novice)
- Horizontal distinction (experts of sub-domain)

Column groups: Relevance (Collective, Individual); Evidence source (RF, Interest, Expertise); Paradigms (DoL, SoK).
RF: relevance feedback. DoL: division of labor. SoK: sharing of knowledge.

Work | Description | Collective | Individual | RF | Interest | Expertise | DoL | SoK
[Foley and Smeaton, ECIR 2009] | Relevance feedback process based on probabilistic weighting of terms w.r.t. the collective relevance | + | - | + | - | ~ | + | +
[Morris et al., CSCW 2008] | « Smart-splitting » | - | + | + | + | - | + | -
[Morris et al., CSCW 2008] | « Groupization » | + | - | + | + | - | - | -
Expertise-based CIR models | Personalization of collaborative rankings based on a vertical/horizontal distinction of domain expertise level | + | + | + | + | + | + | -

Page 9

Research contributions
Overview and comparison with previous work

User-driven system-mediated CIR models

Column groups: Roles (Known, Fixed); Evidence sources (RF, Behavior); Approach.

Work | Description | Known | Fixed | RF | Behavior | Approach
[Pickens et al., SIGIR 2008] | Role-based CIR model. Prospector (diversity): query reformulation. Miner (quality): document ranking function | + | + | + | - | Adapted to collaborators' role
[Shah et al., IPM 2010] | Role-based CIR model. Gatherer (quantity): quick scan of documents. Surveyor (diversity): browse a wider diversity | + | + | + | - | Adapted to collaborators' role
Role-based CIR models | Hybrid mediation based on roles (predefined role mining) | + | - | + | + | Adapted to collaborators' behaviors and strategies
Role-based CIR models | Hybrid mediation based on meta-roles (meta-role mining for document ranking) | - | - | + | + | Adapted to collaborators' behaviors and strategies

Page 10

Part II

Domain expertise-based CIR models

Page 11

Domain expertise-based CIR models
Context

« The expertise of a person on a given topic should be considered in context […] comprised of the larger landscape of knowledge areas within the domain. […] expertise is not static » [Rybak, SIGIR 2014]

« Tacit knowledge » [Patel and Arocha, 1999]

Vertical distinction: unidirectional collaboration
- Question answering
- Library

Horizontal distinction: bidirectional collaboration
- Medical domain
- E-Discovery
- Academic

Search behavior analysis [Allen, 1991; Hembrooke et al., JASIST 2005; White and Dumais, CIKM 2009]
- Information need perception
- Technicality of the vocabulary
- Search success

Page 12

Domain expertise-based CIR models
Contribution overview and research questions

- CIR model based on the roles of domain expert and novice [Soulier et al., IPM 2014]
- CIR model based on a group of sub-domain experts for a multi-faceted search [Soulier et al., AIRS 2013]

Research questions
- To what extent do collaborators' search actions make it possible to infer their expertise level?
- How does the expertise level of collaborators affect the relevance of documents and the retrieval effectiveness?

[Figure: collaborators and the documents each of them selected]

Page 13

The collaborative ranking model
On the roles of domain expert and novice in CIR

At each feedback iteration, triggered when a collaborator selects a document:
1. Score estimation according to roles
2. Document ranking based on roles

Page 14

The collaborative ranking model
On the roles of domain expert and novice in CIR

Step 1: Score estimation according to roles
Estimating document relevance for each user w.r.t. his role

$$P^k(d_i \mid u_j, q) \propto P^k(u_j \mid d_i) \cdot P^k(d_i \mid q)$$

Both factors are estimated with a language-based model.

$$P^k(\pi(u_j)^k \mid \theta_{d_i}) = \prod_{(t_v, w_{vj}^k) \in \pi(u_j)^k} \left[ \lambda_{ij}^k P(t_v \mid \theta_{d_i}) + (1 - \lambda_{ij}^k) P(t_v \mid \theta_C) \right]^{w_{vj}^k}$$

$$\lambda_{ij}^k = \frac{Nov(d_i, D(u_j)^k) \cdot Spec(d_i)^{\beta}}{\max_{d_{i'} \in D} Nov(d_{i'}, D(u_j)^k) \cdot Spec(d_{i'})^{\beta}}$$

λ | Novelty | Specificity
Expert | + | +
Novice | + | -
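The role-dependent smoothing on this slide can be illustrated with a minimal Python sketch. It assumes that novelty scores, specificity scores and language-model probabilities are already available as plain dictionaries; the function names, the toy inputs and the sign convention for the exponent `beta` are hypothetical and are not taken from the slides.

```python
import math


def smoothing_weights(novelty, specificity, beta):
    """Return lambda_ij per document: Nov * Spec**beta, divided by the maximum.

    novelty and specificity map a document id to a positive score.
    beta is the role-dependent exponent (its value per role is an assumption).
    """
    raw = {d: novelty[d] * specificity[d] ** beta for d in novelty}
    top = max(raw.values())
    return {d: value / top for d, value in raw.items()}


def profile_log_likelihood(profile, doc_lm, coll_lm, lam):
    """log P(pi(u_j) | theta_d): weighted sum of log smoothed term probabilities.

    profile maps a term t_v to its weight w_vj; doc_lm and coll_lm map a term
    to P(t_v | theta_d) and P(t_v | theta_C).
    """
    total = 0.0
    for term, weight in profile.items():
        p = lam * doc_lm.get(term, 0.0) + (1.0 - lam) * coll_lm.get(term, 0.0)
        if p <= 0.0:
            return float("-inf")  # term unseen in both models
        total += weight * math.log(p)
    return total
```

Working in log space avoids numerical underflow when the user profile contains many weighted terms.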

Page 15

The collaborative ranking model
On the roles of domain expert and novice in CIR

Step 2: Document ranking based on roles
Likelihood maximization of document scores w.r.t. collaborators

Classification based on the Expectation Maximization (EM) algorithm
- E-step: document probability of belonging to the collaborator's class

$$P(R_j = Rel \mid x_{ij}^k) = \frac{\alpha_j^k \phi_j^k(x_{ij}^k)}{\alpha_j^k \phi_j^k(x_{ij}^k) + (1 - \alpha_j^k) \psi_j^k(x_{ij}^k)}$$

- M-step: parameter updating and likelihood estimation

$$\ell(R_j = Rel \mid x_{ij}^k, \theta_j^k) = \sum_{h=1}^{n} \sum_{j=1}^{2} \log\left(P(x_{ij}^k, R_j = Rel \mid \theta_j^k)\right) P(R_j = Rel \mid x_{ij}^k)$$

Document allocation to collaborators by rank comparison
- Division of labor policy

[Figure: two ranked lists of four documents, in the orders 1, 2, 3, 4 and 1, 2, 4, 3, compared for allocation]
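The E-step posterior and the rank-comparison allocation on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the thesis implementation: the class densities `phi` and `psi` are passed in as arbitrary callables, and the tie-breaking rule in `allocate_by_rank` is a hypothetical choice.

```python
def relevance_posterior(x, alpha, phi, psi):
    """E-step quantity: P(R_j = Rel | x) for a document score x.

    phi and psi are the score densities of the relevant and non-relevant
    classes; alpha is the mixing weight of the relevant class.
    """
    relevant = alpha * phi(x)
    total = relevant + (1.0 - alpha) * psi(x)
    return relevant / total if total > 0.0 else 0.0


def allocate_by_rank(ranking_a, ranking_b):
    """Assign each document to the collaborator whose list ranks it higher.

    Ties go to the first collaborator (an arbitrary choice for this sketch).
    """
    pos_a = {doc: rank for rank, doc in enumerate(ranking_a)}
    pos_b = {doc: rank for rank, doc in enumerate(ranking_b)}
    to_a, to_b = [], []
    for doc in ranking_a:
        if pos_a[doc] <= pos_b.get(doc, len(ranking_b)):
            to_a.append(doc)
        else:
            to_b.append(doc)
    # Documents that only the second collaborator's list contains.
    to_b.extend(doc for doc in ranking_b if doc not in pos_a)
    return to_a, to_b
```

Because each document ends up in exactly one list, the two collaborators never receive the same document, which is one way to read the division of labor policy.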

Page 16

Experimental evaluation
On the roles of domain expert and novice in CIR

Collaboration simulation framework [Foley and Smeaton, ECIR 2009] (adapted to expertise) [Soulier et al., IPM 2014]

[Figure: two individual sessions of the TREC Interactive track, each a sequence of numbered relevance judgements on Financial Times documents (for example 89, FT944-15661), merged into one synchronized list of relevance judgements]

Page 17

Experimental evaluation
On the roles of domain expert and novice in CIR

TREC Interactive 6-7-8 dataset
- 20 topics
- 210K documents

Building collaborative groups w.r.t. expertise (experts and novices)
- Exhaustive method
- Selective method: 2-means classification

Expertise estimation

$$Expertise(u_j, T) = \frac{\sum_{d_i \in D^T(u_j)} Spec(d_i)}{|D^T(u_j)|}$$ [Kim, IPM 2006]

$$Expertise(u_j, T) = Authority(u_j, T)$$ [Foley and Smeaton, ECIR 2009]

Collaborative queries/sessions: 243 (exhaustive), 81-95 (selective)
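The specificity-based expertise score and the 2-means split of users into experts and novices can be sketched as follows. The sketch assumes that `Spec(d_i)` has already been computed for every document; the helper names and the one-dimensional 2-means initialisation (smallest and largest score) are hypothetical choices made for illustration.

```python
def expertise(selected_docs, spec):
    """Mean specificity of the documents a user selected for a topic."""
    if not selected_docs:
        return 0.0
    return sum(spec[d] for d in selected_docs) / len(selected_docs)


def two_means_1d(values, max_iter=100):
    """Split scalar scores into a low (0) and a high (1) group with 2-means."""
    low, high = min(values), max(values)  # initial centers (assumption)
    labels = None
    for _ in range(max_iter):
        new_labels = [1 if abs(v - high) < abs(v - low) else 0 for v in values]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        low_group = [v for v, lab in zip(values, labels) if lab == 0]
        high_group = [v for v, lab in zip(values, labels) if lab == 1]
        low = sum(low_group) / len(low_group)
        if high_group:
            high = sum(high_group) / len(high_group)
    return labels
```

Users labelled 1 would be treated as experts and users labelled 0 as novices before pairing them into collaborative groups.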

Page 18

Experimental evaluation
On the roles of domain expert and novice in CIR

Comparative effectiveness at the session level

Method | Scenario | P@30 | %Ch | C@30 | %Ch | PC@30 | %Ch
Exhaustive | W/oDoL | 0.275 | +4.09% | 0.362 | +31.73% *** | 0.080 | +26.63% ***
Exhaustive | W/oEM | 0.268 | +7.01% * | 0.335 | +42.46% *** | 0.072 | +43.99% ***
Exhaustive | W/oEMDoL | 0.303 | -5.26% | 0.258 | +84.73% *** | 0.050 | +105.88% ***
Exhaustive | FS | 0.208 | +32.21% *** | 0.429 | +10.95% *** | 0.075 | +37.99% ***
Exhaustive | ENColl | 0.287 | | 0.477 | | 0.103 |
Selective | W/oDoL | 0.251 | +0.86% | 0.400 | +36.44% *** | 0.081 | +35.52% ***
Selective | W/oEM | 0.239 | +5.87% | 0.362 | +50.11% *** | 0.070 | +56.17% ***
Selective | W/oEMDoL | 0.279 | -9.29% | 0.254 | +114.48% *** | 0.048 | +125.96% ***
Selective | FS | 0.166 | +51.20% *** | 0.429 | +26.71% * | 0.081 | +34.22% ***
Selective | ENColl | 0.253 | | 0.544 | | 0.110 |

✓ Collaborative ranking model (FS) estimating the collective relevance
✓ Document allocation and division of labor (W/oEM and W/oDoL)
✓ Synergic effect considering collaborative metrics (W/oEMDoL)
✕ More residual precision-oriented due to the integration of two division of labor principles (W/oEMDoL)

Page 19

Experimental evaluation
On the roles of domain expert and novice in CIR

Comparative effectiveness at the role level

Role | Scenario | Exhaustive P@30 | %Ch | Selective P@30 | %Ch
Expert | W/oDoL | 0.264 | +5.67% | 0.285 | +2.01% *
Expert | W/oEM | 0.259 | +7.70% * | 0.264 | +9.78%
Expert | W/oEMDoL | 0.285 | -2.30% | 0.315 | -7.87%
Expert | FS | 0.233 | +19.10% * | 0.234 | +24.08% *
Expert | ENColl | 0.279 | | 0.291 |
Novice | W/oDoL | 0.238 | +1.67% | 0.250 | +4.11%
Novice | W/oEM | 0.227 | +6.51% * | 0.238 | +8.97% ***
Novice | W/oEMDoL | 0.253 | -4.52% * | 0.262 | -1.05%
Novice | FS | 0.233 | +3.86% | 0.209 | +23.81%
Novice | ENColl | 0.241 | | 0.260 |

Expert vs. novice with ENColl
- Exhaustive method: 0.279 vs. 0.241 (+15.19%, p-value 0.16)
- Selective method: 0.291 vs. 0.260 (+11.91%, p-value 0.38)

Page 20

Part III

User-driven system-mediated CIR models

Page 21

Motivations
User-driven system-mediated models

Scenarios compared: Prospector-Miner, Gatherer-Surveyor, without role

Research hypotheses
- Collaborators behave differently
- Collaborators' behaviors vary throughout the search session

Conclusions
- Scenarios with roles converge with difficulty towards an optimal coordination of actions
- Role guidelines make it possible to structure the collaboration, but constrain collaborators' actions too much

Page 22

Contribution overview and research questions
User-driven system-mediated collaborative model

[Figure: timeline of a collaborative search session. Two collaborators submit queries (q1 to q7) and annotate or bookmark documents (user-driven mediation), while role mining is run repeatedly and labels the pair, for example as Reader/Querier and later as Expert/Novice (system-based mediation)]

Research questions
- How different are collaborators?
- How do we infer users' roles?
- How can these roles be used to improve collaborative information retrieval?

Page 23

Basic notions [Soulier et al., SIGIR 2014]
User-driven system-mediated collaborative model

Role pattern $P_{R_{1,2}}$
- Search feature kernel: $K_{R_{1,2}} = \{ f_k \in F \}$
- Search feature-based correlation matrix $F_{R_{1,2}}$, where

$$F_{R_{1,2}}(f_j, f_k) = \begin{cases} +1 & \text{for positive correlation} \\ 0 & \text{for independence} \\ -1 & \text{for negative correlation} \end{cases}$$

- Role attribution function: $Role(u_1, u_2, R_{R_{1,2}})$

Example (Reader / Querier): negative correlation between the number of visited documents and the number of submitted queries

Page 24

Context
User-driven system-mediated collaborative model

Collaborators' search behaviors

Search feature-based representation

$$S_{u_1}^{(t)} = (w_{u_1,f_1}^{(t)}, \dots, w_{u_1,f_n}^{(t)})$$
$$S_{u_2}^{(t)} = (w_{u_2,f_1}^{(t)}, \dots, w_{u_2,f_n}^{(t)})$$

Temporal-based representation
- Avoiding noisy search actions
- Behaviors change

$$S_{u_1}^{(t)} = \begin{pmatrix} w_{u_1,f_1}^{(1)} & \dots & w_{u_1,f_n}^{(1)} \\ \dots & \dots & \dots \\ w_{u_1,f_1}^{(t)} & \dots & w_{u_1,f_n}^{(t)} \end{pmatrix}$$
$$S_{u_2}^{(t)} = \begin{pmatrix} w_{u_2,f_1}^{(1)} & \dots & w_{u_2,f_n}^{(1)} \\ \dots & \dots & \dots \\ w_{u_2,f_1}^{(t)} & \dots & w_{u_2,f_n}^{(t)} \end{pmatrix}$$

These representations are the input of role mining.

[Figure: search session with queries q1 to q4 and annotated or bookmarked documents]

Page 25

Methodology
User-driven system-mediated collaborative model

Step 1: Identifying search behavior differences
- For each search feature (f1 to f4 in the example), the difference Δf between the two collaborators is tested for significance (Kolmogorov-Smirnov test)
- Only significant differences are kept (Δf1, Δf3 and Δf4 in the example)

Step 2: Characterizing users' roles
Correlations on search behavior differences for:
- Highlighting search skill oppositions
- Identifying the skills in which each collaborator is the most effective
- Avoiding prior assignments of any roles to users

Example correlation matrix, compared with the role patterns Reader/Querier, Expert/Novice and No role:

    | Δf3  | Δf1  | Δf4
Δf3 | 1    | 0.3  | -0.5
Δf1 | 0.3  | 1    | -0.8
Δf4 | -0.5 | -0.8 | 1
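Steps 1 and 2 can be sketched with the Python standard library only. The sketch assumes that each search feature is observed as a time series per collaborator; it computes the two-sample Kolmogorov-Smirnov statistic (turning it into a p-value is left out) and the Pearson correlation between two series of feature differences. All function names are hypothetical.

```python
from bisect import bisect_right
import math


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def feature_differences(series_u1, series_u2):
    """Per-time-step difference of one search feature between two collaborators."""
    return [a - b for a, b in zip(series_u1, series_u2)]


def pearson(xs, ys):
    """Pearson correlation of two equally long series (0.0 if one is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)
```

A strongly negative `pearson` value between two difference series would indicate opposed search skills, which is the kind of signal the role patterns encode with -1.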

Page 26

Methodology
User-driven system-mediated collaborative model

Step 3: Identifying users' roles

Inputs
- Pool of role patterns $P_{R_{1,2}}$, each described by a feature correlation matrix $F_{R_{1,2}}$
- Correlation matrix $C_{u_1,u_2}$ of search feature differences

$$\arg\min_{R_{1,2}} \| F_{R_{1,2}} \ominus C_{u_1,u_2} \|$$
$$\text{subject to } \forall (f_j, f_k) \in K_{R_{1,2}}: \; F_{R_{1,2}}(f_j, f_k) - C_{u_1,u_2}(f_j, f_k) > -1$$

Output: $Role(u_1, u_2, R_{R_{1,2}})$, used by the collaborative ranking model

Example: the correlation matrix below is compared with the patterns Reader/Querier, Expert/Novice and No role, and matched to Reader/Querier.

    | Δf3  | Δf1  | Δf4
Δf3 | 1    | 0.3  | -0.5
Δf1 | 0.3  | 1    | -0.8
Δf4 | -0.5 | -0.8 | 1
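The pattern selection of Step 3 can be sketched as a small search over the pool of role patterns. The sketch assumes that each pattern is stored as a dictionary over its kernel feature pairs and that the norm is Euclidean; both are illustrative assumptions, as are the pattern and feature names used in the test data.

```python
import math


def match_role_pattern(patterns, observed):
    """Return the name of the feasible pattern closest to the observed correlations.

    patterns maps a pattern name to {(f_j, f_k): expected value in {-1, 0, +1}},
    restricted to the pattern's kernel features.
    observed maps (f_j, f_k) to the correlation measured for the pair of users.
    Returns None when no pattern satisfies the constraint.
    """
    best_name, best_distance = None, math.inf
    for name, expected in patterns.items():
        gaps = [value - observed.get(pair, 0.0) for pair, value in expected.items()]
        if any(gap <= -1.0 for gap in gaps):
            continue  # violates the constraint F - C > -1
        distance = math.sqrt(sum(gap * gap for gap in gaps))
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name
```

Returning `None` when every pattern is infeasible corresponds to the "No role" outcome shown on the slide.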

Page 27

Experimental evaluation
User-driven system-mediated collaborative model

2 user-driven lab studies
- 60 vs. 10 paid participants
- Exploratory search task
- Between 25 vs. 30 minutes

Document dataset (74,844 docs)
- Visited web pages
- Top 100 results from submitted queries

Search features

Category | Description | Measurement
Query-based features | Number of queries | Number of submitted queries
Query-based features | Query length | Average number of tokens within queries
Query-based features | Query success | Average ratio of successful pages over queries
Query-based features | Query overlap | Average ratio of shared words among successive queries
Page-based features | Number of pages | Number of visited pages
Page-based features | Number of pages by query | Average number of visited pages by query
Page-based features | Page dwell time | Average time spent between two visited pages
Snippet-based features | Number of snippets | Number of snippets
Snippet-based features | Number of snippets by query | Average number of snippets by submitted query
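Three of the query-based features in the table can be computed from a plain list of query strings, as in the following sketch. The whitespace tokenisation and the use of the Jaccard ratio for query overlap are assumptions; the slides do not specify the denominator of the overlap ratio.

```python
def query_features(queries):
    """Compute three query-based features for one user's list of query strings."""
    tokens = [q.lower().split() for q in queries]
    count = len(tokens)
    mean_length = sum(len(t) for t in tokens) / count if count else 0.0
    overlaps = []
    for previous, current in zip(tokens, tokens[1:]):
        shared = set(previous) & set(current)
        union = set(previous) | set(current)
        # Jaccard ratio between successive queries (assumed definition).
        overlaps.append(len(shared) / len(union) if union else 0.0)
    mean_overlap = sum(overlaps) / len(overlaps) if overlaps else 0.0
    return {
        "number_of_queries": count,
        "query_length": mean_length,
        "query_overlap": mean_overlap,
    }
```

Calling the function once per collaborator and per time window would yield the feature values that fill one row of the behavior matrices.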

Page 28

Experimental evaluation
User-driven system-mediated collaborative model

Role patterns

Gatherer / Surveyor
- Features: query overlap vs. query success; query overlap vs. dwell time
- Gatherer: looks for highly relevant documents
- Surveyor: quickly scans results for diversity

Prospector / Miner
- Features: query overlap vs. number of submitted queries
- Prospector: formulates queries for diversity
- Miner: looks for relevant documents

Page 29

Experimental evaluation
User-driven system-mediated collaborative model

Impact of the time window on the retrieval effectiveness

[Figure: two plots, one per user study (US1 and US2), showing the F-measure (y-axis) against the time window, from 1 to 5 (x-axis)]

Page 30

Experimental evaluation
User-driven system-mediated collaborative model

Retrieval effectiveness at the session level

Study | Model | Prec@20 | %Chg (p) | Recall@20 | %Chg (p) | F@20 | %Chg (p)
US1 | BM25-CIR | 0.041 | +10.408 * | 0.010 | +4.636 * | 0.016 | +5.372
US1 | GS-CIR | 0.038 | +18.316 *** | 0.008 | +25.205 *** | 0.014 | +24.521 ***
US1 | PM-CIR | 0.05 | -9.482 | 0.012 | -13.991 | 0.019 | -13.397
US1 | Ra-CIR | 0.041 | +11.484 * | 0.009 | +12.895 * | 0.015 | +12.777 *
US1 | RB-CIR | 0.045 | - | 0.010 | - | 0.017 | -
US2 | BM25-CIR | 0.075 | +3.347 | 0.063 | +2.586 | 0.069 | +2.833
US2 | GS-CIR | 0.058 | +34.636 | 0.040 | +63.818 * | 0.046 | +52.786 *
US2 | PM-CIR | 0.092 | -16.051 | 0.078 | -16.493 | 0.084 | -16.317
US2 | Ra-CIR | 0.070 | +10.714 | 0.056 | +16.201 | 0.062 | +14.324
US2 | RB-CIR | 0.077 | - | 0.065 | - | 0.071 | -

✓ Individual scenarios (BM25-CIR)
✓ Collaborative setting with fixed predefined roles (GS-CIR)
✓ Collaborative setting with randomly assigned predefined roles (Ra-CIR)
✕ Collaborative settings relying on users' actions (PM-CIR)

Page 31

Extension to collaborators' meta-roles [Soulier et al., under review]
User-driven system-mediated collaborative model

Takes into account that:
- Collaborators are different
- Collaborators behave differently

Limitation: labelled roles may not match collaborators' skills
→ Leveraging collaborators' complementarity to mine latent roles

Search feature-based representation

$$S_{u_1}^{(t)} = \begin{pmatrix} w_{u_1,f_1}^{(1)} & \dots & w_{u_1,f_n}^{(1)} \\ \dots & \dots & \dots \\ w_{u_1,f_1}^{(t)} & \dots & w_{u_1,f_n}^{(t)} \end{pmatrix}$$
$$S_{u_2}^{(t)} = \begin{pmatrix} w_{u_2,f_1}^{(1)} & \dots & w_{u_2,f_n}^{(1)} \\ \dots & \dots & \dots \\ w_{u_2,f_1}^{(t)} & \dots & w_{u_2,f_n}^{(t)} \end{pmatrix}$$

Meta-role building
Feature selection maximising:
- The complementarity between collaborators
- The quality of the collaborative ranking

[Figure: timeline of a collaborative search session with queries q1 to q7 and annotated or bookmarked documents, in which role mining now yields meta-roles (MR1, MR2, MR6, MR7) instead of the labelled roles Reader/Querier and Expert/Novice]

Page 32

Extension to collaborators' meta-roles
User-driven system-mediated collaborative model

Building the meta-role

Step 1: Analyzing search skill complementarities
- For each search feature (f1 to f5 in the example), the difference Δf between the two collaborators is computed
- Pairs of feature differences are linked by their correlation: C(Δf1, Δf2), C(Δf2, Δf3), C(Δf3, Δf4), C(Δf4, Δf5), C(Δf1, Δf5), C(Δf2, Δf4), C(Δf2, Δf5), C(Δf1, Δf3)

Step 2: Feature selection for meta-role characterization (Coll-Clique algorithm)
Identify the features that maximise:
- The complementarity between collaborators
- The quality of the collaborative ranking

Collaboratively ranking based on the meta-role
Logistic regression classification based on the set of selected features
- Training step: learn the classification model using selected documents
- Testing step: classify not selected documents
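The training and testing steps of the logistic regression classification can be sketched without external libraries. The sketch assumes that each document is described by the selected feature values and that the class label says which of the two collaborators selected it (0 or 1); that labelling, the learning rate and the number of epochs are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, row):
    """Probability that a document belongs to class 1 (bias is weights[0])."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], row))
    return sigmoid(z)


def train_logistic(rows, labels, rate=0.5, epochs=2000):
    """Training step: fit the weights by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for row, label in zip(rows, labels):
            error = predict(weights, row) - label
            gradient[0] += error
            for k, value in enumerate(row):
                gradient[k + 1] += error * value
        weights = [w - rate * g / len(rows) for w, g in zip(weights, gradient)]
    return weights
```

In the testing step, `predict` would be applied to every document that has not been selected yet, and each document would be routed to the collaborator with the higher probability.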

Page 33

Extension to collaborators' meta-roles
User-driven system-mediated collaborative model

Effectiveness evaluation

Models | F@20 | %Ch
BM25-CIR | 0.0177 | +166.71% ***
Logit-CIR | 0.033 | +31.75% *
GS-CIR | 0.009 | +345.81% ***
PM-CIR | 0.008 | +450.00% ***
MineRank | 0.044 |

✓ Synergic effect w.r.t. the individual model (BM25-CIR)
✓ Meta-roles are more effective than behavior-based CIR models (Logit-CIR)
✓ Real-time meta-role mining is more effective than fixed role-based CIR models (GS-CIR and PM-CIR)

Behavior analysis

[Figure: meta-role overlap (y-axis, 0 to 1) against the iteration number (x-axis, 3 to 69)]
- Beginning of the search session: meta-roles vary
- Afterwards: meta-roles converge

Page 34

Part IV

Conclusion and perspectives

Page 35

Definition and evaluation of CIR models
Contribution

CIR models based on collaborators' domain expertise
- Vertical and horizontal distinction of expertise levels
- Collaborators' expertise-based profile based on relevance judgements
- Document allocation to the most likely suited collaborator
  - Impact of the division of labor [Foley and Smeaton, ECIR 2009]
  - Effectiveness of search result personalization w.r.t. expertise

CIR models based on user-driven system-oriented mediation
- Labelled roles vs. meta-roles
- Role mining based on collaborators' complementary skills
- Collaborative ranking w.r.t. collaborators' roles/meta-roles
  - Synergic effect of role mining
  - Effectiveness of a dynamic ranking adapted to roles/meta-roles

Page 36

Definition and evaluation of CIR models
Perspectives

Short term
- Enhancing collaborators' modeling
  - Short term vs. long term profile
  - Interests and preferences
- Estimating the collective relevance
  - Building a final list of documents

Long term
- Formalisation of a CIR evaluation framework
  - Collaborative tasks and topics
  - Logs of collaborative search sessions
  - Document collection
  - Relevance judgements
- Crowdsourcing collaborative search
  - Collaborative search task in web 2.0
  - Search for relevant collaborators

Page 37

Thank you for your attention

Allen, B. (1991). Topic knowledge and online catalog search formulation. In The Library Quarterly, pages 188–213.

Denning, P. J. and Yaholkovsky, P. (2008). Getting to "We". Communications of the ACM (CACM), 51(4):19–24.

Dourish, P. and Bellotti, V. (1992). Awareness and coordination in shared workspaces. In Proceedings of the Conference on Computer Supported Cooperative Work, CSCW ’92, pages 107–114. ACM.

Foley, C. and Smeaton, A. F. (2009). Synchronous Collaborative Information Retrieval: Techniques and Evaluation. In Proceedings of the European Conference on Advances in Information Retrieval, ECIR ’09, pages 42–53. Springer.

Golovchinsky, G., Diriye, A., and Pickens, J. (2011). Designing for Collaboration in Information Seeking. Proceedings of the ASIS&T Annual Meeting.

Hembrooke, H. A., Granka, L. A., Gay, G. K., and Liddy, E. D. (2005). The effects of expertise and feedback on search term selection and subsequent learning. Journal of the Association for Information Science and Technology (JASIST), 56(8):861–871.

Kelly, D., Dumais, S., and Pedersen, J. O. (2009). Evaluation Challenges and Directions for Information-Seeking Support Systems. IEEE Computer, 42(3):60–66.

Kim, G. (2006). Relationship Between Index Term Specificity and Relevance Judgment. Information Processing & Management (IP&M), 42(5):1218–1229.

Morris, M. R., Paepcke, A., and Winograd, T. (2006). TeamSearch: Comparing Techniques for Co-Present Collaborative Search of Digital Media. In Proceedings of the International Workshop on Horizontal Interactive Human-Computer Systems, Tabletop ’06, pages 97–104. IEEE Computer Society.

Morris, M. R. and Teevan, J. (2009). Collaborative Web Search: Who, What, Where, When, and Why. Synthesis Lectures on Information Concepts, Retrieval, and Services. Morgan & Claypool Publishers.

Morris, M. R., Teevan, J., and Bush, S. (2008). Collaborative Web Search with Personalization: Groupization, Smart Splitting, and Group Hit-highlighting. In Proceedings of the Conference on Computer Supported Cooperative Work, CSCW ’08, pages 481–484. ACM.

References

[  38  ]  

Patel, V. L. and Arocha J.F., K. D. R. (1999). Tacit Knowledge in Professional Practice, chapter Expertise, pages 75–99. Jossey-Bass Publishers.

Pickens, J., Golovchinsky, G., Shah, C., Qvarfordt, P., and Back, M. (2008). Algorithmic Mediation for Collaborative Exploratory Search. In Proceedings of the Annual International SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’08, pages 315–322. ACM.

Rodriguez Perez, J., Whiting, S., and Jose, J. M. (2011). CoFox: A visual collaborative browser. In Proceedings of the International Workshop on Collaborative Information Retrieval, CIKM ’11. ACM.

Shah, C. (2013). Collaborative information seeking (CIS): Challenges and opportunities. In Proceedings of the International Workshop on Collaborative Information Seeking, CSCW ’13. ACM.

Shah, C. and González-Ibáñez, R. (2011b). Evaluating the Synergic Effect of Collaboration in Information Seeking. In Proceedings of the Annual International SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’11, pages 913–922. ACM.

Shah, C., Pickens, J., and Golovchinsky, G. (2010). Role-based results redistribution for collaborative information retrieval. Information Processing & Management (IP&M), 46(6):773–781.

Soulier, L., Shah, C., and Tamine, L. (2014a). User-driven System-mediated Collaborative Information Retrieval. In Proceedings of the Annual International SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’14, pages 485–494. ACM.

Soulier, L., Tamine, L., and Bahsoun, W. (2013). A Collaborative Document Ranking Model for a Multi-faceted Search. In Proceedings of the Asia Information Retrieval Societies Conference, AIRS ’13, pages 109–120. Springer.

Soulier, L., Tamine, L., and Bahsoun, W. (2014b). On domain expertise-based roles in collaborative information retrieval. Information Processing & Management (IP&M), 50(5):752–774.

Twidale, M. B., Nichols, D. M., and Paice, C. D. (1997). Browsing is a Collaborative Process. Information Processing & Management (IP&M), 33(6):761–783.

White, R. W. and Dumais, S. T. (2009). Characterizing and Predicting Search Engine Switching Behavior. In Proceedings of the Conference on Information and Knowledge Management, CIKM ’09, pages 87–96. ACM.

References

[  39  ]