Crowd-Augmented Social Aware Search Soudip Roy Chowdhury & Bogdan Cautis


Post on 27-Dec-2015


Page 1

Crowd-Augmented Social Aware Search

Soudip Roy Chowdhury & Bogdan Cautis

Pages 2-5

What are we talking about?

• Social Aware Search
  – Finding results relevant for the query and for the user (seeker)
  – Web search (tf-idf) + social search (social connections, e.g., follower-following links)
• However,
  – The required number of results (K items) is not always found
  – The algorithm does not ensure the quality of the retrieved results
• Our aim is to use [figure not recovered] for datasourcing, to address these problems efficiently

Page 6

Let's see an example!

Query: get top 4 tweets for the query terms “#jesuischarlie #jesuisahmed”

Page 7

Hashtag Term      Tweet ID   Frequency
#jesuischarlie    D1         1
                  D2         1
                  D3         0
                  D4         2
                  D5         1
                  D6         0

Hashtag Term      Tweet ID   Frequency
#jesuisahmed      D1         0
                  D2         1
                  D3         1
                  D4         1
                  D5         1
                  D6         0

By aggregating term frequencies we get the final result.

Page 8

Hashtag Term                   Tweet ID   Frequency
#jesuischarlie #jesuisahmed    D1         1
                               D2         2
                               D3         1
                               D4         3
                               D5         2
                               D6         0

and the top-4 items are D4, D2, and D5, with D1 and D3 tied at frequency 1 for the fourth place.
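The aggregation step in this example can be sketched in a few lines of Python. The frequencies are copied from the tables; the function names and the tie-breaking rule (by tweet ID) are assumptions made for illustration, since the slides do not say how ties are resolved.

```python
# Per-term frequencies copied from the example tables (tweet ID -> count).
term_freq = {
    "#jesuischarlie": {"D1": 1, "D2": 1, "D3": 0, "D4": 2, "D5": 1, "D6": 0},
    "#jesuisahmed": {"D1": 0, "D2": 1, "D3": 1, "D4": 1, "D5": 1, "D6": 0},
}


def aggregate(query_terms):
    """Sum the per-term frequencies of each tweet over all query terms."""
    total = {}
    for term in query_terms:
        for tweet, count in term_freq[term].items():
            total[tweet] = total.get(tweet, 0) + count
    return total


def top_k(scores, k):
    """Return the k best tweet IDs; ties are broken by tweet ID (an assumption)."""
    return sorted(scores, key=lambda tweet: (-scores[tweet], tweet))[:k]


totals = aggregate(["#jesuischarlie", "#jesuisahmed"])
print(totals)
print(top_k(totals, 4))
```

Because D1 and D3 both have an aggregated frequency of 1, which of them takes the fourth place depends entirely on the tie-breaking rule.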

Page 9

Similarly, the social scores are calculated:

Tweet ID   Hashtag Term      Author   Social Score
D1         #jesuischarlie    Elham    0.9 × 0.9 × 0.5
D2         #jesuischarlie    Elham    0.9 × 0.9 × 0.5
           #jesuischarlie    Das      0.9 × 0.9
D3         #jesuisahmed      Bob      0.9
D4         #jesuischarlie    Elham    0.9 × 0.9 × 0.5
           #jesuischarlie    Das      0.9 × 0.9
           #jesuisahmed      Das      0.9 × 0.9
D5         #jesuischarlie    Chang    0.6
           #jesuisahmed      Chang    0.6
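As a rough sketch of the arithmetic behind this table: each author's contribution is the product of the listed factors, and the contributions for the same tweet and term are summed. Reading the factors as link weights on the path from the seeker to the author is an assumption; the slides only show the products. The function names are hypothetical.

```python
from collections import defaultdict
from math import prod

# (tweet ID, hashtag term, author, factors) copied from the table above.
rows = [
    ("D1", "#jesuischarlie", "Elham", (0.9, 0.9, 0.5)),
    ("D2", "#jesuischarlie", "Elham", (0.9, 0.9, 0.5)),
    ("D2", "#jesuischarlie", "Das", (0.9, 0.9)),
    ("D3", "#jesuisahmed", "Bob", (0.9,)),
    ("D4", "#jesuischarlie", "Elham", (0.9, 0.9, 0.5)),
    ("D4", "#jesuischarlie", "Das", (0.9, 0.9)),
    ("D4", "#jesuisahmed", "Das", (0.9, 0.9)),
    ("D5", "#jesuischarlie", "Chang", (0.6,)),
    ("D5", "#jesuisahmed", "Chang", (0.6,)),
]


def social_scores(rows):
    """Sum, per (tweet, term), the product of factors of every author."""
    scores = defaultdict(float)
    for tweet, term, _author, factors in rows:
        scores[(tweet, term)] += prod(factors)
    return dict(scores)


def per_tweet(scores):
    """Add up the per-term social scores of each tweet."""
    totals = defaultdict(float)
    for (tweet, _term), value in scores.items():
        totals[tweet] += value
    return dict(totals)


scores = social_scores(rows)
totals = per_tweet(scores)
print(totals)
```

The unrounded per-tweet totals are 0.405 for D1, 1.215 for D2, and 2.025 for D4; the slides report these as 0.4, 1.21, and 2.02.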

Page 10

Hashtag Term      Tweet ID   Social score
#jesuischarlie    D1         0.4
                  D2         1.21
                  D3         0
                  D4         1.21
                  D5         0.6

Hashtag Term      Tweet ID   Social score
#jesuisahmed      D1         0
                  D2         0
                  D3         0.9
                  D4         0.81
                  D5         0.6

Hashtag Term                   Tweet ID   Social score
#jesuischarlie #jesuisahmed    D1         0.4
                               D2         1.21
                               D3         0.9
                               D4         2.02
                               D5         1.2

and the top-4 items are D4, D2, D5, and D3.

Top-k results with social score!

Page 11

Social-aware search

• Final results are calculated based on the score model
  – score(item | seeker, tag) = α × tf-idf(tag, item) + (1 − α) × sc(item | seeker, tag)
• Following this model, the top-4 results for our example scenario are
  – D4, D2, D5, and D3
• Let us now consider some additional constraints to make sure the results are good in quality
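The score model can be tried on the example data with a short sketch. Two assumptions are made here and are not taken from the slides: the aggregated term frequencies stand in for the tf-idf component, and α is set to 0.5. D6 has no social score in the example, so 0 is assumed for it.

```python
# Aggregated term frequencies, used as a stand-in for tf-idf (assumption).
tf = {"D1": 1, "D2": 2, "D3": 1, "D4": 3, "D5": 2, "D6": 0}
# Aggregated social scores from the example; 0.0 for D6 is an assumption.
sc = {"D1": 0.4, "D2": 1.21, "D3": 0.9, "D4": 2.02, "D5": 1.2, "D6": 0.0}


def blended_scores(tf, sc, alpha):
    """score = alpha * textual component + (1 - alpha) * social component."""
    return {item: alpha * tf[item] + (1 - alpha) * sc[item] for item in tf}


def top_k(scores, k):
    """Return the k best items, breaking ties by item ID."""
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]


ranking = top_k(blended_scores(tf, sc, alpha=0.5), k=4)
print(ranking)
```

With these inputs the ranking is D4, D2, D5, D3, matching the result stated on the slide. For this data the same order holds for any α below 1, because D3 beats D1 and D2 beats D5 on the social component while each pair ties on frequency.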

Page 12

Example scenario with quality constraints

Constraint: each result item must be tagged at least twice

Hashtag Term                   Tweet ID   Frequency
#jesuischarlie #jesuisahmed    D1         1
                               D2         2
                               D3         1
                               D4         3
                               D5         2
                               D6         0

Hence the top-4 items are D4, D2, and D5 only; D1, D3, and D6 do not meet the constraint.

Page 13

Example scenario with quality constraints

Constraint: the social score of an item must be > 1 for the item to be in the final result list

Hashtag Term                   Tweet ID   Social score
#jesuischarlie #jesuisahmed    D1         0.4
                               D2         1.21
                               D3         0.9
                               D4         2.02
                               D5         1.2

Hence the top-4 items are D4, D2, and D5 only; D1 and D3 do not meet the constraint.
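The effect of the two example constraints can be checked with a small sketch. The thresholds come from the two constraint slides; the variable and function names are hypothetical, and treating the missing social score of D6 as 0 is an assumption.

```python
K = 4                   # requested number of results
MIN_TAG_COUNT = 2       # each item must be tagged at least twice
MIN_SOCIAL_SCORE = 1.0  # social score must be strictly greater than 1

frequency = {"D1": 1, "D2": 2, "D3": 1, "D4": 3, "D5": 2, "D6": 0}
social_score = {"D1": 0.4, "D2": 1.21, "D3": 0.9, "D4": 2.02, "D5": 1.2}


def qualifies(item):
    """True if the item meets both example constraints."""
    return (frequency[item] >= MIN_TAG_COUNT
            and social_score.get(item, 0.0) > MIN_SOCIAL_SCORE)


# Qualifying items, best social score first.
qualified = sorted((item for item in frequency if qualifies(item)),
                   key=lambda item: -social_score[item])
shortfall = K - len(qualified)
print(qualified, shortfall)
```

Only three items pass, so one slot of the requested top-4 list stays empty.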

Page 14

List of quality constraints

1. Min # of posts for item-tag pair
2. Min # of distinct tags per item
3. Min # of tag occurrences per item
4. Threshold for social score
5. Threshold stability measures for tags
   – Based on moving average of relative tag frequency distribution [1]

To be in the top-k result list, an item must satisfy these constraints in addition to the social-aware search threshold.

Page 15

Friendsourcing

• Items that do not meet the constraints are friendsourced
• Friendsourcing tasks are designed to improve the quality of the top-k result
• Friendsourcing task = ⟨I, T, U⟩, where items I are friendsourced to friends U, and U provide tags T for the items
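One possible way to represent such a task triple in code is sketched below. The class and field names are hypothetical, and the example values are only illustrative (the item, tag, and user names are borrowed from the running example, not from an actual task in the talk).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FriendsourcingTask:
    """A task triple <I, T, U>: items, tags, and the friends asked to tag."""
    items: frozenset
    tags: frozenset
    users: frozenset


def make_task(items, tags, users):
    """Build a task from any iterables of item IDs, tags, and user names."""
    return FriendsourcingTask(frozenset(items), frozenset(tags), frozenset(users))


# Illustrative only: ask two friends whether D3 deserves the tag #jesuisahmed.
task = make_task(["D3"], ["#jesuisahmed"], ["Bob", "Das"])
print(task)
```

Frozen sets are used so that a task is hashable and can be stored in a set or used as a dictionary key when tasks are deduplicated.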

Page 16

Human Tasks

• T1: Minimum number of posts for an item-tag pair: ⟨I1, t1, {u1, u2, ..., uk}⟩
• T2: Minimum number of distinct tags: ⟨I1, {t1, t2, ..., tn}, {u1, u2, ..., uk}⟩
• T3: Minimum number of tag occurrences: ⟨{{{I1, I2, ..., In}, t1, {u1, u2, ..., uk}}, {{I1, I2, ..., In}, t2, {u1, u2, ..., uk}}, ..., {{I1, I2, ..., In}, tn, {u1, u2, ..., uk}}}⟩

Page 17

Human Tasks

• T4: Minimum number of taggers: ⟨I1, {t1, t2, ..., tn}, {u1, u2, ..., uk}⟩
• T5: Minimum network-aware score: ⟨I1, {t1, t2, ..., tn}, {u1, u2, ..., uk}⟩
• T6: Stability-based tag quality: ⟨I1, t1, {u1, u2, ..., uk}⟩

Page 18

Human Task Optimization

• Problem 1:
  – Given a set of items {I1, I2, …, In} in a result list that do not satisfy constraints (C1, C2, …, C6)
  – Choose an item or set of items that can complete the top-k result list with the minimum number of tasks
  – Inter-item gain

Page 19

Human Task Optimization contd.

• Problem 2:
  – Given a chosen item I ∈ {I1, …, In} that does not satisfy constraints (C1, C2, …, C6)
  – Choose a task Ti ∈ {T1, …, T6} that satisfies the maximum number of constraints, minimizing the total number of tasks per item
  – Intra-item gain

Page 20

Human Task Optimization contd.

• Problem 3:
  – Given a set of items {I1, I2, …, In} in a result list that do not satisfy constraints (C1, C2, …, C6)
  – Choose a set of tasks Tij, where i ∈ {1, …, n} and j ∈ {1, …, 6}, such that the intra- and inter-item gain is maximized

Page 21

Solution hints

• Problem 1.
  – For the n items in the result list, create for each constraint a partially ordered list of the items with respect to the constraint threshold
  – Items that appear at the highest (or higher) positions in most of the lists are chosen first to be friendsourced
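A minimal sketch of this hint follows, under assumptions that the slides do not spell out: items are ordered per constraint by how far they fall short of the threshold (a smaller gap means a higher position), items with equal gaps share a position, and the per-constraint positions are combined by summing them, which is a simple Borda-style aggregation swapped in for "appears highest in most of the lists". The constraint names and the example values for the unqualified items D1, D3, and D6 are illustrative.

```python
def dense_ranks(gaps):
    """Map item -> position (0 = closest to threshold); equal gaps share a position."""
    distinct = sorted(set(gaps.values()))
    position = {gap: index for index, gap in enumerate(distinct)}
    return {item: position[gap] for item, gap in gaps.items()}


def friendsourcing_order(values, thresholds):
    """Order items by the sum of their per-constraint positions (lowest first)."""
    total = {}
    for constraint, threshold in thresholds.items():
        gaps = {item: max(0.0, threshold - v[constraint])
                for item, v in values.items()}
        for item, pos in dense_ranks(gaps).items():
            total[item] = total.get(item, 0) + pos
    return sorted(total, key=lambda item: (total[item], item))


# Hypothetical constraint names; values taken from the running example.
thresholds = {"tag_count": 2, "social_score": 1.0}
values = {
    "D1": {"tag_count": 1, "social_score": 0.4},
    "D3": {"tag_count": 1, "social_score": 0.9},
    "D6": {"tag_count": 0, "social_score": 0.0},
}
order = friendsourcing_order(values, thresholds)
print(order)
```

Under these assumptions D3 is friendsourced first: it ties with D1 on tag count and is closest to the social score threshold.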

Page 22

Solution hints

• Problem 2.
  – There exist conditional probabilistic dependencies among constraints
  – E.g., [formula not recovered], where P(T2) = 1
  – We aim to find i such that the conditional probability [formula not recovered] is maximized

Pages 23-24

Expert Selection Criteria

• Users are selected based on
  – User expertise score
  – Communication cost
• User expertise score (Exp_ui)
  – User profile/activity attributes
  – Question-specific user expertise
  – Algorithm-specific user attributes
• Communication cost (Cost_ui)
  – Social score
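The slides name the two selection criteria but not how they are combined. The sketch below assumes one simple option, a linear trade-off in which a user's utility is expertise minus a weighted cost, and picks the m users with the highest utility. The function name, the `trade_off` parameter, and all numbers are hypothetical.

```python
def select_experts(candidates, m, trade_off=1.0):
    """Pick the m users with the highest (expertise - trade_off * cost).

    candidates maps a user name to an (expertise, cost) pair.
    The linear combination is an assumption, not the talk's method.
    """
    def utility(name):
        expertise, cost = candidates[name]
        return expertise - trade_off * cost

    return sorted(candidates, key=lambda name: (-utility(name), name))[:m]


# Hypothetical expertise scores and communication costs.
candidates = {
    "Elham": (0.8, 0.6),
    "Das": (0.7, 0.2),
    "Bob": (0.5, 0.1),
    "Chang": (0.4, 0.4),
}
print(select_experts(candidates, 2))
print(select_experts(candidates, 2, trade_off=0.0))
```

With `trade_off=1.0` the cheaper users Das and Bob are chosen; with `trade_off=0.0` cost is ignored and the two most expert users, Elham and Das, are chosen instead.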

Pages 25-31

System Architecture

[Figure: system architecture diagram, built up in seven numbered steps (1-7); not recovered]

Page 32

Screenshots of CANTO (Search Interfaces)

Page 33

Screenshots of CANTO (Search Interfaces)

Page 34

Screenshots of CANTO (Friendsourcing Seeker perspective)

Page 35

Screenshots of CANTO (Friendsourcing provider perspective)

Page 36

Summary

• We are working on
  – Augmenting social-aware search with the crowd, aka friends
  – An algorithm for generating both ad hoc and best-effort social-aware search results
  – An advanced expert selection algorithm
    • Considering both budget and time constraints
  – Planning to explore
    • MAB for expert selection
• So far we have used tweets and hashtags for experimentation; we plan to experiment with the Vodkaster dataset (user network, films, comments, micro-critique data).

Page 37

Thank you for your attention!

Page 38

References

1. William Webber, Alistair Moffat, and Justin Zobel. 2010. A similarity measure for indefinite rankings. ACM Trans. Inf. Syst.

2. Xuan S. Yang, David W. Cheung, Luyi Mo, Reynold Cheng, and Ben Kao. 2013. On incentive-based tagging. In Proceedings of the 2013 IEEE International Conference on Data Engineering.