searching the social web
DESCRIPTION
A talk on social search as implemented in Delver, presented at the IBM-HRL IR technologies/social search seminar, 16/12/2008.
TRANSCRIPT
Searching the Social Web
The Challenges of Socially-Connected Search
IR Leadership Seminar 2008 / Ofer Egozi
The problem…
What to choose? Whom to trust?...
The solution?
Socially-connected search:
• Leveraging the Social Graph in web search
▫ Focused crawling
▫ Personalized ranking
• Delver is a first-mover
• Trusted results
▫ Friends qualify content/sources
▫ Potential contact in reach
▫ Spam is inherently low
• Reasoning over results
▫ Ranking is transparent
▫ Easier to assess relevance
• Network discovery
▫ Experts in my network
▫ Serendipity
Outline
• Approaches to Social Search
• The Social Graph
• Graph-Related Challenges
• Search-Related Challenges
Humans in the loop
•Search = crawl + index + rank + query
• Crawling (Dmoz, Mahalo)
• Indexing (del.icio.us, Flickr)
• Querying (ChaCha)
• Ranking – that’s what we’ll discuss…
A Taxonomy of Social Search
[Diagram labels: Aggregated, Personalized, Network-based, Behavior-based; two entries are marked "?"]
The Social Graph
• A directed, cyclic graph
▫ Nodes are people (identities)
▫ Edges are relations between them
• Large portion is public on social networks
• A lot isn’t – cellular, email, non-digital
• Emerging web standards
▫ OpenID/hCard – identifier/identity
▫ Contact APIs/PoCo/XFN – private/public contact lists
▫ FB Connect – a full proprietary framework
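The graph described on this slide can be sketched as a small adjacency structure. This is only an illustration, not Delver's storage model: the class name `SocialGraph`, its methods, and the sample people are hypothetical, and the relation labels are free-form strings.

```python
from collections import defaultdict


class SocialGraph:
    """Directed graph: nodes are identities, edges are typed relations."""

    def __init__(self):
        # source identity -> {target identity -> set of relation labels}
        self._out = defaultdict(lambda: defaultdict(set))

    def add_relation(self, source, target, relation):
        """Add one directed edge; the reverse edge is NOT implied."""
        self._out[source][target].add(relation)

    def neighbors(self, identity):
        """Identities that `identity` points to (outgoing edges only)."""
        return set(self._out.get(identity, {}))

    def relations(self, source, target):
        """Relation labels on the edge source -> target (empty if none)."""
        return set(self._out.get(source, {}).get(target, set()))


graph = SocialGraph()
graph.add_relation("alice", "bob", "friend-of")
graph.add_relation("bob", "alice", "friend-of")  # reciprocal edges form a cycle
graph.add_relation("alice", "carol", "follows")  # one-way: carol has no edge back
```

Keeping edges directed lets a one-way "follows" and a mutual "friend-of" live in the same structure, which matters later when relation strength has to be estimated.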
Social Graph in Research
• Extraction from interactions
▫ Email (Van Alstyne et al. 2003), Chat (Tuulos & Tirri 2004), IM (Lang 2004)
• Correlation with “physical”
▫ Bluetooth contact (Mtibaa et al. 2008)
• Security and Privacy
▫ Shared knowledge authentication (Toomim et al. 2008)
▫ Graph link privacy (Xu 2008), (Korolova et al. 2008)
• Enhance IR ranking
▫ Index friends browse history (Mislove et al. 2006)
▫ Rank by author centrality (Kirchhoff et al. 2008)
▫ Rerank by sampling network click-log (Das et al. 2008)
So first we need to draw the graph…
Social Graph - challenges
[Diagram: one person appears as "Joe", "JJ123" and "Joey" on different networks, connected by "friend-of" and "follows" edges]
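One way to picture the cross-network identity problem in the diagram is as a merging task: each piece of evidence that two profiles belong to the same person joins their groups. The sketch below uses a union-find structure for that. It is an assumption made for illustration, not a description of how Delver resolves identities; the class `IdentityResolver` and the network names are hypothetical, and what counts as evidence (for example a profile linking to the owner's other profile) is left to the caller.

```python
class IdentityResolver:
    """Union-find over (network, handle) pairs."""

    def __init__(self):
        self._parent = {}

    def _find(self, item):
        self._parent.setdefault(item, item)
        root = item
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited item straight at the root.
        while self._parent[item] != root:
            self._parent[item], item = root, self._parent[item]
        return root

    def link(self, a, b):
        """Record evidence that identities a and b belong to one person."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_person(self, a, b):
        return self._find(a) == self._find(b)


resolver = IdentityResolver()
joe = ("network_a", "Joe")
jj123 = ("network_b", "JJ123")
joey = ("network_c", "Joey")
other_joe = ("network_b", "Joe")  # same display name, no linking evidence

resolver.link(joe, jj123)
resolver.link(jj123, joey)
```

Here `resolver.same_person(joe, joey)` holds through the chain of links, while `other_joe` stays separate: a matching name alone is not treated as evidence, which is the conservative choice given the impersonation risk.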
• Social graph nodes
▫ Identities/relations across networks
▫ Identity impersonation
▫ Non-individual identities (groups, shared authorship…)
▫ Privacy is an issue, even with public data
• Social graph edges
▫ Relation “strength” not exposed
▫ Super nodes may dominate results
▫ “Politeness” relations are not filtered out
▫ Automatic generation – double-edged sword
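Since networks do not expose relation strength, it has to be estimated from the graph itself. The sketch below shows one possible heuristic, chosen purely for illustration: a one-way edge counts for less than a reciprocated one, and edges into heavily followed "super nodes" are damped by the logarithm of the target's in-degree. The function name `edge_weights`, the constants, and the sample names are all assumptions, not values from the talk.

```python
import math
from collections import defaultdict


def edge_weights(edges):
    """Estimate a weight for each directed (source, target) edge."""
    edge_set = set(edges)
    in_degree = defaultdict(int)
    for _, target in edge_set:
        in_degree[target] += 1

    weights = {}
    for source, target in edge_set:
        # Assumed heuristic: a reciprocated relation is stronger than a one-way one.
        reciprocity = 1.0 if (target, source) in edge_set else 0.5
        # Assumed heuristic: damp edges into nodes that many others point to.
        damping = 1.0 / math.log2(2 + in_degree[target])
        weights[(source, target)] = reciprocity * damping
    return weights


edges = [("ann", "ben"), ("ben", "ann"), ("ann", "star")]
edges += [(fan, "star") for fan in ("f1", "f2", "f3", "f4")]
weights = edge_weights(edges)
```

With this data the mutual ann/ben edge outweighs ann's edge to the widely followed "star" node, so a super node no longer dominates purely by popularity. "Politeness" relations are not addressed; those would need interaction data that the graph alone does not carry.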
So now we’ve mapped the social graph…
…and attached each node with its content…
…can we finally go fetch?
S-C Search - challenges
• Must build a search engine…
▫ Store graph, attach content to nodes
▫ Reranking will not do, this is the long tail
  Not in Google’s / Yahoo!’s top-1000!… (dominated by authorities)
▫ Scale well, including graph functions
• Personalized graph-based rank
▫ Integrate content-based with static ranking
▫ Use web graph structure, like PageRank etc.
▫ Network is egocentric, unlike PageRank
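The egocentric point can be illustrated with a toy ranker: the same result set is ordered differently depending on who is searching, because each author's score is scaled by their distance from the searcher. This is a minimal sketch under stated assumptions and not Delver's ranking function; the names `network_distances` and `rank`, the hop limit, the decay factor, and the sample data are hypothetical.

```python
from collections import deque


def network_distances(graph, ego, max_hops=3):
    """Breadth-first hop counts from `ego`; graph maps person -> contacts."""
    distances = {ego: 0}
    queue = deque([ego])
    while queue:
        person = queue.popleft()
        if distances[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for contact in graph.get(person, ()):
            if contact not in distances:
                distances[contact] = distances[person] + 1
                queue.append(contact)
    return distances


def rank(results, graph, ego, decay=0.5):
    """results: (doc_id, author, content_score) tuples. Returns doc_ids, best first."""
    distances = network_distances(graph, ego)
    scored = []
    for doc_id, author, content_score in results:
        if author not in distances:
            continue  # author is outside the searcher's network
        social_score = decay ** distances[author]
        scored.append((content_score * social_score, doc_id))
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored]


graph = {"me": ["ann"], "ann": ["ben"], "ben": ["ann"]}
results = [("d1", "ben", 0.9), ("d2", "ann", 0.6), ("d3", "zed", 1.0)]

ranking_for_me = rank(results, graph, "me")
ranking_for_ben = rank(results, graph, "ben")
```

For "me", ann's document comes first because she is one hop away; for "ben", his own document wins. The document by "zed" has the best content score but appears in neither list, since zed is in nobody's network here. A global score such as PageRank would give both searchers the same order.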
Socially-Connected Search
• What are the enablers?
▫ Social networks
▫ Users’ content boom
• What can be achieved?
▫ Search-based access to network content
▫ Trusted and transparent social ranking
• What are the challenges?
▫ Fragmented social graph
▫ Personal-network ranking
Thank you!
http://www.delver.com