knowledge base enabled information filtering on social web -- emc
![Page 1: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/1.jpg)
Knowledge-base Enabled Information Filtering on Social Web
Pavan Kapanipathi
Kno.e.sis Center, Wright State University
Advisor: Amit Sheth
1
![Page 2: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/2.jpg)
Kno.e.sis
2
![Page 3: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/3.jpg)
Social Web in 60 secs
3
![Page 4: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/4.jpg)
Social Web in 60 secs
500M users generate 500M tweets per day
4
![Page 5: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/5.jpg)
Disaster Management Organizations utilize Social Web
35% of the 20M tweets during Hurricane Sandy shared information and news about the disaster
5
![Page 6: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/6.jpg)
Healthcare Issues
6
![Page 7: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/7.jpg)
Healthcare Issues
7
![Page 8: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/8.jpg)
Personalized Filtering on Social Web
Following Dynamically Evolving Topics as interests
8
![Page 9: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/9.jpg)
Personalization on Social Web
• Following Dynamically Evolving Topics
  • Indian Elections
  • US Elections
  • Healthcare Debate
9
![Page 10: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/10.jpg)
Personalization on Social Web
• Following Dynamically Evolving Topics
  • Indian Elections
  • US Elections
  • Healthcare Debate
10
![Page 11: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/11.jpg)
Dynamic Topics
11
![Page 12: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/12.jpg)
Dynamic Topics
Continuously Evolving on Twitter
Entity – Event relevance changes
Many entities are involved
12
![Page 13: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/13.jpg)
Dynamic Topics
Manually crawl using keywords
“indianelection” “jan25” “sandy”
“swineflu” “ebola”
13
![Page 14: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/14.jpg)
Dynamic Topics
Manually updating keywords to get topic relevant tweets is not
feasible
“indianelection” “modi” “bjp”
“congress”
“jan25” “egypt” “tunisia”
“arabspring”
“sandy” “newyork” “redcross” “fema”
“swineflu” “ebola”
14
![Page 15: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/15.jpg)
Problem
How can we automatically update the filters to track a dynamically
evolving topic on Twitter
15
![Page 16: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/16.jpg)
Hashtags as Filters
• Identify a topic on Twitter
• Tweets with hashtags are more informative
• Users have a lot of freedom to create them
• Some get popular, most die
16
![Page 17: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/17.jpg)
Exploring Hashtags as Evolving Filters for Dynamic Topics
Colorado Shooting
17
![Page 18: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/18.jpg)
Exploring Hashtags as Evolving Filters for Dynamic Topics
Colorado Shooting
Occupy Wall Street
18
![Page 19: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/19.jpg)
Exploring Hashtags as Evolving Filters for Dynamic Topics
Colorado Shooting
Occupy Wall Street
| | CS | OWS |
|---|---|---|
| Tweets | 122,062 | 6,077,378 |
| Tags | 192,512 | 15,963,209 |
| Distinct | 12,350 | 191,602 |
| 100% Retrieval | 7,763 | 21,314 |
19
![Page 20: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/20.jpg)
Exploring Hashtags as Evolving Filters for Dynamic Topics
Colorado Shooting
Occupy Wall Street
| | CS | OWS |
|---|---|---|
| Tweets | 122,062 | 6,077,378 |
| Tags | 192,512 | 15,963,209 |
| Distinct | 12,350 | 191,602 |
| 100% Retrieval | 7,763 | 21,314 |

HASHTAG FILTERS
20
![Page 21: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/21.jpg)
Colorado Shooting Occupy Wall Street
Hashtag Filters Co-occurrence Graph
21
![Page 22: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/22.jpg)
Colorado Shooting Occupy Wall Street
Event Related Hashtags co-occur
with each other
Hashtag Filters Co-occurrence Graph
22
![Page 23: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/23.jpg)
Summarizing Hashtag Analysis
Starting with one of the event-relevant hashtags, we can reach other relevant hashtags by co-occurrence
23
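The observation on this slide can be sketched as a simple count of hashtags that share a tweet with a seed hashtag. This is only an illustration: the tweets are made up, and `cooccurring_hashtags` is a hypothetical helper name, not something from the talk.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")

def cooccurring_hashtags(tweets, seed, top_n=5):
    """Count hashtags that appear in the same tweet as `seed` (case-insensitive)."""
    seed = seed.lower().lstrip("#")
    counts = Counter()
    for text in tweets:
        tags = {t.lower() for t in HASHTAG.findall(text)}
        if seed in tags:
            counts.update(tags - {seed})
    return counts.most_common(top_n)

# Hypothetical tweets, invented for illustration only.
tweets = [
    "Relief centers open #sandy #redcross",
    "Power outage downtown #sandy #newyork",
    "Donate now #sandy #redcross #fema",
    "Great game tonight #nba",
]
top = cooccurring_hashtags(tweets, "#sandy")
print(top)
```

Starting from `#sandy`, the count surfaces `#redcross`, `#newyork`, and `#fema` while leaving out the unrelated `#nba` tweet.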
![Page 24: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/24.jpg)
Determining Relevancy of Co-occurring Hashtags
#indianelection2015
#modikisarkar
Too many co-occurring hashtags
24
![Page 25: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/25.jpg)
Hashtag Filters distributions
25
![Page 26: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/26.jpg)
Not surprising: it’s a power-law distribution
Hashtag distributions
26
![Page 27: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/27.jpg)
The top 1% of hashtags retrieves around 85% of the tweets
Hashtag distributions
27
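A rough way to check a claim like "the top 1% retrieves around 85% of the tweets" is to measure how many tweets contain at least one of the most frequent hashtags. The sketch below uses toy data and a hypothetical function name; it does not reproduce the numbers on the slide.

```python
import math
from collections import Counter

def coverage_of_top(tweet_tags, fraction):
    """Share of tweets containing at least one of the most frequent hashtags.

    tweet_tags: list of sets of hashtags, one set per tweet.
    fraction:   share of distinct hashtags to keep (0.01 means the top 1%).
    """
    freq = Counter(tag for tags in tweet_tags for tag in tags)
    keep = max(1, math.ceil(len(freq) * fraction))
    top = {tag for tag, _ in freq.most_common(keep)}
    hit = sum(1 for tags in tweet_tags if tags & top)
    return hit / len(tweet_tags)

# Toy data (hypothetical): one dominant hashtag and a long tail.
tweet_tags = [{"ows"}] * 6 + [{"ows", "nyc"}] * 2 + [{"a"}, {"b"}]
print(coverage_of_top(tweet_tags, 0.25))
```

With a power-law distribution, a small `fraction` already covers most tweets, which is why a handful of prominent hashtags can serve as filters.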
![Page 28: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/28.jpg)
Clustering Coefficient of Hashtag Co-occurrence Network (top 1%)
Clustering coefficient
The top hashtags co-occur with each other the most
28
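For reference, the local clustering coefficient of a node is the fraction of pairs of its neighbours that are themselves connected. A minimal sketch on a hypothetical co-occurrence graph (the hashtags and edges are invented):

```python
from itertools import combinations

def clustering_coefficient(graph, node):
    """Local clustering coefficient of `node` in an undirected graph.

    graph: dict mapping each node to the set of its neighbours.
    """
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2 * links / (k * (k - 1))

# Hypothetical undirected co-occurrence graph.
graph = {
    "ows": {"occupy", "nyc", "p2"},
    "occupy": {"ows", "nyc"},
    "nyc": {"ows", "occupy"},
    "p2": {"ows"},
}
average = sum(clustering_coefficient(graph, n) for n in graph) / len(graph)
print(average)
```

A high average over the top hashtags indicates that they tend to co-occur with one another, which is the point made on the slide.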
![Page 29: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/29.jpg)
Determining Relevancy of Co-occurring Hashtags
#indianelection2015
#modikisarkar
Co-occurring: Threshold δ
Preferably a prominent hashtag
29
![Page 30: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/30.jpg)
Does hashtag co-occurrence work?
o No. Co-occurrence alone does not work
o Many noisy or unrelated hashtags co-occur
o Determine the “dynamic” relevance of the top co-occurring hashtag with the dynamic topic
30
![Page 31: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/31.jpg)
Determining Relevancy of Co-occurring Hashtags (Vector Space Model)
#indianelection2015
#modikisarkar
Co-occurring: Threshold δ
Latest K (200,500)
Entity Extraction and Scoring (Normalized Frequency Scoring)
Narendra Modi: 0.9 BJP: 0.7 NDA: 0.6 India: 0.4 Elections: 0.2 Rahul Gandhi: 0.2 Congress: 0.2
31
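One plausible reading of "normalized frequency scoring" is the share of the latest K tweets that mention each entity. The slide does not give the exact normalization, so that choice is an assumption here, and entity extraction is taken as already done (the per-tweet entity lists are invented).

```python
from collections import Counter

def score_entities(tweet_entities):
    """Score each entity by the share of tweets that mention it (assumed normalization)."""
    if not tweet_entities:
        return {}
    counts = Counter(e for entities in tweet_entities for e in set(entities))
    return {entity: n / len(tweet_entities) for entity, n in counts.items()}

# Hypothetical entities spotted in the latest tweets of a co-occurring hashtag.
latest_k = [
    ["Narendra Modi", "BJP"],
    ["Narendra Modi", "India"],
    ["Narendra Modi", "BJP", "NDA"],
    ["Narendra Modi"],
    ["Elections"],
]
vector = score_entities(latest_k)
print(vector)
```

The resulting dictionary plays the role of the hashtag's entity vector in the vector space model.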
![Page 32: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/32.jpg)
Determining Relevancy of Co-occurring Hashtags (Vector Space Model)
#indianelection2015
#modikisarkar
Co-occurring: Threshold
Latest K (200,500)
Narendra Modi: 0.9 BJP: 0.7 NDA: 0.6 India: 0.4 Elections: 0.2 Rahul Gandhi: 0.2 Congress: 0.2
Entity Extraction and Scoring
Indian General Election,_2014
Dynamically Updated Background Knowledge
δ
32
![Page 33: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/33.jpg)
Event Relevant Background Knowledge
o Wikipedia Event Pages
33
![Page 34: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/34.jpg)
o Wikipedia Event Pages
Event Relevant Background Knowledge
34
![Page 35: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/35.jpg)
o Entities mentioned on the Event page of Wikipedia are relevant to the Event
Event Relevant Background Knowledge
35
![Page 36: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/36.jpg)
o Wikipedia’s Hyperlink structure is very rich
o Page-Page (Wikipedia) links
Indian General Election, 2014
Narendra Modi
Rahul Gandhi
NDA (India) UPA (India)
BJP
Indian National Congress
Event Relevant Background Knowledge – Graph Structure
36
![Page 37: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/37.jpg)
Determining Relevancy of Co-occurring Hashtags (Vector Space Model)
#indianelection2015
#modikisarkar
Co-occurring: Threshold
Latest K (200,500)
Narendra Modi: 0.9 BJP: 0.7 NDA: 0.6 India: 0.4 Elections: 0.2 Rahul Gandhi: 0.2 Congress: 0.2
Entity Extraction and Scoring
Indian General Election,_2014
Extract, Periodically Update Hyperlink structure
One hop from Event Page
δ
37
![Page 38: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/38.jpg)
o Hyperlink structure is dynamically updated
Indian General Election, 2014
Narendra Modi
Rahul Gandhi
NDA (India) UPA (India)
BJP
Indian National Congress
10 May 2010
Event Relevant Background Knowledge
38
![Page 39: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/39.jpg)
o Hyperlink structure is dynamically updated
Indian General Election, 2014
Narendra Modi
Rahul Gandhi
NDA (India) UPA (India)
BJP
Indian National Congress
10 May 2010
29 March 2013
29 March 2013 29 March 2013
29 March 2013
Event Relevant Background Knowledge
39
![Page 40: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/40.jpg)
o Hyperlink structure is dynamically updated
Indian General Election, 2014
Narendra Modi
Rahul Gandhi
NDA (India) UPA (India)
BJP
Indian National Congress
10 May 2010
29 March 2013
29 March 2013 29 March 2013
29 March 2013
20 May 2013
20 May 2013
Event Relevant Background Knowledge
40
![Page 41: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/41.jpg)
Determining Relevancy of Co-occurring Hashtags (Vector Space Model)
#indianelection2015
#modikisarkar
Co-occurring: Threshold
Latest K (200,500)
Narendra Modi: 0.9 BJP: 0.7 NDA: 0.6 India: 0.4 Elections: 0.2 Rahul Gandhi: 0.2 Congress: 0.2
Entity Extraction and Scoring
Indian General Election,_2014
Extract, Periodically Update Hyperlink structure
Entity scoring based on relevance to the Event
One hop from Event Page
δ
41
![Page 42: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/42.jpg)
o Edge Based Measure
o Link Overlap Measure: Jaccard similarity
o Out(c) are the links in Wikipedia page “c”
o Final Score: r(c,E) = ed(c,E) + oco(c,E)
Hyperlink Entity Scoring
India General Election, 2014
Narendra Modi
India General Election, 2014
India General Election, 2009
1
Mutually Important
ed (c,E) = 1
ed (c,E) = 2
42
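The scoring on this slide, r(c,E) = ed(c,E) + oco(c,E), can be sketched as follows. The link-overlap part is the Jaccard similarity of the outgoing link sets, as stated. The edge-based part is an assumption: the sketch gives 2 when the two pages link to each other and 1 when only one direction exists, which is one reading of the "Mutually Important" label. The link sets are invented.

```python
def link_overlap(out_c, out_e):
    """oco(c, E): Jaccard similarity between two sets of outgoing links."""
    union = out_c | out_e
    return len(out_c & out_e) / len(union) if union else 0.0

def edge_score(c, e, out):
    """ed(c, E), assumed: 2 for mutual links, 1 for one direction, 0 for none."""
    forward = c in out.get(e, set())
    backward = e in out.get(c, set())
    return int(forward) + int(backward)

def relevance(c, e, out):
    """r(c, E) = ed(c, E) + oco(c, E)."""
    return edge_score(c, e, out) + link_overlap(out.get(c, set()), out.get(e, set()))

# Hypothetical outgoing links, not taken from Wikipedia.
EVENT = "Indian General Election, 2014"
out = {
    EVENT: {"Narendra Modi", "Rahul Gandhi", "BJP", "Indian National Congress"},
    "Narendra Modi": {EVENT, "BJP", "Gujarat"},
    "Rahul Gandhi": {"Indian National Congress"},
}
print(relevance("Narendra Modi", EVENT, out))
print(relevance("Rahul Gandhi", EVENT, out))
```

In this toy graph the mutually linked page scores higher than the page that is only linked from the event page.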
![Page 43: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/43.jpg)
Determining Relevancy of Co-occurring Hashtags (Vector Space Model)
#indianelection2015
#modikisarkar
Co-occurring: Threshold
Latest K (200,500)
Narendra Modi: 0.9 BJP: 0.7 NDA: 0.6 India: 0.4 Elections: 0.2 Rahul Gandhi: 0.2 Congress: 0.2
Entity Extraction and Scoring
Indian General Election,_2014
Extract, Periodically Update Hyperlink structure
Entity scoring based on relevance to the Event
One hop from Event Page
Indian General Elec: 1.0 India: 0.9 Elections: 0.7 UPA: 0.6 BJP: 0.3 NDA: 0.3 Narendra Modi: 0.3
δ
43
![Page 44: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/44.jpg)
Determining Relevancy of Co-occurring Hashtags (Vector Space Model)
#indianelection2015
#modikisarkar
Co-occurring: Threshold
Latest K (200,500)
Narendra Modi: 0.9 BJP: 0.7 NDA: 0.6 India: 0.4 Elections: 0.2 Rahul Gandhi: 0.2 Congress: 0.2
Entity Extraction and Scoring
Indian General Election,_2014
Extract, Periodically Update Hyperlink structure
Entity scoring based on relevance to the Event
One hop from Event Page
Indian General Elec: 1.0 India: 0.9 Elections: 0.7 UPA: 0.6 BJP: 0.3 NDA: 0.3 Narendra Modi: 0.3
Similarity Check
Relevance Score: 0.6
δ
44
![Page 45: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/45.jpg)
o Set Based
  o Jaccard Similarity
  o Considers the entities without the scores
o Vector Based
  o Symmetric
    o Cosine Similarity
  o Asymmetric
    o Subsumption Similarity
Similarity Check
45
![Page 46: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/46.jpg)
India General Election 2014
Narendra
Modi
Intuition behind Asymmetric
India General Election 2014
Narendra
Modi
Penalized
Ignored
Similarity
Symmetric
Asymmetric
46
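The three kinds of similarity check can be compared on the two example vectors from the earlier slides. Jaccard and cosine are the standard definitions. The asymmetric function is only a stand-in: the slides do not give the subsumption formula, so this sketch assumes that event entities missing from the hashtag vector are ignored rather than penalized.

```python
import math

def jaccard(a, b):
    """Set-based: compares the entity sets and ignores the scores."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 0.0

def cosine(a, b):
    """Symmetric vector-based similarity."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0

def asymmetric(hashtag, event):
    """Assumed asymmetric variant: event entities absent from the hashtag are ignored."""
    restricted = {k: w for k, w in event.items() if k in hashtag}
    return cosine(hashtag, restricted)

hashtag = {"Narendra Modi": 0.9, "BJP": 0.7, "NDA": 0.6, "India": 0.4,
           "Elections": 0.2, "Rahul Gandhi": 0.2, "Congress": 0.2}
event = {"Indian General Election, 2014": 1.0, "India": 0.9, "Elections": 0.7,
         "UPA": 0.6, "BJP": 0.3, "NDA": 0.3, "Narendra Modi": 0.3}

print(jaccard(hashtag, event), cosine(hashtag, event), asymmetric(hashtag, event))
```

Under this assumption the asymmetric score is never lower than the cosine score, because dropping unmatched event entities can only shrink the denominator.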
![Page 47: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/47.jpg)
Determining Relevancy of Co-occurring Hashtags (Vector Space Model)
#indianelection2015
#modikisarkar
Co-occurring: Threshold
Latest K (200,500)
Narendra Modi: 0.9 BJP: 0.7 NDA: 0.6 India: 0.4 Elections: 0.2 Rahul Gandhi: 0.2 Congress: 0.2
Entity Extraction and Scoring
Indian General Election,_2014
Extract, Periodically Update Hyperlink structure
Entity scoring based on relevance to the Event
One hop from Event Page
Indian General Elec: 1.0 India: 0.9 Elections: 0.7 UPA: 0.6 BJP: 0.3 NDA: 0.3 Narendra Modi: 0.3
Similarity Check
Relevance Score: 0.6
δ
47
![Page 48: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/48.jpg)
o 2 events
  o US Presidential Elections (#election2012)
  o Hurricane Sandy (#sandy)
o Top 25 co-occurring hashtags
Evaluation – Dataset
48
![Page 49: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/49.jpg)
o Ranking Problem
  o Rank the Top 25 hashtags based on the relevancy of tweets to the event
o Experiment with all the similarity metrics
  o Manually annotated the tweets of these hashtags as relevant/irrelevant (Gold Standard)
o Ranking Evaluation Metrics
  o Mean Average Precision
  o NDCG
Evaluation – Strategy
49
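For readers unfamiliar with the two ranking metrics, here is a compact sketch of their textbook definitions. The ranked judgments are invented, and average precision is normalized by the relevant items present in the list, which assumes the list contains all relevant items.

```python
import math

def average_precision(relevant_flags):
    """Average precision of one ranked list of 0/1 relevance judgments."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevant_flags, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(rankings):
    """Mean of the average precision over several ranked lists (for example, one per event)."""
    return sum(average_precision(r) for r in rankings) / len(rankings)

def ndcg(gains, k):
    """NDCG@k with a log2 discount; `gains` are relevance values in ranked order."""
    def dcg(values):
        return sum(g / math.log2(i + 1) for i, g in enumerate(values[:k], start=1))
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0

# Hypothetical judgments for a ranked list of five hashtags.
ranking = [1, 0, 1, 1, 0]
print(average_precision(ranking), ndcg(ranking, 5))
```

Both metrics reward rankings that place relevant hashtags near the top, which matches the ranking formulation on the slide.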
![Page 50: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/50.jpg)
Evaluation
50
![Page 51: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/51.jpg)
Evaluation
Evaluated tweets containing the top relevant hashtags detected for dynamic topics
• NDCG - 92% at top-5 Mean Average Precision
51
![Page 52: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/52.jpg)
A little pause for Questions?
52
![Page 53: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/53.jpg)
Personalized Filtering
53
User Interest Identification/User
Modeling
Filtering Module
Twitter Streaming API
Tweets
Network
Filtered Tweets
![Page 54: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/54.jpg)
Personalized Filtering
54
User Interest Identification/User
Modeling
Filtering Module
Twitter Streaming API
Tweets
Network
Filtered Tweets
Dynamic Topics as Interests
Interest: Indian Elections
![Page 55: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/55.jpg)
Personalized Filtering
55
User Interest Identification/User
Modeling
Filtering Module
Twitter Streaming API
Tweets
Network
Filtered Tweets
A Significant Module
![Page 56: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/56.jpg)
o User Interest Identification on Twitter
  o Content-based (Only Tweets)
    o Term-based (semantic, web, #semanticweb)
    o Entity-based (semantic web <same as> #semanticweb)
    o Interest Graphs derived from knowledge-base (Hierarchical Interest Graphs)
  o Collaborative (Users’ Friends)
  o Hybrid
User Modeling
56
![Page 57: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/57.jpg)
A simple solution to most problems I am trying to solve
![Page 58: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/58.jpg)
Hierarchical Interest Graphs
58
![Page 59: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/59.jpg)
What is in your mind? (Next concept/term)
59
![Page 60: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/60.jpg)
What is in your mind? (Next concept/term)
Fruit
60
![Page 61: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/61.jpg)
What is in your mind? (Next concept/term)
Fruit
Other Fruit Names
61
![Page 62: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/62.jpg)
Cognitive Science
o Human memory has been argued to be structured as a hierarchy of concepts (Semantic Network)
o Spreading activation theory has been utilized to simulate search on a semantic network
o This theory has not been well explored for user interest modeling
62
![Page 63: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/63.jpg)
Hierarchical Interest Graphs
o Extending user profiles from Twitter to comprise a hierarchy of concepts
o The hierarchy of concepts is derived from the Wikipedia Category Structure
o Each concept in the hierarchy is scored based on the user's extent of interest
63
![Page 64: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/64.jpg)
64
![Page 65: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/65.jpg)
Semantic Search
Linked Data Metadata
0.8 0.2 0.6 Scores for Interests
65
User Interests
![Page 66: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/66.jpg)
Internet
Semantic Search
Linked Data Metadata
Technology
World Wide Web
Semantic Web
Structured Information
0.8 0.2 0.6 Scores for Interests
66
User Interests
![Page 67: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/67.jpg)
Internet
Semantic Search
Linked Data Metadata
Technology
World Wide Web
Semantic Web
Structured Information
0.8 0.2 0.6 Scores for Interests
67
User Interests
0.7
0.5
0.4
0.3
![Page 68: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/68.jpg)
68
Tweets
Approach
![Page 69: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/69.jpg)
69
Tweets
Approach
![Page 70: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/70.jpg)
70
Wikipedia Category Graph
Contains Cycles
More abstract: World Wide Web or
Semantic Web?
![Page 71: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/71.jpg)
71
Wikipedia Hierarchy
Hierarchical Levels (1 to 6)
No Cycles
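One way to turn a category graph with cycles into a levelled hierarchy is to assign each category its breadth-first distance from a root and keep only the links that go to a strictly deeper level. The slides do not spell out the procedure, so this is an assumed approach, and the graph fragment is invented.

```python
from collections import deque

def to_hierarchy(subcats, root):
    """Assign levels by breadth-first distance from `root`, then keep only
    edges that point from a shallower level to a strictly deeper one."""
    level = {root: 1}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in subcats.get(node, ()):
            if child not in level:
                level[child] = level[node] + 1
                queue.append(child)
    edges = {
        node: [c for c in children if level[c] > level[node]]
        for node, children in subcats.items() if node in level
    }
    return level, edges

# Hypothetical category fragment containing a cycle.
subcats = {
    "Technology": ["Internet"],
    "Internet": ["World Wide Web"],
    "World Wide Web": ["Semantic Web"],
    "Semantic Web": ["World Wide Web", "Linked Data"],
}
level, edges = to_hierarchy(subcats, "Technology")
print(level)
print(edges)
```

Because every kept edge increases the level, the result cannot contain a cycle, and the levels give a rough ordering from abstract to specific.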
![Page 72: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/72.jpg)
72
Tweets
Approach
![Page 73: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/73.jpg)
73
http://en.wikipedia.org/wiki/Semantic_search
http://en.wikipedia.org/wiki/Ontology
o Extracting Wikipedia entities
o Interest Scoring o Frequency based
User Profile Generation
![Page 74: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/74.jpg)
Internet
Semantic Search
Linked Data Metadata
Technology
World Wide Web
Semantic Web
User Interests
Structured Information
0.8 0.2 0.6 Scores for Interests
74
![Page 75: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/75.jpg)
75
Tweets
Approach
![Page 76: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/76.jpg)
76
Cricket
M S Dhoni Virat Kohli Sachin
Tendulkar
Sports
Indian Cricket
Indian Cricketers
0.8 0.2 0.6
0.5
0.4
0.25
0.1
Activation Function Determines the extent of spreading
Example
![Page 77: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/77.jpg)
o Simple Activation Function
$A_j = \sum_{i=0}^{n} A_i \times W_{ij} \times D$
$i$ is an activated child or subcategory of $j$.
$j$ is the category to be activated.
$W_{ij}$ is the edge weight between $j$ and $i$.
$D$ is the decay factor.
77
Activation Function
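The simple activation function can be sketched by pushing each entity's score up the hierarchy, multiplying by the edge weight and the decay at every step. The parent links below follow the cricket example only loosely, a single weight is used for all edges, and the decay value is arbitrary; all three are assumptions, so the numbers will not match the slide.

```python
from collections import defaultdict

def spread_activation(seeds, parents, decay=0.5, weight=1.0):
    """Push each seed's score up the hierarchy: A_j = sum_i A_i * W_ij * D.

    seeds:   {entity: interest score}
    parents: {node: [parent categories]}, assumed to be acyclic
    A single edge weight is used for every edge to keep the sketch short.
    """
    activation = defaultdict(float)

    def push(node, value):
        for parent in parents.get(node, ()):
            passed = value * weight * decay
            activation[parent] += passed
            push(parent, passed)

    for entity, score in seeds.items():
        push(entity, score)
    return dict(activation)

# Hypothetical hierarchy and interest scores.
parents = {
    "M S Dhoni": ["Indian Cricketers"],
    "Virat Kohli": ["Indian Cricketers"],
    "Sachin Tendulkar": ["Indian Cricketers"],
    "Indian Cricketers": ["Indian Cricket"],
    "Indian Cricket": ["Cricket"],
    "Cricket": ["Sports"],
}
seeds = {"M S Dhoni": 0.8, "Virat Kohli": 0.2, "Sachin Tendulkar": 0.6}
scores = spread_activation(seeds, parents)
print(scores)
```

Since the function is linear, pushing each seed separately gives the same totals as summing over the activated children of each category.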
![Page 78: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/78.jpg)
o Uneven distribution of nodes in the hierarchy
o Many-to-many category-subcategory relationships
78
Challenges – Wikipedia Category Graph
![Page 79: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/79.jpg)
o Uneven distribution of nodes in the hierarchy
o Many-to-many category-subcategory relationships
79
Challenges – Wikipedia Category Graph
![Page 80: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/80.jpg)
o Uneven distribution of nodes in the hierarchy
o Many-to-many category-subcategory relationships
80
Challenges – Wikipedia Category Graph
![Page 81: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/81.jpg)
81
Addressing Uneven Node Distribution
[Figure: number of nodes (axis from 0 to 300,000) at each hierarchical level (1 to 16)]
![Page 82: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/82.jpg)
o Uneven distribution of nodes in the hierarchy
o Many-to-many category-subcategory relationships
82
Challenges – Wikipedia Category Graph
![Page 83: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/83.jpg)
83 83
Preferential Path Constraint – Many to Many Links
![Page 84: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/84.jpg)
84 84
Preferential Path Constraint – Many to Many Links
![Page 85: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/85.jpg)
85
1 2 3 4
85
Preferential Path Constraint – Many to Many Links
![Page 86: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/86.jpg)
Boosting Common Ancestors
o Nodes that intersect domains/subcategories activated by diverse entities
86 86
![Page 87: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/87.jpg)
Boosting Common Ancestors
87
Cricket
M S Dhoni Virat Kohli Sachin
Tendulkar
Sports
Indian Cricket
Indian Cricketers 3
3
5
5
Michael Clarke
Shane Watson
Australian Cricket
Australian Cricketers
2
2
87
![Page 88: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/88.jpg)
88 88
Boosting Common Ancestors
![Page 89: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/89.jpg)
o Bell
$A_j = \sum_{i=0}^{n} A_i \times F_j$
o Bell Log
$A_j = \sum_{i=0}^{n} A_i \times FL_j$
o Priority Intersect
$A_j = \sum_{i=0}^{n} A_i \times FL_j \times P_{ji} \times B_j$
89
Activation Functions
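The slide does not define the factors, so the sketch below rests on an assumption: F_j is taken to be the inverse of the number of nodes at the hierarchical level of j, and FL_j the inverse of its base-10 logarithm, as a way to compensate for the uneven node distribution. The priority factor P and the boost B of the Priority Intersect function are left out.

```python
import math

def bell(child_activations, nodes_at_level):
    """A_j = sum_i A_i * F_j, with F_j assumed to be 1 / nodes_at_level."""
    return sum(child_activations) / nodes_at_level

def bell_log(child_activations, nodes_at_level):
    """A_j = sum_i A_i * FL_j, with FL_j assumed to be 1 / log10(nodes_at_level).

    Requires nodes_at_level > 1 so that the logarithm is positive.
    """
    return sum(child_activations) / math.log10(nodes_at_level)

# Hypothetical inputs: three activated children, 1,000 nodes at the level of j.
children = [0.8, 0.2, 0.6]
print(bell(children, 1000), bell_log(children, 1000))
```

Under this reading, the logarithm damps the penalty for crowded levels, so categories at levels with many nodes are not suppressed as heavily as with the plain inverse.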
![Page 90: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/90.jpg)
Evaluation
User Study
• 37 Users
• 30K Tweets
Evaluated the top-10 categories of interest derived from the hierarchy
• 76% Mean Average Precision
• 98% Mean Reciprocal Rank
• 70% of the categories are not mentioned in tweets
90
![Page 91: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/91.jpg)
o Working on a Tweet recommendation system that utilizes the Hierarchical Interest Graph
o Preliminary results are “interesting”
91
Tweet Recommendation using Hierarchical Interest Graph
![Page 92: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/92.jpg)
Conclusion
o Focus on “Information” overload instead of “Data” overload.
  o Personalized Information Filtering
o Knowledge-base enabled solutions for challenges in Tweets filtering
  o Wikipedia hyperlink structure and category graph leveraged for Twitter data filtering
o More Research on User Specific Attribute Extraction (Personalization) from Twitter Data
  o Activity Estimation
  o Location Prediction
![Page 93: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/93.jpg)
93
More at Kno.e.sis
![Page 94: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/94.jpg)
kHealth Knowledge-enabled Healthcare
Applied to ADHF, Asthma, GI, and Dementia
94
![Page 95: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/95.jpg)
Through physical monitoring and analysis, our cellphones could act as an early warning system to detect serious health conditions, and provide actionable information
canary in a coal mine
Empowering Individuals (who are not Larry Smarr!) for their own health
kHealth: knowledge-enabled healthcare
95
![Page 96: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/96.jpg)
Social Health Signals
96
![Page 97: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/97.jpg)
Motivational Scenario
A diabetic patient is interested in keeping himself up to date with new information about diabetes
Manually going through news articles, diabetes forums, blogs, etc.
- Time consuming
- Relevant? Interesting? Informative? Useful?
How about all the relevant and important health information aggregated at one platform?
97
![Page 98: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/98.jpg)
98
Search and Explore
X Controls Cancer
X = diet, treatment, exercise
(Pattern-based Approach leveraging domain semantics)
Top Health News
Informative news about selected disease
Faceted search (by health topics)
Learn about disease
Source: Wikipedia
Navigation: Home, Search & Explore, Top Health News, Tweet Traffic, Learn about Disease
![Page 99: Knowledge base enabled Information Filtering on Social Web -- EMC](https://reader030.vdocuments.net/reader030/viewer/2022032501/55b6e2d6bb61eb75268b480b/html5/thumbnails/99.jpg)
Thanks
Contact: [email protected]
Twitter: @pavankaps
Webpage: http://knoesis.org/researchers/pavan
99