ANU @ MediaEval 2011 Social Event Detection
Post on 10-Jul-2015
Social Event Detection with Clustering and Filtering
Yanxiang Wang Australian National University
Lexing Xie Australian National University
Hari Sundaram Arizona State University
Background
SED with Clustering and Filtering 2
Introduction
• Previous approaches
– Supervised (Firan, CIKM'10 [2])
– Unsupervised (Becker, WSDM'10 [1]; Papadopoulos [3])
• Queries are only partially specified, which motivates a clustering and filtering approach
[1] Becker: Learning Similarity Metrics for Event Identification in Social Media
[2] Firan: Bring Order to Your Photos: Event-Driven Classification of Flickr Images Based on Social Knowledge
[3] Papadopoulos: Cluster-Based Landmark and Event Detection for Tagged Photo Collection
Similarity Metric
• Time: time difference in minutes, $1 - \frac{|t_1 - t_2|}{t_w}$
• Location: great-circle distance, $1 - \frac{gcd}{50}$
• Tag: Jaccard index, $\frac{|t_a \cap t_b|}{|t_a \cup t_b|}$
• Text: cosine similarity, $\frac{A \cdot B}{\|A\| \, \|B\|}$
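The four similarity measures listed above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the `window_min` parameter (standing in for t_w), the clamping of negative values to 0, and the reading of the constant 50 as kilometres are all assumptions, since the slides do not state them.

```python
import math

def time_similarity(t1_min, t2_min, window_min):
    """1 - |t1 - t2| / t_w, with times in minutes (clamping to 0 is an assumption)."""
    return max(0.0, 1.0 - abs(t1_min - t2_min) / window_min)

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance via the haversine formula, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def location_similarity(p, q, scale=50.0):
    """1 - gcd / 50 for two (lat, lon) pairs; the unit of 50 is assumed to be km."""
    return max(0.0, 1.0 - great_circle_km(*p, *q) / scale)

def jaccard(tags_a, tags_b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two tag collections."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cosine(vec_a, vec_b):
    """Cosine similarity of two sparse term-weight dictionaries."""
    dot = sum(w * vec_b.get(term, 0.0) for term, w in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

For instance, `jaccard({"soccer", "rome"}, {"soccer", "barcelona"})` shares one of three distinct tags, so it evaluates to one third.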
Overview
[Figure: pipeline overview. Clustering: (1) Time, (2) Tag + Text + Location. Filtering: (1) Time + Location, (2) Tag + Text, (3) Visual.]
Clustering
• Incremental clustering (Becker [1])
1. Time clustering
2. Tag + Text + Location
– Weighted-sum combination: $w_t s_t + w_x s_x + w_l s_l$
– Weights correspond to training performance
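A single-pass incremental clustering loop with a weighted-sum similarity might look like the sketch below. It is an illustration under stated assumptions rather than the method used in the talk: the `threshold` parameter, the representation of a cluster as a plain list, and the example similarity (mean tag overlap with cluster members, tag feature only) are hypothetical choices made for brevity.

```python
def weighted_sum(scores, weights):
    """Combine per-feature similarities, e.g. w_t*s_t + w_x*s_x + w_l*s_l."""
    return sum(weights[name] * scores[name] for name in weights)

def incremental_cluster(items, similarity, threshold):
    """Assign each item to its most similar cluster, or open a new cluster."""
    clusters = []
    for item in items:
        best_score, best_cluster = 0.0, None
        for cluster in clusters:
            score = similarity(item, cluster)
            if score > best_score:
                best_score, best_cluster = score, cluster
        if best_cluster is not None and best_score >= threshold:
            best_cluster.append(item)
        else:
            clusters.append([item])
    return clusters

def tag_jaccard(a, b):
    """Jaccard index of two tag sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def example_similarity(item, cluster):
    """Hypothetical similarity: mean tag overlap with the cluster's members."""
    s_t = sum(tag_jaccard(item, member) for member in cluster) / len(cluster)
    return weighted_sum({"tag": s_t}, {"tag": 1.0})

# Three photos described only by their tag sets (made-up data).
photos = [{"soccer", "barcelona"}, {"soccer", "barcelona", "goal"}, {"beach"}]
clusters = incremental_cluster(photos, example_similarity, threshold=0.5)
```

With these made-up photos the first two share enough tags to land in one cluster, while the third opens a cluster of its own. In a fuller version the similarity would also include the text and location terms with weights chosen from training performance, as the slide describes.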
Filtering
1. Time + Location
– Time: outside the time frame
– Location: outside the radius of a central point
2. Tag + Text: query expansion
3. Visual: concept list
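The first filtering step can be illustrated as follows. This is a hedged sketch: the cluster fields (`time`, `lat`, `lon`), the idea of summarising a cluster by a single timestamp and centroid, and the example coordinates and radius are assumptions introduced here, not details given in the slides.

```python
import math
from datetime import datetime

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def passes_time_location(cluster, start, end, center, radius_km):
    """Keep a cluster only if it lies inside the time frame and the radius."""
    in_time = start <= cluster["time"] <= end
    distance = haversine_km(cluster["lat"], cluster["lon"], center[0], center[1])
    return in_time and distance <= radius_km

# Illustrative values only: a query centred on Barcelona for May 2009.
center = (41.3851, 2.1734)
start, end = datetime(2009, 5, 1), datetime(2009, 5, 31, 23, 59)
near = {"time": datetime(2009, 5, 10), "lat": 41.38, "lon": 2.12}
far = {"time": datetime(2009, 5, 10), "lat": 41.9028, "lon": 12.4964}
late = {"time": datetime(2009, 7, 1), "lat": 41.38, "lon": 2.12}
kept = [c for c in (near, far, late) if passes_time_location(c, start, end, center, 50.0)]
```

Only the first cluster survives here: the second is centred on Rome, far outside the radius, and the third falls outside the time frame.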
Tag + Text Filtering
• Use the Flickr API to construct the query
– Tag: flickr.tags.getClusters
– Text: flickr.photos.search
• Use the online event directory last.fm to retrieve tag and text information
• Filter the clusters with the same similarity metric: $w_t s_t + w_x s_x$
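Scoring a cluster against an expanded query with the tag and text terms could be sketched as below. The field names, the equal default weights, and the bag-of-words text representation are assumptions. The slides also do not say how the resulting score is compared with the threshold µ, so the sketch stops at the score and leaves that decision to the caller.

```python
import math
from collections import Counter

def jaccard(a, b):
    """Jaccard index of two tag sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cosine_text(text_a, text_b):
    """Cosine similarity of two strings as lower-cased bag-of-words vectors."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(count * vb[word] for word, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def query_score(cluster, query, w_tag=0.5, w_text=0.5):
    """w_t * s_t + w_x * s_x between a cluster and the expanded query."""
    s_t = jaccard(cluster["tags"], query["tags"])
    s_x = cosine_text(cluster["text"], query["text"])
    return w_tag * s_t + w_text * s_x

# Made-up query and clusters for illustration.
query = {"tags": {"soccer", "barcelona", "football"}, "text": "barcelona soccer match"}
match = {"tags": {"soccer", "barcelona"}, "text": "fc barcelona match"}
other = {"tags": {"beach", "sunset"}, "text": "holiday photos"}
scores = [query_score(c, query) for c in (match, other)]
```

The matching cluster scores well above the unrelated one, which shares no tags or words with the query and therefore scores zero.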
Example Query
Visual Filtering
• Filter clusters with invalid concepts
• E.g. the list for a soccer event:

Concept        Threshold
Beach          0.3
Flower Scene   0.4
Infant         0.3
…
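One way to read the concept list is as a set of concepts that should not appear in a valid cluster for the query. The sketch below follows that reading and is only an assumption: it treats a cluster as invalid when any listed concept scores above its threshold, and it presumes a per-cluster concept score (for example an average of detector outputs) is already available. Only the three concepts visible on the slide are included.

```python
# Concepts considered invalid for a soccer event, with their thresholds
# (the three entries shown on the slide; the rest of the list is not given).
SOCCER_INVALID_CONCEPTS = {"Beach": 0.3, "Flower Scene": 0.4, "Infant": 0.3}

def has_invalid_concept(concept_scores, invalid_thresholds):
    """True if any invalid concept scores above its threshold (assumed rule)."""
    return any(
        concept_scores.get(concept, 0.0) > threshold
        for concept, threshold in invalid_thresholds.items()
    )

def visual_filter(clusters, invalid_thresholds):
    """Drop clusters that show an invalid concept."""
    return [
        cluster for cluster in clusters
        if not has_invalid_concept(cluster["concepts"], invalid_thresholds)
    ]

# Hypothetical per-cluster concept scores.
clusters = [
    {"id": "stadium", "concepts": {"Beach": 0.05, "Infant": 0.1}},
    {"id": "seaside", "concepts": {"Beach": 0.7}},
]
kept_ids = [c["id"] for c in visual_filter(clusters, SOCCER_INVALID_CONCEPTS)]
```

Here the "seaside" cluster is removed because its Beach score exceeds 0.3, while the "stadium" cluster is kept.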
Training
• Setup
– No training set from the organizers
– Compiled from a subset of the Upcoming dataset
– Additional random photos from Flickr
• Result
– 80% on F1 evaluation after clustering
– 40% on F1 evaluation after filtering
Result
• Query expansion
– Challenge 1: Barcelona, Rome, soccer
– Challenge 2: Paradiso, Parc del Forum
• Runs
– Different thresholds µ for the tag + text filtering
Performance

Challenge 1
Metric      µ:0.2     µ:0.1     µ:0.05
Precision   12.53%    62.88%    84.86%
Recall      58.79%    52.93%    52.54%
F1          20.65%    57.48%    64.9%
NMI         0.1166    0.2207    0.2367

Challenge 2
Metric      µ:0.2     µ:0.1     µ:0.05    µ:0.1 last.fm
Precision   38.5%     59.26%    66.89%    56.16%
Recall      66.34%    43.9%     6.04%     18.9%
F1          48.72%    50.44%    11.07%    28.28%
NMI         0.2941    0.448     0.2705    0.4491
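The F1 rows can be cross-checked against the precision and recall rows, since F1 is their harmonic mean. The short sketch below assumes the standard definition F1 = 2PR / (P + R); the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in the same unit, e.g. percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision %, recall %, reported F1 %) taken from the two tables above.
reported = [
    (12.53, 58.79, 20.65), (62.88, 52.93, 57.48), (84.86, 52.54, 64.9),
    (38.5, 66.34, 48.72), (59.26, 43.9, 50.44), (66.89, 6.04, 11.07),
    (56.16, 18.9, 28.28),
]
differences = [abs(f1_score(p, r) - f1) for p, r, f1 in reported]
```

Every recomputed value agrees with the reported F1 to within rounding of the published precision and recall figures.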
Summary
• Simple clustering and filtering algorithm
[Figure legend: correct result, incorrect result, not found]
Future work
• Thorough result analysis on available ground-truth
• Refine the filtering process
• Incorporate methods to merge and rank clusters
Thoughts for SED 2012 (and beyond?)
• Provide a common training set?
– E.g. 2009 photos for training, 2010 for evaluation
• TREC-style ranked-list evaluation
– e.g. AP, F1 vs depth, so as to easily see how an algorithm (could) easily achieve
• Accommodate other event definitions?
– Multi-city long-lasting events, e.g. Olympic torch relay
http://www.flickr.com/search/?q=olympic+torch+relay+2010&s=rec
– Recurring events, e.g. French Open Tennis
– Recurring events, e.g. French Open Tennis
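As an illustration of the ranked-list evaluation suggested above, average precision (AP) over a ranked list of results can be computed as follows. This is a generic sketch of the usual AP definition, not an evaluation script from the benchmark; the function name and the binary relevance encoding are assumptions.

```python
def average_precision(ranked_relevance, total_relevant):
    """Mean of precision@k over the ranks k at which a relevant item appears.

    ranked_relevance: list of 0/1 flags, best-ranked result first.
    total_relevant: number of relevant items in the ground truth.
    """
    if total_relevant == 0:
        return 0.0
    hits, precision_sum = 0, 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_relevant

# Relevant results at ranks 1 and 3, with two relevant items overall.
ap = average_precision([1, 0, 1, 0], total_relevant=2)
```

In this example the precision is 1/1 at the first hit and 2/3 at the second, so AP is their mean, about 0.83.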
The end