ANU @ MediaEval 2011 Social Event Detection

Post on 10-Jul-2015


Social Event Detection with Clustering and Filtering

Yanxiang Wang Australian National University

Lexing Xie Australian National University

Hari Sundaram Arizona State University

Background

SED with Clustering and Filtering 2

Introduction

•  Previous approaches
–  Supervised: Firan, CIKM ’10 [2]
–  Unsupervised: Becker, WSDM ’10 [1]; Papadopoulos [3]
•  Queries are only partially specified, which motivates a clustering-and-filtering approach

[1] Becker, Learning Similarity Metrics for Event Identification in Social Media (WSDM ’10)
[2] Firan, Bring Order to Your Photos: Event-Driven Classification of Flickr Images Based on Social Knowledge (CIKM ’10)
[3] Papadopoulos, Cluster-Based Landmark and Event Detection for Tagged Photo Collection

Similarity Metric

•  Time: time difference in minutes, $1 - \frac{|t_1 - t_2|}{t_w}$
•  Location: great-circle distance (gcd), $1 - \frac{\mathrm{gcd}}{50}$
•  Tag: Jaccard index, $\frac{|t_a \cap t_b|}{|t_a \cup t_b|}$
•  Text: cosine similarity, $\frac{A \cdot B}{\|A\| \, \|B\|}$
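The four measures on this slide can be sketched in Python as follows. This is a minimal illustration, not the authors' code. Several details are assumptions not stated on the slide: the distance scale 50 is taken to be in kilometres, similarities are clipped at 0, great-circle distance is computed with the haversine formula, and text is represented as a term-to-weight dictionary.

```python
import math


def time_similarity(t1_min, t2_min, window_min):
    """1 - |t1 - t2| / t_w, clipped at 0 (the clipping is an assumption)."""
    return max(0.0, 1.0 - abs(t1_min - t2_min) / window_min)


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def location_similarity(lat1, lon1, lat2, lon2, scale_km=50.0):
    """1 - gcd / 50, clipped at 0 (unit of 50 assumed to be km)."""
    return max(0.0, 1.0 - great_circle_km(lat1, lon1, lat2, lon2) / scale_km)


def jaccard(tags_a, tags_b):
    """Jaccard index of two tag collections; 0 when both are empty."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(vec_a, vec_b):
    """Cosine similarity of two sparse term -> weight dictionaries."""
    dot = sum(w * vec_b.get(term, 0.0) for term, w in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

For example, two photos taken 30 minutes apart with a hypothetical 60-minute window get `time_similarity(0, 30, 60) == 0.5`.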

Overview

[Figure: pipeline overview. Clustering: Time, then Tag + Text + Location. Filtering: Time + Location, then Tag + Text, then Visual.]

Clustering

•  Incremental clustering [1]
1.  Time clustering
2.  Tag + Text + Location
–  Weighted-sum combination: $w_t s_t + w_x s_x + w_l s_l$
–  Each weight corresponds to training performance
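A single-pass incremental clustering step with a weighted-sum similarity might look like the sketch below. It is illustrative only: the rule of comparing an item against the average similarity to a cluster's members, the threshold value, and the helper names are assumptions, not details given on the slide.

```python
def weighted_similarity(sims, weights):
    """Weighted sum of per-feature similarities, e.g. w_t*s_t + w_x*s_x + w_l*s_l."""
    return sum(weights[name] * sims[name] for name in weights)


def jaccard(tags_a, tags_b):
    """Jaccard index, used here only as a stand-in pairwise similarity."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def incremental_cluster(items, similarity, threshold):
    """Assign each item to its most similar cluster, or open a new one.

    The cluster score is the mean similarity to current members
    (an assumption; a centroid could be used instead).
    """
    clusters = []
    for item in items:
        best, best_score = None, threshold
        for cluster in clusters:
            score = sum(similarity(item, member) for member in cluster) / len(cluster)
            if score >= best_score:
                best, best_score = cluster, score
        if best is None:
            clusters.append([item])
        else:
            best.append(item)
    return clusters


if __name__ == "__main__":
    photos = [{"a", "b"}, {"a", "b", "c"}, {"x", "y"}]
    print(incremental_cluster(photos, jaccard, threshold=0.5))
```

In the two-stage scheme on the slide, a function like this would be run once with a time similarity, then again inside each time cluster with the weighted tag, text, and location similarity.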

Filtering

1.  Time + Location
–  Time: outside time-frame
–  Location: outside radius of central point
2.  Tag + Text: query expansion
3.  Visual: concept list
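The first filtering stage can be sketched as below. The slide only says that content outside the time frame or outside a radius of a central point is filtered; the photo dictionary layout, the haversine distance, and the rule of keeping a cluster when at least `min_fraction` of its photos pass are all assumptions made for illustration.

```python
import math
from datetime import datetime


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def photo_passes(photo, start, end, center, radius_km):
    """True if the photo lies inside the time frame and inside the radius."""
    in_time = start <= photo["taken"] <= end
    dist = great_circle_km(photo["lat"], photo["lon"], center[0], center[1])
    return in_time and dist <= radius_km


def filter_clusters(clusters, start, end, center, radius_km, min_fraction=0.5):
    """Keep clusters where enough photos pass (min_fraction is hypothetical)."""
    kept = []
    for cluster in clusters:
        if not cluster:
            continue
        passing = sum(photo_passes(p, start, end, center, radius_km) for p in cluster)
        if passing / len(cluster) >= min_fraction:
            kept.append(cluster)
    return kept


if __name__ == "__main__":
    barcelona = (41.3851, 2.1734)
    cluster = [{"taken": datetime(2009, 5, 10), "lat": 41.38, "lon": 2.12}]
    print(filter_clusters([cluster], datetime(2009, 5, 1), datetime(2009, 5, 31), barcelona, 50.0))
```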

Tag + Text Filtering

•  Use the Flickr API to construct the query
–  Tag: flickr.tags.getClusters
–  Text: flickr.photos.search
•  Use the online event directory last.fm to retrieve tag and text information
•  Filter the clusters with the same similarity metric: $w_t s_t + w_x s_x$
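As a rough illustration of the query-construction step, the sketch below only builds request URLs for the two Flickr methods named on the slide, using the public Flickr REST endpoint. It makes no network call. The helper names and the placeholder API key are hypothetical, and how the authors turned the responses into expanded tag and text queries is not specified here.

```python
from urllib.parse import urlencode

# Public Flickr REST endpoint; a real API key is needed to actually call it.
FLICKR_REST = "https://api.flickr.com/services/rest/"


def flickr_url(method, api_key, **params):
    """Build a Flickr REST request URL asking for a plain JSON response."""
    query = {
        "method": method,
        "api_key": api_key,
        "format": "json",
        "nojsoncallback": 1,
    }
    query.update(params)
    return FLICKR_REST + "?" + urlencode(query)


def tag_cluster_url(api_key, tag):
    """URL for flickr.tags.getClusters: clusters of tags related to `tag`."""
    return flickr_url("flickr.tags.getClusters", api_key, tag=tag)


def text_search_url(api_key, text):
    """URL for flickr.photos.search with a free-text query."""
    return flickr_url("flickr.photos.search", api_key, text=text)


if __name__ == "__main__":
    print(tag_cluster_url("YOUR_API_KEY", "soccer"))
    print(text_search_url("YOUR_API_KEY", "Parc del Forum"))
```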

Example Query

Visual Filtering

•  Filter clusters with invalid concept
•  E.g. the list for a soccer event:

Concept        Threshold
Beach          0.3
Flower Scene   0.4
Infant         0.3
…
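One way to read the concept list is as a blacklist: a cluster is dropped when its detector score for any listed concept exceeds that concept's threshold. The sketch below follows that reading. The per-cluster score dictionary, the strict `>` comparison, and the cluster identifiers are assumptions; only the three concept and threshold pairs come from the slide.

```python
# Concepts considered invalid for a soccer event, with thresholds from the slide.
INVALID_CONCEPTS_SOCCER = {"Beach": 0.3, "Flower Scene": 0.4, "Infant": 0.3}


def has_invalid_concept(scores, invalid):
    """True if any invalid concept scores above its threshold.

    `scores` maps concept name -> detector score for one cluster
    (how photo scores are aggregated per cluster is left open).
    """
    return any(scores.get(concept, 0.0) > thr for concept, thr in invalid.items())


def filter_by_concepts(cluster_scores, invalid):
    """Keep only clusters that trigger none of the invalid concepts."""
    return {
        cluster_id: scores
        for cluster_id, scores in cluster_scores.items()
        if not has_invalid_concept(scores, invalid)
    }


if __name__ == "__main__":
    clusters = {
        "c1": {"Beach": 0.1, "Infant": 0.05},
        "c2": {"Beach": 0.6},
    }
    print(sorted(filter_by_concepts(clusters, INVALID_CONCEPTS_SOCCER)))
```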

Training

•  Setup
–  No training set from the organizers
–  Compiled from a subset of the Upcoming dataset
–  Additional random photos from Flickr
•  Result
–  80% F1 after clustering
–  40% F1 after filtering

Result

•  Query expansion
–  Challenge 1: Barcelona, Rome, soccer
–  Challenge 2: Paradiso, Parc del Forum
•  Runs
–  Different thresholds µ for the tag + text filtering

Performance

Challenge 1

Metric      µ: 0.2    µ: 0.1    µ: 0.05
Precision   12.53%    62.88%    84.86%
Recall      58.79%    52.93%    52.54%
F1          20.65%    57.48%    64.9%
NMI         0.1166    0.2207    0.2367

Challenge 2

Metric      µ: 0.2    µ: 0.1    µ: 0.05   µ: 0.1, last.fm
Precision   38.5%     59.26%    66.89%    56.16%
Recall      66.34%    43.9%     6.04%     18.9%
F1          48.72%    50.44%    11.07%    28.28%
NMI         0.2941    0.448     0.2705    0.4491
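The F1 rows can be cross-checked against the precision and recall rows, since F1 is the harmonic mean of the two. The short sketch below recomputes it from the rounded percentages reported on the slides; small differences in the last digit are expected because the inputs are already rounded.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) taken from the two result tables.
REPORTED = [
    (12.53, 58.79, 20.65),
    (62.88, 52.93, 57.48),
    (84.86, 52.54, 64.9),
    (38.5, 66.34, 48.72),
    (59.26, 43.9, 50.44),
    (66.89, 6.04, 11.07),
    (56.16, 18.9, 28.28),
]

if __name__ == "__main__":
    for p, r, reported in REPORTED:
        print(f"P={p:6.2f} R={r:6.2f} recomputed F1={f1_score(p, r):6.2f} reported={reported}")
```

The last Challenge 2 threshold shows the trade-off clearly: precision rises to 66.89% but recall collapses to 6.04%, so F1 falls to about 11%.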

Summary

•  Simple clustering and filtering algorithm

[Figure legend: correct result, incorrect result, didn’t find]

Future work

•  Thorough result analysis on the available ground truth
•  Refine the filtering process
•  Incorporate methods to merge and rank clusters

Thoughts for SED 2012 (and beyond?)

•  Provide a common training set?
–  E.g. 2009 photos for training, 2010 for evaluation
•  TREC-style ranked-list evaluation
–  E.g. AP, F1 vs. depth, to make it easy to see what an algorithm (could) achieve
•  Accommodate other event definitions?
–  Multi-city, long-lasting events, e.g. the Olympic torch relay: http://www.flickr.com/search/?q=olympic+torch+relay+2010&s=rec
–  Recurring events, e.g. French Open tennis

The end
