On the Quest for Changing Knowledge

Marco Brambilla, Stefano Ceri, Florian Daniel, Emanuele Della Valle

@marcobrambi

Upload: marco-brambilla

Post on 23-Feb-2017


TRANSCRIPT

Page 1: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI

On the Quest for Changing Knowledge

Marco Brambilla, Stefano Ceri, Florian Daniel, Emanuele Della Valle

@marcobrambi

Page 2:

Data-driven innovation

and

Innovation-driven data

Page 3:

Innovation requires

Precise

To the point

Up-to-date

Domain-specific

information

Page 4:

There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy.

Shakespeare (Hamlet Act 1, scene 5)

Page 5:

From Data to Wisdom

Page 6:

Formalizing new knowledge is hard

Only high frequency emerges

The long tail challenge

Page 7:

Knowledge Extraction

Text mining

Semantic Web

Search and recommendation systems

No specific care for emerging knowledge

Page 8:

Heaven and Heart

How to peer through an effective window on the real world?

Social media, our blessing and curse

Domain experts matter

Page 9:

Can we use social networks to discover emerging knowledge?

Page 10:

Beware the streetlamp effect

The bias of the source

The bias of the observer

Page 11:

Famous Emerging

Page 12:

Evolving Knowledge

Page 13:

Overview

Page 14:

Knowledge Enrichment Setting

Page 15:

Emerging Knowledge Harvesting

Page 16:

Domain Types

Types selected by the experts

Relevant for the domain

Page 17:

Seed characterization

Selected by the expert

Belonging to an expert type

Thoroughly Described

# @ a w

Page 18:

Social Media Sourcing

Content coming from the seeds’ accounts

Page 19:

Candidate Selection

Potentially any entity extracted from the social streams

Resulting in huge sets of candidates

# @ a w ♥
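As a rough illustration of how candidates might be pulled out of social streams, the sketch below counts hashtags and user mentions in a handful of post texts. The regular expressions, the `extract_candidates` helper, and the sample posts are all hypothetical; the slides do not specify which extraction tooling was actually used.

```python
import re
from collections import Counter

# Hypothetical patterns: treat hashtags and user mentions as candidate handles.
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")

def extract_candidates(posts):
    """Count hashtag and mention candidates across an iterable of post texts."""
    counts = Counter()
    for text in posts:
        for tag in HASHTAG.findall(text):
            counts["#" + tag.lower()] += 1
        for user in MENTION.findall(text):
            counts["@" + user.lower()] += 1
    return counts

# Made-up posts, standing in for content from the seeds' accounts.
posts = [
    "Loved the show by @NewDesigner at #MFW",
    "#mfw backstage with @newdesigner and @OtherBrand",
]
candidates = extract_candidates(posts)
print(candidates.most_common())
```

Even on two posts the candidate set grows quickly, which is consistent with the slide's remark about huge sets of candidates.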

Page 20:

Candidate Typing

Page 21:

Candidate Pruning

Initial pruning of candidates based on TF-DF (*):

TF-DF := df * tf / (N - df + 1)

(*) A variant of TF-IDF that does not discount document frequency, because frequent appearance is actually desirable here (we are not looking for information entropy!)
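The TF-DF score can be sketched in a few lines of Python. The slide does not define the symbols, so this sketch assumes tf is the total number of occurrences of a candidate, df is the number of documents that contain it, and N is the total number of documents; what counts as a "document" (a single post, or all content from one seed) is also an assumption. The function name `tf_df_scores` is hypothetical.

```python
from collections import Counter

def tf_df_scores(documents):
    """Score each candidate with df * tf / (N - df + 1).

    documents: list of lists of candidate strings (one inner list per document).
    Assumed meanings: tf = total occurrences, df = documents containing the
    candidate, N = number of documents.
    """
    n_docs = len(documents)
    tf = Counter()
    df = Counter()
    for doc in documents:
        tf.update(doc)        # every occurrence counts toward tf
        df.update(set(doc))   # each document counts once toward df
    return {c: df[c] * tf[c] / (n_docs - df[c] + 1) for c in tf}

docs = [["a", "a", "b"], ["a"], ["c"]]
scores = tf_df_scores(docs)
print(scores["a"])  # 2 * 3 / (3 - 2 + 1) = 3.0
```

Unlike TF-IDF, the score grows as df grows, so candidates that appear across many documents are kept rather than discounted.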

Page 22:

Candidate Ranking

Page 23:

Candidate Vector Space

Purely syntactic

Semantic:

Based on entity extraction / DBpedia

Based on deep learning on images / Clarifai
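One way to picture ranking in a vector space is to order candidates by their distance from the seeds. The sketch below is only an assumption about how that could work: it ranks candidates by cosine distance to the centroid of the seed vectors. The slides do not say how seeds and candidates are compared, nor which distances were used, so the centroid approach, the choice of cosine distance, and all names and vectors here are illustrative.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)

def centroid(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def rank_candidates(seed_vectors, candidate_vectors, distance=cosine_distance):
    """Return candidate names sorted from closest to farthest from the seed centroid."""
    center = centroid(seed_vectors)
    return sorted(candidate_vectors,
                  key=lambda name: distance(candidate_vectors[name], center))

# Made-up feature vectors for two seeds and two candidates.
seeds = [[1.0, 0.0, 1.0], [1.0, 0.0, 0.5]]
candidates = {"x": [2.0, 0.0, 1.5], "y": [0.0, 1.0, 0.0]}
ranking = rank_candidates(seeds, candidates)
print(ranking)
```

Passing a different function as `distance` swaps the metric without touching the rest of the ranking.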

Page 24:

Page 25:

Example Analysis

Page 26:

Experiments

Fashion brands

Writers

Painters

Exhibitions

Page 27:

4,400 strategies evaluated

44 alternative feature vectors (12 basic features and 32 aggregations)

9 different weighting values for aggregations

5 levels of recall for entity extraction

3 different distances

Page 28:

Pruning Phase

From 4,400 down to 10 strategies

Eliminating the less relevant parameters

Page 29:

Italian Fashion Brands

Precision@5 = 0.2

Increasing the number of seeds reduces precision
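For readers unfamiliar with the metric, Precision@5 is the fraction of the five top-ranked candidates that are judged relevant. A minimal sketch follows, with made-up candidate names; the function name `precision_at_k` and the relevance set are hypothetical, and the slides do not say how relevance was judged.

```python
def precision_at_k(ranked, relevant, k=5):
    """Fraction of the top-k ranked items that appear in the relevant set."""
    top = ranked[:k]
    hits = sum(1 for item in top if item in relevant)
    return hits / k

# Made-up ranking in which only one of the top five candidates is relevant.
ranked = ["a", "b", "c", "d", "e", "f"]
relevant = {"c"}
print(precision_at_k(ranked, relevant))  # 1 hit out of 5 = 0.2
```

A value of 0.2 therefore corresponds to one relevant candidate among the top five.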

Page 30:

Australian Writers – 22 seeds

Precision@5 = 0.8

Page 31:

Innovative Painters – 21 seeds

Precision@5 = 0.6

Page 32:

Twitter vs. Instagram

P@5 = 1.0 vs. P@5 = 0.8

Page 33:

Fashion: Twitter + Instagram

Page 34:

Writers: Twitter + Instagram

Precision = 1

Page 35:

Conclusion

It’s about time to build innovation based on data

and build knowledge based on innovation

Harvesting can be iterative

Page 36:

On the Quest for Changing Knowledge

Contact us

Marco Brambilla, @marcobrambi, [email protected]

http://datascience.deib.polimi.it