topical keyphrase extraction from twitter

10

Click here to load reader

Upload: moresmile

Post on 12-May-2015

266 views

Category:

Technology


2 download

TRANSCRIPT

Page 1: Topical keyphrase extraction from twitter

Topical Keyphrase Extraction from Twitter

Page 2: Topical keyphrase extraction from twitter

In this paper, we propose to extract topical keyphrases as one way to summarize Twitter.

1、 A context-sensitive topical PageRank method for keyword ranking

2、 A probabilistic scoring function that considers both relevance and interestingness of keyphrases for keyphrase ranking

Abstract

Page 3: Topical keyphrase extraction from twitter

Keyphrases are defined as a short list of terms tosummarize the topics of a document.

Challenges:1. Tweets are much shorter than traditional

articles and not all tweets contain useful information;

2. Topics tend to be more diverse in Twitter than informal articles such as news reports.

Keyword ranking; Candidate keyphrase generation; Keyphrase ranking.

Introduction

Page 4: Topical keyphrase extraction from twitter

Preliminaries: u=U V T Given T and C ,topical keyphrase extraction

is todiscover a list of keyphrases for each topic t ∈ T .Here each keyphrase is a sequence of words.

Method

Page 5: Topical keyphrase extraction from twitter

Topic discovery

Keywords ranking and candidate generation

Keyphrases ranking

Method

Page 6: Topical keyphrase extraction from twitter

A modified author-topic model called Twitter-LDA introduced by Zhao etal.(2011)

Topic discovery

Page 7: Topical keyphrase extraction from twitter

The topic-specific PageRank scores: TPR

A topic context sensitive PageRank method: cTPR

Topical PageRank for Keyword Ranking

Page 8: Topical keyphrase extraction from twitter

a common candidate keyphrase generation method proposed by Mihalcea and Tarau(2004) as follows:

1. Select the top S keywords for each topic2. look for combinations of these keywords

that occur as frequent phrases in the text collection

Topical PageRank for Keyword Ranking

Page 9: Topical keyphrase extraction from twitter

Relevance : A good keyphrase should be closely related to the given topic and also discriminative.

Interestingness : A good keyphrase should be interesting and can attract users’ attention.

Probabilistic Models for Topical KeyphraseRanking

Page 10: Topical keyphrase extraction from twitter

In general we can assume that P ( R = 0)>> P ( R = 1) because there are much more non-relevant keyphrases than relevant ones,that is, δ>> 1 .

Probabilistic Models for Topical KeyphraseRanking