Discovering Agreement and Disagreement between Users within a Twitter Conversation Thread
PRESENTATION BY:
ARVIND KRISHNAA JAGANNATHAN
Objective
“Given the thread of conversation among multiple users on Twitter,
based on the initiating statement (i.e., Tweet) of a user (say the initiator),
automatically identify those responses which agree with the statement of the user
and those that disagree”
Phase 1: Baseline Setup
The Experimental Setup
Phase 1: Supervised Classifier – The Labor-Intensive Baseline
Training Set: initial tweet + response pairs for all Twitter threads with 10 to 15 responses, hand-annotated as “Agreement”, “Disagreement”, or “Neither”
Around 10,000 manually annotated pairs
Test/Development Set: tweet + response pairs for threads with more than 15 responses
Around 1,500 pairs
Classifier Applied: MIRA classifier, implemented in Python
Results:
81.47% accuracy on the test/development data
Will be the baseline for comparison
[Chart: Baseline; accuracy on development data and accuracy on training data]
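The MIRA classifier used for the baseline follows a margin-based online update. The sketch below is illustrative only (a simplified passive-aggressive-style multiclass update over bag-of-bigram features, not the authors' actual implementation); the feature names are taken from the top-10 list on this slide.

```python
# Illustrative MIRA-style multiclass update; a simplified sketch,
# not the authors' implementation.
from collections import defaultdict

LABELS = ["agreement", "disagreement", "neither"]

def score(weights, feats, label):
    # Dot product between the label's weight sub-vector and the features.
    return sum(weights[(label, f)] * v for f, v in feats.items())

def mira_update(weights, feats, gold):
    # Find the highest-scoring incorrect label (the closest rival).
    rival = max((l for l in LABELS if l != gold),
                key=lambda l: score(weights, feats, l))
    margin = score(weights, feats, gold) - score(weights, feats, rival)
    if margin >= 1.0:
        return  # already separated by a sufficient margin: no update
    sq_norm = 2 * sum(v * v for v in feats.values())
    tau = min(1.0, (1.0 - margin) / sq_norm) if sq_norm else 0.0
    for f, v in feats.items():
        weights[(gold, f)] += tau * v   # pull the gold label up
        weights[(rival, f)] -= tau * v  # push the rival label down

# One annotated tweet-response pair as bag-of-bigram features.
weights = defaultdict(float)
pair = {"yeah_right": 1.0, "lol": 1.0}
mira_update(weights, pair, "disagreement")
```

After one update, the pair's gold label scores highest; repeating over ~10k annotated pairs yields the weight vector that ranks the lexical features above.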
Top 10 Lexical Features – By Weight Vector
f**k
completely_disagree
lol
roflmao
ROFL
K_RT
_RT
love_you
yeah_right
#truth
Phase 2: Structural Correspondence Learning
Structural Correspondence Learning
Domain adaptation technique to leverage abundant labeled data in one domain and utilize it in a target domain with little or no labeled data
Source Domain: 14 annotated meeting threads from the AMI Meeting Corpus (~10k statement-response adjacency pairs)
Target Domain: Initial tweet-response pairs from threads having 10-15 responses (~10k)
Structured Correspondence Learning Algorithm: Pivot Features
Phase 2: SCL Implementation
Choose m pivot features from the source and target domains, such that:
They occur frequently in both domains
They are characteristic of the task we want to achieve (i.e., indicate agreement or disagreement)
They are chosen using labeled source data plus unlabeled source and target data
Pivot Features: 50 most frequently occurring terms in pairs annotated as “agreement”, “disagreement” and “backchannel (AMI)/ neither (Twitter)”
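Pivot selection as described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions (whitespace tokenization, ranking by combined frequency); the toy data and function name are hypothetical, not the authors' code.

```python
# Sketch of pivot selection: terms frequent in BOTH domains.
# Assumptions: whitespace tokenization, combined-frequency ranking.
from collections import Counter

def choose_pivots(source_pairs, target_pairs, m=50):
    # Term frequencies over the responses in each domain.
    src = Counter(t for _, resp in source_pairs for t in resp.split())
    tgt = Counter(t for _, resp in target_pairs for t in resp.split())
    # Keep only terms seen in both domains, ranked by combined frequency.
    shared = set(src) & set(tgt)
    return sorted(shared, key=lambda t: src[t] + tgt[t], reverse=True)[:m]

# Toy adjacency pairs (statement, response) from each domain.
ami = [("statement", "yeah i disagree"), ("statement", "yeah right")]
tweets = [("tweet", "lol yeah right"), ("tweet", "completely disagree")]
pivots = choose_pivots(ami, tweets, m=3)
```

In the actual experiment m = 50, and the frequencies are computed separately within the "agreement", "disagreement", and "backchannel/neither" annotated pairs.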
Structured Correspondence Learning Algorithm
Step 1: Construct m pivot feature vectors for the source and target domain
Step 2: Construct one binary prediction problem per pivot feature over the source-domain adjacency pairs
Binary prediction question: for a given adjacency pair, does pivot feature mi occur in the response?
Train a classifier on the AMI corpus to construct a weight vector W such that
Wi = weight assigned to the ith feature by the predictor for that pivot
For each pivot feature, there will be one weight vector W
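One pivot predictor can be sketched as below. A plain perceptron stands in for the classifier here (an assumption; the slides do not name the pivot-predictor learner), and the toy feature matrix is hypothetical.

```python
# Sketch of Step 2: a binary predictor for one pivot feature.
# Assumption: a simple perceptron as the pivot-predictor learner.
import numpy as np

def train_pivot_predictor(X, pivot_col, epochs=10):
    """X: (n_pairs, n_features) binary feature matrix over adjacency pairs.
    Label: does pivot feature m_i (column pivot_col) occur in the response?
    The pivot column itself is masked so the predictor cannot cheat."""
    y = np.where(X[:, pivot_col] > 0, 1, -1)
    X_in = X.copy()
    X_in[:, pivot_col] = 0
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X_in, y):
            if yi * (w @ xi) <= 0:   # misclassified: perceptron update
                w += yi * xi
    return w  # the weight vector W for this pivot feature

# Toy data: feature 0 co-occurs with the pivot (column 2); feature 1 does not.
X = np.array([[1, 0, 1],
              [1, 0, 1],
              [0, 1, 0],
              [0, 1, 0]], dtype=float)
W = train_pivot_predictor(X, pivot_col=2)
```

Features that co-occur with the pivot get positive weight, which is exactly what lets the SVD in the next step find cross-domain correspondences.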
Structural Correspondence Learning
[Diagram: Source Domain (AMI annotated meeting corpus) → extract features which strongly correlate with agreement/disagreement → Source Feature Vector; (1) construct a column matrix from the feature vectors, (2) singular value decomposition L = UDVT, (3) obtain the mapping matrix UT; project onto the Common Latent Space and then onto the Target Domain (Twitter corpus) → Target Feature Vector → MIRA Classifier → Labels]
Structured Correspondence Learning Algorithm: Application in Target Domain
Step 3: Construct a matrix L, whose column vectors are the pivot predictor weight vectors
Step 4: Perform SVD on L, i.e., L = UDVT; take θ = UT, which is a projection from the original feature space to a latent space common to both source and target domains
Step 5: Apply the features from each row of θ to the data from the Twitter adjacency pairs and the AMI adjacency pairs
Step 6: Through Step 5, induce correspondences between features indicating agreement/disagreement in the AMI corpus and the Twitter corpus
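Steps 3 through 5 can be sketched with NumPy as follows. The dimensions are illustrative, and h (the number of retained rows of UT) is an assumption, since the slides do not state it.

```python
# Sketch of Steps 3-5: stack pivot-predictor weight vectors, SVD,
# project into the shared latent space. Dimensions are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_features, n_pivots, h = 200, 50, 25   # h is an assumed choice

# Step 3: column i of L is the weight vector W for pivot feature m_i.
L = rng.standard_normal((n_features, n_pivots))

# Step 4: L = U D V^T; the top h rows of U^T give the projection theta.
U, D, Vt = np.linalg.svd(L, full_matrices=False)
theta = U.T[:h]          # shape (h, n_features)

# Step 5: project any original feature vector (AMI or Twitter)
# into the latent space common to both domains.
x = rng.standard_normal(n_features)
z = theta @ x            # h latent features
```

Because the same θ is applied to AMI and Twitter vectors, features that behaved alike with respect to the pivots land close together in the latent space, which is what induces the correspondences in Step 6.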
Results
Visualizing the correspondences between source and target domains
AMI Corpus: Features strongly associated with the feature disagree
disagree
wrong
incorrect
Uh
obviously
though
tend_to
um
Twitter Corpus: Corresponding features
disagree
completely
#stupid
ROFL
liar
have_to
hate
#WTF
1. f**k
2. completely_disagree
3. lol
4. roflmao
5. ROFL
6. K_RT
7. _RT
8. love_you
9. yeah_right
10. #truth
Results
Three instances of the target classifier were set up:
1. Labeled source domain data; unlabeled target domain data
2. Labeled source domain data; unlabeled data from source and target (unlabeled source-domain data augments extraction of corresponding features)
3. Labeled source domain data; unlabeled target and source domain data; a small amount of labeled target domain data
Source data: annotated adjacency pairs from 10 meeting threads (~8k)
Target data: Twitter conversation threads with exactly 10 responses (~2k)
Features extracted from the target domain are applied to a MIRA classifier, and the accuracy is computed in each of the three scenarios
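One plausible way to feed the projected features to the classifier, following standard SCL practice (an assumption; the slides do not spell this step out), is to augment each adjacency pair's original features with its latent projection:

```python
# Sketch of feature augmentation before classification; the helper
# name and toy dimensions are illustrative, not the authors' code.
import numpy as np

def augment(X, theta):
    # Original features side by side with their latent projection.
    return np.hstack([X, X @ theta.T])

theta = np.eye(3, 5)        # toy projection: h=3 latent dims, 5 features
X = np.ones((4, 5))         # 4 adjacency pairs, 5 original features
X_aug = augment(X, theta)   # shape (4, 5 + 3)
```

The augmented matrix is what would go to the MIRA classifier in each of the three scenarios.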
Results: Comparison with Baseline Result
[Chart: SCL accuracy (%) on Twitter test data by scenario. Scenario 1: 77.61, Scenario 2: 80.74, Scenario 3: 83.54]
Results: Comparison with Baseline Result
[Chart: Varying the size of labeled target data; accuracy (%) vs. number of labeled target pairs: 500 → 81.03, 750 → 82.16, 1000 → 82.79, 1500 → 83.24, 2000 → 83.54]
Discussions
Salient Points of Discussion
Purely unlabeled data provides classification accuracy very close to the baseline
Compared with the gains from SCL applied to POS tagging, Blitzer et al.'s* task drew on a significantly larger corpus
Conversations in both the AMI and Twitter corpora are generally short (AMI: around 10-12 words per utterance; Twitter: at most 140 characters)
Certain Twitter-specific constructs were not leveraged (especially retweets)
Significantly differing lexicons convey a similar sentiment (e.g., a single swear word followed by a retweet)
Able to beat the baseline with minimal annotated data from the target domain
The current implementation does not take the initial statement/tweet into account
*Blitzer, J., McDonald, R., & Pereira, F. (2006, July). Domain adaptation with structural correspondence learning. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (pp. 120-128). Association for Computational Linguistics.
Future Work
Use more unlabeled data to see if the baseline can be beaten without any labeled target domain data
Incorporate the words used in the statement into the model
Restrict categories of Twitter conversation to particular domain/personalities (perhaps may lead to better results)
Clean up the code and make it ready for public distribution!