Discovering Agreement and Disagreement between Users within a Twitter Conversation Thread
PRESENTATION BY:
ARVIND KRISHNAA JAGANNATHAN
Objective
“Given the thread of conversation among multiple users on Twitter,
based on the initiating statement (i.e., Tweet) of a user (say the initiator),
automatically identify those responses which agree with the statement of the user
and those that disagree”
Phase 1: Baseline Setup
The Experimental Setup
Phase 1: Supervised Classifier – The Labor-Intensive Baseline
Training Set: initial tweet + response pairs for all Twitter threads with 10 to 15 responses, hand-annotated as “Agreement”, “Disagreement”, or “Neither”
Around 10,000 manually annotated pairs
Test/Development Set: tweet + response pairs for threads with more than 15 responses
Around 1,500 pairs
Classifier Applied: MIRA classifier, implemented in Python
Results:
81.47% accuracy on the test/development data
Will be the baseline for comparison
[Chart: Baseline; accuracy on development data and accuracy on training data]
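The MIRA classifier used for the baseline follows a margin-based online update. The sketch below is illustrative only (a simplified passive-aggressive-style multiclass update over bag-of-bigram features, not the authors' actual implementation); the feature names are taken from the top-10 list on this slide.

```python
# Illustrative MIRA-style multiclass update; a simplified sketch,
# not the authors' implementation.
from collections import defaultdict

LABELS = ["agreement", "disagreement", "neither"]

def score(weights, feats, label):
    # Dot product between the label's weight sub-vector and the features.
    return sum(weights[(label, f)] * v for f, v in feats.items())

def mira_update(weights, feats, gold):
    # Find the highest-scoring incorrect label (the closest rival).
    rival = max((l for l in LABELS if l != gold),
                key=lambda l: score(weights, feats, l))
    margin = score(weights, feats, gold) - score(weights, feats, rival)
    if margin >= 1.0:
        return  # already separated by a sufficient margin: no update
    sq_norm = 2 * sum(v * v for v in feats.values())
    tau = min(1.0, (1.0 - margin) / sq_norm) if sq_norm else 0.0
    for f, v in feats.items():
        weights[(gold, f)] += tau * v   # pull the gold label up
        weights[(rival, f)] -= tau * v  # push the rival label down

# One annotated tweet-response pair as bag-of-bigram features.
weights = defaultdict(float)
pair = {"yeah_right": 1.0, "lol": 1.0}
mira_update(weights, pair, "disagreement")
```

After one update, the pair's gold label scores highest; repeating over ~10k annotated pairs yields the weight vector that ranks the lexical features above.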
Top 10 Lexical Features – By Weight Vector
f**k
completely_disagree
lol
roflmao
ROFL
K_RT
_RT
love_you
yeah_right
#truth
Phase 2: Structural Correspondence Learning
Structural Correspondence Learning
Domain adaptation technique to leverage abundant labeled data in one domain and utilize it in a target domain with little or no labeled data
Source Domain: 14 annotated meeting threads from the AMI Meeting Corpus (~10k statement-response adjacency pairs)
Target Domain: Initial tweet-response pairs from threads having 10-15 responses (~10k)
Structured Correspondence Learning Algorithm: Pivot Features
Phase 2: SCL Implementation
Choose m pivot features from the source and target domains, such that:
They occur frequently in both domains
They are characteristic of the task we want to achieve (i.e., indicate agreement or disagreement)
They are chosen using labeled source data plus unlabeled source and target data
Pivot Features: 50 most frequently occurring terms in pairs annotated as “agreement”, “disagreement” and “backchannel (AMI)/ neither (Twitter)”
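Pivot selection as described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions (whitespace tokenization, ranking by combined frequency); the toy data and function name are hypothetical, not the authors' code.

```python
# Sketch of pivot selection: terms frequent in BOTH domains.
# Assumptions: whitespace tokenization, combined-frequency ranking.
from collections import Counter

def choose_pivots(source_pairs, target_pairs, m=50):
    # Term frequencies over the responses in each domain.
    src = Counter(t for _, resp in source_pairs for t in resp.split())
    tgt = Counter(t for _, resp in target_pairs for t in resp.split())
    # Keep only terms seen in both domains, ranked by combined frequency.
    shared = set(src) & set(tgt)
    return sorted(shared, key=lambda t: src[t] + tgt[t], reverse=True)[:m]

# Toy adjacency pairs (statement, response) from each domain.
ami = [("statement", "yeah i disagree"), ("statement", "yeah right")]
tweets = [("tweet", "lol yeah right"), ("tweet", "completely disagree")]
pivots = choose_pivots(ami, tweets, m=3)
```

In the actual experiment m = 50, and the frequencies are computed separately within the "agreement", "disagreement", and "backchannel/neither" annotated pairs.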
Structured Correspondence Learning Algorithm
Step 1: Construct m pivot feature vectors for the source and target domain
Step 2: Construct one binary prediction problem per pivot feature over the source-domain adjacency pairs
Binary prediction question: for a given adjacency pair, does pivot feature mi occur in the response?
Train a classifier on the AMI corpus to construct a weight vector W such that
Wi = weight assigned to the ith feature by the predictor for that pivot
For each pivot feature, there will be one weight vector W
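One pivot predictor can be sketched as below. A plain perceptron stands in for the classifier here (an assumption; the slides do not name the pivot-predictor learner), and the toy feature matrix is hypothetical.

```python
# Sketch of Step 2: a binary predictor for one pivot feature.
# Assumption: a simple perceptron as the pivot-predictor learner.
import numpy as np

def train_pivot_predictor(X, pivot_col, epochs=10):
    """X: (n_pairs, n_features) binary feature matrix over adjacency pairs.
    Label: does pivot feature m_i (column pivot_col) occur in the response?
    The pivot column itself is masked so the predictor cannot cheat."""
    y = np.where(X[:, pivot_col] > 0, 1, -1)
    X_in = X.copy()
    X_in[:, pivot_col] = 0
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X_in, y):
            if yi * (w @ xi) <= 0:   # misclassified: perceptron update
                w += yi * xi
    return w  # the weight vector W for this pivot feature

# Toy data: feature 0 co-occurs with the pivot (column 2); feature 1 does not.
X = np.array([[1, 0, 1],
              [1, 0, 1],
              [0, 1, 0],
              [0, 1, 0]], dtype=float)
W = train_pivot_predictor(X, pivot_col=2)
```

Features that co-occur with the pivot get positive weight, which is exactly what lets the SVD in the next step find cross-domain correspondences.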
Structural Correspondence Learning
[Diagram: Source Domain (AMI annotated meeting corpus) → extract features which strongly correlate with agreement/disagreement → Source Feature Vector; (1) construct a column matrix from the feature vectors, (2) singular value decomposition L = UDVT, (3) obtain the mapping matrix UT; project onto the Common Latent Space and then onto the Target Domain (Twitter corpus) → Target Feature Vector → MIRA Classifier → Labels]
Structured Correspondence Learning Algorithm: Application in Target Domain
Step 3: Construct a matrix L, whose column vectors are the pivot predictor weight vectors
Step 4: Perform SVD on L, i.e., L = UDVT; take θ = UT, which is a projection from the original feature space to a latent space common to both source and target domains
Step 5: Apply the features from each row of θ to the data from the Twitter adjacency pairs and the AMI adjacency pairs
Step 6: Through Step 5, induce correspondences between features indicating agreement/disagreement in the AMI corpus and the Twitter corpus
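Steps 3 through 5 can be sketched with NumPy as follows. The dimensions are illustrative, and h (the number of retained rows of UT) is an assumption, since the slides do not state it.

```python
# Sketch of Steps 3-5: stack pivot-predictor weight vectors, SVD,
# project into the shared latent space. Dimensions are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_features, n_pivots, h = 200, 50, 25   # h is an assumed choice

# Step 3: column i of L is the weight vector W for pivot feature m_i.
L = rng.standard_normal((n_features, n_pivots))

# Step 4: L = U D V^T; the top h rows of U^T give the projection theta.
U, D, Vt = np.linalg.svd(L, full_matrices=False)
theta = U.T[:h]          # shape (h, n_features)

# Step 5: project any original feature vector (AMI or Twitter)
# into the latent space common to both domains.
x = rng.standard_normal(n_features)
z = theta @ x            # h latent features
```

Because the same θ is applied to AMI and Twitter vectors, features that behaved alike with respect to the pivots land close together in the latent space, which is what induces the correspondences in Step 6.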
Results
Visualizing the correspondences between source and target domains
AMI Corpus: Features strongly associated with the feature disagree
disagree
wrong
incorrect
Uh
obviously
though
tend_to
um
Twitter Corpus: Corresponding features
disagree
completely
#stupid
ROFL
liar
have_to
hate
#WTF
1. f**k
2. completely_disagree
3. lol
4. roflmao
5. ROFL
6. K_RT
7. _RT
8. love_you
9. yeah_right
10. #truth
Results
Three instances of the target classifier were set up:
1. Labeled source domain data; unlabeled target domain data
2. Labeled source domain data; unlabeled data from source and target (unlabeled source-domain data augments extraction of corresponding features)
3. Labeled source domain data; unlabeled target and source domain data; a small amount of labeled target domain data
Source data: annotated adjacency pairs from 10 meeting threads (~8k)
Target data: Twitter conversation threads with exactly 10 responses (~2k)
Features extracted from the target domain are applied to a MIRA classifier, and the accuracy is computed in each of the three scenarios
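One plausible way to feed the projected features to the classifier, following standard SCL practice (an assumption; the slides do not spell this step out), is to augment each adjacency pair's original features with its latent projection:

```python
# Sketch of feature augmentation before classification; the helper
# name and toy dimensions are illustrative, not the authors' code.
import numpy as np

def augment(X, theta):
    # Original features side by side with their latent projection.
    return np.hstack([X, X @ theta.T])

theta = np.eye(3, 5)        # toy projection: h=3 latent dims, 5 features
X = np.ones((4, 5))         # 4 adjacency pairs, 5 original features
X_aug = augment(X, theta)   # shape (4, 5 + 3)
```

The augmented matrix is what would go to the MIRA classifier in each of the three scenarios.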
Results: Comparison with Baseline Result
[Chart: SCL accuracy (%) on Twitter test data by scenario. Scenario 1: 77.61, Scenario 2: 80.74, Scenario 3: 83.54]
Results: Comparison with Baseline Result
[Chart: Varying the size of labeled target data; accuracy (%) vs. number of labeled target pairs: 500 → 81.03, 750 → 82.16, 1000 → 82.79, 1500 → 83.24, 2000 → 83.54]
Discussions
Salient Points of Discussion
Purely unlabeled data provides classification accuracy very close to the baseline
Compared with the gains from SCL applied to POS tagging, Blitzer et al.'s* task drew on a significantly larger corpus
Conversations in both the AMI and Twitter corpora are generally short (AMI: around 10-12 words per utterance; Twitter: at most 140 characters)
Certain Twitter-specific constructs were not leveraged (especially retweets)
Significantly differing lexicons convey a similar sentiment (e.g., a single swear word followed by a retweet)
Able to beat the baseline with minimal annotated data from the target domain
The current implementation does not take the initial statement/tweet into account
*Blitzer, J., McDonald, R., & Pereira, F. (2006, July). Domain adaptation with structural correspondence learning. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (pp. 120-128). Association for Computational Linguistics.
Future Work
Use more unlabeled data to see if the baseline can be beaten without any labeled target domain data
Incorporate the words used in the statement into the model
Restrict categories of Twitter conversation to particular domain/personalities (perhaps may lead to better results)
Clean up the code and make it ready for public distribution!