cedi2010 - slides

32
A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis Jonathan Ortigosa-Hernández 1 , Juan Diego Rodríguez 1 , Leandro Alzate 2 , Iñaki Inza 1 , and José A. Lozano 1 1 Intelligent Systems Group Computer Science and Artificial Intelligence Department University of the Basque Country 2 Socialware Company, Bilbao, Spain CEDI 2010 Valencia – September 9th, 2010

Upload: jonathan-ortigosa-hernandez

Post on 03-Jul-2015

518 views

Category:

Documents


1 download

TRANSCRIPT

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

A Semi-supervised Approach toMulti-dimensional Classification with

Application to Sentiment Analysis

Jonathan Ortigosa-Hernández1, Juan Diego Rodríguez1,Leandro Alzate2, Iñaki Inza1, and José A. Lozano1

1Intelligent Systems GroupComputer Science and Artificial Intelligence Department

University of the Basque Country2 Socialware Company, Bilbao, Spain

CEDI 2010 Valencia – September 9th, 2010

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Index

1 Introduction

2 Multi-dimensional J /K Dependence Bayesian Classifier

3 Multi-dimensional Semi-supervised Learning

4 Experimentation in Sentiment AnalysisSentiment Analysis ProblemASOMO DatasetResults

5 Conclusions and Future Work

6 References

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Introduction

Supervised Classification

It consists of building a classifier Ψ from a given labelledtraining dataset D, by using an induction algorithm A(A(D) = Ψ),

X1 X2 ... Xn C

x(1)1 x

(1)2 ... x

(1)n c(1)

x(2)1 x

(2)2 ... x

(2)n c(2)

... ... ... ... ...

x(N)1 x

(N)2 ... x

(N)n c(N)

in order to predict the value of a class variable C for anynew unlabelled instance x (Ψ(x) = c).

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Introduction

Uni-dimensional and Multi-dimensionalClassification

Uni-dimensional classification tries to predict a singleclass variable based on a dataset composed of a set oflabelled examples.

(Uni-dimensional Class) Bayesian Network Classifiers(Larrañaga et al, 2005).

Multi-dimensional classification is the generalisation ofthe single-class classification task to the simultaneousprediction of a set of class variables.

Multi-dimensional Class Bayesian Network Classifiers (v.d.Gaag and d. Waal, 2006).

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Introduction

Multi-dimensional Supervised Learning

A typical supervised training dataset

X1 X2 ... Xn C1 C2 ... Cm

x(1)1 x

(1)2 ... x

(1)n c

(1)1 c

(1)2 ... c

(1)m

x(2)1 x

(2)2 ... x

(2)n c

(2)1 c

(2)2 ... c

(2)m

... ... ... ... ... ... ... ...

x(N)1 x

(N)2 ... x

(N)n c

(N)1 c

(N)2 ... c

(N)m

Each instance of the dataset contains both the values of theattributes and m labels which characterise the attributes.

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Introduction

Bayesian Network Classifiers

X1 X2 X3 X4 X5 X6

C

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Introduction

Multi-dimensional Class Bayesian NetworkClassifiers (MDBNC)

X1 X2 X3 X4 X5 X6

C1 C2 C3

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Introduction

MDBNC Structure

X1 X2 X3 X4 X5

C1 C2 C3

X1 X2 X3 X4 X5

C1 C2 C3

X1 X2 X3 X4 X5

(a) Complete graph

(c) Class subgraph

(b) Feature selection subgraph

(d) Feature subgraph

C1 C2 C3

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Introduction

Sub-families of MDBNC

(a) Multi-dimensional naive Bayes

(c) Multi-dimensional J/K dependence Bayesian (2/3)

(b) Multi-dimensional tree-augmented network

!"

!#

!$

!%

!& !'

!(

!)

*# *$ *%

!+ !",

*"

!" !# !$ !% !& !' !( !) !* !"+

," ,# ,$

!" !# !$ !% !&

'" '#

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Introduction

Sub-families of MDBNC

1 Multi-dimensional naive Bayes classifier (MDnB)It has a fixed structure. Thus, it has no structural learning(v.d. Gaag and d. Waal, 2006).

2 Multi-dimensional tree-augmented network classifier(MDTAN)

Using the structural learning proposed in (v.d. Gaag and d.Waal, 2006).

3 Multi-dimensional J /K dependence Bayesian classifier(MD J /K)

The structural learning is proposed in this paper.

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Multi-dimensional J /K Dependence Bayesian Classifier

Multi-dimensional J /K Dependence BayesianClassifier (MD J /K) I

A supervised method to learn a J /K structure(1) Learn the structure between the class variables (AC):

1 Calculate the p-value (significance of the mutual informationMI(Ci, Cj)) using the independence test for each pair of classvariables, and rank them.

2 Remove the p-values greater than the threshold α = 0,1.3 Use the ranking to add arcs between the class variables fulfilling

the conditions of no cycles between the class variables and nomore than J-parents per class.

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Multi-dimensional J /K Dependence Bayesian Classifier

Multi-dimensional J /K Dependence BayesianClassifier (MD J /K) II

A supervised method to learn a J /K structure(2) Learn the structure between the class variables and thefeatures(ACF ):

1 Calculate the p-value (significance of the mutual informationMI(Ci, Xj)) using the independence test for each pair Ci andXj and rank them.

2 Remove the p-values greater than the threshold α = 0,1.3 Use the ranking to add arcs from the class variables to the

features.

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Multi-dimensional J /K Dependence Bayesian Classifier

Multi-dimensional J /K Dependence BayesianClassifier (MD J /K) III

A supervised method to learn a J /K structure(3) Learn the structure between the features(AF ):

1 Calculate the p-value (significance of the conditional mutualinformation MI(Xi, Xj |Pac(Xj))) using the conditionalindependence test for each pair Xi and Xj given Pac(Xj) andrank them.

2 Remove the p-values greater than the threshold α = 0,1.3 Use the ranking to add arcs between the class variables fulfilling

the conditions of no cycles between the features and no morethan K-parents per feature.

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Multi-dimensional Semi-supervised Learning

Major Problem of Supervised Learning

The labelling of the training data is usually done by anexternal mechanism (usually human beings).However, in many real world problems, obtaining data isrelatively easy, while labelling is difficult, expensive or laborintensive.This problem is accentuated when using multiple targetvariables.DESIRE: Learning algorithms able to incorporate a largenumber of unlabelled data with a small number of labeleddata when learning classifiers.

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Multi-dimensional Semi-supervised Learning

Multi-dimensional Semi-supervised Learning

A typical semi-supervised training datasetX1 X2 ... Xn C1 C2 ... Cm

x(1)1 x

(1)2 ... x

(1)n c

(1)1 c

(1)2 ... c

(1)m

x(2)1 x

(2)2 ... x

(2)n c

(2)1 c

(2)2 ... c

(2)m

... ... ... ... ... ... ... ...

x(L)1 x

(L)2 ... x

(L)n c

(L)1 c

(L)2 ... c

(L)m

x(L+1)1 x

(L+1)2 ... x

(L+1)n ? ? ... ?

x(L+2)1 x

(L+2)2 ... x

(L+2)n ? ? ... ?

... ... ... ... ... ... ... ...

x(N)1 x

(N)2 ... x

(N)n ? ? ... ?

Semi-supervised Learning fulfills this desire.

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Multi-dimensional Semi-supervised Learning

The Expectation-Maximization Algorithm

The EM algorithm (Dempster et al, 1977)Learn an initial model.Repeat until convergence:(a) Expectation step: Using the current model, estimate themissing values of the data.(b) Maximisation step: Using the whole data and the previousestimations, learn a new current model.

Any MDBNC learning algorithm can be used as model in thisalgorithm.

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Experimentation in Sentiment Analysis

Sentiment Analysis Problem

Index

1 Introduction

2 Multi-dimensional J /K Dependence Bayesian Classifier

3 Multi-dimensional Semi-supervised Learning

4 Experimentation in Sentiment AnalysisSentiment Analysis ProblemASOMO DatasetResults

5 Conclusions and Future Work

6 References

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Experimentation in Sentiment Analysis

Sentiment Analysis Problem

Definitions (Liu, 2010)

Sentiment Analysis (AKA Opinion Mining) is thecomputational study of opinions, sentiments and emotionsexpressed in text.When treating Sentiment Analysis as a classificationproblem, two different problems appear:

1 Subjectivity Classification. Its aim is to classify a text assubjective or objective.

2 Sentiment Classification. It classifies an opinionated text asexpressing a positive, neutral, or negative opinion.

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Experimentation in Sentiment Analysis

Sentiment Analysis Problem

Motivation for Using Semi-supervised Learning ofMulti-dimensional Classifiers

1 Up to now, these two subproblems have been studied inisolation despite of being closely related.So, probably it would be helpful to use multi-dimensionalclassifiers.

2 Obtaining enough labeled examples for a classifier may becostly and time consuming.This motivates us to deal with unlabelled examples in asemi-supervised framework.

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Experimentation in Sentiment Analysis

ASOMO Dataset

Index

1 Introduction

2 Multi-dimensional J /K Dependence Bayesian Classifier

3 Multi-dimensional Semi-supervised Learning

4 Experimentation in Sentiment AnalysisSentiment Analysis ProblemASOMO DatasetResults

5 Conclusions and Future Work

6 References

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Experimentation in Sentiment Analysis

ASOMO Dataset

Properties of the Dataset

Collected by Socialware Company, from the ASOMOservice of mobilised opinion analysis.It consists of 2, 542 Spanish reviews extracted from a blog:

150 documents have been labeled in isolation by an expert.2, 392 posts are left unlabelled.

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Experimentation in Sentiment Analysis

ASOMO Dataset

Properties of Each Document

Each document is represented as:14 features

Obtained by using an open source morphological analyser(Carreras et al, 2006).Each feature provide different information related topart-of-speech (POS).Eg. First Persons, Agreement Expressions, Imperatives,Prediction Verbs (future), Questions, Positive Adjectives,etc. Represented as a real number between 0 and 1.

3 class variablesWill to Influence: {declarative sentence, soft WI, mediumWI, strong WI}Sentiment: {very negative, negative, neutral, positive, verypositive}Subjectivity: {Yes (subjective), No (objective)}

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Experimentation in Sentiment Analysis

Results

Index

1 Introduction

2 Multi-dimensional J /K Dependence Bayesian Classifier

3 Multi-dimensional Semi-supervised Learning

4 Experimentation in Sentiment AnalysisSentiment Analysis ProblemASOMO DatasetResults

5 Conclusions and Future Work

6 References

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Experimentation in Sentiment Analysis

Results

Experiment

The ASOMO dataset has been used to learn:3 (uni-dimensional) Bayesian network classifiers: nB, TANand 2DB.5 MDBNC: MDnB, MDTAN, MD 2/2, MD 2/3 and MD 2/4.

In both Supervised and Semi-supervised (EM algorithm)learning frameworks.Features from ASOMO dataset are discretised into 3values using equal frequency.Results averaged over 5 × 5 fold cross validation.

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Experimentation in Sentiment Analysis

Results

JOINT Accuracy

This measure estimates the values of all class variablessimultaneously, that is, it only counts a success if all the classesare correctly predicted, otherwise it counts an error.

5

10

15

20

25

nB TAN 2DB MDnB MDTAN MD 2/2 MD 2/3 MD 2/4

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Experimentation in Sentiment Analysis

Results

Will to Influence

30

40

50

60

70

nB TAN 2DB MDnB MDTAN MD 2/2 MD 2/3 MD 2/4

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Experimentation in Sentiment Analysis

Results

Sentiment Polarity

20

25

30

35

40

nB TAN 2DB MDnB MDTAN MD 2/2 MD 2/3 MD 2/4

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Experimentation in Sentiment Analysis

Results

Subjectivity

50

60

70

80

90

nB TAN 2DB MDnB MDTAN MD 2/2 MD 2/3 MD 2/4

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Conclusions and Future Work

Conclusions

Multi-dimensional classification and semi-supervisedlearning are two different branches of machine learning.In this research, we have established a bridge betweenthem showing that these techniques are competitive withthe state-of-the-art algorithms.“As discussed in the literature, currently there is nocoherent strategy for handling unlabelled data, so somecreativity must be exercised.” (Cohen)

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Conclusions and Future Work

Future work

What kinds of problem could benefit from this?Web-related problems. E.g. Affect Analysis.

Scalability of MDBNC (high dimensionality of multi-labelproblems).The literature has shown that structure learning algorithmsthat maximise the likelihood are not likely to find structuresyielding good classifiers in a semi-supervised manner.

The ASOMO preliminary results show that using the EMwith a probabilistic model cannot ensure learning a correctmodel.Thus, we need to perform structure search, attempting tomaximise classification accuracy directly.

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

Conclusions and Future Work

Questions

THANK [email protected]

A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis

References

References

Liu B. (2010). Sentiment Analysis and Subjectivity. In: Indurkhya N. and Damerau F.J. Handbook of NaturalLanguage Processing, Chapman & Hall, 2nd Ed.

Rodriguez, J.D. and Lozano, J.A. (2008). Multi-objective learning of multi-dimensional Bayesian classifiers. InProceedings of the Eighth International Conference on Hybrid Intelligent Systems, HIS 2008, Barcelona, Spain. pp.501-506

van der Gaag L. and d. Waal P. (2006). Multi-dimensional Bayesian Classifiers. In Proceedings of the ThirdEuropean Workshop in Probabilistic Graphical Models, pages 107–114

Larrañaga, P., Lozano, J.A., Peña, J.M. and Inza, I (2005). Special Issue on Probabilistic Graphical Models forClassification. Machine Learning, 59(3)

Cozman, F. and Cohen, I. (2006). Risk of Semi-Supervised Learning. In: Chapelle, O. Scholkopf, B. and Zien, A.Semi-Supervised Learning. The MIT Press. pp 57-72.

Friedman, N. (1998.) The Bayesian Structural EM algorithm. In Proc. 14th Conf. on Uncertainty in ArtificialIntelligence. Morgan Kaufmann, San Francisco, CA, pp. 129–138

Dempster A., Laird N. and Rubin D. (1977). Maximum Likelihood from Incomplete Data via the EM Algorithm.Journal of the Royal Statistical Society. Series B, 39(1): 1–38

Carreras X., Chao I., Padro L., Padro M. (2006). An Open-Source Suite of Language Analyzers. In Proceedings ofthe 4th Int. Conference on Language Resources and Evaluation, Vol. 10, pp. 239–342