
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 6818–6830, November 7–11, 2021. © 2021 Association for Computational Linguistics


Powering Comparative Classification with Sentiment Analysis via Domain Adaptive Knowledge Transfer

Zeyu Li, Yilong Qin*, Zihan Liu*, and Wei Wang
Department of Computer Science, University of California, Los Angeles
* These authors contributed equally.

{zyli,weiwang}@cs.ucla.edu, {louisqin,zihanliu}@ucla.edu

Abstract

We study Comparative Preference Classification (CPC), which aims at predicting whether a preference comparison exists between two entities in a given sentence and, if so, which entity is preferred over the other. High-quality CPC models can significantly benefit applications such as comparative question answering and review-based recommendation. Among the existing approaches, non-deep-learning methods suffer from inferior performance. The state-of-the-art graph neural network-based ED-GAT (Ma et al., 2020) only considers syntactic information while ignoring the critical semantic relations and the sentiments toward the compared entities. We propose the Sentiment Analysis Enhanced COmparative Network (SAECON), which improves CPC accuracy with a sentiment analyzer that learns sentiments toward individual entities via domain adaptive knowledge transfer. Experiments on the CompSent-19 dataset (Panchenko et al., 2019) show a significant improvement in F1 scores over the best existing CPC approaches.

1 Introduction

Comparative Preference Classification (CPC) is a natural language processing (NLP) task that predicts whether a preference comparison exists between two entities in a sentence and, if so, which entity wins the comparison. For example, given the sentence "Python is better suited for data analysis than MATLAB due to the many available deep learning libraries", a decisive comparison exists between Python and MATLAB, and Python is preferred over MATLAB in this context.

The CPC task can profoundly impact various real-world application scenarios. Search engine users may query not only factual questions but also comparative ones to meet their specific information needs (Gupta et al., 2017). Recommendation providers can analyze product reviews containing comparative statements to understand the advantages and disadvantages of a product compared with similar ones.

Several models have been proposed to solve this problem. Panchenko et al. (2019) first formalize the CPC problem, build and publish the CompSent-19 dataset, and experiment with numerous general machine learning models such as Support Vector Machines (SVM), representation-based classification, and XGBoost. However, these attempts treat CPC as sentence classification while ignoring the semantics and the contexts of the entities (Ma et al., 2020).

ED-GAT (Ma et al., 2020) marks the first entity-aware CPC approach; it captures long-distance syntactic relations between the entities of interest by applying graph attention networks (GAT) to dependency parsing graphs. However, we argue that the disadvantages of such an approach are clear. Firstly, ED-GAT replaces the entity names with "entityA" and "entityB" for simplicity and hence discards their semantics. Secondly, ED-GAT has a deep architecture with ten stacked GAT layers to tackle the long-distance issue between compared entities, and more GAT layers result in a heavier computational workload and reduced training stability. Thirdly, although the competing entities are typically connected via multiple hops of dependency relations, the unordered tokens along the connection path cannot capture either global or local high-quality semantic context features.

In this work, we propose the Sentiment Analysis Enhanced COmparative classification Network (SAECON), a CPC approach that considers not only syntactic but also semantic features of the entities. The semantic features here refer to the context of the entities, from which a sentiment analysis model can infer the sentiments toward the entities. Specifically, the encoded sentence and entities are fed into a dual-channel context feature extractor to learn the global and local context. In addition, an auxiliary Aspect-Based Sentiment Analysis (ABSA) module is integrated to learn the sentiments toward individual entities, which are greatly beneficial to the comparison classification.

ABSA aims to detect the specific emotional inclination toward an aspect within a sentence (Ma et al., 2018; Hu et al., 2019; Phan and Ogunbona, 2020; Chen and Qian, 2020; Wang et al., 2020). For example, the sentence "I liked the service and the staff but not the food" suggests positive sentiments toward service and staff but a negative one toward food. These aspect entities, such as service, staff, and food, are studied individually.

The well-studied ABSA approaches can benefit CPC when the compared entities in a CPC sentence are considered as the aspects in ABSA. Incorporating the individual sentiments learned by ABSA methods into CPC has several advantages. Firstly, for a comparison to hold, the preferred entity usually receives a positive sentiment while its rival gets a relatively negative one. These sentiments can be easily extracted by strong ABSA models, and the contrast between the sentiments assigned to the compared entities provides a vital clue for accurate CPC. Secondly, the ABSA models are designed to target the sentiments toward phrases, which bypasses the complicated and noisy syntactic relation path. Thirdly, considering the scarcity of CPC data resources, the abundant annotated data of ABSA can provide a sufficient supervision signal to improve the accuracy of CPC.

There is one challenge that blocks the knowledge transfer of sentiment analysis from the ABSA data to the CPC task: domain shift. Existing ABSA datasets are centered around specific topics such as restaurants and laptops, while the CPC data has mixed topics (Panchenko et al., 2019) that are all distant from restaurants. In other words, sentences in the ABSA and CPC datasets are drawn from different distributions, also known as domains. The difference in the distributions is referred to as a "domain shift" (Ganin and Lempitsky, 2015; He et al., 2018), and it is harmful to accurate knowledge transfer. To mitigate the domain shift, we design a domain adaptive layer to remove domain-specific features such as topics and preserve domain-invariant features such as the sentiments of the text, so that the sentiment analyzer can smoothly transfer knowledge from sentiment analysis to comparative classification.

2 Related Work

2.1 Comparative Preference Classification

CPC originates from the task of Comparative Sentence Identification (CSI) (Jindal and Liu, 2006). CSI aims to identify comparative sentences. Jindal and Liu (2006) approach this problem with Class Sequential Rules (CSR) and a Naive Bayesian classifier. Building upon CSI, Panchenko et al. (2019) propose the CPC task, release the CompSent-19 dataset, and conduct experimental studies using traditional machine learning approaches such as SVM, representation-based classification, and XGBoost. However, they neglect the entities in the comparative context (Panchenko et al., 2019). ED-GAT (Ma et al., 2020), a more recent work, uses the dependency graph to better recognize long-distance comparisons and avoid falsely identifying unrelated comparison predicates. However, it fails to capture semantic information of the entities, as they are replaced by "entityA" and "entityB". Furthermore, having multiple GAT layers severely increases training difficulty.

2.2 Aspect-Based Sentiment Analysis

ABSA derives from sentiment analysis (SA) and infers the sentiment associated with a specific entity in a sentence. Traditional approaches to ABSA utilize SVMs for classification (Kiritchenko et al., 2014; Wagner et al., 2014; Zhang et al., 2014), while neural network-based approaches employ variants of RNNs (Nguyen and Shirai, 2015; Aydin and Güngör, 2020), LSTMs (Tang et al., 2016; Wang et al., 2016; Bao et al., 2019), GATs (Wang et al., 2020), and GCNs (Pouran Ben Veyseh et al., 2020; Xu et al., 2020).

More recent works [1] widely use complex contextualized NLP models such as BERT (Devlin et al., 2019). Sun et al. (2019) transform ABSA into a question answering task by constructing auxiliary sentences. Phan and Ogunbona (2020) build a pipeline of aspect extraction and ABSA and use wide and concentrated features for sentiment classification. ABSA is related to CPC by nature: in general, entities with positive sentiments are preferred over the ones with neutral or negative sentiments. Therefore, the performance of a CPC model can be enhanced by ABSA techniques.

[1] Due to the limited space, we are unable to exhaustively cover all references. The works discussed are classic examples.


[Figure 1 (architecture diagram). Caption: Pipeline of SAECON. Sentences from two domains, shown in the gray box, are fed into the text encoder and the dependency parser. The resultant representations of the entities from the three substructures are distinguished by different colors (see the legend in the corner). "Cls." is short for classifier. The gray box shows a CPC example ("Python is better suited for data analysis than MATLAB …", entities Python and MATLAB) and an auxiliary ABSA example ("The fajitas were great to taste, but not to see.", aspect fajitas); their encodings pass through a BiLSTM, a syntactic GCN, and the sentiment analyzer with a GRL and domain classifier, whose softmax outputs feed the CPC, ABSA, and domain losses.]

3 SAECON

In this section, we first formalize the problem and then explain SAECON in detail. The pipeline of SAECON is depicted in Figure 1 with essential notations.

3.1 Problem Statement

CPC   Given a sentence s from the CPC corpus D_c with n tokens and two entities e_1 and e_2, a CPC model predicts whether there exists a preference comparison between e_1 and e_2 in s and, if so, which entity is preferred over the other. Potential results can be Better (e_1 wins), Worse (e_2 wins), or None (no comparison exists).

ABSA   Given a sentence s' from the ABSA corpus D_s with m tokens and one entity e', ABSA identifies the sentiment (positive, negative, or neutral) associated with e'.

We denote the CPC and ABSA datasets by D_c and D_s; they contain samples drawn from two source domains that are similar but differ in topics, which produces a domain shift. For simplicity in the later discussion, we use s to denote a sentence in D_c ∪ D_s and E to denote its entity set; |E| = 2 if s ∈ D_c and |E| = 1 otherwise.
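To make the two task settings concrete, the sketch below shows one plausible way to represent CPC and ABSA instances as plain Python data structures; the class and field names are our own and are not taken from the released implementation.

```python
from dataclasses import dataclass
from typing import List

# Illustrative containers for the two kinds of training instances.
# Field names are hypothetical; the released SAECON code may differ.

@dataclass
class CPCInstance:
    tokens: List[str]     # the n tokens of sentence s drawn from D_c
    entity1: str          # e_1, the entity that appears first
    entity2: str          # e_2, the entity that appears second
    label: str            # "BETTER" (e_1 wins), "WORSE" (e_2 wins), or "NONE"

@dataclass
class ABSAInstance:
    tokens: List[str]     # the m tokens of sentence s' drawn from D_s
    aspect: str           # the single queried entity e'
    sentiment: str        # "POS", "NEU", or "NEG"

# Example instances mirroring the running examples in the paper.
cpc = CPCInstance(
    tokens="Python is better suited for data analysis than MATLAB".split(),
    entity1="Python", entity2="MATLAB", label="BETTER",
)
absa = ABSAInstance(
    tokens="I liked the service but not the food".split(),
    aspect="food", sentiment="NEG",
)
```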

3.2 Text Feature Representations

A sentence is encoded by its word representations via a text encoder and parsed into a dependency graph via a dependency parser (Chen and Manning, 2014). A text encoder, such as GloVe (Pennington et al., 2014) or BERT (Devlin et al., 2019), maps a word w into a low-dimensional embedding w ∈ R^{d_0}. GloVe assigns a fixed vector, while BERT computes a token representation from its textual context [2]. The encoding output of s is denoted by S_0 = {w_1, ..., e_1, ..., e_2, ..., w_n}, where e_i denotes the embedding of entity i, w_i denotes the embedding of a non-entity word, and w_i, e_j ∈ R^{d_0}.

[2] BERT generates representations of wordpieces, which can be substrings of words. If a word is broken into wordpieces by the BERT tokenizer, the average of the wordpiece representations is taken as the word representation. The representations of BERT's special tokens, [CLS] and [SEP], are not used.

The dependency graph of s, denoted by G_s, is obtained by applying a dependency parser to s, such as the Stanford Parser (Chen and Manning, 2014) or spaCy (https://spacy.io). G_s is a syntactic view of s (Marcheggiani and Titov, 2017; Li et al., 2016) composed of vertices for words and directed edges for dependency relations. Advantageously, complex syntactic relations between distant words in the sentence can be easily detected with a small number of hops over dependency edges (Ma et al., 2020).
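As a rough illustration of how G_s can be built, the following sketch extracts labeled, directed dependency edges with spaCy (one of the parsers mentioned above); the model name "en_core_web_sm" and the inverse-label convention are our assumptions, not the paper's exact setup.

```python
import spacy

# Load a small English pipeline (assumed to be installed); any spaCy model works.
nlp = spacy.load("en_core_web_sm")

def dependency_edges(sentence: str):
    """Return labeled, directed edges (head_index, child_index, dep_label) of G_s."""
    doc = nlp(sentence)
    edges = []
    for token in doc:
        if token.head.i != token.i:                     # skip the root's self-loop
            edges.append((token.head.i, token.i, token.dep_))
            # SGCN (Section 3.3) also uses the reverse direction with an inverse label.
            edges.append((token.i, token.head.i, token.dep_ + "-inv"))
    return doc, edges

doc, edges = dependency_edges(
    "Python is better suited for data analysis than MATLAB."
)
print(edges[:4])
```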

3.3 Contextual Features for CPC

Global Semantic Context   To model a more extended context of the entities, we use a bidirectional LSTM (BiLSTM) to encode the entire sentence in both directions; bidirectional recurrent neural networks are widely used for extracting semantics (Li et al., 2019). Given the indices of e_1 and e_2 in s, the global context representations h_{g,1} and h_{g,2} are computed by averaging the hidden outputs from both directions:

$\overrightarrow{h}_{g,i},\ \overleftarrow{h}_{g,i} = \mathrm{BiLSTM}(S_0)[e_i.\mathrm{index}], \quad i = 1, 2,$

$h_{g,i} = \tfrac{1}{2}\big(\overrightarrow{h}_{g,i} + \overleftarrow{h}_{g,i}\big), \quad h_{g,i} \in \mathbb{R}^{d_g}.$
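A minimal PyTorch sketch of this global context extractor is shown below; the dimensions, batch handling, and variable names are illustrative only.

```python
import torch
import torch.nn as nn

# Run a BiLSTM over the encoded sentence S_0 and average the forward/backward
# hidden states at each entity position. d0 and dg are placeholder dimensions.
d0, dg = 768, 128
bilstm = nn.LSTM(input_size=d0, hidden_size=dg, bidirectional=True, batch_first=True)

def global_context(S0: torch.Tensor, entity_indices):
    """S0: (1, n, d0) word embeddings; entity_indices: positions of e_1 and e_2."""
    outputs, _ = bilstm(S0)                     # (1, n, 2 * dg)
    fwd, bwd = outputs[..., :dg], outputs[..., dg:]
    # h_{g,i} = 1/2 (forward + backward) at the entity's index
    return [0.5 * (fwd[0, i] + bwd[0, i]) for i in entity_indices]

S0 = torch.randn(1, 12, d0)                     # a toy 12-token sentence
h_g1, h_g2 = global_context(S0, entity_indices=[0, 8])
print(h_g1.shape)                               # torch.Size([128])
```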

Local Syntactic Context   In SAECON, we use a dependency graph to capture the syntactically neighboring context of entities, which contains words or phrases modifying the entities and indicates comparative preferences. We apply a Syntactic Graph Convolutional Network (SGCN) (Bastings et al., 2017; Marcheggiani and Titov, 2017) to G_s to compute the local context features h_{l,1} and h_{l,2} for e_1 and e_2, respectively. SGCN operates on directed dependency graphs with three major adjustments compared with GCN (Kipf and Welling, 2017): considering the directionality of edges, separating parameters for different dependency labels [4], and applying edge-wise gating to message passing.

[4] Labels are defined as the combinations of directions and dependency types. For example, edge ((u, v), nsubj) and edge ((v, u), nsubj^{-1}) have different labels.

GCN is a multilayer message-propagation-based graph neural network. Given a vertex v in G_s and its neighbors N(v), the vertex representation of v at the (j+1)-th layer is given as

$h_v^{(j+1)} = \rho\Big(\sum_{u \in \mathcal{N}(v)} \big(W^{(j)} h_u^{(j)} + b^{(j)}\big)\Big),$

where ρ(·) denotes an aggregation function such as mean or sum, W^{(j)} ∈ R^{d^{(j+1)} × d^{(j)}} and b^{(j)} ∈ R^{d^{(j+1)}} are trainable parameters, and d^{(j+1)} and d^{(j)} denote the latent feature dimensions of the (j+1)-th and the j-th layers, respectively.

SGCN improves GCN by considering different edge directions and diverse edge types, assigning different parameters to different directions or labels. However, there is one caveat: the directionality-based method cannot accommodate the rich edge type information, while the label-based method causes combinatorial over-parameterization, an increased risk of overfitting, and reduced efficiency. Therefore, we naturally arrive at a trade-off of using direction-specific weights and label-specific biases.

The edge-wise gating can select impactful neighbors by controlling the gates for message propagation through edges. The gate on the j-th layer for an edge between vertices u and v is defined as

$g_{uv}^{(j)} = \sigma\big(h_u^{(j)} \cdot \beta^{(j)}_{d_{uv}} + \gamma^{(j)}_{l_{uv}}\big), \quad g_{uv}^{(j)} \in \mathbb{R},$

where d_uv and l_uv denote the direction and label of edge (u, v), β^{(j)}_{d_uv} and γ^{(j)}_{l_uv} are trainable parameters, and σ(·) denotes the sigmoid function.

Summing up the aforementioned adjustments to GCN, the final vertex representation learning is

$h_v^{(j+1)} = \rho\Big(\sum_{u \in \mathcal{N}(v)} g_{uv}^{(j)} \big(W^{(j)}_{d_{uv}} h_u^{(j)} + b^{(j)}_{l_{uv}}\big)\Big).$

Vectors of S_0 serve as the input representations h_v^{(0)} to the first SGCN layer. The representations corresponding to e_1 and e_2 are the outputs {h_{l,1}, h_{l,2}} with dimension d_l.
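The following PyTorch sketch illustrates one SGCN layer with direction-specific weights, label-specific biases, and edge-wise gating as described above; the edge encoding, initialization, and the choice of ReLU for ρ are our assumptions, and the loop-based message passing is written for clarity rather than efficiency.

```python
import torch
import torch.nn as nn

class SGCNLayer(nn.Module):
    """Sketch of one syntactic GCN layer (direction-specific weights,
    label-specific biases, edge-wise gating); hyperparameters are illustrative."""

    def __init__(self, d_in: int, d_out: int, num_labels: int):
        super().__init__()
        # one weight matrix per edge direction (0: head->child, 1: child->head)
        self.W = nn.Parameter(torch.randn(2, d_out, d_in) * 0.01)
        # one bias vector per dependency label
        self.b = nn.Parameter(torch.zeros(num_labels, d_out))
        # gate parameters: beta per direction, gamma per label
        self.beta = nn.Parameter(torch.randn(2, d_in) * 0.01)
        self.gamma = nn.Parameter(torch.zeros(num_labels))

    def forward(self, h, edges):
        """h: (n, d_in) vertex features; edges: list of (u, v, direction, label_id)."""
        n, d_out = h.size(0), self.W.size(1)
        messages = [[] for _ in range(n)]
        for u, v, d, l in edges:
            gate = torch.sigmoid(h[u] @ self.beta[d] + self.gamma[l])   # g_uv
            messages[v].append(gate * (self.W[d] @ h[u] + self.b[l]))
        out = torch.stack([torch.stack(m).sum(0) if m else h.new_zeros(d_out)
                           for m in messages])
        return torch.relu(out)      # rho: nonlinearity over the summed messages

# Usage with toy values; the paper finds two SGCN layers to work best.
layer = SGCNLayer(d_in=768, d_out=128, num_labels=90)
h = torch.randn(12, 768)
edges = [(0, 1, 0, 3), (1, 0, 1, 4)]    # (u, v, direction, label_id)
h_next = layer(h, edges)
```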

3.4 Sentiment Analysis with Knowledge Transfer from ABSA

We have discussed in Section 1 that ABSA inherently correlates with the CPC task. Therefore, it is natural to incorporate a sentiment analyzer into SAECON as an auxiliary task to take advantage of the abundant training resources of ABSA and boost the performance on CPC. There are two paradigms for auxiliary tasks: (1) incorporating fixed parameters that are pretrained solely on the auxiliary dataset; (2) incorporating the architecture only, with untrained parameters, and jointly optimizing them from scratch with the main task simultaneously (Li et al., 2018; He et al., 2018; Wang and Pan, 2018).

Option (1) ignores the domain shift between D_c and D_s, which degrades the quality of the learned sentiment features since the domain identity information is noisy and unrelated to the CPC task. SAECON uses option (2). For a smooth and efficient knowledge transfer from D_s to D_c under option (2), the ideal sentiment analyzer extracts only the textual feature that is contingent on sentimental information but orthogonal to the identity of the source domain. In other words, the learned sentiment features are expected to be discriminative for sentiment analysis but invariant with respect to the domain shift. Therefore, the sentiment features are more aligned with the CPC domain, with reduced noise from the domain shift.

In SAECON, we use a gradient reversal layer (GRL) and a domain classifier (DC) (Ganin and Lempitsky, 2015) for domain adaptive sentiment feature learning that maintains both discriminativeness and domain invariance. GRL+DC is a straightforward, generic, and effective modification to neural networks for domain adaptation (Kamath et al., 2019; Gu et al., 2019; Belinkov et al., 2019; Li et al., 2018). It can effectively close the shift between complex distributions (Ganin and Lempitsky, 2015) such as those of D_c and D_s.

Let A denote the sentiment analyzer, which alternately learns sentiment information from D_s and provides sentiment clues for the compared entities in D_c. Specifically, each CPC instance is split into two ABSA samples with the same text before being fed into A (see "Split to 2" in Figure 1): one takes e_1 as the queried aspect and the other takes e_2.

$A(S_0, G_s, E) = \begin{cases} \{h_{s,1},\, h_{s,2}\} & \text{if } s \in D_c, \\ h_s & \text{if } s \in D_s, \end{cases}$

where h_{s,1}, h_{s,2}, and h_s ∈ R^{d_s}. These outputs are later sent not only to the CPC and ABSA predictors shown in Figure 1 but also, through a GRL, to the DC, which predicts the source domain y_d of s, where y_d = 1 if s ∈ D_s and 0 otherwise. The GRL is compatible with standard backpropagation: it is transparent in the forward pass (GRL_α(x) = x) and reverses the gradients in the backward pass as

$\frac{\partial\, \mathrm{GRL}_\alpha}{\partial x} = -\alpha I.$

Here x is the input to the GRL, α is a hyperparameter, and I is an identity matrix. During training, the reversed gradients maximize the domain loss, forcing A to forget the domain identity via backpropagation and mitigating the domain shift. Therefore, the outputs of A stay invariant to the domain shift. But as the outputs of A are also optimized for ABSA predictions, their distinctiveness with respect to sentiment classification is retained.
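A common way to realize such a layer is a custom autograd function; the sketch below follows the standard GRL recipe of Ganin and Lempitsky (2015) and is not taken from the SAECON code.

```python
import torch

class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; gradients multiplied by -alpha backward."""

    @staticmethod
    def forward(ctx, x, alpha: float):
        ctx.alpha = alpha
        return x.view_as(x)                  # GRL_alpha(x) = x

    @staticmethod
    def backward(ctx, grad_output):
        # d GRL_alpha / dx = -alpha * I, so the incoming gradient is flipped.
        return -ctx.alpha * grad_output, None

def grl(x, alpha: float = 1.0):
    return GradientReversal.apply(x, alpha)

# Usage: features from the analyzer A pass through grl(...) before the domain
# classifier, so minimizing the domain loss w.r.t. A's parameters actually
# maximizes it, pushing A toward domain-invariant features.
feats = torch.randn(4, 64, requires_grad=True)
out = grl(feats, alpha=0.5).sum()
out.backward()
print(feats.grad[0, :3])                     # gradients are reversed and scaled by 0.5
```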

Finally, the selection of A is flexible as it is architecture-agnostic. In this paper, we use the LCF-ASC aspect-based sentiment analyzer proposed by Phan and Ogunbona (2020), in which two scales of representations are concatenated to learn the sentiments toward the entities of interest.

3.5 Objective and Optimization

SAECON optimizes three classification errors overall, for CPC, ABSA, and domain classification. For the CPC task, the features for local context, global context, and sentiment are concatenated: h_{e_i} = [h_{g,i}; h_{l,i}; h_{s,i}], i ∈ {1, 2}, with h_{e_i} ∈ R^{d_s + d_g + d_l}. With F_c, F_s, F_d, and F below denoting fully connected neural networks with non-linear activation layers, the CPC, ABSA, and domain predictions are obtained by

$y_c = \delta\big(F_c([F(h_{e_1});\, F(h_{e_2})])\big) \quad \text{(CPC only)},$

$y_s = \delta\big(F_s(h_s)\big) \quad \text{(ABSA only)},$

$y_d = \delta\big(F_d(\mathrm{GRL}(A(S_0, G_s, E)))\big) \quad \text{(both tasks)},$

where δ denotes the softmax function. With the predictions, SAECON computes the cross-entropy losses for the three tasks as L_c, L_s, and L_d, respectively; the label for L_d is y_d. The computations of the losses are omitted due to the space limit.
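To make the CPC prediction head concrete, the sketch below mirrors the first equation above (the ABSA and domain heads are analogous); the layer sizes and the use of single linear layers are placeholders, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

# Placeholder dimensions: d_e = d_s + d_g + d_l is the concatenated entity feature.
d_e = 64 + 128 + 128
F   = nn.Sequential(nn.Linear(d_e, 128), nn.ReLU())   # shared projection F
F_c = nn.Linear(2 * 128, 3)                            # Better / Worse / None

def cpc_probs(h_e1: torch.Tensor, h_e2: torch.Tensor) -> torch.Tensor:
    """y_c = softmax(F_c([F(h_e1); F(h_e2)]))."""
    logits = F_c(torch.cat([F(h_e1), F(h_e2)], dim=-1))
    return torch.softmax(logits, dim=-1)

h_e1, h_e2 = torch.randn(1, d_e), torch.randn(1, d_e)
print(cpc_probs(h_e1, h_e2))
```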

In summary, the objective function of the proposed model SAECON is given as follows:

$\mathcal{L} = \mathcal{L}_c + \lambda_s \mathcal{L}_s + \lambda_d \mathcal{L}_d + \lambda \cdot \mathrm{reg}(L_2),$

where λ_s and λ_d are the weights of the two auxiliary losses, and λ is the weight of an L2 regularization term; we denote λ = {λ_s, λ_d, λ}. In the actual training, we separate the iterations over CPC data and ABSA data and input batches from the two domains alternately. Alternating inputs ensure that the DC receives batches with different labels evenly and avoid overfitting to either domain label. A stochastic gradient descent-based optimizer, Adam (Kingma and Ba, 2015), is leveraged to optimize the parameters of SAECON. Algorithm 1 in Section A.1 explains the alternating training paradigm in detail.
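The sketch below outlines one way to implement the alternating training cycle with a 1:1 batch ratio; the model interface, loader names, batch fields, and loss weights are assumptions for illustration (the L2 term can be handled via the optimizer's weight decay).

```python
import itertools
import torch.nn as nn

ce = nn.CrossEntropyLoss()

def train_epoch(model, cpc_loader, absa_loader, optimizer,
                lambda_s: float = 1.0, lambda_d: float = 0.1):
    """One epoch of alternating training: a CPC batch, then an ABSA batch."""
    for cpc_batch, absa_batch in zip(cpc_loader, itertools.cycle(absa_loader)):
        for batch, is_cpc in ((cpc_batch, True), (absa_batch, False)):
            optimizer.zero_grad()
            out = model(batch, is_cpc=is_cpc)          # assumed forward interface
            # The domain loss is computed for both sources; its gradient reaches
            # the analyzer A through the GRL, so it is effectively maximized w.r.t. A.
            loss = lambda_d * ce(out["domain_logits"], batch["domain_label"])
            if is_cpc:
                loss = loss + ce(out["cpc_logits"], batch["cpc_label"])
            else:
                loss = loss + lambda_s * ce(out["absa_logits"], batch["sentiment_label"])
            loss.backward()
            optimizer.step()
```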

4 Experiments

4.1 Experimental Settings

Dataset   CompSent-19 is the first public dataset for the CPC task, released by Panchenko et al. (2019). It contains sentences with entity annotations. The ground truth is obtained by comparing the entity that appears earlier in the sentence (e_1) with the one that appears later (e_2). The dataset is split by convention (Panchenko et al., 2019; Ma et al., 2020): 80% for training and 20% for testing. During training, 20% of the training data of each label composes the development set for model selection. The detailed statistics are given in Table 1.

Three restaurant datasets released in SemEval 2014, 2015, and 2016 (Pontiki et al., 2014, 2015; Xenos et al., 2016) are utilized for the ABSA task. We join their training sets and randomly sample instances into batches to optimize the auxiliary objective L_s. The proportions of POS, NEU, and NEG instances are 65.8%, 11.0%, and 23.2%.

Note.   The rigorous definition of None in CompSent-19 is that the sentence does not contain a comparison between the entities, rather than that the entities are both preferred or disliked. Although the two definitions are not mutually exclusive, we would like to provide a clearer background of the CPC problem.

Dataset         | Better      | Worse       | None        | Total
Train           | 872 (19%)   | 379 (8%)    | 3,355 (73%) | 4,606
Development     | 219 (19%)   | 95 (8%)     | 839 (73%)   | 1,153
Test            | 273 (19%)   | 119 (8%)    | 1,048 (73%) | 1,440
Total           | 1,346 (19%) | 593 (8%)    | 5,242 (73%) | 7,199
Flipping labels | 1,251 (21%) | 1,251 (21%) | 3,355 (58%) | 5,857
Upsampling      | 3,355 (33%) | 3,355 (33%) | 3,355 (33%) | 10,065

Table 1: Statistics of CompSent-19. The rows Flipping labels and Upsampling show the sizes of the datasets augmented to mitigate label imbalance.

Imbalanced Data   CompSent-19 is badly imbalanced (see Table 1): None instances dominate the dataset, and the other two labels combined account for only 27%. This critical issue can impair model performance. Three methods to alleviate the imbalance are tested. Flipping labels: considering the order of the entities, an original Better instance becomes a Worse one (and vice versa) when querying (e_2, e_1) instead; we interchange e_1 and e_2 of all Better and Worse samples so that the two labels have the same number of instances. Upsampling: we upsample Better and Worse instances by duplication to the same amount as None. Weighted loss: we upweight the underpopulated labels Better and Worse when computing the classification loss. Their effects are discussed in Section 4.2.
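The sketch below illustrates two of these treatments, label flipping and the class-weighted cross-entropy loss; the label encoding and the weight values are placeholders (the paper selects its weights by grid search).

```python
import torch
import torch.nn as nn

LABELS = {"BETTER": 0, "WORSE": 1, "NONE": 2}   # our own encoding

# Weighted loss: upweight the underpopulated BETTER and WORSE classes
# (weights ordered as BETTER, WORSE, NONE; the values here are illustrative).
class_weights = torch.tensor([3.0, 6.0, 1.0])
weighted_ce = nn.CrossEntropyLoss(weight=class_weights)

def flip(instance: dict) -> dict:
    """Flipping labels: swapping (e1, e2) turns BETTER into WORSE and vice versa."""
    flipped = dict(instance)
    flipped["entity1"], flipped["entity2"] = instance["entity2"], instance["entity1"]
    if instance["label"] in ("BETTER", "WORSE"):
        flipped["label"] = "WORSE" if instance["label"] == "BETTER" else "BETTER"
    return flipped

print(flip({"entity1": "Python", "entity2": "MATLAB", "label": "BETTER"}))
```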

Evaluation Metric   The F1 score of each label and the micro-averaged F1 score are reported for comparison; we denote them by F1(B), F1(W), F1(N), and micro-F1. The micro-F1 scores on the development set are used as the criterion to pick the best model over training epochs, and the corresponding test performances are reported.
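A minimal sketch of the reported metrics using scikit-learn; the integer label encoding and the toy predictions are our own.

```python
from sklearn.metrics import f1_score

# Label encoding (0 = Better, 1 = Worse, 2 = None) is our own convention.
y_true = [2, 0, 2, 1, 2, 0, 1]
y_pred = [2, 0, 2, 1, 2, 1, 1]

f1_b, f1_w, f1_n = f1_score(y_true, y_pred, labels=[0, 1, 2], average=None)
micro_f1 = f1_score(y_true, y_pred, average="micro")
print(f"F1(B)={f1_b:.2f}  F1(W)={f1_w:.2f}  F1(N)={f1_n:.2f}  micro-F1={micro_f1:.2f}")
```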

Reproducibility   The implementation of SAECON is publicly available on GitHub at https://github.com/zyli93/SAECON. Details for reproduction are given in Section A.2.

Baseline Models   Seven models evaluated in Panchenko et al. (2019) and the state-of-the-art ED-GAT (Ma et al., 2020) are considered for performance comparison; they are described in Section A.3. Fixed BERT embeddings are used in our experiments, the same as in ED-GAT, for a fair comparison.

4.2 Performances on CPC

Comparing with Baselines   We report the best performances of the baselines and SAECON in Table 2. SAECON with BERT embeddings achieves the highest F1 scores compared with all baselines, which demonstrates its superior ability to accurately classify entity comparisons. The F1 scores for None, i.e., F1(N), are consistently the highest in all rows due to the data imbalance, where None accounts for the largest percentage. The Worse class is the smallest and thus the hardest to predict precisely. This also explains why models with higher micro-F1, discussed later, usually achieve larger F1(W), given that their accuracy on the majority class (None) is almost identical. BERT-based models outperform GloVe-based ones, indicating the advantage of contextualized embeddings.

In the later discussion, the reported performances of SAECON and its variants are based on the BERT version. The performances of the GloVe-based SAECON demonstrate similar trends.

Model     | Micro. | F1(B) | F1(W) | F1(N)
Majority  | 68.95  | 0.0   | 0.0   | 81.62
SE-Lin    | 79.31  | 62.71 | 37.61 | 88.42
SE-XGB    | 85.00  | 75.00 | 43.00 | 92.00
SVM-Tree  | 68.12  | 53.35 | 13.90 | 78.13
BERT-CLS  | 83.12  | 69.62 | 50.37 | 89.84
AvgWE-G   | 76.32  | 48.28 | 20.12 | 86.34
AvgWE-B   | 77.64  | 53.94 | 26.88 | 87.47
ED-GAT-G  | 82.73  | 70.23 | 43.30 | 89.84
ED-GAT-B  | 85.42  | 71.65 | 47.29 | 92.34
SAECON-G  | 83.78  | 71.06 | 45.90 | 91.05
SAECON-B  | 86.74  | 77.10 | 54.08 | 92.64

Table 2: Performance comparisons between the proposed model and baselines on F1 scores (%). "-B" and "-G" denote versions of a model using BERT (Devlin et al., 2019) and GloVe (Pennington et al., 2014) as the input embeddings, respectively. All reported improvements over the best baselines are statistically significant with p-value < 0.01.

Ablation Studies   Ablation studies demonstrate the unique contribution of each part of the proposed model. Here we verify the contributions of the following modules: (1) the bidirectional global context extractor (BiLSTM); (2) the syntactic local context extractor (SGCN); (3) the domain adaptation modules of A (GRL); (4) the entire auxiliary sentiment analyzer, including its dependent GRL+DC (A+GRL for short). The results are presented in Table 3.

Four points are worth noting. Firstly, SAECON with all modules achieves the best performance on three out of four metrics, demonstrating the effectiveness of all modules (SAECON vs. the rest). Secondly, the synergy of A and GRL improves the performance (−(A+GRL) vs. SAECON), whereas A without domain adaptation hurts the classification accuracy instead (−GRL vs. SAECON), which indicates that the auxiliary sentiment analyzer benefits CPC accuracy only with the assistance of the GRL+DC modules. Thirdly, removing the global context causes the largest performance deterioration (−BiLSTM vs. SAECON), showing the significance of long-term information; this observation is consistent with the findings of Ma et al. (2020), in which eight to ten stacked GAT layers are used for global feature learning. Finally, the performance also drops after removing the SGCN (−SGCN vs. SAECON), but the drop is smaller than for removing the BiLSTM; therefore, the local context plays a less important role than the global context (−SGCN vs. −BiLSTM).

Variants  | Micro. | F1(B) | F1(W) | F1(N)
SAECON    | 86.74  | 77.10 | 54.08 | 92.64
−BiLSTM   | 85.21  | 72.94 | 43.86 | 92.63
−SGCN     | 86.53  | 76.22 | 51.38 | 92.24
−GRL      | 86.53  | 76.16 | 49.77 | 92.93
−(A+GRL)  | 85.97  | 74.82 | 52.44 | 92.45

Table 3: Ablation studies on F1 scores (%) between SAECON and its variants with modules disabled. "−X" denotes a variant without module X. Removing A also removes GRL+DC.

Hyperparameter Searching   We demonstrate the influences of several key hyperparameters, including the initial learning rate (LR, η), the feature dimensions (d = {d_g, d_l, d_s}), the regularization weight λ, and the configurations of SGCN such as directionality, gating, and the number of layers.

For LR, d, and λ in Figures 2a, 2b, and 2c, we can observe a single peak for F1(W) (green curves) and fluctuating F1 scores for the other labels and the micro-F1 (blue curves). In addition, the peaks of micro-F1 occur at the same positions as those of F1(W), indicating that the performance on Worse is the most influential factor for micro-F1. These observations help us locate the optimal settings and also show the strong learning stability of SAECON.

[Figure 2 (four line plots). Caption: Searching and sensitivity for four key hyperparameters of SAECON in F1 scores. (a) F1 vs. learning rate (η); (b) F1 vs. feature dimensions (d); (c) F1 vs. regularization weight (λ); (d) F1 vs. number of SGCN layers. Each panel plots Micro-F1, F1(B), F1(W), and F1(N).]

Figure 2d focuses on the effect of the number of SGCN layers. We observe clear oscillations on F1(W) and find the best scores at two layers. More GCN layers result in oversmoothing (Kipf and Welling, 2017) and severely degrade the accuracy, which is eased but not entirely fixed by the gating mechanism. Therefore, the performance drops slightly at larger layer numbers.

Table 4 shows the impact of directionality and gating. Turning off either the directionality or the gating mechanism ("✗✓" or "✓✗") leads to degraded F1 scores. SGCN without either modification ("✗✗") drops to the poorest micro-F1 and F1(W); although its F1(N) is the highest, we hardly consider that a good sign. Overall, the benefits of directionality and gating are verified.

Directed | Gating | Micro. | F1(B) | F1(W) | F1(N)
✓        | ✓      | 86.74  | 77.10 | 54.08 | 92.64
✗        | ✓      | 86.18  | 75.72 | 49.78 | 92.40
✓        | ✗      | 85.35  | 74.03 | 43.27 | 92.34
✗        | ✗      | 85.35  | 73.39 | 35.78 | 93.04

Table 4: Searching and sensitivity for the directionality and gating of SGCN by F1 scores (%).

Alleviating Data Imbalance   The label imbalance severely impairs the model performance, especially on the most underpopulated label, Worse.


The aforementioned imbalance alleviation methods are tested in Table 5. The Original (OR) row is a control experiment using the raw CompSent-19 without any weighting or augmentation.

The optimal solution is the weighted loss (WL vs. the rest). One interesting observation is that data augmentation such as flipping labels and upsampling does not provide a performance gain (OR vs. FL and OR vs. UP). The weighted loss performs slightly worse on F1(N) but consistently better on the other metrics, especially on Worse, indicating that it effectively alleviates the imbalance issue. In practice, static weights found via grid search are assigned to the different labels when computing the cross-entropy loss. We leave the exploration of dynamic weighting methods such as the Focal Loss (Lin et al., 2017) for future work.

Methods              | Micro. | F1(B) | F1(W) | F1(N)
Weighted loss (WL)   | 86.74  | 77.10 | 54.08 | 92.64
Original (OR)        | 85.97  | 73.80 | 46.15 | 92.90
Flipping labels (FL) | 84.93  | 73.07 | 42.45 | 91.99
Upsampling (UP)      | 85.83  | 73.11 | 46.36 | 92.95

Table 5: Performance analysis with F1 scores (%) for different methods to mitigate data imbalance.

Alternative Training   One novelty of SAECON is the alternating training that allows the sentiment analyzer to learn both tasks across domains. Here we analyze the impacts of different batch ratios (BR) and different domain shift handling methods during training. The BR controls the ratio of batches of the two alternating tasks in each training cycle; for example, a BR of 2:3 sends 2 CPC batches followed by 3 ABSA batches in each iteration.

Figure 3a presents the total training time for ten epochs with different BR values; a larger BR takes a shorter time. For example, a BR of 1:1 (the leftmost bar) takes a shorter time than 1:5 (the yellow bar). Figure 3b presents the micro-F1 scores for different BR values. We observe two points: (1) the reported performances differ only slightly; (2) generally, the performance is better when there are fewer CPC batches than ABSA ones. Overall, the hyperparameter selection tries to find a "sweet spot" between effectiveness and efficiency, which points to a BR of 1:1.

Figure 4 depicts the performance comparison of SAECON (green bars), SAECON−GRL (the "−GRL" variant in Table 3, orange bars), and SAECON with pretrained and fixed parameters of A (option (1) mentioned in Section 3.4, blue bars).

[Figure 3 (two 3D bar charts over the numbers of CPC and ABSA batches). Caption: Analyses for the batch ratio (BR). (a) BR vs. training time (s); (b) BR vs. micro-F1. The plotted micro-F1 values are the actual numbers minus 0.85 for the convenience of visualization.]

The three settings represent different levels of domain shift mitigation: the pretrained and fixed A does not handle the domain shift at all; the variant −GRL only attempts to handle the shift implicitly, relying on the alternating training over the two tasks to converge in the middle, although the domain difference can be harmful to both objectives; SAECON, instead, explicitly uses GRL+DC to mitigate the domain shift between D_s and D_c during training.

As a result, SAECON achieves the best performance, especially on F1(W); −GRL comes second, and option (1) is the worst. These results demonstrate that (1) the alternating training (blue vs. green) is necessary for effective domain adaptation, and (2) there exists a positive correlation between the level of domain shift mitigation and the model performance, especially on F1(W) and F1(B). Better domain adaptation produces higher F1 scores in scenarios where datasets in the domain of interest, i.e., CPC, are unavailable.

[Figure 4 (grouped bar chart of F1 scores). Caption: Visualization of the domain shift mitigation. For each of Micro-F1, F1(B), F1(W), and F1(N), the bars compare the pretrained & fixed A, SAECON−GRL, and SAECON.]

4.3 Case Study

In this section, we qualitatively exemplify the contribution of the sentiment analyzer A. Table 6 reports four example sentences from the test set of CompSent-19. The entities e_1 and e_2 are highlighted together with the corresponding sentiment predicted by A. The column "Label" shows the ground truth of CPC. The "∆" column reports the sentiment distance between the entities: we assign +1, 0, and −1 to the sentiment polarities POS, NEU, and NEG, respectively, and ∆ is the sentiment polarity of e_1 minus that of e_2. Therefore, a positive distance suggests that e_1 receives a more positive sentiment from A than e_2, and vice versa.

CPC sentences with sentiment predictions by A | Label | ∆
S1: This is all done via the gigabit [Ethernet:POS] interface, rather than the much slower [USB:NEG] interface. | Better | +2
S2: Also, [Bash:NEG] may not be the best language to do arithmetic heavy operations in something like [Python:NEU] might be a better choice. | Worse | −1
S3: It shows how [JavaScript:POS] and [PHP:POS] can be used in tandem to make a user's experience faster and more pleasant. | None | 0
S4: He broke his hand against [Georgia Tech:NEU] and made it worse playing against [Virginia Tech:NEU]. | None | 0

Table 6: Case studies for the effect of the sentiment analyzer A (see Section 4.3 for details).

In S1, the sentiments toward Ethernet and USB are predicted positive and negative, respectively, which correctly implies the comparative label Better. S2 is a Worse sentence with Bash predicted negative, Python predicted neutral, and a resultant negative sentiment distance of −1. For S3 and S4, the entities are assigned the same polarities, so the sentiment distances are both zero; we can easily tell that preference comparisons do not exist, which is consistent with the ground truth labels. Due to the limited space, more interesting case studies are presented in Section A.4.
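A small sketch of the sentiment-distance heuristic used in this case study; the function name and encoding are illustrative.

```python
# Map predicted polarities to scores and compare the two entities.
POLARITY_SCORE = {"POS": 1, "NEU": 0, "NEG": -1}

def sentiment_distance(polarity_e1: str, polarity_e2: str) -> int:
    """Delta = score(e1) - score(e2); positive suggests e1 looks preferred."""
    return POLARITY_SCORE[polarity_e1] - POLARITY_SCORE[polarity_e2]

print(sentiment_distance("POS", "NEG"))   # S1: +2, consistent with Better
print(sentiment_distance("NEG", "NEU"))   # S2: -1, consistent with Worse
print(sentiment_distance("NEU", "NEU"))   # S4:  0, consistent with None
```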

5 Conclusion

This paper proposes SAECON, a CPC model that incorporates a sentiment analyzer to transfer knowledge from ABSA corpora. Specifically, SAECON utilizes a BiLSTM to learn global comparative features, a syntactic GCN to learn local syntactic information, and a domain adaptive auxiliary sentiment analyzer that jointly learns from ABSA and CPC corpora for a smooth knowledge transfer. An alternating joint training scheme enables efficient and effective information transfer. Qualitative and quantitative experiments verify the superior performance of SAECON. For future work, we will focus on a deeper understanding of CPC data augmentation and an exploration of loss weighting methods for data imbalance.

Acknowledgement

We would like to thank the reviewers for their helpful comments. The work was partially supported by NSF DGE-1829071 and NSF IIS-2106859.

Broader Impact Statement

This section states the broader impact of the proposed CPC model. Our model is designed specifically for comparative classification scenarios in NLP. Users can use our model to detect whether a comparison exists between two entities of interest within the sentences of a particular corpus. For example, a recommender system equipped with our model can tell whether a product is compared with a competitor and, further, which is preferred. In addition, review-based platforms can utilize our model to decide which items are widely welcomed and which are not.

Our model, SAECON, is tested with the CompSent-19 dataset, which was anonymized during collection and annotation. For the auxiliary ABSA task, we also use anonymized public datasets from SemEval 2014 to 2016. Therefore, our model will not cause potential leakages of user identity or privacy. We would like to call for attention to CPC, as it is a relatively new task in NLP research.

References

Cem Rifki Aydin and Tunga Güngör. 2020. Combination of recursive and recurrent neural networks for aspect-based sentiment analysis using inter-aspect relations. IEEE Access.

Lingxian Bao, Patrik Lambert, and Toni Badia. 2019. Attention and lexicon regularized LSTM for aspect-based sentiment analysis. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop.

Jasmijn Bastings, Ivan Titov, Wilker Aziz, Diego Marcheggiani, and Khalil Sima'an. 2017. Graph convolutional encoders for syntax-aware neural machine translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.

Yonatan Belinkov, Adam Poliak, Stuart Shieber, Benjamin Van Durme, and Alexander Rush. 2019. On adversarial removal of hypothesis-only bias in natural language inference. In Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics.

Samuel R. Bowman, Gabor Angeli, Christopher Potts, and Christopher D. Manning. 2015. A large annotated corpus for learning natural language inference. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing.

Danqi Chen and Christopher D. Manning. 2014. A fast and accurate dependency parser using neural networks. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing.

Zhuang Chen and Tieyun Qian. 2020. Relation-aware collaborative learning for unified aspect-based sentiment analysis. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

Alexis Conneau, Douwe Kiela, Holger Schwenk, Loïc Barrault, and Antoine Bordes. 2017. Supervised learning of universal sentence representations from natural language inference data. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

Yaroslav Ganin and Victor Lempitsky. 2015. Unsupervised domain adaptation by backpropagation. In International Conference on Machine Learning. PMLR.

Shuhao Gu, Yang Feng, and Qun Liu. 2019. Improving domain adaptation translation with domain invariant and specific information. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

Samir Gupta, A. S. M. Ashique Mahmood, Karen Ross, Cathy H. Wu, and K. Vijay-Shanker. 2017. Identifying comparative structures in biomedical text. In BioNLP 2017.

Ruidan He, Wee Sun Lee, Hwee Tou Ng, and Daniel Dahlmeier. 2018. Adaptive semi-supervised learning for cross-domain sentiment classification. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.

Minghao Hu, Yuxing Peng, Zhen Huang, Dongsheng Li, and Yiwei Lv. 2019. Open-domain targeted sentiment analysis via span-based extraction and classification. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.

Nitin Jindal and Bing Liu. 2006. Identifying comparative sentences in text documents. In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.

Anush Kamath, Sparsh Gupta, and Vitor Carvalho. 2019. Reversing gradients in adversarial domain adaptation for question deduplication and textual entailment tasks. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.

Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations.

Thomas N. Kipf and Max Welling. 2017. Semi-supervised classification with graph convolutional networks. In 5th International Conference on Learning Representations.

Svetlana Kiritchenko, Xiaodan Zhu, Colin Cherry, and Saif Mohammad. 2014. NRC-Canada-2014: Detecting aspects and sentiment in customer reviews. In Proceedings of the 8th International Workshop on Semantic Evaluation.

Yujia Li, Daniel Tarlow, Marc Brockschmidt, and Richard S. Zemel. 2016. Gated graph sequence neural networks. In 4th International Conference on Learning Representations.

Zeyu Li, Jyun-Yu Jiang, Yizhou Sun, and Wei Wang. 2019. Personalized question routing via heterogeneous network embedding. In Proceedings of the AAAI Conference on Artificial Intelligence.

Zheng Li, Ying Wei, Yu Zhang, and Qiang Yang. 2018. Hierarchical attention transfer network for cross-domain sentiment classification. In Proceedings of the AAAI Conference on Artificial Intelligence.

Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. 2017. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision.

Nianzu Ma, Sahisnu Mazumder, Hao Wang, and Bing Liu. 2020. Entity-aware dependency-based deep graph attention network for comparative preference classification. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

Yukun Ma, Haiyun Peng, and Erik Cambria. 2018. Targeted aspect-based sentiment analysis via embedding commonsense knowledge into an attentive LSTM. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence.

Diego Marcheggiani and Ivan Titov. 2017. Encoding sentences with graph convolutional networks for semantic role labeling. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.

Thien Hai Nguyen and Kiyoaki Shirai. 2015. PhraseRNN: Phrase recursive neural network for aspect-based sentiment analysis. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing.

Alexander Panchenko, Alexander Bondarenko, Mirco Franzek, Matthias Hagen, and Chris Biemann. 2019. Categorizing comparative sentences. In Proceedings of the 6th Workshop on Argument Mining.

Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing.

Minh Hieu Phan and Philip O. Ogunbona. 2020. Modelling context and syntactical features for aspect-based sentiment analysis. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

Maria Pontiki, Dimitris Galanis, Haris Papageorgiou, Suresh Manandhar, and Ion Androutsopoulos. 2015. SemEval-2015 task 12: Aspect based sentiment analysis. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015).

Maria Pontiki, Dimitris Galanis, John Pavlopoulos, Harris Papageorgiou, Ion Androutsopoulos, and Suresh Manandhar. 2014. SemEval-2014 task 4: Aspect based sentiment analysis. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014).

Amir Pouran Ben Veyseh, Nasim Nouri, Franck Dernoncourt, Quan Hung Tran, Dejing Dou, and Thien Huu Nguyen. 2020. Improving aspect-based sentiment analysis with gated graph convolutional networks and syntax-based regulation. In Findings of the Association for Computational Linguistics: EMNLP 2020.

Chi Sun, Luyao Huang, and Xipeng Qiu. 2019. Utilizing BERT for aspect-based sentiment analysis via constructing auxiliary sentence. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

Duyu Tang, Bing Qin, Xiaocheng Feng, and Ting Liu. 2016. Effective LSTMs for target-dependent sentiment classification. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics.

Maksim Tkachenko and Hady Lauw. 2015. A convolution kernel approach to identifying comparisons in text. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing.

Joachim Wagner, Piyush Arora, Santiago Cortes, Utsab Barman, Dasha Bogdanova, Jennifer Foster, and Lamia Tounsi. 2014. DCU: Aspect-based polarity classification for SemEval task 4. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014).

Kai Wang, Weizhou Shen, Yunyi Yang, Xiaojun Quan, and Rui Wang. 2020. Relational graph attention network for aspect-based sentiment analysis. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

Wenya Wang and Sinno Jialin Pan. 2018. Recursive neural structural correspondence network for cross-domain aspect and opinion co-extraction. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics.

Yequan Wang, Minlie Huang, Xiaoyan Zhu, and Li Zhao. 2016. Attention-based LSTM for aspect-level sentiment classification. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.

Dionysios Xenos, Panagiotis Theodorakakos, John Pavlopoulos, Prodromos Malakasiotis, and Ion Androutsopoulos. 2016. AUEB-ABSA at SemEval-2016 task 5: Ensembles of classifiers and embeddings for aspect based sentiment analysis. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016).

Kuanhong Xu, Hui Zhao, and Tianwen Liu. 2020. Aspect-specific heterogeneous graph convolutional network for aspect-based sentiment classification. IEEE Access.

Fangxi Zhang, Zhihua Zhang, and Man Lan. 2014. ECNU: A combination method and multiple features for aspect extraction and sentiment polarity classification. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014).

A Supplementary Materials

This section contains the supplementary materials for Powering Comparative Classification with Sentiment Analysis via Domain Adaptive Knowledge Transfer. Here we provide additional supporting information in four aspects: an additional description of the training procedure, the reproducibility details of SAECON, brief introductions of the baselines, and additional case studies.

A.1 Pseudocode of SAECON

In this section, we show the pseudocode of the training of SAECON to provide a comprehensive picture of the alternating training paradigm.

Algorithm 1: Optimization of SAECON with two alternating tasks

Input: loss weights λ; learning rate η
Data: Dc (CPC) and Ds (ABSA)

while not converged do
    {s, E}, task ← getAltSample(Dc, Ds)
    S0 ← TextEncode(s)
    Gs ← DepParse(s)
    if task is CPC then
        {hg,i}, {hl,i} (i = 1, 2) ← methods in Section 3.3
        hs,1, hs,2 ← A(S0, Gs, E)
        Lc, Ld ← methods in Section 3.5
        optimize({Lc, λd Ld, λreg(L2)}, η)
    else    // sentiment analysis
        hs ← A(S0, Gs, E)
        Ls, Ld ← methods in Section 3.5
        optimize({λs Ls, λd Ld, λreg(L2)}, η)
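To make the control flow concrete, the following is a minimal, self-contained PyTorch sketch of the alternating optimization. The toy modules and names (ToySAECON, cpc_head, absa_head, domain_head, the random data) are illustrative assumptions; only the alternation between CPC and ABSA batches, the shared domain loss, and the weighted loss sums mirror Algorithm 1. The gradient reversal on the domain branch is omitted here for brevity.

import torch
import torch.nn as nn

class ToySAECON(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.encoder = nn.Linear(100, dim)    # stands in for the real encoder + BiLSTM/SGCN
        self.cpc_head = nn.Linear(dim, 3)     # Better / Worse / None
        self.absa_head = nn.Linear(dim, 3)    # POS / NEU / NEG (analyzer A)
        self.domain_head = nn.Linear(dim, 2)  # CPC corpus vs. ABSA corpus

    def forward(self, x):
        return self.encoder(x)

def train(model, cpc_batches, absa_batches, lam_s=1.0, lam_d=1.0, lr=5e-4):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    ce = nn.CrossEntropyLoss()
    for (x_c, y_c), (x_s, y_s) in zip(cpc_batches, absa_batches):
        # Alternate between the CPC task and the auxiliary ABSA task.
        for task, (x, y) in (("cpc", (x_c, y_c)), ("absa", (x_s, y_s))):
            h = model(x)
            # Domain loss is shared by both tasks.
            d_label = torch.full((len(x),), 0 if task == "cpc" else 1, dtype=torch.long)
            loss_d = ce(model.domain_head(h), d_label)
            if task == "cpc":
                loss = ce(model.cpc_head(h), y) + lam_d * loss_d
            else:
                loss = lam_s * ce(model.absa_head(h), y) + lam_d * loss_d
            opt.zero_grad()
            loss.backward()
            opt.step()

# Toy usage with random data:
model = ToySAECON()
cpc = [(torch.randn(16, 100), torch.randint(0, 3, (16,))) for _ in range(4)]
absa = [(torch.randn(16, 100), torch.randint(0, 3, (16,))) for _ in range(4)]
train(model, cpc, absa)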

A.2 Reproducibility

In this section, we provide the instructions to reproduce SAECON and the default hyperparameter settings used to generate the performance reported in Section 4.2.

A.2.1 Implementation of SAECON

The proposed SAECON is implemented in Python (3.6.8) with PyTorch (1.5.0) and run on a single 16GB Nvidia V100 GPU. The source code of SAECON is publicly available on GitHub[6] and comprehensive instructions on how to reproduce our model are also provided.

[6] The source code will be publicly available if the paper is accepted. A copy of the anonymized source code is submitted for review.

The implementation of SGCN is based on PyTorch Geometric[7]. The implementation of our sentiment analyzer A is adapted from the official source code of LCF-ASC (Phan and Ogunbona, 2020)[8]. The dependency parser used in SAECON is from spaCy[9]. The pretrained GloVe embedding vectors are downloaded from the official site[10]. The pretrained BERT model is obtained from the Hugging Face model repository[11]. The implementation of the gradient reversal layer is available on GitHub[12]. We would like to thank the authors of these packages for their valuable contributions.
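As a reference for the domain-adaptive component, below is a generic gradient reversal layer written with torch.autograd.Function. It is an illustration of the mechanism (Ganin and Lempitsky, 2015), not a copy of the pytorch-revgrad package referenced above; names and dimensions are our own.

import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, alpha):
        ctx.alpha = alpha
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Identity in the forward pass; gradients are negated (and scaled by alpha)
        # in the backward pass, pushing the encoder toward domain-invariant features.
        return -ctx.alpha * grad_output, None

def grad_reverse(x, alpha=1.0):
    return GradReverse.apply(x, alpha)

# Usage: features -> gradient reversal -> domain classifier
feats = torch.randn(8, 240, requires_grad=True)
domain_logits = torch.nn.Linear(240, 2)(grad_reverse(feats, alpha=1.0))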

A.2.2 Default Hyperparameters

The default hyperparameter settings for the results reported in Section 4.2 are given in Table 7.

Hyperparameter        Setting
GloVe embeddings      pretrained, 100 dims (d0)
BERT version          bert-base-uncased
BERT num.             12 heads, 12 layers, 768 dims (d0)
BERT param.           pretrained by Hugging Face
Dependency parser     spaCy, pretrained model
Batch config.         size = 16, batch ratio = 1:1
Init. LR (η)          5 × 10^-4
CPC loss weight       2 : 4 : 1 (B:W:N)
λ ({λ, λs, λd})       {1 × 10^-4, 1, 1}
Activation            ReLU (f(x) = max(0, x))
d ({dg, dl, ds})      {240, 240, 240}
Optimizer             Adam (β1 = 0.9, β2 = 0.999)
LR scheduler          StepLR (steps = 3, γ = 0.8)
GRL config.           α = 1.0 (default setting)
SGCN num.             2 layers (768 → 256 → 240)
SGCN arch.            directed, gated
Data aug.             off (weighted loss only)

Table 7: Default hyperparameter settings. "LR" is short for learning rate. "num." shows the numerical configurations and "arch." shows the architectural configurations. "aug." is short for augmentation.
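For readers who prefer code, the optimizer-related rows of Table 7 translate roughly into the PyTorch calls below. The placeholder module, the variable names, and the assumed label order (Better, Worse, None) are ours.

import torch

model = torch.nn.Linear(768, 240)  # placeholder module standing in for SAECON
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4, betas=(0.9, 0.999))
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.8)

# Class weights 2:4:1 for the CPC loss, assuming the label order Better / Worse / None.
cpc_loss = torch.nn.CrossEntropyLoss(weight=torch.tensor([2.0, 4.0, 1.0]))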

[7] https://github.com/rusty1s/pytorch_geometric
[8] https://github.com/HieuPhan33/LCFS-BERT
[9] https://spacy.io/
[10] https://nlp.stanford.edu/projects/glove/
[11] https://huggingface.co/bert-base-uncased
[12] https://github.com/janfreyberg/pytorch-revgrad


S1: [Ruby:NEU] wasn't designed to "exemplify best practices", it was to be a better [Perl:NEG].    Label: Better    Δ = +1
S2: And from my experience the ticks are much worse in [Mid Missouri:NEG] than they are in [South Georgia:POS] which is much warmer year round.    Label: Worse    Δ = −2
S3: As an industry rule, [hockey:NEG] and [basketball:NEG] sell comparatively poorly everywhere.    Label: None    Δ = 0
S4: [Milk:NEG], [juice:NEG] and soda make it ten times worse.    Label: None    Δ = 0

Table 8: Additional case studies for the effect of the sentiment analyzer A (see Section A.4 for details).

A.2.3 Reproduction of ED-GAT

We briefly introduce the reproduction of the state-of-the-art baseline, ED-GAT, in both its GloVe and BERT versions. We implement ED-GAT with the same software packages as SAECON, such as PyTorch Geometric, spaCy, and PyTorch, and run it in the same machine environment. The parameters all follow the original ED-GAT settings (Ma et al., 2020) except the dimension of GloVe, which is set to 300 in the original paper but to 100 in our experiments for fairness of comparison. The number of layers is selected as 8 and the hidden size is set to 300 for each layer with 6 attention heads. We train the model for 15 epochs with the Adam optimizer and a batch size of 32.
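A schematic of the reproduced ED-GAT stack in PyTorch Geometric is sketched below, under the stated settings (8 layers, hidden size 300, 6 attention heads, 100-dim GloVe inputs). This is our own illustration, not the authors' implementation, and it assumes concatenated attention heads.

import torch
from torch_geometric.nn import GATConv

class EDGATSketch(torch.nn.Module):
    def __init__(self, in_dim=100, hidden=300, heads=6, num_layers=8):
        super().__init__()
        dims = [in_dim] + [hidden] * num_layers
        # Each layer outputs (hidden // heads) channels per head, concatenated to `hidden`.
        self.layers = torch.nn.ModuleList(
            GATConv(dims[i], dims[i + 1] // heads, heads=heads)
            for i in range(num_layers)
        )

    def forward(self, x, edge_index):
        # x: node features over the dependency graph; edge_index: dependency edges.
        for layer in self.layers:
            x = torch.relu(layer(x, edge_index))
        return x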

A.3 Baseline Models

We briefly introduce the compared models in Section 4.2.

Majority-Class A simple model which chooses the majority label in the training set as the prediction for each test instance.

SE Sentence Embedding encodes the sentences into low-dimensional sentence representations using pretrained language encoders (Conneau et al., 2017; Bowman et al., 2015) and then feeds them into a classifier for comparative preference prediction. SE has two versions (Panchenko et al., 2019) with different classifiers, namely SE-Lin with a linear classifier and SE-XGB with an XGBoost classifier.

SVM-Tree This method (Tkachenko and Lauw, 2015) applies convolution kernel methods to the CSI task. We follow the experimental settings of Ma et al. (2020).

AvgWE A word embedding-based method that averages the word embeddings of the sentence as the sentence representation and then feeds this representation into a classifier. The input embeddings have several options, such as GloVe (Pennington et al., 2014) and BERT (Devlin et al., 2019). These variants are denoted by AvgWE-G and AvgWE-B, respectively.

BERT-CLS Uses the representation of the token "[CLS]" generated by BERT (Devlin et al., 2019) as the sentence embedding and a linear classifier to conduct comparative preference classification. (A minimal sketch of these two sentence representations follows the baseline list.)

ED-GAT Entity-aware Dependency-based Graph Attention Network (Ma et al., 2020) is the first entity-aware model that analyzes the entity relations via the dependency graph and multiple stacked graph attention layers.
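To illustrate the two simplest baseline representations, the snippet below builds an AvgWE-B style averaged embedding and a BERT-CLS style "[CLS]" vector with Hugging Face Transformers. It is a schematic written for illustration, not the baselines' exact code; the example sentence and the linear classifier are our own.

import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

sent = "Python is better suited for data analysis than MATLAB."
enc = tok(sent, return_tensors="pt")
with torch.no_grad():
    out = bert(**enc).last_hidden_state   # (1, seq_len, 768)

avg_repr = out.mean(dim=1)                # AvgWE-B: mean over token embeddings
cls_repr = out[:, 0, :]                   # BERT-CLS: the "[CLS]" token vector

classifier = torch.nn.Linear(768, 3)      # Better / Worse / None
logits = classifier(cls_repr)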

A.4 Additional Case Studies

In this section, we present four supplementary examples for the case study in Table 8, which have different sentiments compared with their counterparts in Table 6. S1 shows a NEU versus NEG comparison, which results in a sentiment distance of +1 and a CPC prediction of Better. "Ruby" is not praised in this sentence, so it receives NEU, but "Perl" is assigned a negative sentiment through a simple inference. S2 shows a stronger contrast between the entities: "Mid Missouri" is said to be "much worse" while "South Georgia" is "much warmer", which clearly indicates the sentiments and the comparative classification result.

S3 and S4 are two sentences, both with two parallel negative entities. The sports in S3 are said to sell "poorly" and the drinks in S4 make it "ten times worse", both indicating negative sentiments. Therefore, the labels are None.