
Expert Systems with Applications 40 (2013) 2777–2791


Audience targeting by B-to-B advertisement classification: A neural network approach

Alan S. Abrahams a,⇑, Eloise Coupey b, Eva X. Zhong a, Reza Barkhi c, Pete S. Manasantivongs d

a Business Information Technology Dept., 1007 Pamplin Hall, Virginia Tech, Blacksburg, VA 24061, USA

b Department of Marketing, Pamplin College of Business, Virginia Tech, 2016 Pamplin Hall, Blacksburg, VA 24061, USA

c Department of Accounting and Information Systems, Pamplin College of Business, Virginia Tech, 3007 Pamplin Hall, Blacksburg, VA 24061, USA

d Melbourne Business School, 200 Leicester Street, Carlton VIC 3053, Australia

Article info

Keywords: Media planning; Advertising; Targeting; Neural networks

0957-4174/$ - see front matter © 2012 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.eswa.2012.10.068

⇑ Corresponding author. Tel.: +1 (540) 231 6596. E-mail addresses: [email protected] (A.S. Abrahams), [email protected] (E. Coupey), [email protected] (E.X. Zhong), [email protected] (R. Barkhi), [email protected] (P.S. Manasantivongs).

Abstract

As marketing communications proliferate, the ability to target the right audience for a message is of ever-increasing importance. Audience targeting practices for mass media, both in research and in industry, have tended to emphasize demographics, behavior, and other characteristics of customer groups as the bases for matching communications to audiences. These approaches overlook the opportunity to leverage the nature of advertising content, by automatically matching advertisement content to appropriate media channels and target audiences. We model the semantic and sentiment content of advertisements with 103 variables. Based on these variables, a neural network classifier is used to assign advertisements to groups that represent different media channels. In its ability to classify unseen advertisements, the model outperforms the classification result generated by a random model by 100–300%. This method also enables us to identify and describe divergent advertisement characteristics, by industry.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

As the volume of products and services introduced to the market increases, so does the number of advertisements and media channels. The Audit Bureau of Circulations (Audit Bureau of Circulations (ABC), 2012) lists over 2400 newspaper, consumer magazine, and business publication titles; the Business Publishers Association (Business Publishers Association (BPA), 2012) has an inventory of over 2600 media properties. This glut of available outlets creates a growing problem for advertisers and for customers; advertisers have a hard time getting their product information to high-likelihood prospective buyers, as customers retreat from the advertisement explosion. This reality underscores the need to be able to accurately and efficiently identify and reach an appropriate target audience in the advertising process.

The advertising targeting process varies across media. For instance, internet media use automated methods for targeting, such as tracking browsing behavior and extracting key words from web pages to behaviorally or contextually target advertisements to individuals’ interests. In contrast, targeting in traditional mass media, like print media and broadcast media, relies heavily on the resources and insight of advertising agencies.


Fig. 1 shows the major actors in the print advertising process. The advertiser is typically a company that wants to promote its brand or products. The creative agency is responsible for generating ideas and producing advertisements. After an advertisement is created by the creative agency, the media planning agency determines where the advertisement should be placed. A media planning agency communicates with the advertiser about the target customers, analyzes the characteristics of different media and different channels, and selects media channels that maximize exposure to prospective customers (Sissors & Baron, 2010). The media buying agency then implements the media plan and seeks the most competitive price for the advertiser.

As indicated by the figure and description above, advertising targeting is mainly conducted by the media planning agency. The media planning process has several steps: first, getting the target market profile from the advertiser (e.g., women in the 20–35 age group); second, studying the demographics of audiences for potential media outlets by sending out questionnaires or using other research methods; and third, determining the best match between media outlets and the target market.

This process has shortcomings that are addressed by the present research. First, much attention is placed on finding appropriate media outlets for the target audience, while little attention is directed to the actual content of advertisements to be placed, and whether they fit within a focal medium. Second, small businesses lack the resources to pay specialist media planners (Weinrauch, Mann, Robinson, & Pharr, 1991), and empirical evidence shows that the selection of promotional media is one of the core challenges facing small businesses (Huang & Brown, 1999). As the efficacy of targeting depends on media planner expertise, smaller businesses are at a disadvantage relative to businesses with greater resources. Low-cost, scalable, computerized mechanisms are therefore required for small businesses to successfully place advertisements in appropriate channels without expensive, specialist human expertise.

Fig. 1. Actors in the advertising process.

We provide an alternative method for advertisement targeting that addresses both of the issues noted for media planning. This method utilizes content analysis of advertisements as a basis for matching a new advertisement to the nature of prior advertising content in a media channel. More specifically, the content of previous advertisements in past issues of potential media channels is retrieved. Media channels are then grouped by similarity, based on textual analysis of the advertisements. A new advertisement is compared to the content characteristics of advertising in the created groups of media channels, and a classification is determined based on best fit. By classifying the new advertisement into the media channel groups, the most similar and appropriate channel can be found. The ready availability of large numbers of previous advertisements across channels increases the ability to determine characteristics of advertising in various media channels, and it enhances the ability to match a new advertisement to a suitable media channel. In addition, knowledge about the content characteristics of media channel groups can be used in the creative phase of developing the advertising campaign, bolstering fit with the selected channel and enhancing the odds of communication success.

The present research classifies advertisements based on textual content. The data set for developing the method is obtained from print advertisements from two major business-to-business (B2B) magazines. We focus on a narrow selection of print media to provide a rigorous test of the classification method, and to demonstrate the ability of the text analysis to finely discriminate different characteristics of advertising. Print advertising is used in this research because, compared to television and radio advertising, print advertising textually conveys more information, with greater flexibility of length and of processing time (Abernethy & Franke, 1996). Because print advertising shares content characteristics with web advertising and mobile advertising (Gallagher, Foster, & Parsons, 2001), our results can be generalized to these media channels. In the following sections we develop the conceptual rationale and we discuss the methods that underlie our approach to advertisement targeting. In Section 2 we review extant work on text classification, neural networks, and content analysis in marketing and advertising. In Section 3, we describe a procedure for modeling textual content in advertisements, detailing the type of semantic and sentiment information extracted, and how it is represented for classification. Section 4 contains a description of artificial neural networks (ANN), and how we use ANN for classifying advertisements using the semantic and sentiment content of the advertisements. In Section 5, we discuss the experiment procedures, including the development of the dataset, which consists of the advertisements extracted from magazines, and the criteria for defining classes. In Sections 6 and 7, experiment results and discussion are presented. Conclusions and implications are contained in Section 8.

2. Related work

To build the logic for the approach developed in this research, we provide an overview of prior work in text classification, a review of the applications of neural networks in marketing and advertising, and a discussion of various automated content analysis methods.

2.1. Text classification

Automated text classification, or topic identification, involves the automated assignment of documents to categories, and has been well-studied over the past half-century (Apte, Damerau, & Weiss, 1994; Borko & Bernick, 1963; Calvo, Lee, & Li, 2004; Joachims & Sebastiani, 2002; Li & Jain, 1998). Classification may be guided by a training set of manually tagged documents (Lewis, Yang, Rose, & Li, 2004; Riloff & Lehnert, 1994; Ruiz & Srinivasan, 2002), or may be entirely machine-directed (Chen, Schuffels, & Orwig, 1996; Nigam, McCallum, Thrun, & Mitchell, 2000). Text categorization problems characteristically have high dimensionality: hundreds of available input attributes, some of which may be highly correlated or violate the assumption of normality. Popular algorithmic approaches to text categorization include Bayesian approaches, decision trees, example-based classifiers, neural nets, and support vector machines (Sebastiani, 2002).

A popular historic benchmark application of text classification techniques is the assignment of topics to a corpus of Reuters news articles (Sebastiani, 2002). A common modern application of document classification is the disambiguation of search engine results. Search results are partitioned by topic to allow the user of the web search engine to rapidly locate a pertinent document in the user’s intended subject area, thereby reducing information overload (Chung, Chen, & Nunamaker, 2005; Golub, 2006; Yang, Slattery, & Ghani, 2002).

While there is an extensive literature on automatic classification of text, the classification of content for the purposes of contextual advertising has only recently attracted attention. Contextual targeting typically involves keyword (relevance) matching of the candidate ad text to the target page in which the ad will appear (Anagnostopoulos, Broder, Gabrilovich, Josifovski, & Riedel, 2011; Broder, Fontoura, Josifovski, & Riedel, 2007). One refinement of contextual advertising techniques involves supplementing relevance information with sentiment analysis of the target page to, for instance, ensure that advertisements are not placed on target pages with high relevance but strong negative sentiment (Fan & Chang, 2010). In contrast to these conventional applications, we propose the alignment of content analysis features of the candidate ad, such as semantic and sentiment features, with those of prior ads of the genre (i.e., prior successful ads to the target audience).

2.2. Neural networks in marketing

Recall that we use a neural network method to classify an advertisement to a media channel. The basis for the neural network classification is the set of attributes that characterize the advertisement and the media groups, based on textual analysis of the advertisements’ content. Artificial neural networks (ANN) are used for approximating the function or relationship between input variables and output variables. Because of their nonlinear property, they are particularly flexible for modeling complex, real-world relationships. For instance, in the present research, we use neural networks to evaluate the fit between the text of an advertisement and the characteristics of prior advertising in a particular type of media channel. Unlike traditional statistical classification models, neural networks do not require that probability assumptions be made, which makes them more suitable for unknown relationships where little empirical experience is available (Zhang, 2000).

Neural networks have been used extensively in the marketing discipline, and research has demonstrated that neural networks outperform traditional statistical techniques in a number of cases (Paliwal & Kunar, 2009). Neural networks have been used for market response prediction, customer behavior analysis and prediction, sales and demand forecasting, and new product acceptance testing, among other applications (Vellido, Lisboa, & Vaughan, 1999). The specific instances of neural network application can be summarized by purpose, as classification, market forecasting, or market analysis (Smith & Gupta, 2000).

Of particular interest for the present research is the use of neural networks for classification. Within the marketing literature, neural networks have been applied to assist in modeling consumer responses to advertising stimuli (Curry & Moutinho, 1993), in determining what controls brand perceptions (Azcarraga, Hsieh, & Setiono, 2008), and in determining how demographic and behavioral characteristics relate to purchase patterns (Schwartz, 1992).

The studies most relevant to our task of audience targeting are those that use ANN in market segmentation. Market segmentation involves discriminating customer groups based on demographic or geographic characteristics (Kim, Street, Russell, & Menczer, 2005), purchase history (Kaefer, Heilman, & Ramenofsky, 2005), attitude towards products or services (Davies, Moutinho, & Curry, 1996) and degree of risk aversion (Dasgupta, Dispensa, & Ghose, 1994). Segmentation can also be used in grouping organizations. For example, Setiono, Thong, and Yap (1998) use neural networks for planning IT service promotion, by determining which companies in a data set use computers, and which do not. Based on the resulting market segmentation, subsequent sales and promotion activities can be directed only at specific groups of the target audience.

The market segmentation approach focuses on studying and classifying prospective customers, a widely accepted method for identifying a target market. Another important step in actually reaching the target audience is to present information about the consumption opportunity through appropriate channels. Our method bridges the gap between the identification of a target audience and the receipt of information by that audience by enabling businesses to match the content of new advertising to the typical content of media channels with access to a particular target audience.

Research that examines the content of advertisements with a neural network approach is rare. Ramalingam, Palaniappan, Panchanatham, and Palanivel (2006) use thirteen variables to represent the characteristics of toothpaste advertisements on TV. Responses collected from questionnaires are then used for analyzing how these characteristics affect advertising effectiveness. These studies, however, do not address the issue of audience targeting through media channel selection.

2.3. Content analysis of advertisements

Content analysis is a popular method to describe and interpret the content of advertisements through extraction of quantitative variables (Neuendorf, 2002). Resnik and Stern (1977) developed criteria for finding cues and evaluating information regarding product quality, price, performance, and other features. Although originally proposed to analyze television advertisements, usage of the procedure was later expanded to other media (Biswas, Olsen, & Carlet, 1992). Naccarato and Neuendorf (1998) examined the effect of format and content variables as indicators of recall and readership of business-to-business print advertisements. Content analysis has also been used to examine print advertisements in specific cultures, audiences or time periods (e.g. Graham, Kamins, & Oetomo, 1993; Gross & Sheth, 1989; Kolbe & Albanese, 1996). MacInnes, Moorman, and Jaworski (1991) review the impact of executional cues (such as headline content, pictures, and colors) in advertisements on consumers’ motivation to attend to the ad.

A common characteristic, and drawback, of historic content analysis research is that manual procedures are used to code the content in advertisements into variables. This process not only introduces the possibility of human error, but also requires much time and effort, hence limiting the scale of data analyzed. In recent decades there has been rapid growth of computer tools developed to overcome these drawbacks. Computer-aided text analysis (CATA) tools provide information on word occurrence, frequency, and dictionary reference (Neuendorf, 2002). One such example is the program General Inquirer, which processes free text input into categories of words (Stone, Dunphy, Smith, & Ogilvie, 1966).

Despite the availability of CATA tools, few studies have utilized them to examine content and recurring patterns in advertisements. For instance, Dowling and Kabanoff (1996) evaluated the meaning of and difference between advertising slogans. Motes (1992) examined the effect of vivid words, active sentence structure and text layout on readers’ reactions. Using CATA tools on a large categorized advertisement corpus, specifically for the purpose of matching a new candidate advertisement to an appropriate outlet (i.e. category of historic advertisements with a similar semantic and sentiment profile), is the gap addressed by the current research. We describe our approach in Section 3 to show how we contribute to this developing literature.

3. Procedure: modeling advertising content

We use a computer-based program to retrieve and model the content of advertisements. To do this, we consider first how the textual content will be represented. Our approach uses a “bag-of-words” representation, in which each word is regarded as one occurrence of an entry in the total vocabulary of that advertisement, regardless of where it appears in a sentence or in a paragraph. Moreover, we impose additional structure on the nature of textual representation, reflecting two types of content: semantic content and sentiment content. Each of these is described below.

3.1. Semantic content modeling

Semantic content refers to the substantive meanings carried by language. In our approach, the semantic content of advertisements is extracted in line with the method used by General Inquirer, which includes approximately 11,780 word entries from the Harvard-IV-4 dictionary and the Lasswell dictionary (Stone et al., 1966). Each word is labeled by one or more tags; each tag corresponds to one category, for example, academic words, economic words, and other tags. When the program scans the advertisement, the occurrence of these categorized words is identified and accumulated. As a result, each piece of text has a series of scores representing the occurrence of words in each category. Advertisements vary in length, so in order to compare them, each score is normalized by dividing it by the length of the piece of text (total number of words in the advertisement) and then multiplying the result by 1000, for ease of processing. Thus, each advertisement is represented by a feature vector $X = (x_1, x_2, \ldots, x_N)$, with its $i$th component $x_i$ denoting the number of words in the $i$th category per thousand words in the advertisement.
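The per-thousand-word normalization described above can be illustrated with a minimal Python sketch. The four-word tag dictionary and the two category names below are hypothetical stand-ins for the General Inquirer lexicon, and the regular-expression tokenizer is an assumption rather than the program used in the study.

```python
import re

# Hypothetical miniature tag dictionary (NOT the real General Inquirer lexicon):
# each word maps to one or more category tags.
TAGS = {
    "loan": {"Econ"},
    "bank": {"Econ"},
    "research": {"Academ"},
    "tuition": {"Econ", "Academ"},
}
CATEGORIES = ["Academ", "Econ"]


def semantic_vector(text):
    """Return hits per thousand words for each category, in CATEGORIES order."""
    words = re.findall(r"[a-z']+", text.lower())  # assumed tokenizer
    if not words:
        return [0.0] * len(CATEGORIES)
    counts = {c: 0 for c in CATEGORIES}
    for w in words:
        for tag in TAGS.get(w, ()):
            counts[tag] += 1  # a word with several tags counts once per tag
    return [1000.0 * counts[c] / len(words) for c in CATEGORIES]


ad = "Our bank offers a loan for tuition and research costs"
x = semantic_vector(ad)  # 10 words: 2 Academ hits, 3 Econ hits
```

In this toy case the ten-word advertisement yields 200 hits per thousand words for the first category and 300 for the second; with the full lexicon the same routine would produce one component per General Inquirer category.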

Our study includes 99 categories from the General Inquirer. A full list of the categories, the number of entries in each category and the average incidence (hits per thousand advertisement words) for each category in our data set can be found in Appendix A.

Fig. 2. Architecture of feed-forward multilayer perceptron with one hidden layer (input layer, hidden layer, output layer).

3.2. Sentiment content modeling

Sentiment content refers to emotional or subjective descriptions embedded in text. Sentiment content is of interest when studying the characteristics of advertisements because advertisements not only deliver objective descriptions about brands and products, but also affect customers subtly and emotionally, influencing consumption-related behaviors. Heath (2005) finds that advertisements that appeal to readers’ feelings rather than to their knowledge can be processed with low attention and can result in buying behavior.

We use two dictionaries, ANEW and AFINN, to model sentiment content in the advertisements. The ANEW dictionary is a list of 2476 words, each with a score on valence, arousal and dominance, based on a 9-point scale. These scores are determined by research in which subjects are required to rate words according to their feelings upon sight of the word. The feeling of Happy vs. Unhappy accounts for the valence score, Excited vs. Calm for the arousal score, and In-control vs. Controlled for the dominance score (Bradley & Lang, 2010). For example, the word “award” has a valence score of 8.44 (mean of ratings collected from all subjects), an arousal score of 7.22 and a dominance score of 7.26, while “failure” has a valence score of 1.70, an arousal score of 4.95 and a dominance score of 2.40.

The AFINN dictionary is a major revision of the ANEW dictionary. The AFINN dictionary contains a list of 2477 unique word entries. The overlap between the AFINN and ANEW lexicons is 21%: that is, 521 (out of 2477) lexical entries in AFINN also appear in ANEW. Each AFINN word has a valence score based on a scale of −5 to +5, with +5 indicating that the word is very positive in valence (Nielsen, 2011). For example, “superb” scores +5, “opportunity” scores +2, “useless” scores −2 and “catastrophe” scores −4.

In our procedure, the advertisement text is scanned for words in these dictionaries, and the ANEW valence, ANEW arousal, ANEW dominance and AFINN valence scores of the words are accumulated (to give a total, for each of these measures, for each ad). These scores are reported both raw (total count) and normalized by the length of the text (incidence per thousand words). By adding these sentiment scores to variables from the General Inquirer categories, the feature vector $X = (x_1, x_2, \ldots, x_N)$ of the advertisement is expanded. Detailed descriptions of variables from the ANEW and AFINN dictionaries can be found in Appendix B.
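As a rough illustration of the accumulation step, the following Python sketch sums sentiment scores over an advertisement and reports them both raw and per thousand words. The lexicons contain only the example words and scores quoted above; the feature names and the tokenizer are assumptions made for the example, not the variables defined in Appendix B.

```python
import re

# Tiny illustrative lexicons holding only the example scores quoted in the text;
# the real ANEW and AFINN lists are far larger.
ANEW = {  # word -> (valence, arousal, dominance) on a 9-point scale
    "award": (8.44, 7.22, 7.26),
    "failure": (1.70, 4.95, 2.40),
}
AFINN = {"superb": 5, "opportunity": 2, "useless": -2, "catastrophe": -4}


def sentiment_features(text):
    """Accumulate raw sentiment totals and per-thousand-word incidences."""
    words = re.findall(r"[a-z']+", text.lower())  # assumed tokenizer
    raw = {"anew_valence": 0.0, "anew_arousal": 0.0,
           "anew_dominance": 0.0, "afinn_valence": 0.0}
    for w in words:
        if w in ANEW:
            valence, arousal, dominance = ANEW[w]
            raw["anew_valence"] += valence
            raw["anew_arousal"] += arousal
            raw["anew_dominance"] += dominance
        raw["afinn_valence"] += AFINN.get(w, 0)
    n = len(words)
    per_thousand = {k: (1000.0 * s / n if n else 0.0) for k, s in raw.items()}
    return raw, per_thousand


raw, norm = sentiment_features("A superb opportunity to win an award")
```

Here the seven-word text accumulates an AFINN valence total of 7 (5 for "superb" plus 2 for "opportunity") and an ANEW valence total of 8.44 (from "award"); the normalized values divide these totals by the word count and multiply by 1000.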

To achieve our classification objective, finding the most appropriate media channel for a given advertisement, the feature vector of the new advertisement is retrieved, as are the vectors for previous advertisements representing the characteristics of different media channels (operationalized as predefined classes). For example, advertisements for Computer Technology may have a high score on feature variables (components of the feature vector) representing tools or goals, but a relatively low score on feature variables that represent sentiment. Assuming the total number of media channels, or predefined classes, is C, the problem of fitting the new advertisement into one media channel/class can thus be transformed into a C-Class classification problem.

4. Building the neural network classification model

Neural networks are inspired by the biological neural network in which neurons are connected and functionally related to each other. This section will discuss how artificial neural networks are used for classification problems and how the classification model is built.

4.1. Artificial neural networks

Fig. 2 shows the architecture of a feed-forward multilayer network (termed multilayer perceptron or MLP) used in our approach. The network contains an input variable layer, a hidden neuron layer and an output neuron layer. Each hidden neuron is a nonlinear, bounded function of weighted input variables and each output neuron is a function of the hidden neurons. A feed-forward network refers to a network in which the information only flows from input to output and no feedback cycle appears (Dreyfus, 2005).

We chose neural networks for our classification approach as neural networks have been shown to perform well on text classification tasks (Sebastiani, 2002), and can be readily applied using modern Commercial Off The Shelf (COTS) statistical software tools, such as JMP. Profiling features provided, for instance, in COTS tools like JMP 9, enhance the explainability of neural net decisions by graphically showing the change in the classification probability for each output class over the attribute ranges for each input attribute. Support Vector Machines and example-based methods are also known to perform extremely well for text classification tasks, though the difference is modest and results are broadly comparable to those achieved with neural nets (Sebastiani, 2002).

4.2. Classification problem modeling

The C-Class classification problem mentioned in Section 3 can be regarded as a function approximation problem. Let $X$ denote the feature vector of input variables, and let $Y = (y_1, y_2, \ldots, y_C)$ denote the output vector of the group membership of C classes, with $y_j = 1$ representing the advertisement being assigned to the $j$th class and $y_j = 0$ representing the advertisement not being in the class. The sum of all output components should equal 1, since an advertisement can only be assigned to one class. Therefore, the problem is transformed into finding the functional relationship $f: X \to Y$ (Dreyfus, 2005).
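The target encoding described above can be sketched in a few lines of Python. The class names are the seven industry categories introduced in Section 5.2; the `decode` helper, which assigns the class with the largest output component, is an assumption added for illustration and is not described in the text.

```python
# The seven industry classes named in Section 5.2 of the paper.
CLASSES = [
    "Capital Access", "Marketing and Innovation", "Computer Technology",
    "Workforce", "Customer Service", "Compliance",
    "Luxury, Travel, and Personal",
]


def one_hot(label):
    """Encode a class label as Y = (y_1, ..., y_C) containing a single 1."""
    if label not in CLASSES:
        raise ValueError(f"unknown class: {label}")
    return [1 if c == label else 0 for c in CLASSES]


def decode(outputs):
    """Assumed decision rule: pick the class with the largest output component."""
    best = max(range(len(outputs)), key=lambda j: outputs[j])
    return CLASSES[best]


y = one_hot("Computer Technology")
predicted = decode([0.1, 0.2, 0.05, 0.6, 0.0, 0.0, 0.05])
```

Each encoded target sums to 1, matching the constraint that an advertisement belongs to exactly one class.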

4.3. Activation function and parameters

As mentioned in Section 4.1, each hidden neuron is a function of all weighted input variables. This function is termed the activation function. Our study uses a sigmoidal hyperbolic tangent function, $\tanh(x) = \frac{e^{2x} - 1}{e^{2x} + 1}$, as the activation function, where $x$ is a linear combination of input variables. Therefore, let $h_k$ be the $k$th hidden node; then

$$h_k = \tanh\left(\sum_{i=1}^{N} w_{ki} x_i\right).$$

The output of the neural network is a linear function of the hidden nodes. Let $y_j$ be the $j$th output node; then

$$y_j = \sum_{k=1}^{M} w_{jk} h_k$$

(SAS Institute Inc., 2010).

Looking back on our problem, because the media channel (or class in the sense of neural networks) of previous advertisements is already known, they can serve as a training set for the neural network model, so that weight parameters can be determined by minimizing the squared-error cost between the actual output and the output of the neural network.
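The hidden-layer and output-layer computations in this section, together with a squared-error cost, can be sketched in plain Python as follows. The weights and layer sizes are hypothetical toy values, bias terms are omitted to mirror the formulas as written, and no training loop is shown; this is an illustration of the forward pass only, not the JMP implementation used in the study.

```python
import math


def forward(x, w_hidden, w_out):
    """One-hidden-layer MLP: tanh hidden nodes, linear output nodes."""
    # h_k = tanh(sum_i w_ki * x_i)
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    # y_j = sum_k w_jk * h_k
    y = [sum(w * hk for w, hk in zip(row, h)) for row in w_out]
    return h, y


def squared_error(y_pred, y_true):
    """Sum of squared differences between network outputs and targets."""
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true))


# Hypothetical toy weights: N = 2 inputs, M = 2 hidden nodes, C = 2 classes.
w_hidden = [[0.5, -0.5], [0.0, 1.0]]
w_out = [[1.0, 0.0], [0.0, 1.0]]

h, y = forward([1.0, 1.0], w_hidden, w_out)
loss = squared_error(y, [0.0, 1.0])  # target: advertisement belongs to class 2
```

Training would adjust `w_hidden` and `w_out` to reduce `loss` summed over all previously classified advertisements.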

The next section will explain how data was collected and pro-cessed for performing our experiments.

5. Data set

The advertisement data set used in our study was collected from two Business-to-Business (B2B) magazines, Entrepreneur and Inc., the two largest active magazines by entrepreneur readership in the United States, accounting for approximately 1.3 million entrepreneurial readers. While the volume of internet sources has climbed, B2B print advertising maintains popularity because of its high credibility, low cost-per-thousand, high pass-along, and high complementarity with other marketing activities. B2B advertising by the top fifty advertisers grew by 7.5% in 2010 (www.btobonline.com, September 19, 2011), and according to the Starch Information Sources Study (2010), 67% of respondents rated B2B print magazines as very useful, maintaining their status from 1996 through 2010.

The following section explains how data was collected from Entrepreneur and Inc. magazines, how the data clean-up was performed, and how classes for the C-Class classification problem were defined and constructed.

5.1. Collecting data

A total of 5288 advertisements from 88 issues in a four-year period (July 2007–June 2011 inclusive) from both magazines were captured. Collected information included the full text of the advertisement, year, issue, page number, size, magazine title, advertiser, and industry.

All the advertisements were captured by 44 junior- and senior-level undergraduate business majors at a large, public, American university. Each data capturer was provided with a coding protocol describing the task and was assigned to four magazine issues. Each magazine issue was captured in full by two data capturers, to provide redundancy and the ability to screen and cross-check for accuracy, objectivity, and reliability. Data quality for each data capturer was reviewed by an independent data quality auditor, to obtain an overall accuracy score for each data capturer. The data quality auditor was a compensated 2nd year MBA student. For each magazine edition, ten advertisement pages were chosen at random by the data quality auditor from the hardcopy source magazine, and compared to the data capturers’ input. Ten advertisement pages correspond to a thorough, random review of between 11% and 34% of total advertisements in the edition, depending on the total advertisements in the edition. Any missing advertisements or attributes were noted by the auditor, so that an overall accuracy score for the data capturer could be computed. Each data capturer was assigned a score out of 100% where, for instance, a score of 90% indicates that 10% of the required cells were missing or incorrect for that data capturer for the ten advertisement pages reviewed. Obvious, easily correctable data errors were fixed, and the data capturer’s accuracy was recalculated. Overall data capturer accuracy, as assessed by the data quality auditor, across all data capturers was 89% (standard deviation 11%; min 56%; max 99.6%). Only 6 of 44 data capturers (14%) scored below an 80% accuracy threshold. To improve overall data quality, and to eliminate duplicates (recall that each issue was captured separately by two data capturers), we retained only data from the most accurate data capturer for each magazine issue. After eliminating the data for the less accurate data capturer for each magazine issue, accuracy improved to 94% (standard deviation 6%).
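The retention rule described above (keep only the more accurate of the two capturers for each issue) can be sketched as follows. The issue labels, capturer identifiers, and scores are invented for illustration, and attaching one score to each issue and capturer pair is a simplifying assumption; in the study each capturer received a single overall accuracy score.

```python
# Hypothetical audit records: (issue, capturer, audited accuracy as a fraction).
audits = [
    ("Inc. 2009-03", "capturer_07", 0.91),
    ("Inc. 2009-03", "capturer_21", 0.78),
    ("Entrepreneur 2010-11", "capturer_02", 0.85),
    ("Entrepreneur 2010-11", "capturer_33", 0.96),
]


def most_accurate_per_issue(records):
    """Keep, for each issue, the capturer with the highest audited accuracy."""
    best = {}
    for issue, capturer, score in records:
        if issue not in best or score > best[issue][1]:
            best[issue] = (capturer, score)
    return best


retained = most_accurate_per_issue(audits)
mean_before = sum(score for _, _, score in audits) / len(audits)
mean_after = sum(score for _, score in retained.values()) / len(retained)
```

In this toy data the mean accuracy rises from 0.875 to 0.935 once the less accurate capture of each issue is dropped, which mirrors the direction of the improvement reported in the text.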

Semantic and sentiment content were then extracted from all the advertisements, and each advertisement text was represented by a feature vector according to the method described in Section 3. Descriptive statistics (mean score and variance for all advertisements on each feature vector component) of the data set can be found in Appendices A and B.

5.2. Constructing classes

The advertisements captured were then categorized into classes according to the advertiser’s industry. These classes were constructed to represent media channels with different audiences and different characteristics.

To decide the appropriate classification of advertisers, a number of industry classification schemes were reviewed. Two classification schemes employed by the United States Census Bureau, the Standard Industrial Classification (SIC) and the North American Industry Classification System (NAICS), were judged to be inappropriate for our data set. According to the coding scheme provided by these classification systems, one company may have different codes filed under different categories. For example, IBM, an advertiser in our data set, has 27 different NAICS codes and 18 different SIC codes listed by Hoovers Inc., a popular business research source. IBM has NAICS codes starting with 33, 35, 51, and 54, indicating different top-level classifications. Also, NAICS and SIC codes are not readily available for many of the private companies in our data set.

The standard industry breakdown used by Entrepreneur maga-zine for its popular Franchise 500 listing was also considered. Thiswas deemed unsuitable as it only includes franchise corporations.

Given the Entrepreneur and Inc. magazines face entrepreneurialreaders and small businesses owners, we employed the six SmallBusiness Success Index (SBSI) Factors (categories of service/prod-uct): Capital Access, Marketing and Innovation, Computer Technol-ogy, Workforce, Customer Service, and Compliance. These six SBSIFactors have been previously validated on entrepreneurship databy Rockbridge Associates Inc. (2011). However, based on a large pi-lot study previously conducted, where over 4000 advertisementsfrom Entrepreneur and Inc. for the period 2004–2008 were re-viewed, a seventh industry classification (‘‘Luxury, Travel, and Per-sonal’’) was added to address a large number of advertisements notcovered by the six SBSI Factors (Abrahams et al, 2012).

Table 1 gives a brief description of the seven industry categories(classes) of advertisements; detailed definitions as well as examplesub-industries and advertisers for the categories can be found inAppendix C.

The industry classification (class construction) process was performed by the capturer of the advertisement. As mentioned before, each advertisement was categorized by two capturers. We computed Kappa statistics of inter-rater reliability for the agreement of categorization from two capturers (Agresti, 1990; Cohen, 1960). We obtained κ = 0.93, indicating substantial agreement.
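For readers unfamiliar with the statistic, Cohen's Kappa compares the observed agreement p_o between two raters with the agreement p_e expected by chance: κ = (p_o − p_e) / (1 − p_e). The sketch below is a minimal pure-Python version for two equal-length lists of labels; the example labels are hypothetical and are not drawn from the study's data.

```python
from collections import Counter


def cohens_kappa(rater1, rater2):
    """Cohen's Kappa for two equal-length sequences of categorical labels."""
    n = len(rater1)
    if n == 0 or n != len(rater2):
        raise ValueError("both raters must label the same, non-empty set of items")
    # Observed agreement: share of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    expected = sum(counts1[label] * counts2[label] for label in counts1) / (n * n)
    # Undefined (division by zero) if both raters always use one identical label.
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two capturers for four advertisements.
capturer_1 = ["Workforce", "Workforce", "Compliance", "Compliance"]
capturer_2 = ["Workforce", "Workforce", "Compliance", "Workforce"]
kappa = cohens_kappa(capturer_1, capturer_2)  # 0.5
```

In this toy example the raters agree on 3 of 4 items (p_o = 0.75) and chance agreement is p_e = 0.5, giving κ = 0.5.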

Using the cleaned data set (data from most accurate capturer only), we resolved industry coding discrepancies for each advertiser. As each advertiser may have appeared in multiple magazines and may have been classified by two or more data capturers, we computed a 'majority vote' industry for each advertiser: the industry that received the most votes from all of the most accurate data capturers that classified that advertiser. We then compared the 'majority vote' industry to each individual data capturer's classification. We recomputed Kappa to compare the majority vote

http://www.btobonline.com

Table 1. Industry classification definition and description (Rockbridge Associates Inc., 2011).

| Industry | Description |
| --- | --- |
| Capital Access | Products or services providing the availability of working capital, capital for long-term investments, or expert financial advice |
| Compliance | Products or services assisting compliance with laws and regulations, including ensuring data security |
| Computer Technology | Products or services making technology work effectively and efficiently in the organization |
| Customer Service | Products or services assisting customer service and customer relationship management |
| Marketing & Innovation | Products or services for identifying new prospects, showing effective corporate positioning, converting leads, advertising efficiently, and generating new ideas |
| Luxury, Travel & Personal | Products or services for travelling and personal usage |
| Workforce | Products or services to attract, retain, train and develop, motivate and reward, and deploy employees |


industry (Rater 1), to the industry chosen by the data capturer (Rater 2) for each advertisement. We obtained κ = 0.77, indicating substantial agreement. The high Kappa score here indicates that advertisers typically advertise in a single industry across all their advertisements. We used the 'majority vote' industry in all analyses described below.
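The 'majority vote' step can be illustrated with the short Python sketch below. The classifications shown are hypothetical. The tie-breaking rule is an assumption, since the text does not say how ties were resolved: here a tie goes to the industry encountered first.

```python
from collections import Counter, defaultdict


def majority_vote_industry(classifications):
    """classifications: iterable of (advertiser, industry) pairs, one per captured ad.

    Returns {advertiser: industry with the most votes}. Ties are broken in favor
    of the industry seen first (an assumption, not taken from the paper).
    """
    votes = defaultdict(Counter)
    for advertiser, industry in classifications:
        votes[advertiser][industry] += 1
    return {adv: counts.most_common(1)[0][0] for adv, counts in votes.items()}


# Hypothetical capturer classifications for two advertisers.
captured = [
    ("IBM", "Computer Technology"),
    ("IBM", "Computer Technology"),
    ("IBM", "Marketing & Innovation"),
    ("Visa", "Capital Access"),
]
majority = majority_vote_industry(captured)
```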

Fig. 3 shows the percentage of advertisements classified into each category.

5.3. Creating training and validation data sets

The population of advertisements was divided into three sets; two sets were used for training the neural network model and determining parameters while the other set was used for validating the model. The division was determined by random selection, although we applied the rule that multiple advertisements from one advertiser must be assigned to the same group. This rule was applied because some companies employ large amounts of repeat advertising. If similar or repeat advertisements from the same company were assigned to both training data and validation data, the overlap between them would overstate model success. Therefore, unique advertisers were first identified and each advertiser was assigned as a whole to a single set, so that validation performance reflects generalization to unseen companies and advertisements.
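A grouped random split of this kind can be sketched as follows. The split fractions (40/30/30), the seed, and the function name are illustrative assumptions; the text does not report the actual proportions used.

```python
import random


def split_by_advertiser(ads, fractions=(0.4, 0.3, 0.3), seed=0):
    """ads: list of (advertiser, ad_id) pairs.

    Randomly assigns whole advertisers to one of three sets, so that no
    advertiser contributes ads to more than one set. The fractions are
    hypothetical and apply to advertisers, not to individual ads.
    """
    advertisers = sorted({adv for adv, _ in ads})
    rng = random.Random(seed)
    rng.shuffle(advertisers)
    n = len(advertisers)
    cut1 = round(fractions[0] * n)
    cut2 = cut1 + round(fractions[1] * n)
    group_of = {
        adv: 0 if i < cut1 else 1 if i < cut2 else 2
        for i, adv in enumerate(advertisers)
    }
    parts = ([], [], [])
    for ad in ads:
        parts[group_of[ad[0]]].append(ad)
    return parts


# Hypothetical data: 10 advertisers with 3 ads each.
ads = [(f"advertiser_{i}", j) for i in range(10) for j in range(3)]
train, tune, validate = split_by_advertiser(ads)
```

Splitting on advertisers rather than on individual ads is what prevents near-duplicate repeat ads from appearing on both sides of the split.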

5.4. Limitations

A limitation of our work is that return-on-investment (ROI) data is not available for print media advertisements, so the relative financial impact of each ad is unknown. An implicit assumption for this data set is then that the majority of historic advertisements in an industry were successful. This assumption is justified because 94% of advertisements in our data set (4913 of 5288 ads) were placed by an advertiser who had placed at least one other advertisement in the publication. Repeat advertisement (placement of two or more ads by a single advertiser) is generally only undertaken if the initial advertisement is successful, as further ad spend would be wasteful for failed ads. As the majority of advertisements both overall (94% of ads), and in each industry (min 89%; max 97%), were ads by a repeat advertiser, it can be assumed that the content analysis profile for each industry will be dominated by the features of successful (i.e. repeat) ads.

Fig. 3. Percentage breakdown of advertisements by industry class.
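The repeat-advertiser share used in this argument is a simple count. The sketch below shows one way to compute it, assuming a single publication and one advertiser name per ad; the input list is hypothetical.

```python
from collections import Counter


def repeat_ad_share(ad_advertisers):
    """ad_advertisers: list with one advertiser name per advertisement.

    Returns the share of ads placed by advertisers with two or more ads.
    """
    counts = Counter(ad_advertisers)
    repeat_ads = sum(c for c in counts.values() if c >= 2)
    return repeat_ads / len(ad_advertisers)


# Hypothetical publication with six ads from three advertisers.
share = repeat_ad_share(["A", "A", "B", "C", "C", "C"])  # 5 of 6 ads
```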

6. Experimental results

As there is currently no method for deciding the number of hidden neurons that would construct the single best fitted model, we took a heuristic approach by altering the number of neurons in the hidden layer and selecting the model yielding the best results. The training and validating process was conducted with SAS JMP® 9, and the following section gives the classification result using 10 hidden neurons.

Fig. 4 shows the lift curve on validation data. Lift curves measure the performance of a model relative to a random model. In Fig. 4, the X-axis represents Portion: advertisements are sorted along the X-axis in descending order of the predicted probability of being classified into a given class. The Y-axis represents the lift (relative success) compared to a random model. For example, the lift curve for Luxury, Travel & Personal (the purple line in the graph) goes through (0.10, 3.5). Given 368 Luxury, Travel & Personal advertisements in the total population of 1740 advertisements in the validation set, a random model would have a correct classification rate of 368/1740 = 21.15%. If the top-rated 10% of advertisements is considered, the lift for our model is 3.5, meaning it is 3.5 times as good as a random model, with a correct classification rate of approximately 3.5 × 21.15% ≈ 74.0%. The lift curve shows that, in general, for the most confident predictions (highest 50% of the data set), the model is 2–4 times as accurate as random classification (i.e. 100–300% more accurate).
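As a rough illustration of how a single point on a lift curve is obtained, the sketch below ranks ads by predicted probability for one class and divides the hit rate in the top portion by the base rate. The scores and labels are hypothetical, and the function name is ours rather than anything from SAS JMP.

```python
def lift_at(scores, is_class, portion):
    """Lift for the top `portion` of ads, ranked by predicted probability.

    scores:   predicted probability of belonging to the class, one per ad
    is_class: 1 if the ad truly belongs to the class, else 0
    """
    ranked = sorted(zip(scores, is_class), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(portion * len(ranked)))
    hit_rate_top = sum(flag for _, flag in ranked[:top_n]) / top_n
    base_rate = sum(is_class) / len(is_class)  # what a random model achieves
    return hit_rate_top / base_rate


# Hypothetical validation scores for ten ads, three of which are in the class.
scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
lift_top_20 = lift_at(scores, labels, 0.20)  # both top-2 ads are hits: 1.0 / 0.3
lift_all = lift_at(scores, labels, 1.00)     # always 1.0 for the full set
```

At a portion of 1.0 the lift is 1 by construction, which is why every lift curve converges to 1 at the right edge of the plot.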

Fig. 5 shows the Receiver Operating Characteristic curve (ROC curve) on validation data. The area under the ROC curve is another measurement of model success (Bradley, 1997). The X-axis represents 1-Specificity, or the False Positive rate (the rate at which samples from other classes are mistakenly classified into the class in question). The Y-axis represents Sensitivity, or the True Positive rate (the rate at which samples are correctly classified to their actual group). The ROC curve is therefore a plot of the True Positive rate against the False Positive rate, accumulated over the samples in ranked order. The area under the curve is an indicator of the superiority of the model over the random model, with 1 indicating perfect classification, and 0.5 indicating a model no better than random (SAS Institute Inc, 2010).
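The area under a one-vs-rest ROC curve equals the probability that a randomly chosen ad from the class receives a higher score than a randomly chosen ad from outside it. The sketch below computes the area directly from that definition, counting ties as one half. It is an illustrative pairwise implementation with hypothetical scores, not the routine used by SAS JMP.

```python
def auc_one_vs_rest(scores, is_class):
    """Area under the ROC curve for one class against all others.

    Computed as the share of (positive, negative) pairs in which the positive
    ad has the higher score; tied pairs count as 0.5.
    """
    pos = [s for s, flag in zip(scores, is_class) if flag]
    neg = [s for s, flag in zip(scores, is_class) if not flag]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores: 3 of the 4 positive/negative pairs are ranked correctly.
auc = auc_one_vs_rest([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])  # 0.75
```

The pairwise loop is quadratic in the number of ads, which is acceptable for a validation set of this size; a rank-based formula would scale better.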

Fig. 4. Lift curve on validation data for MLP with 10 hidden neurons. [Plot: X-axis Portion (0.00–1.00); Y-axis Lift (1–9); one curve per industry: Capital Access, Compliance, Computer Technology, Customer Service, Marketing & Innovation, Luxury, Travel & Personal, Workforce.]

Fig. 5. ROC curve on validation data for MLP with 10 hidden neurons. [Plot: X-axis 1-Specificity (0.00–1.00); Y-axis Sensitivity (0.00–1.00). Area under the curve by industry: Capital Access 0.7631; Compliance 0.8117; Computer Technology 0.7817; Customer Service 0.7483; Marketing & Innovation 0.7551; Luxury, Travel & Personal 0.8372; Workforce 0.7854.]

To further discuss how well the model classifies advertisements from each industry, the Confusion Matrix (Bradley, 1997) is presented in Table 2, with the first column giving the actual class, and the following columns showing the number of advertisements classified into each class. The diagonal presents the number of advertisements accurately classified into their actual industry group, or in other words the number of True Positives. The model performed quite well on classes holding plenty of samples, such as Computer Technology (218 true positives out of 385 actual Computer Technology ads = 57% True Positive rate compared to 22% for a random model), Marketing & Innovation (76% True Positive rate compared to 31.26% for a random model) and Luxury, Travel & Personal (54% True Positive rate compared to 21.67% for a random model). The classification for the Compliance and Customer Service groups was not so successful, probably because there were not many examples representing the characteristics of these classes in the training and validating sets. The Compliance category accounted for 0.82% of the samples in the training set and 0.80% in the validation set, while the Customer Service category accounted for 1.95% of the samples in the training set and 5.86% in the validation set. Notice that a certain proportion of advertisements from Capital Access, Computer Technology, Luxury, Travel & Personal and Workforce were misclassified into the Marketing & Innovation group. This is possibly due to the large percentage of advertisements from the Marketing & Innovation industry in the training set, whose characteristics were then most heavily weighted and influential.

Table 2. Confusion Matrix on validation data (rows: actual industry; columns: predicted industry).

| Actual Industry | Capital Access | Compliance | Computer Technology | Customer Service | Marketing & Innovation | Luxury, Travel & Personal | Workforce | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Capital Access | 33 | 0 | 23 | 0 | 118 | 17 | 24 | 215 |
| Compliance | 5 | 0 | 0 | 0 | 9 | 0 | 0 | 14 |
| Computer Technology | 9 | 0 | 218 | 0 | 114 | 30 | 14 | 385 |
| Customer Service | 6 | 0 | 42 | 2 | 40 | 10 | 2 | 102 |
| Marketing & Innovation | 11 | 0 | 50 | 0 | 415 | 46 | 22 | 544 |
| Luxury, Travel & Personal | 8 | 0 | 71 | 0 | 86 | 203 | 9 | 377 |
| Workforce | 13 | 0 | 10 | 0 | 53 | 4 | 23 | 103 |
| Total | 85 | 0 | 414 | 2 | 835 | 310 | 94 | 1740 |
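The True Positive rates and random-model baselines quoted above follow directly from the counts in Table 2. The sketch below reproduces that arithmetic; the counts are copied from Table 2, while the variable and function names are ours.

```python
LABELS = [
    "Capital Access", "Compliance", "Computer Technology", "Customer Service",
    "Marketing & Innovation", "Luxury, Travel & Personal", "Workforce",
]

# Rows are actual industries, columns are predicted industries (Table 2).
CONFUSION = [
    [33, 0, 23, 0, 118, 17, 24],
    [5, 0, 0, 0, 9, 0, 0],
    [9, 0, 218, 0, 114, 30, 14],
    [6, 0, 42, 2, 40, 10, 2],
    [11, 0, 50, 0, 415, 46, 22],
    [8, 0, 71, 0, 86, 203, 9],
    [13, 0, 10, 0, 53, 4, 23],
]


def per_class_rates(matrix, labels):
    """Returns {label: (true_positive_rate, base_rate)} for each actual class.

    The base rate (class share of all ads) is the rate a random model achieves.
    """
    total = sum(sum(row) for row in matrix)
    rates = {}
    for i, row in enumerate(matrix):
        actual = sum(row)
        rates[labels[i]] = (row[i] / actual, actual / total)
    return rates


rates = per_class_rates(CONFUSION, LABELS)
tpr, base = rates["Computer Technology"]  # about 0.57 and 0.22
```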

Fig. 6. Significant variables chart.


7. Discussion

Having examined the classification performance of the model, in this section we take a closer look at which feature variables have a major impact on the classification result and what differences exist between industry groups. Fig. 6 shows a few significant feature variables from the neural network model. The vertical axis represents the 7 industry categories and the horizontal axis represents feature variables or characteristics of advertisements. Eight significant variables are included in Fig. 6: ANEW-Valence, BEGIN, TOOL, QUAL, VEHICLE, ACAD, SOLVE and ECON; detailed descriptions of each variable can be found in Appendices A and B. For a given industry class and a given feature variable, an upward slope in the curve in any box in the figure indicates that when an advertisement scores higher on that feature variable, the probability of that advertisement being classified to that industry class increases, while a downward slope indicates a decrease in that probability. Features with greater slope are more strongly influential, i.e. more sensitive (on the y-axis) to changes in the input (x-axis) attribute value.

As seen from Fig. 6, as the degree of Valence in an advertisement goes up, the probability of it being assigned to the Computer Technology group decreases (box 1a) and the probability of it being assigned to the Marketing & Innovation group increases (box 1b). The Marketing & Innovation group tends to use more positive motivation words (e.g. pumping up the ability to launch new ideas or increase customers) in its advertisements, while the Computer Technology group tends to focus on presenting objective and factual specifications.

Compared to other industries, the Marketing & Innovation industry features the BEGIN variable (box 2 in Fig. 6), which indicates content regarding the start or commencement of activities. This is because some advertisements within the group come from companies offering franchise opportunities, and they often advocate the starting of new businesses and franchises. It can be noted from Fig. 6 (BEGIN column) that as the incidence of BEGIN words increases (in the range from 0 up to 400 BEGIN words per thousand ad words), the probability of an advertisement being assigned to Marketing & Innovation increases rapidly, eventually plateauing at 400 BEGIN words per thousand words, and the probability of the advertisement being assigned to other industries decreases or remains stagnant.

As an advertisement contains more content related to TOOL (box 3 in Fig. 6), the probability of it being assigned to the Computer Technology group goes up, because computers and technologies are all regarded as tools for achieving goals. As evident in the TOOL column, there is a corresponding decrease in the probability of the ad being assigned to the Marketing & Innovation industry.

The Luxury, Travel and Personal group features the QUAL (box 4) and VEHICLE (box 5) categories. Ads from companies selling luxury products tend to contain more description of the quality of the product (QUAL), while ads from airlines and automobile companies inevitably include descriptions of vehicles and means of travelling (VEHICLE).

The Workforce industry features the ACAD label (box 6), as academic words are heavily used in the Workforce industry, which consists of advertisements about the education and training of employees.

The Customer Service group is not well represented because of the small number of advertisements in the training set compared to other groups. Nevertheless, advertisements from the Customer Service sector do show some emphasis on the category SOLVE (box 7), indicating many problem-solving activities in the Customer Service process.

The Capital Access group is rich in economic content (the ECON label, box 8), as this category is focused on financial services companies.

Appendix D provides detailed examples of the specific words in our data set that are triggering the significant feature variables (General Inquirer semantic categories) for each industry mentioned above. Tables D1–D6 in Appendix D show common words in the General Inquirer semantic categories that were the most significant for each industry, thus distinguishing the advertisements in the classification process.

Table D1. Significant variables (General Inquirer categories) and common words for Marketing & Innovation Industry.

| BEGIN | COMPLET | TRY | SOCIAL | REL | VARY | SV | MALE | REGION | NEGATE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Start | Success | Look^a | Home | Have^b | Go^e | Can | Host | Territory | No |
| Create | Successful | Try | Market | Own^c | Change | Have | He | Location | Unlimited |
| Innovative | Comprehensive | Seek | Site | Include | Process | Be | King | Place | Unlike |
| Renewal | Achieve | Compete | Store | Like^d | Unlike | Want | Men | International | Uncertain |
| Launch | Succeed | Search | Center | First | Event | Think | Guy | Area | Incredible |
| Generate | Win | Aim | Restaurant | Unlike | Turn | Need | Papa | Local | Unwavering |

a Look – in the word sense of look for.
b Have – in the word sense of possess, experience, engage in, cause to happen.
c Own – in the word sense of possessing or pertaining to oneself or itself.
d Like – having the same characteristics as, similar to.
e Go – in the word sense of go on, continue, proceed.

Table D2. Significant variables (General Inquirer categories) and common words for Computer Technology Industry.

| TOOL | GOAL | POS | POWER | WEAK | PAIN | PERCEV |
| --- | --- | --- | --- | --- | --- | --- |
| Paper | Solution | First | Management | Small | Pain | See |
| Phone | Innovation | High | Power | Less | Frustration | Click |
| Computer | Choice | Next | Protect | Need | Malicious | Vision |
| Product | Result | High^a | Monitor | Serve | Oppressive | Visual |
| File | Selection | Last^b | Organize | Sap | Anxiety | Watch |
| Card | Goal | Low | Require | Remote | Hostile | Identify |

a High – in the word sense of adjective meaning the highest.
b Last – adjective, adverb: final, finally.

Table D3. Significant variables (General Inquirer categories) and common words for Luxury, Travel & Personal Industry.

| QUALITY | VEHICLE | AQUATIC | FOOD | STAY | OBJECT |
| --- | --- | --- | --- | --- | --- |
| Noise | Vehicle | Bay | Breakfast | Stay | Vehicle |
| Even^a | Car | Pool | Wine | Wait | Car |
| How | Truck | Water | Drink^b | Stop | Truck |
| Tough | Van | Sea | Food | Set | Fuel |
| Quality | Sedan | Creek | Meal | Stand | Engine |
| Hot | Cadillac | Lake | Lunch | Rest | Van |

a Even – adv.: used to suggest that something is extreme, remarkable, unexpected.
b Drink – beverage.

Table D4. Significant variables (General Inquirer categories) and common words for Workforce Industry.

| ACAD | ROLE | HUMAN | IAV | MEANS | AROUSAL |
| --- | --- | --- | --- | --- | --- |
| Learn | Employee | Employee | Help | By | Care |
| University | Guide^a | Company | Make | Plan^b | Passion |
| College | Personal | Guide^a | Get | Insurance | Inspiration |
| Education | Owner | Personal | Find | Work^c | Morale |
| Student | Customer | People | Provide | Network | Passionate |
| School | Professional | Management | Learn | Resource | Appreciate |

a Guide – noun: someone or something that directs.
b Plan – noun: an organized program for some action.
c Work – act of working, type or place of work.

Table D5. Significant variables (General Inquirer categories) and common words for Customer Service Industry.

| SOLVE |
| --- |
| Think |
| Find |
| Choose |
| Call^a |
| Weigh |
| Rate |

a Call – to name, give a name to.

Table D6. Significant variables (General Inquirer categories) and common words for Capital Access Industry.

| ECON | TRAVL | OUGHT | COLL |
| --- | --- | --- | --- |
| Business | Move | Must | Company |
| Insurance | Flow | Should | Bank |
| Company | Show^a | | State |
| Bank | Trip | | Group |
| Financial | Flight | | Team |
| Service | Transfer | | Agency |

a Show – to show up, to make an appearance.

8. Conclusion

In this paper, we proposed and tested a method for systematically analyzing the semantic and sentiment content of advertisements at the word category level on a data set from B2B print magazines, for the purpose of automatically classifying advertisements. To the best of our knowledge, previous research has not employed computer-assisted analysis of magazine content for audience targeting. A neural network classifier was employed for classifying advertisements into predefined classes based on their characteristic features extracted from the content. Such an approach can be utilized to match advertisements to a media channel which has run advertisements with a similar semantic and sentiment profile, thus providing supportive and/or alternative approaches for media planning and audience targeting. Although only two magazines with similar characteristics were examined in our study, classification was made upon the sub-classes or industry categories within these two magazines, representing channels with different characteristics. Finally, by further examining the significant variables of the classifier, differences among industries were found and could serve as a reference for creative agencies when creating or evaluating advertisement content. Future work could include testing of the method on different media channels or matching advertisement content to non-advertising content within the media, especially with internet media where textual data can be readily retrieved. A limitation of our work is the implicit assumption that the majority of historic advertisements in an industry were successful; the high volume of repeat advertising in all industries supports this assumption. To avoid this implicit assumption, it would be worthwhile to gather return-on-investment (ROI) data for each ad, and then segregate historic advertisements both by industry, and into successful (positive ROI) vs. unsuccessful (zero or negative ROI) ads. The approach described here could then better guarantee the success of a new advertisement that was matched to historic channels that have been successful for advertisements similar in semantic and sentiment profile.

Appendix A. General Inquirer categories included for semantic content extraction

This appendix gives a detailed description of each General Inquirer category included for semantic content extraction; each category serves as one dimension of the feature vector.

The coverage for General Inquirer was 62%: of the 468,081 word occurrences in our advertisement data set, 291,869 could be found in General Inquirer.

Table A1 describes the semantic categories in General Inquirer, and the average scores for our data set. The first column in the table gives the name of the General Inquirer category; the second column is the number of lexical entries in the category and the third column is a brief description of the category; the fourth and fifth columns give descriptive statistics of the average score (measured in "incidence per thousand words") and variance for that General Inquirer category, for all advertisements in our data set.
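To make the "incidence per thousand words" measure concrete, the sketch below scores one advertisement text against one category. It is a simplified illustration: the word set is a tiny subset of the BEGIN words listed in Table D1, the ad text is invented, and the tokenizer ignores the stemming and word-sense disambiguation that General Inquirer performs.

```python
import re


def incidence_per_thousand(text, category_words):
    """Occurrences of category words per 1000 words of advertisement text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in category_words)
    return 1000.0 * hits / len(tokens)


# Small illustrative subset of BEGIN words (see Table D1); not the full category.
BEGIN_WORDS = {"start", "create", "launch", "generate"}

# Invented ad text: 10 words, 2 of which are BEGIN words.
ad_text = "Start your own franchise today and launch a new career"
begin_score = incidence_per_thousand(ad_text, BEGIN_WORDS)  # 200.0
```

Repeating the calculation for every category yields one feature vector component per category for each advertisement.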

Appendix B. ANEW and AFINN for sentiment content extraction

This appendix gives descriptions of the ANEW and AFINN dictionaries used for sentiment content extraction.

For AFINN, the lexicon coverage was 5%: of the 468,081 word occurrences in the advertisements in our data set, 25,285 could be found in the AFINN lexicon. The remaining words are presumed to have no sentiment.

For ANEW, the lexicon coverage was 12%: of the 468,081 word occurrences in the advertisements in our data set, 54,992 could be found in the ANEW lexicon. The remaining words are presumed to have no sentiment.

In Table B1, the first column gives the name of the dictionary; the second column shows the number of lexical entries in the dictionary; the third column gives the basis of scale (the minimum and maximum possible score for each word appearing in the dictionary); and the fourth column gives a brief description of the measure.

Table B2 shows the average AFINN and ANEW scores for all advertisements in each industry in our data set. For ANEW, only the Valence score is shown as we found a greater than 98% correlation between Valence, Arousal, and Dominance scores for our data set. The score for each advertisement is calculated by cumulating the score for each word in the advertisement text. An average is then computed for each industry category. We ran an ANOVA analysis on this data, and found statistically significant differences between the categories (p < 0.0001). Marketing and Innovation advertisements and Capital Access advertisements were the most up-beat in sentiment (as measured by both AFINN and ANEW mean scores), and Compliance and Customer Service advertisements were the most negative, again according to both sentiment measures. Figs. B1 and B2 represent the ANOVA results graphically.
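The cumulative scoring and per-industry averaging described here can be sketched as follows. The lexicon values below are made up for illustration on an AFINN-style −5 to 5 scale and are not the actual AFINN or ANEW entries; the ad texts are also invented.

```python
import re


def cumulative_score(text, lexicon):
    """Sum of lexicon scores over all words in the text; unlisted words count as 0."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def mean_score_by_industry(ads, lexicon):
    """ads: iterable of (industry, text). Returns {industry: mean cumulative score}."""
    scores = {}
    for industry, text in ads:
        scores.setdefault(industry, []).append(cumulative_score(text, lexicon))
    return {industry: sum(vals) / len(vals) for industry, vals in scores.items()}


# Made-up word scores, for illustration only.
TOY_LEXICON = {"success": 2, "win": 4, "pain": -2, "frustration": -2}

single = cumulative_score("Win more customers and end the pain", TOY_LEXICON)  # 4 - 2 = 2
means = mean_score_by_industry(
    [("X", "win win"), ("X", "pain"), ("Y", "success")], TOY_LEXICON
)
```

Because the score is cumulative rather than averaged per word, longer advertisements with many sentiment-bearing words will tend to score higher in absolute terms.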

Appendix C. Advertiser industry classification rules

This appendix describes the protocol used by the data capturers for classifying advertisements by industry. This protocol is based on the Small Business Success Index (SBSI) industry classification scheme from the University of Maryland and Network Solutions (Rockbridge Associates Inc., 2011). A definition of the industry class is first given, followed by examples of sub-industries and companies (in italics) and then special notes for classifying certain advertisers.

C.1. Capital Access

Capital Access includes advertisements for products or services that help small businesses with availability of working capital, capital for long term investments, or expert financial advice. Example sub-industries and companies are:

Banking – Bank of America, Chase, Citi, SunTrust
Credit Cards – American Express, Visa, MasterCard
Financing
Business Insurance – Zurich business insurance, State Farm business insurance, Chubb business insurance, Travelers business insurance, All State business insurance

Special cases and notes: For Business Insurance, "Capital Access" is used only if the advertisement mentions only business insurance (that benefits the corporation, not the employees). If personal/life/disability insurance is being advertised, "Workforce" is used instead. Business insurance companies are included in "Capital Access" because they provide the business with working capital in the event of disaster (e.g. flood or fire). Companies like Fidelity Investments are classified under "Workforce" instead, since Fidelity's advertisements usually offer private investment services to the business owner or employees, and not services to the corporation itself. Also, personal investment companies fit under "Workforce". Credit card companies which provide business

Table A1. Descriptive statistics for categories from General Inquirer used for representing semantic content (Stone et al., 1966).

| Category | Number of lexical entries | Description | Mean | Variance |
| --- | --- | --- | --- | --- |
| ACAD | 153 | Academic words | 5.95 | 14.62 |
| ACTV | 2045 | Words implying activity | 101.05 | 48.12 |
| AFFIL | 557 | Words indicating affiliation or supportiveness | 32.36 | 28.51 |
| ANI | 72 | Words related to animals | 0.68 | 4.21 |
| AQUATIC | 20 | Words related to water | 0.53 | 3.57 |
| AROUSAL | 166 | Words indicating excitement | 2.46 | 6.87 |
| BEGIN | 56 | Commencement of an activity | 5.11 | 18.48 |
| BLDGPT | 46 | Buildings, rooms and building parts | 1.30 | 4.79 |
| CARD | 36 | Cardinality words | 4.12 | 9.25 |
| CAUSAL | 112 | Causal relationships | 6.65 | 10.24 |
| COLL | 191 | Human collectivities | 15.49 | 21.22 |
| COMFORM | 895 | Format or media of communication | 28.38 | 24.15 |
| COMNOBJ | 104 | Tools for communicating | 9.39 | 14.89 |
| COMP | 21 | Comparison | 12.60 | 15.93 |
| COMPLT | 81 | Complete, goals being achieved | 4.23 | 9.50 |
| DAV | 540 | Descriptive verbs or actions | 37.32 | 29.94 |
| DECR | 82 | Decrease | 1.31 | 5.42 |
| DOCTR | 217 | Organized system of belief or knowledge | 25.54 | 24.72 |
| ECON | 510 | Economic words | 65.67 | 46.31 |
| EVAL | 314 | Evaluation | 0.95 | 4.18 |
| EXCH | 61 | Words related to exchange and trading | 3.77 | 9.31 |
| EXERT | 194 | Exerting force | 2.32 | 7.25 |
| EXPRS | 205 | Sports, art and self-expressive | 6.61 | 13.23 |
| FAIL | 137 | Words related to failure | 0.55 | 3.18 |
| FALL | 42 | Movement of falling | 0.17 | 1.45 |
| FETCH | 79 | Movement of getting, carrying | 7.53 | 12.11 |
| FINISH | 87 | Termination of action | 0.89 | 3.36 |
| FOOD | 80 | Edible object | 1.49 | 6.70 |
| FREQ | 46 | Words related to frequency | 1.91 | 6.26 |
| GOAL | 53 | Words implying goals | 5.79 | 10.50 |
| HOSTILE | 833 | Words with hostility | 4.81 | 11.87 |
| HU | 795 | Words related to human | 36.16 | 32.12 |
| IAV | 1947 | Interpretative explanation of an action | 69.48 | 37.90 |
| INCR | 111 | Increase | 5.58 | 10.34 |
| INTJ | 42 | Exclamations, casual and slang references | 0.71 | 3.43 |
| INTREL | 577 | Interpersonal processes | 19.43 | 19.06 |
| KIN | 50 | Kinship | 0.90 | 4.41 |
| KNOW | 348 | Awareness or unawareness | 19.22 | 20.23 |
| LAND | 63 | Place occurring in nature | 1.31 | 5.21 |
| LEGAL | 192 | Legal words | 5.08 | 11.13 |
| MALE | 56 | Words implying male | 2.62 | 9.46 |
| MEANS | 244 | Methods utilized in attaining goals | 25.26 | 23.58 |
| MILIT | 88 | Military words | 0.63 | 4.26 |
| NATOBJ | 61 | Words describing natural objects | 2.18 | 7.34 |
| NATPRO | 217 | Natural processes, birth to death | 6.20 | 15.01 |
| NEED | 76 | Expression of need or intent | 4.54 | 8.43 |
| NEG | 2291 | Negative words (A broader version of NGTV) | 16.21 | 20.57 |
| NEGATE | 217 | Reversal or negation | 7.49 | 13.48 |
| NGTV | 1160 | Negative words | 13.90 | 18.81 |
| NO | 7 | Disagreement | 0.32 | 2.40 |
| NONADLT | 25 | Words related to childhood | 0.58 | 4.21 |
| NUMB | 51 | Numbness | 5.81 | 11.02 |
| OBJECT | 661 | Words implying objects | 29.13 | 28.76 |
| ORD | 15 | Ordinal words | 1.69 | 5.15 |
| OTHER | 1000 | Words not otherwise classifiable | 10.16 | 15.87 |
| OUGHT | 26 | Moral imperative | 0.43 | 2.76 |
| OUR | 6 | Inclusive self, we | 11.21 | 17.74 |
| PAIN | 254 | Words indicating suffering | 0.72 | 4.06 |
| PERCV | 192 | Perceive, recognizing or identifying something | 5.11 | 9.19 |
| PERSIST | 64 | Persistent, endurance | 1.91 | 5.34 |
| PLACE | 318 | Words indicating places | 19.03 | 21.95 |
| PLEASUR | 168 | Words indicating enjoyment | 2.26 | 6.04 |
| POLIT | 263 | Political works | 6.34 | 12.82 |
| POS | 1915 | Positive words (A broader version of PSTV) | 77.97 | 44.66 |
| POWER | 689 | Power, control or authority | 23.53 | 23.66 |
| PRON | 1000 | Pronoun | 57.24 | 40.46 |
| PSTV | 1045 | Positive words | 69.61 | 41.48 |
| PSV | 911 | Words implying passive | 22.26 | 20.39 |
| QUAL | 344 | Qualitative words | 9.05 | 14.36 |
| QUAN | 314 | Quantitative words | 46.71 | 31.15 |
| RACE | 15 | Racial words | 0.24 | 2.71 |
| REGION | 61 | Region | 6.19 | 11.96 |
| REL | 136 | Consciousness of abstract relationships | 11.50 | 14.44 |
| RELIG | 103 | Religious words | 0.40 | 3.20 |
| RISE | 25 | Movement of rising | 0.33 | 2.18 |
| ROLE | 569 | Social defined roles | 20.40 | 22.13 |
| ROUTE | 23 | Route, path | 1.74 | 6.09 |
| SELF | 7 | Singular self | 3.81 | 12.91 |
| SKY | 34 | Aerial condition and outer space | 0.83 | 4.22 |
| SOCIAL | 111 | Location for social interaction | 8.44 | 14.58 |
| SOLVE | 189 | Problem solving | 13.67 | 15.94 |
| SPACE | 302 | Consciousness of space or spatial relationship | 39.94 | 27.00 |
| STAY | 125 | Staying still | 1.96 | 5.58 |
| STRNG | 1902 | Words implying strength | 116.91 | 53.29 |
| SUBM | 284 | Submission to authority or power | 9.15 | 13.83 |
| SV | 102 | Mental or emotional state | 34.62 | 26.75 |
| TIME | 273 | Time consciousness | 31.29 | 27.18 |
| TOOL | 318 | Words related to tools | 10.84 | 16.46 |
| TRAVL | 209 | Travelling, physical movement | 9.02 | 12.98 |
| TRY | 70 | Activities for achieving a goal | 1.43 | 4.82 |
| UNDRST | 319 | Understated, words implying de-emphasis | 14.46 | 17.04 |
| VARY | 98 | Words indicating changes | 4.50 | 12.74 |
| VEHICLE | 39 | Vehicles | 3.84 | 13.14 |
| VICE | 685 | Assessment of moral disapproval | 2.66 | 7.25 |
| VIRTUE | 719 | Assessment of moral approval | 39.36 | 29.33 |
| WEAK | 755 | Words implying weakness | 14.63 | 17.87 |
| WORK | 261 | Ways of doing work | 17.29 | 19.26 |
| YES | 20 | Agreement | 0.89 | 3.63 |
| YOU | 9 | Another person being addressed | 34.47 | 32.09 |

Table B1. ANEW dictionary and descriptive statistics for the data set.

| Dictionary | Lexical entries | Basis of scale | Description of measures |
| --- | --- | --- | --- |
| AFINN | 2477 | −5 to 5 (whole numbers only) | Degree of valence positivity |
| ANEW | 2476 | 1 to 9 (continuous values) | ANEW-Valence: Degree of happiness; ANEW-Arousal: Degree of excitement; ANEW-Dominance: Degree of being in-control |

Table B2. Mean AFINN and ANEW cumulative scores for all advertisements, in each industry.

| Industry | Mean AFINN score | Mean ANEW score |
| --- | --- | --- |
| Capital Access | 9.5 | 65.6 |
| Compliance | 4.6 | 40.3 |
| Computer Technology | 7.8 | 59.4 |
| Customer Service | 5.9 | 58.6 |
| Marketing and Innovation | 10.0 | 74.0 |
| Luxury, Travel, and Personal | 7.4 | 63.0 |
| Workforce | 9.6 | 59.4 |

Fig. B1. ANOVA analysis: Industry vs. AFINN score.


financing/working capital (like Visa, MasterCard, Citi) fit under "Capital Access". However, payment processing companies (like PayPal), which do not provide business financing, and merely do transaction processing, are classified under "Customer Service".

C.2. Compliance

Advertisements for products or services that help small businesses to understand and comply with laws and regulations, including ensuring data security. Example sub-industries and companies are:

Tax – IRS, E-File, Turbotax
Accounting – Sage Accounting Software
Corporate Filings – The Company Corporation (incorporate.com)

C.3. Computer Technology

Advertisements for products or services that help small businesses to make technology work effectively and efficiently in the organization. Example sub-industries and companies are:

Hardware, Software, Computers, Printers, Photocopiers, Cameras, Telephony – CDW, AST, Compaq, Dell, Epson, HP, Intel, IBM, Microsoft, Toshiba, SAS, Oracle, Canon, Konica Minolta, Ricoh, Fujitsu, Sharp, Brother, Sony, Sprint, Verizon, AT&T

Special cases and notes: Technology companies (like Facebook, Google, and eBay) that are primarily assisting companies with marketing (advertising) their products are listed under the "Marketing & Innovation" category. "Computer Technology" is reserved only for internal technologies used by the company, not for customer-facing technologies.

Fig. B2. ANOVA analysis: Industry vs. ANEW score.

C.4. Customer Service

Advertisements for products or services that help small businesses to service their customers, show they care about them and grow their relationships. Example sub-industries and companies are:

Shipping – UPS, FedEx, USPS, DHL
Customer Survey – ForeSee Results
Live Chat – Bold Software
Payment Processing – PayPal
Order Management – OrderMotion, SAP

C.5. Marketing & Innovation

Advertisements for products or services that help small businesses with identifying new prospects, showing effective corporate positioning, converting leads, finding ways to advertise efficiently, and coming up with new ideas. Example sub-industries and companies are:

Email Marketing – Bronto, iContact
Internet Advertisements – Facebook Ads, Google Ads, eBay
Logos – Logoworks, The Logo Factory
Web Listing
Web Hosting – 1&1 web hosting, GoDaddy web hosting
Domain Names – Rick Latona
Customer Relationship Management – Salesforce.com, SalesGenie, Sage
Companies Offering Franchise Opportunities – CruiseOne, JaniKing, Kumon

Special cases and notes: Technology companies (like Facebook, Google, and eBay) that are primarily assisting companies with marketing (advertising) their products should be listed under “Marketing & Innovation” and not under “Computer Technology”. “Computer Technology” includes only internal technologies used by the company, not customer-facing (marketing) technologies. Include in the “Marketing & Innovation” category companies that are offering franchise opportunities to entrepreneurs (e.g. Cruise One, Jani King, Kumon, Candy Bouquet, Certa Pro Painters), since franchise corporations primarily exist to assist with marketing, branding and customer acquisition.
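The special-case notes in this appendix amount to a lookup in which explicit exceptions take precedence over a company's apparent sector. A minimal Python sketch of that idea follows. The dictionary names and the `assign_industry` helper are hypothetical and are not part of the coding procedure used in the study; the company-to-group pairs are taken from the examples listed in this appendix.

```python
# Illustrative sketch only: the names below are hypothetical helpers,
# populated with example advertisers named in Appendix C.

# Ordinary assignments, following the example lists for each group.
DEFAULT_INDUSTRY = {
    "Visa": "Capital Access",
    "MasterCard": "Capital Access",
    "Citi": "Capital Access",
    "Dell": "Computer Technology",
    "Microsoft": "Computer Technology",
    "UPS": "Customer Service",
    "Lexus": "Luxury, Travel, and Personal",
    "Fidelity Investments": "Workforce",
}

# Special cases from the notes: these override any sector-based guess.
SPECIAL_CASES = {
    "PayPal": "Customer Service",          # transaction processing, not financing
    "Facebook": "Marketing & Innovation",  # technology used for advertising
    "Google": "Marketing & Innovation",
    "eBay": "Marketing & Innovation",
    "Kumon": "Marketing & Innovation",     # franchise opportunity
}


def assign_industry(advertiser):
    """Return the industry group for an advertiser, or None if it is not listed.

    Special cases are checked first so that they take precedence.
    """
    if advertiser in SPECIAL_CASES:
        return SPECIAL_CASES[advertiser]
    return DEFAULT_INDUSTRY.get(advertiser)


paypal_group = assign_industry("PayPal")
google_group = assign_industry("Google")
unknown_group = assign_industry("Unlisted Company")
```

Checking the exceptions before the defaults keeps the precedence explicit, so a technology company that mainly sells advertising is never assigned to “Computer Technology” by accident.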

C.6. Luxury, Travel, and Personal

“Luxury, Travel, and Personal” is a new group added to the six groups defined by SBSI to cover advertisers and advertisements for luxury goods, travel-related products and services, or personal items. Example sub-industries and companies are:

Automobiles – Lexus, BMW, Chevrolet, GMC, Land Rover, Jaguar
Airline – Delta, US Airways, American Airlines, Cessna
Hotel – Starwood, Hilton, Hyatt, Holiday Inn, Doubletree, Embassy Suites
Luxury goods – Rolex
Alcohol and Tobacco – Johnnie Walker
Other – Enzyte

C.7. Workforce

Advertisements for products or services that help small businesses to attract, retain, train and develop, motivate and reward, and deploy employees efficiently, as well as encourage creativity from them. Example sub-industries and companies are:

Employment
Education
Business Books & Magazines – Entrepreneur Magazine Book Titles
Training – Sandler Sales Institute
Life/Personal/Disability Insurance
Personal Investments – Fidelity Investments.

Appendix D. Significant variables and common words for industries

This appendix provides more examples of which major feature variables (specifically, General Inquirer categories) determine the classification of an advertisement, and which words in each industry group trigger these variables. Tables D1–D6 provide the most significant categories and words for Marketing & Innovation; Computer Technology; Luxury, Travel & Personal; Workforce; Customer Service; and Capital Access. Each column represents a General Inquirer category, and the cells give the most common words of that category that appear in the advertisements of a given industry.

Marketing & Innovation (Table D1): As described in the main body (Section 7: Discussion), BEGIN words are strongly associated with the Marketing & Innovation industry. Other variable categories that are positively related to the classification of the Marketing & Innovation group include COMPLET, meaning that the achievement of goals is often explicitly addressed in the Marketing & Innovation ads in the training set; TRY, meaning the ads are rich in content about actions attempted to achieve goals; SOCIAL and REL, referring to the description of social activities or relationships between people and objects; VARY, indicating that changes and adaptations are also dominant themes of the Marketing advertisements; SV, meaning that many mental or emotional verbs are used, which accords with the discussion above that the Marketing advertisements have a higher degree of valence; MALE, indicating content that clearly refers to the male gender; REGION, probably because global, national and other geographical keywords or content are prevalent; and NEGATE, because this group tends to describe its products with expressions like unlimited, incredible, unlike, and so forth. Table D1 provides a shortlist of words from these categories that had highest prevalence in Marketing & Innovation relative to other industries.
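This appendix does not state how “prevalence relative to other industries” was computed. One plausible reading, offered here only as an assumption, is the ratio of a word's smoothed relative frequency in one industry's advertisements to its smoothed relative frequency in all other advertisements. The Python sketch below illustrates that reading; the function name `relative_prevalence`, the add-one smoothing, and the toy advertisement texts are all invented for illustration.

```python
from collections import Counter


def relative_prevalence(target_docs, other_docs, smoothing=1.0):
    """Score each word in target_docs by the ratio of its smoothed relative
    frequency in target_docs to its smoothed relative frequency in other_docs.

    The ratio-with-smoothing metric is an assumption, not the paper's
    documented procedure.
    """
    target = Counter(w for doc in target_docs for w in doc.lower().split())
    other = Counter(w for doc in other_docs for w in doc.lower().split())
    target_total = sum(target.values())
    other_total = sum(other.values())
    vocab_size = len(set(target) | set(other))

    scores = {}
    for word, count in target.items():
        target_rate = (count + smoothing) / (target_total + smoothing * vocab_size)
        other_rate = (other[word] + smoothing) / (other_total + smoothing * vocab_size)
        scores[word] = target_rate / other_rate
    return scores


# Toy advertisement texts, invented for this example.
marketing_ads = ["start your unlimited global campaign", "start new leads"]
other_ads = ["ship your package today", "your loan is approved"]

scores = relative_prevalence(marketing_ads, other_ads)
shortlist = sorted(scores, key=scores.get, reverse=True)[:3]
```

Under this metric, a word scores above 1 when it is more frequent in the target industry than elsewhere, so sorting by score yields a shortlist of the kind shown in Tables D1–D6.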

Computer Technology (Table D2): As described in the main body (Section 7: Discussion), TOOL words are strongly associated with the Computer Technology industry. Other significant themes for the Computer Technology group include GOAL, which emphasizes result and target; POS, indicating content with a positive outlook; POWER, because Computer Technology involves a lot of management, monitoring, protection and organization of business operations, thus indicating power and control; WEAK, which is triggered by descriptions of small size, lower cost and effort, and serving the customer’s need (words like “small”, “less”, “need” and “serve” are all indicators of weakness); PAIN, resulting from advertising text that proclaims the freedom from pain, frustration and anxiety enabled by technology, where these negative words are counted as expressions of pain; and PERCEV, significant on one hand because technology service companies claim to help customers see or view their business from a better angle, and on the other hand because companies like Compaq, Canon and Konica Minolta say much about the visual effect of their monitors and cameras. Table D2 provides a shortlist of words from these categories that had highest prevalence in Computer Technology relative to other industries.

Luxury, Travel, and Personal (Table D3): As described in the main body (Section 7: Discussion), QUAL and VEHICLE words are strongly associated with the Luxury, Travel, and Personal industry. The ads from this group also contain much content regarding AQUATIC, for example the descriptions of swimming pools and seas in advertisements from hotels; FOOD, referring to food and cuisine; STAY, because the hotel advertisements in this group commonly describe staying and resting; and OBJECT, which is a large category containing tools, food, vehicles, building parts, etc. Table D3 provides a shortlist of words from these categories that had highest prevalence in Luxury, Travel, and Personal relative to other industries.

Workforce (Table D4): As described in the main body (Section 7: Discussion), ACAD words are strongly associated with the Workforce industry. Other significant variables include ROLE and HUMAN, indicating roles, human behavior patterns and human activities, which are the major concerns of some training institutions in the Workforce group; IAV, representing interpretive actions like help, make, learn, etc., which are also common themes of a training or educating process; MEANS, meaning effort taken in attaining goals, like plan, work, etc.; and AROUSAL, because of recurring themes of passion, inspiration, morale and attention in the education advertisements. Table D4 provides a shortlist of words from these categories that had highest prevalence in Workforce relative to other industries.

Customer Service (Table D5): As described in the main body (Section 7: Discussion), SOLVE words are strongly associated with the Customer Service industry. Table D5 provides a shortlist of words from this category that had highest prevalence in Customer Service relative to other industries.

Capital Access (Table D6): As described in the main body (Section 7: Discussion), ECON words are strongly associated with the Capital Access industry. The group is also rich in TRAVL, which on one hand originates from expressions regarding the manipulation of capital through words like “transfer”, “move” and “flow”, and on the other hand results from ads of credit card companies emphasizing their support for payment for trips and flights; OUGHT, because the Capital Access industry tends to use a more imperative tone, using the word “must” or “should”, for example in expressions like “payments must be made”; and COLL, due to the common usage of words indicating collective groups or organizations, for example “bank”, “agency”, “team” and “company”, which are either coordinators of access to capital or groups in need of capital. Table D6 provides a shortlist of words from these categories that had highest prevalence in Capital Access relative to other industries.
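To make concrete how individual words trigger category variables, the following Python sketch tags an advertisement text with a very small lexicon. The lexicon contains only the example words quoted in this appendix; the actual General Inquirer dictionary is far larger. The names `LEXICON` and `category_rates` are hypothetical, and dividing counts by text length is an assumption, since this appendix does not state how the variables were scaled.

```python
import re
from collections import Counter

# Tiny illustrative lexicon: only the example words quoted in Appendix D.
LEXICON = {
    "WEAK": {"small", "less", "need", "serve"},
    "TRAVL": {"transfer", "move", "flow"},
    "OUGHT": {"must", "should"},
    "COLL": {"bank", "agency", "team", "company"},
    "NEGATE": {"unlimited", "incredible", "unlike"},
    "IAV": {"help", "make", "learn"},
    "MEANS": {"plan", "work"},
}


def category_rates(text):
    """Return, for each category, the share of tokens in text that belong to it.

    Normalizing by token count is an assumption made for this sketch.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for token in tokens:
        for category, words in LEXICON.items():
            if token in words:
                counts[category] += 1
    total = len(tokens) or 1  # avoid division by zero for empty text
    return {category: counts[category] / total for category in LEXICON}


# Invented ad text that reuses the paper's phrase "payments must be made".
ad_text = "Payments must be made before the bank can transfer funds"
rates = category_rates(ad_text)
```

In this toy text, “must”, “bank” and “transfer” each account for one of ten tokens, so OUGHT, COLL and TRAVL each receive a rate of 0.1 and the remaining categories receive 0.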

References

Abernethy, A. M., & Franke, G. R. (1996). The information content of advertising: A meta-analysis. Journal of Advertising, 25(2), 1–17.

Abrahams, A. S., Coupey, E., Rajivadekar, A., Miller, J., Snyder, D. C., & Hayden, S. J. (2012). Marketing to the American entrepreneur: Insights and trends from mass-market print magazine advertising. Journal of Research in Marketing and Entrepreneurship, 14(1), 65–94.

Anagnostopoulos, A., Broder, A. Z., Gabrilovich, E., Josifovski, V., & Riedel, L. (2011). Web page summarization for just-in-time contextual advertising. ACM Transactions on Intelligent Systems and Technology, 3(1), 14.

Agresti, A. (1990). Categorical data analysis. Wiley.

Apte, C., Damerau, F., & Weiss, S. (1994). Automated learning of decision rules for text categorization. ACM Transactions on Information Systems, 12(3), 233–240.

Audit Bureau of Circulations (ABC). (2012). ABC eCirc. Available at: http://abcas3.accessabc.com/ecirc/index.html. Accessed on: 13 September 2012.

Azcarraga, A. P., Hsieh, M.-H., & Setiono, R. (2008). Market research applications of artificial neural networks. In Proceedings of the IEEE congress on evolutionary computation (CEC 2008), Hong Kong, China (pp. 357–363).

Biswas, A., Olsen, J. E., & Carlet, V. (1992). A comparison of print advertisements from the United States and France. Journal of Advertising, 21(4), 73–81.

Borko, H., & Bernick, M. (1963). Automatic document classification. Journal of the ACM, 10(2), 151–162.

Bradley, A. P. (1997). The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition, 30, 1145–1159.

Bradley, M. M., & Lang, P. J. (2010). Affective norms for English words (ANEW): Instruction manual and affective ratings. Technical Report C-2, The Center for Research in Psychophysiology, University of Florida.

Broder, A., Fontoura, M., Josifovski, V., & Riedel, L. (2007). A semantic approach to contextual advertising. In Proceedings of the 30th annual international ACM SIGIR conference on research and development in information retrieval, Amsterdam, The Netherlands (pp. 559–566).

Business Publishers Association (BPA). (2012). About BPA WorldWide. Available at: http://www.bpaww.com/Bpaww_com/Pages/AboutBPA.aspx. Accessed on: 13 September 2012.

Calvo, R. A., Lee, J. M., & Li, X. (2004). Managing content with automatic document classification. Journal of Digital Information, 5(2), 1–15.

Chen, H., Schuffels, C., & Orwig, R. (1996). Internet categorization and search: A self-organizing approach. Journal of Visual Communication and Image Representation, 7(1), 88–102.

Chung, W., Chen, H., & Nunamaker, J. (2005). A visual framework for knowledge discovery on the web: An empirical study of business intelligence exploration. Journal of Management Information Systems, 21(4), 57–84.

Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46.

Curry, B., & Moutinho, L. (1993). Neural networks in marketing: Modeling consumer responses to advertising stimuli. European Journal of Marketing, 27(7), 5–20.

Dasgupta, C. G., Dispensa, G. S., & Ghose, S. (1994). Comparing the predictive performance of a neural network model with some traditional market response models. International Journal of Forecasting, 10(2), 235–244.

Davies, F., Moutinho, L., & Curry, B. (1996). ATM user attitudes: A neural network analysis. Marketing Intelligence & Planning, 14(2), 26–32.

Dowling, G. R., & Kabanoff, B. (1996). Computer-aided content analysis: What do 240 advertising slogans have in common? Marketing Letters, 7, 63–75.

Dreyfus, G. (2005). Neural networks: Methodology and applications. New York: Springer.

Fan, T. K., & Chang, C.-H. (2010). Sentiment-oriented contextual advertising. Knowledge and Information Systems, 23(3), 321–344.

Gallagher, K., Foster, D. K., & Parsons, J. (2001). The medium is not the message: Advertising effectiveness and content evaluation in print and on the Web. Journal of Advertising Research, 41(4), 57–70.

Golub, K. (2006). Automated subject classification of web documents. Journal of Documentation, 62(3), 350–371.

Graham, J. L., Kamins, M. A., & Oetomo, D. S. (1993). Content analysis of German and Japanese advertising in print media from Indonesia, Spain, and the United States. Journal of Advertising, 22(2), 5–15.

Gross, B. L., & Sheth, J. N. (1989). Time-oriented advertising: A content analysis of United States magazine advertising, 1890–1988. The Journal of Marketing, 53(4), 76–83.

Heath, R. (2005). Measuring affective advertising: Implications of low attention processing on recall. Journal of Advertising Research, 45(2), 269–281.

Huang, X., & Brown, A. (1999). An analysis and classification of problems in small business. International Small Business Journal, 18(1), 73–85.


Joachims, T., & Sebastiani, F. (Eds.). (2002). Automated text categorization (Special Issue). Journal of Intelligent Information Systems, 18.

Kaefer, F., Heilman, C. M., & Ramenofsky, S. D. (2005). A neural network application to consumer classification to improve the timing of direct marketing activities. Computers and Operations Research, 32(10), 2595–2615.

Kim, Y. S., Street, W. N., Russell, G. J., & Menczer, F. (2005). Customer targeting: A neural network approach guided by genetic algorithms. Management Science, 51(2), 264–276.

Kolbe, R. H., & Albanese, P. J. (1996). Man to man: A content analysis of sole-male images in male-audience magazines. Journal of Advertising, 25(4), 1–20.

Lewis, D. D., Yang, Y., Rose, T. G., & Li, F. (2004). RCV1: A new benchmark collection for text categorization research. Journal of Machine Learning Research, 5, 361–397.

Li, Y. H., & Jain, A. K. (1998). Classification of text documents. The Computer Journal, 41(8), 537–546.

MacInnes, D. J., Moorman, C., & Jaworski, B. J. (1991). Enhancing and measuring consumers’ motivation, opportunity, and ability to process brand information from ads. Journal of Marketing, 55(4), 32–53.

Motes, W. H. (1992). Reactions to lexical, syntactical, and text layout variations of a print advertisement. Journal of Business and Technical Communication, 6, 200–223.

Naccarato, J. L., & Neuendorf, K. A. (1998). Content analysis as a predictive methodology: Recall, readership, and evaluations of business-to-business print advertising. Journal of Advertising Research, 38(3), 19–33.

Neuendorf, K. A. (2002). The content analysis guidebook. Sage Publications.

Nielsen, F. (2011). A new ANEW: Evaluation of a word list for sentiment analysis in microblogs. In Proceedings of the ESWC2011 Workshop on Making Sense of Microposts: Big things come in small packages (pp. 93–98).

Nigam, K., McCallum, A., Thrun, S., & Mitchell, T. (2000). Text classification from labeled and unlabeled documents using EM. Machine Learning, 39(2/3), 103–134.

Paliwal, M., & Kunar, U. A. (2009). Neural networks and statistical techniques: A review of applications. Expert Systems with Applications, 36(1), 2–17.

Ramalingam, V., Palaniappan, B., Panchanatham, N., & Palanivel, S. (2006). Measuring advertisement effectiveness—A neural network approach. Expert Systems with Applications, 31(1), 159–163.

Resnik, A., & Stern, B. L. (1977). An analysis of information content in television advertising. The Journal of Marketing, 41(1), 50–53.

Riloff, E., & Lehnert, W. (1994). Information extraction as a basis for high-precision text classification. ACM Transactions on Information Systems, 12(3), 296–333.

Rockbridge Associates (2011). The State of Small Business Report: January 2011 Survey of Small Business Success. University of Maryland, Robert H. Smith School of Business. Network Solutions. February 9, 2011.

Ruiz, M., & Srinivasan, P. (2002). Hierarchical text categorization using neural networks. Information Retrieval, 5(1), 87–118.

SAS Institute Inc. (2010). JMP® 9 modeling and multivariate methods (pp. 272–281). Cary, NC: SAS Institute Inc. Retrieved from <http://www.jmp.com/support/downloads/pdf/jmp902/modeling_and_multivariate_methods.pdf>.

Schwartz, E. I. (1992). Smart programs go to work. BusinessWeek, March. Available at <http://www.businessweek.com/stories/1992-03-01/smart-programs-go-to-work>.

Sebastiani, F. (2002). Machine learning in automated text categorization. ACM Computing Surveys, 34(1), 1–47.

Setiono, R., Thong, J. Y. L., & Yap, C. S. (1998). Symbolic rule extraction from neural networks: An application to identifying organizations adopting IT. Information and Management, 34(2), 91–101.

Sissors, J. Z., & Baron, R. B. (2010). Advertising media planning (7th ed.). McGraw-Hill.

Smith, K. A., & Gupta, J. N. D. (2000). Neural networks in business: Techniques and applications for the operations researcher. Computers and Operations Research, 27, 1023–1044.

Starch Information Sources Study (2010). Prepared for Canadian Business Press, Toronto, ON, March. Available at <http://www.omdc.on.ca/AssetFactory.aspx?did=6881>.

Stone, P. J., Dunphy, D. C., Smith, M. S., & Ogilvie, D. M. (1966). The general inquirer: A computer approach to content analysis. Cambridge, MA: MIT Press.

Vellido, A., Lisboa, P. J. G., & Vaughan, J. (1999). Neural networks in business: A survey of applications (1992–1998). Expert Systems with Applications, 17(1), 51–70.

Weinrauch, D., Mann, K., Robinson, P. A., & Pharr, J. (1991). Dealing with limited financial resources: A marketing challenge for small business. Journal of Small Business Management, 44–53.

Yang, Y., Slattery, S., & Ghani, R. (2002). A study of approaches to hypertext categorization. Journal of Intelligent Information Systems, 18(2–3), 219–241.

Zhang, G. P. (2000). Neural networks for classification: A survey. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 30(4), 451–462.

