

STATISTICAL MODELS FOR CUSTOMER SATISFACTION DATA

Silvia Figini

University of Pavia, [email protected]

Paolo Giudici

University of Pavia, [email protected]

Abstract

This paper considers customer satisfaction data. Section 1 gives a short introduction, and Section 2 describes how data for measuring customer satisfaction can be obtained. Section 3 presents how the collected data can be analyzed, and Section 4 reviews standard statistical methods for customer satisfaction analysis. Section 5 explains our methodological proposal, based on discrete graphical models, together with a novel theoretical proposal for combining different types of customer information. Section 6 presents a statistical analysis of the dataset available on the website http://www.economia.unimi.it/projects/CSProject/. Section 7 gives the conclusions and further ideas for research.

Keywords: Customer Satisfaction, Customer dataset, Dynamic Models, Graphical Models, Lifetime Value.

1 INTRODUCTION

Customer satisfaction represents a modern approach to quality in enterprises and organizations and serves the development of a truly customer-focused management and culture. Customer satisfaction measures offer meaningful and objective feedback about clients' preferences and expectations. Customer satisfaction research is one of the fastest growing segments of the marketing field. Marketing and management sciences nowadays focus on coordinating all of the organization's activities in order to provide goods or services that best satisfy the specific needs of existing or potential customers. To reinforce customer orientation on a day-to-day basis, a growing number of companies choose customer satisfaction as their main performance indicator. However, it is almost impossible to keep an entire company permanently motivated by a notion as abstract and intangible as customer satisfaction. Therefore, customer satisfaction must be translated into a number of measurable parameters directly linked to people's jobs, in other words, into factors that people can understand and influence. For more details on customer satisfaction, see e.g. Siskos et al., 1998; Cassel, 2000; Cassel et al., 2001; Lundström, 2005; Eklöf, 2002.

Silvia Figini, [email protected]


2 HOW TO OBTAIN CUSTOMER SATISFACTION DATA

In line with standard procedures in survey sampling, the objects under study have to be defined, as well as the target populations, the frames containing the targets and the domains of study.

The first contact will probably be through the web. Surveys on the web typically consist of a questionnaire that pops up when someone enters the agency's website. The questionnaire is unlikely to be detailed; it would be fairly simple and focus on questions dealing with the appearance of the site. This could be suitable for continuously monitoring the opinions of the visitors to the site. Some features of this type of survey are worth considering. The target population would be the visitors to the site. This population consists of the public, the media, researchers, and people from institutions and companies. Unless questions are asked about it, there is no way to distinguish between the contacts, so presumably the responses will be taken as representative of the public. However, there is reason to believe that they are not entirely representative of the public, since the other categories would probably be overrepresented. There will probably be several answers from the same individual. If a person answers the questionnaire every time he/she visits the site, his/her opinion will automatically be weighted according to the frequency of the visits. If he/she does not answer every time, the frequency is unknown. The non-response rate would typically be unknown. This type of survey is cheap, fast and can generate data continuously.

The second contact is likely to be through telephone and/or mail/email. An agency mails a questionnaire to a group of customers or former customers. The customers complete the questionnaire and mail it back to the sender. Mail surveys tend to consume less staff time because interviewers are not used; however, they tend to have low response rates: many customers fail to respond, even when provided with an addressed, stamped envelope for returning the completed questionnaire.

The third step is a meeting or discussion over the phone with the staff of the agency. The agency calls current and/or former customers and asks questions by telephone. Telephone surveys are impractical in much of “Indian country” because of the high percentage of homes that lack telephone service. If telephone surveys are used as a method for collecting customer satisfaction data, it is helpful to send a postcard to the customer in advance of the telephone call informing him/her of the survey.

It is also possible to obtain data from face-to-face interviews, in which a member of the agency program staff or another person interviews current and/or former customers (often at a program office or at a post-placement workplace), reading and/or signing the questionnaire items to the customer. Face-to-face interviews, by their nature, tend to be more intimate than the other methods of data collection. This intimacy can be used to communicate care and concern to customers, but it has the disadvantage that customers may be more reluctant to criticize or speak negatively about the program in a face-to-face interview than in an Internet, mail, or telephone survey. On balance, we recommend collecting customer satisfaction data using face-to-face interviews.

2 ANALYSIS OF CUSTOMER SATISFACTION DATA

Before presenting our methodological proposal for measuring customer satisfaction, we discuss the term satisfaction, which is a somewhat vague concept. A customer can be more or less satisfied with the quality of a service. Satisfaction should be seen as a continuous variable ranging from not satisfied at all to completely satisfied. To measure satisfaction, scales with fixed endpoints are often used. The lowest point on the scale represents the situation in which a customer is not satisfied at all, and the highest point the situation in which a customer is completely satisfied. A value in between represents the degree of satisfaction perceived by the customer. Because it can be difficult to obtain exact agreement between the customer's opinion and the numerical value stated on a limited scale, it seems reasonable to allow for a small approximation error. Thus the idea of latent variables seems appropriate in this context.

In psychometrics, sociology, econometrics and other sciences one often tries to measure concepts that are not explicitly measurable. Some examples are concepts like “attitude”, “motivation”, “satisfaction with services”, “satisfaction with processes” and so on. Concepts like “sex”, “age”, “weight” and so on can be measured directly. Concepts that are not directly measurable are called latent variables in statistical analysis, whereas variables that can be directly measured are called manifest variables. The common theory for measuring diffuse concepts (latent variables) requires that, in order to measure a latent variable, there should exist a number of manifest variables that can be measured directly.

Measuring customer satisfaction can be done by simply asking a series of questions. It is generally best if some of the questions measure the degree of satisfaction or dissatisfaction using a scale. Satisfaction is best measured on a continuous scale, but for obvious reasons we cannot use an unlimited scale, so we have to compromise. The scale should allow the customer enough flexibility to express his opinion and yet be limited. In studies using this methodology a ten point scale is often used, with the endpoints fixed at 10, representing the case in which the customer is completely satisfied, and 0, the case in which the customer is not at all satisfied. Alternatively, the customer can be asked to indicate the item that best represents his or her view from a set of five alternatives: 1 Very unsatisfied, 2 Moderately unsatisfied, 3 Neutral, 4 Moderately satisfied, 5 Very satisfied. Questions presented in this way can easily be scored on a 5-point scale.

At least 3 questions should be attached to each component. The questions are the manifest variables. They should be formulated in such a way that they relate to aspects of the components that are easy to recognize in reality. Overall, the questionnaire will contain some 30-40 questions about the customer's satisfaction with different aspects of the service. There should also be some background variables that make a more detailed analysis regarding segmentation possible.

To solve the problem we typically use mathematical and statistical methods, such as structural equation models (SEM), to analyze the collected data and to estimate the relations among the latent variables and between the manifest and the latent variables. The estimated model fits the data according to some criterion, e.g. it minimizes the deviations from the actually observed data. Note that the weighting of the manifest variables into the latent variables is determined by the method itself and not by any personal opinion. It is then possible to calculate the reliabilities of the manifest variables, create the contents of the latent variables and calculate the relations between the latent variables. It is also possible to see whether the model we set out to estimate really fits the data to a satisfactory degree, for example using the coefficient of determination, R², as a measure of fit. There are several methods for solving the structure.
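The scoring of five-point items and the R² measure of fit mentioned above can be illustrated with a small Python sketch. It is only a rough illustration under stated assumptions: the names `LIKERT`, `component_score` and `r_squared` are hypothetical, and the equal-weight average is a simple stand-in for the weights that a latent variable method such as SEM would estimate from the data.

```python
from statistics import mean

# Five-point coding of the alternatives listed in the text (names are illustrative).
LIKERT = {
    "Very unsatisfied": 1,
    "Moderately unsatisfied": 2,
    "Neutral": 3,
    "Moderately satisfied": 4,
    "Very satisfied": 5,
}

def component_score(answers):
    """Equal-weight mean of the answered manifest items; None if nothing was answered.

    Assumption: equal weights. A latent variable method would estimate them instead.
    """
    scores = [LIKERT[a] for a in answers if a in LIKERT]
    return mean(scores) if scores else None

def r_squared(observed, fitted):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    y_bar = mean(observed)
    ss_tot = sum((y - y_bar) ** 2 for y in observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    return 1 - ss_res / ss_tot
```

For example, `component_score(["Neutral", "Very satisfied", "Moderately satisfied"])` averages the scores 3, 5 and 4, while answers outside the five alternatives (such as "N/A") are skipped.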
The method of using structures of latent variables is supported by international quality organizations such as EFQM (European Foundation for Quality Management) and EOQ (European Organization for Quality), and by national quality organizations. In 2003 the National Economic Research Associates evaluated different methods for measuring customer satisfaction, with the aim of choosing the best one for use in the USA. In 2005 the European Customer Satisfaction Index Steering Committee evaluated methods for measuring customer satisfaction in Europe. Both evaluations gave the same result. For more details on customer satisfaction measurement, see e.g. Bob E. Hayes, 1998; Terry G. Vavra, 1997; Donald C. et al., 1989; Robert Grady, 1992; Norman et al., 1995.

3 OUR PROPOSAL TO ANALYSE CUSTOMER SATISFACTION DATA

Our objective is not only to create a structure that can be used for measuring overall customer satisfaction, but also to use statistical methods to calculate the impact of the different components on overall satisfaction. We propose two possible methodological approaches for customer satisfaction data, with particular attention to the dataset available on the website http://www.economia.unimi.it/projects/CSProject/, which is based on a questionnaire structure.

3.1 DISCRETE GRAPHICAL MODEL FOR CUSTOMER SATISFACTION

The first approach is a possible implementation of graphical models for categorical data. The second approach is a theoretical proposal based on dynamic measures of customer satisfaction.

First, we adopt an approach to data analysis based on statistical models that can be displayed as graphs, namely graphical models (see e.g. Giudici, 2003). In these graphs, nodes represent variables, and edges drawn between nodes represent conditional dependences. That is to say, a line or arrow is drawn between two nodes unless the two variables are conditionally independent given some or all of the remaining variables. In this way, the graphs supply precise representations of the interrelationships between the variables in the model. Being able to work directly with the graphs promotes an understanding of the dependence structure of the data.

To implement graphical models we use the software MIM. MIM supports three types of independence graph, as well as factor graphs. Undirected graphs have undirected edges, drawn as lines rather than arrows; they are suitable for cross-sectional data, or for situations in which there is no causal or other ordering between the variables. Directed graphs use directed edges, drawn as arrows, which express direction of influence or, sometimes, causal direction. Chain graphs combine the undirected and directed graphs: prior to analysis the variables are grouped into blocks, and variables in the same block may be linked by lines (not arrows), whereas variables in different blocks are linked by arrows (not lines). Factor graphs (also known as interaction graphs) display the interaction terms in a model and are useful when working with hierarchical models.

We consider models that aim to represent the associations between a number of discrete variables. We start with the so-called log-linear representation of a multi-way contingency table. This representation is convenient for our purpose because it allows us to express (conditional) independence constraints by setting certain coefficients equal to zero. In fact we define subclasses of the log-linear model that can be fully interpreted in terms of conditional independence relations. These are, in order of inclusion: hierarchical models, graphical models and finally decomposable models. Most of the material in this section is based on the book by Whittaker, 1990. Other sources used in writing this section are Edwards, 2000; Schafer, 1997; Christensen, 1997; Bishop, 1975.

We can represent the conditional independence relations between a set of random variables in a so-called conditional independence graph. Let $X = (X_1, X_2, \ldots, X_k)$ be a k-dimensional random vector. The conditional independence graph of $X$ is the undirected graph $G = (K, E)$, with $K = \{1, 2, \ldots, k\}$, where $\{i, j\}$ is not in the edge set $E$ iff $X_i \perp X_j \mid \text{rest}$. This is called the pairwise Markov property. Perhaps surprisingly, the global Markov property and the local Markov property turn out to be equivalent to it.

A random experiment that only distinguishes between two possible outcomes is called a Bernoulli experiment. The outcomes of the target variable $Y$ are usually referred to as success and failure respectively (here: customer satisfied, customer not satisfied). We define a random variable $Y$ that denotes the number of successes in a Bernoulli experiment; $Y$ consequently has possible values 0 and 1. The probability distribution of $Y$ is completely determined by the probability of success, which we denote by $p$: $P(Y = 0) = 1 - p$ and $P(Y = 1) = p$. In order to predict $Y$ we can use log-linear models based on binary variables. However, in our case it is necessary to extend the log-linear model to take into account categorical variables with more than two categories. To see how we can generalise the log-linear model to this case, consider first the $2 \times 2$ table:

$$\log f_{12}(x_1, x_2) = u_\emptyset + u_1 x_1 + u_2 x_2 + u_{12} x_1 x_2 \quad \text{for } x \in \{0, 1\}^2.$$

What if the $x_i$ have more than two levels? The trick is to make the u-terms functions of $x$ rather than constants:

$$\log f_{12}(x_1, x_2) = u_\emptyset + u_1(x_1) + u_2(x_2) + u_{12}(x_1, x_2).$$

In fact we now have too many parameters, and in order to identify them we have to impose some extra constraints. To be consistent with the binary case, we impose the constraint that $u_a(x_a) = 0$ whenever $x_i = 0$ for some $i \in a$. Here we assume that if $x_i$ has $d_i$ possible values, these are numbered $0, 1, \ldots, d_i - 1$. Note, however, that this numbering does not imply any ordering of the values. So, for example, suppose $x_1$ has two possible values (0, 1) and $x_2$ has three possible values (0, 1, 2); then the following u-terms are constrained to be zero:

$$u_1(0) = 0, \quad u_2(0) = 0, \quad u_{12}(0, 1) = u_{12}(0, 2) = u_{12}(1, 0) = u_{12}(0, 0) = 0.$$

To build hierarchical and graphical log-linear models we use the log-linear expansion, the notion of independence and the u-terms. The importance of the log-linear expansion rests in the fact that many interesting hypotheses can be generated by setting u-terms to zero. In most applications it does not make sense to include the three-way association $u_{123}$ unless the two-way associations $u_{12}$, $u_{13}$ and $u_{23}$ are also present. A log-linear model is said to be hierarchical if the presence of a term implies that all lower-order terms contained in it are also present. This implies that a hierarchical model is identified by listing its highest-order interaction terms.

We now give a definition of the graphical models that we will apply to the data set. Given an independence graph $G = (K, E)$, the cross-classified multinomial distribution for the random vector $X$ is a graphical model for $X$ if the distribution of $X$ is arbitrary apart from constraints of the form that, for all pairs of coordinates not in the edge set $E$ of $G$, the u-terms containing the selected coordinates are identically zero. More explicitly, the density of a multinomial graphical model is:

$$\log f_K(x) = \sum_{a \subseteq K} u_a(x_a),$$

subject to the constraints that $u_a = 0$ if $\{i, j\} \subseteq a$ and $\{i, j\}$ is not in the edge set $E$. The parameters of the graphical model are the remaining u-terms that are not set to zero. In the application we show a set of hierarchical models and their independence graphs.
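The constraint above can be made concrete with a short Python sketch that lists, for a given independence graph, the subsets a of K whose u-terms are left free, and checks the hierarchy condition. The function names and the three-variable example graph are illustrative assumptions, not part of the analysis carried out with MIM.

```python
from itertools import combinations

def graphical_u_terms(vertices, edges):
    """Return the subsets a of K whose u-term is not forced to zero.

    u_a is set to zero as soon as a contains a pair {i, j} that is not an
    edge, so the free terms are the subsets whose vertices are pairwise
    joined (the empty set and the singletons always qualify).
    """
    edge_set = {frozenset(e) for e in edges}
    free = []
    for size in range(len(vertices) + 1):
        for a in combinations(sorted(vertices), size):
            if all(frozenset(pair) in edge_set for pair in combinations(a, 2)):
                free.append(a)
    return free

def is_hierarchical(terms):
    """True if every lower-order term contained in a listed term is also listed."""
    term_set = {tuple(sorted(a)) for a in terms}
    return all(
        b in term_set
        for a in term_set
        for size in range(len(a))
        for b in combinations(a, size)
    )

# Illustrative graph on K = {1, 2, 3} with edges 1-2 and 2-3 (no edge 1-3).
free_terms = graphical_u_terms([1, 2, 3], [(1, 2), (2, 3)])
```

In this example the terms $u_{13}$ and $u_{123}$ are excluded because the pair {1, 3} is not an edge, and the remaining list satisfies the hierarchy condition, as any graphical model does.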


3.2 DYNAMIC MODELS FOR CUSTOMER SATISFACTION: AN INTEGRATED APPROACH

Here we present our second methodological proposal: a new model for customer satisfaction data that combines time-dependent information and expert opinion. Customer satisfaction can change over time. At one point in time, for example early in an intervention or at an early stage of a program, the customer might be mildly dissatisfied, and at a later point in time the customer might be very satisfied. Changes in the level of customer satisfaction, and the different reasons for or explanations of such changes, suggest that the timing of measurement is important and that measuring and interpreting customer satisfaction can be challenging.

We think that it is important to collect different measures at different times for each customer, so as to build, for example, a panel customer satisfaction dataset that contains observations on multiple entities (customers), where each entity is observed at two or more points in time. A double subscript distinguishes entities (customers) and time periods (years or months): $i = 1, \ldots, n$ indexes the entities, where $n$ is the number of entities, and $t = 1, \ldots, T$ indexes the time periods, where $T$ is the number of time periods. Suppose we have one covariate (only one question in the questionnaire, for example marital status). The data are $(X_{it}, Y_{it})$, $i = 1, \ldots, n$, $t = 1, \ldots, T$. In our case we can observe panel data with $k$ covariates:

$$(X_{1it}, X_{2it}, \ldots, X_{kit}, Y_{it}), \quad i = 1, \ldots, n, \quad t = 1, \ldots, T,$$

with $n$ customers observed over $T$ time periods. With this structure we can model customer satisfaction data using longitudinal techniques. It is not possible to work with a balanced panel, because missing observations are present (a customer skips some questions). We therefore turn our attention to unbalanced panels, in which some entities (customers) are not observed for some time periods.

We observe that, in relation to Section 3, a great deal of information about the customer comes from different datasets. In particular, we can observe the behaviour of a customer with the company over time. This is very important information. Customers begin a relationship with a company and, over time, either decide to continue this relationship or end it. At any point in this Life-Cycle, the customer is becoming either more or less likely to continue doing business with the company, and demonstrates this likelihood through his or her interactions with the company. It is easy to collect data from these interactions (purchases for commerce, page views or log-ins for publishing, contacts for service) and to use these data to predict where the customer is in the Life-Cycle. If we can predict where customers are in the Life-Cycle, we can maximize the marketing ROI by targeting the customers most likely to buy, trying to “save” customers who have declining interest, and not wasting money on customers unlikely to continue doing business.

The most difficult part of calculating the customer lifetime value (LTV) is deciding what a “lifetime” is. The lifetime is the amount of time a customer will stay before defecting and leaving the business. Typically, a customer churns when he or she is not satisfied. This is a crucial point of our proposal. We can merge two types of information: customer satisfaction data and expert opinions, such as additional information concerning a particular group of customers.
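The panel structure described above, with observations indexed by customer i and period t and with some cells missing, can be sketched in Python as follows. The toy records and the names `build_panel` and `missing_cells` are hypothetical and serve only to illustrate what "unbalanced" means; they do not reproduce any dataset used in this paper.

```python
# Hypothetical long-format records: (customer i, period t, covariates X_it, outcome Y_it).
records = [
    (1, 1, {"marital_status": "single"}, 1),
    (1, 2, {"marital_status": "single"}, 0),
    (2, 1, {"marital_status": "married"}, 1),
    (2, 2, {"marital_status": "married"}, 1),
    (3, 2, {"marital_status": "single"}, 0),  # customer 3 is not observed at t = 1
]

def build_panel(rows):
    """Index observations by (i, t) so that panel[(i, t)] == (X_it, Y_it)."""
    return {(i, t): (x, y) for i, t, x, y in rows}

def missing_cells(panel):
    """List the (i, t) pairs absent from the panel; an empty list means balanced."""
    customers = sorted({i for i, _ in panel})
    periods = sorted({t for _, t in panel})
    return [(i, t) for i in customers for t in periods if (i, t) not in panel]

panel = build_panel(records)
gaps = missing_cells(panel)  # the cells that make this panel unbalanced
```

Keeping the data in this long format means that a skipped questionnaire simply produces no record, rather than a row of placeholder values.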
We now give some guidelines for deriving LTV. A customer's lifetime value, in our connotation, is the present value of the customer's future cash inflows minus cash outflows. In contrast, customer profitability is accrual-based revenue minus costs over a fixed, usually past, period. Historical data, extracted from operational customer databases, can be used to build predictive models for various temporal outcomes: cancellation of products or services (churn), downgrading, acquiring add-on products or upgrading, product return, and loan prepayment. The occurrence of the target event for the i-th customer is governed by the probability distribution of the time until the event, $T_i$. Customer events might be recorded at discrete increments, such as months, or on a continuous time scale. At the time the data are extracted for analysis, usually not all customers have experienced the event. In this case, the event time is considered (right) censored. Survival analysis is a set of statistical methods designed for censored duration data. The event time distribution is usually characterized by the survival function or the hazard rate.

Let $f(t)$ denote the probability density function of $T$, and let the distribution function be

$$F(t) = P(T \le t) = \int_0^t f(u)\, du.$$

The probability of an individual surviving until time $t$ is given by the survivor function $S(t) = 1 - F(t) = 1 - P(T \le t) = P(T > t)$. We note that $S(t)$ is a monotone decreasing function with $S(0) = 1$ and $S(\infty) = 0$. The hazard function $h(t)$ is the instantaneous rate of failure at time $t$ and is defined by:

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)}.$$

The functions $f(t)$, $F(t)$, $S(t)$ and $h(t)$ give mathematically equivalent specifications of the distribution of $T$. For discrete event times $t_1 < t_2 < \cdots$, the hazard rate is the conditional probability of the event given that it has yet to occur:

$$h(t_j) = P(T = t_j \mid T \ge t_j) = 1 - \frac{S(t_j)}{S(t_{j-1})}.$$
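For discrete event times, and with the survivor function defined as S(t) = P(T > t), the hazard and the survivor function determine each other through S(t_j) = (1 - h(t_1)) ... (1 - h(t_j)). The following Python sketch illustrates this conversion in both directions; the function names are illustrative, and the sketch assumes S(t_0) = 1 before the first period and a strictly positive survivor function when converting back.

```python
def survival_from_hazard(hazards):
    """S(t_j) = prod over l <= j of (1 - h(t_l)), starting from S(t_0) = 1."""
    surv, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h
        surv.append(s)
    return surv

def hazard_from_survival(surv):
    """h(t_j) = 1 - S(t_j) / S(t_{j-1}), taking S(t_0) = 1.

    Assumes every preceding survival value is strictly positive.
    """
    hazards, prev = [], 1.0
    for s in surv:
        hazards.append(1.0 - s / prev)
        prev = s
    return hazards
```

A constant monthly churn hazard of 0.5, for instance, halves the surviving share of customers every month.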

The hazard can be interpreted as an age-specific rate (events per unit time). The survival function decreases monotonically from one to zero. In contrast, the hazard rate can be any nonnegative function. The shape of the hazard rate often gives insight into the underlying system driving the occurrence of an event. Customer databases contain concomitant information that may affect the event time distribution, such as demographics, account balances and payments, and the occurrence of other events such as the acquisition of new products or services. The vector of covariates for the i-th customer is often time-dependent.

There are some financial approaches to estimating customer lifetime value (LTV), based on the discounted cash flow approach of valuing perpetuities (Berger and Nasr, 1998). Given a customer, there are three factors we have to determine in order to calculate LTV: the customer's value over time, $v(t)$ for $t > 0$, where $t$ denotes time; a model describing the customer's churn probability over time; and a discounting factor $D(t)$, which describes how much each euro gained at some future time $t$ is worth now. We can then define $f(t)$ as the customer's instantaneous probability of churn at time $t$. The quantity most commonly modeled is the hazard function $h(t) = f(t)/S(t)$. While $S(t)$ or $h(t)$ has to be estimated (descriptively, following Kaplan and Meier, 1958, or inferentially, following Cox, 1972), $v(t)$ and $D(t)$ are usually known from business knowledge. We can write the explicit formula for a customer's LTV as follows:

$$\mathrm{LTV} = \int_0^\infty S(t)\, v(t)\, D(t)\, dt.$$
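As a rough illustration of how the LTV formula could be evaluated, the Python sketch below estimates S(t) with a basic Kaplan-Meier product-limit estimator and replaces the integral by a sum over periods. Several things here are assumptions made for the example only: the function names, the discretisation into periods t = 1, 2, ..., and the constant per-period discount rate, which gives D(t) = (1 + rate)^(-t).

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of S(t) at each distinct event time.

    durations[i] is the follow-up time of customer i; observed[i] is True if
    the customer churned at that time and False if right-censored.
    Returns a list of (time, S(time)) pairs.
    """
    event_times = sorted({d for d, o in zip(durations, observed) if o})
    curve, s = [], 1.0
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, o in zip(durations, observed) if o and d == t)
        s *= 1.0 - events / at_risk
        curve.append((t, s))
    return curve

def ltv_discrete(survival, value, rate):
    """Approximate LTV by the sum over periods of S(t) * v(t) * D(t).

    Assumption: constant per-period discount rate, D(t) = (1 + rate) ** -t.
    survival[t-1] and value[t-1] hold S(t) and v(t) for t = 1, 2, ...
    """
    return sum(
        s * v / (1.0 + rate) ** t
        for t, (s, v) in enumerate(zip(survival, value), start=1)
    )

# Toy data: four customers, one of them censored at time 2.
curve = kaplan_meier([1, 2, 2, 3], [True, True, False, True])
```

In practice the horizon of the sum, the value stream v(t) and the discount rate would come from business knowledge, as noted above.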

In other words, LTV is the total value to be gained while the customer is still active. The essence of a good lifetime value model is the estimation of $S(t)$ in a reasonable way. There are also some generalisations in a Bayesian framework, see e.g. Figini, 2006.

Now, suppose that for each customer we can derive a particular LTV, and that it is possible to group the customers into $S$ subpopulations on the basis of their LTV (for example with variable ranking), undifferentiated with respect to covariate information. For example, customers can be classified by gender (male, female) and LTV (1 = Very low LTV, …, 4 = Very high LTV). Then there would be $S = 8$ subpopulations. In order to predict customer satisfaction we propose the following models. Suppose that for each customer we know whether the customer is satisfied (with probability $1 - \pi_i$) or not satisfied (with probability $\pi_i$), denoted by a dummy target variable $Y$. Suppose also that the parameter $\pi_i$ is related to the explanatory vector $x_i$ via a regression model $\pi_i = h(x_i, \beta)$, which we extend in our case to $\pi_i = h(x_{iT}, \beta)$. Let $n_s$ be the sample size in group $s$, $s = 1, \ldots, S$, and let $r_s$ be the number in group $s$ with $Y = 1$. The log-likelihood is then obtained from the product of $S$ binomial likelihoods based on these quantities:

$$Q(y; \beta) = \log \prod_{s=1}^{S} \pi_s^{r_s} (1 - \pi_s)^{n_s - r_s}.$$

If the $\pi_s$ are unconstrained, their MLEs are $\hat{\pi}_s = p_s$, where $p_s = r_s / n_s$ is the proportion observed in group $s$, $s = 1, \ldots, S$. Otherwise $Q$ has to be maximized over the regression parameter $\beta$, which usually requires an iterative computation. An alternative method of estimation (due to Berkson, 1953) is based on the empirical logit transform of $p_s$:

$$g_s = \log \frac{p_s}{1 - p_s} = \log \frac{r_s}{n_s - r_s}.$$
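The group proportions and their empirical logits are simple to compute. The following Python sketch shows one possible way, with hypothetical function names and with the groups supplied as counts (r_s, n_s); it makes explicit that the transform is undefined when r_s equals 0 or n_s.

```python
import math

def group_proportions(counts):
    """counts maps group s -> (r_s, n_s); returns the proportions p_s = r_s / n_s."""
    return {s: r / n for s, (r, n) in counts.items()}

def empirical_logit(r, n):
    """g_s = log(r_s / (n_s - r_s)); undefined when r_s is 0 or equals n_s."""
    if r <= 0 or r >= n:
        raise ValueError("empirical logit needs 0 < r < n")
    return math.log(r / (n - r))
```

For a group with 3 of 4 customers having Y = 1, the proportion is 0.75 and the empirical logit is log 3. The values g_s could then be related to group-level covariates by a regression fit.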

From a theoretical viewpoint, each customer can be monitored over $p$ occasions, so we have a $p \times 1$ vector of observations $y_i$, $i = 1, \ldots, n$. There are $2^p$ distinct binary vectors of length $p$, say $b_1, \ldots, b_{2^p}$. Let $\pi_{sk}$ be the probability that $y_i = b_k$ when individual $i$ belongs to subpopulation $s$, so that $\pi_{s+} = 1$ (the subscript $+$ denoting summation over $k$), and let $\pi_s = (\pi_{s1}, \pi_{s2}, \ldots)$ be the $2^p \times 1$ vector of these probabilities. Suppose that there are $n_s$ customers in the sample from subpopulation $s$, of whom $r_{sk}$ yield response profile $y = b_k$, so that $r_{s+} = n_s$. In terms of the $\pi_s$ vectors, the corresponding log-product of multinomial likelihoods is then

$$Q(y; \pi_1, \ldots, \pi_S) = \log \prod_{s=1}^{S} \left\{ n_s! \prod_{k=1}^{2^p} \frac{\pi_{sk}^{r_{sk}}}{r_{sk}!} \right\}.$$
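The counts r_sk and the sample proportions r_sk / n_s that enter this likelihood can be tabulated directly from the observed response profiles. The Python sketch below is one hypothetical way of doing so; the function name and the input layout (a mapping from subpopulation to a list of 0/1 tuples) are assumptions, and every subpopulation is assumed to contain at least one customer.

```python
from collections import Counter
from itertools import product

def profile_proportions(responses, p):
    """Sample proportions r_sk / n_s for every binary profile of length p.

    responses maps subpopulation s -> list of response profiles (tuples of 0/1).
    Assumes each subpopulation has at least one customer.
    """
    profiles = list(product((0, 1), repeat=p))   # the 2**p vectors b_k
    result = {}
    for s, ys in responses.items():
        counts = Counter(tuple(y) for y in ys)   # r_sk for the observed profiles
        n_s = len(ys)                            # r_s+ = n_s
        result[s] = {b: counts[b] / n_s for b in profiles}
    return result

# Toy example: one subpopulation, p = 2 occasions, four customers.
props = profile_proportions({"g": [(0, 1), (0, 1), (1, 1), (0, 0)]}, 2)
```

Because the number of profiles grows as 2^p, many cells will be empty for even moderate p, which is one practical reason for moving from these raw proportions to regression models.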

If the $\pi_s$ are unconstrained, their MLEs are given by the sample proportions: $\hat{\pi}_s = p_s$, with $k$-th component $p_{sk} = r_{sk} / n_s$. For regression modelling of the $\pi_{sk}$, simple logits will not suffice because there are $2^p$ such probabilities per group, instead of just $\pi_{s1}$ and $1 - \pi_{s1}$ as previously. In this situation we can use more detailed models, such as probit analysis (Finney, 1952), logit analysis (Cox, 1970), or the complementary log-log transform of $\pi_{sk}$, called evit analysis. Otherwise we can adopt the weighted least squares methodology (Koch, 1977 and Landis, 1988).

Our available database is not structured in a way that allows us to apply our theoretical proposal. However, we think that this contribution can be used to improve data collection for customer satisfaction.


4 APPLICATION

In this section we describe the data available in order to measure customer satisfaction. We present a statistical analysis of the dataset available (81 questions and 266 customers) on the website http://www.economia.unimi.it/projects/CSProject/based on the first proposed approach. It is not possible to apply the second approach because different type of data are needed (panel customer satisfaction data). The dataanalyzed concerns ABC 2004 Annual Customer Satisfaction Survey. A questionnaire compose by 81 questions must be completed by a company (customer). The compiler is recognised by a Title/Position (1. Owner 2. Management 3. Technical Management 4. Technical Staff 5. Operator 6. Administrator 7. Other, please specify). The customer that compile the questionnaire select a number indicating the extent of his agreement with the statement concerning the experience with ABC during 2004. Then, under “Importance Level”, the customer select another number indicating the importance of the statement. If a certain statement is not relevant or not applicable, the customer select N/A. After completion, clicking "SUBMIT SURVEY" will ensure the questionnaire is sent to KPA Ltd., the independent consulting firm conducting this survey. The first part of the questionnaire concerns “Overall satisfaction” evaluated by a score starting from 1 (very low satisfaction) to 5 (very high satisfaction). There is also a binary variable that can be derived concerning if the ABC is your best supplier. The questionnaire presents also two questions concerning the preference for the ABC company measured by a score starting from 1 (very unlikely) to 5 (very likely). Then we can read a set of questions concerning Equipment, Sales Support, Technical Support, Training, Supplies and Media, Pre-Press/Workflow and Post Press Solutions, Customer Portal (My ABC), Administrative Support, Terms, Conditions, and Pricing, Site Planning and Installation, Overall Satisfaction with Other ABC. 
This second part of the questionnaire is measured by an evaluation score from 1 (Strongly disagree) to 5 (Strongly agree), accompanied by an importance level (low = 1, medium = 2, high = 3, or N/A). For each customer we also know the geographical location. After pre-processing based on descriptive analysis (frequencies), we obtain a data set of 240 customers and 67 questionnaire variables: we have deleted questions 68 to 81. There is also a variable for customer seniority (in years) and a variable for country location. We report here a short summary of the exploratory analysis. Concerning overall satisfaction with ABC, only 91 customers consider ABC their best supplier. Question 11, which measures overall satisfaction with the equipment, shows the following results:

< please insert Figure 1 here >

As we can observe, 54% of the customers are highly satisfied and 28% are moderately satisfied. Figure 2 shows the overall satisfaction with sales support.

< please insert Figure 2 here >

In this case only a small part of the customers (33) are very highly satisfied. Concerning technical support, the results are good in terms of satisfaction: 99 customers are highly satisfied and 68 are very highly satisfied. The overall satisfaction with ABC's supplies and media is medium, and it is very high for the workflow solutions. For overall satisfaction with overall solutions and with the customer portal (My ABC) there is a very large frequency of non-answers (70 + 49). We also note a high overall satisfaction with the administrative support and a medium level of satisfaction with terms, conditions and pricing. Concerning overall satisfaction with site planning and installation, we register a large number of non-answers (200). Finally, regarding customer seniority and country location, the results are shown in Figure 3 and Figure 4.

< please insert Figure 3 here >< please insert Figure 4 here >

As we observe, 44% of the customers are located in Germany. Only a small part is located in Israel and Italy. In terms of customer seniority, there is a low percentage of customers with 4 years of relationship (13.33%), but a fairly high percentage of both old and young customers. This is an interesting point: it would be useful to understand the transitions of customers between the years. This is a principal motivation for the theoretical analysis of panel satisfaction data that we have proposed in Section 5.2.

We could extend the descriptive analysis to measure possible associations between the variables. However, we prefer to report here the results based on discrete graphical models. As target variable we use a binary variable that reports whether a customer is satisfied (Y=0) or not satisfied (Y=1).

We now consider the problem of finding a good model when little or no prior knowledge about the independence relations between the variables is available. This is known as the model selection problem. We discuss two approaches to model selection: one based on significance testing and one based on optimizing a model quality criterion. In both cases we use stepwise selection procedures, that is, incremental search procedures: starting from an initial model, edges are successively added or removed until some criterion is fulfilled. This section is largely based on the book of Edwards (2000).

In the first approach, the inclusion or exclusion of eligible edges is decided at each step using significance tests. The eligible edges are tested for removal using $\chi^2$ tests based on the deviance difference between successive models. The edge whose $\chi^2$ test has the largest non-significant p-value is removed. If all p-values are significant (i.e., all $p < \alpha$, where $\alpha$ is the critical level), no edge is removed and the procedure stops.

In our application we also adopt stepwise model selection using quality measures. Akaike's Information Criterion assigns quality $\mathrm{AIC}(M)$ to model $M$ as follows: $\mathrm{AIC}(M) = \mathrm{dev}(M) + 2p$, where $p$ is the number of parameters of the model. This quality measure consists of two components: the lack of fit of the model, as measured by the deviance, and the complexity of the model, as measured by the number of parameters (i.e. the number of u-terms not constrained to be equal to zero). Since we are trying to minimize AIC, at each step we select the edge whose removal leads to the largest reduction in AIC. In the program output AIC is defined somewhat differently, namely $\mathrm{AIC}(M) = -2L_M + 2p$, where $L_M$ is the value of the log-likelihood function evaluated at $\hat{p}_M$, the ML estimates under $M$. It is easy to see that this comes down to the same thing, since $\mathrm{dev}(M) = 2(L_{sat} - L_M) = 2L_{sat} - 2L_M$, where $L_{sat}$ is the value of the log-likelihood function of the saturated model evaluated at its maximum. The two definitions therefore differ only by the term $2L_{sat}$, which does not depend on $M$, so they rank models identically.

Concerning the discrete graphical models, we analyse different sets of variables. First we consider as target variable the overall satisfaction level with ABC, with label a. To explain this variable we use the following questions:

1) Overall satisfaction level with ABC’s improvements during 2004, label b;
2) Is ABC your best supplier?, label c;
3) Would you recommend ABC to other companies?, label d;
4) If you were in the market to buy a PRODUCT, how likely would it be for you to purchase an ABC product again?, label e.
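The significance-based removal step described above can be illustrated on a toy example. The following is only a sketch under stated assumptions: three binary variables (a, b, c), hypothetical counts that are not the ABC survey data, and a single candidate edge b-c tested against the saturated model. The function names (`chi2_sf`, `margin`, `deviance_without_bc`) are introduced here for illustration and do not come from the software used in the paper. For this particular reduced model the fitted counts have a closed form, so no iterative fitting is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    z = x / 2.0
    if df % 2 == 0:
        q, a = math.exp(-z), 1.0              # Q(1, z) = exp(-z)
    else:
        q, a = math.erfc(math.sqrt(z)), 0.5   # Q(1/2, z) = erfc(sqrt(z))
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def margin(table, keep):
    """Sum a table {(i, j, k): count} over the axes not listed in keep."""
    out = {}
    for cell, n in table.items():
        key = tuple(cell[axis] for axis in keep)
        out[key] = out.get(key, 0) + n
    return out

def deviance_without_bc(table):
    """Deviance of the model with cliques ab and ac (edge b-c removed),
    measured against the saturated model: 2 * sum of n * log(n / m)."""
    n_ab = margin(table, (0, 1))
    n_ac = margin(table, (0, 2))
    n_a = margin(table, (0,))
    g2 = 0.0
    for (i, j, k), n in table.items():
        m = n_ab[(i, j)] * n_ac[(i, k)] / n_a[(i,)]  # closed-form fitted count
        if n > 0:
            g2 += 2.0 * n * math.log(n / m)
    return g2

# Hypothetical 2x2x2 counts indexed by (a, b, c); NOT the ABC survey data.
toy = {
    (0, 0, 0): 30, (0, 0, 1): 10, (0, 1, 0): 15, (0, 1, 1): 25,
    (1, 0, 0): 20, (1, 0, 1): 20, (1, 1, 0): 10, (1, 1, 1): 30,
}

ALPHA = 0.05                      # critical level (an assumed value)
dev = deviance_without_bc(toy)
df = 2                            # u-terms dropped for binary variables: bc and abc
p_value = chi2_sf(dev, df)
remove_edge = p_value >= ALPHA    # an edge is removed only if the test is non-significant
print(f"deviance difference = {dev:.3f}, p-value = {p_value:.5f}, remove b-c: {remove_edge}")
```

In a full backward procedure the same test would be repeated for every eligible edge, and the edge with the largest non-significant p-value would be dropped before the next step.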

< please insert Figure 5 here >

Figure 5 presents the results. As we can observe, the variable for question 4 (label e) is not related to the other variables. We observe relationships among the other variables, in particular within the groups labelled “abc” and “abd”; finally, c ⊥ d. We implement these graphical models in two ways. First we build a model based on backward feature selection. We then compare the results with a graphical model obtained by AIC minimisation, based on stepwise feature selection. We recall that the best model will minimize the deviance and consequently the AIC. For the model presented in Figure 5 we report the results in Table 1.
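To make the AIC comparison concrete, the sketch below computes AIC in both forms used in this section, dev(M) + 2p and −2L_M + 2p, for two models on a toy table. Everything in it is an assumption made for illustration: the 2x2x2 counts are hypothetical (not the ABC data), the log-likelihood is the Poisson kernel, and p counts the non-zero u-terms for binary variables including the constant (8 for the saturated model, 6 for the model with cliques ab and ac).

```python
import math

def margin(table, keep):
    """Sum a table {(i, j, k): count} over the axes not listed in keep."""
    out = {}
    for cell, n in table.items():
        key = tuple(cell[axis] for axis in keep)
        out[key] = out.get(key, 0) + n
    return out

def fit_ab_ac(table):
    """Closed-form fitted counts for the model with cliques ab and ac."""
    n_ab = margin(table, (0, 1))
    n_ac = margin(table, (0, 2))
    n_a = margin(table, (0,))
    return {(i, j, k): n_ab[(i, j)] * n_ac[(i, k)] / n_a[(i,)]
            for (i, j, k) in table}

def loglik(table, fitted):
    """Poisson log-likelihood kernel: sum over cells of n * log(m) - m."""
    return sum(n * math.log(fitted[cell]) - fitted[cell]
               for cell, n in table.items())

def aic_both(table, fitted, p):
    """Return (dev(M) + 2p, -2 * L_M + 2p) for the model with the given fit."""
    l_sat = loglik(table, table)   # the saturated model reproduces the counts
    l_m = loglik(table, fitted)
    dev = 2.0 * (l_sat - l_m)
    return dev + 2 * p, -2.0 * l_m + 2 * p

# Hypothetical 2x2x2 counts indexed by (a, b, c); NOT the ABC survey data.
toy = {
    (0, 0, 0): 30, (0, 0, 1): 10, (0, 1, 0): 15, (0, 1, 1): 25,
    (1, 0, 0): 20, (1, 0, 1): 20, (1, 1, 0): 10, (1, 1, 1): 30,
}

aic_dev_sat, aic_ll_sat = aic_both(toy, toy, p=8)
aic_dev_red, aic_ll_red = aic_both(toy, fit_ab_ac(toy), p=6)

print(f"saturated: dev-based AIC = {aic_dev_sat:.3f}, loglik-based AIC = {aic_ll_sat:.3f}")
print(f"ab + ac:   dev-based AIC = {aic_dev_red:.3f}, loglik-based AIC = {aic_ll_red:.3f}")
# The gap between the two models is the same under either definition.
print(f"gap (dev) = {aic_dev_red - aic_dev_sat:.3f}, gap (loglik) = {aic_ll_red - aic_ll_sat:.3f}")
```

Because the two definitions differ by a constant that is the same for every model, either one can be used to rank candidate models within a stepwise search.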

< please insert Table 1 here >

We prefer the model based on stepwise selection. Then, in order to explain question 1), labelled (a), we adopt a different set of covariates: Overall satisfaction level with the equipment (b); Overall satisfaction level with sales support (c); Overall satisfaction with technical support (d); Overall satisfaction level with ABC training (e); and Overall satisfaction level with ABC's supplies and media (f). The result is presented in Figure 6.

< please insert Figure 6 here >

We observe a set of dependencies, for example “ac”, “ad”, “ae”, “af”, “bf” and “ab”. Table 2 shows the results for the two types of feature selection. We prefer the graphical model based on stepwise selection.

< please insert Table 2 here >

Concerning the measurement of the overall satisfaction level with ABC, the graph becomes more complicated when we insert variables concerning: Overall satisfaction level with workflow solutions (a); Overall satisfaction with overall solutions for various problems (b); Overall satisfaction level with the customer portal (c); Overall satisfaction level with administrative support (d); Overall satisfaction with terms, conditions, and pricing (e); Overall satisfaction level with site planning and installation (f).

< please insert Figure 7 here >

As we can observe from Figure 7 (AIC=522677), there is a large set of dependencies. The model is complicated, as is the one in Figure 8 (AIC=140901). Figure 7 and Figure 8 are based on stepwise selection and AIC minimisation.

< please insert Figure 8 here >

We now change the target variable and choose as objective variable the following question: Is ABC your best supplier? As covariates, in the first instance, we include: Overall satisfaction level with ABC (a); Overall satisfaction level with ABC’s improvements during 2004 (b); Would you recommend ABC to other companies? (c); If you were in the market to buy a PRODUCT, how likely would it be for you to purchase an ABC product again? (d).

< please insert Figure 9 here >

In Table 3 we report the deviance results for the model presented in Figure 9. In this case we prefer backward elimination. Figure 10 shows the best graphical model based on other variables selected by stepwise feature selection, with a deviance of 1055.98. Incorporating further variables, we obtain graphical models that are very complicated and not stable.

We remark that with this software it is problematic to enlarge the dimensionality of the problem, for example by including all the covariates in a one-step selection. In order to obtain simple and clear results, we have built different sets of variables. For more detailed results, see e.g. Figini and Giudici (2006).
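A back-of-the-envelope count suggests why enlarging the set of variables quickly becomes difficult. This is an illustration rather than a statement about the specific software used: with k variables there are k(k−1)/2 possible edges, hence 2^(k(k−1)/2) candidate undirected graphs, and a contingency table of five-level items has 5^k cells. The sketch assumes that every item uses the five-point scale and ignores the N/A category; the function names are hypothetical.

```python
def n_possible_edges(k):
    """Number of distinct undirected edges among k variables."""
    return k * (k - 1) // 2

def n_undirected_graphs(k):
    """Number of undirected graphs on k labelled variables."""
    return 2 ** n_possible_edges(k)

def n_table_cells(k, levels=5):
    """Cells of the full contingency table when each variable has `levels` categories."""
    return levels ** k

N_CUSTOMERS = 240  # sample size after pre-processing, as reported in this section

for k in (3, 5, 6, 10):
    cells = n_table_cells(k)
    print(f"k={k:2d}  edges={n_possible_edges(k):3d}  "
          f"graphs={n_undirected_graphs(k)}  cells={cells}  "
          f"customers per cell={N_CUSTOMERS / cells:.5f}")
```

With six variables, as in the sets labelled a to f above, the table already has far more cells than customers, which is consistent with the choice of analysing small sets of variables separately.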

5 CONCLUSIONS AND FURTHER RESEARCH

In this paper we have presented two novel approaches to analyzing customer satisfaction data. First, the results based on discrete graphical models can be improved by analysing the associations between all discrete variables. One option is CoCo, a program for estimation, testing and model search among hierarchical interaction models for large complete contingency tables. Using graph-theoretical methods, the hierarchical log-linear interaction models are decomposed. Another package is DIGRAM, a PC-based program written by Svend Kreiner of the Department of Sociology, which analyzes recursive graphical models for discrete data. The program is interactive and allows for both asymptotic and exact conditional inference. A much more recent version of this program is now available from Kreiner at http://www.biostat.ku.dk/~skm/skm/index.html. Further information on the use of the program is available from Svend Kreiner at http://www.biostat.ku.dk/staff/skm-e.htm

Second, in order to use our theoretical proposal, we suggest collecting data for panel analysis and incorporating expert opinion (for example measured by LTV) in customer satisfaction measurement, following the theoretical guide reported in Section 5.2.

Figure 1: Overall satisfaction with the equipment (Very low 2%, Low 7%, Medium 28%, High 54%, Very high 6%, No answer 3%)

Figure 2: Overall satisfaction with the sales support (Very low 8%, Low 15%, Medium 28%, High 29%, Very high 14%, No answer 6%)

Figure 3: Customer seniority (percentage of customers by years: 1 year 17.92%, 2 years 24.58%, 3 years 17.08%, 4 years 13.33%, 5 years 27.08%)

Figure 4: Customer localization (Benelux 10%, France 5%, Germany 44%, Israel 9%, Italy 14%, UK 18%)

Figure 5: Graphical models

Figure 6: Graphical models

Figure 7: Graphical models

Figure 8: Graphical models

Figure 9: Graphical models

Figure 10: Graphical models

Feature selection    Deviance
Backward             725.3942
Stepwise             728.0448

Table 1. Model Selection

Feature selection    Deviance
Backward             1073.6251
Stepwise             1088.8662

Table 2. Model Selection

Feature selection    Deviance
Backward             640.28
Stepwise             862.34

Table 3. Model Selection

References

[1] Berger P.D. and Nasr N.I. (1998). Customer Lifetime Value: Marketing Models and Applications, Journal of Interactive Marketing, 12(1), 17-30.

[2] Berkson J. (1953). A statistically precise and relatively simple method of estimating the bio-assay with quantal response, based on the logistic function. Journal of the American Statistical Association, 48, 565-599.

[3] Bishop Y., Fienberg S.E., and Holland P.W. (1975). Discrete Multivariate Analysis. MIT Press, Cambridge (MA).

[4] Bob E. Hayes, (1998). Measuring Customer Satisfaction: Survey Design, Use, and Statistical Analysis Methods, 2nd Edition, ASQ Quality Press, Milwaukee, Wisconsin.

[5] Cassel C.M.(2000). Measuring customer satisfaction on a national level using a superpopulation approach. Total quality management. 11:7, 909-915.

[6] Cassel C.M. and Eklöf J.(2001). Modelling customer satisfaction and loyalty on aggregate levels: Experience from the ECSI pilot study. Total quality management. 12:7&8, 834-841.

[7] Cassel, C., Eklöf, J., Hallissey, A., Letsios, A. and Selivanova I. (2002). The EPSI Rating Initiative. European Quality 9:2, 10-25.

[8] Cox D.R. (1970). Analysis of binary data. Methuen, London.

[9] Christensen R. (1997). Log-Linear Models and Logistic Regression (second edition). Springer, New York.

[10] Cox D.R. (1972). Regression models and life tables. Journal of the Royal Statistical Society, Series B, 34, 187-220.

[11] Gause D.C. and Weinberg G.M. (1989). Exploring Requirements: Quality Before Design, Dorset House Publishing, New York, New York.

[12] Edwards D. (2000). Introduction to Graphical Modelling (second edition). Springer, New York.

[13] Figini S. (2006). Bayesian Variable and Model Selection for Customer Lifetime Value, PhD dissertation.

[14] Figini S. Giudici P. (2006). Customer Satisfaction: an application, Technical Reports (under revision).

[15] Finney D.J. (1952). Probit analysis, 2nd edn. Cambridge University Press, Cambridge.

[16] Giudici P. (2003). Applied data mining, Wiley.

[17] Kaplan E.L. and Meier P. (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53, 457-481.

[18] Koch, G.G., Landis, J.R., Freeman, J.L., Freeman, D.H. and Lehnen, R.G. (1977). A general methodology for the analysis of experiments with repeated measurements of categorical data. Biometrics 33, 133-158.

[19] Koch, G.G, Landis, J.R., Miller, M.E., Davis, C.S. (1988). Some general methods for the analysis of categorical data in longitudinal studies. Statistics in Medicine 7, 109-137.

[20] Lundström, S. and Särndal, C.E. (2005). Estimation in Surveys with Nonresponse. John Wiley and Sons, Ltd.

[21] Norman Fenton, Robin Whitty & Yoshinori Iizuka, (1995). Software Quality Assurance and Measurement: A Worldwide Perspective, International Thomson Computer Press, London, England.

[22] Robert Grady (1992). Practical Software Metrics for Project Management and Process Improvement, PTR Prentice Hall, Englewood Cliffs, New Jersey.

[23] Schafer J.L. (1997). Analysis of Incomplete Multivariate Data. Chapman & Hall, London.

[24] Siskos Y.; Grigoroudis E.; Zopounidis C.; Saurais O., (1998). Measuring Customer Satisfaction Using a Collective Preference Disaggregation Model, Journal of Global Optimization 12, 175-195.

[25] Terry G. Vavra, (1997). Improving Your Measurement of Customer Satisfaction: A Guide to Creating, Conducting, Analyzing, and Reporting Customer Satisfaction Measurement, ASQ Quality Press, Milwaukee, Wisconsin.

[26] Whittaker J. (1990). Graphical Models in Applied Multivariate Statistics. Wiley, Chichester.

Authors’ Biographies:

Silvia Figini (BS in Economics, University of Pavia, 2001; PhD in Statistics, Bocconi University, with a dissertation on feature selection methods for customer survival). At the University of Pavia she is a research assistant for business statistics and data mining under the supervision of Prof. Paolo Giudici. Before joining the PhD program she worked for two years at the competence centre for data mining analysis and business intelligence at SAS Milan. She is currently a member of the Italian Statistical Society.

Research interests: Feature selection, Survival Analysis for lifetime models, Bayesian statistics, Computational Statistics.

Paolo Giudici (BS in Economics, Bocconi University, 1989; MSc in Statistics, University of Minnesota, 1990; PhD in Statistics, University of Trento, 1993) is Full Professor of statistics at the University of Pavia. At the University of Pavia he lectures on data analysis, business statistics, data mining, and risk management (at Borromeo College). He is also director of the data mining laboratory (www.datamininglab.it), a member of the University Assessment Board (www.unipv.it/nuv), and a coordinator of the Institute of Advanced Studies school on “Methods for the management of complex systems” (www.unipv.it/complexity). He is the author of 77 publications, among which two research books and 32 papers in Science Citation Index journals. He has spent several research periods abroad, in particular at the University of Bristol, the University of Cambridge and the Fields Institute for research in the mathematical sciences (Toronto). He is responsible for the risk management interest group of the European network for business and industrial statistics (www.enbis.org). He is also a member of the Italian Statistical Society, the Italian Association for Financial Risk management (www.aifirm.com) and the Royal Statistical Society.

Research interests: Bayesian statistics, Business Data mining, Computational Statistics, Risk measurement.

Current research projects: Data mining models for e-learning; Operational risk measurement and management; Statistical models for customer relationship management; Text mining models to detect anomalous behaviours.