MEASURING AND COMPARING THE EFFECTIVENESS OF E-COMMERCE WEBSITE DESIGNS

Jungpil Hahn
Doctoral Candidate

Robert J. Kauffman
Co-Director, MIS Research Center; Professor and Chair

Information and Decision Sciences
Carlson School of Management
University of Minnesota
Minneapolis, MN 55455
Email: {jhahn, rkauffman}@csom.umn.edu

Last revised: January 20, 2003

_____________________________________________________________________________________

ABSTRACT

The assessment of the effectiveness of e-commerce websites is of critical importance to online retailers. However, current techniques for evaluating effectiveness are limited in that they do not allow for formal empirical measurement of the productivity and performance of the website. In this paper, we use the theoretical perspective of production economics to measure the performance of Internet-based selling websites. We model the website as a production system in which customers consume inputs (i.e., use various functionalities of the website) to produce an output (i.e., a basket full of items at checkout). With this basic perspective, we propose an analysis methodology for measuring and comparing website efficiency and attributing observed inefficiency to customer inefficiency or website design inefficiency. The application of the proposed evaluation methodology to a currently operational e-commerce website demonstrates the value of our technique.

_____________________________________________________________________________________

KEYWORDS: Business value, customer efficiency, data envelopment analysis, design efficiency, economic analysis, electronic commerce, e-tailing, productivity, website design.

_____________________________________________________________________________________

Acknowledgments: An earlier version of this paper was presented at the 2002 Workshop on Information Systems and Economics, Barcelona, Spain. The authors would like to thank Gordon Davis, Alok Gupta, Joe Konstan and Jinsoo Park, who provided useful feedback at an earlier stage in the development of this work. We also thank Rajiv Banker, Indranil Bardhan, Mayuram Krishnan, Sandra Slaughter and other participants at the 2002 Workshop on Information Systems and Economics for helpful suggestions.




INTRODUCTION

Since the crash of the DotComs in the American stock market in April and May of 2000, the

evaluation of e-commerce websites in terms of business value has become increasingly important

(Varianini and Vaturi, 2000). No longer are venture capital firms willing to make portfolio investments in

e-commerce properties that only present potential future return on investment (ROI) opportunities.

Instead, they seek e-commerce firms with more immediate opportunities, especially firms that can

demonstrate a well-developed discipline for evaluating investments in web-related software development

and e-commerce business models. This way, the risks and uncertainties of investing in this emerging

market are better balanced with the rewards. During the first phase of e-commerce, the goal for most

companies was to secure a share of the virtual market space through an online presence by attracting as

many visitors as possible to their website. However, the industry has progressed to the second phase of e-

commerce. As e-commerce begins to mature, the ability to retain customers and conduct online

operations justified by ROI is the only way an e-business can survive (Agrawal, Arjona, and Lemmens,

2001; Straub and Watson, 2001; Chen and Hitt, 2002).

Recent industry analyses, however, point out that e-commerce retailers are earning low scores on

ROI, by failing to meet consumers’ purchase needs with the poor usability and errant designs of their

web-based storefronts (Souza, Manning, Sonderegger, Roshan, and Dorsey, 2001; Anderson, 2002). For

example, a study by Zona Research reported that 60% of web-savvy users dropped out of the purchasing

process because they could not find the products in the online retailers’ websites (Zona Research, 1999).

Another study conducted by A.T. Kearney showed that 80% of experienced online shoppers gave up

shopping on e-commerce websites due to problems they encountered while interacting with the website

(Rizzuti and Dickinson, 2000). Yet another study conducted by Creative Good showed that 43% of

purchase attempts ended in failure due to poor usability of the websites (Rehman, 2000). This shortfall in

realized value compared to the potential value that web-based selling approaches offer is dramatic. The

Creative Good study points out that this level of failed purchase attempts is consistent with an estimated

loss of $14 billion in sales for e-commerce retailers in the 2000 Christmas-New Year’s holiday shopping

season alone.

Recent academic research reinforces the picture that emerges. Almost all of the papers in the two

recent guest-edited issues of INFORMS Information Systems Research on e-commerce metrics included

metrics related to the design and usability of e-commerce websites (Straub, Hoffman, Weber, and

Steinfield, 2002b, 2002a). Apparently the quality of the online customer experience that effectively-

designed websites create not only has a positive effect on the financial performance of a firm, but also

possesses the potential to create unique and sustainable competitive advantage for Internet-based sellers

and other e-commerce firms (Rajgopal, Venkatachalam, and Kotha, 2001).


Developing, launching and maintaining an e-commerce website entails a significant investment for e-

commerce firms. Simple e-commerce websites can cost $1-2 million per year for setup and maintenance,

whereas more sophisticated websites with dynamic capabilities require annual investments up to $52

million (Rizzuti and Dickinson, 2000; Dalton, Hagen, and Drohan, 2001). Despite the importance of

website development requiring such significant investments, the process of designing high quality

websites for e-commerce is still more of an art than a science. E-commerce companies still rely largely

on intuition when it comes to designing their websites (Hahn, Kauffman, and Park, 2002). To make

matters worse, design changes and their impacts are not tracked, making it impossible to measure the

benefits of website design (Wallach, 2001). This situation brings to the foreground the importance of

value-driven evaluation and design of e-commerce websites. However, e-businesses are facing

difficulties due to the lack of proven tools and methods for accomplishing this.

According to Ivory and Hearst (2001), traditional approaches to e-commerce website evaluation fall

into three major categories:

- user testing, where users are asked to perform representative tasks with a given website and problems are determined based on the range of observed user interactions (e.g., Spool, Scanlon, Schroeder, Snyder, and DeAngelo, 1999; Rizzuti and Dickinson, 2000);

- inspection, where domain experts use a set of criteria (e.g., web usability heuristics, such as those suggested by Nielsen (1994)) to identify potential usability problems in the website design (e.g., Nielsen and Mack, 1994); and,

- inquiry, where users provide feedback on the website via interviews, surveys, participation in focus groups, etc. (e.g., Schubert and Selz, 1999).

These methods have been adopted from the field of user interface evaluation (UIE) within the

broader field of human-computer interaction (HCI). However, even though these approaches have been

successfully applied for the evaluation of user interfaces of traditional IS applications, they are not

perfectly suited for web-based e-commerce applications. For example, websites are very frequently

updated and redesigned, which makes the recurring cost of recruiting test users, experts or survey

respondents for the evaluation of each redesign prohibitively expensive for most organizations with limited labor

and capital resources. It is also important to emphasize that users of web-based applications are most

often customers, which is atypical of traditional IS applications developed for use by employees within a

firm. As a result, greater constraints are placed on what a designer/developer must do to create a

desirable setting for system use by a user/customer since training is not a viable option.

The purpose of this paper is to present a methodology for assessing the effectiveness of e-commerce

website design. Our proposed approach to e-commerce website evaluation is not for comparative

evaluation of websites of different companies (e.g., testing whether the Amazon.com website is more


effective than the competing BN.com website). Rather, our approach is intended for use within a firm for

assessing the effectiveness of one’s own website or comparing the effectiveness of different redesigns of

one’s own website. The intuition behind our proposed methodology is that we can measure (or estimate)

the effectiveness of an e-commerce website by analyzing how well the website enables efficient customer

behaviors that are observable from clickstream data in web server logs.
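As an illustration of the kind of observable behavior we mean, per-session input counts can be tallied directly from server log entries. The sketch below assumes a simplified, hypothetical log format (session id, method, URL) and made-up URL patterns; it is not the actual log format analyzed in this paper:

```python
import re
from collections import Counter, defaultdict

# Hypothetical clickstream sketch: tally candidate "inputs" (product page
# views, searches, cart additions) per shopping session. The log format and
# URL patterns below are illustrative assumptions only.
LOG_LINES = [
    "s1 GET /products/item42",
    "s1 GET /search?q=milk",
    "s1 GET /cart/add/item42",
    "s2 GET /products/item7",
    "s2 GET /products/item9",
]

PATTERNS = {
    "product_views": re.compile(r"^/products/"),
    "searches": re.compile(r"^/search"),
    "cart_adds": re.compile(r"^/cart/add/"),
}

def session_inputs(lines):
    """Count candidate production inputs for each shopping session."""
    counts = defaultdict(Counter)
    for line in lines:
        session, _, url = line.split(maxsplit=2)
        for name, pattern in PATTERNS.items():
            if pattern.search(url):
                counts[session][name] += 1
    return counts

counts = session_inputs(LOG_LINES)
```

In practice such counts would be joined with checkout records (the outputs) before any efficiency estimation.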

The paper is organized as follows. We review the relevant literature to provide a theoretical

foundation for modeling the effectiveness of e-commerce websites in terms of production economics. We

propose a new model of e-commerce website effectiveness that views the e-commerce website as a

production environment where effectiveness can be characterized in terms of customer transaction

productivity. We also discuss the methods for estimating the efficiency of website designs in the

analytical context of data envelopment analysis (DEA) (Charnes, Cooper, and Rhodes, 1978, 1981;

Banker, Charnes, and Cooper, 1984). Next, we illustrate the value of our proposed methodology by

applying it in the empirical evaluation of the effectiveness of website design at an Internet retailer of

groceries. We also discuss our methodology in depth prior to presenting the results of the empirical

analysis. We conclude with discussions, implications of our results and directions for future research.

ONLINE SHOPPING AS AN ECONOMIC PRODUCTION PROCESS

We conceptualize online shopping as an economic production process in which customers make use

of the e-commerce website in producing an economic transaction. We set the stage for this

conceptualization by reviewing literature that provides a foundation for characterizing e-commerce

websites as self-service technologies (SSTs, also called self-service production technologies). We then

discuss the basic concepts in production economics to offer a basis for measuring the effectiveness of e-

commerce websites. Finally, we present our model of online shopping as an economic production

process, as well as our approach for evaluating the effectiveness of e-commerce website designs.

Service Production and Self-Service Technologies (SSTs)

Early research in service operations management recognized the importance of customers’

involvement in the service production and delivery process as a source for increasing a service firm’s

productivity (Chase, 1978; Lovelock and Young, 1979). Given that the presence of the customer (or at

least her input) is generally required in the service delivery process, customers have also been regarded as

partial employees of the service firm (Mills and Morris, 1986). This perspective of customer co-

production is especially relevant when the service encounter involves the use of SSTs in the service

production and delivery (e.g., automated teller machines (ATMs) for banking transactions, e-ticket kiosks

for airline check-in, e-commerce websites for online shopping, etc.) since customers are actually

performing the necessary tasks that a paid employee of the service firm would otherwise execute.


In a similar vein, the service marketing literature identifies employees’ performance in a service

delivery system as a vital factor affecting the service firm’s productivity and service quality (Zeithaml,

Parasuraman, and Berry, 1990). Since customers are co-producers of the service, the customers’

efficiency and productivity also become important precursors to high quality service. The concept of

customer efficiency has been defined and investigated in prior research by Xue and Harker (2002), who

propose that customer efficiency consists of transaction efficiency (e.g., more efficient transactions) and

value efficiency (e.g., more frequent transactions). The authors also refer to quality efficiency for

services when the major content of the service product is provided by peer customers. Transaction

efficiency creates value for the firm through cost savings whereas value efficiency creates value through

increased volume. This focus on labor efficiency has strategic implications. Given the lack of training

opportunities for customers, it becomes difficult to increase the productivity of the customers. Hence, the

strategic implication is that a service firm should identify and select more efficient customers in order to

increase productivity and profitability. This perspective is consistent with profitability-based customer

segmentation: it has been suggested that firms should identify and serve only profitable customers

(Brooks, 1999; Zeithaml, Rust, and Lemon, 2001).

However, the narrow focus on customer efficiency does not paint the whole picture of service

production efficiency, especially in service delivery environments with technology-based service production.

The emphasis on customer efficiency (i.e., employee performance) was primarily due to the fact that,

traditionally, services were labor-intensive processes that involved the co-presence of the employee and

customer. This efficiency perspective needs to be extended given the rise of SSTs. Productivity increases

can be a result of not only improving the quality of the labor force (i.e., increasing employee and

customer efficiency—by making the employees or the self-served customers more efficient), but also by

investing in more efficient capital equipment (i.e., increasing technological efficiency—making SSTs

more efficient). This perspective on technological efficiency has a very different strategic implication.

Identifying efficient and profitable customers should no longer be the main focus or the only focus; rather,

the effective design of the technology so that even inefficient customers can be more efficient should

move to the foreground.

E-commerce websites, especially transactional web-based applications for Internet-based selling, can

be viewed as SSTs (Meuter, Ostrom, Roundtree, and Bitner, 2000). The design of SSTs has significant

impact on the adoption of the channel as well as the quality of the service production and delivery process.

It has been shown that the adoption of SSTs is sensitive to their design and the degree of customer contact

(Walley and Amin, 1994). In other words, low-contact services (i.e., services that require only the

presence of the customer) can deal with highly complex operational services, whereas high-contact

services (i.e., services in which the customer is the direct producer) have typically employed technologies


for low operational complexity (e.g., ATMs). In case of complex services that require a high degree of

customer contact, ease of use of the self-service technology becomes critically important due to the lack

of training opportunities that can be provided to the customer (Chase and Tansik, 1984).

Production Economics and the General Model of a Production Frontier

We model the customer-website interaction as an economic production process (Kriebel and Raviv,

1980). An economic production process defines the technical means by which inputs (e.g., materials and

resources) are converted into outputs (e.g., goods and services). This technical relationship is represented

by the production function, which articulates the maximum level of outputs produced for each given level

of inputs (i.e., the efficient frontier or the “best practice” production frontier). Deviations from the

production frontier reflect inefficiencies in individual observations (Aigner and Chu, 1968).

Another important concept in production economics is returns to scale, the relative increase in

outputs as all inputs are increased proportionately without changing the relative factor mix (Varian,

1992). A production process is said to exhibit constant returns to scale if, when all inputs are

proportionately increased by k, the outputs also increase by k. The production process exhibits increasing

returns to scale if outputs increase by a proportion greater than k and decreasing returns to scale if

outputs increase by a proportion smaller than k.
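As a minimal numeric illustration of these definitions, consider a Cobb-Douglas technology (an assumed functional form for illustration only, not the paper's model), where scaling all inputs by k scales output by k^(a+b):

```python
# Illustrative returns-to-scale check for an assumed Cobb-Douglas technology
# y = x1^a * x2^b: scaling all inputs by k scales output by k^(a+b), so the
# sum a+b determines whether returns to scale are decreasing (< 1),
# constant (= 1), or increasing (> 1).

def output(x1, x2, a, b):
    return (x1 ** a) * (x2 ** b)

def returns_to_scale(a, b, k=2.0, x1=3.0, x2=5.0):
    """Classify returns to scale by comparing f(k*x) against k*f(x)."""
    scaled = output(k * x1, k * x2, a, b)        # outputs after scaling inputs
    proportional = k * output(x1, x2, a, b)      # proportional increase
    if abs(scaled - proportional) < 1e-9:
        return "constant"
    return "increasing" if scaled > proportional else "decreasing"

print(returns_to_scale(0.5, 0.5))   # a + b = 1.0
print(returns_to_scale(0.7, 0.6))   # a + b = 1.3
print(returns_to_scale(0.3, 0.4))   # a + b = 0.7
```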

A general model of production is given by:

x_k = f(y_l, s_i) + ε ,        (General Model of Production)

where

x_k = input k,
y_l = output l,
s_i = environmental variable i influencing the production process,
ε = deviation from the production frontier.

Figure 1 provides a graphical illustration of the basic logic of our production model. (See Figure 1.)

The production function represents the most efficient production process. All points that lie on the curve

(e.g., the point E) are said to be efficient since there is no deviation from the production frontier (i.e., ε =

0). On the other hand, all observations that lie below the curve (e.g., the point I) are inefficient in the

sense that the same level of output may be achieved with ε less input, ε > 0. ε can take on both positive

and negative values. Inefficiencies in the production process result in negative deviations from the

production frontier (i.e., results in less output produced, indicating the production process is less

efficient). Random effects (e.g., measurement errors, effects of any factors that are not included in the

model, or randomness due to human indeterminacy) may cause both positive and negative deviations.

Assuming a symmetric distribution for the random effects deviations, the amount of downward deviation

from the frontier, on average, will be greater than or equal to the amount of upward deviation.


Figure 1. Production Frontier in Production Economics

[Figure: output (Y) plotted against input (X). The production frontier x = f(y, s) separates regions of increasing, constant and decreasing returns to scale. Point E lies on the frontier (ε = 0); point I lies below it (ε > 0).]

Conceptualizing Website Design Effectiveness

Our view of an e-commerce website as a service production environment enables us to start thinking

about the evaluation of website performance: the ability to transform inputs to outputs. In our context of

online shopping, we conceptualize the inputs as customers’ actions in navigating through the e-commerce

website in order to produce a transaction, in which the output can be regarded as a checkout of a basket of

products. Before proceeding with the specification of the online shopping production model, it is

important to consider the axioms of economic production, to verify their conformance with the online

shopping context that we examine here.

Axioms of Production in Online Shopping. There are three basic assumptions of production

postulated by economic theory (Varian, 1992).

Regularity. The regularity axiom, also called the “no free lunch” assumption, states that the

input requirement set is a closed non-empty set for all output levels y with y ≥ 0. With

regularity, it is not possible to produce something from nothing; at least some non-zero input

needs to be consumed in order to produce an output. In online shopping, this translates into our

assumption that a customer must interact with the site (i.e., non-zero input) in order to produce a

transaction on the website.

Monotonicity. The monotonicity axiom, also called the “free disposability” assumption, states

that for a given production possibility set of outputs and inputs, it is possible to produce the same

level of outputs with another bundle of inputs of greater amount. Thus, it should also be possible

to produce an equal or smaller amount of all outputs by using at least as much of all inputs. In the


online shopping context, our interpretation is that a customer may end up with a transaction of

equal size (i.e., same number of items in cart) by browsing more within the website. For example,

a customer may add an item into the cart when she first visits the item’s product page (i.e., an

input), or on first visit to that page think about buying that item only to decide to add it to her cart

later on. However, she would have to revisit the item’s product page in order to add this item to

her cart (i.e., an additional unit of input). In other words, a customer may view product pages without adding the items to her cart (i.e., freely disposable input).

Convexity. The convexity axiom states that the input requirements set is a convex set. In other

words, given any two input bundles in the input requirements set, any convex combination of

those two input bundles is also in the input requirements set and, hence, can produce the same

level of output. This assumption is also satisfied in the online shopping context where the

customer can end up with the same contents in the shopping cart by using different functionalities

of the e-commerce website. For example, during one transaction, she may add a particular item

to her cart from the product page within the product category hierarchy, and for another

transaction, the same items can be added to the cart from the specials and promotions section

within the website.

The Online Shopping Production Model. Consistent with the axioms of economic production, we

conceptualize the online shopping production model as:

x_kj = f(y_lj, s_ij) + ε_j ,        (Online Shopping Production Model)

where

x_kj = input k for shopping transaction j,
y_lj = output l for shopping transaction j,
s_ij = environmental variable i influencing shopping transaction j,
ε_j = deviation from the frontier for shopping transaction j.

Given that each customer may perform multiple online transactions, we subscript the variables with j

in the online shopping production model to distinguish between different shopping transactions. The

inputs relate to the effort put forth by the customers in filling their virtual shopping carts (e.g., number of

product page views, extent of navigation through product listings, amount of search conducted, references

to help pages, etc.). The outputs in the model describe the transaction (e.g., number of items in the

shopping cart at checkout, dollar amount of items in the shopping cart at checkout, etc.). Various other

factors may influence the production process. In online shopping, these may include the level of

experience of the customer (e.g., number of previous transactions at the website, etc.), the quality of the

Internet connection (e.g., connection speed) and so forth.
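To make the estimation step concrete, the input-oriented CCR model of Charnes, Cooper and Rhodes (1978) can be solved as one small linear program per transaction. The sketch below uses made-up transaction data and SciPy's general-purpose LP solver; the variable names and numbers are illustrative assumptions only:

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative input-oriented CCR DEA model (Charnes, Cooper and Rhodes, 1978),
# solved as one linear program per transaction. The data are made up: each row
# of X is one shopping transaction's inputs (product page views, searches) and
# each row of Y its output (items in the cart at checkout).
X = np.array([[8.0, 2.0],
              [6.0, 1.0],
              [12.0, 4.0],
              [9.0, 3.0]])
Y = np.array([[4.0],
              [3.0],
              [4.0],
              [3.0]])

def ccr_efficiency(X, Y, o):
    """Efficiency of unit o: minimize theta subject to
    sum_j lam_j x_j <= theta * x_o  and  sum_j lam_j y_j >= y_o, lam >= 0."""
    n, m = X.shape                                  # units, inputs
    s = Y.shape[1]                                  # outputs
    c = np.concatenate([[1.0], np.zeros(n)])        # decision vars: theta, lam
    A_in = np.hstack([-X[o].reshape(m, 1), X.T])    # lam'X - theta*x_o <= 0
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])     # -lam'Y <= -y_o
    res = linprog(c,
                  A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.concatenate([np.zeros(m), -Y[o]]),
                  bounds=[(0, None)] * (1 + n))
    return res.fun

scores = [ccr_efficiency(X, Y, o) for o in range(len(X))]
```

A score of 1 indicates a transaction on the estimated frontier; scores below 1 indicate the proportional input reduction that would have sufficed for the same output.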


Efficiency Concepts for Analyzing the Internet-Based Selling Websites. For Internet-based selling,

we define website efficiency as the degree to which website design supports efficient online purchasing.1

The effectiveness (or ineffectiveness) of the e-commerce website (i.e., the online production environment)

can be inferred by analyzing the inefficiencies of the customer-website interactions, which can be

measured by estimating the deviations from the production frontier for the observed online transactions.

Note, however, that the focus of efficiency estimation here is not on assessing the productivity of each

production unit (i.e., a customer transaction) per se. Instead, it is to assess the overall productivity of the

production environment. Hence, we are not interested in the individual inefficiency estimates but in the

overall distribution of the inefficiency estimates given a particular production environment (i.e., a

particular website design).

Conceptually, there may be two sources of inefficiency: customer inefficiency and website design

inefficiency. This distinction is similar to managerial versus program efficiency investigated by Charnes,

Cooper and Rhodes (1981) in the context of evaluating the effectiveness of “Program Follow Through”

compared to “Non-Follow Through” in public education. First, the customers may be inefficient in that

they do not use the website design features that are optimal for task performance. This is customer

inefficiency (i.e., inefficiency due to poor execution by the customers). Second, the design of the website

may be poor such that even efficient consumers cannot complete their tasks efficiently. This is website design inefficiency (i.e., inefficiency due to poor website design).

Analyzing the Sources of Inefficiency. The question then is to determine from the observed

efficiency measures where the source of inefficiency is. We approach this by extending the work of Frei

and Harker (1999) by organizing the observations into subgroups of production environments (e.g.,

different website designs) to see if one subgroup outperforms another. In other words, we organize the

customer shopping transactions by website designs to estimate efficiency scores for each subgroup. As a result, each customer transaction will have two efficiency measures: one overall efficiency measure, and one for its design subgroup. Customer and website design inefficiencies are measured by:

Customer Inefficiency = (Design Group Inefficiency) / (Overall Inefficiency) ,  and

Website Design Inefficiency = 1 − Customer Inefficiency
                            = 1 − (Design Group Inefficiency) / (Overall Inefficiency) .

_____________________
1 Website design efficiency is similar to the concept of task-technology fit that has been investigated extensively in the information systems literature in the areas of graphical information presentation (e.g., Benbasat, Dexter, and Todd, 1986), tables versus graphs (e.g., Vessey, 1991), and also general information systems (e.g., Goodhue and Thompson, 1995). In the context of online shopping, the task comprises the shopping goals and decision processes. The technology is the design of the e-commerce website. In other words, website design efficiency measures the fit between the intended usage via system design and actual usage (i.e., how well the design of the website actually supports a consumer’s goal for using the website). In a similar vein, information foraging theory (Pirolli and Card, 1999) posits that people will adopt information seeking strategies that maximize their rate of gaining valuable information (i.e., maximize the return on information foraging) for a given structure of the information environment (i.e., systems and interface design). In other words, the effectiveness of different information environments may be assessed by comparing the respective return on information foraging, which measures the rate of gaining valuable information in the context of an embedding task.

The logic behind these measures is illustrated in Figure 2, which depicts a hypothetical situation

where we are comparing two website designs, Design 1 and Design 2. (See Figure 2.)

Figure 2. Decomposing Inefficiency

[Figure: input-output plot showing the production frontiers for Design 1, for Design 2, and for all observations combined, with three transactions j1, j2 and j3 marked.]

The figure depicts three production frontiers: one for Design 1, one for Design 2, and finally one

overall frontier for all observations.

First, consider the online shopping transaction j1, which is part of Design 1. j1 lies on the

efficiency frontier of both its design group as well as of the overall frontier. Hence, j1 is efficient

(i.e., there is no room for customer or website design inefficiency).

Second, consider the transaction j2, which is part of Design 2. Even though j2 lies on the

efficiency frontier of its design group (Design 2), it lies below the overall efficiency frontier and

hence is not 100% efficient. Since the inefficiency score (i.e., deviation from the production

frontier) for the design group is 0, the portion of inefficiency due to poor execution (i.e.,

customer inefficiency) is 0%, whereas the portion of inefficiency due to poor website design (i.e.,

website design inefficiency) is 100%. In other words, even though the transaction j2 was highly productive within its current production environment (Design 2), there still exists potential for increased productivity: the same level of output could have been achieved with less input if j2 had interacted with the other website design (Design 1).


Third, consider the transaction j3, which is inefficient overall, as well as within its own design

group (Design 2). We can see that the deviation from the overall frontier is approximately twice

the deviation from its design group frontier. Hence, customer inefficiency and website design

inefficiency will both be approximately 50%.
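The decomposition can be sketched numerically. A minimal illustration in Python (the function name and the deviation values are our own, chosen to mirror the j2 and j3 cases of Figure 2):

```python
def decompose_inefficiency(design_group_dev, overall_dev):
    """Split a transaction's overall inefficiency into the share due to poor
    execution (customer) and the share due to poor website design."""
    if overall_dev == 0:
        return 0.0, 0.0  # fully efficient: no inefficiency to attribute
    customer = design_group_dev / overall_dev
    return customer, 1.0 - customer

# j2: efficient within Design 2, but lies 0.5 below the overall frontier
print(decompose_inefficiency(0.0, 0.5))   # all inefficiency is design inefficiency
# j3: overall deviation roughly twice the design-group deviation
print(decompose_inefficiency(0.5, 1.0))   # split roughly evenly
```

For j2 the customer share is 0 and the design share is 1; for j3 both shares are 0.5, matching the discussion above.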

Analyzing the different sources of inefficiency provides an innovative approach to evaluating the design performance of e-commerce websites, and the managerial implications depend on which source dominates. If the inefficiency is present for all customers, it signals to the e-commerce manager that a radical redesign may be warranted. However, if the inefficiency originates with the customer, a radical redesign is unnecessary; other remedial approaches will be more effective. For example, the e-commerce firm may target email messages to less efficient customers to inform them about features they do not currently use, or redesign the website to make these hidden areas more salient and accessible.

Towards a Formal Methodology. With the above conceptualization of customer and website design

inefficiency, we can devise formal methods for comparing between different website designs. Since

customer inefficiency relates to the inefficiency due to poor execution given a production environment,

customer inefficiency can be measured by the deviation from the production frontier. Hence, comparing

customer inefficiency between website designs equates with comparing the means of the distributions of

the inefficiency deviations. Consider Figure 3, which depicts a hypothetical scenario where the individual

observations (i.e., customer transactions) for two different website designs are plotted. (See Figure 3.)

Figure 3. Comparing Customer Inefficiency

[Figure: two scatter plots of output (Y) against input (X), one for Website Design A and one for Website Design B. The observations for Design A cluster close to its production frontier, while those for Design B are more widely dispersed below it.]

We may clearly see that the inefficiency deviations for Website Design A are smaller than those for

Website Design B. In other words, these results would suggest that Website Design B has greater

customer inefficiency than Website Design A, or Website Design A is more “customer efficient” than


Website Design B. Next, since website design inefficiency refers to the inefficiency due to poor website

design, this inefficiency is associated with the best practice production frontiers for the respective

production environments. In other words, we are interested in how well the website design performs if

the website were to produce the maximum level of outputs for each given level of input. Hence, the

comparison of website design inefficiency between website designs can be achieved by directly

comparing the production frontiers.

Figure 4 presents a hypothetical situation where the production frontier for Website Design A

dominates the production frontier of Website Design B (i.e., for any level of output, Website Design A

can produce that output with less input than Website Design B).2 Such results would suggest that Website Design B has greater website design inefficiency than Website Design A, or that Website Design A is more “website design efficient” than Website Design B. (See Figure 4.)

Figure 4. Comparing Website Design Inefficiency

[Figure: output (Y) plotted against input (X), showing the production frontier of Website Design A lying above that of Website Design B at every input level.]

EMPIRICAL METHODS FOR EXAMINING WEBSITE EFFICIENCY

To illustrate the value of our proposed e-commerce website evaluation methodology, we apply our

technique to a currently operational e-commerce website. We next present the research methodology of

the empirical examination.

2 There may also be a situation where the frontiers intersect (Charnes and Cooper, 1980). In such cases, one can compare website design inefficiencies for subsets of the two website designs grouped by magnitude of input (or output) (Brockett and Golany, 1996).


Research Site and Data

We next present information on the research site, and the nature of the unique data and new methods

that permit us to provide new kinds of insights into the problem of website design and evaluation in

Internet-based selling.

Research Site. Data for this study were collected at an online retailer of groceries. The online grocer

is a pure-play Internet-based retailer that delivers groceries directly to the customer’s doorsteps with the

mission of “taking the dread out of grocery shopping.” The company made its first delivery in April

1999, and by mid-July 2000 it had over 9,000 customers who generated more than $16 million in revenue.

Currently, the organization continues to operate only in one metropolitan area in the upper Midwest,

where it is the only online service within its regional market.

Performance evaluation of the website at the company has multiple purposes. First, performance

evaluation is carried out to assess and manage the business, and to assure investors that their invested

funds are deployed in a manner that has the potential to create significant returns. Second, performance

evaluation of the website is employed to find ways to improve the business process that customers

participate in when they shop, and, as a result, firm performance. Similar to many other web-based

businesses, the company has adopted the attitude that “you can’t manage what you can’t measure”—in

other words, competent measurement is a precursor to the formulation of effective management policy for

the firm’s online operations. With this goal in mind, management spends time to do website performance

evaluation so that it can generate insights into how the website is operating, what changes are required to

improve service quality, and why one change might be given a greater priority than another, due to the relative

leverage on ROI that each may provide.

Currently, the data for estimating these business metrics are derived from two separate systems. One

is the customer data warehouse and the other is a website analysis tool that is provided as a bundled

service by an out-of-state application service provider that hosts the firm’s website. The data warehouse,

which contains customer and sales data, is used to conduct market basket analysis (Berry and Linoff,

1997). For example, final sales statistics are used to answer questions such as: “What are our best selling

products?” “What are the demographics of our customer segments?” And “What is the average

profitability for each customer segment?” This analysis is valuable for assessing the overall performance

of the online service (e.g., merchandizing effectiveness). However, it provides very little managerially-

actionable information about how to improve the company’s website.

The website analysis and data mining tool, WebTrends, is employed towards this second goal. It

compiles web server logs to generate website usage statistics. (For more detailed information on this tool,

the interested reader should see www.netiq.com.) The analysis tool offers a series of pre-packaged

reports that show various aspects of online activity. For example, some reports list the most requested


pages, whether the page hits come to the website through external “referring” websites, the browsers that are used by people who visit the site, the number of hits and visits for a given date range on different parts of the website, and the most frequently occurring HTTP errors. These reports are used to answer various

questions such as “What are the most popular product categories?” “What are the most popular

products?” “When do customers shop?” and “What is the ratio of customers who shop or browse versus

customers who purchase?”

A shortcoming of the ready-made reports is that they are designed to only list a set of top 200

statistics, constraining the extent to which the tool can extract useful data to support a variety of

managerial decision and evaluation tasks. For example, like a typical grocery store, our research site offers far more than 200 products, so it is impossible for the firm’s management

to acquire a complete and accurate view of site usage, if they wish to track more than this number of

products. As the reader can imagine, this has become a major source of frustration for the firm’s

managers. They often are more interested in identifying the least-visited areas of the website, whereas the pre-packaged statistics tend to focus on the most frequently visited pages (e.g., the home page, the checkout page, the sale items page, and so on).

Thus, in our field study, we learned that the tools and techniques for evaluating the performance of

the company’s website were surprisingly limited—nowhere near what we expected for a firm that prided

itself on its innovation and technology-enabled managerial sophistication. As a result and with the lack of

useful managerially-actionable information, senior managers at the online grocer rely largely on “gut

feel” and intuition when it comes to decision-making about design changes.

Data Collection. Clickstream data were collected directly from an online grocer’s web servers. The

website uses HTTP session cookies downloaded onto the visitor’s computer to track the customer’s

shopping behavior at the site. Typical data pre-processing procedures for using webserver logs were used

to extract navigation path sequences for each individual visitor from the clickstream data (Cooley,

Mobasher, and Srivastava, 1999). The navigation sessions were then combined to identify purchase

transactions from which website usage metrics were extracted to measure the extent to which various

areas of the website were used in each of the purchasing processes. An overview of the data processing

procedure is presented in Figure 5. (See Figure 5.)


Figure 5. Data Processing Procedure

[Figure: Webserver logs feed a pre-processing stage (data cleaning, path completion, session identification, user identification, transaction identification) that populates a user-session-transaction database, from which website usage metrics are extracted. In parallel, the website source code (HTML, ASP files) undergoes website design analysis (page classification, website topology identification) to yield website design metadata.]
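The session-identification step in Figure 5 can be illustrated with a small sketch. The following Python function (the names and the 30-minute timeout are our own illustrative assumptions, not the firm's actual pipeline) applies the common inactivity-timeout heuristic:

```python
from collections import namedtuple
from datetime import datetime, timedelta

Hit = namedtuple("Hit", ["visitor_id", "ts", "url"])  # one webserver log line

def sessionize(hits, timeout_minutes=30):
    """Group log hits into sessions: a new session starts whenever the
    visitor changes or the gap between consecutive hits exceeds the timeout."""
    timeout = timedelta(minutes=timeout_minutes)
    sessions, current, prev = [], [], None
    for h in sorted(hits, key=lambda h: (h.visitor_id, h.ts)):
        if prev is not None and (h.visitor_id != prev.visitor_id
                                 or h.ts - prev.ts > timeout):
            sessions.append(current)
            current = []
        current.append(h)
        prev = h
    if current:
        sessions.append(current)
    return sessions
```

Subsequent steps would stitch sessions belonging to the same customer into purchase transactions and count page views per category.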

The current dataset spans two weeks from June 23 to July 5, 2001. In this time period, a total of

36,051 sessions were recorded by 18,297 unique customers. The analysis will focus on 5,383 actual

completed purchasing transactions from 4,941 customers. We selected this period for analysis because

there was a design change in the middle; only the homepage of the website (i.e., the first page after the

login screen) was changed.

Empirical Methods Using Data Envelopment Analysis (DEA)

We will illustrate the value of the proposed methodology by comparing the efficiencies of two

website designs. To evaluate the effectiveness of the online grocer’s e-commerce website, we employ

data envelopment analysis (DEA), a non-parametric methodology for production frontier estimation

(Charnes, Cooper, and Rhodes, 1978; Banker, Charnes, and Cooper, 1984). We employ the non-

parametric model, DEA, rather than a parametric model (e.g., stochastic frontier estimation) to estimate

the production relationship between online shopping input and output. We do this because DEA does not assume a specific functional form for the production function and requires only relatively weak assumptions, namely a monotonically increasing and convex relationship between inputs and outputs. The

parametric formulation of stochastic frontier estimation and the non-parametric formulation of DEA have

been shown in prior research to yield very similar results (Banker, Datar, and Kemerer, 1991). DEA

estimates the relative efficiencies of decision-making units (DMUs) from observed input measures and

output measures. The relative productivity of a DMU is evaluated by comparing it against a hypothetical

DMU that is constructed as a convex combination of other DMUs in the dataset. In our current analyses,


we employ an input-oriented CCR model (Charnes, Cooper, and Rhodes, 1978, 1981) to estimate the

efficiencies of online shopping transactions.3

When using DEA, the subsequent analysis is only as good as the initial selection of input and output

variables. The input and output variables are to be selected such that the inputs represent the resources

consumed by the DMUs and the outputs represent the performance of the DMUs. (See Table 1.) Our

model views the online shopping experience as a self-service production system where customers (i.e.,

decision-making units or DMUs) are using a production technology (i.e., the e-commerce website) to

produce an output (i.e., a shopping transaction). In our online shopping context, we conceptualize the

input as customers’ actions in navigating through the e-commerce website in order to produce a

transaction, where the output can be regarded as the checked-out basket of products.

Table 1. Input and Output Variables for Website Efficiency Measurement

CATEGORY   VARIABLE           MEASURE DESCRIPTION
Inputs     x1 Products        Number of product page views
           x2 Lists           Number of product list views
           x3 Personal        Number of personal list views
           x4 Order History   Number of order history page views
           x5 Search          Number of searches conducted
           x6 Promotion       Number of promotional page views
           x7 Recipe          Number of recipe page views
           x8 Checkout        Number of checkout page views
           x9 Help            Number of help page views
Output     y1 Basket size     Number of items at checkout
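To make the variables concrete, the inputs x1–x9 for a transaction can be derived by classifying each page view in the transaction's navigation path. A brief Python sketch (the URL prefixes are hypothetical; the actual mapping comes from the page-classification step of Figure 5):

```python
from collections import Counter

# hypothetical URL-prefix classification of pages into the Table 1 categories
PAGE_CATEGORIES = [
    ("/product/", "x1_products"), ("/list/", "x2_lists"),
    ("/personal/", "x3_personal"), ("/orders/", "x4_order_history"),
    ("/search", "x5_search"), ("/promo/", "x6_promotion"),
    ("/recipe/", "x7_recipe"), ("/checkout", "x8_checkout"),
    ("/help", "x9_help"),
]

def usage_inputs(page_views):
    """Count page views per input category for one purchase transaction."""
    counts = Counter()
    for url in page_views:
        for prefix, category in PAGE_CATEGORIES:
            if url.startswith(prefix):
                counts[category] += 1
                break
    return counts
```

Each transaction then yields a nine-element input vector plus the basket size at checkout as its output.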

The efficiency h_0 of an online transaction j_0, characterized on the basis of the consumption of inputs x_{i0} and production of output y_{r0}, is assessed by solving the following linear program:

3 The BCC model (Banker, Charnes, and Cooper, 1984) allows for variable returns to scale to estimate technical inefficiency at a given scale of production; the CCR model (Charnes, Cooper, and Rhodes, 1981) assumes constant returns to scale and estimates the aggregate of technical and scale inefficiencies. There are several conflicting arguments for or against the appropriateness of each model in our application. For example, one may argue that since the CCR model estimates measures of efficiency that represent the aggregate of scale and technical efficiencies whereas the BCC model estimates the technical efficiency, the BCC model may be more appropriate. Why so? Because the analytical objective is to estimate and compare the technical efficiencies of website designs (This was suggested to us by Rajiv Banker in a personal communication (December 15, 2002)). Furthermore, using BCC estimates of efficiency for hypothesis testing when comparing the efficiencies of website designs may provide a stricter test since BCC estimates will inevitably be less than (or at most equal to) the CCR estimates. On the other hand, it may be argued that the CCR model provides a more appropriate conceptualization of the efficiency measures. Why? Because in the online shopping context the size of the transaction (i.e., the number of items purchased), as an indicator of scale size, is under the control of the customer and not the e-commerce firm.


Min h_0 (Online Shopping Efficiency Evaluation Model)

subject to

h_0 x_{i0} ≥ Σ_{j=1}^{n} λ_j x_{ij},  i = 1, …, 9 inputs

y_{r0} ≤ Σ_{j=1}^{n} λ_j y_{rj},  r = 1 output

λ_j ≥ 0 for all j

The specification of the constraints in the above linear program is such that the production

possibilities set conforms to the axioms of production in terms of convexity, monotonicity, constant

returns to scale and minimum extrapolation (Banker, Charnes, and Cooper, 1984). The first constraint

ensures that all observed input combinations lie on or within the production possibility set defined by the

production frontier (i.e., the envelopment conditions for the input). The second constraint maintains that

the output levels of inefficient observations are compared to the output levels of a convex combination of

observed outputs. The final constraint ensures that all values of the production convexity weights are

greater than or equal to zero. The DEA program is run iteratively for all online shopping transactions to yield efficiency scores h*_j (from which the inefficiency deviations θ*_j = 1/h*_j − 1 are derived) for all DMUs (j = 1, …, J). A DMU j_0 is said to be fully efficient if the optimal solution to its linear program above yields h*_{j0} = 1 with no slack (i.e., excess input or slack output). DMUs with h*_{j0} < 1 are said to be inefficient.
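The linear program above can be solved with any LP solver. A compact sketch using SciPy's `linprog` (the toy data are our own; the paper's analysis uses nine inputs and one output per transaction):

```python
import numpy as np
from scipy.optimize import linprog

def ccr_input_efficiency(X, Y, j0):
    """Input-oriented CCR efficiency h0 for DMU j0.
    X: (n, n_inputs) input matrix; Y: (n, n_outputs) output matrix."""
    n = X.shape[0]
    c = np.zeros(1 + n)
    c[0] = 1.0                                # minimise h0; vars = [h0, lambda_1..n]
    A_ub, b_ub = [], []
    for i in range(X.shape[1]):               # sum_j lambda_j x_ij <= h0 * x_i,j0
        A_ub.append(np.concatenate(([-X[j0, i]], X[:, i])))
        b_ub.append(0.0)
    for r in range(Y.shape[1]):               # sum_j lambda_j y_rj >= y_r,j0
        A_ub.append(np.concatenate(([0.0], -Y[:, r])))
        b_ub.append(-Y[j0, r])
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0, None)] * (1 + n), method="highs")
    return res.x[0]

# toy data: three transactions, two inputs, one output
X = np.array([[2.0, 3.0], [4.0, 6.0], [3.0, 2.0]])
Y = np.array([[1.0], [1.0], [1.0]])
scores = [ccr_input_efficiency(X, Y, j) for j in range(len(X))]
```

Here the second transaction uses twice the inputs of the first for the same output, so it scores 0.5, while the other two lie on the frontier.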

Empirical Analysis and Hypothesis Testing

Given that the management at our research site implemented a design change for its website in the

middle of the data collection period, our analysis will focus on estimating and comparing the website

efficiencies—both customer efficiency and website design efficiency—of and between the two designs.

Customer inefficiency relates to the inefficiency due to poor execution with a particular website

design. Hence, the estimation of customer efficiency/inefficiency involves conducting DEA efficiency

estimations separately for each website design condition (i.e., each DMU is compared only to other

DMUs that transacted with the same website design). To test whether the customer inefficiency scores

for one website design are greater (or smaller) than those of another website design, we need to compare

the inefficiency distributions between the two website design conditions.

On the other hand, website design inefficiency relates to the inefficiency due to poor website design.

Hence, estimating website design inefficiency involves estimating the production frontier (i.e., the data

envelope) and comparing website design inefficiencies between different website designs involves

analyzing whether the production frontier for one website design outperforms that of the other website


design. In order to do so, the inefficient DMUs are adjusted to their “level-if-efficient-value” by

projecting each DMU onto the production frontier of its website design condition. In other words, for

each website design condition, all (or most) DMUs (i.e., the originally-efficient DMUs and the adjusted-

to-efficiency inefficient DMUs) should be expected to be rated as efficient within their website design

condition. Next, a pooled (inter-envelope) DEA is conducted with all DMUs (i.e., all DMUs in both

website conditions) at their adjusted efficient levels. Finally, to test whether website design inefficiency

for one website design is greater (or smaller) than that of the other website design, we need to compare

the efficiency ratings derived from the pooled DEA between the website design conditions.

We adopt the statistical test procedure proposed by Banker (1993) for comparing efficiency ratings

between groups. Basically, the statistical procedure involves testing whether the means of the

inefficiency score probability distributions for different conditions are different. Two test statistics were

proposed by Banker depending on whether inefficiency deviations of the observed data are postulated to

be drawn from exponential or half-normal distributions. It is reasonable to assume an exponential

distribution for the inefficiency deviations when one has reason to believe that most observations are

close to the production frontier, whereas a half-normal distribution should be assumed when few

observations are likely to be close to the frontier.

The overall test procedure is as follows. Let j represent an online shopping transaction in the overall

dataset. The set J of online transactions consists of two subsets D1 and D2 representing the two different

website designs (i.e., D1 for week 1 and D2 for week 2). We denote the inefficiency score of a shopping

transaction j in group D_i by θ_j^{Di} to distinguish between the two groups and allow for the possibility that the probability distributions for the two sets of inefficiency scores θ^{D1} and θ^{D2} may be different. If we assume the inefficiency deviations to be exponentially distributed (i.e., θ_j^{D1} = 1/h_j^{D1} − 1 exponentially distributed with parameter σ1, and θ_j^{D2} = 1/h_j^{D2} − 1 exponentially distributed with parameter σ2), the null hypothesis is that the two website designs are not different in terms of inefficiency deviations, H0: σ1 = σ2. The alternative hypothesis is H1: σ1 > σ2, that the website design of Week 1 has greater input inefficiency than that of Week 2 (i.e., the website design of Week 1 is less efficient than that of Week 2). The test statistic is:

(Σ_{j∈D1} θ_j^{D1} / n_{D1}) / (Σ_{j∈D2} θ_j^{D2} / n_{D2})

The test statistic asymptotically follows the F-distribution with (2n_{D1}, 2n_{D2}) degrees of freedom for large n, where n_{D1} and n_{D2} are the numbers of observations in the subsets D1 and D2, respectively. On the other hand, if we assume the inefficiency deviations to be half-normally distributed, then we need to use a different test statistic, as follows:


(Σ_{j∈D1} (θ_j^{D1})² / n_{D1}) / (Σ_{j∈D2} (θ_j^{D2})² / n_{D2})

This statistic again asymptotically follows an F-distribution, now with (n_{D1}, n_{D2}) degrees of freedom for large n.
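Given two vectors of inefficiency deviations, both statistics are simple ratios of group averages. A small Python sketch (the function name and the simulated deviations are our own; SciPy is used only for F-distribution tail probabilities):

```python
import numpy as np
from scipy.stats import f as f_dist

def banker_f_tests(theta1, theta2):
    """Banker (1993) tests of H0: sigma1 = sigma2 vs H1: sigma1 > sigma2,
    applied to inefficiency deviations theta = 1/h - 1 for groups D1, D2."""
    n1, n2 = len(theta1), len(theta2)
    # exponential assumption: ratio of mean deviations ~ F(2*n1, 2*n2)
    t_exp = (np.sum(theta1) / n1) / (np.sum(theta2) / n2)
    p_exp = f_dist.sf(t_exp, 2 * n1, 2 * n2)
    # half-normal assumption: ratio of mean squared deviations ~ F(n1, n2)
    t_hn = (np.sum(np.square(theta1)) / n1) / (np.sum(np.square(theta2)) / n2)
    p_hn = f_dist.sf(t_hn, n1, n2)
    return (t_exp, p_exp), (t_hn, p_hn)

# simulated example: group 1 deviations drawn with a larger mean than group 2
rng = np.random.default_rng(0)
theta1 = rng.exponential(1.0, size=789)
theta2 = rng.exponential(0.4, size=604)
(t_exp, p_exp), (t_hn, p_hn) = banker_f_tests(theta1, theta2)
```

With these simulated groups both statistics are well above 1 and H0 is rejected at conventional levels.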

When comparing customer inefficiency between website designs, the inefficiency scores are derived

by running DEA separately for each website design condition, whereas when comparing website design

inefficiency, the inefficiency scores are derived by running the pooled DEA with all DMUs at their

adjusted efficient levels for their respective website design. To avoid confusion between the two sets of

inefficiency scores, we will use θ_j^{Di} to denote the inefficiency ratings derived from the separate (between-group) DEA and use θ̂_j^{Di} (with a hat) to denote the inefficiency ratings derived from the pooled (inter-envelope) DEA with adjusted-to-efficiency DMUs. In any case, the general statistical test procedure will be the same for both customer inefficiency and website design inefficiency comparisons.

RESULTS

We now present the results of our empirical examination of the evaluation of the online grocer’s e-

commerce website. We first present overall results of the aggregate efficiency estimations, then follow

by presenting the results of comparing customer and website design efficiencies between the websites.

Overall Results

We first report the overall results of the aggregate efficiency estimation. Table 2 presents descriptive

statistics of the DEA results. These results seem to suggest that, overall, the website exhibits an average

level of website efficiency since efficiency scores for 50% of the transactions range between 0.522 and

0.785. There also seem to be many inefficient transactions, since efficiency scores for 25% of the

transactions range between 0.108 and 0.521. Another way to look at this is to observe the long tail of the

distribution of inefficiency deviations, where 25% range between 0.915 and 8.291. (See Table 2.)


Table 2. DEA Aggregate Efficiency Score Summary Statistics

STATISTIC        EFFICIENCY SCORE   INEFFICIENCY DEVIATION
Minimum          0.108              0.000
Maximum          1.000              8.292
Mean             0.648              0.717
Std Deviation    0.187              0.697
1st Quartile     0.522              0.274
Median           0.644              0.552
3rd Quartile     0.785              0.915

Note: The number of observations in this analysis is 5,383. Efficiency scores (0 < h*_j ≤ 1) are estimated from the DEA, whereas the inefficiency deviations are derived from the efficiency scores (θ*_j = 1/h*_j − 1). Transactions with efficiency scores close to 1 are efficient.

Figure 6 plots the aggregate efficiency scores of all DMUs (with 5,383 observations) against the

respective output of each observation (i.e., the basket size or number of items in the cart at checkout).

(See Figure 6.) Visual inspection gives a summary of overall website efficiency. The plot shows the

variability of efficiency scores at all levels of outputs, suggesting that the website may be ineffective.

Figure 6. DEA Aggregate Efficiency Scores by Output Level

Note: The above graph represents the aggregate efficiency scores estimated from the overall DEA (5,383 observations). Hence, at this point we do not distinguish between transactions that occurred with the different website designs. The horizontal axis represents the efficiency scores of the online shopping transactions (0 < h*_j ≤ 1), whereas the output level (i.e., number of items in the cart at checkout) is represented on the vertical axis. The efficient transactions lie on (or near) the right edge of the graph (h*_j ≈ 1).

Comparing DEA Scores for Customer Efficiency

We now compare the efficiencies of the two website designs in our data collection period. We first

identified two sub-samples of online shopping transactions that did not span weeks. In other words, we

are interested in comparing efficiency scores for only those transactions that were performed entirely with one of the two website designs, since a customer may initiate a transaction during Week 1 and complete it


during week 2. Of the 5,383 completed transactions in our dataset, we identified 789 transactions which

started and ended during Week 1 (subgroup D1) and 604 which started and ended during Week 2

(subgroup D2). The online shopping DEA model was run iteratively for each of the subsets to estimate

the relative customer efficiencies (h_j^{Di}) and the inefficiency deviations (θ_j^{Di} = 1/h_j^{Di} − 1). Table 3 presents the descriptive statistics of the DEA results and Figure 7 shows the distribution of observed inefficiency deviations. These results seem to suggest that D2 outperforms D1 in customer efficiency (i.e., greater average customer efficiency scores, or smaller average customer inefficiency deviations). (See Table 3 and Figure 7.)

Table 3. DEA Customer Efficiency Scores Summary Statistics

                 EFFICIENCY SCORE     INEFFICIENCY DEVIATION
STATISTIC        D1        D2         D1        D2
Minimum          0.111     0.211      0.000     0.000
Maximum          1.000     1.000      8.035     3.742
Mean             0.551     0.766      1.060     0.415
Std Deviation    0.177     0.186      0.895     0.494
1st Quartile     0.428     0.640      0.513     0.078
Median           0.551     0.782      0.815     0.278
3rd Quartile     0.661     0.928      1.336     0.562

Note: The subset D1 for Week 1 has 789 observations; subset D2 for Week 2 has only 604. These observations represent online transactions that started and ended during their respective weeks. They are only those customer transactions where the customer interacted with one particular website design in completing a transaction.

Figure 7. Distribution of Observed Customer Inefficiency Deviations

[Figure: histogram of the percentage of customers (vertical axis, 0% to 25%) by inefficiency deviation (horizontal axis, 0 to 5), with separate series for D1 (Week 1), D2 (Week 2), and the overall sample.]

Note: The figure shows the distributions of the inefficiency deviations (θ*_j = 1/h*_j − 1) for the customer efficiency scores for the subsets, D1 and D2, and for the aggregate customer efficiency scores. Greater mass near the left edge of the graph (i.e., inefficiency deviation ≈ 0, or h*_j ≈ 1) implies that most observations are on or close to the efficient frontier. The long tail suggests some, but not many, transactions that are highly inefficient.

We conducted hypothesis tests using DEA-based heuristics proposed by Banker (1993) to validate

these findings. Table 4 presents a summary of the results. (See Table 4.)

Table 4. Summary of Hypothesis Test Comparing Customer Efficiency Scores

HYPOTHESIS: H0: σ1 = σ2 vs. H1: σ1 > σ2
Exponential: test statistic (Σ_{j∈D1} θ_j^{D1} / n_{D1}) / (Σ_{j∈D2} θ_j^{D2} / n_{D2}) = 2.557; critical F(1578, 1208) = 1.135 (α = 0.01); Reject H0
Half-normal: test statistic (Σ_{j∈D1} (θ_j^{D1})² / n_{D1}) / (Σ_{j∈D2} (θ_j^{D2})² / n_{D2}) = 4.624; critical F(789, 604) = 1.119 (α = 0.01); Reject H0

Note: This table summarizes the hypothesis test comparing the customer efficiency scores using the test statistics proposed by Banker (1993). Note that the degrees of freedom for the F-test when the assumed distribution for the inefficiency deviations is exponential are twice the number of observations in each subset (2 × n_{D1} = 2 × 789 = 1578 and 2 × n_{D2} = 2 × 604 = 1208).

The null hypothesis (i.e., H0: σ1 = σ2) that there are no differences in input customer inefficiencies was

rejected (α = 0.01 level) for both the exponential and half-normal distribution assumptions for the

inefficiency deviations. Therefore, we accept the alternative hypothesis that the website design of Week

2 resulted in reduced input customer inefficiencies.

Similar results were obtained with the non-parametric test of Brockett and Golany (1996) that uses

the Mann-Whitney rank statistic to evaluate significance of differences in observed customer efficiencies

between website designs. The test statistic Ztest using the Mann-Whitney statistic (U = 156,440) was

–10.999 < -Zα/2 = -2.576 (α = 0.01). This suggests that customer inefficiency for D1 was greater than for

D2. Again we have evidence that the Week 2 design led to reduced input customer inefficiencies.
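The Brockett and Golany (1996) procedure reduces to a rank test on the two groups' scores. A brief sketch in Python (the simulated scores are hypothetical stand-ins for the DEA ratings):

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(1)
# hypothetical customer efficiency scores: D2 shifted toward 1 (more efficient)
eff_d1 = rng.beta(4.0, 3.0, size=789)
eff_d2 = rng.beta(6.0, 2.0, size=604)

# H1: D1 scores tend to be smaller than D2 scores (D1 less efficient)
u_stat, p_value = mannwhitneyu(eff_d1, eff_d2, alternative="less")
reject_h0 = p_value < 0.01
```

A one-sided p-value below α leads to the same conclusion as the parametric tests: the Week 1 design is less customer efficient.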

Comparing DEA Scores for Website Design Efficiency

Next, we compare the website design efficiencies of the two website designs. The pooled (inter-

envelope) DEA was run with the adjusted-to-efficiency DMUs to estimate the relative website design efficiencies (ĥ_j^{Di}) and inefficiency deviations (θ̂_j^{Di} = 1/ĥ_j^{Di} − 1). Table 5 presents the descriptive statistics of the pooled DEA results and Figure 8 shows the distribution of observed inefficiency deviations. (See Table 5 and Figure 8.) In this case, the results seem to suggest that D1 outperforms D2 in terms of website design efficiency (i.e., greater average website efficiency scores, or smaller average website inefficiency deviations), a reversal of the results of comparing customer efficiencies.


Table 5. Website Design Efficiency Scores Summary Statistics

                 EFFICIENCY SCORE     INEFFICIENCY DEVIATION
STATISTIC        D1        D2         D1        D2
Minimum          0.875     0.561      0.000     0.000
Maximum          1.000     1.000      0.142     0.783
Mean             0.987     0.906      0.013     0.113
Std Deviation    0.021     0.078      0.023     0.108
1st Quartile     0.982     0.857      0.000     0.003
Median           0.999     0.919      0.000     0.087
3rd Quartile     1.000     0.971      0.018     0.167

Note: Again, the subset D1 for Week 1 has 789 observations, whereas the subset D2 for Week 2 has 604 observations. Website design efficiency scores are estimated by conducting the pooled DEA where the observations are adjusted to efficiency levels within their respective subsets.

Figure 8. Distribution of Observed Website Design Inefficiency Deviations

[Figure: histogram of the percentage of customers (vertical axis, 0% to 60%) by inefficiency deviation (horizontal axis, 0 to 0.5), with separate series for D1 (Week 1), D2 (Week 2), and the overall sample.]

Note: The above graph shows the distributions of the inefficiency deviations (θ̂*_j = 1/ĥ*_j − 1) for the website design efficiency scores for the two subsets (D1 and D2) as well as for the aggregate website design efficiency scores. The interpretation of the graph is the same as for Figure 7. However, note the difference in order of magnitude of the inefficiency deviations (horizontal axis) between the two graphs. This is because the observations have been adjusted to efficiency levels within their respective subsets prior to conducting the pooled DEA. Hence, most of the observations should be close to the efficiency frontier, leading to the order-of-magnitude difference in inefficiency deviation levels.

As before, we conducted hypothesis tests using DEA-based heuristics proposed by Banker

(1993) to validate these findings. (See Table 6.)


Table 6. Summary of Hypothesis Test Comparing Website Design Efficiency Scores

HYPOTHESIS                 ASSUMED DISTRIBUTION FOR    TEST STATISTIC                                          CRITICAL F (α=0.01)     RESULT
                           INEFFICIENCY DEVIATIONS
H0: σ1 = σ2; H1: σ1 < σ2   Exponential                 [Σ_{j∈D2}(ĥ_j − 1)/n_{D2}] / [Σ_{j∈D1}(ĥ_j − 1)/n_{D1}] = 8.546      F(1208, 1578) = 1.093   Reject H0
H0: σ1 = σ2; H1: σ1 < σ2   Half-normal                 [Σ_{j∈D2}(ĥ_j − 1)²/n_{D2}] / [Σ_{j∈D1}(ĥ_j − 1)²/n_{D1}] = 33.970   F(604, 789) = 1.133     Reject H0

Note: This table summarizes the hypothesis tests comparing the website design efficiency scores. The direction of the inequality in the alternative hypothesis (H1) is reversed because we are testing whether the website design inefficiency of Week 2 (D2) was greater than that of Week 1 (D1). As a consequence, the numerators and denominators of the test statistics are switched (i.e., the numerator holds the average inefficiency deviation, or average squared inefficiency deviation, for D2, with the corresponding average for D1 in the denominator). The degrees of freedom for the F-tests are likewise reversed (i.e., F(1208, 1578) and F(604, 789) instead of F(1578, 1208) and F(789, 604)).
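The Banker (1993)-style test statistics are simply ratios of average (or average squared) inefficiency deviations across the two subsets. A minimal sketch with hypothetical deviation data (the values below are illustrative and do not reproduce the paper's 8.546 and 33.970):

```python
# Banker (1993)-style tests comparing inefficiency across two groups.
# dev1 and dev2 hold inefficiency deviations (h_hat - 1) for D1 and D2;
# the values are hypothetical.
dev1 = [0.00, 0.01, 0.02, 0.00, 0.02]   # Week 1 deviations
dev2 = [0.05, 0.10, 0.00, 0.15, 0.20]   # Week 2 deviations

# Exponential assumption: ratio of mean deviations, distributed
# F(2*n2, 2*n1) under H0.
t_exp = (sum(dev2) / len(dev2)) / (sum(dev1) / len(dev1))

# Half-normal assumption: ratio of mean squared deviations, distributed
# F(n2, n1) under H0.
t_half = (sum(d * d for d in dev2) / len(dev2)) / \
         (sum(d * d for d in dev1) / len(dev1))

print(t_exp, t_half)   # reject H0 if a statistic exceeds its critical F
```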

The null hypothesis (i.e., H0: σ1 = σ2) that there are no differences in website design

inefficiencies was rejected (α = 0.01) under both the exponential and half-normal distribution assumptions

for the inefficiency deviations. Therefore, we accept the alternative hypothesis that the website

design of Week 2 resulted in greater website design inefficiencies. Lending additional credibility

to these results, we also report that similar results were obtained with the non-parametric test. The test

statistic Z based on the Mann-Whitney statistic (U = 385,329) was 19.764 > Zα/2 = 2.576 (α = 0.01),

suggesting that website design inefficiency for D2, the Week 2 design, was greater than for D1, the Week

1 design.
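The large-sample normal approximation behind this non-parametric comparison can be sketched as follows. The sample values are hypothetical (the paper's U and Z come from its own data), and tie correction is omitted for brevity:

```python
import math

def mann_whitney_z(a, b):
    """Mann-Whitney U with a normal approximation (no tie correction)."""
    n1, n2 = len(a), len(b)
    # U counts, over all pairs, how often a value in `a` exceeds one in
    # `b` (ties count one half).
    u = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return u, (u - mu) / sigma

# Hypothetical inefficiency deviations for the two designs.
u, z = mann_whitney_z([0.4, 0.5, 0.6, 0.7], [0.1, 0.2, 0.3, 0.35])
print(u, z)   # compare |z| against z_{alpha/2}, e.g. 2.576 at alpha = 0.01
```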

Decomposing Website Efficiency

We next investigate more qualitatively the difference between the two website designs in terms of the

source of inefficiency. In particular, are the observed inefficiencies largely due to customer inefficiency

or website design inefficiency? We compute inefficiencies (i.e., customer and website design

inefficiencies) for each of the inefficient DMUs. Figure 9 presents a visual overview of the results. (See

Figure 9.)


Figure 9. Histogram of Website Design Inefficiency Scores

[Histogram: horizontal axis shows the percentage of inefficiency attributable to website design, in bins from 0% to 100%; vertical axis shows the proportion of DMUs (0 to 0.6); two series: D1 (Week 1) and D2 (Week 2).]

Note that website design and customer inefficiency are complementary: a 0% website design

inefficiency corresponds to 100% customer inefficiency. Hence, distributions concentrated toward the left suggest that

the inefficiency lies with customers and not website design, whereas bars concentrated toward the right suggest that most

of the measured inefficiency is attributable to website design. Consistent with our hypothesis tests, the

website design inefficiency scores for the two different designs (D1 for Week 1 versus D2 for Week 2)

show stark differences. For D1 (Week 1), more DMUs showed less inefficiency due to website design:

any observed inefficiency is more likely to be due to poor execution by the customers. For D2 (Week 2),

the pattern reverses: for a large proportion of users, all of the observed inefficiency is due to poor

website design rather than poor execution.
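The attribution underlying this comparison can be sketched as follows, assuming (purely for illustration, not as the paper's exact formulation) that a DMU's overall inefficiency deviation splits additively into a customer component (distance to its own subset's frontier) and a website design component (the remaining distance to the pooled frontier):

```python
def inefficiency_shares(h_overall, h_customer):
    """Split overall inefficiency (h_overall - 1) into customer and
    website-design shares, given the within-subset score h_customer.
    Scores are radial with h >= 1; the shares sum to 1."""
    total = h_overall - 1.0
    if total <= 0.0:          # fully efficient DMU: nothing to attribute
        return 0.0, 0.0
    customer = h_customer - 1.0
    design = total - customer
    return customer / total, design / total

# Hypothetical scores: overall radial score 1.5 (50% excess input use),
# of which 1.2 is measured relative to the DMU's own design's frontier.
cust_share, design_share = inefficiency_shares(1.5, 1.2)
print(cust_share, design_share)
```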

The shift in source of inefficiency (i.e., from more customer inefficiency during Week 1 to more

website inefficiency during Week 2) suggests that the website design during Week 1 had the potential to

be highly efficient. However, the customer base of the online grocer did not have the capabilities to reap

the benefits of this efficient design. On the other hand, the website design during Week 2, in general, was

easier to use (i.e., based on reduced customer inefficiencies) but had less potential for high efficiency.

We may infer from these findings that the customer base of the online grocer is not composed of highly

sophisticated Internet users, which is to be expected given the domain of grocery shopping. With such a

customer base, it may be more beneficial to adopt a less-than-optimal design (in terms of the best

practice frontier) so that the broader customer base can still use the website effectively. A useful analogy is Unix

versus Windows in computer operating systems. Unix is efficient because its user base consists mainly of

expert programmers or power users who understand the cryptic command-line interface. However, the

Windows graphical user interface makes it possible for novices to perform necessary tasks productively.


DISCUSSION

The major problems that managers at e-commerce sites face are associated with understanding how

their online storefront is operating, whether the current design of the website is effective, and, if not, what

design changes may increase its effectiveness. However, these problems are difficult to tackle because

obtaining an adequate representation of overall website effectiveness is hampered by the lack of

visibility into customers' actions. The major challenge lies in understanding

how users are actually using the website. Unlike with physical stores, managers at e-commerce firms

cannot directly observe the customers' behaviors within the online storefront. For example, when

customers have trouble finding items in the aisles of a physical store, sales representatives

can intervene to help (Underhill, 1999). However, with e-commerce websites, it is difficult to observe

what is happening within the virtual store.

The major managerial insight that can be derived from this paper is that even though one cannot

directly observe consumers’ actions within the online storefront, it is possible to indirectly infer their

behaviors from the clickstream traces left by the customers. The problem then becomes one of data

summarization. Even though it is possible to reconstruct the customers' navigation traces, it is

still very costly to examine these traces on an individual basis. Hence, we need to summarize the set of

individual navigation traces into a meaningful measure or set of measures that can provide managerially-

actionable information about how to improve the company’s website. In this paper, we have proposed

that website efficiency is a worthwhile representation of the effectiveness of the e-commerce website.

Measures of website efficiency represent how well the e-commerce website supports online shopping

transactions.
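The summarization step can be sketched as a simple aggregation over clickstream records. The log format, field positions, and path conventions below are hypothetical; real web server logs require sessionization heuristics (cf. Cooley, Mobasher and Srivastava, 1999):

```python
# Minimal sketch: summarize raw clickstream lines into per-session
# input/output measures for efficiency estimation. The log format here
# (session id followed by request path) is hypothetical.
from collections import defaultdict

log_lines = [
    "sess1 /search?q=milk",
    "sess1 /product/42",
    "sess1 /cart/add/42",
    "sess1 /checkout",
    "sess2 /browse/dairy",
    "sess2 /product/7",
    "sess2 /checkout",
]

sessions = defaultdict(lambda: {"page_views": 0, "items_added": 0})
for line in log_lines:
    sid, path = line.split()
    sessions[sid]["page_views"] += 1          # input: navigation effort
    if path.startswith("/cart/add/"):
        sessions[sid]["items_added"] += 1     # output: basket contents

print(dict(sessions))
```

Each session's aggregated inputs and outputs would then serve as one DMU in the frontier estimation.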

Even though website efficiency provides a useful metric for assessing the effectiveness of the online

shopping environment, it only provides a diagnosis. In other words, low website efficiency scores may

indicate that the design of the e-commerce website is ineffective. However, why it is ineffective is

still unknown.4 Unfortunately, this kind of diagnosis does not provide management with immediate

insights into how to improve the design of the website to increase its effectiveness. With this goal in

4 This is a recurring problem with many performance metrics provided by current weblog analysis and data mining tools. For example, knowing that a particular product is the top-selling product this week is nice, but by itself it suggests no action. It would be more useful to have specific information that indicates the effectiveness of the placement of product promotions, and the extent to which different kinds of placement create marginal impacts on sales. With this kind of information in hand, management will be better informed about how to improve the design of the website so as to maximize the ROI associated with the screen real estate being used.


mind, we have proposed the measurement concepts of customer inefficiency and website design

inefficiency. Customer inefficiency captures inefficiency due to poor execution given a particular

website design. Website design inefficiency conveys the inefficiency due to poor design of the website.

These finer-grained measures of website efficiency (and inefficiency) provide the much-needed insights

into why a particular design may be ineffective and also what to do about it. For example, high levels of

website design inefficiency would suggest that the poor design of the website impedes the efficiency of

the customer’s interaction with the website. This would be a signal to the e-commerce manager that a

fundamental redesign of the website may be necessary. On the other hand, high levels of customer

inefficiency would imply that the customers are not utilizing the website to its full efficiency potential. In

such a situation, the correct remedial measure would be to educate the customers. For example, the e-

commerce firm may target email messages to those less efficient customers to inform them about features

that exist which they do not currently use, or redesign the website to make these hidden areas more salient

and accessible.

The application of our proposed evaluation methodology illustrates the value of our technique. We

were able to generate several interesting insights concerning the effectiveness of the e-commerce website

of the Internet grocer that we studied. These insights would otherwise have been difficult (or

impossible) to obtain with currently available and widely-used website evaluation techniques. For

example, we were able to estimate significant differences in efficiencies (inefficiencies) for the online

grocer’s website pre- and post-redesign. The results of the efficiency estimation suggest that the online

grocer’s website had potential for highly efficient transactions pre-redesign; however, a good portion of

the customers could not attain this high level of efficiency. On the other hand, the results also suggest

that this potential for highly efficient transactions was lessened as a result of the redesign. Still though,

the online grocer’s customers were relatively better able to use this less-than-optimal design to its full

extent. These results are very rich and provide a valuable empirical basis for how to start thinking about

redesigning the website in order to increase its effectiveness.

CONCLUSION

Evaluating the effectiveness of e-commerce website design is a very important, yet highly complex

problem for e-commerce retailers. Given that their success hinges to a great extent on the ability

to provide a high-quality website, e-commerce retailers need to

constantly monitor the effectiveness of their web-based storefronts. However, current methods for

website evaluation do not offer a practical solution to this problem. In this paper, we have

proposed a methodology for measuring the effectiveness of e-commerce website design,

one that supports assessment of e-commerce websites on an on-going


basis. Our methodology not only allows the measurement of the effectiveness of a particular design but

also the comparison of website effectiveness between designs over time. We conclude with a summary

of the practical and theoretical contributions of this paper, along with caveats and considerations in

applying our methodology, and we end with directions for future development of this line of work.

Contributions

Our website evaluation methodology provides significant benefits over current methods that are

widely used. One of the major advantages of our proposed measurement technique is that we are able to

make use of observable consumer actions for all users/customers at a given website. In fact, the problem

of scalability has been a major concern with the previous evaluation methods such as user testing, inquiry

or expert inspection. For example, with user testing, the number of subjects needed to generate a

representative picture of website usability problems is still under debate (Spool and

Schroeder, 2001; Bevan, Barnum, Cockton, Nielsen, Spool, and Wixon, 2003). Also, bounded rationality

makes it difficult for usability experts to identify all the usability problems that may arise for the

wide variety of different users who may be customers at the website. We are not, however, arguing that

testing, inquiry and inspection methods do not provide any value. On the contrary, we believe that such

methods have their own specific complementary strengths and should be employed in conjunction with

our proposed method.

Second, our methodology provides an unobtrusive approach to data collection. Although online user

surveys leverage available web technologies and are now widely adopted, non-respondents will

always exist. Moreover, in the context of frequent website redesigns, which are the norm rather

than the exception in e-commerce, it becomes difficult to solicit continuous responses for each website

redesign. Furthermore, the obtrusive nature of the survey method (and also the user testing method) may

introduce response bias from the participants, which may contaminate the results. Thus, we view the

survey method as a weaker instrument for studying these systems design contexts. A major benefit of

our methodology is that we may bypass the aforementioned problems by making use of automatically-

collected web server log data. Since web navigation behavior occurring in a genuine real world setting is

collected, the potential for bias in the data is minimized.5 In addition, with additional effort, the data-

collection, preparation and even the efficiency estimation procedures can be systematically programmed

5 However, there are important privacy concerns since customers are extremely averse to the idea that someone is monitoring their website usage behaviors (Srivastava, Cooley, Deshpande, and Tan, 2000). The World-Wide Web Consortium (W3C) has an ongoing initiative called “Platform for Privacy Preferences.” It recommends that site administrators publish a site’s privacy policies in machine-readable format. This is so that web browsers only request and display pages that conform with the user’s privacy preferences. Still, most users are not aware of these features and conformance by firms to these protocols currently is not regulated by law.


into the web application servers, making it possible to generate efficiency metrics automatically and

continuously, so that e-commerce managers can monitor the effectiveness of their websites on an on-

going basis without incurring the costs of extraneous data collection and tedious analysis. Such additional

developments will allow e-commerce firms to gain competitive advantage by reducing the feedback cycle

time between website evaluation and website redesign.

Third, this paper presents a formal methodology for website evaluation. Our methodology outlines

the full cycle, from model formulation through data requirements and collection, model operationalization,

and efficiency estimation, to hypothesis testing. We also present a framework for remedial action

when the results suggest it is warranted. Furthermore, the empirical methods outlined in this paper

do not employ any proprietary data that is specific to the particular research site that we have investigated.

Rather, our methods only make use of data that are available to all e-commerce firms (i.e., raw

clickstream data from web server logs). Therefore, our method should be readily applicable to different

firms.

Finally, from a theoretical standpoint, this paper is the first to introduce the production perspective to

the evaluation of online shopping. The theory of production features a number of useful

conceptualizations for the kind of context that we have studied (including, e.g., the production function,

the best practice frontier, returns to scale, technical rate of substitution, and so on). The use of that

theoretical basis also opens up the use of other related and sophisticated

methodological tools for estimating the productivity of a production technology (e.g., data envelopment

analysis, parametric frontier estimation, etc.). We have only touched upon a few of these concepts and

techniques in the current paper. We believe that new insights will be generated by incorporating

additional concepts from production theory into the problem of website evaluation. So this paper presents

a first step in this exciting direction. This newly-introduced evaluation perspective for online shopping

should stimulate additional research into the area of e-commerce website evaluation and also more

broadly into the research domain of online shopping.

Caveats and Considerations

The reader should consider a number of caveats and considerations relative to the interpretation of the

results of this study, as well as the implementation of the methodology that we propose.

On the Possibility of Inappropriate Generalization. Even though the value of the proposed

website evaluation methodology can be inferred by the interesting results enumerated above, care must be

taken not only when interpreting the results but also when trying to apply the methodology more widely.

We have shown that the estimated efficiencies of the different website designs were significantly different.

But we purposely have made an effort to not reveal a lot of qualitative details about the nature of the

design changes. This is because we do not want the reader to over-generalize and assume that the


changes that were made to the website will, in general, reduce website design efficiency and increase customer

efficiency. In other words, given the above results and insights, one should not assume that a similar

change to a different website would lead to similar website efficiency changes.

Indeed, we acknowledge and emphasize the fact that many factors come into play in terms of e-

commerce website effectiveness and that many of these factors may be site-specific. We are not

interested in uncovering universal design guidelines that may be applied to any setting (e.g., identifying

the optimal organization of product hierarchies in an e-commerce website). Instead, the focus of our

methodology is to provide managers at e-commerce firms with useful feedback concerning how their

customers are performing in the presence of their website designs. As we briefly described in the

Introduction of this article, our proposed evaluation methodology is intended to be used as a tool for the

continuous management of website quality. The rationale is similar in spirit to an important research

stream in software engineering economics, where metrics for evaluating software development and

maintenance productivity have been developed as a vehicle for managing and maximizing the value of

software development projects (e.g., Banker, Datar, and Kemerer, 1991; Banker and Kauffman, 1991;

Banker and Slaughter, 1997; Chidamber, Darcy, and Kemerer, 1998). Likewise, our proposed evaluation

methodology is intended for use within a firm for managing its website development initiatives.

Uncovering insights related to specific design features can also be accommodated given that our

model of online shopping as production process is very general. Even though we have only employed

measures of input (xkj) and output (ylj) in our empirical investigation, other environmental variables (sij)

may be included in the production function. For example, in order to estimate the impact of a particular

design feature (e.g., a new personalization engine), the presence (or absence) of that feature may be

included as an environmental variable in the frontier estimation. The resulting coefficient from the

frontier estimation will represent the impact of that design feature on website efficiency. We are

currently collecting data related to the design qualities of the online grocer’s website in order to estimate

the impacts of various design features on website efficiency.
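Such an estimation can be sketched parametrically. The following is a hedged illustration using corrected OLS (COLS) on a log-linear production function with a design-feature dummy; the data, variable names, and the parametric functional form are all hypothetical, and our own empirical work uses DEA rather than this variant:

```python
# Corrected-OLS (COLS) sketch of a deterministic frontier with one
# environmental dummy for a design feature. All data are hypothetical.
import math

def solve3(m, v):
    """Solve a 3x3 linear system by Gauss-Jordan elimination."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]

def cols_frontier(x, y, feature):
    """OLS of ln y on [1, ln x, feature]; the feature coefficient
    estimates the design feature's impact on (log) frontier output."""
    rows = [[1.0, math.log(xi), float(fi)] for xi, fi in zip(x, feature)]
    t = [math.log(yi) for yi in y]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    xty = [sum(r[i] * ti for r, ti in zip(rows, t)) for i in range(3)]
    a, b, c = solve3(xtx, xty)
    # COLS shifts the intercept up so no observation lies above the frontier.
    resid = [ti - (a + b * ri[1] + c * ri[2]) for ri, ti in zip(rows, t)]
    return a + max(resid), b, c

# Hypothetical sessions: page views x, basket size y, and a dummy for
# the presence of a personalization engine.
x = [10, 20, 40, 10, 20, 40]
y = [4, 6, 9, 6, 9, 13.5]
feature = [0, 0, 0, 1, 1, 1]
a, b, c = cols_frontier(x, y, feature)
print(b, c)   # c > 0 suggests the feature shifts the frontier outward
```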

The Limitations of the Production Paradigm. Another concern related to broader application of

our methodology relates to the appropriateness of the production paradigm in modeling online shopping,

since our methodology is theoretically grounded in production economics.

First, when applying our website evaluation methodology, one must be confident that the

production model is an appropriate framework for effectiveness. In other words, the online

shopping process that is being analyzed must be consistent with the theoretical perspective of

production that we discuss earlier in this article. For example, in selecting input and output

variables for frontier estimation, care must be taken so that these comprise the production

possibility set.


Second, our assertion that efficiency is an appropriate measure of performance also should be

justified. Shopping efficiency is meaningful and important in the online grocery shopping

domain that we are investigating in this paper. In fact, the target market for the grocery

shopping website that we study is the time-pinched customer who seeks

convenience in her grocery shopping activities. Consequently, one of the key operational goals

for website design at our research site is for first-

time customers to be able to check out within an hour, and for their subsequent transaction sessions

to take no more than thirty minutes. Hence, we see that website efficiency is indeed a major focus in the

current context.

Goal-Directed Versus Experiential Shopping. It may, however, not be as clear whether efficiency

is also an appropriate metric for other e-commerce websites where shopping for enjoyment may be

common. The consumer behavior literature in marketing identifies two distinct motivations for

purchasing: goal-directed (or utilitarian) versus experiential (or hedonic). The two different motivations

for shopping bring about considerable differences in consumer behavior (Babin, Darden, and Griffin,

1994). Goal-directed shopping, which typically occurs with a near-immediate purchase horizon, entails a

highly focused information search process whereby consumers are seeking information specific to

products in their consideration set so that it can be used in the purchase decision-making. The focus of

goal-directed shopping is thus on the efficiency of the purchasing process.

On the other hand, experiential shopping tends to focus on the recreational or hedonic motives of the

consumer. Experiential shopping entails on-going search without a specific purchase horizon (Bloch,

Sherrell, and Ridgway, 1986). Wolfinbarger and Gilly (2001) argue that goal-directed shopping will be

more prevalent in online contexts compared to experiential shopping. The rationale behind their

argument is that time-strapped consumers are more likely to adopt online channels to minimize the costs

associated with going to physical stores (Bellman, Lohse, and Johnson, 1999).

“Efficiency” Metrics for Internet-Based Shopping. Taken together, these theoretical arguments

seem to support the claim that efficiency is indeed an important metric in the context of online shopping.

However, whether our “efficiency orientation” holds for other product

types (e.g., apparel) or shopping motivations (e.g., buying a gift) remains an open empirical question. In

addition, employing efficiency as the dependent variable may raise questions in the minds of marketing

managers at e-commerce websites. Does website efficiency lead to greater business value?

As an illustrative example, we know that at physical grocery stores the overall layout of the shopping

aisles is not designed to maximize efficiency. Milk and other everyday-use dairy products tend to be

located at the back of the store so that shoppers who only want to quickly grab a carton of milk still need

to pass through the whole store. This often leads to impulse purchases of other items that were not on the


shopper’s list. Interestingly, when the retailer intends to manipulate consumer behavior in this way,

the business value of the store layout may actually be inversely

related to its efficiency from the customer’s point of view. We remind the reader that prior research on

customer efficiency (e.g., Xue and Harker, 2002) suggests that efficiency consists of transaction

efficiency and value efficiency, and that such efficiencies create business value through cost savings and

increased sales volume. However, whether website efficiency is also positively related to business value

is still an open empirical question.

Future Research

Several directions for extending the current work are apparent. Given that we may observe recurring

transactions by customers over time, the deterministic frontier analysis that we employ needs to be

extended to stochastic frontier analysis (Aigner, Lovell, and Schmidt, 1977; Banker, 1993). Since each

customer will have multiple measures of efficiency over time, we need to take into consideration the

measurement and random error components of the efficiency ratings. This will require a larger sample

with many more recurring transactions than the limited sample we used for illustration purposes in this

paper. We are currently in the process of collecting a longitudinal sample of web log data for this purpose.

We also should point out that our empirical analyses did not include any environmental variables that

may affect the production process, even though our general model provided the possibility of including

such factors. Our current data collection effort includes the collection of measures for various

environmental variables (e.g., customer experience) so that the impact of these additional factors may also

be estimated. Another direction for further development comes from the fact that we have not yet fully

characterized the qualities of the different website designs in the present analysis. We represented the

two different website designs as “black box” production environments, instead of richly portraying the

differences between them in our models. We are currently working to more faithfully represent the

qualities of the website designs so that these design variables can also be incorporated in the production

model. This will enable managers to assess the impact of different design variables on website efficiency.

Finally, in order to provide a stronger rationale for the importance of the efficiency perspective, we

need to empirically validate whether website efficiency does in fact have a positive impact on business

value. Our current research project also includes additional data collection efforts in order to link our

proposed efficiency metrics with business value.

REFERENCES

Agrawal, V., Arjona, L. D. and Lemmens, R. (2001). "E-Performance: The Path to Rational Exuberance". The McKinsey Quarterly, 2001(1), pp. 31-43.

Aigner, D. J. and Chu, S. F. (1968). "On Estimating the Industry Production Function". American Economic Review, 58(4), pp. 826-839.


Aigner, D. J., Lovell, C. A. K. and Schmidt, P. (1977). "Formulation and Estimation of Stochastic Frontier Production Function Models". Journal of Econometrics, 6(1), pp. 21-37.

Anderson, L. (2002). In Search of the Perfect Web Site. Smart Business, March 2002, pp. 60-64.

Babin, B. J., Darden, W. R. and Griffin, M. (1994). "Work and/or Fun: Measuring Hedonic and Utilitarian Shopping Value". Journal of Consumer Research, 20(4), pp. 644-656.

Banker, R. D. (1993). "Maximum Likelihood, Consistency and Data Envelopment Analysis: A Statistical Foundation". Management Science, 39(10), pp. 1265-1273.

Banker, R. D., Charnes, A. and Cooper, W. W. (1984). "Some Models for Estimating Technical and Scale Inefficiencies in Data Envelopment Analysis". Management Science, 30(9), pp. 1078-1092.

Banker, R. D., Datar, S. M. and Kemerer, C. F. (1991). "A Model to Evaluate Variables Impacting the Productivity of Software Maintenance Projects". Management Science, 37(1), pp. 1-18.

Banker, R. D. and Kauffman, R. J. (1991). "Reuse and Productivity: An Empirical Study of Integrated Computer-Aided Software Engineering (ICASE) at the First Boston Corporation". MIS Quarterly, 15(3), pp. 374-401.

Banker, R. D. and Slaughter, S. A. (1997). "A Field Study of Scale Economies in Software Maintenance". Management Science, 43(12), pp. 1709-1725.

Bellman, S., Lohse, G. L. and Johnson, E. J. (1999). "Predictors of Online Buying Behavior". Communications of the ACM, 42(12), pp. 32-38.

Benbasat, I., Dexter, A. S. and Todd, P. A. (1986). "An Experimental Program Investigating Color-Enhanced and Graphical Information Presentation: An Integration of the Findings". Communications of the ACM, 29(11), pp. 1094-1105.

Berry, M. J. A. and Linoff, G. (1997). Data Mining Techniques for Marketing, Sales, and Customer Support. New York, NY: John Wiley and Sons.

Bevan, N., Barnum, C., Cockton, G., Nielsen, J., Spool, J. M. and Wixon, D. (2003). "Panel: The "Magic Number 5:" Is It Enough for Web Testing?" Proceedings of the 2003 ACM Conference on Human Factors in Computing Systems (CHI 2003), Ft. Lauderdale, FL, ACM Press, New York.

Bloch, P. H., Sherrell, D. L. and Ridgway, N. M. (1986). "Consumer Search: An Extended Framework". Journal of Consumer Research, 13(1), pp. 119-126.

Brockett, P. L. and Golany, B. (1996). "Using Rank Statistics for Determining Programmatic Efficiency Differences in Data Envelopment Analysis". Management Science, 42(3), pp. 466-472.

Brooks, R. (1999, January 7). Alienating Customers Isn't Always a Bad Idea, Many Firms Discover. Wall Street Journal, pp. A1 and A12.

Charnes, A. and Cooper, W. W. (1980). "Management Science Relations for Evaluation and Management Accountability". Journal of Enterprise Management, 2(2), pp. 160-162.

Charnes, A., Cooper, W. W. and Rhodes, E. (1978). "Measuring Efficiency of Decision-Making Units". European Journal of Operational Research, 2(6), pp. 428-449.

Charnes, A., Cooper, W. W. and Rhodes, E. (1981). "Evaluating Program and Managerial Efficiency: An Application of Data Envelopment Analysis to Program Follow Through". Management Science, 27(6), pp. 668-697.

Chase, R. B. (1978). "Where does the Customer Fit in a Service Operation?" Harvard Business Review, 56(6), pp. 138-139.

Chase, R. B. and Tansik, D. A. (1984). "The Customer Contact Model for Organization Design". Management Science, 29(9), pp. 1037-1050.

Chen, P.-Y. and Hitt, L. M. (2002). "Measuring Switching Costs and the Determinants of Customer Retention in Internet-Enabled Businesses: A Study of the Online Brokerage Industry". Information Systems Research, 13(3), pp. 255-274.

Chidamber, S. R., Darcy, D. P. and Kemerer, C. F. (1998). "Managerial Use of Metrics for Object Oriented Software: An Exploratory Analysis". IEEE Transactions on Software Engineering, 24(8), pp. 629-639.

Cooley, R., Mobasher, B. and Srivastava, J. (1999). "Data Preparation for Mining World Wide Web Browsing Patterns". Journal of Knowledge and Information Systems, 1(1), pp. 5-32.


Dalton, J. P., Hagen, P. R. and Drohan, H. (2001). The Cost of Selling Online (Forrester Research Report). Cambridge, MA: Forrester Research Inc., July 2001.

Frei, F. X. and Harker, P. T. (1999). "Measuring the Efficiency of Service Delivery Processes: An Application to Retail Banking". Journal of Service Research, 1(4), pp. 300-312.

Goodhue, D. L. and Thompson, R. L. (1995). "Task-Technology Fit and Individual Performance". MIS Quarterly, 19(2), pp. 213-236.

Hahn, J., Kauffman, R. J. and Park, J. (2002). "Designing for ROI: Toward a Value-Driven Discipline for E-Commerce Systems Design". Proceedings of the 35th Hawaii International Conference on System Sciences (HICSS 35), Big Island, HI, January 7-10, IEEE Computer Society Press, Los Alamitos, CA.

Ivory, M. Y. and Hearst, M. A. (2001). "The State of the Art in Automating Usability Evaluation of User Interfaces". ACM Computing Surveys, 33(4), pp. 470-516.

Kriebel, C. H. and Raviv, A. (1980). "An Economics Approach to Modeling the Productivity of Computer Systems". Management Science, 26(3), pp. 297-311.

Lovelock, C. H. and Young, R. F. (1979). "Look to Consumers to Increase Productivity". Harvard Business Review, 57(3), pp. 168-178.

Meuter, M. L., Ostrom, A. L., Roundtree, R. I. and Bitner, M. J. (2000). "Self-Service Technologies: Understanding Customer Satisfaction with Technology Based Service Encounters". Journal of Marketing, 64(3), pp. 50-64.

Mills, P. K. and Morris, J. H. (1986). "Clients as 'Partial' Employees of Service Organizations: Role Development in Client Participation". Academy of Management Review, 11(4), pp. 726-735.

Nielsen, J. (1994). Top 10 Heuristics for Usability, [the Internet]. UseIT: Jakob Nielsen's Website, Nielsen/Norman Group, Fremont, CA. available: www.useit.com/papers/heuristic/heuristic_list.html.

Nielsen, J. and Mack, R. L. (Eds.). (1994). Usability Inspection Methods. New York, NY: John Wiley and Sons.

Pirolli, P. L. T. and Card, S. K. (1999). "Information Foraging". Psychological Review, 106(4), pp. 643-675.

Rajgopal, S., Venkatachalam, M. and Kotha, S. (2001). Does the Quality of Online Customer Experience Create a Sustainable Competitive Advantage for E-commerce Firms? (Working Paper). Seattle, WA: School of Business Administration, University of Washington, April 2001.

Rehman, A. (2000). Holiday 2000 E-Commerce: Avoiding $14 Billion in "Silent Losses" (Research Report). New York, NY: Creative Good, October 2000.

Rizzuti, K. and Dickinson, J. (2000). Satisfying the Experienced On-Line Shopper: Global E-Shopping Survey (Research Report). London, UK: A.T. Kearney, 2000.

Schubert, P. and Selz, D. (1999). "Web Assessment: Measuring the Effectiveness of Electronic Commerce Sites Going Beyond Traditional Marketing Paradigms". Proceedings of the 32nd Hawaii International Conference on System Sciences (HICSS 32), Maui, HI, January 5-8, Los Alamitos, CA: IEEE Computer Society Press.

Souza, R., Manning, H., Sonderegger, P., Roshan, S. and Dorsey. (2001). Get ROI From Design (Forrester Research Report). Cambridge, MA: Forrester Research Inc., June 2001.

Spool, J. M., Scanlon, T., Schroeder, W., Snyder, C. and DeAngelo, T. (1999). Web Site Usability: A Designer's Guide. San Francisco, CA: Morgan Kaufmann Publishers.

Spool, J. M. and Schroeder, W. (2001). "Testing Web Sites: Five Users Is Nowhere Near Enough". Proceedings of the 2001 ACM Conference on Human Factors in Computing Systems (CHI 2001 (Extended Abstracts)), Seattle, WA, March 31-April 5, ACM Press, pp. 285-286.

Srivastava, J., Cooley, R., Deshpande, M. and Tan, P.-N. (2000). "Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data". SIGKDD Explorations, 1(2), pp. 12-23.

Straub, D. W., Hoffman, D. L., Weber, B. W. and Steinfield, C. (2002a). "Measuring e-Commerce in Net-Enabled Organizations: An Introduction to the Special Issue". Information Systems Research, 13(2), pp. 115-124.


Straub, D. W., Hoffman, D. L., Weber, B. W. and Steinfield, C. (2002b). "Toward New Metrics for Net-Enhanced Organizations". Information Systems Research, 13(3), pp. 227-238.

Straub, D. W. and Watson, R. T. (2001). "Transformational Issues in Researching IS and Net-Enabled Organizations". Information Systems Research, 12(4), pp. 337-345.

Underhill, P. (1999). Why We Buy: The Science of Shopping. New York, NY: Touchstone.

Varian, H. R. (1992). Microeconomic Analysis (3rd ed.). New York, NY: Norton.

Varianini, V. and Vaturi, D. (2000). "Marketing Lessons from E-Failures". The McKinsey Quarterly, 2000(4), pp. 86-97.

Vessey, I. (1991). "Cognitive Fit: A Theory-Based Analysis of Graphs Versus Tables Literature". Decision Sciences, 22(2), pp. 219-240.

Wallach, S. L. (2001, July 9). Usability Improvements Payoff for Web Site, [the Internet]. ITworld.com. available: www.itworld.com/nl/sup_mgr/07092001.

Walley, P. and Amin, V. (1994). "Automation in a Customer Contact Environment". International Journal of Operations and Production Management, 14(5), pp. 86-100.

Wolfinbarger, M. and Gilly, M. C. (2001). "Shopping Online for Freedom, Control and Fun". California Management Review, 43(3), pp. 34-55.

Xue, M. and Harker, P. T. (2002). "Customer Efficiency: Concept and Its Impact on E-Business Management". Journal of Service Research, 4(4), pp. 253-267.

Zeithaml, V. A., Parasuraman, A. and Berry, L. L. (1990). Delivering Quality Service: Balancing Customer Perceptions and Expectations. New York, NY: Free Press.

Zeithaml, V. A., Rust, R. T. and Lemon, K. N. (2001). "The Customer Pyramid: Creating and Serving Profitable Customers". California Management Review, 43(4), pp. 118-142.

Zona Research. (1999). Shop Until You Drop? A Glimpse into Internet Shopping Success (Zona Assessment Paper). Redwood City, CA: Zona Research, 1999.
