incentives and transactions costs within the firm...

30
Review of Economic Studies (1999) 66, 309-338 0 1999 The Review of Economic Studies Limited 0034-6527/99/00150309$02.00 Incentives and Transactions Costs Within the Firm: Estimating an Agency Model Using Payroll Records CHRISTOPHER FERRALL Queen’s University and BRUCE SHEARER Universiti Lava1 First version received February 1995; final version accepted March 1998 (Eds.) We estimate an agency model using the payroll records of a copper mine that paid a pro- ductionbonus to teams of workers. We estimate the cost of incomplete informationdueto insurance and incentives considerations and the inefficiency caused by the simple form of the incentive contract itself. At the estimated parameters the cost of worker risk aversion (insurance) is of similar magnitude to moral hazard (incentives). Overall, incomplete information accounted for one-half of the bonus system’s inefficiency relative to potential full information profits. The other half is attributed to the bonus system’s inefficient generation of incentives and insurance relative to the optimal incentive contract. 1. INTRODUCTION A growing literature uses firm-level data tostudy whether incentives and incomplete infor- mation play important roles in the design and performance of contracts.’ The optimal contractunder incomplete information must balance incentives with insurance, which typically leads to a non-linear contract without closed form (Grossman and Hart (1983), Gibbons (1987)) that involves all informative characteristics of the agent’s behaviour (Holmstrom (1979)). Yet compensation schemes based upon mathematically complicated formulas involving all relevant information are rarely observed (Stiglitz (1991)). The dis- crepancy between the optimal and observed forms of incentive contracts poses an additional challenge for empirical studies of the cost of incomplete information within firms. The costs of sub-optimal contract form may be as great as the pure costs of incom- plete information that arise from the agency relationship. This paper develops, solves, and estimates a principal-agent model using the payroll records of a copper mine. The model captures essential elements of technology and infor- mation of a mining firm: worker risk aversion, team production, and asymmetric infor- mation between workers and the firm about working conditions. During the 1920s, the 1. Examples include Jensen and Murphy (1990), Margiotta and Miller (1993), Paarsch and Shearer (1 996), and the papers contained in Ehrenberg (1990) and Blinder (1990). 309

Upload: others

Post on 09-Feb-2021

8 views

Category:

Documents


0 download

TRANSCRIPT

  • Review of Economic Studies (1999) 6 6 , 309-338 0 1999 The Review of Economic Studies Limited

    0034-6527/99/00150309$02.00

    Incentives and Transactions Costs Within the Firm:

    Estimating an Agency Model Using Payroll Records

    CHRISTOPHER FERRALL Queen’s University

    and

    BRUCE SHEARER Universiti Lava1

    First version received February 1995; final version accepted March 1998 (Eds.)

    We estimate an agency model using the payroll records of a copper mine that paid a pro- duction bonus to teams of workers. We estimate the cost of incomplete information due to insurance and incentives considerations and the inefficiency caused by the simple form of the incentive contract itself. At the estimated parameters the cost of worker risk aversion (insurance) is of similar magnitude to moral hazard (incentives). Overall, incomplete information accounted for one-half of the bonus system’s inefficiency relative to potential full information profits. The other half is attributed to the bonus system’s inefficient generation of incentives and insurance relative to the optimal incentive contract.

    1. INTRODUCTION

    A growing literature uses firm-level data to study whether incentives and incomplete infor- mation play important roles in the design and performance of contracts.’ The optimal contract under incomplete information must balance incentives with insurance, which typically leads to a non-linear contract without closed form (Grossman and Hart (1983), Gibbons (1987)) that involves all informative characteristics of the agent’s behaviour (Holmstrom (1979)). Yet compensation schemes based upon mathematically complicated formulas involving all relevant information are rarely observed (Stiglitz (1991)). The dis- crepancy between the optimal and observed forms of incentive contracts poses an additional challenge for empirical studies of the cost of incomplete information within firms. The costs of sub-optimal contract form may be as great as the pure costs of incom- plete information that arise from the agency relationship.

    This paper develops, solves, and estimates a principal-agent model using the payroll records of a copper mine. The model captures essential elements of technology and infor- mation of a mining firm: worker risk aversion, team production, and asymmetric infor- mation between workers and the firm about working conditions. During the 1920s, the

    1. Examples include Jensen and Murphy (1990), Margiotta and Miller (1993), Paarsch and Shearer (1 996), and the papers contained in Ehrenberg (1990) and Blinder (1990).

    309

  • 3 10 REVIEW OF ECONOMIC STUDIES

    Britannia Mining and Smelting Company of British Columbia paid teams of workers a production bonus. Teams whose output exceeded a minimum standard received a bonus proportional to output beyond the standard. If y is team output, X is the production standard, and a is the bonus rate, then the bonus equals zero when y < X and a ( y - X ) when y L x . We call this contract a linear bonus. From the payroll records we observe the payments made under the linear bonus, but we do not observe the values of a and X used by the firm in any pay period.

    Based on maximum likelihood estimates of the agency model we compare the efficiency of the linear bonus system to that of other contracts, including the optimal contracts under full and incomplete information. Our results indicate larger costs associ- ated with incomplete information in the Britannia mine than a more casual analysis might suggest. The estimated expected profits under the linear bonus are 34% of full information profits, even though workers received less than 5% of their wages in the form of incentive pay. Under the linear bonus system the cost to the firm of worker risk aversion (insurance) is of similar magnitude as the cost of incomplete information (incentives).

    We also use the estimated model to explore why the firm might have chosen to use the linear bonus system instead of other incentive contracts. We solve numerically an approximation to the optimal incentive contract, which indeed is a highly non-linear func- tion of team output. We also compute the efficiency of incentive contracts that are similar in form to the linear bonus. Compared with the linear bonus, the optimal incentive con- tract would have cut productive inefficiency roughly in half. Therefore, we attribute one half of the linear bonus’s inefficiency to its inefficiency in providing incentives and insurance. Whatever constrained the firm to choose a simple but inefficient contract has associated with it costs roughly equal to the costs of incomplete information. We investi- gate the plausibility of implementation costs as an explanation for the observed choice of contract.

    Our estimation strategy is similar to that of Pakes (1986), Rust (1987), Eckstein and Wolpin (1990), and Margiotta and Miller (1993) because estimating the model requires a nested solution algorithm. There is no closed-form solution for optimal values of a and X , let alone the fully optimal contract. The firm’s problem must be solved numerically on each iteration of the estimation procedure. As with Rust’s application of dynamic control theory, we use multiple realizations of production shocks within a single firm to identify the model. As with the equilibrium search model estimated by Eckstein and Wolpin, our algorithm solves the problems for two sides of the contract or market. As with Margiotta and Miller’s study of executive compensation, we impose the restriction that the data were generated by an agency contract.

    Intuitively, it is unlikely that the technology and preference parameters of an agency model can be disentangled unless the choices of both the principal and agent are modeled. Yet without separately identifying the parameters little can be inferred about the perform- ance of the bonus system. We provide a proof that formalizes this intuition. By numeri- cally maximizing the firm’s profit function, we can identify parameters that determine the cost of incomplete information and indirectly the implementation costs that rationalize the use of the linear bonus system. We also derive what can be inferred from worker behaviour alone in response to the linear bonus system. This unrestricted model is an important baseline, because to impose profit maximization requires the additional assumption of exponential utility.

    In the next section we describe a notion of implementation costs that provides a framework for costly contract forms. In Section 3 we describe the agency model, and as a baseline we characterize the optimal contract under full information. Then we define

  • FERRALL & SHEARER INCENTIVES AND TRANSACTION COSTS 31 1

    the Nash equilibrium response of workers to the linear bonus system, the firm’s objective function for choosing the optimal linear bonus, and our method for numerically approxi- mating the optimal incentive contract. Section 4 discusses identification of the model with and without modelling firm behaviour. Section 5 describes the data, defines the likelihood function, and reports estimates of the model. Section 6 uses the estimates to consider implications of the results. Section 7 discusses some limitations and possible extensions of our analysis.

    2. PRELIMINARIES: IMPLEMENTATION COSTS

    Agency theory provides a framework for modelling the costs of incomplete information within a firm, but we lack a good model of implementation costs. Allowing for implemen- tation costs can assume away the discrepancy between practice and theory unless a struc- ture is imposed on implementation costs based on observed features of contracts. We avoid the assumption that simple contracts are used because agents or the firm act sub- optimally or that they incur costs to calculate optimal choices. These explanations would make it difficult to define the cost of incomplete information at all.

    Instead, we posit that the mathematically complicated contracts are more costly to implement than simpler contracts. Examples of implementation costs include the resources required to communicate the contract to agents, to keep track of the required information, and to compute payments under the contract. Some authors have studied the choice between methods of pay by comparing contracts with simple functional forms. Lazear (1986) and Brown (1990) compare fixed wage and piece rate contracts; Lazear and Rosen (1981) and Green and Stokey (1983) compare piece rate and tournament contracts. The restriction to simple contracts can be based on the assumption that more complicated payment schemes are very costly to implement.

    Our analysis is premised on three notions: (i) the implementation costs of paying workers according to contracts of the same functional form are equal; (ii) some contracts which are special cases of a functional form require less computation than the nesting contract and therefore have lower implementation costs; (iii) mathematically more compli- cated forms are more costly. For instance, the implementation costs of a piece rate scheme do not depend on the piece rate itself, however a constant wage contract (nested by a piece-rate contract with piece rate equal to zero) is cheaper to implement than any non- zero piece rate contract. Contracts that contain, say, logarithms or other non-linear func- tions are more expensive than piece rates, both to communicate to workers and to calcu- late on a regular basis. Contracts that depend on more variables, such as both past and current performance, require more bookkeeping and are therefore more expensive to implement than contracts based on fewer var.iables. Holmstrom and Milgrom (1987) argue that linear payment rules are robust to variations in the production environment and may therefore require less costly tinkering than complicated payment rules. These adjustment costs depend only on changes in the contract and not on the form of the contract itself. In that sense adjustment costs offer a different explanation for simple contracts than do implementation contracts that vary with the form of the contract.

    Conceptually, we break the firm’s choice into two steps

    Here 2 denotes the family of all possible enforceable compensation contracts. An element ZE 2 is a form of payment, with specific contracts in z described by a set of parameters g .

  • 312 REVIEW OF ECONOMIC STUDIES

    G(z) denotes the set of possible values of g. C(z) denotes the implementation costs of contracts in z and z(g, z ) denotes the profits associated with a particular fully specified contract, (g, z) . For example, if z is the linear bonus scheme used by our firm, then an element of z is defined by the two parameters (a , X ) , and G(z) = ( (a , X ) : (a , X)E >. G(z) does not contain simple piece rates ( X = 0) nor constant wages ( X = a = 0). These contracts would be presumed to be members of two other elements of 2 because they require less computation than 'contracts with X > 0 and a > 0. By construction C(z) is a fixed cost when choosing g € G(z). The profit function z(g, z ) incorporates the agency relationship with workers, namely incentive compatibility, individual rationality, and free-riding within teams. The firm can be viewed as choosing the optimal parameters g*(z) and then choosing the optimal contract form z*. The classic principal-agent problem, in Grossman and Hart (1983) for example, would assume C(z) = 0 for all forms of payment z , whereas an analysis of salaries vs. linear piece rates implicitly assumes that C(z) is prohibitively large for any contracts more flexible than a linear one.

    We cannot estimate C(z) directly. Rather, by estimating g*(z*) given the firm's observed payment scheme z*, we can identify the main parameters that determine the profit function z(g, z): worker risk aversion, cost of effort, and the distribution of pro- duction shocks. We then compare costs of incomplete information with the implemen- tation costs needed to rationalize the choice of z*. For our application, we compute expected profits under simpler and more complicated compensation schemes than the linear bonus used by the firm. We also approximate numerically the optimal agency con- tract ignoring implementation costs. Although we can at best indirectly infer C(z), it is important that the choice of incentive contract be separable in some explicit way as in (1). Otherwise, it would be difficult to define the costs of incomplete insurance independently of the costs of contract form z. Without separability both costs are in effect on the margin, whereas (1) implies that only information costs act on the margin in choosing between different parameter vzlues contained in G@).

    3. TEAM PRODUCTION AND INCENTIVES

    Several aspects of mining make it an ideal industry to which to apply agency theory. Productivity varies substantially due to working conditions, space constraints make com- plete monitoring difficult, and the production process is simple. Two occupations, miners and muckers, were primary to extracting ore at Britannia. Miners drilled and blasted rock from the face of the tunnel while muckers shovelled the blasted rock (or muck) into ore carts. We assume that each team was composed of one miner and two muckers, because the ratio of shifts worked by muckers to miners varies between 1-5 and 2 in most pay periods. Let the subscript a denote miners and let b denote muckers. Since muckers can only muck rock blasted by the miner, output from a tunnel can be approximated by a Leontief production function

    where i l a is the effort of the miner and is the effort of mucker i. Total team output il can be interpreted as the amount of rock processed by the team. Rock in different sectors of the mine could have very different ore content. We let the revenue generated by a team working in some sectorj take the form

  • FERRALL & SHEARER INCENTIVES AND TRANSACTION COSTS 313

    Here d / and vi are two random aspects of section j and P represents the price of copper. Each of these values is observable by both workers and the firm. To reduce notation most of the model is presented for d / = 1 and v, = 0, which when combined with (1) results in the production function

    y = 8 min { L , L , l + hb,2 1. (4) The term 8 is composed of a productivity shock in the specific tunnel the team works in and the price of copper P.2 The value of 8 lasts one pay period. We refer to the environ- ment in which both the workers and the firm observe 8 before effort levels are set as the full information case; when only workers get to observe 8 it is referred to as the incomplete information case, implying the firm must offer an incentive contract.

    Workers have a utility function U(r W - (kj/k)Ak), iE { a , b ) , where W is the wage earned in a period, A is effort, and (kj/k)Ak is the cost of effort for workers in occupation i. Members of occupation i also have reservation utility I&. Except for the occupation- specific costs of effort and reservation utilities, all workers are assumed to be identical. The parameter r > 0 is the relative weight of wages to effort. The von Neumann-Morgen- stern utility function U exhibits risk aversion (U’ > 0 and U” < 0) so workers are willing to pay for insurance against production shocks. Later notation is simplified by defining qGk-1.

    Assumption Al .

    (i) within each sector 8 is log-normally distributed, In 8-N(p , 02), and 8 is i.i.d.

    (ii) kb = 1, (iii) 2 l , (iv) k, = l /2?‘ + l , (v) U = -exp {-[r W - (kj/k)Ak]).

    across sectors within each time period,

    We denote the density and cumulative distribution functions of 8 as f ( 0 ) and F(8) , respectively. The firm’s knowledge of the copper price P is captured by their knowledge of p and o, and during the empirical analysis these parameters are allowed to vary over time to reflect changes in both output prices and observable elements of productivity in the mine. From Al.(i), f ( 8 ) = l/(8o) $((ln 0 - p ) / o ) and F(8) = @((ln 8 - p ) / o ) where $ and @ are the standard normal density and distribution functions. Setting kb = 1 in Al(ii) normalizes the utility function U. Al(iii) generalizes the model in Ferrall and Shearer (1994) which was equivalent to q = 1. Assuming q > 1 rules out objective functions that are not concave. When k,, kb, and k are all free parameters there are three different cases of team behaviour. Not all three parameters are identified from the distribution of bonus payment^.^ Setting k, to be related to q in Al(iv) selects those environments in which muckers (rather than miners) constrain team output on the margin in the equilibrium defined below.

    Since we maintain the assumption that workers observe 8 before choosing effort, the functional form of U does not affect the choice of effort nor the firm’s problem with full information. U does, however, determine the firm’s choice over incentive contracts, and when solving the firm’s problem we employ A ~ ( v ) . ~

    2. Including P in 8 simplifies the notation while keeping it consistent with the empirical analysis. 3. The proof is available upon request. 4. From a computational standpoint, exponential utility is perhaps the only feasible functional form when

    solving the principal’s problem. Margiottta and Miller (1993) also maintain this assumption, and most of the simulations of the Grossman and Hart (1983) model performed by Haubrich (1994) use exponential utility.

  • 314 REVIEW OF ECONOMIC STUDIES

    The optimal full information contract

    To establish a baseline, consider the full information environment (6 observed by every- one). The optimal full-information contract specifies a wage and effort level for each occupation and each value of 8.

    Definition Dl. The optimal full information contract (when 8 can be observed) is described by two wage functions, W, and Wb, and two effort functions, Aa and A b , that solve:

    (0 min {AU(O), 245(0)) - WU(W - 2Wb(e)) f (e)de, ( 5 )

    subject to

    Theorem T1. The optimal full information contract takes the form

    Revenue : yF = OAf

    All proofs are provided in the Appendix.

    With full information, optimal wages consist of a piece rate combined with a base wage or, if the constant term U-'(ai) is negative, a base fee to enter the mine. The contract does not have a constant wage because revenue shocks affect the productivity of effort, so it is optimal for worker effort to vary with 8. To provide insurance against this vari- ation, wages vary with revenue generated. The base wage differs across occupations but the optimal piece rate 1/3(q + 1) is the same.

    The linear bonus

    Now consider contracts when the firm only observes team revenue, y , and therefore can only enforce wage payments that depend on revenue and occupation. The firm used what we call a linear bonus. Workers in each occupation were paid base wages, denoted Pu and

  • FERRALL & SHEARER INCENTIVES AND TRANSACTION COSTS 3 15

    f i b . A team working in sector j of the mine split equally a bonus of the form

    if y j < xj, yi - xj), ifyjzxj,

    where xi is the standard and aj is the bonus rate for sectorj. A team member in occupation i therefore received a total wage equal to pi + wj/3.5 The piece rate aj and revenue stan- dard xi differed across areas, and the standard was also adjusted for the number of shifts worked in the area during the pay period.6 We do not observe the sector of the mine the team worked in nor do we observe the values of xj and aj. Under some mild assumptions, however, it is not necessary to have this information.

    Assumption A2

    (i) 4 , vi and 8 are independently distributed, (ii) E [ 4 ] = 1, (iii) q v j ] 2 0, (iv) Workers were assigned to sectors randomly.

    The firm can set the standards and piece rates to cancel out fixed differences across sectors by choosing two values a and X such that

    a a.=- d,’

    xj = d,x + PVj . This system of bonuses and standards equalizes opportunities across areas of the mine prior to the realization of 8. Based on (8), workers would be indifferent to the sector they were assigned, which makes A2(iv) a reasonable assumption. Company reports suggest that balancing outcomes across sectors was important to the firm, perhaps for the follow- ing reason: if work in one area is especially difficult, because of bad rock or equipment malfunctions, then the mine cannot easily shut the area down and shift work to other areas. The marginal product of placing extra workers in other tunnels is small due to space constraints, and only by making progress in a bad area can miners reach better rock. Based on A2 and the equalizing bonuses defined by (a, X) we can continue to use the simpler production function (4) and ignore the observable values d, and vj, although it is necessary to say more about them in Section 6.

    Figure 1 sketches the revenue of a team that arises from a Nash equilibrium response to the linear bonus system (7). A bonus system gives workers no incentive to provide effort when the value of 8 falls below some value 8*. For 8 < 8*, all members of the team set effort to zero, because working conditions make it too difficult to earn a bonus. Total revenue from the team can be positive due to vj in (3). For 8 > 8* each worker wants to equate the marginal return to effort to marginal cost. The existence of 8* later plays an important role in model identification. Leontief production, however, implies that any effort one occupation supplies above and beyond the effort of the other occupation is

    5. Early on, the firm experimented with different bonus rates for different occupations. There appears to have been resistance to this and by 1926 the pay system was changed so that workers split a team bonus equally. We return to this issue in Section 6.

    6. Workers might work in different areas during a pay period, and they were rewarded a share of the bonus in each area in proportion to the number of shifts they had worked in that area.

  • 316

    c) g L a C

    8

    REVIEW OF ECONOMIC STUDIES

    P

    # d output

    d

    Miner ef€ort 1 d

    .L- + . + . +

    \ Mucker effort

    e FIGURE 1

    Nash equilibrium effort and output

    wasted. Therefore, in equilibrium, miners will always supply twice the effort level of muckers.

    Define

    Theorem T2. Given a bonus system (a, X ) , there exists a continuum of Nash equilibria effort functions for miners and muckers, denoted A@) and &(e). In each case, &,(e) = Aa(8)/2. Under Al, effort of miners in the unique Pareto efjcient Nash equilibrium takes the form

    (0, otherwise.

    There are multiple Nash solutions because the effort levels of the two muckers enter y additively, creating a range of 8 for which the two muckers may stop shirking simul- taneously. The range depends on the effort level of the miners, but the lower bound equals e*. For 8 < e* each mucker would set effort to zero in response to any effort level chosen by the miner and the other mucker. In general, one of the occupations constrains the effort level of the whole team.7 Under assumption Al(iv) muckers constrain team effort.

    7. The Leontief production function also implies that the set of Stackelberg equilibria with the miner as leader is equal to the set of Nash equilibria.

  • FERRALL & SHEARER INCENTIVES AND TRANSACTION COSTS 317

    Among Nash equilibria, utility of all team members is highest when shirking stops at 8*, which makes it a Pareto efficient outcome for the team.

    The firm chooses the bonus system (a , X) to maximize expected profit per team, subject to individual rationality and incentive compatibility constraints for each occu- pation embodied in the Nash effort functions in Theorem T2. Expected team profit is E[n] = ,!?[revenue] -E[cost]. With worker utility U additively separable in income and effort there are no wealth effects in the choice of effort. The firm can set pi to solve the individual rationality constraint with equality

    1 In (- ai) pi = -In (F(8*) + Hi(a, X)) --, r r

    where

    is the component of occupation i's base wage that compensates for utility generated when the team provides effort, that is when 8 > 8*.

    Definition D2. The optimal linear bonus solves

    03

    max (l - a ) l 2(,) 8('")'' f ( 8 ) d e + a x [ l -F(8*)] ra l'' a,x O*

    The objective function (10) takes into account that teams with low production shocks have no incentive to provide extra effort: they produce only the constant amount v,, which is left out of D2 for convenience, and set variable revenus y to zero. While the revenue standard X creates a deadweight loss in revenue, it provides a form of insurance for teams which allows the firm to lower the bonus rate a from the case of a simple piece rate (X = 0).

    The optimal incentive contract

    The form of the optimal wage contract is unknown, but the solution is not a piece rate as with full information, nor is it the pay system that the firm used, the linear bonus. The optimal contract is a set of incentive compatible and individually rational wage and effort functions:

    Definition D3. Under asymmetric information on 8 , the optimal team contract solves

  • 318

    subject to

    REVIEW OF ECONOMIC STUDIES

    for iE {a, b } Ai(8)EargmFx U

    The expected profit under the optimal incentive contract D3 lies between that under D1 (the full information outcome) and D2 (the optimal linear bonus). Unlike the full information case, the optimal incentive contract has no closed form, and unlike expected profits under a linear bonus, D3 is not a straightforward numerical objective function. We solve D2 repeatedly to estimate the parameters of the model. We solve only an approximation to D3, and we solve it only once for one set of parameters estimated from the data (the results are discussed in Section 6). We must approximate the optimal incen- tive contract in two steps. First, we take a discrete approximation to the continuous distribution of production shocks 8 based on 80 points of support. Second, the optimal step contract over this distribution is computed numerically.

    The points of support for 8 can be written

    for j = 1,2, . . . ,79. These are midpoints of intervals defined by percentiles of the true distribution. The right tail of the distribution is approximated by &0 = 1-25&,. The prob- ability of ej is set to a constant 1/80. With 80 values of 8 there are at most 80 levels of revenue produced by teams. This means that the optimal contract can be defined by 240 values {( y l , W:, Wp>},"! In this contract the wage paid to workers in occupation i is a step function: ?Vi( y ) = W;' for yl=

  • FERRALL & SHEARER INCENTIVES AND TRANSACTION COSTS 319

    This condition uses the symmetry of the muckers and the perfect complementarity of effort across occupations, since each occupation's effort determines the maximum amount of revenue generated. In the Nash equilibrium response to the wage function the prob- ability of yr being produced is

    where I { A ) is the indicator function for event A .

    Definition D 4 The approximate optimal contract under incomplete information solves

    The objective in (15) was solved only at the estimated parameter values. Of the 80 possible revenue values, only a few are produced with positive probability in equilibrium, suggesting that 80 points provides a close approximation to the actual optimal contract (1 1).

    4. IDENTIFICATION

    The agency model defined in Section 3 depends on the parameters (k, r, p, G, k,, kb, U,, a b ) . The reservation utilities G, and a b are of secondary interest, because they determine only the base wages Pa and P b and not the performance of various contracts. Under A1 we have already normalized the values of k, and kb so that the cost of effort function affects the problem only through the constant q = k - 1. The parameter vector to estimate from one period's data therefore becomes simply (k, r , p , G).

    First we consider the empirical content of worker response to the linear bonus system. We assume the data to consist of bonuses received by a random sample of workers paid under the linear bonus. In our data (a, X ) are unknown parameters to be estimated unless the firm's response to worker behaviour is also modelled. Estimates in which (a, X ) are freely estimated are called the unrestricted model. Estimates in which (a, X ) are implicit functions of the structural parameters through maximization of the profit function (10) are called the restricted or structural model. Finally, we consider how using data from several pay periods aids identification of the structural model.

    The unrestricted model

    Re-define three variables normalized by the mean production shock p

    6 = @e-'", 1?/ = ve-M" + 1)/9

    9

    Note that ln 8 -N(O, 02).

  • 320 REVIEW OF ECONOMIC STUDIES

    Theorem T3. From assumption A1 and the Nash equilibrium effort functions in Theorem T2, the distribution of realized bonuses satisJes the following three properties:

    (i) Positive bonuses are bounded away from zero with a lower bound W(@*) = w(8*) =

    (ii) A team receives no bonus with probability ax/3(2q + 1);

    (iii) For W > w(6*) the density and cumulative distribution functions equal

    (iv) The minimum received bonus w(8*) is always identijied directly from the data regardless of the value of a and X. When a > 0 and X = 0 then ((q + 1)/q)0 and in (16) are identljied. When a x 0 then 0 and q are separately identijied, and a x can be computed from the other estimates.

    The first three statements in T3 follow from inserting the Nash equilibrium effort functions into the bonus equation (7). The economic reason for the first statement is that no team finds it worthwhile to earn small bonuses. Instead, the team sets effort to zero for values of 8 < 8*.

    There are perhaps revealing explanations for the sequence of identifying conditions in T3(iv). Suppose that a > 0 but X = O.* In this case, team effort varies with 8, every worker receives a bonus, and W(@*) = 0. The result is a regression model, where 8 would determine the error term. It is possible to identify a mean and a variance from a regression model. In our parameterization ((q + l) /q)o and would be identified. When a x > 0 the model becomes a Tobit model, except the lower bound is not a constant but instead depends upon the agency model’s parameters. The maximum likelihood estimator for the boundary, w(8*), is the smallest positive bonus in the data. Flinn and Heckman (1982), Christensen and Kiefer (1991), and Donald and Paarsch (1993) discuss the properties of boundary estimators. In essence, the identity in T3(i) can be used to concentrate the likelihood function, enabling us to separately identify (and consistently estimate) q and 0. Once q is identified it is possible to use T3(i) to recover ax .

    It is perhaps not intuitive how four parameters can be identified from what appears to be a Tobit model with one extra parameter. Two factors drive identification, one stat- istical and one theoretical. First, the minimum observation from a distribution with finite lower bound converges at a rate faster than the square root of the sample size. Information about extreme values accumulates in the sample faster than information about non- extreme values such as the mean. And theoretically, the parameters of interest have a functional relationship with the lower bound w(8*). If q and a x were not related through T3(i) by optimal worker behaviour there would not be enough information in a sample drawn from the bonus system to separately identify them.

    8. When a= 0 there are no incentives for effort and no variation in output or bonuses.

  • FERRALL & SHEARER INCENTIVES AND TRANSACTION COSTS 321

    Also surprising may be a futher implication of T3(iv). Consider the choice between analysing data from two firms, one such as ours where a and x are only known to be positive, and another firm that paid a simple but known piece rate ( X = 0 and a observed). Then the same number of parameters of the team model are identified from the distri- bution of bonus payments in the two firms. In the second firm information about a would help identify q in (1 6) directly. However, when the minimum possible bonus is zero there is no further information available in the sample. Theoretically, the equilibrium reaction to a positive production standard provides information about the environment not pro- vided by a linear payment rule.

    By using data from several pay periods during which the bonus parameters were changed, we can estimate the relative movement in mean production shocks. That is, we assume that the preference parameters, (r , q), are constant over time and the firm changed the bonus parameters (a , X ) in response to changes in the parameters (c, p) as tunnels were extended and the output price varied. In the unrestricted model (imposing only optimal team behaviour) q is constant over periods but q, ax, and c all vary across periods.

    Restricted model

    Nevertheless, little can be inferred about the cost of incomplete information within this firm based on the unrestricted model and Theorem T3. Since only the product ax is at best identified from the data, expected profits under the linear bonus system cannot be computed from the distribution of bonuses imposing worker optimality alone. Nor are enough other parameters identified to compute expected profits under the first-best full- information profits defined in (6). The key missing parameter is the utility parameter r which determines worker risk aversion and thereby the tradeoff between incentive; and insurance. Adding restrictions to the model will not create more information in the sample, but it can reduce the number of free parameters estimated using the sample infor- mation. In the restricted model we estimate below, the firm chooses (a, X ) to maximize profits and these parameters become implicit functions (without closed forms) of the struc- tural parameters.’ With ax no longer free in T3(iv) it is possible to estimate r through the indirect relationship between ax and r.l0

    Two parameters, r and q, are constant across periods in the restricted model as opposed to one parameter in the unrestricted model. The mean production shock p is normalized to zero in one period and then estimated freely in place of Win the unrestricted model. Across periods only two estimated parameters, p and c, vary in the restricted model as opposed to three parameters in the unrestricted model. Each additional period of data therefore adds one degree of freedom in the restricted model that imposes profit- maximizing behaviour relative to the unrestricted model based on team behaviour alone.

    9. If (a, X) were known, numerical solutions to the firm’s problem would still be useful for analysing the data. Numerical solutions would be required to determine whether (or to impose the condition that) ( a , x ) maximized profit.

    10. The cost-of-effort coefficients k, and kb enter the firm’s profit function (10) independently through the integral functions Hi(a, X). Recall that Al(iv) sets k, as a function of k to select one of the possible cases of Nash equilibria. It might be possible that both r and kb could be estimated (within the parameter range permitted for this case) by imposing profit maximization and exponential utility, although identification would stem solely from the non-linear relationship between ax and the two estimated parameters r and kb. We tried to estimate both parameters, but the attempt failed because Hb(a, X) (and hence the likelihood function) was nearly flat in kb .

  • 322 REVIEW OF ECONOMIC STUDIES

    5. DATA AND RESULTS

    Data

    The data consist of the payroll records for the Britannia mine during the years 1927 and 1928 For each pay period we observe the job each employee i performed, number of shifts worked ni, and the bonus received, wj. While there were two pay periods per month, bonus rates were changed at most once per month. We therefore combine the data into monthly periods. Our sample consists of 616 workers employed as either miners or muck- ers in the Britannia mine. Of these 616 workers, only 51 (8%) worked as both miner and mucker during the sample period, which suggests that muckers and miners were two separate occupations requiring different skills.

    TABLE I.

    Bonus payments to miners and muckers: 1927-1928

    Shifts worked Workers Total bonus Positive bonus per shift

    Miner Mucker Ayg. Prop Prop= Month total total Ratio Total shifts recd $0.50 Avg. Stdev. Min Max

    1 2 3 4 5 6 7 8 9

    10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

    Avg.

    1422 2389 1.68 1349 2241 1.66 1462 2236 1.53 991 1769 1.79

    1120 1852 1.65 1433 2098 1-46 1212 2091 1.73 1127 1996 1.77 1169 1961 1.68 1414 1874 1.33 1082 1824 1.69 946 1500 1.59

    1103 1886 1.71 838 1833 2.19 872 1749 2.01 759 1503 1.98 923 1616 1.75 850 1823 2.14 569 1327 2.33 790 1492 1.89 779 1648 2.12

    1090 1793 1.64 1288 1882 1.46 885 1490 1.68

    1061 1828 1.77

    130 130 126 99

    100 l 20 112 105 109 133 101 85

    1 00 95 90 80 85 94 70 80 87 97

    110 83

    100

    29.3 0.48 27.6 0.59 29.3 0.7 1 27.9 0.67 29.7 0.70 29.4 0.72 29.5 0.79 29.7 0.82 28.7 0.68 29.1 0.74 28.8 0.56 28.8 0.53 29.9 0.61 28.1 0.8 l 29.1 0.68 28.3 0.68 29.9 0.59 28.4 0.6 1 27.1 0.67 28.5 0.6 l 27.9 0.52 29.7 0.65 28.8 0.59 28.6 0.64 28.84 0.65

    0.04 0.08 0.06 0.04 0.10 0.05 0.06 0.08 0.05 0.06 0.04 0.04 0.02 0.04 0.08 0.09 0.07 0.09 0.07 0.0 1 0.06 0.07 0.06 0.08 0.06

    0.16 0.19 0.01 1.04 0.09 0.10 0.02 0.46 0.12 0.11 0.02 0.51 0.12 0.14 0.02 0.60 0.13 0.15 0.02 0.59 0.13 0.14 0.02 0.64 0.22 0.42 0.02 2.38 0.12 0.12 0.02 0.52 0.19 0.23 0.02 0.95 0.1 1 0.12 0.02 0.61 0.18 0.20 0.02 0.74 0.19 0.17 0.02 0.63 0.19 0.21 0.02 0.84 0.23 0.28 0.02 1.40 0.20 0.24 0.02 1.43 0.19 0.39 0.02 2.06 0.17 0.22 0.02 1.02 0.14 0.16 0.02 0.83 0.17 0.23 0.02 1.03 0.17 0.19 0.00 0.70 0.23 0.31 0.02 1.15 0.20 0.27 0.02 1.03 0.22 0.25 0.02 1.52 0.23 0.28 0.02 1.14 0.17 0.21 0.02 0.99

    Notes: Month 1 = January, 1927. Bonuses are expressed in dollars, and are based on workers with 25 or more shifts in each month. Workers is the total number of miners and muckers working 25 or more shifts.

    Table 1 summarizes the data by month. We restrict the sample to workers who worked 25 or more shifts in a month. This restricts the sample to workers who were working full time in the mine. The average number of shifts worked per worker is quite stable in the sample, averaging between 27.1 and 29.9 in each month. The ratio of mucker shifts to miner shifts is generally between 1.5 and 2. The proportion of workers receiving

    11. The records are located in the Special Collections section of the University of British Columbia Library. We transferred the records onto microfiche and from the microfiche coded the data into machine- readable form.

  • FERRALL & SHEARER INCENTIVES AND TRANSACTION COSTS 323

    a bonus in each month fluctuates between 0.4 and 0-7 with no obvious trend across per- iods. Workers were paid bonuses according to the number of shifts they worked during the pay period, so Table 1 reports bonuses per shift. The wage data presented in Table 1 is not adjusted for inflation. Base wages per shift were $4.25 and $4.00 for muckers, positive bonuses were on average 4 or 5% of base wages. The maximum positive bonus per shift fluctuates a fair amount across periods with outliers in months 7 and 16. On average it is $0.99, or over 23% of the base wage. The minimum positive bonus per shift is less than $0.2, with outliers in months l and 20.

    In each period roughly 6% of workers received a total bonus equalling $0.50. While the model predicts that positive bonuses are bounded away from zero, the theoretical bound applies per-shift and not for a total over a pay period regardless of shifts worked. In periods where the smallest positive bonus is below $0.50, bonuses of $0.50 still appear regularly. We have found no explanation for this in company records. It appears that either the firm guaranteed a minimum bonus of $0.50 for certain jobs, or it may be that the firm usually rounded smaller bonuses up to $0.50. We proceed using the latter expla- nation. With rounding, the probability of receiving a positive bonus is unchanged, but bonuses between njw(8*) and 0.50 are rounded to 0.50 for worker i . Without rounding, workers working n shifts would receive a bonus of 0.50 or less if nw(8) 50.5. Using T3(i) to solve this condition for the upper bound of 8 leads to

    Therefore, with rounding a worker receives a bonus of 0.50 with probability F@:*) - F @ * ) .

    Likelihood function

    Based on Theorem T3 and (1 7), the log-likelihood function for N workers during one pay period equals

    Whether or not rounding occurred, estimates based on mazimizing (18) are consistent. We found parameter estimates based on equating w(8*) to the minimum bonus per shift to be very sensitive to changes in the observed minimum across periods, so we report estimates based on (18). To compute L under the restricted model, requires solving (10) numerically for the optimal values of (a , X) given current estimates of (ka, kb, r, p, o).12 To reduce the number of estimated parameters, we let the technology parameters (p, o), change at most every two months. We tested this restriction by estimating the unrestricted model over all 24 months, with q free, and allowing 0, p and y to change in each period. We then estimated the model under the restriction that o, p and v/ change every two periods. The likelihood-ratio test statistic equals 58-39 with 36 degrees of freedom, which is not significant at the 1% level.

    To assess the fit of both the unrestricted and restricted estimates to the bonus data, the bonus distribution was partitioned on the basis of bonus received and the number of

    12. The profit function was maximized using Newton’s method. The convergence criterion was very tight since the results must be passed on to the algorithm maximizing L. In particular, all elements of the gradient vector for n had to have absolute values below 1.0e - 7. Newton’s method was also used to maximize L.

  • 324 REVIEW OF ECONOMIC STUDIES

    shifts worked. For each period the partitions are

    W1 = { w ~ w = O } , W21 { ~ 1 0 < ~ = < $ 0 * 5 0 ) , W31 {~1$0.50

  • TABLE 2

    Maximum likelihood estimates of unrestricted model

    Period

    Value 1 2 3 4 5 6 7 8 9 10 11 12

    VI 0*13* 0.08* O.10* 0.12* 0.10* 0.08* 047* 0.12* 0.14* 0.17* 0.16* 0.11

    4 3 6*54* 7*63* 5*81* 3*13* 5.40* 9*88* 8.55* 4*59* 5-06* 3-27* 4.17* 5-78 (1.69) (2.14) (1.57) (0.79) (1.50) (3.02) (2.39) (1.26) (1.54) (1.15) (1.40) (1.67)

    0 0*76* 0.59* 0.69* 0.88* 0.76* 0.75* 0*74* 0.98* 1.13* 1.18* 1.03 (0.10) (0.09) (0.09) (0.09) (0.10) (0.13) (0.10) (0.13) ' (0.15) (0.22) (0.21) (0.16)

    v 4*74*

    (0.03) (0.02) (0.02) (0.02) (0.02) (0.02) (0.01) (0.02) (0.03) (0.04) (0.05) (0.02)

    (1.68) In likelihood -7297

    X 2 5.2 13.6* 4.7 34.5* 8.4 3.3 3.2 1.8 3.2 10.4 6.1 5.0 Total 99.4*

    Notes: Estimated standard errors are in ( ). . The chi squared test statistic for fitting the bonus distribution is defined in equation (16). * Indicates value of test statistic is significant at 1% level.

    U

    Z R z A m

    Z U

    9

  • TABLE 3

    Maximum likelihood estimates of structural parameters

    Period Value l 2 3 4 5 6 7 8 9 10 11 12

    0

    P

    4 3

    ax /3

    11

    X

    r

    k-b k-a

    J In likelihood

    Total

    0.87* (0.08) -0.23 (0.12) 0.16

    13.35 2.12 1.35*

    (0.06) 13.61* (4.4)

    1 .oo 0.20

    -7325 10.1 1 17.9

    0.54* (0.05) 0.03

    (0.09) 0.18

    20.2 1 3.70

    12.9

    0.57* (0.05) 0.00

    (0.09) 0.18

    18.99 3.40

    6.1

    0.58* 0.59* 0.84* 0.62* 0.74* 0.81* 0*83* 0.92* (0.04) (0.05) (0.09) (0.05) (0.07) (0.08) (0.10) (0.09) 0.03 0.04 0.04 0.22 0.01 -0.12 4 - 1 6 -0.12

    (0.09) (0.10) (0.12) (0.10) (0.1 1) (0.12) (0.12) (0-1 3) 0.18 0.18 0.16 0.17 0.17 0.16 0.16 0.16

    19.88 20.13 21.08 27.31 18.93 15.47 14.76 16.80 3.54 3.57 3.37 4.76 3.1 3 2.50 2.36 2.63

    38+9* 4.9 8.9 3.2 4.2 4.1 12.7 4.8

    0.85* (0.09) 0.00

    0:16 19.73 3.14

    6.9

    Note: Standard errors in ( ). * Indicates test significant at 1% level. Parameters without standard errors were fixed or depend on estimated parameters. r is converted to a per shift value by dividing by 25 (shifts per month).

  • FERRALL & SHEARER INCENTIVES AND TRANSACTION COSTS 327

    that the firm would realize since we are not allowing the firm to implement the optimal payment system given risk neutral workers (i.e. selling the firm to the workers). The work- er's choice of effort is unaffected by risk neutrality since 8 is observed before effort is chosen. The base wages paid to workers will decrease under risk neutrality, however, since workers do not require compensation for the variation in pay. Total wages need merely compensate workers for their expected cost of effort. Denoting total wages under risk neutrality as W,'", we can write

    Expected profits are therefore equal to

    Evaluating (20) at the estimated parameters allows us to calculate the average percentage increase in profits if workers were risk neutral. Firm profits would increase by at least 36% where workers were risk neutral, implying a substantial compensation for risk within the mine.

    The goodness of fit test based on (19) was also performed on the restricted model. Even using a conservative five degrees of freedom in each period the fit is only rejected in period 4. As with the unrestricted model period 4 causes the restricted model's overall fit to be rejected. Excluding period 4 the overall fit is not rejected (the value of the test statistic falls to 79.0). In the restricted model the cost of effort parameters and the risk aversion parameter are constant across periods; changes in the bonus distribution are attributed to changes in the profit-maximizing choice of the (a , X) due to varying con- ditions in the mine. These restrictions are tested by performing a likelihood ratio test against the unrestricted model. The model is estimated over 12 periods, implying l l degrees of freedom (36 parameters in the unrestricted model, 25 parameters in the restric- ted model). The likelihood ratio of 2(7325 - 7297) = 56 rejects the restrictions placed on the unrestricted model.

    Two elements were added to the model when moving from the unrestricted to the restricted model: profit maximizing choices of (a, X) and exponential utility, Al(v). Although it is plausible that a more flexible utility function could improve the fit of the restricted model, this is not feasible computationally and any extra parameters would not be identified. Since the overall fit of the restricted model is still satisfactory and rejecting profit maximization would leave us no framework for estimating the cost of incomplete information, we use the restricted estimates to explore the performance of the linear bonus.

    6. ESTIMATED EFFICIENCY OF CONTRACTS

    We now compare profits per-team under several different contracts, listed in descending order of expected profits (ignoring any costs of implementing the contract itself).

    1. Full Information Optimal Contract: Profits defined in equation (6) averaged over all pay periods at the parameter values estimated in the restricted model.

  • 328 REVIEW OF ECONOMIC STUDIES

    2. Incomplete Information Contract: Profits computed using the approximation defined in (15). The parameter values for period 12 were used to form the profit function.

    3. Incomplete Information Two-Rate Linear Bonus: Profits computed when separate bonus rates a, and a b are paid to each occupation. The parameters are chosen to maximize

    for all 12 periods where e* is determined by the solution of the efficient Nash equilibrium within the team.

    4. Incomplete Information Linear Bonus: Profits computed from equation (10) for the parameter estimates and the contract actually used by the firm.

    5. Incomplete Information Simple Piece Rate: Profits computed under a simple piece rate a that maximizes

    As with the two-rate system, (22) was maximized numerically for each pay period. In terms of implementation costs as we defined. them in Section 2, the incomplete

    information optimal contract is the most costly. Next in order of implementation costs is the two-bonus rate system, then the linear bonus, and finally the simple piece rate and full information optimal contract, which is also a piece rate. To implement the full infor- mation contract would presumably require other costly measures including hiring more foremen (presuming this does not create new incentive problems with regard to the fore- men themselves), collecting more data about local conditions, and so forth. These fall under the costs of having incomplete information on worker action rather than the costs of implementing the contract given the information available. Before comparing profits under these contracts, we state and discuss two results concerning their performance.

    Assumption A3

    (i) -In (-gi) = r(PE[v]/3 + si>, for some constants Si, iE ( a , b ) such that S, + 2sb = 0, (ii) Var [d] = 0, (iii) v - ~ ( p ~ , 0:).

    Theorem T4. Under assumptions A1 -A3:

    (i) Estimates of expected profts E[z] do not depend upon E [ v ] , average team value productivity with zero effort, under full information or any incentive contract that nests the linear bonus system;

    (ii) Estimated profts under a simple piece rate are overestimated by the factor ra2p 02,.

    As long as reservation utilities reflect average value productivity A3(i), and the pay- ment scheme is flexible enough to control for variation in productivity across areas in the mine, then expected profits do not depend upon the average value of productivity without incentives. This implies that the ratio of profit under two different payment schemes can be computed from the estimated parameters without knowing actual output or output price. In the case of a simple piece rate, however, the mine cannot use xi to cancel out

  • FERRALL & SHEARER INCENTIVES AND TRANSACTION COSTS 329

    fixed productivity differences vi across areas of the mine. The productivity differences do not cancel out of expected profits. In essence, a simple piece rate creates a lottery across areas of the mine, and the expression in T4 is the worker's risk premium associated with this lottery. Without knowing how variable the observable component of productivity was across areas, expected profits under the simple piece rate overstate actual profits.

    TABLE 4

    Expected projits under alternative compensation scheme

    Environment Pay method E[Profits] Relative

    (1) Full information Optimal contract 48.87 1 .oo (2) Incomplete information Optimal contract 32.98 0.67 (3) Incomplete information Two bonus rates 18.47 0.38 (4) Incomplete information Linear bonus 16.62 0.34 ( 5 ) Incomplete information Piece rate 15.55 0.32

    Note: Values are averages over twelve two month periods, except row (2) which is computed using estimates for the data in period 12.

    Table 4 summarizes the performance of the five different contracts. Expected profits under the linear bonus are a third of what they would be under full information. This 66% inefficiency exists even though only 5% of compensation was in the form of incentive pay. The level of incentive pay is not a reliable indicator of the costs of incomplete infor-

    , mation. The difference between casual and structural estimates of the c st of incomplete information has been noted for executives of U.S. corporations. Using he sensitivity of executive pay to shareholder wealth as a measure of the importance of in entives, Jensen and Murphy (1990) find a sensitivity of less than 1%. Using estimates of a \ ynamic agency model, Margiotta and Miller (1993) estimate the effect of moral hazard to be of the same order as total assets controlled by the firm. (See also Haubrich (1994)).

    The key to both our results and those of Miller and Margiotta is risk aversion. With risk averse agents, the cost to the principal of incomplete information is not proportional to the amount of variation in pay generated by the optimal incentive scheme. To obtain a measure of the economic impact of incomplete information requires an estimate of risk aversion as well as other aspects of technology and preferences.

    We can divide the 66% loss in expected profits under the linear bonus into parts related to incomplete information and the form of the contract. Expected profits under the approximate optimal contract with incomplete information (row 2 of Table 4) are about 61% of that with complete information (thus 39% loss), accounting for a bit more than half of the inefficiency of the linear bonus. The other half is due to the inefficiency in producing incentives and providing insurance under the linear bonus compared to optimal incentive contract (34% vs. 61% of potential profits).

    Now we consider marginal changes in the form of the incentive contract away from the linear bonus, in particular the two-rate and simple piece rate schemes that maximize (21) and (22). Paying miners and muckers different piece rates would have increased monthly efficiency, but only by approximately 11%. Therefore it does not take large implementation costs to explain why the firm choose to pay a single piece rate for both occupations.

    Profits are smaller under the simple piece rate than under the linear bonus system, since the bonus system nests the piece rate. The average increase in profits under the linear bonus is small. On average, profits increased by a minimum of about 7.4% with a pro- duction standard. The fact that the estimated difference in profits (ignoring productivity

  • 330 REVIEW OF ECONOMIC STUDIES

    l l 1

    Stepwise approximate optimal incentive contract

    differences across areas of the mine) is small suggests that the main benefit to the firm in introducing the production standard was its ability to cancel out observable productivity differences across areas of the mine.

    Optimal contract

    Figures 2 and 3 summarize the optimal stepwise incentive contract based on (15). Recall that we discretized the distribution of 8 into 80 points and allowed the firm to choose 80 values of y at which the wage functions would take jumps. However, only at eight points would teams produce output with positive probability. These points are indicated by a triangle in Figure 2 and the corresponding probabilities are given in Figure 3. The optimal contract calls for zero incentive-sensitive output 45% of the time and then tremendous

    FIGURE 3 Optimal distribution of output

  • FERRALL & SHEARER INCENTIVES AND TRANSACTION COSTS 331

    effort otherwise. The share of output going to the team approaches the share of about 0.48 in the linear bonus (although these are shares of total output not output above a standard). Similar shares go to miners and muckers, but muckers each receive a small amount more than the miner.

    The parts of the contract indicated in dashed lines are steps between two levels of output that are produced with zero probability. It is not surprising that the computed values are uneven in these ranges. Given that the corresponding level of output is never going to be produced, the optimal shares are only bounded above by some value. These small drops in the shares appear to be critical in distinguishing the optimal contract from the optimal linear bonus. If the firm were constrained to flatten out the contract teams would begin producing some of the intervening output levels. This raises the cost to the firm of satisfying the incentive compatibility constraint. The firm would respond by lower- ing the values of output and reducing effort levels, which would move in the direction of the optimal linear bonus.

    For the contract in Figure 2 to work properly the firm must communicate to workers output shares for values of output that will never be produced. It is quite plausible that the cost of this communication, even with all workers responding optimally, would swamp the efficiency gains over the linear bonus reported in Table 4. In this case it may be cost effective to use a very simple contract such as a linear bonus. The optimal linear bonus is not the one that most closely approximates the unconstrained optimal contract, but in fact is quite different in form and leads to quite different effort and output levels within the firm.

    Naturally there are too many uncertainties and approximations involved in this analysis to be convinced that the optimal contract ignoring implementation costs in this mine is the contract in Figures 2 and 3. At the least, the optimal contract is unusual enough to make it plausible that the firm would absorb the extra inefficiency created by the linear bonus. One could easily use numerical simulations to provide examples where a simple contract is very inefficient without going through the process of estimating par- ameters of the model. By using parameters inferred from the data generated by the linear bonus we are confident that the results are much more than a theoretical possibility and may be relevant to other types of firms.

    7. CONCLUSION

    Firms use simple incentive systems even though agency theory does not sanction them as optimal. Empirical estimates of the cost of incomplete information within firms must account for the less-than-optimal levels of incentives and insurance provided by simple pay systems. In a case-study of the Britannia copper mine, we estimate an agency model using the data generated under the mine’s linear bonus. Based on the estimated parameters we compute the costs associated with incomplete information and the inefficiency of the bonus system relative to the optimal incentive contract. The costs of incomplete infor- mation and inefficient contract form were both large and of similar magnitude. Under the linear bonus the firm earned expected profits equivalent to 34% of potential full infor- mation profits. This inefficiency would have been cut in half by implementing the optimal incentive contract rather than using the linear bonus. The remaining inefficiency stems from the costs of incentives and free-riding within teams due to incomplete information. Under the linear bonus both the insurance and the incentive elements of the agency relationship were quantitatively large.

  • 332 REVIEW OF ECONOMIC STUDIES

    We conclude with a discussion of some possible limitations and extensions of our analysis. We assume that workers know the conditions in the area they work before choos- ing effort. More realistically, workers only receive a signal of productivity when choosing effort. The opposite extreme would be the case of symmetric incomplete information: neither workers nor firms know the value of 8. In this case, team effort is constant relative to the realization of 8. At any set of model parameters, expected profits are smaller when workers do not see and respond to 8 than when they do, because there is less information about the production process. In this sense, our estimates of the cost of incomplete infor- mation are conservative. However, our estimates of the model’s parameters are biased in some unknown way if workers do not observe 8 and we assume they do. A limited explo- ration of the sensitivity of the results to assumptions about information suggest the con- clusions are not greatly affected by them (Ferrall and Shearer (1994)).

    While we control for heterogeneity in tasks and skills across occupations, skills may differ within occupations as well. In particular, workers may develop skills on the job. Shearer (1996) has matched the payroll data from Britannia with personnel files that include when the worker joined the firm. Using the same framework, he estimates the return to tenure within the firm controlling for the incentives induced by the bonus system. He finds essentially flat productivity-tenure profiles, which allows us to ignore at least this source of heterogeneity in skills within occupations.

    We specify the environment as a static principal-agent problem. An important issue in the dynamics of incentive pay is the ratchet effect (Gibbons (1987) and Kanemoto and MacLeod (1990)). The ratchet effect arises when a firm uses a worker’s past performance to determine the parameters of the compensation scheme, and workers recognize this feedback. We ignore the ratchet effect for two reasons. Workers at Britannia changed location within the mine, and as tunnels progress rock conditions evolve over time. Both these facts reduce the extent to which a worker’s past performance affects his future compensation even if past performance is used to rate an area relative to other areas. Ickes and Samuelson (l 987) argue that worker rotation mitigates the ratchet effect because it reduces the correlation between a worker’s performance today and the compensation scheme he expects to face in the future.

    The results suggest many areas of further research. For example, our proof for identi- fication of the agency model parameters using payroll data is specific to our model and the payment scheme used by the firm. A more general analysis of the empirical implications of agency models would be valuable, particularly since more data sets on compensation are becoming available. Our analysis relies on uncontrolled variation in outcomes of time periods. We attribute this variation to changes in productivity as the mine’s tunnels advanced. To avoid this assumption one might conduct controlled experiments on the incentive effects of different payment systems in situations where the agency model’s requirements of random productivity and unobserved effort hold. Structural estimates of incentive experiments would help determine how reliable results are when using an agency model as the “data generating process” for payroll data.

    APPENDIX Proof of Theorem T1.

    We first note that maximizing (4) implies Au = 2Ab. With risk averse workers and a risk neutral firm, the optimal wage contract provides complete insurance. Using this condition to invert U for occupation i

  • FERRALL & SHEARER INCENTIVES AND TRANSACTION COSTS 333

    Productive efficiency requires that the marginal product of team effort equal the sum of worker marginal effort costs

    where dividing by r converts ki to monetary units. Rearranging gives

    Substituting these expressions back into (1.1)

    ka 2" kb 1 u-'(flb) w,(e) = - ___ U + l 2"k,+kb

    f+L. r 2 ( q + 1) 2"k, + kb r

    To derive expected profits simply substitute back into

    E [ K ] = E [ y F ] - E [ W a ] - 2 ~ w b ] = ~ E [ y F ] - U , ' ( f l u ) - 2 U b ' ( f l b ) . U+1

    Substituting kb = 1 and k, = 2-" -Ikb gives the desired result.

    ive normal distribution, then To derive the expression for output note that in general if In (O)-N(p , 02), and @ is the standard cumulat-

    To see this note that

    Using the change of variables y = In 8 -p , it is straightforward to show that

    where 6 is the standard normal density function. See for example Olkin, Gleser and Derman (1980, p. 300). It follows directly that

    Proof of Theorem T2

    A team member will never supply effort, Ai ie {a , b } , beyond that level that equates the marginal utility of effort to its marginal cost. Furthermore with Leontief production any extra effort, beyond that supplied by the other occupation, is wasted. Therefore, given the muckers are identical, the (symmetric) set of sustainable effort levels in a Nash equilibrium are

    ( o , l a > ,

    Ab€ (0, x b } ,

    for miners and muckers respectively, where

  • 334 REVIEW OF ECONOMIC STUDIES

    Let be that value of 8 for which a worker in occupation i is just indifferent between supplying effort xi and supplying zero effort. That is

    Note that e,* is not unique since mucker effort enters utility additively. That is, there is a range of e,* for which the muckers can stop shirking in equilibrium. We proceed in the proof with the minimum value of 82: i.e. that value of e,* such that each mucker shirks for 8 < 88 independent of what the other mucker or miner does. We then show that this leads to the Pareto efficient Nash equilibrium.

    Under assumption A1

    and, since any extra effort is wasted,

    e* = max (e:, e ; } = 08. Therefore the Nash equilibrium effort functions are

    (0, otherwise,

    otherwise.

    To show that this is the Pareto efficient Nash equilibrium, let

    Note that the set of 8s for which positive effort is supplied is smaller under 68 than 88. Furthermore, the amount of effort and utility level, is unaffected for 8 < Of and 8 > 6;. However, for e$ < 8 < 63, effort and utility were positive under e,* but are now equal to zero. Proof of Theorem T3

    Direct substitution of the Nash equilibrium effort functions into the bonus equation (6) gives O ( q + 1 ) h e-Mq + 1 ) h

    if 8 > 8*,

    otherwise, w(e) =

    or

    otherwise.

    T3(i) follows from direct substitution of the expression for 6* into the bonus equation. T3(ii) follows from the fact that

    and the log-normality of 6.

  • FERRALL & SHEARER INCENTIVES AND TRANSACTION COSTS 335

    T3(iii) To derive the density function in L3(iii) let z = In (0). Then

    Furthermore,

    z = -In [ H w + ax/3)], 71 71+1

    and

    Therefore

    T3(iv) The contribution to the likelihood of a limit observation is

    Similarly, the contribution to the likelihood of a non-limit observation is

    1 G < w i + ax/3)((71 + 1)/q)o eXP [- 2[((71 + 1)/71)0l2 l [In ( (wi + ?))l2].

    As a function of structural parameters, the lower bound on the wage distribution is W ( & * ) = ax/(3(2q + l)). Using results in Donald and Paarsch (1993) on boundary estimators, consistency follows from the fact that w ( G * ) is monotonic and invertible in ax/3, as long as ax > 0. The inverted equation

    23 = (271 + l)w(&*), 3

    is a smooth function of 77 and w(6*) . If a x = 0 then w(6*) = 0 and the boundary equation is not invertible.

    limit observations depend only on the ratio I+Ye-((q+')'q)p.

    the concentrated likelihood function

    The parameters I+Y and p are not separately identified, because the contributions of both limit and non-

    Replacing w(O*) by its estimate wmin, the minimum observed positive bonus in the sample, we can consider

    Since the rate of convergence of wmin to w(O*) is faster than h , (Flinn and Heckman (1982), Christensen and Kiefer (1991), and Donald and Paarsch (1993)) wmin can be treated as a fixed parameter in the concentrated likelihood function. The values p, 7 and CT enter 1' independently and can be consistently estimated by maximiz- ing l'. Furthermore, we can see the lack of identification when a x = 0 because 71 and W ( & * ) enter 1' as factors and 71 would be eliminated from l' if the latter were equal to zero. If, in addition, a = 0 itself then by equation (16) p= 0 leaving only ((7 + l) /q)o identified.

    Proof of Theorem T4

    (i) Bonus system. The bonus system uses xi to cancel out fixed differences q, i.e., xi = X + Pvj . Expected revenue is E [ y ] = E[OA,(O)] + P E [ v ] and expected profits per team are

    E[RJ = ~ [ e n , ( e ) ~ + P ~ V J - E[wa(e) + 2wb(e)l - pa - 2pb .

  • 336 REVIEW OF ECONOMIC STUDIES

    Under assumptions A1 -A3

    and expected profits reduce to

    Full information

    Under full information, 8, v and A are observable. The firm chooses effort, &(e, v) and wages, w , ( B , v) to maximize expected profits subject to the workers expected utility constraint. Expected profits per team are

    1, / v e q e , v) + m- W , ( e , v) - 2 w b ( 8 , v)f(e)g(v)dedv. Expected utility for occupation i is

    As in Theorem T1 the optimal contract implies that A, = 2&, and is characterized by the optimal risk sharing and efficiency conditions. Because the marginal benefit of worker effort is independent of fixed effects, v does not affect the efficiency of effort condition. That is

    Optimal risk sharing implies full insurance for the worker, that is

    Assumptions A1 -A3 imply

    wb = kb [ r e +- PE[V1 + sb, r(q + 1) 2'k, + kb 3

    and expected profits are independent of fixed effects v.

    (ii) Piece rates. Expected profits of the firm are

    Jv J, [eaa(e> + PvI (1 - a ) f ( e )g (Wedv-Pu - 2 P b , where Pn and p b solve the participation constraints of the miners and muckers respectively. To solve for pu

    Using the independence of 8 and v gives

    Given the normality of v we use the result

    E[e-kv] = exp ( - k p v + 5 pot). Furthermore, using the definition

  • FERRALL & SHEARER INCENTIVES AND TRANSACTION COSTS 337

    and

    -In (-ai) = r(PE[v]/3 +S,),

    gives

    .The same procedure for P b gives

    Actual expected profits are

    but we estimate profits under the piece rate as

    We therefore over-estimate profits under the piece rate system by the amount ira2P20’,.

    Acknowledgements. We acknowledge research support from the Centre for Resource Studies at Queen’s University and Ferrall acknowledges support from the Social Science and Humanities Research Council of Canada. For helpful comments and suggestions we thank two referees, an editorial board member, Dan Bernhardt, Robert Gibbons, Bart Hamilton, Mary Margiotta, Harry Paarsch, Daniel Parent, Craig Riddell, and Frank Wolak. We would also like to thank participants at the conference “Incentives and Contracts: Theory and Evidence” held at the University of Montreal, the Econometric Society meetings in Washington DC, the Stanford Institute for Theoretical Economics, and seminar participants at the University of Toronto, University of Quebec at Montreal, and Wilfrid Laurier University.. Finally, we thank Ron Shearer for drawing our attention to the data, the staff at the University of British Columbia Special Collections for their assistance in collecting the data, and Jennine Ball, Julie McCarthy, Brian Sourham, Krista Clairmont, Beth Riley and Lisa Wu for data entry. The data and computer programs used in this paper are available upon request from the authors.

    REFERENCES BLINDER, A. (ed.) (1990) Paying for Productivity: A Look at the Evidence (Washington: The Brookings

    Britannia Mining and Smelting Company Limited, Records, Provincial Archives of British Columbia. Add. Mss.

    BROWN, C. (1990), “Firms’ Choice of Method of Pay”, Industrial and Labor Relations Review, 43, 165s-182s. CHRISTENSEN, B. J. and KIEFER, N. M. (1991): “The Exact Likelihood Function for an Empirical Job

    Search Model”, Econometric Theory, 7, 464-486. DONALD, S. and PAARSCH, H. (1993), “Piecewise Pseudo-Maximum Likelihood Estimation in Empirical

    Models of Auctions”, International Economic Review, 34, 125-148. EHRENBERG, R. (ed.) (1990), “Do Compensation Policies Matter?”, Industrial and Labor Relations Review,

    Special Issue, 43. ECKSTETN, Z. and WOLPIN, K. I. (1990), “Estimating an Equilibrium Search Model using Panel Data on

    Individuals”, Econometrica, 58, 783-808. FERRALL, C. and SHEARER, B. (1994), “Incentives, Team Production, Transactions Costs, and the Optimal

    Contract: Estimates of an Agency Model using Payroll Records” (Queen’s University Discussion Paper No. 908, July).

    FLINN, C. and HECKMAN, J. J. (1982), “New Methods for Analyzing Structural Models of Labor Force Dynamics”, Journal of Econometrics, 18, 11 5-168.

    GIBBONS, R. (1987), “Piece-Rate Incentive Schemes”, Journal of Labor Economics, 5, 413-429. GREEN, J. and STOKEY, N. (1983), “Comparison of Tournaments and Contracts”, Journal of Political Econ-

    GROSSMAN, S. J. and HART, 0. D. (1983), “An Analysis of the Principal-Agent Problem”, Econometrica,

    HAUBRICK, J. G. (1994), “Risk Aversion, Performance Pay, and the Principal-Agent Problem”, Journal of

    HOLMSTROM, B. (1979), “Moral Hazard and Observability”, Bell Journal of Economics, 19, 74-91.

    Institution).

    1221. Victoria, British Columbia.

    omy, 91, 349-364.

    51, 7-46.

    Political Economy, 102, 258-216.

  • 338 REVIEW OF ECONOMIC STUDIES

    HOLMSTROM, B. (1982), “Moral Hazard in Teams”, Bell Journal of Economics, 13, 324-340. HOLMSTROM, B. and MILGROM, P. (1987), “Aggregation and Linearity in the provision of Intertemporal

    ICKES, B. W. and SAMUELSON, L. (1987), “Job Transfers and Incentives in Complex Organizations: Thwart-

    JENSEN, M. C. and MURPHY, K. J. (1990), “Performance Pay and Top-Management Incentives”, Journal

    KANDEL, E. and LAZEAR, E. P. (1992), “Peer Pressure and Partnerships”, Journal of Political Economy, 100,

    KANEMOTO, Y. and MAcLEOD, W. B. (1992), “The Ratchet Effect and the Market for Secondhand Work-

    LAZEAR, E. P. (1986), “Salaries and Piece Rates”, Journal of Business, 59, 405-431. LAZEAR, E. P. and ROSEN, S. (1981), “Rank-Order Tournaments as Optimal Labor Contracts”, Journal of

    Political Economy, 89, 841-864. MAcLEOD, W. B. and MALCOMSON, J. (1989), “Implicit Contracts, Incentive Compatibility, and Involun-

    tary Unemployment”, Econometrica, 57, 447-480. MARGIOTTA, M. M. and MILLER, R. A. (1993), “Managerial Compensation and the Cost of Moral Hazard”

    (GSIA Working Paper 1993-19, November, Carnegie Mellon). OLKIN, I., GLESER, L. and DERMAN, C. (1980) Probability Models and Applications (New York:

    Macmillan). PAARSCH, H. and SHEARER, B. (1996), “Piece Rates, Fixed Wages, and Incentive Effects: Statistical Evi-

    dence from Payroll Records” (Working paper 96-14 CREFA, Department of Economics, Lava1 University).

    PAKES, A. (1986), “Patents as Options: Some Estimates of the Value of Holding European Patent Stocks”, Econometrica, 54, 755-784.

    PRESS, W. H., FLANNERY, B. P., TEUKOLSKY, S. A. and VETTERLING, W. T. (1987) Numerical Recipe: The Art of Scientijic Computing (New York: Cambridge University Press).

    RUST, J. (1987), “Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher”, Econometrica, 55, 999- 1034.

    SHEARER, B. (1996), “Piece Rates, Principal Agent Models and Productivity Profiles: Parametric and Semi- Parametric Evidence from Payroll Records”, Journal of Human Resources, 31, 275-303.

    STIGLITZ, J. E. (1991), “Symposium on Organizations and Economics”, Journal of Economic Perspectives, 5,

    Incentives”, Econometrica, 55, 303-328.

    ing the Ratchet Effect”, The Rand Journal of Economics, 18, 275-286.

    of Political Economy, 98, 225-264.

    801-817.

    ers”, Journal of Labor Economics, 10, 85-98.

    15-24.