

Performance measurement and metric manipulation in the public sector

EGPA Conference 2006-07-25 SG VII Ethics & the Integrity of Governance

Colin Fisher

Bernadette Downes Nottingham Business School Nottingham Trent University

Contact details

Colin Fisher Professor of Managerial Ethics and values

Nottingham Business School Nottingham Trent University Burton Street Nottingham NG1 4BU Email: [email protected] Tel 0115 8482822

Bernadette Downes Research associate and senior lecturer

Nottingham Business School Nottingham Trent University Burton Street Nottingham NG1 4BU Email: [email protected] Tel 0115 8482401


ABSTRACT

This paper explores the circumstances that influence whether managers in the public

services manipulate the measurement information that is used to assess performance;

and if they do, what level of deception they might use. The realistic evaluation

approach is adopted. A Delphi survey and the collection of critical incidents through

interviews are used to identify possible configurations of contexts – mechanisms –

outcomes that provide possible explanations of information manipulation. A number

of these configurations are discussed. In a later stage of the project the findings will

be used to develop proposals for improved governance of performance measurement

systems in the public services.

INTRODUCTION

This paper reports on the circumstances that influence whether managers in the

public services manipulate the measurement information that is used to assess

performance; and if they do, what level of deception they might use. The research is

not focussed on the extent of such deception, nor upon the question of whether

performance measurement and target based systems of performance management are

effective in improving the performance of public sector organisations. Its purpose is

to identify the circumstances and mechanisms that trigger deception so as to propose

means of improving governance of performance management systems. The

widespread use of performance measures and targets within the UK public sector


(DoH 1997 & 2001, Givan 2005, Sanderson 2001, Smith 2005) makes research into

how deception might be discouraged important. This paper uses the results of an

initial empirical study into this question to propose an explanatory configuration of

contexts, mechanisms and outcomes that can be tested in a future research study.

REALISTIC EVALUATION

Pawson and Tilley (1997) developed a method of realistic evaluation designed

to evaluate public policies and programmes. We have adopted the approach in this

study. Realistic evaluation is based on realist ontology and epistemology and has

three elements, described below, that connect with each other in what Pawson and

Tilley call context-mechanism-outcome configurations (CMOCs).

• Underlying mechanisms. These are the causal mechanisms that bring about the

events that people experience. There are many mechanisms that may

influence events, but they are likely to be hidden and not immediately

obvious to the observer.

• Contexts. The mechanisms are triggered by certain contexts but not by other

contexts.

• Outcomes. These are the desirable or undesirable consequences of the

mechanisms that have been triggered by the contexts.

Realist social scientists (Collier 1994: 42-45) refer to outcomes as events and add

a further element, experiences, which are people’s socially constructed and subjective

understandings of events. However, experiences are not considered in realistic

evaluation. Pawson and Tilley argued that CMOCs should be proposed and

empirically tested. In Popperian manner, the CMOCs that survive the tests stand as


probable explanations of the programme outcomes being evaluated; until they too are

refuted by further research.

Tilley (2000) gives the example of evaluating the impact of installing CCTV in

car parks on the rate of car crime to illustrate the realistic evaluation process. One of

the possible mechanisms he labelled ‘the nosey parker mechanism’. People see the

CCTV cameras in a car park and this makes them feel secure; more people therefore

use the car park, which becomes busier. Because there are more people about,

criminals are deterred. However, this mechanism will only be triggered in certain

contexts. If the car park is one that people use to park their cars while they are at work

then it will be busy at the start and end of the working day but will be quiet in the

middle of the day; and criminals will be able to steal cars and their contents

undisturbed. In this context CCTV will not reduce the amount of car crime. Other

contexts may well trigger other mechanisms that would lead to a reduction in crime

(Phillips 1999).
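
As a reading aid, a CMOC can be written down as a small record that ties one mechanism to the contexts that trigger it and to the outcome expected when it fires. The sketch below is illustrative only: the class name, field names and context labels are hypothetical, and the CCTV entries simply paraphrase Tilley's example rather than reproduce any formal notation from realistic evaluation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CMOC:
    """One context-mechanism-outcome configuration (hypothetical structure)."""
    mechanism: str
    triggering_contexts: frozenset
    outcome: str

    def is_triggered(self, context: str) -> bool:
        # A mechanism fires only in the contexts that trigger it.
        return context in self.triggering_contexts


# Paraphrase of Tilley's CCTV example; the context labels are invented here.
nosey_parker = CMOC(
    mechanism="nosey parker",
    triggering_contexts=frozenset({"car park busy throughout the day"}),
    outcome="criminals deterred because more people are about",
)

print(nosey_parker.is_triggered("car park busy throughout the day"))
print(nosey_parker.is_triggered("commuter car park, quiet at midday"))
```

Keeping the triggering contexts explicit makes the conjecture testable: a CMOC is refuted when the mechanism is observed to fire, or fail to fire, in a context the record does not predict.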

THE OBJECTIVES AND STRUCTURE OF THE PAPER

Realistic evaluation implies a three-stage approach. Firstly a reconnaissance is

done to identify potential mechanisms and contexts. Secondly possible configurations

of contexts, mechanisms and outcomes are conjectured; and finally the conjectures are

empirically tested. The reconnaissance was conducted by reviewing the literature and

by undertaking semi-structured interviews within the National Health Service (NHS)

to collect critical incidents. The second stage utilises the literature review and critical

incidents but is also based on a Delphi questionnaire survey of local government. The

third stage, of which the first round is complete and the second round is currently

being undertaken, is based on a Delphi survey of NHS managers.


STAGE 1 IDENTIFYING OUTCOMES, MECHANISMS AND CONTEXTS

In this section the contexts, mechanisms and outcomes that are possibly relevant to

metric manipulation are identified.

Outcomes

Two outcomes relating to manipulation of performance measurement data and

information are proposed:

1. the propensity of managers and professionals to manipulate,

2. the level of deceit involved in the manipulation.

Figure 1 provides a framework for classifying the levels of deceit involved in

data manipulation. Its origins lie in the work of Grice (1975a, 1975b), a philosopher

who proposed that a cooperative principle is to be found in effective conversations.

The principle can be expressed through four maxims.

• Maxim of quantity – provide the right amount of information, neither too

much nor too little.

• Maxim of quality – provide true information.

• Maxim of relation – provide relevant information and be sensitive to the

balance and relationships between different information.

• Maxim of manner – be clear and brief, do not confuse by prolixity or other

devices or tropes. (This statement, as an example, neglects this maxim)

Of course people may choose to break these maxims and this gives rise to

different types of deception according to which maxim is flouted.


McCornack (1992) has developed a theory of information manipulation based on

Grice’s work. His subsequent empirical work (McCornack et al 1992) suggests that

violations of the four maxims are not all seen as exhibiting the same degree of

dishonesty. Breaking the quality maxim was seen as the most dishonest form of

deception; followed by relevance, clarity (manner) and quantity. These insights have

been used in figure 1 to produce a scale of dishonesty and deceptiveness. It needs to

be borne in mind that concepts of deception may be culturally specific. A replication

study (Yeung et al 1999) in Hong Kong found that violations of the maxims of

quantity and manner were not seen as deceptive whereas in McCornack et al’s

American study all four kinds of violation were seen as deceptive, albeit to differing

degrees. Nevertheless the Hong Kong findings reinforce the view that there are

different degrees of deception.

The three forms of deception, based on Grice’s four maxims, are described next.

Selective presentation

There are three forms of selective presentation that result from ignoring the

maxims of quantity and manner. All share the common feature of presenting

information so that the recipient is likely to form an incorrect understanding. One way

is simply to hide significant information in a mass of trivial detail. A more

sophisticated form is termed by Grice conversational implicature. This refers to the

way that people use the interplay of the maxims to create an implication in the hearer

that is not explicit within the communicator’s words. An example concerns a report

on Ambulance Service response times published by the Commission for Health

Improvement (CHI). Ambulances are supposed to arrive at the most urgent calls

within eight minutes. A statement was made that three quarters of the Ambulance


Services met this target and that there had been a 20% improvement in performance

since 1999. This was no doubt true and so the maxim of quality had been obeyed. The

use of an aggregate figure and the emphasis given to the rate of improvement were

intended to misdirect the reader of the original statement from what others saw as a

very patchy and disappointing result. As the acting Chief Executive of the CHI

admitted,

While we feel that targets have been helpful in focussing the minds of

ambulance Trust staff on improving performance, we believe that more

sophisticated measures of response times and outcome measures are needed.

(BBC News Online 2002a)

The Consumer Association was blunter, saying the CHI report was short on detail and

failed to tackle crucial issues (BBC News Online 2002a). Implicature may underwrite

much of what has become known in politics as ‘spin’.

A commonly reported form of selective presentation in the public sector

literature is being economical with the truth, a phrase re-invented by the UK Cabinet

Secretary, Sir Robert Armstrong, during the Australian ‘Spycatcher’ trial in 1986. One

interviewee identified a clear example; it concerned the two-week target for the time

between a GP referring a patient to hospital for suspected cancer and the patient being

seen by a hospital specialist (Dodds et al 2003). The hospital staff were aware that

there could be a delay between the GP seeing the patient and sending a referral letter

to the hospital, but the clock for the two-week target only started ticking when the

referral letter was received by the hospital. This, and other similar factors, meant that

the official measurements did not reflect the patients’ experiences. However to admit

this would make the hospital’s performance look worse than the official statistics


stated; and so there was an unspoken agreement not to raise the issue; to be

economical with the truth.

Gaming

Managers seek benefits by gaming, which is taking advantage of the loopholes

in the rules and systems under which they operate. One form of gaming exploits the

lack of an overview of an organisation’s performance and concentrates on achieving a

high profile target by neglecting other, perhaps equally important matters, which are

either not measured or are the subject of a low profile measure. An example of this

type of gaming can be seen in the consequences of the government’s introduction of a

target that all GPs’ patients should be seen within 48 hours. The intention of such a

target, it is to be supposed, is to improve services to patients. However Primary Care

Trusts (PCTs) and general practices often responded to this target by scrapping

advance booking and requiring patients to phone in early on the day for which they wanted an

appointment (BBC News Online 2003a and 2003b). The result was that the

targets were met, but in the eyes of many patients the service was made worse (Health

Care Commission 2005:12-13).

Distortion

There are two forms of distortion: misclassifying things, thereby taking

advantage of the margin of error that categorisation always allows, and, more

seriously, lying or falsifying data. The National Audit Office’s (2001) investigation

into ‘inappropriate adjustments to NHS waiting lists’ reported on several cases where

waiting list information had been adjusted by re-classifying patients. Examples were


delaying the addition of patients to waiting lists and inappropriate suspension of

patients.

There is less evidence in the public domain of deliberate lying connected with

NHS performance measurement. However there are accounts from occasional

whistleblowers. One such, Ian Perkin, alleged that, in the Trust where he worked, staff

inputting data into the information system were being asked to insert a figure of zero

in the cancelled operations field when there had in fact been 28 cancellations (Revill

2003).

Mechanisms

A number of mechanisms were identified from the literature that might

encourage managers and professionals to manipulate data and information.

1. The avoiding hassle & scrutiny mechanism

If a manager, their team or their unit, is assessed as having performed very well or

very badly on a measure or against a target they may become subject to special

scrutiny. If they have performed badly, this will lead to particularly close scrutiny and

audit. A manager or professional may manipulate the measures to avoid such

unwelcome attention. This might also be called the ‘feed the beast’ mechanism, which

is a catchphrase that Harradine (2006) reports was popular amongst the senior

management team of an NHS Trust. It means providing the next level of the NHS

with whatever information is necessary to avoid scrutiny.

2. The principled mechanism


The principled mechanism is one in which people believe that the wrong things

are being measured. One possibility is that important things, such as clinical quality

and outcomes, may be relatively ignored whereas less important things, such as

administrative performance, are closely measured and the subject of targets. People

may, alternatively, believe that the information is wrong because it is incomplete.

Within the NHS for example aggregate waiting list information may not discriminate

between cases of high and low clinical need and value; so that Trusts are encouraged

to do many quick and cheap operations, to reduce waiting times, whilst delaying

complex and expensive cases. If people think that the wrong measures distort priorities, they may

believe it proper to manipulate them in order to achieve a better quality of service.

3. The frustration at inaccurate data and information mechanism

People may believe that the performance measures that they are assessed against

are inaccurate. This may be due to miscoding when the information is input. Another

possibility is that by the time the information is used to assess performance it is out of

date. A lack of faith in the performance measures, especially where they indicate a

weak performance, often becomes a justification in people’s minds for deception.

4. Large bonus triggered mechanism

Jensen (2003) has pointed out that a performance related pay (PRP) system that

is based on the proportion of a target achieved (for example – no bonus if target is not

reached, cost of living increase only if it is reached and a large bonus if it is exceeded)

encourages managers to nudge metrics that are near a bar point up to that

point if the bonus that is triggered is a large proportion of total remuneration. That a


small reported increase in performance can result in a disproportionately large

increase in bonus encourages deception.

5. Small bonus triggered mechanism

The mechanism is the same as for mechanism 4 but the bonus is a small

percentage of total remuneration.

6. Bonus in proportion to achievement mechanism

In this PRP mechanism the size of any bonus is based on a straight line

relationship between performance and reward (in which an x% increase in

performance results in an x% increase in reward – as in the old fashioned, time study

based, incentive schemes).
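
The difference between the stepped schemes (mechanisms 4 and 5) and the straight-line scheme (mechanism 6) can be illustrated with a small calculation of what a manager gains by reporting one more point than was achieved. This is a sketch under stated assumptions, not a model taken from Jensen (2003): the function names, the target of 100 points and all monetary figures are hypothetical.

```python
def stepped_reward(performance: int, target: int,
                   cost_of_living: float, large_bonus: float) -> float:
    """Reward under a stepped scheme of the kind described for mechanism 4."""
    if performance < target:
        return 0.0             # target missed: no bonus
    if performance == target:
        return cost_of_living  # target just reached: cost of living increase only
    return large_bonus         # target exceeded: large bonus


def proportional_reward(performance: int, rate: float) -> float:
    """Reward under a straight-line scheme: each point earns the same amount."""
    return rate * performance


def gain_from_nudge(reward, performance: int, nudge: int = 1) -> float:
    """Extra reward obtained by reporting `nudge` more points than achieved."""
    return reward(performance + nudge) - reward(performance)


# All figures below are hypothetical and chosen only for illustration.
TARGET = 100


def stepped(points: int) -> float:
    return stepped_reward(points, TARGET, cost_of_living=500.0, large_bonus=5000.0)


def linear(points: int) -> float:
    return proportional_reward(points, rate=50.0)


print(gain_from_nudge(stepped, 100))  # at the bar point: 4500.0
print(gain_from_nudge(linear, 100))   # same nudge, straight-line scheme: 50.0
print(gain_from_nudge(stepped, 90))   # well below the bar point: 0.0
```

Under these assumed figures the payoff from a one-point misstatement is concentrated at the bar point of the stepped scheme, which is the situation the large bonus mechanism describes; shrinking `large_bonus` towards `cost_of_living` gives the small bonus case.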

At this stage of the research the intention was to identify as wide a range of

mechanisms as possible. This means that some of the mechanisms may not be

commonly present in public sector organisations (CIPD 2005: 8). Mechanisms 4, 5 &

6 for example all refer to performance related pay (PRP) which is still relatively

uncommon within the public sector. Despite NHS Trusts being given from their

foundation the right to establish local pay systems only a minority did; and of those

few developed PRP systems (Corby et al 2003). Where PRP schemes have been

introduced they have often awarded only a small percentage of discretionary pay on

the basis of competency rather than on measured performance or outcomes assessed

against targets (PR Newswire Europe Ltd. 2000).

The research interviews conducted within the NHS have identified two further

mechanisms.

7. ‘Seeking additional resources’ mechanisms.


This mechanism operates in two ways. Interviewees from hospital Trusts

reported that information was often presented so as to make the situation look worse

than it was, to bolster a case for additional resources. At Trust Board level, contrarily,

the intention was more often to make things look better than they were, to earn more

income through the recently introduced payment by results (PbR) system. A survey,

conducted by the NHS Alliance, of Health Commissioners reported that 67% of those

commissioning under the new PbR system had concrete evidence of gaming by

service providers while 53% were suspicious that gaming was happening but could

not get the information from providers to investigate (Harding 2006). Carlisle (2006)

provides a list of the types of game that might be played as PbR comes fully into

operation.

Contexts

A number of contextual factors have been identified in the literature that may

trigger the mechanisms discussed in the previous section.

The balance of benefits, risks and sanctions

The research of Murray and Millar (1997), from within the psychology

discipline, into the situational factors that make people more likely to think that they

are being deceived, found that people will suspect deception if:

• the benefit to the communicator is high

• the chance of the deception being detected is low, and

• the cost to the deceiver if detected is low.


The obverse of this finding is that potential high rewards coupled with a perception of

low probability of being caught, and few damaging consequences if caught, could act

as triggers for deception.

Embeddedness of performance measurement

Performance management can be said to be highly embedded in an organisation

if it is central to such activities as arguing for priorities or resources or for obtaining

rewards, such as recognition or performance related pay. The more performance

measurement is central to these purposes the greater the likelihood that the deception

mechanisms will be triggered.

Organisational response to externally driven performance measurement

The structure of ministerial accountability, through Parliament, for public

services has resulted in central government departments using a system of centrally

imposed performance measures, targets and assessment regimes to control local

authorities and health Trusts. Organisations subject to such monitoring may respond

in two different ways. Modell (2001: 458) studied the responses of senior managers in

the Norwegian health system to institutional pressures to conform to government’s

expectations. He used Oliver’s (1991) framework of managerial responses that was

scaled from acquiescence to manipulation. He found that management in some

hospitals sought to adapt an imposed performance management system to improve

operational efficiency as well as to provide government with the information it needed

to legitimate its policies. He pointed out however that they only responded in this way

if crude financial performance measures (often acceptable to politicians (Abernethy

and Chua 1996)) were replaced by a more sophisticated performance measurement


system acceptable to both hospital management and politicians. Such a

development is likely to reduce the probability of metric manipulation. However

Chang’s (2006: 74) research identified that an externally imposed measurement

system may have little impact on an organisation below its senior management level.

Where medical professionals’ interests were not associated with the demands of

government’s performance measures the chances of metric manipulation are increased

(Chang 2006: 75).

The use of a balanced, multi-dimensional performance measurement system

Multi-dimensional performance measurement systems, most commonly

associated with the Balanced Score Card (Kaplan and Norton 1996), seek to ensure

that achievement against one target is not achieved at the cost of a poorer performance

against other targets. The balanced score card is the basis of Government’s oversight

of the NHS, and underlies the Performance Assessment Frameworks (Radnor and

Lovell 2003: 181), but it is not necessarily the basis of internal performance

management within Trusts. Indeed it has been argued that the necessarily complex

nature of public service performance measurement makes it harder to take a balanced

view;

As services are broken down and deconstructed into ever smaller components

the less the performance of the whole service is being measured.

Adcroft and Willis (2005: 394)

Some forms of metric manipulation, such as gaming, are only possible where a

balanced view of performance is not taken. A multi-dimensional system should,

because it allows more sophisticated and transparent trade-offs between priorities,


diminish the tendency to focus on single high priority targets at the expense of other

objectives. Such an approach should avoid triggering deception mechanisms.

The attitude of organisational informal culture towards metric manipulation

A final context that may trigger deceit is an acceptance or toleration of metric

manipulation within an organisation’s informal culture. Stylianou et al (2004)

identified occupational values and cultures as factors that influence the likelihood of

people acting unethically. Put simply they found that some occupational groups do

not regard some formally unethical practices as wrong. As Jensen (2003) pointed out

organisations may develop a culture in which ‘managing the numbers’ is regarded as

acceptable. Vakkuri and Meklin (2003: 757) have developed a conceptual framework

for the development of performance management in Universities, which, like

hospitals, are knowledge based organisations. They identified a number of

ambiguities in universities, between the academic culture’s perceptions of the

objectives of a performance management system and the system’s formal objectives,

that might lead to the misuse of the system;

When people use PM systems, appropriate them, resist them or politicize with

them they are strongly influenced by the cultural conditions of their working

environments.

(Vakkuri and Meklin 2003: 756)

STAGE 2 CONJECTURING CONFIGURATIONS: CONTEXTS,

MECHANISMS AND OUTCOMES

Research Methods


The causal connections, if any, between mechanisms, contexts and outcomes

will be studied using two main techniques.

• A questionnaire based Delphi technique.

• Semi-structured interviews using the critical incident technique.

The purpose of the Delphi technique was to obtain a consensus of opinion from a

panel of experts as part of the reconnaissance stage of the research. Two Delphi

studies were undertaken. The first of these used a panel of 20 local government

experts (judged by seniority and length of service) that was formed by snowball

sampling. It never met but its members were sent a questionnaire to complete in

which the mechanisms discussed earlier were presented as a series of scenarios. The

panel members were asked how likely they thought people in their organisations

were to respond to the scenarios;

• by engaging in metric manipulation,

• and what level of deception they might use,

• and in which circumstances they might deceive.

Mechanisms 7 & 8 were not included because they were identified after the Delphi

questionnaire had been designed. The results were analysed and then reported back to

the panel members who were invited to reconsider their opinions in the light of their

peers’ views. Sixteen responded to the first round of the survey and twelve to the

second round, which produced a consensus opinion on most issues.

The second Delphi survey of NHS managers used an improved questionnaire

that included the two new scenarios; in other respects it was the same as that used

with the local government panel. A panel of 30 NHS managers was formed and 16

replies were received in the first round. The second round of the survey is currently


being conducted. The findings from the NHS panel used in this paper are those from

the first round only and so must be treated as provisional.

The Delphi survey can only provide a broad brush picture derived from answers

to hypothetical questions. The semi-structured interviews compensated for this

limitation by asking for specific examples. During the interviews respondents were

asked, by use of the critical incident technique (Flanagan 1954), to describe examples

of metric manipulation that they had observed. At this stage of the project 31

interviews have been completed within five NHS Trusts. The research project has

received ethical approval from an NHS research ethics committee and research and

development approval has been obtained from each of the Trusts within which

research has been conducted.

Findings

The purpose in this section is to use the research findings to identify context –

mechanism – outcome configurations (CMOCs). One particular critical incident can be

used to illustrate a configuration and show how such configurations may operate in

practice.

The respondent was a nurse manager in a hospital Trust. She, and her

colleagues, recognised that the performance measurement system had been

instigated to provide government with a means of monitoring and controlling

Trusts. However they developed it so that it could also be used as an aid to

management within the Trust. The informal culture of the Trust was not

sympathetic to metric manipulation.

The quality of care for long term patients was an important issue in the

Department of Health’s agenda. One of the statistics used to measure this was


the hospital re-admission rate. A low re-admission rate was thought to be a

good proxy measure for the quality of long-term care. Statistics from the

primary care Trust (PCT) were sent to the respondent that showed the target of

reducing the re-admission rate by 1% had been met. She was sceptical because

her own monitoring showed a much less positive performance. However she

recognised that when two Trusts were calculating the same performance

measure using different systems and data sets it was likely that the various

figures would not tally. The sensible thing, she thought, was to contact the

PCT and propose a meeting to reconcile the figures and produce some reliable

information. She was rather shocked when the PCT replied that a meeting

would not be worthwhile. After discussing it with the PCT manager she

formed the impression that the PCT was not interested in using the measures

to improve the internal operations of the service. They were interested only in

presenting ‘good news’ to the Department of Health. If their figures gave good

news they saw no point in challenging them; and perhaps converting good

news into bad news. This would only result in additional interference and

questioning from those bodies higher up the NHS hierarchy.

Refusing to review some information when it might be necessary is

being economical with the truth, a form of selective presentation. The PCT

had an informal culture that accepted metric manipulation and saw performance

measurement as an externally focussed system. These two contexts seem to have

triggered the ‘avoidance of hassle’ mechanism which resulted in a willingness

to selectively present information. The respondent however worked in a Trust

where the informal culture disapproved of manipulation; and there was a more

proactive approach to performance measurement. In these circumstances the


‘avoiding hassle’ mechanism was not triggered and the respondent would have

preferred to review the statistics.

Relationships between mechanisms and outcomes and contexts

The results from the local government Delphi survey are presented in table 1

and provide information on the relationship between mechanisms and outcomes. They

show the propensity to deceive, and the level of deceit people are thought to be

prepared to use, in each of the scenarios presented in the questionnaire.

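Table 1. Selected results from the local government Delphi survey, rounds 1 & 2

All figures are % of responses, except the weighted averages (Wgt. av.). Columns 2 to 5 report the level of deceit; columns 6 to 9 report the propensity to manipulate.

Mechanism | Selective presentation | Gaming | Distortion | Wgt. av. (deceit) | A possibility | A serious possibility | A near certainty | Wgt. av. (propensity)
1. The avoiding hassle & scrutiny mechanism | 75 (100) | 6 (0) | 19 (0) | 2 | 22 (20) | 9 (7) | 3 (3.5) | 8.2
2. The principled mechanism | 81 (100) | 6 (0) | 12.5 (0) | 2 | 19 (18.5) | 14 (10) | 5 (3) | 10.3
3. The frustration at inaccurate data and information mechanism | 31 (62.5) | 38 (37.5) | 31 (0) | 1.8 | 21 (23.5) | 24 (18) | 8.5 (6) | 15.8
4. Large bonus triggered mechanism | 31.5 (12.5) | 31 (62.5) | 37.5 (25) | 4.8 | 18 (22.5) | 17 (12.5) | 8 (6.5) | 12.7
5. Small bonus triggered mechanism | 37.5 (25) | 37 (75) | 25 (0) | 3.5 | 20.5 (24) | 14.5 (9) | 6.5 (5) | 11.5
6. Bonus in proportion to achievement mechanism | 62 (50) | 25 (50) | 13 (0) | 3 | 20 (23) | 11.5 (14) | 6.5 (5) | 10.4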

• The weighted averages of level of deceit are based on a scale on which: hiding information = 1, economy with truth = 2, gaming = 4, misclassification = 8 and distortion = 16. These values are then weighted by the % of respondents selecting them

• The weighted averages for propensity are based on the following weights: possibility = 1; serious possibility = 2; near certainty = 3.

• The main figures are the results from the first round of the survey and the figures in parentheses are the results of the second round.
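
The propensity averages can be reproduced from the percentages reported in table 1. The sketch below is an assumption about the arithmetic rather than a procedure stated in the paper: it multiplies each percentage by its weight and divides by the sum of the weights (6). The notes do not mention that normalisation, but it matches the first-round figures in all six rows. The function name and the row labels are hypothetical.

```python
# Weights from the table notes: possibility = 1, serious possibility = 2,
# near certainty = 3.
WEIGHTS = (1, 2, 3)


def weighted_propensity(possibility: float, serious: float,
                        near_certainty: float) -> float:
    """Weighted average propensity, normalised by the sum of the weights.

    Dividing by sum(WEIGHTS) is an inferred step, not one the paper states.
    """
    percentages = (possibility, serious, near_certainty)
    weighted_sum = sum(w * p for w, p in zip(WEIGHTS, percentages))
    return weighted_sum / sum(WEIGHTS)


# First-round percentages and the reported averages, in table row order.
ROUND_1 = {
    "row 1": ((22, 9, 3), 8.2),
    "row 2": ((19, 14, 5), 10.3),
    "row 3": ((21, 24, 8.5), 15.8),
    "row 4": ((18, 17, 8), 12.7),
    "row 5": ((20.5, 14.5, 6.5), 11.5),
    "row 6": ((20, 11.5, 6.5), 10.4),
}

for label, (percentages, reported) in ROUND_1.items():
    computed = weighted_propensity(*percentages)
    print(f"{label}: computed {computed:.2f}, reported {reported}")
```

For the first row this gives (22 × 1 + 9 × 2 + 3 × 3) / 6 = 49 / 6, or about 8.17, against a reported 8.2. The level-of-deceit averages cannot be checked in the same way, because the table collapses the five-point deceit scale into three columns.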

From the statistics in table 1 configurations linking mechanisms and outcomes

can be defined. They are shown in figure 2. The axes in figure 2 represent the

outcomes of metric manipulation in terms of propensity to deceive and the level of

deceit. The mechanisms of ‘principled objection’ and ‘avoiding hassle’ lead to a


relatively low propensity to manipulate and to the use of low levels of deceit. The two

PRP mechanisms, which bring small financial rewards, cause a low propensity to

manipulate but can involve gaming as well as selective presentation. The ‘frustration

at inaccurate information’ mechanism leads to a higher propensity to manipulate.

This manipulation is largely selective presentation but gaming can also be triggered

and (according to the first round results) deception in the form of misclassification of

data. The Delphi panel thought that PRP systems that paid large bonuses would (if

they were to be commonly used) lead to both a high propensity to manipulate and the

use of distortion, the highest level of deceit.

Selected findings from round 1 of the NHS Delphi survey are shown in table 2

and are represented schematically in figure 3.

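Table 2. Selected results from NHS Delphi survey: round 1

All figures are % of responses, except the weighted averages (Wgt. av.). Columns 2 to 5 report the level of deceit; columns 6 to 9 report the propensity to manipulate.

Mechanism | Selective presentation | Gaming | Distortion | Wgt. av. (deceit) | A possibility | A serious possibility | A near certainty | Wgt. av. (propensity)
1. The avoiding hassle & scrutiny mechanism | 62.5 | 25 | 12.5 | 4 | 14 | 14 | 3.5 | 8.75
2. The principled mechanism | 62.5 | 19 | 19 | 3.4 | 21 | 8 | 7.5 | 9.9
3. The frustration at inaccurate data and information mechanism | 38 | 38 | 25 | 4.8 | 17.5 | 25 | 5.5 | 14
4. Large bonus triggered mechanism | 56 | 19 | 25 | 4.6 | 10 | 7 | 12.5 | 10.25
5. Small bonus triggered mechanism | 69 | 12.5 | 19 | 3.2 | 17 | 6 | 10 | 9.8
6. Bonus in proportion to achievement mechanism | 63 | 25 | 12 | 3.5 | 15 | 8.5 | 8 | 9.3
7. The bidding for resources mechanism | 69 | 12.5 | 19 | 3.4 | 26 | 22 | 10.5 | 17
8. The maximising income mechanism | 63 | 19 | 19 | 3.5 | 21 | 15.5 | 4 | 10.4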

• The weighted averages of level of deceit are based on a scale on which: hiding information = 1, economy with truth = 2, gaming = 4, misclassification = 8 and distortion = 16. These values are then weighted by the % of respondents selecting them.

• The weighted averages of propensity are based on the following weights: possibility = 1; serious possibility = 2; near certainty = 3.
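The propensity weighting can be illustrated with a short sketch. The weights are those given in the note above; the divisor (the sum of the weights, 6) is not stated in the paper and is an inference, adopted here because it reproduces the propensity figures reported in table 2 for the first, third and fourth mechanisms. The function name is hypothetical.

```python
# Weights from the table note: possibility = 1, serious possibility = 2,
# near certainty = 3.
PROPENSITY_WEIGHTS = (1, 2, 3)


def weighted_propensity(possibility, serious_possibility, near_certainty):
    """Weighted average of the % of respondents choosing each propensity level.

    Dividing by the sum of the weights is an assumption inferred from the
    published figures, not a rule stated by the authors.
    """
    responses = (possibility, serious_possibility, near_certainty)
    weighted_sum = sum(w * pct for w, pct in zip(PROPENSITY_WEIGHTS, responses))
    return weighted_sum / sum(PROPENSITY_WEIGHTS)


# Response percentages taken from rows 1, 3 and 4 of table 2.
print(weighted_propensity(14, 14, 3.5))    # 8.75
print(weighted_propensity(17.5, 25, 5.5))  # 14.0
print(weighted_propensity(10, 7, 12.5))    # 10.25
```

On this reading the propensity figure is not itself a percentage of respondents: a mechanism that every respondent rated a near certainty would score 50, and one that every respondent rated only a possibility would score about 16.7.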

The results show some differences when compared with those of the local

government panel. The ‘frustration at inaccurate information’ mechanism is still

associated with a relatively high propensity to deceive but shows an increased

tendency to use more serious forms of deceit. The ‘avoiding hassle’ mechanism

generates a low propensity to deceive, as is the case with the local government

managers, but it is also linked with a higher level of deception by the NHS panel. The

NHS panel thought, similarly to local government managers, that PRP schemes that

produced large bonuses would lead to a high level of deceit and a high propensity to

deceive. The results on the two new scenarios/mechanisms introduced into the NHS

survey were interesting. The ‘bidding for resources’ mechanism was considered to

lead to a relatively very high propensity to manipulate, but with only low levels of

deceit being used. The ‘maximising income’ mechanism, it was considered, would

lead to similar forms of deceit being used but managers would be less likely to use

them.

The results from the two Delphi surveys begin to identify the areas in which the

risks of metric manipulation are greatest and in which efforts at improving

governance should be directed. In the next section an attempt will be made to identify

the contexts that may trigger the mechanisms.

Contexts as triggers for mechanisms

This section uses both the results from the Delphi surveys and from the

interviews to identify which contexts may trigger which mechanisms. The conjectured

relationships between contexts and mechanisms are shown in figure 3 by the vertical

arrows.

The most important trigger, according to the responses to the Delphi

questionnaire (table 3), would appear to be the informal culture of an organisation.

More than 60% of both panels thought that acceptance of deceit by the informal

culture was an important or very important trigger for all the mechanisms.

Table 3. The relative importance of factors bearing on a decision to manipulate information

Q. How important are each of the following factors when people decide whether or not they will manipulate performance data and information? Allocate a total of 10 points between the following so that the most important get the largest number and so on.

1. Acceptability or unacceptability of information manipulation within the organisation’s informal culture and values.

2. Whether the desired effect can be achieved by slight manipulations that might not be considered so bad, or whether it would require serious wrongdoing.

3. The risk of being caught.

4. The severity or otherwise of the disciplinary consequences if caught manipulating data or information.

Three of the Trusts in which interviews were conducted provided interesting

comparisons. In one Trust there was an acceptance of manipulation;

“here we go fiddling the figures again, then again if it makes us look better –

other Trusts do it”

(Int.14, see also Int. 2)

In another the attitude was that it was better to face the issues,

“We don’t want to hit the target and miss the point”.

(Int 15)

In a third Trust it was reported that there had been a change in the culture following a

change in Chief Executive. Under a former chief executive people manipulated the

figures out of fear of his anger if targets were not met (Harradine 2006). Instances of

selective presentation, gaming and distortion in this Trust were reported by the

interviewees quoted above. Such practices had become less acceptable under the new

leadership.

Table 3 shows the risk of being caught, considered in the balance with the

potential benefits if not caught, to be factors in deciding whether to manipulate

information. In some Trusts the level of auditing of performance information was

thought to be low; ‘the chances of being caught if you are clever enough are low’

(Int.16). In others the level of audit was thought high enough to dissuade people (Ints

16 & 11). Where audit systems have not been set up distortion is more likely (Int.13,

CI 1).

It might have been thought that tough auditing of performance information was

associated with performance management being highly embedded in an organisation.

The interviews suggested this is not necessarily the case. A combination of a low

level of audit and a high degree of embeddedness triggered selective presentation and

gaming, especially in relation to bids for resources and budgets (Int. 16 & 12). As far

as personal appraisals were concerned the interviews suggested that in several Trusts

performance measurement was not embedded. Many appraisal schemes were based

on assessing people’s performances against competence standards and qualitatively

expressed objectives rather than against measured targets. In such circumstances the

deception mechanisms are unlikely to be triggered.

The belief that performance measurement is an externally focussed control

appears to trigger deception mechanisms, if the system is not also thought to be a

valuable management tool within the organisation. In one of the Trusts there was an

attempt to use and adapt an imposed, external system to make it of value to the

internal operations of the hospital; and deception mechanisms were not activated. In

other Trusts the internal impact of performance management was low. It was in these

Trusts that evidence was found that the ‘principled mechanism’ (Int. 9, CI 2), the

‘avoiding hassle mechanism’ (Int. 9, CI 3; Int. 6, CI 1) and the ‘inaccurate information’

mechanism (Int. 2, CI 1) were being triggered.

The lack of a multi-dimensional, balanced approach to performance

measurement appears to have triggered the ‘frustration’ mechanism and gaming

behaviours (Int. 4 & 16). Conversely, in Trusts where a balanced view was taken, this

mechanism was less likely to be triggered, perhaps because such an overview mitigates

the impact of unreliable information. For example, one respondent (Int. 8, CI 4 & 6)

said that managers no longer sought to disguise apparent budget overspends because a

corporate view was taken on the budget outturn; and individual over-spent budgets

were not necessarily penalised, as the over-spend might be due to system error.

CONCLUSIONS

This paper reports on an ongoing realist analysis of metric manipulation within

the public sector. Based upon a reconnaissance, conducted by reviewing the literature

and by initial empirical research, it identifies a number of context – mechanisms –

outcome configurations. The configurations provide plausible accounts of the causal

networks that lead to metric manipulation.

Realist research into public policy often favours quantitative methods for testing

CMOCs. The use of critical incidents in this study may add to the methodology of

realistic evaluation. They not only provide accounts of CMOCs in practice but also

give some insight into how the participants in the incidents subjectively experience

them. This offers the prospect of adding to Pawson and Tilley’s realistic evaluation

approach the element of experience that is a major component of realist epistemology.

Further research, using the expanded Delphi survey, will be conducted to test the

causal robustness of the configurations. The results of this research and analysis will

be used to identify forms of governance for, and ways of managing, performance

measurement systems that may diminish the propensity to manipulate metrics and

lower the level of deceit used.

References

Abernethy, M. A. and Chua, W. F. (1996) ‘A field study of control systems

‘redesign’: The impact of institutional processes on strategic choice’,

Contemporary Accounting Research, 13, 569-606.

Adcroft, A. and Willis, R. (2005) ‘The (un)intended outcome of public sector

performance measurement’, International Journal of Public Sector

Management, 18: 5, 386-400.

BBC News (Online) (2002a) Many ambulance staff ‘fiddling figures’. Available on

the World Wide Web. URL > http://news.bbc.co.uk/go/pr/fr/-

/1hi/health/3111150.stm<. Site accessed 20/09/2004.

BBC News (Online) (2002b) ‘NHS managers ‘fiddle figure’’, Available on the World

Wide Web. URL >http://www.bbc.co.uk/1/hi/health/2299291.stm<. Site

accessed 20/09/2004.

BBC News (Online) (2003a) ‘Advanced booking scrapped by GPs’, Available on the

World Wide Web. URL >http://www.bbc.co.uk/go/pr/fr/-

/hi/health/3102307.stm<. Site accessed 20/09/2004.

BBC News (Online) (2003b) ‘Reid warns GPs on NHS targets’, Available on the

World Wide Web. URL >http://www.bbc.co.uk/1/hi/health/3178674.stm<. Site

accessed 20/09/2004.

Ballantine, J., Brignall, S. and Modell, S. (1998) ‘Performance measurement and

management in public health services: a comparison of UK and Swedish

practice’ Management Accounting Research, 9, 71-94.

Brignall, S. and Modell, S. (2000) ‘An institutional perspective on performance

measurement and management in the ‘new public sector’’, Management

Accounting Research, 11, 281-306.

Carlisle, D. (2006) ‘Good management – How to steer clear of PbR gaming’, Health

Services Journal, 2nd March.

Chang, L-C. (2006) ‘Managerial responses to externally imposed performance

measurement in the NHS: An institutional theory perspective’, Financial

Accountability and Management, 22: 1, 63-85.

CIPD (Chartered Institute of Personnel and Development) (2005) Performance

Management: Survey Report, London: CIPD.

Collier, A. (1994) Critical Realism: An Introduction to Roy Bhaskar’s Philosophy,

London: Verso.

Corby, S., White, G., Millward, L., Meerabeau, E. and Druker, J. (2003) ‘Finding a

cure? Pay in England’s National Health Service’, Employee Relations, 25: 5,

502-516.

Dodds, W., Morgan, M., Wolfe, C. and Raju, K. S. (2004) ‘Implementing the 2-week

wait rule for cancer referral in the UK: general practitioners’ views and

practices’, European Journal of Cancer Care, 13, 82-87.

Flanagan, J. C. (1954) ‘The Critical Incident Technique’, Psychological Bulletin, 51: 4,

327-58.

Givan, R. K. (2005) ‘Seeing stars: human resources performance indicators in the

National Health Service’, Personnel Review, 34: 6, 634-647.

Grice, H. P. (1975a) ‘Logic and Conversation’ in P. Cole and J. L. Morgan (eds)

Syntax and Semantics, 3: Speech Acts, New York: Academic Press.

Grice, H. P. (1975b) ‘Logic and Conversation’ in D. Davidson and G. Harman (eds) The

Logic of Grammar, Encino, CA: Dickenson.

Harding, M-L. (2006) ‘Payment by results wide open to fraud, say commissioners’,

Health Services Journal, 116: 5994, 23rd February, 5.

Harradine, D. (2006) Negotiated orders and the role of accountancy: an ethnographic

study of a NHS Trust, Ph.D. dissertation, Nottingham: Nottingham Business

School, Nottingham Trent University.

Healthcare Commission (2005) Primary Care Trust: Survey of Patients, London:

Healthcare Commission.

Jensen, M. C. (2003) ‘Paying people to lie: the truth about the budgeting process’.

European Financial Management, 9: 3, 379-406.

McCornack, S. A (1992) ‘Information manipulation theory’, Communication

Monographs, 59, 1-16.

McCornack, S. A., Levine, T. R., Solowczuk, H. I., Torres, H. I. and Campbell, D. M.

(1992) ‘When the alteration of Information is viewed as deception: an empirical

test of information manipulation theory’, Communication Monographs, 59, 17-29.

Modell, S. (2001) ‘Performance measurement and institutional processes: a study of

managerial responses to public sector reform’, Management Accounting

Research, 12, 437-464.

Murray, M. G. and Millar, K. (1997) ‘Effects of situational variable on judgments

about deception and detection accuracy’, Basic and Applied Social Psychology,

19: 4, 401-410.

National Audit Office (2001). Inappropriate adjustments to NHS waiting lists. Report

by the Comptroller and Auditor General, HC 452 Session 2001-2. London: The

Stationery Office.

Oliver, C. (1991) ‘Strategic responses to institutional processes’, Academy of

Management Review, 16, 145-179.

Pawson, R. and Tilley, N. (1997) Realistic Evaluation, London: Sage.

Phillips, C. (1999) ‘A review of CCTV evaluations: Crime reduction effects and

attitudes towards its use’, Crime Prevention Studies, 10, 123-155.

PR Newswire Europe Ltd. (2000) NHS performance related pay “riddled with

racism” says MSFC/CPHVA report. Available on the World Wide Web at URL

>http;//www.prnewswire.co.uk/cgi/nrews/release?id=11351<. Site visited

13/02/06.

Radnor, Z. and Lovell, B. (2003) ‘Defining, justifying and implementing the balanced

scorecard in the National Health Service’, International Journal of Medical Marketing,

3: 3, 174-188.

Revill, J. (2003) ‘Whistleblower reveals NHS culture of secrecy’, The Observer, 26th

January: 4.

Sanderson, I (2001) ‘Performance management, evaluation and learning in ‘modern’

local government’, Public Administration, 79: 2, 297-313.

Stylianou, A. C., Winter, S., and Giacalone, R. A. (2004) ‘Accepting unethical

information practices: the interactive effects of individual and situational

factors’, paper presented at the Academy of Management Conference: OCIS

division, New Orleans August 2004.

Tilley, N. (2000) Realistic Evaluation: An Overview, paper presented at the Founding

Conference of the Danish Evaluation Society, September 2000. Available on the

World Wide Web at URL

>http://www.danskevalueingsselskab.db/pdf/Nick%20Tilley.pdf<, Site accessed

28/1/2006

Vakkuri, J. and Meklin, P. (2003) ‘The impact of culture on the use of performance

measurement information in the University setting’, Management Decision, 41:

8, 751-759.

Winter, S. J., Stylianou, A. C., Giacalone, R. A. (2004) ‘Individual Differences in the

Acceptability of Unethical Information Technology Practices: The Case of

Machiavellianism and Ethical Ideology’, Journal of Business Ethics, 53: 3, 275-296.

Yeung, L. N. T., Levine, T. R., Nishiyama, K. (1999) ‘Information manipulation

theory and perceptions of deception in Hong Kong’, Communication Reports,

12: 1, 1-13.

Figure 1. Levels of deceit in data and information manipulation

[Diagram. Forms of manipulation set out on a scale running from less dishonest to more dishonest, under the bands selective presentation, gaming and distortion, and labelled with the maxims of quantity, manner, relevance and quality. The forms shown are:]

Implication: misleading by anticipating how people will heuristically misinterpret selectively provided information.

Distraction (too much): misleading by hiding the truth amongst a mass of detail.

Economy with the truth (too little): not telling the whole truth so as to give a false impression.

System manipulation: changing practice so that a measured objective is achieved at the expense of an unmeasured one.

Manipulating information by re-classifying data: moving data between time periods or categories to create required performance numbers.

Lying: telling a known untruth.

Figure 2. Mechanisms, contexts and outcomes: Local Government Delphi Round 2 results

[Diagram. Vertical axis: propensity to manipulate (weighted mean % considering deception), from low to high, scaled from 3 to 18. Horizontal axis: level of deceit, from low to high (2 = selective presentation, 4 = gaming, then distortion). Mechanisms plotted: avoiding hassle; principled mechanism; frustration at inaccurate information; large bonus; small bonus; proportionate bonus.]

Figure 3. Mechanisms, contexts and outcomes: NHS Round 1 results

[Diagram. Vertical axis: propensity to manipulate (weighted mean % considering deception), from low to high, scaled from 3 to 18. Horizontal axis: level of deceit, from low to high (2 = selective presentation, 4 = gaming, then distortion). Mechanisms plotted: avoiding hassle; principled mechanism; frustration at inaccurate information; bidding for resources; maximising income; large bonus; small bonus; proportionate bonus. Contexts shown as triggers. General trigger: attitude of informal culture to data manipulation. Specific triggers: high embeddedness of performance management; low risk and low audit check with high rewards; external focus of PM system; lack of balance and integration in PMS.]

i This research is supported financially by the Chartered Institute of Management Accountants to whom the authors would like to express their thanks.
