Performance measurement and metric manipulation in the public sector
EGPA Conference 2006-07-25 SG VII Ethics & the Integrity of Governance
Colin Fisher
Bernadette Downes Nottingham Business School Nottingham Trent University
Contact details
Colin Fisher Professor of Managerial Ethics and values
Nottingham Business School Nottingham Trent University Burton Street Nottingham NG1 4BU Email: [email protected] Tel 0115 8482822
Bernadette Downes Research associate and senior lecturer
Nottingham Business School Nottingham Trent University Burton Street Nottingham NG1 4BU Email: [email protected] Tel 0115 8482401
ABSTRACT
This paper explores the circumstances that influence whether managers in the public
services manipulate the measurement information that is used to assess performance;
and if they do, what level of deception they might use. The realistic evaluation
approach is adopted. A Delphi survey and the collection of critical incidents through
interviews are used to identify possible configurations of contexts – mechanisms –
outcomes that provide possible explanations of information manipulation. A number
of these configurations are discussed. In a later stage of the project the findings will
be used to develop proposals for improved governance of performance measurement
systems in the public services.
INTRODUCTION
This paper reports on the circumstances that influence whether managers in the
public services manipulate the measurement information that is used to assess
performance; and if they do, what level of deception they might use. The research is
not focussed on the extent of such deception, nor upon the question of whether
performance measurement and target based systems of performance management are
effective in improving the performance of public sector organisations. Its purpose is
to identify the circumstances and mechanisms that trigger deception so as to propose
means of improving governance of performance management systems. The
widespread use of performance measures and targets within the UK public sector
(DoH 1997 & 2001, Givan 2005, Sanderson 2001, Smith 2005) makes research into
how deception might be discouraged important. This paper uses the results of an
initial empirical study into this question to propose an explanatory configuration of
contexts, mechanisms and outcomes that can be tested in a future research study.
REALISTIC EVALUATION
Pawson and Tilley (1997) developed a method of realistic evaluation designed
to evaluate public policies and programmes. We have adopted the approach in this
study. Realistic evaluation is based on realist ontology and epistemology and has
three elements, described below, that connect with each other in, what Pawson and
Tilley call, context-mechanism-outcome configurations (CMOC).
• Underlying mechanisms – these are the causal mechanisms that bring about the
events that people experience. There are many mechanisms that may
influence events, but they are likely to be hidden and not immediately
obvious to the observer.
• Contexts – the mechanisms are triggered by certain contexts but not by other
contexts.
• Outcomes – these are the desirable or undesirable consequences of the
mechanisms that have been triggered by the contexts.
Realist social scientists (Collier 1994: 42-45) refer to outcomes as events and add
a further element – experiences - that are people’s socially constructed and subjective
understandings of events. However experiences are not considered in realistic
evaluation. Pawson and Tilley argued that CMOCs should be proposed and
empirically tested. In Popperian manner, the CMOCs that survive the tests stand as
probable explanations of the programme outcomes being evaluated; until they too are
refuted by further research.
Tilley (2000) gives the example of evaluating the impact of installing CCTV in
car parks on the rate of car crime to illustrate the realistic evaluation process. One of
the possible mechanisms he labeled ‘the nosey parker mechanism’. People see the
CCTV cameras in a car park and this makes them feel secure, more people therefore
use the car park, which becomes busier. Because there are more people about the
criminals are deterred. However this mechanism will only be triggered in certain
contexts. If the car park is one that people use to park their cars while they are at work
then it will be busy at the start and end of the working day but will be quiet in the
middle of the day; and criminals will be able to steal cars and their contents
undisturbed. In this context CCTV will not reduce the amount of car crime. Other
contexts may well trigger other mechanisms that would lead to a reduction in crime
(Phillips 1999).
THE OBJECTIVES AND STRUCTURE OF THE PAPER
Realistic evaluation implies a three-stage approach. Firstly a reconnaissance is
done to identify potential mechanisms and contexts. Secondly possible configurations
of contexts, mechanisms and outcome are conjectured; and finally the conjectures are
empirically tested. The reconnaissance was conducted by reviewing the literature and
by undertaking semi-structured interviews within the National Health Service (NHS)
to collect critical incidents. The second stage utilises the literature review and critical
incidents but is also based on a Delphi questionnaire survey of local government. The
third stage, of which the first round is complete and the second round is currently
being undertaken, is based on a Delphi survey of NHS managers.
STAGE 1 IDENTIFYING OUTCOMES, MECHANISMS AND CONTEXTS
In this section the contexts, mechanisms and outcomes that are possibly relevant to
metric manipulation are identified.
Outcomes
Two outcomes relating to manipulation of performance measurement data and
information are proposed:
1. the propensity of managers and professionals to manipulate,
2. the level of deceit involved in the manipulation.
Figure 1 provides a framework for classifying the levels of deceit involved in
data manipulation. Its origins lie in the work of Grice (1975a, 1975b), a philosopher
who proposed that a cooperative principle is to be found in effective conversations.
The principle can be expressed through four maxims.
• Maxim of quantity – provide the right amount of information, neither too
much nor too little.
• Maxim of quality – provide true information.
• Maxim of relation – provide relevant information and be sensitive to the
balance and relationships between different information.
• Maxim of manner – be clear and brief, do not confuse by prolixity or other
devices or tropes. (This statement, as an example, neglects this maxim)
Of course people may choose to break these maxims and this gives rise to
different types of deception according to which maxim is flouted.
McCornack (1992) has developed a theory of information manipulation based on
Grice’s work. His subsequent empirical work (McCornack et al 1992) suggests that
violations of the four maxims are not all seen as exhibiting the same degree of
dishonesty. Breaking the quality maxim was seen as the most dishonest form of
deception; followed by relevance, clarity (manner) and quantity. These insights have
been used in figure 1 to produce a scale of dishonesty and deceptiveness. It needs to
be borne in mind that concepts of deception may be culturally specific. A replication
study (Yeung et al 1999) in Hong Kong found that violations of the maxims of
quantity and manner were not seen as deceptive whereas in McCornack et al’s
American study all four kinds of violation were seen as deceptive, albeit to differing
degrees. Nevertheless the Hong Kong findings reinforce the view that there are
different degrees of deception.
The three forms of deception, based on Grice’s four maxims, are described next.
Selective presentation
There are three forms of selective presentation that result from ignoring the
maxims of quality and manner. All share the common feature of presenting
information so that the recipient is likely to form an incorrect understanding. One way
is simply to hide significant information in a mass of trivial detail. A more
sophisticated form is termed by Grice conversational implicature. This refers to the
way that people use the interplay of the maxims to create an implication in the hearer
that is not explicit within the communicator’s words. An example concerns a report
on Ambulance Service response times published by the Commission for Health
Improvement (CHI). Ambulances are supposed to arrive at the most urgent calls
within eight minutes. A statement was made that three quarters of the Ambulance
Services met this target and that there had been a 20% improvement in performance
since 1999. This was no doubt true and so the maxim of quality had been obeyed. The
use of an aggregate figure and the emphasis given to the rate of improvement were
intended to misdirect the reader of the original statement from what others saw as a
very patchy and disappointing result. As the acting Chief Executive of the CHI
admitted,
While we feel that targets have been helpful in focussing the minds of
ambulance Trust staff on improving performance, we believe that more
sophisticated measures of response times and outcome measures are needed.
(BBC News Online 2002a)
The Consumer Association was blunter; saying the CHI report was short on detail and
failed to tackle crucial issues (BBC News Online 2002a). Implicature may underwrite
much of what has become known in politics as ‘spin’.
A commonly reported form of selective presentation in the public sector
literature is being economical with the truth, a phrase re-invented by the UK Cabinet
Secretary, Sir Robert Armstrong, during the Australian 'Spycatcher' trial in 1986. One
interviewee identified a clear example; it concerned the two-week target for the time
between a GP referring a patient to hospital for suspected cancer and the patient being
seen by a hospital specialist (Dodds et al 2003). The hospital staff were aware that
there could be a delay between the GP seeing the patient and sending a referral letter
to the hospital but the clock for the two-week target only started ticking when the
referral letter was received by the hospital. This, and other similar factors, meant that
the official measurements did not reflect the patients’ experiences. However to admit
this would make the hospital’s performance look worse than the official statistics
stated; and so there was an unspoken agreement not to raise the issue; to be
economical with the truth.
Gaming
Managers seek benefits by gaming, which is taking advantage of the loopholes
in the rules and systems under which they operate. One form of gaming exploits the
lack of an overview of an organisation’s performance and concentrates on achieving a
high profile target by neglecting other, perhaps equally important matters, which are
either not measured or are the subject of a low profile measure. An example of this
type of gaming can be seen in the consequences of the government’s introduction of a
target that all GPs’ patients should be seen within 48 hours. The intention of such a
target, it is to be supposed, is to improve services to patients. However Primary Care
Trusts (PCTs) and general practices often responded to this target by scrapping
advanced booking and requiring patients to phone in early on the day they wanted the
appointment for (BBC News Online 2003a and 2003b). The result was that the
targets were met, but in the eyes of many patients the service was made worse (Health
Care Commission 2005:12-13).
Distortion
There can be two forms of distortion: misclassifying things, thereby taking
advantage of the margin of error that categorisation always allows, and, more
seriously, lying or falsifying data. The National Audit Office’s (2001) investigation
into ‘inappropriate adjustments to NHS waiting lists’ reported on several cases where
waiting list information had been adjusted by re-classifying patients. Examples were
delaying the addition of patients to waiting lists and inappropriate suspension of
patients.
There is less evidence in the public domain of deliberate lying connected with
NHS performance measurement. However there are accounts from occasional
whistleblowers. One such, Ian Perkin, alleged that in the Trust he worked in staff
inputting data into the information system were being asked to insert a figure of zero
in the cancelled operations field when there had in fact been 28 cancellations (Revill
2003).
Mechanisms
A number of mechanisms were identified from the literature that might
encourage managers and professionals to manipulate data and information.
1. The avoiding hassle & scrutiny mechanism
If a manager, their team or their unit, is assessed as having performed very well or
very badly on a measure or against a target they may become subject to special
scrutiny. If they have performed badly this will lead to particularly close scrutiny, and
audit. A manager or professional may manipulate the measures to avoid such
unwelcome attention. This might also be called the ‘feed the beast’ mechanism, which
is a catchphrase that Harradine (2006) reports was popular amongst the senior
management team of an NHS Trust. It means providing whatever information to the
next level of the NHS that is necessary to avoid scrutiny.
2. The principled mechanism
The principled mechanism is one in which people believe that the wrong things
are being measured. One possibility is that important things, such as clinical quality
and outcomes, may be relatively ignored whereas less important things, such as
administrative performance, are closely measured and the subject of targets. People
may, alternatively, believe that the information is wrong because it is incomplete.
Within the NHS for example aggregate waiting list information may not discriminate
between cases of high and low clinical need and value; so that Trusts are encouraged
to do many quick and cheap operations, to reduce waiting times, whilst delaying
complex and expensive cases. If people think the wrong measures distort priorities they may
believe it proper to manipulate them in order to achieve a better quality of service.
3. The frustration at inaccurate data and information mechanism
People may believe that the performance measures that they are assessed against
are inaccurate. This may be due to miscoding when the information is input. Another
possibility is that by the time the information is used to assess performance it is out of
date. A lack of faith in the performance measures, especially where they indicate a
weak performance, often becomes a justification in people’s minds for deception.
4. Large bonus triggered mechanism
Jensen (2003) has pointed out that a performance related pay (PRP) system that
is based on the proportion of a target achieved (for example – no bonus if target is not
reached, cost of living increase only if it is reached and a large bonus if it is exceeded)
encourages managers to nudge metrics that are near a bar point up to that
point if the bonus that is triggered is a large proportion of total remuneration. That a
small reported increase in performance can result in a disproportionately large
increase in bonus encourages deception.
5. Small bonus triggered mechanism
The mechanism is the same as for mechanism 4 but the bonus is a small
percentage of total remuneration.
6. Bonus in proportion to achievement mechanism
In this PRP mechanism the size of any bonus is based on a straight line
relationship between performance and reward (in which an x% increase in
performance results in an x% increase in reward – as in the old fashioned, time study
based, incentive schemes).
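The difference between mechanisms 4, 5 and 6 can be illustrated with a rough sketch. It compares the extra reward gained by reporting a performance of 101 rather than 100 under a threshold scheme of the kind Jensen describes and under a straight line scheme. All names and figures here (a target of 100, a cost of living increase of 2, bonuses of 20 and 3, a baseline reward of 10) are hypothetical and chosen only for illustration; none are drawn from the study.

```python
def threshold_bonus(performance, target, cost_of_living, bonus):
    """Threshold PRP scheme: nothing below target, a cost of living
    increase exactly at target, and a bonus once the target is exceeded."""
    if performance < target:
        return 0
    if performance == target:
        return cost_of_living
    return bonus


def proportional_reward(performance, baseline_performance, baseline_reward):
    """Straight line scheme: an x% rise in performance gives an x% rise in reward."""
    return baseline_reward * performance / baseline_performance


# Hypothetical figures, not taken from the paper.
TARGET = 100
COST_OF_LIVING = 2
honest, nudged = 100, 101  # true figure versus a slightly inflated report

# Mechanism 4: the bonus triggered is large.
large_gain = (threshold_bonus(nudged, TARGET, COST_OF_LIVING, 20)
              - threshold_bonus(honest, TARGET, COST_OF_LIVING, 20))

# Mechanism 5: same scheme, but the bonus triggered is small.
small_gain = (threshold_bonus(nudged, TARGET, COST_OF_LIVING, 3)
              - threshold_bonus(honest, TARGET, COST_OF_LIVING, 3))

# Mechanism 6: reward rises in proportion to performance.
proportional_gain = (proportional_reward(nudged, 100, 10)
                     - proportional_reward(honest, 100, 10))

print(f"gain from nudging, large threshold bonus: {large_gain}")
print(f"gain from nudging, small threshold bonus: {small_gain}")
print(f"gain from nudging, proportional scheme:   {proportional_gain:.2f}")
```

Under these assumed figures the same one-point change in the reported metric is worth far more under the large threshold bonus than under either of the other two schemes, which is the asymmetry that the large bonus mechanism relies on.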
At this stage of the research the intention was to identify as wide a range of
mechanisms as possible. This means that some of the mechanisms may not be
commonly present in public sector organisations (CIPD 2005: 8). Mechanisms 4, 5 &
6 for example all refer to performance related pay (PRP) which is still relatively
uncommon within the public sector. Despite NHS Trusts being given from their
foundation the right to establish local pay systems only a minority did; and of those
few developed PRP systems (Corby et al 2003). Where PRP schemes have been
introduced they have often awarded only a small percentage of discretionary pay on
the basis of competency rather than on measured performance or outcomes assessed
against targets (PR Newswire Europe Ltd. 2000).
The research interviews conducted within the NHS have identified two further
mechanisms.
7. ‘Seeking additional resources’ mechanisms.
This mechanism operates in two ways. Interviewees from hospital Trusts
reported that information was often presented so as to make the situation look worse
than it was, to bolster a case for additional resources. At Trust Board level, contrarily,
the intention was more often to make things look better than they were, to earn more
income through the recently introduced payment by results (PbR) system. A survey,
conducted by the NHS Alliance, of Health Commissioners reported that 67% of those
commissioning under the new PbR system had concrete evidence of gaming by
service providers while 53% were suspicious that gaming was happening but could
not get the information from providers to investigate (Harding 2006). Carlisle (2006)
provides a list of the types of game that might be played as PbR comes fully into
operation.
Contexts
A number of contextual factors have been identified in the literature that may
trigger the mechanisms discussed in the previous section.
The balance of benefits, risks and sanctions
The research of Murray and Millar (1997), from within the psychology
discipline, into the situational factors that make people more likely to think that they
are being deceived, found that people will suspect deception if:
• the benefit to the communicator is high
• the chance of the deception being detected is low, and
• the cost to the deceiver if detected is low.
The obverse of this finding is that potential high rewards coupled with a perception of
low probability of being caught, and few damaging consequences if caught, could act
as triggers for deception.
Embeddedness of performance measurement
Performance management can be said to be highly embedded in an organisation
if it is central to such activities as arguing for priorities or resources or for obtaining
rewards, such as recognition or performance related pay. The more performance
measurement is central to these purposes the greater the likelihood that the deception
mechanisms will be triggered.
Organisational response to externally driven performance measurement
The structure of ministerial accountability, through Parliament, for public
services has resulted in central government departments using a system of centrally
imposed performance measures, targets and assessment regimes to control local
authorities and health Trusts. Organisations subject to such monitoring may respond
in two different ways. Modell (2001: 458) studied the responses of senior managers in
the Norwegian health system to institutional pressures to conform to government’s
expectations. He used Oliver’s (1991) framework of managerial responses that was
scaled from acquiescence to manipulation. He found that management in some
hospitals sought to adapt an imposed performance management system to improve
operational efficiency as well as to provide government with the information it needed
to legitimate its policies. He pointed out however that they only responded in this way
if crude financial performance measures (often acceptable to politicians (Abernethy
and Chua 1996)) were replaced by a more sophisticated performance measurement
system acceptable to both hospital management and politicians. Such a
development is likely to reduce the probability of metric manipulation. However
Chang’s (2006: 74) research identified that an externally imposed measurement
system may have little impact on an organisation below its senior management level.
Where medical professionals’ interests were not associated with the demands of
government’s performance measures the chances of metric manipulation are increased
(Chang 2006: 75).
The use of a balanced, multi-dimensional performance measurement system
Multi-dimensional performance measurement systems, most commonly
associated with the Balanced Score Card (Kaplan and Norton 1996), seek to ensure
that achievement against one target is not achieved at the cost of a poorer performance
against other targets. The balanced score card is the basis of Government’s oversight
of the NHS, and underlies the Performance Assessment Frameworks (Radnor and
Lovell 2003: 181), but it is not necessarily the basis of internal performance
management within Trusts. Indeed it has been argued that the necessarily complex
nature of public service performance measurement makes it harder to take a balanced
view:
As services are broken down and deconstructed into ever smaller components
the less the performance of the whole service is being measured.
Adcroft and Willis (2005: 394)
Some forms of metric manipulation, such as gaming, are only possible where a
balanced view of performance is not taken. A multi-dimensional system should,
because it allows more sophisticated and transparent trade-offs between priorities,
diminish the tendency to focus on single high priority targets at the expense of other
objectives. Such an approach should avoid triggering deception mechanisms.
The attitude of organisational informal culture towards metric manipulation
A final context that may trigger deceit is an acceptance or toleration of metric
manipulation within an organisation’s informal culture. Stylianou et al (2004)
identified occupational values and cultures as factors that influence the likelihood of
people acting unethically. Put simply they found that some occupational groups do
not regard some formally unethical practices as wrong. As Jensen (2003) pointed out
organisations may develop a culture in which ‘managing the numbers’ is regarded as
acceptable. Vakkuri and Meklin (2003: 757) have developed a conceptual framework
for the development of performance management in Universities, which, like
hospitals, are knowledge based organisations. They identified a number of
ambiguities in universities, between the academic culture’s perceptions of the
objectives of a performance management system and the system’s formal objectives,
that might lead to the misuse of the system:
When people use PM systems, appropriate them, resist them or politicize with
them they are strongly influenced by the cultural conditions of their working
environments.
(Vakkuri and Meklin 2003: 756)
STAGE 2 CONJECTURING CONFIGURATIONS: CONTEXTS,
MECHANISMS AND OUTCOMES
Research Methods
The causal connections, if any, between mechanisms, contexts and outcomes
were studied using two main techniques.
• A questionnaire based Delphi technique.
• Semi-structured interviews using the critical incident technique.
The purpose of the Delphi technique was to obtain a consensus of opinion from a
panel of experts as part of the reconnaissance stage of the research. Two Delphi
studies were undertaken. The first of these used a panel of 20 local government
experts (judged by seniority and length of service) that was formed by snowball
sampling. It never met but its members were sent a questionnaire to complete in
which the mechanisms discussed earlier were presented as a series of scenarios. The
panel members were asked how likely they thought people in their organisations
were to respond to the scenarios:
• by engaging in metric manipulation,
• and what level of deception they might use,
• and in which circumstances they might deceive.
Mechanisms 7 & 8 were not included because they were identified after the Delphi
questionnaire had been designed. The results were analysed and then reported back to
the panel members who were invited to reconsider their opinions in the light of their
peers’ views. Sixteen responded to the first round of the survey and twelve to the
second round, which produced a consensus opinion on most issues.
The second Delphi survey of NHS managers used an improved questionnaire
that included the two new scenarios; in other respects it was the same as that used
with the local government panel. A panel of 30 NHS managers was formed and 16
replies were received in the first round. The second round of the survey is currently
being conducted. The findings from the NHS panel used in this paper are those from
the first round only and so must be treated as provisional.
The Delphi survey can only provide a broad brush picture derived from answers
to hypothetical questions. The semi-structured interviews compensated for this
limitation by asking for specific examples. During the interviews respondents were
asked, by use of the critical incident technique (Flanagan 1954), to describe examples
of metric manipulation that they had observed. At this stage of the project 31
interviews have been completed within five NHS Trusts. The research project has
received ethical approval from an NHS research ethics committee and research and
development approval has been obtained from each of the Trusts within which
research has been conducted.
Findings
The purpose in this section is to use the research findings to identify context –
mechanism – outcome configurations (CMOC). One particular critical incident can be
used to illustrate a configuration and show how it may operate in
practice.
The respondent was a nurse manager in a hospital Trust. She, and her
colleagues, recognised that the performance measurement system had been
instigated to provide government with a means of monitoring and controlling
Trusts. However they developed it so that it could also be used as an aid to
management within the Trust. The informal culture of the Trust was not
sympathetic to metric manipulation.
The quality of care for long term patients was an important issue in the
Department of Health’s agenda. One of the statistics used to measure this was
the hospital re-admission rate. A low re-admission rate was thought to be a
good proxy measure for the quality of long-term care. Statistics from the
primary care Trust (PCT) were sent to the respondent that showed the target of
reducing the re-admission rate by 1% had been met. She was sceptical because
her own monitoring showed a much less positive performance. However she
recognised that when two Trusts were calculating the same performance
measure using different systems and data sets it was likely that the various
figures would not tally. The sensible thing, she thought, was to contact the
PCT and propose a meeting to reconcile the figures and produce some reliable
information. She was rather shocked when the PCT replied that a meeting
would not be worthwhile. After discussing it with the PCT manager she
formed the impression that the PCT was not interested in using the measures
to improve the internal operations of the service. They were interested only in
presenting ‘good news’ to the Department of Health. If their figures gave good
news they saw no point in challenging them; and perhaps converting good
news into bad news. This would only result in additional interference and
questioning from those bodies higher up the NHS hierarchy.
Refusing to review some information when it might be necessary is
being economical with the truth, a form of selective presentation. The PCT
had an informal culture that accepted metric manipulation and saw performance
measurement as an externally focussed system. These two contexts seem to have
triggered the ‘avoidance of hassle’ mechanism which resulted in a willingness
to selectively present information. The respondent however worked in a Trust
where the informal culture disapproved of manipulation; and there was a more
proactive approach to performance measurement. In these circumstances the
‘avoiding hassle’ mechanism was not triggered and the respondent would have
preferred to review the statistics.
Relationships between mechanisms and outcomes and contexts
The results from the local government Delphi survey are presented in table 1
and provide information on the relationship between mechanisms and outcomes. They
show the propensity to deceive, and the level of deceit people are thought to be
prepared to use, in each of the scenarios presented in the questionnaire.
Table 1. Selected results from the local government Delphi survey, rounds 1 & 2
All figures are % of responses. Columns 2 to 5 report the level of deceit; columns 6 to 9 report the propensity to manipulate.

| Mechanism | Selective presentation | Gaming | Distortion | Level of deceit (wgt. av.) | A possibility | A serious possibility | A near certainty | Propensity (wgt. av.) |
|---|---|---|---|---|---|---|---|---|
| 1. The avoiding hassle & scrutiny mechanism | 75 (100) | 6 (0) | 19 (0) | 2 | 22 (20) | 9 (7) | 3 (3.5) | 8.2 |
| 2. The principled mechanism | 81 (100) | 6 (0) | 12.5 (0) | 2 | 19 (18.5) | 14 (10) | 5 (3) | 10.3 |
| 3. The frustration at inaccurate data and information mechanism | 31 (62.5) | 38 (37.5) | 31 (0) | 1.8 | 21 (23.5) | 24 (18) | 8.5 (6) | 15.8 |
| 4. Large bonus triggered mechanism | 31.5 (12.5) | 31 (62.5) | 37.5 (25) | 4.8 | 18 (22.5) | 17 (12.5) | 8 (6.5) | 12.7 |
| 5. Small bonus triggered mechanism | 37.5 (25) | 37 (75) | 25 (0) | 3.5 | 20.5 (24) | 14.5 (9) | 6.5 (5) | 11.5 |
| 6. Bonus in proportion to achievement mechanism | 62 (50) | 25 (50) | 13 (0) | 3 | 20 (23) | 11.5 (14) | 6.5 (5) | 10.4 |
• The weighted averages of level of deceit are based on a scale on which: hiding information = 1, economy with truth = 2, gaming = 4, misclassification = 8 and distortion = 16. These values are then weighted by the % of respondents selecting them
• The weighted averages for propensity are based on the following weights: possibility = 1; serious possibility = 2; near certainty = 3.
• The main figures are the results from the first round of the survey and the figures in parentheses are the results of the second round.
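The weighted propensity figures in table 1 can be reproduced with a short calculation. The sketch below applies the stated weights (possibility = 1, serious possibility = 2, near certainty = 3) to the first round percentages. Dividing the weighted sum by the sum of the weights (6) is an assumption: the paper does not state the divisor, but it is the one that matches the published figures after rounding. The function and variable names are illustrative only.

```python
# Weights from the table notes: possibility = 1, serious possibility = 2,
# near certainty = 3.
WEIGHTS = (1, 2, 3)


def weighted_propensity(possibility, serious, near_certainty):
    """Weighted average of the three response percentages.

    Assumption: the weighted sum is divided by the sum of the weights (6);
    this divisor is inferred from the published figures, not stated in the paper.
    """
    values = (possibility, serious, near_certainty)
    weighted_sum = sum(w * v for w, v in zip(WEIGHTS, values))
    return weighted_sum / sum(WEIGHTS)


# Round 1 percentages from table 1 (local government panel):
# (a possibility, a serious possibility, a near certainty).
round1 = {
    "1. Avoiding hassle & scrutiny": (22, 9, 3),
    "2. Principled": (19, 14, 5),
    "3. Frustration at inaccurate data": (21, 24, 8.5),
    "4. Large bonus": (18, 17, 8),
    "5. Small bonus": (20.5, 14.5, 6.5),
    "6. Bonus in proportion to achievement": (20, 11.5, 6.5),
}

results = {name: weighted_propensity(*pcts) for name, pcts in round1.items()}
for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

The level of deceit averages cannot be checked in the same way from the table alone, because the table reports three categories while the deceit scale has five points.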
From the statistics in table 1 configurations linking mechanisms and outcomes
can be defined. They are shown in figure 2. The axes in figure 2 represent the
outcomes of metric manipulation in terms of propensity to deceive and the level of
deceit. The mechanisms of ‘principled objection’ and ‘avoiding hassle’ lead to a
relatively low propensity to manipulate and to the use of low levels of deceit. The two
PRP mechanisms, which bring small financial rewards, cause a low propensity to
manipulate but can involve gaming as well as selective presentation. The ‘frustration
at inaccurate information’ mechanism leads to a higher propensity to manipulate.
This manipulation is largely selective presentation but gaming can also be triggered
and (according to the first round results) deception in the form of misclassification of
data. The Delphi panel thought that PRP systems that paid large bonuses would (if
they were to be commonly used) lead to both a high propensity to manipulate and the
use of distortion, the highest level of deceit.
Selected findings from round 1 of the NHS Delphi survey are shown in table 2
and are represented schematically in figure 3.
Table 2. Selected results from NHS Delphi survey: round 1
All figures are % of responses. Columns 2 to 5 report the level of deceit; columns 6 to 9 report the propensity to manipulate.

| Mechanism | Selective presentation | Gaming | Distortion | Level of deceit (wgt. av.) | A possibility | A serious possibility | A near certainty | Propensity (wgt. av.) |
|---|---|---|---|---|---|---|---|---|
| 1. The avoiding hassle & scrutiny mechanism | 62.5 | 25 | 12.5 | 4 | 14 | 14 | 3.5 | 8.75 |
| 2. The principled mechanism | 62.5 | 19 | 19 | 3.4 | 21 | 8 | 7.5 | 9.9 |
| 3. The frustration at inaccurate data and information mechanism | 38 | 38 | 25 | 4.8 | 17.5 | 25 | 5.5 | 14 |
| 4. Large bonus triggered mechanism | 56 | 19 | 25 | 4.6 | 10 | 7 | 12.5 | 10.25 |
| 5. Small bonus triggered mechanism | 69 | 12.5 | 19 | 3.2 | 17 | 6 | 10 | 9.8 |
| 6. Bonus in proportion to achievement mechanism | 63 | 25 | 12 | 3.5 | 15 | 8.5 | 8 | 9.3 |
| 7. The bidding for resources mechanism | 69 | 12.5 | 19 | 3.4 | 26 | 22 | 10.5 | 17 |
| 8. The maximising income mechanism | 63 | 19 | 19 | 3.5 | 21 | 15.5 | 4 | 10.4 |
• The weighted averages of level of deceit are based on a scale on which: hiding information = 1, economy with truth = 2, gaming = 4, misclassification = 8 and distortion = 16. These values are then weighted by the % of respondents selecting them.
• The weighted averages of propensity are based on the following weights: possibility = 1; serious possibility = 2; near certainty = 3.
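The propensity averages in table 2 can be checked with a few lines of arithmetic. The sketch below is an illustration, not the authors' own calculation. It assumes that the weighted sum is divided by the sum of the weights (1 + 2 + 3 = 6), a normalisation that the note does not state but that reproduces the reported averages for rows 1 to 7 after rounding; row 8 computes to about 10.7 against the reported 10.4. The function name `weighted_propensity` is hypothetical.

```python
# Weights taken from the table note: possibility = 1,
# serious possibility = 2, near certainty = 3.
WEIGHTS = (1, 2, 3)


def weighted_propensity(possibility, serious, near_certainty):
    """Weighted average of the three propensity percentages.

    Assumption (not stated in the paper): the weighted sum is divided
    by the sum of the weights, i.e. by 6.
    """
    shares = (possibility, serious, near_certainty)
    weighted_sum = sum(w * s for w, s in zip(WEIGHTS, shares))
    return weighted_sum / sum(WEIGHTS)


# Row 1 of table 2 (avoiding hassle & scrutiny): 14, 14 and 3.5 per cent.
row_1 = weighted_propensity(14, 14, 3.5)
# Row 7 of table 2 (bidding for resources): 26, 22 and 10.5 per cent.
row_7 = weighted_propensity(26, 22, 10.5)
print(row_1, round(row_7, 1))
```

Because the 'near certainty' responses carry three times the weight of the 'possibility' responses, a mechanism can score highly on this measure even when few respondents chose the strongest category, as with the bidding for resources mechanism.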
The results show some differences when compared with those of the local
government panel. The ‘frustration at inaccurate information’ mechanism is still
associated with a relatively high propensity to deceive but shows an increased
tendency to use more serious forms of deceit. The ‘avoiding hassle’ mechanism
generates a low propensity to deceive, as is the case with the local government
managers, but the NHS panel linked it with a higher level of deception. The
NHS panel thought, as did the local government managers, that PRP schemes that
produced large bonuses would lead to a high level of deceit and a high propensity to
deceive. The results on the two new scenarios/mechanisms introduced into the NHS
survey were interesting. The ‘bidding for resources’ mechanism was considered to
lead to a relatively very high propensity to manipulate, although only low levels of
deceit would be used. The ‘maximising income’ mechanism, it was considered, would
lead to similar forms of deceit being used, but managers would be less likely to use
them.
The results from the two Delphi surveys begin to identify the areas in which the
risks of metric manipulation are greatest and in which efforts at improving
governance should be directed. In the next section an attempt will be made to identify
the contexts that may trigger the mechanisms.
Contexts as triggers for mechanisms
This section uses both the results from the Delphi surveys and from the
interviews to identify which contexts may trigger which mechanisms. The conjectured
relationships between contexts and mechanisms are shown in figure 3 by the vertical
arrows.
The most important trigger, according to the responses to the Delphi
questionnaire (table 3), would appear to be the informal culture of an organisation.
More than 60% of both panels thought that acceptance of deceit by the informal
culture was an important or very important trigger for all the mechanisms.
Table 3. The relative importance of factors bearing on a decision to manipulate information
Q. How important are each of the following factors when people decide whether or not they will manipulate performance data and information? Allocate a total of 10 points between the following so that the most important get the largest number and so on.
1. Acceptability or unacceptability of information manipulation within the organisation’s informal culture and values.
2. Whether the desired effect can be achieved by slight manipulations that might not be considered so bad, or whether it would require serious wrongdoing.
3. The risk of being caught.
4. The severity or otherwise of the disciplinary consequences if caught manipulating data or information.
[The panels’ point allocations are not legible in the source.]
Three of the Trusts in which interviews were conducted provided interesting
comparisons. In one Trust there was an acceptance of manipulation;
“here we go fiddling the figures again, then again if it makes us look better –
other Trusts do it”
(Int.14, see also Int. 2)
In another the attitude was that it was better to face the issues,
“We don’t want to hit the target and miss the point”.
(Int 15)
In a third Trust it was reported that there had been a change in the culture following a
change in Chief Executive. Under a former chief executive people manipulated the
figures out of fear of his anger if targets were not met (Harradine 2006). Instances of
selective presentation, gaming and distortion in this Trust were reported by the
interviewees quoted above. Such practices had become less acceptable under the new
leadership.
Table 3 shows the risk of being caught, considered in the balance with the
potential benefits if not caught, to be a factor in deciding whether to manipulate
information. In some Trusts the level of auditing of performance information was
thought to be low; ‘the chances of being caught if you are clever enough are low’
(Int. 16). In others the level of audit was thought high enough to dissuade people (Ints
16 & 11). Where audit systems have not been set up, distortion is more likely (Int. 13,
CI 1).
It might have been thought that tough auditing of performance information was
associated with performance management being highly embedded in an organisation.
The interviews suggested this is not necessarily the case. A combination of a low
level of audit and a high degree of embeddedness triggered selective presentation and
gaming, especially in relation to bids for resources and budgets (Int. 16 & 12). As far
as personal appraisals were concerned the interviews suggested that in several Trusts
performance measurement was not embedded. Many appraisal schemes were based
on assessing people’s performances against competence standards and qualitatively
expressed objectives rather than against measured targets. In such circumstances the
deception mechanisms are unlikely to be triggered.
The belief that performance measurement is an externally focussed control
appears to trigger deception mechanisms, if the system is not also thought to be a
valuable management tool within the organisation. In one of the Trusts there was an
attempt to use and adapt an imposed, external system to make it of value to the
internal operations of the hospital; and deception mechanisms were not activated. In
other Trusts the internal impact of performance management was low. It was in these
Trusts that evidence was found that the ‘principled mechanism’ (Int. 9, CI 2), the
‘avoiding hassle mechanism’ (Int. 9 CI 3, Int 6, CI 1) and the ‘inaccurate information’
mechanism (Int 2 CI 1) were being triggered.
The lack of a multi-dimensional, balanced approach to performance
measurement appears to have triggered the ‘frustration’ mechanism and gaming
behaviours (Int 4 & 16). Conversely, in Trusts where a balanced view was taken, this
mechanism was less likely to be triggered, perhaps because such an overview mitigates
the impact of unreliable information. For example, one respondent (Int. 8, CI 4 & 6)
said that managers no longer sought to disguise apparent budget overspends because a
corporate view was taken on the budget outturn, and individual over-spent budgets
were not necessarily penalised, as the over-spend might be due to system error.
CONCLUSIONS
This paper reports on an ongoing realist analysis of metric manipulation within
the public sector. Based upon a reconnaissance, conducted by reviewing the literature
and by initial empirical research, it identifies a number of context – mechanisms –
outcome configurations. The configurations provide plausible accounts of the causal
networks that lead to metric manipulation.
Realist research into public policy often favours quantitative methods for testing
CMOCs. The use of critical incidents in this study may add to the methodology of
realistic evaluation. They not only provide accounts of CMOCs in practice but also
give some insight into how the participants in the incidents subjectively experience
them. This offers the prospect of adding to Pawson and Tilley’s realistic evaluation
approach the element of experience that is a major component of realist epistemology.
Further research, using the expanded Delphi survey, will be conducted to test the
causal robustness of the configurations. The results of this research and analysis will
be used to identify forms of governance for, and ways of managing, performance
measurement systems that may diminish the propensity to manipulate metrics and
lower the level of deceit used.
References
Abernethy, M. A. and Chua, W. F. (1996) ‘A field study of control systems
‘redesign’: The impact of institutional processes on strategic choice’,
Contemporary Accounting Research, 13, 569-606.
Adcroft, A. and Willis, R. (2005) ‘The (un)intended outcome of public sector
performance measurement’, International Journal of Public Sector
Management, 18: 5, 386-400.
BBC News (Online) (2002a) ‘Many ambulance staff ‘fiddling figures’’, Available on
the World Wide Web. URL >http://news.bbc.co.uk/go/pr/fr/-/1hi/health/3111150.stm<.
Site accessed 20/09/2004.
BBC News (Online) (2002b) ‘NHS managers ‘fiddle figure’’, Available on the World
Wide Web. URL >http://www.bbc.co.uk/1/hi/health/2299291.stm<. Site
accessed 20/09/2004.
BBC News (Online) (2003a) ‘Advanced booking scrapped by GPs’, Available on the
World Wide Web. URL >http://www.bbc.co.uk/go/pr/fr/-/hi/health/3102307.stm<.
Site accessed 20/09/2004.
BBC News (Online) (2003b) ‘Reid warns GPs on NHS targets’, Available on the
World Wide Web. URL >http://www.bbc.co.uk/1/hi/health/3178674.stm<. Site
accessed 20/09/2004.
Ballantine, J., Brignall, S. and Modell, S. (1998) ‘Performance measurement and
management in public health services: a comparison of UK and Swedish
practice’ Management Accounting Research, 9, 71-94.
Brignall, S. and Modell, S. (2000) ‘An institutional perspective on performance
measurement and management in the ‘new public sector’’, Management
Accounting Research, 11, 281-306.
Carlisle, D. (2006) ‘Good management – How to steer clear of PbR gaming’, Health
Services Journal, 2nd March.
Chang, L-C. (2006) ‘Managerial responses to externally imposed performance
measurement in the NHS: An institutional theory perspective’, Financial
Accountability and Management, 22: 1, 63-85.
CIPD (Chartered Institute of Personnel and Development) (2005) Performance
Management: Survey Report, London: CIPD.
Collier, A. (1994) Critical Realism: An Introduction to Roy Bhaskar’s Philosophy,
London: Verso.
Corby, S., White, G., Millward, L., Meerabeau, E. and Druker, J. (2003) ‘Finding a
cure? Pay in England’s National Health Service’, Employee Relations, 25: 5,
502-516.
Dodds, W., Morgan, M., Wolfe, C. and Raju, K. S. (2004) ‘Implementing the 2-week
wait rule for cancer referral in the UK: general practitioners’ views and
practices’, European Journal of Cancer Care, 13, 82-87.
Flanagan, J. C. (1954) ‘The Critical Incident Technique’, Psychological Bulletin, 51,
327-58.
Givan, R. K. (2005) ‘Seeing stars: human resources performance indicators in the
National Health Service’, Personnel Review, 34: 6, 634-647.
Grice, H. P. (1975a) ‘Logic and Conversation’ in P. Cole and J. L. Morgan (eds)
Syntax and Semantics, 3: Speech Acts, New York: Academic Press.
Grice, H. P. (1975b) ‘Logic and Conversation’ in D. Davidson and G. Harman (eds) The
Logic of Grammar, Encino, CA: Dickenson.
Harding, M-L. (2006) ‘Payment by results wide open to fraud, say commissioners’,
Health Services Journal, 116: 5994, 23rd February, 5.
Harradine, D. (2006) Negotiated orders and the role of accountancy: an ethnographic
study of a NHS Trust, Ph.D. dissertation, Nottingham: Nottingham Business
School, Nottingham Trent University.
Healthcare Commission (2005) Primary Care Trust: Survey of Patients, London:
Healthcare Commission.
Jensen, M. C. (2003) ‘Paying people to lie: the truth about the budgeting process’.
European Financial Management, 9: 3, 379-406.
McCornack, S. A. (1992) ‘Information manipulation theory’, Communication
Monographs, 59, 1-16.
McCornack, S. A., Levine, T. R., Solowczuk, H. I., Torres, H. I. and Campbell, D. M.
(1992) ‘When the alteration of information is viewed as deception: an empirical
test of information manipulation theory’, Communication Monographs, 59,
17-29.
Modell, S. (2001) ‘Performance measurement and institutional processes: a study of
managerial responses to public sector reform’, Management Accounting
Research, 12, 437-464.
Murray, M. G. and Millar, K. (1997) ‘Effects of situational variable on judgments
about deception and detection accuracy’, Basic and Applied Social Psychology,
19: 4, 401-410.
National Audit Office (2001). Inappropriate adjustments to NHS waiting lists. Report
by the Comptroller and Auditor General, HC 452 Session 2001-2. London: The
Stationery Office.
Oliver, C. (1991) ‘Strategic responses to institutional processes’, Academy of
Management Review, 16, 145-179.
Pawson, R. and Tilley, N. (1997) Realistic Evaluation, London: Sage.
Phillips, C. (1999) ‘A review of CCTV evaluations: Crime reduction effects and
attitudes towards its use’, Crime Prevention Studies, 10, 123-155.
PR Newswire Europe Ltd. (2000) NHS performance related pay “riddled with
racism” says MSFC/CPHVA report. Available on the World Wide Web at URL
>http;//www.prnewswire.co.uk/cgi/nrews/release?id=11351<. Site visited
13/02/06.
Radnor, Z. and Lovell, B. (2003) ‘Defining, justifying and implementing the balanced
scorecard in the National Health Service’, International Journal of Medical
Marketing, 3: 3, 174-188.
Revill, J. (2003) ‘Whistleblower reveals NHS culture of secrecy’, The Observer, 26th
January: 4.
Sanderson, I. (2001) ‘Performance management, evaluation and learning in ‘modern’
local government’, Public Administration, 79: 2, 297-313.
Stylianou, A. C., Winter, S., and Giacalone, R. A. (2004) ‘Accepting unethical
information practices: the interactive effects of individual and situational
factors’, paper presented at the Academy of Management Conference: OCIS
division, New Orleans August 2004.
Tilley, N. (2000) Realistic Evaluation: An Overview, paper presented at the Founding
Conference of the Danish Evaluation Society, September 2000. Available on the
World Wide Web at URL
>http://www.danskevalueingsselskab.db/pdf/Nick%20Tilley.pdf<, Site accessed
28/1/2006
Vakkuri, J. and Meklin, P. (2003) ‘The impact of culture on the use of performance
measurement information in the University setting’, Management Decision, 41:
8, 751-759.
Winter, S. J., Stylianou, A. C., Giacalone, R. A. (2004) ‘Individual Differences in the
Acceptability of Unethical Information Technology Practices: The Case of
Machiavellianism and Ethical Ideology’, Journal of Business Ethics, 53: 3, 275-
296.
Yeung, L. N. T., Levine, T. R., Nishiyama, K. (1999) ‘Information manipulation
theory and perceptions of deception in Hong Kong’, Communication Reports,
12: 1, 1-13.
Figure 1. Levels of deceit in data and information manipulation
[Diagram. Scale running from less dishonest to more dishonest, grouped as selective presentation, gaming and distortion. Column labels: quantity, manner, relevance, quality. Forms of manipulation shown:]
• Implication: misleading by anticipating how people will heuristically misinterpret selectively provided information.
• Distraction: misleading by hiding the truth amongst a mass of detail (too much).
• Economy with the truth: not telling the whole truth so as to give a false impression (too little).
• System manipulation: changing practice so that a measured objective is achieved at the expense of an unmeasured one.
• Manipulating information by re-classifying data: moving data between time periods or categories to create required performance numbers.
• Lying: telling a known untruth.
Figure 2. Mechanisms, contexts and outcomes: Local Government Delphi Round 2 results
[Plot. Vertical axis: propensity to manipulate (weighted mean % considering deception), low to high. Horizontal axis: level of deceit, low to high (selective presentation, gaming, distortion). Mechanisms plotted: avoiding hassle; principled mechanism; frustration at inaccurate information; large bonus; small bonus; proportionate bonus.]
Figure 3. Mechanisms, contexts and outcomes: NHS Round 1 results
[Plot. Vertical axis: propensity to manipulate (weighted mean % considering deception), low to high. Horizontal axis: level of deceit, low to high (selective presentation, gaming, distortion). Mechanisms plotted: avoiding hassle; principled mechanism; frustration at inaccurate information; bidding for resources; maximising income; large bonus; small bonus; proportionate bonus. Contexts shown as triggers: specific triggers (high embeddedness of performance management; low risk & low audit check: high rewards; external focus of PM system; lack of balance & integration in PMS) and a general trigger (attitude of informal culture to data manipulation).]
i This research is supported financially by the Chartered Institute of Management Accountants to whom the authors would like to express their thanks.