
Implementing development evaluations under severe resource constraints 1

Richard Longhurst, Evaluation Unit, International Labour Office, Geneva 2

May 2008

This paper examines the linkages between the resources available to implement evaluations, the methodologies used and the validity and usefulness of the results, when resources are severely constrained. Some literature is reviewed: rapid appraisals, the quantitative-qualitative methodology debate, and recent work on ‘impact evaluations when time and budget are constrained’. Issues for further investigation include: carrying out a methodology audit; planning ahead, using complementary inputs and ensuring effective design of projects; the importance of other forms of information; the use of some valid short-cut techniques; and the need for an agreed long-term evaluation strategy.

I Introduction: Resources, Methodology, Validity and Usefulness

This paper examines the linkages between the resources available to implement evaluation activities, the methodologies used and the validity and usefulness of the results, when resources are constrained, often severely. The paper relates to the volume of resources normally available to many evaluation managers and evaluators in international development agencies. It touches on the recent debates on the role of impact assessments and raises the issue of how organisations responsible for evaluation work might take decisions to exploit complementarities, in terms of methodological intensity, between the evaluations they carry out. It will be obvious that this paper raises more questions than it answers: it is for discussion purposes. It covers an area that has received little attention in the growing debate on resources and evaluation, and opinions are sought as to whether the proposals made here, and their implementation, would add significantly to the transaction costs of evaluation management.

The study draws on a literature review and a small sample of telephone interviews carried out originally in late 2003 to feed into a paper at a UK Evaluation Society conference. It has been updated with interviews over the intervening period, and supplemented with twelve years of experience carrying out and managing evaluations at the lower levels of resource availability.

II What the Literature tells us

With a growing demand for evaluation in the international development field, there is a strong need to examine how evaluations are carried out, the standards to be upheld and the resources required.

1 Draft for discussion purposes only. Comments most welcome to [email protected].
2 Paper first presented at the UKES Annual Meeting ‘Evaluation: Strengthening its usefulness in policy making and practice’, December 2003, Cardiff. The author is grateful to the evaluation managers and senior consultants who gave their time and expertise in telephone interviews, carried out originally in November 2003. This paper was revised and updated for a presentation to the Geneva Evaluation Network in January 2008, and many thanks are due for the comments received at that seminar. The usual disclaimers apply.


For at least twenty-five years there has been an examination of how short-cut research and other investigation methods can be enhanced to fill gaps, in a valid manner, in information about the lives of people in developing countries, especially the rural poor.

This author’s first experience of a critical examination of the resources needed, and the methodologies used, for gathering information in low-income countries was the debate on Rapid Rural Appraisal (RRA) initiated at Sussex University in the late 1970s (Chambers 1981, 1992, Longhurst 1981). This debate sought to examine the trade-offs between the rapid impressionistic visits made by outsiders and the lengthier survey techniques, in terms of the information they provided and their value to decision making in terms of timeliness, rigour and resources. The most interesting concept underlying this is ‘optimal ignorance’, which is ‘knowing what we do not need to know’ (Chambers 1981). It was concluded that, with proper preparation, RRAs were a valuable part of the rural investigator’s toolbox. This was followed by efforts to combine RRAs with existing formal survey techniques and anthropological techniques (Kumar 1993, Scrimshaw and Gleason, 1992). RRAs then gave way to Participatory Rural Appraisal (PRA), techniques that handed the agenda for information generation to the primary stakeholders, but which were more time-consuming to implement and more difficult in terms of drawing generalisations.

From the mid-1990s, there has been a healthy break-up of the traditional disciplinary divisions separating quantitative and qualitative techniques (Bamberger 2000, Chung et al, 1997, Greene, Benjamin and Goodyear, 2001, IIED, 1997, Kanbur 2001, Tashakkori and Teddlie, 1998). This has generated some understanding as to the best mix of methodologies to provide information that is appropriate, timely and valid. The mixed-methods, qualitative/quantitative debate (known as Q-squared) points to some preliminary results.

The findings of Greene et al are central to this debate: they found that while mixed methods can enhance validity and allow for triangulation, and therefore hold great promise for enhanced understanding of programmes and policies, they are not a panacea. Four issues are important, all with strong implications. First, as methods provide different views of the same issue, results may converge but offer a more complex picture, making it more difficult for the evaluator to present the findings to the users of the results. Second, methods vary on multiple dimensions, raising the question as to which dimensions are important in a given mixed-method study. Third, there is the issue of the weight to be accorded to each of the different methods. Fourth, the demands of a mixed-method design are different from those of a single-method study; the assumptions underlying each method have to be fully understood. These all place some extra pressure on resources and the competencies of staff. These conclusions are endorsed by Bamberger’s (2000) collection of papers. 3

Further work was carried out by the International Food Policy Research Institute (IFPRI) on the validity of information obtained through ‘rapid’ appraisals compared with ‘slow’ surveys (Chung, Haddad, Ramakrishna and Riely, 1997, Maxwell, 1998, Morris, 1999, Christiaensen, Hoddinott and Bergeron, 2000). This work also reached mixed conclusions on how methodologies can be better designed with both resource efficiency and data quality in mind. Conclusions are largely site-specific.

3 There are examples of researchers working in isolation who developed ‘mixed methods’ well before it was fashionable, see Hill 1972; there must be others.


Complementarities of methods did provide a greater range of insights and permitted triangulation to a greater degree, so improving the validity and usefulness of the results to primary stakeholders. But extra costs were incurred, with a greater need for teamwork, and survey staff found it difficult to mix methods. Different survey techniques created different social dynamics between research teams and their respondents. There have also been some very useful publications on how to collect information, including for evaluations, ‘on the run’ (Thomas et al, 1998, Wadsworth, 1997).

Evaluation methodology has also come under scrutiny from another direction: the use of evaluation to inform policy and action has been questioned by the Centre for Global Development (2006). A Working Group convened by the Centre investigated why rigorous impact evaluations of social development programmes were rare, and was charged with developing proposals to stimulate more and better impact evaluations. This has led to a discussion on whether evaluations, especially those with substantial data collection, are justified, and where and when. The report made the case for more extensive impact evaluations leading to a better understanding of the impact of social development programmes, which in turn has prompted discussion on the role of strong evaluation designs, especially randomised control trials (see Bamberger and White, 2007). The point to make here is that there is currently a healthy debate on the volume of resources that should be applied to evaluation design, rather than sticking to a ‘one size fits all’ scheme.

These strands of rapid and Q-squared work culminated in two very good publications about evaluations under resource constraints. At a meeting of the American Evaluation Association in 2002, a professional development workshop was held on ‘Impact Evaluations when Time and Money are Limited’ (Rugh and Bamberger, 2002); the subject then became an article (Bamberger, Rugh and Fort, 2004) and a book (Bamberger, Rugh and Mabry, 2006). At the AEA meeting, Rugh and Bamberger developed a simple but extremely useful evaluation survey typology (see Table 1), but this has not reappeared in subsequent publications. This paper takes the table as a starting point to assess evaluation and resource availability.

Table 1: Levels of evaluation studies on the basis of resources available

Level 5: Thorough research leading to in-depth analysis
Level 4: Good sampling and data collection methods used to gather data which are representative of the target population
Level 3: A rapid survey is conducted on a convenient sample of participants
Level 2: A fairly good mix of people are asked their perspectives about the project
Level 1: A few people are asked their perspectives about the project
Level 0: Decision makers’ impressions based on anecdotes, brief encounters; mostly intuition

Source: Rugh and Bamberger 2002

The book by Bamberger, Rugh and Mabry (2006) was a milestone, but it did not address the lower end of resource availability. The authors concentrated on issues related to impact assessment at levels 4 and 5, on the assumption of encountering problems with, inter alia, control groups, baselines and random sampling. Planning for many international development impact evaluations does not begin until a project or programme is well advanced, and most evaluations must be conducted under budget and time constraints, often with limited access to baseline data and control groups. Their approach was developed to respond to the demand for ways to work within budget, time and data constraints while ensuring the maximum possible methodological rigour.

They developed an integrated six-step approach which covers: (i) planning and scoping the evaluation; (ii-iv) options for dealing with constraints related to costs, time and data availability (which could include reconstructing baseline conditions and control groups); (v) identifying the strengths and weaknesses (threats to validity and adequacy) of the evaluation design; and (vi) measures to address the threats and strengthen the evaluation design and conclusions, with a valuable ‘threats to validity checklist’. They offered many useful suggestions, such as: better identification of budget, time and data constraints; reviewing data collection to identify the most economical methods; looking for reliable secondary data; commissioning preparatory studies before the arrival of time-constrained external consultants; use of recall to reconstruct baselines; and marking out in advance the threats to validity.
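
As an illustration only, the six steps and the accompanying ‘threats to validity’ checklist can be sketched as simple data structures that an evaluation manager might adapt; the checklist entries and mitigations below are assumptions for demonstration, not the authors’ own checklist.

```python
# A minimal sketch of the six-step sequence summarised above.
# Step wording follows the text; the threats and mitigations are illustrative only.

SIX_STEPS = [
    "1. Plan and scope the evaluation",
    "2. Address budget constraints",
    "3. Address time constraints",
    "4. Address data constraints (e.g. reconstruct baselines or control groups)",
    "5. Identify threats to the validity and adequacy of the design",
    "6. Take measures to address the threats and strengthen conclusions",
]

# Hypothetical threats-to-validity entries, for demonstration only.
threats_checklist = {
    "no baseline data": "reconstruct via recall or secondary sources",
    "no control group": "construct a comparison group retrospectively",
    "sampling bias": "document selection criteria and triangulate",
}

def print_design_review(steps, threats):
    """Print the step sequence and each open threat with its proposed mitigation."""
    for step in steps:
        print(step)
    print("\nThreats to validity and proposed mitigations:")
    for threat, mitigation in threats.items():
        print(f"- {threat}: {mitigation}")

if __name__ == "__main__":
    print_design_review(SIX_STEPS, threats_checklist)
```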

Some simple conclusions emerge from this brief review of the literature on ‘rapidity’ and the mixing of research methods. Disciplinary boundaries have broken down, stakeholders and their needs for information have moved closer to centre stage, and many valuable techniques have been uncovered. However, there are still site-specific issues in interpreting research, and in trading off the extra complexity against concerns of validity and use by stakeholders. Some short-cut methods that might replace the long survey do have real value and appear to retain much validity. But resources remain central. The conclusion from Greene et al that ‘designing and implementing a mixed-method evaluation is not merely choosing from a smorgasbord of methods available’ (p.41) rather dulls expectations of using some of these techniques under budget and time constraints.

Many evaluators still find themselves working in practice at levels no higher than level 3 in Table 1, suggesting that the timing, focus and level of detail of an evaluation are often pre-determined by the client’s information needs and the types of decision to which the evaluation must contribute (Patton, 1997). This paper focuses on the levels below impact assessment (levels 0-3), where extensive primary data are not collected. This is where most evaluation managers operate, within fixed, small budgets and with few staff. Managers want to carry out work of an acceptable standard using the scarce resources they have, including follow-up with stakeholders. When budgets are trimmed, or extra evaluation demands are made of them, they need to know how to proceed more efficiently without loss of value.

III Evaluations under Severe Resource Constraints:

i) Questions that should be asked

There are some critical questions that evaluation managers should ask at the beginning of an evaluation when trying to balance resources and methodology against maximum accuracy and credibility.


These questions should stake out in advance i) what results have to be achieved and ii) how resources can be tailored to achieve them. Most of these questions are ‘answered’ or resolved implicitly as the evaluation proceeds, as the manager makes decisions based on experience. It would be more useful to debate these questions openly and try to attach weights to the different factors. In other words, questions along the lines of ‘what is our optimal ignorance?’ are not being asked. These questions are:

Given our information objectives on the one hand and constraints and resources (time, money, expertise) on the other, which combinations of techniques and activities will be optimal? Minimum levels of accuracy, participation and credibility have to be established. Are there other criteria that should be used to assess the volume of resources to be applied to an evaluation? These might include the need to fit into planning cycles and the sensitivity of the topic (and the perception that the evaluation may be a critical one and so needs to be well researched). Purpose needs to be determined alongside resources.

Which techniques of the investigation are used for the sake of evaluation credibility, which to get useful information on results, which for triangulation and which to ensure proper participation of stakeholders? Often, for example, the returns to survey questionnaires and the information generated are low, but this technique is also used to ensure that all stakeholders are given the opportunity to have a say. 4 Can the use of techniques change over time, e.g. as credibility improves?

Is the right level of resources being allocated to each activity, and what is the ‘right’ level? Emphasis could be put on respondents who are strong in auto-evaluation and so are already ‘thinking’ along the lines of the evaluation.

How can evaluations (at whatever level) be inter-connected to better use resources? Can techniques serve dual uses?

There is therefore sense in laying out these methodology ‘audit’ questions in advance, and a simple technique which might be used is a table which takes each evaluation question to be answered, the techniques to be used and a rough estimate of the resources required 5. This is a questions-and-instruments matrix (question, instrument, method, resources needed), which maps out in advance the expected allocation of resources to each evaluation question and can then be converted in the final report into a statement of methodology, so lending more transparency to the process.
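
A minimal sketch of such a questions-and-instruments matrix follows; the example questions, instruments and day estimates are hypothetical, included only to show how the matrix maps each evaluation question to a method and a rough resource estimate.

```python
# Illustrative questions-and-instruments matrix: each evaluation question is mapped
# to the instrument to be used and a rough estimate of the resources required.
# The entries below are hypothetical examples, not figures from the paper.

matrix = [
    {"question": "Were planned outputs delivered?",
     "instrument": "document and report review", "days": 5},
    {"question": "How do primary stakeholders view the results?",
     "instrument": "key informant telephone interviews", "days": 8},
    {"question": "Is the intervention likely to be sustained?",
     "instrument": "case study field visits", "days": 12},
]

def summarise(rows):
    """Print the matrix as a methodology statement and total the resource estimates."""
    total = sum(row["days"] for row in rows)
    for row in rows:
        print(f'{row["question"]} -> {row["instrument"]} ({row["days"]} days)')
    print(f"Estimated total evaluator time: {total} days")

summarise(matrix)
```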

ii) What volume of resources is used at levels 0-3

It is necessary to clarify what level 0-3 evaluations comprise in terms of available resources, in the context of an international development agency. Level 3 is described in Table 1 as ‘a rapid survey is conducted on a convenient sample of participants’. Most managers find that at this level a project, thematic or programme evaluation is normally accorded a research assistant (approximately 30-40 days to research a background issues paper, collect documents and identify stakeholder contact points) and an external evaluator for about 30 days, often trimmed by tight budgets to less than the work requires.

4 How many times might evaluators say ‘we did not get much useful information there, but we needed to do it to make the evaluation credible’?
5 Again it must be emphasised that, if this sounds complex, it is a process that goes on continually in the mind of an evaluator: better to make those processes explicit rather than implicit.


There is then the highly elastic variable of management time, approximately 15 days. In addition there are funds, for a programme evaluation, to travel to some ‘representative’ case study areas or, for a project evaluation, to travel to the project site and hold workshops. All of this is likely to involve direct costs of about $35-40,000. It is the manager’s time that expands if there are problems with the evaluation, and this is the resource that needs to be protected by careful planning. The methods used with these resources at level 3 normally cover five areas: i) interviews with relevant departmental staff, ii) literature and report review, iii) minimal analysis of quantitative data, iv) interviews in case study locations, and v) email questionnaires and telephone interviews.

A level 2 evaluation (described in Table 1 as ‘a fairly good mix of people are asked their perspectives about the project’) could be taken as an investigation at headquarters only, but with a few telephone interviews to the field. This would entail direct costs of about $10-15,000. For a level 1 investigation (‘a few people are asked their perspectives about the project’), the evaluator may not move from home base but conduct telephone interviews (given the coverage of mobile phones, this would not in itself impose a sampling bias). Such an evaluation might cost about $5,000. Finally, with a level 0 evaluation (‘decision makers’ impressions based on anecdotes, brief encounters, mostly intuition’), no effort is made to achieve any sort of representativeness of information sources; the evaluation relies on conventional wisdom, ‘gossip around the water fountain’, or a video conference call to a focus group.
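
To make the cost gradient explicit, the following minimal sketch (Python used only as notation) tabulates the indicative direct-cost bands and typical methods for levels 0-3 as described above; the figures are the rough orders of magnitude quoted in the text, not fixed prices.

```python
# Indicative direct-cost bands (USD) and typical methods for levels 0-3, taken from
# the rough orders of magnitude described in the text; level 0 has no quoted cost.

LEVELS = {
    3: ((35_000, 40_000), ["staff interviews", "document review",
                           "minimal quantitative analysis", "case study visits",
                           "email/telephone questionnaires"]),
    2: ((10_000, 15_000), ["headquarters interviews",
                           "a few telephone interviews to the field"]),
    1: ((5_000, 5_000), ["telephone interviews from home base"]),
    0: (None, ["anecdotes", "brief encounters", "intuition"]),
}

def describe(level: int) -> str:
    """Return a one-line summary of the indicative cost band and methods for a level."""
    cost, methods = LEVELS[level]
    if cost is None:
        band = "no quoted cost"
    elif cost[0] == cost[1]:
        band = f"~${cost[0]:,}"
    else:
        band = f"~${cost[0]:,}-{cost[1]:,}"
    return f"Level {level}: {band}; methods: {', '.join(methods)}"

for lvl in sorted(LEVELS, reverse=True):
    print(describe(lvl))
```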

Level 3 through to level 0 indicates a gradient in terms of resources and in the credibility of method (although questions of ‘optimal ignorance’ apply here: getting more information does not necessarily make for a better evaluation). In terms of findings, level 0 techniques can be as accurate as those of many evaluations conducted with more resources, as respondents are forced to deliver their gut feelings about a programme in a way that a more structured investigation may miss.

There is a dated literature on the resource costs of different techniques, scattered through many different publications. The cost of RRAs is lower than that of surveys because of their smaller sample size and greater focus. Kumar (1993) has pointed out that one sample survey conducted by a US firm is likely to cost $100,000-200,000, an amount that can support three to four RRAs. The costs of a series of rapid diagnostic surveys carried out by Eklund (1990) in Zambia (in 1984) and in Zaire (in 1986) were below $5,000, excluding staff salaries. PRAs are more limited by the training of staff. Figures given in the telephone interviews for country evaluations ranged between $60,000 and $100,000, while some large global impact assessments could cost as much as $400,000. Another example was given of a recent country evaluation study that cost only $20,000 because of the significant co-operation provided in country. These figures do not help much without some reference to yardsticks, but attention to comparative costs once evaluation standards have been agreed would be useful.


In the earlier research for this paper, with its telephone interviews, 6 the question was asked of the evaluation managers 7 how far their work was typified by level 3 evaluations. Many respondents had some difficulty with the typology, saying it did not easily apply to them, but they were able to classify their studies accordingly. There is a clear trend for the stakeholders of the evaluations of multilateral and bilateral organisations to require information at two levels: levels 4-5, through a few large impact assessment studies, and levels 1-2, in the form of self-assessments by programme staff. Two international organisations carried out evaluations at level 3, with occasional efforts to reach level 4. Some respondents suggested that, for various reasons, it was not useful to concentrate on surveys, but that the typology is helpful if it includes self-assessment mechanisms. Some organisations may be stuck at a particular level (often level 3) because of the stage in the evolution of their evaluation activities. One organisation works at levels 0-1 and 4 only; clearly there is some imbalance here. Levels 4 and 5 relate to impact assessment, important if that is the demand.

IV Improving the effectiveness of level 2-3 evaluations

i) Some overarching issues: a long-term evaluation strategy, evaluation manuals making resource costs explicit, and employing optimal ignorance

An approach to improving level 2-3 evaluations has to be strategic. Two important elements are, first, an evaluation manual that focuses on the deployment of resources and the results they serve and, second, a clear evaluation strategy setting out what types of evaluations are done, when, how they fit the needs of the organisation and how they relate to each other. In the telephone survey, all but a couple of organisations had a clear evaluation strategy, in some cases with a set of agreed standards. Several respondents emphasised the importance of having such an agreed strategy and standards as a means of protecting the integrity of their work.

Such a strategy, and the current importance of evaluation in organisations, meant that most of those interviewed did not face severe constraints leading to the trimming of methodology or the downgrading of evaluations in terms of resources (for example from level 3 to level 2, that is, dropping field visits in favour of a desk review only). The reasons given were: i) evaluation was achieving a rising profile, with extra staff posts being added in some organisations and adequate consultancy budgets in others; ii) they would refuse to do any study that compromised the work unless resources were forthcoming; iii) evaluation units’ work was moving towards quality assurance; and iv) the main threats to validity came not so much from weak methodology as from the ‘usual suspects’: problems of attribution, poor design of the original intervention and other confounding issues.

6 Using the ‘Levels of Studies’ typology as an aide memoire, telephone interviews were conducted with evaluation managers in international organisations. Telephone interviews were held with senior evaluation managers in 14 organisations, approximately two thirds multilaterals, one sixth bilaterals and one sixth international NGOs. In addition, three former directors of evaluation departments now working as independent consultants were interviewed. This was supplemented with other reviews (AusAID, 1998), and the author’s recent and current experience with three multilaterals has also been included.
7 There is a certain irony in asking questions about the levels 0-5 typology using a methodology that could itself be classified as level 1-2. The author offers this mea culpa and hopes that the reader will still regard the results as of sufficient validity to arouse interest!


A sound institutional base for the evaluation work, with transparency in procedures, was therefore important. Clear procedures (in the form of a manual and training) were also important for generating self- or auto-evaluation.

But an evaluation strategy to improve low-resource evaluations has to take a longer time horizon than is usual. Evaluation departments should set out a five or even ten year programme. This can be subject to some change of course but, particularly given the centrality of the Millennium Development Goals, more organisations are taking longer planning horizons. So far, however, we do not know of an evaluation strategy that extends beyond two years.

Different stakeholders had varying information needs, with results tailored accordingly. Many evaluation issues (e.g. those that reflected levels of compliance rather than impact and changes in behaviour) did not require a level 4/5 methodology. Generally the investigations showed that evaluation managers will not ‘trim’ methodology if they feel it will compromise the integrity of their study. Many of these issues return to ‘optimal ignorance’: knowing what we do not need to know.

ii) Approaches to improving evaluations at levels 0-3, and 2-3 in particular: some specific techniques, and re-thinking the use of other information in the organisation

First, there are techniques that can cut corners without loss of validity. Although the complexity of mixed-method studies has been emphasised, some practical ideas have emerged over the last few years. These include the use of purposive sampling, cluster groups, focus groups, key informants, and means of retrospectively reconstructing baselines and control groups. More focus on the needs and uses of the evaluation can better direct the techniques to be employed. It has been suggested in a personal communication that ‘trimming’ is not the best approach when facing constraints; rather, the whole strategy should be rethought, focusing on reliable data cores and the linkages between data. It is better to focus on a data core, knowing, for example, 25% of the picture with good certainty rather than 100% with haziness. This may be an idea for evaluation methodology: concentrating on doing a fraction of the investigation with accuracy and making sensible assumptions about other areas.
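
As a purely illustrative sketch of one of these techniques, the following shows purposive (judgement-based) selection of key informants by stakeholder category, as opposed to random sampling; the respondent list, categories and ‘information richness’ scores are hypothetical.

```python
# Illustrative purposive selection of key informants: rather than drawing a random
# sample, pick the most information-rich respondent in each stakeholder category.
# The respondents, categories and scores below are hypothetical.

respondents = [
    {"name": "A", "category": "project staff", "information_richness": 4},
    {"name": "B", "category": "project staff", "information_richness": 2},
    {"name": "C", "category": "government partner", "information_richness": 5},
    {"name": "D", "category": "beneficiary group", "information_richness": 3},
    {"name": "E", "category": "beneficiary group", "information_richness": 4},
]

def purposive_sample(people, per_category=1):
    """Select the highest-scoring informant(s) in each category (a judgement-based choice)."""
    selected = []
    for cat in {p["category"] for p in people}:
        in_cat = sorted((p for p in people if p["category"] == cat),
                        key=lambda p: p["information_richness"], reverse=True)
        selected.extend(in_cat[:per_category])
    return selected

for person in purposive_sample(respondents):
    print(person["name"], "-", person["category"])
```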

Another facet of the ‘data core’ concerns the ease with which information registers in the minds of respondents, a long-standing theme in village studies research (Lipton and Moore, 1972): whether information reflects continuous or non-continuous processes (e.g. crops that are harvested all at once compared with crops harvested from time to time), and whether an event is registered or non-registered in the mind of the respondent (e.g. payments to hired labour compared with the unpaid input of family members). This suggests a focus on ‘events’ in evaluation questioning, using them as organising foci to draw out opinions on subjects.

A further technique that bears investigation is to learn lessons from Real Time Evaluations (RTEs), practised in the humanitarian sector and pioneered by the Evaluation and Policy Analysis Unit of UNHCR. With the boom in the practice of humanitarian evaluation over the last 15 years, some innovative techniques have been developed.


RTEs are usually carried out using field visits and headquarters meetings, some with telephone interviews with field-based staff.

Key characteristics of RTEs are: i) they take place during the implementation of the response to a humanitarian crisis; ii) the time frame is short, perhaps a few days, and they may be repeated and seen as an ongoing evaluation, with the emphasis on process rather than results; iii) secondary information is used, and they are normally carried out by headquarters and local staff; and iv) the emphasis is on learning, which is where they have proved a success, giving staff, especially junior local staff, an opportunity to express their concerns. Unlike a normal evaluation, the products can be integrated within the programme cycle. 8 Clearly the context for an RTE is the urgency of the problem and the need to make swift course corrections; the context of an emergency makes them useful. RTEs are also a means of closing the widening gap between monitoring and evaluation.

RTEs could fit into non-emergency situations through their strength in learning and in making immediate corrections to the programming cycle: they can be problem or issue oriented, or deployed when an ‘emergency’ breaks out in a project, e.g. farmers not adopting a new variety. They can be specially ‘billed’ in advance as a means of trying to address an issue so that mid-course corrections can be made to the programming cycle, and they can be used to strengthen monitoring.

Second, the combined work on mixed methods and the quantitative/qualitative (Q-squared) activities has generated some progress in linking the technique to the situation, but transaction costs still seem high.

Third, the recent work of Bamberger, Rugh and Mabry can change attitudes about linking resources to evaluation methods, but it still assumes that surveys and sampling are needed. There are good ideas about scoping, generating a low-resource baseline and a low-resource counterfactual, and making more up-front thinking respectable. Two specific techniques are suggested at the low-resource end: first, quick ethnology procedures, which involve integrating all tools around critical questions; and second, the use of different forms of sampling for qualitative evaluations, an approach that can make a level 2-3 evaluation more transparent. Most importantly, this and other work has put more pressure on the need to ensure that evaluation criteria are considered at the project design stage.

Fourth, there is the importance of evaluability, a cost-effective process that ensures evaluation criteria are properly included at the design stage. It involves assessing that objectives and indicators are set clearly at all levels, that the project or programme is logically conceived, that risks and assumptions are identified and that monitoring systems are in place for effective project execution and evaluation: in short, ensuring that a logical framework has been properly prepared 9 and is being updated and used as the project is implemented. An evaluability approach also ensures that monitoring information feeds into evaluation, something that is not done as often as might be expected: monitoring information can be collected in ways that preclude its use in later evaluations.
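
A minimal sketch of an evaluability checklist, using the criteria listed above, might look as follows; the example answers are placeholders rather than findings from any real design review.

```python
# A minimal evaluability checklist built from the criteria listed in the text;
# the boolean answers in the example are placeholders for a real design review.

EVALUABILITY_CRITERIA = [
    "Objectives and indicators are set clearly at all levels",
    "The project or programme is logically conceived",
    "Risks and assumptions are identified",
    "Monitoring systems are in place for execution and evaluation",
    "The logical framework is being updated and used during implementation",
]

def evaluability_review(answers):
    """Report which criteria are unmet; answers maps each criterion to True/False."""
    unmet = [c for c in EVALUABILITY_CRITERIA if not answers.get(c, False)]
    if not unmet:
        return "Design appears evaluable."
    return "Unmet criteria:\n- " + "\n- ".join(unmet)

# Hypothetical example: everything in place except an updated logframe.
example = {c: True for c in EVALUABILITY_CRITERIA}
example[EVALUABILITY_CRITERIA[-1]] = False
print(evaluability_review(example))
```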

8 http://www.unhcr.org/cgi-bin/texis/vtx/home?id=search
9 Do a log frame for the evaluation; it is a project, after all!


Linked to evaluability are other elements related to planning up front. Overcoming resource constraints may well be best achieved at the planning stage. The greatest concerns expressed by evaluation managers related to their relationships within their own organisation and with stakeholders outside, and to the timing and nature of the information their evaluations produced. This points to greater up-front planning and design of studies: early contact with stakeholders, getting their review of the scope of the study and the Terms of Reference, and covering all relevant secondary information, especially reviews of similar programmes and projects of other organisations. The need to search properly for secondary literature appears throughout the relevant literature.

One evaluation manager said that moving more resources up front was a more efficient means of resourcing the evaluation, getting reluctant elements on board and allowing the use of more junior staff who were more enthusiastic about the research elements. His view was that the study should be 50% complete by the time the external consultants started fieldwork. There is a clear need to identify in advance what information is expected to be missing and to set out a strategy for whether and how to collect it. This can be done using a decision tree approach (Rugh 2003).
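
The sketch below illustrates what such a decision tree might look like in code; the branching questions and recommendations are assumptions for demonstration and are not a reproduction of Rugh’s (2003) tree.

```python
# Illustrative decision tree for handling one expected information gap before fieldwork.
# The branching questions are assumptions for demonstration, not Rugh's (2003) tree.

def plan_for_gap(secondary_data_exists: bool, recall_feasible: bool,
                 essential_to_findings: bool) -> str:
    """Return a hedged recommendation for dealing with one missing piece of information."""
    if secondary_data_exists:
        return "Use reliable secondary data; verify its quality during the evaluation."
    if recall_feasible:
        return "Reconstruct the baseline through recall or key informants."
    if essential_to_findings:
        return "Flag as a threat to validity and budget for limited primary collection."
    return "Accept the gap ('optimal ignorance') and state it in the methodology."

print(plan_for_gap(secondary_data_exists=False, recall_feasible=True,
                   essential_to_findings=True))
```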

One manager expressed the view that the big problem was adapting evaluation to the information needs of the organisation; the situation was always very dynamic, with changing country and sector strategies. Evaluation inputs did not always arrive when such strategies were being written, and this requires extra coordination. The evaluation work programme needs to be widely distributed and understood.

Fifth, more consideration could be given to re-aligning information collected in other parts of the organisation: as well as monitoring information, this includes financial data, back-to-office reports and the encouragement of anecdotal information. The question is how this can be collected and organised in a way that does not incur high transaction costs. This should be an element of the five (or more) year evaluation plan proposed above.

iv) How can level 2-3 evaluations have a broader impact within an organisation and feed into levels 4-5?

Evaluations must be linked, and be part of the long-term strategy mentioned above: level 2-3 evaluations must be used to build up a database rather than simply provide a list of ‘lessons learned’. These evaluations can feed into the one-off level 4-5 evaluations in an agency that might be implemented every three years or so, providing valuable groundwork and baseline data. Evaluations can provide time-series information on the development of an institution or programme of work, or have standing items in the TORs (e.g. gender) that also lay the foundation for a thematic evaluation. Complementarities between evaluations should be exploited: in the spirit of the Paris Declaration, approaches should be harmonised where possible, although transaction costs have initially been found to be high. It is possible for evaluators to combine and agree that they will cover similar areas if the chance arises.

v) The role of the level 0-1 evaluations


In many organisations the most important decisions are made on the basis of level 0/1 information, so evaluators should take note of (and learn from, rather than disparage) how this works. Recognition of its importance could be factored into the methodology audit. Use of this low-level methodology also has an important role in self-evaluation and in encouraging staff to be more analytical. There may be value in building on the level 0 methodology to push it to level 1, which involves broadening the consultation process. Anecdotes as a methodology at level 0 do have an important role, and project staff should be encouraged to tell stories: sometimes this is the only means of assessing impact after a project has been completed, if resources are limited.

V Summing Up

Sinister Success

We should be less concerned about our failures than about ‘sinister success’ in subverting evaluation work: this is a phrase and conclusion borrowed from the World Bank/OED publication on evaluation and partnership (Liebenthal, Feinstein and Ingram, 2004), and it applies here also. In other words, although we should work hard to improve evaluation techniques, we should also be concerned that, overall, evaluation units are kept too small for the work at hand, certainly below the 3% of the administration budget normally recommended. 10
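
As a simple worked illustration of this budget-share point, using the figures in footnote 10, the sketch below compares the evaluation spend implied by a 0.20% share of an administrative budget of nearly $1 billion against the 3% norm; the dollar amounts are computed here, not reported separately in the paper.

```python
# Worked illustration of footnote 10: an administrative budget of nearly $1 billion,
# an evaluation share of 0.20%, compared with the 3% norm cited in the text.

admin_budget = 1_000_000_000     # approximate, "nearly $1b"
actual_share = 0.0020            # 0.20%
recommended_share = 0.03         # the 3% norm

actual_spend = admin_budget * actual_share              # roughly $2 million
recommended_spend = admin_budget * recommended_share    # roughly $30 million

print(f"Actual evaluation spend:      ~${actual_spend:,.0f}")
print(f"Spend implied by the 3% norm: ~${recommended_spend:,.0f}")
print(f"Shortfall factor:             ~{recommended_spend / actual_spend:.0f}x")
```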

Planning well ahead with complementary inputs

Elements that would benefit from the available resources have been outlined: a methodology audit to assess what methods will be used for which evaluations and how complementarities between methods can be exploited; a five to ten year strategy for laying out evaluations; and getting design right so that projects and programmes are evaluable.

The importance of other forms of information within the organisation

Problems with implementing evaluation studies could be eased by promoting information from other parts of the organisation, from programme staff and financial departments. Developing a self-assessment system for performance (the project completion reports) could compensate for a lack of resources for evaluation studies; if this is done well, evaluation becomes the icing on the cake. As a complementary resource, a well-functioning and useful monitoring system can provide some relief for constraints in the evaluation system: identifying significant successes and failures provides information that feeds into future evaluations. But often monitoring is for the benefit of project managers (and is paid for from their budget) rather than designed to be helpful to future evaluations. All the information that an evaluation collects, at whatever level, is useful and valid if the circumstances are fully understood, and it has different value to different stakeholders.

10 All organisations interviewed here were below that norm; for one very large organisation with an administrative budget of nearly $1 billion, the proportion was 0.20%.


Bibliography

AusAID, 1998, Review of the Evaluation Capacities of Multilateral Organisations, Evaluation No 11. Canberra.

Bamberger, M (ed), 2000, Integrating Quantitative and Qualitative Research in Development Projects, Directions in Development, World Bank, Washington DC.

Bamberger, M, J Rugh, M Church and L Fort, 2004, Shoestring Evaluation: Designing Impact Evaluations under Budget, Time and Data Constraints, American Journal of Evaluation, 25, 1, 5-37.

Bamberger, M, J Rugh and L Mabry, 2006, Real World Evaluation: Working under Budget, Time, Data and Political Constraints, Sage Publications, London.

Bamberger, M and H White, 2007, Using Strong Evaluation Designs in Developing Countries: Experiences and Challenges, Journal of Multidisciplinary Evaluation, 4, 8, 58-73.

Centre for Global Development, 2006, When Will We ever Learn? Improving Lives through Impact Evaluation, Report of the Evaluation Gap Working Group, Washington DC.

Chambers, R, 1981, Rapid Rural Appraisal: Rationale and Repertoire, IDS Discussion Paper 155, Brighton.

Chambers, R, 1992, Rural Appraisal: Rapid, Relaxed and Participatory, IDS Discussion Paper 311, Brighton.

Christiaensen L, J Hoddinott and G Bergeron, 2000, Comparing Village Characteristics derived from Rapid Appraisals and Household Surveys: A tale from Northern Mali, FCND Discussion Paper No 91, IFPRI, Washington.

Chung K, L Haddad, J Ramakrishna and F Riely, 1997, Identifying the Food Insecure, The Application of the Mixed-Method Approaches in India, IFPRI, Washington.

Eklund, P, 1990, Rapid Rural Assessments for Sub-Saharan Africa, EDI Working Paper, World Bank, Washington DC.

Greene, J, L Benjamin and L Goodyear, 2001, The Merits of Mixing Methods in Evaluation, Evaluation, Vol 7, 1, 25-44.

Hill, P, 1972, Rural Hausa: a Village and a Setting, CUP, Cambridge.

IIED, 1997, Methodological Complementarity, PLA Notes No 28, London.

Jamal A and J Crisp, 2002, Real-time Humanitarian Evaluations: Some Frequently Asked Questions, UNHCR, EPAU, Geneva.


Kanbur R, 2001, Q-Squared? A Commentary on Qualitative and Quantitative Poverty Appraisal, In R Kanbur (ed), Qual-Quant. Qualitative and Quantitative Poverty Appraisal: Complementarities, Tensions and the Way Forward, Cornell University Applied Economics and Management Working Paper 2001-05, Ithaca, New York

Kumar, K (ed), 1993, Rapid Appraisal Methods, Regional and Sectoral Studies, World Bank, Washington DC.

Liebenthal, A, O Feinstein and G Ingram, 2004, Evaluation and Development: The Partnership Dimension, OED, World Bank, Washington DC.

Lipton M and M Moore, 1972, The methodology of village studies in less developed countries, IDS Discussion Paper No 10, Brighton.

Longhurst R (editor and author), 1981, 'Rapid Rural Appraisal: Social Structure and Rural Economy', IDS Bulletin, vol 12, no 4, including 'Research Methodology and Rural Economy in Northern Nigeria', 23-31.

Longhurst R, 1993, Integrating Formal Sample Surveys and Rapid Rural Appraisal Techniques, Report to IFAD.

Longhurst R, 1998, Integrating Formal Sample Survey with RRA and Participatory Techniques, Annual Meeting of the UK Agricultural Economics Society, Reading.

Marsland, N, I Wilson, S Abeyasekera and U Kleih, undated, A Methodological Framework for Combining Quantitative and Qualitative Survey Methods, an output of the DFID-funded Natural Resources Systems Programme (Socio-Economic Methodologies Component) project R7033, draft, University of Greenwich and University of Reading.

Maxwell, D, 1998, Can Qualitative and Quantitative Methods serve Complementary Purposes for Policy Research? Evidence from Accra. FCND DP No 40, IFPRI, Washington.

Morris, S, C Carletto, J Hoddinott and L Christiaensen, 1999, Validity of Rapid estimates of Household Wealth and Income for Health Surveys in Rural Africa, FCND DP 72, IFPRI, Washington.

National Science Foundation, 1997, User-Friendly Guide to Mixed Method Evaluations, Directorate for Education and Human Resources, Washington DC.

Patton, M, 1997, Utilisation-Focussed Evaluation, Sage Publications, California.

Rugh J and M Bamberger, 2002, Impact Evaluations when Time and Money are Limited, AEA Development Session, http://home.wmis.net/~russon/icce/index.html

Rugh, J, 2003, Planning Appropriate Evaluation Designs: A Decision Tree Approach, AEA Professional Development Workshop ‘Impact Evaluation on a Shoestring’.


Scrimshaw N and G Gleason (eds), 1992, Rapid Assessment Procedures: Qualitative Methodologies for Planning and Evaluation of Health-Related Programmes, INFDC, Boston.

Save the Children Federation/US, 1989, Handbook for using Rapid Rural Appraisal Techniques in Planning, Monitoring and Evaluation of Community-based Development Projects, Khartoum.

Tashakkori, A and C Teddlie, 1998, Mixed Methodology: Combining Qualitative and Quantitative Approaches, Applied Social Research Methods Series, Vol 46, Sage, California

Thomas, A, J Chataway and M Wuyts, 1998, Finding Out Fast: Investigative Skills for Policy and Development, Open University and Sage, London.

Wadsworth, Y, 1997, Everyday Evaluation on the Run, Allen and Unwin, St Leonards, Australia.
