March 18, 2022

UKES Round Table Exchange 2 ‘Invisible Hands’

This blog builds on Round Table Exchange (RTE) No 2: ‘Invisible hands’ in the evaluation process: Commissioning, Resourcing and Procuring evaluations held on 24th February 2022.

The panellists were Hala Elsayed, Head Experimentation and Evaluation Hub, UK Ministry of Justice; Dr Rachel Iredale, Consulting Director at RSM Ireland; and David Rider Smith, Senior Evaluation Co-ordinator, Evaluation Service, UNHCR, with moderation provided by Professor Murray Saunders, University of Lancaster. All three panellists have extensive experience of both commissioning and undertaking evaluations. The UK Evaluation Society RTE team (Bridget Dillon and Murray Saunders) bears full editorial responsibility for the contents of this blog, which draws on the discussion at the RTE, and is grateful to the panellists for their insights.


The UK Evaluation Society provided a short briefing to participants to set the scene. This was developed from some observations about paying for, commissioning, and procuring evaluation in recent years[1]. It also summarised some of the critical commentaries arising from colleagues in various associations and societies across the world, as well as the academic literature. Across the many fora for evaluation debate, there seems to be movement toward more self-conscious and proactive evaluative thinking and practice that contributes to new priorities.

In considering commissioning, resourcing, and procuring evaluations holistically, we have to some extent elided three systemic characteristics of what the change and innovation literature calls the ‘pre-adoption’ and ‘adoption’ decision phases of making an evaluation happen. By commissioning, we are referring to the overall interface between the impetus for an evaluation, the framework used to create ToRs (design, brief and scope) and those who may be carrying it out. By resourcing, we are referring to how much money and other enabling features are made available for evaluative activity. By procurement, we are referring to the practices associated with obtaining resources for the evaluation and appointing the evaluators.

[1] See for example Jones, Lindsey, Laura Kuhl, and Nathanial Matthews. “Addressing power and scale in resilience programming: A call to engage across funding, delivery and evaluation.” The Geographical Journal 186.4 (2020): 415-423.

Cox, Jayne, and Pete Barbrook-Johnson. “How does the commissioning process hinder the uptake of complexity-appropriate evaluation?” Evaluation 27.1 (2021): 32-56.

A sense of the issues

Rachel Iredale helpfully pointed out that the various practices associated with these different dimensions are embedded in at least ten groups of real people, in real time! This is particularly the case in public policy or programme environments. Often each of these groups works in a different organisational culture and has different sets of practices and priorities, which leads to an emphasis on different features of the process that are not always in concert. In effect, they may be practising in silos, without the joined-up thinking that an effective response to an impetus to evaluate might require. Hala Elsayed suggested that aspects of the process may be in tension, precisely because of these different priorities in the overall brief, the amount of resources required and the nature of the envisaged evaluation framework. In the public policy domain, these stakeholding groups may be, variously, the following:

  • Sponsoring government department or organisation
  • Senior Management Teams: Deciding an evaluation is needed. Agreeing budget and allocating resource
  • Programme Managers: Draft the ITT
  • Procurement groups: Often out-sourced with a completely different organisational culture
  • Marketing: Decisions about pre-market event, Frameworks, Dynamic Purchasing Systems. Other ways of promoting awareness of the ITT, e.g. UK Evaluation Society, social media.
  • Subject Matter Experts: Review tender submissions
  • Administration: Organise the logistics of panel meetings. Overseen by procurement
  • Evaluation team: Work alongside appointed evaluators
  • Finance: Due diligence. Issue contracts
  • Contract Management Team: Often separate from the evaluators.

The metaphor of the ‘invisible hand’ in the title of this RTE is intended to capture the unacknowledged ways in which the process of getting an evaluation off the ground, particularly in the public domain, influences the shape of the overall evaluation landscape and subsequent evaluative practices. The thrust of RTE2 and this blog is to make these processes more transparent and explicit.

There is a growing awareness across the evaluation community of pressures which need more systematic investigation and debate. David Rider Smith, drawing on his experience (in the context of the UN system, UK Government and others), reminds us that a large number of evaluations are undertaken each year, led by a relatively small group of consultancies, often US and European based, that are well versed in the processes and frameworks of those commissioning the work. This can make it challenging for evaluators outside of this traditional base to break through (unless part of a team led by these established firms). This has been recognised for some time, and there are examples where shifts are happening through incentives supporting more balanced and localised approaches.

Among other salient features of the present but dynamic context are the following:

  • There can be a tension between the demands of accountability and learning or transformation and change for the intended beneficiaries.
  • Evaluations are often underfunded, with over-ambitious briefs and expectations.
  • Methodologies tend toward safety and familiarity, and designs can be risk averse (to some extent evaluators themselves help to create this feature as a self-fulfilling prophecy).
  • Commissioners of evaluations are often under strong pressure to privilege particular types of evidence.

Here are some factors which our panellists noted during the course of the discussion:

  • Where the evaluation function is positioned within an organisation matters for how evaluation is perceived and what the organisational expectations may be. There is a big difference between locating it under finance and corporate divisions, where interests are likely to emphasise accountability as the principal evaluation purpose, and locating it within a research/evidence division, where interests are more likely to signal learning and improving impact.
  • Regular oversight, or the setting up of oversight bodies, makes a difference: an increased focus on the results agenda (and the evidence on which it is based) has prompted a stronger focus on the production of quality evidence. One example provided was the Independent Commission for Aid Impact (ICAI), set up in the UK. Other examples are the EU Regulatory Scrutiny Board, which seeks to improve the quality of evaluation, and the recently established Evaluation Taskforce in the Cabinet Office. It should be noted that oversight of evidence quality should be methodologically neutral.

It is noticeable that the evaluation process itself is often not evaluated by commissioners or indeed evaluation companies – hence we do not know if we really did need a senior level evaluator for x part of an evaluation or whether a junior would have done equally well, or whether the overall cost of the evaluation was value for money (VFM). Not publishing an evaluation report can be an indicator that the evaluation team has not done a good job as well as an indicator of ‘awkward’ findings.

  • The prescriptive Framework contracts are not designed with evaluation in mind. It is easy to see how they are market efficient in strict terms, but less easy to see that they deliver the appropriate team for the evaluation in hand. Some improvements have been noted in recent years, e.g. incentivising companies to include local evaluators, innovative methodologies etc. The framework approach, and the sheer volume of frameworks, make it difficult for smaller companies and independent evaluators to bid for work.
  • The role of ‘the commissioner’ in its broadest sense, and how power is manifested, is largely unscrutinised in the system. As we note above, this also encourages evaluators to ‘play safe’ and be risk averse.
  • The persistent ‘what works’ mindset currently permeates a lot of government commissioning of evaluations. This is helpful to the extent that it raises the status and value of evaluation, but it also serves to potentially stifle opportunities for creativity and wider evaluative thinking; evaluation is about more than ‘what works’.


Our participants made some interesting and creative suggestions on positive new directions to address some of the issues they had identified. For example:

  • Issue: capability among commissioners of evaluations
  • Terms of reference should be as focused as possible in the often rapidly moving pre-adoption and adoption environments. At the same time, it is also useful to encourage commissioners to be more nuanced and explicit about what and where they have uncertainties. This would enable evaluators to navigate and adjust their evaluative practice as a policy or programme evolves. This should lead to a clearer identification of the breadth, scope and nuance of the evaluand over time.
  • Tenders and terms of reference should be shorter, sharper, more accessible, and less formulaic (to counter the current reality in which ITTs are getting longer).
  • Issue: functioning of aspects of the pre-adoption and adoption practices in ‘making an evaluation happen’
  • There is a need for evaluators, singly and within the companies and agencies which submit tenders, to use their voice to highlight and push back on badly functioning commissioning practice.
  • Change should be enabled from within the community of practice of evaluators.
  • Issue: ‘voice’ for ‘end users’ in the pre-adoption and adoption process 
  • The voice of the end user or policy/programme recipient could become a standard requirement within the ToRs provided by commissioners of evaluations. (This will counter ‘silos’ and create greater and more meaningful interaction between parties.)
  • Issue: market engagement provided by the commissioning process
  • Standard pre-adoption and adoption practice should include, as far as possible, the provision of market engagement opportunities for potential bidders to ensure a good understanding of the context and expectations of the evaluation.
  • This might involve encouraging, through incentives across the systems, a more diverse range of evaluation practitioners to bid for contracts in order to nurture, encourage and even pioneer new methods and perspectives. (This will counter the prevailing profile of current evaluators, which is far from diverse.)
  • Issue: the purposes of evaluation which recognise awkward, wicked or ‘certainty challenging’ contexts
  • Evaluative practice has to raise its game to be commensurate with the complexity of current contexts, which also has implications for what can be determined in terms of outcomes and impact.
  • Issue: timing of evaluative considerations
  • Evaluation should be moved to the front-end design of interventions, i.e. bring evaluators into discussions early, at and from the conceptual stage. Whilst some agencies have done this, thinking about evaluation at a point near the end of an intervention remains the prevailing paradigm. (This shift will counter the commonly perceived formulaic approach, and render evaluative practice meaningful.)
  • We should move from an evaluation approach suited mainly to retrospection, ‘did it work?’, and add the forward-looking ‘will it work?’. (This will counter the misperception that evaluation is only oriented to the past, and optimise a much wider value.)
  • Issue: consistency of evaluation resourcing
  • We need a more sophisticated understanding of the limits and possibilities of evaluation and of the capacity building it may require, with the implications reflected in resourcing. (This will enhance clarity about the scope, depth and breadth of evaluative practice.)
  • ToRs should routinely include realistic indicative budgets. Failing to provide an indicative budget is poor practice (it encourages a ‘gaming’ approach rather than a focus on evaluation needs), non-diverse (it encourages informal network influence) and illogical (the commissioner must know the budget limit). A proportional budget rule for evaluations of interventions should be realistically applied, i.e. the evaluation budget should be proportionate to the overall budget of the policy or programme.
  • Commissioners have the tools to set a suitable budget for an evaluation. Ideally the budget is transparently calculated on a reasoned basis and published. (This will counter the lack of transparency in present practices.)

After burn…

To continue this discussion and to elaborate further on these and any other proposals for the ‘invisible hands’, we welcome responses to this blog. Feel free to disagree with, extend or nuance any of the points raised, and to offer other specifics for adjustment of, or change to, current systems to make evaluative practice fit for the future.

Send your contributions – Title : RTE 2 Blog  – to