UKES Round Table Exchange 3 – Is Evaluation Ready for the Data Revolution?

The UK Evaluation Society Round Table Exchange series identifies the adaptations and changes needed for evaluative practice to be fit for the future

Chair: Professor Murray Saunders, Lancaster University

Panellists:
Dr Noshua Watson (Interwoven Impact)
Dr Jozef Leonardus Vaessen (Independent Evaluation Group, World Bank)
Dr Mette Bastholm Jensen (XCEPT, Chemonics)

RTE3 was held at the UKES Conference on May 26th.
Blog edited by Murray Saunders and Bridget Dillon (UKES Council)

Key points emerging for building stronger practice: 

  1. Gaining data remotely is not a proxy for consultation: the need to consult and work with communities involved in evaluative activity remains critically important.
  2. Build consultation and co-construction of themes and interrogatory questions into algorithmic designs to aid authenticity and responsiveness.
  3. Those working in the field of evaluation need to have, or develop, a level of data literacy. Data analysts need to bring sensitivity and an understanding of the wider context of the issue under study. This implies the use of multidisciplinary evaluation teams.
  4. New standards, including on ethical practice, need to be developed to take into account the new realities of remote data gathering and of protecting communities’ security and privacy. The traditional concept of ‘informed consent’ is not fit for purpose in the new data context.
  5. New ‘use strategies’ are required to ensure that analyses based on machine learning are accessible, intelligible and usable for a wide audience.
  6. BEWARE the apparent authority and seduction of big numbers! Big data sets require careful interpretation and the avoidance of spurious or meaningless correlations derived from the apparent and automatic weight of big numbers. There may well be many kinds of embedded bias in unstructured data, no matter how sophisticated the algorithm or how effectively a machine learns.
  7. BEWARE the trap of ‘data in search of a problem’! Continued care should be taken to derive evaluative research questions or problems prompted by policy, social and cultural environments. The questions come first and take precedence over the technologically driven availability of new data or the possibility of huge data sets. This points to the continued importance of evaluation design which pre-signals the availability of big data.
  8. Subject-matter expertise, on-the-ground verification and triangulation continue to be decisive dimensions of contextual authenticity.
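Point 6 can be illustrated with a short simulation. The sketch below (Python with NumPy; all numbers are illustrative, not drawn from any real evaluation) generates a large set of mutually unrelated series and shows that some pairs will nevertheless correlate strongly by pure chance — the trap that the ‘weight of big numbers’ sets for an unwary analyst.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1,000 completely unrelated indicator series, 50 observations each
data = rng.standard_normal((1000, 50))

# Pairwise correlations between every series (1000 x 1000 matrix)
corr = np.corrcoef(data)

# Ignore each series' perfect correlation with itself
np.fill_diagonal(corr, 0)

strongest = np.abs(corr).max()
print(f"Strongest correlation among unrelated series: {strongest:.2f}")
```

With 1,000 series there are roughly half a million pairwise comparisons, so a few apparently ‘strong’ correlations are statistically inevitable even when nothing is related to anything. A question derived first from the policy and social context, rather than from the data, is the guard against reading meaning into them.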

Data Revolution?

The UN Secretary General’s Independent Expert Advisory Group on a Data Revolution for Sustainable Development asserted in 2015 that,

“Better data and statistics will help governments track progress and make sure their decisions are evidence-based; they can also strengthen accountability. This is not just about governments. International agencies, CSOs and the private sector should be involved. A true data revolution would draw on existing and new sources of data to fully integrate statistics into decision making, promote open access to, and use of, data and ensure increased support for statistical systems.” (HLP Report, p. 23)

The strong inference from this statement is that the ‘data revolution’ is mainly about statistics and very large data sets available from public and private sources. This is enabled by new technologies in communication and computing (including social media, the mining of commercial online environments for descriptions of large-scale tendencies in viewing, spending and consumption, economic and cultural behaviour, and remote geospatial imaging). However, we might see this as a somewhat ‘reductive’ statement. The Advisory Group later offered a more expansive one, arguing that ‘most people are in broad agreement that the “data revolution” refers to the transformative actions needed to respond to the demands of a complex development agenda, improvements in how data is produced and used; closing data gaps to prevent discrimination; building capacity and data literacy in “small data” and big data analytics; modernizing systems of data collection; liberating data to promote transparency and accountability; and developing new targets and indicators.’[1]

[1]  See

Not just big data?

So, the ‘data revolution’ is often taken to refer to advances in data management, data analytics and computational capability which allow researchers and others to process and analyse these data (as well as other existing data) in new ways for a variety of purposes. It is about more than just size, although of course size matters! It is also about being clear about the differences between information, data and evidence, and about how we talk about the attribution of value. We need to understand this data environment, what exactly is changing, and how this dimension of evaluative practice might, in turn, change to be more effective in the future.

We should also be aware of other shifts in the firmament, informed not necessarily by technologies and techniques but by socio-political considerations and re-appraisals of what counts as authoritative evidence, the different forms it can take and how it is produced. The failure to enable, or even acknowledge, a more expansive approach to evidence and the data on which it is based has been at the heart of the critique of the science and evidence used to justify action and inaction (e.g., on the use of masks) in early responses to the Covid-19 pandemic.[1]

Other issues which need to be included in our consideration of the ‘data revolution’ centre on the challenges associated with, for example, using remote sensing data, mobile phone data and various internet data. Big data may arrive too varied, too unorganised, and ultimately too uncertain and unverifiable. The challenge then is how to integrate or synthesise key messages from this data chaos. Enter machine learning and the algorithm!

One criticism is that data may be presented “as is” in evaluations, without any discussion of their validity or how they were verified. Where data are used, we need to include a discussion of how observations were made and whether measurements were reliable.
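Even basic, explicit validity checks go some way towards answering that criticism. The sketch below is a minimal, hypothetical example (the field names, thresholds and records are invented for illustration): it screens remotely gathered survey records for implausible values and duplicates before they enter any analysis, making the verification step visible rather than implicit.

```python
# Hypothetical remotely gathered survey records (illustrative only)
records = [
    {"respondent_id": "A1", "age": 34, "household_size": 5},
    {"respondent_id": "A2", "age": 210, "household_size": 4},  # implausible age
    {"respondent_id": "A1", "age": 34, "household_size": 5},   # duplicate ID
]

def validate(records):
    """Return a list of (respondent_id, problem) pairs found in the data."""
    problems = []
    seen = set()
    for r in records:
        # Range check: flag values outside a plausible human lifespan
        if not 0 <= r["age"] <= 120:
            problems.append((r["respondent_id"], "implausible age"))
        # Consistency check: flag respondents recorded more than once
        if r["respondent_id"] in seen:
            problems.append((r["respondent_id"], "duplicate respondent"))
        seen.add(r["respondent_id"])
    return problems

print(validate(records))
# [('A2', 'implausible age'), ('A1', 'duplicate respondent')]
```

The point is not the specific checks, which will differ by study, but that the checks exist, are written down, and can be discussed alongside the findings.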

The vast data sets provided by social media have seductive potential, but even the most sophisticated algorithms and machine learning processes seem to have problems with the irony and sarcasm embedded in many social media posts! The volume of interest in specific topics can be mapped in nuanced ways, but its meaning may be lost.
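The problem shows up even in a toy example. The sketch below (Python; the word lists and the post are invented for illustration) scores sentiment by counting positive and negative words — the crudest form of lexicon-based sentiment analysis — and reads a sarcastic complaint as positive. Real systems are far more sophisticated, but the underlying difficulty with irony persists.

```python
# Hypothetical sentiment lexicons (illustrative only)
POSITIVE = {"great", "love", "wonderful", "perfect"}
NEGATIVE = {"awful", "hate", "terrible", "broken"}

def naive_sentiment(post: str) -> int:
    """Score a post as (+) positive or (-) negative by word counting."""
    words = [w.strip(".,!?") for w in post.lower().split()]
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)

# A sarcastic complaint about a failed service
post = "Oh great, the water pump is broken again. Just perfect."
print(naive_sentiment(post))  # scores +1: read as positive, despite the sarcasm
```

An evaluator mapping community sentiment about a water programme from such scores would count this complaint as praise — which is why volume and valence metrics from social media need human, contextual interpretation.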

[1] See for example

Views from the panel and the audience

Panellists agreed that this is where a nuanced consideration of data science, in particular machine learning and algorithmic methods, becomes important. They focused on effects in their different sites of evaluative work.

Interestingly, they pointed to the promise of these analytical methods as depending not just on the power of digital manipulation of large amounts of data, but also on perennial considerations. Jos Vaessen pointed to potential gains in efficiency (machine learning), validity (enhanced coding and thematic analysis potential) and scope (through the new affordances offered by advances in technologies): not just ‘well known but better’, but ‘this is a paradigm shift’.

Of course, the new analytical muscle afforded by algorithms is enhanced by new forms of data from remote technologies such as satellite imagery and GPS. Mette Bastholm Jensen (XCEPT – Cross-Border Conflict Evidence, Policy and Trends) outlined the dynamics of cross-border conflict, in which time-sequenced imagery over long periods can, for example, chart changes in the movements of people and wildlife. The use of these technologies is evident, for example, in the current Russian invasion of Ukraine.

Noshua Watson (Interwoven Impact) provided an interesting example of evaluative tension: ESG considerations (Environmental, Social and Governance, the three key factors used to measure the sustainability and ethical impact of an investment in a business or company) often attract large-scale data management and analytics to assess impact and effects, yet may miss the point of ‘sense-making’ across diverse stakeholders.

With very large data come powerful responsibilities, to (mis)quote Spider-Man. Data science tends to focus too much on statistical manipulations of big data, deriving conclusions from mathematical calculations without a forensic examination of the intersection between the instrument deriving the data and the data sources themselves (this interface often remains opaque or hidden, but can yield quite flawed data). Fundamentally, data need to help us make ‘sense’ of a situation, not confuse or obfuscate.

The ‘data revolution’ is also notable in the domain of policy evaluation. While many evaluation functions and evaluation professionals have yet to explore this new opportunity space, others have started their journey into questioning, managing and analysing (big) data and other forms of data in the framework of evaluative analysis. The Covid-19 pandemic, while restricting opportunities for empirical data collection in evaluation, has to some extent problematised the pace of innovation in evaluation, including the use of new data as well as the use of data science techniques to process and analyse ‘old and new’ data.

So, the panellists and wider participants were able to point to the evaluative potential of these newer forms of data and the technologies which produce them, but acknowledged that the emerging practices are patchy rather than widespread, and rely on levels of expertise that are not widely available.

There are other caveats around ethical practice, consultation and contextual understanding associated with data analytics, machine learning and the wider use of algorithmic techniques, which might form the basis of principles of procedure for these emerging practices. These are listed at the start of this blog.