CDEI Bias Review - Call for Evidence: Summary of Responses
Published 10 October 2019
1. Introduction
1.1 The CDEI’s Review into Bias in Algorithmic Decision-Making
As part of the CDEI’s 2019/2020 Work Programme, we are undertaking a Review focusing on Bias in Algorithmic Decision-Making. The Review explores bias in four key sectors: policing, financial services, recruitment and local government. These sectors have been selected because they all involve significant decisions being made about individuals, and because there is evidence in each of both a growing uptake of machine learning algorithms and historic bias in human decision-making.
The Review seeks to answer three sets of questions:
Data: Do organisations and regulators have access to the data they require to adequately identify and mitigate bias?
Tools and techniques: What statistical and technical solutions are available now, or will be required in future, to identify and mitigate bias, and which represent best practice?
Governance: Who should be responsible for governing, auditing and assuring these algorithmic decision-making systems?
More information about the Review can be found in our Interim Report, which sets out how we are defining the issue, our approach to the Review, our progress to date, and emerging insights.
1.2 The Call for Evidence
The Review into Bias in Algorithmic Decision-Making is being informed by the following evidence:
- research undertaken by our policy teams;
- stakeholder engagement with government, regulators, industry and the public;
- responses to our Call for Evidence;
- a landscape summary, carried out by academics to assess the academic and policy landscape in this area.
The Call for Evidence document was published on 7 May 2019 and invited submissions on four question areas:
- The use of algorithmic tools;
- Bias identification and mitigation;
- Public engagement;
- Regulation and governance.
Evidence responding to any or all of these four questions was welcomed in relation to any of the four sectors (policing, financial services, recruitment and local government), as well as submissions that related to bias in algorithmic decision-making more generally.
This report summarises the findings and provides a general overview of the number and type of responses received, followed by a more detailed analysis identifying themes and gaps in each question area.
2. Methodology
Responses to the Call for Evidence were logged and reviewed by the CDEI’s policy team leading on the Review into Bias in Algorithmic Decision-Making.
The following information was captured as part of this process:
- The type of organisation that responded.
- Whether the response focused on a particular sector.
- Which of the four questions were responded to.
- High-level themes covered in the response.
- Key points and arguments made.
An overview of this information is set out in the following section of this report.
The policy team analysed this information in order to supplement the wider research and stakeholder engagement being undertaken to inform the Review.
Where respondents recommended or included further reading, the policy team has incorporated this into their evidence base for the Review where appropriate.
Many respondents offered to meet the CDEI to discuss their response further, and the policy team is holding additional meetings with individual stakeholders where relevant.
3. Overview of Responses Received
3.1 Type of organisation that responded
A total of 52 responses were received from a variety of organisations. Some individuals also responded, but these were academics or people working in relevant industries; we did not receive any responses from members of the public more generally. The table below shows the breakdown of the types of organisation that responded.
| Type of respondent | Number |
| --- | --- |
| Academic (institutions, individuals) | 16 |
| Civil society (think tanks, academies, non-profits) | 10 |
| Industry (companies, sector bodies) | 21 |
| Public sector (regulators, government agencies) | 2 |
| Other | 3 |
| Total | 52 |
(See section 5 at the end of this document for a complete list of the organisations that responded.)
3.2 Responses focusing on a particular sector
Of the 52 responses received, the majority focused on specific sectors. The table below shows the number of respondents that answered with reference to a particular sector.
| Sector | Number of respondents |
| --- | --- |
| Policing | 11 |
| Financial services | 14 |
| Recruitment | 3 |
| Local government | 5 |
| General (no specific sector focus) | 19 |
| Total | 52 |
As we are taking a phased approach to the Review, starting with policing, then moving to financial services and recruitment, before focusing on local government, we anticipated receiving more responses in the first two of these sectors than in the latter two.
3.3 Responses answering specific questions
Respondents could choose to answer any or all of the questions. Table 3 shows the number of respondents answering each question.
| Question area | Number of respondents |
| --- | --- |
| The use of algorithmic tools | 31 |
| Bias identification and mitigation | 35 |
| Public engagement | 26 |
| Regulation and governance | 28 |
| Did not use the question format | 16 |
We were pleased to see responses across all four question areas, demonstrating the importance of these themes for the Review.
3.4 Themes and arguments in the evidence received
The responses revealed a broad consensus in the way the issue of bias in algorithmic decision-making is framed. There was a general acknowledgement that, as the use of big data and machine learning in this space increases, the importance of addressing potential issues of algorithmic bias also grows. At the same time, respondents agreed that many of these biases stem from underlying societal inequalities and, as such, will require societal as well as technological solutions. Some were also optimistic about the potential of algorithms to challenge and improve biased human decision-making.
The need for a set of ethical principles to underpin approaches to decision-making algorithms was a repeated theme, and a number of organisations pointed either to frameworks which they had developed themselves or to external approaches such as the European Commission Ethics Guidelines for Trustworthy AI or the Singaporean FEAT Principles. There was significant overlap between the principles being proposed, with fairness, transparency and explainability coming through as especially strong themes.
33 of the responses focused on one of the four specific sectors, giving detailed evidence about current algorithmic approaches being used and their associated risks and benefits. In general, responses from different sectors picked up on similar issues and themes. However, there was a difference in tone between responses focused on the private sector (financial services and recruitment) and those focused on the public sector (policing and local government). The latter tended to be more cautious in tone and less confident that the necessary guidance and regulation are already in place, while private sector respondents were more likely to be cautious about the prospect of further regulation.
Many responses discussed the challenges of designing unbiased algorithms based on data which is itself inherently biased. This bias can stem either from unrepresentative sampling (certain populations being over- or under-represented) or from accurate representations of historic patterns of biased behaviour. Solutions to these issues are often not obvious or easy. In particular, many respondents wrote that while protected characteristic data, for example race, can be removed from models, it is much harder to decide how to deal with data, for example postcode, that might be legitimate to use in a decision-making model but risks acting as a proxy due to its correlation with certain protected characteristics.
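To illustrate the proxy issue, the sketch below trains a simple model on entirely synthetic data; the feature names ("postcode score", "income") and figures are hypothetical and are not drawn from any respondent's submission. Even though the protected characteristic is never passed to the model, the model reproduces the historic bias through the correlated postcode feature.

```python
# Illustrative sketch only: synthetic data and hypothetical feature names.
# It shows how a model that is never given a protected characteristic can
# still reproduce historic bias through a correlated proxy feature.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Protected characteristic (never passed to the model).
group = rng.integers(0, 2, n)

# Proxy: a postcode-derived score correlated with group membership.
postcode_score = rng.normal(loc=group * 1.5, scale=1.0)

# Legitimate feature, identical across groups.
income = rng.normal(loc=30.0, scale=5.0, size=n)

# Historic decisions were based on income but also directly penalised group 1.
historic_approval = (income - group * 4 + rng.normal(0, 3, n) > 27).astype(int)

# The model is trained without the protected characteristic.
X = np.column_stack([postcode_score, income])
pred = LogisticRegression().fit(X, historic_approval).predict(X)

# Approval rates still differ by group: the model has learned to use the
# postcode score as a stand-in for the protected characteristic.
for g in (0, 1):
    print(f"group {g}: predicted approval rate {pred[group == g].mean():.2f}")
```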
A number of the responses drew attention to a potential contradiction in the way in which protected characteristics data should be used. In general, it would be considered discriminatory to use protected characteristic data (e.g. sex or race) in a decision-making algorithm and organisations may avoid holding this information for data minimisation reasons. However, not holding this data may also make evaluating outcomes for potential bias against protected characteristics effectively impossible. Respondents varied in their attitude towards this conflict, with some arguing in favour of collecting more protected characteristic data to allow for evaluation of outcomes and others disagreeing with this approach, citing concerns about General Data Protection Regulation (GDPR) and data security.
There was consensus that bias can enter algorithms at multiple points, from the underpinning data, to the design of the algorithms themselves, to the interaction between algorithmic outputs and human decision makers. Although some responses pointed to particular stages as being especially high risk, the general view was that bias identification and mitigation should be pursued at every stage of algorithm development and use.
Algorithms do not operate in isolation from their human designers and many responses emphasised the need to consider the interaction between the two. The importance of maintaining a primary role for human decision makers came out as a clear theme, in particular in the context of debates over accountability and autonomy. However, there was also agreement that these human decision makers needed careful training in how to properly interpret outputs, challenge potential biases and avoid inserting their own unconscious biases into the process. A number of respondents also emphasised increasing diversity in the teams which build algorithms as an important step to challenge bias in the design process.
Respondents were broadly positive about the role of public engagement and many discussed the importance of informing individuals who are having automated decisions made about them that they are subject to this approach and that they have a right to challenge the decision. While some responses did discuss the importance of dialogue with the public, many framed this issue more in terms of public education and information. When discussing existing challenges, most pointed to current low levels of understanding and some highlighted media scare stories as fostering a culture of public mistrust.
Discussion of possible regulatory and governance gaps was the only area where significant disagreement emerged from responses. Some felt that current gaps were clear and existing legal protections inadequate, while others argued that comprehensive legal frameworks already existed, citing in particular data protection and human rights legislation. The variety of perspectives can partly be explained by differences in sectoral contexts. For example, the financial services sector is heavily regulated and has a longer history of using algorithmic decision-making, so respondents were less likely to identify gaps in regulation. However, in policing these technologies are more novel, meaning that they often lack detailed guidance and precedents are yet to be established.
4. Summary of Responses
This section sets out a summary of the key themes in each question.
If respondents did not use the question format, we have incorporated their responses into the relevant themes below.
4.1 Question 1: The use of algorithmic tools
- What algorithmic tools are currently being developed or in use?
- Who is developing these tools?
- Who is selling these tools?
- How are these tools currently being used? How might they be used in the future?
- What does best practice look like and are there examples of especially innovative approaches?
- What are the key ethical concerns with the increased use of algorithms in making decisions about people?
A number of respondents went into a useful level of detail on current algorithmic approaches being pursued in their sector, either describing the range of products on the market or illustrating the in-house approaches being taken by the responding organisation. Responses focused on financial services outlined the range of ways the sector uses algorithms, including credit scoring, customer risk profiling, setting prices, processing complaints, marketing and fraud detection. In some cases tools are developed in-house, in others they are procured from external suppliers, and organisations may employ a combination of these approaches. The underlying methodology varies from more traditional methods, such as logistic regression and random forest models, to more advanced machine learning and natural language processing. Responses from the other sectors tended to go into less granular detail about the scale and nature of the algorithmic approaches being used.
Due to the higher volume of responses from the financial services and policing sectors, there was a greater range of insight into the current level of adoption in these areas. The responses received from recruitment and local government stakeholders gave useful insights, but this imbalance has emphasised to the CDEI the need for further stakeholder engagement focused on these sectors.
Understanding what is meant by algorithmic tools
Many respondents emphasised the broad range of uses covered by the term “algorithmic tools” and the very different ethical implications which stem from these. Across the sectors, algorithms may be used, for example, to generate insights, plan workflows or detect fraud without necessarily leading to decisions which are visible to customers or affect the way different customers are treated. While our focus is on the use of algorithms which are making or informing decisions about individuals, this is crucial context for understanding the range of ways this technology impacts on the sectors the Review is focusing on.
Extent of adoption
Perspectives on the extent of adoption varied by sector, with a number of respondents emphasising the established history of using algorithms in the financial services sector and others describing the relative novelty of their use in local government and recruitment. There remains a relative lack of clarity about how far the more novel machine learning based technologies are being used, with several respondents suggesting that the current landscape features pockets of disruptive innovation rather than broader, wholescale adoption.
In-house and technical development
Respondents provided evidence of algorithms being designed in-house by organisations and being procured externally. In cases where respondents were describing their own approaches, they often went into useful detail of the processes and ethical frameworks which they currently have in place.
Optimism and caution
Overall, the majority of respondents expressed a combination of optimism and caution about the use of decision-making algorithms. Some responses, in particular those focused on the public sector, argued that the use of such tools is completely inappropriate, but the majority expressed a more positive sense of the potential benefits. Almost all responses acknowledged the range of risks attached to this technology, in particular focusing on bias and fairness, but also including transparency, accountability and the importance of public acceptability.
4.2 Question 2: Bias identification and mitigation
- To what extent (either currently or in the future) do we know whether algorithmic decision-making is subject to bias?
- At what point is the process at highest risk of introducing bias? For example, in the data used to train the algorithm, the design of the algorithm, or the way a human responds to the algorithm’s output.
- Assuming this bias is occurring or at risk of occurring in the future, what is being done to mitigate it? And who should be leading efforts to do this?
- What tools do organisations need to help them identify and mitigate bias in their algorithms? Do organisations have access to these tools now?
- What examples are there of best practice in identifying and mitigating bias in algorithmic decision-making?
- What examples are there of algorithms being used to challenge biases within existing systems?
Many respondents focused their answers on this section and there was significant consensus among them. In particular, responses on the way in which bias can enter systems and the challenges in mitigation highlighted broadly similar themes.
Bias in existing systems
A repeated theme from respondents was that the ultimate source of bias in algorithms is not the algorithms themselves, but rather underpinning human biases. Whether this manifests itself in biased datasets reflecting patterns of real world discriminatory behaviour or in biased design decisions made by teams lacking in diversity, respondents were keen to emphasise that this issue is not purely, or even primarily, a technological one. When algorithms display biases, they are often reflecting pre-existing structural inequalities.
Risk occurs throughout the process
Most respondents who discussed this question agreed that bias could potentially enter algorithms at multiple stages, from data collection to technical design to human interpretation of outputs. Drawing on this assumption, most went on to argue that effective bias mitigation will not be a single-stage process but will require interventions at every stage.
Historic data and proxies
The risk of bias entering algorithms through the underpinning data used in design and predictions was addressed in some depth. In particular, respondents pointed out that data drawn from historic practice, which is likely to reflect historic patterns of behaviour, has the potential to be problematic. The issue of proxies, which allow algorithms to discriminate indirectly on the basis of certain characteristics without drawing on those characteristics explicitly, was repeatedly raised. Some respondents also discussed other forms of data bias, such as skewed sampling and over- and under-fitting.
Role of humans in decision-making
There was broad agreement that humans must retain a role in decision-making, in part as a defence against bias. However, this went alongside an acknowledgement that, without proper training, human intervention may fail to identify or mitigate possible biases and may in fact worsen them. An overarching theme of responses was the importance of training staff in bias mitigation when they are designing and using decision-making algorithms, and the need for more diverse teams to support this.
Mitigation approaches
Respondents outlined a range of approaches to identifying and mitigating biases. In particular, there was an emphasis on the importance of interrogating underlying datasets for potential biases, ongoing performance monitoring of algorithms and ensuring that staff designing and using tools are properly trained. The need to have skilled senior staff who can make informed judgements about potential bias and act as intelligent customers when procuring externally was also referenced.
Some respondents did point to individual tools which can be used to identify biases, with IBM's AI Fairness 360 toolkit mentioned repeatedly, but there was also a consensus that tools must be complemented by processes and principles for organisations to follow. Respondents referred to approaches involving pre-processing of data, modification of the algorithms themselves and post-processing of outputs, often in combination. Several also proposed the use of counterfactual approaches to assess fairness, and others argued that AI-based approaches would be necessary to identify bias occurring within AI models.
Several respondents commented that accepted statistical approaches for measuring bias already exist, but the challenge is that, given bias can never be removed entirely, there is no consensus about what standards algorithms should be held to and what best practice looks like. Others argued that a standard approach was unlikely to be achieved given the widely varying contexts involved.
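To illustrate the kind of accepted statistical measures respondents referred to, the sketch below computes two commonly cited bias metrics over a set of binary decisions. The function names and example figures are illustrative only and are not drawn from the responses; toolkits such as the AI Fairness 360 library mentioned above implement these and many other measures.

```python
# A minimal sketch of two widely cited statistical bias measures, computed
# directly with numpy. All names and example figures are illustrative.
import numpy as np

def statistical_parity_difference(decisions, groups):
    """Difference in positive-outcome rates between group 1 and group 0."""
    decisions, groups = np.asarray(decisions), np.asarray(groups)
    return decisions[groups == 1].mean() - decisions[groups == 0].mean()

def disparate_impact_ratio(decisions, groups):
    """Ratio of positive-outcome rates for group 1 relative to group 0."""
    decisions, groups = np.asarray(decisions), np.asarray(groups)
    return decisions[groups == 1].mean() / decisions[groups == 0].mean()

# Example: binary decisions (1 = favourable outcome) for ten individuals.
decisions = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]
groups    = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

print(statistical_parity_difference(decisions, groups))  # -0.6
print(disparate_impact_ratio(decisions, groups))         # 0.25
```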
Rather than focusing on tools and statistical methods, many respondents emphasised the need for sets of principles to guide decision-making. Some individual companies or public sector organisations had developed bespoke sets of principles, while others drew on sets developed by other organisations, such as the OECD Principles on Artificial Intelligence. Several respondents emphasised that humans and not algorithms themselves are the root cause of bias and so a key mitigation approach is addressing and improving human behaviour. This also led some to express a general scepticism as to how far bias can be removed from decision-making processes.
Complexity of defining fairness
While respondents acknowledged that bias in algorithmic decision-making represents a risk which should be mitigated, many also emphasised that this process is unlikely to be a straightforward one even with tools and principles in place. In particular, definitions of bias and fairness are contested and can conflict with one another.
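As a purely illustrative worked example of such a conflict (the figures below are hypothetical and not drawn from the responses), consider a lender approving the same proportion of applicants in two groups whose underlying rates of creditworthiness differ. Even in the best case, equal approval rates force unequal treatment of the genuinely creditworthy applicants in each group, so equal approval rates and equal true positive rates cannot both hold.

```python
# Purely illustrative figures showing how two common fairness definitions
# (equal approval rates vs equal true positive rates) can conflict when
# base rates differ between groups.

# Group A: 50% of applicants are truly creditworthy; group B: 20%.
# The lender approves exactly 40% of each group (equal approval rates) and,
# in the best case, always approves the most creditworthy applicants first.
groups = {"A": {"base_rate": 0.5, "approval_rate": 0.4},
          "B": {"base_rate": 0.2, "approval_rate": 0.4}}

def best_case_true_positive_rate(base_rate, approval_rate):
    # Share of genuinely creditworthy applicants who are approved,
    # assuming approvals go to creditworthy applicants first.
    approved_positives = min(approval_rate, base_rate)
    return approved_positives / base_rate

for name, g in groups.items():
    print(f"group {name}: true positive rate = {best_case_true_positive_rate(**g):.2f}")

# group A: true positive rate = 0.80
# group B: true positive rate = 1.00
# Equal approval rates leave 20% of creditworthy group A applicants rejected
# while every creditworthy group B applicant is approved; equalising the
# true positive rates instead would force unequal approval rates.
```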
4.3 Question 3: Public engagement
- What are the best ways to engage with the public and gain their buy in before deploying the use of algorithms in decision-making? For example, should a loan applicant be told that an algorithm is being used to assess their loan application?
- What are the challenges with engaging with the public on these issues?
- What are good examples of meaningful public engagement on these issues?
The main focus of many responses was on the importance of the public having the right information available to them, rather than on the need for public engagement more broadly. Themes of transparency and explainability came through strongly in the responses. Many saw building public trust in data-driven technologies as crucial if the UK is to benefit fully from their potential.
Awareness and understanding
A key theme in the responses was the differing levels of public understanding and the importance of not assuming the public is a homogeneous group with a single view. Respondents focusing on the finance sector believed that people generally understood that algorithms were used to calculate certain financial products, such as credit scores, but there was general agreement that, across all contexts, individuals must be made aware when algorithms are being used to make decisions about them.
Transparency and explainability
The themes of transparency and explainability came through in a large number of responses. Respondents were clear that individuals should be able to access information that clearly explains how an algorithm was used to inform a decision about them, in particular referring to GDPR requirements when a decision is fully automated. A number of respondents also cited the importance of individuals having a clear route to challenging the decision. However, there was no obvious consensus about the sort of information that should be made available, with the vast variation in public understanding of this technology cited as a challenge in achieving explainability. Respondents also raised issues around the trade-offs between explainability and accuracy, and there was no clear view on where that balance should lie.
Importance of context
Many respondents highlighted the importance of the context that algorithms are used in. There was a view that algorithms which are informing highly significant decisions about individuals require greater transparency and oversight. Respondents focusing on the policing and local government sectors highlighted this in particular. Context was also raised to highlight issues of commercial confidentiality, for example, in the finance sector, and some respondents argued that explainability must be balanced with effectiveness and efficiency.
Challenges of public engagement
The majority of respondents focused on the public’s right to be informed and to understand their rights (for example, under GDPR), rather than on the need for public engagement specifically. However, a small number did focus on the importance of public engagement, particularly in policing and local government, where the delivery of public services relies on public trust. The main challenge cited was the difficulty of conducting meaningful public engagement in an area that so few people fully understand. Some respondents drew attention to the tension between large-scale public engagement, which may struggle to encompass the complexity of the issues involved, and more targeted approaches such as focus groups, which, while more nuanced, are less representative.
4.4 Question 4: Regulation and governance
- What are the gaps in regulation of the use of algorithms?
- Are there particular improvements needed to existing regulatory arrangements to address the risk of unfair discrimination as a result of decisions being made by algorithms?
This question prompted the most diverse perspectives and revealed some significant differences between sectors. In contrast to the broad consensus in responses to the other three questions, here we saw disagreement between respondents arguing for increased regulation and governance and those arguing that the necessary structures are already in place.
Role of GDPR and human rights law
Respondents usefully outlined existing regulatory frameworks and generally focused on the role of the Information Commissioner’s Office in enforcing GDPR and the Equality and Human Rights Commission in promoting and upholding equality and human rights. Responses which were focused on individual sectors also outlined additional regulatory and governance frameworks which apply to them specifically, for example the work of the Financial Conduct Authority in the financial services sector.
Variation by sector
In general, the responses which were more cautious about the possibility of new regulatory or governance measures came from the financial services sector and pointed to the sector’s established history of using data in decision-making as well as its already robust regulatory framework. Responses which expressed concerns about regulatory gaps tended either to take a broader perspective of algorithmic decision-making as a whole or to focus on public sector uses of this technology.
Promoting innovation and adapting to new technology
Where respondents expressed concern about additional regulation and governance, they tended to focus on concerns that new measures, combined with existing legislation, would make the landscape too complex, and that premature regulation could prove counterproductive by inadvertently stifling innovation. On the other hand, supporters of the need for more frameworks (regulatory or otherwise) largely argued that the technology and the challenges which it presents are so new that existing approaches do not fully account for their potential impacts.
5. List of Organisations that Responded
- The Alan Turing Institute
- Barclays
- The Behavioural Insights Team
- Big Brother Watch
- Carnegie UK Trust
- Direct Line Group
- DMG Media
- Dragonfly
- Equifax
- Experian
- The Financial Conduct Authority (FCA)
- FinTrust, Newcastle University
- The Futures Institute
- Gemserv
- Huawei
- IBM
- The Institute of Chartered Accountants in England and Wales (ICAEW)
- Independent Digital Ethics Panel for Policing (IDEPP)
- Liberty
- Lloyd’s
- London Business School
- medConfidential
- Microsoft
- PwC UK
- RELX
- The Royal Society
- Royal Statistical Society
- Social Finance
- techUK
- UK Finance
- The Human Rights, Big Data and Technology Project, University of Essex
- Visa
- West Midlands Police
- What Works for Children’s Social Care
- Women Leading in AI
- Workday
- Yoti
In cases where individuals have responded, we have not listed their organisational affiliations.