RAND RRA3194-1
Examining the landscape of tools for trustworthy AI in the UK and the US
Current trends, future possibilities,
and potential avenues for collaboration
Salil Gunashekar, Henri van Soest, Michelle Qu, Chryssa Politi,
Maria Chiara Aquilino and Gregory Smith
For more information on this publication, visit www.rand.org/t/RRA3194-1
Research Integrity
Our mission to help improve policy and decision making through research and analysis is enabled through our
core values of quality and objectivity and our unwavering commitment to the highest level of integrity and ethical
behaviour. To help ensure our research and analysis are rigorous, objective, and nonpartisan, we subject our
research publications to a robust and exacting quality-assurance process; avoid both the appearance and reality
of financial and other conflicts of interest through staff training, project screening, and a policy of mandatory
disclosure; and pursue transparency in our research engagements through our commitment to the open publication
of our research findings and recommendations, disclosure of the source of funding of published research, and
policies to ensure intellectual independence. For more information, visit www.rand.org/about/principles.
RAND’s publications do not necessarily reflect the opinions of its research clients and sponsors.
Preface
Over the years, there has been a proliferation of frameworks, declarations and principles from various organisations around the globe to guide the development of trustworthy artificial intelligence (AI). These frameworks articulate the foundations for the desirable outcomes and objectives of trustworthy AI systems, such as safety, fairness, transparency, accountability and privacy. However, they do not provide specific guidance on how to achieve these objectives, outcomes and requirements in practice. This is where tools for trustworthy AI become important. Broadly, these tools encompass specific methods, techniques, mechanisms and practices that can help to measure, evaluate, communicate, improve and enhance the trustworthiness of AI systems and applications.

Against the backdrop of a fast-moving and increasingly complex global AI ecosystem, this study mapped UK and US examples of developing, deploying and using tools for trustworthy AI. The research also identified some of the challenges and opportunities for UK–US alignment and collaboration on the topic and proposes a set of practical priority actions for further consideration by policymakers. The report’s evidence aims to inform aspects of future bilateral cooperation between the UK and the US governments in relation to tools for trustworthy AI. Our analysis also intends to stimulate further debate and discussion among stakeholders as the capabilities and applications of AI continue to grow and the need for trustworthy AI becomes even more critical.

This rapid scoping study was conducted between November 2023 and January 2024 and was commissioned by the British Embassy Washington via the UK Foreign, Commonwealth and Development Office (FCDO) and the UK Department for Science, Innovation and Technology (DSIT). We would like to thank the project team at the British Embassy Washington for their support and guidance throughout the study. We are grateful for their valuable feedback and constructive guidance. In particular, we would like to thank Joe Cowen, Deepa Mani, Jonathan Tan and Alyssa Hanou. We would also like to thank our quality assurance reviewers at RAND Europe, Erik Silfversten and Sana Zakaria, for their feedback on drafts of the report. Finally, we are very grateful to the stakeholders who kindly agreed to participate in the interviews and crowdsourcing exercise.

RAND Europe is a not-for-profit research organisation that aims to improve policy and decision making in the public interest, through research and analysis. RAND Europe’s clients include European governments, institutions, non-governmental organisations and firms with a need for rigorous, independent, multidisciplinary analysis.

The findings and analysis within this report represent the views of the authors and are not official government policy. For more information about RAND Europe or this document, please contact:

Salil Gunashekar (Deputy Director, Science and Emerging Technology Research Group)
RAND Europe
Eastbrook House, Shaftesbury Road
Cambridge CB2 8DR
United Kingdom
Email: sgunashe@randeurope.org

Henri van Soest (Senior Analyst, Defence and Security Research Group)
RAND Europe
Rue de la Loi 82 / Bte 3
1040 Brussels
Belgium
Email: vansoest@randeurope.org
Executive summary
Background and context
The pace of progress of AI has been rapid in recent years. AI is already being used in many fields and is a technology that could bring significant benefits to society, such as enhancing productivity, innovation, health, education and well-being. However, AI and its progress also pose major risks and challenges – including social, ethical, legal, economic and technical – that need to be addressed to ensure that AI is trustworthy. Consequently, AI has become a critical area of interest for stakeholders around the globe and there have been many discussions and initiatives to ensure that AI is developed and deployed in a responsible and ethical manner.

In general, AI systems and applications are regarded as trustworthy when they can be reliably developed and deployed without adverse consequences to individuals, groups or society.

While there is no universally accepted definition of the term trustworthy AI, various stakeholders – governments and international organisations alike – have proposed their own definitions, which characterise trustworthy AI based on a series of principles or guidelines that often overlap across definitions. These include such characteristics as fairness, transparency, accountability, privacy, safety and explainability.

Tools for trustworthy AI are specific approaches or methods to help make AI more trustworthy and can help to bridge the gap between the high-level AI principles and characteristics, on the one hand, and the practical implementation of trustworthy AI, on the other.

These tools encompass methods, techniques, mechanisms and practices that can help to measure, evaluate, communicate, improve and enhance the trustworthiness of AI systems. Thus, the goal of tools for trustworthy AI is to provide developers, policymakers and other stakeholders with the resources they need to ensure that AI is developed and deployed in a responsible and ethical manner. In Chapter 1 and Annex A, we provide more information about what we mean by trustworthy AI and tools for trustworthy AI in the context of this study.
1 In this study, we characterised trustworthy AI based on the fundamental underlying principles and/or
characteristics of AI proposed by four major stakeholders across the world – specifically, the UK, the US, the
European Commission and the Organisation for Economic Co-operation and Development. In Chapter 1 and
Annex A, we provide further details about these principles and characteristics.
Indicative of a potentially fragmented landscape, we identified 233 tools for trustworthy AI, of which roughly 70% (n=163) were
associated with the US, 28% (n=66) were associated with the UK, and the remainder (n=4) represented a collaboration between US and
UK organisations. Broadly, the tools can be categorised as technical, procedural or educational (drawing on the classification used by
the Organisation for Economic Co-operation and Development), which further encompass a range of characteristics and dimensions
associated with trustworthy AI.
The landscape of tools for trustworthy AI in the US is more technical in nature, while the landscape in the UK is observed to be more
procedural. Roughly 72% (n=119) of the US tools were technical in nature, while 56% (n=37) of the UK tools were technical in nature.
30% (n=49) of the US tools were procedural, compared with 58% (n=38) of the UK tools. Finally, 9% (n=16) of the US tools were
educational, compared with 12% (n=8) of the UK tools.
Compared to the UK, the US has a greater degree of involvement of academia in the development of tools for trustworthy AI. Roughly
27% (n=45) of the US tools were developed by academia or collaboratively between academia and external partners, such as industry
or non-profit organisations. By contrast, 9% (n=6) of the UK tools for trustworthy AI involved academia.
Figure 1. Practical considerations for UK and US policymakers to help build a linked-up, aligned and agile ecosystem
ACTION 1: Link up with relevant stakeholders to proactively track and analyse the landscape of tools for trustworthy AI in the UK, the US and beyond.

ACTION 2: Systematically capture experiences and lessons learnt on tools for trustworthy AI, share those insights with stakeholders and use them to anticipate potential future directions.

ACTION 3: Promote the consistent use of a common vocabulary for trustworthy AI among stakeholders in the UK and the US.

ACTION 4: Encourage the inclusion of assessment processes in the development and use of tools for trustworthy AI to gain a better understanding of their effectiveness.

ACTION 5: Continue to partner and build diverse coalitions with international organisations and initiatives, and to promote interoperable tools for trustworthy AI.

ACTION 6: Join forces to provide resources such as data and computing power to support and democratise the development of tools for trustworthy AI.

(In the figure, the actions are arranged around a central cycle labelled: Monitor and discover; Analyse and understand; Learn and evaluate; Engage and collaborate; Share and communicate; Innovate and anticipate.)
Potential stakeholders to involve across the different actions: Department for Science, Innovation and Technology (including the Responsible Technology
Adoption Unit and UK AI Safety Institute); Foreign, Commonwealth & Development Office (including the British Embassy Washington); AI Standards Hub; UK
Research and Innovation; AI Research Resource; techUK; Evaluation Task Force in the UK; Government Office for Science; National Institute of Standards and
Technology; US AI Safety Institute; National Science Foundation; National Artificial Intelligence Research Resource; US national laboratories; Organisation for
Economic Co-operation and Development; European Commission; United Nations (and associated agencies); standards development organisations.
Table of contents
Preface i
Executive summary ii
Chapter 1. What is this study about? 1
1.1. Background and context 1
1.2. Objectives of the study 4
1.3. Overview of the methodology 4
Chapter 2. What does the landscape of tools for trustworthy AI look like in the UK and the US? 6
2.1. Overview of tools identified 8
2.2. The landscape of trustworthy AI in the UK and the US is moving from
principles to practice, and high-level guidelines are increasingly being complemented
by more specific, practical tools 10
2.3. Large US technology companies are developing wide-ranging toolkits to make AI
products and services more trustworthy 12
2.4. Some non-AI companies are developing their own internal guidelines on AI
trustworthiness to ensure they comply with ethical principles 14
2.5. There is limited evidence about the formal assessment of tools for trustworthy AI 15
2.6. The development of multimodal foundation models has increased the complexity
of developing tools for trustworthy AI 16
Chapter 3. What actions should be considered looking ahead? 17
3.1. Practical considerations for policymakers 18
Bibliography 29
Annex A. Further details on the underlying principles of trustworthy AI from different stakeholders 33
Annex B. Detailed methodological approach 36
Annex C. Longlist of tools for trustworthy AI 40
Chapter 1
What is this study about?

How safe, secure and reliable is an AI system? How can we ensure that AI systems are aligned with human values and respect human rights? How can we prevent and mitigate the potential harms of AI, such as bias, discrimination, manipulation and deception? How well and transparently are the decisions and actions of AI systems explained? How can we foster trust and confidence in AI among consumers and the public? These and other related questions have prompted much debate and discussion over the years about ‘trustworthy AI’ and how to ensure that AI systems and applications are trustworthy.
1.1.1. What do we mean by trustworthy AI and tools for trustworthy AI in the context of this study?

Trustworthy AI is a wide-ranging and complex concept. In general, AI systems and applications are regarded as trustworthy when they can be reliably developed and deployed without adverse consequences to individuals, groups or society. While there is no universally accepted definition of the term trustworthy AI, various stakeholders – governments and international organisations alike – have proposed their own definitions, which characterise trustworthy AI based on a series of principles or guidelines that often overlap across definitions. These include such characteristics as fairness, transparency, accountability, privacy, safety and explainability.

Over the years, discussions around trustworthy AI have prompted the development of various frameworks and principles for trustworthy AI, such as the European Commission’s (EC) Ethics Guidelines for Trustworthy AI3; the Organisation for Economic Co-operation and Development (OECD) AI Principles4; the United Nations Educational, Scientific and Cultural Organization’s (UNESCO) Recommendation on the Ethics of AI5; and, more recently, the underpinning principles of the UK government’s AI regulation white paper,6 the US Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence,7 and the White House Office of Science and Technology Policy’s Blueprint for an AI Bill of Rights.8 These frameworks and principles, to varying degrees of detail, lay the important foundations that outline the desirable outcomes and objectives of trustworthy AI systems – as well as the trustworthiness of the processes and involved stakeholders – throughout the system’s life cycle. However, they do not provide specific guidance on how to achieve these objectives, outcomes and requirements in practice.

This is where tools for trustworthy AI become important. Tools for trustworthy AI are specific approaches or methods to help make AI more trustworthy and can help to bridge the gap between the high-level AI principles and characteristics, on the one hand, and the practical implementation of trustworthy AI, on the other. Broadly, these tools encompass methods, techniques, mechanisms and practices that can help to measure, evaluate and communicate the trustworthiness of AI systems and applications (where trustworthiness can be characterised by different dimensions as listed above). They can also help to improve and enhance the trustworthiness of AI systems and applications by identifying and addressing potential issues and risks. Thus, the goal of tools for trustworthy AI is to provide developers, policymakers and other stakeholders with the resources they need to ensure that AI is developed and deployed in a responsible and ethical manner.

In this report, we focus on the state of play of tools for trustworthy AI in the UK and the US ecosystems. We characterised the trustworthiness of AI based on the fundamental underlying principles proposed by four major stakeholders in different regions across the world that are currently actively involved in key AI-related discussions and debates – specifically, the UK, the US, the EC and the OECD. In Table 1, we outline the key dimensions of trustworthy AI covered by each stakeholder. In Annex A, we provide further details on these principles and characteristics. We have deliberately relied on an inclusive and holistic interpretation of trustworthy AI. Such an expansive characterisation fed into our methodology to identify tools in the UK and the US and allowed us to capture a variety of examples of tools that have been designed and developed for trustworthy AI.

3 EC (2019).
4 OECD (2019).
5 UNESCO (2021).
6 DSIT (2023).
7 The White House (2023a).
8 The White House (2022).
Table 1. Key underlying principles and characteristics of trustworthy AI, from different stakeholders, that were used in this study

UK government9 – Five principles:
• Safety, security and robustness
• Appropriate transparency and explainability
• Fairness
• Accountability and governance
• Contestability and redress

National Institute of Standards and Technology (US)10 – Seven characteristics:
• Valid and reliable
• Safe
• Secure and resilient
• Accountable and transparent
• Explainable and interpretable
• Privacy-enhanced
• Fair – with harmful bias managed

European Commission11 – Three components:
• Lawful
• Ethical
• Robust
Four ethical principles:
• Respect for human autonomy
• Prevention of harm
• Fairness
• Explicability
Seven requirements:
• Human agency and oversight
• Technical robustness and safety
• Privacy and data governance
• Transparency
• Diversity, non-discrimination and fairness
• Societal and environmental well-being
• Accountability

Organisation for Economic Co-operation and Development12 – Five principles:
• Inclusive growth, sustainable development and well-being
• Human-centred values and fairness
• Transparency and explainability
• Robustness, security, and safety
• Accountability

Source: RAND Europe synthesis of the respective sources cited for each stakeholder
13 We reached out to 64 experts, based in the US, the UK and the EU.
14 The evidence from the interviews has been anonymised and cited throughout the report using unique interviewee identifiers (INT01, INT02, etc.).
In the final phase of the research, we cross-analysed the findings from the
desk research – i.e. the longlist of tools identified – and complemented
this analysis with information from the interviews. The resulting findings
form the basis of the narrative and key takeaways presented in this report.
We provide more details about the research methodology and associated
caveats in Annex B.
Chapter 2
What does the landscape of tools for trustworthy AI look like in the UK and the US?

In this chapter, we discuss what the landscape of tools for trustworthy AI looks like in the UK and the US, based on a cross-analysis of the document and database review and interviews. The chapter begins with a high-level descriptive overview of the range of tools identified, followed by an analysis on how these tools are being used in the context …

15 The examples we include in this report do not represent an endorsement of the tools or techniques or of the organisation developing them.
Indicative of a potentially fragmented landscape, we identified 233 tools for trustworthy AI, of which roughly 70% (n=163) were
associated with the US, 28% (n=66) were associated with the UK, and the remainder (n=4) represented a collaboration between US and
UK organisations. Broadly, the tools can be categorised as technical, procedural or educational (drawing on the classification used by
the Organisation for Economic Co-operation and Development), which further encompass a range of characteristics and dimensions
associated with trustworthy AI.
The landscape of tools for trustworthy AI in the US is more technical in nature, while the landscape in the UK is observed to be more
procedural. Roughly 72% (n=119) of the US tools were technical in nature, while 56% (n=37) of the UK tools were technical in nature.
30% (n=49) of the US tools were procedural, compared with 58% (n=38) of the UK tools. Finally, 9% (n=16) of the US tools were
educational, compared with 12% (n=8) of the UK tools.
Compared to the UK, the US has a greater degree of involvement of academia in the development of tools for trustworthy AI. Roughly
27% (n=45) of the US tools were developed by academia or collaboratively between academia and external partners, such as industry
or non-profit organisations. By contrast, 9% (n=6) of the UK tools for trustworthy AI involved academia.
16 The categories we used in this analysis align with those used in the OECD Catalogue of Tools & Metrics for Trustworthy AI (OECD 2021).
17 The Excel spreadsheet was populated based on the information contained in the source data we consulted or using our best understanding of the information associated with the tool that we analysed. We recognise
that some of the information contained in the source data may not be the most up-to-date information linked to that tool. Furthermore, it is possible for a tool to be linked to more than one category. For example, a
tool may be classified as both technical and educational in nature. As a result, the sum of these classification values may be larger than the number of tools identified.
18 It is worth noting that the total figure reported here reflects each tool example we identified in the underpinning source data – this includes an aggregation of individual tools as well as toolkits (that may, in some
examples, include constituent tools).
19 Although this cannot be verified without further in-depth examination of the different tools, the fact that procedural tools are less prominent in the US may be linked to cultural differences and a relatively more general
lack of support for certification compared with technical solutions in the US context (INT10).
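To make the caveat in footnote 17 concrete, the following minimal Python sketch shows how per-category counts can exceed the number of distinct tools when a single tool carries more than one category label. The tool names and labels are invented for illustration and are not taken from the study's data.

```python
# Illustrative sketch only: because a tool can carry more than one category
# label, per-category counts can sum to more than the number of distinct tools.
from collections import Counter

tools = {
    "tool_a": {"technical"},
    "tool_b": {"procedural"},
    "tool_c": {"technical", "educational"},  # counted under two categories
}

category_counts = Counter(label for labels in tools.values() for label in labels)
print(len(tools))                      # 3 distinct tools
print(sum(category_counts.values()))   # 4 category assignments in total
print(category_counts)                 # e.g. Counter({'technical': 2, 'procedural': 1, 'educational': 1})
```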
• Across the three broad categories, the tools encompass different specific tool
types. There is a diverse range of tool types, such as audit processes, checklists,
guidelines, standards and sectoral codes of conduct. For example, we identified
138 toolkits or software solutions, of which 115 were from the US (83%) and 23
were from the UK (17%).20 We identified 18 audit processes, of which 5 were from
the US (32%) and 13 from the UK (68%).
• The tools also had different levels of maturity. Using the OECD grouping for
tool readiness,21 we identified tools that were: under development; presented in
a published document; in the product stage; or implemented in multiple projects.
For example, we found 75 tools that were under development and 111 tools that
have been implemented in multiple projects.
• The tools were developed by a range of stakeholders across diverse types of
organisations spanning industry, academia and not-for-profit organisations.
There was also collaboration between these categories. For example, Microsoft
Research separately worked with the University of Pennsylvania22; the University
of Washington23; and the Montreal AI Ethics Institute, McGill University (both
in Quebec, Canada) and Carnegie Mellon University (also in Pennsylvania).24
Google worked with the Courant Institute of Mathematical Sciences at New York
University.25 However, we found differences between the involvement of academia
in the US and the UK tools for trustworthy AI ecosystems. Of the 163 US tools
we identified, 45 tools (approximately 27%) were developed by academia or
collaboratively between academia and external partners, such as industry or non-
profit organisations. By contrast, of the 66 UK tools we identified, 6 were developed
by academia (approximately 9%), and we only found 1 example of a British
academic institution working together with external partners.26
20 While toolkits and software could be seen as distinctive tool types, the OECD catalogue combines them into a single
category. We decided to maintain this category for the purposes of this study.
21 OECD (2024a).
22 Kearns et al. (2018).
23 Covert et al. (2020).
24 Gupta et al. (2020).
25 Cortes et al. (2017).
26 Berditchevskaia et al. (2021).
27 INT07; INT09.
28 EC (2024a); INT02.
29 INT07; INT08; INT10.
30 INT09.
33 INT05.
34 Tools are approaches to analyse or improve the trustworthiness of an AI model, while metrics are mathematical formulas for measuring certain technical requirements relating to trustworthy AI.
35 INT05.
36 INT05.
37 INT05; INT06.
38 INT05.
39 INT05.
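As a concrete illustration of the distinction drawn in footnote 34, the short Python sketch below computes one simple fairness metric (statistical parity difference) from model outputs; a 'tool' would typically wrap such metrics in a wider workflow. The function name and data are illustrative assumptions and are not taken from the report or from any particular toolkit.

```python
# Illustrative sketch only: a 'metric' in the sense of footnote 34 is a
# mathematical formula applied to model outputs. Names and data are hypothetical.

def statistical_parity_difference(predictions, groups, protected_group):
    """Difference in positive-prediction rates between the protected group
    and everyone else; 0.0 indicates parity on this particular metric."""
    protected = [p for p, g in zip(predictions, groups) if g == protected_group]
    others = [p for p, g in zip(predictions, groups) if g != protected_group]
    rate = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return rate(protected) - rate(others)

# Example: binary predictions (1 = positive outcome) for two groups A and B.
preds = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B']
print(statistical_parity_difference(preds, groups, 'B'))  # 0.25 - 0.75 = -0.5
```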
US telecommunications company Comcast has developed a set of security and privacy requirements for AI applications that serve as guardrails against outputs or uses of the AI model that cannot be considered trustworthy. These requirements consist of a baseline that all AI applications developed and deployed within Comcast have to meet, as well as two additional sets that are specific to continuously learning models and user-interacting models.46

Rolls Royce has developed the Aletheia framework, which is a framework to govern the ethical and responsible use of AI. It consists of a toolkit that addresses 32 facets of social impact, trust, transparency and governance. The goal of the framework is to guide developers, executives and boards on the deployment of AI. The toolkit was first developed for internal use by Rolls Royce, and the company then decided to make it public.47
46 Comcast (2023).
47 Rolls Royce (2023).
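The tiered structure described in the Comcast example above (a baseline plus additional requirement sets for particular model types) can be sketched in a small, purely hypothetical way; the requirement names below are invented and are not drawn from Comcast's published materials.

```python
# Hypothetical sketch of a tiered requirements structure: a baseline applies to
# every AI application, and extra sets apply to continuously learning or
# user-interacting models. All requirement names are invented for illustration.

BASELINE = {"data_minimisation", "access_control", "output_logging"}
CONTINUOUS_LEARNING_EXTRAS = {"drift_monitoring", "retraining_review"}
USER_INTERACTING_EXTRAS = {"content_filtering", "user_feedback_channel"}

def applicable_requirements(continuously_learning: bool, user_interacting: bool) -> set:
    """Return the union of requirement sets that apply to a given application."""
    required = set(BASELINE)
    if continuously_learning:
        required |= CONTINUOUS_LEARNING_EXTRAS
    if user_interacting:
        required |= USER_INTERACTING_EXTRAS
    return required

# Example: a chat assistant that learns from interactions must satisfy all three sets.
print(sorted(applicable_requirements(continuously_learning=True, user_interacting=True)))
```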
48 INT03; INT10.
49 INT01; INT05.
50 OECD (2024b).
51 CDEI and DSIT (2024a).
52 Graham et al. (2020).
Chapter 3
What actions should be considered looking ahead?
Action 1: Link up with relevant stakeholders to proactively track and analyse the landscape of tools for trustworthy AI in the UK, the US and beyond

Given the rapidly evolving capabilities of AI, the many ongoing global conversations about AI oversight, and the UK’s aim to take on a strategic, international leadership role in AI,60 we propose that in the short term, the UK adopts a pro-active role in continuously tracking and monitoring the potentially fragmented tools for trustworthy AI landscape. Given the pace at which AI is developing, it is important that the UK remains on the front foot so that it does not fall behind the developments – both technical and regulatory – that are taking place in the wider tools for trustworthy AI ecosystem.

As noted in this report, the OECD has created an online, interactive platform – the Catalogue of Tools & Metrics for Trustworthy AI – ‘to share and compare tools and build upon each other’s effort’.61 The UK and the US could continuously cooperate with the OECD team responsible for maintaining the Catalogue to extract a more detailed understanding about the UK and the US ecosystems (and other relevant jurisdictions), to seek guidance and insights on the state of play and direction of travel, and to collaborate on the technical infrastructure and capabilities required to monitor trends (e.g. automating the data collection). Over time, this could lead to acquiring a more robust, evidence-based awareness and understanding of the wider global landscape of tools for trustworthy AI, and its implications for the UK AI market and UK–US alignment. In the UK, DSIT could continue to play an active role in this engagement, as we recognise that – through its Portfolio of AI Assurance Techniques62 – it has partnered with the OECD.63 As noted on the Portfolio of AI Assurance Techniques website, the current examples of AI assurance techniques will be regularly updated over time with additional case studies.64 Furthermore, continuing to link up with local stakeholders in the wider ecosystem working on other aspects of tools for trustworthy AI – for example, the Alan Turing Institute, the British Standards Institution (BSI) and the National Physical Laboratory in the UK,65 as well as universities66 – will help cover a broader range of tools and ensure a more holistic understanding of the environment and its development trajectory.

Potential stakeholders to involve: DSIT, including the Responsible Technology Adoption Unit (RTA); techUK;67 the AI Standards Hub; the OECD; and the US National Institute of Standards and Technology (NIST).
60 DSIT (2024b).
61 OECD (2024a).
62 The portfolio was initially developed by the Centre for Data Ethics and Innovation. On 6 February 2024, this centre changed its name to the Responsible Technology Adoption Unit (RTA): CDEI and DSIT (2024c).
63 CDEI and DSIT (2024a); OECD (2024a).
64 CDEI and DSIT (2024b).
65 These three organisations, with the support of the UK government, are involved in a joint initiative – the AI Standards Hub – with a mission to ‘advance trustworthy and responsible AI with a focus on the role that
standards can play as governance tools and innovation mechanisms’ (AI Standards Hub 2024).
66 As noted in Chapter 2, based on the examples of tools identified, there appears to be a greater degree of collaboration between industry and academia in the US compared with the UK.
67 They are included because they were involved in the initial development of the Portfolio of AI Assurance Techniques.
68 This function could be (partially) served by the ‘Introduction to AI assurance’ resource, which is planned
to be published by the UK government in Spring 2024 and aims to raise awareness on AI assurance
techniques and help stakeholders increase their understanding of trustworthy AI systems (DSIT 2024b).
Action 2: Systematically capture experiences and lessons learnt on tools for trustworthy AI, share those insights with stakeholders and use them to anticipate potential future directions

… resource (e.g. like an online observatory and forum) that would need to be regularly updated to reflect new developments regarding tools for trustworthy AI. Since the information and analyses contained in this resource would be stakeholder driven and incorporate market-led ‘signals’, such a resource would have direct implications for the trajectory of the ecosystem of trustworthy AI in the UK and the US. Furthermore, the UK and the US could consider collaborating on actively soliciting the development of tools for trustworthy AI in the context of specific challenges, such as the UK Fairness Innovation Challenge.69

This approach of information exchange would not only facilitate continuous improvement and innovation in tools for trustworthy AI, to keep up with the rapid pace of AI development, but also provide evidence to anticipate potential future directions. This forward-looking approach could assist in the creation of more resilient and effective tools and strategies that could potentially cope with the uncertainty of fast-changing developments in AI. Together with Action 1, these activities could also contribute to increasing the awareness and accessibility of tools for trustworthy AI for stakeholders in the ecosystem. DSIT’s Portfolio of AI Assurance Techniques70 is a helpful foundation to build on and potentially expand out over time, along with the AI Standards Hub.71 Depending on the availability of resources, the portfolio and associated activities could be co-developed with a US-based entity, such as NIST. As noted in Action 1, it would be valuable to draw on the experiences of those involved in the OECD Catalogue of Tools & Metrics for Trustworthy AI.72

Potential stakeholders to involve: DSIT, including the RTA and the UK AI Safety Institute (UK AISI); the AI Standards Hub; the Government Office for Science; the OECD; NIST; and the US AI Safety Institute (US AISI).
Action 3: Promote the consistent use of a common vocabulary for trustworthy AI among stakeholders in the UK and the US

The emergence of numerous AI oversight frameworks across the world, including in the UK, the US and the European Union (EU), as noted in Chapter 1, highlights the need for developing a common taxonomy and for aligning terminology and vocabulary, particularly when it comes to operationalising a complex concept such as trustworthy AI. There is some inconsistency in terms of how various foundational concepts associated with trustworthy AI – fairness, transparency, accountability and safety, to name a few – are currently used by stakeholders in the UK, the US, and beyond (see Box 11).73 In addition, while such terms as risk governance and risk management are defined and operationalised by entities such as standards development organisations, individual countries have their own approaches that have been developed in parallel to international efforts.74 This existence of parallel tracks could be problematic, as a key step in boosting effective cooperation between two notable jurisdictions, such as the UK and the US, is to ensure that there is clarity and that stakeholders involved have a shared understanding – a lexicon – of the different phrases and concepts, while considering their respective unique socio-technical and regulatory contexts.75 This shared understanding, in turn, is key to achieving interoperability as well as regulatory clarity.76

The US and the EU have already made progress towards developing a shared terminology and taxonomy for AI (currently covering 65 terms)77 through the EU–US Trade and Technology Council (TTC) Joint Roadmap for Trustworthy AI and Risk Management.78 The UK could consider leveraging this work79 and/or getting involved to further boost transatlantic cooperation and harmonisation on AI, while tailoring it to the context of AI activities in the UK. Rather than starting from scratch, it will be helpful to draw on existing resources80 that are concerned with taxonomies and terminologies for trustworthy AI.81

Potential stakeholders to involve: DSIT, including the RTA; the FCDO (British Embassy Washington); the BSI; ANSI; NIST; and the EC.
73 INT03; INT10.
74 INT10.
75 It is worth noting that while it is important to have clarity and consensus on what trustworthy AI is and what its key characteristics are, it is perhaps less necessary to establish consensus on how to achieve
trustworthy AI. For example, it could be more valuable to seek to translate and map terminology to aid interoperability provided differing approaches are mutually understood.
76 INT10.
77 EC (2023b).
78 EC (2022, 2023a).
79 It could also draw on similar efforts conducted by the International Organization for Standardization (ISO), the Institute of Electrical and Electronics Engineers (IEEE) and NIST.
80 See, for example, Newman (2023); ISO (2021).
81 A potential venue for this effort – particularly from the perspective of the safety of advanced AI systems – is the planned International Report on the Science of AI Safety, which will be released by the UK government
in Spring 2024 (DSIT 2024b).
The UK, the US, the EC and the OECD incorporate the concept of fairness into their
conceptualisations of trustworthy AI. However, the definitions or interpretations of
fairness used in all four contexts differ in subtle but important ways. We reproduce these
definitions here:
UK: ‘AI systems should not undermine the legal rights of individuals or
organisations, discriminate unfairly against individuals or create unfair
market outcomes. Actors involved in all stages of the AI life cycle should
consider definitions of fairness that are appropriate to a system’s use,
outcomes and the application of relevant law.’82
US: ‘Concerns for equality and equity by addressing issues such as harmful
bias and discrimination.’83
EC: ‘Fairness has both a substantive and a procedural dimension. The
substantive dimension implies a commitment to ensuring equal and just
distribution of both benefits and costs, and ensuring that individuals and
groups are free from unfair bias, discrimination and stigmatisation. The
procedural dimension of fairness entails the ability to contest and seek
effective redress against decisions made by AI systems and by the humans
operating them.’84
OECD: ‘AI actors should respect the rule of law, human rights and
democratic values, throughout the AI system lifecycle. These include
freedom, dignity and autonomy, privacy and data protection, non-
discrimination and equality, diversity, fairness, social justice, and
internationally recognised labour rights. To this end, AI actors should
implement mechanisms and safeguards, such as capacity for human
determination, that are appropriate to the context and consistent with the
state of art.’85
82 DSIT (2023).
83 NIST (2023).
84 EC (2019).
85 OECD (2019).
Action 4: Encourage the inclusion of assessment processes in the development and use of tools for trustworthy AI to gain a better understanding of their effectiveness

As noted in Chapter 2, there is limited evidence associated with the formal assessment and evaluation of tools for trustworthy AI in the UK and the US. For example, does each tool enhance the specific aspects of trustworthy AI it is concerned with? Across the portfolio of tools, is there a general improvement in trustworthy AI, risk management and desirable outcomes associated with the different approaches? Does increased trust persist over time? While it might be too resource intensive to formally assess the quality and effectiveness of every tool, given the sizeable number and diverse range of tools being developed and used, the UK and the US could potentially consider informally or formally assessing and cross-analysing subsets of tools across the AI value chain (through stakeholder feedback, researcher observations, etc.). In addition, developers of tools for trustworthy AI could be encouraged to include (more) information about internal pre-release assessment and to participate in relevant post-release assessment activities.

This assessment would help not only to learn lessons (see Action 2), but also to track and understand the impacts and longer-term outcomes associated with tool design, development, deployment and use. Independent assessments would also promote transparency and accountability. Conducting longer-term follow-up studies can improve the evidence base and aid in understanding the effectiveness of tools. Over time, feedback from these assessments could contribute to helping developers design more effective and innovative tools, which can improve a wider range of outcomes. This action is directly linked to Action 2 and could involve a set of follow-on activities that are rolled out over time.

Potential stakeholders to involve: DSIT, including the RTA and UK AISI; the British Embassy Washington; NIST; the Evaluation Task Force in the UK; evaluation practitioners; and US AISI.
Action 5: Continue to partner and build diverse coalitions with international organisations and initiatives, and to promote interoperable tools for trustworthy AI

While tools for trustworthy AI are important, they are not enough on their own, particularly given the rapid proliferation of AI governance–related activities across the globe. Trustworthy AI is a global and cross-sectoral issue that requires the collaboration and coordination of AI actors from different countries, regions, sectors and disciplines, involving stakeholders with diverse skills, to share best practices, learn from each other and harmonise tools for trustworthy AI while respecting each other’s unique contexts. To ensure the development of responsible and trustworthy AI – and consequently of tools for trustworthy AI – the UK and the US could continue to engage with various stakeholders and promote inclusive dialogue and information exchange that involves diverse perspectives, with a particular emphasis on multistakeholder initiatives, international organisations, and other countries and regions that share similar values and a similar vision.

Examples of organisations and initiatives include: the Global Partnership on Artificial Intelligence (GPAI)86; OECD.AI87; the UN’s High-Level Advisory Body on Artificial Intelligence88; UNESCO89; the Hiroshima AI Process90 and other AI-related G7 activities; various AI-related activities across the EU91 (including, for example, the EC’s proposed legal framework on AI – the ‘AI Act’)92; and outcomes of the AI Safety Summit 202393 (e.g. partnerships with AI Safety Institutes across the globe).94 The UK and the US are already actively involved to varying degrees in these and other multilateral fora. It may also be useful to draw on the lessons learnt from developing recent transatlantic strategic collaborative vehicles in other related contexts, such as biosecurity95 and cybersecurity.96 Against the backdrop of the current regulatory uncertainty around AI, these high-level collaborative set-ups will foster dialogue and cooperation on the global governance and coordination of AI, as well as provide avenues for the adoption and implementation of the principles and practices of trustworthy AI – and, subsequently, the development and deployment of tools for trustworthy AI – in a compatible and interoperable manner.

Potential stakeholders to involve: DSIT, including the RTA and UK AISI; the FCDO; NIST; US AISI; the OECD; the EC; and the UN.
86 GPAI (2024).
87 OECD (2024c).
88 United Nations, Office of the Secretary-General’s Envoy on Technology (2024).
89 UNESCO (2024).
90 The White House (2023b).
91 EC (2024b).
92 EC (2024a).
93 FCDO et al. (2023).
94 These high-level international collaborations can be further developed through partnerships between dedicated institutes in the US, the UK and other countries. For example, the UK AISI has formed a partnership with the
US AISI and the Singaporean government to collaborate on safety testing of AI models (DSIT 2024b). As a further signal of strong bilateral collaboration between the UK and US on AI safety, on 1 April 2024, a memorandum
of understanding was signed to enable the UK and US AISIs ‘to work closely to develop an interoperable programme of work and approach to safety research, to achieve their shared objectives on AI safety’ (UK AISI 2024).
95 Cabinet Office (2024).
96 NCSC (2023).
Action 6: Join forces to provide resources such as data and computing power to support and democratise the development of tools for trustworthy AI

The current generation of foundation models are large and complex and are trained on vast amounts of publicly available data. This means it will become harder to find data that can be used as a holdout data set.97 This store of non-synthetic data that is not included in any existing models could be a useful resource for the development of tools to measure trustworthiness.98 There is potentially much unique data within governments that is not being accessed.99 The recently created US National AI Research Resource (NAIRR) and the UK AI Research Resources (AIRR) could potentially help provide access to these data through a joint cloud service. Similarly, developing and deploying large foundation models and creating appropriate tools for ensuring the trustworthiness of these models can require large amounts of compute. Academic and not-for-profit research can be an important source of independent research on AI assurance techniques. However, these researchers are being ‘priced out’ of this research because of the steep cost of compute.100 NAIRR and AIRR, together with the US national labs, could potentially help provide the necessary compute capacity for these efforts.101 Furthermore, it may be helpful to draw on the experiences of current models of international collaboration in AI compute, such as the recently announced memorandum of understanding between the UK and Canada.102 Directing efforts towards more equitable access and democratising compute and data to ‘internationalise’ tools for trustworthy AI could not only address the UK–US landscape, but also point towards common ambitions across key multilateral fora in the wider AI governance ecosystem (as highlighted in Action 5).

Potential stakeholders to involve: NAIRR; AIRR; DSIT, including UK AISI; UKRI; the US National Science Foundation (NSF); and US national labs.

In Figure 2, we provide a visual summary of the six practical actions suggested for policymakers.
97 In machine learning, a holdout dataset refers to data that has never been used in the training of the model. These types of data can be used to independently validate certain characteristics of the model while avoiding
the need to use data that the model is familiar with. Since very large foundation models are trained on almost all publicly available data, holdout data can become increasingly hard to find: Raschka (2018).
98 Synthetic data is data that has been generated through an algorithm; non-synthetic data is data that has been measured and collected in the ‘real world’: Jordon et al. (2022).
99 INT06. An example of a move to address this is the collaboration between NASA and IBM to release NASA’s Harmonized Landsat and Sentinel-2 (HLS) dataset of geospatial data: Blumenfeld (2023).
100 INT05; INT09.
101 UKRI (2024); NSF (2024).
102 DSIT (2024a).
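As a minimal illustration of the holdout idea described in footnote 97, the sketch below reserves a portion of data that is never used for training and evaluates a model only on that reserved portion. The model, data and function names are hypothetical placeholders and are not taken from the report.

```python
# Minimal sketch of the holdout idea: data a model has never seen during
# training is reserved and used only to measure a characteristic of interest
# (here, simple accuracy). Everything below is an invented toy example.
import random

def split_holdout(records, holdout_fraction=0.2, seed=0):
    """Shuffle the records and reserve a fraction as a holdout set."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_fraction))
    return shuffled[:cut], shuffled[cut:]  # (training data, holdout data)

def evaluate_on_holdout(model, holdout):
    """Score the model only on data excluded from training."""
    correct = sum(1 for x, label in holdout if model(x) == label)
    return correct / len(holdout)

# Usage with a toy 'model' that predicts whether a number is even.
data = [(n, n % 2 == 0) for n in range(100)]
train, holdout = split_holdout(data)
print(evaluate_on_holdout(lambda x: x % 2 == 0, holdout))  # 1.0 on this toy task
```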
Figure 2. Practical considerations for UK and US policymakers to help build a linked-up, aligned and agile ecosystem
ACTION 1: Link up with relevant stakeholders to proactively track and analyse the landscape of tools for trustworthy AI in the UK, the US and beyond.

ACTION 2: Systematically capture experiences and lessons learnt on tools for trustworthy AI, share those insights with stakeholders and use them to anticipate potential future directions.

ACTION 3: Promote the consistent use of a common vocabulary for trustworthy AI among stakeholders in the UK and the US.

ACTION 4: Encourage the inclusion of assessment processes in the development and use of tools for trustworthy AI to gain a better understanding of their effectiveness.

ACTION 5: Continue to partner and build diverse coalitions with international organisations and initiatives, and to promote interoperable tools for trustworthy AI.

ACTION 6: Join forces to provide resources such as data and computing power to support and democratise the development of tools for trustworthy AI.

(In the figure, the actions are arranged around a central cycle labelled: Monitor and discover; Analyse and understand; Learn and evaluate; Engage and collaborate; Share and communicate; Innovate and anticipate.)
Potential stakeholders to involve across the different actions: Department for Science, Innovation and Technology (including the Responsible Technology
Adoption Unit and UK AI Safety Institute); Foreign, Commonwealth & Development Office (including the British Embassy Washington); AI Standards Hub; UK
Research and Innovation; AI Research Resource; techUK; Evaluation Task Force in the UK; Government Office for Science; National Institute of Standards and
Technology; US AI Safety Institute; National Science Foundation; National Artificial Intelligence Research Resource; US national laboratories; Organisation for
Economic Co-operation and Development; European Commission; United Nations (and associated agencies); standards development organisations.
Bibliography https://github.com/Comcast/ProjectGuardRail
Cortes, Corinna, Xavier Gonzalvo, Vitaly Kuznetsov, Mehryar Mohri, &
AI Standards Hub (homepage). 2024. As of 5 February 2024: Scott Yang. 2017. ‘AdaNet: Adaptive Structural Learning of Artificial
https://aistandardshub.org/the-ai-standards-hub/ Neural Networks.’ As of 5 February 2024:
http://proceedings.mlr.press/v70/cortes17a.html
Berditchevskaia, Aleks, Eirini Malliaraki, & Kathy Peach. 2021. Participatory
AI for Humanitarian Innovation. London: NESTA. As of 5 February 2024: Covert, Ian, Scott Lundberg, & Su-In Lee. 2020. ‘Understanding Global
https://media.nesta.org.uk/documents/Nesta_Participatory_AI_for_ Feature Contributions with Additive Importance Measures.’ As of 5
humanitarian_innovation_Final.pdf February 2024:
https://arxiv.org/abs/2004.00668
Blumenfeld, Josh. 2023. ‘NASA and IBM Openly Release Geospatial AI
Foundation Model for NASA Earth Observation Data.’ As of 6 February DBT (Department for Business & Trade), FCDO (Foreign, Commonwealth
2024: and Development Office), & Prime Minister’s Office. 2023. ‘The Atlantic
https://www.earthdata.nasa.gov/news/impact-ibm-hls-foundation-model Declaration.’ London: HM Government. As of 22 January 2024:
https://www.gov.uk/government/publications/the-atlantic-declaration/
Cabinet Office. 2024. ‘UK and U.S. Announce New Strategic Partnership the-atlantic-declaration
to Tackle Increased Biological Threats.’ London: HM Government. As of 5
February 2024: DSIT (Department for Science, Innovation and Technology). 2023. ‘A Pro-
https://www.gov.uk/government/news/uk-and-us-announce-new- innovation Approach to AI Regulation.’ London: HM Government. As of 22
strategic-partnership-to-tackle-increased-biological-threats January 2024:
https://www.gov.uk/government/publications/
CDEI (Centre for Data Ethics and Innovation) and DSIT (Department for ai-regulation-a-pro-innovation-approach
Science, Innovation, and Technology). 2024a. ‘Find Out About Artificial
Intelligence (AI) Assurance Techniques.’ As of 22 January 2024: ———. 2024a. ‘UK-Canada Cooperation in AI Compute: Memorandum of
https://www.gov.uk/ai-assurance-techniques Understanding.’ London: HM Government. As of 5 February 2024:
https://www.gov.uk/government/publications/uk-canada-
———. 2024b. ‘Portfolio of AI Assurance techniques.’ As of 22 January 2024: cooperation-in-ai-compute-memorandum-of-understanding/
https://www.gov.uk/guidance/cdei-portfolio-of-ai-assurance-techniques uk-canada-cooperation-in-ai-compute-memorandum-of-understanding
———. 2024c. ‘The CDEI is Now the Responsible Technology Adoption ———. 2024b. ‘A Pro-innovation Approach to AI Regulation: Government
Unit.’ As of 6 February 2024: Response’ London: HM Government. As of 6 February 2024:
https://www.gov.uk/government/news/ https://www.gov.uk/government/consultations/ai-regulation-a-pro-
the-cdei-is-now-the-responsible-technology-adoption-unit innovation-approach-policy-proposals/outcome/a-pro-innovation-
approach-to-ai-regulation-government-response
29
DSIT (Department for Science, Innovation and Technology), Innovate FCDO (Foreign, Commonwealth and Development Office), DSIT
UK, the Equality and Human Rights Commission and the Information (Department for Science, Innovation and Technology), & UK AISI (UK AI
Commissioner’s Office. 2024. ‘Fairness Innovation Challenge.’ As of 5 Safety Institute). 2023. ‘AI Safety Summit 2023.’ As of 11 April 2024:
February 2024: https://www.gov.uk/government/topical-events/ai-safety-summit-2023
https://fairnessinnovationchallenge.co.uk/ Floridi, Luciano, Josh Cowls, Monica Beltrametti, Raja Chatila, Patrice
EC (European Commission). 2019. ‘Ethics Guidelines for Trustworthy AI.’ Chazerand, Virginia Dignum, Christop Luetge, Robert Madelin, Ugo
As of 22 January 2024: Pagallo, Francesca Rossie, Burkhard Schafer, Peggy Valcke, & Effy
https://digital-strategy.ec.europa.eu/en/library/ Vayena. 2019. AI4People’s Ethical Framework for a Good AI Society:
ethics-guidelines-trustworthy-ai Opportunities, Risks, Principles, and Recommendations. Brussels:
AI4People. As of 11 April 2024:
———. 2022. ‘TTC Joint Roadmap for Trustworthy AI and Risk
https://www.eismd.eu/wp-content/uploads/2019/11/AI4People’s-Ethical-
Management.’ As of 22 January 2024:
Framework-for-a-Good-AI-Society_compressed.pdf
https://digital-strategy.ec.europa.eu/en/library/
ttc-joint-roadmap-trustworthy-ai-and-risk-management GPAI (Global Partnership on Artificial Intelligence). 2024. ‘Global
Partnership on Artificial Intelligence.’ As of 22 January 2024:
———. 2023a. ‘EU–US Trade and Technology Council.’ As of 22 January
https://gpai.ai/
2024:
https://commission.europa.eu/strategy-and-policy/priorities-2019-2024/ Graham, Logan, Abigail Gilbert, Joshua Simons, Anna Thomas, & Helen
stronger-europe-world/eu-us-trade-and-technology-council_en Mountfield. 2020. Artificial Intelligence in Hiring: Assessing Impact on
Equality. London: Institute for the Future of Work. As of 5 February 2024:
———. 2023b. ‘EU-U.S. Terminology and Taxonomy for Artificial
https://www.ifow.org/publications/
Intelligence.’ As of 22 January 2024:
artificial-intelligence-in-hiring-assessing-impacts-on-equality
https://digital-strategy.ec.europa.eu/en/library/
eu-us-terminology-and-taxonomy-artificial-intelligence Gupta, Abhishek, Camylle Lanteigne, & Sara Kingsley. 2020. ‘SECure: A
Social and Environmental Certificate for AI Systems.’ As of 5 February
———. 2024a. ‘AI Act.’ As of 22 January 2024:
2024:
https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai
https://arxiv.org/abs/2006.06217
———. 2024b. ‘European Approach to Artificial Intelligence.’ As of 22
IBM Research. 2024a. ‘AI Fairness 360.’ As of 22 January 2024:
January 2024:
https://aif360.res.ibm.com/
https://digital-strategy.ec.europa.eu/en/policies/
european-approach-artificial-intelligence ———. 2024b. ‘AI Privacy 360.’ As of 22 January 2024:
https://aip360.res.ibm.com/
30 Examining the landscape of tools for trustworthy AI in the UK and the US
———. 2024c. ‘AI Explainability 360.’ As of 22 January 2024: NCSC (National Cyber Security Centre). 2023. ‘Guidelines for Secure AI
https://aix360.res.ibm.com/ System Development.’ As of 5 February 2024:
https://www.ncsc.gov.uk/collection/
———. 2024d. ‘Uncertainty Quantification 360.’ As of 22 January 2024:
guidelines-secure-ai-system-development
https://uq360.res.ibm.com/
Newman, Jessica. 2023. A Taxonomy of Trustworthiness for Artificial
———. 2024e. ‘AI FactSheets 360.’ As of 22 January 2024:
Intelligence. Berkeley, CA: University of California Berkeley Centre for
https://aifs360.res.ibm.com/
Long-Term Cybersecurity. As of 5 February 2024:
———. 2024f. ‘Adversarial Robustness 360 – Resources.’ As of 22 January https://cltc.berkeley.edu/wp-content/uploads/2023/01/Taxonomy_of_AI_
2024: Trustworthiness.pdf
https://art360.res.ibm.com/resources#overview
NIST (National Institute of Standards and Technology). 2023. Artificial
ISO (International Organization for Standardization). 2021. ISO/IEC DIS Intelligence Risk Management Framework (AIRMF1.0). Gaithersburg, MD:
22989(en) Information technology — Artificial intelligence — Artificial National Institute of Standards and Technology. As of 22 January 2024:
Intelligence Concepts and Terminology. Geneva: International Organization https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf
for Standardisation. As of 5 February 2024:
NSF (National Science Foundation). 2024. ‘Democratizing the Future
https://www.iso.org/obp/ui/#iso:std:iso-iec:22989:dis:ed-1:v1:en
of AI R&D: NSF to Launch National AI Research Resource Pilot.’ As of 5
Jordon, James, Lukasz Szpruch, Florimond Houssiau, Mirko Bottarelli, February 2024:
Giovanni Cherubin, Carsten Maple, Samuel Cohen, & Adrian Weller. 2022. https://new.nsf.gov/news/
Synthetic Data – What, Why and How? London: Alan Turing Institute. As of democratizing-future-ai-rd-nsf-launch-national-ai
5 February 2024:
OECD (Organisation for Economic Co-operation and Development).
https://arxiv.org/abs/2205.03257
2019. ‘Recommendation of the Council on Artificial Intelligence.’ As of 22
Kearns, Michael, Seth Neel, Aaron Roth, & Zhiwei Steven Wu. 2018. January 2024:
‘Preventing Fairness Gerrymandering: Auditing and Learning for Subgroup https://legalinstruments.oecd.org/en/instruments/OECD-LEGAL-0449
Fairness.’ As of 5 February 2024:
———. 2021. ‘Tools for Trustworthy AI: A Framework to Compare
https://arxiv.org/abs/1711.05144
Implementation Tools for Trustworthy AI.’ As of 22 January 2024:
Kumar, Ram Shankar Siva. 2021. ‘AI Security Risk Assessment https://www.oecd.org/science/tools-for-trustworthy-ai-008232ec-en.htm
Using Counterfit.’ Redmond, WA: Microsoft. As of 5 February 2024:
———. 2024a. ‘Catalogue of Tools & Metrics for Trustworthy AI – About
https://www.microsoft.com/en-us/security/blog/2021/05/03/
the Catalogue.’ As of 22 January 2024:
ai-security-risk-assessment-using-counterfit/
https://oecd.ai/en/catalogue/faq
31
———. 2024b. ‘Catalogue of Tools & Metrics for Trustworthy AI – Show Use Cases.’ As of 22 January 2024: https://oecd.ai/en/catalogue/tool-use-cases

———. 2024c. ‘Policies, Data and Analysis for Trustworthy Artificial Intelligence.’ As of 19 January 2024: https://oecd.ai/en/

———. 2024d. ‘Catalogue of Tools & Metrics for Trustworthy AI – Show Tools.’ As of 22 January 2024: https://oecd.ai/en/catalogue/tools

PAI (Partnership on AI). 2023. ‘PAI’s Guidance for Safe Foundation Model Deployment.’ As of 22 January 2024: https://partnershiponai.org/modeldeployment/

Raschka, Sebastian. 2018. ‘Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning.’ As of 5 February 2024: https://arxiv.org/abs/1811.12808

Rolls Royce. 2023. ‘The Aletheia Framework.’ As of 5 February 2024: https://www.rolls-royce.com/innovation/the-aletheia-framework.aspx

The White House. 2022. ‘Blueprint for an AI Bill of Rights.’ As of 22 January 2024: https://www.whitehouse.gov/ostp/ai-bill-of-rights/

———. 2023a. ‘Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence.’ As of 11 April 2024: https://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/

———. 2023b. ‘G7 Leaders’ Statement on the Hiroshima AI Process.’ As of 19 January 2024: https://www.whitehouse.gov/briefing-room/statements-releases/2023/10/30/g7-leaders-statement-on-the-hiroshima-ai-process/

UK AISI (UK AI Safety Institute). 2024. ‘Collaboration on the safety of AI: UK-US memorandum of understanding.’ London: HM Government. As of 11 April 2024: https://www.gov.uk/government/publications/collaboration-on-the-safety-of-ai-uk-us-memorandum-of-understanding/collaboration-on-the-safety-of-ai-uk-us-memorandum-of-understanding

UKRI (UK Research and Innovation). 2024. ‘AI Research Resource Funding Opportunity Launches.’ As of 5 February 2024: https://www.ukri.org/news/ai-research-resource-funding-opportunity-launches/

UNESCO. 2021. ‘Recommendation on the Ethics of Artificial Intelligence.’ As of 22 January 2024: https://unesdoc.unesco.org/ark:/48223/pf0000380455

———. 2024. ‘Artificial Intelligence.’ As of 19 January 2024: https://www.unesco.org/en/artificial-intelligence

United Nations, Office of the Secretary-General’s Envoy on Technology. 2024. ‘High-Level Advisory Body on Artificial Intelligence.’ As of 19 January 2024: https://www.un.org/techenvoy/ai-advisory-body
A.1. UK government

The UK government approach to trustworthy AI is based on the following principles103:

• Safety, security and robustness: ‘AI systems should function in a robust, secure and safe way throughout the AI life cycle, and risks should be continually identified, assessed and managed.’

• Appropriate transparency and explainability: ‘Transparency refers to the communication of appropriate information about an AI system to relevant people (for example, information on how, when, and for which purposes an AI system is being used). Explainability refers to the extent to which it is possible for relevant parties to access, interpret and understand the decision-making processes of an AI system.’

A.2. National Institute of Standards and Technology (US)

For AI systems to be trustworthy, they often need to be responsive to a multiplicity of criteria that are of value to interested parties. Approaches which enhance AI trustworthiness can reduce negative AI risks. The NIST Artificial Intelligence Risk Management Framework articulates the following characteristics of trustworthy AI and offers guidance for addressing them104:

• Validity: ‘Confirmation, through the provision of objective evidence, that the requirements for a specific intended use or application have been fulfilled.’
• Reliability: ‘Ability of an item to perform as required, without failure, for a given time interval, under given conditions.’

• Safety: ‘AI systems should not under defined conditions, lead to a state in which human life, health, property, or the environment is endangered.’

• Resilience: The ability to ‘withstand unexpected adverse events or unexpected changes in the environment or use’ of AI systems – or the ability to ‘maintain their functions and structure in the face of internal and external change and degrade safely and gracefully when this is necessary.’

• Transparency: ‘The extent to which information about an AI system and its outputs is available to individuals interacting with such a system – regardless of whether they are even aware that they are doing so.’

• Explainability: ‘A representation of the mechanisms underlying AI systems’ operation.’

• Interpretability: ‘The meaning of AI systems’ output in the context of their designed functional purposes.’

• Privacy: ‘Refers generally to the norms and practices that help to safeguard human autonomy, identity, and dignity.’

• Fairness: ‘Concerns for equality and equity by addressing issues such as harmful bias and discrimination.’

‘Creating trustworthy AI requires balancing each of these characteristics based on the AI system’s context of use. While all characteristics are socio-technical system attributes, accountability and transparency also relate to the processes and activities internal to an AI system and its external setting. Neglecting these characteristics can increase the probability and magnitude of negative consequences.’105
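Characteristics such as fairness are typically operationalised through measurement tools rather than treated as purely qualitative aspirations. Purely as an illustration, and not as part of the NIST framework or of any specific tool examined in this study, the sketch below computes one common group fairness metric, the demographic parity difference, over hypothetical model outputs; the data, the group labels and the 0.1 review threshold are assumptions chosen only to show the shape of such a check.

```python
# Illustrative sketch only: a minimal group fairness check of the kind many
# trustworthy-AI toolkits automate. All data and the 0.1 threshold are
# hypothetical assumptions, not values taken from this report.
from collections import defaultdict

def selection_rates(predictions, groups):
    """Return the share of positive predictions (1s) for each group."""
    totals, positives = defaultdict(int), defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += pred
    return {g: positives[g] / totals[g] for g in totals}

def demographic_parity_difference(predictions, groups):
    """Largest gap in selection rates between any two groups (0 = parity)."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())

if __name__ == "__main__":
    preds = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]                      # hypothetical model outputs
    grps = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]   # hypothetical protected attribute
    gap = demographic_parity_difference(preds, grps)
    print(f"Demographic parity difference: {gap:.2f}")
    print("Flag for review" if gap > 0.1 else "Within illustrative tolerance")
```

In practice, tools in this space compute many such metrics across subgroups and life-cycle stages; the point of the sketch is only to show, in general terms, how a principle can be turned into a measurable check.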
A.3. European Commission

The EC asked a High-Level Expert Group on AI to provide advice on the EU’s strategy for AI. One of the tasks of the group was to draft Ethics Guidelines for Trustworthy Artificial Intelligence. According to the expert group, AI should be106:

• Lawful: ‘Complying with all applicable laws and regulations.’

• Ethical: ‘Ensuring adherence to ethical principles and values.’

• Robust: ‘AI systems should offer a consistent performance regardless of the context or data.’

There are four ethical principles, rooted in fundamental rights, that are ‘ethical imperatives’ that must be respected at all times107:

• Respect for human autonomy: ‘Humans interacting with AI systems must be able to keep full and effective self-determination over themselves, and be able to partake in the democratic process. AI systems should not unjustifiably subordinate, coerce, deceive, manipulate, condition or herd humans. Instead, they should be designed to augment, complement and empower human cognitive, social and cultural skills.’
• Prevention of harm: ‘AI systems should neither cause nor exacerbate harm or otherwise adversely affect human beings. This entails the protection of human dignity as well as mental and physical integrity. AI systems and the environments in which they operate must be safe and secure.’

• Fairness: ‘Fairness has both a substantive and a procedural dimension. The substantive dimension implies a commitment to ensuring equal and just distribution of both benefits and costs, and ensuring that individuals and groups are free from unfair bias, discrimination and stigmatisation.… The procedural dimension of fairness entails the ability to contest and seek effective redress against decisions made by AI systems and by the humans operating them.’

• Explicability: ‘Processes need to be transparent, the capabilities and purpose of AI systems openly communicated, and decisions – to the extent possible – explainable to those directly and indirectly affected.’

In order to meet these principles, AI systems should at least meet these seven requirements:

• Human agency and oversight: ‘AI systems should support human autonomy and decision making, as prescribed by the principle of respect for human autonomy. This requires that AI systems should both act as enablers to a democratic, flourishing and equitable society by supporting the user’s agency and foster fundamental rights and allow for human oversight.’

• Technical robustness and safety: ‘Technical robustness requires that AI systems be developed with a preventative approach to risks and in a manner such that they reliably behave as intended while minimising unintentional and unexpected harm and preventing unacceptable harm.’

• Privacy and data governance: ‘Prevention of harm to privacy also necessitates adequate data governance that covers the quality and integrity of the data used, its relevance in light of the domain in which the AI systems will be deployed, its access protocols and the capability to process data in a manner that protects privacy.’

• Transparency: ‘The data sets and the processes that yield the AI system’s decision, including those of data gathering and data labelling as well as the algorithms used, should be documented to the best possible standard to allow for traceability and an increase in transparency. The processes and decisions made by AI should be explainable. AI systems should not represent themselves as humans to users; humans have the right to be informed that they are interacting with an AI system.’

• Diversity, non-discrimination and fairness: ‘In order to achieve Trustworthy AI, we must enable inclusion and diversity throughout the entire AI system’s life cycle. Besides the consideration and involvement of all affected stakeholders throughout the process, this also entails ensuring equal access through inclusive design processes as well as equal treatment.’

• Environmental and societal well-being: ‘In line with the principles of fairness and prevention of harm, the broader society, other sentient beings and the environment should be also considered as stakeholders throughout the AI system’s life cycle. Sustainability and ecological responsibility of AI systems should be encouraged, and research should be fostered into AI solutions addressing areas of global concern, such as for instance the Sustainable Development Goals. Ideally, AI systems should be used to benefit all human beings, including future generations.’
• Accountability: ‘The requirement of accountability complements the above requirements and is closely linked to the principle of fairness. It necessitates that mechanisms be put in place to ensure responsibility and accountability for AI systems and their outcomes, both before and after their development, deployment and use.’

A.4. OECD

The OECD AI principles were adopted through the OECD Council Recommendation on Artificial Intelligence in 2019, with the goal of promoting the responsible stewardship of AI. They include the following value-based principles108:

• Inclusive growth, sustainable development and well-being: ‘Stakeholders should proactively engage in responsible stewardship of trustworthy AI in pursuit of beneficial outcomes for people and the planet, such as augmenting human capabilities and enhancing creativity, advancing inclusion of underrepresented populations, reducing economic, social, gender and other inequalities, and protecting natural environments, thus invigorating inclusive growth, sustainable development and well-being.’

• Human-centred values and fairness: ‘AI actors should respect the rule of law, human rights and democratic values, throughout the AI system lifecycle. These include freedom, dignity and autonomy, privacy and data protection, non-discrimination and equality, diversity, fairness, social justice, and internationally recognised labour rights. To this end, AI actors should implement mechanisms and safeguards, such as capacity for human determination, that are appropriate to the context and consistent with the state of art.’

• Transparency and explainability: ‘AI actors should commit to transparency and responsible disclosure regarding AI systems. To this end, they should provide meaningful information, appropriate to the context, and consistent with the state of art to foster a general understanding of AI systems, to make stakeholders aware of their interactions with AI systems, including in the workplace, to enable those affected by an AI system to understand the outcome, and to enable those adversely affected by an AI system to challenge its outcome based on plain and easy-to-understand information on the factors, and the logic that served as the basis for the prediction, recommendation or decision.’

• Robustness, security and safety: ‘AI systems should be robust, secure and safe throughout their entire lifecycle so that, in conditions of normal use, foreseeable use or misuse, or other adverse conditions, they function appropriately and do not pose unreasonable safety risk. To this end, AI actors should ensure traceability, including in relation to datasets, processes and decisions made during the AI system lifecycle, to enable analysis of the AI system’s outcomes and responses to inquiry, appropriate to the context and consistent with the state of art. AI actors should, based on their roles, the context, and their ability to act, apply a systematic risk management approach to each phase of the AI system lifecycle on a continuous basis to address risks related to AI systems, including privacy, digital security, safety and bias.’

• Accountability: ‘AI actors should be accountable for the proper functioning of AI systems and for the respect of the above principles, based on their roles, the context, and consistent with the state of art.’
AI,109 DSIT’s Portfolio of AI Assurance Techniques110 and a selection of toolboxes and toolkits developed by technology companies.111 In addition, we conducted a series of targeted searches in Google. The review collected a wide-ranging set of relevant information on different actors in the UK and the US AI ecosystems, including research organisations and universities, thinktanks, national and regional industry associations, and government and industry initiatives on AI. Specifically, we extracted the following information about each tool into an Excel spreadsheet (the final longlist of tools we compiled is presented in the accompanying Excel file):

• Name of tool
• Short description of tool
• Developer(s) of tool
• Country or countries of tool
• Time period of development
• Type of tool
• Aim of tool
• Development stage of tool
• Target audience.
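The bullet list above effectively defines a simple record structure for each entry in the tools longlist. Purely as an illustration, the sketch below renders that structure in code; the TrustworthyAITool class name, the field types and the example values are our own assumptions and are not taken from the study’s Excel database.

```python
# Hypothetical sketch of the per-tool record implied by the fields listed above.
# Field names follow the bullet list; types and the example entry are assumptions.
from dataclasses import dataclass, field
from typing import List

@dataclass
class TrustworthyAITool:
    name: str                      # Name of tool
    short_description: str         # Short description of tool
    developers: List[str]          # Developer(s) of tool
    countries: List[str]           # Country or countries of tool
    time_period: str               # Time period of development
    tool_type: str                 # Type of tool
    aim: str                       # Aim of tool
    development_stage: str         # Development stage of tool
    target_audience: List[str] = field(default_factory=list)  # Target audience

# Example entry (values are illustrative only, not taken from the study's longlist)
example = TrustworthyAITool(
    name="Example bias-testing toolkit",
    short_description="Open-source library for measuring group fairness metrics.",
    developers=["Example university lab"],
    countries=["UK"],
    time_period="2021-present",
    tool_type="Technical (software library)",
    aim="Help developers detect and mitigate harmful bias in models.",
    development_stage="Released and actively maintained",
    target_audience=["AI developers", "auditors"],
)
print(example.name)
```

Structuring each entry in this way mirrors the fields listed above and would make it straightforward to filter the longlist by, for example, country or development stage.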
B.2.2. Task 2.2: Crowdsourcing exercise

We carried out a targeted online crowdsourcing exercise with experts to collect additional examples of tools for trustworthy AI in the UK and the US that might not have been picked up in Task 2.1. We also used the exercise to ask respondents for suggestions about other useful sources of information to consult (e.g. organisations, reports, articles), as well as their views on efforts towards improving collaboration between the UK and the US on trustworthy AI. The online crowdsourcing exercise was set up to run in the background once the research began, and it ran for the entire duration of the study. We created a data collection template using Google Sheets that contained the main fields we wanted to capture from the experts. The exercise was primarily aimed at AI researchers and representatives from government, industry and third sector organisations. We drew on the expertise within RAND and our wider networks to compile a list of 64 stakeholders from the US, the UK and EU countries, who were invited to fill out the crowdsourcing template. In total, we received ten responses to the crowdsourcing exercise.
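The report does not enumerate the fields of the Google Sheets template, so the snippet below is a speculative sketch of how such a data collection template could be laid out, assuming columns that broadly mirror the tool longlist plus space for respondent suggestions; every column name and the output file name are assumptions rather than a description of the template actually used.

```python
# Speculative sketch of a crowdsourcing data collection template.
# Column names are assumptions; the study's actual Google Sheets fields are not listed in the report.
import csv

ASSUMED_COLUMNS = [
    "Respondent organisation type",   # e.g. academia, industry, government, third sector
    "Tool name",
    "Short description",
    "Developer(s)",
    "Country (UK/US/other)",
    "Link or reference",
    "Suggested further sources",
    "Views on UK-US collaboration",
]

with open("crowdsourcing_template.csv", "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerow(ASSUMED_COLUMNS)
print("Template header written to crowdsourcing_template.csv")
```

In a set-up like this, each respondent would complete one row per suggested tool, which could then be reconciled with the longlist compiled in Task 2.1.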
B.2.3. Task 2.3: Stakeholder interviews

We conducted interviews with a range of stakeholders involved in the tools for trustworthy AI ecosystem.112 These included stakeholders connected to some of the tools we had identified in Task 2.1, as well as more general experts with knowledge of the wider landscape of tools for trustworthy AI. We conducted ten semi-structured interviews in total, covering both US and UK stakeholders from academia, industry, government and the third sector. The interviews lasted between 30 and 60 minutes and were conducted online, through Microsoft Teams. We developed a concise, tailored interview protocol that built on emerging findings from the desk research in Task 2.1. Where appropriate, we modified the questions we asked based on the interviewee’s expertise and background.
Below we list the indicative topics we discussed with interviewees:

• Understanding of the phrase ‘trustworthy AI’.
• Information about specific tools for making AI trustworthy.
• Awareness of wider developments and trends in the trustworthy AI space taking place in the UK and the US.
• Views on challenges associated with developing, deploying and using tools for trustworthy AI.
• Awareness of gaps or challenges in the current cooperation between the UK and the US on tools for trustworthy AI.
• Ideas for initiatives or wider priority areas to consider for future UK–US collaboration on trustworthy AI.
• Suggestions for organisations to speak to or further resources to consult in the research.

We analysed the interview data and integrated relevant insights and information from the interviews into the cross-analysis of the tools database. The cross-analysis of the evidence was conducted through discussions among core members of the study team. Informed by the analysis of the evidence, we also articulated a series of considerations for policymakers involved in the trustworthy AI ecosystem in the UK and the US.

B.3.2. Task 3.2: Reporting

In the final stage of the research, we synthesised all the data from the preceding stages of research. This information formed the basis of the findings included in this report. We have used message-led headings in the main sections of this report (Chapters 2 and 3) to communicate the findings of the research in a succinct manner that may be suitable for non-expert readers. Where relevant, we have also included examples of tools for trustworthy AI in the UK and the US to illustrate specific findings. In Annex C, we include the longlist of tools identified to enable readers to look up information about specific examples of tools in more depth.
First, the evidence we collected was largely drawn from targeted searches of tools for trustworthy AI and information provided by stakeholders we interviewed.

Second, the development of trustworthy AI and linked AI governance issues is a fast-moving field, involving multiple stakeholders across the world with differing priorities. By focusing on the UK and the US, we have not included important developments taking place in this rapidly evolving field in other parts of the world (including other regulatory policy discussions). However, we are confident, based on the approach we adopted, that our analysis provides a fair and relatively holistic picture of the state of evidence (at the time of writing) in the UK and the US.

Third, while we aimed to capture as many relevant examples of tools as possible within the study timeframe, the final longlist of tools was not intended to be exhaustive or definitive, nor did we evaluate or assess the effectiveness of the tools. Rather, the examples we captured served as concrete, illustrative cases of tools that have been developed and deployed in practice to make AI trustworthy. The database of tools was intended to provide a wide-ranging snapshot of the state of play at the time of writing. Furthermore, we collated information about each tool into our database based on the information contained in the source data we consulted or using our best understanding of the information associated with the tool that we analysed. The final longlist we compiled highlights a wide spectrum of tools in this growing area that span different parts of the AI value chain, target diverse sectors, and cover a variety of dimensions and characteristics of AI trustworthiness.

Finally, we spoke to a relatively small sample of interviewees, mainly because of the tight timeframes within which the research had to be completed, which has meant that the diversity of views captured in the research is limited. Moreover, it was beyond the scope of this study to independently verify all the information that the interviewees provided. However, the interviews were only intended to complement the document and data review and to gather views and perceptions from UK- and US-based stakeholders working in the wider trustworthy AI ecosystem. Furthermore, within the sample of interviewees, we attempted to seek expert opinion across a range of stakeholders from industry, academia, government and the third sector.

Notwithstanding the caveats discussed above, we hope that the analyses and findings presented in this report will be useful to inform future thinking related to the growing and increasingly important area of tools for trustworthy AI.
113 These fields were completed with varying levels of specificity that depended on the information associated
with each tool in the underlying evidence we reviewed.
114 OECD (2024a).