Random Sample as a Pre-pilot Evaluation of Benefits and Risks for AI in Public Sector

Vethman, Steven; Schaaphok, Marianne; Hoekstra, Marissa; Veenman, Cor

doi:10.1007/978-3-031-50485-3_10

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1948)

Included in the following conference series:

European Conference on Artificial Intelligence

Abstract

Public organisations have adopted AI in their public services, aiming to tap into its promised potential for society, such as increasing the efficiency and effectiveness of current processes. Recent studies from the European Commission report, however, that critical issues with AI systems tended to surface only once the systems were already in operation and had thus already affected citizens. To prevent negative impact on citizens, we propose that public organisations use random sampling as a safe yet valuable practical evaluation step before considering a pilot. This pre-pilot evaluation step enables evaluation of the AI system without applying it in any decisions or actions that affect citizens. We pose six arguments for the added value of random sampling in the evaluation of AI systems: 1) it provides high-quality data for evaluation and validation of assumptions; 2) it supports gathering input for fairness evaluation; 3) it creates a benchmark against which to compare AI with alternatives; 4) it enables challenging assumptions in the organisation and in the AI development; 5) it supports a discussion on the limitations of AI; and 6) it provides a safe space to evaluate and reflect. In addition, we discuss limitations and challenges of random sampling in the evaluation, such as temporary loss of efficiency, class and representation imbalances, organisational hesitancy and societal experiences. We invite the participants of this workshop to reflect with us on the potential benefits and challenges, and in turn to distil the practical requirements under which using a random sample for evaluation is safe and useful.
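As an illustration of the idea in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical setting in which cases are drawn by simple random sampling, the AI system's flags are recorded but never acted upon, and the true outcome of each sampled case becomes known through the organisation's regular process. The function names, the synthetic population and the chosen metrics are all assumptions made for this example.

```python
import random


def draw_random_sample(case_ids, sample_size, seed=0):
    """Draw a simple random sample of cases without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(list(case_ids), sample_size)


def evaluate_on_sample(sample, ai_flag, outcome):
    """Compare AI flags with observed outcomes on a random sample.

    ai_flag and outcome map a case id to a bool. The AI flags are only
    recorded here; they are not used to select or treat any case.
    """
    positives = [c for c in sample if outcome[c]]
    flagged = [c for c in sample if ai_flag[c]]
    flagged_positives = [c for c in flagged if outcome[c]]
    return {
        # share of positives in the random sample: the benchmark that
        # any selection method has to beat
        "base_rate": len(positives) / len(sample),
        # share of AI-flagged cases that are actually positive
        "precision": len(flagged_positives) / len(flagged) if flagged else None,
        # share of positives that the AI would have flagged
        "recall": len(flagged_positives) / len(positives) if positives else None,
    }


# Synthetic, purely illustrative population of 1000 cases (hypothetical data).
population = range(1000)
outcome = {c: c % 10 == 0 for c in population}  # assumed ground truth
ai_flag = {c: c % 5 == 0 for c in population}   # assumed AI output

sample = draw_random_sample(population, sample_size=100, seed=42)
report = evaluate_on_sample(sample, ai_flag, outcome)
print(report)
```

Because the sample is random rather than selected by the AI system, the base rate is an estimate for the whole population, and the AI system's precision and recall can be compared against it before any decision relies on the system.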

Notes

1.
https://appl-ai-tno.nl/projects/ai-oversight-lab/.

Acknowledgement

We would like to thank all our colleagues in the AI Oversight Lab (https://appl-ai-tno.nl/projects/ai-oversight-lab/), our external partners, as well as all other public and private organisations that have facilitated transparency on this urgent yet sensitive topic such that the lessons described in this paper could be learned.

Author information

Authors and Affiliations

Netherlands Organisation for Applied Scientific Research (TNO) - Data Science, The Hague, The Netherlands
Steven Vethman, Marianne Schaaphok & Cor Veenman
Netherlands Organisation for Applied Scientific Research (TNO) - Vector, The Hague, The Netherlands
Marissa Hoekstra
Leiden University - Leiden Institute of Advanced Computer Science (LIACS), Leiden, The Netherlands
Cor Veenman

Corresponding author

Correspondence to Steven Vethman.

Editor information

Editors and Affiliations

Halmstad University, Halmstad, Sweden
Sławomir Nowaczyk
Warsaw University of Technology, Warsaw, Poland
Przemysław Biecek
Warsaw University, Warsaw, Poland
Neo Christopher Chung
University of Huddersfield, Huddersfield, UK
Mauro Vallati
AGH University of Science and Technology, Kraków, Poland
Paweł Skruch
AGH University of Science and Technology, Kraków, Poland
Joanna Jaworek-Korjakowska
University of Huddersfield, Huddersfield, UK
Simon Parkinson
University of Huddersfield, Huddersfield, UK
Alexandros Nikitas
Universität Osnabrück, Osnabrück, Germany
Martin Atzmüller
University of Economics Prague, Prague, Czech Republic
Tomáš Kliegr
University of Bamberg, Bamberg, Germany
Ute Schmid
Jagiellonian University, Kraków, Poland
Szymon Bobek
Jožef Stefan Institute, Ljubljana, Slovenia
Nada Lavrac
HU University of Applied Sciences Utrecht, Utrecht, The Netherlands
Marieke Peeters
Rotterdam University of Applied Sciences, Rotterdam, The Netherlands
Roland van Dierendonck
Amsterdam University of Applied Sciences, Amsterdam, The Netherlands
Saskia Robben
University of Reims Champagne-Ardenne, Reims, France
Eunika Mercier-Laurent
Istanbul Technical University, Istanbul, Türkiye
Gülgün Kayakutlu
Wroclaw University of Economics and Business, Wrocław, Poland
Mieczyslaw Lech Owoc
University of Galway, Galway, Ireland
Karl Mason
University of Galway, Galway, Ireland
Abdul Wahid
University of Calabria, Rende, Italy
Pierangela Bruno
University of Calabria, Rende, Italy
Francesco Calimeri
Marche Polytechnic University, Ancona, Italy
Francesco Cauteruccio
University of Calabria, Rende, Italy
Giorgio Terracina
University of Bamberg, Bamberg, Germany
Diedrich Wolter
Coburg University of Applied Sciences, Coburg, Germany
Jochen L. Leidner
FAU Erlangen-Nürnberg, Erlangen, Germany
Michael Kohlhase
University of Leeds, Leeds, UK
Vania Dimitrova

About this paper

Cite this paper

Vethman, S., Schaaphok, M., Hoekstra, M., Veenman, C. (2024). Random Sample as a Pre-pilot Evaluation of Benefits and Risks for AI in Public Sector. In: Nowaczyk, S., et al. Artificial Intelligence. ECAI 2023 International Workshops. ECAI 2023. Communications in Computer and Information Science, vol 1948. Springer, Cham. https://doi.org/10.1007/978-3-031-50485-3_10

DOI: https://doi.org/10.1007/978-3-031-50485-3_10
Published: 25 January 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50484-6
Online ISBN: 978-3-031-50485-3
eBook Packages: Computer Science, Computer Science (R0)
