-
Using Large Language Models for Humanitarian Frontline Negotiation: Opportunities and Considerations
Authors:
Zilin Ma,
Susannah Su,
Nathan Zhao,
Linn Bieske,
Blake Bullwinkel,
Yanyi Zhang,
Sophia Yang,
Ziqing Luo,
Siyao Li,
Gekai Liao,
Boxiang Wang,
Jinglun Gao,
Zihan Wen,
Claude Bruderlein,
Weiwei Pan
Abstract:
Humanitarian negotiations in conflict zones, called frontline negotiation, are often highly adversarial, complex, and high-risk. Several best practices have emerged over the years that help negotiators extract insights from large datasets to navigate nuanced and rapidly evolving scenarios. Recent advances in large language models (LLMs) have sparked interest in the potential for AI to aid decision making in frontline negotiation. Through in-depth interviews with 13 experienced frontline negotiators, we identified their needs for AI-assisted case analysis and creativity support, as well as concerns surrounding confidentiality and model bias. We further explored the potential for AI augmentation of three standard tools used in frontline negotiation planning. We evaluated the quality and stability of our ChatGPT-based negotiation tools in the context of two real cases. Our findings highlight the potential for LLMs to enhance humanitarian negotiations and underscore the need for careful ethical and practical considerations.
Submitted 30 May, 2024; v1 submitted 30 May, 2024;
originally announced May 2024.
-
Pavlok-Nudge: A Feedback Mechanism for Atomic Behaviour Modification with Snoring Usecase
Authors:
Shreya Ghosh,
Md Rakibul Hasan,
Pradyumna Agrawal,
Zhixi Cai,
Susannah Soon,
Abhinav Dhall,
Tom Gedeon
Abstract:
This paper proposes a feedback mechanism to 'break bad habits' using the Pavlok device. Pavlok utilises beeps, vibration and shocks as an aversion technique to help individuals with behaviour modification. While the device can be useful in certain periodic daily life situations, like alarms and exercise notifications, the device relies on manual operations that limit its usage. To this end, we design a user interface to generate an automatic feedback mechanism that integrates Pavlok and a deep learning based model to detect certain behaviours via an integrated user interface, i.e. a mobile or desktop application. Our proposed solution is implemented and verified in the context of snoring: it first captures audio from the environment and then predicts whether the audio content is a snore or not. Based on the prediction of the deep learning model, we use Pavlok to alert users for preventive measures. We believe that this simple solution can help people to change their atomic habits, which may lead to long-term benefits.
Submitted 10 May, 2023; v1 submitted 10 May, 2023;
originally announced May 2023.
-
Meaningful human command: Advance control directives as a method to enable moral and legal responsibility for autonomous weapons systems
Authors:
Susannah Kate Devitt
Abstract:
21st Century war is increasing in speed, with conventional forces combined with massed use of autonomous systems and human-machine integration. However, a significant challenge is how humans can ensure moral and legal responsibility for systems operating outside of normal temporal parameters. This chapter considers whether humans can stand outside of real time and authorise actions for autonomous systems by the prior establishment of a contract, for actions to occur in a future context, particularly in faster than real time or in very slow operations where human consciousness and concentration could not remain well informed. The medical legal precedent found in 'advance care directives' suggests how the time-consuming, deliberative process required for accountability and responsibility of weapons systems may be achievable outside real time, captured in an 'advance control directive' (ACD). The chapter proposes 'autonomy command' scaffolded and legitimised through the construction of ACD ahead of the deployment of autonomous systems.
Submitted 3 August, 2023; v1 submitted 12 March, 2023;
originally announced March 2023.
-
Bad, mad, and cooked: Moral responsibility for civilian harms in human-AI military teams
Authors:
Susannah Kate Devitt
Abstract:
This chapter explores moral responsibility for civilian harms by human-artificial intelligence (AI) teams. Although militaries may have some bad apples responsible for war crimes and some mad apples unable to be responsible for their actions during a conflict, increasingly militaries may 'cook' their good apples by putting them in untenable decision-making environments through the processes of replacing human decision-making with AI determinations in war making. Responsibility for civilian harm in human-AI military teams may be contested, risking operators becoming detached, being extreme moral witnesses, becoming moral crumple zones or suffering moral injury from being part of larger human-AI systems authorised by the state. Acknowledging military ethics, human factors and AI work to date as well as critical case studies, this chapter offers new mechanisms to map out conditions for moral responsibility in human-AI teams. These include: 1) new decision responsibility prompts for critical decision method in a cognitive task analysis, and 2) applying an AI workplace health and safety framework for identifying cognitive and psychological risks relevant to attributions of moral responsibility in targeting decisions. Mechanisms such as these enable militaries to design human-centred AI systems for responsible deployment.
Submitted 6 September, 2023; v1 submitted 31 October, 2022;
originally announced November 2022.
-
Improving alignment of dialogue agents via targeted human judgements
Authors:
Amelia Glaese,
Nat McAleese,
Maja Trębacz,
John Aslanides,
Vlad Firoiu,
Timo Ewalds,
Maribeth Rauh,
Laura Weidinger,
Martin Chadwick,
Phoebe Thacker,
Lucy Campbell-Gillingham,
Jonathan Uesato,
Po-Sen Huang,
Ramona Comanescu,
Fan Yang,
Abigail See,
Sumanth Dathathri,
Rory Greig,
Charlie Chen,
Doug Fritz,
Jaume Sanchez Elias,
Richard Green,
Soňa Mokrá,
Nicholas Fernando,
Boxi Wu
, et al. (9 additional authors not shown)
Abstract:
We present Sparrow, an information-seeking dialogue agent trained to be more helpful, correct, and harmless compared to prompted language model baselines. We use reinforcement learning from human feedback to train our models with two new additions to help human raters judge agent behaviour. First, to make our agent more helpful and harmless, we break down the requirements for good dialogue into natural language rules the agent should follow, and ask raters about each rule separately. We demonstrate that this breakdown enables us to collect more targeted human judgements of agent behaviour and allows for more efficient rule-conditional reward models. Second, our agent provides evidence from sources supporting factual claims when collecting preference judgements over model statements. For factual questions, evidence provided by Sparrow supports the sampled response 78% of the time. Sparrow is preferred more often than baselines while being more resilient to adversarial probing by humans, violating our rules only 8% of the time when probed. Finally, we conduct extensive analyses showing that though our model learns to follow our rules it can exhibit distributional biases.
Submitted 28 September, 2022;
originally announced September 2022.
-
A method for ethical AI in Defence: A case study on developing trustworthy autonomous systems
Authors:
Tara Roberson,
Stephen Bornstein,
Rain Liivoja,
Simon Ng,
Jason Scholz,
S. Kate Devitt
Abstract:
What does it mean to be responsible and responsive when developing and deploying trusted autonomous systems in Defence? In this short reflective article, we describe a case study of building a trusted autonomous system - Athena AI - within an industry-led, government-funded project with diverse collaborators and stakeholders. Using this case study, we draw out lessons on the value and impact of embedding responsible research and innovation-aligned, ethics-by-design approaches and principles throughout the development of technology at high translation readiness levels.
Submitted 21 June, 2022;
originally announced June 2022.
-
StreamingQA: A Benchmark for Adaptation to New Knowledge over Time in Question Answering Models
Authors:
Adam Liška,
Tomáš Kočiský,
Elena Gribovskaya,
Tayfun Terzi,
Eren Sezener,
Devang Agrawal,
Cyprien de Masson d'Autume,
Tim Scholtes,
Manzil Zaheer,
Susannah Young,
Ellen Gilsenan-McMahon,
Sophia Austin,
Phil Blunsom,
Angeliki Lazaridou
Abstract:
Knowledge and language understanding of models evaluated through question answering (QA) has usually been studied on static snapshots of knowledge, like Wikipedia. However, our world is dynamic, evolves over time, and our models' knowledge becomes outdated. To study how semi-parametric QA models and their underlying parametric language models (LMs) adapt to evolving knowledge, we construct a new large-scale dataset, StreamingQA, with human written and generated questions asked on a given date, to be answered from 14 years of time-stamped news articles. We evaluate our models quarterly as they read new articles not seen in pre-training. We show that parametric models can be updated without full retraining, while avoiding catastrophic forgetting. For semi-parametric models, adding new articles into the search space allows for rapid adaptation; however, models with an outdated underlying LM under-perform those with a retrained LM. For questions about higher-frequency named entities, parametric updates are particularly beneficial. In our dynamic world, the StreamingQA dataset enables a more realistic evaluation of QA models, and our experiments highlight several promising directions for future research.
Submitted 23 May, 2022;
originally announced May 2022.
-
Teaching language models to support answers with verified quotes
Authors:
Jacob Menick,
Maja Trebacz,
Vladimir Mikulik,
John Aslanides,
Francis Song,
Martin Chadwick,
Mia Glaese,
Susannah Young,
Lucy Campbell-Gillingham,
Geoffrey Irving,
Nat McAleese
Abstract:
Recent large language models often answer factual questions correctly. But users can't trust any given claim a model makes without fact-checking, because language models can hallucinate convincing nonsense. In this work we use reinforcement learning from human preferences (RLHP) to train "open-book" QA models that generate answers whilst also citing specific evidence for their claims, which aids in the appraisal of correctness. Supporting evidence is drawn from multiple documents found via a search engine, or from a single user-provided document. Our 280 billion parameter model, GopherCite, is able to produce answers with high quality supporting evidence and abstain from answering when unsure. We measure the performance of GopherCite by conducting human evaluation of answers to questions in a subset of the NaturalQuestions and ELI5 datasets. The model's response is found to be high-quality 80% of the time on this Natural Questions subset, and 67% of the time on the ELI5 subset. Abstaining from the third of questions for which it is most unsure improves performance to 90% and 80% respectively, approaching human baselines. However, analysis on the adversarial TruthfulQA dataset shows why citation is only one part of an overall strategy for safety and trustworthiness: not all claims supported by evidence are true.
Submitted 21 March, 2022;
originally announced March 2022.
-
Faster indicators of dengue fever case counts using Google and Twitter
Authors:
Giovanni Mizzi,
Tobias Preis,
Leonardo Soares Bastos,
Marcelo Ferreira da Costa Gomes,
Claudia Torres Codeço,
Helen Susannah Moat
Abstract:
Dengue is a major threat to public health in Brazil, the world's sixth biggest country by population, with over 1.5 million cases recorded in 2019 alone. Official data on dengue case counts is delivered incrementally and, for many reasons, often subject to delays of weeks. In contrast, data on dengue-related Google searches and Twitter messages is available in full with no delay. Here, we describe a model which uses online data to deliver improved weekly estimates of dengue incidence in Rio de Janeiro. We address a key shortcoming of previous online data disease surveillance models by explicitly accounting for the incremental delivery of case count data, to ensure that our approach can be used in practice. We also draw on data from Google Trends and Twitter in tandem, and demonstrate that this leads to slightly better estimates than a model using only one of these data streams alone. Our results provide evidence that online data can be used to improve both the accuracy and precision of rapid estimates of disease incidence, even where the underlying case count data is subject to long and varied delays.
Submitted 22 December, 2021;
originally announced December 2021.
-
Scaling Language Models: Methods, Analysis & Insights from Training Gopher
Authors:
Jack W. Rae,
Sebastian Borgeaud,
Trevor Cai,
Katie Millican,
Jordan Hoffmann,
Francis Song,
John Aslanides,
Sarah Henderson,
Roman Ring,
Susannah Young,
Eliza Rutherford,
Tom Hennigan,
Jacob Menick,
Albin Cassirer,
Richard Powell,
George van den Driessche,
Lisa Anne Hendricks,
Maribeth Rauh,
Po-Sen Huang,
Amelia Glaese,
Johannes Welbl,
Sumanth Dathathri,
Saffron Huang,
Jonathan Uesato,
John Mellor
, et al. (55 additional authors not shown)
Abstract:
Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world. In this paper, we present an analysis of Transformer-based language model performance across a wide range of model scales -- from models with tens of millions of parameters up to a 280 billion parameter model called Gopher. These models are evaluated on 152 diverse tasks, achieving state-of-the-art performance across the majority. Gains from scale are largest in areas such as reading comprehension, fact-checking, and the identification of toxic language, but logical and mathematical reasoning see less benefit. We provide a holistic analysis of the training dataset and model's behaviour, covering the intersection of model scale with bias and toxicity. Finally we discuss the application of language models to AI safety and the mitigation of downstream harms.
Submitted 21 January, 2022; v1 submitted 8 December, 2021;
originally announced December 2021.
-
Developing a Trusted Human-AI Network for Humanitarian Benefit
Authors:
Susannah Kate Devitt,
Jason Scholz,
Timo Schless,
Larry Lewis
Abstract:
Artificial intelligences (AI) will increasingly participate digitally and physically in conflicts, yet there is a lack of trusted communications with humans for humanitarian purposes. In this paper we consider the integration of a communications protocol (the 'whiteflag protocol'), distributed ledger 'blockchain' technology, and information fusion with AI, to improve conflict communications called 'protected assurance understanding situation and entities' (PAUSE). Such a trusted human-AI communication network could provide accountable information exchange regarding protected entities, critical infrastructure, humanitarian signals and status updates for humans and machines in conflicts. We examine several realistic potential case studies for the integration of these technologies into a trusted human-AI network for humanitarian benefit including mapping a conflict zone with civilians and combatants in real time, preparation to avoid incidents and using the network to manage misinformation. We finish with a real-world example of a PAUSE-like network, the Human Security Information System (HSIS), being developed by USAID, that uses blockchain technology to provide a secure means to better understand the civilian environment.
Submitted 10 March, 2023; v1 submitted 7 December, 2021;
originally announced December 2021.
-
Australia's Approach to AI Governance in Security and Defence
Authors:
Susannah Kate Devitt,
Damian Copeland
Abstract:
Australia is a leading AI nation with strong allies and partnerships. Australia has prioritised the development of robotics, AI, and autonomous systems to develop sovereign capability for the military. Australia commits to Article 36 reviews of all new means and methods of warfare to ensure weapons and weapons systems are operated within acceptable systems of control. Additionally, Australia has undergone significant reviews of the risks of AI to human rights and within intelligence organisations and has committed to producing ethics guidelines and frameworks in Security and Defence. Australia is committed to OECD's values-based principles for the responsible stewardship of trustworthy AI as well as adopting a set of National AI ethics principles. While Australia has not adopted an AI governance framework specifically for the Australian Defence Organisation (ADO), the Defence Science and Technology Group (DSTG) has published the 'A Method for Ethical AI in Defence' (MEAID) technical report, which includes a framework and pragmatic tools for managing ethical and legal risks for military applications of AI. Australia can play a leadership role by integrating legal and ethical considerations into its ADO AI capability acquisition process. This requires a policy framework that defines its legal and ethical requirements, is informed by Defence industry stakeholders, and provides a practical methodology to integrate legal and ethical risk mitigation strategies into the acquisition process.
Submitted 9 March, 2022; v1 submitted 23 November, 2021;
originally announced December 2021.
-
Cognitive factors that affect the adoption of autonomous agriculture
Authors:
S. K. Devitt
Abstract:
Robotic and Autonomous Agricultural Technologies (RAAT) are increasingly available yet may fail to be adopted. This paper focusses specifically on cognitive factors that affect adoption including: inability to generate trust, loss of farming knowledge and reduced social cognition. It is recommended that agriculture develops its own framework for the performance and safety of RAAT drawing on human factors research in aerospace engineering including human inputs (individual variance in knowledge, skills, abilities, preferences, needs and traits), trust, situational awareness and cognitive load. The kinds of cognitive impacts depend on the RAAT's level of autonomy, i.e. whether it has automatic, partially autonomous or autonomous functionality, and stage of adoption, i.e. adoption, initial use or post-adoptive use. The more autonomous a system is, the less a human needs to know to operate it and the less the cognitive load, but it also means farmers have less situational awareness about on-farm activities, which in turn may affect strategic decision-making about their enterprise. Some cognitive factors may be hidden when RAAT is first adopted but play a greater role during prolonged or intense post-adoptive use. Systems with partial autonomy need intuitive user interfaces, engaging system information, and clear signaling to be trusted with low level tasks, and to complement and augment high order decision-making on farm.
Submitted 28 November, 2021;
originally announced November 2021.
-
The Ethics of Biosurveillance
Authors:
S. K. Devitt,
P. W. J. Baxter,
G. Hamilton
Abstract:
Governments must keep agricultural systems free of pests that threaten agricultural production and international trade. Biosecurity surveillance already makes use of a wide range of technologies, such as insect traps and lures, geographic information systems, and diagnostic biochemical tests. The rise of cheap and usable surveillance technologies such as remotely piloted aircraft systems (RPAS) presents value conflicts not addressed in international biosurveillance guidelines. The costs of keeping agriculture pest-free include privacy violations and reduced autonomy for farmers. We argue that physical and digital privacy in the age of ubiquitous aerial and ground surveillance is a natural right to allow people to function freely on their land. Surveillance methods must be co-created and justified through using ethically defensible processes such as discourse theory, value-centred design and responsible innovation to forge a cooperative social contract between diverse stakeholders. We propose an ethical framework for biosurveillance activities that balances the collective benefits for food security with individual privacy: (1) establish the boundaries of a biosurveillance social contract; (2) justify surveillance operations for the farmers, researchers, industry, the public and regulators; (3) give decision makers a reasonable measure of control over their personal and agricultural data; and (4) choose surveillance methodologies that give the appropriate information. The benefits of incorporating an ethical framework for responsible biosurveillance innovation include increased participation and accumulated trust over time. Long term trust and cooperation will support food security, producing higher quality data overall and mitigating against anticipated information gaps that may emerge due to disrespecting landholder rights.
Submitted 23 November, 2021;
originally announced November 2021.
-
Normative Epistemology for Lethal Autonomous Weapons Systems
Authors:
Susannah Kate Devitt
Abstract:
The rise of human-information systems, cybernetic systems, and increasingly autonomous systems requires the application of epistemic frameworks to machines and human-machine teams. This chapter discusses higher-order design principles to guide the design, evaluation, deployment, and iteration of Lethal Autonomous Weapons Systems (LAWS) based on epistemic models. Epistemology is the study of knowledge. Epistemic models consider the role of accuracy, likelihoods, beliefs, competencies, capabilities, context, and luck in the justification of actions and the attribution of knowledge. The aim is not to provide ethical justification for or against LAWS, but to illustrate how epistemological frameworks can be used in conjunction with moral apparatus to guide the design and deployment of future systems. The models discussed in this chapter aim to make Article 36 reviews of LAWS systematic, expedient, and evaluable. A Bayesian virtue epistemology is proposed to enable justified actions under uncertainty that meet the requirements of the Laws of Armed Conflict and International Humanitarian Law. Epistemic concepts can provide some of the apparatus to meet explainability and transparency requirements in the development, evaluation, deployment, and review of ethical AI.
Submitted 25 October, 2021;
originally announced October 2021.
-
Trust and Safety
Authors:
S. K. Devitt,
R. Horne,
Z. Assaad,
E. Broad,
H. Kurniawati,
B. Cardier,
A. Scott,
S. Lazar,
M. Gould,
C. Adamson,
C. Karl,
F. Schrever,
S. Keay,
K. Tranter,
E. Shellshear,
D. Hunter,
M. Brady,
T. Putland
Abstract:
Robotics in Australia has a long history of conforming with safety standards and risk managed practices. This chapter articulates the current state of trust and safety in robotics including society's expectations, safety management systems and system safety as well as emerging issues and methods for ensuring safety in increasingly autonomous robotics. The future of trust and safety will combine standards with iterative, adaptive and responsive regulatory and assurance methods for diverse applications of robotics, autonomous systems and artificial intelligence (RAS-AI). Robotics will need novel technical and social approaches to achieve assurance, particularly for game-changing innovations. The ability for users to easily update algorithms and software, which alters the performance of a system, implies that traditional machine assurance performed prior to deployment or sale will no longer be viable. Moreover, the high frequency of updates implies that traditional certification that requires substantial time will no longer be practical. To alleviate these difficulties, automation of assurance will likely be needed; something like 'ASsurance-as-a-Service' (ASaaS), where APIs constantly ping RAS-AI to ensure abidance with various rules, frameworks and behavioural expectations. There are exceptions to this, such as in contested or communications denied environments, or in underground or undersea mining; and these systems need their own risk assessments and limitations imposed. Indeed, self-monitors are already operating within some systems. To ensure safe operations of future robotics systems, Australia needs to invest in RAS-AI assurance research, stakeholder engagement and continued development and refinement of robust frameworks, methods, guidelines and policy in order to educate and prepare its technology developers, certifiers, and general population.
Submitted 13 April, 2021;
originally announced April 2021.
-
How to Survive a Learning Management System (LMS) Implementation? A Stakeholder Analysis Approach
Authors:
Ajayi Ekuase-Anwansedo,
Susannah F. Craig,
Jose Noguera
Abstract:
To survive a learning management system (LMS) implementation, an understanding of the needs of the various stakeholders is necessary. The goal of every LMS implementation is to ensure the use of the system by instructors and students to enhance teaching and communication, thereby enhancing learning outcomes of the students. If the teachers and students do not use the system, the system is useless. This research is motivated by the importance of identifying and understanding various stakeholders involved in the LMS implementation process in order to anticipate possible challenges and identify critical success factors essential for the effective implementation and adoption of a new LMS. To this end, we define the term stakeholder. We conducted a stakeholder analysis to identify the key stakeholders in an LMS implementation process. We then analyze their goals and needs, and how they collaborate in the implementation process. The findings of this work will provide institutions of higher learning with an overview of the implementation process and useful insights into the needs of the stakeholders, which will in turn ensure an increase in the level of success achieved when implementing an LMS.
Submitted 21 February, 2021;
originally announced February 2021.
-
AI Ethics Needs Good Data
Authors:
Angela Daly,
S Kate Devitt,
Monique Mann
Abstract:
In this chapter we argue that discourses on AI must transcend the language of 'ethics' and engage with power and political economy in order to constitute 'Good Data'. In particular, we must move beyond the depoliticised language of 'ethics' currently deployed (Wagner 2018) in determining whether AI is 'good' given the limitations of ethics as a frame through which AI issues can be viewed. In order to circumvent these limits, we use instead the language and conceptualisation of 'Good Data', as a more expansive term to elucidate the values, rights and interests at stake when it comes to AI's development and deployment, as well as that of other digital technologies. Good Data considerations move beyond recurring themes of data protection/privacy and the FAT (fairness, transparency and accountability) movement to include explicit political economy critiques of power. Instead of yet more ethics principles (that tend to say the same or similar things anyway), we offer four 'pillars' on which Good Data AI can be built: community, rights, usability and politics. Overall we view AI's 'goodness' as an explicitly political (economy) question of power and one which is always related to the degree to which AI is created and used to increase the wellbeing of society and especially to increase the power of the most marginalized and disenfranchised. We offer recommendations and remedies towards implementing 'better' approaches towards AI. Our strategies enable a different (but complementary) kind of evaluation of AI as part of the broader socio-technical systems in which AI is built and deployed.
Submitted 14 February, 2021;
originally announced February 2021.
-
A Bayesian social platform for inclusive and evidence-based decision making
Authors:
Susannah Kate Devitt,
Tamara Rose Pearce,
Alok Kumar Chowdhury,
Kerrie Mengersen
Abstract:
Against the backdrop of a social media reckoning, this paper seeks to demonstrate the potential of social tools to build virtuous behaviours online. We must assume that human behaviour is flawed, the truth can be elusive, and as communities we must commit to mechanisms to encourage virtuous social digital behaviours. Societies that use social platforms should be inclusive, responsive to evidence, limit punitive actions and allow productive discord and respectful disagreement. Social media success, we argue, is in the hypothesis. Documents are valuable to the degree that they are evidence in service of, or to challenge an idea for a purpose. We outline how a Bayesian social platform can facilitate virtuous behaviours to build evidence-based collective rationality. The chapter outlines the epistemic architecture of the platform's algorithms and user interface in conjunction with explicit community management to ensure psychological safety. The BetterBeliefs platform rewards users who demonstrate epistemically virtuous behaviours and exports evidence-based propositions for decision-making. A Bayesian social network can make virtuous ideas powerful.
Submitted 13 February, 2021;
originally announced February 2021.
-
Mind the Gap: Assessing Temporal Generalization in Neural Language Models
Authors:
Angeliki Lazaridou,
Adhiguna Kuncoro,
Elena Gribovskaya,
Devang Agrawal,
Adam Liska,
Tayfun Terzi,
Mai Gimenez,
Cyprien de Masson d'Autume,
Tomas Kocisky,
Sebastian Ruder,
Dani Yogatama,
Kris Cao,
Susannah Young,
Phil Blunsom
Abstract:
Our world is open-ended, non-stationary, and constantly evolving; thus what we talk about and how we talk about it change over time. This inherent dynamic nature of language contrasts with the current static language modelling paradigm, which trains and evaluates models on utterances from overlapping time periods. Despite impressive recent progress, we demonstrate that Transformer-XL language models perform worse in the realistic setup of predicting future utterances from beyond their training period, and that model performance becomes increasingly worse with time. We find that, while increasing model size alone -- a key driver behind recent progress -- does not solve this problem, having models that continually update their knowledge with new information can indeed mitigate this performance degradation over time. Hence, given the compilation of ever-larger language modelling datasets, combined with the growing list of language-model-based NLP applications that require up-to-date factual knowledge about the world, we argue that now is the right time to rethink the static way in which we currently train and evaluate our language models, and develop adaptive language models that can remain up-to-date with respect to our ever-changing and non-stationary world. We publicly release our dynamic, streaming language modelling benchmarks for WMT and arXiv to facilitate language model evaluation that takes temporal dynamics into account.
Submitted 26 October, 2021; v1 submitted 3 February, 2021;
originally announced February 2021.
-
Investigating the Relationship between Multi-Party Linguistic Entrainment, Team Characteristics, and the Perception of Team Social Outcomes
Authors:
Mingzhi Yu,
Diane Litman,
Susannah Paletz
Abstract:
Multi-party linguistic entrainment refers to the phenomenon that speakers tend to speak more similarly during conversation. We first developed new measures of multi-party entrainment on features describing linguistic style, and then examined the relationship between entrainment and team characteristics in terms of gender composition, team size, and diversity. Next, we predicted the perception of team social outcomes using multi-party linguistic entrainment and team characteristics with a hierarchical regression model. We found that teams with greater gender diversity had higher minimum convergence than teams with less gender diversity. Entrainment contributed significantly to predicting perceived team social outcomes both alone and controlling for team characteristics.
Submitted 2 September, 2019;
originally announced September 2019.