Learning Interpretable Models of Aircraft Handling Behaviour by Reinforcement Learning from Human Feedback

Bewley, Tom; Lawry, Jonathan; Richards, Arthur

Computer Science > Artificial Intelligence

arXiv:2305.16924 (cs)

[Submitted on 26 May 2023]

Abstract: We propose a method to capture the handling abilities of fast jet pilots in a software model via reinforcement learning (RL) from human preference feedback. We use pairwise preferences over simulated flight trajectories to learn an interpretable rule-based model called a reward tree, which enables the automated scoring of trajectories alongside an explanatory rationale. We train an RL agent to execute high-quality handling behaviour by using the reward tree as the objective, and thereby generate data for iterative preference collection and further refinement of both tree and agent. Experiments with synthetic preferences show reward trees to be competitive with uninterpretable neural network reward models on quantitative and qualitative evaluations.

Comments:	arXiv admin note: substantial text overlap with arXiv:2210.01007
Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:2305.16924 [cs.AI]
	(or arXiv:2305.16924v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2305.16924

Submission history

From: Tom Bewley
[v1] Fri, 26 May 2023 13:37:59 UTC (5,590 KB)
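
Illustrative sketch (not from the paper): the abstract describes turning pairwise preferences over trajectories into automated trajectory scores. The listing does not say which preference model the authors use, so the example below swaps in the widely used Bradley-Terry (logistic) model as an assumption, and it does not attempt to build a reward tree. The function name `fit_bradley_terry`, its hyperparameters, and the toy preference data are all hypothetical.

```python
import math


def fit_bradley_terry(n_items, preferences, lr=0.1, steps=2000, l2=0.01):
    """Fit one scalar score per trajectory from (winner, loser) index pairs.

    Assumption: preferences follow a Bradley-Terry model, i.e.
    P(i preferred over j) = sigmoid(score[i] - score[j]).
    Scores are found by gradient ascent on the L2-regularised log-likelihood.
    """
    scores = [0.0] * n_items
    for _ in range(steps):
        # Gradient of the L2 penalty, which keeps scores finite
        grads = [-l2 * s for s in scores]
        for winner, loser in preferences:
            # Model probability that the observed winner is preferred
            p = 1.0 / (1.0 + math.exp(scores[loser] - scores[winner]))
            # d/ds log(sigmoid(s_w - s_l)) is (1 - p) for s_w and -(1 - p) for s_l
            grads[winner] += 1.0 - p
            grads[loser] -= 1.0 - p
        scores = [s + lr * g for s, g in zip(scores, grads)]
    return scores


# Hypothetical preferences over three simulated trajectories (indices 0, 1, 2):
# each pair is (preferred trajectory, rejected trajectory).
preferences = [(0, 1), (0, 2), (1, 2), (0, 1), (1, 2)]
scores = fit_bradley_terry(3, preferences)

# Rank trajectories from most to least preferred according to the fitted scores
ranking = sorted(range(3), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In a pipeline like the one the abstract outlines, per-trajectory scores of this kind would be one possible training signal for an interpretable model over trajectory features; how the reward tree itself is grown is specified in the paper, not here.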
