Fuzzy Logic Guided Reward Function Variation: An Oracle for Testing Reinforcement Learning Programs

Zhang, Shiyu; Song, Haoyang; Wang, Qixin; Pei, Yu

arXiv:2406.19812 (cs)

[Submitted on 28 Jun 2024]

Abstract: Reinforcement Learning (RL) has gained significant attention across various domains. However, the increasing complexity of RL programs presents testing challenges, particularly the oracle problem: defining the correctness of the RL program. Conventional human oracles struggle to cope with the complexity, leading to inefficiencies and potential unreliability in RL testing. To alleviate this problem, we propose an automated oracle approach that leverages RL properties using fuzzy logic. Our oracle quantifies an agent's behavioral compliance with reward policies and analyzes its trend over training episodes. It labels an RL program as "Buggy" if the compliance trend violates expectations derived from RL characteristics. We evaluate our oracle on RL programs with varying complexities and compare it with human oracles. Results show that while human oracles perform well in simpler testing scenarios, our fuzzy oracle demonstrates superior performance in complex environments. The proposed approach shows promise in addressing the oracle problem for RL testing, particularly in complex cases where manual testing falls short. It offers a potential solution to improve the efficiency, reliability, and scalability of RL program testing. This research takes a step towards automated testing of RL programs and highlights the potential of fuzzy logic-based oracles in tackling the oracle problem.
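
The abstract describes the oracle only at a high level: score behavioral compliance, track its trend over training episodes, and label the program "Buggy" when the trend violates expectations. The Python sketch below is one possible reading of that idea and is not the authors' implementation. Everything specific in it is an assumption made for illustration: the ramp-shaped membership function, the use of per-step rewards as the compliance signal, the least-squares slope as the trend measure, the expectation that compliance should rise during training, the `min_slope` threshold, and the "Pass" label (the abstract names only "Buggy").

```python
from statistics import mean


def ramp_membership(x, low, high):
    """Fuzzy degree in [0, 1]: 0 at or below `low`, 1 at or above `high`,
    linear in between. The ramp shape is an illustrative assumption."""
    if high <= low:
        raise ValueError("high must exceed low")
    return min(1.0, max(0.0, (x - low) / (high - low)))


def episode_compliance(step_rewards, low=-1.0, high=1.0):
    """Average fuzzy compliance degree over the steps of one episode.
    Using raw step rewards as the signal is an assumption of this sketch."""
    return mean(ramp_membership(r, low, high) for r in step_rewards)


def trend_slope(values):
    """Least-squares slope of `values` against their episode index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two episodes")
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def label_program(episodes, min_slope=0.0):
    """Label a training run from its per-episode step rewards.
    Assumed expectation: compliance should increase over training, so a
    slope at or below `min_slope` (hypothetical threshold) is flagged."""
    compliance = [episode_compliance(ep) for ep in episodes]
    return "Buggy" if trend_slope(compliance) <= min_slope else "Pass"
```

For example, under these assumptions `label_program([[-1, -1], [0, 0], [1, 1]])` sees compliance rise across the three episodes and does not flag the run, whereas a run whose compliance stays flat or declines is labeled "Buggy".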

Comments:	10 pages, 5 figures
Subjects:	Software Engineering (cs.SE); Artificial Intelligence (cs.AI)
MSC classes:	68T05, 68T27, 93C42
ACM classes:	D.2.5; I.2.3
Cite as:	arXiv:2406.19812 [cs.SE]
	(or arXiv:2406.19812v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2406.19812

Submission history

From: Shiyu Zhang
[v1] Fri, 28 Jun 2024 10:41:17 UTC (575 KB)
