The reflection complexity of sequences over finite alphabets

Authors: Jean-Paul Allouche, John M. Campbell, Jeffrey Shallit, Manon Stipulanti

Abstract: In combinatorics on words, the well-studied factor complexity function $ρ_{\bf x}$ of a sequence ${\bf x}$ over a finite alphabet counts, for any nonnegative integer $n$, the number of distinct length-$n$ factors of ${\bf x}$. In this paper, we introduce the \emph{reflection complexity} function $r_{\bf x}$ to enumerate the factors occurring in a sequence ${\bf x}$, up to reversing the order of symbols in a word. We introduce and prove results on $r_{\bf x}$ regarding its growth properties and relationship with other complexity functions. We prove that if ${\bf x}$ is $k$-automatic, then $r_{\bf x}$ is computably $k$-regular, and we use the software {\tt Walnut} to evaluate the reflection complexity of automatic sequences, such as the Thue--Morse sequence. We prove a Morse--Hedlund-type result characterizing eventually periodic sequences in terms of their reflection complexity, and we deduce a characterization of Sturmian sequences. Furthermore, we investigate the reflection complexity of episturmian, $(s+1)$-dimensional billiard, and Rote sequences. There are still many unanswered questions about this measure.

Submitted 13 June, 2024; originally announced June 2024.

MSC Class: 05A05; 11B85; 68R15
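
The two counting functions defined in the abstract above can be illustrated with a minimal sketch. It is not the paper's method (the authors use Walnut on the automaton itself); it only estimates both functions by brute force on a finite prefix of the Thue--Morse sequence, assuming the prefix is long enough to contain every short factor. All function names here are hypothetical.

```python
def thue_morse_prefix(length):
    """Return the first `length` symbols of the Thue-Morse sequence as a string.

    Symbol i is the parity of the number of 1 bits in the binary expansion of i.
    """
    return "".join(str(bin(i).count("1") % 2) for i in range(length))


def factors(word, n):
    """Set of distinct length-n factors (contiguous substrings) of `word`."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def factor_complexity(word, n):
    """Number of distinct length-n factors seen in the finite word."""
    return len(factors(word, n))


def reflection_complexity(word, n):
    """Number of distinct length-n factors, identifying each factor with its reversal.

    Each class {w, reversed(w)} is represented by its lexicographically smaller member.
    """
    return len({min(w, w[::-1]) for w in factors(word, n)})


if __name__ == "__main__":
    # Assumption: 1024 symbols suffice to see every factor of length <= 5.
    prefix = thue_morse_prefix(1024)
    for n in range(1, 6):
        print(n, factor_complexity(prefix, n), reflection_complexity(prefix, n))
```

A finite prefix can only undercount, so values computed this way are lower bounds on the true complexities unless the prefix is known to contain all factors of the given length.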

arXiv:2406.01377 [pdf, other]

Multi-Agent Transfer Learning via Temporal Contrastive Learning

Authors: Weihao Zeng, Joseph Campbell, Simon Stepputtis, Katia Sycara

Abstract: This paper introduces a novel transfer learning framework for deep multi-agent reinforcement learning. The approach automatically combines goal-conditioned policies with temporal contrastive learning to discover meaningful sub-goals. The approach involves pre-training a goal-conditioned agent, finetuning it on the target domain, and using contrastive learning to construct a planning graph that guides the agent via sub-goals. Experiments on multi-agent coordination Overcooked tasks demonstrate improved sample efficiency, the ability to solve sparse-reward and long-horizon problems, and enhanced interpretability compared to baselines. The results highlight the effectiveness of integrating goal-conditioned policies with unsupervised temporal abstraction learning for complex multi-agent transfer learning. Compared to state-of-the-art baselines, our method achieves the same or better performance while requiring only 21.7% of the training samples.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 6 pages, 6 figures

Journal ref: 2024 IEEE International Conference on Robotics and Automation (ICRA) 2024

arXiv:2403.18062 [pdf, other]

ShapeGrasp: Zero-Shot Task-Oriented Grasping with Large Language Models through Geometric Decomposition

Authors: Samuel Li, Sarthak Bhagat, Joseph Campbell, Yaqi Xie, Woojun Kim, Katia Sycara, Simon Stepputtis

Abstract: Task-oriented grasping of unfamiliar objects is a necessary skill for robots in dynamic in-home environments. Inspired by the human capability to grasp such objects through intuition about their shape and structure, we present a novel zero-shot task-oriented grasping method leveraging a geometric decomposition of the target object into simple, convex shapes that we represent in a graph structure, including geometric attributes and spatial relationships. Our approach employs minimal essential information - the object's name and the intended task - to facilitate zero-shot task-oriented grasping. We utilize the commonsense reasoning capabilities of large language models to dynamically assign semantic meaning to each decomposed part and subsequently reason over the utility of each part for the intended task. Through extensive experiments on a real-world robotics platform, we demonstrate that our grasping approach's decomposition and reasoning pipeline is capable of selecting the correct part in 92% of the cases and successfully grasping the object in 82% of the tasks we evaluate. Additional videos, experiments, code, and data are available on our project website: https://shapegrasp.github.io/.

Submitted 26 March, 2024; originally announced March 2024.

Comments: 8 pages

arXiv:2403.12033 [pdf, other]

HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation

Authors: Ce Zhang, Simon Stepputtis, Joseph Campbell, Katia Sycara, Yaqi Xie

Abstract: Being able to understand visual scenes is a precursor for many downstream tasks, including autonomous driving, robotics, and other vision-based approaches. A common approach enabling the ability to reason over visual data is Scene Graph Generation (SGG); however, many existing approaches assume undisturbed vision, i.e., the absence of real-world corruptions such as fog, snow, smoke, as well as non-uniform perturbations like sun glare or water drops. In this work, we propose a novel SGG benchmark containing procedurally generated weather corruptions and other transformations over the Visual Genome dataset. Further, we introduce a corresponding approach, Hierarchical Knowledge Enhanced Robust Scene Graph Generation (HiKER-SGG), providing a strong baseline for scene graph generation under such challenging settings. At its core, HiKER-SGG utilizes a hierarchical knowledge graph in order to refine its predictions from coarse initial estimates to detailed predictions. In our extensive experiments, we show that HiKER-SGG not only demonstrates superior performance on corrupted images in a zero-shot manner, but also outperforms current state-of-the-art methods on uncorrupted SGG tasks. Code is available at https://github.com/zhangce01/HiKER-SGG.

Submitted 18 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR 2024. Project page: https://zhangce01.github.io/HiKER-SGG

arXiv:2402.18650 [pdf, other]

The Grasp Reset Mechanism: An Automated Apparatus for Conducting Grasping Trials

Authors: Kyle DuFrene, Keegan Nave, Joshua Campbell, Ravi Balasubramanian, Cindy Grimm

Abstract: Advancing robotic grasping and manipulation requires the ability to test algorithms and/or train learning models on large numbers of grasps. Towards the goal of more advanced grasping, we present the Grasp Reset Mechanism (GRM), a fully automated apparatus for conducting large-scale grasping trials. The GRM automates the process of resetting a grasping environment, repeatably placing an object in a fixed location and controllable 1-D orientation. It also collects data and swaps between multiple objects enabling robust dataset collection with no human intervention. We also present a standardized state machine interface for control, which allows for integration of most manipulators with minimal effort. In addition to the physical design and corresponding software, we include a dataset of 1,020 grasps. The grasps were created with a Kinova Gen3 robot arm and Robotiq 2F-85 Adaptive Gripper to enable training of learning models and to demonstrate the capabilities of the GRM. The dataset includes ranges of grasps conducted across four objects and a variety of orientations. Manipulator states, object pose, video, and grasp success data are provided for every trial.

Submitted 28 February, 2024; originally announced February 2024.

Comments: Accepted to the 2024 IEEE International Conference on Robotics and Automation (ICRA2024)

arXiv:2312.06697 [pdf]

doi 10.1016/j.jpi.2023.100348

Performance of externally validated machine learning models based on histopathology images for the diagnosis, classification, prognosis, or treatment outcome prediction in female breast cancer: A systematic review

Authors: Ricardo Gonzalez, Peyman Nejat, Ashirbani Saha, Clinton J. V. Campbell, Andrew P. Norgan, Cynthia Lokker

Abstract: Numerous machine learning (ML) models have been developed for breast cancer using various types of data. Successful external validation (EV) of ML models is important evidence of their generalizability. The aim of this systematic review was to assess the performance of externally validated ML models based on histopathology images for diagnosis, classification, prognosis, or treatment outcome prediction in female breast cancer. A systematic search of MEDLINE, EMBASE, CINAHL, IEEE, MICCAI, and SPIE conferences was performed for studies published between January 2010 and February 2022. The Prediction Model Risk of Bias Assessment Tool (PROBAST) was employed, and the results were narratively described. Of the 2011 non-duplicated citations, 8 journal articles and 2 conference proceedings met inclusion criteria. Three studies externally validated ML models for diagnosis, 4 for classification, 2 for prognosis, and 1 for both classification and prognosis. Most studies used Convolutional Neural Networks and one used logistic regression algorithms. For diagnostic/classification models, the most common performance metrics reported in the EV were accuracy and area under the curve, which were greater than 87% and 90%, respectively, using pathologists' annotations as ground truth. The hazard ratios in the EV of prognostic ML models were between 1.7 (95% CI, 1.2-2.6) and 1.8 (95% CI, 1.3-2.7) to predict distant disease-free survival; 1.91 (95% CI, 1.11-3.29) for recurrence, and between 0.09 (95% CI, 0.01-0.70) and 0.65 (95% CI, 0.43-0.98) for overall survival, using clinical data as ground truth. Despite EV being an important step before the clinical application of a ML model, it hasn't been performed routinely. The large variability in the training/validation datasets, methods, performance metrics, and reported information limited the comparison of the models and the analysis of their results (...)

Submitted 9 December, 2023; originally announced December 2023.

Journal ref: Journal of Pathology Informatics. 2023;15:100348

arXiv:2312.03812 [pdf]

doi 10.1016/j.jpi.2023.100347

Seeing the random forest through the decision trees. Supporting learning health systems from histopathology with machine learning models: Challenges and opportunities

Authors: Ricardo Gonzalez, Ashirbani Saha, Clinton J. V. Campbell, Peyman Nejat, Cynthia Lokker, Andrew P. Norgan

Abstract: This paper discusses some overlooked challenges faced when working with machine learning models for histopathology and presents a novel opportunity to support "Learning Health Systems" with them. Initially, the authors elaborate on these challenges after separating them according to their mitigation strategies: those that need innovative approaches, time, or future technological capabilities and those that require a conceptual reappraisal from a critical perspective. Then, a novel opportunity to support "Learning Health Systems" by integrating hidden information extracted by ML models from digitalized histopathology slides with other healthcare big data is presented.

Submitted 6 December, 2023; originally announced December 2023.

Journal ref: Journal of Pathology Informatics 15 (2024) 100347

arXiv:2312.00192 [pdf, other]

Benchmarking and Enhancing Disentanglement in Concept-Residual Models

Authors: Renos Zabounidis, Ini Oguntola, Konghao Zhao, Joseph Campbell, Simon Stepputtis, Katia Sycara

Abstract: Concept bottleneck models (CBMs) are interpretable models that first predict a set of semantically meaningful features, i.e., concepts, from observations that are subsequently used to condition a downstream task. However, the model's performance strongly depends on the engineered features and can severely suffer from incomplete sets of concepts. Prior works have proposed a side channel -- a residual -- that allows for unconstrained information flow to the downstream task, thus improving model performance but simultaneously introducing information leakage, which is undesirable for interpretability. This work proposes three novel approaches to mitigate information leakage by disentangling concepts and residuals, investigating the critical balance between model performance and interpretability. Through extensive empirical analysis on the CUB, OAI, and CIFAR 100 datasets, we assess the performance of each disentanglement method and provide insights into when they work best. Further, we show how each method impacts the ability to intervene over the concepts and their subsequent impact on task performance.

Submitted 30 November, 2023; originally announced December 2023.

arXiv:2311.18062 [pdf, other]

Understanding Your Agent: Leveraging Large Language Models for Behavior Explanation

Authors: Xijia Zhang, Yue Guo, Simon Stepputtis, Katia Sycara, Joseph Campbell

Abstract: Intelligent agents such as robots are increasingly deployed in real-world, safety-critical settings. It is vital that these agents are able to explain the reasoning behind their decisions to human counterparts; however, their behavior is often produced by uninterpretable models such as deep neural networks. We propose an approach to generate natural language explanations for an agent's behavior based only on observations of states and actions, thus making our method independent from the underlying model's representation. For such models, we first learn a behavior representation and subsequently use it to produce plausible explanations with minimal hallucination while affording user interaction with a pre-trained large language model. We evaluate our method in a multi-agent search-and-rescue environment and demonstrate the effectiveness of our explanations for agents executing various behaviors. Through user studies and empirical experiments, we show that our approach generates explanations as helpful as those produced by a human domain expert while enabling beneficial interactions such as clarification and counterfactual queries.

Submitted 29 November, 2023; originally announced November 2023.

arXiv:2311.15131 [pdf, other]

Localizing Lying in Llama: Understanding Instructed Dishonesty on True-False Questions Through Prompting, Probing, and Patching

Authors: James Campbell, Richard Ren, Phillip Guo

Abstract: Large language models (LLMs) demonstrate significant knowledge through their outputs, though it is often unclear whether false outputs are due to a lack of knowledge or dishonesty. In this paper, we investigate instructed dishonesty, wherein we explicitly prompt LLaMA-2-70b-chat to lie. We perform prompt engineering to find which prompts best induce lying behavior, and then use mechanistic interpretability approaches to localize where in the network this behavior occurs. Using linear probing and activation patching, we localize five layers that appear especially important for lying. We then find just 46 attention heads within these layers that enable us to causally intervene such that the lying model instead answers honestly. We show that these interventions work robustly across many prompts and dataset splits. Overall, our work contributes a greater understanding of dishonesty in LLMs so that we may hope to prevent it.

Submitted 25 November, 2023; originally announced November 2023.

Comments: 14 pages, 12 figures

arXiv:2311.05720 [pdf, other]

Long-Horizon Dialogue Understanding for Role Identification in the Game of Avalon with Large Language Models

Authors: Simon Stepputtis, Joseph Campbell, Yaqi Xie, Zhengyang Qi, Wenxin Sharon Zhang, Ruiyi Wang, Sanketh Rangreji, Michael Lewis, Katia Sycara

Abstract: Deception and persuasion play a critical role in long-horizon dialogues between multiple parties, especially when the interests, goals, and motivations of the participants are not aligned. Such complex tasks pose challenges for current Large Language Models (LLM) as deception and persuasion can easily mislead them, especially in long-horizon multi-party dialogues. To this end, we explore the game of Avalon: The Resistance, a social deduction game in which players must determine each other's hidden identities to complete their team's objective. We introduce an online testbed and a dataset containing 20 carefully collected and labeled games among human players that exhibit long-horizon deception in a cooperative-competitive setting. We discuss the capabilities of LLMs to utilize deceptive long-horizon conversations between six human players to determine each player's goal and motivation. Particularly, we discuss the multimodal integration of the chat between the players and the game's state that grounds the conversation, providing further insights into the true player identities. We find that even current state-of-the-art LLMs do not reach human performance, making our dataset a compelling benchmark to investigate the decision-making and language-processing capabilities of LLMs. Our dataset and online testbed can be found at our project website: https://sstepput.github.io/Avalon-NLU/

Submitted 9 November, 2023; originally announced November 2023.

Comments: Accepted to the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP, Findings of the Association for Computational Linguistics)

arXiv:2310.10701 [pdf, other]

doi 10.18653/v1/2023.emnlp-main.13

Theory of Mind for Multi-Agent Collaboration via Large Language Models

Authors: Huao Li, Yu Quan Chong, Simon Stepputtis, Joseph Campbell, Dana Hughes, Michael Lewis, Katia Sycara

Abstract: While Large Language Models (LLMs) have demonstrated impressive accomplishments in both reasoning and planning, their abilities in multi-agent collaborations remain largely unexplored. This study evaluates LLM-based agents in a multi-agent cooperative text game with Theory of Mind (ToM) inference tasks, comparing their performance with Multi-Agent Reinforcement Learning (MARL) and planning-based baselines. We observed evidence of emergent collaborative behaviors and high-order Theory of Mind capabilities among LLM-based agents. Our results reveal limitations in LLM-based agents' planning optimization due to systematic failures in managing long-horizon contexts and hallucination about the task state. We explore the use of explicit belief state representations to mitigate these issues, finding that it enhances task performance and the accuracy of ToM inferences for LLM-based agents.

Submitted 26 June, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

Comments: Accepted to EMNLP 2023 (Main Conference). Code available at https://github.com/romanlee6/multi_LLM_comm

Journal ref: in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Page 180-192, ACL

arXiv:2310.01405 [pdf, other]

Representation Engineering: A Top-Down Approach to AI Transparency

Authors: Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks

Abstract: In this paper, we identify and characterize the emerging area of representation engineering (RepE), an approach to enhancing the transparency of AI systems that draws on insights from cognitive neuroscience. RepE places population-level representations, rather than neurons or circuits, at the center of analysis, equipping us with novel methods for monitoring and manipulating high-level cognitive phenomena in deep neural networks (DNNs). We provide baselines and an initial analysis of RepE techniques, showing that they offer simple yet effective solutions for improving our understanding and control of large language models. We showcase how these methods can provide traction on a wide range of safety-relevant problems, including honesty, harmlessness, power-seeking, and more, demonstrating the promise of top-down transparency research. We hope that this work catalyzes further exploration of RepE and fosters advancements in the transparency and safety of AI systems.

Submitted 10 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

Comments: Code is available at https://github.com/andyzoujm/representation-engineering

arXiv:2309.10346 [pdf, other]

Explaining Agent Behavior with Large Language Models

Authors: Xijia Zhang, Yue Guo, Simon Stepputtis, Katia Sycara, Joseph Campbell

Abstract: Intelligent agents such as robots are increasingly deployed in real-world, safety-critical settings. It is vital that these agents are able to explain the reasoning behind their decisions to human counterparts; however, their behavior is often produced by uninterpretable models such as deep neural networks. We propose an approach to generate natural language explanations for an agent's behavior based only on observations of states and actions, agnostic to the underlying model representation. We show how a compact representation of the agent's behavior can be learned and used to produce plausible explanations with minimal hallucination while affording user interaction with a pre-trained large language model. Through user studies and empirical experiments, we show that our approach generates explanations as helpful as those generated by a human domain expert while enabling beneficial interactions such as clarification and counterfactual queries.

Submitted 19 September, 2023; originally announced September 2023.

Comments: Human Multi-Robot Interaction Workshop at IROS 2023

arXiv:2309.05943 [pdf, other]

Knowledge-Guided Short-Context Action Anticipation in Human-Centric Videos

Authors: Sarthak Bhagat, Simon Stepputtis, Joseph Campbell, Katia Sycara

Abstract: This work focuses on anticipating long-term human actions, particularly using short video segments, which can speed up editing workflows through improved suggestions while fostering creativity by suggesting narratives. To this end, we imbue a transformer network with a symbolic knowledge graph for action anticipation in video segments by boosting certain aspects of the transformer's attention mechanism at run-time. Demonstrated on two benchmark datasets, Breakfast and 50Salads, our approach outperforms current state-of-the-art methods for long-term action anticipation using short video context by up to 9%.

Submitted 11 September, 2023; originally announced September 2023.

Comments: ICCV 2023 Workshop on AI for Creative Video Editing and Understanding

arXiv:2308.09870 [pdf, other]

Enhancing State Estimation in Robots: A Data-Driven Approach with Differentiable Ensemble Kalman Filters

Authors: Xiao Liu, Geoffrey Clark, Joseph Campbell, Yifan Zhou, Heni Ben Amor

Abstract: This paper introduces a novel state estimation framework for robots using differentiable ensemble Kalman filters (DEnKF). DEnKF is a reformulation of the traditional ensemble Kalman filter that employs stochastic neural networks to model the process noise implicitly. Our work is an extension of previous research on differentiable filters, which has provided a strong foundation for our modular and end-to-end differentiable framework. This framework enables each component of the system to function independently, leading to improved flexibility and versatility in implementation. Through a series of experiments, we demonstrate the flexibility of this model across a diverse set of real-world tracking tasks, including visual odometry and robot manipulation. Moreover, we show that our model effectively handles noisy observations, is robust in the absence of observations, and outperforms state-of-the-art differentiable filters in terms of error metrics. Specifically, we observe a significant improvement of at least 59% in translational error when using DEnKF with noisy observations. Our results underscore the potential of DEnKF in advancing state estimation for robotics. Code for DEnKF is available at https://github.com/ir-lab/DEnKF

Submitted 18 August, 2023; originally announced August 2023.

Comments: 8 pages, 6 figures, 4 tables

arXiv:2307.01158 [pdf, other]

Theory of Mind as Intrinsic Motivation for Multi-Agent Reinforcement Learning

Authors: Ini Oguntola, Joseph Campbell, Simon Stepputtis, Katia Sycara

Abstract: The ability to model the mental states of others is crucial to human social intelligence, and can offer similar benefits to artificial agents with respect to the social dynamics induced in multi-agent settings. We present a method of grounding semantically meaningful, human-interpretable beliefs within policies modeled by deep networks. We then consider the task of 2nd-order belief prediction. We propose that the ability of each agent to predict the beliefs of the other agents can be used as an intrinsic reward signal for multi-agent reinforcement learning. Finally, we present preliminary empirical results in a mixed cooperative-competitive environment.

Submitted 18 July, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

Comments: To appear at ICML 2023 Workshop on Theory of Mind

arXiv:2306.12314 [pdf, other]

Introspective Action Advising for Interpretable Transfer Learning

Authors: Joseph Campbell, Yue Guo, Fiona Xie, Simon Stepputtis, Katia Sycara

Abstract: Transfer learning can be applied in deep reinforcement learning to accelerate the training of a policy in a target task by transferring knowledge from a policy learned in a related source task. This is commonly achieved by copying pretrained weights from the source policy to the target policy prior to training, under the constraint that they use the same model architecture. However, not only does this require a robust representation learned over a wide distribution of states -- often failing to transfer between specialist models trained over single tasks -- but it is largely uninterpretable and provides little indication of what knowledge is transferred. In this work, we propose an alternative approach to transfer learning between tasks based on action advising, in which a teacher trained in a source task actively guides a student's exploration in a target task. Through introspection, the teacher is capable of identifying when advice is beneficial to the student and should be given, and when it is not. Our approach allows knowledge transfer between policies agnostic of the underlying representations, and we empirically show that this leads to improved convergence rates in Gridworld and Atari environments while providing insight into what knowledge is transferred.

Submitted 21 June, 2023; originally announced June 2023.

Comments: Accepted to CoLLAs 2023

arXiv:2306.09482 [pdf, other]

Sample-Efficient Learning of Novel Visual Concepts

Authors: Sarthak Bhagat, Simon Stepputtis, Joseph Campbell, Katia Sycara

Abstract: Despite the advances made in visual object recognition, state-of-the-art deep learning models struggle to effectively recognize novel objects in a few-shot setting where only a limited number of examples are provided. Unlike humans who excel at such tasks, these models often fail to leverage known relationships between entities in order to draw conclusions about such objects. In this work, we show that incorporating a symbolic knowledge graph into a state-of-the-art recognition model enables a new approach for effective few-shot classification. In our proposed neuro-symbolic architecture and training methodology, the knowledge graph is augmented with additional relationships extracted from a small set of examples, improving its ability to recognize novel objects by considering the presence of interconnected entities. Unlike existing few-shot classifiers, we show that this enables our model to incorporate not only objects but also abstract concepts and affordances. The existence of the knowledge graph also makes this approach amenable to interpretability through analysis of the relationships contained within it. We empirically show that our approach outperforms current state-of-the-art few-shot multi-label classification methods on the COCO dataset and evaluate the addition of abstract concepts and affordances on the Visual Genome dataset.

Submitted 15 June, 2023; originally announced June 2023.

arXiv:2305.19097 [pdf, other]

A generalized framework to predict continuous scores from medical ordinal labels

Authors: Katharina V. Hoebel, Andreanne Lemay, John Peter Campbell, Susan Ostmo, Michael F. Chiang, Christopher P. Bridge, Matthew D. Li, Praveer Singh, Aaron S. Coyner, Jayashree Kalpathy-Cramer

Abstract: Many variables of interest in clinical medicine, like disease severity, are recorded using discrete ordinal categories such as normal/mild/moderate/severe. These labels are used to train and evaluate disease severity prediction models. However, ordinal categories represent a simplification of an underlying continuous severity spectrum. Using continuous scores instead of ordinal categories is more sensitive to detecting small changes in disease severity over time. Here, we present a generalized framework that accurately predicts continuously valued variables using only discrete ordinal labels during model development. We found that for three clinical prediction tasks, models that take the ordinal relationship of the training labels into account outperformed conventional multi-class classification models. In particular, the continuous scores generated by ordinal classification and regression models showed a significantly higher correlation with expert rankings of disease severity and lower mean squared errors compared to the multi-class classification models. Furthermore, the use of MC dropout significantly improved the ability of all evaluated deep learning approaches to predict continuously valued scores that truthfully reflect the underlying continuous target variable. We showed that accurate continuously valued predictions can be generated even if the model development only involves discrete ordinal labels. The novel framework has been validated on three different clinical prediction tasks and has proven to bridge the gap between discrete ordinal labels and the underlying continuously valued variables.

Submitted 30 May, 2023; originally announced May 2023.

arXiv:2305.15640 [pdf, other]

Characterizing Out-of-Distribution Error via Optimal Transport

Authors: Yuzhe Lu, Yilong Qin, Runtian Zhai, Andrew Shen, Ketong Chen, Zhenlin Wang, Soheil Kolouri, Simon Stepputtis, Joseph Campbell, Katia Sycara

Abstract: Out-of-distribution (OOD) data poses serious challenges in deployed machine learning models, so methods of predicting a model's performance on OOD data without labels are important for machine learning safety. While a number of methods have been proposed by prior work, they often underestimate the actual error, sometimes by a large margin, which greatly impacts their applicability to real tasks. In this work, we identify pseudo-label shift, or the difference between the predicted and true OOD label distributions, as a key indicator of this underestimation. Based on this observation, we introduce a novel method for estimating model performance by leveraging optimal transport theory, Confidence Optimal Transport (COT), and show that it provably provides more robust error estimates in the presence of pseudo-label shift. Additionally, we introduce an empirically-motivated variant of COT, Confidence Optimal Transport with Thresholding (COTT), which applies thresholding to the individual transport costs and further improves the accuracy of COT's error estimates. We evaluate COT and COTT on a variety of standard benchmarks that induce various types of distribution shift -- synthetic, novel subpopulation, and natural -- and show that our approaches significantly outperform existing state-of-the-art methods with up to 3x lower prediction error.

Submitted 27 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: NeurIPS 2023

arXiv:2302.12232 [pdf, other]

Concept Learning for Interpretable Multi-Agent Reinforcement Learning

Authors: Renos Zabounidis, Joseph Campbell, Simon Stepputtis, Dana Hughes, Katia Sycara

Abstract: Multi-agent robotic systems are increasingly operating in real-world environments in close proximity to humans, yet are largely controlled by policy models with inscrutable deep neural network representations. We introduce a method for incorporating interpretable concepts from a domain expert into models trained through multi-agent reinforcement learning, by requiring the model to first predict such concepts then utilize them for decision making. This allows an expert to both reason about the resulting concept policy models in terms of these high-level concepts at run-time, as well as intervene and correct mispredictions to improve performance. We show that this yields improved interpretability and training stability, with benefits to policy performance and sample efficiency in a simulated and real-world cooperative-competitive multi-agent game.

Submitted 23 February, 2023; originally announced February 2023.

Comments: Accepted to the 6th Conference on Robot Learning (CoRL 2022), Auckland, New Zealand

arXiv:2302.05018 [pdf, other]

Predicting Out-of-Distribution Error with Confidence Optimal Transport

Authors: Yuzhe Lu, Zhenlin Wang, Runtian Zhai, Soheil Kolouri, Joseph Campbell, Katia Sycara

Abstract: Out-of-distribution (OOD) data poses serious challenges in deployed machine learning models as even subtle changes could incur significant performance drops. Being able to estimate a model's performance on test data is important in practice as it indicates when to trust the model's decisions. We present a simple yet effective method to predict a model's performance on an unknown distribution without any additional annotation. Our approach is rooted in Optimal Transport theory, viewing test samples' output softmax scores from deep neural networks as empirical samples from an unknown distribution. We show that our method, Confidence Optimal Transport (COT), provides robust estimates of a model's performance on a target domain. Despite its simplicity, our method achieves state-of-the-art results on three benchmark datasets and outperforms existing methods by a large margin.

Submitted 9 February, 2023; originally announced February 2023.

arXiv:2301.02336 [pdf, other]

doi 10.1145/3568162.3578630

Exploring Levels of Control for a Navigation Assistant for Blind Travelers

Authors: Vinitha Ranganeni, Mike Sinclair, Eyal Ofek, Amos Miller, Jonathan Campbell, Andrey Kolobov, Edward Cutrell

Abstract: Only a small percentage of blind and low-vision people use traditional mobility aids such as a cane or a guide dog. Various assistive technologies have been proposed to address the limitations of traditional mobility aids. These devices often give either the user or the device the majority of the control. In this work, we explore how varying levels of control affect the users' sense of agency, trust in the device, confidence, and successful navigation. We present Glide, a novel mobility aid with two modes for control: Glide-directed and User-directed. We employ Glide in a study (N=9) in which blind or low-vision participants used both modes to navigate through an indoor environment. Overall, participants found that Glide was easy to use and learn. Most participants trusted Glide despite its current limitations, and their confidence and performance increased as they continued to use Glide. Users' control mode preference varied in different situations; no single mode "won" in all situations.

Submitted 5 January, 2023; originally announced January 2023.

Comments: 9 pages, 6 figures, Human-Robot Interaction 2023

arXiv:2212.01507 [pdf, other]

Learning and Blending Robot Hugging Behaviors in Time and Space

Authors: Michael Drolet, Joseph Campbell, Heni Ben Amor

Abstract: We introduce an imitation learning-based physical human-robot interaction algorithm capable of predicting appropriate robot responses in complex interactions involving a superposition of multiple interactions. Our proposed algorithm, Blending Bayesian Interaction Primitives (B-BIP), allows us to achieve responsive interactions in complex hugging scenarios, capable of reciprocating and adapting to a hug's motion and timing. We show that this algorithm is a generalization of prior work, for which the original formulation reduces to the particular case of a single interaction, and evaluate our method through both an extensive user study and empirical experiments. Our algorithm yields significantly better quantitative prediction error and more-favorable participant responses with respect to accuracy, responsiveness, and timing, when compared to existing state-of-the-art methods.

Submitted 2 December, 2022; originally announced December 2022.

arXiv:2211.07882 [pdf, other]

Explainable Action Advising for Multi-Agent Reinforcement Learning

Authors: Yue Guo, Joseph Campbell, Simon Stepputtis, Ruiyu Li, Dana Hughes, Fei Fang, Katia Sycara

Abstract: Action advising is a knowledge transfer technique for reinforcement learning based on the teacher-student paradigm. An expert teacher provides advice to a student during training in order to improve the student's sample efficiency and policy performance. Such advice is commonly given in the form of state-action pairs. However, this form makes the advice difficult for the student to reason with and apply to novel states. We introduce Explainable Action Advising, in which the teacher provides action advice as well as associated explanations indicating why the action was chosen. This allows the student to self-reflect on what it has learned, enabling advice generalization and leading to improved sample efficiency and learning performance - even in environments where the teacher is sub-optimal. We empirically show that our framework is effective in both single-agent and multi-agent scenarios, yielding improved policy returns and convergence rates when compared to state-of-the-art methods.

Submitted 16 June, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

arXiv:2211.06292 [pdf, other]

A New Graph Node Classification Benchmark: Learning Structure from Histology Cell Graphs

Authors: Claudia Vanea, Jonathan Campbell, Omri Dodi, Liis Salumäe, Karen Meir, Drorith Hochner-Celnikier, Hagit Hochner, Triin Laisk, Linda M. Ernst, Cecilia M. Lindgren, Christoffer Nellåker

Abstract: We introduce a new benchmark dataset, Placenta, for node classification in an underexplored domain: predicting microanatomical tissue structures from cell graphs in placenta histology whole slide images. This problem is uniquely challenging for graph learning for a few reasons. Cell graphs are large (>1 million nodes per image), node features are varied (64-dimensions of 11 types of cells), class labels are imbalanced (9 classes ranging from 0.21% of the data to 40.0%), and cellular communities cluster into heterogeneously distributed tissues of widely varying sizes (from 11 nodes to 44,671 nodes for a single structure). Here, we release a dataset consisting of two cell graphs from two placenta histology images totalling 2,395,747 nodes, 799,745 of which have ground truth labels. We present inductive benchmark results for 7 scalable models and show how the unique qualities of cell graphs can help drive the development of novel graph neural network architectures.

Submitted 11 November, 2022; originally announced November 2022.

Comments: Last two authors contributed equally. To be published at New Frontiers in Graph Learning, NeurIPS 2022

ACM Class: I.2.6

arXiv:2204.12658 [pdf, other]

Measuring a Robot Hand's Graspable Region using Power and Precision Grasps

Authors: John Morrow, Joshua Campbell, Nuha Nishat, Ravi Balasubramanian, Cindy Grimm

Abstract: The variety of robotic hand designs and actuation schemes makes it difficult to measure a hand's graspable volume. For end-users, this lack of standardized measurements makes it challenging to determine a priori if a robot hand is the right size for grasping an object. We propose a practical hand measurement standard, based on precision and power grasps, that is applicable to a wide variety of robot hand designs. The resulting measurements can be used to both determine if an object will fit in the hand and characterize the size of an object with respect to the hand. Our measurement procedure uses a functional approach, based on grasping a hypothetical cylinder, that allows the measurer to choose the exact hand orientation and finger configurations that are used for the measurements. This ensures that the measurements are functionally comparable while relying on the human to determine the finger configurations that best match the intended grasp. We demonstrate using our measurement standard with three commercial robot hand designs and objects from the YCB data set.

Submitted 26 April, 2022; originally announced April 2022.

Comments: Significantly changed second version of arXiv:2106.10402. Paper originally submitted to Robotics Automation Letters

arXiv:2202.07562 [pdf, other]

Improving the repeatability of deep learning models with Monte Carlo dropout

Authors: Andreanne Lemay, Katharina Hoebel, Christopher P. Bridge, Brian Befano, Silvia De Sanjosé, Didem Egemen, Ana Cecilia Rodriguez, Mark Schiffman, John Peter Campbell, Jayashree Kalpathy-Cramer

Abstract: The integration of artificial intelligence into clinical workflows requires reliable and robust models. Repeatability is a key attribute of model robustness. Repeatable models output predictions with low variation during independent tests carried out under similar conditions. During model development and evaluation, much attention is given to classification performance while model repeatability is rarely assessed, leading to the development of models that are unusable in clinical practice. In this work, we evaluate the repeatability of four model types (binary classification, multi-class classification, ordinal classification, and regression) on images that were acquired from the same patient during the same visit. We study the performance of binary, multi-class, ordinal, and regression models on four medical image classification tasks from public and private datasets: knee osteoarthritis, cervical cancer screening, breast density estimation, and retinopathy of prematurity. Repeatability is measured and compared on ResNet and DenseNet architectures. Moreover, we assess the impact of sampling Monte Carlo dropout predictions at test time on classification performance and repeatability. Leveraging Monte Carlo predictions significantly increased repeatability for all tasks on the binary, multi-class, and ordinal models leading to an average reduction of the 95% limits of agreement by 16% points and of the disagreement rate by 7% points. The classification accuracy improved in most settings along with the repeatability. Our results suggest that beyond about 20 Monte Carlo iterations, there is no further gain in repeatability. In addition to the higher test-retest agreement, Monte Carlo predictions were better calibrated which leads to output probabilities reflecting more accurately the true likelihood of being correctly classified.

Submitted 15 February, 2022; originally announced February 2022.

Comments: arXiv admin note: text overlap with arXiv:2111.06754

arXiv:2111.06754 [pdf, other]

Monte Carlo dropout increases model repeatability

Authors: Andreanne Lemay, Katharina Hoebel, Christopher P. Bridge, Didem Egemen, Ana Cecilia Rodriguez, Mark Schiffman, John Peter Campbell, Jayashree Kalpathy-Cramer

Abstract: The integration of artificial intelligence into clinical workflows requires reliable and robust models. Among the main features of robustness is repeatability. Much attention is given to classification performance without assessing the model repeatability, leading to the development of models that turn out to be unusable in practice. In this work, we evaluate the repeatability of four model types on images from the same patient that were acquired during the same visit. We study the performance of binary, multi-class, ordinal, and regression models on three medical image analysis tasks: cervical cancer screening, breast density estimation, and retinopathy of prematurity classification. Moreover, we assess the impact of sampling Monte Carlo dropout predictions at test time on classification performance and repeatability. Leveraging Monte Carlo predictions significantly increased repeatability for all tasks on the binary, multi-class, and ordinal models leading to an average reduction of the 95% limits of agreement by 17% points.

Submitted 12 November, 2021; originally announced November 2021.

Comments: Machine Learning for Health (ML4H) at NeurIPS 2021 - Extended Abstract

arXiv:2109.13845 [pdf]

Not Color Blind: AI Predicts Racial Identity from Black and White Retinal Vessel Segmentations

Authors: Aaron S. Coyner, Praveer Singh, James M. Brown, Susan Ostmo, R. V. Paul Chan, Michael F. Chiang, Jayashree Kalpathy-Cramer, J. Peter Campbell

Abstract: Background: Artificial intelligence (AI) may demonstrate racial bias when skin or choroidal pigmentation is present in medical images. Recent studies have shown that convolutional neural networks (CNNs) can predict race from images that were not previously thought to contain race-specific features. We evaluate whether grayscale retinal vessel maps (RVMs) of patients screened for retinopathy of prematurity (ROP) contain race-specific features. Methods: 4095 retinal fundus images (RFIs) were collected from 245 Black and White infants. A U-Net generated RVMs from RFIs, which were subsequently thresholded, binarized, or skeletonized. To determine whether RVM differences between Black and White eyes were physiological, CNNs were trained to predict race from color RFIs, raw RVMs, and thresholded, binarized, or skeletonized RVMs. Area under the precision-recall curve (AUC-PR) was evaluated. Findings: CNNs predicted race from RFIs near perfectly (image-level AUC-PR: 0.999, subject-level AUC-PR: 1.000). Raw RVMs were almost as informative as color RFIs (image-level AUC-PR: 0.938, subject-level AUC-PR: 0.995). Ultimately, CNNs were able to detect whether RFIs or RVMs were from Black or White babies, regardless of whether images contained color, vessel segmentation brightness differences were nullified, or vessel segmentation widths were normalized. Interpretation: AI can detect race from grayscale RVMs that were not thought to contain racial information. Two potential explanations for these findings are that retinal vessels physiologically differ between Black and White babies or that the U-Net segments the retinal vasculature differently for various fundus pigmentations. Either way, the implications remain the same: AI algorithms have potential to demonstrate racial bias in practice, even when preliminary attempts to remove such information from the underlying images appear to be successful.

Submitted 28 September, 2021; originally announced September 2021.

Comments: 31 pages, 6 figures

arXiv:2107.02293 [pdf, other]

Histogram of Cell Types: Deep Learning for Automated Bone Marrow Cytology

Authors: Rohollah Moosavi Tayebi, Youqing Mu, Taher Dehkharghanian, Catherine Ross, Monalisa Sur, Ronan Foley, Hamid R. Tizhoosh, Clinton JV Campbell

Abstract: Bone marrow cytology is required to make a hematological diagnosis, influencing critical clinical decision points in hematology. However, bone marrow cytology is tedious, limited to experienced reference centers and associated with high inter-observer variability. This may lead to a delayed or incorrect diagnosis, leaving an unmet need for innovative supporting technologies. We have developed the first ever end-to-end deep learning-based technology for automated bone marrow cytology. Starting with a bone marrow aspirate digital whole slide image, our technology rapidly and automatically detects suitable regions for cytology, and subsequently identifies and classifies all bone marrow cells in each region. This collective cytomorphological information is captured in a novel representation called Histogram of Cell Types (HCT) quantifying bone marrow cell class probability distribution and acting as a cytological "patient fingerprint". The approach achieves high accuracy in region detection (0.97 accuracy and 0.99 ROC AUC), and cell detection and cell classification (0.75 mAP, 0.78 F1-score, Log-average miss rate of 0.31). HCT has potential to revolutionize hematopathology diagnostic workflows, leading to more cost-effective, accurate diagnosis and opening the door to precision medicine.

Submitted 8 July, 2021; v1 submitted 5 July, 2021; originally announced July 2021.

arXiv:2106.10402 [pdf, other]

Grasping Benchmarks: Normalizing for Object Size & Approximating Hand Workspaces

Authors: John Morrow, Nuha Nishat, Joshua Campbell, Ravi Balasubramanian, Cindy Grimm

Abstract: The varied landscape of robotic hand designs makes it difficult to set a standard for how to measure hand size and to communicate the size of objects it can grasp. Defining consistent workspace measurements would greatly assist scientific communication in robotic grasping research because it would allow researchers to 1) quantitatively communicate an object's relative size to a hand's and 2) approximate a functional subspace of a hand's kinematic workspace in a human-readable way. The goal of this paper is to specify a measurement procedure that quantitatively captures a hand's workspace size for both a precision and power grasp. This measurement procedure uses a functional approach -- based on a generic grasping scenario of a hypothetical object -- in order to make the procedure as generalizable and repeatable as possible, regardless of the actual hand design. This functional approach lets the measurer choose the exact finger configurations and contact points that satisfy the generic grasping scenario, while ensuring that the measurements are functionally comparable. We demonstrate these functional measurements on seven hand configurations. Additional hand measurements and instructions are provided in a GitHub Repository.

Submitted 18 June, 2021; originally announced June 2021.

Comments: Submitted to IROS 2021, waiting for response

arXiv:2011.07005 [pdf, other]

Learning Predictive Models for Ergonomic Control of Prosthetic Devices

Authors: Geoffrey Clark, Joseph Campbell, Heni Ben Amor

Abstract: We present Model-Predictive Interaction Primitives -- a robot learning framework for assistive motion in human-machine collaboration tasks which explicitly accounts for biomechanical impact on the human musculoskeletal system. First, we extend Interaction Primitives to enable predictive biomechanics: the prediction of future biomechanical states of a human partner conditioned on current observations and intended robot control signals. In turn, we leverage this capability within a model-predictive control strategy to identify the future ergonomic and biomechanical ramifications of potential robot actions. Optimal control trajectories are selected so as to minimize future physical impact on the human musculoskeletal system. We empirically demonstrate that our approach minimizes knee or muscle forces via generated control actions selected according to biomechanical cost functions. Experiments are performed in synthetic and real-world experiments involving powered prosthetic devices.

Submitted 13 November, 2020; originally announced November 2020.

Comments: Accepted to CoRL 2020. Accompanying video presentation: https://www.youtube.com/watch?v=DxQPF3VwuoA&feature=youtu.be

arXiv:2010.12083 [pdf, other]

Language-Conditioned Imitation Learning for Robot Manipulation Tasks

Authors: Simon Stepputtis, Joseph Campbell, Mariano Phielipp, Stefan Lee, Chitta Baral, Heni Ben Amor

Abstract: Imitation learning is a popular approach for teaching motor skills to robots. However, most approaches focus on extracting policy parameters from execution traces alone (i.e., motion trajectories and perceptual data). No adequate communication channel exists between the human expert and the robot to describe critical aspects of the task, such as the properties of the target object or the intended shape of the motion. Motivated by insights into the human teaching process, we introduce a method for incorporating unstructured natural language into imitation learning. At training time, the expert can provide demonstrations along with verbal descriptions in order to describe the underlying intent (e.g., "go to the large green bowl"). The training process then interrelates these two modalities to encode the correlations between language, perception, and motion. The resulting language-conditioned visuomotor policies can be conditioned at runtime on new human commands and instructions, which allows for more fine-grained control over the trained policies while also reducing situational ambiguity. We demonstrate in a set of simulation experiments how our approach can learn language-conditioned manipulation policies for a seven-degree-of-freedom robot arm and compare the results to a variety of alternative methods.

Submitted 22 October, 2020; originally announced October 2020.

Comments: Accepted to the 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada as spotlight presentation

arXiv:2010.05123 [pdf, other]

Towards Hardware-Agnostic Gaze-Trackers

Authors: Jatin Sharma, Jon Campbell, Pete Ansell, Jay Beavers, Christopher O'Dowd

Abstract: Gaze-tracking is a novel way of interacting with computers which allows new scenarios, such as enabling people with motor-neuron disabilities to control their computers or doctors to interact with patient information without touching screen or keyboard. Further, there are emerging applications of gaze-tracking in interactive gaming, user experience research, human attention analysis and behavioral studies. Accurate estimation of the gaze may involve accounting for head-pose, head-position, eye rotation, distance from the object as well as operating conditions such as illumination, occlusion, background noise and various biological aspects of the user. Commercially available gaze-trackers utilize specialized sensor assemblies that usually consist of an infrared light source and camera. There are several challenges in the universal proliferation of gaze-tracking as accessibility technologies, specifically its affordability, reliability, and ease-of-use. In this paper, we try to address these challenges through the development of a hardware-agnostic gaze-tracker. We present a deep neural network architecture as an appearance-based method for constrained gaze-tracking that utilizes facial imagery captured on an ordinary RGB camera ubiquitous in all modern computing devices. Our system achieved an error of 1.8073cm on GazeCapture dataset without any calibration or device specific fine-tuning. This research shows promise that one day soon any computer, tablet, or phone will be controllable using just your eyes due to the prediction capabilities of deep neural networks.

Submitted 10 October, 2020; originally announced October 2020.

arXiv:2005.13139 [pdf, other]

Predictive Modeling of Periodic Behavior for Human-Robot Symbiotic Walking

Authors: Geoffrey Clark, Joseph Campbell, Seyed Mostafa Rezayat Sorkhabadi, Wenlong Zhang, Heni Ben Amor

Abstract: We propose in this paper Periodic Interaction Primitives - a probabilistic framework that can be used to learn compact models of periodic behavior. Our approach extends existing formulations of Interaction Primitives to periodic movement regimes, i.e., walking. We show that this model is particularly well-suited for learning data-driven, customized models of human walking, which can then be used for generating predictions over future states or for inferring latent, biomechanical variables. We also demonstrate how the same framework can be used to learn controllers for a robotic prosthesis using an imitation learning approach. Results in experiments with human participants indicate that Periodic Interaction Primitives efficiently generate predictions and ankle angle control signals for a robotic prosthetic ankle, with MAE of 2.21 degrees in 0.0008s per inference. Performance degrades gracefully in the presence of noise or sensor fall outs. Compared to alternatives, this algorithm functions 20 times faster and performed 4.5 times more accurately on test subjects.

Submitted 26 May, 2020; originally announced May 2020.

Comments: Accepted to ICRA 2020. Accompanying video presentation: https://www.youtube.com/watch?v=EjSVjueePyQ&t=1s

arXiv:2005.12508 [pdf, other]

Learning Whole-Body Human-Robot Haptic Interaction in Social Contexts

Authors: Joseph Campbell, Katsu Yamane

Abstract: This paper presents a learning-from-demonstration (LfD) framework for teaching human-robot social interactions that involve whole-body haptic interaction, i.e. direct human-robot contact over the full robot body. The performance of existing LfD frameworks suffers in such interactions due to the high dimensionality and spatiotemporal sparsity of the demonstration data. We show that by leveraging this sparsity, we can reduce the data dimensionality without incurring a significant accuracy penalty, and introduce three strategies for doing so. By combining these techniques with an LfD framework for learning multimodal human-robot interactions, we can model the spatiotemporal relationship between the tactile and kinesthetic information during whole-body haptic interactions. Using a teleoperated bimanual robot equipped with 61 force sensors, we experimentally demonstrate that a model trained with 121 sample hugs from 4 participants generalizes well to unseen inputs and human partners.

Submitted 25 May, 2020; originally announced May 2020.

Comments: Accepted to ICRA 2020

arXiv:1911.11744 [pdf, other]

Imitation Learning of Robot Policies by Combining Language, Vision and Demonstration

Authors: Simon Stepputtis, Joseph Campbell, Mariano Phielipp, Chitta Baral, Heni Ben Amor

Abstract: In this work we propose a novel end-to-end imitation learning approach which combines natural language, vision, and motion information to produce an abstract representation of a task, which in turn is used to synthesize specific motion controllers at run-time. This multimodal approach enables generalization to a wide variety of environmental conditions and allows an end-user to direct a robot policy through verbal communication. We empirically validate our approach with an extensive set of simulations and show that it achieves a high task success rate over a variety of conditions while remaining amenable to probabilistic interpretability.

Submitted 26 November, 2019; originally announced November 2019.

Comments: Accepted to the NeurIPS 2019 Workshop on Robot Learning: Control and Interaction in the Real World, Vancouver, Canada

arXiv:1911.08736 [pdf]

Pan-Cancer Diagnostic Consensus Through Searching Archival Histopathology Images Using Artificial Intelligence

Authors: Shivam Kalra, H. R. Tizhoosh, Sultaan Shah, Charles Choi, Savvas Damaskinos, Amir Safarpoor, Sobhan Shafiei, Morteza Babaie, Phedias Diamandis, Clinton JV Campbell, Liron Pantanowitz

Abstract: The emergence of digital pathology has opened new horizons for histopathology and cytology. Artificial-intelligence algorithms are able to operate on digitized slides to assist pathologists with diagnostic tasks. Whereas machine learning involving classification and segmentation methods have obvious benefits for image analysis in pathology, image search represents a fundamental shift in computational pathology. Matching the pathology of new patients with already diagnosed and curated cases offers pathologist a novel approach to improve diagnostic accuracy through visual inspection of similar cases and computational majority vote for consensus building. In this study, we report the results from searching the largest public repository (The Cancer Genome Atlas [TCGA] program by National Cancer Institute, USA) of whole slide images from almost 11,000 patients depicting different types of malignancies. For the first time, we successfully indexed and searched almost 30,000 high-resolution digitized slides constituting 16 terabytes of data comprised of 20 million 1000x1000 pixels image patches. The TCGA image database covers 25 anatomic sites and contains 32 cancer subtypes. High-performance storage and GPU power were employed for experimentation. The results were assessed with conservative "majority voting" to build consensus for subtype diagnosis through vertical search and demonstrated high accuracy values for both frozen sections slides (e.g., bladder urothelial carcinoma 93%, kidney renal clear cell carcinoma 97%, and ovarian serous cystadenocarcinoma 99%) and permanent histopathology slides (e.g., prostate adenocarcinoma 98%, skin cutaneous melanoma 99%, and thymoma 100%). The key finding of this validation study was that computational consensus appears to be possible for rendering diagnoses if a sufficiently large number of searchable cases are available for each cancer subtype.

Submitted 20 November, 2019; originally announced November 2019.

arXiv:1909.07471 [pdf, other]

Multimodal Dataset of Human-Robot Hugging Interaction

Authors: Kunal Bagewadi, Joseph Campbell, Heni Ben Amor

Abstract: A hug is a tight embrace and an expression of warmth, sympathy and camaraderie. Despite the fact that a hug often only takes a few seconds, it is filled with details and nuances and is a highly complex process of coordination between two agents. For human-robot collaborative tasks, it is necessary for humans to develop trust and see the robot as a partner to perform a given task together. Datasets representing agent-agent interaction are scarce and, if available, of limited quality. To study the underlying phenomena and variations in a hug between a person and a robot, we deployed Baxter humanoid robot and wearable sensors on persons to record 353 episodes of hugging activity. 33 people were given minimal instructions to hug the humanoid robot for as natural hugging interaction as possible. In the paper, we present our methodology and analysis of the collected dataset. The use of this dataset is to implement machine learning methods for the humanoid robot to learn to anticipate and react to the movements of a person approaching for a hug. In this regard, we show the significance of the dataset by highlighting certain features in our dataset.

Submitted 16 September, 2019; originally announced September 2019.

Report number: AI-HRI/2019/09

arXiv:1908.05552 [pdf, other]

Learning Interactive Behaviors for Musculoskeletal Robots Using Bayesian Interaction Primitives

Authors: Joseph Campbell, Arne Hitzmann, Simon Stepputtis, Shuhei Ikemoto, Koh Hosoda, Heni Ben Amor

Abstract: Musculoskeletal robots that are based on pneumatic actuation have a variety of properties, such as compliance and back-drivability, that render them particularly appealing for human-robot collaboration. However, programming interactive and responsive behaviors for such systems is extremely challenging due to the nonlinearity and uncertainty inherent to their control. In this paper, we propose an approach for learning Bayesian Interaction Primitives for musculoskeletal robots given a limited set of example demonstrations. We show that this approach is capable of real-time state estimation and response generation for interaction with a robot for which no analytical model exists. Human-robot interaction experiments on a 'handshake' task show that the approach generalizes to new positions, interaction partners, and movement velocities.

Submitted 15 August, 2019; originally announced August 2019.

Comments: Accompanying video: https://youtu.be/2fxOn3lIdvo

arXiv:1908.04955 [pdf, other]

Probabilistic Multimodal Modeling for Human-Robot Interaction Tasks

Authors: Joseph Campbell, Simon Stepputtis, Heni Ben Amor

Abstract: Human-robot interaction benefits greatly from multimodal sensor inputs as they enable increased robustness and generalization accuracy. Despite this observation, few HRI methods are capable of efficiently performing inference for multimodal systems. In this work, we introduce a reformulation of Interaction Primitives which allows for learning from demonstration of interaction tasks, while also gracefully handling nonlinearities inherent to multimodal inference in such scenarios. We also empirically show that our method results in more accurate, more robust, and faster inference than standard Interaction Primitives and other common methods in challenging HRI scenarios.

Submitted 14 August, 2019; originally announced August 2019.

Comments: Project website: http://interactive-robotics.engineering.asu.edu/interaction-primitives Accompanying video: https://youtu.be/r5AqfxTDfLA

arXiv:1901.10074 [pdf, other]

CaRENets: Compact and Resource-Efficient CNN for Homomorphic Inference on Encrypted Medical Images

Authors: Jin Chao, Ahmad Al Badawi, Balagopal Unnikrishnan, Jie Lin, Chan Fook Mun, James M. Brown, J. Peter Campbell, Michael Chiang, Jayashree Kalpathy-Cramer, Vijay Ramaseshan Chandrasekhar, Pavitra Krishnaswamy, Khin Mi Mi Aung

Abstract: Convolutional neural networks (CNNs) have enabled significant performance leaps in medical image classification tasks. However, translating neural network models for clinical applications remains challenging due to data privacy issues. Fully Homomorphic Encryption (FHE) has the potential to address this challenge as it enables the use of CNNs on encrypted images. However, current HE technology poses immense computational and memory overheads, particularly for high-resolution images such as those seen in the clinical context. We present CaRENets: Compact and Resource-Efficient CNNs for high performance and resource-efficient inference on high-resolution encrypted images in practical applications. At the core, CaRENets comprises a new FHE compact packing scheme that is tightly integrated with CNN functions. CaRENets offers dual advantages of memory efficiency (due to compact packing of images and CNN activations) and inference speed (due to the reduction in the number of ciphertexts created and the associated mathematical operations) over standard interleaved packing schemes. We apply CaRENets to perform homomorphic abnormality detection with 80-bit security level in two clinical conditions - Retinopathy of Prematurity (ROP) and Diabetic Retinopathy (DR). The ROP dataset comprises 96 x 96 grayscale images, while the DR dataset comprises 256 x 256 RGB images. We demonstrate over 45x improvement in memory efficiency and 4-5x speedup in inference over the interleaved packing schemes. As our approach enables memory-efficient low-latency HE inference without imposing additional communication burden, it has implications for practical and secure deep learning inference in clinical imaging.

Submitted 28 January, 2019; originally announced January 2019.

arXiv:1901.06080 [pdf, other]

Accelerated Experimental Design for Pairwise Comparisons

Authors: Yuan Guo, Jennifer Dy, Deniz Erdogmus, Jayashree Kalpathy-Cramer, Susan Ostmo, J. Peter Campbell, Michael F. Chiang, Stratis Ioannidis

Abstract: Pairwise comparison labels are more informative and less variable than class labels, but generating them poses a challenge: their number grows quadratically in the dataset size. We study a natural experimental design objective, namely, D-optimality, that can be used to identify which $K$ pairwise comparisons to generate. This objective is known to perform well in practice, and is submodular, making the selection approximable via the greedy algorithm. A naïve greedy implementation has $O(N^2d^2K)$ complexity, where $N$ is the dataset size, $d$ is the feature space dimension, and $K$ is the number of generated comparisons. We show that, by exploiting the inherent geometry of the dataset--namely, that it consists of pairwise comparisons--the greedy algorithm's complexity can be reduced to $O(N^2(K+d)+N(dK+d^2) +d^2K).$ We apply the same acceleration also to the so-called lazy greedy algorithm. When combined, the above improvements lead to an execution time of less than 1 hour for a dataset with $10^8$ comparisons; the naïve greedy algorithm on the same dataset would require more than 10 days to terminate.

Submitted 17 January, 2019; originally announced January 2019.

arXiv:1811.02539 [pdf, other]

Deep feature transfer between localization and segmentation tasks

Authors: Szu-Yeu Hu, Andrew Beers, Ken Chang, Kathi Höbel, J. Peter Campbell, Deniz Erdogumus, Stratis Ioannidis, Jennifer Dy, Michael F. Chiang, Jayashree Kalpathy-Cramer, James M. Brown

Abstract: In this paper, we propose a new pre-training scheme for U-net based image segmentation. We first train the encoding arm as a localization network to predict the center of the target, before extending it into a U-net architecture for segmentation. We apply our proposed method to the problem of segmenting the optic disc from fundus photographs. Our work shows that the features learned by encoding arm can be transferred to the segmentation network to reduce the annotation burden. We propose that an approach could have broad utility for medical image segmentation, and alleviate the burden of delineating complex structures by pre-training on annotations that are much easier to acquire.

Submitted 10 November, 2018; v1 submitted 6 November, 2018; originally announced November 2018.

arXiv:1805.03144 [pdf, other]

High-resolution medical image synthesis using progressively grown generative adversarial networks

Authors: Andrew Beers, James Brown, Ken Chang, J. Peter Campbell, Susan Ostmo, Michael F. Chiang, Jayashree Kalpathy-Cramer

Abstract: Generative adversarial networks (GANs) are a class of unsupervised machine learning algorithms that can produce realistic images from randomly-sampled vectors in a multi-dimensional space. Until recently, it was not possible to generate realistic high-resolution images using GANs, which has limited their applicability to medical images that contain biomarkers only detectable at native resolution. Progressive growing of GANs is an approach wherein an image generator is trained to initially synthesize low resolution synthetic images (8x8 pixels), which are then fed to a discriminator that distinguishes these synthetic images from real downsampled images. Additional convolutional layers are then iteratively introduced to produce images at twice the previous resolution until the desired resolution is reached. In this work, we demonstrate that this approach can produce realistic medical images in two different domains; fundus photographs exhibiting vascular pathology associated with retinopathy of prematurity (ROP), and multi-modal magnetic resonance images of glioma. We also show that fine-grained details associated with pathology, such as retinal vessels or tumor heterogeneity, can be preserved and enhanced by including segmentation maps as additional channels. We envisage several applications of the approach, including image augmentation and unsupervised classification of pathology.

Submitted 9 May, 2018; v1 submitted 8 May, 2018; originally announced May 2018.

arXiv:1711.03087 [pdf, other]

doi 10.1109/TG.2018.2861759

Exploration in NetHack With Secret Discovery

Authors: Jonathan C. Campbell, Clark Verbrugge

Abstract: Roguelike games generally feature exploration problems as a critical, yet often repetitive element of gameplay. Automated approaches, however, face challenges in terms of optimality, as well as due to incomplete information, such as from the presence of secret doors. This paper presents an algorithmic approach to exploration of roguelike dungeon environments. Our design aims to minimize exploration time, balancing coverage and discovery of secret areas with resource cost. Our algorithm is based on the concept of occupancy maps popular in robotics, adapted to encourage efficient discovery of secret access points. Through extensive experimentation on NetHack maps we show that this technique is significantly more efficient than simpler greedy approaches and an existing automated player. We further investigate optimized parameterization for the algorithm through a comprehensive data analysis. These results point towards better automation for players as well as heuristics applicable to fully automated gameplay.

Submitted 6 August, 2018; v1 submitted 8 November, 2017; originally announced November 2017.

Comments: 11 pages, 11 figures. Accepted in IEEE Transactions on Games. Revision adds BotHack comparison, result breakdown by num. map rooms, and improved optimal solution

arXiv:1706.01977 [pdf, other]

From the Lab to the Desert: Fast Prototyping and Learning of Robot Locomotion

Authors: Kevin Sebastian Luck, Joseph Campbell, Michael Andrew Jansen, Daniel M. Aukes, Heni Ben Amor

Abstract: We present a methodology for fast prototyping of morphologies and controllers for robot locomotion. Going beyond simulation-based approaches, we argue that the form and function of a robot, as well as their interplay with real-world environmental conditions are critical. Hence, fast design and learning cycles are necessary to adapt robot shape and behavior to their environment. To this end, we present a combination of laminate robot manufacturing and sample-efficient reinforcement learning. We leverage this methodology to conduct an extensive robot learning experiment. Inspired by locomotion in sea turtles, we design a low-cost crawling robot with variable, interchangeable fins. Learning is performed using both bio-inspired and original fin designs in an artificial indoor environment as well as a natural environment in the Arizona desert. The findings of this study show that static policies developed in the laboratory do not translate to effective locomotion strategies in natural environments. In contrast to that, sample-efficient reinforcement learning can help to rapidly accommodate changes in the environment or the robot.

Submitted 6 June, 2017; originally announced June 2017.

Comments: Submitted to Robotics: Science and Systems (RSS 2017)

arXiv:1704.05505 [pdf, other]

Making Sense of Unstructured Text Data

Authors: Lin Li, William M. Campbell, Cagri Dagli, Joseph P. Campbell

Abstract: Many network analysis tasks in social sciences rely on pre-existing data sources that were created with explicit relations or interactions between entities under consideration. Examples include email logs, friends and followers networks on social media, communication networks, etc. In these data, it is relatively easy to identify who is connected to whom and how they are connected. However, most of the data that we encounter on a daily basis are unstructured free-text data, e.g., forums, online marketplaces, etc. It is considerably more difficult to extract network data from unstructured text. In this work, we present an end-to-end system for analyzing unstructured text data and transforming the data into structured graphs that are directly applicable to a downstream application. Specifically, we look at social media data and attempt to predict the most indicative words from users' posts. The resulting keywords can be used to construct a context+content network for downstream processing such as graph-based analysis and learning. With that goal in mind, we apply our methods to the application of cross-domain entity resolution. The performance of the resulting system with automatic keywords shows improvement over the system with user-annotated hashtags.

Submitted 18 April, 2017; originally announced April 2017.

Showing 1–50 of 61 results for author: Campbell, J