As AI continues to advance, human-AI teams are inevitable. However, progress in AI is routinely measured in isolation, without a human in the loop. It is crucial to benchmark progress in AI, not just in isolation, but also in terms of how it translates to helping humans perform certain tasks, i.e., the performance of human-AI teams. In this work, we design a cooperative game - GuessWhich - to measure human-AI team performance in the specific context of the AI being a visual conversational agent. GuessWhich involves live interaction between the human and the AI. The AI, which we call ALICE, is provided an image which is unseen by the human. Following a brief description of the image, the human questions ALICE about this secret image to identify it from a fixed pool of images. We measure performance of the human-ALICE team by the number of guesses it takes the human to correctly identify the secret image after a fixed number of dialog rounds with ALICE. We compare performance of the h...
We learn to identify decision states, namely the parsimonious set of states where decisions meaningfully affect the future states an agent can reach in an environment. We utilize the variational intrinsic control (VIC) framework, which maximizes an agent's `empowerment' -- i.e., the ability to reliably reach a diverse set of states -- and formulate a sandwich bound on the empowerment objective that allows identification of decision states. Unlike previous work, our decision states are discovered without extrinsic rewards -- simply by interacting with the world. Our results show that our decision states (1) are often interpretable, and (2) lead to better exploration on downstream goal-driven tasks in partially observable environments.
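As a rough sketch of the quantity being bounded (following the standard VIC formulation of Gregor et al., 2016; the notation below is illustrative and is not the paper's exact sandwich bound), empowerment is the mutual information between the option and the final state reached, and is typically trained through a variational lower bound:

% Empowerment: mutual information between the option \Omega sampled at the
% start state s_0 and the final state s_f reached by the option policy.
\begin{align}
  \mathcal{I}(\Omega; s_f \mid s_0)
    &= \mathcal{H}(\Omega \mid s_0) - \mathcal{H}(\Omega \mid s_f, s_0) \\
  % Variational (Barber--Agakov) lower bound: the intractable posterior
  % p(\Omega \mid s_f, s_0) is replaced by a learned inference network q.
    &\geq \mathbb{E}\big[ \log q(\Omega \mid s_f, s_0) - \log p(\Omega \mid s_0) \big].
\end{align}
% A sandwich bound pairs a lower bound of this form with an upper bound on
% the same objective; decision states are then read off from the bound's
% per-state terms.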
Theory of Mind is the ability to attribute mental states (beliefs, intents, knowledge, perspectives, etc.) to others and recognize that these mental states may differ from one's own. Theory of Mind is critical to effective communication and to teams demonstrating higher collective performance. To effectively leverage the progress in Artificial Intelligence (AI) to make our lives more productive, it is important for humans and AI to work well together in a team. Traditionally, there has been much emphasis on research to make AI more accurate, and (to a lesser extent) on having it better understand human intentions, tendencies, beliefs, and contexts. The latter involves making AI more human-like and having it develop a theory of our minds. In this work, we argue that for human-AI teams to be effective, humans must also develop a theory of AI's mind (ToAIM) - get to know its strengths, weaknesses, beliefs, and quirks. We instantiate these ideas within the domain of Visual Quest...
We introduce EvalAI, an open source platform for evaluating artificial intelligence (AI) algorithms at scale. EvalAI is built to provide a scalable solution to the research community to fulfill the critical need of evaluating ML models and AI agents in a dynamic environment, either against ground-truth annotations or by interacting with a human. This helps researchers, students, and data scientists create, collaborate on, and participate in AI challenges organized around the globe. By simplifying and standardizing the process of benchmarking these models, EvalAI seeks to lower the barrier to entry for participating in the global scientific effort to push the frontiers of ML and AI, thereby increasing the rate of measurable progress in this domain. Our code is available at https://github.com/Cloud-CV/EvalAI.
We introduce EvalAI, an open source platform for evaluating and comparing machine learning (ML) and artificial intelligence (AI) algorithms at scale. EvalAI is built to provide a scalable solution to the research community to fulfill the critical need of evaluating machine learning models and agents acting in an environment, either against annotations or with a human in the loop. This helps researchers, students, and data scientists create, collaborate on, and participate in AI challenges organized around the globe. By simplifying and standardizing the process of benchmarking these models, EvalAI seeks to lower the barrier to entry for participating in the global scientific effort to push the frontiers of machine learning and artificial intelligence, thereby increasing the rate of measurable progress in this domain.
We present a hierarchical reinforcement learning (HRL) or options framework for identifying decision states. Informally speaking, these are states considered important by the agent's policy; e.g., for navigation, decision states would be crossroads or doors where an agent needs to make strategic decisions. While previous work (most notably Goyal et al., 2019) discovers decision states in a task/goal-specific (or 'supervised') manner, we do so in a goal-independent (or 'unsupervised') manner, i.e., entirely without any goal or extrinsic rewards. Our approach combines two hitherto disparate ideas: 1) \emph{intrinsic control} (Gregor et al., 2016; Eysenbach et al., 2018): learning a set of options that allow an agent to reliably reach a diverse set of states, and 2) \emph{information bottleneck} (Tishby et al., 2000): penalizing the mutual information between the option $\Omega$ and the states $s_t$ visited in the trajectory. The former encourages an agent to reliab...
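Concretely, the combined objective has roughly the following form (a sketch under the notation above; the trade-off coefficient $\beta$ and the exact conditioning of each term are illustrative rather than the paper's precise formulation):

% Information-regularized intrinsic control (sketch): maximize empowerment
% while penalizing how much information the visited states carry about the
% option; \beta trades off the two terms.
\begin{equation}
  \max_{\pi,\, q} \;\; \mathcal{I}(\Omega; s_f \mid s_0) \;-\; \beta \sum_{t} \mathcal{I}(\Omega; s_t)
\end{equation}
% States at which the policy must retain information about \Omega despite the
% penalty -- i.e., where the per-state information term remains high -- are
% the candidate decision states.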
We propose a novel framework to identify sub-goals useful for exploration in sequential decision making tasks under partial observability. We utilize the variational intrinsic control framework (Gregor et al., 2016), which maximizes empowerment -- the ability to reliably reach a diverse set of states -- and show how to identify sub-goals as states with high necessary option information through an information-theoretic regularizer. Despite being discovered without explicit goal supervision, our sub-goals provide better exploration and sample complexity on challenging grid-world navigation tasks compared to supervised counterparts in prior work.
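One way to make "necessary option information" concrete (a hedged sketch in the spirit of the InfoBot-style regularizer of Goyal et al., 2019, not necessarily the paper's exact quantity; $\pi_0$ denotes an illustrative option-agnostic default policy):

% Score each state by how strongly behavior there depends on the option.
% The expected KL to any option-agnostic default policy \pi_0 upper-bounds
% the per-state mutual information between the option and the action.
\begin{equation}
  d(s) \;=\; \mathbb{E}_{\Omega}\!\left[ D_{\mathrm{KL}}\!\big( \pi(a \mid s, \Omega) \,\|\, \pi_0(a \mid s) \big) \right]
  \;\;\geq\;\; \mathcal{I}(\Omega; a \mid s).
\end{equation}
% States with high d(s) -- where default behavior no longer suffices -- are
% flagged as sub-goals / decision states.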
Learning diverse and reusable skills in the absence of rewards in an environment is a key challenge in reinforcement learning. One solution to this problem, as has been explored in prior work (Gregor et al., 2016; Eysenbach et al., 2018; Achiam et al., 2018), is to learn a set of intrinsic macro-actions or options that reliably correspond to trajectories when executed in an environment. Within this options framework, we identify decision-states (e.g., crossroads), where one needs to make a decision, as distinct from corridors (where one can follow default behavior) in the modeling of options. Our intuition is that identifying decision states would lead to more interpretable behavior from an RL agent, exposing clearly what the underlying options correspond to. We formulate this as an information-regularized intrinsic control problem, using techniques similar to Goyal et al. (2019), who applied the information bottleneck to goal-driven tasks. Our qualitative res...
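For intuition, here is a minimal, self-contained sketch (the state names, action distributions, and helper functions below are hypothetical, invented purely for illustration; this is not the paper's implementation) of how a per-state divergence between option-conditioned behavior and a default policy separates decision states from corridors:

import numpy as np

def kl_categorical(p, q, eps=1e-8):
    """KL divergence between two categorical action distributions."""
    p, q = np.asarray(p, dtype=float) + eps, np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def decision_state_scores(option_policies, default_policy):
    """Score each state by how much the option-conditioned action
    distributions deviate from the option-agnostic default policy.
    option_policies: dict mapping option id -> {state: action distribution}.
    default_policy:  {state: action distribution}.
    Returns {state: mean KL over options}; high scores mark decision states."""
    scores = {}
    for state, default_dist in default_policy.items():
        kls = [kl_categorical(pol[state], default_dist)
               for pol in option_policies.values() if state in pol]
        if kls:
            scores[state] = float(np.mean(kls))
    return scores

# Toy example: two options in a T-shaped maze with actions (left, right).
# At the junction the options diverge; in the corridor both options follow
# the same default behavior.
option_policies = {
    0: {"corridor": [0.5, 0.5], "junction": [0.9, 0.1]},
    1: {"corridor": [0.5, 0.5], "junction": [0.1, 0.9]},
}
default_policy = {"corridor": [0.5, 0.5], "junction": [0.5, 0.5]}

scores = decision_state_scores(option_policies, default_policy)
# "junction" receives a much higher score than "corridor", i.e., it is the
# state identified as a decision state.
print(sorted(scores.items(), key=lambda kv: -kv[1]))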
In this supplement, we further discuss the specificity of the obtained domain-specific masks (Section 0.1). Following this, we discuss how sparsity as an incentive compares with sIoU in terms of learning a balance between specificity and invariance and in terms of performance (Section 0.2). In Section 0.3, we discuss alternative techniques for directly ensembling masks instead of the output predictions in response to each mask. In Section 0.4, we provide more extensive comparisons to prior work on the PACS [10] dataset. Finally, in Section 0.5, we describe in detail the implementation and other details associated with our experiments. We use C, I, P, Q, R, S to denote the domains -- clipart, infograph, painting, quickdraw, real, and sketch, respectively -- on the DomainNet [17] dataset.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019