We let the participants interact autonomously with two different cobot planners. The first one is a proactive robot that runs our A-POMDP model in Figure 2. This cobot first anticipates a human characteristic, e.g., lost attention, incapability, or tiredness, and then estimates whether the human needs assistance (extended short-term adaptation, in Section 3.2.1). In contrast, the other cobot, the reactive robot, does not handle the unanticipated behaviors of a human. It treats a human's need for help as a directly observable (deterministic) state. The reactive robot deterministically decides that a human needs help when (i) a certain time duration has passed without a cube placement (i.e., without a subtask completion), (ii) the human is not detected around the workplace, or (iii) the human fails in a subtask. We design the reactive robot by removing anticipation stage-1 from our A-POMDP model in Figure 2. With anticipation stage-2 made deterministic, the resulting model is an MDP. Through this comparison, our intention is to show the importance of handling unanticipated human behaviors (i.e., interpreting such human states stochastically) for improved short-term adaptation.
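For illustration, the reactive robot's deterministic help-need rule, i.e., conditions (i)–(iii) above, can be sketched as follows; the timeout value and the observation fields are illustrative assumptions, not the exact parameters of the deployed system.

```python
# Illustrative sketch of the reactive robot's deterministic help-need rule
# (conditions (i)-(iii) above). The threshold value and field names are
# hypothetical; the actual implementation and parameters may differ.
from dataclasses import dataclass

IDLE_TIMEOUT_S = 20.0  # assumed duration without a cube placement (condition i)

@dataclass
class HumanObservation:
    seconds_since_last_placement: float  # time since the last completed subtask
    human_detected: bool                 # is the human around the workplace?
    subtask_failed: bool                 # did the human fail the current subtask?

def reactive_needs_help(obs: HumanObservation) -> bool:
    """Deterministic 'human needs help' decision of the reactive (MDP) cobot."""
    return (
        obs.seconds_since_last_placement > IDLE_TIMEOUT_S  # (i) no placement for too long
        or not obs.human_detected                           # (ii) human not detected
        or obs.subtask_failed                               # (iii) subtask failure
    )

# Example: an idling human triggers a take-over by the reactive robot.
print(reactive_needs_help(HumanObservation(25.0, True, False)))  # True
```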
6.1.3 Results and Discussions.
In Figure 12, we give the box-and-whisker plots of the success and efficiency analysis; Figure 12(e) lists the numerical values and the ANOVA results. As seen in Figure 12(a), the task success rate when the participants worked without a cobot is significantly worse than when collaborating with either of the cobot models (\(p \lt 0.05\)). Collaboration with the proactive robot increases task success by \(\sim 41\%\) and with the reactive robot by \(\sim 35\%\), compared to a human working alone. Similarly, a human's success rate also significantly increased when the human collaborated with either of the cobots (in Figure 12(b), with \(p \lt 0.05\) for both cases), and with the proactive one this increase is slightly higher (a \(\sim 35\%\) increase in the human success rate).
In Table 4, we give the Likert-scale (1–5 with increasing agreement) results of the participants' statements throughout this experiment. Table 4(a) shows the statements for analyzing a cobot's impact on such a challenging task. The participants state that both of the cobots helped them remember the task rules and that a robot collaboration is beneficial in such tasks (see the high mean ratings in the table; the sum across all Likert items significantly favors the impact of the proactive case). Finally, the average task efficiencies are shown in Figure 12(d). Both the reactive and the proactive robot significantly improved the task efficiency (\(p \lt 0.05\)) compared to a human working alone (an increase of \(\sim 56\%\) for the proactive robot and \(21\%\) for the reactive one; see the overall values in Figure 12(e)). Thereby, we underscore the importance of such cobots collaborating with humans in challenging tasks. The success rate analysis, the efficiency results, and the subjective ratings of the participants support Hypothesis 1.
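For reference, the reported significance tests are ANOVA comparisons; below is a minimal sketch of how such a comparison could be run, using placeholder success-rate arrays rather than the actual participant data.

```python
# Minimal sketch of a one-way ANOVA comparison behind statements such as
# "p < 0.05"; the success-rate arrays are placeholders, not the study data.
from scipy.stats import f_oneway

success_alone = [0.55, 0.60, 0.48, 0.62, 0.58]      # hypothetical per-participant rates
success_reactive = [0.80, 0.85, 0.78, 0.90, 0.82]
success_proactive = [0.88, 0.92, 0.85, 0.95, 0.90]

f_stat, p_value = f_oneway(success_alone, success_reactive, success_proactive)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")  # significant if p < 0.05
```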
Even though the proactive robot on average achieved higher success rates than the reactive one (see Figure 12(e)), the difference between them is not significant. The decision models only decide to what extent to take over a subtask; after that, both cobots succeed in placing the cubes (over \(95\%\)). Therefore, some participants were comfortable leaving the task to the cobot once they realized its capability. For the reactive robot, this happened significantly more often than for the proactive one due to its deterministic rules. Still, the proactive robot provides more stable success than the reactive one by keeping the variance low (see Figure 12(a)). In both cases, our main concern is how much of this success actually comes from the human. Figure 12(c) shows that the proactive robot significantly increased the human's contribution to success when compared to the reactive robot (\(p = 0.038\)). The reactive robot also respects the initial task assignment; however, it favors taking over a task when, for example, a human idles too long, disregarding unanticipated human behaviors and preferences. This led to a decrease in a human's successful contribution to a task during her collaboration with the reactive robot when compared to her performance alone (as shown in Figure 12(e)). Finally, since a higher efficiency is achieved when a task is successfully accomplished by its assigned collaborator, the proactive robot significantly increased the task efficiency of a human working alone by \(\sim 57\%\) (\(p=0.0023\)) and surpassed the task efficiency achieved with the reactive robot (with \(p=0.0104\) in Figure 12(e)).
As discussed in Section 6.1.1, naturalness reflects fluent communication, in that handovers and turn-taking need to be interpreted correctly by both collaborators. Figure 13(a) shows that the proactive robot kept the number of warnings close to zero, suggesting a higher accuracy in estimating a human's unanticipated behaviors and need for help. In Figure 13(c), the ANOVA results point out the significantly higher number of warnings the reactive robot received, which is 3.6 times that of the proactive robot (\(p \lt 0.0001\)). The participants also evaluated whether a collaboration with a cobot felt comparatively natural to them, i.e., more human-like. The participants stated that the reactive robot's interference distracted them significantly more than in the proactive case (with \(p=0.023\) in Table 4(a)). This is largely due to the significantly more frequent unexpected interferences from the reactive robot (with \(p=0.012\), as shown in Table 4(b)). The “expectation” here is an ambiguous term that might differ from one person to another; however, it is commonly argued in the literature that an efficient collaboration is achieved when the partners reach a joint intention. Thus, the expectations of the partners are often directed toward understanding each other and achieving joint action on a task [7].
Finally, Figure 13(b) shows the rewards gathered by the cobots, which combine task success and the number of warnings. As expected, the proactive robot received 2.6 times the reward of the reactive robot (\(p \lt 0.0001\)). With that and the analysis above, we conclude that our A-POMDP cobot model (the proactive robot) leads to a more efficient and natural collaboration when compared to the same cobot design that does not handle a human's unanticipated behaviors, which supports Hypothesis 2.
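As a rough illustration of a reward that combines task success and warnings, consider the toy computation below; the weight values are hypothetical and do not correspond to the actual A-POMDP reward function.

```python
# Toy sketch of a reward combining task success and warnings. The weights are
# hypothetical placeholders, not the actual A-POMDP reward parameters.
def episode_reward(subtasks_succeeded: int, warnings: int,
                   success_reward: float = 10.0, warning_penalty: float = 5.0) -> float:
    """Accumulate a positive reward per successful subtask and a penalty per warning."""
    return success_reward * subtasks_succeeded - warning_penalty * warnings

# A run with many warnings yields a lower total reward than a fluent one.
print(episode_reward(subtasks_succeeded=8, warnings=1))  # 75.0
print(episode_reward(subtasks_succeeded=8, warnings=5))  # 55.0
```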
Hypothesis 3 is evaluated through the subjective statements of the participants in Table 4. First, since the collaboration with both of the cobots achieved very high success rates, the participants rated their trust in both cobots very high (a mean rating of 4.43 for the proactive robot and 4.05 for the reactive robot, out of 5.00, in Table 4(c)). The participants still thought that the proactive robot was significantly more trustworthy than the reactive one (with \(p=0.041\)). The participants also think that the proactive robot took over the task with significantly more accurate timing, i.e., when they needed assistance, than the reactive robot (with \(p=0.002\)). This is also supported by the statements “The robot acted as I expected” and “The robot was able to adapt to my assistance needs”, which are both rated significantly higher for the proactive robot (with \(p=0.012\) and \(p=0.004\), respectively, in Table 4(b)). Similarly, a consistent result is obtained for the negative statement, “I did not want the robot to take over the task”, which is rated significantly higher for the reactive robot that took over more frequently (\(p=0.030\) in Table 4(c)). All in all, these analyses point to a better collaboration experience with the proactive robot, resulting in a higher level of trust (the sum of scores under Table 4(c), i.e., the Likert scale for the trust and collaboration capability of a robot, significantly favors the proactive case).
In general, better anticipation of a person's assistance needs, respect for her preferences, and greater trust suggest a higher acceptance of the proactive robot. In addition, the proactive robot contributed to an increased performance of its partner and led to more efficient task completion. We thus conclude that the proactive robot has more positive teammate traits than the reactive one. The participants also indirectly support this by stating that they would prefer to work with the proactive robot on this kind of demanding task significantly more than with the reactive robot, even though both of the cobots are rated high (\(\mu_{proactive}=4.50\) out of 5.00, with \(p=0.004\) as shown in Table 4(c)). More positive teammate traits and a higher trust may already indicate a higher perceived collaboration for the participants. As a supporting statement, the participants think that the reactive robot was significantly more competitive in its behaviors, whereas the ratings for the proactive robot were below average (with \(p\lt 0.001\) in Table 4(c)). However, since the partners share a mutual goal, competition is not desirable for better team performance. Finally, the participants affirm that they felt more comfortable with the proactive robot (\(p=0.003\)) and were more pleased collaborating with it (\(p=0.011\)), all pointing to a higher perceived collaboration with the proactive robot.
Many of the positive traits of the proactive robot mentioned above are a result of better human adaptation skills, which are hard to observe and evaluate directly in HRI in general, especially when they require reasoning about hidden human states in a static environment. In simulation experiments, we are able to track the accuracy of the belief estimation over human states since the ground-truth information of the human states is available through the simulated human decision models (in Figure 5). In our previous study, we found that the average estimation accuracy decreases proportionally as the frequency of unexpected human behaviors increases [16]. In real-world experiments, the ground-truth information of whether a cobot has accurately anticipated a human's state and adapted to the human accordingly is not explicit and can only be known to the interacting human. Hence, we ask the participants to evaluate this broadly. In Table 4(b), we show that the participants think the proactive robot was able to adapt significantly better to their assistance needs (with \(p=0.004\)), whereas the reactive robot behaved more repetitively rather than responding to their changing behaviors (\(p\lt 0.001\)). The sum of scores in Table 4(b) (the Likert scale on overall adaptation capability) also significantly favors the proactive case. Finally, at the end of the experiments, we asked the participants two direct technical comparison questions in which the robots were renamed anonymously (see the comparison category in Table 3). For the first question, the vast majority of the participants, i.e., \(71.4\%\), picked the reactive robot as the cobot that follows preset rules instead of responding to their changing needs and preferences. For the second one, \(78.6\%\) of the subjects picked the proactive robot as the one that was learning and adapting better to the participant's assistance needs. From these statements, we conclude that our A-POMDP model, which adapts to a human's unanticipated behaviors, shows better adaptation skills and achieves a higher perceived collaboration, trust, and more positive teammate traits than a cobot model that does not handle such behaviors, which supports Hypothesis 3.
With this experiment, we show the negative impact of unanticipated human behaviors, which are mostly overlooked in HRC studies, on the fluency of a collaboration. Despite the diverse backgrounds of the participants, their statements show great consistency, suggesting that unexpected cobot interference occurs mostly due to wrong anticipation of, and adaptation to, the current behavior of a person, which then results in a significantly less efficient and less natural collaboration. Since our robot decision model is a POMDP, it does not learn from the history of the interaction, but it was able to reach complex conclusions about the human states. For instance, referring to Figure 2, when our robot observes a subtask succeeded several times for a participant placing the cubes correctly, it is very likely to anticipate the state of Human is not struggling. In that case, if the participant is observed to be idling for a long time (inactivity), the robot is still likely to anticipate that the human will take care of the job, and so it does not interfere. In other words, the robot concludes that “the human may still be assessing the subtask (after a long wait) as she has mostly succeeded so far”. If this inactivity of the participant continues or if it is followed by an observation of a subtask failed, the likelihood of transitioning to the state of Human may be tired significantly increases. After such a belief update, the robot is likely to offer assistance and act on the task. The robot's belief translates into: “A longer wait may indicate tiredness after a decrease in her performance. I had better assist her with the task”. Such conclusions are reached due to the probabilistic distributions over our state machine design with multiple anticipation stages (see Figure 2). With this experiment and analysis, we validate our cobot's extended short-term adaptation skills on a real setup, which is in line with our simulation experiments in [16].
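The toy belief update below illustrates this kind of reasoning; the states, observations, and all probability values are illustrative assumptions, not the actual transition and observation functions of the A-POMDP in Figure 2.

```python
import numpy as np

# Toy belief update illustrating the dynamics described above. States,
# observations, and all probabilities are illustrative; they are not the
# actual A-POMDP transition/observation functions from Figure 2.
states = ["not_struggling", "may_be_tired"]
observations = ["subtask_succeeded", "inactivity", "subtask_failed"]

# T[s, s']: probability of moving from state s to s' between decision steps.
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# O[s', o]: probability of observing o in the new state s'.
O = np.array([[0.7, 0.25, 0.05],   # not struggling: mostly successes
              [0.1, 0.55, 0.35]])  # may be tired: mostly idling/failures

def belief_update(b, obs_idx):
    """Standard POMDP belief update: b'(s') ~ O(s', o) * sum_s T(s, s') * b(s)."""
    b_pred = b @ T
    b_new = O[:, obs_idx] * b_pred
    return b_new / b_new.sum()

b = np.array([0.5, 0.5])
for o in ["subtask_succeeded", "subtask_succeeded", "inactivity", "subtask_failed"]:
    b = belief_update(b, observations.index(o))
    print(f"after {o:17s} -> P({states[1]}) = {b[1]:.2f}")
# Repeated successes keep the belief on 'not struggling'; continued inactivity
# followed by a failure shifts it toward 'may be tired'.
```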
We are aware that the same intrinsic parameters of our A-POMDP design may not respond reliably to all human types. To address this, our first attempt was to incorporate different human types into our A-POMDP design. After optimizing for an online POMDP solver, the model in its current design (in Figure 2) takes approximately 1 second to respond to each observation in real time (see Section 5.2.2). However, incorporating various human types in the same model results in more complex transition and observation functions. The robot's responses became significantly inaccurate, with a response time of approximately 3 seconds for each action decision. When we deployed this model in early experiments, many of the tasks could not be completed (e.g., due to a participant's frustration) or were completed with significantly lower success rates. In the end, the POMDP model became inapplicable and the experiments were inconclusive; that is, we could not generate any quantitative results. Therefore, instead of modeling human types as a latent variable in a single model, our intuition is to design a separate Bayesian inference procedure for the human states that change less frequently (i.e., the human types), whereas the A-POMDP handles only the more frequent human dynamics. For that, the adaptive policy selection mechanism provides a faster and further destabilization of the robot traits by selecting (not learning) different policies according to the knowledge accumulated in the long term. The next section discusses this long-term adaptation mechanism.
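Before turning to that mechanism, the following minimal sketch conveys the intuition of selecting (rather than learning) a policy via a Bayesian update over slowly changing human types; the type names, likelihood values, and the argmax selection rule are illustrative assumptions, not the actual design detailed in the next section.

```python
import numpy as np

# Minimal sketch of selecting (not learning) a policy via Bayesian inference
# over slowly changing human types. Type names, likelihood values, and the
# argmax selection rule are illustrative assumptions, not the actual
# long-term adaptation mechanism described in the next section.
human_types = ["distracted", "tired", "efficient"]
policies = {"distracted": "policy_A", "tired": "policy_B", "efficient": "policy_C"}

# P(episode outcome features | human type); placeholder values.
likelihood = {
    "frequent_idling":   np.array([0.6, 0.3, 0.1]),
    "frequent_failures": np.array([0.3, 0.6, 0.1]),
    "smooth_progress":   np.array([0.1, 0.2, 0.7]),
}

belief = np.ones(len(human_types)) / len(human_types)  # uniform prior over types

def update_and_select(belief, outcome):
    """Bayesian update of the type belief after an episode, then pick a policy."""
    posterior = belief * likelihood[outcome]
    posterior /= posterior.sum()
    chosen = policies[human_types[int(np.argmax(posterior))]]
    return posterior, chosen

for outcome in ["frequent_idling", "frequent_idling", "smooth_progress"]:
    belief, policy = update_and_select(belief, outcome)
    print(outcome, "->", dict(zip(human_types, belief.round(2))), "use", policy)
```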