1 Introduction
With advancing technology, the workplace has changed substantially over the past years. Robots are rapidly moving into manufacturing and human environments, with applications ranging from elderly care to service in office environments and collaborative work on industrial production lines [
11]. The fluency and success of these new forms of human-machine interaction crucially depend on human-robot trust [
29]. Freedy et al. [
26], for instance, showed that levels of trust are correlated with the number of manual interventions overriding a robot's actions. Whereas overtrusting the robot led to an omission of actions with negative performance outcomes, distrusting the robot resulted in unnecessary interventions. Other studies revealed that trust in an automated system, rather than its objective reliability, drove reliance on the system's advice [
18]. When participants distrusted the system, the system's advice was not followed even though it was correct. Further, Martelaro et al. [
45] found a mediating effect of trust on the relationship between a robot's depiction as being vulnerable and feelings of companionship in interaction with a social robot.
Whereas shaping factors of trust in
human-robot interaction (HRI) have already been identified, the actual impact of these factors is often not well understood. One such example is anthropomorphism in HRI. A human-like robot design is suggested to promote trust in the robot, but this positive effect seems to be limited to certain levels of anthropomorphism (e.g. it breaks down in the uncanny valley [
59,
66,
67]) and certain application domains (e.g. social robotics [
27]). Therefore, the current study aims to broaden the scope of research on the impact of anthropomorphism on trust by addressing an application domain that has not gained much attention yet: the industrial context.
Moreover, the study addresses the effect of anthropomorphism on attention allocation in industrial HRI. Most studies so far have investigated how anthropomorphic robot design affects subjective measures like perceived intelligence, likability or acceptance of a robot [
17,
30,
36,
72]. Variables more closely linked to coordination behavior and performance have rarely been examined. As attention is crucial for safe and efficient interaction, this variable is explicitly addressed in the current study.
1.1 Trust in HRI
Several definitions have been proposed for trust in human-machine interaction [
29,
38,
48]. One definition that is widely accepted and used was established by Lee & See [
39] with regard to human-automation interaction. According to the two authors, trust is “the attitude that an agent will help achieve an individual's goal in a situation characterized by uncertainty and vulnerability” ([39], p.51). This definition might also be applied to HRI. Trust is described as an attitude in a task-related context. The two contextual components of trust are defined as the vulnerability of the trustor and the uncertainty of the situation.
In work-related HRI, the human relies on the robot to fulfill a task that is important to the human co-worker, e.g. holding a heavy part of a car body to ease a manual welding process. In addition to the vulnerability of the trustor in terms of task fulfillment, the example illustrates a second component of vulnerability in HRI: the physical proximity and the vulnerability of the human co-worker in terms of physical safety. Humans therefore have to rely on the robot not only to adequately support them in the work process but also not to hurt them, especially when the workplaces of humans and robots overlap. The definition's second contextual component is the uncertainty of the situation. In conventional industrial HRI, uncertainty used to be relatively low, as robots were pre-programmed and always performed actions in exactly the same repetitive manner. However, with the introduction of so-called “cobots” (collaborative robots that are easily programmable and therefore flexibly deployable) and the emergence of more and more semi-autonomous mobile robots, the uncertainty in interaction with a robot in the industrial domain has increased.
The proposed trust definition also applies to social robots. In social HRI settings, the uncertainty is even greater: the robot interacts in typically unstructured environments and has to adapt to changing inputs. The task of a social robot is the interaction and communication itself; consequently, the human depends on the robot as the interactional counterpart [
5,
24]. This represents the vulnerability component of trust.
In this respect, the proposed trust definition by Lee and See [
39] is applicable to HRI. As trust is described as an attitude, the definition primarily focuses on cognitive processes, such as expectations and attributions, that are central components of trust. In this respect, however, the definition might be too narrow, as robots differ from automation due to their embodiment as physical interaction partners. Salem et al. [
60] suggest that this creates new risk dimensions and thus trust in robots may vary from trust in automated systems. Because of the embodiment, a robot can be touched and even physically react to human behavior. Therefore, interactions with a robot might resemble important aspects of human-human interaction in which affective processes represent a crucial dimension of trust, too [
40]. In line with this, Tapus and colleagues [
64], for example, found that empathetic robot expressions and speech are perceived as more trustworthy, and Paradeda et al. [
52] showed that the highest levels of trust are gained when a robot starts with small talk and matching facial expressions, as is the case in typical human-human interactions.
When addressing trust in interaction with a technical agent, trust is often measured several times to cope with the dynamic nature of this attitude [
33]. Lewis et al. [
41] differentiate the dynamics of trust into three phases.
Trust formation describes the basic attitude at the beginning of an interaction with the trustee. At this point, trust often has to be built upon the trustee's appearance, context information and prior experience with similar agents, as no interaction experience has been established yet. Once interaction starts, trust is adapted to the actual experiences. If these experiences violate the trustor's expectations,
trust dissolution follows. A
restoration phase describes trust development when positive interactions follow such a negative event [
44].
Interestingly, trust restoration and adaptation do not always seem to be exclusively related to interaction experiences with the robot. Robot failures sometimes do not affect people's decisions of whether to follow the advice of a robot or not, but do affect subjective perceptions of the robot's trustworthiness [
57,
60]. To accommodate this divergence, it is important to account not only for the dynamics of trust but also for the distinction between trust attitude and trust behavior.
In summary, existing trust definitions highlight two components that determine whether trust becomes a relevant construct in interaction with another agent: the trustor has to be vulnerable to the actions of the trustee, and there has to be situational uncertainty. Both components apply to work-related HRI. Seeking a better understanding of the impact of trust in HRI, we further need to address this construct in its dynamic complexity. This means addressing trust not only as an attitude but also considering its behavioral component, as well as trust development over time.
Regarding factors that influence trust, a comprehensive meta-analysis by Hancock et al. [
29] has revealed three categories of impact factors: characteristics of the human, the robot, and the environment. The meta-analysis showed that the strongest effects on trust development are based on the robot's characteristics. These include performance-based factors such as the robot's reliability and failure rate. Attribute-based factors, such as the robot's level of anthropomorphism, are especially important for initial trust, which sets the basis for trust formation.
1.2 Anthropomorphism in HRI
Anthropomorphism is broadly defined as the human tendency to transfer human-like characteristics to non-human entities [
17]. Anthropomorphic characteristics can be implemented in robot design along different dimensions [
51]. A human-like robot appearance is the most apparent anthropomorphic design feature. It is particularly effective in initial interactions as physical appearance establishes expectations and biases interaction [
24]. Other ways to design a robot anthropomorphically are with regard to communication style (e.g. natural speech or gestures [
56]), the robot's movement (e.g. using human-like trajectories for an industrial single-arm robot [
37]) or context (e.g. naming a robot and describing its personality or hobbies [
50]).
Many studies reveal a positive impact of anthropomorphism on the interaction between humans and robots: anthropomorphism improves initial trust perceptions and increases acceptance of robots as team partners [
17]. Moreover, it facilitates human-machine interaction, increases familiarity, and supports users’ perception of predictable and comprehensible robot behaviors [
72]. Human-like robots are perceived as more intelligent [
30] and sociable [
36] and have been shown to be liked more [
12]. More recent studies have demonstrated that humanoid robots are judged more according to human norms than less anthropomorphic ones [
42,
43]. Other studies have revealed that people empathize more with an anthropomorphic robot compared to a non-anthropomorphic one [
15,
55].
As presented, a large body of research confirms positive effects of an anthropomorphic design, especially in social settings. However, research is needed to understand the effect of a robot's anthropomorphic features in work-related domains. Results of the few studies in this application context are mixed. Some report positive effects of anthropomorphism. For instance, in a set of studies, Kuz and colleagues showed positive effects of anthropomorphic trajectories of an industrial single-arm robot compared to functional trajectories [
37,
46]. Human-like movements benefited the anticipation of target positions from the robot's trajectory, whereas this was not possible when movements were modelled in a typically linear style. Further positive effects of anthropomorphism were reported in a hospital case study: hospital staff were friendlier toward a mobile delivery robot and even tolerated its malfunctions more when the robot had been given a human name than when it had not [
14]. This illustrates that even very low levels of anthropomorphism (like calling a robot by a human name) might have the power to cause a more forgiving attitude towards robotic malfunction. In line with that, Ahmad et al. [
3] found that an anthropomorphic robot design positively impacted participants’ trust in an error-prone robot. With low error rates, however, the effect reversed: anthropomorphism had a negative impact on trust. Last but not least, Bernotat and Eyssel [
7] could not confirm any impact of anthropomorphism in work-related HRI. The study investigated judgments of an anthropomorphically designed versus a standard industrial (non-anthropomorphic) robot in smart homes. They did not find an effect of robot type on trust.
Furthermore, first studies have revealed a strong impact of anthropomorphism in HRI on attentional processes. For instance, Bae and Kim [
4] demonstrated that anthropomorphic robot features like a face attract more visual attention compared to non-anthropomorphic designs. If the anthropomorphic design of a robot is functional, i.e. has a task-related purpose, the introduction of human-like features could ease the interaction. In line with this idea, Wiese et al. [
34] showed that people engage in joint attention with robots following their gaze to target locations. Moreover, Moon and colleagues [
47] provided empirical evidence that using human-like gaze cues during human-robot handovers improves the timing and perceived quality of the handover. However, if the anthropomorphic design is not instrumental, these features could be detrimental because of their distracting potential. In this sense, non-instrumental anthropomorphic robot features could encourage people to engage in joint attention that is meaningless in this context and could consequently unsettle and distract the human counterpart. Because anthropomorphic designs without any task relation are increasingly implemented in industrial HRI contexts (e.g. Sawyer and Baxter from HAHN Group/Rethink Robotics, workerbot from pi4_robotics), possible negative consequences of such robot designs need to be addressed in research. The assumed negative effects might be especially relevant for an industrial setting where human operators work in multi-task environments. In these settings, the human's attention should be focused on task-related areas. Non-instrumental anthropomorphism could therefore distract human operators from their primary task.
In sum, several studies have shown a positive effect of anthropomorphic robot design on trust in HRI. However, as most of these studies are set in social contexts, it remains unclear whether positive effects are to be expected in an industrial setting, too. Only a few studies have focused on work-related HRI so far. Whereas anthropomorphic robot movement seems to support users’ perception of predictable robot behavior [
37,
46], a positive impact of anthropomorphism on trust in industrial HRI could not be clearly confirmed as results are mixed [
3,
7]. However, it seems that an anthropomorphic design could lead to a more forgiving attitude when a robot is error-prone [
14]. Last but not least, results from studies on attention allocation indicate possible distracting effects as an inadvertent consequence of non-instrumental anthropomorphic robot design [
4].
1.3 Hypotheses
Based on the state of research regarding trust and anthropomorphism in HRI, we hypothesized that interacting with an anthropomorphic robot should lead to higher trust levels compared to a robot without anthropomorphic features. Moreover, we assumed that an anthropomorphic robot design fosters a more forgiving attitude towards the robot, resulting in a less pronounced trust decline after the experience of a robot failure.
Furthermore, we expected that an anthropomorphic design should lead to a shift in visual attention towards the anthropomorphic features (abstract face), irrespective of their task relevance. However, as the anthropomorphic features had no task relevance, this effect was expected to decrease with increasing interaction experience.
3 Results
3.1 Attitudes Towards Technology
We found no significant differences between the anthropomorphic and the non-anthropomorphic group with regard to the NARS (t(38) = 1.03, p = .309), participants’ propensity to trust technology (U = 198.5, p = .967), their affinity for technology interaction (ATI; t(38) = −0.57, p = .566), or their propensity to anthropomorphize (IDAQ; t(38) = −0.38, p = .699). All participants scored relatively low on the NARS (M = 2.69, SD = 0.60) and had a moderately positive attitude towards technology (Propensity to Trust Technology: MpropTrust = 3.72, SDpropTrust = 0.49; ATI: MaffinityTech = 3.25, SDaffinityTech = 0.99). Regarding participants’ tendency to anthropomorphize, the data showed considerable variance, with values ranging from 20 to 77 (M = 45.7, SD = 14.07).
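The between-group checks above combine parametric and non-parametric two-sample tests. A minimal sketch in Python with SciPy is shown below; the data arrays are simulated placeholders (not the study's data), and the group size of 20 per condition is an assumption inferred from the reported df = 38.

```python
# Hedged sketch of the Section 3.1 group comparisons: an independent-samples
# t-test (e.g. NARS, ATI, IDAQ scores) and a Mann-Whitney U test (propensity
# to trust technology). All data here are simulated placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_per_group = 20                      # assumed: t(38) implies 2n - 2 = 38
nars_anthro = rng.normal(2.69, 0.60, n_per_group)
nars_non_anthro = rng.normal(2.69, 0.60, n_per_group)

# Independent-samples t-test on questionnaire scores
t_stat, p_t = stats.ttest_ind(nars_anthro, nars_non_anthro)

# Mann-Whitney U test for ordinal or non-normal ratings
u_stat, p_u = stats.mannwhitneyu(nars_anthro, nars_non_anthro,
                                 alternative="two-sided")
print(f"t(38) = {t_stat:.2f}, p = {p_t:.3f}; U = {u_stat:.1f}, p = {p_u:.3f}")
```

With both groups drawn from the same distribution, the tests should usually come out non-significant, mirroring the pattern reported above.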
3.2 Manipulation Check
Surprisingly, the overall scores of the Godspeed questionnaire revealed comparable ratings in the anthropomorphic (
M = 2.97,
SD = 0.46) and non-anthropomorphic condition (
M = 2.80,
SD = 0.32),
t(38) = 1.34,
p = .187. Data from the anthropomorphism subscale showed that both robots were perceived as not anthropomorphic (
Manthro = 1.66,
SDanthro = 0.44;
Mnon-anthro = 1.67,
SDnon-anthro = 0.49;
t(38) = −0.06,
p = .947). This was supported by ratings of the human-likeness scale of the revised Godspeed questionnaire [
27]. Again, ratings were rather low and did not differ between the two conditions (
Manthro = 2.12,
SDanthro = 0.70;
Mnon-anthro = 1.76,
SDnon-anthro = 0.63;
t(38) = 1.69,
p = .099). Independent of conditions, the robot was liked (
M = 3.43,
SD = 0.71;
U = 130.5,
p = .057), perceived as being intelligent (
M = 3.37,
SD = 0.78;
t(38) = −0.06,
p = .384) and safe (
M = 3.93,
SD = 0.75;
t(38) = −.06,
p = .683), but animacy ratings were rather low (
M = 2.02,
SD = 0.73;
t(38) = −.06,
p = .184).
3.3 Interdependencies of Trust Attitude Measures
We assessed trust attitude with three different measures: trust propositions with Charalambous et al.’s [
13] trust questionnaire (trust prop.), a single item asking for the perceived reliability of the robot (rel), and a single trust item (trust). For correlational analyses the variables were averaged over t0, t1, t2 and t3. Results revealed significant positive relationships between the three trust attitude measures (
rtrust prop – rel = .564,
p < .001;
rtrust prop – trust = .486,
p = .001;
rrel – trust = .524,
p = .001). These medium-sized correlations indicate that the three measures were related but, as expected, assessed different aspects of trust attitude.
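The analysis above averages each trust measure over the four measurement points before correlating them pairwise. A minimal sketch of this procedure, using simulated placeholder data rather than the study's ratings:

```python
# Hedged sketch of the Section 3.3 correlational analysis: average each trust
# measure over t0-t3, then compute pairwise Pearson correlations between the
# averaged measures. Data are simulated placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 40                                   # sample size implied by df = 38
shared = rng.normal(0, 1, (n, 4))        # common trust component across t0-t3
trust_prop = shared + rng.normal(0, 1, (n, 4))
reliability = shared + rng.normal(0, 1, (n, 4))
trust_item = shared + rng.normal(0, 1, (n, 4))

# Average each measure over the four time points before correlating
tp, rel, tr = (m.mean(axis=1) for m in (trust_prop, reliability, trust_item))

for a, b, label in [(tp, rel, "prop-rel"),
                    (tp, tr, "prop-trust"),
                    (rel, tr, "rel-trust")]:
    r, p = stats.pearsonr(a, b)
    print(f"r_{label} = {r:.3f} (p = {p:.3f})")
```

Because the three simulated measures share a common component, the sketch typically yields medium-sized positive correlations, analogous to the pattern reported above.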
3.4 Trust Development with Positive Interaction Experience
We first compared participants’ ratings given in Charalambous et al.’s [
13] trust questionnaire. Overall ratings revealed only a significant main effect of experience,
F(2, 76) = 25.49,
p < .001, η
p2 = .40. Post hoc tests using Bonferroni correction for multiple comparisons revealed that participants’ trust was significantly lower prior to interaction (
Mt0 = 39.15,
SDt0 = 5.13) than after the experience of faultless interaction at t1 (
M = 43.20,
SD = 5.44;
p < .001) and t2 (
M = 44.22,
SD = 5.17;
p < .001), but not between the latter ones (
p = .313). Anthropomorphic design did not have a significant effect on trust formation (
F < 1). This pattern was mirrored in the results for the subscales robot's speed, safe co-operation, and gripper reliability, which again revealed only significant main effects of experience, with the most pronounced differences between ratings prior to and after interaction (speed:
F(1.3, 50.54) = 19.68,
p < .001, η
p2 = .34; safe co-operation:
F(2, 76) = 11.47,
p < .001, η
p2 = .23; gripper reliability:
F(2, 76) = 9.66,
p < .001, η
p2 = .20).
For the single-item reliability and trust ratings, we decided to substitute the data of one participant in the non-anthropomorphic condition for all analyses based on these variables. Box-plot outlier analyses showed that most ratings of this person deviated substantially from the group means (deviations for single-item trust: t0 more than 3 box-lengths, t1 and t3 more than 1.5 box-lengths from the edge of the box; for single-item reliability: t0, t1 and t3 more than 1.5 box-lengths from the edge of the box). We substituted the corresponding data with the mean minus two standard deviations [23]. We also substituted one data point from each of two participants in the anthropomorphic condition with the group mean. One participant rated subjective trust at t0 as 0, the other rated the robot's reliability at t1 as 0. We assume that this happened accidentally, as all other reliability and trust ratings of both participants were substantially higher (e.g. the corresponding t0 reliability = 84%; the corresponding t1 trust = 71%).
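The outlier handling described above can be sketched as follows: the box-plot rule flags values more than 1.5 box-lengths (interquartile ranges) beyond the quartiles, and flagged values are replaced with the group mean minus two standard deviations. The ratings below are illustrative placeholders, not the study's data.

```python
# Hedged sketch of box-plot outlier detection with mean - 2 SD substitution,
# as described in the text. Ratings are illustrative, not the study's data.
import numpy as np

def substitute_outliers(values, k=1.5):
    """Return (cleaned array, boolean outlier mask) using the box-plot rule."""
    x = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    mask = (x < q1 - k * iqr) | (x > q3 + k * iqr)
    rest = x[~mask]
    # substitute flagged ratings with mean - 2 SD of the remaining data
    x[mask] = rest.mean() - 2 * rest.std(ddof=1)
    return x, mask

ratings = [78, 81, 85, 74, 90, 12]   # 12 deviates far below the group
cleaned, flagged = substitute_outliers(ratings)
```

Here the single extreme rating is flagged and pulled to a value two standard deviations below the mean of the remaining ratings, keeping the case in the sample while limiting its leverage.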
Single item reliability ratings are shown in Figure
5. Participants perceived the anthropomorphic robot to be significantly less reliable (
Manthro = 81.15%,
SDanthro = 16.93%;
Mnon-anthro = 88.87%,
SDnon-anthro = 16.93%;
F(1, 38) = 4.15,
p = .048, η
p2 = .09). No significant results were found for experience (
F(2, 76) = 2.79,
p = .067, η
p2 = .06), nor was there an interaction effect of anthropomorphism and experience (
F < 1).
On a descriptive level, results for the single item trust (Figure
5) pointed in the same direction as the reliability ratings, with lower trust for the anthropomorphic robot. However, this pattern was not statistically supported,
F(1, 38) = 3.69,
p = .062, η
p2 = .08. With ongoing experience, trust in both conditions increased significantly,
F(1.37, 52.16) = 10.37,
p = .001, η
p2 = .21. Post hoc tests using Bonferroni corrections showed that the increase was most apparent between ratings prior to and after interactions (t0 vs. t1:
p = .007; t0 vs. t2:
p = .004).
In sum, findings for trust attitude did not show the expected positive effect of an anthropomorphic robot design. No positive impact was found for either trust propositions or the actual trust level; perceived reliability even revealed a negative impact of anthropomorphism.
Trust behavior was defined as the mean handover time between robot and participant. Longer times were interpreted as lower trust. In both conditions, the data of one participant could not be analyzed because of technical failures in the video recording. The remaining data showed a beneficial effect of interaction experience: after a prolonged time of interaction with the robot, handover times in both conditions declined (MB1 = 2.71s, SDB1 = 0.67s; MB2 = 2.74s, SDB2 = 0.52s; MB3 = 2.54s, SDB3 = 0.46s). This was statistically supported by a significant main effect of experience, F(2, 72) = 4.93, p = .010, ηp2 = .12. Post hoc tests using Bonferroni correction for multiple comparisons located the significant decrease in handover time between the second and third block (p = .005). Whether the robot had a face or not did not affect handover times, nor was there an interaction effect between anthropomorphism and experience (F < 1).
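The block-wise analysis of handover times rests on a repeated-measures ANOVA over the within-subject factor experience. The sketch below computes a hand-rolled one-way version of this test; it deliberately ignores the between-subjects anthropomorphism factor, so its error degrees of freedom differ from the mixed-design values reported above, and the handover times are simulated placeholders.

```python
# Hedged sketch: a one-way repeated-measures ANOVA over the three interaction
# blocks, computed by hand with numpy. Simplification: the between-subjects
# factor (anthropomorphism) is ignored, so error df differ from the mixed
# design reported in the text. Data are simulated placeholders.
import numpy as np

def rm_anova_oneway(data):
    """data: (subjects, conditions) array; returns F, df_effect, df_error."""
    n, k = data.shape
    grand = data.mean()
    ss_cond = n * ((data.mean(axis=0) - grand) ** 2).sum()
    ss_subj = k * ((data.mean(axis=1) - grand) ** 2).sum()
    ss_error = ((data - grand) ** 2).sum() - ss_cond - ss_subj
    df_effect, df_error = k - 1, (n - 1) * (k - 1)
    f = (ss_cond / df_effect) / (ss_error / df_error)
    return f, df_effect, df_error

rng = np.random.default_rng(2)
n = 38                                   # participants with usable recordings
handover = np.column_stack([             # mean handover time per block (s)
    rng.normal(2.71, 0.5, n),
    rng.normal(2.74, 0.5, n),
    rng.normal(2.54, 0.5, n),
])
f_val, df1, df2 = rm_anova_oneway(handover)
```

Removing the subject sum of squares from the error term is what distinguishes this design from a between-subjects ANOVA: each participant serves as their own control, so only the subject-by-condition variability counts as error.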
3.5 Visual Attention Allocation
Data of 39 participants were included in analyses on visual attention allocation. One participant's data were excluded due to technical problems with the recording.
Results regarding the number of fixations for all AOIs are depicted in Figure
6. For the AOIs shelf and participants’ assembly area, results showed an effect of time on task (shelf:
F(1.59, 59.16) = 128.87,
p < .001, η
p2 = .77; assembly area:
F(1.59, 59.01) = 31.02,
p < .001, η
p2 = .45). For the AOI shelf this effect was due to a significant decrease of visual attention to this area in block 3 compared to block 1 (
p < .001) and block 2 (
p < .001) as post hoc tests using Bonferroni correction for multiple comparisons revealed. Regarding participants’ attention to their assembly area, data showed a continuous decrease of attention to this AOI from the first to the last block with significant differences between all blocks (b1 vs. b2, b1 vs. b3:
p < .001; b2 vs. b3:
p = .006). No impact of anthropomorphism was found, nor interaction effects between anthropomorphism and experience (shelf & assembly area:
F < 1; see Figure
6).
However, this changed substantially for the other two AOIs. Participants paid less attention to the handover area with ongoing interaction experience, F(2, 74) = 8.50, p < .001, ηp2 = .18 (b1 vs. b2, p = .002; b1 vs. b3, p = .007; no difference between b2 and b3, p = 1.0). But there was also a large difference in the overall amount of attention allocated to this area. Participants in the anthropomorphic condition looked at this area significantly less often (M = 47.78, SD = 8.44) than participants in the non-anthropomorphic condition (M = 56.26, SD = 13.34), F(1, 37) = 5.55, p = .024, ηp2 = .13.
This was mirrored for the AOI representing the robot's display showing either a face or the logo of the robot's company. In the anthropomorphic face condition participants looked at the display significantly more often (M = 25.85, SD = 18.07) than participants in the non-anthropomorphic condition looked at the display without a face (M = 8.45, SD = 7.45; F(1, 37) = 15.75, p < .001, ηp2 = .29). This effect did not change across blocks (F(1.47, 54.53) = 1.38, p = .256, ηp2 = .03).
Results for the mean fixation time for the AOI shelf again revealed an effect of experience, F(2, 74) = 7.92, p = .001, ηp2 = .17. The mean fixation time decreased with ongoing time on task (Mb1 = 255.13ms, SDb1 = 29.64ms; Mb2 = 254.10ms, SDb2 = 25.66ms; Mb3 = 243.07ms, SDb3 = 28.35ms). Post-hoc tests using Bonferroni correction showed a significant difference in fixation duration between block 1 and block 3 (p = .009). No other effects were found (F < 1).
Participants showed a stable mean fixation duration with regard to the assembly area. On average, a fixation on this area lasted 195.20ms (SD = 24.72ms). Neither anthropomorphism nor interaction experience had an impact (main effect anthropomorphism: F(1, 37) = 2.45, p = .126, ηp2 = .06; main effect interaction experience: F < 1; interaction effect: F < 1).
Regarding the handover AOI, there was an effect of experience, F(2, 74) = 3.46, p = .036, ηp2 = .08. With ongoing time on task, fixation times decreased (Mb1 = 188.82ms, SDb1 = 29.84ms; Mb2 = 177.56ms, SDb2 = 29.89ms; Mb3 = 182.32ms, SDb3 = 27.86ms). Post-hoc tests with Bonferroni correction for multiple comparisons showed that this reduction was most prominent from block 1 to block 2 (p = .020). Anthropomorphism did not have a significant impact (F(1, 37) = 1.78, p = .190, ηp2 = .04), nor was there an interaction effect (F(2, 74) = 1.75, p = .180, ηp2 = .04).
The mean fixation time for the AOI face could not be calculated for all participants because some never looked at the robot's display, producing missing data for this measure. Therefore, the analysis was conducted with n = 14 participants in the non-anthropomorphic and n = 17 participants in the anthropomorphic condition. Fixation duration showed no differences between the two conditions, F(1, 29) = 3.21, p = .083, ηp2 = .10, nor an impact of ongoing positive interaction experience (F < 1) or an interaction effect (F < 1).
3.6 Trust Dissolution with Negative Interaction Experience
The failure experience did have a significant impact on participants’ trust attitude and trust behavior. For trust attitude this was revealed by all measures. Overall ratings of Charalambous et al.’s [
13] trust questionnaire showed a significant effect of failure experience (
Mprefailure = 44.22,
SDprefailure = 5.17;
Mpostfailure = 40.15,
SDpostfailure = 6.06;
F(1, 38) = 35.22,
p < .001, η
p2 = .48). Anthropomorphism had no impact (
F < 1) nor was there an interaction effect (
F < 1). This pattern of results was also found for the subscales, with the strongest effect of failure experience on the reliability subscale (speed:
F(1, 38) = 6.03,
p = .019, η
p2 = .13; safe co-operation:
F(1, 38) = 4.24,
p = .046, η
p2 = .10; gripper reliability:
F(1, 38) = 40.75,
p < .001, η
p2 = .51). No other effects were significant (all
F < 1).
Results for the single items were in line (see Figure
7 for the single items reliability and trust). For the perceived robot reliability as well as for the single trust assessment there was a significant effect of the robot's failure (reliability:
F(1, 38) = 48.57,
p < .001, η
p2 = .56; trust:
F(1, 38) = 26.83,
p < .001, η
p2 = .41). Reliability ratings dropped from 87.03% (
SD = 12.30%) to 74.79% (
SD = 15.56%), accordingly trust ratings declined from 82.65% (
SD = 13.16%) to 73.72% (
SD = 15.76%). No effects of anthropomorphism, nor an interaction effect became evident for the reliability or trust items prior to and after the robot's failure (for reliability: anthropomorphism,
F(1, 38) = 3.91,
p = .055, η
p2 = .09; interaction,
F(1, 38) = 1.06,
p = .310, η
p2 = .02 / for trust: anthropomorphism,
F(1, 38) = 2.39,
p = .130, η
p2 = .05; interaction,
F(1, 38) = 0.83,
p = .367, η
p2 = .02).
Trust behavior was again assessed by handover time. Two participants could not be included in the analyses due to missing recordings. The comparison of pre- and post-failure times revealed a main effect of failure experience (
F(1, 36) = 11.02,
p = .002, η
p2 = .23). Prior to the robot's handover failure, participants spent 2.63s in the handover area (
SD = 0.60s), this time increased to 2.92s (
SD = 0.55s) after the negative experience. This finding was qualified by an interaction effect (
F(1, 38) = 6.35,
p = .016, η
p2 = .15). Whereas the failure experience had hardly any impact on participants in the non-anthropomorphic condition (pre-failure:
Mnon-anthro = 2.75s,
SDnon-anthro = 0.68s; post-failure:
Mnon-anthro = 2.82s,
SDnon-anthro = 0.55s), the failure resulted in an increase in handover time for participants working together with the anthropomorphic robot (pre-failure:
Manthro = 2.51s,
SDanthro = 0.50s; post-failure:
Manthro = 3.02s,
SDanthro = 0.55s; Figure
8).
4 Discussion
The main objective of this study was to examine the effects of anthropomorphic robot design on trust and visual attention in an industrial work setting. On the one hand, we assumed that an anthropomorphic robot design might increase trust in the industrial robot, comparable to findings from social HRI. On the other hand, we expected negative effects of an anthropomorphic robot design, as it might inappropriately alter patterns of visual attention in HRI. Results revealed some surprising findings.
First of all, our manipulation check revealed that both robot designs were perceived comparably on the anthropomorphism, human-likeness, likeability, intelligence and safety scales. Participants rated anthropomorphism, human-likeness and animacy of both robot designs rather low compared to the high ratings of likeability, intelligence and safety. A rather low level of perceived anthropomorphism for both robot designs was not unexpected, because Sawyer is an industrial single-arm robot that does not evoke very human-like associations. Our anthropomorphic manipulation was implemented as a relatively abstract face on a display, consisting of eyes and matching eyebrows that simulated (random) gaze behavior. In general, an anthropomorphic implementation via a face seems to be a valid manipulation, as facial features are known to be a central informational component in human-human interaction that also give cues about the mental state of interaction partners [
28,
70]. We further opted for this relatively subtle form of manipulation as it reflects a realistic implementation of anthropomorphism in industrial HRI. Its practical relevance is demonstrated by the manufacturer Rethink Robotics, as the two face variations are already available in the robot's standard configuration. Nevertheless, we would still have expected distinct perceptions of anthropomorphism in our two conditions. In this regard, and if direct transferability of results to an industrial application is not the main focus of interest, future studies should consider using more numerous and more salient anthropomorphic features, such as complete faces (instead of our abstract version), or vary not only the appearance but also communicative aspects such as facial expressions (especially emotions).
Even though the perception of anthropomorphism did not show the expected distinct effects, the two robot designs affected trust attitudes to some extent and revealed effects on attention allocation. Therefore, beyond our subtle manipulation of anthropomorphism, an alternative explanation for the missing differences might lie in the instruments chosen to assess perceived anthropomorphism. In line with the majority of researchers, we chose the Godspeed questionnaire (over 1,000 citations [
6]) and a subscale of its revised version [
32]. This questionnaire is widely accepted and has been validated in several studies. A distinctive feature, however, is that all validation studies have exclusively used different robots and have not included a comparison of a more versus less anthropomorphic version of the same robot. Accordingly, the Godspeed questionnaire has subsequently been applied in studies comparing different robots, not one robot varied in its degree of anthropomorphism. It is possible that this measure is not sensitive enough for subtler differences in anthropomorphism such as those in our study. First studies, including this one, that manipulated anthropomorphism using the same robot support this idea. In these studies, anthropomorphism was either varied by framing a robot anthropomorphically [
50] or by additionally changing the appearance (same manipulation as in this study, [
58]). Whereas anthropomorphism showed an impact on objective and subjective measures (e.g. donation behavior or trust attitude), again no differences were found with respect to the Godspeed questionnaire. These results suggest that further studies are needed to determine whether our null findings regarding anthropomorphism are due to an unsuccessful manipulation or to a lack of sensitivity of said measurement instrument.
However, even without this evidence for a distinct perception of anthropomorphism, trust and attention were affected by the two different robot designs. These findings are discussed in detail in the following.
In the current study, we assessed aspects of trust attitude with three different variables. As expected, measures were significantly correlated while assessing different aspects of trust. The trust questionnaire by Charalambous and colleagues [
13] measures factors impacting trust formation but not the actual trust level. Therefore, we decided to include a single question directly asking for participants’ trust. We further asked for the perceived overall reliability of the robot with a single item.
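The reported relationship between the three trust-attitude measures can be checked with a simple correlation analysis. The sketch below is a generic illustration using hypothetical ratings (not the study's data); the variable names and values are invented for demonstration only.

```python
import numpy as np

# Hypothetical ratings from 8 participants (illustrative, NOT the study's data):
# a multi-item trust questionnaire mean, a single-item trust rating,
# and a single-item perceived-reliability rating.
questionnaire = np.array([3.2, 4.1, 2.8, 3.9, 4.5, 3.0, 4.2, 3.6])
single_trust  = np.array([3.0, 4.0, 2.5, 4.1, 4.6, 3.1, 4.0, 3.4])
reliability   = np.array([3.5, 4.2, 3.0, 3.8, 4.4, 3.2, 4.1, 3.7])

def pearson_r(x, y):
    """Pearson correlation coefficient between two rating vectors."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

r_q_t = pearson_r(questionnaire, single_trust)
r_q_r = pearson_r(questionnaire, reliability)
print(r_q_t, r_q_r)
```

With related measures, as in this hypothetical example, coefficients close to 1 would be expected; substantially lower values would suggest the measures tap distinct aspects of trust.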
With regard to our first hypothesis, stating that an anthropomorphic robot design leads to higher trust levels, this assumption has to be rejected. The anthropomorphic design had no significant impact on propositions of trust, nor on the actual trust level. Moreover, results for the perceived overall reliability of the robot even showed a negative impact of anthropomorphism: participants consistently perceived the anthropomorphic robot as less reliable than the non-anthropomorphic robot. This difference was already present before interacting with the robot (just from seeing it) and persisted despite an ongoing positive interaction experience, which was identical in both conditions.
Whereas the perceived reliability of the robot was unaffected by positive interactions, this was different for the other two trust attitude measures (trust questionnaire and single-item trust). Trust ratings prior to interacting with the robot were significantly lower than ratings after first successful interaction experiences. Results thereby revealed an adaptive trust formation, based on the visual first impression of the robot, that was adjusted to the actual experience of interaction [
41].
Measures of trust behavior did not differ between conditions, but again revealed the process of trust formation, as participants' time in the handover area decreased with increasing interactions. Apparently, participants started to trust the robot's capability to fulfill the handover task (as revealed by the trust attitude measures), enabling an efficient coordination of their behavior with the robot's actions.
With regard to the impact of anthropomorphism, the observation that differences between the two groups were found only for reliability ratings is instructive as to what might be important for trust formation. The trust questionnaire [
13] primarily focuses on rational aspects of the interaction (e.g. the robot's speed or gripper reliability) and accordingly primes a cognitive evaluation of the robot's trustworthiness. When asked to rate their overall trust in the robot and its reliability, participants, in contrast, might base their judgement not only on hard facts but also on affective aspects as well as contextual factors [
52,
60,
64]. Especially the latter might play an important role in the robot's perceived reliability. Contextual factors might explain the negative perception of the anthropomorphic robot, as it does not seem to fit into an industrial setting in which people normally expect to interact with professional machines rather than human-like robots. Nevertheless, this tendency towards a negative robot perception in terms of trust did not translate into actual behavior in interaction with the robot. This is in line with previous research revealing that trust attitude and trust behavior may diverge [
53,
56]. Our results can thus be understood as support for the importance of assessing trust at multiple levels. Future studies should continue to address both levels of trust and attempt to identify factors that determine when trust attitude guides behavior and when it does not.
Our second hypothesis stated that an anthropomorphic robot design leads to a more forgiving attitude towards the robot. This should result in a less pronounced trust decline after the experience of a robot failure. Surprisingly, neither trust attitude nor trust behavior measures supported this assumption.
First of all, failure experience did have a significant impact on trust attitude, as revealed by all trust attitude measures. Trust behavior was also affected by the negative interaction experience with the robot, resulting in longer handover times after the handover failure. This main effect was further qualified by an interaction of failure experience and anthropomorphism. In the anthropomorphic condition, handover times prior to the robot's failure were shorter compared to the non-anthropomorphic condition. This pre-failure difference might indicate higher behavioral trust after a prolonged period of successful interactions with the anthropomorphic robot. However, the subsequent negative experience impacted handover times more strongly in this condition, so that participants working with the human-like robot spent more time on the handover procedure post-failure than the non-anthropomorphic group. This reveals that the robot's mistake did not affect participants' coordination with and adaptation to the robot's actions in the non-anthropomorphic group, but did alienate participants interacting with the human-like robot to some extent, as they again invested more time into the handover to ensure a safe box transfer. Therefore, in contrast to our hypothesis, anthropomorphism in HRI not only lacked positive effects but even seems to have had a detrimental effect on trust, in this case on trust dissolution [
41]. This negative effect of anthropomorphism on trust dissolution after the experience of a single failure is in line with results by Ahmad et al. [
3] who also reported a negative effect on trust for a robot with low error rates. Whereas Ahmad et al. revealed this negative effect for trust attitude, the current study found it for trust behavior.
To summarize the findings on the impact of non-instrumental anthropomorphic robot design on trust, it seems that robots presented as tools are perceived as more trustworthy in a work-related setting than robots presented as human-like machines. The anthropomorphic robot design did not reveal any beneficial effects, neither with regard to trust attitude nor trust behavior. This finding is in clear contrast to previous research on the effect of anthropomorphism, at least in social HRI [
24,
55,
72]. Therefore, more research is needed to understand these results in detail. One aspect that needs further consideration concerns our non-anthropomorphic condition. In the current study, we realized anthropomorphism by displaying an abstract face and compared this to the presentation of the company's logo. Although the brand logo was visible in both conditions, as it is part of the physical robot design (e.g. the name "Sawyer" on the torso, the brand logo on the robot's arm), the additional emphasis on the logo could have affected results. From brand research, it is known that branding has a positive impact on the perception and evaluation of products (at least for trusted brands [
2,
16,
54,
65]). Based on our study, we cannot differentiate whether anthropomorphism had an explicitly negative impact on the robot's perception or whether the difference resulted from a positive gain in the other condition due to the salient presentation of the brand logo. Therefore, to identify the cause of the observed differences in reliability perceptions, following studies should additionally include a control group with a blank display as a reference.
Although more research is clearly needed on the impact of anthropomorphism in industrial HRI, our results indicate that a purely functional robot design might better correspond to people's expectations of a highly reliable and dependable machine in an industrial domain. Compared to findings from social HRI, this highlights the importance of context as a possible moderating factor shaping the impact of anthropomorphic design on trust in HRI.
In addition to trust as a central variable, the current study also focused on the influence of anthropomorphic design on visual attention. Our last hypothesis stated that human-like attributes of a robot, in our case the robot's face, should change attentional patterns in interaction and lead to an attentional shift towards the robot's face. Eye-tracking data supported this hypothesis. The results make evident that people interacting with the anthropomorphic robot ascribed far more importance to the head-mounted display than participants in the non-anthropomorphic condition throughout the experiment. At the same time, participants in the anthropomorphic group did not look at the handover area as often as the non-anthropomorphic group. These findings are in line with previous research by Bae and Kim [
4], revealing that anthropomorphically designed robots evoke a higher degree of visual attentiveness than robots with a machine-like design. Specifically, our results confirm that the presence of a human-like face encourages humans to direct their gaze towards the robot in a manner comparable to human-human interaction. In interpersonal interactions, such as the handover of an object, people also seek eye contact to coordinate their movements and to predict the actions of their counterpart [
10,
62,
71]. The high number of fixations indicates that even robot eyes that look rather unrealistic and caricatural might lead to a preferred processing as we know it from human-human interaction [
20]. To trigger this highly overlearned mechanism, it seems sufficient to provide prototypical features of a human face, such as eyebrows and black spots.
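The attentional shift described above can be quantified as the proportion of fixations falling on each area of interest (AOI) per condition. The sketch below illustrates one straightforward way to aggregate such eye-tracking data; the fixation logs and AOI labels are hypothetical and do not reproduce the study's coding scheme or results.

```python
from collections import Counter

# Hypothetical fixation logs: one AOI label per detected fixation
# (AOI names are illustrative, not the study's exact coding scheme).
fixations_anthro = ["display", "display", "handover", "display",
                    "arm", "display", "handover", "display"]
fixations_non_anthro = ["handover", "display", "handover", "arm",
                        "handover", "handover", "display", "handover"]

def aoi_proportions(fixations):
    """Share of fixations on each AOI, as a dict of proportions."""
    counts = Counter(fixations)
    total = sum(counts.values())
    return {aoi: n / total for aoi, n in counts.items()}

p_anthro = aoi_proportions(fixations_anthro)
p_non = aoi_proportions(fixations_non_anthro)
# In this invented log, the display attracts a larger share of fixations
# in the anthropomorphic condition while the handover area attracts a
# smaller share -- mirroring the direction of the reported pattern.
print(p_anthro["display"], p_non["display"])
```

In practice, such proportions would be computed per participant and then compared between conditions with an appropriate statistical test.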
An additional aspect that may have contributed to the attentional pattern found in our experiment could be the simple animation of the robotic eyes. The animation might have been more engaging than the non-animated logo display, which could have promoted stimulus-driven, bottom-up information processing. Consequently, one could argue that the results might not only follow from our anthropomorphism manipulation but could also be affected by the saliency of movement. However, results for fixation time did not reveal any differences between the two conditions, which indicates that both robot designs were comparably engaging for participants [
53]. Moreover, the specific pattern of visual distribution also argues for anthropomorphism as the driving force. If the animation had been the main impact factor, we would have expected a comparable reduction of attention across all remaining AOIs in the anthropomorphic condition. Instead, attention was withdrawn only from the handover AOI. This indicates that the changed attentional pattern was due to differences in the sequence of the handover: instead of looking at the robot's arm and the box when preparing to receive the box, participants in the anthropomorphic condition most likely applied a schema from human-human interaction by seeking eye contact to coordinate. Following this interpretation, the results highlight the importance of differentiating between non-instrumental and instrumental anthropomorphic robot design. In the current study, anthropomorphism was implemented as a design element but not as a functional feature: the robotic eyes showed neither mutual nor deictic gaze [
1]. As participants might have expected this functional component of eyes, they might have tried to engage in joint attention with the robot. Therefore, if features like a face are implemented in robot design at all, these features should at least be instrumental, i.e. support the task at hand, to positively affect HRI [
47,
69].
Taken together, the findings for visual attention revealed a distracting effect of non-instrumental dynamic face features of a robot. To further understand these effects and their underlying mechanisms, future work should investigate instrumental and non-instrumental anthropomorphism in HRI. More specifically, studies should compare static face representations (always non-instrumental) with dynamic non-instrumental and dynamic instrumental robot faces to gain further insight into the differential effects of anthropomorphic design in HRI.
With regard to methodological aspects, our study identified several challenges that need to be addressed in future research on anthropomorphism in an industrial context. Considering the industrial domain, subtle manipulations of anthropomorphism are needed, as only these represent realistic conditions and therefore ensure transferability of results. At the same time, these subtle manipulations still have to be proven successful, which should be mirrored in corresponding manipulation checks. This highlights the need for sensitive measures of anthropomorphism, as studies focusing on a work-related context will most probably use variations of industrial robots instead of completely different robots. As long as it is unclear whether existing measures are sufficiently sensitive, it will always be in question whether results are due to an unsuccessful manipulation or to a lack of sensitivity of the measures. This is exactly the case for the reported study.
Moreover, our interpretation of the results for trust attitude is based on mixed findings: only some results regarding our anthropomorphism manipulation reached the conventional level of significance, whereas others just missed it (e.g. p = .062 for single-item trust, p = .055 for perceived reliability after failure experience). This partial lack of statistical support might be due to the relatively small sample size. Although we calculated the sample size beforehand, we might have overestimated the expected effect sizes and therefore underestimated the sample size required for the study design.
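The link between the assumed effect size and the required sample size can be illustrated with the standard normal-approximation formula for a two-group comparison, n per group ≈ 2·((z₁₋α/₂ + z₁₋β)/d)². The sketch below is a generic illustration of this relationship, not a reconstruction of the study's original power analysis; the exact t-test requirement is slightly larger than the approximation.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample
    comparison with standardized effect size d (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value
    z_beta = NormalDist().inv_cdf(power)           # power quantile
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)

# Planning for a large effect (d = 0.8) instead of a medium one (d = 0.5)
# more than halves the planned sample per group:
print(n_per_group(0.8), n_per_group(0.5))
```

This illustrates how an overestimated effect size at the planning stage can leave a study underpowered to detect the effect actually present.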
Therefore, further studies are needed to gain more insight into the impact of a human-like robot design on trust in a work-related setting. Concerning the effects of anthropomorphic design on visual attention, future research is needed to validate our results and to further differentiate between non-instrumental and instrumental anthropomorphic features. In sum, this study was one of the first to investigate anthropomorphic robot features in industrial HRI, and the findings call for further empirical research to determine whether the current results are a first indicator of a serious (negative) impact of non-instrumental anthropomorphic design on trust and attention in an industrial context.