1 Introduction

Nonverbal communication is a key part of fluid, efficient human–human interactions, influencing rapid interpretations of bi-directional communication and intent, and frequently occurring subconsciously [8, 9]. When robots are in human spaces, particularly social spaces, prior work has verified that such nonverbal cues are just as important for fluid coordination [6, 10,11,12,13,14]. Results of these works demonstrate that the use of nonverbal cues in human–robot interaction can increase the safety and predictability of robots, aiding robot goal and state legibility, and playing a role in following social norms. Expressive motion, for example, has been used to increase comfort and legibility in single robots [6, 13]. What has been less explored is whether and how such expressive motion results might extend to multi-robot systems (Fig. 1).

Fig. 1. Examples of a multi-robot group using motion to express two different goals to the human

Mobile multi-robot systems, such as Kiva robots [15], are becoming common. While much work has been done on how these systems can efficiently complete tasks [2, 3, 16], little work has been done combining algorithms for multi-robot coordination with the social expressiveness of multi-robot systems, especially when those robots are in human spaces. This paper presents a novel framework of motion parameters for generating multi-robot expressive motion, called MoTiS, standing for Motion, Time, and Space. MoTiS extends and combines concepts from prior work in single [4, 13, 17] and multi-robot expressive motion [18,19,20].

While prior work indicates that human reactions to robot motions may be amplified in multi-robot systems [21, 22], multi-robot systems are challenging to model: they have high computational complexity and shift between behaving as a coherent group and as individuals. This increased complexity underscores the importance of investigating the expressive, communicatory potential of multi-robot groups. In the hope of creating a framework for more robust, reusable expressive motion, this paper pulls together the prior work in both single and multi-robot expressive motion.

The MoTiS framework consists of six parameters: (1) relative direction, (2) coherence, (3) relative start time, (4) relative speed, (5) proximity, and (6) geometry. The presented work utilized online studies to explore three research questions regarding how each parameter impacted the interpretation of the robot group:

  • How does the motion affect the perceived social attitudes of the robot group?

  • How does the motion affect the perceived relationship between the robot group and the human?

  • How does the motion affect the perceived functional goals of the robot group?

Results of our six substudies—each focusing on variants of one of the six parameters—showed that MoTiS parameters significantly impact the human interpretation of multi-robot motions, impacting perceived robot communication, as well as what was perceived as a group versus disparate robot actors. Relative direction, coherence, relative start time, and geometry (four of the six parameters) all had significant communicatory impact. Relative speed and proximity had a moderate communicatory impact, which we expect to be heightened during future in-person evaluations, consistent with single-robot expressive motion results [13, 23]. We further detail the evaluation of our system and its communicatory potentials in the substudy sections, as well as the final discussion.

Another unique insight from this work was the way in which the architectural floor plan impacts the interpretation of robot expressive communication, such as indicating a desire for the human to enter or stay away. This extends the idea of relative motion to relative-to-space, rather than just relative-to-human or relative-to-each-other. Future researchers can utilize our MoTiS framework and our design insights to more easily and effectively create custom multi-robot expressive motion.

2 Related Work

This section reviews three areas of related work: (1) prior work establishing the motivation for multi-robot expressive motion, (2) prior human–robot interaction (HRI) study results and implementation approaches demonstrating the value and viability of single robot expressive motion, and (3) prior implementation approaches from the domain of generalized multi-robot systems.

2.1 Motivating Multi-robot Expressive Motion

Humans operate in social groups intuitively, e.g., when walking around crowds and co-existing with other humans in common spaces, such as a shared cafeteria or a building lobby. Here we review the prior work regarding how humans perceive multi-robot groups when they are in their common spaces, which is a key part of encoding robot motion communications. Second, we explore how human–human group motion can be modeled, which can inspire methods for creating fluid multi-robot group motion in human spaces.

2.1.1 Human Perception of Multiple Social Agents

A key part of successful human–multi-robot interactions is how humans perceive multi-robot groups, as well as when humans perceive multiple robots as a group. Research has discovered that many factors affect the perception of multi-robot groups. Work by [21] found that people act more favorably towards groups of heterogeneous robots than homogeneous robots. Other work by [22] has shown that groups of minimal robots encourage more direct interaction than single robots. Ref. [24] has also researched how group communication affects interaction with humans, finding that groups of minimal robots that speak only within the group may create a negative feeling, whereas if a group communicates with people outside the group, people are more likely to interact with that group. Ref. [25] confirmed this finding, looking at human–robot group interactions from a social psychology standpoint, specifically looking at in-groups and out-groups [26, 27]. Multiple prior studies were compiled, showing that people react negatively to robots that act as out-groups relative to humans.

Continuing to apply social psychology to human–robot group interaction, [20] proposed a framework for creating theories on HRI in group settings. This framework is comprised of three concepts: (1) entitativity, which refers to how and when people perceive individuals as a social group, (2) cohesion, which refers to how much “togetherness” a group of individuals has, and (3) in-group identification, or when an individual considers themselves part of a group. Ref. [20] gives recommendations on how to measure and evaluate these concepts in group HRI settings.

Similar to the concept of in-group and out-group, recent work has also explored whether robots can create feelings of inclusion and exclusion in humans [28]. A human and two robots played a ball-passing game, with the robots displaying three different behaviors: exclusion, inclusion, and over-inclusion. During the study, the human was passed the ball less than one-third of the time in the exclusion case, exactly one-third of the time in the inclusion case, and more than one-third of the time in the over-inclusion case. Results found the exclusion case caused the human to perceive that the robots were excluding them intentionally.

2.1.2 Modelling Multi-human Interaction

Human–human group motion has been extensively studied and modeled. The social force model presented by [29] illustrates how parameters set for individuals can be used to model pedestrian behavior. The social force model assumes that a person will take the most direct route to their destination, that each individual pedestrian will want to keep some distance between themselves and others, and that this distance is a function of speed. The social force equation used to determine a pedestrian’s path consists of an attractive force towards the goal and a repulsive force away from others. This model was able to reproduce pedestrian-like movement in simulation.
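As a rough illustration of the attractive and repulsive terms described above, the following minimal sketch computes the force on one pedestrian. The function name, the exponential form of the repulsion, and all parameter values (`desired_speed`, `tau`, `a`, `b`) are assumptions chosen for illustration and are not taken from [29]; the speed dependence of the preferred distance is omitted for brevity.

```python
import math


def social_force(pos, vel, goal, others,
                 desired_speed=1.3, tau=0.5, a=2.0, b=0.3):
    """Return the (fx, fy) force on one pedestrian.

    All default parameter values are illustrative assumptions.
    """
    # Attractive (driving) term: relax the current velocity toward the
    # desired speed along the straight line to the goal.
    gx, gy = goal[0] - pos[0], goal[1] - pos[1]
    dist_goal = math.hypot(gx, gy)
    if dist_goal > 0:
        ex, ey = gx / dist_goal, gy / dist_goal
    else:
        ex, ey = 0.0, 0.0
    fx = (desired_speed * ex - vel[0]) / tau
    fy = (desired_speed * ey - vel[1]) / tau

    # Repulsive term: every other pedestrian pushes this one away, with a
    # magnitude that decays exponentially with distance.
    for ox, oy in others:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if d == 0:
            continue  # skip coincident points to avoid division by zero
        mag = a * math.exp(-d / b)
        fx += mag * dx / d
        fy += mag * dy / d
    return fx, fy
```

Integrating this force over small time steps for every pedestrian would yield simulated trajectories; the sketch only shows how the two terms combine at a single instant.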

Expanding on the social force model [29], [30] explored pedestrian flow in various simulated environments, including bottlenecks, intersections, and corridors, with specific focus on how these features can affect evacuation of a space. These simulations showed that the geometry of the space affects how people exit the space, and that additions such as columns can help reduce panic and congestion in crowds. Other work in crowd dynamics in high-stress situations, such as evacuations, has used simulation with real people to test models in a safe and accurate way [31]. These simulations showed that in high-stress situations, space between people is much smaller than when navigating the same space under low-stress conditions.

Smaller subgroups within larger pedestrian groups have also been studied, as people often walk in groups together rather than alone [32]. The movement of pedestrians was analyzed and showed that when there were few people around, groups tend to walk side-by-side, but when the space was more crowded the group formed a V shape pointing towards their walking direction.

More recent work has tried to decipher the “rules” of motion in a group of people to create a simulation [33]. This work used a bottom-up approach that studied humans in motion capture and in simulations of crowds to gather data. Using this data, a simulation was created to show how individuals within a larger group align their motion to those around them.

2.2 Single Robot Expressive Motion

Most work on expressive motion in robotics has focused on single robot systems; therefore, we use this area as a guide for creating multi-robot expressive motion. This subsection examines how prior work has produced generalizable expressive motion for single robots and how expressive motion has been perceived by humans and utilized in human–robot interaction. To understand multi-robot expressive motion, it is necessary to understand the basics of single robot expressive motion.

2.2.1 Frameworks for Expressive Motion

One way to create expressive motion is to create a framework of features or variables that are used to generate expressive motion. These features can be tuned and layered together to create complex expressive motion. This method of generating expressive motion in single robots and agents is highly applicable to multi-robot systems as a framework on its own could be used for any number of robots; however, the works described in this section focus on frameworks created with single robots and agents in mind.

Frameworks have been shown to be effective in creating expressive motion for simple, single moving agents. Ref. [34] explored how physical motions are interpreted emotionally for an interactive device. They used a framework consisting of three parameters: velocity, smoothness, and openness, which were effective in portraying emotions. Similarly, work by [35] also explored how different motion parameters were viewed expressively in a small moving target. The results showed that using only a few parameters, and a very simple motion agent, people still interpreted the motions as emotive. These works emphasize the communicatory power of motion, not only as a functional communication, but as an emotive one.

In robotics, some prior work has used situationally inspired parameters for their frameworks [36,37,38]. However, the most popular choice for a theoretically grounded framework is Laban Movement Analysis (LMA) [39,40,41]. LMA is part of the larger Laban/Bartenieff Movement System [42,43,44], which is an established set of concepts from the field of dance used for detailed movement analysis. It consists of four main categories: Body, Effort, Shape, and Space. Since these concepts were originally created to describe and annotate dance, they are useful tools to adapt into motion features for generating expressive motion.

Reference [45] utilized an LMA-inspired framework to layer emotion on top of the movements of a humanoid robot. The framework included six features: space, time, weight, inclination, height, and area. A pilot experiment was then run to see what features correlated to different emotions. The correlating features were then applied to the robot motions and a study was run showing that overall users could differentiate emotion in basic movements of the humanoid robot.

While [45] utilized Body, Effort, Shape, and Space from LMA, it is most common in robotics to focus on the Effort system for creating expressive motion [18, 46,47,48,49,50]. The Efforts are not directly tied to a physical form or the ability to move in specific ways, which makes them widely applicable to a variety of robot forms. Work by [18] proposed a framework that directly modeled the LMA Efforts of Time, Flow, and Weight to generate expressive motions for a NAO robot. In the framework, the Time Effort was modeled through acceleration, the Weight Effort was modeled through acceleration and velocity, and the Flow Effort was modeled through acceleration, velocity, and movement profile. Similarly, [46, 47] modeled all four of the LMA Efforts to create a framework to generate expressive head motion. Time Effort was shown in velocity, abruptness, and arrival time; Weight Effort was shown in acceleration (as in [18]), vertical compression, and head pitch; Flow Effort was shown through range of motion in tilt, pan, and yaw [46]; and Space Effort was shown through the starting position and the distribution of target head positions. A user study showed that the Efforts were legible in the head motions when compared side by side (for example, sudden vs. sustained).

2.2.2 Human Perception of Single Robot Expressive Motion

Generating different expressive motions in robots is important, but what is especially important in human–robot interactions is how humans interpret the motion of the robot. We cover how trajectory and path, timing, and proxemics have affected people’s perceptions of a robot, as verified in human studies. These prior evaluations of single robot expressive motion emphasize the communicatory value of using expressive motion.

First, we examine how trajectory and heading have been perceived in prior human user studies. Work by [51] showed that the amount of change in the heading of a trajectory affects the perceived emotion of a mobile robot. The LMA Efforts were quantified into measurable motion qualities and combined with expert-generated trajectories to create trajectories that described opposing feelings. Happy and lackadaisical paths had high amounts of change in heading, whereas confident and rushed paths were more direct and did not change heading. Results showed that people were able to recognize the emotive trajectories.

Later work using the same mobile robot platform focused purely on heading in the trajectory exploring the Space Effort of LMA (indirect movement vs direct movement) [4] which showed that a robot moving in a straight line (direct) reads as focused and goal oriented, while a sine path (indirect) reads as hesitant and meandering. Participants also thought that the more direct trajectory meant that the robot had knowledge of its goal earlier than the indirect path. These results confirmed the findings of [51], showing that paths with smaller changes in heading are viewed as more confident. Further work with these mobile robots navigating a hallway also showed that paths that frequently changed heading made people think the robot was confused or broken [52].

Timing is also a key aspect of expressive motion. In mobile robots, [23] explored how the speed of the robot affected how often people interacted with it. The mobile robot moved through the hallways of a university building with a basket of candy, from which people could take a piece. Results found that people were more likely to interact with the robot when it was moving at a slow speed, as the fast speed was perceived as intimidating and busy.

Apart from mobile robots, [53] used a robot arm to explore how speed, changes in speed, and pauses affected how people perceived the motion of the arm. Results showed that overall pausing had the greatest impact on interpretation, with pauses making the robot seem less competent, less confident, less natural, and having a negative disposition. Speed only significantly affected confidence, with high speeds leading people to perceive the robot as more confident.

Finally, we look at how spacing between robots and humans can be utilized to facilitate effective and comfortable human–robot interaction. Ref. [17] examined the connection between robot likability, varying levels of eye contact, and proxemics. Results showed that if a participant rated the robot as unlikable, they gave more space to the robot when it made eye contact than when it averted its gaze. If a participant rated the robot as likable, there was no significant difference in how much space they gave the robot in either eye contact condition. These results show that human-robot proxemics are dependent on many factors, such as eye contact and likability of the robot.

Papadakis et al. [13] explored how spacing between robots and humans could change fluidly from one context to another as a mobile robot navigated down a hallway. This work introduced a probability density function that allowed for social spacing that adapts to context and allows for smooth transitions. The social spacing models showed that people are generally more comfortable with a robot farther away; however, if they give specific cues, such as gesturing that the robot can pass them, their acceptable social spacing may change and allow the robot to come closer to pass.

Proxemics have also been successfully integrated into planning approaches to allow for socially acceptable robot navigation. Ref. [7] turned human navigation rules into cost functions for a heuristic planner for a single robot, which resulted in person-acceptable navigation. This integration of human proxemics allowed the robot to move near humans without getting so close that it would read to an onlooker as uncomfortable. Simulations run using the planner validated that the robot made socially acceptable navigation choices in different contexts. Similarly, [54] used human comfort and social norms to maximize human physical and mental comfort and tested their planner in simulation. They developed two criteria: a safety criterion and a visibility criterion. The safety criterion allows for safe and socially acceptable distances to be kept between the robot and the human, using the human’s position and posture. The visibility criterion allows for maximum visibility of the robot for the human, using the human’s field of view. This work was tested on a real robot and showed promising results, illustrating again that proxemics are context-dependent and that many factors need to be considered to determine what distances make humans most comfortable around robots.

2.3 Generating Multi-robot Expressive Motion

An expansion of single robot expressive motion is multi-robot expressive motion. Concepts from single robot expressive motion can be extended to work with a multi-robot system. This section looks at prior work on generating multi-robot expressive motion, with particular emphasis on works that use a framework approach. This field of multi-robot expressive motion is in its early stages, so we examine six different works that explored generating legible and expressive multi-robot motion. We use these works as inspiration for choosing our own multi-robot expressive motion parameters in Sect. 3.

Reference [55] studied how three characteristics of motion affect legibility in a virtual multi-robot system: (1) the center trajectory of the group, (2) the dispersion of the group, and (3) the stiffness of the group. Users were shown three virtual multi-robot groups in the same space and asked to answer as quickly as they could where they thought each robot group was going. Trajectory and dispersion of a group affected whether or not the user was able to guess where the robot group was going correctly, and stiffness of the group predicted the users’ response time.

In addition to legibility, [19] created artificial emotions based on context to modulate the behavior of a multi-robot group for efficient, expressive motion. For each given emotion, behavior was modulated by changing (1) the field of view of the robot, (2) the optimal speed, (3) the social margins between the robot and other agents, and (4) the stopping distance between the robot and obstacles. This system was shown to be effective in preventing collisions and deadlocks when navigating in simulated human spaces.

Perceived expressivity has also been examined in multi-robot motion. Ref. [56] used spatial and temporal synchronicity, and inter-robot distance, as control parameters in decentralized swarm algorithms to test perceived expressivity of the swarm. These parameters were used in prior work by [57] to explore perceived cohesion and expressivity. A user study was run with small, minimal tabletop robots, in which the participants were experienced dancers and were asked to rate the multi-robot group on perceived organization, cohesion, and expressivity. Results showed that temporally asynchronous groups were perceived as the most expressive and spatially synchronous groups were perceived as the most cohesive, supporting the results of [57]. Using these results, choreographers created emotive multi-robot sequences by tuning the control parameters. A second, online user study evaluated the emotive group sequences; fear and happiness, which both had high synchronicity, were the most recognized.

Similarly, [58] used tunable control parameters for an aerial multi-robot group with the aim of allowing performers to easily control swarms for theatrical productions. The control parameters included: (1) behavior duration, which was the time the robots had to complete the command, (2) formation specification, which included the shape of the multi-robot group and the heading of each robot in the group, (3) motion specification, which included the manner of the robots, for example a “drunk” or “nervous” manner, which informed the individual trajectories, and (4) the action, which informed the group trajectory. This framework was evaluated in simulation and in a six-quadcopter performance, showing that the framework was capable of generating safe, expressive motion in the vein of what the performers wanted; however, no user studies were performed.

Reference [59] took inspiration from emotions, the same fundamental emotions used by [60], to create different shape and size features for expressive swarm behavior. Three types of features were included: (1) shape features, which described the shape of the group formation, (2) size features, referring to the size the robot group was constrained to, and (3) movement features, which informed the movement of the individual agents. This method of generating expressive multi-robot motion was evaluated via simulation video studies and later implemented on small swarm robots. Results showed this method to be an effective way of creating legible group emotions.

3 Multi-robot Expressive Motion Framework: Relative Motion, Timing, and Spacing (MoTiS)

This section introduces MoTiS (short for relative motion, time, and space), a novel system of parameters for multi-robot expressive motion. MoTiS includes six parameters that are adapted to multi-robot systems and take into consideration the challenges of a multi-robot domain. In particular, this framework seeks to extend and coalesce prior concepts from single and multi-robot expressive motion, such as proximity [17, 61] and speed [23], and also introduces the concept of relativity, i.e., defining motion relative to a temporal sequence, another agent, or the floor plan of a space.

The MoTiS parameters can be tuned and layered to create expressive motion, as seen in Fig. 2. The parameters are grouped into three categories: (1) relative motion, which includes relative direction and coherence, (2) relative timing, which includes relative speed and relative start time, and (3) relative spacing, which includes proximity and geometry. This parameter set is a combination of parameters inspired by single robot expressive motion, multi-robot expressive motion, human motion, and ideas novel to this work. Details on the novelty, inspiration, and definition of each parameter are outlined in this section.
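One way to picture this grouping is as a small nested data structure. The sketch below is purely illustrative: the class names, field types, units, and example values are hypothetical choices, and the paper does not prescribe any particular encoding of the six parameters.

```python
from dataclasses import dataclass, fields


@dataclass
class RelativeMotion:
    relative_direction: str  # hypothetical label, e.g. "toward_human"
    coherence: float         # hypothetical scale: 0.0 (independent) to 1.0 (identical)


@dataclass
class RelativeTiming:
    relative_speed: float       # hypothetical: ratio to a reference speed
    relative_start_time: float  # hypothetical: seconds after a reference event


@dataclass
class RelativeSpacing:
    proximity: float  # hypothetical: distance to the reference, in metres
    geometry: str     # hypothetical label, e.g. "line" or "v_shape"


@dataclass
class MoTiSSpec:
    """One layered setting of all six parameters, grouped by category."""
    motion: RelativeMotion
    timing: RelativeTiming
    spacing: RelativeSpacing

    def parameter_names(self):
        # Flatten the three categories into the six parameter names.
        groups = (self.motion, self.timing, self.spacing)
        return [f.name for group in groups for f in fields(group)]
```

Keeping the three categories as separate objects mirrors the idea that parameters can be tuned independently and then layered into a single motion specification.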

Fig. 2. The six multi-robot expressive motion parameters grouped by relative motion, timing, and spacing

3.1 Relative Motion

Relative motion refers to how the group or individual agents move relative to something else; its parameters are features that span both time and space. The two parameters under relative motion are relative direction and coherence, both of which extend prior concepts from single and multi-robot expressive motion.

Relative Direction: Relative direction refers to the direction in which individual robots, or the robot group, move relative to something else; for example, the robot group moves away from a human. Direction of motion is a concept that has been previously used in expressive motion in the form of path trajectory for single robot expressive motion [51] and as individual robot heading in a multi-robot group [58]. Direction is also a key part of functional tasks. For example, to deliver a package, a robot will have to move towards the package recipient.

Coherence: Here, coherence refers to the degree to which the individual robots in the group are doing the same thing. The idea of group coherence has been used in multiple works on generating multi-robot expressive motion [55,56,57], which have specifically looked at general temporal and spatial synchronicity in multi-robot groups. Ref. [20] used a social psychology approach to examine how cohesion in human interactions can be applied to group interactions with robots. The coherence parameter in this work is extended beyond temporal and spatial synchronicity. In this work, coherence can apply to many different aspects of the group motion, for example direction coherence or speed coherence.
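To make direction coherence and speed coherence concrete, the following sketch scores each on a 0 to 1 scale. Both measures are assumptions introduced here for illustration (the mean resultant length of the heading vectors, and one minus the normalized speed range); the paper does not define a numerical coherence metric.

```python
import math


def direction_coherence(headings_rad):
    """Mean resultant length of the heading unit vectors.

    Returns 1.0 when every robot shares a heading and approaches 0.0
    when the headings cancel each other out.
    """
    n = len(headings_rad)
    if n == 0:
        raise ValueError("need at least one heading")
    cx = sum(math.cos(h) for h in headings_rad) / n
    sy = sum(math.sin(h) for h in headings_rad) / n
    return math.hypot(cx, sy)


def speed_coherence(speeds):
    """One minus the speed range divided by the top speed.

    Assumes non-negative speeds. Returns 1.0 when all robots move at the
    same speed and decreases as the speeds spread apart.
    """
    if not speeds:
        raise ValueError("need at least one speed")
    top = max(speeds)
    if top == 0:
        return 1.0  # all robots stationary: trivially the same speed
    return 1.0 - (top - min(speeds)) / top
```

For instance, four robots heading in the same direction score 1.0 on direction coherence, while four robots heading in four perpendicular directions score approximately 0.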

3.2 Relative Timing

Relative timing refers to the timing of movements of individual agents relative to something else. The two parameters under relative timing are relative start time and relative speed. Timing of movements and actions in HRI is not a novel concept, and has been shown to be key to successful interaction in prior work in single robot expressive motion and HRI scenarios [53, 62,63,64,65,66]. The speed parameter extends prior work in single and multi-robot expressive motion, and the start time parameter is novel to this work.

Relative Speed: Speed has been previously studied in social robotics and has been shown to be an effective tool for creating expressive motion in single robot expressive motion [23, 35, 53] and multi-robot expressive motion [19, 59]. This work extends the concept of speed with the “relative” term, which means that speed refers not only to the speed of the multi-robot group as an absolute, but also to its speed relative to something. For example, the multi-robot group can move down a sidewalk faster than the average pace of the pedestrians, rather than just moving “fast.” Here, speed is deliberately distinguished from velocity: velocity comprises both direction and speed, and for the purposes of this work it is separated into relative direction and relative speed.
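The separation of velocity into relative direction and relative speed can be sketched as follows. The function name, the three direction labels, and the 0.5 alignment threshold are hypothetical choices made for this illustration only.

```python
import math


def decompose_velocity(velocity, robot_pos, reference_pos, reference_speed):
    """Split a 2D velocity into (relative_direction, relative_speed).

    relative_direction is a coarse label describing motion with respect
    to a reference point (for example, a human); relative_speed is the
    ratio of the robot's speed to a reference speed (for example, the
    average pedestrian pace).
    """
    if reference_speed <= 0:
        raise ValueError("reference_speed must be positive")
    speed = math.hypot(velocity[0], velocity[1])

    # Cosine of the angle between the velocity and the line to the reference.
    rx = reference_pos[0] - robot_pos[0]
    ry = reference_pos[1] - robot_pos[1]
    dist = math.hypot(rx, ry)
    if speed == 0 or dist == 0:
        alignment = 0.0
    else:
        alignment = (velocity[0] * rx + velocity[1] * ry) / (speed * dist)

    # The 0.5 threshold is an arbitrary assumption for this sketch.
    if alignment > 0.5:
        direction = "toward"
    elif alignment < -0.5:
        direction = "away"
    else:
        direction = "lateral"
    return direction, speed / reference_speed
```

Under these assumptions, a robot moving at 1 m/s straight at a human who walks at 0.5 m/s would be described as moving "toward" the human at a relative speed of 2.0.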

Relative Start Time: Relative start time is when one robot in the group, or the whole robot group, starts moving relative to something else. Similar concepts to start time, such as delays and arrival time, have been previously studied in single robot expressive motion [52]. We hypothesize that it will be an important part of interpreting the motivations of a robot group. For example, if the robot group starts moving after a human starts moving, it may be seen as reactionary to the human, but if the robot group starts moving before a human, onlookers may not assume the robots are moving in relation to the human.

3.3 Relative Spacing

Relative spacing refers to how the agents are spaced as a group, or what shape they are creating as a whole. The two parameters under relative spacing are proximity and geometry. Proximity has been used widely in social navigation [7, 67,68,69,70] and multi-robot expressive motion [19, 56, 57], and has been shown to be important in HRI. Geometry has also been explored in multi-robot expression [58, 59] and in humans [43, 71, 72] and can be effective at conveying emotion.

Proximity: Proximity is the relative distance between two things. The proximity parameter is inspired by work in proxemics [61] that explores the social spacings between humans, and by prior work in social robotics that has used proxemics in social navigation [7, 67,68,69,70]. Proxemics has also been used in generating expressive motion, with work like [19] distinguishing between the inter-agent proxemics and the agent-obstacle proxemics as separate variables for expressive multi-robot motion. Work by [56, 57] also used the distance between robots in a group to create expressive group motion. This work extends proximity to also include the proxemics between the robot group as a whole and humans, rather than individual agents within the group.
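The distinction between distances within the group and the distance between the whole group and a human can be sketched with two small helpers. The function names and the two group-level measures (nearest robot and group centroid) are assumptions for illustration; the paper does not specify how group-to-human proximity is computed.

```python
import math
from itertools import combinations


def inter_robot_distance(robots):
    """Mean pairwise distance between the robots in a group."""
    pairs = list(combinations(robots, 2))
    if not pairs:
        raise ValueError("need at least two robots")
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def group_to_human_proximity(robots, human):
    """Return (nearest, centroid) distances from the group to the human.

    nearest  - distance from the human to the closest robot
    centroid - distance from the human to the centre of the group
    """
    if not robots:
        raise ValueError("need at least one robot")
    nearest = min(math.dist(r, human) for r in robots)
    cx = sum(r[0] for r in robots) / len(robots)
    cy = sum(r[1] for r in robots) / len(robots)
    return nearest, math.dist((cx, cy), human)
```

Reporting both group-level values is a design choice of this sketch: the nearest-robot distance reflects how close any one robot comes to the person, while the centroid distance treats the group as a single entity.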

Geometry: Geometry refers to the shape the robot group is making as a whole, and has been shown to be an effective tool in creating expressive and communicative group motion [58, 59]. The concept of shape in an expressive context has also been studied, and different shapes have been shown to elicit different emotional responses in humans [71, 72]. Geometry also closely maps to the Laban concepts of Shape and Space [43], which are used in dance and choreography, both of which are forms of expressive motion. Ref. [71] summarized many prior works on shape and emotion, highlighting that different shapes are interpreted emotively by people. For example, v-shapes and down pointing triangles can read as threatening.
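As a simple illustration of group shape, the sketch below generates target positions for a line formation and a V-shaped formation. The function names, the coordinate convention (positive y pointing toward the human), and the `depth` parameter are hypothetical and are not taken from the study setup.

```python
def line_formation(n, spacing):
    """Positions of n robots evenly spaced on a horizontal line centred at x = 0."""
    offset = (n - 1) * spacing / 2
    return [(i * spacing - offset, 0.0) for i in range(n)]


def v_formation(n, spacing, depth):
    """Positions of n robots in a V whose point faces the positive y direction.

    Robots farther from the centre line are set back by `depth` for every
    `spacing` of lateral offset. `spacing` must be positive.
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    return [(x, -depth * abs(x) / spacing) for x, _ in line_formation(n, spacing)]
```

With four robots, a spacing of 1.0, and a depth of 1.0, the two centre robots sit 0.5 behind the tip of the V and the two outer robots sit 1.5 behind it.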

4 Research Overview

To evaluate the applicability of these parameters, we ran six independent online studies with videos of interactions between a multi-robot group and a human figurine. Each study focuses on a particular parameter of the MoTiS framework, as seen in Fig. 2. These studies provide insights into the communicative strengths (or weaknesses) of each parameter, adding insight into when they may best be applied and what unanticipated complexities arise. Results showed that group coherence and relative direction play a large role in determining when participants viewed the robots as a group. Results also showed that certain parameters have stronger effects on onlookers, whereas some do not translate to people observing the interaction. Several parameters also showed that different instances of one parameter can greatly vary the interpretation of the robot group.

The overall goal of each study was to explore how people interpreted different instances of one of the multi-robot expressive parameters. A combination of anchored scale questions and extended response questions was used to gain insight into how people viewed the multi-robot group motion. Each study also had a specific goal of exploring how various instances of that parameter were interpreted differently, as seen in Fig. 3. First, we look at the simplest case of relativity, which is relative direction. Second, we look at the most basic parameter exclusive to multi-robot systems: coherence. Following these two studies, we look at parameters for timing, which have been expanded from single robot expressive motion to include multiple robots and relativity. Finally, we look at parameters of space, which have been inspired by human–human motion and multi-robot motion: proximity and geometry.

Study Scenario: There were six substudies corresponding to the parameters in Fig. 2, also summarized in Fig. 3. Across all, a participant watched a video of a human interacting with a group of four Sphero robots. The setting of the videos is depicted in Fig. 4. This simple architectural setup was chosen to define an entryway, as it has clear cognates in human-constructed environments and many possible interpretations. For example, it may appear that the robots are guarding the entry, or welcoming people to cross. We chose a simple scenario and architectural floor plan so that the majority of the communication throughout the studies came from the motion of the robot groups.

Across all substudies, the robots started in a horizontal line centered and evenly spaced around an entry lane, with a human at the far side. The figurine (a Playmobil character with androgynous hair and clothing) begins centered in the entry lane, facing the robots, and is moved forward in the videos using an invisible thread. The human always moves centered in the entry lane in all study videos. The difference between studies is when the human follows the entry lane to come closer to the robots, and what movements the robots perform in anticipation of or in response to the human motion. We hypothesized that these variants would evoke convergent human interpretations of the robot communications relative to particular study variables. For example, perhaps motions that physically stop the human from passing the entry communicate that the human should not enter the building, whereas group motion toward the human away from such architectural cues could be seen as greeting the human.

Fig. 3
figure 3

Six substudies and their independent variables

Fig. 4
figure 4

Video study layout diagram and snapshot from a study video. In both, the floor plan, agent, and human are labeled

Participant Surveys: After viewing the video for any of the substudies, participants answered five anchored scale questions and two open-ended questions. The participant surveys sought to gain insights on how participants interpreted the robot motion through different lenses. For these studies, we did not seek demographic data as no demographic factors were included in our hypotheses. To help ensure quality and representative data, we instead applied worker approval rate and geographic qualifications. All participants were required to have over a 97% approval rate on MTurk and be located in the United States. The questions were split up into two sets so that participants did not have to answer many questions, but answered at least two anchored scale questions about their social interpretation of the robots and their functional interpretation of the robots. Each participant only saw one of the two sets of questions. Splitting the questions into two sets allowed us to explore more questions without overwhelming participants. In each set, participants were given five anchored scale questions, which asked about specific interpretations, and two extended response questions. The purpose of the anchored scale questions was to get quantitative results about participants' specific interpretations, such as whether they found the robots to be threatening or harmless. The purpose of the extended response questions was to gain more insight into participants' choices on the anchored scales and their interpretation of the robots’ motion. For each study, there were 40 participants per study condition, with 20 answering question set 1 and 20 answering question set 2. The question sets are as follows:

Questions Set 1:

  1. Extended Response: What do you think the robot group was trying to do?

  2. Anchored Scale: The interaction between the robot group and the human was [unfriendly / friendly].

  3. Anchored Scale: The robot group was [unwelcoming / welcoming] to the human.

  4. Anchored Scale: The robot group was [avoiding / inviting] the human.

  5. Quality Control: Choose cat for this question.

  6. Anchored Scale: The robot group [did not want / wanted] the human to go past them.

  7. Anchored Scale: The robot group [did not want / wanted] the human to go through entry.

  8. Extended Response: Provide three adjectives to describe the robot group

Questions Set 2:

  1. Extended Response: What do you think is the motivation of the robot group?

  2. Anchored Scale: The interaction between the robot group and the human was [hostile / courteous].

  3. Anchored Scale: The robot group was [threatening / harmless].

  4. Anchored Scale: The robot group [did not want / wanted] the human to join them.

  5. Quality Control: Choose cat for this question.

  6. Anchored Scale: The robot group was [not blocking / blocking] the human.

  7. Anchored Scale: The robot group [did not want / wanted] the human to go through entry.

  8. Extended Response: Provide three adjectives to describe the robot group

Each set consists of two extended response questions, five anchored scale statements, and one quality control question. The anchored scale statements had seven choices: three negative descriptors, one neutral descriptor, and three positive descriptors. For example, the statement “The robots were [unfriendly / friendly]” would have the following options: very unfriendly, unfriendly, somewhat unfriendly, neither friendly or unfriendly, somewhat friendly, friendly, and very friendly. In the anchored scale statements, participants chose the option that best completed the sentence from a drop down menu. This drop down menu allowed the participants to see their choice in the complete sentence. The quality control question was also answered with a drop down menu. Extended response questions had a text box in which participants typed out their response.
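The seven-option scale described above can be sketched as a small lookup from option label to a numeric score. The numeric range from -3 to 3 is the one used for the plots in the Analysis Methods paragraph; the function name `scale_options` and the exact label templates are assumptions made for illustration, not the survey's implementation.

```python
def scale_options(negative, positive):
    """Map the seven option labels to scores from -3 (most negative) to +3.

    `negative` and `positive` are the two anchors, e.g. "unfriendly" and
    "friendly". The label wording follows the example given in the text.
    """
    labels = [
        f"very {negative}",
        negative,
        f"somewhat {negative}",
        f"neither {positive} or {negative}",
        f"somewhat {positive}",
        positive,
        f"very {positive}",
    ]
    # Pair each label with its score: -3, -2, -1, 0, 1, 2, 3.
    return {label: score for label, score in zip(labels, range(-3, 4))}


codes = scale_options("unfriendly", "friendly")
# A drop-down answer can then be converted before analysis:
score = codes["somewhat friendly"]
```

Coding the neutral option as 0 keeps negative descriptors below zero and positive descriptors above it, which matches how the responses are plotted.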

The anchored scale questions also fall into three categories used for analysis: (1) social attributions, (2) relational attributions, and (3) functional interpretation. The questions can be seen sorted into these categories in Table 1. The social attribution questions concern how people socially interpret the robot movements and the interaction between the robots and the human. The relational attribution questions explore whether or not participants thought the robots were including or excluding the human. The functional interpretation questions explore what participants thought the robots were functionally trying to do.

Table 1 The anchored scale questions from the parameter validation studies sorted by question type

Analysis Methods: Across all studies, the anchored scale data were generally not normally distributed and therefore required non-parametric testing. To determine significance across all conditions, the Kruskal–Wallis test is used. If significance is found using the Kruskal–Wallis test, then pairwise tests are done with the Mann–Whitney U test. For both tests, significance is determined by a p value less than 0.05. Results for the anchored scale questions are plotted showing the responses on a seven point anchored scale from \(-\,3\) (very [negative descriptor]) to 3 (very [positive descriptor]). The descriptors are in bold in the subcaptions. The plots show the median (denoted by a black horizontal line), 25% quartile (denoted by a colored box), 75% quartiles (denoted by T-bars extending from the colored box), and outliers (denoted by diamond markers) by study condition. Significance is shown as * for p < 0.05, ** for p < 0.01, *** for p < 0.001. The extended response questions were analyzed using social psychology grounded coding methods [73].
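The two-stage procedure described above (an omnibus Kruskal–Wallis test, then pairwise Mann–Whitney U tests only if the omnibus test is significant) can be sketched in plain Python. The paper does not state which software or which variant of each test was used, so this sketch assumes the common asymptotic forms: a tie-corrected chi-square approximation for Kruskal–Wallis and a tie-corrected normal approximation without continuity correction for Mann–Whitney. All function names are illustrative, and no multiple-comparison correction is applied because none is described.

```python
import math
from itertools import combinations


def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def tie_term(values):
    """Sum of t^3 - t over every group of t tied values."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return sum(t**3 - t for t in counts.values())


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        return math.exp(-half) * sum(
            half**i / math.factorial(i) for i in range(df // 2))
    tail = sum(half**(i + 0.5) / math.gamma(i + 1.5) for i in range(df // 2))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * tail


def kruskal_wallis(groups):
    """Return (H, p) using the tie-corrected chi-square approximation."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    correction = 1 - tie_term(pooled) / (n**3 - n)
    if correction == 0:  # every response identical: no evidence of a difference
        return 0.0, 1.0
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum**2 / len(g)
        start += len(g)
    h = (12 / (n * (n + 1)) * total - 3 * (n + 1)) / correction
    return h, chi2_sf(h, len(groups) - 1)


def mann_whitney_u(a, b):
    """Return (U, two-sided p) using the tie-corrected normal approximation."""
    n1, n2 = len(a), len(b)
    pooled = list(a) + list(b)
    n = n1 + n2
    ranks = average_ranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term(pooled) / (n * (n - 1)))
    if var == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(var)
    return u, math.erfc(abs(z) / math.sqrt(2))


def stars(p):
    """Significance markers as used in the plots."""
    return "***" if p < 0.001 else "**" if p < 0.01 else "*" if p < 0.05 else ""


def analyze(responses, alpha=0.05):
    """responses: {condition name: list of scores on the -3..3 scale}."""
    names = list(responses)
    h, p = kruskal_wallis([responses[c] for c in names])
    pairwise = {}
    if p < alpha:  # pairwise tests only after a significant omnibus test
        for c1, c2 in combinations(names, 2):
            _, p_pair = mann_whitney_u(responses[c1], responses[c2])
            pairwise[(c1, c2)] = (p_pair, stars(p_pair))
    return h, p, pairwise
```

In practice a statistics library would normally be used instead of hand-written tests; for example, SciPy provides `scipy.stats.kruskal` and `scipy.stats.mannwhitneyu`, whose defaults may differ slightly from the approximations assumed here.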

5 Relative Motion Study and Results

The relative motion studies encompass the parameters that span both space and time: (1) relative direction (N = 160) and (2) coherence (N = 160). Both studies had significant results. In relative direction, the moving towards conditions were almost always seen as aggressive and blocking, whereas moving away was often considered non-confrontational. Coherence impacted whether participants viewed the robots as one large group or as multiple smaller groups.

5.1 Relative Direction Study

Relative direction represents the robots’ direction of motion relative to a person or object in the scene. For example, as seen in Fig. 4, the robots move radially towards or away from a center point, which is either the human or the entry. The goal of this study is to explore the ways in which the object of the relative motion, e.g., human/floor plan/object, impacts the social interpretation of the pathway.

Sequence of Events: First, the human begins moving down the entry lane toward the entry. Shortly after the human begins, the robot group moves. All robots move in the same relative direction. The human continues forward until the entry is reached, unless blocked by the robots. Both the robots and the human move at a medium speed in all conditions. The robot group relative direction depends on the study conditions.

Conditions and hypotheses: The robot response includes the following conditions: (1) moving away from the entry, abbreviated as aw_en, (2) moving toward the entry lane, abbreviated as t_en, (3) moving away from the human, abbreviated as aw_hu, and (4) moving towards the human, abbreviated as t_hu, as seen in Table 2 and Fig. 5. We chose two extremes so that we could evaluate if relative direction significantly changed participants’ perceptions of the robot group in very opposing cases. Our hypotheses are as follows:

  • H1-RDIR: Relative direction will most impact functional attributions.

  • H2-RDIR: Relative direction will least impact social attributions.

  • H3-RDIR: Moving away will have different meaning from moving towards.

  • H4-RDIR: Moving relative to the human will have different meaning from moving relative to the entry.

Table 2 Relative direction study conditions
Fig. 5
figure 5

The different relative directions of the Spheros for the Relative Direction study
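As a rough illustration of the four conditions, each robot's velocity can be computed as a unit vector pointing towards or away from the chosen reference object (the human or the entry). This is a minimal sketch under stated assumptions: the coordinates, the `speed` parameter, and the function name `relative_velocity` are hypothetical and are not taken from the study's robot controller.

```python
import math

# Condition abbreviations from the text: (reference object, direction).
CONDITIONS = {
    "aw_en": ("entry", "away"),
    "t_en": ("entry", "towards"),
    "aw_hu": ("human", "away"),
    "t_hu": ("human", "towards"),
}


def relative_velocity(robot_xy, target_xy, direction, speed=1.0):
    """Velocity of one robot moving radially towards or away from a target."""
    dx = target_xy[0] - robot_xy[0]
    dy = target_xy[1] - robot_xy[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        return (0.0, 0.0)  # robot already at the target: direction undefined
    sign = {"towards": 1.0, "away": -1.0}[direction]
    return (sign * speed * dx / dist, sign * speed * dy / dist)


def group_velocities(robots, targets, condition, speed=1.0):
    """All robots share the same relative direction within a condition."""
    target_name, direction = CONDITIONS[condition]
    return [relative_velocity(r, targets[target_name], direction, speed)
            for r in robots]


# Hypothetical layout: four robots in a line, human and entry on the lane axis.
robots = [(-1.5, 0.0), (-0.5, 0.0), (0.5, 0.0), (1.5, 0.0)]
targets = {"human": (0.0, 4.0), "entry": (0.0, 1.0)}
velocities = group_velocities(robots, targets, "aw_hu")
```

Because every robot uses the same reference object and the same sign, the group moves coherently even though the individual headings differ.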

5.1.1 Relative Direction Results

Across all questions, relative direction (towards vs away) had a larger impact on how participants viewed the robot group than object of relative direction (human vs entry). The moving away conditions were viewed more positively than the moving towards conditions. The primary view of the towards conditions was that the robots were confronting and blocking the human. Full numerical results can be seen in Table 3 and are reported in detail in this section.

Table 3 Numerical results of relative direction study

Within the social attributions question set, [threatening/harmless] and [hostile/courteous] questions had significant results. In the [threatening/harmless] question, both moving away conditions were deemed harmless, and towards the entry and towards the human were deemed threatening, as seen in Fig. 6a. Both away conditions were very significantly different from the towards conditions. Similar results were seen in the [hostile/courteous] question, in which both away conditions were very significantly different from the towards conditions, as seen in Fig. 6b.

The only question with significant results in the relational attributions set was [avoiding/inviting]. The robots moving away from the human was rated highly significantly more avoiding than moving towards the human or the entry, and moving away from the entry was rated significantly more avoiding than moving towards the human, as seen in Fig. 7. The robots moving towards the human was viewed more positively than moving towards the entry for [avoiding/inviting] and [unwelcoming/welcoming]. These results, though not significant, may highlight the importance of choosing the object the robots are moving relative to (in this case, the person or the floor plan) depending on the task they are doing. Moving towards the human may have been viewed more positively because inviting and welcoming are human-centric activities.

Results across all three functional interpretation questions were highly significant between all moving away conditions and all moving towards conditions. The moving away conditions were interpreted as the robots allowing the human to enter, whereas the moving towards conditions were interpreted as the robots not wanting the human to enter, as seen in Fig. 8. However, the object of motion (human vs entry) did not significantly impact any of the three functional interpretation questions within the moving towards and moving away conditions.

Fig. 6
figure 6

Survey responses to social attribution anchored scale questions for the relative direction videos

Fig. 7
figure 7

Survey responses to “The robot group was avoiding/inviting the human.”

The extended response questions provide some insight into why the moving away conditions may have been generally viewed more positively. 20% of participants stated that they thought the robots had a positive relationship to the human. For example, one participant thought the robots were “getting out of the human’s way and allowing the human to reach the gate unimpeded” and were “accommodating, inviting, and friendly.” However, the moving away conditions were not universally viewed as positive. There were more positive responses than for the moving towards conditions, but the most common responses for the moving away conditions were that the robots were avoiding the human or afraid, and 40% said the robots had a negative relationship with the human. For example, one participant said the robots were trying “to avoid blocking the path of the human” and were “scared [and] worried.”

Generally, the responses to the moving towards conditions were more negative than those to moving away. 50% of the responses for the moving towards conditions described the robots as having a negative relationship with the human and only 10% described the relationship as positive. Participants also thought the robots were protective and blocking the human from passing. For example, one participant said that “the primary motivation of the robot group was to block the human’s access to the entry. They did not want the human to proceed further so they blocked its path” and that the robots were “rude, harsh, [and] mean.” Similar responses were common for moving towards conditions, with over 25% of participants describing the robots as aggressive.

Fig. 8
figure 8

Survey responses to functional interpretation anchored scale questions for the relative direction videos

5.2 Relative Direction Discussion

Relative direction had the greatest impact on both functional attributions and social attributions, meaning relative direction highly impacted how participants viewed what the robots’ functional goals were, and whether they had a positive or negative social attitude. These results support the findings of [51], that the heading of a robot impacts an onlooker’s view of the robot’s final goal. Our results show that this is also true in a multi-robot setting when all the robots are moving in the same relative direction. These results support hypothesis H1-RDIR, that relative direction will most impact functional attributions, and contradict hypothesis H2-RDIR, that relative direction will least impact social attributions. The high impact on functional attributions is likely due to the physical aspect of many of the functional questions; there is less room for interpretation when the robots are physically stopping the human from passing. The social attribution questions that got the most significant results had the strongest negative descriptors, i.e., threatening and hostile, whereas the question with no significant difference between the conditions used a softer negative descriptor, i.e., unfriendly. Friendly and unfriendly motion is often heavily context dependent. For example, moving towards a human may be seen as friendly when the group of robots is taking the human on a guided tour, whereas moving towards a human may be seen as unfriendly when the robots are chasing someone away from a restricted area. Without knowing the context it can be difficult to distinguish between the two, whereas [threatening/harmless] and [hostile/courteous] may be less ambiguous in the relative direction conditions. Along with relative direction, final ending position or geometry may have to be considered in future studies to see what impact it has on people’s interpretations, as it is possible to move towards the entry or human without physically blocking them.

We found that moving away from either the entry or human was viewed positively, both in functional and social attributions. Moving towards either the entry or human was viewed negatively, both in functional and social attributions. These results are novel relative to prior work and advance our understanding of how relative direction can impact how people perceive the attitudes and goals of groups of robots. These results support hypothesis H3-RDIR, that moving away from the entry or human will have different interpretations from moving towards the entry or human. These results are likely impacted by the towards conditions physically stopping the human from proceeding. For example, in hostile and courteous, the robots may be perceived as courteous when they allow the human to pass through the entry, and hostile when they stop the human. Similarly with threatening and harmless, it is easy to see how stopping the human from entering the opening can be seen as threatening even without additional context.

Table 4 Coherence study conditions

When exploring the impact of the object of direction (entry or human), we found that moving relative to the human was generally viewed more positively, but was also more ambiguous than moving relative to the entry. These results are novel relative to prior work and advance our knowledge of how object of motion can change how people view a multi-robot group. However, these results were not significant, and hypothesis H4-RDIR, that moving relative to the human will have a different meaning from moving relative to the entry, was not supported. These results, though not significant, may highlight the importance of choosing the object the robots are moving relative to (in this case, the person or the architecture) depending on the task they are doing. Moving towards the human may have been viewed more positively because the focus of the motion is the human, and inviting and welcoming are human-centric activities. Moving away from the entry was more ambiguous than moving away from the human, which may be because the object of motion is the human and the relational attribution questions are human-centric.

5.3 Coherence Study

Coherence is a sub-concept of relative motion, which represents whether the robots’ vector directions relative to the person are the same or different. Each robot was assigned a number 1 through 4 (leftmost 1, rightmost 4) and given a relative direction. Similar to the prior section, each robot had the ability to move toward the person, move away from the person, or not move. Two of our directional concepts are drawn from the previous study, as seen in Fig. 4; we also add an option to not move. The goal of this study is to explore the ways in which the coherence of the group impacts the social interpretation of the group.

Sequence of Events: First, the human begins moving down the entry lane toward the entry. Shortly after the human begins, the robot group moves, but one or two robots move in a different direction. The human continues forward until the entry is reached, unless blocked by the robots. Both the robots and the human move at a medium speed throughout all conditions. The robot group motion depends on the study conditions.

Conditions & Hypotheses: The robot response includes the following conditions, as seen in Table 4: (1) three robots moving towards the human and one outlier staying still, (2) three robots moving towards the human and one outlier moving away, (3) two robots moving towards the human and two outliers staying still, (4) two robots moving towards the human and two outliers moving away. Our hypotheses are as follows:

  • H1-CO: Coherence will have the least impact on the functional interpretation.

  • H2-CO: Outliers moving away will read more negatively than outliers staying still.

  • H3-CO: Conditions with two outliers will read more negatively than conditions with one outlier.
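The four coherence conditions can be written down as per-robot direction assignments. In this sketch, which robots act as the outliers is an assumption (the text only gives the number of outliers and their motion), and `majority_share` is an illustrative summary rather than a measure used in the study.

```python
from collections import defaultdict

# Robots are numbered 1-4 from left to right, as in the text.
# ASSUMPTION: the outliers are placed at the highest-numbered robots.
CONDITIONS = {
    "three_towards_one_still": {1: "towards", 2: "towards", 3: "towards", 4: "still"},
    "three_towards_one_away": {1: "towards", 2: "towards", 3: "towards", 4: "away"},
    "two_towards_two_still": {1: "towards", 2: "towards", 3: "still", 4: "still"},
    "two_towards_two_away": {1: "towards", 2: "towards", 3: "away", 4: "away"},
}


def subgroups(assignment):
    """Group robot numbers by the direction they were assigned."""
    groups = defaultdict(list)
    for robot, direction in sorted(assignment.items()):
        groups[direction].append(robot)
    return dict(groups)


def majority_share(assignment):
    """Fraction of robots that share the most common direction."""
    groups = subgroups(assignment)
    return max(len(members) for members in groups.values()) / len(assignment)


for name, assignment in CONDITIONS.items():
    print(name, subgroups(assignment), majority_share(assignment))
```

A fully coherent group, as in the relative direction study, would have a majority share of 1.0; the one-outlier and two-outlier conditions here give 0.75 and 0.5.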

Fig. 9
figure 9

Survey responses to “The interaction between the robot group and the human was hostile/courteous.”

Fig. 10
figure 10

Survey responses to the relational attribution anchored scale questions for the coherence videos

5.3.1 Coherence Results

Within the social interpretation questions, coherence only significantly impacted the interpretation of the robot motions as [hostile/courteous], as seen in Fig. 9. Coherence had a significant impact on whether the robots seemed to be avoiding or inviting the human, and on whether the robots seemed to want or not want the human to join their group. However, there was no significant impact on whether the robots seemed unwelcoming or welcoming. Results can be seen in Fig. 10. Coherence did not impact the functional interpretation. No functional interpretation questions had significant results, which does not support all functional interpretation hypotheses. Full numerical results can be seen in Table 5.

5.3.2 Coherence Discussion

One unexpected theme that arose from the extended response questions was that many participants treated the outliers as their own separate robot group, giving the outliers different attributes and motivations than the robots moving towards the human. This phenomenon happened across all four conditions. The motivations of the outliers and robots moving towards the human were determined by the relative direction of the robots. Essentially, coherence predicted sub-group membership; within the subgroups, the other parameters predicted how people interpreted the motion. These results are novel to this work. Prior work [55,56,57] has examined how coherence affects whether a multi-robot group is perceived as expressive, but not how coherence affects that expression and defines the interpretation of a multi-robot group.

Coherence had the biggest impact on how participants viewed the relationship between the robots and the human and the least impact on functional interpretation, which supports hypothesis H1-CO, that coherence will have the least impact on the functional interpretation. These results may be due to how participants grouped the robots. In the conditions where there was one outlier, participants seemed to think that the three robots moving towards the human were carrying out a task as normal, and that the outlier had decided not to participate for some reason. For example, one participant stated of the condition with one outlier moving away and three robots moving towards the human that “the robots seemed to distract the human, while one robot went and hid,” and another said the “robot group appears to be to meet someone new but not all of them wanted to.” Similarly for two outliers, some participants thought that the two robots moving toward the human were completing a task and the two outliers decided not to join. For example, one participant said of the condition where two outliers stay still that “of the two robot groups it appears only one group of robots had an intention to stop or interfere with the human heading for the gate.” Another participant said of the condition when two outliers move away that “half [of the robots] went to greet the human while the other set seemed to be afraid.” These interpretations describe different relationships to the human depending on the number of outliers and their motion. These interpretations may also partially explain why, for functional interpretation, the results across all conditions were fairly neutral and had sizable variance. If participants saw the robots as having multiple feelings about the human and different motivations, they may not have known which of the robots to answer the question for. In fact, some participants explicitly said that they were confused that all the robots did not act as a group and that they were unsure of what was going on.

Table 5 Numerical results of coherence study

Outliers moving away from the human read as more negative than the conditions in which the outliers stay still, even though there were still robots moving towards the human, which supports hypothesis H2-CO. This result is likely due to the outlier motion. Moving away reads as more actively avoidant and unwelcoming than staying in the same place. Answers from the extended response also back up this reasoning, as participants tended to interpret the outliers moving away as avoiding the human, or being afraid of it. For example, one participant stated of the condition with two outliers moving away that “one group was trying to get away. The other group was trying to go interact with the human.” This interpretation aligns with the results of the relative direction study, in which moving away from the human was interpreted as avoidant.

This trend of the outliers moving away reading as more negative and the outliers staying still reading as neutral or positive was the case in most questions, but did not hold for the robots [did not want/ wanted] the human to join their group. Here, one outlier staying still read as significantly more positive than two outliers staying still. This result may be because in the case of one outlier staying still, there are three robots left to go towards the human, whereas in the case of two outliers, only two robots are left to move towards the human. The greater number of robots moving towards the human may have sent a clearer signal of the robots wanting the human to join since the majority of the robot group did approach the human.

There was no significant difference overall between conditions with one outlier and conditions with two outliers, which does not support hypothesis H3-CO, that conditions with two outliers will read more negatively than conditions with one outlier. However, the amount of coherence appeared to affect how people determined sub-group membership. In the conditions where there was one outlier, participants seemed to think that the three robots moving towards the human were carrying out a task as normal, and that the outlier had decided not to participate for some reason. For example, one participant stated of the condition with one outlier moving away and three robots moving towards the human that “the robots seemed to distract the human, while one robot went and hid,” and another said the “robot group appears to be to meet someone new but not all of them wanted to.” These results imply that a single outlier was not enough for the robots to be considered two fully independent sub-groups. Participants still described the robots’ actions as those of one group, but with different actions within that group.

6 Relative Timing Study and Results

This section covers the studies for the parameters grouped under relative timing: (1) relative start time (N = 120) and (2) relative speed (N = 160). The studies showed that when the robots and human started at the same time, participants viewed this condition as significantly more negative than when the robots and human started at different times. Surprisingly, relative speed had no significant effects on how participants viewed the multi-robot group.

6.1 Relative Start Time Study

Start time is a sub-concept of relative timing, which represents when the robots begin moving relative to a person or object in the scene. The goal of this study is to explore the ways in which the relative start time between the multi-robot group and the human impacts the social interpretation of the motion.

Sequence of Events: Depending on the study condition, the human moves first, the robots move first, or they begin at the same time. Once moving, the human proceeds down the entry lane toward the entry. The robot group moves towards the human. Both the robots and the human stop when they meet. The robot and human group speed is medium for all conditions.

Conditions and Hypotheses: The robot and human responses include the following conditions: (1) the robots start moving first, (2) the robots and the human start moving at the same time, and (3) the human starts moving first. Our hypotheses are as follows:

  • H1-RST: Relative start time will have the biggest impact on relational attributes.

  • H2-RST: Relative start time will have the least impact on functional attributions.

  • H3-RST: The robots starting before the human will read as most positive.

6.1.1 Relative Start Time Results

The main trend of the relative start time results was that the robots and human starting together was almost always viewed the most negatively, with little difference between the robots starting first and the human starting first. Full numerical results can be seen in Table 6 and are reported in detail in this section.

Table 6 Numerical results of relative start time study

All three social attribution questions followed the same trend: the robots and the human starting simultaneously was viewed most negatively, and the human starting first was viewed most positively. However, these differences were only significant for [unfriendly/friendly], as seen in Fig. 11.

In relational attributions, both the robots starting first and the human starting first were viewed as inviting the human, with the human starting first being significantly more inviting than the robots and human starting at the same time, as seen in Fig. 12. Similarly, both the robots starting first and the human starting first read as welcoming, with both conditions being significantly more welcoming than the robots and the human starting together, which was viewed as unwelcoming. Results were not significant for “the robot group [did not want/wanted] the human to join them.”

The results for functional interpretations follow the same trends as social and relational attributions, with both the robots starting first and the human starting first being viewed significantly more positively than the robots and human starting at the same time across all three questions, as seen in Fig. 13.

These quantitative results are supported by the extended response answers. The conditions in which the robots start first and the condition in which the human starts first had the most responses describe a positive relationship between the robots and the humans, 25% of responses and 35% of responses respectively. The condition with robots and humans starting together had over 50% of responses describe a negative relationship or the robot as blocking the human.

Fig. 11
figure 11

Survey responses to “The interaction between the robot group and the human was unfriendly/friendly.”

Fig. 12
figure 12

Survey responses to the relational attribution anchored scale questions for the relative start time videos

6.2 Relative Start Time Discussion

Relative start time had the biggest impact on what participants thought the robots were functionally doing and the least impact on how people perceived the robots’ social attributions. These results do not support either hypothesis H1-RST, that relative start time will have the biggest impact on relational attributes, or H2-RST, that relative start time will have the least impact on functional attributions. These results imply that start time may affect what people think the goals of a robot group are. The extended responses showed that when the human started first, many participants read this as the robots moving to greet the human, whereas in the condition in which the robots and the human start together, participants overwhelmingly thought that the robots were attempting to block the human. These results are novel to this work and advance our understanding of how start time can affect the interpretation of group motion.

Both the robots starting before the human and the robots starting after the human read positively, with no significant difference between these two conditions. This result somewhat supports hypothesis H3-RST, that the robots starting before the human will read as most positive. This insight is novel to this work. These two conditions in which the robots or the human start first may have read more positively for a similar reason: it appears that one group is reacting to the other, which may be easier to see as positive than starting at the same time. For example, in real life a group of people may be talking amongst themselves, but when they see a friend enter the area, they move over to talk to them. Similarly, if a person is in a space, and sees their friends move into it, they may go to meet them. These interpretations are supported by the extended response results, in which 40% of participants explicitly said the robots were going to greet the human in the case where the human starts first, and 20% said the same in the case where the robots start first.

Fig. 13
figure 13

Survey responses to the functional interpretation anchored scale questions for relative start time

The condition when the robots and the human start at the same time was seen as the most negative. This insight is novel to this work. This negative response may be due to this condition coming across as somewhat aggressive. In the extended response answers, participants rated the condition when the human and robots start together as aggressive almost twice as often as they labeled the other two conditions aggressive. One possible reason for this condition reading as so aggressive is that it does not appear that either the robots or the human are reacting to the other. In real life, when two people or two groups of people both start coming towards each other at the same time it is often in an aggressive manner, such as two teams starting a sports game, or two people beginning a fight.

6.3 Relative Speed Study

Relative speed is a sub-concept of relative timing, which represents the robots’ speed relative to a person or object in the scene. The goal of this study is to explore how the speed of the robot group relative to the human impacts the interpretation of the robots’ motion.

Sequence of Events: First, the human begins moving down the entry lane toward the entry. The robot group then begins moving towards the human. The robots and the human stop when they meet. Both robot and human speeds depend on the study condition. Once the human and the robot group accelerated to the desired speed, the speeds were held constant until they had to decelerate to stop. Acceleration and deceleration periods were as short as possible to get the longest time period of consistent speed.

Conditions & Hypotheses: The robot and human responses include the following conditions: (1) the robots and human moving fast, (2) the robots moving fast and the human moving slow, (3) the robots moving slow and the human moving fast, and (4) the robots and human moving slow, as seen in Table 7 and Fig. 14. Our hypotheses are as follows:

  • H1-RSP: Relative speed will have the highest impact on social attributions.

  • H2-RSP: Relative speed will have the least impact on functional interpretation.

  • H3-RSP: The human’s speed will have a more significant impact on relational attributions than on social attributions or functional interpretation.

  • H4-RSP: Conditions in which the robots are moving fast will be viewed as less neutral than the conditions in which the robots are moving slow.
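The four conditions above form a 2 × 2 design over robot speed and human speed. As a minimal sketch of that design, the following enumerates the conditions in the order listed in the text. The names `Speed`, `SpeedCondition`, and `relative_speed_conditions` are hypothetical and are not taken from the study materials; no actual speed values are assumed.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Speed(Enum):
    """Qualitative speed levels; the study's actual speed values are not assumed here."""
    FAST = "fast"
    SLOW = "slow"


@dataclass(frozen=True)
class SpeedCondition:
    """One study condition: the robot group's speed paired with the human's speed."""
    robots: Speed
    human: Speed


def relative_speed_conditions():
    """Enumerate the 2 x 2 design in the order the conditions are listed in the text."""
    # product() varies the human's speed fastest, giving:
    # (fast, fast), (fast, slow), (slow, fast), (slow, slow)
    return [SpeedCondition(robots, human) for robots, human in product(Speed, Speed)]


if __name__ == "__main__":
    for number, condition in enumerate(relative_speed_conditions(), start=1):
        print(number, condition.robots.value, condition.human.value)
```

Writing the conditions as a full crossing of the two factors makes it explicit that robot speed and human speed are varied independently, which is what allows H3-RSP and H4-RSP to be examined separately.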

Table 7 Relative speed study conditions
Fig. 14
figure 14

Conditions for the relative speed study, with each different color arrow representing a different possible speed

6.3.1 Relative Speed Results

Trends in the relative speed results showed that the conditions in which the robots were moving fast were viewed more negatively than the conditions in which the robots were moving slow across all questions. However, these trends were not statistically significant, meaning relative speed did not significantly impact social attributions, relational attributions, or functional interpretation.

Results from the extended response questions supported these trends, with conditions in which the robots move quickly reading as aggressive and blocking. When the robots moved fast and the human moved slow, over 50% of responses described the robots as blocking, hostile, and aggressive. In the condition where both the robots and human moved fast, this was the case for over 75% of responses. For example, one participant said, “[t]he robots seem as though they are trying to intimidate the human. Reminds me of Tony Soprano and his crew threatening somebody. They move quick and get right in your face.” In the two conditions in which the robots moved slowly, however, 25% of responses said the robots were greeting or wanting to interact with the human. For example, one participant said, “[s]ince they went slowly I think it was to greet the person.”

6.4 Relative Speed Discussion

None of the hypotheses for relative speed were supported, due to the lack of significant results. However, the trend that the robots moving fast was viewed more negatively than the robots moving slow agrees with results from prior work with single robots, which has found that people are less likely to interact with a robot moving quickly, as it appears to be less social [4, 23]. We theorize that our results only show trends rather than statistical significance because in our studies the participant is a bystander and physically removed from the space the robot is in, which may impact the perception of the robot group motion. This difference may account for the lack of impact and should be explored further in future work.

Relative speed may be a parameter that needs more context to effectively send social and functional signals, since speed can mean a variety of social and functional things. For example, moving fast may mean that you are late and in a hurry, that you are uncomfortable in a situation and want to leave, or that you are excited about something and eager to get to it. These three situations all have different functional and social implications, but without the necessary context may look the same to an outsider.

We also theorize that the lack of significant results may be due to there being only a small difference between fast and slow speeds on the Spheros in the study videos. Due to hardware and space constraints, the Spheros had a very limited range of speeds to work with, resulting in not enough distinction for results to be significant. Another possibility is that speed may not be particularly communicative on its own, and will need to be combined with other parameters in the future to get more distinct reactions.

7 Relative Spacing Study and Results

This section covers the studies for the parameters grouped under relative spacing: (1) geometry (N = 120) and (2) proximity (N = 120). The geometry study showed that different formations imply different social attitudes for the robot group, with the line formation being viewed the most positively and the clump formation being viewed the most negatively. Contrary to prior work in single robot proxemics [7, 17], proximity had only one significant result: that the robots seemed more friendly when they stopped close to the human and close to the entry.

7.1 Geometry Study

Geometry is a sub-concept of relative spacing, which represents the shape the robots in the multi-robot group create. The goal of this study is to explore the effects of the end-pose geometry of the robots on people’s interpretation of the robots’ motion.

Sequence of Events: First, the human begins moving down the entry lane toward the entry. Shortly after this, the robot group moves. All robots in the group move at the same speed. The human continues forward until the entry is reached, unless blocked by the robots. The robot group’s final position and the relative direction of individual robots depend on the study condition.

Fig. 15
figure 15

Final position shapes for the geometry study

Fig. 16
figure 16

Survey responses to social attribution anchored scale questions for the geometry videos

Conditions & Hypotheses: The geometry of the robots’ end poses is described by the following conditions: (1) line, (2) clump, and (3) square. The conditions can be seen in Fig. 15. Across all conditions, the starting pose stayed consistent and only the end pose was varied. Our hypotheses are as follows:

  • H1-GEO: Geometry will have the biggest impact on functional interpretation.

  • H2-GEO: Geometry will have the least impact on relational attributions.

  • H3-GEO: Square formation will be viewed the most positively.
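To make the three end-pose conditions concrete, the following is a minimal sketch of how line, square, and clump end poses could be generated for a small robot group. All function names, coordinates, spacings, and the seeded random placement used for the clump are illustrative assumptions; they are not the positions used in the study videos (see Fig. 15 for the actual formations).

```python
import math
import random


def line_formation(n, spacing):
    """Return n end poses evenly spaced on a horizontal line centered at (0, 0)."""
    half_width = (n - 1) * spacing / 2
    return [(i * spacing - half_width, 0.0) for i in range(n)]


def square_formation(side):
    """Return four end poses at the corners of a square centered at (0, 0)."""
    h = side / 2
    return [(-h, -h), (h, -h), (h, h), (-h, h)]


def clump_formation(n, radius, seed=0):
    """Return n irregularly placed end poses within `radius` of (0, 0).

    The irregular placement is an assumption standing in for the
    disorganized clump; the fixed seed keeps the sketch reproducible.
    """
    rng = random.Random(seed)
    poses = []
    for _ in range(n):
        r = radius * math.sqrt(rng.random())  # uniform over the disc area
        theta = 2 * math.pi * rng.random()
        poses.append((r * math.cos(theta), r * math.sin(theta)))
    return poses


def mean_pairwise_distance(poses):
    """Average distance between every pair of poses; a rough measure of spread."""
    pairs = [(a, b) for i, a in enumerate(poses) for b in poses[i + 1:]]
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    for name, poses in [
        ("line", line_formation(4, 1.0)),
        ("square", square_formation(1.0)),
        ("clump", clump_formation(4, 0.3)),
    ]:
        print(name, round(mean_pairwise_distance(poses), 3))
```

The spread measure is only one hypothetical way to compare formations; it captures how concentrated the robots are, but not the symmetry or approach direction of a formation.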

7.1.1 Geometry Results

Geometry had a significant impact on participants’ views of whether the robots were [threatening/harmless] and [hostile/courteous], both social attribution questions, as seen in Fig. 16. The square formation was highly significantly viewed as more harmless than the clump formation; however, the line formation was highly significantly viewed as more harmless than both the square and clump formations. Similar trends were seen in the [hostile/courteous] question. Geometry did not significantly affect participants’ perception of relational attributions. In functional attributions, geometry only impacted whether participants thought the robot group was blocking the human, as seen in Fig. 17. The clump formation was rated as significantly more blocking than the square formation. Full numerical results can be seen in Table 8.

The extended response questions add insight to the quantitative results. In the extended response answers, 40% of participants mentioned a positive relationship between the human and the robots for the line formation, whereas only 5% did so for the clump formation and only 10% for the square formation. Many participants (25%) also thought in the extended response that the robots in the line condition were curious and interested in the human, which is a neutral/positive interpretation of the robots’ motions. For example, one participant stated of the line formation that the robots “are curious and welcoming” and “they want to get up close and learn.” Another participant said the robots “appear to be investigating the human.”

Participants in the extended response also mentioned that they thought the clump formation was aggressive (40% of participants) and that the robots in the clump formation were acting negatively towards the human (50%). These percentages are more than twice those for line and square, which again provides insight into the more negative ranking of clump. Additionally, no participants specifically said that they thought the clump formation was welcoming or unwelcoming in their extended response, whereas 20% explicitly said the line formation was welcoming and 10% specifically said the square formation was welcoming.

The square formation was the most mixed, with 25% of participants stating that they thought the robots had a negative relationship with the human and only 10% stating that they thought the square formation had a positive relationship with the human. The square formation was described with words like aggressive and protective. Participants also thought that the square formation was wary and cautious of the human in 15% of responses (as opposed to 0% of responses for line), which may also contribute to the more negative results.

Table 8 Numerical results of geometry study
Fig. 17
figure 17

Survey responses to functional interpretation anchored scale question “The robot group was blocking/not blocking the human.”

7.2 Geometry Discussion

Overall, different formations implied different social attitudes for the robot group. Geometry had the highest impact on social attributions, which does not support hypothesis H1-GEO, that geometry will have the biggest impact on functional interpretation. The line formation was seen the most positively, the clump formation was viewed as the most hostile, and the square formation was seen as observant. Geometry did not have any significant effect on relational attributions, which does support hypothesis H2-GEO, that geometry will have the least impact on relational attributions. These results are novel compared to prior work in multi-robot expressive motion, as explicit formations have not been examined for communicatory impact.

The line formation was seen as the most positive formation. Though only statistically significant in one question, the line formation showed trends of being the most positively viewed formation across almost all questions. This result does not support hypothesis H3-GEO, that the square formation will be viewed the most positively. For the line formation, participants often described the robots as moving to greet and welcome the human, and they were viewed as less threatening and more courteous. This interpretation may be because the line formation gave the most space between the human and the robots. The robots were evenly spread out and not very close to the human, whereas in the clump formation the robots came very close to the human. Similarly, the square formation had the robots in a more concentrated space than the line formation, which may have read as threatening. The open spacing of the line formation may have read as the robots giving space to the human and not trying to interact too aggressively. The line formation is also symmetric, which prior work has shown is generally associated with lower arousal and more positive emotions [71].

The clump formation was seen as the most negative. The robots move on very sharp diagonals directly toward the human in the clump formation, whereas the motion is not so directed at the human in the other two formations, which is another possible reason for clump being the most negatively rated. Some participants (10%) in the extended response explicitly mentioned that they thought the robots in the clump formation were trying to attack or confront the human, which likely plays some part in the negative response for clump. For example, one participant stated “robot group is attempting to attack” in response to the clump formation. One possible reason for this negative interpretation may be the disorganization of the clump formation. Prior work has shown humans have a higher arousal response to asymmetry than they do to symmetry [71], which could account for the strong reactions to the clump. Similarly, motions that are very angular and move on sharp diagonals have been shown to be associated with being threatening [71, 72], which may also have contributed to the negative perception.

The square formation read as observant and curious. In the extended response 20% of participants interpreted the square formation as trying to surround the human, with some mentioning that it reminded them of a checkpoint before an entry. For example, one participant said the robots were trying “[t]o surround and observe the individual who is approaching” for the square formation. Investigating and curiosity do not have positive or negative connotations, but some participants paired this curiosity or investigation with describing the robots as cautious or wary, which may explain the negative skew on the generally neutral results for the square formation. The square formation was also viewed as the least blocking, which is likely because it physically leaves the biggest opening for the human to get to the entry out of the three conditions. Like the line, the square is a symmetric formation, which may be why it was viewed more positively than the disorganized clump, as symmetric formations are associated with low arousal and positive emotions [71].

7.3 Proximity Study

Proximity is a sub-concept of relative spacing, which represents the robot’s distance relative to a person or object in the scene. The goal of this study is to explore the ways in which the relative distance between the multi-robot group and the figure impacts the social interpretation of the motion. An additional goal is to explore if the distance between the meeting point of the robot group and the human and the entry changes the social interpretation.

Sequence of Events: First, the human begins moving down the entry lane toward the entry. Shortly after this, the robot group moves. The robots and the human move until they reach the line according to the experimental condition. Both the robots and the human move at a medium speed in all conditions. The robot group and human motion depend on the study condition.

Conditions & Hypotheses: The robot and human responses include the following conditions: (1) Robots and the human both stop at Line 1, (2) Robots stop at Line 2 and the human stops at Line 1, and (3) Robots and the human both stop at Line 2, as seen in Table 9 and Fig. 18. Our hypotheses are as follows:

  • H1-PROX: Proximity will have the largest impact on relational attributions.

  • H2-PROX: Proximity will have the least impact on functional interpretations.

  • H3-PROX: The condition in which the human and the robot both stop at Line 2 will read as the most positive.

Table 9 Proximity study conditions
Fig. 18
figure 18

The two lines shown are the potential stopping points for both the robots and the human in the proximity study. Black arrows show the direction of travel

7.3.1 Proximity Results

The one significant pairing across the entire proximity study was that the robot and human both stopping at Line 2 read as significantly more friendly than the robot stopping at Line 2 while the human stopped at Line 1, as seen in Fig. 19. However, the robots and human stopping at Line 2 was not significantly more friendly than the robots and human stopping at Line 1. Full numerical results can be seen in Table 10.

Table 10 Numerical results of proximity study
Fig. 19
figure 19

Survey responses to “The interaction between the robot group and the human was unfriendly/friendly.”

The extended response results showed that the robots and human both stopping at Line 2 was viewed the most positively, with 35% of participants describing the robots as wanting to greet or interact with the human. This condition was also only described as blocking the human or as aggressive to the human in 30% of responses. Both the robots and human stopping at Line 1 was viewed the most negatively, with 50% of responses saying the robots were blocking or aggressive towards the human, and only 15% of responses saying the robots were greeting or positively interacting with the human.

7.4 Proximity Discussion

The one significant result, that both the robots and the human stopping at Line 2 is friendlier than the robots stopping at Line 2 while the human stops at Line 1, is from the social attributions question set. This does not support hypothesis H1-PROX, that proximity will have the largest impact on relational attributions, or H2-PROX, that proximity will have the least impact on functional interpretations. Hypothesis H3-PROX was somewhat supported: in one case, the robots and the human stopping at Line 2 was seen as the most positive, although this was not true across all questions. Results from prior work have shown that in one-on-one in-person interactions, proximity depends heavily on context and direction of approach [13, 17], following similar trends to human–human proxemics [74]. Our results do not support these prior findings, due to the lack of significant results, but this warrants further exploration.

One possible reason is that proximity may be a parameter that is more effective in communicating when the human and robot are in the same physical space. The level of discomfort one may feel when a robot invades their personal space would likely be much higher than watching a robot invade someone else’s personal space as a bystander. This effect may be amplified in this study because the “human” is a figurine, and people may not think of a figurine as having personal space. Therefore, it would be harder for participants to infer as much meaning from the proxemics between the robots and the figurine, as both are inanimate objects.

Another possible reason for the lack of significant results in the proximity study is that proximity can be very context dependent and very person dependent. Personal proxemic preferences can differ from person to person, and can change depending on the situation, and this study did not provide much context. For example, human proxemics change depending on what kind of setting the person is in. A person’s close, personal distance will be much smaller in crowded spaces than in spaces with fewer people. Similarly, the current context of COVID-19 and social distancing in real life may have affected how people thought about the relative distances between the robots and the human.

8 Discussion

In this section, we discuss our main findings from the validation studies of each parameter, and discuss unexpected insights that arose from the results. The six sub-study results show strong promise for a multi-robot expressive motion system that can be used by future researchers, significantly impacting human interpretations of what the robots were trying to communicate, and their general attitude toward the human in the video across most parameters (Sect. 8.1). Across the studies, we also confirmed and extended prior understandings of multi-robot expressive motion related to what is considered a robot group, and the general communicatory interpretations of parameters, relativity and floor plan. These insights are summarized in our design guidelines (Sect. 8.2).

8.1 MoTiS Framework Validation

The MoTiS framework brings together concepts from prior work in single-robot expressive motion and multi-robot systems in one easy-to-find place for researchers who want to easily layer expressive motion into their multi-robot systems. We expanded relative direction [51], relative speed [23, 53], relative start time [52], and proximity from single-robot expressive motion to multi-robot motion, including concepts like relativity. We also included concepts from multi-robot expressive motion, such as coherence [20, 55, 56] and geometry [59], and explored how specific instances of these parameters can be interpreted. Relative direction, coherence, relative start time, and geometry all made a significant impact on how participants viewed the robot group, while relative speed and proximity showed trends, with specific impacts as follows:

  • Relative Direction: Relative direction had the most impact on how participants viewed both the robots’ goals and their social attitude. The direction of motion (towards vs away) impacted the functional goals, showing whether the robots were blocking the human or allowing the human to enter. Object of motion (human vs entry) indicated both social and functional goals, suggesting what the robot group is paying attention to and/or what the robot wants another social actor to do. These results support the findings of prior work in single-robot expressive motion [51], and in a novel way, show the impact relative direction can have in multi-robot systems.

  • Coherence: Coherence had the biggest impact in determining participant perception of group and subgroup membership. Coherence also impacted how participants viewed the relationship between the robots and the human. The outlier robots moving away were perceived more negatively than the outlier robots staying still. These results emphasize the importance of knowing when robots will be perceived as a group, which has been highlighted in prior work in human sociology and robotics [24, 26, 27, 75].

  • Relative Start Time: Relative start time most impacted what participants thought the robots were functionally trying to achieve. The robot group and the human starting at the same time was viewed very negatively, with participants describing it as aggressive, threatening, and trying to block the human. However, the robots starting before or after the human caused participants to view the interaction and the robots’ goals more positively, often being described as the robots going to greet the human. These results are novel to this work.

  • Geometry: Geometry had the most impact on how participants interpreted the social attitudes of the robot group. The line formation was perceived the most positively. The square formation was perceived as the robots being observant. The clump formation was viewed negatively, with participants labelling the robots as aggressive. These results support findings of prior work in how shapes imply certain emotions [71, 72] in humans, and show that these concepts also extend to multi-robot systems.

  • Relative Speed: Trends in relative speed showed that faster robot speeds were viewed more negatively than slower robot speeds. The human’s speed did not affect any trends. These results somewhat back up prior work with single robots that showed that faster robot speeds are seen as less social [23]. Contrary to prior studies [23, 52, 53], our results did not confirm that speed had a significant impact.

  • Proximity: Trends in proximity showed that the robots stopping close to the human and the entry was seen as more positive than stopping close to the human but far from the entry, or far from the human but close to the entry. However, this trend was only significant when participants were asked if the robots seemed friendly or unfriendly. These results do not support prior work that has shown that proximity has a large impact on human comfort in interactions [13, 17, 76].

As stated, relative speed and proximity showed trends, but did not have significant results. We hypothesize this is due to the participant not being an interaction partner with the robot. We will continue to explore these parameters in future in-person studies. In previous studies where the humans are interaction partners with robots, both speed [23, 52, 53] and proximity [13, 17, 76] have been shown to be highly important to how humans interpret the robot motion. However, in our studies, in which the participant is a bystander and physically removed from the space the robot is in, neither relative speed nor proximity impacted the perception of the robot group motion. This difference may account for the lack of impact these two parameters had and should be explored further in future work.
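The parameter groupings and validation outcomes summarized in this section can be written down as a small lookup structure. The following is a minimal sketch: the timing and spacing groupings follow the sub-concept statements in Sects. 6 and 7, while placing relative direction and coherence under relative motion is an assumption based on the framework's name (Motion, Time, and Space). The identifiers `MOTIS_FRAMEWORK`, `VALIDATION_OUTCOME`, `category_of`, and `parameters_with` are hypothetical.

```python
# Hierarchy of the six MoTiS parameters (grouping under "relative motion" is assumed).
MOTIS_FRAMEWORK = {
    "relative motion": ["relative direction", "coherence"],
    "relative timing": ["relative start time", "relative speed"],
    "relative spacing": ["geometry", "proximity"],
}

# Outcome of each validation study as summarized in Sect. 8.1.
# Proximity is listed as "trend only" even though one friendliness
# comparison was significant, following the section's own summary.
VALIDATION_OUTCOME = {
    "relative direction": "significant",
    "coherence": "significant",
    "relative start time": "significant",
    "geometry": "significant",
    "relative speed": "trend only",
    "proximity": "trend only",
}


def category_of(parameter):
    """Return the top-level category that a parameter belongs to."""
    for category, parameters in MOTIS_FRAMEWORK.items():
        if parameter in parameters:
            return category
    raise KeyError(parameter)


def parameters_with(outcome):
    """Return all parameters whose validation study had the given outcome."""
    return [name for name, result in VALIDATION_OUTCOME.items() if result == outcome]


if __name__ == "__main__":
    for name in parameters_with("trend only"):
        print(name, "->", category_of(name))
```

A structure like this is only a bookkeeping aid for selecting which parameters to vary; it does not encode how any parameter value maps to a specific interpretation.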

8.2 Design Guidelines

In addition to exploring the social, relational, and functional impact of each parameter, some unexpected insights emerged. We present the following design guidelines for future researchers using our MoTiS framework in multi-robot systems.

Coherent group motion influences when people view multiple robots as a unified group. Extending previous work [21], our results confirm that it is crucial for social robot developers to estimate when humans see multiple robots as a group, because this will impact the human interpretation of what the robots are communicating. For example, in the coherence study condition in which two robots moved away and two moved toward the human, one participant stated “One group was trying to get away. The other group was trying to go interact with the human.” In this example, the participant split the robots into two separate groups: one with a fearful motivation and another with a desire to interact. Confirming prior work in in-group and out-group [26, 27, 75], the coherence study results (see Sect. 5.3.1) indicated that coherence predicts sub-group membership and that subgroups were generally labeled to have the same expression. When one or more robots within a group of robots move differently from the main group, people view these robots as having different goals and motivations than the main group.

Patterns emerge in how certain types of motion are perceived by onlookers. Some instances of parameters were almost universally viewed as negative. In the relative start time study, participants stated that when the human and robot started moving at the same time, it seemed confrontational. For example, one participant described the robots as “immediately form[ing] a line to prevent the human from moving.” In the relative direction study, moving toward the entry or human was often seen as functionally blocking the human, and therefore negative. As in the relative start time study, people also viewed moving towards the human as confrontational, with participants using words like “aggressive,” “threatening,” and “hostile.” Similarly in the geometry study, the robots ending formation as a clump was also seen as functionally blocking the human, and therefore negative. Additionally, the clump was the least organized final formation, and this disorganization and chaos may have also led to negative interpretations, with participants describing the robots as “trying to intimidate the human” and “attempting to attack or have a collision with the human.” These results highlight that some instances of parameters will naturally be viewed as mean or aggressive by many people, and should be avoided or only used in certain contexts in human spaces. However, there were no parameters that were viewed as universally positive. For example, moving away from the entry was rated positively by participants, but in the extended response many said that the robots were afraid of the human.

Relativity can greatly affect people’s interpretation of group motion. In relative direction, moving towards the human or entry was always viewed as trying to stop the human from proceeding, and moving away from the human or entry was viewed as allowing the human to continue on their path. Additionally, moving towards the human or entry was viewed negatively (specifically hostile and threatening) and moving away was viewed positively (specifically courteous and harmless). There was often a much higher standard deviation in responses when moving relative to the human versus relative to the entry, showing that moving towards the entry was likely less confusing to participants. In relative start time, the robots and human starting together was always viewed as trying to stop the human from proceeding, and participants viewed the robots and the person as having a negative relationship.

The floor plan affected how participants interpreted the motivations and goals of the robots. Throughout all six studies, participants described the robots’ goals and motions referencing the floor plan. For example, in the geometry study one participant described the robots as “stewards of the area or structure they’re originally around.” Across most of the studies participants also said they thought the robots were “trying to block entry to the human,” referencing the entryway the human was moving towards. These results show that the relationship between the robots and the floor plan of the space does change the interpretations of multi-robot group motions across multiple parameters. It was unexpected that the floor plan would come up so frequently in the results, and these findings warrant future exploration of how floor plan can affect the interpretation of group expressive motion.

9 Conclusion

In this work, we created and validated a hierarchical framework for multi-robot expressive motion, called MoTiS, consisting of six features: (1) relative direction, (2) coherence, (3) relative speed, (4) relative start time, (5) proximity, and (6) geometry. After compiling insights from prior literature and running six video-based studies, we developed a system of expressive parameters to support multi-robot expressive motion. Previous work in single-robot expressive motion was extended to multiple robots to create expressive parameters, such as relative speed and proximity, and the parameter validation study showed the ways in which these previous findings in single-robot expressive motion are applicable to multi-robot systems and how they need to be modified. For example, proxemics [7, 17, 61] classically describe the spatial relationship between one robot and one human. For multi-robot systems, proxemics need to be extended to include not only the spatial relationship between the robot group and the human, but also the space between individual robots within the group. These expressive parameters also integrate the new needs inherent to multi-robot systems, such as geometry and coherence, which have not been explored in single-robot expressive motion, but have been used in multi-robot expressive motion. Consolidating parameters from prior work in single and multi-robot expressive motion, and adding a novel parameter (start time), allows us to create a system of expressive parameters that has the potential to generate a wider, more nuanced range of multi-robot expressive motion than seen in prior work.

This framework consisting of six parameters was evaluated in six independent online user studies, one for each variable. These six user studies validated that four out of six of the parameters in the framework had impact on how onlookers viewed the robot group motion. We found that coherence plays a crucial role in determining how participants grouped the robots. In addition to validating the parameter framework, we evaluated three overarching research questions:

  • How does the motion affect the perceived social attitudes of the robot group?

  • How does the motion affect the perceived relationship between the robot group and the human?

  • How does the motion affect the perceived functional goals of the robot group?

Relative direction and geometry impacted the perceived social attitudes of the robots, with moving away and line formations being viewed as socially positive. Coherence impacted the perceived relationship between the robots and the human, with outlier robots moving away from the human being viewed as indicating a negative robot–human relationship. Relative direction and relative start time both impacted the perceived functional goals of the robots, with moving towards the human being viewed as blocking and moving before the human being viewed as welcoming and greeting. Relative speed and proximity only showed trends, which contradicts prior work showing that speed and proxemics significantly impact the perception of single-robot motion [4, 13, 17, 23]. We hypothesize that this may be due to participants being onlookers in our studies, as opposed to interaction partners as in prior work. These results warrant further exploration, with emphasis on in-person studies. Additionally, we provided design guidelines for the different parameters, showing how different variations of each parameter can impact how participants interpret the group motion. Extending prior work, we confirmed that whether or not humans view multiple robots as a group impacts how they perceive the motion of these robots. We found that coherence is a strong predictor of when onlookers will view multiple robots as a group. However, it is not only the robots’ motion that matters: the floor plan of the physical space also plays a key part in how people view the motivations and goals of the robot group.

Future work will explore more deeply how these parameters can be used in different social contexts to generate legible multi-robot group motion, with emphasis on in-person studies with physical robots. Additionally, it will be important to examine how these parameters can generate expressive motion in various physical spaces, since the floor plan impacts human perception. Future work can explore these parameters in different scenarios and evaluate how the architectural floor plan and context change the efficacy of different parameters. We plan to test in in-person studies how a parallel entry lane compares to the perpendicular entry lane used in this study. These future studies will also provide a comparison between participants as onlookers and participants as interaction partners with the robot group. Future researchers can use our parameter framework to generate expressive multi-robot motion in their own applications. Multi-robot groups are currently being used in human spaces such as factories [1] and in search and rescue operations [3], and can be direct beneficiaries of this work. These and an expanding number of everyday human–robot application contexts will benefit from effective, legible communication.