1 Introduction
Designing robotic assistance frameworks presents a set of challenges that are unique to human-robot interaction (HRI) and highly dependent on the task, environment, capabilities of the user, and the ultimate goal of the interaction. Robotic interfaces can act as guides in remote teleoperation or robotic surgery—providing haptic feedback and constraints to users—or shared control may enable robots to act as partners providing assistance that compensates for a user’s deficit due to disability or high cognitive load [21, 41]. While the goal of teleoperation and assistive robotics is successful task completion, rehabilitation robots must target learning, sometimes at the expense of achieving task success. Robots have great potential for enhancing training and rehabilitation because of their ability to support many task repetitions and to quantify the performance of the trainee. However, the automation cannot simply track the same series of joint states over and over, measuring the user’s error. Instead, the automation must take actions based on a quantitative task definition and the specific needs of the user.
Motor learning literature and past studies of control strategies for training provide us with several requirements that robots must meet in order to promote learning. Most important among these is a need for active involvement by the user. When a patient is suffering from a severe deficit, such as immediately after a stroke, there is some benefit to passive movement. However, in the long term, simply moving a person through the motions of a task leads to slacking [20]. Training and therapy are most effective when training is intense and patients are actively involved in the exercises [31]. User engagement can be promoted using game-like graphic interfaces for task-oriented training [48, 55] and by matching the control strategy to the relative difficulty of the task [30, 37]. This suggests that typical robot control frameworks tracking a particular joint trajectory are ill-suited for robot-mediated training.
Instead, impedance/admittance control, virtual fixtures, and potential fields are used to “assist-as-needed” or to challenge high-skill users through error augmentation, kinematic variability, and random perturbations [5, 34, 44]. Although hardware advances have made robots more versatile and made it possible to autonomously support an almost unlimited number of meaningful tasks, most prior work in the upper extremity has focused on path-following tasks, where the goal is to minimize tracking error or follow a normative velocity profile. This is because a fundamental problem in physical human-robot interaction is how we should define the desired behavior [34, 44]. There are two main aspects to consider. One must specify the low-level task that the actuators and encoders manage (e.g., a desired trajectory, velocity profile, or torque specification), and at a higher level one must select from one of many possible strategies to achieve the desired behavior. While the simplest solution may be to minimize the error between the robot and a recorded trajectory based on expert input or average behaviors, this brings up issues of time dependence and enforcing a particular solution to a task that may have infinitely many “good enough” solutions. This approach also neglects a fundamental principle of motor learning—that errors and variability actually enhance learning [53].
We assert that interfaces must enable flexibility in task solutions and provide task-based feedback rather than error-based feedback. Stereotypical movements, such as reaching, cleaning, self-feeding, and walking, have substantial variation between trials that are qualitatively equally successful, both within and between individuals. This variability is necessary for motor learning, so our control objectives need to maintain a task-level view of performance. This perspective is achieved by monitoring the spatial statistics of movement rather than relying on error or task-specific performance heuristics, providing assistance based on a global measure as opposed to local interactions. A measure comparing the temporal statistics of trajectories to a spatial distribution defining a task—ergodicity—enables one to identify changes due to reduction of a person’s existing deficit or training with robotic assistance when other common measures failed to capture at least one of these effects [14]. Furthermore, one can easily use a set of demonstrations to learn a task definition with imitation learning [24].
In this article, we present a novel method for providing corrective feedback based on a statistical task definition that can be generalized to a broad set of tasks. Rather than asking the robot to help the user minimize their tracking error, the robot intervenes to increase the information about the task that is encoded in the motion. This is implemented by using an ergodic measure to close the loop on our hybrid approach that switches between full user autonomy and full rejection of user inputs. In a user study, we implement ergodic hybrid shared control (HSC) on an impedance-controlled robot (Figure 1) and empirically compare training with the ergodic HSC to an assist-as-needed path controller.
The article is organized as follows: First, in Section 2, we provide background information on control strategies used in robotic training applications and highlight two control design features that greatly impact the performance of these training strategies. This is followed by a discussion of our prior work and the ergodic HSC algorithm in Section 3. Results of applying the algorithm to simulated noise input are given in Section 4. The experimental protocol and design can be found in Section 5, followed by the results in Section 6. Finally, we provide a discussion of the results in Section 7.
5 Methodology
In this study, we assess ergodic hybrid shared control in a set of timed drawing tasks and compare it to virtual fixtures. Both hybrid shared control and virtual fixtures require the active participation of the user; without input from the user, neither control framework will complete the task on its own. Both control strategies are also designed to assist as needed, enabling some variability in the user’s task execution. HSC achieves this by instantaneously evaluating user inputs based on their future impact on a cost function. The virtual fixtures are enabled only when participants are far from the task path.
The scale of the drawings in this study requires the participant to reach over a large area of their workspace, much like therapeutic reaching exercises in [12, 13, 22, 39, 52]. Drawing has a high-level goal similar to daily activities like cleaning or food preparation, where there are many possible strategies one could use to complete the task and the order of locations visited does not necessarily affect the end result. Yet, one can still make a comparison to an error-based controller by using the distance from the lines in the drawing to generate a set of virtual fixtures that limit the distance between the user’s path and the desired path. The time limit adds a dynamic challenge to the task such that healthy adults need practice to improve their performance. Twenty-four participants were asked to perform a set of baseline tests without assistance, followed by a training set and post-training set for each image tested. The type of assistance provided during training was randomly assigned. The experimental design allowed us to establish performance changes due to each of the control strategies and compare their effects on training.
5.1 Study Design/Apparatus
Participants were asked to draw the four grayscale line drawings shown in Figure 1. Each was sized to \(2,\!200\text{ px} \times 2,\!200\text{ px}\). They were given a maximum of 10 seconds to copy each image into a box on the screen. The area on the screen corresponded to a \(1\text{ m}\times 1\text{ m}\) horizontal plane in front of the user, much like the mapping of a mouse to a computer screen. For our experiment, we utilized a Sawyer Robot (Rethink Robotics) in the interaction control mode provided in the Intera SDK. Interaction control mode enabled us to set the type of control, force vs. impedance, of each dimension in Cartesian space. Sawyer’s integrated sensors were used to track the state of the end effector as well as to estimate the user inputs. The sensor information was sent to the host computer, which simulated a double integrator system and executed Algorithm 1, sending updates on the interaction control parameters according to Equations (9) and (10), defined below. The host computer also updated the visualization provided to the users.
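The host-side simulation described above can be sketched as simple Euler integration of a planar double integrator. This is a minimal sketch; the step size, state layout, and function names are illustrative assumptions, not the published implementation.

```python
import numpy as np

def double_integrator_step(state, u, dt):
    """One Euler step of a planar double integrator.

    state: [x, y, vx, vy]; u: commanded acceleration [ax, ay].
    """
    x = np.asarray(state, dtype=float)
    a = np.asarray(u, dtype=float)
    x_next = x.copy()
    x_next[:2] += dt * x[2:]   # position integrates the previous velocity
    x_next[2:] += dt * a       # velocity integrates the acceleration input
    return x_next

# simulate 1 s at 100 Hz under a constant 1 m/s^2 acceleration in x
state = np.zeros(4)
for _ in range(100):
    state = double_integrator_step(state, [1.0, 0.0], 0.01)
```

In the study, the acceleration input to this simulated system came from the inertial measurement unit in the end effector, as described in Section 5.2.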
5.1.1 Virtual Fixture Implementation.
Virtual fixtures were implemented by setting the planar components of the interaction control commands to force mode. The other dimensions were set to a high level of impedance, restricting the motion of the user to the horizontal plane. When users were within 100 pixels (approximately 4.5 cm) of a dark pixel in the given drawing, impedance and force parameters were set to \(\mathbf {0}\). When this condition was violated, a force proportional to the distance to the nearest pixel and the current velocity was produced by the robot as in Equation (9), where \(d_p\) is the distance to the nearest dark pixel, \(r\) is the radius of the channel, and \((x_p,y_p)\) are the coordinates of the nearest pixel. \(K_P\) and \(K_D\) are gains on the proportional and derivative terms of the feedback law, respectively.
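Equation (9) itself is not reproduced in this excerpt; its structure can be sketched as a proportional-derivative restoring force on the channel-boundary violation. The gains and the exact form of the force law here are assumptions for illustration.

```python
import numpy as np

def virtual_fixture_force(pos, vel, nearest_pixel, r, K_P, K_D):
    """PD restoring force toward the drawing, active only outside the channel.

    pos, vel: planar position and velocity; nearest_pixel: (x_p, y_p),
    coordinates of the nearest dark pixel; r: channel radius.
    """
    pos = np.asarray(pos, dtype=float)
    to_pixel = np.asarray(nearest_pixel, dtype=float) - pos
    d_p = np.linalg.norm(to_pixel)        # distance to the nearest dark pixel
    if d_p <= r:                          # inside the channel: robot is transparent
        return np.zeros(2)
    n_hat = to_pixel / d_p                # unit vector toward the nearest pixel
    # proportional pull on the boundary violation, plus derivative damping
    return K_P * (d_p - r) * n_hat - K_D * np.asarray(vel, dtype=float)
```

Inside the 100-pixel channel the returned force is zero, matching the transparent behavior described above.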
5.1.2 HSC Rejection Equations.
There are several ways to implement the rejection of user actions described in Algorithm 1. When using a low-power haptic device, one can generate a transient virtual wall, or one can ignore the user inputs when using an admittance controller with sufficient mechanical power to generate forces equal to that of the user. In this study, we modify the impedance parameters in the end-effector space of the robot. The task-irrelevant dimensions were set to a high impedance, again restricting the motion of the user to the horizontal plane. When user actions are accepted, impedance in the plane is set to zero. When user inputs are rejected, impedance parameters are set to track the velocity at the time of the last accepted action according to Equation (10).
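Equation (10) itself does not appear in this excerpt; the switching behavior it governs can be sketched as follows. The damping gain and the exact force law are illustrative assumptions.

```python
import numpy as np

def hsc_interaction_command(accept, vel, v_ref, K_D_reject):
    """Planar force command for hybrid shared control rejection (sketch).

    accept: whether the current user action is accepted; vel: current planar
    velocity; v_ref: velocity at the time of the last accepted action;
    K_D_reject: damping gain used while rejecting.
    """
    if accept:
        # accepted actions: zero planar impedance, robot transparent to the user
        return np.zeros(2)
    # rejected actions: damping force drives the end effector to track v_ref,
    # holding the velocity of the last accepted action
    return -K_D_reject * (np.asarray(vel, dtype=float) - np.asarray(v_ref, dtype=float))
```

When the user pushes against a rejection, the force opposes the velocity deviation, which is consistent with the "equal and opposite force—maintaining a constant velocity" behavior described in the conclusions.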
5.2 Procedure
At the beginning of the session, participants were seated in a chair facing the robot and a display screen, and were asked to grasp a handle on the robot end effector. Sawyer is capable of exerting forces at this interaction point between the user and the robot in the x, y, and z directions, and can exert torques about all three axes. However, we maximized the impedance on the torques about these axes as well as the force in the z direction, restricting the end effector to a horizontal plane. The position of the end effector was measured from the joint angles using forward kinematics, and the acceleration of the end effector was measured using an inertial measurement unit installed in the end effector. The acceleration was used as input to the simulated double integrator system. At start-up, force/torque limits were placed on each degree of freedom.
A host computer was used to communicate with Sawyer during setup and operation. Using the core software architecture of the Robot Operating System (ROS), the host received position and acceleration information from Sawyer. The host also sent messages setting the parameters of the Sawyer impedance model and controller. Information from the Sawyer was used to visualize the interaction point as a 3D cursor and the drawing history as a series of dots in the ROS visualization package, rviz. The position information also kinematically controlled the double integrator system being simulated by the host computer. The host set the parameters to either increase or decrease the impedance at the end effector or modify the forces at the end effector according to either Equation (9) or (10).
At the beginning of the session, the drawing task was demonstrated to the participants by the authors, and participants were able to practice drawing on the screen using the robot as a cursor. Participants performed a baseline set of trials in which they drew each image 10 times for a total of 40 trials. The order in which they completed these drawings was randomized to minimize learning during the baseline trials. After the baseline set of trials, participants trained with their assigned control strategy completing both the training and post-training trials for one image before moving on to the next image.
Subjects were recruited locally (\(n=24\)), and had to be healthy, able-bodied adults (in the age range of 18–50) with no prior history of upper limb or cognitive impairments. Only right-hand dominant participants were accepted into the study, and each subject performed the task with their right limb. All study protocols were reviewed and approved by the Northwestern University Institutional Review Board, and all subjects gave written informed consent prior to participation in the study.
5.3 Measurements and Statistical Analysis
We assess user performance using the metrics that close the loop in the two control strategies that were tested: error and ergodicity. The data for each image consisted of 10 baseline trials, 10 trials with either ergodic HSC or virtual fixtures, and 10 trials post-training for a total of 30 trials for each of the four images. These were grouped into sets of 10 trials to evaluate subject performance over time. The analysis consisted of two-factor (set and group) mixed design ANOVA tests. The ANOVAs were used to compare the effect of the ergodic HSC and virtual fixture training on each of the performance measures. When significant main effects or interaction effects were detected, Student’s t-tests were used to evaluate the difference between the performance of the ergodic HSC group and the control group.
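The follow-up group comparison described above can be sketched with a Student's t-test. The data below are synthetic and purely illustrative; the means, spreads, and sample sizes are invented and do not reflect the study's measurements. The mixed-design ANOVAs themselves would typically be run with a dedicated statistics package.

```python
import numpy as np
from scipy import stats

# hypothetical post-training error samples (px) for the two groups
rng = np.random.default_rng(0)
hsc_error = rng.normal(loc=90.0, scale=15.0, size=120)
vf_error = rng.normal(loc=105.0, scale=15.0, size=120)

# Student's t-test comparing the HSC group to the VF control group
t, p = stats.ttest_ind(hsc_error, vf_error)
```

In the analysis, such t-tests were run only after a significant main or interaction effect was detected in the ANOVA.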
5.3.1 Error.
Every \(t_s\) seconds, we measured the position of the robot end effector and translated it to image coordinates on the domain \([0,\,2,\!200]\) \(\times\) \([0,\,2,\!200]\). We then performed a search for the nearest dark pixel (saturation \(\lt 130\)). The distance between the end-effector position and the nearest dark pixel was recorded. The error measure that we report here is the mean distance from the nearest dark pixel computed for each trial. The error metric captures the accuracy of the participant’s movement—a common target for robotic training. In tasks such as self-feeding, this accuracy is crucial, but for other tasks acceptable trajectories lie within a range, as in cleaning a spill. As long as the center of one’s palm travels close enough to the spill, the task can be successfully completed.
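The per-trial error measure can be sketched as below. A brute-force nearest-pixel search is used here for clarity; a distance transform would be more efficient in practice, and the function name is illustrative.

```python
import numpy as np

def trial_error(samples, image, dark_thresh=130):
    """Mean distance (px) from each sampled point to the nearest dark pixel.

    samples: (N, 2) array of (row, col) end-effector samples in image
    coordinates; image: 2D grayscale array; pixels below dark_thresh
    count as part of the drawing.
    """
    dark = np.argwhere(image < dark_thresh)            # (M, 2) dark-pixel coords
    # (N, M) pairwise distances between samples and dark pixels
    dists = np.linalg.norm(samples[:, None, :] - dark[None, :, :], axis=2)
    return dists.min(axis=1).mean()                    # mean nearest distance
```

For example, with a single dark pixel, a sample on the pixel contributes zero and a sample two pixels away contributes two, so the trial error is their mean.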
5.3.2 Ergodicity.
We treat each image as a discrete histogram over the domain and generate 100 random samples from that distribution, as can be seen in Figure 3. We use these to calculate the ergodic metric according to Equation (2), giving us the trajectory’s distance from ergodicity for each image. This distance characterizes how far a particular task execution is from the distribution of normative task executions. Therefore, trajectories that are close enough to the distribution of demonstrations—represented by the pixel values in the drawings in this experiment—are not penalized for minor variations under this metric.
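Equation (2) is not reproduced in this section; the standard spectral form of the ergodic metric compares Fourier coefficients of the trajectory's time-averaged statistics with those of the task distribution, discounting high-frequency mismatch with Sobolev-type weights. The sketch below assumes a cosine basis on the unit square; the basis normalization and number of coefficients are illustrative choices, not the study's exact implementation.

```python
import numpy as np

def ergodic_metric(traj, spatial_samples, K=10):
    """Distance from ergodicity on the unit square (spectral sketch).

    traj: (N, 2) trajectory samples; spatial_samples: (M, 2) samples drawn
    from the task distribution (here, the image histogram); K: number of
    Fourier coefficients per dimension.
    """
    k1, k2 = np.meshgrid(np.arange(K), np.arange(K), indexing="ij")

    def coeffs(points):
        points = np.asarray(points, dtype=float)
        # average of each basis function cos(pi*k1*x)cos(pi*k2*y) over the points
        return np.mean(
            np.cos(np.pi * k1[..., None] * points[:, 0])
            * np.cos(np.pi * k2[..., None] * points[:, 1]),
            axis=-1)

    lam = (1.0 + k1**2 + k2**2) ** (-1.5)      # Sobolev-type weights
    diff = coeffs(traj) - coeffs(spatial_samples)
    return float(np.sum(lam * diff**2))
```

A trajectory whose time-averaged statistics match the sample distribution scores near zero; mismatched distributions score higher.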
5.3.3 Completeness.
Eight students and faculty with limited knowledge of the study design were recruited to rate the completeness of the drawings generated in the study. Scorers were asked to provide a rating evaluating the completeness of each drawing on a scale from 1 to 100. Each participant drawing was assigned a random code and was randomly assigned to one of eight scorers via an online survey. Scorers were instructed not to judge the quality of the drawing on the basis of scale or accuracy. Instead, the scorers were asked to rate the completeness of the image based on what percentage of the elements of the original image was completed. Due to randomization of 3,000 ratings to 2,880 total images, many of the drawings received only one score from the online survey. To mitigate rater bias, drawings were analyzed in sets of 10 for the baseline and post-training sets. In a typical upper limb task like target reaching, one could evaluate whether a task was successful or complete based on whether or not a certain region around the target was visited. In drawing, the completion score provides a continuous measurement of task success.
6 Experimental Results
The results are reported as follows. First, the error of each group is statistically tested in Section 6.1. An analysis of the ergodicity is performed to test for differences in the relative information communicated in the drawings of the HSC group and the VF group in Section 6.2. Finally, an analysis of the completion scores of each group is performed in Section 6.3. The results demonstrate that training with the ergodic HSC increased subject performance in later trials within the same session. In each section, the relevant statistics are reported first, followed by a summary and interpretation of the results.
6.1 Error Measure
The mean error of each group in each set can be seen in Figure 4. The progress of the two groups over the training session was analyzed by performing mixed design ANOVAs on training group (between participants) and set (within participants) using the error on all four images. Only the baseline trials (set 1) and post-training trials (set 3) were used to avoid measuring the effects of the assistance itself in the analysis.
The mean error of the apple drawings had two significant factors. The main effect of the training group was not significant (\(p=0.636,\ F(1,20)=0.231\)). However, the main effect of the set was significant (\(p=5.58\times 10^{-7},\ F(1,454)=25.784\)), as was the interaction of training and set (\(p=0.0272,\ F(1,454)=4.913\)). Interestingly, study participants in the VF group increased their average distance from a dark pixel both during and after training, whereas participants using HSC had similar levels of error in sets 1 and 3.
The mixed design ANOVA was also applied to the error in the banana drawings, and the main effect of the training group was not significant (\(p=0.4611,\ F(1,20)=0.565\)). The main effect of the set was not significant either (\(p=0.1132,\ F(1,454)=2.514\)). The interaction effect of the set and training group was significant (\(p=0.00202,\ F(1,454)=9.643\)). This reflects the fact that the two groups performed similarly in the first set, but the VF group had lower error in the post-training set. This statistical result is unique to the banana drawings. The grayscale area on the interior of this drawing created a narrow gap between the virtual fixtures meant to constrain participants to the outer line and the fixtures meant to constrain them to the inner gray line. Participants would oscillate between the two fixtures because when they reached the midpoint of the region between a black pixel and a gray pixel, the direction of the robotic force would change. The lower error in set 3 may be a result of participants learning to stay in the interior of the drawing, where error will be low because a black or gray pixel is frequently nearby. The interior of this drawing is a relatively low-density area of the distribution used to represent the task, so this low-error area was not targeted by the HSC assistance, as the gray area would have relatively low impact on reducing the cost in Equation (5).
When the same mixed design ANOVA was applied to the error in the umbrella drawings, the main effects of set (\(p=0.071\), \(F(1,454)=3.27\)) and group (\(p=0.811\), \(F(1,20)=0.059\)) were not significant. The interaction effect of the training group and set was significant (\(p=0.0495\), \(F(1,454)=3.88\)). In the case of this drawing, the VF group had higher error in the post-training trial compared to the HSC group, though the two groups had similar baseline error.
The analysis of the error in the drawings of the house revealed a significant main effect of set (\(p=1.23\times 10^{-5}\), \(F(1,454)=19.545\)), but the interaction effect of set and training group (\(p=0.46\), \(F(1,454)=0.546\)) was not significant. The main effect of group also was not significant (\(p=0.728\), \(F(1,20)=0.124\)). Participants generally improved as a result of increased practice in this simple line drawing as opposed to the feedback from either of the training paradigms. As in the drawings of the apple and umbrella, the HSC group had lower error than the VF group in the post-training set, though they started at the same baseline error.
In three of the four drawings, the group that trained with virtual fixtures, which were designed to reduce error, actually performed worse during set 3 in terms of error compared to the group that trained with HSC. When we look at Figure 4, we can see that even when the virtual fixtures were engaged in set 2, the HSC group had lower error than the VF group when drawing each image except the umbrella.
These results demonstrate that even when feedback is based on spatial statistics, other standard measures like error can be improved though they are not directly targeted by the algorithm. One reason that error increases when participants train with virtual fixtures is that participants exploit these guides when they are present. For the participant shown in Figure 5, it is clear when they were drawing the apple, banana, and house that they found the virtual wall and followed it such that they maintained a consistent distance from the desired lines. When the virtual fixtures are removed, this bias remains. The offsets in the post-training drawings are similar to those we see in the drawings with virtual fixtures.
6.2 Ergodic Measure
Two-factor mixed design ANOVAs were used to assess the effects of the group (between-subjects) and set (within-subjects) on the ergodic measure defined in Section 5.3.2 for each image used in the study. The HSC group and VF group were evaluated based on the baseline trials (set 1) and the post-training trials (set 3) only. Set 2 was left out of the ANOVA so that effects of the assistance itself would not be measured in the analysis (Figure 6).
The factorial ANOVA of the ergodic measure on the apple image revealed that the interaction effect of group and set was the only significant factor (\(p=6.17\times 10^{-4},\ F(1,454)=11.889\)). The main effects of group and set were not significant for the apple drawings (\(p\gt 0.05\)). The HSC and VF group performed similarly in the baseline trials, but the HSC group performed slightly better after the training set.
When an analysis of variance was performed on the ergodicity of the banana drawings, again, there was no significant effect of group, set, or the interaction of those two factors (\(p\gt 0.05\)). Although the error measure and VF algorithm place equal weight on the black outline and the gray interior line of this drawing, the reference distribution (Figure 1) does not. Therefore, the ergodic measure does not improve when participants fill in these lower-density areas, and the HSC algorithm does not direct them there.
When the ergodicity of the trajectories drawing the umbrella was compared, the significant factors were the set (\(p=6.21\times 10^{-5},\ F(1,454)=16.34\)) and the interaction between group and set (\(p=1.95\times 10^{-4},\ F(1,454) = 14.11\)). The main effect of group was not significant (\(p=0.613,\ F(1,20)=0.264\)).
The group was not a significant factor affecting the ergodicity of the house drawings (\(p=0.238,\ F(1,20)=1.477\)). The main effect of set (\(p=0.73,\ F(1,454)=0.119\)) also was not significant, but the interaction of group and set (\(p=7.90\times 10^{-8},\ F(1,454) = 29.795\)) was significant.
The results of the ANOVA of ergodicity for three of the four drawings showed that the interaction effect of set and group was a significant factor—implying that while the participants started at the same performance level in their baseline set, participants in the HSC group attained a higher performance level in the post-training set than the VF group.
The differences in the ergodic measure imply that participants training with HSC generated trajectories that encoded more information about the original image. This is likely due to the assistance intervening based on a measure of overall performance as opposed to the local distance measure employed by the virtual fixtures. The virtual fixtures generally led participants to draw more slowly—not completing the image. Participants receiving feedback from HSC drew images that were smaller but more complete, as can be seen in the examples in Figure 7.
6.3 Completion Score
The mean completion score of each group in each set can be seen in Figure 8. The change in completion percentage of the two groups over the course of training was analyzed by performing mixed design ANOVAs on training group (between participants) and set (within participants) using the ratings on all four images. Only the baseline trials (set 1) and post-training trials (set 3) were used to avoid measuring the effects of the assistance itself in the analysis.
The completion percentage of the apple drawings had two significant factors. The main effect of the training group was not significant (\(p=0.400,\ F(1,20)=0.739\)). However, the main effect of the set was significant (\(p=0.00245,\ F(1,460)=9.276\)), as was the interaction of training and set (\(p=5.56\times 10^{-5},\ F(1,460)=16.556\)). Study participants in the VF group completed around the same amount of this drawing both before and after training, whereas participants using HSC completed 7% more on average.
The mixed design ANOVA was also applied to the completion score in the banana drawings, and the main effect of the training group was not significant (\(p=0.393,\ F(1,20)=0.761\)). However, the main effect of the set was significant (\(p=2.200\times 10^{-5},\ F(1,470)=18.373\)). The interaction effect of the set and training group was significant (\(p=0.0497,\ F(1,470)=3.871\)). This reflects the fact that the two groups performed similarly in the first set, but the VF group had completed more of the drawings in the post-training set. Both groups improved their completion scores post-training.
When the same mixed design ANOVA was applied to the completion scores in the umbrella drawings, the main effect of set (\(p=1.82\times 10^{-8},\ F(1,477)=32.79\)) and the interaction effect of the training group and set were significant (\(p=0.005,\ F(1,477)=7.67\)). The main effect of group (\(p=0.662,\ F(1,20)=0.197\)) was not significant. In the case of this drawing, the VF group and the HSC group had higher completion scores in the post-training trials compared to their baseline. Although the two groups had similar baseline completion scores, the HSC group achieved significantly higher scores than the VF group post-training.
The analysis of the completion score in the drawings of the house revealed that the main effects of set (\(p=0.162,\ F(1,455)=1.963\)) and group (\(p=0.284,\ F(1,20)=1.210\)) were not significant, but the interaction effect of the set and training group (\(p=0.003,\ F(1,455)=9.190\)) was significant. As in the drawings of the apple and umbrella, the HSC group had higher completion scores than the VF group in the post-training set, though they started at similar baseline scores. In this drawing in particular, the completion scores of the VF group actually went down in the post-training trials.
In three of the four drawings, the group that trained with virtual fixtures had lower completion scores on their drawings compared to the group that trained with HSC. When we look at Figure 8, we can see that the HSC group had much higher completion scores in set 2 and retained a modest advantage over the VF group in the post-training trials.
These results demonstrate that the HSC encouraged participants to take actions that improved the overall quality of the drawing rather than the accuracy of an individual pose.
7 Discussion and Conclusions
There are numerous articles stating the potential of robotics to support training and rehabilitation because of their ability to assist users in completing many repetitions and their ability to provide quantitative feedback. The questions of how robots should assist/resist users, how to define the task, and what metrics they should use to quantify success have a profound impact on the efficacy of training. Prior studies show that error, variability, and active user participation—achieved by adapting the robot support as needed—are crucial to motor learning. Furthermore, we know that there are many equally good solutions to a particular task that a person might use.
We have developed a hybrid shared controller that selectively rejects or accepts user actions based on how that action will affect the time-averaged statistics of the trajectory for some time into the future; that is, how the amount of task information present in the trajectory will be affected. Using this controller, a robot can provide physical corrective feedback during training while avoiding issues of time dependence and selection of a particular strategy to complete a task. When user inputs increase the ergodicity of the trajectory with respect to a distribution defining the task, the controller is transparent to the user. Otherwise, user inputs are rejected by providing an equal and opposite force—maintaining a constant velocity. In our study, we experimentally compare this novel assistance paradigm, ergodic hybrid shared control, to a standard form of assistance based on error.
Our results demonstrated that although ergodic HSC is based on a global measure of the distance from ergodicity rather than a local measure of error, it improved the error measure. Participants who trained with the error-based assistance actually performed worse in terms of error than those who trained with ergodic HSC. Because the virtual fixtures provide feedback based on distance from a local point on the desired path, there is a tendency for participants to follow the virtual fixtures. This leads to drawings that are precise but not accurate, following the same incorrect path over multiple trials.
Or, when two desired points are close together on the desired path, as seen in the drawings of the banana, the user can oscillate between those points, achieving low error without following the intended paths at all. This emphasizes the fact that error is a limited measure that cannot capture facets of broader goals. Although error was higher in the VF-trained participant drawings, one would not say that their skill in reproducing the drawings was necessarily poor. If the reference had been built from many example drawings, they likely would have fallen within range of one or more of the samples. The banana can be viewed as an example of how specifications using a single trajectory can over-prescribe a task. The gray line in the middle is a minor detail, but it is weighted equally with other dark points when the task is specified as a path. Ergodic HSC treats the gray line as less important in the context of the overall goal, just as it would if one had combined multiple demonstrations to form the task definition and a majority of the demonstrations did not include this detail. The fact that a detail appears in only a few demonstrations in an ensemble implies that it may not be necessary to complete the task.
When evaluating the impact of training in terms of the time-averaged statistics of the participants’ trajectories, we found that there was a significant advantage to training with ergodic HSC. The group that trained with virtual fixtures produced consistently incomplete drawings, whereas the group receiving feedback based on the ergodic measure produced drawings that were smaller than the original image, but more complete. This difference emphasizes the need for assistance based on global measures as opposed to local interactions. Combining the ergodic measure with the dynamics of the system in the MIG allows ergodic HSC to be sensitive to time without being dependent on time like some other assistance strategies.
Timed drawing is not a daily task that people need assistance with or training for. Nevertheless, it shares characteristics with tasks such as cleaning, personal care, cooking, and reaching. In these activities of daily living (ADLs), completing the task in a way that is “good enough” can be more valuable than partially completing the task with a highly accurate, precise movement. For instance, when cleaning a spill on your countertop with a sponge, you could use a back-and-forth scanning motion, but the precise trajectory would not be very important, and you could overreach or underreach the edge of the spill while still removing most of it from the surface. Reference distributions for these types of tasks can be developed with relatively few task demonstrations using the ergodic imitation framework presented in [24]. For activities that require higher accuracy, such as self-feeding, a single reference distribution may lead to an ergodic controller that fails to adequately support the task. If the task is segmented, task segments requiring high accuracy, like reaching the mouth in self-feeding, could be represented with distributions that approach a delta function as more demonstrations are included in the training data. Alternatively, one could specify accuracy-oriented tasks using trajectories; we do not claim that all tasks should be represented using distributions. Yet many tasks can be, and some tasks cannot be represented well using trajectories.
The combination of the control approach presented here with ergodic imitation (e.g., combining multiple demonstrations into a single distribution and executing ergodic HSC with respect to that distribution) could enable one to develop a robotic system that can rapidly take demonstrations from a teacher and return an autonomous training assistant that accounts for both the task goal and the natural variability that one should expect in typical human motion.