7.1 Metrics and Results
System features were available to the operators to aid task completion. Global clutter percentage, the percentage of display area obstructed by all displayed objects, was generally lower for the IA visualization than for the Collective visualization. IA interface operators using the \(M_{2}\) model had lower global clutter percentages compared to those using \(M_{3}\), and Collective interface operators in general had lower global clutter percentages using \(M_{2}\). The statistical test details were provided in Section 5.1.
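As a rough illustration of how such a clutter percentage can be computed, the sketch below rasterizes object bounding boxes onto a pixel grid and measures the covered fraction of the display. The rectangular bounding boxes, pixel-grid approach, and function name are assumptions for illustration; the evaluation's actual clutter computation is not detailed here.

```python
# Hedged sketch: global clutter as the percentage of display area
# obstructed by the union of all displayed objects. Bounding boxes and
# the brute-force pixel grid are illustrative assumptions.

def global_clutter_pct(objects, width, height):
    """objects: iterable of (x, y, w, h) bounding boxes in pixels.
    Overlapping objects are counted once via the union of covered pixels."""
    covered = set()
    for x, y, w, h in objects:
        for px in range(max(0, int(x)), min(int(x + w), width)):
            for py in range(max(0, int(y)), min(int(y + h), height)):
                covered.add((px, py))
    return 100.0 * len(covered) / (width * height)
```

Note that under this union-based definition, two fully overlapping 10 x 10 windows on a 100 x 100 display contribute 1% clutter, not 2%.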
The Euclidean
distance (pixels) between the SA probe interest and where the operator was interacting with the visualization indicated where operators focused their attention. Euclidean distance can be used to assess the effectiveness of the object placements on the display. Larger distances are not ideal, because more time [18] and effort are required to locate and interact with the object. The first step in calculating the Euclidean distance was to determine which collective or target was of interest in an SA probe question. For example, Target 3 is the target of interest for the question: “What collectives are investigating Target 3?” The second step was to determine where the operator was interacting with the system (i.e., clicking on the interface). Operators from both visualizations using
\(M_{2}\) had shorter Euclidean distances compared to
\(M_{3}\). However, Collective interface operators using \(M_{3}\) had shorter Euclidean distances at all timings for \(SA_{3}\), as well as 15 seconds before asking and during response to an SA probe question for \(SA_{1}\).
The Euclidean distance between the SA probe interest and operator clicks for each combined model and visualization is shown in Figure
16. IA interface operators had significantly different Euclidean distances between the SA probe interest and their interaction between models 15 seconds before asking, while being asked, and during response to an SA probe question for
\(SA_{O}\) and
\(SA_{2}\). A significant difference between models occurred for Collective interface operators 15 seconds before asking an
\(SA_{2}\) probe question. Between-visualizations Mann-Whitney-Wilcoxon tests found significant effects using
\(M_{2}\) 15 seconds before asking an SA probe question for
\(SA_{O}\) (
N = 557,
U = 43,303,
p = 0.02) and
\(SA_{1}\) (
N = 273,
U = 10,577,
p = 0.05). A moderately significant effect between visualizations using
\(M_{2}\) while being asked an SA probe question was found for
\(SA_{O}\) (
N = 464,
U = 31,052,
p \(\lt\) 0.01) and a significant effect for
\(SA_{1}\) (
N = 229,
U = 7,645,
p = 0.01). A significant effect between visualizations using
\(M_{2}\) during response to an SA probe question was found for
\(SA_{O}\) (
N = 499,
U = 35,029,
p = 0.02). Shorter Euclidean distances occurred when IA interface operators used
\(M_{2}\), while Collective interface operators had shorter Euclidean distances using
\(M_{3}\). The Spearman correlation analysis found a weak correlation between SA probe accuracy and the Euclidean distance between the SA probe’s interest and the operators’ clicks for the IA interface operators using
\(M_{2}\) 15 seconds before asking an SA probe question for
\(SA_{1}\) (
r =
\(-\)0.18,
p = 0.04). Weak correlations were found for the Collective interface operators using
\(M_{3}\) for
\(SA_{O}\) while being asked (
r = 0.14,
p = 0.04) and during response to an SA probe question (
r = 0.16,
p = 0.01).
The percentage of time an operator was in the middle of an action during an SA probe question identified how often operators were interrupted by the secondary task. Distracted operators may have needed more time to focus their attention on the SA probe question or may have prioritized their interaction over answering the SA probe question. Understanding how distractions may have negatively influenced operator behavior is needed to design the system to promote effective human-collective interactions. Operators using \(M_{2}\) were interrupted less frequently by the SA probe question compared to those using \(M_{3}\) irrespective of the visualization. The percentage of times operators using either visualization were in the middle of an action during an SA probe question for both models for all decision difficulties had min = 0 and max = 100. IA interface operators at \(SA_{O}\), \(SA_{1}\), and \(SA_{2}\) had median = 0 and at \(SA_{3}\) median = 50, while Collective interface operators had median = 100 at all decision difficulties. The percentage of times operators from both evaluations were in the middle of an action during an SA probe question was significantly different between models for \(SA_{O}\), \(SA_{1}\), and \(SA_{2}\). Between-visualizations Mann-Whitney-Wilcoxon tests identified highly significant effects using \(M_{2}\) for \(SA_{O}\) (N = 670, U = 74,938, p \(\lt\) 0.001), \(SA_{1}\) (N = 294, U = 14,595, p \(\lt\) 0.001), \(SA_{2}\) (N = 224, U = 8,344, p \(\lt\) 0.001), and \(SA_{3}\) (N = 152, U = 3,780, p \(\lt\) 0.001). Highly significant effects between visualizations using \(M_{3}\) were found for \(SA_{O}\) (N = 672, U = 78,456, p \(\lt\) 0.001), \(SA_{1}\) (N = 253, U = 10,944, p \(\lt\) 0.001), \(SA_{2}\) (N = 252, U = 11,172, p \(\lt\) 0.001), and \(SA_{3}\) (N = 167, U = 4,678, p \(\lt\) 0.001). IA interface operators were interrupted less frequently by SA probe questions. 
The Spearman correlation analysis revealed weak correlations between the middle of an action during an SA probe and SA probe accuracy for the IA interface operators using \(M_{2}\) for \(SA_{1}\) (r = \(-\)0.22, p \(\lt\) 0.01) as well as \(M_{3}\) for \(SA_{2}\) (r = 0.19, p = 0.05) and \(SA_{3}\) (r = \(-\)0.33, p \(\lt\) 0.01). A weak correlation was revealed for the Collective interface operators using \(M_{2}\) for \(SA_{3}\) (r = 0.24, p = 0.05).
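The correlation analysis pairs a behavioral measure with SA probe accuracy for each probe. A minimal rank-based sketch of the Spearman coefficient, with the flat list-of-values data layout assumed for illustration:

```python
def spearman_r(x, y):
    """Spearman rank correlation: Pearson correlation of the two
    samples' ranks, using average ranks for ties."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        i = 0
        while i < len(order):
            j = i
            while j < len(order) and v[order[j]] == v[order[i]]:
                j += 1
            avg = (i + j + 1) / 2.0  # average of 1-based ranks i+1..j
            for k in range(i, j):
                r[order[k]] = avg
            i = j
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Coefficients with |r| below roughly 0.3, like those reported above, are conventionally labeled weak, consistent with the text's interpretation.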
The percentage of times a participant
completed an interrupted SA probe action identified how often operators returned to their previous task. A system designed to bring an operator back into the loop via engaging prompts, such as the dynamic individual entity behaviors or opacity of support for targets, can mitigate poor human-collective interactions and performance. A system that is easy to remember is desirable to attain optimal operator behavior [
29]. IA interface operators using
\(M_{3}\) were able to complete 100% of their interrupted actions compared to those using
\(M_{2}\), while Collective interface operators using
\(M_{2}\) completed approximately 99% of their interrupted actions. The percentage of completed interrupted SA probe actions for both models at all decision difficulties had median = 100 and max = 100. IA interface operators using
\(M_{3}\) had median = 100 at all decision difficulties, while Collective interface operators had median = 0. IA interface operators using
\(M_{2}\) had median = 0 at
\(SA_{O}\),
\(SA_{2}\), and
\(SA_{3}\) with a median = 100 at
\(SA_{1}\). Collective interface operators had median = 0 at
\(SA_{O}\),
\(SA_{1}\), and
\(SA_{2}\) with a median = 100 at
\(SA_{3}\). Significant differences existed between models for the IA interface operators for
\(SA_{O}\), while no differences existed for the Collective interface operators. Between-visualizations Mann-Whitney-Wilcoxon tests found a significant effect when using
\(M_{3}\) for
\(SA_{1}\) (
N = 253,
U = 55,608,
p = 0.03). IA interface operators completed more interrupted actions compared to Collective interface operators. No correlations were found between the completed interrupted SA probe actions and SA probe accuracy.
The
investigate command permitted increasing a collective’s support for an operator-specified target. Additional support for the same target was achieved by reissuing the investigate command. Generally, operators using
\(M_{2}\) and the Collective visualization issued fewer investigate commands. The number of investigate commands issued per decision for each combined model and visualization is shown in Figure
17(a). Significant differences were found between models for the number of investigate commands issued per decision for both visualizations at all decision difficulties. Additional between-visualizations Mann-Whitney-Wilcoxon tests identified a moderately significant effect when using
\(M_{2}\) for overall decisions (
N = 672,
U = 63,866,
p \(\lt\) 0.01) and a highly significant effect for hard decisions (
N = 298,
U = 14,066,
p \(\lt\) 0.001). Highly significant effects between visualizations when using
\(M_{3}\) were also found for overall (
U = 17,990,
p \(\lt\) 0.001), easy (
N = 396,
U = 6,279.5,
p \(\lt\) 0.001), and hard decisions (
N = 276,
U = 2,331.5,
p \(\lt\) 0.001).
The abandon command permitted decreasing a collective’s support for a target. The abandon command only needed to be issued once for the collective to ignore a specified target for the duration of a decision. Operators using \(M_{2}\) in general issued fewer abandon commands compared to \(M_{3}\); however, IA interface operators using \(M_{3}\) issued fewer abandon commands for hard decisions. The number of abandon commands issued per decision for both models at all decision difficulties had median = 0 and min = 0. IA interface operators using \(M_{2}\) had max = 8 for overall and hard decisions, while \(M_{3}\) resulted in a max = 2 for overall and easy decisions. The Collective interface operators using both models for overall and hard decisions had max = 2, as well as for easy decisions using \(M_{3}\). Operators using \(M_{2}\) for easy decisions had max = 1. Significant differences were found between models for the number of abandon commands issued per decision with both visualizations for overall and easy decisions. No significant effects between visualizations were found. IA interface operators issued fewer abandon commands compared to Collective interface operators. Collective interface operators using \(M_{2}\) issued fewer abandon commands for overall and hard decisions only.
A collective’s entities stopped exploring targets and moved to the operator selected target when the
decide command was issued. A decide request required at least 30% of the collective support for the operator-specified target. Collectives that reached 50% support for a target transitioned into the executing state and the operator was no longer able to influence the collective behavior. IA interface operators using
\(M_{2}\) issued fewer decide commands compared to those using
\(M_{3}\) or the Collective visualization. The number of decide commands issued per decision for each combined model and visualization is shown in Figure
17(b). Significant differences were found between models for the number of decide commands issued per decision for both visualizations at all decision difficulties. Between-visualizations Mann-Whitney-Wilcoxon tests found a moderately significant effect using
\(M_{2}\) for overall (
N = 672,
U = 63,968,
p \(\lt\) 0.01) and a highly significant effect for easy decisions (
N = 374,
U = 21,014,
p \(\lt\) 0.001). A moderately significant effect between visualizations using
\(M_{3}\) was found for overall decisions (
U = 57,952,
p \(\lt\) 0.01) and a significant effect existed for easy decisions (
N = 377,
U = 19,997,
p = 0.05).
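The support thresholds described above (30% to accept a decide request, 50% for autonomous execution) can be summarized in a small gating sketch; the function and state names are illustrative assumptions, not the evaluation software's API.

```python
# Hedged sketch of the decide-command gating: a decide request needs
# >= 30% collective support for the operator-specified target, and at
# >= 50% support the collective transitions to executing, after which
# the operator can no longer influence its behavior.

COMMIT_THRESHOLD = 0.30
EXECUTE_THRESHOLD = 0.50

def handle_decide(support, target):
    """support: dict mapping target id -> fraction of collective support.
    Returns the resulting collective state for this decide request."""
    s = support.get(target, 0.0)
    if s >= EXECUTE_THRESHOLD:
        return "executing"   # operator influence no longer possible
    if s >= COMMIT_THRESHOLD:
        return "deciding"    # decide request accepted
    return "rejected"        # insufficient support for a decide
```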
Collective right-clicks and
target right-clicks allowed the operator to access the information windows, which provided the number of individual entities in each state and the percentage of collective support for each target.
Operators using \(M_{3}\) made fewer collective and target right-clicks compared to those using \(M_{2}\), while Collective interface operators made fewer target right-clicks compared to IA interface operators. The statistical analyses of both metrics were provided in Section
6.1.
Metrics showing how operators used the abandon command were assessed. IA interface operators using
\(M_{3}\) abandoned the
highest-value target less frequently and had fewer
abandoned target information windows open. The statistical analyses of both metrics were provided in Section
6.1. Instances may have occurred when the operator accidentally issued an undesired abandon command or repeatedly issued the abandon command, although targets were abandoned after a single command. The percent of times
abandon commands exceeded abandoned targets was examined. Operators using
\(M_{2}\) issued fewer repeated abandon commands compared to
\(M_{3}\). The percent of times abandon commands exceeded abandoned targets for each combined model and visualization is shown in Figure
18. Significant differences were found between models for the percent of times abandon commands exceeded abandoned targets with both visualizations for overall and hard decisions. No significant effects between visualizations were found. IA interface operators had fewer repeated abandon commands compared to Collective interface operators. Collective interface operators using
\(M_{3}\) had fewer repeated abandon commands for overall and hard decisions.
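The repeated-command metric compares, per decision, how many abandon commands were issued against how many targets were actually abandoned; since one command suffices per target, any excess indicates a repeated or accidental command. A small sketch, with the per-decision count lists assumed as the data layout:

```python
def repeated_abandon_pct(commands_per_decision, abandoned_per_decision):
    """Percent of decisions in which more abandon commands were issued
    than targets were abandoned (i.e., at least one repeat, since a
    single command abandons a target for the whole decision)."""
    pairs = list(zip(commands_per_decision, abandoned_per_decision))
    repeats = sum(1 for cmds, abandoned in pairs if cmds > abandoned)
    return 100.0 * repeats / len(pairs)
```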
The
time difference (minutes) between the commit state and issued decide command assessed the operator’s ability to predict the collective’s future transition from the committed state (30% support for a target) to executing (50% support for a target). Operators using
\(M_{3}\) issued decide commands sooner after the commit state than those using \(M_{2}\). The time difference between the commit state and issued decide command for each combined model and visualization is shown in Figure
19. Significant differences existed between models for the time difference between the commit state and decide command for both visualizations at all decision difficulties. Collective interface operators had smaller time differences between the committed state and issued decide commands compared to those using the IA visualization; however, no significant effects between visualizations were found. IA interface operators using
\(M_{2}\) had smaller time differences between the commit state and decide command for hard decisions.
Further analysis of how operators used the collective and target information windows was conducted. The average number of times target information windows were opened per target per decision identified the
average frequency at which the information windows were accessed. Operators using
\(M_{3}\) in general accessed target information windows less frequently compared to
\(M_{2}\). However, target information windows were accessed less frequently by operators from both evaluations using \(M_{2}\) for easy decisions. The average frequency of accessed target information windows for each combined model and visualization is shown in Figure
20(a). IA interface operators had significantly different average frequencies of accessed target information windows between models for hard decisions, while the Collective interface operators had no significant differences between models. Additional between-visualizations Mann-Whitney-Wilcoxon tests identified a significant effect when using
\(M_{2}\) for overall decisions (
N = 619,
U = 42,857,
p = 0.02) and a moderately significant effect for hard decisions (
N = 282,
U = 7,908.5,
p \(\lt\) 0.01). Operators using the Collective visualization accessed target information windows less frequently compared to the IA visualization.
Operators using the target information windows may have used them frequently for short time periods or left them open for long time periods. The average percentage of
time a target information window was open per target relative to the decision time for each combined model and visualization is shown in Figure
20(b). IA interface operators using
\(M_{2}\) left target information windows open for shorter time periods. Significant differences were found between models for the average time target information windows were open for both visualizations at all decision difficulties; however, no significant effects between visualizations were found.
Particular information windows may have been accessed more frequently for longer time periods. The average percentage of
time the decision target information window was open relative to the decision time for each combined model and visualization is shown in Figure
21(a). Operators using
\(M_{2}\) left the decision target information window open for shorter periods of time compared to
\(M_{3}\). Significant differences were found between models for the time the decision target information window was open for both visualizations at all decision difficulties. Between-visualizations Mann-Whitney-Wilcoxon tests found a highly significant effect using
\(M_{2}\) for overall decisions (
N = 672,
U = 65,102,
p \(\lt\) 0.001), as well as significant effects for easy (
N = 374,
U = 20,114,
p = 0.01), and hard decisions (
N = 298,
U = 12,832,
p = 0.02). A moderately significant effect between visualizations using
\(M_{3}\) was found for overall decisions (
U = 48,749,
p \(\lt\) 0.01), with significant effects for easy (
N = 396,
U = 17,095,
p = 0.03) and hard decisions (
N = 276,
U = 8,157,
p = 0.04). IA interface operators using
\(M_{2}\) left the decision target information window open for shorter time periods compared to
\(M_{3}\), while the Collective interface operators had shorter time periods using
\(M_{3}\).
The average percentage of
time the decision collective information window was open relative to the decision time for each combined model and visualization is shown in Figure
21(b). The time the decision collective information window was open was only assessed for the IA evaluation, because the Collective evaluation did not record which collective window was opened or closed. IA interface operators using
\(M_{3}\) left the decision collective information window open for shorter time periods compared to
\(M_{2}\). IA interface operators had significantly different times for hard decisions.
The post-trial questionnaire assessed the
perceived effectiveness of each request type (investigate, abandon, and decide), rated from not effective (1) to very effective (7). The investigate, abandon, and decide commands were ranked as more effective by operators using
\(M_{3}\) when compared to those using
\(M_{2}\). Collective interface operators using
\(M_{2}\) ranked abandon effectiveness higher. The post-trial effectiveness rankings for each combined model and visualization are shown in Figure
22. Significant differences between models were found in IA interface operator rankings for the decide command and in Collective interface operator rankings for both the abandon and decide commands. Between-visualizations Mann-Whitney-Wilcoxon tests identified a moderately significant effect for the abandon effectiveness using
\(M_{2}\) (
N = 56,
U = 554.5,
p \(\lt\) 0.01). IA interface operators using
\(M_{3}\) ranked investigate, abandon, and decide effectiveness higher compared to those using the Collective visualization, while Collective interface operators ranked abandon effectiveness higher using
\(M_{2}\).
The post-experiment questionnaire assessed the collective’s
responsiveness to requests, the participants’
ability to choose the highest-valued target, and their
understanding of the collective behavior. IA interface operators using
\(M_{2}\) ranked collective responsiveness, operator ability, and understanding highest compared to those using
\(M_{3}\). Collective interface operators ranked the collective’s responsiveness highest using
\(M_{3}\), while operator ability and understanding were highest using
\(M_{2}\). Statistical test details were provided in Section
5.1.
A summary of
\(R_{3}\)’s results by the hypotheses, with significant results identified, is provided in Table
7. This summary table is intended to facilitate the discussion.
7.2 Discussion
The analysis of which combined model and visualization promoted better usability suggests that the IA visualization promoted transparency more effectively than the Collective visualization, while both models had advantages and disadvantages. Operators using the \(M_{2}\) model had less global clutter (due to target information windows being open for less time) and smaller Euclidean distances between the interest of an SA probe question and their interaction, were able to complete interrupted actions after answering an SA probe question, and issued fewer abandon and decide commands. \(H_{6}\), which hypothesized that \(M_{2}\) and the Collective visualization would promote better usability by being more predictable and explainable, was not supported. Operators with both interfaces using \(M_{2}\) abandoned the highest-valued target more frequently, which may have occurred due to misunderstanding or poor SA. IA interface operators using \(M_{2}\) were not as timely at predicting when a collective was committed to a target and had the decision collective information window open for a longer time duration (i.e., lower explainability) compared to using \(M_{3}\). The Collective evaluation did not record which collectives were right-clicked on, which impeded the ability to associate right-clicks with a collective; however, a similar reliance on the decision collective information windows may have occurred considering how the Collective interface operators used the target information windows. Future evaluations will validate Collective interface operator usability behavior.
The Collective visualization enabled operators to complete actions prior to an SA probe question and to issue decide commands shortly after a collective was committed to a target. \(H_{6}\) was not supported for the Collective visualization, since more highest-valued targets were abandoned. The continuous display of collective and target information windows promoted higher SA performance for the Collective interface operators using both models. The reliance on the information provided in the pop-up windows suggests that the information was more explainable and reliable than the information provided on the collective icons. Incorporating the numerical percentage of support from the respective collective on a target icon or identifying the most favored target on a collective hub may reduce the reliance on the information windows and simultaneously improve SA by mitigating potential observability issues if the operator must interact with more collectives.
IA interface operators using \(M_{3}\) and Collective interface operators using \(M_{2}\) were able to complete actions that were interrupted by an SA probe question 99% of the time. The memorability of both models and visualizations enabled operators to return to their previous task after answering the SA probe question, because of the required operator engagement (\(M_{3}\)) and established expectations of collective behaviors (\(M_{2}\)). The predictability of \(M_{3}\) and the Collective visualization justified issuing decide commands shortly after collectives were in a committed state; however, this finding may be biased for \(M_{3}\), because of the required operator influence to achieve the task. The same bias can be attributed to the command effectiveness rankings, which were higher for \(M_{3}\). The IA interface operators’ ability to identify objects on the visualization may have been impeded by displaying all of the individual entities, collective and target icons, and collective and target information windows when the SA probe question inquired about an object further away from the center of the operator’s current attentional focus. Asking SA probe questions about objects at various distances from the operator’s current focal point is necessary to understand how clutter, or moving individual entities, may affect the operator’s ability to identify the object of interest and impact SA performance.
\(H_{7}\), which hypothesized that operators using \(M_{2}\) and the Collective visualization would require fewer interactions, was not supported. \(M_{2}\) enabled fewer commands compared to \(M_{3}\). The IA visualization enabled fewer abandon and decide commands. Collective interface operators using \(M_{2}\) had better decision-making performance when more investigate commands were issued. Issuing more investigate commands for high-value targets located further from the hub may suggest that the interaction delay embedded in \(M_{2}\), designed to reduce the impacts of environmental bias and improve the success of choosing the ground truth best targets, may not have accommodated operators’ expectations if lower-valued targets were being favored solely because they were closer to the hub. Collective interface operators who issued more commands may have wanted control and directability over the decision-making, which may have occurred due to lower trust or misunderstanding. Investigations are needed to determine if and how trust may influence operators. Operators used different strategies to fulfill the task; however, the most successful strategy promoted more consensus decision-making (i.e., investigate commands), as opposed to prohibiting exploration of targets (i.e., abandon commands). Understanding how operators used commands is necessary to promote effective interactions and produce desired human-collective performance.
The transparency embedded in the \(M_{2}\) and Collective visualization combination did not support the best overall system usability. The IA visualization promoted less clutter, by alleviating the dependence on the collective and target information windows, and promoted fewer interactions. Modifications to both \(M_{2}\) and the Collective visualization must be made to mitigate the highest-valued target being abandoned more frequently, as well as reduce the reliance on the information windows. The assumption that fewer interactions are optimal may not be accurate for all decision difficulties, such as hard decisions. Understanding strategies and justifications for more interactions is necessary to promote transparency that aids operators during particular situations and results in higher human-collective performance.