
1 Introduction

Users often perceive security as a barrier that interferes with their productivity [21]. They experience weariness or reluctance towards security, as well as frustration, denial, complacency or a sense of being overwhelmed [52]. User discontent has been observed when users are forced to adhere to password policies [27, 30], as has annoyance at the shift to stricter password policies [42, 49].

A few recent studies have begun to measure and investigate the impact of users’ current state of cognition and emotion during security and privacy decision-making [10]. These include the effects of cognitive depletion [24], fear and stress [19] and a prior effortful security task [14] on password choice, fear with respect to privacy evaluations, and happiness with respect to wilful self-disclosure or sharing [13]. However, cyber security and privacy research has yet to investigate the influence of anger in decision-making. This paper reports on a study aiming to address this gap.

Although there has been demonstrated user and research preference for not having the burden of a portfolio of passwords [18, 51], passwords are still the cheapest and most common method of authentication. They are unlikely to disappear in the near future, and password choice research presents a typical scenario where users make security decisions as a secondary task. The results of such user studies can further inform the landscape of usable security research. In addition, since password research has benefitted from much effort, research synthesis can be conducted across investigations.

1.1 Contributions

In this paper we reproduced methods already employed in the context of user-password research, such as those in [10, 14, 19]. The paper provides empirical evidence of the effects of incidental anger emotion on password choice. It demonstrates risk-seeking choices in a security context, as a direct consequence of an external source of anger, in a lab study. It also compares the effects on password strength of anger from this study with those of the fear and Captcha stimuli from previous studies with similar measurements.

1.2 Outline

After the introduction, we provide background research in the area of emotion influences and password research. We then provide the aim and methodology of our study including ethics. We follow with the results and discussion before completing with the conclusion.

2 Background

2.1 Influence of Emotion

While affect (as expressed emotion) is thought to impact judgment and decision-making [47, 57], research into affect influences in cyber security [19] and privacy [9, 13] is relatively new. Fear and anger emotions are particularly important because of their influence on threat appraisal, on risk perception [35], and on coping strategies influencing behaviour [4, 61], and because of their likely mis-attribution [57].

Fear vs. Anger. Although both negative emotions, fear and anger differ in the appraisal themes of certainty and control, and have opposing effects on risk perception [35]. Anger produced in one situation carries over to a wide range of other situations, increasing both optimistic expectations for one’s future and the likelihood of making risk-seeking choices. Fear, on the other hand, leads to more pessimistic expectations and more risk-avoidant choices [33, 37]. Fear also decreases the human’s perceived ability to exercise control whereas anger increases one’s perceived ability.

Frustration is a precursor to anger [4, 5, 7] and anger produced in one situation carries over to a wide range of other situations, increasing both optimistic expectations for one’s future and the likelihood of making risk-seeking choices [35].

Incidental vs. Integral Emotion. While emotions are often an outcome of a situation [38], human judgment and decisions can also be based on fleeting incidental emotions that become the basis for future decisions and hence outlive the original cause for the behaviour [34].

We distinguish between emotional experiences that are either (1) normatively relevant to present judgments and choices [34], defined from a consequentialist perspective [26], or experienced feelings about a stimulus [44], that is integral affect, or (2) normatively unrelated to the decision at hand [26, 34, 46] or independent of a stimulus [44], but can be mis-attributed to it and influence decision processes [48], that is incidental affect. The impact of incidental emotions on decision-making is well established [57], and they have been shown to influence how much people eat [25], help [39], trust [15], procrastinate [53], or price different products [36].

Duration of Emotion. Emotions are processes that unfold over time and unlike a mood, an emotional experience is elicited by a certain event and has a clear onset point [6]. The duration of an emotional episode can be defined as the amount of time between this onset point and the first moment the emotional experience is no longer felt [55]. Research has empirically shown that emotions generally last from a couple of seconds up to several hours. In addition, it is thought that the duration of an emotional response is positively related to the duration of the eliciting event [20].

2.2 Text Passwords

Although text passwords are the cheapest and most commonly used method of computer authentication, a large proportion of users are frustrated when forced to comply with password policies such as monthly reset [29]. Users may therefore develop habits to cope with the situation, for example password re-use [22], writing passwords down, incrementing the number in the password at each reset [1], storing passwords in electronic files, and reusing or recycling old passwords [29].

On average, a user has 6.5 passwords, each shared across 3.9 different sites; each user has about 25 accounts requiring passwords and types 8 passwords per day [17].

3 Aim

We investigate the main RQ: How does anger emotion influence password choice?

3.1 Impact on Password Strength

While security has been described as being ‘irritating’, ‘annoying’, and ‘frustrating’, together with being cumbersome, and overwhelming [52], annoyance has also been associated with a shift to stricter password policies [42], with a result of more guessable password choices for the latter.

Frustration, irritation, annoyance are expressions of anger [4, 5, 7] where anger is one of the measurable emotions.

We investigate incidental anger that can be induced from requirements for security compliance, human-computer interaction design, or any incidental life situation.

Question 1

(RQ-P). How does incidental anger emotion influence password strength?

\(\mathsf {H_{P,0}} \): There is no difference in password strength between users induced with incidental anger emotion and those with neutral emotion.

\(\mathsf {H_{P,1}} \): There is a significant difference in password strength between users induced with incidental anger emotion and those with neutral emotion.

3.2 Emotion Induced

Mood Induction Protocol [40, 59] is a process for inducing emotions during user studies. The common methods are film stimuli [45], autobiographical recall [40] or music and guided imagery together [31, 59]. Film/video stimuli produce the largest effect sizes (magnitude of the impact) [23, 45, 59].

Question 2

(RQ-E). How does [anger/neutral] video stimulus impact reported anger?

\(\mathsf {H_{E,0}} \): There is no difference in reported anger between users induced with incidental anger emotion and those with neutral emotion.

\(\mathsf {H_{E,1}} \): There is a significant difference in reported anger between users induced with incidental anger emotion and those with neutral emotion.

3.3 Password Reuse and Strategy

We investigate whether there is any difference behind password choices across the two conditions.

Question 3

(RQ-R). How do password reuse and password strategy differ across the two conditions?

3.4 Treatment Comparison

We evaluate how the effects of anger in the current study compare with those of previous studies with fear stimulus [19] and Captcha stimulus [14].

Question 4

(RQ-C). How does the effect of anger stimulus on password strength compare with other treatments?

4 Methodology

We designed a between-subject lab experiment with \(N=56\). We follow the good practice guidelines for empirical research in security and privacy [11, 12, 41, 43], founded on scientific hallmarks. We used standard questionnaires and methods, as well as reproduced methods employed before [14, 24]. We define research questions and hypotheses at the fore and discuss limitations. We follow the standard APA Guidelines [3] to report statistical analyses, and we report on effect sizes, assumptions and test constraints.

4.1 Participants

Participants were recruited from the Newcastle University student population, via departmental email and flyers. With the study lasting on average 20 min, participants were remunerated £10 for their time.

The \(N=56\) participants consisted of 21 female, 34 male and 1 identified as other gender. The mean age \(=26.95\), \(sd=9.345\). \(55.4\%\) of the participants had an undergraduate education level, \(17.9\%\) postgraduate, \(17.9\%\) further education (PhD), \(7.1\%\) secondary school while \(1.8\%\) did not choose. \(30\%\) of the participants reported a computer science related education background.

We employed a randomised block sample design, similar to a previous lab experiment in the same context [14]. While we expected an equal number of participants in each condition, at analysis we removed 3 participants who showed signs of not going through the experiment protocol. We consequently ended up with 27 participants in the control group and 29 in the experimental group.

4.2 Procedure

The procedure consisted of (1) pre-task questionnaires for demographics and emotion, (2) a manipulation to induce anger emotion versus a neutral state, (3) a password entry for a mock-up GMail registration, (4) a manipulation check on emotion induced, and (5) a debriefing. Figure 1 depicts the experiment design.

We designed the GMail registration task to mimic Google Email registration online. Similar to the real online policy, we suggested passwords of at least 8 characters long, including digits, uppercase letters and symbols.

Fig. 1. Experiment design.

4.3 Manipulation

We employed a Mood Induction Protocol (MIP) [40, 59] via standard film stimuli to induce either the emotion of anger or a neutral emotional state [23, 45]. Video stimuli are one of the most effective methods of inducing emotions in lab studies [8, 59], with Ray [45] and Hewig et al. [28] providing validated lists of such stimuli.

Apparatus. The duration of an emotion is influenced by characteristics of the emotion-eliciting event such as its duration, by characteristics of the emotion itself, and by characteristics of the person experiencing the emotion [54]. Because the length of the chosen video clip influences how long the emotion will last during the study, we chose two clips of similar length. We chose a clip from the movie Witness lasting for 91 s for the anger stimulus and a clip from Hannah and Her Sisters lasting for 92 s for the neutral stimulus, both from a database of standard video clips for MIP [28, 45].

Manipulation Check. We employ emotion elicitation methods as a manipulation check to verify whether the manipulation was successful. In the debriefing questionnaire, we queried participants on the video. We asked for a freeform text response to “After watching the video, what emotions did you feel?” In addition to qualitatively looking into the emotions reported, we use IBM’s Tone Analyzer as a tool to compute participants’ emotional tone from their self-reports. IBM’s Tone Analyzer service uses linguistic analysis to detect joy, fear, sadness, anger, analytical, confident and tentative tones found in text.

We also administered a standard questionnaire, the Positive and Negative Affect Schedule (PANAS-X) [58], both at the beginning of the study and after manipulation and GMail registration to enable evaluation of the difference in affect state caused by the stimulus. We chose to administer the second one after the GMail registration, since we used a 60-item questionnaire which can be long enough to dilute the stimulus effect on the password strength. We set the time boundary of the elicitation to “How do you feel right now?” and used the full 60-item PANAS-X scale based on 5-point Likert items anchored on 1 - “very slightly or not at all”, 2 - “a little”, 3 - “moderately”, 4 - “quite a bit” and 5 - “extremely”.

4.4 Measurement

We measured the dependent variable (DV) password strength via \(\mathsf {log_{10}} \) number of password guesses and an ordinal value from 0 to 4 of password strength via zxcvbn [60]. This is similar to the previous studies [14, 19].
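To make the two measures concrete, the following is a minimal sketch of how a guess count relates to the \(\mathsf {log_{10}} \) value and to the 0 to 4 ordinal score. It re-implements the score thresholds as we understand them from the zxcvbn reference implementation rather than calling the library; the function names and the example guess counts are illustrative assumptions and are not taken from the study data.

```python
import math

# Small offset used with the thresholds, as we understand the zxcvbn
# reference implementation (assumption: not verified against the study setup).
DELTA = 5


def guesses_to_score(guesses: float) -> int:
    """Map an estimated number of guesses to an ordinal score from 0 to 4."""
    if guesses < 1e3 + DELTA:
        return 0
    if guesses < 1e6 + DELTA:
        return 1
    if guesses < 1e8 + DELTA:
        return 2
    if guesses < 1e10 + DELTA:
        return 3
    return 4


def strength_measures(guesses: float) -> tuple[float, int]:
    """Return both dependent-variable measures for one password estimate."""
    return math.log10(guesses), guesses_to_score(guesses)


# Hypothetical guess counts, chosen only to illustrate the mapping.
for g in (500, 2.5e6, 2.2e8, 1e12):
    log10_guesses, score = strength_measures(g)
    print(f"guesses={g:.3g} log10={log10_guesses:.3f} score={score}")
```

The interval-level \(\mathsf {log_{10}} \) value suits parametric tests, while the ordinal score calls for a rank-based test.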

4.5 Ethics

The study received ethics approval from the institution and followed its ethics guidelines. The laboratory setting ensured a face-to-face environment where participants could ask questions or cease the experiment should they feel any discomfort. They received an informed consent form and could withdraw from the experiment at any time.

Participants were exposed to mild anger emotion, not more than in daily life. In particular, we chose not to use the strongest standard stimulus for anger emotion. Our choice of stimulus also comes from a database of mood induction protocol stimuli that have been validated in affective psychology research and found appropriate for use in experiments [28, 45].

Although participants were asked to create a new GMail account during the experiment, at debriefing they were told that it was only for the purpose of the study.

We computed password strength via \(\mathsf {zxcvbn} \) offline and anonymised and stored participant data on an encrypted hard disk.

5 Results

All inferential statistics are computed at a significance level \(\alpha \) of \(5\%\). We estimate population parameters, such as standardized effect sizes of differences between conditions with \(95\%\) confidence intervals.

5.1 Emotion Manipulation Check

As verification of the influence of the video stimulus, we investigate RQ-E “How does [anger/neutral] video stimulus impact reported anger?” via \(\mathsf {H_{E,0}} \) that “There is no difference in reported anger between users induced with incidental anger emotion and those with neutral emotion”.

Self-report. We look into participants’ freeform responses to “After watching the video, what emotions did you feel?” in the debriefing questionnaire. We find a large number of participants reporting frustration, anger or disgust in the anger stimulus condition, compared to only one in the neutral stimulus condition.

For the anger stimulus condition, we find that \(69\%\) of participants responded with angered, irritated, frustrated or disgusted, for example \(\mathsf {P31} \) reported “disgust at the teenager bullying the amish”, \(\mathsf {P33} \) “Frustration on behalf of the amish being unable to defend themselves against those that were hassling them” and \(\mathsf {P51} \) “annoyed and wronged like something should’ve been done about it.” There was also a mix of anger and feeling sorry, such as \(\mathsf {P39} \) “angry at the bullies. Sorry for the others” and \(\mathsf {P46} \) “I felt sorry for the family, I thought the bully was a jerk. I wanted to see the end and see the other guy smack him one!”

The other \(31\%\) of participants in this condition reported excitement, intrigue or curiosity such as \(\mathsf {P56} \) “Excited; [sic] want to watch the complete film”, \(\mathsf {P55} \) “Intrigued, excited, saddened, amused” and \(\mathsf {P38} \) “Empathy, confusion, wonder, curiosity”.

For the neutral stimulus condition, participants reported a mix of emotions including (a) inspiration such as \(\mathsf {P3} \) “Inspired because of the conversation” and \(\mathsf {P6} \) “Inspired by the woman chasing her dream”, (b) happiness such as \(\mathsf {P7} \) “happier” and \(\mathsf {P9} \) “Slightly happier”, (c) boredom such as \(\mathsf {P8} \) “Bored. I was expecting something to happen during the video”, (d) sadness/anger such as \(\mathsf {P13} \) “sad for her, emphatic” and \(\mathsf {P23} \) “Felt sad/pity for the women who mentioned her audition [sic] Angry at her friend for not supporting her” or (e) mixed emotions such as \(\mathsf {P19} \) “Mixed emotions, inquisitive, happy, concerned, interested”.

Emotional Tone. In addition, we use IBM’s Tone Analyzer to compute participants’ emotional tone from their self-reports. We compute an independent samples t-test on anger emotional tone (TA_anger) across the two conditions. There was a statistically significant difference in TA_anger between the two conditions, with \(t(54) = -3.021\), \(p=.004\), CI\([-.450,-.090]\), with a large effect size Hedges \(g=1.38\), power statistics \(1-\beta =.91\). Figure 2 shows the impact of the two stimuli on TA_anger, showing a clear distinction between the two conditions.

Fig. 2. IBM Tone Analyzer anger tone by condition.

PANAS-X Affect Score. The PANAS-X ‘hostility’ score is the sum of the individual “angry”, “hostile”, “irritable”, “scornful”, “disgusted” and “loathing” PANAS-X scores. We compute diff-hostility as the difference in hostility score between reports after the manipulation and GMail registration and reports at the beginning of the study. We then run an independent samples t-test on diff-hostility across the two conditions. Although we do not observe a significant difference between the two conditions, with \(t(54) = -1.782\), \(p=.081\), CI\([-3.295,.197]\), the magnitude of the difference between the two conditions is Hedges \(g=.46\), which corresponds to a near medium effect size. Figure 3 shows the impact of the two stimuli on diff-hostility, showing a clear distinction between the two conditions.
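As an illustration of the scoring just described, the sketch below sums the six hostility items for one participant and takes the post-minus-pre difference. The dictionary layout, the function names and the example ratings are hypothetical; only the item names and the 1 to 5 rating range come from the text.

```python
# The six PANAS-X items that make up the hostility score, as listed above.
HOSTILITY_ITEMS = ("angry", "hostile", "irritable",
                   "scornful", "disgusted", "loathing")


def hostility_score(responses: dict[str, int]) -> int:
    """Sum the six hostility items; each rating must lie between 1 and 5."""
    total = 0
    for item in HOSTILITY_ITEMS:
        rating = responses[item]
        if not 1 <= rating <= 5:
            raise ValueError(f"rating for {item!r} out of range: {rating}")
        total += rating
    return total


def diff_hostility(pre: dict[str, int], post: dict[str, int]) -> int:
    """Hostility after the manipulation minus hostility at the start."""
    return hostility_score(post) - hostility_score(pre)


# Hypothetical participant: no hostility at the start, mild anger afterwards.
pre = {item: 1 for item in HOSTILITY_ITEMS}
post = dict(pre, angry=3, irritable=2)
print(hostility_score(pre), hostility_score(post), diff_hostility(pre, post))
```

Under this scoring the hostility score ranges from 6 to 30, so diff-hostility can range from -24 to 24.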

Fig. 3. Change in hostility after manipulation.

5.2 Password Descriptives

We describe the password characteristics and composition across the two conditions in Table 1. We detail password length, the number of digits, lowercase letters, uppercase letters and symbols.

Table 1. Password characteristic descriptives

5.3 Password Re-use

We asked participants whether they registered the GMail account via a password they currently use for any services. In general, of the \(N=56\) participants, \(42\%\) answered “Yes” to the question “Is it a password that you use for any other services?” From the neutral stimulus condition, \(33\%\) reused an existing password, whereas from the anger stimulus condition \(51.7\%\) reused an existing password.

We then asked participants to select the services they use the password for and the last time they used it. Figure 4 depicts the reuse context from the 24 participants who responded “Yes” to having reused an existing password.

Fig. 4. Password reuse context by condition.

For the question “When was the last time you used this password?”, in the neutral condition, 4 participants responded “past week” and 1 “today” whereas in the anger condition, 1 participant responded “past week” and 5 “today”.

5.4 Password Strength

We investigate RQ-P “How does incidental anger emotion influence password strength?” via \(\mathsf {H_{P,0}} \) that “There is no difference in password strength between users induced with incidental anger emotion and those with neutral emotion”.

The distribution of the \(\mathsf {zxcvbn} \) \(\mathsf {log_{10}} \) guesses is measured on interval level and is not significantly different from a normal distribution for each condition. Shapiro-Wilk for (a) neutral: \(D(27) = .969\), \(p = .576 > .05\), (b) anger: \(D(29) = .944\), \(p = .132 > .05\). We also compute Levene’s test for the homogeneity of variances. For the \(\mathsf {zxcvbn} \) \(\mathsf {log_{10}} \) guesses, the variances were not significantly unequal across conditions, \(F(1, 54) = .027\), \(p = .871> .05\). We provide the descriptive statistics in Table 2.

All Participants. We compute an independent samples t-test with the \(\mathsf {zxcvbn} \) \(\mathsf {log_{10}} \) guesses as dependent variable. There was a statistically significant difference in password strength for neutral (\(M=8.346\), \(SD=2.502\)) and anger (\(M=6.425\), \(SD=2.452\)) conditions, \(t(54)=2.901\), \(p=.005\), CI[.593, 3.249], effect size Hedges \(g=.765\), power statistics \(1-\beta =.81\).
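The reported test statistic and effect size can be recomputed from the summary statistics alone. The sketch below assumes a pooled-variance (equal variances) independent samples t-test and the usual small-sample correction for Hedges \(g\); the function name is ours, and the inputs are the means, standard deviations and group sizes given in the text.

```python
import math


def pooled_t_and_hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Pooled-variance t statistic and Hedges g from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    pooled_sd = math.sqrt(pooled_var)
    std_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    t_stat = (m1 - m2) / std_error
    cohens_d = (m1 - m2) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    return {"df": df, "t": t_stat, "g": cohens_d * correction}


# Neutral (n=27) versus anger (n=29) on zxcvbn log10 guesses.
result = pooled_t_and_hedges_g(8.346, 2.502, 27, 6.425, 2.452, 29)
print(f"t({result['df']}) = {result['t']:.3f}, g = {result['g']:.3f}")
```

With these inputs the sketch gives a t statistic of roughly 2.90 on 54 degrees of freedom and a Hedges \(g\) of roughly 0.765, in line with the values reported above.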

In addition, we compute a Mann-Whitney test on the ordinal values of \(\mathsf {zxcvbn} \) password strength score across the two conditions. There was a statistically significant difference in password strength score, where participants in the anger stimulus condition chose weaker password strength (\(Mdn=1\)) than participants in the neutral stimulus condition (\(Mdn=3\)), \(U=218\), \(z=-2.965\), \(p=.003\).

Table 2. Descriptive statistics of password strength via \(\mathsf {zxcvbn} \) \(\mathsf {log_{10}} \) guesses by condition.

Non-Password-ReUse Participants. Since \(42\%\) of participants reused an existing password, we compute the mean difference between the anger and neutral group for those participants who did not reuse a password, \(N=32\), [18 neutral, 14 anger].

We compute an independent samples t-test with the \(\mathsf {zxcvbn} \) \(\mathsf {log_{10}} \) guesses as dependent variable. There was a statistically significant difference in password strength for neutral (\(M=8.715\), \(SD=2.563\)) and anger (\(M=5.899\), \(SD=2.074\)) conditions, \(t(30)=3.342\), \(p=.002\), CI[1.095, 4.536], effect size Hedges \(g=1.196\), power statistics \(1-\beta =.87\).

In addition, we compute a Mann-Whitney test on the ordinal values of \(\mathsf {zxcvbn} \) password strength score across the two conditions. There was a statistically significant difference in password strength score, where participants in the anger stimulus condition chose weaker password strength (\(Mdn=1\)) than participants in the neutral stimulus condition (\(Mdn=3\)), \(U=56.5\), \(z=-2.763\), \(p=.006\).

Therefore, for all participants as well as those who did not reuse a password, for both \(\mathsf {zxcvbn} \) \(\mathsf {log_{10}} \) guesses and the ordinal zxcvbn password strength score, we reject the null hypothesis \(\mathsf {H_{P,0}} \).

5.5 Password Strategy

We code participants’ password strategies across the following six categories, with qualitative details below and a summary in Fig. 5.

Fig. 5. Password strategy by condition.

Random. \(19.6\%\) [3 neutral, 8 anger] of the participants did not have a strategy, or created a one-time password or something they would not use again, for example \(\mathsf {P27} \) described “made a new password that was explicitly not a real or particularly strong”, \(\mathsf {P48} \) “used a random word with a combination of random numbers” and \(\mathsf {P38} \) “something easy to remember and pertaining to this task as I didn’t think I’d be using it for anything else”.

Personal. \(30.3\%\) [11 neutral, 6 anger] of the participants chose a password related to their preference or something personal to them, with \(14.3\%\) [7 neutral, 1 anger] not adding a date or number such as \(\mathsf {P9} \) “I chose a personal part of my life as no one else will be able to guess it” or \(\mathsf {P22} \) “The names of my children mixed”.

Personal with Date or Number. While overall \(30.3\%\) [11 neutral, 6 anger] of the participants chose a password related to their preference, \(16.1\%\) [4 neutral, 5 anger] combined the personal data with numbers or dates, for example \(\mathsf {P1} \) “I came up with something new for this experiment so I used my Mum’s nickname for me with my year of birth” and \(\mathsf {P47} \) “picked something personal to me and added some numerics”.

Manipulation. We found that \(14.3\%\) [4 neutral, 4 anger] participants had a strategy involving complexity combinations, changing characters to numbers or the equivalent in another language, for example as expressed by \(\mathsf {P14} \) “I thought of a word/series of words relating to the video and substituted letters for numbers” and \(\mathsf {P51} \) “Make sure it’s long enough, has an upper case letter and number in it. Also one that I could actually remember”.

Same As. \(25\%\) [5 neutral, 9 anger] participants reported a re-use strategy, for example as expressed by \(\mathsf {P3} \) “This is the password I use for every site”, \(\mathsf {P35} \) “normal password”, \(\mathsf {P17} \) “I have selected one of the passwords I have been using since being a child. This time I selected a seldom used one which is not associated with any other important accounts”, \(\mathsf {P42} \) “my usual original password I use for each account I make for the first time” and \(\mathsf {P52} \) “a strong password I already use”.

Easy to Remember. \(23.2\%\) [8 neutral, 5 anger] participants reported that they created an easy to remember password, for example as expressed by \(\mathsf {P11} \) “One that I could remember, but that I didn’t use for anything else and was not easily guessable by other people” or \(\mathsf {P13} \) “in this case just [sic] used sentence that I might not to [sic] forget afterwards” and \(\mathsf {P56} \) “None, I used the most convenient strategy that would be easy to remember”.

5.6 Treatment Comparison Across Studies

We investigate RQ-C, that is, “How does the effect of anger stimulus on password strength compare with other treatments?”

We conduct a meta-analysis, which is a statistical methodology for combining quantitative evidence from studies and is key for research synthesis. It helps to distinguish one-time results from consistent findings, as well as to compare treatments across studies. The meta-analysis was computed with the R packages meta and metafor [56]. We provide a graphical display via a forest plot of the estimated meta-analysis results from the studies with (1) the anger and neutral treatments of the current study, with (2) the fear and stress treatments of Fordyce et al.’s study [19], and (3) the captcha treatments of Coopamootoo et al.’s study [14]. We note that the three studies employed the same password creation scenario via a mockup GMail account and measured password strength via \(\mathsf {zxcvbn} \).

We provide the forest plot in Fig. 6. Each treatment is represented by a point-effect estimate (the mid-point of the box, or the best guess of the true effect in the population) and a horizontal line for the confidence interval. The area of the box represents the weight given to the treatment. The diamond represents the overall effect. The width of the diamond depicts the confidence interval for the overall effect.
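The box weights and the diamond in a forest plot can be illustrated with inverse-variance pooling. The sketch below uses a simple fixed-effect model, which is an assumption on our part: the text does not state which model was fitted with meta and metafor. The effect sizes and group sizes in the example are placeholders and are not taken from any of the three studies.

```python
import math


def hedges_g_variance(g: float, n1: int, n2: int) -> float:
    """Common large-sample approximation of the variance of Hedges g."""
    return (n1 + n2) / (n1 * n2) + g ** 2 / (2 * (n1 + n2))


def fixed_effect_pool(effects, variances):
    """Inverse-variance pooled effect, its 95% CI and the relative weights."""
    weights = [1 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    std_error = math.sqrt(1 / total)
    ci = (pooled - 1.96 * std_error, pooled + 1.96 * std_error)
    return pooled, ci, [w / total for w in weights]


# Placeholder treatments: (label, effect size, n treatment, n control).
treatments = [("A", -0.80, 25, 25), ("B", -0.20, 40, 40), ("C", -0.50, 30, 30)]
effects = [g for _, g, _, _ in treatments]
variances = [hedges_g_variance(g, n1, n2) for _, g, n1, n2 in treatments]

pooled, ci, rel_weights = fixed_effect_pool(effects, variances)
for (label, g, _, _), w in zip(treatments, rel_weights):
    print(f"treatment {label}: g={g:+.2f} weight={w:.1%}")
print(f"overall: {pooled:+.3f} CI[{ci[0]:+.3f}, {ci[1]:+.3f}]")
```

A treatment with a larger sample has a smaller variance and therefore a larger box, and the pooled confidence interval corresponds to the width of the diamond.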

Figure 6 shows the overall treatment effect and a comparison across treatments, with the anger emotion and the Captcha treatments demonstrating more negative effects on password strength, that is, weaker password choices, than the fear and stress treatments.

Fig. 6. Forest plot of treatment effects.

6 Discussion

We induced anger emotion via a source external to the GMail registration and password creation scenario and found that the anger stimulus condition resulted in significantly weaker password choices, with a near large effect size \(g=.765\).

6.1 Effect on Password Strength

Our findings support the observation from affect psychology research of more risk-seeking choices with anger emotion [35]. Given the impact of incidental anger emotion, we postulate requirements for human-interaction designs to support users in avoiding frustrated states. However, similar to Mazurek et al. [42] who reported participants’ annoyance due to a change to stricter password policy, we also observed an effect from anger emotion induced by an external source. We do not yet know the emotional aspects of password creation itself.

6.2 Password Reuse

We observed that overall, a larger number of participants reused an existing password in the anger group than in the neutral group, with social media passwords showing the most distinct appearance in the anger group and retail passwords in the neutral group, while email password reuse is somewhat balanced in both groups. This percentage of reported password reuse is not surprising when compared to previous reports of password reuse by a student population (100%) [2] and in general (34.6 to 82%) [2, 32].

We did not find a significant effect of the experiment condition on password reuse given our sample size. However, further investigation of the effects of emotion on reuse and reuse contexts with a larger sample size would provide a deeper understanding.
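One way to check the association between condition and reuse is a Pearson chi-square test on the 2x2 table of counts. The sketch below is an assumption on two fronts: the text does not name the test used, and the counts (9 of 27 in the neutral condition and 15 of 29 in the anger condition) are inferred from the reported percentages and the total of 24 reusers.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: neutral, anger. Columns: reused, did not reuse (inferred counts).
chi2, p_value = chi_square_2x2(9, 18, 15, 14)
print(f"chi2(1) = {chi2:.3f}, p = {p_value:.3f}")
```

With these inferred counts the statistic is about 1.93 with a p-value of about .16, which is consistent with the absence of a significant effect at \(\alpha = 5\%\).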

6.3 Password Strategy

For reported password strategy, we observed a higher number of participants choosing a random [3 neutral, 8 anger] and “same as” [5 neutral, 9 anger] password in the anger group than in the neutral group. This may be an indication of less thoughtful choices, yet this can only be confirmed via further research investigations.

In addition the neutral group had a larger number of passwords that can be remembered [8 neutral, 5 anger] or that used a personal strategy [11 neutral, 6 anger] than the anger group.

While these observations are informative, we cannot make conclusive remarks due to the small numbers per strategy or reuse context. However, future research specifically on the effects of emotion on password memorability and password strategies, will likely provide finer and more conclusive details of the impact of emotion.

6.4 Treatment Impact

The meta-analysis across studies enabled a comparison of the impact of different treatments on password strength. This is a first research synthesis of emotion research in the area of cyber security that compares fear with anger treatments. Research and comparison of these two negative valence emotions are particularly important because of their differing threat appraisal tendencies and risk-avoiding versus risk-seeking choices. From the meta-analysis, we can visualise the impact of feeling anger during security tasks relative to feeling fear or a more neutral affect tone.

In addition, the meta-analysis demonstrates the effect of solving a Captcha as stimulus versus watching a medium effect anger video, where it is likely that frustration was involved in the previous Captcha study (evidenced by the number of attempts at solving the Captcha).

6.5 Ecological Validity

Previous password studies fall into two main categories: (1) those using real-world password datasets from security leaks, or (2) those generating passwords within controlled lab studies and Amazon Mechanical Turk online studies; where passwords collected from user studies are thought to be comparable to and a reasonable approximation of real passwords [16, 42]. Our study was designed as a controlled lab experiment, employing a GMail registration task with its password policy suggesting passwords of at least 8 characters long, including digits, uppercase letters and symbols.

In addition, we compare Table 1 with the large-scale study conducted at CMU [42] in 2013 and with that of leaked data sets in 2016 [50]. Our mean password length of 10.36 CI[9.61,11.10] is not far from that of Mazurek et al.’s CMU dataset of 10.7 [42] and Shen et al.’s of 9.46 [50]. Our passwords had a mean of 2.38 digits, 1.14 uppercase letters and 0.29 symbols while Mazurek et al.’s CMU dataset had a mean of 2.8 digits, 1.5 uppercase letters and 1.2 symbols.

6.6 Limitations

Our sample was drawn from a University student and academic population, \(55.4\%\) at an undergraduate level and \(35.8\%\) at a post-graduate level, with \(30\%\) reporting a Computer Science background. Since students and IT professionals are thought to behave differently from other users [2], it is possible that a representative sample drawn from the country population may show different password characteristics.

Although we observe a near to large effect size on password strength and power statistics of .81, our sample is not large. A larger sample size would have provided clearer indications for password reuse and strategy.

7 Conclusion

This paper provides a first study with empirical evidence of the effects of anger emotion, induced via a film stimulus video with no connection with cyber security, on password choice. It demonstrates risk-seeking choices in a security context, while describing a study that induced and measured anger emotion. The paper compares the effects of anger, fear and Captcha stimuli on password strength. Our findings show the impact of emotion on a security choice, where the emotion may be induced from the environment or any incidental situation.