6.5.1 Euclidean Distance of Attackers’ Selections from the End Users’ Secret Selections.
To investigate the RQ, we conducted three analyses: (1) we calculated the Euclidean distance of the attackers’ guessed selections from the legitimate end users’ secret password selections; (2) based on the first analysis (Euclidean distance), we adjusted the brute-force attack performed in Section 5.5.1 to investigate whether attackers who share common experiences with the end users could run a more effective attack by first guessing the regions in which they suspected the users had made their password selections; and (3) we performed a qualitative analysis of the participants’ feedback at the end of the human guessing attack study to better understand the approach followed by attackers on graphical passwords created on location-aware images. In the analyses that follow, data are mean ± standard error. There were no significant outliers in the data.
A. Disregarding the type of the gesture and the exact order. Figure 8 depicts the Euclidean distance of each gesture of each participant, disregarding the type and the exact order of the attackers’ gestures and the end users’ gestures. For the analysis, we adopted a threshold of three segments, in line with the allowed tolerance of the graphical password mechanism (see footnote 4). Accordingly, among 276 gestures (3 gestures × 92 participants), 49 gestures (17.7%) were in close proximity to the attackers’ guessed selections. Furthermore, we conducted a one-way
multivariate analysis of variance (MANOVA) to determine the effect of relationship between attackers and legitimate end users on how far the attackers’ password selections were from the legitimate end users’ password selections. Three measures were assessed: Euclidean distance of the legitimate users and the attackers on the first gesture, second gesture, and third gesture of the graphical password. Participants belonged to one of the following categories: family member, friend, medical staff, patient, and nurse. Preliminary assumption checking revealed that data was normally distributed, as assessed by Shapiro-Wilk's test (
p > .05); there were no univariate or multivariate outliers, as assessed by boxplot and Mahalanobis distance (
p > .001), respectively; there were linear relationships, as assessed by scatterplot; no multicollinearity (
r = .117,
p = .004 between gesture one and gesture two;
r = .021,
p = .007 between gesture one and gesture three; and
r = .133,
p = .010 between gesture two and gesture three); and there was homogeneity of variance-covariance matrices, as assessed by Box's M test (
p = .007). The analysis revealed that the differences between groups on the combined dependent variables were statistically significant,
F(12, 225.180) = 4.356,
p < .0005; Wilks’
Λ = .576;
partial η2 = .168. Follow-up univariate ANOVAs revealed that all three gestures were statistically significantly different between the participants from different relationship groups (first gesture:
F(4, 87) = 5.351,
p = .001;
partial η2 = .197; second gesture:
F(4, 87) = 3.305,
p = .014;
partial η2 = .132; third gesture:
F(4, 87) = 4.550,
p = .002;
partial η2 = .173). Tukey-Kramer post hoc tests showed that for the first gesture, participants from the family group had statistically significantly lower mean scores than participants from either the patient group (
p = .002) or the nurse group (
p = .050), whereas participants from the friend group had statistically significantly lower mean scores than participants from the patient group (
p = .006). Regarding the second gesture, participants from the family group had statistically significantly lower mean scores than participants from the nurse group (
p = .020). Regarding the third gesture, participants from the family group had statistically significantly lower mean scores than participants from the medical staff group (
p = .036), the patient group (
p = .025), and the nurse group (
p = .001).
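As a consistency check on the univariate effect sizes reported above, partial η2 can be recovered from an F statistic and its degrees of freedom through the standard identity η2 = (F · df_effect) / (F · df_effect + df_error). The following Python sketch applies that identity to the three follow-up ANOVAs; the function name is illustrative and is not part of the study's tooling.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from a univariate F statistic."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics reported for the three follow-up ANOVAs, each F(4, 87).
reported_f = {
    "first gesture": 5.351,
    "second gesture": 3.305,
    "third gesture": 4.550,
}

effect_sizes = {
    name: round(partial_eta_squared(f, 4, 87), 3)
    for name, f in reported_f.items()
}
print(effect_sizes)
```

Under this identity, the recomputed values agree with the reported partial η2 of .197, .132, and .173.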
B. Disregarding the type of the gesture but considering the exact order. Figure 9 depicts the Euclidean distance by disregarding the type of selections but considering the exact order of the attackers’ gestures and the end users’ gestures. Applying the same threshold of three segments, the analysis revealed that among 276 gestures (3 gestures × 92 participants) made by the participants, 19 gestures (6.8%) were in close proximity to the attackers’ guessed selections. Furthermore, we conducted a one-way MANOVA to determine the effect of relationship between attackers and legitimate end users on how far the attackers’ password selections were from the legitimate end users’ password selections. Three measures were assessed: Euclidean distance of the legitimate users and the attackers on the first gesture, second gesture, and third gesture of the graphical password. Participants belonged to one of the following categories: family member, friend, medical staff, patient, and nurse. Preliminary assumption checking revealed that data was normally distributed, as assessed by Shapiro-Wilk's test (
p > .05); there were no univariate or multivariate outliers, as assessed by boxplot and Mahalanobis distance (
p > .001), respectively; there were linear relationships, as assessed by scatterplot; no multicollinearity (
r = –.089,
p = .004 between gesture one and gesture two;
r = .253,
p = .008 between gesture one and gesture three; and
r = .055,
p = .009 between gesture two and gesture three); and there was homogeneity of variance-covariance matrices, as assessed by Box's M test (
p < .0005). The analysis revealed that the differences between groups on the combined dependent variables were statistically significant,
F(12, 225.180) = 11.500,
p < .0005; Wilks’
Λ = .282;
partial η2 = .344. Follow-up univariate ANOVAs revealed that all three gestures were statistically significantly different between the participants from different relationship groups (first gesture:
F(4, 87) = 12.567,
p < .0005;
partial η2 = .366; second gesture:
F(4, 87) = 3.930,
p = .006;
partial η2 = .153; third gesture:
F(4, 87) = 13.832,
p < .0005;
partial η2 = .389). Tukey-Kramer post hoc tests showed that for the first gesture, participants from the family group had statistically significantly lower mean scores than participants from the medical staff group (
p < .0005), the patient group (
p = .001), and the nurse group (
p < .0005), whereas participants from the friend group had statistically significantly lower mean scores than participants from the medical staff group (
p = .001), the patient group (
p = .004), and the nurse group (
p < .0005). Regarding the second gesture, participants from the family group had statistically significantly lower mean scores than participants from the nurse group (
p = .002). Regarding the third gesture, participants from the family group had statistically significantly lower mean scores than participants from the medical staff group (
p = .011), the patient group (
p < .0005), and the nurse group (
p < .005), whereas participants from the friend group had statistically significantly lower mean scores than participants from the patient group (
p < .0005) and the nurse group (
p < .0005).
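The two proximity analyses above (A and B) can be sketched in Python as follows. This is a minimal illustration, not the study's implementation. It assumes gesture positions are expressed in segment units, that the three-segment threshold is inclusive, and that the order-agnostic variant compares each end-user gesture with its nearest attacker gesture. The names `euclidean` and `close_gestures` and the sample coordinates are hypothetical.

```python
import math

Point = tuple[float, float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance between two gesture positions."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def close_gestures(user: list[Point], attacker: list[Point],
                   threshold: float = 3.0, ordered: bool = False) -> int:
    """Count end-user gestures that have an attacker gesture within `threshold`.

    ordered=False: compare with the nearest attacker gesture (analysis A).
    ordered=True:  compare only with the attacker gesture at the same index (analysis B).
    """
    hits = 0
    for i, u in enumerate(user):
        candidates = [attacker[i]] if ordered else attacker
        if min(euclidean(u, a) for a in candidates) <= threshold:
            hits += 1
    return hits


# Hypothetical password and guess, in segment coordinates.
user_gestures = [(10, 10), (25, 40), (60, 15)]
attacker_gestures = [(26, 42), (11, 12), (90, 90)]

unordered_hits = close_gestures(user_gestures, attacker_gestures)
ordered_hits = close_gestures(user_gestures, attacker_gestures, ordered=True)
print(unordered_hits, ordered_hits)
```

In this example the attacker guessed two of the right regions but in the wrong order, so the order-agnostic count is higher than the order-aware count, mirroring the drop from 49 to 19 close gestures reported above.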
C. Considering the type of the gesture and exact order. We compared the three attempts of each attacker with the stored password of the end user from the same pair of participants. Of a total of 276 guessing attempts (3 attempts per attacker × 92 participants), none was successful, yielding an online guessing success rate of 0%.
6.5.2 Security Strength of the Created Graphical Passwords Based on Experience-Spot-Driven Brute-Force Attack.
To investigate whether the suggested location-aware image approach holds against attacks that consider the experience-spots indicated by each participant who acted as an attacker, we conducted an offline attack comparing a PoI-assisted brute-force attack (the same attack that considers PoIs as described in Section 5.5.1) and a personalized PoI-assisted brute-force attack that was further enhanced to consider the experience-spot regions indicated by the human attacker.
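The difference between the two attack types can be illustrated with a toy guess-ordering sketch in Python. It is not the attack of Section 5.5.1, whose details are not restated here. It assumes only that a PoI-assisted attack tries PoI segments before the rest of the image, and that the personalized variant additionally tries the attacker's experience-spot segments first. The grid, the PoIs, the experience-spots, and all function names are hypothetical.

```python
from itertools import combinations

Segment = tuple[int, int]


def prioritized(all_segments, *priority_groups):
    """Order segments so that earlier priority groups are guessed first."""
    ordered, seen = [], set()
    for group in (*priority_groups, all_segments):
        for seg in group:
            if seg not in seen:
                seen.add(seg)
                ordered.append(seg)
    return ordered


def guesses_to_crack(candidates, target, n_gestures=3):
    """Count guesses until the unordered set of target segments is hit."""
    wanted = frozenset(target)
    for count, guess in enumerate(combinations(candidates, n_gestures), start=1):
        if frozenset(guess) == wanted:
            return count
    return None  # target not reachable from the candidate list


# Hypothetical 4 x 4 image grid with illustrative PoIs and experience-spots.
grid = [(x, y) for x in range(4) for y in range(4)]
pois = [(0, 0), (3, 3), (1, 2)]
experience_spots = [(2, 1), (2, 2)]
password = {(2, 1), (2, 2), (3, 3)}

poi_guesses = guesses_to_crack(prioritized(grid, pois), password)
personalized_guesses = guesses_to_crack(
    prioritized(grid, experience_spots, pois), password
)
print(poi_guesses, personalized_guesses)
```

When the password overlaps the experience-spots, the personalized ordering reaches it in far fewer guesses; when it does not, the extra priority group only delays the search slightly.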
A. Disregarding order and type of gestures across all participants. Given that the implementation of PGA-like mechanisms takes into consideration the order and the type of gestures, which could impact the total guesses required to crack a graphical password (e.g., circles are more complex than simple taps but less complex than lines; see footnote 4), it is interesting to first understand how each attack type (PoI-assisted brute-force attack vs. personalized PoI-assisted brute-force attack) performs when we disregard the order and the type of the gestures and focus instead on the positions of the password selections. To do so, we simplify the gesture type as follows: for circles, we disregard the radius and the directionality and keep only the center of the circle as an (x, y) segment, whereas for lines, we consider only the (x, y) segment of the start of the line.
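The simplification just described can be sketched as a small Python function. The gesture classes and field names below are assumptions made for illustration; only the reduction rules (circle to its center, line to its start, tap unchanged) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tap:
    x: int
    y: int


@dataclass(frozen=True)
class Circle:
    cx: int
    cy: int
    radius: int
    clockwise: bool


@dataclass(frozen=True)
class Line:
    x1: int
    y1: int
    x2: int
    y2: int


def simplify(gesture) -> tuple[int, int]:
    """Reduce a gesture to a single (x, y) segment, dropping its type."""
    if isinstance(gesture, Tap):
        return (gesture.x, gesture.y)
    if isinstance(gesture, Circle):
        # Radius and directionality are disregarded; keep the center only.
        return (gesture.cx, gesture.cy)
    if isinstance(gesture, Line):
        # Only the start of the line is kept.
        return (gesture.x1, gesture.y1)
    raise TypeError(f"unsupported gesture: {gesture!r}")


password = [Tap(4, 7), Circle(12, 3, radius=2, clockwise=True), Line(1, 1, 9, 5)]
positions = [simplify(g) for g in password]
print(positions)
```

After this step, every password is a list of three positions, so both attack types can be compared on position alone.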
A one-way MANOVA was run to determine the effect of relationship between attackers and legitimate end users on the number of guesses required to crack the passwords when using a PoI-assisted brute-force attack vs. a personalized PoI-assisted brute-force attack by considering also the experience-spots provided by the attackers. Two measures were assessed: number of guesses required to crack the passwords when using a PoI-assisted brute-force attack and number of guesses when using a personalized PoI-assisted brute-force attack. Participants belonged to one of the following categories: family member, friend, medical staff, patient, nurse. Preliminary assumption checking revealed that data was normally distributed, as assessed by Shapiro-Wilk's test (
p > .05); there were no univariate or multivariate outliers, as assessed by boxplot and Mahalanobis distance (
p > .001), respectively; there were linear relationships, as assessed by scatterplot; no multicollinearity (
r = .394,
p < .0005); and there was homogeneity of variance-covariance matrices, as assessed by Box's M test (
p = .002). The analysis revealed that the differences between groups on the combined dependent variables were statistically significant,
F(8, 172) = 4.546,
p < .0005; Wilks’
Λ = .681;
partial η2 = .175. Follow-up univariate ANOVAs revealed that in both types of attacks, the number of guesses required to crack the passwords was statistically significantly different between the participants from different relationship groups (PoI-assisted brute-force attack:
F(4, 87) = 4.650,
p = .002;
partial η2 = .176; personalized PoI-assisted brute-force attack:
F(4, 87) = 7.473,
p < .0005;
partial η2 = .256). Tukey-Kramer post hoc tests showed that for the PoI-assisted brute-force attack, participants from the family group had statistically significantly lower mean scores than participants from the nurse group (
p = .001), whereas participants from the friend group had statistically significantly lower mean scores than participants from the nurse group (
p = .024). Regarding the personalized PoI-assisted brute-force attack, participants from the family group had statistically significantly lower mean scores than participants from the nurse group (
p < .0005), participants from the friend group had statistically significantly lower mean scores than participants from the nurse group (
p = .002), participants from the medical staff group had statistically significantly lower mean scores than participants from the nurse group (
p = .003), and participants from the patient group had statistically significantly lower mean scores than participants from the nurse group (
p = .008). In the PoI-assisted brute-force attack, the mean number of guesses required to crack the passwords per group was as follows: (1) family member: 43,576.94 ± 7,488.35; (2) friend: 71,732.66 ± 17,131.09; (3) medical staff: 99,921.83 ± 18,620.08; (4) patient: 112,232.45 ± 22,590.04; and (5) nurse: 159,116.55 ± 27,663.24. In the personalized PoI-assisted brute-force attack, the mean number of guesses required to crack the passwords per group was as follows: (1) family member: 28,694.61 ± 4,596.83; (2) friend: 52,546.77 ± 19,951.18; (3) medical staff: 55,511.50 ± 11,560.53; (4) patient: 62,751.50 ± 11,903.80; and (5) nurse: 124,987.88 ± 12,485.52. Figure
10 depicts the means of password strength among attack types by disregarding the order and the type of gestures across all participants.
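The multivariate effect size reported above is consistent with the commonly used conversion from Wilks’ Λ to partial η2. As a worked check, assuming that definition, with p = 2 dependent variables and a hypothesis degree of freedom of 4 (five relationship groups):

```latex
\eta_p^2 = 1 - \Lambda^{1/s}, \qquad s = \min(p,\ df_h) = \min(2,\ 4) = 2
\eta_p^2 = 1 - \sqrt{0.681} \approx 1 - 0.825 = 0.175
```

The result matches the reported partial η2 = .175.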
B. Disregarding order and type of gestures across participants with at least one gesture containing an experience-spot. A one-way MANOVA was run to determine the effect of relationship between attackers and legitimate end users on the number of guesses required to crack the passwords when using a PoI-assisted brute-force attack vs. a personalized PoI-assisted brute-force attack by considering participants with at least one gesture containing an experience-spot. Two measures were assessed: number of guesses required to crack the passwords when using a PoI-assisted brute-force attack and number of guesses when using a personalized PoI-assisted brute-force attack. Participants belonged to one of the following categories: family member, friend, medical staff, patient, nurse. Preliminary assumption checking revealed that data was normally distributed, as assessed by Shapiro-Wilk's test (
p > .05); there were no univariate or multivariate outliers, as assessed by boxplot and Mahalanobis distance (
p > .001), respectively; there were linear relationships, as assessed by scatterplot; no multicollinearity (
r = .764,
p < .0005); and there was homogeneity of variance-covariance matrices, as assessed by Box's M test (
p = .001). The analysis revealed that the differences between groups on the combined dependent variables were statistically significant,
F(8, 96) = 43.855,
p < .0005; Wilks’
Λ = .046;
partial η2 = .785. Follow-up univariate ANOVAs revealed that in both types of attacks, the number of guesses required to crack the passwords was statistically significantly different between the participants from different relationship groups (PoI-assisted brute-force attack:
F(4, 49) = 66.535,
p < .0005;
partial η2 = .845; personalized PoI-assisted brute-force attack:
F(4, 49) = 81.915,
p < .0005;
partial η2 = .870). Tukey-Kramer post hoc tests showed that for the PoI-assisted brute-force attack, participants from the family group had statistically significantly lower mean scores than participants from the friend group (
p = .032), the medical staff group (
p < .0005), the patient group (
p < .0005), and the nurse group (
p < .0005). Participants from the friend group had statistically significantly lower mean scores than participants from the medical staff group (
p = .001), the patient group (
p = .020), and the nurse group (
p < .0005). Participants from the medical staff group had statistically significantly lower mean scores than participants from the nurse group (
p < .0005), whereas participants from the patient group had statistically significantly lower mean scores than participants from the nurse group (
p < .0005). Regarding the personalized PoI-assisted brute-force attack, participants from the family group had statistically significantly lower mean scores than participants from the medical staff group (
p = .001), the patient group (
p < .0005), and the nurse group (
p < .0005). Participants from the friend group had statistically significantly lower mean scores than participants from the medical staff group (
p = .002), the patient group (
p < .0005), and the nurse group (
p < .0005). Participants from the medical staff group had statistically significantly lower mean scores than participants from the patient group (
p < .0005) and the nurse group (
p < .0005), whereas participants from the patient group had statistically significantly lower mean scores than participants from the nurse group (
p = .050). In the PoI-assisted brute-force attack, the mean number of guesses required to crack the passwords per group was as follows: (1) family member: 25,308.50 ± 1,454.21; (2) friend: 32,241.70 ± 1,330.90; (3) medical staff: 41,972.60 ± 2,062.57; (4) patient: 39,267.41 ± 1,436.24; and (5) nurse: 58,851.83 ± 1,498.21. In the personalized PoI-assisted brute-force attack, the mean number of guesses required to crack the passwords per group was as follows: (1) family member: 7,536.10 ± 351.76; (2) friend: 7,815.60 ± 308.75; (3) medical staff: 12,074.80 ± 394.72; (4) patient: 19,233.83 ± 969.97; and (5) nurse: 22,047.91 ± 1,000.61. Figure
11 depicts the means of password strength among attack types by disregarding the order and the type of gestures across participants with at least one gesture containing an experience-spot.
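To make the per-group comparison easier to read, the relative reduction in the mean number of guesses between the two attacks can be derived directly from the means reported above. The Python sketch below performs only that arithmetic; it uses no data beyond the reported group means, and the variable names are illustrative.

```python
# Mean number of guesses per group, as reported above for participants with
# at least one gesture containing an experience-spot.
poi_means = {
    "family member": 25308.50,
    "friend": 32241.70,
    "medical staff": 41972.60,
    "patient": 39267.41,
    "nurse": 58851.83,
}
personalized_means = {
    "family member": 7536.10,
    "friend": 7815.60,
    "medical staff": 12074.80,
    "patient": 19233.83,
    "nurse": 22047.91,
}

# Fraction of guesses saved by the personalized attack, per group.
reduction = {
    group: 1 - personalized_means[group] / poi_means[group]
    for group in poi_means
}

for group, saved in reduction.items():
    print(f"{group}: {saved:.1%} fewer guesses")
```

In every group, the personalized attack needs fewer than half of the guesses required by the PoI-assisted attack.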
C. Considering order and type of gestures across all participants. A one-way MANOVA was run to determine the effect of relationship between attackers and legitimate end users on the number of guesses required to crack the passwords when using a PoI-assisted brute-force attack vs. a personalized PoI-assisted brute-force attack by taking into account the order and type of gestures across all participants. Two measures were assessed: number of guesses required to crack the passwords when using a PoI-assisted brute-force attack and number of guesses when using a personalized PoI-assisted brute-force attack. Participants belonged to one of the following categories: family member, friend, medical staff, patient, nurse. Data are expressed as mean ± standard error. Preliminary assumption checking revealed that data was normally distributed, as assessed by Shapiro-Wilk's test (
p > .05); there were no univariate or multivariate outliers, as assessed by boxplot and Mahalanobis distance (
p > .001), respectively; there were linear relationships, as assessed by scatterplot; no multicollinearity (
r = .183,
p = .008); and there was homogeneity of variance-covariance matrices, as assessed by Box's M test (
p = .012). The analysis revealed that the differences between groups on the combined dependent variables were not statistically significant,
F(8, 172) = .020,
p = .99; Wilks’
Λ = .998; partial
η2 = .001. In the PoI-assisted brute-force attack, the mean number of guesses required to crack the passwords per group was as follows: (1) family member: 1,660,935.38 ± 348,113.52; (2) friend: 1,687,709.11 ± 447,019.45; (3) medical staff: 1,629,472.16 ± 124,750.15; (4) patient: 1,624,675.54 ± 849,169.82; and (5) nurse: 1,622,255.16 ± 66,263.27. In the personalized PoI-assisted brute-force attack, the mean number of guesses required to crack the passwords per group was as follows: (1) family member: 1,745,307.61 ± 253,165.96; (2) friend: 1,690,173.83 ± 126,111.69; (3) medical staff: 1,789,121.611 ± 217,395.14; (4) patient: 1,797,917.49 ± 475,354.52; and (5) nurse: 1,818,772.27 ± 164,674.17. Figure
12 depicts the means of password strength among attack types by considering the order and the type of gestures across all participants.
6.5.3 Qualitative Analysis.
To shed further light on the approach followed by attackers on graphical passwords created on location-aware images, we used the data gathered from the feedback mechanism at the end of the study, as well as observations made by the researchers during the attack phase. In many cases, attackers used knowledge about the end user under attack, related to their habits, preferences, and facts about their personality: “My colleague is a great storyteller and I believe he would try to create a story for the password, like arriving at the entrance of the hospital, then entering by the stairs, then reading information on the panel. So, I decided to make my attack having this story-telling process in mind”. – P19; “I thought she will have used the three possible gestures (instead of just one or two of the options), and marked colorful items, because of her personality”. – P28; “Most of the times he has his coffee in the front yard outside the emergency room during his shift break. I would be surprised if he hadn't selected this particular area”. – P56; “My colleague likes flowers and plants. I think that some of her selections must be on the flowers”. – P39; “She likes painting at her free time and I think she drew straight lines on objects that have bright colors”. – P11.
In other cases, it is evident that the scenery depicted on the location-aware images impacted the selections of the attackers. In particular, attackers used a more personalized approach by considering specific information related to their common shared experiences with the end user under attack within the places depicted on the location-aware images: “Usually we have lunch together at the hospital's cafeteria and we tend to be seated at the tables near the entrance. Hence, I made my selections around these tables”. – P7; “Considering direction of movement; places he usually sits or similar”. – P14; “The pictures are from daily routines, so I'm trying to guess where he is going on a daily basis or important aspects for him in the photo”. – P29; “I work at the emergency department and the colleague that I am requested to guess his password is the ambulance driver. I think it is very possible that he made some of his password selections near or on the ambulance outside of the emergency department”. – P8; “The photo shows two parking lots, but I know that she usually parks her car to the one next to the left entrance of the hospital because it is more convenient for her. I guess that some of her selections must be within this specific parking lot area”. – P16; “Being a hospital receptionist and a friendly person that interacts daily with many patients, I think that she must have selected the people standing in front of the reception desk”. – P33.
In very few cases, attackers did not employ any sophisticated attack but rather focused on the obvious PoIs of the images: “Tried to guess likely features in the images, but type of gesture I used was just random”. – P23; “I clicked on the most dominant views. I chose them because they caught my eye”. – P34; “I believe he must have selected the chairs because these are the most visible points in the image”. – P42.
The preceding observations were consolidated into a coding scheme describing the approach followed by the attackers:
- Habits/preferences/characteristics of end users (e.g., storytelling, coffee, flowers, painting)
- Common shared experiences (i.e., experiences within the depicted place/scenery)
- Random-guessing approach relying on areas of the image that attract people’s attention (i.e., PoIs)
Table 7 summarizes the responses about the approach employed by the attackers based on the aforementioned coding scheme.