Security and Usability of a Personalized User Authentication Paradigm: Insights from a Longitudinal Study with Three Healthcare Organizations

Authors:

Argyris Constantinides,

Andreas Pitsillides

ACM Transactions on Computing for Healthcare, Volume 4, Issue 1

Article No.: 2, Pages 1 - 40

https://doi.org/10.1145/3564610

Published: 27 February 2023

Abstract

This article proposes a user-adaptable and personalized authentication paradigm for healthcare organizations, which aims to seamlessly reflect patients’ episodic and autobiographical memories in graphical and textual passwords, with the goal of improving the security strength of user-selected passwords while providing a positive user experience. We report on a longitudinal study that spanned over 3 years, in which three public European healthcare organizations participated in designing and evaluating the paradigm. Three studies were conducted (n = 169) with different stakeholders: (1) a verification study aiming to identify existing authentication practices of the three healthcare organizations with diverse stakeholders (n = 9), (2) a patient-centric feasibility study during which users interacted with the proposed authentication system (n = 68), and (3) a human guessing attack study focusing on vulnerabilities among people sharing common experiences within location-aware images used for graphical passwords (n = 92). Results revealed that the suggested paradigm scored high with regard to users’ likeability, perceived security, usability, and trust and, more importantly, that it assists the creation of more secure passwords. On the downside, the suggested paradigm introduces password guessing vulnerabilities by individuals sharing common experiences with the end users. Findings are expected to scaffold the design of more patient-centric knowledge-based authentication mechanisms within today's dynamic computation realms.

1 Introduction

User authentication is an essential security task within modern healthcare systems, which is performed daily by millions of patients across the world. The healthcare domain entails intrinsic characteristics and requirements, which position the user authentication process in a unique perspective. This is mainly attributed to the fact that healthcare organizations deploy different policies for a variety of end user categories (e.g., medical staff, patients, external caregivers), depending on the context of use, as well as the users’ profiles. Given that sensitive information can be accessed online by patients and shared among medical care staff, healthcare environments increase the risk for information leaks and entail several challenges from a security, privacy, and legal perspective [16, 31, 55, 93].

To safeguard health-related information, organizations deploy a variety of user authentication schemes, which can be based on either (1) a secret known by the user (knowledge-based authentication), such as a textual password or graphical password [13, 87, 109]; (2) a specific object owned by the user (token-based authentication), such as a smart card, smartphone, or hardware token [39, 67, 68, 73, 77]; (3) specific biometric information about the user (biometric-based authentication), such as her or his fingerprint, face, voice, or physiological signals [14, 103]; and (4) a combination of the aforementioned factors (multi-factor authentication) [84].

Nonetheless, the aforementioned authentication schemes entail varying strengths and weaknesses with regard to security, privacy, and usability from the end user's perspective [15], but also with regard to costs and maintenance aspects from the healthcare organization's perspective. For example, current password policies create user frustration (e.g., when users forget the passwords or reset the passwords frequently due to strict policies) [32, 51, 98]; biometric authentication schemes entail privacy threats (e.g., biometric data could be used in impersonation attacks, the data cannot be revoked in case they are compromised or leaked) [43, 101]; unusable password policies increase maintenance costs (e.g., password resets increase labor costs of an organization) [98, 104]; and in the case of password data breaches, such events negatively affect the organization's reputation and trust, lead to penalties by the corresponding health agencies [38], and may even threaten human life [31].

Furthermore, the literature reveals that healthcare organizations still rely on knowledge-based user authentication (e.g., passwords), and research suggests that it will continue to prevail in the next decades [70] even in combination with other approaches (e.g., token based, biometric based). In this respect, researchers have attempted to provide alternative knowledge-based authentication methods through graphical authentication, in which users either draw a secret gesture on the screen (drawmetric authentication) or select regions of images (locimetric authentication) [13, 28, 88]. Graphical user authentication approaches have gained popularity in recent years, with popular examples including Android's pattern lock for unlocking smartphone devices, and Microsoft's Windows 10 Picture Gesture Authentication (PGA) for unlocking conventional computers. Graphical user authentication research is motivated as follows: (1) it leverages the fact that visual information is better recalled than textual information according to the picture superiority effect [80], and (2) it can be easily adapted to ubiquitous environments due to its natural user interaction by clicking or drawing on regions of an image [42].

In this context, bearing in mind that (1) user interactions in the healthcare domain are characterized by high security standards, (2) users prefer seamless authentication policies, which should be able to adapt to different technology and contextual factors [12, 25, 28, 73], and (3) authentication policies of textual passwords have become non-user-friendly and hence non-secure [98], there is an urgent need to elaborate on novel user authentication paradigms that improve the current state of the art. Therefore, our work is primarily driven by our vision to increase the security of user-selected secrets and simultaneously provide a positive user experience through a seamless and adaptable user authentication paradigm, which will allow a transition from current “one-size-fits-all” authentication systems to user-adaptable and personalized authentication systems.

In this article, we introduce a novel patient-centric authentication paradigm, coined DuoPass, which has been derived from a longitudinal study that spanned over 3 years during which we verified, implemented, and evaluated the suggested approach in the healthcare domain, by following a User-Centered Design (UCD) approach. Table A1 in the appendix depicts our research methodology. We next present a literature review on state-of-the-art user authentication practices in healthcare environments, which is subsequently triangulated with diverse stakeholders of three European healthcare organizations. We then elaborate on the conceptual design of the suggested paradigm and present the user evaluation of DuoPass in terms of security strength, memorability, and user experience. We also report on a human guessing attack study that investigated vulnerabilities among people sharing common experiences within location-aware images used for graphical passwords. We conclude the article with a discussion on the main findings, implications, and limitations of this work.

2 User Authentication Research and Practice in Healthcare Organizations

2.1 Literature Review

2.1.1 Search Strategy, Paper Selection, and Eligibility Criteria.

We examined papers from the ACM Digital Library and IEEE Xplore and used the following keywords in our queries: authentication; password; biometric; locimetric; drawmetric; healthcare; health. We reviewed 40 papers based on our inclusion criteria, which were published from January 1, 2015, to January 6, 2021.

2.1.2 Review Outcome.

Several literature reviews on user authentication research and practices in the healthcare domain exist. For example, Jayabalan and O'Daniel [56] and Fernández-Alemán et al. [41] reported a study on authentication factors in electronic health records. Mason et al. [74], Fatima et al. [40], and Okoh and Awad [79] reviewed the current state in biometric authentication in healthcare environments. Schwartze et al. [90] conducted a systematic literature review on authentication systems for securing clinical documentation workflows. Kumar et al. [65] conducted a review on user authentication in gadget-free healthcare environments.

The healthcare domain embraces unique constraints and characteristics [64] that are related to different access control scenarios that need to be supported not only from a healthcare staff perspective but also from a patient-centric approach. Starting from the medical staff (e.g., doctors, nurses, caregivers), there are numerous scenarios in which stakeholders interact with medical systems that are deployed on heterogeneous devices and within different contexts of use [38, 108]. For example, when doctors visit patients within the hospital, they typically have access to the patients’ records through a smartphone or tablet device [7], whereas accessing more controlled places, such as surgery rooms and intensive care rooms, necessitates a multi-factor and/or biometric-based authentication approach. From a patient's perspective, access to services and data is limited to a Web-based solution in which patients access their personal health records using a textual password [38], in combination with a one-time password as a second layer for authentication. These authentication methods are analyzed in the next section under the following perspectives: security, privacy, usability, memorability, user experience, user acceptance, and trust.

The majority of healthcare organizations currently employ traditional textual password solutions [61, 63, 110]. However, textual password schemes have known security issues and are constantly becoming less usable and less memorable due to strict password policies [38, 98]. To address memorability issues of using multiple textual passwords in different services within the hospital, healthcare organizations deploy Single Sign-On (SSO) solutions that allow the end users to enter their password credentials once in the beginning, and then access several independent services within their organization, without requiring them to re-enter their credentials. However, studies have shown that although SSO is effective in administrative contexts of medical staff, it creates difficulties in collaborative contexts due to the strict password policies of the healthcare organization. For example, medical staff need to frequently log out and log in when they change location in the hospital, or after a short period of inactivity [38, 49, 76]. According to Heckle and Lutters [49], given that the clinical staff frequently change locations in the hospital when taking care of patients, they are required to continuously log in to and log out of the system (e.g., due to system timeouts, different access control depending on the system used in the corresponding location, walking away from the screen). In addition, studies have shown that medical staff may typically utilize SSO features within their network; however, they also utilize different textual password credentials for accessing systems that are off the network of their healthcare organization.

Furthermore, the literature reveals other proposals for improving password security and usability, such as through mutual authentication or two-way authentication [52, 66]; group-based authentication for assisting the access and sharing of electronic health records among trusted members [69]; approaches suggesting practical recommendations for the creation of strong and usable passwords that combine minimum-strength, minimum-length, and blocklist requirements [98]; providing guidance and feedback during password creation [91]; and proposing alternative mechanisms, such as graphical user authentication schemes, which require users to draw secret gestures on an image or select a sequence of images as their secret key [1, 11, 13, 28]. In addition, healthcare environments entail unique constraints and characteristics with regard to the end user activities, workflows, and context of use. For example, given that patients and medical staff access data and services from different locations (e.g., within the hospital or through a trusted location), several works have proposed location-based authentication approaches in healthcare environments [56, 72].

In the past years, a variety of directives and regulations have been proposed that require the deployment of two-factor or multi-factor authentication solutions in healthcare environments aiming to add multiple layers of security in the user authentication process (e.g., U.S. Health Insurance Portability and Accountability Act (HIPAA), European Union Agency for Network and Information Security (ENISA)). Multi-factor authentication solutions typically combine a knowledge-based authentication method (e.g., a textual password) along with a token-based authentication method (e.g., smart card, one-time password sent to the users’ smartphone) [66]. Token-based authentication utilizing smart cards is one of the most common methods for multi-factor authentication in healthcare systems with numerous proposals in the literature [44, 48, 56, 60, 66, 71]. Works have also proposed three-factor authentication combining textual passwords, smart cards, and biometric technology for increased security and usability [33, 35, 57], and the work by Amin et al. [5] proposed a remote patient mutual authentication scheme using smart cards and elliptic curve cryptography. Furthermore, with the advent of smartphone technology in recent years, token-based authentication is achieved with the usage of the user's smartphone device that acts as a trusted token for multi-factor authentication [1, 45, 56, 92].

Finally, biometric-based authentication is constantly gaining market share, aiming to provide increased usability for accessing medical records without compromising the patients’ privacy and security [79]. Biometric technologies are typically based on information about the users’ physical characteristics (e.g., fingerprint, iris, face, voice) and/or behavioral characteristics (e.g., typing patterns, interaction and engagement patterns) [54]. Numerous biometric-based authentication schemes have been proposed that retrieve the users’ biometric characteristics either (1) through their interaction device (smartphone, laptop, etc.) [46, 111] or (2) through surroundings within smart environments, such as smart healthcare, automated monitoring, and smart manufacturing [47, 65]. Various biometric-based approaches have been proposed in the literature for granting access to medical records by utilizing voice acoustics’ analysis and audio-visual identity verification [94], physiological signals analysis (e.g., photoplethysmogram signals) [23, 112], face and voice analysis [75], hand geometry analysis [78], hand gesture spatial interaction analysis based on fingertips and joints [53], and periocular-based analysis [74].

2.2 Triangulating Results of Current State-of-the-Art with Healthcare Organizations

This section presents a user survey that assesses current user authentication practices at three European healthcare organizations, with the aim of validating and complementing the findings of the current literature on user authentication in the healthcare domain. Based on the analysis of the literature review, we formed the main research topics, which were further verified through a mixed evaluation method that included semi-structured interviews with relevant stakeholders (security officers, department managers, doctors) of the healthcare organizations.

2.2.1 Participating Healthcare Organizations and Stakeholders.

Stakeholders from three public European healthcare organizations, which support thousands of patients and users annually, participated in the survey: Zuyderland Medical Center,¹ Sittard, the Netherlands; Hospital Clinic Barcelona,² Barcelona, Spain; and Western General Hospital within NHS Lothian,³ Edinburgh, Scotland. A total of nine individuals participated in the user survey with varying roles in the aforementioned organizations (i.e., chief information security officers, enterprise architects, IT department managers, security experts, doctors, and project managers). Each stakeholder participated in a semi-structured interview that lasted approximately 45 minutes. Participation in the interviews was voluntary and could be canceled at any time.

2.2.2 Procedure.

A series of semi-structured interviews was conducted with key stakeholders from the participating healthcare organizations. The interviews were split into two parts. In Part A, participants were initially guided to an online consent form, and each one read and agreed to participate. Participants were then introduced to the survey, its purpose, and its objectives. In Part B, we conducted an initial profiling (approximately 5 minutes) of the participants asking questions that relate to the participant's background and position in the organization, with the aim to understand the background of the interviewee and the context of her or his answers. Then we discussed two main topics: Topic 1—User Authentication Policy (approximately 20 minutes), which was focused on eliciting details about the user authentication policy and procedures of the organization (e.g., how the policy was derived, since when the policy is valid), and Topic 2—Technical Details and Workflows (approximately 20 minutes), which was focused on eliciting details with regard to technical and security matters of the currently applied user authentication scheme and policy (e.g., what is the current password complexity of the applied authentication policy, which is the maximum number of days a password may be used).

2.2.3 Highlights of Participants’ Responses.

Responses of the interviewees and the organizations were anonymized. All organizations reported that their user authentication policy is based on current industry standards and best practices, and that they primarily apply textual passwords as their core means for authentication. The main password policy is based on a widely applied policy—that is, a textual password with a minimum length of eight characters containing no part of the user's real name or username, and including a minimum of one uppercase, one lowercase, one symbol, and one numeric character. The policy had variations across organizations in terms of character type and their combination.
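
The baseline policy reported by the organizations can be expressed as a small validation routine. The following is a minimal sketch rather than any organization's actual implementation: the function name, the use of `string.punctuation` as the symbol set, and the rule that a name fragment must be at least three characters long to count as "part of the user's real name or username" are all assumptions made for illustration.

```python
import string


def policy_violations(password, username, real_name):
    """Return a list of rule violations; an empty list means the password is accepted."""
    problems = []
    if len(password) < 8:
        problems.append("shorter than 8 characters")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase character")
    if not any(c.islower() for c in password):
        problems.append("no lowercase character")
    if not any(c.isdigit() for c in password):
        problems.append("no numeric character")
    if not any(c in string.punctuation for c in password):
        problems.append("no symbol")

    # Assumption: a "part" of the name is the username or any whitespace-separated
    # token of the real name, compared case-insensitively, if it has 3+ characters.
    lowered = password.lower()
    for token in [username] + real_name.split():
        if len(token) >= 3 and token.lower() in lowered:
            problems.append("contains part of the user's name or username")
            break
    return problems
```

For a hypothetical user with username `ejones` and real name `Emma Jones`, `policy_violations("Tr1cky!Pass", "ejones", "Emma Jones")` returns an empty list, whereas `"Emma2023!x"` is rejected because it contains the first name.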

Furthermore, one out of three organizations employed multi-factor authentication in certain scenarios: (1) when accessing the patients’ database, medical staff use an RFID badge, combined with a four-digit PIN code, and (2) when accessing the healthcare system from outside the organization's network, medical staff and patients are required to log in with their textual password, combined with a second factor for authentication based on a one-time password and/or a push notification that is sent to a third-party mobile application. In this respect, one participant stated: “Active directory is your first line of defense so that's why when you are internally it's ok and you login with your badge, so your badge is your second factor. If you are outside of the hospital that's also possible, you can get a remote reader at home, so you get an SMS or via the Microsoft app, you can authenticate and get your session” ∼ Security Expert. The same organization applies variations in the policy depending on the role of the user as well as the context of use. For example, exceptions for policies can be requested by end users. In addition, doctors may use their RFID badge to enter the emergency room. In this respect, one participant stated: “People can request exceptions on a policy and then we look at the case and decide whether we can change the policy” ∼ Security Expert.

We further asked participants whether their organization considers investing and deploying biometric technology, and the majority reported that they have considered this technology; however, this has not been implemented yet due to increased costs and known security and privacy issues within biometric technologies. For instance, facial recognition was given a trial by one organization, but there were problems in some cases. One participant stated: “The problem is that if you go a little away from the screen, or two persons are standing, one person is close and one is standing behind the screen, the system did not know which one is the user” ∼ Security Expert.

Moreover, interviews with end users (e.g., administrators, doctors) reveal that a high number of users expressed complaints about the authentication policy: “There are complaints about the complexity of the passwords, the amount of passwords they have to use, changing the passwords, so it's not a very nice picture” ∼ Administrator. Another participant reported that the users easily forget their passwords, either due to holidays or due to the frequent password changing, so there is often the need to reset them via the helpdesk: “They have problems to remember and sometimes they have to put the password in a post-it and the password is not hidden from the public when they are working in their desk” ∼ Security Expert. Several users of the organization also stated that they must remember and use more than one password, a factor that renders the authentication process harder to complete: “It feels like quite a large number. I would say at least 10 [passwords]” ∼ Doctor. In addition, the Web browser of these systems does not allow saving the password for the organization. Finally, users in general feel that they are putting a lot of effort into remembering passwords and need to log in several times per day. Some participants stated that they are more than willing to change their current authentication scheme, as long as it applies across multiple systems and it is not too complicated to use: “I certainly will be willing to change as long as it is applied across multiple systems. But if it's a new authentication type that's different for each system then that would cause problems” ∼ Doctor.

Finally, we asked participants to provide details about the “perfect authentication scheme” and a wish list for “better passwords”. The majority responded that they would like to have a secure system that respects their privacy and usability. One participant responded: “I would really like to leave our employees free and choosing what mechanism they want, the only concern is the level of security and its usability” ∼ Manager. Interest was expressed in the deployment of two-factor authentication methods utilizing the users’ smartphones. Another participant was very interested in the integration of alternative and usable authentication schemes; however, concerns were raised about the increased complexity and cost of applying new policies and systems in the organization's production line: “There are many procedures in order to make small changes. It is very difficult to implement. We are now testing another user authentication but this takes a lot of time and it will take as much time to implement it” ∼ Security Expert.

3 Research Motivation and Method

3.1 Research Motivation

Based on the aforementioned analysis, we conclude that healthcare organizations still rely on traditional knowledge-based authentication approaches, and, specifically, on textual passwords and/or location-aware approaches (e.g., RFID, VPN). Several reasons account for this: increased implementation and maintenance costs, the immaturity of new authentication approaches, and known security and privacy issues of new user authentication paradigms (e.g., biometrics), all of which negatively affect wide adoption of such technologies. Simultaneously, healthcare organizations’ experts are aware that textual passwords negatively affect usability and security aspects due to complex policies, and therefore seek novel and easy-to-adapt knowledge-based user authentication approaches as alternative solutions that do not disrupt the users’ familiarity and existing practice.

Furthermore, the analysis revealed that (1) a plethora of user authentication methods (knowledge-, token-, biometric based) has been introduced for healthcare environments, each one having its own strengths and weaknesses with regard to security, privacy, and user experience; (2) it is estimated that knowledge-based authentication mechanisms will continue to prevail in the next decades [70], even in combination with other approaches (e.g., token-based) or as fallback mechanisms, hence, new approaches need to partially rely on existing textual password approaches to support the technology transition of users; (3) user authentication in healthcare environments entails a mixture of unique constraints and challenges related to the location and context in which interaction takes place [38]; and (4) evidence has shown that a user's preference and task performance varies depending on the user (e.g., age, abilities) and the context of use (e.g., interaction device, screen size), suggesting that any specific solution might not please everyone [73].

Bearing in mind that user authentication in healthcare environments is performed by users with varying profiles, in different contexts of use, and on multiple heterogeneous devices, this article investigates whether end users would benefit from a flexible and personalized user authentication solution that would adapt and personalize different authentication mechanisms (graphical and textual) depending on their context of interaction, aiming to achieve a viable balance between security and usability [11, 13, 27, 29, 30, 59]. Our work is primarily driven by our vision to combine graphical and textual password mechanisms based on a new “Single-Secret Two Reflections” (SS2R) user authentication paradigm, which allows us to move from current generic “one-size-fits-all” authentication systems toward flexible, user-adaptable, and personalized authentication systems [12, 28]. The aim is to provide a viable and flexible authentication solution by following state-of-the-art practices in the healthcare domain and applicable within current healthcare organizations.

3.2 Research Method

The research work adopted a UCD methodology throughout the entire research, design, and development process. Multiple design iterations and a significant amount of evaluation have been incorporated into the research work, with the active participation of end users with the aim of improving the framework design. The key idea of applying a UCD approach was to partially move our focus away from the technical issues of security toward understanding the users and developing new approaches for offering personalized solutions within the healthcare domain. The research adopted a three-phase methodological approach as follows:

Phase A. The first phase involved the literature review on state-of-the-art user authentication research in the healthcare domain. To verify and triangulate the literature, we conducted semi-structured interviews with diverse key stakeholders (n = 9) of three European healthcare organizations, including chief information security officers, enterprise architects, IT department managers, security experts, doctors, and project managers. This phase lasted 6 months.

Phase B. The second phase involved the design and development of the DuoPass authentication system, which is based on the SS2R paradigm by following a UCD approach. With regard to design factors, we considered security factors, usability and user experience factors, adaptation and personalization, as well as security and usability key performance indicators. As part of this phase, we also set the key performance indicators that would be adopted for the evaluation study of the DuoPass authentication system, which included password guessability, password creation efficiency, memory time, login time, and users’ perceived security, usability, trust, and likeability. This phase lasted 12 months.

Phase C. The third and final phase involved the user evaluation with participants of three European healthcare organizations, during which we recorded users’ interactions with the suggested DuoPass approach vs. a state-of-the-art authentication approach, aiming to evaluate its security, memorability, and user experience. We conducted a patient-centric feasibility study during which users interacted with the proposed authentication system (n = 68) and a human guessing attack study (n = 92) focusing on vulnerabilities among people sharing common experiences within location-aware images used for graphical passwords. This phase lasted 11 months.

Table A1 in the appendix depicts our research methodology.

4 A Flexible and Personalized Locimetric User Authentication Paradigm in Healthcare

In this section, we propose a user authentication method, coined DuoPass, which is based on a novel SS2R authentication paradigm. We first provide details on the underlying theory and conceptual design of the approach. We further present the prototype designs and describe how we addressed security and usability aspects during the design of DuoPass.

4.1 Conceptual Design Based on the Dual Coding Theory

User scenario: From location-based memories toward location-aware passwords. Consider a scenario in which a patient, Emma, visits her hospital for a weekly checkup with her doctor. Emma drives her car through the entrance of the hospital and then parks her car. She further walks from the car parking lot through the hospital's garden, enters the building, and goes to the reception hall. She then registers at the reception hall, in which she confirms her appointment with her doctor. She is then asked to wait for 15 minutes until her appointment. During these 15 minutes, Emma walks to the hospital's cafeteria and orders a coffee and croissant until her appointment. Emma completes the checkup with her doctor, receives a prescription medication, and then leaves the hospital and drives back home.

During Emma's visit to the hospital, she created several real-life memories within the hospital (e.g., walk through the garden, visit at the cafeteria, appointment with the doctor). Based on the dual coding theory [80, 96], Emma encoded a series of visual and verbal stimuli within her long-term memory [6, 9], and more specifically within the episodic, semantic, and autobiographical memories [95, 102, 106], which entail information about certain events experienced in an individual's lifetime and the corresponding semantic information describing these events. Furthermore, according to the dual coding theory, the human brain comprises a visual cognitive sub-system, which is utilized during processing, representation, and recall of imagery information, as well as a verbal cognitive sub-system, which is utilized during processing, representation, and recall of verbal information [80]. For example, information such as the word cappuccino is represented in the human mind as a visual representation of a cappuccino coffee cup, as well as the word cappuccino. During recall, individuals retrieve and process both representations simultaneously or separately.

4.2 DuoPass Authentication Paradigm

DuoPass aims to leverage the dual coding theory through a novel SS2R authentication paradigm: patients create a single conceptual secret based on the personal location-based memories they have built through their interactions in certain locations within the hospital, and then reflect that secret on a graphical and/or textual password key. For creating the graphical password key, DuoPass presents location-aware images that depict a certain location of a hospital in which the patient had prior interaction. In addition, DuoPass gives the patient the option to create a textual password key that may then be utilized interchangeably with the graphical password based on the user's preference. Our Web-based solution intentionally includes a textual password as an option to avoid changing the current state-of-the-art practice in the healthcare domain, which is a method with which users are familiar. Hence, we anticipate that DuoPass will ease the transition from the current state of the art toward the new suggested approach, providing the option to users to switch to their preferred authentication type (graphical or textual).

Graphical passwords. The graphical password mechanism is based on cued-recall graphical authentication mechanisms [13], which ask users to draw secret gestures on a background image that acts as a cue. For its implementation, we follow design and development guidelines of Microsoft's PGA mechanism [58], deployed in Windows 8 and 10, which allows users to draw three types of gestures on the background image: taps (clicks), lines, and circles. Free line gestures are automatically converted into one of the three allowed gestures. To process the gestures, the mechanism creates a grid of the image containing 100 squares (segments) on the longest side, then divides the shortest side by the same scale. Rounding is not applied to any decimal segments, and the mechanism allows 0.25 segments size overflow at the rightmost side of the image. The approach of creating a grid of squares allows for storing the gestures based on their segment position on the grid rather than the coordinates in pixels. The following data is stored: for taps, the (x, y) coordinates of a point; for lines, the (x, y) coordinates of the starting and ending point; and for circles, the (x, y) coordinates of the center, the radius, and the directionality (clockwise/counterclockwise). The credentials are represented as a 7-tuple alphanumeric string (e.g., <g, x₁, y₁, x₂, y₂, r, d>), which consists of the gesture's type, location, and other attributes (e.g., radius and directionality in case of circles) [114], hashed using a hash function (e.g., sha256), and securely stored similar to text-based passwords.
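As an illustration of the storage scheme described above, the following minimal Python sketch maps pixel coordinates to grid segments, serializes gestures as 7-tuples, and hashes the result. The function names, the use of zero for unused tuple fields, and the per-user salt are assumptions made for this sketch and are not taken from the DuoPass implementation; any tolerance-based matching of gestures is also left out.

```python
import hashlib

GRID_LONG_SIDE = 100  # segments along the longest image side, as described in the text


def to_segment(px, py, width, height):
    """Map a pixel position to grid-segment indices.

    The segment edge is max(width, height) / 100 pixels, so the shorter
    side may hold a fractional number of segments (no rounding is applied
    to the grid itself). Integer arithmetic avoids floating-point drift.
    """
    longest = max(width, height)
    return (px * GRID_LONG_SIDE // longest, py * GRID_LONG_SIDE // longest)


def encode_gesture(kind, x1, y1, x2=0, y2=0, r=0, d=0):
    """Serialize one gesture as the 7-tuple <g, x1, y1, x2, y2, r, d>.

    Fields that a gesture type does not use are set to 0 (an assumption).
    """
    return f"<{kind},{x1},{y1},{x2},{y2},{r},{d}>"


def hash_password(gestures, salt):
    """Hash the ordered gesture strings with SHA-256 (salt is an assumption)."""
    payload = "".join(gestures).encode("utf-8")
    return hashlib.sha256(salt + payload).hexdigest()
```

For a 1920 × 1080 image, a tap at pixel (960, 540) falls into segment (50, 28) under this mapping, and the stored value depends on the order of the gestures because they are concatenated before hashing.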

Textual passwords. DuoPass follows state-of-the-art security metrics and authentication policies with regard to the implementation of textual passwords [19, 62]. The textual password keys rely on a basic 16-character password policy, allowing the creation of dictionary words with no composition requirements, which is more usable and as secure as traditional complex 8-character policies [62] (NIST predicts that both policies generate 30 bits of security entropy [19]).
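The 30-bit figure for both policies can be reproduced with the length-based entropy heuristic from NIST SP 800-63 as it is commonly summarized: 4 bits for the first character, 2 bits each for characters 2 to 8, 1.5 bits each for characters 9 to 20, and 1 bit per character thereafter, plus bonuses for composition rules and dictionary checks. The sketch below is a simplified reading of that heuristic; the function name is hypothetical, and the bonus values are passed in as parameters rather than derived from the NIST tables.

```python
def nist_entropy_bits(length, composition_bonus=0.0, dictionary_bonus=0.0):
    """Estimate password entropy (in bits) from length alone.

    Simplified reading of the NIST SP 800-63 heuristic; bonuses are
    supplied by the caller (assumption) instead of being looked up.
    """
    bits = 0.0
    for position in range(1, length + 1):
        if position == 1:
            bits += 4.0      # first character
        elif position <= 8:
            bits += 2.0      # characters 2-8
        elif position <= 20:
            bits += 1.5      # characters 9-20
        else:
            bits += 1.0      # characters 21 and beyond
    return bits + composition_bonus + dictionary_bonus


# Basic 16-character policy, no composition rules or dictionary check.
basic16 = nist_entropy_bits(16)
# Complex 8-character policy with 6-bit composition and 6-bit dictionary bonuses.
comp8 = nist_entropy_bits(8, composition_bonus=6.0, dictionary_bonus=6.0)
```

Under these assumptions both `basic16` and `comp8` evaluate to 30 bits.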

In this context, DuoPass allows users to create a secret graphical and/or a textual password. During graphical password composition, DuoPass deploys images depicting popular sceneries of the hospital (e.g., garden, reception hall, cafeteria). The user is asked to select an image of her preference and then create a graphical password by drawing secret gestures on certain regions of the image based on the experience she had with the depicted content in the image. For example, based on the aforementioned user scenario, a conceptual secret derived from Emma's episodic memory and experiences at the hospital would be “the cappuccino I drank at the hospital”. Emma would reflect this secret on the graphical password by selecting, for example, a coffee cup and the exact table where she sat while having her coffee in the hospital's cafeteria. As a next step, DuoPass also allows users to create a textual password by asking the patient to reflect the conceptual-based graphical secret as a textual representation by articulating the secret—for example, the textual version of the secret would be “CappuccinoIDrankAtTheHospitalsCafeteria”.

Hence, the SS2R paradigm extends existing works in knowledge-based user authentication based on the dual coding theory, aiming to (1) enhance security by enabling users to select regions on an image that are familiar to the users and not to the attackers; (2) enhance memorability through ownership and the prior experience and knowledge of each individual user; and (3) support user authentication adaptability, since users can choose their preferred way to log in based on their needs and context of use. For example, users who are on the move might prefer to log in through touch-based graphical password input on a tablet device, whereas users who are in the office might prefer to log in through textual password input on a conventional desktop computer.

5 Feasibility Study

The goal of the feasibility study is threefold: (1) compare the security strength of graphical passwords when users create a graphical password based on a location-aware image vs. non-location-aware image; (2) compare the memorability aspects when users create a graphical password based on a location-aware image vs. non-location-aware image; and (3) elicit the users’ perceived security, usability, memorability, trust, and likeability toward the DuoPass paradigm.

5.1 Research Questions

We investigated the following research questions. The aim of RQ₁, RQ₂, and RQ₃ is to compare the suggested personalized and location-aware graphical password scheme of DuoPass with the state-of-the-art approaches in graphical password authentication. In addition, after analyzing quantitatively the observed effects, we investigate in RQ₄ the perceived security, usability, memorability, trust, and likeability toward the DuoPass approach, and in RQ₅ we investigate which of the authentication types of DuoPass (graphical vs. textual) the users prefer for authentication:

RQ₁: Is there a significant improvement in security strength of the selected graphical passwords between the DuoPass condition (experimental group) and the state-of-the-art condition (control group)?

RQ₂: Is there a significant difference in graphical password entry efficiency between the DuoPass condition (experimental group) and the state-of-the-art condition (control group)?

RQ₃: Is there a significant improvement in memorability between the DuoPass condition (experimental group) and the state-of-the-art condition (control group)?

RQ₄: Do end users score positively with regard to perceived security, usability, memorability, trust, and likeability toward the DuoPass paradigm?

RQ₅: Which authentication type (graphical vs. textual) do users prefer for authentication?

5.2 Image Sets: Location-Aware and Non-Location-Aware Image Semantics

We created two image sets to control the image semantics and consequently investigate the research questions as follows: (1) a location-aware image set (experimental group): this image set included images that depicted content relevant to the participants’ hospital (e.g., hospital cafeteria, reception hall, front yard), which was related to their location-based experiences and memories created during their visits at the hospital, and (2) a non-location-aware image set (control group): this image set included images that depicted generic content that was not relevant to the users (e.g., sceneries from landscapes, people) to control the participants’ familiarity with the image content. Both image sets followed existing research, which revealed that end users typically choose images depicting sceneries [4, 18, 59, 83, 86, 113, 114].

Furthermore, bearing in mind that the complexity of an image and the number of Points-of-Interest (PoI) (regions of an image that attract the users’ attention) affect the security strength of user-created graphical passwords [28, 59], we carefully selected images that had similar content complexity and number of PoIs for both user groups (experimental and control). This was achieved by applying saliency maps and saliency filters [28, 81] to detect salient regions on the images, entropy estimators [20] to calculate the image complexity, and computer vision techniques to detect PoIs. Figure 1 illustrates a subset of the images used in the study. Table 1 illustrates the means of image complexity and mean number of PoIs for each image set.

Fig. 1.

Table 1.

	Control		Experimental
	Mean	St. Dev.	Mean	St. Dev.
Complexity in Bits	7.53	.15	7.47	.14
Number of PoI Regions	6.77	.62	7.11	.73

Table 1. Means of Image Complexity and Number of PoIs for Each Image Set
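The complexity values in Table 1 are reported in bits and lie just below 8, which is what a Shannon entropy over 8-bit grayscale intensities would produce for visually rich images. The sketch below shows that kind of estimator; it is an assumption that the entropy estimators of [20] take this form, and the function name is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy_bits(pixels):
    """Shannon entropy (bits) of a flat sequence of pixel intensities.

    For 8-bit grayscale input the result lies between 0 (a constant
    image) and 8 (all 256 intensities equally frequent).
    """
    counts = Counter(pixels)
    total = len(pixels)
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy
```

A flat list of intensities can be obtained from any image library; the function itself only needs the values.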

5.3 Procedure and Participants

To investigate the research questions, we used a between-subjects study design with two groups of users: the experimental group, which used an authentication system with location-aware images based on the suggested DuoPass paradigm, and the control group, which used an authentication system with non-location-aware images based on current state-of-the-art approaches in graphical user authentication. Specifically, the experimental group included patients from three different hospitals, who received location-aware images (i.e., image content depicting sceneries from their hospitals) (Figure 1(a)), whereas the control group included end users in a non-healthcare context, who received non-location-aware images that depicted generic content that was not familiar to them (Figure 1(b)). To avoid bias, we provided a set of six images for each group during password composition, and each participant chose one image on which to create their secret and, subsequently, their graphical password.

A total of 68 individuals (36 in the experimental group and 32 in the control group), ranging in age from 20 to 60 years, were recruited and split in the two groups. To assure that users in both groups were motivated to use secure passwords, we applied the user authentication task in the frame of an online service. Users were asked to perform specific tasks (e.g., access a specific service and view information) that first required them to login. This way, we did not explicitly ask the participants to login to keep the authentication task as a secondary task of interaction and hence increase ecological validity. All individuals participated voluntarily and provided their consent that their interactions would be recorded anonymously in the context of an experimental research study. In addition, the participants could opt out of the study any time they liked.

The experiment was split in two phases. In Phase A (Day 0), participants were introduced to the assigned authentication system, completed a questionnaire on demographics, and then created and confirmed their password key. Users then completed a short task within the service, requiring them to first login. Phase B was performed on Day 1, Day 3, and Day 6 after Phase A. In all sessions, we asked participants to complete a task in the online service, which was only accessible through login, during which they had to recall their password key and access the service through the assigned authentication system.

5.4 Data Metrics

Graphical password strength. We measured graphical password strength based on an accredited password guessability metric [113, 114], which is calculated based on the number of guesses required to crack the users’ passwords. Based on existing approaches that have applied this metric [28, 59], we similarly implemented and applied a brute-force attack model that considers PoIs (i.e., regions on an image that attract the users’ attention), starting from segments covering the PoI segments, then checking the neighboring segments, and finally checking the rest of the segments.
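The guess ordering of the PoI-assisted attack can be sketched as follows for a single tap position. This is a simplified illustration only: the neighborhood radius, the function names, and the restriction to one gesture are assumptions, whereas the attack model used in the study enumerates complete three-gesture passwords.

```python
def poi_ordered_segments(cols, rows, poi_segments, radius=1):
    """Return all grid segments in attack order.

    Order: PoI segments first, then segments within `radius` of a PoI
    segment, then every remaining segment in row-major order.
    """
    poi = list(dict.fromkeys(poi_segments))  # de-duplicate, keep order
    seen = set(poi)
    neighbors = []
    for x, y in poi:
        for dx in range(-radius, radius + 1):
            for dy in range(-radius, radius + 1):
                candidate = (x + dx, y + dy)
                inside = 0 <= candidate[0] < cols and 0 <= candidate[1] < rows
                if inside and candidate not in seen:
                    seen.add(candidate)
                    neighbors.append(candidate)
    rest = [(x, y) for y in range(rows) for x in range(cols) if (x, y) not in seen]
    return poi + neighbors + rest


def guesses_needed(target, ordering):
    """Number of guesses until `target` is reached in the given ordering."""
    return ordering.index(target) + 1
```

On a 5 × 5 grid with a single PoI at (2, 2), a selection on the PoI is found on the first guess, whereas a selection in the corner (0, 0) is reached only after the PoI and its eight neighbors have been tried.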

Password composition time. Password composition time is the time required to create the graphical password, measured from the moment the image is displayed until the end user successfully completes the password composition task.

Memorability. For measuring memorability, we used memory time [97], which is the greatest length of time between a password creation and a successful password login using the same password.
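A minimal sketch of this metric, assuming each login attempt is logged as a pair of elapsed hours since the start of the study and a success flag (the data layout and function name are hypothetical):

```python
def memory_time_hours(created_at, attempts):
    """Longest interval between password creation and a successful login.

    `created_at` and the attempt timestamps are hours since study start;
    `attempts` is a list of (timestamp, succeeded) pairs. Returns 0.0 if
    the user never logged in successfully after creation.
    """
    intervals = [
        timestamp - created_at
        for timestamp, succeeded in attempts
        if succeeded and timestamp >= created_at
    ]
    return max(intervals, default=0.0)
```

For a user who creates the password at hour 0, logs in successfully at hours 24 and 72, and fails at hour 144, the function returns 72.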

Users’ perceived security, memorability, trust, and likeability. At the end of the experiment, we asked participants from the experimental group on aspects that relate to perceived security, memorability, trust, and likeability of the proposed paradigm. We also measured usability aspects by utilizing the System Usability Scale (SUS) [17], which is a widely applied instrument for measuring password usability.
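For reference, the standard SUS scoring procedure converts ten responses on a 1 to 5 scale into a score between 0 and 100: odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5. The function name below is illustrative.

```python
def sus_score(responses):
    """Compute the System Usability Scale score from ten 1-5 responses."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    total = 0
    for item, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("responses must be between 1 and 5")
        # Odd items are positively worded, even items negatively worded.
        total += (response - 1) if item % 2 == 1 else (5 - response)
    return total * 2.5
```

A participant who answers 3 to every item obtains a score of 50, and the most favorable response pattern (5 on odd items, 1 on even items) yields 100.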

5.5 Analysis of Results

5.5.1 Security Strength Between the Control and Experimental Group (RQ₁).

To investigate RQ₁, we ran two security analyses to investigate whether there are differences in security strength of the user-created graphical passwords between the control and experimental user groups. The first analysis compared password guessability that was based on a naïve brute-force attack, whereas the second analysis compared password guessability that was based on the PoI-assisted brute-force attack. Figure 2 (left) illustrates the means of password guessability among user groups, as assessed by the naïve and the PoI-assisted brute-force attack model. For the naïve brute-force attack, we ran an independent samples t-test to determine whether the two user groups (control vs. experimental) generated different password strengths in terms of password guessability. The assumption of homogeneity of variances was not violated, as assessed by Levene's test for equality of variances (p = .075). There were no significant outliers in the data, as assessed by inspection of boxplots, and data were normally distributed, as assessed by Shapiro-Wilk's test (p > .05). Results revealed significant differences with a mean difference of 22 million guesses (95% CI, –5.7 million to 1.28 million), t(66) = –1.261, p = .021. In particular, user-chosen graphical passwords of the experimental group required 53 million guesses to crack, whereas for the control group, 31 million guesses were required.

Fig. 2.

For the PoI-assisted brute-force attack, we ran a Welch t-test to determine whether the two user groups (control vs. experimental) generated different password strengths in terms of password guessability, due to the assumption of homogeneity of variances being violated, as assessed by Levene's test for equality of variances (p = .048). There were no significant outliers in the data, as assessed by inspection of boxplots, and data were normally distributed, as assessed by Shapiro-Wilk's test (p > .05). Results revealed significant differences with a mean difference of 30 million guesses (95% CI, –6 million to –346,000), t(36.165) = –2.140, p = .039. In particular, user-chosen graphical passwords of the experimental group required 47 million guesses to crack, whereas those of the control group required 17 million guesses to crack. Figure 2 (right) illustrates the percentage of passwords cracked indicating that users from the control group exhibited a higher percentage of passwords cracked than users from the experimental group. The percentage of graphical passwords cracked reached 100% for the control group within 2²⁶ guesses, and for the experimental group within 2²⁹ guesses.
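The non-integer degrees of freedom reported for the Welch t-test follow from the Welch-Satterthwaite approximation, which does not pool the two group variances. The sketch below computes the statistic and its degrees of freedom for two arbitrary samples; it uses made-up inputs and does not reproduce the study data.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df
```

When the two variances differ, `df` falls below the pooled value of n_a + n_b − 2, which is why the Welch test is the more conservative choice once Levene's test indicates unequal variances.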

To further verify the security strength of the created passwords, we took an extra step to analyze the users’ individual gestures with respect to PoI regions (i.e., regions that attract the users’ attention and are prone to automated guessing attacks) [28]. To identify the PoIs of each image, we followed a semi-automated image analysis approach by applying saliency maps and saliency filters [81] to detect salient regions on the images and computer vision techniques to detect PoIs [28]. Figure 3 (left) illustrates an example of a hospital image used and its corresponding salient regions as detected through the image analysis (Figure 3, right).

Fig. 3.

A Mann-Whitney U test was run to determine if there were differences in the number of PoI selections between the control and experimental groups. Distributions of values for the two groups were similar, as assessed by visual inspection. The median number of PoI selections for the control group (1.72) was statistically significantly higher than that for the experimental group (1.53), U = 224.000, z = –4.561, p < .001, using an exact sampling distribution for U [36]. We further ran an independent-samples t-test, with the user group (control vs. experimental) as the independent variable, and the proportion of gestures falling into PoI regions as the dependent variable. The analysis (Figure 4) revealed that users of the experimental group made a lower proportion of selections falling into PoI regions (0.45 ± 0.04) than users of the control group (0.71 ± 0.04), a statistically significant difference of 0.26 ± 0.05 (95% CI, .15 to .37), t(65) = 4.93, p < .001. In addition, the effect size (Hedges' g = 1.112 [50]) indicates a large effect since it is greater than 0.8 [24].
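Hedges' g is Cohen's d computed with the pooled standard deviation and multiplied by a small-sample correction factor. The sketch below uses the common approximation 1 − 3/(4·df − 1) for that factor; the function name and the example values are illustrative and are not the study data.

```python
import math


def hedges_g(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Hedges' g from group means, standard deviations, and sizes."""
    df = n_1 + n_2 - 2
    pooled_sd = math.sqrt(((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / df)
    cohens_d = (mean_1 - mean_2) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)  # approximate small-sample correction
    return cohens_d * correction
```

With two groups of 20, means of 10 and 8, and a common standard deviation of 2, Cohen's d is 1 and the correction shrinks it slightly to about 0.98.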

Fig. 4.

To further investigate whether individuals, who share common experiences within the sceneries depicted in the location-aware images, tend to create similar passwords when they use the same image during password creation, we first split the participants from the experimental group into subgroups based on the image they used. In the sample of the experimental group (n = 36), one out of nine images was not used by any participant. From the remaining eight images that were selected by participants, two images were selected by only one participant, and six images were selected by more than one participant, thus forming six subgroups of participants.

The DuoPass graphical password mechanism takes into consideration the order and the type of gestures (e.g., circles are more complex than simple taps but less complex than lines⁴). To understand the similarities in users’ password selections, however, we disregarded the order and the type of the gestures and focused on the positions of the password selections. To do so, we simplified each gesture as follows: for circles, we disregarded the radius and the directionality and kept only the center of the circle as an (x, y) segment, whereas for lines, we considered only the (x, y) segment of the starting point of the line. Table 2 summarizes the similarities in image regions across users who created their password on the same image. Accordingly, out of 102 gestures made by 34 users (of the n = 36 participants, we excluded the two whose image was selected by no other participant), 6 users chose one same region, 2 users chose two same regions, and no user selected all three same regions. We would also like to note that when we consider the exact order of the gestures, there are no observed similarities in image regions across all subgroups and all participants.

Table 2.

Image Subgroup	No. of Users Who Selected the Same Image	1 Common Region (out of 3)	2 Common Regions (out of 3)	3 Common Regions (out of 3)
1	7	2	1	–
2	5	–	–	–
3	7	1	–	–
4	4	1	–	–
5	6	2	1	–
6	5	–	–	–
Total	34	6	2	0

Table 2. Summary of the Similarities in Image Regions Across Users who Created Their Password on the Same Image
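The position-only comparison behind Table 2 can be sketched as follows. Each gesture is reduced to a single (x, y) segment, and two passwords are compared by the number of segments they share, irrespective of order and gesture type. The dictionary layout of a gesture and the pairwise comparison are assumptions made for this sketch.

```python
def simplify(gesture):
    """Reduce a gesture to one (x, y) grid segment.

    Circles keep only their center, lines only their starting point,
    and taps their own position.
    """
    if gesture["type"] == "circle":
        return (gesture["cx"], gesture["cy"])
    if gesture["type"] == "line":
        return (gesture["x1"], gesture["y1"])
    return (gesture["x"], gesture["y"])


def common_regions(password_a, password_b):
    """Count segments shared by two passwords, ignoring order and type."""
    segments_a = {simplify(g) for g in password_a}
    segments_b = {simplify(g) for g in password_b}
    return len(segments_a & segments_b)


# Hypothetical passwords of two users who chose the same image.
user_a = [
    {"type": "tap", "x": 10, "y": 20},
    {"type": "circle", "cx": 30, "cy": 40, "r": 5, "d": "cw"},
    {"type": "line", "x1": 50, "y1": 60, "x2": 70, "y2": 80},
]
user_b = [
    {"type": "tap", "x": 30, "y": 40},
    {"type": "tap", "x": 1, "y": 1},
    {"type": "line", "x1": 10, "y1": 20, "x2": 0, "y2": 0},
]
shared = common_regions(user_a, user_b)
```

In this made-up example the two users share two regions even though none of their gestures match in type, which is the kind of overlap the simplification is meant to expose.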

5.5.2 Graphical Password Composition Efficiency Between the Control and Experimental Group (RQ₂).

To investigate RQ₂, we ran a two-way mixed analysis of variance (ANOVA) with the user group (control vs. experimental) and users’ password selections (three consecutive selections) as the independent variables, and the time to make each password selection as the dependent variable. There were no significant outliers, as assessed by inspection of boxplots. The data were normally distributed, as assessed by Shapiro-Wilk's test of normality (p > .05). There was homogeneity of variances (p > .05), as assessed by Levene's test of homogeneity of variances. The analysis revealed significant differences between the three users’ password selections on the time to compose the graphical password, F(1, 66) = 86.942, p < .01, partial η² = .568. The analysis revealed that there was no interaction between the user group and users’ password selections on the time to compose the graphical password, F(1, 66) = 1.459, p = .231, partial η² = .022. Figure 5 (left) depicts the time to make each of the three graphical password selections.

Fig. 5.

We further examined simple main effects for each password selection. Data are mean ± standard error unless otherwise stated. The analysis revealed that the time to create the last (third) selection between the two groups was statistically significant (control: 1,109.81 ± 1,222.91 msec vs. experimental: 650.83 ± 509.97 msec) with a mean difference of 458.97 msec, F(1, 66) = 4.161, p = .043, partial η² = .06. With regard to the first and second selections, there were no significant differences between the control and experimental groups (p > .05).

5.5.3 Memorability Differences Between the Control and Experimental Group (RQ₃).

To investigate RQ₃, we measured login task completion time for each of the login sessions over the 7-day period, and memory time, which is the maximum amount of time (in hours) someone could effectively remember their password from the day of creation. We initially analyzed login task completion time using a mixed-effects analysis (with the lme4 package in R) [10], since this enabled us to handle all variables of the study while accounting for repeated measures of individuals (four login sessions over a period of 7 days) and for missing user data; for example, a user who did not participate in some of the sessions across the 7 days of the study can still be included in the analysis rather than removed from the sample [82]. In this respect, we performed a mixed-effects analysis of the relationship between the time to successfully authenticate (including any failed attempts that eventually ended in a successful authentication) and the user group. As fixed effects, we entered the user group (control and experimental) into the model. As random effects, we used subjects to account for non-independence of measures. Visual inspection of residual plots revealed that linearity and homoscedasticity were not violated. p-Values were obtained by likelihood ratio tests of the full model with the effect in question against the model without the effect in question [107]. The analysis revealed that the user group had no impact on the time needed to authenticate (χ²(1) = .171, p = .679). The mean login time for the users of the control group was 7.47 ± 4.78 seconds, whereas for the users of the experimental group, it was 7.63 ± 6.3 seconds. Figure 5 (right) depicts the mean login time across the user group for each of the four sessions.

We further analyzed the users’ login attempts and found that, overall, most login attempts were completed using the graphical password. Regarding the textual password login attempts, 5 out of 36 individuals from the experimental group also logged in once using a textual passphrase with a mean login time of 5.21 ± 2.43 seconds, whereas 10 out of 32 individuals from the control group also logged in once using a textual passphrase with a mean login time of 8.66 ± 3.05 seconds.
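The p-value of a likelihood ratio test with one degree of freedom can be checked by hand, because the upper tail of a chi-square distribution with one degree of freedom equals erfc(√(x/2)). The following sketch uses only the Python standard library; the function names are illustrative, and the log-likelihoods of the two fitted models would come from the statistics package used for fitting.

```python
import math


def lrt_statistic(loglik_full, loglik_reduced):
    """Likelihood ratio statistic from the two models' log-likelihoods."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi_square_p_value_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Statistic reported in the text for the effect of user group on login time.
p_value = chi_square_p_value_df1(0.171)
```

Evaluating the last line gives a value of roughly 0.679, in agreement with the p-value reported above.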

The maximum memory time that someone could achieve was approximately 168 hours (7 days × 24 hours). To investigate memorability, we conducted an independent-samples t-test, with the user group (control vs. experimental) as the independent variable and the memory time as the dependent variable. The analysis revealed that memory time did not differ significantly between the two user groups, t(66) = –.961, p = .340. Memory time of the control group was 106.13 ± 76.8 hours, whereas memory time of the experimental group was 121.58 ± 55.2 hours.

5.5.4 Users’ Perceptions Toward Security and Experience of the DuoPass Approach (RQ₄).

To investigate RQ₄, we conducted a post-study survey to elicit the users’ perceptions (experimental group) of the security, memorability, trust, usability, and likeability of the DuoPass system based on their interactions. For this purpose, we designed a questionnaire by following state-of-the-art works and guidelines on eliciting perceived security, trust, memorability, usability, and user experience [8, 17, 21]. Example statements from the survey include “Overall, how secure do you find the DuoPass password system?”, “How mentally demanding was the login task?”, and “I trust in the ability of the DuoPass password system to protect my privacy”. Users rated the statements on a 5-point Likert scale, with the labels changing depending on the question (e.g., 1: Strongly disagree to 5: Strongly agree; 1: Very insecure to 5: Very secure). Perceived usability was measured through the SUS [17], an accredited system usability instrument that is widely used in password studies [21]. The survey also investigated the likeability of the personalized and flexible approach of DuoPass, with users rating the statements on a 5-point Likert scale (1: Not at all to 5: Absolutely). Figure 6 illustrates the responses of participants toward perceived security, memorability, trust, and likeability.

Fig. 6.

Results revealed that the majority of participants perceived the DuoPass system as secure (75%), reported low mental demand in recalling the password (77%), and could effectively recall their password (84%), whereas 80% of the participants trusted the technology and its ability to keep their data private and secure. Furthermore, when participants were asked whether they liked the flexible and personalized approach for user authentication, the majority of participants (91%) liked the idea extremely (21/36) or very much (12/36), with the remaining three users liking it moderately (1/36) or slightly (2/36). Table 3 also summarizes the likeability scores per healthcare organization, indicating that patients across organizations reached a consensus on liking the suggested approach, which increases the ecological validity of the finding.

Table 3.

	Healthcare Organization 1	Healthcare Organization 2	Healthcare Organization 3	Total
Extremely	10	7	4	21
Very much	5	3	4	12
Moderately	1	0	0	1
Slightly	1	0	1	2
Not at all	0	0	0	0

Table 3. Likeability of DuoPass Per Healthcare Organization

Representative positive responses from participants included the following: “I like it very much, [it's] a great new approach”, “easy yet hard to crack by (hackers)”, “no one knows what I see in it”, “easier to remember”, “it's a much trickier way for criminals to find out what someone has entered”, “accessible to everyone”, “easy to use, especially through the use of images that capture the imagination of the user and therefore easier to remember and harder to find for people who want to harm. Each image is basically different and very personal”, “very personal password”, “I would like a lot that the images displayed were personal images that I could upload”, “genius”, “a sentence is better for me to remember”, “point out the many possibilities with regard to places in the photo”, “it was very easy to remember”, “Very individual. Less prompts requirements usual password and feels very secure because of that”, “pictures are good”, “Easy to use and bespoke”, “Works for people who learn using pictures. Hopefully easier to remember”, “great not to have to remember yet another password”, “it is very personalized”, “not having to remember text”, “It is nice and quick to be able to log with 3 clicks you remember from an almost infinite possibility of clicks and orders”, “For people who have visual memory, using images instead of text it's probably easy to remember”, “Better for dyslexic people”, “I like the concept of using picture passwords, it can make it easier to remember”.

Negative responses from participants included the following: “there are some places in the images that obviously pop out more and possibly people is more prone to use them in their password, making this system more insecure”, “people will use the easiest gestures and this could be a safety concern”, “if you don't use it often would it be still memorable?”, “with a photo you have to be able to use the same photo everywhere, because otherwise I can't remember the login code”, “My main concern is that choices of pass gestures and locations are not truly random and can be figured out using a user's publicly available data, such as for instance a social network profile”, “Perhaps tapping on the heads is too obvious, people might create weak passwords”, “security, if I pick 3 simple gestures it may be easier for a criminal to guess”.

Finally, we measured system usability based on SUS, with participants giving an overall SUS score of 74.77%. Table 4 summarizes the SUS scores of patients across the participating healthcare organizations. Based on the literature, the average SUS score is 68% [89]. A score under 68% indicates that the system has usability issues that need improvement, whereas a score above 68% indicates that the system follows good usability practices. The scores across the three healthcare organizations were 72.14% for Healthcare Organization 1, 74.58% for Healthcare Organization 2, and 80% for Healthcare Organization 3, with an overall score of 74.77%. Such results are encouraging for further investigating and improving the system since the score suggests that the DuoPass system scores very well in usability, end users like the system, and they can easily complete the authentication-related tasks. Nevertheless, given that the score is below 80%, there are still aspects that require improvement. For example, during the studies, some patients had difficulties entering their graphical password through the developed gesture input mechanism; hence, next steps entail improving the gesture input functionality. In addition, we conducted a one-way ANOVA to determine if the SUS score was different for people belonging to different healthcare organizations. There were no outliers, as assessed by boxplot; data were normally distributed for each group, as assessed by Shapiro-Wilk's test (p > .05), and there was homogeneity of variances, as assessed by Levene's test of homogeneity of variances (p = .843). Data are presented as mean ± standard error. The SUS score increased from 72.14 ± 4.23% to 74.58 ± 5.59% to 80 ± 5.10%, in that order, but the differences between the healthcare organizations were not statistically significant, F(2, 36) = 0.629, p = .538.

Table 4.

	Healthcare Organization 1	Healthcare Organization 2	Healthcare Organization 3	Overall
SUS Score	72.14%	74.58%	80%	74.77%

Table 4. SUS Scores Across Healthcare Organizations
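The overall score in Table 4 is consistent with a participant-weighted mean of the three organization scores. The sketch below assumes that the group sizes equal the column totals of Table 3 (17, 10, and 9 participants); this is an inference from that table and is not stated explicitly in the text.

```python
def weighted_mean(values, weights):
    """Mean of `values` weighted by `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


sus_by_organization = [72.14, 74.58, 80.0]   # from Table 4
participants = [17, 10, 9]                   # assumed from Table 3 column totals
overall_sus = weighted_mean(sus_by_organization, participants)
```

The result is approximately 74.78, which matches the reported 74.77% up to the rounding of the per-organization means.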

5.5.5 Summary of Main Findings.

The experimental evaluation study revealed interesting insights related to security, memorability, and user experience with regard to the suggested DuoPass approach. Table 5 provides a summary of the main findings. Users make arbitrary choices in knowledge-based user authentication, which decreases the security. In locimetric passwords, this is a well-known and highly researched issue. The DuoPass approach aims to overcome this issue by recommending personalized images that are related to the users’ prior experiences in the healthcare environment. As such, we expected to improve security and at least retain memorability and user acceptance. Both security and memorability analyses provide evidence that the DuoPass approach assisted end users in making password choices based on their experiences, overcoming arbitrary choices, which are one of the main reasons for decreased security and memorability in locimetric approaches.

Table 5.

	Experimental Group	Control Group	Significance
RQ₁: Is there a significant improvement in security strength of the selected graphical passwords between the DuoPass condition (experimental group) and the state-of-the-art condition (control group)?
Naïve Brute-Force Attack	53 million	31 million	Mean difference: 22 million guesses (95% CI, –5.7 million to 1.28 million), t(66) = –1.261, p = .021
PoI-Assisted Brute-Force Attack	47 million	17 million	Mean difference: 30 million guesses (95% CI, –6 million to –346,000), t(36.165) = –2.140, p = .039
PoI Selections	0.45 ± 0.04	0.71 ± 0.04	Mean difference: 0.26 ± 0.05 (95% CI, .15 to .37), t(65) = 4.93, p < .001
Representative user responses: “Very individual. Less prompts requirements usual password and feels very secure because of that”, “The picture with gestures seems very robust, I think it would be very hard to hack”.
RQ₂: Is there a significant difference in graphical password entry efficiency between the DuoPass condition (experimental group) and the state-of-the-art condition (control group)?
Password Composition (third selection)	650.83 ± 509.97 msec	1,109.81 ± 1,222.91 msec	Mean difference: 458.97 msec, F(1, 66) = 4.161, p = .043
Representative user responses: “3 clicks is faster and easier to remember than a 16 character password”, “it's something I would like as an option, but I'd still need a text password. I think connecting dots in a user chosen pattern would work better rather than arbitrary shapes on a screen”.
RQ₃: Is there a significant improvement in memorability between the DuoPass condition (experimental group) and the state-of-the-art condition (control group)?
Login time	7.63 ± 6.3 seconds	7.47 ± 4.78 seconds	χ²(1) = .171, p = .679
Memorability	121.58 ± 55.2 hours	106.13 ± 76.8 hours	t(66) = –.961, p = .340
Representative user responses: “I like the concept of using picture passwords, it can make it easier to remember”, “easier to remember”, “This is an easy to remember password sequence that visually minded users will likely find very appealing”.
RQ₄: Do end users score positively with regard to perceived usability and likeability toward the DuoPass paradigm?
Perceived Security	75% positive	–	–
Perceived Memorability	84% positive	–	–
Perceived Trust	80% positive	–	–
Perceived Usability (SUS)	74.77% positive	–	–
Likeability	91% positive	–	–
Representative user responses: “I like it very much, it a great new approach”, “I feel it could be a robust system”, “easy yet hard to crack by (hackers)”, “no one knows what I see in it”, “it's a much trickier way for criminals to find out what someone has entered”.
RQ₅: Which authentication type (graphical vs. textual) do users prefer for authentication?
Authentication Type Preference	Graphical: 30 users; Textual: 6 users	–	p < .001
Representative user responses: “I like the idea of using picture passwords”, “Some people would find choosing a picture password easier than remembering an only word password. It will also be more secure”. “This is an easy to remember password sequence that visually minded users will likely find very appealing”, “great not to have to remember yet another password”.

Table 5. Summary of Main Findings

From a security perspective, we report DuoPass’ superiority over the state-of-the-art approach, given that the passwords created by the experimental group required significantly more guesses to crack than those created by the control group. This can be attributed to the fact that users from the experimental group created graphical passwords on images that were related to their prior experiences within the hospital and hence made selections on regions based on their experiences, rather than on generic regions, which may be susceptible to a brute-force attack. Such a finding is in line with existing research, which revealed that depicting images related to the users’ prior sociocultural experiences increases the security of user-selected graphical secrets [28]. Furthermore, task completion efficiency analyses revealed a significant difference in the last password selection (third gesture), with users from the experimental group making this selection significantly faster than users from the control group. This can be explained by the fact that users from the control group needed more time to reason about a secret story on an image that was not familiar to them, whereas users from the experimental group created a story based on their familiarity with the image and were consequently faster in making their last selections as they composed their password.

From a memorability and user experience perspective, descriptive statistics reveal that users from the experimental group achieved a higher mean memory time than users from the control group (121.58 vs. 106.13 hours), indicating good memorability aspects of the approach; however, this difference was not statistically significant. This finding was triangulated with end users’ qualitative feedback: participants perceived the DuoPass secrets as highly memorable, users were able to memorize their secret for the whole period of the study, the majority reported low mental demand (77%) in recalling their password, and they could effectively recall their password (84%).

Finally, DuoPass scores well in usability based on participant responses to the SUS (74.77%), although there is still room for improvement given that the best SUS scores should be 80% and above. From feedback received during the studies, some patients had difficulties in entering their graphical password through the developed gesture input mechanism; hence, our efforts are focused on improving the interaction design of gestures to address cross-compatibility issues and heterogeneity of devices. When users were asked about likeability aspects of the approach, the large majority of users (90%) reported that they liked the flexible and personalized approach extremely or very much, and the majority would like to use DuoPass as an alternative password system (75%).

6 Human Guessing Attack Study

When location-aware images are used in graphical passwords, the password selections are based on the end users’ existing experiences within the depicted sceneries. Hence, it is probable that individuals who share common experiences with the end users might be able to guess their selections. To shed light on this aspect, we conducted a human attack study focusing on guessing vulnerabilities among people sharing common experiences. Each session of the study involved a pair of participants who were closely related (e.g., family members, friends, patients, medical staff, nurses) and who shared common experiences. In each session, both participants were first requested to create a graphical password independently, then each participant was requested to guess the password selections of the other participant from the same pair.

6.1 Research Question

RQ. Does the suggested user-adaptable and personalized authentication paradigm, which utilizes location-aware images for graphical passwords, entail guessing vulnerabilities in terms of allowing attackers who share common experiences with the end users to more easily identify regions of their selected secrets?

6.2 Image Set: Location-Aware Image Semantics

We extended the location-aware image set from Section 5.2 to include images that were related to individuals’ (e.g., patients, medical staff, nurses) location-based experiences and memories within the hospital. Figure 7 illustrates a subset of the images used in the human guessing attack study. We assigned each participant a specific location-aware image based on their role and their relationship with the other participant from the same pair. Furthermore, to control the image complexity and number of PoIs, we carefully selected images that had similar content complexity and number of PoIs following the approach described in Section 5.2. Table 6 illustrates the means of image complexity and mean number of PoIs of the initial and the extended location-aware image sets.

Fig. 7.

	Initial Mean	Initial St. Dev.	Extended Mean	Extended St. Dev.
Complexity in Bits	7.47	.14	7.31	.38
Number of PoI Regions	7.11	.73	7.29	.72

Table 6. Means of Image Complexity and Number of PoIs for Each Location-Aware Image Set

6.3 Data Metrics

With regard to calculating the graphical password strength, we adjusted the PoI-assisted brute-force attack model from Section 5.4 to start from segments covering the segments provided by each attacker, then checking the neighboring segments, then checking the PoI segments and their neighboring segments, and finally checking the rest of the segments.
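The guessing order described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the grid dimensions, the 8-cell neighborhood, and the names `neighbors` and `guess_order` are assumptions introduced here, since the article does not specify them.

```python
from typing import Iterable, List, Sequence, Set, Tuple

Segment = Tuple[int, int]  # (column, row) index of a grid cell


def neighbors(seg: Segment, cols: int, rows: int) -> List[Segment]:
    """Cells adjacent to `seg` (8-neighborhood is an assumption)."""
    x, y = seg
    out: List[Segment] = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) == (0, 0):
                continue
            nx, ny = x + dx, y + dy
            if 0 <= nx < cols and 0 <= ny < rows:
                out.append((nx, ny))
    return out


def guess_order(cols: int, rows: int,
                attacker_segs: Sequence[Segment],
                poi_segs: Sequence[Segment]) -> List[Segment]:
    """Order in which grid segments are tried by the adjusted attack."""
    ordered: List[Segment] = []
    seen: Set[Segment] = set()

    def push(segs: Iterable[Segment]) -> None:
        # Append each segment once, preserving first-seen priority.
        for s in segs:
            if s not in seen:
                seen.add(s)
                ordered.append(s)

    push(attacker_segs)                                               # 1. attacker's segments
    push(n for s in attacker_segs for n in neighbors(s, cols, rows))  # 2. their neighbors
    push(poi_segs)                                                    # 3. PoI segments
    push(n for s in poi_segs for n in neighbors(s, cols, rows))       # 4. PoI neighbors
    push((x, y) for y in range(rows) for x in range(cols))            # 5. everything else
    return ordered
```

For example, `guess_order(5, 5, [(0, 0)], [(4, 4)])` tries the attacker's segment first, then its neighbors, then the PoI segment and its neighbors, and only then the remaining cells.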

6.4 Procedure and Participants

Participants were split into pairs, and they were first requested to create a graphical password independently, then to guess the password of the other participant from the same pair. The study was run remotely with the researcher supporting the participants. The study was split into two phases as follows.

Phase A: Password creation. During the first phase, each pair of participants connected to a meeting via an online means of communication (i.e., Microsoft Teams) at a pre-scheduled time, and participants were asked independently to create a graphical password to access an online service. To avoid bias effects during the attack phase, each participant created a password on a different location-aware image that depicted places within the hospital in which the pair share common experiences.

Phase B: Human guessing attack. In this phase, we switched the images within each pair, and each participant was requested to guess the other participant's secrets as follows: (1) by first indicating three areas (x, y segments on the grid) on the image around which they believed the other participant had made their selections, then (2) by actually drawing three gestures for a total of three attempts to guess the actual password (i.e., considering the ordering of gestures and type of gestures). At the end of the attack phase, participants submitted their feedback about the rationale behind their selections as attackers. This allowed us to elicit whether the attacker's rationale is related to the shared memories and experiences she possesses with the other participant from the same pair. Finally, both participants completed a questionnaire on demographics.

A total of 92 individuals, ranging in age from 28 to 62 years, were recruited from two healthcare organizations (44 from Healthcare Organization 1 and 48 from Healthcare Organization 2). Since the purpose of this study was to understand how individuals decide on their selections when performing an attack on a password created by another individual with whom they share common experiences within places depicted on location-aware images at the hospital, we intentionally recruited pairs of participants who are close to each other (e.g., family members (n = 18), friends (n = 18), patients (n = 20), medical staff (n = 18), and nurses (n = 18)). To assure that participants were motivated to use secure passwords, we applied the user authentication task in the frame of an online service. Users were asked to perform specific tasks (e.g., access a specific service and view information) that first required them to login. This way, we did not explicitly ask the participants to login to keep the authentication task as a secondary task of interaction and hence increase ecological validity. All individuals participated voluntarily and provided their consent that their interactions would be recorded anonymously in the context of an experimental research study. In addition, the participants could opt out of the study at any time they liked.

6.5 Analysis of Results

6.5.1 Euclidean Distance of Attackers’ Selections from the End Users’ Secret Selections.

To investigate the RQ, we conducted three analyses: (1) we calculated the Euclidean distance of the attackers’ guessing selections from the legitimate end users’ password secret selections; (2) based on the first analysis (Euclidean distance), we adjusted the brute-force attack performed in Section 5.5.1 to investigate whether users who share common experiences were able to run a more effective attack by starting from the regions in which they suspected that the users had made their password selections; and (3) we performed a qualitative analysis based on the participants’ feedback at the end of the human guessing attack study to better understand the approach followed by attackers on graphical passwords created on location-aware images. In the analyses that follow, data are mean ± standard error. There were no significant outliers in the data.

A. Disregarding the type of the gesture and the exact order. Figure 8 depicts the Euclidean distance of each gesture of each participant by disregarding the type and the exact order of the attackers’ gestures and the end user's gestures. For the analysis, we adopted a threshold of three segments by considering the allowed tolerance of the graphical password mechanism.⁴ Accordingly, among 276 gestures (3 gestures x 92 participants), 49 gestures (17.7%) were in close proximity with the attacker's guessed selections. Furthermore, we conducted a one-way multivariate analysis of variance (MANOVA) to determine the effect of relationship between attackers and legitimate end users on how far the attackers’ password selections were from the legitimate end users’ password selections. Three measures were assessed: Euclidean distance of the legitimate users and the attackers on the first gesture, second gesture, and third gesture of the graphical password. Participants belonged to one of the following categories: family member, friend, medical staff, patient, and nurse. Preliminary assumption checking revealed that data was normally distributed, as assessed by Shapiro-Wilk's test (p > .05); there were no univariate or multivariate outliers, as assessed by boxplot and Mahalanobis distance (p > .001), respectively; there were linear relationships, as assessed by scatterplot; no multicollinearity (r = .117, p = .004 between gesture one and gesture two; r = .021, p = .007 between gesture one and gesture three; and r = .133, p = .010 between gesture two and gesture three); and there was homogeneity of variance-covariance matrices, as assessed by Box's M test (p = .007). The analysis revealed that the differences between groups on the combined dependent variables were statistically significant, F(12, 225.180) = 4.356, p < .0005; Wilks’ Λ = .576; partial η² = .168. 
Follow-up univariate ANOVAs revealed that all three gestures were statistically significantly different between the participants from different relationship groups (first gesture: F(4, 87) = 5.351, p = .001; partial η² = .197; second gesture: F(4, 87) = 3.305, p = .014; partial η² = .132; third gesture: F(4, 87) = 4.550, p = .002; partial η² = .173). Tukey-Kramer post hoc tests showed that for the first gesture, participants from the family group had statistically significantly lower mean scores than participants from either the patient group (p = .002) or the nurse group (p = .050), whereas participants from the friend group had statistically significantly lower mean scores than participants from the patient group (p = .006). Regarding the second gesture, participants from the family group had statistically significantly lower mean scores than participants from the nurse group (p = .020). Regarding the third gesture, participants from the family group had statistically significantly lower mean scores than participants from the medical staff group (p = .036), the patient group (p = .025), and the nurse group (p = .001).

Fig. 8.
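To make the proximity criterion concrete, the following sketch computes Euclidean distances between gesture positions on the segment grid and flags those within the three-segment threshold. It is an illustration under assumptions: each gesture is reduced to a single (x, y) segment, the threshold is treated as inclusive, and the order-free comparison is taken to be the distance to the nearest attacker selection. The article does not state these details, and the function names are hypothetical. An order-aware variant is included for contrast.

```python
import math
from typing import List, Sequence, Tuple

Segment = Tuple[int, int]  # (x, y) segment on the image grid
THRESHOLD = 3.0            # three segments, as adopted in the analysis


def distance(a: Segment, b: Segment) -> float:
    """Euclidean distance between two segments, in segment units."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def close_unordered(user: Sequence[Segment], attacker: Sequence[Segment],
                    threshold: float = THRESHOLD) -> List[bool]:
    """Order ignored: each user gesture vs. the nearest attacker selection."""
    return [min(distance(u, g) for g in attacker) <= threshold for u in user]


def close_ordered(user: Sequence[Segment], attacker: Sequence[Segment],
                  threshold: float = THRESHOLD) -> List[bool]:
    """Order respected: i-th user gesture vs. i-th attacker selection."""
    return [distance(u, g) <= threshold for u, g in zip(user, attacker)]


user = [(0, 0), (10, 10), (20, 5)]
attacker = [(10, 12), (1, 1), (30, 30)]
print(close_unordered(user, attacker))  # [True, True, False]
print(close_ordered(user, attacker))    # [False, False, False]
```

In this made-up example the attacker identifies two of the three regions but in the wrong order, so the order-aware comparison reports no close gestures.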

B. Disregarding the type of the gesture but considering the exact order. Figure 9 depicts the Euclidean distance by disregarding the type of selections but considering the exact order of the attackers’ gestures and the end users’ gestures. Applying the same threshold of three segments, the analysis revealed that among 276 gestures (3 gestures x 92 participants) made by the participants, 19 gestures (6.8%) were in close proximity with the attacker's guessed selections. Furthermore, we conducted a one-way MANOVA to determine the effect of relationship between attackers and legitimate end users on how far the attackers’ password selections were from the legitimate end users’ password selections. Three measures were assessed: Euclidean distance of the legitimate users and the attackers on the first gesture, second gesture, and third gesture of the graphical password. Participants belonged to one of the following categories: family member, friend, medical staff, patient, nurse. Preliminary assumption checking revealed that data was normally distributed, as assessed by Shapiro-Wilk's test (p > .05); there were no univariate or multivariate outliers, as assessed by boxplot and Mahalanobis distance (p > .001), respectively; there were linear relationships, as assessed by scatterplot; no multicollinearity (r = –.089, p = .004 between gesture one and gesture two; r = .253, p = .008 between gesture one and gesture three; and r = .055, p = .009 between gesture two and gesture three); and there was homogeneity of variance-covariance matrices, as assessed by Box's M test (p < .0005). The analysis revealed that the differences between groups on the combined dependent variables were statistically significant, F(12, 225.180) = 11.500, p < .0005; Wilks’ Λ = .282; partial η² = .344. 
Follow-up univariate ANOVAs revealed that all three gestures were statistically significantly different between the participants from different relationship groups (first gesture: F(4, 87) = 12.567, p < .0005; partial η² = .366; second gesture: F(4, 87) = 3.930, p = .006; partial η² = .153; third gesture: F(4, 87) = 13.832, p < .0005; partial η² = .389). Tukey-Kramer post hoc tests showed that for the first gesture, participants from the family group had statistically significantly lower mean scores than participants from the medical staff group (p < .0005), the patient group (p = .001), and the nurse group (p < .0005), whereas participants from the friend group had statistically significantly lower mean scores than participants from the medical staff group (p = .001), the patient group (p = .004), and the nurse group (p < .0005). Regarding the second gesture, participants from the family group had statistically significantly lower mean scores than participants from the nurse group (p = .002). Regarding the third gesture, participants from the family group had statistically significantly lower mean scores than participants from the medical staff group (p = .011), the patient group (p < .0005), and the nurse group (p < .005), whereas participants from the friend group had statistically significantly lower mean scores than participants from the patient group (p < .0005) and the nurse group (p < .0005).

Fig. 9.
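As a reading aid, the effect sizes reported for the Euclidean distance analyses can be recomputed from the test statistics. The sketch below assumes the conventional definitions: partial η² = F·df_effect / (F·df_effect + df_error) for the univariate ANOVAs, and partial η² = 1 − Λ^(1/s) with s = min(number of dependent variables, df_effect) for Wilks’ Λ. The article does not state which formulas were used, so this is a consistency check rather than a description of the authors' procedure, and the function names are hypothetical.

```python
def partial_eta_sq_from_f(f: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from a univariate F statistic."""
    return f * df_effect / (f * df_effect + df_error)


def partial_eta_sq_from_wilks(wilks_lambda: float, n_dvs: int, df_effect: int) -> float:
    """Multivariate partial eta squared recovered from Wilks' lambda."""
    s = min(n_dvs, df_effect)
    return 1.0 - wilks_lambda ** (1.0 / s)


# Analysis A (order disregarded): first-gesture ANOVA and the MANOVA.
print(round(partial_eta_sq_from_f(5.351, 4, 87), 3))      # 0.197
print(round(partial_eta_sq_from_wilks(0.576, 3, 4), 3))   # 0.168

# Analysis B (order considered): first-gesture ANOVA and the MANOVA.
print(round(partial_eta_sq_from_f(12.567, 4, 87), 3))     # 0.366
print(round(partial_eta_sq_from_wilks(0.282, 3, 4), 3))   # 0.344
```

Under these assumptions, the recomputed values agree with the reported partial η² values to three decimals.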

C. Considering the type of the gesture and exact order. We compared the three attempts of each attacker with the end user's stored password from the same pair of participants. From a total of 276 attacking guesses (3 attempts of each attacker x 92 participants), there was no successful attempt, yielding an online success guessing rate of 0%.

6.5.2 Security Strength of the Created Graphical Passwords Based on Experience-Spot-Driven Brute-Force Attack.

To investigate whether the suggested location-aware image approach holds against attacks when considering the experience-spots indicated by each participant who acted as an attacker, we conducted an offline attack comparing a PoI-assisted brute-force attack (the same attack that considers PoIs as described in Section 5.5.1) and a personalized PoI-assisted brute-force attack that was further enhanced to consider the experience-spots regions as indicated by the human attacker.

A. Disregarding order and type of gestures across all participants. Given that the implementation of PGA-like mechanisms takes into consideration the order and the type of gestures, which could impact the total guesses required to crack a graphical password (e.g., circles are more complex than simple taps but less complex than lines⁴), it is interesting to first understand how each attack type (PoI-assisted brute-force attack vs. personalized PoI-assisted brute-force attack) performs when we disregard the order and the type of the gestures and rather focus on the positions of the password selections. To do so, we simplify the gesture type as follows: for circles, we disregard the radius and the directionality and keep only the center of the circle as an x, y segment, whereas for lines, we consider only the x, y segment of the start of the line.
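The simplification rule just described can be sketched as a small mapping from gestures to grid positions. The gesture classes and field names below are hypothetical, introduced only for illustration; the article does not describe how gestures are stored.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Segment = Tuple[int, int]  # (x, y) segment on the image grid


@dataclass(frozen=True)
class Tap:
    at: Segment


@dataclass(frozen=True)
class Circle:
    center: Segment
    radius: int        # ignored by the simplification
    clockwise: bool    # directionality, also ignored


@dataclass(frozen=True)
class Line:
    start: Segment
    end: Segment       # ignored by the simplification


Gesture = Union[Tap, Circle, Line]


def to_position(gesture: Gesture) -> Segment:
    """Reduce a gesture to the single segment used in the position-only analysis."""
    if isinstance(gesture, Tap):
        return gesture.at
    if isinstance(gesture, Circle):
        return gesture.center   # keep only the center of the circle
    if isinstance(gesture, Line):
        return gesture.start    # keep only the start of the line
    raise TypeError(f"unsupported gesture: {gesture!r}")


password = [Tap((2, 3)), Circle((5, 5), radius=2, clockwise=True), Line((1, 8), (4, 8))]
print([to_position(g) for g in password])  # [(2, 3), (5, 5), (1, 8)]
```

After this reduction, two passwords that differ only in gesture type, radius, direction, or line end point map to the same positions, which is why the position-only guess counts are far smaller than those that also consider order and type.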

A one-way MANOVA was run to determine the effect of relationship between attackers and legitimate end users on the number of guesses required to crack the passwords when using a PoI-assisted brute-force attack vs. a personalized PoI-assisted brute-force attack by considering also the experience-spots provided by the attackers. Two measures were assessed: number of guesses required to crack the passwords when using a PoI-assisted brute-force attack and number of guesses when using a personalized PoI-assisted brute-force attack. Participants belonged to one of the following categories: family member, friend, medical staff, patient, nurse. Preliminary assumption checking revealed that data was normally distributed, as assessed by Shapiro-Wilk's test (p > .05); there were no univariate or multivariate outliers, as assessed by boxplot and Mahalanobis distance (p > .001), respectively; there were linear relationships, as assessed by scatterplot; no multicollinearity (r = .394, p < .0005); and there was homogeneity of variance-covariance matrices, as assessed by Box's M test (p = .002). The analysis revealed that the differences between groups on the combined dependent variables were statistically significant, F(8, 172) = 4.546, p < .0005; Wilks’ Λ = .681; partial η² = .175. 
Follow-up univariate ANOVAs revealed that in both types of attacks, the number of guesses required to crack the passwords was statistically significantly different between the participants from different relationship groups (PoI-assisted brute-force attack: F(4, 87) = 4.650, p = .002; partial η² = .176; personalized PoI-assisted brute-force attack: F(4, 87) = 7.473, p < .0005; partial η² = .256). Tukey-Kramer post hoc tests showed that for the PoI-assisted brute-force attack, participants from the family group had statistically significantly lower mean scores than participants from the nurse group (p = .001), whereas participants from the friend group had statistically significantly lower mean scores than participants from the nurse group (p = .024). Regarding the personalized PoI-assisted brute-force attack, participants from the family group had statistically significantly lower mean scores than participants from the nurse group (p < .0005), participants from the friend group had statistically significantly lower mean scores than participants from the nurse group (p = .002), participants from the medical staff group had statistically significantly lower mean scores than participants from the nurse group (p = .003), and participants from the patient group had statistically significantly lower mean scores than participants from the nurse group (p = .008). In the PoI-assisted brute-force attack, the mean number of guesses required to crack the passwords per group was as follows: (1) family member: 43,576.94 ± 7,488.35; (2) friend: 71,732.66 ± 17,131.09; (3) medical staff: 99,921.83 ± 18,620.08; (4) patient: 112,232.45 ± 22,590.04; and (5) nurse: 159,116.55 ± 27,663.24. 
In the personalized PoI-assisted brute-force attack, the mean number of guesses required to crack the passwords per group was as follows: (1) family member: 28,694.61 ± 4,596.83; (2) friend: 52,546.77 ± 19,951.18; (3) medical staff: 55,511.50 ± 11,560.53; (4) patient: 62,751.50 ± 11,903.80; and (5) nurse: 124,987.88 ± 12,485.52. Figure 10 depicts the means of password strength among attack types by disregarding the order and the type of gestures across all participants.

Fig. 10.

B. Disregarding order and type of gestures across participants with at least one gesture containing an experience-spot. A one-way MANOVA was run to determine the effect of relationship between attackers and legitimate end users on the number of guesses required to crack the passwords when using a PoI-assisted brute-force attack vs. a personalized PoI-assisted brute-force attack by considering participants with at least one gesture containing an experience-spot. Two measures were assessed: number of guesses required to crack the passwords when using a PoI-assisted brute-force attack and number of guesses when using a personalized PoI-assisted brute-force attack. Participants belonged to one of the following categories: family member, friend, medical staff, patient, nurse. Preliminary assumption checking revealed that data was normally distributed, as assessed by Shapiro-Wilk's test (p > .05); there were no univariate or multivariate outliers, as assessed by boxplot and Mahalanobis distance (p > .001), respectively; there were linear relationships, as assessed by scatterplot; no multicollinearity (r = .764, p < .0005); and there was homogeneity of variance-covariance matrices, as assessed by Box's M test (p = .001). The analysis revealed that the differences between groups on the combined dependent variables were statistically significant, F(8, 96) = 43.855, p < .0005; Wilks’ Λ = .046; partial η² = .785. 
Follow-up univariate ANOVAs revealed that in both types of attacks, the number of guesses required to crack the passwords was statistically significantly different between the participants from different relationship groups (PoI-assisted brute-force attack: F(4, 49) = 66.535, p < .0005; partial η² = .845; personalized PoI-assisted brute-force attack: F(4, 49) = 81.915, p < .0005; partial η² = .870). Tukey-Kramer post hoc tests showed that for the PoI-assisted brute-force attack, participants from the family group had statistically significantly lower mean scores than participants from the friend group (p = .032), the medical staff group (p < .0005), the patient group (p < .0005), and the nurse group (p < .0005). Participants from the friend group had statistically significantly lower mean scores than participants from the medical staff group (p = .001), the patient group (p = .020), and the nurse group (p < .0005). Participants from the medical staff group had statistically significantly lower mean scores than participants from the nurse group (p < .0005), whereas participants from the patient group had statistically significantly lower mean scores than participants from the nurse group (p < .0005). Regarding the personalized PoI-assisted brute-force attack, participants from the family group had statistically significantly lower mean scores than participants from the medical staff group (p = .001), the patient group (p < .0005), and the nurse group (p < .0005). Participants from the friend group had statistically significantly lower mean scores than participants from the medical staff group (p = .002), the patient group (p < .0005), and the nurse group (p < .0005). 
Participants from the medical staff group had statistically significantly lower mean scores than participants from the patient group (p < .0005) and the nurse group (p < .0005), whereas participants from the patient group had statistically significantly lower mean scores than participants from the nurse group (p = .050). In the PoI-assisted brute-force attack, the mean number of guesses required to crack the passwords per group was as follows: (1) family member: 25,308.50 ± 1,454.21; (2) friend: 32,241.70 ± 1,330.90; (3) medical staff: 41,972.60 ± 2,062.57; (4) patient: 39,267.41 ± 1,436.24; and (5) nurse: 58,851.83 ± 1,498.21. In the personalized PoI-assisted brute-force attack, the mean number of guesses required to crack the passwords per group was as follows: (1) family member: 7,536.10 ± 351.76; (2) friend: 7,815.60 ± 308.75; (3) medical staff: 12,074.80 ± 394.72; (4) patient: 19,233.83 ± 969.97; and (5) nurse: 22,047.91 ± 1,000.61. Figure 11 depicts the means of password strength among attack types by disregarding the order and the type of gestures across participants with at least one gesture containing an experience-spot.

Fig. 11.
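To help interpret the group means reported above for participants with at least one gesture containing an experience-spot, the sketch below computes the relative reduction in the mean number of guesses when moving from the PoI-assisted attack to the personalized PoI-assisted attack. The mean values are copied from the text; the relative-reduction measure and the variable names are illustrative additions, not a metric used in the article.

```python
# Mean guesses per group (participants with at least one experience-spot gesture).
POI_ASSISTED = {
    "family member": 25308.50,
    "friend": 32241.70,
    "medical staff": 41972.60,
    "patient": 39267.41,
    "nurse": 58851.83,
}
PERSONALIZED = {
    "family member": 7536.10,
    "friend": 7815.60,
    "medical staff": 12074.80,
    "patient": 19233.83,
    "nurse": 22047.91,
}


def relative_reduction(baseline: float, personalized: float) -> float:
    """Fraction of guesses saved by the personalized attack."""
    return 1.0 - personalized / baseline


reductions = {
    group: relative_reduction(POI_ASSISTED[group], PERSONALIZED[group])
    for group in POI_ASSISTED
}

for group, r in reductions.items():
    print(f"{group}: {r:.1%} fewer guesses")
```

In every group the personalized attack needs fewer than half of the guesses of the PoI-assisted attack on average, with the reduction for family members at roughly 70%.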

C. Considering order and type of gestures across all participants. A one-way MANOVA was run to determine the effect of relationship between attackers and legitimate end users on the number of guesses required to crack the passwords when using a PoI-assisted brute-force attack vs. a personalized PoI-assisted brute-force attack by taking into account the order and type of gestures across all participants. Two measures were assessed: number of guesses required to crack the passwords when using a PoI-assisted brute-force attack and number of guesses when using a personalized PoI-assisted brute-force attack. Participants belonged to one of the following categories: family member, friend, medical staff, patient, nurse. Data are expressed as mean ± standard error. Preliminary assumption checking revealed that data was normally distributed, as assessed by Shapiro-Wilk's test (p > .05); there were no univariate or multivariate outliers, as assessed by boxplot and Mahalanobis distance (p > .001), respectively; there were linear relationships, as assessed by scatterplot; no multicollinearity (r = .183, p = .008); and there was homogeneity of variance-covariance matrices, as assessed by Box's M test (p = .012). The analysis revealed that the differences between groups on the combined dependent variables were not statistically significant, F(8, 172) = .020, p = 0.99; Wilks’ Λ = .998; partial η² = .001. In the PoI-assisted brute-force attack, the mean number of guesses required to crack the passwords per group was as follows: (1) family member: 1,660,935.38 ± 348,113.52; (2) friend: 1,687,709.11 ± 447,019.45; (3) medical staff: 1,629,472.16 ± 124,750.15; (4) patient: 1,624,675.54 ± 849,169.82; and (5) nurse: 1,622,255.16 ± 66,263.27. 
In the personalized PoI-assisted brute-force attack, the mean number of guesses required to crack the passwords per group was as follows: (1) family member: 1,745,307.61 ± 253,165.96; (2) friend: 1,690,173.83 ± 126,111.69; (3) medical staff: 1,789,121.611 ± 217,395.14; (4) patient: 1,797,917.49 ± 475,354.52; and (5) nurse: 1,818,772.27 ± 164,674.17. Figure 12 depicts the means of password strength among attack types by considering the order and the type of gestures across all participants.

Fig. 12.

6.5.3 Qualitative Analysis.

To shed further light on the approach followed by attackers on graphical passwords created on location-aware images, we used the data gathered from the feedback mechanism at the end of the study, as well as observations made by the researchers during the attack phase. In many cases, attackers used knowledge about the end user under attack, related to their habits, preferences, and facts about their personality: “My colleague is a great storyteller and I believe he would try to create a story for the password, like arriving at the entrance of the hospital, then entering by the stairs, then reading information on the panel. So, I decided to make my attack having this story-telling process in mind”. – P19; “I thought she will have used the three possible gestures (instead of just one or two of the options), and marked colorful items, because of her personality”. – P28; “Most of the times he has his coffee in the front yard outside the emergency room during his shift break. I would be surprised if he hadn't selected this particular area”. – P56; “My colleague likes flowers and plants. I think that some of her selections must be on the flowers”. – P39; “She likes painting at her free time and I think she drew straight lines on objects that have bright colors”. – P11.

In other cases, it is evident that the scenery depicted on the location-aware images impacted the selections of the attackers. In particular, attackers used a more personalized approach by considering specific information related to their common shared experiences with the end user under attack within the places depicted on the location-aware images: “Usually we have lunch together at the hospital's cafeteria and we tend to be seated at the tables near the entrance. Hence, I made my selections around these tables”. – P7; “Considering direction of movement; places he usually sits or similar”. – P14; “The pictures are from daily routines, so I'm trying to guess where he is going on a daily basis or important aspects for him in the photo”. – P29; “I work at the emergency department and the colleague that I am requested to guess his password is the ambulance driver. I think it is very possible that he made some of his password selections near or on the ambulance outside of the emergency department”. – P8; “The photo shows two parking lots, but I know that she usually parks her car to the one next to the left entrance of the hospital because it is more convenient for her. I guess that some of her selections must be within this specific parking lot area”. – P16; “Being a hospital receptionist and a friendly person that interacts daily with many patients, I think that she must have selected the people standing in front of the reception desk”. – P33.

In very few cases, attackers did not employ any sophisticated attack but rather focused on the obvious PoIs of the images: “Tried to guess likely features in the images, but type of gesture I used was just random”. – P23; “I clicked on the most dominant views. I chose them because they caught my eye”. – P34; “I believe he must have selected the chairs because these are the most visible points in the image”. – P42.

The preceding observations were consolidated into a coding schema describing the approach followed by the attackers, as follows:

— Habits/preferences/characteristics of end users (e.g., storytelling, coffee, flowers, painting)

— Common shared experiences (i.e., experiences within the depicted place/scenery)

— Random-guessing approach relying on areas of the image that attract people's attention (i.e., PoIs)

Table 7 summarizes the responses about the approach employed by the attackers based on the aforementioned coding schema.

Attacking Approach Followed	Frequency
Habits/preferences/characteristics of end users	30 out of 92
Common shared experiences	48 out of 92
Random-guessing approach	14 out of 92

Table 7. Summary of the Approach Followed by the Attackers Based on the Coding Schema Extracted from Data Collected During the Attack Phase

6.6 Summary of Main Findings

The human guessing attack study revealed that the suggested user-adaptable and personalized authentication paradigm, which utilizes location-aware images for graphical passwords, increases guessing vulnerabilities in case someone knows the user, since analyses indicate that individuals who share common experiences may spot certain regions that the end user used to create the graphical password gestures. In particular, the main findings of the analyses are as follows:

— Human guessing vulnerabilities exist when we disregard the type of the gesture and the exact order, since in some cases participants from specific groups (e.g., family member, friend) scored lower Euclidean distances than participants from other groups (e.g., patient, nurse, and medical staff).

— Human guessing vulnerabilities also exist when we disregard the type of the gesture but consider the exact order, since in some cases participants from specific groups (e.g., family member, friend) scored lower Euclidean distances than participants from other groups (e.g., patient, nurse, and medical staff).

— There were no human guessing vulnerabilities when we consider the type of the gesture and the exact order, since there was no successful attempt, yielding an online success guessing rate of 0%.

— With regard to the security strength when we disregard the order and type of gestures across all participants, we observed differences in the number of guesses required to crack the passwords using the PoI-assisted brute-force attack, since in some cases participants from specific groups (e.g., family member, friend) scored a lower number of guesses than participants from other groups (e.g., nurse). Similarly, in the personalized PoI-assisted brute-force attack, participants from specific groups (e.g., family member, friend, medical staff, patient) scored a lower number of guesses than participants from other groups (e.g., nurse).

— With regard to the security strength when we disregard the order and type of gestures across participants with at least one gesture containing an experience-spot, we also observed differences in the number of guesses required to crack the passwords using the PoI-assisted brute-force attack, since in some cases participants from specific groups (e.g., family member, friend, medical staff, patient) scored a lower number of guesses than participants from other groups (e.g., friend, medical staff, patient, nurse). Similarly, in the personalized PoI-assisted brute-force attack, participants from specific groups (e.g., family member, friend, medical staff, patient) scored a lower number of guesses than participants from other groups (e.g., medical staff, patient, nurse).

— With regard to the security strength when we consider the order and type of gestures across all participants, there were no observed differences in the number of guesses required to crack the passwords using either the PoI-assisted brute-force attack or the personalized PoI-assisted brute-force attack.

Based on these findings, we can conclude that some relationship groups are able to run a more effective attack by first guessing the regions in which they suspect the users made their password selections. Nonetheless, based on the brute-force attacks on the DuoPass graphical mechanism as a whole (i.e., when also considering the order and type of gesture⁴), this did not affect the security of the graphical passwords created on location-aware images.
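The difference between the order-agnostic and order-aware analyses summarized above can be illustrated with a minimal Python sketch. It assumes that each password consists of three (x, y) selections and that closeness is measured as the mean pointwise Euclidean distance; the study's exact metric is defined in its data metrics section and may differ. The function names and coordinates are hypothetical.

```python
from itertools import permutations
from math import dist


def ordered_distance(guess, secret):
    """Mean Euclidean distance, comparing the i-th guess with the i-th secret point."""
    return sum(dist(g, s) for g, s in zip(guess, secret)) / len(secret)


def unordered_distance(guess, secret):
    """Mean Euclidean distance under the best one-to-one matching (order disregarded)."""
    return min(ordered_distance(p, secret) for p in permutations(guess))


# Hypothetical pixel coordinates, for illustration only.
secret = [(100, 100), (300, 200), (500, 400)]
guess = [(500, 400), (100, 100), (300, 210)]

# The attacker found the right regions but in the wrong order:
# the ordered distance is large, the unordered distance is small.
print(ordered_distance(guess, secret))
print(unordered_distance(guess, secret))
```

An attacker who knows where a user tends to click but not in which sequence does well only on the unordered measure, which is consistent with the finding that vulnerabilities disappear once order and gesture type are enforced.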

7 Discussion and Implications

In this section, we elaborate on the applicability of DuoPass in the broader healthcare domain, provide guidelines that can serve as a basis for implementing an adaptation and personalization system based on the suggested authentication paradigm, and discuss the limitations of this research work.

We envision that DuoPass may be deployed as a stand-alone Web-based user authentication system within healthcare organizations, which would extend the existing text-based password solutions that patients currently use to access their personal health records through the Web portal [38]. Given that DuoPass would rely on location-based experiences, habits, and memories created during patients’ visits at the hospital, we anticipate that it would be more suitable for patients who visit the same hospital on a frequent or regular basis. As a first step, an organization would need to identify mainstream spatial areas of the hospital, that is, areas that are visited by the majority of individuals (medical staff, patients, relatives, visitors, etc.). Next, the spatial relevance of each mainstream area should be identified to create a neighborhood/relationship map among the mainstream spatial areas. For example, the mainstream spatial area “reception hall” is related to the hospital's “cafeteria”; hence, a relationship rule would be created connecting the two areas. Finally, the system administrator would need to prepare and upload relevant images depicting sceneries for each of the identified mainstream spatial areas of the hospital. These images would then be processed through an adaptation and recommendation engine that would recommend best-fit images to end users, aiming to improve the memorability and security of passwords. To do so, the recommendation engine would also receive as input the end user's visitation record to extract the relevant experiences and visits the end user had in specific mainstream spatial areas of the hospital. In this respect, DuoPass would leverage the existing authentication infrastructure of the healthcare organization to retrieve the user's visitation record.

The following scenarios are anticipated. First, the enrollment scenario: during user enrollment, the system would retrieve (based on the username and a unique enrollment code) the user's visitation record within the hospital. Based on the semantic similarity of the user visits and the mainstream spatial areas of the hospital, the system would recommend three relevant images from which users choose one to create their graphical password. Note that the three images would have the same level of complexity and PoIs to avoid scenarios in which the user would create predictable passwords. Second, the login scenario: during login, the system would present two options for authentication (graphical vs. textual), and the user would enter their secret credentials accordingly to log in. Third, the reset scenario: password reset could be initiated either by the user (e.g., in case they forget their password) or by the system based on the organization's applied policy. In this case, the same procedure would follow as in the enrollment scenario, taking into account, however, the user's previous image selections to avoid users selecting the same password.
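The enrollment and reset scenarios differ only in whether earlier image selections are taken into account. A minimal Python sketch follows, assuming that similarity scores per image are already available and that "considering previous selections" means excluding previously used images; both are assumptions, and the function name, file names, and scores are hypothetical.

```python
def recommend_for_password(scores, previously_used=frozenset(), n=3):
    """Return the n highest-scoring image ids, skipping images used for earlier passwords."""
    eligible = {img: s for img, s in scores.items() if img not in previously_used}
    # Sort by descending score; break ties by name so the result is deterministic.
    return sorted(eligible, key=lambda img: (-eligible[img], img))[:n]


# Hypothetical similarity scores between a user's visits and candidate images.
scores = {
    "reception.jpg": 0.91,
    "cafeteria.jpg": 0.84,
    "entrance.jpg": 0.77,
    "parking.jpg": 0.60,
    "garden.jpg": 0.42,
}

# Enrollment: no history, so the three best-fit images are offered.
enrollment = recommend_for_password(scores)

# Reset: the image behind the old password is no longer offered.
reset = recommend_for_password(scores, previously_used={"cafeteria.jpg"})

print(enrollment)
print(reset)
```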

DuoPass would consist of the following modules: the System Administration module, the User Modeling module, the Recommendation module, and the Flexible User Authentication module. The System Administration module would allow administrators to upload and maintain images that depict sceneries of various locations of the hospital (e.g., reception hall, main rooms of the hospital). The system's image database would also be filled by end users, who would be able to upload their own images taken within the hospital, subject to approval by the system administrator according to organizational policies and requirements. The User Modeling module would analyze the existing health record of the patients based on their activity and visits at the hospital (e.g., the patient may visit doctors of the ophthalmology department or the orthopedic department). Based on the analysis, the module would infer the patient's frequent visits and important locations within the hospital, which would then be provided as input to the Recommendation module to recommend images depicting sceneries from the patient's most common visits. The Recommendation module would be further enhanced with image analysis technologies that automatically annotate the images with semantic descriptions of the depicted content, which may be used during password creation for recommendation and user guidance toward more memorable and secure passwords. Finally, the Flexible User Authentication module would be responsible for authenticating users based on an easy-to-use and flexible authentication paradigm built on the recommended and/or user-adaptable graphical passwords.

Algorithm 1 in the appendix presents our content-based recommendation algorithm, which will recommend relevant images during password creation/reset based on the rules of mainstream spatial areas and the user's visitation records and experiences within the hospital. The algorithm initially requires configuration by the organization in terms of identifying the mainstream spatial areas of the hospital and then creating the relationship map between them. Next, the set of candidate images is generated as follows: (1) the system administrator uploads images that depict the mainstream spatial areas of the hospital, and approves relevant images within the hospital that were provided by the end users; (2) a set of tags and sentences that describe the semantic content of the image is generated by explicit (i.e., annotated by the system administrator) and implicit (i.e., annotated by computer vision techniques for object and label detection⁵⁻⁷) methods; and (3) the set of tags and sentences is pre-processed and cleaned.

The image recommendation part involves the following steps: (1) for each user, the frequent visits and locations within the hospital are inferred from their existing health records based on their activity and visits; (2) a set of tags and sentences that describe the semantic content of the user's visits and locations is generated; (3) the set of tags and sentences is then filtered to contain relevant information based on the relationship map between the mainstream areas; (4) the set of tags and sentences is pre-processed and cleaned; (5) for each image in the set of candidate images, a semantic similarity score is calculated using Natural Language Processing (NLP) techniques (e.g., BERT [34]) between the set of tags and sentences that describe the user's frequent visits and locations and the set of tags and sentences that describe the semantic content of the image; and (6) finally, the semantic similarity scores are sorted and the top N images are recommended to each end user.
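The recommendation steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' Algorithm 1: a bag-of-words cosine similarity stands in for the BERT-based semantic similarity, the relationship-map filter is interpreted as keeping only visits to mainstream areas and adding the names of related areas, and all area names, image names, and descriptions are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical relationship map between mainstream spatial areas.
RELATIONSHIP_MAP = {
    "reception hall": {"cafeteria"},
    "cafeteria": {"reception hall"},
    "emergency department": {"parking lot"},
    "parking lot": {"emergency department"},
}

# Hypothetical candidate images with their descriptive tags/sentences.
IMAGE_TEXT = {
    "reception.jpg": "Reception desk, waiting chairs and information panel",
    "cafeteria.jpg": "Cafeteria tables near the entrance",
    "ambulance.jpg": "Ambulance parked outside the emergency department",
    "garden.jpg": "Flowers and plants in the front yard",
}

STOPWORDS = {"the", "and"}  # toy stop-word list, an assumption


def preprocess(text):
    """Lower-case, tokenize, and drop stop words and very short tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if len(t) > 2 and t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity of two term-count vectors (stand-in for BERT similarity)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_profile(visits):
    """Keep visits to mainstream areas and add related areas from the relationship map."""
    kept = {area: text for area, text in visits.items() if area in RELATIONSHIP_MAP}
    related = {r for area in kept for r in RELATIONSHIP_MAP[area]}
    return preprocess(" ".join([*kept.values(), *sorted(kept), *sorted(related)]))


def recommend(visits, top_n=3):
    """Score every candidate image against the user's profile and return the top N."""
    profile = build_profile(visits)
    scored = [(cosine(profile, preprocess(text)), name) for name, text in IMAGE_TEXT.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


# Hypothetical visitation record: area -> description of the user's visits.
visits = {
    "reception hall": "Checked in at the reception desk and read the information panel",
    "ophthalmology ward": "Eye examination",
}
top_two = recommend(visits, top_n=2)
print(top_two)
```

In this toy example, the cafeteria image is ranked second only because the relationship map links it to the reception hall, which shows how the map can surface areas that the visitation record does not mention directly.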

7.1 Limitations

Despite our efforts to preserve the validity of the study, some design aspects of the experiments introduce limitations. We used specific personalized location-aware images to control the factors of the study (location-aware vs. non-location-aware images). Although users’ choices may be affected by the content and complexity of the image [37, 105], we provided images of the most widely used image categories (i.e., depicting sceneries and people [4, 37]) and of similar complexity [28, 59]. Although works exist on location-based authentication [2, 3, 85], future extensions of our research will consider a greater variety of location-aware image categories to triangulate the findings with diverse user communities and location-based experiences at different levels of abstraction (i.e., individual, group, organizational, national, global) [28] and thus increase the validity of the study. Furthermore, the proposed personalization approach in DuoPass was compared against only one generic baseline approach. This was intentional, to obtain comparable results, which probably would not have been the case had we compared the suggested non-intrusive personalized approach against other intrusive approaches (e.g., “presentation effect” [100], hiding salient areas [18]). In addition, to control the similarity of image factors in terms of complexity and PoIs, we intentionally did not compare the suggested approach against user-uploaded images, which could have introduced images of varying complexity and PoIs.

Moreover, considering that DuoPass relies on location-based experiences, habits, and memories created during patients’ visits at the hospital, it is probable that patients could share similar experiences with other patients or with other people who are close to them and know their habits (e.g., enter the building through the same entrance, walk in the same corridor, visit the same hospital's cafeteria and order the same drink, visit the hospital with their accompanying caregiver). Although such scenarios could entail password guessing vulnerabilities by allowing people who share the same experiences or are close to the users to more easily identify regions of their selected secrets [26, 28], the human guessing attack study we conducted revealed that the security of such a personalized graphical password mechanism is not compromised when additional measures (i.e., type and order of gestures) are considered. Furthermore, similar to most graphical password systems, DuoPass is also susceptible to shoulder-surfing attacks [99], as it was not designed to account for such threat scenarios. If the username, the image, and the gestures are observed through shoulder-surfing, then an attacker has all of the information needed to break into the account, as is the case with most other graphical password systems [22]. Another challenge of DuoPass relates to generating and maintaining a diverse pool of location-aware images that forms a dictionary with enough images for people to reflect upon based on their experiences.

In addition, we stress that the DuoPass graphical authentication mechanism primarily relies on visual elements, requiring end users to perceive, process, and recall visual information, and accordingly select certain regions on an image by using human motor functions, that is, by pointing at and selecting secret regions of the image through a computer mouse or finger input on a touch screen. Consequently, such graphical user authentication systems create accessibility issues for user populations that might have visual and/or motor difficulties. To address visual accessibility issues, Braille-code-based images and haptics could be used in the graphical user authentication process. Such an approach would require utilizing and/or implementing certain hardware and software technology for storing the Braille code in the DuoPass system, and would require end users to read and select secret regions of the Braille code image through haptic technology. However, many people with vision difficulties do not actually know or use Braille, and Braille also differs largely across countries (e.g., British Braille, American Braille). Hence, it is more likely that people with vision difficulties would use speech recognition for passwords or biometric passwords.

Another limitation relates to obtaining useful patient visitation records to form relevant image recommendations. We envision that the User Modeling module of DuoPass could be extended with existing third-party services, such as indoor positioning systems that track locations and activities of individuals while they are within the premises of an organization (e.g., healthcare institution). Nonetheless, such an extension would require additional infrastructure and usage of third-party services that might increase the operational costs of the organization.

Finally, due to the inherent nature of memory, the suggested personalization approach of DuoPass might be practical only for patients who (1) visit the same hospital on a frequent or regular basis, (2) do not suffer from memory impairment (e.g., Alzheimer's disease, dementia), and (3) are familiar with specific information of healthcare locations and are able to memorize it (e.g., patients who required emergency hospitalization due to an accident or were unconscious during their visit at the hospital might not be able to memorize the locations of the hospital). Nonetheless, in cases of one-off patients (i.e., patients who visit hospitals/clinics once a year for routine check-ups or required emergency hospitalization and are not able to memorize healthcare locations), DuoPass will be configured to not recommend personalized location-aware images but instead to recommend state-of-the-art non-location-aware/generic images that have been previously approved by the system administrator in terms of complexity and policies. Future extensions of our research will also consider the practicability of DuoPass with diverse patient communities, such as an elderly population with memory impairment.

8 Conclusion

This article presents a novel knowledge-based user authentication paradigm, which aims to provide a secure, memorable, and patient-centric authentication solution within current highly heterogeneous computational realms of healthcare environments. Results of a feasibility study, during which users interacted with the suggested authentication paradigm, revealed significant differences between the experimental and control groups in users’ password selections falling into PoI regions of the images and, subsequently, in the security strength of the selected graphical passwords. Furthermore, there was no interaction between the user group and users’ password selections on the time to compose the graphical password; however, the experimental group required significantly less time to create the last (third) selection of their password compared to the control group. At the same time, both experimental and control groups performed similarly in terms of memorability and login efficiency. Moreover, responses from the post-study survey revealed that the suggested paradigm scored high in terms of users’ likeability, perceived security, usability, and trust. On the downside, the suggested paradigm introduces password guessing vulnerabilities by allowing attackers who share common experiences with the end users to identify regions of the end users’ selected secrets more easily. Nonetheless, the results of the human guessing attack revealed that the security of the suggested paradigm is not compromised when additional measures (i.e., type and order of gestures) are considered.

We anticipate that the suggested approach will have a positive impact on both healthcare organizations and end users. From the organization's perspective, the flexible approach will help healthcare organizations to easily adjust their policies to the varying roles of their end users (patients, doctors, nurses), since current practice indicates that the “one-size-fits-all” approach is not adequate in the highly dynamic and heterogeneous contexts of use in the healthcare domain. From the end user's perspective, the suggested flexible and personalized paradigm and its supporting results open new directions for considering novel knowledge-based user authentication mechanisms that help end users choose the “best-fit” authentication scheme depending on preference, unique characteristics, and the context of interaction (e.g., interaction in the office, in the emergency room, off the network).

From a procedural perspective, given that DuoPass is solely based on knowledge-based authentication approaches, it is less expensive than token-based and biometric-based solutions, which entail increased implementation and maintenance costs, while the results of this study indicate increased security and a positive user experience. In addition, through its flexible and adaptable character (i.e., the shift between graphical and textual passwords), DuoPass still supports user interactions with the current state-of-the-art authentication approaches in the healthcare domain, which are solely based on traditional text-based approaches. In this respect, DuoPass also adapts easily to current multi-factor authentication approaches. For instance, DuoPass can be used as the first step during authentication, and any additional layer may be added as a following step to increase security.

A side effect of the approach relates to the password creation efficiency, since users require more time to create their password (graphical and textual) than with the traditional approach (only textual). Nonetheless, the majority of the participants commented that this has not negatively affected how much they liked DuoPass. For example, a user commented that “The creation phase happens only once so I'm ok with that”. Future work will focus on improving the efficiency of the password creation phase with alternative visual and interaction designs. Furthermore, the open-ended nature of the suggested authentication paradigm raises new security threats and might push users toward misuse strategies that need to be carefully addressed. To ensure that users will not create semantically insecure (predictable) selections on images as a side effect of allowing them to create their own images for the graphical passwords, automated image tagging technologies will be used to prevent users’ unsafe coping strategies. Furthermore, evidence suggests that individuals, in an attempt to reduce the memory load of remembering multiple passwords, tend to reuse the same or similar passwords across multiple accounts [13], which has a negative impact on security. Hence, another future research prospect would be to investigate whether differences exist in individuals’ perceptions about reduced memory load between individuals who utilize location-aware images in DuoPass and individuals who utilize non-location-aware images in other graphical user authentication schemes. In addition, future work entails investigating whether differences exist in password reuse behavior between individuals who utilize location-aware images in DuoPass and individuals who utilize non-location-aware images in other graphical user authentication schemes.

Bearing in mind that within today's information era patients and medical staff interact in highly dynamic healthcare environments and contexts, and tend to use multiple devices to authenticate themselves, it is obvious that the current widely deployed “one-size-fits-all” text-based authentication paradigm might soon become obsolete. Hence, we believe that approaches like DuoPass provide an alternative solution to current state-of-the-art research and practice, and have the potential to be easily adopted with a rather inexpensive solution compared to other token-based (e.g., smartcards) and biometric-based solutions (e.g., fingerprint), which necessitate increased implementation and maintenance costs. Although initial experiments are promising, further studies are required to evaluate DuoPass in the wild with the aim to get further insights on its validity, user acceptance, and real-world user behavior.

Appendix

Timeline

Phase A *(6 months)*	Literature Review and Needs Verification (Section 2 of the article)
Verify and triangulate the state-of-the-art user authentication literature in the healthcare domain with diverse stakeholders of three European healthcare organizations
Literature Review
— Sources: ACM Digital Library, IEEE Xplore
— Keywords: authentication; password; biometric; locimetric; drawmetric; healthcare; health
— Number of papers reviewed based on inclusion criteria: 40
— Publication date: 01/01/2015–01/06/2021
Triangulation of Literature with Healthcare Organizations
— Zuyderland Medical Center, Netherlands; ∼100K annual patients and users
— Hospital Clinic of Barcelona, Spain; ∼25K annual patients and users
— Western General Hospital, Scotland; ∼20K annual patients and users
Procedure
— Semi-structured interviews with key stakeholders (n = 9)
— Stakeholder profiles: Chief information security officers, enterprise architects, IT department managers, security experts, doctors, project managers

Phase B *(12 months)*	DuoPass Design and Development (Section 4 of the article)
Design and development of the DuoPass authentication system based on the “Single-Secret Two Reflections” paradigm, following a user-centered design approach
Design Considerations
— Security Factors
— Usability and User Experience Factors
— Adaptation and Personalization Factors
— Security and Usability Key Performance Indicators
Key Performance Indicators adopted for the evaluation study
— Password guessability
— Password creation efficiency
— Memory time
— Login time
— Users’ perceived security, usability, trust and likeability

Phase C *(11 months)*	User Evaluation with Participants of Healthcare Organizations (Sections 5 and 6 of the article)
Record users’ interactions with the suggested DuoPass approach or a state-of-the-art authentication approach, aiming to evaluate its security, memorability, and user experience
Sampling and Procedure
— Between-subjects’ feasibility study (n = 68); human guessing attack study (n = 92)
— Experimental group used the DuoPass authentication system, which included a graphical password system with location-aware images based on the suggested authentication paradigm
— Control group used a state-of-the-art authentication system, including non-location-aware images based on current state-of-the-art authentication approaches

Table A1. Research Methodology Outline

Algorithm 1. Algorithm for image recommendation during password creation/reset.

Acknowledgments

We sincerely thank participants of the healthcare organizations (Zuyderland Medical Center, Netherlands; Hospital Clinic Barcelona, Spain; Western General Hospital within NHS Lothian, Scotland, UK) for their time and efforts in conducting the user needs’ verification and evaluation studies, and the valuable feedback received during the surveys and focus groups.

Footnotes

¹ Zuyderland Medical Center: https://www.zuyderland.nl/english.

² Hospital Clinic Barcelona: https://www.clinicbarcelona.org/en.

³ Western General Hospital: https://www.nhslothian.scot.

⁴ Microsoft Picture Password blog: bit.ly/2SajCDO.

⁵ Google Cloud Vision: https://cloud.google.com/vision.

⁶ Amazon Rekognition: https://aws.amazon.com/rekognition.

⁷ TensorFlow: https://www.tensorflow.org.

References

[1] A. Abdellaoui, Y. I. Khamlichi, and H. Chaoui. 2016. A robust authentication scheme for telecare medicine information system. Procedia Computer Science 58 (2016), 584–589.
