
The password allocation problem

2013, Proceedings of the 12th ACM workshop on Workshop on privacy in the electronic society - WPES '13

The Password Allocation Problem: Strategies for Reusing Passwords Effectively

Rishab Nithyanand, Department of Computer Science, Stony Brook University, New York, USA, rnithyanand@cs.stonybrook.edu
Rob Johnson, Department of Computer Science, Stony Brook University, New York, USA, rob@cs.stonybrook.edu

ABSTRACT

Each Internet user has, on average, 25 password-protected accounts, but only 6.5 distinct passwords [4]. Despite the advice of security experts, users are clearly re-using passwords across multiple sites. So this paper asks the question: given that users are going to re-use passwords across multiple sites, how should they best allocate those passwords to sites so as to minimize their losses from accidental password disclosures? We provide both theoretical and practical results. First, we provide a mathematical formulation of the Password Allocation (PA) problem and show that it is NP-complete with a reduction via the 3-Partition problem. We then study several special cases and show that the optimal solution is often a contiguous allocation – i.e., similar accounts share passwords. Next, we evaluate several human- and machine-computable heuristics that have very good performance and produce solutions that are reasonably close to optimal. We find that the human-computable heuristics do not perform nearly as well as the machine-computable heuristics; however, they provide a useful and easy-to-follow set of guidelines for re-using passwords.

1. INTRODUCTION

Current guidelines for password re-use rarely go beyond: don't do it [7].
However, the problem of how to allocate and reuse passwords deserves more practical solutions, given the rapid proliferation of password-protected web services in recent years and two facts: (1) human memory rarely allows for the commitment of more than six to eight base passwords (i.e., not including minor variations, which are easily guessed) [4], and (2) evidence strongly suggests that user-chosen pseudonyms are easily linkable for a majority of users, so the compromise of a single account due to password disclosure leads to the effective compromise of all accounts that share that password [9], [8].

Our work focuses on developing strategies for password re-use for no-fault users. These are users who do not accidentally disclose their passwords to attackers themselves – i.e., they are phish-proof, key-logger free, etc. Instead, our focus is on password leaks caused by theft of password files stored by service providers. Recently, such thefts have impacted millions of users of services provided by organizations such as Canonical, FBI, Hotmail, IEEE, LinkedIn, Sony, Ubisoft, Yahoo!, and many others. Often, due to an organization's poor security practices, such as using plaintext password files (examples from the previous list are the FBI, IEEE, Sony, and Yahoo! [2]), even having high-entropy passwords and following password guidelines does not protect its users in the event of a server break-in.

In this paper we approach the problem of password allocation and reuse from both a theoretical and a practical standpoint. Towards an understanding of the theoretical aspects of the problem, we provide a mathematical formulation of the Password Allocation (PA) problem and show in section 2 that it is NP-complete. In section 3, we study several special cases and show that the optimal solution often has the property of contiguous allocations – i.e., similar accounts share passwords.
From the practical standpoint, in sections 4 and 5, we study several human-computable heuristics, which perform reasonably well, and machine-computable heuristics, which produce near-optimal solutions. The performance of these heuristics yields a preliminary set of guidelines for reusing passwords, given in section 7.

Categories and Subject Descriptors
F.2.0 [Analysis of Algorithms and Problem Complexity]: General; K.6.5 [Management of Computing and Information Systems]: Security and Protection; H.1.2 [Models and Principles]: User/Machine Systems – Human Factors

General Terms
Algorithms, Security, Human Factors

Keywords
Password re-use; Heuristics; Authentication; Usability

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. WPES'13, November 4, 2013, Berlin, Germany. Copyright is held by the owner/author(s). Publication rights licensed to ACM. ACM 978-1-4503-2485-4/13/11 ...$15.00. http://dx.doi.org/10.1145/2517840.2517870

1.1 Problem Definition

Informally, the Password Allocation (PA) problem may be described as follows: If a user has $n$ web accounts and $k$ passwords and the user's $i$th password is used for the $j$th account, then the probability, $q_{ij}$, that an attacker obtains the $i$th password by breaking into the $j$th web service is a function of (1) the security of the $j$th service and (2) the strength of the $i$th password (if service $j$ hashes passwords).
Thus, given estimates of the value ($v_j$) of the $j$th account (based on the perceived value or sensitivity of contained data), the security of each service provider (based on historical data and published policies), and the strength of each password (based on entropy), how should a user allocate passwords to accounts so as to minimize their expected loss (or, equivalently, maximize their expected surviving value) from server break-ins? The PA problem is mathematically formulated as:

Problem 1. PA Problem: Given $v_1, \dots, v_n$ and $p_{ij}$ for $i = 1, \dots, k$ and $j = 1, \dots, n$, find a partition $S_1 \cup \dots \cup S_k$ of $\{1, \dots, n\}$ that maximizes

$$ES = \sum_{i=1}^{k} \Big( \sum_{j \in S_i} v_j \prod_{j \in S_i} p_{ij} \Big).$$

Here, $p_{ij} = 1 - q_{ij}$, $S_i$ denotes the set of accounts allocated to the $i$th password, and $ES$ denotes the expected surviving value of all accounts after password allocation is complete. We will also use the notation $V_i = \sum_{j \in S_i} v_j$ and $P_i = \prod_{j \in S_i} p_{ij}$, so $ES = \sum_{i=1}^{k} ES_i = \sum_{i=1}^{k} P_i V_i$.

Accounts that provide the means to obtain access to other linked accounts that do not share the same password (e.g., email accounts that may be used to recover other account passwords) are referred to as gateway accounts. Our formulation allows the modeling of gateway accounts by setting the value of each gateway account to the sum of the values of the corresponding linked accounts. While the compromise probability of the gateway account does not need to be changed, the compromise probability of each account should incorporate the compromise probabilities of each of its gateway accounts.

The mathematical formulation of the password allocation problem requires users to assign values to the data contained in each service/account. We emphasize that there should be no standard guidelines for assigning values to the data (other than requiring users to maintain consistency between accounts).
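As a concrete reading of the objective in Problem 1, the following Python sketch evaluates ES for a given allocation and, for tiny instances only, searches all $k^n$ allocations exhaustively. The function names, the 0-based indexing, and the list-of-lists layout of the survival probabilities are illustrative assumptions and do not come from the paper.

```python
from itertools import product
from math import prod


def expected_surviving_value(values, survival, allocation):
    """Evaluate ES = sum_i (sum_{j in S_i} v_j) * (prod_{j in S_i} p_ij).

    values[j]        -- v_j, value of account j (0-based; an assumption)
    survival[i][j]   -- p_ij = 1 - q_ij for password i on account j
    allocation[j]    -- index i of the password assigned to account j
    """
    total = 0.0
    for i in range(len(survival)):
        members = [j for j, pw in enumerate(allocation) if pw == i]
        v_i = sum(values[j] for j in members)          # V_i
        p_i = prod(survival[i][j] for j in members)    # P_i (1 for an empty set)
        total += p_i * v_i
    return total


def brute_force_best(values, survival):
    """Exhaustive search over all k^n allocations; only viable for tiny n."""
    k, n = len(survival), len(values)
    return max(
        product(range(k), repeat=n),
        key=lambda a: expected_surviving_value(values, survival, a),
    )


if __name__ == "__main__":
    vals = [10, 5, 1]
    surv = [[0.9, 0.9, 0.9], [0.5, 0.5, 0.5]]  # password 0 is the stronger one
    best = brute_force_best(vals, surv)
    print(best, expected_surviving_value(vals, surv, best))
```

The exhaustive search is included only to make the objective tangible; its running time is exponential in the number of accounts.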
As long as users are aware of (1) the data made available in each service and (2) how the data may be used by an adversary, the absence of a standard procedure for assigning values to services is desirable, since it permits users to better protect accounts that they perceive to be more important – e.g., a company spokesperson might find that their social network accounts are generally more valuable (and important to protect) than their shopping accounts, while this might not be the case for other users. Note that compromise probabilities can be assigned using publicly available data about password file type (i.e., hashed vs. unhashed), recent password file compromises, and the estimated number of current users. Such information can be found in public breach/vulnerability databases (e.g., [2]).

2. COMPLEXITY

Problem 2. Decisional PA Problem (D-PA): Given an instance of the PA problem and a threshold $r$, does there exist an allocation such that $ES \ge r$?

Theorem 1. The D-PA problem is NP-Complete.

Proof. Membership in NP is easily established. Given a guess for $S_1, \dots, S_k$, we just compute $ES$ and verify that it is at least $r$. This can be done in time proportional to the length of the words representing the values and probabilities.

We reduce the well-studied 3-Partition problem (3P) [5] to the D-PA problem. Given an instance $I_{3P}$: $X = \{x_1, \dots, x_{3n}\}$ (where $\sum_{j=1}^{3n} x_j = nB$) of the 3P problem, we create an instance $I_{PA}$ of the D-PA problem with $3n$ accounts and $n$ passwords as follows: (1) pick a rational number $b \in (1, e^{4/(nB)})$, (2) set $v_j = x_j$, $\forall j \in \{1, \dots, 3n\}$, (3) set $p_{ij} = b^{-x_j}$, $\forall j \in \{1, \dots, 3n\}$ and $\forall i \in \{1, \dots, n\}$, and (4) set $r = n \cdot B \cdot b^{-B}$.

By construction, $\sum_{i=1}^{n} V_i = nB$ and, for any allocation $S_1, \dots, S_n$, we have $ES = \sum_{i=1}^{n} V_i b^{-V_i}$. We now argue that, because of the choice of the base $b$, $ES \le \sum_{i=1}^{n} B b^{-B} = r$. Consider any unequal allocation, i.e., one in which there exist $i$ and $i'$ such that $V_i \ne V_{i'}$.
We will show that the allocation would be improved by redistributing the value equally between passwords $i$ and $i'$. Consider the function $f(x) = x b^{-x} + (c - x) b^{-(c-x)}$, where $c = V_i + V_{i'} \le nB$. First, observe that $f'(c/2) = 0$. Second, since $b < e^{4/c}$, $f'(x) > 0$ for all $x \in [0, c/2)$ and $f'(x) < 0$ for $x \in (c/2, c]$. Hence $x = c/2$ is a global maximum. Thus, if an allocation has any $i, i'$ such that $V_i \ne V_{i'}$, we could improve it by replacing $V_i$ and $V_{i'}$ by $(V_i + V_{i'})/2$. Hence the optimal allocation must have $ES \le \sum_{i=1}^{n} B b^{-B} = r$, and this can only be obtained if $V_1 = \dots = V_n = B$.

Therefore, if there is a solution to the 3P problem, then there exists an allocation for the D-PA problem in which $V_1 = \dots = V_n = B$ and $ES = r$. If, on the other hand, there is an allocation of the D-PA problem such that $ES \ge r$, then we must have $ES = r$ and $V_1 = \dots = V_n = B$, and there is a solution to the 3P problem.

Theorem 1 shows that even the restricted D-PA problem, where probabilities are account dependent but password independent (e.g., when all passwords are equally secure, or all websites use unhashed password files), is weakly NP-Complete. It remains an open question whether the D-PA problem is strongly NP-Complete.

3. SPECIAL CASES: WHEN OPTIMAL ALLOCATIONS ARE CONTIGUOUS

3.1 Identical Risk Accounts

Here, each account possesses a distinct value and compromise probabilities that are password dependent (but account independent) – i.e., we have $p_{ij} = p_i$ ($\forall i, \forall j$). This special case is applicable to the PA problem when dealing with organizations that are known to use hashed password files and are (more or less) equally vulnerable to attacker break-ins. The compromise probabilities are dependent only on the strength of the passwords allocated to each account. The problem can be stated as: Given $p_{ij} = p_i$ and $v_j$ for $i = 1, \dots, k$ and $j = 1, \dots, n$, find a partition $S_1 \cup \dots \cup S_k$ of $\{1, \dots, n\}$ that maximizes

$$ES = \sum_{i=1}^{k} \Big( \sum_{j \in S_i} v_j \, p_i^{|S_i|} \Big).$$

Theorem 2.
If accounts and passwords are ordered by decreasing values ($v_j$) and survival probabilities ($p_i$), respectively, the allocation of accounts to passwords is contiguous.

Proof. Let $v_1 \ge v_2 \ge \dots \ge v_n$. Consider the allocation given by $S = \langle S_1, \dots, S_k \rangle$ where, without loss of generality, $p_1^{|S_1|} \ge p_2^{|S_2|} \ge \dots \ge p_k^{|S_k|}$. Let $S_i$ be the first password with a non-contiguous allocation – i.e., account $l \in S_i$ but account $m \in S_{i+1}$ (where $v_l \le v_m$). Now, consider the allocation given by $S^* = \langle S_1, \dots, S_i^*, S_{i+1}^*, \dots, S_k \rangle$ where account $l \in S_{i+1}^*$ and account $m \in S_i^*$ – i.e., the allocation where the passwords of accounts $l$ and $m$ are swapped. Since the swap leaves $|S_i|$ and $|S_{i+1}|$ unchanged, observe that:

$$[ES_i + ES_{i+1}] - [ES_i^* + ES_{i+1}^*] = p_i^{|S_i|}(v_l - v_m) - p_{i+1}^{|S_{i+1}|}(v_l - v_m) = \big(p_i^{|S_i|} - p_{i+1}^{|S_{i+1}|}\big)(v_l - v_m) \le 0.$$

Therefore, the expected survival value of a contiguous allocation is always at least as good as that of any non-contiguous allocation.

Since the optimal allocation of accounts to passwords is contiguous (given accounts sorted by their values), the following recursive relation may be used to find the optimal allocation:

$$OPT(i, j) = \max_{1 \le l \le i} \Big[ p_j^{\,i-l} \sum_{m=l}^{i} v_m + OPT(l, j-1) \Big].$$

Here, $OPT(i, 0) = \infty$ and $OPT(n, k)$ returns the maximum expected survival value for $n$ accounts allocated to $k$ passwords in $O(n^2 k)$ time.

3.2 Identical Passwords and Identical Valued Accounts

In this scenario, all accounts are (nearly) equally valuable and have compromise probabilities that are account dependent (but password independent) – i.e., we have (w.l.o.g.) $p_{ij} = p_j$ and $v_j = 1$ ($\forall i, \forall j$). This applies, for example, when allocating passwords to equally valuable accounts where all passwords have the same entropy, or when password files are unhashed by all service providers. The problem can be stated as: Given $p_{ij} = p_j$ and $v_j = 1$ for $i = 1, \dots, k$ and $j = 1, \dots, n$, find a partition $S_1 \cup \dots \cup S_k$ of $\{1, \dots, n\}$ that maximizes

$$ES = \sum_{i=1}^{k} \Big( |S_i| \prod_{j \in S_i} p_j \Big).$$

Theorem 3.
If accounts are ordered by decreasing survival probabilities ($p_j$), each password is allocated a contiguous subset of accounts.

Proof. Without loss of generality, let $p_1 \ge p_2 \ge \dots \ge p_n$. Consider the allocation given by $S = \langle S_1, \dots, S_k \rangle$ where $\forall i \in \{1, \dots, k-1\}$ we have $ES_i \ge ES_{i+1}$. Let $S_i$ be the first password with a non-contiguous allocation – i.e., account $l \in S_i$ but account $m \in S_{i+1}$ (where $p_l \le p_m$). Now, consider the allocation given by $S^* = \langle S_1, \dots, S_i^*, S_{i+1}^*, \dots, S_k \rangle$ where account $l \in S_{i+1}^*$ and account $m \in S_i^*$ – i.e., the allocation where the passwords for accounts $l$ and $m$ are swapped. Now, observe that:

$$[ES_i + ES_{i+1}] - [ES_i^* + ES_{i+1}^*] = [ES_i + ES_{i+1}] - \Big[ \frac{p_m}{p_l} ES_i + \frac{p_l}{p_m} ES_{i+1} \Big] \le [ES_i + ES_{i+1}] - \Big[ ES_i + ES_{i+1} \Big( \frac{p_m}{p_l} + \frac{p_l}{p_m} - 1 \Big) \Big] \le 0.$$

Therefore, the expected survival value of a contiguous allocation is always at least as good as that of any non-contiguous allocation.

Since the optimal allocation of accounts to passwords is contiguous (given accounts sorted by their survival probabilities), the following recursive relation may be used to find the optimal allocation:

$$OPT(i, j) = \max_{1 \le l \le i} \Big[ (i - l) \prod_{m=l}^{i} p_m + OPT(l, j-1) \Big].$$

Here, $OPT(i, 0) = \infty$ and $OPT(n, k)$ returns the maximum expected survival value for $n$ accounts allocated to $k$ passwords in $O(n^2 k)$ time.

3.3 Identical Passwords and At Least Exponentially Varying Accounts

Here, each account possesses a distinct value and compromise probabilities that are account dependent (but password independent) – i.e., we have $p_{ij} = p_j$ ($\forall i, \forall j$). This applies, for example, when allocating passwords to accounts with a large range of values where all passwords have the same entropy, or when password files are unhashed by all service providers. Observe that when we make no assumptions about the distributions of the values and probabilities, the problem is NP-Complete as shown in Theorem 1.
However, the problem becomes efficiently solvable in the case where we have $p_j \le \prod_{l=1}^{j-1} p_l$ and $v_j \ge \sum_{l=j+1}^{n} v_l$, $\forall j \in \{1, \dots, n\}$, as illustrated by Theorem 4.

Theorem 4. If accounts are ordered by decreasing values ($v_j$), then the allocation of accounts to passwords is contiguous.

Proof. Let $v_1 \ge v_2 \ge \dots \ge v_n$. Consider the allocation given by $S = \langle S_1, \dots, S_k \rangle$ where, without loss of generality, $ES_1 \ge ES_2 \ge \dots \ge ES_k$. Let $S_i$ be the first password with a non-contiguous allocation – i.e., account $l \in S_i$ but account $m \in S_{i+1}$ (where $v_l < v_m$). Now, consider the allocation given by $S^* = \langle S_1, \dots, S_i^*, S_{i+1}^*, \dots, S_k \rangle$ where account $l \in S_{i+1}^*$ and account $m \in S_i^*$ – i.e., the allocation where the passwords of accounts $l$ and $m$ are swapped. Let $P_i$ and $V_i$ denote the survival product and total value of $S_i$ with account $l$ excluded, and $P_{i+1}$ and $V_{i+1}$ those of $S_{i+1}$ with account $m$ excluded:

$$P_i = \frac{1}{p_l} \prod_{a \in S_i} p_a, \quad P_{i+1} = \frac{1}{p_m} \prod_{a \in S_{i+1}} p_a, \quad V_i = \sum_{a \in S_i} v_a - v_l, \quad V_{i+1} = \sum_{a \in S_{i+1}} v_a - v_m.$$

Now, observe that:

$$[ES_i + ES_{i+1}] - [ES_i^* + ES_{i+1}^*] = -\big[ P_i V_i (p_m - p_l) + P_i (v_m p_m - v_l p_l) + P_{i+1} V_{i+1} (p_l - p_m) + P_{i+1} (p_l v_l - p_m v_m) \big] \le 0.$$

Therefore, the expected survival value of a contiguous allocation is always at least as good as that of any non-contiguous allocation.

Since the optimal allocation of accounts to passwords is contiguous (given accounts sorted by their values), the following recursive relation may be used to find the optimal allocation:

$$OPT(i, j) = \max_{1 \le l \le i} \Big[ \prod_{m=l}^{i} p_m \sum_{m=l}^{i} v_m + OPT(l, j-1) \Big].$$

Here, $OPT(i, 0) = \infty$ and $OPT(n, k)$ returns the maximum expected survival value for $n$ accounts allocated to $k$ passwords in $O(n^2 k)$ time.

4. HUMAN-COMPUTABLE HEURISTICS

From section 3 and other analysis, we observe that in many cases some contiguous allocation of accounts (sorted by value) to passwords results in optimal (or near-optimal) solutions to the PA problem. Based on this observation, we provide three simple human-computable heuristics for allocating passwords to accounts. A performance comparison of the heuristics is provided in Table 1.
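The contiguous-allocation recurrences of section 3 can be sketched as a small dynamic program. The Python sketch below covers only the password-independent case ($p_{ij} = p_j$, as in sections 3.2 and 3.3) and assumes the accounts are already sorted in the order for which contiguity holds. It uses its own indexing convention (0-based, disjoint blocks, base case "zero accounts, zero passwords" worth 0, at most $k$ non-empty blocks), which differs from the recurrence as printed; the function name and return format are hypothetical.

```python
def best_contiguous_allocation(values, survival, k):
    """Split accounts 0..n-1 into at most k contiguous blocks, one block per
    password, maximizing sum over blocks of (sum of v) * (product of p).

    values[j]   -- v_j, accounts already sorted (assumption)
    survival[j] -- p_j, password-independent survival probability
    Returns (best ES, list of blocks as lists of account indices).
    Runs in O(n^2 k) time.
    """
    n = len(values)
    if n == 0:
        return 0.0, []
    neg_inf = float("-inf")
    # opt[j][i]: best ES for the first i accounts using exactly j non-empty blocks
    opt = [[neg_inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    opt[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(1, n + 1):
            block_v, block_p = 0.0, 1.0
            # The last block covers accounts l..i-1; grow it leftwards.
            for l in range(i - 1, -1, -1):
                block_v += values[l]
                block_p *= survival[l]
                if opt[j - 1][l] == neg_inf:
                    continue
                cand = opt[j - 1][l] + block_v * block_p
                if cand > opt[j][i]:
                    opt[j][i], cut[j][i] = cand, l
    # Fewer than k blocks are allowed (needed when n < k).
    best_j = max(range(1, k + 1), key=lambda j: opt[j][n])
    blocks, i = [], n
    for j in range(best_j, 0, -1):
        l = cut[j][i]
        blocks.append(list(range(l, i)))
        i = l
    blocks.reverse()
    return opt[best_j][n], blocks
```

For instance, with values `[8, 4, 2, 1]`, survival probabilities `[0.9, 0.8, 0.7, 0.6]` and `k = 2`, the sketch pairs the two most valuable accounts on one password and the two least valuable on the other.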
PA problem instances for the results shown in Table 1 were randomly generated.

Table 1: Mean (µ) and standard deviation (σ) of the ratio between human-computable heuristic ES and upper-bound ES for varying n and k (50 trials each).

n     k     k-Drops (µ, σ)    Bounded Range (µ, σ)    NES (µ, σ)
25    5     .13, .070         .053, .011              .213, .071
25    10    .397, .101        .083, .028              .567, .068
50    10    .140, .052        .020, .008              .250, .044
50    25    .562, .056        .104, .021              .693, .025
100   25    .238, .039        .039, .006              .419, .032
100   50    .635, .044        .142, .022              .764, .015
250   50    .208, .031        .019, .003              .408, .016
250   100   .561, .038        .084, .009              .738, .008

4.1 The k-Drops Heuristic

This heuristic is based on the idea that similarly valued accounts should be allocated to the same password. In the k-Drops heuristic, accounts are ordered by decreasing values and the decrease in value between each pair of successive accounts is computed. Let the accounts with the $k-1$ largest drops in value be denoted by $\delta_1, \dots, \delta_{k-1}$ where $v_{\delta_1} \ge v_{\delta_2} \ge \dots \ge v_{\delta_{k-1}}$. Now the accounts are partitioned into $k$ subsets ($A_1, \dots, A_k$) as follows: accounts with values in the range $[v_1, v_{\delta_1}]$ are placed in subset $A_1$, ..., accounts with values in the range $(v_{\delta_{k-1}}, v_n]$ are placed in the subset $A_k$. Now, accounts in subset $A_l$ are allocated to the password for which their cumulative compromise probability is minimum – i.e., to password $i$ where $i \leftarrow \arg\min \{ \prod_{j \in A_l} p_{1j}, \dots, \prod_{j \in A_l} p_{kj} \}$ and password $i$ has not been allocated previously.

4.2 The Bounded Range Heuristic

The general idea behind this heuristic is to bound the range of values allocated to each password. As before, we order accounts by decreasing values. We then compute $\delta$ such that $\delta^k \ge v_1 - v_n$. Now the accounts are partitioned into $k$ subsets ($A_1, \dots, A_k$) as follows: accounts with values in the range $[v_n, v_n + \delta^1]$ are placed in subset $A_1$, ..., accounts with values in the range $(v_n + \delta^{k-1}, v_1]$ are placed in the $k$th subset $A_k$.
Now, accounts in the subset $A_l$ are allocated to the password for which their cumulative compromise probability is minimum – i.e., to password $i$ where $i \leftarrow \arg\min \{ \prod_{j \in A_l} p_{1j}, \dots, \prod_{j \in A_l} p_{kj} \}$ and password $i$ has not been allocated previously.

4.3 The Near Even Split (NES) Heuristic

The general idea behind the NES heuristic is to allocate a near-equal value to each password. In the NES heuristic, accounts are ordered by decreasing values and the running sum of values is computed. Let $T \leftarrow \lceil \sum_{j=1}^{n} v_j / k \rceil$ and $T_i \leftarrow \sum_{j=i}^{n} v_j$. Now the accounts are partitioned into $k$ subsets ($A_1, \dots, A_k$) as follows: accounts which have $T_i$ in the range $(T \cdot (j-1), T \cdot j]$ are placed in the $j$th subset $A_j$. Similar to the k-Drops heuristic, accounts in subset $A_l$ are allocated to the password for which their cumulative compromise probability is minimum – i.e., to password $i$ where $i \leftarrow \arg\min \{ \prod_{j \in A_l} p_{1j}, \dots, \prod_{j \in A_l} p_{kj} \}$ and password $i$ has not been allocated previously.

Based on the impressive performance of the NES heuristic, one may be tempted to claim that optimal solutions must have nearly equal values allocated to each password (regardless of contiguity); however, simple counter-examples can be constructed by manipulating the compromise probabilities of certain account-password pairs.

5. MACHINE COMPUTABLE HEURISTICS

The heuristics presented here are based on greedy and dynamic programming approaches which are unlikely to be usable as human-computable heuristics. A performance comparison of the heuristics is provided in Table 2. PA problem instances for the results shown in Table 2 were randomly generated.

Table 2: Mean (µ) and standard deviation (σ) of the ratio between machine-computable heuristic ES and upper-bound ES for varying n and k (50 trials each).

n     k     O-MMR (µ, σ)    C-MMR (µ, σ)    DP-H (µ, σ)
25    5     .491, .054      .518, .076      .498, .092
25    10    .799, .024      .807, .041      .799, .043
50    10    .594, .027      .561, .02       .594, .041
50    25    .924, .019      .930, .011      .927, .024
100   25    .862, .029      .851, .031      .862, .021
100   50    .956, .027      .956, .006      .951, .009
250   50    .898, .013      .871, .019      .901, .028
250   100   .973, .003      .973, .004      .973, .002

5.1 The Ordered Maximum Marginal Return

The Maximum Marginal Return algorithm is an $O(nk)$ greedy approach which allocates each account to the password that results in the largest increase (or smallest decrease) in the value of the objective function (i.e., the ES). The accounts are allocated in order of decreasing value using the algorithm illustrated in Algorithm 1. Our experimental analysis revealed that allocating in decreasing order of value performed better (on average) than when accounts were unordered, or ordered by increasing values.

Algorithm 1 The MMR and CMMR Algorithms
function MMR(n, k, v_1, ..., v_n, p_11, ..., p_kn, cmmr)    ⊲ cmmr ← 1 if running Clairvoyant MMR
    for i = 1 → k do
        S_i ← ∅, ES_i ← 0, S_dumpster ← ∅
    end for
    for i = 1 → n do
        for j = 1 → k do
            T_j ← S_j ∪ {i}, δ_j ← Compute-ES(T_j) − ES_j
        end for
        m ← arg max{δ_1, ..., δ_k}
        if cmmr = 1 ∧ δ_m < 0 then
            S_dumpster ← S_dumpster ∪ {i}
        else
            S_m ← S_m ∪ {i}, ES_m ← Compute-ES(S_m)
        end if
    end for
    if cmmr = 1 then
        for j = 1 → k do
            T_j ← S_j ∪ S_dumpster, δ_j ← Compute-ES(T_j) − ES_j
        end for
        m ← arg max{δ_1, ..., δ_k}
        S_m ← S_m ∪ S_dumpster    ⊲ step implied by section 5.2; not shown in the extracted listing
    end if
    return ⟨S_1, ..., S_k⟩
end function

5.2 Clairvoyant MMR

The following variation is made to the Ordered-MMR algorithm: If there is no password to which the current account can be allocated without causing a drop in the cumulative expected survival value, then that account is placed in a dumpster. After the initial allocation of all accounts is complete, all accounts in the dumpster are allocated together (as a single account) to the one password that experiences the smallest drop in ES.

5.3 Dynamic Programming Based Heuristic

The DPH algorithm runs in $O(n^2 k)$ time and is loosely based on a dynamic programming approach with state space trimming [10]. Consider allocating the accounts to passwords one at a time, i.e., we allocate the first account, then the second, etc. Let $S_{ti}$ be the set of accounts allocated to the $i$th password at time step $t$, $V_{ti} = \sum_{j \in S_{ti}} v_j$, $P_{ti} = \prod_{j \in S_{ti}} p_{ij}$, $ES_{ti} = P_{ti} V_{ti}$, and $ES_t = \sum_{i=1}^{k} ES_{ti}$. If we allocate the $(t+1)$st account to the $\ell$th password, then we will have $P_{t+1,i} = P_{t,i} \, p_{\ell,t+1}$ if $i = \ell$ and $P_{t+1,i} = P_{t,i}$ otherwise. Similarly, $V_{t+1,i} = V_{t,i} + v_{t+1}$ if $i = \ell$ and $V_{t+1,i} = V_{t,i}$ otherwise. Thus $(V_{t1}, \dots, V_{tk}, P_{t1}, \dots, P_{tk})$ is the only state information we need to compute the state after allocating the $(t+1)$st account. This gives a dominance relation among allocations: if $V_{ti} \le V'_{ti}$ and $P_{ti} \le P'_{ti}$ for all $i$, then every extension of allocation $(V_{t1}, \dots, V_{tk}, P_{t1}, \dots, P_{tk})$ will have an ES no higher than the corresponding extension of $(V'_{t1}, \dots, V'_{tk}, P'_{t1}, \dots, P'_{tk})$. Thus we only need to consider $(V'_{t1}, \dots, V'_{tk}, P'_{t1}, \dots, P'_{tk})$ in our search for the optimal allocation.

Now, we reduce the size of the state space from exponentially to polynomially large by collapsing similar states into one. To do this, we divide the state space into $n$ uniformly sized blocks ($\sqrt{n}$ regions for the $P$s and $V$s, respectively) and maintain exactly $n$ states for each iteration of the dynamic program – i.e., the state with the highest ES for each block. Therefore, in each iteration, no more than $n$ promising states are maintained while the remaining states are culled. Finally, after all $n$ accounts are allocated, the state with the maximum ES is returned. The algorithm is illustrated in Algorithm 2.

Algorithm 2 Dynamic Program Based Heuristic
function DOH-Solve(v_1, ..., v_n, p_11, ..., p_kn)
    Φ_0 ← {(0, ..., 0, 1, ..., 1)}
    for j = 1 → n do
        Φ_j ← ∅
        for all (V_1, ..., V_k, P_1, ..., P_k) ∈ Φ_{j−1} do
            for all i = 1 → k do
                P'_ℓ = P_i · p_ij if i = ℓ;  P'_ℓ = P_ℓ if i ≠ ℓ    (for ℓ = 1, ..., k)
                V'_ℓ = V_i + v_j if i = ℓ;  V'_ℓ = V_ℓ if i ≠ ℓ    (for ℓ = 1, ..., k)
                Φ_j ← Φ_j ∪ {(V'_1, ..., V'_k, P'_1, ..., P'_k)}
            end for
        end for
        Φ_j ← Trim(Φ_j)    ⊲ Divide state-space into n blocks and store the state with highest ES in each block.
    end for
    return max_{D ∈ Φ_n} ES(D)
end function

6. DISCUSSION: FUTURE RESEARCH

It is clear that the question of how to re-use passwords deserves significantly more attention and practical solutions than it has so far received from the academic community. Besides providing a mathematical framework for the password allocation problem, the focus of this paper has been to identify strategies to minimize the expected user-perceived value of data lost by sharing passwords between accounts, and the methods suggested in sections 4 and 5 appear to be reasonably successful at achieving this. However, there remain a number of avenues for future research that need to be pursued for the password allocation problem to be properly understood. Below, we identify several directions of future research and some important questions that need to be answered.

• User Password Re-use Behavior. Current user studies and empirical analyses [4], [6], [3], [1] have shown that users are clearly sharing passwords between sites. However, they shed little light on the following questions: (1) Are users aware of the risks of sharing passwords? (2) Do users share passwords arbitrarily? Or do they follow certain self-formulated guidelines (and if so, how are these formulated)? Finding answers to these questions could reveal whether the problem for end-user security is a lack of awareness of password re-use risk (which may be solved by simple methods – e.g., browsers that notify users of the risks of re-using passwords at every sign-up page), or simply the lack of standard guidelines, suggesting that users would be receptive to usable guidelines for password re-use.

• Usability of Re-use Guidelines and Heuristics.
Since any proposed guidelines and heuristics (including those presented in this paper) require active user participation, their usability will be one of the key factors influencing their potential acceptance. Therefore, usability studies need to be conducted for any proposed guidelines and heuristics in order to answer the following key questions: (1) Are users willing to undertake periodic (or one-time) overheads in order to minimize their expected losses? (2) Are users able to reasonably follow the guidelines without significant errors – i.e., errors that cause a significant increase in their expected loss? (3) How does the usability of re-use guidelines compare with the usability of password managers, in both the single-device and multi-device scenarios?

7. CONCLUSIONS

In this paper we studied the problem of how to effectively re-use passwords from a theoretical and practical standpoint. Although the general problem is shown to be NP-complete, several relevant and efficiently solvable special cases were identified. In addition, human- and machine-computable heuristics were evaluated and found to perform reasonably well. Based on the special cases identified and the performance of the human-computable heuristics, we provide the first set of practical guidelines for effective on-the-fly password allocation and re-use when the algorithms described in section 5 are inapplicable.

1. When dealing with accounts that have similar security infrastructure/policies and are known to use hashed password files, maintain a contiguous allocation. In this case, the k-Drops heuristic is found to be the best performing human-computable heuristic if the strength of passwords has high deviation from the mean. Otherwise, the NES heuristic performs best.

2. When dealing with equally valuable accounts (e.g., social networks) and similar strength passwords or accounts with unhashed password files, the optimal allocation is contiguous in terms of probabilities.
In this case, a variant of the k-Drops heuristic (where allocations are made based on the largest drops in the running product of probabilities) is found to work best.

3. When dealing with accounts that are highly varying (e.g., exponentially increasing values/probabilities), the same rules as case (1) apply.

4. For allocating passwords to new accounts on the fly (when generating a new password is not an option), inserting the account into the allocation generated by the NES heuristic is recommended, as long as periodic re-partitioning of accounts to passwords is performed.

5. For any other case not mentioned above, the NES heuristic appears to yield the best average performance.

6. The most effective method to improve heuristic performance is to reduce the n/k ratio.

8. REFERENCES

[1] Reused login credentials: Security advisory. Trusteer, Inc., February 2010.
[2] Office of inadequate security. http://www.databreaches.net/category/breach-types/exposure/, 2013.
[3] J. Bonneau. Measuring password re-use empirically. Light Blue Touchpaper, February 2011.
[4] D. Florencio and C. Herley. A large-scale study of web password habits. In Proceedings of the 16th international conference on World Wide Web, 2007.
[5] M. R. Garey and D. S. Johnson. Computers and Intractability; A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., 1990.
[6] S. Gaw and E. W. Felten. Password management strategies for online accounts. In Proceedings of the second symposium on Usable privacy and security, SOUPS '06, pages 44–55, New York, NY, USA, 2006. ACM.
[7] A. Huth, M. Orlando, and L. Pesante. Password security, protection, and management. United States Computer Emergency Readiness Team, October 2012.
[8] B. Ives, K. R. Walsh, and H. Schneider. The domino effect of password reuse. Communications of the ACM, 47(4):75–78, Apr. 2004.
[9] D. Perito, C. Castelluccia, M. Kaafar, and P. Manils. How unique and traceable are usernames? In Proceedings of Privacy Enhancing Technologies, 2011.
[10] G. J. Woeginger. When does a dynamic programming formulation guarantee the existence of an FPTAS? In Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms, 1999.