A Probabilistic Bound on the Basic Role Mining Problem and its Applications

Alessandro Colantonio, Roberto Di Pietro, Alberto Ocello, Nino Vincenzo Verde

Abstract

In this paper we describe a new probabilistic approach to the role engineering process for RBAC. In particular, we address the issue of minimizing the number of roles, a problem known in the literature as the Basic Role Mining Problem (basicRMP). We leverage the equivalence of this problem with the vertex coloring problem. Our main result is the proof that the minimum number of roles is sharply concentrated around its expected value. A further contribution is to show how this result can be applied as a stop condition when striving to find an approximation for the basicRMP. We also show that the proposal can be used to decide whether it is advisable to undertake the effort of renewing an RBAC state. Note that both these applications can result in a substantial saving of resources. A thorough analysis using advanced probabilistic tools supports our results. Finally, further relevant research directions are also highlighted.

1 Introduction

An access control model is an abstract representation of security technology, providing a high-level logical view to describe all peculiarities and behaviors of an access control system. Role-Based Access Control (RBAC, [1]) is certainly the most widespread access control model proposed in the literature for medium to large-size organizations.
Alessandro Colantonio: Engiweb Security, Roma, Italy, e-mail: alessandro.colantonio@eng.it; Università di Roma Tre, Roma, Italy, e-mail: colanton@mat.uniroma3.it
Roberto Di Pietro: Università di Roma Tre, Roma, Italy; UNESCO Chair in Data Privacy, Tarragona, Spain, e-mail: dipietro@{mat.uniroma3.it,urv.cat}
Alberto Ocello: Engiweb Security, Roma, Italy, e-mail: alberto.ocello@eng.it
Nino Vincenzo Verde: Università di Roma Tre, Roma, Italy, e-mail: nverde@mat.uniroma3.it

The simplicity of this model is one of the main reasons for its adoption: a role is just a collection of privileges, while users are assigned to roles based on the duties they have to fulfil [10]. The migration to RBAC brings several benefits, such as simplified system administration, enhanced organizational productivity, reduction in new employee downtime, enhanced system security and integrity, simplified regulatory compliance, and enhanced security policy enforcement [6]. To maximize the advantages offered by the role-based approach, the model must be customized to describe the organizational roles and functions of the company [3]. However, this migration process often has a high economic impact. To optimize the model customization, the role engineering discipline has been introduced. It can be defined as the set of methodologies and tools to define roles and to assign permissions to roles according to the actual needs of the company [5].

To date, various role engineering approaches have been proposed to address this problem. They are usually classified in the literature as top-down or bottom-up. The former carefully decomposes business processes into elementary components, identifying which system features are necessary to carry out specific tasks. This approach is mainly manual, as it requires a high-level analysis of the business.
The bottom-up class searches legacy access control systems to find de facto roles embedded in existing permissions. This process can be automated resorting to data mining techniques, thus leading to what is usually referred to as role mining. Since the bottom-up approach can be automated, it has attracted a lot of interest from researchers who proposed new data mining techniques particularly designed for role engineering purposes. Various role mining approaches can be found in the literature [3, 7, 12, 16–20, 22]. A problem partially addressed in these works is the “interestingness” of roles. Indeed, the importance of role completeness and role management efficiency resulting from the role engineering process has been evident from the earliest papers on the subject. However, only recently researchers have started to formalize the role-set optimality concept. One possible optimization approach is minimizing the total number of roles [7, 12, 18]. Yet, the identification of the role-set that describes the access control configuration with the minimum number of roles is an NP-complete problem [18]. Thus, all of the aforementioned papers just offer an approximation of the optimal solution in order to address the complexity of the problem. However, since none of them quantify the introduced approximations, it is not possible to estimate the quality of the proposed role mining algorithm outcomes. Contributions. In this paper we provide a probabilistic method to optimize the number of roles needed to cover all the existing user-permission assignments. The method leverages a known reduction of the role number minimization problem to the chromatic number of a graph. The main contribution of this work is proving that the optimal role number is sharply concentrated around its expected value. We further show how this result can be used as a stop condition when striving to find an approximation of the optimum for any role mining algorithm. 
The corresponding rationale is that if a result is close to the optimum, and the effort required to discover a better result is high, it might be appropriate to accept the current result.

Roadmap. This paper is organized as follows: Section 2 reports relevant related works. Section 3 summarizes the main concepts used in the rest of the paper; namely, a formal description of the RBAC model, some probabilistic tools, and a brief review of graph theory. In Section 4 the role minimization problem and the model used are formally described. Section 5 provides the main theoretical result and discusses some practical applications of this result. Finally, Section 6 presents some concluding remarks and further research directions.

2 Related Work

Kuhlmann et al. [11] first introduced the term “role mining”, trying to apply existing data mining techniques (i.e., clustering similar to k-means) to implement a bottom-up approach. The first algorithm explicitly designed for role engineering is described in [17], applying hierarchical clustering on permissions. Another example of a role mining algorithm is provided by Vaidya et al. [20]; they applied subset enumeration techniques to generate a set of candidate roles, computing all possible intersections among permissions possessed by users. The work of Colantonio et al. [3, 4] represents the first attempt to discover roles with semantic meanings. The authors define a metric for evaluating good collections of roles that can be used to minimize the number of candidate roles. Vaidya et al. [18, 19] also studied the problem of finding the minimum number of roles covering all permissions possessed by the users, calling it the basic Role Mining Problem (basicRMP). They also demonstrated that such a problem is NP-complete. Ene et al. [7] offer yet another alternative model to minimize the number of candidate roles.
In particular, they reduced the problem to the well-known minimum clique partition problem or, equivalently, to the minimum biclique cover problem. Actually, not only is role number minimization equivalent to clique covering, but it has also been reduced to many other NP problems, such as binary matrix factorization [12] and tiling databases [9], to cite a few. These reductions make it possible to apply fast graph reduction algorithms to exactly identify the optimal solution for some realistic data sets; however, the general problem is still NP-complete.

Recently, Frank et al. [8] proposed a probabilistic model for RBAC. They defined a framework that expresses user-permission relationships in a general way, specifying the related probability. Through this probability it is possible to elicit the role-user and role-permission assignments which then make the corresponding direct user-permission assignments more likely. The authors also presented a sampling algorithm that can be used to infer their model parameters. The algorithm converges asymptotically to the optimal value; the approach described in this paper can be used to offer a stop condition for the quest for the optimum.

3 Background

In this section we review all the notions used in the rest of the paper, namely the Role-Based Access Control entities, some probabilistic tools, and some graph theory concepts.

3.1 Role-Based Access Control

The RBAC entities of interest are:
• PERMS, the set of access permissions, namely all grantable operations for each system object;
• USERS, the set of all system users;
• $\mathit{ROLES} \subseteq 2^{\mathit{PERMS}}$, the set of all roles, namely permission combinations;
• $\mathit{UA} \subseteq \mathit{USERS} \times \mathit{ROLES}$, the set of user-role assignments; given a role, the function $\mathit{ass\_users} : \mathit{ROLES} \to 2^{\mathit{USERS}}$ identifies all the assigned users;
• $\mathit{PA} \subseteq \mathit{PERMS} \times \mathit{ROLES}$, the set of permission-role assignments; given a role, the function $\mathit{ass\_perms} : \mathit{ROLES} \to 2^{\mathit{PERMS}}$ identifies all the assigned permissions.

The RBAC model also allows a partial order to be established among roles, namely a hierarchy of roles based on permission set inclusion, identified by the set $\mathit{RH} \subseteq \mathit{ROLES} \times \mathit{ROLES}$. Although role hierarchies are very useful in certain applications, we are able to achieve our results without resorting to them; this greatly simplifies the analysis. In addition to the standard RBAC entities, the set $\mathit{UP} \subseteq \mathit{USERS} \times \mathit{PERMS}$ identifies permission-to-user assignments. In an access control system it is represented by entities describing access rights (e.g., access control lists).

3.2 Martingales and Azuma-Hoeffding Inequality

In this section we introduce some definitions and theorems that provide the mathematical basis we build on in the rest of the paper. In particular, we need to introduce martingales, Doob martingales, and the Azuma-Hoeffding inequality. These are well-known tools for the analysis of randomized algorithms [15, 21].

Definition 1 (Martingale). A sequence of random variables $Z_0, Z_1, \ldots, Z_n$ is a martingale with respect to the sequence $X_0, X_1, \ldots, X_n$ if for all $n \ge 0$ the following conditions hold:
• $Z_n$ is a function of $X_0, X_1, \ldots, X_n$,
• $\mathbb{E}[|Z_n|] < \infty$,
• $\mathbb{E}[Z_{n+1} \mid X_0, \ldots, X_n] = Z_n$,
where the operator $\mathbb{E}[\cdot]$ indicates the expected value of a random variable. A sequence of random variables $Z_0, Z_1, \ldots$ is called a martingale when it is a martingale with respect to itself, that is, $\mathbb{E}[|Z_n|] < \infty$ and $\mathbb{E}[Z_{n+1} \mid Z_0, \ldots, Z_n] = Z_n$.

Definition 2 (Doob Martingale). A Doob martingale refers to a martingale constructed using the following general approach. Let $X_0, X_1, \ldots, X_n$ be a sequence of random variables, and let $Y$ be a random variable with $\mathbb{E}[|Y|] < \infty$. (Generally, $Y$ will depend on $X_0, X_1, \ldots, X_n$.) Then $Z_i = \mathbb{E}[Y \mid X_0, \ldots, X_i]$, $i = 0, 1, \ldots$
, $n$, gives a martingale with respect to $X_0, X_1, \ldots, X_n$.

The previous construction ensures that the resulting sequence $Z_0, Z_1, \ldots, Z_n$ is always a martingale. Martingales are especially useful to predict the value of a random variable $Y$ that is a function of the random variables $X_1, \ldots, X_n$. In this case, we can use a Doob martingale where $Z_0, Z_1, \ldots, Z_n$ represents a sequence of refined estimates of the value of $Y$, each incorporating more information on the values of the random variables $X_1, X_2, \ldots, X_n$. $Z_0$ is just the expectation of $Y$, whereas $Z_i$ is the expected value of $Y$ when the values of $X_1, \ldots, X_i$ are known. This way, if $Y$ is fully determined by $X_1, \ldots, X_n$, then $Z_n = Y$. A useful property of martingales that we will use in this paper is the Azuma-Hoeffding inequality [15]:

Theorem 1 (Azuma-Hoeffding inequality). Let $X_0, \ldots, X_n$ be a martingale such that $B_k \le X_k - X_{k-1} \le B_k + d_k$ for some constants $d_k$ and for some random variables $B_k$ that may be functions of $X_0, X_1, \ldots, X_{k-1}$. Then, for all $t \ge 0$ and any $\lambda > 0$,

$$\Pr(|X_t - X_0| \ge \lambda) \le 2 \exp\left(\frac{-2\lambda^2}{\sum_{k=1}^{t} d_k^2}\right) \tag{1}$$

The Azuma-Hoeffding inequality applied to the Doob martingale gives the so-called Method of Bounded Differences (MOBD) [14].

3.3 Graphs Modeling

We shall now review some graph-related concepts that will be used to generate our RBAC model. A graph $G$ is an ordered pair $G = \langle V, E \rangle$, where $V$ is the set of vertices, and $E$ is a set of unordered pairs of vertices. We say that $v, w \in V$ are endpoints of the edge $\langle v, w \rangle \in E$. Given a subset $S$ of the vertices $V(G)$, the subgraph induced by $S$ is the graph whose set of vertices is $S$, and whose edges are the members of $E(G)$ such that the corresponding endpoints are both in $S$. We denote with $G[S]$ the subgraph induced by $S$.
A bipartite graph is a graph where the set of vertices can be partitioned into two subsets $V_1$ and $V_2$ such that for every edge $\langle v_1, v_2 \rangle \in E(G)$, $v_1 \in V_1$ and $v_2 \in V_2$. A clique is a subset $S$ of vertices in $G$ such that the subgraph induced by $S$ is a complete graph, namely for every two vertices in $S$ there exists an edge connecting the two. A biclique in a bipartite graph, also called bipartite clique, is a set of vertices $B_1 \subseteq V_1$ and $B_2 \subseteq V_2$ such that $\langle b_1, b_2 \rangle \in E$ for all $b_1 \in B_1$ and $b_2 \in B_2$. In other words, if $G$ is a bipartite graph, a set $S$ of vertices of $V(G)$ is a biclique if and only if the subgraph induced by $S$ is a complete bipartite graph. In this case we will say that the vertices of $S$ induce a biclique in $G$. A maximal clique or biclique is a set of vertices that induces a complete subgraph, and that is not a subset of the vertices of any larger complete subgraph.

A clique cover of $G$ is a collection of cliques $C_1, \ldots, C_k$ such that for each edge $\langle u, v \rangle \in E$ there is some $C_i$ that contains both $u$ and $v$. A minimum clique partition (MCP) of a graph is a smallest (by cardinality) collection of cliques such that each vertex is a member of exactly one of the cliques; it is a partition of the vertices into cliques. Similar to the clique cover, a biclique cover of $G$ is a collection of bicliques $B_1, \ldots, B_k$ such that for each edge $\langle u, v \rangle \in E$ there is some $B_i$ that contains both $u$ and $v$. We say that $B_i$ covers $\langle u, v \rangle$ if $B_i$ contains both $u$ and $v$. Thus, in a biclique cover, each edge of $G$ is covered by at least one biclique. A minimum biclique cover (MBC) is the smallest collection of bicliques that covers the edges of a given bipartite graph or, in other words, a biclique cover of minimum cardinality.

4 Problem Modelling

In this section we show how to model our role engineering approach.
4.1 Definitions

The following definitions are required to formally describe the role engineering problem:

Definition 3 (System Configuration). Given an access control system, we refer to its configuration as the tuple $\varphi = \langle \mathit{USERS}, \mathit{PERMS}, \mathit{UP} \rangle$, that is, the set of all existing users, permissions, and the corresponding relationships between them within the system.

A system configuration represents the user authorization state before migrating to RBAC, or the authorizations derivable from the current RBAC implementation. In the latter case, the user-permission relationships may be derived as

$$\mathit{UP} = \{\langle u, p \rangle \mid \exists r \in \mathit{ROLES} : u \in \mathit{ass\_users}(r) \wedge p \in \mathit{ass\_perms}(r)\}$$

Definition 4 (RBAC State). An RBAC state is a tuple $\psi = \langle \mathit{ROLES}, \mathit{UA}, \mathit{PA} \rangle$, namely an instance of all the sets characterizing the RBAC model.

An RBAC state is used to obtain a system configuration. Indeed, the role engineering goal is to find the “best” state that correctly describes a given configuration. In particular, we are interested in finding the following kind of states:

Definition 5 (Candidate Role-Set). Given an access control system configuration $\varphi$, a candidate role-set is the RBAC state $\psi$ that “covers” all possible combinations of permissions possessed by users according to $\varphi$, namely a set of roles such that, for each user, the union of the permissions of some of these roles exactly matches the permissions possessed by that user. Formally

$$\forall u \in \mathit{USERS},\ \exists R \subseteq \mathit{ROLES} : \bigcup_{r \in R} \mathit{ass\_perms}(r) = \{p \in \mathit{PERMS} \mid \langle u, p \rangle \in \mathit{UP}\}.$$

Definition 6 (Cost Function). Let $\Phi, \Psi$ be respectively the set of all possible system configurations and RBAC states. We refer to the cost function cost as

$$\mathit{cost} : \Phi \times \Psi \to \mathbb{R}^{+}$$

where $\mathbb{R}^{+}$ indicates positive real numbers including 0; it represents an administration cost estimate for the state $\psi$ used to obtain the configuration $\varphi$.

The administration cost concept was first introduced in [3].
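To make Definitions 3 to 5 concrete, the following minimal Python sketch represents UP as a set of (user, permission) pairs and each role as a frozenset of permissions, then checks the covering condition of Definition 5. The function names user_perms and is_candidate_role_set and the toy data are illustrative assumptions; they do not come from the paper.

```python
from collections import defaultdict


def user_perms(up):
    """Map each user to the set of permissions it holds according to UP."""
    perms = defaultdict(set)
    for user, perm in up:
        perms[user].add(perm)
    return dict(perms)


def is_candidate_role_set(users, up, roles):
    """Check Definition 5: every user's permissions are exactly a union of roles.

    Only roles fully contained in the user's permission set can be used
    without over-granting, so it is enough to take the union of all such
    roles and compare it with the user's permissions.
    """
    held = user_perms(up)
    for user in users:
        target = held.get(user, set())
        usable = [role for role in roles if role <= target]
        if set().union(*usable) != target:
            return False
    return True


# Hypothetical toy configuration (assumption, for illustration only).
USERS = {"alice", "bob"}
UP = {("alice", "read"), ("alice", "write"), ("bob", "read")}

roles_ok = [frozenset({"read"}), frozenset({"write"})]
roles_bad = [frozenset({"read", "write"})]  # cannot describe bob exactly

print(is_candidate_role_set(USERS, UP, roles_ok))
print(is_candidate_role_set(USERS, UP, roles_bad))
```

In this toy case the two single-permission roles form a candidate role-set, while the single combined role does not, because assigning it to bob would grant a permission bob does not hold.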
Leveraging the cost metric makes it possible to find the candidate role-sets that require the lowest effort to administer the resulting RBAC state.

Definition 7 (Optimal Candidate Role-Set). Given a configuration $\varphi$, an optimal candidate role-set is the RBAC state $\psi$ that simultaneously represents a candidate role-set for $\varphi$ and minimizes the cost function $\mathit{cost}(\varphi, \psi)$.

The main goal related to mining roles is to find optimal candidate role-sets. In the next section we focus on optimizing a particular cost function. Let cost indicate the number of needed roles. The role mining objective then becomes to find a candidate role-set that has the minimum number of roles for a given system configuration. This is exactly the Basic Role Mining Problem. We will show that this problem is equivalent to that of finding the chromatic number of a given graph. Using this problem equivalence, we will identify a useful property on the concentration of the optimal candidate role-sets. This allows us to provide a stop condition for any iterative role mining algorithm that approximates the minimum number of roles.

4.2 The proposed model

Given the configuration $\varphi = \langle \mathit{USERS}, \mathit{PERMS}, \mathit{UP} \rangle$ we can build a bipartite graph $G = \langle V, E \rangle$, where the vertex set $V$ is partitioned into the two disjoint subsets USERS and PERMS, and where $E$ is a set of pairs $\langle u, p \rangle$ such that $u \in \mathit{USERS}$ and $p \in \mathit{PERMS}$. Two vertices $u$ and $p$ are connected if and only if $\langle u, p \rangle \in \mathit{UP}$. A biclique cover of the graph $G$ identifies a unique candidate role-set for the configuration $\varphi$ [7], that is, $\psi = \langle \mathit{ROLES}, \mathit{UA}, \mathit{PA} \rangle$. Indeed, every biclique identifies a role, and the vertices of the biclique identify the users and the permissions assigned to this role. Let the function cost return the number of roles, that is:

$$\mathit{cost}(\varphi, \psi) = |\mathit{ROLES}| \tag{2}$$

In this case, minimizing the cost function is equivalent to finding a candidate role-set that minimizes the number of roles. This corresponds to basicRMP.
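The correspondence between a biclique cover and an RBAC state can be sketched as follows. This is a minimal illustration under stated assumptions: each biclique is given as a (users, permissions) pair of sets, roles are identified by generated names such as r0 (a naming convention chosen here, not by the paper), and the helper names are hypothetical.

```python
def is_biclique_cover(up, bicliques):
    """Return True if every block is a biclique of G and all edges in UP are covered."""
    covered = set()
    for users, perms in bicliques:
        block = {(u, p) for u in users for p in perms}
        if not block <= up:  # some user-permission pair is missing: not a biclique
            return False
        covered |= block
    return covered == up


def state_from_bicliques(bicliques):
    """Derive an RBAC state (ROLES, UA, PA) with one role per biclique."""
    roles, ua, pa = [], set(), set()
    for idx, (users, perms) in enumerate(bicliques):
        role = f"r{idx}"  # hypothetical role identifier
        roles.append(role)
        ua.update((u, role) for u in users)
        pa.update((p, role) for p in perms)
    return roles, ua, pa


def cost(roles):
    """Equation 2: the cost is the number of roles."""
    return len(roles)


# Hypothetical toy configuration (assumption, for illustration only).
UP = {("alice", "read"), ("alice", "write"), ("bob", "read")}
cover = [({"alice", "bob"}, {"read"}), ({"alice"}, {"write"})]

roles, ua, pa = state_from_bicliques(cover)
print(is_biclique_cover(UP, cover), cost(roles))
```

Here the two bicliques cover all three edges of the toy graph, so the derived state has cost 2; the single block {alice, bob} × {read, write} would be rejected because bob does not hold write.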
Let $B$ be a biclique cover of a graph $G$. We define the function $\mathit{cost}'$ as:

$$\mathit{cost}'(B) = \mathit{cost}(\varphi, \psi)$$

where $\psi$ is the state $\langle \mathit{ROLES}, \mathit{UA}, \mathit{PA} \rangle$ that can be deduced from the biclique cover $B$ of $G$, and $G$ is the bipartite graph built from the configuration $\varphi$ that is uniquely identified by $\langle \mathit{USERS}, \mathit{PERMS}, \mathit{UP} \rangle$. In this model, the problem of finding an optimal candidate role-set can be equivalently expressed as finding a biclique cover for a given bipartite graph $G$ that minimizes the number of required bicliques. This is exactly the minimum biclique cover (MBC) problem. In the following we first recall both the reduction of the MBC problem to the minimum clique partition (MCP) problem [7] and the reduction of MCP to the chromatic number problem.

From the graph $G$, it is possible to construct a new undirected unipartite graph $G'$ where the edges of $G$ become the vertices of $G'$: two vertices in $G'$ are connected by an edge if and only if the endpoints of the corresponding edges of $G$ induce a biclique in $G$. Formally:

$$G' = \langle E, \{\langle e_1, e_2 \rangle \mid e_1, e_2 \text{ induce a biclique in } G\} \rangle$$

The vertices of a (maximal) clique in $G'$ correspond to a set of edges of $G$ whose endpoints induce a (maximal) biclique in $G$. The edges covered by a (maximal) biclique of $G$ induce a (maximal) clique in $G'$. Thus, every biclique edge cover of $G$ corresponds to a collection of cliques of $G'$ such that their union contains all of the vertices of $G'$. From such a collection, a clique partition of $G'$ can be obtained by removing any redundantly covered vertex from all but one of the cliques to which it belongs. Similarly, any clique partition of $G'$ corresponds to a biclique cover of $G$. Thus, the size of a minimum biclique cover of a bipartite graph $G$ is equal to the size of a minimum clique partition of $G'$.
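The construction of G′ can be sketched in a few lines of Python. Two edges ⟨u1, p1⟩ and ⟨u2, p2⟩ of G induce a biclique exactly when the two crossing pairs ⟨u1, p2⟩ and ⟨u2, p1⟩ also belong to UP. The greedy clique partition shown afterwards is a simple heuristic added here for illustration only: it is not an algorithm from the paper and it yields an upper bound on the minimum number of roles, not the optimum. All function names are assumptions.

```python
from itertools import combinations


def build_g_prime(up):
    """Adjacency sets of G': its vertices are the edges of G (the pairs in UP)."""
    adj = {edge: set() for edge in up}
    for (u1, p1), (u2, p2) in combinations(sorted(up), 2):
        # e1 and e2 induce a biclique iff both crossing pairs are in UP.
        if (u1, p2) in up and (u2, p1) in up:
            adj[(u1, p1)].add((u2, p2))
            adj[(u2, p2)].add((u1, p1))
    return adj


def greedy_clique_partition(adj):
    """Put each vertex of G' into the first clique it is fully adjacent to."""
    cliques = []
    for vertex in sorted(adj):
        for clique in cliques:
            if all(other in adj[vertex] for other in clique):
                clique.append(vertex)
                break
        else:
            cliques.append([vertex])
    return cliques


def roles_from_cliques(cliques):
    """Each clique of G' yields a biclique of G, i.e. a (users, permissions) role."""
    return [({u for u, _ in c}, {p for _, p in c}) for c in cliques]


# Hypothetical toy configuration (assumption, for illustration only).
UP = {("alice", "read"), ("alice", "write"), ("bob", "read")}

g_prime = build_g_prime(UP)
cliques = greedy_clique_partition(g_prime)
roles = roles_from_cliques(cliques)
print(len(roles))
```

In the toy graph the edges ⟨alice, write⟩ and ⟨bob, read⟩ are not adjacent in G′, because ⟨bob, write⟩ is not in UP, so at least two cliques (roles) are needed and the greedy partition finds exactly two.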
Finding a clique partition of a graph $G = \langle V, E \rangle$ is equivalent to finding a coloring of its complement $\overline{G} = \langle V, (V \times V) \setminus E \rangle$. This implies that the biclique cover number of a bipartite graph $G$ corresponds to the chromatic number of $\overline{G'}$ [7].

5 A Concentration Result for Optimal Candidate Role-Sets

Using the model described in the previous section, we will prove that the cost of an optimal candidate role-set $\psi$ for a given system configuration $\varphi$ is tightly concentrated around its expected value. We will use the concept of martingales and the Azuma-Hoeffding inequality to obtain a concentration result for the chromatic number of a graph $G$ [14, 15]. Since finding the chromatic number is equivalent to both MCP and MBC, we can conclude that the minimum number of roles required to cover the user-permission relationships in a given configuration is tightly concentrated around its expected value.

Let us denote with $G$ an undirected unipartite graph, and with $\chi(G)$ the chromatic number of $G$.

Theorem 2. Given a graph $G$ with $n$ vertices, the following inequality holds:

$$\Pr(|\chi(G) - \mathbb{E}[\chi(G)]| \ge \lambda) \le 2 \exp\left(\frac{-2\lambda^2}{n}\right) \tag{3}$$

Proof. We fix an arbitrary numbering of the vertices from 1 to $n$. Let $G_i$ be the subgraph of $G$ induced by the set of vertices $1, \ldots, i$. Let $Z_0 = \mathbb{E}[\chi(G)]$ and $Z_i = \mathbb{E}[\chi(G) \mid G_1, \ldots, G_i]$. Since adding a new vertex to the graph requires no more than one new color, the gap between $Z_i$ and $Z_{i-1}$ is at most 1. This allows us to apply the Azuma-Hoeffding inequality, that is, Equation 1 with $d_k = 1$.

Note that this result holds even without knowing $\mathbb{E}[\chi(G)]$. Informally, Theorem 2 states that the chromatic number of a graph $G$ is sharply concentrated around its expected value. Since finding the chromatic number of a graph is equivalent to MCP, and MCP is equivalent to MBC, this result holds also for MBC.
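The step from Theorem 1 to Equation 3 can be spelled out as follows. This is only a sketch that takes the proof's bound on the martingale differences at face value; all symbols are those of Theorems 1 and 2, and $Z_n = \chi(G)$ because $G_n = G$ fully determines the chromatic number.

```latex
\begin{align*}
Z_0 &= \mathbb{E}[\chi(G)], \qquad
Z_i = \mathbb{E}[\chi(G) \mid G_1, \ldots, G_i], \qquad
Z_n = \chi(G), \\
\sum_{k=1}^{n} d_k^2 &= \sum_{k=1}^{n} 1 = n
  \qquad \text{(each difference } Z_k - Z_{k-1} \text{ lies in an interval of width } d_k = 1\text{)}, \\
\Pr\bigl(|\chi(G) - \mathbb{E}[\chi(G)]| \ge \lambda\bigr)
  &= \Pr\bigl(|Z_n - Z_0| \ge \lambda\bigr)
  \le 2\exp\!\left(\frac{-2\lambda^2}{\sum_{k=1}^{n} d_k^2}\right)
  = 2\exp\!\left(\frac{-2\lambda^2}{n}\right).
\end{align*}
```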
Translating these concepts in terms of RBAC entities, this means that the cost of an optimal candidate role-set of any configuration $\varphi$ with $|\mathit{UP}| = n$ is sharply concentrated around its expected value according to Equation 3, where $\chi(G)$ is equal to the minimum number of required roles. It is important to note that $n$ represents the number of vertices in the coloring problem but, according to the proposed model, it is also the number of edges in MBC; that is, the user-permission assignments of the system configuration.

Fig. 1 Relationship between the parameters $\lambda$, $n$ and the resulting probability $2\exp(-2\lambda^2/n)$: (a) plot of Equation 3; (b) highlight of some $\lambda$ values for Figure 1(a), for probabilities 0.5, 0.3, and 0.1

Figure 1(a) shows the plot of Equation 3 for $n$ varying between 1 and 500,000, and $\lambda$ less than 1,500. It is possible to see that for $n = 500{,}000$ it is sufficient to choose $\lambda = 900$ to ensure that $\Pr(|\chi(G) - \mathbb{E}[\chi(G)]| \ge \lambda) \le 0.1$. In the same way, choosing $\lambda = 600$, $\Pr(|\chi(G) - \mathbb{E}[\chi(G)]| \ge \lambda)$ is less than 0.5. Figure 1(b) shows the values of $\lambda$ and $n$ for which the event on the left-hand side of Equation 3 has probability less than 0.5, 0.3, and 0.1, respectively.

Setting $\lambda = \sqrt{n \log n}$ (with $\log$ the natural logarithm, so that $\exp(-2 \log n) = n^{-2}$), Equation 3 can be expressed as:

$$\Pr\left(|\chi(G) - \mathbb{E}[\chi(G)]| \ge \sqrt{n \log n}\right) \le \frac{2}{n^2} \tag{4}$$

That is, the probability that the optimum differs from its expected value by more than $\sqrt{n \log n}$ is at most $\frac{2}{n^2}$. This probability quickly becomes negligible as $n$ increases. To support the viability of the result, note that in a large organization there are usually thousands of user-permission assignments.
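The numeric claims above can be checked with a short Python sketch that evaluates the right-hand side of Equation 3. The function names tail_bound and smallest_lambda are illustrative assumptions; smallest_lambda simply inverts Equation 3 for a target probability.

```python
import math


def tail_bound(n, lam):
    """Right-hand side of Equation 3: 2 * exp(-2 * lam^2 / n)."""
    return 2.0 * math.exp(-2.0 * lam ** 2 / n)


def smallest_lambda(n, prob):
    """Smallest lam with tail_bound(n, lam) <= prob (Equation 3 solved for lam)."""
    return math.sqrt(n * math.log(2.0 / prob) / 2.0)


n = 500_000
for lam in (600, 900):
    print(lam, round(tail_bound(n, lam), 4))

# Equation 4: with lam = sqrt(n * ln n) the bound equals 2 / n^2.
lam4 = math.sqrt(n * math.log(n))
print(math.isclose(tail_bound(n, lam4), 2.0 / n ** 2, rel_tol=1e-9))
```

For n = 500,000 the bound is roughly 0.47 at λ = 600 and roughly 0.08 at λ = 900, consistent with the thresholds 0.5 and 0.1 quoted in the text; inverting the bound shows that any λ above about 866 already brings it below 0.1.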
5.1 Applications of the Bound

Assuming that we can estimate an approximation $\tilde{\mathbb{E}}[\chi(G)]$ of $\mathbb{E}[\chi(G)]$ such that $|\tilde{\mathbb{E}}[\chi(G)] - \mathbb{E}[\chi(G)]| \le \varepsilon$ for some $\varepsilon > 0$, Theorem 2 can be utilized as a stop condition when striving to find an approximation of the optimum for any role mining algorithm. Indeed, suppose that we have a probabilistic algorithm that provides an approximation of $\chi(G)$, and suppose that its output is $\tilde{\chi}(G)$. Since we know $\tilde{\mathbb{E}}[\chi(G)]$, we can use this value to evaluate whether the output is acceptable and therefore decide to stop the iterative procedure. Indeed, we have that:

$$\Pr(|\chi(G) - \tilde{\mathbb{E}}[\chi(G)]| \ge \lambda + \varepsilon) \le 2 \exp\left(\frac{-2\lambda^2}{n}\right)$$

This is because

$$\Pr(|\chi(G) - \tilde{\mathbb{E}}[\chi(G)]| \ge \lambda + \varepsilon) \le \Pr(|\chi(G) - \mathbb{E}[\chi(G)]| \ge \lambda)$$

and, because of Theorem 2, this probability is less than or equal to $2 \exp(-2\lambda^2/n)$. Thus, if $|\tilde{\chi}(G) - \tilde{\mathbb{E}}[\chi(G)]| \le \lambda + \varepsilon$ holds, then we can stop the iteration; otherwise we have to reiterate the algorithm until it outputs an acceptable value.

For a direct application of this result, we can consider a system configuration with $|\mathit{UP}| = x$. If $\lambda = y$, the probability that $|\chi(G) - \mathbb{E}[\chi(G)]| < y$ is at least $1 - 2 \exp(-2y^2/x)$. We do not know $\mathbb{E}[\chi(G)]$, but since $|\tilde{\mathbb{E}}[\chi(G)] - \mathbb{E}[\chi(G)]| \le \varepsilon$ we can conclude that $|\chi(G) - \tilde{\mathbb{E}}[\chi(G)]| < y + \varepsilon$ with probability at least $1 - 2 \exp(-2y^2/x)$. For instance, we have considered the real case of a large company with 500,000 user-permission assignments. With $\lambda = 1200$, and considering $\varepsilon = 100$, the probability that $|\chi(G) - \tilde{\mathbb{E}}[\chi(G)]| < \lambda + \varepsilon$ is at least 99.36%. This means that, if $\tilde{\mathbb{E}}[\chi(G)] = 24{,}000$, with the above probability the optimum is between 22,700 and 25,300. If a probabilistic role mining algorithm outputs a value $\tilde{\chi}(G)$ that is quite far from this range, then it is appropriate to reiterate the process in order to find a better result. Conversely, let us assume that the algorithm outputs a value within the given range.
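The acceptance test and the numeric example above can be sketched in Python as follows. The function names and the way the estimate of the expected value is passed in are assumptions made for illustration; how to obtain that estimate is outside the scope of this sketch.

```python
import math


def confidence(n, lam):
    """Lower bound 1 - 2*exp(-2*lam^2/n) on Pr(|chi - E~[chi]| < lam + eps)."""
    return 1.0 - 2.0 * math.exp(-2.0 * lam ** 2 / n)


def optimum_range(expected_estimate, lam, eps):
    """Interval that contains the optimum with probability at least confidence(n, lam)."""
    return expected_estimate - lam - eps, expected_estimate + lam + eps


def should_stop(candidate_cost, expected_estimate, lam, eps):
    """Stop condition: accept an output lying within lam + eps of the estimate."""
    return abs(candidate_cost - expected_estimate) <= lam + eps


# Values taken from the example in the text; 25_000 and 30_000 are
# hypothetical algorithm outputs chosen for illustration.
n, lam, eps, estimate = 500_000, 1200, 100, 24_000

print(optimum_range(estimate, lam, eps))
print(round(confidence(n, lam), 4))
print(should_stop(25_000, estimate, lam, eps), should_stop(30_000, estimate, lam, eps))
```

With these values the range is 22,700 to 25,300 and the confidence is just under 99.37%, matching the figures in the text; an output of 25,000 roles would be accepted, while an output of 30,000 would call for another iteration.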
We know that the identified solution differs from the optimum by at most $2(\lambda + \varepsilon)$, with probability at least 99.36%. Thus, one can assess whether it is appropriate to continue investing resources in the effort to find a better solution, or to simply accept the provided solution. This choice can depend on many factors, such as the computational cost of the algorithm, the economic cost of a new analysis, and the error that we are willing to accept, to name a few.

There is also another possible application for this bound. Assume that a company is assessing whether to renew its RBAC state, just because it is several years old [19]. By means of the proposed bound, the company can establish whether it is worth investing money and resources in this process. Indeed, if the cost of the RBAC state in use is between $\tilde{\mathbb{E}}[\chi(G)] - \lambda - \varepsilon$ and $\tilde{\mathbb{E}}[\chi(G)] + \lambda + \varepsilon$, the best option would be not to renew it, because the possible improvement is likely to be marginal. Moreover, changing the RBAC state requires a huge effort from the administrators, since they need to get used to the new configuration. With our proposal it is quite easy to assess whether a renewal is appropriate. This indication can lead to important time and money savings.

Note that in our hypothesis, we have assumed that $\tilde{\mathbb{E}}[\chi(G)]$ is known. Currently, not many researchers have addressed this specific issue in reference to a generic graph, whereas plenty of results have been provided for random graphs. In particular, it has been proven [2, 13] that for $G \in G_{n,p}$:

$$\mathbb{E}[\chi(G)] \sim \frac{n}{2 \log_{\frac{1}{1-p}} n}$$

We are presently striving to apply a slight modification of the same probabilistic techniques used in this paper to derive a similar bound for the class of graphs used in our model.

6 Conclusions and Future Works

In this paper we proved that the optimal administration cost for RBAC, when striving to minimize the number of roles, is sharply concentrated around its expected value.
The result has been achieved by adopting a model reduction and advanced probabilistic tools. Further, we have shown how to apply this result to deal with practical issues in administering RBAC; that is, how it can be used as a stop condition in the quest for the optimum.

This paper also highlights a few research directions. First, a challenge that we are currently addressing is to derive an estimate of the expected optimal number of roles ($\mathbb{E}[\chi(G)]$) from a generic system configuration. Another research path is applying both the presented reduction and the probabilistic tools to obtain similar bounds while simultaneously minimizing more parameters.

References

1. American National Standards Institute (ANSI) and InterNational Committee for Information Technology Standards (INCITS): ANSI/INCITS 359-2004, Information Technology – Role Based Access Control (2004)
2. Bollobás, B.: The chromatic number of random graphs. Combinatorica 8(1), 49–55 (1988)
3. Colantonio, A., Di Pietro, R., Ocello, A.: A cost-driven approach to role engineering. In: Proceedings of the 23rd ACM Symposium on Applied Computing, SAC ’08, vol. 3, pp. 2129–2136. Fortaleza, Ceará, Brazil (2008)
4. Colantonio, A., Di Pietro, R., Ocello, A.: Leveraging lattices to improve role mining. In: Proceedings of the IFIP TC 11 23rd International Information Security Conference, SEC ’08, IFIP International Federation for Information Processing, vol. 278, pp. 333–347. Springer (2008)
5. Coyne, E.J.: Role engineering. In: RBAC ’95: Proceedings of the first ACM Workshop on Role-based access control, p. 4. ACM, New York, NY, USA (1996)
6. Coyne, E.J., Davis, J.M.: Role Engineering for Enterprise Security Management. Artech House (2007)
7. Ene, A., Horne, W., Milosavljevic, N., Rao, P., Schreiber, R., Tarjan, R.E.: Fast exact and heuristic methods for role minimization problems.
In: Proceedings of the 13th ACM Symposium on Access Control Models and Technologies, SACMAT ’08, pp. 1–10 (2008)
8. Frank, M., Basin, D., Buhmann, J.M.: A class of probabilistic models for role engineering. In: Proceedings of the 15th ACM Conference on Computer and Communications Security, CCS ’08 (to appear) (2008)
9. Geerts, F., Goethals, B., Mielikäinen, T.: Tiling databases. In: Discovery Science, Lecture Notes in Computer Science, vol. 3245, pp. 278–289. Springer (2004)
10. Jajodia, S., Samarati, P., Subrahmanian, V.S.: A logical language for expressing authorizations. In: SP ’97: Proceedings of the 1997 IEEE Symposium on Security and Privacy, p. 31. IEEE Computer Society, Washington, DC, USA (1997)
11. Kuhlmann, M., Shohat, D., Schimpf, G.: Role mining – revealing business roles for security administration using data mining technology. In: Proceedings of the 8th ACM Symposium on Access Control Models and Technologies, SACMAT ’03, pp. 179–186 (2003)
12. Lu, H., Vaidya, J., Atluri, V.: Optimal boolean matrix decomposition: Application to role engineering. In: Proceedings of the 24th IEEE International Conference on Data Engineering, ICDE ’08, pp. 297–306 (2008)
13. Łuczak, T.: The chromatic number of random graphs. Combinatorica 11(1), 45–54 (1991)
14. McDiarmid, C.J.H.: On the method of bounded differences. In: J. Siemons (ed.) Surveys in Combinatorics: Invited Papers at the 12th British Combinatorial Conference, 141 in London Mathematical Society Lecture Notes Series, pp. 148–188. Cambridge University Press (1989)
15. Mitzenmacher, M., Upfal, E.: Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, New York, NY, USA (2005)
16. Rymon, R.: Method and apparatus for role grouping by shared resource utilization (2003). United States Patent Application 20030172161
17. Schlegelmilch, J., Steffens, U.: Role mining with ORCA.
In: Proceedings of the 10th ACM Symposium on Access Control Models and Technologies, SACMAT ’05, pp. 168–176 (2005)
18. Vaidya, J., Atluri, V., Guo, Q.: The role mining problem: finding a minimal descriptive set of roles. In: Proceedings of the 12th ACM Symposium on Access Control Models and Technologies, SACMAT ’07, pp. 175–184 (2007)
19. Vaidya, J., Atluri, V., Guo, Q., Adam, N.: Migrating to optimal RBAC with minimal perturbation. In: Proceedings of the 13th ACM Symposium on Access Control Models and Technologies, SACMAT ’08, pp. 11–20 (2008)
20. Vaidya, J., Atluri, V., Warner, J.: RoleMiner: mining roles using subset enumeration. In: Proceedings of the 13th ACM Conference on Computer and Communications Security, pp. 144–153 (2006)
21. Williams, D.: Probability with Martingales. Cambridge University Press (1991)
22. Zhang, D., Ramamohanarao, K., Ebringer, T.: Role engineering using graph optimisation. In: Proceedings of the 12th ACM Symposium on Access Control Models and Technologies, SACMAT ’07, pp. 139–144 (2007)