Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

Replicated declustering of spatial data

2004, Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems - PODS '04

See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/221559542 Replicated Declustering of Spatial Data. Conference Paper · January 2004 DOI: 10.1145/1055558.1055577 · Source: DBLP CITATIONS READS 21 12 3 authors, including: Hakan Ferhatosmanoglu Max Planck Institute & Bilkent University 80 PUBLICATIONS 1,707 CITATIONS SEE PROFILE All content following this page was uploaded by Hakan Ferhatosmanoglu on 24 July 2015. The user has requested enhancement of the downloaded file. All in-text references underlined in blue are added to the original document and are linked to publications on ResearchGate, letting you access and read them immediately. Replicated Declustering of Spatial Data ∗ Hakan Ferhatosmanoglu Ali Şaman Tosun Aravind Ramachandran Computer Science and Engineering Ohio State University Columbus, OH 43210 Computer Science University of Texas San Antonio, TX 78249 Computer Science and Engineering Ohio State University Columbus, OH 43210 hakan@cis.ohiostate.edu tosun@cs.utsa.edu ABSTRACT The problem of disk declustering is to distribute data among multiple disks to reduce query response times through parallel I/O. A strictly optimal declustering technique is one that achieves optimal parallel I/O for all possible queries. In this paper, we focus on techniques that are optimized for spatial range queries. Current declustering techniques, which have single copies of the data, have been shown to be suboptimal for range queries. The lower bound on extra disk accesses is proved to be Ω(log N ) for N disks even in the restricted case of an N -by-N grid, and all current approaches have been trying to achieve this bound. Replication is a well-known and effective solution for several problems in databases, especially for availability and load balancing. In this paper, we explore the idea of replication in the context of declustering and propose a framework where strictly optimal parallel I/O is achievable using a small amount of replication. We provide some theoretical foundations for replicated declustering, e.g., a bound for number of copies for strict optimality on any number of disks, and propose a class of replicated declustering schemes, periodic allocations, which are shown to be strictly optimal. The results for optimal disk allocation are extended for larger number of disks by increasing replication. Our techniques and results are valid for any arbitrary a-by-b grids, and any declustering scheme can be further improved using our replication framework. Using the framework, we perform experiments to identify a strictly optimal disk access schedule for any given arbitrary range query. In addition to the theoretical bounds, we compare the proposed replication based scheme to other existing techniques by performing experiments on real datasets. 1. INTRODUCTION ∗This work was partially supported by DOE Early Career Principle Investigator Award DE-FG02-03ER25573. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGMOD 2004 June 13-18, 2004, Paris, France. Copyright 2004 ACM 1-58113-859-8/04/06 . . . $5.00. 125 ramachan@cis.ohiostate.edu Spatial databases have been in use in several fields of science like cartography, transportation and epidemiology and geographical information systems (GIS). With the growing popularity of spatial data in modern database applications, effective storage and retrieval of such data is becoming increasingly more important. Spatial databases have data objects represented as two-dimensional vectors, and the correlation between the data objects is defined by a distance function. For example, in GIS, objects are the locations of places defined with their coordinates (longitude and latitude) and the distance function between a pair of data points is the geographic distance between them. A common type of query is the range query, where the user specifies an area of interest (usually a rectangular region) and all data points in this area are retrieved. Typical spatial data applications include large data repositories. Therefore, efficient retrieval and scalable storage of large spatial data becomes more important. Several retrieval structures and methods have been proposed for retrieval of spatial data [22, 3, 30, 17]. Traditional retrieval methods based on index structures developed for single disk and single processor environments are becoming ineffective for the storage and retrieval in multiple processor and multiple disk environments. Multiple disk architectures have been in popular use for fault tolerance and for backup of the stored data. In addition to fault tolerance and scalability with respect to storage of data, multi-disk architectures give the opportunity to exploit I/O parallelism during retrieval. The most crucial part of exploiting I/O parallelism is to develop careful storage techniques of the data so that the data can be accessed in parallel. Declustering is the technique that allocates disjoint partitions of data to different disks/devices to allow parallelism in data retrieval while processing of a query. To process a range query, all buckets that intersect the query are accessed from secondary storage. The cost of executing the query is proportional to the maximum number of buckets accessed from a single I/O device. The minimum possible cost when retrieving b buckets distributed over N devices is ⌈ Nb ⌉. An allocation policy is said to be strictly optimal if no query, which retrieves b buckets, has more than ⌈ Nb ⌉ buckets allocated to the same device. However, it has been proved that, except in very restricted cases, it is impossible to reach strict optimality for spatial range queries [1]. Tight bounds have also been identified for the efficiency of disk allocation schemes [20]. In other words, no allocation technique can achieve optimal performance for all possible range queries. The lower bound on extra disk accesses is proved to be Ω(log N ) for N disks even in the restricted case of an N -by-N grid [5]. Given the established bounds on the extra cost and the impossibility result, a large number of declustering techniques have been proposed to achieve performance close to the bounds either on the average case [12, 24, 14, 18, 21, 25, 13, 19, 4, 28, 29, 23, 15] or in the worst case [7, 2, 5, 34, 8]. While initial approaches in the literature were originally for relational databases or cartesian product files, recent techniques focus more on spatial data declustering. Each of these techniques is built on a uniform grid, where the buckets of the grid are declustered using the proposed mapping function. Techniques for uniform grid partitioning can be extended to nonuniform grid partitioning as discussed in [26] and [11]. All these techniques in the literature, along with their theoretical foundations, have a common assumption: there is only one copy of the data. In this paper, we explore the idea of replication in the context of declustering to achieve strictly optimal I/O parallelism. Replication is a well-studied and effective solution for several problems in a database context, especially for fault tolerance and load balancing. It is implemented in multimedia storage servers which support multiple concurrent applications such as video-on-demand, to achieve load balancing, real-time throughput, delay guarantees, and high data availability [10, 32, 27]. We introduced earlier, the idea of using replication to achieve parallel I/O [36]. Given the importance of latency over storage capacity and the necessity of replication also for availability, it is of great practical interest to investigate the replicated declustering problem in detail. A general framework is needed for effective replication of spatial data to improve the performance of disk allocation techniques and to achieve strictly optimal parallel I/O. In this paper, we provide some theoretical foundations for replicated declustering and propose a class of replicated declustering techniques, periodic allocations, which are shown to be strictly optimal for a wide range of available numbers of disks. We are able to achieve strict optimality with a single replica (one extra copy of the data) for 2-15 disks, and with two replicas for 16-50 disks. We also provide extensions to our techniques to make them applicable to a larger number of disks and for any arbitrary a-by-b grids. We perform a series of experiments on a realistic scenario on real datasets to demonstrate the superiority of the proposed techniques. Besides the allocation schemes, we also show how to efficiently find optimal disk access (schedule) for a given arbitrary query by storing minimal information. Using a very small size table, the schedule with optimal cost can easily be chosen for retrieval. Some additional properties about fault tolerance of the proposed schemes are also briefly discussed. There has been earlier work in replicated declustering [9, 31]. In [9], a max-flow model is used to compute the optimal retrieval scheme for replicated data. However, this technique is more suited for a scenarios where the queries are known apriori. In [31], bounds have been proved for random reads, which are near optimal for all queries. However, in their approach, computing the retrieval 126 schedule for each query is computationally intensive. In a problem scenario, where all range queries are equally likely, storing the schedule or computing it at run-time would be infeasible. Thus, we propose a scheme based on periodic allocation schemes, which is optimal for a subproblem, namely range queries - where it is possible to compute the retrieval schedule by just a lookup in main memory. Section 2 provides some definitions and derives properties which are used in the development of the proposed replication schemes. In particular, we show some useful properties about a general class of disk allocation schemes, i.e., latin squares and periodic allocations. Section 3 describes independent and dependent periodic allocations and proves certain characteristics of these allocations and their implications on optimality in a limited context. Section 4 generalizes these results for arbitrary grids and large number of disks. Section 5 has experiments that determine the parameters for the restricted case discussed in Section 3 and then applies the extensions described in Section 4 to apply the techniques in a realistic scenario. The results are compared with other techniques currently known. Section 6 concludes the paper with a discussion. 2. FOUNDATIONS In this section we provide some definitions and derive some properties which are used in the proposed replicated declustering schemes. This includes formal definitions of some of the concepts used later in the paper. We introduce the concept of Latin Squares and provide the intuitive motivation for using Latin Squares as a solution for optimal allocation. We formalize the problem of parallel retrieval of replicated data , i.e., processing a query optimally among several copies of the data and define Periodic Allocation (which is based on Latin Squares) which we later use in the replication scheme that we propose in this paper. 2.1 Latin Squares In this section, we define the concept of Latin Squares. We provide some definitions that will use throughout the paper. D EFINITION 1. An i-by-j query is a range query that spans i rows, j columns and have ij buckets. D EFINITION 2. A Latin square of n symbols is an n-by-n array such that each of the n symbols occurs once in each row and in each column. The number n is called the order of the square. D EFINITION 3. If A=(aij ) and B=(bij ) are two n-by-n arrays, the join (A,B) of A and B is the n-by-n array whose (i,j)’th entry is the pair (aij ,bij ). D EFINITION 4. The squares A=(aij ),B=(bij ) of order n are orthogonal if all the entries in the join of A and B are distinct. If A, B are orthogonal, B is called an orthogonal mate of A. Note that orthogonal mate of A need not be unique. Latin squares have been extensively studied in the past [6]. In this paper, we establish that orthogonality and latin squares can be used q1 2.2 0 1 2 3 4 5 6 0 6 5 4 3 2 1 3 4 5 6 0 1 2 3 2 1 0 6 5 4 6 0 1 2 3 4 5 6 5 4 3 2 1 0 2 3 4 5 6 0 1 2 1 0 6 5 4 3 5 6 0 1 2 3 4 5 4 3 2 1 0 6 1 2 3 4 5 6 0 1 0 6 5 4 3 2 4 5 6 0 1 2 3 4 3 2 1 0 6 5 q2 Figure 1: Orthogonal latin squares of order 7 to develop a technique for strictly optimal declustering with the help of replication. An intuitive idea of why orthogonal squares are good for replication is as follows. Assume we have a 3-by-2 range query and we have 7 disks. Let’s say for this query 2 buckets are mapped to disk 0 and thus query is not optimal. With replication, we can look at the replicated copies to see if we can have optimal access. If two buckets map to same disk, we want replicated copies to map to different disks. This is where orthogonality come into play. Orthogonality guarantees that, if two buckets map to same disk in original copy, they map to different disks in replicated copy to increase chances of finding a solution. There will be queries where 4 buckets map to the same disk. To increase chances of finding a solution what we want is to have a mapping which will map these four buckets to 4 different disks. A pair of orthogonal squares of order 7 is given is Figure 1. If orthogonal squares are used, the pairs (i, i) will appear somewhere in the join. We keep these pairs to maintain the cyclic structure of the latin square. But it is possible to map second copy to some other disk if properties of periodic allocation (derived later in the paper) are not used. The problem of generating orthogonal latin squares(Greco-Latin Squares) has been studied and no solutions have been found for some square sizes. We do not attempt to solve the problem. While it is intuitive that orthogonal latin square allocation would help finding an optimal retrieval schedule, we observe that it is however not a necessary condition. BUCKETS DISKS 0 1 1 2 Parallel Retrieval of Replicated Data One aspect of the optimal replicated declustering problem (that we have briefly discussed in the previous section) is identifying allocations for copies of the data, which when considered together are optimal for all queries. Another issue that has to be addressed is developing a scheme for optimal retrieval among the copies. We can effectively represent the parallel retrieval problem using bipartite graphs as follows. Let the buckets be the first set of nodes and the disks be the second set of nodes in the graph. Connect bucket i to node j if bucket i is stored in disk j in original or replicated copy. For query q1 (in Figure 1), the graph is as shown in Figure 2. We can process q1 in single disk access if the bipartite graph has a 6-matching i.e. every bucket will be matched to a single disk (in general a bucket will be matched to ⌈b/N ⌉ disks for optimality where b is the number of buckets retrieved during the query, and N is the total number of disks). The bipartite matching problem requires that each node on first set is matched with a single node on second set. Consider a 2-by-4 query with 7 disks. To represent this query we list each disk twice (optimal is 2 disk access) on the disk node list and apply the matching. We assign the buckets to two nodes which denote the same disk in round-robin order. For a bipartite graph G=(V,E) where V is the set of vertices and E is the set of edges, a breadth-first search p based algorithm for maximum matching has complexity Θ( |V |.(|V | + |E|)). |E| is proportional to |V | for thepparallel retrieval problem, therefore the overall complexity is Θ( |V |.|V |). This algorithm is used to get the results presented in Section 5. Given a pair of latin squares, checking whether the pair is optimal or not may not be easy. For each query we need to construct a bipartite graph and see if there is a complete matching or not. In an N-by-N grid the number of 2-by-2 queries is (N-1)*(N-1) and this bipartite-matching has to be repeated for every 2x2 query. Consider the queries q1 and q2 in Figure 1. The bipartite graphs constructed are not isomorphic for the 2 queries. Using basic combinatorics construction of bipartite graph and finding a matching process has to be repeated N ΣN i=2 Σj=2 (N −i)(N −j) times. We can put additional restrictions on functions that assigns buckets to disks for replicated copies. We now limit our allocation schemes to periodic allocations which will be shown to have important properties that can be used to simplify replication schemes. Note that in this case, in the expense of simplifications, we may not find optimal although it exists. However, we will show later in experimental results section that replication of carefully chosen periodic allocations guarantee strict optimality for many number of disks. 2.3 Periodic Allocation 2 3 Now that we have a technique for optimal retrieval, we return to the problem of identifying the mutually complementary optimal allocations that we need for the copies of the data. Towards this purpose, we define periodic allocation and prove certain properties of periodic allocation and their relation to Latin squares. In the subsequent sections, we propose the use of periodic allocations in a replicated allocation. 3 4 4 5 5 6 6 Figure 2: Representation of query q1 D EFINITION 5. A disk allocation scheme f (i, j) is periodic if f (i, j) = (ai + bj + c) mod N , where N is the number of disks 127 and a,b and c are constants. L EMMA 4. All n-by-m range queries require the same number of disk accesses if disk allocation is periodic. We note that our definition of periodic allocation is more general than the cyclic allocation proposed in [28]. We state the following lemmas which establish the link between periodicity, orthogonality, and latin squares. Proofs follow from the definitions. L EMMA 1. A periodic disk allocation scheme f (i, j) = (ai + bj + c) mod N is a latin square if gcd(a,N)=1 and gcd(b,N) = 1 . Proof We need to show that each number appears once on each row and column. Assume f (i, j) = f (i, k), and first show j = k (means each number appears once on each row). Proof Consider 2 distinct n-by-m queries. Assume that the first has top left bucket (i,j) and second has top left bucket (i+s,j+t). Let’s now write the expression f(i+s,j+t) in terms of f(i,j). f (i + s,j + t) = (a(i + s) + b(j + t) + c) mod N = ai + as + bj + bt + c mod N = ((ai + bj + c) + (as + bt)) mod N = (f (i, j) + (as + bt)) mod N This is a 1-1 function between buckets of 2 n-by-m queries (as + bt depends on dimensions of query). By Lemma 3, number of disk accesses for both queries is the same.  ai + bj + c = ai + bk + c (mod N ) b(j − k) = 0 (mod N ) Since gcd(b, N ) = 1, j − k = 0 (mod N ) and j = k. Similarly, we can show that if f (k, j) = f (i, j) then i = k. Therefore, if gcd(a, N ) = 1 and gcd(b, N ) = 1 then f (i, j) is a latin square.  L EMMA 2. Periodic, latin square allocation schemes f (i, j) = (ai + bj + c) mod N and g(i, j) = (di + ej + f ) mod N are orthogonal if gcd(bd-ae,N)=1 . Consider queries q1 and q2 in figure 1 to observe the 1-1 mapping between queries. D EFINITION 6. Periodic allocations f (i, j) = (ai+bj+c) mod N and g(i, j) = (di + ej + f ) mod N are dependent if a = d and e = b and independent otherwise. 3. REPLICATED PERIODIC ALLOCATION Proof Assume f (i, j) = f (m, n) and g(i, j) = g(m, n) and show i = m and j = n (means each pair appears only once). f (i, j) = f (m, n) ⇒ ai + bj + c = am + bn + c (mod N ) a(i − m) + b(j − n) = 0 (mod N ) (1) Similarly, d(i − m) + e(j − n) = 0 (mod N ) (2) Fact1: If i = m then j = n. If i = m Equations 1 and 2 reduce to b(j − n) = 0 (mod N ) e(j − n) = 0 (mod N ) Since f and g are latin squarea, we have j = n. Fact2: If j = n then i = m. (This can be proved similar to the above fact.) From Equations 1 and 2 we get In this section, we extend the concept of periodic allocation t develop efficient replication schemes. Using the properties developed in Section 2, we now propose two periodic allocation schemes: Independent Periodic Allocation and Dependent Periodic Allocation. Dependent periodic allocation achieves strict optimality for several number of disks where optimality is proved to be impossible with current approaches that use a single copy. 3.1 Independent Periodic Allocation In this scheme, each copy (the original and the replicated data) is allocated based on periodic allocation with independent parameters. The idea is to have one copy optimal for every possible i-by-j query. D EFINITION 7. Independent Periodic Allocation with multiple copies is a disk allocation which satisfies the following conditions: a(i − m) · e(j − n) = d(i − m) · b(j − n) (mod N ) (bd − ae)(i − m)(j − n) = 0 (mod N ) Since gcd(bd − ae) = 1, (i − m)(j − n) = 0 (mod N ) If i = m then j = n by Fact 1, and if j = n then i = m by Fact 2. Therefore i = m and j = n, i.e., each pair appears only once.  L EMMA 3. For an n-by-m range query, cardinality of the disk id which appears maximum number of times determines the number of disk accesses. 1. Each copy is a periodic allocation. 2. ∀ i-by-j query ∃ a copy which is optimal (without matching). L EMMA 5. If A is an N-by-N latin square then all i-by-N, i-by1, 1-by-i and N-by-i queries are optimal where 1 ≤ i ≤ N . Proof By definition of latin square (each number appears in each row and column once).  Proof Trivial (stated also in [28]). 128 L EMMA 6. Let A be an N-by-N disk allocation. An i-by-j query q is optimal if max{di : 0 ≤ i ≤ N − 1} − min{di : 0 ≤ i ≤ N − 1} ≤ 1 where di is the number of buckets mapped to disk i. Proof Trivial.  L EMMA 7. Let A be an N-by-N disk allocation given by f (i, j) = (ki + j) mod N , then A is optimal for all m-by-k queries and all (N-m)-by-k queries where 1 ≤ m ≤ N . Proof Consider traversal of the N-by-k block left to right. On first row numbers start with 0 and increase by 1. Now let’s see if this holds for last number of row s and first number of row s+1. f (s, k − 1) + 1 = (ks + k − 1) + 1 mod N = (ks + k) mod N = (s + 1)k mod N = f (s + 1, 0). We encounter numbers in increasing order in mod N in left to right traversal. By Lemma 6, all m-by-k queries are optimal. Now consider traversal of an (N-m)-by-k block right to left. On first row numbers start with (N-m-1) and decrease by 1. Now let’s see of this holds for first number of row s and last number of row s+1. f (s, 0) − 1 = (ks) + N − 1 mod N = (ks + k + N − k − 1) mod N = (s+1)k +N −k −1 mod N = f (s+1, N −k −1). We encounter numbers in decreasing order in mod N in right to left traversal. By Lemma 6, all (N-m)-by-k queries are optimal.  Lemma 7 can be visually represented for k=2 as in figure 3 for an 8-by-8 latin square. In this figure optimal queries are marked. Optimality of j-by-2 and j-by-6, 1 ≤ j ≤ N queries are based on the traversal pattern shown in figure 3 and Lemma 6. Lemma 7 says that each copy makes 2 columns optimal depending on k. By using Lemmas 5 and 7, we can find optimal allocation using independent periodic allocation. By Lemma 5, if we have a latin square first column and last two columns are optimal. We can use Lemma 7 to make other columns optimal. Each allocation of the form (ki + j)modN will make two columns optimal. Therefore we need ⌈ N 2−3 ⌉ copies if one of the copies is a latin square. This can be stated formally as follows: T HEOREM 1. Let A be an N-by-N disk allocation. All queries in A can be answered in optimal time if we have ⌈ N 2−3 ⌉ copies using independent periodic allocation and if at least one of the copies is a latin square. Proof Use the allocations f (i, j) = mi + j where 2 ≤ m ≤ ⌈ N 2−1 ⌉. All i-by-1, i-by-(N-1) and i-by-N queries are optimal in a latin square by lemma 5. The remaining queries are partitioned in 129 sets such that m-by-k and (N-m)-by-k queries are in same partition. Lemma 7 is used to show optimality of each partition.  This theorem gives a linear upper bound on the number of copies required by independent periodic allocation. In practice however, we can find optimal using fewer copies of independent periodic allocation. We discuss this in the experimental results section. 3.2 Dependent Periodic Allocation In this section we propose replicated declustering schemes based on periodic allocations with carefully chosen parameters that are dependent to each other. We first prove that this allocation does not lose out on fault tolerance, and then present the motivation behind Dependent Periodic Allocation. Although we present this scheme using 2 copies for simplicity, dependent periodic allocation can have any number of copies. Allocation g(i, j) is dependent on allocation f (i, j) if g(i, j) = (f (i, j) + c)modN . In terms of the definition of independent allocation, dependent allocation is a special case where the parameters are restricted by the conditions a=d , b=e and c=0. Dependent periodic allocation ensures that we do not lose fault tolerance at the expense of optimal performance. In case of a disk crash we may lose optimality, but no data is lost. T HEOREM 2. In dependent periodic allocation with x copies, no data is lost if at most x-1 disks crash. Proof By definition of dependent periodic allocation, if a bucket (i,j) is mapped to the same disk in 2 dependent periodic allocations then the allocations are exactly the same. Therefore all x dependent allocations should map bucket (i,j) to distinct disks. Bucket (i,j) is lost only if x disks to which it is mapped crash and no data is lost if at most x-1 disks crash.  We now present theoretical results which will help us simplify the framework and give us an intuitive understanding of dependent periodic allocation. In the dependent periodic allocation scheme, the copies satisfy the following Lemma. L EMMA 8. If bucket assignment functions for 2 copies satisfy the condition g(i,j) = (f(i,j) + c) mod N, then all n-by-m queries require the same number of disk accesses. Here f (i, j) is the mapping function for first copy and g(i, j) is mapping function for second copy. Proof: There is a 1-1 function which maps nodes of a n-by-m query to nodes of another n-by-m query (explained in proof of Lemma 4). The bipartite graph constructed will be same except that the nodes will have different labels.  For example, all 3-by-5 queries require the same number of disk accesses irrespective of where they are located in the N-by-N grid. This property helps us test optimality of all 3-by-5 queries by testing only one. 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 1 0 0 1 2 3 4 5 6 7 0 0 0 1 1 2 3 4 5 6 7 2 3 4 5 6 7 2 1 2 3 4 5 6 7 0 1 1 2 3 2 4 5 6 7 0 1 2 3 2 4 3 4 5 6 7 0 1 5 6 7 0 1 2 3 4 3 6 7 0 1 2 3 4 5 3 6 5 4 0 1 2 3 4 5 6 7 4 0 7 0 1 2 3 4 5 1 2 3 4 5 6 6 5 2 3 4 5 6 7 0 1 5 2 7 3 4 5 6 7 0 7 6 4 5 6 7 0 1 2 3 6 1 4 5 6 7 0 1 2 8 7 6 7 0 1 2 3 4 5 7 3 6 7 0 1 2 3 4 5 Figure 3: Visual representation of Lemma 7 for k=2 D EFINITION 8. Rotation of a periodic allocation is defined as follows. 1. g right rotation of f if g(i, j) = f (i, j + 1 mod N ) 2. g left rotation of f if g(i, j) = f (i, j − 1 mod N ) 3. g up rotation of f if g(i, j) = f (i − 1 mod N, j) 4. g down rotation of f if g(i, j) = f (i + 1 mod N, j) q1 1 5 2 6 3 0 4 6 3 0 4 1 5 2 2 6 3 0 4 1 5 0 4 1 5 2 6 3 3 0 4 1 5 2 6 1 5 2 6 3 0 4 4 1 5 2 6 3 0 2 6 3 0 4 1 5 5 2 6 3 0 4 1 3 0 4 1 5 2 6 6 3 0 4 1 5 2 4 1 5 2 6 3 0 0 4 1 5 2 6 3 5 2 6 3 0 4 1 L EMMA 9. All rotations of a periodic allocation satisfy Lemma 8. q2 Figure 4: Periodic allocation of 2 copies of data Proof Follows from definition of periodic allocation and rotation.  T HEOREM 3. If f(i,j) is an N-by-N latin square then all dependent periodic allocations can be generated using only left, right rotations or only up, down rotations. Proof Follows from elementary number theory.  Theorem 3 gives us an intuitive understanding of what dependent periodic allocations are for latin squares. L EMMA 10. An optimal solution using 2 copies with periodic allocation can be represented as f (i, j) = (ai + bj) mod N and g(i, j) = (f (i, j) + d) mod N . Proof Assume we have a solution with periodic allocation f (i, j) = (ai + bj + c) mod N and g(i, j) = (f (i, j) + c) mod N . By adding (N − c) mod N to both functions we get the form given in the lemma.  L EMMA 11. In dependent periodic allocation, if a single copy is optimal for an i-by-j query then all other copies are individually optimal. If a single copy is non-optimal then all other copies are individually non-optimal. Proof Follows from the existence of a 1-1 function between dependent periodic allocations and Lemma 3.  130 3.3 Finding the Optimal Retrieval Schedule In both the independent and the dependent periodic allocation techniques, we have copies that are optimal for different sets of queries. Given a query, we need to determine the copy that would be optimal for the query. We describe the technique for finding an optimal retrieval schedule in this section. The schedule is represented as an N-by-N Table (N is the number of disks) OPT where OPT[i,j] stores index of copy which is optimal for i-by-j queries. It is important to note that the OPT Table size depends only on the number of disks and not on the size of grid. For a dataspace with B buckets per dimension and N disks, the size of the table would be Θ(N 2 log(N )). For instance, a 50-by-50 grid with 50 disks and 1000-by-1000 grid with 50 disks require the same amount of space. Given a query, the optimal retrieval schedule can be computed efficiently using the properties of periodic allocation. We obtain the schedule as follows: The element OPT[i,j] in the matrix is NULL if the allocation in a single copy is optimal for an i-by-j query. By Lemma 11, any of the copies can be used for retrieval. If a single copy is non-optimal we need to perform bipartite matching. In this case, OPT[i,j] stores the matched disk ids for i-by-j query with top left bucket (0,0). The disk ids for buckets are listed row by row and left to right in each row. For arbitrary i-by-j queries, this stored bipartite matching can be used by relabeling disks as indicated by Lemma 8. Consider the queries q1 and q2 shown in Figure 4. Assume we have a matching for query q1 and we want to find a matching for query q2 (user requests q2 but we store the matching for q1 only). We can simply find the matching for q2 by relabeling the disk i with (i + 5) mod 7 in the matching for q1. So, for every non-optimal (with one copy) query type, we store the matching only once. If a single 3-by-5 query is non-optimal then all 3-by-5 queries are nonoptimal and we store a single matching for all 3-by-5 queries. 4. EXTENDED PERIODIC ALLOCATION In Section 3, we came up with a framework for optimal allocation of N disks for an N-by-N grid. However, many applications have data that is represented as a rectangular grid with different number of splits in each dimension. For instance, both the North-East dataset and the Sequoia dataset that we use for our experiments in Section 5 have large grid sizes. Thus, we propose extensions to the framework to find optimal disk allocation for arbitrary a-by-b (a > N , b > N ) grids. We also provide a framework for extending these optimality results for a larger number of disks by increasing replication. yN T HEOREM 4. If there is an optimal disk allocation for N-by-N grid using x copies with N disks, and at least one of the copies is a latin square then there is optimal disk allocation for a-by-b grid (a>N, b>N) using x copies with N disks. Proof All i-by-j, i<N, j<N queries are optimal by assumption. Consider queries of the form i-by-j where i = xN + z,j = yN + t, z, t < N ,i < a, j < b. Divide the query into 4 quadrants as shown in Figure 5. xN-by-yN and z-by-yN segments can optimally be retrieved row-wise order and xN-by-t segment can optimally be read column-wise using the copy which is a latin square. Here we used the fact that all disks are busy while reading the first 3 segments. Therefore optimality as a whole depends on z-by-t segment. The zby-t segment can optimally be read by assumption since z, t < N .  C OROLLARY 1. If there is an N-by-N latin square disk allocation using N disks with worst case cost OPT+c, then there is an a-by-b disk allocation (a>N, b>N) using N disks with worst case cost OPT+c. t  4.2 Extension to Large Number of Disks xN xN In the datasets we have used for our experiments, and in most databases, the number of buckets per dimension is larger than the number of disks. However, this scenario is likely to change in the near future. We would like to reiterate that our technique is scalable and for the sake of completion, we provide a framework for extending our results from small number of disks to higher number of disks by increasing replication, t z z yN Figure 5: Retrieval of i-by-j query (i = xN + z, j = yN + t) 4.1 Extension to Arbitrary Grids D EFINITION 9. Extended periodic allocation of an a-by-b grid (a>N, b>N) using N disks is defined as fe (i, j) = (ci + dj + e) mod N where c, d, e are constants and 0 ≤ i ≤ a − 1 and 0 ≤ i ≤ b − 1. fe is extension of f (i, j) = (ci + dj + e) mod N where c, d, e are constants (same as in fe ) and 0 ≤ i ≤ N − 1 and 0 ≤ i ≤ N − 1. L EMMA 12. If an N-by-N periodic disk allocation using N disks is a latin square, then N consecutive buckets in a row or column of extended periodic allocation has only 1 bucket mapped to each of the N disks. T HEOREM 5. If there is an N-by-N disk allocation using x copies with worst case cost OPT+c, then there is an kN-by-kN disk allocation using kx copies with worst case cost OPT+ ⌈ kc ⌉. Proof Assume there is an N-by-N disk allocation using x copies with worst case cost OP T + c. Consider an n-by-m query in an kN-by-KN grid. By assumption there is a bipartite matching which assigns at most ⌈ nm ⌉ + c buckets to each of the N disks. Partition N kN disks into k classes such that each disk appears in only one class. (Pi = {j|j mod N = i} where Pi is partition i) Replicate buckets that are mapped to a disk t on disks that are in same partition as t (done for each of x copies to get kx copies). Extend bipartite matching of N disks to kN disks by mapping buckets mapped to disk i (0 ≤ i ≤ N − 1) to disks in i’s partition in round robin ⌈ nm ⌉+c ⌉ buckets mapped order. With kN disks there are at most ⌈ Nk ⌉. From the definition to one of the kN disks and optimal is ⌈ nm kN ⌈ nm ⌉+c ⌉ ≤ ⌈ nm ⌉+ of the ceiling function, we have the result ⌈ Nk Proof We will prove this for N consecutive buckets in a row. The kN c ⌉, 1 ≤ n, m, k ≤ N , 0 ≤ c ≤ N . Hence the result.  proof for N consecutive buckets in a column is similar. Let fe (i, j), ..., fe⌈(i+ k N − 1, j) be N consecutive buckets and Let f (i, j) = (ci + dj + e) mod N be a latin square disk allocation. fe (i + k, j) = f ((i + The above theorem has several important consequences that are not k) mod N, j mod N ) by definition of extended periodic allocajust constrained to range queries. They include the following: tion. Therefore N consecutive buckets fe (i, j), ..., fe (i + N − 1, j) are equal to f (i, j), ..., f (i + N − 1, j). These buckets are mapped • If there is an optimal N-by-N disk allocation using x copies, to N distinct disks since f(i,j) is a latin square.  131 then there is an optimal kN-by-kN disk allocation using kx copies. • OPT+1 worst case cost can be achieved using N is square number. √ N copies if • OPT+ ⌈ kN2 ⌉ worst case cost can be achieved using k copies for arbitrary queries (any combination of buckets) if N is divisible by k 2 . • Results in declustering research can be improved by replicated declustering since any N-by-N declustering scheme with worst case cost OPT+c can be used to get a kN-by-kN replicated declustering scheme with worst case cost OPT+⌈ kc ⌉ using k copies. 5. EXPERIMENTAL RESULTS We use the results obtained in Section 3 to construct strictly optimal periodic allocations for N-by-N grids and N disks where 1 ≤ N ≤ 50. It is possible to find parameters for dependent and independent periodic allocation that satisfy the criterion of optimality, through an exhaustive search of the possible parameters for the allocation. It is of importance that the choice of periodic allocation narrows down the search space from Θ(N N k ) to Θ(N 3k ) for independent allocation and Θ(N k+2 ) for dependent allocation, where N is the number of disks and k is the number of copies. From Lemma 8, we need to check only one i-by-j query for optimality to decide the optimality for all i-by-j queries. For each value of N, we look for parameters that would result in optimal allocations with less than 3 copies of the data. We observe that optimal allocations can be found for all values of N ≤ 50 using only 3 copies. It must be noted that these computations need to be performed only once and are not required during query retrieval. On testing the search space, we obtain the following results. It is impossible to reach optimality with disk allocations that use single copy for systems with 6 and more disks. We also observe that as the number of disks increases, the performance of current schemes degrades very significantly, where the proposed schemes keep its strict optimality. We found strictly optimal disk allocations for up to 15 disks (except 12) using single replica and for up to 50 disks using 2 replicas of the data. Using the generalizations proved in the previous section, the optimality results can be extended to arbitrary a-by-b grids using the same number of disks and extended to an even larger number of disks by increasing replication by the techniques described in Section 4. We test the dependent periodic allocation scheme on two spatial datasets- the North-East dataset and the Sequoia dataset. The former is a spatial dataset containing the locations of 123,593 postal addresses, which represent three metropolitan areas (New York, Philadelphia and Boston). The latter, which is data from the Sequioa 2000 Global Change Research Project, contains the co-ordinates of 62,556 locations in California. We perform experiments for different ranges and compare the performance with other single-copy based allocation schemes that are in use. 132 No. disks 6 7 8 9 10 11 12 13 14 15 a 1 1 1 1 1 1 NA 1 2 1 b 1 2 1 2 2 2 NA 2 5 4 c 2 2 4 3 3 3 NA 5 3 6 Overhead (Bytes) 209 222 576 535 1278 1215 NA 2470 4004 3565 Table 1: Optimal Dependent Periodic Allocation using 2 copies 5.1 Dependent Periodic Allocation In our experiments, we present the results for dependent periodic allocation. We also performed experiments to come up with an optimal independent periodic allocation. The results can be found in our technical report [16]. Again we emphasize that dependent periodic allocation satisfies Lemma 8. We can represent a strictly optimal solution with 2 copies using dependent periodic allocation with 3 parameters a,b and c. Disk allocation for first copy is f (i, j) = (ai + bj) mod N and allocation for second copy is g(i, j) = (f (i, j) + c) mod N . Optimal assignments (for all possible queries) using this scheme are given in Table 1. Strictly optimal assignments using 3 copies are given in Table 2. Disk allocation for 3 copies are f (i, j) = (ai + bj) mod N , g(i, j) = (f (i, j) + c) mod N and h(i, j) = (f (i, j) + c + d) mod N . Overhead of keeping track of bipartite matching in dependent periodic allocation is very low. The structure of matchings requires less than 5 KB for 2 copies and will fit in memory. This overhead depends only on number of disks and not on size of grid. So a 16-by-16 grid with 16 disks will have the same overhead of storing matchings as a 1024-by-1024 grid with 16 disks. 5.2 Performance comparison We implemented Cyclic Allocation [28, 29] and General Multidimensional Data Allocation (GMDA) [23], and compared them with the proposed techniques. Cyclic allocation assigns buckets to disks in a consecutive way in each row; and the starting allocated disk id of each row differs by a skip value of H. Many declustering methods prior to cyclic allocation were based on the same idea, and they are special cases of cyclic allocation. It has been shown that cyclic allocations significantly outperforms others such as DM, FX, HCAM [28, 29]. Since we also had the same experience with our experimental setup we will only report results for cyclic allocation. GMDA follows a similar approach to cyclic allocation, but if a row is allocated with exactly the disk ids with the previous checked row, the current row is shifted by one and marked as the new checked row. As an example of Cyclic Allocation, we implemented BEST Cyclic, i.e., best possible cyclic scheme that is computed by exhaustively searching all possible skip values H, and picking the values that give the best performance. We perform experiments on range queries on the North East dataset. We partition the dataset for a page size of 4KB. The queries are for rectangular range queries centered around points in the dataset. These queries would be analogous to looking for places within a No. disks 12 16 17 18 19 20 21 22 23 24 25 26 a 1 1 1 1 1 1 2 3 2 1 1 1 b 7 7 7 7 7 7 5 5 13 7 7 7 c 3 3 3 3 3 3 3 4 9 3 3 3 d 6 6 6 6 6 6 3 4 9 6 6 6 No disks 27 28 29 30 31 32 33 34 35 36 37 38 a 1 1 1 1 1 1 1 1 1 1 1 1 b 7 19 7 13 7 3 3 5 13 11 3 11 c 3 6 3 7 4 8 8 8 10 15 14 14 d 6 6 6 7 8 8 8 8 10 15 14 14 No disks 39 40 41 42 43 44 45 46 47 48 49 50 a 1 1 1 1 1 1 1 1 1 1 1 1 b 4 9 4 13 13 5 7 7 6 7 6 7 c 14 7 10 9 18 18 10 10 8 9 8 9 d 14 14 10 9 18 36 20 20 16 18 16 18 Table 2: Optimal Dependent Periodic Allocation using 3 copies TIME Average Seek Time(msec) Latency(msec) Transfer(MByte/sec) Fast Disk 3.6 2.00 86 Average Disk 8.5 4.16 57 single copy schemes by as much as 103% and the worst by 248%. In the Sequoia dataset, the periodic allocation scheme outperforms the best and the worst by 93% and 272% respectively for fast disks for M=50. It is important to note that the computational overhead is negligible in the Periodic Allocation Scheme as our scheme only involves a lookup from a M-by-M table, which is typically in the order of a few µs. Table 3: Disk Specification particular distance (in the specified rectangular region) from a chosen location in the dataset. In this scenario, which is very common in GIS applications, we compute the expected seek time, latency time and the transfer time in the dependent periodic allocation scheme for various range values. We also compute these times for the same query for each of Disk Modulo(DM), Fieldwise XOR (FX), GDMA and Best Cyclic. We provide results for query selectivity = 25% for both square queries and rectangular queries. We also perform experiments on asymmetric rectilinear queries (with different selectivity in each dimension). The graphs can be found in our technical report [16]. The results are similar to symmetric range queries. For instance, in the North-East dataset, our scheme is 101% faster than the Best Cyclic Allocation, which is the best among the single copy schemes for M=50. These results demonstrate that our technique performs better for asymmetric queries as well. For our calculations, we evaluate the techniques on two different architectures - one with the average speed disks and the other with the fastest disks available. The specifications for the fast disks have been taken from the Cheetah specifications in [33] and the specifications for the average disks have been taken from the Barracuda specifications in [33]. Table 5.2 provides the key parameters that describe the architectures. We compute the total I/O time for queries that are centered about a randomly chosen point in the dataset for different values for the number of disks, M. We average our results for 1000 range queries in each case. The results are similar for both the North-East dataset and the Sequioa dataset. We present the results from the larger dataset in the figures. 6. CONCLUSION Replication is commonly used in database applications for the purpose of fault tolerance. If the database is read-only or the frequency of update operations is less than the queries, then the replication can also be used for optimizing performance. On the other hand, if updates occur very frequently in the database, although they can be done in parallel in a multi-disk architecture, the amount of replication should be kept small. In this paper, we have proposed a scheme to achieve optimal retrieval with a minimal amount of replication for range queries. One of the authors has also extended the idea to arbitrary queries [35]. We summarize the contributions of the paper as follows. From the results, we observe that BEST Cyclic outperforms RPHM in all cases (for obvious reasons), and it performs better than GMDA for almost all cases. The performance of DM and FX are comparatively poorer. In both symmetric rectilinear (square) queries and asymmetric rectilinear queries, the query processing times for the Dependent Periodic allocation scheme is better. In Graphs 6(a) and 6(b), we present the results for square queries on the dataset. For the North-East dataset, we notice that for M=10, the average I/O time is 106 ms whereas among the single copy schemes, the best I/O time is in the Best Cyclic Allocation, which takes more than 170ms. In average disks, the corresponding values are 256ms and 426ms. The difference in the performance is more prominent for higher values of M. For M=50, our scheme outperforms the best 133 • We provided some theoretical foundations for replicated declustering. We studied the replicated declustering problem utilizing Latin Squares and orthogonality. • We provided several theoretical results for replicated declustering, e.g., a constraint for orthogonality and a bound on the number of copies required for strict optimality on any number of disks. • We proposed a class of replicated declustering techniques, periodic allocations, which are shown to be strictly optimal for several number of disks in a restricted scenario. We prove Average I/O time vs Number of devices Average I/O time vs Number of devices 220 550 Periodic Cyclic Best Cyclic RPHM DM FX GDMA 200 180 450 400 Disk I/O time (in ms) Disk I/O time (in ms) 160 140 120 350 300 100 250 80 200 60 150 40 20 10 Periodic Cyclic Best Cyclic RPHM DM FX GDMA 500 100 15 20 25 number of devices (M) 30 35 40 (a) Fast Disks 50 10 15 20 25 number of devices (M) 30 35 40 (b) Average Disks Figure 6: I/O time for Square Queries some properties of periodic allocations that make them suitable for optimal replication. We also showed the fault tolerance property of the proposed replication scheme. • The proposed technique achieved strictly optimal disk allocation with 6-15 disks using 2 copies and 16-50 using 3 copies. Note that, in these cases, it is impossible to reach optimality with any single copy declustering technique. • We also showed how to extend the optimal disk allocation results for small number of disks to larger number of disks and to arbitrary non-uniform grids. The optimality of the extended techniques was also proved. • An efficient and scalable query retrieval technique was proposed. In particular, we showed that by storing minimal information we can efficiently find the disk access schedule needed for optimal parallel I/O for a given arbitrary query. [4] S. Berchtold, C. Bohm, B. Braunmuller, D. A. Keim, and H.-P. Kriegel. Fast parallel similarity search in multimedia databases. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 1–12, Arizona, U.S.A., 1997. [5] R. Bhatia, R. K. Sinha, and C. Chen. Hierarchical declustering schemes for range queries. In Advances in Database Technology - EDBT 2000, 7th International Conference on Extending Database Technology, Lecture Notes in Computer Science, pages 525–537, Konstanz, Germany, March 2000. [6] R. Bose and S. Shrikhande. On the construction of sets of mutually orthogonal latin squares and the falsity of a conjecture of euler. Euler. Trans. Am. Math. Sm., 95:191–209, 1960. • Our experimental results on real spatial data demonstrated I/O costs 2 to 4 times more efficient using the extended periodic allocation scheme than current techniques. [7] C. Chen, R. Bhatia, and R. Sinha. Declustering using golden ratio sequences. In International Conference on Data Engineering, pages 271–280, San Diego, California, Feb 2000. The ideas of orthogonality and periodic allocations can be extended to higher dimensions. Currently we are working on analyzing the performance of such extensions. [8] C. Chen and C. T. Cheng. From discrepancy to declustering: Near optimal multidimensional declustering strategies for range queries. In Proc. ACM Symp. on Principles of Database Systems, pages 29–38, Wisconsin, Madison, 2002. 7. REFERENCES [1] K. A. S. Abdel-Ghaffar and A. El Abbadi. Optimal allocation of two-dimensional data. In International Conference on Database Theory, pages 409–418, Delphi, Greece, January 1997. [2] M. J. Atallah and S. Prabhakar. (Almost) optimal parallel block access for range queries. In Proc. ACM Symp. on Principles of Database Systems, pages 205–215, Dallas, Texas, May 2000. [3] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger. The R* tree: An efficient and robust access method for points and rectangles. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 322–331, May 23-25 1990. 134 [9] L. Chen and D. Rotem. Optimal response time retrieval of replicated data. In Proc. ACM Symp. on Principles of Database Systems, pages 36–44, Minneapolis, Minnesota, May 1994. [10] M. Chen, H. Hsiao, C. Lie, and P. Yu. Using rotational mirrored declustering for replica placement in a disk array-based video server. In Proceedings of the ACM Multimedia, pages 121–130, 1995. [11] P. Ciaccia and A.Veronesi. Dynamic declustering methods for parallel grid files. In Proceedings of Third International ACPC Conference with Special Emphasis on Parallel Databases and Parallel I/O, pages 110–123, Berlin, Germany, Sept. 1996. [12] H. C. Du and J. S. Sobolewski. Disk allocation for cartesian product files on multiple-disk systems. ACM Transactions of Database Systems, 7(1):82–101, March 1982. [26] B. Moon, A. Acharya, and J. Saltz. Study of scalable declustering algorithms for parallel grid files. In Proc. of the Parallel Processing Symposium, Apr. 1996. [13] C. Faloutsos and P. Bhagwat. Declustering using fractals. In Proceedings of the 2nd International Conference on Parallel and Distributed Information Systems, pages 18 – 25, San Diego, CA, Jan 1993. [27] R. Muntz, J. Santos, and S. Berson. A parallel disk storage system for real-time multimedia applications. International Journal of Intelligent Systems, Special Issue on Multimedia Computing System, 13(12):1137–74, December 1998. [14] C. Faloutsos and D. Metaxas. Declustering using error correcting codes. In Proc. ACM Symp. on Principles of Database Systems, pages 253–258, 1989. [28] S. Prabhakar, K. Abdel-Ghaffar, D. Agrawal, and A. El Abbadi. Cyclic allocation of two-dimensional data. In International Conference on Data Engineering, pages 94–101, Orlando, Florida, Feb 1998. [15] H. Ferhatosmanoglu, D. Agrawal, and A. E. Abbadi. Concentric hyperspaces and disk allocation for fast parallel range searching. In Proc. Int. Conf. Data Engineering, pages 608–615, Sydney, Australia, Mar. 1999. [16] H. Ferhatosmanoglu, A. S. Tosun, and A. Ramachandran. Replicated declustering of spatial data : A technical report. http://www.cse.ohio-state.edu/∼hakan/repdec-report.pdf, 2003. [30] H. Samet. The Design and Analysis of Spatial Structures. Addison Wesley Publishing Company, Inc., Massachusetts, 1989. [17] V. Gaede and O. Gunther. Multidimensional access methods. ACM Computing Surveys, 30:170–231, 1998. [18] S. Ghandeharizadeh and D. J. DeWitt. Hybrid-range partitioning strategy: A new declustering strategy for multiprocessor database machines. In Proceedings of 16th International Conference on Very Large Data Bases, pages 481–492, August 1990. [31] P. Sanders, S. Egner, and J. H. M. Korst. Fast concurrent access to parallel disks. In Symposium on Discrete Algorithms, pages 849–858, 2000. [32] J. Santos and R. Muntz. Design of the RIO (randomized I/O) storage server. Technical Report TR970032, UCLA Computer Science Department, 1997. http://mml.cs.ucla.edu/publications/papers/cstech970032.ps. [19] S. Ghandeharizadeh and D. J. DeWitt. A performance analysis of alternative multi-attribute declustering strategies. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 29–38, San Diego, 1992. [20] L. Golubchik, S. Khanna, S. Khuller, R. Thurimella, and A. Zhu. Approximation algorithms for data placement on parallel disks. In Symposium on Discrete Algorithms, pages 223–232, 2000. [21] J. Gray, B. Horst, and M. Walker. Parity striping of disc arrays: Low-cost reliable storage with acceptable throughput. In Proceedings of the Int. Conf. on Very Large Data Bases, pages 148–161, Washington DC., Aug. 1990. [22] A. Guttman. R-trees: A dynamic index structure for spatial searching. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 47–57, 1984. [23] K. A. Hua and H. C. Young. A general multidimensional data allocation method for multicomputer database systems. In Database and Expert System Applications, pages 401–409, Toulouse, France, Sept. 1997. [24] M. H. Kim and S. Pramanik. Optimal file distribution for partial match retrieval. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 173–182, Chicago, 1988. [25] J. Li, J. Srivastava, and D. Rotem. CMD: a multidimensional declustering method for parallel database systems. In Proceedings of the Int. Conf. on Very Large Data Bases, pages 3–14, Vancouver, Canada, Aug. 1992. 135 View publication stats [29] S. Prabhakar, D. Agrawal, and A. El Abbadi. Efficient disk allocation for fast similarity searching. In 10th International Symposium on Parallel Algorithms and Architectures, SPAA‘98, pages 78–87, Puerto Vallarta, Mexico, June 1998. [33] Seagate. Seagate specifications. http://www.seagate.com/pdf/datasheets/, December 2003. [34] R. K. Sinha, R. Bhatia, and C. Chen. Asymptotically optimal declustering schemes for range queries. In 8th International Conference on Database Theory, Lecture Notes in Computer Science, pages 144–158, London, UK, Jan. 2001. Springer. [35] A. S. Tosun. Replicated declustering for arbitrary queries. In 19th ACM Symposium on Applied Computing, March 2004. [36] A. S. Tosun and H. Ferhatosmanoglu. Optimal parallel I/O using replication. In Proceedings of International Workshops on Parallel Processing (ICPP), Vancouver, Canada, Aug. 2002.