Logically Clustered Architectures for Networked Databases

Published: 01 September 2001

Abstract

By effectively harnessing networked computing resources, the two-tier client-server model has been used to support shared data access. In systems based on this approach, the database servers often become performance bottlenecks when the number of concurrent users is large. Client data caching techniques have been proposed in order to ease resource contention at the servers. The key theme of these techniques is the exploitation of user data access locality. In this paper, we propose a three-tiered model that takes advantage of such data access locality to furnish a much more scalable system. Groups of clients that demonstrate similarities in their data access behavior are logically clustered together. Each such group of clients is handled by an Intermediate Cluster Manager (ICM) that acts as a cluster-wide directory service and cache manager. Clients within the same cluster are now capable of sharing data among themselves without interacting with the server(s). This results in reduced server load and allows the support of a much larger number of clients. Through prototyping and experimentation, we show that the logical clustering of clients, and the introduction of the ICM layer, significantly improve system scalability as well as transaction response times. Logical clusters, consisting of clients with similar data access patterns, are identified with the help of both a greedy algorithm and a genetic algorithm. For the latter, we have developed an encoding scheme and its corresponding operators.
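
The three-tier lookup path described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class names `Server`, `ICM`, and `Client`, the `read` method, and the lookup order (local cache, then a peer located through the ICM directory, then the server) are assumptions made for the example, and updates and cache consistency are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Back-end database server; counts the requests that reach it."""
    store: dict
    requests: int = 0

    def fetch(self, obj_id):
        self.requests += 1
        return self.store[obj_id]


@dataclass
class ICM:
    """Hypothetical Intermediate Cluster Manager: a cluster-wide directory
    mapping each object id to the clients that currently cache it."""
    server: Server
    directory: dict = field(default_factory=dict)  # obj_id -> list of clients

    def locate(self, obj_id, requester):
        # Return some other client in the cluster that holds the object.
        holders = [c for c in self.directory.get(obj_id, []) if c is not requester]
        return holders[0] if holders else None

    def register(self, obj_id, client):
        holders = self.directory.setdefault(obj_id, [])
        if client not in holders:
            holders.append(client)


@dataclass(eq=False)  # identity comparison; clients are never compared by value
class Client:
    icm: ICM
    cache: dict = field(default_factory=dict)

    def read(self, obj_id):
        """Return (value, source), where source names the tier that answered."""
        if obj_id in self.cache:
            return self.cache[obj_id], "local"
        peer = self.icm.locate(obj_id, self)
        if peer is not None:
            value, source = peer.cache[obj_id], "peer"
        else:
            value, source = self.icm.server.fetch(obj_id), "server"
        self.cache[obj_id] = value
        self.icm.register(obj_id, self)
        return value, source


# Two clients in one cluster read the same object.
server = Server(store={"page-1": "payload"})
icm = ICM(server)
first, second = Client(icm), Client(icm)
first.read("page-1")   # reaches the server
second.read("page-1")  # served by the first client's cache via the ICM
```

In this toy run only the first read reaches the server, which is the effect the abstract attributes to intra-cluster data sharing.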


Reviews

Athena Vakali

Database systems have been introduced in domains that involve complex information processing. These systems must handle high volumes of data between sites in networked environments. Contemporary databases, which are based on the client-server model, are widely used to meet these challenges. In this paper, two alternative architectures to the client-server model are proposed. The key feature of these configurations is logical client clustering. By analyzing the data access patterns of the sites involved, clients that access similar segments of a database are grouped together. Each cluster of clients is managed by an intermediate cluster manager (ICM), which is connected to the existing database server or servers.
The paper begins by describing the two network database architectures: logically clustered client-server database (LC-CS), and its extended version (extended LC-CS). In the extended version, the ICMs support caching techniques. The authors then perform an analysis of the two types of configurations, based on the probability of object request satisfaction at clients, and within the clusters. Two clustering algorithms (greedy and genetic), which are used to cluster clients with similar database access patterns, are described next. Readers should have a background in such algorithms, since the paper itself doesn't provide enough detail to understand them. More specifically, the example for the genetic algorithm is not very clearly explained (Figure 7 has some missing points).
The authors then describe the experimental evaluation of the two architectures, which are compared with the client-server model. The performance of the proposed environment is examined by varying the number of clients and the workload. The experiment section of the paper is very well organized, and includes a complete view of the experiments carried out. Finally, the authors provide a good overview of related work on client clustering in distributed file systems.
This paper presents well-organized and important work. The paper is intended for database specialists, particularly those interested in the application of artificial intelligence technology to database systems.
Online Computing Reviews Service
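
As a rough illustration of what greedy client clustering by access pattern can look like, the sketch below groups clients whose sets of accessed objects overlap. It is not the algorithm from the paper, whose details are not given here: the Jaccard similarity measure, the `threshold` parameter, and the function names are all assumptions chosen for the example.

```python
def jaccard(a, b):
    """Overlap between two sets of object ids, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_cluster(access_sets, threshold=0.3):
    """Assign each client, in input order, to the existing cluster whose
    combined access set is most similar to its own; open a new cluster
    when no existing one reaches the (assumed) similarity threshold."""
    clusters = []  # each entry: {"members": [...], "objects": set(...)}
    for client, objects in access_sets.items():
        best, best_score = None, 0.0
        for cluster in clusters:
            score = jaccard(objects, cluster["objects"])
            if score > best_score:
                best, best_score = cluster, score
        if best is None or best_score < threshold:
            clusters.append({"members": [client], "objects": set(objects)})
        else:
            best["members"].append(client)
            best["objects"] |= objects
    return [c["members"] for c in clusters]


# Hypothetical access traces: client name -> ids of the objects it reads.
traces = {
    "c1": {1, 2, 3},
    "c2": {2, 3, 4},
    "c3": {10, 11},
    "c4": {11, 12},
    "c5": {1, 3},
}
print(greedy_cluster(traces))  # [['c1', 'c2', 'c5'], ['c3', 'c4']]
```

A greedy pass like this depends on the order in which clients are considered, which is one reason a global search such as a genetic algorithm may find better groupings.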

Published In

Distributed and Parallel Databases, Volume 10, Issue 2
September 2001
88 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. logical client clustering
  2. multi-tier database architectures
  3. networked databases

Qualifiers

  • Article
