Abstract
The performance of database operations can be enhanced with an efficient storage structure design that uses attribute partitioning, tuple clustering, or both. Previous research deals mostly with attribute partitioning. We address here the combined problem of attribute partitioning and tuple clustering. We propose a novel approach to this mixed fragmentation problem that applies a genetic algorithm iteratively to the attribute partitioning and tuple clustering sub-problems. Compared with attribute-only partitioning and a random search solution, our approach reduced database access cost by up to 70% and 67%, respectively. We also analyzed experimentally how varying the genetic parameters affects the optimal solution.
Notes
The authors are thankful to the reviewer who provided this feedback.
Additional information
An earlier version of this paper was presented at the 2003 ACM Symposium on Applied Computing (SAC).
About this article
Cite this article
Gorla, N., Ng, V. & Law, D.M. Improving database performance with a mixed fragmentation design. J Intell Inf Syst 39, 559–576 (2012). https://doi.org/10.1007/s10844-012-0203-x
DOI: https://doi.org/10.1007/s10844-012-0203-x