
Fuzzy Statistics Estimation in Supporting Multidatabase Query Optimization

Electronic Commerce Research

Abstract

Advances in networking and database technology have made global information sharing a reality. Multidatabase systems (MDBSs) are a promising approach to achieving interoperability among multiple pre-existing databases that are highly autonomous and possibly heterogeneous. The performance of an MDBS depends heavily on the effectiveness of multidatabase query optimization (MQO). However, the statistics essential to query optimization are often unavailable or uncertain in an MDBS, which makes MQO significantly more challenging than distributed query optimization. This research developed a fuzzy-statistics-based MQO approach that addresses the statistics estimation and uncertainty problems in an MDBS environment. We analyzed the statistics needed in an MDBS environment and classified them into three categories: point-based, distribution-function-based, and dependency-based. Fuzzy numbers were adopted to represent point-based statistics, and a fuzzy polynomial regression method was developed for estimating distribution-function-based statistics (i.e., attribute or join selectivity) from a set of subquery results. For dependency-based statistics, a fuzzy regression method was employed for estimating logical-parameter-based local cost functions. Methods for ranking fuzzy numbers, which are fundamental to fuzzy-statistics-based MQO, were also discussed. The proposed fuzzy statistics estimation methods were illustrated with examples to demonstrate their applicability in supporting MQO.
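To make the idea of fuzzy point-based statistics and fuzzy-number ranking more concrete, the following is a minimal sketch and not the method of the paper, whose details are not given in the abstract. It assumes triangular fuzzy numbers and a simple centroid ranking; the class name `TriangularFuzzyNumber`, its methods, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    """Triangular fuzzy number (low, mode, high); an assumed representation."""
    low: float
    mode: float
    high: float

    def __post_init__(self):
        if not (self.low <= self.mode <= self.high):
            raise ValueError("require low <= mode <= high")

    def membership(self, x: float) -> float:
        """Degree, in [0, 1], to which x belongs to this fuzzy number."""
        if x == self.mode:
            return 1.0
        if self.low < x < self.mode:
            return (x - self.low) / (self.mode - self.low)
        if self.mode < x < self.high:
            return (self.high - x) / (self.high - self.mode)
        return 0.0

    def __add__(self, other: "TriangularFuzzyNumber") -> "TriangularFuzzyNumber":
        # Sum of two triangular fuzzy numbers is triangular, component-wise.
        return TriangularFuzzyNumber(
            self.low + other.low, self.mode + other.mode, self.high + other.high
        )

    def scale(self, k: float) -> "TriangularFuzzyNumber":
        """Multiply by a non-negative crisp constant k."""
        if k < 0:
            raise ValueError("k must be non-negative")
        return TriangularFuzzyNumber(k * self.low, k * self.mode, k * self.high)

    def centroid(self) -> float:
        """Centroid of the triangle; one simple (assumed) ranking index."""
        return (self.low + self.mode + self.high) / 3.0


# Hypothetical uncertain cardinality of a remote relation: "about 1000 tuples".
cardinality = TriangularFuzzyNumber(900.0, 1000.0, 1200.0)

# Applying a crisp selectivity of 0.5 yields a fuzzy result size.
result_size = cardinality.scale(0.5)

# Hypothetical fuzzy cost estimates for two candidate plans.
plan_a = TriangularFuzzyNumber(80.0, 100.0, 150.0)
plan_b = TriangularFuzzyNumber(90.0, 105.0, 120.0)

# Rank the plans by centroid and pick the one with the smaller index.
cheaper = min((plan_a, plan_b), key=lambda cost: cost.centroid())
```

Under these assumed values, `plan_b` is selected even though its most likely cost is higher than that of `plan_a`, because the long upper tail of `plan_a` raises its centroid. Other ranking indices can order the same pair differently, which is one reason the choice of ranking method matters to an optimizer that compares fuzzy costs.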




Cite this article

Wei, CP., Sheng, O.R.L. & Hu, P.JH. Fuzzy Statistics Estimation in Supporting Multidatabase Query Optimization. Electronic Commerce Research 2, 287–316 (2002). https://doi.org/10.1023/A:1016014716253


  • DOI: https://doi.org/10.1023/A:1016014716253