Abstract
Building an index tree is a common approach to speeding up k-nearest-neighbour searches in large databases of many-dimensional records. Many applications require the distance metric to vary, by placing different weights on different dimensions. The main problem with k-nearest-neighbour searches using weighted Euclidean metrics in a high-dimensional space is whether the searches can be done efficiently. We present a solution to this problem that uses the bounding rectangle of the nearest-neighbour disk instead of the disk itself. The algorithm can perform nearest-neighbour searches using distance metrics different from the metric used to build the search tree, without having to rebuild the tree. It is efficient for weighted Euclidean distance and extensible to higher dimensions.
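The bounding-rectangle idea can be illustrated with a minimal sketch. It is not the paper's algorithm, only one way to read the abstract, and it rests on stated assumptions: the weights are strictly positive and diagonal (one weight per dimension), tree nodes are described by axis-aligned rectangles, and the function names (`weighted_distance`, `ball_bounding_box`, `boxes_intersect`) are hypothetical. Under a weighted Euclidean metric the region within distance r of the query is an axis-aligned ellipsoid whose bounding rectangle has half-width r / sqrt(w_i) in dimension i, so a node that misses this rectangle cannot contain a closer neighbour, whatever metric was used to build the tree.

```python
import math

def weighted_distance(x, q, w):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (x_i - q_i)^2)."""
    return math.sqrt(sum(wi * (xi - qi) ** 2 for xi, qi, wi in zip(x, q, w)))

def ball_bounding_box(q, r, w):
    """Axis-aligned box enclosing {x : weighted_distance(x, q, w) <= r}.

    Assumes every weight is strictly positive. The extreme point along
    dimension i is reached when all other coordinates equal the query's,
    giving a half-width of r / sqrt(w_i).
    """
    lo = [qi - r / math.sqrt(wi) for qi, wi in zip(q, w)]
    hi = [qi + r / math.sqrt(wi) for qi, wi in zip(q, w)]
    return lo, hi

def boxes_intersect(lo_a, hi_a, lo_b, hi_b):
    """True if two axis-aligned boxes overlap in every dimension."""
    return all(la <= hb and lb <= ha
               for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b))

# Hypothetical query: current k-th neighbour distance r = 2, with
# dimension 0 weighted four times as heavily as dimension 1.
query, radius, weights = (0.0, 0.0), 2.0, (4.0, 1.0)
lo, hi = ball_bounding_box(query, radius, weights)

# A node rectangle that an unweighted radius-2 disk would reach, but
# that lies outside the weighted search region and can be skipped.
node_lo, node_hi = (1.5, 0.0), (3.0, 1.0)
can_prune = not boxes_intersect(lo, hi, node_lo, node_hi)
```

The rectangle test is conservative: a node that intersects the rectangle may still lie outside the ellipsoid, so surviving candidates would still need an exact check with `weighted_distance`.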
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kurniawati, R., Jin, J.S., Shepherd, J.A. (1998). Efficient nearest-neighbour searches using weighted euclidean metrics. In: Embury, S.M., Fiddian, N.J., Gray, W.A., Jones, A.C. (eds) Advances in Databases. BNCOD 1998. Lecture Notes in Computer Science, vol 1405. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0053472
DOI: https://doi.org/10.1007/BFb0053472
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64659-4
Online ISBN: 978-3-540-69112-9
eBook Packages: Springer Book Archive