research-article

Efficient search for the top-k probable nearest neighbors in uncertain databases

Published: 01 August 2008

Abstract

Uncertainty pervades many domains in our lives. Current real-life applications, e.g., location tracking using GPS devices or cell phones, multimedia feature extraction, and sensor data management, deal with different kinds of uncertainty. Finding the nearest neighbor objects to a given query point is an important query type in these applications.
In this paper, we study the problem of finding objects with the highest marginal probability of being the nearest neighbors to a query object. We adopt a general uncertainty model allowing for data and query uncertainty. Under this model, we define new query semantics, and provide several efficient evaluation algorithms. We analyze the cost factors involved in query evaluation, and present novel techniques to address the trade-offs among these factors. We give multiple extensions to our techniques including handling dependencies among data objects, and answering threshold queries. We conduct an extensive experimental study to evaluate our techniques on both real and synthetic data.
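As a rough illustration of the query semantics described above, the following Python sketch computes, for each uncertain object, the marginal probability of being the nearest neighbor to a query point, and then returns the k objects with the highest probability. It is not one of the paper's evaluation algorithms, which the abstract does not detail. It is a naive possible-worlds baseline under assumptions made here for illustration: each object has a small discrete set of alternative 2-D locations with probabilities summing to 1, objects are independent, the query point is certain, and ties are split evenly. The names `nn_probabilities` and `top_k_probable_nn` are hypothetical.

```python
from itertools import product
from math import dist


def nn_probabilities(objects, query):
    """Marginal probability that each object is the nearest neighbor of `query`.

    objects: dict mapping an object name to a list of ((x, y), probability)
             alternatives. Assumed: probabilities per object sum to 1 and
             objects are mutually independent.
    query:   a certain (x, y) point (an assumption of this sketch).
    """
    names = list(objects)
    probs = {name: 0.0 for name in names}

    # Enumerate every possible world: one alternative chosen per object.
    for world in product(*(objects[name] for name in names)):
        world_prob = 1.0
        for _, p in world:
            world_prob *= p  # independence assumption

        distances = [dist(location, query) for location, _ in world]
        best = min(distances)
        winners = [n for n, d in zip(names, distances) if d == best]

        # Split the world's probability evenly among tied nearest objects.
        for name in winners:
            probs[name] += world_prob / len(winners)

    return probs


def top_k_probable_nn(objects, query, k):
    """Return the k (name, probability) pairs with the highest NN probability."""
    probs = nn_probabilities(objects, query)
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]


if __name__ == "__main__":
    # Hypothetical data: three objects with one or two alternative locations.
    example = {
        "a": [((1.0, 0.0), 0.6), ((5.0, 0.0), 0.4)],
        "b": [((2.0, 0.0), 1.0)],
        "c": [((3.0, 0.0), 0.3), ((0.5, 0.0), 0.7)],
    }
    for name, p in top_k_probable_nn(example, (0.0, 0.0), k=2):
        print(f"{name}: {p:.2f}")
```

Enumerating possible worlds takes time exponential in the number of objects, so this sketch is only useful for checking the semantics on tiny inputs. The cost of exact evaluation is what motivates the efficient algorithms the abstract refers to.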

Published In

Proceedings of the VLDB Endowment, Volume 1, Issue 1
August 2008
1216 pages

Publisher

VLDB Endowment

