research-article

Finding attribute-aware similar regions for data analysis

Published: 01 July 2019

Abstract

With the proliferation of mobile devices and location-based services, increasingly massive volumes of geo-tagged data are becoming available. Such data typically also contains non-location information. We study how to use this information to characterize a region and then how to find a region of the same size with the most similar characteristics. This functionality enables a user to identify regions that share characteristics with a user-supplied region that the user is familiar with and likes. More specifically, we formalize and study a new problem called the attribute-aware similar region search (ASRS) problem. We first define so-called composite aggregators that are able to express aspects of interest in terms of the information associated with a user-supplied region. When applied to a region, an aggregator captures the region's relevant characteristics. Next, given a query region and a composite aggregator, we propose a novel algorithm called DS-Search to find the most similar region of the same size. Unlike any previous work on region search, DS-Search repeatedly discretizes and splits regions until a split region either satisfies a drop condition or is guaranteed not to contribute to the result. In addition, we extend DS-Search to solve the ASRS problem approximately. Finally, we report on extensive empirical studies that offer insight into the efficiency and effectiveness of the paper's proposals.
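
To make the problem statement concrete, the following is a minimal sketch of a naive baseline for similar region search. It is not DS-Search and is not taken from the paper. It assumes a hypothetical aggregator that counts objects per category inside a rectangle, an L1 distance between count vectors, and a fixed candidate grid with spacing `step`. The names `Point`, `Rect`, `aggregate`, `distance`, and `naive_similar_region` are illustrative only, and skipping candidates that overlap the query region is likewise an assumption of this sketch.

```python
from collections import Counter
from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float
    category: str  # the non-location attribute of a geo-tagged object


class Rect(NamedTuple):
    x: float  # bottom-left corner
    y: float
    w: float
    h: float

    def contains(self, p: Point) -> bool:
        return self.x <= p.x < self.x + self.w and self.y <= p.y < self.y + self.h

    def overlaps(self, o: "Rect") -> bool:
        return (self.x < o.x + o.w and o.x < self.x + self.w
                and self.y < o.y + o.h and o.y < self.y + self.h)


def aggregate(points, region):
    """Hypothetical aggregator: per-category object counts inside the region."""
    return Counter(p.category for p in points if region.contains(p))


def distance(a, b):
    """L1 distance between two count vectors (missing categories count as 0)."""
    return sum(abs(a[k] - b[k]) for k in set(a) | set(b))


def naive_similar_region(points, query, extent, step):
    """Scan same-size candidate rectangles on a grid inside `extent`.

    Returns (best_region, best_distance). Candidates that overlap the query
    region are skipped (an assumption of this sketch). The extent is assumed
    to be at least as large as the query region.
    """
    target = aggregate(points, query)
    nx = int((extent.w - query.w) / step) + 1
    ny = int((extent.h - query.h) / step) + 1
    best, best_dist = None, float("inf")
    for i in range(nx):
        for j in range(ny):
            cand = Rect(extent.x + i * step, extent.y + j * step, query.w, query.h)
            if cand.overlaps(query):
                continue
            d = distance(target, aggregate(points, cand))
            if d < best_dist:  # strict comparison keeps the first best candidate
                best, best_dist = cand, d
    return best, best_dist
```

This exhaustive scan re-aggregates every candidate from scratch and can only return regions aligned to the chosen grid, which is the kind of cost and restriction that a discretize-and-split strategy such as the one described above is meant to avoid.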


Published In

Proceedings of the VLDB Endowment, Volume 12, Issue 11
July 2019
543 pages

Publisher

VLDB Endowment
