research-article

Asymptotically Optimal Encodings of Range Data Structures for Selection and Top-k Queries

Published: 06 March 2017

Abstract

Given an array A[1, n] of elements with a total order, we consider the problem of building a data structure that answers two kinds of queries: (a) selection queries receive a range [i, j] and an integer k and return the position of the kth largest element in A[i, j]; (b) top-k queries receive [i, j] and k and return the positions of the k largest elements in A[i, j]. These problems can be solved in optimal time, O(1 + lg k / lg lg n) and O(k), respectively, using linear-space data structures.
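To make the two query types concrete, the following is a naive reference sketch that sorts the range on every query. It is not one of the optimal-time structures mentioned above; it only pins down what the queries return. The function names `select` and `top_k`, the leftmost-first tie-breaking rule, and the choice to report top-k positions in decreasing order of value are assumptions made for illustration.

```python
from typing import List, Sequence


def _ranked_positions(A: Sequence, i: int, j: int) -> List[int]:
    """1-based positions of A[i..j], ordered from largest to smallest value.

    Python's sort is stable even with reverse=True, so equal values keep
    their left-to-right order (assumed tie-breaking rule).
    """
    if not (1 <= i <= j <= len(A)):
        raise ValueError("range [i, j] must satisfy 1 <= i <= j <= n")
    return sorted(range(i, j + 1), key=lambda p: A[p - 1], reverse=True)


def select(A: Sequence, i: int, j: int, k: int) -> int:
    """Position of the k-th largest element in A[i..j] (1-based, inclusive)."""
    if not (1 <= k <= j - i + 1):
        raise ValueError("k must satisfy 1 <= k <= j - i + 1")
    return _ranked_positions(A, i, j)[k - 1]


def top_k(A: Sequence, i: int, j: int, k: int) -> List[int]:
    """Positions of the k largest elements in A[i..j], largest first."""
    if not (1 <= k <= j - i + 1):
        raise ValueError("k must satisfy 1 <= k <= j - i + 1")
    return _ranked_positions(A, i, j)[:k]


A = [3, 1, 4, 1, 5, 9, 2, 6]
print(select(A, 2, 6, 2))  # position of the 2nd largest value in A[2..6]
print(top_k(A, 2, 6, 3))   # positions of the 3 largest values in A[2..6]
```

Each call costs O(m log m) time for a range of length m = j - i + 1, which is far from the optimal bounds quoted above; the sketch is only useful as a correctness baseline.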
We provide the first study of encoding data structures for the above problems, where A cannot be accessed at query time. Several applications are interested in the relative order of the entries of A and their positions, rather than their actual values, and thus do not need to keep A at query time. In those cases, encodings save storage space: we first show that any encoding answering such queries requires n lg k - O(n + k lg k) bits of space; then, we design encodings using O(n lg k) bits, that is, asymptotically optimal up to constant factors, while preserving optimal query time.
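The observation that only the relative order of the entries matters can be checked directly. The sketch below replaces A by an array of ranks and verifies by brute force that every selection query returns the same position on both arrays. This rank array is not the encoding designed in the paper (it needs about n lg n bits rather than O(n lg k)); the helper names `rank_reduce` and `same_answers` and the leftmost-first tie-breaking rule are assumptions made for illustration.

```python
from typing import List, Sequence


def rank_reduce(A: Sequence) -> List[int]:
    """Replace each entry by its rank in 1..n, discarding the actual values.

    Among equal values the leftmost entry gets the larger rank, which matches
    the leftmost-first tie-breaking assumed in select() below.
    """
    order = sorted(range(len(A)), key=lambda p: (A[p], -p))
    ranks = [0] * len(A)
    for r, p in enumerate(order, start=1):
        ranks[p] = r
    return ranks


def select(A: Sequence, i: int, j: int, k: int) -> int:
    """Naive selection: position of the k-th largest element in A[i..j]."""
    positions = sorted(range(i, j + 1), key=lambda p: A[p - 1], reverse=True)
    return positions[k - 1]


def same_answers(A: Sequence, B: Sequence) -> bool:
    """Brute-force check that A and B agree on every selection query."""
    n = len(A)
    return all(
        select(A, i, j, k) == select(B, i, j, k)
        for i in range(1, n + 1)
        for j in range(i, n + 1)
        for k in range(1, j - i + 2)
    )


A = [3.5, 1.0, 4.2, 1.0, 5.9]
R = rank_reduce(A)
print(R, same_answers(A, R))
```

Since top-k answers are determined by the selection answers for 1, ..., k, the same check covers top-k queries as well.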

Published In

ACM Transactions on Algorithms, Volume 13, Issue 2
Special Issue on SODA'15 and Regular Papers
April 2017
316 pages
ISSN: 1549-6325
EISSN: 1549-6333
DOI: 10.1145/3040971
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 March 2017
Accepted: 01 October 2016
Revised: 01 May 2016
Received: 01 August 2015
Published in TALG Volume 13, Issue 2

Author Tags

  1. Succinct data structures
  2. Encoding data structures
  3. Range minimum queries
  4. Range search data structures

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Millennium Nucleus Information and Coordination in Networks
  • Ministry of Education, Science and Technology
  • Basic Science Research Program through the National Research Foundation of Korea (NRF)
  • MIUR PRIN 2012C4E3KT national research project
