
Lower bounds for external memory integer sorting via network coding

Published: 23 September 2020

Abstract

Sorting extremely large datasets is a frequent task in practice. These datasets are usually much larger than the computer's main memory; thus, external memory sorting algorithms, first introduced by Aggarwal and Vitter, are often used. The complexity of comparison-based external memory sorting has been understood for decades; however, the picture is far less clear when the keys to be sorted are integers. In internal memory, one can sort a set of n integer keys of Θ(lg n) bits each in O(n) time using the classic Radix Sort algorithm; in external memory, however, no integer sorting algorithm is known that is faster than the simple comparison-based ones. Whether such algorithms exist has remained a central open problem in external memory algorithms for more than three decades.
In this paper, we present a tight conditional lower bound on the complexity of external memory sorting of integers. Our lower bound is based on a famous conjecture in network coding by Li and Li, who conjectured that, in undirected graphs, network coding cannot achieve a rate higher than the standard multicommodity flow rate.
The only previous work connecting the Li and Li conjecture to lower bounds for algorithms is due to Adler et al., who obtain relatively simple lower bounds for oblivious algorithms (algorithms whose memory access pattern is fixed and independent of the input data). Unfortunately, obliviousness is a strong limitation, especially for integer sorting: we show that the Li and Li conjecture implies an Ω(n lg n) lower bound for internal memory oblivious sorting when the keys are Θ(lg n) bits. This is in sharp contrast to the classic (nonoblivious) Radix Sort algorithm. Going beyond obliviousness is highly nontrivial; we need to introduce several new methods and involved techniques, which are of independent interest, to obtain our tight lower bound for external memory integer sorting.
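The contrast with Radix Sort drawn above can be made concrete with a small internal-memory example. The following is a minimal sketch, not code from the paper: the function name `radix_sort` and its `key_bits` parameter are illustrative assumptions. It uses digits of about lg n bits, so keys of Θ(lg n) bits need only a constant number of stable counting-sort passes, each taking O(n) time.

```python
def radix_sort(keys, key_bits):
    """LSD radix sort for non-negative integers below 2**key_bits.

    Illustrative sketch only; names and parameters are assumptions.
    """
    n = len(keys)
    if n < 2:
        return list(keys)
    # Digits of about lg n bits give at most n buckets per pass.
    digit_bits = max(1, n.bit_length() - 1)
    radix = 1 << digit_bits
    mask = radix - 1
    out = list(keys)
    for shift in range(0, key_bits, digit_bits):
        # Stable counting sort on the current digit.
        counts = [0] * (radix + 1)
        for k in out:
            counts[((k >> shift) & mask) + 1] += 1
        for d in range(radix):
            counts[d + 1] += counts[d]  # counts[d] = number of keys with a smaller digit
        nxt = [0] * n
        for k in out:
            d = (k >> shift) & mask
            nxt[counts[d]] = k  # the write position depends on the key value
            counts[d] += 1
        out = nxt
    return out
```

Note that the position each key is written to depends on the key itself, so the memory access pattern is not fixed in advance; in the abstract's terminology, this sketch is a nonoblivious algorithm.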

References

[1] Adler, M., Harvey, N.J.A., Jain, K., Kleinberg, R., Lehman, A.R. On the capacity of information networks. In Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '06 (2006), 241–250.
[2] Aggarwal, A., Vitter, J. The input/output complexity of sorting and related problems. Commun. ACM 31, 9 (1988), 1116–1127.
[3] Barak, B., Braverman, M., Chen, X., Rao, A. How to compress interactive communication. In Proceedings of the Forty-Second ACM Symposium on Theory of Computing, STOC '10 (2010), 67–76.
[4] Braverman, M., Garg, S., Schvartzman, A. Coding in undirected graphs is either very helpful or not helpful at all. In 8th Innovations in Theoretical Computer Science Conference, ITCS 2017, January 9–11, 2017, Berkeley, CA, USA (2017), 18:1–18:18.
[5] Han, Y. Deterministic sorting in O(n lg lg n) time and linear space. In Proceedings of the Thirty-Fourth Annual ACM Symposium on Theory of Computing (2002), ACM, New York, 602–608.
[6] Han, Y., Thorup, M. Integer sorting in O(n√(log log n)) expected time and linear space. In Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer Science (2002), IEEE, 135–144.
[7] Li, Z., Li, B. Network coding: the case of multiple unicast sessions. In Proceedings of the 42nd Allerton Annual Conference on Communication, Control and Computing, Allerton '04 (2004).

Published In

Communications of the ACM, Volume 63, Issue 10
October 2020
97 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/3426225

Publisher

Association for Computing Machinery, New York, NY, United States
