research-article
Open access

Optimizing sorting algorithms by using sorting networks

Published: 01 May 2017

    Abstract

    In this paper, we show how the theory of sorting networks can be applied to synthesize optimized general-purpose sorting libraries. Standard sorting libraries are often based on combinations of the classic Quicksort algorithm with insertion sort applied as the base case for small, fixed numbers of inputs. Unrolling the code for the base case by ignoring loop conditions eliminates branching, resulting in code equivalent to a sorting network. By replacing this code with faster sorting networks, we can improve the performance of these algorithms. We show that by considering the number of comparisons and swaps alone we are not able to predict any real advantage of this approach. However, significant speed-ups are obtained when taking advantage of instruction-level parallelism and non-branching conditional assignment instructions, both of which are common in modern CPU architectures. Furthermore, close control of how often registers have to be spilled to memory gives us a complete explanation of the performance of different sorting networks, allowing us to choose an optimal one for each particular architecture. Our experimental results show that using code synthesized from these efficient sorting networks as the base case for Quicksort libraries results in significant real-world speed-ups.
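
    The idea in the abstract can be illustrated with a minimal C sketch for four inputs. It is not the code synthesized in the paper: the function names (cswap, sort4_insertion, sort4_network, sorts_all_01_inputs, self_test) and the int element type are assumptions made for illustration. The first routine is insertion sort unrolled with every comparison executed unconditionally (6 comparators in 5 dependent layers); the second is the well-known 5-comparator, 3-layer network for four inputs, whose layers contain independent comparators that a superscalar CPU may execute in parallel. The compare-and-swap uses ternaries so that a compiler may emit conditional moves instead of branches; whether it does so depends on the compiler and its flags.

```c
#include <assert.h>

/* Compare-and-swap without an if statement: afterwards *x <= *y.
   Both results are computed from the same two loaded values. */
static inline void cswap(int *x, int *y) {
    int a = *x, b = *y;
    *x = a < b ? a : b;
    *y = a < b ? b : a;
}

/* Insertion sort on 4 elements, unrolled with loop conditions ignored:
   6 comparators arranged in 5 dependent layers. */
void sort4_insertion(int *v) {
    cswap(&v[0], &v[1]);
    cswap(&v[1], &v[2]); cswap(&v[0], &v[1]);
    cswap(&v[2], &v[3]); cswap(&v[1], &v[2]); cswap(&v[0], &v[1]);
}

/* Standard 4-input sorting network: 5 comparators in 3 layers.
   Comparators within a layer touch disjoint positions. */
void sort4_network(int *v) {
    cswap(&v[0], &v[1]); cswap(&v[2], &v[3]);   /* layer 1 */
    cswap(&v[0], &v[2]); cswap(&v[1], &v[3]);   /* layer 2 */
    cswap(&v[1], &v[2]);                        /* layer 3 */
}

/* Zero-one principle: a comparator network sorts every input if it
   sorts every input made of 0s and 1s. Returns 1 if all 16 binary
   inputs of length 4 come out sorted, 0 otherwise. */
int sorts_all_01_inputs(void (*sort4)(int *)) {
    for (int bits = 0; bits < 16; ++bits) {
        int v[4];
        for (int i = 0; i < 4; ++i)
            v[i] = (bits >> i) & 1;
        sort4(v);
        for (int i = 0; i < 3; ++i)
            if (v[i] > v[i + 1])
                return 0;
    }
    return 1;
}

/* Checks both routines against all binary inputs. */
void self_test(void) {
    assert(sorts_all_01_inputs(sort4_insertion));
    assert(sorts_all_01_inputs(sort4_network));
}
```

    In a Quicksort library, a routine such as sort4_network would be called once a partition has shrunk to exactly four elements. Which variant is faster on a given machine is an empirical question; the sketch only shows the structural difference between the two.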



    Information

    Published In

    Formal Aspects of Computing  Volume 29, Issue 3
    May 2017
    196 pages
    ISSN:0934-5043
    EISSN:1433-299X

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Publication History

    Published: 01 May 2017
    Accepted: 04 October 2016
    Received: 03 April 2016
    Published in FAC Volume 29, Issue 3

    Author Tags

    1. Sorting algorithms
    2. Sorting networks
    3. Instruction-level parallelism
    4. Out-of-order execution

    Qualifiers

    • Research-article
