DOI: 10.1145/3471873.3472974
research-article

Improving GHC Haskell NUMA profiling

Published: 22 August 2021

Abstract

    As the number of cores increases, Non-Uniform Memory Access (NUMA) is becoming increasingly prevalent in general-purpose machines. Effectively exploiting NUMA can significantly reduce memory access latency, and thus runtime, by 10-20%, and profiling provides information on how to optimise. Language-level NUMA profilers are rare, and mostly profile conventional languages executing on virtual machines. Here we profile, and develop new NUMA profilers for, a functional language executing on a runtime system.
    We start by using existing OS- and language-level tools to systematically profile 8 benchmarks from the GHC Haskell nofib suite on a typical NUMA server (8 regions, 64 cores). We propose a new metric, NUMA access rate, that allows us to compare the load placed on the memory system by different programs, and use it to contrast the benchmarks. We demonstrate significant differences in NUMA usage between computational and data-intensive benchmarks, e.g. local memory access rates of 23% and 30%, respectively. We show that small changes to coordination behaviour can significantly alter NUMA usage, and for the first time quantify the effectiveness of the GHC 8.2 NUMA adaptation.
    We identify information not available from existing profilers and extend both the numaprof profiler and the GHC runtime system to obtain three new NUMA profiles: OS thread allocation locality, GC count (per region and generation), and GC thread locality. The new profiles not only provide a deeper understanding of program memory usage but also suggest ways that GHC can be adapted to better exploit NUMA architectures.

    Cited By

    • (2023) Scaling Up Performance of Managed Applications on NUMA Systems. Proceedings of the 2023 ACM SIGPLAN International Symposium on Memory Management, pages 1-14. DOI: 10.1145/3591195.3595270. Online publication date: 6-Jun-2023.

    Published In

    FHPNC 2021: Proceedings of the 9th ACM SIGPLAN International Workshop on Functional High-Performance and Numerical Computing
    August 2021
    49 pages
    ISBN:9781450386142
    DOI:10.1145/3471873
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. GHC
    2. Haskell
    3. NUMA
    4. Profiling

    Qualifiers

    • Research-article

    Funding Sources

    • UK EPSRC

    Conference

    ICFP '21

    Acceptance Rates

    Overall Acceptance Rate 18 of 25 submissions, 72%
