Speculative parallelization of sequential loops on multicores

Published: 01 October 2009

Abstract

The advent of multicores presents a promising opportunity for speeding up the execution of sequential programs through their parallelization. In this paper we present a novel solution for efficiently supporting software-based speculative parallelization of sequential loops on multicore processors. The execution model we employ is based upon state separation, an approach that maintains the speculative state of the parallel threads separately from the non-speculative state of the computation. If speculation is successful, the results produced by parallel threads in speculative state are committed by copying them into the computation's non-speculative state. If misspeculation is detected, no costly state recovery mechanisms are needed, as the speculative state can simply be discarded. Techniques are proposed to reduce the cost of data copying between non-speculative and speculative state and to carry out misspeculation detection efficiently. We apply this approach to the speculative parallelization of loops in several sequential programs, which results in significant speedups on a Dell PowerEdge 1900 server with two Intel Xeon quad-core processors.
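
The state-separation model described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the class and function names (`NonSpeculativeState`, `SpeculativeState`, `run_speculative`, `loop_body`) are hypothetical, the "threads" are simulated sequentially, and per-variable version numbers are an assumed way of detecting misspeculation, since the abstract does not say how detection is done. The sketch only shows the three steps the abstract names: copying values into a private speculative state, committing by copying results back, and discarding the speculative state on misspeculation.

```python
from dataclasses import dataclass, field


@dataclass
class SpeculativeState:
    """Private workspace of one speculative iteration (hypothetical layout)."""
    values: dict = field(default_factory=dict)         # local copies of variables
    versions_read: dict = field(default_factory=dict)  # version seen at copy-in
    written: set = field(default_factory=set)          # variables modified locally


class NonSpeculativeState:
    """Committed program state; versions are an assumed detection mechanism."""

    def __init__(self, initial):
        self.values = dict(initial)
        self.versions = {name: 0 for name in initial}

    def copy_in(self, spec, name):
        # Copy a variable into the speculative state on first use only.
        if name not in spec.values:
            spec.values[name] = self.values[name]
            spec.versions_read[name] = self.versions[name]
        return spec.values[name]

    def try_commit(self, spec):
        # Misspeculation: a variable that was copied in has since been
        # updated by an earlier iteration's commit.
        for name, seen in spec.versions_read.items():
            if self.versions[name] != seen:
                return False  # caller simply drops `spec`; nothing to undo
        # Success: copy the results into the non-speculative state.
        for name in spec.written:
            self.values[name] = spec.values[name]
            self.versions[name] += 1
        return True


def execute(state, body, i):
    """Run iteration i against a fresh speculative state and return it."""
    spec = SpeculativeState()

    def read(name):
        return state.copy_in(spec, name)

    def write(name, value):
        spec.values[name] = value
        spec.written.add(name)

    body(i, read, write)
    return spec


def run_speculative(state, body, n_iters, width):
    """Run iterations in batches of `width`, committing in sequential order."""
    misspeculations = 0
    for start in range(0, n_iters, width):
        batch = range(start, min(start + width, n_iters))
        # All iterations of the batch start from the same committed state.
        specs = [execute(state, body, i) for i in batch]
        for i, spec in zip(batch, specs):
            if not state.try_commit(spec):
                misspeculations += 1
                # Discard the stale state and re-execute; nothing else
                # commits in between, so this second attempt succeeds.
                state.try_commit(execute(state, body, i))
    return misspeculations


def loop_body(i, read, write):
    # Mostly independent work, plus an occasional cross-iteration dependence.
    x = read(f"a{i}")
    write(f"a{i}", x * 2)
    if i % 2 == 0:
        write("total", read("total") + x)


if __name__ == "__main__":
    initial = {f"a{i}": i + 1 for i in range(8)}
    initial["total"] = 0
    state = NonSpeculativeState(initial)
    missed = run_speculative(state, loop_body, n_iters=8, width=4)
    print(missed, state.values["total"])
```

In this sketch a failed commit costs only a re-execution of the affected iteration. The non-speculative state is never touched by a speculative iteration until its commit succeeds, which is why no rollback logic appears anywhere.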

    Published In

    International Journal of Parallel Programming, Volume 37, Issue 5
    October 2009
    103 pages

    Publisher

    Kluwer Academic Publishers

    United States

    Publication History

    Published: 01 October 2009
    Accepted: 31 May 2009
    Received: 12 January 2009

    Author Tags

    1. multicores
    2. profile-guided parallelization
    3. speculative parallelization
    4. state separation

    Cited By

    • (2024) A new thread-level speculative automatic parallelization model and library based on duplicate code execution. The Journal of Supercomputing 80:10 (13714-13737). DOI: 10.1007/s11227-024-05987-0. Online publication date: 1-Jul-2024
    • (2016) A Survey on Thread-Level Speculation Techniques. ACM Computing Surveys 49:2 (1-39). DOI: 10.1145/2938369. Online publication date: 30-Jun-2016
    • (2016) New Data Structures to Handle Speculative Parallelization at Runtime. International Journal of Parallel Programming 44:3 (407-426). DOI: 10.1007/s10766-014-0347-0. Online publication date: 1-Jun-2016
    • (2012) Dynamic trace-based analysis of vectorization potential of applications. ACM SIGPLAN Notices 47:6 (371-382). DOI: 10.1145/2345156.2254108. Online publication date: 11-Jun-2012
    • (2012) Dynamic trace-based analysis of vectorization potential of applications. Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (371-382). DOI: 10.1145/2254064.2254108. Online publication date: 11-Jun-2012
    • (2011) Enhanced speculative parallelization via incremental recovery. ACM SIGPLAN Notices 46:8 (189-200). DOI: 10.1145/2038037.1941580. Online publication date: 12-Feb-2011
    • (2011) Enhanced speculative parallelization via incremental recovery. Proceedings of the 16th ACM symposium on Principles and practice of parallel programming (189-200). DOI: 10.1145/1941553.1941580. Online publication date: 12-Feb-2011
    • (2010) Speculative parallelization using state separation and multiple value prediction. ACM SIGPLAN Notices 45:8 (63-72). DOI: 10.1145/1837855.1806663. Online publication date: 5-Jun-2010
    • (2010) Supporting speculative parallelization in the presence of dynamic data structures. ACM SIGPLAN Notices 45:6 (62-73). DOI: 10.1145/1809028.1806604. Online publication date: 5-Jun-2010
    • (2010) Speculative parallelization using state separation and multiple value prediction. Proceedings of the 2010 international symposium on Memory management (63-72). DOI: 10.1145/1806651.1806663. Online publication date: 5-Jun-2010
