On-the-Fly Principled Speculation for FSM Parallelization

Published: 14 March 2015

    Abstract

    The finite state machine (FSM) is the backbone of an important class of applications in many domains. Parallelizing it has been extremely difficult because of the strong dependences inherent in the computation: each state transition depends on the state produced by the previous one. Recently, principled speculation has shown good promise for solving the problem. However, its reliance on offline training makes the approach inconvenient to adopt and hard to apply to many practical FSM applications, which often deal with a large variety of inputs that differ from the training inputs. This work presents an assembly of techniques that completely removes the need for offline training. The techniques include a set of theoretical results on inherent properties of FSMs and two newly designed dynamic optimizations for efficient FSM characterization. The new techniques, for the first time, make principled speculation applicable on the fly and enable swift, automatic configuration of speculative parallelization to best suit a given FSM and its current input. They eliminate the fundamental barrier to the practical adoption of principled speculation for FSM parallelization. Experiments show that the new techniques give significantly higher speedups for some difficult FSM applications in the presence of input changes.
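To make the idea of speculative FSM parallelization concrete, the following is a minimal Python sketch of one generic scheme: split the input into chunks, guess each chunk's starting state by looking back at the tail of the previous chunk, process the chunks concurrently, and then validate the guesses in order, reprocessing any chunk whose guess was wrong. It is not the paper's on-the-fly method. The DFA in `TABLE`, the function names, and the `lookback` and `n_chunks` parameters are all assumptions made for illustration, and Python threads are used only to show the structure, since they do not give real CPU parallelism here.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical DFA over {'a', 'b'} (an assumption for illustration only).
# State 0: other, state 1: last symbol was 'a', state 2: last two were "ab".
TABLE = {
    0: {"a": 1, "b": 0},
    1: {"a": 1, "b": 2},
    2: {"a": 1, "b": 0},
}


def run(state, text):
    """Run the DFA sequentially from `state` over `text`; return the end state."""
    for ch in text:
        state = TABLE[state][ch]
    return state


def guess_start(prev_chunk, lookback):
    """Speculate the state at the end of `prev_chunk`.

    Every state is run over the last `lookback` symbols, and the most
    common end state is taken as the guess.
    """
    suffix = prev_chunk[-lookback:] if lookback else ""
    ends = Counter(run(s, suffix) for s in TABLE)
    return ends.most_common(1)[0][0]


def speculative_run(text, n_chunks=4, lookback=2, start=0):
    """Return (final_state, number_of_misspeculated_chunks)."""
    size = max(1, -(-len(text) // n_chunks))  # ceiling division
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    if not chunks:
        return start, 0

    # The first chunk starts from the true state; the rest start from guesses.
    guesses = [start] + [
        guess_start(chunks[i - 1], lookback) for i in range(1, len(chunks))
    ]

    # Process all chunks concurrently from their (speculated) start states.
    with ThreadPoolExecutor() as pool:
        ends = list(pool.map(run, guesses, chunks))

    # Validate in order: a chunk's result is kept only if its guess matches
    # the true state reached so far; otherwise the chunk is reprocessed.
    state, misses = start, 0
    for guess, chunk, end in zip(guesses, chunks, ends):
        if guess == state:
            state = end
        else:
            misses += 1
            state = run(state, chunk)
    return state, misses


if __name__ == "__main__":
    text = "abbabaabab" * 3
    print(speculative_run(text), run(0, text))
```

The validation pass is what keeps the result correct regardless of guess quality: a poor guess costs extra sequential work but never changes the final state. How good the guesses are depends on the FSM and on the input, which is why configuring the speculation for a given FSM and input matters.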

      Information & Contributors

      Information

      Published In

      ACM SIGPLAN Notices, Volume 50, Issue 4
      ASPLOS '15
      April 2015
      676 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/2775054
      Editor: Andy Gill

      ASPLOS '15: Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems
      March 2015
      720 pages
      ISBN: 9781450328357
      DOI: 10.1145/2694344

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 14 March 2015
      Published in SIGPLAN Volume 50, Issue 4

      Author Tags

      1. DFA
      2. FSM
      3. finite state machine
      4. multicore
      5. online profiling
      6. speculative parallelization

      Qualifiers

      • Research-article

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 105
      • Downloads (Last 6 weeks): 22

      Citations

      Cited By

      • (2023) Harry: A Scalable SIMD-based Multi-literal Pattern Matching Engine for Deep Packet Inspection. IEEE INFOCOM 2023 - IEEE Conference on Computer Communications, pp. 1-10. DOI: 10.1109/INFOCOM53939.2023.10229022. Online publication date: 17-May-2023
      • (2023) SimdFSM: An Adaptive Vectorization of Finite State Machines for Speculative Execution. Parallel and Distributed Computing, Applications and Technologies, pp. 481-493. DOI: 10.1007/978-3-031-29927-8_37. Online publication date: 8-Apr-2023
      • (2020) Scaling out speculative execution of finite-state machines with parallel merge. Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 160-172. DOI: 10.1145/3332466.3374524. Online publication date: 19-Feb-2020
      • (2019) Enabling prefix sum parallelism pattern for recurrences with principled function reconstruction. Proceedings of the 28th International Conference on Compiler Construction, pp. 17-28. DOI: 10.1145/3302516.3307354. Online publication date: 16-Feb-2019
      • (2018) CSE. Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, pp. 29-41. DOI: 10.1109/MICRO.2018.00012. Online publication date: 20-Oct-2018
      • (2024) ngAP: Non-blocking Large-scale Automata Processing on GPUs. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, pp. 268-285. DOI: 10.1145/3617232.3624848. Online publication date: 27-Apr-2024
      • (2023) Asynchronous Automata Processing on GPUs. Proceedings of the ACM on Measurement and Analysis of Computing Systems, 7(1), pp. 1-27. DOI: 10.1145/3579453. Online publication date: 2-Mar-2023
      • (2021) Scalable FSM parallelization via path fusion and higher-order speculation. Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 887-901. DOI: 10.1145/3445814.3446705. Online publication date: 19-Apr-2021
      • (2021) Plex: Scaling Parallel Lexing with Backtrack-Free Prescanning. 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 693-702. DOI: 10.1109/IPDPS49936.2021.00079. Online publication date: May-2021
      • (2020) Reliability Analysis for Unreliable FSM Computations. ACM Transactions on Architecture and Code Optimization, 17(2), pp. 1-23. DOI: 10.1145/3377456. Online publication date: 29-May-2020
