DOI: 10.1145/1863543.1863594
research-article

A play on regular expressions: functional pearl

Published: 27 September 2010

Abstract

Cody, Hazel, and Theo, two experienced Haskell programmers and an expert in automata theory, develop an elegant Haskell program for matching regular expressions: (i) the program is purely functional; (ii) it is overloaded over arbitrary semirings, which not only solves the ordinary matching problem but also supports other applications such as computing leftmost longest matchings or the number of matchings, all with a single algorithm; (iii) it is more powerful than other matchers, as it can parse every context-free language by taking advantage of laziness.
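To make point (ii) concrete, here is a minimal sketch of what "overloaded over arbitrary semirings" can look like. It is a naive, exponential-time specification rather than the efficient program the abstract describes, and every name in it (`Semiring`, `Reg`, `sym`, `weight`, `twoStars`) is an illustrative assumption, not a quotation from the paper.

```haskell
-- Assumed interface: a semiring has an "addition" and a "multiplication"
-- with identities zero and one.
infixl 6 <+>
infixl 7 <.>

class Semiring s where
  zero, one    :: s
  (<+>), (<.>) :: s -> s -> s

-- Bool answers "does the word match?"
instance Semiring Bool where
  zero  = False
  one   = True
  (<+>) = (||)
  (<.>) = (&&)

-- Int answers "in how many ways does the word match?"
instance Semiring Int where
  zero  = 0
  one   = 1
  (<+>) = (+)
  (<.>) = (*)

-- Regular expressions whose symbols map a character to a weight.
data Reg c s
  = Eps
  | Sym (c -> s)
  | Alt (Reg c s) (Reg c s)
  | Seq (Reg c s) (Reg c s)
  | Rep (Reg c s)

-- A symbol that matches exactly one given character.
sym :: (Semiring s, Eq c) => c -> Reg c s
sym c = Sym (\x -> if x == c then one else zero)

-- All ways to cut a word into a prefix and a suffix.
split :: [a] -> [([a], [a])]
split []     = [([], [])]
split (c:cs) = ([], c:cs) : [(c:s1, s2) | (s1, s2) <- split cs]

-- All ways to cut a word into a sequence of non-empty parts.
parts :: [a] -> [[[a]]]
parts []     = [[]]
parts [c]    = [[[c]]]
parts (c:cs) = concat [[(c:p):ps, [c]:p:ps] | p:ps <- parts cs]

sumS, prodS :: Semiring s => [s] -> s
sumS  = foldr (<+>) zero
prodS = foldr (<.>) one

-- Weight of a word: sum over all ways of matching, product along each way.
weight :: Semiring s => Reg c s -> [c] -> s
weight Eps       u = if null u then one else zero
weight (Sym f)   u = case u of
                       [c] -> f c
                       _   -> zero
weight (Alt p q) u = weight p u <+> weight q u
weight (Seq p q) u = sumS [weight p u1 <.> weight q u2 | (u1, u2) <- split u]
weight (Rep r)   u = sumS [prodS [weight r ui | ui <- ps] | ps <- parts u]

-- Example expression a*a*, usable at any semiring.
twoStars :: Semiring s => Reg Char s
twoStars = Seq (Rep (sym 'a')) (Rep (sym 'a'))
```

With these definitions, `weight twoStars "aa" :: Bool` evaluates to `True`, while `weight twoStars "aa" :: Int` evaluates to `3`, one for each way of dividing the two characters between the two starred subexpressions. The same `weight` function serves both questions; only the semiring instance changes.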
The program is based on an old technique for turning regular expressions into finite automata, which makes it efficient in terms of both worst-case time and space bounds and actual performance: despite its simplicity, the Haskell implementation can compete with a recently published professional C++ program for the same problem.
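The author tags for this article name the Glushkov construction as the technique in question. The following sketch shows one way that idea can be expressed without building an automaton explicitly: each symbol in the expression carries a mark, and reading a character shifts the marks. The names (`Marked`, `nullable`, `final`, `shift`, `match`) are assumptions chosen for illustration, and this Boolean-only version recomputes `nullable` and `final` on every step, so it should not be read as the tuned implementation the abstract refers to.

```haskell
-- A regular expression in which every symbol carries a mark.
-- A marked symbol plays the role of an active automaton state.
data Marked
  = MEps
  | MSym Bool Char
  | MAlt Marked Marked
  | MSeq Marked Marked
  | MRep Marked

-- Does the expression accept the empty word?
nullable :: Marked -> Bool
nullable MEps       = True
nullable (MSym _ _) = False
nullable (MAlt p q) = nullable p || nullable q
nullable (MSeq p q) = nullable p && nullable q
nullable (MRep _)   = True

-- Is some marked symbol in a position where the match could end?
final :: Marked -> Bool
final MEps       = False
final (MSym b _) = b
final (MAlt p q) = final p || final q
final (MSeq p q) = (final p && nullable q) || final q
final (MRep r)   = final r

-- Read one character. The flag m says whether a mark may enter
-- the expression from the left.
shift :: Bool -> Marked -> Char -> Marked
shift _ MEps       _ = MEps
shift m (MSym _ x) c = MSym (m && x == c) x
shift m (MAlt p q) c = MAlt (shift m p c) (shift m q c)
shift m (MSeq p q) c =
  MSeq (shift m p c) (shift ((m && nullable p) || final p) q c)
shift m (MRep r)   c = MRep (shift (m || final r) r c)

-- Feed the word character by character; a mark enters only at the start.
match :: Marked -> String -> Bool
match r []     = nullable r
match r (c:cs) = final (foldl (shift False) (shift True r c) cs)

-- Example expression (a|b)*c with all marks initially cleared.
abStarC :: Marked
abStarC = MSeq (MRep (MAlt (MSym False 'a') (MSym False 'b'))) (MSym False 'c')
```

Here `match abStarC "abac"` is `True` and `match abStarC "abca"` is `False`. The expression never grows while matching, which is what bounds the space used; storing the results of `nullable` and `final` in the nodes would be the natural next step toward a single traversal per input character.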

Supplementary Material

JPG File (icfp-weds-1705-fischer.jpg)
MOV File (icfp-weds-1705-fischer.mov)

Published In

ICFP '10: Proceedings of the 15th ACM SIGPLAN international conference on Functional programming
September 2010
398 pages
ISBN: 9781605587943
DOI: 10.1145/1863543

ACM SIGPLAN Notices, Volume 45, Issue 9
ICFP '10
September 2010
382 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/1932681
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. finite automata
  2. glushkov construction
  3. purely functional programming
  4. regular expressions

Qualifiers

  • Research-article

Conference

ICFP '10

Acceptance Rates

Overall Acceptance Rate 333 of 1,064 submissions, 31%
