DReX: A Declarative Language for Efficiently Evaluating Regular String Transformations

Published: 14 January 2015

Abstract

We present DReX, a declarative language that can express all regular string-to-string transformations, and can still be efficiently evaluated. The class of regular string transformations has a robust theoretical foundation including multiple characterizations, closure properties, and decidable analysis questions, and admits a number of string operations such as insertion, deletion, substring swap, and reversal. Recent research has led to a characterization of regular string transformations using a primitive set of function combinators analogous to the definition of regular languages using regular expressions. While these combinators form the basis for the language DReX proposed in this paper, our main technical focus is on the complexity of evaluating the output of a DReX program on a given input string. It turns out that the natural evaluation algorithm involves dynamic programming, leading to complexity that is cubic in the length of the input string. Our main contribution is identifying a consistency restriction on the use of combinators in DReX programs, and a single-pass evaluation algorithm for consistent programs with time complexity that is linear in the length of the input string and polynomial in the size of the program. We show that the consistency restriction does not limit the expressiveness, and that whether a DReX program is consistent can be checked efficiently. We report on a prototype implementation, and evaluate it using a representative set of text processing tasks.
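
To make the idea of function combinators more concrete, the following is a minimal Python sketch, not DReX itself. The names `symbol`, `split`, and `iterate`, and the `swap` and `reverse` flags, are hypothetical and chosen only for illustration; they are not the paper's syntax. Each combinator returns a partial function on strings that is undefined (`None`) when the input cannot be parsed in exactly one way. The evaluator is deliberately naive: it tries every split point, so its running time grows faster than linearly in the input length. It is not the single-pass algorithm the abstract describes.

```python
from typing import Callable, Optional

# A transformation maps an input string to an output string,
# or to None when it is undefined on that input.
Transform = Callable[[str], Optional[str]]


def symbol(pred: Callable[[str], bool], out: Callable[[str], str]) -> Transform:
    """Defined only on single characters satisfying pred; emits out(char)."""
    def run(s: str) -> Optional[str]:
        return out(s) if len(s) == 1 and pred(s) else None
    return run


def split(f: Transform, g: Transform, swap: bool = False) -> Transform:
    """Split s = u + v with f defined on u and g defined on v.

    Defined only when exactly one such split exists.
    With swap=True the output of g is emitted before the output of f.
    """
    def run(s: str) -> Optional[str]:
        results = []
        for i in range(len(s) + 1):  # naive: try every split point
            a, b = f(s[:i]), g(s[i:])
            if a is not None and b is not None:
                results.append(b + a if swap else a + b)
        return results[0] if len(results) == 1 else None
    return run


def iterate(f: Transform, reverse: bool = False) -> Transform:
    """Cut s into consecutive non-empty chunks, each in the domain of f.

    Defined only when the decomposition is unique.
    With reverse=True the chunk outputs are emitted in reverse order.
    """
    def run(s: str) -> Optional[str]:
        n = len(s)
        # parses[i] holds at most two outputs for the suffix s[i:];
        # two entries mean that suffix can be parsed ambiguously.
        parses = [[] for _ in range(n + 1)]
        parses[n] = [""]
        for i in range(n - 1, -1, -1):
            for j in range(i + 1, n + 1):
                head = f(s[i:j])
                if head is None:
                    continue
                for tail in parses[j]:
                    parses[i].append(tail + head if reverse else head + tail)
            parses[i] = parses[i][:2]
        return parses[0][0] if len(parses[0]) == 1 else None
    return run


# Building blocks (all hypothetical examples).
letter = symbol(str.isalpha, lambda c: c)      # copy one letter
sep = symbol(str.isspace, lambda c: ", ")      # rewrite a space as ", "
word = iterate(letter)                         # copy a run of letters

# Reversal: emit the characters in reverse order.
reverse_all = iterate(symbol(lambda c: True, lambda c: c), reverse=True)

# Substring swap with insertion: "first last" becomes "last, first".
swap_names = split(word, split(sep, word, swap=True), swap=True)

print(reverse_all("abc"))
print(swap_names("first last"))
print(split(word, word)("ab"))  # ambiguous split, so undefined
```

The last call shows why unambiguous parsing matters in this sketch: `split(word, word)` can cut `"ab"` in three ways, so the result is left undefined rather than picking one arbitrarily.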

Supplementary Material

MPG File (p125-sidebyside.mpg)

Published In

ACM SIGPLAN Notices, Volume 50, Issue 1
POPL '15
January 2015
682 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/2775051
Editor: Andy Gill

POPL '15: Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages
January 2015
716 pages
ISBN:9781450333009
DOI:10.1145/2676726

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published in SIGPLAN Volume 50, Issue 1

Author Tags

  1. declarative languages
  2. drex
  3. string transformations

Qualifiers

  • Research-article

Cited By

  • (2022) Data Transformation Acceleration using Deterministic Finite-State Transducers. 2022 IEEE International Conference on Big Data (Big Data), pages 141-150. DOI: 10.1109/BigData55660.2022.10020756. Online publication date: 17-Dec-2022
  • (2021) SD-regular transducer expressions for aperiodic transformations. Proceedings of the 36th Annual ACM/IEEE Symposium on Logic in Computer Science, pages 1-13. DOI: 10.1109/LICS52264.2021.9470738. Online publication date: 29-Jun-2021
  • (2019) Modular Descriptions of Regular Functions. Algebraic Informatics, pages 3-9. DOI: 10.1007/978-3-030-21363-3_1. Online publication date: 24-May-2019
  • (2017) Forward Bisimulations for Nondeterministic Symbolic Finite Automata. Proceedings, Part I, of the 23rd International Conference on Tools and Algorithms for the Construction and Analysis of Systems - Volume 10205, pages 518-534. DOI: 10.1007/978-3-662-54577-5_30. Online publication date: 22-Apr-2017
  • (2017) The Power of Symbolic Automata and Transducers. Computer Aided Verification, pages 47-67. DOI: 10.1007/978-3-319-63387-9_3. Online publication date: 13-Jul-2017
  • (2016) Regular Programming for Quantitative Properties of Data Streams. Programming Languages and Systems, pages 15-40. DOI: 10.1007/978-3-662-49498-1_2. Online publication date: 2016
  • (2021) A Survey on String Constraint Solving. ACM Computing Surveys 55:1, pages 1-38. DOI: 10.1145/3484198. Online publication date: 23-Nov-2021
  • (2020) Streamable regular transductions. Theoretical Computer Science 807:C, pages 15-41. DOI: 10.1016/j.tcs.2019.11.018. Online publication date: 6-Feb-2020

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media