research-article
Open access

Balancing Straight-line Programs

Published: 30 June 2021

Abstract

We show that a context-free grammar of size m that produces a single string w of length n (such a grammar is also called a string straight-line program) can be transformed in linear time into a context-free grammar for w of size O(m), whose unique derivation tree has depth O(log n). This solves an open problem in the area of grammar-based compression, improves many results in this area, and greatly simplifies many existing constructions. Similar results are shown for two formalisms for grammar-based tree compression: top dags and forest straight-line programs. These balancing results can all be deduced from a single meta-theorem stating that the depth of an algebraic circuit over an algebra with a certain finite base property can be reduced to O(log n) at the cost of a constant multiplicative size increase. Here, n refers to the size of the unfolding (or unravelling) of the circuit. In particular, this result applies to standard arithmetic circuits over (noncommutative) semirings.
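
To make the objects in the abstract concrete, the following is a minimal sketch of a string straight-line program and of why the depth of its derivation tree matters. It is not the balancing construction from the paper. The grammar `RULES` and the helper names `length`, `depth`, `access`, and `expand` are hypothetical and chosen only for illustration. Random access follows a single root-to-leaf path, so its cost is bounded by the depth of the derivation tree; a depth logarithmic in the length of the derived string therefore bounds the number of steps per access.

```python
from functools import lru_cache

# Hypothetical toy SLP (an assumption for illustration, not from the paper).
# Each nonterminal maps either to a single character (terminal rule)
# or to a pair of nonterminals (binary rule X -> Y Z).
RULES = {
    "A": "a",
    "B": "b",
    "X1": ("A", "B"),    # derives "ab"
    "X2": ("X1", "A"),   # derives "aba"
    "X3": ("X2", "X1"),  # derives "abaab"
    "X4": ("X3", "X2"),  # derives "abaababa"
}


@lru_cache(maxsize=None)
def length(sym):
    """Length of the string derived from sym."""
    rule = RULES[sym]
    if isinstance(rule, str):
        return 1
    left, right = rule
    return length(left) + length(right)


@lru_cache(maxsize=None)
def depth(sym):
    """Depth of the derivation tree rooted at sym (a terminal rule has depth 0)."""
    rule = RULES[sym]
    if isinstance(rule, str):
        return 0
    left, right = rule
    return 1 + max(depth(left), depth(right))


def access(sym, i):
    """Return character i (0-based) of the derived string without expanding it.

    The loop follows one root-to-leaf path, so it descends at most
    depth(sym) binary rules before reaching a terminal rule.
    """
    if not 0 <= i < length(sym):
        raise IndexError(i)
    while True:
        rule = RULES[sym]
        if isinstance(rule, str):
            return rule
        left, right = rule
        if i < length(left):
            sym = left
        else:
            i -= length(left)
            sym = right


def expand(sym):
    """Fully expand sym into its string; only sensible for small examples."""
    rule = RULES[sym]
    if isinstance(rule, str):
        return rule
    return expand(rule[0]) + expand(rule[1])
```

For this toy grammar, `expand("X4")` gives `"abaababa"`, `depth("X4")` is 4, and `access("X4", 5)` returns `"a"` after descending through `X2`, `X1`, and `A`.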


Published In

Journal of the ACM  Volume 68, Issue 4
August 2021
297 pages
ISSN:0004-5411
EISSN:1557-735X
DOI:10.1145/3468065
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 June 2021
Accepted: 01 March 2021
Revised: 01 February 2021
Received: 01 June 2020
Published in JACM Volume 68, Issue 4

Author Tags

  1. Grammar-based compression
  2. balancing
  3. straight-line programs
  4. random access problem

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • National Science Centre, Poland
  • DFG research project

Article Metrics

  • Downloads (Last 12 months): 263
  • Downloads (Last 6 weeks): 67
Reflects downloads up to 03 Oct 2024

Cited By

  • (2024) Enumeration for MSO-Queries on Compressed Trees. Proceedings of the ACM on Management of Data 2:2, pp. 1-17. DOI: 10.1145/3651141. Online publication date: 14-May-2024
  • (2024) Faster Maximal Exact Matches with Lazy LCP Evaluation. 2024 Data Compression Conference (DCC), pp. 123-132. DOI: 10.1109/DCC58796.2024.00020. Online publication date: 19-Mar-2024
  • (2024) Revisiting the Folklore Algorithm for Random Access to Grammar-Compressed Strings. String Processing and Information Retrieval, pp. 88-101. DOI: 10.1007/978-3-031-72200-4_7. Online publication date: 19-Sep-2024
  • (2024) Generalization of Repetitiveness Measures for Two-Dimensional Strings. String Processing and Information Retrieval, pp. 57-72. DOI: 10.1007/978-3-031-72200-4_5. Online publication date: 19-Sep-2024
  • (2024) Space-Efficient SLP Encoding for O(log N)-Time Random Access. String Processing and Information Retrieval, pp. 336-347. DOI: 10.1007/978-3-031-72200-4_25. Online publication date: 19-Sep-2024
  • (2024) Computing String Covers in Sublinear Time. String Processing and Information Retrieval, pp. 272-288. DOI: 10.1007/978-3-031-72200-4_21. Online publication date: 19-Sep-2024
  • (2024) How to Find Long Maximal Exact Matches and Ignore Short Ones. Developments in Language Theory, pp. 131-140. DOI: 10.1007/978-3-031-66159-4_10. Online publication date: 12-Aug-2024
  • (2023) Homomorphic Compression: Making Text Processing on Compression Unlimited. Proceedings of the ACM on Management of Data 1:4, pp. 1-28. DOI: 10.1145/3626765. Online publication date: 12-Dec-2023
  • (2023) Collapsing the Hierarchy of Compressed Data Structures: Suffix Arrays in Optimal Compressed Space. 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pp. 1877-1886. DOI: 10.1109/FOCS57990.2023.00114. Online publication date: 6-Nov-2023
  • (2022) Survey of Grammar-Based Data Structure Compression. IEEE BITS the Information Theory Magazine 2:2, pp. 19-35. DOI: 10.1109/MBITS.2022.3210891. Online publication date: 1-Nov-2022
