Abstract
In this paper we study the number \(r_{\texttt {bwt}}\) of equal-letter runs produced by the Burrows-Wheeler transform (BWT) when it is applied to purely morphic finite words, that is, words generated by iterating prolongable morphisms. The parameter \(r_{\texttt {bwt}}\) is significant because it measures the performance of the BWT in terms of both compressibility and indexing. In particular, we prove that, when the BWT is applied to any purely morphic finite word over a binary alphabet, \(r_{\texttt {bwt}}\) is \(\mathcal {O}(\log n)\), where \(n\) is the length of the word. Moreover, we prove that \(r_{\texttt {bwt}}\) is \(\varTheta (\log n)\) for the binary words generated by a large class of prolongable binary morphisms. These bounds are proved by establishing new structural properties of the bispecial circular factors of such words.
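As a rough illustration of the quantity studied here, the following Python sketch computes \(r_{\texttt {bwt}}\) for finite words obtained by iterating a binary prolongable morphism. It is not taken from the paper. It assumes the BWT is defined through the sorted cyclic rotations of the word (no end-of-string marker), and it uses a naive quadratic-space construction that is only suitable for short words. The function names and the choice of the Thue-Morse and Fibonacci morphisms as examples are ours.

```python
def iterate_morphism(morphism, letter, k):
    """Apply the morphism k times, starting from a single letter."""
    word = letter
    for _ in range(k):
        word = "".join(morphism[c] for c in word)
    return word


def bwt(word):
    """Naive BWT: last column of the sorted cyclic rotations.

    Assumption: no end-of-string marker is appended.
    """
    n = len(word)
    rotations = sorted(word[i:] + word[:i] for i in range(n))
    return "".join(rot[-1] for rot in rotations)


def count_runs(s):
    """Number of maximal equal-letter runs in s."""
    if not s:
        return 0
    return 1 + sum(1 for a, b in zip(s, s[1:]) if a != b)


def r_bwt(word):
    """Number of equal-letter runs in the BWT of word."""
    return count_runs(bwt(word))


# Example morphisms (illustrative choices, both prolongable on 'a').
THUE_MORSE = {"a": "ab", "b": "ba"}
FIBONACCI = {"a": "ab", "b": "a"}

# Print word length and r_bwt for the first few Thue-Morse iterates.
for k in range(1, 8):
    w = iterate_morphism(THUE_MORSE, "a", k)
    print(k, len(w), r_bwt(w))
```

For the Thue-Morse iterates the word length doubles at each step, so comparing the second and third printed columns gives an informal view of how slowly \(r_{\texttt {bwt}}\) grows relative to \(n\).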
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Frosini, A., Mancini, I., Rinaldi, S., Romana, G., Sciortino, M. (2022). Logarithmic Equal-Letter Runs for BWT of Purely Morphic Words. In: Diekert, V., Volkov, M. (eds) Developments in Language Theory. DLT 2022. Lecture Notes in Computer Science, vol 13257. Springer, Cham. https://doi.org/10.1007/978-3-031-05578-2_11
DOI: https://doi.org/10.1007/978-3-031-05578-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-05577-5
Online ISBN: 978-3-031-05578-2
eBook Packages: Computer Science (R0)