Abstract
bzip is a program written by Julian Seward that is often used under Unix to compress single files. It splits a file into blocks, which are compressed individually using a combination of the Burrows-Wheeler transformation, the move-to-front algorithm, Huffman coding, and run-length encoding. The author himself stated that damaged compressed blocks, i.e., blocks of which parts are lost, are essentially non-recoverable. This paper gives a formal proof that this is indeed true: focusing on the Burrows-Wheeler transformation, the problem of completing a transformed string such that the decoded string obeys certain file format restrictions is NP-hard.
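To make the first two stages of the pipeline concrete, the following is a minimal Python sketch of a textbook Burrows-Wheeler transformation, its inverse, and move-to-front coding. It is an illustration under stated assumptions, not the bzip implementation: the function names, the `"\0"` end-of-string sentinel, and the naive rotation sort are choices made here for readability. Real compressors typically avoid the sentinel and store the position of the original rotation instead.

```python
def bwt(text: str, sentinel: str = "\0") -> str:
    """Naive Burrows-Wheeler transformation (illustrative only).

    Appends a sentinel (an assumption of this sketch), sorts all
    rotations, and returns the last column of the sorted table.
    """
    assert sentinel not in text, "sentinel must not occur in the input"
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(last: str, sentinel: str = "\0") -> str:
    """Invert the transformation by rebuilding the sorted rotation table.

    Each round prepends the last column to the table and re-sorts it.
    Every step uses the complete transformed string.
    """
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    # The original string is the rotation that ends with the sentinel.
    row = next(r for r in table if r.endswith(sentinel))
    return row[:-1]


def move_to_front(data: str, alphabet: list[str]) -> list[int]:
    """Replace each symbol by its index in a self-organising list."""
    symbols = list(alphabet)
    out = []
    for ch in data:
        idx = symbols.index(ch)
        out.append(idx)
        # Move the symbol just seen to the front of the list.
        symbols.insert(0, symbols.pop(idx))
    return out


if __name__ == "__main__":
    transformed = bwt("banana")
    print(repr(transformed))
    print(inverse_bwt(transformed))
    print(move_to_front(transformed, sorted(set(transformed))))
```

In this sketch the inverse re-sorts the whole last column in every round, so each decoded character depends on all characters of the transformed string. This gives some intuition for why a transformed block with missing parts cannot simply be decoded around the gap.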
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hundt, C., Ochsenfahrt, U. (2008). Damaged BZip Files Are Difficult to Repair. In: Hu, X., Wang, J. (eds) Computing and Combinatorics. COCOON 2008. Lecture Notes in Computer Science, vol 5092. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69733-6_2
DOI: https://doi.org/10.1007/978-3-540-69733-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69732-9
Online ISBN: 978-3-540-69733-6
eBook Packages: Computer Science, Computer Science (R0)