Abstract
Obfuscation is used to protect programs from analysis and reverse engineering. Theoretically effective and resistant obfuscation methods exist, but most of them have not yet been implemented in practice. The main obstacles are the large execution overhead of obfuscated code and applicability limited to a specific class of programs. At the same time, a large number of obfuscation methods have been developed and are applied in practice. Existing approaches to assessing such methods rely mainly on the static characteristics of programs. A comprehensive justification of their effectiveness and resistance, one that also takes the dynamic characteristics of programs into account, is therefore a relevant task. Such a justification can plausibly be made using machine learning methods based on feature vectors that describe both static and dynamic characteristics of programs. In this paper, it is proposed to build such a vector from the characteristics of a pair of compared programs: original and obfuscated, original and deobfuscated, or obfuscated and deobfuscated. To obtain the dynamic characteristics of a program, a scheme based on symbolic execution is constructed and presented. The choice of symbolic execution is justified by the fact that its characteristics can describe how difficult the program is to comprehend under dynamic analysis. Two implementations of the scheme are proposed: extended and simplified. The extended scheme is closer to how an analyst examines a program, since it includes disassembly and translation into intermediate code; the simplified scheme omits these steps. To identify the characteristics of symbolic execution that are suitable for assessing the effectiveness and resistance of obfuscation with machine learning methods, experiments with the developed schemes were carried out. Based on the results obtained, a set of suitable characteristics is determined.
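The idea of a feature vector built from a pair of compared programs can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the characteristic names in `FEATURES`, the function `pair_feature_vector`, the sample numbers, and the choice of relative change as the comparison. The abstract does not state which characteristics or which comparison the authors use.

```python
from typing import Dict, List

# Hypothetical characteristic names (two static, two dynamic); the abstract
# does not list the actual characteristics selected by the authors.
FEATURES = ["instructions", "basic_blocks", "explored_paths", "solver_queries"]


def pair_feature_vector(a: Dict[str, float], b: Dict[str, float]) -> List[float]:
    """Return the relative change of each characteristic from program a to program b."""
    vector = []
    for name in FEATURES:
        base = a[name]
        # Guard against division by zero when the baseline value is 0.
        vector.append((b[name] - base) / base if base else 0.0)
    return vector


# Made-up measurements for an original program and its obfuscated version.
original = {"instructions": 200, "basic_blocks": 20,
            "explored_paths": 8, "solver_queries": 40}
obfuscated = {"instructions": 500, "basic_blocks": 60,
              "explored_paths": 32, "solver_queries": 100}

print(pair_feature_vector(original, obfuscated))
```

The same function could be applied to the other two pairs named in the abstract (original and deobfuscated, obfuscated and deobfuscated), giving one vector per pair as input to a learning method.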
Ethics declarations
The authors declare that they have no conflicts of interest.
Additional information
Translated by F. Baron
About this article
Cite this article
Borisov, P.D., Kosolapov, Y.V. On the Characteristics of Symbolic Execution in the Problem of Assessing the Quality of Obfuscating Transformations. Aut. Control Comp. Sci. 56, 595–605 (2022). https://doi.org/10.3103/S014641162207001X
DOI: https://doi.org/10.3103/S014641162207001X