research-article

Anything to Hide? Studying Minified and Obfuscated Code in the Web

Authors:

Philippe Skolka,

Cristian-Alexandru Staicu,

Michael PradelAuthors Info & Claims

WWW '19: The World Wide Web Conference

Pages 1735 - 1746

https://doi.org/10.1145/3308558.3313752

Published: 13 May 2019 Publication History

Get Access

Abstract

JavaScript has been used for various attacks on client-side web applications. To hinder both manual and automated analysis from detecting malicious scripts, code minification and code obfuscation may hide the behavior of a script. Unfortunately, little is currently known about how real-world websites use such code transformations. This paper presents an empirical study of obfuscation and minification in 967,149 scripts (424,023 unique) from the top 100,000 websites. The core of our study is a highly accurate (95%-100%) neural network-based classifier that we train to identify whether obfuscation or minification have been applied and if yes, using what tools. We find that code transformations are very widespread, affecting 38% of all scripts. Most of the transformed code has been minified, whereas advanced obfuscation techniques, such as encoding parts of the code or fetching all strings from a global array, affect less than 1% of all scripts (2,842 unique scripts in total). Studying which code gets obfuscated, we find that obfuscation is particularly common in certain website categories, e.g., adult content. Further analysis of the obfuscated code shows that most of it is similar to the output produced by a single obfuscation tool and that some obfuscated scripts trigger suspicious behavior, such as likely fingerprinting and timing attacks. Finally, we show that obfuscation comes at a cost, because it slows down execution and risks to produce code that changes the intended behavior. Overall, our study shows that the security community must consider minified and obfuscated JavaScript code, and it provides insights into what kinds of transformations to focus on. Our learned classifiers provide an automated and accurate way to identify obfuscated code, and we release a set of real-world obfuscated scripts for future research.

References

[1]

Ismail Adel AL-Taharwa, Hahn-Ming Lee, Albert B. Jeng, Kuo-Ping Wu, Cheng-Seen Ho, and Shyi-Ming Chen. 2015. JSOD: JavaScript Obfuscation Detector. Sec. and Commun. Netw. 8, 6 (April 2015), 1092-1107.

Abstract

References

Cited By

Recommendations

Obfuscated malware detection using API call dependency

Characterizing Obfuscated JavaScript Using Abstract Syntax Trees: Experimenting with Malicious Scripts

Metadata recovery from obfuscated programs using machine learning

Comments

Information

Published In

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

Get Access

Login options

Full Access

View options

PDF

eReader

HTML Format

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations