DOI: 10.1145/2336717.2336725

Hash-flow taint analysis of higher-order programs

Published: 15 June 2012


As web applications have grown in popularity, so have attacks on such applications. Cross-site scripting and injection attacks have become particularly problematic. Both vulnerabilities stem, at their core, from improper sanitization of user input.
We propose static taint analysis, which can verify the absence of unsanitized-input errors at compile time. Unfortunately, precise static analysis of modern scripting languages like Python is challenging: higher-order functions and complex control flow collide with opaque, dynamic data structures like hash maps and objects. The interdependence of data flow and control flow makes it hard to attain both soundness and precision.
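To make the difficulty concrete, the following is a small illustrative Python program, not taken from the paper. All names in it (`make_renderer`, `handlers`, `respond`) are hypothetical. It shows user input reaching HTML output through a closure that is stored in a hash map and selected by a key computed at run time, so an analysis must resolve data flow and control flow together to decide whether the output is sanitized.

```python
import html


def make_renderer(transform):
    """Return a closure; which `transform` it calls is known only via data flow."""
    def render(request):
        return "<p>" + transform(request["comment"]) + "</p>"
    return render


def identity(text):
    # Performs no sanitization at all.
    return text


# Handlers live in a hash map and are looked up with a runtime-computed key.
handlers = {
    "raw": make_renderer(identity),
    "safe": make_renderer(html.escape),
}


def respond(request):
    # The key itself comes from the request, so control flow depends on input.
    mode = request.get("mode", "safe")
    return handlers[mode](request)


payload = "<script>alert(1)</script>"
unsafe_page = respond({"comment": payload, "mode": "raw"})
safe_page = respond({"comment": payload})
```

Here `unsafe_page` contains the script tag verbatim while `safe_page` contains only its escaped form. Telling the two apart statically requires knowing which closure the map lookup returns.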
In this work, we apply abstract interpretation to sound and precise taint-style static analysis of scripting languages. We first define λH, a core calculus of modern scripting languages, with hash maps, dynamic objects, higher-order functions and first-class control. Then we derive a framework of k-CFA-like CESK-style abstract machines for statically reasoning about λH, but with hash maps factored into a "curried object store." The curried object store, together with shape analysis on this store, allows us to recover field sensitivity, even in the presence of dynamically modified fields. Lastly, atop this framework, we devise a taint-flow analysis, leveraging its field-sensitive, interprocedural and context-sensitive properties to soundly and precisely detect security vulnerabilities, such as XSS attacks in web applications.
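The value of field sensitivity can be sketched with a toy abstract store. This is not the paper's λH machinery. It assumes that "curried" means a store indexed first by object address and then by field name, which the abstract does not spell out. The two-point taint lattice and the names `CurriedStore`, `FlatStore` and `analyze` are likewise hypothetical.

```python
from enum import IntEnum


class Taint(IntEnum):
    """Two-point lattice: CLEAN is bottom, TAINTED is top."""
    CLEAN = 0
    TAINTED = 1


def join(a, b):
    # Least upper bound in the two-point lattice.
    return max(a, b)


class CurriedStore:
    """Assumed shape: address -> (field -> taint), so fields stay separate."""

    def __init__(self):
        self.objects = {}

    def write(self, addr, field, taint):
        fields = self.objects.setdefault(addr, {})
        # Weak update: join with whatever was already recorded for the field.
        fields[field] = join(fields.get(field, Taint.CLEAN), taint)

    def read(self, addr, field):
        return self.objects.get(addr, {}).get(field, Taint.CLEAN)


class FlatStore:
    """Field-insensitive baseline: one taint value per object address."""

    def __init__(self):
        self.objects = {}

    def write(self, addr, field, taint):
        self.objects[addr] = join(self.objects.get(addr, Taint.CLEAN), taint)

    def read(self, addr, field):
        return self.objects.get(addr, Taint.CLEAN)


def analyze(store):
    # Abstractly models:
    #   d = {}; d["name"] = user_input; d["title"] = "Home"; emit(d["title"])
    store.write("d", "name", Taint.TAINTED)
    store.write("d", "title", Taint.CLEAN)
    return store.read("d", "title")


precise = analyze(CurriedStore())
coarse = analyze(FlatStore())
```

In this sketch `precise` is `Taint.CLEAN`, while `coarse` is `Taint.TAINTED`: the field-insensitive store merges both fields of `d` and reports a false alarm on the constant title.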
We have prototyped the analytical framework for Python, and conducted preliminary experiments with web applications. A low rate of false alarms demonstrates the promise of this approach.






Information & Contributors


Published In

PLAS '12: Proceedings of the 7th Workshop on Programming Languages and Analysis for Security
June 2012
91 pages



Association for Computing Machinery

New York, NY, United States


Author Tags

  1. abstract interpretation
  2. higher-order programs
  3. static analysis
  4. taint analysis


  • Research-article


PLDI '12

Acceptance Rates

Overall Acceptance Rate 43 of 77 submissions, 56%


Article Metrics

  • Downloads (last 12 months): 15
  • Downloads (last 6 weeks): 5

Reflects downloads up to 12 Feb 2025

Cited By

  • (2016) Information flow analysis for a dynamically typed language with staged metaprogramming. Journal of Computer Security 24:5 (541-582). DOI: 10.3233/JCS-160557. Online publication date: 8-Nov-2016
  • (2016) Allocation characterizes polyvariance: a unified methodology for polyvariant control-flow analysis. ACM SIGPLAN Notices 51:9 (407-420). DOI: 10.1145/3022670.2951936. Online publication date: 4-Sep-2016
  • (2016) Allocation characterizes polyvariance: a unified methodology for polyvariant control-flow analysis. Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming (407-420). DOI: 10.1145/2951913.2951936. Online publication date: 4-Sep-2016
  • (2016) A Posteriori Taint-Tracking for Demonstrating Non-interference in Expressive Low-Level Languages. 2016 IEEE Security and Privacy Workshops (SPW) (179-184). DOI: 10.1109/SPW.2016.58. Online publication date: May-2016
  • (2015) Static Analysis of Non-interference in Expressive Low-Level Languages. Static Analysis (1-17). DOI: 10.1007/978-3-662-48288-9_1. Online publication date: 2-Sep-2015
  • (2013) Sound and precise malware analysis for android via pushdown reachability and entry-point saturation. Proceedings of the Third ACM workshop on Security and privacy in smartphones & mobile devices (21-32). DOI: 10.1145/2516760.2516769. Online publication date: 8-Nov-2013
