Hash-flow taint analysis of higher-order programs

Authors:

Matthew Might

PLAS '12: Proceedings of the 7th Workshop on Programming Languages and Analysis for Security

Article No.: 8, Pages 1 - 12

https://doi.org/10.1145/2336717.2336725

Published: 15 June 2012

Abstract

As web applications have grown in popularity, so have attacks on such applications. Cross-site scripting and injection attacks have become particularly problematic. Both vulnerabilities stem, at their core, from improper sanitization of user input.

We propose static taint analysis, which can verify the absence of unsanitized-input errors at compile time. Unfortunately, precise static analysis of modern scripting languages like Python is challenging: higher-orderness and complex control flow collide with opaque, dynamic data structures like hash maps and objects. The interdependence of data flow and control flow makes it hard to attain both soundness and precision.
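To make the difficulty concrete, the following is a minimal runtime sketch, not taken from the paper. The `Tainted` marker type and the `sanitize`, `render` and `handle` functions are hypothetical names invented for illustration. The value reaching the sink depends on a field name computed at run time and on a function passed as an argument, so a static analysis has to resolve both to decide whether the input was sanitized.

```python
import html


class Tainted(str):
    """Hypothetical marker type: a string that came from untrusted input."""


def sanitize(value):
    # html.escape builds its result with str.replace, which returns a plain
    # str, so the Tainted marker is dropped once the value is escaped.
    return html.escape(value)


def render(fragment):
    # Hypothetical sink: refuses data that is still marked as tainted.
    if isinstance(fragment, Tainted):
        raise ValueError("unsanitized input reached the sink")
    return "<p>" + fragment + "</p>"


def handle(request, key, transform):
    # The field name is chosen by the caller and the transformation is a
    # first-class function, so data flow depends on control flow here.
    record = {}
    record[key] = transform(request["comment"])
    return render(record[key])


request = {"comment": Tainted("<script>alert(1)</script>")}
safe_page = handle(request, "body", sanitize)
```

Calling `handle(request, "body", lambda s: s)` instead passes the tainted string through unchanged, and the sink raises `ValueError`. A static taint analysis aims to report that second call before the program runs, without flagging the first.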

In this work, we apply abstract interpretation to sound and precise taint-style static analysis of scripting languages. We first define λ_H, a core calculus of modern scripting languages, with hash maps, dynamic objects, higher-order functions and first-class control. We then derive a framework of k-CFA-like, CESK-style abstract machines for statically reasoning about λ_H, but with hash maps factored into a "curried object store." The curried object store, together with shape analysis on this store, allows us to recover field sensitivity, even in the presence of dynamically modified fields. Lastly, atop this framework, we devise a taint-flow analysis that leverages the framework's field-sensitive, interprocedural and context-sensitive properties to soundly and precisely detect security vulnerabilities, such as XSS, in web applications.
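The benefit of field sensitivity can be sketched with a toy store. This is an illustration under assumptions, not the paper's abstract machine: it reads "curried" as a two-level map from object address to field name to abstract values, and it omits the CESK machinery, context sensitivity and shape analysis entirely. The class `CurriedStore`, its methods and the taint labels are hypothetical.

```python
from collections import defaultdict

TAINTED, CLEAN = "tainted", "clean"


class CurriedStore:
    """Toy abstract store: object address -> field name -> set of taint labels."""

    def __init__(self):
        self.objects = defaultdict(lambda: defaultdict(set))

    def write(self, addr, field, labels):
        # Weak update: join the new labels with those already recorded.
        self.objects[addr][field] |= set(labels)

    def read(self, addr, field):
        # Field-sensitive read: only the labels stored under this field.
        return set(self.objects.get(addr, {}).get(field, set()))

    def read_any(self, addr):
        # What a field-insensitive analysis would see: all fields merged.
        merged = set()
        for labels in self.objects.get(addr, {}).values():
            merged |= labels
        return merged


def is_alarm(labels):
    # Report a potential vulnerability if tainted data may reach the sink.
    return TAINTED in labels


store = CurriedStore()
store.write("req", "comment", {TAINTED})   # untrusted user input
store.write("req", "user_id", {CLEAN})     # trusted, server-generated value

sensitive = store.read("req", "user_id")
insensitive = store.read_any("req")
```

Here `is_alarm(sensitive)` is false while `is_alarm(insensitive)` is true: merging all fields of an object would raise a false alarm on the clean `user_id` field, which is the kind of imprecision that per-field tracking avoids.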

We have prototyped the analytical framework for Python, and conducted preliminary experiments with web applications. A low rate of false alarms demonstrates the promise of this approach.

Index Terms

Hash-flow taint analysis of higher-order programs
1. Security and privacy
  1. Software and application security
    1. Software security engineering
2. Software and its engineering
  1. Software organization and properties
    1. Software functional properties
      1. Correctness
        Access protection

Published In

PLAS '12: Proceedings of the 7th Workshop on Programming Languages and Analysis for Security

June 2012

91 pages

ISBN: 9781450314411

DOI: 10.1145/2336717

Conference Chairs: Sergio Maffeis (Imperial College London), Tamara Rezk (INRIA)

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

PLDI '12

Sponsor:

SIGPLAN

PLDI '12: ACM SIGPLAN Conference on Programming Language Design and Implementation

June 15, 2012

Beijing, China

Acceptance Rates

Overall Acceptance Rate 43 of 77 submissions, 56%

Bibliometrics

Total citations: 6
Total downloads: 216
Downloads (last 12 months): 15
Downloads (last 6 weeks): 5

Reflects downloads up to 12 Feb 2025

Cited By

Lester M, Ong L, Schäfer M (2016). Information flow analysis for a dynamically typed language with staged metaprogramming. Journal of Computer Security 24:5 (541-582). DOI: 10.3233/JCS-160557. Online publication date: 8-Nov-2016.
https://doi.org/10.3233/JCS-160557

Gilray T, Adams M, Might M (2016). Allocation characterizes polyvariance: a unified methodology for polyvariant control-flow analysis. ACM SIGPLAN Notices 51:9 (407-420). DOI: 10.1145/3022670.2951936. Online publication date: 4-Sep-2016.
https://dl.acm.org/doi/10.1145/3022670.2951936

Gilray T, Adams M, Might M, Garrigue J, Keller G, Sumii E (2016). Allocation characterizes polyvariance: a unified methodology for polyvariant control-flow analysis. Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming (407-420). DOI: 10.1145/2951913.2951936. Online publication date: 4-Sep-2016.
https://dl.acm.org/doi/10.1145/2951913.2951936

Aldous P, Might M (2016). A Posteriori Taint-Tracking for Demonstrating Non-interference in Expressive Low-Level Languages. 2016 IEEE Security and Privacy Workshops (SPW) (179-184). DOI: 10.1109/SPW.2016.58. Online publication date: May-2016.
https://doi.org/10.1109/SPW.2016.58

Aldous P, Might M (2015). Static Analysis of Non-interference in Expressive Low-Level Languages. Static Analysis (1-17). DOI: 10.1007/978-3-662-48288-9_1. Online publication date: 2-Sep-2015.
https://doi.org/10.1007/978-3-662-48288-9_1

Liang S, Keep A, Might M, Lyde S, Gilray T, Aldous P, Van Horn D, Enck W, Felt A, Asokan N (2013). Sound and precise malware analysis for android via pushdown reachability and entry-point saturation. Proceedings of the Third ACM workshop on Security and privacy in smartphones & mobile devices (21-32). DOI: 10.1145/2516760.2516769. Online publication date: 8-Nov-2013.
https://dl.acm.org/doi/10.1145/2516760.2516769
