Abstract
We perform static analysis of Java programs to answer a simple question: which values may occur as results of string expressions? The answers are summarized for each expression by a regular language that is guaranteed to contain all possible values. We present several applications of this analysis, including statically checking the syntax of dynamically generated expressions, such as SQL queries. Our analysis constructs flow graphs from class files and generates a context-free grammar with a nonterminal for each string expression. The language of this grammar is then widened into a regular language through a variant of an algorithm previously used for speech recognition. The collection of resulting regular languages is compactly represented as a special kind of multi-level automaton from which individual answers may be extracted. If a program error is detected, examples of invalid strings are automatically produced. We present extensive benchmarks demonstrating that the analysis is efficient and produces results of useful precision.
Supported by the Carlsberg Foundation contract number ANS-1069/20.
Basic Research in Computer Science (http://www.brics.dk), funded by the Danish National Research Foundation.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers — Principles, Techniques, and Tools. Addison-Wesley, November 1985.
Alex Aiken. Introduction to set constraint-based program analysis. Science of Computer Programming, 35:79–111, 1999.
Aske Simon Christensen, Anders Møller, and Michael I. Schwartzbach. Static analysis for dynamic XML. Technical Report RS-02-24, BRICS, May 2002. Presented at Programming Language Technologies for XML, PLAN-X, October 2002.
Aske Simon Christensen, Anders Møller, and Michael I. Schwartzbach. Extending Java for high-level Web service construction. ACM Transactions on Programming Languages and Systems, 2003. To appear.
James Clark and Steve DeRose. XML path language, November 1999. W3C Recommendation. http://www.w3.org/TR/xpath.
H. Comon, M. Dauchet, R. Gilleron, F. Jacquemard, D. Lugiez, S. Tison, and M. Tommasi. Tree automata techniques and applications, 1999. Available from http://www.grappa.univ-lille3.fr/tata/.
Patrick Cousot and Radhia Cousot. Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Proc. 4th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL’77, pages 238–252, 1977.
Nurit Dor, Michael Rodeh, and Mooly Sagiv. Cleanness checking of string manipulations in C programs via integer analysis. In Proc. 8th International Static Analysis Symposium, SAS’ 01, volume 2126 of LNCS. Springer-Verlag, July 2001.
John E. Hopcroft and Jeffrey D. Ullman. Introduction to Automata Theory, Languages and Computation. Addison-Wesley, April 1979.
Haruo Hosoya and Benjamin C. Pierce. XDuce: A typed XML processing language. In Proc. 3rd International Workshop on the World Wide Web and Databases, WebDB’00, volume 1997 of LNCS. Springer-Verlag, May 2000.
Mehryar Mohri and Mark-Jan Nederhof. Robustness in Language and Speech Technology, chapter 9: Regular Approximation of Context-Free Grammars through Transformation. Kluwer Academic Publishers, 2001.
Anders Møller. Document Structure Description 2.0, December 2002. BRICS, Department of Computer Science, University of Aarhus, Notes Series NS-02-7. Available from http://www.brics.dk/DSD/.
Flemming Nielson, Hanne Riis Nielson, and Chris Hankin. Principles of Program Analysis. Springer-Verlag, October 1999.
Rajesh Parekh and Vasant Honavar. DFA learning from simple examples. Machine Learning, 44:9–35, 2001.
Thomas Reps. Program analysis via graph reachability. Information and Software Technology, 40(11–12):701–726, November/December 1998.
Umesh Shankar, Kunal Talwar, Jeffrey S. Foster, and David Wagner. Detecting format string vulnerabilities with type qualifiers. In Proc. 10th USENIX Security Symposium, 2001.
Naoshi Tabuchi, Eijiro Sumii, and Akinori Yonezawa. Regular expression types for strings in a text processing language. In Proc. Workshop on Types in Programming, TIP’ 02, 2002.
Raja Vallee-Rai, Laurie Hendren, Vijay Sundaresan, Patrick Lam, Etienne Gagnon, and Phong Co. Soot — A Java optimization framework. In Proc. IBM Centre for Advanced Studies Conference, CASCON’99. IBM, November 1999.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Christensen, A.S., Møller, A., Schwartzbach, M.I. (2003). Precise Analysis of String Expressions. In: Cousot, R. (eds) Static Analysis. SAS 2003. Lecture Notes in Computer Science, vol 2694. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44898-5_1
Download citation
DOI: https://doi.org/10.1007/3-540-44898-5_1
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40325-8
Online ISBN: 978-3-540-44898-3
eBook Packages: Springer Book Archive