The maintenance and evolution of data-intensive systems should ideally rely on complete and accurate database documentation. Unfortunately, this documentation is often missing or, at best, outdated. Database redocumentation, a process also known as database reverse engineering, then comes to the rescue. This process typically involves the elicitation of implicit schema constructs, that is, data structures and constraints that have been incompletely translated into the operational database schema. In this context, the SQL statements executed by the programs may be a particularly rich source of information. SQL APIs come in two variants, namely static and dynamic. The latter is used intensively in object-oriented and web applications, notably through the ODBC and JDBC APIs. While the static analysis of SQL queries has long been studied, coping with automatically generated SQL statements requires other techniques. This tutorial provides an in-depth exploration of the use of dynamic program analysis as a basis for reverse engineering relational databases. It describes and illustrates several automated techniques for capturing the trace of the SQL-related events occurring during the execution of data-intensive programs. It then presents and evaluates several heuristics and techniques supporting the automatic recovery of implicit schema constructs from SQL execution traces. Other applications of SQL execution trace analysis are also identified.
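To make the two steps named in the abstract concrete, here is a minimal sketch in Python. It is not the chapter's tooling. It assumes SQLite as the database and uses the standard `sqlite3` trace callback as the capture mechanism. It also assumes one simple heuristic: two columns of different tables compared for equality in a query may hint at an undeclared foreign key. The helper names (`capture_trace`, `candidate_foreign_keys`, `demo`) and the `customer`/`orders` schema are hypothetical.

```python
import re
import sqlite3

# Assumed heuristic: an equality between columns of two different tables
# (t1.c1 = t2.c2) is treated as a hint of an implicit foreign key.
JOIN_EQ = re.compile(
    r"([A-Za-z_]\w*)\.([A-Za-z_]\w*)\s*=\s*([A-Za-z_]\w*)\.([A-Za-z_]\w*)"
)


def capture_trace(conn, run):
    """Execute run(conn) and return the SQL statements reported by sqlite3."""
    trace = []
    conn.set_trace_callback(trace.append)  # called once per executed statement
    try:
        run(conn)
    finally:
        conn.set_trace_callback(None)  # stop tracing
    return trace


def candidate_foreign_keys(trace):
    """Return the set of column pairs equated across two different tables."""
    candidates = set()
    for statement in trace:
        for t1, c1, t2, c2 in JOIN_EQ.findall(statement):
            if t1.lower() != t2.lower():
                left = (t1.lower(), c1.lower())
                right = (t2.lower(), c2.lower())
                candidates.add(tuple(sorted([left, right])))
    return candidates


def demo():
    """Hypothetical schema: orders.cust references customer.id, undeclared."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE orders (num INTEGER PRIMARY KEY, cust INTEGER)")

    def program(c):
        # Stands in for the data-intensive program under analysis.
        c.execute(
            "SELECT orders.num FROM orders, customer "
            "WHERE orders.cust = customer.id AND customer.name = ?",
            ("Ann",),
        ).fetchall()

    trace = capture_trace(conn, program)
    return trace, candidate_foreign_keys(trace)


if __name__ == "__main__":
    statements, candidates = demo()
    for pair in sorted(candidates):
        print(pair)
```

The result is only a list of candidates. A join condition shows that the program relates two columns, not that a referential constraint holds, so each pair would still need checking against the data or with a domain expert. The regular expression also ignores table aliases, which a real analysis would have to resolve.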
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Cleve, A., Noughi, N., Hainaut, JL. (2013). Dynamic Program Analysis for Database Reverse Engineering. In: Lämmel, R., Saraiva, J., Visser, J. (eds) Generative and Transformational Techniques in Software Engineering IV. GTTSE 2011. Lecture Notes in Computer Science, vol 7680. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35992-7_8
DOI: https://doi.org/10.1007/978-3-642-35992-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35991-0
Online ISBN: 978-3-642-35992-7
eBook Packages: Computer Science (R0)