Automatic high-quality reengineering of database programs by abstraction, transformation and reimplementation

Published: 01 July 2003

Abstract

Old-generation database models, such as the indexed-sequential, hierarchical, or network models, provide record-level access to their data, with all application logic residing in the hosting program. In contrast, relational databases can perform complex operations, such as filter, aggregation, and join, on multiple records without an external specification of the record-access logic. Programs written for relational databases attempt to move as much of the application logic as possible into the database, in order to make the most of the optimizations performed internally by the database.

This conceptual gap between the programming styles makes automatic high-quality translation of programs written for the older database models to the relational model difficult. It is not enough to convert just the database-access operations, since this would result in unacceptably inefficient programs. It is necessary to convert parts of the application logic from the procedural style of the hosting program (which is almost always Cobol) to the declarative style of SQL.

This article describes an automatic system, called MIDAS, that performs high-quality reengineering of legacy database programs in this way. MIDAS is based on the paradigm of translation by abstraction, transformation, and reimplementation. The abstract representation is based on the Plan Calculus, with the addition of Query Graphs, introduced in this article in order to abstract the temporal behavior of database access patterns.

The results of MIDAS's translation were found to be superior to those of the naive translation that only converts database-access operations in terms of readability, size of code, speed, and network data traffic. Initial industrial experience with MIDAS also demonstrates the high quality of its translations on large-scale programs.
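To make the gap between the two programming styles concrete, the following is a minimal sketch in Python with SQLite. It is not MIDAS output and does not reproduce the article's method; the `customer` and `orders` tables, their columns, and the function names are all hypothetical and chosen only for illustration. `total_naive` mimics a translation that converts each record-level access into its own query and keeps the filter, join, and aggregation in the host program, while `total_declarative` moves that logic into a single SQL statement.

```python
import sqlite3


def make_db():
    """Build a small in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            amount INTEGER
        );
        INSERT INTO customer VALUES (1, 'EAST'), (2, 'WEST'), (3, 'EAST');
        INSERT INTO orders VALUES
            (10, 1, 50), (11, 1, 70), (12, 2, 20), (13, 3, 30);
        """
    )
    return conn


def total_naive(conn, region):
    """Record-at-a-time style: the host program filters, joins, and sums."""
    total = 0
    # Read every customer record, as a sequential scan would.
    customers = conn.execute(
        "SELECT id, region FROM customer ORDER BY id"
    ).fetchall()
    for cust_id, cust_region in customers:
        if cust_region != region:  # filter performed in host code
            continue
        # One extra query per matching customer (join performed in host code).
        orders = conn.execute(
            "SELECT amount FROM orders WHERE customer_id = ?", (cust_id,)
        )
        for (amount,) in orders:
            total += amount  # aggregation performed in host code
    return total


def total_declarative(conn, region):
    """Declarative style: filter, join, and aggregation in one SQL query."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(o.amount), 0)
        FROM customer AS c
        JOIN orders AS o ON o.customer_id = c.id
        WHERE c.region = ?
        """,
        (region,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = make_db()
    print(total_naive(conn, "EAST"), total_declarative(conn, "EAST"))
```

Both functions return the same total, but the naive version issues one query per matching customer and transfers every customer record to the host program, whereas the declarative version issues a single query and transfers one row. This is only meant to illustrate the kind of difference in code size and data traffic that the abstract refers to.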

Published In

ACM Transactions on Software Engineering and Methodology, Volume 12, Issue 3
July 2003
86 pages
ISSN: 1049-331X
EISSN: 1557-7392
DOI: 10.1145/958961
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Database program reengineering
  2. query graphs
  3. temporal abstraction
  4. the plan calculus

Qualifiers

  • Article
