Software Redocumentation Using Distributed Data Processing Technique to Support Program Understanding for Legacy System: A Proposed Approach

  • Conference paper
Advances in Visual Informatics (IVIC 2021)

Abstract

Source code is the most up-to-date of all available software artifacts. Most existing software redocumentation approaches rely on source code to extract the information needed for program comprehension in support of software maintenance tasks. However, performing Extract, Transform and Load (ETL) on source code with a parser is becoming a challenging task. The traditional approach can no longer handle ETL efficiently because its analysis efficiency suffers, especially on large source code. This paper proposes using a distributed data processing technique to extract legacy source code components and generate detailed design or technical software documentation at the source code level to support program understanding. The objective of this paper is to apply the distributed data processing technique to the parser by using the Hadoop Distributed File System and Apache Spark. Legacy Java source code is used as a case study in which the proposed approach extracts the source code components and generates the technical software documentation.
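
The extraction step described in the abstract can be pictured as a map over source files, each file being parsed independently of the others. The following is a minimal sketch of that idea, not the authors' implementation: it swaps in a local thread pool for Apache Spark, an in-memory dictionary for the Hadoop Distributed File System, and two simplified regular expressions for a real Java parser. All names in it (`extract_components`, `redocument`, `SAMPLE`) are hypothetical.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Assumption: simplified patterns standing in for a real Java parser.
# They only recognise type declarations and methods with an access modifier.
CLASS_RE = re.compile(r"\b(?:class|interface)\s+(\w+)")
METHOD_RE = re.compile(
    r"\b(?:public|protected|private)\s+(?:static\s+)?[\w<>\[\]]+\s+(\w+)\s*\("
)


def extract_components(item):
    """Map step: pull class and method names out of one source file."""
    path, source = item
    return {
        "file": path,
        "classes": CLASS_RE.findall(source),
        "methods": METHOD_RE.findall(source),
    }


def redocument(sources, workers=4):
    """Apply the map step to every file in parallel and collect the records.

    Assumption: a thread pool stands in for a distributed engine here;
    each file is processed independently, so the work partitions cleanly.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(extract_components, sorted(sources.items())))


# Hypothetical legacy sources, keyed by file name.
SAMPLE = {
    "Account.java": (
        "public class Account {\n"
        "  private int balance;\n"
        "  public int getBalance() { return balance; }\n"
        "  public void deposit(int amount) { balance += amount; }\n"
        "}\n"
    ),
    "Ledger.java": (
        "public class Ledger {\n"
        "  public static Ledger open(String name) { return new Ledger(); }\n"
        "}\n"
    ),
}

if __name__ == "__main__":
    for record in redocument(SAMPLE):
        print(record)
```

Because each file is handled on its own, the same per-file function could in principle be handed to a distributed engine's map operation, with the resulting records written out as the raw material for technical documentation.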

Author information

Corresponding authors

Correspondence to Sugumaran Nallusamy, Hoo Meei Hao or Farizuwana Akma Zulkifle.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Nallusamy, S., Hao, H.M., Zulkifle, F.A. (2021). Software Redocumentation Using Distributed Data Processing Technique to Support Program Understanding for Legacy System: A Proposed Approach. In: Badioze Zaman, H., et al. Advances in Visual Informatics. IVIC 2021. Lecture Notes in Computer Science, vol 13051. Springer, Cham. https://doi.org/10.1007/978-3-030-90235-3_21

  • DOI: https://doi.org/10.1007/978-3-030-90235-3_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-90234-6

  • Online ISBN: 978-3-030-90235-3

  • eBook Packages: Computer Science, Computer Science (R0)
