Abstract
Astronomy is increasingly becoming a data-driven science as the community builds larger instruments that can gather more data than was previously possible. As datasets grow, it becomes even more important to make the most efficient use of the available computational resources. In this work, we highlight how provenance can be used to increase the computational efficiency of astronomical workflows. We describe a provenance-enabled image processing pipeline and motivate the generation of provenance with two relevant use cases. The first use case investigates the origin of an optical variation, and the second is concerned with the objects used to calibrate the image. We then query the provenance to compare the computational efficiency of evaluating each use case with and without provenance. We find that recording the provenance of the pipeline increases the original processing time by \(\sim\)45%. However, when evaluating the two identified use cases, the inclusion of provenance improves the efficiency of processing by \(\sim\)99% and \(\sim\)96% for Use Cases 1 and 2, respectively. Furthermore, we combine these results with the probability that Use Cases 1 and 2 will need to be evaluated and find a net decrease in computational processing efficiency of 13–44% when incorporating provenance generation within the workflow. However, we deduce that provenance has the potential to produce a net increase in this efficiency if more use cases are considered.
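The trade-off summarised above can be illustrated with a minimal sketch of an expected-cost calculation. This is not the paper's actual model. It assumes that the quoted percentages are fractional changes in processing time, that costs are measured in units of one pipeline run, and that use cases are evaluated independently. The names `UseCase`, `expected_cost` and `net_change`, and the probabilities and per-use-case costs in the example, are hypothetical. Only the 45% overhead and the 99% and 96% savings come from the abstract, and the sketch is not expected to reproduce the reported 13–44% range.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    probability: float   # chance the use case must be evaluated (hypothetical value)
    cost_without: float  # cost without provenance, in pipeline-run units (hypothetical value)
    saving: float        # fractional cost reduction when provenance is available


def expected_cost(pipeline_cost, overhead, use_cases, with_provenance):
    """Expected cost of one pipeline run plus any use case evaluations."""
    # Recording provenance inflates the pipeline cost by the overhead fraction.
    total = pipeline_cost * (1.0 + overhead) if with_provenance else pipeline_cost
    for uc in use_cases:
        cost = uc.cost_without
        if with_provenance:
            # Querying provenance replaces most of the re-processing work.
            cost *= 1.0 - uc.saving
        total += uc.probability * cost
    return total


def net_change(pipeline_cost, overhead, use_cases):
    """Fractional change in expected cost from enabling provenance.

    Positive values mean provenance costs more overall; negative values
    mean it pays for itself.
    """
    without = expected_cost(pipeline_cost, overhead, use_cases, False)
    with_prov = expected_cost(pipeline_cost, overhead, use_cases, True)
    return (with_prov - without) / without


if __name__ == "__main__":
    # Overhead and savings are from the abstract; probabilities and costs are made up.
    cases = [UseCase(0.1, 1.0, 0.99), UseCase(0.1, 1.0, 0.96)]
    print(f"net change in expected cost: {net_change(1.0, 0.45, cases):+.1%}")
```

Under this simplified model, provenance yields a net gain only when the expected savings, summed over all use cases as probability times cost times saving, exceed the recording overhead. That is consistent with the abstract's closing remark that considering more use cases could tip the balance.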
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Johnson, M.A.C., Moreau, L., Chapman, A., Gandhi, P., Sáenz-Adán, C. (2018). Using the Provenance from Astronomical Workflows to Increase Processing Efficiency. In: Belhajjame, K., Gehani, A., Alper, P. (eds) Provenance and Annotation of Data and Processes. IPAW 2018. Lecture Notes in Computer Science, vol 11017. Springer, Cham. https://doi.org/10.1007/978-3-319-98379-0_8
DOI: https://doi.org/10.1007/978-3-319-98379-0_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98378-3
Online ISBN: 978-3-319-98379-0
eBook Packages: Computer Science, Computer Science (R0)