Datagridflows: Managing Long-Run Processes on Datagrids

  • Conference paper
Data Management in Grids (DMG 2005)

Abstract

This paper is an introduction to datagridflows. Until recently, datagrids were generally considered over-hyped, and the associated technologies were not widely embraced in the academic community. Today, datagrids have become a reality and an important technology for managing large, unstructured data and storage resources distributed over autonomous administrative domains. The datagrids operating in production give us an idea of the new requirements and challenges that future datagrid environments will face. One such requirement is the coordinated execution of long-run data management processes in datagrids. We term these processes “datagridflows”. This new area offers exciting opportunities and challenges to researchers in distributed computing and distributed databases. This paper is intended to introduce these challenges to other researchers, including those new to grid computing. We motivate the work by discussing datagridflow requirements and real production scenarios. We then introduce current work on datagridflow technologies, including the Datagrid Language (DGL) for describing datagridflows.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jagatheesan, A. et al. (2006). Datagridflows: Managing Long-Run Processes on Datagrids. In: Pierson, JM. (eds) Data Management in Grids. DMG 2005. Lecture Notes in Computer Science, vol 3836. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11611950_10

  • DOI: https://doi.org/10.1007/11611950_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31212-3

  • Online ISBN: 978-3-540-32452-2

  • eBook Packages: Computer Science, Computer Science (R0)
