Grid middleware services for virtual data discovery, composition, and integration

Published: 18 October 2004

Abstract

We describe the services, architecture and application of the GriPhyN Virtual Data System, a suite of components and services that allow users to describe virtual data products in declarative terms, discover definitions and assemble workflows based on those definitions, and execute the resulting workflows on Grid resources. We show how these middleware-level services have been applied by specific communities to manage scientific data and workflows. In particular, we highlight and introduce Chiron, a portal facility that enables the interactive use of the virtual data system. Chiron has been used within the QuarkNet education project and as an online "educator" for virtual data applications. We also present applications from functional MRI-based neuroscience research.


    Published In

    MGC '04: Proceedings of the 2nd workshop on Middleware for grid computing
    October 2004
    92 pages
    ISBN:1581139500
    DOI:10.1145/1028493
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. data discovery
    2. data grid
    3. data integration
    4. portal
    5. provenance
    6. virtual data
    7. workflow

    Qualifiers

    • Article

    Acceptance Rates

    Overall Acceptance Rate 14 of 36 submissions, 39%
