DOI: 10.1145/2949550.2949647
research-article
Public Access

An Architecture for Automatic Deployment of Brown Dog Services at Scale into Diverse Computing Infrastructures

Published: 17 July 2016

Abstract

Brown Dog is an extensible data cyberinfrastructure that provides a set of distributed data conversion and metadata extraction services. These services enable access to and search within unstructured, uncurated, and otherwise inaccessible research data across the sciences and social sciences, which ultimately supports the reproducibility of results. We envision Brown Dog as an essential service within a comprehensive cyberinfrastructure that includes data services, high performance computing services, and more, and that would enable scholarly research in a variety of disciplines that is not yet possible today. Brown Dog focuses on four initial use cases, addressing conversion and extraction needs in ecology, civil and environmental engineering, and library and information science, as well as use by the general public. In this paper, we describe an architecture that supports user contribution of data transformation tools and the automatic deployment of those tools as Brown Dog services in diverse infrastructures, such as cloud or high performance computing (HPC) systems, based on user demand and system load. We also present results validating the performance of the initial implementation of Brown Dog.
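The idea of deploying services "based on user demand and system load" can be illustrated with a minimal sketch. This is not the Brown Dog implementation: the `ServiceLoad` record, the function names, the per-instance capacity, and the instance limits are all hypothetical, chosen only to show how a demand-driven scaling decision might be computed from queued requests.

```python
import math
from dataclasses import dataclass


@dataclass
class ServiceLoad:
    """Hypothetical snapshot of one conversion/extraction service."""
    name: str
    queued_requests: int    # requests waiting for this service
    running_instances: int  # instances currently deployed


def desired_instances(load, per_instance_capacity=10,
                      min_instances=0, max_instances=8):
    """Instances needed to drain the queue, clamped to assumed limits."""
    needed = math.ceil(load.queued_requests / per_instance_capacity)
    return max(min_instances, min(max_instances, needed))


def scaling_actions(loads, **limits):
    """Map each service name to the number of instances to add (positive)
    or remove (negative). Services already at the target are omitted."""
    actions = {}
    for load in loads:
        delta = desired_instances(load, **limits) - load.running_instances
        if delta != 0:
            actions[load.name] = delta
    return actions


if __name__ == "__main__":
    snapshot = [
        ServiceLoad("image-extractor", queued_requests=35, running_instances=1),
        ServiceLoad("doc-converter", queued_requests=0, running_instances=2),
    ]
    print(scaling_actions(snapshot))
```

A real deployment would additionally need to choose a target infrastructure (cloud or HPC) and start or stop instances there; the sketch covers only the decision of how many instances each service should have.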

Cited By

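  • Clowder. Proceedings of the Practice and Experience on Advanced Research Computing: Seamless Creativity (2018), pages 1-8. DOI: 10.1145/3219104.3219159. Online publication date: 22 July 2018.
  • Brown Dog. Proceedings of the Practice and Experience on Advanced Research Computing: Seamless Creativity (2018), pages 1-8. DOI: 10.1145/3219104.3219132. Online publication date: 22 July 2018.
  • Extracting Meaningful Data from Decomposing Bodies. Practice and Experience in Advanced Research Computing 2017: Sustainability, Success and Impact (2017), pages 1-8. DOI: 10.1145/3093338.3093368. Online publication date: 9 July 2017.
  • Extracting, Assimilating, and Sharing the Results of Image Analysis on the FSA/OWI Photography Collection. Practice and Experience in Advanced Research Computing 2017: Sustainability, Success and Impact (2017), pages 1-6. DOI: 10.1145/3093338.3093365. Online publication date: 9 July 2017.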

Published In

XSEDE16: Proceedings of the XSEDE16 Conference on Diversity, Big Data, and Science at Scale
July 2016
405 pages
ISBN:9781450347556
DOI:10.1145/2949550

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Autocuration
  2. Civil and Environmental Engineering
  3. Cloud
  4. Data conversion
  5. Data cyberinfrastructure
  6. Digital preservation
  7. Ecology
  8. Elasticity
  9. HPC
  10. Library and Information Science
  11. Metadata extraction

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

XSEDE16

Acceptance Rates

Overall Acceptance Rate 129 of 190 submissions, 68%
