Cyberenvironments: Adaptive Middleware for Scientific Cyberinfrastructure
ARM’07
Jim Myers, Bob McGrath
jimmyers@ncsa.uiuc.edu
National Center for Supercomputing Applications (NCSA),
University of Illinois at Urbana-Champaign
Outline
• What’s Changing in Science?
• What Role should Cyberinfrastructure (CI) play?
• Requirements and Design for Cyberenvironments: Adaptive/Reflective Techniques
• Some Examples
• Conclusions
How is Science Changing?
• Quantitative Modeling and Simulation
• Better Data (e.g. Higher Signal to Noise)
• More Data (e.g. High Throughput)
→
– Closer ties between research and application
– Investigation of subtle, non-linear, multi-dimensional phenomena
– Statistical analysis of complex systems
Supporting the Research Lifecycle…
[Figure: research lifecycle diagram. Labels: Experiment Design; Project Execution; Analyze; Publish; Curate; Apply; Gap Analysis; Provenance; Annotation; Reference Data; Standards / Best practice; Algorithms / Services; Engineering Views; Valid Range (∆θ1, ∆θ2). Inset: numbered graph of chemical species H, O, OH, OH+, H2, O2, H2O, H2O2, H2O (l), H2O2 (l).]
‘Amdahl’s Law’ for Scientific Progress
Data production
Processing power
Data discovery
Translation
Experiment setup
Group coordination
Tool integration
Training
Data transfer/storage
Feature Extraction
Data interpretation
Acceptance of new models/tools
Dissemination of best practices
Interdisciplinary communication
CI versus the Literature/Out of Band Processes?
• Higher Fidelity, Multiple Levels of Description
• Custom Views
• Actionable, Faster, Automatable
• But software is rigid relative to text…
– CI must be built before the parts are done
– It must be evolvable by independent parties
– It must enable coordination without central control
– It must allow science to evolve / progress (no fixed domain model)
– Researchers/educators must be able to work in multiple communities/value chains (across CI projects)
– It must convey knowledge as well as tools to end users
– It must align the interests of CI funders, developers, providers, users, …
Key Cyberenvironment Design Concepts
• Explicit Separation of How from What:
– Content (type, global IDs, …) and Conceptual Context (metadata…)
– Process (workflow, provenance, …)
– Virtual Organizations/Social Networks (policies, resources, semantics, translation)
– GUI Integration (portals, rich clients, …)
– …
• Ability to pass information through components that don’t understand the details (everything is data)…
…e-Science, Semantic Grid, Cyberenvironments, Web 2.0 …
…intelligence at the edges…
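One way to picture "passing information through components that don't understand the details" is a small pipeline in which a middle stage transforms the payload and forwards all metadata untouched. This is only an illustrative sketch: the `Item` structure, the `convert_units` stage, and the metadata keys are hypothetical and are not taken from any Cyberenvironment code.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Item:
    """Content plus open-ended metadata (hypothetical structure)."""
    global_id: str
    payload: Any
    metadata: Dict[str, Any] = field(default_factory=dict)


def convert_units(item: Item) -> Item:
    """Middle component: understands the payload only (Celsius to Kelvin)."""
    # The metadata is copied through as opaque data, including keys
    # that this component has never heard of.
    return Item(item.global_id, item.payload + 273.15, dict(item.metadata))


def run_pipeline(item: Item, stages: List[Callable[[Item], Item]]) -> Item:
    """Apply each stage in order; no stage is required to inspect metadata."""
    for stage in stages:
        item = stage(item)
    return item


reading = Item(
    global_id="urn:example:reading-1",  # hypothetical identifier
    payload=25.0,
    metadata={"instrument": "probe-A", "vo:project": "example-project"},
)
result = run_pipeline(reading, [convert_units])
```

Because the stage never interprets the metadata, a consumer at the edge can add new keys without any change to the components in between.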
Mid-America Earthquake Center
Examples: MAEViz (Consequence-Based Risk Management for Seismic Events)
[Figure: MAEViz screenshot, "Memphis Test Bed". Menus: File; Inventory; Vulnerability; Hazards; Interventions; Decision support; Interdependencies; Help. Panels: Consequence Table; Scheme Comparison; Consequence Comparison (Loss ($M), axis 0 to 100; Life Loss and Dollar Loss for Scheme #1 and Scheme #2); Alternatives; Prob. Distribution; Preference Plot; POS plot. Scheme table lists building types C2M, C2L, URML with actions No Action, Rehab LS, Rebuild. Earthquake Level: 5% PE in 50 years. Surrounding steps: Hazard Definition (0.3g, 0.5g, 0.6g); Inventory Selection; Fragility Models (Social/Economic Impact Limit State, Input Motion Parameter, Input error margin, Response error margin); Damage Prediction; Decision Support; Compare Schemes.]
• Engineering View of MAE Center Research
• Portal-based Collaboration Environment
• Distributed Data/metadata Sources
• Multi-disciplinary Collaboration
Examples: CyberIntegrator
• Exploratory workflow (macro-recording)
• Simple integration with Matlab, Excel, Fortran, etc.
• Provenance tracking
• Distributed, shared data access (HIS, WebDAV, …)
• Remote Execution
• Workflow/model publication
• Metadata and Annotation of data, modules, workflows
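The combination of exploratory (macro-recorded) workflow and provenance tracking can be sketched roughly as follows. This is not CyberIntegrator's implementation or API; the `Recorder` class and the `scale` and `total` tools are hypothetical stand-ins, and the sketch assumes a linear chain in which each step feeds the next.

```python
from typing import Any, Callable, Dict, List, Tuple


class Recorder:
    """Runs tools interactively while recording them for later replay."""

    def __init__(self) -> None:
        # (function, keyword parameters) in execution order
        self.steps: List[Tuple[Callable[..., Any], Dict[str, Any]]] = []
        # One provenance record per executed step
        self.provenance: List[Dict[str, Any]] = []

    def run(self, func: Callable[..., Any], value: Any, **params: Any) -> Any:
        """Execute one step now and remember how it was executed."""
        result = func(value, **params)
        self.steps.append((func, params))
        self.provenance.append(
            {"tool": func.__name__, "params": params,
             "input": value, "output": result}
        )
        return result

    def replay(self, value: Any) -> Any:
        """Re-apply the recorded chain to new input data."""
        for func, params in self.steps:
            value = func(value, **params)
        return value


def scale(values, factor=1.0):
    return [v * factor for v in values]


def total(values):
    return sum(values)


session = Recorder()
scaled = session.run(scale, [1, 2, 3], factor=2)
answer = session.run(total, scaled)
replayed = session.replay([10, 20])  # same workflow, different data
```

The user explores step by step, and the recorded steps double as a reusable workflow and as a provenance trail for the result.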
Examples: CyberCollaboratory Portal
• Group Spaces
• Library, discussion, announcements, wiki, …
• Simplified invitation
• Email integration
• Provenance tracking/social network analysis
• …
Content & VO Aware
[Figure labels: Desktop; Secure Enterprise Data; Data/Metadata; Check VO and personal preferences; Public Reference Data; Translate; Virtual Data (from Recipes)]
Process Aware
[Figure labels: Process Capture; Publish/Discover; Execute; Retrieve Data; Retrieve Code]
Dynamic
[Figure labels: New Third-Party Analyses (Forms, Visualizations); Compare, Contrast, Validate; Auto-update; MAEviz; GIS; Workflow Data; Eclipse RCP Plug-in Framework]
Social/Conceptual Context
• Capture of Interactions in Portal and in the Literature
• Capture of Annotations/Associations
• Provide Browsing and Recommender Interfaces
What do CyberEnvironments/CI for scientific discourse have to do with ARM?
• Thesis: the principles of ARM are critical design patterns for viable CEs
– Abstract services
  • NSF CI, Grid: resource management, authentication, etc.
  • Support for science process (e.g., virtual organizations)
  • RCP and other component frameworks for composing software
– Expose metadata
  • Generic content management
  • Generic process management
  • Open metadata using RDF
– Instrumentation
  • Universal capture of provenance, annotation
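As a rough illustration of "open metadata using RDF", the sketch below writes a few statements as N-Triples using only the standard library. The Dublin Core namespace URI is the real one; the `example.org` subject and creator URIs, the helper names, and the title text are hypothetical, and literal escaping is limited to backslashes, quotes, and newlines.

```python
from typing import Iterable, List, Tuple

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace

Triple = Tuple[str, str, str]


def uri(value: str) -> str:
    """Wrap an absolute URI in N-Triples angle brackets."""
    return f"<{value}>"


def literal(text: str) -> str:
    """Quote a plain string literal with minimal escaping."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Serialize already-formatted terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


def about(triples: Iterable[Triple], subject: str) -> List[Triple]:
    """Generic lookup: every statement about a subject, whatever the schema."""
    return [t for t in triples if t[0] == uri(subject)]


dataset = "http://example.org/data/run-42"  # hypothetical identifier
triples = [
    (uri(dataset), uri(DC + "title"), literal("Example simulation output")),
    (uri(dataset), uri(DC + "creator"), uri("http://example.org/people/alice")),
]
text = to_ntriples(triples)
```

A generic tool can call `about(...)` to retrieve and display every statement on a resource without knowing in advance which vocabularies were used to describe it.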
A Reflective Model
What needs to be done
Which component(s) can do the work?
What does the component need to know?
Where can the information be found?
What can the component add to the story?
• VO manager separate from App and CI developers
• Can move from local to grid/web solutions w/o app changes
• Semantic middleware as scalable communication layer…
• Open Provenance Model, FOAF, DC, … as common conventions
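The questions of the reflective model can be read as the steps of a small dispatcher. The sketch below is one hypothetical reading, not the design of any NCSA system: components declare what they can do and what they need, and the dispatcher finds the inputs and records what was added. The `Component` fields, the `dispatch` function, and the source names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Component:
    name: str
    task: str                          # what the component can do
    needs: Tuple[str, ...]             # what it needs to know
    run: Callable[..., Dict[str, Any]]  # returns what it adds to the story


def dispatch(task: str, registry: List[Component],
             sources: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    # Which component(s) can do the work?
    candidates = [c for c in registry if c.task == task]
    if not candidates:
        raise LookupError(f"no component registered for task {task!r}")
    component = candidates[0]

    # What does the component need to know, and where can it be found?
    inputs: Dict[str, Any] = {}
    found_in: Dict[str, str] = {}
    for key in component.needs:
        for source_name, source in sources.items():
            if key in source:
                inputs[key] = source[key]
                found_in[key] = source_name
                break
        else:
            raise KeyError(f"no source provides {key!r}")

    # What can the component add to the story?
    added = component.run(**inputs)
    return {"component": component.name, "found_in": found_in, "added": added}


registry = [
    Component("local-mean", "summarize", ("values",),
              lambda values: {"mean": sum(values) / len(values)}),
]
sources = {"desktop": {}, "shared-store": {"values": [2, 4, 6]}}
record = dispatch("summarize", registry, sources)
```

Replacing `local-mean` in the registry with a remote implementation of the same task would leave the calling code unchanged, which is the sense in which an application can move from local to grid/web solutions without modification.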
Conclusions
• Building Cyberenvironments/supporting Scientific Discourse is critical for scientific efficiency/competitiveness.
• Abstract management of data, process/provenance, social, and conceptual contexts solves real socio-technical problems in science and engineering research.
• Our experience in building Cyberenvironments on these principles is showing their potential in terms of supporting systems science and evolving research.
• E-Science, semantic web/grid, content management, Web 2.0 are all driving in this direction, but their impact is not well stated in terms of value to science researchers.
Acknowledgments
The authors wish to acknowledge the contribution of many CI
researchers to the concepts and systems discussed here with specific
recognition of members of NCSA’s Cyberenvironments Directorate.
The National Center for Supercomputing Applications is funded by the
US National Science Foundation under Grant No. SCI-0438712. Any
opinions, findings, and conclusions or recommendations expressed in
this material are those of the authors and do not necessarily reflect the
views of the National Science Foundation.