
Showing 1–50 of 115 results for author: Foster, T

  1. arXiv:2408.14627  [pdf, ps, other]

    cs.DB cs.CE cs.CY cs.ET

    Sustainable Data Democratization: A Multifaceted Investment for an Equitable Future

    Authors: Michela Taufer, Valerio Pascucci, Christine R. Kirkpatrick, Ian T. Foster

    Abstract: The urgent need for data democratization in scientific research was the focal point of a panel discussion at SC23 in Denver, Colorado, from November 12 to 17, 2023. This article summarizes the outcomes of that discussion and subsequent conversations. We advocate for strategic investments in financial, human, and technological resources for sustainable data democratization. Emphasizing that data is…

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: 5 pages

  2. arXiv:2408.14434  [pdf, other]

    cs.DC cs.LG

    Employing Artificial Intelligence to Steer Exascale Workflows with Colmena

    Authors: Logan Ward, J. Gregory Pauloski, Valerie Hayot-Sasson, Yadu Babuji, Alexander Brace, Ryan Chard, Kyle Chard, Rajeev Thakur, Ian Foster

    Abstract: Computational workflows are a common class of application on supercomputers, yet the loosely coupled and heterogeneous nature of workflows often fails to take full advantage of their capabilities. We created Colmena to leverage the massive parallelism of a supercomputer by using Artificial Intelligence (AI) to learn from and adapt a workflow as it executes. Colmena allows scientists to define how…

    Submitted 26 August, 2024; originally announced August 2024.

  3. arXiv:2408.07236  [pdf, other]

    cs.DC

    TaPS: A Performance Evaluation Suite for Task-based Execution Frameworks

    Authors: J. Gregory Pauloski, Valerie Hayot-Sasson, Maxime Gonthier, Nathaniel Hudson, Haochen Pan, Sicheng Zhou, Ian Foster, Kyle Chard

    Abstract: Task-based execution frameworks, such as parallel programming libraries, computational workflow systems, and function-as-a-service platforms, enable the composition of distinct tasks into a single, unified application designed to achieve a computational goal. Task-based execution frameworks abstract the parallel execution of an application's tasks on arbitrary hardware. Research into these task ex…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: To appear in the Proceedings of 20th IEEE International Conference on e-Science

  4. arXiv:2407.09434  [pdf, other]

    cs.LG cs.AI cs.CE eess.SY

    A Perspective on Foundation Models for the Electric Power Grid

    Authors: Hendrik F. Hamann, Thomas Brunschwiler, Blazhe Gjorgiev, Leonardo S. A. Martins, Alban Puech, Anna Varbella, Jonas Weiss, Juan Bernabe-Moreno, Alexandre Blondin Massé, Seong Choi, Ian Foster, Bri-Mathias Hodge, Rishabh Jain, Kibaek Kim, Vincent Mai, François Mirallès, Martin De Montigny, Octavio Ramos-Leaños, Hussein Suprême, Le Xie, El-Nasser S. Youssef, Arnaud Zinflou, Alexander J. Belvi, Ricardo J. Bessa, Bishnu Prasad Bhattari, et al. (2 additional authors not shown)

    Abstract: Foundation models (FMs) currently dominate news headlines. They employ advanced deep learning architectures to extract structural information autonomously from vast datasets through self-supervision. The resulting rich representations of complex systems and dynamics can be applied to many downstream applications. Therefore, FMs can find uses in electric power grids, challenged by the energy transi…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Lead contact: H.F.H.; Major equal contributors: H.F.H., T.B., B.G., L.S.A.M., A.P., A.V., J.W.; Significant equal contributors: J.B., A.B.M., S.C., I.F., B.H., R.J., K.K., V.M., F.M., M.D.M., O.R., H.S., L.X., E.S.Y., A.Z.; Other equal contributors: A.J.B., R.J.B., B.P.B., J.S., S.S

  5. arXiv:2407.01764  [pdf, other]

    cs.DC

    Object Proxy Patterns for Accelerating Distributed Applications

    Authors: J. Gregory Pauloski, Valerie Hayot-Sasson, Logan Ward, Alexander Brace, André Bauer, Kyle Chard, Ian Foster

    Abstract: Workflow and serverless frameworks have empowered new approaches to distributed application design by abstracting compute resources. However, their typically limited or one-size-fits-all support for advanced data flow patterns leaves optimization to the application programmer -- optimization that becomes more difficult as data become larger. The transparent object proxy, which provides wide-area r…

    Submitted 1 July, 2024; originally announced July 2024.

  6. arXiv:2405.15828  [pdf, other]

    cs.DL cs.AI

    Oil & Water? Diffusion of AI Within and Across Scientific Fields

    Authors: Eamon Duede, William Dolan, André Bauer, Ian Foster, Karim Lakhani

    Abstract: This study empirically investigates claims of the increasing ubiquity of artificial intelligence (AI) within roughly 80 million research publications across 20 diverse scientific fields, by examining the change in scholarly engagement with AI from 1985 through 2022. We observe exponential growth, with AI-engaged publications increasing approximately thirteenfold (13x) across all fields, suggesting…

    Submitted 23 May, 2024; originally announced May 2024.

  7. arXiv:2404.19717  [pdf, other]

    cs.DC

    Automated, Reliable, and Efficient Continental-Scale Replication of 7.3 Petabytes of Climate Simulation Data: A Case Study

    Authors: Lukasz Lacinski, Lee Liming, Steven Turoscy, Cameron Harr, Kyle Chard, Eli Dart, Paul Durack, Sasha Ames, Forrest M. Hoffman, Ian T. Foster

    Abstract: We report on our experiences replicating 7.3 petabytes (PB) of Earth System Grid Federation (ESGF) climate simulation data from Lawrence Livermore National Laboratory (LLNL) in California to Argonne National Laboratory (ANL) in Illinois and Oak Ridge National Laboratory (ORNL) in Tennessee. This movement of some 29 million files, twice, undertaken in order to establish new ESGF nodes at ANL and OR…

    Submitted 30 April, 2024; originally announced April 2024.

  8. MalleTrain: Deep Neural Network Training on Unfillable Supercomputer Nodes

    Authors: Xiaolong Ma, Feng Yan, Lei Yang, Ian Foster, Michael E. Papka, Zhengchun Liu, Rajkumar Kettimuthu

    Abstract: First-come first-serve scheduling can leave a substantial fraction (up to 10%) of supercomputer nodes transiently idle. Recognizing that such unfilled nodes are well-suited for deep neural network (DNN) training, due to the flexible nature of DNN training tasks, Liu et al. proposed that the re-scaling of DNN training tasks to fit gaps in schedules be formulated as a mixed-integer linear programming (MIL…

    Submitted 24 April, 2024; originally announced April 2024.

  9. arXiv:2403.19257  [pdf, other]

    cs.DC

    UniFaaS: Programming across Distributed Cyberinfrastructure with Federated Function Serving

    Authors: Yifei Li, Ryan Chard, Yadu Babuji, Kyle Chard, Ian Foster, Zhuozhao Li

    Abstract: Modern scientific applications are increasingly decomposable into individual functions that may be deployed across distributed and diverse cyberinfrastructure such as supercomputers, clouds, and accelerators. Such applications call for new approaches to programming, distributed execution, and function-level management. We present UniFaaS, a parallel programming framework that relies on a federated…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: 13 pages, 13 figures, IPDPS2024

  10. arXiv:2403.06077  [pdf, other]

    cs.DC

    Steering a Fleet: Adaptation for Large-Scale, Workflow-Based Experiments

    Authors: Jim Pruyne, Valerie Hayot-Sasson, Weijian Zheng, Ryan Chard, Justin M. Wozniak, Tekin Bicer, Kyle Chard, Ian T. Foster

    Abstract: Experimental science is increasingly driven by instruments that produce vast volumes of data and thus a need to manage, compute, describe, and index this data. High performance and distributed computing provide the means of addressing the computing needs; however, in practice, the variety of actions required and the distributed set of resources involved, requires sophisticated "flows" defining the…

    Submitted 9 March, 2024; originally announced March 2024.

  11. arXiv:2312.06592  [pdf, other]

    cs.CV

    Flexible visual prompts for in-context learning in computer vision

    Authors: Thomas Foster, Ioana Croitoru, Robert Dorfman, Christoffer Edlund, Thomas Varsavsky, Jon Almazán

    Abstract: In this work, we address in-context learning (ICL) for the task of image segmentation, introducing a novel approach that adapts a modern Video Object Segmentation (VOS) technique for visual in-context learning. This adaptation is inspired by the VOS method's ability to efficiently and flexibly learn objects from a few examples. Through evaluations across a range of support set sizes and on diverse…

    Submitted 11 December, 2023; originally announced December 2023.

  12. arXiv:2312.03989  [pdf, other]

    cs.LG cond-mat.mtrl-sci eess.IV physics.data-an

    Rapid detection of rare events from in situ X-ray diffraction data using machine learning

    Authors: Weijian Zheng, Jun-Sang Park, Peter Kenesei, Ahsan Ali, Zhengchun Liu, Ian T. Foster, Nicholas Schwarz, Rajkumar Kettimuthu, Antonino Miceli, Hemant Sharma

    Abstract: High-energy X-ray diffraction methods can non-destructively map the 3D microstructure and associated attributes of metallic polycrystalline engineering materials in their bulk form. These methods are often combined with external stimuli such as thermo-mechanical loading to take snapshots over time of the evolving microstructure and attributes. However, the extreme data volumes and the high costs o…

    Submitted 6 December, 2023; originally announced December 2023.

  13. arXiv:2312.03876  [pdf, other]

    physics.ao-ph cs.AI cs.LG

    Scaling transformer neural networks for skillful and reliable medium-range weather forecasting

    Authors: Tung Nguyen, Rohan Shah, Hritik Bansal, Troy Arcomano, Sandeep Madireddy, Romit Maulik, Veerabhadra Kotamarthi, Ian Foster, Aditya Grover

    Abstract: Weather forecasting is a fundamental problem for anticipating and mitigating the impacts of climate change. Recently, data-driven approaches for weather forecasting based on deep learning have shown great promise, achieving accuracies that are competitive with operational systems. However, those methods often employ complex, customized architectures without sufficient ablation analysis, making it…

    Submitted 6 December, 2023; originally announced December 2023.

  14. arXiv:2310.18948  [pdf, other]

    cs.LG cs.AI cs.DM math.PR

    Multi-Path Long-Term Vessel Trajectories Forecasting with Probabilistic Feature Fusion for Problem Shifting

    Authors: Gabriel Spadon, Jay Kumar, Derek Eden, Josh van Berkel, Tom Foster, Amilcar Soares, Ronan Fablet, Stan Matwin, Ronald Pelot

    Abstract: This paper addresses the challenge of boosting the precision of multi-path long-term vessel trajectory forecasting on engineered sequences of Automatic Identification System (AIS) data using feature fusion for problem shifting. We have developed a deep auto-encoder model and a phased framework approach to predict the next 12 hours of vessel trajectories using 1 to 3 hours of AIS data as input. To…

    Submitted 10 July, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

  15. arXiv:2310.16270  [pdf, other]

    cs.CL cs.AI cs.LG

    Attention Lens: A Tool for Mechanistically Interpreting the Attention Head Information Retrieval Mechanism

    Authors: Mansi Sakarvadia, Arham Khan, Aswathy Ajith, Daniel Grzenda, Nathaniel Hudson, André Bauer, Kyle Chard, Ian Foster

    Abstract: Transformer-based Large Language Models (LLMs) are the state-of-the-art for natural language tasks. Recent work has attempted to decode, by reverse engineering the role of linear layers, the internal mechanisms by which LLMs arrive at their final predictions for text completion tasks. Yet little is known about the specific role of attention heads in producing the final token prediction. We propose…

    Submitted 24 October, 2023; originally announced October 2023.

  16. arXiv:2310.00510  [pdf, other]

    cs.RO

    Exploring Benchmarks for Self-Driving Labs using Color Matching

    Authors: Tobias Ginsburg, Kyle Hippe, Ryan Lewis, Doga Ozgulbas, Aileen Cleary, Rory Butler, Casey Stone, Abraham Stroka, Ian Foster

    Abstract: Self Driving Labs (SDLs) that combine automation of experimental procedures with autonomous decision making are gaining popularity as a means of increasing the throughput of scientific workflows. The task of identifying quantities of supplied colored pigments that match a target color, the color matching problem, provides a simple and flexible SDL test case, as it requires experiment proposal, sam…

    Submitted 30 September, 2023; originally announced October 2023.

  17. arXiv:2309.05605  [pdf, other]

    cs.CL cs.AI cs.LG

    Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models

    Authors: Mansi Sakarvadia, Aswathy Ajith, Arham Khan, Daniel Grzenda, Nathaniel Hudson, André Bauer, Kyle Chard, Ian Foster

    Abstract: Answering multi-hop reasoning questions requires retrieving and synthesizing information from diverse sources. Large Language Models (LLMs) struggle to perform such reasoning consistently. Here we propose an approach to pinpoint and rectify multi-hop reasoning failures through targeted memory injections on LLM attention heads. First, we analyze the per-layer activations of GPT-2 models in response…

    Submitted 28 February, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

    Comments: Oral Presentation at BlackboxNLP Workshop at EMNLP 2023

  18. arXiv:2308.13701  [pdf, other]

    cs.DC cs.AI

    Linking the Dynamic PicoProbe Analytical Electron-Optical Beam Line / Microscope to Supercomputers

    Authors: Alexander Brace, Rafael Vescovi, Ryan Chard, Nickolaus D. Saint, Arvind Ramanathan, Nestor J. Zaluzec, Ian Foster

    Abstract: The Dynamic PicoProbe at Argonne National Laboratory is undergoing upgrades that will enable it to produce up to 100s of GB of data per day. While this data is highly important for both fundamental science and industrial applications, there is currently limited on-site infrastructure to handle these high-volume data streams. We address this problem by providing a software architecture capable of s…

    Submitted 25 August, 2023; originally announced August 2023.

  19. arXiv:2308.09793  [pdf, other]

    cs.RO

    Towards a Modular Architecture for Science Factories

    Authors: Rafael Vescovi, Tobias Ginsburg, Kyle Hippe, Doga Ozgulbas, Casey Stone, Abraham Stroka, Rory Butler, Ben Blaiszik, Tom Brettin, Kyle Chard, Mark Hereld, Arvind Ramanathan, Rick Stevens, Aikaterini Vriza, Jie Xu, Qingteng Zhang, Ian Foster

    Abstract: Advances in robotic automation, high-performance computing (HPC), and artificial intelligence (AI) encourage us to conceive of science factories: large, general-purpose computation- and AI-enabled self-driving laboratories (SDLs) with the generality and scale needed both to tackle large discovery problems and to support thousands of scientists. Science factories require modular hardware and softwa…

    Submitted 17 October, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

  20. arXiv:2306.08695  [pdf, other]

    cond-mat.mtrl-sci cs.AI

    A generative artificial intelligence framework based on a molecular diffusion model for the design of metal-organic frameworks for carbon capture

    Authors: Hyun Park, Xiaoli Yan, Ruijie Zhu, E. A. Huerta, Santanu Chaudhuri, Donny Cooper, Ian Foster, Emad Tajkhorshid

    Abstract: Metal-organic frameworks (MOFs) exhibit great promise for CO2 capture. However, finding the best performing materials poses computational and experimental grand challenges in view of the vast chemical space of potential building blocks. Here, we introduce GHP-MOFassemble, a generative artificial intelligence (AI), high performance framework for the rational and accelerated design of MOFs with high…

    Submitted 12 March, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: 25 pages, 17 figures, 6 tables, accepted to Nature Communications Chemistry. This work was awarded the HPCwire 2023 Editors' Choice Award for Best Use of High Performance Data Analytics & Artificial Intelligence; see https://www.hpcwire.com/2023-readers-editors-choice-data-analytics-ai/

    ACM Class: I.2

    Journal ref: Commun Chem 7, 21 (2024)

  21. arXiv:2306.06283  [pdf, other]

    cond-mat.mtrl-sci cs.LG physics.chem-ph

    14 Examples of How LLMs Can Transform Materials Science and Chemistry: A Reflection on a Large Language Model Hackathon

    Authors: Kevin Maik Jablonka, Qianxiang Ai, Alexander Al-Feghali, Shruti Badhwar, Joshua D. Bocarsly, Andres M Bran, Stefan Bringuier, L. Catherine Brinson, Kamal Choudhary, Defne Circi, Sam Cox, Wibe A. de Jong, Matthew L. Evans, Nicolas Gastellu, Jerome Genzling, María Victoria Gil, Ankur K. Gupta, Zhi Hong, Alishba Imran, Sabine Kruschwitz, Anne Labarre, Jakub Lála, Tao Liu, Steven Ma, Sauradeep Majumdar, et al. (28 additional authors not shown)

    Abstract: Large-language models (LLMs) such as GPT-4 caught the interest of many scientists. Recent studies suggested that these models could be useful in chemistry and materials science. To explore these possibilities, we organized a hackathon. This article chronicles the projects built as part of this hackathon. Participants employed LLMs for various applications, including predicting properties of mole…

    Submitted 14 July, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

  22. arXiv:2305.09593  [pdf, other]

    cs.DC

    Accelerating Communications in Federated Applications with Transparent Object Proxies

    Authors: J. Gregory Pauloski, Valerie Hayot-Sasson, Logan Ward, Nathaniel Hudson, Charlie Sabino, Matt Baughman, Kyle Chard, Ian Foster

    Abstract: Advances in networks, accelerators, and cloud services encourage programmers to reconsider where to compute -- such as when fast networks make it cost-effective to compute on remote accelerators despite added latency. Workflow and cloud-hosted serverless computing frameworks can manage multi-step computations spanning federated collections of cloud, high-performance computing (HPC), and edge syste…

    Submitted 29 August, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

    Comments: Accepted for publication at the International Conference for High Performance Computing, Networking, Storage and Analysis (SC23)

  23. arXiv:2305.03275  [pdf, other]

    physics.plasm-ph

    Fast Correlation Heating in Moderately Coupled Electron-Ion Plasmas

    Authors: Thomas E. Foster, Henry Fetsch, Nathaniel J. Fisch

    Abstract: If the electrons in a plasma are suddenly heated, the resulting change in Debye shielding causes the ion kinetic energy to quickly increase. For the first time, this correlation heating, which is much faster than collisional energy exchange, is rigorously derived for a moderately coupled, electron-ion plasma. The electron-ion mass ratio is taken to be the smallest parameter in the BBGKY hierarchy,…

    Submitted 5 May, 2023; originally announced May 2023.

    Comments: 55 pages, 13 figures

  24. arXiv:2304.11120  [pdf, other]

    cond-mat.mtrl-sci

    What is missing in autonomous discovery: Open challenges for the community

    Authors: Phillip M. Maffettone, Pascal Friederich, Sterling G. Baird, Ben Blaiszik, Keith A. Brown, Stuart I. Campbell, Orion A. Cohen, Tantum Collins, Rebecca L. Davis, Ian T. Foster, Navid Haghmoradi, Mark Hereld, Nicole Jung, Ha-Kyung Kwon, Gabriella Pizzuto, Jacob Rintamaki, Casper Steinmann, Luca Torresi, Shijing Sun

    Abstract: Self-driving labs (SDLs) leverage combinations of artificial intelligence, automation, and advanced computing to accelerate scientific discovery. The promise of this field has given rise to a rich community of passionate scientists, engineers, and social scientists, as evidenced by the development of the Acceleration Consortium and recent Accelerate Conference. Despite its strengths, this rapidly…

    Submitted 2 May, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

  25. arXiv:2303.16441  [pdf, ps, other]

    math.AG

    Adic tropicalizations and cofinality of Gubler models

    Authors: Tyler Foster, Sam Payne

    Abstract: We introduce adic tropicalizations for subschemes of toric varieties as limits of Gubler models associated to polyhedral covers of the ordinary tropicalization. Our main result shows that Huber's adic analytification of a subscheme of a toric variety is naturally isomorphic to the inverse limit of its adic tropicalizations, in the category of locally topologically ringed spaces. The key new techni…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: 17 pages

    MSC Class: 14T05; 14G22

  26. arXiv:2303.11415  [pdf, other]

    physics.plasm-ph

    Temperature separation under compression of moderately-coupled plasma

    Authors: H. Fetsch, T. E. Foster, N. J. Fisch

    Abstract: In moderately-coupled plasmas, a significant fraction of the internal energy resides in electric fields. As these plasmas are heated or compressed, the shifting partition of energy between particles and fields leads to surprising effects, particularly when ions and electrons have different temperatures. In this work, quasi-equations of state (quasi-EOS) are derived for two-temperature moderately-c…

    Submitted 21 July, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

  27. A New Distance to the Supernova Remnant DA 530 Based on HI Absorption of Polarized Emission

    Authors: Rebecca A. Booth, Roland Kothes, Tom Landecker, Jo-Anne Brown, Andrew Gray, Tyler Foster, Eric Greisen

    Abstract: Supernova remnants (SNRs) are significant contributors of matter and energy to the interstellar medium. Understanding the impact and the mechanism of this contribution requires knowledge of the physical size, energy, and expansion rate of individual SNRs, which can only come if reliable distances can be obtained. We aim to determine the distance to the SNR DA 530 (G93.3+6.9), an object of low surf…

    Submitted 21 October, 2022; originally announced October 2022.

  28. Discovery of a filamentary synchrotron structure connected to the coherent magnetic field in the outer Galaxy

    Authors: J. L. West, J. L. Campbell, P. Bhaura, R. Kothes, S. Safi-Harb, J. M. Stil, A. R. Taylor, T. Foster, B. M. Gaensler, S. J. George, S. J. Gibson, R. Ricci

    Abstract: Using data from the Galactic Arecibo L-band Feed Array Continuum Transit Survey (GALFACTS), we report the discovery of two previously unidentified, very compressed, thin, and straight polarized filaments approximately centred at Galactic coordinates, $(l,b)=(182.5^\circ,-4.0^\circ)$, which we call G182.5--4.0. Using data from the Isaac Newton Telescope Galactic Plane Survey (IGAPS), we also find s…

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: 17 pages, 9 figures, accepted to Astrophysical Journal

  29. arXiv:2210.08973  [pdf, ps, other]

    cs.CY cs.HC cs.LG hep-ex

    FAIR for AI: An interdisciplinary and international community building perspective

    Authors: E. A. Huerta, Ben Blaiszik, L. Catherine Brinson, Kristofer E. Bouchard, Daniel Diaz, Caterina Doglioni, Javier M. Duarte, Murali Emani, Ian Foster, Geoffrey Fox, Philip Harris, Lukas Heinrich, Shantenu Jha, Daniel S. Katz, Volodymyr Kindratenko, Christine R. Kirkpatrick, Kati Lassila-Perini, Ravi K. Madduri, Mark S. Neubauer, Fotis E. Psomopoulos, Avik Roy, Oliver Rübel, Zhizhen Zhao, Ruike Zhu

    Abstract: A foundational set of findable, accessible, interoperable, and reusable (FAIR) principles were proposed in 2016 as prerequisites for proper data management and stewardship, with the goal of enabling the reusability of scholarly data. The principles were also meant to apply to other digital assets, at a high level, and over time, the FAIR guiding principles have been re-interpreted or extended to i…

    Submitted 1 August, 2023; v1 submitted 30 September, 2022; originally announced October 2022.

    Comments: 10 pages, comments welcome!; v2: 12 pages, accepted to Scientific Data

    ACM Class: I.2.0; E.0

    Journal ref: Scientific Data 10, 487 (2023)

  30. funcX: Federated Function as a Service for Science

    Authors: Zhuozhao Li, Ryan Chard, Yadu Babuji, Ben Galewsky, Tyler Skluzacek, Kirill Nagaitsev, Anna Woodard, Ben Blaiszik, Josh Bryan, Daniel S. Katz, Ian Foster, Kyle Chard

    Abstract: funcX is a distributed function as a service (FaaS) platform that enables flexible, scalable, and high performance remote function execution. Unlike centralized FaaS systems, funcX decouples the cloud-hosted management functionality from the edge-hosted execution functionality. funcX's endpoint software can be deployed, by users or administrators, on arbitrary laptops, clouds, clusters, and superc…

    Submitted 23 September, 2022; originally announced September 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2005.04215

  31. arXiv:2209.09408  [pdf, other]

    cs.LG eess.IV

    Deep learning at the edge enables real-time streaming ptychographic imaging

    Authors: Anakha V Babu, Tao Zhou, Saugat Kandel, Tekin Bicer, Zhengchun Liu, William Judge, Daniel J. Ching, Yi Jiang, Sinisa Veseli, Steven Henke, Ryan Chard, Yudong Yao, Ekaterina Sirazitdinova, Geetika Gupta, Martin V. Holt, Ian T. Foster, Antonino Miceli, Mathew J. Cherukara

    Abstract: Coherent microscopy techniques provide an unparalleled multi-scale view of materials across scientific and technological fields, from structural materials to quantum devices, from integrated circuits to biological cells. Driven by the construction of brighter sources and high-rate detectors, coherent X-ray microscopy methods like ptychography are poised to revolutionize nanoscale materials charact…

    Submitted 19 September, 2022; originally announced September 2022.

  32. arXiv:2208.09513  [pdf, other]

    cs.DC cs.AI

    Globus Automation Services: Research process automation across the space-time continuum

    Authors: Ryan Chard, Jim Pruyne, Kurt McKee, Josh Bryan, Brigitte Raumann, Rachana Ananthakrishnan, Kyle Chard, Ian Foster

    Abstract: Research process automation -- the reliable, efficient, and reproducible execution of linked sets of actions on scientific instruments, computers, data stores, and other resources -- has emerged as an essential element of modern science. We report here on new services within the Globus research data management platform that enable the specification of diverse research processes as reusable sets of…

    Submitted 6 December, 2022; v1 submitted 19 August, 2022; originally announced August 2022.

  33. arXiv:2207.00611  [pdf, other]

    cs.AI cond-mat.mtrl-sci cs.LG

    FAIR principles for AI models with a practical application for accelerated high energy diffraction microscopy

    Authors: Nikil Ravi, Pranshu Chaturvedi, E. A. Huerta, Zhengchun Liu, Ryan Chard, Aristana Scourtas, K. J. Schmidt, Kyle Chard, Ben Blaiszik, Ian Foster

    Abstract: A concise and measurable set of FAIR (Findable, Accessible, Interoperable and Reusable) principles for scientific data is transforming the state-of-practice for data management and stewardship, supporting and enabling discovery and innovation. Learning from this initiative, and acknowledging the impact of artificial intelligence (AI) in the practice of science and engineering, we introduce a set o…

    Submitted 21 December, 2022; v1 submitted 1 July, 2022; originally announced July 2022.

    Comments: 11 pages, 3 figures; Accepted to Scientific Data; for press release see https://www.anl.gov/article/argonne-scientists-promote-fair-standards-for-managing-artificial-intelligence-models and https://www.ncsa.illinois.edu/ncsa-student-researchers-lead-authors-on-award-winning-paper; Received 2022 HPCwire Readers' Choice Award on Best Use of High Performance Data Analytics & Artificial Intelligence

    MSC Class: 68T01; 68T05 ACM Class: I.2; J.2

    Journal ref: Scientific Data 9, 657 (2022)

  34. arXiv:2205.11342  [pdf, other]

    cs.CL cs.LG

    The Diminishing Returns of Masked Language Models to Science

    Authors: Zhi Hong, Aswathy Ajith, Gregory Pauloski, Eamon Duede, Kyle Chard, Ian Foster

    Abstract: Transformer-based masked language models such as BERT, trained on general corpora, have shown impressive performance on downstream tasks. It has also been demonstrated that the downstream task performance of such models can be improved by pretraining larger models for longer on more data. In this work, we empirically evaluate the extent to which these results extend to tasks in science. We use 14…

    Submitted 3 May, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: 12 pages. 3 figures. 5 tables. Accepted to the Findings of ACL 2023

    ACM Class: I.2.7

  35. arXiv:2205.10602  [pdf, ps, other]

    physics.ins-det

    Pushing compute and AI onto detector silicon

    Authors: Antonino Miceli, Kazutomo Yoshii, Ian T. Foster

    Abstract: In order to take full advantage of the U.S. Department of Energy's billion-dollar investments into the next-generation research infrastructure (e.g., exascale, light sources, colliders), advances are required not only in detector technology but also in computing and specifically AI. Let us consider an example from X-ray science. Nanoscale X-ray imaging is a crucial tool to enable a wide range of s…

    Submitted 21 May, 2022; originally announced May 2022.

    Comments: White paper for AI@DOE Roundtable, December 8-9, 2021 (virtual). arXiv admin note: text overlap with arXiv:2110.07828

  36. arXiv:2204.05128  [pdf, other]

    cs.DC

    Linking Scientific Instruments and HPC: Patterns, Technologies, Experiences

    Authors: Rafael Vescovi, Ryan Chard, Nickolaus Saint, Ben Blaiszik, Jim Pruyne, Tekin Bicer, Alex Lavens, Zhengchun Liu, Michael E. Papka, Suresh Narayanan, Nicholas Schwarz, Kyle Chard, Ian Foster

    Abstract: Powerful detectors at modern experimental facilities routinely collect data at multiple GB/s. Online analysis methods are needed to enable the collection of only interesting subsets of such massive data streams, such as by explicitly discarding some data elements or by directing instruments to relevant areas of experimental space. Such online analyses require methods for configuring and running hi…

    Submitted 22 August, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

  37. The History of the Grid

    Authors: Ian Foster, Carl Kesselman

    Abstract: With the widespread availability of high-speed networks, it becomes feasible to outsource computing to remote providers and to federate resources from many locations. Such observations motivated the development, from the mid-1990s onwards, of a range of innovative Grid technologies, applications, and infrastructures. We review the history, current status, and future prospects for Grid computing.

    Submitted 8 April, 2022; originally announced April 2022.

    Journal ref: High Performance Computing: From Grids and Clouds to Exascale, IOS Press, pages 3-30, 2011

  38. arXiv:2204.02881  [pdf]

    cond-mat.mtrl-sci

    Community Action on FAIR Data will Fuel a Revolution in Materials Research

    Authors: LC Brinson, LM Bartolo, B Blaiszik, D Elbert, I Foster, A Strachan, PW Voorhees

    Abstract: Data - arguably the most important product of worldwide materials research investment - are rarely shared. The small and biased proportion of results published are buried in plots and text licensed by journals. This situation wastes resources, hinders innovation, and, in the current era of data-driven discovery, is no longer tenable. In this comment, we identify opportunities for synergistic, coll…

    Submitted 23 February, 2023; v1 submitted 6 April, 2022; originally announced April 2022.

  39. Multi-Output Physics-Informed Neural Networks for Forward and Inverse PDE Problems with Uncertainties

    Authors: Mingyuan Yang, John T. Foster

    Abstract: Physics-informed neural networks (PINNs) have recently been used to solve various computational problems which are governed by partial differential equations (PDEs). In this paper, we propose a multi-output physics-informed neural network (MO-PINN) which can provide solutions with uncertainty distributions for both forward and inverse PDE problems with noisy data. In this framework, the uncertaint…

    Submitted 3 February, 2022; originally announced February 2022.

  40. arXiv:2201.11146  [pdf, other]

    math.NA

    Machine-learning of nonlocal kernels for anomalous subsurface transport from breakthrough curves

    Authors: Xiao Xu, Marta D'Elia, Christian Glusa, John T. Foster

    Abstract: Anomalous behavior is ubiquitous in subsurface solute transport due to the presence of high degrees of heterogeneity at different scales in the media. Although fractional models have been extensively used to describe the anomalous transport in various subsurface applications, their application is hindered by computational challenges. Simpler nonlocal models characterized by integrable kernels and…

    Submitted 28 January, 2022; v1 submitted 26 January, 2022; originally announced January 2022.

  41. CUF-Links: Continuous and Ubiquitous FAIRness Linkages for reproducible research

    Authors: Ian Foster, Carl Kesselman

    Abstract: Despite much creative work on methods and tools, reproducibility -- the ability to repeat the computational steps used to obtain a research result -- remains elusive. One reason for these difficulties is that extant tools for capturing research processes do not align well with the rich working practices of scientists. We advocate here for simple mechanisms that can be integrated easily with curren…

    Submitted 20 January, 2022; originally announced January 2022.

    Journal ref: Computer, vol. 55, no. 8, pp. 20-30, Aug. 2022

  42. Sharing Begins at Home

    Authors: William Dempsey, Ian Foster, Scott Fraser, Carl Kesselman

    Abstract: The broad sharing of research data is widely viewed as of critical importance for the speed, quality, accessibility, and integrity of science. Despite increasing efforts to encourage data sharing, both the quality of shared data, and the frequency of data reuse, remain stubbornly low. We argue here that a major reason for this unfortunate state of affairs is that the organization of research resul…

    Submitted 8 July, 2022; v1 submitted 17 January, 2022; originally announced January 2022.

    Journal ref: Harvard Data Science Review, Volume 4, Issue 3, 2022

  43. arXiv:2111.11330  [pdf, other]

    cs.DC

    High-Performance Ptychographic Reconstruction with Federated Facilities

    Authors: Tekin Bicer, Xiaodong Yu, Daniel J. Ching, Ryan Chard, Mathew J. Cherukara, Bogdan Nicolae, Rajkumar Kettimuthu, Ian T. Foster

    Abstract: Beamlines at synchrotron light source facilities are powerful scientific instruments used to image samples and observe phenomena at high spatial and temporal resolutions. Typically, these facilities are equipped only with modest compute resources for the analysis of generated experimental datasets. However, high data rate experiments can easily generate data in volumes that take days (or even week…

    Submitted 22 November, 2021; originally announced November 2021.

    Comments: 19 pages, 5 figures, to be published in Smoky Mountains Computational Sciences and Engineering Conference (SMC 2021)

  44. GASKAP-HI Pilot Survey Science I: ASKAP Zoom Observations of HI Emission in the Small Magellanic Cloud

    Authors: N. M. Pingel, J. Dempsey, N. M. McClure-Griffiths, J. M. Dickey, K. E. Jameson, H. Arce, G. Anglada, J. Bland-Hawthorn, S. L. Breen, F. Buckland-Willis, S. E. Clark, J. R. Dawson, H. Dénes, E. M. Di Teodoro, B. -Q. For, Tyler J. Foster, J. F. Gómez, H. Imai, G. Joncas, C. -G. Kim, M. -Y. Lee, C. Lynn, D. Leahy, Y. K. Ma, A. Marchal, et al. (31 additional authors not shown)

    Abstract: We present the most sensitive and detailed view of the neutral hydrogen (HI) emission associated with the Small Magellanic Cloud (SMC), through the combination of data from the Australian Square Kilometre Array Pathfinder (ASKAP) and Parkes (Murriyang), as part of the Galactic Australian Square Kilometre Array Pathfinder (GASKAP) pilot survey. These GASKAP-HI pilot observations, for the first time…

    Submitted 10 December, 2021; v1 submitted 9 November, 2021; originally announced November 2021.

    Comments: Accepted for publication in PASA, 34 pages, 18 figures, 5 tables

  45. arXiv:2110.02827  [pdf, other]

    cs.DC cond-mat.mtrl-sci cs.LG

    Colmena: Scalable Machine-Learning-Based Steering of Ensemble Simulations for High Performance Computing

    Authors: Logan Ward, Ganesh Sivaraman, J. Gregory Pauloski, Yadu Babuji, Ryan Chard, Naveen Dandu, Paul C. Redfern, Rajeev S. Assary, Kyle Chard, Larry A. Curtiss, Rajeev Thakur, Ian Foster

    Abstract: Scientific applications that involve simulation ensembles can be accelerated greatly by using experiment design methods to select the best simulations to perform. Methods that use machine learning (ML) to create proxy models of simulations show particular promise for guiding ensembles but are challenging to deploy because of the need to coordinate dynamic mixes of simulation and learning tasks. We…

    Submitted 6 October, 2021; originally announced October 2021.

    Comments: camera-ready version for ML in HPC Environments 2021

  46. arXiv:2108.11265  [pdf, other]

    math.NA

    Coupling of IGA and Peridynamics for Air-Blast Fluid-Structure Interaction Using an Immersed Approach

    Authors: Masoud Behzadinasab, Georgios Moutsanidis, Nathaniel Trask, John T. Foster, Yuri Bazilevs

    Abstract: We present a novel formulation based on an immersed coupling of Isogeometric Analysis (IGA) and Peridynamics (PD) for the simulation of fluid-structure interaction (FSI) phenomena for air blast. We aim to develop a practical computational framework that is capable of capturing the mechanics of air blast coupled to solids and structures that undergo large, inelastic deformations with extreme damage…

    Submitted 25 August, 2021; originally announced August 2021.

    Comments: 26 pages, 12 figures

  47. A Petrov-Galerkin method for nonlocal convection-dominated diffusion problems

    Authors: Yu Leng, Xiaochuan Tian, Leszek Demkowicz, Hector Gomez, John T. Foster

    Abstract: We present a Petrov-Galerkin (PG) method for a class of nonlocal convection-dominated diffusion problems. There are two main ingredients in our approach. First, we define the norm on the test space as induced by the trial space norm, i.e., the optimal test norm, so that the inf-sup condition can be satisfied uniformly independent of the problem. We show the well-posedness of a class of nonlocal co…

    Submitted 14 July, 2021; originally announced July 2021.

  48. Toward Interlanguage Parallel Scripting for Distributed-Memory Scientific Computing

    Authors: Justin M. Wozniak, Timothy G. Armstrong, Ketan C. Maheshwari, Daniel S. Katz, Michael Wilde, Ian T. Foster

    Abstract: Scripting languages such as Python and R have been widely adopted as tools for the productive development of scientific software because of the power and expressiveness of the languages and available libraries. However, deploying scripted applications on large-scale parallel computer systems such as the IBM Blue Gene/Q or Cray XE6 is a challenge because of issues including operating system limitat…

    Submitted 6 July, 2021; originally announced July 2021.

    Comments: 2015 IEEE International Conference on Cluster Computing

  49. KAISA: An Adaptive Second-Order Optimizer Framework for Deep Neural Networks

    Authors: J. Gregory Pauloski, Qi Huang, Lei Huang, Shivaram Venkataraman, Kyle Chard, Ian Foster, Zhao Zhang

    Abstract: Kronecker-factored Approximate Curvature (K-FAC) has recently been shown to converge faster in deep neural network (DNN) training than stochastic gradient descent (SGD); however, K-FAC's larger memory footprint hinders its applicability to large models. We present KAISA, a K-FAC-enabled, Adaptable, Improved, and ScAlable second-order optimizer framework that adapts the memory footprint, communicat…

    Submitted 20 September, 2021; v1 submitted 4 July, 2021; originally announced July 2021.

    Comments: Accepted for publication at the International Conference for High Performance Computing, Networking, Storage and Analysis (SC21)

  50. A FETI approach to domain decomposition for meshfree discretizations of nonlocal problems

    Authors: Xiao Xu, Christian Glusa, Marta D'Elia, John T. Foster

    Abstract: We propose a domain decomposition method for the efficient simulation of nonlocal problems. Our approach is based on a multi-domain formulation of a nonlocal diffusion problem where the subdomains share "nonlocal" interfaces of the size of the nonlocal horizon. This system of nonlocal equations is first rewritten in terms of minimization of a nonlocal energy, then discretized with a meshfree appro…

    Submitted 15 May, 2021; originally announced May 2021.

    Report number: SAND2021-5958 O