-
Towards Unlocking Insights from Logbooks Using AI
Authors: Antonin Sulc, Alex Bien, Annika Eichler, Daniel Ratner, Florian Rehm, Frank Mayet, Gregor Hartmann, Hayden Hoschouer, Henrik Tuennermann, Jan Kaiser, Jason St. John, Jennefer Maldonado, Kyle Hazelwood, Raimund Kammering, Thorsten Hellert, Tim Wilksen, Verena Kain, Wan-Lin Hu
Abstract:
Electronic logbooks contain valuable information about activities and events concerning their associated particle accelerator facilities. However, the highly technical nature of logbook entries can hinder their usability and automation. As natural language processing (NLP) continues advancing, it offers opportunities to address various challenges that logbooks present. This work explores jointly testing a tailored Retrieval Augmented Generation (RAG) model for enhancing the usability of particle accelerator logbooks at institutes like DESY, BESSY, Fermilab, BNL, SLAC, LBNL, and CERN. The RAG model uses a corpus built on logbook contributions and aims to unlock insights from these logbooks by leveraging retrieval over facility datasets, including discussion about potential multimodal sources. Our goals are to increase the FAIR-ness (findability, accessibility, interoperability, and reusability) of logbooks by exploiting their information content to streamline everyday use, enable macro-analysis for root cause analysis, and facilitate problem-solving automation.
Submitted 25 May, 2024;
originally announced June 2024.
-
Automated Anomaly Detection on European XFEL Klystrons
Authors: Antonin Sulc, Annika Eichler, Tim Wilksen
Abstract:
High-power multi-beam klystrons are a key component for amplifying the RF that generates the accelerating field of the superconducting radio frequency (SRF) cavities at European XFEL. Exchanging these high-power components takes time and effort, so it is necessary to minimize maintenance and downtime while maximizing the device's operation. To explore the behavior of klystrons using machine learning, we completed a series of experiments on our klystrons to determine various operational modes and applied feature extraction and dimensionality reduction to extract the most valuable information about normal operation. To analyze the recorded data, we used state-of-the-art data-driven learning techniques and identified the most promising components that might help us better understand klystron operational states and identify possible faults or anomalies early on.
Submitted 20 May, 2024;
originally announced May 2024.
-
PACuna: Automated Fine-Tuning of Language Models for Particle Accelerators
Authors: Antonin Sulc, Raimund Kammering, Annika Eichler, Tim Wilksen
Abstract:
Navigating the landscape of particle accelerators has become increasingly challenging with recent surges in contributions. These intricate devices challenge comprehension, even within individual facilities. To address this, we introduce PACuna, a fine-tuned language model refined through publicly available accelerator resources like conferences, pre-prints, and books. We automated data collection and question generation to minimize expert involvement and make the data publicly available. PACuna demonstrates proficiency in addressing intricate accelerator questions, validated by experts. Our approach shows that adapting language models to scientific domains by fine-tuning on technical texts and auto-generated corpora capturing the latest developments can produce models that answer some intricate questions that commercially available assistants cannot, and that such models can serve as intelligent assistants for individual facilities.
Submitted 27 November, 2023; v1 submitted 29 October, 2023;
originally announced October 2023.
-
Textual Analysis of ICALEPCS and IPAC Conference Proceedings: Revealing Research Trends, Topics, and Collaborations for Future Insights and Advanced Search
Authors: Antonin Sulc, Annika Eichler, Tim Wilksen
Abstract:
In this paper, we present a textual analysis of past ICALEPCS and IPAC conference proceedings to gain insights into the research trends and topics discussed in the field. We use natural language processing techniques to extract meaningful information from the abstracts and papers of past conference proceedings. We extract topics to visualize and identify trends, analyze their evolution to identify emerging research directions, and highlight interesting publications based solely on their content with an analysis of their network. Additionally, we will provide an advanced search tool for the existing papers to help prevent duplicated work and make references easier to find. Our analysis provides a comprehensive overview of the research landscape in the field and helps researchers and practitioners better understand the state of the art and identify areas for future research.
Submitted 13 October, 2023;
originally announced October 2023.
-
Log Anomaly Detection on EuXFEL Nodes
Authors: Antonin Sulc, Annika Eichler, Tim Wilksen
Abstract:
This article introduces a method to detect anomalies in the log data generated by control system nodes at the European XFEL accelerator. The primary aim of the proposed method is to provide operators with a comprehensive understanding of the availability, status, and problems specific to each node. This information is vital for ensuring smooth operation. The sequential nature of logs and the absence of a rich text corpus specific to our nodes pose significant limitations for traditional and learning-based approaches to anomaly detection. To overcome these limitations, we propose a method that uses word embedding and models individual nodes as a sequence of these vectors that commonly co-occur, using a Hidden Markov Model (HMM). We score individual log entries by computing a probability ratio between the probability of the full log sequence including the new entry and the probability of just the previous log entries, without the new entry. This ratio indicates how probable the sequence becomes when the new entry is added. The proposed approach can detect anomalies by scoring and ranking log entries from EuXFEL nodes, where entries that receive high scores are potential anomalies that do not fit the routine of the node. This method provides a warning system to alert operators about irregular log events that may indicate issues.
Submitted 13 October, 2023;
originally announced October 2023.
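The probability-ratio scoring described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a discrete-emission HMM over integer symbols (the abstract works with word-embedding vectors instead), and it assumes the score is the negative log of the ratio P(o_1..o_t) / P(o_1..o_{t-1}), so that unlikely entries receive high scores. The names `log_forward`, `anomaly_scores`, `start`, `trans`, and `emit` are hypothetical.

```python
import numpy as np


def log_forward(obs, start, trans, emit):
    """Return log P(obs[0..t]) for every t, using the scaled forward algorithm.

    start: (S,) initial state probabilities
    trans: (S, S) transition matrix, trans[i, j] = P(state j | state i)
    emit:  (S, V) emission matrix, emit[i, k] = P(symbol k | state i)
    """
    alpha = start * emit[:, obs[0]]
    scale = alpha.sum()
    log_prob = np.log(scale)
    alpha = alpha / scale          # keep alpha normalized to avoid underflow
    out = [log_prob]
    for symbol in obs[1:]:
        alpha = (alpha @ trans) * emit[:, symbol]
        scale = alpha.sum()        # equals P(symbol | all previous symbols)
        log_prob += np.log(scale)
        alpha = alpha / scale
        out.append(log_prob)
    return np.array(out)


def anomaly_scores(obs, start, trans, emit):
    """Score each entry as -log( P(o_1..o_t) / P(o_1..o_{t-1}) ).

    Assumption: the sign convention (higher score = less expected entry)
    is chosen here to match "high scores are potential anomalies".
    """
    lp = log_forward(obs, start, trans, emit)
    prev = np.concatenate(([0.0], lp[:-1]))  # log P(empty sequence) = 0
    return -(lp - prev)
```

Ranking the entries of one node by `anomaly_scores` in descending order then puts the entries that least fit the learned routine at the top. In practice the HMM parameters would be fitted to historical logs of that node first.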
-
Towards Monocular Shape from Refraction
Authors: Antonin Sulc, Imari Sato, Bastian Goldluecke, Tali Treibitz
Abstract:
Refraction is a common physical phenomenon and has long been researched in computer vision. Objects imaged through a refractive object appear distorted in the image as a function of the shape of the interface between the media. This hinders many computer vision applications, but can be utilized for obtaining the geometry of the refractive interface. Previous approaches for refractive surface recovery largely relied on various priors or additional information like multiple images of the analyzed surface. In contrast, we claim that a simple energy function based on Snell's law enables the reconstruction of an arbitrary refractive surface geometry using just a single image and known background texture and geometry. In the case of a single point, Snell's law has two degrees of freedom; therefore, to estimate a surface depth, we need additional information. We show that solving for an entire surface at once introduces implicit parameter-free spatial regularization and yields convincing results when an intelligent initial guess is provided. We demonstrate our approach through simulations and real-world experiments, where the reconstruction shows encouraging results in the single-frame monocular setting.
Submitted 31 May, 2023;
originally announced May 2023.
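For readers unfamiliar with the relation the abstract above builds on, the following sketch shows the standard vector form of Snell's law, which maps an incoming ray and a surface normal to the refracted ray. It is only the forward relation, not the authors' energy function or reconstruction method. The function name `refract` and the parameter `eta` (ratio of refractive indices n1 / n2) are naming choices made here for illustration.

```python
import numpy as np


def refract(direction, normal, eta):
    """Refract a ray at an interface using the vector form of Snell's law.

    direction: incoming ray direction (any length, normalized internally)
    normal:    surface normal pointing against the incoming ray
    eta:       n1 / n2, refractive index of the incident medium over
               that of the transmitting medium
    Returns the unit refracted direction, or None on total internal reflection.
    """
    d = np.asarray(direction, dtype=float)
    n = np.asarray(normal, dtype=float)
    d = d / np.linalg.norm(d)
    n = n / np.linalg.norm(n)
    cos_i = -np.dot(n, d)                      # cosine of the incidence angle
    k = 1.0 - eta**2 * (1.0 - cos_i**2)        # squared cosine of the refraction angle
    if k < 0.0:
        return None                            # total internal reflection
    return eta * d + (eta * cos_i - np.sqrt(k)) * n
```

For a fixed incoming ray, the refracted direction depends on the unit normal at that point, which is why a single observed point does not pin down the surface without further constraints.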