Search | arXiv e-print repository

Fully invertible hyperbolic neural networks for segmenting large-scale surface and sub-surface data

Authors: Bas Peters, Eldad Haber, Keegan Lensink

Abstract: The large spatial/temporal/frequency scale of geoscience and remote-sensing datasets causes memory issues when using convolutional neural networks for (sub-) surface data segmentation. Recently developed fully reversible or fully invertible networks can mostly avoid memory limitations by recomputing the states during the backward pass through the network. This results in a low and fixed memory requirement for storing network states, as opposed to the typical linear memory growth with network depth. This work focuses on a fully invertible network based on the telegraph equation. While reversibility saves the major amount of memory used in deep networks by the data, the convolutional kernels can take up most memory if fully invertible networks contain multiple invertible pooling/coarsening layers. We address the explosion of the number of convolutional kernels by combining fully invertible networks with layers that contain the convolutional kernels in a compressed form directly. A second challenge is that invertible networks output a tensor the same size as its input. This property prevents the straightforward application of invertible networks to applications that map between different input-output dimensions, need to map to outputs with more channels than present in the input data, or desire outputs that decrease/increase the resolution compared to the input data. However, we show that by employing invertible networks in a non-standard fashion, we can still use them for these tasks. Examples in hyperspectral land-use classification, airborne geophysical surveying, and seismic imaging illustrate that we can input large data volumes in one chunk and do not need to work on small patches, use dimensionality reduction, or employ methods that classify a patch to a single central pixel.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 22 pages, 13 figures

MSC Class: 86A04

arXiv:2407.00257 [pdf, other]

Inverting airborne electromagnetic data with machine learning

Authors: Michael S. McMillan, Bas Peters, Ophir Greif, Paulina Wozniakowska, Eldad Haber

Abstract: This study focuses on inverting time-domain airborne electromagnetic data in 2D by training a neural-network to understand the relationship between data and conductivity, thereby removing the need for expensive forward modeling during the inversion process. Instead the forward modeling is completed in the training stage, where training models are built before calculating 3D forward modeling training data. The method relies on training data being similar to the field dataset of choice, therefore, the field data was first inverted in 1D to get an idea of the expected conductivity distribution. With this information, 10,000 training models were built with similar conductivity ranges, and the research shows that this provided enough information for the network to produce realistic 2D inversion models over an aquifer-bearing region in California. Once the training was completed, the actual inversion time took only a matter of seconds on a generic laptop, which means that if future data was collected in this region it could be inverted in near real-time. Better results are expected by increasing the number of training models and eventually the goal is to extend the method to 3D inversion.

Submitted 28 June, 2024; originally announced July 2024.

Comments: 4 pages, 5 figures, conference submission

MSC Class: 86A22

arXiv:2405.13220 [pdf, other]

Paired Autoencoders for Inverse Problems

Authors: Matthias Chung, Emma Hart, Julianne Chung, Bas Peters, Eldad Haber

Abstract: We consider the solution of nonlinear inverse problems where the forward problem is a discretization of a partial differential equation. Such problems are notoriously difficult to solve in practice and require minimizing a combination of a data-fit term and a regularization term. The main computational bottleneck of typical algorithms is the direct estimation of the data misfit. Therefore, likelihood-free approaches have become appealing alternatives. Nonetheless, difficulties in generalization and limitations in accuracy have hindered their broader utility and applicability. In this work, we use a paired autoencoder framework as a likelihood-free estimator for inverse problems. We show that the use of such an architecture allows us to construct a solution efficiently and to overcome some known open problems when using likelihood-free estimators. In particular, our framework can assess the quality of the solution and improve on it if needed. We demonstrate the viability of our approach using examples from full waveform inversion and inverse electromagnetic imaging.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 18 pages, 6 figures

arXiv:2404.06377 [pdf, ps, other]

doi 10.1103/PhysRevAccelBeams.22.083202

Experimental study of second sound quench detection for superconducting cavities

Authors: Juliette Plouin, Bertrand Baudouy, Aurélien Four, Jean-Pierre Charrier, Luc Maurice, Jorge Novo, Benedikt Peters, Kitty Liao

Abstract: Superconducting RF cavities are used in particle accelerators to provide energy to the particle beam. Such cavities are mostly fabricated in niobium and often operated in superfluid helium. One of their limits of operation is the appearance of a local quench, initiated by a local field enhancement due to a defect, which leads to a normal conducting transition of the cavity. Localizing the quench area can be achieved with temperature mapping systems. Another method is the use of second sound wave propagation in superfluid helium. Measuring the time of propagation of these waves from quench location to special sensors, called Oscillating Superleak Transducers (OSTs), and using their well-known velocity should allow trilateration. However, most experimental measurements on cavities show "premature signals", i.e. the second sound signals arrive earlier on the OSTs than expected. This paper presents several quench experiments on cavities equipped with OSTs and temperature mapping quench detection systems. Two hypotheses can explain the observed premature signals. The first one assesses faster propagation in helium. An experimental setup has been developed for testing this hypothesis, where second sound is created by a localized heater in a controlled environment up to 4.3 kW/cm2 and 2.8 J. Premature signals could not be verified in this setup. A second hypothesis based on a simple model including several processes in niobium and second sound propagation in helium is discussed. The model significantly improves the prediction of the times of arrival of the second sound waves. The overall study shows that the processes in niobium play a prominent role in the second sound detection for superconducting cavities.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 12 pages, 22 images

Journal ref: Phys. Rev. Accel. Beams 22, 083202 (2019)

arXiv:2403.03923 [pdf, other]

Did Translation Models Get More Robust Without Anyone Even Noticing?

Authors: Ben Peters, André F. T. Martins

Abstract: Neural machine translation (MT) models achieve strong results across a variety of settings, but it is widely believed that they are highly sensitive to "noisy" inputs, such as spelling errors, abbreviations, and other formatting issues. In this paper, we revisit this insight in light of recent multilingual MT models and large language models (LLMs) applied to machine translation. Somewhat surprisingly, we show through controlled experiments that these models are far more robust to many kinds of noise than previous models, even when they perform similarly on clean data. This is notable because, even though LLMs have more parameters and more complex training processes than past models, none of the open ones we consider use any techniques specifically designed to encourage robustness. Next, we show that similar trends hold for social media translation experiments -- LLMs are more robust to social media text. We include an analysis of the circumstances in which source correction techniques can be used to mitigate the effects of noise. Altogether, we show that robustness to many types of noise has increased.

Submitted 6 March, 2024; originally announced March 2024.

arXiv:2403.03560 [pdf, other]

Sparse convex relaxations in polynomial optimization

Authors: Gennadiy Averkov, Benjamin Peters, Sebastian Sager

Abstract: We present a novel, general, and unifying point of view on sparse approaches to polynomial optimization. Solving polynomial optimization problems to global optimality is a ubiquitous challenge in many areas of science and engineering. Historically, different approaches on how to solve nonconvex polynomial optimization problems based on convex relaxations have been developed in different scientific communities. Here, we introduce the concept of monomial patterns. A pattern determines what monomials are to be linked by convex constraints in a convex relaxation of a polynomial optimization problem. This concept helps to understand existing approaches from different schools of thought, to develop novel relaxation schemes, and to derive a flexible duality theory, which can be specialized to many concrete situations that have been considered in the literature. We unify different approaches to polynomial optimization including polyhedral approximations, dense semidefinite relaxations, SONC, SAGE, and TSSOS in a self-contained exposition. We also carry out computational experiments to demonstrate the practical advantages of a flexible usage of pattern-based sparse relaxations of polynomial optimization problems.

Submitted 6 March, 2024; originally announced March 2024.

Comments: This manuscript evolved from arxiv:1901.05675 but its presentation and contents are significantly different. The manuscript uses the same numerical results as in arxiv:1901.05675

MSC Class: 90C23; 90C22; 90C26; 13J30

arXiv:2402.17733 [pdf, other]

Tower: An Open Multilingual Large Language Model for Translation-Related Tasks

Authors: Duarte M. Alves, José Pombal, Nuno M. Guerreiro, Pedro H. Martins, João Alves, Amin Farajian, Ben Peters, Ricardo Rei, Patrick Fernandes, Sweta Agrawal, Pierre Colombo, José G. C. de Souza, André F. T. Martins

Abstract: While general-purpose large language models (LLMs) demonstrate proficiency on multiple tasks within the domain of translation, approaches based on open LLMs are competitive only when specializing on a single task. In this paper, we propose a recipe for tailoring LLMs to multiple tasks present in translation workflows. We perform continued pretraining on a multilingual mixture of monolingual and parallel data, creating TowerBase, followed by finetuning on instructions relevant for translation processes, creating TowerInstruct. Our final model surpasses open alternatives on several tasks relevant to translation workflows and is competitive with general-purpose closed LLMs. To facilitate future research, we release the Tower models, our specialization dataset, an evaluation framework for LLMs focusing on the translation ecosystem, and a collection of model generations, including ours, on our benchmark.

Submitted 27 February, 2024; originally announced February 2024.

arXiv:2402.12604 [pdf, ps, other]

Generative Adversarial Collaborations: A practical guide for conference organizers and participating scientists

Authors: Gunnar Blohm, Benjamin Peters, Ralf Haefner, Leyla Isik, Nikolaus Kriegeskorte, Jennifer S. Lieberman, Carlos R. Ponce, Gemma Roig, Megan A. K. Peters

Abstract: Generative adversarial collaborations (GACs) are a form of formal teamwork between groups of scientists with diverging views. The goal of GACs is to identify and ultimately resolve the most important challenges, controversies, and exciting theoretical and empirical debates in a given research field. A GAC team would develop specific, agreed-upon avenues to resolve debates in order to move a field of research forward in a collaborative way. Such adversarial collaborations have many benefits and opportunities but also come with challenges. Here, we use our experience from (1) creating and running the GAC program for the Cognitive Computational Neuroscience (CCN) conference and (2) implementing and leading GACs on particular scientific problems to provide a practical guide for future GAC program organizers and leaders of individual GACs.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2401.06005 [pdf, other]

How does the primate brain combine generative and discriminative computations in vision?

Authors: Benjamin Peters, James J. DiCarlo, Todd Gureckis, Ralf Haefner, Leyla Isik, Joshua Tenenbaum, Talia Konkle, Thomas Naselaris, Kimberly Stachenfeld, Zenna Tavares, Doris Tsao, Ilker Yildirim, Nikolaus Kriegeskorte

Abstract: Vision is widely understood as an inference problem. However, two contrasting conceptions of the inference process have each been influential in research on biological vision as well as the engineering of machine vision. The first emphasizes bottom-up signal flow, describing vision as a largely feedforward, discriminative inference process that filters and transforms the visual information to remove irrelevant variation and represent behaviorally relevant information in a format suitable for downstream functions of cognition and behavioral control. In this conception, vision is driven by the sensory data, and perception is direct because the processing proceeds from the data to the latent variables of interest. The notion of "inference" in this conception is that of the engineering literature on neural networks, where feedforward convolutional neural networks processing images are said to perform inference. The alternative conception is that of vision as an inference process in Helmholtz's sense, where the sensory evidence is evaluated in the context of a generative model of the causal processes giving rise to it. In this conception, vision inverts a generative model through an interrogation of the evidence in a process often thought to involve top-down predictions of sensory data to evaluate the likelihood of alternative hypotheses. The authors include scientists rooted in roughly equal numbers in each of the conceptions and motivated to overcome what might be a false dichotomy between them and engage the other perspective in the realm of theory and experiment. The primate brain employs an unknown algorithm that may combine the advantages of both conceptions. We explain and clarify the terminology, review the key empirical evidence, and propose an empirical research program that transcends the dichotomy and sets the stage for revealing the mysterious hybrid algorithm of primate vision.

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2312.13480 [pdf, other]

InvertibleNetworks.jl: A Julia package for scalable normalizing flows

Authors: Rafael Orozco, Philipp Witte, Mathias Louboutin, Ali Siahkoohi, Gabrio Rizzuti, Bas Peters, Felix J. Herrmann

Abstract: InvertibleNetworks.jl is a Julia package designed for the scalable implementation of normalizing flows, a method for density estimation and sampling in high-dimensional distributions. This package excels in memory efficiency by leveraging the inherent invertibility of normalizing flows, which significantly reduces memory requirements during backpropagation compared to existing normalizing flow packages that rely on automatic differentiation frameworks. InvertibleNetworks.jl has been adapted for diverse applications, including seismic imaging, medical imaging, and CO2 monitoring, demonstrating its effectiveness in learning high-dimensional distributions.

Submitted 20 December, 2023; originally announced December 2023.

Comments: Submitted to Journal of Open Source Software (JOSS)

arXiv:2311.17533 [pdf, other]

A general model and toolkit for the ionization of three or more electrons in strongly driven molecules using an effective Coulomb potential for the interaction between bound electrons

Authors: Georgios Petros Katsoulis, Matthew Benjamin Peters, Agapi Emmanouilidou

Abstract: We formulate a general three-dimensional semiclassical model for the study of correlated multielectron escape during fragmentation of molecules driven by intense infrared laser pulses, while fully accounting for the magnetic field of the laser pulse. We do so in the context of triple ionization of strongly driven HeH$_{2}^{+}$. Our model fully accounts for the singularity in the Coulomb potentials of a recolliding electron with the core and a bound electron with the core as well as for the interaction of a recolliding with a bound electron. To avoid artificial autoionization, our model employs effective potentials to treat the interaction between bound electrons. We focus on triple and double ionization as well as frustrated triple and frustrated double ionization. In these processes, we identify and explain the main features of the sum of the kinetic energies of the final ion fragments. We find that frustrated double ionization is a major ionization process, and we identify the different channels and hence different final fragments that are obtained through frustrated double ionization. Also, we discuss the differences between frustrated double and triple ionization.

Submitted 29 November, 2023; originally announced November 2023.

Comments: 20 pages, 9 figures

arXiv:2309.04053 [pdf, other]

Materials Design for Hypersonics

Authors: Adam B. Peters, Dajie Zhang, Samuel Chen, Catherine Ott, Corey Oses, Stefano Curtarolo, Ian McCue, Tresa Pollock, Suhas Eswarappa Prameela

Abstract: Hypersonic vehicles must withstand extreme conditions during flights that exceed five times the speed of sound. These systems have the potential to facilitate rapid access to space, bolster defense capabilities, and create a new paradigm for transcontinental earth-to-earth travel. However, extreme aerothermal environments create significant challenges for vehicle materials and structures. This work addresses the critical need to develop resilient refractory alloys, composites, and ceramics. We will highlight key design principles for critical vehicle areas such as primary structures, thermal protection, and propulsion systems; the role of theory and computation; and strategies for advancing laboratory-scale materials to flight-ready components.

Submitted 23 January, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

arXiv:2302.10895 [pdf, other]

CQnet: convex-geometric interpretation and constraining neural-network trajectories

Authors: Bas Peters

Abstract: We introduce CQnet, a neural network with origins in the CQ algorithm for solving convex split-feasibility problems and forward-backward splitting. CQnet's trajectories are interpretable as particles that are tracking a changing constraint set via its point-to-set distance function while being elements of another constraint set at every layer. More than just a convex-geometric interpretation, CQnet accommodates learned and deterministic constraints that may be sample or data-specific and are satisfied by every layer and the output. Furthermore, the states in CQnet progress toward another constraint set at every layer. We provide proof of stability/nonexpansiveness with minimal assumptions. The combination of constraint handling and stability puts forward CQnet as a candidate for various tasks where prior knowledge exists on the network states or output.

Submitted 9 February, 2023; originally announced February 2023.

Comments: 12 pages, 7 figures

MSC Class: 68T07

ACM Class: I.2.6; G.1.6

arXiv:2302.03777 [pdf, other]

doi 10.1103/PhysRevA.107.L041101

Singularity in electron-core potential as a gateway to accurate multi-electron ionization spectra in strongly driven atoms

Authors: Agapi Emmanouilidou, Matthew Benjamin Peters, Georgios Petros Katsoulis

Abstract: We demonstrate a general three-dimensional semiclassical model as a powerful technique for the study of correlated multi-electron escape in atoms driven by infrared laser pulses at intensities where electron-electron correlation prevails. We do so in the context of triple ionization of strongly driven Ne. We show that a drawback of other current quantum mechanical and classical models of triple ionization is that they soften the Coulomb potential of each electron with the core. The model we employ fully accounts for the singularity in the Coulomb potentials of a recolliding electron with the core and a bound electron with the core as well as for the interaction of a recolliding with a bound electron. Our model treats approximately only the interaction between bound electrons through the use of effective potentials. These effective potentials ensure that no artificial autoionization takes place as a result of the full treatment of the electron-core potential. We demonstrate the accuracy of our model by obtaining triple ionization distributions of the sum of the final electron momenta which we find to be in very good agreement with experiments. Also, we explain the main features of these momenta distributions in terms of the prevalent pathways of correlated three-electron escape in Ne. We also show that the different ionization pathways prevailing in three-electron escape in strongly driven Ne versus Ar give rise to different momenta distributions in these two atoms.

Submitted 7 February, 2023; originally announced February 2023.

Comments: 6 pages, 4 figures. arXiv admin note: substantial text overlap with arXiv:2210.17394

arXiv:2210.17394 [pdf, other]

Nondipole electron momentum offset as a probe of correlated three electron ionization in strongly driven atoms

Authors: Georgios Petros Katsoulis, Matthew Benjamin Peters, Agapi Emmanouilidou

Abstract: We employ a recently developed three-dimensional semiclassical model to identify nondipole effects in triple ionization of Ne driven by infrared laser pulses at intensities where electron-electron correlation prevails. This model fully accounts for the Coulomb interaction of each electron with the core and avoids artificial autoionization by employing effective Coulomb potentials to describe the interaction between bound electrons (ECBB). Using the ECBB model, we identify a prominent signature of nondipole effects. Namely, the component along the direction of light propagation of the average sum of the final electron momenta is large and positive. That is, we identify a positive momentum offset, absent in the dipole approximation. We find that this positive momentum offset stems mostly from the momentum change due to the magnetic field. To further understand this momentum change, we also develop a simple model for the motion of an electron inside an electromagnetic field. This simple model accounts for the effect of the Coulomb forces only as a sharp change in the momentum of the electron during recollision. We show that the momentum change due to the magnetic field is related to the sharp change in momentum during recollision for the recolliding electron as well as to the time of recollision for both the recolliding and bound electrons. Hence, we demonstrate that the final electron momentum offset probes the strength of a recollision and hence the degree of correlation in multielectron ionization.

Submitted 25 February, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

Comments: 10 pages, 8 figures

arXiv:2208.13770 [pdf, other]

Local Verlet buffer approach for broad-phase interaction detection in Discrete Element Method

Authors: Abdoul Wahid Mainassara Checkaraou, Xavier Besseron, Alban Rousset, Fenglei Qi, Bernhard Peters

Abstract: The Extended Discrete Element Method (XDEM) is an innovative numerical simulation technique that extends the dynamics of granular materials known as Discrete Element Method (DEM) by additional properties such as the thermodynamic state, stress/strain for each particle. Such DEM simulations, used by industries to set up their experimental processes, are complex and heavy in computation time. At each time step, those simulations generate a list of interacting particles and this phase is one of the most computationally expensive parts of a DEM simulation. The Verlet buffer method, initially introduced in Molecular Dynamics (MD) (and also used in DEM), allows keeping the interaction list for many time steps by extending each particle neighbourhood by a certain extension range, and thus broadening the interaction list. The method relies on the temporal coherency of DEM, which guarantees that no particles move erratically from one time step to the next. In the classical approach, all the particles have their neighbourhood extended by the same value, which leads to suboptimal performance in simulations where different flow regimes coexist. Additionally, and unlike in MD, there is no comprehensive study analysing the different parameters that affect the performance of the Verlet buffer method in DEM. In this work, we propose a new method for the dynamic update of the neighbour list that depends on the particles' individual displacement and define a particle-specific extension range based on the local flow regime. The interaction list is analysed throughout the simulation based on the particle's displacement allowing a flexible update according to the flow regime conditions. We evaluate the influence of the Verlet extension range on the execution time through different test cases and analyse empirically the extension range value giving the best performance.

Submitted 25 August, 2022; originally announced August 2022.
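
The classical Verlet buffer idea summarised in the abstract above (one extension range shared by all particles) can be sketched roughly as follows. This is a minimal illustration, not the authors' XDEM implementation or their particle-specific method: the function names `build_neighbour_list` and `needs_rebuild`, the brute-force pair search, and the common "half the extension range" rebuild criterion are all assumptions made for the example.

```python
import numpy as np


def build_neighbour_list(pos, cutoff, skin):
    """Return all pairs (i, j), i < j, closer than cutoff + skin.

    `skin` plays the role of the Verlet extension range. The pair search
    here is brute force (O(N^2)) purely for clarity.
    """
    reach = cutoff + skin
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    i, j = np.where(np.triu(dist < reach, k=1))
    return list(zip(i.tolist(), j.tolist()))


def needs_rebuild(pos, pos_at_build, skin):
    """True once any particle has moved more than half the extension range
    since the list was last built (hypothetical but widely used criterion)."""
    moved = np.linalg.norm(pos - pos_at_build, axis=1)
    return bool(moved.max() > 0.5 * skin)


# Hypothetical three-particle example.
pos0 = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
pairs = build_neighbour_list(pos0, cutoff=1.0, skin=0.4)

small_step = pos0 + np.array([[0.1, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
large_step = pos0 + np.array([[0.3, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
reuse_ok = not needs_rebuild(small_step, pos0, skin=0.4)
must_rebuild = needs_rebuild(large_step, pos0, skin=0.4)
```

A larger extension range means fewer rebuilds but a longer list to traverse at every step, which is the trade-off the abstract says it evaluates empirically.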

arXiv:2208.02041 [pdf]

Reactive Laser Synthesis of Ultra-high-temperature Ceramics HfC, ZrC, TiC, HfN, ZrN, and TiN for Additive Manufacturing

Authors: Adam B. Peters, Chuhong Wang, Dajie Zhang, Alberto Hernandez, Dennis C. Nagle, Tim Mueller, James B. Spicer

Abstract: Ultra-high-temperature ceramics (UHTCs) are optimal structural materials for applications that require extreme temperature resilience, resistance to chemically aggressive environments, wear, and mechanical stress. Processing UHTCs with laser-based additive manufacturing (AM) has not been fully realized due to a variety of obstacles. In this work, selective laser reaction sintering (SLRS) techniques were investigated for the production of near net-shape UHTC ceramics such as HfC, ZrC, TiC, HfN, ZrN, and TiN. Group IV transition metal and metal oxide precursor materials were chemically converted and reaction-bonded into layers of UHTCs using single-step selective laser processing in CH4 or NH3 gas that might be compatible with prevailing powder bed fusion techniques. Conversion of either metals (Hf, Zr and Ti) or metal oxides (HfO2, ZrO2, and TiO2) particles was first investigated to examine reaction mechanisms and volume changes associated with SLRS of single-component precursor systems. SLRS processing of metal or metal oxide alone produced near stoichiometric UHTC phases with yields up to 100 wt% total for carbides and nitrides. However, for single component precursors, gas-solid reactivity induced volumetric changes resulted in residual stresses and cracking in the product layer. To mitigate conversion-induced stresses, composite metal/metal oxide precursors were employed to compensate for the volume changes of either the metal (which expands during conversion) or the metal oxide precursor (which contracts).

Submitted 6 December, 2022; v1 submitted 3 August, 2022; originally announced August 2022.

Comments: 58 pages, 17 figures

arXiv:2208.00054 [pdf]

Selective Laser Reaction Synthesis of SiC, Si$_3$N$_4$ and HfC/SiC Composites for Additive Manufacturing

Authors: Adam B. Peters, Dajie Zhang, Alberto Hernandez, Chuhong Wang, Dennis C. Nagle, Tim Mueller, James B. Spicer

Abstract: Selective laser reaction sintering (SLRS) techniques were investigated for the production of near net-shape non-oxide ceramics including SiC, Si$_3$N$_4$, and HfC/SiC composites that might be compatible with prevailing powder bed fusion additive manufacturing processes. Reaction bonded layers of covalent ceramics were produced using in-situ reactions that occur during selective laser processing and layer formation. During SLRS, precursor materials composed of metal and/or metal oxide powders were fashioned into powder beds for conversion to non-oxide ceramic layers. Laser-processing was used to initiate simultaneous chemical conversion and local interparticle bonding of precursor particles in CH4 or NH3 gases. Several factors related to the reaction synthesis process (precursor chemistry, gas-solid and gas-liquid synthesis mechanisms, precursor vapor pressures) were investigated in relation to resulting microstructures and non-oxide yields. Results indicated that the volumetric changes which occurred during in-situ conversion of single component precursors negatively impacted the surface layer microstructure. To circumvent the internal stresses and cracking that accompanied the conversion of Si or Hf (that expands upon conversion) or SiO$_x$ (that contracts during conversion), optimized ratios of the precursor constituents were used to produce near isovolumetric conversion to the product phase. The results demonstrate that under appropriate processing conditions and precursor selection, the formation of near net-shape SiC and SiC composites might be achieved through single-step AM-compatible techniques.

Submitted 29 July, 2022; originally announced August 2022.

Comments: 31 pages, 11 figures

arXiv:2208.00052 [pdf]

Reactive Two-Step Additive Manufacturing of Ultra-high Temperature Carbide Ceramics

Authors: Adam B. Peters, Dajie Zhang, Dennis C. Nagle, James B. Spicer

Abstract: Ultra-high-temperature ceramics (UHTCs) are candidate structural materials for applications that require resiliency to extreme temperature (>2000°C), high mechanical loads, or aggressive oxidizing environments. Processing UHTC transition metal carbides as standalone materials using additive manufacturing (AM) methods has not been fully realized due to their extremely slow atomic diffusivities that impede sintering and large volume changes during indirect AM that can induce defect structures. In this work, a two-step, reactive AM approach was studied for the formation of the ultra-high temperature ceramic TiCx. Readily available equipment including a polymer powder bed fusion AM machine and a traditional tube furnace were used to produce UHTC cubes and lattice structures with sub-millimeter resolution. This processing scheme incorporated (1) selective laser sintering of a Ti precursor mixed with a phenolic binder for green body shaping, and (2) ex-situ, isothermal gas-solid conversion of the green body in CH4 to form TiCx structures. Reactive post-processing in CH4 resulted in up to 98.2 wt% TiC0.90 product yield and a reduction in net-shrinkage during consolidation due to the volume expansion associated with the conversion of Ti to TiC. Results indicated that reaction bonding associated with the Gibbs free energy release associated with TiC formation produced interparticle adhesion at low furnace processing temperatures. The ability to bond highly refractory materials through this type of process resulted in structures that were crack-free and resisted fracture during thermal shock testing. Broadly, the additive manufacturing approach presented could be useful for the production of many UHTC carbides that might otherwise be incompatible with prevailing AM techniques that do not include reaction synthesis.

Submitted 6 December, 2022; v1 submitted 29 July, 2022; originally announced August 2022.

Comments: 23 pages, 14 figures, one figure with link to external video

arXiv:2207.02056 [pdf]

doi 10.1093/database/baac087

Ontology Development Kit: a toolkit for building, maintaining, and standardising biomedical ontologies

Authors: Nicolas Matentzoglu, Damien Goutte-Gattat, Shawn Zheng Kai Tan, James P. Balhoff, Seth Carbon, Anita R. Caron, William D. Duncan, Joe E. Flack, Melissa Haendel, Nomi L. Harris, William R Hogan, Charles Tapley Hoyt, Rebecca C. Jackson, HyeongSik Kim, Huseyin Kir, Martin Larralde, Julie A. McMurry, James A. Overton, Bjoern Peters, Clare Pilgrim, Ray Stefancsik, Sofia MC Robb, Sabrina Toro, Nicole A Vasilevsky, Ramona Walls, et al. (2 additional authors not shown)

Abstract: Similar to managing software packages, managing the ontology life cycle involves multiple complex workflows such as preparing releases, continuous quality control checking, and dependency management. To manage these processes, a diverse set of tools is required, from command line utilities to powerful ontology engineering environments such as ROBOT. Particularly in the biomedical domain, which has developed a set of highly diverse yet inter-dependent ontologies, standardising release practices and metadata, and establishing shared quality standards, are crucial to enable interoperability. The Ontology Development Kit (ODK) provides a set of standardised, customisable, and automatically executable workflows, and packages all required tooling in a single Docker image. In this paper, we provide an overview of how the ODK works, show how it is used in practice, and describe how we envision it driving standardisation efforts in our community.

Submitted 5 July, 2022; originally announced July 2022.

Comments: 19 pages, 2 supplementary tables, 1 supplementary figure

arXiv:2204.08083 [pdf, other]

AfriWOZ: Corpus for Exploiting Cross-Lingual Transferability for Generation of Dialogues in Low-Resource, African Languages

Authors: Tosin Adewumi, Mofetoluwa Adeyemi, Aremu Anuoluwapo, Bukola Peters, Happy Buzaaba, Oyerinde Samuel, Amina Mardiyyah Rufai, Benjamin Ajibade, Tajudeen Gwadabe, Mory Moussou Koulibaly Traore, Tunde Ajayi, Shamsuddeen Muhammad, Ahmed Baruwa, Paul Owoicho, Tolulope Ogunremi, Phylis Ngigi, Orevaoghene Ahia, Ruqayya Nasir, Foteini Liwicki, Marcus Liwicki

Abstract: Dialogue generation is an important NLP task fraught with many challenges. The challenges become more daunting for low-resource African languages. To enable the creation of dialogue agents for African languages, we contribute the first high-quality dialogue datasets for 6 African languages: Swahili, Wolof, Hausa, Nigerian Pidgin English, Kinyarwanda & Yorùbá. These datasets consist of 1,500 turns each, which we translate from a portion of the English multi-domain MultiWOZ dataset. Subsequently, we investigate & analyze the effectiveness of modelling through transfer learning by utilizing state-of-the-art (SoTA) deep monolingual models: DialoGPT and BlenderBot. We compare the models with a simple seq2seq baseline using perplexity. Besides this, we conduct human evaluation of single-turn conversations by using majority votes and measure inter-annotator agreement (IAA). We find that the hypothesis that deep monolingual models learn some abstractions that generalize across languages holds. We observe human-like conversations, to different degrees, in 5 out of the 6 languages. The language with the most transferable properties is the Nigerian Pidgin English, with a human-likeness score of 78.1%, of which 34.4% are unanimous. We freely provide the datasets and host the model checkpoints/demos on the HuggingFace hub for public access.

Submitted 19 May, 2022; v1 submitted 17 April, 2022; originally announced April 2022.

Comments: 14 pages, 1 figure, 8 tables

arXiv:2201.12160 [pdf, ps, other]

doi 10.1103/PhysRevA.105.043102

A general model and toolkit for the ionization of three or more electrons in strongly driven atoms using an effective Coulomb potential for the interaction between bound electrons

Authors: M. B. Peters, G. P. Katsoulis, A. Emmanouilidou

Abstract: We formulate a three-dimensional semi-classical model to address triple and double ionization in three-electron atoms driven by intense infrared laser pulses. During time propagation, our model fully accounts for the Coulomb singularities, the magnetic field of the laser pulse and for the motion of the nucleus at the same time as for the motion of the three electrons. The framework we develop is general and can account for multi-electron ionization in strongly-driven atoms with more than three electrons. To avoid unphysical autoionization arising in classical models of three or more electrons, we replace the Coulomb potential between pairs of bound electrons with effective Coulomb potentials. The Coulomb forces between electrons that are not both bound are fully accounted for. We develop a set of criteria to determine when electrons become bound during time propagation. We compare ionization spectra obtained with the model developed here and with the Heisenberg model that includes a potential term restricting an electron from closely approaching the core. Such spectra include the sum of the electron momenta along the direction of the laser field as well as the correlated electron momenta. We also compare these results with experimental ones.

Submitted 28 January, 2022; originally announced January 2022.

arXiv:2201.00463 [pdf, other]

doi 10.1051/0004-6361/201937034

The APEX Large CO Heterodyne Orion Legacy Survey (ALCOHOLS). I. Survey overview

Authors: Thomas Stanke, H. G. Arce, J. Bally, P. Bergman, J. Carpenter, C. J. Davis, W. Dent, J. Di Francesco, J. Eislöffel, D. Froebrich, A. Ginsburg, M. Heyer, D. Johnstone, D. Mardones, M. J. McCaughrean, S. T. Megeath, F. Nakamura, M. D. Smith, A. Stutz, K. Tatematsu, C. Walker, J. P. Williams, H. Zinnecker, B. J. Swift, C. Kulesa, et al. (7 additional authors not shown)

Abstract: The Orion molecular cloud complex harbours the nearest GMCs and site of high-mass star formation. Its YSO populations are thoroughly characterized. The region is therefore a prime target for the study of star formation. Here, we verify the performance of the SuperCAM 64 pixel heterodyne array on APEX. We give a descriptive overview of a set of wide-field CO(3-2) spectral cubes obtained towards the Orion GMC complex, aimed at characterizing the dynamics and structure of the extended molecular gas in diverse regions of the clouds, ranging from very active sites of clustered star formation in Orion B to comparatively quiet regions in southern Orion A. We present a 2.7 square degree (130pc$^2$) mapping survey in the CO(3-2) transition, obtained using SuperCAM on APEX at an angular resolution of 19'' (7600AU or 0.037pc at a distance of 400pc), covering L1622, NGC2071, NGC2068, OriB9, NGC2024, and NGC2023 in Orion B, and the southern part of the L1641 cloud in Orion A. We describe CO integrated emission and line moment maps and position-velocity diagrams and discuss a few sub-regions in some detail. Evidence for expanding bubbles is seen with lines splitting into double components, most prominently in NGC2024, where we argue that the bulk of the molecular gas is in the foreground of the HII region. High CO(3-2)/CO(1-0) line ratios reveal warm CO along the western edge of Orion B in the NGC2023/NGC2024 region facing the IC434 HII region. Multiple, well separated radial velocity components seen in L1641-S suggest that it consists of a sequence of clouds at increasingly larger distances. We find a small, spherical cloud - the 'Cow Nebula' globule - north of NGC2071. We trace high velocity line wings for the NGC2071-IR outflow and the NGC2024 CO jet. The protostellar dust core FIR4 (rather than FIR5) is the true driving source of the NGC2024 monopolar outflow.

Submitted 2 January, 2022; originally announced January 2022.

Comments: Accepted for publication in Astronomy and Astrophysics

Journal ref: A&A 658, A178 (2022)

arXiv:2112.07051 [pdf]

doi 10.1093/database/baac035

A Simple Standard for Sharing Ontological Mappings (SSSOM)

Authors: Nicolas Matentzoglu, James P. Balhoff, Susan M. Bello, Chris Bizon, Matthew Brush, Tiffany J. Callahan, Christopher G Chute, William D. Duncan, Chris T. Evelo, Davera Gabriel, John Graybeal, Alasdair Gray, Benjamin M. Gyori, Melissa Haendel, Henriette Harmse, Nomi L. Harris, Ian Harrow, Harshad Hegde, Amelia L. Hoyt, Charles T. Hoyt, Dazhi Jiao, Ernesto Jiménez-Ruiz, Simon Jupp, Hyeongsik Kim, Sebastian Koehler, et al. (19 additional authors not shown)

Abstract: Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Are they associated in some other way? Such relationships between the mapped terms are often not documented, leading to incorrect assumptions and making them hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction). Also, the lack of descriptions of how mappings were done makes it hard to combine and reconcile mappings, particularly curated and automated ones. The Simple Standard for Sharing Ontological Mappings (SSSOM) addresses these problems by: 1. Introducing a machine-readable and extensible vocabulary to describe metadata that makes imprecision, inaccuracy and incompleteness in mappings explicit. 2. Defining an easy to use table-based format that can be integrated into existing data science pipelines without the need to parse or query ontologies, and that integrates seamlessly with Linked Data standards. 3. Implementing open and community-driven collaborative workflows designed to evolve the standard continuously to address changing requirements and mapping practices. 4. Providing reference tools and software libraries for working with the standard. In this paper, we present the SSSOM standard, describe several use cases, and survey some existing work on standardizing the exchange of mappings, with the goal of making mappings Findable, Accessible, Interoperable, and Reusable (FAIR). The SSSOM specification is at http://w3id.org/sssom/spec.

Submitted 13 December, 2021; originally announced December 2021.

Comments: Corresponding author: Christopher J. Mungall <cjmungall@lbl.gov>

arXiv:2111.12153 [pdf]

Methodology and feasibility of neurofeedback to improve visual attention to letters in mild Alzheimer's disease

Authors: Deirdre McLaughlin, Daniel Klee, Tab Memmott, Betts Peters, Jack Wiedrick, Melanie Fried-Oken, Barry Oken

Abstract: Brain-computer interface (BCI) systems are controlled by users through neurophysiological input for a variety of applications including communication, environmental control, motor rehabilitation, and cognitive training. Although individuals with severe speech and physical impairment are the primary users of this technology, BCIs have emerged as a potential tool for broader populations, especially with regards to delivering cognitive training or interventions with neurofeedback. The goal of this study was to investigate the feasibility of using a BCI system with neurofeedback as an intervention for people with mild Alzheimer's disease (AD). The study focused on visual attention and language since AD is often associated with functional impairments in language and reading. The study enrolled five adults with mild AD in a nine- to thirteen-week BCI EEG-based neurofeedback (NFB) intervention to improve attention and reading skills. Two participants completed the intervention entirely. The remaining three participants could not complete the intervention phase because of restrictions related to COVID. Pre and post assessment measures were used to assess reliability of outcome measures and generalization of treatment to functional reading, processing speed, attention, and working memory skills. Participants demonstrated steady improvement in most cognitive measures across experimental phases, although there was not a significant effect of NFB on most measures of attention. One subject demonstrated statistically significant improvement in letter cancellation during NFB. All participants with mild AD learned to operate a BCI system with training. Results have broad implications for the design and use of BCI systems for participants with cognitive impairment. Preliminary evidence justifies implementing NFB-based cognitive measures in AD.

Submitted 23 November, 2021; originally announced November 2021.

Comments: 50 pages including 6 figures and 4 tables

arXiv:2109.03351 [pdf, other]

Capturing the objects of vision with neural networks

Authors: Benjamin Peters, Nikolaus Kriegeskorte

Abstract: Human visual perception carves a scene at its physical joints, decomposing the world into objects, which are selectively attended, tracked, and predicted as we engage our surroundings. Object representations emancipate perception from the sensory input, enabling us to keep in mind that which is out of sight and to use perceptual content as a basis for action and symbolic cognition. Human behavioral studies have documented how object representations emerge through grouping, amodal completion, proto-objects, and object files. Deep neural network (DNN) models of visual object recognition, by contrast, remain largely tethered to the sensory input, despite achieving human-level performance at labeling objects. Here, we review related work in both fields and examine how these fields can help each other. The cognitive literature provides a starting point for the development of new experimental tasks that reveal mechanisms of human object perception and serve as benchmarks driving development of deep neural network models that will put the object into object recognition.

Submitted 7 September, 2021; originally announced September 2021.

Comments: 25 pages, 5 figures

arXiv:2108.04010 [pdf, other]

doi 10.1088/1674-1137/ac66cc

Measurement of Muon-induced Neutron Production at the China Jinping Underground Laboratory

Authors: Lin Zhao, Wentai Luo, Lars Bathe Peters, Shaomin Chen, Mourad Chouaki, Wei Dou, Lei Guo, Ziyi Guo, Ghulam Hussain, Jinjing Li, Ye Liang, Qian Liu, Guang Luo, Ming Qi, Wenhui Shao, Jian Tang, Linyan Wan, Zhe Wang, Yiyang Wu, Benda Xu, Tong Xu, Weiran Xu, Yuzi Yang, Minfang Yeh, Bin Zhang

Abstract: Solar, terrestrial, and supernova neutrino experiments are subject to muon-induced radioactive backgrounds. The China Jinping Underground Laboratory (CJPL), with its unique advantage of a 2400 m rock coverage and long distance from nuclear power plants, is ideal for MeV-scale neutrino experiments. Using a 1-ton prototype detector of the Jinping Neutrino Experiment (JNE), we detected 343 high-energy cosmic-ray muons and $(7.86 \pm 3.97)$ muon-induced neutrons from an 820.28-day dataset at the first phase of CJPL (CJPL-I). Based on the muon-induced neutrons, we measured the corresponding muon-induced neutron yield in a liquid scintillator to be $(3.44 \pm 1.86_{\rm stat.} \pm 0.76_{\rm syst.}) \times 10^{-4}\,\mu^{-1}\,{\rm g}^{-1}\,{\rm cm}^{2}$ at an average muon energy of 340 GeV. We provided the first study for such neutron background at CJPL. A global fit including this measurement shows a power-law coefficient of $(0.75 \pm 0.02)$ for the dependence of the neutron yield at the liquid scintillator on muon energy.

Submitted 26 June, 2022; v1 submitted 9 August, 2021; originally announced August 2021.

Journal ref: 2022 Chinese Phys. C 46 085001

arXiv:2103.10291 [pdf, other]

Smoothing and Shrinking the Sparse Seq2Seq Search Space

Authors: Ben Peters, André F. T. Martins

Abstract: Current sequence-to-sequence models are trained to minimize cross-entropy and use softmax to compute the locally normalized probabilities over target sequences. While this setup has led to strong results in a variety of tasks, one unsatisfying aspect is its length bias: models give high scores to short, inadequate hypotheses and often make the empty string the argmax -- the so-called cat got your tongue problem. Recently proposed entmax-based sparse sequence-to-sequence models present a possible solution, since they can shrink the search space by assigning zero probability to bad hypotheses, but their ability to handle word-level tasks with transformers has never been tested. In this work, we show that entmax-based models effectively solve the cat got your tongue problem, removing a major source of model error for neural machine translation. In addition, we generalize label smoothing, a critical regularization technique, to the broader family of Fenchel-Young losses, which includes both cross-entropy and the entmax losses. Our resulting label-smoothed entmax loss models set a new state of the art on multilingual grapheme-to-phoneme conversion and deliver improvements and better calibration properties on cross-lingual morphological inflection and machine translation for 6 language pairs.

Submitted 18 March, 2021; originally announced March 2021.

Comments: NAACL 2021

arXiv:2012.06435 [pdf, ps, other]

doi 10.1103/PhysRevA.103.033115

Signatures of magnetic field effects in non-sequential double ionization manifesting as back-scattering for molecules versus forward-scattering for atoms

Authors: Georgios Petros Katsoulis, Matthew Benjamin Peters, André Staudte, Ravi Bhardwaj, Agapi Emmanouilidou

Abstract: For two-electron diatomic molecules, we investigate magnetic field effects in non-sequential double ionization where recollisions prevail. We do so by formulating a three-dimensional semi-classical model that fully accounts for the Coulomb singularities and for magnetic field effects during time propagation. Using this model, we identify a prominent signature of non-dipole effects. Namely, we demonstrate that the recolliding electron back-scatters along the direction of light propagation. Hence, this electron escapes opposite to the direction of change in momentum due to the magnetic field. This is in striking contrast to strongly-driven atoms where the recolliding electron forward-scatters along the direction of light propagation. We attribute these distinct signatures to the different gate that the magnetic field creates jointly with a soft recollision in molecules compared to a hard recollision in atoms. These two different gates give rise, shortly before recollision, to different momenta and positions of the recolliding electron along the direction of light propagation. As a result, we show that the Coulomb forces from the nuclei act to back-scatter the recolliding electron in molecules and forward-scatter it in atoms along the direction of light propagation.

Submitted 11 December, 2020; originally announced December 2020.

Comments: 14 pages, 8 figures

Journal ref: Phys. Rev. A 103, 033115 (2021)

arXiv:2010.16216 [pdf, ps, other]

doi 10.1103/PhysRevA.103.043109

Triple ionization and "frustrated" triple ionization in triatomic molecules driven by intense laser fields

Authors: M. B. Peters, V. P. Majety, A. Emmanouilidou

Abstract: We formulate a three-dimensional semi-classical model to treat three-electron escape dynamics in a strongly-driven linear triatomic molecule, HeH$_{2}^{+}$. Our model includes the Coulomb singularities. Hence, to avoid unphysical autoionization, we employ two criteria to switch off the Coulomb repulsive force between two bound electrons and switch it on when the motion of one electron is mostly determined by the laser field. We investigate triple and "frustrated" triple ionization. In the latter process two electrons escape while one electron remains bound in a Rydberg state. We find that two pathways prevail in "frustrated" triple ionization, as in "frustrated" double ionization. We also find that the electron that remains in a Rydberg state is more likely to be attached to He$^{2+}$ compared to H$^{+}$. Our results indicate that in triple and "frustrated" triple ionization electronic correlation is weak. Moreover, we compute the sum of the kinetic energies as well as the angular patterns of the final ion fragments in triple and "frustrated" triple ionization. These patterns suggest that the fragmenting molecule deviates from its initial linear configuration.

Submitted 30 October, 2020; originally announced October 2020.

Comments: 7 pages, 4 figures

Journal ref: Phys. Rev. A 103, 043109 (2021)

arXiv:2007.13251 [pdf, other]

Point-to-set distance functions for weakly supervised segmentation

Authors: Bas Peters

Abstract: When pixel-level masks or partial annotations are not available for training neural networks for semantic segmentation, it is possible to use higher-level information in the form of bounding boxes, or image tags. In the imaging sciences, many applications do not have an object-background structure and bounding boxes are not available. Any available annotation typically comes from ground truth or domain experts. A direct way to train without masks is using prior knowledge on the size of objects/classes in the segmentation. We present a new algorithm to include such information via constraints on the network output, implemented via projection-based point-to-set distance functions. This type of distance function always has the same functional form of the derivative, and avoids the need to adapt penalty functions to different constraints, as well as issues related to constraining properties typically associated with non-differentiable functions. Whereas object size information is known to enable object segmentation from bounding boxes from datasets with many general and medical images, we show that the applications extend to the imaging sciences where data represents indirect measurements, even in the case of single examples. We illustrate the capabilities in case of a) one or more classes do not have any annotation; b) there is no annotation at all; c) there are bounding boxes. We use data for hyperspectral time-lapse imaging, object segmentation in corrupted images, and sub-surface aquifer mapping from airborne-geophysical remote-sensing data.
The examples verify that the developed methodology alleviates difficulties with annotating non-visual imagery for a range of experimental settings.

Submitted 26 July, 2020; originally announced July 2020.

MSC Class: 68T45

arXiv:2007.04126 [pdf, other]

doi 10.1063/5.0002766

Solvent reaction coordinate for an S$_N$2 reaction

Authors: Christian Leitold, Christopher J. Mundy, Marcel D. Baer, Gregory K. Schenter, Baron Peters

Abstract: We study the prototypical S$_N$2 reaction Cl$^-$ + CH$_3$Cl $\to$ CH$_3$Cl + Cl$^-$ in water using quantum mechanics / molecular mechanics (QM/MM) computer simulations with transition path sampling and inertial likelihood maximization. We have identified a new solvent coordinate to complement the original atom-exchange coordinate used in the classic analysis by Chandrasekhar, Smith, and Jorgensen [Ref1]. The new solvent coordinate quantifies instantaneous solvent induced polarization relative to the equilibrium average charge density at each point along the reaction pathway. On the basis of likelihood scores and committor distributions, the new solvent coordinate improves upon the description of solvent dynamical effects relative to previously proposed solvent coordinates. However, it does not increase the transmission coefficient or the accuracy of a transition state theory rate calculation.

Submitted 8 July, 2020; originally announced July 2020.

Comments: 11 pages, 10 figures

Journal ref: J. Chem. Phys. 153, 024103 (2020)

arXiv:2005.02181 [pdf, other]

A neural network walks into a lab: towards using deep nets as models for human behavior

Authors: Wei Ji Ma, Benjamin Peters

Abstract: What might sound like the beginning of a joke has become an attractive prospect for many cognitive scientists: the use of deep neural network models (DNNs) as models of human behavior in perceptual and cognitive tasks. Although DNNs have taken over machine learning, attempts to use them as models of human behavior are still in the early stages. Can they become a versatile model class in the cognitive scientist's toolbox? We first argue why DNNs have the potential to be interesting models of human behavior. We then discuss how that potential can be more fully realized. On the one hand, we argue that the cycle of training, testing, and revising DNNs needs to be revisited through the lens of the cognitive scientist's goals. Specifically, we argue that methods for assessing the goodness of fit between DNN models and human behavior have to date been impoverished. On the other hand, cognitive science might have to start using more complex tasks (including richer stimulus spaces), but doing so might be beneficial for DNN-independent reasons as well. Finally, we highlight avenues where traditional cognitive process models and DNNs may show productive synergy.

Submitted 2 May, 2020; originally announced May 2020.

arXiv:2003.09745 [pdf, other]

doi 10.1063/5.0003224

Solid-solid phase equilibria in the NaCl-KCl system

Authors: Jamshed Anwar, Christian Leitold, Baron Peters

Abstract: Solid solutions, structurally ordered but compositionally disordered mixtures, can form for salts, metals, and even organic compounds. The NaCl-KCl system forms a solid solution at all compositions between 657°C and 505°C. Below a critical temperature of 505°C, the system exhibits a miscibility gap with coexisting Na-rich and K-rich rocksalt phases. We calculate the phase diagram in this region using the semi-grand canonical Widom method, which averages over virtual particle transmutations. We verify our results by comparison with free energies calculated from thermodynamic integration and extrapolate the location of the critical point. The calculations reproduce the experimental phase diagram remarkably well and illustrate how solid-solid equilibria and chemical potentials, including those at metastable conditions, can be computed for materials that form solid solutions.

Submitted 15 April, 2020; v1 submitted 21 March, 2020; originally announced March 2020.

Comments: 10 pages, 9 figures

Journal ref: Journal of Chemical Physics 152, 144109 (2020)

arXiv:2003.08466 [pdf, other]

Fully reversible neural networks for large-scale 3D seismic horizon tracking

Authors: Bas Peters, Eldad Haber

Abstract: Tracking a horizon in seismic images or 3D volumes is an integral part of seismic interpretation. The last few decades saw progress in using neural networks for this task, starting from shallow networks for 1D traces, to deeper convolutional neural networks for large 2D images. Because geological structures are intrinsically 3D, we hope to see improved horizon tracking by training networks on 3D seismic data cubes. While there are some 3D convolutional neural networks for various seismic interpretation tasks, they are restricted to shallow networks or relatively small 3D inputs because of memory limitations. The required memory for the network states and weights increases with network depth. We present a fully reversible network for horizon tracking that has a memory requirement that is independent of network depth. To tackle memory issues regarding the network weights, we use layers that train in a factorized form directly. Therefore, we can maintain a large number of network channels while keeping the number of convolutional kernels low. We use the saved memory to increase the input size of the data by an order of magnitude such that the network can better learn from large structures in the data. A field data example verifies the proposed network structure is suitable for seismic horizon tracking.

Submitted 18 March, 2020; originally announced March 2020.

MSC Class: 68U10 ACM Class: I.4.6

arXiv:2003.07908 [pdf, other]

Deep connections between learning from limited labels & physical parameter estimation -- inspiration for regularization

Authors: Bas Peters

Abstract: Recently established equivalences between differential equations and the structure of neural networks enabled some interpretation of training of a neural network as partial-differential-equation (PDE) constrained optimization. We add to the previously established connections, explicit regularization that is particularly beneficial in the case of single large-scale examples with partial annotation. We show that explicit regularization of model parameters in PDE constrained optimization translates to regularization of the network output. Examination of the structure of the corresponding Lagrangian and backpropagation algorithm does not reveal additional computational challenges. A hyperspectral imaging example shows that minimum prior information together with cross-validation for optimal regularization parameters boosts the segmentation accuracy.

Submitted 17 March, 2020; originally announced March 2020.

MSC Class: 68T45 ACM Class: I.2.10; I.4.6

arXiv:2003.07474 [pdf, other]

Fully reversible neural networks for large-scale surface and sub-surface characterization via remote sensing

Authors: Bas Peters, Eldad Haber, Keegan Lensink

Abstract: The large spatial/frequency scale of hyperspectral and airborne magnetic and gravitational data causes memory issues when using convolutional neural networks for (sub-) surface characterization. Recently developed fully reversible networks can mostly avoid memory limitations by virtue of having a low and fixed memory requirement for storing network states, as opposed to the typical linear memory growth with depth. Fully reversible networks enable the training of deep neural networks that take in entire data volumes, and create semantic segmentations in one go. This approach avoids the need to work in small patches or map a data patch to the class of just the central pixel. The cross-entropy loss function requires small modifications to work in conjunction with a fully reversible network and learn from sparsely sampled labels without ever seeing fully labeled ground truth. We show examples from land-use change detection from hyperspectral time-lapse data, and regional aquifer mapping from airborne geophysical and geological data.

Submitted 16 March, 2020; originally announced March 2020.

MSC Class: 68T45 ACM Class: I.4.6

arXiv:1912.12137 [pdf, other]

Symmetric block-low-rank layers for fully reversible multilevel neural networks

Authors: Bas Peters, Eldad Haber, Keegan Lensink

Abstract: Factors that limit the size of the input and output of a neural network include memory requirements for the network states/activations to compute gradients, as well as memory for the convolutional kernels or other weights. The memory restriction is especially limiting for applications where we want to learn how to map volumetric data to the desired output, such as video-to-video. Recently developed fully reversible neural networks enable gradient computations using storage of the network states for a couple of layers only. While this saves a tremendous amount of memory, it is the convolutional kernels that take up most memory if fully reversible networks contain multiple invertible pooling/coarsening layers. Invertible coarsening operators such as the orthogonal wavelet transform cause the number of channels to grow explosively. We address this issue by combining fully reversible networks with layers that contain the convolutional kernels in a compressed form directly. Specifically, we introduce a layer that has a symmetric block-low-rank structure. In spirit, this layer is similar to bottleneck and squeeze-and-expand structures. We contribute symmetry by construction, and a combination of notation and flattening of tensors allows us to interpret these network structures in linear algebraic fashion as a block-low-rank matrix in factorized form and observe various properties.
A video segmentation example shows that we can train a network to segment the entire video in one go, which would not be possible, in terms of memory requirements, using non-reversible networks and previously proposed reversible networks.

Submitted 14 December, 2019; originally announced December 2019.

MSC Class: 68T45

arXiv:1910.02137 [pdf, other]

Microfoundations of Discounting

Authors: Alexander T. I. Adamou, Yonatan Berman, Diomides P. Mavroyiannis, Ole B. Peters

Abstract: An important question in economics is how people choose between different payments in the future. The classical normative model predicts that a decision maker discounts a later payment relative to an earlier one by an exponential function of the time between them. Descriptive models use non-exponential functions to fit observed behavioral phenomena, such as preference reversal. Here we propose a model of discounting, consistent with standard axioms of choice, in which decision makers maximize the growth rate of their wealth. Four specifications of the model produce four forms of discounting -- no discounting, exponential, hyperbolic, and a hybrid of exponential and hyperbolic -- two of which predict preference reversal. Our model requires no assumption of behavioral bias or payment risk.

Submitted 8 January, 2020; v1 submitted 4 October, 2019; originally announced October 2019.

arXiv:1906.02971 [pdf, other]

doi 10.1142/S0217751X20500086

On the Ubiquity Of Electromagnetic-Duality Rotations in 4D, N = 1 Holoraumy Tensors for On-Shell 4D Supermultiplets

Authors: S. James Gates, Jr., Daniel Lay, S. -N. Hazel Mak, Brock Peters, Aravind Ramakrishnan, Kory Stiffler, Zachary Wimpee, Xiao Xiao, Yifan Yuan, Jinjie Zhang, Peter V. Zhou

Abstract: Holoraumy is a tool being developed for dimensional enhancement (supersymmetry holography) where the goal is to build higher dimensional supersymmetric multiplets from lower dimensional supersymmetric multiplets. In this paper, for the first time we investigate holoraumy for on-shell supersymmetry. Specifically, the holoraumy tensors for a number of familiar 4D, $\mathcal{N}=1$ multiplets are calculated. It is shown in all of these cases of on-shell theories, the holoraumy is of the form of an electromagnetic duality charge multiplying a composite transformation involving an electromagnetic duality rotation through an angle of $π/2$ times a space time translation. The details of our calculations can be found at the HEPTHools Data Repository at https://hepthools.github.io/Data/.

Submitted 13 June, 2019; v1 submitted 7 June, 2019; originally announced June 2019.

Comments: Latex twice, 21 pages, final version includes one additional reference, teo citation correction, and 2-form clarifying language, link to the HEPTHools Data Repository included

Report number: Brown University Preprint HET-1789 MSC Class: 81T60; 20C35

arXiv:1905.10484 [pdf, other]

Fully Hyperbolic Convolutional Neural Networks

Authors: Keegan Lensink, Bas Peters, Eldad Haber

Abstract: Convolutional Neural Networks (CNN) have recently seen tremendous success in various computer vision tasks. However, their application to problems with high dimensional input and output, such as high-resolution image and video segmentation or 3D medical imaging, has been limited by various factors. Primarily, in the training stage, it is necessary to store network activations for back propagation. In these settings, the memory requirements associated with storing activations can exceed what is feasible with current hardware, especially for problems in 3D. Motivated by the propagation of signals over physical networks, that are governed by the hyperbolic Telegraph equation, in this work we introduce a fully conservative hyperbolic network for problems with high dimensional input and output. We introduce a coarsening operation that allows completely reversible CNNs by using a learnable Discrete Wavelet Transform and its inverse to both coarsen and interpolate the network state and change the number of channels. We show that fully reversible networks are able to achieve results comparable to the state of the art in 4D time-lapse hyper spectral image segmentation and full 3D video segmentation, with a much lower memory footprint that is a constant independent of the network depth. We also extend the use of such networks to Variational Auto Encoders with high resolution input and output.

Submitted 7 July, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

Comments: 21 pages, 9 figures, Updated work to include additional numerical experiments, a section about VAEs and learnable wavelets

arXiv:1905.05702 [pdf, other]

Sparse Sequence-to-Sequence Models

Authors: Ben Peters, Vlad Niculae, André F. T. Martins

Abstract: Sequence-to-sequence models are a powerful workhorse of NLP. Most variants employ a softmax transformation in both their attention mechanism and output layer, leading to dense alignments and strictly positive output probabilities. This density is wasteful, making models less interpretable and assigning probability mass to many implausible outputs. In this paper, we propose sparse sequence-to-sequence models, rooted in a new family of $α$-entmax transformations, which includes softmax and sparsemax as particular cases, and is sparse for any $α> 1$. We provide fast algorithms to evaluate these transformations and their gradients, which scale well for large vocabulary sizes. Our models are able to produce sparse alignments and to assign nonzero probability to a short list of plausible outputs, sometimes rendering beam search exact. Experiments on morphological inflection and machine translation reveal consistent gains over dense models.

Submitted 12 June, 2019; v1 submitted 14 May, 2019; originally announced May 2019.

Comments: ACL 2019 Camera Ready

arXiv:1904.04413 [pdf, other]

doi 10.1190/segam2019-3216640.1

Does shallow geological knowledge help neural-networks to predict deep units?

Authors: Bas Peters, Eldad Haber, Justin Granek

Abstract: Geological interpretation of seismic images is a visual task that can be automated by training neural networks. While neural networks have been shown to be effective at various interpretation tasks, a fundamental challenge is the lack of labeled data points in the subsurface. For example, the interpolation and extrapolation of well-based lithology using seismic images relies on a small number of known labels. Besides well-known data augmentation techniques, as well as regularization of the network output, we propose and test another approach to deal with the lack of labels. Non learning-based horizon trackers work very well in the shallow subsurface where seismic images are of higher quality and the geological units are roughly layered. We test if these segmented and shallow units can help train neural networks to predict deeper geological units that are not layered and flat. We show that knowledge of shallow geological units helps to predict deeper units when there are only a few labels for training using a dataset from the Sea of Ireland. We employ U-net based multi-resolution networks, and we show that these networks can be described using matrix-vector product notation in a similar fashion as standard geophysical inverse problems.

Submitted 8 April, 2019; originally announced April 2019.

Comments: 7 pages, 5 figures

MSC Class: 86A99

arXiv:1903.11215 [pdf, other]

Neural-networks for geophysicists and their application to seismic data interpretation

Authors: Bas Peters, Eldad Haber, Justin Granek

Abstract: Neural-networks have seen a surge of interest for the interpretation of seismic images during the last few years. Network-based learning methods can provide fast and accurate automatic interpretation, provided there are sufficiently many training labels. We provide an introduction to the field aimed at geophysicists that are familiar with the framework of forward modeling and inversion. We explain the similarities and differences between deep networks and other geophysical inverse problems and show their utility in solving problems such as lithology interpolation between wells, horizon tracking and segmentation of seismic images. The benefits of our approach are demonstrated on field data from the Sea of Ireland and the North Sea.

Submitted 26 March, 2019; originally announced March 2019.

Comments: 8 pages, 5 figures

MSC Class: 86A04

arXiv:1903.03942 [pdf, other]

Generalized Minkowski sets for the regularization of inverse problems

Authors: Bas Peters, Felix J. Herrmann

Abstract: Many works on inverse problems in the imaging sciences consider regularization via one or more penalty functions or constraint sets. When the models/images are not easily described using one or a few penalty functions/constraints, additive model descriptions for regularization lead to better imaging results. These include cartoon-texture decomposition, morphological component analysis, and robust principal component analysis; methods that typically rely on penalty functions. We propose a regularization framework, based on the Minkowski set, that merges the strengths of additive models and constrained formulations. We generalize the Minkowski set, such that the model parameters are the sum of two components, each of which is constrained to an intersection of sets. Furthermore, the sum of the components is also an element of another intersection of sets. These generalizations allow us to include multiple pieces of prior knowledge on each of the components, as well as on the sum of components, which is necessary to ensure physical feasibility of partial-differential-equation based parameter estimation problems. We derive the projection operation onto the generalized Minkowski sets and construct an algorithm based on the alternating direction method of multipliers. We illustrate how we benefit from using more prior knowledge in the form of the generalized Minkowski set using seismic waveform inversion and video background-anomaly separation.

Submitted 10 March, 2019; originally announced March 2019.

Comments: 18 pages, 3 figures

MSC Class: 68U10; 86A22

arXiv:1902.09699 [pdf, other]

Algorithms and software for projections onto intersections of convex and non-convex sets with applications to inverse problems

Authors: Bas Peters, Felix J. Herrmann

Abstract: We propose algorithms and software for computing projections onto the intersection of multiple convex and non-convex constraint sets. The software package, called SetIntersectionProjection, is intended for the regularization of inverse problems in physical parameter estimation and image processing. The primary design criterion is working with multiple sets, which allows us to solve inverse problems with multiple pieces of prior knowledge. Our algorithms outperform the well known Dykstra's algorithm when individual sets are not easy to project onto because we exploit similarities between constraint sets. Other design choices that make the software fast and practical to use include recently developed automatic selection methods for auxiliary algorithm parameters, fine and coarse grained parallelism, and a multilevel acceleration scheme. We provide implementation details and examples that show how the software can be used to regularize inverse problems. Results show that we benefit from working with all available prior information and are not limited to one or two regularizers because of algorithmic, computational, or hyper-parameter selection issues.

Submitted 7 March, 2019; v1 submitted 25 February, 2019; originally announced February 2019.

Comments: 37 pages, 9 figures

MSC Class: 68U10; 86A22; 90C06

arXiv:1901.05675 [pdf, other]

Convexification of box-constrained polynomial optimization problems via monomial patterns

Authors: Gennadiy Averkov, Benjamin Peters, Sebastian Sager

Abstract: Convexification is a core technique in global polynomial optimization. Currently, there are two main approaches competing in theory and practice: the approach of nonlinear programming and the approach based on positivity certificates from real algebra. The former are comparatively cheap from a computational point of view, but typically do not provide tight relaxations with respect to bounds for the original problem. The latter are typically computationally expensive, but do provide tight relaxations. We embed both kinds of approaches into a unified framework of monomial relaxations. We develop a convexification strategy that allows one to trade off the quality of the bounds against computational expenses. Computational experiments show very encouraging results.

Submitted 28 September, 2021; v1 submitted 17 January, 2019; originally announced January 2019.

arXiv:1901.03786 [pdf, other]

Automatic classification of geologic units in seismic images using partially interpreted examples

Authors: Bas Peters, Justin Granek, Eldad Haber

Abstract: Geologic interpretation of large stacked or migrated seismic images can be a time-consuming task for seismic interpreters. Neural network based semantic segmentation provides fast and automatic interpretations, provided a sufficient number of example interpretations are available. Networks that map from image-to-image emerged recently as powerful tools for automatic segmentation, but standard implementations require fully interpreted examples. Generating training labels for large images manually is time consuming. We introduce a partial loss-function and labeling strategies such that networks can learn from partially interpreted seismic images. This strategy requires only a small number of annotated pixels per seismic image. Tests on seismic images and interpretation information from the Sea of Ireland show that we obtain high-quality predicted interpretations from a small number of large seismic images. The combination of a partial-loss function, a multi-resolution network that explicitly takes small and large-scale geological features into account, and new labeling strategies make neural networks a more practical tool for automatic seismic interpretation.

Submitted 11 January, 2019; originally announced January 2019.

Comments: 7 pages, 3 figures

MSC Class: 68T45

arXiv:1812.11092 [pdf, other]

Multi-resolution neural networks for tracking seismic horizons from few training images

Authors: Bas Peters, Justin Granek, Eldad Haber

Abstract: Detecting a specific horizon in seismic images is a valuable tool for geological interpretation. Because hand-picking the locations of the horizon is a time-consuming process, automated computational methods were developed starting three decades ago. Older techniques for such picking include interpolation of control points; however, in recent years neural networks have been used for this task. Until now, most networks trained on small patches from larger images. This limits the network's ability to learn from large-scale geologic structures. Moreover, currently available networks and training strategies require label patches that have full and continuous annotations, which are also time-consuming to generate. We propose a projected loss-function for training convolutional networks with a multi-resolution structure, including variants of the U-net. Our networks learn from a small number of large seismic images without creating patches. The projected loss-function enables training on labels with just a few annotated pixels and has no issue with the other unknown label pixels. Training uses all data without reserving some for validation. Only the labels are split into training/testing. Contrary to other work on horizon tracking, we train the network to perform non-linear regression, and not classification. As such, we propose labels as the convolution of a Gaussian kernel and the known horizon locations that indicate uncertainty in the labels. The network output is the probability of the horizon location. We demonstrate the proposed computational ingredients on two different datasets, for horizon extrapolation and interpolation. We show that the predictions of our methodology are accurate even in areas far from known horizon locations because our learning strategy exploits all data in large seismic images.

Submitted 26 December, 2018; originally announced December 2018.

Comments: 24 pages, 13 figures

MSC Class: 68T45 (Primary)

arXiv:1808.08028 [pdf, other]

The XDEM Multi-physics and Multi-scale Simulation Technology: Review on DEM-CFD Coupling, Methodology and Engineering Applications

Authors: Bernhard Peters, Maryam Baniasadi, Mehdi Baniasadi, Xavier Besseron, Alvaro Estupinan Donoso, Mohammad Mohseni, Gabriele Pozzetti

Abstract: The XDEM multi-physics and multi-scale simulation platform roots in the Extended Discrete Element Method (XDEM) and is being developed at the Institute of Computational Engineering at the University of Luxembourg. The platform is an advanced multi-physics simulation technology that combines flexibility and versatility to establish the next generation of multi-physics and multi-scale simulation tools. For this purpose the simulation framework relies on coupling various predictive tools based on both an Eulerian and Lagrangian approach. Eulerian approaches represent the wide field of continuum models while the Lagrange approach is perfectly suited to characterise discrete phases. Thus, continuum models include classical simulation tools such as Computational Fluid Dynamics (CFD) or Finite Element Analysis (FEA) while an extended configuration of the classical Discrete Element Method (DEM) addresses the discrete e.g. particulate phase. Apart from predicting the trajectories of individual particles, XDEM extends the application to estimating the thermodynamic state of each particle by advanced and optimised algorithms. The thermodynamic state may include temperature and species distributions due to chemical reaction and external heat sources. Hence, coupling these extended features with either CFD or FEA opens up a wide range of applications as diverse as pharmaceutical industry e.g. drug production, agriculture food and processing industry, mining, construction and agricultural machinery, metals manufacturing, energy production and systems biology.

Submitted 24 August, 2018; originally announced August 2018.

Showing 1–50 of 60 results for author: Peters, B