-
Rethinking SO(3)-equivariance with Bilinear Tensor Networks
Authors:
Chase Shimmin,
Zhelun Li,
Ema Smith
Abstract:
Many datasets in scientific and engineering applications are comprised of objects which have specific geometric structure. A common example is data which inhabits a representation of the group SO$(3)$ of 3D rotations: scalars, vectors, tensors, \textit{etc}. One way for a neural network to exploit prior knowledge of this structure is to enforce SO$(3)$-equivariance throughout its layers, and several such architectures have been proposed. While general methods for handling arbitrary SO$(3)$ representations exist, they are computationally intensive and complicated to implement. We show that by judicious symmetry breaking, we can efficiently increase the expressiveness of a network operating only on vector and order-2 tensor representations of SO$(2)$. We demonstrate the method on an important problem from High Energy Physics known as \textit{b-tagging}, where particle jets originating from b-meson decays must be discriminated from an overwhelming QCD background. In this task, we find that augmenting a standard architecture with our method results in a \ensuremath{2.3\times} improvement in rejection score.
Submitted 20 March, 2023;
originally announced March 2023.
-
Symmetry Group Equivariant Architectures for Physics
Authors:
Alexander Bogatskiy,
Sanmay Ganguly,
Thomas Kipf,
Risi Kondor,
David W. Miller,
Daniel Murnane,
Jan T. Offermann,
Mariel Pettee,
Phiala Shanahan,
Chase Shimmin,
Savannah Thais
Abstract:
Physical theories grounded in mathematical symmetries are an essential component of our understanding of a wide range of properties of the universe. Similarly, in the domain of machine learning, an awareness of symmetries such as rotation or permutation invariance has driven impressive performance breakthroughs in computer vision, natural language processing, and other important applications. In this report, we argue that both the physics community and the broader machine learning community have much to understand and potentially to gain from a deeper investment in research concerning symmetry group equivariant machine learning architectures. For some applications, the introduction of symmetries into the fundamental structural design can yield models that are more economical (i.e. contain fewer, but more expressive, learned parameters), interpretable (i.e. more explainable or directly mappable to physical quantities), and/or trainable (i.e. more efficient in both data and computational requirements). We discuss various figures of merit for evaluating these models as well as some potential benefits and limitations of these methods for a variety of physics applications. Research and investment into these approaches will lay the foundation for future architectures that are potentially more robust under new computational paradigms and will provide a richer description of the physical systems to which they are applied.
Submitted 11 March, 2022;
originally announced March 2022.
-
Particle Convolution for High Energy Physics
Authors:
Chase Shimmin
Abstract:
We introduce the Particle Convolution Network (PCN), a new type of equivariant neural network layer suitable for many tasks in jet physics. The particle convolution layer can be viewed as an extension of Deep Sets and Energy Flow network architectures, in which the permutation-invariant operator is promoted to a group convolution. While the PCN can be implemented for various kinds of symmetries, we consider the specific case of rotation about the jet axis in the $η$-$φ$ plane. In two standard benchmark tasks, q/g tagging and top tagging, we show that the rotational PCN (rPCN) achieves performance comparable to graph networks such as ParticleNet. Moreover, we show that it is possible to implement an IRC-safe rPCN, which significantly outperforms existing IRC-safe tagging methods on both tasks. We speculate that by generalizing the PCN to include additional convolutional symmetries relevant to jet physics, it may outperform the current state-of-the-art set by graph networks, while offering a new degree of control over physically-motivated inductive biases.
Submitted 5 July, 2021;
originally announced July 2021.
-
AI Safety for High Energy Physics
Authors:
Benjamin Nachman,
Chase Shimmin
Abstract:
The field of high-energy physics (HEP), along with many scientific disciplines, is currently experiencing a dramatic influx of new methodologies powered by modern machine learning techniques. Over the last few years, a growing body of HEP literature has focused on identifying promising applications of deep learning in particular, and more recently these techniques are starting to be realized in an increasing number of experimental measurements. The overall conclusion from this impressive and extensive set of studies is that rarer and more complex physics signatures can be identified with the new set of powerful tools from deep learning. However, there is an unstudied systematic risk associated with combining the traditional HEP workflow and deep learning with high-dimensional data. In particular, calibrating and validating the response of deep neural networks is in general not experimentally feasible, and therefore current methods may be biased in ways that are not covered by current uncertainty estimates. By borrowing ideas from AI safety, we illustrate these potential issues and propose a method to bound the size of unaccounted-for uncertainty. In addition to providing a pragmatic diagnostic, this work will hopefully begin a dialogue within the community about the robust application of deep learning to experimental analyses.
Submitted 18 October, 2019;
originally announced October 2019.
-
Decorrelated Jet Substructure Tagging using Adversarial Neural Networks
Authors:
Chase Shimmin,
Peter Sadowski,
Pierre Baldi,
Edison Weik,
Daniel Whiteson,
Edward Goul,
Andreas Søgaard
Abstract:
We describe a strategy for constructing a neural network jet substructure tagger which powerfully discriminates boosted decay signals while remaining largely uncorrelated with the jet mass. This reduces the impact of systematic uncertainties in background modeling while enhancing signal purity, resulting in improved discovery significance relative to existing taggers. The network is trained using an adversarial strategy, resulting in a tagger that learns to balance classification accuracy with decorrelation. As a benchmark scenario, we consider the case where large-radius jets originating from a boosted resonance decay are discriminated from a background of nonresonant quark and gluon jets. We show that in the presence of systematic uncertainties on the background rate, our adversarially-trained, decorrelated tagger considerably outperforms a conventionally trained neural network, despite having a slightly worse signal-background separation power. We generalize the adversarial training technique to include a parametric dependence on the signal hypothesis, training a single network that provides optimized, interpolatable decorrelated jet tagging across a continuous range of hypothetical resonance masses, after training on discrete choices of the signal mass.
Submitted 9 March, 2017;
originally announced March 2017.
-
Boosting low-mass hadronic resonances
Authors:
Chase Shimmin,
Daniel Whiteson
Abstract:
Searches for new hadronic resonances typically focus on high-mass spectra, due to overwhelming QCD backgrounds and detector trigger rates. We present a study of searches for relatively low-mass hadronic resonances at the LHC in the case that the resonance is boosted by recoiling against a well-measured high-$p_{\textrm{T}}$ probe such as a muon, photon or jet. The hadronic decay of the resonance is then reconstructed either as a single large-radius jet or as a resolved pair of standard narrow-radius jets, balanced in transverse momentum to the probe. We show that the existing 2015 LHC dataset of $pp$ collisions with $\int\mathcal{L}dt = 4\ \mathrm{fb}^{-1}$ should already have powerful sensitivity to a generic $Z'$ model which couples only to quarks, for $Z'$ masses ranging from 20-500 GeV/c$^2$.
Submitted 24 February, 2016;
originally announced February 2016.
-
Systematically Searching for New Resonances at the Energy Frontier using Topological Models
Authors:
Mohammad Abdullah,
Eric Albin,
Anthony DiFranzo,
Meghan Frate,
Craig Pitcher,
Chase Shimmin,
Suneet Upadhyay,
James Walker,
Pierce Weatherly,
Patrick J. Fox,
Daniel Whiteson
Abstract:
We propose a new strategy to systematically search for new physics processes in particle collisions at the energy frontier. An examination of all possible topologies which give identifiable resonant features in a specific final state leads to a tractable number of `topological models' per final state and gives specific guidance for their discovery. Using one specific final state, $\ell\ell jj$, as an example, we find that the number of possibilities is reasonable and reveals simple, but as-yet-unexplored, topologies which contain significant discovery potential. We propose analysis techniques and estimate the sensitivity for $pp$ collisions with $\sqrt{s}=14$ TeV and $\mathcal{L}=300$ fb$^{-1}$.
Submitted 7 January, 2014;
originally announced January 2014.
-
Mono-Higgs: a new collider probe of dark matter
Authors:
Linda M. Carpenter,
Anthony DiFranzo,
Michael Mulhearn,
Chase Shimmin,
Sean Tulin,
Daniel Whiteson
Abstract:
We explore the LHC phenomenology of dark matter (DM) pair production in association with a 125 GeV Higgs boson. This signature, dubbed `mono-Higgs,' appears as a single Higgs boson plus missing energy from DM particles escaping the detector. We perform an LHC background study for mono-Higgs signals at $\sqrt{s} = 8$ and $14$ TeV for four Higgs boson decay channels: $γγ$, $b \bar b$, $ZZ^* \to 4\ell$, and $ZZ^* \to \ell\ell j j$. We estimate the LHC sensitivities to a variety of new physics scenarios within the frameworks of both effective operators and simplified models. For all these scenarios, the $γγ$ channel provides the best sensitivity, whereas the $b\bar b$ channel suffers from a large $t \bar t$ background. Mono-Higgs is unlike other mono-$X$ searches ($X$=jet, photon, etc.), since the Higgs boson is unlikely to be radiated as initial state radiation, and therefore probes the underlying DM vertex directly.
Submitted 9 June, 2014; v1 submitted 9 December, 2013;
originally announced December 2013.
-
Collider searches for dark matter in events with a Z boson and missing energy
Authors:
Linda M. Carpenter,
Andrew Nelson,
Chase Shimmin,
Tim M. P. Tait,
Daniel Whiteson
Abstract:
Searches for dark matter at colliders typically involve signatures with energetic initial-state radiation without visible recoil particles. Searches for mono-jet or mono-photon signatures have yielded powerful constraints on dark matter interactions with Standard Model particles. We extend this to the mono-Z signature and reinterpret an ATLAS analysis of events with a Z boson and missing transverse momentum to derive constraints on dark matter interaction mass scale and nucleon cross sections in the context of effective field theories describing dark matter which interacts via heavy mediator particles with quarks or weak bosons.
Submitted 21 December, 2012; v1 submitted 13 December, 2012;
originally announced December 2012.