Search | arXiv e-print repository

Rethinking SO(3)-equivariance with Bilinear Tensor Networks

Authors: Chase Shimmin, Zhelun Li, Ema Smith

Abstract: Many datasets in scientific and engineering applications consist of objects with specific geometric structure. A common example is data that inhabit a representation of the group SO$(3)$ of 3D rotations: scalars, vectors, tensors, etc. One way for a neural network to exploit prior knowledge of this structure is to enforce SO$(3)$-equivariance throughout its layers, and several such architectures have been proposed. While general methods for handling arbitrary SO$(3)$ representations exist, they are computationally intensive and complicated to implement. We show that by judicious symmetry breaking, we can efficiently increase the expressiveness of a network operating only on vector and order-2 tensor representations of SO$(2)$. We demonstrate the method on an important problem from High Energy Physics known as b-tagging, where particle jets originating from b-meson decays must be discriminated from an overwhelming QCD background. In this task, we find that augmenting a standard architecture with our method results in a $2.3\times$ improvement in rejection score.

Submitted 20 March, 2023; originally announced March 2023.
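
The abstract above relies on the notion of SO$(3)$-equivariance: rotating the inputs of an operation and then applying it gives the same result as applying it first and rotating the output. The following is only a minimal sketch of that property, not the authors' architecture. It assumes two arbitrary example vectors and checks that the cross product (a bilinear map on vectors) is equivariant and the dot product is invariant. All function names are hypothetical and chosen for illustration.

```python
import math

# Hypothetical helpers for illustration only; not taken from the paper.
def rot_z(theta):
    """3x3 rotation matrix about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(theta):
    """3x3 rotation matrix about the x axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(r, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(r[i][k] * v[k] for k in range(3)) for i in range(3)]

def cross(a, b):
    """Cross product: a bilinear map from two vectors to a vector."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    """Dot product: a bilinear map from two vectors to a scalar."""
    return sum(x * y for x, y in zip(a, b))

# A generic proper rotation built from two axis rotations (arbitrary angles).
R = matmul(rot_z(0.7), rot_x(-1.2))
a = [1.0, 2.0, 3.0]
b = [-0.5, 0.25, 4.0]

# Equivariance: (R a) x (R b) should equal R (a x b).
rotate_then_cross = cross(apply(R, a), apply(R, b))
cross_then_rotate = apply(R, cross(a, b))
equiv_err = max(abs(x - y) for x, y in zip(rotate_then_cross, cross_then_rotate))

# Invariance: (R a) . (R b) should equal a . b.
inv_err = abs(dot(apply(R, a), apply(R, b)) - dot(a, b))

print(equiv_err, inv_err)
```

Both errors should be at the level of floating-point round-off. A network built only from such operations inherits the same symmetry, which is the general idea the abstract refers to.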

arXiv:2203.06153

Symmetry Group Equivariant Architectures for Physics

Authors: Alexander Bogatskiy, Sanmay Ganguly, Thomas Kipf, Risi Kondor, David W. Miller, Daniel Murnane, Jan T. Offermann, Mariel Pettee, Phiala Shanahan, Chase Shimmin, Savannah Thais

Abstract: Physical theories grounded in mathematical symmetries are an essential component of our understanding of a wide range of properties of the universe. Similarly, in the domain of machine learning, an awareness of symmetries such as rotation or permutation invariance has driven impressive performance breakthroughs in computer vision, natural language processing, and other important applications. In this report, we argue that both the physics community and the broader machine learning community have much to understand and potentially to gain from a deeper investment in research concerning symmetry group equivariant machine learning architectures. For some applications, the introduction of symmetries into the fundamental structural design can yield models that are more economical (i.e. contain fewer, but more expressive, learned parameters), interpretable (i.e. more explainable or directly mappable to physical quantities), and/or trainable (i.e. more efficient in both data and computational requirements). We discuss various figures of merit for evaluating these models as well as some potential benefits and limitations of these methods for a variety of physics applications. Research and investment into these approaches will lay the foundation for future architectures that are potentially more robust under new computational paradigms and will provide a richer description of the physical systems to which they are applied.

Submitted 11 March, 2022; originally announced March 2022.

Comments: Contribution to Snowmass 2021

arXiv:2107.02908

Particle Convolution for High Energy Physics

Authors: Chase Shimmin

Abstract: We introduce the Particle Convolution Network (PCN), a new type of equivariant neural network layer suitable for many tasks in jet physics. The particle convolution layer can be viewed as an extension of Deep Sets and Energy Flow network architectures, in which the permutation-invariant operator is promoted to a group convolution. While the PCN can be implemented for various kinds of symmetries, we consider the specific case of rotation about the jet axis in the $η$-$φ$ plane. In two standard benchmark tasks, q/g tagging and top tagging, we show that the rotational PCN (rPCN) achieves performance comparable to graph networks such as ParticleNet. Moreover, we show that it is possible to implement an IRC-safe rPCN, which significantly outperforms existing IRC-safe tagging methods on both tasks. We speculate that by generalizing the PCN to include additional convolutional symmetries relevant to jet physics, it may outperform the current state-of-the-art set by graph networks, while offering a new degree of control over physically-motivated inductive biases.

Submitted 5 July, 2021; originally announced July 2021.

Comments: To be presented at ML4Jets 2021
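
The abstract above describes the PCN as an extension of Deep Sets, whose defining feature is a permutation-invariant pooling over the particles in a jet. The sketch below illustrates only that Deep Sets baseline, not the particle convolution itself. The per-particle map `phi`, the readout `rho`, and the toy jet of (pT, η, φ) triples are all assumptions made up for illustration.

```python
import math
import random

# Hypothetical per-particle feature map (not from the paper).
def phi(particle):
    pt, eta, ang = particle
    return (pt, pt * eta, pt * math.cos(ang), pt * math.sin(ang))

def pooled(particles):
    """Sum per-particle features; summation ignores particle ordering."""
    feats = [phi(p) for p in particles]
    return tuple(sum(col) for col in zip(*feats))

# Hypothetical readout acting on the pooled representation.
def rho(z):
    total_pt, eta_weighted, cos_sum, sin_sum = z
    return (eta_weighted / total_pt, math.hypot(cos_sum, sin_sum) / total_pt)

def deep_set(particles):
    """Deep Sets form: rho applied to the sum of phi over the set."""
    return rho(pooled(particles))

# Toy jet: (pT, eta, phi) triples with arbitrary values.
jet = [(50.0, 0.10, 0.3), (20.0, -0.20, 1.1), (5.0, 0.05, -2.0), (1.5, 0.40, 2.9)]
shuffled = jet[:]
random.Random(0).shuffle(shuffled)

out_a = deep_set(jet)
out_b = deep_set(shuffled)

# The outputs agree up to floating-point round-off, whatever the ordering.
perm_err = max(abs(x - y) for x, y in zip(out_a, out_b))
print(perm_err)
```

Replacing the plain sum with an operation that also respects a continuous symmetry, such as rotation about the jet axis, is the step the abstract describes as promoting the operator to a group convolution; that step is not shown here.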

arXiv:1910.08606

AI Safety for High Energy Physics

Authors: Benjamin Nachman, Chase Shimmin

Abstract: The field of high-energy physics (HEP), along with many scientific disciplines, is currently experiencing a dramatic influx of new methodologies powered by modern machine learning techniques. Over the last few years, a growing body of HEP literature has focused on identifying promising applications of deep learning in particular, and more recently these techniques are starting to be realized in an increasing number of experimental measurements. The overall conclusion from this impressive and extensive set of studies is that rarer and more complex physics signatures can be identified with the new set of powerful tools from deep learning. However, there is an unstudied systematic risk associated with combining the traditional HEP workflow and deep learning with high-dimensional data. In particular, calibrating and validating the response of deep neural networks is in general not experimentally feasible, and therefore current methods may be biased in ways that are not covered by current uncertainty estimates. By borrowing ideas from AI safety, we illustrate these potential issues and propose a method to bound the size of unaccounted-for uncertainty. In addition to providing a pragmatic diagnostic, this work will hopefully begin a dialogue within the community about the robust application of deep learning to experimental analyses.

Submitted 18 October, 2019; originally announced October 2019.

Comments: 8 pages, 5 figures

arXiv:1703.03507

doi 10.1103/PhysRevD.96.074034

Decorrelated Jet Substructure Tagging using Adversarial Neural Networks

Authors: Chase Shimmin, Peter Sadowski, Pierre Baldi, Edison Weik, Daniel Whiteson, Edward Goul, Andreas Søgaard

Abstract: We describe a strategy for constructing a neural network jet substructure tagger which powerfully discriminates boosted decay signals while remaining largely uncorrelated with the jet mass. This reduces the impact of systematic uncertainties in background modeling while enhancing signal purity, resulting in improved discovery significance relative to existing taggers. The network is trained using an adversarial strategy, resulting in a tagger that learns to balance classification accuracy with decorrelation. As a benchmark scenario, we consider the case where large-radius jets originating from a boosted resonance decay are discriminated from a background of nonresonant quark and gluon jets. We show that in the presence of systematic uncertainties on the background rate, our adversarially-trained, decorrelated tagger considerably outperforms a conventionally trained neural network, despite having a slightly worse signal-background separation power. We generalize the adversarial training technique to include a parametric dependence on the signal hypothesis, training a single network that provides optimized, interpolatable decorrelated jet tagging across a continuous range of hypothetical resonance masses, after training on discrete choices of the signal mass.

Submitted 9 March, 2017; originally announced March 2017.

Journal ref: Phys. Rev. D 96, 074034 (2017)

arXiv:1602.07727

doi 10.1103/PhysRevD.94.055001

Boosting low-mass hadronic resonances

Authors: Chase Shimmin, Daniel Whiteson

Abstract: Searches for new hadronic resonances typically focus on high-mass spectra, due to overwhelming QCD backgrounds and detector trigger rates. We present a study of searches for relatively low-mass hadronic resonances at the LHC in the case that the resonance is boosted by recoiling against a well-measured high-$p_{\textrm{T}}$ probe such as a muon, photon or jet. The hadronic decay of the resonance is then reconstructed either as a single large-radius jet or as a resolved pair of standard narrow-radius jets, balanced in transverse momentum to the probe. We show that the existing 2015 LHC dataset of $pp$ collisions with $\int\mathcal{L}dt = 4\ \mathrm{fb}^{-1}$ should already have powerful sensitivity to a generic $Z'$ model which couples only to quarks, for $Z'$ masses ranging from 20-500 GeV/c$^2$.

Submitted 24 February, 2016; originally announced February 2016.

Journal ref: Phys. Rev. D 94, 055001 (2016)

arXiv:1401.1462

doi 10.1103/PhysRevD.89.095002

Systematically Searching for New Resonances at the Energy Frontier using Topological Models

Authors: Mohammad Abdullah, Eric Albin, Anthony DiFranzo, Meghan Frate, Craig Pitcher, Chase Shimmin, Suneet Upadhyay, James Walker, Pierce Weatherly, Patrick J. Fox, Daniel Whiteson

Abstract: We propose a new strategy to systematically search for new physics processes in particle collisions at the energy frontier. An examination of all possible topologies which give identifiable resonant features in a specific final state leads to a tractable number of 'topological models' per final state and gives specific guidance for their discovery. Using one specific final state, $\ell\ell jj$, as an example, we find that the number of possibilities is reasonable and reveals simple, but as-yet-unexplored, topologies which contain significant discovery potential. We propose analysis techniques and estimate the sensitivity for $pp$ collisions with $\sqrt{s}=14$ TeV and $\mathcal{L}=300$ fb$^{-1}$.

Submitted 7 January, 2014; originally announced January 2014.

Report number: FERMILAB-PUB-13-529-T

Journal ref: Phys. Rev. D 89, 095002 (2014)

arXiv:1312.2592

doi 10.1103/PhysRevD.89.075017

Mono-Higgs: a new collider probe of dark matter

Authors: Linda M. Carpenter, Anthony DiFranzo, Michael Mulhearn, Chase Shimmin, Sean Tulin, Daniel Whiteson

Abstract: We explore the LHC phenomenology of dark matter (DM) pair production in association with a 125 GeV Higgs boson. This signature, dubbed 'mono-Higgs,' appears as a single Higgs boson plus missing energy from DM particles escaping the detector. We perform an LHC background study for mono-Higgs signals at $\sqrt{s} = 8$ and $14$ TeV for four Higgs boson decay channels: $γγ$, $b \bar b$, $ZZ^* \to 4\ell$, and $\ell\ell j j$. We estimate the LHC sensitivities to a variety of new physics scenarios within the frameworks of both effective operators and simplified models. For all these scenarios, the $γγ$ channel provides the best sensitivity, whereas the $b\bar b$ channel suffers from a large $t \bar t$ background. Mono-Higgs is unlike other mono-$X$ searches ($X$=jet, photon, etc.), since the Higgs boson is unlikely to be radiated as initial state radiation, and therefore probes the underlying DM vertex directly.

Submitted 9 June, 2014; v1 submitted 9 December, 2013; originally announced December 2013.

Journal ref: Phys. Rev. D 89, 075017 (2014)

arXiv:1212.3352

doi 10.1103/PhysRevD.87.074005

Collider searches for dark matter in events with a Z boson and missing energy

Authors: Linda M. Carpenter, Andrew Nelson, Chase Shimmin, Tim M. P. Tait, Daniel Whiteson

Abstract: Searches for dark matter at colliders typically involve signatures with energetic initial-state radiation without visible recoil particles. Searches for mono-jet or mono-photon signatures have yielded powerful constraints on dark matter interactions with Standard Model particles. We extend this to the mono-Z signature and reinterpret an ATLAS analysis of events with a Z boson and missing transverse momentum to derive constraints on dark matter interaction mass scale and nucleon cross sections in the context of effective field theories describing dark matter which interacts via heavy mediator particles with quarks or weak bosons.

Submitted 21 December, 2012; v1 submitted 13 December, 2012; originally announced December 2012.

Comments: 6 pages, 5 figures

Report number: UCI-HEP-TR-2012-21

Showing 1–9 of 9 results for author: Shimmin, C