Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

HTML conversions sometimes display errors due to content that did not convert correctly from the source. This paper uses the following packages that are not yet supported by the HTML conversion tool. Feedback on these issues are not necessary; they are known and are being worked on.

  • failed: stackengine

Authors: achieve the best HTML results from your LaTeX submissions by following these best practices.

License: CC BY 4.0
arXiv:2312.17265v1 [cs.CV] 25 Dec 2023

𝝁𝝁\boldsymbol{\mu}bold_italic_μ-Net: ConvNext-Based U-Nets for Cosmic Muon Tomography

Lim Li Xin Jed    Qiu Ziming
Abstract

Muon scattering tomography utilises muons, typically originating from cosmic rays to image the interiors of dense objects. However, due to the low flux of cosmic ray muons at sea-level and the highly complex interactions that muons display when travelling through matter, existing reconstruction algorithms often suffer from low resolution and high noise. In this work, we develop a novel two-stage deep learning algorithm, ΞΌπœ‡\muitalic_ΞΌ-Net, consisting of an MLP to predict the muon trajectory and a ConvNeXt-based U-Net to convert the scattering points into voxels. ΞΌπœ‡\muitalic_ΞΌ-Net achieves a state-of-the-art performance of 17.14 PSNR at the dosage of 1024 muons, outperforming traditional reconstruction algorithms such as the point of closest approach algorithm and maximum likelihood and expectation maximisation algorithm. Furthermore, we find that our method is robust to various corruptions such as inaccuracies in the muon momentum or a limited detector resolution. We also generate and publicly release the first large-scale dataset that maps muon detections to voxels. We hope that our research will spark further investigations into the potential of deep learning to revolutionise this field.

Machine Learning, Deep Learning, U-Net, ConvNeXt, Muon Tomography

1 Introduction

Muon tomography is an imaging technique that utilises muons, typically originating from cosmic rays, to image the interiors of objects. By leveraging the fact that muons are highly penetrating particles, muon tomography offers a non-invasive and non-destructive means of investigating the internal composition of dense materials. Through tracking and analysis of muon trajectories and energies, this enables the accurate reconstruction of internal structures and features. Consequently, muon tomography has emerged as a tool with wide-ranging applications in fields such as geophysics (Tioukov etΒ al., 2019), civil engineering (Saracino etΒ al., 2018), and archaeology (Borselli etΒ al., 2022; Procureur etΒ al., 2023a).

Due to cosmic rays entering the Earth’s atmosphere, there is no need for a specialised muon source unlike other types of tomography. However, there are many other challenges in this task. First, the flux of cosmic muons at sea-level is very low (Beringer etΒ al., 2012). In order to produce a decent reconstruction, data has to be collected for a long period of time. Furthermore, unlike other types of tomography such as x-ray computed tomography, muons will scatter off atomic nuclei. This makes the forward operator highly nonlinear. In contrast, in computed tomography, x-rays only attenuate and hence, the forward operator is the linear radon transform. In addition, due to limitations of current day muon detectors, the momentum of muons which significantly affects the muon scattering angle is not known. At best, we will have an estimate of the momentum p^^𝑝\hat{p}over^ start_ARG italic_p end_ARG with a significant amount of uncertainty.

Several methods have been developed to tackle this problem. The first algorithm developed was the Point of Closest Approach (PoCA) algorithm (Schultz, 2003) which assumes that the muons only scatter once at the point of closest approach of its inward and outward trajectory. The Maximum Likelihood and Expectation Maximisation (MLEM) algorithm (Schultz etΒ al., 2007) improves on PoCA and iteratively optimises the reconstruction to maximise the likelihood of producing a given scattering outcome. Other algorithms have also been developed such as maximum a posterori (MAP) (Wang etΒ al., 2009), most likely path (Schulte etΒ al., 2008; Yi etΒ al., 2014; Chatzidakis etΒ al., 2018), scattering density estimation (SDE) (Jonkmans etΒ al., 2013) and angle statistics reconstruction (ASR) (Stapleton etΒ al., 2014). Other methods have also been developed for low dosages such as the binned clustering algorithm (Thomay etΒ al., 2012) and a method based on density clustering (Hou etΒ al., 2021).

However, no one has attempted to make use of developments in deep learning to directly solve this ill-posed problem and go directly from muon detections to a 3D reconstruction. Due to the abundance of data which can be obtained from simulations from software such as Geant4 (Agostinelli etΒ al., 2003) and the non-linearity of the problem, deep learning methods are well-suited for this task. They have been previously applied to other inverse problems such as computed tomography (Szczykutowicz etΒ al., 2022).

In this work, we develop a novel two-stage deep learning algorithm, ΞΌπœ‡\muitalic_ΞΌ-Net, for cosmic muon scattering tomography, based on using the point of closest approach (PoCA) algorithm and the U-Net architecture proposed by Ronneberger etΒ al. (2015). To our knowledge, this is the first application of deep learning in muon scattering tomography to directly perform the 3D reconstruction. Instead of using the more traditional Residual Blocks, we make use of ConvNeXt Blocks (Liu etΒ al., 2022). We find that our model significantly outperforms traditional reconstruction algorithms (in speed and accuracy) and achieves state-of-the-art performance. Our method achieves a PSNR of 17.14 with only 1024 muons while PoCA achieves a PSNR of only 13.66 while taking 22.5s to run while ΞΌπœ‡\muitalic_ΞΌ-Net runs in 126 ms.

We have also generated the first large-scale dataset mapping muon detections to voxels. In prior literature, there has been no standard benchmark to evaluate various methods for 3D muon tomographic reconstructions and no quantitative metrics used to compare different methods. As such, we release our dataset and data generation code publicly. We hope this will spark more systematic investigations into reconstruction algorithms on this task, using deep learning or using traditional methods.

2 Preliminaries

2.1 Physical Background

Muon scattering relies primarily on modelling the scattering interaction between muons and matter. Rossi & Greisen (1941) developed a scattering theory for charged particles and found that charged particles travelling through a plate of thickness xπ‘₯xitalic_x that undergo Coulomb scattering have scattering angles and lateral displacements that follow a Gaussian distribution with mean, ΞΌ=0πœ‡0\mu=0italic_ΞΌ = 0 and variance,

Οƒ2=Es22⁒p2⁒v2⁒xLr⁒a⁒dsuperscript𝜎2superscriptsubscript𝐸𝑠22superscript𝑝2superscript𝑣2π‘₯subscriptπΏπ‘Ÿπ‘Žπ‘‘\sigma^{2}=\frac{E_{s}^{2}}{2p^{2}v^{2}}\frac{x}{L_{rad}}italic_Οƒ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = divide start_ARG italic_E start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_p start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_v start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG divide start_ARG italic_x end_ARG start_ARG italic_L start_POSTSUBSCRIPT italic_r italic_a italic_d end_POSTSUBSCRIPT end_ARG (1)

where Es=21subscript𝐸𝑠21E_{s}=21italic_E start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT = 21 MeV, Ξ»πœ†\lambdaitalic_Ξ» is the radiation length of the material, xπ‘₯xitalic_x is the thickness of the plate and p𝑝pitalic_p and v𝑣vitalic_v are the momentum and velocity of the muon respectively.

From this formula, we can define the parameter of interest, the scattering density Ξ»πœ†\lambdaitalic_Ξ».

Ξ»=Οƒ2x=15⁒M⁒e⁒Vp2⁒v2⁒1Lr⁒a⁒dπœ†superscript𝜎2π‘₯15𝑀𝑒𝑉superscript𝑝2superscript𝑣21subscriptπΏπ‘Ÿπ‘Žπ‘‘\lambda=\frac{\sigma^{2}}{x}=\frac{15MeV}{p^{2}v^{2}}\frac{1}{L_{rad}}italic_Ξ» = divide start_ARG italic_Οƒ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_x end_ARG = divide start_ARG 15 italic_M italic_e italic_V end_ARG start_ARG italic_p start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_v start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG divide start_ARG 1 end_ARG start_ARG italic_L start_POSTSUBSCRIPT italic_r italic_a italic_d end_POSTSUBSCRIPT end_ARG (2)

With this, the task of muon tomography is to find the distribution of Ξ»πœ†\lambdaitalic_Ξ» within the object through looking at positions and directions of the incoming and outgoing muons.

2.2 Problem Statement

Before proceeding with a literature review and the description of the method, let us formalize the muon reconstruction problem. We shall follow a similar notation to Schultz etΒ al. (2007).

Let the object of interest be defined by its scattering density,

λ⁒(x,y,z)=(15p0)2⁒1Lr⁒a⁒d⁒(x,y,z)πœ†π‘₯𝑦𝑧superscript15subscript𝑝021subscriptπΏπ‘Ÿπ‘Žπ‘‘π‘₯𝑦𝑧\lambda(x,y,z)=\left(\frac{15}{p_{0}}\right)^{2}\frac{1}{L_{rad}(x,y,z)}italic_Ξ» ( italic_x , italic_y , italic_z ) = ( divide start_ARG 15 end_ARG start_ARG italic_p start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT divide start_ARG 1 end_ARG start_ARG italic_L start_POSTSUBSCRIPT italic_r italic_a italic_d end_POSTSUBSCRIPT ( italic_x , italic_y , italic_z ) end_ARG (3)

where Lr⁒a⁒d⁒(x,y,z)subscriptπΏπ‘Ÿπ‘Žπ‘‘π‘₯𝑦𝑧L_{rad}(x,y,z)italic_L start_POSTSUBSCRIPT italic_r italic_a italic_d end_POSTSUBSCRIPT ( italic_x , italic_y , italic_z ) is the radiation length at each point within the object.

We can represent the scattering density in terms of some basis functions Ο•j⁒(x,y,z)subscriptitalic-ϕ𝑗π‘₯𝑦𝑧\phi_{j}(x,y,z)italic_Ο• start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ( italic_x , italic_y , italic_z ) such that

λ⁒(x,y,z)=βˆ‘jΞ±j⁒ϕ⁒(x,y,z)πœ†π‘₯𝑦𝑧subscript𝑗subscript𝛼𝑗italic-Ο•π‘₯𝑦𝑧\lambda(x,y,z)=\sum_{j}\alpha_{j}\phi(x,y,z)italic_Ξ» ( italic_x , italic_y , italic_z ) = βˆ‘ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_Ξ± start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_Ο• ( italic_x , italic_y , italic_z ) (4)

where 𝜢=[Ξ±j]𝜢delimited-[]subscript𝛼𝑗\boldsymbol{\alpha}=\left[\alpha_{j}\right]bold_italic_Ξ± = [ italic_Ξ± start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ] are the coefficients for the basis functions.

Suppose the muon detections, 𝐲𝐲\mathbf{y}bold_y follow a distribution D𝐷Ditalic_D parameterized by the scattering density of the object, i.e.

𝐘∼D⁒(𝜢)similar-to𝐘𝐷𝜢\mathbf{Y}\sim D(\boldsymbol{\alpha})bold_Y ∼ italic_D ( bold_italic_α ) (5)

Therefore, given a sample of n𝑛nitalic_n muon detections Yn=[𝐲𝟏,𝐲𝟐,…,𝐲𝐧]subscriptπ‘Œπ‘›subscript𝐲1subscript𝐲2…subscript𝐲𝐧Y_{n}=\left[\mathbf{y_{1}},\mathbf{y_{2}},\dots,\mathbf{y_{n}}\right]italic_Y start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT = [ bold_y start_POSTSUBSCRIPT bold_1 end_POSTSUBSCRIPT , bold_y start_POSTSUBSCRIPT bold_2 end_POSTSUBSCRIPT , … , bold_y start_POSTSUBSCRIPT bold_n end_POSTSUBSCRIPT ] we wish to construct a point estimate 𝐚⁒(Yn)𝐚subscriptπ‘Œπ‘›\mathbf{a}(Y_{n})bold_a ( italic_Y start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ) of 𝜢𝜢\boldsymbol{\alpha}bold_italic_Ξ±, which is approximated by the deep neural network π’‡πœ½β’(Yn)subscriptπ’‡πœ½subscriptπ‘Œπ‘›\boldsymbol{f_{\theta}}(Y_{n})bold_italic_f start_POSTSUBSCRIPT bold_italic_ΞΈ end_POSTSUBSCRIPT ( italic_Y start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ) parameterized by 𝜽𝜽\boldsymbol{\theta}bold_italic_ΞΈ.

In this paper, we will take p0subscript𝑝0p_{0}italic_p start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT to be 15 MeV, so that we directly regress the reciprocal of the radiation length 1Lr⁒a⁒d⁒(x,y,z)1subscriptπΏπ‘Ÿπ‘Žπ‘‘π‘₯𝑦𝑧\frac{1}{L_{rad}(x,y,z)}divide start_ARG 1 end_ARG start_ARG italic_L start_POSTSUBSCRIPT italic_r italic_a italic_d end_POSTSUBSCRIPT ( italic_x , italic_y , italic_z ) end_ARG in units of cmβˆ’11{}^{-1}start_FLOATSUPERSCRIPT - 1 end_FLOATSUPERSCRIPT.

2.3 Motivation

Some key features of the muon reconstruction problem are immediately apparent from the problem statement, which help us design our model architecture are listed below:

  • β€’

    The model should be permutation-invariant, i.e. the order in which muons are inputted into the model should not change the output.

  • β€’

    The model should accept any number of input muons.

  • β€’

    The model should be able to make use of paired input and output muon detections i.e. the model needs to know which input muon corresponds to which output.

  • β€’

    The model should be able to take advantage of the 3D spatial structure of the target output.

This seems to suggest making use of Transformers (Vaswani etΒ al., 2017) with the positional encoding removed. However, the dot product attention module used in Transformers has a large time-complexity of O⁒(N2)𝑂superscript𝑁2O(N^{2})italic_O ( italic_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) where N𝑁Nitalic_N is the dosage. This is not acceptable given dosages can go up to 10000 muons or more. The transformer will also be unable to take advantage of the spatial structure of the output. Hence, we will make use of the PoCA algorithm and the U-Net architecture (Ronneberger etΒ al., 2015) to create a two-stage algorithm.

3 Related Work

3.1 Deep Learning Methods

Point Clouds.

From the key features of the muon tomography problem, we see that it is highly similar to point cloud problems, which also take in permutation invariant data (a set of points), but the points themselves also exhibit some 3D structure. Deep learning methods on point cloud data can generally be classified into 2 main categories, neural networks which operates directly on the points, and neural networks which operate on a voxelized representation of the data (Bello etΒ al., 2020). In our work, we chose the latter approach since the former tends to be slightly worse at capturing spatial structure, and our desired output is also in the form of voxels.

U-Net.

The U-Net was first proposed as an medical image segmentation model (Ronneberger et al., 2015). It consists of a downward branch, where the image is downsampled and an upward branch, where the image is upsampled. Skip connections are also used between the downward and upward branches to allow high resolution features, which may be removed during downsampling, to be retained. U-Nets have also been extended to process 3D data (Çiçek et al., 2016; Ho et al., 2021), by replacing the standard 2D convolution operations with 3D convolutions.

ConvNext.

ConvNext is a recent iteration of the family of convolutional neural networks (Liu etΒ al., 2022). It was proposed as an improvement of the ResNet (He etΒ al., 2016) by incorporating methods found in Vision Transformer architectures, most notably the SWIN transformer (Liu etΒ al., 2021), and has been shown to obtain competitive performance against Transformer-based methods but with a lower parameter count. In our work, we will use a modified 3D ConvNext block to form our U-Net.

IEEE BigData 2023 Cup.

Concurrent to this work, is the ”IEEE BigData 2023 Cup: Object Recognition with Muon Tomography using Cosmic Rays” organised by Wnuk etΒ al. (2023). Unlike our work which addresses full three-dimensional reconstruction, they focus on reconstruction of 2-dimensional objects placed in the central plane. The metric used is the mean average IoU. However, this metric equally penalises models for misidentifying materials with a small difference in radiation length and a large difference in radiation length. Hence, in this paper, we adopt the mean squared error and the peak-signal to noise ratio (PSNR) as our primary metrics. We are unfortunately unable to compare our results to the results from this competition as at the time of writing, the competition report and the papers of the winning entries are not accessible online.

3.2 Traditional Algorithms

Point of Closest Approach.

The Point of Closest Approach (PoCA) method is a commonly used method for muon scattering tomography, first proposed by Schultz (2003). It assumes that the muon is scattered once by an object at the point of closest approach between the input and output trajectories. However, this fails to take into account the fact that in many cases, muons may scatter multiple times. As a result, there will be predictions that muons scatter at points where there is actually no object as illustrated in Figure 1. In addition, some information is lost as some muons may have a point of closest approach outside of the object space.

Refer to caption
Figure 1: A weakness of PoCA. Since the muon scatters more than once, the computed PoCA is not within any object.

Maximum Likelihood Expectation Maximization.

The maximum likelihood expectation maximisation (MLEM) algorithm is a statistical reconstruction method proposed by Schultz etΒ al. (2007). It makes use of the statistical distribution of muon scattering to frame muon reconstruction as a maximum likelihood problem. An iterative expectation maximisation algorithm is then used to find the scattering densities within the object that is most likely to result in the observed data collected by the muon detectors.

Existing literature (Zeng etΒ al., 2020) have found that MLEM has better qualitative performance than direct allocation to PoCA. However, in our experiments, we observed that MLEM has significantly lower performance (see Figure 10). We hypothesize that this is due to the lower muon dosages we used in our experiments, which makes it difficult to apply statistical methods to the reconstruction. Due to the lower performance and higher computational cost of MLEM, we will be comparing our model against PoCA for most of our experiments.

4 Methods

Refer to caption
Figure 2: First stage of ΞΌπœ‡\boldsymbol{\mu}bold_italic_ΞΌ-Net. The muon features (initial position, initial momentum, etc.) are passed through an MLP and placed at PoCA scattering points within a 3D volume. If they overlap, the average is taken.

Scatter Operation.

Our goal is to convert a set of muon detections into a reconstruction output. One simple way to do this is to first convert the muon detections into some 3D representation, which can then be fed into a U-Net.

To achieve this, we first apply an MLP on the muon’s input parameters. The output features are reshaped into a dΓ—dΓ—dΓ—c𝑑𝑑𝑑𝑐d\times d\times d\times citalic_d Γ— italic_d Γ— italic_d Γ— italic_c block. We call d𝑑ditalic_d the point size and c𝑐citalic_c is the number of channels. Now, using the PoCA algorithm, we find the muon’s point of closest approach and scatter the output features into the voxels nearest to the PoCA point. We choose the PoCA point as the information about the muons that scatter in a given area is the primary information that is needed to help decide what the scattering density of that area is. For muons that do not scatter, we place this block at a random point along their trajectory. This provides improved performance since the vast majority of muons will not scatter. Placing them randomly along their trajectory provides additional information to the model that the scattering density in that region is low.

After this, if there is overlap in the scattering voxels, the sum is taken and a counter keeps track of how muons scatter in a given voxel. An MLP is used to combine this counter with the other projected features.

This approach can be implemented in a memory-efficient manner using TensorFlow’s 𝚝𝚏.πšœπšŒπšŠπšπšπšŽπš›β’_β’πš—πšformulae-sequenceπšπšπšœπšŒπšŠπšπšπšŽπš›_πš—πš\mathtt{tf.scatter\_nd}typewriter_tf . typewriter_scatter _ typewriter_nd to avoid creating a large number of 3D tensors in parallel. In contrast, other possibilities like converting each muon detection into its own 3D tensor and then summing them will take up alot of memory and is in practise, infeasible.

Nevertheless, this approach does come with some limitations. The most prominent being that the scattering operation is fundamentally non-differentiable. This means that we are unable to parameterise the position of the scattering points using a neural network directly. As such, we resort to using PoCA to get an approximate scattering point for the muons.

Feature Engineering.

Each muon detection is a 14-length vector consisting of the input position, 𝐱𝟎subscript𝐱0\mathbf{x_{0}}bold_x start_POSTSUBSCRIPT bold_0 end_POSTSUBSCRIPT, output position, 𝐱𝐟subscript𝐱𝐟\mathbf{x_{f}}bold_x start_POSTSUBSCRIPT bold_f end_POSTSUBSCRIPT, input momentum, 𝐩^𝟎subscript^𝐩0\mathbf{\hat{p}_{0}}over^ start_ARG bold_p end_ARG start_POSTSUBSCRIPT bold_0 end_POSTSUBSCRIPT, output momentum, 𝐩^𝐟subscript^𝐩𝐟\mathbf{\hat{p}_{f}}over^ start_ARG bold_p end_ARG start_POSTSUBSCRIPT bold_f end_POSTSUBSCRIPT, an estimate of the momentum magnitude, |𝐩|𝐩|\mathbf{p}|| bold_p | and an estimate of the scattering angle, |𝐩^πŸβˆ’π©^𝐒|subscript^𝐩𝐟subscript^𝐩𝐒|\mathbf{\hat{p}_{f}}-\mathbf{\hat{p}_{i}}|| over^ start_ARG bold_p end_ARG start_POSTSUBSCRIPT bold_f end_POSTSUBSCRIPT - over^ start_ARG bold_p end_ARG start_POSTSUBSCRIPT bold_i end_POSTSUBSCRIPT |. Although this estimate of the scattering angle can be easily computed from the other features, its inclusion improves performance as the model can easily know which muons should have a larger activation because they scattered more.

Algorithm 1 Scattering Operation
Β Β Input: muon detections {𝝁1,…,𝝁n}subscript𝝁1…subscript𝝁𝑛\{\boldsymbol{\mu}_{1},\dots,\boldsymbol{\mu}_{n}\}{ bold_italic_ΞΌ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , bold_italic_ΞΌ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT }, resolution R𝑅Ritalic_R
Β Β Initialize π—βˆˆβ„rΓ—rΓ—rΓ—(c+1)𝐗superscriptβ„π‘Ÿπ‘Ÿπ‘Ÿπ‘1\mathbf{X}\in\mathbb{R}^{r\times r\times r\times(c+1)}bold_X ∈ blackboard_R start_POSTSUPERSCRIPT italic_r Γ— italic_r Γ— italic_r Γ— ( italic_c + 1 ) end_POSTSUPERSCRIPT
Β Β forΒ i=1𝑖1i=1italic_i = 1 to i=n𝑖𝑛i=nitalic_i = italic_nΒ do
     𝐲i←𝖬𝖫𝖯1⁒(𝝁i),𝐲iβˆˆβ„dΓ—dΓ—dΓ—cformulae-sequence←subscript𝐲𝑖subscript𝖬𝖫𝖯1subscript𝝁𝑖subscript𝐲𝑖superscriptℝ𝑑𝑑𝑑𝑐\mathbf{y}_{i}\leftarrow\mathsf{MLP}_{1}(\boldsymbol{\mu}_{i}),\mathbf{y}_{i}% \in\mathbb{R}^{d\times d\times d\times c}bold_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ← sansserif_MLP start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( bold_italic_ΞΌ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) , bold_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_d Γ— italic_d Γ— italic_d Γ— italic_c end_POSTSUPERSCRIPT
Β Β Β Β Β if 𝝁isubscript𝝁𝑖\boldsymbol{\mu}_{i}bold_italic_ΞΌ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT scattersΒ then
Β Β Β Β Β Β Β Β Place 𝐲isubscript𝐲𝑖\mathbf{y}_{i}bold_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT in X at π–―π—ˆπ–’π– β’(𝝁i)π–―π—ˆπ–’π– subscript𝝁𝑖\mathsf{PoCA}(\boldsymbol{\mu}_{i})sansserif_PoCA ( bold_italic_ΞΌ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT )
Β Β Β Β Β else
Β Β Β Β Β Β Β Β Place 𝐲isubscript𝐲𝑖\mathbf{y}_{i}bold_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT at a random point along the muon’s trajectory
Β Β Β Β Β endΒ if
Β Β Β Β Β Increment the corresponding last channel of X by 1 in the corresponding place
Β Β endΒ for
Β Β X′←𝖬𝖫𝖯2⁒(X),Xβ€²βˆˆβ„rΓ—rΓ—rΓ—cformulae-sequence←superscript𝑋′subscript𝖬𝖫𝖯2𝑋superscript𝑋′superscriptβ„π‘Ÿπ‘Ÿπ‘Ÿπ‘X^{\prime}\leftarrow\mathsf{MLP}_{2}(X),X^{\prime}\in\mathbb{R}^{r\times r% \times r\times c}italic_X start_POSTSUPERSCRIPT β€² end_POSTSUPERSCRIPT ← sansserif_MLP start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( italic_X ) , italic_X start_POSTSUPERSCRIPT β€² end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_r Γ— italic_r Γ— italic_r Γ— italic_c end_POSTSUPERSCRIPT
Β Β return Xβ€²superscript𝑋′X^{\prime}italic_X start_POSTSUPERSCRIPT β€² end_POSTSUPERSCRIPT

U-Net.

Now, we make use of the U-Net (Ronneberger etΒ al., 2015) to process the voxelised volume matrix from the first stage. In our U-Net, instead of using the more traditional Residual Blocks (He etΒ al., 2016), we make use of ConvNeXt Blocks (Liu etΒ al., 2022) which result in better performance than Residual Blocks at a lower computational cost. Each layer of the U-Net contains multiple such ConvNeXt blocks. Downsampling is done using Layer Normalisation followed by a convolutional layer of strides = 2. Upsampling is done by first applying a pointwise convolution and layer normalisation before using nearest neighbour upsampling.

Model Sizes.

In our experiments, we tested out 3 models - ΞΌπœ‡\muitalic_ΞΌ-Net-T, ΞΌπœ‡\muitalic_ΞΌ-Net-B and ΞΌπœ‡\muitalic_ΞΌ-Net-L. These variants only differ in the number of blocks B𝐡Bitalic_B and the number of channels C𝐢Citalic_C in each stage. The number of parameters of each of these models is listed as P𝑃Pitalic_P. We do not scale up the model further due to limited computational resources. We will leave further exploration of the scaling of ΞΌπœ‡\muitalic_ΞΌ-Net to future work.

  • β€’

    𝝁𝝁\boldsymbol{\mu}bold_italic_ΞΌ-Net-T: P=1.1⁒M𝑃1.1𝑀P=1.1Mitalic_P = 1.1 italic_M
    B=(1,2,3,4,5),C=(8,16,32,64,128)formulae-sequence𝐡12345𝐢8163264128B=(1,2,3,4,5),\;C=(8,16,32,64,128)italic_B = ( 1 , 2 , 3 , 4 , 5 ) , italic_C = ( 8 , 16 , 32 , 64 , 128 )

  • β€’

    𝝁𝝁\boldsymbol{\mu}bold_italic_ΞΌ-Net-B: P=5.0⁒M𝑃5.0𝑀P=5.0Mitalic_P = 5.0 italic_M
    B=(1,2,4,4,6),C=(16,32,64,128,256)formulae-sequence𝐡12446𝐢163264128256B=(1,2,4,4,6),\;C=(16,32,64,128,256)italic_B = ( 1 , 2 , 4 , 4 , 6 ) , italic_C = ( 16 , 32 , 64 , 128 , 256 )

  • β€’

    𝝁𝝁\boldsymbol{\mu}bold_italic_ΞΌ-Net-L: P=14.8⁒M𝑃14.8𝑀P=14.8Mitalic_P = 14.8 italic_M
    B=(1,2,4,6,8),C=(24,48,96,192,384)formulae-sequence𝐡12468𝐢244896192384B=(1,2,4,6,8),\;C=(24,48,96,192,384)italic_B = ( 1 , 2 , 4 , 6 , 8 ) , italic_C = ( 24 , 48 , 96 , 192 , 384 )

Training Techniques.

We train our model using the AdamW optimizer with a learning rate of 2Γ—10βˆ’32superscript1032\times 10^{-3}2 Γ— 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT and a weight decay of 4Γ—10βˆ’β’34superscript1034\times 10^{-}34 Γ— 10 start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT 3. The model is trained for 15 epochs.

Universal Approximation.

We have shown that ΞΌπœ‡\muitalic_ΞΌ-Net is an universal function approximator for continuous set functions given the model is large enough and either of the following conditions are met:

  • β€’

    The resolution of the reconstructed volume is sufficiently large such that there is no overlap in the reconstructed points or,

  • β€’

    the resolution of the reconstructed volume is finite but the point size is large enough to cover the entire reconstructed volume.

Formally, we can define Ο‡={S:SβŠ†β„mβ’π‘Žπ‘›π‘‘β’|S|=n}πœ’conditional-set𝑆𝑆superscriptβ„π‘šπ‘Žπ‘›π‘‘π‘†π‘›\chi=\{S:S\subseteq\mathbb{R}^{m}\;\mathit{and}\;|S|=n\}italic_Ο‡ = { italic_S : italic_S βŠ† blackboard_R start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT italic_and | italic_S | = italic_n } and f:χ→ℝ:π‘“β†’πœ’β„f:\chi\rightarrow\mathbb{R}italic_f : italic_Ο‡ β†’ blackboard_R is a continuous set function w.r.t the Hausdorff distance dH⁒(β‹…,β‹…)subscript𝑑𝐻⋅⋅d_{H}(\cdot,\cdot)italic_d start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT ( β‹… , β‹… ). Our theorem proves that f𝑓fitalic_f can be arbitrarily approximated by our model if the resolution is sufficiently high, or if the resolution is fixed but the point size d𝑑ditalic_d is the same as the resolution.

Theorem 4.1.

Suppose f:χ→ℝpnormal-:𝑓normal-β†’πœ’superscriptℝ𝑝f:\chi\rightarrow\mathbb{R}^{p}italic_f : italic_Ο‡ β†’ blackboard_R start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT is a continuous set function w.r.t dH⁒(β‹…,β‹…)subscript𝑑𝐻normal-β‹…normal-β‹…d_{H}(\cdot,\cdot)italic_d start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT ( β‹… , β‹… ), such that for all Ο΅>0italic-Ο΅0\epsilon>0italic_Ο΅ > 0, there exists some configuration of the model parameters ΞΈπœƒ\thetaitalic_ΞΈ for sufficiently large p𝑝pitalic_p or ϕ⁒(η⁒(xi))=JpΓ—ditalic-Ο•πœ‚subscriptπ‘₯𝑖subscript𝐽𝑝𝑑\phi(\eta(x_{i}))=J_{p\times d}italic_Ο• ( italic_Ξ· ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) = italic_J start_POSTSUBSCRIPT italic_p Γ— italic_d end_POSTSUBSCRIPT (i.e. the indicator function maps to every point), such that for any SβˆˆΟ‡π‘†πœ’S\in\chiitalic_S ∈ italic_Ο‡,

|f(S)βˆ’Ξ³ΞΈ([βˆ‘xi∈S{Ο•(Ξ·(xi))β‹…hΞΈ(xi)},βˆ‘xi∈S{Ο•(Ξ·(xi))β‹…JdΓ—c}])|<ϡ𝑓𝑆subscriptπ›Ύπœƒsubscriptsubscriptπ‘₯𝑖𝑆⋅italic-Ο•πœ‚subscriptπ‘₯𝑖subscriptβ„Žπœƒsubscriptπ‘₯𝑖subscriptsubscriptπ‘₯𝑖𝑆⋅italic-Ο•πœ‚subscriptπ‘₯𝑖subscript𝐽𝑑𝑐italic-Ο΅\left|f(S)-\gamma_{\theta}\left(\left[\sum_{x_{i}\in S}\{\phi(\eta(x_{i}))% \cdot h_{\theta}(x_{i})\},\right.\right.\right.\\ \left.\left.\left.\sum_{x_{i}\in S}\{\phi(\eta(x_{i}))\cdot J_{d\times c}\}% \right]\right)\right|<\epsilonstart_ROW start_CELL | italic_f ( italic_S ) - italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( [ βˆ‘ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S end_POSTSUBSCRIPT { italic_Ο• ( italic_Ξ· ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) β‹… italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) } , end_CELL end_ROW start_ROW start_CELL βˆ‘ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S end_POSTSUBSCRIPT { italic_Ο• ( italic_Ξ· ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) β‹… italic_J start_POSTSUBSCRIPT italic_d Γ— italic_c end_POSTSUBSCRIPT } ] ) | < italic_Ο΅ end_CELL end_ROW (6)

where Ξ³ΞΈ:ℝpΓ—c→ℝpnormal-:subscriptπ›Ύπœƒnormal-β†’superscriptℝ𝑝𝑐superscriptℝ𝑝\gamma_{\theta}:\mathbb{R}^{p\times c}\rightarrow\mathbb{R}^{p}italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT : blackboard_R start_POSTSUPERSCRIPT italic_p Γ— italic_c end_POSTSUPERSCRIPT β†’ blackboard_R start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT is any continuous function, hΞΈ:ℝm→ℝdΓ—cnormal-:subscriptβ„Žπœƒnormal-β†’superscriptβ„π‘šsuperscriptℝ𝑑𝑐h_{\theta}:\mathbb{R}^{m}\rightarrow\mathbb{R}^{d\times c}italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT : blackboard_R start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT β†’ blackboard_R start_POSTSUPERSCRIPT italic_d Γ— italic_c end_POSTSUPERSCRIPT is any continuous function, Ξ·:ℝm→ℝnormal-:πœ‚normal-β†’superscriptβ„π‘šβ„\eta:\mathbb{R}^{m}\rightarrow\mathbb{R}italic_Ξ· : blackboard_R start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT β†’ blackboard_R, Ο•:ℝm→ℝpΓ—dnormal-:italic-Ο•normal-β†’superscriptβ„π‘šsuperscriptℝ𝑝𝑑\phi:\mathbb{R}^{m}\rightarrow\mathbb{R}^{p\times d}italic_Ο• : blackboard_R start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT β†’ blackboard_R start_POSTSUPERSCRIPT italic_p Γ— italic_d end_POSTSUPERSCRIPT and JdΓ—csubscript𝐽𝑑𝑐J_{d\times c}italic_J start_POSTSUBSCRIPT italic_d Γ— italic_c end_POSTSUBSCRIPT is the ones matrix of shape (d,c)𝑑𝑐(d,c)( italic_d , italic_c ). Ξ·πœ‚\etaitalic_Ξ· represents the PoCA function that generates a scattering point from the muon detection. Ο•italic-Ο•\phiitalic_Ο• is an indicator function for a set of intervals of length d𝑑ditalic_d derived from its input. The indicator function for each of these intervals is placed along one row in the last dimension. Ξ³ΞΈsubscriptπ›Ύπœƒ\gamma_{\theta}italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT and hΞΈsubscriptβ„Žπœƒh_{\theta}italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT can be taken to be any continuous function due to the universal approximation theorem for CNNs and MLPs. [𝐀,𝐁]𝐀𝐁[\textbf{A},\textbf{B}][ A , B ] represents the concatenation of 2 matrices along their last axes.

A brief proof is provided in the Appendix. We also note this theorem generalises to all dimensions easily by replacing p𝑝pitalic_p with rΓ—rπ‘Ÿπ‘Ÿr\times ritalic_r Γ— italic_r, rΓ—rΓ—rπ‘Ÿπ‘Ÿπ‘Ÿr\times r\times ritalic_r Γ— italic_r Γ— italic_r, etc.

The limitations of the theorem to arbitrarily large resolutions or point sizes comes from our lack of assumptions about the PoCA and scatter operations (i.e. ϕ⁒(η⁒(xi))italic-Ο•πœ‚subscriptπ‘₯𝑖\phi(\eta(x_{i}))italic_Ο• ( italic_Ξ· ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ). The nature of these functions depends on the interactions between the muons and the object and is very difficult to analyse due to the complex interactions that muons exhibit with matter.

However, despite these limitations, we hypothesise that our model performs well with a fixed resolution and point size since most of the information about the muon scattering is contained near the the scattering point.

5 Experiments

5.1 Experimental Setup

Our data is generated using CERN’s Geant4 simulation software (Agostinelli etΒ al., 2003). To generate 3D objects to be analysed by our system, we make use of fractal noise to generate objects of various shapes. The materials of the object is randomly chosen from a set list of materials of different radiation lengths, taken from the IEEE BigData Competition on Muon Tomography (Wnuk etΒ al., 2023).

The geometry of the system can be found in Figure 3. The target object is contained within a cube of side length 1 m. The input and output detectors are squares of side length 2 m. They are separated from the object by a distance of 0.5 m.

Refer to caption
Figure 3: Simulation setup. A scale diagram (except the voxels) illustrating the simulation setup within Geant4.

For the muon beam, we use a beam with a cos2superscript2\cos^{2}roman_cos start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT angular distribution and a power law distribution, in accordance with characterised values of the cosmic muon flux (Shukla & Sankrith, 2018). Muons that are calculated not to hit the detector are killed at the start of the simulation to ensure the simulation runs at a reasonable speed for data generation.

We use 20000 samples for the training set, 1600 samples for the validation set and 1600 samples for the test set. Our model is implemented using TensorFlow and trained using 2 T4 GPUs.

5.2 Ablations

Point Size.

First, we vary the point size at various dosages to see its impact on the model’s performance. Interestingly, we find that a smaller point size of 1 results in the best performance.

Table 1: Smaller is better. The results of the model at different dosages for various point sizes. The inference times are evaluated on 2 T4 GPUs with a batch size of 8. The best results are bolded. The smaller point size of 1 outperforms the larger point size of 3 on almost all dosages.

Point Size Dosage Time↓normal-↓\downarrow↓ MSE↓normal-↓\downarrow↓ MAE↓normal-↓\downarrow↓ PSNR↑normal-↑\uparrow↑

1

1024 126 ms 0.2276 0.2204 17.1426

3

1024 134 ms 0.2289 0.2366 17.0971

1

2048 135 ms 0.1965 0.1989 17.7786

3

2048 143 ms 0.1947 0.1949 17.8280

1

4096 141 ms 0.1653 0.1725 18.5347

3

4096 178 ms 0.1685 0.1741 18.4482

1

8192 169 ms 0.1388 0.1438 19.2979

3

8192 219 ms 0.1403 0.1465 19.2434

1

16384 246 ms 0.1169 0.1207 20.0433

3

16384 318 ms 0.1205 0.1340 19.9047

1

32768 347 ms 0.0993 0.1156 20.7906

3

32768 574 ms 0.1048 0.1224 20.4946

Estimate of Scattering Angle.

We then test out the impact of not providing an estimate of the scattering angle. For a dosage of 1024 muons using ΞΌπœ‡\muitalic_ΞΌ-Net-T, this results in a PSNR of 16.9142, in contrast with a PSNR of 17.1426 when it is provided. This demonstrates that including the scattering angle, although it can be computed from the other features in the muon detection, helps to improve the model’s accuracy.

Random Placement of Muons.

We also test out the impact of placing the non-scattered muons at the centre of the their trajectory, rather than at a random point along their trajectory. For a dosage of 1024 muons using ΞΌπœ‡\muitalic_ΞΌ-Net-T, this results in a PSNR of 16.8650, in contrast with a PSNR of 17.1426 when the muons are placed randomly along their trajectory. This similarly demonstrates the effectiveness of this method in spreading out the information about unscattered muons across the 3D voxels, enabling a more accurate reconstruction.

5.3 Model Scaling

Refer to caption
Figure 4: Model scaling has little impact. The PSNR of the different model sizes plotted against dosage levels. We see that for low dosages, the model size has little impact. However, at larger dosages, the impact of model size is greater.

Now, we explore the impact of the scaling of the model on the performance. The results are plotted in Figure 4. We notice that the improvements in performance are actually quite small, especially for small dosages. We hypothesise that this is because we are near the ”limit” of how good the reconstruction can be given the U-Net’s input of the PoCA points. It is likely that attaining a better estimate of the scattering points would lead to better performance.

Furthermore, we notice that at larger dosages, the improvements in performance increase, likely due to the problem being more complex, hence, providing more room for improvement with a larger model.

5.4 Comparison with Traditional Algorithms

Refer to caption
Figure 5: 𝝁𝝁\boldsymbol{\mu}bold_italic_ΞΌ-Net outperforms PoCA. The PSNR of ΞΌπœ‡\muitalic_ΞΌ-Net and PoCA plotted against dosage levels. ΞΌπœ‡\muitalic_ΞΌ-Net consistently outperforms PoCA over all dosages and all model sizes.

Dosage.

We vary the dosage of muons from 1024 to 32768. The results are shown in Figure 4. We see that as the dosage increases, the performance of the models increase as well. It is also clear that our model, ΞΌπœ‡\muitalic_ΞΌ-Net, significantly outperforms the the traditional PoCA algorithm in Figure 5. However, we notice that at the gradient of the graph for PoCA appearing to be increasing. Future study is needed to ascertain if this is just a statistical anomaly or if the PoCA indeed curves upwards.

Refer to caption
Figure 6: 𝝁𝝁\boldsymbol{\mu}bold_italic_ΞΌ-Net is robust to momentum error. The PSNR of various methods plotted against percentage error in the momentum estimate. ΞΌπœ‡\muitalic_ΞΌ-Net*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT represents the model’s performance when it is finetuned on the new data. Our model is consistently shown to be robust to perturbations and to significantly outperform the PoCA algorithm.

Momentum Estimate.

We also look at how varying levels of error in the momentum will affect predictions in Figure 6. We again find that our model significantly outperforms the traditional PoCA algorithm. After fine-tuning, we also see that the model’s performance stays relatively constant as the error in the momentum increases, indicating our model is robust.

Refer to caption
Figure 7: 𝝁𝝁\boldsymbol{\mu}bold_italic_ΞΌ-Net is robust to detector resolution. The PSNR of various methods plotted against detector resolution. ΞΌπœ‡\muitalic_ΞΌ-Net*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT represents the model’s performance when it is finetuned on the new data. Our model is consistently shown to be robust to perturbations and to significantly outperform the PoCA algorithm.

Detector Resolution.

Finally, we look at how the model’s performance changes with the detector resolution in Figure 7. Again, we find that our model significantly outperforms the PoCA algorithm. In addition, we find that our model performs well at a variety of resolutions, showing that it is very robust. We also notice that the performance of PoCA stays constant across all resolutions and this is because the reconstruction volume has a resolution of 64Γ—64Γ—6464646464\times 64\times 6464 Γ— 64 Γ— 64.

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 8: 2D Reconstruction Results. This is a cross-section of an object made of 3 different materials. The ability of the model to reconstruct the approximate shapes and differentiate materials is shown clearly. However, there is still a significant amount of blurring. (left) ground truth, (middle-left) ΞΌπœ‡\muitalic_ΞΌ-Net, (middle-right) PoCA (right) MLEM. MLEM was performed at a much lower resolution because it requires muon tracks to intersect with every voxel.
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 9: 3D Reconstruction Results. This is a 3D plot and a cross-section of one of the imaging targets from the testing set. The ability of the model to reconstruct the approximate shapes and differentiate materials is shown clearly. However, there is still a significant amount of blurring. (left) ground truth, (middle) ΞΌπœ‡\muitalic_ΞΌ-Net, (right) PoCA

Visual Comparison.

Now, we visually compare the reconstructions of PoCA and MLEM. These reconstructions are shown in Figure 8 and 9. More reconstructions from different dosages can be found in Appendix B. A dosage of 32768 muons is used. The MLEM reconstruction is significantly worse since it had to be done using a lower resolution, because the algorithm requires the muon tracks to pass through every voxel, which is not possible at higher resolutions.

We see that ΞΌπœ‡\muitalic_ΞΌ-Net provides superior reconstruction quality compared to PoCA. Furthermore, ΞΌπœ‡\muitalic_ΞΌ-Net is also able to distinguish different materials by their radiation length and accurately reconstruct the approximate shape of objects. Nevertheless, we still observe significant levels of blurring. Further analysis of the artifacts found in the reconstruction can be found in Appendix C.

6 Discussion

6.1 Applications

Nuclear Non-proliferation.

Muon tomography has been applied in detecting the presence of high-Z materials such as radioactive materials since these materials result in a large amount of scattering. It can also be applied for general screening of cargo for high-Z materials (Barnes etΒ al., 2023). Our model can be used to help improve the quality of these reconstructions, potentially reducing the dosage required to screen the cargo, enabling lower screening times.

Archaeology.

In addition, muon tomography also has numerous applications in archaeology. In particular, the ability of muons to easily penetrate thick layers of material such as rock, enables their use to image the interiors of large structures. For instance, researchers have used muon tomography to discover secret chambers within Khufu’s Pyramid (Procureur etΒ al., 2023a, b), image underground cavities in Mount Echia (Cimmino etΒ al., 2019; Saracino etΒ al., 2017) and discover a secret ancient Greek burial chamber in the centre of Naples (Tioukov etΒ al., 2023). In cases where the input detections are available, our model can be used to significantly improve the accuracy of the reconstructions.

6.2 Future Work

No Input Muon Information.

One limitation of the model is that it depends on the presence of information about the input muons because of its use of the PoCA algorithm. However, in some cases, such as in some archaeological applications, the direction of the input muons is unavailable.

Trajectory Prediction.

As shown in Figure 1, when muons scatter more than once, their PoCA will be outside of the boundaries of any object. This is a significant limitation of PoCA and sometimes results in false hotspots of scattering density in the final prediction. One solution to this could be to attempt to predict the muon’s full trajectory using a method like (Benton etΒ al., 2012) given an initial guess of the scattering density, which can be obtained from our current model. Using this information, more accurate coordinates of scattering points can be obtained, allowing a more accurate scattering density to be obtained. This process can then be iterated until convergence.

Out-of-Distribution Generalisation.

It is not clear to what extent the model’s performance is independent of factors such as the angular and spacial distribution of cosmic ray muons. In addition, in real life, the spacial distribution of the scattering densities will not be that of fractal noise, like what the model was trained on. Before such a model can be deployed in the real-world, it would be important to check its ability to generalise.

7 Conclusion

In conclusion, we have constructed a state-of-the-art model for muon scattering tomography which outperforms traditional methods such as PoCA and MLEM. Furthermore, we find that our model is robust to various corruptions, with its performance barely changing when they are applied. We hope that our research will spark further investigation into the usage of deep learning in this field. Improvements in imaging techniques for muon scattering tomography will have wide-ranging applications from ensuring nuclear non-proliferation to the discovery of secret chambers in ancient structures.

Acknowledgments

The authors would like to thank their friend Kannan Vishal for helping with the writing of an initial simulation in Geant4, their friends Prannaya Gupta, Kabir Jain, Mahir Hitesh for providing computing power for this project and their teachers Mr Silas Yeem Kai Ean and Mr Ng Chee Loong for providing useful advice.

References

  • Agostinelli etΒ al. (2003) Agostinelli, S., Allison, J., Amako, K., Apostolakis, J., Araujo, H., Arce, P., Asai, M., Axen, D., Banerjee, S., Barrand, G., Behner, F., Bellagamba, L., Boudreau, J., Broglia, L., Brunengo, A., Burkhardt, H., Chauvie, S., Chuma, J., Chytracek, R., Cooperman, G., Cosmo, G., Degtyarenko, P., Dell'Acqua, A., Depaola, G., Dietrich, D., Enami, R., Feliciello, A., Ferguson, C., Fesefeldt, H., Folger, G., Foppiano, F., Forti, A., Garelli, S., Giani, S., Giannitrapani, R., Gibin, D., Cadenas, J.Β G., GonzΓ‘lez, I., Abril, G.Β G., Greeniaus, G., Greiner, W., Grichine, V., Grossheim, A., Guatelli, S., Gumplinger, P., Hamatsu, R., Hashimoto, K., Hasui, H., Heikkinen, A., Howard, A., Ivanchenko, V., Johnson, A., Jones, F., Kallenbach, J., Kanaya, N., Kawabata, M., Kawabata, Y., Kawaguti, M., Kelner, S., Kent, P., Kimura, A., Kodama, T., Kokoulin, R., Kossov, M., Kurashige, H., Lamanna, E., LampΓ©n, T., Lara, V., Lefebure, V., Lei, F., Liendl, M., Lockman, W., Longo, F., Magni, S., Maire, M., Medernach, E., Minamimoto, K., deΒ Freitas, P.Β M., Morita, Y., Murakami, K., Nagamatu, M., Nartallo, R., Nieminen, P., Nishimura, T., Ohtsubo, K., Okamura, M., O'Neale, S., Oohata, Y., Paech, K., Perl, J., Pfeiffer, A., Pia, M., Ranjard, F., Rybin, A., Sadilov, S., Salvo, E.Β D., Santin, G., Sasaki, T., Savvas, N., Sawada, Y., Scherer, S., Sei, S., Sirotenko, V., Smith, D., Starkov, N., Stoecker, H., Sulkimo, J., Takahata, M., Tanaka, S., Tcherniaev, E., Tehrani, E.Β S., Tropeano, M., Truscott, P., Uno, H., Urban, L., Urban, P., Verderi, M., Walkden, A., Wander, W., Weber, H., Wellisch, J., Wenaus, T., Williams, D., Wright, D., Yamada, T., Yoshida, H., and Zschiesche, D. Geant4β€”a simulation toolkit. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 506(3):250–303, July 2003. doi: 10.1016/s0168-9002(03)01368-8. URL https://doi.org/10.1016/s0168-9002(03)01368-8.
  • Barnes etΒ al. (2023) Barnes, S., Georgadze, A., Giammanco, A., Kiisk, M., Kudryavtsev, V.Β A., Lagrange, M., and Pinto, O.Β L. Cosmic-ray tomography for border security. Instruments, 7(1), 2023. ISSN 2410-390X. doi: 10.3390/instruments7010013. URL https://www.mdpi.com/2410-390X/7/1/13.
  • Bello etΒ al. (2020) Bello, S.Β A., Yu, S., Wang, C., Adam, J.Β M., and Li, J. Deep learning on 3d point clouds. Remote Sensing, 12(11):1729, 2020.
  • Benton etΒ al. (2012) Benton, C.Β J., Smith, N.Β D., Quillin, S.Β J., and Steer, C.Β A. Most probable trajectory of a muon in a scattering medium, when input and output trajectories are known. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 693:154–159, November 2012. doi: 10.1016/j.nima.2012.07.008. URL https://doi.org/10.1016/j.nima.2012.07.008.
  • Beringer etΒ al. (2012) Beringer, J., Arguin, J.Β F., Barnett, R.Β M., Copic, K., Dahl, O., Groom, D.Β E., Lin, C.Β J., Lys, J., Murayama, H., Wohl, C.Β G., Yao, W.Β M., Zyla, P.Β A., Amsler, C., Antonelli, M., Asner, D.Β M., Baer, H., Band, H.Β R., Basaglia, T., Bauer, C.Β W., Beatty, J.Β J., Belousov, V.Β I., Bergren, E., Bernardi, G., Bertl, W., Bethke, S., Bichsel, H., Biebel, O., Blucher, E., Blusk, S., Brooijmans, G., Buchmueller, O., Cahn, R.Β N., Carena, M., Ceccucci, A., Chakraborty, D., Chen, M.Β C., Chivukula, R.Β S., Cowan, G., D'Ambrosio, G., Damour, T., deΒ Florian, D., deΒ GouvΓͺa, A., DeGrand, T., deΒ Jong, P., Dissertori, G., Dobrescu, B., Doser, M., Drees, M., Edwards, D.Β A., Eidelman, S., Erler, J., Ezhela, V.Β V., Fetscher, W., Fields, B.Β D., Foster, B., Gaisser, T.Β K., Garren, L., Gerber, H.Β J., Gerbier, G., Gherghetta, T., Golwala, S., Goodman, M., Grab, C., Gritsan, A.Β V., Grivaz, J.Β F., GrΓΌnewald, M., Gurtu, A., Gutsche, T., Haber, H.Β E., Hagiwara, K., Hagmann, C., Hanhart, C., Hashimoto, S., Hayes, K.Β G., Heffner, M., Heltsley, B., HernΓ‘ndez-Rey, J.Β J., Hikasa, K., HΓΆcker, A., Holder, J., Holtkamp, A., Huston, J., Jackson, J.Β D., Johnson, K.Β F., Junk, T., Karlen, D., Kirkby, D., Klein, S.Β R., Klempt, E., Kowalewski, R.Β V., Krauss, F., Kreps, M., Krusche, B., Kuyanov, Y.Β V., Kwon, Y., Lahav, O., Laiho, J., Langacker, P., Liddle, A., Ligeti, Z., Liss, T.Β M., Littenberg, L., Lugovsky, K.Β S., Lugovsky, S.Β B., Mannel, T., Manohar, A.Β V., Marciano, W.Β J., Martin, A.Β D., Masoni, A., Matthews, J., Milstead, D., Miquel, R., MΓΆnig, K., Moortgat, F., Nakamura, K., Narain, M., Nason, P., Navas, S., Neubert, M., Nevski, P., Nir, Y., Olive, K.Β A., Pape, L., Parsons, J., Patrignani, C., Peacock, J.Β A., Petcov, S.Β T., Piepke, A., Pomarol, A., Punzi, G., Quadt, A., Raby, S., Raffelt, G., Ratcliff, B.Β N., Richardson, P., Roesler, S., Rolli, S., Romaniouk, A., Rosenberg, L.Β J., Rosner, J.Β L., Sachrajda, C.Β T., Sakai, Y., Salam, G.Β P., Sarkar, S., Sauli, F., Schneider, O., Scholberg, K., Scott, D., Seligman, W.Β G., Shaevitz, M.Β H., Sharpe, S.Β R., Silari, M., SjΓΆstrand, T., Skands, P., Smith, J.Β G., Smoot, G.Β F., Spanier, S., Spieler, H., Stahl, A., Stanev, T., Stone, S.Β L., Sumiyoshi, T., Syphers, M.Β J., Takahashi, F., Tanabashi, M., Terning, J., Titov, M., Tkachenko, N.Β P., TΓΆrnqvist, N.Β A., Tovey, D., Valencia, G., van Bibber, K., Venanzoni, G., Vincter, M.Β G., Vogel, P., Vogt, A., Walkowiak, W., Walter, C.Β W., Ward, D.Β R., Watari, T., Weiglein, G., Weinberg, E.Β J., Wiencke, L.Β R., Wolfenstein, L., Womersley, J., Woody, C.Β L., Workman, R.Β L., Yamamoto, A., Zeller, G.Β P., Zenin, O.Β V., Zhang, J., Zhu, R.Β Y., Harper, G., Lugovsky, V.Β S., and Schaffner, P. Review of particle physics. Physical Review D, 86(1), July 2012. doi: 10.1103/physrevd.86.010001. URL https://doi.org/10.1103/physrevd.86.010001.
  • Borselli etΒ al. (2022) Borselli, D., Beni, T., Bonechi, L., Bongi, M., Brocchini, D., Casagli, N., Ciaranfi, R., Cimmino, L., Ciulli, V., D’Alessandro, R., Dini, A., Frosin, C., Gigli, G., Gonzi, S., Guideri, S., Lombardi, L., Nocentini, M., and Saracino, G. Three-dimensional muon imaging of cavities inside the temperino mine (italy). Scientific Reports, 12(1), December 2022. doi: 10.1038/s41598-022-26393-7. URL https://doi.org/10.1038/s41598-022-26393-7.
  • Chatzidakis etΒ al. (2018) Chatzidakis, S., Liu, Z., Hayward, J.Β P., and Scaglione, J.Β M. A generalized muon trajectory estimation algorithm with energy loss for application to muon tomography. Journal of Applied Physics, 123(12), mar 2018. doi: 10.1063/1.5024671. URL https://doi.org/10.1063%2F1.5024671.
  • Γ‡iΓ§ek etΒ al. (2016) Γ‡iΓ§ek, Γ–., Abdulkadir, A., Lienkamp, S.Β S., Brox, T., and Ronneberger, O. 3d u-net: learning dense volumetric segmentation from sparse annotation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2016: 19th International Conference, Athens, Greece, October 17-21, 2016, Proceedings, Part II 19, pp.Β  424–432. Springer, 2016.
  • Cimmino etΒ al. (2019) Cimmino, L., Baccani, G., Noli, P., Amato, L., Ambrosino, F., Bonechi, L., Bongi, M., Ciulli, V., D’Alessandro, R., D’Errico, M., Gonzi, S., Melon, B., Minin, G., Saracino, G., Scognamiglio, L., Strolin, P., and Viliani, L. 3d muography for the search of hidden cavities. Scientific Reports, 9(1), February 2019. doi: 10.1038/s41598-019-39682-5. URL https://doi.org/10.1038/s41598-019-39682-5.
  • He etΒ al. (2016) He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, pp.Β 770–778. IEEE Computer Society, 2016. doi: 10.1109/CVPR.2016.90. URL https://doi.org/10.1109/CVPR.2016.90.
  • Ho etΒ al. (2021) Ho, N.-V., Nguyen, T., Diep, G.-H., Le, N., and Hua, B.-S. Point-unet: A context-aware point-based neural network for volumetric segmentation. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, September 27–October 1, 2021, Proceedings, Part I 24, pp.Β  644–655. Springer, 2021.
  • Hou etΒ al. (2021) Hou, L., Zhang, Q., Yang, J., Cai, X., Yao, Q., Huo, Y., and Chen, Q. A novel reconstruction algorithm based on density clustering for cosmic-ray muon scattering inspection. Nuclear Engineering and Technology, 53(7):2348–2356, July 2021. doi: 10.1016/j.net.2021.01.014. URL https://doi.org/10.1016/j.net.2021.01.014.
  • Jonkmans etΒ al. (2013) Jonkmans, G., Anghel, V., Jewett, C., and Thompson, M. Nuclear waste imaging and spent fuel verification by muon tomography. Annals of Nuclear Energy, 53:267–273, March 2013. doi: 10.1016/j.anucene.2012.09.011. URL https://doi.org/10.1016/j.anucene.2012.09.011.
  • Liu etΒ al. (2021) Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF international conference on computer vision, pp.Β  10012–10022, 2021.
  • Liu etΒ al. (2022) Liu, Z., Mao, H., Wu, C., Feichtenhofer, C., Darrell, T., and Xie, S. A convnet for the 2020s. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, June 18-24, 2022, pp.Β 11966–11976. IEEE, 2022. doi: 10.1109/CVPR52688.2022.01167. URL https://doi.org/10.1109/CVPR52688.2022.01167.
  • Procureur etΒ al. (2023a) Procureur, S., Morishima, K., Kuno, M., Manabe, Y., Kitagawa, N., Nishio, A., Gomez, H., AttiΓ©, D., Sakakibara, A., Hikata, K., Moto, M., Mandjavidze, I., Magnier, P., Lehuraux, M., Benoit, T., Calvet, D., Coppolani, X., Kebbiri, M., Mas, P., Helal, H., Tayoubi, M., Marini, B., Serikoff, N., Anwar, H., Steiger, V., Takasaki, F., Fujii, H., Satoh, K., Kodama, H., Hayashi, K., Gable, P., Guerriero, E., Mouret, J.-B., Elnady, T., Elshayeb, Y., and Elkarmoty, M. Precise characterization of a corridor-shaped structure in khufu’s pyramid by observation of cosmic-ray muons. Nature Communications, 14(1), March 2023a. doi: 10.1038/s41467-023-36351-0. URL https://doi.org/10.1038/s41467-023-36351-0.
  • Procureur etΒ al. (2023b) Procureur, S., Morishima, K., Kuno, M., Manabe, Y., Kitagawa, N., Nishio, A., Gomez, H., AttiΓ©, D., Sakakibara, A., Hikata, K., Moto, M., Mandjavidze, I., Magnier, P., Lehuraux, M., Benoit, T., Calvet, D., Coppolani, X., Kebbiri, M., Mas, P., Helal, H., Tayoubi, M., Marini, B., Serikoff, N., Anwar, H., Steiger, V., Takasaki, F., Fujii, H., Satoh, K., Kodama, H., Hayashi, K., Gable, P., Guerriero, E., Mouret, J.-B., Elnady, T., Elshayeb, Y., and Elkarmoty, M. Precise characterization of a corridor-shaped structure in khufu’s pyramid by observation of cosmic-ray muons. Nature Communications, 14(1), March 2023b. doi: 10.1038/s41467-023-36351-0. URL https://doi.org/10.1038/s41467-023-36351-0.
  • Ronneberger etΒ al. (2015) Ronneberger, O., Fischer, P., and Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Navab, N., Hornegger, J., III, W. M.Β W., and Frangi, A.Β F. (eds.), Medical Image Computing and Computer-Assisted Intervention - MICCAI 2015 - 18th International Conference Munich, Germany, October 5 - 9, 2015, Proceedings, Part III, volume 9351 of Lecture Notes in Computer Science, pp.Β  234–241. Springer, 2015. doi: 10.1007/978-3-319-24574-4_28. URL https://doi.org/10.1007/978-3-319-24574-4_28.
  • Rossi & Greisen (1941) Rossi, B. and Greisen, K. Cosmic-ray theory. Reviews of Modern Physics, 13(4):240, 1941.
  • Saracino etΒ al. (2017) Saracino, G., Amato, L., Ambrosino, F., Antonucci, G., Bonechi, L., Cimmino, L., Consiglio, L., Alessandro, R.Β D., Luzio, E.Β D., Minin, G., Noli, P., Scognamiglio, L., Strolin, P., and Varriale, A. Imaging of underground cavities with cosmic-ray muons from observations at mt. echia (naples). Scientific Reports, 7(1), April 2017. doi: 10.1038/s41598-017-01277-3. URL https://doi.org/10.1038/s41598-017-01277-3.
  • Saracino etΒ al. (2018) Saracino, G., Ambrosino, F., Bonechi, L., Cimmino, L., D'Alessandro, R., D'Errico, M., Noli, P., Scognamiglio, L., and Strolin, P. Applications of muon absorption radiography to the fields of archaeology and civil engineering. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 377(2137):20180057, December 2018. doi: 10.1098/rsta.2018.0057. URL https://doi.org/10.1098/rsta.2018.0057.
  • Schulte etΒ al. (2008) Schulte, R.Β W., Penfold, S.Β N., Tafas, J.Β T., and Schubert, K.Β E. A maximum likelihood proton path formalism for application in proton computed tomography. Medical Physics, 35(11):4849–4856, oct 2008. doi: 10.1118/1.2986139. URL https://doi.org/10.1118%2F1.2986139.
  • Schultz (2003) Schultz, L.Β J. Cosmic Ray Muon Radiography. PhD thesis, Portland State University, 2003.
  • Schultz etΒ al. (2007) Schultz, L.Β J., Blanpied, G.Β S., Borozdin, K.Β N., Fraser, A.Β M., Hengartner, N.Β W., Klimenko, A.Β V., Morris, C.Β L., Orum, C., and Sossong, M.Β J. Statistical reconstruction for cosmic ray muon tomography. IEEE transactions on Image Processing, 16(8):1985–1993, 2007.
  • Shukla & Sankrith (2018) Shukla, P. and Sankrith, S. Energy and angular distributions of atmospheric muons at the earth. International Journal of Modern Physics A, 33(30):1850175, October 2018. doi: 10.1142/s0217751x18501750. URL https://doi.org/10.1142/s0217751x18501750.
  • Stapleton etΒ al. (2014) Stapleton, M., Burns, J., Quillin, S., and Steer, C. Angle statistics reconstruction: a robust reconstruction algorithm for muon scattering tomography. Journal of Instrumentation, 9(11):P11019–P11019, November 2014. doi: 10.1088/1748-0221/9/11/p11019. URL https://doi.org/10.1088/1748-0221/9/11/p11019.
  • Szczykutowicz etΒ al. (2022) Szczykutowicz, T.Β P., Toia, G.Β V., Dhanantwari, A., and Nett, B. A review of deep learning CT reconstruction: Concepts, limitations, and promise in clinical practice. Current Radiology Reports, 10(9):101–115, July 2022. doi: 10.1007/s40134-022-00399-5. URL https://doi.org/10.1007/s40134-022-00399-5.
  • Thomay etΒ al. (2012) Thomay, C., Baesso, P., Cussans, D., Davies, J., Glaysher, P., Quillin, S., Robertson, S., Steer, C., Vassallo, C., and Velthuis, J. A novel technique to detect special nuclear material using cosmic rays. Geoscientific Instrumentation, Methods and Data Systems, 1(2):235–238, December 2012. doi: 10.5194/gi-1-235-2012. URL https://doi.org/10.5194/gi-1-235-2012.
  • Tioukov etΒ al. (2019) Tioukov, V., Alexandrov, A., Bozza, C., Consiglio, L., D’Ambrosio, N., Lellis, G.Β D., Sio, C.Β D., Giudicepietro, F., Macedonio, G., Miyamoto, S., Nishiyama, R., Orazi, M., Peluso, R., Sheshukov, A., Sirignano, C., Stellacci, S.Β M., Strolin, P., and Tanaka, H. K.Β M. First muography of stromboli volcano. Scientific Reports, 9(1), April 2019. doi: 10.1038/s41598-019-43131-8. URL https://doi.org/10.1038/s41598-019-43131-8.
  • Tioukov etΒ al. (2023) Tioukov, V., Morishima, K., Leggieri, C., Capriuoli, F., Kitagawa, N., Kuno, M., Manabe, Y., Nishio, A., Alexandrov, A., Gentile, V., Iuliano, A., and Lellis, G.Β D. Hidden chamber discovery in the underground hellenistic necropolis of neapolis by muography. Scientific Reports, 13(1), April 2023. doi: 10.1038/s41598-023-32626-0. URL https://doi.org/10.1038/s41598-023-32626-0.
  • Vaswani etΒ al. (2017) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.Β N., Kaiser, L., and Polosukhin, I. Attention is all you need. In Guyon, I., von Luxburg, U., Bengio, S., Wallach, H.Β M., Fergus, R., Vishwanathan, S. V.Β N., and Garnett, R. (eds.), Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, pp.Β 5998–6008, 2017. URL https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html.
  • Wang etΒ al. (2009) Wang, G., Schultz, L., and Qi, J. Bayesian image reconstruction for improving detection performance of muon tomography. IEEE Transactions on Image Processing, 18(5):1080–1089, May 2009. doi: 10.1109/tip.2009.2014423. URL https://doi.org/10.1109/tip.2009.2014423.
  • Wnuk etΒ al. (2023) Wnuk, M., Dziuba, J., Janusz, A., and Slezak, D. Ieee bigdata 2023 cup: Object recognition with muon tomography using cosmic rays, 2023. URL https://knowledgepit.ai/object-recognition-with-muon-tomography/.
  • Yi etΒ al. (2014) Yi, H., Zeng, Z., Yu, B., Cheng, J., Zhao, Z., Wang, X., Zeng, M., and Wang, Y. Bayesian-theory-based most probable trajectory reconstruction algorithm in cosmic ray muon tomography. In 2014 IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), pp.Β  1–4, 2014. doi: 10.1109/NSSMIC.2014.7431084.
  • Zaheer etΒ al. (2017) Zaheer, M., Kottur, S., Ravanbakhsh, S., Poczos, B., Salakhutdinov, R.Β R., and Smola, A.Β J. Deep sets. In Guyon, I., Luxburg, U.Β V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R. (eds.), Advances in Neural Information Processing Systems, volumeΒ 30. Curran Associates, Inc., 2017. URL https://proceedings.neurips.cc/paper_files/paper/2017/file/f22e4747da1aa27e363d86d40ff442fe-Paper.pdf.
  • Zeng etΒ al. (2020) Zeng, W., Zeng, M., Pan, X., Zeng, Z., Ma, H., and Cheng, J. Principle study of image reconstruction algorithms in muon tomography. Journal of Instrumentation, 15(02):T02005, 2020.

Appendix A Proofs

To prove TheoremΒ 4.1, we shall split the theorem into the arbitrarily large resolution case and the arbitrary large point size case.

A.1 Arbitrarily Large Resolution

Theorem A.1.

Suppose f:χ→ℝpnormal-:𝑓normal-β†’πœ’superscriptℝ𝑝f:\chi\rightarrow\mathbb{R}^{p}italic_f : italic_Ο‡ β†’ blackboard_R start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT is a continuous set function w.r.t dH⁒(β‹…,β‹…)subscript𝑑𝐻normal-β‹…normal-β‹…d_{H}(\cdot,\cdot)italic_d start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT ( β‹… , β‹… ), such that for all Ο΅>0italic-Ο΅0\epsilon>0italic_Ο΅ > 0, there exists some configuration of the model parameters ΞΈπœƒ\thetaitalic_ΞΈ for sufficiently large p𝑝pitalic_p or ϕ⁒(η⁒(xi))=JpΓ—ditalic-Ο•πœ‚subscriptπ‘₯𝑖subscript𝐽𝑝𝑑\phi(\eta(x_{i}))=J_{p\times d}italic_Ο• ( italic_Ξ· ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) = italic_J start_POSTSUBSCRIPT italic_p Γ— italic_d end_POSTSUBSCRIPT (i.e. the indicator function maps to every point), such that for any SβˆˆΟ‡π‘†πœ’S\in\chiitalic_S ∈ italic_Ο‡,

|f⁒(S)βˆ’Ξ³ΞΈβ’([βˆ‘xi∈S{ϕ⁒(η⁒(xi))β‹…hθ⁒(xi)},βˆ‘xi∈S{ϕ⁒(η⁒(xi))β‹…JdΓ—c}])|<ϡ𝑓𝑆subscriptπ›Ύπœƒsubscriptsubscriptπ‘₯𝑖𝑆⋅italic-Ο•πœ‚subscriptπ‘₯𝑖subscriptβ„Žπœƒsubscriptπ‘₯𝑖subscriptsubscriptπ‘₯𝑖𝑆⋅italic-Ο•πœ‚subscriptπ‘₯𝑖subscript𝐽𝑑𝑐italic-Ο΅\left|f(S)-\gamma_{\theta}\left(\left[\sum_{x_{i}\in S}\{\phi(\eta(x_{i}))% \cdot h_{\theta}(x_{i})\},\sum_{x_{i}\in S}\{\phi(\eta(x_{i}))\cdot J_{d\times c% }\}\right]\right)\right|<\epsilon| italic_f ( italic_S ) - italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( [ βˆ‘ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S end_POSTSUBSCRIPT { italic_Ο• ( italic_Ξ· ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) β‹… italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) } , βˆ‘ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S end_POSTSUBSCRIPT { italic_Ο• ( italic_Ξ· ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) β‹… italic_J start_POSTSUBSCRIPT italic_d Γ— italic_c end_POSTSUBSCRIPT } ] ) | < italic_Ο΅

where Ξ³ΞΈ:ℝpΓ—c→ℝpnormal-:subscriptπ›Ύπœƒnormal-β†’superscriptℝ𝑝𝑐superscriptℝ𝑝\gamma_{\theta}:\mathbb{R}^{p\times c}\rightarrow\mathbb{R}^{p}italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT : blackboard_R start_POSTSUPERSCRIPT italic_p Γ— italic_c end_POSTSUPERSCRIPT β†’ blackboard_R start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT is any continuous function, hΞΈ:ℝm→ℝdΓ—cnormal-:subscriptβ„Žπœƒnormal-β†’superscriptβ„π‘šsuperscriptℝ𝑑𝑐h_{\theta}:\mathbb{R}^{m}\rightarrow\mathbb{R}^{d\times c}italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT : blackboard_R start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT β†’ blackboard_R start_POSTSUPERSCRIPT italic_d Γ— italic_c end_POSTSUPERSCRIPT is any continuous function, Ξ·:ℝm→ℝnormal-:πœ‚normal-β†’superscriptβ„π‘šβ„\eta:\mathbb{R}^{m}\rightarrow\mathbb{R}italic_Ξ· : blackboard_R start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT β†’ blackboard_R, Ο•:ℝm→ℝpΓ—dnormal-:italic-Ο•normal-β†’superscriptβ„π‘šsuperscriptℝ𝑝𝑑\phi:\mathbb{R}^{m}\rightarrow\mathbb{R}^{p\times d}italic_Ο• : blackboard_R start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT β†’ blackboard_R start_POSTSUPERSCRIPT italic_p Γ— italic_d end_POSTSUPERSCRIPT and JdΓ—csubscript𝐽𝑑𝑐J_{d\times c}italic_J start_POSTSUBSCRIPT italic_d Γ— italic_c end_POSTSUBSCRIPT is the ones matrix of shape (d,c)𝑑𝑐(d,c)( italic_d , italic_c ). Ξ·πœ‚\etaitalic_Ξ· represents the PoCA function that generates a scattering point from the muon detection. Ο•italic-Ο•\phiitalic_Ο• is an indicator function for a set of intervals of length d𝑑ditalic_d derived from its input. The indicator function for each of these intervals is placed along one row in the last dimension. Ξ³ΞΈsubscriptπ›Ύπœƒ\gamma_{\theta}italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT and hΞΈsubscriptβ„Žπœƒh_{\theta}italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT can be taken to be any continuous function due to the universal approximation theorem for CNNs and MLPs. [𝐀,𝐁]𝐀𝐁[\textbf{A},\textbf{B}][ A , B ] represents the concatenation of 2 matrices along their last axes.

Proof.

The idea is that for a sufficiently large p𝑝pitalic_p, the indicator functions of Ο•italic-Ο•\phiitalic_Ο• will not overlap, allowing S𝑆Sitalic_S to be recovered exactly using an inverse function T𝑇Titalic_T.
Let hΞΈsubscriptβ„Žπœƒh_{\theta}italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT simply be the identity function. 111Since hΞΈsubscriptβ„Žπœƒh_{\theta}italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT is the identity function, c=mπ‘π‘šc=mitalic_c = italic_m Now, consider a function 𝒯:ℝpΓ—mβ†’Ο‡:𝒯→superscriptβ„π‘π‘šπœ’\mathcal{T}:\mathbb{R}^{p\times m}\rightarrow\chicaligraphic_T : blackboard_R start_POSTSUPERSCRIPT italic_p Γ— italic_m end_POSTSUPERSCRIPT β†’ italic_Ο‡, 𝒯⁒(X)={x:|x|>0,xβˆˆβ„m,x is an entry in the last dimension of X}𝒯𝑋conditional-setπ‘₯formulae-sequenceπ‘₯0π‘₯superscriptβ„π‘šx is an entry in the last dimension of X\mathcal{T}(X)=\{x:|x|>0,x\in\mathbb{R}^{m},\textit{x is an entry in the last % dimension of X}\}caligraphic_T ( italic_X ) = { italic_x : | italic_x | > 0 , italic_x ∈ blackboard_R start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT , x is an entry in the last dimension of X }. Now, we define Ξ³ΞΈsubscriptπ›Ύπœƒ\gamma_{\theta}italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT as fβˆ˜π’―π‘“π’―f\circ\mathcal{T}italic_f ∘ caligraphic_T. Clearly,

|f⁒(S)βˆ’f⁒(𝒯⁒([βˆ‘xi∈S{ϕ⁒(η⁒(xi))β‹…hθ⁒(xi)},βˆ‘xi∈S{ϕ⁒(η⁒(xi))β‹…JdΓ—c}]))|=|f⁒(S)βˆ’f⁒(S)|=0<ϡ𝑓𝑆𝑓𝒯subscriptsubscriptπ‘₯𝑖𝑆⋅italic-Ο•πœ‚subscriptπ‘₯𝑖subscriptβ„Žπœƒsubscriptπ‘₯𝑖subscriptsubscriptπ‘₯𝑖𝑆⋅italic-Ο•πœ‚subscriptπ‘₯𝑖subscript𝐽𝑑𝑐𝑓𝑆𝑓𝑆0italic-Ο΅\left|f(S)-f\left(\mathcal{T}\left(\left[\sum_{x_{i}\in S}\{\phi(\eta(x_{i}))% \cdot h_{\theta}(x_{i})\},\sum_{x_{i}\in S}\{\phi(\eta(x_{i}))\cdot J_{d\times c% }\}\right]\right)\right)\right|=\left|f(S)-f(S)\right|=0<\epsilon| italic_f ( italic_S ) - italic_f ( caligraphic_T ( [ βˆ‘ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S end_POSTSUBSCRIPT { italic_Ο• ( italic_Ξ· ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) β‹… italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) } , βˆ‘ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S end_POSTSUBSCRIPT { italic_Ο• ( italic_Ξ· ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) β‹… italic_J start_POSTSUBSCRIPT italic_d Γ— italic_c end_POSTSUBSCRIPT } ] ) ) | = | italic_f ( italic_S ) - italic_f ( italic_S ) | = 0 < italic_Ο΅

∎

A.2 Arbitrarily Large Point Size

Theorem A.2.

Suppose f:χ→ℝpnormal-:𝑓normal-β†’πœ’superscriptℝ𝑝f:\chi\rightarrow\mathbb{R}^{p}italic_f : italic_Ο‡ β†’ blackboard_R start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT is a continuous set function w.r.t dH⁒(β‹…,β‹…)subscript𝑑𝐻normal-β‹…normal-β‹…d_{H}(\cdot,\cdot)italic_d start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT ( β‹… , β‹… ), such that for all Ο΅>0italic-Ο΅0\epsilon>0italic_Ο΅ > 0, there exists some configuration of the model parameters ΞΈπœƒ\thetaitalic_ΞΈ for sufficiently large p𝑝pitalic_p or ϕ⁒(η⁒(xi))=JpΓ—ditalic-Ο•πœ‚subscriptπ‘₯𝑖subscript𝐽𝑝𝑑\phi(\eta(x_{i}))=J_{p\times d}italic_Ο• ( italic_Ξ· ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) = italic_J start_POSTSUBSCRIPT italic_p Γ— italic_d end_POSTSUBSCRIPT (i.e. the indicator function maps to every point), such that for any SβˆˆΟ‡π‘†πœ’S\in\chiitalic_S ∈ italic_Ο‡,

|f⁒(S)βˆ’Ξ³ΞΈβ’([βˆ‘xi∈S{ϕ⁒(η⁒(xi))β‹…hθ⁒(xi)},βˆ‘xi∈S{ϕ⁒(η⁒(xi))β‹…JdΓ—c}])|<ϡ𝑓𝑆subscriptπ›Ύπœƒsubscriptsubscriptπ‘₯𝑖𝑆⋅italic-Ο•πœ‚subscriptπ‘₯𝑖subscriptβ„Žπœƒsubscriptπ‘₯𝑖subscriptsubscriptπ‘₯𝑖𝑆⋅italic-Ο•πœ‚subscriptπ‘₯𝑖subscript𝐽𝑑𝑐italic-Ο΅\left|f(S)-\gamma_{\theta}\left(\left[\sum_{x_{i}\in S}\{\phi(\eta(x_{i}))% \cdot h_{\theta}(x_{i})\},\sum_{x_{i}\in S}\{\phi(\eta(x_{i}))\cdot J_{d\times c% }\}\right]\right)\right|<\epsilon| italic_f ( italic_S ) - italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( [ βˆ‘ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S end_POSTSUBSCRIPT { italic_Ο• ( italic_Ξ· ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) β‹… italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) } , βˆ‘ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S end_POSTSUBSCRIPT { italic_Ο• ( italic_Ξ· ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) β‹… italic_J start_POSTSUBSCRIPT italic_d Γ— italic_c end_POSTSUBSCRIPT } ] ) | < italic_Ο΅

where Ξ³ΞΈ:ℝpΓ—c→ℝpnormal-:subscriptπ›Ύπœƒnormal-β†’superscriptℝ𝑝𝑐superscriptℝ𝑝\gamma_{\theta}:\mathbb{R}^{p\times c}\rightarrow\mathbb{R}^{p}italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT : blackboard_R start_POSTSUPERSCRIPT italic_p Γ— italic_c end_POSTSUPERSCRIPT β†’ blackboard_R start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT is any continuous function, hΞΈ:ℝm→ℝdΓ—cnormal-:subscriptβ„Žπœƒnormal-β†’superscriptβ„π‘šsuperscriptℝ𝑑𝑐h_{\theta}:\mathbb{R}^{m}\rightarrow\mathbb{R}^{d\times c}italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT : blackboard_R start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT β†’ blackboard_R start_POSTSUPERSCRIPT italic_d Γ— italic_c end_POSTSUPERSCRIPT is any continuous function, Ξ·:ℝm→ℝnormal-:πœ‚normal-β†’superscriptβ„π‘šβ„\eta:\mathbb{R}^{m}\rightarrow\mathbb{R}italic_Ξ· : blackboard_R start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT β†’ blackboard_R, Ο•:ℝm→ℝpΓ—dnormal-:italic-Ο•normal-β†’superscriptβ„π‘šsuperscriptℝ𝑝𝑑\phi:\mathbb{R}^{m}\rightarrow\mathbb{R}^{p\times d}italic_Ο• : blackboard_R start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT β†’ blackboard_R start_POSTSUPERSCRIPT italic_p Γ— italic_d end_POSTSUPERSCRIPT and JdΓ—csubscript𝐽𝑑𝑐J_{d\times c}italic_J start_POSTSUBSCRIPT italic_d Γ— italic_c end_POSTSUBSCRIPT is the ones matrix of shape (d,c)𝑑𝑐(d,c)( italic_d , italic_c ). Ξ·πœ‚\etaitalic_Ξ· represents the PoCA function that generates a scattering point from the muon detection. Ο•italic-Ο•\phiitalic_Ο• is an indicator function for a set of intervals of length d𝑑ditalic_d derived from its input. The indicator function for each of these intervals is placed along one row in the last dimension. Ξ³ΞΈsubscriptπ›Ύπœƒ\gamma_{\theta}italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT and hΞΈsubscriptβ„Žπœƒh_{\theta}italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT can be taken to be any continuous function due to the universal approximation theorem for CNNs and MLPs. [𝐀,𝐁]𝐀𝐁[\textbf{A},\textbf{B}][ A , B ] represents the concatenation of 2 matrices along their last axis.

Proof.

In the case of ϕ⁒(η⁒(xi))=JpΓ—ditalic-Ο•πœ‚subscriptπ‘₯𝑖subscript𝐽𝑝𝑑\phi(\eta(x_{i}))=J_{p\times d}italic_Ο• ( italic_Ξ· ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) = italic_J start_POSTSUBSCRIPT italic_p Γ— italic_d end_POSTSUBSCRIPT, the theorem reduces to

|f⁒(S)βˆ’Ξ³ΞΈβ’([βˆ‘xi∈S{hθ⁒(xi)},JpΓ—1])|<ϡ𝑓𝑆subscriptπ›Ύπœƒsubscriptsubscriptπ‘₯𝑖𝑆subscriptβ„Žπœƒsubscriptπ‘₯𝑖subscript𝐽𝑝1italic-Ο΅\left|f(S)-\gamma_{\theta}\left(\left[\sum_{x_{i}\in S}\{h_{\theta}(x_{i})\},J% _{p\times 1}\right]\right)\right|<\epsilon| italic_f ( italic_S ) - italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( [ βˆ‘ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S end_POSTSUBSCRIPT { italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) } , italic_J start_POSTSUBSCRIPT italic_p Γ— 1 end_POSTSUBSCRIPT ] ) | < italic_Ο΅

which can be equivalently expressed as

|f⁒(S)βˆ’Ξ³ΞΈβ’(βˆ‘xi∈S{hθ⁒(xi)})|<ϡ𝑓𝑆subscriptπ›Ύπœƒsubscriptsubscriptπ‘₯𝑖𝑆subscriptβ„Žπœƒsubscriptπ‘₯𝑖italic-Ο΅\left|f(S)-\gamma_{\theta}\left(\sum_{x_{i}\in S}\{h_{\theta}(x_{i})\}\right)% \right|<\epsilon| italic_f ( italic_S ) - italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( βˆ‘ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S end_POSTSUBSCRIPT { italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) } ) | < italic_Ο΅

given that Ξ³ΞΈsubscriptπ›Ύπœƒ\gamma_{\theta}italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT is a universal function approximator. By invoking Theorem 7 in (Zaheer etΒ al., 2017), any continuous permutation invariant set function can be decomposed into the form

f⁒(S)=ρ⁒(βˆ‘xi∈S{ψ⁒(xi)})π‘“π‘†πœŒsubscriptsubscriptπ‘₯π‘–π‘†πœ“subscriptπ‘₯𝑖f(S)=\rho(\sum_{x_{i}\in S}\{\psi(x_{i})\})italic_f ( italic_S ) = italic_ρ ( βˆ‘ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S end_POSTSUBSCRIPT { italic_ψ ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) } )

where ρ𝜌\rhoitalic_ρ and Οˆπœ“\psiitalic_ψ are continuous functions. By the universal approximation theorem for CNNs and MLPs, there exists some ΞΈπœƒ\thetaitalic_ΞΈ such that

ρ⁒(βˆ‘xi∈Sψ⁒(xi))=γθ⁒(βˆ‘xi∈S{hθ⁒(xi)})𝜌subscriptsubscriptπ‘₯π‘–π‘†πœ“subscriptπ‘₯𝑖subscriptπ›Ύπœƒsubscriptsubscriptπ‘₯𝑖𝑆subscriptβ„Žπœƒsubscriptπ‘₯𝑖\rho(\sum_{x_{i}\in S}\psi(x_{i}))=\gamma_{\theta}\left(\sum_{x_{i}\in S}\{h_{% \theta}(x_{i})\}\right)italic_ρ ( βˆ‘ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S end_POSTSUBSCRIPT italic_ψ ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) = italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( βˆ‘ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S end_POSTSUBSCRIPT { italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) } )

Therefore,

|f⁒(S)βˆ’Ξ³ΞΈβ’(βˆ‘xi∈S{hθ⁒(xi)})|=|f⁒(S)βˆ’f⁒(S)|<ϡ𝑓𝑆subscriptπ›Ύπœƒsubscriptsubscriptπ‘₯𝑖𝑆subscriptβ„Žπœƒsubscriptπ‘₯𝑖𝑓𝑆𝑓𝑆italic-Ο΅\left|f(S)-\gamma_{\theta}\left(\sum_{x_{i}\in S}\{h_{\theta}(x_{i})\}\right)% \right|=\left|f(S)-f(S)\right|<\epsilon| italic_f ( italic_S ) - italic_Ξ³ start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( βˆ‘ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S end_POSTSUBSCRIPT { italic_h start_POSTSUBSCRIPT italic_ΞΈ end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) } ) | = | italic_f ( italic_S ) - italic_f ( italic_S ) | < italic_Ο΅

∎

Appendix B Reconstruction Results

π–¦π—‹π—ˆπ—Žπ—‡π–½β’π–³π—‹π—Žπ—π—π–¦π—‹π—ˆπ—Žπ—‡π–½π–³π—‹π—Žπ—π—\sf{Ground\;Truth}sansserif_Ground sansserif_Truth 𝟣𝟒𝟀𝟦1024\sf{1024}sansserif_1024 𝟀𝟒𝟦πŸͺ2048\sf{2048}sansserif_2048 𝟦𝟒𝟫𝟨4096\sf{4096}sansserif_4096 πŸͺ𝟣𝟫𝟀8192\sf{8192}sansserif_8192 𝟣𝟨πŸ₯πŸͺ𝟦16384\sf{16384}sansserif_16384 πŸ₯𝟀𝟩𝟨πŸͺ32768\sf{32768}sansserif_32768
Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption
Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption
Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption
Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption
Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption
Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption
Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption
Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption
Figure 10: 2D Cross-sections. 2D cross-sections of the 3D reconstructions produced by ΞΌπœ‡\muitalic_ΞΌ-Net-L at various dosages. The improvement in reconstruction quality as the dosage increases can be seen clearly. We also see that some cross-sections appear to have worse cross-sections. This is because the materials being reconstructed have a high radiation length, so the muons do not scatter very much.
π–¦π—‹π—ˆπ—Žπ—‡π–½β’π–³π—‹π—Žπ—π—π–¦π—‹π—ˆπ—Žπ—‡π–½π–³π—‹π—Žπ—π—\sf{Ground\;Truth}sansserif_Ground sansserif_Truth 𝟣𝟒𝟀𝟦1024\sf{1024}sansserif_1024 𝟀𝟒𝟦πŸͺ2048\sf{2048}sansserif_2048 𝟦𝟒𝟫𝟨4096\sf{4096}sansserif_4096
Refer to caption Refer to caption Refer to caption Refer to caption
πŸͺ𝟣𝟫𝟀8192\sf{8192}sansserif_8192 𝟣𝟨πŸ₯πŸͺ𝟦16384\sf{16384}sansserif_16384 πŸ₯𝟀𝟩𝟨πŸͺ32768\sf{32768}sansserif_32768
Refer to caption Refer to caption Refer to caption
π–¦π—‹π—ˆπ—Žπ—‡π–½β’π–³π—‹π—Žπ—π—π–¦π—‹π—ˆπ—Žπ—‡π–½π–³π—‹π—Žπ—π—\sf{Ground\;Truth}sansserif_Ground sansserif_Truth 𝟣𝟒𝟀𝟦1024\sf{1024}sansserif_1024 𝟀𝟒𝟦πŸͺ2048\sf{2048}sansserif_2048 𝟦𝟒𝟫𝟨4096\sf{4096}sansserif_4096
Refer to caption Refer to caption Refer to caption Refer to caption
πŸͺ𝟣𝟫𝟀8192\sf{8192}sansserif_8192 𝟣𝟨πŸ₯πŸͺ𝟦16384\sf{16384}sansserif_16384 πŸ₯𝟀𝟩𝟨πŸͺ32768\sf{32768}sansserif_32768
Refer to caption Refer to caption Refer to caption
Figure 11: 3D Reconstructions. More 3D reconstructions produced by ΞΌπœ‡\muitalic_ΞΌ-Net-L at various dosages. The improvement in reconstruction quality as the dosage increases can be seen clearly
π–¦π—‹π—ˆπ—Žπ—‡π–½β’π–³π—‹π—Žπ—π—π–¦π—‹π—ˆπ—Žπ—‡π–½π–³π—‹π—Žπ—π—\sf{Ground\;Truth}sansserif_Ground sansserif_Truth 𝟣𝟒𝟀𝟦1024\sf{1024}sansserif_1024 𝟀𝟒𝟦πŸͺ2048\sf{2048}sansserif_2048 𝟦𝟒𝟫𝟨4096\sf{4096}sansserif_4096
Refer to caption Refer to caption Refer to caption Refer to caption
πŸͺ𝟣𝟫𝟀8192\sf{8192}sansserif_8192 𝟣𝟨πŸ₯πŸͺ𝟦16384\sf{16384}sansserif_16384 πŸ₯𝟀𝟩𝟨πŸͺ32768\sf{32768}sansserif_32768
Refer to caption Refer to caption Refer to caption
π–¦π—‹π—ˆπ—Žπ—‡π–½β’π–³π—‹π—Žπ—π—π–¦π—‹π—ˆπ—Žπ—‡π–½π–³π—‹π—Žπ—π—\sf{Ground\;Truth}sansserif_Ground sansserif_Truth 𝟣𝟒𝟀𝟦1024\sf{1024}sansserif_1024 𝟀𝟒𝟦πŸͺ2048\sf{2048}sansserif_2048 𝟦𝟒𝟫𝟨4096\sf{4096}sansserif_4096
Refer to caption Refer to caption Refer to caption Refer to caption
πŸͺ𝟣𝟫𝟀8192\sf{8192}sansserif_8192 𝟣𝟨πŸ₯πŸͺ𝟦16384\sf{16384}sansserif_16384 πŸ₯𝟀𝟩𝟨πŸͺ32768\sf{32768}sansserif_32768
Refer to caption Refer to caption Refer to caption
Figure 12: 3D Reconstructions (continued). Even more 3D reconstructions produced by ΞΌπœ‡\muitalic_ΞΌ-Net-L at various dosages. The improvement in reconstruction quality as the dosage increases can be seen clearly

Appendix C Artifact Analysis

Little Squares.

We can see these little square in Figure 10 (first row). These are caused by the muons that do not scatter and are placed randomly along their trajectories. Since the model typically will see points placed within the voxels corresponding to there being actual material there, there is a slightly larger scattering density predicted at these regions where there should be nothing. These artifacts are only visible within the cross-section when there is nothing else inside (as is the case for the first row of Figure 10).

π–¦π—‹π—ˆπ—Žπ—‡π–½β’π–³π—‹π—Žπ—π—π–¦π—‹π—ˆπ—Žπ—‡π–½π–³π—‹π—Žπ—π—\sf{Ground\;Truth}sansserif_Ground sansserif_Truth 𝟣𝟒𝟀𝟦1024\sf{1024}sansserif_1024 𝟀𝟒𝟦πŸͺ2048\sf{2048}sansserif_2048 𝟦𝟒𝟫𝟨4096\sf{4096}sansserif_4096 πŸͺ𝟣𝟫𝟀8192\sf{8192}sansserif_8192 𝟣𝟨πŸ₯πŸͺ𝟦16384\sf{16384}sansserif_16384 πŸ₯𝟀𝟩𝟨πŸͺ32768\sf{32768}sansserif_32768
Refer to caption \stackinsetl7.2mmb2.6mm
Refer to caption
\stackinsetl7.2mmb2.6mm
Refer to caption
\stackinsetl7.2mmb2.6mm
Refer to caption
Refer to caption Refer to caption Refer to caption
Refer to caption \stackinsetl10mmb5.5mm
Refer to caption
\stackinsetl10mmb5.5mm
Refer to caption
\stackinsetl10mmb3.5mm
Refer to caption
\stackinsetl10mmb3.5mm
Refer to caption
\stackinsetl10mmb3.5mm
Refer to caption
\stackinsetl10mmb3.5mm
Refer to caption
Figure 13: Hotspots. 2D cross-sections of the 3D reconstructions produced by ΞΌπœ‡\muitalic_ΞΌ-Net-L at various dosages. The false PoCA hotspots are circled in red. We see that as the dosage increases, these hotposts fade in prominence.

False Hotspots.

We can see false hotspots in Figure 10 (2nd and 3rd law row) and highlighted in Figure 13. These are the result of muons scattering twice when passing through materials. PoCA assumes that muons scatter only once. This means that if a muon scatters twice, its PoCA point will end up somewhere end the midpoint of its actual scattering points. We also notice that as the dosage increases, these hotspots tend to fade in prominence. This is likely because with a larger dosages, the model is better able to distinguish between real scattering points and these false hotspots.

π–¦π—‹π—ˆπ—Žπ—‡π–½β’π–³π—‹π—Žπ—π—π–¦π—‹π—ˆπ—Žπ—‡π–½π–³π—‹π—Žπ—π—\sf{Ground\;Truth}sansserif_Ground sansserif_Truth π–±π–Ύπ–Όπ—ˆπ—‡π—Œπ—π—‹π—Žπ–Όπ—π—‚π—ˆπ—‡π—Œπ–±π–Ύπ–Όπ—ˆπ—‡π—Œπ—π—‹π—Žπ–Όπ—π—‚π—ˆπ—‡π—Œ\sf{Reconstructions}sansserif_Reconstructions
Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption
Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption Refer to caption
Figure 14: Distortions. Several sample reconstructions of the same target object with a dosage of 8192. Each reconstruction uses a different sample of 8192 muons from the distribution of muon detections.

Distortions and Blurring.

In Figure 14, many of the objects in the cross-section are visibly distorted, with these distortions shifting as the dosage increases (particularly noticeable in last row and 2nd row). This is due to the fundamentally random nature of this type of tomography and the low dosages of muons. Some scattering points will inevitably fall outside of what is the actual material. The model will then assume that these are part of the actual material, leading to blurring and distortions due to the fundamentally random nature of where these points will be found.

Appendix D Raw Results

Table 2: Model Scaling. The results of the model at different dosages for various sizes. The inference times are evaluated on 2 T4 GPUs with a batch size of 8. The best results are bolded. For PoCA, the inference times are evaluated on a P100 GPU.

Model Dosage Time↓normal-↓\downarrow↓ MSE↓normal-↓\downarrow↓ MAE↓normal-↓\downarrow↓ PSNR↑normal-↑\uparrow↑
ΞΌπœ‡\muitalic_ΞΌ-Net-T 1024 126 ms 0.2276 0.2204 17.1426
ΞΌπœ‡\muitalic_ΞΌ-Net-B 1024 200 ms 0.2318 0.2313 17.0255
ΞΌπœ‡\muitalic_ΞΌ-Net-L 1024 288 ms 0.2295 0.2178 17.0646
PoCA 1024 22.5s 0.4595 0.2447 13.6627
ΞΌπœ‡\muitalic_ΞΌ-Net-T 2048 135 ms 0.1965 0.1989 17.7786
ΞΌπœ‡\muitalic_ΞΌ-Net-B 2048 208 ms 0.1936 0.1911 17.8486
ΞΌπœ‡\muitalic_ΞΌ-Net-L 2048 306 ms 0.1929 0.1918 17.8633
PoCA 2048 43.8s 0.4338 0.2465 13.9112
ΞΌπœ‡\muitalic_ΞΌ-Net-T 4096 141 ms 0.1653 0.1725 18.5347
ΞΌπœ‡\muitalic_ΞΌ-Net-B 4096 218 ms 0.1649 0.1738 18.5504
ΞΌπœ‡\muitalic_ΞΌ-Net-L 4096 301 ms 0.1644 0.1697 18.5388
PoCA 4096 79.8s 0.3950 0.2466 14.3228
ΞΌπœ‡\muitalic_ΞΌ-Net-T 8192 169 ms 0.1388 0.1438 19.2979
ΞΌπœ‡\muitalic_ΞΌ-Net-B 8192 234 ms 0.1350 0.1457 19.3958
ΞΌπœ‡\muitalic_ΞΌ-Net-L 8192 325 ms 0.1348 0.1389 19.4232
PoCA 8192 164s 0.3660 0.2420 15.0769
ΞΌπœ‡\muitalic_ΞΌ-Net-T 16384 246 ms 0.1169 0.1207 20.0433
ΞΌπœ‡\muitalic_ΞΌ-Net-B 16384 293 ms 0.1118 0.1238 20.2685
ΞΌπœ‡\muitalic_ΞΌ-Net-L 16384 384 ms 0.1062 0.1180 20.4322
PoCA 16384 310s 0.3285 0.2315 15.5586
ΞΌπœ‡\muitalic_ΞΌ-Net-T 32768 347 ms 0.0993 0.1156 20.7906
ΞΌπœ‡\muitalic_ΞΌ-Net-B 32768 434 ms 0.0919 0.1040 21.1595
ΞΌπœ‡\muitalic_ΞΌ-Net-L 32768 538 ms 0.0875 0.0983 21.3530
PoCA 32768 612s 0.3092 0.2258 17.1091
Table 3: Detector Resolutions. Results of various methods for resolutions of the detector. ΞΌπœ‡\muitalic_ΞΌ-Net-T*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT indicates that the model was finetuned on the new data for 10 epochs. The best results are bolded.

Model Detector Resolution MSE↓normal-↓\downarrow↓ MAE↓normal-↓\downarrow↓ PSNR↑normal-↑\uparrow↑
ΞΌπœ‡\muitalic_ΞΌ-Net-T 64Γ—64646464\times 6464 Γ— 64 0.2545 0.2647 16.5660
ΞΌπœ‡\muitalic_ΞΌ-Ne-Tt*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT 64Γ—64646464\times 6464 Γ— 64 0.2317 0.2107 17.0075
PoCA 64Γ—64646464\times 6464 Γ— 64 0.5210 0.2690 13.1784
ΞΌπœ‡\muitalic_ΞΌ-Net-T 128Γ—128128128128\times 128128 Γ— 128 0.2259 0.2383 17.1177
ΞΌπœ‡\muitalic_ΞΌ-Net-T*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT 128Γ—128128128128\times 128128 Γ— 128 0.2174 0.2349 17.3046
PoCA 128Γ—128128128128\times 128128 Γ— 128 0.5226 0.2696 13.1625
ΞΌπœ‡\muitalic_ΞΌ-Net-T 256Γ—256256256256\times 256256 Γ— 256 0.2123 0.2221 17.4097
ΞΌπœ‡\muitalic_ΞΌ-Net-T*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT 256Γ—256256256256\times 256256 Γ— 256 0.2068 0.2226 17.5387
PoCA 256Γ—256256256256\times 256256 Γ— 256 0.5210 0.2690 13.1784
ΞΌπœ‡\muitalic_ΞΌ-Net-T 1024Γ—1024102410241024\times 10241024 Γ— 1024 0.1991 0.2057 17.7123
ΞΌπœ‡\muitalic_ΞΌ-Net-T*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT 1024Γ—1024102410241024\times 10241024 Γ— 1024 0.2035 0.2119 17.6165
PoCA 1024Γ—1024102410241024\times 10241024 Γ— 1024 0.5210 0.2690 13.1784
ΞΌπœ‡\muitalic_ΞΌ-Net-T 2048Γ—2048204820482048\times 20482048 Γ— 2048 0.1970 0.2023 17.7616
ΞΌπœ‡\muitalic_ΞΌ-Net-T*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT 2048Γ—2048204820482048\times 20482048 Γ— 2048 0.2013 0.1901 17.6695
PoCA 2048Γ—2048204820482048\times 20482048 Γ— 2048 0.5210 0.2690 13.1784
ΞΌπœ‡\muitalic_ΞΌ-Net-T ∞\infty∞ 0.1936 0.1911 17.8486
PoCA ∞\infty∞ 0.4338 0.2465 13.9112
Table 4: Momentum Error. Results of various models for different levels of error in the momentum estimate. ΞΌπœ‡\muitalic_ΞΌ-Net*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT indicates that the model was finetuned on the new data for 10 epochs. The best results are bolded.

Model Δ⁒𝐩Δ𝐩\Delta\mathbf{p}roman_Ξ” bold_p MSE↓normal-↓\downarrow↓ MAE↓normal-↓\downarrow↓ PSNR↑normal-↑\uparrow↑
ΞΌπœ‡\muitalic_ΞΌ-Net 0% 0.1951 0.1977 17.8110
ΞΌπœ‡\muitalic_ΞΌ-Net*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT 0% 0.1920 0.1938 17.8865
PoCA 0% 0.4224 0.2442 14.0280
ΞΌπœ‡\muitalic_ΞΌ-Net 20% 0.1965 0.1989 17.7786
PoCA 20% 0.4338 0.2465 13.9112
ΞΌπœ‡\muitalic_ΞΌ-Net 40% 0.2004 0.2033 17.6844
ΞΌπœ‡\muitalic_ΞΌ-Net*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT 40% 0.1987 0.1940 17.7258
PoCA 40% 0.4577 0.2515 13.6717
ΞΌπœ‡\muitalic_ΞΌ-Net 60% 0.2230 0.2207 17.2291
ΞΌπœ‡\muitalic_ΞΌ-Net*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT 60% 0.1989 0.1920 17.7326
PoCA 60% 0.5316 0.2616 13.1310
ΞΌπœ‡\muitalic_ΞΌ-Net 80% 0.2761 0.2558 16.2882
ΞΌπœ‡\muitalic_ΞΌ-Net*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT 80% 0.2020 0.2023 17.6497
PoCA 80% 0.6228 0.2730 12.5438
ΞΌπœ‡\muitalic_ΞΌ-Net 100% 0.3293 0.2866 15.6005
ΞΌπœ‡\muitalic_ΞΌ-Net*{}^{*}start_FLOATSUPERSCRIPT * end_FLOATSUPERSCRIPT 100% 0.2000 0.2014 17.7015
PoCA 100% 0.8279 0.2933 11.5817