Revealing The Local Cosmic Web From Galaxies by Deep Learning
3847/1538-4357/abf040
© 2021. The American Astronomical Society. All rights reserved.
Abstract
A total of 80% of the matter in the universe is in the form of dark matter that composes the skeleton of the large-
scale structure called the cosmic web. As the cosmic web dictates the motion of all matter in galaxies and
intergalactic media through gravity, knowing the distribution of dark matter is essential for studying the large-scale
structure. However, the cosmic web’s detailed structure is unknown because it is dominated by dark matter and
warm−hot intergalactic media, both of which are hard to trace. Here we show that we can reconstruct the cosmic
web from the galaxy distribution using the convolutional-neural-network-based deep-learning algorithm. We find
the mapping between the position and velocity of galaxies and the cosmic web using the results of the state-of-the-
art cosmological galaxy simulations of Illustris-TNG. We confirm the mapping by applying it to the EAGLE
simulation. Finally, using the local galaxy sample from Cosmicflows-3, we find the dark matter map in the local
universe. We anticipate that the local dark matter map will illuminate the studies of the nature of dark matter and
the formation and evolution of the Local Group. High-resolution simulations and precise distance measurements to
local galaxies will improve the accuracy of the dark matter map.
Unified Astronomy Thesaurus concepts: Dark matter distribution (356); Cosmology (343); Large-scale structure of
the universe (902); Local Group (929)
The Astrophysical Journal, 913:76 (14pp), 2021 May 20 Hong et al.
following challenges. First, the local galaxy distribution at low Galactic latitudes is hidden behind the intense radiation from the Galactic disk and contaminated by the interstellar gas and dust, which makes it hard to obtain the complete map of the galaxy distribution. Second, even if we had the complete map of galaxies, they are biased tracers of the large-scale structure, that is, the distribution of galaxies does not necessarily reflect the distribution of dark matter.

Previous attempts (Gottloeber et al. 2010; Libeskind et al. 2010; Carrick et al. 2015; Carlesi et al. 2016; Lavaux & Jasche 2016) of making the local dark matter map, therefore, have relied on the cosmological simulations constrained by the smoothed density field at high Galactic latitudes. Typically, a smoothing scale of a few megaparsecs is employed when matching the simulation output to the observation. However, this observational constraint for the fully evolved galaxy distribution is nontrivial to implement because the simulation needs the density distribution at the initial time. Alternatively, the Bayesian Origin Reconstruction from Galaxies (BORG; see, e.g., Jasche & Wandelt 2013; Jasche et al. 2015) approach uses multiple Gaussian processes to draw the probability distribution of the initial density perturbation from a given galaxy distribution. Because it is based on the dark matter density field evolution by second-order Lagrangian perturbation theory (2LPT) and the linear galaxy bias model, the method is again limited to scales larger than a few megaparsecs, where the 2LPT and linear bias models are accurate.

Here we overcome the challenges by taking a novel approach based on deep learning (DL). DL, as well as conventional machine-learning techniques, has been introduced to measure the dark matter distribution from weak gravitational lensing or the spatial distribution of dark matter halos (e.g., Modi et al. 2018; Shirasaki et al. 2019; Jeffrey et al. 2020). In contrast, our DL approach aims to reconstruct the local dark matter map down to a megaparsec scale by incorporating all information in the observed galaxy data: the spatial distribution and the radial peculiar velocity of galaxies. We use the DL algorithm based on the convolutional neural network (CNN) to find the mapping between the local dark matter distribution and the observed positions and the radial peculiar velocities of local galaxies.

The structure of this paper is as follows. In Section 2, we describe the simulation and observational data used for DL training and prediction, respectively. In Section 3, we briefly describe our DL architecture and the evaluation of our DL model. In Section 4, we show the reconstructed local dark matter map and its statistical robustness. We summarize our result in Section 5.

Throughout the paper, we assume a standard ΛCDM cosmology in concordance with the Planck 2018 analysis (Planck Collaboration et al. 2020): $(\Omega_m^0, \Omega_\Lambda^0, h) = (0.31, 0.69, 0.6777)$. It is similar to the standard cosmologies adopted in the Illustris-TNG and EAGLE simulations: $(\Omega_m^0, \Omega_\Lambda^0, h) = (0.3089, 0.6911, 0.6774)$ and $(0.307, 0.693, 0.6777)$, respectively (Springel et al. 2018; Schaye et al. 2015).

2. Data

2.1. Observational Data: Cosmicflows-3

We use the Cosmicflows-3 galaxy catalog (Tully et al. 2016; CF3 hereafter), one of the most comprehensive galaxy catalogs that provide distance, radial peculiar velocity, and luminosity of 17,647 galaxies up to 200 Mpc. To produce a fair galaxy sample over the given region, we make the volume-limited subsample of the CF3 as follows. First, since the number density of the CF3 galaxies close to the Galactic plane (Galactic latitude |b| < 10°) is lower than average, we only use the galaxies at |b| > 10°. Also, we use the B-band absolute magnitude ($M_B$) compiled from the Lyon Extragalactic Database (LEDA; Paturel et al. 2003) as a proxy of the stellar mass ($M_\star$; Wilman & Erwin 2012). We set the B-band magnitude as −15 for the selection criterion, which is sufficient for covering the 20 and 40 Mpc $h^{-1}$ cubic volumes around the Milky Way. We have also tested the cases with $M_B < -16$ and $-17$ and found no noticeable difference of the predictions from the fiducial choice (see Section 4). Note that we have not used the $K_S$-band absolute magnitude, one of the best-known tracers of the stellar mass (Bell et al. 2003), because that information is missing for about 30% of the galaxies in our sample (Lavaux & Hudson 2011; Huchra et al. 2012).

We calculate the radial peculiar velocity by subtracting the Hubble flow from the velocity in the Galactic standard of rest ($V_{\rm GSR}$; Kourkchi et al. 2020). Note that we do not use the velocity in the CMB standard of rest ($V_{\rm CMB}$) to reduce any bias that might be introduced in the conversion. Instead, when generating training and test samples from simulation data, we include the peculiar motion of the galaxy corresponding to the Milky Way in each simulation. There is a difference in the Hubble constant between recent CMB observations ($H_0 = 67.77$ km s$^{-1}$ Mpc$^{-1}$; Planck Collaboration et al. 2020) and the best fit from the CF3 ($H_0 = 75$ km s$^{-1}$ Mpc$^{-1}$; Tully et al. 2016). In this study, we have tested both values and find that the effect from the different Hubble constants stays within the uncertainty of the dark matter map (see Section 4).

2.2. Simulation Data: Illustris-TNG and EAGLE

We use TNG100-1, a simulation with a comoving volume $V = (75\ {\rm Mpc}\ h^{-1})^3$ and $1820^3$ dark matter and gas particles from the Illustris-TNG simulation suite (Marinacci et al. 2018; Naiman et al. 2018; Nelson et al. 2018, 2019; Pillepich et al. 2018; Springel et al. 2018), as our high-resolution simulation data (TNG100 hereafter). To mimic the observation from the Milky Way, we select 988 galaxies with stellar mass $4 \times 10^{10} M_\odot < M_\star < 10^{11} M_\odot$ (center galaxies hereafter), adopting a Galactic stellar mass of about $5.2 \times 10^{10} M_\odot$ (Licquia & Newman 2015). Around each center galaxy, we make a subcube with 20 Mpc $h^{-1}$ box size and calculate the dark matter density field on a $64^3$ uniform grid. We also calculate the relative position of galaxies with $M_B < -15$ (target galaxies hereafter) and the difference of peculiar velocity between the target galaxy and center galaxy.

For the low-resolution dark matter map with $V = (40\ {\rm Mpc}\ h^{-1})^3$, we use the TNG300-1 from the Illustris-TNG simulations, whose volume and number of particles are $V = (205\ {\rm Mpc}\ h^{-1})^3$ and $2500^3$, respectively (TNG300 hereafter). Note that the amplitude of the luminosity function of TNG300 is lower than the observation and TNG100, mainly due to the lower spatial resolution of the simulation (Pillepich et al. 2018). Therefore, we also apply the resolution correction to find the center and target galaxies using the number density obtained from TNG100 rather than directly using the face values of $M_\star$ or $M_B$. We also use TNG300-1-Dark, a dark-matter-only counterpart of TNG300, to test how baryonic physics affects our result. We select the center and target galaxies by finding the mass cut of dark matter halos with the
Figure 1. CNN architecture used for TNG300. We denote the layer size by the quadruple where the spatial dimension $(2^n, 2^n, 2^n)$ follows the number of channels. The size (except the number of filters) of each layer for TNG100 is half that of TNG300.
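The layer sizes summarized in the Figure 1 caption can be tabulated with a short sketch. It assumes only what Section 3.1 states: each encoding step halves the spatial size (stride 2) and doubles the number of channels, from 128 up to 2048. The function name `encoding_stages` and its arguments are illustrative and are not taken from the authors' code.

```python
def encoding_stages(input_size, first_channels=128, last_channels=2048):
    """List (layer name, spatial size, channels) for the encoding stage.

    Assumption: every stage halves the grid size and doubles the channel
    count, starting from `first_channels` after the first convolution.
    """
    stages = []
    size, channels = input_size // 2, first_channels
    while channels <= last_channels:
        stages.append((f"Conv{size}", size, channels))
        size //= 2
        channels *= 2
    return stages


if __name__ == "__main__":
    # 128^3 input grid (TNG300) and 64^3 input grid (TNG100)
    for label, grid in (("TNG300", 128), ("TNG100", 64)):
        print(label, encoding_stages(grid))
```

Under these assumptions the TNG300 encoder runs from Conv64 (128 channels) to Conv4 (2048 channels), and the TNG100 encoder has the same channel counts on grids of half the size.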
same number density. The result from TNG300-1-Dark is similar to or slightly worse than that from TNG300 (see Section 4).

Also, we use RefL0100N1504, a reference simulation with $V = (67.77\ {\rm Mpc}\ h^{-1})^3$ and $1504^3$ dark matter and gas particles from the EAGLE simulation suite (Schaye et al. 2015; Crain et al. 2015, EAGLE hereafter), to check the fidelity of our result. For the center galaxies, we use the same selection criterion as TNG100 and find 478 center galaxies. For the target galaxies, however, we do not directly use $M_B$. This is because the luminosity function of EAGLE is reliable only for bright galaxies ($M_B \lesssim -18$), since the EAGLE simulations calculate the luminosity only for massive galaxies ($M_\star \gtrsim 10^{8.5} M_\odot$; Camps et al. 2018). Instead, similar to TNG300, we use the galaxy number density obtained from TNG100 to find the stellar mass cut of target galaxies.

3. Methods

3.1. Deep-learning Architecture

We construct the DL architecture using CNN that highlights features in the data by a series of convolutions, resulting in so-called hidden layers. By varying the convolution filters, one can extract different physical features in the data. Specifically, we use a CNN architecture similar to the U-Net (Ronneberger et al. 2015) or V-Net (Milletari et al. 2016) to predict the dark matter density field from the galaxy position and radial peculiar velocity (see Figure 1). Our CNN architecture consists of the following two stages: the encoding stage (Input through Conv$N_s$), with an increasing number of filters and a decreasing size of hidden layers, and the decoding stage (UpConv$N_s$ through Output), with a decreasing number of filters and an increasing size of hidden layers. Here, $N_s$ denotes the spatial size of hidden layers. To retain the small-scale spatial resolution, we also attach the hidden layers in the equivalent (with the same layer size) encoding stage as additional channels to the decoding layer, doubling the number of channels. We refer to this process as concatenation.

The encoding stage consists of a series of Conv$N_s$ layers. Let us define the input of a given Conv$N_{s,0}$ as $\mathcal{I}_{\ell;i,j,k}$, where $i, j, k \in [1, N_{s,0}]$ are the spatial coordinates and $\ell \in [1, N_{{\rm ch},0}]$ is the channel index, with $N_{{\rm ch},0}$ being the total number of channels. To accommodate the convolution at the edge, we have added the buffer around the input array (padding process). As we use a 5 × 5 × 5 convolution filter, it suffices to add
$N_p = 2$ padding pixels at both edges of each dimension. We fill the padding pixels by reflecting the inner 2 pixels next to the edge pixels.

After the padding, we apply a three-dimensional convolution with a multichannel filter $w_{\ell,\ell';i',j',k'}$ and bias $b_\ell$, with indices $i', j', k' \in [1, N_k = 5]$, $\ell \in [1, N_{{\rm ch},1}]$, $\ell' \in [1, N_{{\rm ch},0}]$, to obtain the output as

$$\mathcal{C}_{\ell;i,j,k} = b_\ell + \sum_{\ell',i',j',k'} \mathcal{P}_{\ell';s(i';i),s(j';j),s(k';k)}\, w_{\ell,\ell';i',j',k'}, \tag{1}$$

where $\mathcal{P}_{\ell';s(i';i),s(j';j),s(k';k)}$ is the input array after the padding. We sample the convolution sparsely, $s(i';i) = i \times N_{\rm st} + i'$, and reduce the spatial dimension by a factor of $2^3$ at each step by choosing the spatial interval $N_{\rm st} = 2$ (strides hereafter). Accompanying the reduction of spatial dimension, we increase the number of channels $N_{\rm ch}$ by a factor of 2 at each step of the convolution, from 128 (Conv64) to 2048 (Conv4). Note that the convolution filter $w_{\ell,\ell';i',j',k'}$ and bias $b_\ell$ are trainable parameters that we adjust for the training.

The padding and convolution processes are linear operations, so any combinations of these operations simplify to a single linear algebra operation. In order to fully utilize the multiple hidden layers of DL, we apply the rectified linear unit (ReLU; Hahnloser et al. 2000; Glorot et al. 2011),

$$\mathcal{A}_{\ell;i,j,k} = \max(\mathcal{C}_{\ell;i,j,k}, 0), \tag{2}$$

as a nonlinear activation function for each hidden layer. Finally, we apply the batch normalization (Ioffe & Szegedy 2015)

$$\mathcal{O}_{\ell;i,j,k} = \gamma_{\ell;i,j,k}\, \frac{\mathcal{A}_{\ell;i,j,k} - \mu_{\ell;i,j,k}}{\sqrt{\sigma^2_{\ell;i,j,k} + \epsilon}} + \beta_{\ell;i,j,k}, \tag{3}$$

to obtain an output Conv$N_{s,1}$ layer, $\mathcal{O}_{\ell;i,j,k}$ ($i, j, k \in [1, N_{s,1} = N_{s,0}/2]$, $\ell \in [1, N_{{\rm ch},1}]$). Here, $\mu_{\ell;i,j,k}$ and $\sigma_{\ell;i,j,k}$ are the mean and standard deviation of $\mathcal{A}_{\ell;i,j,k}$ over samples in the same mini-batch, and $\epsilon = 10^{-3}$ is a small value for the numerical stability. Note that the mini-batch refers to the bundle of input-output pairs that we have used for updating the trainable parameters. The normalization factor $\gamma_{\ell;i,j,k}$ and bias factor $\beta_{\ell;i,j,k}$ are other trainable parameters. The batch normalization introduces an extra level of nonlinearity, ensuring that the trainable parameters introduced at earlier hidden layers still affect the output.

The decoding stage consists of a series of UpConv$N_s$ layers, which are constructed in a parallel manner. In contrast to the Conv$N_s$, where we decrease the spatial dimension by sparsely sampling the convolved array, we increase the spatial dimension of each UpConv$N_s$ layer,

$$\mathcal{U}_{\ell;i,j,k} = \mathcal{I}_{\ell;u(i),u(j),u(k)}, \tag{4}$$

by duplicating the input array $\mathcal{I}_{\ell;i,j,k}$. Here $u(x) = \lceil x/N_u \rceil$, and we set the upsampling factor $N_u = 2$ in order to increase the spatial size of $\mathcal{I}_{\ell;i,j,k}$ by a factor of 8. After the upsampling, we concatenate the Conv$N_s$ layer (the same size) and apply batch normalization. We then apply a three-dimensional convolution with $(N_k, N_{\rm st}) = (3, 1)$, after the reflective padding of the edge arrays with $N_p = 1$. We decrease the number of output channels of each UpConv$N_s$ from 1024 to 128 by a factor of 2. After the convolution, we apply the ReLU activation function.

In addition to the usual steps described above, the final Output layer requires the following two special treatments so that the output layer represents a single dark matter density field proportional to $\log_{10}(\rho/\rho_0)$, which can be both positive and negative. First, instead of a gradual decrease of the number of output channels by a factor of 2, we set the number of output channels for Output as 1. Second, instead of the ReLU activation function, whose output range is $[0, +\infty)$, we use the hyperbolic tangent function (tanh) so that its output range becomes finite ($[-1, +1]$ in this case).

We have adopted different spatial sizes of the hidden layer for TNG100 and TNG300 to accommodate the difference in their spatial resolution. For TNG100, the encoding stage starts from two channels of $64^3$-grid input layers and ends with the 2048 channels of the $2^3$-grid layer (Conv2), and for TNG300, the encoding stage starts from two channels of $128^3$-grid input layers and ends with the 2048 channels of the $4^3$-grid layer (Conv4). The final output layers are $64^3$ and $128^3$ for TNG100 and TNG300, respectively. We have also tested other CNN architectures with various channel sizes and confirmed that the CNN architecture that we use here (shown in Figure 1) performs the best among the tested cases.

3.2. Training

We divide the training and validation samples from TNG100 so that all subcubes from the validation sample do not overlap with those from the training sample. As a result, we only use 525 subcubes: 432 for training and 93 for validation. For each subcube, we make two $64^3$ uniform grids as a two-channel input layer; each channel stores the number of target galaxies ($N_{\rm gal}$) and the averaged radial peculiar velocity ($V_{\rm pec}$) in units of km s$^{-1}$. For the input layer, we apply the same Galactic latitude mask as the CF3 data (masking out |b| < 10°). For the output layer, we normalize the logarithm of dark matter density to be

$$y = \frac{1}{4.5} \log_{10}(\rho/\rho_0), \tag{5}$$

where $\rho_0$ is the mean dark matter density of the universe, so that all values in the output layer would be between −1 and +1.

For data augmentation, we allow swapping the (x, y, z)-axes of each subcube, which increases the number of samples by a factor of three. We further increase the sample size by flipping the axis direction, with which the number of samples increases eight times. Note that, unlike U-Net or V-Net, we do not split a single cube into multiple smaller cubes for data augmentation because that would change the Galactic latitude mask and the radial peculiar velocity. In the end, we obtain 10,368 and 2232 samples in the training and validation sets, respectively.

We implement our CNN architecture in Keras (Chollet et al. 2015) with the TensorFlow back end (Abadi et al. 2015) and perform the training with an NVIDIA Tesla V100 graphics processing unit (GPU) with 16 GB memory. We choose the mean squared error (MSE) as the loss function that the DL minimizes during the training:

$$\mathcal{L}_{\rm TNG100} = \frac{1}{n} \sum_{i=1}^{n} (y_{i,{\rm pred}} - y_{i,{\rm truth}})^2 \tag{6}$$

$$= \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{1}{4.5} \log_{10}(\rho_{i,{\rm pred}}/\rho_{i,{\rm truth}}) \right]^2, \tag{7}$$
where the subscripts (i, pred) and (i, truth) are, respectively, the prediction and truth values of the y (defined in Equation (5)) at the ith grid.

Initially, we set the trainable parameters in the convolution filters (θ; parameter vector hereafter) randomly. The training process for minimizing the loss function is done with 200 epochs, a unit process that updates the parameter vector from a subset of the train set and applies the updated parameter vector to a subset of the validation set. The parameter vector update process at each epoch consists of 1728 mini-batches. We set the mini-batch size as six, mainly due to the GPU memory limit. For each mini-batch we numerically calculate the gradient of the loss function $\nabla_\theta \mathcal{L}(\theta)$ and update the parameter vector by the Adam optimizer (Kingma & Ba 2014),

$$\theta_t = \theta_{t-1} - \alpha\, \frac{m_t/(1 - \beta_1^t)}{\sqrt{v_t/(1 - \beta_2^t)} + \epsilon} \tag{8}$$

$$m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, \nabla_\theta \mathcal{L}_t(\theta_{t-1}) \tag{9}$$

$$v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, [\nabla_\theta \mathcal{L}_t(\theta_{t-1})]^2. \tag{10}$$

Figure 2. Evolution of loss function ($\mathcal{L}$) as a function of learning rate of the Adam optimizer (α) from an additional test training for TNG300. A too low learning rate ($\alpha \lesssim 10^{-8}$) gives a too slow update of the parameter vector, which is presented as a flat slope of $\mathcal{L}(\alpha)$. On the other hand, a too high learning rate ($\alpha \gtrsim 10^{-5}$) prevents finding a solution, which is presented as a noisy increment of $\mathcal{L}(\alpha)$.
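As a minimal sketch of the Adam update in Equations (8)–(10), the following applies the update to a single scalar parameter. The default values of `alpha`, `beta1`, `beta2`, and `eps` are the defaults suggested by Kingma & Ba, not values reported in this paper, and the quadratic `toy_loss_gradient` is a hypothetical stand-in for the CNN loss gradient. The helper `minibatches_per_epoch` only checks the bookkeeping quoted above: 10,368 training samples in mini-batches of six give 1728 updates per epoch.

```python
import math


def adam_step(theta, grad, m, v, t,
              alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t counts updates from 1."""
    m = beta1 * m + (1.0 - beta1) * grad           # Equation (9)
    v = beta2 * v + (1.0 - beta2) * grad * grad    # Equation (10)
    m_hat = m / (1.0 - beta1 ** t)                 # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)                 # bias-corrected second moment
    theta = theta - alpha * m_hat / (math.sqrt(v_hat) + eps)  # Equation (8)
    return theta, m, v


def minibatches_per_epoch(n_samples, batch_size):
    """Number of full mini-batches needed to pass once over the samples."""
    return n_samples // batch_size


def toy_loss_gradient(theta, target=3.0):
    """Gradient of the hypothetical loss (theta - target)**2."""
    return 2.0 * (theta - target)


# Ten updates on the toy loss, starting from theta = 0 (illustrative only).
theta, m, v = 0.0, 0.0, 0.0
history = []
for t in range(1, 11):
    theta, m, v = adam_step(theta, toy_loss_gradient(theta), m, v, t, alpha=1e-2)
    history.append(theta)
```

Because of the bias correction, the first update moves the parameter by very nearly α regardless of the gradient magnitude, which is one reason the learning rate scan in Figure 2 is informative.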
Figure 4. Three-way projections of a single TNG300 validation sample with 5 Mpc h−1 thickness. From left to right: galaxy number (Ngal), radial peculiar velocity
(Vpec), truth dark matter density (ρtruth), reconstructed dark matter density (ρTNG300), and another reconstruction from the CNN architecture without using the radial
peculiar velocity (noVpec; ρnoVpec). TNG300 can well reconstruct the filamentary structure of a few megaparsec scales in the true dark matter distribution, while
noVpec does not show such structure.
Table 2
Summary of the Performance Test Done by Validation Samples of TNG100, TNG300, and Their Comparison Models
Note. KS(ξpred, ξtruth) is the Kolmogorov–Smirnov statistics of the two-point correlation functions of dark matter distribution between truth and prediction. EAGLE-
TNG100 is the application of the TNG100 model to the EAGLE samples. diffH0 is identical to TNG300 since Hubble flow estimation is not considered in this test.
distribution, the only available input of the given DL model, with a few megaparsec scale. As a result, the 2pCFs of noVpec show a significant deviation from their truth in small scales with $r \lesssim 3$ Mpc $h^{-1}$ (see Table 2). From the comparison to TNG300 and its other comparison models, it is apparent that the (radial) peculiar velocity plays a significant role in reconstructing the small-scale filamentary structure.

4.2. Three-dimensional View of the Local Cosmic Web

Figure 6 shows a sliced view of the reconstructed cosmic web integrated over 4 Mpc $h^{-1}$ thickness. Each panel shows the cosmic web on the plane of the Supergalactic Cartesian coordinates (SGX, SGY, and SGZ), extended to the full cube with the side length of 40 Mpc $h^{-1}$. Figure 6 clearly shows known local objects that we designated by their common name. The figure also recovers known local large-scale structures. For example, we find a 10 Mpc $h^{-1}$ spread along the +SGY-direction in the SGZ−SGY (top left panel) and SGY−SGX (bottom right panel) planes. This structure is known as the Local Sheet, which connects the Local Group and Virgo Cluster and contains M81, NGC 5194, Canes II, and Coma I groups (Tully et al. 2008; Courtois et al. 2013). We also find that, around the Local Group, the Local Sheet is connected to the Fornax Wall (Fairall et al. 1994), which is a 20 Mpc $h^{-1}$ sized spread along the (−SGY, −SGZ) direction, containing the Fornax Cluster, Eridanus Cluster, and Dorado
Figure 5. Result of the performance tests for the DL result using the three-dimensional dark matter density field of simulations. Top panel: statistical comparison
between the ground truth and the predicted dark matter density from the entire TNG300 validation sample. From left to right: joint probability distribution (colors)
with 1σ, 2σ, 3σ certainty level contours (lines), median (lines) and 1σ deviation (shades) of histograms, and median (lines) and 1σ deviation (shades) of the two-point
correlation functions. Bottom panel: similar to the top panel, but by applying the TNG100 training to the entire EAGLE test sample.
Group as members (top left panel). In the opposite direction to the Fornax Wall on the SGZ−SGY plane, the Local Void (Tully & Fisher 1987) is also apparent (also shown on the SGZ−SGX plane), which might extend beyond the boundary of our local universe sample. In Figure 6, we also present the velocity flow lines derived from the reconstructed gravitational potential gradient with arrows and black lines. The velocity flow shows the motion of material from the Local Void to nearby filamentary structures and clusters such as the Local Sheet, Fornax Wall, and Virgo Cluster. Note that we cannot reproduce the velocity flow from the Virgo Cluster to the Great Attractor (+SGX-direction), because of the limited extension of the volume that we analyze here. However, we would like to emphasize that the recovered dark matter map provides us detailed density and velocity fields around these known local large-scale structures.

The recovered cosmic web also shows a hint of new structures that require further investigation. For example, the direction of the Local Sheet is similar to the direction of the so-called vast polar structure (VPOS), which consists of satellite galaxies, globular clusters, and stellar streams around the Milky Way (Pawlowski et al. 2012). As shown in Figure 6, the Local Sheet, being the strongest filamentary structure around the Local Group, is a source of velocity flow; that might cause a connection between the two. Also, a couple of small filaments are visible in our maps, which could be good targets for systematic examination with deep imaging surveys.

Furthermore, to estimate the uncertainties of the dark matter map, we perform a stress test on our CNN models by incorporating distance measurement uncertainties in the CF3. We use the one standard deviation uncertainty in distance modulus ($\epsilon_\mu$) in the CF3,

$$\epsilon_\mu \equiv \frac{1}{\sqrt{\sum_i 1/\epsilon_i^2}}. \tag{15}$$

Here $\epsilon_i$ includes the one standard deviation uncertainty determined from a recalibration of galaxy magnitude with H I line width (Tully & Courtois 2012), distance measurement of the tip of the red giant branch from the Hubble Space Telescope, Type Ia supernovae from various samples (Tully et al. 2013), Tully–Fisher relation using Spitzer [3.6] photometry, and the fundamental plane relation from the Six Degree Field Galaxy Survey (6dFGS; Tully et al. 2016). We then generate 1000 sets of random distance moduli that follow the normal distribution,

$$P(\Delta\mu) = \frac{1}{\epsilon_\mu \sqrt{2\pi}} \exp\left[ -\frac{\Delta\mu^2}{2\epsilon_\mu^2} \right]. \tag{16}$$

Then, we recalculate the radial peculiar velocity by subtracting the Hubble flow corresponding to the random distances from the $V_{\rm GSR}$. Since the distance measurement error exists only along the radial direction, we have generated the two-dimensional column
Figure 6. Three-dimensional density maps of the local dark matter with 40 Mpc h−1 box size and 4 Mpc h−1 thickness. Cross at the center: Milky Way. Dots: galaxies with
MB < − 15. Texts: galaxy groups, clusters, and local structures. Arrows: estimated directions of motion derived from the gradient of the reconstructed gravitational potential.
density map of the dark matter that is less affected by the error than the three-dimensional dark matter density field (see Figure 9). Also, we find that the dark matter column density map derived from TNG300 shows significantly less deviation than that of TNG100, which suffers from some spurious structure consistently appearing near the Galactic plane.

4.3. Sky Map of the Local Cosmic Web

The left panels of Figure 7 (labeled as TNG300) show the recovered local dark matter map on the sky (gray map),

$$\Sigma(\theta) \equiv \int dr\, \rho(\theta, r), \tag{17}$$

where θ, r, ρ(θ, r) are the two-dimensional sky coordinates, distance from the observer, and the dark matter density at the given (θ, r), respectively. We use the HEALPix (Górski et al. 2005; Zonca et al. 2019) package to reconstruct the two-dimensional sky map from the three-dimensional data cube. We set the resolution parameter $N_{\rm side} = 128$, which roughly corresponds to the angular resolution of 27′. This figure also shows the locations and radial peculiar velocities of galaxies that we use for the reconstruction (color-coded dots), as well as the locations of some well-known galaxy groups and clusters (large dots).

The map in Figure 7 uses the radial distance and radial peculiar velocities reported in the CF3 catalog (Tully et al. 2016). We have
Figure 7. Two-dimensional full-sky map of the local dark matter column density with 4 Mpc h−1 widths. Left panels: predictions from TNG300 training, from the
nearest to the farthest radial bin. Right panels: comparison predictions from TNG100 training (TNG100), training with dark matter halos from the dark-matter-only
simulation (DMhalo), and training without using the radial peculiar velocity (noVpec). Small dots: positions and peculiar velocity (color) of known local galaxies.
Large dots: galaxy groups and clusters with their names.
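The resolution quoted in Section 4.3 for the sky maps can be checked with a few lines of arithmetic. A HEALPix map with resolution parameter Nside has 12 Nside² equal-area pixels, so the square root of the pixel solid angle gives a characteristic angular size. The sketch below also discretizes the line-of-sight integral of Equation (17) as a simple Riemann sum. The function names are illustrative, and the uniform radial step is an assumption made for this example rather than the authors' projection scheme.

```python
import math


def healpix_npix(nside):
    """Number of equal-area pixels on the sphere for a given Nside."""
    return 12 * nside * nside


def healpix_resolution_arcmin(nside):
    """Square root of the pixel solid angle, in arcminutes."""
    pixel_area_sr = 4.0 * math.pi / healpix_npix(nside)
    return math.degrees(math.sqrt(pixel_area_sr)) * 60.0


def column_density(rho_along_ray, dr):
    """Riemann-sum approximation of Sigma = integral of rho dr.

    `rho_along_ray` holds densities sampled at uniform radial steps `dr`
    along one line of sight (an assumption of this sketch).
    """
    return sum(rho_along_ray) * dr


if __name__ == "__main__":
    print(f"Nside=128: {healpix_resolution_arcmin(128):.1f} arcmin per pixel")
```

For Nside = 128 this gives roughly 27.5′, in line with the 27′ quoted in the text.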
Figure 9. Same as Figure 7, but showing statistical maps. Left panels: mean of the logarithm of dark matter column density estimated from 1000 random realizations
incorporating the uncertainties in distance estimate to the local galaxies. Middle panels: standard deviation from 1000 random realizations (Nside = 4, 8, 8, 16, 16
from top to bottom). Right panels: systematic bias from different simulation input for the DL (TNG300 vs. DMhalo).
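The random realizations behind these statistical maps follow the procedure of Equations (15) and (16): combine the individual distance-modulus errors, draw a Gaussian perturbation Δμ, convert the perturbed modulus to a distance, and subtract the corresponding Hubble flow from the Galactic-standard-of-rest velocity. The sketch below illustrates this for a single galaxy. It assumes the standard definition μ = 5 log₁₀(d / 10 pc) and the CF3 best-fit H₀ = 75 km s⁻¹ Mpc⁻¹. All function names and the example inputs are hypothetical, and the sketch stops before the CNN prediction step.

```python
import math
import random


def combined_modulus_error(errors):
    """Equation (15): inverse-variance combination of modulus errors."""
    return 1.0 / math.sqrt(sum(1.0 / (e * e) for e in errors))


def distance_from_modulus(mu):
    """Distance in Mpc from mu = 5 log10(d / 10 pc) = 5 log10(d_Mpc) + 25."""
    return 10.0 ** ((mu - 25.0) / 5.0)


def peculiar_velocity(v_gsr, distance_mpc, h0=75.0):
    """Radial peculiar velocity (km/s): V_GSR minus the Hubble flow H0 * d."""
    return v_gsr - h0 * distance_mpc


def perturbed_velocities(v_gsr, mu, eps_mu, h0=75.0, n=1000, seed=0):
    """Draw n Gaussian modulus perturbations (Equation (16)) for one galaxy
    and return the recalculated radial peculiar velocities."""
    rng = random.Random(seed)
    velocities = []
    for _ in range(n):
        mu_random = mu + rng.gauss(0.0, eps_mu)
        velocities.append(
            peculiar_velocity(v_gsr, distance_from_modulus(mu_random), h0))
    return velocities


if __name__ == "__main__":
    # Hypothetical galaxy: mu = 30 (10 Mpc), V_GSR = 1000 km/s, two error terms.
    eps = combined_modulus_error([0.2, 0.2])
    sample = perturbed_velocities(1000.0, 30.0, eps)
    print(len(sample), min(sample), max(sample))
```

Because the perturbation enters through the distance, each draw shifts both the galaxy position along the line of sight and its inferred peculiar velocity.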
Another interesting feature in the map is the dark matter distribution at lower Galactic latitudes (|b| < 10°), where we do not have any input galaxy data. To our surprise, we find that the averaged signal-to-noise ratios per pixel for this region are 4.18, 4.73, 5.31, 5.80, and 6.21, respectively, from the nearest to the farthest radial bin. However, we anticipate that the theoretical uncertainties for the DL mapping would be most substantial for this region. For example, from the aforementioned studies on systematic uncertainties, we find that, on average, a lower Galactic latitude (|b| < 10°) map suffers about
Table 3
On-sky Average (Median and 1σ Certainty Level in Parentheses) of the Systematics $\Delta_{\rm sys} \equiv |\log_{10}\Sigma - \log_{10}\Sigma_{\rm TNG300,face}| / \Delta\log_{10}\Sigma_{\rm TNG300}$ over High Galactic Latitude |b| > 10° with Different Radial Bins

Comparison Model | 0.7–4 Mpc $h^{-1}$ | 4–8 Mpc $h^{-1}$ | 8–12 Mpc $h^{-1}$ | 12–16 Mpc $h^{-1}$ | 16–20 Mpc $h^{-1}$
TNG100 | 2.281 ($1.837^{+1.993}_{-1.104}$) | 1.474 ($1.196^{+1.414}_{-0.842}$) | ... | ... | ...
diffH0 | 0.212 ($0.171^{+0.223}_{-0.115}$) | 0.162 ($0.133^{+0.148}_{-0.092}$) | 0.154 ($0.116^{+0.161}_{-0.083}$) | 0.152 ($0.117^{+0.153}_{-0.082}$) | 0.160 ($0.128^{+0.151}_{-0.092}$)
16mag | 1.032 ($0.949^{+0.748}_{-0.647}$) | 1.093 ($0.868^{+1.089}_{-0.611}$) | 0.862 ($0.716^{+0.729}_{-0.508}$) | 0.785 ($0.641^{+0.751}_{-0.455}$) | 0.804 ($0.631^{+0.790}_{-0.443}$)
17mag | 1.178 ($0.901^{+1.081}_{-0.572}$) | 1.105 ($0.889^{+1.026}_{-0.621}$) | 1.001 ($0.815^{+0.947}_{-0.575}$) | 0.887 ($0.726^{+0.862}_{-0.502}$) | 0.898 ($0.734^{+0.833}_{-0.506}$)
noVpec | 1.935 ($1.715^{+1.919}_{-1.359}$) | 1.105 ($0.834^{+1.120}_{-0.631}$) | 0.943 ($0.701^{+0.890}_{-0.524}$) | 0.828 ($0.672^{+0.751}_{-0.470}$) | 0.750 ($0.626^{+0.742}_{-0.440}$)
stellarMass | 1.544 ($1.256^{+1.435}_{-0.843}$) | 1.175 ($0.946^{+1.156}_{-0.684}$) | 0.925 ($0.734^{+0.909}_{-0.521}$) | 0.877 ($0.692^{+0.837}_{-0.485}$) | 0.907 ($0.713^{+0.899}_{-0.490}$)
DMhalo | 1.737 ($1.154^{+2.253}_{-0.863}$) | 1.445 ($1.127^{+1.414}_{-0.816}$) | 1.176 ($0.913^{+1.097}_{-0.610}$) | 1.057 ($0.846^{+1.029}_{-0.595}$) | 0.957 ($0.796^{+0.889}_{-0.574}$)

Note. See Table 1 for the definition of each comparison model except TNG100.
δΔsys ≃ 0.5 more systematic shifts than a higher Galactic latitude (|b| > 10°) map. This is indicated in the top two panels of Figure 7 and the systematic shifts shown in the right panels of Figure 9.

5. Discussion

In this paper, we present a novel CNN-based DL method of reconstructing the local dark matter distribution map and discover the local cosmic web structure traced by the positions and radial peculiar velocities of Cosmicflows-3 galaxies. We find that including the radial peculiar velocity field is the key to recovering the dark matter distribution in the cosmic web. Incorporating the observational uncertainties in the galaxy distance measurements, the average detection significance of the dark matter map exceeds 4.1σ for each HEALPix pixel at higher Galactic latitudes (|b| > 10°). The quoted statistical significance, however, does not include the uncertainties in the galaxy-to-dark-matter mapping itself. We have verified that the DL results stay robust for three different simulations, TNG100-1 and TNG300-1 from the Illustris-TNG simulation and RefL0100N1504 from the EAGLE simulation, but future studies must quantify the theoretical uncertainties by applying the same method to large-scale structure simulations with different baryonic prescriptions. The comparison of the DL results between TNG300-1 and N-body simulations, however, indicates that the filamentary cosmic web structure may not suffer from these systematic effects.

The main statistical uncertainty in the galaxy data comes from the uncertainty in the distance measurement. As the observed shift in the galaxy spectra constrains the sum of the distance (Hubble flow) and the radial peculiar velocity, the uncertainty affects both the galaxy distribution and the radial peculiar velocity field. Therefore, to obtain a dark matter map with higher significance, it is necessary to explore ways to reduce the uncertainties of the current distance estimators such as the tip of the red giant branch, Type Ia supernovae, and the fundamental plane through continuous cross-calibration (Tully et al. 2016) and to increase the number of galaxies with measured distances through systematic surveys (e.g., 6dFGS, Springob et al. 2014; James Webb Space Telescope, Gardner et al. 2006).

We anticipate that the reconstructed three-dimensional dark matter map and peculiar velocity field will open an entirely new chapter of cosmological study. For example, the dark matter map can make it possible to run cosmological galaxy simulations with the precise initial conditions of the Local Group for studying the past and future of our cosmic neighborhood. It will also allow the in-depth study of the nature of dark matter by cross-correlating the reconstructed dark matter map with the full-sky diffuse emission maps constructed from the radio-to-gamma-ray electromagnetic spectra, as well as the full-sky map of gravitational wave binaries. The latter can test the models where black holes in binaries have formed out of dark matter (Shandera et al. 2018).

Finally, as we have introduced a novel CNN-based DL method to reconstruct the local cosmic web, a quantitative study comparing the prediction power of the DL method presented here with preexisting methods such as BORG may be in order. Note, however, that many previous studies reconstruct the dark matter distribution on scales much larger than the size of our local cosmic web (e.g., 3–5 Mpc h⁻¹ in Jasche & Wandelt 2013; Jasche et al. 2015), which complicates the direct comparison between the two methods. Nevertheless, an additional study that applies the existing methods to observational and simulation data similar to ours and compares the results with our DL method would be beneficial, and we leave it for the future.

The authors acknowledge Christophe Pichon, Changbom Park, Sungryong Hong, Inkyu Park, Dongsu Bak, Graziano Rossi, and Yung-Kyun Noh for discussion. The authors also acknowledge an anonymous referee for suggestions to improve this article. The list of nearby galaxy groups and clusters is derived from www.atlasoftheuniverse.com. The authors acknowledge the Korea Institute for Advanced Study for providing computing resources (KIAS Center for Advanced Computation Linux Cluster System). Computational data were transferred through a high-speed network provided by the Korea Research Environment Open NETwork (KREONET). S.E.H. was partly supported by Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Education (2018R1A6A1A06024977). S.E.H. was also partly supported by the project 우주거대구조를 이용한 암흑우주연구 (“Understanding Dark Universe Using Large Scale Structure of the Universe”), funded by the Ministry of Science. D.J. was supported at Pennsylvania State University by NSF grant (AST-1517363) and NASA ATP program (80NSSC18K1103). J.K. was supported by a KIAS Individual Grant (KG039603) via the Center for Advanced Computation at Korea Institute for Advanced Study.

Software: HEALPix (Górski et al. 2005), Healpy (Zonca et al. 2019), astropy (Astropy Collaboration et al. 2013, 2018), NumPy (van der Walt et al. 2011; Harris et al. 2020), SciPy (Jones et al. 2001; Virtanen et al. 2020), matplotlib (Hunter 2007), pandas (McKinney 2010), Keras (Chollet et al. 2015), TensorFlow back end (Abadi et al. 2015).
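As a rough illustration of the point made in the Discussion that the distance uncertainty propagates into the radial peculiar velocity, the sketch below uses the low-redshift approximation cz ≈ H0 d + v_r, so that v_r = cz − H0 d and, if the redshift error is neglected, σ_v ≈ H0 σ_d. The function names, the example galaxy, and the 20% fractional distance error are hypothetical values chosen for illustration and are not taken from the Cosmicflows-3 catalog or from the pipeline used in this paper.

```python
# Minimal sketch (assumptions: low-redshift limit, negligible redshift error).
# Distances are in Mpc/h, so the Hubble constant is 100 km/s per (Mpc/h).
H0 = 100.0  # km/s/(Mpc/h)


def radial_peculiar_velocity(cz, distance, h0=H0):
    """Radial peculiar velocity v_r = cz - H0 * d, in km/s."""
    return cz - h0 * distance


def peculiar_velocity_error(distance, frac_distance_error, h0=H0):
    """Velocity error induced by a fractional distance error: H0 * sigma_d."""
    return h0 * distance * frac_distance_error


# Hypothetical galaxy: observed cz = 3000 km/s at an estimated 28 Mpc/h.
cz_obs = 3000.0
d_est = 28.0

v_r = radial_peculiar_velocity(cz_obs, d_est)
sigma_v = peculiar_velocity_error(d_est, frac_distance_error=0.2)

print(f"v_r = {v_r:.0f} km/s, sigma_v = {sigma_v:.0f} km/s")
```

In this hypothetical case the velocity error induced by the distance error is larger than the inferred peculiar velocity itself, which is why a single distance error shifts both the inferred galaxy position and its radial peculiar velocity.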
ORCID iDs

Sungwook E. Hong (홍성욱) https://orcid.org/0000-0003-4923-8485
Donghui Jeong https://orcid.org/0000-0002-8434-979X
Ho Seong Hwang https://orcid.org/0000-0003-3428-7612
Juhan Kim https://orcid.org/0000-0002-4391-2275

References

Aaronson, M. 1983, ApJL, 266, L11
Aartsen, M. G., Ackermann, M., Adams, J., et al. 2018, EPJC, 78, 831
Abadi, M., Agarwal, A., Barham, P., et al. 2015, TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems, https://www.tensorflow.org/
Abbott, T. M. C., Abdalla, F. B., Alarcon, A., et al. 2018, PhRvD, 98, 043526
Ackermann, M., Albert, A., Anderson, B., et al. 2015, PhRvL, 115, 231301
Akerib, D. S., Alsum, S., Araújo, H. M., et al. 2017, PhRvL, 118, 021303
Ammazzalorso, S., Gruen, D., Regis, M., et al. 2020, PhRvL, 124, 101102
Anderson, L., Aubourg, É., Bailey, S., et al. 2014, MNRAS, 441, 24
Arcadi, G., Dutra, M., Ghosh, P., et al. 2018, EPJC, 78, 203
Astropy Collaboration, Robitaille, T. P., Tollerud, E. J., et al. 2013, A&A, 558, A33
Astropy Collaboration, Price-Whelan, A. M., Sipőcz, B. M., et al. 2018, AJ, 156, 123
ATLAS Collaboration, Aaboud, M., Aad, G., et al. 2019, JHEP, 2019, 142
Bell, E. F., McIntosh, D. H., Katz, N., & Weinberg, M. D. 2003, ApJS, 149, 289
Camps, P., Trčka, A., Trayford, J., et al. 2018, ApJS, 234, 20
Carlesi, E., Sorce, J. G., Hoffman, Y., et al. 2016, MNRAS, 458, 900
Carrick, J., Turnbull, S. J., Lavaux, G., & Hudson, M. J. 2015, MNRAS, 450, 317
Chollet, F., et al. 2015, Keras, https://keras.io
Clowe, D., Bradač, M., Gonzalez, A. H., et al. 2006, ApJL, 648, L109
Cooke, R. J., Pettini, M., Jorgenson, R. A., Murphy, M. T., & Steidel, C. C. 2014, ApJ, 781, 31
Courtois, H. M., Pomarède, D., Tully, R. B., Hoffman, Y., & Courtois, D. 2013, AJ, 146, 69
Crain, R. A., Schaye, J., Bower, R. G., et al. 2015, MNRAS, 450, 1937
Davis, M., Efstathiou, G., Frenk, C. S., & White, S. D. M. 1985, ApJ, 292, 371
Desjacques, V., Jeong, D., & Schmidt, F. 2018, PhR, 733, 1
Fairall, A. P., Paverd, W. R., & Ashley, R. P. 1994, in ASP Conf. Ser. 67, Unveiling Large-Scale Structures Behind the Milky Way, ed. C. Balkowski & R. C. Kraan-Korteweg (San Francisco, CA: ASP), 21
Fang, K., Banerjee, A., Charles, E., & Omori, Y. 2020, ApJ, 894, 112
Fornasa, M., Cuoco, A., Zavala, J., et al. 2016, PhRvD, 94, 123005
Gardner, J. P., Mather, J. C., Clampin, M., et al. 2006, SSRv, 123, 485
Giesen, G., Boudaud, M., Génolini, Y., et al. 2015, JCAP, 2015, 023
Glorot, X., Bordes, A., & Bengio, Y. 2011, in Proc. Machine Learning Research 15, Fourteenth Int. Conf. on Artificial Intelligence and Statistics, ed. G. Gordon et al. (Fort Lauderdale, FL: JMLR), 315
Górski, K. M., Hivon, E., Banday, A. J., et al. 2005, ApJ, 622, 759
Gottloeber, S., Hoffman, Y., & Yepes, G. 2010, arXiv:1005.2687
Hahnloser, R. H. R., Sarpeshkar, R., Mahowald, M. A., Douglas, R. J., & Seung, H. S. 2000, Natur, 405, 947
Harris, C. R., Millman, K. J., van der Walt, S. J., et al. 2020, Natur, 585, 357
Huchra, J. P., Macri, L. M., Masters, K. L., et al. 2012, ApJS, 199, 26
Hunter, J. D. 2007, CSE, 9, 90
Ioffe, S., & Szegedy, C. 2015, arXiv:1502.03167
Jasche, J., Leclercq, F., & Wandelt, B. D. 2015, JCAP, 2015, 036
Jasche, J., & Wandelt, B. D. 2013, MNRAS, 432, 894
Jeffrey, N., Lanusse, F., Lahav, O., & Starck, J.-L. 2020, MNRAS, 492, 5023
Jones, E., Oliphant, T., Peterson, P., et al. 2001, SciPy: Open Source Scientific Tools for Python, http://www.scipy.org
Kingma, D. P., & Ba, J. 2014, arXiv:1412.6980
Kourkchi, E., Courtois, H. M., Graziani, R., et al. 2020, AJ, 159, 67
Larson, D., Dunkley, J., Hinshaw, G., et al. 2011, ApJS, 192, 16
Lavaux, G., & Hudson, M. J. 2011, MNRAS, 416, 2840
Lavaux, G., & Jasche, J. 2016, MNRAS, 455, 3169
Libeskind, N. I., Yepes, G., Knebe, A., et al. 2010, MNRAS, 401, 1889
Licquia, T. C., & Newman, J. A. 2015, ApJ, 806, 96
Marinacci, F., Vogelsberger, M., Pakmor, R., et al. 2018, MNRAS, 480, 5113
Milletari, F., Navab, N., & Ahmadi, S.-A. 2016, arXiv:1606.04797
McKinney, W. 2010, in Proc. 9th Python in Science Conf., ed. S. van der Walt & J. Millman (Austin, TX: SciPy), 56
Modi, C., Feng, Y., & Seljak, U. 2018, JCAP, 2018, 028
Naiman, J. P., Pillepich, A., Springel, V., et al. 2018, MNRAS, 477, 1206
Nelson, D., Pillepich, A., Springel, V., et al. 2018, MNRAS, 475, 624
Nelson, D., Springel, V., Pillepich, A., et al. 2019, ComAC, 6, 2
Paturel, G., Petit, C., Prugniel, P., et al. 2003, A&A, 412, 45
Pawlowski, M. S., Pflamm-Altenburg, J., & Kroupa, P. 2012, MNRAS, 423, 1109
Pillepich, A., Nelson, D., Hernquist, L., et al. 2018, MNRAS, 475, 648
Planck Collaboration, Aghanim, N., Akrami, Y., et al. 2020, A&A, 641, A6
Ronneberger, O., Fischer, P., & Brox, T. 2015, arXiv:1505.04597
Rubin, V. C., Ford, W., & Kent, J. 1970, ApJ, 159, 379
Schaye, J., Crain, R. A., Bower, R. G., et al. 2015, MNRAS, 446, 521
Shandera, S., Jeong, D., & Grasshorn Gebhardt, H. S. 2018, PhRvL, 120, 241102
Shirasaki, M., Yoshida, N., & Ikeda, S. 2019, PhRvD, 100, 043527
Smith, L. N. 2015, arXiv:1506.01186
Springel, V., Pakmor, R., Pillepich, A., et al. 2018, MNRAS, 475, 676
Springob, C. M., Magoulas, C., Colless, M., et al. 2014, MNRAS, 445, 2677
Tröster, T., Camera, S., Fornasa, M., et al. 2017, MNRAS, 467, 2706
Tully, R. B., & Courtois, H. M. 2012, ApJ, 749, 78
Tully, R. B., Courtois, H. M., Dolphin, A. E., et al. 2013, AJ, 146, 86
Tully, R. B., Courtois, H. M., & Sorce, J. G. 2016, AJ, 152, 50
Tully, R. B., & Fisher, J. R. 1987, Atlas of Nearby Galaxies (Cambridge: Cambridge Univ. Press)
Tully, R. B., Shaya, E. J., Karachentsev, I. D., et al. 2008, ApJ, 676, 184
van der Walt, S., Colbert, S. C., & Varoquaux, G. 2011, CSE, 13, 22
Vannerom, D. 2019, in Proc. of Science 352, XXVII Int. Workshop on Deep-Inelastic Scattering and Related Subjects (DIS2019) (Trieste: Sissa Medialab), 111
Virtanen, P., Gommers, R., Oliphant, T. E., et al. 2020, NatMe, 17, 261
Wilman, D. J., & Erwin, P. 2012, ApJ, 746, 160
Zonca, A., Singer, L., Lenz, D., et al. 2019, JOSS, 4, 1298
Zwicky, F. 1933, AcHPh, 6, 110