

The following article is Open access

The Multicomponent Fitting to the Star Formation Histories in the TNG Simulation


Published 2024 June 18 © 2024. The Author(s). Published by the American Astronomical Society.
Citation Yang Wang et al 2024 ApJ 968 115 DOI 10.3847/1538-4357/ad402f


0004-637X/968/2/115

Abstract

The star formation history (SFH) is a key issue in the evolution of galaxies. In this work, we developed a model based on a Gaussian and gamma function mixture to fit SFHs with varying numbers of components. Our primary objective was to use this model to reveal the shape of SFHs and the corresponding physical driving factors. Specifically, we applied this model to fit SFHs from the TNG100-1 simulation. Our study led to the following findings. (1) Our model fits the TNG star formation histories well, especially for high-mass and red galaxies. (2) A clear relationship exists between the number and shape of fitted components and the mass and color of galaxies, with notable differences observed between central/isolated and satellite galaxies. (3) Our model allows us to extract different episodes of star formation within star formation histories with ease and to analyze the duration and timing of each star formation episode. Our findings indicate a strong relationship between the timing of each star formation episode and galaxy mass and color.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Star formation is one of the most important processes driving the evolution of galaxies. Well-measured star formation history (SFH) can help us understand many aspects in galaxy evolution, such as the underlying physics of boosting and quenching of star formation, and the timing and timescale of these mechanisms (Gallazzi et al. 2005; Thomas et al. 2005; Panter et al. 2007; Choi et al. 2014; Conroy et al. 2014; Pacifici et al. 2016; Carnall et al. 2018; Schreiber et al. 2018). On the other hand, well-modeled SFH is important when inferring galaxy properties, such as stellar mass, star formation rate (SFR), metallicity, and dust contents, from the observed spectra or spectral energy distribution (SED; Conroy 2013; Kauffmann 2014; Janowiecki et al. 2017; Leja et al. 2017; Telles & Melnick 2018; Zhou et al. 2020b).

Currently, two commonly used methods exist for modeling the SFH: parametric and nonparametric. Parametric models employ simple analytical formulas with a small number of parameters to describe the SFH. Despite the inherent complexity of SFHs in real galaxies, these simple functions demonstrate their ability to derive the SFHs and constrain galaxy properties (Carnall et al. 2019). Researchers explore various forms of parameterized formulas to enhance predictive accuracy, including exponential decline (τ model; Wuyts et al. 2011; Reddy et al. 2012), delayed exponential decline (Gavazzi et al. 2002; Behroozi et al. 2010; Lee et al. 2010), rising form (Buat et al. 2008; Maraston et al. 2010; Papovich et al. 2011), log-normal (Gladders et al. 2013; Abramson et al. 2015; Diemer et al. 2017), double power laws (Pacifici et al. 2016; Carnall et al. 2018), modified exponentially declining (Simha et al. 2014; Ciesla et al. 2016), gamma function (Lu et al. 2015, 2016; Zhou et al. 2020a), Gaussian function (Bellstedt et al. 2020), and Bessel function (Iyer & Gawiser 2017). Parametric methods offer the advantages of computational efficiency, conceptual simplicity, and strong physical motivation.

Parametric models, however, cannot capture all SFHs and may introduce bias if inappropriate functions are utilized (Simha et al. 2014; Carnall et al. 2019). Consequently, researchers have introduced nonparametric models to accommodate the increased complexity and variability of SFH shapes. The simplest nonparametric model employs a piecewise constant function to fit SFR(t) (Ocvirk et al. 2006; Kelson 2014; Leja et al. 2017; Chauke et al. 2018). The piecewise function method can be enhanced by incorporating adaptive time binning (Tojeiro et al. 2007). Other methods include direct fitting of SFH patterns from theoretical models (Finlator et al. 2007; Pacifici et al. 2012), polynomial expansion of the SFH (Jiménez-López et al. 2022), and Gaussian process regression (Iyer et al. 2019). Although nonparametric methods offer improved flexibility and accuracy, they are computationally demanding and susceptible to various degeneracies because of the large number of free variables involved (Walcher et al. 2011; Conroy 2013; Lower et al. 2020).

Regardless of the methods employed, the shape of SFHs is always a significant consideration. There has not been universal agreement on the best form(s) of SFH models. Furthermore, Carnall et al. (2019) and Leja et al. (2019) have shown that the form of prior significantly impacts the posterior distributions of SFHs. Thus, selecting an appropriate prior, which is based on our comprehension of SFH shapes, is crucial to accurately reconstructing SFHs.

With the emergence of hydrodynamical cosmological simulations, e.g., Illustris (Vogelsberger et al. 2014), IllustrisTNG (Pillepich et al. 2018a; Nelson et al. 2019), EAGLE (Crain et al. 2015; Schaye et al. 2015), and FIRE (Wetzel et al. 2016; Hopkins et al. 2018), it is possible to get numerous realistic and physics-motivated SFHs. These simulated SFHs provide references and training sets for studies aimed at recovering SFHs (e.g., Iyer & Gawiser 2017; Iyer et al. 2019). Analyzing simulated SFHs will assist in selecting appropriate priors for observational tasks. Moreover, analyzing the SFHs is necessary to infer the underlying physics of galaxy formation. The simulations encompass numerous processes that play significant roles in galaxy SFHs. These processes include environment-dependent accretion, stochastic variations, minor and major mergers, gas-to-star conversion dictated by physical conditions in the interstellar medium, ejection of gas through galactic winds, and subsequent recycling of ejected material through accretion. Explicit functions can provide insights into the roles and quantities of these physical processes in galaxy formation. For these reasons, the parameterization of SFHs derived from simulations is a topic worthy of investigation (e.g., Simha et al. 2014).

In this study, our approach involves fitting the SFHs from TNG100-1 using a combination of two basis forms: a Gaussian distribution and gamma distribution. Our idea aligns closely with the dense basis approach advocated by Iyer & Gawiser (2017), with the distinction that we utilize a reduced number of basis forms. Our goal is to uncover the underlying patterns of SFHs and offer valuable insights for studies focused on recovering SFHs.

The paper is organized as follows. In Section 2, we present the data set employed in this study. In Section 3, we analyze the performance of fitting SFHs. In Section 4, we classify the SFHs based on our fitting results and investigate the relationship between SFH types and galaxy properties. In Section 5, we examine the specific features of SFHs, including peak positions, peak widths, and peak separations. Additionally, we analyze the relationship between SFH characteristics and galaxy properties. Finally, we present our conclusions and engage in additional discussions in Section 6.

2. Data and Methodology

This section provides a brief introduction to the data used in our study. We explore the star formation histories of simulated galaxies using the TNG100-1 simulation.

2.1. The Simulation and Samples

To conduct our analysis, we select galaxies from the TNG100-1 simulation and reconstruct their SFHs. The TNG100-1 simulation is a large-scale cosmological simulation that uses the AREPO moving mesh code developed by Springel (2010) and includes the effects of gravity and magnetohydrodynamics. Its cosmological parameters, in agreement with Planck 2015 (Planck Collaboration et al. 2016), are $\Omega_\Lambda = 0.6911$, $\Omega_m = 0.3089$, $\Omega_b = 0.0486$, $\sigma_8 = 0.8159$, $n_s = 0.9667$, and $h = 0.6774$. The simulation covers a volume of $110.7^3\,\mathrm{Mpc}^3$ with $1820^3$ dark-matter particles and $1820^3$ initial hydrodynamic cells. The dark-matter particle mass resolution is $7.5\times10^{6}\,M_\odot$ and the initial mass resolution of baryons is $1.4\times10^{6}\,M_\odot$. The simulation has 100 output snapshots, covering the redshift range 127–0. The TNG100-1 simulation features an updated physical model that encompasses the revised recipes for star formation and evolution, chemical enrichment, cooling, and feedback (Weinberger et al. 2017; Nelson et al. 2018a; Pillepich et al. 2018a, 2018b). It also features a modified active galactic nucleus (AGN) feedback model to regulate massive galaxies (Weinberger et al. 2017) and a galactic winds model to shape low-mass galaxies (Pillepich et al. 2018a). More details about the simulation and its data release can be found in the introductory paper series of TNG100-1 (Nelson et al. 2018a; Pillepich et al. 2018a; Marinacci et al. 2018; Naiman et al. 2018; Springel et al. 2018) and in the data release itself (Nelson et al. 2019).

A total of 43,440 galaxies were identified using the SUBFIND algorithm (Springel et al. 2001). These galaxies satisfy two criteria: a stellar mass of $M_*(z = 0) > 10^{8}\,M_\odot/h$ at redshift 0, and a merger tree longer than 10 snapshots. We obtained the merger trees of these galaxies from the TNG project supplementary files, which are generated using the SubLink algorithm (Rodriguez-Gomez et al. 2015).

The SFH of a galaxy is determined based on its stellar population. Each stellar particle records its age, measured in terms of lookback time $t_L$ in units of gigayears (Gyr), as well as its initial stellar mass. We divide the initial stellar masses into time bins based on particle age, using a time bin width of $\Delta t = 0.1$ Gyr. The SFH of each galaxy is then calculated as $\mathrm{SFR}(t_L) = [M_{*,\mathrm{initial}}(t_L - \Delta t) - M_{*,\mathrm{initial}}(t_L)]/\Delta t$. This definition of SFH closely resembles those reconstructed through observations, with numerous studies deriving SFR estimates using identical methods (Conroy 2013; Johnson et al. 2013; Donnari et al. 2019; Hahn et al. 2019; Matthee & Schaye 2019). A time bin width of 0.1 Gyr has been widely employed in previous works. The choice of time bin may affect the reconstruction of the SFH. We present the results of tests with different time bin widths in Appendix A and find that a width of 0.1 Gyr is appropriate for SFH reconstruction based on the TNG100-1 data. Unlike previous studies (Benítez-Llambay et al. 2015; Digby et al. 2019; Joshi et al. 2021), we employ the initial stellar mass rather than the stellar mass at z = 0 to reconstruct the SFH. In the TNG100-1 simulation, stellar particles return approximately 40% of their mass to the interstellar medium during their evolution (Pillepich et al. 2018b). Hence, utilizing the initial stellar mass allows us to account for stellar mass loss and obtain a more accurate estimation of the SFH. Appendix B shows the difference between the SFH derived from the initial stellar mass and the SFH derived from the stellar mass at z = 0.
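The binning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `sfh_from_particles`, the 14 Gyr upper limit, and the toy particle list are all hypothetical, and the SFR is taken as the initial mass formed in each age bin divided by the bin width.

```python
import math

def sfh_from_particles(ages_gyr, initial_masses, dt=0.1, t_max=14.0):
    """Bin initial stellar masses by age and return (bin_centers, sfr).

    sfr[i] is the initial mass formed in bin i divided by dt, i.e. in
    (mass units) per Gyr. Particles outside [0, t_max) are ignored.
    t_max = 14 Gyr is an assumed upper limit, not a value from the paper.
    """
    n_bins = int(round(t_max / dt))
    mass_in_bin = [0.0] * n_bins
    for age, mass in zip(ages_gyr, initial_masses):
        idx = int(math.floor(age / dt))  # index of the lookback-time bin
        if 0 <= idx < n_bins:
            mass_in_bin[idx] += mass
    centers = [(i + 0.5) * dt for i in range(n_bins)]
    sfr = [m / dt for m in mass_in_bin]
    return centers, sfr

# Toy input: three particles (ages in Gyr, initial masses in solar masses).
centers, sfr = sfh_from_particles([0.05, 0.07, 1.23], [1e6, 2e6, 4e6])
```

Because the initial masses are summed, the integral of the resulting SFR over time recovers the total mass formed, which is the motivation given in the text for not using the z = 0 masses.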

2.2. Fitting Method

We employed a range of combinations of Gaussian and gamma distributions to establish the best fit to the SFH curves. The reason we use Gaussian and gamma distributions as basic functions is because these two patterns have been widely applied in previous works on SFH reconstruction and spectral energy distribution (SED) fitting (Iyer & Gawiser 2017; Zhou et al. 2020a). Moreover, the shapes of these two distributions can, respectively, fit symmetric and asymmetric distributions, thereby enabling the overall fitting function to have better universality while limiting the number of bases. The general fitting formula was expressed as follows:

Equation (1)

where $\varphi_i(t_L)$ could be either a Gaussian distribution,

Equation (2)

or a gamma distribution,

Equation (3)

Here, $C_i$, $\mu_i$, $\sigma_i$, $\alpha_i$, $\beta_i$, and $\nu_i$ are fitting parameters.
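To make the roles of the parameters concrete, the sketch below evaluates a mixture of Gaussian and gamma components. The exact functional forms are not reproduced in this text, so the forms used here are assumptions: each component is an amplitude $C_i$ times a normalized density, and the gamma component is a rate-parameterized gamma density shifted to start at $\nu_i$. The function names are hypothetical.

```python
import math

def gaussian_component(t, c, mu, sigma):
    """Assumed form: C times a normalized Gaussian density in lookback time."""
    z = (t - mu) / sigma
    return c / (sigma * math.sqrt(2.0 * math.pi)) * math.exp(-0.5 * z * z)

def gamma_component(t, c, alpha, beta, nu):
    """Assumed form: C times a gamma density (shape alpha > 0, rate beta > 0)
    shifted so that it starts at t = nu; zero for t <= nu."""
    x = t - nu
    if x <= 0.0:
        return 0.0
    log_pdf = (alpha * math.log(beta) - math.lgamma(alpha)
               + (alpha - 1.0) * math.log(x) - beta * x)
    return c * math.exp(log_pdf)

def model_sfr(t, components):
    """Sum all components; each entry is ("gaussian", (c, mu, sigma))
    or ("gamma", (c, alpha, beta, nu))."""
    total = 0.0
    for kind, params in components:
        if kind == "gaussian":
            total += gaussian_component(t, *params)
        elif kind == "gamma":
            total += gamma_component(t, *params)
        else:
            raise ValueError(f"unknown component type: {kind}")
    return total

# Example: one symmetric and one asymmetric episode evaluated at t_L = 5 Gyr.
value = model_sfr(5.0, [("gaussian", (1.0, 5.0, 1.0)),
                        ("gamma", (1.0, 2.0, 1.0, 3.0))])
```

The Gaussian is symmetric about $\mu_i$, while the gamma term rises and decays at different rates, which is the symmetric/asymmetric distinction the text relies on.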

We also used $R^2$ to evaluate the goodness of fit at every step. For a sequence $y_i$ and its corresponding fitted values $y_{\mathrm{fit},i}$, $R^2$ is calculated as:

Equation (4)

where $\overline{y}$ represents the mean of the sequence $y_i$, $n$ is the number of data points involved in the fitting, and $p$ is the number of variables, which is set to 1 in this case. $R^2$ ranges between $-\infty$ and 1, with values closer to 1 indicating better agreement between data and model. Although $R^2$ generally lies between 0 and 1, it becomes negative if the fitting is extremely poor.
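Since the definition involves both $n$ and $p$, it reads like the standard adjusted coefficient of determination. The sketch below assumes that standard form, $R^2 = 1 - (1 - R^2_{\rm raw})(n-1)/(n-p-1)$; this is an assumption, as the exact expression is not reproduced here, and `adjusted_r2` is a hypothetical name.

```python
def adjusted_r2(y, y_fit, p=1):
    """Adjusted R^2 (assumed standard form) for data y and fitted values y_fit.

    p is the number of variables; the text sets p = 1.
    """
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_fit))  # residual sum of squares
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)              # total sum of squares
    r2_raw = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2_raw) * (n - 1) / (n - p - 1)

data = [1.0, 2.0, 3.0, 4.0]
perfect = adjusted_r2(data, data)                 # model reproduces the data
flat = adjusted_r2(data, [2.5, 2.5, 2.5, 2.5])    # model is just the mean
```

A fit that is no better than the mean already gives a negative value under this form, which illustrates how very noisy SFHs can end up with $R^2 < 0$.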

The fitting process is given as pseudocode in Algorithm 1. The main idea is to obtain the best fit of each component independently first and then combine them. Each component is adjusted to match either the local or the global curve of the SFH, and both Gaussian and gamma shapes are tested. The best component fit is the one with the maximum $R^2$ value. The overall fit is obtained by summing all the components. Overall fits with different maximum numbers of components $N_c$, ranging from 1 to 6, are created. Then, the best fit is selected based on the $R^2$ values of the overall fit candidates. We did not test SFH fittings with $N_c$ greater than 6 because of the very low fraction of such samples. Our fitting results indicate a significant decrease in the number of SFHs with an increasing number of components. In the current set, the fraction of SFH fits with six components is 4%, and the fraction with more than six components is expected to be smaller still.

Algorithm 1. Fitting Algorithm for One SFH

Input: $x={t}_{L}$, $y={SFR}({t}_{L})$
Output: parameters of fitting $\hat{y}$
for ${N}_{c}\in [1,6]$ do
    /* Try different maximum number of components $N_c$ */
    for $i\in [1,{N}_{c}]$ do
        /* Fitting the $i$th component */
        find the maximum point of $y(x)$ as $({x}_{i},{y}_{i})$ ;
        select local data ${y}_{{local}}=y({x}_{i}-1\,{Gyr}\lt x\leqslant {x}_{i}+1\,{Gyr})$ ;
        fit ${y}_{{local}}$ with Gaussian distribution, get ${\hat{y}}_{{local},{gau}}$ ;
        fit ${y}_{{local}}$ with Gamma distribution, get ${\hat{y}}_{{local},{gam}}$ ;
        select global data ${y}_{{global}}=y({all\ range})$ ;
        fit ${y}_{{global}}$ with Gaussian distribution, get ${\hat{y}}_{{global},{gau}}$ ;
        fit ${y}_{{global}}$ with Gamma distribution, get ${\hat{y}}_{{global},{gam}}$ ;
        /* the priors for Gaussian fitting are ${\mu }_{i}={x}_{i}$ , ${\sigma }_{i}=1$ */
        /* the ranges of parameters are ${\mu }_{i}\in [0,14]$ , ${\sigma }_{i}\in (0,10)$ */
        /* the priors for Gamma fitting are ${\nu }_{i}={x}_{i}$ , ${\beta }_{i}=1$ , ${\alpha }_{i}=1$ */
        /* the ranges of parameters are ${\nu }_{i}\in [0,14]$ , ${\beta }_{i}\in [0.1,+\infty )$ , ${\alpha }_{i}\in [0,100]$ */
        calculate $R^2$ of ${\hat{y}}_{{local},{gau}}$, ${\hat{y}}_{{local},{gam}}$, ${\hat{y}}_{{global},{gau}}$ and ${\hat{y}}_{{global},{gam}}$ ;
        find best component fit ${\hat{y}}_{i}=\mathop{\arg \,\max }\limits_{{R}^{2}}({\hat{y}}_{{local},{gau}},{\hat{y}}_{{local},{gam}},{\hat{y}}_{{global},{gau}},{\hat{y}}_{{global},{gam}})$ ;
        remove fitted component $y=y-{\hat{y}}_{i}$ ;
        change negative point to zero $y[y\lt 0]=0$ ;
    combine all components ${\hat{y}}_{{N}_{c}}^{\prime}={\sum }_{i=1}^{{N}_{c}}{\hat{y}}_{i}$ ;
    use ${\hat{y}}_{{N}_{c}}^{\prime}$ as prior to redo the fitting to ${SFR}({t}_{L})$ to get tuned ${\hat{y}}_{{N}_{c}}^{\prime\prime}$ ;
    calculate $R^2$ of ${\hat{y}}_{{N}_{c}}^{\prime\prime}$ ;
select the best fitting $\hat{y}=\mathop{\arg \,\max }\limits_{{R}^{2}}({\hat{y}}_{1}^{{\prime\prime} },{\hat{y}}_{2}^{{\prime\prime} },...,{\hat{y}}_{6}^{{\prime\prime} })$
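The inner loop of Algorithm 1 (locate the highest point, describe a component around it, subtract it, clip negative residuals to zero) can be illustrated with a simplified sketch. It deliberately swaps in a much cruder technique than the one described: instead of least-squares fits of Gaussian and gamma shapes to local and global data, it uses a single moment-based Gaussian estimate within 1 Gyr of the peak, and it omits the final joint refit and the selection over $N_c$. All function names are hypothetical, and the Gaussian here is parameterized by its peak height rather than by a normalized amplitude.

```python
import math

def peak_gaussian(t, height, mu, sigma):
    """Gaussian parameterized by its peak height (illustrative choice)."""
    return height * math.exp(-0.5 * ((t - mu) / sigma) ** 2)

def estimate_gaussian(t, y, half_window=1.0):
    """Stand-in for a least-squares fit: take the highest point as the peak
    and estimate sigma from the weighted second moment of the local data."""
    i_max = max(range(len(y)), key=lambda i: y[i])
    t_peak, y_peak = t[i_max], y[i_max]
    local = [(ti, yi) for ti, yi in zip(t, y)
             if t_peak - half_window < ti <= t_peak + half_window]
    weight = sum(yi for _, yi in local)
    var = sum(yi * (ti - t_peak) ** 2 for ti, yi in local) / weight
    sigma = max(math.sqrt(var), 1e-3)  # guard against a zero-width peak
    return y_peak, t_peak, sigma

def peel_components(t, y, n_components):
    """Greedily extract up to n_components peaks from the curve y(t)."""
    residual = list(y)
    components = []
    for _ in range(n_components):
        if max(residual) <= 0.0:
            break  # nothing left to describe
        height, mu, sigma = estimate_gaussian(t, residual)
        components.append((height, mu, sigma))
        # Remove the component and set negative points to zero.
        residual = [max(r - peak_gaussian(ti, height, mu, sigma), 0.0)
                    for ti, r in zip(t, residual)]
    return components

# Synthetic SFH with two well-separated episodes at 3 and 10 Gyr.
t_grid = [0.1 * i for i in range(141)]
y_true = [peak_gaussian(ti, 5.0, 3.0, 0.3) + peak_gaussian(ti, 2.0, 10.0, 0.3)
          for ti in t_grid]
found = peel_components(t_grid, y_true, 2)
```

On this synthetic input the two recovered centers sit close to the input episodes; a real implementation would replace `estimate_gaussian` with bounded least-squares fits of both shapes and keep the candidate with the highest $R^2$.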


Figure 1 illustrates the fittings to the SFHs of four galaxies. We select these four SFHs to cover a range of scenarios with regard to the number of fitting components and the goodness of fit. As the figure shows, our method is capable of capturing the trends of SFHs, even when the $R^2$ value is low. Furthermore, the bottom-right subplot reveals that the extremely small $R^2$ value is attributed to the significant fluctuations of the original SFH. By applying the F-test to determine significance, we find that only 281 SFH fittings fail to reach the 5% significance level (i.e., the p-value is greater than 0.05), indicating that only these 281 fittings should be considered unreliable. The $R^2$ values of these unreliable fittings range from −0.067 to 0.021. However, in order to improve the credibility of the following analysis, we further define a group of well-matched samples by selecting SFH fittings that satisfy an arbitrary criterion of $R^2 \geq 0.5$. There are 23,229 SFHs remaining in this group. In the following sections, we will specify whether we use all samples or well-matched samples for our investigations.


Figure 1. Four examples of the fittings to the SFHs. The blue solid line represents the actual SFH data, and the orange dashed line represents the corresponding fitted curve. The dotted lines show the decomposition of components in the fitted curve. The goodness of fit for the overall fitting is displayed in the legend. We illustrate the situations for different levels of fitting goodness: a good fit containing multiple components (top left), a good fit containing one component (top right), a median fit (bottom left), and a bad fit (bottom right).


The SFHs of galaxies are classified based on the properties of the fittings. Six categories labeled as "C1" to "C6" are assembled based on the number of fitting components. The SFHs are also divided into three additional categories based on the major shapes of the fitting components. These categories are "G" type SFH with more Gaussian components, "Γ" type SFH with more gamma components, and "GΓ" type SFH with an equal number of both types of components. Table 1 presents the number of SFHs of each type.
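The two labels described above follow directly from the list of fitted component shapes. The sketch below is a minimal illustration; the function name `classify_sfh` and the string labels for the shapes are hypothetical conventions, not part of the original work.

```python
def classify_sfh(component_shapes):
    """Return (count_label, shape_label) for a list of fitted component shapes.

    component_shapes holds one entry per component: "gaussian" or "gamma".
    count_label is "C1".."C6"; shape_label is "G", "Γ", or "GΓ".
    """
    n_total = len(component_shapes)
    if not 1 <= n_total <= 6:
        raise ValueError("expected between 1 and 6 components")
    n_gauss = sum(1 for s in component_shapes if s == "gaussian")
    n_gamma = sum(1 for s in component_shapes if s == "gamma")
    if n_gauss + n_gamma != n_total:
        raise ValueError("shapes must be 'gaussian' or 'gamma'")
    if n_gauss > n_gamma:
        shape_label = "G"
    elif n_gamma > n_gauss:
        shape_label = "Γ"
    else:
        shape_label = "GΓ"
    return f"C{n_total}", shape_label

label = classify_sfh(["gaussian", "gamma", "gaussian"])
```

Note that a tie between the two shapes is only possible for an even number of components, so the "GΓ" type cannot occur for C1, C3, or C5.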

Table 1. The Numbers of SFHs with Different Fitting Components

                        All Samples                  $R^2 \geq 0.5$ Samples
Types           Total       Γ      GΓ       G      Total       Γ      GΓ       G
C1             19,462  10,290       0    9172       8162    4733       0    3429
C2             10,897    2933    5140    2824       5786    1259    2626    1901
C3               5719    2618       0    3101       3680    1358       0    2322
C4               3450     888    1181    1381       2446     462     856    1128
C5               2121     822       0    1299       1671     596       0    1075
C6               1791     449     533     809       1484     346     443     695
Total Number   43,440  18,000    6854  18,586     23,229    8754    3925  10,550

Note. Columns (2)–(5) show the statistics for all SFHs, while columns (6)–(9) show the results for well-matched SFHs. Types "C1" to "C6" mean the SFH types classified by component number from 1 to 6. Types "Γ," "GΓ," and "G" mean the SFH types classified by the major component shape, corresponding to SFHs with more gamma components, with an equal number of two components, or with more Gaussian components.


As Table 1 shows, fittings with fewer components are more prevalent than those with multiple components. This is because the number of galaxies increases exponentially as the mass decreases, and smaller-mass galaxies dominate in terms of quantity. The star formation process in these smaller-mass galaxies is usually shorter, therefore it can generally be described with fewer components. On the other hand, there are also numerical reasons for this, such as our method tending to achieve fitting with the fewest possible components. Table 1 also indicates that SFHs that match our fitting model well contain somewhat more Gaussian components than gamma components, with the exception of single-component galaxies that exhibit a tendency toward a gamma-shaped distribution.

3. Goodness of Fit

Before conducting the analysis, it is important to evaluate the ability of our fitting method to accurately describe the SFHs of galaxies. We examined the distribution of the goodness of fit for all samples, which is shown in Figure 2.


Figure 2. Histograms of $R^2$ of SFH fittings. The overall distribution is shown as a gray histogram. The histograms of $R^2$ values for SFHs of different types of fittings are shown in different colors. The SFH fittings are classified by the number of components (left plot) or by the shape (gamma or Gaussian distribution) of the dominant component (right plot).


For the total trends (gray lines in Figure 2), the median of the $R^2$ value is 0.53 and the peak of the $R^2$ distribution is at $R^2 = 0.70$. The median value here is consistent with Iyer & Gawiser (2017). To quantitatively evaluate the performance of reconstructing the SFHs from mock catalogs, e.g., SAMs (Somerville & Davé 2015), the hydrodynamic simulation MUFASA (Davé et al. 2017), and stochastic SFHs (Kelson 2014), Iyer & Gawiser (2017) used $R^2$, which has the same definition as in this work, to quantify the accuracy of the reconstruction. The $R^2$ distribution plot indicates that the reconstruction ability of our model reaches acceptable accuracy, and therefore it can be used to fit the majority of SFHs of galaxies. Nonetheless, the long tail in the $R^2 < 0$ region suggests that there is a small number of SFHs that cannot be fitted with our model. Generally, fittings with multiple components exhibit good accuracy, and the fitting quality of Gaussian components is slightly superior to that of gamma components, consistent with the results presented in Table 1.

The shape of the SFH curve of a galaxy is tightly related to its underlying physical processes. In our study, the "goodness of fit" value can indicate to some extent whether the shape of the SFH curve of a galaxy approximates a certain type of function/distribution, specifically a combination of Gaussian and gamma distributions. Hence, exploring how the $R^2$ value relates to the properties of galaxies at z = 0 will be useful in our investigation of star formation history.

Figure 3 depicts the relationship between the $R^2$ value of SFH fittings and the stellar mass of their corresponding galaxies at z = 0. It is evident from the figure that higher-mass galaxies exhibit better fitting quality. This implies that the star formation history of high-mass galaxies conforms more closely to Gaussian or gamma distribution shapes. This $R^2$–$M_*$ relation is partially ascribed to the resolution effect present in the simulation. Low-mass galaxies contain fewer particles, leading to reduced accuracy and greater noise in the reconstructed SFHs. Consequently, the overall precision of the fitting is lowered, although we have already set a minimum mass threshold to ensure sufficient particles per galaxy and ease the resolution effect. A more comprehensive assessment of this point would necessitate additional simulations with different resolutions, which are beyond the scope of this work.


Figure 3. The two-dimensional distribution of $R^2$ of SFH fittings and stellar mass of corresponding galaxies at z = 0. The gray line displays the median value of $R^2$ for galaxies of different masses, while the error bars represent the range of $R^2$ values from 20% to 80% in each mass bin.


Figure 4 depicts the relationship between the $R^2$ value of SFH fittings and the color of their corresponding galaxies at z = 0. For blue galaxies, the fitting goodness is relatively low and only weakly correlated with color. Conversely, at the red end, our model yields better fitting results: the SFHs of redder galaxies more closely resemble peak functions such as Gaussian or gamma distributions. We attribute this to the quenching that red galaxies underwent during their star formation history, which produces distinctive peak-shaped SFHs.


Figure 4. The two-dimensional distribution of $R^2$ of SFH fittings and color of corresponding galaxies at z = 0. The gray line displays the median value of $R^2$ for galaxies of different colors, while the error bars represent the range of $R^2$ values from 20% to 80% in each color bin.


Figure 5 illustrates the relationship between the $R^2$ values and the age of galaxies. Following Nelson et al. (2018a), the age of a galaxy is defined as the mass-weighted average age of all stellar particles within it. It represents the overall distribution of star formation time, as well as an effective length of the SFH. Figure 5 shows that young and old galaxies have better fitting goodness with our model. This result is straightforward to comprehend. Young galaxies have shorter and simpler SFHs, which result from fewer physical mechanisms acting during star formation. Old galaxies correspond to the massive and red galaxies, which achieve better fitting results with our model, as discussed in the previous paragraphs. For both old and young galaxies, the SFHs should have relatively significant peaks at early or late stages, which makes the fittings easier. On the other hand, the mass-weighted age reaches half of the SFH length, approximately 6 Gyr here, when the SFR is distributed evenly across the whole SFH. In this case, the SFH is hard to fit with Gaussian or gamma distributions, which results in the extremely low $R^2$ value at age ≃ 6 Gyr. Figure 6 illustrates that the age distribution of SFHs with more components tends to concentrate around 6 Gyr, which supports this argument.
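The statement that an evenly distributed SFR gives a mass-weighted age of half the SFH length can be checked numerically. The sketch below assumes equal-width time bins, so that the mass formed in each bin is proportional to its SFR, and uses an illustrative SFH length of 12 Gyr chosen only to match the quoted value of about 6 Gyr; `mass_weighted_age` is a hypothetical helper.

```python
def mass_weighted_age(lookback_times, sfr):
    """Mass-weighted mean age for an SFH sampled in equal-width bins.

    With equal bin widths, the mass formed in each bin is proportional to
    its SFR, so the SFR values can serve directly as weights.
    """
    total = sum(sfr)
    return sum(t * s for t, s in zip(lookback_times, sfr)) / total

dt = 0.1
centers = [(i + 0.5) * dt for i in range(120)]  # assumed 12 Gyr long SFH

flat_age = mass_weighted_age(centers, [1.0] * 120)          # constant SFR
early_burst = [1.0 if t > 10.0 else 0.0 for t in centers]   # only old stars
old_age = mass_weighted_age(centers, early_burst)
```

For the constant SFR the result is half the assumed SFH length, while the early burst yields an age above 10 Gyr, showing how ages far from the midpoint point to a dominant early or late peak.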


Figure 5. The two-dimensional distribution of $R^2$ of SFH fittings and the ages of the corresponding galaxies. The gray line displays the median value of $R^2$ for galaxies of different ages, while the error bars represent the range of $R^2$ values from 20% to 80% in each age bin.


Figure 6. The cumulative density function of average galaxy age with different SFH types. Different colors of histograms represent different SFH types. The gray histogram shows the average galaxy age distribution of all samples. The SFH types are classified by the number of components (left plot) or by the shapes of dominant components (right plot). The top subpanel of each plot shows the residuals to the SFH length distribution of all samples and for samples with different SFH types.


We evaluated the fitting goodness of different galaxy types, as depicted in Figure 7. The galaxies were categorized into two groups: satellite galaxies and central/isolated galaxies. In the simulation, we define the biggest galaxy in a dark-matter halo as the central or isolated galaxy, and other galaxies as satellites. The results show that satellite galaxies have a relatively better fitting. We also plot the cumulative density function (CDF) for subsamples with a stellar mass of around $10^{9}\,M_\odot/h$ (dashed lines) and $10^{10}\,M_\odot/h$ (dotted lines). The discrepancy between centrals and satellites still exists, negating the possibility of a mass effect. A plausible explanation is that satellites are more likely to be quenched in their history, resulting in SFH curves that are similar to Gaussian- or gamma-like peaks. This explanation aligns with the color dependence of $R^2$.


Figure 7. The cumulative density function (CDF) of $R^2$ of SFH fittings for both central and satellite galaxies. The blue lines indicate statistics of central and isolated galaxies. The orange lines indicate statistics of satellite galaxies. Solid lines denote the statistics of all samples. The dashed lines show the CDF of subsamples with stellar masses between $10^{9}\,M_\odot\,h^{-1}$ and $2\times10^{9}\,M_\odot\,h^{-1}$. The dotted lines show the CDF of subsamples with stellar masses between $10^{10}\,M_\odot\,h^{-1}$ and $2\times10^{10}\,M_\odot\,h^{-1}$.


Generally, only half of the galaxies possess SFH shapes that are consistent with the Gaussian–gamma combined distribution. High-mass galaxies, red galaxies, and satellite galaxies generally demonstrate a better fitting for these shapes. We hypothesize that these results are mainly due to mechanisms that can shape distinctive peaks in SFHs, such as rapid increases and quenching of star formation.

4. Dependency of SFH Types

The SFHs that have been fitted are classified into different types based on the number of components and their major shapes, as explained in Section 2.2. This section aims to investigate how galaxy properties differ depending on the type of SFHs. In order to ensure the validity of our study, we will only investigate well-matched samples of SFHs in this section. This is because it is not convincing to attribute poorly fitted SFHs to specific types as defined by our classification.

Figure 8 shows the stellar mass function of galaxies at z = 0 corresponding to the different types of SFHs. High-mass galaxies tend to have SFHs with more decomposition components, which aligns with our intuition. Concerning the component shapes, high-mass galaxies tend to exhibit a prevalence of Gaussian components, whereas low-mass galaxies tend to be dominated by gamma components.


Figure 8. The cumulative stellar mass function of galaxies at z = 0. Different colors represent galaxies with different SFH types. The gray line shows the cumulative stellar mass function of all samples. The SFH types are classified by the number of components (left plot) or by the shapes of dominant components (right plot). In the top subpanel of each plot, the residuals between the cumulative mass function of all samples and those of the samples with different SFH types are given.


Figure 9 shows the distribution of galaxy color at z = 0 corresponding to different types of SFHs. Multiple-component SFHs are predominantly found in blue galaxies, while single-component SFHs are largely associated with red galaxies. The color of galaxies shifts toward the blue area as the number of decomposed components in their history increases. A simple understanding of this is that galaxies with fewer stellar formation epochs are less likely to experience star formation in their later stages when compared to galaxies with multiple stellar formation epochs. Consequently, the colors of the former tend to be red. Galaxies with more gamma components tend to have a slightly bluer color than others. This is primarily because the typical pattern of star formation for many blue galaxies is a slow but continuous rise until recent times. The gamma distribution provides a good fit for this type of curve.


Figure 9. The cumulative density function of galaxy colors ($g - r$) at z = 0. Different colors represent different SFH types. The gray line shows the color distribution of all samples. The SFH types are classified by the number of components (left plot) or by the shapes of dominant components (right plot). The top subpanel of each plot shows the residuals between the color distribution of all samples and those of samples with different SFH types.


Figure 6 shows the cumulative density distribution of the ages of galaxies. The age of a galaxy is defined as the mass-weighted age of all its particles. Galaxies with a single-component SFH are concentrated around two areas, approximately at ages of 4 and 10 Gyr. This corresponds to two types of galaxies. The former represents young galaxies that have just formed and only experienced one period of star formation. The latter refers to older galaxies that have remained quiescent after an initial period of star formation, hence having only one component in their early stage. As the number of components increases, the distribution of galaxy ages gradually converges toward the central position of 6 Gyr. The duration of star formation periods is determined by the underlying physical mechanisms, and thus will not extend infinitely. Therefore, for SFHs with finite lengths (which cannot exceed the age of the universe), there must be more effective star formation time to allow for additional star formation cycles and thus more SFH components. When a galaxy maintains a constant rate of star formation from the beginning of the universe, the average age of its stellar particles approaches half of the SFH length, approximately 6 Gyr in this case. So, more SFH components indicate more effective star formation time across a galaxy's life. The galaxy age does not show a strong correlation with the shape (Gaussian or gamma) of the SFH components. This implies that different physical drivers of star formation have relatively weak correlations with the galaxy age, indicating that they do not have a particular preference for the timing of occurrence.

Figure 10 shows the ratio of the number of central and isolated galaxies to satellite galaxies in galaxies with different SFH types. The horizontal dashed line represents the overall sample ratio. Data points above this line indicate fewer satellite galaxies in this subsample, and vice versa. The proportion of central and isolated galaxies increases with increasing decomposed SFH components, indicating more star formation epochs within central and isolated galaxies. Because most of the satellite galaxies are quenched in late stages, they naturally have fewer star formation epochs. The proportion of satellite galaxies with gamma SFHs is slightly higher than that with Gaussian SFHs. We believe this is mainly due to cluster stripping: the star formation process in satellite galaxies is quenched more quickly, resulting in an asymmetric, gamma-like star formation history curve.

Figure 10. The ratio of central and isolated galaxies over satellite galaxies for different types of SFHs. The horizontal line indicates the ratio for the whole sample.

Figure 11. Examples of C1, C2, and C3 type SFHs with different galaxy mass, color, and age. Each column shows one kind of SFH as labeled on the top. Each row represents one galaxy property range as labeled on the left. In each subfigure, the solid blue line represents the original SFH, and the dashed orange line represents the best fit to it. The colored dashed lines represent decomposed components.

Figure 12. Examples of C4, C5, and C6 type SFHs with different galaxy mass, color, and age. Each column shows one kind of SFH as labeled on the top. Each row represents one galaxy property range as labeled on the left. The blue solid line represents the actual SFH data, and the orange dashed line represents the corresponding fitted curve. The dotted lines show the decomposition of components in the fitted curve.

Figure 13. Examples of Γ, GΓ, and G type SFHs with different galaxy mass, color, and age. Each column shows one kind of SFH as labeled on the top. Each row represents one galaxy property range as labeled on the left. In each subfigure, the solid blue line represents the original SFH, and the dashed orange line represents the best fit to it. The colored dashed lines represent decomposed components.

To give readers a more intuitive understanding of the relationship between SFH types and galaxy properties, Figures 11, 12, and 13 present typical examples of various SFHs for galaxies with different properties. In each of these figures, the first two rows show SFH examples of low-mass and high-mass galaxies, the third and fourth rows show examples of red and blue galaxies, and the last two rows show examples of galaxies with large and small ages.

Overall, high-mass, reddish, long-lived galaxies, and central and isolated galaxies tend to have more SFH components, suggesting more star formation periods. Meanwhile, the shapes of the main components of SFHs in these galaxies tend to be slightly more Gaussian-like. Nevertheless, there is considerable overlap in the property distributions of galaxies with distinct SFH types, which makes it challenging to establish a clear standard for constraining the shape and number of SFH components when reconstructing the SFH of a given galaxy.

5. Episodes of SFH

Both Gaussian and gamma distributions have well-defined shapes; hence, the fitting results can be used to characterize the episodes of each star formation history. From the parameters of the fitting function, the peak position and full width at half-maximum (FWHM) of each episode can be easily identified and used to further understand the occurrence and timescale of star formation processes in galaxies. The position and width of SFH peaks are therefore the key parameters in this analysis. We continue to restrict our samples to R² ≥ 0.5 in this section.
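To illustrate how a peak position and FWHM can be read off a fitted component, the following minimal Python sketch evaluates a Gaussian and a gamma-shaped curve on a time grid and measures both quantities numerically. The function name `peak_and_fwhm`, the parameter values, the 1 Myr grid, and the unnormalized gamma form are assumptions made for this example; they are not taken from the fitting code used in this work, and the time axis here is arbitrary rather than a lookback time.

```python
import numpy as np

def peak_and_fwhm(t, y):
    """Return (peak position, FWHM) of a single-peaked curve sampled on grid t.

    The FWHM is measured as the distance between the outermost grid points
    whose value is at least half of the maximum, so it is accurate only to
    about one grid step and assumes the curve has a single peak.
    """
    i_peak = int(np.argmax(y))
    above = np.flatnonzero(y >= 0.5 * y[i_peak])
    return t[i_peak], t[above[-1]] - t[above[0]]

# Hypothetical time grid in Gyr with 1 Myr steps.
t = np.linspace(0.0, 13.6, 13601)

# Hypothetical Gaussian component.
mu, sigma = 6.0, 1.0
gauss = np.exp(-0.5 * ((t - mu) / sigma) ** 2)

# Hypothetical gamma-shaped component (unnormalized), shape k and scale theta.
k, theta = 3.0, 1.5
gamma_like = t ** (k - 1.0) * np.exp(-t / theta)

g_peak, g_fwhm = peak_and_fwhm(t, gauss)
m_peak, m_fwhm = peak_and_fwhm(t, gamma_like)
print(f"Gaussian: peak = {g_peak:.3f} Gyr, FWHM = {g_fwhm:.3f} Gyr")
print(f"Gamma:    peak = {m_peak:.3f} Gyr, FWHM = {m_fwhm:.3f} Gyr")
```

For a Gaussian the numerical FWHM should agree with the analytic value 2√(2 ln 2) σ ≈ 2.355σ, and for a gamma shape with k > 1 the peak should sit at the mode (k − 1)θ. The gamma FWHM has no simple closed form, which is why a numerical measurement is used in this sketch.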

5.1. Number of Episodes

In this section, we recalculate the number of peaks in each SFH to determine the number of star formation episodes it contains. This is necessary because some fitted components may overlap with others. We combine two fitted components into one peak when the distance between their peaks is smaller than the smaller of their two FWHMs. The numbers of SFHs with varying numbers of peaks obtained after applying this procedure are presented in the third column of Table 2, with the numbers of SFHs having the same number of components shown in the second column. The table indicates that for each type of multicomponent SFH, there are approximately 250 SFHs whose components can be combined, which is quite a small fraction of the total number of SFHs. This suggests that, in most cases, the number of fitted components is consistent with the number of star formation episodes in an SFH.
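The merging rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `count_episodes` is hypothetical, the components are compared pairwise after sorting by peak position, and each component is compared with its immediate neighbor only. How chains of three or more overlapping components are treated in this work is not specified in the text, so that choice is an assumption of the sketch.

```python
def count_episodes(positions, fwhms):
    """Count star formation episodes from fitted components.

    Two neighboring components (sorted by peak position) are treated as one
    episode when the distance between their peaks is smaller than the smaller
    of their two FWHMs. Comparing only adjacent components is an assumption.
    """
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    n_peaks = 0
    prev = None
    for i in order:
        separated = (
            prev is None
            or abs(positions[i] - positions[prev]) >= min(fwhms[i], fwhms[prev])
        )
        if separated:
            n_peaks += 1  # this component starts a new episode
        prev = i
    return n_peaks

# Hypothetical three-component fit (positions and FWHMs in Gyr).
# The first two components are 0.5 Gyr apart, closer than min(1.0, 1.5),
# so they count as one episode; the third is well separated.
n = count_episodes([2.0, 2.5, 9.0], [1.0, 1.5, 2.0])
print(n)
```

In this example the three fitted components reduce to two episodes, which mirrors how an SFH can move from the "N components" column to a smaller "N peaks" entry in Table 2.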

Table 2. SFHs with Different Numbers of Fitting Components and Peaks, Limited to R² > 0.5

| N | SFHs with N Components | SFHs with N Peaks |
| --- | --- | --- |
| 1 | 8162 | 8406 |
| 2 | 5786 | 5868 |
| 3 | 3680 | 3626 |
| 4 | 2446 | 2388 |
| 5 | 1671 | 1779 |
| 6 | 1484 | 1162 |

Table 2 indicates that 64.9% of the SFHs have a better fit with two or more peaks. We examined the intrinsic peak numbers of SFHs in the TNG100-1 simulation and confirmed that the reconstructed multipeak SFH fraction aligns with the intrinsic fraction. However, these results appear to contradict the findings of Iyer & Gawiser (2017) and Iyer et al. (2019), who reported a fraction of multiepisode SFHs ranging from 10% to 20%. They validated their method using hydrodynamical simulation and SAM data, which yielded a similar fraction for intrinsic peaks.

The main purpose of Iyer & Gawiser (2017) and Iyer et al. (2019) was to constrain the SFH through SED fitting. It is hard to extract information about all stages in a galaxy's life via an SED. The accuracy of fitting the episodes of SFH is highly sensitive to the timing, duration, and strength of the star formation, all of which are challenging to reconstruct solely from SED photometry/spectroscopy. Due to the degeneracy of various stellar populations in spectrum fitting and the noise from observation, the reconstructed SFH from an SED is always more smoothed out compared to a direct fit to the original SFH. Another possibility is that the discrepancy arises from the method used to count the intrinsic peaks. However, we question this possibility. Considering the intrinsic peak number, it is unlikely that the TNG100-1 simulation reproduces SFHs with significantly more peaks than those in the MUFASA simulation and SAMs. We conducted tests using different sets for intrinsic peak counting, and the results are listed in Appendix C. In summary, Iyer & Gawiser (2017) and Iyer et al. (2019) find fewer peaks, which is reasonable considering the nature of reconstructing SFHs from an SED. Taking advantage of direct fitting, our approach attempts to distinguish more small peaks compared with the work in Iyer & Gawiser (2017) and Iyer et al. (2019). This approach only concerns the performance of direct fitting to the SFH. Whether the same performance can be achieved in observations and whether such a capability can benefit studies are questions that necessitate further exploration and analysis.

5.2. Timing and Timescale of Episodes

Figure 14 shows the distribution of peak positions and widths of SFHs. Each data point in the figure corresponds to one peak. In order to provide readers with a better understanding of the distribution of different intensities of peaks in the multiple-component SFH, we ranked the peaks within the same SFH according to their height and grouped peaks with the same index as subsamples. Scatter plots and histograms with different colors in Figure 14 represent the results of peaks with different indices. However, we found that the position and width distributions of peaks across different orders are remarkably similar. Therefore, in all subsequent discussions, unless otherwise stated, the subject of the discussion is all peaks.

Figure 14. The position and width of peaks of SFHs. Variable tL,peak is the lookback time of the peaks. FWHM is the full width at half-maximum of the peaks. In the middle panel, each point represents data of one peak. The upper panel shows the histogram of the peak positions, and the right panel shows the histogram of the peak widths. For each SFH, we sort the peaks by their height and distinguish the distribution of the peaks in different colors according to their order in the SFH. "Peak 1" is the strongest peak, "Peak 2" is the second strongest peak, and so on. The three wide dashed lines in the figure correspond to FWHM = 13.6 − tL,peak, tL,peak = 0, and FWHM = 1.

The distribution of the SFH peak position tL,peak reveals two conspicuous peaks. The first occurs at tL,peak ≈ 0 Gyr, while the second arises at tL,peak ≈ 10 Gyr (z ≈ 1.8). The peak at tL,peak ≈ 10 Gyr is consistent with the general trend of the cosmic star formation rate, in which the star formation rate density peaks at z ≈ 2 (e.g., Behroozi et al. 2013; Vogelsberger et al. 2014; Pillepich et al. 2018b). The peak at tL,peak ≈ 0 Gyr corresponds to SFH components that are still rising. Iyer & Gawiser (2017) likewise documented a significant fraction of recent peaks.

The FWHMs of most peaks concentrate in the region of 0 Gyr < FWHM < 2 Gyr and reach the maximum of the distribution at FWHM ≃ 1.2 Gyr. Many works emphasize the importance of SFR regulators on this timescale (e.g., Matthee & Schaye 2019; Katsianis et al. 2020; Tacchella et al. 2020), which correspond to various physical processes such as gas inflow and outflow, AGN feedback, galactic wind, giant molecular cloud life cycles, gas recycling, and the halo dynamical timescale (see Table C1 in Iyer et al. 2020 for a list of timescale estimates). The peak of the FWHM distribution at 1.2 Gyr coincides with a dark-matter halo's dynamical timescale (Mo et al. 2010; Lilly et al. 2013). This suggests that the dynamical process of dark-matter halos significantly affects the star formation process of galaxies.

The FWHM distribution also includes a portion of wider peaks that correspond to long-term star formation processes, primarily associated with mergers (Robertson et al. 2006; Boylan-Kolchin et al. 2008; Jiang et al. 2008; Hani et al. 2020), metallicity evolution (Torrey et al. 2018), and galaxy quenching (Sales et al. 2015; Nelson et al. 2018b; Rodríguez Montero et al. 2019; Wright et al. 2019). However, the number of wide peaks is 1 to 2 orders of magnitude smaller than that of the narrow peaks. It is worth noting that, from the aspect of the algorithm, our method is designed to favor wider peaks in order to better represent the overall trend of the SFH of a galaxy. Nevertheless, the FWHM distribution in Figure 14 still prefers narrow peaks. This demonstrates the importance of short-timescale processes in shaping the SFH of a galaxy.

The joint distribution of SFH width and position shows three major branches, as indicated by the three thick dashed lines in Figure 14. One of these branches is located vertically near tL,peak = 0, which corresponds to galaxies exhibiting a sustained increase in star formation rate until the present time. Another branch manifests in the form of a horizontal line with 2 Gyr width near FWHM ≃ 1.2 Gyr. This branch suggests that a large number of star formation processes occurred within the timescale of 0–2 Gyr. These processes do not exhibit any clear preference for specific timing of occurrence. The third branch appears as a negative correlation between FWHM and tL,peak. For these components, the sum of the lookback time of the peak position and the FWHM of the peak is close to the length of the SFH they belong to, namely tL,peak + FWHM ≈ SFH length. These components are in line with the process of a sustained SFR increase from the birth of the galaxy, succeeded by a decline at a specific point in time. In Figure 14, the line FWHM = 13.6 − tL,peak is shown as a reference, and is not the regression line for this group of components.

It is important to note that the bimodal distribution of peak positions does not indicate the distance between adjacent peaks in a single SFH. Within a single SFH, it is found that adjacent peaks still tend to occur at shorter time intervals. This can be seen in Figure 15. The median peak separation, accompanied by a 1σ scatter, is ${2.07}_{-0.92}^{+1.76}\,\mathrm{Gyr}$. This value is significantly smaller than the one measured in the study by Iyer et al. (2019). In their study, the peak spacing was determined to be ${0.42}_{-0.10}^{+0.15}{t}_{\mathrm{univ}}$, where tuniv is the age of the universe at the time of observation. Their estimate corresponds to ${5.72}_{-1.36}^{+2.04}\,\mathrm{Gyr}$ in the context of this work. However, the SFH reconstruction performed by Iyer et al. (2019) used a Gaussian process, which is different from the methodology used in this work. In their study, Iyer et al. (2019) attempted to use the fewest possible numbers of time points, which resulted in the smoothing of the SFH and the omission of certain short-term fluctuations. Additionally, as mentioned in the previous section, the SED fitting process naturally leads to fewer peaks being found when reconstructing the SFH. As a consequence, larger intervals between peaks are expected. On the other hand, although similar in methodology, the results from Iyer & Gawiser (2017) are not as appropriate for comparison. According to Figure 15 in Iyer & Gawiser (2017), the median peak separation is about 0.07tuniv, which is much smaller than the value found in this work and in Iyer et al. (2019). However, there are only a few tens of SFHs with two episodes in Iyer & Gawiser (2017), which reduces the statistical significance. In summary, comparing peak separation between different studies is challenging due to differences in data and fitting methodologies. These differences warrant further investigation.

Figure 15. The histogram of separations between neighboring peaks.

5.3. Mass Dependency

The SFH of a galaxy is strongly associated with the galaxy properties. Consequently, it is reasonable to anticipate the presence of correlations between the extracted SFH features (peak position, FWHM, and peak separation) and the galaxy properties, including stellar mass and color. Figure 16 illustrates the relationship between the stellar mass at z = 0 and the SFH features. For each SFH peak, we record its position, FWHM, and the stellar mass of the galaxy it belongs to. Also, we select all neighboring peak pairs from all SFHs and record their separation and the stellar mass of the galaxy they belong to. Then, we plot the 2D histograms of these SFH features and the stellar mass of the corresponding galaxy.
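The binned statistics described above (a median line with a 20%–80% range per stellar mass bin, as drawn in Figure 16) can be sketched as follows. This is a minimal illustration, not the analysis code of this work: the function name `binned_percentiles`, the bin edges, and the toy input values are all assumptions made for the example.

```python
import numpy as np

def binned_percentiles(x, y, edges, q=(20, 50, 80)):
    """Percentiles of y in bins of x.

    Returns an array of shape (n_bins, len(q)); bins that contain no data
    are filled with NaN. Values of x outside the edges are ignored.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.asarray(edges, dtype=float)
    idx = np.digitize(x, edges) - 1  # bin index of each data point
    out = np.full((len(edges) - 1, len(q)), np.nan)
    for b in range(len(edges) - 1):
        sel = y[idx == b]
        if sel.size:
            out[b] = np.percentile(sel, q)
    return out

# Toy data: log10 stellar mass of the host galaxy and one SFH feature
# (for example the FWHM in Gyr) for each peak. Values are made up.
log_mass = [8.1, 8.2, 8.3, 9.5]
feature = [1.0, 2.0, 3.0, 10.0]
stats = binned_percentiles(log_mass, feature, edges=[8.0, 9.0, 10.0, 11.0])
print(stats)
```

Each row of `stats` holds the 20th percentile, the median, and the 80th percentile for one mass bin, which is the information carried by the gray lines and error bars. The same helper can be applied to galaxy color in place of stellar mass.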

Figure 16. The dependence of SFH peak position (top row), FWHM (middle row), and neighboring peak separation (bottom row) on stellar mass at z = 0. The colors correspond to logarithmic galaxy count in the corresponding SFH-properties-stellar-mass bins. The median values of the peak position, FWHM, or peak separation in different stellar mass bins are displayed by the gray lines. The error bars denote the range of the same value from 20% to 80% within the bin. The statistics for all galaxies are presented in the left column, those for central and isolated galaxies are in the middle column, while satellite galaxies are in the right column.

The peak position of SFH in a galaxy exhibits a U-shaped relationship with the galaxy's mass. The peak of SFH occurs earlier in galaxies with high and low masses, while it occurs later in galaxies with moderate masses. Generally, the formation of high-mass galaxies requires larger dark-matter halos with deeper potential wells to attract more gas. The gas accretion into these strong potential wells triggers star formation earlier. As the galaxy evolves, mechanisms such as AGN feedback and metal enrichment gradually strengthen to suppress the process of stellar formation. This suppression effect is more pronounced in higher-mass galaxies, preventing their star formation rate from surpassing the early levels in subsequent times. The combination of these mechanisms leads to the earlier appearance of the SFH peaks in higher-mass galaxies. On the other hand, for galaxies at the lowest-mass end, their limited gas reservoirs result in an early cessation of stellar formation, leading to smaller final masses. Consequently, the peak position of their SFH remains in the early stages. As for satellite galaxies, their stellar formation process stops earlier due to accretion and gas stripping from host galaxies, causing the peak position of SFH in moderate-mass galaxies to shift toward earlier times.

The correlation between the width of the SFH peak and galaxy mass is relatively weak. With the exception of galaxies at the low- and high-mass ends, there is a slight decrease in the FWHM as galaxy mass increases. Analyzing the joint distribution of FWHM and mass, we observe a reduction in the number of broader peaks within the mass range of approximately $9\lesssim {\mathrm{log}}_{10}({M}_{* }/({h}^{-1}{M}_{\odot }))\lesssim 11$. This suggests a decreasing influence of the long-timescale star formation driving mechanisms on high-mass galaxies, implying that short-timescale processes like feedback and gas recycling become more active in high-mass galaxies. At both the high- and low-mass ends, the median FWHM increases, primarily due to the decline in the number of the narrowest peaks in the SFHs. The SFH of massive galaxies involves a large number of physical processes, resulting in a more complex SFH shape and more decomposable components. As a result, our method tends to exclude short-timescale components when there are too many pronounced components to be extracted. On the other hand, the evolution process in low-mass galaxies is less complex with fewer fluctuations, leading to fewer instances of fitting with narrow peaks.

Excluding the highest- and lowest-mass galaxies, the peak separation of SFH decreases slightly with increasing galaxy mass. Based on the context of galaxy SFR rejuvenation (Fang et al. 2012), it can be inferred that a larger galaxy mass results in a shorter time required to return to a state of star formation. This is attributed to high-mass galaxies having stronger gas accretion rates and a higher chance of merging, facilitating the acquisition of fuel necessary for the star formation process. When distinguishing central/isolated galaxies from satellite galaxies, the correlation between peak separation and galaxy mass is weaker for satellite galaxies. We propose two reasons for this. On one hand, when satellite galaxies are influenced by central galaxies and experience gas stripping, the rise in their star formation rate is halted earlier, causing an earlier occurrence of the peak position in the star formation process. This suppressing effect is more pronounced in low-mass galaxies. Therefore, the larger peak separation of SFH at low mass is reduced more, resulting in a flatter relationship curve. On the other hand, the frequency of disturbance and stripping events for satellite galaxies is primarily dependent on the external environment rather than their own properties. This introduces more randomness and disturbance to their star formation process, thereby weakening the relationship between their SFH characteristics and their own stellar mass.

5.4. Color Dependency

Figure 17 illustrates the relationship between the galaxy color at z = 0 and the SFH features. For each SFH peak, we record its position, FWHM, and the g − r color of the galaxy it belongs to. Also, we select all neighboring peak pairs from all SFHs and record their separation and the g − r color of the galaxy they belong to. Then, we plot the 2D histograms of these SFH features (peak position, FWHM, and peak separation) and the color of corresponding galaxies.

Figure 17. The dependence of SFH peak position (top row), FWHM (middle row), and neighboring peak separation (bottom row) on galaxy color g − r at z = 0. The colors on the plots correspond to logarithmic galaxy counts in the corresponding SFH-properties-color bins. The median values of the peak position, FWHM, or peak separation in different color bins are displayed by the gray lines. The error bars denote the range of the same value from 20% to 80% within the bin. The statistics for all galaxies are presented in the left column, those for central and isolated galaxies are in the middle column, while satellite galaxies are in the right column.

According to Figure 17, there exists a clear linear correlation between the color of galaxies and the position of the SFH peak. It indicates that galaxies with a redder color tend to have earlier peaks, which is naturally attributed to the fact that galaxies formed earlier tend to have older stellar ages, thus displaying a redder color.

The width of the SFH peaks exhibits three distinct regimes as a function of galaxy color. At the blue end with g − r < 0.2, the peaks have smaller FWHM values. This is because the bluest galaxies are still in the early stage of star formation, with their SFR rapidly increasing and their peak positions mainly located at a redshift of 0. Therefore, the fitted peaks have narrower widths. When the g − r value is within the range of 0.2–0.6, the width of the peaks has no correlation with the colors of the galaxies. When the galaxy color exceeds 0.6, a large number of wide peaks appear in satellite galaxies. These are SFH components in satellite galaxies whose star formation processes are regulated by self quenching. This result is consistent with previous work that summarized the quenching timescale in other simulations. Sales et al. (2015) found that the quenching timescale of satellite galaxies is about 2–5 Gyr in Illustris. Wright et al. (2019) found that the quenching timescale is about 2.5–3.3 Gyr, extending out to 10 Gyr, in EAGLE.

The peak separation in the SFH also has a clear relationship with galaxy color. In redder galaxies, the star formation process is concentrated in earlier periods, resulting in more closely spaced star formation peaks.

6. Conclusion

To summarize, this study has undertaken the following objectives and obtained statistical findings concerning the SFHs of galaxies from the TNG100-1 simulation.

1. We employed a function composed of up to six components, described by Gaussian or gamma distributions, to construct fittings for the SFHs in the TNG100-1 simulation. Our fitting method can well recover the trends of the SFH, though the goodness of fit can depend on galaxy mass, color, history length, and the type of galaxy.
2. We analyzed the SFHs that are well fitted by the Gaussian and gamma distribution combinations. Our findings indicate that low-mass galaxies, red galaxies, and satellite galaxies tend to have fewer components and to be predominantly fit by a gamma distribution, while high-mass galaxies, blue galaxies, and central galaxies tend to have more components and to be fit by a Gaussian distribution.
3. We calculated the distribution of the position, width, and separation of SFH peaks. The peak position distribution is concentrated at tL = 0 Gyr and tL = 10 Gyr. The width distribution of peaks is mainly concentrated at FWHM ∼ 0–2 Gyr. The joint distribution of peak position and width is concentrated in three regions, corresponding to different types of star formation modes. In addition, we found that the separation between adjacent peaks in the TNG simulation is concentrated around 2 Gyr, much smaller than that of previously reconstructed SFHs from observations.
4. We analyzed the relationships between three temporal attributes of the SFH, namely peak position, peak width, and peak spacing, and two properties of galaxies, namely stellar mass and galaxy color. Our findings indicate that the peak position and peak separation are related to stellar mass and galaxy color. The peak width shows a limited relationship with stellar mass and galaxy color. This is reasonable because the peak width reflects the timescales of basic physical processes that drive star formation.

Our model can accurately reproduce the trend of the star formation evolution of galaxies. The method of multicomponent decomposition can greatly help us understand the different physical driving factors of star formation in galaxies. Although more than half of the SFHs have low goodness of fit values, this is mainly due to the excessive fluctuations in the SFHs, which make them extremely difficult to fit with any kind of functional template. Of course, the method we used is based on simulation data, and the gaps between our method and SFH reconstruction methods suitable for observational galaxy data cannot be neglected. Since it is difficult to give an absolute criterion for the shape types and number of fitted components, our model will require better algorithms and more computing power to determine these factors when applied to observational data; this will be further developed and improved in future work.

Acknowledgments

This work is supported by "The Major Key Project of PCL." Y.W. thanks the support from the Kunlun server at SYSU. The authors thank the Illustris and Illustris-TNG projects for providing the data. The authors thank Prof. Cheng Li for valuable discussions and suggestions. The authors thank the referee for valuable comments.

Appendix A: Time Bin Resolution in SFH Reconstruction

The resolution of the time bin significantly influences the extraction of a galaxy's star formation history and the subsequent fitting work. Figure 18 illustrates the SFHs (represented by the blue line) and the corresponding fitting curves (represented by the orange line) obtained using different time bin resolutions. The figure reveals that a larger time bin results in a smoother SFH curve, albeit at the cost of losing finer details.

Figure 18. The SFHs of three galaxies obtained at different time bin resolutions (blue solid lines), along with the corresponding fitted curve (orange dashed lines) and the decomposition of components (colored dotted lines). Each row contains the results for one time bin. Each column contains the results for one galaxy.

However, a smaller time bin is not always the optimal choice. Given the finite number of particles in a galaxy during the simulation, an excessive number of time bins would lead to fewer particles in each bin, thereby causing significant numerical fluctuations. This effect consequently reduces the goodness of fit for the SFHs. As depicted in Figure 18, when the width of the time bin is 10 Myr, the noise in the SFH curve itself is considerably high. Despite this, our fitting method can still identify the main components in the curve, although the goodness of fit is significantly reduced numerically.
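The trade-off between bin width and particle noise can be made concrete with a small sketch that bins star particles into an SFH at a chosen time resolution. This is only an illustration under stated assumptions: the function name `binned_sfh`, the 13.6 Gyr time span, and the synthetic particle data (uniform formation times, equal masses) are hypothetical and do not come from the TNG catalog or from the pipeline used in this work.

```python
import numpy as np

def binned_sfh(form_time_gyr, mass_msun, bin_width_gyr, t_max_gyr=13.6):
    """Star formation rate (Msun/yr) in equal-width time bins.

    Each particle contributes its mass to the bin containing its formation
    time; the summed mass is divided by the bin width in years.
    """
    n_bins = int(np.ceil(t_max_gyr / bin_width_gyr))
    edges = np.arange(n_bins + 1) * bin_width_gyr
    mass_per_bin, _ = np.histogram(form_time_gyr, bins=edges, weights=mass_msun)
    sfr = mass_per_bin / (bin_width_gyr * 1e9)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, sfr

# Synthetic galaxy: 2000 particles of 1e6 Msun with uniform formation times.
rng = np.random.default_rng(0)
times = rng.uniform(0.0, 13.5, size=2000)
masses = np.full(times.size, 1.0e6)

for width in (0.01, 0.1, 0.5):  # bin widths in Gyr (10 Myr, 100 Myr, 500 Myr)
    t_c, sfr = binned_sfh(times, masses, width)
    per_bin = times.size / len(t_c)
    print(f"bin = {width:4.2f} Gyr: {len(t_c):4d} bins, ~{per_bin:.1f} particles per bin")
```

With 2000 particles, 10 Myr bins hold only one or two particles each, so the binned SFR is dominated by counting noise, whereas wider bins average over many particles. The total formed mass is the same for every bin width; only how it is distributed across bins changes.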

Conversely, a larger time bin results in a fitted curve that aligns better with the SFH. This is because using a larger time bin is akin to performing a smoothing process on the curve, which effectively eliminates noise.

In Figure 19, we present the distribution of the goodness of fit of our method for SFHs with different time bins. It can be observed that R² decreases as the time bin decreases. This relationship is particularly pronounced in low-mass galaxies, primarily due to the numerical noise caused by the insufficient number of particles in these galaxies.
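For readers who want to reproduce this kind of comparison, the sketch below computes a goodness of fit between an SFH and its fitted curve. It assumes that R² is the standard coefficient of determination, which may differ in detail from the definition adopted earlier in this work; the function name `r_squared` and the toy arrays are hypothetical.

```python
import numpy as np

def r_squared(y, y_fit):
    """Standard coefficient of determination, 1 - SS_res / SS_tot."""
    y = np.asarray(y, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    ss_res = np.sum((y - y_fit) ** 2)     # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Toy SFH values (arbitrary units) and two candidate "fits".
sfh = np.array([1.0, 3.0, 5.0, 3.0, 1.0])
perfect = sfh.copy()
flat = np.full_like(sfh, sfh.mean())

print(r_squared(sfh, perfect))  # a perfect fit
print(r_squared(sfh, flat))     # a constant at the mean
```

Under this definition a perfect fit gives R² = 1 and a constant equal to the mean gives R² = 0, so added bin-to-bin noise that the smooth components cannot follow lowers R² even when the main peaks are recovered.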

Figure 19. The histograms of fitting goodness R2. The histograms of different colors represent the results of the SFHs obtained at different time bin resolutions. The upper panel shows the histograms for high-mass galaxies, while the lower panel shows the histograms for low-mass galaxies.

Another point of interest is whether our fitting results would be affected when different time bins are used. According to our inspection results (partially shown in Figure 18), although the number of components and the type of each component (Gaussian or gamma) may vary, the major peaks can be well retained and maintain stable positions and widths. The main differences lie in the handling of some small components, where different time bins can lead to different results.

We use the fitting results of a time bin of 100 Myr as a benchmark and compare the number of components obtained by fitting the SFHs of other time bins with it. As shown in Figure 20, for most galaxies the number of components does not change relative to the case in which the time bin equals 100 Myr (CΔt − C0.1 = 0), where CΔt is the number of fitting components of a single SFH when the time bin is Δt.

Figure 20. The histogram of the difference in the number of components in the SFH obtained at different time bin resolutions. CΔt represents the number of components in an SFH. The x-axis shows the difference between the number of components at four other time bin resolutions and the number of components at a resolution of 100 Myr. The y-axis represents the corresponding number of galaxies. The upper panel shows the results for high-mass galaxies, while the lower panel shows the results for low-mass galaxies.

For high-mass galaxies, whether the time bin increases or decreases, the change in the number of components is mainly a decrease, i.e., the side where CΔt − C0.1 < 0 is higher than the other side. As mentioned above, this is because the numerical noise introduced by a narrower time bin can blur some small peaks, while the smoothing effect introduced by a wider time bin may also erase these peaks. This suggests that 100 Myr is a good choice of time bin for high-mass galaxies in the TNG simulation.

For low-mass galaxies, we can see that the number of components tends to increase when the time bin increases. This suggests that the time bin of 100 Myr may still be too small for these galaxies, and a slightly larger time bin matches their particle number better.

Last, readers should remember that the influence of time bin resolution mainly comes from the mass resolution limit in a simulation. In other theoretical galaxy models, e.g., SAMs, or in observational data, the influence of mass resolution can be alleviated.

Appendix B: Stellar Mass Loss

Due to the loss of mass in stellar particles in a simulation, the star formation histories (SFHs) constructed based on the mass of particles at z = 0 and those based on the initial mass of particles will differ. Figure 21 illustrates the differences between these two types of SFHs. As the age of each particle remains unchanged, with only the value of its stellar mass varying, the differences between the two SFHs are solely vertical shifts. Consequently, the differences in the fitting results based on the two types of SFHs primarily lie in the amplitude of the components, with no changes in the position and width of each component.

Figure 21. The SFH obtained from the initial stellar mass (blue line) and the final stellar mass (orange line) of one galaxy.

Appendix C: Factors Affecting Intrinsic SFH Peak Counts

Section 5.1 highlights a discrepancy between the "true fraction" of multiple-episode SFHs in this study and those reported in Iyer & Gawiser (2017) and Iyer et al. (2019). To investigate this difference, we emulate the method of identifying intrinsic peaks as described in Iyer & Gawiser (2017) and Iyer et al. (2019), and examine various sets of prominence for the peak-finding algorithm. As demonstrated in Table 3, a higher prominence in the peak-finding algorithm results in a smaller fraction of multiepisode SFHs. Based on the value of the multiepisode fraction, the SFHs reconstructed by our multicomponent method closely align with the intrinsic SFHs defined with a prominence of ${\mathrm{SFR}}_{\mathrm{peak}}-{\mathrm{SFR}}_{\min ,\mathrm{local}}\gt {\mathrm{SFR}}_{\max }/4$, where SFRmax is the max SFR value of one smoothed SFH. Although the smoothing method and prominence in our test cannot be aligned with Iyer & Gawiser (2017) and Iyer et al. (2019) exactly, we can confirm that the prominence accepted in Iyer et al. (2019) is larger than ${\mathrm{SFR}}_{\max }/4$ for the majority of SFHs in the TNG simulation. We apply the prominence from Iyer et al. (2019) to our data and find that the peaks defined by this prominence are more akin to the performance of the prominence of ${\mathrm{SFR}}_{\max }/2$ (refer to Figure 22).
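The dependence of the intrinsic peak count on prominence can be illustrated with a short sketch that smooths an SFH and counts peaks with SciPy. Several choices here are assumptions made for the example and not the exact procedure of this work: the function name `count_intrinsic_peaks` is hypothetical, the synthetic two-burst SFH is made up, an odd window of 21 points is used instead of the 20 points quoted in the Figure 22 caption because older SciPy releases require an odd window length, and the prominence computed by `scipy.signal.find_peaks` follows SciPy's own definition, which may not coincide exactly with SFRpeak − SFRmin,local.

```python
import numpy as np
from scipy.signal import find_peaks, savgol_filter

def count_intrinsic_peaks(sfr, frac=0.25, window=21, polyorder=3):
    """Count peaks of a smoothed SFH whose prominence exceeds frac * SFR_max.

    Returns the number of peaks and their indices. Note that find_peaks
    never reports the first or last sample as a peak.
    """
    smooth = savgol_filter(sfr, window_length=window, polyorder=polyorder)
    peaks, _ = find_peaks(smooth, prominence=frac * smooth.max())
    return len(peaks), peaks

# Synthetic SFH on 100 Myr bins: a strong early burst and a weaker later one.
t = np.linspace(0.0, 13.6, 137)
sfr = (10.0 * np.exp(-0.5 * ((t - 3.0) / 0.8) ** 2)
       + 4.0 * np.exp(-0.5 * ((t - 9.0) / 0.8) ** 2))

n_quarter, _ = count_intrinsic_peaks(sfr, frac=0.25)  # SFR_max / 4
n_half, _ = count_intrinsic_peaks(sfr, frac=0.5)      # SFR_max / 2
print(n_quarter, n_half)
```

In this toy case the weaker burst passes the SFRmax/4 threshold but not the SFRmax/2 threshold, so the same SFH is counted as multiepisode under one prominence and as single-episode under the other. This is the behavior shown in the right panel of Figure 22.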


Figure 22. Two examples of the effects of prominence. In each panel, the blue solid line represents an original SFH from the TNG simulation, and the orange dashed line represents the curve smoothed with the scipy.signal.savgol_filter function using a window of 20 points and polyorder 3. Green circles mark the peaks found with a prominence equivalent to that in Iyer et al. (2019), purple squares mark the peaks found with a prominence of ${\mathrm{SFR}}_{\max }/2$, and red triangles mark the peaks found with a prominence of ${\mathrm{SFR}}_{\max }/4$. The markers have been shifted slightly upward for clarity. In the left panel, all three prominence settings identify the same peaks in the smoothed SFH. In the right panel, only the smallest prominence distinguishes the smaller peak.


Table 3. Factors That Influence the Intrinsic Peak Counts of SFH

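| Reference | Data Set | z | $M_{*}$ | $N_{\mathrm{sample}}$ | Smooth Method | Peak Finder Prominence | frac($N_{\mathrm{peak}}$ > 1) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Iyer2017 | MUFASA | 1 | $\geqslant {10}^{9}\,{M}_{\odot }$ | 1200 | not mentioned | not mentioned | 12%–17% |
| | SAMs | 1 | | 1200 | | | 15%–20% |
| | stochastic | 1 | | 1200 | | | ∼15% |
| Iyer2019 | MUFASA & SAMs | 0.5 | $\geqslant {10}^{8}\,{M}_{\odot }$ | 10,000 in 0.5 < z < 3 | Gaussian Processes | $\mathrm{log}\tfrac{{\mathrm{SFR}}_{\mathrm{peak}}}{{\mathrm{SFR}}_{\min ,\mathrm{local}}}\gt 1.5+\tfrac{1.5}{4}\mathrm{log}\tfrac{{M}_{* }}{{10}^{8}{M}_{\odot }}$ | ∼20% |
| | | 1 | | | | | 10%–20% |
| | | 2 | | | | | 10%–15% |
| This work | TNG | 0 | $\geqslant {10}^{8}\,{M}_{\odot }$ | 23,229 | savgol_filter $^{a}$ | Iyer2019 $^{b}$ | 16.6% |
| | | | | | | ${\mathrm{SFR}}_{\max }/2$ $^{c}$ | 19.9% |
| | | | | | | ${\mathrm{SFR}}_{\max }/4$ | 66.5% |
| This work | TNG | 0 | $\geqslant {10}^{8}\,{M}_{\odot }$ | 23,229 | | | 64.9% |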

Notes. The last row gives, for reference, the multiepisode fraction of the SFHs reconstructed in this work; the multicomponent method proposed here does not require a prominence parameter to define peaks. The prominence parameter in this work is a difference in SFR, whereas Iyer et al. (2019) use a difference in log(SFR).

a We use the savgol_filter function from the scipy.signal module.

b This prominence is the same as that in Iyer et al. (2019).

c ${\mathrm{SFR}}_{\max }$ is the maximum value of each smoothed SFH.
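To make the comparison between the logarithmic criterion and the linear one concrete, the sketch below converts the Iyer2019 threshold, as written in Table 3, into the minimum fractional drop $({\mathrm{SFR}}_{\mathrm{peak}}-{\mathrm{SFR}}_{\min ,\mathrm{local}})/{\mathrm{SFR}}_{\mathrm{peak}}$ that a peak must show. The function names are hypothetical, and the logarithm is assumed to be base 10.

```python
import math

def iyer2019_log_threshold(mstar_msun):
    """Threshold on log10(SFR_peak / SFR_min,local) as written in Table 3.

    Assumes base-10 logarithms and stellar mass in solar masses.
    """
    return 1.5 + (1.5 / 4.0) * math.log10(mstar_msun / 1e8)

def min_fractional_drop(mstar_msun):
    """Smallest (SFR_peak - SFR_min,local) / SFR_peak that passes the log criterion."""
    return 1.0 - 10.0 ** (-iyer2019_log_threshold(mstar_msun))

# Evaluate at the lower mass limit of the sample and at a higher mass.
drop_at_1e8 = min_fractional_drop(1e8)
drop_at_1e10 = min_fractional_drop(1e10)
```

Under these assumptions, the tallest peak of an SFH (for which ${\mathrm{SFR}}_{\mathrm{peak}}={\mathrm{SFR}}_{\max }$) must fall by more than roughly 97% of ${\mathrm{SFR}}_{\max }$ to its local minimum, well above ${\mathrm{SFR}}_{\max }/4$. For lower peaks the comparison depends on the peak height, so this is only a partial illustration of why the logarithmic criterion is usually the stricter one.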

