Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

Network-community analysis of cellular senescence

Alda Sabalic Department of Medical and Life Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Spain Victoria Moiseeva Department of Medical and Life Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Spain Andres Cisneros Department of Medical and Life Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Spain Altos Labs, Inc., San Diego Institute of Science, San Diego, CA 92121, USA Oleg Deryagin Department of Medical and Life Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Spain Eusebio Perdiguero Department of Medical and Life Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Spain Altos Labs, Inc., San Diego Institute of Science, San Diego, CA 92121, USA Pura Muñoz-Cánoves Department of Medical and Life Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Spain Altos Labs, Inc., San Diego Institute of Science, San Diego, CA 92121, USA Jordi Garcia-Ojalvo Department of Medical and Life Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Spain
Abstract

Most cellular phenotypes are genetically complex. Identifying the set of genes that are most closely associated with a specific cellular state is still an open question in many cases. Here we study the transcriptional profile of cellular senescence using a combination of network-based approaches, which include eigenvector centrality feature selection and community detection. We apply our method to cell-type-resolved RNA sequencing data obtained from injured muscle tissue in mice. The analysis identifies some genetic markers consistent with previous findings, and other previously unidentified ones, which are validated with previously published single-cell RNA sequencing data in a different type of tissue. The key identified genes, both those previously known and the newly identified ones, are transcriptional targets of factors known to be associated with established hallmarks of senescence, and can thus be interpreted as molecular correlates of such hallmarks. The method proposed here could be applied to any complex cellular phenotype even when only bulk RNA sequencing is available, provided the data is resolved by cell type.

Keywords: Cellular senescence, molecular hallmarks, RNA sequencing, network centrality, community detection

1 Introduction

As cells age, they tend to accumulate damage. While in younger organisms the damage is usually correctly repaired, in older individuals it increasingly leaves traces. One of the direct consequences of this process is cellular senescence, a phenomenon resulting from unresolved molecular damage, which triggers an inflammatory response within the host tissue. Senescence is a stable and permanent state of growth arrest in which cells are unable to proliferate [1]. It can occur in healthy cells that experience a chronic damage response, involving either direct damaging of the DNA or events like telomere shortening or oncogenic mutations [2]. The affected cells go through an irreversible cell cycle arrest by which the effects of the damage become limited [3]. Additionally, senescent cells secrete a broad range of growth factors, pro-inflammatory proteins, and matrix proteinases, which alter the cellular microenvironment, in what is known as the senescence-associated secretory phenotype (SASP) [4].

Multiple lines of evidence show that senescent cells are directly implicated in causing age-related phenotypes. In studies where senescent cells were genetically or pharmacologically removed, both rapidly and naturally aged mice stayed healthy much longer, and, in some cases, even displayed signs of aging reversal [5, 6]. The opposite process has also been proven to be true: introducing only a small number of senescent cells into young mice resulted in physical dysfunction [7]. Moreover, it was found that senescent cells persist for extended periods of time, which leads to their accumulation during aging [8].

Even though senescent cells have shown high expression of cell cycle regulators such as proteins p16 and p21, a universally accepted criterion for their identification is still lacking [9]. Techniques such as RNA sequencing (both bulk and single-cell) can offer valuable insights by comparing the global transcriptional activity of senescent and non-senescent cells. However, traditional data analysis approaches have not revealed clear transcriptional signatures so far, probably due to the fact that senescence is a highly complex phenotype with multiple cellular implications. Within that context, the aim of this study was to complement the standard bioinformatics approaches by using a network-based method, which enables us to perform a “last-mile” filtering of the set of analyzed genes, and identify a small number of genes that are highly central in distinguishing the senescent from the non-senescent transcriptional profiles.

Complex networks serve as a representation of interactions and connections between their elements –which in the context of this investigation are both genes and cell states. Importantly, complex networks are usually characterized by a community structure, meaning that elements belonging to a community are highly interconnected to each other and therefore grouped together. Conversely, different communities are loosely associated with each other, and their elements are considered less similar. The community structure that arises from a network constructed using transcriptional data can provide useful information about the transcriptional determinants of a complex cell state such as senescence.

2 Experimental data

Our study focuses on revealing the transcriptional regulation of cellular senescence during muscle repair and aging, by hypothesizing that cellular senescence can be discriminated from its gene expression profile. To that end, we analyzed the transcriptional profile obtained by bulk RNA sequencing of distinct cell types. The cells were extracted from muscle tissue of healthy and injured mice, which were separated in three cellular states: senescent, non-senescent and basal (Fig. 1).

Refer to caption

Figure 1: Workflow of the experimental design. Tissue is extracted from the injured and non-injured leg of young and geriatric mice at 3 and 7 days post-injury. Three cell types of interest are sorted out - macrophages, FAPs and satellite cells, each of them having three cell states: basal, senescent and non-senescent. The basal cells were extracted from the non-injured leg, while the senescent and non-senescent are taken from the injured one. A low input RNA sequencing protocol was performed to obtain the expression matrix, which consists of 108 samples each described by 46 078 genes.

The data was obtained by performing experiments on mature adult mice belonging to two age groups - young (aged 3 months) and geriatric (aged 24 months). The experiments were conducted on a total of nine young and nine geriatric mice. The mice were subsequently divided into three pools after which the left leg was damaged with an injection of cardiotoxin, causing massive damage to the skeletal tissue [10]. Although the regenerative response happens almost immediately in both young and geriatric mice, senescent cells take a few days to emerge. The damaged muscle tissue was thus extracted 3 and 7 days after the injection. When combined, these two measurements allow cellular differentiation within a heterogeneous population. The tissue was processed and the cell types were sorted as discussed in Sec. 8.1 below. In this study, we focus on three different cell types because of their crucial role in muscle tissue regeneration - satellite cells (muscle stem cells), fibro-adipogenic progenitor (FAP) cells, and macrophages [11, 12]. Each of the selected cell types was further categorized into three cell states: senescent and non-senescent, extracted from the injured (left) leg, and basal, extracted from the non-injured (right) leg. The RNA of the cells was collected and a low input RNA sequencing analysis was performed to determine the expression levels of a total of 46 078 genes, including protein-coding genes, non-coding genes, and pseudogenes.

The study thus considered three cell types and three cell states in both young and geriatric mice, at two different time points (3 and 7 days after injury). Since each condition was replicated three times, this resulted in a total of 108 possible combinations, which represented the 108 conditions considered in the analysis. In this context, each of the 108 conditions is characterized by a 46 078-dimensional vector, where each dimension corresponds to the expression level of a specific gene.

3 Preliminary filtering

RNA sequencing is prone to exhibiting substantial technical noise, leading to dropout events in which a transcript is missing in a given replicate [13]. To avoid this issue, we averaged the expression of the three replicates for each condition and kept only the genes whose expression was non-zero in at least 2 out of 3 replicates. By doing so, we reduced the number of conditions to 36 and the number of genes to 28 603 (Fig. 2A).

Refer to caption

Figure 2: Dropout filtering. (A) The initial expression matrix consisted of 108 samples each defined by 46 078 genes. The 108 samples arise from 36 different cell conditions, each with 3 replicates. The first filtering consisted in averaging the expression of each gene among the 3 replicates of each condition and removing the ones whose expression was equal to 0 in at least 2 out of 3 replicates (gene cases A and C). This combination of averaging and filtering led to an expression matrix consisting of 36 samples (representing the different cell conditions) and 28 603 genes. (B) Principal component analysis of the remaining data, with cell conditions sorted either by cell type (left) or cell state (right).

We next asked to what extent the resulting filtering gene set was able to identify the senescent phenotype. To that end, given the large size of the data set, we performed a dimensionality reduction via principal component analysis (PCA). Figure 2B shows scatterplots of the different cell conditions for the first two principal components. The conditions are color-coded depending on either the cell type (left panel) or cell state (right panel). As can be seen in the plots, the 28 603 genes resulting from the filtering procedure described above are able to discriminate well between the three cell types analyzed (left panel), but not at all between cell states (right panel). In other words, the selected genes are not sufficiently filtered to separate the senescent from the non-senescent phenotype.

Since our primary objective was to find the genes that can effectively discriminate between the three different cell states, rather than the cell types, further filtering was necessary. To that end, a Kolmogorov-Smirnov statistical test was conducted to compare the expression distributions of the senescent samples to the non-senescent and basal ones (Fig. 3A).

Refer to caption

Figure 3: Significance filtering. (A) We compared the gene-expression distributions of the conditions between the senescent cell state and the non-senescent and basal states grouped together, for each one of the 28603 genes. The distributions were compared by using the Kolmogorov-Smirnov test and applying a Bonferroni correction to the p-value of 0.05. A total of 142 genes passed the test. (B) Principal component analysis of the remaining data, with cell conditions sorted either by cell type (left) or cell state (right).

After applying the Bonferroni statistical test adjustment to the p-value of 0.05 and eliminating the genes whose distributions were not significantly different among the cell states, we reduced the initial number of genes to 142. The strong reduction in gene number eliminated the ability of the gene set to distinguish among cell types (Fig. 3B, left panel), but the data was now a bit more informative in distinguishing between cell states, as can be seen by comparing the right panels of Figs. 2B and 3B.

4 Ranking and filtering genes via eigenvector centrality feature selection

The preliminary filtering methods discussed above show that filtering the gene set enables a better separation between cell states (specifically the senescent versus non-senescent phenotypes) at the expense of the discrimination between cell types (which is not a priority in our case). However the cell-state separation is still far from perfect. We thus decided to focus further on the genes that are more strongly associated with the senescent phenotype. To that end, we implemented the eigenvector centrality feature selection (ECFS) algorithm [14] (Fig. 4A).

Refer to caption

Figure 4: Ranking and filtering genes via eigenvector centrality feature selection. (A) The remaining 142 genes were ranked based on their importance in class separation by using the ECFS algorithm. The top 10 ranked genes were retained, using the criteria discussed in Sec. 5 below. (B) Principal component analysis of the remaining data, with cell conditions sorted either by cell type (left) or cell state (right).

The ECFS algorithm is a network-based method that aims to determine the most relevant features in a given classification task (in our case, we chose to distinguish between the non-senescent cells and the rest). The algorithm uses a fully connected weighted network where the nodes are represented by the features. Subsequently, it evaluates the discriminatory power of each feature by calculating the eigenvector centrality scores. In our case, the features are the genes, while the adjacency matrix is computed in terms of how discriminative each gene is for the stated separation. A detailed explanation of the use and implementation of the algorithm is given in Sec. 8.2 below.

After applying the ECSF approach, we obtained a ranking of the 142 genes in order of importance for the chosen class separation. The higher the eigenvector centrality score, the more influential the gene is in differentiating between the various conditions. This ranking allowed us to perform a final selection of the genes that we expect to have a key role in separating the senescent from the non-senescent phenotype, by enabling us to focus only on the highest ranked genes. If we focus for instance on the 10 highest ranked genes, the PCA results show a clear improvement in the ability of those 10 genes (and those 10 genes only) in distinguishing the senescent phenotype from the rest. The reason for choosing the top 10 genes is explained in the next Section.

5 Network community analysis

After ranking the genes with the ECFS algorithm, we proceeded by building network considering an increasing number N𝑁Nitalic_N of ranked genes, adding one gene at a time in the order of the ranking in each iteration of the method. The nodes of this network correspond to the 36 different cell conditions, while the edges are established by an adjacency matrix based on the Pearson’s correlation coefficients between the N𝑁Nitalic_N selected genes for each pair of conditions (Fig. 5A).

Refer to caption

Figure 5: Network-community assessment. (A) Once the ECFS ranking was obtained, we proceeded to construct cell-state networks. The nodes of the networks are represented by the 36 cell conditions, while the edge weights correspond to the Pearson’s correlation coefficients between the expression of the selected genes of the corresponding pair of conditions. (B) ECFS ranking performance for increasing number of genes, when comparing the non-senescent phenotype with the rest. The hhitalic_h score and the number of communities are plotted in the top and bottom panels, respectively. The best hhitalic_h-score of 0.7370.7370.7370.737 was obtained when considering the first 10 genes. The vertical and horizontal red dashed lines correspond to the optimal hhitalic_h score and the three communities, respectively. (C,D) For that case we obtain the best community separation, represented here. The genes listed on panel C give rise to 3 communities, each one predominantly consisting of a single cell state –basal, senescent or non-senescent–, each represented in panel D by a different color. The node labels have the format AnB, where A is either Y or G, corresponding to the sample coming from a young or geriatric mouse, respectivel, n is either 3 or 7, corresponding to a sample obtained 3 or 7 days after the injury, and B is either M, F or S, corresponding to the cell type being a macrophage, FAP, or satellite cell, respectively.

Once the network was constructed, we proceeded to find the network communities, using the Louvain algorithm for community detection [15]. The process was repeated until all 142 genes were included. The number of communities obtained for increasing number of ranked genes included in the network is shown in the bottom panel of Fig. 5B.

Ideally, the community detection algorithm should identify exactly three communities, with each cell state (senescent, non-senescent, and basal) belonging to its own individual community. Additionally, it was important to quantify the separation quality of the communities, i.e. how well the communities obtained with a given selection of genes discriminate between the cell states. To do so, we introduced a goodness of separation measure based on Shannon’s entropy H=ipiln(pi)𝐻subscript𝑖subscript𝑝𝑖subscript𝑝𝑖H=-\sum_{i}p_{i}\ln(p_{i})italic_H = - ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT roman_ln ( start_ARG italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG ), where pisubscript𝑝𝑖p_{i}italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is the probability of occurrence of a specific event, in this case the fraction of cell conditions belonging to a specific community. The entropy was computed with respect to both communities and cell states, and the total entropy Htotsubscript𝐻totH_{\rm tot}italic_H start_POSTSUBSCRIPT roman_tot end_POSTSUBSCRIPT of a given community separation was obtained by summing the individual entropies. For a thorough description of this process along with an illustrative example, see Sec. 8.3 below.

Once we have computed the total entropy of the community separation, we now define the goodness of separation by:

h={HmaxHtotHmax,if n=30,if n3,casessubscript𝐻maxsubscript𝐻totsubscript𝐻maxif 𝑛30if 𝑛3\displaystyle h=\begin{cases}\dfrac{H_{\rm max}-H_{\rm tot}}{H_{\rm max}},&% \text{if }n=3\\ 0,&\text{if }n\neq 3,\end{cases}italic_h = { start_ROW start_CELL divide start_ARG italic_H start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT - italic_H start_POSTSUBSCRIPT roman_tot end_POSTSUBSCRIPT end_ARG start_ARG italic_H start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT end_ARG , end_CELL start_CELL if italic_n = 3 end_CELL end_ROW start_ROW start_CELL 0 , end_CELL start_CELL if italic_n ≠ 3 , end_CELL end_ROW (1)

where n𝑛nitalic_n is the number of communities found by the community detection algorithm, and Hmaxsubscript𝐻maxH_{\rm max}italic_H start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT is the maximum possible value of the entropy, which corresponds to the case when there is no clustering and each cell condition is in its own community (36 communities in total). Furthermore, the total entropy Htotsubscript𝐻totH_{\rm tot}italic_H start_POSTSUBSCRIPT roman_tot end_POSTSUBSCRIPT is normalized by Hmaxsubscript𝐻maxH_{\rm max}italic_H start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT, and its sign is inverted so that an hhitalic_h-value of 0 indicates maximum disorder, while a value of 1 indicates perfect separability of the three cell states, each belonging to its own community (see Sec. 8.3). Finally, we consider for assessment only the cases where exactly three communities were detected. The values of the hhitalic_h score for increasing number of genes is shown in the top panel of Fig. 5B.

As shown in Fig. 5B, the best cell state separation was obtained when the top 10 ranked genes from the ECFS algorithm output were considered. The corresponding genes are listed in Fig. 5C, and the corresponding network is shown in Fig. 5D, with the different cell states shown by different colors. As it can be observed, one community contains 9 out of 12 senescent cells without containing any other cell state. Additionally, the community primarily composed of basal cells includes two conditions that belong to the non-senescent state. This may be explained by the fact that non-senescent cells resemble basal cells, as they have not yet developed any traits specific to senescence by the time the tissue was collected. The predominantly non-senescent community contains three samples belonging to the senescent state. Similarly to the mixing of basal and non-senescent cells, this might be because both of these cell states are extracted from the injured leg.

To test the robustness of the obtained separation, we performed a permutation test that consisted of randomly changing the labels of the 36 cell conditions 1000 times. Those artificial datasets then underwent the ECFS algorithm, and the community detection was applied on a network constructed with the 10 top-ranking genes. We found that it is significantly unlikely that such separation could be obtained by chance (p-value = 0.00190.00190.00190.0019). This analysis is discussed in Sec. 8.4, and the corresponding distribution of randomly computed normalized entropy scores hhitalic_h is given in Figure 11.

6 Discussion

A final filtering of the gene list was performed by removing each one of the 10 genes obtained above, and determining which ones of these removals resulted in a noticeable disruption of the community structure, resulting in a modification of the separation of cell states. The results of this analysis are shown in Table 1. Furthermore, if only the 6 highlighted genes are considered for network construction and community detection, they produce the same separation among cell states as the 10 best-ranked genes.

Eliminated gene hhitalic_h score
Fabp3 0.605
Gm38050 0.737
Lncpint 0
Pcnp 0
Luc7l3 0
Micalcl 0.737
Ash1l 0.737
Al427809 0
Cdhr4 0.737
Cd59a 0.654
Table 1: Effect of eliminating one by one each gene on the goodness of separation. Those genes whose absence noticeably disrupts the cell-state separation are underlined.

Some of the 6 genes highlighted in Table 1 were found to be directly associated with senescence. For instance, one of the genes most relevant for the separation was Lncpint, a p53-induced long intergenic non-coding transcript. Removing it from the analysis resulted in a complete disruption of the otherwise well-defined clusters. In that case, there was no separation of senescent cells whatsoever - indicating the crucial role of this transcript in the differentiation of this cell state. Recently, there have been numerous reports about the role of Lncpint in various biological processes ranging from DNA damage responses [16], cell cycle and growth arrest [17], cellular senescence [18], cell migration [19] and apoptosis [17]. These findings are consistent with our results. A second of the highlighted genes, Fabp3 (Fatty acid-binding protein 3), is involved in the intracellular transport of long-chain fatty acids. Fabp3 has been reported to be upregulated in senescent cells, together with numerous other lipid-transport and lipid metabolism genes [20]. In general, lipid uptake plays a role among the generally accepted hallmarks of senescence, which include cell cycle arrest, resistance of apoptosis, metabolic and morphological changes, and highly secretory phenotype (SASP) [21]. Lipids are found to be essential for each of these features [22]. Furthermore, an accumulation of lipid droplets in senescent cells has been reported in various studies [23, 24, 25].

To provide an additional assessment of the other genes involved in the cell state separation, we conducted an upstream analysis of their transcription factors (TFs). Specifically, out of the 6 essential genes found to separate the clusters, 4 were identified as protein-coding. The transcription factors regulating those genes were extracted from the Gene Regulatory Network database (GRNdb) [26]. Given that the analyzed samples were extracted from muscle tissue, we used the TF-target regulations for mouse muscle available in the database. Subsequently, the implication of a particular transcription factor in a pathway was assessed using the KEGG Pathways database [27]. Finally, the transcription factors were then grouped according to their involvement in the pathways associated with the established hallmarks of senescence. As shown in Figure 6, many of the identified transcription factors were involved in pathways linked to several hallmarks of senescence such as cell cycle, secretory phenotype, cellular response to stress, apoptosis resistance, DNA damage, morphological alterations, accumulation of mitochondria, chromatin organization and cellular response to stimuli.

Refer to caption

Figure 6: Transcription factors of the found genes and their role in senescence. The transcription factors from the most relevant protein-coding genes are shown grouped by color, representing the pathways they are involved in. The shown pathways are related to the established hallmarks of senescence.

Finally, to validate our findings further, we analyzed a previously published single-cell RNA-seq dataset in normal and fibrotic liver and kidney tissue [28]. This study used the levels of p16 as a marker of senescence, since senescent cells are known to have an increased expression of that protein [29], First, we split the population into two subgroups based on the expression on the fluorescent marker tdTomato, which represents the expression of p16 in this dataset. The expression distribution of tdTomato, together with the chosen expression threshold, is shown in Figure 7.

Refer to caption

Figure 7: Distribution of p16 expression, as measure by the fluorescent marker tdTomato. The expression among all the considered cell types presents a bimodal distribution. The threshold for establishing p16+ and p16- cells is chosen to be 1.5.

The analysis above allowed us to establish a ground-truth separation between senescent (p16+, with tdTomato expression above the threshold) and non-senescent cells (p16-, with tdTomato expression below the threshold). The cells were project in a low dimensional UMAP representation of their transcriptome for the two cell states, as shown in Fig. 8A. We then superimposed the expression of one of the 6 genes identified with our analysis, Cd59a (which had not been associated so far with senescence) on this UMAP representation. As can be seen when comparing the two panels of Fig. 8B, Cd59a exhibited a clear enrichment in the senescent cell populations, larger than other traditional markers. The best categorization can be observed among Kupffer cells and macrophages, where the alignment between a cell being p16 positive and expressing a high level of Cd59a is the highest.

Refer to caption

Figure 8: Expression of Cd59a in p16+ and p16- cells. (a) The UMAP projection of the cell types. (b) The cells are classified into two groups based on their tdTomato expression. Cells with tdTomato expression larger than 1.5 are considered tdTomato+, whereas cells that have the expression of tdTomato 1.5absent1.5\leq 1.5≤ 1.5 are labeled as tdTomato-. A predominant presence of increased Cd59a expression is observed in the tdTomato+ cells, indicating that Cd59a could be a possible marker of senescence for the shown cell types.

The cells that showed the least matching between tdTomato+ and Cd59a were the endothelial and plasma cells, shown together with the other cell types in Fig. 9. It is worth noting that the cell types used for validation originate from an entirely distinct tissue from the ones employed in our analysis, which highlights the potential generalizability of our conclusions.

Refer to caption

Figure 9: Expression of Cd59a in tdTomato+ and tdTomato- cells. (a) The UMAP projection of all the considered cell types. (b) The cells are classified into two groups based on their tdTomato expression. Cells with tdTomato expression larger than 1.5 are considered tdTomato+, whereas cells that have the expression of tdTomato 1.5absent1.5\leq 1.5≤ 1.5 are labeled as tdTomato-. The cells in the tdTomato+ panel show a higher expression of C59a, except for the endothelial and plasma cells.

7 Conclusions

In order to better understand the transcriptional regulation landscape of cellular senescence in the tissue regeneration process, we conducted an analysis that leans on network-theory concepts such as community detection and eigenvector centrality. We found that, from a transcriptional point of view, senescence is a heterogeneous process characterized by a variety of molecular hallmarks. In search of these hallmarks, we identified a gene set that underlies the senescent phenotype. The relevance of the identified genes was assessed in terms of their transcriptional roles, which were found to correspond to several established phenotypical hallmarks of senescence.

Our findings aim to shed light on the transcriptional regulation landscape of cellular senescence in tissue regeneration and potentially contribute to the understanding of this heterogeneous process. Specifically, our analysis resulted in a set of 6 genes that are able to separate almost completely the senescent phenotype from the non-senescent and basal ones. Despite observing some mixing within the found clusters, we assume it could be attributed to certain shared characteristics. For instance, some non-senescent cells clustered together with the basal cells - probably because both cell types exhibit the same non-senescent features, even though they were extracted from different legs of the mice. Furthermore, some senescent cells were clustering with the non-senescent community. Since both of those cells were extracted from the same damaged leg of the mice, it is expected that they might share some transcriptional similarity. However, it is worth noting that there are no senescent cells present in the community consisting of predominantly basal cells and vice versa. This result supports the hypothesis that basal and senescent cells have a sufficiently different transcriptional signature. The genes found by our analysis might be associated with the already established characteristics of senescent cells and could, therefore, facilitate the identification of the molecular hallmarks of senescence.

8 Materials and methods

8.1 Cell sorting

After the filtering, digestion and purification of the extracted tissue, cell sorting by flow cytometry was performed to isolate the cell types of interest. Precisely, cells were selected by using forward- and side-scatter detectors. The forward-scatter measurement allows discriminating cells by size while measuring the side scatter provides information about the granularity of a cell. When combined, these two measurements allow cellular differentiation within a heterogeneous population. The entire experimental protocol, together with the RNA extraction procedure, was done as described by Moiseeva et al.[20].

8.2 Eigenvector centrality feature selection

The eigenvector centrality feature selection (ECFS) algorithm [14] is based on mapping the given feature selection (FS) problem to a graph in which the nodes represent the features of interest. In our analysis, the features are the genes that separate the predetermined classes of cell states. Specifically, given a set of features X={x(1),,x(n)}𝑋superscript𝑥1superscript𝑥𝑛X=\{x^{(1)},...,x^{(n)}\}italic_X = { italic_x start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT , … , italic_x start_POSTSUPERSCRIPT ( italic_n ) end_POSTSUPERSCRIPT }, a fully-connected undirected graph G=(V,E)𝐺𝑉𝐸G=(V,E)italic_G = ( italic_V , italic_E ) is built. V𝑉Vitalic_V represents a set of vertices corresponding to each feature x𝑥xitalic_x, whereas E𝐸Eitalic_E represents the weighted edges among those features. The weighted edges are defined by the adjacency matrix A𝐴Aitalic_A that consists of two measures, one considering the relevance of a given feature in a supervised way and the other including redundancy in an unsupervised way. The first measure considered to compute the matrix is the Fisher criterion, defined as:

fi=|μi,1μi,2|2σi,12+σi,22,subscript𝑓𝑖superscriptsubscript𝜇𝑖1subscript𝜇𝑖22superscriptsubscript𝜎𝑖12superscriptsubscript𝜎𝑖22\displaystyle f_{i}=\frac{|\mu_{i,1}-\mu_{i,2}|^{2}}{\sigma_{i,1}^{2}+\sigma_{% i,2}^{2}},italic_f start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = divide start_ARG | italic_μ start_POSTSUBSCRIPT italic_i , 1 end_POSTSUBSCRIPT - italic_μ start_POSTSUBSCRIPT italic_i , 2 end_POSTSUBSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_σ start_POSTSUBSCRIPT italic_i , 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + italic_σ start_POSTSUBSCRIPT italic_i , 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG , (2)

where μi,Csubscript𝜇𝑖𝐶\mu_{i,C}italic_μ start_POSTSUBSCRIPT italic_i , italic_C end_POSTSUBSCRIPT and μi,Csubscript𝜇𝑖𝐶\mu_{i,C}italic_μ start_POSTSUBSCRIPT italic_i , italic_C end_POSTSUBSCRIPT represent the mean and the standard deviation of the i𝑖iitalic_i-th feature when considering the samples of the C𝐶Citalic_C-th group. A feature is considered to be more discriminative the higher the value of fisubscript𝑓𝑖f_{i}italic_f start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT. Next, mutual information is used to assign a high score to the features related to the given classes:

mi=yYzx(i)p(z,y)log(p(z,y)p(z)p(y)),subscript𝑚𝑖subscript𝑦𝑌subscript𝑧superscript𝑥𝑖𝑝𝑧𝑦𝑝𝑧𝑦𝑝𝑧𝑝𝑦\displaystyle m_{i}=\sum_{y\in Y}\sum_{z\in x^{(i)}}p(z,y)\log\left(\frac{p(z,% y)}{p(z)p(y)}\right),italic_m start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_y ∈ italic_Y end_POSTSUBSCRIPT ∑ start_POSTSUBSCRIPT italic_z ∈ italic_x start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT end_POSTSUBSCRIPT italic_p ( italic_z , italic_y ) roman_log ( divide start_ARG italic_p ( italic_z , italic_y ) end_ARG start_ARG italic_p ( italic_z ) italic_p ( italic_y ) end_ARG ) , (3)

where Y𝑌Yitalic_Y represents the set of class labels and p(,)𝑝p(\cdot,\cdot)italic_p ( ⋅ , ⋅ ) the joint probability distribution. The kernel k𝑘kitalic_k for the adjacency matrix A𝐴Aitalic_A is then obtained by the following matrix product:

k=(fm)𝑘𝑓superscript𝑚top\displaystyle k=\left(f\cdot m^{\top}\right)italic_k = ( italic_f ⋅ italic_m start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT ) (4)

The kernel k𝑘kitalic_k accounts for the supervised part of the measure, given that it is computed taking into account the class label.

The second metric, defined in an unsupervised way, is based on standard deviation. Precisely, it captures the amount of dispersion of the features from the average:

Σ(i,j)=(σ(i),σ(j)),Σ𝑖𝑗superscript𝜎𝑖superscript𝜎𝑗\displaystyle\Sigma(i,j)=\left(\sigma^{(i)},\sigma^{(j)}\right),roman_Σ ( italic_i , italic_j ) = ( italic_σ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT , italic_σ start_POSTSUPERSCRIPT ( italic_j ) end_POSTSUPERSCRIPT ) , (5)

Finally, the adjacency matrix of the graph G𝐺Gitalic_G taking into account both metrics is given by:

A=αk+(1α)Σ,𝐴𝛼𝑘1𝛼Σ\displaystyle A=\alpha k+(1-\alpha)\Sigma,italic_A = italic_α italic_k + ( 1 - italic_α ) roman_Σ , (6)

where α𝛼\alphaitalic_α is the loading coefficient, which should be chosen to fit the purposes of the feature selection. When computing the feature selection ranking for our set of genes, we consider only the supervised part of the matrix, given that we know the labels of our classes (senescent, non-senescent and basal). Therefore, by putting α=1𝛼1\alpha=1italic_α = 1, the proposed adjacency matrix is simplified to A=k𝐴𝑘A=kitalic_A = italic_k. Once the adjacency matrix A𝐴Aitalic_A is constructed, the network is fully defined.

The feature selection process relies on finding the eigenvector centrality for the defined network. The eigenvector centrality measures the influence of a node in a network. Therefore, a node would have a high centrality score if it is connected to many nodes or in the case it is connected to highly influential nodes [30]. Let G=(V,E)𝐺𝑉𝐸G=(V,E)italic_G = ( italic_V , italic_E ) be a graph and A𝐴Aitalic_A its adjacency matrix with a generic entry aijsubscript𝑎𝑖𝑗a_{ij}italic_a start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT. The relative centrality score of a node xvsubscript𝑥𝑣x_{v}italic_x start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT is proportional to the sum of the centralities of the nodes to which it is connected. Hence,

xv=1λj=1naijxj,i=1,,nformulae-sequencesubscript𝑥𝑣1𝜆superscriptsubscript𝑗1𝑛subscript𝑎𝑖𝑗subscript𝑥𝑗𝑖1𝑛\displaystyle x_{v}=\frac{1}{\lambda}\sum_{j=1}^{n}a_{ij}x_{j},\quad i=1,...,nitalic_x start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT = divide start_ARG 1 end_ARG start_ARG italic_λ end_ARG ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , italic_i = 1 , … , italic_n (7)

where n𝑛nitalic_n is the number of nodes in the network and λ𝜆\lambdaitalic_λ is a constant. By reorganizing the previous equation, we obtain the eigenvector equation:

Ax=λx.𝐴x𝜆x\displaystyle A\textbf{x}=\lambda\textbf{x}.italic_A x = italic_λ x . (8)

In general, there are many different eigenvalues λ𝜆\lambdaitalic_λ that satisfy the previous equation. However, in the case of non-negative square matrices, the Perron-Frobenius theorem guarantees the existence of a positive real eigenvalue that is larger than all other eigenvalues, and its corresponding eigenvector is strictly positive (all components are positive). Therefore, the λ𝜆\lambdaitalic_λ value of interest is the largest eigenvalue of matrix A𝐴Aitalic_A. The vthsuperscript𝑣𝑡v^{th}italic_v start_POSTSUPERSCRIPT italic_t italic_h end_POSTSUPERSCRIPT component of the related eigenvector results in the desired centrality score, which corresponds to the ranking of the vthsuperscript𝑣𝑡v^{th}italic_v start_POSTSUPERSCRIPT italic_t italic_h end_POSTSUPERSCRIPT feature in the ECFS algorithm.

8.3 Measuring the goodness of separation

Since the data of our analysis consists of three different cell states (basal, senescent and non-senescent), the goodness of separation score is calculated only in the cases when the number of communities n𝑛nitalic_n detected by the community detection algorithm is exactly 3. Figure 10 provides an illustrative example of how the goodness of separation is determined. First, the probability of occurrence of a given cell state pisubscript𝑝𝑖p_{i}italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT was defined for every cell state. Taking into account that the 36 analyzed conditions include 12 senescent, 12 non-senescent, and 12 basal cells, in the cell type-wise entropy calculation, each entry pisubscript𝑝𝑖p_{i}italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT corresponds to the probability of a cell present in a community to belong to a given cell state. Therefore, when summing the values of pisubscript𝑝𝑖p_{i}italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT for each row, a total probability of 1 is obtained (first table in Fig. 10).

Refer to caption

Figure 10: Entropy calculation example. To obtain the total entropy Htotsubscript𝐻totH_{\rm tot}italic_H start_POSTSUBSCRIPT roman_tot end_POSTSUBSCRIPT, first, we have to calculate the individual entropies, both cell type-wise and community-wise. Since a total of 12 senescent, 12 non-senescent and 12 basal cells were present in our analysis, in the cell type-wise entropy computation, the probability pisubscript𝑝𝑖p_{i}italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is defined as the probability of a cell found in a community to belong to a given cell state. In the community-wise entropy calculation, the probability pisubscript𝑝𝑖p_{i}italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is defined by the ratio of cells belonging to a specific cell state to the overall number of cells within that community. The total entropy Htotsubscript𝐻totH_{\rm tot}italic_H start_POSTSUBSCRIPT roman_tot end_POSTSUBSCRIPT is calculated by summing all the individual cell type-wise and community-wise entropies. The normalized entropy value hhitalic_h is obtained by normalizing the total entropy Htotsubscript𝐻totH_{\rm tot}italic_H start_POSTSUBSCRIPT roman_tot end_POSTSUBSCRIPT with the maximum possible entropy Hmaxsubscript𝐻maxH_{\rm max}italic_H start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT that corresponds to the case when each cell belongs to its own separate community.

In the community-wise calculation, each entry pisubscript𝑝𝑖p_{i}italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT represents the ratio of cells belonging to a cell state and the total number of cells in that community. In this case, the total probability of 1 is achieved by summing up the entries pisubscript𝑝𝑖p_{i}italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT for each column, as shown in the lower table in Figure 10. Subsequently, Shannon’s entropy defined as H=ipiln(pi)𝐻subscript𝑖subscript𝑝𝑖subscript𝑝𝑖H=-\sum_{i}p_{i}\ln(p_{i})italic_H = - ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT roman_ln ( start_ARG italic_p start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG ) is calculated for each of the cell states and cell types. The total entropy Htotsubscript𝐻totH_{\rm tot}italic_H start_POSTSUBSCRIPT roman_tot end_POSTSUBSCRIPT is defined as the sum of the individual entropies. To keep the entropy values in the interval between 0 and 1, a normalized entropy value hhitalic_h is computed. The normalization was done by taking into account the maximum possible entropy Hmaxsubscript𝐻maxH_{\rm max}italic_H start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT that corresponds to the case in which each cell belongs to its own community. The normalized entropy value is given as:

h={HmaxHtotHmax,if n=30,if n3,casessubscript𝐻maxsubscript𝐻totsubscript𝐻maxif 𝑛30if 𝑛3\displaystyle h=\begin{cases}\dfrac{H_{\rm max}-H_{\rm tot}}{H_{\rm max}},&% \text{if }n=3\\ 0,&\text{if }n\neq 3,\end{cases}italic_h = { start_ROW start_CELL divide start_ARG italic_H start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT - italic_H start_POSTSUBSCRIPT roman_tot end_POSTSUBSCRIPT end_ARG start_ARG italic_H start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT end_ARG , end_CELL start_CELL if italic_n = 3 end_CELL end_ROW start_ROW start_CELL 0 , end_CELL start_CELL if italic_n ≠ 3 , end_CELL end_ROW (9)

where n𝑛nitalic_n is the number of communities found by the community detection algorithm and Hmax=3ln112subscript𝐻max3112H_{\rm max}=-3\ln\frac{1}{12}italic_H start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT = - 3 roman_ln divide start_ARG 1 end_ARG start_ARG 12 end_ARG.

8.4 Statistical validation

To assess the robustness of the separation obtained by the 10 selected genes, a randomized permutation test was performed. The test consisted in generating 1000 datasets by permuting the labels of the 36 cell conditions. Each dataset was first put through the ECFS algorithm to obtain the ranking of the genes, after which we proceeded with the network formation and community detection for the top 10 ranked genes. By doing so, we obtained a distribution of the calculated normalized entropy scores hhitalic_h. The normalized entropy score from the original dataset hhitalic_h was then compared to the distribution by a Kolmogorov-Smirnov test. The test results showed that it is significantly unlikely to obtain such separation by chance, giving a p-value of 0.00190.00190.00190.0019. The distribution of hhitalic_h-scores together with the hhitalic_h-score from the original dataset are shown in Figure 11.

Refer to caption

Figure 11: Histogram of randomly computed hhitalic_h-scores. The hhitalic_h-scores represented in the histogram were computed by permuting the cell conditions labels. The artificially created labels were sorted into 2 classes, one consisting of non-senescent cells and the other of senescent and basal cells. The ECFS algorithm was performed for the classes, and the corresponding networks with the communities were computed. The hhitalic_h-score obtained for the 10 best-performing genes was extracted. The described procedure was repeated 1000 times and the corresponding hhitalic_h-scores are shown in the histogram. The red line represents the original maximum hhitalic_h-score obtained by our separation (h=0.7370.737h=0.737italic_h = 0.737). The p-value for our obtained hhitalic_h-score to be part of this distribution was found to be 0.00019.

Acknowledgments

This work was supported by project PID2021-127311NB-I00 financed by the Spanish Ministry of Science and Innovation, the Spanish State Research Agency and FEDER (reference MICIN/AEI/10.13039/501100011033/FEDER), by the Maria de Maeztu Programme for Units of Excellence in R&D (project CEX2018-000792-M), and by the Generalitat de Catalunya (ICREA Academia programme).

References

  • Kumari and Jat [2021] Kumari, R., Jat, P.. Mechanisms of Cellular Senescence: Cell Cycle Arrest and Senescence Associated Secretory Phenotype. Frontiers in Cell and Developmental Biology 2021;9. doi:doi:10.3389/fcell.2021.645593.
  • de Keizer [2017] de Keizer, P.L.J.. The Fountain of Youth by Targeting Senescent Cells? Trends in Molecular Medicine 2017;23(1):6–17. doi:doi:10.1016/j.molmed.2016.11.006.
  • Baar et al. [2018] Baar, M.P., Perdiguero, E., Muñoz-Cánoves, P., de Keizer, P.L.. Musculoskeletal senescence: A moving target ready to be eliminated. Current Opinion in Pharmacology 2018;40:147–155. doi:doi:10.1016/j.coph.2018.05.007.
  • Coppé et al. [2010] Coppé, J.P., Desprez, P.Y., Krtolica, A., Campisi, J.. The senescence-associated secretory phenotype: The dark side of tumor suppression. Annual Review of Pathology 2010;5:99–118. doi:doi:10.1146/annurev-pathol-121808-102144.
  • Baker et al. [2011] Baker, D.J., Wijshake, T., Tchkonia, T., LeBrasseur, N.K., Childs, B.G., van de Sluis, B., et al. Clearance of p16Ink4a-positive senescent cells delays ageing-associated disorders. Nature 2011;479(7372):232–236. doi:doi:10.1038/nature10600.
  • Bussian et al. [2018] Bussian, T.J., Aziz, A., Meyer, C.F., Swenson, B.L., van Deursen, J.M., Baker, D.J.. Clearance of senescent glial cells prevents tau-dependent pathology and cognitive decline. Nature 2018;562(7728):578–582. doi:doi:10.1038/s41586-018-0543-y.
  • Xu et al. [2018] Xu, M., Pirtskhalava, T., Farr, J.N., Weigand, B.M., Palmer, A.K., Weivoda, M.M., et al. Senolytics improve physical function and increase lifespan in old age. Nature Medicine 2018;24(8):1246–1256. doi:doi:10.1038/s41591-018-0092-9.
  • Childs et al. [2015] Childs, B.G., Durik, M., Baker, D.J., van Deursen, J.M.. Cellular senescence in aging and age-related disease: From mechanisms to therapy. Nature medicine 2015;21(12):1424–1435. doi:doi:10.1038/nm.4000.
  • Cohn et al. [2023] Cohn, R.L., Gasek, N.S., Kuchel, G.A., Xu, M.. The heterogeneity of cellular senescence: Insights at the single-cell level. Trends in Cell Biology 2023;33(1):9–17. doi:doi:10.1016/j.tcb.2022.04.011.
  • Garry et al. [2016] Garry, G.A., Antony, M.L., Garry, D.J.. Cardiotoxin Induced Injury and Skeletal Muscle Regeneration. Methods in Molecular Biology (Clifton, NJ) 2016;1460:61–71. doi:doi:10.1007/978-1-4939-3810-0_6.
  • Joe et al. [2010] Joe, A.W.B., Yi, L., Natarajan, A., Le Grand, F., So, L., Wang, J., et al. Muscle injury activates resident fibro/adipogenic progenitors that facilitate myogenesis. Nature Cell Biology 2010;12(2):153–163. doi:doi:10.1038/ncb2015.
  • Rigamonti et al. [2014] Rigamonti, E., Zordan, P., Sciorati, C., Rovere-Querini, P., Brunelli, S.. Macrophage plasticity in skeletal muscle repair. BioMed Research International 2014;2014:560629. doi:doi:10.1155/2014/560629.
  • Van Dijk et al. [2018] Van Dijk, D., Sharma, R., Nainys, J., Yim, K., Kathail, P., Carr, A.J., et al. Recovering gene interactions from single-cell data using data diffusion. Cell 2018;174(3):716–729. doi:doi:10.1016/j.cell.2018.05.061.
  • Roffo and Melzi [2016] Roffo, G., Melzi, S.. Feature Selection via Eigenvector Centrality. 2016,doi:doi:10.48550/arXiv.1704.05409.
  • Blondel et al. [2008] Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment 2008;2008(10):P10008. doi:doi:10.1088/1742-5468/2008/10/P10008. arXiv:0803.0476.
  • Wang et al. [2021] Wang, Y.h., Guo, Z., An, L., Zhou, Y., Xu, H., Xiong, J., et al. LINC-PINT impedes DNA repair and enhances radiotherapeutic response by targeting DNA-PKcs in nasopharyngeal cancer. Cell Death & Disease 2021;12(5):454. doi:doi:10.1038/s41419-021-03728-2.
  • Bukhari et al. [2022] Bukhari, I., Khan, M.R., Hussain, M.A., Thorne, R.F., Yu, Y., Zhang, B., et al. PINTology: A short history of the lncRNA LINC-PINT in different diseases. WIREs RNA 2022;13(4):e1705. doi:doi:10.1002/wrna.1705.
  • Xiang et al. [2021] Xiang, X., Fu, Y., Zhao, K., Miao, R., Zhang, X., Ma, X., et al. Cellular senescence in hepatocellular carcinoma induced by a long non-coding RNA-encoded peptide PINT87aa by blocking FOXM1-mediated PHB2. Theranostics 2021;11(10):4929--4944. doi:doi:10.7150/thno.55672.
  • He et al. [2021] He, T., Yuan, C., Zhao, C.. Long intragenic non-coding RNA p53-induced transcript (LINC-PINT) as a novel prognosis indicator and therapeutic target in cancer. Biomedicine & Pharmacotherapy 2021;143:112127. doi:doi:10.1016/j.biopha.2021.112127.
  • Moiseeva et al. [2023] Moiseeva, V., Cisneros, A., Sica, V., Deryagin, O., Lai, Y., Jung, S., et al. Senescence atlas reveals an aged-like inflamed niche that blunts muscle regeneration. Nature 2023;613(7942):169--178. doi:doi:10.1038/s41586-022-05535-x.
  • González-Gualda et al. [2021] González-Gualda, E., Baker, A.G., Fruk, L., Muñoz-Espín, D.. A guide to assessing cellular senescence in vitro and in vivo. The FEBS journal 2021;288(1):56--80. doi:doi:10.1111/febs.15570.
  • Hamsanathan and Gurkar [2022] Hamsanathan, S., Gurkar, A.U.. Lipids as Regulators of Cellular Senescence. Frontiers in Physiology 2022;13:796850. doi:doi:10.3389/fphys.2022.796850.
  • Lizardo et al. [2017] Lizardo, D.Y., Lin, Y.L., Gokcumen, O., Atilla-Gokcumen, G.E.. Regulation of lipids is central to replicative senescence. Molecular BioSystems 2017;13(3):498--509. doi:doi:10.1039/C6MB00842A.
  • Flor et al. [2017] Flor, A.C., Wolfgeher, D., Wu, D., Kron, S.J.. A signature of enhanced lipid metabolism, lipid peroxidation and aldehyde stress in therapy-induced senescence. Cell Death Discovery 2017;3(1):1--12. doi:doi:10.1038/cddiscovery.2017.75.
  • Ogrodnik et al. [2019] Ogrodnik, M., Zhu, Y., Langhi, L.G.P., Tchkonia, T., Krüger, P., Fielder, E., et al. Obesity-Induced Cellular Senescence Drives Anxiety and Impairs Neurogenesis. Cell Metabolism 2019;29(5):1061--1077.e8. doi:doi:10.1016/j.cmet.2018.12.008.
  • Fang et al. [2020] Fang, L., Li, Y., Ma, L., Xu, Q., Tan, F., Chen, G.. GRNdb: Decoding the gene regulatory networks in diverse human and mouse conditions. Nucleic Acids Research 2020;49(D1):D97--D103. doi:doi:10.1093/nar/gkaa995.
  • Kanehisa and Goto [2000] Kanehisa, M., Goto, S.. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Research 2000;28(1):27--30. doi:doi:10.1093/nar/28.1.27.
  • Omori et al. [2020] Omori, S., Wang, T.W., Johmura, Y., Kanai, T., Nakano, Y., Kido, T., et al. Generation of a p16 Reporter Mouse and Its Use to Characterize and Target p16high Cells In Vivo. Cell Metabolism 2020;32(5):814--828.e6. doi:doi:10.1016/j.cmet.2020.09.006.
  • Rayess et al. [2012] Rayess, H., Wang, M.B., Srivatsan, E.S.. Cellular senescence and tumor suppressor gene p16. International Journal of Cancer Journal International du Cancer 2012;130(8):1715--1725. doi:doi:10.1002/ijc.27316.
  • Bonacich [1972] Bonacich, P.. Factoring and weighting approaches to status scores and clique identification. The Journal of Mathematical Sociology 1972;2(1):113--120. doi:doi:10.1080/0022250X.1972.9989806.