Abstract
Understanding the structure and the dynamics of networks is of paramount importance for many scientific fields that rely on network science. Complex network theory provides a variety of features that help in the evaluation of network behavior. However, such analysis can be confusing and misleading as there are many intrinsic properties for each network metric. Alternatively, Information Theory methods have gained the spotlight because of their ability to create a quantitative and robust characterization of such networks. In this work, we use two Information Theory quantifiers, namely Network Entropy and Network Fisher Information Measure, to analyze those networks. Our approach detects non-trivial characteristics of complex networks such as the transition present in the Watts-Strogatz model from k-ring to random graphs; the phase transition from a disconnected to an almost surely connected network when we increase the linking probability of the Erdős-Rényi model; distinct phases of scale-free networks when considering a non-linear preferential attachment, fitness, and aging features alongside the configuration model with a pure power-law degree distribution. Finally, we analyze the numerical results for real networks, contrasting our findings with traditional complex network methods. In conclusion, we present an efficient method that ignites the debate on network characterization.
Introduction
Understanding how networks arrange their connections (structure), and how the information flows through their nodes (dynamics), is a breakthrough for many scientific fields that rely on network science to assess all kinds of phenomena. Recently, Information Theory methods gained the spotlight because of their ability to create a more quantitative and robust characterization of complex networks, as an alternative to traditional methods. Standard quantifiers such as Shannon Entropy and Statistical Complexity were adapted to network analysis, providing a different perspective when evaluating networks1.
The quantification of systems with multidimensional measures, in particular a 2D representation or “plane representation” formed by Information Theory-based quantifiers, is extensively applied to time series analysis and characterization. For instance, Rosso et al.2 employed the time causal Entropy-Complexity plane to distinguish stochastic from deterministic systems. The Entropy-Complexity plane comprises measures of entropy (\( {\mathcal H} \)) and Statistical Complexity (\({\mathscr{C}}\)). Two pieces of information are required to calculate \({\mathscr{C}}\), namely the information content and the disequilibrium (\({\mathscr{Q}}\)) of the system. To evaluate the Statistical Complexity, we use the entropy as a measure of the information content, and the disequilibrium is expressed by the divergence between the current system state and an appropriate reference state. The calculation of these quantifiers requires the use of a proper probability distribution that represents the system under study. In the case of time series, the distribution derived from the symbolization proposed by Bandt and Pompe3 has been successfully used to capture the intrinsic time causality behavior of the underlying system (see Supplemental Material). The Shannon Entropy is a disorder measure commonly used in many applications of the Information Theory field and in the Entropy-Complexity plane. It is relatively insensitive to substantial changes in the distribution taking place in a small-sized region of the space; for this reason, it is referred to as a global measure. The Statistical Complexity (\({\mathscr{C}}\)), when defined as a divergence in the space of entropies, is also a global measure.
Alternatively, the Fisher Information Measure (\( {\mathcal F} \)) can be interpreted as a measure of the ability to estimate a parameter, as the amount of information that can be extracted from a set of measurements, and also, a measure of the state of disorder of a system or phenomenon4. The Fisher Information Measure (\( {\mathcal F} \)) is a local measure as it is based upon the gradient of the underlying distribution, being, thus, significantly sensitive to even tiny localized perturbations.
Recently, the Entropy-Complexity plane was extended and used in the context of complex networks. In ref. 1, the authors showed that networks of the same category tend to cluster into distinct regions. To calculate the two required quantifiers, \( {\mathcal H} \) and \({\mathscr{C}}\), for complex networks, the authors used the probability distribution of a random walker traveling between two nodes to represent the topological properties of the network. Based on this distribution, they calculated the Shannon Entropy and the Statistical Complexity. For the evaluation of the disequilibrium, they used the Jensen-Shannon divergence between the actual network and random networks, taking the latter as a reference model. To obtain this divergence it is necessary to average over several random networks with the same number of nodes, which is typically time-consuming. They demonstrated the applicability of their proposal to families of Random Erdős-Rényi5, Small-World Watts-Strogatz6, and Scale-Free Barabási-Albert7 networks. However, this plane presents several limitations: the random networks regularly overlap with all the other models, which creates confusion and can mislead conclusions about the network’s features.
In this work, we propose the Fisher information quantifier, more sensitive to a local relationship between nodes, as a measure of network disorder. Alongside, we suggest the use of the Shannon-Fisher plane as an alternative to the Entropy-Complexity plane for network characterization. Our approach does not require the calculation of a divergence to a reference model, which decreases the computational burden. We analyze two different groups of networks: synthetic and real-world networks.
Methods
Network definition
We assume a graph G(V, E), where V is the set of nodes and E is the set of links (edges), as a suitable model of a network. The graph is represented by an adjacency matrix A of dimension N × N, N being the number of nodes in the network, where \(a_{ij} = 1\) if a link exists between nodes i and j, and \(a_{ij} = 0\) otherwise. We consider undirected, unweighted graphs without self-loops, i.e., simple unweighted graphs. Hence, their adjacency matrices have a zero main diagonal, \(a_{ii} = 0\) for all i = 1, …, N, and are symmetric, \(A = A^{T}\). The node degree \(k_i\) is calculated by \({k}_{i}=\sum _{j}\,{a}_{ij}\); therefore, \(0 \le k_i \le N-1\).
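The conditions above can be checked mechanically. The following is a minimal sketch, assuming the adjacency matrix is stored as a list of 0/1 rows; the function name `degrees` and the example graph are illustrative and not part of the original work.

```python
def degrees(adj):
    """Return the degree k_i of every node of a simple undirected graph."""
    n = len(adj)
    for i in range(n):
        # A simple graph has no self-loops: the main diagonal must be zero.
        if adj[i][i] != 0:
            raise ValueError("self-loops are not allowed")
        # An undirected graph has a symmetric adjacency matrix.
        for j in range(n):
            if adj[i][j] != adj[j][i]:
                raise ValueError("adjacency matrix must be symmetric")
    # k_i is the sum of row i.
    return [sum(row) for row in adj]


# Hypothetical example: the path graph 0 - 1 - 2.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
print(degrees(path))
```

Every returned degree lies between 0 and N - 1, as stated in the text.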
Network entropy
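Network Entropy is based on the classical Shannon Entropy for discrete distributions. Small8 proposed a measure of Network Entropy based on the probability that a random walker goes from node i to any other node j. This probability distribution \(P^{(i)}\) is defined for each node i and has entries
\[ p_{i\to j}=\begin{cases} 1/k_i, & \text{if } a_{ij}=1,\\ 0, & \text{otherwise.} \end{cases} \tag{1} \]
It is easy to observe that \({\sum }_{j}{p}_{i\to j}=1\) for each node i with \(k_i > 0\).
Based on the probability distribution \(P^{(i)}\), the entropy for each node can be defined as
\[ {\mathscr{S}}^{(i)} = -\sum_{j=1}^{N} p_{i\to j}\,\ln p_{i\to j} = \ln k_i, \tag{2} \]
with the convention \(0 \ln 0 = 0\) and with \({{\mathscr{S}}}^{(i)}=0\) if node i is disconnected.
After calculating the entropy for each node, we then calculate the normalized node entropy by
\[ {\mathcal H}^{(i)} = \frac{{\mathscr{S}}^{(i)}}{\ln (N-1)}. \tag{3} \]
Finally, the normalized Network Entropy is calculated by averaging the normalized node entropy over the whole network as
\[ {\mathcal H} = \frac{1}{N}\sum_{i=1}^{N} {\mathcal H}^{(i)}. \tag{4} \]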
The normalized Network Entropy is maximal, \( {\mathcal H} =1\), for fully connected networks, since \(p_{i\to j} = (N-1)^{-1}\) for every i ≠ j and the walk becomes fully random, i.e., jumps from node i to any other node j are equiprobable. The walk becomes predictable in a sparse network because sparsity limits the possible jumps. The sparser the network, the lower its Network Entropy.
The normalized Network Entropy \( {\mathcal H} \), hence, quantifies the heterogeneity of the network’s degree distribution, with lower values for nodes with lower degrees and higher values for nodes with higher degrees. For example, peripheral nodes present lower \({ {\mathcal H} }^{(i)}\) than hubs. Entropy, thus, ranges from \( {\mathcal H} \to 0\) (sparse networks) to \( {\mathcal H} \to 1\) (fully connected networks).
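As an illustration of the quantity described in this section, the sketch below computes a normalized network entropy under two assumptions that are ours: the walker jumps uniformly to one of the \(k_i\) neighbors of node i, and each node entropy is divided by \(\ln (N-1)\), the value that makes a fully connected network reach \( {\mathcal H} =1\). The function name is hypothetical.

```python
import math


def network_entropy(adj):
    """Normalized network entropy of a simple graph given as 0/1 rows (N >= 3)."""
    n = len(adj)
    norm = math.log(n - 1)  # entropy of N - 1 equiprobable jumps
    total = 0.0
    for row in adj:
        k = sum(row)
        if k > 0:
            p = 1.0 / k  # assumed uniform jump probability to each neighbor
            # Shannon entropy of node i: -sum_j p ln p over its k neighbors.
            total += -k * p * math.log(p) / norm
        # Disconnected nodes contribute zero entropy.
    return total / n


# Complete graph on four nodes and a four-node ring.
complete4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
ring4 = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
print(network_entropy(complete4), network_entropy(ring4))
```

Under these assumptions, a ring with N = 1000 in which every node has two neighbors gives \(\ln 2/\ln 999 \approx 0.100\).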
Network Fisher information measure
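The normalized Fisher Information Measure (FIM)9 for a node i is given by
\[ {\mathcal F}^{(i)} = \frac{1}{2}\sum_{j=1}^{N-1}\left(\sqrt{p_{i\to j+1}}-\sqrt{p_{i\to j}}\right)^{2}. \tag{5} \]
The normalized network Fisher Information Measure is given by
\[ {\mathcal F} = \frac{1}{N}\sum_{i=1}^{N}{\mathcal F}^{(i)}. \tag{6} \]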
If the system under study is in a very ordered state, i.e., a sparse network in which almost all \(p_{i\to j}\) values are zero, we have Shannon Entropy \( {\mathcal H} \to 0\) and normalized Fisher Information Measure \( {\mathcal F} \to 1\). On the other hand, when the system under study is in a very disordered state, that is, when all \(p_{i\to j}\) values are similar, we obtain \( {\mathcal H} \to 1\) and \( {\mathcal F} \to 0\). We can then define a Shannon-Fisher plane, which can also be used to characterize complex networks.
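A possible computation of the network Fisher measure is sketched below. It rests on assumptions that should be checked against the cited definition: the common discrete form \(\frac{1}{2}\sum_{j}(\sqrt{p_{i\to j+1}}-\sqrt{p_{i\to j}})^{2}\) taken along the node ordering of the adjacency matrix, a uniform random walker, and a zero contribution from isolated nodes. The function name is hypothetical.

```python
import math


def network_fisher(adj):
    """Average discrete Fisher information of the walker distributions."""
    n = len(adj)
    total = 0.0
    for row in adj:
        k = sum(row)
        if k == 0:
            continue  # assumption: isolated nodes contribute 0
        # sqrt(p_{i->j}) is sqrt(1/k) for neighbors and 0 otherwise.
        amp = [math.sqrt(a / k) for a in row]
        # Squared differences between consecutive entries of the distribution.
        total += 0.5 * sum((amp[j + 1] - amp[j]) ** 2 for j in range(n - 1))
    return total / n


# Hypothetical examples: a five-node path and a four-node complete graph.
path5 = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]
complete4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
print(network_fisher(path5), network_fisher(complete4))
```

The sparse path yields a value close to 1 and the dense graph a much lower one, in line with the ordered and disordered limits described above. Because the differences are taken between consecutive indices, the result depends on how the nodes are labeled.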
Results: Synthetic Networks
In this section, we analyze the behavior of Information Theory quantifiers when applied to Random (RN), Small-World (SWN), and Scale-Free networks (SFN). We simulated independent instances of these networks for several parameters and then analyzed how their Network Entropy and Fisher Information Measure vary. These synthetic networks may present some degree of stochasticity related to their parameter settings, which results in variations of the quantifiers; for this reason, when we observe variations in any measure \({\mathscr{X}}\), we represent it by its average value \(\bar{{\mathscr{X}}}\) along with its sample standard deviation \({s}_{{\mathscr{X}}}\). These variations, when too small, can be hard to distinguish in figures, but the numerical results make them explicit.
Erdős-Rényi: random networks
Boccaletti et al.10 state that “the term random graph refers to the disordered nature of the arrangement of links between different nodes.” According to ref. 11, Solomonoff and Rapoport12 initiated the study of random graphs, but Erdős and Rényi5 are best known for observing the properties of networks as the number of random connections increases, thus defining an ensemble of graphs \(G_{N,M}\) with N nodes and M edges. Later, Gilbert13 described an alternative method for generating random graphs by defining an ensemble of graphs \(G_{N,p}\) with N nodes connecting randomly according to a linking probability p that is analogous to the link density \(\xi ={(N(N-1))}^{-1}\,{\sum }_{i}^{N}{k}_{i}\). Although Erdős-Rényi (ER) random graphs are well studied in network science, they often fail at describing essential properties of real networks.
We analyzed fifty independent ER graphs \(G_{N,p}\) for every combination of N ∈ {50, 1000, 10000} and p ∈ {0, 0.001, 0.002, …, 0.999, 1}, making a total of 50 × 3 × 1001 graphs. Figure 1 shows the variation of the Shannon Entropy (Fig. 1a) and Fisher Information Measure (Fig. 1b) with respect to the link density, while Fig. 1c depicts the relationship between the Shannon Entropy and the Fisher Information Measure.
Figure 1a shows how the Shannon Entropy \( {\mathcal H} \) varies with respect to the link density ξ. The variation starts steep, then saturates. This may enhance the sensitivity of the Shannon-Fisher plane for sparse networks, but the plane may not be sensitive to differences among denser graphs. The relationship between \( {\mathcal H} \) and ξ also depends on the number of nodes N. For the same link density, the Shannon Entropy increases with N. However, the rate of this growth decreases with N.
Figure 1b suggests that the Fisher Information Measure presents two distinct regimes for ER graphs as a function of their link density. Initially, this measure grows steadily: for p = 0 the network is totally disconnected; as p increases, it reaches a critical point \(p_c\) that depends on the number of nodes N, after which the measure decreases in a quasi-linear fashion, \( {\mathcal F} \approx 1-p\) for every \(p > p_c\), regardless of N. For N = 50, the critical point is \(\overline{{p}_{c}}=0.08\) with standard deviation \(s_{p_c}=0.02\) and \(\bar{ {\mathcal F} }=0.90\), \({s}_{ {\mathcal F} }=0.02\); for N = 1000, \(\overline{{p}_{c}}=0.008\), \(s_{p_c}=0.002\) with \(\bar{ {\mathcal F} }=0.99\), \({s}_{ {\mathcal F} }=0.001\); for N = 10000, \(\overline{{p}_{c}}=0.001\) with \(\bar{ {\mathcal F} }=0.999\) and no variation observed. As the linking probability for ER graphs is analogous to the link density, the same holds for the link density, so \(\bar{ {\mathcal F} }\approx 1-\xi \) for every \(\xi > \xi_c\). This result relates to the expected phase transition in random graphs at \(p_c > \ln N/N\) (ref. 14), above which the network is almost surely connected.
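To make the comparison with the connectivity threshold concrete, the sketch below generates Gilbert-style \(G_{N,p}\) graphs, computes their link density, and evaluates \(\ln N/N\) for the three network sizes used here. The function names are illustrative assumptions, not the code used to produce the figures.

```python
import math
import random


def erdos_renyi(n, p, seed=None):
    """Edge list of a G(N, p) graph: each pair is linked with probability p."""
    rng = random.Random(seed)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p]


def link_density(n, edges):
    """xi = sum_i k_i / (N (N - 1)), where sum_i k_i is twice the edge count."""
    return 2 * len(edges) / (n * (n - 1))


def connectivity_threshold(n):
    """ln N / N, above which a random graph is almost surely connected."""
    return math.log(n) / n


for n in (50, 1000, 10000):
    print(n, connectivity_threshold(n))
```

The thresholds evaluate to roughly 0.078, 0.0069, and 0.0009, which are of the same order as the critical points 0.08, 0.008, and 0.001 reported above.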
Figure 1c shows the relationship between the Shannon Entropy and the Fisher Information Measure. As expected, the larger the network is, the less variability is observed. For this reason, we will only present results for N = 1000 hereinafter.
Watts-Strogatz: small-world networks
Small-world networks present the intrinsic characteristic of a relatively small average path length between nodes15. Watts and Strogatz6 (WS) proposed a model to build graphs \(G_{N,k}\) that can reproduce this small-world property with a high clustering coefficient. The model starts with a k-ring network with N nodes and a rewiring probability β. The rewiring consists of removing an existing edge and connecting one of its endpoints to another random node. When β = 0, we have a ring lattice, and for β = 1, the model produces a random graph. For intermediate values, the model produces networks with the small-world property and a nontrivial clustering coefficient10. Figure 2 shows the same analysis as presented previously for random networks.
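The rewiring procedure can be sketched as follows. This is a simplified illustration under our own assumptions: k counts the neighbors on each side of a node (so a k-ring has degree 2k), N > 2k, and a rewired edge keeps its first endpoint while avoiding self-loops and duplicate links. The function name is hypothetical.

```python
import random


def watts_strogatz(n, k, beta, seed=None):
    """Edge set of a WS-style graph: k-ring, then rewiring with probability beta."""
    rng = random.Random(seed)
    edges = set()
    # Ring lattice: connect every node to its k nearest neighbors on one side;
    # the other side is covered when the neighbor's own edges are added.
    for i in range(n):
        for d in range(1, k + 1):
            edges.add(frozenset((i, (i + d) % n)))
    # Rewiring pass over the original lattice edges.
    for i in range(n):
        for d in range(1, k + 1):
            old = frozenset((i, (i + d) % n))
            if old in edges and rng.random() < beta:
                # Candidate endpoints: no self-loop, no existing link to i.
                candidates = [w for w in range(n)
                              if w != i and frozenset((i, w)) not in edges]
                if candidates:
                    edges.remove(old)
                    edges.add(frozenset((i, rng.choice(candidates))))
    return edges


lattice = watts_strogatz(20, 2, 0.0, seed=1)
rewired = watts_strogatz(20, 2, 0.5, seed=1)
print(len(lattice), len(rewired))
```

Rewiring moves links without creating or deleting any, so the link density is the same for every β at a given k.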
Figure 2a shows that the relationship between Network Entropy and link density is consistent with what was observed in Random Networks and that there is little variation with respect to β. Therefore, the Network Entropy \( {\mathcal H} \) by itself does not provide information to identify different Small-World models.
Figure 2b shows the relationship between the Fisher measure and link density, indexed by the rewiring probability β (shades of blue). As expected, the behavior in the limit β = 1 is the same (linear decay) as the one observed for RN; cf. Fig. 1b. There is a lower bound, which corresponds to k-rings (red dots).
The red arrows (Fig. 2c) identify the change of regime. For k = 1 (red downward arrow), increasing the rewiring probability decreases the Fisher measure \( {\mathcal F} \), while for k ≥ 2 (after the red upward arrow), this behavior is inverted, and increasing β also increases \( {\mathcal F} \).
The Shannon-Fisher plane provides a rich description of Watts-Strogatz (WS) networks. Similarly to what was observed with ER graphs, there are two distinct regimes when evaluating WS graphs. Firstly, we see in Fig. 2 that for k = 1, we start with a ring lattice where \( {\mathcal H} =0.1\), \( {\mathcal F} =0.999\), and, as β increases, \( {\mathcal F} \) decreases until the network becomes a random graph; this happens because when k = 1 and β > 0, the rewiring mechanism isolates some nodes, and the resulting disconnected components lower the \( {\mathcal F} \) values. Secondly, for k > 1, WS behaves differently, as the rewiring mechanism most likely will not leave isolated nodes when β > 0; in fact, it will create larger components without fully disconnecting the other ones, which increases the \( {\mathcal F} \) values. The red arrows in Fig. 2 identify this change of regime. In addition, the Shannon-Fisher plane corroborates the evidence of the transition between ring lattices and random graphs for the WS model.
Figure 3 summarizes the main differences between Erdős-Rényi (Fig. 3a) and Watts-Strogatz (Fig. 3b) networks in the \( {\mathcal H} \times {\mathcal F} \,\times \,\xi \) space. While Erdős-Rényi networks are equivalently well-described by the Network Entropy and the Fisher Information Measure (they span a 1D region of the space), Watts-Strogatz graphs are better characterized by the latter, as different networks span a 2D manifold. Permanent links to interactive versions of these 3D plots are available at http://tiny.cc/ERN and at http://tiny.cc/SWN.
Barabási-Albert: scale-free networks
The literature often uses scale-free networks as models for real networks. They have a degree distribution that can be fitted by a power-law, i.e., \(P(k) \sim k^{-\gamma}\), where γ is the degree exponent, usually in 2 ≤ γ ≤ 3, as for γ > 3 the scale-free property can easily be confused with random networks16. We will evaluate the Barabási-Albert7 model (BA) for evolving scale-free networks, as it has two important features: network growth and the preferential attachment mechanism.
For network growth, at each time step t, new nodes are inserted with m links connecting with \(N_0\) existing nodes in the network. These links are created according to a probability given by the preferential attachment: the probability that a new node connects to an existing node i is proportional to the current degree of node i:
\[ \Pi(k_i) = \frac{k_i}{\sum_{j} k_j}. \tag{7} \]
In this way, the preferential attachment (PA) induces hubs (highly connected nodes) and peripheral communities, where nodes have similar degree. The Barabási-Albert model is unable to reproduce all the diversity existing in scale-free networks, as it captures only the power-law with degree exponent γ = 3. Therefore, many variations of this model have been proposed throughout the years. In this work, we extend our analysis to: non-linear preferential attachment; the fitness property; the aging property; and finally, the configuration model.
Non-linear preferential attachment
Krapivsky et al.17 introduced a non-linear preferential attachment that creates different regimes for the network according to an exponent α controlling the network topology. The non-linear PA is given by
\[ \Pi(k_i) \sim k_i^{\alpha}. \tag{8} \]
For α ≠ 1, the growth model no longer results in a power-law degree distribution. There are, thus, three different growth regimes:
- The sublinear regime (α < 1) has a power-law with an exponential cutoff, where the preferential attachment is not strong enough to produce a pure power-law degree distribution.
- The linear regime (α = 1) has a pure power-law behavior corresponding to the Barabási-Albert7 model, with a resulting γ = 3.
- The superlinear regime (α > 1) presents a particular behavior where the network condenses, i.e., very few nodes win all connections; it also does not result in a power-law degree distribution.
We mapped outcomes of the BA model with a non-linear preferential attachment, using Krapivsky’s model, onto the Shannon-Fisher plane, as shown in Fig. 4a. For α = 0, we have a random network: since \(\Pi(i) \propto k_i^{0} = 1\) for every i, the network no longer obeys the preferential attachment mechanism, just the evolving growth property; the result is \(\bar{ {\mathcal H} }=0.074\), \({s}_{ {\mathcal H} }=0.001\) and \(\bar{ {\mathcal F} }=0.997\), \({s}_{ {\mathcal F} }=0.001\). Increasing α in steps of 0.01 changes the regime of the network slowly, and we see this transition in the Shannon-Fisher plane until it reaches α = 1. In the linear regime, \(\bar{ {\mathcal H} }=0.063\), \({s}_{ {\mathcal H} }=0.001\) and \(\bar{ {\mathcal F} }=0.993\), \({s}_{ {\mathcal F} }=0.001\). In the superlinear regime, \( {\mathcal H} \to 0\) and \( {\mathcal F} \) starts oscillating for α > 1.4, as seen in Fig. 4b. This oscillation happens because, after the network condenses, a small change in the network topology may cause \( {\mathcal F} \) to drop from \( {\mathcal F} =1\) to \( {\mathcal F} =0.5\), as the Fisher Information Measure is sensitive to local disturbances.
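A growth process with a non-linear attachment kernel can be sketched as below. The seed graph (a complete graph on m + 1 nodes), the rule that a new node picks m distinct targets, and the function name are assumptions made for illustration; the simulations behind the figures may differ in these details.

```python
import random


def nonlinear_pa(n, m, alpha, seed=None):
    """Grow a network where node i is chosen with weight k_i ** alpha."""
    rng = random.Random(seed)
    # Assumed seed graph: complete graph on m + 1 nodes, each with degree m.
    degree = [m] * (m + 1)
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    for new in range(m + 1, n):
        # alpha = 0 gives uniform attachment, alpha = 1 the linear BA kernel.
        weights = [k ** alpha for k in degree]
        targets = set()
        while len(targets) < m:  # draw until m distinct targets are found
            targets.add(rng.choices(range(new), weights=weights)[0])
        for t in targets:
            edges.append((new, t))
            degree[t] += 1
        degree.append(m)
    return degree, edges


degree, edges = nonlinear_pa(200, 2, 1.0, seed=0)
print(max(degree), len(edges))
```

With m = 1 and N = 1000 this sketch produces 999 links, i.e., a link density of 0.002, the value reported for this model. For large α with m > 1, the rejection loop can become slow because almost all the weight sits on a single node.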
Similar to earlier sections, we evaluated the link density ξ along with Network Entropy \( {\mathcal H} \) and Fisher Information Measure \( {\mathcal F} \). This time, the comparison of link density with Network Entropy, shown in Fig. 5b, exhibits a more interesting behavior. Although the link density does not change (ξ = 0.002), \( {\mathcal H} \) absorbs the changes, and \( {\mathcal H} \to 0\) as α increases. In Fig. 5a, we observe that plotting the Fisher Information Measure \( {\mathcal F} \) against link density ξ produces confusing results, as the oscillation for α > 1.4 alternates between \( {\mathcal F} \approx 0.5\) and \( {\mathcal F} \approx 1\) with ξ = 0.002.
Fitness property
Some networks have nodes that create connections with more ability, e.g., a popular web page. Usually, these nodes gain relationships faster than common nodes. The Bianconi-Barabási model18,19 describes this property, named fitness. We can model it using the preferential attachment considering a fitness coefficient \(\eta_i\) alongside the node degree \(k_i\):
\[ \Pi(i) = \frac{\eta_i k_i}{\sum_{j} \eta_j k_j}. \tag{9} \]
In Eq. 9, the dependence of Π(i) on \(\eta_i\) models the fact that even younger nodes can acquire links faster if they have sufficiently higher fitness than older nodes. We therefore generated 30 networks with N = 1000, drawing the fitness \(\eta_i\) of each node i from a uniform distribution. In this setting, we do not expect a perfect power law, but we expect γ = 2.255 asymptotically.
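A small numerical illustration of fitness-weighted attachment follows, assuming the usual Bianconi-Barabási form in which node i is chosen with probability proportional to \(\eta_i k_i\). The function name and the values are hypothetical.

```python
def fitness_probabilities(degree, fitness):
    """Attachment probabilities proportional to eta_i * k_i."""
    weights = [eta * k for eta, k in zip(fitness, degree)]
    total = sum(weights)
    return [w / total for w in weights]


# With equal fitness the rule reduces to plain preferential attachment.
print(fitness_probabilities([1, 1, 2], [1.0, 1.0, 1.0]))
# A fitter low-degree node can overtake a higher-degree one.
print(fitness_probabilities([1, 1, 2], [0.5, 1.0, 0.25]))
```

In the second call the middle node, despite having half the degree of the last one, receives the largest attachment probability.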
Figure 6a shows consistency between the Bianconi-Barabási and the Barabási-Albert models, as \( {\mathcal H} \) grows slowly while ξ does not change; the numerical results are \(\bar{{\rm{\xi }}}=0.00278\) with standard deviation \(s_{\xi}=0.00006\). Figure 6b also shows a behavior similar to that of Fig. 5a. Finally, Fig. 6c shows that most networks generated by the fitness model lie in a region close to the results presented for the BA model, with \(\bar{ {\mathcal H} }=0.091\), \({s}_{ {\mathcal H} }=0.005\) and \(\bar{ {\mathcal F} }=0.979\), \({s}_{ {\mathcal F} }=0.038\).
Aging property
Another aspect we can consider for a scale-free network is the aging property20. In the Barabási-Albert model, we regularly account only for the node degree or, as seen before, the fitness coefficient. However, what happens when a node starts to reduce the rate of acquiring new links with time? This aging process causes the nodes to lose relevance; thus, it changes the network structure and dynamics. We can model this property considering:
\[ \Pi(k_i, t-t_i) \sim k_i\,(t-t_i)^{-\nu}, \tag{10} \]
where \(t_i\) is the time at which node i joined the network, so that \(t-t_i\) is its age, and ν is a parameter controlling the dependence of the attachment probability on the node’s age. According to ν, we can define three scaling regimes:
- If ν < 0, new nodes will connect to older nodes. If ν → −∞, each new node connects to the oldest node, resulting in a condensed network or hub-and-spoke topology. Hence, we have a more heterogeneous network with a few hubs and many peripheral nodes.
- If ν > 0, nodes connect to younger nodes. By aging, nodes lose the ability of preferential attachment. In this case, the network tends to be more homogeneous.
- For ν > 1, the aging effect dominates the preferential attachment effect, the network loses its scale-free property, and it eventually approaches a random network. When ν → ∞, each node connects to its immediate predecessor.
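The effect of the sign of ν can be illustrated numerically. The sketch assumes an attachment weight of the form \(k_i (t-t_i)^{-\nu}\), where \(t_i\) is the (hypothetical) time at which node i was added and every node is strictly older than zero at time t; the function name is ours.

```python
def aging_probabilities(degree, birth, t, nu):
    """Attachment probabilities proportional to k_i * (t - t_i) ** (-nu)."""
    weights = [k * (t - ti) ** (-nu) for k, ti in zip(degree, birth)]
    total = sum(weights)
    return [w / total for w in weights]


# Two nodes of equal degree: an old one (born at 0) and a young one (born at 8).
for nu in (-1, 0, 1):
    print(nu, aging_probabilities([2, 2], [0, 8], t=10, nu=nu))
```

For ν = −1 the old node is favored, for ν = 0 age plays no role, and for ν = 1 the young node is favored, matching the regimes listed above.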
To evaluate the aging property, we generate distinct networks with N = 1000 and ν ∈ {−3.0, −2.9, −2.8, ..., 2.9, 3.0} with 30 replications of each setting; thus, we have a total of 1830 networks. Figure 7a shows the results for networks with a growing Network Entropy \( {\mathcal H} \) and a steady link density ξ = 0.002, the same result as for the BA model. Figure 7b shows the results for the aging property, and once more, we can observe the “oscillation” that happens to all the other scale-free models previously discussed. Figure 7c presents the results considering the Network Entropy \( {\mathcal H} \) and Fisher Information Measure \( {\mathcal F} \), where we can see the Aging model transition in the plane according to its scaling regimes.
The numerical results for the Aging model are the following: for ν = −3, \(\bar{ {\mathcal H} }=0.024\), \({s}_{ {\mathcal H} }=0.0014\) and \(\bar{ {\mathcal F} }=0.845\), \({s}_{ {\mathcal F} }=0.042\), i.e., we have a condensed network; when ν > 0, \( {\mathcal H} \) and \( {\mathcal F} \) continue to grow until ν = 1, where \(\bar{ {\mathcal H} }=0.072\), \({s}_{ {\mathcal H} }=0.0009\) and \(\bar{ {\mathcal F} }=0.992\), \({s}_{ {\mathcal F} }=0.0021\); for ν > 1, \( {\mathcal H} \) grows steadily and \( {\mathcal F} \) decays, reaching a random regime. Finally, we noticed that the scale-free regime expected for ν ∈ [0, 1] is observed in the Shannon-Fisher plane, with \(\bar{ {\mathcal H} }=0.063\), \({s}_{ {\mathcal H} }=0.005\) and \(\bar{ {\mathcal F} }=0.990\), \({s}_{ {\mathcal F} }=0.0064\).
The configuration model
A recurrent problem is “how do we generate networks with an arbitrary P(k)?”. For this, we use the configuration model, also known as a random network with a pre-defined degree sequence21. According to ref. 16, the algorithm consists of the following steps:
1. Assign a degree to each node as stubs or half-links. It is required that we start from an even number of stubs; otherwise, we will have unpaired stubs.
2. Randomly select a pair of half-links and connect them; then randomly choose another pair from the remaining 2L − 2 half-links and connect them.
3. Repeat this process until all stubs are paired up. Depending on how we pair them up, we may obtain distinct networks. Some networks include cycles, self-loops, or multi-links. In this work, we consider only simple graphs; thus, after generating the network for a degree sequence, we “simplify” the graph, removing self-loops and multi-links.
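The steps above can be sketched as follows. Shuffling the list of stubs and pairing consecutive entries is used here as an equivalent way of drawing random pairs; the function name and the example degree sequence are illustrative assumptions.

```python
import random


def configuration_model(degree_sequence, seed=None):
    """Simple graph obtained by random stub matching and simplification."""
    if sum(degree_sequence) % 2:
        raise ValueError("the number of stubs must be even")
    rng = random.Random(seed)
    # Step 1: one stub (half-link) per unit of degree.
    stubs = [node for node, k in enumerate(degree_sequence) for _ in range(k)]
    # Steps 2-3: a random permutation paired two by two matches all stubs.
    rng.shuffle(stubs)
    edges = set()
    for u, v in zip(stubs[::2], stubs[1::2]):
        if u != v:  # simplification: drop self-loops
            edges.add((min(u, v), max(u, v)))  # the set drops multi-links
    return edges


edges = configuration_model([3, 2, 2, 1, 1, 1], seed=7)
print(sorted(edges))
```

Because of the simplification, the realized degrees can be smaller than the prescribed ones, so the degree sequence is only approximately preserved.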
As we are trying to reproduce scale-free properties using the configuration model, we assign a pure power-law distribution \(P(k) \sim k^{-\gamma}\) with γ ∈ [2, 5]. For this model, we expect that for 2 ≤ γ ≤ 3, the network is in the scale-free regime; when γ > 3, the network starts to condense, as the distribution has a steep curve. This means that few nodes have most of the links and most nodes have few links. Such networks present structure and dynamics more similar to a hub-and-spoke topology.
Finally, we evaluate these networks with N = 1000 using a pure power-law distribution with γ ∈ {2.0, 2.1, 2.2, ..., 4.9, 5.0}; as we cannot guarantee that networks with the same degree exponent have the same topology, we replicate this experiment 30 times, and then we have 1312 networks. In Fig. 8a we have \(\bar{{\rm{\xi }}}=0.001\), with \(s_{\xi}=0.0006\), while the Network Entropy \( {\mathcal H} \) decreases as we increase the degree exponent. This behavior is similar for all the other scale-free models when we are in the condensed regime. Figure 8b shows that the \( {\mathcal F} \) values are stable for this model, with \(\bar{ {\mathcal F} }=0.998\), \({s}_{ {\mathcal F} }=0.001\). Later, in Fig. 8c, we observe that the Network Entropy \( {\mathcal H} \) is actually the quantifier capturing the changes: for the degree exponent γ ∈ [2, 3] we have \(\bar{ {\mathcal H} }=0.045\), \({s}_{ {\mathcal H} }=0.017\) and \(\bar{ {\mathcal F} }=0.998\), \({s}_{ {\mathcal F} }=0.001\); and for γ ∈ [3, 5], we have \(\bar{ {\mathcal H} }=0.010\), \({s}_{ {\mathcal H} }=0.005\) and \(\bar{ {\mathcal F} }=0.998\), \({s}_{ {\mathcal F} }=0.001\).
Figure 9 summarizes the main features observed in the Shannon-Fisher plane for simulated networks, alongside examples illustrating different topologies and their expected results in the Shannon-Fisher plane. The fitness, aging, and configuration models were left out of this plot, as they are represented well enough by the Barabási-Albert model with nonlinear preferential attachment and its distinct scaling regimes. Considering the standard definition for the scale-free property and the network models evaluated, we observe how scale-free networks are subtle and can be easily confused with others in the Shannon-Fisher plane, as the usual result for scale-free networks is confined to a tiny region of the plane, and a few inputs can dismantle the scale-free property. This observation reflects a recent discovery that states that “scale-free networks are rare”22.
Results: Real Networks
We evaluate real networks, assessing their topological features such as clustering coefficient \(C^{\Delta}\), average path length L, and whether the degree distribution follows a power law (\(P(k) \sim k^{-\gamma}\)); we also consider the degree exponent γ. Table 1 shows the real networks analyzed in this work. Each network is presented with its number of nodes N, average degree ⟨k⟩, and link density ξ. We report their small-world indicators: average path length L, clustering coefficient \(C^{\Delta}\), and small-world-ness value \(S^{\Delta}\) (see below). We also provide scale-free indicators: degree exponent γ and p-value for the power-law fitting. Finally, we provide the Network Entropy \( {\mathcal H} \) and Fisher Information Measure \( {\mathcal F} \), along with a literature reference.
As shown in Fig. 10a,b, most real networks are sparse, with ξ ≤ 0.009. The only exception is network #14, with ξ = 0.157. Therefore, our analysis will rely on our ability to distinguish sparse networks, with link density ξ that does not present distortions in Network Entropy \( {\mathcal H} \) (Fig. 1a) nor Fisher Information Measure \( {\mathcal F} \) (Fig. 1b). Nevertheless, care is needed not to jump to conclusions without further analysis, as differences in the Shannon-Fisher plane for sparse networks are subtle; this is where other metrics are welcome in helping to confirm our findings.
Foremost, considering the results for the Watts-Strogatz model in the Shannon-Fisher plane, it is expected that networks in between the upper (i.e., random) and lower (k-ring) limits are very likely to be small-world networks. Although, this should not be confused with saying that these networks have the same topology. Our purpose here is to study how the information flows through the nodes. Networks may present similar topology that will result in similar dynamics, but distinct topologies may have similar dynamics; this feature is noteworthy.
From the results in Fig. 11a and in Table 1, we may state that networks #1, #2, #4, #6, #10, #13 and #14 present small-world behavior. We observe that the average path length L for these networks in Table 1 is small considering their number of nodes N. Cohen and Havlin23 demonstrated that WS networks under some expected conditions have an average path length that scales as \(\log N\). The largest network in our small-world set has N = 4941, for which \(\log N \approx 8.50\) (natural logarithm), and every small-world network in our study has L < 8.5; following this criterion, they present the small-world behavior.
Another aspect that we can observe for small-world networks is their clustering coefficient (\(C^{\Delta}\)), but there is no certainty about which values to expect. In an attempt to perform a quantitative analysis of the clustering coefficient in small-world networks, considering its relationship with the average path length, Humphries et al.24 proposed the small-world-ness \(S^{\Delta}\), defined as
\[ S^{\Delta} = \frac{C^{\Delta}/C^{\Delta}_{rand}}{L/L_{rand}}, \tag{11} \]
where \(C^{\Delta}\) and L are, respectively, the clustering coefficient and average path length, and \(C^{\Delta}_{rand}\) and \(L_{rand}\) are the results computed for an ensemble of 100 ER networks, simulated with the same link density ξ as the real network. With this approach, Humphries et al.24 state that for \(S^{\Delta} > 1\), the network can be considered small-world.
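As a worked illustration, the ratio can be computed as below, assuming the usual form of the index: the clustering ratio divided by the path-length ratio. The function name and the numerical values are hypothetical and are not taken from Table 1.

```python
def small_worldness(c, l, c_rand, l_rand):
    """Clustering ratio divided by path-length ratio against a random ensemble."""
    return (c / c_rand) / (l / l_rand)


# Hypothetical network: ten times more clustered than its random counterpart,
# with a path length 20% longer.
print(small_worldness(c=0.5, l=3.0, c_rand=0.05, l_rand=2.5))
```

Because sparse random graphs have very small clustering coefficients, even modest clustering in a real network pushes the ratio above 1, which helps explain why this index alone labels almost every network as small-world.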
However, Table 1 shows that only network #11 has \(S^{\Delta} < 1\); if we analyzed only the small-world-ness value, we would conclude that all of the other networks are also small-world. This is inaccurate, if not wrong.
Alongside the small-world-ness value, as an attempt to identify scale-free networks, we can also estimate the degree exponent γ of the power-law degree distribution. Using a method proposed by Newman25, we reject the fit whenever p-value < 0.05, and if the estimated γ lies between two and three (2 < γ < 3), we consider these networks scale-free; for γ > 3, these networks may present hubs, but they become hard to distinguish from random or small-world networks.
We proceed with our study by zooming into the Shannon-Fisher plane, as we can see in Fig. 11b. First, we observe networks #5 and #8, which are outside the small-world region, and network #3, which overlaps the upper limit. We state that these three networks are random; Table 1 shows that they have γ > 3 with p-value > 0.05.
Although the identified random networks are outside the small-world region, they could easily be confused with scale-free or even regular networks considering just the results provided by the Shannon-Fisher plane, therefore, we ought to be careful with networks outside the small-world region, and for these cases, we must have a look into the degree distribution and estimate the degree exponent.
Network #7 is mapped into the \( {\mathcal H} \times {\mathcal F} \) plane close to the classic Barabási-Albert model, with γ = 2.124 and p-value = 1; thus, we cannot reject the power-law fit, and we expect this network to be scale-free. Network #12 has γ = 2.782 with p-value = 1 and is also close to network #7, although it is even closer to network #8.
Another "odd" result is that network #6, although its results and properties signal a small-world network, has degree exponent γ = 2.187 with p-value = 0.896, which also indicates the scale-free property. Finally, network #9 is mapped to a point in \( {\mathcal H} \times {\mathcal F} \) equal to those generated with superlinear preferential attachment, which leads to a "condensed" network. Indeed, with L = 3.771 and a degree distribution that does not fit a power law at all, we state that this network is "condensed."
Discussion
Complex networks have many faces; thus, attempting to label them by a single network property may be misleading. Real networks have many components and distinct interactions among them; for example, a scale-free network may have peripheral communities that lead to small-world structure. Our proposal quantifies network structure and dynamics in a simplified plot. We show results consistent with other network features when this methodology is applied to synthetic networks.
The Shannon-Fisher plane enhances our ability to evaluate complex networks, revealing:
- The transition that the Watts-Strogatz model exhibits between k-ring and random graphs, leading us to define the small-world region;
- The two distinct regimes of the Erdős-Rényi model when reaching the critical linking probability;
- The three regimes of non-linear preferential attachment in scale-free networks and distinct growth models, which transit between random, scale-free and condensed networks;
- The behavior of the fitness model when we consider a uniform distribution for the fitness of each node, and how it has similar features to the Barabási-Albert model;
- The effect of aging on scale-free networks and how the aging exponent can control the system's behavior in the same manner as non-linear preferential attachment;
- And finally, how we can generate networks with a pure power law considering distinct degree exponents.
The evaluation of real networks gave us a peek into the real world and its deceitful aspects. Our method succeeds in characterizing most of the real networks in comparison with synthetic networks, even though a few examples showed unexpected behaviors that will be explored in future work. That said, our proposal is not a perfect fit for labeling networks as small-world or scale-free, but it opens a world of possibilities when evaluating information spread, network robustness, or controllability. Our approach allows us to identify distinct interactions in real networks by observing how they transit within the Shannon-Fisher plane and comparing how they affect other network features.
Data availability
The datasets generated during and/or analysed during the current study are available in the fisher-networks repository, https://gitlab.com/cristophersfr/fisher-networks.
Change history
12 January 2021
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
References
Wiedermann, M., Donges, J. F., Kurths, J. & Donner, R. V. Mapping and discrimination of networks in the complexity-entropy plane. Phys. Rev. E 96, 042304 (2017).
Rosso, O. A., Larrondo, H., Martín, M. T., Plastino, A. & Fuentes, M. Distinguishing noise from chaos. Phys. Rev. Lett. 99, 154102 (2007).
Bandt, C. & Pompe, B. Permutation entropy: a natural complexity measure for time series. Phys. Rev. Lett. 88, 174102 (2002).
Rosso, O. A., Olivares, F. & Plastino, A. Noise versus chaos in a causal Fisher-Shannon plane. Pap. Phys. 7, 070006 (2015).
Erdős, P. & Rényi, A. On random graphs. Publ. Math. 6, 290–297 (1959).
Watts, D. J. & Strogatz, S. H. Collective dynamics of 'small-world' networks. Nat. 393, 440 (1998).
Barabási, A.-L. & Albert, R. Emergence of scaling in random networks. Sci. 286, 509–512 (1999).
Small, M. Complex networks from time series: Capturing dynamics. In Circuits and Systems (ISCAS), 2013 IEEE International Symposium on, 2509–2512 (IEEE, 2013).
Sánchez-Moreno, P., Yánez, R. & Dehesa, J. Discrete densities and Fisher information. In Proceedings of the 14th International Conference on Difference Equations and Applications, 291–298 (Bahçesehir University Press, Istanbul, Turkey, 2009).
Boccaletti, S., Latora, V., Moreno, Y., Chavez, M. & Hwang, D.-U. Complex networks: Structure and dynamics. Phys. Reports 424, 175–308 (2006).
Newman, M. Networks (Oxford University Press, 2018).
Solomonoff, R. & Rapoport, A. Connectivity of random nets. The Bull. Math. Biophys. 13, 107–117 (1951).
Gilbert, E. N. Random graphs. The Annals Math. Stat. 30, 1141–1144 (1959).
Erdős, P. & Rényi, A. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 5, 43 (1960).
Travers, J. & Milgram, S. The small world problem. Psychol. Today 1, 61–67 (1967).
Barabási, A.-L. Network science (Cambridge University Press, 2016).
Krapivsky, P. L., Redner, S. & Leyvraz, F. Connectivity of growing random networks. Phys. Rev. Lett. 85, 4629 (2000).
Bianconi, G. & Barabási, A.-L. Competition and multiscaling in evolving networks. EPL (Europhysics Lett.) 54, 436 (2001).
Adamic, L. A. & Huberman, B. A. Power-law distribution of the world wide web. Sci. 287, 2115 (2000).
Dorogovtsev, S. N. & Mendes, J. F. F. Evolution of networks with aging of sites. Phys. Rev. E 62, 1842 (2000).
Molloy, M. & Reed, B. A critical point for random graphs with a given degree sequence. Random Struct. & Algorithms 6, 161–180 (1995).
Broido, A. D. & Clauset, A. Scale-free networks are rare. Nat. Commun. 10, 1017 (2019).
Cohen, R. & Havlin, S. Complex networks: structure, robustness and function (Cambridge University Press, 2010).
Humphries, M. D. & Gurney, K. Network 'small-world-ness': a quantitative method for determining canonical network equivalence. PloS One 3, e0002051 (2008).
Newman, M. E. Power laws, Pareto distributions and Zipf's law. Contemp. Phys. 46, 323–351 (2005).
Guimera, R., Danon, L., Diaz-Guilera, A., Giralt, F. & Arenas, A. Self-similar community structure in a network of human interactions. Phys. Rev. E 68, 065103 (2003).
Moody, J. Peer influence groups: identifying dense clusters in large networks. Soc. Networks 23, 261–283 (2001).
Newman, M. E. The structure of scientific collaboration networks. Proc. Natl. Acad. Sci. 98, 404–409 (2001).
Newman, M. E. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74, 036104 (2006).
Šubelj, L., Žitnik, S., Blagus, N. & Bajec, M. Node mixing and group structure of complex software networks. Adv. Complex Syst. 17, 1450022 (2014).
CAIDA. The CAIDA AS relationships dataset 2004–2007, http://www.caida.org/data/as-relationships (2007).
Šubelj, L. & Bajec, M. Ubiquitousness of link-density and link-pattern communities in real-world networks. The Eur. Phys. J. B 85, 32 (2012).
Knuth, D. E. The Stanford GraphBase: a platform for combinatorial computing (ACM Press New York, 1993).
Kaiser, M. & Hilgetag, C. C. Spatial growth of real-world networks. Phys. Rev. E 69, 036103 (2004).
Takemura, S.-Y. et al. A visual motion detection circuit suggested by Drosophila connectomics. Nat. 500, 175 (2013).
Helmstaedter, M. et al. Connectomic reconstruction of the inner plexiform layer in the mouse retina. Nat. 500, 168 (2013).
Acknowledgements
We acknowledge the support for this research by FAPEAL, FAPESP and CNPq (Brazil).
Author information
Contributions
Implementation details and experiments were conceived by C.G.S.F., H.S.R. and A.L.L.A. The authors O.A.R. and A.C.F. conducted the experiments. All the authors analyzed the results, wrote and reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Freitas, C.G.S., Aquino, A.L.L., Ramos, H.S. et al. A detailed characterization of complex networks using Information Theory. Sci Rep 9, 16689 (2019). https://doi.org/10.1038/s41598-019-53167-5
DOI: https://doi.org/10.1038/s41598-019-53167-5