A detailed characterization of complex networks using Information Theory

Freitas, Cristopher G. S.; Aquino, Andre L. L.; Ramos, Heitor S.; Frery, Alejandro C.; Rosso, Osvaldo A.

doi:10.1038/s41598-019-53167-5

Published: 13 November 2019

Scientific Reports volume 9, Article number: 16689 (2019)

An Author Correction to this article was published on 12 January 2021

This article has been updated

Abstract

Understanding the structure and the dynamics of networks is of paramount importance for many scientific fields that rely on network science. Complex network theory provides a variety of features that help in the evaluation of network behavior. However, such analysis can be confusing and misleading as there are many intrinsic properties for each network metric. Alternatively, Information Theory methods have gained the spotlight because of their ability to create a quantitative and robust characterization of such networks. In this work, we use two Information Theory quantifiers, namely Network Entropy and Network Fisher Information Measure, to analyze those networks. Our approach detects non-trivial characteristics of complex networks such as the transition present in the Watts-Strogatz model from k-ring to random graphs; the phase transition from a disconnected to an almost surely connected network when we increase the linking probability of the Erdős-Rényi model; and distinct phases of scale-free networks when considering a non-linear preferential attachment, fitness, and aging features alongside the configuration model with a pure power-law degree distribution. Finally, we analyze the numerical results for real networks, contrasting our findings with traditional complex network methods. In conclusion, we present an efficient method that ignites the debate on network characterization.

Introduction

Understanding how networks arrange their connections (structure), and how the information flows through their nodes (dynamics), is a breakthrough for many scientific fields that rely on network science to assess all kinds of phenomena. Recently, Information Theory methods gained the spotlight because of their ability to create a more quantitative and robust characterization of complex networks, as an alternative to traditional methods. Standard quantifiers such as Shannon Entropy and Statistical Complexity were adapted to network analysis, providing a different perspective when evaluating networks¹.

The quantification of systems with multidimensional measures, in particular a 2D representation or "plane representation" formed by Information Theory-based quantifiers, is extensively applied to time series analysis and characterization. For instance, Rosso et al.² employed the time causal Entropy-Complexity plane to distinguish stochastic from deterministic systems. The Entropy-Complexity plane is comprised of measures of entropy ($ {\mathcal H} $) and Statistical Complexity (${\mathscr{C}}$). Two pieces of information are required to calculate ${\mathscr{C}}$, namely the information content and the disequilibrium (${\mathscr{Q}}$) of the system. To evaluate the Statistical Complexity, we use the entropy as a measure of the information content, and the disequilibrium is expressed by the divergence between the current system state and an appropriate reference state. The calculation of these quantifiers requires the use of a proper probability distribution that represents the system under study. In the case of time series, the distribution derived from the symbolization proposed by Bandt and Pompe³ has been successfully used to capture the intrinsic time causality behavior of the underlying system (see Supplemental Material). The Shannon Entropy is a global disorder measure commonly used in many applications of the Information Theory field and the Entropy-Complexity plane. It is relatively insensitive to substantial changes in the distribution taking place in a small-sized region of the space. For these reasons, the Shannon Entropy is referred to as a global measure. The Statistical Complexity (${\mathscr{C}}$), when defined as a divergence in the space of entropies, is also a global measure.
Alternatively, the Fisher Information Measure ($ {\mathcal F} $) can be interpreted as a measure of the ability to estimate a parameter, as the amount of information that can be extracted from a set of measurements, and also, a measure of the state of disorder of a system or phenomenon⁴. The Fisher Information Measure ($ {\mathcal F} $) is a local measure as it is based upon the gradient of the underlying distribution, being, thus, significantly sensitive to even tiny localized perturbations.

Recently, the Entropy-Complexity plane was extended and used in the context of complex networks. In ref.¹, the authors showed that networks of the same category tend to cluster into distinct regions. To calculate the two required quantifiers, $ {\mathcal H} $ and ${\mathscr{C}}$, for complex networks, the authors used the probability distribution of a random walker traveling between two nodes to represent the topological properties of the network. Based on this distribution, they calculated the Shannon Entropy and the Statistical Complexity. For the evaluation of the disequilibrium, they used the Jensen-Shannon divergence between the actual network and random networks, taking the latter as the reference model. To obtain this divergence it is necessary to average several random networks with the same number of nodes, which is typically time-consuming. They demonstrated the applicability of their proposal to families of Random Erdős-Rényi⁵, Small-World Watts-Strogatz⁶, and Scale-Free Barabási-Albert⁷ networks. However, this plane presents several limitations: the random networks regularly overlap with all the other models, creating confusion and misleading conclusions when evaluating the network's features.

In this work, we propose the Fisher information quantifier, more sensitive to a local relationship between nodes, as a measure of network disorder. Alongside, we suggest the use of the Shannon-Fisher plane as an alternative to the Entropy-Complexity plane for network characterization. Our approach does not require the calculation of a divergence to a reference model, which decreases the computational burden. We analyze two different groups of networks: synthetic and real-world networks.

Methods

Network definition

We assume a graph G(V, E), where V is the set of nodes and E is the set of links (edges), as a suitable model of a network. The graph is represented by an adjacency matrix A with dimension N × N, N being the number of nodes in the network, where a_ij = 1 if a link exists between nodes i and j, and a_ij = 0 otherwise. We consider undirected, unweighted graphs without self-loops, i.e., simple unweighted graphs. Hence, their adjacency matrices have the main diagonal a_ii = 0, ∀i = 1, …, N, and A = A^T. The node degree k_i is calculated by ${k}_{i}=\sum _{j}\,{a}_{ij}$; therefore, 0 ≤ k_i ≤ N − 1.
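The definitions above can be illustrated with a minimal sketch, assuming 0-based node labels (the text uses 1-based indices). The helper names `adjacency_from_edges` and `degrees`, and the 4-node path graph, are hypothetical and chosen only for illustration.

```python
def adjacency_from_edges(n, edges):
    """Build the symmetric 0/1 adjacency matrix of a simple undirected graph."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        if i == j:
            raise ValueError("self-loops are not allowed in a simple graph")
        a[i][j] = 1
        a[j][i] = 1  # undirected graph: A equals its transpose
    return a


def degrees(a):
    """Node degree k_i = sum_j a_ij."""
    return [sum(row) for row in a]


# Hypothetical example: the path graph 0-1-2-3
A = adjacency_from_edges(4, [(0, 1), (1, 2), (2, 3)])
k = degrees(A)
print(k)
```

The diagonal of `A` stays at zero and every degree lies between 0 and N − 1, as required for simple graphs.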

Network entropy

Network Entropy is based on the classical Shannon Entropy for discrete distributions. Small⁸ proposed a measure of Network Entropy based on the probability that a random walker goes from node i to any other node j. This probability distribution P⁽ⁱ⁾ is defined for each node i and has entries

$${p}_{i\to j}=\left\{\begin{array}{ll}0, & {\rm{for}}\,{a}_{ij}=0,\\ 1/{k}_{i}, & {\rm{for}}\,{a}_{ij}=1.\end{array}\right.$$

(1)

It is easy to observe that ${\sum }_{j}{p}_{i\to j}=1$ for each node i.

Based on the probability distribution P⁽ⁱ⁾, the entropy for each node can be defined as

$${{\mathscr{S}}}^{(i)}\equiv {\mathscr{S}}[{P}^{(i)}]=-\,\mathop{\sum }\limits_{j=1}^{N-1}\,{p}_{i\to j}\,\mathrm{ln}\,{p}_{i\to j}=\,\mathrm{ln}\,{k}_{i}.$$

(2)

with ${{\mathscr{S}}}^{(i)}=0$ if node i is disconnected.

After calculating the entropy for each node, we then calculate the normalized node entropy by

$${ {\mathcal H} }^{(i)}=\frac{{\mathscr{S}}[{P}^{(i)}]}{\mathrm{ln}(N-1)}=\frac{\mathrm{ln}\,{k}_{i}}{\mathrm{ln}(N-1)}.$$

(3)

Finally, the normalized Network Entropy is calculated by averaging the normalized node entropy over the whole network as

$$ {\mathcal H} =\frac{1}{N}\mathop{\sum }\limits_{i=1}^{N}\,{ {\mathcal H} }^{(i)}=\frac{1}{N\,\mathrm{ln}(N-1)}\mathop{\sum }\limits_{i=1}^{N}\,\mathrm{ln}\,{k}_{i}.$$

(4)

The normalized Network Entropy is maximal, $ {\mathcal H} =1$, for fully connected networks, since p_i→j = (N − 1)^−1 for every i ≠ j and the walk becomes fully random, i.e., jumps from node i to any other node j are equiprobable. The walk becomes predictable in a sparse network because sparsity limits the possible jumps. The sparser the network, the lower its Network Entropy.

The normalized Network Entropy $ {\mathcal H} $, hence, quantifies the heterogeneity of the network's degree distribution, with lower values for nodes with lower degrees and higher values for nodes with higher degrees. For example, peripheral nodes present lower ${ {\mathcal H} }^{(i)}$ than hubs. Entropy, thus, ranges from $ {\mathcal H} \to 0$ (sparse networks) to $ {\mathcal H} \to 1$ (fully connected networks).
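Since Eq. (4) depends only on the degree sequence, it can be sketched in a few lines. The function name `network_entropy` is an assumption of this sketch, and isolated nodes are given zero entropy, following the convention stated after Eq. (2).

```python
import math


def network_entropy(degree_sequence):
    """Normalized Network Entropy, Eq. (4): mean of ln(k_i) / ln(N - 1)."""
    n = len(degree_sequence)
    if n < 3:
        raise ValueError("ln(N - 1) must be positive, so N >= 3 is required")
    # Isolated nodes (k_i = 0) contribute zero entropy
    total = sum(math.log(k) for k in degree_sequence if k > 0)
    return total / (n * math.log(n - 1))


print(network_entropy([4] * 5))          # complete graph on 5 nodes
print(network_entropy([4, 1, 1, 1, 1]))  # star graph on 5 nodes
```

For a ring of 1000 nodes in which every node has degree 2, this sketch gives ln 2 / ln 999 ≈ 0.100, in line with the value $ {\mathcal H} =0.1$ quoted later for the k = 1 ring lattice.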

Network Fisher information measure

The normalized Fisher Information Measure (FIM)⁹ for a node i is given by

$${ {\mathcal F} }^{(i)}[{P}^{(i)}]=\frac{1}{2}\mathop{\sum }\limits_{j=1}^{N-1}\,{[\sqrt{{p}_{i\to j+1}}-\sqrt{{p}_{i\to j}}]}^{2}.$$

(5)

The normalized network Fisher Information Measure is given by

$$ {\mathcal F} =\frac{1}{N}\sum _{i}\,{ {\mathcal F} }^{(i)}[{P}^{(i)}].$$

(6)

If the system under study is in a very ordered state, i.e., a sparse network in which almost all p_i→j values are zero, we have Shannon Entropy $ {\mathcal H} \to 0$ and normalized Fisher's Information Measure $ {\mathcal F} \to 1$. On the other hand, when the system under study is in a very disordered state, that is, when all p_i→j values are similar, we obtain $ {\mathcal H} \to 1$ and $ {\mathcal F} \to 0$. We can then define a Shannon-Fisher plane, which can also be used to characterize Complex Networks.
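A minimal sketch of Eqs. (5) and (6) follows. It assumes that each distribution P⁽ⁱ⁾ is the full row of transition probabilities ordered by node label, including the zero entry for the node itself; the original text does not spell out this ordering, so it is an assumption of the sketch. All function names are hypothetical.

```python
import math


def transition_row(a, i):
    """Random-walk probabilities p_{i->j} of Eq. (1) for node i."""
    k = sum(a[i])
    return [a_ij / k if k else 0.0 for a_ij in a[i]]


def node_fisher(p_row):
    """Eq. (5): half the sum of squared differences of consecutive sqrt(p)."""
    roots = [math.sqrt(p) for p in p_row]
    return 0.5 * sum((roots[j + 1] - roots[j]) ** 2
                     for j in range(len(roots) - 1))


def network_fisher(a):
    """Eq. (6): average of the node-level measure over all nodes."""
    n = len(a)
    return sum(node_fisher(transition_row(a, i)) for i in range(n)) / n


# Complete graph on 4 nodes: every off-diagonal entry is 1
K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
print(network_fisher(K4))
```

Under these assumptions, a node with a single neighbor yields 1 when the non-zero entry sits in the interior of the row and 0.5 when it sits at either end, so the value depends on node labeling. A ring with one neighbor on each side and N = 1000 gives 0.999, in line with the value quoted later for the k = 1 ring lattice.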

Results: Synthetic Networks

In this section, we analyze the behavior of Information Theory quantifiers when applied to Random (RN), Small World (SWN), and Scale-Free networks (SFN). We simulated independent instances of these networks for several parameters and then analyzed how their Network Entropy and Fisher Information Measure vary. These synthetic networks may present some degree of stochasticity related to their parameter settings, which results in variations of the quantifiers; for this reason, when we observe variations in any measure ${\mathscr{X}}$, we represent it by its average value $\bar{{\mathscr{X}}}$ along with its sampling standard deviation ${s}_{{\mathscr{X}}}$. These variations, when too small, can be hard to distinguish in figures, but their numerical results should make this clearer.

Erdős-Rényi: random networks

Boccaletti et al.¹⁰ state that: "the term random graph refers to the disordered nature of the arrangement of links between different nodes." According to ref.¹¹, Solomonoff and Rapoport¹² initiated the study of the nature of random graphs, but Erdős and Rényi⁵ are best known for observing the properties of networks as the number of random connections increases, thus defining an ensemble of graphs G_N,M with N nodes and M edges. Later, Gilbert¹³ described an alternative method for generating random graphs by defining an ensemble of graphs G_N,p with N nodes connecting randomly according to a linking probability p that is analogous to the link density $\xi ={(N(N-1))}^{-1}\,{\sum }_{i}^{N}{k}_{i}$. Although Erdős-Rényi (ER) random graphs are well studied in network science, they often fail at describing essential properties of real networks.

We analyzed fifty independent ER graphs G_N,p for every combination of N ∈ {50, 1000, 10000} and p ∈ {0, 0.001, 0.002, …, 0.99, 1}, making a total of 50 × 3 × 1001 graphs. Figure 1 shows the variation of the Shannon Entropy (Fig. 1a) and Fisher Information Measure (Fig. 1b) with respect to the link density, while Fig. 1c depicts the relationship between the Shannon Entropy and Fisher Information Measure.
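One instance of the G_N,p ensemble and its link density can be sketched as below. This is a minimal illustration, not the simulation code used for the experiments; the function names, the seed, and the small example size are assumptions.

```python
import random


def gilbert_graph(n, p, rng):
    """Sample G_{N,p}: each of the N(N-1)/2 node pairs is linked with probability p."""
    return [(i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if rng.random() < p]


def link_density(n, edges):
    """xi = sum_i k_i / (N (N - 1)); the degree sum equals twice the edge count."""
    return 2 * len(edges) / (n * (n - 1))


rng = random.Random(42)  # fixed seed so the sketch is reproducible
edges = gilbert_graph(50, 0.1, rng)
print(link_density(50, edges))
```

Because each pair is linked independently with probability p, the expected link density equals p, which is why the two quantities are treated as analogous in the text.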

Figure 1a shows how the Shannon Entropy $ {\mathcal H} $ varies with respect to the link density ξ. The variation starts steep, then saturates. This may enhance the sensitivity of the Shannon-Fisher plane for sparse networks, but the plane may not be sensitive to denser graphs. The relationship between $ {\mathcal H} $ and ξ also depends on the number of nodes N. The Shannon Entropy increases, for the same link density, with N. However, the rate of this growth decreases with N.

Figure 1b suggests that the Fisher Information Measure presents two distinct regimes for ER graphs as a function of their link density. Initially, this measure grows steadily: for p = 0 the network is totally disconnected; as p increases, it reaches a critical point p_c that depends on the number of nodes N, after which the measure decreases in a quasi-linear fashion, $ {\mathcal F} \approx 1-p$ for every p > p_c, regardless of N. For N = 50, the critical point is $\overline{{p}_{c}}=0.08$ with standard deviation s_pc = 0.02 and $\bar{ {\mathcal F} }=0.90$, ${s}_{ {\mathcal F} }=0.02$; for N = 1000, $\overline{{p}_{c}}=0.008$, s_pc = 0.002 with $\bar{ {\mathcal F} }=0.99$, ${s}_{ {\mathcal F} }=0.001$; for N = 10000, $\overline{{p}_{c}}=0.001$ with $\bar{ {\mathcal F} }=0.999$ and no variation observed. As the linking probability for ER graphs is analogous to the link density, this also holds for the link density, so $\bar{ {\mathcal F} }\approx 1-\xi $ for every ξ > ξ_c. This result relates to the expected phase transition in random graphs at p_c > ln N/N¹⁴, above which the network will almost surely be connected.
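As a rough cross-check of the threshold quoted above, the sketch below evaluates ln N/N for the three network sizes used in the experiments. The function name `connectivity_threshold` is hypothetical.

```python
import math


def connectivity_threshold(n):
    """Connectivity threshold ln(N) / N for random graphs with N nodes."""
    return math.log(n) / n


for n in (50, 1000, 10000):
    print(n, connectivity_threshold(n))
```

The three values are approximately 0.078, 0.0069, and 0.00092, close to the reported critical points of 0.08, 0.008, and 0.001.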

Figure 1c shows the relationship between the Shannon Entropy and the Fisher Information Measure. As expected, the larger the network is, the less the variability observed. For this reason, we will only present results for N = 1000 hereinafter.

Watts-Strogatz: small-world networks

Small-world networks present an intrinsic characteristic of having relatively small average path length between nodes¹⁵. Watts and Strogatz⁶ (WS) proposed a model to build graphs G_N,k that can reproduce this small-world property with a high clustering coefficient. The model starts with a k-ring network with N nodes and a rewiring probability β. Rewiring consists of removing an existing edge with probability β and reconnecting it to another random node. When β = 0, we have a ring lattice, and for β = 1, the model produces a random graph. For intermediate values, the model produces networks with the small-world property and a nontrivial clustering coefficient¹⁰. Figure 2 shows the same analysis as presented previously for random networks.
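The construction can be sketched as follows. This is a simplified illustration rather than the exact procedure used for the experiments: it assumes that "k-ring" means each node is linked to its k nearest neighbors on each side (so every node has degree 2k at β = 0) and that a rewired edge keeps its first endpoint. The function name is hypothetical.

```python
import random


def watts_strogatz(n, k, beta, rng):
    """k-ring on n nodes; each lattice edge is rewired with probability beta."""
    if n <= 2 * k:
        raise ValueError("n must exceed 2k for a simple ring lattice")
    edges = set()
    for i in range(n):
        for d in range(1, k + 1):
            edges.add((i, (i + d) % n))
    for i in range(n):
        for d in range(1, k + 1):
            old = (i, (i + d) % n)
            if rng.random() < beta:
                # New target: no self-loop and no duplicate link
                candidates = [j for j in range(n)
                              if j != i
                              and (i, j) not in edges
                              and (j, i) not in edges]
                if candidates:
                    edges.discard(old)
                    edges.add((i, rng.choice(candidates)))
    return edges


rng = random.Random(1)
g = watts_strogatz(20, 2, 0.3, rng)
print(len(g))  # rewiring moves edges, so the count stays at n * k
```

Because rewiring only moves edges, the link density is the same for every β, which is consistent with the small variation of the entropy with respect to β reported for this model.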

Figure 2a shows that the relationship between Network Entropy and link density is consistent with what was observed in Random Networks and that there is little variation with respect to β. Therefore, the Network Entropy $ {\mathcal H} $ by itself does not provide information to identify different Small-World models.

Figure 2b shows the relationship between the Fisher measure and link density, indexed by the rewiring probability β (shades of blue). As expected, the behavior in the limit β = 1 is the same (linear decay) as the one observed for RN; cf. Figure 1b. There is a lower bound which corresponds to k-rings (red dots).

The red arrows (Fig. 2c) identify the change of regime. For k = 1 (red-downward arrow), increasing the rewiring probability decreases the Fisher $ {\mathcal F} $ measure, while for k ≥ 2 (after red-upward arrow), this behavior is inverted, and increasing β also increases $ {\mathcal F} $.

The Shannon-Fisher plane provides a rich description of Watts-Strogatz (WS) networks. Similarly to what was observed with ER graphs, there are two distinct regimes when evaluating WS graphs. Firstly, we see in Fig. 2 that for k = 1, we start with a ring lattice where $ {\mathcal H} =0.1$, $ {\mathcal F} =0.999$, and as β increases, $ {\mathcal F} $ decreases until the network becomes a random graph; this happens because when k = 1 and β > 0, the rewiring mechanism isolates some nodes, and the resulting disconnected components lower the $ {\mathcal F} $ values. Secondly, for k > 1, WS has a different behavior, because the rewiring mechanism most likely will not leave isolated nodes when β > 0; in fact, it will create larger components without fully disconnecting the other ones, which increases the $ {\mathcal F} $ values. The red arrows in Fig. 2 identify this change of regime. Alongside this, the Shannon-Fisher plane corroborates the evidence of the transition between ring lattices and random graphs for the WS model.

Figure 3 summarizes the main differences between Erdős-Rényi (Fig. 3a) and Watts-Strogatz (Fig. 3b) networks in the $ {\mathcal H} \times {\mathcal F} \,\times \,\xi $ space. While Erdős-Rényi networks are equivalently well-described by the Network Entropy and the Fisher Information Measure (they span a 1D region of the space), Watts-Strogatz graphs are better characterized by the latter, as different networks span a 2D manifold. Permanent links to interactive versions of these 3D plots are available at http://tiny.cc/ERN and at http://tiny.cc/SWN.

Barabási-Albert: scale-free networks

The literature often uses scale-free networks as models for real networks. They have a degree distribution that can be fitted by a power-law, i.e., P(k) ~ k^−γ, where γ is the degree exponent, usually in 2 ≤ γ ≤ 3, as for γ > 3 the scale-free property can easily be confused with random networks¹⁶. We will evaluate the Barabási-Albert⁷ model (BA) for evolving scale-free networks, as it has two important features: network growth and the preferential attachment mechanism.

For network growth, at each time step t, new nodes are inserted with m links connecting with N₀ existing nodes in the network. These links are created according to a probability given by the preferential attachment: the probability that a new node connects to an existing node i is proportional to the current degree of node i:

$${\Pi }^{(i)}=\frac{{k}_{i}}{{\sum }_{j}\,{k}_{j}}.$$

(7)

In this way, the preferential attachment (PA) induces hubs (highly connected nodes), and peripheral communities, where nodes have similar degree. We know that the Barabási-Albert model is unable to reproduce all the diversity existing for scale-free networks, as it captures only the power-law with degree exponent γ = 3. Therefore, many variations of this model have been proposed throughout the years. In this work, we extend our analysis to: non-linear preferential attachment; the fitness property; the aging property; and finally, the configuration model.

Non-linear preferential attachment

Krapivsky et al.¹⁷ introduced a non-linear preferential attachment that creates different regimes for the network according to an exponent α controlling the network topology. The non-linear PA is given by

$${\Pi }^{(i)}=\frac{{k}_{i}^{\alpha }}{{\sum }_{j}\,{k}_{j}^{\alpha }}.$$

(8)

For α ≠ 1, the growth model stops resulting in a power-law degree distribution. There are, thus, three different growth regimes:

The sublinear regime (α < 1) has a power-law with an exponential cutoff, where the preferential attachment is not strong enough to produce a pure power-law degree distribution.
The linear regime (α = 1) has a pure power-law behavior corresponding to the Barabási-Albert⁷ model, with a resulting γ = 3.
The superlinear regime (α > 1) presents a particular behavior where the network condenses, i.e., very few nodes win all connections; it also does not result in a power-law degree distribution.
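The growth rule of Eq. (8) can be sketched as below. The seed graph (a complete graph on m + 1 nodes) and the requirement that each new node picks m distinct targets are assumptions of this sketch; the text does not specify them. The function name is hypothetical.

```python
import random


def nonlinear_pa_graph(n, m, alpha, rng):
    """Grow a graph in which a new node picks m distinct targets with weight k_i ** alpha."""
    # Assumed seed: complete graph on m + 1 nodes, so every degree starts at m > 0
    degree = [m] * (m + 1)
    edges = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]
    for new in range(m + 1, n):
        weights = [k ** alpha for k in degree]  # numerator of Eq. (8)
        targets = set()
        while len(targets) < m:
            targets.add(rng.choices(range(new), weights=weights)[0])
        for t in targets:
            edges.append((t, new))
            degree[t] += 1
        degree.append(m)
    return edges, degree


rng = random.Random(0)
edges, degree = nonlinear_pa_graph(200, 2, 1.0, rng)
print(max(degree), min(degree))
```

With `alpha = 0.0` all weights equal 1 and targets are chosen uniformly, while `alpha = 1.0` reduces to the linear rule of Eq. (7).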

We mapped outcomes of the BA model with a non-linear preferential attachment using Krapivsky's model onto the Shannon-Fisher plane, as shown in Fig. 4a. For α = 0, we have a random network: since the attachment weight k_i^α = 1 for every i, the network no longer obeys the preferential attachment mechanism, just the evolving growth property; the result is $\bar{ {\mathcal H} }=0.074$, ${s}_{ {\mathcal H} }=0.001$ and $\bar{ {\mathcal F} }=0.997$, ${s}_{ {\mathcal F} }=0.001$. Increasing α in steps of 0.01 changes the regime of the network slowly, and we see this transition in the Shannon-Fisher plane until it reaches α = 1. In the linear regime $\bar{ {\mathcal H} }=0.063$, ${s}_{ {\mathcal H} }=0.001$ and $\bar{ {\mathcal F} }=0.993$, ${s}_{ {\mathcal F} }=0.001$. In the superlinear regime $ {\mathcal H} \to 0$ and $ {\mathcal F} $ starts oscillating for α > 1.4, as seen in Fig. 4b. This oscillation happens because, after the network condenses, a small change in the network topology may cause $ {\mathcal F} $ to drop from $ {\mathcal F} =1$ to $ {\mathcal F} =0.5$, as the Fisher Information Measure is sensitive to local disturbances.

Similar to earlier sections, we evaluated the link density ξ along with Network Entropy $ {\mathcal H} $ and Fisher Information Measure $ {\mathcal F} $. This time, the comparison of link density with Network Entropy, shown in Fig. 5b, exhibits more interesting behavior. Although the link density does not change (ξ = 0.002), $ {\mathcal H} $ absorbs the changes, and when increasing α, $ {\mathcal H} \to 0$. In Fig. 5a, we observe how Fisher Information Measure $ {\mathcal F} $ against link density ξ produces confusing results, as the oscillation for α > 1.4 heads toward $ {\mathcal F} \approx 0.5$ and $ {\mathcal F} \approx 1$ with ξ = 0.002.

Fitness property

Some networks have nodes that are better able to create connections, e.g., a popular web page. Usually, these nodes gain relationships faster than common nodes. The Bianconi-Barabási model¹⁸,¹⁹ describes this property, named fitness. We can model it using the preferential attachment considering a fitness coefficient η_i alongside the node degree k_i:

$${\Pi }^{(i)}=\frac{{\eta }_{i}{k}_{i}}{{\sum }_{j}\,{\eta }_{j}{k}_{j}}.$$

(9)

In Eq. 9, the dependence of Π⁽ⁱ⁾ on η_i models the fact that even younger nodes can acquire links faster if they have sufficiently higher fitness than older nodes. Therefore, we draw 30 networks with N = 1000 considering a uniform distribution for the fitness η_i of each node i. For this, we do not expect a perfect power law, but we expect γ = 2.255, asymptotically.
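The effect of fitness in Eq. 9 can be seen in a two-node sketch. The function name and the numerical values below are hypothetical and serve only to illustrate the formula.

```python
def fitness_attachment_probs(degree, fitness):
    """Eq. (9): Pi_i = eta_i * k_i / sum_j eta_j * k_j."""
    weights = [eta * k for eta, k in zip(fitness, degree)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical values: an older node (degree 3, fitness 0.2)
# and a younger node (degree 1, fitness 0.9)
probs = fitness_attachment_probs([3, 1], [0.2, 0.9])
print(probs)
```

In this example the younger node receives the larger share of new links despite its lower degree, because its fitness more than compensates for the degree difference.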

Figure 6a shows consistency between the Bianconi-Barabási and the Barabási-Albert models, in that $ {\mathcal H} $ grows slowly while ξ does not change; numerically, $\bar{{\rm{\xi }}}=0.00278$ with standard deviation s_ξ = 0.00006. Figure 6b also shows a similar behavior in comparison with Fig. 5a. Finally, Fig. 6c shows that most networks generated by the fitness model lie in a region close to the results presented for the BA model, with $\bar{ {\mathcal H} }=0.091$, ${s}_{ {\mathcal H} }=0.005$ and $\bar{ {\mathcal F} }=0.979$, ${s}_{ {\mathcal F} }=0.038$.

Aging property

Another aspect we can also consider for a scale-free network is the aging property²⁰. Regularly, for the Barabási-Albert model, we account only for the node degree or, as seen before, the fitness coefficient. However, what happens when a node starts to reduce the rate of acquiring new links with time? This aging process causes the nodes to lose relevance; thus, it changes the network structure and dynamics. We can model this property considering:

$${\Pi }^{(i)}({k}_{i},t-{t}_{i}) \sim {k}_{i}{(t-{t}_{i})}^{-\nu },$$

(10)

where ν is a parameter controlling the dependence of the attachment probability on the node's age. According to ν, we can define three scaling regimes:

If ν < 0, new nodes will connect to older nodes. If ν → −∞, each new node connects to the oldest node, resulting in a condensed network or hub-and-spoke topology. Hence, we have a more heterogeneous network with a few hubs and many peripheral nodes.
If ν > 0, nodes connect to younger nodes. By aging, nodes lose the ability of preferential attachment. In this case, the network tends to be more homogeneous.
For ν > 1, the aging effect dominates the preferential attachment effect, the network loses its scale-free property, and it eventually approaches a random network. When ν → ∞, each node connects to its immediate predecessor.
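The regimes above can be illustrated by evaluating Eq. (10) for a few nodes of equal degree but different ages. The function name, the birth times, and the values of ν below are hypothetical.

```python
def aging_attachment_probs(degree, birth_time, t, nu):
    """Eq. (10), normalized: Pi_i proportional to k_i * (t - t_i) ** (-nu)."""
    weights = [k * (t - ti) ** (-nu) for k, ti in zip(degree, birth_time)]
    total = sum(weights)
    return [w / total for w in weights]


# Three nodes with equal degree, born at times 0, 1 and 2; evaluated at t = 3
degree = [2, 2, 2]
birth = [0, 1, 2]
favor_old = aging_attachment_probs(degree, birth, 3, -2.0)
no_aging = aging_attachment_probs(degree, birth, 3, 0.0)
favor_young = aging_attachment_probs(degree, birth, 3, 2.0)
print(favor_old, no_aging, favor_young)
```

With equal degrees, a negative ν ranks the oldest node first, ν = 0 removes the age dependence, and a positive ν ranks the youngest node first.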

For evaluating the aging property, we generate distinct networks with N = 1000 and ν ∈ {−3.0, −2.9, −2.8, ..., 2.9, 3.0} with 30 replications of each setting; thus, we have a total of 18030 networks. Figure 7a shows the results for networks with a growing Network Entropy $ {\mathcal H} $ and a steady link density ξ = 0.002, the same result as for BA model. Figure 7b shows the results for the aging property, and once more, we can observe the "oscillation" that happens to all the other scale-free models previously discussed. Figure 7c presents the results considering the Network Entropy $ {\mathcal H} $ and Fisher Information Measure $ {\mathcal F} $, where we can see the Aging model transition in the plane according to its scaling regimes.

The numerical results for the Aging model are the following: for ν = −3, $\bar{ {\mathcal H} }=0.024$, ${s}_{ {\mathcal H} }=0.0014$ and $\bar{ {\mathcal F} }=0.845$, ${s}_{ {\mathcal F} }=0.042$, i.e., we have a condensed network; when ν > 0, $ {\mathcal H} $ and $ {\mathcal F} $ continue to grow until ν = 1, where $\bar{ {\mathcal H} }=0.072$, ${s}_{ {\mathcal H} }=0.0009$ and $\bar{ {\mathcal F} }=0.992$, ${s}_{ {\mathcal F} }=0.0021$; for ν > 1, $ {\mathcal H} $ grows steadily and $ {\mathcal F} $ decays, reaching a random regime. Finally, we noticed that the scale-free regime expected for ν ∈ [0, 1] is observed in the Shannon-Fisher plane, with $\bar{ {\mathcal H} }=0.063$, ${s}_{ {\mathcal H} }=0.005$ and $\bar{ {\mathcal F} }=0.990$, ${s}_{ {\mathcal F} }=0.0064$.

The configuration model

A recurrent problem is "how do we generate networks with an arbitrary P(k)?". For this, we use the configuration model, also known as a random network with a pre-defined degree sequence²¹. According to ref.¹⁶, the algorithm consists of the following steps:

1. Assign a degree to each node as stubs or half-links. We must start from an even number of stubs; otherwise, we will have unpaired stubs.
2. Randomly select a pair of half-links and connect them; then randomly choose another pair from the remaining 2L − 2 half-links and connect them.
3. Repeat this process until all stubs are paired up.

Depending on how we pair them up, we may obtain distinct networks. Some networks include cycles, self-loops, or multi-links. In this work, we consider only simple graphs; thus, after generating the network for a degree sequence, we "simplify" the graph, removing self-loops and multi-links.
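The steps above can be sketched as follows. Shuffling the list of stubs and joining consecutive entries is used here as a compact equivalent of drawing random pairs one at a time; the function name and the example degree sequence are hypothetical.

```python
import random


def configuration_model(degree_sequence, rng):
    """Pair stubs at random, then simplify by dropping self-loops and multi-links."""
    if sum(degree_sequence) % 2:
        raise ValueError("the number of stubs must be even")
    # One stub (half-link) per unit of degree
    stubs = [node for node, k in enumerate(degree_sequence) for _ in range(k)]
    rng.shuffle(stubs)  # random pairing: consecutive stubs are joined
    edges = set()
    for a, b in zip(stubs[::2], stubs[1::2]):
        if a != b:  # remove self-loops
            edges.add((min(a, b), max(a, b)))  # the set removes multi-links
    return edges


rng = random.Random(3)
g = configuration_model([3, 2, 2, 1, 1, 1], rng)
print(sorted(g))
```

Because of the simplification step, the realized degree of a node can be lower than the degree originally assigned to it.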

As we are trying to reproduce scale-free properties using the configuration model, we assign a pure power-law distribution P(k) = k^−γ with γ ∈ [2, 5]. For this model, we expect that for 2 ≤ γ ≤ 3, the network is in the scale-free regime; when γ > 3, the network starts to condense, as the distribution has a steep curve. It means that few nodes have most of the links and most nodes have few links. Such networks present structure and dynamics more similar to a hub-and-spoke topology.
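A degree sequence following such a power law could be drawn as in the sketch below. The support of the distribution (degrees from `k_min = 1` up to N − 1) and the resampling used to obtain an even number of stubs are assumptions of this sketch, as the text does not state how these details were handled. The function name is hypothetical.

```python
import random


def power_law_degree_sequence(n, gamma, rng, k_min=1):
    """Draw n degrees with probability proportional to k ** (-gamma)."""
    support = range(k_min, n)  # assumed support: k_min .. N - 1
    weights = [k ** (-gamma) for k in support]
    seq = rng.choices(support, weights=weights, k=n)
    while sum(seq) % 2:  # resample until the stub count is even
        seq = rng.choices(support, weights=weights, k=n)
    return seq


rng = random.Random(7)
seq = power_law_degree_sequence(100, 2.5, rng)
print(max(seq), sum(seq))
```

A sequence produced this way has an even sum and can therefore be paired into stubs without leaving one unpaired.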

Finally, we evaluate these networks with N = 1000 using a pure power-law distribution with γ ∈ {2.0, 2.1, 2.2, ..., 4.9, 5.0}; as we cannot guarantee that networks with the same degree exponent have the same topology, we replicate this experiment 30 times, giving 1312 networks. In Fig. 8a we have that $\bar{{\rm{\xi }}}=0.001$, with s_ξ = 0.0006, while the Network Entropy $ {\mathcal H} $ decreases as we increase the degree exponent. This behavior is similar for all the other scale-free models when we are in the condensed regime. Figure 8b shows how the $ {\mathcal F} $ values are stable for this model with $\bar{ {\mathcal F} }=0.998$, ${s}_{ {\mathcal F} }=0.001$. Later, in Fig. 8c, we observe how Network Entropy $ {\mathcal H} $ is actually the one capturing the changes: for the degree exponent γ ∈ [2, 3] we have $\bar{ {\mathcal H} }=0.045$, ${s}_{ {\mathcal H} }=0.017$ and $\bar{ {\mathcal F} }=0.998$, ${s}_{ {\mathcal F} }=0.001$; and for γ ∈ [3, 5], we have $\bar{ {\mathcal H} }=0.010$, ${s}_{ {\mathcal H} }=0.005$ and $\bar{ {\mathcal F} }=0.998$, ${s}_{ {\mathcal F} }=0.001$.

Figure 9 summarizes the main features observed in the Shannon-Fisher plane for simulated networks, alongside examples illustrating different topologies and their expected results in the Shannon-Fisher plane. The fitness, aging, and configuration models were left out of this plot, as they are represented well enough by the Barabási-Albert model with nonlinear preferential attachment and its distinct scaling regimes. Considering the standard definition for the scale-free property and the network models evaluated, we observe how scale-free networks are subtle and can be easily confused with others in the Shannon-Fisher plane, as the usual result for scale-free networks is confined to a tiny region of the plane, and a few inputs can dismantle the scale-free property. This observation reflects a recent discovery that states that "scale-free networks are rare"²².

Results: Real Networks

We evaluate real networks, assessing their topological features such as clustering coefficient C^Δ, average path length L, and whether the degree distribution follows a power law (P(k) ~ k^−γ); we also consider the degree exponent γ. Table 1 shows the real networks analyzed in this work. Each network is presented with its number of nodes N, average degree ⟨k⟩, and link density ξ. We report their small-world indicators: average path length L, clustering coefficient C^Δ, and small-world-ness value S^Δ (see below). We also provide scale-free indicators: degree exponent γ and p-value for the power-law fitting. Finally, we provide the Network Entropy $ {\mathcal H} $ and Fisher Information Measure $ {\mathcal F} $, along with a literature reference.

Table 1 Real networks and their descriptors.

As shown in Fig. 10a,b, most real networks are sparse, with ξ ≤ 0.009. The only exception is network #14 with ξ = 0.157. Therefore, our analysis will rely on our ability to distinguish sparse networks, with link density ξ that does not present distortions in Network Entropy $ {\mathcal H} $ (Fig. 1a) nor Fisher Information Measure $ {\mathcal F} $ (Fig. 1b). Nevertheless, care is needed not to jump to conclusions without further analysis, as differences in the Shannon-Fisher plane for sparse networks are subtle; this is where other metrics are welcome in helping to confirm our findings.

Foremost, considering the results for the Watts-Strogatz model in the Shannon-Fisher plane, it is expected that networks in between the upper (i.e., random) and lower (k-ring) limits are very likely to be small-world networks. However, this should not be confused with saying that these networks have the same topology. Our purpose here is to study how the information flows through the nodes. Networks may present similar topology that will result in similar dynamics, but distinct topologies may have similar dynamics; this feature is noteworthy.

From the results in Fig. 11a and in Table 1, we may state that networks #1, #2, #4, #6, #10, #13 and #14 present small-world behavior. Indeed, the average path length L of these networks in Table 1 is small considering their number of nodes N. Cohen and Havlin²³ demonstrated that WS networks under some expected conditions have an average path length that scales as log N. The largest network in our small-world set has N = 4941, so log N ≈ 8.50; since every small-world network in our study has L < 8.5, they all present small-world behavior by this criterion.
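The bound used in this criterion can be reproduced with the natural logarithm, which is the base that matches the quoted value of 8.50. The function name and the path length passed to it are hypothetical.

```python
import math


def within_log_bound(avg_path_length, n):
    """Check the criterion L < ln(N) used above for small-world behavior."""
    return avg_path_length < math.log(n)


bound = math.log(4941)  # largest network in the small-world set
print(bound)
print(within_log_bound(6.0, 4941))  # 6.0 is an illustrative path length, not a Table 1 value
```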

Another aspect that we can observe for small-world networks is their clustering coefficient (C^Δ), but there is no certainty about which values to expect. To analyze the clustering coefficient of small-world networks quantitatively, taking its relationship with the average path length into account, Humphries et al.²⁴ proposed the small-world-ness S^Δ, defined as

$${S}^{\Delta }=\frac{{C}^{\Delta }/{C}_{{\rm{rand}}}^{\Delta }}{L/{L}_{{\rm{rand}}}},$$

(11)

where C^Δ and L are, respectively, the clustering coefficient and average path length, and C_rand^Δ and L_rand are the results computed for an ensemble of 100 ER networks, simulated with the same link density ξ as the real network. With this approach, Humphries et al.²⁴ state that for S^Δ > 1, the network can be considered small-world.
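The definition in Eq. (11) can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the authors' implementation: C^Δ is taken to be the triangle-based transitivity, the ER ensemble is drawn with exactly the same number of links (and hence the same link density), path lengths are averaged over reachable pairs only, and all function names are hypothetical.

```python
import random
from collections import deque
from itertools import combinations


def transitivity(adj):
    """Triangle-based clustering: closed triples / all connected triples."""
    closed = triples = 0
    for nbrs in adj.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        # Each linked pair of neighbours closes one triple centred on this node.
        closed += sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return closed / triples if triples else 0.0


def average_path_length(adj):
    """Mean shortest-path length over reachable ordered pairs (BFS per node)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")


def er_graph(n, m, rng):
    """Random graph with n nodes and exactly m links (assumes m <= n(n-1)/2)."""
    adj = {v: set() for v in range(n)}
    added = 0
    while added < m:
        u, w = rng.sample(range(n), 2)
        if w not in adj[u]:
            adj[u].add(w)
            adj[w].add(u)
            added += 1
    return adj


def small_world_ness(adj, n_random=100, seed=0):
    """S = (C / C_rand) / (L / L_rand), with ensemble means for the random terms."""
    rng = random.Random(seed)
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    c_rand = l_rand = 0.0
    for _ in range(n_random):
        g = er_graph(n, m, rng)
        c_rand += transitivity(g) / n_random
        l_rand += average_path_length(g) / n_random
    # Note: c_rand can be zero for very sparse graphs, which makes S undefined.
    return (transitivity(adj) / c_rand) / (average_path_length(adj) / l_rand)


def ring_lattice(n, half_k):
    """k-ring: every node links to its half_k nearest neighbours on each side."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, half_k + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


ring = ring_lattice(60, 2)
print(small_world_ness(ring, n_random=20))
```

Even the regular k-ring in this usage example is expected to score above one, because its clustering is far higher than that of a random graph with the same density. This illustrates why a threshold on S^Δ alone is a permissive criterion.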

However, Table 1 shows that only network #11 has S^Δ < 1; judging by the small-world-ness value alone, we would conclude that all of the other networks are small-world. This is inaccurate, if not wrong.

Alongside the small-world-ness value, as an attempt to identify scale-free networks, we also estimate the degree exponent γ of the power-law degree distribution. Using a method proposed by Newman²⁵, we reject the fit whenever p-value < 0.05. If the fit is not rejected and the estimated γ lies between two and three (2 < γ < 3), we consider the network scale-free; for γ > 3, the network may present hubs, but it becomes hard to distinguish from random or small-world networks.
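As an illustration of this procedure, the sketch below estimates γ with the continuous maximum-likelihood formula γ = 1 + n / Σ ln(k_i / k_min) and then applies the decision rule stated above. Several points are assumptions made for the example: the continuous approximation is applied to integer degrees, k_min is supplied by the caller, the goodness-of-fit p-value is taken as an input rather than computed, and the function names and label strings are hypothetical.

```python
import math


def estimate_gamma(degrees, k_min=1):
    """Continuous-approximation MLE of the exponent for the tail k >= k_min."""
    tail = [k for k in degrees if k >= k_min]
    log_sum = sum(math.log(k / k_min) for k in tail)
    if log_sum == 0:
        raise ValueError("all tail degrees equal k_min; the exponent is undefined")
    return 1.0 + len(tail) / log_sum


def label_from_fit(gamma, p_value, alpha=0.05):
    """Decision rule from the text, given an exponent and a fit p-value."""
    if p_value < alpha:
        return "power law rejected"
    if 2.0 < gamma < 3.0:
        return "scale-free"
    if gamma > 3.0:
        return "power law with gamma > 3"
    return "outside the ranges discussed"


# Values reported in Table 1 for network #7.
print(label_from_fit(2.124, 1.0))
```

The third label corresponds to the case that the text describes as hard to distinguish from random or small-world networks, so it should be read as inconclusive rather than as a class of its own.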

We proceed with our study by zooming into the Shannon-Fisher plane, as shown in Fig. 11b. First, we observe networks #5 and #8, which are outside the small-world region, and network #3, which overlaps the upper limit. We state that these three networks are random; Table 1 shows that they have γ > 3 with p-value > 0.05.

Although the identified random networks are outside the small-world region, they could easily be confused with scale-free or even regular networks if we considered only the results provided by the Shannon-Fisher plane. We therefore ought to be careful with networks outside the small-world region; for these cases, we must look into the degree distribution and estimate the degree exponent.

Network #7 is mapped into the $ {\mathcal H} \times {\mathcal F} $ plane close to the classic Barabási-Albert model, with γ = 2.124 and p-value = 1; thus, we cannot reject the power-law fit, and we expect this network to be scale-free. Network #12 has γ = 2.782 with p-value = 1 and is also close to network #7, although it is even closer to network #8.

Another “odd” result is that network #6, although its results and properties point to a small-world network, has degree exponent γ = 2.187 with p-value = 0.896, which also indicates the scale-free property. Finally, network #9 maps to a point in $ {\mathcal H} \times {\mathcal F} $ equal to those generated with superlinear preferential attachment, which leads to a “condensed” network. Indeed, with L = 3.771 and a degree distribution that does not fit a power law at all, we state that this network is “condensed.”

Discussion

Complex networks have many faces; thus, attempting to label them by a single network property may be misleading. Real networks have many components and distinct interactions among them; for example, a scale-free network may have peripheral communities that lead to small-world structure. Our proposal quantifies network structure and dynamics in a simplified plot. We show results consistent with other network features when this methodology is applied to synthetic networks.

The Shannon-Fisher plane enhances our ability to evaluate complex networks:

The transition that the Watts-Strogatz model exhibits between k-ring and random graphs, which leads us to define the small-world region;
The three regimes of non-linear preferential attachment on scale-free networks and distinct growth models, which transit between random, scale-free and condensed networks;
The behavior of the fitness model when we consider a uniform distribution for the fitness of each node, and how its features resemble those of the Barabási-Albert model;
The effect of aging on scale-free networks, and how the aging exponent can control the system's behavior in the same manner as happens with non-linear preferential attachment;
And finally, how we can generate networks with a pure power law considering distinct degree exponents.

The evaluation of real networks gave us a peek into the real world and its deceitful aspects. Our method succeeds in characterizing most of the real networks in comparison with synthetic networks, even though a few examples showed unexpected behaviors that will be explored further. That said, our proposal is not the perfect fit for labeling networks as small-world or scale-free, but it opens a world of possibilities for evaluating information spread, network robustness, or controllability. Our approach allows identifying distinct interactions in real networks, observing how they transit within the Shannon-Fisher plane and comparing how they affect other network features.

Data availability

The datasets generated during and/or analysed during the current study are available in the fisher-networks repository, https://gitlab.com/cristophersfr/fisher-networks.

Change history

12 January 2021
An amendment to this paper has been published and can be accessed via a link at the top of the paper.

References

Wiedermann, M., Donges, J. F., Kurths, J. & Donner, R. V. Mapping and discrimination of networks in the complexity-entropy plane. Phys. Rev. E 96, 042304 (2017).
Rosso, O. A., Larrondo, H., Martín, M. T., Plastino, A. & Fuentes, M. Distinguishing noise from chaos. Phys. Rev. Lett. 99, 154102 (2007).
Bandt, C. & Pompe, B. Permutation entropy: a natural complexity measure for time series. Phys. Rev. Lett. 88, 174102 (2002).
Rosso, O. A., Olivares, F. & Plastino, A. Noise versus chaos in a causal Fisher-Shannon plane. Pap. Phys. 7, 070006 (2015).
Watts, D. J. & Strogatz, S. H. Collective dynamics of ‘small-world’ networks. Nat. 393, 440 (1998).
Barabási, A.-L. & Albert, R. Emergence of scaling in random networks. Sci. 286, 509–512 (1999).
Small, M. Complex networks from time series: Capturing dynamics. In Circuits and Systems (ISCAS), 2013 IEEE International Symposium on, 2509–2512 (IEEE, 2013).
Sánchez-Moreno, P., Yánez, R. & Dehesa, J. Discrete densities and Fisher information. In Proceedings of the 14th International Conference on Difference Equations and Applications. Difference Equations and Applications. Istanbul, Turkey: Bahçesehir University Press, 291–298 (2009).
Boccaletti, S., Latora, V., Moreno, Y., Chavez, M. & Hwang, D.-U. Complex networks: Structure and dynamics. Phys. Reports 424, 175–308 (2006).
Newman, M. Networks (Oxford University Press, 2018).
Solomonoff, R. & Rapoport, A. Connectivity of random nets. The Bull. Math. Biophys. 13, 107–117 (1951).
Gilbert, E. N. Random graphs. The Annals Math. Stat. 30, 1141–1144 (1959).
Erdős, P. & Rényi, A. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci 5, 43 (1960).
Travers, J. & Milgram, S. The small world problem. Psychol. Today 1, 61–67 (1967).
Barabási, A.-L. Network science (Cambridge University Press, 2016).
Krapivsky, P. L., Redner, S. & Leyvraz, F. Connectivity of growing random networks. Phys. Rev. Lett. 85, 4629 (2000).
Bianconi, G. & Barabási, A.-L. Competition and multiscaling in evolving networks. EPL (Europhysics Lett.) 54, 436 (2001).
Adamic, L. A. & Huberman, B. A. Power-law distribution of the world wide web. Sci. 287, 2115–2115 (2000).
Dorogovtsev, S. N. & Mendes, J. F. F. Evolution of networks with aging of sites. Phys. Rev. E 62, 1842 (2000).
Molloy, M. & Reed, B. A critical point for random graphs with a given degree sequence. Random structures & Algorithms 6, 161–180 (1995).
Broido, A. D. & Clauset, A. Scale-free networks are rare. Nat. communications 10, 1017 (2019).
Cohen, R. & Havlin, S. Complex networks: structure, robustness and function (Cambridge University Press, 2010).
Humphries, M. D. & Gurney, K. Network ‘small-world-ness’: a quantitative method for determining canonical network equivalence. PloS One 3, e0002051 (2008).
Newman, M. E. Power laws, Pareto distributions and Zipf's law. Contemp. Phys. 46, 323–351 (2005).
Guimera, R., Danon, L., Diaz-Guilera, A., Giralt, F. & Arenas, A. Self-similar community structure in a network of human interactions. Phys. Rev. E 68, 065103 (2003).
Moody, J. Peer influence groups: identifying dense clusters in large networks. Soc. Networks 23, 261–283 (2001).
Newman, M. E. The structure of scientific collaboration networks. Proc. Natl. Acad. Sci. 98, 404–409 (2001).
Newman, M. E. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74, 036104 (2006).
Šubelj, L., Žitnik, S., Blagus, N. & Bajec, M. Node mixing and group structure of complex software networks. Adv. Complex Syst. 17, 1450022 (2014).
CAIDA. The CAIDA AS relationships dataset 2004–2007, http://www.caida.org/data/as-relationships (2007).
Šubelj, L. & Bajec, M. Ubiquitousness of link-density and link-pattern communities in real-world networks. The Eur. Phys. J. B 85, 32 (2012).
Knuth, D. E. The Stanford GraphBase: a platform for combinatorial computing (ACM Press New York, 1993).
Kaiser, M. & Hilgetag, C. C. Spatial growth of real-world networks. Phys. Rev. E 69, 036103 (2004).
Takemura, S.-Y. et al. A visual motion detection circuit suggested by Drosophila connectomics. Nat. 500, 175 (2013).
Helmstaedter, M. et al. Connectomic reconstruction of the inner plexiform layer in the mouse retina. Nat. 500, 168 (2013).


Acknowledgements

We acknowledge the support for this research by FAPEAL, FAPESP and CNPq (Brazil).

Author information

Authors and Affiliations

Instituto de Computação, Universidade Federal de Alagoas, Maceió, Brazil
Cristopher G. S. Freitas, Andre L. L. Aquino & Alejandro C. Frery
Instituto de Física, Universidade Federal de Alagoas, Maceió, Brazil
Osvaldo A. Rosso
Departamento de Ciência da Computação, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
Heitor S. Ramos
Instituto de Medicina Traslacional e Ingeniería Biomedica, Hospital Italiano de Buenos Aires & CONICET, Ciudad Autónoma de Buenos Aires, Argentina
Osvaldo A. Rosso

Authors

Cristopher G. S. Freitas
Andre L. L. Aquino
Heitor S. Ramos
Alejandro C. Frery
Osvaldo A. Rosso

Contributions

Implementation details and experiments were conceived by C.G.S.F., H.S.R. and A.L.L.A. The authors O.A.R. and A.C.F. conducted the experiments. All the authors analyzed the results, wrote and reviewed the manuscript.

Corresponding author

Correspondence to Cristopher G. S. Freitas.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplemental Material

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Freitas, C.G.S., Aquino, A.L.L., Ramos, H.S. et al. A detailed characterization of complex networks using Information Theory. Sci Rep 9, 16689 (2019). https://doi.org/10.1038/s41598-019-53167-5


Received: 19 June 2019
Accepted: 25 October 2019
Published: 13 November 2019
DOI: https://doi.org/10.1038/s41598-019-53167-5
