The Network Similarity Analysis (NSA) method is a powerful tool for identifying communities in weighted networks whose links depend on the edge's weight. The method can be applied to different types of networks, with nodes and edges receiving different interpretations. For instance, the network can be constructed based on the similarity between proteins that perform the same function in distinct species, or between DNA sequences that share the same evolutionary origin. The original implementation of the NSA method consisted of a pipeline of algorithms implemented in different programming languages, making it difficult to use. SCANNET is software developed to integrate all algorithms of the NSA method into a single computational environment, so that all the steps needed to identify communities in networks run in a single workflow. As an example for this paper, we applied SCANNET to implement a single workflow that correctly identified the communities in protein networks containing d...
Complex networks have been successfully applied to the characterization and modeling of complex systems in several distinct areas of the Biological Sciences. Nevertheless, their use in phylogenetic analysis still needs to be widely tested, using different molecular data sets and taxonomic groups, and also by comparing the complex networks approach with current methods of phylogenetic analysis. In this work, we compare the four main methods of phylogenetic analysis (distance, maximum parsimony, maximum likelihood, and Bayesian) with a complex networks method that has been used to provide a phylogenetic classification based on a large number of protein sequences, such as those related to the chitin metabolic pathway and ATP-synthase subunits. In order to perform a close comparison with these methods, we selected Basidiomycota fungi as the taxonomic group and used a high-quality, manually curated and characterized database of chitin synthase sequences. This enzymatic protein plays a key ro...
http://dx.doi.org/10.1080/00927870802179438, Dec 1, 2008
Let $G$ be a complete monomial group with abelian base, namely, $G = A \wr \mathrm{Sym}_m$, the wreath product of a finite abelian group $A$ with the symmetric group on $m$ letters. Then the group $G$ is determined by its integral group ring.
... Can. J. Math. 48 (6) (1996), 1170-1179. [2] Dokuchaev, M. A.; Juriaans, S. O.; Polcino Milies, F. C.; Integral Group ... [3] Giambruno, A.; Sehgal, S. K.; Automorphisms of the integral group ring of the wreath product of a p-group with $S_n$, Rocky Mountain J. 22 (1992), ...
Chitin is a structural endogenous carbohydrate and a major component of fungal cell walls and arthropod exoskeletons. A renewable resource and the second most abundant polysaccharide in nature after cellulose, chitin is currently used in wastewater treatment, cosmetics, and medical and veterinary applications. This work comprises data mining of protein sequences related to the chitin metabolic pathway in completely sequenced genomes of extant organisms from the three domains of life, followed by meta-analysis using traditional sequence similarity comparison and complex network approaches. Complex networks involving proteins of the chitin metabolic pathway in extant organisms were constructed based on protein sequence similarity. Several standard network indices were estimated in order to obtain information on the topology of these networks, including those related to higher order neighborhood properties. Because of the assumed evolutionary character of the system, we also discuss issues related to modularity properties, with the concept of edge betweenness playing a particularly important role in our analysis. The complex network approach correctly identifies clusters of organisms that belong to phylogenetic groups without any a priori knowledge about the biological features of the investigated protein sequences. We envisage the prospect of using such a complex network approach as a high-throughput phylogenetic method.
Nonextensive statistical mechanics has been a source of investigation in mathematical structures such as deformed algebraic structures. In this work, we present some consequences of $q$-operations on the construction of $q$-numbers for all numerical sets. Based on such a construction, we present a new product that distributes over the $q$-sum. Finally, we present different patterns of $q$-Pascal's triangles, based on $q$-sum, whose elements are $q$-numbers.
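As a rough illustration of the $q$-operations mentioned above, the sketch below uses definitions that are common in the nonextensive literature but are not spelled out in this abstract, so they are assumptions here: the $q$-sum $x \oplus_q y = x + y + (1-q)xy$ and the $q$-logarithm $\ln_q x = (x^{1-q}-1)/(1-q)$. The function names are hypothetical. It checks numerically that $\ln_q(xy) = \ln_q x \oplus_q \ln_q y$, and that the ordinary product does not distribute over the $q$-sum, which is the kind of gap a new product would need to close. It does not reproduce the paper's new product.

```python
import math


def q_sum(x, y, q):
    """q-sum under the assumed definition x + y + (1 - q) * x * y."""
    return x + y + (1.0 - q) * x * y


def q_log(x, q):
    """q-logarithm under the assumed definition; reduces to ln(x) at q = 1."""
    if x <= 0:
        raise ValueError("q_log is only defined here for x > 0")
    if math.isclose(q, 1.0):
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


q = 0.5
x, y, z = 2.0, 3.0, 4.0

# The q-logarithm turns an ordinary product into a q-sum.
lhs = q_log(x * y, q)
rhs = q_sum(q_log(x, q), q_log(y, q), q)

# The ordinary product does not distribute over the q-sum when q != 1.
scaled_after = z * q_sum(x, y, q)
scaled_before = q_sum(z * x, z * y, q)
```

At $q = 1$ the $q$-sum reduces to ordinary addition, so both checks collapse to the familiar rules for the natural logarithm and the usual distributive law.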
Complex network theory is used to investigate the structure of meaningful concepts in written texts of individual authors. Networks have been constructed after a two-phase filtering, in which words with little semantic content are eliminated and all remaining words are set to their canonical form, without any number, gender or tense inflection. Each sentence in the text is added to the network as a clique. A large number of written texts have been scrutinised, and it is found that texts have small-world as well as scale-free structures. The growth process of these networks has also been investigated, and a universal evolution of network quantifiers has been found among the set of texts written by distinct authors. Further analyses, based on shuffling procedures applied either to the texts or to the constructed networks, provide hints on the role played by the word frequency and sentence length distributions in the network structure.
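The clique construction described above can be sketched as follows. This is a minimal illustration, assuming the two filtering phases have already been applied so that each sentence arrives as a list of canonical-form words; the function name and the toy sentences are hypothetical and not taken from the paper.

```python
from itertools import combinations


def build_clique_network(sentences):
    """Link every pair of distinct words that share a sentence.

    `sentences` is an iterable of word lists that are assumed to be
    already filtered and reduced to canonical form.
    Returns a dict mapping each word to the set of its neighbours.
    """
    adjacency = {}
    for words in sentences:
        unique_words = sorted(set(words))
        for word in unique_words:
            adjacency.setdefault(word, set())
        # A clique: all pairs of words in the sentence become edges.
        for a, b in combinations(unique_words, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Toy input (hypothetical): two short, pre-filtered sentences.
sentences = [["network", "word", "text"], ["word", "author"]]
net = build_clique_network(sentences)

# Each undirected edge is stored twice, once per endpoint.
edge_count = sum(len(neighbours) for neighbours in net.values()) // 2
```

Words that recur across sentences, such as "word" in this toy input, end up bridging the cliques, which is how the separate sentences merge into one connected network.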
We propose a two-parametric non-distributive algebraic structure that follows from $(q,q')$-logarithm and $(q,q')$-exponential functions. Properties of generalized $(q,q')$-operators are analyzed. We also generalize the proposal into a multi-parametric structure (generalization of logarithm and exponential functions and the corresponding algebraic operators). All $n$-parameter expressions recover the $(n-1)$-generalization when the corresponding $q_n\to1$. Nonextensive statistical mechanics has been the source of successive generalizations of entropic forms and mathematical structures, of which this work is a consequence.
A concept of higher order neighborhood in complex networks, introduced previously [Phys. Rev. E 73, 046101 (2006)], is systematically explored to investigate larger scale structures in complex networks. The basic idea is to consider each higher order neighborhood as a network in itself, represented by a corresponding adjacency matrix, and to define a number of new parameters in order to better characterize the whole network. Standard network indices are then used to evaluate the properties of each neighborhood. The identification of higher order neighborhoods is also regarded as an intermediate step towards the evaluation of global network properties, such as the diameter, the average shortest path between nodes, and the network fractal dimension. Results for a large number of typical networks are presented and discussed.
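One way to picture the idea of treating each higher order neighborhood as its own network is sketched below. It assumes, as an interpretation not stated in this abstract, that two nodes are linked in the order-$\ell$ neighborhood when their shortest-path distance in the original network is exactly $\ell$. The function names and the toy path graph are hypothetical, and the network is assumed to be connected and undirected.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest-path distances (in hops) from `source` to reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def neighborhood_matrices(adjacency):
    """Return (nodes, {order: matrix}) with matrix[i][j] = 1 when the
    shortest-path distance between nodes i and j equals `order`."""
    nodes = sorted(adjacency)
    index = {node: i for i, node in enumerate(nodes)}
    size = len(nodes)
    matrices = {}
    for source in nodes:
        for target, d in bfs_distances(adjacency, source).items():
            if d == 0:
                continue
            matrix = matrices.setdefault(d, [[0] * size for _ in range(size)])
            matrix[index[source]][index[target]] = 1
    return nodes, matrices


# Toy network (hypothetical): a path a - b - c - d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
nodes, matrices = neighborhood_matrices(path)

# Global quantities follow directly from the per-order matrices.
diameter = max(matrices)
n = len(nodes)
avg_shortest_path = sum(
    order * sum(map(sum, matrix)) for order, matrix in matrices.items()
) / (n * (n - 1))
```

The order-1 matrix is the ordinary adjacency matrix, the largest order with a nonzero matrix gives the diameter, and weighting each order by its number of entries gives the average shortest path.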
Papers by Thierry Petit Lobao