The massive size of contemporary social networks poses a serious challenge both to the scalability of traditional graph clustering algorithms and to the evaluation of the discovered communities. In this manuscript, we therefore develop an approach for discovering hierarchical community structure in large networks. The introduced hybrid technique combines the strengths of bottom-up hierarchical clustering with those of top-down hierarchical clustering: the first approach is efficient at identifying small clusters, while the second is good at determining large ones. Our mixed hierarchical clustering technique assumes that an initial solution composed of k classes exists and combines the two methods mentioned above; it does not change the number of partitions but modifies the distribution of members across the initial classes. At the end of the clustering process, a fixed point is obtained that represents a local optimum of the cost function, which measures the degree of importance between two partitions. Consequently, the combined model leads to the emergence of a local community structure. To avoid this local optimum and detect a community structure that converges to the global optimum of the cost function, the detection of community structures is treated in this study not only as a clustering problem but also as an optimization problem. To this end, a novel mixed hierarchical clustering algorithm based on swarm intelligence is proposed for identifying community structures in social networks. To validate the proposed method, its performance is evaluated on both real and artificial networks, using internal as well as external clustering evaluation criteria.
- SHC:
Similarity-based hierarchical community
Heuristic algorithm for multi-scale hierarchical community detection
Partial matrix approximation convergence
- SN:
Social network
- JS:
Jaccard similarity measure
- AgA:
Agglomerative algorithm
- DST:
Dependence similarity table
- AHL:
Ascendant hierarchical level
- DivA:
Divisive algorithm
- DHL:
Descendant hierarchical level
- MHA:
Mixed hierarchical algorithm
- T-D-H-L:
Top-down hierarchical level
- B-U-H-L:
Bottom-up hierarchical level
- MHAS:
Mixed hierarchical algorithm-based swarms
- AntCDivA:
Ant colony-based divisive algorithm
- BeeCAgA:
Bee colony-based agglomerative algorithm
- LFR benchmark:
Lancichinetti Fortunato Radicchi benchmark
- CEC:
Cross-entropy clustering
- NMI:
Normalized mutual information
- DBI:
Davies–Bouldin index
- PGP:
Pretty good privacy
- SI:
Swarm intelligence
- \(Q_\mathrm{comb}\) :
Combined modularity function
- \(Q_\mathrm{comb}\) :
Separated modularity function
- \(\mathrm{SN} = (V; E; \mu )\) :
Graph modeling SN
- V :
Nodes representing social network members
- E :
Edges modeling the relationship between social network members
- \(\mu \) :
Weight of edges
- n :
Number of nodes
- \(\ell \) :
Hierarchical level
- k :
Number of sub-detected partitions at each hierarchical level
- \(P=\{p_{1},p_{2},\ldots ,p_{s}\}\), \(G=\{g_{1},g_{2},\ldots ,g_{r}\}\), \(C=\{c_{1},c_{2},\ldots ,c_{s}\}\) :
SN detected partitions
- \(p_{1},p_{2},\ldots ,p_{s}\), \(g_{1},g_{2},\ldots ,g_{r}\), \(c_{1},c_{2},\ldots ,c_{s}\) :
- m :
Social network members’
- D :
Any element contained in SN partitions
- A[i, j]:
The adjacency matrix of SN
- \(\overline{A}{[}i{]}\) :
Average of the vector A[i]
- cov(\(E_{i,j}\)):
Covariance function
- Op(\(V_{i}\)):
Extracted opinions from the node \(V_{i}\)
- Op(\(V_{j}\)):
Extracted opinions from the node \(V_{j}\)
- \(N_{i}\) :
Neighbor of node i
- \(N_{j}\) :
Neighbor of node j
- \(Score_{importantOp}\) :
Function measuring the degree of importance of nodes
- \(GScore_{importantOp}\) :
General \(Score_{importantOp}\)
- \(MoyScore_{importantOp}\) :
Average of \(Score_{importantOp}\) of sub-partitions
- Initpart:
Initial partition
- cordMin:
Function returning m having the least \(Score_{importantOp}\) value
- cordMax:
Function returning m having the highest \(Score_{importantOp}\) value
- \(Q_{DS}\) :
Dependence similarity-based modularity
- \(AgQ_{DS}\) :
\(Q_{DS}\) function for BeeCAgA
- \(DivQ_{DS}\) :
\(Q_{DS}\) function for AntCDivA
- \(MixQ_{DS}\) :
\(Q_{DS}\) function for MHAS
- E :
Energy function
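As an illustration of how the JS, \(N_{i}\) and \(N_{j}\) entries above relate, the following minimal sketch computes the standard Jaccard similarity between the neighbor sets of two nodes. It assumes an unweighted, undirected edge list and open neighborhoods; the helper names `neighbor_sets` and `jaccard_similarity` are hypothetical, and the exact similarity used in the paper (for example, one that takes the edge weights \(\mu \) into account) may differ.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Node = Hashable


def neighbor_sets(edges: Iterable[Tuple[Node, Node]]) -> Dict[Node, Set[Node]]:
    """Build the neighbor set N_i of every node from an undirected edge list."""
    neighbors: Dict[Node, Set[Node]] = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors


def jaccard_similarity(n_i: Set[Node], n_j: Set[Node]) -> float:
    """Standard Jaccard similarity |N_i & N_j| / |N_i | N_j|.

    Assumption: two nodes with no neighbors at all get similarity 0.0.
    """
    union = n_i | n_j
    if not union:
        return 0.0
    return len(n_i & n_j) / len(union)


# Toy network (illustrative data only): a triangle a-b-c with a pendant node d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
nbrs = neighbor_sets(edges)
js_ab = jaccard_similarity(nbrs["a"], nbrs["b"])  # share only neighbor c
js_cd = jaccard_similarity(nbrs["c"], nbrs["d"])  # no common neighbor
```

In this toy network, nodes a and b share one neighbor out of three distinct neighbors, so their similarity is 1/3, while c and d have no common neighbor and score 0.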
Toujani, R., Akaichi, J. An approach based on mixed hierarchical clustering and optimization for graph analysis in social media network: toward globally hierarchical community structure. Knowl Inf Syst 60, 907–947 (2019). https://doi.org/10.1007/s10115-019-01329-2
DOI: https://doi.org/10.1007/s10115-019-01329-2