The document discusses various approaches to identifying subgroups or structures within networks, including k-components, k-plexes, k-cores, k-cliques, core-periphery structures, and communities. It provides definitions and examples of each approach, noting that k-components, k-plexes, and k-cores examine the density of connections within a subgroup, while core-periphery structure and communities look at both internal and external connections of subgroups. The document aims to explain commonly used approaches to analyzing the meso-scale properties and structures that emerge within complex networks.
2. Why?
• Most observed real networks have:
– A heavy-tailed degree distribution (power law, exponential)
– High clustering (a high number of triangles, especially in social networks; lower counts otherwise)
– Short average path length (usually a small diameter)
– Communities/periphery/hierarchy
– Homophily and assortative mixing (similar nodes tend to be adjacent)
• Where does the structure come from?
• How do we model it?
3. Macro and Meso Scale
• Macro-scale properties (using all the interactions):
– Small world (short average path length, high clustering)
– Power-law degree distribution (generally attributed to preferential attachment)
• Meso-scale properties, applying to groups (using k-clique, k-core, k-plex):
– Community structure
– Core-periphery structure
• Micro-scale properties, applying to small units:
– Edge properties (such as which nodes it connects, or being a bridge)
– Node properties (such as degree, or being a cut vertex)
4. Some common approaches to subgroup identification:
• Components (and k-components)
• k-plex
• k-core
• k-dense
• k-clique
• Core-periphery structure
• Community structure
6. Some local and global metrics pertaining to structure of networks
• Structure captured: direct influence; general feel for the distribution of the edges
– Local metrics: vertex degree, in- and out-degree
– Global metrics: degree distribution
• Structure captured: closeness, distance between nodes
– Local metrics: geodesic (path), distance (numerical value)
– Global metrics: diameter, radius, average path length
• Structure captured: connectedness of the network (How critical are vertices to the connectedness of the graph? How much damage can a network take before disconnecting?)
– Local metrics: existence of a bridge, existence of a cut vertex
– Global metrics: cut sets, degree distribution
• Structure captured: tight node/edge neighborhoods
– Local metrics: clique, plex, core, community, k-dense (for edges)
– Global metrics: community detection
• Structure captured: centrality and influence
– Local metrics: degree centrality
– Global metrics: betweenness, eigenvector, PageRank, hubs and authorities
8. Components
• Recall that a graph is 𝑘-connected (has connectivity 𝑘) if it can be disconnected by the removal of 𝑘 vertices, and no 𝑘 − 1 vertices can disconnect it.
• A component is a maximal connected subgraph.
• A 𝑘-component (𝑘-connected component) is a maximal connected subgraph that can be disconnected (or reduced to a single vertex, 𝐾1) by the removal of 𝑘 vertices, while no 𝑘 − 1 vertices can disconnect it.
• Alternatively: a 𝑘-component is a maximal connected subgraph in which there are at least 𝑘 vertex-independent paths between any two vertices.
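The removal-based definition above can be checked directly on small graphs. The following is a minimal brute-force sketch, not an efficient algorithm: the adjacency-dictionary representation and the names `is_connected`, `vertex_connectivity`, and `bowtie` are assumptions made for illustration, and the search is exponential in the number of vertices.

```python
from itertools import combinations

def is_connected(adj, nodes):
    """Return True if the subgraph induced by `nodes` is connected."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in nodes and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes

def vertex_connectivity(adj, nodes=None):
    """Smallest number of vertices whose removal disconnects the induced
    subgraph or leaves at most one vertex (brute force, small graphs only)."""
    nodes = set(adj) if nodes is None else set(nodes)
    for k in range(len(nodes)):
        for removed in combinations(list(nodes), k):
            rest = nodes - set(removed)
            if len(rest) <= 1 or not is_connected(adj, rest):
                return k
    return 0  # only reached for an empty vertex set

# Hypothetical example: two triangles sharing vertex 3 (a "bowtie").
bowtie = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(vertex_connectivity(bowtie))             # whole graph: vertex 3 is a cut vertex
print(vertex_connectivity(bowtie, {1, 2, 3}))  # one triangle on its own
```

In this example the whole graph has connectivity 1, while each triangle has connectivity 2, so the triangles are more robust than the graph that contains them.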
9. In class exercise
• The 𝑘-component tells how robust a graph or
subgraph is.
• Identify a subgraph that is:
– 1-connected
– 2-connected
– 3-connected
– 4-connected
11. 𝑘-plex
• A 𝑘-plex: a maximal subset of nodes 𝑆, with 𝑛 = |𝑆|, such that $\deg_{G[S]}(v) \ge n - k$ for every 𝑣 in 𝑆, where 𝐺[𝑆] is the subgraph induced by 𝑆.
• What is a 1-plex?
– 𝑘 = 1: clique
– 𝑘 > 1: “approximate clique”
• Idea: the subgroup is missing a few edges to be a clique
– Each vertex may be missing 𝑘 − 1 or fewer edges
– Useful in identifying subgroups with small diameter (possible cliques in the ground-truth network).
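The degree condition for a 𝑘-plex can be tested on a given vertex set with a few lines of code. This is a minimal sketch under stated assumptions: the graph is an adjacency dictionary, 𝑛 is taken to be the size of the subset, the subset is non-empty, and maximality is not checked. The names `smallest_plex_order`, `is_k_plex`, and `cycle4` are illustrative.

```python
def smallest_plex_order(adj, S):
    """Smallest k for which every vertex of S has at least |S| - k
    neighbours inside S (S must be non-empty; maximality is not checked)."""
    S = set(S)
    n = len(S)
    min_degree = min(len(adj[v] & S) for v in S)  # minimum degree in G[S]
    return n - min_degree

def is_k_plex(adj, S, k):
    """True if S satisfies the k-plex degree condition."""
    return smallest_plex_order(adj, S) <= k

# Hypothetical example: a 4-cycle 1-2-3-4-1.
cycle4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
print(is_k_plex(cycle4, {1, 2, 3, 4}, 1))  # every vertex misses one neighbour
print(is_k_plex(cycle4, {1, 2, 3, 4}, 2))
print(is_k_plex(cycle4, {1, 2}, 1))        # a single edge is a clique
```

In the 4-cycle every vertex has 2 of the 3 possible neighbours, so the vertex set meets the condition for 𝑘 = 2 but not for 𝑘 = 1.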
12. k-plex
• A 𝑘-plex: a maximal subset of nodes 𝑆 such that $\deg_{G[S]}(v) \ge n - k$ for every 𝑣 in 𝑆, where 𝐺[𝑆] is the subgraph induced by 𝑆, and 𝑛 = |𝑆|.
• For what values of 𝑘 is the subgraph 𝐺[{2, 3, 4, 5, 6}] a 𝑘-plex?
13. In class exercise
• A 𝑘-plex: a maximal subset of nodes 𝑆 such that $\deg_{G[S]}(v) \ge n - k$ for every 𝑣 in 𝑆, where 𝐺[𝑆] is the subgraph induced by 𝑆, and 𝑛 = |𝑆|.
• Identify a:
– 1-plex
– 2-plex
– 3-plex
– 4-plex
15. k-core
• A 𝑘-core: a maximal subset of nodes 𝑆 such that $\deg_{G[S]}(v) \ge k$ for every 𝑣 in 𝑆, where 𝐺[𝑆] is the subgraph induced by 𝑆.
• Idea: enough edges are present to make the group strong, without worrying about the diameter of 𝐺[𝑆].
• What is the relation between 𝑘 and 𝑙 if 𝑆 is a 𝑘-core and an 𝑙-plex?
– If 𝑆 is a 𝑘-core with 𝑛 = |𝑆| nodes, then 𝑆 satisfies the degree condition of an (𝑛 − 𝑘)-plex, since 𝑘 = 𝑛 − (𝑛 − 𝑘).
16. k-core
• A 𝑘-core of size 𝑛: a maximal subset 𝑆 of 𝑛 nodes such that $\deg_{G[S]}(v) \ge k$ for every 𝑣 in 𝑆, where 𝐺[𝑆] is the subgraph induced by 𝑆.
• Approach: eliminate lower-order cores until
relatively dense subgroups are identified.
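One common way to realize this elimination is to repeatedly delete vertices that have fewer than 𝑘 remaining neighbours. The sketch below assumes an undirected graph stored as an adjacency dictionary with no self-loops; the names `k_core` and `g` are illustrative.

```python
def k_core(adj, k):
    """Vertex set of the k-core, found by repeatedly deleting vertices
    that have fewer than k neighbours among the remaining vertices."""
    alive = set(adj)
    deg = {v: len(adj[v]) for v in adj}
    queue = [v for v in adj if deg[v] < k]
    while queue:
        v = queue.pop()
        if v not in alive:
            continue  # already deleted via an earlier queue entry
        alive.remove(v)
        for w in adj[v]:
            if w in alive:
                deg[w] -= 1
                if deg[w] < k:
                    queue.append(w)
    return alive

# Hypothetical example: triangle 1-2-3 with a tail 3-4-5 attached.
g = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
print(k_core(g, 1))  # every vertex has at least one neighbour
print(k_core(g, 2))  # the tail is peeled away, leaving the triangle
print(k_core(g, 3))  # nothing survives
```

Note that vertex 4 starts with degree 2 but still drops out of the 2-core, because deleting vertex 5 lowers its degree; this cascading is why the peeling has to be repeated rather than applied once.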
17. In class exercise
• A 𝑘-core of size 𝑛: a maximal subset 𝑆 of 𝑛 nodes such that $\deg_{G[S]}(v) \ge k$ for every 𝑣 in 𝑆, where 𝐺[𝑆] is the subgraph induced by 𝑆.
• Identify a:
– 1-core
– 2-core
– 3-core
– 4-core
18. In class exercise
• A 𝑘-dense subgraph is a group of vertices in which each pair of adjacent vertices {i, j} has at least 𝑘 − 2 common neighbors.
• Identify a:
– 1-dense
– 2-dense
– 3-dense
– 4-dense
20. k-dense
• A 𝑘-dense subgraph is a group of vertices in which each pair of adjacent vertices {i, j} has at least 𝑘 − 2 common neighbors.
• Idea: pairwise friends (𝑘-dense considers edges rather than vertices when deciding what belongs to the group)
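The common-neighbour condition can be checked edge by edge. The sketch below assumes the condition is applied to adjacent pairs (edges) inside the group, which is how the "looks at edges" remark reads, and that common neighbours are counted inside the group as well. The adjacency-dictionary representation and the names `is_k_dense`, `k4`, and `tri_tail` are illustrative.

```python
def is_k_dense(adj, S, k):
    """True if every edge {i, j} inside S has at least k - 2 common
    neighbours that are also inside S."""
    S = set(S)
    for i in S:
        for j in adj[i] & S:
            common = adj[i] & adj[j] & S
            if len(common) < k - 2:
                return False
    return True

# Hypothetical examples: a complete graph on 4 vertices, and a triangle
# 1-2-3 with a pendant vertex 4 attached to vertex 3.
k4 = {1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {1, 2, 3}}
tri_tail = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

print(is_k_dense(k4, {1, 2, 3, 4}, 4))        # each edge has 2 common neighbours
print(is_k_dense(tri_tail, {1, 2, 3, 4}, 3))  # edge {3, 4} has none
print(is_k_dense(tri_tail, {1, 2, 3}, 3))     # the triangle alone qualifies
```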
23. k-cliques
• A clique of size 𝑘: a subset of 𝑘 nodes, with every node adjacent to every other member of the subset (all 𝑘 − 1 of them)
• We usually search for the maximum clique
• Hard to find (the decision problem for the clique number is NP-complete)
• Why is it hard to use this concept on real networks?
– Because one might not infer/know all the edges of the true network, a clique may exist but not be captured in the data to be analyzed
– A relaxed version of a clique might be just as useful in large networks.
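To make the definition and its cost concrete, the sketch below checks the clique condition for a vertex set and then searches for a maximum clique by trying subsets from largest to smallest. It is an exhaustive search intended only for very small graphs, consistent with the hardness noted above; the adjacency-dictionary representation and the names `is_clique`, `maximum_clique`, and `tri_tail` are illustrative.

```python
from itertools import combinations

def is_clique(adj, S):
    """True if every pair of distinct vertices in S is adjacent."""
    return all(v in adj[u] for u, v in combinations(S, 2))

def maximum_clique(adj):
    """A largest clique, found by exhaustive search over vertex subsets
    (exponential time; for illustration on small graphs only)."""
    nodes = list(adj)
    for size in range(len(nodes), 0, -1):
        for S in combinations(nodes, size):
            if is_clique(adj, S):
                return set(S)
    return set()

# Hypothetical example: triangle 1-2-3 with a pendant vertex 4 on vertex 3.
tri_tail = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(is_clique(tri_tail, {3, 4}))  # a single edge is a clique of size 2
print(maximum_clique(tri_tail))     # the triangle
```

If the edge between vertices 1 and 2 were missing from the observed data, the search would no longer report the triangle, which illustrates why relaxed notions such as the 𝑘-plex can be more practical on real networks.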
24. In class exercise
• A clique of size 𝑘: a subset of 𝑘 nodes, with
every node connected to every other member of
the subset.
• Identify a:
– 1-clique
– 2-clique
– 3-clique
– 4-clique
25. Cliques, plexes and cores
• Clique of size 𝑘: a maximal subset of 𝑘 nodes, with every node adjacent to every other member of the subset
• 𝑘-plex of size 𝑛: a maximal subset of 𝑛 nodes, with every node adjacent to at least 𝑛 − 𝑘 other members of the subset
– 𝑘 = 1: clique
– 𝑘 > 1: “approximate clique”
• 𝑘-core: a maximal subset of nodes, with every node adjacent to at least 𝑘 others in the subset
• 𝑘-dense subgraph: a group of vertices in which each pair of adjacent vertices {i, j} has at least 𝑘 − 2 common neighbors
– Not used as much as the 𝑘-core
27. Communities vs. core/dense/clique
• 𝑘-core/plex/dense/clique: look only inside the group of nodes
• Communities look at both internal and external ties (many internal ties and few external ties)
• Core-periphery decomposition also looks at ties internal and external to the core (the core doesn’t have to be a clique)
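The contrast between internal and external ties can be made concrete by counting both for a candidate group. This is a minimal sketch assuming an undirected graph stored as an adjacency dictionary; the names `internal_external_edges` and `two_triangles` are illustrative, and the sketch only counts ties rather than performing community detection.

```python
def internal_external_edges(adj, S):
    """Return (internal, external): the number of edges with both ends
    in S, and the number of edges with exactly one end in S."""
    S = set(S)
    internal = sum(len(adj[v] & S) for v in S) // 2  # each edge seen twice
    external = sum(len(adj[v] - S) for v in S)
    return internal, external

# Hypothetical example: two triangles joined by the single edge 3-4.
two_triangles = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
print(internal_external_edges(two_triangles, {1, 2, 3}))  # (3, 1)
print(internal_external_edges(two_triangles, {3, 4}))     # (1, 4)
```

The group {1, 2, 3} has many internal ties and few external ones, as expected of a community, while {3, 4} shows the opposite pattern.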
30. Deciding on core-periphery
How to decide if a network has core-periphery structure?
• Not well defined either, but generally the density of the 𝑘-core must be high.
• Desired: high correlation, 𝜌, defined as:
$$\rho = \sum_{i,j} a_{ij}\,\delta_{ij},$$
where $a_{ij}$ is the (i, j) adjacency matrix entry, and
$$\delta_{ij} = \begin{cases} 1, & \text{if node } i \text{ or } j \text{ is in the core} \\ 0, & \text{otherwise} \end{cases}$$
http://www.sciencedirect.com/science/article/pii/S0378873399000192
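The measure 𝜌 can be computed directly from its definition for a proposed core. The sketch below assumes an unweighted, undirected graph stored as an adjacency dictionary and sums over ordered pairs (i, j), so each undirected edge contributes twice; the names `core_periphery_rho` and `star` are illustrative. The value is a raw count, so it is only meaningful when comparing candidate cores on the same network.

```python
def core_periphery_rho(adj, core):
    """Sum of a_ij * delta_ij over ordered pairs (i, j), where a_ij = 1
    if i and j are adjacent and delta_ij = 1 if i or j is in the core."""
    core = set(core)
    rho = 0
    for i in adj:
        for j in adj[i]:              # only pairs with a_ij = 1 contribute
            if i in core or j in core:
                rho += 1
    return rho

# Hypothetical example: a star with centre 0 and leaves 1, 2, 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(core_periphery_rho(star, {0}))  # every edge touches the core
print(core_periphery_rho(star, {1}))  # only the edge 0-1 touches the core
```

Choosing the centre as the core gives the larger value, matching the intuition that the star's hub is its core.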
31. Extensions of core-periphery?!
Limitation:
• There are just two classes of nodes: core and
periphery.
• Is a three-class partition consisting of core,
semiperiphery, and periphery more realistic?
• Or even partitioning with more classes?
• The problem becomes more difficult as the
number of classes is increased, and good
justification is needed.
http://www.sciencedirect.com/science/article/pii/S0378873399000192
32. Review of structures!
[Figure: adjacency-matrix illustrations of the structures reviewed, from Aaron Clauset and Mason Porter; dark shade = 0 (nonadjacent), light shade = 1 (adjacent)]
33. References
• Newman, M. E. J. "Analysis of weighted networks." Physical Review E 70.5 (2004).
• Borgatti, Stephen P., and Martin G. Everett. "Models of core/periphery structures." Social Networks 21.4 (2000): 375-395.
• Csermely, Peter, et al. "Structure and dynamics of core/periphery networks." Journal of Complex Networks 1.2 (2013): 93-123.
• Kitsak, Maksim, et al. "Identification of influential spreaders in complex networks." Nature Physics 6.11 (2010): 888-893.
• Seidman, S. B. "Network structure and minimum degree." Social Networks 5.3 (1983): 269-287.