Groups of vertices and Core-periphery structure
By: Ralucca Gera, NPS
Why?
• Most observed real networks have:
– Heavy tail (power law, exponential)
– High clustering (a high number of triangles, especially in social networks; a lower count otherwise)
– Small average path length (usually a small diameter)
– Communities/periphery/hierarchy
– Homophily and assortative mixing (similar nodes tend to be adjacent)
• Where does the structure come from?
• How do we model it?
Macro and Meso Scale
• Macro Scale properties (using all the interactions):
– Small world (small average path length, high clustering)
– Power-law degree distribution (generally from preferential attachment)
• Meso Scale properties applying to groups (using 𝑘-clique, 𝑘-core, 𝑘-plex):
– Community structure
– Core-periphery structure
• Micro Scale properties applying to small units:
– Edge properties (such as which nodes it connects, or whether it is a bridge)
– Node properties (such as degree, or being a cut-vertex)
Some common approaches to subgroup identification:
• components (and k-components)
• k-plex
• k-core
• k-dense
• k-clique
• Core-periphery structure
• Community structure
Source: Guido Caldarelli, Communities and Clustering in Some social Networks
NetSci 2007 New York, May 20th 2007
0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 1 0 1 0 1 1 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0
In a very clustered graph,
the adjacency matrix can be
put in a block form by
communities.
Communities with Matrices
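The block-form idea can be sketched as follows. This is only an illustration under stated assumptions: the 6-node graph and the community labels are hypothetical (not the 19-node matrix from the slide), and the labels are assumed to be known in advance. Relabelling the nodes so that members of a community are consecutive moves the 1s into blocks along the diagonal.

```python
def reorder_by_community(adj, community):
    """Return (order, reordered matrix) with nodes grouped by community label."""
    n = len(adj)
    # sorted() is stable, so nodes keep their relative order inside a community.
    order = sorted(range(n), key=lambda v: community[v])
    reordered = [[adj[u][v] for v in order] for u in order]
    return order, reordered


# Hypothetical graph: triangles {0, 2, 4} and {1, 3, 5} joined by the edge 4-1.
edges = [(0, 2), (0, 4), (2, 4), (1, 3), (1, 5), (3, 5), (4, 1)]
n = 6
adj = [[0] * n for _ in range(n)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

community = [0, 1, 0, 1, 0, 1]  # assumed known labels, one per node
order, blocks = reorder_by_community(adj, community)
for row in blocks:
    print(" ".join(map(str, row)))
```

In the printed matrix the two triangles appear as two 3x3 diagonal blocks, with the single inter-community edge as the only entry outside them.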
Some local and global metrics pertaining to structure of networks
(Structure they capture, with the corresponding local and global metrics)
• Direct influence; general feel for the distribution of the edges
– Local metrics: vertex degree, in- and out-degree
– Global metrics: degree distribution
• Closeness, distance between nodes
– Local metrics: geodesic (path), distance (numerical value)
– Global metrics: diameter, radius, average path length
• Connectedness of the network: How critical are vertices to the connectedness of the graph? How much damage can a network take before disconnecting?
– Local metrics: existence of a bridge, existence of a cut vertex
– Global metrics: cut sets, degree distribution
• Tight node/edge neighborhoods
– Local metrics: clique, plex, core, community, k-dense (for edges)
– Global metrics: community detection
• Centrality and influence
– Local metrics: degree centrality
– Global metrics: betweenness, eigenvector, PageRank, hubs and authorities
Components
• Recall that a graph is 𝑘-connected (a 𝑘-component) if it can be disconnected by the removal of 𝑘 vertices, and no 𝑘 − 1 vertices can disconnect it.
• A component is a maximal connected subgraph.
• A 𝑘-component (𝑘-connected component) is a maximal connected subgraph that can be disconnected (or reduced to a 𝐾1) by the removal of 𝑘 vertices, while no 𝑘 − 1 vertices can disconnect it.
• Alternatively: a 𝑘-component is a maximal connected subgraph in which there are 𝑘 vertex-independent paths between any two vertices.
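The removal-based definition can be checked directly on small graphs. The sketch below is a brute-force illustration, not an efficient algorithm: it tries every set of 𝑘 vertices for increasing 𝑘 and reports the first 𝑘 whose removal disconnects the graph or leaves a single vertex. The function names and the two example graphs are assumptions made for this example, and it does not enumerate maximal 𝑘-components.

```python
from itertools import combinations


def is_connected(adj, nodes):
    """Depth-first search restricted to the given node set."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in nodes and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


def vertex_connectivity(adj):
    """Smallest number of vertices whose removal disconnects the graph
    or leaves a single vertex (K1). Exponential time: small graphs only."""
    nodes = set(adj)
    if not is_connected(adj, nodes):
        return 0
    for k in range(1, len(nodes)):
        for removed in combinations(nodes, k):
            rest = nodes - set(removed)
            if len(rest) == 1 or not is_connected(adj, rest):
                return k
    return 0  # a single-vertex graph


# Hypothetical examples: a 5-cycle and the complete graph K4.
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
print(vertex_connectivity(cycle5))  # 2
print(vertex_connectivity(k4))      # 3
```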
In class exercise
• The 𝑘-component tells how robust a graph or subgraph is.
• Identify a subgraph that is:
– 1-connected
– 2-connected
– 3-connected
– 4-connected
𝑘-plex
• A 𝑘-plex: a maximal subset 𝑆 of nodes with deg_{𝐺[𝑆]}(𝑣) ≥ 𝑛 − 𝑘 for every 𝑣 in 𝑆, where 𝐺[𝑆] is the subgraph induced by 𝑆 and 𝑛 = |𝑆|.
• What is a 1-plex?
– 𝑘 = 1 : clique
– 𝑘 > 1 : “approximate clique”
• Idea: the group is missing a few edges to be a clique
– 𝑘 − 1 or fewer edges per vertex are allowed to be missing
– Useful in identifying subgroups with small diameter (possible cliques in the ground truth network).
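A minimal sketch of the degree condition, assuming 𝑛 = |𝑆| and an adjacency structure stored as a dictionary of neighbor sets (both function names are hypothetical). It checks only the condition deg ≥ 𝑛 − 𝑘 inside 𝑆; it does not test maximality.

```python
def is_k_plex(adj, S, k):
    """True if every vertex of S has at least |S| - k neighbors inside S."""
    S = set(S)
    n = len(S)
    return all(len(adj[v] & S) >= n - k for v in S)


def smallest_k(adj, S):
    """Smallest k for which S satisfies the k-plex degree condition."""
    S = set(S)
    return len(S) - min(len(adj[v] & S) for v in S)


# Hypothetical examples: a triangle (a clique) and a 4-cycle.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}

print(smallest_k(triangle, {0, 1, 2}))       # 1, i.e. a clique
print(is_k_plex(cycle4, {0, 1, 2, 3}, 1))    # False: each vertex misses one edge
print(is_k_plex(cycle4, {0, 1, 2, 3}, 2))    # True
```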
k-plex
• A 𝑘-plex: a maximal subset 𝑆 of nodes with deg_{𝐺[𝑆]}(𝑣) ≥ 𝑛 − 𝑘 for every 𝑣 in 𝑆, where 𝐺[𝑆] is the subgraph induced by 𝑆, and 𝑛 = |𝐺[𝑆]|.
• For what values of 𝑘 is the subgraph 𝐺[{2, 3, 4, 5, 6}] a 𝑘-plex?
In class exercise
• A 𝑘-plex: a maximal subset 𝑆 of nodes with deg_{𝐺[𝑆]}(𝑣) ≥ 𝑛 − 𝑘 for every 𝑣 in 𝑆, where 𝐺[𝑆] is the subgraph induced by 𝑆, and 𝑛 = |𝐺[𝑆]|.
• Identify a:
– 1-plex
– 2-plex
– 3-plex
– 4-plex
k-core
• A 𝑘-core: a maximal subset of nodes, 𝑆, with deg_{𝐺[𝑆]}(𝑣) ≥ 𝑘 for every 𝑣 in 𝑆, where 𝐺[𝑆] is the subgraph induced by 𝑆.
• Idea: enough edges are present to make a group strong, without worrying about the diameter of 𝐺[𝑆].
• What is the relation between 𝑘 and 𝑙 if 𝑆 is a 𝑘-core and an 𝑙-plex?
– If 𝑆 is a 𝑘-core with 𝑛 = |𝑆|, then 𝑆 is an (𝑛 − 𝑘)-plex
k-core
• A 𝑘-core of size 𝑛: a maximal subset 𝑆 of 𝑛 nodes with deg_{𝐺[𝑆]}(𝑣) ≥ 𝑘 for every 𝑣 in 𝑆, where 𝐺[𝑆] is the subgraph induced by 𝑆.
• Approach: eliminate lower-order cores until relatively dense subgroups are identified.
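The elimination approach can be sketched as repeated pruning: delete every node with fewer than 𝑘 neighbors among the surviving nodes until nothing changes. This is a simple quadratic-time illustration on a hypothetical graph (a 𝐾4 with a two-node tail), not an optimized implementation; the function name is an assumption.

```python
def k_core(adj, k):
    """Return the node set of the k-core (possibly empty) by repeated pruning."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            # Degree is counted only among nodes that have not been pruned.
            if len(adj[v] & alive) < k:
                alive.discard(v)
                changed = True
    return alive


# Hypothetical graph: K4 on {0, 1, 2, 3} with a tail 3-4-5.
graph = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2, 4},
    4: {3, 5},
    5: {4},
}
for k in range(1, 5):
    print(k, sorted(k_core(graph, k)))
```

Here the 1-core is the whole graph, the 2-core and 3-core are both {0, 1, 2, 3} (removing node 5 drops node 4 below degree 2), and the 4-core is empty.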
In class exercise
• A 𝑘-core of size 𝑛: a maximal subset 𝑆 of 𝑛 nodes with deg_{𝐺[𝑆]}(𝑣) ≥ 𝑘 for every 𝑣 in 𝑆, where 𝐺[𝑆] is the subgraph induced by 𝑆.
• Identify a:
– 1-core
– 2-core
– 3-core
– 4-core
In class exercise
• A 𝑘-dense sub-graph is a group of vertices in which each pair of vertices {i, j} has at least 𝑘 − 2 common neighbors.
• Identify a:
– 1-dense
– 2-dense
– 3-dense
– 4-dense
k-dense
• A 𝑘-dense sub-graph is a group of vertices in which each pair of vertices {i, j} has at least 𝑘 − 2 common neighbors.
• Idea: pairwise friends (𝑘-dense looks at edges rather than vertices when making them part of the 𝑘-dense group).
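A hedged sketch of the 𝑘-dense condition. Two assumptions are made here that the slide does not spell out: only adjacent pairs {i, j} are tested (following the remark that 𝑘-dense looks at edges), and common neighbors are counted inside the group. The function name and the example graphs are hypothetical.

```python
def is_k_dense(adj, S, k):
    """True if every edge {u, v} inside S has at least k - 2 common
    neighbors that also lie in S (assumed edge-based reading)."""
    S = set(S)
    for u in S:
        for v in adj[u] & S:
            # u < v visits each undirected edge once (integer labels assumed).
            if u < v and len(adj[u] & adj[v] & S) < k - 2:
                return False
    return True


# Hypothetical examples: the complete graph K4 and a 4-cycle.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}

print(is_k_dense(k4, range(4), 4))      # True: every edge has 2 common neighbors
print(is_k_dense(k4, range(4), 5))      # False
print(is_k_dense(cycle4, range(4), 3))  # False: edges have no common neighbor
```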
k-cliques
• A clique of size 𝑘: a subset of 𝑘 nodes, with every node adjacent to every other member of the subset (all 𝑘 − 1 of them)
• We usually search for the maximum clique
• Hard to find (the decision problem for the clique number is NP-complete)
• Why is it hard to use this concept on real networks?
– Because one might not infer/know all the edges of the true network, so a clique may exist without being captured in the data to be analyzed
– A relaxed version of a clique might be just as useful in large networks.
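To make the cost of the search concrete, here is a deliberately naive sketch that tests candidate subsets from the largest size down. It is exponential in the number of nodes, so it is only meant for tiny graphs; the function names and the example graph are assumptions made for illustration.

```python
from itertools import combinations


def is_clique(adj, nodes):
    """True if every pair of the given nodes is adjacent."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))


def maximum_clique(adj):
    """Exhaustive search from the largest size down: exponential time."""
    nodes = sorted(adj)
    for size in range(len(nodes), 0, -1):
        for candidate in combinations(nodes, size):
            if is_clique(adj, candidate):
                return set(candidate)
    return set()


# Hypothetical graph: K4 on {0, 1, 2, 3} with a tail 3-4-5.
graph = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2, 4},
    4: {3, 5},
    5: {4},
}
print(maximum_clique(graph))  # {0, 1, 2, 3}
```

Deleting a single edge of the 𝐾4 (for example, because it was not observed in the data) would make this search return only a triangle, which is the motivation for the relaxed notions.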
In class exercise
• A clique of size 𝑘: a subset of 𝑘 nodes, with every node connected to every other member of the subset.
• Identify a:
– 1-clique
– 2-clique
– 3-clique
– 4-clique
Cliques, plexes and cores
• Clique of size 𝑘: maximal subset of nodes, with every node adjacent to every other member of the subset
• 𝑘-plex of size 𝑛: maximal subset of nodes, with every node adjacent to at least 𝑛 − 𝑘 other members of the subset
– 𝑘 = 1 : clique
– 𝑘 > 1 : “approximate clique”
• 𝑘-core: maximal subset of nodes, with every node adjacent to at least 𝑘 others in the subset
• 𝑘-dense sub-graph: a group of vertices in which each pair of vertices {i, j} has at least 𝑘 − 2 common neighbors.
Not used so much, rather k-core
Communities vs. core/dense/clique
• 𝑘-core/plex/dense/clique: look only inside the group of nodes
• Communities look at both internal and external ties (many internal ties and few external ties)
• Core-periphery decomposition also looks at ties internal and external to the core (the core doesn’t have to be a clique)
K-core (k-shell)
http://3.bp.blogspot.com/-TIjz3nstWD0/ToGwUGivEjI/AAAAAAAAsWw/etkwklnPNw4/s1600/k-cores.png
Generally, though not rigorously defined: the core of the network is the 𝑘-core for the largest 𝑘, and the periphery is everything else.
There are modifications in which the 𝑘-cores for several of the top values of 𝑘 together make up the core.
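One way to read this split off a graph is through core numbers (the largest 𝑘 for which a node still belongs to the 𝑘-core), obtained by peeling shells of increasing 𝑘. The sketch below is a simple illustration on a hypothetical graph, with hypothetical function names; it takes the core to be the nodes with the maximum core number, which is only one of the possible conventions.

```python
def core_numbers(adj):
    """Core number of v = largest k such that v belongs to the k-core."""
    alive = set(adj)
    core = {}
    k = 0
    while alive:
        # Peel every node whose remaining degree is at most k.
        peel = [v for v in alive if len(adj[v] & alive) <= k]
        if peel:
            for v in peel:
                core[v] = k
            alive -= set(peel)
        else:
            k += 1  # nothing left to peel at this level: move to the next shell
    return core


def core_and_periphery(adj):
    """Split nodes into the highest-k core and everything else."""
    numbers = core_numbers(adj)
    top = max(numbers.values())
    core = {v for v, c in numbers.items() if c == top}
    return core, set(adj) - core


# Hypothetical graph: K4 on {0, 1, 2, 3} with a tail 3-4-5.
graph = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2, 4},
    4: {3, 5},
    5: {4},
}
print(core_numbers(graph))
print(core_and_periphery(graph))
```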
Core-periphery
[Figure: adjacency matrix plot; dark blue = 1 (adjacent), white = 0 (nonadjacent)]
Deciding on core-periphery
How to decide if a network has core-periphery structure?
• Not well defined either, but generally the density of the 𝑘-core must be high.
• Desired: high correlation, 𝜌, defined as
\rho = \sum_{i,j} a_{ij} \, \delta_{ij},
where a_{ij} is the (i, j) entry of the adjacency matrix, and
\delta_{ij} = \begin{cases} 1, & \text{if node } i \text{ or node } j \text{ is in the core} \\ 0, & \text{otherwise} \end{cases}
http://www.sciencedirect.com/science/article/pii/S0378873399000192
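The sum can be evaluated directly for a candidate core. The sketch below follows the formula as written on the slide, so it is unnormalized and sums over ordered pairs (i, j), counting each undirected edge twice. The function name, the 4-node graph, and the candidate cores are hypothetical.

```python
def core_periphery_rho(adj_matrix, core):
    """rho = sum over i, j of a_ij * delta_ij, with delta_ij = 1 when
    node i or node j is in the core and 0 otherwise."""
    core = set(core)
    n = len(adj_matrix)
    rho = 0
    for i in range(n):
        for j in range(n):
            delta = 1 if (i in core or j in core) else 0
            rho += adj_matrix[i][j] * delta
    return rho


# Hypothetical graph: node 0 adjacent to 1, 2, 3, plus the edge 1-2.
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
print(core_periphery_rho(A, {0}))  # 6: the three edges at node 0, counted twice
print(core_periphery_rho(A, {3}))  # 2: only the edge 0-3 touches the core
```

Comparing the value across candidate cores of the same size gives a rough sense of which choice matches the ideal pattern best.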
Extensions of core-periphery?!
Limitation:
• There are just two classes of nodes: core and
periphery.
• Is a three-class partition consisting of core,
semiperiphery, and periphery more realistic?
• Or even partitioning with more classes?
• The problem becomes more difficult as the
number of classes is increased, and good
justification is needed.
http://www.sciencedirect.com/science/article/pii/S0378873399000192
Review of structures!
[Figure, from Aaron Clauset and Mason Porter: dark shade = 0 (nonadjacent), light shade = 1 (adjacent)]
References
• Newman, M. E. "Analysis of weighted networks." Physical Review E 70.5 (2004).
• Borgatti, Stephen P., and Martin G. Everett. "Models of core/periphery structures." Social Networks 21.4 (2000): 375-395.
• Csermely, Peter, et al. "Structure and dynamics of core/periphery networks." Journal of Complex Networks 1.2 (2013): 93-123.
• Kitsak, Maksim, et al. "Identification of influential spreaders in complex networks." Nature Physics 6.11 (2010): 888-893.
• Seidman, S. B. "Network structure and minimum degree." Social Networks 5.3 (1983): 269-287.