05 Networks
COMP7507
Visualization & Visual Analytics
Network Size
• Size of the WWW: ~60 billion pages
(http://www.worldwidewebsize.com/)
[Opte Project]
Network Size
• Facebook: 2.23 billion monthly active users
• Twitter: 330 million monthly active users
• Weibo: 360 million monthly active users
• QQ: 861 million monthly active users
(Data from: http://expandedramblings.com/index.php/resource-how-many-people-use-the-top-social-media/
updated August 2018)
[Paul Butler, http://paulbutler.org/archives/visualizing-facebook-friends/]
Graph Drawing
• Direct calculation based on graph structure
– Spanning tree
– Adjacency matrix layout
• Optimization-based
– Optimizing the graph aesthetic constraints
– Force-directed layout
Spanning Tree Layout
• Many graphs have tree-like structure or useful
spanning trees (i.e., trees that include all
vertices but only some edges of the original
graph)
– WWW, Social Networks
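One common way to obtain a spanning tree is a breadth-first search from a chosen root. The sketch below assumes a connected, undirected graph stored as an adjacency dictionary; the function name `bfs_spanning_tree` and the sample graph are illustrative, not taken from the slides.

```python
from collections import deque

def bfs_spanning_tree(adjacency, root):
    """Return the edges of a breadth-first spanning tree as (parent, child) pairs."""
    visited = {root}
    tree_edges = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in visited:
                # First time we reach this vertex: keep the edge that found it.
                visited.add(neighbour)
                tree_edges.append((node, neighbour))
                queue.append(neighbour)
    return tree_edges

# Hypothetical 4-vertex graph with 5 edges.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c"],
}
tree = bfs_spanning_tree(graph, "a")
print(tree)
```

The tree keeps all four vertices but only three of the five edges, and it can then be drawn with any tree layout.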
Adjacency Matrices
[http://bost.ocks.org/mike/miserables/]
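An adjacency matrix layout draws one row and one column per vertex and marks cell (i, j) when vertices i and j are connected. A minimal sketch, assuming an undirected graph given as an edge list (the node names and the text rendering are illustrative only):

```python
def adjacency_matrix(nodes, edges):
    """Build a symmetric 0/1 matrix for an undirected graph."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1  # undirected: mirror the entry
    return matrix

# Hypothetical path graph a-b-c-d.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
matrix = adjacency_matrix(nodes, edges)

# Crude text rendering: '#' for an edge, '.' for no edge.
for node, row in zip(nodes, matrix):
    print(node, " ".join("#" if cell else "." for cell in row))
```

Reordering the `nodes` list changes which patterns (e.g., clusters as blocks along the diagonal) become visible, which is why row ordering matters for this layout.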
Node-Link Layout
• Severe edge crossings and cluttering
Optimization Techniques
• Formulate the layout problem as an optimization
problem
• Governed by an equation incorporating
– the costs of items (e.g., distance between nodes) to
be optimized; and
– some constraints (e.g., no edge crossings allowed)
• Non-deterministic, so the resulting layout is hard to
predict and can differ between runs
• Force-directed layout is commonly used for
undirected graphs
Force Directed Layout
[http://philogb.github.io/jit/static/v20/Jit/Examples/ForceDirected/example1.html]
Force Directed Layout
• Model the nodes and edges of a graph as physical
bodies tied together with springs
• Two principles:
– Vertices connected by an edge should be drawn near
each other.
– Vertices should not be drawn too close to each other.
• Edges = springs (attractive force)
– Hooke’s Law: F = k * x, where k is the spring constant and x is the
extension of the spring beyond its rest length
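The two principles can be turned into a small simulation. The sketch below is a toy under stated assumptions, not the algorithm of any particular tool: edges pull with Hooke's law, and every pair of nodes pushes apart with an inverse-square force. The repulsive term, all parameter values, and the per-step movement cap are assumptions added here for illustration.

```python
import math
import random

def force_directed_layout(nodes, edges, iterations=300, k=1.0, rest_length=1.0,
                          repulsion=0.5, step=0.05, max_move=0.1, seed=0):
    """Toy spring layout; returns {node: (x, y)}. `nodes` must be a list."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}
        # Assumed repulsion between every pair: keeps vertices from crowding.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                push = repulsion / dist ** 2
                force[u][0] += push * dx / dist
                force[u][1] += push * dy / dist
                force[v][0] -= push * dx / dist
                force[v][1] -= push * dy / dist
        # Spring attraction along edges: F = k * (extension beyond rest length).
        for u, v in edges:
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            pull = k * (dist - rest_length)
            force[u][0] += pull * dx / dist
            force[u][1] += pull * dy / dist
            force[v][0] -= pull * dx / dist
            force[v][1] -= pull * dy / dist
        # Move each node along its net force, capped at max_move per iteration.
        for n in nodes:
            fx, fy = force[n]
            move = step * math.hypot(fx, fy)
            scale = step * min(1.0, max_move / move) if move > 0 else 0.0
            pos[n][0] += scale * fx
            pos[n][1] += scale * fy
    return {n: (x, y) for n, (x, y) in pos.items()}

# Hypothetical path graph a-b-c-d.
layout = force_directed_layout(["a", "b", "c", "d"],
                               [("a", "b"), ("b", "c"), ("c", "d")])
for node, (x, y) in layout.items():
    print(f"{node}: ({x:.2f}, {y:.2f})")
```

Changing `seed` changes the starting positions and therefore the final drawing, which illustrates why the result of an optimization-based layout is hard to predict.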
Filtering
• E.g., show only the edges linking you and your friends in your
Facebook network
• Ego network: one node (the ego), its direct neighbours, and the
edges among them
[Hansen 2011]
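As a sketch of this kind of filtering, the function below keeps only the ego, its direct neighbours, and the edges among them. It assumes an undirected edge list; the names in the sample data are made up.

```python
def ego_network(edges, ego):
    """Return (members, kept_edges) for the ego network of `ego`."""
    neighbours = ({v for u, v in edges if u == ego}
                  | {u for u, v in edges if v == ego})
    members = neighbours | {ego}
    # Keep an edge only if both of its endpoints survive the filter.
    kept_edges = [(u, v) for u, v in edges if u in members and v in members]
    return members, kept_edges

# Hypothetical friendship edges.
edges = [("you", "ann"), ("you", "bob"), ("ann", "bob"),
         ("bob", "cat"), ("cat", "dan")]
members, kept_edges = ego_network(edges, "you")
print(sorted(members))
print(kept_edges)
```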
Filtering
Results of Fruchterman-Reingold layout on
the 2007 US Senate voting data with and without edge filtering.
Clustering
• Structure-based clustering
– Use only structural information (e.g., edge
connections) of a graph
• Content-based clustering
– Use semantic data associated with graph elements
– Application specific
Clustering
• It is natural to choose the clustering with the fewest
edges between members of different clusters
– or, for graphs with weighted edges, with the minimum total
weight of the edges connecting members of different clusters
• Force directed layout algorithms
can also form clusters naturally
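The structure-based criterion can be made concrete by scoring a candidate clustering by the total weight of the edges that cross cluster boundaries. This sketch only evaluates given clusterings; it does not search for the best one, and the graph and cluster labels are hypothetical.

```python
def cut_weight(edges, cluster_of):
    """Total weight of edges whose endpoints lie in different clusters."""
    return sum(w for u, v, w in edges if cluster_of[u] != cluster_of[v])

# Two triangles (a, b, c) and (d, e, f) joined by the single edge c-d.
edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
         ("c", "d", 1),
         ("d", "e", 1), ("e", "f", 1), ("d", "f", 1)]

by_triangle = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
lopsided = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1, "f": 1}

print(cut_weight(edges, by_triangle))  # only c-d crosses
print(cut_weight(edges, lopsided))     # a-c and b-c cross
```

Under this criterion the split along the two triangles is preferred, because fewer edges cross between its clusters.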
Edge Bundling
• Clustering of edges instead of nodes
• To reduce cluttering
• Examples:
– Structure-based edge bundling
– Geometry-based edge clustering
[Cui et al., "Geometry-Based Edge Clustering for Graph Visualization," TVCG, 2008]
Hierarchical Edge Bundles
• Visualization examples without edge bundling
Colored edges representing adjacency relations on (left) balloon trees, and (right) tree maps.
[Holten, "Hierarchical Edge Bundles: Visualization of Adjacency Relations in Hierarchical Data," TVCG, 2006]
Hierarchical Edge Bundles
• Bundle adjacency edges along tree hierarchy
1. Tree drawing for hierarchical relations
2. Add adjacency edge (P0, P4) to graph
3. Find the tree path between P0 and P4 that goes through their least common ancestor (LCA)
4. Draw a curved adjacency edge that "aligns" to the tree
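The path-finding step can be sketched as follows, assuming the hierarchy is stored as a child-to-parent dictionary. The tree below is hypothetical (only the labels P0 and P4 come from the slide), and the curve drawing itself is left out: in Holten's paper the nodes on this path serve as control points of a spline.

```python
def path_to_root(parent, node):
    """List of nodes from `node` up to the root, inclusive."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path

def path_via_lca(parent, start, end):
    """Tree path from `start` to `end` through their least common ancestor."""
    up = path_to_root(parent, start)
    down = path_to_root(parent, end)
    ancestors_of_end = set(down)
    # The first ancestor of `start` that is also an ancestor of `end` is the LCA.
    lca = next(node for node in up if node in ancestors_of_end)
    return up[:up.index(lca) + 1] + down[:down.index(lca)][::-1]

# Hypothetical hierarchy: root -> A, B; A -> P0, P1; B -> P3, P4.
parent = {"root": None, "A": "root", "B": "root",
          "P0": "A", "P1": "A", "P3": "B", "P4": "B"}
control_path = path_via_lca(parent, "P0", "P4")
print(control_path)
```

Adjacency edges whose endpoints share most of their path end up following the same tree branches, which is what produces the bundles.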
Hierarchical Edge Bundles
• Examples with hierarchical edge bundling
Understanding a Network
• Studying network metrics over time helps
understand how a network evolves
• Network metrics help improve the visual display
of a network
– By assigning different visual attributes to the nodes
– By filtering and showing only the important nodes
Network Metrics
• Overall graph metrics
– Graph type; # of vertices; # of edges
– Self loops (e.g., a person replying to their own emails)
– Connected components
– Isolated vertices
– Maximum geodesic distance (aka diameter)
• i.e., the distance between two nodes that are farthest apart
– Average geodesic distance
• i.e., average distance from one node to another through the
graph edges
– Graph density
• i.e., # of edges ÷ # of possible edges
• For an undirected graph: density = (2 * |E|) / (|V| * (|V| - 1))
– etc.
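Several of these overall metrics can be computed with plain breadth-first search. The sketch below assumes a connected, undirected, unweighted graph without self loops, stored as an adjacency dictionary; function names and the sample graph are illustrative.

```python
from collections import deque

def bfs_distances(adjacency, source):
    """Geodesic (shortest-path) distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def graph_metrics(adjacency):
    n = len(adjacency)
    # Each undirected edge appears in two adjacency lists.
    m = sum(len(neighbours) for neighbours in adjacency.values()) // 2
    distances = [d for s in adjacency
                 for t, d in bfs_distances(adjacency, s).items() if t != s]
    return {
        "vertices": n,
        "edges": m,
        "density": 2 * m / (n * (n - 1)),
        "diameter": max(distances),
        "average_distance": sum(distances) / len(distances),
    }

# Hypothetical path graph a-b-c-d.
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
metrics = graph_metrics(path_graph)
print(metrics)
```

For this path graph, 3 of the 6 possible edges exist, so the density is 0.5, and the two end vertices are 3 steps apart, so the diameter is 3.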
Network Metrics
• Node metrics: structure-based metrics associated with a
node
– Degree (in-degree, out-degree)
– Betweenness centrality
– Closeness centrality
– Clustering coefficient
– Eigenvector centrality
– PageRank
• Used for identifying special or important nodes or
subgroups
• There are edge metrics as well (e.g., edge betweenness)
Degree Centrality
• A node is considered more important if it has a
greater degree
• Centrality means “importance”
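A minimal sketch of degree centrality for an undirected graph stored as an adjacency dictionary. The optional normalisation by |V| - 1 is a common convention added here as an assumption; the slides do not state whether the degree is normalised.

```python
def degree_centrality(adjacency, normalised=False):
    """Degree of each node, optionally divided by the maximum possible degree."""
    n = len(adjacency)
    scale = 1 / (n - 1) if normalised and n > 1 else 1
    return {node: len(neighbours) * scale
            for node, neighbours in adjacency.items()}

# Hypothetical star graph: one hub connected to three leaves.
star = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}
degree = degree_centrality(star)
print(degree)
print(max(degree, key=degree.get))
```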
Betweenness Centrality
• Measures the importance of a person in passing
information within a network
Betweenness Centrality
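• The betweenness centrality of a node v:
g(v) = \sum_{s \neq v \neq t} \sigma_{st}(v) / \sigma_{st}
– \sigma_{st}: # of shortest paths between s and t
– \sigma_{st}(v): # of shortest paths between s and t passing through v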
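The definition can be evaluated directly on small graphs: for every pair (s, t), count all shortest paths and the share of them that pass through v. The sketch below assumes an undirected, unweighted graph and sums over unordered pairs without normalisation; it is a straightforward illustration, not the faster Brandes algorithm that libraries typically use.

```python
from collections import deque
from itertools import combinations

def shortest_path_counts(adjacency, source):
    """BFS returning (distance, number of shortest paths) from `source`."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                count[neighbour] = 0
                queue.append(neighbour)
            if dist[neighbour] == dist[node] + 1:
                # Every shortest path to `node` extends to `neighbour`.
                count[neighbour] += count[node]
    return dist, count

def betweenness_centrality(adjacency):
    info = {s: shortest_path_counts(adjacency, s) for s in adjacency}
    score = {v: 0.0 for v in adjacency}
    for s, t in combinations(adjacency, 2):
        dist_s, count_s = info[s]
        dist_t, count_t = info[t]
        if t not in dist_s:
            continue  # s and t are in different components
        for v in adjacency:
            if v in (s, t) or v not in dist_s or v not in dist_t:
                continue
            # v lies on a shortest s-t path iff d(s,v) + d(v,t) = d(s,t).
            if dist_s[v] + dist_t[v] == dist_s[t]:
                score[v] += count_s[v] * count_t[v] / count_s[t]
    return score

# Hypothetical path graph a-b-c-d.
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
scores = betweenness_centrality(path_graph)
print(scores)
```

In the path graph, b and c each sit on the only shortest path of two pairs, while the end vertices sit on none.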
Closeness Centrality
• Measures how close a person is to the others
– how fast a message can reach all others from a
person
• What is the scenario for a person’s message to
reach all others in the fastest way?
• The closeness centrality of a node v:
C(v) = 1 / \sum_{u \neq v} d(v, u)
– The denominator is the farness of v: the total distance between
v and all other nodes
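A sketch of closeness as the reciprocal of farness, assuming a connected, undirected, unweighted graph so that every distance is finite. Some tools multiply the result by |V| - 1 to normalise it; that variant is not shown here.

```python
from collections import deque

def farness(adjacency, source):
    """Total geodesic distance from `source` to all other nodes (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return sum(dist.values())

def closeness_centrality(adjacency):
    return {node: 1 / farness(adjacency, node) for node in adjacency}

# Hypothetical path graph a-b-c-d.
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
closeness = closeness_centrality(path_graph)
print(closeness)
```

The middle vertices have a farness of 4 and the end vertices a farness of 6, so a message started in the middle reaches everyone in fewer steps overall.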
Clustering Coefficient
• Measures how well a person’s friends are
connected to each other
Clustering Coefficient
• Clustering coefficient of a node: the fraction of pairs of the node’s
neighbours that are directly connected to each other
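A sketch using the common local clustering coefficient for undirected graphs: the number of edges that exist among a node's k neighbours divided by the k(k - 1)/2 edges that could exist. This standard definition is assumed here because the slide's own formula is not reproduced in the text; returning 0 for nodes with fewer than two neighbours is also a convention chosen for the example.

```python
from itertools import combinations

def clustering_coefficient(adjacency, node):
    """Fraction of neighbour pairs of `node` that are themselves connected."""
    neighbours = list(adjacency[node])
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention: no neighbour pairs to check
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2 * links / (k * (k - 1))

# Hypothetical friendship graph: ann and bob know each other, cat knows only you.
friends = {
    "you": {"ann", "bob", "cat"},
    "ann": {"you", "bob"},
    "bob": {"you", "ann"},
    "cat": {"you"},
}
for person in friends:
    print(person, clustering_coefficient(friends, person))
```

Only one of the three possible friend pairs of "you" is connected, so the coefficient is 1/3, while both of ann's friends know each other, giving 1.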
Eigenvector Centrality
• Measures the influence of a node in a network
– a connection to an important person counts for more than
a connection to an unimportant one
• deg(Heather) = deg(Ed) = 3
• Ed connects with Diane, who is the most popular
(i.e., has the largest degree)
• Heather connects to Ike, who is among the least popular
• Hence, Ed’s eigenvector centrality is higher
Eigenvector Centrality
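• The eigenvector centrality score of a node is proportional to the sum
of the scores of its neighbours:
x_v = (1 / \lambda) \sum_{u \in N(v)} x_u
– N(v): the neighbours of v; \lambda: the largest eigenvalue of the adjacency matrix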
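Eigenvector centrality is usually computed by power iteration: repeatedly replace each node's score with the sum of its neighbours' scores and renormalise. The sketch below assumes a connected, undirected graph. It iterates with A + I instead of A (each node also keeps its own score), an assumption made here so the iteration also settles on bipartite graphs; this shifts the eigenvalues but leaves the leading eigenvector unchanged. The graph is hypothetical, not the one from the slide.

```python
import math

def eigenvector_centrality(adjacency, iterations=200, tolerance=1e-10):
    score = {node: 1.0 for node in adjacency}
    for _ in range(iterations):
        # x <- (A + I) x: own score plus the scores of all neighbours.
        new = {node: score[node] + sum(score[nb] for nb in adjacency[node])
               for node in adjacency}
        norm = math.sqrt(sum(value * value for value in new.values()))
        new = {node: value / norm for node, value in new.items()}
        change = sum(abs(new[node] - score[node]) for node in adjacency)
        score = new
        if change < tolerance:
            break
    return score

# Hypothetical graph: e and x both have degree 2, but e is tied to the hub h
# while x is tied to the leaf l.
graph = {
    "h": ["a", "b", "c", "e"],
    "a": ["h"], "b": ["h"], "c": ["h"],
    "e": ["h", "p"],
    "p": ["e", "x"],
    "x": ["p", "l"],
    "l": ["x"],
}
centrality = eigenvector_centrality(graph)
for node in sorted(centrality, key=centrality.get, reverse=True):
    print(node, round(centrality[node], 3))
```

Although e and x have the same degree, e scores higher because one of its neighbours is the hub, mirroring the Ed versus Heather comparison.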
PageRank
• Used by the Google search engine to rank websites
in its search results.
• Idea: More important websites are likely to
receive more links from other websites.
• A variant of eigenvector centrality
• Score of a page = the probability that a surfer who follows
links at random is on that page after many clicks.
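The "many clicks" interpretation can be simulated with power iteration over the link graph. This is a minimal sketch, not Google's production method: the damping factor of 0.85 (the chance of following a link rather than jumping to a random page) and the even redistribution of rank from pages without outgoing links are conventional assumptions, and the four-page web is made up.

```python
def pagerank(links, damping=0.85, iterations=100):
    """links: {page: [pages it links to]}; every target must also be a key."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread over all pages.
        dangling = sum(rank[page] for page in pages if not links[page])
        new = {page: (1 - damping) / n + damping * dangling / n for page in pages}
        for page in pages:
            for target in links[page]:
                # A page splits its rank evenly among the pages it links to.
                new[target] += damping * rank[page] / len(links[page])
        rank = new
    return rank

# Hypothetical web: c receives links from a, b and d; nothing links to d.
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

Page c ends up with the highest score because it receives the most links, and page d, which no one links to, keeps only the baseline share from random jumps.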
Visual Attributes Mapping
[Hansen 2011]
Tools and Datasets
• Tools for Graph Visualization
– Gephi
https://gephi.org
– NodeXL
http://nodexl.codeplex.com
Reference
• Ivan Herman, Guy Melançon, M. Scott Marshall, “Graph
Visualization and Navigation in Information
Visualization: A Survey”, IEEE Trans. Vis. Comput. Graph.,
6 (1), 2000, pp. 24-43.
• Matthew Ward, Georges Grinstein and Daniel Keim,
“Interactive Data Visualization: Foundations, Techniques,
and Applications”, 2010 [Chapter 8].
• Hansen, Shneiderman and Smith, “Analyzing Social
Media Networks with NodeXL: Insights from a
Connected World”, 2011.
• Isabel F. Cruz and Roberto Tamassia, “Graph Drawing
Tutorial” (http://cs.brown.edu/~rt/papers/gd-tutorial/gd-constraints.pdf)