A HyperNet Architecture
Shufeng Huang
University of Kentucky, theremetony@gmail.com
Theses and Dissertations--Computer Science, University of Kentucky, 2014
Recommended Citation
Huang, Shufeng, "A HyperNet Architecture" (2014). Theses and Dissertations--Computer Science. Paper 18.
http://uknowledge.uky.edu/cs_etds/18
STUDENT AGREEMENT:
I represent that my thesis or dissertation and abstract are my original work. Proper attribution has been
given to all outside sources. I understand that I am solely responsible for obtaining any needed copyright
permissions. I have obtained needed written permission statement(s) from the owner(s) of each third-party copyrighted matter to be included in my work, allowing electronic distribution (if such use is not permitted by the fair use doctrine), which will be submitted to UKnowledge as an Additional File.
I hereby grant to The University of Kentucky and its agents the irrevocable, non-exclusive, and royalty-free license to archive and make accessible my work in whole or in part in all forms of media, now or
hereafter known. I agree that the document mentioned above may be made available immediately for
worldwide access unless an embargo applies.
I retain all other ownership rights to the copyright of my work. I also retain the right to use in future
works (such as articles or books) all or part of my work. I understand that I am free to register the
copyright to my work.
REVIEW, APPROVAL AND ACCEPTANCE
The document mentioned above has been reviewed and accepted by the student's advisor, on behalf of
the advisory committee, and by the Director of Graduate Studies (DGS), on behalf of the program; we
verify that this is the final, approved version of the student's thesis including all changes required by the
advisory committee. The undersigned agree to abide by the statements above.
Shufeng Huang, Student
Dr. James Griffioen, Major Professor
Dr. Miroslaw Truszczynski, Director of Graduate Studies
A HyperNet Architecture
DISSERTATION
A dissertation submitted in partial fulfillment of the
requirements for the degree of Doctor of Philosophy in the
College of Engineering
at the University of Kentucky
By
Shufeng Huang
Lexington, Kentucky
Directors: Dr. James Griffioen and Dr. Kenneth L. Calvert, Professors
of Computer Science
Lexington, Kentucky
2014
Copyright © Shufeng Huang 2014
ABSTRACT OF DISSERTATION
A HyperNet Architecture
Network virtualization is becoming a fundamental building block of future Internet
architectures. By adding networking resources into the cloud, it is possible for
users to rent virtual routers from the underlying network infrastructure, connect
them with virtual channels to form a virtual network, and tailor the virtual network
(e.g., load application-specific networking protocols, libraries and software stacks on
to the virtual routers) to carry out a specific task. In addition, network virtualization
technology allows such special-purpose virtual networks to co-exist on the same underlying network infrastructure without interfering with each other.
Although the underlying network resources needed to support virtualized networks
are rapidly becoming available, constructing a virtual network from the ground up and
using the network is a challenging and labor-intensive task, one best left to experts.
To tackle this problem, we introduce the concept of a HyperNet, a pre-built, pre-configured network package that a user can easily deploy to create or access a virtual network
to carry out a specific task (e.g., multicast video conferencing). HyperNets package
together the network topology configuration, software, and network services needed
to create and deploy a custom virtual network. Users download HyperNets from
HyperNet repositories and then run them on virtualized network infrastructure
much like users download and run virtual appliances on a virtual machine. To support
the HyperNet abstraction, we created a Network Hypervisor service that provides a
set of APIs that can be called to create a virtual network with certain characteristics.
To evaluate the HyperNet architecture, we implemented several example HyperNets and ran them on our prototype implementation of the Network Hypervisor.
Our experiments show that the Hypervisor API can be used to compose almost any
special-purpose network capable of carrying out functions that the current
Internet does not provide. Moreover, the design of our HyperNet architecture is
highly extensible, enabling developers to write high-level libraries (using the Network
Hypervisor APIs) to achieve complicated tasks.
Keywords: HyperNet, virtual network, network hypervisor, programmable router,
SDN
A HyperNet Architecture
By
Shufeng Huang
ACKNOWLEDGMENTS
I am very thankful to all the people who helped me finish my doctoral study. My thanks first go to my Ph.D. advisor, Dr. James Griffioen. Thanks for advising and helping me become a better researcher in Computer Science. Thanks for arousing my curiosity about computer networking when I decided to pursue a Ph.D. Thanks for encouraging me whenever I encountered troubles in research. Thanks for spending time to review all my papers. Thanks for sending me to conferences where I got to know lots of people with the same research enthusiasm. I could not have achieved what I have without his professional guidance. I would also like to thank Dr. Kenneth Calvert for co-advising me. Every conversation during our weekly research project meetings was a great learning experience for me.
I would also like to thank the other members of my committee: Dr. Raphael
Finkel, Dr.
Contents

Acknowledgments
Table of Contents
List of Tables
List of Figures

1 Introduction
1.1 Virtualization
1.1.1 Virtual Networks
1.2 Building Virtual Networks
1.2.1 Specifying the Network
1.2.2 Supporting New Functionality
1.3 The HyperNet Approach
1.4 Example HyperNet Packages
1.5 Contributions of the Thesis

2 Related Work
2.1 Virtualization Technology
2.1.1 Hypervisors and Virtual Machines
2.1.2 Virtual Appliances
2.1.3 From Virtual Machines to Virtual Infrastructure
2.1.4 Other Virtualization Approaches
2.1.5 Cloud Services
2.1.6 Deploying Cloud Services
2.2 Virtual Networks
2.2.1 ProtoGENI
2.2.2 ORCA
2.2.3 PlanetLab
2.3 Programmable Network Infrastructure
2.3.1 Active Networks
2.3.2 NetServ
2.3.3 Service-Centric Networks
2.3.4 OpenFlow
2.4 Composable Network Stacks
2.5 Summary

3 Virtual Network Infrastructure Providers (VNIPs)

6 A Prototype Implementation
6.1 The Information Base
6.2 The Location Manager
6.3 The Topology Server/Routing Server (TS/RS)
6.3.1 Finding a Central Node
6.4 Random Topology Generator
6.5 Hypervisor Performance
6.5.1 Experimental Context
6.5.2 Build Time
6.5.3 HyperNet Deployment Time
6.5.4 Concurrency Test

Vita

List of Tables

List of Figures

6.1 Hypervisor Implementation
6.2 Time spent building a HyperNet Ring Topology
6.3 Deploy Time for Ring Topologies in a GENI Aggregate
6.4 Time spent in deploying 50 HyperNet Topologies with concurrent requests
6.5 Time spent in deploying 100 HyperNet Topologies with concurrent requests
6.6 Time spent in deploying 200 HyperNet Topologies with concurrent requests
6.7 Time spent in deploying HyperNet Topologies with sequential requests
Chapter 1
Introduction
The huge success of the Internet over the past few decades has clearly demonstrated the wisdom of the early designers, who decided to create a simple best-effort packet delivery network and then couple it with (intelligent) programmable computers at the edges of the network. One consequence is that applications running on end systems define the network's functionality, not the network itself. This architecture enabled innovation on end systems that has allowed the Internet to be adapted and enhanced far beyond the imagination of its designers.
However, it has become increasingly clear that certain innovations will require
new functionality in the network itself. For example, due to the need for trustworthy
communication, some argue that the Internet should provide intrinsic security in
which the integrity and authenticity of communication is guaranteed [1].
Some
argue that additional processing of packets should be provided within the network
so that users can define their own processing, thereby choosing how their packets
are processed by the network [2]. To support the increasing number of mobile end
systems (mainly cell phones) on the Internet, some argue that the Internet should be
re-designed to support network mobility at scale [3]. Yet there are other proposals
addressing new network types such as vehicular networks [4] and ad-hoc networks [5].
Meanwhile, the existing Internet has several other well-known problems, ranging from lack of address space [6], to routing problems [7, 8, 9], to lack of support for mobility [10], to lack of security [11, 12, 13, 14, 15].
Although there are many arguments about how to realize a next-generation Internet [16, 17], it is widely agreed that the future Internet should be highly flexible and programmable to support more innovation within the network itself. A promising technique that offers users the ability to program the network in safe, user-specific ways is virtualization. Although virtual networks are beginning to emerge, the tools and interfaces for users to create virtual networks are severely lacking.
This thesis proposes a new HyperNet abstraction that simplifies the task of creating, deploying, and using special-purpose virtual networks: virtual networks that are tailored to the needs of particular applications, and are also tailored to the set of participants and users of the virtual network.
1.1 Virtualization
Propelled by the need to efficiently and cost-effectively support web services, virtualization technology has recently gained widespread popularity and use. Virtualization enables one to create multiple virtual instances of a device from a single physical device, allowing, for example, multiple virtual web server machines to be hosted on the same physical machine.
Virtual computers, referred to as Virtual Machines (VMs), have not only become
a key part of data centers hosting web services, but are now also commonly found on
desktop users' machines. Virtual machines not only share the same set of physical
resources (i.e., run on the same physical machine), but more importantly, VMs are
also isolated from each other so that problems in one VM will not affect the execution
of other VMs.
Virtualization has also changed the way people create, package, share and deploy
software. The best example of this is a virtual appliance [18]. Virtual appliances
encapsulate all the pieces of a software system including the operating system,
application libraries, and configuration files into a single package that can be easily
run by users who would otherwise not be able to (or not want to) assemble and
configure a complex software system. For example, bringing up and hosting a Content
Management System (CMS) today no longer requires an expert to install, configure,
and initialize the appropriate OS, web servers, databases, file systems, and CMS
software. Instead, an average user can download a fully-configured and ready-to-run
content management appliance from a virtual appliance store and simply run it
on a virtual machine.
1.1.1 Virtual Networks
Recently the concept of virtualization has been extended from computers to network
devices (e.g., network routers, switches, and links). Early examples of virtual routers
were based on virtualized PCs acting as network routers. PlanetLab [19, 20], for
example, allows users to reserve slivers (virtual machines) from physical machines
scattered all across the world and connect them together via overlay links to form a
slice (an overlay network). The virtual routers in the overlay can be programmed
and controlled by users, enabling users to deploy custom code on the virtual routers
in order to create application-specific networks. In Emulab [21], users can obtain real
or virtual PCs from a cluster of resources for use as emulated network routers. The
emulated routers are then connected using Virtual Local Area Networks (VLANs) [22].
Again, because users have complete control of the emulated routers, they can deploy
their own protocols and network services to create a virtual network with customized
functionality. The emerging GENI network [23, 24, 25] is perhaps the best example,
offering a wide range of (virtualized) network resources to create virtual networks that
span the continent. Like PlanetLab and Emulab, users can design application-specific
topologies and run application-specific code on the routers that comprise the network.
1.2 Building Virtual Networks
Although virtual networks bring many advantages, creating and building a virtual
network is not an easy task.
1.2.1 Specifying the Network
As anyone who has used one of the existing virtual network infrastructures can attest,
creating a fully functional virtual network is anything but easy. For example, in the
1.2.2 Supporting New Functionality
While ISPs might see a niche market and decide to create a special-purpose network
for a particular type of traffic, it is unlikely that the average user will create virtual networks tailored to their own needs and users. Expertise is needed to create,
maintain, and operate virtual networks. Unfortunately, ISPs have little incentive to
invest in a customized virtual network that does not have sufficient payback. For
example, ISPs may invest in virtual networks to support widely-used applications of
general interest to paying customers (e.g., a video conferencing network for large
corporate customers), but they will have little desire to build networks for less
profitable applications (e.g., a virtual network for a little-used application with few
potential customers). Moreover, the application-specific network that ISPs deploy for
high-margin applications will likely be long-running networks: permanent services
available to all users. ISPs are unlikely to dynamically create a virtual network
tailored to the specific users of a given session, but instead will support users in
general (e.g., an ISP will create a single long-running video conferencing network
used by all video conferences, as opposed to dynamically creating and deploying a
network designed especially for the specific set of participants in some personal video
conference). Application-specific virtual networks that are tailored only for small
groups of participants and designed only for specific network applications will play a
very important role in the future Internet and thus, new models are needed for an
average user (instead of a network expert) to easily create and dynamically deploy a
highly-tailored virtual network.
1.3 The HyperNet Approach
Ideally, to create a virtual network, one would like an abstraction similar in spirit to
that of a virtual appliance, where all that one has to do to create their own virtual
network is obtain a file containing a virtual network specification (i.e., the logical
equivalent of a virtual appliance) and run it on virtual network infrastructure. Such
a virtual network should contain the virtual network topology, the software and
services needed by the network, and all the configuration information needed by the
virtual network. All the pieces of a virtual network, along with the expertise needed
invokes a local program that executes the HyperNet Package. The user is then
prompted for configuration information (assuming configuration information such as
a list of participants is required and was not previously specified in a configuration file).
a password) needed to join the network. As users join the HyperNet network, the
HyperNet Package verifies that they are permitted to join, and their packets begin
flowing across the virtual network. When the application session ends (which may
be a few minutes or a few months or years later), the creator or other authorized
individual tears down the virtual network.
In order to make virtualized networks usable to the largest audience possible (we envision three classes of users: builders, creators, and participants, described later), the
HyperNet abstraction must be simple to use, yet powerful enough to support a wide
range of applications. There are several challenging problems that must be addressed
in the definition and implementation of such an abstraction including (1) determining
where the abstraction fits into the overall network virtualization picture (which is still
emerging), (2) defining the interfaces (APIs) used to launch a HyperNet Package and
then access a HyperNet network, (3) deciding the set of capabilities a HyperNet
Package should support, (4) addressing the topology generation issues associated
with deploying a HyperNet network, (5) solving the participant joining problems (i.e.,
creating a convenient interface for Internet participants to join a virtual network), (6)
ensuring the security of HyperNet networks, and (7) providing the building blocks
needed to support a variety of economic eco-systems in which HyperNet networks
can thrive. Throughout the remainder of this thesis, we will describe our solutions to
these problems and present an initial prototype that demonstrates the feasibility of
our approach.
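To make the abstraction more concrete, the following is a minimal, purely hypothetical sketch (in Python) of what driving a Network Hypervisor from a HyperNet Package might look like. Every class and method name below (NetworkHypervisor, reserve_router, connect, add_participant, deploy) is invented for illustration; it is not the actual Hypervisor API defined later in this thesis.

```python
# Hypothetical sketch only: a toy stand-in for a Network Hypervisor and a
# HyperNet Package that builds a hub-and-spoke video-conferencing network.

class NetworkHypervisor:
    """Toy model of the Network Hypervisor service (names are illustrative)."""
    def __init__(self):
        self.routers, self.links, self.participants = [], [], []

    def reserve_router(self, near):
        router = {"id": len(self.routers), "near": near}
        self.routers.append(router)
        return router

    def connect(self, a, b, bandwidth_mbps=10):
        self.links.append((a["id"], b["id"], bandwidth_mbps))

    def add_participant(self, address, attach_to):
        # A real system would set up a tunnel from the participant's
        # (non-virtualized) host into the virtual network.
        self.participants.append((address, attach_to["id"]))

    def deploy(self):
        return {"routers": self.routers, "links": self.links,
                "participants": self.participants}


def run_video_conference_package(participants):
    """What a multicast video-conferencing HyperNet Package might do."""
    hv = NetworkHypervisor()
    core = hv.reserve_router(near="central")       # rendezvous point
    for addr in participants:
        edge = hv.reserve_router(near=addr)         # router close to each user
        hv.connect(core, edge, bandwidth_mbps=20)
        hv.add_participant(addr, attach_to=edge)
    return hv.deploy()


if __name__ == "__main__":
    print(run_video_conference_package(["10.0.0.1", "10.0.0.2", "10.0.0.3"]))
```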
1.4 Example HyperNet Packages
Unlike conventional CDNs that rely on DNS tricks at the source to redirect packets
to CDN caches, the CDN HyperNet network operates at the IP level, running code
on routers to intercept packets en route to the server and redirect them to a nearby
CDN cache without DNS involvement. The HyperNet Package also includes an end
system API that the web server can use to push content into the CDN network.
The HyperNet Package might even come with a pre-installed and pre-configured web
server that pushes content into the CDN by default.
One can imagine a wide variety of other example HyperNet Packages, each
designed for a different purpose and set of users. To get a feel for the types of
HyperNet Packages that might be useful, consider the examples listed below:
Audio/Video Net
An audio/video conferencing HyperNet Package would create a multicast
infrastructure among the participants.
Wireless Net
A wireless HyperNet Package, designed for wireless end systems, might contain
code to be loaded into wireless access points to alleviate the problems that arise
because of a weak signal or packet loss over the first/last hop wireless link.
Bit-escrow Net
A bit-escrow HyperNet Package would deploy code into routers to cache packets
(i.e., bit escrow) as they pass through the router. The cached packets would be
stored only for a short period of time, but during that interval the router would
respond to retransmission requests directly from the cache, thereby reducing
the time required to recover from lost packets.
Backup Net
A backup HyperNet Package might create multiple paths between the client
and the backup server to maximize the throughput across the network, thereby
reducing the time required to complete a backup.
FS Net
A distributed (disconnected) file system HyperNet Package might deploy code
along the paths between the participants to ensure file transfers are both
reliable and efficient, transferring files hop-by-hop and supporting disconnected
operation (e.g., a laptop that loses connectivity for some time).
Home Net
A home HyperNet Package would deploy code into the ISP's first-hop router to
prioritize traffic being sent over the limited-bandwidth channel coming into the
home. This would allow home network users to ensure that low priority bulk
transfers (e.g., YouTube video streams) do not interfere with higher priority
traffic (e.g., Skype phone calls or interactive sessions like ssh).
None of these networks is particularly novel in and of itself. Similar types of special-purpose networks have been proposed in the context of active and programmable networks for many years [30, 31, 32]. What makes these networks interesting is not
the fact that they can be constructed, but rather the way they are constructed. In
particular, the goal of HyperNet Packages is to identify all code and expertise needed
to create one of these networks and capture it in a self-contained object that can then
be unpacked and deployed into an underlying virtual network infrastructure with
little or no human (or application) intervention. As a result, even an average user
can deploy special-purpose virtual networks.
1.5 Contributions of the Thesis
package and distribute the package so that average users can deploy customized
virtual networks.
Decoupling Service Providers from Resource Owners: In our HyperNet architecture, the resource owners (e.g., today's ISPs) are no longer the only creators
of the network. Instead, ordinary users can also become the Network Creators,
creating virtual networks and providing network services. In fact, we expect
more virtual networks will be created by ordinary users than by ISPs. Moreover,
the virtual networks can be at small scale for short time durations, supporting
a small group of participants. This will encourage smaller, lightweight, special-purpose virtual networks to appear, bringing new functionality that has never
been seen in the Internet.
A New Business Model:
Chapter 2
Related Work
This chapter describes current and previous work related to HyperNets. Section 2.1
talks about virtualization technologies, including hypervisors, virtual machines,
virtual appliances, commercial cloud services and platforms for building network
services on the cloud. Section 2.2 introduces state-of-the-art techniques used in
providing a virtual network infrastructure, including various tools used to facilitate
the creation and management of virtual networks in GENI. The next two sections talk
about previous approaches taken towards a programmable network. Section 2.3
describes programmable network infrastructure, including past Active Networks
research along with more recent projects such as NetServ, Service-centric Networks,
and OpenFlow.
Section 2.4 describes composable network stacks, including the X-Kernel protocol stack, the Tau protocols, and the NC State SILO project.
2.1 Virtualization Technology
A variety of commercial and open source efforts now offer powerful and highly configurable virtualization solutions. Examples include VMware, OpenStack, Citrix Xen, VirtualBox, and many others [33, 34, 35, 36, 37]. The idea of virtualization is to generate multiple instances of resources (computation power, memory, storage, etc.) out of one physical device. These virtual instances of resources share the same underlying physical device, yet are protected from one another. In the following sections we briefly highlight some of the most common approaches to virtualization, pointing out their advantages and drawbacks.
2.1.1 Hypervisors and Virtual Machines

[Figure: virtual machines, each running its own OS and applications, hosted on a Type II hypervisor (e.g., VMware, VirtualBox) that runs on a conventional OS (Linux, Windows, Mac, etc.) and the underlying hardware (CPU, memory, disk space, etc.).]
Systems that offer this type of true virtualization include IBM's z/OS [39] on its p- and z-series machines, the Citrix XenServer [40], and the VMware ESX hypervisor architecture [41].
A lightweight form of virtualization that does not mimic a physical machine is operating-system-level virtual containers. In this case, the containers all share the host operating system and hardware rather than each running its own guest OS.

[Figure: multiple containers sharing a single host OS and the underlying hardware.]
To simplify the software installation, setup, and configuration steps, the virtualization community has developed the concept of a Virtual Appliance (VA).
2.1.2 Virtual Appliances
system, because it contains only the OS features needed to support the specific
application. In other words, a JeOS is a super-lightweight OS tailored only for the use of a specific application. Thus, compared with a general-purpose operating system, a JeOS requires a much smaller footprint and fewer patches, and is more secure and easier to maintain.
[Figure: virtual appliances, each running on its own virtual machine.]
2.1.3 From Virtual Machines to Virtual Infrastructure

[Figure: virtual appliances (VA1, VA2, VA3, ...) running on virtual machines drawn from a shared resource pool.]
Note that in a virtual infrastructure, the network is only used to enable the migration of
appliances. There is no (or limited) explicit sharing of network resources (i.e., routers and switches)
in virtual infrastructure at present.
2.1.4 Other Virtualization Approaches
Interestingly enough, if we say virtual machines realize virtualization in a bottom-up fashion (the same set of hardware supports the execution of multiple software stacks, including operating systems), there are other virtualization approaches that are taken in a top-down fashion. The most well-known example is the Java Virtual Machine (JVM). A JVM is a virtualized software platform used to execute Java programs. Every Java program is first compiled into an intermediate language, called Java bytecode, that is interpreted and executed by the JVM. Interpreting bytecode is slow. To improve the performance of Java, most Java systems support a Just-In-Time (JIT) compiler that translates Java bytecode into native machine language that can be executed directly on the physical machine. The .NET Framework compiles and executes C++/C#/VB code in the same way as the JVM. First, code is converted to the Microsoft Intermediate Language (MS-IL), akin to Java bytecode. Then, using a JIT compiler similar to the JVM's, the Common Language Runtime (CLR) of the .NET Framework converts MS-IL to native machine language. These virtualization approaches enable one to execute the same code regardless of the operating system or hardware being used. In some ways, this leads to the idea of executing code on routers, which later became one of the efforts made towards a programmable network, the Active Networks approach, described in Section 2.3.
2.1.5 Cloud Services
The term cloud is used as a metaphor to depict the Internet as an abstraction of the
virtualized resources it contains. The Cloud refers to a shared pool of computing
resources, storage services and higher level applications/services that can be accessed
over the Internet (say, via a web browser) to perform a service for users on-demand.
Despite various concerns about security, users have been rapidly embracing the
There is some confusion about cloud computing, grid computing, and utility computing.
Generally speaking, grid computing focuses on how a group of computers cooperate to finish a
huge task; utility computing focuses on packaging computer resources as a metered service.
includes cloud storage, managed hosting, and development environments that allow users to build applications. Amazon's EC2 [50] is an example of IaaS.
In short, cloud computing provides users a cheaper and easier way to manage their
applications than using the traditional model. Due to the large amount of resources
in the cloud, cloud computing can be potentially optimized to boost the execution
of applications and thus is more efficient than the traditional model.
However,
the overall performance of cloud computing depends not only on the efficiency of
the computing part, but also on the efficiency of the networking part. In
other words, if the Internet does not provide fast, reliable, Quality-of-Service (QoS)
guaranteed delivery, the overall performance of cloud computing will be reduced.
Moreover, it is the user's responsibility to configure the resources in the cloud to
build any new services, which is a hard task that is typically left to experts. Big
companies will only build new services with mass appeal.
2.1.6 Deploying Cloud Services
Cloud computing makes it easier and cheaper for service providers to provision
computing resources and to flexibly extend resources on the fly. Virtual appliances
make it easy to configure software stacks on a single node. However, it is still a
challenging and time-consuming task when it comes to building and configuring an
entire network infrastructure which contains many nodes. Chef [51] is designed to
facilitate the creation and deployment of cloud services. In Chef, rules for configuring
each node to reach a pre-defined state are expressed in the form of a run list. A
node might be an application load balancer, an application server, an application
database cache, an application database, or a monitoring node. Each node of the
infrastructure is pre-loaded with a Chef-client. The Chef-client is in charge of fetching
the run list of that node from a centralized Chef-server. In addition, the Chef-client
also ensures that the run list of a node is achieved in compliance with the policy set
by the Chef-server. A run list is a list of recipes and a recipe simply states which
application (or resource) a node should install, what configuration a node should
have, etc. Recipes are stored in cookbooks to realize code re-use and modularity.
In addition, Chef also provides a search function so that the configuration of the
service infrastructure can adapt automatically with the addition or removal of nodes
to/from the infrastructure. As a result, to set up and configure a cloud service, the
user only needs to define policies and recipes in the Chef-server.
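As a rough illustration of the run-list idea, the sketch below models, in Python, how a central server's run lists and recipes converge each node to a desired state. It is only a conceptual model with invented names; actual Chef recipes are written in a Ruby DSL and managed by the Chef-server, not by this code.

```python
# Conceptual model of Chef-style configuration: a central server holds a
# run list (an ordered list of recipes) per node role; each node's client
# fetches its run list and applies the recipes in order.
# Names are illustrative, not actual Chef APIs.

RECIPES = {
    "install_mysql": lambda node: node["packages"].append("mysql-server"),
    "install_nginx": lambda node: node["packages"].append("nginx"),
    "open_port_80":  lambda node: node["firewall"].append(80),
}

RUN_LISTS = {                      # role -> ordered recipes ("run list")
    "db_server":  ["install_mysql"],
    "web_server": ["install_nginx", "open_port_80"],
}

def chef_client(role):
    """Converge one node to the state described by its run list."""
    node = {"role": role, "packages": [], "firewall": []}
    for recipe_name in RUN_LISTS[role]:
        RECIPES[recipe_name](node)
    return node

if __name__ == "__main__":
    for role in ("db_server", "web_server"):
        print(chef_client(role))
```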
In some sense, Chef is trying to achieve the same goal as HyperNets: to ease
the process of building and deploying a virtualized infrastructure. However, the
HyperNet architecture is different from Chef in the following ways. First, Chef only
deals with cloud service infrastructure and thus, is more specialized. In other words,
Chef only cares about the infrastructure used within a network service (a network
service infrastructure might include multiple service load balancers, applications,
databases and database caches). How end systems reach the service infrastructure
is not Chef's concern. The HyperNet architecture, however, can be used to create
a virtual network that covers the end-to-end communication between any two end
systems (participants) in the network. Secondly, the HyperNet architecture tries to
answer the question of how to tailor a network to match the requirements of a specific
type of application and a specific set of participants and thus is designed with APIs
to help a HyperNet builder achieve this. Chef, on the other hand, only provides
interfaces for one to customize the inner structure of a cloud service.
2.2 Virtual Networks
More recently, the concept of virtualization has been extended from computers to
networking devices. Emerging network testbeds allow users to reserve virtualized
network devices that act like virtual routers, and connect them together with virtual
links to form a virtual network. The GENI [23] network is probably the best example
of a virtual network provider that is available to users today. This section describes
some of the virtual network infrastructures available in GENI and the current tools
provided by each of them.
GENI (Global Environment for Network Innovation) is a testbed network designed
to give researchers the opportunity to experiment with new network protocols and
architectures at scale. Currently GENI includes the following control frameworks:
ProtoGENI [52], PlanetLab [20], ExoGENI (previously known as ORCA) [53] and
ORBIT [54]. Because GENI is still under development, the list of tools that users can
leverage to build their experimental network (slice in GENI terminology) is rather
limited. However, there are tools and services that assist with resource discovery and
allocation, tools to load software onto the nodes (slivers) in a slice, and tools to
help instrument and monitor an experiment. In the following, we introduce some of
the GENI control frameworks and the corresponding tools that help users manage
resources in those control frameworks.
2.2.1 ProtoGENI
ProtoGENI is one of the control frameworks that supports the GENI Aggregate
Manager (AM) API [24]. Its predecessor is Emulab [21]. Each virtual network
instance in ProtoGENI is called a slice and each programmable node in a slice
is called a sliver.
Network resources in ProtoGENI are grouped into aggregates spread across locations
in the United States.
Flack [55] is a web-based GUI originally designed to create slices in ProtoGENI [52].
It gives users a visual idea of where (geographically) the resources are reserved, what types of resources are reserved (for both virtual routers and virtual links), and what the logical topology looks like, as shown in Fig. 2.6.
Although the Flack interface makes it easy for users to create virtual networks
and even load software on the nodes in a virtual network, it does not help users
determine which resources should be included in the network, nor does it help setup
or configure the network. Users still need to manually log onto each reserved node
and do node-specific configuration (e.g., change routing tables, configure software,
run node-specific commands, etc.) after the virtual network is created via Flack.
Moreover, Flack only helps to allocate and connect GENI resources together; users
must manually create channels/tunnels between non-GENI resources and the GENI
resources allocated to a slice. Since the majority of users are located on non-GENI
nodes/resources, there is a need to support a regular Internet user who wishes to
connect to and use GENI.
2.2.2 ORCA
ORCA [53] is another GENI Control Framework. It uses Flukes [57], a Java-based
GUI similar to Flack, to create, inspect, and manage experiments on ORCA. Flukes's features are similar to Flack's, except that, unlike Flack, Flukes uses ORCA's native resource description language (NDL-OWL) to manage its resources as opposed to GENI RSpecs. Recent enhancements to ORCA have enabled it to connect RSpecs to NDL-OWL, but at its base level, ORCA still uses NDL-OWL. By using the NDL-OWL interface, users are able to specify things not available in RSpecs, e.g., to define
and configure node groupings, to specify functional dependencies between nodes by
configuring node boot sequences, or to specify post-boot script templates. However,
like Flack, experimenters are still responsible for much of the configuration and setup
after the resources have been allocated and become available for use.
2.2.3 PlanetLab
slice or its topology), but rather assumes this has been done with some other tool. In
order for a user to leverage Gush's software loading capabilities, the user must write a Gush script and create the associated tar files using Gush's XML-based configuration
language. Finally Gush, like the other tools, lacks support for connecting to and
interacting with real world users on non-GENI nodes.
Raven [62] (previously Stork [63] in PlanetLab) is another tool in GENI that is
specifically designed to support long-term experiments. It provides configuration
management tools and resource management tools for long-term experiments where
both software and resources can change during the lifetime of the experiment. It
provides interfaces for administrators to enter instructions that will be applied to
each individual system. Just like Gush, Raven also provides helper tools to assist
experimenters in monitoring and managing their experiments.
Although some additional tools have begun to emerge for GENI, there is a
significant learning curve required to utilize them effectively. Moreover, the decisions
about topology and resource allocation are still very visible to the experimenter.
While the topology is important, the details of the topology often are not important
to the experimenter. For example, if a user wants to develop a new interactive
multiplayer gaming network with a centralized controller, the user will not care about
the specific nodes chosen to be part of the network. Instead, the user will simply want
to know that the topology connects the various players of the game (participants) to
a node that is centrally located relative to the participants. In other words, the user
would like to be able to select the type of topology, not necessarily know the details.
With the existing GENI tools, the experimenter is responsible for identifying,
selecting, and then loading, all the software needed by the network. Networks are
complex systems with multiple layers of software. In most cases, experimenters have
no desire to (re)create all these software layers. Instead they would like to leverage
existing software and services to the greatest extent possible, only modifying or
enhancing the layer of software that is the focus of their experiment. In other words,
they would like to select an existing software stack, and have that stack automatically
deploy as the software system for the experimental network.
Finally, getting real-world participants to use an experimenter's new network
requires support to connect them into the slice even though they are on machines
that are not GENI-enabled. In other words, experimenters would like the ability to
include non-GENI-enabled nodes in the topology. This typically requires some sort
of tunneling over IP to link these nodes into the GENI network.
2.3 Programmable Network Infrastructure
Beginning in the mid-90s, a significant amount of research was directed toward the goal of creating a network infrastructure that is programmable. Much of that work is summarized in the following subsections.
2.3.1 Active Networks
The existing Internet architecture pushes all functionality to end nodes and keeps the network as simple as possible. The idea behind active networks, however, is to let custom programs be executed within the network, on routers. The idea first emerged in the 90s, when the computational power of network devices dramatically increased and people realized that they could put more computational tasks into routers and switches. Generally speaking, there are two major advantages of active networks: (1) they enable a wide range of new applications that leverage computation in the network, and (2) they accelerate the pace of network innovation by separating services from the underlying infrastructure.
A capsule-based active network pushes the programmability of active networks to an extreme: every active packet (or capsule) in the network carries programming code (or instructions) that can be read and executed by intermediate forwarding nodes, which in turn change the functionality of the network. The ANTS toolkit [64], designed by Wetherall et al., allows end users to send out capsules containing any program the user wants. A capsule includes an ANTS-specific header immediately following the IP header (so that non-active nodes can still forward capsules). Besides the version, type, and type-specific program code fields, the ANTS header also includes the previous address, which enables an active node to go back to the previous active hop to fetch program code. The ANTS approach assumes there are no programs in the active nodes initially. A lightweight code distribution system pre-transfers the programs to active nodes. The code-distribution control plane is separated from the data plane of capsules. The type field in the ANTS header is actually an MD5 hash of the corresponding program code. Thus, a type identifier names an implementation, not an interface. The previous address field enables program code delivery even when capsules (and their code) are created on-the-fly. The combination of the two ensures that capsule code can be carried by reference and also loaded on demand.
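The sketch below illustrates the capsule-handling idea just described: the type field names the program by its MD5 hash, and a node that lacks the program fetches it from the previous active hop and caches it. The field names and the fetch mechanism are simplified for illustration and are not the exact ANTS wire format or API.

```python
# Simplified model of ANTS-style capsules: code is named by a hash ("type"),
# carried by reference, and loaded on demand from the previous active hop.

import hashlib
from dataclasses import dataclass

@dataclass
class Capsule:
    version: int
    code_type: str       # MD5 hash naming the program (an implementation)
    prev_address: str    # previous active hop, used to fetch missing code
    payload: bytes

def type_of(program_source: str) -> str:
    """ANTS-style type identifier: the MD5 hash of the program code."""
    return hashlib.md5(program_source.encode()).hexdigest()

class ActiveNode:
    def __init__(self, address):
        self.address = address
        self.code_cache = {}          # code_type -> program source

    def receive(self, capsule: Capsule, fetch_from_prev_hop):
        if capsule.code_type not in self.code_cache:
            # Code is carried "by reference": load it on demand from the
            # previous active hop (modeled here as a callback).
            self.code_cache[capsule.code_type] = fetch_from_prev_hop(
                capsule.prev_address, capsule.code_type)
        program = self.code_cache[capsule.code_type]
        capsule.prev_address = self.address   # we become the previous hop
        return program, capsule
```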
Many Active Network papers talk about how active nodes should be designed
and programmed.
some particular active network platforms. This layered stack makes the maintenance
and configuration of active networks complicated.
Active networks have not achieved widespread adoption for a number of reasons.
First, the security concerns associated with executing arbitrary user code in the
network have discouraged ISPs from enabling programmability in their networks.
Second, running a program every time a packet arrives at a router introduces
significant performance issues. Third, it is unclear how to charge users for this feature.
Emerging virtual network infrastructure avoids or answers some of the problems
that have plagued active networks. Because slices are private and isolated, security
concerns are greatly diminished. Because slice resources are reserved/purchased,
billing can be more easily integrated.
2.3.2 NetServ
The NetServ [65] Project is a collaboration between Columbia University, Bell Labs,
Deutsche Telekom Laboratories and DoCoMo Labs Europe. The purpose of this
project is to provide an extensible architecture for core network services for the next
generation Internet. The key idea of NetServ is service modularization. By using
NetServ, the functions and resources on a network node are divided into small and
reusable building blocks. New network services can be composed and implemented
by combining the functionality of building blocks available from multiple network
nodes. To efficiently manage the building blocks (or service modules, which are
composed of multiple building blocks), a virtual services framework is created as
one of the key pieces of the NetServ Architecture. The framework offers security,
portability across hardware platforms, and the ability to control resource allocation
among modules. Moreover, the framework supports adding and removing service
modules at runtime [65]. The NetServ group implemented a prototype version of their network using the Click modular router [66] as the base router platform and a Java-based dynamic module system, OSGi [67], as the virtual services framework. They showed that using Java as the execution environment can result in tolerable processing delays.
2.3.3 Service-Centric Networks
Service-centric networks [68, 69], proposed by Wolf et al., base the communication abstractions of the future Internet on the processing (including transfer) of information rather than the process of sending data. By giving the network some clues about the semantics of the information that is to be transferred, the network can provide more advanced services as opposed to blindly forwarding bits. In their Information Transfer and Data Services (ITDS) architecture, by having processing as a first-class networking feature, an end system application can specify the information transfer that is desired and the network can then determine the appropriate handling of the data.
2.3.4 OpenFlow
2.4 Composable Network Stacks
Active networks try to control/program the network at the granularity of a single packet. This fine-grained control gives the user a lot of flexibility but, at the same time, adds a lot of complexity to the system.
2.4.1 The X-Kernel
The X-Kernel defines three types of communication objects: protocols, sessions, and messages. Protocol objects can be static or passive; common examples include the protocols currently used in the IP stack. Session objects are passive but can be generated dynamically, representing connections. Message objects are active objects that move through the session and protocol objects in the kernel. A set of support routines facilitates the implementation of protocols. For example, the buffer manager routines manipulate messages. The map manager routines maintain bindings from identifiers (such as those extracted from message headers) to kernel objects. The event manager routines manage the timing of any procedure (e.g., so that a protocol can time out). Last, the X-Kernel provides infrastructure to support communication objects. The relationship between communication objects can be represented by a simple textual graph description language. The relationship graph can be read by a composition tool which in turn generates the C code that creates and initializes the protocols described in the graph. Each object is implemented in an object-oriented style, i.e., each object has pointers to object-specific functions. More specifically, a protocol object can create a session by itself (the open function), or can pass its capability to a lower-level protocol and ask it to create a session (the open enable function), or tell its upper-level protocol that it has created a session on its behalf (the open done function). In addition, the protocol object can also pass a message to one of its sessions via the demux function. The session object can either pass a message down to a lower-level session via the push function, or pass a message up to a higher-level session via the pop function (called by the demux function). A user-space process is also treated as a protocol in the X-Kernel protocol stack design, i.e., the user must export those operations that a protocol or session may invoke.
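The following is a rough Python rendering of the object model just described, in which protocol objects create sessions, demux hands incoming messages to a session, push sends a message down the stack, and pop passes it back up. It is a paraphrase for illustration only, not the actual x-kernel C interfaces.

```python
# Toy x-kernel-style object model: Protocol and Session objects with
# open/demux (protocol) and push/pop (session) operations.

class Session:
    def __init__(self, protocol, lower=None):
        self.protocol, self.lower = protocol, lower

    def push(self, message):
        """Send a message down toward the network, adding this layer's header."""
        framed = self.protocol.name.encode() + b"|" + message
        if self.lower:
            self.lower.push(framed)
        else:
            print("on the wire:", framed)

    def pop(self, message):
        """Deliver a message up toward the application (called via demux)."""
        return message.split(b"|", 1)[1]   # strip this layer's header

class Protocol:
    def __init__(self, name, lower_protocol=None):
        self.name, self.lower, self.sessions = name, lower_protocol, {}

    def open(self, key):
        """Create a session, asking the lower protocol for its session first."""
        lower_session = self.lower.open(key) if self.lower else None
        session = Session(self, lower_session)
        self.sessions[key] = session
        return session

    def demux(self, key, message):
        """Map an incoming message to one of this protocol's sessions."""
        return self.sessions[key].pop(message)

# Example: a toy "transport" protocol stacked on a toy "network" protocol.
net = Protocol("net")
transport = Protocol("tcp-ish", lower_protocol=net)
s = transport.open(key=("10.0.0.1", 80))
s.push(b"hello")
```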
2.4.2 Tau protocols
because it requires that messages be processed sequentially. The Tau (Transport and
up) protocol [71, 72], proposed by Calvert et al., achieves modularity in protocol
implementations. They focus on end systems, and thus on transport-layer and higher
protocols. The term protocol module is used to refer to the object that implements
the functions defined by a protocol specification.
2.4.3 SILO
The SILO [73, 74] project focuses on cross-layer (or cross-service) interactions. The design goals of SILO are to (1) construct a framework of fine-grained building blocks along with explicit support for combining elemental functions in a highly configurable manner to achieve flexibility and extensibility; (2) use a fine-grained modularization of networking functions to enable a scalable, unified Internet; (3) explicitly build in the ability for functional blocks to interact with each other to facilitate cross-service interactions; (4) treat security functions as easily pluggable components to smoothly integrate security features; and (5) offload small but computationally intensive functions to secondary CPUs to take advantage of new performance-enhancing techniques.
Services are the fundamental building blocks in SILO. A service is a well-defined atomic function performed on application data that accomplishes a specific communication task. This fine-grained definition provides more flexibility compared with current protocols, which typically embed complex functionality. Any set of services can be selected to accomplish a particular task, but the order in which those services are applied is not tied to layers; instead, it is tied to a set of more general precedence constraints. SILO separates a service from its implementation, which the authors call a method (so that multiple methods can be associated with one service). A method that implements a service must implement the service-specific interfaces (as described in the service specification), as well as any service-specific knobs. A silo is actually an ordered subset of methods, each of which represents a different service. A control agent is responsible for composing such a silo. Besides composing a silo, a control agent is also in charge of adjusting all service- and method-specific knobs to facilitate cross-service interactions. SILO also defines a minimum set of precedence constraints, Requires, MustOccurAbove, MustOccurImmediatelyAbove, and MustNotOccurImmediatelyAbove, to guide the service composition. By explicitly requiring each method to provide a minimum control interface, and a minimum set of precedence constraints [74], SILO avoids the potentially unmanageable and unmaintainable consequences caused by cross-layer interaction.
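As a small illustration of SILO-style composition, the sketch below selects a set of services, pulls in anything they Require, and orders them so that MustOccurAbove constraints are respected. The example services and constraints are invented for illustration; they are not SILO's actual service catalog or control-agent interface.

```python
# Toy SILO-style composition: pick services, honor Requires, then order them
# so every (above, below) MustOccurAbove pair is satisfied (top of silo first).

REQUIRES = {"encryption": {"integrity_check"}}               # Requires
MUST_OCCUR_ABOVE = {("compression", "encryption"),            # A above B
                    ("integrity_check", "fragmentation")}

def compose_silo(selected):
    """Return the selected services ordered top-of-silo first."""
    services = set(selected)
    for svc in selected:                       # pull in required services
        services |= REQUIRES.get(svc, set())

    ordered, remaining = [], set(services)
    while remaining:
        # A service may be placed once everything that must occur above it
        # has already been placed.
        ready = [s for s in sorted(remaining)
                 if all(a not in remaining
                        for (a, b) in MUST_OCCUR_ABOVE if b == s)]
        if not ready:
            raise ValueError("precedence constraints cannot be satisfied")
        ordered.append(ready[0])
        remaining.remove(ready[0])
    return ordered

if __name__ == "__main__":
    print(compose_silo(["encryption", "compression", "fragmentation"]))
    # -> ['compression', 'encryption', 'integrity_check', 'fragmentation']
```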
2.5 Summary
The use of virtualization has made the development, deployment, and management of applications much easier than in the traditional model. Developers no longer need to worry about platforms and hence can focus on the features of the applications they are
developing. Virtual Infrastructure has enabled the sharing of a network of resources
and the migration of virtual appliances among them. Currently the focus of virtual
infrastructure is on enterprise server networks and the resources it deals with are
mostly virtualized servers and storage (the network is assumed to be robust and is
used as a reliable connectivity provider among servers). Although the design of a
HyperNet Package is not intended to facilitate a virtual infrastructure, the concept
of a Network Hypervisor service can also be used in building a delivery network for
virtual infrastructure.
Programmable Network Infrastructure and Composable Network Stack proposals
show previous efforts towards making a programmable Internet. They are not directly
related to the design of the HyperNet architecture but are great examples of potential
platforms for building/operating HyperNet Packages. Moreover, they also suggest
what a future programmable network device might look like or be capable of. We
are not interested in dynamic composition of protocols as used in SILO because it
is extremely challenging to compose an optimal stack on the fly. Instead, we want
experts to compose a set of stacks and include them in a HyperNet Package along
with some simple rules describing when each of the stacks should be used. Moreover,
the HyperNet Package should not only include stacks, but also applications. Thus,
the HyperNet Package is a complete package which contains both the protocol stacks
necessary to build a network (or network services) and the applications that will use
them.
Chapter 3
Virtual Network Infrastructure
Providers (VNIPs)
Before describing our HyperNet Architecture, it is important to take a moment to
understand the evolving network virtualization efforts and the roles ISPs will play
in offering virtualization services in the future. In order for HyperNet networks to
provide end-to-end support between participants, we must understand the challenges
and possibilities of the emerging virtual network providers.
We envision a future where there exist providers who offer virtual network
infrastructure for a fee (or possibly some other form of compensation, say, the right to monitor your traffic and sell that information to advertisers). We call such providers Virtual Network Infrastructure Providers (VNIPs). We make some basic assumptions about all VNIPs and discuss the required common set of services in the later part of this chapter.
We envision two types of VNIPs. The first is a Hardware Infrastructure Provider (HIP), who owns and operates physical network hardware (routers, switches, PCs acting as routers, wired and wireless channels, etc.) that can be virtualized and assigned to different virtual networks.
[Figure 3.1: Types of virtual network infrastructure providers: HIPs and IRs. Backbone and regional HIPs own programmable routers from which slivers are reserved; Infrastructure Resellers (IRs) resell HIP resources.]
The second type of VNIP is an Infrastructure Reseller (IR) who does not own
the hardware, but rather purchases the virtual network resources from hardware
infrastructure providers (HIPs) and then resells them. Just as a travel agent does not
own airplanes but resells flights from various airlines to create a complete flight for a
customer, IRs can compose and resell HIP resources in more appealing/useful ways.
Fig. 3.1 illustrates the two types of VNIPs: HIPs and IRs. A sliver in the figure is simply a virtual node (along with its resources) reserved from a physical node.
To abstract the types of devices available to a HyperNet network, we define four types of virtualized resources that a hardware infrastructure provider can make available (a minimal sketch of these resource types follows the list below):
Programmable Routers (PRs) can be programmed via one of several standardized interfaces. (For now we assume that a programmable router provides a Unix-like user interface and that, if granted, users have superuser access to install or run any program on it. Other interfaces may also be supported in our future HyperNet implementations.) PRs may be virtual or physical routers, but they are reserved for use by the HyperNet network. Here, programmable means the authorized end users (or the HyperNet Package that acts on their behalf) can load any program on the PR and run it. This gives the HyperNet Package complete control over the processing of packets that pass through the PR. Many of the current VNIPs (e.g., GENI, PlanetLab, Emulab) provide such programmable routers (typically in the form of a Linux PC).
Way Points (WPs) are non-programmable routers that a user can only tunnel packets through en route to a PR or end system. An example of a way point could be an OpenFlow switch whose forwarding table can be tweaked by the HyperNet Package to forward (tunnel) network packets to some next hop. Just like a programmable router (PR), a Way Point may be virtual or physical, but it is exclusively reserved for use by the virtual network created by the HyperNet Package. Unlike a PR, a way point only allows the HyperNet Package to configure the routing table (or forwarding table) in the way point; a way point is not programmable by the end user. If we consider programmable routers as a way of controlling how each packet in the network is processed, then way points can be seen as a lightweight way of controlling how network packets (typically in units of flows) are routed. Compared with programmable routers, Way Points provide a lightweight way of defining paths, which is often all that is needed.
Programmable Servers (PSs) are virtualized resources offered by the VNIP that provide computation and storage at locations inside the network. Programmable servers typically have huge amounts of disk space and high processing power, and are capable of handling service requests and sending back results. The platform provided by a Programmable Server can vary, from an OS-level abstraction to an application-as-a-service abstraction. For example, the platform might provide a content management service interface, or a web hosting service interface, or possibly a bare virtual machine on which users can install a customized operating system.
Links. Links may be physical cables, fibers or wireless channels that physically
connect resources together, or they may be virtual channels. VNIPs allow links
to be allocated and set up for the HyperNet Package to connect resources owned
by the VNIP. VNIPs must also support links that connect to other VNIPs. Note
that some VNIPs will own hardware that is not directly connected. For example,
PlanetLab consists of hardware nodes spread across the world and interconnects
its infrastructure using links that traverse the existing TCP/IP Internet (i.e.,
IP tunnels). In this case the VNIP is using conventional IP routers as resources
in the topology, but we assume these are completely hidden/unobservable by
the user of the VNIP (i.e., the HyperNet Package). Examples of such links
would be GRE (Generic Routing Encapsulation) tunnels. While it is unlikely
that there exists a physical channel from every virtualized router to every other
virtualized router, it is possible (in fact likely) that a VNIP (HIP or IR) would
offer full N×N connectivity between virtual routers via virtual channels,
thereby significantly increasing the number of potential paths and indirectly
complicating the HyperNet Package's task of selecting an optimal topology for
the virtual network. Also note that, although physical links based on IP have
no ability to offer QoS, virtual channels between virtual routers may be able
to offer QoS guarantees, which need to be reserved during the network creation
phase.
Much like the current Internet, we envision several tiers of hardware infrastructure providers (e.g., backbone HIPs and regional HIPs, as depicted in Fig. 3.1).
3.1
In order for HyperNet Packages to be able to communicate with and reserve resources
from VNIPs, we need to make some basic assumptions about VNIPs and the APIs
they provide. These assumptions are not critical and could be relaxed or modified if
VNIPs were to change or converge to a standard in the future. Given that current
VNIPs are connected to the existing TCP/IP Internet, we assume that the Internet
Protocol (IP) is supported by all VNIPs and can be used to identify and access the
(virtual) resources offered by VNIPs, including physical nodes, end systems, PRs and
WPs. In other words, in our architecture, IP is guaranteed to be available for users,
VNIPs and hypervisor servers to communicate with each other. However, assuming
IP does not imply the software that will be loaded onto the virtual nodes has to use
IP protocols. For the HyperNet Package, we only use IP as the control channel to
identify and communicate with the resources (i.e., to deliver HyperNet Packages and
software Images, and to control HyperNet virtual networks). The actual data channel
can use HyperNet-specific protocols.
To help HyperNet Packages discover appropriate resources to include in the virtual
network, we assume that a VNIP knows the following (a sketch of the kind of resource record this implies appears after the list):
Location Information about the resources. This location information can be
represented by an IP address or a geographical location or even a covered area
(e.g., a zip code or a city/building name).
Static information about its resources, such as the resource type (virtual/physical router, virtual/physical PC, etc) and resource capability (total amount of
CPU/memory/disk, link capacity, etc).
Dynamic information about its resources, such as current available bandwidth,
current delay/loss rate of a physical link. That information can be monitored by
substrate resources and kept in a local repository [75]. The VNIP may further
summarize the information into long term average statistics and store them in
a centralized database.
It is not critical that this information be real time, but the more accurate it is,
the better HyperNet Packages will be at selecting good resources to use.
Physical topology information about its resources. Knowing the physical
topology can help a HyperNet Package allocate the most appropriate resources
and can avoid issues of placing multiple virtual resources on the same physical
resource. The physical topology can be queried by the HyperNet Package but
the VNIP may apply its own policy regarding how much information to give
back to customers.
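To make these assumptions concrete, the following is a minimal sketch, in Java, of the kind of per-resource record a VNIP might keep; every class and field name here is hypothetical, since the architecture only requires that location, static, dynamic, and topology information be queryable in some form.

// Hypothetical sketch; names are illustrative, not part of any VNIP's actual API.
public class VnipResourceRecord {
    // Location information: an IP address, a geographic location, or a covered area.
    String ipAddress;
    String coveredArea;         // e.g., a zip code or a city/building name
    // Static information: resource type and total capacity.
    String resourceType;        // e.g., "virtual router", "physical PC"
    int cpuCores;
    long memoryBytes;
    long diskBytes;
    long linkCapacityBps;
    // Dynamic information: monitored by the substrate and periodically summarized.
    double availableBandwidthBps;
    double currentDelayMs;
    double currentLossRate;
}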
3.2
We assume that there is a set of basic functionalities that all VNIPs provide,
including:
API calls to reserve and free resources.
API calls to connect the reserved resources together to form a virtual network.
In our design, a VNIP provides two API calls to reserve virtual links:
Tunnel createTunnel(Node a, Node b);
Tunnel createTunnel(Node a, Node b, NodeSet tunnel_points);
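As a minimal illustration (a sketch only, assuming a Java binding of this generic VNIP API; the vnip handle and the lookupNode() helper are hypothetical), the two calls might be used as follows:

// Sketch: reserve a default-path virtual link and a link pinned through a way point.
Node a = vnip.lookupNode("pr-east");     // hypothetical handles to two nodes already reserved
Node b = vnip.lookupNode("pr-west");

// Reserve a virtual link between a and b along the VNIP's default path.
Tunnel direct = vnip.createTunnel(a, b);

// Reserve a virtual link between a and b that is forced through a chosen way point.
NodeSet via = new NodeSet();
via.add(vnip.lookupNode("wp-central"));
Tunnel pinned = vnip.createTunnel(a, b, via);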
VNIPs may apply different information hiding policies. For example, one VNIP
may choose to expose its full physical topology to its users while another VNIP may
only give back a partial topology upon a discover topology request from its user.
3.3 VNIP API
To provide the basic VNIP services described above, each VNIP may provide its
own API calls for its users to use. Although the appearance (i.e., name, parameters
and return types) of the API calls may differ from one another, in our design, we
assume that a set of common VNIP API calls can be summarized for every VNIP to
follow. In fact, the GENI community is making a big effort towards such a common
set of API calls [76]. The GENI AM API works well across several different GENI
control frameworks including PlanetLab, ProtoGENI, InstaGENI and ExoGENI. Our
architecture simply leverages the GENI AM API as the VNIP API.
Chapter 4
HyperNet Packages
Emerging virtualization infrastructure, be it virtualized cloud resources or virtualized
network resources such as routers, switches, and links, requires a new abstraction to
efficiently program and utilize the infrastructure. In the following, we introduce a
new abstraction, called a HyperNet Package, that enables average users to effectively
use these new and emerging programmable infrastructures.
4.1
to install, configure, and initialize the appropriate OS, web servers, databases, file
systems, and CMS software. Instead, a normal user can download a fully configured
and ready-to-run content management appliance from a virtual appliance store
and simply run it in a virtual machine. All the expertise is contained in the virtual
appliance.
Because virtual appliances run on VMs, they are highly portable and can be run
on any platform that supports VMs. While this advantage is indeed a consequence
of virtualization, it is not necessarily the main contribution of virtual appliances.
Instead, one can argue that the key contribution of the virtual appliance abstraction
is its ability to bundle software and expertise into a package that can be run by
the average user who would otherwise not be able to set up such a complex system.
Unfortunately, a similar abstraction for virtualized networks does not exist. Ideally,
one would like a similar abstraction, where all that is needed to run a custom,
application-specific network is to obtain the appropriate virtual network appliance
(i.e., file/package) and run it.
Given such an abstraction, one could imagine companies or individuals putting
together a wide range of special-purpose virtual networks customized for a particular
application or application domain, or designed to support a particular set of service
requirements (QoS), network sizes, sets of participants, etc.
To illustrate the HyperNet abstraction, consider a video conferencing network,
in which a conference organizer downloads a HyperNet Package from a web site
(e.g., a HyperNet market, similar to Apple's app store concept), and uses it to
create a virtual network specifically designed for video conferencing and tailored to the
participants of her conference. For example, the HyperNet Package might be designed
for MPEG video [78] and include code (deployed on virtual routers at strategic
locations) that gives priority to packets containing I frames over those containing
P and B frames. (I frames are intra frames, which contain the information of a full picture in a video and can be decoded independently of P and B frames; thus, an I frame is more important in MPEG video decoding than P and B frames, which only include information about changes from other frames.)
of the virtual network and the software to run on the virtual routers. As a result,
the average user can simply download and use a HyperNet Package without having
to know how the network is setup, configured, or operated.
4.2 Establishing Context
The first thing a virtual appliance must do is to establish the context it needs in
order to offer its services. For example, a virtual appliance that acts as a firewall, or an
intrusion detection system, or a network nanny begins by specifying the characteristics
of the virtual machine (VM) on which it must be run. In particular, the appliance
will need at least two network interfaces: one facing the Internet and one facing the
local area network (LAN). A virtual appliance that offers a video game experience
needs to ensure that the VM has the necessary graphics/display capabilities needed
to offer the game.
A similar problem arises in the context of HyperNet Packages. When a HyperNet
Package is about to start up, it needs to create a topology from the available resources
which involves discovering the available resources, selecting the ones to be used,
and initializing them. It also means providing abstractions/methods by which the
participating parties can access/use the newly created virtual network. In other
words, a HyperNet Package must contain code that can create a virtual network
topology from the available resources. Once created, the HyperNet Package needs
to load application-specific software onto the resources, configure the software and
start it. The HyperNet Package must load different types of software on the nodes
depending on whether they are routers or end systems. Finally the HyperNet Package
must be able to watch or monitor the resulting network to react to any failures
or changes that need to occur (e.g., a new participant wants to join the network).
In short, a HyperNet Package is essentially a program that establishes the virtual
network and monitors it. The questions are: what does this program look like, and
where does it run? In the following section we describe the HyperNet architecture
including the design of a HyperNet Package and the platform (Network Hypervisor)
on which it is run.
4.3
[Figure (architecture overview): HyperNet Packages such as Video Net and CDN Net, each bundling application software, creation scripts, protocol stacks, configuration, and topology specs, run on top of the Network Hypervisor via the Hypervisor API; the Network Hypervisor in turn talks to the underlying providers through interfaces such as an IR API, SDN providers, and a local HIP API, with end devices (game servers, caches, Xboxes, Wiis) attaching to the resulting virtual networks.]
load the necessary software and/or configuration files on each node of the topology,
and then monitor and adapt the topology over time as network conditions change
and network participants come and go.
Below the Network Hypervisor reside the virtual network infrastructure providers
(VNIPs) described earlier, which provide the resources needed by the HyperNet
Package. The Network Hypervisor serves as a common interface to the various VNIPs,
which typically have their own particular interface/API.
Running on the Network Hypervisor are any number of HyperNet Packages, each
representing a special-purpose virtual network. Each HyperNet Package contains all
the application-specific software, configuration files, topology specification code, and
monitor code needed to create the virtual network and monitor it as it runs. HyperNet
Packages interact with the Network Hypervisor via the Network Hypervisor API. The
Network Hypervisor API attempts to hide the details of the underlying VNIPs from
the HyperNet Package, allowing the HyperNet Package to be written independent of
the specifics of any particular VNIP. In addition the Network Hypervisor monitors the
underlying VNIPs and provides upcalls to the HyperNet Package (akin to interrupts
in an operating system) so the HyperNet Package can adapt to changes in network
characteristics or network membership.
Once a HyperNet Package is deployed and running, participants can join or
leave the virtual network at any time. Any such membership changes are reported
to the HyperNet Package via an upcall from the Network Hypervisor.
In this
sense, the Network Hypervisor serves as the rendezvous point for all virtual network
changes/modifications. All virtual network instantiation/teardown requests from the
HyperNet Packages and all participant join/leave requests are handled by the Network
Hypervisor. As a result the Network Hypervisor service must be able to scale to handle
the potentially large number of requests that it may need to service. Fortunately, the
Network Hypervisor is highly parallelizable, allowing us to use cloud-like services to scale it as the number of requests grows.
4.4
Consider the video conference HyperNet Package described earlier in section 1.4.
Imagine that end user A wants to have a video conference with end user B. User A
begins by downloading a video conferencing HyperNet Package from the HyperNet
market. The HyperNet Package is in the form of an executable application. User
A then executes the HyperNet Package and, when prompted, enters A's and B's IP addresses.
networks.
4.5 HyperNet Roles
There are six roles involved in building and using a HyperNet Package. Specifically,
a HyperNet system consists of Virtual Network Infrastructure Providers (VNIPs),
Network Hypervisor Providers, HyperNet Builders, Network Creators, HyperNet
Participants, and the HyperNet Marketplace. The relationship among those roles
is illustrated in Figure 4.2.
[Figure 4.2: HyperNet roles. The HyperNet Builder uploads HyperNet Packages to the Marketplace; the Network Creator downloads and executes a HyperNet Package on the Network Hypervisor Provider, which negotiates with and reserves resources from the underlying VNIPs; HyperNet Participants download the package and join the resulting HyperNet network.]
A network creator can be an average user. A network creator does not need to be a
HyperNet Participant in the virtual network it is creating, but it can be if it wants
to.
A HyperNet Participant is an end system that joins and uses the virtual network.
Just like a network creator, a participant must also obtain (purchase/download) a
copy of the HyperNet Package from the HyperNet marketplace and execute it. The
only difference is that a participant only executes the end system part of a HyperNet
Package, which we call a HyperNet End System Package. A HyperNet End System
package helps a participant join the virtual network and it contains software needed
to communicate across the virtual network.
The HyperNet Builder is the individual or company with the expertise to
write/develop a HyperNet Package. Much like there are iPhone app developers for
iPhones, there will be HyperNet Builders that design and sell HyperNet Packages.
They are network experts and are well-versed in the Network Hypervisor API calls.
A HyperNet Package contains two parts: the HyperNet router part and the HyperNet
end system part. The HyperNet router part² is downloaded and run by the Network
Creator to create the virtual network and initialize the routers that comprise the
network. The HyperNet end system part (that we call a HyperNet End System
package) is downloaded and run by a HyperNet participant in order to join and make
use of the virtual network. HyperNet Builders upload their HyperNet packages to the
HyperNet Marketplace where those packages can be downloaded by Network Creators
and HyperNet Participants.
The bottom layer of the system are the Virtual Network Infrastructure Providers
(VNIPs), which provide the (virtual) resources that will be used to construct the
virtual network.
²In the rest of the thesis, unless otherwise specified, a HyperNet Package means the router part of the HyperNet Package.
gate for network innovation, especially for small networks, since network creators
no longer need to be large service providers who normally offer long-term network
services to a potentially huge number of users all over the world. Instead, individual
network creators can create small networks (both in terms of scale and duration)
that last a short period of time and are only for personal use, e.g., a QoS-enhanced
video conference network or a low-delay gaming network.
Chapter 5
The Network Hypervisor
The Network Hypervisor provides a unified platform on which HyperNet Packages can
run to create application or service-specific virtual networks using the resources of
the underlying VNIPs. The Network Hypervisor's role is to provide API calls that
make it easy for HyperNet Builders to create the desired virtual network topologies.
In this chapter, we explore the operations (API calls) that a Network Hypervisor
should support to help HyperNet Builders create HyperNet Packages.
We begin by describing the steps needed to build a HyperNet Package. We then
discuss the problem of discovering and reserving the VNIP resources needed to build
the HyperNet Package. Having identified the tasks a Network Hypervisor needs to
do, we present a set of Network Hypervisor API calls that achieve those goals.
5.1
The steps needed to build a HyperNet Package are summarized below:
Identify participants: specify which participants should be part of the HyperNet network.
Find the best attachment point for each participant to connect to the HyperNet network.
Find the best way to connect attachment points to form a topology for the HyperNet network.
Load HyperNet-specific software stacks, configuration files, and runtime scripts onto specified programmable nodes in the HyperNet network.
Reserve the corresponding programmable nodes and virtual links from the VNIPs and deploy the HyperNet network.
Monitor the usage of the network so as to detect changes and failures, send feedback, and charge users.

5.1.1 Identify Participants
The purpose of this step is to specify which participants should be part of the
HyperNet network. This step may require each participant to send a join request
message to the hypervisor. This join message will allow the hypervisor to verify
the participant's identity and record the participant's location information, so that it
is later possible for the hypervisor to map this participant to candidate attachment
points (see Section 5.1.2) that are close to it.
To support this step, the Network Hypervisor needs to offer an interface for
participants to join a virtual network (so as to record information such as IP address,
end system type, connection type, or platform). It should also be secure enough so
that it is hard for a malicious user to illegitimately join the HyperNet network.
To add participants to the virtual network, we designed a join() API function call
to help a HyperNet Builder implement the end system part of a HyperNet Package.
5.1.2
participants relative to a VNIP's PR resources (i.e., it must find the network distance
between a participant and potential attachment point candidates). Moreover, the
service must be scalable enough to handle large numbers of participants scattered all
over the world.
Similar problems have been studied in the past, but in a slightly different context.
For example, in today's Content Distribution Networks (CDNs), the content providers
want to find the best content caches to serve subscribers. The content caches should be
both close enough to the content subscribers to reduce network latency and powerful
enough to quickly satisfy requests from subscribers. The solution used by many CDNs
is to play Domain Name System (DNS) tricks. Upon receiving a DNS request for a
web page, the local DNS server redirects the request to a CDN mapping server (the IP
address of the CDN mapping server is pre-configured into the local DNS server by the
CDN provider). Knowing information about the location of local DNS servers, the
CDN mapping server can approximate an end-users network location by assuming
the user is near the DNS server from which the request arrived. The CDN mapping
server then chooses a nearby available caching server that is capable of responding to
the request. Thus, a CDN network provider is able to choose the best content caches
for its customers.
Our Network Hypervisor implementation could take a similar approach to solve
this problem in the context of the Network Hypervisor. In this case, the Network
Hypervisor would first need to find the local DNS server for the participant (via
DNS reverse lookup using the participant's IP address). Knowing the participant's
local DNS server, the hypervisor could then choose the programmable router that is
closest to the local DNS server as the entry point for the participant. The problem
is that the Network Hypervisor needs to maintain a mapping from local DNS servers
to nearby programmable nodes. However, creating such a mapping requires network
location information about both programmable nodes and DNS servers (and, like the
CDN solution, this solution relies on cooperation with the DNS system).
An alternative way is for the Network Hypervisor to delegate this mapping task
to the corresponding underlying VNIPs. In this way, the Network Hypervisor would
pass the request on to the VNIPs and let the VNIPs determine whether they have
programmable routers that are near the participant. Because two VNIPs may both be
close to a particular participant, the (estimated) distance from the participant to the
nearest programmable router must also be provided to help the hypervisor select the
appropriate programmable router. Yet another alternative is to create a third-party
service that is responsible for learning the location of VNIP resources and being able
to find resources near participants. For our prototype implementation we took this
approach, implementing a third-party service which we describe in Section 6.2.
5.1.3
Given the list of VNIP resources, the HyperNet Builder must create an appropriate
virtual network topology.
5.1.4
After the topology of a virtual network is defined, the HyperNet Builder needs
to load specific software packages, configuration files, and libraries onto specific
programmable nodes in the topology. A mechanism needs to be designed to load
application-specific code onto VNIP nodes and onto participants' end systems.
We assume that the HyperNet Builder, as an experienced programmer, knows
which network code (including the protocol stacks, application-specific code and any
services that run on a VNIP node) should be used in which circumstances. For
example, in the case of a High-Definition video broadcasting HyperNet network, when
a participant with a small-screen cell phone connects to the virtual network via a
wireless link, a special appliance should be loaded on the attachment point PR which
connects to the participant to deal with video compression and high loss rate. As a
result, the HyperNet Package (designed by the HyperNet Builder) should be able to
load any pre-defined and pre-configured software packages (e.g., virtual appliances)
onto any reserved programmable node. We developed an API call loadApp() (a
more detailed description will be given in Section 5.3) for a HyperNet Builder to load
and execute any application (e.g., code, scripts, and configurations) onto any reserved
programmable nodes.
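For instance, the following sketch (assuming a Java binding of the Hypervisor API; the Application constructor, the appliance name, and the attachmentPoint variable are illustrative) shows how a HyperNet Package might load a pre-configured video-compression appliance onto the attachment point PR serving such a wireless, small-screen participant:

// Sketch: load an application-specific appliance onto a reserved programmable node.
Application videoAppliance = new Application("video-compression-appliance.tar.gz");
Boolean ok = hypervisor.loadApp(attachmentPoint, videoAppliance);
if (!ok) {
    // The appliance could not be registered or loaded; the HyperNet Package must
    // fall back, e.g., by serving this participant without transcoding.
}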
HyperNet Participants do not use loadApp() to load appliances. Instead, we
ask a HyperNet Participant to download and use a HyperNet End System package,
which includes all the software necessary for a participant to join and use a HyperNet
virtual network.
5.1.5
This step involves reserving the specified VNIP resources and setting up virtual links
among them. Once the virtual network has been defined (say via the RSpec in GENI),
it simply needs to be specified to the VNIP to be deployed. If the virtual network
includes programmable nodes in multiple VNIPs, then the network specification
(i.e., an RSpec file in GENI) needs to be provided to all involved VNIPs. The
programmable nodes needed to create inter-VNIP virtual links must also be identified
in the network specification so that special virtual links (e.g., a GRE tunnel) could
be created connecting programmable nodes in different VNIPs.
5.1.6
Feedback is also useful to HyperNet Builders to debug the HyperNet Packages they are creating.
5.2
Each HyperNet Package will only use a subset of the resources available from VNIPs.
The question is how does a HyperNet Package discover and define/specify the set of
resources it wants to include in its virtual network? In general, HyperNet Packages
have little desire to see all the possible resources a VNIP has (and VNIPs may
not want to show all of their resources and connections).
Instead, HyperNet
Packages typically want to know as little as possible about the underlying VNIP
resources and the VNIP physical topology. Only in cases where the details make a
big difference does the HyperNet Package want to drill down to the details of the
VNIPs resources/physical topology.
To accommodate multiple levels of detail, the Network Hypervisor begins by
providing the HyperNet Builder with a high-level, abstract, view of only the VNIP
resources that are immediately needed by the HyperNet Package, namely, Attachment
Points for the HyperNet Participants (the HyperNet Participant list can be predefined by a Network Creator via, e.g., a configuration file, which we will describe
later). The Network Hypervisor then uses the concept of a Transparent Tunnel to
connect attachment points together. The key feature of a Transparent Tunnel is
the ability to drill down to see the details (i.e., the resources) used to create the
tunnel, and to modify or adjust those resources if greater control is desired. Typically,
HyperNet Packages will view the VNIP topology at any of three levels of detail, as
described in the following sections.
5.2.1
Fig. 5.1 shows a topology in which only Attachment Points in the VNIP are visible.
Under this view, the VNIP only finds and reveals the nearest PR to every participant.
We call the links connecting Attachment Points Transparent Tunnels. If a VNIP
controls all the PRs and WPs along a transparent tunnel's path, then it
5.2.2
For those HyperNet Builders who want to reserve certain types of programmable
routers/servers and/or control the network topology, the Network Hypervisor provides
a second level view of its topology and resources. Fig. 5.2 illustrates an example
network in which the HyperNet Package wants to utilize a programmable server,
centrally located between the attachment points and near a PR of its own. In this
view, the HyperNet Package asks to see resources that meet certain requirements
and then selects some subsets and maps them into the topology using Transparent
Tunnels. Again, transparent tunnels may contain hidden PRs, Way Points, and
5.2.3
For those advanced HyperNet Builders who desire to choose the paths in a transparent
tunnel themselves, the Network Hypervisor offers a third view of its resources. Fig. 5.3
illustrates this Detailed Topology View. In this view, the transparent tunnels become
[Figure 5.3: Detailed Topology View. Legend: Participant, Way Point, Transparent Tunnel, Programmable Server (PS).]
Not only can transparent tunnels between programmable routers and programmable servers be customized, transparent tunnels between participants and their
attachment point PRs can also be customized. Of course, this customization requires
that the network devices between the participant and its attachment point routers
be controlled by the VNIP.
5.3
[Figure 5.4: The Network Hypervisor and HyperNet Library. HyperNet Packages (e.g., VideoNet, GameNet, MulticastNet, HomeNet) link against the HyperNet Library, including a Topology Lib (buildRPTree, buildSPTree, buildRing, buildStar, buildMesh, ...), a Routing Lib (loadRoutingProtocol, setAddress, setRoutes, ...), an App Lib (loadCMApp, loadCCApp, loadPFApp, ...), and more. The Hypervisor API offers Router Management calls (findPR, addPR, updatePR, removePR, loadApp), Topology Management calls (addTunnel, updateTunnel, removeTunnel, getShortestPath, findCentralNode, seeTopo, drillDown), and HyperNet Management calls (getConfig, registerHyperNet, buildTopo, updateTopo, tearDown). The Network Hypervisor communicates with the underlying VNIP Providers through per-HIP handlers.]
5.3.1 VNIP Handlers
A VNIP handler is the interface that the Network Hypervisor uses to talk to an
underlying VNIP. As mentioned earlier, we envision the existence of multiple VNIP
providers in the future Internet. The Network Hypervisor needs to be able to handle
the various protocols offered by each individual VNIP in order to communicate with
them.
A VNIP handler's job includes: (1) translating Network Hypervisor API calls into
the appropriate VNIP calls (e.g., reserve/remove/update a node/link, load a file onto
a node, etc.), (2) fetching the topology/resource-availability information from the
VNIP so that the Network Hypervisor can maintain the topology information (more on
this in Chapter 6), and (3) retrieving network usage statistics and monitoring data
from the VNIP.
5.3.2
The Network Hypervisor API calls are designed to give HyperNet Builders the
programming tools/environment they need to programmatically create and deploy
application-specific virtual networks. The API calls fall into
four classes: (1) Router Management, (2) Topology Management, (3) HyperNet
Management and (4) End System Management. Figure 5.4 shows the API calls
associated with each of the classes.
Router Management calls are used to find and add programmable routers into the
topology. Router Management API calls include:
Node findPR(Participant p, Restriction r)
findPR finds the nearest attachment point for participant p, satisfying
restriction r.
The Network Hypervisor first selects a small set of VNIPs that are thought
to be close to participant p. In our current Internet, a participant p can be
identified by its IP address. A VNIP can compute/keep a list of its best PRs
for every possible IP network, much like a CDN provider keeps a list of its
best caches for every DNS server. Each of the selected VNIPs then sends back
to the hypervisor its recommended attachment point, as well as the expected
network performance between the participant and the attachment point. A
lightweight probing mechanism can be used by each VNIP to generate the
performance result (e.g., ping for RTT, traceroute for number of hops, etc.).
The probe messages will be generated from each VNIP towards the participant.
The performance results will be collected by each VNIP and then reported to
the hypervisor. The hypervisor examines the returned probing statistics and
chooses the attachment point with the best performance (i.e., the closest PR
towards the participant). The restriction parameter r could be used to specify
the desired capacity of the attachment point, such as the CPU, memory/disk
space, and the number of available network interfaces on the attachment point.
Node addPR(Node pr )
This function reserves a programmable router pr. The data structure Node
allows the caller to specify the type of the programmable router (PC or VM)
and the capacities of the PR (CPU cores, memory size, disk space, etc.). If the
virtual network is not yet deployed, the instantiation of this PR is postponed
until the user calls buildTopo(). If the virtual network is already deployed, the
instantiation of this new PR is postponed until the user calls updateTopo(). This
API call will fail if the hypervisor does not think that a PR with the specified
capacity is available from the VNIP. Errors/failures might also happen at the
time of deployment when the VNIP tries to reserve the PR for the HyperNet
Package, which will result in an error message returned in the buildTopo() call
or in the updateTopo() call.
The current addPR() API call does not consider business models but we can
imagine that this function call is the primary method by which a HyperNet
virtual Network Creator pays for resources. As a result, one might consider
adding other parameters (e.g., credentials) to this function call in the future to
support various business transactions.
Node updatePR(Node pr )
This function updates the characteristics/capacity of a programmable router
pr that is already in the network. The caller identifies the
existing PR via the virtual id field of that pr. Again, the actual effect is
deferred until the user actually deploys or updates the network via buildTopo()
or updateTopo(). This API call will fail if the hypervisor thinks that the newly
requested characteristics are unavailable.
Node removePR(Node pr )
This function removes a programmable router pr from the network, along with
all virtual links that connect to pr.
Boolean loadApp(Node pr, Application app)
This function loads an application app onto a programmable router node pr.
The app may be a self-extracting software package or a virtual appliance.
loadApp() can be used both before a virtual network is deployed and after
a virtual network is deployed. In the first case, app is simply registered with
the Network Hypervisor. Only after the network is deployed is the application
uploaded onto the corresponding PRs and executed. In the latter case, the
Network Hypervisor immediately uploads and executes the application on the
corresponding PR.²
The HyperNet Package might want to load additional software onto existing
programmable routers after the HyperNet network is deployed and running.
Sometimes the HyperNet Package will not know how to configure the routers
until the network is up and running and participants have joined.
For
example, when a participant has joined, a new (IP) address is assigned to the
corresponding HyperNet Participant as well as the new (GRE) interface on the
chosen gateway router. In order for all other routers in the virtual network to
know how to route to the newly joined participant, proper routing table entries
need to be added (unless there is a dynamic routing protocol running in the
²We assume that the Network Hypervisor has the necessary privileges, credentials, ssh keys, etc. to copy software onto programmable nodes and execute it. Network Creators might need to provide this information when they first create a HyperNet network, or the Network Hypervisor might use its own credential to allocate/reserve/control resources.
virtual network). This configuration cannot be done when the network is first
deployed.
HyperNet Participants are responsible for finding, downloading, and executing
the corresponding HyperNet End System package to join and use the HyperNet
network. The HyperNet End System package includes all the software that is
necessary to run on the HyperNet Participants' end systems in order to use
the HyperNet network. Hence, software that runs on end systems is not loaded
using loadApp() API call. Instead, it is included in the HyperNet End System
package, which runs on a HyperNet Participant's end system.
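Taken together, the Router Management calls above might be combined roughly as follows. This is a sketch only, assuming a Java binding of the Hypervisor API; the participants collection, the attachmentPoints map, and the Restriction fields shown are illustrative.

// Sketch: reserve an attachment point PR for every known participant, then deploy.
Restriction r = new Restriction();
r.minCpuCores = 1;            // illustrative capacity requirements
r.minInterfaces = 2;

for (Participant p : participants) {
    Node candidate = hypervisor.findPR(p, r);      // nearest PR satisfying r
    Node reserved = hypervisor.addPR(candidate);   // actual reservation is deferred
    attachmentPoints.put(p, reserved);
}
hypervisor.buildTopo();       // instantiate the reserved PRs (and tunnels) at the VNIPs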
API calls related to Topology Management include:
Link addTunnel (Node pr1, Node pr2, NodeSet way points, Restriction r)
This function reserves a Transparent Tunnel between two nodes pr1 and pr2
that travels through a set of specified way points or PRs, satisfying restriction
r.
This call results in all the way points specified being configured with proper
forwarding table entries (or VLAN configurations) that forward packets from
pr1 to pr2 following the node order defined in NodeSet way points. Because a
way point is a restricted form of a programmable router, the list of way points
can contain programmable routers. When this occurs, the PR will simply be
treated as if it were a WP. When way points is null, the call returns a transparent
tunnel connecting pr1 and pr2 that follows the default path provided by the
underlying VNIP (typically the shortest path). Restriction specifies the capacity
of the transparent tunnel, such as its bandwidth, delay and loss rate. This
call will fail if pr1 and pr2 are not already included in the topology. If the
underlying VNIP decides that the required restriction cannot be met, then an
Programmable Router at the center of the given PRs. Center might mean
that it provides the minimum number of network hops to all given PRs, or
it might mean that it provides the minimum average RTT to all given PRs.
This function is very useful in cases where a centralized server is needed, or a
Rendezvous Point (say for a multicast tree) is needed. For example, to create a
virtual network for a real-time online first-person shooter game where each of
the players wants low delay to the game server, a centralized node needs to be
chosen.
Graph seeTopo()
This API call gives HyperNet Builders the ability to see the entire topology
available from all VNIP providers. This API
call gives HyperNet Builders more information about the underlying VNIP
topology, but it introduces more complexity in deciding which programmable
nodes to reserve and which tunnels to create. This API call needs support from
the underlying VNIP and the underlying VNIPs can control/limit what gets
returned (VNIPs may decide to hide part of their physical networks from the
user for security reasons, much like ISPs do today).
PathSet drillDown(Link myTunnel )
This function explores path choices within the transparent tunnel myTunnel.
Each returned path contains a number of way points (or PRs) contained in the
transparent tunnel. This is the main function for a HyperNet Builder to drill
down into a transparent tunnel to explore the potential path choices. Depending on
the VNIP's policy, it may not reveal all the path choices to the user. Moreover,
if myTunnel only contains a directly connected physical link or it traverses
through a set of IP routers over which the VNIP has no control, this API call
returns null.
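As an example of how these topology-related calls can be combined (together with findCentralNode() from Figure 5.4, whose exact signature is not spelled out here, so the code below is a sketch under assumed types rather than the actual API), a game-oriented HyperNet Package might do the following:

// Sketch: choose a central PR for a game-server network and build a star around it.
// 'playerPRs' is assumed to hold the attachment point PRs already reserved for the players.
Node center = hypervisor.findCentralNode(playerPRs);   // e.g., minimum average RTT to all PRs
hypervisor.addPR(center);
for (Node pr : playerPRs) {
    // Connect each player's PR to the center with a default-path transparent tunnel
    // (null way points and no restriction, per the addTunnel() description above).
    hypervisor.addTunnel(pr, center, null, null);
}
hypervisor.loadApp(center, new Application("game-server-appliance"));  // hypothetical appliance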
The previous calls dealt with the definition of the topology and allocation of
resources within a HyperNet. The following API calls deal with the management of
HyperNets such as registering a HyperNet with the Network Hypervisor:
This API call sends out a HyperNet join request to the Network Hypervisor and
waits for a TunnelInfo structure to be returned (TunnelInfo will be returned
by the Network Hypervisor upon receiving a corresponding addGW() API
call described later). TunnelInfo includes information necessary for the end
system to set up a tunnel with its assigned attachment point router. The join
request message includes: the HyperNetName of the HyperNet network, the
(optional) credential of the joiner, and (optionally) the participant's identity
information myInfo. A participant's identity information uniquely identifies a
participant within a HyperNet network. An example could be a participantID
assigned by the Network Creator. The participant information can further
include such things as the participant's IP address, or the participant's end
system connection type (wired or wireless). The hypervisor, upon receiving
a legitimate join request (via the checkJoin() API call described later), in
turn assigns a HyperNet-specific address to this participant and creates a
tunnel between the participant and the assigned attachment point router. The
hypervisor also informs the HyperNet Participant about its assigned HyperNet-specific address and the HyperNet attachment point router so the participant
can update its routing table, setting the assigned attachment point router as
the default gateway for all packets destined to the HyperNet virtual network.
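As a rough sketch of the end-system side (Java-style; the exact join() signature, the ParticipantInfo fields, and all identifiers are assumed from the description above rather than taken from the actual implementation):

// Sketch: a HyperNet End System package joining a virtual network.
ParticipantInfo myInfo = new ParticipantInfo();
myInfo.participantId = "B-042";           // ID assigned out of band by the Network Creator
myInfo.ipAddress = "203.0.113.7";         // illustrative end system IP address
myInfo.connectionType = "wired";

// Blocks until the Network Hypervisor returns tunnel information for this participant.
TunnelInfo info = hypervisor.join("VideoConfNet", credential, myInfo);

// info.hyperNetAddress: this host's assigned HyperNet-specific address.
// info.attachmentPoint: the assigned attachment point router, installed as the
// default gateway for all packets destined to the HyperNet virtual network.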
JoinRequest checkJoin(String HyperNetName)
This API call is called by the join request handling code in the HyperNet
Package to handle join requests. It waits for join requests made to HyperNet
network whose name is HyperNetName. Upon receiving a new join request,
this API call returns a JoinRequest data structure to the caller.
The
5.3.3
In addition to providing a basic set of API calls to achieve basic node-level and link-level tasks, our HyperNet architecture also provides a HyperNet Library which offers
advanced topology-level and system-wide functionality that helps HyperNet Builders
develop their HyperNet Packages. All HyperNet Library functions are implemented
using the existing Network Hypervisor API calls or other HyperNet Library API
calls. For example, the buildRing() Library call (which helps build a Ring topology)
makes use of the findPR(), addPR() and addTunnel() Network Hypervisor API calls
to achieve its functionality. The obvious advantage of the HyperNet Library calls
is that they make the HyperNet Builder's task of composing a HyperNet Package
a lot easier. The Library does not sit inside the Network Hypervisor but rather
is dynamically linked with the HyperNet Package. As a result, any third party
developer (instead of the Network Hypervisor service provider) can contribute to
make the HyperNet Library more useful. Of course, the Network Hypervisor provider
can also add more HyperNet Library calls at any time to make it more capable. The
current Library offers the following calls:
Topology buildRing (NodeSet nodes);
This Library call creates a topology connecting all the given nodes into a ring.
This call first uses addPR() to reserve all the PRs specified in parameter nodes.
It then reserves transparent tunnels connecting the PRs into a ring using the
addTunnel() hypervisor API call (see the sketch after this list of Library calls).
Topology buildStar (Node center, NodeSet nodes);
This Library call creates a star topology having the node center as the center
of the star. Each of the nodes in the given NodeSet nodes then connects to the
center with a transparent tunnel, forming a star topology.
Topology buildMesh(NodeSet nodes, int degree);
This Library call creates a mesh topology, connecting all the given nodes into a
connected topology. Each node on the connected mesh network has at least
degree number of links connecting it with other nodes. If degree is higher than
the number of nodes, then this library call will create a fully connected graph,
or a full mesh, as described next.
Topology buildFullMesh(NodeSet nodes);
This Library call creates a complete graph (or full mesh) topology connecting
the given nodes.
Tree buildRPTree(NodeSet nodes);
This Library call creates a Rendezvous-Point-Based tree, with all nodes in
protocol stacks onto each node in the topology and use those protocol stacks.
For example, to load and run a user-space PIM multicast protocol, we define a
pre-configured pimd [82] routing daemon to be loaded onto each node of myTopo
via this API call.
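As promised above, here is a minimal sketch of how a library call such as buildRing() could be composed from the basic Hypervisor API calls (Java; the Topology constructor, the NodeSet accessors, and error handling are assumed or omitted):

// Sketch: build a ring by reserving each PR and tunnelling neighbor to neighbor.
Topology buildRing(NodeSet nodes) {
    Topology topo = new Topology();
    int k = nodes.size();
    for (int i = 0; i < k; i++) {
        topo.add(hypervisor.addPR(nodes.get(i)));            // reserve every PR in the ring
    }
    for (int i = 0; i < k; i++) {
        Node a = nodes.get(i);
        Node b = nodes.get((i + 1) % k);                     // wrap around to close the ring
        topo.add(hypervisor.addTunnel(a, b, null, null));    // default-path transparent tunnel
    }
    return topo;
}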
Besides the set of HyperNet Library calls introduced above (which are
implemented using the Network Hypervisor API calls), we can easily imagine many other HyperNet
Libraries that can be useful in composing different kinds of HyperNet Packages.
For example, an Application Library could help the HyperNet Builder load well-configured applications (software stacks such as a Content-Management App, a
Congestion-Control Traffic Engineering App, or a Packet-Filtering App). Similarly
an Instrumentation Library could help the HyperNet Builder load monitoring
tools onto critical nodes in the topology to monitor resource utilization, networking
performance etc. An OpenFlow Library that only deals with OpenFlow networks
could help make it easier to, for example, create an OpenFlow controller that does
load balancing. A Redundancy Library could help the HyperNet Builder create
backup servers and backup paths, etc. Clearly the Library list could grow long, but
the point is, a third party developer can easily contribute to the HyperNet Library
using the existing basic Network Hypervisor APIs.
5.4
The main contribution of our HyperNet Package abstraction is that it minimizes the
effort needed by an average user (Network Creator) to deploy and operate a personal
specialized virtual network: the Network Creator only needs to run the HyperNet
Package. While that is all it takes to run some HyperNet Packages, some other
HyperNet Packages may need to be further configured by the network creator. In
this section, we describe the possible ways in which a Network Creator can further
tailor/configure an application-specific HyperNet network.
parameters far beyond the following list for Network Creators to manage their virtual
networks (a sample configuration sketch appears after the list):
Maximum number of Gateways/Programmable nodes allowed: This parameter
allows the Network Creator to control the scale (as well as cost) of its virtual
network.
Expand Policy: Some HyperNet networks might be designed to be expandable
in the sense that as new participants join, the HyperNet Package automatically
adds new programmable routers to the network to optimize the virtual network
topology. Expand policies define rules such as when and where in the network
a PR should be added, when should PRs merge, how many participants can
share a single gateway, etc.
Height of the Tree: In the case of creating a multicast tree HyperNet network,
intuitively, the higher the created tree, the more efficient multicast will be.
For example, if the height of the tree is 1, the multicast sender will directly
connect with each receiver via a transparent tunnel. In this case, multicast
is the same as multiple unicast. If instead we place
programmable routers in between the sender and each receiver and create a
shortest path tree that maps each of the edges onto a physical link between two
neighboring programmable routers, the resulting tree will be much higher and
is much more efficient than the multiple unicast tree. This parameter controls
the height of the tree.
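As an illustration only (the HyperNet architecture does not fix a configuration format, and every key below is hypothetical and would be chosen by the HyperNet Builder), a multicast-tree HyperNet Package might expose these parameters in a configuration file along the lines of:

# Hypothetical configuration for a multicast-tree HyperNet Package
max_programmable_nodes = 8       # maximum number of gateways/programmable nodes allowed
expand_policy = auto             # add new PRs automatically as participants join
participants_per_gateway = 4     # how many participants may share a single gateway
tree_height = 3                  # height of the multicast tree (1 = multiple unicast)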
5.5 HyperNet Participants
5.5.1 Joining a HyperNet
5.5.1.1 Voluntary Join
generates a list of participantIDs, and gives the Network Hypervisor this list (via
the registerHyperNet() API call). A then informs B via some method not specified in
the HyperNet Architecture (e.g., via an email or a phone call) about the new virtual
network (e.g., the HyperNetName) and the participantID generated for B so that
only B knows about its participantID (step 0 in the figure).
Upon receiving the message, B configures the HyperNet configuration file for the
corresponding HyperNet End System package to send a join request to the Network
Hypervisor. Just like the configuration file for a Network Creator to use to create
a HyperNet network, the configuration file for a HyperNet End System package
uses a similar format (which is also designed by the HyperNet Builder) so that the
[Figure: The voluntary join process. Step 0: the Network Creator sends an out-of-band invitation to the participant. Step 1: the participant sends a join request to the Hypervisor Server. Step 2: tunneling information is delivered to the participant and its assigned programmable router. Step 2.5: the participant downloads the HyperNet End System package from the HyperNet Market. Shown: Network Creator, Internet Participant, Infrastructure Participant, and Programmable Routers.]
the HyperNet Package used by A. After receiving the join request, the HyperNet
Package first finds an attachment point C for participant B via the findPR() call. It
then generates a HyperNet-specific address for B to use to communicate within the
virtual network. Finally it sends a tunneling message to B containing B's assigned
attachment point PR and B's assigned HyperNet-specific address so that B knows
who to tunnel through to communicate with the virtual network. The tunneling
message may optionally include additional end system appliances that B needs to
install in order to use the specific virtual network. The HyperNet Package sends the
same message (except the appliance information) to C so that C can work with B to
successfully set up a tunnel. All communication between the HyperNet Package and
Participant B in the joining process goes through the Network Hypervisor.
Any HyperNet packet sent to B's HyperNet-specific address will be forwarded to
B via C. After receiving the routing message from the HyperNet Package, B will
also set up a local HyperNet Routing Table so that any packet destined to any
HyperNet-specific address will be forwarded to B's HyperNet attachment point C.
When an infrastructure participant D also wants to join this virtual network, it
needs to go through the same process as B, except that when the Network Hypervisor
receives the routing message from the HyperNet Package, it does not forward it to
D and its assigned gateway router. Instead, the Hypervisor Server directly sets up a
tunnel between D and the assigned gateway. The Hypervisor Server can achieve this
by, for example, assigning the switch interface so that the infrastructure participant
is connected to the same VLAN as the assigned gateway.
5.5.1.2 Involuntary Join
Address Translation) table entries on the jumping off point, and notify the requesting
participant (either an infrastructure participant or an internet participant) about the
jumping off point. Typically, the HyperNet package will choose the programmable
router that is closest to the third-party participant among the PRs that are already
in the HyperNet virtual network as the jumping off point (see Figure 5.6). After
[Figure 5.6: The third-party (involuntary) join process. Step 1: a participant sends a request to the Hypervisor Server. Step 2: NAT information is sent to the jumping off point PR and to the requesting participant. Step 3: routing entries are installed. Shown: Network Creator, Internet Participant, Third-Party Participant, Programmable Routers, Virtual Links, and Internet Connections.]
joinOther() if they also want to communicate with D through the HyperNet virtual
network. The HyperNet package first checks whether a HyperNet-specific address
and jumping off point PR are already assigned to D and if so, only the routing tables
on all PRs between the requesting participant and the previously assigned jumping
off point PR need to be updated.
5.5.2
[Figure 5.7: The HyperNet-specific application model. A HyperNet-specific app runs on a regular kernel and tunnels packets over overlay tunnels to its attachment point in the HyperNet virtual network.]
As indicated by its name, in this model, the HyperNet end system applications are
specifically designed and built for one type of HyperNet network. The applications
that run on the end system have been built to use the HyperNet's transparent tunnel
to tunnel into the HyperNet network. They use HyperNet addresses, speak HyperNet
protocols, etc. All the applications needed by the user must be included in the end
system part of the HyperNet Package.
As depicted in Figure 5.7, these applications run directly on a conventional kernel.
However, the communication between the special application and the attachment
point is via an application-specific IP overlay that the application must be fully aware
of. Thus, the application needs to be designed so that it supports and knows how
to talk the overlay protocol used to reach the attachment point, which also needs to
know the overlay protocol.
The major advantage of this usage model is that applications can be designed
to run in a standard OS context modulo the need to use the IP overlay for
communication. The downside is that such an application is specifically designed only
for one type of virtual network. Another drawback of this approach is the overhead of
writing HyperNet-specific applications. Ideally, we would like to make use of existing
conventional IP applications communicating over the HyperNet virtual network. The
next three models explore possible ways to leverage conventional applications.
5.5.2.2
The idea is to create an application gateway that speaks normal IP on one side
and HyperNet protocols on the other. The HyperNet side connects via a tunnel
to the attachment point.
[Figure: The application gateway model. A conventional app communicates in-host with a Tun/Tap-based gateway on the regular kernel, which in turn communicates with the gateway PR (attachment point) of the HyperNet virtual network.]
5.5.2.3
[Figure: The virtual interface model. A virtual interface on the regular kernel connects via a GRE tunnel to the attachment point of the HyperNet virtual network.]
[Figure: The well-configured OS model. A well-configured OS running above the regular kernel tunnels to the attachment point of the HyperNet virtual network.]
5.6
The Internet uses DNS (Domain Name System) to translate human-readable names
to/from network-formatted IP addresses. Similarly, the HyperNet virtual network
may also need a DNS implementation to support human readable names. There are
basically two ways to achieve domain name translation in our HyperNet architecture:
(1) a single DNS used by all HyperNet Packages (a One-Size-Fits-All DNS) and (2)
a single DNS for each HyperNet network (a One-For-Each DNS).
5.7
The design of the Network Hypervisor aims to serve a large number of HyperNet
Packages at the same time (e.g., the Network Hypervisor might receive HyperNet
network creation/tear-down/update requests concurrently, and it also needs to
maintain the runtime status of all active HyperNet networks at all times). The
Network Hypervisor maintains a table containing all necessary pieces of information
about each active HyperNet network, from the meta data such as HyperNet names,
credentials, etc., to participant-to-gateway mappings. We will describe more details in
Chapter 6. It is unlikely that any aggregation of such information will happen among
HyperNet networks since all HyperNet networks are different from each other, using
different resources, with different participants. Thus, the space (either in memory
or on disk) needed in the Network Hypervisor for all HyperNet networks will grow
linearly with the number of HyperNet networks running.
A participant in our HyperNet architecture can be any Internet user from
anywhere, joining or leaving any HyperNet virtual network. With regard to scalability,
one obvious question is, how many participants can join/leave/participate at the
same time in our system? The current Internet achieves scalability via DHCP and
Local Area Networks so that the joining and leaving of an Internet user is usually
handled by a local DHCP server and the impact is kept within its local area network.
While it is obvious that the joining or leaving of a participant only has an impact on the
target HyperNet virtual network it is joining/leaving, our Network Hypervisor also
delegates all the handling of a joining/leaving participant to the HyperNet Package.
The hypervisor only does a light-weight credential check for each joining participant
(or better yet, the hypervisor simply forwards the join request to the HyperNet
Package and lets the HyperNet Package do the credential check), then it forwards the
joining request to the HyperNet Package via an upcall. The HyperNet Package then
decides how to deal with this new participant, either assigning a new attachment
point for the participant or using an existing programmable router as the attachment
point. It may or may not update the topology. After a HyperNet network is deployed,
the updating and maintenance tasks all rely on the HyperNet Package. Thus, the
scalability of the Network Hypervisor is not so much the issue as the scalability of
each individual HyperNet Package. The hypervisor simply accepts API calls and
carries them out. Caching can always be used to help carry out expensive API calls
such as finding the shortest path and finding the best attachment point PR. The
hypervisor does need to keep track of the resources consumed by each HyperNet
network for billing purposes and as a result, the (memory/disk) space cost grows as
each HyperNet network grows. In theory, the hypervisor could delegate this task to
the underlying VNIPs to spread the work load.
Finally, what is the scalability of each of the hypervisor API calls? For example,
how much time will it take to find a nearby programmable router or to find an
optimum path between two PRs? As mentioned earlier, the hypervisor can distribute
the task of finding nearby PRs to VNIPs by maintaining an IP-address-to-VNIP
mapping (the hypervisor can get this mapping from e.g., the current DNS system).
The size of this mapping increases with the number of VNIPs and the number of IP
address segments. To find the shortest path between two programmable routers, one
can easily use a simple implementation of Dijkstra's algorithm that has a complexity
of O(N²), with N being the number of nodes in the graph. To find K shortest paths
between two nodes, one can run Dijkstra's algorithm K times with an approximate
complexity of O(KN²). While it is unrealistic to compute paths on demand, one
can pre-compute the paths and cache them with the assumption that the physical
topology will not change very often. Moreover, the hypervisor can map the task of
finding a PR-to-PR path to concatenating intra-VNIP paths with inter-VNIP paths
to further decrease the amount of time needed (with intra-VNIP and inter-VNIP
paths pre-computed).
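The caching idea can be summarized by the following sketch (Java; the Path type, the dijkstra() routine, the Node.getId() accessor, and the cache key format are assumed for illustration):

// Sketch: answer shortest-path queries from a cache of pre-computed paths,
// recomputing with Dijkstra's algorithm only on a cache miss.
java.util.Map<String, Path> pathCache = new java.util.HashMap<>();

Path getShortestPath(Node a, Node b) {
    String key = a.getId() + "->" + b.getId();   // illustrative cache key
    Path cached = pathCache.get(key);
    if (cached != null) {
        return cached;                            // physical topology rarely changes
    }
    Path p = dijkstra(a, b);                      // O(N^2) over the VNIP topology graph
    pathCache.put(key, p);
    return p;
}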
Chapter 6
A Prototype Implementation
To demonstrate the HyperNet architecture, we have implemented a Network Hypervisor prototype using GENI [23] as the VNIP. As mentioned in Section 2.2, GENI
provides an Internet-scale network testbed for researchers to create experimental
networks that are isolated from each other. Multiple control frameworks exist in
GENI, including PlanetLab [20], ProtoGENI [52] (previously known as Emulab [21]),
InstaGENI [84] (an extension of ProtoGENI), ExoGENI (an extension of ORCA [53])
and ORBIT [54].
[Figure 6.1: The Network Hypervisor prototype. Beneath the Hypervisor API sit a Location Manager, an Information Base, and a Topology Server/Routing Server; a VNIP Handler and a control-plane Probing Daemon connect the Hypervisor to each VNIP.]
An earlier implementation of the Network Hypervisor used the ProtoGENI API to talk with
ProtoGENI aggregates just to prove that our implementation is able to support multiple VNIP
protocols.
6.1 Information Base
The Information Base maintains information about each of the underlying VNIPs and the most up-to-date topology
information of the VNIPs. In addition, for each active HyperNet instance (i.e.,
currently executing HyperNet Package), the information base maintains a table which
includes: the HyperNet name, the participant ID list, creator information, active
participants information (e.g., their IDs, IP addresses, HyperNet addresses, and their
corresponding attachment points), the virtual network topology (such as reserved
nodes and links), a credential (which is verified whenever a new participant that does
not have a valid participant ID joins), and other monitoring information such as total
active time, resources consumed, and total cost. A NAT table is also maintained for
each of the running HyperNet networks, containing the HyperNet name, the third-party
participants (identified by their IP addresses), the assigned jumping off
point PRs, and the third-party participants' assigned HyperNet-specific addresses.
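As an illustration, the per-HyperNet state described above might be organized roughly as follows; this is a sketch only, and the class and field names are assumptions rather than the prototype's actual code.

    import java.util.*;

    // Sketch of the per-HyperNet state kept in the Information Base (field names are illustrative).
    class HyperNetRecord {
        String hyperNetName;
        String creator;
        String credential;                       // checked when a participant without a valid ID joins
        List<String> participantIds = new ArrayList<>();
        Map<String, String> attachmentPoints = new HashMap<>();   // participant ID -> attachment point PR
        Map<String, String> hyperNetAddresses = new HashMap<>();  // participant ID -> HyperNet address
        Set<String> reservedNodes = new HashSet<>();
        Set<String> reservedLinks = new HashSet<>();
        long totalActiveTimeSec;
        double resourcesConsumed;                // accumulated for billing
        // NAT table: third-party participant IP -> jumping off point PR
        Map<String, String> natTable = new HashMap<>();
    }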
6.2 Location Manager
The Location Manager determines the closest attachment point to a participant.
The location manager fetches location information about each VNIP through the
VNIP handler and saves it in the Information Base when the Network Hypervisor
starts up. A VNIP's location information includes the VNIP's local view of nearby
end hosts (e.g., IP address prefixes). Just as today's ISPs configure local DNS servers
for their customers and thus know their nearby customers (or at least know that the
local DNS server is next to certain customers), the Network Hypervisor can obtain
this information from its VNIPs. The location manager also coordinates the task of
discovering the network location of a participant among nearby VNIPs. It fetches
the information about how close each VNIP is to the participant (i.e., the network
probing result from one of the VNIP infrastructure nodes to the participant). The
location manager also caches the results. Finally the location manager can determine
the PR with the best network performance (e.g., the smallest Round Trip Time) near
a participant and return that programmable node in the findPR() API call. In our
implementation, we consider each of the GENI aggregates as a distinct VNIP since
different aggregates typically sit in different geographical locations, use different sets
of IP addresses, and manage resources only within their own aggregate.
To implement the location manager, we created a long-lived experiment in GENI,
in which we reserved one PR from each aggregate to assist the location management
service. These reserved nodes run a probing daemon in the control
plane to gather information for the location manager (see Figure 6.1).
Since
the Network Hypervisor has control over each of the reserved nodes, the Network
Hypervisor can load a probing application onto those nodes and fetch network
performance information between the reserved probe nodes and any participant.
The probing results from each of those nodes represent the closeness/nearness of
their corresponding aggregates (VNIPs) to the participant. The location manager
then picks an available PR from the closest VNIP which satisfies the capacity
requirements set by the HyperNet Package. In this way, the findPR() API call
discovers the closest PR from a set of VNIPs for each participant.
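As an illustration, the selection step of findPR() can be sketched as follows, assuming the location manager has already gathered per-VNIP probe RTTs and, for each VNIP, a list of available PRs that satisfy the package's capacity requirements; all names are illustrative and not the prototype's code.

    import java.util.*;

    // Sketch of findPR(): choose an available PR from the VNIP closest to the participant.
    class LocationManagerSketch {
        // rttMs: probe RTT from each VNIP's reserved node to the participant;
        // availablePRs: PRs per VNIP that already satisfy the package's capacity requirements.
        static String findPR(Map<String, Double> rttMs, Map<String, List<String>> availablePRs) {
            return rttMs.entrySet().stream()
                    .sorted(Map.Entry.comparingByValue())                 // closest VNIP first
                    .map(e -> availablePRs.getOrDefault(e.getKey(), Collections.emptyList()))
                    .filter(prs -> !prs.isEmpty())
                    .map(prs -> prs.get(0))                               // any available PR in that VNIP
                    .findFirst()
                    .orElse(null);                                        // no VNIP has a suitable PR
        }
    }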
6.3 Topology Server and Routing Server
The Topology Server (TS) and Routing Server (RS) fetch topology information
from the underlying VNIPs², save it in the information base, and implement the
corresponding hypervisor API calls. Since the seeTopo() API call returns a topology
with dynamic information about the availability of the resources in the topology (i.e.,
programmable nodes might be unavailable or links might be congested), the topology
server should update the (available) topology information in the information base
whenever a new HyperNet virtual network is created or a virtual network is torn down.
On the other hand, the underlying VNIP may also sell its resources to customers
other than the hypervisor. Thus, the topology server should also be responsible for
maintaining an up-to-date topology from the VNIP, either by continuously pulling
topology information from the VNIP or by having the VNIP push updated topology
information to the hypervisor whenever the topology changes. The Routing Server makes use
of the topology information in the information base and accomplishes tasks such as
calculating paths between two nodes, finding a central node among a set of nodes,
and building a multicast tree connecting a set of leaf nodes.
6.3.1 Finding a Central Node
The algorithm to find a central node among a set of nodes is simple. If the
provided nodes are all in the same aggregate, then for each pair of nodes in the set,
the hypervisor first finds the shortest path between them (assuming
all paths are symmetric and all virtual links have the same distance). Next, the most
popular node (i.e., the node that appears most frequently among all shortest paths)
is chosen as the candidate central node. Finally, the characteristics of the candidate
node are checked. If it satisfies the central node requirements (processing power, disk
space, etc.), it is returned as the central node; otherwise the second most popular node
is chosen as the candidate node. This process is repeated until it finds a candidate
node which satisfies the central node requirements. If the given set of nodes are from
² VNIP topology information contains the topology that a VNIP exposes to the hypervisor via the VNIP API. It does not have to be the physical topology. In cases where the VNIP wants to hide part of its physical topology, virtual tunnels can be returned by the VNIP.
multiple aggregates, then the TS/RS uses the location manager to figure out the
best aggregate that provides the minimum average RTT to all given nodes. Then an
available PR is selected from that aggregate to serve as the central node.
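The following is a minimal sketch of the single-aggregate heuristic just described, assuming the pairwise shortest paths have already been computed; all names are illustrative and not the prototype's code.

    import java.util.*;
    import java.util.function.Predicate;

    // Sketch of the single-aggregate central-node heuristic described above.
    final class CentralNodeSketch {
        // allShortestPaths: one pre-computed shortest path (a list of node names) per pair of nodes;
        // qualifies: checks the central-node requirements (processing power, disk space, etc.).
        static String findCentralNode(List<List<String>> allShortestPaths, Predicate<String> qualifies) {
            Map<String, Integer> frequency = new HashMap<>();
            for (List<String> path : allShortestPaths)
                for (String node : path)
                    frequency.merge(node, 1, Integer::sum);        // how often each node appears on a path
            return frequency.entrySet().stream()
                    .sorted((a, b) -> b.getValue() - a.getValue()) // most popular candidate first
                    .map(Map.Entry::getKey)
                    .filter(qualifies)                             // keep trying until one qualifies
                    .findFirst()
                    .orElse(null);                                 // no candidate satisfied the requirements
        }
    }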
6.4 Generating Realistic Topologies
In ProtoGENI (as well as in InstaGENI and ExoGENI), any two programmable nodes
in the same aggregate can be directly connected via a VLAN without going through
any other nodes. In other words, ProtoGENI nodes in the same aggregate form a fully
connected graph which is not reflective of real-world topologies and uninteresting
from an experimental standpoint. Moreover, the connection between two aggregates
in ProtoGENI is typically set up via one, or occasionally more, IP tunnels. Thus, the
best way to connect two programmable nodes in different aggregates is to directly
connect them via an IP tunnel (e.g., ProtoGENI's solution is a GRE tunnel).
In PlanetLab, the same holds true in the sense that the best way to connect
two PlanetLab nodes is also to directly connect them without going through any
other PlanetLab nodes. The only difference is that this connection is accomplished
via an overlay channel (e.g., a TCP connection or a UDP connection) instead of a
VLAN or a GRE tunnel. In short, nodes in both ProtoGENI (as well as InstaGENI
and ExoGENI) and PlanetLab form fully connected topologies and so using them as
VNIPs is uninteresting.
To make the topologies more interesting, we intentionally removed some of the
links after the Network Hypervisor fetches the topology from GENI, thereby avoiding
the uninteresting case in which the topology is fully connected. As a result, the
observed topology might not be fully connected. In our implementation, we developed
a random topology generator on top of ProtoGENI. It randomly generates a topology
connecting all available resources within a ProtoGENI aggregate. The generated
topology looks like a physical point-to-point topology, hiding the fact that the
resources actually form a fully connected graph. We then program the hypervisor
API calls used to explore the underlying physical topology to return information
from this randomly generated topology rather than the actual fully connected topology.
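As an illustration, a random point-to-point topology can be generated along the following lines; this sketch is not necessarily the generator used in the prototype. It builds a random spanning structure and then adds extra links with a small probability, so that the exposed topology is connected but far from a full mesh.

    import java.util.*;

    // Sketch of a random point-to-point topology generator over a fully connected aggregate.
    final class RandomTopologySketch {
        static Set<String> generate(List<String> nodes, double extraLinkProb, Random rnd) {
            Set<String> links = new HashSet<>();
            // 1. Build a random spanning tree so every node is reachable.
            List<String> shuffled = new ArrayList<>(nodes);
            Collections.shuffle(shuffled, rnd);
            for (int i = 1; i < shuffled.size(); i++) {
                String earlier = shuffled.get(rnd.nextInt(i));           // attach to an earlier node
                links.add(link(shuffled.get(i), earlier));
            }
            // 2. Add each remaining possible link with a small probability.
            for (int i = 0; i < nodes.size(); i++)
                for (int j = i + 1; j < nodes.size(); j++)
                    if (rnd.nextDouble() < extraLinkProb)
                        links.add(link(nodes.get(i), nodes.get(j)));
            return links;                                                // exposed instead of the full mesh
        }
        private static String link(String a, String b) {
            return a.compareTo(b) < 0 ? a + "--" + b : b + "--" + a;     // canonical undirected link name
        }
    }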
6.5 Hypervisor Performance
6.5.1 Experimental Context
Our test machine ran Mac OS X, with a 2.4 GHz Intel Core 2 Duo CPU and 8 GB of
DDR3 memory. We tested with both Java Runtime Environment (JRE) versions 1.6
and 1.7 to run the server as well as the HyperNet Package; the results are similar.
Communication between the Network Creator and the Network Hypervisor is via
XML-RPC, with a round-trip time of about 0.5 milliseconds. As stated earlier, we
used several GENI aggregates as the VNIPs.
6.5.2 Build Time
Build time measures the interval from the time when the Network Creator starts
executing a HyperNet Package to the time when the Network Creator creates a
topology description file describing the virtual network to be instantiated. In GENI's
context, the topology description file is an RSpec file. During this process,
the Network Hypervisor handles the HyperNet registration request, initializes all the
corresponding tables for the HyperNet network (including the participant list, the
join request list, the attachment point table, and the status table), creates the RSpec
file for the virtual network, saves the file to a per-HyperNet directory, and returns the
RSpec to the caller (the Network Creator). This step does not include any queries to
the GENI aggregate manager.
The performance results are shown in Figure 6.2. In this set of experiments, we
use the buildRandomRing() HyperNet library call to create ring topologies. The X
axis shows the number of nodes in the ring topology. The Y axis measures the build
time. The RSpec file has length O(n): there are n nodes and n connection links. From
the results we can see that the time spent creating a ring topology increases with the
number of nodes on the ring. However, to create a ring topology with almost 400
nodes, our Network Hypervisor only used about 8 seconds. Now let us take a look at
the time needed for a VNIP to actually allocate, initialize, and deploy the network.
Figure 6.2: Build time (ms) vs. number of nodes in the ring.
6.5.3 Deployment Time
HyperNet deployment time measures the interval between the time when the Network
Hypervisor gives the topology description file to the underlying VNIP and the time
when the VNIP has mapped the description onto the infrastructure, the resources
are reserved, booted, configured, and ready for use. This step is outside the control
of the Network Hypervisor; it is purely the responsibility of the VNIP. It typically
includes:
1. Identifying the nodes that need to be included and then reserving them. In
the GENI context, the RSpec file can specify particular nodes to include; GENI
calls them bound resources. The RSpec file can also specify that it only
needs a node, in which case GENI may choose any node; GENI calls these
nodes unbound resources. We have done experiments using both bound
and unbound resources and the results are similar. Two types of nodes are
available under GENI's control: physical PCs and virtual machines.
Figure 6.3: Deployment time (ms) vs. number of nodes, for physical PCs and virtual machines.
6.5.4 Concurrency Test
Multiple HyperNet Packages could call the Network Hypervisor APIs at the same
time and thus impose a heavy load on the Network Hypervisor.
Figure 6.4: Time spent in deploying 50 HyperNet Topologies with concurrent requests (time in seconds vs. number of nodes in each ring).
Although we
could implement our Network Hypervisor on multiple hosts with queuing and load
balancing, we were interested in knowing the performance of a single Network
Hypervisor handling concurrent requests.
In this set of experiments, we ran multiple HyperNet packages simultaneously.
Each package asked the Network Hypervisor to create a Ring topology with a
randomly selected number of nodes (from 1 node to 20 nodes). We recorded the
build time for each request.
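The test script can be sketched roughly as follows; the thread-pool structure and names are assumptions, and the actual call into the hypervisor library is represented by a placeholder.

    import java.util.*;
    import java.util.concurrent.*;

    // Sketch of the concurrency test driver: N threads act as Network Creators issuing
    // ring-building requests at the same time (class and method names are illustrative).
    final class ConcurrencyTestSketch {
        public static void main(String[] args) throws Exception {
            int creators = 50;                                   // 50, 100, or 200 concurrent requests
            ExecutorService pool = Executors.newFixedThreadPool(creators);
            Random rnd = new Random();
            List<Future<Long>> buildTimes = new ArrayList<>();
            for (int i = 0; i < creators; i++) {
                int ringSize = 1 + rnd.nextInt(20);              // ring of 1..20 nodes
                buildTimes.add(pool.submit(() -> {
                    long start = System.currentTimeMillis();
                    runRingPackage(ringSize);                    // stands in for buildRandomRing(ringSize)
                    return System.currentTimeMillis() - start;   // per-request build time
                }));
            }
            for (Future<Long> f : buildTimes) System.out.println("build time (ms): " + f.get());
            pool.shutdown();
        }

        // Placeholder for executing a HyperNet Package against the hypervisor.
        private static void runRingPackage(int ringSize) throws InterruptedException {
            Thread.sleep(ringSize);                              // stand-in for the real build work
        }
    }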
Figure 6.4 shows the build times when the Network Hypervisor receives 50
concurrent buildRandomRing() requests. Figure 6.5 shows the build times when the
Network Hypervisor receives 100 concurrent buildRandomRing() requests. Figure 6.6
shows the build times when the Network Hypervisor receives 200 concurrent
buildRandomRing() requests. In each set of experiments, the test script creates the
corresponding number of threads and each thread acts as a Network Creator executing
a package that requests to create a ring topology with a given number (picked from 1
to 20) of nodes. The test script needed 159 ms to create 50 threads, 233 ms to create
100 threads, and 658 ms to create 200 threads.

Figure 6.5: Time spent in deploying 100 HyperNet Topologies with concurrent requests (time in seconds vs. number of nodes in each ring).
This series of graphs shows that the larger the number of simultaneous requests,
the longer it takes for the Network Hypervisor to fulfill each request. In general, the
larger the number of nodes in the requested topology, the longer the build time. Most
importantly, even when there are 200 requests coming into the Network Hypervisor
simultaneously, the Hypervisor can still manage to handle all the requests in less than
10 seconds apiece. If we increase the number of concurrent threads on the test script,
it is possible that the server will fail with connection reset exceptions, because the
XML-RPC library we used in our Network Hypervisor supports a limited number
of concurrent TCP connections. For comparison, we did the same experiment using
sequential requests.

Figure 6.6: Time spent in deploying 200 HyperNet Topologies with concurrent requests (time in seconds vs. number of nodes in each ring).

In this case, instead of creating 200 threads at the same time, the
client waits until each thread successfully terminates (a thread terminates when
the Network Hypervisor reports that it has accomplished the build phase) before
creating the next thread. The result is shown in Figure 6.7. From the result we can
see that the build time for each request is within 0.5 seconds. The cumulative
build time for all 200 sequential requests is about 7.5 seconds (the graph only shows
the build time for individual requests). This result indicates that sequential request
handling out-performs concurrent request handling. The worst case in sequential
request handling is for a request to wait 7.5 seconds before it gets executed, which
still out-performs the 10-second worst case for concurrent request handling. As a
result, if we were to deploy a production Network Hypervisor, we would want to use
a queuing technique to handle concurrent requests sequentially.
Figure 6.7: Time spent in deploying HyperNet Topologies with sequential requests (time in seconds vs. number of nodes).
Chapter 7
Example HyperNet Packages
To demonstrate the concept of a HyperNet Package, we created several example
HyperNet Packages, including: a Multicast HyperNet, a MobileNet HyperNet, a
GameNet HyperNet, and an OpenFlow Load Balancing HyperNet. We tested all the
packages using the Network Hypervisor implementation described in Chapter 6.
7.1 A Multicast HyperNet
Figure 7.1: The physical topology used for the Multicast HyperNet: programmable routers (PR) connected by physical links.
Figure 7.2: The reserved multicast virtual network: sender S, attachment point gateways G0–G5, multicast routers M1–M4, and receivers R1–R5, overlaid on the programmable routers and physical links of Figure 7.1.
Figure 7.1 shows the physical topology and Figure 7.2 shows the reserved nodes and
links. G0–G5 are attachment point gateways and M1–M4 are multicast routers
connecting the attachment points. S is the sender of the multicast tree and R1–R5
are receivers in the multicast tree (in fact,
this multicast tree also allows any receiver to send multicast packets to other nodes).
The pseudo code for the HyperNet Package is illustrated in Algorithm 1, with the
actual Java code slightly exceeding 200 lines.
Algorithm 1: The Multicast HyperNet Package
myConfig = readConfig() //Read configuration file
regHyperNet(myConfig) //Register HyperNet
for every participant p in myConfig do
gateway = findPR(p)
gatewayList.add(gateway)
end for
myTree = buildRPTree(gatewayList)
for each programmable router pr in myTree do
loadApp(pr, PIMSMApp) //load PIM Sparse Mode Application onto each pr
end for
buildTopo() //deploy the topology
//Join Request Handling Process
while true do
joinRequest = checkJoin() //get join request
tunnelInfo = createTunnelInfo(joinRequest)
//set up tunnel between gateway and new participant
addGW(tunnelInfo)
end while
The sender and receivers are all participants in the HyperNet network. They
connect to their corresponding gateway routers via the Multicast HyperNet End
System package. Just like the HyperNet Package, the End System package also
includes a configuration file, in which a participant specifies its participant ID. As
shown in Figure 7.2, a legitimate participant successfully joins the multicast HyperNet
network by creating a GRE tunnel from the end system to the corresponding
attachment point.
Our experiment uses the pimd multicast daemon [82] and we ran PIM-SM
multicast on each programmable node, with M1 as the Rendezvous Point selected by
the HyperNet Package code. The pseudo code for the HyperNet End System package
is illustrated in Algorithm 2, with the real code slightly exceeding 50 lines.
7.1.1 Multicast Results
Table 7.1 shows the amount of time consumed in each step of building the multicast
topology. We created the same multicast tree 8 times.

Table 7.1: Time Required to create a Multicast Virtual Network

                     Build Time                          Deploy Time
Exp. #       Find Gateway (s)   Build Tree (s)    Deploy Network (s)   Join (s)
1            2.26               0.518             100                  0.685
2            2.26               0.44              101                  0.679
3            2.28               0.458             103                  0.699
4            2.28               0.416             102                  0.673
5            2.28               0.425             102                  0.659
6            2.26               0.43              81.6                 0.614
7            2.26               0.425             82.5                 0.696
8            2.3                0.438             104                  0.831
StdDev.      0.015              0.033             9.25                 0.06
Avg.         2.27               0.444             97                   0.692

On average, the time consumed
to find a gateway router for each expected participant is just over 2 seconds. In our
implementation, in order to choose the nearest attachment point for a participant,
the location manager in the hypervisor made use of two representative nodes in the
Utah aggregate and the UK aggregate to get the average round trip time from each
representative to the participant using 50 ping probes when the HyperNet Package
calls the findPR() API call. Figure 7.1 and Figure 7.2 only show the resources in
one aggregate because all participants are found near the Kentucky Aggregate. The
Network Hypervisor chose a random node from the VNIP closest to each participant
as that participants attachment point. This step may take longer in the future as we
include more aggregates (VNIPs) in the HyperNet network. We used the algorithm
described earlier to build the RP-based multicast tree. More precisely, we use the
buildRPTree() HyperNet Library call to build an RP-based tree. As shown in the
table, the time spent in step Build Tree is relatively small because we have a small
physical topology (Figure 7.1). However, we expect that even in a larger topology,
the time spent in building an RP-tree will not grow significantly as the number of
nodes increase, because the buildRPTree API call can take advantage of the cached
shortest paths between leaf nodes in order to find the rendezvous point and then
build the tree. Deploy Network takes most of the overall time for each experiment,
consuming about 100 seconds. This time is spent reserving nodes and links from
ProtoGENI, starting up the reserved nodes, uploading the pimd daemon software
(1.4 Megabytes) onto each reserved node, configuring pimd
properly on each programmable node and executing pimd. Interestingly, unlike the
other steps, the amount of time spent in this step may vary by as much as 20 seconds
across experiments. Since this step is carried out by ProtoGENI, the hypervisor has
no control over it. Finally, the average time it takes for a participant to join this
multicast network is about 0.7 seconds. This step is measured from when the HyperNet
End System package sends out a join request until the end system receives the
tunnel information (including instructions for setting up a GRE tunnel). To test
the correctness and performance of the multicast tree, we compared the loss rate
experienced by each receiver under different sending rates in cases when the sender
was multicasting and when the sender was sending out multiple unicast UDP flows.
Table 7.2 shows the performance results for multicast vs multi-unicast. Each link in
the reserved network had a bandwidth of 100Mbps, and the bandwidth of the GRE
tunnel between each participant and its attachment point was 1Gbps. As expected,
in the multicast case, the receivers only experienced minor loss rates even when the
sender is sending out packets at a rate of 90Mbps, which is close to the full capacity of
the network (100Mbps). However, in the multiple unicast case, the sending rate from
the sender to each of the 5 receivers cannot exceed 20Mbps if delivery is to remain
reliable (i.e., with only a minor loss rate experienced by the receivers). In all of our tests we used
iperf as the network performance testing tool. All results were generated by iperf
clients on the receivers.
7.2 A MobileNet HyperNet
Mobile devices are increasingly being used for data intensive applications such
as browsing web pages, watching videos, streaming music, downloading files, and
installing updates. Because these applications typically run over TCP, performance
can be seriously degraded when the first hop is a lossy or intermittent wireless
link, as is often the case with mobile devices [86]. Even a loss rate of 0.1% can
significantly affect TCP's performance. MobileNet addresses this problem by using
a well-known technique called TCP splitting [87] that breaks the TCP connection
into two parts: one TCP connection traversing the wireless link, and another TCP
connection traversing the wired portion of the path. Consider the network shown in Figure 7.3.
Figure 7.3: The MobileNet scenario: (a) an Internet participant P reaching a file server FS directly over the Internet; (b) P connecting through programmable routers in the Utah and Kentucky Aggregates, with a lossy first-hop link between P and its gateway (legend: Programmable Router, Lossy Link, Internet Participant, InterSP Link, File Server, Wired Link).
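The split-connection technique that MobileNet loads onto the gateway can be illustrated with a simple relay; the following is a minimal sketch (the listening port and server address are placeholders, and the thesis's actual relay code may differ) that terminates the client's wireless-side TCP connection and opens a separate wired-side connection toward the server.

    import java.io.*;
    import java.net.*;

    // Minimal split-TCP relay sketch: one TCP connection on the wireless side,
    // a second, independent TCP connection on the wired side.
    class SplitTcpRelay {
        public static void main(String[] args) throws IOException {
            int listenPort = 8000;                         // placeholder: wireless-facing port
            String serverHost = "10.0.0.2";                // placeholder: wired-side server
            int serverPort = 80;
            try (ServerSocket welcome = new ServerSocket(listenPort)) {
                while (true) {
                    Socket wireless = welcome.accept();    // connection from the mobile client
                    Socket wired = new Socket(serverHost, serverPort);
                    pump(wireless, wired);                 // client -> server bytes
                    pump(wired, wireless);                 // server -> client bytes
                }
            }
        }
        // Copy bytes from one socket to the other on a background thread.
        static void pump(Socket from, Socket to) {
            new Thread(() -> {
                try (InputStream in = from.getInputStream(); OutputStream out = to.getOutputStream()) {
                    byte[] buf = new byte[65536];
                    int n;
                    while ((n = in.read(buf)) != -1) out.write(buf, 0, n);
                } catch (IOException ignored) {
                } finally {
                    try { from.close(); to.close(); } catch (IOException ignored) {}
                }
            }).start();
        }
    }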
7.2.1 MobileNet Results
Figure 7.4: TCP throughput vs. loss rate with and without the MobileNet HyperNet (throughput in Mbps vs. loss rate in %, for normal TCP and for MobileNet with 2 ms, 20 ms, and 50 ms RTT to P).
Node H serves as the jumping off point PR for S, since S is a web server we created
using a PC in the Utah Aggregate.
Figure 7.4 shows the performance result of our MobileNet virtual network,
compared with regular TCP's performance. In all of our experiments, we use a fixed
Round-Trip Time of 100 ms between P and S. We used the traffic control toolkit tc
in Linux to control the delay and loss rate of each link. In particular we varied the
loss rate and delay on the first-hop link between P and G. To evaluate normal TCP
performance, traffic simply went between P and S without going through G and H.
For the TCP splitting tests, all traffic went through G and H. To understand how
much the delay of the wireless link between P and G affects MobileNet performance,
we used RTTs between P and G of 2ms, 20ms, and 50ms in our MobileNet tests, and
we varied the loss rate between 0% and 5%.
Looking at the normal TCP curve (the lowest curve on the graph), we see that
TCP performance decreases rapidly, going from 90 Mbps at a 0% loss rate to 4 Mbps
at a 0.1% loss rate and then quickly dropping toward 0 Mbps.
MobileNet (with a 2ms RTT between P and G) set up a split TCP connection
that was able to overcome small amounts of packet loss. The top curve shows that
even at loss rates up to 1%, MobileNet is able to maintain throughputs around 85
Mbps. As expected, moving G away from P (i.e., increasing the RTT between them)
reduces the throughput, but it is still better than normal TCP performance in all
cases.
7.3 A Multiplayer Gaming HyperNet
The multiplayer online gaming HyperNet Package we created aims to shorten the
network latency between the players and the game server, thereby improving the
gaming experience for all players. In this example, we created a gaming HyperNet
that automatically deploys a custom virtual network for the OpenArena game [89].
OpenArena is an open source multiplayer online game. The game has several publicly
accessible game servers to which participants can connect. However, being an open
source game, it also allows participants to compile and run their own game servers.
Consequently, it is possible for a set of players to identify the best location for a game
server (e.g., a location with the lowest delay for all participants), and then run their
own game server at that location.
The goal of our OpenArena Gaming HyperNet was to automatically select the best
location for a game server, run the game server at that location, and then set up a
virtual network to connect the game server to all the participants.
Algorithm 3: Multiplayer Gaming HyperNet Package
myHyperNet = readConfig() //Read configuration file
regHyperNet(myHyperNet) //Register HyperNet network
while participantNum < predefinedTotalNum do
joinRequest = checkJoin() //get join request
Participant p = new Participant(joinRequest)
myHyperNet.participantList.put(p) //add p onto participant list
participantNum++
end while
for every participant p in myHyperNet do
gateway = findPR(p)
gatewayList.add(gateway)
participantGWMap[p] = gateway
end for
//find a central node where we can run a game server
gameServer = findCentralNode(gatewayList)
for every gateway g in gatewayList do
addTunnel(g, gameServer) //create a virtual link
end for
//load Open Arena game server software onto the central node
loadApp(gameServer, OpenArenaServer)
buildTopo() //deploy the topology
Start the Open Arena Server
//connect all participants to the game network
for every participant p in myHyperNet do
tunnelInfo = new TunnelInfo(p, participantGWMap[p])
addGW(tunnelInfo)
end for
The pseudo code for the OpenArena [89] Gaming HyperNet Package is illustrated
in Algorithm 3. The HyperNet first reads the configuration file for the HyperNet
network, including the name of the HyperNet network and the password needed to
join the network. Then it registers the HyperNet network using the regHyperNet()
API call. Then the HyperNet waits for join requests from game players. Whenever
a player joins the HyperNet network, the Gaming HyperNet Package keeps a record
of the player's IP address. A player's IP address can be found from the join() API
call. When all players have joined (this is decided by checking the total number of
players who have already joined the game), the HyperNet Package then finds a nearby
attachment point router for each of the players via the findPR() API call. Next, it
finds a central node that provides the minimum average delay to all the gateways via
the findCentralNode() API call, which takes the list of participant (player) attachment
points as a parameter. It then creates a virtual channel from each attachment point to
the central node. The HyperNet Package then loads the OpenArena server application
on the central node via the loadApp() API call¹. Finally, the HyperNet Package
deploys the game network using the build() API call and then starts the OpenArena
game server with the following command on the central node:
./oa_ded.i386 +set dedicated 1
+exec server.cfg
+set net_ip 10.128.2.2
+set net_port 27961
The +set dedicated 1 parameter means this dedicated server is not visible to
the OpenArena master server. In other words, other Internet players are not able
to see this server from their in-game Internet Server list. It is dedicated to only
the three players in our game. The +exec server.cfg parameter means the server is
launched based on the configuration file server.cfg. The server.cfg file defines the
¹ In our experiment, we were using OpenArena server version 0.8.8, with the actual tar-ball size of 447 MB. It took about 48 seconds to load the application to the server node.
game parameters such as server name, the maximum number of clients allowed, the
maximum sending rate, the maximum allowed ping time from the clients, the game
types (e.g., tournament, team death match, last man standing, etc.), and the game
map. The HyperNet carefully configures this file to optimize the gaming experience
for the players. The +set net_ip and +set net_port parameters simply define the
listening interface and listening port number on the server. Again, the HyperNet is
responsible for finalizing these parameters to make sure that the game server will not
be blocked by firewalls or other networking issues.
At this point, the game virtual network is up and running. Next, the HyperNet
Package figures out the configuration for the corresponding gateways (IP addresses
that need to be assigned, the routing table entries that need to be added, etc.) for
each game player (participant). It then creates a GRE tunnel between each player and
its assigned gateway using the addGW() API call. In our case, the HyperNet Package
reserves 100Mbps bandwidth for all virtual links shown in Figure 7.5, which is more
than sufficient for the maximum game sending rate of 200 Kbps set in server.cfg.
The HyperNet network is fully expandable to allow players to join dynamically, i.e., by
finding and assigning more gateways to newly requesting game players. Moreover, it
can also be designed to locate new game servers according to the current players and
migrate all gaming data from the old server to the new one.
7.3.1 Gaming Results
In our experiment, we used two VNIPs: the Kentucky Aggregate and the Utah
Aggregate. Three game players are involved: two are located in Kentucky, the other
one is from Utah. As a result, two attachment point PRs are found from Kentucky
Aggregate and one attachment point PR is found from Utah Aggregate for the player
in Utah. We implemented the Game HyperNet Package and compared it with the
standard OpenArena game using existing public game servers. Using the Network
Figure 7.5: The deployed gaming virtual network, spanning the Utah and Kentucky Aggregates: game players P1–P3 (Internet participants) attach via GRE tunnels to programmable-router gateways G1–G3, which connect over SDN links to the game server GS; IS denotes a game server reachable over the public Internet.
7.4 An OpenFlow Load Balancing HyperNet
Figure 7.6: The load balancing topology: a source and a destination connected through OpenFlow switches sw1 and sw2, with two paths (left and right) between the switches and an OpenFlow controller C attached (legend: Programmable Router/OpenFlow Switch, Participant, Transparent Tunnel, Internal Connection, OpenFlow Controller).
7.4.1 Load Balancing Results
In our experiment, node source and node destination are two infrastructure
participants in the ExoGENI BBN Aggregate. Nodes sw1 and sw2 shown in
Figure 7.6 are programmable routers in the same Aggregate. After the OFLBH
network was deployed, we started a TCP flow using iperf [90] from node source to
node destination every 5 seconds, for a total of 20 individual TCP flows.

Figure 7.7: Load Balancer A total performance with no loss on either path (throughput in MBps vs. time in minutes).
We name each individual flow such that flow 1 starts 5 seconds earlier than flow 2,
etc. All flows last for 200 seconds. We measured both the individual throughput of
each flow as well as the total throughput on each path (left and right).
Figure 7.7 shows the total throughput on left and right paths under no loss using
Load Balancer A. As we can see, the load balancer makes full use of the bandwidth
on both paths (about 12MBps or 100Mbps).
Figure 7.8 shows the individual flow throughput using load-balancer A when there
is no loss on either the left or the right path. From the per-flow performance graph we can
see that for both the left and right paths, the throughput of each flow decreases as new
flows arrive. This result is due to TCP's congestion control mechanism: as new
flows arrive, the network starts to drop packets and TCP automatically adjusts its
congestion window size (and thus, throughput) for each flow. For the duration of
Figure 7.8: Load Balancer A per-flow performance with no loss on either path (throughput in MBps vs. time in minutes).
our experiments (about 200 seconds) we can see that the throughput of all TCP
flows gradually dropped to around 2 MBps. TCP's congestion control mechanism will
eventually balance the sharing of bandwidth such that each TCP flow uses a similar
amount of bandwidth.
The per-flow throughput and total throughput results for load-balancer B are similar
to the results for load-balancer A when there is no loss on either path, and thus we
omit the graphs here.
Figures 7.9 and 7.10 show the total performance as well as the per-flow
performance on both paths using load-balancer A when there is a 5% loss rate on
the left path. Due to A's load balancing algorithm, despite the 5% loss rate on the
left path, the OpenFlow controller still forwards new incoming TCP flows in an
alternating pattern onto the left and right paths. The result is that 10 flows are
forwarded to the left path and 10 to the right path. The flows on the left path
Figure 7.9: Load Balancer A total performance with 5% loss on the left path (throughput in MBps vs. time in minutes).
achieve very low throughput compared with the throughput of flows on the right
path.
On the other hand, if we use the load-balancer B, which makes load balancing
decisions based on the measured average per-flow throughput, we see from Figure 7.12
that only TCP flow 1 was directed to the left path. Because of the losses on the left
path, the average per-flow throughput is much lower on the left path than the right
path. Therefore no more flows are assigned to the left path after flow 1. TCP flow
1 was able to achieve a throughput of 0.5MBps. The remaining 19 TCP flows were
forwarded to the right path, sharing the total available 12MBps bandwidth. If we
compare Figures 7.9 and 7.11, we can see that in the case of 5% loss on the left path,
load-balancer A achieved a higher overall throughput between outside and inside
(the sum of total throughput on left and right paths) than load-balancer B. However,
when comparing Figures 7.10 and 7.12, we see that load-balancer B provides better
Figure 7.10: Load Balancer A per-flow performance with 5% loss on the left path (throughput in MBps vs. time in minutes).
fairness to all TCP flows going through the load balancing network in the sense that
almost all TCP flows achieved similar throughput; whereas using load-balancer A,
the flows on the right path achieve much higher bandwidth than flows on the left
path. As a result, depending on the different characteristics of each path (loss rate)
and the virtual network user's needs, the Network Creator might want to choose a
different kind of load-balancer to handle the load balancing task.
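To make the difference between the two policies concrete, the following is a minimal sketch of the per-flow path-assignment decision each controller makes when a new TCP flow arrives; it is plain decision logic, not tied to any particular OpenFlow controller framework, and all names are illustrative.

    // Sketch of the two load-balancing policies for assigning a new flow to the left or right path.
    class LoadBalancerSketch {
        enum Path { LEFT, RIGHT }

        // Policy A: alternate new flows between the two paths, ignoring path conditions.
        static class RoundRobin {
            private Path last = Path.RIGHT;
            Path assign() {
                last = (last == Path.LEFT) ? Path.RIGHT : Path.LEFT;
                return last;
            }
        }

        // Policy B: send the new flow to the path with the higher measured average per-flow throughput.
        static class ThroughputBased {
            Path assign(double avgFlowThroughputLeft, double avgFlowThroughputRight) {
                return (avgFlowThroughputLeft >= avgFlowThroughputRight) ? Path.LEFT : Path.RIGHT;
            }
        }
    }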
Figure 7.11: Load Balancer B total performance with 5% loss on the left path (throughput in MBps vs. time in minutes).
Figure 7.12: Load Balancer B per-flow performance with 5% loss on the left path (throughput in MBps vs. time in minutes; flow 1 is assigned to the left path and flows 2–20 to the right path).
Chapter 8
Conclusion and Future Work
In this thesis, we introduce the concept of a HyperNet. Modeled after a virtual
appliance, a HyperNet is a software package that contains all the necessary pieces
needed to create a special-purpose virtual network, including topology specifications,
protocol stacks, software packages, configuration files, and runtime scripts, so that
all that one needs to do to deploy a virtual network is to download a HyperNet from
the HyperNet market and run it. The HyperNet abstraction makes it possible
(and extremely easy) for an average user to create a virtual network, which otherwise
would be a challenging and labor-intensive task even for a network expert.
We introduce a HyperNet Architecture that supports the HyperNet abstraction.
We also define the concept of a Virtual Network Infrastructure Provider (VNIP) which
provides virtual networking resources to users for a fee. We describe our assumptions
about VNIPs and the common set of services provided by VNIPs. We also show
how these services map onto the services provided by today's existing virtual network
testbed providers (e.g., GENI).
At the heart of the HyperNet architecture is the Network Hypervisor, which
provides the platform for HyperNets to run on. The Network Hypervisor acts as the
broker between HyperNet users (i.e., Network Creators and HyperNet Participants)
and VNIPs. The Network Hypervisor talks with different VNIPs on behalf of HyperNet
users to reserve resources, and it glues the resources together to form
a virtual network that might span multiple VNIPs. The Network Hypervisor provides
a set of Hypervisor API calls that HyperNet Builders can use to build a wide range
of HyperNets.
To demonstrate our design, we implement a Network Hypervisor using GENI
as the underlying VNIP. The Network Hypervisor contains multiple components
that help (1) maintain an up-to-date resource information of the underlying VNIPs,
(2) locate network nodes, including end systems (i.e., HyperNet Participants), (3)
calculate paths between network nodes, and (4) support the Network Hypervisor
APIs. Our implementation is extensible in the sense that any third party developer
can contribute by implementing HyperNet Libraries based on the provided Network
Hypervisor APIs.
We also implemented HyperNet Libraries, including a HyperNet Routing Library, to
ease the task of creating virtual networks having common shapes and to automatically
manage the routing tables of the nodes in a virtual network. Experimental results
show that the Network Hypervisor is able to
quickly create a virtual network topology and can scale as the number of HyperNets
increases, particularly since the Network Hypervisor can be parallelized.
To showcase the power of the HyperNet architecture, we created four HyperNet
examples. Each example offers functionality that is not present in the current Internet
or that out-performs what the Internet can offer. The Multicast HyperNet enables
the user to create a multicast network
over the Internet and thus effectively use the network bandwidth via multicasting.
The MobileNet example enables a mobile user to achieve 50Mbps throughput with up
to 5% network loss rate. The Multiplayer Gaming HyperNet drastically shortens the
average RTT from game players to the centralized game server. The OpenFlow Load
Balancing HyperNet enables the user to create multiple paths between source and
destination and to intelligently load balance network traffic going through each path
based on the network condition of each path. The point is not that these special-purpose
networks are particularly novel or new, but rather that they can be easily created
and deployed using the HyperNet architecture.
With the help of the Network Hypervisor APIs and the HyperNet Libraries that we
implemented, it only takes around 200 lines of Java code to implement each of the
HyperNet examples. More importantly, to deploy any of those virtual networks, one
only needs to download the HyperNet Package and execute it.
8.1 Future Work
Our HyperNet system shows great promise as a way to deploy and manage special-purpose virtual networks, but there remain aspects of the architecture that need
additional study.
The following outlines next steps that would help to bring about a complete
HyperNet architecture:
Devise a business model for the HyperNet architecture
The emergence of VNIPs will disrupt the current business
model used by existing ISPs. A thorough mechanism/model for how VNIPs
charge Hypervisors, which in turn charge HyperNet users, needs to be designed.
It is also possible for network creators to offer their specialized virtual networks
for a fee to the HyperNet participants. New models that monitor networking
resource usage need to be designed so as to properly charge all types of users
in the HyperNet architecture.
Incorporate an update API call in the Network Hypervisor
Dynamic joining and leaving of participants may lead to the need to update the
virtual topology, or a Network Creator might want to renegotiate an allocation.
An updateTopo() API call should be provided by the Network Hypervisor. This
API call should be implemented such that an update operation on the virtual
requirements on a per-tunnel or overall basis, and if the VNIP does not support
such provisioning, there must be a standard fallback technique. Whether provided
by the underlying VNIP or by the Network Hypervisor, new models and
algorithms are needed to offer QoS support.
Robust Fault Handling
Any robust system must be designed to deal with failures. The same holds
true for the HyperNet architecture. In our architecture, failures may happen
in a VNIP (e.g., a physical node/virtual node fails, a physical link breaks, or a
virtual link disconnects), in the Network Hypervisor, or in a HyperNet participant
(e.g., a participant accidentally loses connection with its assigned attachment
point and wants to reconnect to the network). For some of these failures, the
HyperNet architecture might be able to recover automatically and seamlessly,
without being noticed by the users. For example, a VNIP may
implement mechanisms to migrate a failed virtual router to a nearby healthy
programmable router while maintaining all the running states of the previous
router at the moment of failure. Similarly, the hypervisor cloud might include
features/mechanisms to automatically redirect users' requests to healthy
nearby hypervisor instances in the case of a hypervisor instance failure.
Security
Security is not covered in this thesis, but mechanisms to deal with potential
security threats in the HyperNet architecture need to be designed to ensure a
healthy system welcomed by the users. Since anyone can download any number
of HyperNet Packages and deploy them into HyperNet networks, the Network
Hypervisor might implement mechanisms to prevent a Network Creator from
consuming all VNIP resources and starving other Network Creators. Moreover, since
the same HyperNet Package can be downloaded by potentially a large number
of Network Creators and be deployed into many HyperNet networks, a certain
trust relationship needs to be built between a HyperNet Builder and the
HyperNet Market (or certain security check mechanisms need to be built in the
HyperNet Market platform) to make sure that a maliciously built HyperNet
Package cannot be uploaded by a HyperNet Builder (and hence will not be
downloaded and deployed by a Network Creator).
Bibliography
[1] A. Anand, F. Dogar, D. Han, B. Li, H. Lim, M. Machado, W. Wu, A. Akella,
D. G. Andersen, J. W. Byers, S. Seshan, and P. Steenkiste, XIA: An
Architecture for an Evolvable and Trustworthy Internet, in Proceedings of the
10th ACM Workshop on Hot Topics in Networks, ser. HotNets-X. New York,
NY, USA: ACM, 2011, pp. 2:1–2:6. [Online]. Available: http://doi.acm.org/10.1145/2070562.2070564
[2] T. Wolf, J. Griffioen, K. L. Calvert, R. Dutta, G. N. Rouskas, I. Baldine, and
A. Nagurney, Choice as A Principle in Network Architecture, SIGCOMM
Comput. Commun. Rev., pp. 105–106, Aug. 2012.
[3] F. Bronzino, K. Nagaraja, I. Seskar, and D. Raychaudhuri, Network Service
Abstractions for a Mobility-centric Future Internet Architecture, in Proceedings
of the Eighth ACM International Workshop on Mobility in the Evolving Internet
Architecture, ser. MobiArch '13. New York, NY, USA: ACM, 2013, pp. 5–10.
[Online]. Available: http://doi.acm.org/10.1145/2505906.2505908
[4] J. Yang and Z. Fei, Broadcasting with Prediction and Selective Forwarding
in Vehicular Networks, International Journal of Distributed Sensor Networks,
2013.
[5] J. Yang and Z. Fei, HDAR: Hole Detection and Adaptive Geographic Routing
for Ad Hoc Networks, in ICCCN. IEEE, 2010, pp. 1–6. [Online]. Available:
http://dblp.uni-trier.de/db/conf/icccn/icccn2010.html#YangF10a
[6] M. Boucadair, J.-L. Grimault, P. Levis, A. Villefranque, and P. Morand,
Anticipate IPv4 Address Exhaustion: A Critical Challenge for Internet
Survival, in Proceedings of the 2009 First International Conference on Evolving
Internet, ser. INTERNET '09. Washington, DC, USA: IEEE Computer Society, 2009, pp. 27–32. [Online]. Available: http://dx.doi.org/10.1109/INTERNET.2009.11
[7] A. Elmokashfi, A. Kvalbein, and C. Dovrolis, BGP Churn Evolution: a
Perspective from the Core, IEEE/ACM Trans. Netw., vol. 20, no. 2, pp. 571–584, Apr. 2012. [Online]. Available: http://dx.doi.org/10.1109/TNET.2011.2168610
[8] T. G. Griffin and G. Wilfong, An Analysis of BGP Convergence Properties,
in Proceedings of the conference on Applications, technologies, architectures,
144
and protocols for computer communication, ser. SIGCOMM '99. New York, NY, USA: ACM, 1999, pp. 277–288. [Online]. Available: http://doi.acm.org/10.1145/316188.316231
[9] R. Mahajan, D. Wetherall, and T. Anderson, Understanding BGP
Misconfiguration, SIGCOMM Comput. Commun. Rev., vol. 32, no. 4, pp. 3–16,
Aug. 2002. [Online]. Available: http://doi.acm.org/10.1145/964725.633027
[10] P. Zhang, A. Durresi, and L. Barolli, A Survey of Internet Mobility, in
Proceedings of the 2009 International Conference on Network-Based Information
Systems, ser. NBIS '09. Washington, DC, USA: IEEE Computer Society, 2009, pp. 147–154. [Online]. Available: http://dx.doi.org/10.1109/NBiS.2009.94
[11] S. P. Leblanc, A. Partington, I. Chapman, and M. Bernier, An Overview of
Cyber Attack and Computer Network Operations Simulation, in Proceedings of
the 2011 Military Modeling & Simulation Symposium, ser. MMS '11. San Diego, CA, USA: Society for Computer Simulation International, 2011, pp. 92–100.
[Online]. Available: http://dl.acm.org/citation.cfm?id=2048558.2048572
[12] M. Nicholes and B. Mukherjee, A Survey of Security Techniques for The
Border Gateway Protocol (BGP), Commun. Surveys Tuts., vol. 11, no. 1, pp. 52–65, Jan. 2009. [Online]. Available: http://dx.doi.org/10.1109/SURV.2009.090105
[13] A. D. Keromytis, Voice-over-IP Security: Research and Practice, IEEE
Security and Privacy, vol. 8, no. 2, pp. 76–78, Mar. 2010. [Online]. Available:
http://dx.doi.org/10.1109/MSP.2010.87
[14] S. Furnell, Remote PC Security: Securing The Home Worker, Netw. Secur.,
vol. 2006, no. 11, pp. 6–12, Nov. 2006. [Online]. Available: http://dx.doi.org/
10.1016/S1353-4858(06)70451-2
[15] P. Szewczyk and C. Valli, Ignorant Experts: Computer and Network Security
Support from Internet Service Providers, in Proceedings of the 2010 Fourth
International Conference on Network and System Security, ser. NSS '10. Washington, DC, USA: IEEE Computer Society, 2010, pp. 323–327. [Online].
Available: http://dx.doi.org/10.1109/NSS.2010.42
[16] A. Feldmann, Internet Clean-Slate Design: What and Why? SIGCOMM
Comput. Commun. Rev., vol. 37, pp. 59–64, July 2007. [Online]. Available:
http://doi.acm.org/10.1145/1273445.1273453
[17] C. Dovrolis, What would Darwin Think about Clean-Slate Architectures?
SIGCOMM Comput. Commun. Rev., vol. 38, pp. 29–34, January 2008. [Online].
Available: http://doi.acm.org/10.1145/1341431.1341436
[18] Virtual Appliance. [Online]. Available: http://en.wikipedia.org/wiki/Virtual_appliance
[19] Planetlab. [Online]. Available: http://www.planet-lab.org/
[20] L. Peterson, V. Pai, N. Spring, and A. Bavier, Using PlanetLab for Network
Research: Myths, Realities, and Best Practices, PlanetLab Consortium, Tech. Rep. PDN-05-028, June 2005.
[21] W. D. Laverell, Z. Fei, and J. N. Griffioen, Isn't it Time You Had an Emulab?
in Proceedings of the 39th SIGCSE technical symposium on Computer science
education, ser. SIGCSE '08. New York, NY, USA: ACM, 2008, pp. 246–250.
[Online]. Available: http://doi.acm.org/10.1145/1352135.1352223
[22] Virtual LAN. [Online]. Available: http://en.wikipedia.org/wiki/Virtual_LAN
[23] GENI, Global Environment for Network Innovations - System Requirements
Document, 2009. [Online]. Available: http://groups.geni.net/geni/wiki/GpoDoc
[24] L. Peterson, S. Sevinc, J. Lepreau, R. Ricci, J. Wroclawski, T. Faber, S. Schwab,
and S. Baker, Slice-Based Facility Architecture, 2009. [Online]. Available:
http://www.cs.princeton.edu/ llp/arch abridged.pdf
[25] GENI. (2009) GENI Research Plan. [Online]. Available: http://groups.geni.net/geni/attachment/wiki/OldGPGDesignDocuments/GDD-06-28.pdf
[26] Juniper, Juniper M7I Router. [Online]. Available: http://www.juniper.net/
customers/support/products/m7i.jsp
[27] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner, OpenFlow: Enabling Innovation in Campus
Networks, SIGCOMM Comput. Commun. Rev., vol. 38, no. 2, pp. 69–74, 2008.
[28] D. Farinacci, T. Li, S. Hanks, D. Meyer, and P. Traina, Generic Routing
Encapsulation (GRE), RFC 2784 (Proposed Standard), Internet Engineering
Task Force, Mar. 2000, updated by RFC 2890. [Online]. Available: http://www.
ietf.org/rfc/rfc2784.txt
[29] E. Rosen, A. Viswanathan, and R. Callon, Multiprotocol Label Switching
Architecture, RFC 3031 (Proposed Standard), Internet Engineering Task
Force, Jan. 2001, updated by RFC 6178. [Online]. Available: http://www.ietf.
org/rfc/rfc3031.txt
[30] B. Fenner, M. Handley, H. Holbrook, and I. Kouvelas, Protocol Independent
Multicast - Sparse Mode (PIM-SM): Protocol Specification (Revised), RFC
4601 (Proposed Standard), Internet Engineering Task Force, Aug. 2006,
updated by RFCs 5059, 5796, 6226. [Online]. Available: http://www.ietf.org/
rfc/rfc4601.txt
[31] G. Peng, CDN: Content Distribution Network, Tech. Rep., 2003.
[32] H. Balakrishnan, V. N. Padmanabhan, S. Seshan, and R. H. Katz, A
Comparison of Mechanisms for Improving TCP Performance over Wireless
Links, IEEE/ACM Transactions on Networking, vol. 5, pp. 756–769, 1997.
[68] T. Wolf, Service-Centric End-to-End Abstractions in Next-Generation Networks, in Proc. of Fifteenth IEEE International Conference on Computer
Communications and Networks (ICCCN), Arlington, VA, Oct. 2006, pp. 79–86.
[69] S. Ganapathy and T. Wolf, Design of a Network Service Architecture, in Proc.
of Sixteenth IEEE International Conference on Computer Communications and
Networks (ICCCN), Honolulu, HI, Aug. 2007.
[70] N. C. Hutchinson and L. L. Peterson, The x-Kernel: An Architecture for
Implementing Network Protocols, IEEE Transactions on Software Engineering,
vol. 17, pp. 64–76, 1991.
[71] K. Calvert, Beyond Layering: Modularity Considerations for Protocol Architectures, International Conference on Network Protocols, pp. 90–97, 1993.
[72] R. Clayton and K. Calvert, Structuring Protocols with Data Streams, Second
Workshop on High-Performance Protocol Architectures, 1995.
[73] R. Dutta, G. N. Rouskas, I. Baldine, A. Bragg, and D. Stevenson, The SILO
Architecture for Services Integration, Control, and Optimization for the Future
Internet, in IEEE ICC, 2007, pp. 24–27.
[74] M. Vellala, A. Wang, G. Rouskas, R. Dutta, I. Baldine, and D. Stevenson,
A Composition Algorithm for the SILO Cross-Layer Optimization Service
Architecture, Proceedings of the Advanced Networks and Telecommunications
Systems Conference (ANTS 2007), 2007.
[75] I. Houidi, W. Louati, D. Zeghlache, and S. Baucke, Virtual Resource Description
and Clustering for Virtual Network Discovery, Communications Workshops,
2009. ICC Workshops 2009. IEEE International Conference on, pp. 1–6, Jun. 2009.
[76] GENI AM API. [Online]. Available: http://groups.geni.net/geni/wiki/GAPI_AM_API_V3
[77] VMware, Vmware Virtual Appliances, http://www.vmware.com/appliances/.
[78] MPEG Video Standards. [Online]. Available: http://mpeg.chiariglione.org/
[79] L. Krishnamurthy, AQUA: An Adaptive Quality of Service Architecture for Distributed Multimedia Applications, 1997. [Online]. Available: http://search.proquest.com/pqdtft/docview/304385403/14229056CF46A5EE93F/1?accountid=11836
[80] G. Schaffrath, C. Werle, P. Papadimitriou, A. Feldmann, R. Bless, A. Greenhalgh, A. Wundsam, M. Kind, O. Maennel, and L. Mathy, Network Virtualization Architecture: Proposal and Initial Prototype, in VISA '09: Proceedings of the 1st ACM workshop on Virtualized infrastructure systems and architectures. New York, NY, USA: ACM, 2009, pp. 63–72.
Vita
Shufeng Huang
Education
B.S. in Computer Science, Beijing Normal University, Beijing, China, 2006
Research Experience
Designed and implemented a toolkit called switchspider that traverses
a network of connected switches and fetches information about IP phones
using SNMP and Cisco MIBs, combined with information fetched from Cisco's
CUCM server (via XML-RPC), providing an integrated information base
for the University of Kentucky's VoIP phone users.
Research Assistant, 2007–2013, Department of Computer Science,
University of Kentucky
Worked on the VOEIS project; co-designed and implemented an iOS app
and a dataspoke system that fetches streaming data from buoys
which monitor environmental variable changes for Kentucky Lake
and Flathead Lake in Montana.
Worked on the PoMo project; co-designed and implemented the forwarding
plane as well as the E2L (EID to Locator) service for the PoMo
network.
Worked on the Treasury Project; designed and implemented a new
Remote Backup system with tracker in Linux Kernel using RB trees.
Co-designed and implemented a Reliable FEC transport protocol that
aims to minimize per-packet delay.
Intern at Raytheon BBN Technologies, summer 2013, Boston
Built the Network Hypervisor platform that facilitates the creation
and deployment of virtual networks on GENI (an extension of my
thesis work).
Created advanced OpenFlow tutorials on building a load balancer and
a firewall using Trema.
Created advanced Content-Centric Networking (CCN) tutorials on exploring features of CCN networks, using the CCNx toolkit.
Created advanced TCP tutorials on experimenting with the performance of different TCP congestion control algorithms.