
Cloud Computing


UNIT WISE NOTES

“PES’s MCOE, Information Technology”

UNIT – I

INTRODUCTION TO CLOUD COMPUTING

1.1 DEFINING CLOUD COMPUTING


Computing as a service, that is, accessing remote and distributed hardware and software resources
over a network, is not a new concept. The 1960s produced the notions of "computing utilities"
(Cafaro & Aloisio, 2011; Kleinrock, 2005) and of virtualisation (Graziano, 2011), and these ideas
developed gradually over the next forty years, for example through distributed IT infrastructures in
the 1980s and 1990s and Application Service Providers (ASPs) in the 1990s and 2000s. However,
all of these efforts were constrained by a lack of computing power and network bandwidth. Several
factors conspired at the turn of the millennium to facilitate Cloud Computing:
1. The rise of cheap computing power and network bandwidth.
2. The rise of large-scale computing architectures and enabling technologies around Grid
computing, making high-powered computing tasks affordable.
3. The adaptation of these architectures for large data centres of commodity hardware to service
the IT business needs of organisations such as Google, Amazon and Microsoft.
4. The commercialisation of their computing architectures in ways that could be sold as the first
Cloud Computing services.
Cloud computing takes the technology, services, and applications that are similar to those on the
Internet and turns them into a self-service utility. The use of the word “cloud” refers to two
essential concepts:
1. Abstraction: Cloud computing abstracts the details of system implementation from users and
developers. Applications run on physical systems that aren't specified, data is stored in
locations that are unknown, administration of systems is outsourced to others, and access by
users is ubiquitous.
2. Virtualization: Cloud computing virtualizes systems by pooling and sharing resources.
Systems and storage can be provisioned as needed from a centralized infrastructure, costs are
assessed on a metered basis, multi-tenancy is enabled, and resources are scalable with agility.
Cloud computing is an abstraction based on the notion of pooling physical resources and
presenting them as a virtual resource. It is a new model for provisioning resources, for staging
applications, and for platform-independent user access to services. Clouds can come in many
different types, and the services and applications that run on clouds may or may not be delivered
by a cloud service provider. These different types and levels of cloud services mean that it is
important to define what type of cloud computing system you are working with. To help clarify
how cloud computing has changed the nature of commercial system deployment, consider these
three examples:
1. Google: In the last decade, Google has built a worldwide network of datacenters to service its
search engine. In doing so Google has captured a substantial portion of the world's
advertising revenue. That revenue has enabled Google to offer free software to users based
on that infrastructure and has changed the market for user-facing software.
2. Azure Platform: By contrast, Microsoft is creating the Azure Platform. It enables .NET
Framework applications to run over the Internet as an alternate platform for Microsoft
developer software running on desktops.
3. Amazon Web Services: One of the most successful cloud-based businesses is Amazon Web
Services, which is an Infrastructure as a Service offering that lets you rent virtual computers
on Amazon's own infrastructure.
These new capabilities enable applications to be written and deployed with minimal expense
and to be rapidly scaled and made available worldwide as business conditions permit. This is
truly a revolutionary change in the way enterprise computing is created and deployed.
The United States government is a major consumer of computer services and, therefore, one of
the major users of cloud computing networks. The U.S. National Institute of Standards and
Technology (NIST) has a set of working definitions that separate cloud computing into service
models and deployment models. Those models and their relationship to essential characteristics
of cloud computing are shown in Figure 1.1.
FIGURE.1.1. The NIST cloud computing definitions

The NIST model originally did not require a cloud to use virtualization to pool resources, nor did
it absolutely require that a cloud support multi-tenancy in the earliest definitions of cloud
computing. Multi-tenancy is the sharing of resources among two or more clients. The latest
version of the NIST definition does require that cloud computing networks use virtualization and
support multi-tenancy.
Because cloud computing is moving toward a set of modular interacting components based on
standards such as the Service Oriented Architecture (described in Chapter 13), you might expect
that future versions of the NIST model may add those features as well. The NIST cloud model
doesn't address a number of intermediary services such as transaction or service brokers,
provisioning, integration, and interoperability services that form the basis for many cloud
computing discussions. Given the emerging roles of service buses, brokers, and cloud APIs at
various levels, undoubtedly these elements need to be added to capture the whole story.
1.2 ESSENTIAL CHARACTERISTICS OF CLOUD COMPUTING
1. On-demand self-service: A client can provision computer resources without the need for
interaction with cloud service provider personnel.
2. Broad network access: Access to resources in the cloud is available over the network using
standard methods in a manner that provides platform-independent access to clients of all
types. This includes a mixture of heterogeneous operating systems, and thick and thin
platforms such as laptops, mobile phones, and PDAs.
3. Resource pooling: A cloud service provider creates resources that are pooled together in a
system that supports multi-tenant usage. Physical and virtual systems are dynamically
allocated or reallocated as needed. Intrinsic in this concept of pooling is the idea of
abstraction that hides the location of resources such as virtual machines, processing, memory,
storage, and network bandwidth and connectivity.
4. Rapid elasticity: Resources can be rapidly and elastically provisioned. The system can add
resources by either scaling up systems (more powerful computers) or scaling out systems
(more computers of the same kind), and scaling may be automatic or manual. From the
standpoint of the client, cloud computing resources should look limitless and can be
purchased at any time and in any quantity.
5. Measured service: The use of cloud system resources is measured, audited, and reported to
the customer based on a metered system.
A client can be charged based on a known metric such as amount of storage used, number of
transactions, network I/O (Input/output) or bandwidth, amount of processing power used, and so
forth. A client is charged based on the level of services provided.
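To illustrate how such metered, usage-based charging might work, the following minimal Python
sketch computes a monthly bill from measured usage. The metric names and per-unit rates are
illustrative assumptions only, not any provider's actual pricing.

# Hypothetical per-unit rates; real providers publish their own price lists.
RATES = {
    "storage_gb_month": 0.02,   # per GB stored per month
    "network_gb": 0.09,         # per GB of network I/O
    "compute_hours": 0.05,      # per hour of processing power used
}

def monthly_charge(usage):
    # usage maps each metered resource to the amount consumed this month
    return sum(RATES[metric] * amount for metric, amount in usage.items())

print(monthly_charge({"storage_gb_month": 500, "network_gb": 120, "compute_hours": 300}))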

1.3 CLOUD DEPLOYMENT MODEL


A deployment model defines the purpose of the cloud and the nature of how the cloud is located.
The NIST definition for the four deployment models is as follows:
1. Public cloud: The public cloud infrastructure is available for public use or, alternatively, for use
by a large industry group, and is owned by an organization selling cloud services.
2. Private cloud: The private cloud infrastructure is operated for the exclusive use of an
organization. The cloud may be managed by that organization or a third party. Private clouds
may be either on- or off-premises.
3. Hybrid cloud: A hybrid cloud combines multiple clouds (private, community, or public)
where those clouds retain their unique identities, but are bound together as a unit. A hybrid
cloud may offer standardized or proprietary access to data and applications, as well as
application portability.
4. Community cloud: A community cloud is one where the cloud has been organized to serve a
common function or purpose.

FIGURE.1.2. Deployment locations for different cloud types


It may be for one organization or for several organizations, but they share common concerns
such as their mission, policies, security, regulatory compliance needs, and so on. A community
cloud may be managed by the constituent organization(s) or by a third party. Figure 1.2 shows the
different locations in which clouds can be deployed. In the sections that follow, these different
cloud deployment models are described in more detail.

1.4 CLOUD SERVICE MODELS


In the deployment model, different cloud types are an expression of the manner in which
infrastructure is deployed. You can think of the cloud as the boundary between where a client's
network, management, and responsibilities end and the cloud service provider's begin. As
cloud computing has developed, different vendors offer clouds that have different services
associated with them. The portfolio of services offered adds another set of definitions called the
service model.
Three service types have been universally accepted:
1. Infrastructure as a Service: IaaS provides virtual machines, virtual storage, virtual
infrastructure, and other hardware assets as resources that clients can provision. The IaaS
service provider manages all the infrastructure, while the client is responsible for all other
aspects of the deployment. This can include the operating system, applications, and user
interactions with the system.
2. Platform as a Service: PaaS provides virtual machines, operating systems, applications,
services, development frameworks, transactions, and control structures. The client can
deploy its applications on the cloud infrastructure or use applications that were
programmed using languages and tools that are supported by the PaaS service provider.
The service provider manages the cloud infrastructure, the operating systems, and the
enabling software. The client is responsible for installing and managing the application
that it is deploying.
3. Software as a Service: SaaS is a complete operating environment with applications,
management, and the user interface. In the SaaS model, the application is provided to the
client through a thin client interface (a browser, usually), and the customer's
responsibility begins and ends with entering and managing its data and user interaction.
Everything from the application down to the infrastructure is the vendor's responsibility.
The three different service models taken together have come to be known as the SPI model of
cloud computing. Many other service models have been mentioned: StaaS, Storage as a Service;
IdaaS, Identity as a Service; CmaaS, Compliance as a Service; and so forth. However, the SPI
services encompass all the other possibilities. It is useful to think of cloud computing service
models in terms of a hardware/software stack. One such representation called the Cloud
Reference Model is shown in Figure 1.5. At the bottom of the stack is the hardware or
infrastructure that comprises the network. As you move upward in the stack, each service model
inherits the capabilities of the service model beneath it. IaaS has the least levels of integrated
functionality and the lowest levels of integration, and SaaS has the most.
Examples of IaaS service providers include the following (a short provisioning sketch using
Amazon EC2 appears after the list):
• Amazon Elastic Compute Cloud (EC2)
• Eucalyptus
• GoGrid
• FlexiScale
• Linode
• Rackspace Cloud
• Terremark
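As a concrete illustration of IaaS self-service provisioning, the following minimal sketch uses the
AWS SDK for Python (boto3) to request a single virtual machine from Amazon EC2. It is only a
sketch: the AMI ID and key pair name are placeholders, and it assumes AWS credentials and a
region are already configured for the account being used.

import boto3

# Connect to the EC2 service in an assumed region.
ec2 = boto3.resource("ec2", region_name="us-east-1")

# Provision one small virtual server; ImageId and KeyName are placeholders.
instances = ec2.create_instances(
    ImageId="ami-xxxxxxxx",      # placeholder AMI identifier
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",       # placeholder key pair
)
print("Launched:", instances[0].id)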

1.4.1 Software as a Service


Software as a Service (SaaS) is defined as “software that is deployed over the Internet”. With
SaaS, a provider licenses an application to customers either as a service on demand, through a
subscription, in a “pay-as-you-go” model, or (increasingly) at no charge when there is an
opportunity to generate revenue from streams other than the user, such as advertising or
user list sales. SaaS is a rapidly growing market, as indicated in recent reports that predict ongoing
double-digit growth. This rapid growth indicates that SaaS will soon become commonplace
within every organization, and hence it is important that buyers and users of technology
understand what SaaS is and where it is suitable.

1.4.1.1 Characteristics of SaaS


Like other forms of Cloud Computing, it is important to ensure that solutions sold as SaaS in fact
comply with generally accepted definitions of Cloud Computing. Some defining characteristics
of SaaS include:
1. Web access to commercial software
2. Software is managed from a central location
3. Software delivered in a “one to many” model
4. Users not required to handle software upgrades and patches
5. Application Programming Interfaces (APIs) allow for integration between different
pieces of software
1.4.1.2 Where SaaS Makes Sense
Cloud Computing generally, and SaaS in particular, is a rapidly growing method of delivering
technology. That said, organizations considering a move to the cloud will want to consider which
applications they move to SaaS. As such, there are particular solutions we consider prime
candidates for an initial move to SaaS:
1. “Vanilla” offerings where the solution is largely undifferentiated. A good example of a
vanilla offering is email, where competitors often use the same
software precisely because this fundamental technology is a requirement for doing
business but does not itself confer a competitive advantage
2. Applications where there is significant interplay between the organization and the outside
world. For example, email newsletter campaign software
3. Applications that have a significant need for web or mobile access. An example would be
mobile sales management software
4. Software that is only to be used for a short term need. An example would be
collaboration software for a specific project
5. Software where demand spikes significantly, for example tax or billing software used
once a month
SaaS is widely accepted to have been introduced to the business world by the Salesforce [10]
Customer Relationship Management (CRM) product. As one of the earliest entrants, it is not
surprising that CRM is the most popular SaaS application area [11]; however, e-mail, financial
management, customer service and expense management have also seen good uptake via SaaS.

1.4.1.3 Where SaaS May Not be the Best Option


While SaaS is a very valuable tool, there are certain situations where we believe it is not the best
option for software delivery. Examples where SaaS may not be appropriate include:
1. Applications where extremely fast processing of real time data is required
2. Applications where legislation or other regulation does not permit data being hosted
externally
3. Applications where an existing on-premise solution fulfils all of the organization’s needs
Software as a Service may be the best known aspect of Cloud Computing, but developers and
organizations all around the world are leveraging Platform as a Service, which mixes the
simplicity of SaaS with the power of IaaS, to great effect.

1.4.2 Platform as a Service


Platform as a Service (PaaS) brings the benefits that SaaS brought for applications over to
the software development world. PaaS can be defined as a computing platform that allows the
creation of web applications quickly and easily and without the complexity of buying and
maintaining the software and infrastructure underneath it. PaaS is analogous to SaaS except that,
rather than being software delivered over the web, it is a platform for the creation of software,
delivered over the web.

1.4.2.1 Characteristics of PaaS


There are a number of different takes on what constitutes PaaS, but some basic characteristics
include:
1. Services to develop, test, deploy, host and maintain applications in the same integrated
development environment. All the varying services needed to fulfil the application
development process
2. Web based user interface creation tools help to create, modify, test and deploy different UI
scenarios
3. Multi-tenant architecture where multiple concurrent users utilize the same development
application
4. Built in scalability of deployed software including load balancing and failover
5. Integration with web services and databases via common standards
6. Support for development team collaboration – some PaaS solutions include project planning
and communication tools
7. Tools to handle billing and subscription management
PaaS, which is similar in many ways to Infrastructure as a Service that will be discussed below,
is differentiated from IaaS by the addition of value-added services and comes in two distinct
flavours:
1. A collaborative platform for software development, focused on workflow management
regardless of the data source being used for the application. An example of this approach
would be Heroku, a PaaS built around the Ruby on Rails development framework.
2. A platform that allows for the creation of software utilizing proprietary data from an
application. This sort of PaaS can be seen as a method to create applications with a
common data form or type. An example of this sort of platform would be the Force.com
PaaS from Salesforce.com which is used almost exclusively to develop applications that
work with the Salesforce.com CRM

1.4.2.2 Where PaaS Makes Sense


PaaS is especially useful in any situation where multiple developers will be working on a
development project or where other external parties need to interact with the development
process. As the case study below illustrates, it is proving invaluable for those who have an
existing data source (for example, sales information from a customer relationship management
tool) and want to create applications which leverage that data. Finally, PaaS is useful where
developers wish to automate testing and deployment services.
The popularity of agile software development, a group of software development methodologies
based on iterative and incremental development, will also increase the uptake of PaaS as it eases
the difficulties around rapid development and iteration of software. Some examples of PaaS
include Google App Engine, Microsoft Azure Services, and the Force.com platform.
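To give a feel for what deploying to a PaaS involves, here is a minimal Python web application
(using the Flask framework) of the kind that platforms such as Google App Engine can run. This
is a generic sketch, not tied to any one provider's exact configuration; the point is that the
platform, not the developer, supplies and manages the operating system and runtime underneath it.

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    # The PaaS provider manages the OS, runtime and scaling; the developer
    # supplies only this application code plus a small deployment descriptor.
    return "Hello from a PaaS-hosted application"

if __name__ == "__main__":
    app.run()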

1.4.2.3 Where PaaS May Not be the Best Option


We contend that PaaS will become the predominant approach towards software development.
The ability to automate processes, use pre-defined components and building blocks and deploy
automatically to production will provide sufficient value to be highly persuasive. That said, there
are certain situations where PaaS may not be ideal; examples include:
1. Where the application needs to be highly portable in terms of where it is hosted
2. Where proprietary languages or approaches would impact on the development process
3. Where a proprietary language would hinder later moves to another provider – concerns
are raised about vendor lock-in
4. Where application performance requires customization of the underlying hardware and
software

1.4.3 Infrastructure as a Service


Infrastructure as a Service (IaaS) is a way of delivering Cloud Computing infrastructure –
servers, storage, network and operating systems – as an on-demand service. Rather than
purchasing servers, software, datacenter space or network equipment, clients instead buy those
resources as a fully outsourced service on demand.
As we detailed in a previous whitepaper, within IaaS, there are some sub-categories that are
worth noting. Generally IaaS can be obtained as public or private infrastructure or a combination
of the two. “Public cloud” is considered infrastructure that consists of shared resources, deployed
on a self-service basis over the Internet.
By contrast, “private cloud” is infrastructure that emulates some Cloud Computing features,
like virtualization, but does so on a private network. Additionally, some hosting providers are
beginning to offer a combination of traditional dedicated hosting alongside public and/or private
cloud networks. This combination approach is generally called “Hybrid Cloud”.

1.4.3.1 Characteristics of IaaS


As with the two previous sections, SaaS and PaaS, IaaS is a rapidly developing field. That said,
there are some core characteristics which describe what IaaS is. IaaS is generally accepted to
comply with the following:
1. Resources are distributed as a service
2. Allows for dynamic scaling
3. Has a variable cost, utility pricing model
4. Generally includes multiple users on a single piece of hardware
There are a plethora of IaaS providers out there, from the largest Cloud players like Amazon Web
Services and Rackspace to more boutique regional players. As mentioned previously, the line
between PaaS and IaaS is becoming more blurred as vendors introduce tools as part of IaaS that
help with deployment, including the ability to deploy multiple types of clouds.
1.4.3.2 Where IaaS Makes Sense
IaaS makes sense in a number of situations, and these are closely related to the benefits that
Cloud Computing brings. Situations that are particularly suitable for Cloud infrastructure include:
1. Where demand is very volatile – any time there are significant spikes and troughs in
terms of demand on the infrastructure
2. For new organizations without the capital to invest in hardware
3. Where the organization is growing rapidly and scaling hardware would be problematic
4. Where there is pressure on the organization to limit capital expenditure and to move to
operating expenditure
5. For specific line of business, trial or temporary infrastructural needs

1.4.3.3 Where IaaS May Not be the Best Option


While IaaS provides massive advantages for situations where scalability and quick provisioning
are beneficial, there are situations where its limitations may be problematic. Examples of
situations where we would advise caution with regard to IaaS include:
1. Where regulatory compliance makes the off shoring or outsourcing of data storage and
processing difficult
2. Where the highest levels of performance are required, and on-premise or dedicated
hosted infrastructure has the capacity to meet the organization’s needs.

1.5 MULTITENANCY
Multi-tenancy is an architecture in which a single instance of a software application serves
multiple customers. Each customer is called a tenant. Tenants may be given the ability to
customize some parts of the application, such as color of the user interface (UI) or business rules,
but they cannot customize the application's code.
Multi-tenancy can be economical because software development and maintenance costs are
shared. It can be contrasted with single-tenancy, an architecture in which each customer has their
own software instance and may be given access to code. With a multi-tenancy architecture, the
provider only has to make updates once. With a single-tenancy architecture, the provider has to
touch multiple instances of the software in order to make updates.
In cloud computing, the meaning of multi-tenancy architecture has broadened because of new
service models that take advantage of virtualization and remote access. A software-as-a-service
(SaaS) provider, for example, can run one instance of its application on one instance of a
database and provide web access to multiple customers. In such a scenario, each tenant's data is
isolated and remains invisible to other tenants.
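A common way to achieve this isolation in a shared database schema is to key every row by a
tenant identifier and scope every query to the current tenant. The sketch below is illustrative
only (the table and column names are assumptions) and uses SQLite purely to show the idea:

import sqlite3

conn = sqlite3.connect(":memory:")
# One shared table serves all tenants; tenant_id keeps their rows apart.
conn.execute("CREATE TABLE invoices (tenant_id TEXT, amount REAL)")
conn.executemany("INSERT INTO invoices VALUES (?, ?)",
                 [("tenant_a", 100.0), ("tenant_b", 250.0)])

def invoices_for(tenant_id):
    # Every query is scoped to a single tenant, so tenants never see each
    # other's data even though the schema and instance are shared.
    return conn.execute(
        "SELECT amount FROM invoices WHERE tenant_id = ?", (tenant_id,)
    ).fetchall()

print(invoices_for("tenant_a"))   # only tenant_a's rows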
Whether an IT organization is going with public or private clouds, it's important to understand
the nuances of multi-tenant architecture. For public clouds, IT managers need to understand the
degree of multi-tenancy supported by whichever vendor they are looking at. For private clouds,
the entire responsibility of designing a multi-tenant architecture rests with the IT managers.

FIGURE.1.3. Multi-tenancy On-Premises DC and Off-Premises DC


FIGURE.1.4. Single vs. Multi-Tenancy
Enterprise cloud adoption has gone beyond the levels of intellectual pursuits and casual
experimentation. An analysis by IDC shows that $17 billion of the $359 billion of worldwide IT
spending for 2009 could be attributed to cloud computing. Two-thirds of Baseline magazine's
survey participants plan to adopt clouds. None of that is to say that there aren't nagging issues, including
but not limited to how different enterprise workloads match up against different types of
clouds and responsible ways to plan and implement the necessary migrations.
Based on the characteristics of the workload, cloud adoption will swing between public and
private clouds. Large enterprises have requirements that will force them to strike a
balance between the two clouds for their workloads. This is different for small-to-medium
businesses (SMBs) and start-ups, which might have a strong business case for wanting to use
public clouds for almost all of their workloads. But in the end, their respective preferences will
not be as much about the size of the organization as they are about the nature of their IT
workloads.
Besides appropriate workload distribution, architectural considerations are also key. Multi-
tenancy is one such architectural consideration, and understanding multi-tenancy is a critical first
step towards broader IT cloud adoption. Due to the early traction seen in public clouds -- where
multiple enterprises end up being co-tenants -- "multi-tenancy" is wrongly used as a synonym for
"multi-enterprise." But they are very different concepts. Also, the granularity of tenancy is
established at the application level, not at the level of an individual user or entire enterprise.

FIGURE.1.5. Different IT workloads can be distributed differently across public and private
clouds
1.5.1 Multi-tenancy defined
A tenant is any application -- either inside or outside the enterprise -- that needs its own secure
and exclusive virtual computing environment. This environment can encompass all or some
select layers of enterprise architecture, from storage to user interface. All interactive applications
(or tenants) have to be multi-user in nature.
A departmental application that processes sensitive financial data within the private cloud of an
enterprise is as much a "tenant" as a global marketing application that publishes product
catalogs on a public cloud. They both have the same tenancy requirements, regardless of the fact
that one has internal co-tenants and the other has external.
Multi-tenancy is the key common attribute of both public and private clouds, and it applies to all
three layers of a cloud: Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) and
Software-as-a-Service (SaaS).
Most people point to the IaaS layer alone when they talk about clouds. Even so, architecturally,
both public and private IaaSes go beyond tactical features such as virtualization, and head
towards implementing the concept of IT-as-a-Service (ITaaS) through billing -- or chargeback in
the case of private clouds -- based on metered usage. An IaaS also features improved
accountability using service-level-agreements (SLAs), identity management for secured access,
fault tolerance, disaster recovery, dynamic procurement and other key properties.
By incorporating these shared services at the infrastructure layer, all clouds automatically
become multi-tenant, to a degree. But multi-tenancy in clouds has to go beyond the IaaS layer, to
include the PaaS layer (application servers, Java Virtual Machines, etc.) and ultimately to the
SaaS or application layer (database, business logic, work flow and user interface). Only then can
tenants enjoy the full spectrum of common services from a cloud -- starting at the hardware
layer and going all the way up to the user-interface layer, depending on the degree of multi-
tenancy offered by the cloud.

1.5.2 Degrees of multi-tenancy


The exact degree of multi-tenancy, as it's commonly defined, is based on how much of the core
application, or SaaS, layer is designed to be shared across tenants. The highest degree of multi-
tenancy allows the database schema to be shared and supports customization of the business
logic, workflow and user-interface layers. In other words, all the sub-layers of SaaS offer multi-
tenancy in this degree.
FIGURE.1.6. Degrees of multi-tenancy.
In the lowest degree, multi-tenancy is limited to the IaaS and PaaS layers, with dedicated SaaS
layers for each tenant. In the middle degree of multi-tenancy are clusters of homogeneous
tenants that share database schemas and other application layers. In the middle
degree, each cluster of users has its own version of the database schema and of the application itself.
We can sum up the discussion on the degree of multi-tenancy as follows:
1. Highest degree: IaaS and PaaS are multi-tenant. SaaS is fully multi-tenant also.
2. Middle degree: IaaS and PaaS are multi-tenant. Small SaaS clusters are multi-tenant.

3. Lowest degree: IaaS and PaaS are multi-tenant. SaaS is single tenant.
For example, Salesforce.com, at the relatively high end of the multi-tenancy spectrum, has
72,500 customers who are supported by 8 to 12 multi-tenant instances (meaning IaaS/PaaS
instances) in a 1:5000 ratio. In other words, each multi-tenant instance supports 5,000 tenants
who share the same database schema. Intacct, a financial systems SaaS provider in the middle of
the spectrum, has more than 2,500 customers who share 10 instances in a 1:250 ratio.
Private clouds, and offerings such as SAP's Business ByDesign (due this summer), would be at
the lowest end of the spectrum of multi-tenancy, with application layers that are dedicated and
are more suited for specific large enterprise customers.
1.6 THE CLOUD CUBE MODEL
The Jericho Forum has identified 4 criteria to differentiate cloud formations from each other and
the manner of their provision. The Cloud Cube Model summarises these 4 dimensions.

FIGURE.1.7. Cloud Cube Model

1.6.1 Cloud Cube Model Dimensions


1.6.1.1 Dimension: Internal (I) / External (E)
This is the dimension that defines the physical location of the data: does the cloud form
you want to use exist inside or outside your organization's boundaries?
• If it is within your own physical boundary then it is Internal.
• If it is not within your own physical boundary then it is External.
For example, virtualised hard disks in an organisation's data centre would be internal, while
Amazon S3 storage would be external, at some location “off-site”.
FIGURE.1.8. Internal (I) / External (E)

1.6.1.2 Dimension: Proprietary (P) / Open (O)


This is the dimension that defines the state of ownership of the cloud technology, services,
interfaces, etc. It indicates the degree of interoperability, as well as enabling “data/application
transportability” between your own systems and other cloud forms, and the ability to withdraw
your data from a cloud form or to move it to another without constraint. It also indicates any
constraints on being able to share applications.

FIGURE.1.9. Dimension: Proprietary (P) / Open (O)


• Proprietary means that the organisation providing the service is keeping the means of
provision under their ownership. As a result, when operating in clouds that are proprietary,
you may not be able to move to another cloud supplier without significant effort or
investment. Often the more innovative technology advances occur in the proprietary domain.
As such, the proprietor may choose to enforce restrictions through patents and by keeping the
technology involved a trade secret.
• Clouds that are Open are using technology that is not proprietary, meaning that there are
likely to be more suppliers, and you are not as constrained in being able to share your data
and collaborate with selected parties using the same open technology. Open services tend to
be those that are widespread and consumerised, and most likely a published open standard,
for example email (SMTP). An as yet unproven premise is that the clouds that most
effectively enhance collaboration between multiple organisations will be Open.

1.6.1.3 Dimension: Perimeterised (Per) / De-perimeterised (D-p) Architectures


The third dimension represents the “architectural mindset” - are you operating inside your
traditional IT perimeter or outside it? De-perimeterisation has always related to the gradual
failure / removal / shrinking / collapse of the traditional silo-based IT perimeter.
1. Perimeterised implies continuing to operate within the traditional IT perimeter, often
signalled by “network firewalls”. As has been discussed in previous published Jericho Forum
papers, this approach inhibits collaboration. In effect, when operating in the perimeterised
areas, you may simply extend your own organisation’s perimeter into the external cloud
computing domain using a VPN and operating the virtual server in your own IP domain,
making use of your own directory services to control access. Then, when the computing task
is completed you can withdraw your perimeter back to its original traditional position. We
consider this type of system perimeter to be a traditional, though virtual, perimeter.
2. De-perimeterised assumes that the system perimeter is architected following the principles
outlined in the Jericho Forum’s Commandments and Collaboration Oriented Architectures
Framework. The terms Micro-Perimeterisation and Macro-Perimeterisation will likely be in
active use here - for example in a de-perimeterised frame the data would be encapsulated
with meta-data and mechanisms that would protect the data from inappropriate usage. COA-
enabled systems allow secure collaboration.
In a de-perimeterised environment an organisation can collaborate securely with selected parties
(business partner, customer, supplier, outworker) globally over any COA-capable network. The
de-perimeterised areas in our Cloud Cube Model use both internal and external domains, but the
collaboration or sharing of data should not be seen as internal or external – rather it is controlled
by and limited to the parties that the using organisations select. For example, in the future frame,
one organisation will not feel uncomfortable about allowing data into the internal COA-
compliant domain of a collaborating organisation; rather, they will be confident that the data will
be appropriately protected.

FIGURE.1.10. Dimension: Perimeterised (Per) / De-perimeterised (D-p)


This means:
• You can operate in any of the four cloud formations so far described (I/P, I/O, E/P, E/O) with
either of two architectural mindsets - Perimeterised or De-perimeterised.
• The top-right E/O/D-p cloud formation is likely to be the “sweet spot” where optimum
flexibility and collaboration can be achieved.
• A Proprietary cloud provider will likely want to keep you in the left side of the cube,
achieved by either continuous innovation that adds value, or by limiting the means of
migrating from the proprietary domain. The ability to move from that top-left cloud form to
the “sweet-spot” top-right cloud form will require an interface that the supplier is unlikely to
provide, because facilitating this move is rarely going to be in the cloud supplier's best business interests.

While the underlying intent remains the same, an added distinction in describing
De-perimeterised cloud usage arises in that the detailed description changes based on the level of
abstraction at which you choose to operate. At the heart of all cloud forms is the concept of
abstraction. Cloud models separate one layer of business from another, e.g. process from
software, platform from infrastructure, etc. We show an example model here with four levels of
abstraction; we can expect other models identifying different layers and abstraction levels to
emerge to suit different business needs. Most cloud computing activities today are occurring at
the lower layers of the stack, so today we have more maturity at the lower level.

1.6.1.4 Dimension: Insourced / Outsourced


We define a 4th dimension that has 2 states in each of the 8 cloud forms: Per(IP,IO,EP,EO) and
D-p(IP,IO,EP,EO), which responds to the question “Who do you want running your Clouds?”
• Outsourced: the service is provided by a 3rd party
• Insourced: the service is provided by your own staff under your control
These 2 states describe who is managing delivery of the cloud service(s) that you use. This is
primarily a policy issue (i.e. a business decision, not a technical or architectural decision) which
must be embodied in a contract with the cloud provider. In the Cloud Cube Model diagram we
show this 4th dimension using 2 colours; any of the 8 cloud forms can take either colour.
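The four dimensions can be summarised compactly in code. The following is a small illustrative
Python sketch (the field names are mine, not the Jericho Forum's) that records where a given
cloud form sits on each of the four dimensions:

from dataclasses import dataclass

@dataclass
class CloudForm:
    external: bool          # False = Internal (I), True = External (E)
    open_technology: bool   # False = Proprietary (P), True = Open (O)
    de_perimeterised: bool  # False = Perimeterised (Per), True = De-perimeterised (D-p)
    outsourced: bool        # False = Insourced, True = Outsourced (the 4th dimension)

# The E/O/D-p "sweet spot" described above; the 4th dimension can take either state.
sweet_spot = CloudForm(external=True, open_technology=True,
                       de_perimeterised=True, outsourced=True)
print(sweet_spot)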

1.7 CLOUD ECONOMICS AND BENEFITS


1.7.1 Economic Context
Like energy, computing has become an essential component of any economy. Historically, the
size of an economy was directly related to the energy it consumed. Likewise, a person’s
professional growth, the growth of an organization, or the growth of a country as a whole can
directly be related to the computing power they use. Rising energy costs, combined with a
growing global awareness of the potential impact of climate change due to carbon emissions, put
a renewed focus on energy usage and its associated carbon footprint. The challenge today is to
increase computing power while lowering energy consumption. As this chapter is being
written, every enterprise in the world is facing a global economic recession that has profoundly
affected all developed countries as well as those developing countries that produce products sold
in those markets. Uncertain times also bring opportunities – but taking advantage of strategic
opportunities typically must now be done quickly without additional capital funds or additional
corporate resources.
For information technology (IT) managers, energy cost management is not a small issue. In
addition, the maintenance of legacy enterprise data centers absorbs the majority of IT budgets, and
IT managers are looking for ways to create increased capacity and flexibility within their current
computing facility and hardware footprint, thereby lowering costs and increasing their return on
assets (ROA). Because capacity planning for traditional enterprise data centers must
accommodate the company's peak load periods, there is typically very low server utilization
during non-peak periods, which, depending on the industry, may be most of the year. The last few
years have seen a trend in data center management towards server virtualization, which allows
faster deployment of specialized server configurations and towards higher server density without
increasing the size of the data center, its staff overhead, or its energy consumption.
However, these alternatives still require significant investments and long-term technology
commitments, and there has been increasing attention paid to alternatives that provide pay-as-
you-go options, unlimited scalability, quick deployment, and minimal maintenance
requirements. Cloud computing is a computing paradigm that promises to meet all these
requirements.

1.7.2 The laws of cloudonomics


1. Utility services cost less even though they cost more.
Utilities charge a premium for their services, but customers save money by not paying for
services that they aren't using.
2. On-demand trumps forecasting.
The ability to provision and tear down resources (de-provision) captures revenue and lowers
costs.
3. The peak of the sum is never greater than the sum of the peaks.
A cloud can deploy less capacity because the peaks of individual tenants in a shared system are
averaged over time by the group of tenants.
4. Aggregate demand is smoother than individual.
Multi-tenancy also tends to average out the variability intrinsic in individual demand, because the
coefficient of variation of a sum of random variables is always less than or equal to that of any of the individual
variables. With a more predictable demand and less variation, clouds can run at higher utilization
rates than captive systems. This allows cloud systems to operate at higher efficiencies and lower
costs.
5. Average unit costs are reduced by distributing fixed costs over more units of output.
Cloud vendors have a size that allows them to purchase resources at significantly reduced prices.
6. Superiority in numbers is the most important factor in the result of a combat
(Clausewitz).
Weinman argues that a large cloud's size has the ability to repel botnets and DDoS attacks better
than smaller systems do.
7. Space-time is a continuum (Einstein/Minkowski).
The ability of a task to be accomplished in the cloud using parallel processing allows real-time
business to respond more quickly to business conditions and accelerates decision making, providing a
measurable advantage.
8. Dispersion is the inverse square of latency.
Latency, or the delay in getting a response to a request, requires both large-scale and multi-site
deployments that are a characteristic of cloud providers. Cutting latency in half requires four
times the number of nodes in a system.
9. Don't put all your eggs in one basket.
The reliability of a system with n redundant components, each having reliability r, is 1 - (1 - r)^n
(a short numeric check of this formula follows the list of laws).
Therefore, when a datacenter achieves a reliability of 99 percent, two redundant datacenters have
a reliability of 99.99 percent (four nines) and three redundant datacenters can achieve a
reliability of 99.9999 percent (six nines). Large cloud providers with geographically dispersed
sites worldwide therefore achieve reliability rates that are hard for private systems to achieve.
10. An object at rest tends to stay at rest (Newton).
Private datacenters tend to be located in places where the company or unit was founded or
acquired. Cloud providers can site their datacenters in what are called “greenfield sites.” A
greenfield site is one that is environmentally friendly: locations that are on a network backbone,
have cheap access to power and cooling, where land is inexpensive, and the environmental
impact is low. A network backbone is a very high-capacity network connection. On the Internet,
an Internet backbone consists of the high-capacity routes and routers that are typically operated
by an individual service provider such as a government or commercial entity.
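As a quick numeric check of the reliability formula in Law 9 above, 1 - (1 - r)^n, the minimal
Python sketch below simply reproduces the 99 percent, 99.99 percent and 99.9999 percent
figures quoted there for one, two and three redundant datacenters with r = 0.99:

# Combined reliability of n redundant components, each with reliability r.
def combined_reliability(r, n):
    return 1 - (1 - r) ** n

for n in (1, 2, 3):
    print(n, combined_reliability(0.99, n))   # 0.99, 0.9999, 0.999999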
1.7.3 Economic Benefits
Occasionally used to refer to the economics of cloud computing, the term “Cloudonomics” was
coined by Joe Weinman in a seminal article entitled The 10 Laws of Cloudonomics. While far
from being a comprehensive or exhaustive list of economic factors, his “10 Laws” serve as a
useful starting point in our discussion. He examined the strategic advantages provided by public
utility cloud services over private clouds and traditional data centers. He posits that public utility
clouds are fundamentally different than traditional data center environments and private clouds.
For individual enterprises, cloud services provide benefits that broadly fall into the categories of
lowering overall costs for equivalent services (you pay only for what you use), increased
strategic flexibility to meet market opportunities without having to forecast and maintain on-site
capacity, and access to the advantages of the cloud provider's massive capacity: instant scalability,
parallel processing capability which reduces task processing time and response latency, system
redundancy which improves reliability, and better capability to repel botnet attacks. Further,
public cloud vendors can achieve unparalleled efficiencies compared to data centers and private
clouds because they are able to scale their capacity to address the aggregated demand of many
enterprises, each having different peak demand periods. This allows for much higher server
utilization rates, lower unit costs, and easier capacity planning netting a much higher return on
assets than is possible for individual enterprises. Finally, because the locations of the public cloud
vendor's facilities are not tied to the parochial interests of individual clients, they are able to
locate, scale, and manage their operations to take optimum advantage of reduced energy costs,
skilled labor pools, bandwidth, or inexpensive real estate.
These are not the only benefits that have been identified. Matzke suggests that the levels of
required skills or specialized expertise along with the required economies of scale drive the
optimum choice for resourcing IT initiatives. For him, the availability of scalable skills
combined with other economies of scale are among the compelling benefits of cloud computing.
This is especially true for enterprises that are located in labor markets that have very few or only
very expensive IT staff resources available with the requisite skills.

1.7.4 Economic Costs


The costs associated with cloud computing facing early adopters include the potential costs of
service disruptions; data security concerns; potential regulatory compliance issues arising out of
sensitive data being transferred, processed or stored beyond defined borders; limitations in the
variety and capabilities of the development and deployment platforms currently available;
difficulties in moving proprietary data and software from one cloud services provider to another;
integration of cloud services with legacy systems; cost and availability of programming skills
needed to modify legacy application to function in the cloud environment; legacy software CPU-
based licensing costs increasing when moved to a cloud platform, etc.

1.7.5 Company Size and the Economic Costs and Benefits of Cloud Computing

The economic costs or benefits of implementing cloud services vary depending upon the size of
the enterprise and its existing IT resources/overheads including legacy data center infrastructure,
computer hardware, legacy software, maturity of internal processes, IT staffing and technical
skill base. These determine the strategic costs and benefits that accrue to individuals and
corporations depending upon their relative size.
In the past, large corporations have had an advantage over small corporations in their access to
capital and their ability to leverage their existing human, software, and hardware resources to
support new marketing and strategic initiatives. However, since the advent of cloud computing,
the barriers to entry for a particular market or market segment for a startup company have been
dramatically reduced and cloud computing may have tipped the balance of strategic advantage
away from the large established corporations towards much more nimble small or startup
companies. A small, dedicated, and talented team of individuals can now pool their individual
talents to address a perceived market need without an immediate need for venture capital funds
to provide the necessary IT infrastructure. There are a number of cloud providers who provide
software development environments that include the requisite software development tools, code
repositories, test environments, and access to a highly scalable production environment on a pay-
as-you-go basis.
Also contributing to this trend is the open-source movement. While licensing issues, support, and
feature considerations may dissuade larger enterprises from using open source software in the
development and deployment of their proprietary products, the availability of open source
software in nearly every software category has been a boon to SMEs, the self-employed, and
start-ups.
As these small companies grow into midsize and large companies they face changing cost
equations that modify the relative costs and benefits of cloud computing. For instance, at certain
data traffic volumes the marginal costs of operating on a cloud provider’s infrastructure may
become more expensive than providing the necessary IT infrastructure in-house. At that point,
there may be advantages of a mixed-use strategy in which some of the applications and services
are brought in-house and others continue to be hosted in the cloud. The following tables will
identify the differences that SMEs and large enterprises face in both the benefits and costs of
cloud services.
UNIT – II

VIRTUALIZATION, SERVER, STORAGE AND NETWORKING

2.1 VIRTUALIZATION CONCEPTS


Virtualization is a technique that allows a single physical instance of an application or
resource to be shared among multiple organizations or tenants (customers). It does so by assigning a logical
name to a physical resource and providing a pointer to that physical resource on demand.
Creating a virtual machine on top of an existing operating system and hardware is referred to as Hardware
Virtualization. Virtual Machines provide an environment that is logically separated from the
underlying hardware. The machine on which the virtual machine is created is known as the host
machine, and the virtual machine is referred to as the guest machine. The virtual machine is managed by
software or firmware known as the hypervisor.

2.1.1 Hypervisor
The hypervisor is a firmware or low-level program that acts as a Virtual Machine Manager.
There are two types of hypervisor:
1. A Type 1 hypervisor executes on the bare system. LynxSecure, RTS Hypervisor, Oracle VM, Sun
xVM Server and VirtualLogic VLX are examples of Type 1 hypervisors. The following diagram
shows the Type 1 hypervisor. A Type 1 hypervisor does not have any host operating system
because it is installed directly on the bare system.
FIGURE.2.1. Type 1 hypervisor
2. A Type 2 hypervisor is a software interface that emulates the devices with which a system
normally interacts. Containers, KVM, Microsoft Hyper-V, VMware Fusion, Virtual Server
2005 R2, Windows Virtual PC and VMware Workstation 6.0 are examples of Type 2
hypervisors. The following diagram shows the Type 2 hypervisor; a short sketch of querying a
hypervisor programmatically follows it.

FIGURE.2.2. Type 2 hypervisor
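Hypervisors of both types typically expose management APIs through which guest machines can
be queried and controlled. The following is a small illustrative sketch, assuming a local
QEMU/KVM host and the libvirt Python bindings are installed (other hypervisors use different
connection URIs); it simply lists the virtual machines the hypervisor currently manages.

import libvirt  # Python bindings for the libvirt virtualization API

# Open a read-only connection to the local QEMU/KVM hypervisor (an assumption;
# drivers such as Xen or VirtualBox use different connection URIs).
conn = libvirt.openReadOnly("qemu:///system")

for dom in conn.listAllDomains():
    state = "running" if dom.isActive() else "stopped"
    print(dom.name(), state)

conn.close()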


2.1.2 TYPES OF HARDWARE VIRTUALIZATION
Here are the three types of hardware virtualization:
1. Full Virtualization
2. Emulation Virtualization

3. Para virtualization

2.1.2.1 Full Virtualization


In full virtualization, the underlying hardware is completely simulated. Guest software does not
require any modification to run.

2.1.2.2 Emulation Virtualization


In Emulation, the virtual machine simulates the hardware and hence becomes independent of it.
In this, the guest operating system does not require modification.

FIGURE.2.3. Full Virtualization


FIGURE.2.4. Emulation Virtualization
2.1.2.3 Para virtualization
In Para virtualization, the hardware is not simulated. The guest software runs in its own isolated
domain.

FIGURE.2.5. Para virtualization


VMware vSphere is a highly developed infrastructure platform that offers a management
framework for virtualization. It virtualizes the system, storage and networking hardware.

2.2 SERVER VIRTUALIZATION


2.2.1 Definition - What does Server Virtualization mean?
Server virtualization is a virtualization technique that involves partitioning a physical server into
a number of small, virtual servers with the help of virtualization software. In server
virtualization, a single physical server runs multiple operating system instances at the same time,
one per virtual server. Server virtualization attempts to address the underutilization of physical
servers in one fell swoop. By using specially designed software, an administrator can convert one
physical server into multiple virtual machines. Each virtual server acts like a unique physical device,
capable of running its own OS. In theory, you could create enough virtual servers to use all of a
machine's processing power, though in practice that's not always the best idea.
Virtualization isn't a new concept. Computer scientists have been creating virtual machines on
supercomputers for decades. But it's only been a few years since virtualization has become
feasible for servers. In the world of information technology (IT), server virtualization is a hot
topic. It's still a young technology and several companies offer different approaches.

2.2.2 Why use server virtualization


There are many reasons companies and organizations are investing in server virtualization. Some
of the reasons are financially motivated, while others address technical concerns:
1. Server virtualization conserves space through consolidation. It's common practice to
dedicate each server to a single application. If several applications only use a small
amount of processing power, the network administrator can consolidate several machines
into one server running multiple virtual environments. For companies that have hundreds
or thousands of servers, the need for physical space can decrease significantly.
2. Server virtualization provides a way for companies to practice redundancy without
purchasing additional hardware. Redundancy refers to running the same application on
multiple servers. It's a safety measure -- if a server fails for any reason, another server
running the same application can take its place. This minimizes any interruption in
service. It wouldn't make sense to build two virtual servers performing the same
application on the same physical server. If the physical server were to crash, both virtual
servers would also fail. In most cases, network administrators will create redundant
virtual servers on different physical machines.
3. Virtual servers offer programmers isolated, independent systems in which they can test
new applications or operating systems. Rather than buying a dedicated physical machine,
the network administrator can create a virtual server on an existing machine. Because
each virtual server is independent in relation to all the other servers, programmers can run
software without worrying about affecting other applications.
4. Server hardware will eventually become obsolete, and switching from one system to
another can be difficult. In order to continue offering the services provided by these
outdated systems -- sometimes called legacy systems -- a network administrator could
create a virtual version of the hardware on modern servers. From an application
perspective, nothing has changed. The programs perform as if they were still running on
the old hardware. This can give the company time to transition to new processes without
worrying about hardware failures, particularly if the company that produced the legacy
hardware no longer exists and can't fix broken equipment.
5. An emerging trend in server virtualization is called migration. Migration refers to moving
a server environment from one place to another. With the right hardware and software,
it's possible to move a virtual server from one physical machine in a network to another.
Originally, this was possible only if both physical machines ran on the same hardware,
operating system and processor. It's possible now to migrate virtual servers from one
physical machine to another even if both machines have different processors, but only if
the processors come from the same manufacturer.

2.2.3 Three Kinds of Server Virtualization


There are three ways to create virtual servers: full virtualization, para-virtualization and OS-level
virtualization. They all share a few common traits. The physical server is called the host. The
virtual servers are called guests. The virtual servers behave like physical machines. Each system
uses a different approach to allocate physical server resources to virtual server needs.
Full virtualization uses a special kind of software called a hypervisor. The hypervisor interacts
directly with the physical server's CPU and disk space. It serves as a platform for the virtual
servers' operating systems. The hypervisor keeps each virtual server completely independent and unaware of the
other virtual servers running on the physical machine. Each guest server runs on its own OS --
you can even have one guest running on Linux and another on Windows.
The hypervisor monitors the physical server's resources. As virtual servers run applications, the
hypervisor relays resources from the physical machine to the appropriate virtual server.
Hypervisors have their own processing needs, which means that the physical server must reserve
some processing power and resources to run the hypervisor application. This can impact overall
server performance and slow down applications.
The para-virtualization approach is a little different. Unlike the full virtualization technique, the
guest servers in a para-virtualization system are aware of one another. A para-virtualization
hypervisor doesn't need as much processing power to manage the guest operating systems,
because each OS is already aware of the demands the other operating systems are placing on the
physical server. The entire system works together as a cohesive unit.
An OS-level virtualization approach doesn't use a hypervisor at all. Instead, the virtualization
capability is part of the host OS, which performs all the functions of a fully virtualized
hypervisor. The biggest limitation of this approach is that all the guest servers must run the same
OS. Each virtual server remains independent from all the others, but you can't mix and match
operating systems among them. Because all the guest operating systems must be the same, this is
called a homogeneous environment.
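Container platforms such as Docker are a widely used form of OS-level virtualization. As a
small illustrative sketch (assuming a running Docker daemon and the Docker SDK for Python
are installed; the image name is just an example), the following starts an isolated guest
environment that shares the host operating system's kernel:

import docker

# Connect to the local Docker daemon (an OS-level virtualization engine).
client = docker.from_env()

# Start a container; it gets its own isolated filesystem, processes and network,
# but shares the host operating system's kernel with every other container.
container = client.containers.run("nginx:latest", detach=True)
print("Started container", container.short_id)

# Clean up the example container.
container.stop()
container.remove()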
Which method is best? That largely depends on the network administrator's needs. If the
administrator's physical servers all run on the same operating system, then an OS-level approach
might work best. OS-level systems tend to be faster and more efficient than other methods. On
the other hand, if the administrator is running servers on several different operating systems,
para-virtualization might be a better choice. One potential drawback for para-virtualization
systems is support -- the technique is relatively new and only a few companies offer para-
virtualization software. More companies support full virtualization, but interest in para-
virtualization is growing and may replace full virtualization in time.
2.2.4 Limitations of Server Virtualization
The benefits of server virtualization can be so enticing that it's easy to forget that the technique
isn't without its share of limitations. It's important for a network administrator to research server
virtualization and his or her own network's architecture and needs before attempting to engineer
a solution.
For servers dedicated to applications with high demands on processing power, virtualization isn't
a good choice. That's because virtualization essentially divides the server's processing power up
among the virtual servers. When the server's processing power can't meet application demands,
everything slows down. Tasks that shouldn't take very long to complete might last hours. Worse,
it's possible that the system could crash if the server can't meet processing demands. Network
administrators should take a close look at CPU usage before dividing a physical server into
multiple virtual machines.
It's also unwise to overload a server's CPU by creating too many virtual servers on one physical
machine. The more virtual machines a physical server must support, the less processing power
each server can receive. In addition, there's a limited amount of disk space on physical servers.
Too many virtual servers could impact the server's ability to store data.
Another limitation is migration. Right now, it's only possible to migrate a virtual server from one
physical machine to another if both physical machines use the same manufacturer's processor. If
a network uses one server that runs on an Intel processor and another that uses an AMD
processor, it's impossible to port a virtual server from one physical machine to the other.
Why would an administrator want to migrate a virtual server in the first place? If a physical
server requires maintenance, porting the virtual servers over to other machines can reduce the
amount of application downtime. If migration isn't an option, then all the applications running on
the virtual servers hosted on the physical machine will be unavailable during maintenance.
Many companies are investing in server virtualization despite its limitations. As server
virtualization technology advances, the need for huge data centers could decline. Server power
consumption and heat output could also decrease, making server utilization not only financially
attractive, but also a green initiative. As networks use servers closer to their full potential, we
could see larger, more efficient computer networks. It's not an exaggeration to say that virtual
servers could lead to a complete revolution in the computing industry.
2.3 Desktop Virtualization
Deploying desktops as a managed service gives you the opportunity to respond quicker to
changing needs and opportunities. You can reduce costs and increase service by quickly and
easily delivering virtualized desktops and applications to branch offices, outsourced and offshore
employees and mobile workers on iPad and Android tablets. VMware desktop solutions,
including Horizon, Fusion, and Mirage, are scalable, consistent, fully secure and highly available
to ensure maximum uptime and productivity.
Server computers -- machines that host files and applications on computer networks -- have to be powerful.
Some have central processing units (CPUs) with multiple processors that give these servers the
ability to run complex tasks with ease. Computer network administrators usually dedicate
each server to a specific application or task. Many of these tasks don't play well with others --
each needs its own dedicated machine. One application per server also makes it easier to track
down problems as they arise. It's a simple way to streamline a computer network from a
technical standpoint.
There are a couple of problems with this approach, though. One is that it doesn't take advantage
of modern server computers' processing power. Most servers use only a small fraction of their
overall processing capabilities. Another problem is that as a computer network gets larger and
more complex, the servers begin to take up a lot of physical space. A data center might become
overcrowded with racks of servers.

2.4 Storage virtualization


In computer science, storage virtualization uses virtualization to enable better functionality and
more advanced features in computer data storage systems. Broadly speaking, a "storage system"
is also known as a storage array or disk array or a filer. Storage systems typically use special
hardware and software along with disk drives in order to provide very fast and reliable storage
for computing and data processing. Storage systems are complex, and may be thought of as a
special purpose computer designed to provide storage capacity along with advanced data
protection features. Disk drives are only one element within a storage system, along with
hardware and special purpose embedded software within the system. Storage systems can
provide either block accessed storage, or file accessed storage. Block access is typically
delivered over Fibre Channel, SAS, FICON or other protocols. File access is often provided
using NFS or SMB protocols. Within the context of a storage system, there are two primary
types of virtualization that can occur:
1. Block virtualization used in this context refers to the abstraction (separation) of logical
storage (partition) from physical storage so that it may be accessed without regard to physical
storage or heterogeneous structure. This separation allows the administrators of the storage
system greater flexibility in how they manage storage for end users.
2. File virtualization addresses the NAS challenges by eliminating the dependencies between
the data accessed at the file level and the location where the files are physically stored. This
provides opportunities to optimize storage use and server consolidation and to perform non-
disruptive file migrations.

2.4.1 Block virtualization


2.4.1.1 Address space remapping
Virtualization of storage helps achieve location independence by abstracting the physical
location of the data. The virtualization system presents to the user a logical space for data storage
and handles the process of mapping it to the actual physical location. It is possible to have
multiple layers of virtualization or mapping. It is then possible that the output of one layer of
virtualization can then be used as the input for a higher layer of virtualization. Virtualization
maps space between back-end resources, to front-end resources. In this instance, "back-end"
refers to a logical unit number (LUN) that is not presented to a computer, or host system for
direct use. A "front-end" LUN or volume is presented to a host or computer system for use.
The actual form of the mapping will depend on the chosen implementation. Some
implementations may limit the granularity of the mapping which may limit the capabilities of the
device. Typical granularities range from a single physical disk down to some small subset
(multiples of megabytes or gigabytes) of the physical disk. In a block-based storage
environment, a single block of information is addressed using a LUN identifier and an offset
within that LUN – known as the logical block address (LBA).
A. Meta-data
The virtualization software or device is responsible for maintaining a consistent view of all the
mapping information for the virtualized storage. This mapping information is often called meta-
data and is stored as a mapping table. The address space may be limited by the capacity needed
to maintain the mapping table. The level of granularity, and the total addressable space both
directly impact the size of the meta-data, and hence the mapping table. For this reason, it is
common to have trade-offs between the amount of addressable capacity and the granularity of access.
One common method to address these limits is to use multiple levels of virtualization. In several
storage systems deployed today, it is common to utilize three layers of virtualization. Some
implementations do not use a mapping table, and instead calculate locations using an algorithm.
These implementations utilize dynamic methods to calculate the location on access, rather than
storing the information in a mapping table.

B. I/O redirection
The virtualization software or device uses the meta-data to re-direct I/O requests. It will receive
an incoming I/O request containing information about the location of the data in terms of the
logical disk (vdisk) and translates this into a new I/O request to the physical disk location.
For example, the virtualization device may:
1. Receive a read request for vdisk LUN ID=1, LBA=32
2. Perform a meta-data look up for LUN ID=1, LBA=32, and finds this maps to physical
LUN ID=7, LBA0

3. Sends a read request to physical LUN ID=7, LBA0

4. Receives the data back from the physical LUN

5. Sends the data back to the originator as if it had come from vdisk LUN ID=1, LBA32
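The five steps above amount to a table lookup followed by a re-issued I/O. Below is a minimal sketch, assuming a simple dictionary-style mapping table; the structure and function names are illustrative, not any vendor's implementation.

# Illustrative sketch of meta-data driven I/O redirection.
# The mapping table translates (virtual LUN, virtual LBA) to (physical LUN, physical LBA),
# mirroring the worked example above.

mapping_table = {
    (1, 32): (7, 0),    # vdisk LUN 1, LBA 32 lives on physical LUN 7, LBA 0
}

def read_physical(lun, lba):
    # stand-in for a real back-end read
    return f"data from physical LUN {lun}, LBA {lba}"

def read(vdisk_lun, lba):
    phys_lun, phys_lba = mapping_table[(vdisk_lun, lba)]   # meta-data lookup
    data = read_physical(phys_lun, phys_lba)               # new I/O request to the physical disk
    return data                                            # returned as if it came from the vdisk

print(read(1, 32))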

C. Capabilities
Most implementations allow for heterogeneous management of multi-vendor storage devices
within the scope of a given implementation's support matrix. This means that the following
capabilities are not limited to a single vendor's device (as with similar capabilities provided by
specific storage controllers) and are in fact possible across different vendors' devices.
D. Replication
Data replication techniques are not limited to virtualization appliances and as such are not
described here in detail. However most implementations will provide some or all of these
replication services.
When storage is virtualized, replication services must be implemented above the software or
device that is performing the virtualization. This is true because it is only above the virtualization
layer that a true and consistent image of the logical disk (vdisk) can be copied. This limits the
services that some implementations can implement – or makes them seriously difficult to
implement. If the virtualization is implemented in the network or higher, this renders any
replication services provided by the underlying storage controllers useless.
1. Remote data replication for disaster recovery (the completion-timing difference between the two modes is sketched in code after this list)
1.1. Synchronous Mirroring – where I/O completion is only returned when the remote site acknowledges the completion. Applicable for shorter distances (<200 km)
1.2. Asynchronous Mirroring – where I/O completion is returned before the remote site has acknowledged the completion. Applicable for much greater distances (>200 km)
2. Point-In-Time Snapshots to copy or clone data for diverse uses
2.1. When combined with thin provisioning, enables space-efficient snapshots

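A hedged sketch of the completion-timing difference referred to above; the write and transfer functions are stand-ins, not a real replication engine.

# Illustrative sketch of synchronous vs. asynchronous remote mirroring.

pending_replication = []              # writes not yet shipped to the remote site

def write_local(block):
    pass                              # stand-in for the local disk write

def send_to_remote_and_wait(block):
    pass                              # stand-in for shipping the write and waiting for the remote ack

def write_synchronous(block):
    write_local(block)
    send_to_remote_and_wait(block)    # I/O only completes after the remote site acknowledges
    return "complete"                 # suited to shorter distances (<200 km)

def write_asynchronous(block):
    write_local(block)
    pending_replication.append(block) # remote copy happens later, in the background
    return "complete"                 # I/O completes immediately; suited to longer distances (>200 km)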
E. Pooling
The physical storage resources are aggregated into storage pools, from which the logical storage
is created. More storage systems, which may be heterogeneous in nature, can be added as and
when needed, and the virtual storage space will scale up by the same amount. This process is
fully transparent to the applications using the storage infrastructure.
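The pooling behaviour can be sketched as follows; the StoragePool class and the capacities are invented for illustration only.

# Illustrative sketch of storage pooling: heterogeneous devices contribute
# capacity to a single pool, and logical volumes are carved from the pool's free space.

class StoragePool:
    def __init__(self):
        self.capacity_gb = 0
        self.allocated_gb = 0

    def add_device(self, capacity_gb):
        # adding another (possibly different-vendor) array simply grows the pool
        self.capacity_gb += capacity_gb

    def create_volume(self, size_gb):
        if self.allocated_gb + size_gb > self.capacity_gb:
            raise RuntimeError("pool exhausted")
        self.allocated_gb += size_gb
        return {"size_gb": size_gb}

pool = StoragePool()
pool.add_device(2000)    # first array
pool.add_device(4000)    # second array added later; the pool scales transparently
vol = pool.create_volume(500)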

F. Disk management
The software or device providing storage virtualization becomes a common disk manager in the
virtualized environment. Logical disks (vdisks) are created by the virtualization software or
device and are mapped (made visible) to the required host or server, thus providing a common
place or way for managing all volumes in the environment.
Enhanced features are easy to provide in this environment:
1. Thin Provisioning to maximize storage utilization (see the sketch after this list)
1.1. This is relatively easy to implement as physical storage is only allocated in the mapping table when it is used.
2. Disk expansion and shrinking
2.1. More physical storage can be allocated by adding to the mapping table (assuming the using system can cope with online expansion)
2.2. Similarly disks can be reduced in size by removing some physical storage from the mapping (uses for this are limited as there is no guarantee of what resides on the areas removed)
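As noted in item 1 above, thin provisioning falls out naturally from the mapping table: an entry (and the physical extent behind it) is created only on the first write to a logical block. A minimal sketch with invented names follows.

# Illustrative thin-provisioned logical disk: physical extents are consumed
# only when a logical block is first written.

_next_extent = 0
def allocate_physical_extent():
    global _next_extent
    _next_extent += 1
    return _next_extent

def store(extent, data):
    pass  # stand-in for the real back-end write

class ThinDisk:
    def __init__(self, logical_size_blocks):
        self.logical_size = logical_size_blocks   # what the host believes it has
        self.mapping = {}                         # logical block -> physical extent

    def write(self, logical_block, data):
        if logical_block not in self.mapping:
            self.mapping[logical_block] = allocate_physical_extent()
        store(self.mapping[logical_block], data)

    def physical_usage(self):
        return len(self.mapping)                  # extents actually consumed

disk = ThinDisk(logical_size_blocks=1_000_000)    # the host sees a very large disk from day one
disk.write(42, b"hello")
print(disk.physical_usage())                      # only 1 extent used so far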

2.4.1.2 Benefits
A. Non-disruptive data migration
One of the major benefits of abstracting the host or server from the actual storage is the ability
to migrate data while maintaining concurrent I/O access.
The host only knows about the logical disk (the mapped LUN) and so any changes to the meta-
data mapping is transparent to the host. This means the actual data can be moved or replicated to
another physical location without affecting the operation of any client. When the data has been
copied or moved, the meta-data can simply be updated to point to the new location, therefore
freeing up the physical storage at the old location.
The process of moving the physical location is known as data migration. Most implementations
allow for this to be done in a non-disruptive manner that is concurrently while the host continues
to perform I/O to the logical disk (or LUN).
The mapping granularity dictates how quickly the meta-data can be updated, how much extra
capacity is required during the migration, and how quickly the previous location is marked as
free. The smaller the granularity the faster the update, less space required and quicker the old
storage can be freed up.
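Below is a sketch of the migration mechanism, reusing the illustrative mapping-table idea from earlier: the data is copied in the background, the meta-data entry is updated atomically, and the old location is freed. The function names are stand-ins, not a real implementation.

# Illustrative non-disruptive migration: the host keeps addressing the same
# logical (vdisk LUN, LBA); only the meta-data entry behind it changes.

mapping = {(1, 32): ("old_array", 0)}

def copy_blocks(src, dst):
    pass  # stand-in for the background data movement, concurrent with host I/O

def free_blocks(location):
    pass  # stand-in for marking the old extents as free

def migrate(vdisk_key, new_location):
    old_location = mapping[vdisk_key]
    copy_blocks(old_location, new_location)   # background copy while the host keeps doing I/O
    mapping[vdisk_key] = new_location         # single atomic meta-data update
    free_blocks(old_location)                 # old physical storage is released

migrate((1, 32), ("new_array", 512))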
There are many day to day tasks a storage administrator has to perform that can be simply and
concurrently performed using data migration techniques.
1. Moving data off an over-utilized storage device.
2. Moving data onto a faster storage device as needs require

3. Implementing an Information Lifecycle Management policy


4. Migrating data off older storage devices (either being scrapped or off-lease)

B. Improved utilization
Utilization can be increased by virtue of the pooling, migration, and thin provisioning services.
This allows users to avoid over-buying and over-provisioning storage solutions. In other words,
this kind of utilization through a shared pool of storage can be easily and quickly allocated as it
is needed to avoid constraints on storage capacity that often hinder application performance.
When all available storage capacity is pooled, system administrators no longer have to search for
disks that have free space to allocate to a particular host or server. A new logical disk can be
simply allocated from the available pool, or an existing disk can be expanded.
Pooling also means that all the available storage capacity can potentially be used. In a traditional
environment, an entire disk would be mapped to a host. This may be larger than is required, thus
wasting space. In a virtual environment, the logical disk (LUN) is assigned the capacity required
by the using host.
Storage can be assigned where it is needed at that point in time, reducing the need to guess how
much a given host will need in the future. Using Thin Provisioning, the administrator can create
a very large thin provisioned logical disk, thus the using system thinks it has a very large disk
from day one.
C. Fewer points of management
With storage virtualization, multiple independent storage devices, even if scattered across a
network, appear to be a single monolithic storage device and can be managed centrally.
However, traditional storage controller management is still required. That is, the creation and
maintenance of RAID arrays, including error and fault management.
D. Risks
i. Backing out a failed implementation
Once the abstraction layer is in place, only the virtualizer knows where the data actually resides
on the physical medium. Backing out of a virtual storage environment therefore requires the
reconstruction of the logical disks as contiguous disks that can be used in a traditional manner.
Most implementations will provide some form of back-out procedure and with the data migration
services it is at least possible, but time consuming.
ii. Interoperability and vendor support
Interoperability is a key enabler to any virtualization software or device. It applies to the actual
physical storage controllers and the hosts, their operating systems, multi-pathing software and
connectivity hardware.
Interoperability requirements differ based on the implementation chosen. For example,
virtualization implemented within a storage controller adds no extra overhead to host based
interoperability, but will require additional support of other storage controllers if they are to be
virtualized by the same software. Switch based virtualization may not require specific host
interoperability — if it uses packet cracking techniques to redirect the I/O. Network based
appliances have the highest level of interoperability requirements as they have to interoperate
with all devices, storage and hosts.
iii. Complexity
Complexity affects several areas:
1. Management of environment: Although a virtual storage infrastructure benefits from a single
point of logical disk and replication service management, the physical storage must still be
managed. Problem determination and fault isolation can also become complex, due to the
abstraction layer.
2. Infrastructure design: Traditional design ethics may no longer apply; virtualization brings a whole range of new ideas and concepts to think about.
3. The software or device itself: Some implementations are more complex to design and code –
network based, especially in-band (symmetric) designs in particular — these
implementations actually handle the I/O requests and so latency becomes an issue.
E. Meta-data management
Information is one of the most valuable assets in today's business environments. Once
virtualized, the meta-data are the glue in the middle. If the meta-data are lost, so is all the actual
data as it would be virtually impossible to reconstruct the logical drives without the mapping
information. Any implementation must ensure its protection with appropriate levels of back-ups
and replicas. It is important to be able to reconstruct the meta-data in the event of a catastrophic
failure.
The meta-data management also has implications on performance. Any virtualization software or
device must be able to keep all the copies of the meta-data atomic and quickly updateable. Some
implementations restrict the ability to provide certain fast update functions, such as point-in-time
copies and caching where super fast updates are required to ensure minimal latency to the actual
I/O being performed.
F. Performance and scalability
In some implementations the performance of the physical storage can actually be improved,
mainly due to caching. Caching however requires the visibility of the data contained within the
I/O request and so is limited to in-band and symmetric virtualization software and devices.
However these implementations also directly influence the latency of an I/O request (cache
miss), due to the I/O having to flow through the software or device. Assuming the software or
device is efficiently designed this impact should be minimal when compared with the latency
associated with physical disk accesses.
Due to the nature of virtualization, the mapping of logical to physical requires some processing
power and lookup tables. Therefore, every implementation will add some small amount of
latency.
In addition to response time concerns, throughput has to be considered. The bandwidth into and
out of the meta-data lookup software directly impacts the available system bandwidth. In
asymmetric implementations, where the meta-data lookup occurs before the information is read
or written, bandwidth is less of a concern as the meta-data are a tiny fraction of the actual I/O
size. In-band, symmetric flow through designs are directly limited by their processing power and
connectivity bandwidths.
Most implementations provide some form of scale-out model, where the inclusion of additional
software or device instances provides increased scalability and potentially increased bandwidth.
The performance and scalability characteristics are directly influenced by the chosen
implementation.
G. Implementation approaches
1. Host-based
2. Storage device-based
3. Network-based

 Host-based
Host-based virtualization requires additional software running on the host, as a privileged task or
process. In some cases volume management is built into the operating system, and in other
instances it is offered as a separate product. Volumes (LUN's) presented to the host system are
handled by a traditional physical device driver. However, a software layer (the volume manager)
resides above the disk device driver intercepts the I/O requests, and provides the meta-data
lookup and I/O mapping.
Most modern operating systems have some form of logical volume management built-in (in
Linux called Logical Volume Manager or LVM; in Solaris and FreeBSD, ZFS's zpool layer; in
Windows called Logical Disk Manager or LDM), that performs virtualization tasks.
Note: Host based volume managers were in use long before the term storage virtualization had
been coined.

 Pros
1. Simple to design and code
2. Supports any storage type
3. Improves storage utilization without thin provisioning restrictions
 Cons
1. Storage utilization optimized only on a per host basis
2. Replication and data migration only possible locally to that host
3. Software is unique to each operating system
4. No easy way of keeping host instances in sync with other instances
5. Traditional Data Recovery following a server disk drive crash is impossible
 Specific examples
1. Logical volume management
2. File systems, e.g., (hard links, SMB/NFS)
3. Automatic mounting
 Storage device-based
Like host-based virtualization, several categories have existed for years and have only recently
been classified as virtualization. Simple data storage devices, like single hard disk drives, do not
provide any virtualization. But even the simplest disk arrays provide a logical to physical
abstraction, as they use RAID schemes to join multiple disks in a single array (and possibly later
divide the array into smaller volumes).
Advanced disk arrays often feature cloning, snapshots and remote replication. Generally these
devices do not provide the benefits of data migration or replication across heterogeneous storage,
as each vendor tends to use their own proprietary protocols.
A new breed of disk array controllers allows the downstream attachment of other storage
devices. Here we discuss only the latter style, which actually virtualizes other storage devices.
 Concept
A primary storage controller provides the services and allows the direct attachment of other
storage controllers. Depending on the implementation these may be from the same or different
vendors.
The primary controller will provide the pooling and meta-data management services. It may also
provide replication and migration services across the controllers that it is virtualizing.
 Pros
1. No additional hardware or infrastructure requirements
2. Provides most of the benefits of storage virtualization
3. Does not add latency to individual I/Os
 Cons
1. Storage utilization optimized only across the connected controllers
2. Replication and data migration only possible across the connected controllers and same
vendors device for long distance support
3. Downstream controller attachment limited to vendors support matrix
4. I/O latency: non-cache hits require the primary storage controller to issue a secondary downstream I/O request
5. Increased storage infrastructure resources: the primary storage controller requires the same bandwidth as the secondary storage controllers to maintain the same throughput.
 Network-based
Storage virtualization operating on a network based device (typically a standard server or smart
switch) and using iSCSI or Fibre Channel (FC) networks to connect as a SAN. These types of
devices are the most commonly available and implemented form of virtualization.
The virtualization device sits in the SAN and provides the layer of abstraction between the hosts
performing the I/O and the storage controllers providing the storage capacity.
 Pros
1. True heterogeneous storage virtualization
2. Caching of data (performance benefit) is possible when in-band
3. Single management interface for all virtualized storage
4. Replication services across heterogeneous devices
 Cons
1. Complex interoperability matrices – limited by vendors support
2. Difficult to implement fast meta-data updates in switched-based devices
3. Out-of-band requires specific host based software
4. In-band may add latency to I/O
5. In-band the most complicated to design and code

 Appliance-based vs. switch-based


There are two commonly available implementations of network-based storage
virtualization, appliance-based and switch-based. Both models can provide the same services,
disk management, metadata lookup, data migration and replication. Both models also require
some processing hardware to provide these services.
Appliance based devices are dedicated hardware devices that provide SAN connectivity of one
form or another. These sit between the hosts and storage and in the case of in-band (symmetric)
appliances can provide all of the benefits and services discussed in this article. I/O requests are
targeted at the appliance itself, which performs the meta-data mapping before redirecting the I/O
by sending its own I/O request to the underlying storage. The in-band appliance can also provide
caching of data, and most implementations provide some form of clustering of individual
appliances to maintain an atomic view of the metadata as well as cache data.
Switch based devices, as the name suggests, reside in the physical switch hardware used to
connect the SAN devices. These also sit between the hosts and storage but may use different
techniques to provide the metadata mapping, such as packet cracking to snoop on incoming I/O
requests and perform the I/O redirection. It is much more difficult to ensure atomic updates of
metadata in a switched environment and services requiring fast updates of data and metadata
may be limited in switched implementations.

 In-band vs. out-of-band


In-band, also known as symmetric, virtualization devices actually sit in the data path between
the host and storage. All I/O requests and their data pass through the device. Hosts perform I/O
to the virtualization device and never interact with the actual storage device. The virtualization
device in turn performs I/O to the storage device. Caching of data, statistics about data usage,
replications services, data migration and thin provisioning are all easily implemented in an in-
band device.
Out-of-band, also known as asymmetric, virtualization devices are sometimes called meta-data
servers. These devices only perform the meta-data mapping functions. This requires additional
software in the host which knows to first request the location of the actual data. Therefore, an I/O
request from the host is intercepted before it leaves the host, a meta-data lookup is requested
from the meta-data server (this may be through an interface other than the SAN) which returns
the physical location of the data to the host. The information is then retrieved through an actual
I/O request to the storage. Caching is not possible as the data never passes through the device.
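The contrast between the two data paths can be summarised in a short sketch; the function and object names are illustrative only.

# In-band (symmetric): all data flows through the virtualization device.
def inband_read(virtualizer, key):
    phys = virtualizer.lookup(key)        # meta-data lookup happens inside the device
    data = virtualizer.read(phys)         # the device itself performs the back-end I/O
    return data                           # data passes through the device, so caching is possible

# Out-of-band (asymmetric): host-side software asks a meta-data server for the
# location, then talks to the storage directly; the data never passes through the server.
def outofband_read(metadata_server, storage, key):
    phys = metadata_server.lookup(key)    # host agent queries the meta-data server
    return storage.read(phys)             # the host issues the actual I/O itself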

2.5 Network virtualization


In computing, network virtualization is the
process of combining hardware and software network resources and network functionality into a
single, software-based administrative entity, a virtual network. Network virtualization
involves platform virtualization, often combined with resource virtualization. Network
virtualization is categorized as either external virtualization, combining many networks or parts
of networks into a virtual unit, or internal virtualization, providing network-like functionality to
software containers on a single network server.
In software testing, software developers use network virtualization to test software under
development in a simulation of the network environments in which the software is intended to
operate. As a component of application performance engineering, network virtualization enables
developers to emulate connections between applications, services, dependencies, and end users
in a test environment without having to physically test the software on all possible hardware or
system software. Of course, the validity of the test depends on the accuracy of the network
virtualization in emulating real hardware and operating systems.

2.5.1 Component
Various equipment and software vendors offer network virtualization by combining any of the
following:
1. Network hardware, such as switches and network adapters, also known as network interface
cards (NICs)
2. Network elements, such as firewalls and load balancers
3. Networks, such as virtual LANs (VLANs) and containers such as virtual machines (VMs)
4. Network storage devices
5. Network machine-to-machine elements, such as telecommunications devices
6. Network mobile elements, such as laptop computers, tablet computers, and smart phones
7. Network media, such as Ethernet and Fibre Channel.

2.5.2 External Virtualization


External network virtualization combines or subdivides one or more local area networks (LANs)
into virtual networks to improve a large network's or data centre's efficiency. A virtual local area
network (VLAN) and network switch comprise the key components. Using this technology,
a system administrator can configure systems physically attached to the same local network into
separate virtual networks. Conversely, an administrator can combine systems on separate local
area networks (LANs) into a single VLAN spanning segments of a large network.
2.5.3 Internal Virtualization
Internal network virtualization configures a single system with software containers, such
as Xen hypervisor control programs, or pseudo-interfaces, such as a VNIC, to emulate a physical
network with software. This can improve a single system's efficiency by isolating applications to
separate containers or pseudo-interfaces.

2.5.4 Wireless network virtualization


Wireless network virtualization can have a very broad scope ranging from spectrum sharing,
infrastructure virtualization, to air interface virtualization. Similar to wired network
virtualization, in which physical infrastructure owned by one or more providers can be shared
among multiple service providers, wireless network virtualization needs the physical wireless
infrastructure and radio resources to be abstracted and isolated to a number of virtual resources,
which then can be offered to different service providers. In other words, virtualization, regardless of wired or wireless networks, can be considered a process of splitting up the entire network system. However, the distinctive properties of the wireless environment, in terms of time-varying channels, attenuation, mobility, broadcast, etc., make the problem more complicated. Furthermore, wireless network virtualization depends on specific access technologies; wireless networks contain many more access technologies than wired networks, and each access technology has its particular characteristics, which makes convergence, sharing and abstraction difficult to achieve. Therefore, it may be inaccurate to consider wireless network virtualization as a subset of network virtualization.
2.5.5 Performance
Up to 1 Gbit/s, network virtualization did not suffer noticeably from the overhead of the software or hypervisor layers providing the interconnects. With the rise of higher bandwidths, 10 Gbit/s and beyond, packet rates exceed the processing capabilities of the networking stacks. To keep offering high-throughput processing, combinations of software and hardware helpers are deployed in the so-called "network in a box", using either a hardware-dependent network interface controller (NIC) with the SR-IOV extensions of the hypervisor, or a fast-path technology between the NIC and the payloads (virtual machines or containers).
For example, in the case of OpenStack, networking is provided by Neutron, which leverages many networking features of the Linux kernel: iptables, iproute2, L2 bridging, L3 routing and OVS. Since the Linux kernel cannot sustain the 10G packet rate, bypass technologies providing a fast path are used. The main bypass technologies are either based on a limited set of features, such as Open vSwitch (OVS) with its DPDK user-space implementation, or on a full-feature offload of Linux processing, such as 6WIND Virtual Accelerator.

2.6 Storage Services


A public cloud storage service is usually suitable for unstructured data that is not subject to
constant change. The infrastructure usually consists of inexpensive storage nodes attached to
commodity drives. Data is stored on multiple nodes for redundancy and accessed through
Internet protocols, typically Representational State Transfer (REST). Public cloud storage
service providers include Amazon, AT&T, Iron Mountain, Microsoft, Nirvanix and Rackspace.
A private cloud storage service is more suitable for actively used data and for data that an
organization needs more control over. Storage is on a dedicated infrastructure within the data
center, which helps ensure security and performance. One example of a private cloud storage
offering is the Hitachi Data Systems Cloud Service for Private File Tiering.
Some enterprise users opt for a hybrid cloud storage model, storing unstructured data with a
public cloud provider but storing actively used and structured data with a private cloud provider.
Things to consider in a cloud storage service:
1. Does the service use REST, the most commonly used cloud storage API?
2. Are you migrating data from an existing archival storage product?
3. Does your data have to be preserved in some specific format to meet compliance requirements? That capability is not commonly available.
4. Can the provider deal with large fluctuations in resource demands?
5. Does the provider offer both public and private clouds? This may become important if you someday want to migrate data from one type of service to the other.
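Regarding point 1, REST-style object storage boils down to plain HTTP verbs against object URLs. Below is a hedged sketch using only the Python standard library; the endpoint, bucket name, and authorization header are hypothetical placeholders, since each real provider defines its own URL scheme and authentication.

# Hedged sketch of REST-style object storage access.
# The endpoint, bucket name, and auth header are hypothetical placeholders.

import urllib.request

ENDPOINT = "https://storage.example-provider.com/my-bucket"   # hypothetical endpoint
HEADERS = {"Authorization": "Bearer <token>"}                 # placeholder credential

def _request(method, name, data=None):
    req = urllib.request.Request(f"{ENDPOINT}/{name}", data=data,
                                 headers=HEADERS, method=method)
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def put_object(name, data):
    return _request("PUT", name, data)      # HTTP PUT creates or overwrites an object

def get_object(name):
    return _request("GET", name)            # HTTP GET retrieves it

def delete_object(name):
    return _request("DELETE", name)         # HTTP DELETE removes it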

2.7 Service virtualization


In software engineering, service virtualization is a method to emulate the behaviour of specific
components in heterogeneous component-based applications such as API-driven
applications, cloud-based applications and service-oriented architectures. It is used to
provide software development and QA/testing teams access to dependent system components
that are needed to exercise an application under test (AUT), but are unavailable or difficult-to-
access for development and testing purposes. With the behaviour of the dependent components
"virtualized," testing and development can proceed without accessing the actual live components.
Service virtualization is recognized by vendors, industry analysts, and industry publications as
being different than mocking.

2.7.1 Service virtualization Overview


Service virtualization emulates the behaviour of software components to remove dependency
constraints on development and testing teams. Such constraints occur in complex, interdependent
environments when a component connected to the application under test is:
1. Not yet completed
2. Still evolving
3. Controlled by a third party or partner
4. Available for testing only in limited capacity or at inconvenient times
5. Difficult to provision or configure in a test environment
6. Needed for simultaneous access by different teams with varied test data setup and other requirements
7. Restricted or costly to use for load and performance testing
Although the term "service virtualization" reflects the technique's initial focus on
virtualizing web services, service virtualization extends across all aspects of composite
applications: services, databases, mainframes, ESBs, and other components that communicate
using common messaging protocols.
Service virtualization emulates only the behaviour of the specific dependent components that
developers or testers need to exercise in order to complete their end-to-end transactions. Rather
than virtualizing entire systems, it virtualizes only specific slices of dependent behaviour critical
to the execution of development and testing tasks. This provides just enough application logic so
that the developers or testers get what they need without having to wait for the actual service to
be completed and readily available. For instance, instead of virtualizing an entire database (and
performing all associated test data management as well as setting up the database for every test
session), you monitor how the application interacts with the database, then you emulate the
related database behaviour (the SQL queries that are passed to the database, the corresponding
result sets that are returned, and so forth).
2.7.2 Applying Service virtualization
Service virtualization involves creating and deploying a "virtual asset" that simulates the
behaviour of a real component which is required to exercise the application under test, but is
difficult or impossible to access for development and testing purposes.
A virtual asset stands in for a dependent component by listening for requests and returning an
appropriate response—with the appropriate performance. For a database, this might involve
listening for a SQL statement, then returning data source rows. For a web service, this might
involve listening for an XML message over HTTP, JMS, or MQ, then returning another XML
message. The virtual asset's functionality and performance might reflect the actual
functionality/performance of the dependent component, or it might simulate exceptional
conditions (such as extreme loads or error conditions) to determine how the application under
test responds under those circumstances.
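At its simplest, such a virtual asset is a listener that matches incoming requests against canned behaviour and answers with a chosen status, payload, and delay. Below is a minimal sketch for a hypothetical HTTP dependency; the paths, payloads, and timings are invented for illustration.

# Minimal sketch of a "virtual asset": it stands in for a dependent HTTP service
# by returning canned responses, optionally with artificial latency or error conditions.
# The paths, payloads, and delays below are illustrative only.

import json, time
from http.server import BaseHTTPRequestHandler, HTTPServer

CANNED = {
    "/accounts/42": {"status": 200, "body": {"id": 42, "balance": 100.0}, "delay_s": 0.2},
    "/accounts/99": {"status": 500, "body": {"error": "simulated outage"}, "delay_s": 5.0},
}

class VirtualAsset(BaseHTTPRequestHandler):
    def do_GET(self):
        entry = CANNED.get(self.path, {"status": 404, "body": {}, "delay_s": 0})
        time.sleep(entry["delay_s"])                     # emulate the component's response time
        self.send_response(entry["status"])
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps(entry["body"]).encode())

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), VirtualAsset).serve_forever()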
Virtual assets are typically created by:
1. Recording live communication among components as the system is exercised from the
application under test (AUT)
2. Providing logs representing historical communication among components
3. Analyzing service interface specifications (such as a WSDL)
4. Defining the behavior manually with various interface controls and data source values
They are then further configured to represent specific data, functionality, and response times.
Virtual assets are deployed locally or in the cloud (public or private). With development/test
environments configured to use the virtual assets in place of dependent components, developers
or testers can then exercise the application they are working on without having to wait for the
dependent components to be completed or readily accessible.
Industry analysts report that service virtualization is best suited for "IT shops with significant
experience with 'skipping' integration testing due to 'dependent software', and with a reasonably
sophisticated test harness."

2.8 Internals of virtual machine


2.8.1 What is a Virtual Machine?
A virtual machine (VM) is an abstraction layer or environment between hardware components
and the end-user. Virtual machines run operating systems and are sometimes referred to as
virtual servers. A host operating system can run many virtual machines and shares system
hardware components such as CPUs, controllers, disk, memory, and I/O among virtual servers. A
“real machine” is the host operating system and hardware components, sometimes described as
“bare metal,” such as memory, CPU, motherboard, and network interface. The real machine is
essentially a host system with no virtual machines. The real machine operating system accesses
hardware components by making calls through a low-level program called the BIOS (basic
input/output system). Virtual machines are built on top of the real machine core components.
Goldberg describes virtual machines as “facsimiles” or a “hardware-software duplicate of a real
existing machine". Abstraction layers called hypervisors or VMMs (virtual machine monitors)
make calls from the virtual machine to the real machine. Current hypervisors use the real
machine hardware components, but allow for different virtual machine operating systems and
configurations. For example, a host system might run on SuSE Linux, and guest virtual machines
might run Windows 2003 and Solaris 10. Virtual machine monitors and hypervisors are similar
to “emulators.” Emulation is a “process whereby one computer is set up to permit the execution
of programs written for another computer”. Hypervisors offer a level of efficiency over emulators, which must translate every instruction or system call to the CPU, memory, and disk. Hypervisors
have specialized management functions that allow multiple VMs to co-exist peacefully while
sharing real machine resources. Malachi concludes the differences are largely semantic because
both hypervisors and emulators require I/O requests, memory mapping, and logical memory
schemes.

2.8.2 Virtual Machine History


Virtual machines have been in the computing community for more than 40 years. Early in the
1960s, systems engineers and programmers at MIT recognized the need for virtual machines. In
her authoritative discourse, “VM and the VM Community: Past, Present, and Future,” Melinda
Varian introduces virtual machine technology, starting with the Compatible Time-Sharing
System (CTSS). IBM engineers had worked with MIT programmers to develop a time-sharing
system to allow project teams to use part of the mainframe computers. Varian goes on to
describe the creation, development, and use of virtual machines on the IBM OS/360 Model 67 to
the VM/370 and the OS/390. Varian’s paper covers virtual machine history, emerging virtual
machine designs, important milestones and meetings, and influential engineers in the virtual
computing community. In 1973, Srodowa and Bates demonstrated how to create virtual
machines on IBM OS/360s. In “An Efficient Virtual Machine Implementation,” they describe the
use of IBM’s Virtual Machine Monitor, a hypervisor, to build virtual machines and allocate
memory, storage, and I/O effectively. Srodowa and Bates touch on virtual machine topics still
debated today: performance degradation, capacity, CPU allocation, and storage security.
Goldberg concludes “the majority of today’s computer systems do not and cannot support virtual
machines. The few virtual machine systems currently operational, e.g., CP-67, utilize awkward
and inadequate techniques because of unsuitable architectures“. Goldberg proposes the
“Hardware Virtualizer,” in which a virtual machine would communicate directly with hardware
instead of going through the host software. Nearly 30 years later, industry analysts are excited
about the announcement of hardware architectures capable of supporting virtual machines
efficiently. AMD and Intel have revealed specifications for Pacifica and Vanderpool chip
technologies with special virtualization support features. The 1980s and early 1990s brought
distributing computing to data centers. Centralized computing and virtual machine interest was
replaced by standalone servers with dedicated functions: email, database, Web, applications.
After significant investments in distributed architectures, renewed focus on virtual machines as a
complementary solution for server consolidation projects and data center management initiatives
has resurfaced. Recent developments in virtual machines on the Windows x86 platform merit a
new chapter in virtual machine history. Virtual machine software from Virtuozzo, Microsoft,
Xen, and EMC (VMWare) has spurred creative virtual machine solutions. Grid computing,
computing on demand, and utility computing technologies seek to maximize computing power in
an efficient, manageable way. The virtual machine was created on the mainframe. It has only
recently been introduced on the mid-range, distributed, x86 platform. Technological
advancements in hardware and software make virtual machines stable, affordable, and offer
tremendous value, given the right implementation.
Virtual machines are implemented in various forms. Mainframe, open source, paravirtualization,
and custom approaches to virtual machines have been designed over the years. Complexity in
chip technology and approaches to solving the x86 limitations of virtualization have led to three
different variants of virtual machines:
1. software virtual machines (see FIGURE.2.6), which manage interactions between the
host operating system and guest operating system (e.g., Microsoft Virtual Server 2005);
2. hardware virtual machines (see FIGURE.2.7), in which virtualization technology sits
directly on host hardware (bare metal) using hypervisors, modified code, or APIs to facilitate
faster transactions with hardware devices (e.g., VMWare ESX); and
3. virtual OS/containers (see FIGURE.2.8), in which the host operating system is partitioned
into containers or zones (e.g., Solaris Zones, BSD Jail).

FIGURE.2.6. Software Virtual machine


FIGURE.2.7. Hardware Virtual machine

FIGURE.2.8. Virtual OS/Containers Virtual machine


A simple UNIX implementation called chroot allows an alternate directory path for the root file
system. This creates a “jail,” or sandbox, for new applications or unknown applications. Isolated
processes in chroot are best suited for testing and applications prototyping. They have direct
access to physical devices, unlike emulators. Sun Microsystems’ “Solaris Zones” technology is
an implementation of chroot, similar to the FreeBSD jail design, with additional features. Zones
allow multiple applications to run in isolated partitions on a single operating system. Each zone
has its own unique process table and management tools that allow each partition to be patched,
rebooted, upgraded, and configured separately. Distinct root privileges and file systems are
assigned to each zone. Microsoft Corporation’s Virtual Server 2005 is a new virtual machine
manager in the market. After acquiring virtual machine technology from software vendor
Connectix in 2003, Microsoft introduced the Virtual Server 2005 product, which runs on a
Windows 2003 host and, predictably, supports Windows guest operating systems only. At the
time of publishing this paper, Virtual Server is limited to running on single-processor hosts and
cannot support symmetric multiprocessing (SMP). SMP was introduced on RISC platforms, such
as Sun Sparc and DEC Alpha chipsets, before being adopted on the x86 Intel Xeon and AMD
Athlon processors. SMP allows multiple, identical chipsets to share one memory bank.
Instructions can be shared among the processors or isolated to a dedicated processor on the
system. The system can share a workload with increased efficiency. A variation of SMP is
AMD’s Opteron technology, which allows dual-processor chips. The Opteron uses DDR
SDRAM memory dedicated to each processor, as opposed to a single shared memory bank. The
multiprocessing nature of numerous virtual machine guest servers on one host makes dual-core
Opteron chips an attractive platform. Para virtualization is a variant of full operating system
virtualization. Para virtualization avoids “drawbacks of full virtualization by presenting a virtual
machine abstraction that is similar but not identical to the underlying hardware”. This technique
allows a guest operating system to be “ported” through a special API (application programming
interface) to run. The Xen paravirtualization research project, at the University of Cambridge, is a virtual machine monitor (hypervisor) that allows commodity operating systems to be consolidated and guests to be moved efficiently across physical devices. Xen currently supports
only open source guest systems, though a Windows XP port is being developed. Denali is
another paravirtualization implementation, but it requires significant modification to host system
binaries and focuses on high-performance virtual machines. EMC’s VMware technology is the
market leader in x86 virtualization technology. VMware ESX server uses a special hypervisor to
“dynamically rewrite portions of the hosted machine code to insert traps wherever VMM
intervention might be required”. The VMware solution is more costly, but it provides a robust
management console and full-virtualization support for an array of guest operating systems
including Solaris, Linux, Windows, and DOS.

2.9 KVM
KVM (for Kernel-based Virtual Machine) is a full virtualization solution for Linux on x86
hardware containing virtualization extensions (Intel VT or AMD-V). It consists of a loadable
kernel module, kvm.ko, that provides the core virtualization infrastructure and a processor
specific module, kvm-intel.ko or kvm-amd.ko. Using KVM, one can run multiple virtual
machines running unmodified Linux or Windows images. Each virtual machine has private
virtualized hardware: a network card, disk, graphics adapter, etc. KVM is open source software.
The kernel component of KVM is included in mainline Linux, as of 2.6.20. The userspace
component of KVM is included in mainline QEMU.
2.9.1 The Virtualization component with KVM

FIGURE.2.9. Virtualization component with KVM


2.9.2 Internals
By itself, KVM does not perform any emulation. Instead, it exposes the /dev/kvm interface,
which a userspace host can then use to:
1. Set up the guest VM's address space. The host must also supply a firmware image (usually a
custom BIOS when emulating PCs) that the guest can use to bootstrap into its main OS.
2. Feed the guest simulated I/O.
3. Map the guest's video display back onto the host.

On Linux, QEMU versions 0.10.1 and later are one such userspace host. QEMU uses KVM when available to run virtualized guests at near-native speeds, but otherwise falls back to software-only
emulation. Internally, KVM uses SeaBIOS as an open source implementation of a 16-bit
x86 BIOS.
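A hedged sketch of the very first steps a userspace host takes against /dev/kvm is shown below: open the device, check the API version, and create an empty VM. The ioctl constants correspond to KVM_GET_API_VERSION and KVM_CREATE_VM from linux/kvm.h (treat the numeric values as an assumption to verify against your kernel headers); a real host such as QEMU then goes on to map guest memory, create vCPUs, supply a firmware image, and enter the run loop.

# Hedged sketch: the first two ioctl() calls a userspace host makes against /dev/kvm.
# The constants correspond to KVM_GET_API_VERSION and KVM_CREATE_VM in <linux/kvm.h>
# (verify against your kernel headers). Requires Linux with KVM and access to /dev/kvm.

import fcntl, os

KVM_GET_API_VERSION = 0xAE00   # assumed value of _IO(KVMIO, 0x00), with KVMIO == 0xAE
KVM_CREATE_VM       = 0xAE01   # assumed value of _IO(KVMIO, 0x01)

kvm_fd = os.open("/dev/kvm", os.O_RDWR)

api_version = fcntl.ioctl(kvm_fd, KVM_GET_API_VERSION)
print("KVM API version:", api_version)          # expected to be 12 on current kernels

vm_fd = fcntl.ioctl(kvm_fd, KVM_CREATE_VM, 0)   # returns a file descriptor for the new VM
print("created VM fd:", vm_fd)

# Next steps (not shown): set up guest memory regions, create vCPUs, load a
# firmware/BIOS image, and enter the run loop -- this is what QEMU does.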
FIGURE.2.10. Internals of KVM

kvm-unit-tests is a project as old as KVM. As its name suggests, its purpose is to provide unit tests for KVM. The unit tests are tiny guest operating systems that generally execute only tens of lines of C and assembler test code in order to obtain a PASS/FAIL result. Unit tests provide KVM and virt hardware functional testing by targeting the features through minimal implementations of their use per the hardware specification. The simplicity of unit tests makes them easy to verify for correctness, easy to maintain, and easy to use in timing measurements.
Unit tests are also often used for quick and dirty bug reproducers. The reproducers may then be
kept as regression tests. It's strongly encouraged that patches implementing new KVM features
are submitted with accompanying unit tests.
While a single unit test is focused on a single feature, all unit tests share the minimal system
initialization and setup code. There are also several functions made shareable across all unit tests,
comprising a unit test API. The setup code and API implementation are briefly described in the
next section, "Framework". We then describe testdevs in the "Testdevs" section, which are
extensions to KVM's userspace that provide special support for unit tests. Section "API" lists the
subsystems, e.g. MMU, SMP, which the API covers, along with a few descriptions of what the
API supports. It specifically avoids listing any actual function declarations though, as those may
change (use the source, Luke!). The "Running tests" section gives all the details necessary to build and run tests, and the "Adding a test" section provides an example of adding a test. Finally, the "Contributing" section explains where and how to submit patches.

2.9.3 Framework
The kvm-unit-tests framework supports multiple architectures; currently i386, x86_64, armv7
(arm), armv8 (arm64), ppc64, and ppc64le. The framework makes sharing code between similar
architectures, e.g. arm and arm64, easier by adopting Linux's configured asm symlink. With the
asm symlink each architecture gets its own copy of the headers, but then may opt to share the
same code.
The framework has the following components:
2.9.3.1 Test building support
Test building is done through make files and some supporting bash scripts.
2.9.3.2 Shared code for test setup and API
Test setup code includes, for example, early system init, MMU enablement, and UART init. The
API provides some common libc functions, e.g. strcpy, atol, malloc, printf, as well as some low-
level helper functions commonly seen in kernel code, e.g. irq_enable/disable(), barrier(), and
some kvm-unit-tests specific API for, e.g., installing exception handlers and reporting test
success/failure.
2.9.3.3 Test running support
Test running is provided with a few bash scripts, using a unit tests configuration file as input.
Generally tests are run from within the source root directory using the supporting scripts, but tests
may optionally be built as standalone tests as well. More information about the standalone
building and running is in the section "Running tests".
2.9.4 API
There are three API categories in kvm-unit-tests 1) libc, 2) functions typical of kernel code, and
3) kvm-unit-tests specific. Very little libc has been implemented, but some of the most
commonly used functions, such as strcpy, memset, malloc, printf, assert, exit, and others are
available. To give an overview of (2), it's best to break them down by subsystem.

2.9.5 Device discovery


ACPI - minimal table search support. Currently x86-only. Device tree - libfdt and a device tree
library wrapping libfdt to accommodate the use of device trees conforming to the Linux
documentation. For example there is a function that gets the "bootargs" from /chosen, which are
then fed into the unit test's main function's inputs (argc, argv) by the setup code before the unit
test starts.

2.9.6 Vectors
Functions to install exception handlers. On ARM a default register-dumping handler is installed
during system init before the unit test starts.

2.9.7 Memory
Functions for memory allocation. Free memory is prepared for allocation during system init
before the unit test starts. Functions for MMU enable/disable, TLB flushing, PTE setting, etc.

2.9.8 SMP
Functions to boot secondaries, iterate online cpus, etc.
Barriers, spinlocks, atomic ops, cpumasks, etc.

2.9.9 I/O
1. Output messages to the UART. The UART is initialized during system init before the unit
test starts.
2. Functions to read/write MMIO
3. Functions to read/write I/O ports (x86-only)
4. Functions for accessing PCI devices (Currently x86-only)
2.9.10 Power management
 PSCI (ARM-only)
 RTAS (PowerPC-only)

2.9.11 Interrupt controller


 Functions to enable/disable, send IPIs, etc.
 Functions to enable/disable IRQs

2.9.12 Virtio
Buffer sending support (Currently virtio-mmio only)

2.9.13 Misc
1. Special register accessors
2. Switch to user mode support
3. Linux's asm-offsets generation, which can be used for structures that need to be accessed from assembly.
Note: many of the names for the functions implementing the above are kvm-unit-tests specific, making them also part of the kvm-
unit-tests specific API. However, at least for ARM, any function that implements something for
which the Linux kernel already has a function, then we use the same name (and exact same type
signature, if possible). The kvm-unit-tests specific API also includes some testing specific
functions, such as report() and report_summary(). The report* functions should be used to report
PASS/FAIL results of the tests, and the overall test result summary.

2.10 Xen
2.10.1 What is the Xen Project Hypervisor?
The Xen Project hypervisor is an open-source type-1 or baremetal hypervisor, which makes it
possible to run many instances of an operating system or indeed different operating systems in
parallel on a single machine (or host). The Xen Project hypervisor is the only type-1 hypervisor
that is available as open source. It is used as the basis for a number of different commercial and
open source applications, such as: server virtualization, Infrastructure as a Service (IaaS),
desktop virtualization, security applications, embedded and hardware appliances. The Xen
Project hypervisor is powering the largest clouds in production today.
Here are some of the Xen Project hypervisor's key features:
1. Small footprint and interface (is around 1MB in size). Because it uses a microkernel design,
with a small memory footprint and limited interface to the guest, it is more robust and secure
than other hypervisors.
2. Operating system agnostic: Most installations run with Linux as the main control stack (aka
"domain 0"). But a number of other operating systems can be used instead, including
NetBSD and OpenSolaris.
3. Driver Isolation: The Xen Project hypervisor has the capability to allow the main device
driver for a system to run inside of a virtual machine. If the driver crashes, or is
compromised, the VM containing the driver can be rebooted and the driver restarted without
affecting the rest of the system.
4. Para virtualization: Fully paravirtualized guests have been optimized to run as a virtual
machine. This allows the guests to run much faster than with hardware extensions (HVM).
Additionally, the hypervisor can run on hardware that doesn't support virtualization
extensions.

The key aspects of the Xen Project architecture that a user needs to understand in order to make
the best choices are:
1. Guest types: The Xen Project hypervisor can run fully virtualized (HVM) guests, or
paravirtualized (PV) guests.
2. Domain 0: The architecture employs a special domain called domain 0 which contains
drivers for the hardware, as well as the toolstack to control VMs.
3. Toolstack: Various toolstack front-ends are available as part of the Xen Project stack, each
with different implications for how the system is managed.

2.10.2 Introduction to Xen Project Architecture


Below is a diagram of the Xen Project architecture. The Xen Project hypervisor runs directly on
the hardware and is responsible for handling CPU, Memory, and interrupts. It is the first program
running after exiting the boot loader. On top of the hypervisor run a number of virtual machines.
A running instance of a virtual machine is called a domain or guest. A special domain, called
domain 0 contains the drivers for all the devices in the system. Domain 0 also contains a control
stack to manage virtual machine creation, destruction, and configuration.
FIGURE.2.11. Xen Project architecture

2.10.2.1 Components in detail


1. The Xen Project Hypervisor is an exceptionally lean (<150,000 lines of code) software
layer that runs directly on the hardware and is responsible for managing CPU, memory, and
interrupts. It is the first program running after the boot loader exits. The hypervisor itself has
no knowledge of I/O functions such as networking and storage.
2. Guest Domains/Virtual Machines are virtualized environments, each running their own
operating system and applications. The hypervisor supports two different virtualization
modes: Para virtualization (PV) and Hardware-assisted or Full Virtualization (HVM). Both
guest types can be used at the same time on a single hypervisor. It is also possible to use
techniques used for Para virtualization in an HVM guest: essentially creating a continuum
between PV and HVM. This approach is called PV on HVM. Guest VMs are totally isolated
from the hardware: in other words, they have no privilege to access hardware or I/O
functionality. Thus, they are also called unprivileged domain (or DomU).
3. The Control Domain (or Domain 0) is a specialized Virtual Machine that has special
privileges like the capability to access the hardware directly, handles all access to the
system’s I/O functions and interacts with the other Virtual Machines. It also exposes a
control interface to the outside world, through which the system is controlled. The Xen
Project hypervisor is not usable without Domain 0, which is the first VM started by the
system.
4. Toolstack and Console: Domain 0 contains a control stack (also called Toolstack) that
allows a user to manage virtual machine creation, destruction, and configuration. The
toolstack exposes an interface that is either driven by a command line console, by a graphical
interface or by a cloud orchestration stack such as OpenStack or CloudStack.
5. Xen Project-enabled operating systems: Domain 0 requires a Xen Project-enabled kernel.
Paravirtualized guests require a PV-enabled kernel. Linux distributions that are based on a
recent Linux kernel are Xen Project-enabled and usually include packages that contain the
hypervisor and Tools (the default Toolstack and Console). All but legacy Linux kernels are
PV-enabled, capable of running PV guests.

2.10.2.2 Guest Types

FIGURE.2.12. The evolution of the different virtualization modes in the Xen Project Hypervisor

The hypervisor supports running two different types of guests: Paravirtualization (PV) and Full
or Hardware assisted Virtualization (HVM). Both guest types can be used at the same time on a
single hypervisor. It is also possible to use techniques used for Paravirtualization in an HVM
guest and vice versa: essentially creating a continuum between the capabilities of pure PV and
HVM. We use different abbreviations to refer to these configurations, called HVM with PV
drivers, PVHVM and PVH.
 PV
Para virtualization (PV) is an efficient and lightweight virtualization technique originally
introduced by Xen Project, later adopted by other virtualization platforms. PV does not require
virtualization extensions from the host CPU. However, paravirtualized guests require a PV-
enabled kernel and PV drivers, so the guests are aware of the hypervisor and can run efficiently
without emulation or virtual emulated hardware. PV-enabled kernels exist for Linux, NetBSD,
FreeBSD and OpenSolaris. Linux kernels have been PV-enabled from 2.6.24 using the Linux
pvops framework. In practice this means that PV will work with most Linux distributions (with
the exception of very old versions of distros).
 HVM
Full Virtualization or Hardware-assisted Virtualization (HVM) uses virtualization extensions from
the host CPU to virtualize guests. HVM requires Intel VT or AMD-V hardware extensions. The
Xen Project software uses Qemu to emulate PC hardware, including BIOS, IDE disk controller,
VGA graphic adapter, USB controller, network adapter etc. Virtualization hardware extensions
are used to boost performance of the emulation. Fully virtualized guests do not require any
kernel support. This means that Windows operating systems can be used as a Xen Project HVM
guest. Fully virtualized guests are usually slower than paravirtualized guests, because of the
required emulation.

FIGURE.2.13. Difference between HVM with and without PV drivers


Note that it is possible to use PV Drivers for I/O to speed up HVM guests. On Windows this
requires that appropriate PV drivers are installed. You can find more information at
1. Xen Project PV Drivers
2. 3rd Party GPL PV Drivers (signed drivers are available)
3. Windows PV Drivers Portal
On operating systems with Xen Support - aka with PV or PVHVM drivers, these drivers will be
automatically used when you select the HVM virtualization mode.
 PVHVM
To boost performance, fully virtualized HVM guests can use special paravirtual device drivers
(PVHVM or PV-on-HVM drivers). These drivers are optimized PV drivers for HVM
environments and bypass the emulation for disk and network IO, thus giving you PV like (or
better) performance on HVM systems. This means that you can get optimal performance on
guests operating systems such as Windows.

FIGURE.2.14. Difference between HVM with and without PV and PVHVM drivers
Note that Xen Project PV (paravirtual) guests automatically use PV drivers: there is thus no need
for these drivers - you are already automatically using the optimized drivers. PVHVM drivers are
only required for HVM (fully virtualized) guest VMs.
 PVH
Xen Project 4.4 introduced a virtualization mode called PVH for DomU's. Xen Project 4.5
introduced PVH for Dom0 (both Linux and some BSD's). This is essentially a PV guest using
PV drivers for boot and I/O. Otherwise it uses HW virtualization extensions, without the need for
emulation. PVH is considered experimental in 4.4 and 4.5. It works pretty well, but additional
tuning is needed (probably in the 4.6 release) before it should be used in production. PVH has
the potential to combine the best trade-offs of all virtualization modes, while simplifying the Xen
architecture.
FIGURE.2.15. Difference between HVM (and its variants), PV and PVH

In a nutshell, PVH means less code and fewer interfaces in Linux/FreeBSD: consequently it has
a smaller TCB and attack surface, and thus fewer possible exploits. Once hardened and
optimised, it should also have better performance and lower latency, in particular on 64-bit
hosts. PVH requires support in the guest operating system and is enabled with pvh=1 in the
configuration file.
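As a purely illustrative fragment, a PVH guest configuration might look like the one below.
Only the pvh=1 setting comes from the text above; the remaining keys are ordinary xl
guest-configuration options with hypothetical values and paths.

    # hypothetical xl guest configuration fragment
    name   = "pvh-guest"
    kernel = "/boot/vmlinuz-guest"    # illustrative path to a PVH-capable kernel
    memory = 1024
    vcpus  = 2
    pvh    = 1                        # enable the PVH virtualization mode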
Status
1. PVH Guest support is available from Xen 4.4
2. PVH Dom0 support is available from Xen 4.5

3. PVH from Xen 4.7+: in late 2015 we started an initiative to re-architect and simplify the
PVH architecture, as the original implementation has some limitations. This effort has been
dubbed HVMLite or PVHv2 on the development list. From a user's perspective, the PVH
re-work will behave as PVH does, but
implementation wise it uses much more of the HVM code execution path rather than the PV
execution path (which is why developers dubbed it HVMLite on the mailing lists and in a few
developer facing presentations). At this stage, we have not decided yet how to call the feature
once it is complete, but most likely we will stick with PVH to avoid confusion, although on the
mailing list you may still see references to HVMLite. Once complete, the original PVH
implementation will still be available for a bit, but will eventually be replaced by the new
version.
2.11 Hyper-V
This section describes the Hyper-V role in Windows Server, practical uses for the role, the most
significant new or updated functionality in this version compared to previous versions of
Hyper-V, hardware requirements, and the operating systems (known as guest operating systems)
supported for use in a Hyper-V virtual machine.
A beta version of Hyper-V was shipped with certain x86-64 editions of Windows Server 2008.
The finalized version was released on June 26, 2008 and was delivered through Windows
Update. Hyper-V has since been released with every version of Windows Server.
Microsoft provides Hyper-V through two channels:
1. Part of Windows: Hyper-V is an optional component of Windows Server 2008 and later. It is
also available in x64 SKUs of Pro and Enterprise editions of Windows 8, Windows
8.1 and Windows 10.
2. Hyper-V Server: It is a freeware edition of Windows Server with limited functionality and
the Hyper-V component.

2.11.1 Hyper-V Server


Hyper-V Server 2008 was released on October 1, 2008. It consists of Windows Server 2008
Server Core and Hyper-V role; other Windows Server 2008 roles are disabled, and there are
limited Windows services. Hyper-V Server 2008 is limited to a command-line interface used to
configure the host OS, physical hardware and software. A menu driven CLI interface and some
freely downloadable script files simplify configuration. In addition, Hyper-V Server supports
remote access via Remote Desktop Connection. However, administration and configuration of
the host OS and the guest virtual machines is generally done over the network, using
either Microsoft Management Consoles on another Windows computer or System Center Virtual
Machine Manager. This allows much easier "point and click" configuration, and monitoring of
the Hyper-V Server.
Hyper-V Server 2008 R2 (an edition of Windows Server 2008 R2) was made available in
September 2009 and includes Windows PowerShell v2 for greater CLI control. Remote access to
Hyper-V Server requires CLI configuration of network interfaces and Windows Firewall. Also
using a Windows Vista PC to administer Hyper-V Server 2008 R2 is not fully supported.

2.11.2 Architecture
Hyper-V implements isolation of virtual machines in terms of a partition. A partition is a logical
unit of isolation, supported by the hypervisor, in which each guest operating system executes. A
hypervisor instance has to have at least one parent partition, running a supported version
of Windows Server (2008 and later). The virtualization stack runs in the parent partition and has
direct access to the hardware devices. The parent partition then creates the child partitions which
host the guest OSs. A parent partition creates child partitions using the hypercall API, which is
the application programming interface exposed by Hyper-V.

FIGURE.2.16. Hyper-V server architecture

A child partition does not have access to the physical processor, nor does it handle its
real interrupts. Instead, it has a virtual view of the processor and runs in Guest Virtual Address,
which, depending on the configuration of the hypervisor, might not necessarily be the
entire virtual address space. Depending on VM configuration, Hyper-V may expose only a
subset of the processors to each partition. The hypervisor handles the interrupts to the processor,
and redirects them to the respective partition using a logical Synthetic Interrupt
Controller (SynIC). Hyper-V can hardware accelerate the address translation of Guest Virtual
Address-spaces by using second level address translation provided by the CPU, referred to
as EPT on Intel and RVI (formerly NPT) on AMD.
Child partitions do not have direct access to hardware resources, but instead have a virtual view
of the resources, in terms of virtual devices. Any request to the virtual devices is redirected via
the VMBus to the devices in the parent partition, which will manage the requests. The VMBus is
a logical channel which enables inter-partition communication. The response is also redirected
via the VMBus. If the devices in the parent partition are also virtual devices, it will be redirected
further until it reaches the parent partition, where it will gain access to the physical devices.
Parent partitions run a Virtualization Service Provider (VSP), which connects to the VMBus and
handles device access requests from child partitions. Child partition virtual devices internally run
a Virtualization Service Client (VSC), which redirect the request to VSPs in the parent partition
via the VMBus. This entire process is transparent to the guest OS.
Virtual devices can also take advantage of a Windows Server Virtualization feature,
named Enlightened I/O, for storage, networking and graphics subsystems, among others.
Enlightened I/O is specialized virtualization-aware implementation of high level communication
protocols like SCSI to take advantage of VMBus directly, that allows bypassing any device
emulation layer. This makes the communication more efficient, but requires the guest OS to
support Enlightened I/O. Currently only the following operating systems support Enlightened
I/O, allowing them therefore to run faster as guest operating systems under Hyper-V than other
operating systems that need to use slower emulated hardware:
1. Windows Server 2008 and later
2. Windows Vista and later
3. Linux with a 3.4 or later kernel
4. FreeBSD
2.11.3 Role and technology description
The Hyper-V role enables you to create and manage a virtualized computing environment by
using virtualization technology that is built in to Windows Server. Installing the Hyper-V role
installs the required components and optionally installs management tools. The required
components include Windows hypervisor, Hyper-V Virtual Machine Management Service, the
virtualization WMI provider, and other virtualization components such as the virtual machine
bus (VMbus), virtualization service provider (VSP) and virtual infrastructure driver (VID).
The management tools for the Hyper-V role consist of:
1. GUI-based management tools: Hyper-V Manager, a Microsoft Management Console
(MMC) snap-in, and Virtual Machine Connection, which provides access to the video
output of a virtual machine so you can interact with the virtual machine.
2. Hyper-V-specific cmdlets for Windows PowerShell. Windows Server 2012 includes a
Hyper-V module, which provides command-line access to all the functionality available
in the GUI, as well functionality not available through the GUI. For more information
about the Hyper-V module, see Hyper-V Module for Windows PowerShell.
If you use Server Manager to install the Hyper-V role, the management tools are included unless
you specifically exclude them. If you use Windows PowerShell to install the Hyper-V role, the
management tools are not included by default. To install the tools, use the
-IncludeManagementTools parameter.
The Hyper-V technology virtualizes hardware to provide an environment in which you can run
multiple operating systems at the same time on one physical computer. Hyper-V enables you to
create and manage virtual machines and their resources. Each virtual machine is an isolated,
virtualized computer system that can run its own operating system. The operating system that
runs within a virtual machine is called a guest operating system.

2.11.4 Practical applications


Hyper-V provides infrastructure so you can virtualize applications and workloads to support a
variety of business goals aimed at improving efficiency and reducing costs, such as:
1. Establish or expand a private cloud environment. Hyper-V can help you move to or
expand use of shared resources and adjust utilization as demand changes, to provide more
flexible, on-demand IT services.
2. Increase hardware utilization. By consolidating servers and workloads onto fewer, more
powerful physical computers, you can reduce consumption of resources such as power
and physical space.
3. Improve business continuity. Hyper-V can help you minimize the impact of both
scheduled and unscheduled downtime of your workloads.
4. Establish or expand a virtual desktop infrastructure (VDI). A centralized desktop strategy
with VDI can help you increase business agility and data security, as well as simplify
regulatory compliance and management of the desktop operating system and
applications. Deploy Hyper-V and Remote Desktop Virtualization Host (RD
Virtualization Host) on the same physical computer to make personal virtual desktops or
virtual desktop pools available to your users.
5. Increase efficiency in development and test activities. You can use virtual machines to
reproduce different computing environments without the need for acquiring or
maintaining all the hardware you would otherwise need.

2.11.5 System requirements


1. Host operating system:
 To install the Hyper-V role, Windows Server 2008, Windows Server 2008 R2 Standard,
Enterprise or Datacenter edition, Windows Server 2012 Standard or Datacenter
edition, Windows 8 (or 8.1) Pro or Enterprise edition, or Windows 10 Pro, Education or
Enterprise edition, is required. Hyper-V is only supported on x86-64 variants of
Windows.

 It can be installed regardless of whether the installation is a full or core installation.


2. Processor:
 An x86-64 processor
 Hardware-assisted virtualization support: This is available in processors that include a
virtualization option; specifically, Intel VT or AMD Virtualization (AMD-V, formerly
code-named "Pacifica").
 An NX bit-compatible CPU must be available and Hardware Data Execution
Prevention (DEP) must be enabled.
 Although this is not an official requirement, Windows Server 2008 R2 and a CPU
with second-level address translation support are recommended for workstations.
 Second-level address translation is a mandatory requirement for Hyper-V in Windows 8.
3. Memory
 Minimum 2 GB. (Each virtual machine requires its own memory, and so realistically
much more.)
 Windows Server 2008 Standard (x64) Hyper-V full GUI or Core supports up to 31 GB of
memory for running VMs, plus 1 GB for the Hyper-V parent OS.[14]
 Maximum total memory per system for Windows Server 2008 R2 hosts: 32 GB
(Standard) or 2 TB (Enterprise, Datacenter)
 Maximum total memory per system for Windows Server 2012 hosts: 4 TB
4. Guest operating systems
 Hyper-V in Windows Server 2008 and 2008 R2 supports virtual machines with up to 4
processors each (1, 2, or 4 processors depending on guest OS-see below)
 Hyper-V in Windows Server 2012 supports virtual machines with up to 64 processors
each.
 Hyper-V in Windows Server 2008 and 2008 R2 supports up to 384 VMs per system
 Hyper-V in Windows Server 2012 supports up to 1024 active virtual machines per
system.
 Hyper-V supports both 32-bit (x86) and 64-bit (x64) guest VMs.
UNIT – III

MONITORING AND MANAGEMENT

3.1 AN ARCHITECTURE FOR FEDERATED CLOUD COMPUTING

3.1.1 THE BASIC PRINCIPLES OF CLOUD COMPUTING


In this section we unravel a set of principles that enable Internet scale cloud computing services.
We seek to highlight the fundamental requirement from the providers of cloud computing to
allow virtual applications to freely migrate, grow, and shrink.

3.1.1.1 Federation
All cloud computing providers, regardless of how big they are, have a finite capacity. To grow
beyond this capacity, cloud computing providers should be able to form federations of providers
such that they can collaborate and share their resources. The need for federation-capable cloud
computing offerings is also derived from the industry trend of adopting the cloud computing
paradigm
internally within companies to create private clouds and then being able to extend these clouds
with resources leased on-demand from public clouds. Any federation of cloud computing
providers should allow virtual applications to be deployed across federated sites. Furthermore,
virtual applications need to be completely location free and allowed to migrate in part or as a
whole between sites. At the same time, the security, privacy, and independence of the federation
members must be maintained to allow competing providers to federate.

3.1.1.2 Independence
Just as in other utilities, where we get service without knowing the internals of the utility
provider and with standard equipment not specific to any provider (e.g., telephones), for cloud
computing services to really fulfil the computing as a utility vision, we need to offer cloud
computing users full independence. Users should be able to use the services of the cloud without
relying on any provider specific tool, and cloud computing providers should be able to manage
their infrastructure without exposing internal details to their customers or partners. As a
consequence of the independence principle, all cloud services need to be encapsulated and
generalized such that users will be able to acquire equivalent virtual resources at different
providers.

3.1.1.3 Isolation
Cloud computing services are, by definition, hosted by a provider that will simultaneously host
applications from many different users. For these users to move their computing into the cloud,
they need warranties from the cloud computing provider that their workloads and data are
completely isolated from others. Users must be ensured that their resources cannot be accessed
by others sharing the
same cloud and that adequate performance isolation is in place to ensure that no other user may
possess the power to directly affect the service granted to their application.

3.1.1.4 Elasticity
One of the main advantages of cloud computing is the capability to provide, or release, resources
on-demand. These “elasticity” capabilities should be enacted automatically by cloud computing
providers to meet demand variations, just as electrical companies are able (under normal
operational circumstances) to automatically deal with variances in electricity consumption levels.
Clearly the behaviour and limits of automatic growth and shrinking should be driven by contracts
and rules agreed on between cloud computing providers and consumers. The ability of users to
grow their applications when facing an increase in real-life demand needs to be complemented
by the ability to scale. Cloud computing services as offered by a federation of infrastructure
providers are expected to offer any user application of any size the ability to quickly scale up
by an unrestricted magnitude and approach Internet scale. At the same time, user applications
should be allowed to scale down when facing decreasing demand. Such scalability, although
dependent on the internals of the user application, is a prime driver for cloud computing
because it helps users to better match expenses with gains.

3.1.1.5 Business Orientation


Before enterprises move their mission critical applications to the cloud, cloud computing
providers will need to develop the mechanisms to ensure quality of service (QoS) and proper
support for service-level agreements (SLAs). More than ever before, cloud computing offers
challenges with regard to the articulation of a meaningful language that will help encompass
business requirements and that has translatable and customizable service parameters for
infrastructure providers.

3.1.1.6 Trust
Probably the most critical issue to address before cloud computing can become the preferred
computing paradigm is that of establishing trust. Mechanisms to build and maintain trust
between cloud computing consumers and cloud computing providers, as well as between cloud
computing providers among themselves, are essential for the success of any cloud computing
offering.

3.1.2 A MODEL FOR FEDERATED CLOUD COMPUTING


In our model for federated cloud computing we identify two major types of actors: Service
Providers (SPs) are the entities that need computational resources to offer some service.
However, SPs do not own these resources; instead, they lease them from Infrastructure Providers
(IPs), which provide them with a seemingly infinite pool of computational, network, and storage
resources.
A Service Application is a set of software components that work collectively to achieve a
common goal. Each component of such a service application executes in a dedicated VEE. SPs
deploy service applications in the cloud by providing an IP, known as the primary site, with
a Service Manifest—that is, a document that defines the structure of the application as well as
the contract and SLA between the SP and the IP. To create the illusion of an infinite pool of
resources, IPs share their unused capacity with each other to create a federation cloud. A
Framework Agreement is a document that defines the contract between two IPs—that is, it states
the terms and conditions under which one IP can use resources from another IP.
Within each IP, optimal resource utilization is achieved by partitioning physical resources,
through a virtualization layer, into Virtual Execution Environments (VEEs)—fully isolated
runtime environments that abstract away the physical characteristics of the resource and enable
sharing. We refer to the virtualized computational resources, alongside the virtualization layer
and all the management enablement components, as the Virtual Execution Environment Host
(VEEH).
With these concepts in mind, we can proceed to define a reference architecture for federated
cloud computing. The design and implementation of such architecture are the main goals of the
RESERVOIR European research project. The RESERVOIR architecture, shown in Figure 3.1,
identifies the major functional components needed within an IP to fully support the cloud
computing paradigm. The rationale behind this particular layering is to keep a clear separation of
concerns and responsibilities and to hide low-level infrastructure details and decisions from high-
level management and service providers.

FIGURE 3.1. The RESERVOIR architecture: major components and interfaces.

 The Service Manager is the only component within an IP that interacts with SPs. It
receives Service Manifests, negotiates pricing, and handles billing. Its two most complex tasks
are (1) deploying and provisioning VEEs based on the Service Manifest and (2) monitoring and
enforcing SLA compliance by throttling a service application’s capacity.

 The Virtual Execution Environment Manager (VEEM) is responsible for the optimal
placement of VEEs into VEE Hosts subject to constraints determined by the Service Manager.
The continuous optimization process is driven by a site-specific programmable utility function.
The VEEM is free to place and move VEEs anywhere, even on the remote sites (subject to
overall cross-site agreements), as long as the placement satisfies the constraints. Thus, in
addition to serving local requests (from the local Service Manager), VEEM is responsible for the
federation of remote sites.

 The Virtual Execution Environment Host (VEEH) is responsible for the basic control
and monitoring of VEEs and their resources (e.g., creating a VEE, allocating additional resources
to a VEE, monitoring a VEE, migrating a VEE, creating a virtual network and storage pool, etc.).
Given that VEEs belonging to the same application may be placed on multiple VEEHs and even
extend beyond the boundaries of a site, VEEHs must support isolated virtual networks that span
VEEHs and sites. Moreover, VEEHs must support transparent VEE migration to any compatible
VEEH within the federated cloud, regardless of site location or network and storage
configurations.

The layered design stresses the use of standard, open, and generic protocols and interfaces to
support vertical and horizontal interoperability between layers. Different implementations of
each layer will be able to interact with each other. The Service Management Interface (SMI) with
its service manifest exposes a standardized interface into the RESERVOIR cloud for service
providers. The service provider may then choose among RESERVOIR cloud providers, knowing
that they share a common language to express their business requirements. The VEE
Management Interface (VMI) simplifies the introduction of different and independent IT
optimization strategies without disrupting other layers or peer VEEMs. Furthermore, VMI’s
support of VEEM-to-VEEM communication simplifies cloud federation by limiting the
horizontal interoperability to one layer of the stack. The VEE Host Interface (VHI) will support
plugging-in of new virtualization platforms (e.g., hypervisors), without requiring VEEM
recompilation or restart. RESERVOIR’s loosely coupled stack reference architecture should
promote a variety of innovative approaches to support cloud computing.
3.1.3 Features of Federation Types
Federations of clouds may be constructed in various ways, with disparate feature sets offered by
the underlying implementation architecture. This section is devoted to present these
differentiating features. Using these features as a base, a number of federation scenarios are
defined, comprised of subsets of this feature set. The first feature to consider is the framework
agreement support: Framework agreements, as defined in the previous section, may either be
supported by the architecture or not. If framework agreements are not supported, this implies that
federation may only be carried out in a more ad hoc opportunistic manner. Another feature is the
opportunistic placement support. If framework agreements are not supported by the architecture,
or if there is not enough spare capacity even including the framework agreements, a site may
choose to perform opportunistic placement. It is a process where remote sites are queried on-
demand as the need for additional resources arises, and the local site requests a certain SLA-
governed capacity for a given cost from the remote sites.
One interesting feature to take into account is the advance resource reservation support. This
feature may be used both when there is an existing framework agreement and when opportunistic
placement has been performed. Both types of advance reservations are only valid for a certain
time, since they impact the utilization of resources at a site. Because of this impact, they should
be billed as actual usage during the active time interval.
The ability to migrate machines across sites defines the federated migration support. There are
two types of migration: cold and hot (or live). In cold migration, the VEE is suspended and
experiences a certain amount of downtime while it is being transferred. Most modern operating
systems have support for being suspended, which includes saving all RAM contents to disk and
later restoring the runtime state to its prior state. Hot or live migration does not allow for system
downtime, and it works by transferring the runtime state while the VEE is still running.
Focusing on networks, there can be cross-site virtual network support: VEEs belonging to a
service are potentially connected to virtual networks, should this be requested by the SP. Ideally,
these virtual networks will span across sites. However, this requires substantial effort and
advanced features of the underlying architecture. Along the same lines, the federation can offer
retention of public IP addresses after cross-site migration. With fully virtualized networks, this may be a
directly supported feature; but even if virtualized networks are not available,
it may still be possible to maintain public IP addresses by manipulating routing information.
Information disclosure within the federation has also to be taken into account. The sites in the
federation may provide information to different degrees (for instance, the information exchange
between sites may be larger within the same administrative domain than outside it). Information
regarding deployed VEEs will be primarily via the monitoring system, whereas some
information may also potentially be exposed via the VMI as response to a VEE deployment
request.
The last identified feature useful to define the scenarios is VMI operation support: depending on
the requirements of the federation scenario, only a subset of the VMI operations may be made
available. Which operations are required may be related to the amount of information that is
exposed by the remote sites; access to more information may also increase the possibility and
need to manipulate the deployed VEEs.

3.1.4 Federation Scenarios


In this section, a number of federation scenarios are presented, ranging from a baseline case to a
full-featured federation. These scenarios have various requirements on the underlying
architecture, and we use the features presented in previous section as the basis for differentiating
among them.
The baseline federation scenario provides only the very basic support required for
opportunistic placement of VEEs at a remote site. Migration is not supported, nor does it resize
the VEEs once placed at the remote site. Advanced features such as virtual networks across site
boundaries are also not supported. The baseline federation should be possible to build on top of
most public cloud offerings, which is important for interoperability. The basic
federation scenario includes a number of features that the baseline federation does not, such as
framework agreements, cold migration, and retention of public IP addresses. Notably missing is
(a) support for hot migration and (b) cross-site virtual network functionality. This scenario offers
a useful cloud computing federation with support for site collaboration in terms of framework
agreements without particularly high technological requirements on the underlying architecture
in terms of networking support. The advanced federation scenario offers
advanced functionality such as cross-site virtual network support. The feature most notably
missing is hot migration, and the monitoring system also does not disclose VEE substate
metadata information. The full featured federation scenario offers the most complete set of
features, including hot migration of VEEs.

3.1.5 Layers Enhancement for Federation


Taking into account the different types of federation, a summary of the features needed in the
different layers of the RESERVOIR architecture to achieve federation is presented.

Service Manager. The baseline federation is the most basic federation scenario, but even here
the SM must be allowed to specify placement restrictions when a service is deployed.
Deployment restrictions are associated with a specific VEE (although the restriction expression
could involve other VEEs, as can be seen in the affinity restrictions below) and passed down to
the VEEM along with any other specific VEE metadata when the VEE is issued for creation
through VMI. They specify a set of constraints that must be held when the VEE is created, so
they can be seen as some kind of “contour conditions” that determine the domain that can be
used by the placement algorithm run at VEEM layer. Two kinds of deployment restrictions are
envisioned: First, there are affinity restrictions, related to the relations between VEEs; and
second, there can be site restrictions, related to sites.
In the basic federation scenario, federation uses framework agreement (FA) between
organizations to set the terms and conditions for federation. Framework agreements are
negotiated and defined by individuals, but they are encoded at the end in the service manager
(SM)—in particular, within the business information data base (BIDB). The pricing information
included in the FA is used by the SM to calculate the cost of resources running in remote systems
(based on the aggregated usage information that it received from the local VEEM) and correlate
this information with the charges issued by those remote sites. The SM should be able to include
as part of the VEE metadata a "price hint vector" consisting of a sequence of numbers, each one
representing an estimation of the relative cost of deploying the VEE on each federated site. The
SM calculates this vector based on the FA established with the other sites.
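The text does not prescribe a concrete data layout for the price hint vector, so the following C
sketch uses assumed names and structures purely for illustration; it only shows how a placement
component might combine such a vector with site restrictions to pick the cheapest admissible
federated site.

    /* Illustrative sketch only: the structures and the selection function are
     * hypothetical, not part of the RESERVOIR interfaces. */
    #include <stddef.h>
    #include <stdbool.h>

    struct vee_metadata {
        const double *price_hint;   /* relative cost estimate per federated site    */
        const bool   *site_allowed; /* site restrictions set by the Service Manager */
        size_t        num_sites;
    };

    /* Return the index of the cheapest site that satisfies the restrictions,
     * or -1 when no federated site is admissible. */
    static int cheapest_admissible_site(const struct vee_metadata *vee)
    {
        int best = -1;
        for (size_t i = 0; i < vee->num_sites; i++) {
            if (!vee->site_allowed[i])
                continue;
            if (best < 0 || vee->price_hint[i] < vee->price_hint[best])
                best = (int)i;
        }
        return best;
    }

In practice the VEEM's policy engine would also weigh such hints against its site-specific
utility function before committing to a remote deployment.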
Given that the advanced federation scenario supports migration, the placement restrictions have
to be checked not only at service deployment time but also for migration. In addition, the SM
could update the deployment restrictions during the service lifespan, thereby changing the
“contour conditions” used by the placement algorithm. When the VEE is migrated across sites,
its deployment restrictions are included along with any other metadata associated with the VEE.
On the other hand, no additional functionality is needed from the service manager to implement
the full-featured federation.

Virtual Execution Environment Manager. Very little is needed from the VEEM in the baseline
federation scenario. The only requirement will be the ability to deploy a VEE in the remote
site, so it will need a plug-in that can communicate with the remote cloud by invoking the public
API. This will satisfy the opportunistic placement requirement. For the different features offered
by the basic federation scenario, the VEEM will need framework agreement support, since it is
necessary that the VEEM implements a way to tell whether it can take care of the VEE or not,
according to the SLAs defined in the framework agreement. The best module in the VEEM for the SLA
evaluation to take place is the admission control of the policy engine. Also, cold migration is
needed; therefore the VEEM needs the ability to signal the hypervisor to save the VEE state (this
is part of the VEEM life-cycle module) and also the ability to transfer the state files to the remote
site. Additionally, the VEEM needs to be able to signal the hypervisor to restore the VEE state
and resume its execution (also part of the VEEM life-cycle module). Regarding advance resource
reservation support, the policy engine must be capable of reserving capacity in the physical
infrastructure given a timeframe for certain VEEs.
In the advanced federation scenario, the ability to create cross-site virtual networks for the VEEs
has to be achieved using the functionality offered by the virtual application network (VAN) as
part of the virtual host interface API. Therefore, the VEEM needs to correctly interface with the
VAN and be able to express the virtual network characteristics in a VEEM-to-VEEM
connection. In the full-featured federation scenario, live migration will also need to be supported
in the VHI API. The VEEM will just need to invoke the live migration functionality of the
hypervisor through the VHI API to achieve live migration across administrative domains.

Virtual Execution Environment Host. The ability to monitor a federation is needed. The
RESERVOIR monitoring service supports the asynchronous monitoring of a cloud data center's
VEEHs, their VEEs, and the applications running inside the VEEs. To support federation, the
originating data center must be able to monitor VEEs and their applications running at a remote
site. When an event occurs related to a VEE running on a remote site, it is published and a
remote proxy forwards the request to the subscribing local proxy, which in turn publishes the
event to the waiting local subscribers. The monitoring framework is agnostic to type and source
of data being monitored and supports the dynamic creation of new topics.
No further functionality is required for the basic federation in the VEEH apart from the features
described for the baseline scenario. On the other hand, for the advanced federation one, several
features are needed. First, it must have the ability to implement federated network service with
virtual application networks (VANs), a novel overlay network that enables virtual network
services across subnets and across administrative boundaries. VANs enable the establishment of
large-scale virtual networks, free of any location dependency, which in turn allows completely
"migratable" virtual networks: (1) the offered virtual network service is fully isolated, (2) it
enables sharing of hosts, network devices, and physical connections, and (3) it hides network-
related physical characteristics such as link throughputs, location of hosts, and so forth. Also, the
ability to do federated migration with non-shared storage service is required. RESERVOIR
enhances the standard VM migration capability typically available in every modern hypervisor
with support for environments in which the source and the destination hosts do not share storage;
typically, the disk(s) of the migrated VM reside in shared storage.
Regarding the full-featured federation scenario, hot migration is the functionality that most
affects what is demanded from the VEEH in this scenario. RESERVOIR's separation principle
requires that each RESERVOIR site be an autonomous entity. Site configuration, topology, and
so on, are not shared between sites. So one site is not aware of the host addresses on another site.
However, currently VM migration between hosts requires that the source and destination
hypervisors know each other’s addresses and transfer a VM directly from the source host to the
destination host. In order to overcome this apparent contradiction, RESERVOIR introduces a
novel federated migration channel to transfer a VEE from one host to another host without
directly addressing the destination host. Instead of transferring the VEE directly to the
destination host, it passes through proxies at the source site and destination site, solving the
unknown hypervisor location problem.
3.2 SLA management in cloud computing
SLA management of applications hosted on cloud platforms involves five phases.
1. Feasibility
2. On-boarding
3. Pre-production
4. Production
5. Termination

Different activities performed under each of these phases are shown in Figure 3.2. These
activities are explained in detail in the following subsections.

3.2.1 Feasibility Analysis


The MSP conducts the feasibility study of hosting an application on its cloud platform. This study
involves three kinds of feasibility: (1) technical feasibility, (2) infrastructure feasibility, and (3)
financial feasibility. The technical feasibility of an application implies determining the
following:
i. Ability of an application to scale out.
ii. Compatibility of the application with the cloud platform being used within the MSP’s data
center.
iii. The need and availability of a specific hardware and software required for hosting and
running of the application.
iv. Preliminary information about the application performance and whether they can be met by
the MSP.

Performing the infrastructure feasibility involves determining the availability of infrastructural
resources in sufficient quantity so that the projected demands of the application can be met. The
financial feasibility study involves determining the approximate cost to be incurred by the MSP
and the price the MSP charges the customer so that the hosting activity is profitable to both of
them. A feasibility report consists of the results of the above three feasibility studies. The report
forms the basis for further communication with the customer. Once the provider and customer
agree upon the findings of the report, the outsourcing of the application hosting activity proceeds
to the next phase, called "on-boarding" of the application. Only the basic feasibility of hosting an
application has been carried out in this phase. However, the detailed runtime characteristics of the
application are studied as part of the on-boarding activity.

FIGURE 3.2. Flowchart of the SLA management in cloud


3.2.2 On-Boarding of Application
Once the customer and the MSP agree in principle to host the application based on the findings of
the feasibility study, the application is moved from the customer servers to the hosting platform.
Moving an application to the MSP’s hosting platform is called on-boarding. As part of the on-
boarding activity, the MSP understands the application runtime characteristics using runtime
profilers. This helps the MSP to identify the possible SLAs that can be offered to the customer
for that application. This also helps in creation of the necessary policies (also called rule sets)
required to guarantee the SLOs mentioned in the
application SLA. The application is accessible to its end users only after the on boarding activity
is completed.

On-boarding activity consists of the following steps:


i. Packing of the application for deploying on physical or virtual environments. Application
packaging is the process of creating deployable components on the hosting platform (could
be physical or virtual). Open Virtualization Format (OVF) standard is used for packaging the
application for cloud platform.
ii. The packaged application is executed directly on the physical servers to capture and analyze
the application performance characteristics. It allows the functional validation of customer’s
application. Besides, it provides a baseline performance value for the application in no virtual
environment. This can be used as one of the data points for customer’s performance
expectation and for application SLA. Additionally, it helps to identify the nature of
application—that is, whether it is CPU-intensive or I/O intensive or network-intensive and
the potential performance bottlenecks.
iii. The application is executed on a virtualized platform and the application performance
characteristics are noted again. Important performance characteristics like the application’s
ability to scale (out and up) and performance bounds (minimum and maximum performance)
are noted.
iv. Based on the measured performance characteristics, different possible SLAs are identified.
The resources required and the costs involved for each SLA are also computed.
v. Once the customer agrees to the set of SLOs and the cost, the MSP starts creating different
policies required by the data center for automated management of the application. This
implies that the management system should automatically infer the amount of system
resources that should be allocated/de-allocated to/from appropriate components of the
application when the load on the system increases/decreases. These policies are of three
types: (1) business, (2) operational, and (3) provisioning. Business policies help prioritize
access to the resources in case of contentions. Business policies are in the form of weights for
different customers or group of customers. Operational policies are the actions to be taken
when different thresholds/conditions are reached. Also, the actions when thresholds/
conditions/triggers on service-level parameters are breached or about to be breached are
defined. The corrective action could be different types of provisioning such as scale-up,
scale-down, scale-out, scale-in, and so on, of a particular tier of an application. Additionally,
notification and logging actions (notify the enterprise application's administrator, etc.) are
also defined (a sketch of such a rule is shown after this list).
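The operational and provisioning policies described in step (v) are essentially threshold-action
rules. The C sketch below is hypothetical; the metric, thresholds, and actions are invented for
illustration and are not taken from any particular MSP's policy engine.

    /* Hypothetical threshold-action rule for one application tier. */
    enum scaling_action { NO_ACTION, NOTIFY_ADMIN, SCALE_OUT, SCALE_IN };

    struct tier_policy {
        double scale_out_threshold; /* e.g. 80% average CPU across the tier */
        double warn_threshold;      /* e.g. 70% -> notify and log           */
        double scale_in_threshold;  /* e.g. 30% -> release spare capacity   */
    };

    static enum scaling_action evaluate_policy(const struct tier_policy *p,
                                               double tier_load)
    {
        if (tier_load >= p->scale_out_threshold)
            return SCALE_OUT;       /* provisioning action                  */
        if (tier_load >= p->warn_threshold)
            return NOTIFY_ADMIN;    /* notification/logging action          */
        if (tier_load <= p->scale_in_threshold)
            return SCALE_IN;        /* scale back when demand decreases     */
        return NO_ACTION;
    }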

3.2.3 Preproduction
Once the determination of policies is completed as discussed in previous phase, the application is
hosted in a simulated production environment. This allows the customer to verify and validate
the MSP's findings on the application's runtime characteristics and agree on the defined SLA. Once
both parties agree on the cost and the terms and conditions of the SLA, the customer sign-off is
obtained. On successful completion of this phase the MSP allows the application to go on-live.

3.2.4 Production
In this phase, the application is made accessible to its end users under the agreed SLA. However,
there could be situations when the managed application tends to behave differently in a
production environment compared to the preproduction environment. This in turn may cause
sustained breach of the terms and conditions mentioned in the SLA. Additionally, customer may
request the MSP for inclusion of new terms and conditions in the SLA. If the application SLA is
breached frequently or if the customer requests for a new non-agreed SLA, the on-boarding
process is performed again. In the case of the former, on-boarding activity is repeated to analyze
the application and its policies with respect to SLA fulfilment. In case of the latter, a new set of
policies are formulated to meet the fresh terms and conditions of the SLA.

3.2.5 Termination
When the customer wishes to withdraw the hosted application and does not wish to continue to
avail the services of the MSP for managing the hosting of its application, the termination activity
is initiated. On initiation of termination, all data related to the application are transferred to the
customer and only the essential information is retained for legal compliance. This ends the
hosting relationship between the two parties for that application, and the customer sign-off is
obtained.

3.3 Performance prediction for HPC on Clouds


This section will discuss the issues linked to the adoption of the cloud paradigm in the HPC
context. In particular, we will focus on three different issues:
i. The difference between typical HPC paradigms and those of current cloud environments,
especially in terms of performance evaluation.
ii. A comparison of the two approaches in order to point out their advantages and drawbacks, as
far as performance is concerned.
iii. New performance evaluation techniques and tools to support HPC in cloud systems.

As outlined in the previous sections, the adoption of the cloud paradigm for HPC is a flexible
way to deploy (virtual) clusters dedicated to execute HPC applications. The switch from a
physical to a virtual cluster is completely transparent for the majority of HPC users, who have
just terminal access to the cluster and limit themselves to “launch” their tasks. The first and well-
known difference between HPC and cloud environments is the different economic approach: (a)
buy-and-maintain for HPC and (b) pay-per-use in cloud systems. In the latter, every time that a
task is started, the user will be charged for the used resources. But it is very hard to know in
advance which will be the resource usage and hence the cost. On the other hand, even if the
global expense for a physical cluster is higher, once the system has been acquired, all the costs
are fixed and predictable (at least as long as the system is not faulty). It would be great to
predict, albeit approximately, the resource usage of a target application in a cloud, in order to
estimate the cost of its execution.
These two issues above are strictly related, and a performance problem becomes an economic
problem. Let us assume that a given application is well optimized for a physical cluster. If it
behaves on a virtual cluster as on the physical one, it will use the cloud resources in an efficient
way, and its execution will be relatively cheap. This is not so trivial as it may seem, as the pay-
per-use paradigm commonly used in commercial clouds (see Table 3.1) charges the user for
virtual cluster up-time, not for CPU usage. Almost surprisingly, this means that processor idle
time has a cost for cloud users.
For clarity’s sake, it is worth presenting a simple but interesting example regarding performance
and cost. Let us consider two different virtual clusters with two and four nodes, respectively. Let
us assume that the application is well-optimized and that, at least for a small number of
processors, it gets linear speed-up. The target application will be executed in two hours in the
first cluster and in one hour in the second one. Let the execution cost be X dollars per hour per
machine instance (virtual node). This is similar to the charging scheme of
EC2. The total cost is given by
(Cost per hour per instance) * (number of instances) * (hours)
In the first case (two-node cluster) the cost will be X*2*2, whereas in the second one it will be
X*1*4. It turns out that the two configurations have the same cost for the final user, even if the
first execution is slower than the second. Now if we consider an application that is not well-
optimized and has a speed-up less than the ideal one, the running time on the large virtual cluster
will be longer than two hours; as a consequence, the cost of the run of the second virtual cluster
will be higher than that on the small one. In conclusion: In clouds, performance counts two
times. Low performance means not only long waiting times, but also high costs. The use of
alternative cost factors (e.g., the RAM memory allocated, as for GoGrid in Table 3.1) leads to
completely different considerations and requires different application optimizations to reduce the
final cost of execution.
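The example above can be generalized into a simple cost model (a sketch; the symbols X, n,
T(n), and S(n) are introduced here for illustration and are not part of the original example). Let
X be the cost per hour per instance, n the number of instances, T(n) the running time on n
instances, and S(n) = T(1)/T(n) the speed-up. Then

    Total cost(n) = X * n * T(n) = X * n * T(1) / S(n)

With ideal (linear) speed-up, S(n) = n and the cost reduces to X * T(1), independent of the
number of instances; with sub-linear speed-up, S(n) < n and the cost grows with n, which is
exactly what the two-node versus four-node comparison above illustrates.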
TABLE 3.1. Example of Cost Criteria
In light of the above, it is clear that the typical HPC user would like to know how long his
application will run on the target cluster and which configuration has the highest
performance/cost ratio. The advanced user, on the other hand, would also like to know if there is a way
to optimize its application so as to reduce the cost of its run without sacrificing performance. The
high-end user, who cares more for performance than for the cost to be sustained, would like
instead to know how to choose the best configuration to maximize the performance of his
application. In other words, in the cloud world the hardware configuration is not fixed, and it is
not the starting point for optimization decisions. Configurations can be easily changed in order to
fit the user needs. All the three classes of users should resort to performance analysis and
prediction tools. But, unfortunately, prediction tools for virtual environments are not available,
and the literature presents only partial results on the performance analysis of such systems.
An additional consequence of the different way that HPC users exploit a virtual cluster is that the
cloud concept makes system dimensioning very different—that is, the choice of the system
configuration that fits the user's purposes (cost, maximum response time, etc.). An HPC machine is
chosen and acquired, aiming to be at the top of available technology (under inevitable money
constraints) and to be able to sustain the highest system usage that may eventually be required.
This can be measured in terms of GFLOPS, in terms of number of runnable jobs, or by other
indexes depending on the HPC applications that will be actually executed. In other words, the
dimensioning is made by considering the peak system usage. It takes place at system acquisition
time, by examining the machine specifications or by assembling it using hardware components
of known performance. In this phase, simple and
global performance indexes are used (e.g., bandwidth and latency for the interconnect, peak
FLOPS for the computing nodes, etc.).
In clouds, instead, the system must be dimensioned by finding out an optimal trade-off between
application performance and used resources. As mentioned above, the optimality is a concept
that is fairly different, depending on the class of users. Someone would like to obtain high
performance at any cost, whereas others would privilege economic factors. In any case, as the
choice of the system is not done once and for all, the dimensioning of the virtual clusters takes
place every time the HPC applications have to be executed on new datasets. In clouds, the
system dimensioning is a task under the control of the user, not of the system administrator. This
completely changes the scenario and makes the dimensioning a complex activity, eager for
performance data and indexes that can be measured fairly easily in the HPC world on physical
systems, but that are not generally available for complex and rapidly changing systems such as
virtual clusters.

TABLE 3.2. Differences Between “Classical” HPC and HPC in Cloud Environments
Table 3.2 summarizes the differences between HPC classical environments and HPC in clouds.
To summarize the above discussion: in systems (the clouds) where the availability of
performance data is crucial for knowing how fast your applications will run and how much you
will pay, there is great uncertainty about what to measure and how to measure it, and there are
great difficulties in interpreting the meaning of the measured data.

3.3.1 HPC Systems and HPC on Clouds: A Performance Comparison


The second step of our analysis is a performance comparison between classical HPC systems and
the new cloud paradigm. This will make it possible to point out the advantages and
disadvantages of the two approaches and will enable us to understand if and when clouds can be
useful for HPC. The performance characterization of HPC systems is usually carried out by
executing benchmarks. However, the only benchmarks for which measurements of virtual
clusters at different levels are publicly available are the following:
 The LINPACK benchmark, a so-called kernel benchmark, which aims at measuring the
peak performance (in FLOPS) of the target environment.
 The NAS Parallel Benchmarks (NPB), a set of eight programs designed to help evaluate
the performance of parallel supercomputers. They are derived from computational fluid dynamics
(CFD) applications and consist of five kernels and three pseudo-applications. As
performance indexes, together with FLOPS, they measure response time, network bandwidth usage,
and latency.
 mpptest, a microbenchmark that measures the performance of some of the basic MPI
message-passing routines under a variety of conditions. It measures (average) response
time, network bandwidth usage, and latency (a minimal sketch of this kind of measurement is
given after this paragraph).
When these benchmarks are executed on physical machines (whether clusters or other types of
parallel hardware), they give a coarse-level indication of the system's potential. In the HPC
world these benchmarks are in common use and widely diffused, but their utility is limited.
Users usually have in-depth knowledge of the target hardware used for executing their
applications, and a comparison between two different (physical) clusters makes sense only for
Top500 classification or when the machines are acquired. HPC users usually outline the
potential and the main features of their system through (a) a brief description of the hardware
and (b) a few performance indexes obtained using some of the above-presented benchmarks. In
any case, these descriptions are considered useless for application performance optimization,
because they only aim at providing a rough classification of the hardware.
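As an illustration of the kind of measurement a microbenchmark such as mpptest performs, here is a minimal MPI ping-pong sketch written with mpi4py (the use of mpi4py, the message sizes, and the repetition count are assumptions for illustration; this is not the mpptest code itself). It estimates the average round-trip latency and the effective bandwidth between two ranks.

# Minimal MPI ping-pong sketch (assumes mpi4py; run with: mpiexec -n 2 python pingpong.py)
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

REPS = 100
for size in (1, 1024, 1024 * 1024):          # message sizes in bytes (illustrative values)
    buf = np.zeros(size, dtype='b')          # byte buffer of the given size
    comm.Barrier()
    start = MPI.Wtime()
    for _ in range(REPS):
        if rank == 0:
            comm.Send(buf, dest=1, tag=0)    # ping
            comm.Recv(buf, source=1, tag=0)  # pong
        elif rank == 1:
            comm.Recv(buf, source=0, tag=0)
            comm.Send(buf, dest=0, tag=0)
    elapsed = MPI.Wtime() - start
    if rank == 0:
        rtt = elapsed / REPS                            # average round-trip time
        bandwidth = (2 * size * REPS) / elapsed / 1e6   # MB/s moved in both directions
        print(f"{size} bytes: avg RTT {rtt * 1e6:.1f} us, ~{bandwidth:.1f} MB/s")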

Published results generally point out the unsatisfactory performance of the network connection
between virtual clusters; in any case, the performance offered by virtual clusters is not
comparable to the one offered by physical clusters. Even if the results briefly reported above are
of great interest and can help to gain insight into the problem, they do not take into account the
differences between HPC machines and HPC in the cloud, which we summarized at the start of
this section. Stated another way, the mentioned analyses simply measure global performance
indexes. But the scenario can drastically change if different performance indexes are measured.
To start with, the application response time is perhaps the performance index of greatest
importance in a cloud context. In fact, it is a measurement of interest for the final user and,
above all, it has a direct impact on the cost of the application execution. An interesting
consideration linked to response time was proposed by Ian Foster in his blog [11]. The overall
application response time (RT) is given by the formula RT = (job submission time) + (execution time).
In common HPC environments (HPC systems with batch queues, grids, etc.) the job submission
time may be fairly long (even minutes or hours, due to the need to gather all the required
computing resources). On the other hand, in a cloud used to run an HPC workload (a
virtual cluster dedicated to the HPC user), queues (and waiting times) simply disappear. The
result is that, even if the virtual cluster may offer a much lower computational power, the final
response time may be comparable to that of (physical) HPC systems.
In order to take this important difference between physical and virtual environments into
account, Foster suggests evaluating the response time in terms of the probability of completion,
which is a stochastic function of time and represents the probability that the job will be
completed before that time. Note that the stochastic behaviour mainly depends on the job
submission time, whereas the execution time is usually a deterministic value. So in a virtual
cluster the probability of completion is a threshold function (it is zero before the value
corresponding to the execution time of the actual task, and one after). In a typical HPC
environment, which involves batch and queuing systems, the job submission time is stochastic
and fairly long, thus leading to a global completion time higher than the one measured on the
virtual cluster. This phenomenon opens the way to a large adoption of the cloud approach, at
least for small- or middle-sized HPC applications, where the computing power lost by using the
cloud is more tolerable. This is a context in which the grid paradigm was never largely adopted,
because of its high startup overhead.
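The following minimal Python sketch illustrates Foster's probability-of-completion argument with invented numbers (the queue-time distribution and the execution times are assumptions, not measurements): on a dedicated virtual cluster the response time is essentially deterministic, whereas on a batch HPC system the stochastic submission time spreads the completion probability over time.

# Sketch: probability of completion P(RT <= t) for a virtual cluster vs. a batch HPC system.
# All numbers below are illustrative assumptions, not measurements.
import random

def prob_completion(samples, t):
    """Empirical probability that the response time is <= t."""
    return sum(1 for rt in samples if rt <= t) / len(samples)

random.seed(0)
N = 10_000

# Virtual cluster: negligible submission time, slower but deterministic execution (3.0 h).
vc_samples = [3.0] * N

# HPC system: faster execution (1.0 h) but a stochastic queue wait (exponential, mean 2.5 h).
hpc_samples = [random.expovariate(1 / 2.5) + 1.0 for _ in range(N)]

for t in (1.0, 2.0, 3.0, 4.0, 6.0):   # hours
    print(f"t = {t:>3} h  P_vc = {prob_completion(vc_samples, t):.2f}  "
          f"P_hpc = {prob_completion(hpc_samples, t):.2f}")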

3.3.2 Supporting HPC in the Cloud


The above-presented analysis shows how the cloud approach has good chances to be widely
adopted for HPC, even if there are limits one should be aware of, before trying to switch to
virtualized systems. Moreover, the differences between “physical computing” and “virtual
computing,” along with their impact on performance evaluation, clearly show that common
performance indexes, techniques, and tools for performance analysis and prediction should be
suitably adapted to comply with the new computing paradigm.
To support HPC applications, a fundamental requirement from a cloud provider is that an
adequate service-level agreement (SLA) is granted. For HPC applications, the SLA should be
different from the ones currently offered for the most common uses of cloud systems, oriented at
transactional Web applications. The SLA should offer guarantees useful for the HPC user to
predict his application performance behaviour and hence to give formal (or semiformal)
statements about the parameters involved. At the state of the art, cloud providers offer their
SLAs in the form of a contract (hence in natural language, with no formal specification). Two
interesting examples are Amazon EC2 and GoGrid.
The first one (Amazon) stresses fault tolerance parameters (such as service uptime), offering
guarantees about system availability. There are instead no guarantees about network behaviour
(for both internal and external network), except that it will “work” 95% of the time. Moreover,
Amazon guarantees that the virtual machine instances will run using dedicated memory (i.e.,
no other VM allocated on the same physical machine will use that memory). This
statement is particularly relevant for HPC users, because it is of great help for the performance
predictability of applications.
On the other hand, GoGrid, in addition to the availability parameters, offers a clear set of
guarantees on network parameters, as shown in Table 3.3. This kind of information is of great
interest, even if the guaranteed network latency (order of milliseconds) is clearly unacceptable
for HPC applications. GoGrid does not offer guarantees about the sharing of physical computing
resources with other virtual machines. In conclusion, even if the adoption of SLA could be (part
of) a solution for HPC performance tuning, giving a clear reference for the offered virtual cluster
performances, current solutions offer too generic SLA contracts or too poor values for the
controlled parameters.
As regards performance measurement techniques and tools, along with their adaptation to
virtualized environments, it should be noted that very few performance-oriented services are
offered by cloud providers or by third parties. Usually these services simply consist of more or
less detailed performance monitoring tools, such as CloudWatch, offered by Amazon, or
CloudStatus, offered by Hyperic (and integrated in Amazon). These tools essentially measure
the performance of the cloud-internal or external network and should help the cloud user to tune
his applications. In exactly the same way as SLAs, they can be useful only for the transactional
applications that are the primary objective of cloud systems, since, at the state of the art, they do
not offer any features to predict the behaviour of long-running applications such as HPC codes.

TABLE 3.3. Service-Level Agreement of GoGrid Network

An interesting approach, although still experimental, is the one offered by solutions such as
C-Meter and PerfCloud, which offer frameworks that dynamically benchmark the target VMs or
VCs offered by the cloud. The idea is to provide a benchmark-on-demand service in order to
take into account the extreme variability of the cloud load and to evaluate its actual state
frequently. The first framework supports the GrenchMark benchmark. In more detail, the
PerfCloud project aims at providing performance evaluation and prediction services in
grid-based clouds. Besides providing services for on-demand benchmarking of virtual clusters,
the PerfCloud framework uses the benchmarking results to tune a simulator used to predict the
performance of HPC applications.
UNIT – IV

SECURITY

4.1 CLOUD SECURITY

 A computer cloud is a target-rich environment for malicious individuals and criminal


organizations.

 Major concern for existing users and for potential new users of cloud computing services.
Outsourcing computing to a cloud generates new security and privacy concerns.

 Standards, regulations, and laws governing the activities of organizations supporting


cloud computing have yet to be adopted. Many issues related to privacy, security, and
trust in cloud computing are far from being settled.

 There is the need for international regulations adopted by the countries where data
centers of cloud computing providers are located.

 Service Level Agreements (SLAs) do not provide adequate legal protection for cloud
computer users, often left to deal with events beyond their control.

4.2 CLOUD SECURITY RISKS

 Traditional threats: their impact is amplified by the vast amount of cloud resources and the
large user population that can be affected. The fuzzy boundaries of responsibility between cloud
service providers and users, and the difficulty of accurately identifying the cause of an incident,
add to the problem.

 New threats: cloud servers host multiple VMs, and multiple applications may run under each
VM. Multi-tenancy and VMM vulnerabilities open new attack channels for malicious
users. Identifying the path followed by an attacker is more difficult in a cloud environment.

 Authentication and authorization: the procedures in place for an individual do not
extend to an enterprise.
 Third-party control: generates a spectrum of concerns caused by the lack of transparency
and limited user control.

 Availability of cloud services: system failures, power outages, and other catastrophic
events could shutdown services for extended periods of time.

4.2.1 Attacks in a cloud computing environment


Three actors involved; six types of attacks possible

 The user can be attacked by:


1. Service: SSL certificate spoofing, attacks on browser caches, or phishing attacks.

2. The cloud infrastructure: attacks that either originate at the cloud or are spoofed to
originate from the cloud infrastructure.

 The service can be attacked by:


1. A user: buffer overflow, SQL injection, and privilege escalation are the common types
of attacks.

2. The cloud infrastructure: this is the most serious line of attack. It includes limiting access to
resources, privilege-related attacks, data distortion, and injecting additional operations.

 The cloud infrastructure can be attacked by:


1. A user: targets the cloud control system.

2. A service: requesting an excessive amount of resources and causing the exhaustion of


the resources.
Fig 4.1: Surfaces of attacks in a cloud computing environment.

4.2.2 Top threats to cloud computing

Identified by a 2010 Cloud Security Alliance (CSA) report:

1. The abusive use of the cloud - the ability to conduct nefarious activities from the cloud.

2. APIs that are not fully secure - may fail to protect users throughout the range of activities
from authentication and access control to the monitoring and
control of the application during runtime.

3. Malicious insiders - cloud service providers do not disclose their hiring standards and policies,
so this can be a serious threat.
4. Data loss or leakage - if the only copy of the data is stored on the cloud, then sensitive data is
permanently lost when cloud data replication fails followed by a
storage media failure.

5. Unknown risk profile - exposure to the ignorance or underestimation of the risks of cloud
computing.
6. Shared technology.

7. Account hijacking.

4.2.3 Auditability of cloud activities

The lack of transparency makes auditability a very difficult proposition for cloud computing.

 Auditing guidelines elaborated by the National Institute of Standards and Technology (NIST) are


mandatory for US Government agencies:

 the Federal Information Processing Standard (FIPS).

 the Federal Information Security Management Act (FISMA).

4.3 SECURITY - THE TOP CONCERN FOR CLOUD USERS.

 Unauthorized access to confidential information and data theft top the list of user
concerns.
 Data is more vulnerable in storage, as it is kept in storage for extended periods of time.

 Threats during processing cannot be ignored; such threats can originate from flaws in the
VMM, rogue VMs, or a VMBR.
 There is the risk of unauthorized access and data theft posed by rogue employees of a
Cloud Service Provider (CSP).

 Lack of standardization is also a major concern.


 Users are concerned about the legal framework for enforcing cloud computing security.

 Multi-tenancy is the root cause of many user concerns. Nevertheless, multi-tenancy


enables a higher server utilization, thus lower costs.

 The threats caused by multi-tenancy differ from one cloud delivery model to another.

4.3.1 Legal protection of cloud users

The contract between the user and the Cloud Service Provider (CSP) should spell out explicitly:

 CSP obligations to handle sensitive information securely and its obligation to comply
with privacy laws.

 CSP liabilities for mishandling sensitive information.

 CSP liabilities for data loss.

 The rules governing ownership of the data.

 The geographical regions where information and backups can be stored

4.4 PRIVACY

 Privacy: the right of an individual, a group of individuals, or an organization to keep


information of personal nature or proprietary information from being disclosed.

 Privacy is protected by law; sometimes laws limit privacy.

 The main aspects of privacy are: the lack of user control, potential unauthorized
secondary use, data proliferation, and dynamic provisioning.
 The digital age has confronted legislators with significant challenges related to privacy, as
new threats have emerged. For example, personal information voluntarily shared, but then
stolen from sites granted access to it or misused, can lead to identity theft.

 Privacy concerns are different for the three cloud delivery models and also depend on the
actual context.

4.4.1 Federal Trade Commission Rules

 Web sites that collect personal identifying information from or about consumers online
are required to comply with four fair information practices:

 Notice - provide consumers clear and conspicuous notice of their information practices,
including what information they collect, how they collect it, how they use it, how they
provide Choice, Access, and Security to consumers, whether they disclose the
information collected to other entities, and whether other entities are collecting
information through the site.

 Choice - offer consumers choices as to how their personal identifying information is


used. Such choice would encompass both internal secondary uses (such as marketing
back to consumers) and external secondary uses (such as disclosing data to other entities).

 Access - offer consumers reasonable access to the information a web site has collected
about them, including a reasonable opportunity to review information and to correct
inaccuracies or delete information.

 Security - take reasonable steps to protect the security of the information they collect
from consumers.
4.4.2 Privacy Impact Assessment (PIA)

 The need for tools capable to identify privacy issues in information systems.

 There are no international standards for such a process, though different countries and
organization require PIA reports.
 The centerpiece of a proposed PIA tool is a SaaS service.

 The users of the SaaS service providing access to the PIA tool must fill in a
questionnaire.

 The system uses a knowledge base (KB) created and maintained by domain
experts.

 The system uses templates to generate the additional questions necessary to fill in
the PIA report.

 An expert system infers which rules are satisfied by the facts in the database and
by the facts provided by the users, and executes the rule with the highest priority.

4.5 TRUST

 Trust: assured reliance on the character, ability, strength, or truth of someone or


something.

 Complex phenomena: enable cooperative behaviour, promote adaptive organizational


forms, reduce harmful conflict, decrease transaction costs, promote effective responses to
crisis.
 Two conditions must exist for trust to develop.

o Risk: the perceived probability of loss; trust is not necessary if there is no risk
involved, i.e., if it is certain that an action will succeed.
o Interdependence: the interests of one entity cannot be achieved without reliance
on other entities.

 A trust relationship goes through three phases:

1. Building phase, when trust is formed.

2. Stability phase, when trust exists.

3. Dissolution phase, when trust declines

 An entity must work very hard to build trust, but may lose the trust very easily.

4.5.1 Internet trust

 Internet trust obscures or entirely lacks the dimensions of character and personality, the
nature of the relationship, and the institutional character of traditional trust.

 Offers individuals the ability to obscure or conceal their identity. The anonymity reduces
the cues normally used in judgments of trust.

 Identity is critical for developing trust relations; it allows us to base our trust on the past
history of interactions with an entity. Anonymity causes mistrust because identity is
associated with accountability, and in the absence of identity accountability cannot be
enforced.

 This opacity extends from identity to personal characteristics. It is impossible to infer
whether the entity or individual we transact with is who it claims to be, as the transactions
occur between entities separated in time and distance.

 There are no guarantees that the entities we transact with fully understand the role they
have assumed.
4.5.2 How to determine trust

 Policies and reputation are two ways of determining trust.

 Policies reveal the conditions to obtain trust and the actions to take when some of
the conditions are met. Policies require the verification of credentials;
credentials are issued by a trusted authority and describe the
qualities of the entity using the credential.

 Reputation is a quality attributed to an entity based on a relatively long


history of interactions or possibly observations of the entity.
Recommendations are based on trust decisions made by others and
filtered through the perspective of the entity assessing the trust.

 In a computer science context : trust of a party A to a party B for a service X is the


measurable belief of A in that B behaves dependably for a specified period within a
specified context (in relation to service X).

4.6 Operating system security

 A critical function of an OS is to protect applications against a wide range of malicious


attacks, e.g., unauthorized access to privileged information, tampering with executable
code, and spoofing.

 The elements of the mandatory OS security:

o Access control: mechanisms to control the access to system objects.


o Authentication usage: mechanisms to authenticate a principal.
o Cryptographic usage policies: mechanisms used to protect the data.
 Commercial OSs do not support multi-layered security; they only distinguish
between a completely privileged security domain and a completely
unprivileged one.

 Trusted-path mechanisms: support user interactions with trusted software. They are
critical for system security; if such mechanisms do not exist, malicious
software can impersonate trusted software. Some systems provide trusted paths
for a few functions, such as login authentication and password changing, and
allow servers to authenticate their clients.

4.6.1 Closed-box versus open-box platforms

 Closed-box platforms, e.g., cellular phones, game consoles, and ATMs, could have
embedded cryptographic keys that reveal their true identity to remote systems and
authenticate the software running on them.

 Such facilities are not available to open-box platforms, the traditional hardware for
commodity operating systems.

 Commodity operating systems offer low assurance. An OS is a complex software system
consisting of millions of lines of code, and it is vulnerable to a wide range of malicious
attacks.

 An OS provides weak mechanisms for applications to authenticate to one another and


create a trusted path between users and applications.

 An OS poorly isolates one application from another; once an application is compromised,


the entire physical platform and all applications running on it can be affected. The
platform security level is reduced to the security level of the most vulnerable application
running on the platform.
4.7 VIRTUAL MACHINE SECURITY

 Hybrid and hosted VMs expose the entire system to the vulnerabilities of the host OS.

 A secure TCB (Trusted Computing Base) is a necessary condition for security in a virtual
machine environment; if the TCB is compromised then the security of the entire system is
affected.
 In a traditional VM the Virtual Machine Monitor (VMM) controls the access to the
hardware and provides a stricter isolation of VMs from one another than the isolation of
processes in a traditional OS.

 A VMM controls the execution of privileged operations and can enforce memory
isolation as well as disk and network access.

 The VMMs are considerably less complex and better structured than traditional
operating systems thus, in a better position to respond to security attacks.

 A major challenge: a VMM sees only raw data regarding the state of a guest
operating system while security services typically operate at a higher logical level,
e.g., at the level of a file rather than a disk block.

Fig 4.2: a) Virtual security services provided by the VMM. b) A dedicated security VM.
4.7.1 VMM-based threats

 Starvation of resources and denial of service for some VMs. Probable causes:

(a) badly configured resource limits for some VMs.

(b) a rogue VM with the capability to bypass resource limits set in VMM.

 VM side-channel attacks: malicious attack on one or more VMs by a rogue VM under the
same VMM. Probable causes:

(a) lack of proper isolation of inter-VM traffic due to misconfiguration of the virtual
network residing in the VMM.

(b) limitation of packet inspection devices to handle high speed traffic, e.g., video traffic.

(c) presence of VM instances built from insecure VM images, e.g., a VM image having a
guest OS without the latest patches.

 Buffer overflow attacks.

4.7.2 VM-based threats

 Deployment of rogue or insecure VM. Unauthorized users may create insecure instances
from images or may perform unauthorized administrative actions on existing VMs.
Probable cause:
 improper configuration of access controls on VM administrative tasks
such as instance creation, launching, suspension, reactivation, and so on.
 Presence of insecure and tampered VM images in the VM image repository. Probable
causes:
(a) lack of access control to the VM image repository.
(b) lack of mechanisms to verify the integrity of the images, e.g., digitally signed
image.
4.7.3 Security of virtualization

 The complete state of an operating system running under a virtual machine is captured by
the VM; this state can be saved in a file and then the file can be copied and shared.
Implications:

1. Ability to support the IaaS delivery model. In this model a user selects an image
matching the local environment used by the application and then uploads and runs the
application on the cloud using this image.

2. Increased reliability. An operating system with all the applications running under it can
be replicated and switched to a hot standby.

3. Improved intrusion prevention and detection. A clone can look for known patterns in
system activity and detect intrusion. The operator can switch to a hot standby when
suspicious events are detected.

4. More efficient and flexible software testing. Instead of a very large number of dedicated
systems running under different operating systems, different versions of each OS, and different
patches for each version, virtualization allows the multitude of OS instances to share a small
number of physical systems.

4.7.4 More advantages of virtualization

 Straightforward mechanisms to implement resource management policies:


o To balance the load of a system, a VMM can move an OS and the applications
running under it to another server when the load on the current server exceeds a
high water mark.

o To reduce power consumption, the load of lightly loaded servers can be moved to
other servers; the lightly loaded servers can then be turned off or placed in standby
mode.
 When secure logging and intrusion protection are implemented at the VMM layer, the
services cannot be disabled or modified. Intrusion detection can be disabled and logging
can be modified by an intruder when implemented at the OS level. A VMM may be able
to log only events of interest for a post-attack analysis.

4.7.5 Undesirable effects of virtualization

 Diminished ability to manage the systems and track their status.

o The number of physical systems in the inventory of an organization is limited by
cost, space, energy consumption, and human support. Creating a virtual machine
(VM) ultimately reduces to copying a file, hence the explosion in the number of
VMs. The only limitation on the number of VMs is the amount of storage space
available.

o Qualitative aspect of the explosion of the number of VMs: traditionally,


organizations install and maintain the same version of system software. In a
virtual environment the number of different operating systems, their versions, and
the patch status of each version will be very diverse. Heterogeneity will tax the
support team.

o The software lifecycle has serious implications for security. The traditional
assumption is that the software lifecycle is a straight line, so patch management
is based on monotonic forward progress. The virtual execution model maps to
a tree structure rather than a line; at any point in time multiple instances
of a VM can be created, and each one of them can then be updated, have different
patches installed, and so on.
4.7.6 Implications of virtualization on security

 Infection may last indefinitely: some of the infected VMs may be dormant at the time
when the measures to clean up the systems are taken and then, at a later time, wake up
and infect other systems; the scenario can repeat itself.

 In a traditional computing environment a steady state can be reached. In this steady state
all systems are brought up to a desirable state. This desirable state is reached by installing
the latest version of the system software and then applying to all systems the latest
patches. Due to the lack of control, a virtual environment may never reach such a steady
state.

 A side effect of the ability to record in a file the complete state of a VM is the possibility
to roll back a VM. This allows a new type of vulnerability caused by events recorded in
the memory of an attacker.

 Virtualization undermines the basic principle that time sensitive data stored on any
system should be reduced to a minimum.

4.8 SECURITY RISKS POSED BY SHARED IMAGES

 Image sharing is critical for the IaaS cloud delivery model. For example, a user of AWS
has the option to choose between
 Amazon Machine Images (AMIs) accessible through the Quick Start.

 Community AMI menus of the EC2 service.

 Many of the images analyzed by a recent report allowed a user to undelete files, recover
credentials, private keys, or other types of sensitive information with little effort and
using standard tools.
 A software vulnerability audit revealed that 98% of the Windows AMIs and 58% of
Linux AMIs audited had critical vulnerabilities.

 Security risks:
1. Backdoors and leftover credentials.
2. Unsolicited connections.
3. Malware.

4.9 SECURITY RISKS POSED BY A MANAGEMENT OS.

 A virtual machine monitor, or hypervisor, is considerably smaller than an operating


system, e.g., the Xen VMM has ~ 60,000 lines of code.

 The Trusted Computer Base (TCB) of a cloud computing environment includes not only
the hypervisor but also the management OS.

 The management OS supports administrative tools, live migration, device drivers, and
device emulators.

 In Xen the management operating system runs in Dom0; it manages the building of all
user domains, a process consisting of several steps:

i. Allocate memory in the Dom0 address space and load the kernel of the
guest operating system from the secondary storage.
ii. Allocate memory for the new VM and use foreign mapping to load the
kernel to the new VM.
iii. Set up the initial page tables for the new VM.
iv. Release the foreign mapping on the new VM memory, set up the virtual
CPU registers and launch the new VM.
Fig 4.3: The trusted computing base of a Xen-based environment includes the hardware,
Xen, and the management operating system running in Dom0. The management OS supports
administrative tools, live migration, device drivers, and device emulators. A guest
operating system and applications running under it reside in a DomU.

4.9.1 Possible actions of a malicious Dom0

 At the time it creates a DomU:


o Refuse to carry out the steps necessary to start the new VM.

o Modify the kernel of the guest OS to allow a third party to monitor and control the
execution of applications running under the new VM.

o Undermine the integrity of the new VM by setting the wrong page tables and/or
setup wrong virtual CPU registers.

 Refuse to release the foreign mapping and access the memory while the new VM is
running.
 At run time:
o Dom0 exposes a set of abstract devices to the guest operating systems using split
drivers, with the frontend in a DomU and the backend in Dom0. We have to
ensure that run-time communication through Dom0 is encrypted. Transport Layer
Security (TLS) does not guarantee that Dom0 cannot extract cryptographic keys
from the memory of the OS and applications running in DomU.

4.9.2 A major weakness of Xen

 The entire state of the system is maintained by XenStore.


 A malicious VM can deny other VMs access to XenStore; it can also gain access to the
memory of a DomU.

4.9.3 How to deal with run-time vulnerability of Dom0

 To implement a secure run-time system, we have to intercept and control the hypercalls
used for communication between a Dom0 that cannot be trusted and a DomU we want to
protect.

 New hypercalls are necessary to protect (a conceptual sketch is given after this list):

o The privacy and integrity of the virtual CPU of a VM. When Dom0 wants to save
the state of the VM the hypercall should be intercepted and the contents of the
virtual CPU registers should be encrypted. When DomU is restored, the virtual
CPU context should be decrypted and then an integrity check should be carried
out.
o The privacy and integrity of the VM virtual memory. The page table update
hypercall should be intercepted and the page should be encrypted so that Dom0
handles only encrypted pages of the VM. To guarantee the integrity, the
hypervisor should calculate a hash of all the memory pages before they are saved
by Dom0. An address translation is necessary as a restored DomU may be
allocated a different memory region.

o The freshness of the virtual CPU and the memory of the VM. The solution is to
add to the hash a version number.
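The following Python pseudocode is only a conceptual sketch of the interception idea described in this list; it is not Xen code, and the function names and the use of an HMAC are illustrative assumptions (a complete design would also encrypt the saved state, as discussed above).

# Conceptual sketch (not real Xen code): protecting saved DomU state from an untrusted Dom0.
import hashlib, hmac, os

HYPERVISOR_KEY = os.urandom(32)          # key known only to the hypervisor, never to Dom0

def seal_vcpu_state(vcpu_registers: bytes, version: int) -> dict:
    """Intercepted 'save VCPU' hypercall: hand Dom0 only an integrity-protected, versioned blob."""
    payload = version.to_bytes(8, "big") + vcpu_registers
    tag = hmac.new(HYPERVISOR_KEY, payload, hashlib.sha256).digest()
    # In a full design the payload would also be encrypted so Dom0 sees only ciphertext.
    return {"blob": payload, "tag": tag}

def unseal_vcpu_state(sealed: dict, expected_version: int) -> bytes:
    """Intercepted 'restore VCPU' hypercall: check integrity and freshness before reuse."""
    recomputed = hmac.new(HYPERVISOR_KEY, sealed["blob"], hashlib.sha256).digest()
    if not hmac.compare_digest(recomputed, sealed["tag"]):
        raise ValueError("integrity check failed: Dom0 tampered with the saved state")
    version = int.from_bytes(sealed["blob"][:8], "big")
    if version != expected_version:
        raise ValueError("freshness check failed: stale (rolled-back) state")
    return sealed["blob"][8:]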

4.9.4 Xoar - breaking the monolithic design of TCB

 Xoar is a version of Xen designed to boost system security; based on micro-kernel design
principles. The design goals are:

o Maintain the functionality provided by Xen.

o Ensure transparency with existing management and VM interfaces.

o Tight control of privileges, each component should only have the privileges
required by its function.

o Minimize the interfaces of all components to reduce the possibility that a


component can be used by an attacker.

o Eliminate sharing. Make sharing explicit whenever it cannot be eliminated to


allow meaningful logging and auditing.

o Reduce the opportunity of an attack targeting a system component by limiting the


time window when the component runs.

 The security model of Xoar assumes that threats come from:

o A guest VM attempting to violate data integrity or confidentiality of another


guest VM on the same platform, or to exploit the code of the guest.
o Bugs in the initialization code of the management virtual machine.
4.9.5 Xoar system components

 Permanent components: XenStore-State maintains all information regarding the state of


the system

 Components used to boot the system; they self-destruct before any user VM is started.
They discover the hardware configuration of the server including the PCI drivers and
then boot the system:

 PCIBack - virtualizes access to PCI bus configuration.


 Bootstrapper - coordinates booting of the system.

 Components restarted on each request:

 XenStore-Logic.

 Toolstack - handles VM management requests, e.g., it requests the Builder


to create a new guest VM in response to a user request.

 Builder - initiates user VMs.

 Components restarted on a timer; the two components export physical storage device
drivers and the physical network driver to a guest VM.

 Blk-Back - exports physical storage device drivers using udev rules.

 NetBack - exports the physical network driver.


Fig 4.4: Xoar has nine classes of components of four types: permanent, self-destructing,
restarted upon request, and restarted on timer. A guest VM is started by the Builder using the
Toolstack; it is controlled by the XenStore-Logic. The devices used by the guest VM are
emulated by the Qemu component.
Fig 4.5: Component sharing between guest VMs in Xoar. Two VMs share only the XenStore
components. Each one has a private version of the BlkBack, NetBack and Toolstack.

4.10 TERRA - A TRUSTED VIRTUAL MACHINE MONITOR.

 Novel ideas for a trusted virtual machine monitor (TVMM):

 It should support not only traditional operating systems, by exporting the hardware
abstraction for open-box platforms, but also the abstractions for closed-box platforms (which do
not allow the contents of the system to be either manipulated or inspected by the platform
owner).
 An application should be allowed to build its software stack based on its needs.
Applications requiring a very high level of security should run under a very thin OS
supporting only the functionality required by the application and the ability to boot. At
the other end of the spectrum are applications demanding low assurance, but a rich set of
OS features; such applications need a commodity operating system.

 Provide trusted paths from a user to an application. Such a path allows a human user to
determine with certainty the identity of the VM it is interacting with and allows the VM
to verify the identity of the human user.

 Deny the platform administrator the root access.

 Support attestation, the ability of an application running in a closed-box to gain trust from
a remote party, by cryptographically identifying itself.
UNIT – V

CLOUD IMPLEMENTATION AND APPLICATIONS

5.1 AMAZON ELASTIC COMPUTE CLOUD (AMAZON EC2)

 Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable
compute capacity in the cloud. It is designed to make web-scale cloud computing easier
for developers.

 Amazon EC2’s simple web service interface allows you to obtain and configure capacity
with minimal friction. It provides you with complete control of your computing resources
and lets you run on Amazon’s proven computing environment.

 Amazon EC2 reduces the time required to obtain and boot new server instances to
minutes, allowing you to quickly scale capacity, both up and down, as your computing
requirements change. Amazon EC2 changes the economics of computing by allowing
you to pay only for capacity that you actually use.

 Amazon EC2 provides developers with the tools to build failure-resilient applications and
isolate themselves from common failure scenarios.

5.1.1 Features of Amazon EC2

o Amazon EC2 provides the following features:

o Virtual computing environments, known as instances.

o Preconfigured templates for your instances, known as Amazon Machine Images


(AMIs), that package the bits you need for your server (including the operating
system and additional software).
o Various configurations of CPU, memory, storage, and networking capacity for
your instances, known as instance types.

o Secure login information for your instances using key pairs (AWS stores the
public key, and you store the private key in a secure place).
o Storage volumes for temporary data that's deleted when you stop or terminate
your instance, known as instance store volumes.

o Multiple physical locations for your resources, such as instances and Amazon
EBS volumes, known as regions and Availability Zones.

o Metadata, known as tags, that you can create and assign to your Amazon EC2
resources.

o Virtual networks you can create that are logically isolated from the rest of the
AWS cloud, and that you can optionally connect to your own network, known
as virtual private clouds (VPCs).

5.1.2 Getting Started with Amazon Elastic Compute Cloud (Amazon EC2)

 There are several ways to get started with Amazon EC2. You can use the AWS
Management Console, the AWS Command Line Interface (CLI), or the AWS SDKs.

 Getting Started with the AWS Management Console.

Step 1: Set up and log into your AWS account

Log into the AWS Management Console and set up your root account.
Step 2: Launch an Amazon EC2 instance

In the Amazon EC2 Dashboard, choose "Launch Instance" to create and configure your
virtual machine.

Step 3: Configure your instance

o In this wizard, you have the option to configure your instance features.

Below are some guidelines on setting up your first instance.

o Choose an Amazon Machine Image (AMI): In step 1 of the wizard, we


recommend the Amazon Linux AMI (free-tier eligible).

o Choose an instance type: In step 2 of the wizard, we recommend the t2.micro


(free-tier eligible).

o Security group: In step 6, you have the option to configure your virtual firewall.

o Launch instance: In step 7, review your instance configuration and choose


"Launch".

o Create a key pair: Select "Create a new key pair" and assign a name. The key pair
file (.pem) will download automatically - save this in a safe place as we will later
use this file to log in to the instance. Finally, choose "Launch Instances" to
complete the set up.

Step 4: Connect to your instance.

After you launch your instance, you can connect to it and use it the way that you'd
use a computer sitting in front of you. To connect from the console, follow the steps
below:
o Select the EC2 instance you created and choose "Connect".

o Select "A Java SSH client directly from my browser". Ensure Java is installed and
enabled.

o Enter the Private key path (example: C:\KeyPairs\my-key-pair.pem).

o Choose "Launch SSH Client".

Step 5: Terminate instances

Amazon EC2 is free to start, but you should terminate your instances to
prevent additional charges. The EC2 instance and the data associated with it will be deleted.

o Select the EC2 instance, choose "Actions", select "Instance State", and
"Terminate".

5.2 AMAZON S3

 Amazon Simple Storage Service (Amazon S3), provides developers and IT teams with
secure, durable, highly-scalable cloud storage. Amazon S3 is easy to use object storage,
with a simple web service interface to store and retrieve any amount of data from
anywhere on the web.

 With Amazon S3, you pay only for the storage you actually use. There is no minimum
fee and no setup cost. Amazon S3 offers a range of storage classes designed for different
use cases including Amazon S3 Standard for general-purpose storage of frequently
accessed data, Amazon S3 Standard - Infrequent Access (Standard - IA) for long-lived,
but less frequently accessed data, and Amazon Glacier for long-term archive.

 Amazon S3 also offers configurable lifecycle policies for managing your data throughout
its lifecycle. Once a policy is set, your data will automatically migrate to the most
appropriate storage class without any changes to your applications.
 Amazon S3 can be used alone or together with other AWS services such as Amazon
Elastic Compute Cloud (Amazon EC2) and AWS Identity and Access Management
(IAM), as well as data migration services and gateways for initial or ongoing data
ingestion.

 Amazon S3 provides cost-effective object storage for a wide variety of use cases
including backup and recovery, nearline archive, big data analytics, disaster recovery,
cloud applications, and content distribution.

 In S3, there are no initial charges and zero setup cost; you only pay for what you use. It is
most suitable for webmasters and bloggers, especially those who face the following
issues:

Running out of bandwidth: If you are on a shared hosting account, a sudden traffic surge
(such as a "Stumble" effect) can easily eat up the entire bandwidth limit for the month. Most
of the time, the web host will suspend the account until you have settled the payment for the
extra bandwidth consumed. Amazon S3 provides unlimited bandwidth and will serve any
amount of bandwidth your site needs. Charges are made to your credit card, and payment
can be made at the end of the month.

Better scalability: Amazon S3 uses cloud hosting, and image serving from it is relatively fast.
Separating static files from normal HTTP requests to your server eases the server load and
thus improves stability.

Paying for more than you actually use: Whether you are on shared hosting, a VPS, or a
dedicated server, you pay a lump sum each month (or year), and the amount includes hard
disk storage and bandwidth you might not fully use. Why pay for more when you can pay
only for what you use?

Store files online: Instead of backing up your files on CDs/DVDs to save hard disk
space, you can store them online, with the option to keep them private or make them
publicly accessible. It is entirely up to you.
Easier file retrieval and sharing: If you store your files online, you can access them
anywhere as long as there is an Internet connection. Amazon S3 also makes it easier to
share files with friends, clients, and blog readers.

5.2.1 Getting Started with Amazon Simple Storage Service(Amazon S3)

 There are several ways to get started with Amazon S3 depending on your use case and
how you want to integrate the service into your use case.

 The AWS Management Console provides a web-based interface for accessing and
managing all your Amazon S3 resources. Programmatic access to Amazon S3 is enabled
by the AWS SDKs and the AWS Mobile SDK. In addition, Amazon S3 is supported by a
wide range of third party tools and gateways.
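As an illustration of programmatic access through the AWS SDKs, here is a minimal sketch using boto3, the AWS SDK for Python; the bucket name, region, and file names are placeholders.

# Minimal boto3 sketch: create a bucket, upload an object, and download it again.
# Assumes AWS credentials are already configured.
import boto3

s3 = boto3.client("s3", region_name="us-west-2")      # Oregon, as in the console example

bucket = "my-example-bucket-12345"                    # placeholder: bucket names must be globally unique

# Create the bucket (outside us-east-1 a LocationConstraint must be supplied).
s3.create_bucket(
    Bucket=bucket,
    CreateBucketConfiguration={"LocationConstraint": "us-west-2"},
)

# Upload a local file as an object, then download it to a new local file.
s3.upload_file("photo.jpg", bucket, "photos/photo.jpg")
s3.download_file(bucket, "photos/photo.jpg", "photo-copy.jpg")

# List the objects currently stored in the bucket.
for obj in s3.list_objects_v2(Bucket=bucket).get("Contents", []):
    print(obj["Key"], obj["Size"])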

 Using the AWS Management Console

The AWS Management Console is a web-based interface for accessing and


managing your Amazon S3 resources. You can easily and securely create buckets,
upload objects, and set access controls using the AWS Management Console.

Step 1: To sign up for Amazon S3

o Go to http://aws.amazon.com/s3/ and click Sign Up.


o Follow the on-screen instructions.
AWS will notify you by email when your account is active and available for you
to use.
Step 2: To create a bucket

Sign into the AWS Management Console and open the Amazon S3 console
at https://console.aws.amazon.com/s3.

o Click Create Bucket.

o In the Create a Bucket dialog box, in the Bucket Name box, enter a bucket
name.

o In the Region box, select a region. For this exercise, select Oregon from the
drop-down list.

o Click Create. When Amazon S3 successfully creates your bucket, the console
displays your empty bucket in the Buckets panel.

Step 3: To add an object to a Bucket.

Now that you've created a bucket, you're ready to add an object to it. An object can be
any kind of file: a text file, a photo, a video and so forth.

TO UPLOAD AN OBJECT

o In the Amazon S3 console, click the name of the bucket that you want to upload an object to
and then click Upload.

o In the Upload - Select Files wizard, if you want to upload an entire folder, you must
click Enable Enhanced Uploader to install the necessary Java applet. You only need to
do this once per console session.

o Click Add Files.

o Select the file that you want to upload and then click Open.

o Click Start Upload.


Step 4: View an Object.

Now that you've added an object to a bucket, you can open and view it in a
browser. You can also download the object to your local computer.

TO OPEN OR DOWNLOAD AN OBJECT

o In the Amazon S3 console, in the Objects and Folders list, right-click the object or
objects that you want to open or download, then click Open or Download as appropriate.

o If you are downloading the object, specify where you want to save it. The procedure for
saving the object depends on the browser and operating system that you are using.

Step 5: Move an Object.

Now that you've added an object to a bucket and viewed it, you can move the object
to a different bucket or folder.

TO MOVE AN OBJECT

o In the Amazon S3 console, right-click the object that you want to move, and then click
Cut.

o Navigate to the bucket or folder where you want to move the object. Right-click the
folder or bucket and then click Paste Into.

Step 6: Delete an Object and Bucket.

If you no longer need to store the objects that you uploaded and moved while going
through this guide, you should delete them to prevent further charges.
TO DELETE AN OBJECT

o Sign in to the AWS Management Console and open the Amazon S3 console
at https://console.aws.amazon.com/s3/.

o In the Objects and Folders panel, right-click the object that you want to delete, and then
click Delete.

o When a confirmation message appears, click OK.

You can empty a bucket, which deletes all the objects in the bucket without
deleting the bucket. For information on the limitations for emptying a bucket,
see Deleting/Emptying a Bucket in the Amazon Simple Storage Service Developer
Guide.

TO EMPTY A BUCKET

o Sign in to the AWS Management Console and open the Amazon S3 console
at https://console.aws.amazon.com/s3/.

o Right-click the bucket that you want to empty, and then click Empty Bucket.

o When a confirmation message appears, enter the bucket name and then click Empty
bucket.

You can delete a bucket and all the objects contained in the bucket. For information on the
limitations for deleting a bucket, see Deleting/Emptying a Bucket in the Amazon Simple Storage
Service Developer Guide.
TO DELETE A BUCKET

o Sign in to the AWS Management Console and open the Amazon S3 console
at https://console.aws.amazon.com/s3/.

o Right-click the bucket that you want to delete, and then click Delete Bucket.

o When a confirmation message appears, enter the bucket name and click Delete.
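The emptying and deletion steps above can also be scripted. The following is a minimal boto3 sketch (the bucket name is a placeholder) that deletes every object in a bucket and then deletes the bucket itself.

# Minimal boto3 sketch: empty a bucket, then delete it.
import boto3

s3 = boto3.resource("s3")
bucket = s3.Bucket("my-example-bucket-12345")   # placeholder bucket name

bucket.objects.all().delete()    # delete every object (the bucket must be empty before deletion)
bucket.delete()                  # delete the now-empty bucket
print("Bucket deleted")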

5.2.2 Uses of S3

1. Backup and Storage – Provide data backup and storage services for others.
2. Application Hosting – Provide services that deploy, install, and manage web
applications.
3. Media Hosting – Build a redundant, scalable, and highly available infrastructure that
hosts video, photo, or music uploads and downloads.
4. Software Delivery – Host your software applications that customers can download.

5.3 CLOUDSTACK

Cloudstack is an open source cloud computing software for creating, managing, and deploying
infrastructure cloud services. It uses existing hypervisors such as KVM, VMware vSphere, and
XenServer/XCP for virtualization.

1. Cloudstack Architecture Overview

Cloudstack is a Cloud Orchestration platform that pools computing resources to build public,
private and hybrid Infrastructure as a Service (IaaS) clouds. Cloudstack manages the network,
storage, and compute nodes that make up a cloud infrastructure. A Cloudstack cloud has a
hierarchical structure which enables it to scale to manage tens of thousands of physical servers,
all from a single management interface.

2. Availability Zones

 An Availability Zone is the largest organisational unit within a Cloudstack deployment.


Typically a data centre (DC) implementation will contain a single Zone, but there are no
hard and fast rules and a DC can contain multiple Zones.

 By structuring Cloudstack into geographical Zones, virtual instances and data storage can
be placed in specific locations in order to comply with an organization's data storage
policies etc. An Availability Zone consists of at least one Pod, and Secondary Storage,
which is shared by all Pods in the Zone.

 Zones are visible to end users, who can then choose which Zone they wish to create their
virtual instances in. A public Zone is visible to all users, but private Zones can also be
created which are then only available to members of a particular domain and its
associated sub-domains.
Fig 5.1: Availability Zone

1. Pods

A Pod is a selection of hardware configured to form Clusters. Typically a Pod is a rack


containing one or more Clusters, and a Layer 2 switch architecture which is shared by all
Clusters in that Pod. End users are not aware of Pods and have no visibility into them.

2. Clusters

A Cluster is a group of identical Hosts running a common Hypervisor. For example, a Cluster
could be a XenServer pool, a set of KVM servers, or a VMware cluster pre-configured in vCenter.
Each Cluster has a dedicated Primary Storage device which is where the virtual machine
instances are hosted. With multiple Hosts within a Cluster, High Availability and Load
Balancing are standard features of a Cloudstack deployment.

3. Secondary Storage

Secondary Storage is used to store virtual machine Templates, ISO images and Snapshots. The
storage is available to all PODs in a Zone, and can also be replicated automatically between
Availability Zones thereby providing a common storage platform throughout the whole Cloud.
Secondary Storage uses the Network File System (NFS) as this ensures it can be accessed by any
Host in the Zone.

4. Primary Storage

Primary Storage is unique to each Cluster and is used to host the virtual machine
instances. Cloudstack is designed to work with all standards-compliant iSCSI and NFS Servers
supported by the underlying Hypervisor. Primary Storage is a critical component and should be
built on high performance hardware with multiple high speed disks.
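To make the hierarchy concrete, here is a small illustrative Python sketch (the class and field names are invented for illustration and are not part of the CloudStack API) modelling how a Zone contains Pods, a Pod contains Clusters, and each Cluster has its own Hosts and Primary Storage, while Secondary Storage is shared at the Zone level.

# Illustrative data model of the CloudStack hierarchy (not actual CloudStack code).
from dataclasses import dataclass, field
from typing import List

@dataclass
class Cluster:
    name: str
    hypervisor: str                 # e.g. "KVM", "XenServer", "VMware"
    hosts: List[str]                # identical hosts running the same hypervisor
    primary_storage: str            # dedicated to this cluster; hosts the VM instances

@dataclass
class Pod:
    name: str                       # typically a rack with a shared Layer 2 switch
    clusters: List[Cluster] = field(default_factory=list)

@dataclass
class Zone:
    name: str                       # usually one per data centre
    secondary_storage: str          # NFS share for templates, ISOs, snapshots; shared by all Pods
    pods: List[Pod] = field(default_factory=list)

zone = Zone(
    name="zone-eu-1",
    secondary_storage="nfs://sec-store/zone-eu-1",
    pods=[Pod(name="rack-01", clusters=[
        Cluster(name="kvm-cluster-a", hypervisor="KVM",
                hosts=["host-01", "host-02"], primary_storage="iscsi://san-01/lun0"),
    ])],
)
print(zone.name, "->", [p.name for p in zone.pods])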

Fig 5.2: Cloud Stack Cloud Architecture

 Cloudstack Cloud can have one or more Availability Zones (AZ).

Management Server Managing Multiple Zones


Fig 5.3: Management Server Managing Multiple Zones

A single Management Server can manage multiple Zones. Zones can be geographically
distributed, but low-latency links are expected for better performance.

5.4 INTERCLOUD

 Intercloud is a term used in IT to refer to a theoretical model for cloud computing


services. The idea of the intercloud relies on models that have already been shown to be
effective in cases like the global Internet and the 3G and 4G wireless networks of various
national telecom providers.

 The idea behind an intercloud is that a single common functionality would combine many
different individual clouds into one seamless mass in terms of on-demand operations.

 Cloud hosting is largely intended to deliver on-demand services. Through careful use of
scalable and highly engineered technologies, cloud providers are able to offer customers
the ability to change their levels of service in many ways without waiting for physical
changes to occur.

 Terms like rapid elasticity, resource pooling and on-demand self-service are already part
of cloud hosting service designs that are set up to make sure the customer or client never
has to deal with limitations or disruptions. Building on all of these ideas, the intercloud
would simply make sure that a cloud could use resources beyond its reach by taking
advantage of pre-existing contracts with other cloud providers.

 Although these setups are theoretical as they apply to cloud services, telecom providers
already have these kinds of agreements. Most of the national telecom companies are able
to reach out and use parts of another company’s operations where they lack a regional or
local footprint, because of carefully designed business agreements between the
companies. If cloud providers develop these kinds of relationships, the intercloud could
become reality.
5.4.1 Architectural classification of Inter-Clouds.

Fig 5.4: Architectural classification of Inter-Clouds.

1. From an architectural perspective, Volunteer Federations can be further classified as
follows:

o Centralized – in every instance of this group of architectures, there is a central entity that
either performs or facilitates resource allocation. Usually, this central entity acts as a
repository where available cloud resources are registered but may also have other
responsibilities like acting as a market place for resources.

o Peer-to-Peer – in the architectures from this group, clouds communicate and negotiate
directly with each other without mediators.

2. Independent Inter-Cloud developments can be further classified as follows:

o Services – application provisioning is carried out by a service that can be hosted either
externally or in-house by the cloud clients. Most such services include broker
components in themselves. Typically, application developers specify an SLA or a set of
provisioning rules, and the service performs the deployment and execution in the
background, in a way respecting these predefined attributes.

o Libraries - often, custom application brokers that directly take care of provisioning and
scheduling application components across clouds are needed. Typically, such approaches
make use of Inter-Cloud libraries that facilitate the usage of multiple clouds in a uniform
way.

Figure 5.5: Inter-Cloud developments’ architectures.


 The benefits of an Inter-Cloud environment for cloud clients are numerous and can
be broadly summarized as follows:

1. Diverse geographical locations - Leading cloud service providers have established


data centres worldwide. However, it is unlikely that any provider will be able to establish
data centres in every country and administrative region.

o Many applications have legislative requirements as to where data are stored.


Thus, a data centre within a region of countries may not be enough, and
application developers will need fine-grained control (specific country or state) as
to where resources are positioned.

o Only by utilizing multiple clouds can one gain access to so widely distributed
resources and provide well-performing and legislation-compliant services to
clients.

2. Better application resilience - During the past several years, there have been several
cases of cloud service outages, including ones at major vendors. The implications of the
failure of one of Amazon’s data centres were very serious for customers who relied on that
location only.

o In a post-mortem analysis, Amazon advised their clients to design their


applications to use multiple data centres for fault tolerance. Furthermore, in
Berkeley’s report on Cloud computing, Armbrust et al. emphasise that potential
unavailability of service is the number one inhibitor to adopting Cloud computing.

o Thus, they advise the use of multiple providers. Besides fault tolerance, using
resources from different providers acts as an insurance against a provider being
stopped because of regulatory or legal reasons as well.

3. Avoidance of vendor lock-in - By using multiple clouds and being able to freely move workloads among them, a cloud client can easily avoid vendor lock-in. In case a provider changes a policy or pricing in a way that negatively impacts its clients, they can easily migrate elsewhere.

5.5 GOOGLE APP ENGINE

 Google App Engine (often referred to as GAE or simply App Engine) is a platform as a
service (PaaS) cloud computing platform for developing and hosting web applications in
Google-managed data centres.

 Applications are sandboxed and run across multiple servers. App Engine offers automatic
scaling for web applications—as the number of requests increases for an application, App
Engine automatically allocates more resources for the web application to handle the
additional demand.

 The App Engine requires that apps be written in Java or Python, store data in Google Big
Table and use the Google query language. Non-compliant applications require
modification to use App Engine.

 Google App Engine provides more infrastructure than other scalable hosting services
such as Amazon Elastic Compute Cloud (EC2). The App Engine also eliminates some
system administration and developmental tasks to make it easier to write scalable
applications.

 Google App Engine is free up to a certain amount of resource usage. Users exceeding the
per-day or per-minute usage rates for CPU resources, storage, number of API calls or
requests and concurrent requests can pay for more of these resources.

Google App Engine makes it easy to take your app ideas to the next level.

o Quick to start: With no software or hardware to buy and maintain, you can prototype
and deploy applications to your users in a matter of hours.

o Simple to use: Google App Engine includes the tools you need to create, test, launch,
and update your apps.
o Rich set of APIs: Build feature-rich services faster with Google App Engine’s easy-to-
use APIs.

o Immediate scalability: There’s almost no limit to how high or how quickly your app can
scale.

o Pay for what you use: Get started without any upfront costs with App Engine’s free tier
and pay only for the resources you use as your application grows.

5.5.1 When to use Google App Engine?

Use App Engine when:

1. You don’t want to get troubled for setting up a server.

2. You want instant, nearly unlimited scalability, with a free tier to get started.

3. Your application’s traffic is spiky and rather unpredictable.

4. You don't feel like taking care of your own server monitoring tools.

5. You need pricing that fits your actual usage and isn't time-slot based (App Engine provides a pay-as-you-go cost model).

6. You are able to chunk long tasks into 60 second pieces.

7. You are able to work without direct access to local file system.
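
As a rough illustration of how little code is needed to get started, the sketch below shows a minimal Python web application of the kind App Engine hosts. It assumes the Flask framework; in App Engine's standard environment such an app is normally accompanied by a small app.yaml file declaring the runtime, which is omitted here.

# main.py -- minimal sketch of a web app of the kind App Engine can host (assumes Flask).
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    # App Engine routes incoming HTTP requests to this handler and scales the
    # number of instances up or down automatically with the request load.
    return "Hello from App Engine!"

if __name__ == "__main__":
    # Local development server only; in production App Engine serves the app itself.
    app.run(host="127.0.0.1", port=8080, debug=True)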
5.5.2 Advantages of Google App Engine.

There are many advantages of the app engine, including:

 Infrastructure for Security


Around the world, the Internet infrastructure that Google runs is probably among the most secure. Unauthorized access is rare, as application data and code are stored on highly secure servers. You can be sure that your app will be available to users worldwide at all times, since Google operates hundreds of servers globally.
Google’s security and privacy policies are applicable to the apps developed using
Google’s infrastructure.

 Scalability
For any app’s success, this is among the deciding factors. Google builds its own apps using GFS, Bigtable and other such technologies, which are available to you when you utilize the Google App Engine to create apps. You only have to write the code for the app; Google looks after scaling it, thanks to the automatic scaling feature that the App Engine has. Regardless of the amount of data your app stores or the number of users it serves, the App Engine can meet your needs by scaling up or down as required.

 Performance and Reliability


Google is among the leading global brands worldwide. So, when you discuss performance and reliability you have to keep that in mind. In the past 15 years, the company has set new benchmarks with the performance of its services and products.
The app engine provides the same reliability and performance as any other Google
product.

 Cost Savings
You don’t have to hire engineers to manage your servers or do that yourself. You can invest the money saved into other parts of your business.
 Platform Independence
You can move all your data to another environment without any difficulty as there is not
much dependency on the app engine platform.

5.6 OPEN SOURCE CLOUD EUCALYPTUS

Eucalyptus is an open source software platform for implementing Infrastructure as a Service


(IaaS) in a private or hybrid cloud computing environment. The Eucalyptus cloud platform pools
together existing virtualized infrastructure to create cloud resources for infrastructure as a
service, network as a service and storage as a service. The name Eucalyptus is an acronym for
Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems.

Eucalyptus was founded out of a research project in the Computer Science Department at the
University of California, Santa Barbara, and became a for-profit business called Eucalyptus
Systems in 2009.

Eucalyptus Systems announced a formal agreement with Amazon Web Services (AWS) in
March 2012, allowing administrators to move instances between a Eucalyptus private cloud and
the Amazon Elastic Compute Cloud (EC2) to create a hybrid cloud. The partnership also
allows Eucalyptus to work with Amazon’s product teams to develop unique AWS-compatible
features.

Eucalyptus features include:

1. Supports both Linux and Windows virtual machines (VMs).


2. Application program interface (API)-compatible with the Amazon EC2 platform.

3. Compatible with Amazon Web Services (AWS) and Simple Storage Service (S3).

4. Works with multiple hypervisors including VMware, Xen and KVM.

5. Can be installed and deployed from source code or DEB and RPM packages.

6. Internal processes communications are secured through SOAP and WS-Security.

7. Multiple clusters can be virtualized as a single cloud.


8. Administrative features such as user and group management and reports.

5.6.1 Eucalyptus Architecture.

Figure 5.6: Eucalyptus Architecture.

The Eucalyptus cloud-based architecture shows the actual centralization of a private cloud system (Figure 5.6). The solution includes three essential nodes for the monitoring of the services: the cloud controller (CLC), the cluster controller (CC) and the node controller (NC). The CLC is the central node of the architecture. It monitors the system through the different CCs on the network.

Cluster controllers are the intermediary nodes used by the CLC to collect information over the
entire network. Each CC is in charge of a number of node controllers. The role of the different
NCs is to monitor the single physical computing node on which they are installed. The other
components of the system deal with the management of users’ data and the security concerns.
A similar architecture is encountered in other private cloud solutions, such as Open Nebula, where the same terms, namely cluster, node and cloud controller, are used.
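
The purely illustrative Python sketch below (not actual Eucalyptus code) mirrors the CLC → CC → NC monitoring hierarchy just described: each node controller reports on one physical host, cluster controllers aggregate those reports, and the cloud controller aggregates the clusters.

# Illustrative model of the Eucalyptus monitoring hierarchy; values are made up.
class NodeController:
    def __init__(self, host):
        self.host = host

    def report(self):
        # In a real deployment this would query the hypervisor on this host.
        return {"host": self.host, "running_vms": 3, "free_cores": 5}

class ClusterController:
    def __init__(self, name, node_controllers):
        self.name = name
        self.node_controllers = node_controllers

    def collect(self):
        return {"cluster": self.name,
                "nodes": [nc.report() for nc in self.node_controllers]}

class CloudController:
    def __init__(self, cluster_controllers):
        self.cluster_controllers = cluster_controllers

    def monitor(self):
        return [cc.collect() for cc in self.cluster_controllers]

if __name__ == "__main__":
    clc = CloudController([
        ClusterController("cluster-1", [NodeController("node-1a"), NodeController("node-1b")]),
        ClusterController("cluster-2", [NodeController("node-2a")]),
    ])
    for cluster_view in clc.monitor():
        print(cluster_view)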

5.7 OPEN STACK

Open Stack is a set of software tools for building and managing cloud computing platforms for
public and private clouds. Backed by some of the biggest companies in software development
and hosting, as well as thousands of individual community members, many think that Open
Stack is the future of cloud computing. Open Stack is managed by the Open Stack Foundation, a non-profit that oversees both development and community-building around the project.

5.7.1 Introduction to Open Stack

Open Stack lets users deploy virtual machines and other instances that handle different tasks for
managing a cloud environment on the fly. It makes horizontal scaling easy, which means that
tasks that benefit from running concurrently can easily serve more or fewer users on the fly by
just spinning up more instances.

For example, a mobile application that needs to communicate with a remote server might be able
to divide the work of communicating with each user across many different instances, all
communicating with one another but scaling quickly and easily as the application gains more
users.

And most importantly, Open Stack is open source software, which means that anyone who
chooses to can access the source code, make any changes or modifications they need, and freely
share these changes back out to the community at large.

It also means that Open Stack has the benefit of thousands of developers all over the world
working in tandem to develop the strongest, most robust, and most secure product that they can.

5.7.2 Use of Open Stack in a cloud environment


 The cloud is all about providing computing for end users in a remote environment, where
the actual software runs as a service on reliable and scalable servers rather than on each
end-user's computer.

 Cloud computing can refer to a lot of different things, but typically the industry talks
about running different items "as a service"—software, platforms, and infrastructure.
Open Stack falls into the latter category and is considered Infrastructure as a Service
(IaaS).

 Providing infrastructure means that Open Stack makes it easy for users to quickly add new instances, upon which other cloud components can run. Typically, the infrastructure
then runs a "platform" upon which a developer can create software applications that are
delivered to the end users.

5.7.3 The components of Open Stack

Open Stack is made up of many different moving parts. Because of its open nature, anyone can
add additional components to Open Stack to help it to meet their needs. But the Open Stack
community has collaboratively identified nine key components that are a part of the "core" of
Open Stack, which are distributed as a part of any Open Stack system and officially maintained
by the Open Stack community.

1. Nova is the primary computing engine behind Open Stack. It is used for deploying and
managing large numbers of virtual machines and other instances to handle computing tasks.

2. Swift is a storage system for objects and files. Rather than the traditional idea of referring to files by their location on a disk drive, developers can instead refer to a unique identifier for the file or piece of information and let Open Stack decide where to store this information. This makes scaling easy, as developers don't have to worry about the capacity of a single system behind the software. It also allows the system, rather than the developer, to worry about how best
to make sure that data is backed up in case of the failure of a machine or network connection.
3. Cinder is a block storage component, which is more analogous to the traditional notion of a
computer being able to access specific locations on a disk drive. This more traditional way of
accessing files might be important in scenarios in which data access speed is the most important
consideration.

4. Neutron provides the networking capability for Open Stack. It helps to ensure that each of the
components of an Open Stack deployment can communicate with one another quickly and
efficiently.

5. Horizon is the dashboard behind Open Stack. It is the only graphical interface to Open Stack, so for users wanting to give Open Stack a try, this may be the first component they actually “see.” Developers can access all of the components of Open Stack individually through an application programming interface (API), but the dashboard gives system administrators a look at what is going on in the cloud and the ability to manage it as needed.

6. Keystone provides identity services for Open Stack. It is essentially a central list of all of the users of the Open Stack cloud, mapped against all of the services provided by the cloud which they have permission to use. It provides multiple means of access, meaning developers can easily map their existing user access methods against Keystone.

7. Glance provides image services to Open Stack. In this case, "images" refers to images (or virtual copies) of hard disks. Glance allows these images to be used as templates when deploying new virtual machine instances.
8. Ceilometer provides telemetry services, which allow the cloud to provide billing services to
individual users of the cloud. It also keeps a verifiable count of each user’s system usage of each
of the various components of an Open Stack cloud. Think metering and usage reporting.

9. Heat is the orchestration component of Open Stack, which allows developers to store the
requirements of a cloud application in a file that defines what resources are necessary for that
application. In this way, it helps to manage the infrastructure needed for a cloud service to run.
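
To see how several of these components cooperate, the sketch below drives Open Stack through the openstacksdk Python client: Keystone authenticates the connection, Glance supplies the image, Neutron supplies the network and Nova boots the server. It assumes a clouds.yaml profile named "mycloud"; the image, flavor and network names are placeholders for whatever exists in your deployment.

# Sketch using the openstacksdk client; names below are placeholders.
import openstack

def boot_demo_server():
    conn = openstack.connect(cloud="mycloud")        # Keystone handles authentication

    image = conn.compute.find_image("cirros")        # image stored by Glance
    flavor = conn.compute.find_flavor("m1.small")    # flavor describes CPU/RAM sizing
    network = conn.network.find_network("private")   # network managed by Neutron

    # Nova deploys and manages the virtual machine instance.
    server = conn.compute.create_server(
        name="demo-instance",
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{"uuid": network.id}],
    )
    return conn.compute.wait_for_server(server)

if __name__ == "__main__":
    print(boot_demo_server().name)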

5.8 OPEN NEBULA

Open Nebula is a cloud computing platform for managing heterogeneous distributed data
centre infrastructures. The Open Nebula platform manages a data centre's virtual infrastructure to
build private, public and hybrid implementations of infrastructure as a service. Open Nebula
is free and open-source software, subject to the requirements of the Apache License version 2.

Open Nebula provides the most simple but feature-rich and flexible solution for the
comprehensive management of virtualized data centers to enable private, public and hybrid IaaS
clouds. Open Nebula's interoperability makes cloud adoption an evolution, leveraging existing IT assets, protecting your investments, and avoiding vendor lock-in.

Open Nebula provides features at the two main layers of Data Centre Virtualization and Cloud
Infrastructure:

1. Data Centre Virtualization Management.


Many of our users use Open Nebula to manage data centre virtualization, consolidate servers,
and integrate existing IT assets for computing, storage, and networking. In this
deployment model, Open Nebula directly integrates with hypervisors (like KVM, Xen or
VMware ESX) and has complete control over virtual and physical resources, providing advanced
features for capacity management, resource optimization, high availability and business
continuity.
Some of these users also enjoy Open Nebula's cloud management and provisioning features
when they additionally want to federate data centers, implement cloud bursting, or offer self-
service portals for users.

2. Cloud Management.

We also have users that use Open Nebula to provide a multi-tenant, cloud-like provisioning layer
on top of an existing infrastructure management solution (like VMware vCenter).

These users are looking for provisioning, elasticity and multi-tenancy cloud features like virtual
data centre provisioning, data centre federation or hybrid cloud computing to connect in-house
infrastructures with public clouds, while the infrastructure is managed by already familiar tools
for infrastructure management and operation.

The Open Nebula technology is the result of many years of research and development in efficient
and scalable management of virtual machines on large-scale distributed infrastructures. Open
Nebula was designed to address the requirements of business use cases from leading
companies and across multiple industries, such as Hosting, Telecom, eGovernment, Utility
Computing.

 The principles that have guided the design of Open Nebula are:

1. Openness of the architecture, interfaces, and code.


2. Flexibility to fit into any data centre.
3. Interoperability and portability to prevent vendor lock-in.
4. Stability for use in production enterprise-class environments.
5. Scalability for large scale infrastructures.
6. SysAdmin-centrism with complete control over the cloud.
7. Simplicity, easy to deploy, operate and use.
8. Lightness for high efficiency.
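
Open Nebula exposes its management functions through an XML-RPC API on the front end. The hedged sketch below queries the virtual machine pool using only the Python standard library; the endpoint address, the session credentials and the exact one.vmpool.info argument list are assumptions that should be checked against the Open Nebula documentation for the installed version.

# Hedged sketch of querying an Open Nebula front end over XML-RPC.
import xmlrpc.client

ONE_ENDPOINT = "http://frontend.example.org:2633/RPC2"  # assumed front-end address
ONE_SESSION = "oneadmin:opennebula"                     # assumed "user:password" session string

def list_vms():
    one = xmlrpc.client.ServerProxy(ONE_ENDPOINT)
    # Assumed arguments: filter flag (-2 = all VMs), id range (-1, -1), state (-1 = any state).
    response = one.one.vmpool.info(ONE_SESSION, -2, -1, -1, -1)
    success, payload = response[0], response[1]
    if not success:
        raise RuntimeError(payload)  # on failure the payload holds an error message
    return payload                   # on success the payload is XML describing the VM pool

if __name__ == "__main__":
    print(list_vms())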
5.8.1 Benefits of Open Nebula

1. For the Infrastructure Manager

 Faster response to infrastructure needs for services with dynamic resizing of the physical
infrastructure by adding new hosts, and dynamic cluster partitioning to meet capacity
requirements of services.

 Centralized management of all the virtual and physical distributed infrastructure.

 Higher utilization of existing resources with the creation of an infrastructure incorporating


the heterogeneous resources in the data center, and infrastructure sharing between
different departments managing their own production clusters, so removing application
silos.

 Operational saving with server consolidation to a reduced number of physical systems, so


reducing space, administration effort, power and cooling requirements.

 Lower infrastructure expenses with the combination of local and remote Cloud resources,
so eliminating the over-purchase of systems to meet peak demands.

2. For System Integrators


 Fits into any existing data centre thanks to its open, flexible and extensible interfaces,
architecture and components.

 Builds any type of Cloud deployment.


 Open source software.

 Seamless integration with any product and service in the virtualization/cloud ecosystem
and management tool in the data centre, such as cloud providers, VM managers, virtual
image managers, service managers, management tools, scheduler.

3. For the Infrastructure User

 Faster delivery and scalability of services to meet dynamic demands of service end-users.

 Support for heterogeneous execution environments with multiple, even conflicting,


software requirements on the same shared infrastructure.

 Full control of the lifecycle of virtualized services management.

5.9 CLOUD IMPLEMENTATION AND APPLICATION.


7 Steps for a Successful Cloud Implementation are as follows,

1. Make an effective business case: Before you rip out your existing systems and introduce
your users to the cloud, it's important to plan ahead. This planning should include making clear
what benefits you expect to derive from the new solution and how this change will help meet
business objectives.

2. Research the industry.

Any bank that's considering moving to the cloud needs to undergo a prudent research process.
Taking the time to learn about the available solutions should ease concerns about the cloud's
ability to meet the strict regulations posed by banks, as there are a number of solutions designed
to help banks comply with these regulations.

3. Take advantage of peer reviews.


One of the most effective steps you can take is to meet your peers and listen to their stories.

4. Create the right TCO model.

There's no doubt that your bank has made significant investments in an infrastructure that the
cloud will replace, and you'll need to justify the cost of not using this infrastructure.

5. Determine what data belongs in the cloud.

Although moving to the cloud will replace some legacy hardware, your bank will likely need to
keep in place other existing on-premise solutions.

6. Pay close attention to your data and integration strategies.

Since you will have multiple solutions in place, determine which system is the master when it
comes to data governance. Historically, banks have been account-centric, but the shift today is
toward becoming more client-centric, so you need to make sure your data is set up to match this
new model.

7. Leverage experienced resources.

Configuring and managing cloud solutions will come with a learning curve for even the most
seasoned IT professionals.

CLOUD APPLICATIONS

The applications of cloud computing are practically limitless. With the right middleware, a cloud
computing system could execute all the programs a normal computer could run. Potentially,
everything from generic word processing software to customized computer programs designed
for a specific company could work on a cloud computing system.
1. Clients would be able to access their applications and data from anywhere at any time.
They could access the cloud computing system using any computer linked to the Internet.

2. It could bring hardware costs down. Cloud computing systems would reduce the need for
advanced hardware on the client side. You wouldn't need to buy the fastest computer with the
most memory, because the cloud system would take care of those needs for you.

3. The companies don't have to buy a set of software or software licenses for every employee.
Instead, the company could pay a metered fee to a cloud computing company.

4. Servers and digital storage devices take up space. Some companies rent physical space to
store servers and databases because they don't have it available on site. Cloud computing
gives these companies the option of storing data on someone else's hardware, removing the
need for physical space on the front end.

5. If the cloud computing system's back end is a grid computing system, then the client could
take advantage of the entire network's processing power.
UNIT – VI

UBIQUITOUS COMPUTING
6.1 Introduction

Definition : "Ubiquitous computing (ubicomp) is a post-desktop model of human-computer


interaction in which information processing has been thoroughly integrated into everyday objects
and activities"

Ubiquitous computing devices are not personal computers, but very tiny, even invisible devices. They can be either mobile or embedded in almost any type of object imaginable, including cars, tools, appliances, clothing and various consumer goods, all communicating through increasingly interconnected networks.

Vision: Ubiquitous computing is also known as Pervasive Computing. Some statements of the vision are: "Computing everywhere for everyone", "Embed computing devices in the environment", and "Keep the computers in the background".

6.2 Applications:

1. Health Care: Pervasive computing offers opportunities for future healthcare provision both
for treating and managing disease, and for patient administration. Remote sensors and
monitoring technology might allow the continuous capture and analysis of patients’
physiological data. Medical staff could be immediately alerted to any detected irregularities.

2. Environmental monitoring: Pervasive computing provides improved methods to monitor the


environment. It will allow for continuous real-time data collection and analysis via remote,
wireless devices. This poses significant challenges for PCS developers. Devices may be required
to withstand harsh environmental conditions (such as heat, cold and humidity).

3. Intelligent transport systems: Such systems seek to bring together information and
telecommunications technologies in a collaborative scheme to improve the safety, efficiency and
productivity of transport network. Electronic devices could be directly integrated into the
transport infrastructure, and into vehicles themselves, with the aim of better monitoring and
managing the movement of vehicles within road, rail, air and sea transport systems.

4. Communication: As a cross-application area, communications affects all forms of


exchange and transmission of data, information, and knowledge. Communications thus
represents a precondition for all information technology domains.

5. Logistics: Tracking logistical goods along the entire transport chain of raw materials, semi-
finished articles, and finished products (including their eventual disposal) closes the gap in IT
control systems between the physical flow and the information flow. This offers opportunities for
optimizing and automating logistics that are already apparent today.

6. Motor traffic: Automobiles already contain several assistance systems that support the driver
invisibly. Networking vehicles with each other and with surrounding telemetric systems is
anticipated for the future.

7. Military: The military sector requires the provision of information on averting and fighting
external threats that is as close-meshed, multi-dimensional, and interrelated as possible. This
comprises the collection and processing of information. It also includes the development of new
weapons systems.
8. Production: In the smart factory, the flow and processing of components within
manufacturing are controlled by the components and by the processing and transport stations
themselves. Ubiquitous computing will facilitate a decentralized production system that will
independently configure, control and monitor itself.

9. E-commerce: The smart objects of ubiquitous computing allow for new business models
with a variety of digital services to be implemented. These include location-based services, a
shift from selling products to renting them, and software agents that will instruct components in
ubiquitous computing to initiate and carry out services and business transactions independently.

10. Inner security: Identification systems, such as electronic passport and the already abundant
smart cards, are applications of ubiquitous computing in inner security. In the future, monitoring
systems will become increasingly important—for instance, in protecting the environment or
surveillance of key infrastructure such as airports and the power grid.

11. Medical technology: Increasingly autarkic, multifunctional, miniaturized and networked


medical applications in ubiquitous computing offer a wide range of possibilities for monitoring the health
of the elderly in their own homes, as well as for intelligent implants.

6.3 Challenges and Requirements

 Hardware:- Ubicomp hardware itself must be small, cheap and power-efficient enough to be embedded unobtrusively into everyday objects.

 User Interfaces:- The multitude of different Ubicomp devices with their different sizes of
displays and interaction capabilities represents another challenge.

 Networking:- Another key driver for the final transition will be the use of short-range
wireless as well as traditional wired technologies. Wireless computing refers to the use of
wireless technology to connect computers to a network.
 Mobility:- First of all, the middleware has to deal with the mobility of the users and other
physical objects. Therefore it must support mobile wireless communication and small mobile
computing devices, such as PDAs and laptops.
 Scalability:- In a ubiquitous computing environment, where possibly thousands and thousands of devices are part of the system, scalability of the whole system is a key requirement. Since all the devices are autonomous and must be able to operate independently, a decentralized management approach will most likely be most suitable.

 Reliability:- The reliability of ubiquitous services and devices is a crucial requirement. In order to construct reliable systems, self-monitoring, self-regulating and self-healing features, like those found in biology, might be a solution.

 Interoperability:- This will probably be one of the major factors for the success or failure of
the Ubicomp vision. This diversity will make it impossible for there to be only one agreed standard.

 Resource Discovery:- The ability of devices to describe their behaviour to the network is a
key requirement. On the other hand, it cannot be assumed that devices in a ubiquitous
environment have prior knowledge of the capabilities of other occupants.

 Privacy and Security:- In a fully networked world with ubiquitous, sensor-equipped devices, several privacy and security issues arise. People in this environment will be worried about their privacy, since there is the potential for total monitoring. Privacy protection must therefore be understandable by the user and must be modelled into the system architecture.
6.5 Human Computer Interaction

Fig 6.1: Human Computer Interaction in Ubiquitous Computing.

Ubiquitous computing has high prospects for human life along with certain challenges across
computer science, system design, system engineering, system modelling and in Human
Computer Interactions (HCI) design. In case of Human Computer Interactions (HCI) there are
certain requirements and challenges for ubiquitous computing like minimum user attention in
order to enable them to focus on tasks rather than technology.

Traditional Human computer Interaction models in the form command line, menu driven or
Graphical User Interface (GUI) are inadequate and insufficient to meet the unique requirements
of the ubiquitous computing environment.

The smart objects of ubiquitous computing require developers who design user interfaces that
move beyond the formerly dominant monitor/keyboard principle. Most objects will have a
variety of interfaces to their environment, but these will not include visualization components.
Moreover, there will be many implicit interactions in which the user will have little or no
involvement in the computing process, to avoid flooding the user with information. Even so, the
user must be given the option of controlling the activities of ubiquitous computing by means of
an appropriate human-machine interface.

The human-machine interface is not a self-contained field of technology. It is instead an


interdisciplinary challenge that draws on such fields as computer science, ergonomics, the
cognitive sciences and microelectronics.

The design of human-machine interfaces is an important activity for most of the leading players
in consumer electronics and computer systems. Companies such as Microsoft and Siemens
maintain their own usability labs in order to test their products. The auto industry and its
suppliers, e.g., Toyota, BMW and Mitsubishi, or their suppliers Emerson and Siemens VDO, are
also working intensively on the interfaces of their driver assistance systems.

A central challenge for the human-machine interface is to construct a semantic model of the real
world, which would allow the meaning of a spoken sentence to be understood, for example. Such
models have been developed for specific domains such as medicine, but a general model is yet to
evolve.

These developments are currently getting a strong boost from the semantic web initiative of the
Internet standards organization- the World-wide Web Consortium. Semantic Web comprises a
collection of standards for classification systems such as RDF and OWL, which model the real
world in networks of concepts. Whether and how this approach might impact real applications
has yet to be seen.

Technology experts believe that human-machine interfaces will play a rather average role for the
evolution of ubiquitous computing. Speech technology is seen as particularly relevant, but also
as a possible technological bottleneck. The visionary approaches of gestures and implants are
believed to be less relevant for further evolution of ubiquitous computing.
6.6 Architectural Design for UbiCom Systems: Smart DEI Model
Three basic architectural design patterns for ubiquitous ICT system: smart devices, smart
environment and smart interaction are proposed. Here the concept smart simply means that the
entity is active, digital, networked, can operate to some extent autonomously, is reconfigurable
and has local control of the resources it needs such as energy, data storage, etc.

It follows that these three main types of system design may themselves contain sub-systems and
components at a lower level of granularity that may also be considered smart, e.g., a smart
environment device may consist of smart sensors and a smart controller, etc. There is even smart
dust. An illustrative example of how these three types of models can be deployed is given in the figure below. There are many examples of sub-types of smarts for each of the three basic types, which are discussed in detail later.

The three main types of smart design also overlap; they are not mutually exclusive. Smart devices may also support smart interaction. Smart mobile devices can be used for control in addition to the use of static embedded environment devices.

Smart devices may be used to support the virtual viewpoints of smart personal (physical
environment) spaces in a personal space that accompanies the user wherever they are. Smart DEI
also refers to hybrid models that combine the designs of smart device, smart environments and
smart interaction.
Figure 6.2: A UbiCom system model. The dotted line indicates the UbiCom system
boundary

6.7 Smart Devices

Smart devices, e.g., personal computer, mobile phone, tend to be multi-purpose ICT devices,
operating as a single portal to access sets of popular multiple application services that may reside
locally on the device or remotely on servers. There is a range of forms for smart devices. Smart
devices tend to be personal devices, having a specified owner or user. In the smart device model,
the locus of control and user interface reside in the smart device.

The main characteristics of smart devices are as follows: mobility, dynamic service discovery
and intermittent resource access (concurrency, upgrading, etc.). Devices are often designed to be
multi-functional because these ease access to, and simplify the interoperability of, multi-
functions at run-time. However, the trade-off is in a decreased openness of the system to
maintain (upgrade) hardware components and to support more dynamic flexible run-time
interoperability.
1. Weiser’s ICT Device Forms: Tabs, Pads and Boards

We often think of computers primarily
in terms of the multi-application personal or server computers, as devices with some type of
screen display for data output and a keyboard and some sort of pointing devices for data input.
As humans, we routinely interact with many more devices that have single embedded computers
in them, such as household appliances, and with complex machines that have multiple embedded
computers in them.
Weiser noted that there was a trend away from many people per computer, to one computer per
person, through to many computers per person. Computer-based devices tend to become smaller
and lighter in weight, cheaper to produce. Thus devices can become prevalent, made more
portable and can appear less obtrusive.
Weiser considered a range of device sizes in his early work from wearable centimetre-sized
devices (tabs), to hand-held decimetre-sized devices (pads) to metre-sized (boards) displays. ICT
Pads to enable people to access mobile services and ICT tabs to track goods are in widespread
use. Wall displays are useful for viewing by multiple people, for collaborative working and for
viewing large complex structures such as maps. Board devices may also be used horizontally as
surface computers as well used in a vertical position.

2. Extended Forms for ICT Devices: Dust, Skin and Clay

The three forms proposed by Weiser
(1991) for devices, tabs, pads and boards, are characterised by: being macro-sized, having a
planar form and by incorporating visual output displays.
If we relax each of these three characteristics, we can expand this range into a much more
diverse and potential more useful range of ubiquitous computing devices. First, ICT devices can
be miniaturised without visual output displays, e.g., Micro Electro Mechanical Systems
(MEMS), ranging from nanometres through micrometers to millimetres. This form is called
Smart Dust. Some of these can combine multiple tiny mechanical and electronic components,
enabling an increasing set of functions to be embedded into ICT devices, the physical
environment and humans.
Today MEMS, such as accelerometers, are incorporated into many devices such as laptops to
sense falling and to park moving components such as disk arms, are being increasingly
embedded into widely accessed systems. They are also used in many devices to support gesture-
based interaction. Miniaturisation accompanied by cheap manufacturing is a core enabler for the vision of ubiquitous computing. Second, fabrics based upon light-emitting and conductive
polymers, organic computer devices, can be formed into more flexible non-planar display
surfaces and products such as clothes and curtains.
MEMS devices can also be painted onto various surfaces so that a variety of physical world
structures can act as networked surfaces of MEMS. This form is called Smart Skins. Third,
ensembles of MEMS can be formed into arbitrary three-dimensional shapes as artefacts
resembling many different kinds of physical object. This form is called Smart Clay.

3. Mobility: Mobile devices usually refer to communicators, multimedia entertainment and


business processing devices designed to be transported by their human owners, e.g., mobile
phone, games consoles, etc.

There is a range of different types of mobiles as follows:

 Accompanied: these are devices that are not worn or implanted. They can either be portable
or hand-held, separate from, but carried in clothes or fashion accessories.
 Portable: such as laptop computers which are oriented to two-handed operation while seated.
These are generally the highest resource devices.
 Hand-held: devices are usually operated one handed and on occasion hands-free, combining
multiple applications such as communication, audio-video recording and playback and
mobile office. These are low resource devices.
 Wearable: devices such as accessories and jewellery are usually operated hands-free and
operate autonomously, e.g., watches that act as personal information managers, earpieces that
act as audio transceivers, glasses that act as visual transceivers and contact lenses. These are
low resource devices.
 Implanted or embedded: these are often used for medical reasons to augment human
functions, e.g., a heart pacemaker. They may also be used to enhance the abilities of
physically and mentally able humans. Implants may be silicon-based macro- or micro-sized
integrated circuits or they may be carbon-based, e.g., nanotechnology.
Static can be regarded as an antonym for mobile. Static devices tend to be moved before
installation to a fixed location and then reside there for their full operational life-cycle. They tend
to use a continuous network connection and fixed energy source.
They can incorporate high levels of local computation resources, e.g., personal computer, AV
recorders and players, various home and office appliances, etc. The division between statics and
mobiles can be more finely grained. For example, statics could move between sessions of usage,
e.g., a mobile circus containing different leisure rides in contrast to the rides in a fixed leisure
park.

4. Volatile Service Access: Mobiles tend to use wireless networks. However, mobiles may be intermittently connected to either wireless networks (WAN is not always available) or to wired communications networks (moving from LAN to LAN), or to both. Service access by smart mobile devices is characterised as follows.

Intermittent service access: devices access software services and hardware intermittently. This may be because resources are finite and demand exceeds supply, e.g., a device runs out of energy and needs to wait for it to be replenished, or because resources are not continually accessible.

Service discovery: devices can dynamically discover available services or even changes in the service context. Devices can discover the availability of local access networks and link via core networks to remote network home services. They can discover local resources and balance the cost and availability of local access versus remote access to services. Devices can be designed to access services that they discover on an intermittent basis. Context-aware discovery can improve basic discovery by limiting discovery to the services of interest, rather than needing to be notified of many services that do not match the context.

With asymmetric remote service access, more downloads than uploads tend to occur. This is in part due to the limited local resources. For example, because of the greater power needed to transmit rather than receive communication and the limited power capacity, high power consumption results in more received than sent calls. Apart from the ability to create and transmit voice signals, earlier phones were designed to be transceivers and players. More recently, because of miniaturisation, mobile devices not only act as multimedia players, they can also act as multimedia recorders and as content sources.
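
As a small illustration of context-aware discovery, the Python sketch below filters the services advertised to a device down to those matching its current context (here, service type and location); all service names are invented for the example.

# Illustrative context-aware service discovery filter; service data are made up.
ADVERTISED_SERVICES = [
    {"name": "printer-3rd-floor", "type": "printer", "location": "floor-3"},
    {"name": "cafe-menu",         "type": "menu",    "location": "floor-1"},
    {"name": "printer-1st-floor", "type": "printer", "location": "floor-1"},
]

def discover(services, wanted_type, current_location):
    """Return only services of the wanted type at the device's current location."""
    return [s for s in services
            if s["type"] == wanted_type and s["location"] == current_location]

if __name__ == "__main__":
    # A device on floor 1 looking for a printer sees one match, not three adverts.
    print(discover(ADVERTISED_SERVICES, "printer", "floor-1"))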

5. Situated and Self-Aware: Smart devices although they are capable of remote access to any
Internet services, tend to use various contexts to filter information and service access. For
examples, devices may operate to focus on local views of the physical environments, maps,
and to access local services such as restaurants and hotels.

Mobiles are often designed to work with a reference location in the physical environment called
a home location, e.g., mobile network nodes report their temporary location addresses to a home
server which is used to help coordinate the mobility. Service providers often charge access to
services for mobile service access based upon how remote they are with respect to a reference
ICT location, a home ICT location. During transit, mobiles tend to reference a route from a start
location to a destination location.

Mobile devices support limited local hardware, physical, and software resources in terms of
power, screen, CPU, memory, etc. They are ICT resource constrained. Services that are accessed
or pushed to use such devices must be aware of these limitations, otherwise the resource
utilisation by services will not be optimal and may be wasted, e.g., receiving content in a format
that cannot be played.

In the latter case, the mobile device could act as an intermediary to output this content to another
device where it can be played. Mobile devices tend to use a finite internal energy cache in
contrast to an external energy supply, enhancing mobility. The internal energy supply may be
replenished from a natural renewable external source, e.g., solar power, or from an artificial energy grid: energy self-sufficiency.

This is particularly important for low-maintenance, tetherless devices. Devices can
automatically configure themselves to support different functions based upon the energy
available. Without an internal energy cache, the mobility of devices may be limited by the
length of a power cable it is connected to. There is usually a one-to-one relationship between
mobiles and their owners. Devices’ configuration and operation tends to be personalised, to
support the concept of a personal information and service space which accompanies people
wherever they are.

6.8 Smart Environments


In a smart environment, computation is seamlessly used to enhance ordinary activities (Coen). Cook and Das (2007) refer to a smart environment as ‘one that is able to acquire and apply
knowledge about the environment and its inhabitants in order to improve their experience in that
environment’. A smart environment consists of a set of networked devices that have some
connection to the physical world.
Unlike smart devices, the devices that comprise a smart environment usually execute a single
predefined task, e.g., motion or body heat sensors coupled to a door release and lock control.
Embedded environment components can be designed to automatically respond to or to anticipate
users’ interaction using iHCI (implicit human–computer interaction), e.g., a person walks
towards a closed door, so the door automatically opens.
Hence, smart environments support a bounded, local context of user interaction. Smart
environment devices may also be fixed in the physical world at a location or mobile, e.g., air-
born. Smart environments could necessitate novel and revolutionary upgrades to be incorporated
into the environment in order to support less obtrusive interaction, e.g., pressure sensors can be
incorporated into surfaces to detect when people sit down or walk.
A more evolutionary approach could impart minimal modifications to the environment through
embedding devices such as surface mounted wireless sensor devices, cameras and microphones.

6.9 Tagging, Sensing and Controlling Environments

Smart environment devices support several types of interaction with environments such as the
physical environment as follows:

• Tagging and annotating the physical environment: tags, e.g., RFID tags, can be attached
to physical objects. Tag readers can be used to find the location of tags and to track them. Virtual
tags can be attached to virtual views of the environment, e.g., a tag can be attached to a location
in a virtual map.
• Sensing or monitoring the physical environment: Transducers take inputs from the physical
environment to convert some phenomena in the physical world into electrical signals that can be
digitised, e.g., how much ink is in a printer’s cartridges. Sensors provide the raw information
about the state of the physical environment as input to help determine the context in a context
aware system. Sensing is often a pre-stage to filtering and adapting.

• Filtering: a system forms an abstract or virtual view of part of its environment such as the
physical world. This reduces the number of features in the view and enables viewers to focus on
the features of interest.

• Adapting: system behaviour can adapt to the features of interest in the environment or adapt to
changes in the environment, e.g., a physical environment route is based upon the relation of the
current location to a destination location.

• Controlling the physical world: Controllers normally require sensors to determine the state of
the physical phenomena e.g., heating or cooling systems that sense the temperature in an
environment. Controlling can involve actions to modify the state of environment, to cause it to
transition to another state. Control may involve changing the order (assembly) of artefacts in the
environment or may involve regulation of the physical environment.

• Assembling: robots are used to act on a part of the physical world. There is a variety of robots.
They may be pre-programmed to schedule a series of actions in the world to achieve some goal,
e.g., a robot can incorporate sensors to detect objects in a source location, move them and stack
them in a destination location.

• Regulating: Regulators tend to work in a fixed location, e.g., a heating system uses feedback
control to regulate the temperature in an environment within a selected range (a minimal sketch
of such a feedback loop is given below).
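
As an illustration of feedback regulation, here is a minimal Python sketch of an on/off
temperature regulator. It assumes hypothetical read_temperature(), heater_on() and heater_off()
functions standing in for real sensor and actuator drivers; it only shows the sense-decide-act
loop, not a real control system.

    import time

    TARGET = 21.0      # desired temperature in degrees Celsius
    TOLERANCE = 0.5    # allowed deviation before the regulator acts

    def regulate(read_temperature, heater_on, heater_off, period=10):
        """Simple on/off (bang-bang) feedback control loop.

        read_temperature, heater_on and heater_off are hypothetical
        callables standing in for the real sensor and actuator drivers.
        """
        while True:
            current = read_temperature()          # sense the physical phenomenon
            if current < TARGET - TOLERANCE:      # too cold -> switch heating on
                heater_on()
            elif current > TARGET + TOLERANCE:    # warm enough -> switch heating off
                heater_off()
            time.sleep(period)                    # re-check periodically
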
6.10 Context-Aware Systems

Context is the set of facts or circumstances that surround a situation or event. It deals with:

• The parts of a discourse that surround a word or passage and can throw light on its
meaning.
• The interrelated conditions in which something exists or occurs.

1. Context representation mainly deals with: models for structuring context, general context
models, context classification models, and special models, e.g., location models.

2. Context processing mainly deals with: the flow of context data, storage of context, context
abstraction, integration of context data, and transformation of representation forms.

3. Context communication mainly deals with: distributed communication of context in the
environment, and central storage, processing and communication.

Introduction: Context in UbiCom


• Context-acquisition: How can facts that characterize a situation be retrieved?
• Context-communication: How can facts be communicated to other entities of a system?
• Context-usage: How can applications be adapted to a situation and react accordingly?

Perception of Context
• Through examination of internal state -> internal context.
• Through examination of the surrounding environment through the object's own embedded
sensor system -> Smart Artefacts.
• Through examination of the surrounding environment with the help of sensors embedded
into the environment -> Smart Environments.
Examples
• MediaCup: Smart Artefacts
• Aware Home / Context Toolkit: Smart Environments

6.10.1 Architecture of Context-Aware Systems

Context-aware systems can be implemented in many ways. The approach depends on special
requirements and conditions such as the location of sensors (local or remote), the number of
possible users (one user or many), the available resources of the used devices (high-end PCs or
small mobile devices) or the facility of a further extension of the system. Furthermore, the
method of context-data acquisition is very important when designing context-aware systems
because it predefines the architectural style of the system at least to some extent.

Chen (2004) presents three different approaches to acquiring contextual information:

Direct sensor access: This approach is often used in devices with sensors locally built in. The
client software gathers the desired information directly from these sensors, i.e., there is no
additional layer for gaining and processing sensor data. Drivers for the sensors are hardwired
into the application, so this tightly coupled method is usable only in rare cases. Therefore, it is
not suited for distributed systems due to its direct access nature which lacks a component capable
of managing multiple concurrent sensor accesses.

Middleware infrastructure: Modern software design uses methods of encapsulation to separate,
e.g., business logic and graphical user interfaces. The middleware-based approach introduces a
layered architecture to context-aware systems with the intention of hiding low-level sensing
details. Compared to direct sensor access, this technique eases extensibility since the client code
no longer has to be modified, and it simplifies the reusability of hardware-dependent sensing
code due to the strict encapsulation.

Context server: The next logical step is to permit multiple clients access to remote data sources.
This distributed approach extends the middleware based architecture by introducing an access
managing remote component. Gathering sensor data is moved to this so-called context server to
facilitate concurrent multiple access. Besides the reuse of sensors, the usage of a context server
has the advantage of relieving clients of resource intensive operations.

As probably the majority of end devices used in context-aware systems are mobile gadgets with
limitations in computation power, disk space etc., this is an important aspect. In return, one has
to consider appropriate protocols, network performance, quality of service parameters etc. when
designing a context-aware system based on a client-server architecture.
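
As an illustration of the context-server idea, the following minimal Python sketch shows one
shared access point that serialises concurrent sensor access for multiple clients. The
ContextServer class and the read_sensor callable are hypothetical; a real deployment would
expose this component over a network protocol and address the performance and quality-of-service
issues noted above.

    import threading

    class ContextServer:
        """Hypothetical context server: one shared access point to a sensor."""
        def __init__(self, read_sensor):
            self._read_sensor = read_sensor     # the hardware-dependent driver
            self._lock = threading.Lock()       # serialise concurrent access
            self._last_value = None

        def query(self):
            with self._lock:                    # many clients, one sensor
                self._last_value = self._read_sensor()
                return self._last_value

    # Two "clients" sharing the same server (and therefore the same sensor).
    server = ContextServer(read_sensor=lambda: 21.5)
    for client in ("client-a", "client-b"):
        print(client, server.query())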

Fig 6.3: Layered conceptual framework for context-aware systems (layers, from top to bottom:
Application, Storage/Management, Preprocessing, Raw data retrieval, Sensors).

As mentioned above, a separation of detecting and using context is necessary to improve
extensibility and reusability of systems. The following layered conceptual architecture, as
depicted in above Figure, augments layers for detecting and using context by adding interpreting
and reasoning functionality.

The first layer consists of a collection of different sensors. It is notable that the word “sensor” not
only refers to sensing hardware but also to every data source which may provide usable context
information.

Concerning the way data is captured, sensors can be classified into three groups: physical,
virtual and logical sensors.

Physical sensors. The most frequently used type of sensor is the physical sensor. Many hardware
sensors are available nowadays which are capable of capturing almost any physical data.

Table 6.1 shows some examples of physical sensors.

Type of context: Available sensors
Light: photodiodes, color sensors, IR and UV sensors, etc.
Visual context: various cameras
Audio: microphones
Motion, acceleration: mercury switches, angular sensors, accelerometers, motion detectors, magnetic fields
Location: outdoor: Global Positioning System (GPS), Global System for Mobile Communications (GSM); indoor: Active Badge system, etc.
Touch: touch sensors implemented in mobile devices
Temperature: thermometers
Physical attributes: biosensors to measure skin resistance, blood pressure

Table 6.1: Commonly used physical sensor types


Virtual sensors. Virtual sensors source context data from software applications or services. For
example, it is possible to determine an employee’s location not only by using tracking systems
(physical sensors) but also by a virtual sensor, e.g., by browsing an electronic calendar, a travel-
booking system, emails etc. for location information. Other context attributes that can be sensed
by virtual sensors include, e.g., the user’s activity by checking for mouse-movement and
keyboard input.

Logical sensors. These sensors make use of several information sources, combining physical
and virtual sensors with additional information from databases or various other sources in order
to solve higher-level tasks. For example, a logical sensor can be constructed to detect an
employee's current position by analyzing logins at desktop PCs and a database mapping of
devices to location information.
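
The employee-location example can be sketched in a few lines of Python. The recent_logins()
function and the device_location mapping below are hypothetical stand-ins for the virtual login
sensor and the device-to-location database.

    # Hypothetical device-to-location mapping (stands in for a database table).
    device_location = {
        "pc-lab-04": "Room 2.14",
        "pc-office-17": "Room 3.02",
    }

    def recent_logins():
        """Virtual sensor: would normally query the login records of the
        desktop PCs; returns (user, device) pairs, newest first."""
        return [("alice", "pc-office-17"), ("bob", "pc-lab-04")]

    def locate_employee(user):
        """Logical sensor: combines the virtual login sensor with the
        device-to-location database to infer a person's current position."""
        for login_user, device in recent_logins():
            if login_user == user:
                return device_location.get(device, "unknown")
        return "unknown"

    print(locate_employee("alice"))   # -> Room 3.02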

The second layer is responsible for the retrieval of raw context data. It makes use of appropriate
drivers for physical sensors and APIs for virtual and logical sensors. The query functionality is
often implemented in reusable software components which make low-level details of hardware
access transparent by providing more abstract methods such as getPosition().

By using interfaces for components responsible for the same type of context, these components
become exchangeable. Therefore, it is possible, for instance, to replace an RFID system with a
GPS system without any major modification in the current and upper layers.
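
A minimal Python sketch of such exchangeable components is given below; the GPSSensor and
RFIDSensor classes and their return values are hypothetical, the point being that both implement
the same get_position() interface, so the upper layers need not change when one replaces the
other.

    from abc import ABC, abstractmethod

    class PositionSensor(ABC):
        """Common interface for all position-sensing components."""
        @abstractmethod
        def get_position(self):
            ...

    class GPSSensor(PositionSensor):
        def get_position(self):
            # would read latitude/longitude from the GPS driver
            return ("gps", 51.4776, -0.0005)

    class RFIDSensor(PositionSensor):
        def get_position(self):
            # would report the identifier of the last tag reader that saw the badge
            return ("rfid", "reader-entrance-2")

    def current_position(sensor: PositionSensor):
        # upper layers depend only on the interface, not on the concrete sensor
        return sensor.get_position()

    print(current_position(GPSSensor()))
    print(current_position(RFIDSensor()))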

The Preprocessing layer is not implemented in every context-aware system but may offer useful
information if the raw data are too coarse grained. The preprocessing layer is responsible for
reasoning and interpreting contextual information.

The sensors queried in the underlying layer most often return technical data that are not
appropriate for use by application designers. Hence this layer raises the results of layer two to a
higher abstraction level. The transformations include extraction and quantization operations. For
example, the exact GPS position of a person might not be of value for an application, but the
name of the room the person is in may be.
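
A minimal Python sketch of such a quantization step is shown below, assuming a hypothetical
table of room bounding boxes; the raw coordinates delivered by layer two are mapped to a room
name that is meaningful to the application.

    # Hypothetical bounding boxes: room name -> (min_lat, max_lat, min_lon, max_lon)
    ROOMS = {
        "Meeting Room A": (48.1500, 48.1502, 11.5800, 11.5803),
        "Lab 1":          (48.1503, 48.1505, 11.5804, 11.5807),
    }

    def room_of(lat, lon):
        """Raise raw GPS output to a higher abstraction level (a room name)."""
        for name, (lat_min, lat_max, lon_min, lon_max) in ROOMS.items():
            if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max:
                return name
        return "unknown"

    print(room_of(48.1504, 11.5805))   # -> Lab 1
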
The fourth layer, Storage and Management, organizes the gathered data and offers them via a
public interface to the client. Clients may gain access in two different ways, synchronous and
asynchronous. In the synchronous manner the client is polling the server for changes via remote
method calls. Therefore, it sends a message requesting some kind of offered data and pauses
until it receives the server’s answer.

The asynchronous mode works via subscriptions. Each client subscribes to specific events it is
interested in. On occurrence of one of these events, the client is either simply notified or one of
the client's methods is directly invoked using a callback. In the majority of cases the
asynchronous approach is more suitable due to rapid changes in the underlying context. The
polling technique is more resource intensive, as context data has to be requested quite often and
the application has to check for changes itself, using some kind of context history.
The client is realized in the fifth layer, the Application layer. The actual reaction to different
events and context instances is implemented here. Sometimes information retrieval and
application-specific context management and reasoning are encapsulated in the form of agents
which communicate with the context server and act as an additional layer between the
preprocessing and the application layer (Chen, 2004). An example of context logic at the client
side is the display on mobile devices: when a light sensor detects poor illumination, text may be
displayed in higher color contrast.
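
The asynchronous subscription mode and this application-layer reaction can be sketched
together in Python. The ContextStore class below is a hypothetical stand-in for the Storage and
Management layer, and the callback plays the role of a client that switches to a high-contrast
display when the light sensor reports poor illumination.

    class ContextStore:
        """Hypothetical storage/management layer with asynchronous access."""
        def __init__(self):
            self._values = {}
            self._subscribers = {}          # event name -> list of callbacks

        def subscribe(self, event, callback):
            self._subscribers.setdefault(event, []).append(callback)

        def update(self, event, value):
            self._values[event] = value
            for callback in self._subscribers.get(event, []):
                callback(value)             # notify interested clients

    # Application layer: adapt the display when illumination is poor.
    def on_illumination_change(lux):
        if lux < 50:
            print("low light detected -> switching to high-contrast display")
        else:
            print("normal light -> default display settings")

    store = ContextStore()
    store.subscribe("illumination", on_illumination_change)
    store.update("illumination", 20)   # value pushed up from the light sensor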

6.10.2 Context Models

A context model is needed to define and store context data in a machine-processable form.
Developing flexible and usable context ontologies that cover the wide range of possible contexts
is a challenging task.

Key-Value Models: These models represent the simplest data structure for context modeling.
They are frequently used in various service frameworks, where the key value pairs are used to
describe the capabilities of a service. Service discovery is then applied by using matching
algorithms which use these key-value pairs.
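
A minimal Python sketch of key-value matching for service discovery is shown below; the
service descriptions are hypothetical, and matching simply checks that every requested key-value
pair occurs in a service's description.

    # Each service is described by a flat set of key-value pairs.
    services = {
        "printer-3rd-floor": {"type": "printer", "colour": "yes", "location": "floor3"},
        "printer-lobby":     {"type": "printer", "colour": "no",  "location": "lobby"},
    }

    def discover(required):
        """Return the names of all services whose description contains
        every requested key-value pair."""
        return [name for name, descr in services.items()
                if all(descr.get(k) == v for k, v in required.items())]

    print(discover({"type": "printer", "colour": "yes"}))   # -> ['printer-3rd-floor']
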
Markup Scheme Models: All markup based models use a hierarchical data structure consisting
of markup tags with attributes and content. Profiles represent typical markup-scheme models.
Typical examples for such profiles are the Composite Capabilities / Preference Profile (CC/PP)
(W3C, 2004a) and User Agent Profile (UAProf) (Wapforum, 2001), which are encoded in
RDF/S.

Graphical Models: The Unified Modeling Language (UML) is also suitable for modeling
context. Various approaches exist where contextual aspects are modeled by using UML, e.g.,
Sheng and Benatallah (2005).

Object Oriented Models. Modeling context by using object-oriented techniques offers the full
power of object orientation (e.g., encapsulation, reusability, inheritance). Existing approaches
use various objects to represent different context types (such as temperature, location, etc.), and
encapsulate the details of context processing and representation. Access to the context and the
context processing logic is provided by well-defined interfaces.
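
A minimal object-oriented sketch in Python is given below, with hypothetical
TemperatureContext and LocationContext classes; the processing details of each context type
are encapsulated behind a common value() interface.

    class Context:
        """Base class: each context type encapsulates its own processing."""
        def value(self):
            raise NotImplementedError

    class TemperatureContext(Context):
        def __init__(self, raw_millivolts):
            self._raw = raw_millivolts
        def value(self):
            # encapsulated conversion from the raw sensor reading to degrees Celsius
            return self._raw * 0.1

    class LocationContext(Context):
        def __init__(self, room):
            self._room = room
        def value(self):
            return self._room

    for ctx in (TemperatureContext(215), LocationContext("Room 3.02")):
        print(type(ctx).__name__, ctx.value())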

Logic Based Models. Logic-based models have a high degree of formality. Typically, facts,
expressions and rules are used to define a context model. A logic-based system is then used to
manage the aforementioned terms and allows facts to be added, updated or removed. The
inference (also called reasoning) process can be used to derive new facts based on the existing
rules in the system. The contextual information needs to be represented in a formal way as facts.
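
A minimal Python sketch of a logic-based model is shown below, using hypothetical facts and a
single forward-chaining rule; the inference step derives a new fact (that a person is in a meeting)
from the existing facts.

    # Facts are simple triples: (subject, predicate, object).
    facts = {
        ("alice", "located_in", "Meeting Room A"),
        ("Meeting Room A", "is_a", "meeting_room"),
    }

    def infer(facts):
        """One forward-chaining rule: if X is located in a meeting room,
        then X is in a meeting. Returns the set of newly derived facts."""
        derived = set()
        for (person, pred, place) in facts:
            if pred == "located_in" and (place, "is_a", "meeting_room") in facts:
                derived.add((person, "status", "in_meeting"))
        return derived

    facts |= infer(facts)
    print(("alice", "status", "in_meeting") in facts)   # -> True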

Ontology Based Models. Ontologies represent a description of the concepts and relationships.
Therefore, ontologies are a very promising instrument for modeling contextual information due
to their high and formal expressiveness and the possibilities for applying ontology reasoning
techniques. Various context-aware frameworks use ontologies as underlying context models.
Fig 6.4: Centre for Ubiquitous Communication by Light.

The Centre offers a variety of vehicles through which different entities can partner with it.
Its partnership programs allow partners to engage in the Centre's activities at different levels,
license technologies, contract to use facilities and equipment, attend seminars, and gain access to
the latest research and innovation.

Partners can contribute to the Centre by sponsoring employees or students to work directly in the
Centre, and by providing real-time comments on Centre operation and research thrusts. They can
also collaborate with the Centre to pursue state or government grants of joint interest.

The Centre also anticipates attention and support from different sponsors to enrich the Centre's
research. Sponsors may similarly choose to become involved.

The Centre is establishing an advisory board consisting of invited experts and representatives
from industry, government and academia, to assess and reshape the Centre's programs.

The Centre researchers consult advisory board members informally throughout the year, and the
board has a formal face-to-face meeting once per year around each Centre annual meeting. The
advisory board can evaluate the Centre's progress towards its metrics and benchmarks, and
identify potential opportunities and threats for the Centre to pursue.
6.11 Ubiquitous System Challenges and Outlook

Key Challenges

Key Challenges for each of the core UbiCom system properties are considered:

• Distributed

• Context-Aware

• iHCI

• Artificial Intelligent

• Autonomous

Fig 6.5: Ubiquitous System Challenges and Outlook.

Key Challenges: Distributed

• Reliability

• Openness

• Less clearly defined system boundary


• Synchronising data

• Privacy & security

• Event floods

• Ad hoc interactions

• Overwhelming choice, multiple versions, heterogeneity

• Reduced cohesion

• Increase in distributed computation and communication costs

Key Challenges: iHCI

• Users get overloaded.

• Disappearing technology problems

• Disruptions

• Ambiguous user intentions

• Loss of privacy & control

• Loss of presence in physical real-world

Key Challenges: Context-Awareness

• Localized scalability

• Unclear user goals and context

• Context adaptation leads to quicker commitments.

• Balancing system versus application versus user control of context adaptation


Key Challenges: Autonomous

• Loss of high value macro mobile resources

• Loss of many low value micro resources

• No-one wants to be an administrator

• Undesired or unintelligible adaptation

• Increase in Interdependencies

• Loss of control by user

Key Challenges: Intelligent

• System infers incorrectly.

• Greater reliance and dependencies on systems of systems interactions to operate.

• Systems learn to operate unsafely.

• Systems exceed normal human behaviour limits.

• Virtual organisations can masquerade as real organisations.

• Byzantine, disruptive and malicious behaviours.

1. Smart Interaction

• Smarter interaction between individual smart devices and smart environments is a key
enabler to promote richer, more seamless, personal, social and public spaces.

• Interaction with smart mobile & environment devices requires effective human–computer
interaction design to make these systems useful.

• Human interactions often need to be centred in the physical world rather than centred in
virtual computer devices.
• Key challenges for interaction in smart environments:

• The multiplicity of interactions increases and contexts can be hard to determine.

• Key challenges for using ubiquitous computing applications in home-type smart
environments (Edwards & Grinter, 2001) are: the "accidentally" smart home, impromptu
interoperability, no systems administrator, designing for domestic use, social implications of
aware home technologies, reliability, and inference in the presence of ambiguity.

• Their analysis can be generalised to smart (physical world) environment interaction.

• New design models of connectivity with wireless technologies are needed.
