
Unit 4

Infrastructure as a service

Definition:

Infrastructure can be considered any hardware or software resource used to execute and/or
control computational operations. Infrastructure as a Service (IaaS) can therefore be described
as a mechanism for renting infrastructure to an end user, who carries out infrastructure-dependent
work on it without having to implement that infrastructure. This lets the user concentrate on the
work itself and frees the user from the overhead of building and operating the required
infrastructure. The user is generally charged on a pay-per-use basis or according to the contract
conditions (Service Level Agreements, or SLAs) agreed between user and infrastructure provider.
Brief information on IaaS and the types of resources that can be offered as IaaS follows. Figure 8
shows a simple logical view of an Infrastructure as a Service scenario in cloud computing.
Load Balancing Classification
Metrics for load balancing

Policies of load balancing algorithms

Load Balancing Algorithms available:
Cloud Storage:
At the rate data is growing today, it's not surprising that cloud storage is also growing in
popularity. The fastest-growing data is archive data, which is ideal for cloud storage given a
number of factors, including cost, frequency of access, protection, and availability. But not all
cloud storage is the same. One provider may focus primarily on cost, while another focuses on
availability or performance. No single architecture fits every need; rather, the degree to which an
architecture implements a given characteristic defines its market and appropriate use models.
Frequently used acronyms
• API: Application programming interface
• FTP: File Transfer Protocol
• HTTP: Hypertext Transfer Protocol
• HTTPS: HTTP over Secure Sockets Layer
• JFS: Journaling file system
• NFS: Network file system
• NIC: Network interface card
• RAID: Redundant array of independent disks
• REST: Representational State Transfer
• SAN: Storage area network
• SCSI: Small Computer System Interface
• SLA: Service level agreement
• TCP: Transmission Control Protocol
• UDP: User Datagram Protocol
• WAN: Wide area network

It's difficult to talk about architectures without the perspective of utility. An architecture is
measured by a variety of characteristics, including cost, performance, remote access, and so
on. Therefore, I first define a set of criteria by which cloud storage models are measured, and
then explore some of the interesting implementations within cloud storage architectures.
First, let's discuss a general cloud storage architecture to set the context for the later exploration
of unique architectural features.
General architecture
Cloud storage architectures are primarily about delivery of storage on demand in a highly
scalable and multi-tenant way. Generically (see Figure 1), cloud storage architectures consist of a
front end that exports an API to access the storage. In traditional storage systems, this API is the
SCSI protocol; but in the cloud, these protocols are evolving. There, you can find Web service
front ends, file-based front ends, and even more traditional front ends (such as Internet SCSI, or
iSCSI). Behind the front end is a layer of middleware that I call the storage logic. This layer
implements a variety of features, such as replication and data reduction, over the traditional data-
placement algorithms (with consideration for geographic placement). Finally, the back end
implements the physical storage for data. This may be an internal protocol that implements
specific features or a traditional back end to the physical disks.
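To make the layering concrete, here is a minimal Python sketch of the three layers, assuming an in-memory back end and a naive hash-based placement policy; all class and method names are hypothetical and stand in for the Web-service front ends, replication middleware, and physical disks of a real system.

# Minimal sketch of the generic three-layer cloud storage architecture.
# Hypothetical names; real systems replace each layer with Web-service
# front ends, replication/data-reduction middleware, and physical disks.

class BackEnd:
    """Physical storage layer (modeled here as an in-memory dict)."""
    def __init__(self):
        self._blocks = {}

    def write(self, key, data):
        self._blocks[key] = data

    def read(self, key):
        return self._blocks[key]

class StorageLogic:
    """Middleware layer: places each object on several back ends (replication)."""
    def __init__(self, backends, replicas=2):
        self.backends = backends
        self.replicas = replicas

    def put(self, key, data):
        # Simple placement: hash the key, then write to N consecutive back ends.
        start = hash(key) % len(self.backends)
        for i in range(self.replicas):
            self.backends[(start + i) % len(self.backends)].write(key, data)

    def get(self, key):
        # Read from the primary replica chosen by the same placement rule.
        start = hash(key) % len(self.backends)
        return self.backends[start].read(key)

class FrontEnd:
    """Access layer: exports a minimal object API (PUT/GET by name)."""
    def __init__(self, logic):
        self.logic = logic

    def put_object(self, name, data):
        self.logic.put(name, data)

    def get_object(self, name):
        return self.logic.get(name)

# Usage: three back ends, two replicas per object.
store = FrontEnd(StorageLogic([BackEnd() for _ in range(3)], replicas=2))
store.put_object("report.txt", b"hello cloud")
print(store.get_object("report.txt"))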

Figure 1. Generic cloud storage architecture


From Figure 1, you can see some of the characteristics for current cloud storage architectures.
Note that these characteristics are not exclusive to any particular layer but serve as a guide for
specific topics that this article addresses. These characteristics are defined in Table 1.
Table 1. Cloud storage characteristics

Characteristic       Description

Manageability        The ability to manage a system with minimal resources

Access method        Protocol through which cloud storage is exposed

Performance          Performance as measured by bandwidth and latency

Multi-tenancy        Support for multiple users (or tenants)

Scalability          Ability to scale to meet higher demands or load in a graceful manner

Data availability    Measure of a system's uptime

Control              Ability to control a system, in particular to configure for cost, performance, or other characteristics

Storage efficiency   Measure of how efficiently the raw storage is used

Cost                 Measure of the cost of the storage (commonly in dollars per gigabyte)

Manageability
One key focus of cloud storage is cost. If a client can buy and manage storage locally for less
than the cost of leasing it in the cloud, the cloud storage market disappears. But cost can be divided into two
high-level categories: the cost of the physical storage ecosystem itself and the cost of managing
it. The management cost is hidden but represents a long-term component of the overall cost. For
this reason, cloud storage must be self-managing to a large extent. The ability to introduce new
storage where the system automatically self-configures to accommodate it and the ability to find
and self-heal in the presence of errors are critical. Concepts such as autonomic computing will
have a key role in cloud storage architectures in the future.
Access method
One of the most striking differences between cloud storage and traditional storage is the means
by which it's accessed (see Figure 2). Most providers implement multiple access methods, but
Web service APIs are common. Many of the APIs are implemented based on REST principles,
which imply an object-based scheme developed on top of HTTP (using HTTP as a transport).
REST APIs are stateless and therefore simple and efficient to provide. Many cloud storage
providers implement REST APIs, including Amazon Simple Storage Service (Amazon S3),
Windows Azure™, and Mezeo Cloud Storage Platform.
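As a hedged illustration of object access through a REST-style Web service API, the following sketch uses the boto3 library to store and retrieve an object in Amazon S3 (boto3 issues the signed HTTP requests under the hood). It assumes AWS credentials are already configured; the bucket name and object key are placeholders.

# Sketch: storing and fetching an object via Amazon S3's REST API using boto3.
# Assumes credentials are configured and the bucket already exists;
# "my-example-bucket" and the key are placeholder names.
import boto3

s3 = boto3.client("s3")

# PUT an object (a single HTTP PUT under the hood).
s3.put_object(Bucket="my-example-bucket", Key="docs/report.txt",
              Body=b"archived report data")

# GET it back (a single HTTP GET).
response = s3.get_object(Bucket="my-example-bucket", Key="docs/report.txt")
print(response["Body"].read())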
One problem with Web service APIs is that they require integration with an application to take
advantage of the cloud storage. Therefore, common access methods are also used with cloud
storage to provide immediate integration. For example, file-based protocols such as
NFS/Common Internet File System (CIFS) or FTP are used, as are block-based protocols such as
iSCSI. Cloud storage providers such as Six Degrees, Zetta, and Cleversafe provide these access
methods.
Although the protocols mentioned above are the most common, other protocols are suitable for
cloud storage. One of the most interesting is Web-based Distributed Authoring and Versioning
(WebDAV). WebDAV is also based on HTTP and enables the Web as a readable and writable
resource. Providers of WebDAV include Zetta and Cleversafe in addition to others.
Figure 2. Cloud storage access methods
You can also find solutions that support multi-protocol access. For example, IBM® Smart
Business Storage Cloud enables both file-based (NFS and CIFS) and SAN-based protocols from
the same storage-virtualization infrastructure.
Performance
There are many aspects to performance, but the ability to move data between a user and a remote
cloud storage provider represents the largest challenge to cloud storage. The problem, which is
also the workhorse of the Internet, is TCP. TCP controls the flow of data based on packet
acknowledgements from the peer endpoint. Packet loss or late arrival triggers congestion
control, which further limits performance to avoid wider network congestion. TCP is ideal
for moving small amounts of data through the global Internet but is less suitable for larger data
movement, with increasing round-trip time (RTT).
Amazon, through Aspera Software, solves this problem by removing TCP from the equation. A
new protocol called the Fast and Secure Protocol (FASP™) was developed to accelerate bulk
data movement in the face of large RTT and severe packet loss. The key is the use of UDP,
TCP's partner transport protocol. UDP leaves congestion management to the host,
pushing this aspect up into the application-layer protocol of FASP (see Figure 3).
Figure 3. The Fast and Secure Protocol from Aspera Software
Using standard (non-accelerated) NICs, FASP efficiently uses the bandwidth available to the
application and removes the fundamental bottlenecks of conventional bulk data-transfer
schemes. The Related topics section provides some interesting statistics on FASP performance
over traditional WAN, intercontinental transfers, and lossy satellite links.
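FASP itself is a commercial protocol, but the general idea of moving reliability and rate control out of TCP and into the application can be shown with a toy stop-and-wait sender over UDP. This is only a simplified sketch, not FASP: the address, port, block size, and acknowledgement format are assumptions, and a real bulk-transfer protocol pipelines many blocks and manages the sending rate explicitly.

# Toy illustration of application-layer reliability over UDP (not FASP itself).
# The sender transmits fixed-size blocks and retransmits any block whose
# acknowledgement does not arrive in time; rate/congestion control would also
# live at this layer in a real protocol.
import socket

def send_file(data, addr=("127.0.0.1", 9000), block_size=1024, timeout=0.5):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for seq, block in enumerate(blocks):
        packet = seq.to_bytes(4, "big") + block   # 4-byte sequence number header
        while True:
            sock.sendto(packet, addr)
            try:
                ack, _ = sock.recvfrom(4)
                if int.from_bytes(ack, "big") == seq:
                    break          # block acknowledged, move on
            except socket.timeout:
                pass               # lost packet or ack: retransmit
    sock.close()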
Multi-tenancy
One key characteristic of cloud storage architectures is called multi-tenancy. This simply means
that the storage is used by many users (or multiple "tenants"). Multi-tenancy applies to many
layers of the cloud storage stack, from the application layer, where the storage namespace is
segregated among users, to the storage layer, where physical storage can be segregated for
particular users or classes of users. Multi-tenancy even applies to the networking infrastructure
that connects users to storage to permit quality of service and carving bandwidth to a particular
user.
Scalability
You can look at scalability in a number of ways, but it is the on-demand view of cloud storage
that makes it most appealing. The ability to scale storage needs (both up and down) means
improved cost for the user and increased complexity for the cloud storage provider.
Scalability must be provided not only for the storage itself (functionality scaling) but also the
bandwidth to the storage (load scaling). Another key feature of cloud storage is geographic
distribution of data (geographic scalability), allowing the data to be nearest the users over a set of
cloud storage data centers (via migration). For read-only data, replication and distribution are
also possible (as is done using content delivery networks). This is shown in Figure 4.
Figure 4. Scalability of cloud storage
Internally, a cloud storage infrastructure must be able to scale. Servers and storage must be
capable of resizing without impact to users. As discussed in the Manageability section,
autonomic computing is a requirement for cloud storage architectures.
Availability
Once a cloud storage provider has a user's data, it must be able to provide that data back to the
user upon request. Given network outages, user errors, and other circumstances, this can be
difficult to provide in a reliable and deterministic way.
There are some interesting and novel schemes to address availability, such as information
dispersal. Cleversafe, a company that provides private cloud storage (discussed later), uses the
Information Dispersal Algorithm (IDA) to enable greater availability of data in the face of
physical failures and network outages. IDA, which was first created for telecommunication
systems by Michael Rabin, is an algorithm that allows data to be sliced with Reed-Solomon
codes for purposes of data reconstruction in the face of missing data. Further, IDA allows you to
configure the number of data slices, such that a given data object could be carved into four slices
with one tolerated failure or 20 slices with eight tolerated failures. Similar to RAID, IDA permits
the reconstruction of data from a subset of the original data, with some amount of overhead for
error codes (dependent on the number of tolerated failures). This is shown in Figure 5.
Figure 5. Cleversafe's approach to extreme data availability
With the ability to slice data using Cauchy Reed-Solomon correction codes, the slices can
then be distributed to geographically disparate sites for storage. For a number of slices (p) and a
number of tolerated failures (m), raw storage expands by a factor of p/(p - m), i.e., an overhead
of m/(p - m). So, in the case of Figure 5, the overhead to the storage system for p = 4 and m = 1 is 33%.
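The arithmetic behind these overhead figures is simple enough to check directly. The following snippet assumes overhead is measured relative to the usable data, so raw storage expands by a factor of p/(p - m):

# Overhead of information dispersal: p slices, any (p - m) of which suffice
# to reconstruct the data, so raw storage is p/(p - m) times the data size.
def ida_overhead(p, m):
    expansion = p / (p - m)        # raw bytes stored per byte of data
    return expansion - 1           # extra fraction beyond the data itself

print(f"p=4,  m=1 -> {ida_overhead(4, 1):.0%} overhead")          # 33%
print(f"p=20, m=8 -> {ida_overhead(20, 8):.0%} overhead")         # 67%
print(f"2-way replication -> {ida_overhead(2, 1):.0%} overhead")  # 100%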
The downside of IDA is that it is processing intensive without hardware acceleration. Replication
is another useful technique and is implemented by a variety of cloud storage providers. Although
replication introduces a large amount of overhead (100%), it's simple and efficient to provide.
Control
A customer's ability to control and manage how his or her data is stored and the costs associated
with it is important. Numerous cloud storage providers implement controls that give users greater
control over their costs.
Amazon implements Reduced Redundancy Storage (RRS) to provide users with a means of
minimizing overall storage costs. Data is replicated within the Amazon S3 infrastructure, but
with RRS, the data is replicated fewer times with the possibility for data loss. This is ideal for
data that can be recreated or that has copies that exist elsewhere.
Efficiency
Storage efficiency is an important characteristic of cloud storage infrastructures, particularly with
their focus on overall cost. The next section speaks to cost specifically, but this characteristic
speaks more to the efficient use of the available resources over their cost.
To make a storage system more efficient, more data must be stored relative to the raw physical
capacity. A common solution is data reduction, whereby the source data is reduced to require less physical space. Two means to
achieve this include compression—the reduction of data through encoding the data using a
different representation—and de-duplication—the removal of any identical copies of data that
may exist. Although both methods are useful, compression involves processing (re-encoding the
data into and out of the infrastructure), whereas de-duplication involves calculating signatures of
data to search for duplicates.
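As an illustration of both techniques, the sketch below compresses fixed-size chunks with zlib and uses SHA-256 content hashes to detect duplicates. The chunk size and in-memory index are illustrative choices only; production systems typically use content-defined chunking and persistent indexes.

# Illustrative data reduction: compression plus hash-based de-duplication.
import hashlib
import zlib

def reduce_data(data, chunk_size=4096):
    store = {}          # chunk hash -> compressed bytes (the deduplicated store)
    recipe = []         # ordered list of chunk hashes needed to rebuild the data
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:                 # new content: compress and keep it
            store[digest] = zlib.compress(chunk)
        recipe.append(digest)                   # duplicates only add a reference
    return store, recipe

data = b"the same block of data " * 10000       # highly redundant input
store, recipe = reduce_data(data)
stored_bytes = sum(len(c) for c in store.values())
print(f"original: {len(data)} bytes, stored: {stored_bytes} bytes, "
      f"unique chunks: {len(store)} of {len(recipe)}")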
Cost
One of the most notable characteristics of cloud storage is the ability to reduce cost through its
use. This includes the cost of purchasing storage, the cost of powering it, the cost of repairing it
(when drives fail), as well as the cost of managing the storage. When viewing cloud storage from
this perspective (including SLAs and increasing storage efficiency), cloud storage can be
beneficial in certain use models.
An interesting peek inside a cloud storage solution is provided by a company called Backblaze
(see Related topics for details). Backblaze set out to build inexpensive storage for a cloud storage
offering. A Backblaze POD (shelf of storage) packs 67TB in a 4U enclosure for under US$8,000.
This package consists of a 4U enclosure, a motherboard, 4GB of DRAM, four SATA controllers,
45 1.5TB SATA hard disks, and two power supplies. On the motherboard, Backblaze runs
Linux® (with JFS as the file system) and GbE NICs as the front end using HTTPS and Apache
Tomcat. Backblaze's software includes de-duplication, encryption, and RAID6 for data
protection. Backblaze's description of their POD (which shows you in detail how to build your
own) shows you the extent to which companies can cut the cost of storage, making cloud storage
a viable and cost-efficient option.
Cloud storage models
Thus far, I've talked primarily about cloud storage providers, but there are models for cloud
storage that allow users to maintain control over their data. Cloud storage has evolved into three
categories, one of which permits the merging of two categories for a cost-efficient and secure
option.
Much of this article has discussed public cloud storage providers, which present storage
infrastructure as a leasable commodity (both in terms of long-term or short-term storage and the
networking bandwidth used within the infrastructure). Private clouds use the concepts of public
cloud storage but in a form that can be securely embedded within a user's firewall. Finally,
hybrid cloud storage permits the two models to merge, allowing policies to define which data
must be maintained privately and which can be secured within public clouds (see Figure 6).
Figure 6. Cloud storage models

The cloud models are shown graphically in Figure 6. Examples of public cloud storage providers
include Amazon (which offers storage as a service). Examples of private cloud storage providers
include IBM, Parascale, and Cleversafe (which build software and/or hardware for internal
clouds). Finally, hybrid cloud providers include Egnyte, among others.
Virtual Machine Migration
Cloud NAS (Network Attached Storage)
What is Cloud NAS?

According to Technavio, Cloud NAS is gaining traction in the marketplace. But we still see a lot
of confusion when people hear the terms “Cloud NAS”, “Cloud-based NAS”, or storage
gateway. So what is Cloud NAS? A cloud NAS works like the legacy, on-premises NAS
currently in your data center. But unlike traditional NAS or SAN infrastructures, a cloud NAS is
not a physical machine; it’s software-based and designed to work in the cloud.
Cloud NAS is a “NAS in the cloud” that takes advantage of cloud computing to simplify
infrastructure and reduce costs. Most cloud NAS products work on cloud providers like Amazon
Web Services (AWS) and Microsoft Azure. Cloud NAS uses the cloud as the central source for
all data, but still provides common enterprise NAS features.
Why do you need a Cloud NAS? The way we work has evolved, but data storage hasn’t changed
substantially in over two decades. It’s time for storage to catch up. With the right set of
capabilities, a cloud NAS shortens the amount of time it takes to migrate from an on-premises
NAS to the cloud. It’s also much easier to manage than legacy, on-premises NAS systems.
A Cloud NAS provides significant benefits, including:
• Eliminate legacy NAS systems: A cloud NAS works with public cloud providers, so you’ll no
longer need an on-premises NAS. Once you’re done migrating to the cloud, you’ll finally be able
to unplug your legacy NAS and end your expensive maintenance renewal contracts.
• No more local backups or tapes: Cloud providers such as AWS and Microsoft Azure
automatically back up and archive data, so you can consolidate backup and tape archive
operations from multiple sites to the cloud.
• Built-in disaster recovery: A cloud NAS uses the cloud as the central data source, letting you
consolidate all of your backup and DR under one roof. Since cloud providers use redundant
copies of data and multiple data centers to architect durability into their service, your
data is always recoverable. Your data is already stored off-site across multiple sites, so you don’t
have to worry about tape backups. Snapshots provide point-in-time recovery for as long as you
need it.
• Pay as you go: You only pay your cloud provider for the storage you need. With cloud storage
becoming cheaper, you can instantly scale your cloud instances to best suit your needs and
not worry about costs.
Use cases for a Cloud NAS include:
• On-Premises to Cloud Backup: Replicate and backup your data from your VMware datacenter
to the cloud. Eliminate physical backup tape required by business compliance and archive data in
inexpensive S3 object storage or send to cold storage like AWS Glacier for long-term storage.
• New Apps and Proofs of Concept (POC): A cloud NAS lets developers quickly stand up
storage infrastructure for a new application or proof of concept project without any storage
hardware. Developers can easily create a storage infrastructure with just a few clicks.
• File Services for S3 Object Storage: Object storage systems provide a lower-cost, more durable
and scalable alternative to traditional NAS and SAN hardware storage systems. However, they’re
optimized for object I/O, do not perform as well with file I/O, and lack the
robust capabilities of traditional NAS filers. Frequently, object storage solutions have limited or
low-performing files services. Cloud NAS enables customers to take advantage of the scalability,
durability and low cost of object storage. Replace expensive on-premises SAN and NAS
equipment, while still providing file services for existing enterprise applications.
• Docker Persistent Storage: Docker cannot natively share volumes across multiple Docker
hosts. If data is not in a volume, the data disappears when you delete the Docker container. With
a cloud NAS, you can share persistent storage between Docker containers and hosts. Share
snapshots of your data to S3 or elsewhere for use even after your Docker container has exited.
• SaaS-Enable Applications: The growing trend from on-premises to Software-as-a-Service
(SaaS) deployments is undeniable. Traditional applications typically do not support block storage
or object storage. Converting your client/server applications to support block or object storage
requires application development and is usually slow and costly. For legacy applications with
incompatible file protocols, cloud NAS offers file services and support for NFS, CIFS/SMB,
iSCSI, and AFP protocols with Active Directory and LDAP integration to SaaS-enable your
existing applications for the cloud with ease.
Case study
Microsoft Azure
AWS (Amazon Web Services)
Google App Engine (GAE)

Scope:

Platform as a Service (PaaS)

1. What is Google App Engine?

• Overview

• Programming languages support

• Data storage

• App Engine services

• Security

Programming languages support
Java:
• App Engine runs Java apps on a Java 7 virtual machine (currently
supports Java 6 as well).
• Uses the Java Servlet standard for web applications:
• WAR (Web Application ARchive) directory structure.
• Servlet classes
• Java Server Pages (JSP)
• Static and data files
• Deployment descriptor (web.xml)
• Other configuration files
Python:
• Uses WSGI (Web Server Gateway Interface) standard.
• Python applications can be written using:
• Webapp2 framework
• Django framework
• Any Python code that uses the CGI (Common Gateway Interface)
standard.
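For example, a minimal webapp2 application for the App Engine Python runtime looks roughly like the sketch below (the route and response text are placeholders; deployment also requires an app.yaml, not shown):

# Minimal webapp2 handler for Google App Engine (legacy Python 2.7 runtime).
import webapp2

class MainPage(webapp2.RequestHandler):
    def get(self):
        # Respond to HTTP GET on "/" with a plain-text greeting.
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.write('Hello from App Engine!')

app = webapp2.WSGIApplication([('/', MainPage)], debug=True)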
PHP (Experimental support):
• Local development servers are available to anyone for developing
and testing local applications.
• Only whitelisted applications can be deployed on Google App Engine.
(https://gaeforphp.appspot.com/).

Google’s Go:
• Go is an Google’s open source programming environment.
• Tightly coupled with Google App Engine.
• Applications can be written using App Engine’s Go SDK.
Data storage

Google cloud store:


• RESTful service for storing and querying data.
• Fast, scalable and highly available solution.
• Provides Multiple layers of redundancy. All data is replicated to multiple
data centers.
• Provides different levels of access control.
• HTTP based APIs.
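As a hedged illustration of the HTTP-based access, the snippet below fetches an object’s content with a plain authenticated GET against Google Cloud Storage’s JSON API; the bucket name, object name, and OAuth token are placeholders.

# Illustrative HTTP GET against Google Cloud Storage's JSON API.
# Bucket name, object name, and the OAuth access token are placeholders.
import requests

BUCKET = "my-example-bucket"
OBJECT = "report.csv"
TOKEN = "ya29.placeholder-access-token"

resp = requests.get(
    "https://storage.googleapis.com/storage/v1/b/%s/o/%s" % (BUCKET, OBJECT),
    params={"alt": "media"},                      # ask for the object's content
    headers={"Authorization": "Bearer " + TOKEN},
)
resp.raise_for_status()
print(len(resp.content), "bytes downloaded")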
App Engine services
App Engine also provides a variety of services to perform common operations
when managing your application.
• URL Fetch:
• Facilitates the application’s access to resources on the internet,
such as web services or data.
• Mail:
• Facilitates the application to send e-mail messages using Google
infrastructure.
• Memcache:
• High performance in-memory key-value storage.
• Can be used to store temporary data which doesn’t need to be
persisted.
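A short illustrative sketch of these services on the App Engine Python runtime (the fetched URL, cache key, and e-mail addresses are placeholders):

# Illustrative use of App Engine services from the legacy Python runtime.
from google.appengine.api import urlfetch, mail, memcache

# URL Fetch: retrieve a web resource over Google's infrastructure.
result = urlfetch.fetch("https://example.com/data.json")
if result.status_code == 200:
    data = result.content

    # Memcache: keep the response around for 10 minutes instead of re-fetching.
    memcache.set("cached-data", data, time=600)

# Mail: send a notification e-mail through Google's mail service.
mail.send_mail(sender="admin@my-app.appspotmail.com",
               to="user@example.com",
               subject="Report ready",
               body="Your nightly report has been generated.")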
Business running on Google App Engine
• http://www.lowes.com/
• Uses App Engine to host their MyLowes service used by customers to
customize and personalize their home improvement projects.
• http://www.getaround.com/
• Peer-to-peer car sharing and local car rental service.
• http://kissflow.com/
• Workflow service based on Google applications.
The basics of cloud-based data storage
Public cloud can offer near-limitless storage, though it’s not always the right
choice. Storing the wrong type of data there has its consequences.
Storing data in the public cloud has its obvious advantages. Cloud's elastic
provisioning capabilities give you access to additional storage space when you
need it. What you choose to store in the cloud versus on local servers, however,
makes a difference. The following types of data are good fits for public cloud
storage.
Customer-facing data
If your company has large amounts of customer-facing data, such as catalogs of
merchandise, it makes sense to host that data in the cloud where it can be copied
redundantly as needed, geographically distributed or provisioned up or down
according to customer demand. The statement, "Put data closest to the people who
need it," applies here.
Distributed-access data
Data that's accessed from several locations, particularly read-only data or data that
is synchronized periodically from a central source, is a good fit for the cloud.
Public cloud has fewer physical constraints on storage -- you can provision out as
much as you need and your budget will allow -- but IT admins must also take into
account bandwidth requirements and possible latency issues.
Data backups
Backing up data from a local system such as a desktop or an enterprise data center
to a cloud host is a good example of an instance in which cloud-based
storage makes sense. Bandwidth and storage space are two limiting factors; the
more of each you have at your disposal, the easier it is to mirror local data in the
cloud. Retrieving data from a cloud-based backup, however, can be tricky if you're
dealing with terabytes of data. If siphoning that data from the cloud over the
network is prohibitive, ask your cloud provider to send you a physical copy of
your data.
The case for clutching your data
Certain types of data, for one reason or another, are best kept in a local data center
or private cloud. Here are a few examples of data that should be kept on-premises.
Mirrored copies of data
In some cases, mirrored copies of data could be considered "backup in reverse."
Copies of data stored in the cloud are synchronized passively to one or multiple
hosts. Egnyte, for example, is a service that uses a VMware-hosted appliance to
perform local synchronization with an enterprise's private cloud.
Sensitive data
Some organizations choose to keep sensitive customer data local because of
security concerns or to adhere to certain regulatory guidelines, such as the Health
Insurance Portability and Accountability Act (HIPAA). On a practical level, at-rest
and in-transit encryption, more comprehensive service-level agreements (SLAs)
and other safeguards have helped restore enterprises' trust in housing sensitive data
in the cloud. But security is as much about perceptions as it is about actual
procedures, and some enterprises are simply more comfortable keeping sensitive
data local.
Synchronized data
Even though it's becoming increasingly possible to ensure multiple copies of a
piece of data remain consistent and in-sync, sometimes the only way to guarantee
this is to keep one copy where you use it most often -- locally.
Public cloud has fewer physical constraints on storage, but IT admins must also
take into account bandwidth requirements and possible latency issues.
Often, enterprises will keep some data in the cloud and related data on-premises. If
they must keep that data synchronized, one major consideration is application-
aware synchronization. If you're dealing with data that exists as files, this isn't
complicated. But sophisticated databases, for instance, must be synchronized
according to application.
Live-mounted databases need to be synchronized to and from the cloud via
attendant applications. In many cases, those apps must be able to see the sync
target as a conventional file system, or the apps would need an extension that
allows them to easily transfer data in and out of the cloud.
Large databases
In some cases, it's not practical to remotely host instances of data, or it doesn't
provide any business advantages. For example, you may not need to mirror a large
database that only a select number of people access to several locations. On the
other hand, housing "big data" in the cloud is a good fit for data that needs to be
accessed broadly, whether as a public resource, for data analytics or for business
intelligence (BI).
