Cloud Computing Course File
Department of Computer Science & Engineering / Information Technology
Meerut Institute of Engineering and Technology, Meerut
Course File Checklist : 2023-24
Subject Name : Cloud Computing          Subject Code : KCS-713
Faculty Name : Rahul Singh / Shivam Singhal          Theory / Lab : Theory
Course Handout covers the Course File Contents as per S. No. 01 to 06.
S. No.    Parameter / Description    Completion Status    Observation / Remarks
1 Basic Course Details
2 Vision and Mission of the Institute, Vision and Mission of the Department, PEOs, POs, PSOs
3 Teaching and Evaluation Scheme, Syllabus of Subject (as prescribed by AKTU), List of standard / text / reference books, other study material / web links
4 Statements of COs, CO-wise Syllabus, CO-PO Mapping, CO-PSO Mapping, Course End Survey (Blank Form)
5 Teaching / Lecture / Lesson Plan
6 Attainment Levels and Assessment Tools (both direct and indirect methods)
7 CO-wise Question Bank, CO-wise Tutorial Sheets, CO-wise Home-assignments / Quiz
8 Previous Years' End Sem Exam Question Papers (minimum 3 years)
9 Previous / Current Years Sessionals’ Question Papers
10 Contents Beyond Syllabus (Topics), delivery details (plan / actual date of delivery)
11 Innovative teaching-learning, Details of NPTEL / Other online resources used (OPTIONAL)
12 Criteria for identification of slow / fast learners with Actions to be Taken
13 Lecture / Class Notes (may be put in separate file) and PPT
a Notes / PPT of Course Contents pertaining to CO-1
b Notes / PPT of Course Contents pertaining to CO-2
c Notes / PPT of Course Contents pertaining to CO-3
d Notes / PPT of Course Contents pertaining to CO-4
e Notes / PPT of Course Contents pertaining to CO-5
14 Students Lists
15 Personal and Class Time-table
16 Record of CO-Attainment (as per OBE-06A) and Actions taken / to be taken to improve attainment / academic performance
17 Performance Record of Weak / Bright Students with Actions Taken / Impact Analysis (as per C-02)
18 Attendance Register with Teaching Plan / Progress Record (completed)
19 Sessional Marks (As per AM-08 and Marks Uploaded to AKTU)
20 Record of Evaluated Answer Sheets (all tests, all sheets), Assignments, Duly-filled Course End Survey, etc.
Any Other
Meerut Institute of Engineering and Technology, Meerut
Department of Computer Science & Engineering / Information Technology
Course File Cloud Computing 2023-24
Vision
To be an outstanding institution in the country imparting technical education, providing need-
based, value-based and career-based programmes and producing self-reliant, self-sufficient
technocrats capable of meeting new challenges.
Mission
The mission of the institute is to educate young aspirants in various technical fields to fulfill
global requirement of human resources by providing sustainable quality education, training
and invigorating environment besides molding them into skilled competent and socially
responsible citizens who will lead the building of a powerful nation.
Objectives
The objective of the institute is to have necessary instruments and resources to deliver
technical manpower that not only meets the local industry requirements but is also an
acceptable human resource for the global job market in Management & Technology as well
as Master in Computer Application and other technological sectors. We aim to provide
quality technical education facilities to every student admitted to the institute along with the
development of all round personality of the students. To be a technically high and
professional institute, we ensure to provide most competent staff and excellent support
facilities like well-equipped library, lecture theaters, computer and scientific laboratories
fitted with high quality equipment. The institute organizes in-house and outside training
programmes, seminars, conferences and workshops on continuous basis and encourages
participation of staff and students to improve its Academic Pedagogy through innovative
ideas and techniques in order to impart quality education. To fill the perception gap between
expectation of industry and preferences of academia through industry institute cell, we
organize interactive sessions in the campus. To produce competitive professionals for
meeting the upcoming demand of the nation, the institute will always remain dedicated to the
social causes of the public of this region.
DEPARTMENT VISION AND MISSION
Vision
Mission
M1: To provide quality education in the core and applied areas of information
technology, and develop students from all socio-economic levels into globally
competent professionals.
M3: To invigorate students' skills so that they deploy their potential in research and
development, and inculcate the habit of lifelong learning.
Program Educational Objectives, Program Outcomes, Program
Specific Outcome, Course Outcomes and Mapping with POs
PEO 2: To bring the physical, analytical and computational approaches of IT to solve real
world Engineering problems and provide innovative solutions by applying appropriate
models, tools and evaluations.
PEO 4: Students to imbibe professional attitudes, team spirit, effective communication and
contribute ethically to the needs of the society with moral values.
PEO 5: Encourage students for higher studies and entrepreneurial skills by imparting the
quality of lifelong learning in emerging technologies and work in multidisciplinary roles and
capacities.
Program Outcomes
2. Problem analysis: Identify, formulate, review research literature, and analyse complex
engineering problems reaching substantiated conclusions using first principles of
mathematics, natural sciences, and engineering sciences.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and
modern engineering and IT tools including prediction and modeling to complex engineering
activities with an understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to
assess societal, health, safety, legal and cultural issues and the consequent responsibilities
relevant to the professional engineering practice.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and
norms of the engineering practice.
11. Project management and finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a member and
leader in a team, to manage projects and in multidisciplinary environments.
12. Life-long learning: Recognize the need for, and have the preparation and ability to
engage in independent and life-long learning in the broadest context of technological change.
Program Specific Outcomes
1. PSO 1: Ability to understand, apply and analyze computational concepts in the areas
related to algorithms, machine learning, multimedia, web designing, Data Science, and
networking on the systems having different degree of complexity.
CO-wise Syllabus
1 CO-1
  Statement : Describe architecture and underlying principles of cloud computing.
  Syllabus  : Introduction to Cloud Computing: Definition of Cloud - Evolution of Cloud Computing - Underlying Principles of Parallel and Distributed Computing - Cloud Characteristics - Elasticity in Cloud - On-demand Provisioning.
2 CO-2
  Statement : Explain need, types and tools of Virtualization of Cloud.
  Syllabus  : Cloud Enabling Technology, Service Oriented Architecture: REST and Systems of Systems - Web Services - Publish-Subscribe Model - Basics of Virtualization - Types of Virtualization - Implementation Levels of Virtualization - Virtualization Structures - Tools and Mechanisms - Virtualization of CPU - Memory - I/O Devices - Virtualization Support and Disaster Recovery.
3 CO-3
  Statement : Describe Service Oriented Architecture and various types of Cloud Services.
  Syllabus  : Cloud Architecture, Services and Storage: Layered Cloud Architecture Design - NIST Cloud Computing Reference Architecture - Public, Private, and Hybrid Clouds - IaaS, PaaS and SaaS - Architectural Design Challenges - Cloud Storage - Storage-as-a-Service - Advantages of Cloud Storage - Cloud Storage Providers - S3.
4 CO-4
  Statement : Explain Inter Cloud Resource Management, cloud storage services and their providers; assess Security Services and Standards for Cloud Computing.
  Syllabus  : Resource Management and Security in Cloud: Inter Cloud Resource Management - Resource Provisioning and Resource Provisioning Methods - Global Exchange of Cloud Resources - Security Overview - Cloud Security Challenges - Software-as-a-Service Security - Security Governance - Virtual Machine Security - IAM - Security Standards.
5 CO-5
  Statement : Analyze advanced cloud technologies.
  Syllabus  : Cloud Technologies and Advancements: Hadoop - MapReduce - Virtual Box - Google App Engine - Programming Environment for Google App Engine - OpenStack - Federation in the Cloud - Four Levels of Federation - Federated Services and Applications - Future of Federation.
S. No.  Sub Code  COx  Statement of Course Outcomes (COs)
(Upon completion of the topic concerned, students will be able to:)
4  Develop and implement creative cloud solutions to solve complex business challenges.
Topics / lectures are arranged in sequence - same - as to be taught in the class. Maintain data related to "Date" in its hard copy.
S. No. | Lecture No. | CO (No.) | Topic Description | Teaching Pedagogy | Reference Material | Planned Date | Actual Date of Delivery
33 33 CO4 Inter Cloud Resource Management - Resource Provisioning PPT Notes 6/11/2023 6/11/2023
34 34 CO4 Resource Provisioning Methods PPT Notes 7/11/2023 7/11/2023
35 35 CO4 Global Exchange of Cloud Resources PPT Notes 7/11/2023 7/11/2023
36 36 CO4 Security Overview and Definition PPT Notes 8/11/2023 8/11/2023
37 37 CO4 Cloud Security Challenges PPT Notes 14/11/23 14/11/23
38 38 CO4 Software-as-a-Service Security PPT Notes 14/11/23 14/11/23
39 39 CO4 Security Governance PPT Notes 15/11/23 15/11/23
40 40 CO4 Virtual Machine Security PPT Notes 20/11/23 20/11/23
41 41 CO4 IAM PPT Notes 21/11/23 21/11/23
42 42 CO4 Security Standards PPT Notes 21/11/23 21/11/23
43 43 CO4 Tutorials, Quick Revision PPT Notes 22/11/23 22/11/23
44 44 CO5 Cloud Technologies and Advancements: Hadoop PPT Notes 27/11/23 27/11/23
45 45 CO5 MapReduce, Virtual Box PPT Notes 28/11/23 28/11/23
46 46 CO5 Google App Engine PPT Notes 28/11/23 28/11/23
47 47 CO5 Programming Environment for Google App Engine PPT Notes 29/11/23 29/11/23
48 48 CO5 OpenStack, Federation in the Cloud and Four Levels of Federation PPT Notes 11/12/2023 11/12/2023
49 49 CO5 Federated Services and Applications PPT Notes 12/12/2023 12/12/2023
50 50 CO5 Future of Federation. PPT Notes 12/12/2023 12/12/2023
51 51 CO5 Quick revision of CO4 and CO5 PPT Notes 13/12/23 13/12/23
Meerut Institute of Engineering & Technology, Meerut
Lesson Plan / Teaching Plan / Lecture Plan with Progress : B Tech CSE / IT, Semester : VII, Session : 2023-24
Course Name (Code) : Cloud Computing (KCS-713)          Faculty : Rahul Singh / Shivam Singhal
Assessment Methods                        Level   Range of Students in a Class / Branch with Target Marks
Direct Assessment (Internal Evaluation)     1     < 50% of students secure 60% marks
                                            2     >= 50% and < 60% of students secure 60% marks
                                            3     >= 60% of students secure 60% marks
Direct Assessment (External Evaluation)     1     < 50% of students secure 50% marks
                                            2     >= 50% and < 60% of students secure 50% marks
                                            3     >= 60% of students secure 50% marks
2. Assessment Tool
Question 2: Why is Cloud Computing required? List the five characteristics of Cloud Computing.
Question 1: What do you mean by third party cloud services? Give suitable examples.
Question 2: Explain the major cloud features of Google applications engine.
Question 3: Write short notes on any two of the following:
a. Hadoop b. Microsoft Azure
Question 4: What do you mean by Hadoop and what is its history? Why is it important? Illustrate
the Hadoop Architecture.
Question 5: What do you mean by Google App Engine (GAE) and OpenStack? Explain the
advantages and disadvantages of both, and the OpenStack components.
Unit I
INTRODUCTION TO CLOUD COMPUTING
Cloud
Nowadays, cloud computing is adopted by almost every company, whether it is an MNC or a startup,
and many are still migrating towards it because of the cost savings, lower maintenance, and
the increased data capacity offered by servers maintained by the cloud providers.
One more reason for this drastic change from companies' on-premises servers to
cloud providers is the 'pay as you go' model, i.e., you only have to pay
for the services you are actually using. The disadvantage of an on-premises server is that if the
server is not in use, the company still has to pay for it.
Cloud computing means storing and accessing data and programs on remote servers that
are hosted on the internet instead of on the computer's hard drive or a local server. Cloud
computing is also referred to as Internet-based computing; it is a technology where a
resource is provided as a service through the Internet to the user. The stored data can
be files, images, documents, or any other storable content.
Cloud computing is all about renting computing services. This idea first emerged in the 1950s. In
making cloud computing what it is today, five technologies played a vital role. These are
distributed systems and their peripherals, virtualization, web 2.0, service orientation, and utility
computing.
Distributed Systems:
It is a composition of multiple independent systems but all of them are depicted as a single
entity to the users. The purpose of distributed systems is to share resources and also use them
effectively and efficiently. Distributed systems possess characteristics such as scalability,
concurrency, continuous availability, heterogeneity, and independence in failures. But the
main problem with this system was that all the systems were required to be present at the
same geographical location. Thus to solve this problem, distributed computing led to three
more types of computing and they were-Mainframe computing, cluster computing, and grid
computing.
Mainframe computing:
Mainframes which first came into existence in 1951 are highly powerful and reliable
computing machines. These are responsible for handling large data such as massive input-
output operations. Even today these are used for bulk processing tasks such as online
transactions etc. These systems have almost no downtime with high fault tolerance. After
distributed computing, these increased the processing capabilities of the system. But these
were very expensive. To reduce this cost, cluster computing came as an alternative to
mainframe technology.
Cluster computing:
Grid computing:
In the 1990s, the concept of grid computing was introduced. It means that different systems were
placed at entirely different geographical locations and all of them were connected via the
internet. These systems belonged to different organizations and thus the grid consisted of
heterogeneous nodes. Although it solved some problems, new problems emerged as the
distance between the nodes increased. The main problem encountered was the low
availability of high-bandwidth connectivity, along with other network-associated issues. Thus,
cloud computing is often referred to as the “successor of grid computing”.
Virtualization:
It was introduced nearly 40 years back. It refers to the process of creating a virtual layer over
the hardware which allows the user to run multiple instances simultaneously on the hardware.
It is a key technology used in cloud computing. It is the base on which major cloud
computing services such as Amazon EC2, VMware vCloud, etc. are built. Hardware
virtualization is still one of the most common types of virtualization.
Web 2.0:
It is the interface through which the cloud computing services interact with the clients. It is
because of Web 2.0 that we have interactive and dynamic web pages. It also increases
flexibility among web pages. Popular examples of web 2.0 include Google Maps, Facebook,
Twitter, etc. Needless to say, social media is possible because of this technology only. It
gained major popularity in 2004.
Service orientation:
It acts as a reference model for cloud computing. It supports low-cost, flexible, and evolvable
applications. Two important concepts were introduced in this computing model. These were
Quality of Service (QoS) which also includes the SLA (Service Level Agreement) and
Software as a Service (SaaS).
Utility computing:
It is a computing model that defines service provisioning techniques for services such as
computer services along with other major services such as storage, infrastructure, etc which
are provisioned on a pay-per-use basis.
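As a rough illustration of the pay-per-use idea, the sketch below computes a monthly bill from metered usage; the resource names and hourly rates are made-up assumptions, not any provider's actual pricing.

# A minimal sketch of pay-per-use (utility) billing with hypothetical rates.
HOURLY_RATES = {
    "vm_small": 0.05,      # assumed USD per VM-hour
    "storage_gb": 0.0001,  # assumed USD per GB-hour
}

def monthly_bill(vm_hours: float, storage_gb_hours: float) -> float:
    """Charge only for what was actually consumed (pay-per-use)."""
    return (vm_hours * HOURLY_RATES["vm_small"]
            + storage_gb_hours * HOURLY_RATES["storage_gb"])

# Example: 2 VMs for 300 hours each plus 50 GB stored for a 720-hour month
print(round(monthly_bill(2 * 300, 50 * 720), 2))  # 30.0 + 3.6 = 33.6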
PARALLEL COMPUTING:
In parallel computing, multiple processors execute many computations simultaneously and typically share a common memory, in contrast to the distributed model described next.
DISTRIBUTED COMPUTING:
In distributed computing we have multiple autonomous computers which appear to the user as a
single system. In distributed systems there is no shared memory and computers communicate
with each other through message passing. In distributed computing a single task is divided
among different computers.
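The following is a minimal sketch of the message-passing idea described above, using Python's multiprocessing queues to stand in for independent machines; a real distributed system would exchange messages over a network rather than between local processes.

# Dividing one task among workers that communicate only via messages (no shared memory).
from multiprocessing import Process, Queue

def worker(task_queue: Queue, result_queue: Queue) -> None:
    """Each worker receives sub-tasks as messages and sends partial results back."""
    while True:
        chunk = task_queue.get()
        if chunk is None:          # sentinel: no more work
            break
        result_queue.put(sum(chunk))

if __name__ == "__main__":
    tasks, results = Queue(), Queue()
    workers = [Process(target=worker, args=(tasks, results)) for _ in range(2)]
    for w in workers:
        w.start()
    data = list(range(100))
    tasks.put(data[:50])           # a single task is divided among the workers
    tasks.put(data[50:])
    for _ in workers:
        tasks.put(None)
    total = results.get() + results.get()
    for w in workers:
        w.join()
    print(total)                   # 4950, combined from independent processes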
CHARACTERISTICS OF CLOUD
There are many characteristics of cloud computing; here are a few of them:
1. On-demand self-services: Cloud computing services do not require any human
administrators; users themselves are able to provision, monitor and manage computing
resources as needed.
2. Broad network access: The computing services are generally provided over standard
networks and to heterogeneous devices.
3. Rapid elasticity: The computing services should have IT resources that are able to
scale out and in quickly and on an as-needed basis. Services are provided whenever the
user requires them and are scaled back in as soon as the requirement is over.
4. Resource pooling: The IT resources (e.g., networks, servers, storage, applications, and
services) present are shared across multiple applications and tenants in an
uncommitted manner. Multiple clients are provided service from the same physical
resource.
5. Measured service: The resource utilization is tracked for each application and
tenant; this provides both the user and the resource provider with an account of
what has been used. This is done for various reasons, such as monitoring, billing and
effective use of resources.
6. Multi-tenancy: Cloud computing providers can support multiple tenants (users or
organizations) on a single set of shared resources.
7. Virtualization: Cloud computing providers use virtualization technology to abstract
underlying hardware resources and present them as logical resources to users.
8. Resilient computing: Cloud computing services are typically designed with
redundancy and fault tolerance in mind, which ensures high availability and
reliability.
9. Flexible pricing models: Cloud providers offer a variety of pricing models, including
pay-per-use, subscription-based, and spot pricing, allowing users to choose the option
that best suits their needs.
10. Security: Cloud providers invest heavily in security measures to protect their users'
data and ensure the privacy of sensitive information.
11. Automation: Cloud computing services are often highly automated, allowing users to
deploy and manage resources with minimal manual intervention.
12. Sustainability: Cloud providers are increasingly focused on sustainable practices, such
as energy-efficient data centers and the use of renewable energy sources, to reduce
their environmental impact.
CLOUD ELASTICITY:
It works in such a way that when the number of clients accessing the system grows, applications are
automatically provisioned with the extra computing, storage and network resources they need, such as
CPU, memory, storage or bandwidth; and when fewer clients are active, those resources are
automatically scaled back according to the requirement.
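A toy sketch of such an elasticity rule is shown below; the threshold, per-instance capacity and request rates are illustrative assumptions, not values from any real auto-scaler.

# A minimal sketch of an elasticity (auto-scaling) decision rule.
def desired_instances(current: int, requests_per_sec: float,
                      capacity_per_instance: float = 100.0) -> int:
    """Scale out or in so that utilization stays near 70% of capacity."""
    needed = max(1, round(requests_per_sec / (0.7 * capacity_per_instance)))
    if needed > current:
        print(f"scale out: {current} -> {needed}")
    elif needed < current:
        print(f"scale in:  {current} -> {needed}")
    return needed

instances = 2
for load in [150, 400, 900, 300, 80]:   # fluctuating client demand
    instances = desired_instances(instances, load)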
CLOUD SCALABILITY:
Cloud scalability is used to handle the growing workload where good performance is also
needed to work efficiently with software or applications. Scalability is commonly used where
the persistent deployment of resources is required to handle the workload statically.
Types of Scalability:
1. Vertical Scalability (Scale-up):
In this type of scalability, we increase the power of existing resources in the working
environment in an upward direction.
2. Horizontal Scalability (Scale-out):
In this type of scalability, we add more resources, such as additional servers, to the existing
pool so that the workload is spread outward.
3. Diagonal Scalability:
It is a mixture of both Horizontal and Vertical scalability where the resources are added both
vertically and horizontally.
On-demand computing (ODC) is a delivery model in which computing resources are made
available to the user as needed. The resources may be maintained within the user's enterprise
or made available by a cloud service provider. The term cloud computing is often used as a
synonym for on-demand computing when the services are provided by a third party -- such as
a cloud hosting organization.
The on-demand business computing model was developed to overcome the challenge of
enterprises meeting fluctuating demands efficiently. Because an enterprise's demand for
computing resources can be unpredictable at times, maintaining sufficient resources to meet
peak requirements can be costly. And cutting costs by only maintaining minimal resources
means there are likely insufficient resources to meet peak loads. The on-demand model
provides an enterprise with the ability to scale computing resources up or down whenever
needed, with the click of a button.
When an organization pairs with a third party to provide on-demand computing, it either
subscribes to the service or uses a pay-per-use model. The third party then provides
computing resources whenever needed, including when the organization is working on
temporary projects, has expected or unexpected workloads or has long-term computing
requirements. For example, a retail organization could use on-demand computing to scale up
their online services, providing additional computing resources during a high-volume time,
such as Black Friday.
Cloud computing is a general term for anything that involves delivering hosted services over
the internet. These services are divided into different types of cloud computing resources and
applications.
SaaS is a software distribution model where a cloud provider hosts applications and makes
them available to users over the internet.
DaaS is a form of cloud computing where a third party hosts the back end of a virtual desktop
infrastructure
Managed hosting services are an IT provisioning and cloud server hosting model where a
service provider leases dedicated servers and associated hardware to a single customer and
manages those systems on the customer's behalf.
Cloud storage is a service model where data is transmitted and stored securely on remote
storage systems, where it is maintained, managed, backed up and made available to users
over a network.
Cloud backup is a strategy for sending a copy of a file or database to a secondary location
for preservation in case of equipment failure.
Users can quickly increase or decrease their computing resources as needed -- either short-
term or long-term.
Removes the need to purchase, maintain and upgrade hardware.
The cloud service organization managing the on-demand services handles resources such as
servers and hardware, system updates and maintenance.
User friendly.
Many on-demand computing services in the cloud are user friendly enabling most users to
easily acquire additional computing resources without any help from their IT department.
This can help to improve business agility.
Cut costs.
Saves money because organizations don't have to purchase hardware or software to meet
peaks in demand. Organizations also don't have to worry about updating or maintaining those
resources
UNIT-II
Cloud Enabling Technology
Introduction to SOA
SOA is a style of software design. In the SOA concept, services are provided to other
components by application components, through a communication protocol over a network. The basic
principles of SOA are independent of specific technologies, products, and vendors. Each service in an SOA
embodies the code and data integrations required to execute a complete, discrete business function (e.g.,
checking a customer’s credit, calculating a monthly loan payment, or processing a mortgage
application). The service interfaces provide loose coupling, meaning they can be called with little or no
knowledge of how the integration is implemented underneath. The services are exposed using standard
network protocols—such as SOAP (simple object access protocol)/HTTP or JSON/HTTP—to send
requests to read or change data. The services are published in a way that enables developers to quickly
find them and reuse them to assemble new applications.
These services can be built from scratch but are often created by exposing functions from legacy
systems of record as service interfaces.
In this way, SOA represents an important stage in the evolution of application development and
integration over the last few decades. Before SOA emerged in the late 1990s, connecting an application
to data or functionality housed in another system required complex point-to-point integration—
integration that developers had to recreate, in part or whole, for each new development project.
Exposing those functions through SOA eliminates the need to recreate the deep integration every time.
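As a small illustration of wrapping one discrete business function as a service over JSON/HTTP, the sketch below uses the Flask micro-framework (an assumption, not part of the syllabus) and a hypothetical credit-check function:

# A minimal sketch of exposing a hypothetical credit check as a JSON/HTTP service.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/services/credit-check", methods=["POST"])
def credit_check():
    """Only the interface is exposed; the scoring logic behind it could be
    replaced without callers noticing (loose coupling)."""
    customer = request.get_json()
    score = 700 if customer.get("on_time_payments", 0) > 10 else 550  # stub logic
    return jsonify({"customer_id": customer.get("id"), "approved": score >= 650})

if __name__ == "__main__":
    app.run(port=8080)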
SOA principles and guidelines guide architects and developers when defining business and technical
services.
Principle #2 Every service is defined by a formal contract that clearly separates the functionality being
provided from the technical implementation.
Principle#3 Services should only interact with other services through their well-defined public
interfaces.
Principle #4 Services should be accessible through standardized technologies that are available across a
wide range of environments. This also implies that the invocation mechanisms (protocols, transports,
discovery mechanisms, and service descriptions) should comply with widely accepted industry
standards. This means using SOAP, WSDL, XML, XML Schema, HTTP, JMS, and UDDI.
Principle #5 Services should be defined at a level of abstraction that corresponds to real world business
activities and recognizable business functions so as to better align business needs and technical
capabilities.
Principle #6 Services should be meaningful to the service requester. Avoid letting legacy systems and
implementation details dictate services and service contracts.
Principle #8 Collections of related services should use the same XML document types to facilitate
information exchange among related services, and the structure and semantics of the documents must be
agreed to and well understood.
Principle #9 Services should perform discrete tasks and provide simple interfaces to access their
functionality to encourage reuse and loose coupling.
Principle #10 Services should provide metadata that defines their capabilities and constraints, and this
metadata should be available in a repository, which itself can be accessed through services. This allows
service definitions to be inspected without needing access to the service itself. This also allows service
requesters to dynamically locate and invoke the service, and through indirection, it provides location
transparency.
Primitive SOA:- SOA is a constantly growing field with various vendors developing SOA products
regularly. A baseline service-oriented architecture that is suitable to be realized by any vendor is known
as the primitive SOA. Baseline SOA, common SOA and core SOA are some of the other terms used to
refer to the primitive SOA. Application of service-orientation principles to software solutions produces
services and these are the basic unit of logic in the SOA. These services can exist autonomously, but
they are certainly not isolated. Services maintain certain common and standard features, yet they can be
evolved and extended independently. Services can be combined to create other services. Services are
aware of other services only through service descriptions and therefore can be considered loosely-
coupled. Services communicate using autonomous messages that are intelligent enough to self-govern
their own parts of logic. Most important (primitive) SOA design principles are loose coupling, service
contract, autonomy, abstraction, reusability, composability, statelessness and discoverability.
Figure- Primitive SOA
Contemporary SOA: - Contemporary SOA is the classification that is used to represent the extensions
to the primitive SOA implementations in order to further achieve the goals of service-orientation. In
other words, contemporary SOA is used to take the primitive SOA to a target SOA state that the
organizations would like to have in the future. But, as SOA (in general) evolves with time, the
primitive SOA is expanded by inheriting the attributes of contemporary SOA. Contemporary SOA helps
the growth of the primitive SOA by introducing new features, and then these features are adapted by the
primitive SOA model making its horizon larger than before. For all these reasons, contemporary SOA is
also referred to as future state SOA, target SOA or extended SOA.
SOA is an evolution.
Contemporary SOA and primitive SOA differ on the purpose they stand for within the context of SOA.
Primitive SOA is the baseline service-oriented architecture while, contemporary SOA is used to
represent the extensions to the primitive SOA. Primitive SOA provides a guideline to be realized by all
vendors, whereas Contemporary SOA expands the SOA horizon by adding new features to primitive
SOA. Currently, Contemporary SOA focuses on securing content of messages, improving reliability
through delivery status notifications, enhancing XML/SOAP processing and transaction processing to
account for task failure.
Every business and government organization is engaged in delivering services. Here are some examples:
Bank- Savings accounts, checking accounts, credit cards, safety deposit boxes, consumer loans,
mortgages, credit verification.
Travel agency- Holiday planning, business travel, travel insurance, annual summary of business travel
expenditures.
Insurance agency- Car insurance, home insurance, health insurance, accident assessment.
Retail store- In-store shopping, online shopping, catalog shopping, credit cards, extended warranties,
repair services.
Lawyer's office- Legal advice, wills preparation, business incorporation, bankruptcy proceedings.
Hospital- Emergency medical care, in-patient services, out-patient services, chronic pain
management.
Department of transportation- Driver testing and licensing, vehicle licensing, license administration,
vehicle inspections and emissions testing.
Department of human services- Benefits disbursement and administration, child support services and
case management.
The primary characteristics that should go into the design, implementation, and management of
services are as follows:
Loosely coupled.
Well-defined service contracts.
Meaningful to service requesters.
Standards-based.
A service should also possess as many of the following secondary characteristics as possible in order
to deliver the greatest business and technical benefits:
Primary Characteristics
Loosely Coupled Services
The notion of designing services to be loosely coupled is the most important, the most far reaching, and
the least understood service characteristic. Loose coupling is a broad term that actually refers to several
different elements of a service, its implementation, and its usage.
Interface coupling refers to the coupling between service requesters and service providers. Interface
coupling measures the dependencies that the service provider imposes on the service requester: the fewer
the dependencies, the looser the coupling. Ideally, the service requester should be able to use a service
solely based on the published service contract and service-level agreement (see the next section), and
under no circumstances should the service requester require information about the internal
implementation of the service (for example, requiring that one of the input parameters be a SQL
command because the service provider uses a RDBMS as a data store). Another way of saying this is
that the interface should encapsulate all implementation details and make them opaque to service
requesters.
Technology coupling measures the extent to which a service depends on a particular technology,
product, or development platform (operating systems, application servers, packaged applications, and
middleware platforms). For instance, if an organization standardizes on J2EE for implementing all
services and requires all service requesters and service providers to use JNDI to look up user and role
information, then the service is tightly coupled to the J2EE platform, which limits the extent to which
diverse service requesters can access these services and the extent to which the service can be
outsourced to a third-party provider. Even worse would be a situation where the service requesters and
service providers are required to use a feature that is proprietary to a single vendor, which would
increase vendor lock-in. This is one of the main reasons to implement your SOA with Web services.
Process coupling measures the extent to which a service is tied to a particular business process. Ideally,
a service should not be tied to a single business process so that it can be reused across many different
processes and applications. However, there are exceptions. For instance, sometimes it is important to
define a service contract for a piece of business functionality (e.g., Photocopy-Check) that is only used
in one business process so that you have the option of non-invasively substituting another
implementation in the future. However, in this case, don't expect the service to be reusable across
different processes and applications.
Every service should have a well-defined interface called its service contract that clearly defines the
service's capabilities and how to invoke the service in an interoperable fashion, and that clearly
separates the service's externally accessible interface from the service's technical implementation. In this
context, WSDL provides the basis for service contracts; however, a service contract goes well beyond
what can be defined in WSDL to include document metadata, security metadata, and policy metadata
using the WS-Policy family of specifications. It is important that the service contract is defined based on
knowledge of the business domain and is not simply derived from the service's implementation.
Furthermore, changing a service contract is generally much more expensive than modifying the
implementation of a service because changing a service contract might require changing hundreds or
thousands of service requesters, while modifying the implementation of a service does not usually have
such far reaching effects. As a corollary, it is important to have a formal mechanism for extending and
versioning service contracts to manage these dependencies and costs.
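The sketch below illustrates the contract/implementation split in plain Python; the CreditCheckContract name and its single operation are hypothetical, standing in for a formal contract such as a WSDL document plus policy metadata.

# A minimal sketch of separating a service contract from its implementation.
from typing import Protocol

class CreditCheckContract(Protocol):
    """Only the externally visible operation belongs to the contract."""
    def check(self, customer_id: str) -> bool: ...

class InHouseCreditCheck:
    def check(self, customer_id: str) -> bool:
        return len(customer_id) > 3          # stand-in for real scoring logic

def requester(service: CreditCheckContract, customer_id: str) -> None:
    # The requester depends only on the contract, never on the implementation,
    # so the provider can be versioned or replaced independently.
    print("approved" if service.check(customer_id) else "declined")

requester(InHouseCreditCheck(), "C-1024")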
Services and service contracts must be defined at a level of abstraction that makes sense to service
requesters. An appropriate level of abstraction will:
Capture the essence of the business service being provided without unnecessarily restricting future
uses or implementations of the service.
Use a business-oriented vocabulary drawn from the business service domain to define the business
service and the input and output documents of the business service.
Avoid exposing technical details such as internal structures or conventions to service requesters.
An abstract interface promotes substitutability; that is, the interface captures a business theme and is
independent of a specific implementation, which allows a new service provider to be substituted for an
existing services provider as necessary without affecting any of the service requesters. In this way,
defining abstract interfaces that are meaningful to service requesters promotes loose coupling.
In general, services should perform discrete tasks and provide simple interfaces to encourage reuse and
loose coupling.
Open, Standards-Based
Services should be based on open standards as much as possible. Using open standards provides a
number of advantages, including:
Minimizing vendor lock-in. The Web services platform is defined using open, standards-based
technologies so that service requesters and service providers are isolated from proprietary, vendor-
specific technologies and interfaces.
Increasing the opportunities for the service requester to use alternative service providers.
Increasing the opportunities for the service provider to support a wider base of service requesters.
Increasing the opportunities to take advantage of open source implementations of the standards
and the developer and users communities that have grown up around these open source
implementations.
Services that possess the characteristics discussed earlier deliver the following technical benefits:
Efficient development.
More reuse.
Simplified maintenance.
Incremental adoption.
Graceful evolution.
Efficient Development
An SOA promotes modularity because services are loosely coupled. This modularity has positive
implications for the development of composite applications because:
After the service contracts have been defined (including the service-level data models), each
service can be designed and implemented separately by the developers who best understand the
particular functionality. In fact, the developers working on a service have no need to interact with or
even know about the developers working on the other business services.
Service requesters can be designed and implemented based solely on the published service
contracts without any need to contact the developers who created the service provider and without
access to the source code that implements the service provider (as long as the developers have access to
information about the semantics of the service; for example, the service registry may provide a link to
comprehensive documentation about the semantics of the service).
More Reuse
Meaningful to the service requester: This characteristic makes it easier for the developer to
find the right service.
Well-defined service contracts published in a service repository: This characteristic makes it
easier for the developer to find the right service. Proper registration policies and standardized
taxonomies enable easy discovery.
Dynamic, discoverable, metadata-driven: This characteristic makes it easier for development-
time tools to find the right service and fully or partially generate artifacts for using the service and for
run-time code to dynamically adapt to changing conditions.
Loose process coupling: Services that are decoupled from a single business process are easier
to reuse across different applications.
Loose technology coupling: This characteristic is supported by the Web services platform and
allows service requesters to reuse services even if they are implemented using different technology (e.g.,
J2EE or .NET Framework). The developer does not have to worry about compiler versions, platforms,
and other incompatibilities that typically make code reuse difficult.
Open, standards-based: This characteristic increases interoperability, so there is a better
chance that you can interact with the service at run-time.
Predictable service-level agreements: This characteristic makes it easier for the developer to
verify that the service will satisfy any availability, reliability, and performance requirements.
Simplified Maintenance
SOA simplifies maintenance and reduces maintenance costs because SOA applications are modular and
loosely coupled. This means that maintenance programmers can make changes to services (even major
changes) as long as they adhere to the service contract that the service requesters depend upon without
worrying if their changes will affect those who maintain other parts of the system.
Incremental Adoption
Because SOA applications are modular and loosely coupled, they can be developed and deployed in a
series of small steps. Often, a reasonable subset of the full functionality can be developed and deployed
quickly, which has obvious time-to-market advantages. Additional functionality can readily be added in
planned stages until the full feature set has been realized.
4. Scalable – If a service has to serve many users, it can easily be scaled by adding
additional servers. This keeps the service available to users at all times.
5. Reliable – Services are usually small in size compared to a full-fledged application, so
it is easier to debug and test independent services.
6. Same Directory Structure – Services have the same directory structure, so consumers can access
the service information from the same directory every time. Even if a service changes its
location, the directory remains the same, which is very helpful for consumers.
7. Independent of Other Services – Services generated using SOA principles are independent of each
other, so services can be used by multiple applications at the same time.
Disadvantages
1. High Bandwidth Server – A web service sends and receives messages and data frequently,
so it can easily reach a high number of requests per day. It therefore requires a high-speed server with
plenty of bandwidth to run a web service.
2. Extra Overhead – In SOA, every input is validated before it is sent to the service. If
you are using multiple services, this overloads the system with extra computation.
Adapter: A software module added to an application or system that allows access to its capabilities via
a standards-compliant services interface.
Business Process Modeling: A procedure for mapping out what the business process does both in terms
of what various applications are expected to do and what the human participants in the business process
are expected to do.
Enterprise Service Bus: The enterprise service bus is the communications nerve center for services in a
service oriented architecture. It tends to be a jack-of-all-trades, connecting to various types of
middleware, repositories of metadata definitions (such as how you define a customer number), registries
(how to locate information), and interfaces of every kind (for just about any application).
Service Broker: Software in a SOA framework that brings components together using the rules
associated with each component.
SOA Governance: SOA governance is an element of overall IT governance and as such lays down the
law when it comes to policy, process, and metadata management. (Metadata here simply means data that
defines the source of the data, the owner of the data, and who can change the data.)
SOA Repository: A database for all SOA software and components, with an emphasis on revision
control and configuration management, where they keep the good stuff, in other words.
SOA Service Manager: Software that orchestrates the SOA infrastructure — so that the business
services can be supported and managed according to well-defined Service Level Agreements.
SOA Registry: A single source for all the metadata needed to utilize the Web service of a software
component in a SOA environment. Universal Description, Discovery and Integration (UDDI) is a
directory service where businesses can register and search for Web services.
Layer 1: Operational systems layer. This consists of existing custom built applications, otherwise
called legacy systems, including existing CRM and ERP packaged applications, and older object-
oriented system implementations, as well as business intelligence applications. The composite layered
architecture of an SOA can leverage existing systems and integrate them using service-oriented
integration techniques.
Layer 2: Enterprise components layer. This is the layer of enterprise components that are
responsible for realizing functionality and maintaining the QoS of the exposed services. These
special components are a managed, governed set of enterprise assets that are funded at the enterprise
or the business unit level. As enterprise-scale assets, they are responsible for ensuring conformance to
SLAs through the application of architectural best practices. This layer typically uses container-based
technologies such as application servers to implement the components, workload management, high-
availability, and load balancing.
Layer 3: Services layer. The services the business chooses to fund and expose reside in this layer.
They can be discovered or be statically bound and then invoked, or possibly, choreographed into a
composite service. This service exposure layer also provides for the mechanism to take enterprise
scale components, business unit specific components, and in some cases, project- specific
components, and externalizes a subset of their interfaces in the form of service descriptions.
Layer 4: Business process composition or choreography layer. Compositions and choreographies
of services exposed in Layer 3 are defined in this layer. Services are bundled into a flow through
orchestration or choreography, and thus act together as a single application. These applications
support specific use cases and business processes.
Layer 5: Access or presentation layer. You can think of it as a future layer that you need to take into
account for future solutions. It is also important to note that SOA decouples the user interface from
the components, and that you ultimately need to provide an end-to-end solution
from an access channel to a service or composition of services.
Layer 6: Integration (ESB). This layer enables the integration of services through the introduction of
a reliable set of capabilities, such as intelligent routing, protocol mediation, and other transformation
mechanisms, often described as the ESB (see Resources). Web Services Description Language
(WSDL) specifies a binding, which implies a location where the service is provided. On the other
hand, an ESB provides a location independent mechanism for integration.
Layer 7: QoS. This layer provides the capabilities required to monitor, manage, and maintain QoS
such as security, performance, and availability. This is a background process through sense-and-
respond mechanisms and tools that monitor the health of SOA applications, including the all
important standards implementations of WS-Management and other relevant protocols and standards
that implement quality of service for a SOA.
UNIT-II
Cloud Enabling Technology
REST and WEB SERVICE
Introduction to REST
The acronym REST stands for REpresentational State Transfer. The term was originally coined by Roy
Fielding, who was also one of the principal authors of the HTTP specification. REpresentational State Transfer, or REST, is a
design pattern for interacting with resources stored in a server. Each resource has an identity, a data type,
and supports a set of actions. REST is a simple way to organize interactions between independent
systems. It's been growing in popularity since 2005, and inspires the design of services, such as the
Twitter API. This is due to the fact that REST allows you to interact with minimal overhead with clients
as diverse as mobile phones and other websites. In theory, REST is not tied to the web, but it's almost
always implemented as such, and was inspired by HTTP. As a result, REST can be used wherever HTTP
can.
The RESTful design pattern is normally used in combination with HTTP, the language of the internet. In
this context the resource's identity is its URI, the data type is its Media Type, and the actions are made up
of the standard HTTP methods (GET, PUT, POST, and DELETE). The HTTP POST method is used for
creating a resource, GET is used to query it, PUT is used to change it, and DELETE is used to destroy it.
The most common RESTful architecture involves a shared data model that is used across these four
operations. This data model defines the input to the POST method (create), the output for the GET
method (inquire) and the input to the PUT method (replace). A fifth HTTP method called 'HEAD' is
sometimes supported by RESTful web services. This method is equivalent to GET, except that it returns
only HTTP headers, and no body data. It's sometimes used to test the existence of a resource. Not all
RESTful APIs support use of the HEAD method. These correspond to create, read, update, and delete (or
CRUD) operations, respectively. There are a number of other verbs, too, but are utilized less frequently.
The POST verb is most-often utilized to **create** new resources. In particular, it's used to create
subordinate resources. That is, subordinate to some other (e.g. parent) resource. In other words, when
creating a new resource, POST to the parent and the service takes care of associating the new resource
with the parent, assigning an ID (new resource URI), etc.
On successful creation, return HTTP status 201 along with a Location header containing a link to the
newly created resource.
POST is neither safe nor idempotent. It is therefore recommended for non-idempotent resource requests.
Making two identical POST requests will most-likely result in two resources containing the same
information.
Examples:
POST http://www.example.com/customers
POST http://www.example.com/customers/12345/orders
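A minimal client-side sketch of such a POST, assuming the hypothetical example.com API above and the third-party requests library:

# Creating a subordinate resource with POST and reading the Location header.
import requests

new_customer = {"name": "Asha", "city": "Meerut"}
resp = requests.post("http://www.example.com/customers", json=new_customer)

if resp.status_code == 201:
    # The server assigns the new resource's URI and returns it in Location.
    print("created at:", resp.headers.get("Location"))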
The HTTP GET method is used to **read** (or retrieve) a representation of a resource. In the "happy"
(or non-error) path, GET returns a representation in XML or JSON and an HTTP response code of 200
(OK). In an error case, it most often returns a 404 (NOT FOUND) or 400 (BAD REQUEST).
According to the design of the HTTP specification, GET (along with HEAD) requests are used only to
read data and not change it. Therefore, when used this way, they are considered safe. That is, they can be
called without risk of data modification or corruption—calling it once has the same effect as calling it 10
times, or none at all. Additionally, GET (and HEAD) is idempotent, which means that making multiple
identical requests ends up having the same result as a single request.
Do not expose unsafe operations via GET—it should never modify any resources on the server.
Examples:
GET http://www.example.com/customers/12345
GET http://www.example.com/customers/12345/orders
GET http://www.example.com/buckets/sample
PUT is most-often utilized for **update** capabilities, PUT-ing to a known resource URI with the
request body containing the newly-updated representation of the original resource.
However, PUT can also be used to create a resource in the case where the resource ID is chosen by the
client instead of by the server. In other words, if the PUT is to a URI that contains the value of a non-
existent resource ID. Again, the request body contains a resource representation. Many feel this is
convoluted and confusing. Consequently, this method of creation should be used sparingly, if at all.
Alternatively, use POST to create new resources and provide the client-defined ID in the body
representation—presumably to a URI that doesn't include the ID of the resource (see POST below).
On successful update, return 200 (or 204 if not returning any content in the body) from a PUT. If using
PUT for create, return HTTP status 201 on successful creation. A body in the response is optional—
providing one consumes more bandwidth. It is not necessary to return a link via a Location header in the
creation case since the client already set the resource ID.
PUT is not a safe operation, in that it modifies (or creates) state on the server, but it is idempotent. In
other words, if you create or update a resource using PUT and then make that same call again, the
resource is still there and still has the same state as it did with the first call.
If, for instance, calling PUT on a resource increments a counter within the resource, the call is no longer
idempotent. Sometimes that happens and it may be enough to document that the call is not idempotent.
However, it's recommended to keep PUT requests idempotent. It is strongly recommended to use POST
for non-idempotent requests.
Examples:
PUT http://www.example.com/customers/12345
PUT http://www.example.com/customers/12345/orders/98765
PUT http://www.example.com/buckets/secret_stuff
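The idempotency property can be seen in a small sketch like the one below (again assuming the hypothetical example.com API and the requests library): repeating the same PUT leaves the resource in the same state.

# PUT is idempotent: sending the same full representation twice changes nothing
# beyond what the first call already did.
import requests

customer = {"id": 12345, "name": "Asha", "city": "Meerut"}
url = "http://www.example.com/customers/12345"

first = requests.put(url, json=customer)    # creates or replaces the resource
second = requests.put(url, json=customer)   # repeating it leaves the same state

print(first.status_code, second.status_code)  # typically 200/201, then 200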
PATCH is used for **modify** capabilities. The PATCH request only needs to contain the changes to
the resource, not the complete resource.
This resembles PUT, but the body contains a set of instructions describing how a resource currently
residing on the server should be modified to produce a new version. This means that the PATCH body
should not just be a modified part of the resource, but in some kind of patch language like JSON Patch or
XML Patch.
PATCH is neither safe nor idempotent. However, a PATCH request can be issued in such a way as to be
idempotent, which also helps prevent bad outcomes from collisions between two PATCH requests on the
same resource in a similar time frame. Collisions from multiple PATCH requests may be more dangerous
than PUT collisions because some patch formats need to operate from a known base-point or else they
will corrupt the resource. Clients using this kind of patch application should use a conditional request
such that the request will fail if the resource has been updated since the client last accessed the resource.
For example, the client can use a strong ETag in an If-Match header on the PATCH request.
Examples:
PATCH http://www.example.com/customers/12345
PATCH http://www.example.com/customers/12345/orders/98765
PATCH http://www.example.com/buckets/secret_stuff
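A sketch of a conditional PATCH using a JSON Patch document and an If-Match header follows; the endpoint, ETag fallback value and patch contents are illustrative assumptions.

# Conditional PATCH: fail with 412 instead of corrupting the resource if it
# changed since we last saw it.
import requests

url = "http://www.example.com/customers/12345"
etag = requests.get(url).headers.get("ETag", '"v1"')   # fallback only for this sketch

patch = [{"op": "replace", "path": "/city", "value": "Meerut"}]
resp = requests.patch(
    url,
    json=patch,                                         # JSON Patch document
    headers={"Content-Type": "application/json-patch+json", "If-Match": etag},
)

# 412 Precondition Failed means someone else modified the resource first.
print(resp.status_code)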
The DELETE verb is used to **delete** a resource identified by its URI.
Examples:
DELETE http://www.example.com/customers/12345
DELETE http://www.example.com/customers/12345/orders
DELETE http://www.example.com/bucket/sample
REST != HTTP
Because REST also intends to make the web (internet) more streamlined and standard, Roy Fielding
advocates using REST principles more strictly, and that is where people start comparing
REST with the web (HTTP). Fielding, in his dissertation, nowhere mentioned any implementation
directive, including any protocol preference such as HTTP. As long as you are honoring the 6 guiding
principles of REST, you can call your interface RESTful.
In simplest words, in the REST architectural style, data and functionality are considered resources and are
accessed using Uniform Resource Identifiers (URIs). The resources are acted upon by using a set of
simple, well-defined operations. The clients and servers exchange representations of resources by using a
standardized interface and protocol – typically HTTP.
Resources are decoupled from their representation so that their content can be accessed in a variety of
formats, such as HTML, XML, plain text, PDF, JPEG, JSON, and others. Metadata about the resource is
available and used, for example, to control caching, detect transmission errors, negotiate the appropriate
representation format, and perform authentication or access control. And most importantly, every
interaction with a resource is stateless.
SOAP vs REST
Meaning
  SOAP: Simple Object Access Protocol
  REST: Representational State Transfer
Design
  SOAP: Standardized protocol with pre-defined rules to follow.
  REST: Architectural style with loose guidelines and recommendations.
Approach
  SOAP: Function-driven (data available as services, e.g. "getUser").
  REST: Data-driven (data available as resources, e.g. "user").
Statefulness
  SOAP: Stateless by default, but it is possible to make a SOAP API stateful.
  REST: Stateless (no server-side sessions).
Caching
  SOAP: API calls cannot be cached.
  REST: API calls can be cached.
Security
  SOAP: WS-Security with SSL support. Built-in ACID compliance.
  REST: Supports HTTPS and SSL.
Performance
  SOAP: Requires more bandwidth and computing power.
  REST: Requires fewer resources.
Message format
  SOAP: Only XML.
  REST: Plain text, HTML, XML, JSON, YAML, and others.
Transfer protocol(s)
  SOAP: HTTP, SMTP, UDP, and others.
  REST: Only HTTP.
Recommended for
  SOAP: Enterprise apps, high-security apps, distributed environments, financial services, payment gateways, telecommunication services.
  REST: Public APIs for web services, mobile services, social networks.
Advantages
  SOAP: High security, standardized, extensibility.
  REST: Scalability, better performance, browser-friendliness, flexibility.
Disadvantages
  SOAP: Poorer performance, more complexity, less flexibility.
  REST: Less security, not suitable for distributed environments.
Architectural Constraints / Guiding Principles of REST
REST defines 6 architectural constraints which make any web service – a true RESTful API.
1. Uniform interface
2. Client–server
3. Stateless
4. Cacheable
5. Layered system
6. Code on demand (optional)
1. Uniform interface
As the constraint name itself implies, you MUST decide the API's interface for the resources inside the system
which are exposed to API consumers, and then follow it religiously. A resource in the system should have only
one logical URI, and that should provide a way to fetch related or additional data. It’s always better
to synonymize a resource with a web page.
Any single resource should not be too large and contain each and everything in its representation.
Whenever relevant, a resource should contain links (HATEOAS) pointing to relative URIs to fetch
related information.
Also, the resource representations across the system should follow specific guidelines such as naming
conventions, link formats, or data format (XML or/and JSON).
All resources should be accessible through a common approach such as HTTP GET and similarly
modified using a consistent approach.
Once a developer becomes familiar with one of your APIs, they should be able to follow a similar approach for the other APIs.
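As a purely illustrative sketch, the Python dictionary below shows what a uniform, link-bearing (HATEOAS-style) representation of a customer resource might look like before being serialized to JSON; the field names and URIs are assumptions, not a prescribed schema.
import json

customer = {
    "id": "12345",
    "name": "Example Customer",
    # Links let the client discover related resources from the representation
    # itself instead of hard-coding URIs.
    "links": [
        {"rel": "self", "href": "/customers/12345"},
        {"rel": "orders", "href": "/customers/12345/orders"},
    ],
}

print(json.dumps(customer, indent=2))  # what the API response body would carry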
2. Client–server
This essentially means that client application and server application MUST be able to evolve separately
without any dependency on each other. A client should know only resource URIs, and that’s all. Today,
this is normal practice in web development, so nothing fancy is required from your side. Keep it simple.
Servers and clients may also be replaced and developed independently, as long as the interface between
them is not altered.
3. Stateless
Roy Fielding took inspiration from HTTP, and it reflects in this constraint. Make all client-server interactions stateless. The server will not store anything about the latest HTTP request the client made. It will treat every request as new. No session, no history.
If the client application needs to be a stateful application for the end user, where the user logs in once and performs other authorized operations after that, then each request from the client should contain all the information necessary to service the request, including authentication and authorization details.
No client context shall be stored on the server between requests. The client is responsible for managing
the state of the application.
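A minimal sketch of the statelessness rule in Python with the requests library: every call carries its own credentials instead of relying on a server-side session. The URL and token value are illustrative assumptions.
import requests

URL = "http://www.example.com/customers/12345/orders"
TOKEN = "example-access-token"  # assumed to have been obtained at login time

# Each request is self-contained: it carries the authentication it needs, so
# the server does not have to remember anything between requests.
resp = requests.get(URL, headers={"Authorization": f"Bearer {TOKEN}"})
print(resp.status_code)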
4. Cacheable
In today’s world, caching of data and responses is of the utmost importance wherever it is applicable/possible. The webpage you are reading here is also a cached version of the HTML page. Caching brings a performance improvement for the client side and better scope for scalability for the server because the load on it is reduced.
In REST, caching shall be applied to resources when applicable, and then these resources MUST declare
themselves cacheable. Caching can be implemented on the server or client-side.
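A minimal sketch, assuming a Flask-based service, of how a resource can declare itself cacheable by attaching Cache-Control and ETag headers to its responses; the route and data are illustrative.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/customers/12345")
def get_customer():
    resp = jsonify({"id": "12345", "name": "Example Customer"})
    # Declare the representation cacheable for 60 seconds and attach a
    # validator so clients and proxies can revalidate it cheaply.
    resp.headers["Cache-Control"] = "public, max-age=60"
    resp.headers["ETag"] = '"customer-12345-v1"'
    return resp

if __name__ == "__main__":
    app.run()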
5. Layered system
REST allows you to use a layered system architecture where you deploy the APIs on server A, and store
data on server B and authenticate requests in Server C, for example. A client cannot ordinarily tell
whether it is connected directly to the end server, or to an intermediary along the way.
6. Code on demand (optional)
Well, this constraint is optional. Most of the time, you will be sending static representations of resources in the form of XML or JSON. But when you need to, you are free to return executable code to support a part of your application, e.g., clients may call your API to get a UI widget's rendering code. It is permitted.
WEB SERVICE
A Web Service can be defined by following ways:
A web service is any piece of software that makes itself available over the internet and uses a
standardized XML messaging system
Web services are application components, self-contained and self-describing
It exposes the existing function on the internet
It is a collection of open protocols and standards used for exchanging data between applications or
systems
Web services can be discovered using UDDI
XML is the basis for Web services
A SOAP message is an ordinary XML document containing the following elements:
Envelope: (Mandatory) - Defines the start and the end of the message.
Header: (Optional)- Contains any optional attributes of the message used in processing the message,
either at an intermediary point or at the ultimate end point.
Body: (Mandatory) - Contains the XML data comprising the message being sent.
Fault: (Optional) - An optional Fault element that provides information about errors that occurred while
processing the message
All these elements are declared in the default namespace for the SOAP envelope
The SOAP envelope indicates the start and the end of the message so that the receiver knows when an entire message has been received. The SOAP envelope solves the problem of knowing when you're done receiving a message and are ready to process it. The SOAP envelope is therefore basically a packaging mechanism.
The SOAP body is a mandatory element which contains the application-defined XML data being
exchanged in the SOAP message. The body must be contained within the envelope and must follow any
headers that might be defined for the message. The body is defined as a child element of the envelope,
and the semantics for the body are defined in the associated SOAP schema.
The body contains mandatory information intended for the ultimate receiver of the message
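To make the Envelope/Body structure concrete, here is a hedged Python sketch that builds a minimal SOAP 1.1 envelope as a string and posts it with the requests library. The endpoint, SOAPAction value, and GetUser operation are illustrative assumptions, not a real service.
import requests

ENDPOINT = "http://www.example.com/soap/userservice"  # assumed endpoint

# A minimal SOAP 1.1 message: the mandatory Envelope, the optional Header
# omitted, and the mandatory Body carrying the application-defined XML.
envelope = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetUser xmlns="urn:examples:userservice">
      <UserID>12345</UserID>
    </GetUser>
  </soap:Body>
</soap:Envelope>"""

resp = requests.post(
    ENDPOINT,
    data=envelope.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8",
             "SOAPAction": "GetUser"},  # assumed action name
)
print(resp.status_code)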
SOAP Fault:
When an error occurs during processing, the response to a SOAP message is a SOAP fault element in the
body of the message, and the fault is returned to the sender of the SOAP message.
The SOAP fault mechanism returns specific information about the error, including a predefined code, a description, and the address of the SOAP processor that generated the fault.
UDDI
Universal Description, Discovery and Integration (UDDI) is a directory service where businesses can
register and search for Web services. UDDI is a platform-independent framework for describing services,
discovering businesses, and integrating business services by using the Internet.
UDDI uses World Wide Web Consortium (W3C) and Internet Engineering Task Force (IETF)
Internet standards such as XML, HTTP, and DNS protocols.
UDDI uses WSDL to describe interfaces to web services
Additionally, cross platform programming features are addressed by adopting SOAP, known as
XML Protocol messaging specifications found at the W3C Web site.
UDDI Benefits
Making it possible to discover the right business from the millions currently online
Defining how to enable commerce once the preferred business is discovered
Reaching new customers and increasing access to current customers
Expanding offerings and extending market reach
Solving customer-driven need to remove barriers to allow for rapid participation in the global
Internet economy
Describing services and business processes programmatically in a single, open, and secure
environment
If the industry published a UDDI standard for flight rate checking and reservation, airlines could register
their services in a UDDI directory. Travel agencies could then search the UDDI directory to find the
airline's reservation interface. When the interface is found, the travel agency can communicate with the
service immediately because it uses a well-defined reservation interface.
UDDI is a cross-industry effort driven by all major platform and software providers like Dell,
Fujitsu, HP, Hitachi, IBM, Intel, Microsoft, Oracle, SAP, and Sun, as well as a large community
of marketplace operators, and e-business leaders.
Over 220 companies are members of the UDDI community.
WSDL
WSDL: Web Service Description Language is the standard format for describing a web service
in XML format
WSDL is an XML-based protocol for information exchange in decentralized and distributed
environments
WSDL definition describes how to access a web service and what operations it will perform
WSDL is the language that UDDI uses
WSDL was developed jointly by Microsoft and IBM
WSDL Usage:
WSDL is often used in combination with SOAP and XML Schema to provide web services over
the Internet
A client program connecting to a web service can read the WSDL to determine what functions are
available on the server
Any special datatypes used are embedded in the WSDL file in the form of XML Schema
The client can then use SOAP to actually call one of the functions listed in the WSDL
<definitions>
<types>
</types>
<message>
</message>
<portType>
<operation>
</operation>
</portType>
<binding>
</binding>
<service>
</service>
</definitions>
<message>: contains the parameters of each operation, for both the request and the response.
<operation>: contains the operation names and the messages that are involved.
<service>: defines the name of the service and the address of the service (the service endpoint).
Example-
<definitions>
<message name="TutorialRequest">
<part name="TutorialID" type="xsd:string"/>
</message>
<message name="TutorialResponse">
<part name="TutorialName" type="xsd:string"/>
</message>
<portType name="Tutorial_PortType">
<operation name="Tutorial">
<input message="tns:TutorialRequest"/>
<output message="tns:TutorialResponse"/>
</operation>
</portType>
<binding name="Tutorial_Binding" type="tns:Tutorial_PortType">
<soap:binding style="rpc"
transport="http://schemas.xmlsoap.org/soap/http"/>
<operation name="Tutorial">
<soap:operation soapAction="Tutorial"/>
<input>
<soap:body
encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
namespace="urn:examples:Tutorialservice"
use="encoded"/>
</input>
<output>
<soap:body
encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
namespace="urn:examples:Tutorialservice"
use="encoded"/>
</output>
</operation>
</binding>
</definitions>
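As a hedged sketch only: if the WSDL above were hosted at a real URL, a Python client such as the third-party zeep library could read it and invoke the Tutorial operation roughly as follows. The WSDL URL is an assumption, and the call will only succeed against an actual running service.
from zeep import Client  # third-party SOAP client: pip install zeep

WSDL_URL = "http://www.example.com/tutorialservice?wsdl"  # assumed location

# zeep parses the WSDL, discovers the operations of Tutorial_PortType, and
# exposes them as ordinary Python methods on client.service.
client = Client(WSDL_URL)
tutorial_name = client.service.Tutorial("T1")  # send TutorialID, get TutorialName
print(tutorial_name)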
Synchronous services are best when the service can process the request in a small amount of time.
Synchronous services are also best when applications require a more immediate response to a request.
Web services that rely on synchronous communication are usually RPC-oriented. Generally, consider
using an RPC-oriented approach for synchronous Web services.
With asynchronous services, the client invokes the service but does not -- or cannot -- wait for the
response.
The client can continue with some other processing rather than wait for the response. Later, when it does
receive the response, it resumes whatever processing initiated the service request. Generally, a document-oriented approach is used for this asynchronous class of services. Services which process documents tend to use an asynchronous architecture.
Service requestor:- From a business perspective, this is the business that requires certain functions to be satisfied. From an architectural perspective, this is the application that is looking for and invoking or initiating an interaction with a service.
Service registry:- This is a searchable registry of service descriptions where service providers publish
their service descriptions. Service requestors find services and obtain binding information (in the service
descriptions) for services during development for static binding or during execution for dynamic binding.
For statically bound service requestors, the service registry is an optional role in the architecture, because
a service provider can send the description directly to service requestors.
For an application to take advantage of Web Services, three behaviors must take place: publication of service descriptions, lookup or finding of service descriptions, and binding or invoking of services based on the service description. These behaviors can occur singly or iteratively.
The advantages of Web services are numerous, as shown in the list below:
Web services have an easy integration in an information system with a merchant platform
Their components are reusable,
Their interoperability makes it possible to link several systems together
They permit reduction of coupling between systems.
They have an extended functional scope made available to merchants: Import, Inventory,
Order Management, Pricing, After-Sales...
They connect heterogeneous systems
They interconnect middleware or allow it to be installed
They allow servers and machines to communicate,
Reduced computing power is required
They allow multi-user use without disturbing the sources
Easy component updates
Low maintenance (like any big data tool)
They are not linked to any operating system or programming language
Characteristics of web services
The web service must be able to interact with the client and the server, and to be reusable,
therefore they have certain characteristics:
Machine-to-machine interactions
Loose coupling
Interoperability
Platform-independence
Operating system-independence
Language-independence
Leveraging the architecture of the World Wide Web
UNIT-2
CLOUD ENABLING TECHNOLOGIES
Virtualization
Virtualization uses software to create an abstraction layer over computer hardware that allows the hardware
elements of a single computer—processors, memory, storage and more—to be divided into multiple virtual
computers, commonly called virtual machines (VMs). Each VM runs its own operating system (OS) and behaves
like an independent computer, even though it is running on just a portion of the actual underlying computer
hardware.
Definition- Virtualization is the process of creating a software-based, or virtual, representation of something, such as virtual applications, servers, storage and networks. It is the single most effective way to reduce IT expenses while boosting efficiency and agility for businesses of all sizes.
e.g.- Virtualization presents a logical view of the original things. In a real-time scenario, when a user opens the My Computer icon, several hard drive partitions appear, say Local Disk (C:), Local Disk (D:), Local Disk (E:) and so on. A partition is a logical division of a hard disk drive that creates the user-friendly view of multiple separate hard drives. Are there really multiple separate hard drives? No, not at all. You have probably seen the hard disk in your computer; it is a single device. Hence virtualization is a technique by which a logical view of actual things can be created. In a cloud computing environment, multiple servers are connected through a network, and the virtualization technique creates an interface through which users can access data and deploy their applications without needing to know the underlying process, just like in our hard drive example.
A brief history of virtualization
While virtualization technology can be sourced back to the 1960s, it wasn’t widely adopted until the early 2000s.
The technologies that enabled virtualization—like hypervisors—were developed decades ago to give multiple
users simultaneous access to computers that performed batch processing. Batch processing was a popular
computing style in the business sector that ran routine tasks thousands of times very quickly (like payroll).
But, over the next few decades, other solutions to the many users/single machine problem grew in popularity while
virtualization didn’t. One of those other solutions was time-sharing, which isolated users within operating
systems—inadvertently leading to other operating systems like UNIX, which eventually gave way to Linux®. All
the while, virtualization remained a largely unadopted, niche technology.
Fast forward to the 1990s. Most enterprises had physical servers and single-vendor IT stacks, which didn’t
allow legacy apps to run on a different vendor’s hardware. As companies updated their IT environments with less-
expensive commodity servers, operating systems, and applications from a variety of vendors, they were bound to
underused physical hardware—each server could only run 1 vendor-specific task.
This is where virtualization really took off. It was the natural solution to 2 problems: companies could partition
their servers and run legacy apps on multiple operating system types and versions. Servers started being used more
efficiently (or not at all), thereby reducing the costs associated with purchase, set up, cooling, and maintenance.
Virtualization’s widespread applicability helped reduce vendor lock-in and made it the foundation of cloud
computing. It’s so prevalent across enterprises today that specialized virtualization management software is often
needed to help keep track of it all.
Benefits of virtualization
i. Efficiency:
Virtualization lets you have one machine serve as many virtual machines. This not only means you need fewer
servers, but you can use the ones you have to their fullest capacity. These efficiency gains translate into cost
savings on hardware, cooling, and maintenance—not to mention the environmental benefit of a lower carbon
footprint.
Virtualization also allows you to run multiple types of apps, desktops, and operating systems on a single machine,
instead of requiring separate servers for different vendors. This frees you from relying on specific vendors and
makes the management of your IT resources much less time-consuming, allowing your IT team to be more
productive.
ii. Reliability:
Virtualization technology allows you to easily back up and recover your data using virtual machine snapshots of
existing servers. It’s also simple to automate this backup process to keep all your data up to date. If an emergency
happens and you need to restore from a backed up virtual machine, it’s easy to migrate this virtual machine to a
new location in a few minutes. This results in greater reliability and business continuity because it’s easier to
recover from disaster or loss.
iii. Flexibility:
Virtualization software gives your organization more flexibility in how you test and allocate resources. Because of
how easy it is to back up and restore virtual machines, your IT team can test and experiment with new technology
easily. Virtualization also lets you create a cloud strategy by allocating virtual machine resources into a shared
pool for your organization. This cloud-based infrastructure gives your IT team control over who accesses which
resources and from which devices, improving security and flexibility.
iv. Minimal downtime: OS and application crashes can cause downtime and disrupt user productivity.
Admins can run multiple redundant virtual machines alongside each other and failover between them when
problems arise. Running multiple redundant physical servers is more expensive.
v. Faster provisioning: Buying, installing, and configuring hardware for each application is time-consuming.
Provided that the hardware is already in place, provisioning virtual machines to run all your applications is
significantly faster. You can even automate it using management software and build it into existing
workflows.
vi. Easier management: Replacing physical computers with software-defined VMs makes it easier to use and
manage policies written in software. This allows you to create automated IT service management
workflows. For example, automated deployment and configuration tools enable administrators to define
collections of virtual machines and applications as services, in software templates. This means that they
can install those services repeatedly and consistently without cumbersome, time-consuming, and error-prone manual setup. Admins can use virtualization security policies to mandate certain security
configurations based on the role of the virtual machine. Policies can even increase resource efficiency by
retiring unused virtual machines to save on space and computing power.
Hypervisors
A hypervisor is the software layer that coordinates VMs. It serves as an interface between the VM and the
underlying physical hardware, ensuring that each has access to the physical resources it needs to execute. It also
ensures that the VMs don’t interfere with each other by impinging on each other’s memory space or compute
cycles.
A. Type 1 or “bare-metal” hypervisors interact with the underlying physical resources, replacing the traditional
operating system altogether. They most commonly appear in virtual server scenarios.
B. Type 2 hypervisors run as an application on an existing OS. Most commonly used on endpoint devices to run
alternative operating systems, they carry a performance overhead because they must use the host OS to access
and coordinate the underlying hardware resources.
What is a virtual machine?
A virtual machine is a computer file, typically called an image, which behaves like an actual computer. In other
words, creating a computer within a computer. It runs in a window, much like any other programme, giving the
end user the same experience on a virtual machine as they would have on the host operating system itself. The
virtual machine is sandboxed from the rest of the system, meaning that the software inside a virtual machine
cannot escape or tamper with the computer itself. This produces an ideal environment for testing other operating
systems including beta releases, accessing virus-infected data, creating operating system backups and running
software or applications on operating systems for which they were not originally intended.
Multiple virtual machines can run simultaneously on the same physical computer. For servers, the multiple
operating systems run side-by-side with a piece of software called a hypervisor to manage them, while desktop
computers typically employ one operating system to run the other operating systems within its programme
windows. Each virtual machine provides its own virtual hardware, including CPUs, memory, hard drives, network
interfaces and other devices. The virtual hardware is then mapped to the real hardware on the physical machine
which saves costs by reducing the need for physical hardware systems along with the associated maintenance costs
that go with it, plus reduces power and cooling demand.
Virtualization methods can change based on the user’s operating system. For example, Linux machines offer a
unique open-source hypervisor known as the kernel-based virtual machine (KVM). Because KVM is part of
Linux, it allows the host machine to run multiple VMs without a separate hypervisor. However, KVM is not
supported by all IT solution providers and requires Linux expertise in order to implement it.
There are five major needs of virtualization which are described below:
1. ENHANCED PERFORMANCE-
Currently, the end-user system, i.e. the PC, is sufficiently powerful to fulfill all the basic computation requirements of the user, with various additional capabilities which are rarely used. Most of these systems have sufficient resources to host a virtual machine manager and run a virtual machine with acceptable performance.
3. SHORTAGE OF SPACE- The regular requirement for additional capacity, whether memory storage or compute power, leads data centers to grow rapidly. Companies like Google, Microsoft and Amazon develop their infrastructure by building data centers as per their needs. Most other enterprises are unable to afford building another data center to accommodate additional resource capacity. This has led to the diffusion of a technique known as server consolidation.
4. ECO-FRIENDLY INITIATIVES-
At this time, corporations are actively seeking various methods to minimize the expenditure on power consumed by their systems. Data centers are major power consumers: maintaining data center operations requires a continuous power supply, and a good amount of energy is needed to keep the equipment cool so that it functions well. Therefore, server consolidation reduces the power consumed and the cooling impact by lowering the number of servers. Virtualization can provide a sophisticated method of server consolidation.
5. ADMINISTRATIVE COSTS-
Furthermore, the rise in demand for extra capacity, which translates into more servers in a data center, is responsible for a significant increase in administrative costs. Hardware monitoring, server setup and updates, defective hardware replacement, server resource monitoring, and backups are common system administration tasks. These are personnel-intensive operations. Administrative costs increase with the number of servers. Virtualization decreases the number of servers required for a given workload and hence reduces the cost of administrative staff.
VIRTUALIZATION REFERENCE MODEL-
Cloud computing is the delivery of shared computing resources, software, or data as a service through the
internet. Virtualization technology makes cloud computing possible by allocating virtual resources into
centralized pools that can be easily managed and deployed using a layer of management software. Here’s how
this works:
1. Virtualization uses a hypervisor to create virtual machines from physical servers, making those servers’
computing power, applications, or storage available to virtual environments.
2. These virtual resources get pooled together in a central location (usually an on-premises data center) that other
computers can access via a network. This centralized resource pool is also called a cloud.
3. When computers on the network need more storage or computing power, the cloud’s management software
lets administrators easily provision and provide these resources to the requesting computer. This step can also
be automated to enable a "self-service" element to the cloud, so users don’t have to wait for admin approval.
4. Once the requesting computer no longer needs the cloud computing or storage, the cloud’s automation
capabilities can turn off the extra resources to reduce waste and control computing costs. This is known as
elastic or automated infrastructure scaling.
Types of virtualization
To this point we’ve discussed server virtualization, but many other IT infrastructure elements can be
virtualized to deliver significant advantages to IT managers (in particular) and the enterprise as a whole. In
this section, we'll cover the following types of virtualization:
Desktop virtualization
Network virtualization
Storage virtualization
Data virtualization
Application virtualization
Data center virtualization
CPU virtualization
GPU virtualization
Linux virtualization
Cloud virtualization
Desktop virtualization
Desktop virtualization lets you run multiple desktop operating systems, each in its own VM on the same
computer.
Virtual desktop infrastructure (VDI) runs multiple desktops in VMs on a central server and streams
them to users who log in on thin client devices. In this way, VDI lets an organization provide its users with access to a variety of OSs from any device, without installing OSs on any device.
Local desktop virtualization runs a hypervisor on a local computer, enabling the user to run one or
more additional OSs on that computer and switch from one OS to another as needed without changing
anything about the primary OS.
Network virtualization
Network virtualization uses software to create a "view" of the network that an administrator can use to manage the network from a single console. It decouples hardware elements and functions (e.g., connections, switches, routers, etc.) and abstracts them into software running on a hypervisor. The network administrator
can modify and control these elements without touching the underlying physical components, which
dramatically simplifies network management.
Types of network virtualization include software-defined networking (SDN), which virtualizes hardware that controls network traffic routing (called the "control plane"), and network function virtualization (NFV),
which virtualizes one or more hardware appliances that provide a specific network function (e.g., a
firewall, load balancer, or traffic analyzer), making those appliances easier to configure, provision, and
manage.
Storage virtualization
Storage virtualization enables all the storage devices on the network— whether they’re installed on individual
servers or standalone storage units—to be accessed and managed as a single storage device. Specifically, storage virtualization aggregates all blocks of storage into a single shared pool from which they can be assigned
to any VM on the network as needed. Storage virtualization makes it easier to provision storage for VMs and
makes maximum use of all available storage on the network.
Data virtualization
Modern enterprises store data from multiple applications, using multiple file formats, in multiple locations,
ranging from the cloud to on-premise hardware and software systems. Data virtualization lets any application
access all of that data—irrespective of source, format, or location.
Data virtualization tools create a software layer between the applications accessing the data and the systems
storing it. The layer translates an application’s data request or query as needed and returns results that can
span multiple systems. Data virtualization can help break down data silos when other types of integration
aren’t feasible, desirable, or affordable.
Application virtualization
Application virtualization runs application software without installing it directly on the user’s OS. This differs
from complete desktop virtualization (mentioned above) because only the application runs in a virtual
environment—the OS on the end user’s device runs as usual. There are three types of application
virtualization:
Local application virtualization: The entire application runs on the endpoint device but runs in a
runtime environment instead of on the native hardware.
Application streaming: The application lives on a server which sends small components of the
software to run on the end user's device when needed.
Server-based application virtualization: The application runs entirely on a server that sends only its
user interface to the client device.
Data center virtualization
Data center virtualization abstracts most of a data center’s hardware into software, effectively enabling an
administrator to divide a single physical data center into multiple virtual data centers for different clients.
Each client can access its own infrastructure as a service (IaaS), which would run on the same underlying
physical hardware. Virtual data centers offer an easy on-ramp into cloud-based computing, letting a company
quickly set up a complete data center environment without purchasing infrastructure hardware.
CPU virtualization
CPU (central processing unit) virtualization is the fundamental technology that makes hypervisors, virtual
machines, and operating systems possible. It allows a single CPU to be divided into multiple virtual CPUs for
use by multiple VMs.
At first, CPU virtualization was entirely software-defined, but many of today’s processors include extended
instruction sets that support CPU virtualization, which improves VM performance.
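As a small, hedged example assuming a Linux host, the snippet below inspects /proc/cpuinfo for the Intel VT-x ("vmx") or AMD-V ("svm") CPU flags that indicate hardware support for CPU virtualization.
def has_hw_virtualization(cpuinfo_path="/proc/cpuinfo"):
    """Return True if the CPU advertises the vmx (Intel) or svm (AMD) flag."""
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                if line.startswith("flags"):
                    flags = line.split(":", 1)[1].split()
                    return "vmx" in flags or "svm" in flags
    except OSError:
        pass  # not a Linux host, or the file is unreadable
    return False

print("Hardware virtualization supported:", has_hw_virtualization())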
GPU virtualization
A GPU (graphical processing unit) is a special multi-core processor that improves overall computing
performance by taking over heavy-duty graphic or mathematical processing. GPU virtualization lets multiple
VMs use all or some of a single GPU’s processing power for faster video, artificial intelligence (AI), and
other graphic- or math-intensive applications.
Pass-through GPUs make the entire GPU available to a single guest OS.
Shared vGPUs divide physical GPU cores among several virtual GPUs (vGPUs) for use by server-
based VMs.
Linux virtualization
Linux includes its own hypervisor, called the kernel-based virtual machine (KVM), which supports Intel and
AMD’s virtualization processor extensions so you can create x86-based VMs from within a Linux host OS.
As an open source OS, Linux is highly customizable. You can create VMs running versions of Linux tailored
for specific workloads or security-hardened versions for more sensitive applications.
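A minimal sketch, assuming a Linux host with KVM and the libvirt-python bindings installed, that connects to the local hypervisor and lists the virtual machines it manages; qemu:///system is the conventional URI for the system-level KVM/QEMU daemon.
import libvirt  # pip install libvirt-python (requires libvirt on the host)

conn = libvirt.open("qemu:///system")  # connect to the local KVM/QEMU hypervisor

# List every domain (VM) known to the hypervisor and report its state.
for dom in conn.listAllDomains():
    state = "running" if dom.isActive() else "stopped"
    print(f"{dom.name()}: {state}")

conn.close()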
Cloud virtualization
As noted above, the cloud computing model depends on virtualization. By virtualizing servers, storage, and
other physical data center resources, cloud computing providers can offer a range of services to customers,
including the following:
Infrastructure as a service (IaaS): Virtualized server, storage, and network resources that you can configure based on your requirements.
Platform as a service (PaaS): Virtualized development tools, databases, and other cloud-based services you can use to build your own cloud-based applications and solutions.
Software as a service (SaaS): Software applications you use on the cloud. SaaS is the cloud-based
service most abstracted from the hardware.
Virtualization is not that easy to implement. A computer runs an OS that is configured to that particular
hardware. Running a different OS on the same hardware is not exactly feasible.
To tackle this, there exists a hypervisor. What the hypervisor does is act as a bridge between the virtual OS and the hardware to enable the smooth functioning of the instance. There are five levels of virtualization available that are most commonly used in the industry. These are as follows:
Instruction Set Architecture (ISA) Level
In ISA, virtualization works through an ISA emulation. This is helpful to run heaps of legacy code which was
originally written for different hardware configurations. These codes can be run on the virtual machine
through an ISA. A binary code that might need additional layers to run can now run on an x86 machine or
with some tweaking, even on x64 machines. ISA helps make this a hardware-agnostic virtual machine. The
basic emulation, though, requires an interpreter. This interpreter interprets the source code and converts it to
hardware readable format for processing.
Hardware Abstraction Level (HAL)
As the name suggests, this level helps perform virtualization at the hardware level. It uses a bare hypervisor
for its functioning. This level helps form the virtual machine and manages the hardware through virtualization.
It enables virtualization of each hardware component such as I/O devices, processors, memory etc. This way
multiple users can use the same hardware with numerous instances of virtualization at the same time. IBM first implemented this approach on its mainframes in the 1960s, which later evolved into the VM/370 system. It is more usable for cloud-based infrastructure.
Thus, it is no surprise that currently, Xen hypervisors are using HAL to run Linux and other OS on x86 based
machines.
Operating System Level
At the operating system level, the virtualization model creates an abstract layer between the applications and
the OS. It is like an isolated container on the physical server and operating system that utilizes hardware and
software. Each of these containers functions like servers. When the number of users is high, and no one is
willing to share hardware, this level of virtualization comes in handy. Here, every user gets their own virtual
environment with dedicated virtual hardware resource. This way, no conflicts arise.
Library Level
OS system calls are lengthy and cumbersome, which is why applications often opt for APIs from user-level libraries instead. Most of the APIs provided by systems are rather well documented. Hence, library-level virtualization is preferred in such scenarios. Library interfacing virtualization is made possible by API hooks.
These API hooks control the communication link from the system to the applications. Some tools available
today, such as vCUDA and WINE, have successfully demonstrated this technique.
Application Level
Application-level virtualization comes in handy when you wish to virtualize only an application. It does not virtualize an entire platform or environment. On an operating system, applications work as one process.
Hence it is also known as process-level virtualization. It is generally useful when running virtual machines
with high-level languages. Here, the application sits on top of the virtualization layer, which is above the
application program. The application program is, in turn, residing in the operating system. Programs written in
high-level languages and compiled for an application-level virtual machine can run fluently here. Even though
there are five levels of virtualization, each enterprise doesn’t need to use all of them. It depends on what the
company is working on as to which level of virtualization it prefers. Companies tend to use virtual machines
for development and testing of cross-platform applications. With cloud-based applications on the rise,
virtualization has become a must-have for enterprises across the globe.
Virtualization Structures
Before virtualization, the operating system manages the hardware. After virtualization, a virtualization layer is
inserted between the hardware and the operating system. In such a case, the virtualization layer is responsible
for converting portions of the real hardware into virtual hardware. Therefore, different operating systems such
as Linux and Windows can run on the same physical machine, simultaneously. Depending on the position of
the virtualization layer, there are several classes of VM architectures, namely –
1. Hypervisor and Xen Architecture
2. Binary Translation with Full Virtualization
3. Para-Virtualization with Compiler Support
In the Xen architecture, the guest OS which has control ability is called Domain 0, and the others are called Domain U. Domain 0 is
a privileged guest OS of Xen. It is first loaded when Xen boots without any file system drivers being
available. Domain 0 is designed to access hardware directly and manage devices. Therefore, one of the
responsibilities of Domain 0 is to allocate and map hardware resources for the guest domains (the Domain U
domains).
i. Full Virtualization
With full virtualization, noncritical instructions run on the hardware directly while critical instructions are
discovered and replaced with traps into the VMM to be emulated by software. Both the hypervisor and VMM
approaches are considered full virtualization. Why are only critical instructions trapped into the VMM? This
is because binary translation can incur a large performance overhead. Noncritical instructions do not control
hardware or threaten the security of the system, but critical instructions do. Therefore, running noncritical
instructions on hardware not only can promote efficiency, but also can ensure system security.
ii. Binary Translation of Guest OS Requests Using a VMM
This approach was implemented by VMware and many other software companies. As shown in Figure 3.6,
VMware puts the VMM at Ring 0 and the guest OS at Ring 1. The VMM scans the instruction stream and
identifies the privileged, control- and behavior-sensitive instructions. When these instructions are identified,
they are trapped into the VMM, which emulates the behavior of these instructions. The method used in this
emulation is called binary translation. Therefore, full virtualization combines binary translation and direct execution. The guest OS is completely decoupled from the underlying hardware. Consequently, the guest OS
is unaware that it is being virtualized.
The performance of full virtualization may not be ideal, because it involves binary translation, which is rather time-consuming. In particular, the full virtualization of I/O-intensive applications is really a big challenge.
Binary translation employs a code cache to store translated hot instructions to improve performance, but it
increases the cost of memory usage. At the time of this writing, the performance of full virtualization on the
x86 architecture is typically 80 percent to 97 percent that of the host machine.
iii. Host-Based Virtualization
An alternative VM architecture is to install a virtualization layer on top of the host OS. This host OS is still
responsible for managing the hardware. The guest OSes are installed and run on top of the virtualization layer.
This host-based architecture has some distinct advantages, as enumerated next. First, the user can install this
VM architecture without modifying the host OS. The virtualizing software can rely on the host OS to provide
device drivers and other low-level services. This will simplify the VM design and ease its deployment.
3. Para-Virtualization with Compiler Support
Para-virtualization needs to modify the guest operating systems. A para-virtualized VM provides special APIs
requiring substantial OS modifications in user applications. Performance degradation is a critical issue of a
virtualized system. No one wants to use a VM if it is much slower than using a physical machine. The
virtualization layer can be inserted at different positions in a machine software stack. However, para-virtualization attempts to reduce the virtualization overhead, and thus improve performance, by modifying only the guest OS kernel.
Virtualization of CPU –Memory – I/O Devices
CPU Virtualization- CPU virtualization emphasizes running programs and instructions through a virtual machine, giving the feeling of working on a physical workstation. All the operations are handled by an emulator that controls how the software runs. Nevertheless, CPU virtualization does not act as an emulator. The emulator performs the same way as a normal computer machine does. It replicates the same copy of data and generates the same output, just like a physical machine does. The emulation function offers great portability and facilitates working on a single platform while acting like working on multiple platforms.
With CPU virtualization, all the virtual machines act as physical machines and share their hosting resources just as if there were various virtual processors. Sharing of physical resources takes place for each virtual machine when the hosting service gets a request. Finally, each virtual machine gets a share of the single CPU allocated to it, with the single processor acting as a dual processor.
Memory Virtualization- It introduces a way to decouple memory from the server to provide a shared,
distributed or networked function.
It enhances performance by providing greater memory capacity without any addition to the main memory.
That’s why a portion of the disk drive serves as an extension of the main memory.
Implementations –
Application-level integration – Applications running on connected computers directly connect to the
memory pool through an API or the file system.
Operating System-Level Integration – The operating system first connects to the memory pool and makes
that pooled memory available to applications.
I/O Virtualization- I/O virtualization involves managing the routing of I/O requests between virtual
devices and the shared physical hardware. At the time of this writing, there are three ways to implement I/O
virtualization: full device emulation, para-virtualization, and direct I/O. Full device emulation is the first
approach for I/O virtualization. Generally, this approach emulates well-known, real-world devices.
I/O virtualization provides a foothold for many innovative and beneficial enhancements of the logical I/O
devices. The ability to interpose on the I/O stream in and out of a VM has been widely exploited in both
research papers and commercial virtualization systems.
One useful capability enabled by I/O virtualization is device aggregation, where multiple physical devices can
be combined into a single more capable logical device that is exported to the VM. Examples include
combining multiple disk storage devices exported as a single larger disk, and network channel bonding where
multiple network interfaces can be combined to appear as a single faster network interface.
New features can be added to existing systems by interposing and transforming virtual I/O requests,
transparently enhancing unmodified software with new capabilities. For example, a disk write can be
transformed into replicated writes to multiple disks, so that the system can tolerate disk-device failures.
Similarly, by logging and tracking the changes made to a virtual disk, the virtualization layer can offer a time-
travel feature, making it possible to move a VM’s file system backward to an earlier point in time. This
functionality is a key ingredient of the snapshot and undo features found in many desktop virtualization
systems.
Many I/O virtualization enhancements are designed to improve system security. A simple example is running
an encryption function over the I/O to and from a disk to implement transparent disk encryption. Interposing
on network traffic allows virtualization layers to implement advanced networking security, such as firewalls
and intrusion-detection systems employing deep packet inspection.
Disaster Recovery:-
Virtual disaster recovery is a combination of storage and server virtualization that helps to create more
effective means of disaster recovery and backup. It is now popular in many enterprise systems because of the
many ways that it helps to mitigate risk.
The general idea of virtual disaster recovery is that combining server and storage virtualization allows
companies to store backups in places that are not tied to their own physical location. This protects data and
systems from fires, floods and other types of natural disasters, as well as other emergencies. Many vendor
systems feature redundant design with availability zones, so that if data in one zone is compromised, another
zone can keep backups alive.
UNIT: - III
CLOUD ARCHITECTURE, SERVICES AND STORAGE
All of the physical manifestations of cloud computing can be arranged into a layered picture
that encompasses anything from software systems to hardware appliances. Utilizing cloud
resources can provide the “computing horsepower” needed to deliver services. This layer is frequently implemented using a data center with dozens or even millions of stacked nodes.
Because it can be constructed from a range of resources, including clusters and even
networked PCs, cloud infrastructure can be heterogeneous in character. The infrastructure can
also include database systems and other storage services.
The core middleware, whose goals are to create an optimal runtime environment for
applications and to best utilize resources, manages the physical infrastructure. Virtualization
technologies are employed at the bottom of the stack to ensure runtime environment
modification, application isolation, sandboxing, and service quality. At this level, hardware
virtualization is most frequently utilized. The distributed infrastructure is exposed as a
collection of virtual computers via hypervisors, which control the pool of available resources.
By adopting virtual machine technology, it is feasible to precisely divide up hardware
resources like CPU and memory as well as virtualize particular devices to accommodate user
and application needs.
Application Layer
1. The application layer, which is at the top of the stack, is where the actual cloud apps
are located. Cloud applications, as opposed to traditional applications, can take
advantage of the automatic-scaling functionality to gain greater performance,
availability, and lower operational costs.
2. This layer consists of different Cloud Services which are used by cloud users. Users
can access these applications according to their needs. Applications are divided into
Execution layers and Application layers.
3. In order for an application to transfer data, the application layer determines whether
communication partners are available. Whether enough cloud resources are accessible
for the required communication is decided at the application layer. Applications must
cooperate in order to communicate, and an application layer is in charge of this.
4. The application layer, in particular, is responsible for processing IP traffic handling
protocols like Telnet and FTP. Other examples of application layer systems include
web browsers, SNMP, HTTP, and HTTPS, which is HTTP secured with SSL/TLS.
Platform Layer
1. The operating system and application software make up this layer.
2. Users should be able to rely on the platform to provide them with scalability, dependability, and security protection, which gives users a space to create their apps, test operational processes, and keep track of execution outcomes and performance. This layer is also the foundation on which SaaS applications are implemented.
3. The objective of this layer is to deploy applications directly on virtual machines.
4. Operating systems and application frameworks make up the platform layer, which is
built on top of the infrastructure layer. The platform layer’s goal is to lessen the
difficulty of deploying programmers directly into VM containers.
5. By way of illustration, Google App Engine functions at the platform layer to provide
API support for implementing storage, databases, and business logic of ordinary web
apps.
Infrastructure Layer
1. It is a layer of virtualization where physical resources are divided into a collection of
virtual resources using virtualization technologies like Xen, KVM, and VMware.
2. This layer serves as the Central Hub of the Cloud Environment, where resources are
constantly added utilizing a variety of virtualization techniques.
3. It provides a base upon which to create the platform layer, constructed using the virtualized network, storage, and computing resources, and gives users the flexibility they want.
4. Automated resource provisioning is made possible by virtualization, which also
improves infrastructure management.
5. The infrastructure layer sometimes referred to as the virtualization layer, partitions the
physical resources using virtualization technologies like Xen, KVM, Hyper-V, and
VMware to create a pool of compute and storage resources.
6. The infrastructure layer is crucial to cloud computing since virtualization technologies
are the only ones that can provide many vital capabilities, like dynamic resource
assignment.
Datacenter Layer
1. In a cloud environment, this layer is responsible for Managing Physical Resources
such as servers, switches, routers, power supplies, and cooling systems.
2. Providing end users with services requires all resources to be available and managed
in data centers.
3. Physical servers connect through high-speed devices such as routers and switches to
the data center.
4. In software application designs, the division of business logic from the persistent data
it manipulates is well-established. This is due to the fact that the same data cannot be
incorporated into a single application because it can be used in numerous ways to
support numerous use cases. The requirement for this data to become a service has
arisen with the introduction of microservices.
5. A single database used by many microservices creates a very close coupling. As a
result, it is hard to deploy new or emerging services separately if such services need
database modifications that may have an impact on other services. A data layer
containing many databases, each serving a single microservice or perhaps a few
closely related microservices, is needed to break complex service interdependencies.
Comparison of cloud deployment models:
Factors: Public Cloud | Private Cloud | Community Cloud | Hybrid Cloud
Scalability and Flexibility: High | High | Fixed | High
Cost-Comparison: Cost-effective | Costly | Cost distributed among members | Between public and private cloud
1. INFRASTRUCTURE AS A SERVICE (IAAS)
Infrastructure as a Service (IaaS) is a type of cloud computing that delivers virtualized computing resources such as servers, storage, and networking over the Internet on a pay-per-use basis.
Advantages of IaaS
IaaS is cost-effective as it eliminates capital expenses.
IaaS cloud provider provides better security than any other software.
IaaS provides remote access.
Disadvantages of IaaS
In IaaS, users have to secure their own data and applications.
Cloud computing is not accessible in some regions of the World.
2. PLATFORM AS A SERVICE (PAAS)
Platform as a Service (PaaS) is a type of cloud computing that helps developers to build
applications and services over the Internet by providing them with a platform.
PaaS helps in maintaining control over their business applications.
Advantages of PaaS
PaaS is simple and very much convenient for the user as it can be accessed via a web
browser.
PaaS has the capabilities to efficiently manage the lifecycle.
Disadvantages of PaaS
PaaS has limited control over infrastructure as they have less control over the
environment and are not able to make some customizations.
PaaS has a high dependence on the provider.
3. SOFTWARE AS A SERVICE (SAAS)
Software as a Service (SaaS) is a type of cloud computing model that is the work of
delivering services and applications over the Internet. The SaaS applications are called Web-
Based Software or Hosted Software.
SaaS accounts for around 60 percent of cloud solutions and, due to this, it is mostly preferred by companies.
Advantages of SaaS
SaaS can access app data from anywhere on the Internet.
SaaS provides easy access to features and services.
Disadvantages of SaaS
SaaS solutions have limited customization, which means they have some restrictions
within the platform.
SaaS has little control over the data of the user.
SaaS solutions are generally cloud-based, so they require a stable internet connection to work properly.
ARCHITECTURAL DESIGN CHALLENGES IN CLOUD STORAGE:
1. Scalability:
Challenge: Designing a storage architecture that can seamlessly scale to accommodate
varying workloads and growing data volumes.
Solution: Implementing a distributed and scalable architecture that allows for easy
addition of resources as demand increases.
2. Data Security:
Challenge: Ensuring the security of data stored in the cloud, including encryption, access
controls, and protection against unauthorized access.
Solution: Implementing robust encryption mechanisms, access controls, and regular
security audits to safeguard stored data.
3. Data Durability and Reliability:
Challenge: Designing a system that ensures data durability, availability, and reliability
even in the face of hardware failures or network issues.
Solution: Employing redundancy, replication, and error-checking mechanisms to enhance
data durability and reliability.
4. Data Migration:
Challenge: Facilitating seamless data migration between different storage tiers or
providers without causing disruptions.
Solution: Implementing efficient data migration tools and strategies, ensuring minimal
downtime during the transition.
5. Performance Optimization:
Challenge: Optimizing storage performance to meet the varying demands of applications
and workloads.
Solution: Implementing caching, load balancing, and performance monitoring tools to
optimize storage performance.
Benefits of cloud storage:
Cost Efficiency:
Organizations can avoid the upfront costs of purchasing and maintaining physical
hardware, paying only for the storage resources consumed.
Accessibility:
Data stored in the cloud can be accessed from anywhere with an internet connection,
facilitating remote access and collaboration.
Data Security:
Cloud storage services implement robust security measures, including encryption and
access controls, to protect stored data.
Amazon S3 (Simple Storage Service) is a widely used example of cloud object storage; some of its key features include:
Security Features:
S3 supports data encryption in transit and at rest, access control policies, and integration
with AWS Identity and Access Management (IAM) for fine-grained access control.
Versioning:
S3 supports versioning, allowing users to preserve, retrieve, and restore every version of
every object stored in a bucket.
Lifecycle Management:
Users can define lifecycle policies to automatically transition objects between storage
classes or delete them when they are no longer needed.
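As a hedged illustration, assuming AWS credentials are already configured and using the boto3 SDK, the sketch below uploads an object with server-side encryption and enables versioning on a bucket; the bucket and key names are placeholders.
import boto3  # AWS SDK for Python: pip install boto3

s3 = boto3.client("s3")
BUCKET = "example-course-bucket"  # placeholder bucket name

# Store an object with server-side encryption at rest (AES-256).
s3.put_object(
    Bucket=BUCKET,
    Key="notes/cloud-unit3.txt",
    Body=b"Cloud storage demo object",
    ServerSideEncryption="AES256",
)

# Turn on versioning so every overwrite preserves the previous version.
s3.put_bucket_versioning(
    Bucket=BUCKET,
    VersioningConfiguration={"Status": "Enabled"},
)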
In summary, Amazon S3 is a powerful and versatile cloud storage solution that addresses
many architectural challenges, providing a scalable, secure, and feature-rich platform for
storing and managing data in the cloud.
UNIT-4
RESOURCE MANAGEMENT AND SECURITY IN CLOUD
Cloud computing is an on-demand service provided by various organisations and companies. The main motive of cloud computing is to give users high-speed access, low-cost usage, cloud gaming, and access to high-specification computing. Organisations first established their computers and servers in server farms at different locations and then made them available to all users. In short, cloud computing is a term for providing the use of data centers to users. These organisations established various servers and cloud storage services, which helps them run cloud computing very smoothly. As cloud computing provides high-speed access, integrated and high PC specifications, low-cost usage, cloud gaming, storage services and hardware virtualization, it has been adopted all over the world. We can connect to an account created with a cloud computing organisation. These organisations have established different services for different uses, such as cloud computing for online browsing and gaming at high speed and low cost, and online data storage, which gives users online storage to store their data digitally. Various tech giant companies store their data in services like this. This is also helpful to users: if data is deleted from a PC or server, it can be recovered through such a service, which is a very useful additional benefit.
Cloud provisioning is the allocation of resources and services from a cloud provider to a customer. The growing catalog of cloud services that customers can provision includes infrastructure as a service, software as a service, and platform as a service, in public or private cloud environments. Provisioning is the process of configuring the IT infrastructure. It can also refer to the steps necessary to manage access to data and resources and make them available to users and systems. Once something has been provisioned, the next step is configuration.
Provisioning resources in the cloud is a difficult task that can be compromised by the unavailability of expected resources. Quality of Service (QoS) requirements for workloads arise from the need to provision appropriate resources for cloud workloads. Discovering the best workload-resource pairing based on the application requirements of cloud users is an optimization problem. An acceptable quality of service cannot be provided to cloud users unless the provisioning of resources is offered as a critical capability. Therefore, a resource provisioning technique based on QoS parameters is required for efficient resource provisioning. The research literature offers an in-depth analysis of cloud resource provisioning in general and the identification of cloud resources in particular, and existing work is generally classified into several groups in the area of cloud resource provisioning.
1. Static Resource Provisioning:- For applications that have predictable and generally unchanging demands/workloads, "static provisioning" can be used effectively. With advance provisioning, the customer contracts with the provider for services, and the provider prepares the appropriate resources in advance of the start of service. The customer is charged a flat fee or is billed on a monthly basis.
2. Dynamic Resource Provisioning:- In cases where demand from applications may change or vary, "dynamic provisioning" techniques have been suggested, whereby VMs may be migrated on-the-fly to new compute nodes within the cloud. With dynamic provisioning, the provider allocates more resources as they are needed and removes them when they are not, and the customer is billed on a pay-per-use basis. When dynamic provisioning is used to create a hybrid cloud, it is sometimes referred to as cloud bursting (a toy sketch of such an autoscaling loop appears after this list).
3. User Self-Provisioning:- With user self-provisioning (also known as cloud self-service), the customer purchases resources from the cloud provider through a web form, creating a customer account and paying for resources with a credit card. The provider's resources are available for customer use within hours, if not minutes.
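The dynamic provisioning model in point 2 above is essentially a feedback loop: measure demand, then grow or shrink the pool of allocated resources. The following minimal Python sketch illustrates only that idea; the helper functions it takes as arguments (get_average_cpu, add_vm, remove_vm) are hypothetical placeholders for whatever monitoring and provisioning API a real provider exposes, and the thresholds are arbitrary.

    # Toy autoscaling loop illustrating dynamic provisioning.
    TARGET_UTILISATION = 0.60          # aim to keep average CPU near 60%
    MIN_VMS, MAX_VMS = 2, 10           # never shrink below 2 or grow above 10 VMs

    def autoscale(get_average_cpu, add_vm, remove_vm, current_vms):
        """One provisioning decision: scale out under load, scale in when idle."""
        cpu = get_average_cpu()
        if cpu > TARGET_UTILISATION and current_vms < MAX_VMS:
            add_vm()                   # provider allocates one more VM (pay-per-use)
            current_vms += 1
        elif cpu < TARGET_UTILISATION / 2 and current_vms > MIN_VMS:
            remove_vm()                # release the idle VM so it is no longer billed
            current_vms -= 1
        return current_vms

    # A real controller would call autoscale() periodically, e.g. once a minute,
    # using the provider's monitoring API for get_average_cpu and its
    # provisioning API for add_vm / remove_vm.

Cloud bursting follows the same pattern, except that the scale-out step would provision capacity in a public cloud once the private data center is saturated.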
Need of Cloud Security:-Cloud computing and storage provides users with capabilities to store and
process their data in third-party data centers. Organizations use the cloud in a variety of different service
models (with acronyms such as SaaS, PaaS, and IaaS) and deployment models (private, public, hybrid,
and community). Security concerns associated with cloud computing fall into two broad categories: security
issues faced by cloud providers (organizations providing software-, platform-, or infrastructure-as-a-service
via the cloud) and security issues faced by their customers (companies or organizations who host
applications or store data on the cloud). The responsibility is shared, however. The provider must ensure that
their infrastructure is secure and that their clients’ data and applications are protected, while the user must take
measures to fortify their application and use strong passwords and authentication measures.
When an organization elects to store data or host applications on the public cloud, it loses its ability to have
physical access to the servers hosting its information. As a result, potentially sensitive data is at risk from
insider attacks. According to a recent Cloud Security Alliance report, insider attacks are the sixth biggest
threat in cloud computing. Therefore, cloud service providers must ensure that thorough background
checks are conducted for employees who have physical access to the servers in the data center.
Additionally, data centers must be frequently monitored for suspicious activity.
In order to conserve resources, cut costs, and maintain efficiency, cloud service providers often store more
than one customer's data on the same server. As a result, there is a chance that one user's private data can be
viewed by other users (possibly even competitors). To handle such sensitive situations, cloud service
providers should ensure proper data isolation and logical storage segregation.
The extensive use of virtualization in implementing cloud infrastructure brings unique security concerns for
customers or tenants of a public cloud service. Virtualization alters the relationship between the OS and
underlying hardware – be it computing, storage or even networking. This introduces an additional layer –
virtualization – that itself must be properly configured, managed and secured. Specific concerns include the
potential to compromise the virtualization software, or "hypervisor". While these concerns are largely
theoretical, they do exist. For example, a breach in the administrator workstation with the management
software of the virtualization software can cause the whole datacenter to go down or be reconfigured to an
attacker's liking.
Deterrent controls
These controls are intended to reduce attacks on a cloud system. Much like a warning sign on a fence or a
property, deterrent controls typically reduce the threat level by informing potential attackers that there will
be adverse consequences for them if they proceed. (Some consider them a subset of preventive controls.)
Preventive controls
Preventive controls strengthen the system against incidents, generally by reducing if not actually eliminating
vulnerabilities. Strong authentication of cloud users, for instance, makes it less likely that unauthorized
users can access cloud systems, and more likely that cloud users are positively identified.
Detective controls
Detective controls are intended to detect and react appropriately to any incidents that occur. In the event of
an attack, a detective control will signal the preventative or corrective controls to address the issue. System
and network security monitoring, including intrusion detection and prevention arrangements, are typically
employed to detect attacks on cloud systems and the supporting communications infrastructure.
Corrective controls
Corrective controls reduce the consequences of an incident, normally by limiting the damage. They come
into effect during or after an incident. Restoring system backups in order to rebuild a compromised system
is an example of a corrective control.
Relating to public and hybrid cloud environments, the loss of overall service visibility and the associated
lack of control can be a problem. Whether you’re dealing with public or hybrid cloud environments, a loss of
visibility in the cloud can mean a loss of control over several aspects of IT management and data security.
Where legacy-style in-house infrastructure was entirely under the control of the company, cloud services
delivered by third-party providers don’t offer the same level of granularity with regards to administration and
management.
When it comes to visualizing potential security vulnerabilities, this lack of visibility can lead to a business
failing to identify potential risks. In some sectors, such as media, cloud adoption is as low as 17%, which
has been blamed on this lack of visibility and control.
Despite the fact that generally speaking, enterprise-grade cloud services are more secure than legacy
architecture, there is still a potential cost in the form of data breaches and downtime. With public and
private cloud offerings, resolving these types of problems is in the hands of the third-party provider.
Consequently, the business has very little control over how long critical business systems may be offline, as
well as how well the breach is managed.
Vendor Lock-In
For companies that come to rely heavily on public and hybrid cloud platforms, there is a danger that they
become forced to continue with a specific third-party vendor simply to retain operational capacity. If critical
business applications are locked into a single vendor, it can be very difficult to make tactical decisions such
as moving to a new vendor. In effect, the vendor is being provided with the leverage it needs to force the
customer into an unfavorable contract.
Logicworks recently performed a survey which found that some 78% of IT decision makers blame the
fear of vendor lock-in as a primary reason for their organization failing to gain maximum value from cloud
computing.
Compliance Complexity
In sectors such as healthcare and finance, where legislative requirements with regard to storage of private
data are heavy, achieving full compliance whilst using public or private cloud offerings can be more
complex.
Many enterprises attempt to gain compliance by using a cloud vendor that is deemed fully compliant.
Indeed, data shows that some 51% of firms in the USA rely on nothing more than a statement of compliance
from their cloud vendor as confirmation that all legislative requirements have been met.
But what happens when at a later stage, it is found that the vendor is not actually fully compliant? The client
company could find itself facing non-compliance, with very little control over how the problem can be
resolved.
A Lack of Transparency
When a business buys in third-party cloud services as either a public or hybrid cloud offering, it is likely
they will not be provided with a full service description, detailing exactly how the platform works, and the
security processes the vendor operates.
This lack of service transparency makes it hard for customers to intelligently evaluate whether their data is
being stored and processed securely at all times. Surveys have shown that around 75% of IT managers are
only marginally confident that company data is being stored securely by their cloud vendor.
Cloud vendors provide their customers with a range of Application Programming Interfaces (APIs), which
the customer uses to manage the cloud service. Unfortunately, not every API is entirely secure. An API may have been deemed secure initially, only to be found insecure in some way at a later stage. This problem is compounded when the client company has built its own application layer on top of these APIs: the security vulnerability will then exist in the customer's own application, which could be an internal application or even a public-facing application potentially exposing private data.
Insufficient Due Diligence
For companies that lack the internal resources to fully evaluate the implications of cloud adoption, the
risk of deploying a platform that is ineffective and even insecure is real.
Responsibility for specific issues of data security needs to be fully defined before any deployment. Failing
to do so could lead to a situation where there is no clearly defined way to deal with potential risks and solve
current security vulnerabilities.
Shared Technology Vulnerabilities
Using public or hybrid cloud offerings can expose a business to security vulnerabilities caused by other
users of the same cloud infrastructure.
The onus is upon the cloud vendor to see that this does not happen, yet no vendor is perfect. It is always
possible that a security vulnerability caused by another user in the same cloud will affect every user.
Alongside the potential security vulnerabilities relating directly to the cloud service, there are also a number
of external threats which could cause an issue. Some of these are:
• Man in the Middle attacks – where a third party manages to become a relay of data between a source
and a destination. If this is achieved, the data being transmitted can be altered.
• Distributed Denial of Service – a DDoS attack attempts to knock a resource offline by flooding it with
too much traffic.
• Account or Service Traffic Hijacking – a successful attack of this kind could provide an intruder with
passwords or other access keys which allow them access to secure data.
There can be no doubt that cloud computing is a valuable technology for many businesses. However, as can
be seen from this short article, simply buying in cloud services is not a sure-fire way to eliminate data
security problems. The business still needs to take responsibility for monitoring its own data security
footprint and have processes in place to deal with any vulnerabilities that are discovered. Furthermore,
considerations such as vendor lock-in, service transparency, and visibility need to be fully evaluated before
making a commitment to a specific cloud vendor.
SaaS offerings are considered to have several advantages for business managers and to make life easier for users. However, the technology can still be considered relatively new. While a majority of companies use such services, there are still concerns, risks, and misconceptions about them, mainly because using SaaS often means no longer relying on an internal IT department for data storage, and that, in essence, can be a source of worry.
1. Data Access Risk
Because they are handing their information and data to a third party, numerous users are concerned about who gets access. The data may seem out of their control, and they fear its potential dissemination, deletion, or corruption by unauthorized people. It is a particularly major worry for users who plan on storing sensitive data that would be detrimental if it ended up in the hands of others, especially their competition.
However, every customer can review and discuss the policies and procedures that are implemented by the
SaaS provider. You can define the level of access and to whom you grant it. All providers are required to
include that condition in the Terms of Agreement, but make sure to check before signing so that you can
spare yourself later worries. In fact, be aware of the kind of privacy questions you should ask SaaS providers and do not hesitate to inform yourself well on the technical side of the matter.
2. Instability
Security and stability are the true pillars that hold up reliable SaaS software. These services are becoming increasingly popular, which is a double-edged sword. On one hand, it means more options for users and higher-quality services, because it forces every single provider to keep up with the competition. On the other hand, not everyone will be able to keep up with the growing market, and, in the end, the provider you employ might be shut down because it can no longer compete.
Data portability can be a hassle from that point on. It is a major concern because a shutdown would mean that all the time and money invested in the service goes down the drain, and unfortunately it is a risk you have to take; the situation can be unpredictable. What happens to all that data if the SaaS provider goes out of business? The change may not be as dramatic as a complete shutdown of the service, but you may still encounter changes in prices or security policy. To alleviate your worries, make sure you read the policy carefully regarding these issues before you are confronted with a potential data leak caused by protection services that are no longer active.
3. Lack of Transparency
SaaS providers are often secretive and assure their clients that they are better at keeping their data safe than anyone else out there. At the very least, they guarantee that they will be capable of securing information and files more proficiently than the customers themselves. However, not all users take their word at face value.
There are numerous concerns regarding the provider’s lack of transparency on how their entire security
protocol is being handled. Unfortunately, this is a matter up for debate.
This lack of transparency may cause distrust from their customers. Both the clients and industry analysts are
not getting answers to several security questions. It leaves them with empty spaces and speculations about
the service they are employing or reviewing. However, SaaS providers argue that the lack of transparency is
what keeps their services secure. Divulging information about data centers or operations might compromise
the security of their clients. The argument may appear reasonable for numerous users, but it still leaves
others with concerns.
4. Identity Theft
SaaS providers usually require payment by credit card, which can be made remotely. It is a quick and convenient method, but it does concern some users because of the potential risk it implies. There are numerous security protocols in place to prevent problems: identity management can sit within the company's LDAP directories, inside the firm's firewall, or on the SaaS provider's site, depending on the arrangement. The process is still in its infancy, however, and providers often do not have a better solution for identity management than the company's own firewall.
Identity theft then becomes a major concern that is often prevented only with the use of numerous security tools. That implies using additional software, and perhaps paying for services that guarantee the safety of your credit card information. It is an issue that stems from managing access, which is famously easy for SaaS, and from the fact that the strategy may change over time. That can often result in concerns, especially for first-time users who have not properly researched the provider before paying.
Most SaaS providers do not disclose where their data centers are, so customers do not know where their data is actually stored. At the same time, customers must also be aware of regulations such as the Federal Information Security Management Act (FISMA), which requires certain sensitive data to be kept within the country. That means you might or might not have access to your data if you are travelling outside the U.S., or you might have to rely on other arrangements.
Should you travel outside of the country, your SaaS provider, especially with cloud-based software, will
notify you that your information has been sent to another one of their centers (in Europe for example). That
means that your sensitive data is being transferred for your own convenience and access, but at the same
time, it leaves users to wonder where it is exactly. Some, such as Symantec, offer their services in over a dozen
countries, but it’s not a guarantee from every provider. You may not know where your valuable data is at a
given time.
Financial security is also an issue that may arise from your agreement with a SaaS provider. A good majority of them require payment upfront and for the long term, even if you are unsure how long you will need their service or whether something in their policy will change over time. The concern is one of investing in a potentially crucial part of the company that might not be up to par and may leave you dissatisfied as a customer. Some providers might even force you to pay a year ahead.
Once the payment is made, your funds have been taken, and you have the service at hand. However, that
does not provide all customers with security. The service will surely remain, as settled by contract, but the
quality and security might change. There are worries that users might end up with an application that no
longer updates itself, which can affect both its use and safety. If the encryption is not kept up to date, you
may open yourself to several security issues, and your data could be compromised. It’s a detail to be checked
before paying the provider.
7. Not Sure What You Agreed To
Every business is required to provide terms and conditions in which it explains, in exhaustive detail, the nuances of how its service works. However, not everyone bothers to read the lengthy document that is typically standard. Even more, not everyone is an IT aficionado with expertise in the jargon commonly used in that niche. That might leave customers agreeing to certain things they do not properly understand, and then, when problems arise, most are not quite sure what exactly they agreed to when signing.
The ideal situation would be to have someone familiar with the SaaS service check the Terms and
Conditions document in order to familiarize you with the basics and details. Or, have separate departments
read different sections that might affect their activity. That is the safest way to avoid later worries about what you signed up for and what awaits you in the case of issues.
8. How Your Data is Actually Secured
Customers should always know where and how their data is secured, but some explanations might not be
precisely understood. Not everyone knows and understands encryption protocols or what it all means.
Clients can be worried about certain technical aspects, such as how their data can be recovered or restored when issues occur. The very existence of restore capabilities naturally implies that there are servers out there storing your sensitive data and keeping it safe. But how safe?
SaaS providers have to make sure that their customers are well informed through their Privacy Policy about
how it all works. Even more, they should offer a standardized form on how they handle disaster recovery in
case their servers get shut down by an outage or natural phenomenon that might cause damage. Clients may,
unfortunately, have no guarantees that it will be possible, and it is certainly a worry that sensitive data may
be lost forever.
Along with concerns that the SaaS provider’s servers could shut down for good, there are risks and worries
regarding the fact that your data is not really in your control. The good side is that you don’t have to configure,
manage, maintain, or upgrade the software. The downside of that is that you essentially lose some control
over your data. For example, should something happen and your data is lost, you will have to contact the
service provider, wait for their answer no matter how long that takes, and only then learn what might have
happened.
It all depends on the level of customizability the provider offers which, again, may be limited. The SaaS
provider is in charge of the responsibilities concerning data storage. That may be a relief, but it’s also a loss of
control to a certain degree that opens users to worries and, in some cases, costs them a lot of time waiting for
answers when faced with issues.
10. The Service May Not Keep Up with Modern Security Standards
Plenty of providers boast of their security credentials and prove to their users that they have excellent
control over their data and security. However, most will speak of standards that are not up to date, and it
does say quite a lot about how mature the service really is. It offers the possibility that while the data may
be safe now, it might not be in a year or two when protocols have changed, policies have been updated, and
risks have heightened.
And, as mentioned above, most providers insist on a long-term investment in their SaaS software. You need
to make sure that your provider stays up to date with security measures in order to alleviate this particular worry. However, you may rest assured that many of them need to keep their software updated and their servers maintained; otherwise, they wouldn't be able to keep up with their competition.
SaaS is always an excellent option, but there are pitfalls to the practice that haven’t been fixed yet. It leaves several
users worried and possibly reluctant to continue with the subscription. However, they can all be eliminated
if you tread carefully, pay attention, and treat it with the utmost care.
Cloud Security Standard Guidance
As customers transition their applications and data to use cloud computing, it is critically important that the
level of security provided in the cloud environment be equal to or better than the security provided by their
non-cloud IT environment. Failure to ensure appropriate security protection could ultimately result in higher
costs and potential loss of business, thus eliminating any of the potential benefits of cloud computing. This
paper focuses primarily on information security requirements for public cloud deployment since this model
introduces the most challenging information security concerns for cloud service customers.
The CSCC (Cloud Standards Customer Council) white paper Security for Cloud Computing: 10 Steps to Ensure Success prescribes a series of ten steps that cloud service customers should take to evaluate and manage the security of their cloud environment, with the goal of mitigating risk and delivering an appropriate level of support.
The Open Commons Consortium (aka OCC - formerly the Open Cloud Consortium) is a 501(c)(3) non-
profit venture which provides cloud computing and data commons resources to support "scientific,
environmental, medical and health care research." OCC manages and operates resources including the Open
Science Data Cloud (aka OSDC), which is a multi-petabyte scientific data sharing resource. The consortium
is based in Chicago, Illinois, and is managed by the 501(c)3 Center for Computational Science.
• The Open Science Data Cloud - This is a working group that manages and operates the Open Science
Data Cloud (OSDC), which is a petabyte scale science cloud for researchers to manage, analyze and
share their data. Individual researchers may apply for accounts to analyze data hosted by the OSDC.
Research projects with TB-scale datasets are encouraged to join the OSDC and contribute towards its
infrastructure.
• Project Matsu - Project Matsu is a collaboration between the NASA Goddard Space Flight Center and the Open Commons Consortium to develop open source technology for cloud-based processing of satellite imagery to support the earth science research community as well as human-assisted disaster relief.
• The Open Cloud Testbed - This working group manages and operates the Open Cloud Testbed. The
Open Cloud Testbed (OCT) is a geographically distributed cloud testbed spanning four data centers and
connected with 10G and 100G network connections. The OCT is used to develop new cloud computing
software and infrastructure.
• The Biomedical Data Commons - The Biomedical Data Commons (BDC) is a cloud-based infrastructure that provides secure, compliant cloud services for managing and analyzing genomic data, electronic medical records (EMR), medical images, and other PHI data. It provides resources to researchers so that they can more easily make discoveries from large, complex, controlled-access datasets. The BDC provides resources to those institutions in the BDC Working Group. It is an example of what is sometimes called the condominium model of sharing research infrastructure, in which the research infrastructure is operated by a consortium of educational and research organizations and provides resources to that consortium.
• NOAA Data Alliance Working Group - The OCC National Oceanographic and Atmospheric
Administration (NOAA) Data Alliance Working Group supports and manages the NOAA data
commons and the surrounding community interested in the open redistribution of NOAA datasets.
In 2015, the OCC was accepted into the Matter healthcare community at Chicago's historic Merchandise Mart. Matter is a community of healthcare entrepreneurs and industry leaders working together in a shared space to individually and collectively fuel the future of healthcare innovation.
In 2015, the OCC announced a collaboration with the National Oceanic and Atmospheric Administration
(NOAA) to help release their vast stores of environmental data to the general public. This effort is managed
by the OCC's NOAA data alliance working group.
Responsibilities of OCC
• Manages storage and data commons infrastructure, such as the Open Science Data Cloud Public Data
Commons, the Environmental Data Commons, the BloodPAC Commons, the Biomedical Data
Commons, and various contract-specific commons.
• Provides a governance framework to align various stakeholders in the success of a Data Commons.
• Provides index, metadata, transfer and other value services supporting data commons activities.
• Manages cloud computing infrastructure such as the Open Science Data Cloud, to support scientific,
environmental, medical and health care research.
• Manages cloud computing testbeds to improve cloud computing software and services.
• Develops reference implementations, benchmarks and standards, such as the MalStone Benchmark, to
improve the state of the art of cloud computing.
• Sponsors workshops and other events related to cloud computing and data commons to educate the
community.
Standards for Messaging
• RSS
o The acronym “Really Simple Syndication” or “Rich Site Summary”
o Used to publish frequently updated works—such as news headlines
o RSS is a family of web feed formats
• Atom & Atom Publishing Protocol
o The Atom format was developed as an alternative to RSS
• HTTP
o The acronym “Hypertext Transfer Protocol”
o HTTP is a request/response standard between a client and a server
o For distributed, collaborative, hypermedia information systems
• XMPP
o The acronym “Extensible Messaging and Presence Protocol”
o Used for near-real-time, extensible instant messaging and presence information
o XMPP remains the core protocol of the Jabber Instant Messaging and Presence technology
• SIMPLE
o Session Initiation Protocol for Instant Messaging and Presence Leveraging Extensions
o For registering for presence information and receiving notifications
o It is also used for sending short messages and managing a session of realtime messages
between two or more participants
Standards for Security
• SAML
o Standard for communicating authentication, authorization, and attribute information among
online partners
o It allows businesses to securely send assertions between partners
o SAML protocol refers to what is transmitted, not how it is transmitted
o Three types of statements are provided by SAML: authentication statements, attribute
statements, and authorization decision statements
• OAuth (Open Authorization)
o OAuth is a method for publishing and interacting with protected data
o For developers, OAuth provides users access to their data
o OAuth allows users to grant access to their data
o OAuth by itself provides no privacy at all and depends on other protocols such as SSL
• OpenID
o OpenID is an open, decentralized standard for user authentication
o And allows users to log on to many services using the same digital identity
o It is a single-sign-on (SSO) method of access control
• SSL / TLS
o TLS or its predecessor SSL
o To provide security and data integrity for communications
o To prevent eavesdropping, tampering, and message forgery
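To make the relationship between OAuth and SSL/TLS in the list above more concrete, the following Python sketch (using the widely available requests library) obtains a bearer token and then calls a protected API over HTTPS. The endpoints, client credentials and scope are hypothetical placeholders, and a real provider's token flow may differ; the point is simply that the OAuth exchange itself relies on TLS for confidentiality and integrity.

    import requests

    TOKEN_URL = "https://auth.example.com/oauth2/token"   # hypothetical endpoints
    API_URL = "https://api.example.com/v1/profile"

    # 1. OAuth 2.0 client-credentials style token request. The https:// scheme
    #    means the exchange is protected by TLS; OAuth alone provides no privacy.
    token_resp = requests.post(
        TOKEN_URL,
        data={"grant_type": "client_credentials", "scope": "profile.read"},
        auth=("my-client-id", "my-client-secret"),   # placeholder credentials
        timeout=10,
    )
    access_token = token_resp.json()["access_token"]

    # 2. Call the protected resource, presenting the bearer token. verify=True
    #    (the default) makes requests validate the server certificate, guarding
    #    against eavesdropping, tampering and message forgery.
    resp = requests.get(
        API_URL,
        headers={"Authorization": "Bearer " + access_token},
        timeout=10,
        verify=True,
    )
    print(resp.status_code)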
Virtual Machine Security:- Security is a problem. Network security is an even bigger problem
because of the complex factors that define risks and the profound negative effects that can occur if you fail.
Virtual network security is the worst problem of all because it combines issues generated by traditional hosting
and application security with those from network security, and then adds the challenges of virtual resources and
services. It's no wonder we're only now starting to recognize the problems of cloud-virtual networking. And we're
a long way from solving them.
Isolate the new hosted elements:- Step one in securing virtual machines in cloud computing is to isolate the newly hosted elements. For example, let's say three features hosted inside an edge device could be deployed in
the cloud either as part of the service data plane, with addresses visible to network users, or as part of a private
sub network that's invisible. If you deploy in the cloud, then any of the features can be attacked, and it's also
possible your hosting and management processes will become visible and vulnerable. If you isolate your hosting
and feature connections inside a private sub network, they're protected from outside access.
In container hosting today, both in the data center and in the cloud, application components deploy inside a
private sub network. As a result, only the addresses representing APIs that users are supposed to access are
exposed.
Certify virtual features and functions for security compliance:- Step two in cloud-virtual security is to certify
virtual features and functions for security compliance before you allow them to be deployed. Outside attacks are
a real risk in virtual networking, but an insider attack is a disaster. If a feature with a back-door security fault is
introduced into a service, it becomes part of the service infrastructure and is far more likely to possess open
attack vectors to other infrastructure elements. Private sub networks can help in addressing virtual machine
security in cloud computing. If new components can only access other components in the same service instance,
the risk is reduced that malware can be introduced in a new software-hosted feature. Yes, a back-door attack
could put the service itself at risk, but it's less likely the malware will spread to other services and customers.
Separate infrastructure management and orchestration from the service:- Step three is to separate
infrastructure management and orchestration from the service. Management APIs will always represent a major
risk because they're designed to control features, functions and service behavior. It's important to protect all such
APIs, but it's critical to protect the APIs that oversee infrastructure elements that should never be accessed by
service users. The most important area to examine to ensure virtual machine security in cloud computing is the
virtual-function-specific piece of the Virtual Network Functions Manager (VNFM) structure mandated by the
ETSI NFV (Network Functions Virtualization) Industry Specification Group. This code is provided by the
supplier of the VNF, and it is likely to require access to APIs that represent infrastructure elements as well as
orchestration or deployment tools. Nothing more than bad design is needed for these elements to open a gateway
to infrastructure management APIs that could affect the security and stability of features used in virtual-cloud
services.
Ensure that virtual network connections don't cross over between tenants or services:- The fourth and final
point in cloud-virtual network security is to ensure that virtual network connections don't cross over between
tenants or services. Virtual networking is a wonderful way of creating agile connections to redeployed or scaled
features, but each time a virtual network change is made, it's possible it can establish an inadvertent connection
between two different services, tenants or feature/function deployments. This can produce a data plane leak, a
connection between the actual user networks or a management or control leak that could allow one user to
influence the service of another.
IAM Security Standards- Identity and access management (IAM) in enterprise IT is about defining
and managing the roles and access privileges of individual network users and the circumstances in which users
are granted (or denied) those privileges. Those users might be customers (customer identity management) or
employees (employee identity management). The core objective of IAM systems is one digital identity per
individual. Once that digital identity has been established, it must be maintained, modified and monitored
throughout each user’s “access lifecycle.” Identity and access management is a critical part of any enterprise
security plan, as it is inextricably linked to the security and productivity of organizations in today’s digitally
enabled economy. Compromised user credentials often serve as an entry point into an organization’s network
and its information assets. Enterprises use identity management to safeguard their information assets against the
rising threats of ransomware, criminal hacking, phishing and other malware attacks. Global ransomware damage
costs alone are expected to exceed $5 billion this year, up 15 percent from 2016, Cybersecurity Ventures
predicted.
➢ Single Access Control Interface. Cloud IAM solutions provide a clean and consistent access control
interface for all cloud platform services. The same interface can be used for all cloud services.
➢ Enhanced Security. You can define increased security for critical applications.
➢ Resource-level Access Control. You can define roles and grant permissions to users to access resources
at different granularity levels.
Identity and Access Management technology can be used to initiate, capture, record, and manage user identities
and their access permissions. All users are authenticated, authorized, and evaluated according to policies and
roles. Poorly controlled IAM processes may lead to regulatory non-compliance; if the organization is audited,
management may not be able to prove that company data is not at risk of being misused.
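The idea that every user is authenticated and then authorized according to policies and roles can be illustrated with a very small Python sketch. This is a toy model written for this course file, not any particular vendor's IAM product; the role names, users and resource prefixes are invented for illustration.

    # Toy role-based authorization: users map to roles, roles map to
    # (action, resource-prefix) permissions, and access is granted only when
    # some role of the user permits the requested action on that resource.
    ROLE_PERMISSIONS = {
        "storage-viewer": {("read", "buckets/")},
        "storage-admin": {("read", "buckets/"), ("write", "buckets/"), ("delete", "buckets/")},
    }

    USER_ROLES = {
        "alice": ["storage-admin"],
        "bob": ["storage-viewer"],
    }

    def is_allowed(user, action, resource):
        """Resource-level access check driven entirely by policy data."""
        for role in USER_ROLES.get(user, []):
            for allowed_action, prefix in ROLE_PERMISSIONS.get(role, set()):
                if action == allowed_action and resource.startswith(prefix):
                    return True
        return False

    print(is_allowed("bob", "read", "buckets/reports/q1.csv"))    # True
    print(is_allowed("bob", "delete", "buckets/reports/q1.csv"))  # False

Because every decision flows from the policy tables rather than from code scattered across applications, an audit can demonstrate exactly who could perform which action on which resources, which is part of the compliance benefit described below.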
Benefits of IAM
IAM technologies can be used to initiate, capture, record and manage user identities and their related access
permissions in an automated manner. This brings an organization the following IAM benefits:
➢ Access privileges are granted according to policy, and all individuals and services are properly
authenticated, authorized and audited.
➢ Companies that properly manage identities have greater control of user access, which reduces the risk of
internal and external data breaches.
➢ Automating IAM systems allows businesses to operate more efficiently by decreasing the effort, time and
money that would be required to manually manage access to their networks.
➢ In terms of security, the use of an IAM framework can make it easier to enforce policies around user
authentication, validation and privileges, and address issues regarding privilege creep.
➢ IAM systems help companies better comply with government regulations by allowing them to show
corporate information is not being misused. Companies can also demonstrate that any data needed for
auditing can be made available on demand.
✓ Unique passwords. The most common type of digital authentication is the unique password. To make
passwords more secure, some organizations require longer or more complex passwords that use a combination of letters, symbols and numbers. Unless users can automatically gather their collection of
passwords behind a single sign-on entry point, they typically find remembering unique passwords
onerous.
✓ Pre-shared key (PSK). PSK is another type of digital authentication where the password is shared among
users authorized to access the same resources -- think of a branch office Wi-Fi password. This type of
authentication is less secure than individual passwords.
✓ Behavioral authentication. When dealing with highly sensitive information and systems, organizations
can use behavioral authentication to get far more granular and analyze keystroke dynamics or mouse-use
characteristics. By applying artificial intelligence, a trend in IAM systems, organizations can quickly
recognize if user or machine behavior falls outside of the norm and can automatically lock down systems.
✓ Biometrics. Modern IAM systems use biometrics for more precise authentication. For instance, they
collect a range of biometric characteristics, including fingerprints, irises, faces, palms, gaits, voices and,
in some cases, DNA. Biometrics and behavior-based analytics have been found to be more effective than
passwords. When collecting and using biometric characteristics, companies must consider the ethical and privacy implications of doing so.
Apache Hadoop
Apache Hadoop is an open-source software framework for storage and large-scale processing of
data-sets on clusters of commodity hardware. Hadoop is an Apache top-level project being built
and used by a global community of contributors and users. It is licensed under the Apache
License 2.0.
The Apache Hadoop framework is composed of the following modules:
Hadoop Common – contains libraries and utilities needed by other Hadoop modules.
Hadoop Distributed File System (HDFS) – a distributed file system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster.
Hadoop YARN – a resource-management platform responsible for managing compute resources in clusters and using them for scheduling users' applications.
Hadoop MapReduce – a programming model for large-scale data processing.
All the modules in Hadoop are designed with a fundamental assumption that hardware failures
(of individual machines, or racks of machines) are common and thus should be automatically
handled in software by the framework. Apache Hadoop's MapReduce and HDFS components
originally derived respectively from Google's MapReduce and Google File System (GFS)
papers. Beyond HDFS, YARN and MapReduce, the entire Apache Hadoop "platform" is now
commonly considered to consist of a number of related projects as well – Apache Pig, Apache
Hive, Apache HBase, Apache Spark, and others.
For end-users, though MapReduce Java code is common, any programming language can be used with "Hadoop Streaming" to implement the "map" and "reduce" parts of the user's program. Related projects such as Apache Pig, Apache Hive, and Apache Spark expose higher-level user interfaces, for example Pig Latin and SQL variants. The Hadoop framework itself is mostly written in the Java programming language, with some native code in C and command-line utilities written as shell scripts.
Apache Hadoop is a registered trademark of the Apache Software Foundation.
Hadoop on the Cloud
Most of the major Apache Hadoop distributions now offer their version of the Hadoop platform
in the cloud, uniting two of the biggest topics in the tech world today: big data and cloud computing. The cloud offers several advantages for businesses looking to use Hadoop, so all
businesses – including small and medium-sized ones – can truly start to take advantage of big
data.
1. Flexibility and Scalability
Bringing Hadoop to the cloud offers businesses the flexibility to use Hadoop according to their needs. One can scale up or scale down as often as one needs. Most cloud providers allow users to
sign up and get started within a matter of minutes, making it ideal for a small business with
limited resources and for large businesses that need to scale beyond their current solution without
making a long-term investment.
2. Cost-Effective
Despite the fact that the open source version of Hadoop is free, the claim that Hadoop is cheap is
a myth. While Hadoop is a much more cost-effective solution for storing large volumes of data,
it can still require major investments in hardware, development, maintenance and expertise. This
level of investment may be worth it for companies that plan on using Hadoop regularly, but for
those who only need to run occasional analytics or simply don’t have the budget for the upfront
investment, the cost of Hadoop can be a real issue. The cloud offers a cost-effective solution for
Hadoop. Most cloud providers charge on a pay-per-use basis so businesses can pay for the
storage or analytics they need without making that upfront investment or paying for maintaining
a system when it is not being used.
3. Real-Time Analytics
Big data analytics has a lot to offer. From a more complete view of the consumer to more
efficient manufacturing and improved product innovation, valuable insights lie within the mass
stores of multi-structured data being created. Unfortunately, the generic open source version of
Hadoop cannot currently handle real time analysis, meaning businesses that want to analyze and
adjust campaigns or products quickly can’t currently do so. However, MapR’s enterprise-ready
distribution of Hadoop, which re-architected the data platform and applied other architectural
innovations, can support real-time analytics. With this particular Hadoop distribution available
on the cloud, businesses can have instant access to their data for real-time processing and
analysis. These are just three advantages that Hadoop in the cloud has to offer. Depending on a
business’s needs the flexibility of the cloud allows it to adapt to various situations and solve
problems that may not even be listed here. Before dismissing Hadoop as beyond your reach, look
into what a cloud solution has to offer.
MapReduce
MapReduce is a processing technique and a programming model for distributed computing based on Java. The MapReduce algorithm contains two important tasks, namely Map and Reduce. Map
takes a set of data and converts it into another set of data, where individual elements are broken
down into tuples (key/value pairs). Secondly, reduce task, which takes the output from a map as
an input and combines those data tuples into a smaller set of tuples. As the sequence of the
name MapReduce implies, the reduce task is always performed after the map job.
The major advantage of MapReduce is that it is easy to scale data processing over multiple
computing nodes. Under the MapReduce model, the data processing primitives are called
mappers and reducers. Decomposing a data processing application into mappers and reducers is
sometimes nontrivial. But, once we write an application in the MapReduce form, scaling the
application to run over hundreds, thousands, or even tens of thousands of machines in a cluster
is merely a configuration change. This simple scalability is what has attracted many
programmers to use the MapReduce model.
The Algorithm
• Generally, the MapReduce paradigm is based on sending the computation to where the data resides.
• A MapReduce program executes in three stages, namely the map stage, the shuffle stage, and the reduce stage.
o Map stage − The map or mapper's job is to process the input data. Generally the input data is in the form of a file or directory and is stored in the Hadoop Distributed File System (HDFS). The input file is passed to the mapper function line by line. The mapper processes the data and creates several small chunks of data.
o Reduce stage − This stage is the combination of the Shuffle stage and the Reduce stage. The Reducer's job is to process the data that comes from the mapper. After processing, it produces a new set of output, which is stored in HDFS.
• During a MapReduce job, Hadoop sends the Map and Reduce tasks to the appropriate servers in the cluster.
• The framework manages all the details of data passing, such as issuing tasks, verifying task completion, and copying data around the cluster between the nodes.
• Most of the computing takes place on nodes with the data on local disks, which reduces network traffic.
• After completion of the given tasks, the cluster collects and reduces the data to form an appropriate result and sends it back to the Hadoop server.
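Since any language can be used with Hadoop Streaming, the classic word-count job can be sketched as two small Python scripts, a mapper and a reducer, that read from standard input and write tab-separated key/value pairs to standard output. This is a minimal illustrative sketch; the script names and paths are arbitrary.

    # mapper.py - emits one (word, 1) pair per word read from standard input.
    import sys

    for line in sys.stdin:
        for word in line.strip().split():
            print(word + "\t1")

    # reducer.py - Hadoop Streaming delivers mapper output sorted by key, so
    # identical words arrive together and can be summed with a running counter.
    import sys

    current_word, current_count = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t", 1)
        if word == current_word:
            current_count += int(count)
        else:
            if current_word is not None:
                print(current_word + "\t" + str(current_count))
            current_word, current_count = word, int(count)
    if current_word is not None:
        print(current_word + "\t" + str(current_count))

The two scripts would typically be submitted with the Hadoop Streaming jar, roughly: hadoop jar hadoop-streaming.jar -input /data/in -output /data/out -mapper mapper.py -reducer reducer.py (the paths here are illustrative); the framework then handles task distribution, the shuffle, and collecting the final output in HDFS.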
Virtual Box
VirtualBox (or VB) is a hypervisor for x86 computers from Oracle Corporation. It was first developed by Innotek GmbH and released in 2007 as an open-source software package. The company was later acquired by Sun Microsystems in 2008, and Oracle has continued the development of VirtualBox since 2010 under the product name Oracle VM VirtualBox. VirtualBox comes in different flavours depending on the host operating system for which it is configured; VirtualBox on Ubuntu is widely preferred, and VirtualBox for Windows is equally popular. With the advent of Android phones, VirtualBox for Android is becoming the new face of VMs on smartphones.
VirtualBox gets a lot of support, primarily because it is free and open source. It also allows unlimited snapshots – a feature only available in VMware's paid Pro edition. VMware, on the other hand, is great for drag-and-drop functionality between the host and the VM, but many of its features come only in the paid version.
Google App Engine is a web application hosting service. By “web application,” we mean an
application or service accessed over the Web, usually with a web browser: storefronts with
shopping carts, social networking sites, multiplayer games, mobile applications, survey
applications, project management, collaboration, publishing, and all the other things we’re
discovering are good uses for the Web. App Engine can serve traditional website content too,
such as documents and images, but the environment is especially designed for real-time dynamic
applications. Of course, a web browser is merely one kind of client: web application
infrastructure is well suited to mobile applications, as well.
In particular, Google App Engine is designed to host applications with many simultaneous users.
When an application can serve many simultaneous users without degrading performance, we say
it scales. Applications written for App Engine scale automatically. As more people use the
application, App Engine allocates more resources for the application and manages the use of
those resources. The application itself does not need to know anything about the resources it is
using.
Unlike traditional web hosting or self-managed servers, with Google App Engine, you only pay
for the resources you use. Billed resources include CPU usage, storage per month, incoming and
outgoing bandwidth, and several resources specific to App Engine services. To help you get
started, every developer gets a certain amount of resources for free, enough for small
applications with low traffic.
App Engine is part of Google Cloud Platform, a suite of services for running scalable
applications, performing large amounts of computational work, and storing, using, and analyzing
large amounts of data. The features of the platform work together to host applications efficiently
and effectively, at minimal cost. App Engine’s specific role on the platform is to host web
applications and scale them automatically. App Engine apps use the other services of the
platform as needed, especially for data storage.
An App Engine web application can be described as having three major parts: application
instances, scalable data storage, and scalable services. In this chapter, we look at each of these
parts at a high level. We also discuss features of App Engine for deploying and managing web
applications, and for building websites integrated with other parts of Google Cloud Platform.
An App Engine application responds to web requests. A web request begins when a client,
typically a user’s web browser, contacts the application with an HTTP request, such as to fetch a
web page at a URL. When App Engine receives the request, it identifies the application from the
domain name of the address, either a custom domain name you have registered and configured
for use with the app, or an .appspot.com subdomain provided for free with every app. App
Engine selects a server from many possible servers to handle the request, making its selection
based on which server is most likely to provide a fast response. It then calls the application with
the content of the HTTP request, receives the response data from the application, and returns the
response to the client.
From the application’s perspective, the runtime environment springs into existence when the
request handler begins, and disappears when it ends. App Engine provides several methods for
storing data that persists between requests, but these mechanisms live outside of the runtime
environment. By not retaining state in the runtime environment between requests—or at least, by
not expecting that state will be retained between requests—App Engine can distribute traffic
among as many servers as it needs to give every request the same treatment, regardless of how
much traffic it is handling at one time.
In the complete picture, App Engine allows runtime environments to outlive request handlers,
and will reuse environments as much as possible to avoid unnecessary initialization. Each
instance of your application has local memory for caching imported code and initialized data
structures. App Engine creates and destroys instances as needed to accommodate your app’s
traffic. If you enable the multithreading feature, a single instance can handle multiple requests
concurrently, further utilizing its resources.
Application code cannot access the server on which it is running in the traditional sense. An
application can read its own files from the filesystem, but it cannot write to files, and it cannot
read files that belong to other applications. An application can see environment variables set by
App Engine, but manipulations of these variables do not necessarily persist between requests. An
application cannot access the networking facilities of the server hardware, although it can
perform networking operations by using services.
In short, each request lives in its own “sandbox.” This allows App Engine to handle a request
with the server that would, in its estimation, provide the fastest response. For web requests to the
app, there is no way to guarantee that the same app instance will handle two requests, even if the
requests come from the same client and arrive relatively quickly.
Sandboxing also allows App Engine to run multiple applications on the same server without the
behavior of one application affecting another. In addition to limiting access to the operating
system, the runtime environment also limits the amount of clock time and memory a single
request can take. App Engine keeps these limits flexible, and applies stricter limits to
applications that use up more resources to protect shared resources from “runaway” applications.
A request handler has up to 60 seconds to return a response to the client. While that may seem
like a comfortably large amount for a web app, App Engine is optimized for applications that
respond in less than a second. Also, if an application uses many CPU cycles, App Engine may
slow it down so the app isn’t hogging the processor on a machine serving multiple apps. A CPU-
intensive request handler may take more clock time to complete than it would if it had exclusive
use of the processor, and clock time may vary as App Engine detects patterns in CPU usage and
allocates accordingly.
Google App Engine provides four possible runtime environments for applications, one for each
of four programming languages: Java, Python, PHP, and Go. The environment you choose
depends on the language and related technologies you want to use for developing the application.
The Python environment runs apps written in the Python 2.7 programming language, using a
custom version of CPython, the official Python interpreter. App Engine invokes a Python app
using WSGI, a widely supported application interface standard. An application can use most of
Python’s large and excellent standard library, as well as rich APIs and libraries for accessing
services and modeling data. Many open source Python web application frameworks work with
App Engine, such as Django, web2py, Pyramid, and Flask. App Engine even includes a
lightweight framework of its own, called webapp.
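To make the WSGI contract mentioned above concrete, the following is a minimal, generic Python sketch of the kind of stateless request handler an App Engine runtime can invoke. It is not tied to a specific App Engine release or framework; the handler keeps no state between requests, which matches the sandbox model described earlier.

    # A minimal WSGI application: a callable that receives the request
    # environment and a start_response function, and returns the response body.
    def application(environ, start_response):
        path = environ.get("PATH_INFO", "/")
        body = ("Hello from a stateless handler, you requested %s\n" % path).encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    # For local experimentation the same callable can be served with the
    # standard library's reference WSGI server.
    if __name__ == "__main__":
        from wsgiref.simple_server import make_server
        make_server("localhost", 8080, application).serve_forever()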
Similarly, the Java, PHP, and Go runtime environments offer standard execution environments
for those languages, with support for standard libraries and third-party frameworks.
All four runtime environments use the same application server model: a request is routed to an
app server, an application instance is initialized (if necessary), application code is invoked to
handle the request and produce a response, and the response is returned to the client. Each
environment runs application code within sandbox restrictions, such that any attempt to use a
feature of the language or a library that would require access outside of the sandbox returns an
error.
(Another Source)
Google App Engine
Google AppEngine is a PaaS implementation that provides services for developing and hosting
scalable Web applications. AppEngine is essentially a distributed and scalable runtime
environment that leverages Google’s distributed infrastructure to scale out applications facing a
large number of requests by allocating more computing resources to them and balancing the load
among them. The runtime is completed by a collection of services that allow developers to
design and implement applications that naturally scale on AppEngine.
Developers can develop applications in Java, Python, and Go, a new programming language
developed by Google to simplify the development of Web applications. Application usage of
Google resources and services is metered by AppEngine, which bills users when their
applications exhaust their free quotas.
Infrastructure
AppEngine hosts Web applications, and its primary function is to serve users' requests efficiently. To do so, AppEngine's infrastructure takes advantage of many servers available within Google datacenters. For each HTTP request, AppEngine locates the servers hosting the application that processes the request, evaluates their load, and, if necessary, allocates additional resources (i.e., servers) or redirects the request to an existing server. The particular design of applications, which does not expect any state information to be implicitly maintained between requests to the same application, simplifies the work of the infrastructure, which can redirect each of the requests to any of the servers hosting the target application or even allocate a new one. The infrastructure is also responsible for monitoring application performance and collecting statistics on which the billing is calculated.
OpenStack
OpenStack is a free open standard cloud computing platform, mostly deployed as infrastructure-
as-a-service (IaaS) in both public and private clouds where virtual servers and other resources are
made available to users. The software platform consists of interrelated components that control
diverse, multi-vendor hardware pools of processing, storage, and networking resources
throughout a data center. Users either manage it through a web-based dashboard, through
command-line tools, or through RESTful web services.
OpenStack has a modular architecture with various code names for its components.
Compute (Nova)
Nova is the OpenStack project that provides a way to provision compute instances (aka virtual
servers). Nova supports creating virtual machines, baremetal servers (through the use of ironic),
and has limited support for system containers. Nova runs as a set of daemons on top of existing
Linux servers to provide that service. Nova is written in Python. It uses many external Python
libraries such as Eventlet (concurrent networking library), Kombu (AMQP messaging
framework), and SQLAlchemy (SQL toolkit and Object Relational Mapper). Nova is designed to
be horizontally scalable. Rather than switching to larger servers, you procure more servers and
simply install identically configured services. Because OpenStack is widely integrated into enterprise-level infrastructures, monitoring OpenStack performance in general, and Nova performance in particular, at scale has become an increasingly important issue. Monitoring end-to-end performance requires tracking metrics from Nova, Keystone, Neutron, Cinder, Swift and other services, in addition to monitoring RabbitMQ, which is used by OpenStack services for message passing. All of these services generate their own log files, which should also be monitored, especially in enterprise-level infrastructures.
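To make the provisioning workflow concrete, the sketch below drives Nova through the openstacksdk Python client; the clouds.yaml entry "mycloud" and the image, flavor and network names are assumed placeholders rather than part of any standard deployment.

# Sketch: booting a Nova instance with the openstacksdk client.
# "mycloud", the image, flavor and network names are placeholders.
import openstack

conn = openstack.connect(cloud='mycloud')   # reads credentials from clouds.yaml

image = conn.compute.find_image('cirros-0.6.2')
flavor = conn.compute.find_flavor('m1.tiny')
network = conn.network.find_network('private')

server = conn.compute.create_server(
    name='demo-instance',
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{'uuid': network.id}],
)
# Poll Nova until the instance reaches the ACTIVE state.
server = conn.compute.wait_for_server(server)
print(server.name, server.status)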
Networking (Neutron)
Neutron is an OpenStack project to provide "network connectivity as a service" between interface devices (e.g., vNICs) managed by other OpenStack services (e.g., Nova). It implements
the OpenStack Networking API. It manages all networking facets for the Virtual Networking
Infrastructure (VNI) and the access layer aspects of the Physical Networking Infrastructure (PNI)
in the OpenStack environment. OpenStack Networking enables projects to create advanced
virtual network topologies which may include services such as a firewall, and a virtual private
network (VPN). Neutron allows dedicated static IP addresses or DHCP. It also allows Floating
IP addresses to let traffic be dynamically rerouted. Users can use software-defined networking
(SDN) technologies like OpenFlow to support multi-tenancy and scale. OpenStack networking
can deploy and manage additional network services—such as intrusion detection systems (IDS),
load balancing, firewalls, and virtual private networks (VPN).
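As a sketch of the Networking API in use, the snippet below creates a tenant network and subnet with openstacksdk; the names and the CIDR are placeholders.

# Sketch: creating a tenant network and subnet through Neutron.
# Names and the CIDR are placeholders.
import openstack

conn = openstack.connect(cloud='mycloud')

net = conn.network.create_network(name='demo-net')
subnet = conn.network.create_subnet(
    network_id=net.id,
    name='demo-subnet',
    ip_version=4,
    cidr='192.168.42.0/24',
)
print(net.id, subnet.cidr)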
Block Storage (Cinder)
Cinder is the OpenStack Block Storage service that provides volumes to Nova virtual machines, Ironic bare-metal hosts, containers and more. Cinder volumes provide persistent storage to guest virtual machines (known as instances) that are managed by OpenStack Compute. Cinder can also be used independently of other OpenStack services as stand-alone software-defined storage. The block storage system manages the creation, replication, snapshot management, attachment and detachment of block devices to servers.
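A sketch of that create-and-attach workflow with openstacksdk follows; the server name is a placeholder, and the volume-attachment keyword may differ slightly between SDK releases.

# Sketch: creating a 1 GiB Cinder volume and attaching it to an
# existing instance. The server name is a placeholder, and the
# create_volume_attachment keyword may vary between SDK versions.
import openstack

conn = openstack.connect(cloud='mycloud')

volume = conn.block_storage.create_volume(size=1, name='demo-volume')
conn.block_storage.wait_for_status(volume, status='available')

server = conn.compute.find_server('demo-instance')
conn.compute.create_volume_attachment(server, volume_id=volume.id)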
Identity (Keystone)
Keystone is an OpenStack service that provides API client authentication, service discovery, and
distributed multi-tenant authorization by implementing OpenStack's Identity API. It is the
common authentication system across the cloud operating system. Keystone can integrate with
directory services like LDAP. It supports standard username and password credentials, token-
based systems and AWS-style (i.e. Amazon Web Services) logins. The OpenStack keystone
service catalog allows API clients to dynamically discover and navigate to cloud services.
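A sketch of authenticating against Keystone with explicit credentials is shown below; every value is a placeholder, and listing the service catalog entries typically requires an admin role.

# Sketch: authenticating with Keystone and inspecting the service
# catalog. All credential values are placeholders.
import openstack

conn = openstack.connect(
    auth_url='http://controller:5000/v3',
    project_name='demo',
    username='demo',
    password='secret',
    user_domain_name='Default',
    project_domain_name='Default',
)

token = conn.authorize()          # obtain a Keystone token
print('token issued:', bool(token))

# Walking the registered services usually requires admin credentials.
for service in conn.identity.services():
    print(service.name, service.type)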
Image (Glance)
The Image service (glance) project provides a service where users can upload and discover data
assets that are meant to be used with other services. This currently includes images and metadata
definitions.
Images
Glance image services include discovering, registering, and retrieving virtual machine (VM)
images. Glance has a RESTful API that allows querying of VM image metadata as well as
retrieval of the actual image. VM images made available through Glance can be stored in a
variety of locations from simple filesystems to object-storage systems like the OpenStack Swift
project.
Metadata Definitions
Glance hosts a metadefs catalog. This provides the OpenStack community with a way to
programmatically determine various metadata key names and valid values that can be applied to
OpenStack resources.
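A sketch of registering a local QCOW2 file as an image and then listing what Glance already holds; the image name and file path are placeholders.

# Sketch: uploading a local QCOW2 file to Glance and listing images.
# The image name and file path are placeholders.
import openstack

conn = openstack.connect(cloud='mycloud')

image = conn.create_image(            # cloud-layer convenience helper
    'demo-image',
    filename='cirros.qcow2',
    disk_format='qcow2',
    container_format='bare',
    wait=True,
)

for img in conn.image.images():       # query image metadata via the REST API
    print(img.name, img.status, img.visibility)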
Object Storage (Swift)
Swift is a distributed, eventually consistent object/blob store. The OpenStack Object Store project, known as Swift, offers cloud storage software so that you can store and retrieve lots of data with a simple API. It is built for scale and optimized for durability, availability, and concurrency across the entire data set. Swift is ideal for storing unstructured data that can grow without bound. In August 2009, Rackspace started the development of the precursor to OpenStack Object Storage as a complete replacement for the Cloud Files product. The initial development team consisted of nine developers. SwiftStack, an object storage software company, is currently the leading developer for Swift, with significant contributions from Intel, Red Hat, NTT, HP, IBM, and more.
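A sketch of the simple store-and-retrieve API in use via openstacksdk; the container and object names are placeholders.

# Sketch: storing and fetching an object in Swift.
# Container and object names are placeholders.
import openstack

conn = openstack.connect(cloud='mycloud')

conn.object_store.create_container(name='course-notes')
conn.object_store.upload_object(
    container='course-notes',
    name='unit5.txt',
    data=b'Swift stores unstructured data behind a simple HTTP API.',
)

blob = conn.object_store.download_object('unit5.txt', container='course-notes')
print(blob.decode('utf-8'))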
Dashboard (Horizon)
Horizon is the canonical implementation of OpenStack's Dashboard, which provides a web-based user interface to OpenStack services including Nova, Swift, Keystone, etc. Horizon ships with three central dashboards: a "User Dashboard", a "System Dashboard", and a "Settings" dashboard. Between them, these cover the core OpenStack applications and deliver on Core Support. The Horizon application also ships with a set of API abstractions for the core
OpenStack projects in order to provide a consistent, stable set of reusable methods for
developers. Using these abstractions, developers working on Horizon don't need to be intimately
familiar with the APIs of each OpenStack project.
Orchestration (Heat)
Heat is a service to orchestrate multiple composite cloud applications using templates, through
both an OpenStack-native REST API and a CloudFormation-compatible Query API.
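A hedged sketch of launching a stack from a minimal HOT template through Heat's native API follows; the template contents are placeholders and the exact create_stack keywords may vary between openstacksdk releases.

# Sketch: creating a Heat stack from an in-line HOT template.
# Template values are placeholders; create_stack keyword names may
# differ slightly between openstacksdk releases.
import openstack

template = {
    'heat_template_version': '2018-08-31',
    'resources': {
        'demo_server': {
            'type': 'OS::Nova::Server',
            'properties': {
                'image': 'cirros-0.6.2',
                'flavor': 'm1.tiny',
                'networks': [{'network': 'private'}],
            },
        },
    },
}

conn = openstack.connect(cloud='mycloud')
stack = conn.orchestration.create_stack(name='demo-stack', template=template)
print(stack.id, stack.status)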
Workflow (Mistral)
Mistral is a service that manages workflows. A user typically writes a workflow in a YAML-based workflow language and uploads the workflow definition to Mistral via its REST API. The user can then start this workflow manually via the same API or configure a trigger to start the workflow on some event.
Telemetry (Ceilometer)
OpenStack Telemetry (Ceilometer) provides a Single Point Of Contact for billing systems,
providing all the counters they need to establish customer billing, across all current and future
OpenStack components. The delivery of counters is traceable and auditable, the counters must be
easily extensible to support new projects, and the agents doing data collection should be independent of the overall system.
Elastic Map Reduce (Sahara)
Sahara is a component to easily and rapidly provision Hadoop clusters. Users will specify several
parameters like the Hadoop version number, the cluster topology type, node flavor details
(defining disk space, CPU and RAM settings), and others. After a user provides all of the
parameters, Sahara deploys the cluster in a few minutes. Sahara also provides means to scale a
preexisting Hadoop cluster by adding and removing worker nodes on demand.
Bare Metal (Ironic)
Ironic is an OpenStack project that provisions bare metal machines instead of virtual machines. It
was initially forked from the Nova Baremetal driver and has evolved into a separate project. It is
best thought of as a bare-metal hypervisor API and a set of plugins that interact with the bare-
metal hypervisors. By default, it will use PXE and IPMI in concert to provision and turn on and
off machines, but Ironic supports and can be extended with vendor-specific plugins to implement
additional functionality.
Messaging (Zaqar)
Zaqar is a multi-tenant cloud messaging service for Web developers. The service features a fully
RESTful API, which developers can use to send messages between various components of their
SaaS and mobile applications by using a variety of communication patterns. Underlying this API
is an efficient messaging engine designed with scalability and security in mind. Other OpenStack
components can integrate with Zaqar to surface events to end users and to communicate with
guest agents that run in the "over-cloud" layer.
DNS (Designate)
Designate is a multi-tenant REST API for managing DNS. This component provides DNS as a
Service and is compatible with many backend technologies, including PowerDNS and BIND. It
doesn't provide a DNS service as such; its purpose is to interface with existing DNS servers to manage DNS zones on a per-tenant basis.
Search (Searchlight)
Searchlight provides advanced and consistent search capabilities across various OpenStack cloud
services. It accomplishes this by offloading user search queries from other OpenStack API
servers by indexing their data into ElasticSearch. Searchlight is being integrated into Horizon
and also provides a Command-line interface.
Key Manager (Barbican)
Barbican is a REST API designed for the secure storage, provisioning and management of
secrets. It is aimed at being useful for all environments, including large ephemeral Clouds.
Container Orchestration (Magnum)
Magnum is an OpenStack API service developed by the OpenStack Containers Team making
container orchestration engines such as Docker Swarm, Kubernetes, and Apache Mesos available
as first class resources in OpenStack. Magnum uses Heat to orchestrate an OS image which
contains Docker and Kubernetes and runs that image in either virtual machines or bare metal in a
cluster configuration.
Root Cause Analysis (Vitrage)
Vitrage is the OpenStack RCA (Root Cause Analysis) service for organizing, analyzing and
expanding OpenStack alarms & events, yielding insights regarding the root cause of problems
and deducing their existence before they are directly detected.
Alarming (Aodh)
Aodh is the alarming service; it enables the ability to trigger actions based on defined rules against metric or event data collected by Ceilometer or Gnocchi.
OpenStack does not strive for compatibility with other clouds' APIs. However, there is some amount of compatibility driven by various members of the OpenStack community for whom such things are important: the EC2 API project aims to provide compatibility with Amazon EC2, and the GCE API project aims to provide compatibility with Google Compute Engine.
Appliances
An OpenStack Appliance is the name given to software that can support the OpenStack cloud
computing platform on either physical devices such as servers or virtual machines or a
combination of the two. Typically, a software appliance is a set of software capabilities that can function without a separately installed operating system; it must therefore contain enough of the essential underlying operating system components to work. A strict definition might be: an application designed to offer OpenStack capability without the necessity of an underlying operating system. Applying this strict definition is not always helpful, however, because there is no clear distinction between an appliance and a distribution, and because OpenStack itself is often described as a cloud operating system, which makes the term "OpenStack appliance" something of a misnomer. Looking at the range of appliances and distributions, one could make the distinction that distributions are toolsets that attempt to cover a wide portion of the OpenStack project scope, whereas an appliance has a narrower focus and concentrates on fewer projects. Vendors have been heavily involved in OpenStack since its inception and have developed, and are marketing, a wide range of appliances, applications and distributions.
Federation in cloud
Cloud federation is the practice of interconnecting the cloud computing environments of two or
more service providers for the purpose of load balancing traffic and accommodating spikes in
demand.
Cloud federation requires one provider to wholesale or rent computing resources to another
cloud provider. Those resources become a temporary or permanent extension of the buyer's cloud
computing environment, depending on the specific federation agreement between providers.
Cloud federation offers two substantial benefits to cloud providers. First, it allows providers to
earn revenue from computing resources that would otherwise be idle or underutilized. Second,
cloud federation enables cloud providers to expand their geographic footprints and accommodate
sudden spikes in demand without having to build new points-of-presence (POPs).
Service providers strive to make all aspects of cloud federation—from cloud provisioning to
billing support systems (BSS) and customer support— transparent to customers. When
federating cloud services with a partner, cloud providers will also establish extensions of their
customer-facing service-level agreements (SLAs) into their partner provider's data centers.
The logical and operational level of a federated cloud identifies and addresses the challenges in
devising a framework that enables the aggregation of providers that belong to different
administrative domains within a context of a single overlay infrastructure, which is the cloud
federation. At this level, policies and rules for interoperation are defined. Moreover, this is the
layer at which decisions are made as to how and when to lease a service to—or to leverage a
service from— another provider. The logical component defines a context in which agreements
among providers are settled and services are negotiated, whereas the operational component
characterizes and shapes the dynamic behavior of the federation as a result of the single
providers' choices. This is the level where MOCC (market-oriented cloud computing) is implemented and realized.
The infrastructural level addresses the technical challenges involved in enabling heterogeneous
cloud computing systems to interoperate seamlessly. It deals with the technology barriers that
keep cloud computing systems belonging to different administrative domains separate. By
having standardized protocols and interfaces, these barriers can be overcome. In other words, this
level for the federation is what the TCP/IP stack is for the Internet: a model and a reference
implementation of the technologies enabling the interoperation of systems. The infrastructural
level lays its foundations in the IaaS and PaaS layers of the Cloud Computing Reference Model.
Services for interoperation and interface may also find implementation at the SaaS level,
especially for the realization of negotiations and of federated clouds.
Meerut Institute of Engineering & Technology, Meerut
Lesson Plan / Teaching Plan / Lecture Plan with Progress : B Tech - VII Semester : 2023-24
Topics / lectures are arranged in the same sequence as they are to be taught in class. Maintain data related to "Date" in its hard copy.
S. No. | Lecture No. | CO (No.) | Topic Description | Teaching Pedagogy | Date (Planned) | Date (Actual Delivery)
Reference material for all lectures: Rajkumar Buyya, Christian Vecchiola, S. ThamaraiSelvi, "Mastering Cloud Computing". No remarks recorded.
1 | 1 | CO1 | Introduction to Cloud Computing | TC1, TC2 | 29/08/2023 | 30/08/2023
2 | 2 | CO1 | Definition of Cloud | TC1, TC2 | 30/08/2023 | 31/08/2023
3 | 3 | CO1 | Evolution of Cloud Computing | TC1, TC2 | 31/08/2023 | 04/09/2023
4 | 4 | CO1 | Underlying Principles of Parallel and Distributed Computing | TC1, TC2 | 04/09/2023 | 05/09/2023
5 | 5 | CO1 | Cloud Characteristics | TC1, TC2 | 06/09/2023 | 06/09/2023
6 | 6 | CO1 | Elasticity in Cloud | TC1, TC2 | 06/09/2023 | 07/09/2023
7 | 7 | CO1 | On-Demand Provisioning | TC1, TC2 | 07/09/2023 | 11/09/2023
8 | 8 | CO2 | Cloud Enabling Technologies: Service Oriented Architecture - Introduction | TC1, TC2 | 08/09/2023 | 12/09/2023
9 | 9 | CO2 | REST and Systems of Systems | TC1, TC2 | 12/09/2023 | 13/09/2023
10 | 10 | CO2 | Web Services | TC1, TC2 | 13/09/2023 | 14/09/2023
11 | 11 | CO2 | Publish, Subscribe Model | TC1, TC2 | 14/09/2023 | 18/09/2023
12 | 12 | CO2 | Types of Virtualization | TC1, TC2 | 15/09/2023 | 19/09/2023
13 | 13 | CO2 | Implementation Levels of Virtualization, Virtualization Structures | TC1, TC2 | 19/09/2023 | 20/09/2023
14 | 14 | CO2 | Virtualization of CPU | TC1, TC2 | 20/10/2023 | 21/09/2023
15 | 15 | CO2 | Virtualization Support and Disaster Recovery | TC1, TC2 | 21/09/2023 | 02/10/2023
16 | 16 | CO2 | Quick revision of CO1 and CO2 | TC1 | 21/09/2023 | 03/10/2023
17 | 17 | CO3 | Cloud Computing Architecture Overview | TC1, TC2 | 03/10/2023 | 04/10/2023
18 | 18 | CO3 | Cloud Computing: 1) Frontend 2) Backend | TC1, TC2 | 04/10/2023 | 05/10/2023
19 | 19 | CO3 | Benefits of Cloud Computing Architecture | TC1, TC2 | 05/10/2023 | 09/10/2023
20 | 20 | CO3 | Layered Cloud Architecture Design | TC1, TC2 | 05/10/2023 | 10/10/2023
21 | 21 | CO3 | Introduction to NIST Cloud Computing | TC1, TC2 | 06/10/2023 | 11/10/2023
22 | 22 | CO3 | NIST Cloud Architecture, Deployment Models | TC1, TC2 | 09/10/2023 | 12/10/2023
23 | 23 | CO3 | Service Models (IaaS, PaaS, SaaS) | TC1, TC2 | 12/10/2023 | 16/10/2023
24 | 24 | CO3 | Architectural Design Challenges, Cloud Storage, SaaS, S3 | TC1, TC2 | 16/10/2023 | 06/11/2023
25 | 25 | CO4 | Overview of Resource Management and Security in Cloud | TC1, TC2 | 16/10/2023 | 07/11/2023
26 | 26 | CO4 | Inter Cloud Resource Management - Resource Provisioning | TC1, TC2 | 01/11/2023 | 08/11/2023
27 | 27 | CO4 | Resource Provisioning Methods | TC1, TC2 | 02/11/2023 | 09/11/2023
28 | 28 | CO4 | Global Exchange of Cloud Resources | TC1, TC2 | 14/11/2023 | 13/11/2023
29 | 29 | CO4 | Security Overview and Definition | TC1, TC2 | 14/11/2023 | 14/11/2023
30 | 30 | CO4 | Software-as-a-Service Security | TC1, TC2 | 15/11/2023 | 15/11/2023
31 | 31 | CO4 | Security Governance | TC1, TC2 | 15/11/2023 | 16/11/2023
32 | 32 | CO4 | Virtual Machine Security | TC1, TC2 | 16/11/2023 | 20/11/2023
33 | 33 | CO4 | IAM | TC1, TC2 | 20/11/2023 | 21/11/2023
34 | 34 | CO4 | Security Standards | TC1, TC2 | 21/11/2023 | 22/11/2023
35 | 35 | CO4 | Tutorials, Quick Revision of CO3 & CO4 | TC1 | 23/11/2023 | 23/11/2023
36 | 36 | CO5 | Cloud Technologies and Advancements: Hadoop | TC1, TC2 | 23/11/2023 | 27/11/2023
37 | 37 | CO5 | MapReduce, Virtual Box | TC1, TC2 | 27/11/2023 | 28/11/2023
38 | 38 | CO5 | Google App Engine | TC1, TC2 | 28/11/2023 | 29/11/2023
39 | 39 | CO5 | Programming Environment for Google App Engine | TC1, TC2 | 30/11/2023 | 30/11/2023
40 | 40 | CO5 | OpenStack, Federation in the Cloud and Four Levels of Federation | TC1, TC2 | 04/12/2023 | 04/12/2023
41 | 41 | CO5 | Federated Services and Applications | TC1, TC2 | 04/12/2023 | 05/12/2023
42 | 42 | CO5 | Quick revision of CO1, CO2, CO3, CO4 and CO5 | TC1 | 04/12/2023 | 06/12/2023
Teaching Pedagogy codes: TC2 = PPT.
CO Statement: CO 4 - Explain Inter Cloud resource management, cloud storage services and their providers; assess security services and standards for cloud computing.
S. No. | Description
R1 | Kai Hwang, Geoffrey C. Fox, Jack G. Dongarra, "Distributed and Cloud Computing: From Parallel Processing to the Internet of Things", Morgan Kaufmann Publishers, 2012.
R2 | Rittinghouse, John W., and James F. Ransome, "Cloud Computing: Implementation, Management and Security", CRC Press, 2017.
R3 | Rajkumar Buyya, Christian Vecchiola, S. ThamaraiSelvi, "Mastering Cloud Computing", Tata McGraw Hill, 2013.
R4 | Toby Velte, Anthony Velte, Robert Elsenpeter, "Cloud Computing: A Practical Approach", Tata McGraw Hill, 2009.
R5 | George Reese, "Cloud Application Architectures: Building Applications and Infrastructure in the Cloud: Transactional Systems for EC2 and Beyond (Theory in Practice)", O'Reilly, 2009.
MEERUT INSTITUTE OF ENGINEERING & TECHNOLOGY, MEERUT
CLASS TIME TABLE (EVEN SEMESTER) W.e.f. 30/08/2023
Course: B.TECH Branch: CSE Semester VII
M-204 Section A+B
[Weekly timetable grid (Monday-Thursday, periods 09:00-04:50 with a lunch break) showing lecture, practical and library slots for KHU-702 (PME), KOE-074, KCS-071, KCS-077, KCS-713 (Cloud Computing) and the KCS-751 / KCS-752 / KCS-753 lab groups, with room numbers and faculty initials.]
AS PER AICTE MODEL CURRICULUM
[Effective from the Session: 2021-22]
SEMESTER - VII
Sl. No. | Subject Codes | Subject | Periods (L-T-P) | Evaluation Scheme (CT, TA, Total, PS) | End Semester (TE, PE) | Total | Credit
Total | | | 12-0-12 | | | 850 | 18
*The Mini Project or Internship (4-6 weeks) is conducted during the summer break after the VI semester and is assessed during the VII semester.
SEMESTER - VIII
Sl. No. | Subject Codes | Subject | Periods (L-T-P) | Evaluation Scheme (CT, TA, Total, PS) | End Semester (TE, PE) | Total | Credit
1 | KHU801/KHU802 | HSMC-2# / HSMC-1# | 3-0-0 | 30, 20, 50 | 100 | 150 | 3
Departmental Elective-V
CO 2 Able to understand the basic concepts of access optimization and parallel computers. K2, K3
CO 3 Able to describe different parallel processing platforms involved in achieving high performance computing. K3, K4
CO 4 Develop efficient and high performance parallel programming. K2, K3
CO 5 Able to learn parallel programming using message passing paradigm. K2, K4
DETAILED SYLLABUS 3-0-0
Unit I (08 lectures): Overview of Grid Computing Technology, History of Grid Computing, High Performance Computing, Cluster Computing, Peer-to-Peer Computing, Internet Computing, Grid Computing Model and Protocols, Types of Grids: Desktop Grids, Cluster Grids, Data Grids, High-Performance Grids, Applications and Architectures of High Performance Grids, High Performance Application Development Environment.
Unit II (08 lectures): Open Grid Services Architecture: Introduction, Requirements, Capabilities, Security Considerations, GLOBUS Toolkit.
Unit III (08 lectures): Overview of Cluster Computing: Cluster Computer and its Architecture, Clusters Classifications, Components for Clusters, Cluster Middleware and SSI, Resource Management and Scheduling, Programming Environments and Tools, Cluster Applications, Cluster Systems.
Unit IV (08 lectures): Beowulf Cluster: The Beowulf Model, Application Domains, Beowulf System Architecture, Software Practices, Parallel Programming with MPI, Parallel Virtual Machine (PVM).
Unit V (08 lectures): Overview of Cloud Computing: Types of Cloud, Cyber Infrastructure, Service Oriented Architecture, Cloud Computing Components: Infrastructure, Storage, Platform, Application, Services, Clients, Cloud Computing Architecture.
Text books:
1. Laurence T. Yang, Minyi Guo, "High Performance Computing: Paradigm and Infrastructure", John Wiley.
2. Ahmar Abbas, "Grid Computing: Practical Guide to Technology & Applications", Firewall Media, 2004.
3. Joshy Joseph and Craig Fellenstein, "Grid Computing", Pearson Education, 2004.
4. Ian Foster, et al., "The Open Grid Services Architecture", Version 1.5 (GFD.80), Open Grid Forum, 2006.
5. Rajkumar Buyya, "High Performance Cluster Computing: Architectures and Systems", Prentice Hall India, 1999.
CO 3 Understand vulnerability assessments and the weakness of using passwords for authentication K4
CO 5 Summarize the intrusion detection and its solutions to overcome the attacks. K2
Unit I (08 lectures): Introduction to security attacks, services and mechanisms; classical encryption techniques - substitution ciphers and transposition ciphers, cryptanalysis, steganography; stream and block ciphers. Modern Block Ciphers: block cipher principles, Shannon's theory of confusion and diffusion, Feistel structure, Data Encryption Standard (DES), strength of DES, idea of differential cryptanalysis, block cipher modes of operation, Triple DES.
Unit II (08 lectures): Introduction to groups, fields, finite fields of the form GF(p), modular arithmetic, prime and relative prime numbers, Extended Euclidean Algorithm, Advanced Encryption Standard (AES) encryption and decryption; Fermat's and Euler's theorems, primality testing, Chinese Remainder Theorem, Discrete Logarithm Problem; principles of public key cryptosystems, RSA algorithm, security of RSA.
Unit III (08 lectures): Message Authentication Codes: authentication requirements, authentication functions, message authentication code, hash functions, birthday attacks, security of hash functions, Secure Hash Algorithm (SHA). Digital Signatures: digital signatures, Elgamal digital signature technique, Digital Signature Standard (DSS), proof of the digital signature algorithm.
Unit IV (08 lectures): Key Management and Distribution: symmetric key distribution, Diffie-Hellman key exchange, public key distribution, X.509 certificates, Public Key Infrastructure. Authentication Applications: Kerberos. Electronic mail security: Pretty Good Privacy (PGP), S/MIME.
Unit V (08 lectures): IP Security: architecture, Authentication Header, Encapsulating Security Payloads, combining security associations, key management. Introduction to Secure Socket Layer, Secure Electronic Transaction (SET). System Security: introductory idea of intrusion, intrusion detection, viruses and related threats, firewalls.
Text books:
1. William Stallings, "Cryptography and Network Security: Principles and Practice", Pearson Education.
2. Behrouz A. Forouzan, "Cryptography and Network Security", McGraw Hill.
3. C K Shyamala, N Harini, Dr. T. R. Padmanabhan, "Cryptography and Security", Wiley.
4. Bruce Schneier, "Applied Cryptography", John Wiley & Sons.
5. Bernard Menezes, "Network Security and Cryptography", Cengage Learning.
6. Atul Kahate, "Cryptography and Network Security", McGraw Hill.
Unit I (08 lectures): INTRODUCTION: Introduction to mobile applications - Embedded systems - Market and business drivers for mobile applications - Publishing and delivery of mobile applications - Requirements gathering and validation for mobile applications.
Unit II (08 lectures): BASIC DESIGN: Introduction - Basics of embedded systems design - Embedded OS - Design constraints for mobile applications, both hardware and software related - Architecting mobile applications - User interfaces for mobile applications - Touch events and gestures - Achieving quality constraints: performance, usability, security, availability and modifiability.
Unit III (08 lectures): ADVANCED DESIGN: Designing applications with multimedia and web access capabilities - Integration with GPS and social media networking applications - Accessing applications hosted in a cloud computing environment - Design patterns for mobile applications.
Unit IV (08 lectures): TECHNOLOGY I - ANDROID: Introduction - Establishing the development environment - Android architecture - Activities and views - Interacting with UI - Persisting data using SQLite - Packaging and deployment - Interaction with server-side applications - Using Google Maps, GPS and Wi-Fi - Integration with social media applications.
Unit V (08 lectures): TECHNOLOGY II - iOS: Introduction to Objective-C - iOS features - UI implementation - Touch frameworks - Data persistence using Core Data and SQLite - Location-aware applications using Core Location and Map Kit - Integrating calendar and address book with social media applications - Using Wi-Fi - iPhone marketplace. Swift: Introduction to Swift, features of Swift.
Text books:
1. Charlie Collins, Michael Galpin and Matthias Kappler, "Android in Practice", DreamTech, 2012.
2. Anubhav Pradhan, Anil V. Deshpande, "Composing Mobile Apps: Learn, Explore, Apply".
3. James Dovey and Ash Furrow, "Beginning Objective C", Apress, 2012.
4. Jeff McWherter and Scott Gowell, "Professional Mobile Application Development", Wrox, 2012.
5. David Mark, Jack Nutting, Jeff LaMarche and Frederic Olsson, "Beginning iOS 6 Development: Exploring the iOS SDK", Apress, 2013.
CO 2 Have an ability to design and conduct a software test process for a software testing project. K3, K4
CO 3 Have an ability to identify the needs of software test automation, and define and develop a test tool to support test automation. K1, K2
CO 4 Have an ability to understand and identify various software testing problems, and solve these problems by designing and selecting software test models, criteria, strategies, and methods. K1, K2
CO 5 Have basic understanding and knowledge of contemporary issues in software testing, such as component-based software testing problems. K2
DETAILED SYLLABUS 3-0-0
Unit I (08 lectures): Review of Software Engineering: Overview of Software Evolution, SDLC, Testing Process, Terminologies in Testing: Error, Fault, Failure, Verification, Validation, Difference Between Verification and Validation, Test Cases, Testing Suite, Test Oracles, Impracticality of Testing All Data, Impracticality of Testing All Paths. Verification: Verification Methods, SRS Verification, Source Code Reviews, User Documentation Verification, Software Project Audit, Tailoring Software Quality Assurance Program by Reviews, Walkthrough, Inspection and Configuration Audits.
Unit II (08 lectures): Functional Testing: Boundary Value Analysis, Equivalence Class Testing, Decision Table Based Testing, Cause Effect Graphing Technique. Structural Testing: Control Flow Testing, Path Testing, Independent Paths, Generation of Graph from Program, Identification of Independent Paths, Cyclomatic Complexity, Data Flow Testing, Mutation Testing.
Unit III (08 lectures): Regression Testing: What is Regression Testing? Regression test case selection, Reducing the number of test cases, Code coverage prioritization technique. Reducing the number of test cases: Prioritization guidelines, Priority category, Scheme, Risk Analysis.
Unit IV (08 lectures): Software Testing Activities: Levels of Testing, Debugging, Testing techniques and their applicability, Exploratory Testing. Automated Test Data Generation: Test Data, Approaches to test data generation, test data generation using genetic algorithms, Test Data Generation Tools, Software Testing Tools, and Software Test Plan.
Unit V (08 lectures): Object Oriented Testing: Definition, Issues, Class Testing, Object Oriented Integration and System Testing. Testing Web Applications: Web Testing, User Interface Testing, Usability Testing, Security Testing, Performance Testing, Database Testing, Post Deployment Testing.
Text books:
1. Yogesh Singh, "Software Testing", Cambridge University Press, New York, 2012.
2. K.K. Aggarwal & Yogesh Singh, "Software Engineering", New Age International Publishers, New Delhi, 2003.
3. Roger S. Pressman, "Software Engineering - A Practitioner's Approach", Fifth Edition, McGraw-Hill International Edition, New Delhi, 2001.
4. Marc Roper, "Software Testing", McGraw-Hill Book Co., London, 1994.
5. M.C. Trivedi, "Software Testing & Audit", Khanna Publishing House.
6. Boris Beizer, "Software System Testing and Quality Assurance", Van Nostrand Reinhold, New York, 1984.
CO 4 To know about Shared Memory Techniques and have Sufficient knowledge about file access K1
CO 3 Describe Services Oriented Architecture and various types of cloud services. K2, K3
CO 4 Explain Inter Cloud resource management, cloud storage services and their providers; assess security services and standards for cloud computing. K2, K4
CO 5 Analyze advanced cloud technologies. K3, K6
DETAILED SYLLABUS 3-1-0
Unit I (08 lectures): Introduction to Cloud Computing: Definition of Cloud - Evolution of Cloud Computing - Underlying Principles of Parallel and Distributed Computing - Cloud Characteristics - Elasticity in Cloud - On-Demand Provisioning.
Unit II (08 lectures): Cloud Enabling Technologies, Service Oriented Architecture: REST and Systems of Systems - Web Services - Publish, Subscribe Model - Basics of Virtualization - Types of Virtualization - Implementation Levels of Virtualization - Virtualization Structures - Tools and Mechanisms - Virtualization of CPU, Memory, I/O Devices - Virtualization Support and Disaster Recovery.
Unit III (08 lectures): Cloud Architecture, Services and Storage: Layered Cloud Architecture Design - NIST Cloud Computing Reference Architecture - Public, Private and Hybrid Clouds - IaaS - PaaS - SaaS - Architectural Design Challenges - Cloud Storage - Storage-as-a-Service - Advantages of Cloud Storage - Cloud Storage Providers - S3.
Unit IV (08 lectures): Resource Management and Security in Cloud: Inter Cloud Resource Management - Resource Provisioning and Resource Provisioning Methods - Global Exchange of Cloud Resources - Security Overview - Cloud Security Challenges - Software-as-a-Service Security - Security Governance - Virtual Machine Security - IAM - Security Standards.
Unit V (08 lectures): Cloud Technologies and Advancements: Hadoop - MapReduce - Virtual Box - Google App Engine - Programming Environment for Google App Engine - OpenStack - Federation in the Cloud - Four Levels of Federation - Federated Services and Applications - Future of Federation.
Text books:
1. Kai Hwang, Geoffrey C. Fox, Jack G. Dongarra, "Distributed and Cloud Computing: From Parallel Processing to the Internet of Things", Morgan Kaufmann Publishers, 2012.
2. Rittinghouse, John W., and James F. Ransome, "Cloud Computing: Implementation, Management and Security", CRC Press, 2017.
3. Rajkumar Buyya, Christian Vecchiola, S. ThamaraiSelvi, "Mastering Cloud Computing", Tata McGraw Hill, 2013.
4. Toby Velte, Anthony Velte, Robert Elsenpeter, "Cloud Computing: A Practical Approach", Tata McGraw Hill, 2009.
5. George Reese, "Cloud Application Architectures: Building Applications and Infrastructure in the Cloud: Transactional Systems for EC2 and Beyond (Theory in Practice)", O'Reilly, 2009.
CO 1 Describe the basic understanding of Blockchain architecture along with its primitive. K1, K2
CO 2 Explain the requirements for basic protocol along with scalability aspects. K2, K3
CO 3 Design and deploy the consensus process using frontend and backend. K3, K4
CO 4 Apply Blockchain techniques for different use cases like Finance, Trade/Supply and Government activities. K4, K5
Mini Project or Internship Assessment (KCS 354 , KCS 554 , KCS 752)
Course Outcome ( CO) Bloom’s Knowledge Level (KL)
CO 5 Learning professional skills like exercising leadership, behaving professionally, behaving ethically, listening effectively, participating as a member of a team, developing appropriate workplace attitudes. K2, K4
Section – A
Attempt ALL the questions (10 x 2 = 20 marks)
Q. No. COs Attempt ALL the questions. Each Question is of 2 marks
A CO1 Define Cloud Computing. (BKL : K1-K2 Level).
B CO1 What is distributed computing? (BKL : K1-K2 Level).
C CO2 What do you mean by full virtualization? (BKL : K1-K2 Level).
D CO2 Differentiate between API and Web services. (BKL : K1-K2 Level).
E CO3 Write down any two advantages of SaaS. (BKL : K1-K2 Level).
F CO3 Give the names of some popular Software-as-a-Service vendors. (BKL : K1-K2 Level).
G CO4 What are security services in the cloud? (BKL : K1-K2 Level).
H CO4 What is IAM? (BKL : K1-K2 Level).
I CO5 What are modules of Hadoop? (BKL : K1-K2 Level).
J CO5 Explain virtual box. (BKL : K1-K2 Level).
Section – B
Attempt ALL the questions (5 x 6 = 30 marks)
Q.2 (CO-1) : What are the advantages and disadvantages of Cloud Computing?
OR
Explain the evolution of cloud computing with neat diagram.
Q.3 (CO-2) : Why are web services required? Illustrate web services in detail.
OR
What is the difference between process virtual machines, host VMMs and native
VMMs?
Q.4 (CO-3) : Explain AWS Simple Storage Service (AWS S3) with its features in details.
OR
Discuss the architectural design challenges faced in cloud computing.
Q.5 (CO-4) : Explain the importance of workflow management systems in cloud with the help of Architecture
diagram.
OR
Explain the Cloud management and Services Creation Tools.
Q.6 (CO-5) : Take a suitable example and explain the concept of MapReduce.
OR
Give a suitable definition of cloud federation stack and explain it in detail.
Section – C
Attempt ALL the questions. Each Question is of 10 marks.