1 Introduction
Course Logistics
Resources:
Book:
“Cloud Computing: Theory and Practice”, Dan C. Marinescu, Morgan Kaufmann
Research papers (will be given per lecture)
Course Contents
1. Introduction to Cloud Computing
2. Virtualization I
3. Virtualization II
4. MapReduce Batch Processing
5. MapReduce in Heterogeneous Environments
6. Large-Scale Resource Management
7. Data Center Networking
8. Cloud Distributed Storage
9. Real-Time Data Stream Processing
Lecture Contents
What is Cloud Computing?
“Simply put, cloud computing is the delivery of computing services – servers, storage,
databases, networking, software, analytics and more – over the Internet (“the cloud”).
Companies offering these computing services are called cloud providers and typically
charge for cloud computing services based on usage, similar to how you’re billed for gas or
electricity at home.” https://azure.microsoft.com/en-gb/overview/what-is-cloud-computing/
Cloud Computing Models, Resources, Attributes
Delivery models:
Software as a Service (SaaS)
Platform as a Service (PaaS)
Deployment models:
Public cloud
Community cloud
Hybrid cloud
Infrastructure:
Distributed infrastructure
Resource virtualization
Autonomous systems
Defining attributes:
Massive infrastructure
Utility computing. Pay-per-usage
Accessible via the Internet
Elasticity
Resources:
Compute & storage servers
Networks
Services
Applications
Early Models of Cloud Computing
Basic reasoning: information and data processing can be done more efficiently
on large farms of computing and storage systems accessible via the Internet.
Two early models:
1. Grid computing – initiated by the National Labs in the early 1990s; targeted
primarily at scientific computing.
“Grid computing is the collection of computer resources from multiple locations to reach a
common goal. The grid can be thought of as a distributed system with non-interactive
workloads that involve a large number of files.”
2. Utility computing – initiated in 2005-2006 by IT companies and targeted at
enterprise computing.
“Utility computing is a service provisioning model in which a service provider makes
computing resources and infrastructure management available to the customer as needed,
and charges them for specific usage rather than a flat rate.”
Cloud computing - Characteristics
“Cloud Computing offers on-demand, scalable and elastic computing (and storage)
services. The resources used for these services can be metered and users are
charged only for the resources used.” (from the book by Marinescu)
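The metering idea in the quote above can be illustrated with a small sketch. The resource names and unit prices below are hypothetical and chosen only for the arithmetic; they are not taken from the book or from any real provider's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Metered consumption for one billing period (hypothetical resource set)."""
    cpu_hours: float
    storage_gb_months: float
    egress_gb: float


# Hypothetical unit prices, for illustration only.
RATES = {
    "cpu_hours": 0.05,          # price per CPU hour
    "storage_gb_months": 0.02,  # price per GB stored per month
    "egress_gb": 0.09,          # price per GB transferred out
}


def metered_charge(usage: Usage, rates: dict = RATES) -> float:
    """Charge only for what was consumed: sum of quantity * unit price."""
    total = (
        usage.cpu_hours * rates["cpu_hours"]
        + usage.storage_gb_months * rates["storage_gb_months"]
        + usage.egress_gb * rates["egress_gb"]
    )
    return round(total, 2)


if __name__ == "__main__":
    print(metered_charge(Usage(cpu_hours=100, storage_gb_months=50, egress_gb=10)))
```

A user who consumes nothing in a period is charged nothing, which is the difference from a flat-rate model.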
Cloud computing (cont’d)
Data Storage:
6. Data is stored in the “cloud”, in certain cases closer to the site where it is
used, and appears to the users as if stored in a location-independent manner.
7. The data storage strategy can increase reliability and security, and
can lower communication costs.
Management:
8. Maintenance and security are handled by the service providers.
9. Service providers can operate more efficiently thanks to specialisation
and centralisation.
Cloud Computing Advantages
1. Resources, such as CPU cycles, storage, network bandwidth, are shared.
Types of clouds
1. Public Cloud - the infrastructure is made available to the general public or a
large industry group and is owned by the organization selling cloud services.
Why is (or could) cloud computing be successful
when other paradigms have failed?
It is in a better position to exploit recent advances in
software, networking, storage, and processor technologies
promoted by the same companies who provide Cloud
services.
Economic reasons: It is used for enterprise computing; its
adoption by industrial organizations, financial institutions,
government, and so on has a huge impact on the economy.
Infrastructure management reasons:
A single Cloud consists of a mostly homogeneous (now more
heterogeneous) set of hardware and software resources.
The resources are in a single Administrative Domain (AD).
Security, resource management, fault-tolerance, and quality of
service are less challenging than in a heterogeneous environment
with resources in multiple ADs.
Challenges for cloud computing
1. Availability of service: what happens when the service provider
cannot deliver?
More challenges
5. Performance unpredictability, one of the consequences of resource sharing.
How to use resource virtualization and performance isolation for QoS guarantees?
How to support elasticity, the ability to scale up and down quickly?
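As a rough illustration of the elasticity question above, the sketch below uses a proportional scaling rule: the instance count is adjusted so that average utilisation moves towards a target. The rule, the function name `desired_instances`, and the default target and bounds are assumptions made for this example; the slides do not prescribe a particular policy.

```python
import math


def desired_instances(
    current: int,
    utilisation: float,
    target: float = 0.5,
    min_n: int = 1,
    max_n: int = 100,
) -> int:
    """Return how many instances to run so utilisation approaches `target`.

    current     -- number of instances running now (>= 1)
    utilisation -- measured average load per instance (0.0 = idle)
    target      -- utilisation we would like each instance to run at
    min_n/max_n -- hypothetical bounds on the pool size
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    if utilisation < 0.0:
        raise ValueError("utilisation must be non-negative")
    # Proportional rule: total load is current * utilisation; divide by target.
    wanted = math.ceil(current * utilisation / target)
    # Clamp to the allowed range so the pool never drops to zero or runs away.
    return max(min_n, min(max_n, wanted))


if __name__ == "__main__":
    print(desired_instances(current=4, utilisation=0.9))   # scale up
    print(desired_instances(current=8, utilisation=0.25))  # scale down
```

The same function covers scaling up and scaling down; how quickly the new instances actually become available is the harder part of elasticity and is not modelled here.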
Cloud Delivery Models
1. Software as a Service (SaaS) (high level)
2. Platform as a Service (PaaS)
3. Infrastructure as a Service (IaaS) (low level)
Infrastructure-as-a-Service (IaaS)
Infrastructure here means compute resources: CPUs, VMs, storage, etc.
The user is able to deploy and run arbitrary software, which can include operating systems and
applications.
The user does not manage or control the underlying Cloud infrastructure but has control over
operating systems, storage, deployed applications, and possibly limited control of some networking
components, e.g., host firewalls.
Services offered by this delivery model include: server hosting, storage, computing hardware,
operating systems, virtual instances, load balancing, Internet access, and bandwidth provisioning.
Platform-as-a-Service (PaaS)
Allows a cloud user to deploy consumer-created or acquired applications using
programming languages and tools supported by the service provider.
The user:
Has control over the deployed applications and, possibly, application hosting environment
configurations.
Does not manage or control the underlying Cloud infrastructure including network, servers,
operating systems, or storage.
Not particularly useful when:
The application must be portable.
Proprietary programming languages are used.
The hardware and software must be customised to improve the performance of the
application.
Examples: Google App Engine, Windows Azure
Software-as-a-Service (SaaS)
Applications are supplied by the service provider.
The user does not manage or control the underlying Cloud infrastructure
or individual application capabilities.
Services offered include:
Enterprise services such as: workflow management, communications, digital
signature, Customer Relationship Management (CRM), desktop software,
financial management, geo-spatial, and search.
Not suitable for real-time applications or for those where data is not
allowed to be hosted externally.
The Three delivery models of Cloud Computing
Cloud activities
Cloud activities (cont’d)
Cloud activities (cont’d)
Ethical issues
De-perimeterisation
Systems can span the boundaries of multiple organisations and cross the
security borders.
Identity fraud and theft are made possible by unauthorised access to
personal data in circulation and by new forms of dissemination through
social networks; they could also pose a danger to Cloud Computing.
Privacy issues
Cloud Vulnerabilities
Such events can affect the Internet domain name servers and
prevent access to a Cloud, or can directly affect the Clouds:
In 2004, an attack at Akamai caused a domain name outage and a
major blackout that affected Google, Yahoo, and other sites.
In 2009, Google was the target of a denial-of-service attack which took
down Google News and Gmail for several days.
In 2012, lightning caused prolonged downtime at Amazon.
Back to Basics -- Parallel Computing
Parallel Computing – Amdahl’s Law
The speedup S measures the effectiveness of parallelisation:
S(N) = T(1) / T(N)
T(1) is the execution time of the sequential computation.
T(N) is the execution time when N parallel computations are executed.
Amdahl’s Law gives a theoretical upper bound on the best speedup we can get from
parallelising a certain program.
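The bound can be made concrete with the usual textbook statement of Amdahl’s Law, which the slide text does not spell out: if a fraction alpha of the running time is inherently sequential, then S(N) = 1 / (alpha + (1 - alpha) / N), which approaches 1 / alpha as N grows. The symbol alpha and the function names below are assumptions introduced for this sketch.

```python
def amdahl_speedup(alpha: float, n: int) -> float:
    """Speedup predicted by Amdahl's Law.

    alpha -- fraction of the running time that cannot be parallelised (0..1)
    n     -- number of parallel computations (>= 1)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    # Sequential part takes alpha; parallel part (1 - alpha) is divided by n.
    return 1.0 / (alpha + (1.0 - alpha) / n)


def max_speedup(alpha: float) -> float:
    """Limit of the speedup as n grows without bound: 1 / alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return float("inf") if alpha == 0.0 else 1.0 / alpha


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, amdahl_speedup(0.1, n))
    print("limit:", max_speedup(0.1))
```

With a 10% sequential fraction, ten parallel computations give a speedup of roughly 5.3, and no number of them can push it past 10.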
Back to Basics -- Distributed systems
Collection of autonomous computers, connected through a network and distribution
software (often) called middleware which enables computers to coordinate their
activities and to share system resources for a common goal.
Characteristics:
1. The users perceive the system as a single, integrated computing facility.
2. The components are autonomous.
3. Scheduling and other resource management and security policies are implemented by each
system.
4. There are multiple points of control and multiple points of failure.
5. The resources may not be accessible at all times.
6. Can be scaled by adding additional resources.
7. Can be designed to maintain availability even at low levels of hardware/software/network
reliability.
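Characteristic 7 can be illustrated with a simple replication model. The sketch assumes that each replica is available with the same probability p and that replicas fail independently, so the service is up whenever at least one replica is up. Both assumptions are simplifications made for this example, and the function names are hypothetical.

```python
def replicated_availability(p: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent copies is up.

    p        -- availability of a single replica (0..1)
    replicas -- number of replicas (>= 1)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # All replicas are down with probability (1 - p) ** replicas.
    return 1.0 - (1.0 - p) ** replicas


def replicas_needed(p: float, target: float) -> int:
    """Smallest replica count whose combined availability reaches `target`."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    n = 1
    while replicated_availability(p, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, replicated_availability(0.9, n))
```

Under these assumptions, three replicas that are each up only 90% of the time give a combined availability of about 99.9%. Correlated failures, such as a shared power supply or network link, would make the real figure lower.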
Summary