Introduction
• Cloud computing is a technology that enables us to create, configure, and customize applications through an Internet connection.
• A software environment is a collection of programs, libraries, and utilities that allows users to perform specific tasks.
• Software environments are often used by programmers to develop applications or run existing ones.
• A software environment for a particular application could include the operating system, the database system, specific development tools, or compilers.

Features of Cloud and Grid Platforms
• Important features in real cloud and grid platforms.
• In four tables, we cover the capabilities, traditional features, data features, and features for programmers and runtime systems to use.
• The entries in these tables are source references for anyone who wants to program the cloud efficiently.

Grid Computing
• Grid computing (sometimes referred to as virtual supercomputing) is a group of networked computers that work together as a virtual supercomputer to perform large tasks, such as analyzing huge data sets or weather modeling.
• Grid computing is extensively used in scientific research and high-performance computing to solve complex scientific problems.
• For example, grid computing can be used to simulate the behavior of a nuclear explosion, model the human genome, or analyze massive amounts of data generated by particle accelerators.
• Advantages:
  • Can solve larger, more complex problems in a shorter time
  • Easier to collaborate with other organizations
  • Makes better use of existing hardware

Cloud vs. Grid Computing
• https://www.skysilk.com/blog/2017/cloud-vs-grid-computing/

Cloud Capabilities and Platform Features
• Commercial clouds need broad capabilities, as summarized in Table 6.1.
• Table 6.2 lists some low-level infrastructure features.
• Table 6.3 lists traditional programming environments for parallel and distributed systems that need to be supported in cloud environments; they can be supplied as part of the system (cloud platform) or the user environment.
• Table 6.4 presents features emphasized by clouds and by some grids.

Traditional Features Common to Grids and Clouds
• Features related to workflow, data transport, security, and availability concerns that are common to today's computing grids and clouds.
• Workflow
  • Workflow links multiple cloud and non-cloud services in real applications on demand.
• Data Transport
  • A difficulty in using clouds is the cost (in time and money) of data transport.
  • The special structure of cloud data, with blocks (in Azure blobs) and tables, could allow high-performance parallel algorithms, but initially simple HTTP mechanisms are used to transport data (a minimal transfer sketch appears at the end of this section).
• Security, Privacy, and Availability
  • The following techniques relate to the security, privacy, and availability requirements for developing a healthy and dependable cloud programming environment:
  • Use virtual clustering to achieve dynamic resource provisioning with minimum overhead cost.
  • Use stable and persistent data storage with fast queries for information retrieval.
  • Use special APIs for authenticating users and sending e-mail using commercial accounts.
  • Cloud resources are accessed with security protocols such as HTTPS and SSL.
  • Fine-grained access control is desired to protect data integrity and deter intruders or hackers.
  • Shared data sets are protected from malicious alteration, deletion, or copyright violations.
  • Features are included for availability enhancement and disaster recovery with live migration of VMs.
  • Use a reputation system to protect data centers; such a system only authorizes trusted clients and stops pirates.
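The "simple HTTP mechanisms" mentioned under Data Transport can be made concrete with a minimal sketch, assuming a hypothetical blob endpoint and token rather than a real Azure or S3 API call: a data block is pushed to cloud storage with a plain HTTPS PUT, so TLS provides the transport security noted above.

```python
# Minimal sketch: moving one data block to cloud storage with a plain HTTPS PUT.
# BLOB_URL and AUTH_TOKEN are placeholders, not a real provider API.
import requests

BLOB_URL = "https://storage.example.com/container/block-0001"  # hypothetical endpoint
AUTH_TOKEN = "..."  # credential obtained out of band

def upload_block(path: str) -> None:
    """Stream one local data block over HTTPS to the blob endpoint."""
    with open(path, "rb") as f:
        resp = requests.put(
            BLOB_URL,
            data=f,
            headers={
                "Authorization": f"Bearer {AUTH_TOKEN}",
                "Content-Type": "application/octet-stream",
            },
            timeout=60,
        )
    resp.raise_for_status()  # surface transport or authorization failures

if __name__ == "__main__":
    upload_block("block-0001.bin")
```

Real deployments would use the provider's SDK and block/part sizes tuned to the network, which is exactly the transport cost the bullet above warns about.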
Data Features and Databases
• Program Library
  • Many efforts have been made to design a VM image library to manage the images used in academic and commercial clouds.
• Blobs and Drives
  • The basic storage concept in clouds is the blob: Azure blobs for Azure and S3 for Amazon.
  • In addition to a service interface for blobs and S3, storage can be attached "directly" to compute instances, as Azure drives and the Elastic Block Store for Amazon.
• DPFS
  • Covers support of file systems such as the Google File System (MapReduce), HDFS (Hadoop), and Cosmos (Dryad), with compute-data affinity optimized for data processing.
  • It could be possible to link DPFS to the basic blob- and drive-based architecture, but it is simpler to use DPFS as the application-centric storage model with compute-data affinity, and blobs and drives as the repository-centric view.
• SQL and Relational Databases
  • Both the Amazon and Azure clouds offer relational databases.
• Table and NoSQL Nonrelational Databases
  • Present in the three major clouds: BigTable in Google, SimpleDB in Amazon, and Azure Table [13] in Azure.
• Queuing Services
  • Both Amazon and Azure offer similar scalable, robust queuing services that are used to communicate between the components of an application.

Programming and Runtime Support
• Support is desired to facilitate parallel programming and to provide runtime support of important functions in today's grids and clouds.
• Worker and Web Roles
  • The roles introduced by Azure provide nontrivial functionality, while preserving the better affinity support that is possible in a nonvirtualized environment.
  • Worker roles are basic schedulable processes and are launched automatically.
  • Note that explicit scheduling is unnecessary in clouds, both for individual worker roles and for the "gang scheduling" supported transparently in MapReduce.
  • Queues are a critical concept here, as they provide a natural way to manage task assignment in a fault-tolerant, distributed fashion (a local sketch of this producer/worker queue pattern follows this section).
  • Web roles provide an interesting approach to portals. GAE is largely aimed at web applications, whereas science gateways are successful in TeraGrid.
• MapReduce
  • There has been substantial interest in "data parallel" languages largely aimed at loosely coupled computations which execute over different data samples.
  • The language and runtime generate and provide efficient execution of "many task" problems that are well known as successful grid applications.
  • However, MapReduce, summarized in Table 6.5, has several advantages over traditional implementations for many-task problems, as it supports dynamic execution, strong fault tolerance, and an easy-to-use high-level interface.
  • The major open source/commercial MapReduce implementations are Hadoop [23] and Dryad [24–27], with execution possible with or without VMs.
• Cloud Programming Models
  • Both the GAE and Manjrasoft Aneka environments represent programming models; both are applied to clouds, but are really not specific to this architecture.
  • Iterative MapReduce is an interesting programming model that offers portability between cloud, HPC, and cluster environments.
• SaaS
  • Services are used in a similar fashion in commercial clouds and most modern distributed systems.
  • We expect users to package their programs as services wherever possible, so no special support is needed to enable SaaS.
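The queuing-service and worker-role ideas above can be illustrated locally. The sketch below is a toy, single-machine stand-in: a front end (playing the web role) enqueues tasks and several workers (playing worker roles) pull and process them. It uses only Python's standard `queue.Queue` and threads; a real cloud application would replace the queue with a managed service such as Amazon SQS or an Azure queue, which this code does not call.

```python
# Toy illustration of the worker-role / queue pattern, standard library only.
import queue
import threading

task_queue = queue.Queue()

def worker(worker_id: int) -> None:
    """Worker role: repeatedly pull a task from the queue and process it."""
    while True:
        task = task_queue.get()
        if task is None:           # sentinel: no more work for this worker
            task_queue.task_done()
            break
        print(f"worker {worker_id} processed {task}")
        task_queue.task_done()     # acknowledge completion of the task

def main() -> None:
    workers = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
    for w in workers:
        w.start()
    for n in range(10):            # front end enqueues tasks on demand
        task_queue.put(f"task-{n}")
    for _ in workers:              # one sentinel per worker for a clean shutdown
        task_queue.put(None)
    for w in workers:
        w.join()

if __name__ == "__main__":
    main()
```

The fault-tolerance claim in the notes comes from the queue itself: if a worker dies, unacknowledged tasks remain available for other workers, which a managed cloud queue enforces with visibility timeouts.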
Parallel and Distributed Programming Paradigms
• We define a parallel and distributed program as a parallel program running on a set of computing engines or a distributed computing system.
• The term carries the notion of two fundamental terms in computer science: distributed computing system and parallel computing.
• A distributed computing system is a set of computational engines connected by a network to achieve a common goal of running a job or an application.
• A computer cluster or network of workstations is an example of a distributed computing system.
• Parallel computing is the simultaneous use of more than one computational engine (not necessarily connected via a network) to run a job or an application.
• For instance, parallel computing may use either a distributed or a nondistributed computing system, such as a multiprocessor platform.
• Running a parallel program on a distributed computing system (parallel and distributed programming) has several advantages for both users and distributed computing systems.
• From the users' perspective, it decreases application response time; from the distributed computing systems' standpoint, it increases throughput and resource utilization.
• Running a parallel program on a distributed computing system, however, can be a very complicated process.
• Therefore, to place the complexity in perspective, the data flow of running a typical parallel program on a distributed system is explained further in this chapter.

Parallel Computing and Programming Paradigms
• Consider a distributed computing system consisting of a set of networked nodes or workers.
• The system issues for running a typical parallel program in either a parallel or a distributed manner include the following (a minimal partitioning-and-mapping sketch follows this list):
  • Partitioning: This is applicable to both computation and data, as follows:
    • Computation partitioning: This splits a given job or program into smaller tasks. Partitioning greatly depends on correctly identifying portions of the job or program that can be performed concurrently. In other words, upon identifying parallelism in the structure of the program, it can be divided into parts to be run on different workers. Different parts may process different data or a copy of the same data.
    • Data partitioning: This splits the input or intermediate data into smaller pieces. Similarly, upon identification of parallelism in the input data, it can also be divided into pieces to be processed on different workers. Data pieces may be processed by different parts of a program or a copy of the same program.
  • Mapping: This assigns either the smaller parts of a program or the smaller pieces of data to underlying resources. This process aims to appropriately assign such parts or pieces to be run simultaneously on different workers and is usually handled by resource allocators in the system.
  • Synchronization: Because different workers may perform different tasks, synchronization and coordination among workers is necessary so that race conditions are prevented and data dependency among different workers is properly managed. Multiple accesses to a shared resource by different workers may raise race conditions, whereas data dependency arises when a worker needs the processed data of other workers.
  • Communication: Because data dependency is one of the main reasons for communication among workers, communication is always triggered when the intermediate data is ready to be sent among workers.
  • Scheduling: For a single job or program, when the number of computation parts (tasks) or data pieces exceeds the number of available workers, a scheduler selects a sequence of tasks or data pieces to be assigned to the workers. It is worth noting that the resource allocator performs the actual mapping of the computation or data pieces to workers, while the scheduler only picks the next part from the queue of unassigned tasks based on a set of rules called the scheduling policy. For multiple jobs or programs, a scheduler selects a sequence of jobs or programs to be run on the distributed computing system. In this case, scheduling is also necessary when system resources are not sufficient to simultaneously run multiple jobs or programs.
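As a minimal sketch of the partitioning, mapping, and scheduling steps above, assume a single multiprocessor machine stands in for the distributed workers: the input data is partitioned into chunks, each chunk is mapped to a worker process, and the pool's internal queue schedules chunks when there are more chunks than workers.

```python
# Sketch of data partitioning + mapping on a local process pool.
# The process pool stands in for the distributed workers described above.
from concurrent.futures import ProcessPoolExecutor

def partition(data, num_chunks):
    """Data partitioning: split the input into roughly equal pieces."""
    size = max(1, len(data) // num_chunks)
    return [data[i:i + size] for i in range(0, len(data), size)]

def process_chunk(chunk):
    """The task each worker runs on its piece of the data (here: a partial sum)."""
    return sum(chunk)

def main():
    data = list(range(1_000))
    chunks = partition(data, num_chunks=8)
    # Mapping/scheduling: the executor assigns chunks to 4 worker processes;
    # with 8 chunks and 4 workers, the remaining chunks wait in its queue.
    with ProcessPoolExecutor(max_workers=4) as pool:
        partial_sums = list(pool.map(process_chunk, chunks))
    print(sum(partial_sums))  # combine the partial results

if __name__ == "__main__":
    main()
```

Synchronization and communication are hidden here because the executor collects results for us; on a real distributed system those steps become explicit message exchanges between workers.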
Motivation for Programming Paradigms
• Because handling the whole data flow of parallel and distributed programming is very time-consuming and requires specialized knowledge of programming, dealing with these issues may reduce the productivity of the programmer and may even affect the program's time to market. Furthermore, it may distract the programmer from concentrating on the logic of the program itself.
• Therefore, parallel and distributed programming paradigms or models are offered to abstract many parts of the data flow from users. In other words, these models aim to provide users with an abstraction layer that hides the implementation details of the data flow which users formerly had to write code for. The simplicity of writing parallel programs is therefore an important metric for parallel and distributed programming paradigms. Other motivations behind parallel and distributed programming models are (1) to improve the productivity of programmers, (2) to decrease programs' time to market, (3) to leverage underlying resources more efficiently, (4) to increase system throughput, and (5) to support higher levels of abstraction.
• MapReduce, Hadoop, and Dryad are three of the most recently proposed parallel and distributed programming models. They were developed for information retrieval applications but have been shown to be applicable to a variety of important applications [41]. Further, the loose coupling of components in these paradigms makes them suitable for VM implementation and leads to much better fault tolerance and scalability for some applications than traditional parallel computing models such as MPI.

MapReduce, Twister, and Iterative MapReduce
• MapReduce, as introduced in Section 6.1.4, is a software framework which supports parallel and distributed computing on large data sets [27,37,45,46].
• This software framework abstracts the data flow of running a parallel program on a distributed computing system by providing users with two interfaces in the form of two functions: Map and Reduce.
• Users can override these two functions to interact with and manipulate the data flow of running their programs.
• Figure 6.1 illustrates the logical data flow from the Map to the Reduce function in MapReduce frameworks. In a (key, value) pair, the "value" part is the actual data, and the "key" part is used only by the MapReduce controller to control the data flow (a toy word-count sketch of this flow follows).
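To make the Map-to-Reduce data flow of Figure 6.1 concrete, the toy sketch below runs both phases sequentially in one process: a user-supplied Map function emits (key, value) pairs, the pairs are grouped by key (the shuffle), and a user-supplied Reduce function aggregates each group. This models only the logical data flow; it is not the Hadoop or Dryad API, where Map and Reduce tasks run in parallel on many workers with fault tolerance.

```python
# Toy, single-process illustration of the MapReduce logical data flow (word count).
from collections import defaultdict

def map_fn(key, value):
    """User-defined Map: the value is a line of text; emit (word, 1) pairs."""
    for word in value.split():
        yield word.lower(), 1

def reduce_fn(key, values):
    """User-defined Reduce: aggregate all values that share a key."""
    return key, sum(values)

def map_reduce(records):
    groups = defaultdict(list)
    for key, value in records:                  # Map phase over input (key, value) pairs
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)   # shuffle: group intermediate values by key
    return [reduce_fn(k, vs) for k, vs in groups.items()]   # Reduce phase

if __name__ == "__main__":
    lines = [(0, "the quick brown fox"), (1, "the lazy dog")]
    print(map_reduce(lines))   # e.g. [('the', 2), ('quick', 1), ('brown', 1), ...]
```

Overriding `map_fn` and `reduce_fn` is the whole user-facing contract described above; everything else (grouping, scheduling, fault tolerance) is the framework's job.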