Module 4 - Cloud Programming and Software Environments


Features of Cloud and Grid Platforms

Parallel and Distributed Programming Paradigms


Introduction
• Cloud computing is a technology that enables us to create, configure, and
customize applications through an internet connection.
• A software environment is a collection of programs, libraries, and utilities
that allows users to perform specific tasks.
• Software environments are often used by programmers to develop applications or
run existing ones.
• A software environment for a particular application could include the operating
system, the database system, specific development tools, or compilers.
Features of Cloud and Grid Platforms
• Important features found in real cloud and grid platforms are summarized here.
• Four tables cover the platform capabilities, traditional features, data
features, and features for programmers and runtime systems to use.
• The entries in these tables serve as source references for anyone who
wants to program the cloud efficiently.
Grid Computing
• Grid computing (sometimes referred to as virtual supercomputing) is a
group of networked computers that work together as a virtual
supercomputer to perform large tasks, such as analyzing huge sets of data
or weather modeling.
• Grid computing is extensively used in scientific research and high-
performance computing to solve complex scientific problems.
• For example, grid computing can be used to simulate the behavior of a
nuclear explosion, model the human genome, or analyze massive amounts
of data generated from particle accelerators.
• Advantages:
• Can solve larger, more complex problems in a shorter time
• Easier to collaborate with other organizations
• Make better use of existing hardware
Cloud vs Grid Computing
• https://www.skysilk.com/blog/2017/cloud-vs-grid-computing/
Cloud Capabilities and Platform Features
Commercial clouds need broad capabilities, as summarized in Table 6.1.
Table 6.2 lists some low-level infrastructure features.
Table 6.3 lists traditional programming environments for parallel and distributed systems
that need to be supported in cloud environments; these can be supplied as part of the
system (the cloud platform) or of the user environment.
Table 6.4 presents features emphasized by clouds and by some grids.
Traditional Features Common to Grids and Clouds
• Features related to workflow, data transport, security, and availability concerns
that are common to today’s computing grids and clouds
• Workflow
• Workflow links multiple cloud and non-cloud services in real applications on demand.
• Data Transport
• A major difficulty in using clouds is the cost (in time and money) of data transport.
• The special structure of cloud data, with blocks (as in Azure blobs) and tables, could allow high-
performance parallel algorithms, but initially simple HTTP mechanisms are used to transport
data (see the HTTPS download sketch below).
• Security, Privacy, and Availability
• The following techniques address security, privacy, and availability requirements for developing a
healthy and dependable cloud programming environment:
• Use virtual clustering to achieve dynamic resource provisioning with minimum overhead cost.
• Use stable and persistent data storage with fast queries for information retrieval.
• Use special APIs for authenticating users and sending e-mail using commercial accounts.
• Cloud resources are accessed with security protocols such as HTTPS and SSL.
• Fine-grained access control is desired to protect data integrity and deter intruders or hackers.
• Shared data sets are protected from malicious alteration, deletion, or copyright violations.
• Features are included for availability enhancement and disaster recovery with live migration of VMs.
• Use a reputation system to protect data centers. This system only authorizes trusted clients and stops
pirates.
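
As a minimal illustration of the simple HTTPS transport mechanism mentioned under Data Transport, the hedged Python sketch below streams an object from a blob-style URL to a local file; the URL, file name, and chunk size are illustrative placeholders, not details from the text.

import requests

# Hypothetical blob URL; any HTTPS object endpoint is fetched the same way.
BLOB_URL = "https://example-account.blob.example.net/container/dataset.csv"

def download_blob(url, dest_path, chunk_size=1 << 20):
    # Stream the object over HTTPS and write it to a local file in chunks.
    with requests.get(url, stream=True, timeout=60) as resp:
        resp.raise_for_status()              # fail loudly on HTTP errors
        with open(dest_path, "wb") as out:
            for chunk in resp.iter_content(chunk_size=chunk_size):
                if chunk:                     # skip keep-alive chunks
                    out.write(chunk)

download_blob(BLOB_URL, "dataset.csv")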
• Data Features and Databases
• Program Library
• Many efforts have been made to design a VM image library to manage images used in
academic and commercial clouds
• Blobs and Drives
• The basic storage concept in clouds is the blob: Azure blobs for Azure and S3 for Amazon.
• In addition to a service interface for blobs and S3, storage can be attached “directly” to compute
instances, as Azure drives for Azure and the Elastic Block Store for Amazon.
• DPFS
• covers the support of file systems such as Google File System (MapReduce), HDFS (Hadoop),
and Cosmos (Dryad) with compute-data affinity optimized for data processing
• It could be possible to link DPFS to basic blob and drive-based architecture, but it’s simpler to
use DPFS as an application-centric storage model with compute-data affinity and blobs and
drives as the repository-centric view
• SQL and Relational Databases
• Both Amazon and Azure clouds offer relational databases
• Table and NOSQL Nonrelational Databases
• present in the three major clouds: BigTable in Google, SimpleDB in Amazon, and Azure Table
[13] for Azure
• Queuing Services
• Both Amazon and Azure offer similar scalable, robust queuing services that are used to
communicate between the components of an application (a sketch of this queue-based pattern
follows this list).
• Programming and Runtime Support
• Programming and runtime support is needed to facilitate parallel programming and to provide runtime support of
important functions in today’s grids and clouds.
• Worker and Web Roles
• The roles introduced by Azure provide nontrivial functionality, while preserving the better affinity support that is possible in
a nonvirtualized environment.
• Worker roles are basic schedulable processes and are automatically launched.
• Note that explicit scheduling is unnecessary in clouds for individual worker roles and for the “gang scheduling” supported
transparently in MapReduce.
• Queues are a critical concept here, as they provide a natural way to manage task assignment in a fault-tolerant, distributed
fashion. Web roles provide an interesting approach to portals. GAE is largely aimed at web applications, whereas science
gateways are successful in TeraGrid.
• MapReduce
• There has been substantial interest in “data parallel” languages largely aimed at loosely coupled computations which
execute over different data samples.
• The language and runtime generate and provide efficient execution of “many task” problems that are well known as
successful grid applications.
• However, MapReduce, summarized in Table 6.5, has several advantages over traditional implementations for many task
problems, as it supports dynamic execution, strong fault tolerance, and an easy-to-use high-level interface.
• The major open source/commercial MapReduce implementations are Hadoop [23] and Dryad [24–27] with execution
possible with or without VMs.
• Cloud Programming Models
• Both the GAE and Manjrasoft Aneka environments represent programming models; both are applied to clouds, but are
really not specific to this architecture.
• Iterative MapReduce is an interesting programming model that offers portability between cloud, HPC and cluster
environments.
• SaaS
• Services are used in a similar fashion in commercial clouds and most modern distributed systems.
• We expect users to package their programs wherever possible, so no special support is needed to enable SaaS.
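
To make the queue-based task assignment pattern behind the queuing services and worker roles above concrete, here is a rough, framework-free Python sketch (an illustration of the pattern, not any specific cloud API): a front-end component enqueues tasks on a shared queue and several workers pull and process them; all names and task contents are illustrative.

import queue
import threading

# Shared task queue: the front end enqueues work items, workers pull them off.
task_queue = queue.Queue()

def worker(worker_id):
    # Worker role: repeatedly take a task from the queue and process it.
    while True:
        task = task_queue.get()
        if task is None:              # sentinel value: no more work for this worker
            break
        print("worker %d processing %s" % (worker_id, task))

# Front end (web role) enqueues work instead of calling workers directly.
for i in range(10):
    task_queue.put("task-%d" % i)

workers = [threading.Thread(target=worker, args=(w,)) for w in range(3)]
for t in workers:
    t.start()
for _ in workers:
    task_queue.put(None)              # one sentinel per worker so each can exit
for t in workers:
    t.join()

In a real cloud deployment the in-process queue would be replaced by the platform's queuing service (for example Amazon SQS or Azure Queue storage), which makes the same pattern fault-tolerant across machines.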
Parallel and Distributed Programming Paradigms
• We define a parallel and distributed program as a parallel program running on a
set of computing engines or a distributed computing system.
• The term carries the notion of two fundamental terms in computer science:
distributed computing system and parallel computing.
• A distributed computing system is a set of computational engines connected by a
network to achieve a common goal of running a job or an application.
• A computer cluster or network of workstations is an example of a distributed
computing system.
• Parallel computing is the simultaneous use of more than one computational
engine (not necessarily connected via a network) to run a job or an application.
• For instance, parallel computing may use either a distributed or a non-
distributed computing system such as a multiprocessor platform.
• Running a parallel program on a distributed computing system (parallel and
distributed programming) has several advantages for both users and
distributed computing systems.
• From the users’ perspective, it decreases application response time; from
the distributed computing systems’ standpoint, it increases throughput and
resource utilization.
• Running a parallel program on a distributed computing system, however,
could be a very complicated process.
• Therefore, to place the complexity in perspective, the data flow of running a
typical parallel program on a distributed system is further explained in this
chapter.
Parallel Computing and Programming Paradigms
• Consider a distributed computing system consisting of a set of networked nodes or workers.
• The system issues for running a typical parallel program in either a parallel or a distributed
manner would include the following:
• Partitioning: This is applicable to both computation and data as follows:
• Computation partitioning: This splits a given job or a program into smaller tasks. Partitioning
greatly depends on correctly identifying portions of the job or program that can be performed
concurrently. In other words, upon identifying parallelism in the structure of the program, it can
be divided into parts to be run on different workers. Different parts may process different data or
a copy of the same data.
• Data partitioning: This splits the input or intermediate data into smaller pieces. Similarly, upon
identification of parallelism in the input data, it can also be divided into pieces to be processed on
different workers. Data pieces may be processed by different parts of a program or a copy of the
same program.
• Mapping: This assigns either the smaller parts of a program or the smaller pieces of data to the
underlying resources. This process aims to appropriately assign such parts or pieces to be run
simultaneously on different workers and is usually handled by resource allocators in the system
(a small sketch follows this list).
• Synchronization: Because different workers may perform different tasks, synchronization
and coordination among workers is necessary so that race conditions are prevented and
data dependency among different workers is properly managed. Multiple accesses to a
shared resource by different workers may raise race conditions, whereas data
dependency happens when a worker needs the processed data of other workers.
• Communication: Because data dependency is one of the main reasons for
communication among workers, communication is triggered whenever
intermediate data needs to be sent to other workers.
• Scheduling: For a job or program, when the number of computation parts (tasks) or data
pieces is more than the number of available workers, a scheduler selects a sequence of
tasks or data pieces to be assigned to the workers. It is worth noting that the resource
allocator performs the actual mapping of the computation or data pieces to workers,
while the scheduler only picks the next part from the queue of unassigned tasks based
on a set of rules called the scheduling policy. For multiple jobs or programs, a scheduler
selects a sequence of jobs or programs to be run on the distributed computing system. In
this case, scheduling is also necessary when system resources are not sufficient to
simultaneously run multiple jobs or programs.
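
To make the partitioning and mapping steps concrete, the hedged Python sketch below (an illustration only, not taken from the text) partitions input data into pieces and maps each piece onto a pool of worker processes; the pool's internal scheduler assigns pieces to free workers, and the final sum stands in for the communication of intermediate results.

from multiprocessing import Pool

def process_piece(piece):
    # One task: the same program part applied to its own piece of the data.
    return sum(x * x for x in piece)

def partition(data, num_pieces):
    # Data partitioning: split the input into roughly equal pieces.
    size = (len(data) + num_pieces - 1) // num_pieces
    return [data[i:i + size] for i in range(0, len(data), size)]

if __name__ == "__main__":
    data = list(range(1_000_000))
    pieces = partition(data, num_pieces=8)
    # Mapping and scheduling: the pool assigns the 8 pieces to 4 worker processes.
    with Pool(processes=4) as pool:
        partial_results = pool.map(process_piece, pieces)
    # Combining partial results is a simple form of the communication step.
    print(sum(partial_results))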
Motivation for Programming Paradigms
• Because handling the whole data flow of parallel and distributed programming is
very time consuming and requires specialized programming knowledge, dealing
with these issues may reduce the productivity of the programmer and may even
lengthen the program’s time to market. Furthermore, it may distract the
programmer from concentrating on the logic of the program itself.
• Therefore, parallel and distributed programming paradigms or models are offered
to abstract many parts of the data flow from users. In other words, these models
aim to provide users with an abstraction layer that hides the implementation details
of the data flow which users formerly had to write code for. Simplicity of writing
parallel programs is therefore an important metric for parallel and distributed
programming paradigms. Other motivations behind parallel and distributed
programming models are (1) to improve the productivity of programmers, (2) to
decrease programs’ time to market, (3) to leverage underlying resources more
efficiently, (4) to increase system throughput, and (5) to support higher levels of
abstraction.
• MapReduce, Hadoop, and Dryad are three of the most recently
proposed parallel and distributed programming models. They were
developed for information retrieval applications but have been shown
to be applicable for a variety of important applications [41]. Further,
the loose coupling of components in these paradigms makes them
suitable for VM implementation and leads to much better fault
tolerance and scalability for some applications than traditional
parallel computing models such as MPI.
MapReduce, Twister, and Iterative MapReduce
• MapReduce, as introduced in Section 6.1.4, is a software framework
which supports parallel and distributed computing on large data sets
[27,37,45,46]. This software framework abstracts the data flow of
running a parallel program on a distributed computing system by
providing users with two interfaces in the form of two functions: Map
and Reduce. Users can override these two functions to interact with
and manipulate the data flow of running their programs. Figure 6.1
illustrates the logical data flow from the Map to the Reduce function
in MapReduce frameworks. In a (key, value) pair, the “value” part holds the
actual data, and the “key” part is used only by the MapReduce controller to
control the data flow.
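
To illustrate the Map and Reduce interfaces and the (key, value) data flow just described, here is a minimal, framework-free Python simulation of the logical flow (map, group by key, reduce) for word counting; it sketches the programming model only and is not the Hadoop or Google implementation.

from collections import defaultdict

def map_func(key, value):
    # Map: for each word in an input line, emit an intermediate (word, 1) pair.
    for word in value.split():
        yield (word, 1)

def reduce_func(key, values):
    # Reduce: combine all intermediate values that share the same key.
    return (key, sum(values))

def map_reduce(inputs):
    # Shuffle stage: group intermediate values by key before reducing.
    groups = defaultdict(list)
    for key, value in inputs:
        for k, v in map_func(key, value):
            groups[k].append(v)
    return [reduce_func(k, vs) for k, vs in sorted(groups.items())]

lines = [(0, "the quick brown fox"), (1, "the lazy dog"), (2, "the fox")]
print(map_reduce(lines))
# [('brown', 1), ('dog', 1), ('fox', 2), ('lazy', 1), ('quick', 1), ('the', 3)]

Here the input keys (line numbers) are ignored by the user code, matching the point above: the keys exist so the controller can group and route data between the Map and Reduce stages.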
