Cs6703 QB Cse
UNIT I - INTRODUCTION
Evolution of Distributed computing: Scalable computing over the Internet – Technologies for
network based systems – clusters of cooperative computers - Grid computing Infrastructures –
cloud computing - service oriented architecture – Introduction to Grid Architecture and standards
– Elements of Grid – Overview of Grid Architecture.
UNIT V - SECURITY
Trust models for Grid security environment – Authentication and Authorization methods – Grid
security infrastructure – Cloud Infrastructure security: network, host and application level –
aspects of data security, provider data and its security, Identity and access management
architecture, IAM practices in the cloud, SaaS, PaaS, IaaS availability in the cloud, Key privacy
issues in the cloud.
TOTAL: 45 PERIODS
OUTCOMES:
At the end of the course, the student should be able to:
Apply grid computing techniques to solve large scale scientific problems.
Apply the concept of virtualization.
Use the grid and cloud tool kits.
Apply the security models in the grid and the cloud environment.
TEXT BOOK:
1. Kai Hwang, Geoffrey C. Fox and Jack J. Dongarra, “Distributed and Cloud Computing:
Clusters, Grids, Clouds and the Future of Internet”, First Edition, Morgan Kaufmann Publishers, an
Imprint of Elsevier, 2012.
REFERENCES:
1. Jason Venner, “Pro Hadoop: Build Scalable, Distributed Applications in the Cloud”, Apress,
2009.
2. Tom White, “Hadoop: The Definitive Guide”, First Edition, O'Reilly, 2009.
3. Bart Jacob (Editor), “Introduction to Grid Computing”, IBM Red Books, Vervante, 2005.
4. Ian Foster, Carl Kesselman, “The Grid: Blueprint for a New Computing Infrastructure”, 2nd
Edition, Morgan Kaufmann.
5. Frederic Magoules and Jie Pan, “Introduction to Grid Computing”, CRC Press, 2009.
6. Daniel Minoli, “A Networking Approach to Grid Computing”, John Wiley Publication, 2005.
7. Barry Wilkinson, “Grid Computing: Techniques and Applications”, Chapman and Hall, CRC,
Taylor and Francis Group, 2010.
COURSE OUTCOMES:
Upon completion of the course, the students will have
CO1 An ability to apply grid computing techniques to solve large scale scientific problems.
CO2 An ability to understand grid services and their functionalities.
UNIT I INTRODUCTION
Evolution of Distributed computing: Scalable computing over the Internet – Technologies for
network based systems – clusters of cooperative computers - Grid computing Infrastructures –
cloud computing - service oriented architecture – Introduction to Grid Architecture and standards
– Elements of Grid – Overview of Grid Architecture.
PART A
1. List out the three computing paradigms. R
The growth of Internet clouds as a new computing paradigm.
The maturity of radio-frequency identification (RFID), Global Positioning System (GPS), and
sensor technologies has triggered the development of the Internet of Things (IoT).
17. Highlight the importance of the term "cloud computing". AN (Nov/Dec 2016)
Cloud computing is a computing paradigm in which a large pool of systems is connected
in private or public networks to provide dynamically scalable infrastructure for application,
data and file storage. With the advent of this technology, the cost of computation, application
hosting, content storage and delivery is reduced significantly.
18. Tabulate the difference between high performance computing and high throughput
computing. AN (April/May 2017)
S. No. | High Performance Computing | High Throughput Computing
1. | Granularity largely defined by the algorithm | Granularity can be selected to fit the environment
2. | Load balancing difficult | Load balancing easy
3. | Hard to schedule different workloads | Mixing workloads is easy
4. | Reliability is all important | Sustained throughput is the key goal
19. Give the operations of a VM. R (April/May 2017)
Virtualization is the enabling technology; it creates virtual machines that allow a single
machine to act as if it were many machines.
It runs several applications at the same time on a single physical server by hosting each of
them inside its own virtual machine.
By running multiple virtual machines simultaneously, a physical server can be utilized
efficiently.
20. “Grid inherits features of P2P and Cluster Computing Systems”. Is the statement true?
Validate your answer. AN (Nov/Dec 2017)
Yes. Grid inherits features of P2P and cluster computing systems. Cluster computing is
the base of all distributed computing paradigms; it aggregates resources locally and shares the
load. Grid computing is an extended version of the cluster, in which resources are provisioned
through the Internet. Cloud sits on top of both: it provides more or less the same functionality as
the other two, but delivers it in the form of services and bills for their use.
21. Differentiate between grid and cloud computing. AN (Nov/Dec 2017)
Description | Grid | Cloud
Underlying concept | Utility computing | Utility computing
Main benefit | Solve computationally complex problems | Provide a scalable standard environment for network-centric application development, testing and deployment
Resource distribution/allocation | Negotiate and manage resource sharing; schedulers | Simple user-provider model; pay per use
Domains | Multiple domains | Single domain
22. "Networks are backbones of grid computing". Justify the statement. AN (April/May
2018).
In grid computing, the network interconnects the various resources of the grid, providing a
path for the exchange of information between different nodes.
23. Differentiate GRIS with GIIS with an illustration. AN (April/May 2018).
Grid Resource Information Service (GRIS)
• Associated with each resource.
• Answers queries from client/user about the particular resource.
• Accesses an “information provider” deployed on that resource for
requested information.
Grid Index Information Service (GIIS)
• A directory service that collects (“pulls”) information from the GRISs.
• A “caching” service.
• Provides indexing and searching functions
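The relationship between the two services can be sketched in a few lines of Python. This is only an illustration, not the Globus MDS implementation: the class names, the `ttl_seconds` cache lifetime and the dictionary-based resource attributes are all assumptions made for the example.

```python
import time


class GRIS:
    """Per-resource information service: answers queries about one resource."""

    def __init__(self, resource_name, info_provider):
        self.resource_name = resource_name
        # Hypothetical "information provider": a callable returning a dict.
        self._info_provider = info_provider

    def query(self):
        # Ask the local information provider for the current resource state.
        return dict(self._info_provider())


class GIIS:
    """Directory service: pulls from registered GRISs, caches, and indexes."""

    def __init__(self, ttl_seconds=30.0):
        self._registered = {}   # resource name -> GRIS
        self._cache = {}        # resource name -> (timestamp, info)
        self._ttl = ttl_seconds

    def register(self, gris):
        self._registered[gris.resource_name] = gris

    def lookup(self, resource_name):
        cached = self._cache.get(resource_name)
        if cached and time.monotonic() - cached[0] < self._ttl:
            return cached[1]                      # served from the cache
        info = self._registered[resource_name].query()   # "pull" from the GRIS
        self._cache[resource_name] = (time.monotonic(), info)
        return info

    def search(self, predicate):
        # Index/search function: names of resources whose info matches.
        return sorted(n for n in self._registered if predicate(self.lookup(n)))
```

A client would register one `GRIS` per resource and then call `giis.search(lambda info: info["cpus"] >= 4)`; repeated lookups within the cache lifetime do not contact the resource again.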
24. Outline any two advantages of distributed computing. R (Nov/Dec 2018)
Advantages
Shareability
Expandability
Local autonomy
Improved performance
Improved reliability and availability
Potential cost reductions
PART-B
1. Explain the evolution of distributed computing. U
2. Explain the technologies for network based systems. U
3. Draw a neat diagram of grid computing infrastructures and explain. C
4. Explain the elements of grid computing. U
5. Explain SOA. U
6. Illustrate the architecture of virtual machine and brief about the operations. (AP) (Nov/Dec 2016)
7. Write short notes on: (i) cluster of cooperative computers. (8) (ii) service oriented
architecture. (8) (R) (Nov/Dec 2016)
8. (i) Demonstrate in detail about internet of things and cyber physical systems. (8) AP
(ii) Examine the memory, storage and wide area networking technology in network based
system. (8)
9. Analyze in detail about the GPU programming model. (16) (AN)
10. (i) Explain the layered architecture of SOA for web services. (8) (AN)
(ii) Compare the features of grid versus cloud. (8) (AN)
11. Brief the interaction between the GPU and CPU in performing parallel execution of
operations. (U) (April/May 2017)
12. Illustrate with a neat sketch, the grid computing Infrastructures. (R) (April/May 2017)
13. (i) Describe the infrastructure requirements for grid computing. (U) (Nov/Dec 2017)
(ii) What are the issues in cluster design? How can they be resolved?
14. (i) Describe the layered grid architecture. How does it map onto internet protocol
architecture? U (Nov/Dec 2017)
(ii) Describe the architecture of a cluster with suitable illustrations. U
15. Explain in detail the layered architecture of a grid environment and the
functionalities of a grid server. (U) (April/May 2018)
16. Discuss the evolution path of cloud computing. Also, express the difference between
grid and distributed computing. (U) (April/May 2018)
(ii) Outline the similarities and differences between distributed computing, grid
computing and cloud computing. (6) (AN) (Nov/Dec 2018)
18. What is grid computing? Draw a typical view of a grid environment and outline the key
elements of grid. (13) (U) (Nov/Dec 2018)
PART-A
1. Define OGSA. R
• OGSA defines what Grid services are, what they should be capable of, and what type of
technologies they should be based on. OGSA does not give a technical and detailed
specification. It uses WSDL.
• OGSI is a formal and technical specification of the concepts described in OGSA.
• The Globus Toolkit 3 is an implementation of OGSI.
2. List the OGSA Design Goals. R
Operations are grouped to form interfaces, and interfaces are combined to specify a
service. This:
• Encourages code reuse
• Simplifies application design
• Eases composition of services
Service virtualization: isolates users from details of service implementation and location.
3. Why IDL and Service Virtualization? AN
Service discovery: Allows clients to query and find suitable services in an unfamiliar
environment.
Service composition: Code reuse, and dynamic construction of complex systems from simple
components.
Specialization: Use of different implementations of a service interface on different
platforms.
Interface extension: Allows extensions to specialized service interfaces.
PART-A
1. What are the cloud deployment models? R
• Private Cloud
• Public Cloud
• Hybrid Cloud
• Community Cloud
2. Define private cloud. R
On-site Private Cloud
• Applies to private clouds implemented at a customer's premises.
Outsourced Private Cloud
• Applies to private clouds where the server side is outsourced to a hosting company.
3. Define public cloud. R
The most ubiquitous, and almost a synonym for, cloud computing. The cloud
infrastructure is made available to the general public or a large industry group and is owned by
an organization selling cloud services.
Examples of Public Cloud: Google App Engine
4. Define Hybrid cloud. R
The cloud infrastructure is a composition of two or more clouds (private, community, or
public) that remain unique entities but are bound together by standardized or proprietary
technology that enables data and application portability (e.g., cloud bursting for load-balancing
between clouds).
16. "Virtualization is the wave of the future". Justify. Explicate the process of CPU, memory
and I/O device virtualization in a data center. AN (April/May 2018)
17. (i) What are the pros and cons of public, private and hybrid clouds? (7) U (Nov/Dec 2018)
(ii) Explain virtualization of I/O devices with an example. (6)
18. What is a data center? Outline the issues to be addressed with respect to virtualization for
data center automation. U (Nov/Dec 2018)
Open source grid middleware packages – Globus Toolkit (GT4) Architecture, Configuration –
Usage of Globus – Main components and Programming model - Introduction to Hadoop
Framework - MapReduce, Input splitting, map and reduce functions, specifying input and output
parameters, configuring and running a job – Design of Hadoop file system, HDFS concepts,
command line and Java interface, dataflow of File read & File write.
PART A
1. What is open source? R
Open source refers to any program whose source code is made available for use or
modification as users or other developers see fit. Open source software is usually developed as a
public collaboration and made freely available.
2. What is Middleware? R
Middleware is a general term for software that serves to "glue together" separate, often
complex and already existing, programs. Some software components that are frequently
connected with middleware include enterprise applications and Web services.
9. Define MapReduce. U
MapReduce is a programming model and an associated implementation for processing
and generating large data sets with a parallel, distributed algorithm on a cluster.
MapReduce data processing is driven by this concept of input splits. The number of
input splits that are calculated for a specific application determines the number of mapper
tasks. Each of these mapper tasks is assigned, where possible, to a slave node where the input
split is stored.
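The flow from input splits to mapper tasks to reducers can be sketched as a small in-memory word count. This is a plain Python illustration of the programming model, not Hadoop code: the function names and the `split_size` parameter are assumptions of the sketch, and everything runs in one process rather than on cluster nodes.

```python
from collections import defaultdict


def split_input(lines, split_size):
    """Cut the input into splits; each split feeds exactly one mapper task."""
    return [lines[i:i + split_size] for i in range(0, len(lines), split_size)]


def map_words(split):
    """Map function: emit a (word, 1) pair for every word in the split."""
    return [(word.lower(), 1) for line in split for word in line.split()]


def shuffle(mapped_outputs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce function: combine all values that share one key."""
    return key, sum(values)


def word_count(lines, split_size=2):
    splits = split_input(lines, split_size)
    mapped = [map_words(s) for s in splits]      # one mapper task per split
    grouped = shuffle(mapped)
    return dict(reduce_counts(k, v) for k, v in grouped.items())
```

With five input lines and `split_size=2`, `split_input` yields three splits and therefore three mapper tasks, which mirrors the statement that the number of splits determines the number of mappers.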
20. “HDFS is fault tolerant.” Is it true? Justify your answer. AN (Nov/Dec 2017)
HDFS is highly fault tolerant. It handles faults through replica creation.
Replicas of the user's data are created on different machines in the HDFS cluster, so
whenever any machine in the cluster goes down, the data can be accessed from another
machine that holds a copy of the same data. HDFS also maintains the replication
factor: if a machine suddenly fails, it creates new replicas of the data on other
available machines in the cluster.
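The replica-creation behaviour described above can be mimicked with a toy model. This is a hypothetical simulation for study purposes, not HDFS's actual placement policy: the `ToyCluster` class, its alphabetical node choice and the replication factor of 3 used as the default are assumptions of the sketch.

```python
class ToyCluster:
    """Toy model of block replication and re-replication after a failure."""

    def __init__(self, node_names, replication_factor=3):
        self.live_nodes = set(node_names)
        self.replication_factor = replication_factor
        self.block_locations = {}   # block id -> set of nodes holding a replica

    def _pick_nodes(self, count, exclude=()):
        # Simplistic placement: first live nodes in name order (an assumption).
        candidates = sorted(self.live_nodes - set(exclude))
        return candidates[:count]

    def write_block(self, block_id):
        nodes = self._pick_nodes(self.replication_factor)
        self.block_locations[block_id] = set(nodes)

    def fail_node(self, node):
        self.live_nodes.discard(node)
        for holders in self.block_locations.values():
            holders.discard(node)
            missing = self.replication_factor - len(holders)
            if missing > 0:
                # Restore the replication factor on other available machines.
                holders.update(self._pick_nodes(missing, exclude=holders))

    def can_read(self, block_id):
        return bool(self.block_locations.get(block_id, set()) & self.live_nodes)
```

After `fail_node`, a block that lost a replica is still readable from the surviving copies and is copied to another live node, provided enough live nodes remain.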
21. What is the purpose of heartbeat in Hadoop? U (Nov/Dec 2017)
In Hadoop, the Name Node and the Data Nodes communicate using heartbeats. A
heartbeat is the signal that a Data Node sends to the Name Node at regular
intervals to indicate its presence, i.e. to indicate that it is alive.
If the Name Node does not receive a heartbeat from a Data Node within a certain
time, that particular Data Node is declared dead. The default
heartbeat interval is 3 seconds.
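A minimal sketch of the bookkeeping on the Name Node side is shown below. It is not Hadoop's implementation: the `HeartbeatMonitor` class is hypothetical, the 30-second timeout is an arbitrary value chosen for the example, and timestamps are passed in explicitly so the behaviour is easy to trace by hand.

```python
HEARTBEAT_INTERVAL_S = 3.0   # default interval stated in the answer above
DEAD_AFTER_S = 30.0          # hypothetical timeout, chosen only for this sketch


class HeartbeatMonitor:
    """Tracks the last heartbeat per Data Node and reports silent nodes."""

    def __init__(self, dead_after=DEAD_AFTER_S):
        self.dead_after = dead_after
        self.last_seen = {}   # data node name -> time of last heartbeat

    def receive_heartbeat(self, datanode, now):
        self.last_seen[datanode] = now

    def dead_nodes(self, now):
        # A node is considered dead once it has been silent longer than the timeout.
        return sorted(name for name, seen in self.last_seen.items()
                      if now - seen > self.dead_after)
```

Each Data Node would call `receive_heartbeat` roughly every `HEARTBEAT_INTERVAL_S` seconds; any node missing from that stream for longer than the timeout shows up in `dead_nodes`.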
22. How does the divide-and-conquer strategy relate to the MapReduce paradigm? AN
(April/May 2018)
In MapReduce, you divide the work up serially, execute work packets in parallel, and tag
the results to indicate which results go with which other results. The merging is then serial for all
the results with the same tag, but can be executed in parallel for results that have different tags. In
earlier systems, the merge step became a bottleneck for all but the most trivial tasks.
With MapReduce it can still be a bottleneck if the nature of the task requires that all merging be
done serially. If, however, the task allows some degree of parallel merging of results, then MapReduce
gives a simple way to take advantage of that possibility.
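The tagging idea can be illustrated with a thread pool standing in for cluster nodes. This is a sketch under stated assumptions: the function names, the `packet_size` and `workers` parameters and the use of summation as the merge operation are all choices made for the example, and Python threads here only model the structure rather than give real speed-up.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def process_packet(packet):
    """Work on one packet and tag each partial result with its key."""
    partial = defaultdict(int)
    for tag, amount in packet:
        partial[tag] += amount
    return list(partial.items())


def merge_tag(item):
    """Merge all partial results that carry the same tag (serial per tag)."""
    tag, values = item
    return tag, sum(values)


def tagged_totals(records, packet_size=2, workers=4):
    # Divide: cut the work into packets (done serially).
    packets = [records[i:i + packet_size]
               for i in range(0, len(records), packet_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Conquer: packets are processed independently of each other.
        partials = list(pool.map(process_packet, packets))
        by_tag = defaultdict(list)
        for partial in partials:
            for tag, value in partial:
                by_tag[tag].append(value)
        # Combine: different tags can be merged at the same time.
        return dict(pool.map(merge_tag, by_tag.items()))
```

If every record carried the same tag, the final step would collapse to a single serial merge, which is the bottleneck case mentioned in the answer.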
23. Brief out the main components of Globus toolkit. U (April/May 2018)
Common runtime components
• Security
• Data management
• Information services
• Execution management
PART B
1. Explain MapReduce in detail. U
2. Explain in detail the main components and programming model of Globus. U
3. What is the Hadoop framework? Explain in detail. U
4. Write short notes on HDFS. R
5. What is a file system? Explain in detail the dataflow of File read & File write. U
6. Draw and explain the Globus toolkit architecture. U (Nov/Dec 2016)
7. Give a detailed note on Hadoop framework. R (Nov/Dec 2016)
8. Discuss MAPREDUCE with suitable diagrams. U (April/May 2017)
9. Elaborate HDFS concepts with suitable illustrations. AN(April/May 2017)
10. Illustrate dataflow in HDFS during file read/write operation with suitable diagrams.
U (Nov/Dec 2017)
11. What is GT4? Describe in detail the components of GT4 with a suitable diagram.
U (Nov/Dec 2017)
12. List the characteristics of Globus tool kit. With a neat sketch describe the
architecture of Globus GT4 and the services offered. R (April/May 2018)
13. With an illustration, emphasize the significance of MapReduce paradigm in Hadoop
framework. List out the assumptions and goals set in HDFS architecture for processing
the data based on divide-and-conquer strategy. AN (April/May 2018)
14. Explain the main components and programming model of Globus Toolkit. (13)
U (Nov/Dec 2018)
15. Explain the Hadoop distributed file system architecture with a diagram. (13) U (Nov/Dec 2018)
UNIT V - SECURITY
Trust models for Grid security environment – Authentication and Authorization methods – Grid
security infrastructure – Cloud Infrastructure security: network, host and application level –
aspects of data security, provider data and its security, Identity and access management
architecture, IAM practices in the cloud, SaaS, PaaS, IaaS availability in the cloud, Key privacy
issues in the cloud.
Part A
1. Illustrate Grid security infrastructure. U
2. What are the security challenges faced in a Grid environment? R
Security challenges faced in a Grid environment can be grouped into three categories:
(a) integration solutions, where existing services need to be used and interfaces should be
abstracted to provide an extensible architecture;
(b) interoperability solutions, so that services hosted in different virtual organizations that
have different security mechanisms and policies will be able to invoke each other;
(c) solutions to define, manage and enforce trust policies within a dynamic Grid
environment.
3. Differentiate between Authentication and Authorization. AN
Authentication:
Authentication means verifying that the application accepts the right users and rejects invalid
or anonymous users.
Ex: 1) Logging into internet banking with valid login credentials (the application has to accept
the user).
2) Logging into internet banking with invalid login credentials (the application has to reject
the user).
Authorization:
Authorization means verifying that the application grants the right permissions to the right
users.
Ex: 1) Agent: After logging into the IRCTC website, an agent has permissions such as booking
more than 5 tickets, cancelling tickets and editing information.
2) User: After logging into the IRCTC website, a user can book at most 5 tickets and can
only book or cancel tickets; unlike an agent, a user cannot edit information.
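The two checks can be kept apart in code as well. The sketch below follows the ticket-booking example loosely; the `TicketPortal` class, its role table and the account names are hypothetical, and the password hashing uses Python's standard `hashlib.pbkdf2_hmac` only to avoid storing plain-text passwords in the example.

```python
import hashlib
import hmac
import os


def _hash(password, salt):
    # Salted password hash from the standard library.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


class TicketPortal:
    # Hypothetical role table modelled on the agent/user example.
    PERMISSIONS = {
        "agent": {"book", "cancel", "edit"},
        "user": {"book", "cancel"},
    }
    MAX_TICKETS = {"agent": None, "user": 5}   # None means no limit here

    def __init__(self):
        self._accounts = {}   # name -> (salt, password hash, role)

    def add_account(self, name, password, role):
        salt = os.urandom(16)
        self._accounts[name] = (salt, _hash(password, salt), role)

    def authenticate(self, name, password):
        """Authentication: is this really the claimed user?"""
        account = self._accounts.get(name)
        if account is None:
            return False
        salt, stored, _role = account
        return hmac.compare_digest(stored, _hash(password, salt))

    def is_authorized(self, name, action, tickets=0):
        """Authorization: may this (already authenticated) user do this?"""
        role = self._accounts[name][2]
        if action not in self.PERMISSIONS[role]:
            return False
        limit = self.MAX_TICKETS[role]
        return action != "book" or limit is None or tickets <= limit
```

A login with a wrong password fails at `authenticate`; a correctly logged-in user who tries to edit information, or to book six tickets, fails at `is_authorized`.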
4. What is identity and access management in a cloud environment? U (Nov/Dec 2018)
An identity and access management (IAM) system is a framework for business processes that
facilitates the management of electronic identities. The framework includes the technology
needed to support identity management.
5. List out the Cloud security issues. R
Cloud security issues fall primarily into three areas:
Data Residency: Many companies face legislation by their country of origin or the local
country that the business entity is operating in, requiring certain types of data to be kept
within defined geographic borders. There are specific regulations that must be followed,
centered around data access, management and control.
Data Privacy: Business data often needs to be guarded and protected more stringently
than non-sensitive data. The enterprise is responsible for any breaches to data and must be
able to ensure strict cloud security in order to protect sensitive information.
Industry & Regulation Compliance: Organizations often have access to and are
responsible for data that is highly regulated and restricted. Many industry-specific regulations
such as GLBA, CJIS, ITAR and PCI DSS, require an enterprise to follow defined standards
to safeguard private and business data and to comply with applicable laws.
19. Write a brief note on the security requirements of grid. U(April/May 2017)
Authentication
Authorization
Assurance/Accreditation
Accounting
Audit
20. List any four host security threats in public IaaS. (Nov/Dec 2017)
Host security at IaaS Level
a. Virtualization software security
i. Hypervisor security
ii. Threats: Blue Pill attack on the hypervisor
b. Customer guest OS or virtual server security
i. Attacks to the guest OS: e.g., stealing keys used to access and manage the
hosts
21. Identify the trust model based on a site's trustworthiness. U (Nov/Dec 2017)
In a reputation-based model, jobs are sent to a resource site only when the site is
trustworthy enough to meet users' demands. The site's trustworthiness is usually calculated from
information such as the defense capability, direct reputation, and recommendation
trust. The defense capability refers to the site's ability to protect itself from danger.
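One simple way to picture such a calculation is a weighted average of the three inputs. This is only an illustrative assumption: the answer above does not give a formula, and the weights, the [0, 1] score range and the function names below are invented for the sketch.

```python
def site_trust(defense_capability, direct_reputation, recommendation_trust,
               weights=(0.4, 0.3, 0.3)):
    """Combine the three inputs into one score (hypothetical weighted average)."""
    scores = (defense_capability, direct_reputation, recommendation_trust)
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("scores must lie in [0, 1]")
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def eligible_sites(sites, required_trust):
    """Return the sites trustworthy enough to receive a job."""
    return sorted(name for name, scores in sites.items()
                  if site_trust(*scores) >= required_trust)
```

For example, `eligible_sites({"A": (0.9, 0.8, 0.7), "B": (0.2, 0.3, 0.4)}, 0.6)` keeps only site A, so a job demanding a trust level of 0.6 would be dispatched there.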
22. On what basis are trust models set for a grid environment? AN (April/May 2018)
A trust relationship must be established before the entities in the grid interoperate
with one another. The entities have to choose other entities that can meet the
requirements of trust to coordinate with. The entities that submit requests should believe
the resource providers will try to process their requests and return the results with a
specified QoS.
To create the proper trust relationship between grid entities, two kinds of trust
models are often used. One is the PKI-based model, which mainly exploits the PKI to
authenticate and authorize entities. The other is the reputation-based model.
23. State how CIA Triad plays a vital role in managing cloud security. AN
(April/May 2018)
There are three crucial components that make up the elements of the CIA triad, the
widely-used model designed to guide IT security. Those components are confidentiality,
integrity, and availability.
24. Write a short note on Kerberos. R (Nov/Dec 2018)
Kerberos (protocol) is a computer network authentication protocol that works on
the basis of tickets to allow nodes communicating over a non-secure network to prove
their identity to one another in a secure manner.
PART B
1. What are Security Mechanisms in Grid Security Infrastructure (GSI)? Explain. U
2. Explain in detail Cloud computing security.U
3. Write short notes on: R
(i) SaaS
(ii) PaaS
(iii) IaaS
4. Explain in detail Authentication and Authorization. U
5. Explain trust models for grid security environment. U (Nov/Dec 2016)
6. Write in detail about cloud security infrastructure. R (Nov/Dec 2016)
7. (i) Analyze in detail the IAM Standards, Protocols, and Specifications for Consumers. (8) AN
(ii) Compare the Enterprise and Consumer Authentication Standards and Protocols. (8) AN
8. (i) Analyze the infrastructure security of cloud at host level. (8) AN
(ii) Explain in detail about virtual server security of cloud. (8) U
9. (i) Tabulate in detail the comparison of SPI maturity models in the context of IAM. (8) AN
(ii) Tabulate the comparison of maturity levels for IAM components in detail. (8) AN
10. Write detailed note on identity and access management architecture. U(April/May 2017)
11. Explain grid security infrastructure. U (April/May 2017)
12. What is the purpose of GSI? Describe the functionality of various layers in GSI.
U (Nov/Dec 2017)
13. What is the purpose of IAM? Describe its functional architecture with an illustration.
U(Nov/Dec 2017)
14. "In today's world, infrastructure security and data security are highly challenging at
network, host and application levels". Justify and explain the several ways of
protecting the data in transit and at rest. AN (April/May 2018)
15. Explain the baseline Identity and Access Management (IAM) factors to be practiced by
the stakeholders of cloud services and the common key privacy issues likely to happen in
the environment. AN (April/May 2018)
16. Define authentication and authorization. Outline authentication and authorization in grids
with relevant examples. (13) U (Nov/Dec 2018)
17. Describe Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) and
Software-as-a-Service (SaaS) with an example. (13) U (Nov/Dec 2018)